date:20181010

Re: [PATCH net-next 0/4] Adds support of RSS to HNS3 Driver for Rev 2(=0x21) H/W

2018-10-10 Thread David Miller

From: Salil Mehta 
Date: Wed, 10 Oct 2018 20:05:33 +0100

> This patch-set mainly adds new additions related to RSS for the new
> hardware Revision 0x21. It also adds support to use RSS hash value
> provided by the hardware along with descriptor.

Series applied.

Re: [PATCH 2/2] target: Fix target_wait_for_sess_cmds breakage with active signals

2018-10-10 Thread Nicholas A. Bellinger

Hello MNC & Co,

On Wed, 2018-10-10 at 11:58 -0500, Mike Christie wrote:
> On 10/09/2018 10:23 PM, Nicholas A. Bellinger wrote:
> > From: Nicholas Bellinger 
> > 
> > With the addition of commit 00d909a107 in v4.19-rc, it incorrectly assumes no
> > signals will be pending for task_struct executing the normal session shutdown
> > and I/O quiesce code-path.
> > 



> > diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> > index 86c0156..fc3093d2 100644
> > --- a/drivers/target/target_core_transport.c
> > +++ b/drivers/target/target_core_transport.c
> > @@ -2754,7 +2754,7 @@ static void target_release_cmd_kref(struct kref *kref)
> >     if (se_sess) {
> >         spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
> >         list_del_init(&se_cmd->se_cmd_list);
> > -       if (list_empty(&se_sess->sess_cmd_list))
> > +       if (se_sess->sess_tearing_down && list_empty(&se_sess->sess_cmd_list))
> 
> I think there is another issue with 00d909a107 and ibmvscsi_tgt.
> 
> The problem is that ibmvscsi_tgt never called
> target_sess_cmd_list_set_waiting. It only called
> target_wait_for_sess_cmds. So before 00d909a107 there was a bug in that
> driver and target_wait_for_sess_cmds never did what was intended because
> sess_wait_list would always be empty.
> 
> With 00d909a107, we no longer need to call
> target_sess_cmd_list_set_waiting to wait for outstanding commands, so
> ibmvscsi_tgt will now wait for commands like we wanted. However, the
> commit added a WARN_ON that is hit if target_sess_cmd_list_set_waiting
> is not called, so we could hit that.
> 
> So I think we need to add a target_sess_cmd_list_set_waiting call in
> ibmvscsi_tgt to go along with your patch chunk above and make sure we do
> not trigger the WARN_ON.
> 

Nice catch.  :)

With target_wait_for_sess_cmds() pre 00d909a107 depending on the list-splice
done in target_sess_cmd_list_set_waiting(), this particular usage in
ibmvscsi_tgt has always seen list_empty(&se_sess->sess_wait_list) = true
(i.e. no se_cmd I/O is quiesced, because no se_cmd is in sess_wait_list)
since commit 712db3eb in 4.9.y code.

That said, ibmvscsi_tgt usage is very similar to vhost/scsi in the
respect that individual /sys/kernel/config/target/$FABRIC/$WWN/$TPGT/
endpoints used by VMs do not remove their I_T nexus while the VM is
active.

So AFAICT, ibmvscsi_tgt doesn't strictly need target_wait_for_sess_cmds()
at all if this is true.

Re: [PATCH -next] phy: phy-ocelot-serdes: fix return value check in serdes_probe()

2018-10-10 Thread David Miller

From: Wei Yongjun 
Date: Wed, 10 Oct 2018 02:00:24 +

> In case of error, the function syscon_node_to_regmap() returns ERR_PTR()
> and never returns NULL. The NULL test in the return value check should
> be replaced with IS_ERR().
> 
> Fixes: 51f6b410fc22 ("phy: add driver for Microsemi Ocelot SerDes muxing")
> Signed-off-by: Wei Yongjun 

Applied.
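
For readers unfamiliar with the ERR_PTR() convention the patch relies on,
here is a minimal user-space sketch of why a NULL test misses the failure.
The ERR_PTR()/IS_ERR()/PTR_ERR() helpers below are simplified stand-ins for
the kernel ones, and demo_lookup() and the two probe functions are
hypothetical names used only for illustration; they are not the
phy-ocelot-serdes code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_regmap;

/*
 * Hypothetical lookup that reports failure through ERR_PTR() and never
 * returns NULL, which is the behaviour the patch describes for
 * syscon_node_to_regmap().
 */
static void *demo_lookup(int node_ok)
{
	return node_ok ? (void *)&dummy_regmap : ERR_PTR(-ENODEV);
}

/* Buggy variant: the NULL test never fires, so the error goes unnoticed. */
static int probe_null_check(int node_ok)
{
	void *map = demo_lookup(node_ok);

	if (!map)
		return -ENODEV;
	return 0;
}

/* Fixed variant: IS_ERR() catches the failure and PTR_ERR() recovers it. */
static int probe_is_err_check(int node_ok)
{
	void *map = demo_lookup(node_ok);

	if (IS_ERR(map))
		return (int)PTR_ERR(map);
	return 0;
}
```

With a failing lookup, probe_null_check(0) returns 0 and would carry on
with an invalid pointer, while probe_is_err_check(0) returns -ENODEV.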

[PATCH] platform/x86: thinkpad_acpi: Change the keymap for Favorites hotkey

2018-10-10 Thread Zhang Xianwei

The keycode KEY_FAVORITES(0x16c) used in thinkpad_acpi driver is too
big (out of range > 255) for xorg to handle.
xkeyboard-config has already mapped KEY_BOOKMARKS(156) to
XF86Favorites:

keycodes/evdev:
 <I164> = 164;   // #define KEY_BOOKMARKS   156

symbols/inet:
key <I164>   {  [ XF86Favorites ]   };

So change the keymap to KEY_BOOKMARKS for Favorites hotkey.

Signed-off-by: Zhang Xianwei 
---
 drivers/platform/x86/thinkpad_acpi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c
index fde08a9..a86cf47 100644
--- a/drivers/platform/x86/thinkpad_acpi.c
+++ b/drivers/platform/x86/thinkpad_acpi.c
@@ -3457,7 +3457,7 @@ static int __init hotkey_init(struct ibm_init_struct *iibm)
KEY_UNKNOWN, KEY_UNKNOWN, KEY_UNKNOWN, KEY_UNKNOWN,
KEY_UNKNOWN,
 
-   KEY_FAVORITES,   /* Favorite app, 0x311 */
+   KEY_BOOKMARKS,   /* Favorite app, 0x311 */
KEY_RESERVED,/* Clipping tool */
KEY_CALC,/* Calculator (above numpad, P52) */
KEY_BLUETOOTH,   /* Bluetooth */
-- 
2.9.5
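
The range argument in the commit message can be checked with a little
arithmetic. The sketch below assumes the usual evdev-to-X11 mapping (X
keycode = kernel keycode + 8, with core X keycodes limited to 255); the two
key values are the ones quoted in the commit message, and the helper names
are hypothetical.

```c
#include <stdbool.h>

/* Values as quoted in the commit message above. */
#define KEY_BOOKMARKS 156
#define KEY_FAVORITES 0x16c

/* Assumption: X keycodes are evdev codes shifted by 8, capped at 255. */
#define X11_KEYCODE_OFFSET 8
#define X11_KEYCODE_MAX    255

/* Hypothetical helper: the X keycode an evdev keycode would map to. */
static int evdev_to_x11_keycode(int evdev_code)
{
	return evdev_code + X11_KEYCODE_OFFSET;
}

/* Hypothetical helper: true if the key fits in the core X keycode range. */
static bool fits_x11_core_keycode(int evdev_code)
{
	return evdev_to_x11_keycode(evdev_code) <= X11_KEYCODE_MAX;
}
```

Under these assumptions KEY_BOOKMARKS maps to 164, matching the
keycodes/evdev entry quoted above, while KEY_FAVORITES maps to 372 and
falls outside the range.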

Re: [PATCH v8 0/3] x86/boot/KASLR: Parse ACPI table and limit kaslr in immovable memory

2018-10-10 Thread Chao Fan

On Thu, Oct 11, 2018 at 08:29:55AM +0800, Baoquan He wrote:
>On 10/10/18 at 03:44pm, Masayoshi Mizuma wrote:
>> On Wed, Oct 10, 2018 at 05:30:57PM +0800, Baoquan He wrote:
>> > On 10/10/18 at 11:19am, Borislav Petkov wrote:
>> > > On Wed, Oct 10, 2018 at 11:14:26AM +0200, Thomas Gleixner wrote:
>> > > > Yes, it's different, but if the SRAT information is available early, then
>> > > > the command line parameter can go away because then the required
>> > > > information for Masa's problem is available as well.
>> > > 
>> > > Exactly. And I'd prefer we delayed the command line parameter until we
>> > > figure out we really need it and not expose it to upstream and then
>> > > remove it shortly after.
>> > > 
>> > > So I'd suggest we move Masa's patches to a separate branch and not send
>> > > it up this round.
>> > 
>> > Yes, sounds more reasonable if we can reuse functions in Chao's patch 1/3
>> > to solve the padding issue.
>> 
>> Thanks for your comments! Yes, immovable_mem[num_immovable_mem] in Chao's
>> patch may be useful for calculating the padding size. If so, we don't
>> need the new kernel parameter. It's nice!
>> 
>> Do you happen to have ideas on how we can access immovable_mem[num_immovable_mem]
>> from arch/x86/mm/kaslr.c? It is located in arch/x86/boot/compressed/*, so
>> I suppose it is not easy to access it...
>> I would appreciate it if you could give some advice.
>
>Hmm, they live in different life cycles and spaces. So we can only
>reuse the code from Chao's patch, but not the variable. That means we need to
>go through the SRAT parsing again in arch/x86/mm/kaslr.c and fill
>immovable_mem[num_immovable_mem] for mm/kaslr.c usage, if we decide to
>do it like that.

Reading it three times is redundant, but reading it twice is needed,
because the regular ACPI code runs reliably, but we need the information
before that code is available.
As for Masa's issue, I am wondering whether we can transfer the
information, or only the address of the SRAT table, from the compressed
stage to the stage after start_kernel().

Thanks,
Chao Fan

>
>Thanks
>Baoquan
>
>

KASAN: use-after-free Read in __llc_lookup_established

2018-10-10 Thread syzbot


Hello,

syzbot found the following crash on:

HEAD commit:3d647e62686f Merge tag 's390-4.19-4' of git://git.kernel.o..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1707d80940
kernel config:  https://syzkaller.appspot.com/x/.config?x=88e9a8a39dc0be2d
dashboard link: https://syzkaller.appspot.com/bug?extid=11e05f04c15e03be5254
compiler:   gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+11e05f04c15e03be5...@syzkaller.appspotmail.com

==
BUG: KASAN: use-after-free in llc_estab_match net/llc/llc_conn.c:494 [inline]
BUG: KASAN: use-after-free in __llc_lookup_established+0xc80/0xe10 net/llc/llc_conn.c:522

Read of size 1 at addr 8801c5794a7f by task syz-executor3/10277

CPU: 0 PID: 10277 Comm: syz-executor3 Not tainted 4.19.0-rc7+ #55
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
 print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412
net_ratelimit: 9 callbacks suppressed
openvswitch: netlink: Key type 12288 is out of range max 29
 __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:430
 llc_estab_match net/llc/llc_conn.c:494 [inline]
 __llc_lookup_established+0xc80/0xe10 net/llc/llc_conn.c:522
openvswitch: netlink: Key type 12288 is out of range max 29
 llc_lookup_established+0x36/0x60 net/llc/llc_conn.c:554
 llc_ui_bind+0x810/0xdd0 net/llc/af_llc.c:381
 __sys_bind+0x331/0x440 net/socket.c:1483
 __do_sys_bind net/socket.c:1494 [inline]
 __se_sys_bind net/socket.c:1492 [inline]
 __x64_sys_bind+0x73/0xb0 net/socket.c:1492
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457579
Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00

RSP: 002b:7f2a18100c78 EFLAGS: 0246 ORIG_RAX: 0031
RAX: ffda RBX: 0003 RCX: 00457579
RDX: 0010 RSI: 2040 RDI: 0006
RBP: 0072bf00 R08:  R09: 
R10:  R11: 0246 R12: 7f2a181016d4
R13: 004bd718 R14: 004cbfe0 R15: 

Allocated by task 10278:
 save_stack+0x43/0xd0 mm/kasan/kasan.c:448
 set_track mm/kasan/kasan.c:460 [inline]
 kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
 __do_kmalloc mm/slab.c:3718 [inline]
 __kmalloc+0x14e/0x760 mm/slab.c:3727
 kmalloc include/linux/slab.h:518 [inline]
 sk_prot_alloc+0x1b0/0x2e0 net/core/sock.c:1468
 sk_alloc+0x10d/0x1690 net/core/sock.c:1522
 llc_sk_alloc+0x35/0x4b0 net/llc/llc_conn.c:949
 llc_ui_create+0x142/0x520 net/llc/af_llc.c:173
 __sock_create+0x536/0x930 net/socket.c:1277
 sock_create net/socket.c:1317 [inline]
 __sys_socket+0x106/0x260 net/socket.c:1347
 __do_sys_socket net/socket.c:1356 [inline]
 __se_sys_socket net/socket.c:1354 [inline]
 __x64_sys_socket+0x73/0xb0 net/socket.c:1354
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 10276:
 save_stack+0x43/0xd0 mm/kasan/kasan.c:448
 set_track mm/kasan/kasan.c:460 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
 __cache_free mm/slab.c:3498 [inline]
 kfree+0xcf/0x230 mm/slab.c:3813
 sk_prot_free net/core/sock.c:1505 [inline]
 __sk_destruct+0x797/0xa80 net/core/sock.c:1587
 sk_destruct+0x78/0x90 net/core/sock.c:1595
 __sk_free+0xcf/0x300 net/core/sock.c:1606
 sk_free+0x42/0x50 net/core/sock.c:1617
 sock_put include/net/sock.h:1691 [inline]
 llc_sk_free+0x9d/0xb0 net/llc/llc_conn.c:1017
 llc_ui_release+0x161/0x2a0 net/llc/af_llc.c:218
 __sock_release+0xd7/0x250 net/socket.c:579
 sock_close+0x19/0x20 net/socket.c:1141
 __fput+0x385/0xa30 fs/file_table.c:278
 fput+0x15/0x20 fs/file_table.c:309
 task_work_run+0x1e8/0x2a0 kernel/task_work.c:113
 tracehook_notify_resume include/linux/tracehook.h:193 [inline]
 exit_to_usermode_loop+0x318/0x380 arch/x86/entry/common.c:166
 prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
 do_syscall_64+0x6be/0x820 arch/x86/entry/common.c:293
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at 8801c5794600
 which belongs to the cache kmalloc-2048 of size 2048
The buggy address is located 1151 bytes inside of
 2048-byte region [8801c5794600, 8801c5794e00)
The buggy address belongs to the page:

Re: [PATCH] mfd: remove redundant 'default n' from Kconfig

2018-10-10 Thread Lee Jones

On Wed, 10 Oct 2018, Bartlomiej Zolnierkiewicz wrote:

> 'default n' is the default value for any bool or tristate Kconfig
> setting so there is no need to write it explicitly.
> 
> Also since commit f467c5640c29 ("kconfig: only write '# CONFIG_FOO
> is not set' for visible symbols") the Kconfig behavior is the same
> regardless of 'default n' being present or not:
> 
> ...
> One side effect of (and the main motivation for) this change is making
> the following two definitions behave exactly the same:
> 
> config FOO
> bool
> 
> config FOO
> bool
> default n
> 
> With this change, neither of these will generate a
> '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied).
> That might make it clearer to people that a bare 'default n' is
> redundant.
> ...
> 
> Signed-off-by: Bartlomiej Zolnierkiewicz 
> ---
>  drivers/mfd/Kconfig |6 --
>  1 file changed, 6 deletions(-)

The change looks okay to me, but I would also like you to include the
Maintainers/Reviewers for the affected source files.

Also, I assume you are not just submitting these changes to the MFD
subsystem.  My suggestion is to change each subsystem per patch (as
you have done here), and submit them in one patch-set with each of the
subsystem Maintainers included, so each of us has some visibility into
how the general idea is being received.

> Index: b/drivers/mfd/Kconfig
> ===
> --- a/drivers/mfd/Kconfig 2018-10-09 15:58:40.547122978 +0200
> +++ b/drivers/mfd/Kconfig 2018-10-10 16:49:37.575915230 +0200
> @@ -8,7 +8,6 @@ menu "Multifunction device drivers"
>  config MFD_CORE
>   tristate
>   select IRQ_DOMAIN
> - default n
>  
>  config MFD_CS5535
>   tristate "AMD CS5535 and CS5536 southbridge core functions"
> @@ -870,7 +869,6 @@ config MFD_VIPERBOARD
>  tristate "Nano River Technologies Viperboard"
>   select MFD_CORE
>   depends on USB
> - default n
>   help
> Say yes here if you want support for Nano River Technologies
> Viperboard.
> @@ -1575,7 +1573,6 @@ config MFD_TWL4030_AUDIO
>   bool "TI TWL4030 Audio"
>   depends on TWL4030_CORE
>   select MFD_CORE
> - default n
>  
>  config TWL6040_CORE
>   bool "TI TWL6040 audio codec"
> @@ -1583,7 +1580,6 @@ config TWL6040_CORE
>   select MFD_CORE
>   select REGMAP_I2C
>   select REGMAP_IRQ
> - default n
>   help
> Say yes here if you want support for Texas Instruments TWL6040 audio
> codec.
> @@ -1605,7 +1601,6 @@ config MFD_WL1273_CORE
>   tristate "TI WL1273 FM radio"
>   depends on I2C
>   select MFD_CORE
> - default n
>   help
> This is the core driver for the TI WL1273 FM radio. This MFD
> driver connects the radio-wl1273 V4L2 module and the wl1273
> @@ -1649,7 +1644,6 @@ config MFD_TC3589X
>  
>  config MFD_TMIO
>   bool
> - default n
>  
>  config MFD_T7L66XB
>   bool "Toshiba T7L66XB"

-- 
Lee Jones [李琼斯]
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog

Re: [PATCH net-next v7 28/28] net: WireGuard secure network tunnel

2018-10-10 Thread Jiri Pirko

Wed, Oct 10, 2018 at 10:27:46PM CEST, ja...@zx2c4.com wrote:
>Hey Jiri,
>
>Actually, in the end I went with the suggestion from Andrew and Lukas,
>which is to follow Dan's guideline:
>https://lkml.org/lkml/2016/8/22/374 . It looks like this:
>
>https://git.kernel.org/pub/scm/linux/kernel/git/zx2c4/linux.git/tree/drivers/net/wireguard/device.c?h=jd/wireguard#n280

I prefer:
	err = do_something();
	if (err)
		goto err_do_something;

But your style is also quite common. Up to you, I guess.
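
To make the label-naming style preferred above concrete, here is a small
user-space sketch with two setup steps. Each label is named after the step
that failed, and the labels are stacked so that a failure unwinds only what
was already set up. struct demo_dev, demo_alloc() and demo_init() are
hypothetical names for illustration and are not taken from the WireGuard
code.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device with two resources acquired in order. */
struct demo_dev {
	char *rx_buf;
	char *tx_buf;
};

/* Hypothetical setup step; a non-zero 'fail' simulates an error. */
static int demo_alloc(char **buf, int fail)
{
	if (fail)
		return -ENOMEM;
	*buf = malloc(64);
	return *buf ? 0 : -ENOMEM;
}

static int demo_init(struct demo_dev *dev, int fail_rx, int fail_tx)
{
	int err;

	err = demo_alloc(&dev->rx_buf, fail_rx);
	if (err)
		goto err_alloc_rx;

	err = demo_alloc(&dev->tx_buf, fail_tx);
	if (err)
		goto err_alloc_tx;

	return 0;

	/* Labels are named after the failing step and unwind in reverse. */
err_alloc_tx:
	free(dev->rx_buf);
	dev->rx_buf = NULL;
err_alloc_rx:
	return err;
}
```

A failure in the second step frees the first buffer before returning, and a
failure in the first step returns without touching anything.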


>
>Jason

Re: [PATCH 2/2] target: Fix target_wait_for_sess_cmds breakage with active signals

2018-10-10 Thread Nicholas A. Bellinger

Hey Peter & Co,

On Wed, 2018-10-10 at 10:43 +0200, Peter Zijlstra wrote:
> On Wed, Oct 10, 2018 at 03:23:10AM +, Nicholas A. Bellinger wrote:
> > From: Nicholas Bellinger 
> > 
> > With the addition of commit 00d909a107 in v4.19-rc, it incorrectly assumes no
> > signals will be pending for task_struct executing the normal session shutdown
> > and I/O quiesce code-path.
> > 
> > For example, iscsi-target and iser-target issue SIGINT to all kthreads as
> > part of session shutdown.  This has been the behaviour since day one.
> 
> Not knowing much context here; but does it make sense for those
> kthreads to handle signals, ever? Most kthreads should be fine with
> ignore_signals().
> 

iscsi-target + ib-isert uses SIGINT amongst dedicated rx/tx connection
kthreads to signal connection shutdown, requiring in-flight se_cmd I/O
descriptors to be quiesced before making forward progress to release
se_session.

By the point wait_event_lock_irq_timeout() is called in the example
here, one of the two rx/tx connection kthreads has been stopped, and the
other kthread is still processing shutdown.  So while historically the
pending SIGINTs were not cleared (or ignored) during shutdown at this
point, there is no reason why they could not be ignored for iscsi-target
+ ib-isert.

That said, pre commit 00d909a107 code always used wait_for_completion()
and ignored pending signals.  As-is target_wait_for_sess_cmds() is
called directly from fabric driver code and in one case also from
user-space via configfs_write_file(), so AFAICT it does need
TASK_UNINTERRUPTIBLE.

Re: [GIT PULL] xfs: fixes for v4.19-rc7

2018-10-10 Thread Greg KH

On Thu, Oct 11, 2018 at 10:55:34AM +1100, Dave Chinner wrote:
> Hi Greg,
> 
> Can you please pull the XFS update from the tag listed below? This
> contains the fix for the clone_file_range data corruption issue I
> mentioned in my -rc6 pull request (zero post-eof blocks), as well as
> fixes for several other equally serious problems we found while
> auditing the clone/dedupe ioctls for other issues. The rest of the
> problems we found (there were a *lot*) will be addressed in the 4.20
> cycle.
> 
> Cheers,
> 
> Dave.
> 
> The following changes since commit e55ec4ddbef9897199c307dfb23167e3801fdaf5:
> 
>   xfs: fix error handling in xfs_bmap_extents_to_btree (2018-10-01 08:11:07 
> +1000)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/fs/xfs/xfs-linux tags/xfs-fixes-for-4.19-rc7

Now merged, thanks.

greg k-h

Re: [PATCH 05.1/16] of:overlay: missing name, phandle, linux,phandle in new nodes

2018-10-10 Thread Frank Rowand

On 10/10/18 14:03, Frank Rowand wrote:
> On 10/10/18 13:40, Alan Tull wrote:
>> On Wed, Oct 10, 2018 at 1:49 AM Frank Rowand  wrote:
>>>
>>> On 10/09/18 23:04, frowand.l...@gmail.com wrote:
>>>> From: Frank Rowand
>>>>
>>>> "of: overlay: use prop add changeset entry for property in new nodes"
>>>> fixed a problem where an 'update property' changeset entry was
>>>> created for properties contained in nodes added by a changeset.
>>>> The fix was to use an 'add property' changeset entry.
>>>>
>>>> This exposed more bugs in the apply overlay code.  The properties
>>>> 'name', 'phandle', and 'linux,phandle' were filtered out by
>>>> add_changeset_property() as special properties.  Change the filter
>>>> to be only for existing nodes, not newly added nodes.
>>>>
>>>> The second bug is that the 'name' property does not exist in the
>>>> newest FDT version, and has to be constructed from the node's
>>>> full_name.  Construct an 'add property' changeset entry for
>>>> newly added nodes.
>>>>
>>>> Signed-off-by: Frank Rowand
>>>> ---
>>>>
>>>> Hi Alan,
>>>>
>>>> Thanks for reporting the problem with missing node names.
>>>>
>>>> I was able to replicate the problem, and have created this preliminary
>>>> version of a patch to fix the problem.
>>>>
>>>> I have not extensively reviewed the patch yet, but would appreciate
>>>> if you can confirm this fixes your problem.
>>>>
>>>> I created this patch as patch 17 of the series, but have also
>>>> applied it as patch 05.1, immediately after patch 05/16, and
>>>> built the kernel, booted, and verified name and phandle for
>>>> one of the nodes in a unittest overlay for both cases.  So
>>>> minimal testing so far on my part.
>>>>
>>>> I have not verified whether the series builds and boots after
>>>> each of patches 06..16 if this patch is applied as patch 05.1.
>>>>
>>>> There is definitely more work needed for me to complete this
>>>> patch because it allocates some more memory, but does not yet
>>>> free it when the overlay is released.
>>>>
>>>> -Frank
>>>>
>>>>  drivers/of/overlay.c | 72
>>>>
>>>>  1 file changed, 67 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
>>>> index 0b0904f44bc7..9746cea2aa91 100644
>>>> --- a/drivers/of/overlay.c
>>>> +++ b/drivers/of/overlay.c
>>>> @@ -301,10 +301,11 @@ static int add_changeset_property(struct overlay_changeset *ovcs,
>>>>   struct property *new_prop = NULL, *prop;
>>>>   int ret = 0;
>>>>
>>>> - if (!of_prop_cmp(overlay_prop->name, "name") ||
>>>> - !of_prop_cmp(overlay_prop->name, "phandle") ||
>>>> - !of_prop_cmp(overlay_prop->name, "linux,phandle"))
>>>> - return 0;
>>>> + if (target->in_livetree)
>>>> + if (!of_prop_cmp(overlay_prop->name, "name") ||
>>>> + !of_prop_cmp(overlay_prop->name, "phandle") ||
>>>> + !of_prop_cmp(overlay_prop->name, "linux,phandle"))
>>>> + return 0;
>>>
>>> This is a big hammer patch.
>>>
>>> Nobody should waste time reviewing this patch.
>>
>> I wasn't clear if you still could use the testing so I did re-run my
>> test.  This patch adds back some of the missing properties, but the
>> kobject names aren't set as dev_name() returns NULL:
>>
>> * without this patch some of_node properties don't show up in sysfs:
>> root@arria10:~# ls
>> /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node
>> clocks  compatible  interrupt-parent  interrupts  reg
>>
>> * with this patch, the of_node properties phandle and name are back:
>> root@arria10:~#  ls
>> /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node
>> clocks  compatible  interrupt-parent  interrupts
>> name  phandle  reg
> 
> Thanks for the testing.  I'll keep chasing after this problem today.
> 
> This is useful data for me as I was not looking under the /sys/bus/...
> tree that you reported, but was instead looking at /proc/device-tree/...
> which showed the same type of problem since the overlay I was using
> does not show up under /sys/bus/...
> 
> I'll have to create a useful overlay test case that will show up under
> /sys/bus/...
> 
> In the meantime, can you send me the base FDT and the overlay FDT for
> your test case?

I now have a test case that shows the problem under /sys/bus/... so I
no longer need the base FDT and overlay FDT for your test case.

I have determined the location that sets the name to "" but do
not have the fix yet.  Still working on that.

-Frank

> 
> Thanks,
> 
> Frank
> 
> 
>>
>> root@arria10:~# cat
>> /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node/name
>> freeze_controllerroot@arria10:~#  ("freeze_controller" w/o the \n so
>> the name is correct)
>>
>> * with or without the patch I see the behavior I reported yesterday,
>> kobj names are NULL.
>> root@arria10:~# ls

Re: fore200e DMA cleanups and fixes

2018-10-10 Thread David Miller

From: Christoph Hellwig 
Date: Tue,  9 Oct 2018 16:57:13 +0200

> The fore200e driver came up during some dma-related audits, so
> here is the fallout.  Compile tested (x86 & sparc) only.

Series applied to net-next.

Re: [PATCH net-next V3] virtio_net: ethtool tx napi configuration

2018-10-10 Thread David Miller

From: Jason Wang 
Date: Tue,  9 Oct 2018 10:06:26 +0800

> Implement ethtool .set_coalesce (-C) and .get_coalesce (-c) handlers.
> Interrupt moderation is currently not supported, so these accept and
> display the default settings of 0 usec and 1 frame.
> 
> Toggle tx napi through setting tx-frames. So as to not interfere
> with possible future interrupt moderation, value 1 means tx napi while
> value 0 means not.
> 
> Only allow the switching when device is down for simplicity.
> 
> Link: https://patchwork.ozlabs.org/patch/948149/
> Suggested-by: Jason Wang 
> Signed-off-by: Willem de Bruijn 
> Signed-off-by: Jason Wang 
> ---
> Changes from V2:
> - only allow the switching when device is down
> - remove unnecessary global variable and initialization
> Changes from V1:
> - try to synchronize with datapath to allow changing mode when
>   interface is up.
> - use tx-frames 0 as to disable tx napi while tx-frames 1 to enable tx napi

Applied, with...

> + bool running = netif_running(dev);

this unused variable removed.
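
As a rough userspace sketch of the semantics described in the commit message (tx-frames 1 enables tx napi, 0 disables it, and switching is only allowed while the device is down), assuming a hypothetical helper named demo_set_tx_napi and assuming -EINVAL and -EBUSY as the error codes. This is not the driver's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of the tx-frames handling: 'tx_frames' is the
 * value passed via ethtool -C, '*tx_napi' is the current mode.
 */
static int demo_set_tx_napi(unsigned int tx_frames, bool dev_running,
			    bool *tx_napi)
{
	bool want;

	/* Only 0 (napi off) and 1 (napi on) are meaningful here. */
	if (tx_frames > 1)
		return -EINVAL;

	want = (tx_frames == 1);

	/* Changing the mode is refused while the device is up. */
	if (want != *tx_napi && dev_running)
		return -EBUSY;

	*tx_napi = want;
	return 0;
}
```

In this model a request that does not change the mode succeeds even on a running device, which is a design choice of the sketch rather than something stated in the thread.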

Re: netconsole warning in 4.19.0-rc7

2018-10-10 Thread Cong Wang

(Cc'ing Dave)

On Wed, Oct 10, 2018 at 5:14 AM Meelis Roos  wrote:
>
> Tried 4.19-rc7 on a bunch of test machines and got this warning from one.
> It is reproducible and I have not noticed it before.
>
[...]
> [9.914805] WARNING: CPU: 0 PID: 0 at kernel/softirq.c:168 
> __local_bh_enable_ip+0x2e/0x44
> [9.914806] Modules linked in:
> [9.914808] CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc7 #210
> [9.914810] Hardware name: MicroLink   /D850MV 
> , BIOS MV85010A.86A.0067.P24.0304081124 04/08/2003
> [9.914811] EIP: __local_bh_enable_ip+0x2e/0x44
> [9.914813] Code: cc 02 5f c8 a9 00 00 0f 00 75 1f 83 ea 01 f7 da 01 15 cc 
> 02 5f c8 a1 cc 02 5f c8 a9 00 ff 1f 00 74 0c ff 0d cc 02 5f c8 5d c3 <0f> 0b 
> eb dd 66 a1 80 cd 5e c8 66 85 c0 74 e9 e8 87 ff ff ff eb e2
> [9.914814] EAX: 80010200 EBX: f602b000 ECX: 36346270 EDX: 0200
> [9.914815] ESI: f620ecc0 EDI: f620ebac EBP: f600de40 ESP: f600de40
> [9.914816] DS: 007b ES: 007b FS:  GS: 00e0 SS: 0068 EFLAGS: 00010006
> [9.914817] CR0: 80050033 CR2: b7f5f000 CR3: 36389000 CR4: 06d0
> [9.914818] Call Trace:
> [9.914819]  
> [9.914820]  netpoll_send_skb_on_dev+0xa5/0x1b0

This is exactly what I mentioned in my review here:
https://marc.info/?l=linux-netdev=153816136624679=2

"But irq is disabled here, so not sure if rcu_read_lock_bh()
could cause trouble... "

[PATCH v10 0/3] powerpc: Detection and scheduler optimization for POWER9 bigcore

2018-10-10 Thread Gautham R. Shenoy

From: "Gautham R. Shenoy" 

Hi,

This is the tenth iteration of the patchset to add support for
big-core on POWER9. This patch also optimizes the task placement on
such big-core systems.

The previous versions can be found here:
v9: https://lkml.org/lkml/2018/10/1/608
v8: https://lkml.org/lkml/2018/9/20/899
v7: https://lkml.org/lkml/2018/8/20/52
v6: https://lkml.org/lkml/2018/8/9/119
v5: https://lkml.org/lkml/2018/8/6/587
v4: https://lkml.org/lkml/2018/7/24/79
v3: https://lkml.org/lkml/2018/7/6/255
v2: https://lkml.org/lkml/2018/7/3/401
v1: https://lkml.org/lkml/2018/5/11/245

Changes :
v9 --> v10:
   - Rebased it on v4.19-rc7
   - Added a patch to report the correct shared_cpu_map for L1-caches
   on big-core systems.

Description:


IBM POWER9 SMT8 cores consist of two groups of small-cores where each
group has its own L1 cache, translation cache and instruction-data
flow. This can be discovered via the "ibm,thread-groups" CPU property
in the device tree. Furthermore, on POWER9 the thread-ids of such a
big-core are obtained by interleaving the thread-ids of the two
small-cores.

Eg: In an SMT8 core with thread ids {0,1,2,3,4,5,6,7}, the thread-ids
of the threads in the two small-cores will be {0,2,4,6} and {1,3,5,7}
respectively.
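
The interleaving can be sketched in plain C as follows. This is a userspace illustration only; the helper names small_core_of and small_core_threads are hypothetical, and the sketch assumes exactly two small cores per 8-thread big core as in the example above.

```c
#include <assert.h>

#define THREADS_PER_BIG_CORE	8
#define NR_SMALL_CORES		2

/* Small core that a thread of the big core belongs to (interleaved ids). */
static int small_core_of(int thread_id)
{
	assert(thread_id >= 0 && thread_id < THREADS_PER_BIG_CORE);
	return thread_id % NR_SMALL_CORES;
}

/* Fill out[] with the thread-ids of small core 'sc'; returns the count. */
static int small_core_threads(int sc, int out[THREADS_PER_BIG_CORE])
{
	int t, n = 0;

	for (t = 0; t < THREADS_PER_BIG_CORE; t++)
		if (small_core_of(t) == sc)
			out[n++] = t;
	return n;
}
```

With these definitions, small_core_threads(0, out) yields {0,2,4,6} and small_core_threads(1, out) yields {1,3,5,7}.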

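          -------------------------
          |       L1 Cache        |
       ----------------------------
       |L2|     |     |     |     |
       |  |  0  |  2  |  4  |  6  |  Small Core0
       |C |     |     |     |     |
Big    |a -------------------------
Core   |c |     |     |     |     |
       |h |  1  |  3  |  5  |  7  |  Small Core1
       |e |     |     |     |     |
       ----------------------------
          |       L1 Cache        |
          -------------------------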

On such a big-core system, when multiple tasks are scheduled to run on
the big-core, we get the best performance when the tasks are spread
across the pair of small-cores.

Eg: Suppose 4 tasks {p1, p2, p3, p4} are run on a big core, then

An Example of Optimal Task placement:
           -------------------------
           |     |     |     |     |
           |  0  |  2  |  4  |  6  |   Small Core0
           | (p1)| (p2)|     |     |
Big Core   -------------------------
           |     |     |     |     |
           |  1  |  3  |  5  |  7  |   Small Core1
           |     | (p3)|     | (p4)|
           -------------------------

An example of Suboptimal Task placement:
           -------------------------
           |     |     |     |     |
           |  0  |  2  |  4  |  6  |   Small Core0
           | (p1)| (p2)|     | (p4)|
Big Core   -------------------------
           |     |     |     |     |
           |  1  |  3  |  5  |  7  |   Small Core1
           |     | (p3)|     |     |
           -------------------------

Currently on the big-core systems, the sched domain hierarchy is:

SMT   : group of CPUs in the SMT8 core.
DIE   : groups of CPUs on the same die.
NUMA  : all the CPUs in the system.

Thus the scheduler doesn't distinguish between CPUs in the core that
share the L1-cache and the ones that don't, resulting in run-to-run
variance when multithreaded applications are run on an SMT8 core.

In this patch-set, we address this by defining the sched-domain on the
big-core systems to be:

SMT   : group of CPUs sharing the L1 cache
CACHE : group of CPUs in the SMT8 core.
DIE   : groups of CPUs on the same die.
NUMA  : all the CPUs in the system.

With this, the Linux Kernel load-balancer will ensure that the tasks
are spread across all the component small cores in the system, thereby
yielding optimum performance.

Furthermore, this solution works correctly across all SMT modes
(8,4,2), as the interleaved thread-ids ensure that when we go to
lower SMT modes (4,2) the threads are offlined in a descending order,
thereby leaving an equal number of threads from the component small
cores online.
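
The balance across SMT modes can be checked with a small userspace sketch. It assumes that in SMT mode N the threads with ids 0..N-1 of the big core stay online (which is what offlining in descending order implies) and that thread t belongs to small core t % 2. The helper name online_in_small_core is hypothetical.

```c
#include <assert.h>

/*
 * Number of online threads in small core 'sc' (0 or 1) when the big
 * core runs in SMT mode 'smt_mode', i.e. threads 0..smt_mode-1 are
 * online and the higher-numbered threads have been offlined.
 */
static int online_in_small_core(int smt_mode, int sc)
{
	int t, n = 0;

	assert(smt_mode >= 1 && smt_mode <= 8);
	assert(sc == 0 || sc == 1);

	for (t = 0; t < smt_mode; t++)
		if (t % 2 == sc)
			n++;
	return n;
}
```

Under these assumptions SMT8 leaves 4 threads per small core, SMT4 leaves 2 ({0,2} and {1,3}), and SMT2 leaves 1 ({0} and {1}).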

This patchset contains three patches which, on detecting the presence
of big-cores, define the SMT level sched domain to correspond to the
threads of the small cores.

Patch 1: Adds support to detect the presence of big-cores and parses
the "ibm,thread-groups" device-tree property, using which it updates
a per-cpu mask named cpu_smallcore_mask.

Patch 2: Defines the SMT level sched domain to correspond to the
threads of the small cores.

Patch 3: Reports the correct shared_cpu_map for L1-caches on big-core
systems.

   Without patch 3:
   /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00ff

With patch 3:
   /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 0055
   /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map

[PATCH v10 2/3] powerpc: Use cpu_smallcore_sibling_mask at SMT level on bigcores

2018-10-10 Thread Gautham R. Shenoy

From: "Gautham R. Shenoy" 

POWER9 SMT8 cores consist of two groups of threads, where threads in
each group share an L1-cache. The scheduler is not aware of this
distinction as the current sched-domain hierarchy has all the threads
of the core defined at the SMT domain.

SMT  [Thread siblings of the SMT8 core]
DIE  [CPUs in the same die]
NUMA [All the CPUs in the system]

Due to this, we can observe run-to-run variance when we run a
multi-threaded benchmark bound to a single core based on how the
scheduler spreads the software threads across the two groups in the
core.

We fix this in this patch by defining each group of threads which
share L1-cache to be the SMT level. The group of threads in the SMT8
core is defined to be the CACHE level. The sched-domain hierarchy
after this patch will be :

SMT [Thread siblings in the core that share L1 cache]
CACHE   [Thread siblings that are in the SMT8 core]
DIE [CPUs in the same die]
NUMA[All the CPUs in the system]

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/kernel/smp.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 22a14a9..356751e 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1266,6 +1266,7 @@ static void add_cpu_to_masks(int cpu)
 void start_secondary(void *unused)
 {
unsigned int cpu = smp_processor_id();
+   struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
 
mmgrab(&init_mm);
current->active_mm = &init_mm;
@@ -1291,11 +1292,13 @@ void start_secondary(void *unused)
/* Update topology CPU masks */
add_cpu_to_masks(cpu);
 
+   if (has_big_cores)
+   sibling_mask = cpu_smallcore_mask;
/*
 * Check for any shared caches. Note that this must be done on a
 * per-core basis because one core in the pair might be disabled.
 */
-   if (!cpumask_equal(cpu_l2_cache_mask(cpu), cpu_sibling_mask(cpu)))
+   if (!cpumask_equal(cpu_l2_cache_mask(cpu), sibling_mask(cpu)))
shared_caches = true;
 
set_numa_node(numa_cpu_lookup_table[cpu]);
@@ -1362,6 +1365,13 @@ static const struct cpumask *shared_cache_mask(int cpu)
return cpu_l2_cache_mask(cpu);
 }
 
+#ifdef CONFIG_SCHED_SMT
+static const struct cpumask *smallcore_smt_mask(int cpu)
+{
+   return cpu_smallcore_mask(cpu);
+}
+#endif
+
 static struct sched_domain_topology_level power9_topology[] = {
 #ifdef CONFIG_SCHED_SMT
{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
@@ -1389,6 +1399,13 @@ void __init smp_cpus_done(unsigned int max_cpus)
shared_proc_topology_init();
dump_numa_cpu_topology();
 
+#ifdef CONFIG_SCHED_SMT
+   if (has_big_cores) {
+   pr_info("Using small cores at SMT level\n");
+   power9_topology[0].mask = smallcore_smt_mask;
+   powerpc_topology[0].mask = smallcore_smt_mask;
+   }
+#endif
/*
 * If any CPU detects that it's sharing a cache with another CPU then
 * use the deeper topology that is aware of this sharing.
-- 
1.9.4

[PATCH v10 3/3] powerpc/cacheinfo: Report the correct shared_cpu_map on big-cores

2018-10-10 Thread Gautham R. Shenoy

From: "Gautham R. Shenoy" 

Currently on POWER9 SMT8 core systems, in sysfs, we report the
shared_cpu_map for L1 caches (both data and instruction) to be the
cpu-ids of the threads in SMT8 cores. This is incorrect since on
POWER9 SMT8 cores there are two groups of threads, each of which
shares its own L1 cache.

This patch addresses this by reporting the shared_cpu_map correctly in
sysfs for L1 caches.

Before the patch
   /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00ff

After the patch
   /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 0055
   /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 0055
   /sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00aa
   /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00aa
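
The 0055 and 00aa values follow directly from the interleaved thread-ids. A minimal userspace sketch of the arithmetic, assuming 8 consecutively numbered CPUs per big core and CPU numbers below 32 so the mask fits in one unsigned int (the helper name l1_shared_cpu_map is hypothetical and is not the kernel function):

```c
#include <assert.h>

#define THREADS_PER_BIG_CORE	8

/*
 * Bitmask of the CPUs that share an L1 cache with 'cpu': the threads
 * of the same big core whose thread-id has the same parity.
 */
static unsigned int l1_shared_cpu_map(int cpu)
{
	unsigned int mask = 0;
	int first, t;

	assert(cpu >= 0 && cpu < 32);

	/* First CPU number of the big core that contains 'cpu'. */
	first = cpu - cpu % THREADS_PER_BIG_CORE;

	for (t = first; t < first + THREADS_PER_BIG_CORE; t++)
		if (t % 2 == cpu % 2)
			mask |= 1u << t;
	return mask;
}
```

For cpu0 this sets bits 0, 2, 4 and 6 (0x55), and for cpu1 bits 1, 3, 5 and 7 (0xaa), matching the sysfs output after the patch.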

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/kernel/cacheinfo.c | 37 +++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index a8f20e5..be57bd0 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -20,6 +20,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "cacheinfo.h"
 
@@ -627,17 +629,48 @@ static ssize_t level_show(struct kobject *k, struct 
kobj_attribute *attr, char *
 static struct kobj_attribute cache_level_attr =
__ATTR(level, 0444, level_show, NULL);
 
+static unsigned int index_dir_to_cpu(struct cache_index_dir *index)
+{
+   struct kobject *index_dir_kobj = &index->kobj;
+   struct kobject *cache_dir_kobj = index_dir_kobj->parent;
+   struct kobject *cpu_dev_kobj = cache_dir_kobj->parent;
+   struct device *dev = kobj_to_dev(cpu_dev_kobj);
+
+   return dev->id;
+}
+
+/*
+ * On big-core systems, each core has two groups of CPUs each of which
+ * has its own L1-cache. The thread-siblings which share l1-cache with
+ * @cpu can be obtained via cpu_smallcore_mask().
+ */
+static const struct cpumask *get_big_core_shared_cpu_map(int cpu, struct cache 
*cache)
+{
+   if (cache->level == 1)
+   return cpu_smallcore_mask(cpu);
+
+   return &cache->shared_cpu_map;
+}
+
 static ssize_t shared_cpu_map_show(struct kobject *k, struct kobj_attribute 
*attr, char *buf)
 {
struct cache_index_dir *index;
struct cache *cache;
-   int ret;
+   const struct cpumask *mask;
+   int ret, cpu;
 
index = kobj_to_cache_index_dir(k);
cache = index->cache;
 
+   if (has_big_cores) {
+   cpu = index_dir_to_cpu(index);
+   mask = get_big_core_shared_cpu_map(cpu, cache);
+   } else {
+   mask  = &cache->shared_cpu_map;
+   }
+
ret = scnprintf(buf, PAGE_SIZE - 1, "%*pb\n",
-   cpumask_pr_args(&cache->shared_cpu_map));
+   cpumask_pr_args(mask));
buf[ret++] = '\n';
buf[ret] = '\0';
return ret;
-- 
1.9.4

[PATCH v10 1/3] powerpc: Detect the presence of big-cores via "ibm,thread-groups"

2018-10-10 Thread Gautham R. Shenoy

From: "Gautham R. Shenoy" 

On IBM POWER9, the device tree exposes a property array identified by
"ibm,thread-groups" which will indicate which groups of threads share
a particular set of resources.

As of today we only have one form of grouping identifying the group of
threads in the core that share the L1 cache, translation cache and
instruction data flow.

This patch adds helper functions to parse the contents of
"ibm,thread-groups" and populate a per-cpu variable to cache
information about siblings of each CPU that share the L1, translation
cache and instruction data-flow.

It also defines a new global variable named "has_big_cores" which
indicates if the cores on this configuration have multiple groups of
threads that share L1 cache.

For each online CPU, it maintains a cpu_smallcore_mask, which
indicates the online siblings which share the L1-cache with it.
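
A userspace sketch of what such a parse helper might look like is given below. The array layout used here (property, number of groups, threads per group, then the thread ids group by group) is an assumption that mirrors the field order of struct thread_groups in this patch; the function names parse_thread_groups_array and thread_group_of are hypothetical and this is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THREAD_LIST_SIZE	8

struct thread_groups {
	unsigned int property;
	unsigned int nr_groups;
	unsigned int threads_per_group;
	unsigned int thread_list[MAX_THREAD_LIST_SIZE];
};

/* Parse an assumed [property, nr_groups, threads_per_group, ids...] array. */
static int parse_thread_groups_array(const unsigned int *prop, size_t len,
				     struct thread_groups *tg)
{
	unsigned int i, total;

	if (len < 3)
		return -1;
	/* Bound each factor first so the product cannot overflow. */
	if (prop[1] > MAX_THREAD_LIST_SIZE || prop[2] > MAX_THREAD_LIST_SIZE)
		return -1;

	total = prop[1] * prop[2];
	if (total > MAX_THREAD_LIST_SIZE || len < 3 + (size_t)total)
		return -1;

	tg->property = prop[0];
	tg->nr_groups = prop[1];
	tg->threads_per_group = prop[2];
	for (i = 0; i < total; i++)
		tg->thread_list[i] = prop[3 + i];
	return 0;
}

/* Index of the group that contains 'hw_id', or -1 if it is not listed. */
static int thread_group_of(const struct thread_groups *tg, unsigned int hw_id)
{
	unsigned int i, total = tg->nr_groups * tg->threads_per_group;

	for (i = 0; i < total; i++)
		if (tg->thread_list[i] == hw_id)
			return (int)(i / tg->threads_per_group);
	return -1;
}
```

For the array {1, 2, 4, 0, 2, 4, 6, 1, 3, 5, 7} this yields two groups of four threads, with thread 4 in group 0 and thread 5 in group 1.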

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/include/asm/cputhreads.h |   2 +
 arch/powerpc/include/asm/smp.h|  11 ++
 arch/powerpc/kernel/smp.c | 222 ++
 3 files changed, 235 insertions(+)

diff --git a/arch/powerpc/include/asm/cputhreads.h 
b/arch/powerpc/include/asm/cputhreads.h
index d71a909..deb99fd 100644
--- a/arch/powerpc/include/asm/cputhreads.h
+++ b/arch/powerpc/include/asm/cputhreads.h
@@ -23,11 +23,13 @@
 extern int threads_per_core;
 extern int threads_per_subcore;
 extern int threads_shift;
+extern bool has_big_cores;
 extern cpumask_t threads_core_mask;
 #else
 #define threads_per_core   1
 #define threads_per_subcore1
 #define threads_shift  0
+#define has_big_cores  0
 #define threads_core_mask  (*get_cpu_mask(0))
 #endif
 
diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 95b66a0..4169574 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -100,6 +100,7 @@ static inline void set_hard_smp_processor_id(int cpu, int 
phys)
 DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
 DECLARE_PER_CPU(cpumask_var_t, cpu_l2_cache_map);
 DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
+DECLARE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
 
 static inline struct cpumask *cpu_sibling_mask(int cpu)
 {
@@ -116,6 +117,11 @@ static inline struct cpumask *cpu_l2_cache_mask(int cpu)
return per_cpu(cpu_l2_cache_map, cpu);
 }
 
+static inline struct cpumask *cpu_smallcore_mask(int cpu)
+{
+   return per_cpu(cpu_smallcore_map, cpu);
+}
+
 extern int cpu_to_core_id(int cpu);
 
 /* Since OpenPIC has only 4 IPIs, we use slightly different message numbers.
@@ -166,6 +172,11 @@ static inline const struct cpumask *cpu_sibling_mask(int 
cpu)
return cpumask_of(cpu);
 }
 
+static inline const struct cpumask *cpu_smallcore_mask(int cpu)
+{
+   return cpumask_of(cpu);
+}
+
 #endif /* CONFIG_SMP */
 
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 61c1fad..22a14a9 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -74,14 +74,32 @@
 #endif
 
 struct thread_info *secondary_ti;
+bool has_big_cores;
 
 DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
+DEFINE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_l2_cache_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
 
 EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
 EXPORT_PER_CPU_SYMBOL(cpu_l2_cache_map);
 EXPORT_PER_CPU_SYMBOL(cpu_core_map);
+EXPORT_SYMBOL_GPL(has_big_cores);
+
+#define MAX_THREAD_LIST_SIZE   8
+#define THREAD_GROUP_SHARE_L1   1
+struct thread_groups {
+   unsigned int property;
+   unsigned int nr_groups;
+   unsigned int threads_per_group;
+   unsigned int thread_list[MAX_THREAD_LIST_SIZE];
+};
+
+/*
+ * On big-core systems, cpu_l1_cache_map for each CPU corresponds to
+ * the set of its siblings that share the L1-cache.
+ */
+DEFINE_PER_CPU(cpumask_var_t, cpu_l1_cache_map);
 
 /* SMP operations for this machine */
 struct smp_ops_t *smp_ops;
@@ -674,6 +692,185 @@ static void set_cpus_unrelated(int i, int j,
 }
 #endif
 
+/*
+ * parse_thread_groups: Parses the "ibm,thread-groups" device tree
+ *  property for the CPU device node @dn and stores
+ *  the parsed output in the thread_groups
+ *  structure @tg if the ibm,thread-groups[0]
+ *  matches @property.
+ *
+ * @dn: The device node of the CPU device.
+ * @tg: Pointer to a thread group structure into which the parsed
+ *  output of "ibm,thread-groups" is stored.
+ * @property: The property of the thread-group that the caller is
+ *interested in.
+ *
+ * ibm,thread-groups[0..N-1] array defines which group of threads in
+ * the CPU-device node can be grouped together based on the property.
+ *
+ * ibm,thread-groups[0] tells us the property based on which the
+ * threads are being grouped together. If this value is 1, it implies
+ * that the threads in the same group share L1, translation

[PATCH v10 0/3] powerpc: Detection and scheduler optimization for POWER9 bigcore

2018-10-10 Thread Gautham R. Shenoy

From: "Gautham R. Shenoy" 

Hi,

This is the tenth iteration of the patchset to add support for
big-core on POWER9. This patchset also optimizes the task placement on
such big-core systems.

The previous versions can be found here:
v9: https://lkml.org/lkml/2018/10/1/608
v8: https://lkml.org/lkml/2018/9/20/899
v7: https://lkml.org/lkml/2018/8/20/52
v6: https://lkml.org/lkml/2018/8/9/119
v5: https://lkml.org/lkml/2018/8/6/587
v4: https://lkml.org/lkml/2018/7/24/79
v3: https://lkml.org/lkml/2018/7/6/255
v2: https://lkml.org/lkml/2018/7/3/401
v1: https://lkml.org/lkml/2018/5/11/245

Changes :
v9 --> v10:
   - Rebased it on v4.19-rc7
   - Added a patch to report the correct shared_cpu_map for L1-caches
   on big-core systems.

Description:


IBM POWER9 SMT8 cores consist of two groups of threads (small-cores),
where each group has its own L1 cache, translation cache and
instruction-data flow. This can be discovered via the
"ibm,thread-groups" CPU property in the device tree. Furthermore, on
POWER9 the thread-ids of such a big-core are obtained by interleaving
the thread-ids of the two small-cores.

Eg: In an SMT8 core with thread ids {0,1,2,3,4,5,6,7}, the thread-ids
of the threads in the two small-cores will be {0,2,4,6} and {1,3,5,7}
respectively.

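          -------------------------
          |  L1 Cache             |
       ----------------------------
       |L2|     |     |     |     |
       |  |  0  |  2  |  4  |  6  |  Small Core0
       |C |     |     |     |     |
Big    |a -------------------------
Core   |c |     |     |     |     |
       |h |  1  |  3  |  5  |  7  |  Small Core1
       |e |     |     |     |     |
       ----------------------------
          |  L1 Cache             |
          -------------------------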
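
The interleaving described above can be sketched in user-space C. This
is only an illustration, not code from the patchset: the constants and
the helper names small_core_of() and shares_l1() are hypothetical, and
the sketch assumes thread ids are numbered 0..7 within one big core.

```c
#include <assert.h>

/* Hypothetical constants for this sketch; not taken from the patch. */
#define THREADS_PER_BIG_CORE	8
#define SMALL_CORES_PER_BIG	2

/*
 * Map a thread id within a big core (0..7) to its small core,
 * assuming the interleaved numbering: even ids belong to small
 * core 0, odd ids to small core 1.
 */
static int small_core_of(int thread_id)
{
	assert(thread_id >= 0 && thread_id < THREADS_PER_BIG_CORE);
	return thread_id % SMALL_CORES_PER_BIG;
}

/* Two threads share an L1 cache iff they sit in the same small core. */
static int shares_l1(int a, int b)
{
	return small_core_of(a) == small_core_of(b);
}
```

With this numbering, threads 2 and 4 share an L1 cache while threads 2
and 3 do not.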

On such a big-core system, when multiple tasks are scheduled to run on
the big-core, we get the best performance when the tasks are spread
across the pair of small-cores.

Eg: Suppose 4 tasks {p1, p2, p3, p4} are run on a big core, then

An example of optimal task placement:
           -------------------------
           |     |     |     |     |
           |  0  |  2  |  4  |  6  |   Small Core0
           | (p1)| (p2)|     |     |
Big Core   -------------------------
           |     |     |     |     |
           |  1  |  3  |  5  |  7  |   Small Core1
           |     | (p3)|     | (p4)|
           -------------------------

An example of suboptimal task placement:
           -------------------------
           |     |     |     |     |
           |  0  |  2  |  4  |  6  |   Small Core0
           | (p1)| (p2)|     | (p4)|
Big Core   -------------------------
           |     |     |     |     |
           |  1  |  3  |  5  |  7  |   Small Core1
           |     | (p3)|     |     |
           -------------------------
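
One way to quantify the difference between the two placements is to
count the tasks per small core. The sketch below is an illustration
only: small_core_imbalance() is a hypothetical helper, not part of the
patchset or the scheduler, and it assumes the interleaved thread
numbering (even ids in small core 0, odd ids in small core 1).

```c
#include <assert.h>
#include <stdlib.h>

#define NR_SMALL_CORES	2

/*
 * Given the thread ids that currently run a task, return the
 * difference in task count between the two small cores.
 * 0 means the tasks are spread evenly.
 */
static int small_core_imbalance(const int *busy_threads, int nr_busy)
{
	int load[NR_SMALL_CORES] = { 0, 0 };
	int i;

	for (i = 0; i < nr_busy; i++) {
		assert(busy_threads[i] >= 0 && busy_threads[i] < 8);
		load[busy_threads[i] % NR_SMALL_CORES]++;
	}
	return abs(load[0] - load[1]);
}
```

For the optimal placement (tasks on threads 0, 2, 3, 7) the imbalance
is 0; for the suboptimal one (threads 0, 2, 6, 3) it is 2.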

Currently on the big-core systems, the sched domain hierarchy is:

SMT   : group of CPUs in the SMT8 core.
DIE   : groups of CPUs on the same die.
NUMA  : all the CPUs in the system.

Thus the scheduler doesn't distinguish between CPUs in the core that
share the L1-cache and the ones that don't, resulting in run-to-run
variance when multithreaded applications are run on an SMT8 core.

In this patch-set, we address this by defining the sched-domain on the
big-core systems to be:

SMT   : group of CPUs sharing the L1 cache
CACHE : group of CPUs in the SMT8 core.
DIE   : groups of CPUs on the same die.
NUMA  : all the CPUs in the system.

With this, the Linux Kernel load-balancer will ensure that the tasks
are spread across all the component small cores in the system, thereby
yielding optimum performance.

Furthermore, this solution works correctly across all SMT modes
(8,4,2), as the interleaved thread-ids ensure that when we go to
lower SMT modes (4,2) the threads are offlined in a descending order,
thereby leaving an equal number of threads from the component small
cores online.
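
The effect of the interleaved numbering on lower SMT modes can be
checked with a small user-space sketch. It is an illustration under
stated assumptions, not kernel code: online_in_small_core() is a
hypothetical helper, and it assumes that in SMT mode N only threads
0..N-1 of the big core stay online.

```c
#include <assert.h>

/*
 * Count the online threads of one small core when the big core runs
 * in SMT mode @smt_mode (8, 4 or 2). Assumes threads smt_mode..7 are
 * offline and ids are interleaved (even -> small core 0, odd -> small
 * core 1).
 */
static int online_in_small_core(int smt_mode, int small_core)
{
	int t, count = 0;

	assert(smt_mode >= 1 && smt_mode <= 8);
	assert(small_core == 0 || small_core == 1);

	for (t = 0; t < smt_mode; t++)
		if (t % 2 == small_core)
			count++;
	return count;
}
```

In SMT4 each small core keeps two threads online ({0,2} and {1,3}),
and in SMT2 each keeps one ({0} and {1}).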

This patchset contains three patches which, on detecting the presence
of big-cores, define the SMT level sched domain to correspond to the
threads of the small cores.

Patch 1: Adds support to detect the presence of big-cores and parses
the "ibm,thread-groups" device-tree property, using which it updates
a per-cpu mask named cpu_smallcore_mask.

Patch 2: Defines the SMT level sched domain to correspond to the
threads of the small cores.

Patch 3: Reports the correct shared_cpu_map for L1-caches on big-core
systems.

Without patch 3:
   /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00ff

With patch 3:
   /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 0055
   /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map

[PATCH v10 2/3] powerpc: Use cpu_smallcore_sibling_mask at SMT level on bigcores

2018-10-10 Thread Gautham R. Shenoy

From: "Gautham R. Shenoy" 

POWER9 SMT8 cores consist of two groups of threads, where threads in
each group share the L1-cache. The scheduler is not aware of this
distinction as the current sched-domain hierarchy has all the threads
of the core defined at the SMT domain.

SMT  [Thread siblings of the SMT8 core]
DIE  [CPUs in the same die]
NUMA [All the CPUs in the system]

Due to this, we can observe run-to-run variance when we run a
multi-threaded benchmark bound to a single core based on how the
scheduler spreads the software threads across the two groups in the
core.

We fix this in this patch by defining each group of threads which
share L1-cache to be the SMT level. The group of threads in the SMT8
core is defined to be the CACHE level. The sched-domain hierarchy
after this patch will be :

SMT [Thread siblings in the core that share L1 cache]
CACHE   [Thread siblings that are in the SMT8 core]
DIE [CPUs in the same die]
NUMA[All the CPUs in the system]
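
The idea of redefining the lowest topology level can be sketched
outside the kernel as a table of mask callbacks whose first entry is
swapped when big cores are present. Everything here is a simplified
stand-in: mask_t, struct topo_level, big_core_mask(),
small_core_mask() and setup_topology() are hypothetical names, CPUs
are limited to 0..31, and each group of eight consecutive CPUs is
assumed to form one big core with interleaved thread ids.

```c
#include <assert.h>
#include <stdbool.h>

/* One bit per CPU; a stand-in for the kernel's struct cpumask. */
typedef unsigned int mask_t;
typedef mask_t (*mask_fn)(int cpu);

struct topo_level {
	mask_fn mask;
	const char *name;
};

/* All eight threads of the big core that contains @cpu. */
static mask_t big_core_mask(int cpu)
{
	assert(cpu >= 0 && cpu < 32);
	return 0xffu << (cpu / 8 * 8);
}

/* Only the threads sharing an L1 cache with @cpu (interleaved ids). */
static mask_t small_core_mask(int cpu)
{
	assert(cpu >= 0 && cpu < 32);
	return (cpu % 2 ? 0xaau : 0x55u) << (cpu / 8 * 8);
}

static struct topo_level topology[] = {
	{ big_core_mask, "SMT" },
	{ big_core_mask, "CACHE" },
};

/* Narrow the SMT level to the small core when big cores are present. */
static void setup_topology(bool has_big_cores)
{
	if (has_big_cores)
		topology[0].mask = small_core_mask;
}
```

After setup_topology(true), the SMT level of CPU 0 covers only CPUs
{0,2,4,6} while the CACHE level still covers the whole SMT8 core.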

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/kernel/smp.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 22a14a9..356751e 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1266,6 +1266,7 @@ static void add_cpu_to_masks(int cpu)
 void start_secondary(void *unused)
 {
unsigned int cpu = smp_processor_id();
+   struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
 
mmgrab(&init_mm);
current->active_mm = &init_mm;
@@ -1291,11 +1292,13 @@ void start_secondary(void *unused)
/* Update topology CPU masks */
add_cpu_to_masks(cpu);
 
+   if (has_big_cores)
+   sibling_mask = cpu_smallcore_mask;
/*
 * Check for any shared caches. Note that this must be done on a
 * per-core basis because one core in the pair might be disabled.
 */
-   if (!cpumask_equal(cpu_l2_cache_mask(cpu), cpu_sibling_mask(cpu)))
+   if (!cpumask_equal(cpu_l2_cache_mask(cpu), sibling_mask(cpu)))
shared_caches = true;
 
set_numa_node(numa_cpu_lookup_table[cpu]);
@@ -1362,6 +1365,13 @@ static const struct cpumask *shared_cache_mask(int cpu)
return cpu_l2_cache_mask(cpu);
 }
 
+#ifdef CONFIG_SCHED_SMT
+static const struct cpumask *smallcore_smt_mask(int cpu)
+{
+   return cpu_smallcore_mask(cpu);
+}
+#endif
+
 static struct sched_domain_topology_level power9_topology[] = {
 #ifdef CONFIG_SCHED_SMT
{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
@@ -1389,6 +1399,13 @@ void __init smp_cpus_done(unsigned int max_cpus)
shared_proc_topology_init();
dump_numa_cpu_topology();
 
+#ifdef CONFIG_SCHED_SMT
+   if (has_big_cores) {
+   pr_info("Using small cores at SMT level\n");
+   power9_topology[0].mask = smallcore_smt_mask;
+   powerpc_topology[0].mask = smallcore_smt_mask;
+   }
+#endif
/*
 * If any CPU detects that it's sharing a cache with another CPU then
 * use the deeper topology that is aware of this sharing.
-- 
1.9.4

[PATCH v10 3/3] powerpc/cacheinfo: Report the correct shared_cpu_map on big-cores

2018-10-10 Thread Gautham R. Shenoy

From: "Gautham R. Shenoy" 

Currently on POWER9 SMT8 core systems, in sysfs, we report the
shared_cpu_map for L1 caches (both data and instruction) to be the
cpu-ids of the threads in SMT8 cores. This is incorrect since on
POWER9 SMT8 cores there are two groups of threads, each of which
shares its own L1 cache.

This patch addresses this by reporting the shared_cpu_map correctly in
sysfs for L1 caches.

Before the patch
   /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00ff

After the patch
   /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 0055
   /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 0055
   /sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00aa
   /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00aa
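
The values above can be reproduced with a short user-space sketch. It
is not the kernel implementation: shared_cpu_mask() is a hypothetical
helper, CPUs are limited to 0..31, thread ids are assumed to be
interleaved within each group of eight CPUs, and every cache level
other than L1 is assumed to be shared by the whole SMT8 core.

```c
#include <assert.h>

/*
 * Return a bitmask (bit n = CPU n) of the CPUs that share the cache
 * at @cache_level with @cpu. For L1 only the small-core siblings are
 * reported; for other levels the whole big core is reported.
 */
static unsigned int shared_cpu_mask(int cpu, int cache_level)
{
	unsigned int base, t, mask = 0;

	assert(cpu >= 0 && cpu < 32);
	base = (unsigned int)cpu / 8 * 8;	/* first thread of the big core */

	for (t = base; t < base + 8; t++) {
		/* Skip threads of the other small core for L1 caches. */
		if (cache_level == 1 && t % 2 != (unsigned int)cpu % 2)
			continue;
		mask |= 1u << t;
	}
	return mask;
}
```

For cpu0 and cpu1 this yields 0x55 and 0xaa at level 1, matching the
"After the patch" output, and 0xff at other levels.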

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/kernel/cacheinfo.c | 37 +++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index a8f20e5..be57bd0 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -20,6 +20,8 @@
 #include <linux/percpu.h>
 #include <linux/slab.h>
 #include <asm/prom.h>
+#include <asm/cputhreads.h>
+#include <asm/smp.h>
 
 #include "cacheinfo.h"
 
@@ -627,17 +629,48 @@ static ssize_t level_show(struct kobject *k, struct 
kobj_attribute *attr, char *
 static struct kobj_attribute cache_level_attr =
__ATTR(level, 0444, level_show, NULL);
 
+static unsigned int index_dir_to_cpu(struct cache_index_dir *index)
+{
+   struct kobject *index_dir_kobj = &index->kobj;
+   struct kobject *cache_dir_kobj = index_dir_kobj->parent;
+   struct kobject *cpu_dev_kobj = cache_dir_kobj->parent;
+   struct device *dev = kobj_to_dev(cpu_dev_kobj);
+
+   return dev->id;
+}
+
+/*
+ * On big-core systems, each core has two groups of CPUs each of which
+ * has its own L1-cache. The thread-siblings which share l1-cache with
+ * @cpu can be obtained via cpu_smallcore_mask().
+ */
+static const struct cpumask *get_big_core_shared_cpu_map(int cpu, struct cache *cache)
+{
+   if (cache->level == 1)
+   return cpu_smallcore_mask(cpu);
+
+   return &cache->shared_cpu_map;
+}
+
 static ssize_t shared_cpu_map_show(struct kobject *k, struct kobj_attribute *attr, char *buf)
 {
struct cache_index_dir *index;
struct cache *cache;
-   int ret;
+   const struct cpumask *mask;
+   int ret, cpu;
 
index = kobj_to_cache_index_dir(k);
cache = index->cache;
 
+   if (has_big_cores) {
+   cpu = index_dir_to_cpu(index);
+   mask = get_big_core_shared_cpu_map(cpu, cache);
+   } else {
+   mask  = &cache->shared_cpu_map;
+   }
+
ret = scnprintf(buf, PAGE_SIZE - 1, "%*pb\n",
-   cpumask_pr_args(&cache->shared_cpu_map));
+   cpumask_pr_args(mask));
buf[ret++] = '\n';
buf[ret] = '\0';
return ret;
-- 
1.9.4

[PATCH v10 1/3] powerpc: Detect the presence of big-cores via "ibm,thread-groups"

2018-10-10 Thread Gautham R. Shenoy

From: "Gautham R. Shenoy" 

On IBM POWER9, the device tree exposes a property array identified by
"ibm,thread-groups" which will indicate which groups of threads share
a particular set of resources.

As of today we only have one form of grouping identifying the group of
threads in the core that share the L1 cache, translation cache and
instruction data flow.

This patch adds helper functions to parse the contents of
"ibm,thread-groups" and populate a per-cpu variable to cache
information about siblings of each CPU that share the L1, translation
cache and instruction data-flow.

It also defines a new global variable named "has_big_cores" which
indicates if the cores on this configuration have multiple groups of
threads that share L1 cache.

For each online CPU, it maintains a cpu_smallcore_mask, which
indicates the online siblings which share the L1-cache with it.
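
A minimal user-space sketch of such parsing follows. It is not the
patch's parse_thread_groups(): parse_groups() and group_of() are
hypothetical helpers, and the array layout [property, nr_groups,
threads_per_group, thread ids...] is an assumption inferred from the
field order of struct thread_groups in this patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THREAD_LIST_SIZE	8

struct thread_groups {
	unsigned int property;
	unsigned int nr_groups;
	unsigned int threads_per_group;
	unsigned int thread_list[MAX_THREAD_LIST_SIZE];
};

/*
 * Copy a flat property array into @tg.
 * Returns 0 on success, -1 if the array is too short or too large.
 */
static int parse_groups(const unsigned int *prop, size_t len,
			struct thread_groups *tg)
{
	size_t total, i;

	assert(prop && tg);
	if (len < 3)
		return -1;
	if (prop[1] > MAX_THREAD_LIST_SIZE || prop[2] > MAX_THREAD_LIST_SIZE)
		return -1;

	total = (size_t)prop[1] * prop[2];
	if (total > MAX_THREAD_LIST_SIZE || len < 3 + total)
		return -1;

	tg->property = prop[0];
	tg->nr_groups = prop[1];
	tg->threads_per_group = prop[2];
	for (i = 0; i < total; i++)
		tg->thread_list[i] = prop[3 + i];
	return 0;
}

/* Return the group index that contains @thread, or -1 if absent. */
static int group_of(const struct thread_groups *tg, unsigned int thread)
{
	size_t total = (size_t)tg->nr_groups * tg->threads_per_group;
	size_t i;

	for (i = 0; i < total; i++)
		if (tg->thread_list[i] == thread)
			return (int)(i / tg->threads_per_group);
	return -1;
}
```

With the illustrative input {1, 2, 4, 0, 2, 4, 6, 1, 3, 5, 7}, thread 4
lands in group 0 and thread 5 in group 1; the thread ids here are made
up for the example and need not match real device-tree contents.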

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/include/asm/cputhreads.h |   2 +
 arch/powerpc/include/asm/smp.h|  11 ++
 arch/powerpc/kernel/smp.c | 222 ++
 3 files changed, 235 insertions(+)

diff --git a/arch/powerpc/include/asm/cputhreads.h 
b/arch/powerpc/include/asm/cputhreads.h
index d71a909..deb99fd 100644
--- a/arch/powerpc/include/asm/cputhreads.h
+++ b/arch/powerpc/include/asm/cputhreads.h
@@ -23,11 +23,13 @@
 extern int threads_per_core;
 extern int threads_per_subcore;
 extern int threads_shift;
+extern bool has_big_cores;
 extern cpumask_t threads_core_mask;
 #else
 #define threads_per_core   1
 #define threads_per_subcore1
 #define threads_shift  0
+#define has_big_cores  0
 #define threads_core_mask  (*get_cpu_mask(0))
 #endif
 
diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 95b66a0..4169574 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -100,6 +100,7 @@ static inline void set_hard_smp_processor_id(int cpu, int 
phys)
 DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
 DECLARE_PER_CPU(cpumask_var_t, cpu_l2_cache_map);
 DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
+DECLARE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
 
 static inline struct cpumask *cpu_sibling_mask(int cpu)
 {
@@ -116,6 +117,11 @@ static inline struct cpumask *cpu_l2_cache_mask(int cpu)
return per_cpu(cpu_l2_cache_map, cpu);
 }
 
+static inline struct cpumask *cpu_smallcore_mask(int cpu)
+{
+   return per_cpu(cpu_smallcore_map, cpu);
+}
+
 extern int cpu_to_core_id(int cpu);
 
 /* Since OpenPIC has only 4 IPIs, we use slightly different message numbers.
@@ -166,6 +172,11 @@ static inline const struct cpumask *cpu_sibling_mask(int 
cpu)
return cpumask_of(cpu);
 }
 
+static inline const struct cpumask *cpu_smallcore_mask(int cpu)
+{
+   return cpumask_of(cpu);
+}
+
 #endif /* CONFIG_SMP */
 
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 61c1fad..22a14a9 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -74,14 +74,32 @@
 #endif
 
 struct thread_info *secondary_ti;
+bool has_big_cores;
 
 DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
+DEFINE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_l2_cache_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
 
 EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
 EXPORT_PER_CPU_SYMBOL(cpu_l2_cache_map);
 EXPORT_PER_CPU_SYMBOL(cpu_core_map);
+EXPORT_SYMBOL_GPL(has_big_cores);
+
+#define MAX_THREAD_LIST_SIZE   8
+#define THREAD_GROUP_SHARE_L1   1
+struct thread_groups {
+   unsigned int property;
+   unsigned int nr_groups;
+   unsigned int threads_per_group;
+   unsigned int thread_list[MAX_THREAD_LIST_SIZE];
+};
+
+/*
+ * On big-core systems, cpu_l1_cache_map for each CPU corresponds to
+ * the set of its siblings that share the L1-cache.
+ */
+DEFINE_PER_CPU(cpumask_var_t, cpu_l1_cache_map);
 
 /* SMP operations for this machine */
 struct smp_ops_t *smp_ops;
@@ -674,6 +692,185 @@ static void set_cpus_unrelated(int i, int j,
 }
 #endif
 
+/*
+ * parse_thread_groups: Parses the "ibm,thread-groups" device tree
+ *  property for the CPU device node @dn and stores
+ *  the parsed output in the thread_groups
+ *  structure @tg if the ibm,thread-groups[0]
+ *  matches @property.
+ *
+ * @dn: The device node of the CPU device.
+ * @tg: Pointer to a thread group structure into which the parsed
+ *  output of "ibm,thread-groups" is stored.
+ * @property: The property of the thread-group that the caller is
+ *interested in.
+ *
+ * ibm,thread-groups[0..N-1] array defines which group of threads in
+ * the CPU-device node can be grouped together based on the property.
+ *
+ * ibm,thread-groups[0] tells us the property based on which the
+ * threads are being grouped together. If this value is 1, it implies
+ * that the threads in the same group share L1, translation

Re: [PATCH] isdn/hisax: amd7930_fn: Remove unnecessary parentheses

2018-10-10 Thread David Miller

From: Nathan Chancellor 
Date: Mon,  8 Oct 2018 15:59:05 -0700

> Clang warns when multiple sets of parentheses are used for a single
> conditional statement.
> 
> drivers/isdn/hisax/amd7930_fn.c:628:32: warning: equality comparison
> with extraneous parentheses [-Wparentheses-equality]
> if ((cs->dc.amd7930.ph_state == 8)) {
>  ^~~~
> drivers/isdn/hisax/amd7930_fn.c:628:32: note: remove extraneous
> parentheses around the comparison to silence this warning
> if ((cs->dc.amd7930.ph_state == 8)) {
> ~^   ~
> drivers/isdn/hisax/amd7930_fn.c:628:32: note: use '=' to turn this
> equality comparison into an assignment
> if ((cs->dc.amd7930.ph_state == 8)) {
>  ^~
>  =
> 1 warning generated.
> 
> Signed-off-by: Nathan Chancellor 

Applied.

Re: [PATCH net 00/10] rxrpc: Fix packet reception code

2018-10-10 Thread David Miller

From: David Howells 
Date: Mon, 08 Oct 2018 23:47:18 +0100

> Here is a set of patches that prepare for and fix problems in rxrpc's
> packet reception code.  The serious problems are:
 ...
> The second patch fixes (A) - (C); the third patch renders (B) and (C)
> non-issues by using the encap_rcv hook instead of data_ready - and the
> final patch fixes (D).  That last is the most complex.
> 
> The preparatory patches are:
 ...
> And then there are three main patches - note that these are mixed in with
> the preparatory patches somewhat:
 ...
> The patches are tagged here:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
>   rxrpc-fixes-20181008

Pulled, thanks David.

linux-next: Tree for Oct 11

2018-10-10 Thread Stephen Rothwell

Hi all,

Changes since 20181010:

The crypto tree gained conflicts against the mac80211-next tree.

Non-merge commits (relative to Linus' tree): 9625
 9096 files changed, 465434 insertions(+), 197078 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 291 trees (counting Linus' and 66 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (b8db9e69dba9 Merge tag 'for-4.19/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm)
Merging fixes/master (72358c0b59b7 linux-next: build warnings from the build of Linus' tree)
Merging kbuild-current/fixes (5318321d367c samples: disable CONFIG_SAMPLES for UML)
Merging arc-current/for-curr (c58a584f05e3 ARC: clone syscall to setp r25 as thread pointer)
Merging arm-current/fixes (3a58ac65e2d7 ARM: 8799/1: mm: fix pci_ioremap_io() offset check)
Merging arm64-fixes/for-next/fixes (2a3f93459d68 arm64: KVM: Sanitize PSTATE.M when being set from userspace)
Merging m68k-current/for-linus (0986b16ab49b m68k/mac: Use correct PMU response format)
Merging powerpc-fixes/fixes (ac1788cc7da4 powerpc/numa: Skip onlining a offline node in kdump path)
Merging sparc/master (ff5d1a42096c sunvdc: Remove VLA usage)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (52b5d6f5dcf0 net: make skb_partial_csum_set() more robust against overflows)
Merging bpf/master (262f9d811c76 bpf: do not blindly change rlimit in reuseport net selftest)
Merging ipsec/master (4da402597c2b xfrm: fix gro_cells leak when remove virtual xfrm interfaces)
Merging netfilter/master (1ad98e9d1bdf tcp/dccp: fix lockdep issue when SYN is backlogged)
Merging ipvs/master (feb9f55c33e5 netfilter: nft_dynset: allow dynamic updates of non-anonymous set)
Merging wireless-drivers/master (3baafeffa48a iwlwifi: 1000: set the TFD queue size)
Merging mac80211/master (8d0be26c781a mac80211_hwsim: fix module init error paths for netlink)
Merging rdma-fixes/for-rc (5c5702e259dc RDMA/core: Set right entry state before releasing reference)
Merging sound-current/for-linus (709ae62e8e6d ALSA: hda/realtek - Cannot adjust speaker's volume on Dell XPS 27 7760)
Merging sound-asoc-fixes/for-linus (296a42942aa3 Merge branch 'asoc-4.19' into asoc-linus)
Merging regmap-fixes/for-linus (7876320f8880 Linux 4.19-rc4)
Merging regulator-fixes/for-linus (0238df646e62 Linux 4.19-rc7)
Merging spi-fixes/for-linus (0238df646e62 Linux 4.19-rc7)
Merging pci-current/for-linus (2edab4df98d9 PCI: Expand the "PF" acronym in Kconfig help text)
Merging driver-core.current/driver-core-linus (7876320f8880 Linux 4.19-rc4)
Merging tty.current/tty-linus (0238df646e62 Linux 4.19-rc7)
Merging usb.current/usb-linus (0238df646e62 Linux 4.19-rc7)
Merging usb-gadget-fixes/fixes (d9707490077b usb: dwc2: Fix call location of dwc2_check_core_endianness)
Merging usb-serial-fixes/usb-linus (0238df646e62 Linux 4.19-rc7)
Merging usb-chipidea-fixes/ci-for-usb-stable (a930d8bd94d8 usb: chipidea: Always build ULPI code)
Merging phy/fixes (5b394b2ddf03 Linux 4.19-rc1)
Merging staging.current/staging-linus (7876320f8880 Linux 4.19-rc4)
Merging char-misc.current/char-misc-linus (0238df646e62 Linux 4.19-rc7)
Merging soundwire-fixes/fixes (8d6ccf5cebbc soundwire: Fix acquiring bus loc

Re: [PATCH] net: aquantia: remove some redundant variable initializations

2018-10-10 Thread David Miller

From: Colin King 
Date: Mon,  8 Oct 2018 14:35:58 +0100

> From: Colin Ian King 
> 
> There are several variables being initialized that are being set later
> and hence the initialization is redundant and can be removed. Remove
> them.
> 
> Signed-off-by: Colin Ian King 

Applied to net-next.

Re: [PATCH] mm: Speed up mremap on large regions

2018-10-10 Thread Kirill A. Shutemov

On Wed, Oct 10, 2018 at 05:46:18PM -0700, Joel Fernandes wrote:
> diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h 
> b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> index 391ed2c3b697..8a33f2044923 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> @@ -192,14 +192,12 @@ static inline pgtable_t pmd_pgtable(pmd_t pmd)
>   return (pgtable_t)pmd_page_vaddr(pmd);
>  }
>  
> -static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
> -   unsigned long address)
> +static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
>  {
>   return (pte_t *)pte_fragment_alloc(mm, address, 1);
>  }

This is obviously broken.

-- 
 Kirill A. Shutemov

Re: [Xen-devel] [PATCH] xen: drop writing error messages to xenstore

2018-10-10 Thread Juergen Gross

On 10/10/2018 18:57, Boris Ostrovsky wrote:
> On 10/10/18 11:53 AM, Juergen Gross wrote:
>> On 10/10/2018 17:09, Joao Martins wrote:
>>> On 10/09/2018 05:09 PM, Juergen Gross wrote:
>>>> xenbus_va_dev_error() will try to write error messages to Xenstore
>>>> under the error//error node (with  something like
>>>> "device/vbd/51872"). This will fail normally and another message
>>>> about this failure is added to dmesg.
>>>>
>>>> I believe this is a remnant from very ancient times, as it was added
>>>> in the first pvops rush of commits in 2007.
>>>>
>>>> So remove the additional message when writing to Xenstore failed as
>>>> a minimum step.
>>>>
>>>> Signed-off-by: Juergen Gross 
>>>> ---
>>>> I am considering removing the Xenstore write altogether, but I'm
>>>> not sure it isn't needed e.g. by xend based installations. So please
>>>> speak up in case you know why this write is there.
>>> So this:
>>>
>>> "This will fail normally and another message about this failure is added to dmesg."
>>>
>>> Brings me to the question: What about {stub,driver}domains? Ideally you
>>> shouldn't be looking at domU's dmesg as a control domain no? I can't remember
>>> any other error node, but if something fails e.g. netfront fails to allocate an
>>> unbound event channel - how do you know the cause from the control domain
>>> perspective?
>>>
>>> Irrespective of xend or not: isn't this 'error' node the only one that
>>> propagates error causes per device from domU?
>> What does it help you in dom0 if you have an error message in Xenstore
>> if a frontend driver couldn't do its job? Is there anything you can do
>> as a Xen admin?
> 
> The admin may want to know, for example, that a hotplug in the guest failed.

Shouldn't he ask the guest for that? There are dozens of other possible
problems letting hotplug fail which won't write anything to Xenstore.

This might be interesting for development/test purposes, but I really
question it to stay in mature code.


Juergen

[PATCH 0/2] docs: memory-hotplug: add details about locking internals

2018-10-10 Thread Mike Rapoport

Hi,

As discussed at [1], the latest updates to memory hotplug documentation are
causing a conflict between docs and mmotm trees.
These patches resolve the conflict.

[1] https://lkml.org/lkml/2018/10/8/227

David Hildenbrand (1):
  docs/core-api: memory-hotplug: add some details about locking internals

Mike Rapoport (1):
  docs/core-api: rename memory-hotplug-notifier to memory-hotplug

 Documentation/core-api/index.rst   |   2 +-
 Documentation/core-api/memory-hotplug-notifier.rst |  84 --
 Documentation/core-api/memory-hotplug.rst  | 125 +
 3 files changed, 126 insertions(+), 85 deletions(-)
 delete mode 100644 Documentation/core-api/memory-hotplug-notifier.rst
 create mode 100644 Documentation/core-api/memory-hotplug.rst

-- 
2.7.4

[PATCH 1/2] docs/core-api: rename memory-hotplug-notifier to memory-hotplug

2018-10-10 Thread Mike Rapoport

From: Mike Rapoport 

to allow additions of new documentation about memory hotplug under the same
roof.

Signed-off-by: Mike Rapoport 
---
 Documentation/core-api/index.rst   |  2 +-
 Documentation/core-api/memory-hotplug-notifier.rst | 84 -
 Documentation/core-api/memory-hotplug.rst  | 87 ++
 3 files changed, 88 insertions(+), 85 deletions(-)
 delete mode 100644 Documentation/core-api/memory-hotplug-notifier.rst
 create mode 100644 Documentation/core-api/memory-hotplug.rst

diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index 4f8a426..29c790f 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -32,7 +32,7 @@ Core utilities
    gfp_mask-from-fs-io
    timekeeping
    boot-time-mm
-   memory-hotplug-notifier
+   memory-hotplug
 
 
 Interfaces for kernel debugging
diff --git a/Documentation/core-api/memory-hotplug-notifier.rst b/Documentation/core-api/memory-hotplug-notifier.rst
deleted file mode 100644
index 35347cc..0000000
--- a/Documentation/core-api/memory-hotplug-notifier.rst
+++ /dev/null
@@ -1,84 +0,0 @@
-.. _memory_hotplug_notifier:
-
-=============================
-Memory hotplug event notifier
-=============================
-
-Hotplugging events are sent to a notification queue.
-
-There are six types of notification defined in ``include/linux/memory.h``:
-
-MEM_GOING_ONLINE
-  Generated before new memory becomes available in order to be able to
-  prepare subsystems to handle memory. The page allocator is still unable
-  to allocate from the new memory.
-
-MEM_CANCEL_ONLINE
-  Generated if MEM_GOING_ONLINE fails.
-
-MEM_ONLINE
-  Generated when memory has successfully brought online. The callback may
-  allocate pages from the new memory.
-
-MEM_GOING_OFFLINE
-  Generated to begin the process of offlining memory. Allocations are no
-  longer possible from the memory but some of the memory to be offlined
-  is still in use. The callback can be used to free memory known to a
-  subsystem from the indicated memory block.
-
-MEM_CANCEL_OFFLINE
-  Generated if MEM_GOING_OFFLINE fails. Memory is available again from
-  the memory block that we attempted to offline.
-
-MEM_OFFLINE
-  Generated after offlining memory is complete.
-
-A callback routine can be registered by calling::
-
-  hotplug_memory_notifier(callback_func, priority)
-
-Callback functions with higher values of priority are called before callback
-functions with lower values.
-
-A callback function must have the following prototype::
-
-  int callback_func(
-struct notifier_block *self, unsigned long action, void *arg);
-
-The first argument of the callback function (self) is a pointer to the block
-of the notifier chain that points to the callback function itself.
-The second argument (action) is one of the event types described above.
-The third argument (arg) passes a pointer of struct memory_notify::
-
-   struct memory_notify {
-       unsigned long start_pfn;
-       unsigned long nr_pages;
-       int status_change_nid_normal;
-       int status_change_nid_high;
-       int status_change_nid;
-   }
-
-- start_pfn is start_pfn of online/offline memory.
-- nr_pages is # of pages of online/offline memory.
-- status_change_nid_normal is set node id when N_NORMAL_MEMORY of nodemask
-  is (will be) set/clear, if this is -1, then nodemask status is not changed.
-- status_change_nid_high is set node id when N_HIGH_MEMORY of nodemask
-  is (will be) set/clear, if this is -1, then nodemask status is not changed.
-- status_change_nid is set node id when N_MEMORY of nodemask is (will be)
-  set/clear. It means a new(memoryless) node gets new memory by online and a
-  node loses all memory. If this is -1, then nodemask status is not changed.
-
-  If status_changed_nid* >= 0, callback should create/discard structures for the
-  node if necessary.
-
-The callback routine shall return one of the values
-NOTIFY_DONE, NOTIFY_OK, NOTIFY_BAD, NOTIFY_STOP
-defined in ``include/linux/notifier.h``
-
-NOTIFY_DONE and NOTIFY_OK have no effect on the further processing.
-
-NOTIFY_BAD is used as response to the MEM_GOING_ONLINE, MEM_GOING_OFFLINE,
-MEM_ONLINE, or MEM_OFFLINE action to cancel hotplugging. It stops
-further processing of the notification queue.
-
-NOTIFY_STOP stops further processing of the notification queue.
diff --git a/Documentation/core-api/memory-hotplug.rst b/Documentation/core-api/memory-hotplug.rst
new file mode 100644
index 0000000..a99f2f2
--- /dev/null
+++ b/Documentation/core-api/memory-hotplug.rst
@@ -0,0 +1,87 @@
+.. _memory_hotplug:
+
+==============
+Memory hotplug
+==============
+
+Memory hotplug event notifier
+=============================
+
+Hotplugging events are sent to a notification queue.
+
+There are six types of notification defined in ``include/linux/memory.h``:
+
+MEM_GOING_ONLINE
+  Generated before new memory becomes
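
The notifier interface described in the renamed file can be modelled in plain C. What follows is a userspace approximation, not kernel code: chain_register(), chain_call() and the two callbacks are made-up names, and the numeric values of the MEM_* and NOTIFY_* constants are placeholders chosen for this model. Only the callback prototype, struct memory_notify and the constant names come from the quoted documentation.

```c
#include <stddef.h>

/* Placeholder values for this model; the names follow the quoted document. */
#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000
#define NOTIFY_BAD       (NOTIFY_STOP_MASK | 0x0002)
#define NOTIFY_STOP      (NOTIFY_STOP_MASK | NOTIFY_OK)

enum { MEM_GOING_ONLINE = 1, MEM_CANCEL_ONLINE, MEM_ONLINE,
       MEM_GOING_OFFLINE, MEM_CANCEL_OFFLINE, MEM_OFFLINE };

struct memory_notify {
	unsigned long start_pfn;
	unsigned long nr_pages;
	int status_change_nid_normal;
	int status_change_nid_high;
	int status_change_nid;
};

struct notifier_block {
	int (*notifier_call)(struct notifier_block *self,
			     unsigned long action, void *arg);
	struct notifier_block *next;
	int priority;
};

/* Keep the chain sorted so that higher priority callbacks run first. */
void chain_register(struct notifier_block **head, struct notifier_block *nb)
{
	while (*head && (*head)->priority >= nb->priority)
		head = &(*head)->next;
	nb->next = *head;
	*head = nb;
}

/* Walk the chain until a callback sets the stop bit; return the last result. */
int chain_call(struct notifier_block *head, unsigned long action, void *arg)
{
	int ret = NOTIFY_DONE;

	for (; head; head = head->next) {
		ret = head->notifier_call(head, action, arg);
		if (ret & NOTIFY_STOP_MASK)
			break;
	}
	return ret;
}

int calls_seen;

/* Hypothetical callback: veto onlining of ranges above a made-up size. */
int limit_cb(struct notifier_block *self, unsigned long action, void *arg)
{
	struct memory_notify *mn = arg;

	(void)self;
	calls_seen++;
	if (action == MEM_GOING_ONLINE && mn->nr_pages > 1024)
		return NOTIFY_BAD;
	return NOTIFY_OK;
}

/* Hypothetical callback that only counts invocations. */
int count_cb(struct notifier_block *self, unsigned long action, void *arg)
{
	(void)self;
	(void)action;
	(void)arg;
	calls_seen++;
	return NOTIFY_DONE;
}
```

In the kernel, the registration step would go through hotplug_memory_notifier(callback_func, priority) as described in the document. The model only illustrates the ordering by priority and the early stop on NOTIFY_BAD.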

[PATCH 2/2] docs/core-api: memory-hotplug: add some details about locking internals

2018-10-10 Thread Mike Rapoport

From: David Hildenbrand 

Let's document the magic a bit, especially why device_hotplug_lock is
required when adding/removing memory and how it all plays together with
requests to online/offline memory from user space.

[ rppt: moved the text to Documentation/core-api/memory-hotplug.rst ]

Link: http://lkml.kernel.org/r/20180925091457.28651-7-da...@redhat.com
Signed-off-by: David Hildenbrand 
Reviewed-by: Pavel Tatashin 
Reviewed-by: Rashmica Gupta 
Cc: Jonathan Corbet 
Cc: Michal Hocko 
Cc: Balbir Singh 
Cc: Benjamin Herrenschmidt 
Cc: Boris Ostrovsky 
Cc: Dan Williams 
Cc: Greg Kroah-Hartman 
Cc: Haiyang Zhang 
Cc: Heiko Carstens 
Cc: John Allen 
Cc: Joonsoo Kim 
Cc: Juergen Gross 
Cc: Kate Stewart 
Cc: "K. Y. Srinivasan" 
Cc: Len Brown 
Cc: Martin Schwidefsky 
Cc: Mathieu Malaterre 
Cc: Michael Ellerman 
Cc: Michael Neuling 
Cc: Nathan Fontenot 
Cc: Oscar Salvador 
Cc: Paul Mackerras 
Cc: Philippe Ombredanne 
Cc: Rafael J. Wysocki 
Cc: "Rafael J. Wysocki" 
Cc: Stephen Hemminger 
Cc: Thomas Gleixner 
Cc: Vlastimil Babka 
Cc: YASUAKI ISHIMATSU 
Signed-off-by: Andrew Morton 
Signed-off-by: Mike Rapoport 
---
 Documentation/core-api/memory-hotplug.rst | 38 +++
 1 file changed, 38 insertions(+)

diff --git a/Documentation/core-api/memory-hotplug.rst b/Documentation/core-api/memory-hotplug.rst
index a99f2f2..de7467e 100644
--- a/Documentation/core-api/memory-hotplug.rst
+++ b/Documentation/core-api/memory-hotplug.rst
@@ -85,3 +85,41 @@ MEM_ONLINE, or MEM_OFFLINE action to cancel hotplugging. It stops
 further processing of the notification queue.
 
 NOTIFY_STOP stops further processing of the notification queue.
+
+Locking Internals
+=================
+
+When adding/removing memory that uses memory block devices (i.e. ordinary RAM),
+the device_hotplug_lock should be held to:
+
+- synchronize against online/offline requests (e.g. via sysfs). This way, memory
+  block devices can only be accessed (.online/.state attributes) by user
+  space once memory has been fully added. And when removing memory, we
+  know nobody is in critical sections.
+- synchronize against CPU hotplug and similar (e.g. relevant for ACPI and PPC)
+
+Especially, there is a possible lock inversion that is avoided using
+device_hotplug_lock when adding memory and user space tries to online that
+memory faster than expected:
+
+- device_online() will first take the device_lock(), followed by
+  mem_hotplug_lock
+- add_memory_resource() will first take the mem_hotplug_lock, followed by
+  the device_lock() (while creating the devices, during bus_add_device()).
+
+As the device is visible to user space before taking the device_lock(), this
+can result in a lock inversion.
+
+onlining/offlining of memory should be done via device_online()/
+device_offline() - to make sure it is properly synchronized to actions
+via sysfs. Holding device_hotplug_lock is advised (to e.g. protect online_type)
+
+When adding/removing/onlining/offlining memory or adding/removing
+heterogeneous/device memory, we should always hold the mem_hotplug_lock in
+write mode to serialise memory hotplug (e.g. access to global/zone
+variables).
+
+In addition, mem_hotplug_lock (in contrast to device_hotplug_lock) in read
+mode allows for a quite efficient get_online_mems/put_online_mems
+implementation, so code accessing memory can protect from that memory
+vanishing.
-- 
2.7.4
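
The inversion described in the "Locking Internals" text can be replayed with a small single-threaded model. This is a sketch under stated assumptions, not kernel code: the locks are plain array slots, run_interleaved() and step() are made-up helpers, and the model assumes that both the online path and the add-memory path are entered with device_hotplug_lock already taken as the outermost lock.

```c
#include <stdbool.h>
#include <stddef.h>

enum lock_id { DEVICE_HOTPLUG_LOCK, DEVICE_LOCK, MEM_HOTPLUG_LOCK, NR_LOCKS };

struct task {
	const enum lock_id *seq; /* locks in the order the task takes them */
	size_t len;
	size_t pos;              /* next lock to take; pos == len means done */
};

/* Take the next lock if it is free; drop all held locks once finished. */
static bool step(struct task *t, int id, int owner[NR_LOCKS])
{
	if (t->pos == t->len)
		return false;
	enum lock_id want = t->seq[t->pos];
	if (owner[want] != -1)
		return false; /* blocked: the other task holds it */
	owner[want] = id;
	if (++t->pos == t->len)
		for (int l = 0; l < NR_LOCKS; l++)
			if (owner[l] == id)
				owner[l] = -1;
	return true;
}

/*
 * Alternate between two tasks, one acquisition at a time.
 * Returns true if both finish, false if neither can make progress.
 */
bool run_interleaved(const enum lock_id *a, size_t na,
		     const enum lock_id *b, size_t nb)
{
	int owner[NR_LOCKS] = { -1, -1, -1 };
	struct task t[2] = { { a, na, 0 }, { b, nb, 0 } };

	while (t[0].pos < na || t[1].pos < nb) {
		bool progressed = step(&t[0], 0, owner);

		progressed |= step(&t[1], 1, owner);
		if (!progressed)
			return false; /* ABBA deadlock */
	}
	return true;
}
```

Feeding the model the two orders from the text, device_lock then mem_hotplug_lock for device_online() and the reverse for add_memory_resource(), makes it report a deadlock. Prefixing both sequences with DEVICE_HOTPLUG_LOCK lets one path run to completion before the other starts.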

Re: [PATCH v3 3/3] iio: magnetometer: Add driver support for PNI RM3100

2018-10-10 Thread Song Qiang

On 2018-10-07 23:44, Jonathan Cameron wrote:
> On Tue,  2 Oct 2018 22:38:12 +0800
> Song Qiang  wrote:
>
> > PNI RM3100 is a high resolution, large signal immunity magnetometer,
> > composed of 3 single sensors and a processing chip with a MagI2C
> > interface.
> >
> > Following functions are available:
> >   - Single-shot measurement from
> >     /sys/bus/iio/devices/iio:deviceX/in_magn_{axis}_raw
> >   - Triggered buffer measurement.
> >   - Both i2c and spi interface are supported.
> >   - Both interrupt and polling measurement are supported, depending on
> >     whether 'interrupts' is declared in the DT.
> >
> > Signed-off-by: Song Qiang 
>
> I realise now that I should have read the datasheet properly.
> Sorry about that.
>
> What we have here is a hybrid of polled and continuous measurement.
>
> If you are using the dataready as a trigger it is fine to support
> continuous measurement, but you aren't doing that here.
>
> The single shot measurement should be done with the method
> described in the datasheet where you write a POLL command and then wait
> for the single interrupt.  There is no problem with racing and that
> interrupt is a high level one and can be handled as such.  We should
> not do it by waiting for the next continuous measurement to happen
> after clearing the status register, which is what I think is happening
> here.
>
> If you want to use it in continuous mode, you should provide a trigger.
> That trigger will be fired by the dataready signal and the
> discussion I put in the earlier reply becomes relevant.
>
> Doing both of these options requires the interrupt handler to know
> which mode you are in, but that is straightforward to implement and
> is done in a number of other drivers.
>
> Sorry again that I failed to identify this issue earlier.
>
> Thanks to Phil, as his question on the interrupt type got me thinking
> about how you were handling the interrupts.
>
> Jonathan

Hi Jonathan,

I learned how to handle single-shot measurement from the hmc5843 driver;
it seems that driver needs changing, too.
There were some problems with my computer. A Lenovo update told me to
update the BIOS and the machine went dead. I didn't write any code over
the past few days and only got it fixed today.


yours,
Song Qiang
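
The mode-aware interrupt handling suggested in the review can be sketched as a small state model. This is an illustration only, not the driver: the struct, the enum and both functions are hypothetical names, and the comments mark where a real driver would presumably write the POLL register, wake a waiting reader, or fire its IIO trigger.

```c
#include <stdbool.h>

/* Hypothetical driver state; all names here are illustrative only. */
enum rm3100_model_mode { RM3100_MODEL_SINGLE_SHOT, RM3100_MODEL_CONTINUOUS };

struct rm3100_model {
	enum rm3100_model_mode mode;
	bool poll_written;           /* single shot: POLL command was issued */
	bool measurement_done;       /* single shot: result may be read */
	unsigned int trigger_events; /* continuous: times the trigger fired */
};

/* Single-shot read: request exactly one measurement, then wait for its IRQ. */
void rm3100_model_start_single_shot(struct rm3100_model *d)
{
	d->mode = RM3100_MODEL_SINGLE_SHOT;
	d->measurement_done = false;
	d->poll_written = true; /* a real driver would write POLL here */
}

/* Data-ready interrupt: what happens depends on the current mode. */
void rm3100_model_drdy_irq(struct rm3100_model *d)
{
	if (d->mode == RM3100_MODEL_CONTINUOUS) {
		d->trigger_events++;        /* real driver: fire the trigger */
	} else if (d->poll_written) {
		d->poll_written = false;
		d->measurement_done = true; /* real driver: wake the reader */
	}
}
```

In this model a data-ready interrupt that arrives in single-shot mode without a preceding POLL request is ignored, so a stale continuous-mode sample is never reported as the requested one.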


---
  MAINTAINERS|   7 +
  drivers/iio/magnetometer/Kconfig   |  29 ++
  drivers/iio/magnetometer/Makefile  |   4 +
  drivers/iio/magnetometer/rm3100-core.c | 539 +
  drivers/iio/magnetometer/rm3100-i2c.c  |  58 +++
  drivers/iio/magnetometer/rm3100-spi.c  |  64 +++
  drivers/iio/magnetometer/rm3100.h  |  17 +
  7 files changed, 718 insertions(+)
  create mode 100644 drivers/iio/magnetometer/rm3100-core.c
  create mode 100644 drivers/iio/magnetometer/rm3100-i2c.c
  create mode 100644 drivers/iio/magnetometer/rm3100-spi.c
  create mode 100644 drivers/iio/magnetometer/rm3100.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 967ce8cdd1cc..14eeeb072403 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11393,6 +11393,13 @@ M: "Rafael J. Wysocki" 
  S:Maintained
  F:drivers/pnp/
  
+PNI RM3100 IIO DRIVER
+M: Song Qiang 
+L: linux-...@vger.kernel.org
+S: Maintained
+F: drivers/iio/magnetometer/rm3100*
+F: Documentation/devicetree/bindings/iio/magnetometer/pni,rm3100.txt
+
  POSIX CLOCKS and TIMERS
  M:Thomas Gleixner 
  L:linux-kernel@vger.kernel.org
diff --git a/drivers/iio/magnetometer/Kconfig b/drivers/iio/magnetometer/Kconfig
index ed9d776d01af..8a63cbbca4b7 100644
--- a/drivers/iio/magnetometer/Kconfig
+++ b/drivers/iio/magnetometer/Kconfig
@@ -175,4 +175,33 @@ config SENSORS_HMC5843_SPI
  - hmc5843_core (core functions)
  - hmc5843_spi (support for HMC5983)
  
+config SENSORS_RM3100
+   tristate
+   select IIO_BUFFER
+   select IIO_TRIGGERED_BUFFER
+
+config SENSORS_RM3100_I2C
+   tristate "PNI RM3100 3-Axis Magnetometer (I2C)"
+   depends on I2C
+   select SENSORS_RM3100
+   select REGMAP_I2C
+   help
+ Say Y here to add support for the PNI RM3100 3-Axis Magnetometer.
+
+ This driver can also be compiled as a module.
+ To compile this driver as a module, choose M here: the module
+ will be called rm3100-i2c.
+
+config SENSORS_RM3100_SPI
+   tristate "PNI RM3100 3-Axis Magnetometer (SPI)"
+   depends on SPI_MASTER
+   select SENSORS_RM3100
+   select REGMAP_SPI
+   help
+ Say Y here to add support for the PNI RM3100 3-Axis Magnetometer.
+
+ This driver can also be compiled as a module.
+ To compile this driver as a module, choose M here: the module
+ will be called rm3100-spi.
+
  endmenu
diff --git a/drivers/iio/magnetometer/Makefile b/drivers/iio/magnetometer/Makefile
index 664b2f866472..ba1bc34b82fa 100644
--- a/drivers/iio/magnetometer/Makefile
+++ b/drivers/iio/magnetometer/Makefile
@@ -24,3 +24,7 @@ obj-$(CONFIG_IIO_ST_MAGN_SPI_3AXIS) += st_magn_spi.o
  obj-$(CONFIG_SENSORS_HMC5843) += hmc5843_core.o
  obj-$(CONFIG_SENSORS_HMC5843_I2C) += hmc5843_i2c.o

Re: [PATCH v3 3/3] iio: magnetometer: Add driver support for PNI RM3100

2018-10-10 Thread Song Qiang





On 2018年10月07日 23:44, Jonathan Cameron wrote:

On Tue,  2 Oct 2018 22:38:12 +0800
Song Qiang  wrote:


PNI RM3100 is a high resolution, large signal immunity magnetometer,
composed of 3 single sensors and a processing chip with a MagI2C
interface.

Following functions are available:
  - Single-shot measurement from
/sys/bus/iio/devices/iio:deviceX/in_magn_{axis}_raw
  - Triggerd buffer measurement.
  - Both i2c and spi interface are supported.
  - Both interrupt and polling measurement is supported, depends on if
the 'interrupts' in DT is declared.

Signed-off-by: Song Qiang 

I realise now that I should have read the datasheet properly.
Sorry about that.

What we have here is a hybrid of polled and continuous measurement.

If you are using the dataready as a trigger it is fine to support
continuous measurement, but you aren't doing that here.

The single shot measurement should be done with the method
described in the datasheet where you write a POLL command and then wait
for the single interrupt.  There is no problem with racing and that
interrupt is a high level one and can be handled as such.  We should
not do it by waiting for the next continuous measurement to happen
after clearing the status register, which is what I think is happening
here.

If you want to use it in continuous mode, you should provide a trigger.
That trigger will be fired by the dataready signal and the
discussion I put in the earlier reply becomes relevant.

Doing both of these options requires the interrupt handler to know
which mode you are in, but that is straight forward to implement and
is done in a number of other drivers.

Sorry again that I failed to identify this issue earlier.

Thanks to Phil as his question in the interrupt type got me thinking
about how you were handing the interrupts.

Jonathan



Hi Jonathan,

I learned the way of handling single shot from the driver of hmc5843, 
seems like it needs changing, too.
There was some problems with my computer. Lenovo updates told me to 
update BIOS and it went dead. I didn't write any code the past few days, 
just got it fixed today.


yours,
Song Qiang


---
  MAINTAINERS|   7 +
  drivers/iio/magnetometer/Kconfig   |  29 ++
  drivers/iio/magnetometer/Makefile  |   4 +
  drivers/iio/magnetometer/rm3100-core.c | 539 +
  drivers/iio/magnetometer/rm3100-i2c.c  |  58 +++
  drivers/iio/magnetometer/rm3100-spi.c  |  64 +++
  drivers/iio/magnetometer/rm3100.h  |  17 +
  7 files changed, 718 insertions(+)
  create mode 100644 drivers/iio/magnetometer/rm3100-core.c
  create mode 100644 drivers/iio/magnetometer/rm3100-i2c.c
  create mode 100644 drivers/iio/magnetometer/rm3100-spi.c
  create mode 100644 drivers/iio/magnetometer/rm3100.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 967ce8cdd1cc..14eeeb072403 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11393,6 +11393,13 @@ M: "Rafael J. Wysocki" 
  S:Maintained
  F:drivers/pnp/
  
+PNI RM3100 IIO DRIVER

+M: Song Qiang 
+L: linux-...@vger.kernel.org
+S: Maintained
+F: drivers/iio/magnetometer/rm3100*
+F: Documentation/devicetree/bindings/iio/magnetometer/pni,rm3100.txt
+
  POSIX CLOCKS and TIMERS
  M:Thomas Gleixner 
  L:linux-kernel@vger.kernel.org
diff --git a/drivers/iio/magnetometer/Kconfig b/drivers/iio/magnetometer/Kconfig
index ed9d776d01af..8a63cbbca4b7 100644
--- a/drivers/iio/magnetometer/Kconfig
+++ b/drivers/iio/magnetometer/Kconfig
@@ -175,4 +175,33 @@ config SENSORS_HMC5843_SPI
  - hmc5843_core (core functions)
  - hmc5843_spi (support for HMC5983)
  
+config SENSORS_RM3100

+   tristate
+   select IIO_BUFFER
+   select IIO_TRIGGERED_BUFFER
+
+config SENSORS_RM3100_I2C
+   tristate "PNI RM3100 3-Axis Magnetometer (I2C)"
+   depends on I2C
+   select SENSORS_RM3100
+   select REGMAP_I2C
+   help
+ Say Y here to add support for the PNI RM3100 3-Axis Magnetometer.
+
+ This driver can also be compiled as a module.
+ To compile this driver as a module, choose M here: the module
+ will be called rm3100-i2c.
+
+config SENSORS_RM3100_SPI
+   tristate "PNI RM3100 3-Axis Magnetometer (SPI)"
+   depends on SPI_MASTER
+   select SENSORS_RM3100
+   select REGMAP_SPI
+   help
+ Say Y here to add support for the PNI RM3100 3-Axis Magnetometer.
+
+ This driver can also be compiled as a module.
+ To compile this driver as a module, choose M here: the module
+ will be called rm3100-spi.
+
  endmenu
diff --git a/drivers/iio/magnetometer/Makefile b/drivers/iio/magnetometer/Makefile
index 664b2f866472..ba1bc34b82fa 100644
--- a/drivers/iio/magnetometer/Makefile
+++ b/drivers/iio/magnetometer/Makefile
@@ -24,3 +24,7 @@ obj-$(CONFIG_IIO_ST_MAGN_SPI_3AXIS) += st_magn_spi.o
  obj-$(CONFIG_SENSORS_HMC5843) += hmc5843_core.o
  obj-$(CONFIG_SENSORS_HMC5843_I2C) += hmc5843_i2c.o

Re: [RFC PATCH 2/2] net/ncsi: Configure multi-package, multi-channel modes with failover

2018-10-10 Thread Samuel Mendoza-Jonas

On Wed, 2018-10-10 at 22:36 +, justin.l...@dell.com wrote:
> Hi Samuel,
> 
> I am still testing your change and have some comments below.
> 
> Thanks,
> Justin
> 
> 
> > This patch extends the ncsi-netlink interface with two new commands and
> > three new attributes to configure multiple packages and/or channels at
> > once, and configure specific failover modes.
> > 
> > NCSI_CMD_SET_PACKAGE_MASK and NCSI_CMD_SET_CHANNEL_MASK set a whitelist
> > of packages or channels allowed to be configured with the
> > NCSI_ATTR_PACKAGE_MASK and NCSI_ATTR_CHANNEL_MASK attributes
> > respectively. If one of these whitelists is set only packages or
> > channels matching the whitelist are considered for the channel queue in
> > ncsi_choose_active_channel().
> > 
> > These commands may also use the NCSI_ATTR_MULTI_FLAG to signal that
> > multiple packages or channels may be configured simultaneously. NCSI
> > hardware arbitration (HWA) must be available in order to enable
> > multi-package mode. Multi-channel mode is always available.
> > 
> > If the NCSI_ATTR_CHANNEL_ID attribute is present in the
> > NCSI_CMD_SET_CHANNEL_MASK command then it sets the preferred channel as
> > with the NCSI_CMD_SET_INTERFACE command. The combination of preferred
> > channel and channel whitelist defines a primary channel and the allowed
> > failover channels.
> > If the NCSI_ATTR_MULTI_FLAG attribute is also present then the preferred
> > channel is configured for Tx/Rx and the other channels are enabled only
> > for Rx.
> > 
> > Signed-off-by: Samuel Mendoza-Jonas 
> > ---
> >  include/uapi/linux/ncsi.h |  16 +++
> >  net/ncsi/internal.h   |  11 +-
> >  net/ncsi/ncsi-aen.c   |   2 +-
> >  net/ncsi/ncsi-manage.c| 138 
> >  net/ncsi/ncsi-netlink.c   | 217 +-
> >  net/ncsi/ncsi-rsp.c   |   2 +-
> >  6 files changed, 312 insertions(+), 74 deletions(-)
> > 
> > diff --git a/include/uapi/linux/ncsi.h b/include/uapi/linux/ncsi.h
> > index 4c292ecbb748..035fba1693f9 100644
> > --- a/include/uapi/linux/ncsi.h
> > +++ b/include/uapi/linux/ncsi.h
> > @@ -23,6 +23,13 @@
> >   * optionally the preferred NCSI_ATTR_CHANNEL_ID.
> >   * @NCSI_CMD_CLEAR_INTERFACE: clear any preferred package/channel combination.
> >   * Requires NCSI_ATTR_IFINDEX.
> > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed packages.
> > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed channels.
> > + * Requires NCSI_ATTR_IFINDEX and NCSI_ATTR_PACKAGE_MASK.
> > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed channels.
> > + * Requires NCSI_ATTR_IFINDEX, NCSI_ATTR_PACKAGE_ID, and
> > + * NCSI_ATTR_CHANNEL_MASK. If NCSI_ATTR_CHANNEL_ID is present it sets
> > + * the primary channel.
> >   * @NCSI_CMD_MAX: highest command number
> >   */
> 
> There are some typos in the description.
> * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed packages.
>  *Requires NCSI_ATTR_IFINDEX and NCSI_ATTR_PACKAGE_MASK.
>  * @NCSI_CMD_SET_CHANNEL_MASK: set a whitelist of allowed channels.
>  *Requires NCSI_ATTR_IFINDEX, NCSI_ATTR_PACKAGE_ID, and
>  *NCSI_ATTR_CHANNEL_MASK. If NCSI_ATTR_CHANNEL_ID is present it sets
>  *the primary channel.

Haha, yes I threw that in at the end, thanks for catching it.

> 
> >  enum ncsi_nl_commands {
> > @@ -30,6 +37,8 @@ enum ncsi_nl_commands {
> > NCSI_CMD_PKG_INFO,
> > NCSI_CMD_SET_INTERFACE,
> > NCSI_CMD_CLEAR_INTERFACE,
> > +   NCSI_CMD_SET_PACKAGE_MASK,
> > +   NCSI_CMD_SET_CHANNEL_MASK,
> >  
> > __NCSI_CMD_AFTER_LAST,
> > NCSI_CMD_MAX = __NCSI_CMD_AFTER_LAST - 1
> > @@ -43,6 +52,10 @@ enum ncsi_nl_commands {
> >   * @NCSI_ATTR_PACKAGE_LIST: nested array of NCSI_PKG_ATTR attributes
> >   * @NCSI_ATTR_PACKAGE_ID: package ID
> >   * @NCSI_ATTR_CHANNEL_ID: channel ID
> > + * @NCSI_ATTR_MULTI_FLAG: flag to signal that multi-mode should be enabled with
> > + * NCSI_CMD_SET_PACKAGE_MASK or NCSI_CMD_SET_CHANNEL_MASK.
> > + * @NCSI_ATTR_PACKAGE_MASK: 32-bit mask of allowed packages.
> > + * @NCSI_ATTR_CHANNEL_MASK: 32-bit mask of allowed channels.
> >   * @NCSI_ATTR_MAX: highest attribute number
> >   */
> >  enum ncsi_nl_attrs {
> > @@ -51,6 +64,9 @@ enum ncsi_nl_attrs {
> > NCSI_ATTR_PACKAGE_LIST,
> > NCSI_ATTR_PACKAGE_ID,
> > NCSI_ATTR_CHANNEL_ID,
> > +   NCSI_ATTR_MULTI_FLAG,
> > +   NCSI_ATTR_PACKAGE_MASK,
> > +   NCSI_ATTR_CHANNEL_MASK,
> 
> Is there a case that we might set these two masks at the same time?
> If not, maybe we can just have one generic MASK attribute.
> 

I thought of this too: not yet, but I wonder if we might in the future.
I'll have a think about it.
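
For readers following the mask discussion, here is a minimal sketch of how a 32-bit whitelist mask can be tested against a package or channel ID. It assumes that bit N of the mask stands for ID N, which is an assumption made for illustration and is not stated in the patch text quoted here. The helper name mask_allows is made up and is not part of the ncsi-netlink interface.

```shell
#!/bin/sh
# Assumption for illustration: bit N of a 32-bit mask stands for ID N.
# mask_allows is a hypothetical helper, not part of ncsi-netlink.
mask_allows() {
    mask=$1
    id=$2
    # Shift the wanted bit down to position 0 and test it.
    [ $(( ($mask >> $id) & 1 )) -eq 1 ]
}

# Whitelist channels 0 and 2 (binary 101 = 0x5).
for id in 0 1 2; do
    if mask_allows 0x5 "$id"; then
        echo "channel $id allowed"
    else
        echo "channel $id skipped"
    fi
done
```

The same test works for a package mask or a channel mask, which is one way to read the suggestion that a single generic mask attribute might be enough.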

> >  
> > __NCSI_ATTR_AFTER_LAST,
> > NCSI_ATTR_MAX = __NCSI_ATTR_AFTER_LAST - 1
> > diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h
> > index 3d0a33b874f5..8437474d0a78 100644
> > --- a/net/ncsi/internal.h
> > +++ b/net/ncsi/internal.h
> > @@ -213,6 +213,10 @@ struct ncsi_package {
> > unsigned int

Re: [PATCH] kbuild: Fail the build early when no lz4 present

2018-10-10 Thread Masahiro Yamada

Hi Borislav,

On Thu, Oct 11, 2018 at 7:23 AM Borislav Petkov  wrote:
>
> From: Borislav Petkov 
>
> When building randconfigs, the build fails at kernel compression stage
> due to missing lz4 on the system but CONFIG_KERNEL_LZ4 has been selected
> by randconfig. The result looks something like this:
>
> (cat arch/x86/boot/compressed/vmlinux.bin 
> arch/x86/boot/compressed/vmlinux.relocs | lz4c -l -c1 stdin stdout && printf 
> \334\141\301\001) > arch/x86/boot/compressed/vmlinux.bin.lz4 || (rm -f 
> arch/x86/boot/compressed/vmlinux.bin.lz4 ; false)
>   /bin/sh: 1: lz4c: not found


So, the cause of the failure is clear enough
from the build log.



It is weird to check only lz4c.
If CONFIG_KERNEL_LZO is enabled, but lzop is not installed,
I see this log

  LZO arch/x86/boot/compressed/vmlinux.bin.lzo
/bin/sh: 1: lzop: not found

It is still clear what to do, though.




>   make[2]: *** [arch/x86/boot/compressed/Makefile:143: arch/x86/boot/compressed/vmlinux.bin.lz4] Error 1
>   make[1]: *** [arch/x86/boot/Makefile:112: arch/x86/boot/compressed/vmlinux] Error 2
>   make: *** [arch/x86/Makefile:290: bzImage] Error 2
>
> Fail the build much earlier by checking for lz4c presence before doing
> anything else.


Is it necessary to check this earlier?

If you get this error, you just need to install the tool.
Then, you can re-run the incremental build.


BTW, this patch has a drawback.

[1] Enable CONFIG_KERNEL_LZ4 on the system
without lz4c installed

[2] Run 'make' and you will get the error
"lz4 tool not found on this system but CONFIG_KERNEL_LZ4 enabled"

[3] Run 'make menuconfig' and
switch from CONFIG_KERNEL_LZ4 to CONFIG_KERNEL_GZIP

[4] Run 'make' and you will still get the same error
even after you have chosen to use GZIP instead of LZ4.




> Signed-off-by: Borislav Petkov 
> Cc: Masahiro Yamada 
> Cc: Michal Marek 
> Cc: linux-kbu...@vger.kernel.org
> ---
>  Makefile | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/Makefile b/Makefile
> index a0c32de80482..f91de649234b 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -788,6 +788,12 @@ KBUILD_CFLAGS_KERNEL += -ffunction-sections -fdata-sections
>  LDFLAGS_vmlinux += --gc-sections
>  endif
>
> +ifdef CONFIG_KERNEL_LZ4
> +ifeq ($(CONFIG_SHELL which lz4c),)
> +$(error "lz4 tool not found on this system but CONFIG_KERNEL_LZ4 enabled")
> +endif
> +endif
> +
>  # arch Makefile may override CC so keep this after arch Makefile is included
>  NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include)
>
> --
> 2.19.0.271.gfe8321ec057f
>


-- 
Best Regards
Masahiro Yamada
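
As an illustration of the point that lz4c is not the only compressor that can be missing, the following is a rough sketch of a tool-presence check that treats several compression options alike. It is not the patch under review and not an existing kbuild facility. The helper names need_tool and check_tool are hypothetical, and the option-to-tool mapping only covers the tools named in this thread plus gzip.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of kbuild.

# Map a compression config option to the tool it needs.
need_tool() {
    case "$1" in
        CONFIG_KERNEL_LZ4)  echo lz4c ;;
        CONFIG_KERNEL_LZO)  echo lzop ;;
        CONFIG_KERNEL_GZIP) echo gzip ;;
        *) return 1 ;;
    esac
}

# Return 0 if the tool for the option is on PATH, 1 if it is missing,
# 2 if the option is not known to need_tool.
check_tool() {
    tool=$(need_tool "$1") || { echo "unknown option: $1" >&2; return 2; }
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool found"
    else
        echo "$tool not found but $1 enabled" >&2
        return 1
    fi
}

# Check the two options mentioned in this thread; the result depends on
# what is installed on the machine running the script.
for opt in CONFIG_KERNEL_LZ4 CONFIG_KERNEL_LZO; do
    check_tool "$opt" || true
done
```

A check like this only helps if it is evaluated against the configuration that is actually in effect, which is the concern raised in steps [1] to [4] above.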

[PATCH v3 7/7] ia64: wire up system calls

2018-10-10 Thread Firoz Khan

Wire up the perf_event_open, seccomp, pkey_mprotect, pkey_alloc,
pkey_free, statx, io_pgetevents and rseq system calls.

These calls require an architecture-specific implementation, which is
not present yet.

Signed-off-by: Firoz Khan 
---
 arch/ia64/kernel/syscalls/syscall.tbl | 16 
 1 file changed, 16 insertions(+)

diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index 6b64f60..1f42b60 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -335,3 +335,19 @@
 323     common  copy_file_range         sys_copy_file_range
 324     common  preadv2                 sys_preadv2
 325     common  pwritev2                sys_pwritev2
+# perf_event_open requires an architecture specific implementation
+326     common  perf_event_open         sys_perf_event_open
+# seccomp requires an architecture specific implementation
+327     common  seccomp                 sys_seccomp
+# pkey_mprotect requires an architecture specific implementation
+328     common  pkey_mprotect           sys_pkey_mprotect
+# pkey_alloc requires an architecture specific implementation
+329     common  pkey_alloc              sys_pkey_alloc
+# pkey_free requires an architecture specific implementation
+330     common  pkey_free               sys_pkey_free
+# statx requires an architecture specific implementation
+331     common  statx                   sys_statx
+# io_pgetevents requires an architecture specific implementation
+332     common  io_pgetevents           sys_io_pgetevents
+# rseq requires an architecture specific implementation
+333     common  rseq                    sys_rseq
-- 
1.9.1

[PATCH v3 6/7] ia64: uapi header and system call table file generation

2018-10-10 Thread Firoz Khan

The system call table generation script must be run to generate the
unistd_64.h and syscall_table.h files. This patch contains the
changes that invoke the script.

With this patch, unistd_64.h and syscall_table.h are generated by the
syscall table generation script, which is invoked from
arch/ia64/Makefile. The generated files are identical to the
files being removed.

The generated uapi header file is included by uapi/asm/unistd.h, and
the generated system call table support file is included by
ia64/kernel/syscall_table.S.

Signed-off-by: Firoz Khan 
---
 arch/ia64/Makefile  |   3 +
 arch/ia64/include/asm/Kbuild|   1 +
 arch/ia64/include/uapi/asm/Kbuild   |   1 +
 arch/ia64/include/uapi/asm/unistd.h | 332 +---
 arch/ia64/kernel/syscall_table.S| 331 +--
 5 files changed, 9 insertions(+), 659 deletions(-)

diff --git a/arch/ia64/Makefile b/arch/ia64/Makefile
index 45f5980..320d86f 100644
--- a/arch/ia64/Makefile
+++ b/arch/ia64/Makefile
@@ -80,6 +80,9 @@ unwcheck: vmlinux
 archclean:
$(Q)$(MAKE) $(clean)=$(boot)
 
+archheaders:
+   $(Q)$(MAKE) $(build)=arch/ia64/kernel/syscalls all
+
 CLEAN_FILES += vmlinux.gz bootloader
 
 boot:  lib/lib.a vmlinux
diff --git a/arch/ia64/include/asm/Kbuild b/arch/ia64/include/asm/Kbuild
index 557bbc8..5b17695 100644
--- a/arch/ia64/include/asm/Kbuild
+++ b/arch/ia64/include/asm/Kbuild
@@ -7,3 +7,4 @@ generic-y += preempt.h
 generic-y += trace_clock.h
 generic-y += vtime.h
 generic-y += word-at-a-time.h
+generic-y += syscall_table.h
diff --git a/arch/ia64/include/uapi/asm/Kbuild b/arch/ia64/include/uapi/asm/Kbuild
index 3982e67..5c30543 100644
--- a/arch/ia64/include/uapi/asm/Kbuild
+++ b/arch/ia64/include/uapi/asm/Kbuild
@@ -8,3 +8,4 @@ generic-y += msgbuf.h
 generic-y += poll.h
 generic-y += sembuf.h
 generic-y += shmbuf.h
+generic-y += unistd_64.h
diff --git a/arch/ia64/include/uapi/asm/unistd.h b/arch/ia64/include/uapi/asm/unistd.h
index bd2575f..286349b 100644
--- a/arch/ia64/include/uapi/asm/unistd.h
+++ b/arch/ia64/include/uapi/asm/unistd.h
@@ -13,336 +13,6 @@
 #define __BREAK_SYSCALL         __IA64_BREAK_SYSCALL
 
 #define __NR_Linux  1024
-#define __NR_ni_syscall         (__NR_Linux + 0)
-#define __NR_exit               (__NR_Linux + 1)
-#define __NR_read               (__NR_Linux + 2)
-#define __NR_write              (__NR_Linux + 3)
-#define __NR_open               (__NR_Linux + 4)
-#define __NR_close              (__NR_Linux + 5)
-#define __NR_creat              (__NR_Linux + 6)
-#define __NR_link               (__NR_Linux + 7)
-#define __NR_unlink             (__NR_Linux + 8)
-#define __NR_execve             (__NR_Linux + 9)
-#define __NR_chdir              (__NR_Linux + 10)
-#define __NR_fchdir             (__NR_Linux + 11)
-#define __NR_utimes             (__NR_Linux + 12)
-#define __NR_mknod              (__NR_Linux + 13)
-#define __NR_chmod              (__NR_Linux + 14)
-#define __NR_chown              (__NR_Linux + 15)
-#define __NR_lseek              (__NR_Linux + 16)
-#define __NR_getpid             (__NR_Linux + 17)
-#define __NR_getppid            (__NR_Linux + 18)
-#define __NR_mount              (__NR_Linux + 19)
-#define __NR_umount             (__NR_Linux + 20)
-#define __NR_setuid             (__NR_Linux + 21)
-#define __NR_getuid             (__NR_Linux + 22)
-#define __NR_geteuid            (__NR_Linux + 23)
-#define __NR_ptrace             (__NR_Linux + 24)
-#define __NR_access             (__NR_Linux + 25)
-#define __NR_sync               (__NR_Linux + 26)
-#define __NR_fsync              (__NR_Linux + 27)
-#define __NR_fdatasync          (__NR_Linux + 28)
-#define __NR_kill               (__NR_Linux + 29)
-#define __NR_rename             (__NR_Linux + 30)
-#define __NR_mkdir              (__NR_Linux + 31)
-#define __NR_rmdir              (__NR_Linux + 32)
-#define __NR_dup                (__NR_Linux + 33)
-#define __NR_pipe               (__NR_Linux + 34)
-#define __NR_times              (__NR_Linux + 35)
-#define __NR_brk                (__NR_Linux + 36)
-#define __NR_setgid             (__NR_Linux + 37)
-#define __NR_getgid             (__NR_Linux + 38)
-#define __NR_getegid            (__NR_Linux + 39)
-#define __NR_acct               (__NR_Linux + 40)
-#define __NR_ioctl              (__NR_Linux + 41)
-#define __NR_fcntl              (__NR_Linux + 42)
-#define __NR_umask              (__NR_Linux + 43)
-#define __NR_chroot             (__NR_Linux + 44)
-#define __NR_ustat              (__NR_Linux + 45)
-#define __NR_dup2               (__NR_Linux + 46)
-#define __NR_setreuid           (__NR_Linux + 47)
-#define __NR_setregid           (__NR_Linux + 48)
-#define __NR_getresuid          (__NR_Linux + 49)
-#define __NR_setresuid          (__NR_Linux + 50)
-#define __NR_getresgid          (__NR_Linux + 51)
-#define __NR_setresgid          (__NR_Linux + 52)
-#define __NR_getgroups          (__NR_Linux + 53)
-#define __NR_setgroups          (__NR_Linux + 54)
-#define __NR_getpgid            (__NR_Linux + 55)
-#define __NR_setpgid            (__NR_Linux + 56)
-#define __NR_setsid             (__NR_Linux + 57)
-#define __NR_getsid             (__NR_Linux + 58)
-#define __NR_sethostname        (__NR_Linux + 59)
-#define __NR_setrlimit          (__NR_Linux + 60)
-#define __NR_getrlimit          (__NR_Linux + 61)
-#define __NR_getrusage          (__NR_Linux + 62)
-#define __NR_gettimeofday       (__NR_Linux + 63)
-#define __NR_settimeofday       (__NR_Linux + 64)
-#define __NR_select

[PATCH v3 4/7] ia64: replace the system call table entries from entry.S

2018-10-10 Thread Firoz Khan

On IA64, the system call table entries are part of the entry.S file.
They need to live in a separate file so that the system call table
generation script, added by a later patch in this series, can handle
the entries on their own.

Move the system call table from entry.S to a new file,
syscall_table.S. This change unifies the implementation across
architectures and simplifies system call table generation using
the script.

Signed-off-by: Firoz Khan 
---
 arch/ia64/kernel/entry.S | 333 +-
 arch/ia64/kernel/syscall_table.S | 334 +++
 2 files changed, 335 insertions(+), 332 deletions(-)
 create mode 100644 arch/ia64/kernel/syscall_table.S

diff --git a/arch/ia64/kernel/entry.S b/arch/ia64/kernel/entry.S
index 68362b3..249b2e9 100644
--- a/arch/ia64/kernel/entry.S
+++ b/arch/ia64/kernel/entry.S
@@ -1426,335 +1426,4 @@ END(ftrace_stub)
 
 #endif /* CONFIG_FUNCTION_TRACER */
 
-   .rodata
-   .align 8
-   .globl sys_call_table
-sys_call_table:
-   data8 sys_ni_syscall  //  This must be sys_ni_syscall!  See ivt.S.
-   data8 sys_exit  // 1025
-   data8 sys_read
-   data8 sys_write
-   data8 sys_open
-   data8 sys_close
-   data8 sys_creat // 1030
-   data8 sys_link
-   data8 sys_unlink
-   data8 ia64_execve
-   data8 sys_chdir
-   data8 sys_fchdir  // 1035
-   data8 sys_utimes
-   data8 sys_mknod
-   data8 sys_chmod
-   data8 sys_chown
-   data8 sys_lseek // 1040
-   data8 sys_getpid
-   data8 sys_getppid
-   data8 sys_mount
-   data8 sys_umount
-   data8 sys_setuid  // 1045
-   data8 sys_getuid
-   data8 sys_geteuid
-   data8 sys_ptrace
-   data8 sys_access
-   data8 sys_sync  // 1050
-   data8 sys_fsync
-   data8 sys_fdatasync
-   data8 sys_kill
-   data8 sys_rename
-   data8 sys_mkdir // 1055
-   data8 sys_rmdir
-   data8 sys_dup
-   data8 sys_ia64_pipe
-   data8 sys_times
-   data8 ia64_brk  // 1060
-   data8 sys_setgid
-   data8 sys_getgid
-   data8 sys_getegid
-   data8 sys_acct
-   data8 sys_ioctl // 1065
-   data8 sys_fcntl
-   data8 sys_umask
-   data8 sys_chroot
-   data8 sys_ustat
-   data8 sys_dup2  // 1070
-   data8 sys_setreuid
-   data8 sys_setregid
-   data8 sys_getresuid
-   data8 sys_setresuid
-   data8 sys_getresgid // 1075
-   data8 sys_setresgid
-   data8 sys_getgroups
-   data8 sys_setgroups
-   data8 sys_getpgid
-   data8 sys_setpgid   // 1080
-   data8 sys_setsid
-   data8 sys_getsid
-   data8 sys_sethostname
-   data8 sys_setrlimit
-   data8 sys_getrlimit // 1085
-   data8 sys_getrusage
-   data8 sys_gettimeofday
-   data8 sys_settimeofday
-   data8 sys_select
-   data8 sys_poll  // 1090
-   data8 sys_symlink
-   data8 sys_readlink
-   data8 sys_uselib
-   data8 sys_swapon
-   data8 sys_swapoff   // 1095
-   data8 sys_reboot
-   data8 sys_truncate
-   data8 sys_ftruncate
-   data8 sys_fchmod
-   data8 sys_fchown  // 1100
-   data8 ia64_getpriority
-   data8 sys_setpriority
-   data8 sys_statfs
-   data8 sys_fstatfs
-   data8 sys_gettid  // 1105
-   data8 sys_semget
-   data8 sys_semop
-   data8 sys_semctl
-   data8 sys_msgget
-   data8 sys_msgsnd  // 1110
-   data8 sys_msgrcv
-   data8 sys_msgctl
-   data8 sys_shmget
-   data8 sys_shmat
-   data8 sys_shmdt // 1115
-   data8 sys_shmctl
-   data8 sys_syslog
-   data8 sys_setitimer
-   data8 sys_getitimer
-   data8 sys_ni_syscall  // 1120 /* was: ia64_oldstat */
-   data8 sys_ni_syscall  /* was: ia64_oldlstat */
-   data8 sys_ni_syscall  /* was: ia64_oldfstat */
-   data8 sys_vhangup
-   data8 sys_lchown
-   data8 sys_remap_file_pages  // 1125
-   data8 sys_wait4
-   data8 sys_sysinfo
-   data8 sys_clone
-   data8 sys_setdomainname
-   data8 sys_newuname  // 1130
-   data8 sys_adjtimex
-   data8 sys_ni_syscall  /* was: ia64_create_module */
-   data8 sys_init_module
-   data8 sys_delete_module
-   data8 sys_ni_syscall  // 1135

[PATCH v3 5/7] ia64: add system call table generation support

2018-10-10 Thread Firoz Khan

The system call tables use a different format on every
architecture, which makes it difficult to manually add or
modify system calls in the respective files. A script that
generates the header file and the syscall table file makes
this easier, and this change unifies the approach across
all architectures.

The system call table generation support is added in the
syscalls directory, which contains the scripts that generate
the uapi header file and the system call table file, and the
syscall.tbl file that is the input for those scripts.

syscall.tbl contains the list of available system calls
along with the system call number and corresponding entry point.
A new system call can be added on this architecture by
adding a new entry to the syscall.tbl file.

A new table entry consists of:
- System call number.
- ABI.
- System call name.
- Entry point name.

syscallhdr.sh and syscalltbl.sh generate the uapi header
unistd_64.h and the syscall_table.h file respectively. The
syscall_table.h file is included by syscall_table.S, the real
system call table. Both .sh files parse the contents of
syscall.tbl to generate the header and table files.
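
To make the table-to-header step concrete, here is a minimal sketch of how rows in the four-field format above could be turned into __NR_ defines. It is not the syscallhdr.sh added by this patch. The function name gen_syscall_defines is made up for illustration, and the sketch ignores the ABI and entry point fields.

```shell
#!/bin/sh
# Sketch only: turn "<number> <abi> <name> <entry point>" rows into
# "#define __NR_<name> (<offset> + <number>)" lines.
# gen_syscall_defines is a hypothetical name, not the patch's script.
gen_syscall_defines() {
    offset=$1
    # Keep rows that start with a number (drops comments and blank
    # lines), order them by number, then emit one define per row.
    grep -E '^[0-9]+[[:space:]]' | sort -n |
    while read -r nr abi name entry; do
        printf '#define __NR_%s (%s + %s)\n' "$name" "$offset" "$nr"
    done
}

# Feed two sample rows and a comment through the function.
printf '%s\n' \
    '# sample input' \
    '1 common exit sys_exit' \
    '0 common ni_syscall sys_ni_syscall' |
    gen_syscall_defines __NR_Linux
```

With the sample input, the rows come out ordered by number and expressed relative to __NR_Linux, which mirrors the shape of the defines that the series removes from unistd.h.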

The ARM, s390 and x86 architectures already have similar support.
I leveraged their implementations to come up with a generic
solution. This is also groundwork for the y2038 issue: we need
to change two dozen system call implementations, and this
work reduces the effort to simply modifying two dozen entries
in syscall.tbl.

Signed-off-by: Firoz Khan 
---
 arch/ia64/kernel/syscalls/Makefile  |  39 
 arch/ia64/kernel/syscalls/syscall.tbl   | 337 
 arch/ia64/kernel/syscalls/syscallhdr.sh |  35 
 arch/ia64/kernel/syscalls/syscalltbl.sh |  37 
 4 files changed, 448 insertions(+)
 create mode 100644 arch/ia64/kernel/syscalls/Makefile
 create mode 100644 arch/ia64/kernel/syscalls/syscall.tbl
 create mode 100644 arch/ia64/kernel/syscalls/syscallhdr.sh
 create mode 100644 arch/ia64/kernel/syscalls/syscalltbl.sh

diff --git a/arch/ia64/kernel/syscalls/Makefile b/arch/ia64/kernel/syscalls/Makefile
new file mode 100644
index 000..011cf31
--- /dev/null
+++ b/arch/ia64/kernel/syscalls/Makefile
@@ -0,0 +1,39 @@
+# SPDX-License-Identifier: GPL-2.0
+kapi := arch/$(SRCARCH)/include/generated/asm
+uapi := arch/$(SRCARCH)/include/generated/uapi/asm
+
+_dummy := $(shell [ -d '$(uapi)' ] || mkdir -p '$(uapi)') \
+ $(shell [ -d '$(kapi)' ] || mkdir -p '$(kapi)')
+
+syscall := $(srctree)/$(src)/syscall.tbl
+syshdr := $(srctree)/$(src)/syscallhdr.sh
+systbl := $(srctree)/$(src)/syscalltbl.sh
+
+quiet_cmd_syshdr = SYSHDR  $@
+  cmd_syshdr = $(CONFIG_SHELL) '$(syshdr)' '$<' '$@'  \
+  '$(syshdr_abi_$(basetarget))'  \
+  '$(syshdr_pfx_$(basetarget))'  \
+  '$(syshdr_offset_$(basetarget))'
+
+quiet_cmd_systbl = SYSTBL  $@
+  cmd_systbl = $(CONFIG_SHELL) '$(systbl)' '$<' '$@'  \
+  '$(systbl_abi_$(basetarget))'  \
+  '$(systbl_offset_$(basetarget))'
+
+syshdr_offset_unistd_64 := __NR_Linux
+$(uapi)/unistd_64.h: $(syscall) $(syshdr)
+   $(call if_changed,syshdr)
+
+systbl_offset_syscall_table := 1024
+$(kapi)/syscall_table.h: $(syscall) $(systbl)
+   $(call if_changed,systbl)
+
+uapisyshdr-y   += unistd_64.h
+kapisyshdr-y   += syscall_table.h
+
+targets+= $(uapisyshdr-y) $(kapisyshdr-y)
+
+PHONY += all
+all: $(addprefix $(uapi)/,$(uapisyshdr-y))
+all: $(addprefix $(kapi)/,$(kapisyshdr-y))
+   @:
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
new file mode 100644
index 000..6b64f60
--- /dev/null
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -0,0 +1,337 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# Linux system call numbers and entry vectors for IA64
+#
+# The format is:
+# <number> <abi> <name> <entry point>
+#
+# Add 1024 to <number> will get the actual system call number
+#
+# The <abi> is always "common" for this file
+#
+0       common  ni_syscall              sys_ni_syscall
+1       common  exit                    sys_exit
+2       common  read                    sys_read
+3       common  write                   sys_write
+4       common  open                    sys_open
+5       common  close                   sys_close
+6       common  creat                   sys_creat
+7       common  link                    sys_link
+8       common  unlink                  sys_unlink
+9       common  execve                  ia64_execve
+10      common  chdir                   sys_chdir
+11      common  fchdir                  sys_fchdir
+12      common  utimes                  sys_utimes
+13      common  mknod                   sys_mknod
+14      common  chmod                   sys_chmod
+15  common  chown

[PATCH v3 6/7] ia64: uapi header and system call table file generation

2018-10-10 Thread Firoz Khan

The system call table generation scripts must be run to generate
the unistd_64.h and syscall_table.h files. This patch adds the
changes that invoke the scripts.

unistd_64.h and syscall_table.h are generated by the scripts
invoked from arch/ia64/Makefile, and the generated content is
identical to the content removed by this patch.

The generated uapi header is included by uapi/asm/unistd.h, and
the generated system call table file is included by
arch/ia64/kernel/syscall_table.S.
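
As a rough illustration of the table side of this, the sketch below turns
syscall.tbl-style rows into one line per system call. It is not the
syscalltbl.sh from this series: the function name gen_syscall_table and the
__SYSCALL(nr, entry) output format are assumptions made for the example, and
the offset of 1024 is taken from the Makefile variable
systbl_offset_syscall_table.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the syscalltbl.sh from this series.
# Reads syscall.tbl-style rows (number, abi, name, entry point) on stdin and
# prints one "__SYSCALL(<absolute number>, <entry point>)" line per row.
# The __SYSCALL() output format is an assumption made for illustration.
gen_syscall_table() {
	offset="$1"	# numeric offset added to every table number, e.g. 1024
	awk -v off="$offset" '
		# Only rows that start with a number are table entries;
		# comment lines and blank lines are skipped.
		/^[0-9]+[ \t]/ { printf "__SYSCALL(%d, %s)\n", $1 + off, $4 }'
}

# Usage example with two rows copied from the table in this series.
gen_syscall_table 1024 <<'EOF'
# number abi name entry-point
1   common  exit    sys_exit
9   common  execve  ia64_execve
EOF
```

A file in this shape could then be pulled into an assembly table by defining
the macro before including it, which is the general approach other
architectures take; the exact macro used by syscall_table.S is not shown in
this message.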

Signed-off-by: Firoz Khan 
---
 arch/ia64/Makefile                  |   3 +
 arch/ia64/include/asm/Kbuild        |   1 +
 arch/ia64/include/uapi/asm/Kbuild   |   1 +
 arch/ia64/include/uapi/asm/unistd.h | 332 +---
 arch/ia64/kernel/syscall_table.S    | 331 +--
 5 files changed, 9 insertions(+), 659 deletions(-)

diff --git a/arch/ia64/Makefile b/arch/ia64/Makefile
index 45f5980..320d86f 100644
--- a/arch/ia64/Makefile
+++ b/arch/ia64/Makefile
@@ -80,6 +80,9 @@ unwcheck: vmlinux
 archclean:
    $(Q)$(MAKE) $(clean)=$(boot)
 
+archheaders:
+   $(Q)$(MAKE) $(build)=arch/ia64/kernel/syscalls all
+
 CLEAN_FILES += vmlinux.gz bootloader
 
 boot:  lib/lib.a vmlinux
diff --git a/arch/ia64/include/asm/Kbuild b/arch/ia64/include/asm/Kbuild
index 557bbc8..5b17695 100644
--- a/arch/ia64/include/asm/Kbuild
+++ b/arch/ia64/include/asm/Kbuild
@@ -7,3 +7,4 @@ generic-y += preempt.h
 generic-y += trace_clock.h
 generic-y += vtime.h
 generic-y += word-at-a-time.h
+generic-y += syscall_table.h
diff --git a/arch/ia64/include/uapi/asm/Kbuild b/arch/ia64/include/uapi/asm/Kbuild
index 3982e67..5c30543 100644
--- a/arch/ia64/include/uapi/asm/Kbuild
+++ b/arch/ia64/include/uapi/asm/Kbuild
@@ -8,3 +8,4 @@ generic-y += msgbuf.h
 generic-y += poll.h
 generic-y += sembuf.h
 generic-y += shmbuf.h
+generic-y += unistd_64.h
diff --git a/arch/ia64/include/uapi/asm/unistd.h b/arch/ia64/include/uapi/asm/unistd.h
index bd2575f..286349b 100644
--- a/arch/ia64/include/uapi/asm/unistd.h
+++ b/arch/ia64/include/uapi/asm/unistd.h
@@ -13,336 +13,6 @@
 #define __BREAK_SYSCALL  __IA64_BREAK_SYSCALL
 
 #define __NR_Linux  1024
-#define __NR_ni_syscall     (__NR_Linux + 0)
-#define __NR_exit           (__NR_Linux + 1)
-#define __NR_read           (__NR_Linux + 2)
-#define __NR_write          (__NR_Linux + 3)
-#define __NR_open           (__NR_Linux + 4)
-#define __NR_close          (__NR_Linux + 5)
-#define __NR_creat          (__NR_Linux + 6)
-#define __NR_link           (__NR_Linux + 7)
-#define __NR_unlink         (__NR_Linux + 8)
-#define __NR_execve         (__NR_Linux + 9)
-#define __NR_chdir          (__NR_Linux + 10)
-#define __NR_fchdir         (__NR_Linux + 11)
-#define __NR_utimes         (__NR_Linux + 12)
-#define __NR_mknod          (__NR_Linux + 13)
-#define __NR_chmod          (__NR_Linux + 14)
-#define __NR_chown          (__NR_Linux + 15)
-#define __NR_lseek          (__NR_Linux + 16)
-#define __NR_getpid         (__NR_Linux + 17)
-#define __NR_getppid        (__NR_Linux + 18)
-#define __NR_mount          (__NR_Linux + 19)
-#define __NR_umount         (__NR_Linux + 20)
-#define __NR_setuid         (__NR_Linux + 21)
-#define __NR_getuid         (__NR_Linux + 22)
-#define __NR_geteuid        (__NR_Linux + 23)
-#define __NR_ptrace         (__NR_Linux + 24)
-#define __NR_access         (__NR_Linux + 25)
-#define __NR_sync           (__NR_Linux + 26)
-#define __NR_fsync          (__NR_Linux + 27)
-#define __NR_fdatasync      (__NR_Linux + 28)
-#define __NR_kill           (__NR_Linux + 29)
-#define __NR_rename         (__NR_Linux + 30)
-#define __NR_mkdir          (__NR_Linux + 31)
-#define __NR_rmdir          (__NR_Linux + 32)
-#define __NR_dup            (__NR_Linux + 33)
-#define __NR_pipe           (__NR_Linux + 34)
-#define __NR_times          (__NR_Linux + 35)
-#define __NR_brk            (__NR_Linux + 36)
-#define __NR_setgid         (__NR_Linux + 37)
-#define __NR_getgid         (__NR_Linux + 38)
-#define __NR_getegid        (__NR_Linux + 39)
-#define __NR_acct           (__NR_Linux + 40)
-#define __NR_ioctl          (__NR_Linux + 41)
-#define __NR_fcntl          (__NR_Linux + 42)
-#define __NR_umask          (__NR_Linux + 43)
-#define __NR_chroot         (__NR_Linux + 44)
-#define __NR_ustat          (__NR_Linux + 45)
-#define __NR_dup2           (__NR_Linux + 46)
-#define __NR_setreuid       (__NR_Linux + 47)
-#define __NR_setregid       (__NR_Linux + 48)
-#define __NR_getresuid      (__NR_Linux + 49)
-#define __NR_setresuid      (__NR_Linux + 50)
-#define __NR_getresgid      (__NR_Linux + 51)
-#define __NR_setresgid      (__NR_Linux + 52)
-#define __NR_getgroups      (__NR_Linux + 53)
-#define __NR_setgroups      (__NR_Linux + 54)
-#define __NR_getpgid        (__NR_Linux + 55)
-#define __NR_setpgid        (__NR_Linux + 56)
-#define __NR_setsid         (__NR_Linux + 57)
-#define __NR_getsid         (__NR_Linux + 58)
-#define __NR_sethostname    (__NR_Linux + 59)
-#define __NR_setrlimit      (__NR_Linux + 60)
-#define __NR_getrlimit      (__NR_Linux + 61)
-#define __NR_getrusage      (__NR_Linux + 62)
-#define __NR_gettimeofday   (__NR_Linux + 63)
-#define __NR_settimeofday   (__NR_Linux + 64)
-#define __NR_select

[PATCH v3 4/7] ia64: replace the system call table entries from entry.S

2018-10-10 Thread Firoz Khan

On IA64, the system call table entries are part of the entry.S file.
They need to live in a separate file so that the system call table
generation script, added by a later patch in this series, can handle
the table entries on their own.

Move the system call table from entry.S to a new file,
syscall_table.S. This unifies the implementation across
architectures and simplifies generating the system call table with
the script.

Signed-off-by: Firoz Khan 
---
 arch/ia64/kernel/entry.S         | 333 +-
 arch/ia64/kernel/syscall_table.S | 334 +++
 2 files changed, 335 insertions(+), 332 deletions(-)
 create mode 100644 arch/ia64/kernel/syscall_table.S

diff --git a/arch/ia64/kernel/entry.S b/arch/ia64/kernel/entry.S
index 68362b3..249b2e9 100644
--- a/arch/ia64/kernel/entry.S
+++ b/arch/ia64/kernel/entry.S
@@ -1426,335 +1426,4 @@ END(ftrace_stub)
 
 #endif /* CONFIG_FUNCTION_TRACER */
 
-   .rodata
-   .align 8
-   .globl sys_call_table
-sys_call_table:
-   data8 sys_ni_syscall  //  This must be sys_ni_syscall!  See ivt.S.
-   data8 sys_exit  // 1025
-   data8 sys_read
-   data8 sys_write
-   data8 sys_open
-   data8 sys_close
-   data8 sys_creat // 1030
-   data8 sys_link
-   data8 sys_unlink
-   data8 ia64_execve
-   data8 sys_chdir
-   data8 sys_fchdir  // 1035
-   data8 sys_utimes
-   data8 sys_mknod
-   data8 sys_chmod
-   data8 sys_chown
-   data8 sys_lseek // 1040
-   data8 sys_getpid
-   data8 sys_getppid
-   data8 sys_mount
-   data8 sys_umount
-   data8 sys_setuid  // 1045
-   data8 sys_getuid
-   data8 sys_geteuid
-   data8 sys_ptrace
-   data8 sys_access
-   data8 sys_sync  // 1050
-   data8 sys_fsync
-   data8 sys_fdatasync
-   data8 sys_kill
-   data8 sys_rename
-   data8 sys_mkdir // 1055
-   data8 sys_rmdir
-   data8 sys_dup
-   data8 sys_ia64_pipe
-   data8 sys_times
-   data8 ia64_brk  // 1060
-   data8 sys_setgid
-   data8 sys_getgid
-   data8 sys_getegid
-   data8 sys_acct
-   data8 sys_ioctl // 1065
-   data8 sys_fcntl
-   data8 sys_umask
-   data8 sys_chroot
-   data8 sys_ustat
-   data8 sys_dup2  // 1070
-   data8 sys_setreuid
-   data8 sys_setregid
-   data8 sys_getresuid
-   data8 sys_setresuid
-   data8 sys_getresgid // 1075
-   data8 sys_setresgid
-   data8 sys_getgroups
-   data8 sys_setgroups
-   data8 sys_getpgid
-   data8 sys_setpgid   // 1080
-   data8 sys_setsid
-   data8 sys_getsid
-   data8 sys_sethostname
-   data8 sys_setrlimit
-   data8 sys_getrlimit // 1085
-   data8 sys_getrusage
-   data8 sys_gettimeofday
-   data8 sys_settimeofday
-   data8 sys_select
-   data8 sys_poll  // 1090
-   data8 sys_symlink
-   data8 sys_readlink
-   data8 sys_uselib
-   data8 sys_swapon
-   data8 sys_swapoff   // 1095
-   data8 sys_reboot
-   data8 sys_truncate
-   data8 sys_ftruncate
-   data8 sys_fchmod
-   data8 sys_fchown  // 1100
-   data8 ia64_getpriority
-   data8 sys_setpriority
-   data8 sys_statfs
-   data8 sys_fstatfs
-   data8 sys_gettid  // 1105
-   data8 sys_semget
-   data8 sys_semop
-   data8 sys_semctl
-   data8 sys_msgget
-   data8 sys_msgsnd  // 1110
-   data8 sys_msgrcv
-   data8 sys_msgctl
-   data8 sys_shmget
-   data8 sys_shmat
-   data8 sys_shmdt // 1115
-   data8 sys_shmctl
-   data8 sys_syslog
-   data8 sys_setitimer
-   data8 sys_getitimer
-   data8 sys_ni_syscall  // 1120 /* was: ia64_oldstat */
-   data8 sys_ni_syscall  /* was: ia64_oldlstat */
-   data8 sys_ni_syscall  /* was: ia64_oldfstat */
-   data8 sys_vhangup
-   data8 sys_lchown
-   data8 sys_remap_file_pages  // 1125
-   data8 sys_wait4
-   data8 sys_sysinfo
-   data8 sys_clone
-   data8 sys_setdomainname
-   data8 sys_newuname  // 1130
-   data8 sys_adjtimex
-   data8 sys_ni_syscall  /* was: ia64_create_module */
-   data8 sys_init_module
-   data8 sys_delete_module
-   data8 sys_ni_syscall  // 1135

[PATCH v3 5/7] ia64: add system call table generation support

2018-10-10 Thread Firoz Khan

The system call tables use a different format on every
architecture, which makes it difficult to add or modify system
calls by hand in the respective files. To make this easier, add
scripts that generate the header file and the syscall table
file, so that the format is unified across all architectures.

The generation scripts are added to the syscalls directory. It
contains the scripts that generate the uapi header file and the
system call table file, and the syscall.tbl file that is the
input for the scripts.

syscall.tbl contains the list of available system calls along
with the system call number and the corresponding entry point.
A new system call can be added on this architecture by adding a
new entry to the syscall.tbl file.

A new table entry consists of:
- System call number.
- ABI.
- System call name.
- Entry point name.

syscallhdr.sh and syscalltbl.sh generate the uapi header
unistd_64.h and the syscall_table.h file, respectively.
syscall_table.h is included by syscall_table.S, the real
system call table. Both scripts parse the content of
syscall.tbl to generate the header and table files.
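
To make the header side concrete, here is a minimal sketch of how a script
could turn syscall.tbl rows into #define lines. It is not the syscallhdr.sh
added by this patch: the function name gen_syscall_header and the exact
output layout are assumptions for illustration, and only the four-column
input format and the __NR_Linux offset come from this series.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the syscallhdr.sh from this series.
# Reads syscall.tbl-style rows (number, abi, name, entry point) on stdin
# and prints one "#define __NR_<name>" line per row, sorted by number.
gen_syscall_header() {
	offset="$1"	# symbolic offset used in the defines, e.g. __NR_Linux
	# Keep only rows that start with a number, so comment lines and
	# blank lines are dropped, then emit one define per row.
	grep -E '^[0-9]+[[:space:]]' | sort -n |
	while read -r nr abi name entry; do
		printf '#define __NR_%s\t(%s + %s)\n' "$name" "$offset" "$nr"
	done
}

# Usage example with an inline three-row table.
gen_syscall_header __NR_Linux <<'EOF'
# number abi name entry-point
1 common exit sys_exit
2 common read sys_read
3 common write sys_write
EOF
```

Keeping the offset symbolic means the generated defines keep the same
(__NR_Linux + n) form that the hand-written header uses after the offset
patch in this series.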

ARM, s390 and x86 already have similar support. I leveraged
their implementations to come up with a generic solution. This
is also the groundwork for the y2038 issue: about two dozen
system call implementations need to change, and this work
reduces the effort to modifying two dozen entries in
syscall.tbl.

Signed-off-by: Firoz Khan 
---
 arch/ia64/kernel/syscalls/Makefile      |  39 
 arch/ia64/kernel/syscalls/syscall.tbl   | 337 
 arch/ia64/kernel/syscalls/syscallhdr.sh |  35 
 arch/ia64/kernel/syscalls/syscalltbl.sh |  37 
 4 files changed, 448 insertions(+)
 create mode 100644 arch/ia64/kernel/syscalls/Makefile
 create mode 100644 arch/ia64/kernel/syscalls/syscall.tbl
 create mode 100644 arch/ia64/kernel/syscalls/syscallhdr.sh
 create mode 100644 arch/ia64/kernel/syscalls/syscalltbl.sh

diff --git a/arch/ia64/kernel/syscalls/Makefile b/arch/ia64/kernel/syscalls/Makefile
new file mode 100644
index 000..011cf31
--- /dev/null
+++ b/arch/ia64/kernel/syscalls/Makefile
@@ -0,0 +1,39 @@
+# SPDX-License-Identifier: GPL-2.0
+kapi := arch/$(SRCARCH)/include/generated/asm
+uapi := arch/$(SRCARCH)/include/generated/uapi/asm
+
+_dummy := $(shell [ -d '$(uapi)' ] || mkdir -p '$(uapi)') \
+ $(shell [ -d '$(kapi)' ] || mkdir -p '$(kapi)')
+
+syscall := $(srctree)/$(src)/syscall.tbl
+syshdr := $(srctree)/$(src)/syscallhdr.sh
+systbl := $(srctree)/$(src)/syscalltbl.sh
+
+quiet_cmd_syshdr = SYSHDR  $@
+  cmd_syshdr = $(CONFIG_SHELL) '$(syshdr)' '$<' '$@'  \
+  '$(syshdr_abi_$(basetarget))'  \
+  '$(syshdr_pfx_$(basetarget))'  \
+  '$(syshdr_offset_$(basetarget))'
+
+quiet_cmd_systbl = SYSTBL  $@
+  cmd_systbl = $(CONFIG_SHELL) '$(systbl)' '$<' '$@'  \
+  '$(systbl_abi_$(basetarget))'  \
+  '$(systbl_offset_$(basetarget))'
+
+syshdr_offset_unistd_64 := __NR_Linux
+$(uapi)/unistd_64.h: $(syscall) $(syshdr)
+   $(call if_changed,syshdr)
+
+systbl_offset_syscall_table := 1024
+$(kapi)/syscall_table.h: $(syscall) $(systbl)
+   $(call if_changed,systbl)
+
+uapisyshdr-y   += unistd_64.h
+kapisyshdr-y   += syscall_table.h
+
+targets += $(uapisyshdr-y) $(kapisyshdr-y)
+
+PHONY += all
+all: $(addprefix $(uapi)/,$(uapisyshdr-y))
+all: $(addprefix $(kapi)/,$(kapisyshdr-y))
+   @:
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
new file mode 100644
index 000..6b64f60
--- /dev/null
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -0,0 +1,337 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# Linux system call numbers and entry vectors for IA64
+#
+# The format is:
+# <number> <abi> <name> <entry point>
+#
+# Add 1024 to <number> will get the actual system call number
+#
+# The <abi> is always "common" for this file
+#
+0   common  ni_syscall  sys_ni_syscall
+1   common  exit        sys_exit
+2   common  read        sys_read
+3   common  write       sys_write
+4   common  open        sys_open
+5   common  close       sys_close
+6   common  creat       sys_creat
+7   common  link        sys_link
+8   common  unlink      sys_unlink
+9   common  execve      ia64_execve
+10  common  chdir       sys_chdir
+11  common  fchdir      sys_fchdir
+12  common  utimes      sys_utimes
+13  common  mknod       sys_mknod
+14  common  chmod       sys_chmod
+15  common  chown

[PATCH v3 2/7] ia64: replace NR_syscalls macro from asm/unistd.h

2018-10-10 Thread Firoz Khan

The NR_syscalls macro holds the number of system calls that exist
on the IA64 architecture. This macro is currently part of the
asm/unistd.h file. Its value has to be changed whenever a system
call is added or deleted.

One of the patches in this series has a script that generates a
uapi header based on the syscall.tbl file. The syscall.tbl file
contains the system call numbers. So we have two options for
updating the NR_syscalls value.

1. Update NR_syscalls in asm/unistd.h manually by counting the
   number of system calls. NR_syscalls does not need updating
   until a system call is added or an existing one is deleted.

2. Keep this feature in the above-mentioned script, which counts
   the number of syscalls and keeps the count in a generated file.
   In this case NR_syscalls in asm/unistd.h does not need to be
   updated explicitly.

The 2nd option is the recommended one. For that, I came up with
another macro, __NR_syscalls, which will be updated by the script
and will be present in uapi/asm/unistd.h. The macro name changed
from NR_syscalls to __NR_syscalls to make the naming convention
the same across all architectures. While __NR_syscalls isn't
strictly part of the uapi, having it as part of the generated
header simplifies the implementation. We also need to enclose
this macro in #ifdef __KERNEL__ to avoid side effects.
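
A minimal sketch of option 2, assuming the script sees syscall.tbl-style rows
on stdin. The helper names count_syscalls and gen_nr_syscalls are
hypothetical and this is not the code from the series. The sketch also
assumes that the table length is the highest number plus one, which stays
correct if the table ever contains unused slots; plain row counting would be
the other reading of "count the number of syscalls".

```shell
#!/bin/sh
# Hypothetical sketch, NOT the script from this series.
# Derives the table length from syscall.tbl-style rows on stdin.
count_syscalls() {
	awk '
		# Numbering starts at 0, so the length is the highest number + 1.
		/^[0-9]+[ \t]/ { seen = 1; if ($1 + 0 > max) max = $1 + 0 }
		END { print (seen ? max + 1 : 0) }'
}

# Emits the guarded macro in the form described in the text above.
gen_nr_syscalls() {
	n=$(count_syscalls)
	printf '#ifdef __KERNEL__\n#define __NR_syscalls\t%s\n#endif\n' "$n"
}

# Usage example: three rows numbered 0..2 give a length of 3.
gen_nr_syscalls <<'EOF'
0 common ni_syscall sys_ni_syscall
1 common exit sys_exit
2 common read sys_read
EOF
```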

Signed-off-by: Firoz Khan 
---
 arch/ia64/include/asm/unistd.h      | 4 +---
 arch/ia64/include/uapi/asm/unistd.h | 4 
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/ia64/include/asm/unistd.h b/arch/ia64/include/asm/unistd.h
index ffb705d..397b143 100644
--- a/arch/ia64/include/asm/unistd.h
+++ b/arch/ia64/include/asm/unistd.h
@@ -10,9 +10,7 @@
 
 #include 
 
-
-
-#define NR_syscalls  326 /* length of syscall table */
+#define NR_syscalls  __NR_syscalls /* length of syscall table */
 
 /*
  * The following defines stop scripts/checksyscalls.sh from complaining about
diff --git a/arch/ia64/include/uapi/asm/unistd.h b/arch/ia64/include/uapi/asm/unistd.h
index 4d590c9..4186dc2 100644
--- a/arch/ia64/include/uapi/asm/unistd.h
+++ b/arch/ia64/include/uapi/asm/unistd.h
@@ -341,4 +341,8 @@
 #define __NR_preadv2   1348
 #define __NR_pwritev2  1349
 
+#ifdef __KERNEL__
+#define __NR_syscalls  326
+#endif
+
 #endif /* _UAPI_ASM_IA64_UNISTD_H */
-- 
1.9.1

[PATCH v3 1/7] ia64: add __NR_old_getpagesize in uapi/asm/unistd.h

2018-10-10 Thread Firoz Khan

The sys_getpagesize entry is present in the entry.S file to support
an old user interface, so a uapi entry needs to be added too.

Add __NR_old_getpagesize so that old user space does not break,
since the number is reserved for backwards compatibility with the
old __NR_getpagesize.

Signed-off-by: Firoz Khan 
---
 arch/ia64/include/uapi/asm/unistd.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/include/uapi/asm/unistd.h b/arch/ia64/include/uapi/asm/unistd.h
index 5fe71d4..4d590c9 100644
--- a/arch/ia64/include/uapi/asm/unistd.h
+++ b/arch/ia64/include/uapi/asm/unistd.h
@@ -161,7 +161,7 @@
 #define __NR_nanosleep 1168
 #define __NR_nfsservctl         1169
 #define __NR_prctl              1170
-/* 1171 is reserved for backwards compatibility with old __NR_getpagesize */
+#define __NR_old_getpagesize    1171
 #define __NR_mmap2              1172
 #define __NR_pciconfig_read     1173
 #define __NR_pciconfig_write   1174
-- 
1.9.1

[PATCH v3 3/7] ia64: add an offset for system call number

2018-10-10 Thread Firoz Khan

System call numbers on the IA64 architecture start at 1024, whereas
most other architectures start at 0. In order to come up with a
common implementation for generating the uapi header, add an
offset, __NR_Linux, with the value 1024.

One of the patches in this series has a script that generates the
uapi header from the syscall.tbl file, which contains the system
call numbers. With __NR_Linux, the numbers in syscall.tbl can
start from 0 instead of 1024.
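
The conversion this patch applies to the header can be sketched mechanically
as below. The helper name to_offset_form is hypothetical and the series does
not include such a tool; the example only illustrates the arithmetic, where
an absolute number N becomes (__NR_Linux + (N - 1024)).

```shell
#!/bin/sh
# Hypothetical sketch: rewrite absolute "#define __NR_<name> <number>" lines
# read from stdin into the offset form relative to __NR_Linux.
to_offset_form() {
	base="$1"	# numeric value of __NR_Linux, 1024 on IA64
	awk -v base="$base" '
		# Rewrite only numeric __NR_ defines; leave __NR_Linux itself alone.
		$1 == "#define" && $2 ~ /^__NR_/ && $2 != "__NR_Linux" && $3 ~ /^[0-9]+$/ {
			printf "#define %s\t(__NR_Linux + %d)\n", $2, $3 - base
			next
		}
		# Every other line is passed through unchanged.
		{ print }'
}

# Usage example with two defines taken from the removed lines in this patch.
to_offset_form 1024 <<'EOF'
#define __NR_Linux 1024
#define __NR_exit 1025
#define __NR_execve 1033
EOF
```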

Signed-off-by: Firoz Khan 
---
 arch/ia64/include/uapi/asm/unistd.h | 658 ++--
 1 file changed, 329 insertions(+), 329 deletions(-)

diff --git a/arch/ia64/include/uapi/asm/unistd.h b/arch/ia64/include/uapi/asm/unistd.h
index 4186dc2..bd2575f 100644
--- a/arch/ia64/include/uapi/asm/unistd.h
+++ b/arch/ia64/include/uapi/asm/unistd.h
@@ -8,338 +8,338 @@
 #ifndef _UAPI_ASM_IA64_UNISTD_H
 #define _UAPI_ASM_IA64_UNISTD_H
 
-
 #include 
 
-#define __BREAK_SYSCALL __IA64_BREAK_SYSCALL
+#define __BREAK_SYSCALL __IA64_BREAK_SYSCALL
 
-#define __NR_ni_syscall     1024
-#define __NR_exit           1025
-#define __NR_read           1026
-#define __NR_write          1027
-#define __NR_open           1028
-#define __NR_close          1029
-#define __NR_creat          1030
-#define __NR_link           1031
-#define __NR_unlink         1032
-#define __NR_execve         1033
-#define __NR_chdir          1034
-#define __NR_fchdir         1035
-#define __NR_utimes         1036
-#define __NR_mknod          1037
-#define __NR_chmod          1038
-#define __NR_chown          1039
-#define __NR_lseek          1040
-#define __NR_getpid         1041
-#define __NR_getppid        1042
-#define __NR_mount          1043
-#define __NR_umount         1044
-#define __NR_setuid         1045
-#define __NR_getuid         1046
-#define __NR_geteuid        1047
-#define __NR_ptrace         1048
-#define __NR_access         1049
-#define __NR_sync           1050
-#define __NR_fsync          1051
-#define __NR_fdatasync      1052
-#define __NR_kill           1053
-#define __NR_rename         1054
-#define __NR_mkdir          1055
-#define __NR_rmdir          1056
-#define __NR_dup            1057
-#define __NR_pipe           1058
-#define __NR_times          1059
-#define __NR_brk            1060
-#define __NR_setgid         1061
-#define __NR_getgid         1062
-#define __NR_getegid        1063
-#define __NR_acct           1064
-#define __NR_ioctl          1065
-#define __NR_fcntl          1066
-#define __NR_umask          1067
-#define __NR_chroot         1068
-#define __NR_ustat          1069
-#define __NR_dup2           1070
-#define __NR_setreuid       1071
-#define __NR_setregid       1072
-#define __NR_getresuid      1073
-#define __NR_setresuid      1074
-#define __NR_getresgid      1075
-#define __NR_setresgid      1076
-#define __NR_getgroups      1077
-#define __NR_setgroups      1078
-#define __NR_getpgid        1079
-#define __NR_setpgid        1080
-#define __NR_setsid         1081
-#define __NR_getsid         1082
-#define __NR_sethostname    1083
-#define __NR_setrlimit      1084
-#define __NR_getrlimit      1085
-#define __NR_getrusage      1086
-#define __NR_gettimeofday   1087
-#define __NR_settimeofday   1088
-#define __NR_select         1089
-#define __NR_poll           1090
-#define __NR_symlink        1091
-#define __NR_readlink       1092
-#define __NR_uselib         1093
-#define __NR_swapon         1094
-#define __NR_swapoff        1095
-#define __NR_reboot         1096
-#define __NR_truncate       1097
-#define __NR_ftruncate      1098
-#define __NR_fchmod         1099
-#define __NR_fchown         1100
-#define __NR_getpriority    1101
-#define __NR_setpriority    1102
-#define __NR_statfs         1103
-#define __NR_fstatfs        1104
-#define __NR_gettid         1105
-#define __NR_semget         1106
-#define __NR_semop          1107
-#define __NR_semctl         1108
-#define __NR_msgget         1109
-#define __NR_msgsnd         1110
-#define

[PATCH v3 0/7] ia64: system call table generation support

2018-10-10 Thread Firoz Khan

The purpose of this patch series is:
1. We can easily add/modify/delete a system call by changing its
entry in the syscall.tbl file. There is no need to manually edit
many files.

2. It becomes easy to unify the system call implementation across
all architectures.

The system call tables use a different format on every architecture,
which makes it difficult to add or modify system calls by hand in
the respective files. To make this easier, add scripts that generate
the header file and the syscall table file, so that the format is
unified across all architectures.

syscall.tbl contains the list of available system calls along with
the system call number and the corresponding entry point. A new
system call can be added on this architecture by adding a new entry
to the syscall.tbl file.

A new table entry consists of:
- System call number.
- ABI.
- System call name.
- Entry point name.
- Compat entry name, if required.
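
Since adding a system call then comes down to appending one row, a small
sanity check on the table can catch malformed entries early. The sketch
below is an assumption-laden illustration: the helper name check_syscall_tbl
is hypothetical and no such checker is part of this series. It only verifies
that every row has at least the four mandatory fields listed above and that
no system call number is used twice.

```shell
#!/bin/sh
# Hypothetical sketch: validate syscall.tbl-style rows read from stdin.
# Exits with status 0 if all rows look sane and 1 otherwise.
check_syscall_tbl() {
	awk '
		/^[0-9]+[ \t]/ {
			# number, abi, name and entry point are mandatory;
			# a fifth compat field is optional.
			if (NF < 4) {
				printf "line %d: expected at least 4 fields\n", NR
				bad = 1
			}
			if ($1 in seen) {
				printf "line %d: duplicate number %s\n", NR, $1
				bad = 1
			}
			seen[$1] = 1
		}
		END { exit (bad ? 1 : 0) }'
}

# Usage example: a well-formed two-row table passes silently.
check_syscall_tbl <<'EOF'
0 common ni_syscall sys_ni_syscall
1 common exit sys_exit
EOF
```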

ARM, s390 and x86 already have similar support. I leveraged their
implementations to come up with a generic solution.

I have done the same work for alpha, microblaze, sparc, mips,
parisc, powerpc, sh, and xtensa, but I am sending the patches one
architecture at a time for review. The git repository below
contains more details.
Git repo:- https://github.com/frzkhn/system_call_table_generator/

In the v3 patch series, I wired up the perf_event_open, seccomp,
pkey_mprotect, pkey_alloc, pkey_free, statx, io_pgetevents and rseq
system calls. These require an architecture-specific implementation,
as it is not present now.

Finally, this is the groundwork for solving the Y2038 issue. About
two dozen system calls need to be added or changed to solve it, so
this series makes it easy to move from the existing system calls to
Y2038-compatible ones.

Firoz Khan (7):
  ia64: add __NR_old_getpagesize in uapi/asm/unistd.h
  ia64: replace NR_syscalls macro from asm/unistd.h
  ia64: add an offset for system call number
  ia64: replace the system call table entries from entry.S
  ia64: add system call table generation support
  ia64: uapi header and system call table file generation
  ia64: wire up system calls

 arch/ia64/Makefile                      |   3 +
 arch/ia64/include/asm/Kbuild            |   1 +
 arch/ia64/include/asm/unistd.h          |   4 +-
 arch/ia64/include/uapi/asm/Kbuild       |   1 +
 arch/ia64/include/uapi/asm/unistd.h     | 332 +-
 arch/ia64/kernel/entry.S                | 333 +-
 arch/ia64/kernel/syscall_table.S        |   9 +
 arch/ia64/kernel/syscalls/Makefile      |  39 
 arch/ia64/kernel/syscalls/syscall.tbl   | 353 
 arch/ia64/kernel/syscalls/syscallhdr.sh |  35 
 arch/ia64/kernel/syscalls/syscalltbl.sh |  37 
 11 files changed, 483 insertions(+), 664 deletions(-)
 create mode 100644 arch/ia64/kernel/syscall_table.S
 create mode 100644 arch/ia64/kernel/syscalls/Makefile
 create mode 100644 arch/ia64/kernel/syscalls/syscall.tbl
 create mode 100644 arch/ia64/kernel/syscalls/syscallhdr.sh
 create mode 100644 arch/ia64/kernel/syscalls/syscalltbl.sh

-- 
1.9.1

[PATCH v15 1/2] leds: core: Introduce LED pattern trigger

2018-10-10 Thread Baolin Wang

This patch adds a new LED trigger with which an LED device can employ
a software or hardware pattern engine.

Consumers can write to the 'pattern' file to enable the software
pattern, which alters the brightness for the specified duration with
one software timer.

Moreover, consumers can write to the 'hw_pattern' file to enable the
hardware pattern for some LED controllers, which can autonomously
control brightness over time according to some preprogrammed hardware
patterns.
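
For readers who want a feel for the software pattern, here is a small
userspace C model of the brightness/duration tuples documented in the ABI
file of this patch. It is only a sketch and not the code in
ledtrig-pattern.c: the names are made up, the brightness is sampled at an
exact time instead of once per 50 ms timer tick, and the pattern is assumed
to repeat forever.

```c
#include <stddef.h>

/* One pattern tuple: a brightness and the duration (ms) spent on it. */
struct pattern_tuple {
	int brightness;
	unsigned int delta_t;
};

/* Dimming interval from the ABI text; shorter tuples are step changes. */
#define DIMMING_INTERVAL_MS 50u

/*
 * Model of the brightness at time t_ms for a repeating pattern.
 * During a tuple the value moves linearly from that tuple's brightness
 * towards the next tuple's brightness.
 */
static int pattern_brightness_at(const struct pattern_tuple *p, size_t n,
				 unsigned int t_ms)
{
	unsigned int total = 0;
	size_t i;

	if (n == 0)
		return 0;
	for (i = 0; i < n; i++)
		total += p[i].delta_t;
	if (total == 0)
		return p[0].brightness;

	t_ms %= total;
	for (i = 0; i < n; i++) {
		const struct pattern_tuple *cur = &p[i];
		const struct pattern_tuple *next = &p[(i + 1) % n];

		if (t_ms < cur->delta_t) {
			/* Short tuple: hold the value, no dimming steps. */
			if (cur->delta_t < DIMMING_INTERVAL_MS)
				return cur->brightness;
			return cur->brightness +
			       (next->brightness - cur->brightness) *
			       (int)t_ms / (int)cur->delta_t;
		}
		t_ms -= cur->delta_t;
	}
	return p[n - 1].brightness; /* not reached */
}
```

With the tuples from "echo 0 1000 255 2000 > pattern", the model ramps from
0 up to 255 during the first second and back down over the next two.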

Signed-off-by: Raphael Teysseyre 
Signed-off-by: Baolin Wang 
---
Changes from v14:
 - Improve the commit message and ABI documentation.
 - Fix some coding style issues.
 - Do not limit the tuple's duration to be larger than 50 ms, and treat
 a shorter one as a step change of brightness.

Changes from v13:
 - Add duration validation for gradual dimming.
 - Coding style optimization.

Changes from v12:
 - Add gradual dimming support for software pattern.

Changes from v11:
 - Change -1 to mean repeat indefinitely.

Changes from v10:
 - Change 'int' to 'u32' for delta_t field.

Changes from v9:
 - None.

Changes from v8:
 - None.

Changes from v7:
 - Move the SC27XX hardware patterns description into its own ABI file.

Changes from v6:
 - Improve commit message.
 - Optimize the description of the hw_pattern file.
 - Simplify some logic.

Changes from v5:
 - Add one 'hw_pattern' file for hardware patterns.

Changes from v4:
 - Change the repeat file to return the originally written number.
 - Improve comments.
 - Fix some build warnings.

Changes from v3:
 - Reset pattern number to 0 if user provides incorrect pattern string.
 - Support one pattern.

Changes from v2:
 - Remove hardware_pattern boolean.
 - Change the pattern string format.

Changes from v1:
 - Use ATTRIBUTE_GROUPS() to define attributes.
 - Introduce hardware_pattern flag to determine if software pattern
 or hardware pattern.
 - Re-implement pattern_trig_store_pattern() function.
 - Remove pattern_get() interface.
 - Improve comments.
 - Other small optimizations.
---
 .../ABI/testing/sysfs-class-led-trigger-pattern|   82 
 drivers/leds/trigger/Kconfig   |7 +
 drivers/leds/trigger/Makefile  |1 +
 drivers/leds/trigger/ledtrig-pattern.c |  411 
 include/linux/leds.h   |   15 +
 5 files changed, 516 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-class-led-trigger-pattern
 create mode 100644 drivers/leds/trigger/ledtrig-pattern.c

diff --git a/Documentation/ABI/testing/sysfs-class-led-trigger-pattern b/Documentation/ABI/testing/sysfs-class-led-trigger-pattern
new file mode 100644
index 000..fb3d1e0
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-led-trigger-pattern
@@ -0,0 +1,82 @@
+What:  /sys/class/leds/<led>/pattern
+Date:  September 2018
+KernelVersion: 4.20
+Description:
+   Specify a software pattern for the LED, that supports altering
+   the brightness for the specified duration with one software
+   timer. It can do gradual dimming and step change of brightness.
+
+   The pattern is given by a series of tuples, of brightness and
+   duration (ms). The LED is expected to traverse the series and
+   each brightness value for the specified duration. Duration of
+   0 means brightness should immediately change to new value, and
+   writing malformed pattern deactivates any active one.
+
+   1. For gradual dimming, the dimming interval now is set as 50
+   milliseconds. So the tuple with duration less than dimming
+   interval (50ms) is treated as a step change of brightness,
+   i.e. the subsequent brightness will be applied without adding
+   intervening dimming intervals.
+
+   The gradual dimming format of the software pattern values should be:
+   "brightness_1 duration_1 brightness_2 duration_2 brightness_3
+   duration_3 ...". For example:
+
+   echo 0 1000 255 2000 > pattern
+
+   It will make the LED go gradually from zero-intensity to max (255)
+   intensity in 1000 milliseconds, then back to zero intensity in 2000
+   milliseconds:
+
+   LED brightness
+       ^
+   255-|       / \            / \            /
+       |      /    \         /    \         /
+       |     /       \      /       \      /
+       |    /          \   /          \   /
+     0-|   /             \/             \/
+       +---0----1----2----3----4----5----6------------> time (s)
+
+   2. To make the LED go instantly from one brightness value to another,
+   we should use zero-time lengths (the brightness must be same as
+   the previous tuple's). So the format should be:
+   "brightness_1

[PATCH v15 2/2] leds: sc27xx: Add pattern_set/clear interfaces for LED controller

2018-10-10 Thread Baolin Wang

This patch implements the 'pattern_set' and 'pattern_clear'
interfaces to support SC27XX LED breathing mode.
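
To make the duration handling easier to follow, here is a userspace C
sketch of the arithmetic used by sc27xx_led_clamp_align_delta_t() in this
patch, together with a helper that builds a string in the hw_pattern format
from the ABI file. The names and the string helper are assumptions for
illustration; only the 125 ms step and the 125..31875 ms range come from
the patch.

```c
#include <stdio.h>

#define STEP_MS  125u             /* stage duration step */
#define T_MIN_MS STEP_MS          /* minimum stage duration */
#define T_MAX_MS (STEP_MS * 255u) /* maximum stage duration: 31875 ms */

/*
 * Round a duration to the nearest multiple of the step and clamp it to
 * the supported range. Assumes t is far below UINT_MAX, so that adding
 * half a step cannot wrap around.
 */
static unsigned int clamp_align_delta_t(unsigned int t)
{
	unsigned int v = t + STEP_MS / 2;

	if (v < T_MIN_MS)
		v = T_MIN_MS;
	if (v > T_MAX_MS)
		v = T_MAX_MS;
	return T_MIN_MS + STEP_MS * ((v - T_MIN_MS) / STEP_MS);
}

/*
 * Build "0 rise brightness high 0 fall 0 low" with aligned durations.
 * Returns the snprintf() result.
 */
static int format_hw_pattern(char *buf, size_t len, unsigned int brightness,
			     unsigned int rise, unsigned int high,
			     unsigned int fall, unsigned int low)
{
	return snprintf(buf, len, "0 %u %u %u 0 %u 0 %u",
			clamp_align_delta_t(rise), brightness,
			clamp_align_delta_t(high),
			clamp_align_delta_t(fall),
			clamp_align_delta_t(low));
}
```

The resulting string is what a consumer would write to the hw_pattern file;
the driver aligns the durations again on its side, so passing unaligned
values is harmless.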

Signed-off-by: Baolin Wang 
Acked-by: Pavel Machek 
---
Changes from v14:
 - None.

Changes from v13:
 - None.

Changes from v12:
 - None.

Changes from v11:
 - None.

Changes from v10:
 - Add duration alignment function suggested by Jacek.
 - Add acked tag from Pavel.

Changes from v9:
 - Optimize the ABI documentation file.
 - Update the brightness value in hardware pattern mode.

Changes from v8:
 - Optimize the ABI documentation file.

Changes from v7:
 - Add its own ABI documentation file.

Changes from v6:
 - None.

Changes from v5:
 - None.

Changes from v4:
 - None.

Changes from v3:
 - None.

Changes from v2:
 - None.

Changes from v1:
 - Remove pattern_get interface.
---
 .../ABI/testing/sysfs-class-led-driver-sc27xx  |   22 
 drivers/leds/leds-sc27xx-bltc.c|  121 
 2 files changed, 143 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-class-led-driver-sc27xx

diff --git a/Documentation/ABI/testing/sysfs-class-led-driver-sc27xx b/Documentation/ABI/testing/sysfs-class-led-driver-sc27xx
new file mode 100644
index 000..45b1e60
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-led-driver-sc27xx
@@ -0,0 +1,22 @@
+What:  /sys/class/leds/<led>/hw_pattern
+Date:  September 2018
+KernelVersion: 4.20
+Description:
+   Specify a hardware pattern for the SC27XX LED. For the SC27XX
+   LED controller, it only supports 4 stages to make a single
+   hardware pattern, which is used to configure the rise time,
+   high time, fall time and low time for the breathing mode.
+
+   For the breathing mode, the SC27XX LED only expects one brightness
+   for the high stage. To be compatible with the hardware pattern
+   format, we should set brightness as 0 for rise stage, fall
+   stage and low stage.
+
+   Min stage duration: 125 ms
+   Max stage duration: 31875 ms
+
+   Since the stage duration step is 125 ms, the duration should be
+   a multiple of 125, like 125ms, 250ms, 375ms, 500ms ... 31875ms.
+
+   Thus the format of the hardware pattern values should be:
+   "0 rise_duration brightness high_duration 0 fall_duration 0 low_duration".
diff --git a/drivers/leds/leds-sc27xx-bltc.c b/drivers/leds/leds-sc27xx-bltc.c
index 9d9b7aa..fecf27f 100644
--- a/drivers/leds/leds-sc27xx-bltc.c
+++ b/drivers/leds/leds-sc27xx-bltc.c
@@ -32,8 +32,18 @@
 #define SC27XX_DUTY_MASK   GENMASK(15, 0)
 #define SC27XX_MOD_MASK    GENMASK(7, 0)
 
+#define SC27XX_CURVE_SHIFT 8
+#define SC27XX_CURVE_L_MASK    GENMASK(7, 0)
+#define SC27XX_CURVE_H_MASK    GENMASK(15, 8)
+
 #define SC27XX_LEDS_OFFSET 0x10
 #define SC27XX_LEDS_MAX    3
+#define SC27XX_LEDS_PATTERN_CNT    4
+/* Stage duration step, in milliseconds */
+#define SC27XX_LEDS_STEP   125
+/* Minimum and maximum duration, in milliseconds */
+#define SC27XX_DELTA_T_MIN SC27XX_LEDS_STEP
+#define SC27XX_DELTA_T_MAX (SC27XX_LEDS_STEP * 255)
 
 struct sc27xx_led {
char name[LED_MAX_NAME_SIZE];
@@ -122,6 +132,113 @@ static int sc27xx_led_set(struct led_classdev *ldev, enum led_brightness value)
return err;
 }
 
+static void sc27xx_led_clamp_align_delta_t(u32 *delta_t)
+{
+   u32 v, offset, t = *delta_t;
+
+   v = t + SC27XX_LEDS_STEP / 2;
+   v = clamp_t(u32, v, SC27XX_DELTA_T_MIN, SC27XX_DELTA_T_MAX);
+   offset = v - SC27XX_DELTA_T_MIN;
+   offset = SC27XX_LEDS_STEP * (offset / SC27XX_LEDS_STEP);
+
+   *delta_t = SC27XX_DELTA_T_MIN + offset;
+}
+
+static int sc27xx_led_pattern_clear(struct led_classdev *ldev)
+{
+   struct sc27xx_led *leds = to_sc27xx_led(ldev);
+   struct regmap *regmap = leds->priv->regmap;
+   u32 base = sc27xx_led_get_offset(leds);
+   u32 ctrl_base = leds->priv->base + SC27XX_LEDS_CTRL;
+   u8 ctrl_shift = SC27XX_CTRL_SHIFT * leds->line;
+   int err;
+
+   mutex_lock(&leds->priv->lock);
+
+   /* Reset the rise, high, fall and low time to zero. */
+   regmap_write(regmap, base + SC27XX_LEDS_CURVE0, 0);
+   regmap_write(regmap, base + SC27XX_LEDS_CURVE1, 0);
+
+   err = regmap_update_bits(regmap, ctrl_base,
+   (SC27XX_LED_RUN | SC27XX_LED_TYPE) << ctrl_shift, 0);
+
+   ldev->brightness = LED_OFF;
+
+   mutex_unlock(&leds->priv->lock);
+
+   return err;
+}
+
+static int sc27xx_led_pattern_set(struct led_classdev *ldev,
+ struct led_pattern *pattern,
+ u32 len, int repeat)
+{
+   struct sc27xx_led *leds = to_sc27xx_led(ldev);
+   u32 base = sc27xx_led_get_offset(leds);
+   u32 ctrl_base = leds->priv->base + SC27XX_LEDS_CTRL;
+   u8 ctrl_shift = SC27XX_CTRL_SHIFT *

[PATCH 3/5] arch/powerpc/mm: Nest MMU workaround for mprotect/autonuma RW upgrade.

2018-10-10 Thread Aneesh Kumar K.V

NestMMU requires us to mark the pte invalid and flush the tlb when we do a
RW upgrade of pte. We fixed a variant of this in the fault path in commit
Fixes: bd5050e38aec ("powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang")

Do the same for mprotect and autonuma upgrades.

Hugetlb is handled in the next patch.
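
The decision this patch makes at commit time can be modelled in a few
lines of userspace C. This is a sketch only: the MODEL_* bit values are
invented for the example and are not the kernel's _PAGE_READ/_PAGE_WRITE
definitions, and the copros argument stands in for the mm's coprocessor
count.

```c
#include <stdbool.h>

/* Hypothetical permission bits, for this model only. */
#define MODEL_PAGE_READ  0x1UL
#define MODEL_PAGE_WRITE 0x2UL

/* True when the new value gains read or write access over the old one. */
static bool model_is_pte_upgrade(unsigned long old_val, unsigned long new_val)
{
	unsigned long gained = ~old_val & new_val;

	return (gained & (MODEL_PAGE_READ | MODEL_PAGE_WRITE)) != 0;
}

/*
 * A flush before installing the new value is only needed when access is
 * being relaxed and at least one coprocessor is attached to the mm.
 */
static bool model_needs_flush(unsigned long old_val, unsigned long new_val,
			      int copros)
{
	return model_is_pte_upgrade(old_val, new_val) && copros > 0;
}
```

A downgrade (for example read-write to read-only) or an mm without
coprocessors skips the extra flush in this model.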

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 18 +++
 arch/powerpc/mm/pgtable-book3s64.c   | 34 
 2 files changed, 52 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index f108e2ce7f64..c55468eaedc7 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1324,6 +1324,24 @@ static inline const int pud_pfn(pud_t pud)
BUILD_BUG();
return 0;
 }
+#define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
+pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
+void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long,
+pte_t *, pte_t, pte_t);
+
+/*
+ * Returns true for a Read or Write upgrade of pte.
+ */
+static inline bool is_pte_upgrade(unsigned long old_val, unsigned long new_val)
+{
+   if ((!(old_val & _PAGE_READ)) && (new_val & _PAGE_READ))
+   return true;
+
+   if ((!(old_val & _PAGE_WRITE)) && (new_val & _PAGE_WRITE))
+   return true;
+
+   return false;
+}
 
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index 43e99e1d947b..43f71125249b 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -481,3 +481,37 @@ void arch_report_meminfo(struct seq_file *m)
    atomic_long_read(&direct_pages_count[MMU_PAGE_1G]) << 20);
 }
 #endif /* CONFIG_PROC_FS */
+
+pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
+pte_t *ptep)
+{
+   unsigned long pte_val;
+
+   /*
+* Clear the _PAGE_PRESENT so that no hardware parallel update is
+* possible. Also keep the pte_present true so that we don't take
+* wrong fault.
+*/
+   pte_val = pte_update(vma->vm_mm, addr, ptep, _PAGE_PRESENT, _PAGE_INVALID, 0);
+
+   return __pte(pte_val);
+
+}
+EXPORT_SYMBOL(ptep_modify_prot_start);
+
+void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
+pte_t *ptep, pte_t old_pte, pte_t pte)
+{
+   struct mm_struct *mm = vma->vm_mm;
+
+   /*
+* To avoid NMMU hang while relaxing access we need to flush the tlb before
+* we set the new value.
+*/
+   if (is_pte_upgrade(pte_val(old_pte), pte_val(pte)) &&
+   (atomic_read(&mm->context.copros) > 0))
+   flush_tlb_page(vma, addr);
+
+   set_pte_at(mm, addr, ptep, pte);
+}
+EXPORT_SYMBOL(ptep_modify_prot_commit);
-- 
2.17.1

[PATCH 2/5] mm: update ptep_modify_prot_commit to take old pte value as arg

2018-10-10 Thread Aneesh Kumar K.V

Architectures like ppc64 need to do a conditional tlb flush based on the old
and new value of pte. Enable that by passing the old pte value as an arg.
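
The caller-side pattern that results from this change can be sketched in
userspace C as follows. Everything here is a model with made-up names:
model_pte_t is a plain integer, the start step is assumed to clear the slot,
and the flush is replaced by a counter so that the effect is observable.

```c
typedef unsigned long model_pte_t;

#define MODEL_PRESENT 0x1UL /* hypothetical bits, for this model only */
#define MODEL_WRITE   0x2UL

static int model_flush_count; /* stands in for a real tlb flush */

/* Start the transaction: take the old value out of the slot. */
static model_pte_t model_modify_prot_start(model_pte_t *ptep)
{
	model_pte_t old_pte = *ptep;

	*ptep = 0;
	return old_pte;
}

/* Commit: with the old value at hand, the flush can be made conditional. */
static void model_modify_prot_commit(model_pte_t *ptep, model_pte_t old_pte,
				     model_pte_t pte)
{
	if (!(old_pte & MODEL_WRITE) && (pte & MODEL_WRITE))
		model_flush_count++;
	*ptep = pte;
}

/* Caller keeps old_pte separate from the modified value, as in the patch. */
static void model_make_writable(model_pte_t *ptep)
{
	model_pte_t old_pte = model_modify_prot_start(ptep);
	model_pte_t pte = old_pte | MODEL_WRITE;

	model_modify_prot_commit(ptep, old_pte, pte);
}
```

The point of the extra argument is visible in model_modify_prot_commit():
without old_pte it could not tell an upgrade from a no-op.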

Signed-off-by: Aneesh Kumar K.V 
---
 arch/s390/include/asm/pgtable.h | 3 ++-
 arch/s390/mm/pgtable.c  | 2 +-
 arch/x86/include/asm/paravirt.h | 2 +-
 fs/proc/task_mmu.c  | 8 +---
 include/asm-generic/pgtable.h   | 2 +-
 mm/memory.c | 8 
 mm/mprotect.c   | 6 +++---
 7 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 8e7f26dfedc6..626250436897 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1036,7 +1036,8 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 
 #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
 pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
-void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long, pte_t *, pte_t);
+void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long,
+pte_t *, pte_t, pte_t);
 
 #define __HAVE_ARCH_PTEP_CLEAR_FLUSH
 static inline pte_t ptep_clear_flush(struct vm_area_struct *vma,
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 29c0a21cd34a..b283b92722cc 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -322,7 +322,7 @@ pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
 EXPORT_SYMBOL(ptep_modify_prot_start);
 
 void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
-pte_t *ptep, pte_t pte)
+pte_t *ptep, pte_t old_pte, pte_t pte)
 {
pgste_t pgste;
struct mm_struct *mm = vma->vm_mm;
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index c5d203a51e50..17214e074286 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -434,7 +434,7 @@ static inline pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned
 }
 
 static inline void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
-  pte_t *ptep, pte_t pte)
+  pte_t *ptep, pte_t old_pte, pte_t pte)
 {
struct mm_struct *mm = vma->vm_mm;
 
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 229df16e7ad0..505aa21d04df 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -938,10 +938,12 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
pte_t ptent = *pte;
 
if (pte_present(ptent)) {
-   ptent = ptep_modify_prot_start(vma, addr, pte);
-   ptent = pte_wrprotect(ptent);
+   pte_t old_pte;
+
+   old_pte = ptep_modify_prot_start(vma, addr, pte);
+   ptent = pte_wrprotect(old_pte);
ptent = pte_clear_soft_dirty(ptent);
-   ptep_modify_prot_commit(vma, addr, pte, ptent);
+   ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
} else if (is_swap_pte(ptent)) {
ptent = pte_swp_clear_soft_dirty(ptent);
set_pte_at(vma->vm_mm, addr, pte, ptent);
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 021b94cd3260..4e4723f6be5e 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -619,7 +619,7 @@ static inline pte_t ptep_modify_prot_start(struct vm_area_struct *vma,
  */
 static inline void ptep_modify_prot_commit(struct vm_area_struct *vma,
    unsigned long addr,
-  pte_t *ptep, pte_t pte)
+  pte_t *ptep, pte_t old_pte, pte_t pte)
 {
__ptep_modify_prot_commit(vma->vm_mm, addr, ptep, pte);
 }
diff --git a/mm/memory.c b/mm/memory.c
index 261d30f51499..211df764f232 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3786,7 +3786,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
int last_cpupid;
int target_nid;
bool migrated = false;
-   pte_t pte;
+   pte_t pte, old_pte;
bool was_writable = pte_savedwrite(vmf->orig_pte);
int flags = 0;
 
@@ -3806,12 +3806,12 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
 * Make it present again, Depending on how arch implementes non
 * accessible ptes, some can allow access by kernel mode.
 */
-   pte = ptep_modify_prot_start(vma, vmf->address, vmf->pte);
-   pte = pte_modify(pte, vma->vm_page_prot);
+   old_pte = ptep_modify_prot_start(vma, vmf->address, vmf->pte);
+   pte = pte_modify(old_pte, vma->vm_page_prot);
pte = pte_mkyoung(pte);
if (was_writable)
pte = pte_mkwrite(pte);
-   ptep_modify_prot_commit(vma, vmf->address, vmf->pte, pte);

[PATCH 5/5] arch/powerpc/mm/hugetlb: NestMMU workaround for hugetlb mprotect RW upgrade

2018-10-10 Thread Aneesh Kumar K.V

NestMMU requires us to mark the pte invalid and flush the tlb when we do a
RW upgrade of pte. We fixed a variant of this in the fault path in commit
Fixes: bd5050e38aec ("powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang")
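
The header changes in this patch and the previous one rely on a common
override idiom: an architecture defines a macro with the same name as its
function, and the generic header only supplies a fallback when that macro
is absent. The userspace C sketch below shows the idiom with made-up demo_*
names operating on a plain int; it is not kernel code.

```c
/* "Arch" part: provide an own version and announce it with a same-named macro. */
#define demo_modify_prot_start demo_modify_prot_start
static inline int demo_modify_prot_start(int *slot)
{
	int old = *slot;

	*slot = -1; /* arch-specific marker instead of clearing to 0 */
	return old;
}

/* "Generic" part: fallbacks, used only when no override was announced. */
#ifndef demo_modify_prot_start
static inline int demo_modify_prot_start(int *slot)
{
	int old = *slot;

	*slot = 0;
	return old;
}
#endif

#ifndef demo_modify_prot_commit
static inline void demo_modify_prot_commit(int *slot, int old_val, int new_val)
{
	(void)old_val; /* the generic fallback has no use for the old value */
	*slot = new_val;
}
#endif
```

Here the start helper comes from the "arch" part and the commit helper from
the generic fallback, which also shows why the order of the includes
matters.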

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hugetlb.h |  8 +
 arch/powerpc/include/asm/hugetlb.h   |  2 +-
 arch/powerpc/mm/hugetlbpage.c| 35 
 3 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb.h b/arch/powerpc/include/asm/book3s/64/hugetlb.h
index 5b0177733994..a12bde29a5f0 100644
--- a/arch/powerpc/include/asm/book3s/64/hugetlb.h
+++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h
@@ -42,4 +42,12 @@ static inline bool gigantic_page_supported(void)
 /* hugepd entry valid bit */
 #define HUGEPD_VAL_BITS    (0x8000UL)
 
+#define huge_ptep_modify_prot_start huge_ptep_modify_prot_start
+extern pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep);
+
+#define huge_ptep_modify_prot_commit huge_ptep_modify_prot_commit
+extern void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep,
+pte_t old_pte, pte_t new_pte);
 #endif
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index 2d00cc530083..60c1d37e446a 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -4,7 +4,6 @@
 
 #ifdef CONFIG_HUGETLB_PAGE
 #include 
-#include 
 
 extern struct kmem_cache *hugepte_cache;
 
@@ -176,6 +175,7 @@ static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
 
+#include 
 #else /* ! CONFIG_HUGETLB_PAGE */
 static inline void flush_hugetlb_page(struct vm_area_struct *vma,
  unsigned long vmaddr)
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index a7226ed9cae6..8b098bedaff5 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -913,3 +913,38 @@ int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
 
return 1;
 }
+
+#ifdef CONFIG_PPC_BOOK3S_64
+pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep)
+{
+   unsigned long pte_val;
+   /*
+* Clear the _PAGE_PRESENT so that no hardware parallel update is
+* possible. Also keep the pte_present true so that we don't take
+* wrong fault.
+*/
+   pte_val = pte_update(vma->vm_mm, addr, ptep,
+_PAGE_PRESENT, _PAGE_INVALID, 1);
+
+   return __pte(pte_val);
+}
+EXPORT_SYMBOL(huge_ptep_modify_prot_start);
+
+void huge_ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
+ pte_t *ptep, pte_t old_pte, pte_t pte)
+{
+   struct mm_struct *mm = vma->vm_mm;
+
+   /*
+* To avoid NMMU hang while relaxing access we need to flush the tlb before
+* we set the new value.
+*/
+   if (is_pte_upgrade(pte_val(old_pte), pte_val(pte)) &&
+   (atomic_read(&mm->context.copros) > 0))
+   flush_hugetlb_page(vma, addr);
+
+   set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+}
+EXPORT_SYMBOL(huge_ptep_modify_prot_commit);
+#endif
-- 
2.17.1

[PATCH 4/5] mm/hugetlb: Add prot_modify_start/commit sequence for hugetlb update

2018-10-10 Thread Aneesh Kumar K.V

Signed-off-by: Aneesh Kumar K.V 
---
 include/linux/hugetlb.h | 18 ++
 mm/hugetlb.c|  8 +---
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 087fd5f48c91..e2a3b0c854eb 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -543,6 +543,24 @@ static inline void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr
set_huge_pte_at(mm, addr, ptep, pte);
 }
 #endif
+
+#ifndef huge_ptep_modify_prot_start
+static inline pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
+   unsigned long addr, pte_t *ptep)
+{
+   return huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
+}
+#endif
+
+#ifndef huge_ptep_modify_prot_commit
+static inline void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
+   unsigned long addr, pte_t *ptep,
+   pte_t old_pte, pte_t pte)
+{
+   set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+}
+#endif
+
 #else  /* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 #define alloc_huge_page(v, a, r) NULL
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5c390f5a5207..1f3a4df95b2e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4367,10 +4367,12 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
continue;
}
if (!huge_pte_none(pte)) {
-   pte = huge_ptep_get_and_clear(mm, address, ptep);
-   pte = pte_mkhuge(huge_pte_modify(pte, newprot));
+   pte_t old_pte;
+
+   old_pte = huge_ptep_modify_prot_start(vma, address, ptep);
+   pte = pte_mkhuge(huge_pte_modify(old_pte, newprot));
 pte = arch_make_huge_pte(pte, vma, NULL, 0);
-   set_huge_pte_at(mm, address, ptep, pte);
+   huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte);
pages++;
}
spin_unlock(ptl);
-- 
2.17.1

[PATCH 3/5] arch/powerpc/mm: Nest MMU workaround for mprotect/autonuma RW upgrade.

2018-10-10 Thread Aneesh Kumar K.V

NestMMU requires us to mark the pte invalid and flush the tlb when we do a
RW upgrade of pte. We fixed a variant of this in the fault path in commit
Fixes: bd5050e38aec ("powerpc/mm/radix: Change pte relax sequence to handle 
nest MMU hang")

Do the same for mprotect and autonuma upgrades.

Hugetlb is handled in the next patch.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 18 +++
 arch/powerpc/mm/pgtable-book3s64.c   | 34 
 2 files changed, 52 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index f108e2ce7f64..c55468eaedc7 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1324,6 +1324,24 @@ static inline const int pud_pfn(pud_t pud)
BUILD_BUG();
return 0;
 }
+#define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
+pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
+void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long,
+pte_t *, pte_t, pte_t);
+
+/*
+ * Returns true for a Read or Write upgrade of pte.
+ */
+static inline bool is_pte_upgrade(unsigned long old_val, unsigned long new_val)
+{
+   if ((!(old_val & _PAGE_READ)) && (new_val & _PAGE_READ))
+   return true;
+
+   if ((!(old_val & _PAGE_WRITE)) && (new_val & _PAGE_WRITE))
+   return true;
+
+   return false;
+}
 
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
diff --git a/arch/powerpc/mm/pgtable-book3s64.c 
b/arch/powerpc/mm/pgtable-book3s64.c
index 43e99e1d947b..43f71125249b 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -481,3 +481,37 @@ void arch_report_meminfo(struct seq_file *m)
   atomic_long_read(&direct_pages_count[MMU_PAGE_1G]) << 20);
 }
 #endif /* CONFIG_PROC_FS */
+
+pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
+pte_t *ptep)
+{
+   unsigned long pte_val;
+
+   /*
+    * Clear the _PAGE_PRESENT so that no hardware parallel update is
+    * possible. Also keep the pte_present true so that we don't take
+    * wrong fault.
+    */
+   pte_val = pte_update(vma->vm_mm, addr, ptep, _PAGE_PRESENT, _PAGE_INVALID, 0);
+
+   return __pte(pte_val);
+
+}
+EXPORT_SYMBOL(ptep_modify_prot_start);
+
+void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
+pte_t *ptep, pte_t old_pte, pte_t pte)
+{
+   struct mm_struct *mm = vma->vm_mm;
+
+   /*
+    * To avoid NMMU hang while relaxing access we need to flush the tlb before
+    * we set the new value.
+    */
+   if (is_pte_upgrade(pte_val(old_pte), pte_val(pte)) &&
+       (atomic_read(&mm->context.copros) > 0))
+       flush_tlb_page(vma, addr);
+
+   set_pte_at(mm, addr, ptep, pte);
+}
+EXPORT_SYMBOL(ptep_modify_prot_commit);
-- 
2.17.1
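
The start/commit sequence in this patch can be modelled in user space. The
sketch below is only an illustration, not the kernel code: the sk_ names,
the bit values, and the flush counter are all assumptions made for the
example. A plain unsigned long stands in for the pte, and the TLB flush is
simulated by incrementing a counter.

```c
#include <stdbool.h>

/* Hypothetical bit values, chosen for this sketch only. */
#define SK_PAGE_PRESENT 0x1UL
#define SK_PAGE_INVALID 0x2UL
#define SK_PAGE_READ    0x4UL
#define SK_PAGE_WRITE   0x8UL

static int sk_flush_count;	/* counts simulated TLB flushes */

/* Same predicate as is_pte_upgrade() in the patch, on plain integers. */
static bool sk_is_pte_upgrade(unsigned long old_val, unsigned long new_val)
{
	if (!(old_val & SK_PAGE_READ) && (new_val & SK_PAGE_READ))
		return true;
	if (!(old_val & SK_PAGE_WRITE) && (new_val & SK_PAGE_WRITE))
		return true;
	return false;
}

/* Start: replace PRESENT with INVALID in place and return the old value. */
static unsigned long sk_modify_prot_start(unsigned long *ptep)
{
	unsigned long old_val = *ptep;

	*ptep = (old_val & ~SK_PAGE_PRESENT) | SK_PAGE_INVALID;
	return old_val;
}

/*
 * Commit: if access is being relaxed and at least one coprocessor is
 * attached, flush before the new value is installed.
 */
static void sk_modify_prot_commit(unsigned long *ptep, unsigned long old_val,
				  unsigned long new_val, int copros)
{
	if (sk_is_pte_upgrade(old_val, new_val) && copros > 0)
		sk_flush_count++;
	*ptep = new_val;
}
```

A caller would take the old value from sk_modify_prot_start(), derive the new
value from it, and pass both to sk_modify_prot_commit(), which is why the
commit helper needs the old pte as an argument.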

[PATCH 2/5] mm: update ptep_modify_prot_commit to take old pte value as arg

2018-10-10 Thread Aneesh Kumar K.V

Architectures like ppc64 need to do a conditional TLB flush based on the old
and new values of the pte. Enable that by passing the old pte value as an
argument.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/s390/include/asm/pgtable.h | 3 ++-
 arch/s390/mm/pgtable.c  | 2 +-
 arch/x86/include/asm/paravirt.h | 2 +-
 fs/proc/task_mmu.c  | 8 +---
 include/asm-generic/pgtable.h   | 2 +-
 mm/memory.c | 8 
 mm/mprotect.c   | 6 +++---
 7 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 8e7f26dfedc6..626250436897 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1036,7 +1036,8 @@ static inline pte_t ptep_get_and_clear(struct mm_struct 
*mm,
 
 #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
 pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
-void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long, pte_t *, 
pte_t);
+void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long,
+pte_t *, pte_t, pte_t);
 
 #define __HAVE_ARCH_PTEP_CLEAR_FLUSH
 static inline pte_t ptep_clear_flush(struct vm_area_struct *vma,
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 29c0a21cd34a..b283b92722cc 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -322,7 +322,7 @@ pte_t ptep_modify_prot_start(struct vm_area_struct *vma, 
unsigned long addr,
 EXPORT_SYMBOL(ptep_modify_prot_start);
 
 void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
-pte_t *ptep, pte_t pte)
+pte_t *ptep, pte_t old_pte, pte_t pte)
 {
pgste_t pgste;
struct mm_struct *mm = vma->vm_mm;
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index c5d203a51e50..17214e074286 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -434,7 +434,7 @@ static inline pte_t ptep_modify_prot_start(struct 
vm_area_struct *vma, unsigned
 }
 
 static inline void ptep_modify_prot_commit(struct vm_area_struct *vma, 
unsigned long addr,
-  pte_t *ptep, pte_t pte)
+  pte_t *ptep, pte_t old_pte, pte_t 
pte)
 {
struct mm_struct *mm = vma->vm_mm;
 
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 229df16e7ad0..505aa21d04df 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -938,10 +938,12 @@ static inline void clear_soft_dirty(struct vm_area_struct 
*vma,
pte_t ptent = *pte;
 
if (pte_present(ptent)) {
-   ptent = ptep_modify_prot_start(vma, addr, pte);
-   ptent = pte_wrprotect(ptent);
+   pte_t old_pte;
+
+   old_pte = ptep_modify_prot_start(vma, addr, pte);
+   ptent = pte_wrprotect(old_pte);
ptent = pte_clear_soft_dirty(ptent);
-   ptep_modify_prot_commit(vma, addr, pte, ptent);
+   ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
} else if (is_swap_pte(ptent)) {
ptent = pte_swp_clear_soft_dirty(ptent);
set_pte_at(vma->vm_mm, addr, pte, ptent);
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 021b94cd3260..4e4723f6be5e 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -619,7 +619,7 @@ static inline pte_t ptep_modify_prot_start(struct 
vm_area_struct *vma,
  */
 static inline void ptep_modify_prot_commit(struct vm_area_struct *vma,
   unsigned long addr,
-  pte_t *ptep, pte_t pte)
+  pte_t *ptep, pte_t old_pte, pte_t 
pte)
 {
__ptep_modify_prot_commit(vma->vm_mm, addr, ptep, pte);
 }
diff --git a/mm/memory.c b/mm/memory.c
index 261d30f51499..211df764f232 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3786,7 +3786,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
int last_cpupid;
int target_nid;
bool migrated = false;
-   pte_t pte;
+   pte_t pte, old_pte;
bool was_writable = pte_savedwrite(vmf->orig_pte);
int flags = 0;
 
@@ -3806,12 +3806,12 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
 * Make it present again, Depending on how arch implementes non
 * accessible ptes, some can allow access by kernel mode.
 */
-   pte = ptep_modify_prot_start(vma, vmf->address, vmf->pte);
-   pte = pte_modify(pte, vma->vm_page_prot);
+   old_pte = ptep_modify_prot_start(vma, vmf->address, vmf->pte);
+   pte = pte_modify(old_pte, vma->vm_page_prot);
pte = pte_mkyoung(pte);
if (was_writable)
pte = pte_mkwrite(pte);
-   ptep_modify_prot_commit(vma, vmf->address, vmf->pte, pte);

[PATCH 5/5] arch/powerpc/mm/hugetlb: NestMMU workaround for hugetlb mprotect RW upgrade

2018-10-10 Thread Aneesh Kumar K.V

NestMMU requires us to mark the pte invalid and flush the TLB when we do a
RW upgrade of a pte. We fixed a variant of this in the fault path in commit
bd5050e38aec ("powerpc/mm/radix: Change pte relax sequence to handle
nest MMU hang").

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hugetlb.h |  8 +
 arch/powerpc/include/asm/hugetlb.h   |  2 +-
 arch/powerpc/mm/hugetlbpage.c| 35 
 3 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb.h 
b/arch/powerpc/include/asm/book3s/64/hugetlb.h
index 5b0177733994..a12bde29a5f0 100644
--- a/arch/powerpc/include/asm/book3s/64/hugetlb.h
+++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h
@@ -42,4 +42,12 @@ static inline bool gigantic_page_supported(void)
 /* hugepd entry valid bit */
 #define HUGEPD_VAL_BITS(0x8000UL)
 
+#define huge_ptep_modify_prot_start huge_ptep_modify_prot_start
+extern pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep);
+
+#define huge_ptep_modify_prot_commit huge_ptep_modify_prot_commit
+extern void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep,
+pte_t old_pte, pte_t new_pte);
 #endif
diff --git a/arch/powerpc/include/asm/hugetlb.h 
b/arch/powerpc/include/asm/hugetlb.h
index 2d00cc530083..60c1d37e446a 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -4,7 +4,6 @@
 
 #ifdef CONFIG_HUGETLB_PAGE
 #include 
-#include 
 
 extern struct kmem_cache *hugepte_cache;
 
@@ -176,6 +175,7 @@ static inline void arch_clear_hugepage_flags(struct page 
*page)
 {
 }
 
+#include 
 #else /* ! CONFIG_HUGETLB_PAGE */
 static inline void flush_hugetlb_page(struct vm_area_struct *vma,
  unsigned long vmaddr)
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index a7226ed9cae6..8b098bedaff5 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -913,3 +913,38 @@ int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned 
long addr,
 
return 1;
 }
+
+#ifdef CONFIG_PPC_BOOK3S_64
+pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep)
+{
+   unsigned long pte_val;
+   /*
+* Clear the _PAGE_PRESENT so that no hardware parallel update is
+* possible. Also keep the pte_present true so that we don't take
+* wrong fault.
+*/
+   pte_val = pte_update(vma->vm_mm, addr, ptep,
+_PAGE_PRESENT, _PAGE_INVALID, 1);
+
+   return __pte(pte_val);
+}
+EXPORT_SYMBOL(huge_ptep_modify_prot_start);
+
+void huge_ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long 
addr,
+ pte_t *ptep, pte_t old_pte, pte_t pte)
+{
+   struct mm_struct *mm = vma->vm_mm;
+
+   /*
+    * To avoid NMMU hang while relaxing access we need to flush the tlb before
+    * we set the new value.
+    */
+   if (is_pte_upgrade(pte_val(old_pte), pte_val(pte)) &&
+       (atomic_read(&mm->context.copros) > 0))
+       flush_hugetlb_page(vma, addr);
+
+   set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+}
+EXPORT_SYMBOL(huge_ptep_modify_prot_commit);
+#endif
-- 
2.17.1

[PATCH 4/5] mm/hugetlb: Add prot_modify_start/commit sequence for hugetlb update

2018-10-10 Thread Aneesh Kumar K.V

Signed-off-by: Aneesh Kumar K.V 
---
 include/linux/hugetlb.h | 18 ++
 mm/hugetlb.c|  8 +---
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 087fd5f48c91..e2a3b0c854eb 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -543,6 +543,24 @@ static inline void set_huge_swap_pte_at(struct mm_struct 
*mm, unsigned long addr
set_huge_pte_at(mm, addr, ptep, pte);
 }
 #endif
+
+#ifndef huge_ptep_modify_prot_start
+static inline pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
+   unsigned long addr, pte_t *ptep)
+{
+   return huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
+}
+#endif
+
+#ifndef huge_ptep_modify_prot_commit
+static inline void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
+   unsigned long addr, pte_t *ptep,
+   pte_t old_pte, pte_t pte)
+{
+   set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+}
+#endif
+
 #else  /* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 #define alloc_huge_page(v, a, r) NULL
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5c390f5a5207..1f3a4df95b2e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4367,10 +4367,12 @@ unsigned long hugetlb_change_protection(struct 
vm_area_struct *vma,
continue;
}
if (!huge_pte_none(pte)) {
-   pte = huge_ptep_get_and_clear(mm, address, ptep);
-   pte = pte_mkhuge(huge_pte_modify(pte, newprot));
+   pte_t old_pte;
+
+   old_pte = huge_ptep_modify_prot_start(vma, address, 
ptep);
+   pte = pte_mkhuge(huge_pte_modify(old_pte, newprot));
pte = arch_make_huge_pte(pte, vma, NULL, 0);
-   set_huge_pte_at(mm, address, ptep, pte);
+   huge_ptep_modify_prot_commit(vma, address, ptep, 
old_pte, pte);
pages++;
}
spin_unlock(ptl);
-- 
2.17.1
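
The #ifndef guards in include/linux/hugetlb.h let an architecture replace the
generic helpers by defining a macro with the same name as the function, as
patch 5/5 does for powerpc. A minimal user-space sketch of that pattern
follows. All sk_ names are hypothetical, an int stands in for the pte, and the
return value of the commit helper is an assumption added so the two variants
can be told apart.

```c
/* "Arch header": define the helper plus a same-named macro to claim it. */
#define sk_prot_commit sk_prot_commit
static inline int sk_prot_commit(int *slot, int old_val, int new_val)
{
	*slot = new_val;
	/* Hypothetical arch behaviour: report 1 when bits were added. */
	return (new_val & ~old_val) != 0;
}

/* "Generic header": fallbacks are compiled only when nothing claimed them. */
#ifndef sk_prot_start
static inline int sk_prot_start(int *slot)
{
	int old_val = *slot;

	*slot = 0;	/* get-and-clear, in the spirit of huge_ptep_get_and_clear() */
	return old_val;
}
#endif

#ifndef sk_prot_commit
static inline int sk_prot_commit(int *slot, int old_val, int new_val)
{
	(void)old_val;	/* the generic version ignores the old value */
	*slot = new_val;
	return 0;
}
#endif
```

Here sk_prot_start() resolves to the generic fallback, while sk_prot_commit()
resolves to the "arch" version because its macro was defined first.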

[PATCH 0/5] NestMMU pte upgrade workaround for mprotect and autonuma

2018-10-10 Thread Aneesh Kumar K.V

We can upgrade pte access (R -> RW transition) via mprotect or autonuma. We need
to make sure we follow the recommended pte update sequence for such updates, as
outlined in commit bd5050e38aec ("powerpc/mm/radix: Change pte relax sequence
to handle nest MMU hang"). This patch series does that.

Aneesh Kumar K.V (5):
  mm: Update ptep_modify_prot_start/commit to take vm_area_struct as arg
  mm: update ptep_modify_prot_commit to take old pte value as arg
  arch/powerpc/mm: Nest MMU workaround for mprotect/autonuma RW upgrade.
  mm/hugetlb: Add prot_modify_start/commit sequence for hugetlb update
  arch/powerpc/mm/hugetlb: NestMMU workaround for hugetlb mprotect RW
upgrade

 arch/powerpc/include/asm/book3s/64/hugetlb.h |  8 +
 arch/powerpc/include/asm/book3s/64/pgtable.h | 18 ++
 arch/powerpc/include/asm/hugetlb.h   |  2 +-
 arch/powerpc/mm/hugetlbpage.c| 35 
 arch/powerpc/mm/pgtable-book3s64.c   | 34 +++
 arch/s390/include/asm/pgtable.h  |  5 +--
 arch/s390/mm/pgtable.c   |  8 +++--
 arch/x86/include/asm/paravirt.h  |  9 +++--
 fs/proc/task_mmu.c   |  8 +++--
 include/asm-generic/pgtable.h| 10 +++---
 include/linux/hugetlb.h  | 18 ++
 mm/hugetlb.c |  8 +++--
 mm/memory.c  |  8 ++---
 mm/mprotect.c|  6 ++--
 14 files changed, 150 insertions(+), 27 deletions(-)

-- 
2.17.1

[PATCH 1/5] mm: Update ptep_modify_prot_start/commit to take vm_area_struct as arg

2018-10-10 Thread Aneesh Kumar K.V

Some architectures may want to call flush_tlb_range from these helpers.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/s390/include/asm/pgtable.h | 4 ++--
 arch/s390/mm/pgtable.c  | 6 --
 arch/x86/include/asm/paravirt.h | 7 +--
 fs/proc/task_mmu.c  | 4 ++--
 include/asm-generic/pgtable.h   | 8 
 mm/memory.c | 4 ++--
 mm/mprotect.c   | 4 ++--
 7 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 0e7cb0dc9c33..8e7f26dfedc6 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1035,8 +1035,8 @@ static inline pte_t ptep_get_and_clear(struct mm_struct 
*mm,
 }
 
 #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
-pte_t ptep_modify_prot_start(struct mm_struct *, unsigned long, pte_t *);
-void ptep_modify_prot_commit(struct mm_struct *, unsigned long, pte_t *, 
pte_t);
+pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
+void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long, pte_t *, 
pte_t);
 
 #define __HAVE_ARCH_PTEP_CLEAR_FLUSH
 static inline pte_t ptep_clear_flush(struct vm_area_struct *vma,
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index f2cc7da473e4..29c0a21cd34a 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -301,12 +301,13 @@ pte_t ptep_xchg_lazy(struct mm_struct *mm, unsigned long 
addr,
 }
 EXPORT_SYMBOL(ptep_xchg_lazy);
 
-pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr,
+pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
 pte_t *ptep)
 {
pgste_t pgste;
pte_t old;
int nodat;
+   struct mm_struct *mm = vma->vm_mm;
 
preempt_disable();
pgste = ptep_xchg_start(mm, addr, ptep);
@@ -320,10 +321,11 @@ pte_t ptep_modify_prot_start(struct mm_struct *mm, 
unsigned long addr,
 }
 EXPORT_SYMBOL(ptep_modify_prot_start);
 
-void ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr,
+void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
 pte_t *ptep, pte_t pte)
 {
pgste_t pgste;
+   struct mm_struct *mm = vma->vm_mm;
 
if (!MACHINE_HAS_NX)
pte_val(pte) &= ~_PAGE_NOEXEC;
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index e375d4266b53..c5d203a51e50 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -421,10 +421,11 @@ static inline pgdval_t pgd_val(pgd_t pgd)
 }
 
 #define  __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
-static inline pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long 
addr,
+static inline pte_t ptep_modify_prot_start(struct vm_area_struct *vma, 
unsigned long addr,
   pte_t *ptep)
 {
pteval_t ret;
+   struct mm_struct *mm = vma->vm_mm;
 
ret = PVOP_CALL3(pteval_t, pv_mmu_ops.ptep_modify_prot_start,
 mm, addr, ptep);
@@ -432,9 +433,11 @@ static inline pte_t ptep_modify_prot_start(struct 
mm_struct *mm, unsigned long a
return (pte_t) { .pte = ret };
 }
 
-static inline void ptep_modify_prot_commit(struct mm_struct *mm, unsigned long 
addr,
+static inline void ptep_modify_prot_commit(struct vm_area_struct *vma, 
unsigned long addr,
   pte_t *ptep, pte_t pte)
 {
+   struct mm_struct *mm = vma->vm_mm;
+
if (sizeof(pteval_t) > sizeof(long))
/* 5 arg words */
pv_mmu_ops.ptep_modify_prot_commit(mm, addr, ptep, pte);
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 5ea1d64cb0b4..229df16e7ad0 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -938,10 +938,10 @@ static inline void clear_soft_dirty(struct vm_area_struct 
*vma,
pte_t ptent = *pte;
 
if (pte_present(ptent)) {
-   ptent = ptep_modify_prot_start(vma->vm_mm, addr, pte);
+   ptent = ptep_modify_prot_start(vma, addr, pte);
ptent = pte_wrprotect(ptent);
ptent = pte_clear_soft_dirty(ptent);
-   ptep_modify_prot_commit(vma->vm_mm, addr, pte, ptent);
+   ptep_modify_prot_commit(vma, addr, pte, ptent);
} else if (is_swap_pte(ptent)) {
ptent = pte_swp_clear_soft_dirty(ptent);
set_pte_at(vma->vm_mm, addr, pte, ptent);
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 88ebc6102c7c..021b94cd3260 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -606,22 +606,22 @@ static inline void __ptep_modify_prot_commit(struct 
mm_struct *mm,
  * queue the update to be done at some later time.  The update must be
  * actually committed before the pte lock is released, however.
  */
-static inline pte_t ptep_modify_prot_start(struct

[PATCH] mtd: spi-nor: Add support for SPI boot flash access for AMD Family 16h

2018-10-10 Thread Brett Grandbois

Add support to expose the SPI boot flash on AMD Family 16h CPUs as a
standard mtd device to give userspace BIOS updaters greater feature
support.  The BIOS and Kernel Developer's Guide refers to this as the
'SPI ROM' controller and so the driver follows that naming convention
for consistency.

Signed-off-by: Brett Grandbois 
---
 drivers/mtd/spi-nor/Kconfig  |  15 +
 drivers/mtd/spi-nor/Makefile |   1 +
 drivers/mtd/spi-nor/amd-spirom.c | 805 +++
 3 files changed, 821 insertions(+)
 create mode 100644 drivers/mtd/spi-nor/amd-spirom.c

diff --git a/drivers/mtd/spi-nor/Kconfig b/drivers/mtd/spi-nor/Kconfig
index 6cc9c929ff57..f99b40ec0fef 100644
--- a/drivers/mtd/spi-nor/Kconfig
+++ b/drivers/mtd/spi-nor/Kconfig
@@ -129,4 +129,19 @@ config SPI_STM32_QUADSPI
  This enables support for the STM32 Quad SPI controller.
  We only connect the NOR to this controller.
 
+config SPI_AMD_SPIROM
+   tristate "AMD Hudson FCH SPI flash driver (DANGEROUS)"
+   depends on X86 && PCI
+   help
+ This enables support for the AMD Family 16h SPI flash controller to
+ access the boot flash from Linux as an mtd device.
+
+ Using this driver it is possible to upgrade BIOS directly from Linux.
+
+ Say N here unless you know what you are doing. Overwriting the
+ SPI flash may render the system unbootable.
+
+ To compile this driver as a module, choose M here: the module
+ will be called amd-spirom.
+
 endif # MTD_SPI_NOR
diff --git a/drivers/mtd/spi-nor/Makefile b/drivers/mtd/spi-nor/Makefile
index f4c61d282abd..e49ea4e619c1 100644
--- a/drivers/mtd/spi-nor/Makefile
+++ b/drivers/mtd/spi-nor/Makefile
@@ -11,3 +11,4 @@ obj-$(CONFIG_SPI_INTEL_SPI)   += intel-spi.o
 obj-$(CONFIG_SPI_INTEL_SPI_PCI)+= intel-spi-pci.o
 obj-$(CONFIG_SPI_INTEL_SPI_PLATFORM)   += intel-spi-platform.o
 obj-$(CONFIG_SPI_STM32_QUADSPI)+= stm32-quadspi.o
+obj-$(CONFIG_SPI_AMD_SPIROM)   += amd-spirom.o
diff --git a/drivers/mtd/spi-nor/amd-spirom.c b/drivers/mtd/spi-nor/amd-spirom.c
new file mode 100644
index ..514e67edc9cd
--- /dev/null
+++ b/drivers/mtd/spi-nor/amd-spirom.c
@@ -0,0 +1,805 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (C) 2018, Opengear
+ *
+ * AMD Family 16h Hudson FCH SPI flash driver.
+ *
+ * When the FCH is strapped to SPI boot ROM mode 'SPIROM'
+ * the FCH will do a flash auto-probe and self-configure
+ * for read operations to the ROM address range(s).
+ * For any command outside of read/write (chip erase, etc)
+ * you need to go through the alternate program method.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* FCH Device LPC Bridge Configuration Registers */
+#define PCI_DEVICE_ID_AMD_FCH_LPC_BRIDGE   0x780E
+
+#define FCH_PCI_CONTROL0x40
+#define FCH_INTEGRATED_EC_PRESENT  0x80
+#define FCH_EC_SEM 0x40
+#define FCH_BIOS_SEM   0x20
+#define FCH_LEGACY_DMA 0x04
+
+#define FCH_ROM_ADDR_RANGE_2   0x6C
+
+#define FCH_SPI_BASE_ADDR  0xA0
+#define FCH_SPI_BASE_ADDR_MASK 0xFFC0
+#define FCH_SPI_ROUTE_TPM_SPI  0x08
+#define FCH_SPI_ROM_ENABLE 0x02
+
+/* up through FIFO [C6:80] */
+#define SPI_IO_REGION_LEN  256
+
+/* SPI Registers, the labels come from the BKDG */
+#define SPI_CNTRL0 0x00
+#define SPI_CNTRL0_FIFO_PTR_CLEAR  0x0010
+#define SPI_CNTRL0_FIFO_PTR_CLEAR_MASK 0xFFEF
+#define SPI_CNTRL0_SPI_ARB_ENABLE  0x0008
+#define SPI_CNTRL0_SPI_ARB_ENABLE_MASK 0xFFF7
+
+#define ALT_SPI_CS 0x1D
+#define ALT_SPI_CS_MASK0xFC
+#define ALT_SPI_CS_WR_BUF_EN   0x04
+
+#define SPI100_ENABLE  0x20
+#define SPI100_SPEED_CONFIG0x22
+
+
+/* SPI control shadow registers */
+#define CMD_CODE   0x45
+
+#define CMD_TRIGGER0x47
+#define CMD_TRIGGER_EXECUTE0x80
+
+#define TX_BYTE_COUNT  0x48
+
+#define RX_BYTE_COUNT  0x4B
+
+#define SPI_STATUS 0x4C
+#define SPI_STATUS_BUSY_MASK   0x8000
+
+#define SPI_FIFO   0x80
+
+static const struct pci_device_id amd_fch_lpc_pci_device_ids[] = {
+   { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_FCH_LPC_BRIDGE) },
+   {}
+};
+MODULE_DEVICE_TABLE(pci, amd_fch_lpc_pci_device_ids);
+
+static short norm_speed = -1;
+module_param(norm_speed, short, 0444);
+MODULE_PARM_DESC(norm_speed, "Specify SPI speed for normal read.  This sets 
NormSpeedNew[3:0] from BKDG. -1 means use

Re: [PATCH security-next v5 00/30] LSM: Explict ordering

2018-10-10 Thread James Morris

On Wed, 10 Oct 2018, Kees Cook wrote:

> v5:
> - redesigned to use CONFIG_LSM= and lsm= for both ordering and enabling
> - dropped various Reviewed-bys due to rather large refactoring

Patches 1-10 applied to
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next-general
and next-testing.

-- 
James Morris

Re: [f2fs-dev] [PATCH] f2fs: fix data corruption issue with hardware encryption

2018-10-10 Thread Sahitya Tummala

On Wed, Oct 10, 2018 at 08:05:44PM -0700, Jaegeuk Kim wrote:
> On 10/10, Jaegeuk Kim wrote:
> > On 10/11, Sahitya Tummala wrote:
> > > On Wed, Oct 10, 2018 at 02:34:02PM -0700, Jaegeuk Kim wrote:
> > > > On 10/10, Sahitya Tummala wrote:
> > > > > Direct IO can be used in case of hardware encryption. The following
> > > > > scenario results into data corruption issue in this path -
> > > > > 
> > > > > > Thread A -                          Thread B -
> > > > > > -> write file#1 in direct IO
> > > > > >                                     -> GC gets kicked in
> > > > > >                                     -> GC submitted bio on meta mapping
> > > > > >                                        for file#1, but pending completion
> > > > > > -> write file#1 again with new data
> > > > > >    in direct IO
> > > > > >                                     -> GC bio gets completed now
> > > > > >                                     -> GC writes old data to the new
> > > > > >                                        location and thus file#1 is
> > > > > >                                        corrupted.
> > > > > 
> > > > > Fix this by submitting and waiting for pending io on meta mapping
> > > > > for direct IO case in f2fs_map_blocks().
> > > > > 
> > > > > Signed-off-by: Sahitya Tummala 
> > > > > ---
> > > > >  fs/f2fs/data.c | 12 
> > > > >  1 file changed, 12 insertions(+)
> > > > > 
> > > > > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > > > > index 9ef6f1f..7b2fef0 100644
> > > > > --- a/fs/f2fs/data.c
> > > > > +++ b/fs/f2fs/data.c
> > > > > @@ -1028,6 +1028,12 @@ int f2fs_map_blocks(struct inode *inode, 
> > > > > struct f2fs_map_blocks *map,
> > > > >   map->m_pblk = ei.blk + pgofs - ei.fofs;
> > > > >   map->m_len = min((pgoff_t)maxblocks, ei.fofs + ei.len - 
> > > > > pgofs);
> > > > >   map->m_flags = F2FS_MAP_MAPPED;
> > > > > + /* for HW encryption, but to avoid potential issue in 
> > > > > future */
> > > > > + if (flag == F2FS_GET_BLOCK_DIO) {
> > > > > + blkaddr = map->m_pblk;
> > > > > + for (; blkaddr < map->m_pblk + map->m_len; 
> > > > > blkaddr++)
> > > > > + f2fs_wait_on_block_writeback(sbi, 
> > > > > blkaddr);
> > > > 
> > > > Do we need this? IIRC, DIO would give create=1.
> > > 
> > > Yes, we need it. When we are overwriting an existing file, DIO calls
> > > f2fs_map_blocks() with create=0. From the DIO code, I see that this happens
> > > because blockdev_direct_IO() passes this dio flag DIO_SKIP_HOLES. And then
> > > in get_more_blocks(), below code updates create=0, when we are overwriting
> > > an existing file.
> > > 
> > >     create = dio->op == REQ_OP_WRITE;
> > >     if (dio->flags & DIO_SKIP_HOLES) {
> > >             if (fs_startblk <= ((i_size_read(dio->inode) - 1) >>
> > >                                             i_blkbits))
> > >                     create = 0;
> > >     }
> > >
> > >     ret = (*sdio->get_block)(dio->inode, fs_startblk,
> > >                                     map_bh, create);
> > > 
> > 
> > Got it.
> > How about this?
> > 
> 
> Sorry, this is v2.

Looks good to me. Thanks for updating it :)
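
For readers following the create=0 discussion, here is a minimal standalone
sketch of the decision in the get_more_blocks() excerpt quoted above. It is a
userspace model, not kernel code: dio_wants_create() and its parameters are
hypothetical names, and the explicit guard for an empty file is an assumption
of this sketch rather than something taken from the excerpt.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the "create" decision for a direct IO request.
 * is_write    - the request is a write (dio->op == REQ_OP_WRITE in the excerpt)
 * skip_holes  - the DIO_SKIP_HOLES flag is set
 * fs_startblk - first filesystem block of the request
 * i_size      - current file size in bytes
 * blkbits     - log2 of the filesystem block size
 */
static bool dio_wants_create(bool is_write, bool skip_holes,
			     uint64_t fs_startblk, int64_t i_size,
			     unsigned int blkbits)
{
	bool create = is_write;

	/* Assumption of this sketch: an empty file has no block to overwrite. */
	if (skip_holes && i_size > 0) {
		uint64_t last_blk = (uint64_t)(i_size - 1) >> blkbits;

		/* A block at or before EOF is an overwrite: do not allocate. */
		if (fs_startblk <= last_blk)
			create = false;
	}
	return create;
}
```

With 4 KiB blocks (blkbits = 12) and an 8 KiB file, a write starting at
block 0 or 1 yields create = false, which is the overwrite case where
f2fs_map_blocks() is entered with create=0.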

> 
> From b78dd7b2e0317be18716b9496269e9792829f63e Mon Sep 17 00:00:00 2001
> From: Sahitya Tummala 
> Date: Wed, 10 Oct 2018 10:56:22 +0530
> Subject: [PATCH] f2fs: fix data corruption issue with hardware encryption
> 
> Direct IO can be used in case of hardware encryption. The following
> scenario results into data corruption issue in this path -
> 
> Thread A -                          Thread B -
> -> write file#1 in direct IO
>                                     -> GC gets kicked in
>                                     -> GC submitted bio on meta mapping
>                                        for file#1, but pending completion
> -> write file#1 again with new data
>    in direct IO
>                                     -> GC bio gets completed now
>                                     -> GC writes old data to the new
>                                        location and thus file#1 is
>                                        corrupted.
> 
> Fix this by submitting and waiting for pending io on meta mapping
> for direct IO case in f2fs_map_blocks().
> 
> Signed-off-by: Sahitya Tummala 
> Signed-off-by: Jaegeuk Kim 
> ---
>  fs/f2fs/data.c| 11 +++
>  fs/f2fs/f2fs.h|  2 ++
>  fs/f2fs/segment.c |  9 +
>  3 files changed, 22 insertions(+)
> 
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index be19257d9e36..8952f2d610a6 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -1030,6 +1030,11 @@ int f2fs_map_blocks(struct inode *inode, struct 
> f2fs_map_blocks *map,
>   map->m_flags = F2FS_MAP_MAPPED;
>   if (map->m_next_extent)
>   *map->m_next_extent

Re: [PATCH] scsi: arcmsr: clean up clang warning on extraneous parentheses

2018-10-10 Thread Martin K. Petersen



Colin,

> There are extraneous parantheses that are causing clang to produce a
> warning so remove these.
>
> Clean up 3 clang warnings:
> equality comparison with extraneous parentheses [-Wparentheses-equality]

Applied to 4.20/scsi-queue, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering
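
As an illustration of the warning class named in the quoted commit message,
the sketch below shows the pattern in isolation. The helper is_ready() is
hypothetical and is not taken from the arcmsr driver; it only shows the form
that clang accepts without the -Wparentheses-equality warning.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of arcmsr. */
static bool is_ready(int state, int ready_state)
{
	/*
	 * Writing the condition as "if ((state == ready_state))" is what
	 * clang reports as "equality comparison with extraneous
	 * parentheses". A single pair of parentheses avoids the warning
	 * and leaves the behaviour unchanged.
	 */
	if (state == ready_state)
		return true;

	return false;
}
```

The doubled parentheses are the idiom for silencing the warning about an
intentional assignment used as a condition, which is why clang questions
them around a plain == comparison.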
