date:20201218

Re: [linux-next:master 13538/13785] /tmp/metronomefb-846872.s:300: Error: unrecognized opcode `zext.b a2,a2'

2020-12-18 Thread Pavel Machek

Crazy robot, stop spamming. This report is obviously bogus, yet, you
sent me 5 copies.

Whoever is responsible for this, please sign emails with your real
name!

Pavel


On Sat 2020-12-19 14:19:16, kernel test robot wrote:
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git 
> master
> head:   0d52778b8710eb11cb616761a02aee0a7fd60425
> commit: f08fdc654a5940aa23259e1ed53ab0f401ca7068 [13538/13785] leds: ss4200: 
> simplify the return expression of register_nasgpio_led()
> config: riscv-randconfig-r014-20201217 (attached as .config)
> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 
> cee1e7d14f4628d6174b33640d502bff3b54ae45)
> reproduce (this is a W=1 build):
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # install riscv cross compiling tool for clang build
> # apt-get install binutils-riscv64-linux-gnu
> # 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=f08fdc654a5940aa23259e1ed53ab0f401ca7068
> git remote add linux-next 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> git fetch --no-tags linux-next master
> git checkout f08fdc654a5940aa23259e1ed53ab0f401ca7068
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=riscv 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
> 
> Note: the linux-next/master HEAD 0d52778b8710eb11cb616761a02aee0a7fd60425 
> builds fine.
>   It may have been fixed somewhere.
> 
> All errors (new ones prefixed by >>):
> 
>In file included from include/asm-generic/hardirq.h:17:
>In file included from include/linux/irq.h:20:
>In file included from include/linux/io.h:13:
>In file included from arch/riscv/include/asm/io.h:149:
>include/asm-generic/io.h:564:9: warning: performing pointer arithmetic on 
> a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
>return inw(addr);
>   ^
>arch/riscv/include/asm/io.h:56:76: note: expanded from macro 'inw'
>#define inw(c)  ({ u16 __v; __io_pbr(); __v = 
> readw_cpu((void*)(PCI_IOBASE + (c))); __io_par(__v); __v; })
>
> ~~ ^
>arch/riscv/include/asm/mmio.h:88:76: note: expanded from macro 'readw_cpu'
>#define readw_cpu(c)({ u16 __r = le16_to_cpu((__force 
> __le16)__raw_readw(c)); __r; })
>   
>   ^
>include/uapi/linux/byteorder/little_endian.h:36:51: note: expanded from 
> macro '__le16_to_cpu'
>#define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
>  ^
>In file included from drivers/video/fbdev/metronomefb.c:28:
>In file included from include/linux/interrupt.h:11:
>In file included from include/linux/hardirq.h:10:
>In file included from ./arch/riscv/include/generated/asm/hardirq.h:1:
>In file included from include/asm-generic/hardirq.h:17:
>In file included from include/linux/irq.h:20:
>In file included from include/linux/io.h:13:
>In file included from arch/riscv/include/asm/io.h:149:
>include/asm-generic/io.h:572:9: warning: performing pointer arithmetic on 
> a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
>return inl(addr);
>   ^
>arch/riscv/include/asm/io.h:57:76: note: expanded from macro 'inl'
>#define inl(c)  ({ u32 __v; __io_pbr(); __v = 
> readl_cpu((void*)(PCI_IOBASE + (c))); __io_par(__v); __v; })
>
> ~~ ^
>arch/riscv/include/asm/mmio.h:89:76: note: expanded from macro 'readl_cpu'
>#define readl_cpu(c)({ u32 __r = le32_to_cpu((__force 
> __le32)__raw_readl(c)); __r; })
>   
>   ^
>include/uapi/linux/byteorder/little_endian.h:34:51: note: expanded from 
> macro '__le32_to_cpu'
>#define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
>  ^
>In file included from drivers/video/fbdev/metronomefb.c:28:
>In file included from include/linux/interrupt.h:11:
>In file included from include/linux/hardirq.h:10:
>In file included from ./arch/riscv/include/generated/asm/hardirq.h:1:
>In file included from include/asm-generic/hardirq.h:17:
>In file included from include/linux/irq.h:20:
>In file included from include/linux/io.h:13:
>In file included from arch/riscv/include/asm/io.h:149:
>include/

Re: KASAN: slab-out-of-bounds Read in lock_sock_nested

2020-12-18 Thread syzbot

syzbot has found a reproducer for the following issue on:

HEAD commit:a409ed15 Merge tag 'gpio-v5.11-1' of git://git.kernel.org/..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=174778a750
kernel config:  https://syzkaller.appspot.com/x/.config?x=20efebc728efc8ff
dashboard link: https://syzkaller.appspot.com/bug?extid=9a0875bc1b2ca466b484
compiler:   gcc (GCC) 10.1.0-syz 20200507
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=10a4445b50

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+9a0875bc1b2ca466b...@syzkaller.appspotmail.com

==
BUG: KASAN: slab-out-of-bounds in __lock_acquire+0x3da6/0x54b0 
kernel/locking/lockdep.c:4702
Read of size 8 at addr 88801938c0a0 by task kworker/1:1/34

CPU: 1 PID: 34 Comm: kworker/1:1 Not tainted 5.10.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Workqueue: events l2cap_chan_timeout
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:385
 __kasan_report mm/kasan/report.c:545 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:562
 __lock_acquire+0x3da6/0x54b0 kernel/locking/lockdep.c:4702
 lock_acquire kernel/locking/lockdep.c:5437 [inline]
 lock_acquire+0x29d/0x750 kernel/locking/lockdep.c:5402
 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
 _raw_spin_lock_bh+0x2f/0x40 kernel/locking/spinlock.c:175
 spin_lock_bh include/linux/spinlock.h:359 [inline]
 lock_sock_nested+0x3b/0x110 net/core/sock.c:3049
 l2cap_sock_teardown_cb+0xa1/0x660 net/bluetooth/l2cap_sock.c:1520
 l2cap_chan_del+0xbc/0xaa0 net/bluetooth/l2cap_core.c:618
 l2cap_chan_close+0x1bc/0xaf0 net/bluetooth/l2cap_core.c:823
 l2cap_chan_timeout+0x17e/0x2f0 net/bluetooth/l2cap_core.c:436
 process_one_work+0x98d/0x1630 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

Allocated by task 11222:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
 kasan_set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:461
 __do_kmalloc mm/slab.c:3659 [inline]
 __kmalloc+0x18b/0x340 mm/slab.c:3668
 kmalloc include/linux/slab.h:557 [inline]
 kzalloc include/linux/slab.h:682 [inline]
 tomoyo_get_name+0x22b/0x4c0 security/tomoyo/memory.c:173
 tomoyo_parse_name_union+0xbc/0x160 security/tomoyo/util.c:260
 tomoyo_update_path_acl security/tomoyo/file.c:395 [inline]
 tomoyo_write_file+0x4c0/0x7f0 security/tomoyo/file.c:1022
 tomoyo_write_domain2+0x116/0x1d0 security/tomoyo/common.c:1152
 tomoyo_add_entry security/tomoyo/common.c:2042 [inline]
 tomoyo_supervisor+0xbee/0xf20 security/tomoyo/common.c:2103
 tomoyo_audit_path_log security/tomoyo/file.c:168 [inline]
 tomoyo_path_permission security/tomoyo/file.c:587 [inline]
 tomoyo_path_permission+0x270/0x3a0 security/tomoyo/file.c:573
 tomoyo_path_perm+0x37c/0x3f0 security/tomoyo/file.c:838
 tomoyo_path_symlink+0x94/0xe0 security/tomoyo/tomoyo.c:200
 security_path_symlink+0xdf/0x150 security/security.c:
 do_symlinkat+0x123/0x2c0 fs/namei.c:3985
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at 88801938c000
 which belongs to the cache kmalloc-128 of size 128
The buggy address is located 32 bytes to the right of
 128-byte region [88801938c000, 88801938c080)
The buggy address belongs to the page:
page:b7b67fec refcount:1 mapcount:0 mapping: index:0x0 
pfn:0x1938c
flags: 0xfff200(slab)
raw: 00fff200 ea4e6508 eaa5cf48 888010840400
raw:  88801938c000 00010010 
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 88801938bf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 88801938c000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>88801938c080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
   ^
 88801938c100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 88801938c180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==

[PATCH] hwrng: ingenic - Fix a resource leak in an error handling path

2020-12-18 Thread Christophe JAILLET

In case of error, we should call 'clk_disable_unprepare()' to undo a
previous 'clk_prepare_enable()' call, as already done in the remove
function.

Fixes: 406346d22278 ("hwrng: ingenic - Add hardware TRNG for Ingenic X1830")
Signed-off-by: Christophe JAILLET 
---
 drivers/char/hw_random/ingenic-trng.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/char/hw_random/ingenic-trng.c 
b/drivers/char/hw_random/ingenic-trng.c
index 954a8411d67d..0eb80f786f4d 100644
--- a/drivers/char/hw_random/ingenic-trng.c
+++ b/drivers/char/hw_random/ingenic-trng.c
@@ -113,13 +113,17 @@ static int ingenic_trng_probe(struct platform_device 
*pdev)
ret = hwrng_register(&trng->rng);
if (ret) {
dev_err(&pdev->dev, "Failed to register hwrng\n");
-   return ret;
+   goto err_unprepare_clk;
}
 
platform_set_drvdata(pdev, trng);
 
dev_info(&pdev->dev, "Ingenic DTRNG driver registered\n");
return 0;
+
+err_unprepare_clk:
+   clk_disable_unprepare(trng->clk);
+   return ret;
 }
 
 static int ingenic_trng_remove(struct platform_device *pdev)
-- 
2.27.0
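
The fix above follows the common kernel idiom of unwinding already-acquired resources through a goto label when a later step fails. A minimal user-space sketch of that idiom follows. All names in it (fake_clk, fake_clk_enable, fake_clk_disable, fake_register, probe_model) are hypothetical stand-ins for illustration, not the ingenic-trng driver's API:

```c
#include <stdbool.h>

/* Hypothetical stand-in for a clock; tracks enable/disable balance. */
struct fake_clk {
	int enable_count;
};

static int fake_clk_enable(struct fake_clk *clk)
{
	clk->enable_count++;
	return 0;
}

static void fake_clk_disable(struct fake_clk *clk)
{
	clk->enable_count--;
}

/* Hypothetical registration step that can be forced to fail. */
static int fake_register(bool should_fail)
{
	return should_fail ? -1 : 0;
}

/*
 * Model of a probe function: anything acquired before a failing step
 * is released on the error path, so a failed probe leaves no
 * resource behind.
 */
static int probe_model(struct fake_clk *clk, bool register_fails)
{
	int ret;

	ret = fake_clk_enable(clk);
	if (ret)
		return ret;

	ret = fake_register(register_fails);
	if (ret)
		goto err_disable_clk;

	return 0;

err_disable_clk:
	fake_clk_disable(clk);
	return ret;
}
```

In this model a failing probe_model() leaves enable_count balanced at zero, which is the property the patch restores for the real clock.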

Re: [PATCH v15 1/3] scsi: ufs: Introduce HPB feature

2020-12-18 Thread Greg KH

On Sat, Dec 19, 2020 at 12:30:39PM +0900, Daejun Park wrote:
> @@ -323,6 +325,8 @@ static struct attribute *ufs_sysfs_device_descriptor[] = {
>   &dev_attr_number_of_secure_wpa.attr,
>   &dev_attr_psa_max_data_size.attr,
>   &dev_attr_psa_state_timeout.attr,
> + &dev_attr_hpb_version.attr,
> + &dev_attr_hpb_control.attr,
>   &dev_attr_ext_feature_sup.attr,
>   &dev_attr_wb_presv_us_en.attr,
>   &dev_attr_wb_type.attr,

I thought I said this before, but you are adding new sysfs files with
no Documentation/ABI/ update, which is not allowed.

Please fix up.

thanks,

greg k-h

Re: [PATCH] tty: fix timeout equals to 0 problem in tty_ioctl

2020-12-18 Thread Greg KH

On Sat, Dec 19, 2020 at 11:27:31AM +0800, zhangqiumi...@huwei.com wrote:
> From: zhangqiumiao 
> 
> Fix the problem that tty buffer data was not flushed when timeout=0.
> 
> Signed-off-by: zhangqiumiao 
> ---
>  drivers/tty/tty_ioctl.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/tty/tty_ioctl.c b/drivers/tty/tty_ioctl.c
> index 4de1c6ddb8ff..f9d4b6e22308 100644
> --- a/drivers/tty/tty_ioctl.c
> +++ b/drivers/tty/tty_ioctl.c
> @@ -220,13 +220,21 @@ void tty_wait_until_sent(struct tty_struct *tty, long 
> timeout)
>   tty_debug_wait_until_sent(tty, "wait until sent, timeout=%ld\n", 
> timeout);
>  
>   if (!timeout)
> - timeout = MAX_SCHEDULE_TIMEOUT;
> + timeout = 60 * HZ;

Why change this?

>  
>   timeout = wait_event_interruptible_timeout(tty->write_wait,
>   !tty_chars_in_buffer(tty), timeout);
> - if (timeout <= 0)
> + if (timeout < 0)
>   return;

How can timeout be 0 here?

> + if (timeout == 0) {
> + if (tty->ops->flush_buffer) {
> + pr_info("%s: flush buffer\n", __func__);

Debugging code?

> + tty->ops->flush_buffer(tty);
> + }
> + return;
> + }
> +

thanks,

greg k-h
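
For context on the review questions above: as documented in include/linux/wait.h, wait_event_interruptible_timeout() returns a negative value (-ERESTARTSYS) when interrupted by a signal, 0 when the timeout elapsed with the condition still false, and a positive value when the condition became true. The sketch below only classifies such a return value. The enum and function names are hypothetical and are not part of the tty code:

```c
/* Hypothetical labels for the three outcomes of a timed wait. */
enum wait_outcome {
	WAIT_INTERRUPTED,	/* negative return: a signal arrived */
	WAIT_TIMED_OUT,		/* zero: timeout elapsed, condition still false */
	WAIT_CONDITION_MET,	/* positive: condition became true */
};

/* Map the return value of a timed wait onto one of the three cases. */
static enum wait_outcome classify_wait_result(long ret)
{
	if (ret < 0)
		return WAIT_INTERRUPTED;
	if (ret == 0)
		return WAIT_TIMED_OUT;
	return WAIT_CONDITION_MET;
}
```

With the timeout set to MAX_SCHEDULE_TIMEOUT the wait does not expire, so the zero case can only arise when a finite timeout is used.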

Re: [PATCH] device-dax: Fix range release

2020-12-18 Thread Leizhen (ThunderTown)




On 2020/12/19 10:41, Dan Williams wrote:
> There are multiple locations that open-code the release of the last
> range in a device-dax instance. Consolidate this into a new
> dev_dax_trim_range() helper.
> 
> This also addresses a kmemleak report:
> 
> # cat /sys/kernel/debug/kmemleak
> [..]
> unreferenced object 0x976bd46f6240 (size 64):
>comm "ndctl", pid 23556, jiffies 4299514316 (age 5406.733s)
>hex dump (first 32 bytes):
>  00 00 00 00 00 00 00 00 00 00 20 c3 37 00 00 00  .. .7...
>  ff ff ff 7f 38 00 00 00 00 00 00 00 00 00 00 00  8...
>backtrace:
>  [<064003cf>] __kmalloc_track_caller+0x136/0x379
>  [] krealloc+0x67/0x92
>  [] __alloc_dev_dax_range+0x73/0x25c
>  [<27d58626>] devm_create_dev_dax+0x27d/0x416
>  [<434abd43>] __dax_pmem_probe+0x1c9/0x1000 [dax_pmem_core]
>  [<83726c1c>] dax_pmem_probe+0x10/0x1f [dax_pmem]
>  [] nvdimm_bus_probe+0x9d/0x340 [libnvdimm]
>  [] really_probe+0x230/0x48d
>  [<6cabd38e>] driver_probe_device+0x122/0x13b
>  [<29c7b95a>] device_driver_attach+0x5b/0x60
>  [<53e5659b>] bind_store+0xb7/0xc3
>  [] drv_attr_store+0x27/0x31
>  [<949069c5>] sysfs_kf_write+0x4a/0x57
>  [<4a8b5adf>] kernfs_fop_write+0x150/0x1e5
>  [] __vfs_write+0x1b/0x34
>  [] vfs_write+0xd8/0x1d1
> 
> Reported-by: Jane Chu 
> Cc: Zhen Lei 
> Signed-off-by: Dan Williams 
> ---
>  drivers/dax/bus.c |   44 +---
>  1 file changed, 21 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> index 9761cb40d4bb..720cd140209f 100644
> --- a/drivers/dax/bus.c
> +++ b/drivers/dax/bus.c
> @@ -367,19 +367,28 @@ void kill_dev_dax(struct dev_dax *dev_dax)
>  }
>  EXPORT_SYMBOL_GPL(kill_dev_dax);
>  
> -static void free_dev_dax_ranges(struct dev_dax *dev_dax)
> +static void trim_dev_dax_range(struct dev_dax *dev_dax)
>  {
> + int i = dev_dax->nr_range - 1;
> + struct range *range = &dev_dax->ranges[i].range;
>   struct dax_region *dax_region = dev_dax->region;
> - int i;
>  
>   device_lock_assert(dax_region->dev);
> - for (i = 0; i < dev_dax->nr_range; i++) {
> - struct range *range = &dev_dax->ranges[i].range;
> -
> - __release_region(&dax_region->res, range->start,
> - range_len(range));
> + dev_dbg(&dev_dax->dev, "delete range[%d]: %#llx:%#llx\n", i,
> + (unsigned long long)range->start,
> + (unsigned long long)range->end);
> +
> + __release_region(&dax_region->res, range->start, range_len(range));
> + if (--dev_dax->nr_range == 0) {
> + kfree(dev_dax->ranges);
> + dev_dax->ranges = NULL;
>   }
> - dev_dax->nr_range = 0;
> +}
> +
> +static void free_dev_dax_ranges(struct dev_dax *dev_dax)
> +{
> + while (dev_dax->nr_range)
It would be better to read dev_dax->nr_range with READ_ONCE() here, so
that the compiler cannot optimize the load away.

> + trim_dev_dax_range(dev_dax);
>  }
>  
>  static void unregister_dev_dax(void *dev)
> @@ -804,15 +813,10 @@ static int alloc_dev_dax_range(struct dev_dax *dev_dax, 
> u64 start,
>   return 0;
>  
>   rc = devm_register_dax_mapping(dev_dax, dev_dax->nr_range - 1);
> - if (rc) {
> - dev_dbg(dev, "delete range[%d]: %pa:%pa\n", dev_dax->nr_range - 
> 1,
> - &alloc->start, &alloc->end);
> - dev_dax->nr_range--;
> - __release_region(res, alloc->start, resource_size(alloc));
> - return rc;
> - }
> + if (rc)
> + trim_dev_dax_range(dev_dax);
>  
> - return 0;
> + return rc;
>  }
>  
>  static int adjust_dev_dax_range(struct dev_dax *dev_dax, struct resource 
> *res, resource_size_t size)
> @@ -885,12 +889,7 @@ static int dev_dax_shrink(struct dev_dax *dev_dax, 
> resource_size_t size)
>   if (shrink >= range_len(range)) {
>   devm_release_action(dax_region->dev,
>   unregister_dax_mapping, &mapping->dev);
> - __release_region(&dax_region->res, range->start,
> - range_len(range));
> - dev_dax->nr_range--;
> - dev_dbg(dev, "delete range[%d]: %#llx:%#llx\n", i,
> - (unsigned long long) range->start,
> - (unsigned long long) range->end);
> + trim_dev_dax_range(dev_dax);
>   to_shrink -= shrink;
>   if (!to_shrink)
>   break;
> @@ -1267,7 +1266,6 @@ static void dev_dax_release(struct device *dev)
>   put_dax(dax_dev);
>   free_d
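
The consolidation described in the quoted patch (one helper that drops the last range, and a loop that calls it until nothing is left) can be modelled in user-space C. This is only a sketch under simplifying assumptions: struct range_model, struct dev_model, add_range(), trim_range() and free_ranges() are hypothetical names, there is no region release or locking, and the empty-array guard in trim_range() is an addition that the quoted patch does not have:

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the device-dax structures. */
struct range_model {
	unsigned long long start;
	unsigned long long end;
};

struct dev_model {
	struct range_model *ranges;	/* heap-allocated array */
	int nr_range;			/* number of valid entries */
};

/* Append one range; returns 0 on success, -1 on allocation failure. */
static int add_range(struct dev_model *dev, unsigned long long start,
		     unsigned long long end)
{
	struct range_model *ranges;

	ranges = realloc(dev->ranges, sizeof(*ranges) * (dev->nr_range + 1));
	if (!ranges)
		return -1;
	ranges[dev->nr_range].start = start;
	ranges[dev->nr_range].end = end;
	dev->ranges = ranges;
	dev->nr_range++;
	return 0;
}

/* Drop the last range; free the array once it becomes empty. */
static void trim_range(struct dev_model *dev)
{
	if (dev->nr_range == 0)
		return;
	if (--dev->nr_range == 0) {
		free(dev->ranges);
		dev->ranges = NULL;
	}
}

/* Release everything by trimming until no ranges remain. */
static void free_ranges(struct dev_model *dev)
{
	while (dev->nr_range)
		trim_range(dev);
}
```

Freeing the array inside the trim helper means every path that removes the final range also releases the allocation, which is how the patch addresses the reported leak.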

Re: [PATCH v2 0/6] kernfs: proposed locking and concurrency improvement

2020-12-18 Thread Fox Chen

On Sat, Dec 19, 2020 at 8:53 AM Ian Kent  wrote:
>
> On Fri, 2020-12-18 at 21:20 +0800, Fox Chen wrote:
> > On Fri, Dec 18, 2020 at 7:21 PM Ian Kent  wrote:
> > > On Fri, 2020-12-18 at 16:01 +0800, Fox Chen wrote:
> > > > On Fri, Dec 18, 2020 at 3:36 PM Ian Kent 
> > > > wrote:
> > > > > On Thu, 2020-12-17 at 10:14 -0500, Tejun Heo wrote:
> > > > > > Hello,
> > > > > >
> > > > > > On Thu, Dec 17, 2020 at 07:48:49PM +0800, Ian Kent wrote:
> > > > > > > > What could be done is to make the kernfs node attr_mutex
> > > > > > > > a pointer and dynamically allocate it but even that is
> > > > > > > > too costly a size addition to the kernfs node structure
> > > > > > > > as Tejun has said.
> > > > > > >
> > > > > > > I guess the question to ask is, is there really a need to
> > > > > > > call kernfs_refresh_inode() from functions that are usually
> > > > > > > reading/checking functions.
> > > > > > >
> > > > > > > > Would it be sufficient to refresh the inode in the
> > > > > > > > write/set operations in (if there are any) places where
> > > > > > > > things like setattr_copy() are not already called?
> > > > > > >
> > > > > > > Perhaps GKH or Tejun could comment on this?
> > > > > >
> > > > > > > My memory is a bit hazy but invalidations on reads is how
> > > > > > > sysfs namespace is implemented, so I don't think there's an
> > > > > > > easy way around that. The only thing I can think of is
> > > > > > > embedding the lock into attrs and doing xchg dance when
> > > > > > > attaching it.
> > > > >
> > > > > > Sounds like you're saying it would be ok to add a lock to the
> > > > > > attrs structure, am I correct?
> > > > >
> > > > > Assuming it is then, to keep things simple, use two locks.
> > > > >
> > > > > > One global lock for the allocation and an attrs lock for all
> > > > > > the attrs field updates including the kernfs_refresh_inode()
> > > > > > update.
> > > > > >
> > > > > > The critical section for the global lock could be reduced and
> > > > > > it changed to a spin lock.
> > > > >
> > > > > In __kernfs_iattrs() we would have something like:
> > > > >
> > > > > > take the allocation lock
> > > > > > do the allocated checks
> > > > > >   assign if existing attrs
> > > > > >   release the allocation lock
> > > > > >   return existing if found
> > > > > > otherwise
> > > > > >   release the allocation lock
> > > > > >
> > > > > > allocate and initialize attrs
> > > > > >
> > > > > > take the allocation lock
> > > > > > check if someone beat us to it
> > > > > >   free and grab existing attrs
> > > > > > otherwise
> > > > > >   assign the new attrs
> > > > > > release the allocation lock
> > > > > > return attrs
> > > > >
> > > > > Add a spinlock to the attrs struct and use it everywhere for
> > > > > field updates.
> > > > >
> > > > > Am I on the right track or can you see problems with this?
> > > > >
> > > > > Ian
> > > > >
> > > >
> > > > > umm, we update the inode in kernfs_refresh_inode, right??  So I
> > > > > guess the problem is how can we protect the inode when
> > > > > kernfs_refresh_inode is called, not the attrs??
> > >
> > > But the attrs (which is what's copied from) were protected by the
> > > mutex lock (IIUC) so dealing with the inode attributes implies
> > > dealing with the kernfs node attrs too.
> > >
> > > > For example in kernfs_iop_setattr() the call to setattr_copy()
> > > > copies the node attrs to the inode under the same mutex lock. So,
> > > > if a read lock is used the copy in kernfs_refresh_inode() is no
> > > > longer protected, it needs to be protected in a different way.
> > >
> >
> > > Ok, I'm actually wondering why the VFS holds exclusive i_rwsem for
> > > .setattr but no lock for .getattr (misdocumented?? sometimes they
> > > have as you've found out)? What does it protect against?? Because
> > > .permission does a similar thing here -- updating inode attributes,
> > > the goal is to provide the same protection level for .permission as
> > > for .setattr, am I right???
>
> As far as the documentation goes that's probably my misunderstanding
> of it.
>
> It does happen that the VFS makes assumptions about how call backs
> are meant to be used.
>
> Read like call backs, like .getattr() and .permission() are meant to
> be used, well, like read like functions so the VFS should be ok to
> take locks or not based on the operation context at hand.
>
> So it's not about the locking for these call backs per se, it's about
> the context in which they are called.
>
> For example, in link_path_walk(), at the beginning of the component
> lookup loop (essentially for the containing directory at that point),
> may_lookup() is called which leads to a call to .permission() without
> any inode lock held at that point.
>
> But file opens (possibly following a path walk to resolve a path)
> are different.
>
> For example, do_filp_open() calls path_openat() which leads to a
> call to open_last_lookups(), which leads

[GIT PULL] pcmcia updates for v5.11

2020-12-18 Thread Dominik Brodowski

Linus,

The following changes since commit b3298500b23f0b53a8d81e0d5ad98a29db71f4f0:

  Merge tag 'for-5.10/dm-fixes' of 
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm 
(2020-12-04 13:28:39 -0800)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux.git pcmcia-next

for you to fetch changes up to 70d3a462fc244b0580268cc8e6c47ae4463db68a:

  pcmcia: omap: Fix error return code in omap_cf_probe() (2020-12-05 09:59:13 
+0100)


Besides a few PCMCIA odd fixes, the NEC VRC4173 CARDU driver is
removed, as it has not compiled in ages.


Thanks,
Dominik



Christophe JAILLET (1):
  pcmcia/electra_cf: Fix some return values in 'electra_cf_probe()' in case 
of error

Jason Yan (1):
  pcmcia: db1xxx_ss: remove unneeded semicolon

Sebastian Andrzej Siewior (1):
  pcmcia: Remove NEC VRC4173 CARDU

Wang ShaoBo (1):
  pcmcia: omap: Fix error return code in omap_cf_probe()

 drivers/pcmcia/Kconfig |   4 -
 drivers/pcmcia/Makefile|   1 -
 drivers/pcmcia/db1xxx_ss.c |   2 +-
 drivers/pcmcia/electra_cf.c|   2 +
 drivers/pcmcia/omap_cf.c   |   8 +-
 drivers/pcmcia/vrc4173_cardu.c | 591 -
 drivers/pcmcia/vrc4173_cardu.h | 247 -
 7 files changed, 9 insertions(+), 846 deletions(-)
 delete mode 100644 drivers/pcmcia/vrc4173_cardu.c
 delete mode 100644 drivers/pcmcia/vrc4173_cardu.h



Re: linux-next: Signed-off-by missing for commit in the ipsec tree

2020-12-18 Thread Steffen Klassert

On Sat, Dec 19, 2020 at 02:26:09PM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> Commit
> 
>   06148d3b3f2e ("xfrm: Fix oops in xfrm_replay_advance_bmp")
> 
> is missing a Signed-off-by from its committer.

My bad. I did a forced push to fix it.

Re: [PATCH v2 0/6] kernfs: proposed locking and concurrency improvement

2020-12-18 Thread Ian Kent

On Fri, 2020-12-18 at 09:59 -0500, Tejun Heo wrote:
> Hello,
> 
> On Fri, Dec 18, 2020 at 03:36:21PM +0800, Ian Kent wrote:
> > Sounds like you're saying it would be ok to add a lock to the
> > attrs structure, am I correct?
> 
> Yeah, adding a lock to attrs is a lot less of a problem and it looks
> like it's gonna have to be either that or hashed locks, which might
> actually make sense if we're worried about the size of attrs (I don't
> think we need to).

Maybe that isn't needed.

And looking further I see there's a race that kernfs can't do anything
about between kernfs_refresh_inode() and fs/inode.c:update_time().

kernfs could avoid fighting with the VFS to keep the attributes set to
those of the kernfs node by using the inode operation .update_time()
and, if it makes sense, the kernfs node attributes that it wants to be
updated on file system activity could also be updated here.

I can't find any reason why this shouldn't be done but kernfs is
fairly widely used in other kernel subsystems so what does everyone
think of this patch, updated to set kernfs node attributes that
should be updated of course, see comment in the patch?

kernfs: fix attributes update race

From: Ian Kent 

kernfs uses kernfs_refresh_inode() (called from kernfs_iop_getattr()
and kernfs_iop_permission()) to keep the inode attributes set to the
attributes of the kernfs node.

But there is no way for kernfs to prevent racing with the function
fs/inode.c:update_time().

The better choice is to use the inode operation .update_time() and
just let the VFS use the generic functions for .getattr() and
.permission().

Signed-off-by: Ian Kent 
---
 fs/kernfs/inode.c   |   37 ++---
 fs/kernfs/kernfs-internal.h |4 +---
 2 files changed, 15 insertions(+), 26 deletions(-)

diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index fc2469a20fed..51780329590c 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -24,9 +24,8 @@ static const struct address_space_operations kernfs_aops = {
 };
 
 static const struct inode_operations kernfs_iops = {
-   .permission = kernfs_iop_permission,
+   .update_time= kernfs_update_time,
.setattr= kernfs_iop_setattr,
-   .getattr= kernfs_iop_getattr,
.listxattr  = kernfs_iop_listxattr,
 };
 
@@ -183,18 +182,26 @@ static void kernfs_refresh_inode(struct kernfs_node *kn, 
struct inode *inode)
set_nlink(inode, kn->dir.subdirs + 2);
 }
 
-int kernfs_iop_getattr(const struct path *path, struct kstat *stat,
-  u32 request_mask, unsigned int query_flags)
+int kernfs_update_time(struct inode *inode, struct timespec64 
*time, int flags)
 {
-   struct inode *inode = d_inode(path->dentry);
struct kernfs_node *kn = inode->i_private;
+   struct kernfs_iattrs *attrs;
 
mutex_lock(&kernfs_mutex);
+   attrs = kernfs_iattrs(kn);
+   if (!attrs) {
+   mutex_unlock(&kernfs_mutex);
+   return -ENOMEM;
+   }
+
+   /* Which kernfs node attributes should be updated from
+* time?
+*/
+
kernfs_refresh_inode(kn, inode);
mutex_unlock(&kernfs_mutex);
 
-   generic_fillattr(inode, stat);
-   return 0;
+   return 0;
 }
 
 static void kernfs_init_inode(struct kernfs_node *kn, struct inode *inode)
@@ -272,22 +279,6 @@ void kernfs_evict_inode(struct inode *inode)
kernfs_put(kn);
 }
 
-int kernfs_iop_permission(struct inode *inode, int mask)
-{
-   struct kernfs_node *kn;
-
-   if (mask & MAY_NOT_BLOCK)
-   return -ECHILD;
-
-   kn = inode->i_private;
-
-   mutex_lock(&kernfs_mutex);
-   kernfs_refresh_inode(kn, inode);
-   mutex_unlock(&kernfs_mutex);
-
-   return generic_permission(inode, mask);
-}
-
 int kernfs_xattr_get(struct kernfs_node *kn, const char *name,
 void *value, size_t size)
 {
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 7ee97ef59184..98d08b928f93 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -89,10 +89,8 @@ extern struct kmem_cache *kernfs_node_cache, 
*kernfs_iattrs_cache;
  */
 extern const struct xattr_handler *kernfs_xattr_handlers[];
 void kernfs_evict_inode(struct inode *inode);
-int kernfs_iop_permission(struct inode *inode, int mask);
+int kernfs_update_time(struct inode *inode, struct timespec64 *time, int 
flags);
 int kernfs_iop_setattr(struct dentry *dentry, struct iattr *iattr);
-int kernfs_iop_getattr(const struct path *path, struct kstat *stat,
-  u32 request_mask, unsigned int query_flags);
 ssize_t kernfs_iop_listxattr(struct dentry *dentry, char *buf, size_t size);
 int __kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr);
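
The allocation scheme Ian Kent outlines earlier in this thread (check under an allocation lock, allocate outside it, then recheck before assigning) could look roughly like the following in user-space C. This is only a sketch that uses pthread mutexes in place of kernel locks. struct attrs_model, struct node_model, alloc_lock and get_attrs() are hypothetical names and this is not kernfs code:

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-node attributes. */
struct attrs_model {
	pthread_mutex_t lock;	/* per-attrs lock for field updates */
	int mode;
};

/* Hypothetical stand-in for a kernfs node. */
struct node_model {
	struct attrs_model *attrs;
};

/* One global lock that only guards the attrs pointer assignment. */
static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;

static struct attrs_model *get_attrs(struct node_model *node)
{
	struct attrs_model *attrs, *new_attrs;

	/* First check under the allocation lock. */
	pthread_mutex_lock(&alloc_lock);
	attrs = node->attrs;
	pthread_mutex_unlock(&alloc_lock);
	if (attrs)
		return attrs;

	/* Allocate and initialize outside the lock. */
	new_attrs = calloc(1, sizeof(*new_attrs));
	if (!new_attrs)
		return NULL;
	pthread_mutex_init(&new_attrs->lock, NULL);

	/* Recheck: another thread may have assigned attrs meanwhile. */
	pthread_mutex_lock(&alloc_lock);
	if (node->attrs) {
		attrs = node->attrs;
	} else {
		node->attrs = new_attrs;
		attrs = new_attrs;
		new_attrs = NULL;
	}
	pthread_mutex_unlock(&alloc_lock);

	/* Free our copy if we lost the race. */
	if (new_attrs) {
		pthread_mutex_destroy(&new_attrs->lock);
		free(new_attrs);
	}
	return attrs;
}
```

Keeping the allocation outside the global lock is what lets its critical section shrink to a pointer check and assignment, which is the reason the thread suggests it could become a spin lock.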

[GIT PULL] i3c: Changes for 5.11

2020-12-18 Thread Boris Brezillon

Hello Linus,

Here is the I3C PR for 5.11. This should be my last PR (I resigned from
my maintainer position). Alexandre Belloni (maintainer of the RTC
subsystem) kindly proposed to take over, so he should send the I3C PRs
from now on.

Regards,

Boris 

The following changes since commit 3650b228f83adda7e5ee532e2b90429c03f7b9ec:

  Linux 5.10-rc1 (2020-10-25 15:14:11 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux.git tags/i3c/for-5.11

for you to fetch changes up to 95393f3e07ab53855b91881692a4a5b52dcdc03c:

  i3c/master/mipi-i3c-hci: quiet maybe-unused variable warning (2020-12-17 
10:31:30 +0100)


* Add the HCI driver
* Add a missing destroy_workqueue() in an error path
* Flag Alexandre Belloni as the new maintainer


Boris Brezillon (1):
  i3c: Resign from my maintainer role

Colin Ian King (1):
  i3c/master: Fix uninitialized variable next_addr

Nicolas Pitre (3):
  dt-bindings: i3c: MIPI I3C Host Controller Interface
  i3c/master: introduce the mipi-i3c-hci driver
  i3c/master/mipi-i3c-hci: quiet maybe-unused variable warning

Qinglang Miao (1):
  i3c master: fix missing destroy_workqueue() on error in 
i3c_master_register

 Documentation/devicetree/bindings/i3c/mipi-i3c-hci.yaml |   47 +++
 MAINTAINERS |2 +-
 drivers/i3c/master.c|5 +-
 drivers/i3c/master/Kconfig  |   13 +
 drivers/i3c/master/Makefile |1 +
 drivers/i3c/master/mipi-i3c-hci/Makefile|6 +
 drivers/i3c/master/mipi-i3c-hci/cmd.h   |   67 
 drivers/i3c/master/mipi-i3c-hci/cmd_v1.c|  378 +++
 drivers/i3c/master/mipi-i3c-hci/cmd_v2.c|  316 
 drivers/i3c/master/mipi-i3c-hci/core.c  |  798 
 drivers/i3c/master/mipi-i3c-hci/dat.h   |   32 ++
 drivers/i3c/master/mipi-i3c-hci/dat_v1.c|  184 ++
 drivers/i3c/master/mipi-i3c-hci/dct.h   |   16 +
 drivers/i3c/master/mipi-i3c-hci/dct_v1.c|   36 ++
 drivers/i3c/master/mipi-i3c-hci/dma.c   |  784 +++
 drivers/i3c/master/mipi-i3c-hci/ext_caps.c  |  308 
 drivers/i3c/master/mipi-i3c-hci/ext_caps.h  |   19 +
 drivers/i3c/master/mipi-i3c-hci/hci.h   |  144 
 drivers/i3c/master/mipi-i3c-hci/ibi.h   |   42 +++
 drivers/i3c/master/mipi-i3c-hci/pio.c   | 1041 
 drivers/i3c/master/mipi-i3c-hci/xfer_mode_rate.h|   79 
 21 files changed, 4316 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/i3c/mipi-i3c-hci.yaml
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/Makefile
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/cmd.h
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/cmd_v1.c
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/cmd_v2.c
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/core.c
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/dat.h
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/dat_v1.c
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/dct.h
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/dct_v1.c
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/dma.c
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/ext_caps.c
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/ext_caps.h
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/hci.h
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/ibi.h
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/pio.c
 create mode 100644 drivers/i3c/master/mipi-i3c-hci/xfer_mode_rate.h

Re: [PATCH] MAINTAINERS: Update email address for Sean Christopherson

2020-12-18 Thread Nathan Chancellor

On Thu, Nov 19, 2020 at 10:37:07AM -0800, Sean Christopherson wrote:
> From: Sean Christopherson 
> 
> Update my email address to one provided by my new benefactor.
> 
> Cc: Thomas Gleixner 
> Cc: Borislav Petkov 
> Cc: Jarkko Sakkinen 
> Cc: Dave Hansen 
> Cc: Andy Lutomirski 
> Cc: Vitaly Kuznetsov 
> Cc: Wanpeng Li 
> Cc: Jim Mattson 
> Cc: Joerg Roedel 
> Cc: k...@vger.kernel.org
> Signed-off-by: Sean Christopherson 
> ---
> Resorted to sending this via a private dummy account as getting my corp
> email to play nice with git-sendemail has been further delayed, and I
> assume y'all are tired of getting bounces.
> 
>  .mailmap| 1 +
>  MAINTAINERS | 2 +-
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/.mailmap b/.mailmap
> index 1e14566a3d56..a0d1685a165a 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -287,6 +287,7 @@ Santosh Shilimkar 
>  Sarangdhar Joshi 
>  Sascha Hauer 
>  S.Çağlar Onur 
> +Sean Christopherson  
>  Sean Nyekjaer  
>  Sebastian Reichel  
>  Sebastian Reichel  
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 4a34b25ecc1f..0478d9ef72fc 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -9662,7 +9662,7 @@ F:  tools/testing/selftests/kvm/s390x/
>  
>  KERNEL VIRTUAL MACHINE FOR X86 (KVM/x86)
>  M:   Paolo Bonzini 
> -R:   Sean Christopherson 
> +R:   Sean Christopherson 
>  R:   Vitaly Kuznetsov 
>  R:   Wanpeng Li 
>  R:   Jim Mattson 
> -- 
> 2.29.2.299.gdc1121823c-goog
> 

Not sure how it happened but commit c2b1209d852f ("MAINTAINERS: Update
email address for Sean Christopherson") dropped the MAINTAINERS
hunk so it still shows your @intel.com address. I almost sent a patch
there before I realized there was a .mailmap entry while looking through
git history.

Cheers,
Nathan

[PATCH] KVM: SVM: Add register operand to vmsave call in sev_es_vcpu_load

2020-12-18 Thread Nathan Chancellor

When using LLVM's integrated assembler (LLVM_IAS=1) while building
x86_64_defconfig + CONFIG_KVM=y + CONFIG_KVM_AMD=y, the following build
error occurs:

 $ make LLVM=1 LLVM_IAS=1 arch/x86/kvm/svm/sev.o
 arch/x86/kvm/svm/sev.c:2004:15: error: too few operands for instruction
 asm volatile(__ex("vmsave") : : "a" (__sme_page_pa(sd->save_area)) : "memory");
  ^
 arch/x86/kvm/svm/sev.c:28:17: note: expanded from macro '__ex'
 #define __ex(x) __kvm_handle_fault_on_reboot(x)
 ^
 ./arch/x86/include/asm/kvm_host.h:1646:10: note: expanded from macro '__kvm_handle_fault_on_reboot'
 "666: \n\t" \
 ^
 :2:2: note: instantiated into assembly here
 vmsave
 ^
 1 error generated.

This happens because LLVM currently does not support calling vmsave
without the fixed register operand (%rax for 64-bit and %eax for
32-bit). This will be fixed in LLVM 12 but the kernel currently supports
LLVM 10.0.1 and newer so this needs to be handled.

Add the proper register using the _ASM_AX macro, which matches the
vmsave call in vmenter.S.

Fixes: 861377730aa9 ("KVM: SVM: Provide support for SEV-ES vCPU loading")
Link: https://reviews.llvm.org/D93524
Link: https://github.com/ClangBuiltLinux/linux/issues/1216
Signed-off-by: Nathan Chancellor 
---
 arch/x86/kvm/svm/sev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index e57847ff8bd2..958370758ed0 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2001,7 +2001,7 @@ void sev_es_vcpu_load(struct vcpu_svm *svm, int cpu)
 * of which one step is to perform a VMLOAD. Since hardware does not
 * perform a VMSAVE on VMRUN, the host savearea must be updated.
 */
-   asm volatile(__ex("vmsave") : : "a" (__sme_page_pa(sd->save_area)) : "memory");
+   asm volatile(__ex("vmsave %%"_ASM_AX) : : "a" (__sme_page_pa(sd->save_area)) : "memory");
 
/*
 * Certain MSRs are restored on VMEXIT, only save ones that aren't
-- 
2.30.0.rc0
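
The operand selection described in the commit message can be modelled outside the kernel. What follows is only a sketch in Python, not kernel code: asm_ax(), vmsave_template() and emitted_instruction() are hypothetical helpers. They assume that _ASM_AX names the accumulator register for the build's word size (as the commit message implies with %rax and %eax) and that the doubled "%%" in an inline-asm template is emitted as a single "%".

```python
def asm_ax(bits: int) -> str:
    """Hypothetical stand-in for _ASM_AX: accumulator name by word size."""
    if bits == 64:
        return "rax"
    if bits == 32:
        return "eax"
    raise ValueError(f"unsupported word size: {bits}")


def vmsave_template(bits: int) -> str:
    """Template as written in the C source; '%%' escapes a literal '%'."""
    return "vmsave %%" + asm_ax(bits)


def emitted_instruction(template: str) -> str:
    """Instruction text the assembler would see after escape processing."""
    return template.replace("%%", "%")


if __name__ == "__main__":
    for bits in (64, 32):
        print(bits, emitted_instruction(vmsave_template(bits)))
```

With an explicit operand in the template, the instruction no longer depends on the assembler accepting the operand-less form.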

[GIT PULL] xen: branch for v5.11-rc1

2020-12-18 Thread Juergen Gross

Linus,

Please git pull the following tag:

 git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git for-linus-5.11-rc1b-tag

xen: branch for v5.11-rc1

It contains some minor cleanup patches and a small series disentangling some Xen
related Kconfig options.

Thanks.

Juergen

 arch/x86/include/asm/xen/page.h |  2 +-
 arch/x86/xen/Kconfig| 38 ++
 arch/x86/xen/p2m.c  | 12 +---
 drivers/block/xen-blkfront.c|  1 +
 drivers/xen/Makefile|  2 +-
 drivers/xen/manage.c|  1 +
 6 files changed, 27 insertions(+), 29 deletions(-)

Gustavo A. R. Silva (2):
  xen-blkfront: Fix fall-through warnings for Clang
  xen/manage: Fix fall-through warnings for Clang

Jason Andryuk (3):
  xen: Remove Xen PVH/PVHVM dependency on PCI
  xen: Kconfig: nest Xen guest options
  xen: Kconfig: remove X86_64 depends from XEN_512GB

Qinglang Miao (1):
  x86/xen: Convert to DEFINE_SHOW_ATTRIBUTE

Tom Rix (1):
  xen: remove trailing semicolon in macro definition

Re: [PATCH] xen: Kconfig: remove X86_64 depends from XEN_512GB

2020-12-18 Thread Jürgen Groß


On 16.12.20 15:08, Jason Andryuk wrote:

commit bfda93aee0ec ("xen: Kconfig: nest Xen guest options")
accidentally re-added X86_64 as a dependency to XEN_512GB.  It was
originally removed in commit a13f2ef168cb ("x86/xen: remove 32-bit Xen
PV guest support").  Remove it again.

Fixes: bfda93aee0ec ("xen: Kconfig: nest Xen guest options")
Reported-by: Boris Ostrovsky 
Signed-off-by: Jason Andryuk 


Applied to xen/tip.git for-linus-5.11


Juergen



[rcu:dev.2020.12.15b] BUILD SUCCESS f895a17eec290b0038a6294d884a9cc92d7d6e80

2020-12-18 Thread kernel test robot

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git  dev.2020.12.15b
branch HEAD: f895a17eec290b0038a6294d884a9cc92d7d6e80  rcu/nocb: Add grace period and task state to show_rcu_nocb_state() output

elapsed time: 721m

configs tested: 126
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm      defconfig
arm64    allyesconfig
arm64    defconfig
arm      allyesconfig
arm      allmodconfig
powerpc  tqm8541_defconfig
m68k     apollo_defconfig
arm      mini2440_defconfig
powerpc  canyonlands_defconfig
powerpc  mpc834x_mds_defconfig
powerpc  motionpro_defconfig
powerpc  ep88xc_defconfig
parisc   alldefconfig
arm      omap1_defconfig
mips     e55_defconfig
powerpc  mpc836x_rdk_defconfig
m68k     mvme147_defconfig
powerpc  mpc837x_mds_defconfig
powerpc  icon_defconfig
arm      shannon_defconfig
ia64     defconfig
mips     maltasmvp_defconfig
um       x86_64_defconfig
c6x      evmc6678_defconfig
sh       edosk7760_defconfig
powerpc  asp8347_defconfig
alpha    defconfig
mips     lemote2f_defconfig
arm      pxa3xx_defconfig
arm      at91_dt_defconfig
mips     cu1000-neo_defconfig
powerpc  sequoia_defconfig
powerpc  mpc834x_itxgp_defconfig
powerpc  sbc8548_defconfig
mips     maltaup_xpa_defconfig
m68k     defconfig
arm      spear3xx_defconfig
sh       apsh4a3a_defconfig
arm      tango4_defconfig
mips     decstation_64_defconfig
i386     alldefconfig
mips     fuloong2e_defconfig
riscv    defconfig
arm      mps2_defconfig
arm      cns3420vb_defconfig
powerpc  kilauea_defconfig
sh       rsk7201_defconfig
ia64     allmodconfig
ia64     allyesconfig
m68k     allyesconfig
m68k     allmodconfig
nios2    defconfig
arc      allyesconfig
nds32    allnoconfig
c6x      allyesconfig
nds32    defconfig
nios2    allyesconfig
csky     defconfig
alpha    allyesconfig
xtensa   allyesconfig
h8300    allyesconfig
arc      defconfig
sh       allmodconfig
parisc   defconfig
s390     allyesconfig
parisc   allyesconfig
s390     defconfig
i386     allyesconfig
sparc    allyesconfig
sparc    defconfig
i386     tinyconfig
i386     defconfig
mips     allyesconfig
mips     allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc  allnoconfig
x86_64   randconfig-a003-20201217
x86_64   randconfig-a006-20201217
x86_64   randconfig-a002-20201217
x86_64   randconfig-a005-20201217
x86_64   randconfig-a004-20201217
x86_64   randconfig-a001-20201217
i386     randconfig-a001-20201217
i386     randconfig-a004-20201217
i386     randconfig-a003-20201217
i386     randconfig-a002-20201217
i386     randconfig-a006-20201217
i386     randconfig-a005-20201217
x86_64   randconfig-a016-20201218
x86_64   randconfig-a013-20201218
x86_64   randconfig-a012-20201218
x86_64   randconfig-a015-20201218
x86_64   randconfig-a014-20201218
x86_64   randconfig-a011-20201218
i386     randconfig-a014-20201217
i386     randconfig-a013-20201217
i386     randconfig-a012-20201217
i386     randconfig-a011-20201217
i386     randconfig

[PATCH] arm64: smp: Add support for cpu park

2020-12-18 Thread Sang Yan

Introduce a CPU PARK feature to save the time spent taking cpus
down and bringing them up during kexec, which may cost 250ms per
cpu for down and 30ms per cpu for up.

As a result, for 128 cores, it costs more than 30 seconds
to take cpus down and bring them up during kexec. Think about 256 cores and more.
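
The 30-second figure follows from the per-cpu numbers quoted above. A rough check, sketched in Python under the assumption that cpus are taken down and brought up strictly one after another and that every core pays both costs (kexec_cpu_cycle_seconds() is a hypothetical helper, not something from the patch):

```python
DOWN_MS_PER_CPU = 250  # per-cpu down cost quoted in the commit message
UP_MS_PER_CPU = 30     # per-cpu up cost quoted in the commit message


def kexec_cpu_cycle_seconds(ncpus: int) -> float:
    """Total down+up time, assuming fully serialized handling of all cpus."""
    total_ms = ncpus * (DOWN_MS_PER_CPU + UP_MS_PER_CPU)
    return total_ms / 1000.0


if __name__ == "__main__":
    for n in (128, 256):
        print(n, "cores:", kexec_cpu_cycle_seconds(n), "seconds")
```

Under these assumptions the cost grows linearly with the core count, which is why doubling to 256 cores doubles the delay.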

CPU PARK is a state in which a cpu stays powered on and sits in a spin
loop, polling for an exit condition such as an exit address being written.

A block of memory is reserved and filled with the cpu park text section,
the exit address and the park-magic-flag of each cpu. In this implementation,
one page is reserved per cpu core.

Cpus enter the park state instead of going down in machine_shutdown().
Cpus leave the park state in smp_init instead of being brought up.

Layout of one cpu park section in the pre-reserved memory block:
+----------------+
+ exit address   +
+----------------+
+ park magic     +
+----------------+
+ park codes     +
+      .         +
+      .         +
+      .         +
+----------------+
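
Two details of the layout can be illustrated with a short Python sketch. The PARK_MAGIC value is the one defined in the patch; the 4 KiB PAGE_SIZE and the idea that cpu N's section starts N sections into the reserved block are assumptions made for illustration only, since the part of the patch that computes addresses is not shown here. Both helpers are hypothetical.

```python
PARK_MAGIC = 0x7061726B  # value defined in the patch
PAGE_SIZE = 4096         # assumption: 4 KiB pages (arm64 can use other sizes)


def magic_as_text(magic: int) -> str:
    """Read the 32-bit magic as four big-endian ASCII bytes."""
    return magic.to_bytes(4, "big").decode("ascii")


def section_offset(cpu: int, section_size: int = PAGE_SIZE) -> int:
    """Assumed offset of a cpu's park section from the start of the block."""
    if cpu < 0:
        raise ValueError("cpu index must be non-negative")
    return cpu * section_size


if __name__ == "__main__":
    print(magic_as_text(PARK_MAGIC))
    print(section_offset(3))
```

The magic decodes to the word the patch comment gives for it, which makes a parked section easy to recognize in a memory dump.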

Signed-off-by: Sang Yan 
---
 arch/arm64/Kconfig|  12 +++
 arch/arm64/include/asm/cpu.h  |   9 ++
 arch/arm64/include/asm/kexec.h|   7 ++
 arch/arm64/include/asm/smp.h  |   3 +
 arch/arm64/kernel/Makefile|   1 +
 arch/arm64/kernel/cpu-park.S  |  49 +
 arch/arm64/kernel/machine_kexec.c |   2 +-
 arch/arm64/kernel/process.c   |   4 +
 arch/arm64/kernel/smp.c   | 220 ++
 arch/arm64/mm/init.c  |  56 ++
 10 files changed, 362 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/cpu-park.S

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9f0139b..7a9defd 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -347,6 +347,18 @@ config KASAN_SHADOW_OFFSET
default 0xeff8 if ARM64_VA_BITS_36 && KASAN_SW_TAGS
default 0x
 
+config ARM64_CPU_PARK
+   bool "Support CPU PARK on kexec"
+   depends on SMP
+   depends on KEXEC_CORE
+   help
+This enables support for CPU PARK feature in
+order to save time of cpu down to up.
+CPU park is a state through kexec, spin loop
+instead of cpu die before jumping to new kernel,
+jumping out from loop to new kernel entry in
+smp_init.
+
 source "arch/arm64/Kconfig.platforms"
 
 menu "Kernel Features"
diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index 7faae6f..e616a50 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -68,4 +68,13 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info);
 void update_cpu_features(int cpu, struct cpuinfo_arm64 *info,
 struct cpuinfo_arm64 *boot);
 
+#ifdef CONFIG_ARM64_CPU_PARK
+#define PARK_SECTION_SIZE PAGE_SIZE
+extern unsigned long park_start;
+extern unsigned long park_len;
+extern unsigned long park_start_v;
+extern void __cpu_park(unsigned long text, unsigned long exit);
+extern void __do_cpu_park(unsigned long exit);
+#endif
+
 #endif /* __ASM_CPU_H */
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index d24b527..69a66ca 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -25,6 +25,11 @@
 
 #define KEXEC_ARCH KEXEC_ARCH_AARCH64
 
+#ifdef CONFIG_ARM64_CPU_PARK
+/* CPU park state flag: "park" */
+#define PARK_MAGIC 0x7061726b
+#endif
+
 #ifndef __ASSEMBLY__
 
 /**
@@ -90,6 +95,8 @@ static inline void crash_prepare_suspend(void) {}
 static inline void crash_post_resume(void) {}
 #endif
 
+void machine_kexec_mask_interrupts(void);
+
 #ifdef CONFIG_KEXEC_FILE
 #define ARCH_HAS_KIMAGE_ARCH
 
diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 2e7f529..9141fa8 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -145,6 +145,9 @@ bool cpus_are_stuck_in_kernel(void);
 
 extern void crash_smp_send_stop(void);
 extern bool smp_crash_stop_failed(void);
+#ifdef CONFIG_ARM64_CPU_PARK
+extern int kexec_smp_send_park(void);
+#endif
 
 #endif /* ifndef __ASSEMBLY__ */
 
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 86364ab..7ea26ab 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -51,6 +51,7 @@ obj-$(CONFIG_RANDOMIZE_BASE)  += kaslr.o
 obj-$(CONFIG_HIBERNATION)  += hibernate.o hibernate-asm.o
 obj-$(CONFIG_KEXEC_CORE)   += machine_kexec.o relocate_kernel.o \
   cpu-reset.o
+obj-$(CONFIG_ARM64_CPU_PARK)   += cpu-park.o
 obj-$(CONFIG_KEXEC_FILE)   += machine_kexec_file.o kexec_image.o
 obj-$(CONFIG_ARM64_RELOC_TEST) += arm64-reloc-test.o
 arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o
diff --git a/arch/arm64/kernel/cpu-park.S b/arch/arm64/kernel/cpu-park.S
new file mode 100644
index ..8c01484
--- /dev/null
+++ b/arch/arm64/kernel/cpu-park.S
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * CPU park routines
+ *
+ * Co

Re: [PATCH v4 10/10] selftests/vm: test faulting in kernel, and verify pinnable pages

2020-12-18 Thread John Hubbard


On 12/17/20 10:52 AM, Pavel Tatashin wrote:
>

Hi Pavel,

This all looks pretty good to me, with just a couple of minor
doubts interleaved with the documentation tweaks:

a) I'm not yet sure if the is_pinnable_page() concept is a keeper. If it's
not for some reason, then we should revisit this patch.

b) I don't yet understand why FOLL_TOUCH from gup/pup is a critical part
of the test.



When pages are pinned they can be faulted in userland and migrated, and
they can be faulted right in kernel without migration.

In either case, the pinned pages must end-up being pinnable (not movable).


Let's delete the above two sentences, which are confusing as currently
worded, and just keep approximately the last sentence below.



Add a new test without touching pages in userland, and use FOLL_TOUCH
instead. Also, verify that pinned pages are pinnable.


Maybe this instead:

Add a new test to gup_test, to verify that only "pinnable" pages are
pinned. Also, use gup/pup + FOLL_TOUCH to fault in the pages, rather
than faulting them in from user space.


?  But I don't know why that second point is important. Is it actually
important in order to have a valid test? If so, why?




Signed-off-by: Pavel Tatashin 
---
  mm/gup_test.c |  6 ++
  tools/testing/selftests/vm/gup_test.c | 17 +
  2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/mm/gup_test.c b/mm/gup_test.c
index 24c70c5814ba..24fd542091ee 100644
--- a/mm/gup_test.c
+++ b/mm/gup_test.c
@@ -52,6 +52,12 @@ static void verify_dma_pinned(unsigned int cmd, struct page **pages,
  
  dump_page(page, "gup_test failure");

break;
+   } else if (cmd == PIN_LONGTERM_BENCHMARK &&
+   WARN(!is_pinnable_page(page),
+"pages[%lu] is NOT pinnable but pinned\n",
+i)) {
+   dump_page(page, "gup_test failure");
+   break;
}
}
break;
diff --git a/tools/testing/selftests/vm/gup_test.c b/tools/testing/selftests/vm/gup_test.c
index 42c71483729f..f08cc97d424d 100644
--- a/tools/testing/selftests/vm/gup_test.c
+++ b/tools/testing/selftests/vm/gup_test.c
@@ -13,6 +13,7 @@
  
  /* Just the flags we need, copied from mm.h: */

  #define FOLL_WRITE0x01/* check pte is writable */
+#define FOLL_TOUCH 0x02/* mark page accessed */



Aha, now I see why you wanted to pass other GUP flags, in the previous
patch. I think it's OK to pass this set of possible flags (as
.gup_flags) through ioctl, yes.

However (this is about the previous patch), I *think* we're better off
leaving the gup_test behavior as: "default is read-only pages, but you
can pass in -w to specify FOLL_WRITE". As opposed to passing in raw
flags from the command line. And yes, I realize that my -F option seemed
to recommend the latter... I'm regretting that -F approach now.

The other direction to go might be to stop doing that, and shift over to
just let the user specify FOLL_* flags directly on the command line, but
IMHO there's no need for that (yet), and it's a little less error-prone
to constrain it.

This leads to: change the "-F 1", to some other better-named option,
perhaps. Open to suggestion there.
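
To make the constrained approach concrete, here is a minimal sketch in Python rather than in the selftest's C. It assumes one named option per flag: -w for FOLL_WRITE as described above, and -z for FOLL_TOUCH as in the option this patch adds. parse_gup_flags() is a hypothetical helper and the real gup_test option handling may differ.

```python
import getopt

FOLL_WRITE = 0x01  # check pte is writable (value from the patch)
FOLL_TOUCH = 0x02  # mark page accessed (value from the patch)


def parse_gup_flags(argv):
    """Build gup_flags from named options; read-only unless -w is given."""
    opts, _ = getopt.getopt(argv, "wz")
    gup_flags = 0
    for opt, _value in opts:
        if opt == "-w":
            gup_flags |= FOLL_WRITE
        elif opt == "-z":
            gup_flags |= FOLL_TOUCH
    return gup_flags


if __name__ == "__main__":
    print(hex(parse_gup_flags(["-w", "-z"])))
```

Keeping the mapping inside the tool means a caller can only request flag combinations the test was written to handle.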


  
  static char *cmd_to_str(unsigned long cmd)

  {
@@ -39,11 +40,11 @@ int main(int argc, char **argv)
unsigned long size = 128 * MB;
int i, fd, filed, opt, nr_pages = 1, thp = -1, repeats = 1, write = 1;
unsigned long cmd = GUP_FAST_BENCHMARK;
-   int flags = MAP_PRIVATE;
+   int flags = MAP_PRIVATE, touch = 0;



Silly nit, can we put it on its own line? This pre-existing mess of
declarations makes it hard to read everything. One item per line is
easier on the reader, who is often just looking for a single item at a
time. Actually why not rename it slightly while we're here (see below),
maybe to this:

int use_foll_touch = 0;



char *file = "/dev/zero";
char *p;
  
-	while ((opt = getopt(argc, argv, "m:r:n:F:f:abctTLUuwWSHp")) != -1) {

+   while ((opt = getopt(argc, argv, "m:r:n:F:f:abctTLUuwWSHpz")) != -1) {


Yes, this seems worth its own command line option.


switch (opt) {
case 'a':
cmd = PIN_FAST_BENCHMARK;
@@ -110,6 +111,10 @@ int main(int argc, char **argv)
case 'H':
flags |= (MAP_HUGETLB | MAP_ANONYMOUS);
break;
+   case 'z':
+   /* fault pages in gup, do not fault in userland */


How about:
/*
 * Use gup/pup(FOLL_TOUCH), *instead* of faulting
 * pages in from user space.
 */
use_foll_touch = 1;


+   touch = 1;
+

[PATCH] kconfig: config script: add a little user help

2020-12-18 Thread Randy Dunlap

Give the user a clue about the problem along with the 35 lines of
usage/help text.

Signed-off-by: Randy Dunlap 
Cc: Andi Kleen 
Cc: Masahiro Yamada 
Cc: linux-kbu...@vger.kernel.org
---
 scripts/config |1 +
 1 file changed, 1 insertion(+)

--- linux-next-20201218.orig/scripts/config
+++ linux-next-20201218/scripts/config
@@ -223,6 +223,7 @@ while [ "$1" != "" ] ; do
;;
 
*)
+   echo "bad cmd: $CMD"
usage
;;
esac

[PATCH] zlib: move EXPORT_SYMBOL() and MODULE_LICENSE() out of dfltcc_syms.c

2020-12-18 Thread Randy Dunlap

In 11fb479ff5d9 ("zlib: export S390 symbols for zlib modules"), I added
EXPORT_SYMBOL()s to dfltcc_inflate.c but then Mikhail said that these
should probably be in dfltcc_syms.c with the other EXPORT_SYMBOL()s.

However, that is contrary to the current kernel style, which places
EXPORT_SYMBOL() immediately after the function that it applies to,
so move all EXPORT_SYMBOL()s to their respective function locations
and drop the dfltcc_syms.c file. Also move MODULE_LICENSE() from the
deleted file to dfltcc.c.

Fixes: 11fb479ff5d9 ("zlib: export S390 symbols for zlib modules")
Signed-off-by: Randy Dunlap 
Cc: Zaslonko Mikhail 
Cc: Andrew Morton 
Cc: Acked-by: Ilya Leoshkevich 
Cc: Heiko Carstens 
Cc: Vasily Gorbik 
Cc: Christian Borntraeger 
---
 lib/zlib_dfltcc/dfltcc.c |6 +-
 lib/zlib_dfltcc/dfltcc_deflate.c |3 +++
 lib/zlib_dfltcc/dfltcc_syms.c|   17 -
 3 files changed, 8 insertions(+), 18 deletions(-)

--- linux-next-20201218.orig/lib/zlib_dfltcc/dfltcc.c
+++ linux-next-20201218/lib/zlib_dfltcc/dfltcc.c
@@ -1,7 +1,8 @@
 // SPDX-License-Identifier: Zlib
 /* dfltcc.c - SystemZ DEFLATE CONVERSION CALL support. */
 
-#include 
+#include 
+#include 
 #include "dfltcc_util.h"
 #include "dfltcc.h"
 
@@ -53,3 +54,6 @@ void dfltcc_reset(
 dfltcc_state->dht_threshold = DFLTCC_DHT_MIN_SAMPLE_SIZE;
 dfltcc_state->param.ribm = DFLTCC_RIBM;
 }
+EXPORT_SYMBOL(dfltcc_reset);
+
+MODULE_LICENSE("GPL");
--- linux-next-20201218.orig/lib/zlib_dfltcc/dfltcc_deflate.c
+++ linux-next-20201218/lib/zlib_dfltcc/dfltcc_deflate.c
@@ -4,6 +4,7 @@
 #include "dfltcc_util.h"
 #include "dfltcc.h"
 #include 
+#include 
 #include 
 
 /*
@@ -34,6 +35,7 @@ int dfltcc_can_deflate(
 
 return 1;
 }
+EXPORT_SYMBOL(dfltcc_can_deflate);
 
 static void dfltcc_gdht(
 z_streamp strm
@@ -277,3 +279,4 @@ again:
 goto again; /* deflate() must use all input or all output */
 return 1;
 }
+EXPORT_SYMBOL(dfltcc_deflate);
--- linux-next-20201218.orig/lib/zlib_dfltcc/dfltcc_syms.c
+++ /dev/null
@@ -1,17 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * linux/lib/zlib_dfltcc/dfltcc_syms.c
- *
- * Exported symbols for the s390 zlib dfltcc support.
- *
- */
-
-#include 
-#include 
-#include 
-#include "dfltcc.h"
-
-EXPORT_SYMBOL(dfltcc_can_deflate);
-EXPORT_SYMBOL(dfltcc_deflate);
-EXPORT_SYMBOL(dfltcc_reset);
-MODULE_LICENSE("GPL");

Re: [PATCH] cpufreq: intel_pstate: Use most recent guaranteed performance values

2020-12-18 Thread srinivas pandruvada

On Thu, 2020-12-17 at 20:17 +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki 
> 
> When turbo has been disabled by the BIOS, but HWP_CAP.GUARANTEED is
> changed later, user space may want to take advantage of this
> increased
> guaranteed performance.
> 
> HWP_CAP.GUARANTEED is not a static value.  It can be adjusted by an
> out-of-band agent or during an Intel Speed Select performance level
> change.  The HWP_CAP.MAX is still the maximum achievable performance
> with turbo disabled by the BIOS, so HWP_CAP.GUARANTEED can still
> change as long as it remains less than or equal to HWP_CAP.MAX.
> 
> When HWP_CAP.GUARANTEED is changed, the sysfs base_frequency
> attribute shows the most recent guaranteed frequency value. This
> attribute can be used by user space software to update the scaling
> min/max limits of the CPU.
> 
> Currently, the ->setpolicy() callback already uses the latest
> HWP_CAP values when setting HWP_REQ, but the ->verify() callback will
> restrict the user settings to the to old guaranteed performance value
> which prevents user space from making use of the extra CPU capacity
> theoretically available to it after increasing HWP_CAP.GUARANTEED.
> 
> To address this, read HWP_CAP in intel_pstate_verify_cpu_policy()
> to obtain the maximum P-state that can be used and use that to
> confine the policy max limit instead of using the cached and
> possibly stale pstate.max_freq value for this purpose.
> 
> For consistency, update intel_pstate_update_perf_limits() to use the
> maximum available P-state returned by intel_pstate_get_hwp_max() to
> compute the maximum frequency instead of using the return value of
> intel_pstate_get_max_freq() which, again, may be stale.
> 
> This issue is a side-effect of fixing the scaling frequency limits in
> commit eacc9c5a927e ("cpufreq: intel_pstate: Fix
> intel_pstate_get_hwp_max()
> for turbo disabled") which currected 
corrected

Thanks,
Srinivas

> the setting of the reduced scaling
> frequency values, but caused stale HWP_CAP.GUARANTEED to be used in
> the case at hand.
> 
> Fixes: eacc9c5a927e ("cpufreq: intel_pstate: Fix
> intel_pstate_get_hwp_max() for turbo disabled")
> Reported-by: Srinivas Pandruvada  >
> Tested-by: Srinivas Pandruvada 
> Cc: 5.8+  # 5.8+
> Signed-off-by: Rafael J. Wysocki 
> ---
>  drivers/cpufreq/intel_pstate.c |   16 +---
>  1 file changed, 13 insertions(+), 3 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> ===
> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> @@ -2207,9 +2207,9 @@ static void intel_pstate_update_perf_lim
>   unsigned int policy_min,
>   unsigned int policy_max)
>  {
> - int max_freq = intel_pstate_get_max_freq(cpu);
>   int32_t max_policy_perf, min_policy_perf;
>   int max_state, turbo_max;
> + int max_freq;
>  
>   /*
>* HWP needs some special consideration, because on BDX the
> @@ -2223,6 +2223,7 @@ static void intel_pstate_update_perf_lim
>   cpu->pstate.max_pstate : cpu-
> >pstate.turbo_pstate;
>   turbo_max = cpu->pstate.turbo_pstate;
>   }
> + max_freq = max_state * cpu->pstate.scaling;
>  
>   max_policy_perf = max_state * policy_max / max_freq;
>   if (policy_max == policy_min) {
> @@ -2325,9 +2326,18 @@ static void intel_pstate_adjust_policy_m
>  static void intel_pstate_verify_cpu_policy(struct cpudata *cpu,
>  struct cpufreq_policy_data
> *policy)
>  {
> + int max_freq;
> +
>   update_turbo_state();
> - cpufreq_verify_within_limits(policy, policy->cpuinfo.min_freq,
> -  intel_pstate_get_max_freq(cpu));
> + if (hwp_active) {
> + int max_state, turbo_max;
> +
> + intel_pstate_get_hwp_max(cpu->cpu, &turbo_max,
> &max_state);
> + max_freq = max_state * cpu->pstate.scaling;
> + } else {
> + max_freq = intel_pstate_get_max_freq(cpu);
> + }
> + cpufreq_verify_within_limits(policy, policy->cpuinfo.min_freq,
> max_freq);
>  
>   intel_pstate_adjust_policy_max(cpu, policy);
>  }
> 
> 
>

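The effect of a stale maximum on policy verification can be sketched in Python. This is a simplified model, not the driver code: verify_within_limits() only imitates clamping a requested policy range into [min_freq, max_freq], and the scaling factor and P-state numbers in the usage note are made-up illustrative values.

```python
def max_freq_khz(max_state: int, scaling: int) -> int:
    """Frequency limit derived from a P-state, as in max_state * scaling."""
    return max_state * scaling


def verify_within_limits(policy_min, policy_max, min_freq, max_freq):
    """Simplified model: clamp the requested range into [min_freq, max_freq]."""
    lo = min(max(policy_min, min_freq), max_freq)
    hi = min(max(policy_max, min_freq), max_freq)
    if lo > hi:
        lo = hi
    return lo, hi


if __name__ == "__main__":
    scaling = 100000                       # hypothetical kHz per P-state
    stale = max_freq_khz(20, scaling)      # hypothetical cached maximum
    fresh = max_freq_khz(24, scaling)      # hypothetical value read from HWP_CAP
    print(verify_within_limits(800000, 2400000, 800000, stale))
    print(verify_within_limits(800000, 2400000, 800000, fresh))
```

With the hypothetical stale limit the requested maximum is cut back, while the limit derived from a fresh read lets the same request through unchanged.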
Re: [PATCH v2 16/18] arm64: dts: hi3660: Harmonize DWC USB3 DT nodes name

2020-12-18 Thread John Stultz

On Wed, Nov 11, 2020 at 1:22 AM Serge Semin
 wrote:
>
> In accordance with the DWC USB3 bindings the corresponding node
> name is suppose to comply with the Generic USB HCD DT schema, which
> requires the USB nodes to have the name acceptable by the regexp:
> "^usb(@.*)?" . Make sure the "snps,dwc3"-compatible nodes are correctly
> named.
>
> Signed-off-by: Serge Semin 
> Acked-by: Krzysztof Kozlowski 
> ---
>  arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
> b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> index d25aac5e0bf8..aea3800029b5 100644
> --- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> +++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> @@ -1166,7 +1166,7 @@ usb_phy: usb-phy {
> };
> };
>
> -   dwc3: dwc3@ff10 {
> +   dwc3: usb@ff10 {
> compatible = "snps,dwc3";
> reg = <0x0 0xff10 0x0 0x10>;

Oof. So this patch is breaking the usb gadget functionality on HiKey960 w/ AOSP.

In order to choose the right controller for gadget mode with AOSP, one
sets the "sys.usb.controller" property, which until now for HiKey960
has been "ff10.dwc3".
After this patch, the controller isn't found and we would have to
change userland to use "ff10.usb", which would then break booting
on older kernels (testing various LTS releases on AOSP is one of the
key uses of the HiKey960).

So while I understand the desire to unify the schema, as HiKey960
really isn't likely to be used outside of AOSP, I wonder if reverting
this one change is in the best interest of not breaking existing
userland?

thanks
-john
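
The breakage can be reproduced on paper with a small Python sketch. The regular expression is the one quoted in the patch description; platform_device_name() is a hypothetical helper that assumes the controller name userland sees is formed as "<unit-address>.<node-name>", which matches the two names mentioned above. The unit address is written exactly as it appears in this thread.

```python
import re

# Node-name pattern quoted in the patch description.
USB_NODE_NAME = re.compile(r"^usb(@.*)?")


def complies_with_usb_schema(node: str) -> bool:
    """True if the node name is acceptable to the quoted pattern."""
    return USB_NODE_NAME.match(node) is not None


def platform_device_name(node: str) -> str:
    """Assumed naming: '<unit-address>.<node-name>' when a unit address exists."""
    name, _, unit = node.partition("@")
    return f"{unit}.{name}" if unit else name


if __name__ == "__main__":
    for node in ("dwc3@ff10", "usb@ff10"):
        print(node, complies_with_usb_schema(node), platform_device_name(node))
```

The rename satisfies the schema and, under the assumed naming, changes the string that sys.usb.controller has to match.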

[ANNOUNCE] Git v2.30.0-rc1

2020-12-18 Thread Junio C Hamano

A release candidate Git v2.30.0-rc1 is now available for testing
at the usual places.  It is comprised of 455 non-merge commits
since v2.29.0, contributed by 72 people, 26 of which are new faces.

The tarballs are found at:

https://www.kernel.org/pub/software/scm/git/testing/

The following public repositories all have a copy of the
'v2.30.0-rc1' tag and the 'master' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = https://github.com/gitster/git

New contributors whose contributions weren't in v2.29.0 are as follows.
Welcome to the Git development community!

  Alexey, Amanda Shafack, Bradley M. Kuhn, Caleb Tillman, Charvi
  Mendiratta, Daniel Duvall, Daniel Gurney, Dennis Ameling, Javier
  Spagnoletti, Jinoh Kang, Joey Salazar, Konrad Borowski, Marlon
  Rac Cambasis, Martin Schön, Michał Kępień, Nate Avers,
  Nipunn Koorapati, Rafael Silva, Robert Karszniewicz, Samuel
  Čavoj, Sean Barag, Sibo Dong, Simão Afonso, Sohom Datta,
  Thomas Koutcher, and Victor Engmark.

Returning contributors who helped this release are as follows.
Thanks for your continued support.

  Adam Spiers, Ævar Arnfjörð Bjarmason, Alex Vandiver, Arnout
  Engelen, brian m. carlson, Christian Couder, Chris. Webster,
  David Aguilar, Denton Liu, Derrick Stolee, Dimitriy Ryazantcev,
  Đoàn Trần Công Danh, Drew DeVault, Elijah Newren,
  Emily Shaffer, Eric Sunshine, Felipe Contreras, Han-Wen
  Nienhuys, Jeff Hostetler, Jeff King, Jiang Xin, Johannes
  Berg, Johannes Schindelin, Jonathan Tan, Josh Steadmon,
  Junio C Hamano, Kyle Meyer, Martin Ågren, Matheus Tavares,
  Nicolas Morey-Chaisemartin, Patrick Steinhardt, Peter Kaestle,
  Philippe Blain, Phillip Wood, Pranit Bauva, Pratyush Yadav,
  Ramsay Jones, Randall S. Becker, René Scharfe, Sergey Organov,
  Serg Tereshchenko, Srinidhi Kaushik, Stefan Haller, Štěpán
  Němec, SZEDER Gábor, and Taylor Blau.



Git 2.30 Release Notes (draft)
==

Updates since v2.29
---

UI, Workflows & Features

 * Userdiff for PHP update.

 * Userdiff for Rust update.

 * Userdiff for CSS update.

 * The command line completion script (in contrib/) learned that "git
   stash show" takes the options "git diff" takes.

 * "git worktree list" now shows if each worktree is locked.  This
   possibly may open us to show other kinds of states in the future.

 * "git maintenance", an extended big brother of "git gc", continues
   to evolve.

 * "git push --force-with-lease[=]" can easily be misused to lose
   commits unless the user takes good care of their own "git fetch".
   A new option "--force-if-includes" attempts to ensure that what is
   being force-pushed was created after examining the commit at the
   tip of the remote ref that is about to be force-replaced.

 * "git clone" learned clone.defaultremotename configuration variable
   to customize what nickname to use to call the remote the repository
   was cloned from.

 * "git checkout" learned to use checkout.guess configuration variable
   and enable/disable its "--[no-]guess" option accordingly.

 * "git resurrect" script (in contrib/) learned that the object names
   may be longer than 40-hex depending on the hash function in use.

 * "git diff A...B" learned "git diff --merge-base A B", which is a
   longer short-hand to say the same thing.

 * A sample 'push-to-checkout' hook, that performs the same as
   what the built-in default action does, has been added.

 * "git diff" family of commands learned the "-I" option to
   ignore hunks whose changed lines all match the given pattern.

 * The userdiff pattern learned to identify the function definition in
   POSIX shells and bash.

 * "git checkout-index" did not consistently signal an error with its
   exit status, but now it does.

 * A commit and tag object may have CR at the end of each and
   every line (you can create such an object with hash-object or
   using --cleanup=verbatim to decline the default clean-up
   action), but it would make it impossible to have a blank line
   to separate the title from the body of the message.  We are now
   more lenient and accept a line with lone CR on it as a blank line,
   too.

 * Exit codes from "git remote add" etc. were not usable by scripted
   callers, but now they are.

 * "git archive" now allows compression level higher than "-9"
   when generating tar.gz output.

 * Zsh autocompletion (in contrib/) update.

 * The maximum length of output filenames "git format-patch" creates
   has become configurable (used to be capped at 64).

 * "git rev-parse" learned the "--end-of-options" to help scripts to
   safely take a parameter that is supposed to be a revision, e.g.
   "git rev-parse --verify -q --end-of-options $rev".

 * The command line completion script (in contrib/) learned to expand
   commands that are alias of alias.

 * "git update-ref --stdin" learns

[PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-18 Thread Nadav Amit

From: Nadav Amit 

Userfaultfd self-tests fail occasionally, indicating a memory
corruption.

Analyzing this problem indicates that there is a real bug since
mmap_lock is only taken for read in mwriteprotect_range(). This might
cause the TLBs to be in an inconsistent state due to the deferred
batched TLB flushes.

Consider the following scenario with 3 CPUs (cpu2 is not shown):

cpu0                            cpu1

userfaultfd_writeprotect()
[ write-protecting ]
mwriteprotect_range()
 mmap_read_lock()
 change_protection()
  change_protection_range()
   ...
   change_pte_range()
   [ defer TLB flushes]
                                userfaultfd_writeprotect()
                                 mmap_read_lock()
                                 change_protection()
                                 [ write-unprotect ]
                                 ...
                                  [ unprotect PTE logically ]
                                ...
                                [ page-fault]
                                ...
                                wp_page_copy()
                                [ set new writable page in PTE]

At this point no TLB flush took place. cpu2 (not shown) might have a
stale writable PTE, which was cached in the TLB before cpu0 called
userfaultfd_writeprotect(), and this PTE points to a different page.

Therefore, write-protecting memory, even through userfaultfd, which
does not change the vma, requires preventing any concurrent reader (#PF
handler) from reading PTEs from the page-tables while any CPU might still
hold in its TLB a PTE with higher permissions for the same address. To
do so, mmap_lock needs to be taken for write.

Surprisingly, memory-unprotection using userfaultfd also poses a
problem. Although memory unprotection is logically a promotion of PTE
permissions, and therefore should not require a TLB flush, the current
code might actually cause a demotion of the permission, and therefore
requires a TLB flush.

During unprotection of userfaultfd managed memory region, the PTE is not
really made writable, but instead marked "logically" as writable, and
left for the #PF handler to be handled later. While this is ok, the code
currently also *removes* the write permission, and therefore makes it
necessary to flush the TLBs.

To resolve these problems, acquire mmap_lock for write when
write-protecting userfaultfd region using ioctl's. Keep taking mmap_lock
for read when unprotecting memory, but keep the write-bit set when
resolving userfaultfd write-protection.
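
The two decisions described above can be restated in isolation. This is
a minimal user-space sketch, not kernel code: choose_lock(),
keep_write_bit() and enum lock_mode are hypothetical names introduced
only for illustration, and they merely mirror the conditions that the
diff applies to mmap_lock and to preserve_write.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two ways mmap_lock can be taken. */
enum lock_mode { LOCK_READ, LOCK_WRITE };

/*
 * Write-protecting defers TLB flushes, so page faults must be kept out
 * until the flushes are done: take the lock for write. Unprotecting
 * only raises permissions, so the read lock is enough.
 */
static enum lock_mode choose_lock(bool enable_wp)
{
	return enable_wp ? LOCK_WRITE : LOCK_READ;
}

/*
 * Keep the hardware write bit when the PTE was already writable and the
 * change is either a NUMA-hinting update or a uffd-wp resolve, so that
 * unprotection never demotes permissions.
 */
static bool keep_write_bit(bool prot_numa, bool uffd_wp_resolve,
			   bool pte_was_writable)
{
	return (prot_numa || uffd_wp_resolve) && pte_was_writable;
}
```

With these conditions, a resolve on an already-writable PTE leaves the
write bit alone, which is why the unprotect path can keep using the
read lock.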

Cc: Peter Xu 
Cc: Andrea Arcangeli 
Cc: Pavel Emelyanov 
Cc: Mike Kravetz 
Cc: Mike Rapoport 
Cc: 
Fixes: 292924b26024 ("userfaultfd: wp: apply _PAGE_UFFD_WP bit")
Signed-off-by: Nadav Amit 
---
 mm/mprotect.c|  3 ++-
 mm/userfaultfd.c | 15 +--
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index ab709023e9aa..c08c4055b051 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -75,7 +75,8 @@ static unsigned long change_pte_range(struct vm_area_struct 
*vma, pmd_t *pmd,
oldpte = *pte;
if (pte_present(oldpte)) {
pte_t ptent;
-   bool preserve_write = prot_numa && pte_write(oldpte);
+   bool preserve_write = (prot_numa || uffd_wp_resolve) &&
+ pte_write(oldpte);
 
/*
 * Avoid trapping faults against the zero or KSM
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 9a3d451402d7..7423808640ef 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -652,7 +652,15 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned 
long start,
/* Does the address range wrap, or is the span zero-sized? */
BUG_ON(start + len <= start);
 
-   mmap_read_lock(dst_mm);
+   /*
+* Although we do not change the VMA, we have to ensure deferred TLB
+* flushes are performed before page-faults can be handled. Otherwise
+* we can get inconsistent TLB state.
+*/
+   if (enable_wp)
+   mmap_write_lock(dst_mm);
+   else
+   mmap_read_lock(dst_mm);
 
/*
 * If memory mappings are changing because of non-cooperative
@@ -686,6 +694,9 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned 
long start,
 
err = 0;
 out_unlock:
-   mmap_read_unlock(dst_mm);
+   if (enable_wp)
+   mmap_write_unlock(dst_mm);
+   else
+   mmap_read_unlock(dst_mm);
return err;
 }
-- 
2.25.1

Re: [PATCH 04/14] dt-bindings: display: bridge: Add i.MX8qm/qxp pixel combiner binding

2020-12-18 Thread Liu Ying

Hi,

On Fri, 2020-12-18 at 16:42 -0600, Rob Herring wrote:
> On Thu, Dec 17, 2020 at 7:48 PM Liu Ying  wrote:
> > 
> > Hi,
> > 
> > On Thu, 2020-12-17 at 12:50 -0600, Rob Herring wrote:
> > > On Thu, 17 Dec 2020 17:59:23 +0800, Liu Ying wrote:
> > > > This patch adds bindings for i.MX8qm/qxp pixel combiner.
> > > > 
> > > > Signed-off-by: Liu Ying 
> > > > ---
> > > >  .../display/bridge/fsl,imx8qxp-pixel-combiner.yaml | 160
> > > > +
> > > >  1 file changed, 160 insertions(+)
> > > >  create mode 100644
> > > > Documentation/devicetree/bindings/display/bridge/fsl,imx8qxp-
> > > > pixel-combiner.yaml
> > > > 
> > > 
> > > My bot found errors running 'make dt_binding_check' on your
> > > patch:
> > > 
> > > yamllint warnings/errors:
> > > 
> > > dtschema/dtc warnings/errors:
> > > Documentation/devicetree/bindings/display/bridge/fsl,imx8qxp-
> > > pixel-combiner.example.dts:19:18: fatal error: dt-
> > > bindings/clock/imx8-lpcg.h: No such file or directory
> > >19 | #include 
> > >   |  ^~~
> > > compilation terminated.
> > > make[1]: *** [scripts/Makefile.lib:342:
> > > Documentation/devicetree/bindings/display/bridge/fsl,imx8qxp-
> > > pixel-combiner.example.dt.yaml] Error 1
> > > make[1]: *** Waiting for unfinished jobs
> > > make: *** [Makefile:1364: dt_binding_check] Error 2
> > > 
> > > See 
> > > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.ozlabs.org%2Fpatch%2F1417599&data=04%7C01%7Cvictor.liu%40nxp.com%7C96806e0ce6bc40c936fa08d8a3a64551%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637439281816690986%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=Cjyszb0alRE5z2OGKdZZEg5PQpH11U%2BGqVt6couCLGE%3D&reserved=0
> > > 
> > > This check can fail if there are any dependencies. The base for a
> > > patch
> > > series is generally the most recent rc1.
> > 
> > This series can be applied to linux-next/master branch.
> 
> I can't know that to apply and run checks automatically. I guessed
> that reviewing this before sending, but I want it abundantly clear
> what the result of applying this might be and it wasn't mentioned in
> this patch.
> 
> Plus linux-next is a base no one can apply patches to, so should you
> be sending patches based on it? It's also the merge window, so maybe

I sent this series based on drm-misc-next.  This series is applicable
to linux-next/master, and may pass 'make dt_binding_check' there.

I'll mention dependencies in the future where similar situations
appear. Thanks.

BTW, does it make sense for the bot to additionally try linux-next if
needed?  Maybe, that'll be helpful?

Regards,
Liu Ying

> wait until rc1 when your dependency is in and the patch can actually
> be applied. Also, the drm-misc folks will still need to know they
> need
> to merge rc1 in before this is applied.
> 
> Rob

[tip:locking/urgent] BUILD SUCCESS 91ea62d58bd661827c328a2c6c02a87fa4aae88b

2020-12-18 Thread kernel test robot

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git  
locking/urgent
branch HEAD: 91ea62d58bd661827c328a2c6c02a87fa4aae88b  softirq: Avoid bad 
tracing / lockdep interaction

elapsed time: 723m

configs tested: 121
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm       defconfig
arm64     allyesconfig
arm64     defconfig
arm       allyesconfig
arm       allmodconfig
powerpc   tqm8541_defconfig
m68k      apollo_defconfig
arm       mini2440_defconfig
powerpc   canyonlands_defconfig
powerpc   mpc834x_mds_defconfig
powerpc   motionpro_defconfig
powerpc   ep88xc_defconfig
parisc    alldefconfig
arm       omap1_defconfig
mips      e55_defconfig
powerpc   mpc836x_rdk_defconfig
m68k      mvme147_defconfig
powerpc   mpc837x_mds_defconfig
powerpc   icon_defconfig
sh        urquell_defconfig
powerpc   mpc8560_ads_defconfig
mips      ci20_defconfig
sh        se7780_defconfig
arm       shannon_defconfig
mips      maltasmvp_defconfig
ia64      defconfig
powerpc   mpc834x_itxgp_defconfig
powerpc   sbc8548_defconfig
mips      maltaup_xpa_defconfig
m68k      defconfig
arm       spear3xx_defconfig
sh        apsh4a3a_defconfig
arm       tango4_defconfig
mips      decstation_64_defconfig
sh        se7206_defconfig
powerpc   holly_defconfig
arm       vf610m4_defconfig
mips      vocore2_defconfig
i386      alldefconfig
mips      fuloong2e_defconfig
riscv     defconfig
arm       mps2_defconfig
openrisc  defconfig
ia64      tiger_defconfig
parisc    generic-32bit_defconfig
arm       lpc32xx_defconfig
arm       viper_defconfig
ia64      allmodconfig
ia64      allyesconfig
m68k      allmodconfig
m68k      allyesconfig
nios2     defconfig
arc       allyesconfig
nds32     allnoconfig
c6x       allyesconfig
nds32     defconfig
nios2     allyesconfig
csky      defconfig
alpha     defconfig
alpha     allyesconfig
xtensa    allyesconfig
h8300     allyesconfig
arc       defconfig
sh        allmodconfig
parisc    defconfig
s390      allyesconfig
parisc    allyesconfig
s390      defconfig
i386      allyesconfig
sparc     allyesconfig
sparc     defconfig
i386      tinyconfig
i386      defconfig
mips      allyesconfig
mips      allmodconfig
powerpc   allyesconfig
powerpc   allmodconfig
powerpc   allnoconfig
x86_64    randconfig-a003-20201217
x86_64    randconfig-a006-20201217
x86_64    randconfig-a002-20201217
x86_64    randconfig-a005-20201217
x86_64    randconfig-a004-20201217
x86_64    randconfig-a001-20201217
i386      randconfig-a001-20201217
i386      randconfig-a004-20201217
i386      randconfig-a003-20201217
i386      randconfig-a002-20201217
i386      randconfig-a006-20201217
i386      randconfig-a005-20201217
x86_64    randconfig-a016-20201218
x86_64    randconfig-a013-20201218
x86_64    randconfig-a012-20201218
x86_64    randconfig-a015-20201218
x86_64    randconfig-a014-20201218
x86_64    randconfig-a011-20201218
i386      randconfig-a014-20201217
i386      randconfig-a013-20201217
i386      randconfig-a012-20201217
i386      randconfig-a011-20201217
i386

Re: [PATCH] dmaengine: qcom: bam_dma: Add LOCK and UNLOCK flag bit support

2020-12-18 Thread Thara Gopinath





On 12/17/20 9:37 AM, Md Sadre Alam wrote:

This change adds support for the LOCK and UNLOCK flag bits
on the CMD descriptor.

If the DMA_PREP_LOCK flag is passed in prep_slave_sg, the requester of
this transaction wants to lock the DMA controller for this transaction,
so the BAM driver should set the LOCK bit for the HW descriptor.

If the DMA_PREP_UNLOCK flag is passed in prep_slave_sg, the requester of
this transaction wants to unlock the DMA controller, so the BAM driver
should set the UNLOCK bit for the HW descriptor.

Hi,

This is a generic question. What is the point of LOCK/UNLOCK with
allocating LOCK groups to the individual dma channels? By default,
don't all channels fall in the same group? This would mean that
a lock does not prevent the dma controller from executing a
transaction on the other channels.

--
Warm Regards
Thara



Signed-off-by: Md Sadre Alam 
---
  Documentation/driver-api/dmaengine/provider.rst | 9 +
  drivers/dma/qcom/bam_dma.c  | 9 +
  include/linux/dmaengine.h   | 5 +
  3 files changed, 23 insertions(+)

diff --git a/Documentation/driver-api/dmaengine/provider.rst 
b/Documentation/driver-api/dmaengine/provider.rst
index ddb0a81..d7516e2 100644
--- a/Documentation/driver-api/dmaengine/provider.rst
+++ b/Documentation/driver-api/dmaengine/provider.rst
@@ -599,6 +599,15 @@ DMA_CTRL_REUSE
- This flag is only supported if the channel reports the DMA_LOAD_EOT
  capability.
  
+- DMA_PREP_LOCK

+
+  - If set , the client driver tells DMA controller I am locking you for
+this transcation.
+
+- DMA_PREP_UNLOCK
+
+  - If set, the client driver will tells DMA controller I am releasing the lock
+
  General Design Notes
  
  
diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c

index 4eeb8bb..cdbe395 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -58,6 +58,8 @@ struct bam_desc_hw {
  #define DESC_FLAG_EOB BIT(13)
  #define DESC_FLAG_NWD BIT(12)
  #define DESC_FLAG_CMD BIT(11)
+#define DESC_FLAG_LOCK BIT(10)
+#define DESC_FLAG_UNLOCK BIT(9)
  
  struct bam_async_desc {

struct virt_dma_desc vd;
@@ -644,6 +646,13 @@ static struct dma_async_tx_descriptor 
*bam_prep_slave_sg(struct dma_chan *chan,
  
  	/* fill in temporary descriptors */

desc = async_desc->desc;
+   if (flags & DMA_PREP_CMD) {
+   if (flags & DMA_PREP_LOCK)
+   desc->flags |= cpu_to_le16(DESC_FLAG_LOCK);
+   if (flags & DMA_PREP_UNLOCK)
+   desc->flags |= cpu_to_le16(DESC_FLAG_UNLOCK);
+   }
+
for_each_sg(sgl, sg, sg_len, i) {
unsigned int remainder = sg_dma_len(sg);
unsigned int curr_offset = 0;
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index dd357a7..79ccadb4 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -190,6 +190,9 @@ struct dma_interleaved_template {
   *  transaction is marked with DMA_PREP_REPEAT will cause the new transaction
   *  to never be processed and stay in the issued queue forever. The flag is
   *  ignored if the previous transaction is not a repeated transaction.
+ * @DMA_PREP_LOCK: tell the driver that DMA HW engine going to be locked for 
this
+ *  transaction , until not seen DMA_PREP_UNLOCK flag set.
+ * @DMA_PREP_UNLOCK: tell the driver to unlock the DMA HW engine.
   */
  enum dma_ctrl_flags {
DMA_PREP_INTERRUPT = (1 << 0),
@@ -202,6 +205,8 @@ enum dma_ctrl_flags {
DMA_PREP_CMD = (1 << 7),
DMA_PREP_REPEAT = (1 << 8),
DMA_PREP_LOAD_EOT = (1 << 9),
+   DMA_PREP_LOCK = (1 << 10),
+   DMA_PREP_UNLOCK = (1 << 11),
  };
  
  /**

[PATCH v15 3/3] scsi: ufs: Prepare HPB read for cached sub-region

2020-12-18 Thread Daejun Park

This patch changes the read I/O to the HPB read I/O.

If the logical address of the read I/O belongs to an active sub-region, the
HPB driver modifies the read I/O command to HPB read. It modifies the UPIU
command of UFS instead of modifying the existing SCSI command.

In HPB version 1.0, the maximum read I/O size that can be converted to
HPB read is 4KB.

The dirty map of the active sub-region prevents an incorrect HPB read that
would carry a stale physical page number, i.e. one that a previous write
I/O has since changed.
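
The role of the dirty map can be illustrated with a small stand-alone
sketch. This is not the driver code: mark_dirty(), range_is_dirty() and
can_use_hpb_read() are hypothetical helpers operating on a plain
uint64_t array, assuming one bit per cached L2P entry. A write marks the
entries it touches as dirty, and a read may only be converted to an HPB
read when none of its entries are dirty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD 64u

/* Mark entries [start, start + cnt) dirty, as a write to them would. */
static void mark_dirty(uint64_t *map, size_t start, size_t cnt)
{
	assert(map != NULL);
	for (size_t i = start; i < start + cnt; i++)
		map[i / BITS_PER_WORD] |= UINT64_C(1) << (i % BITS_PER_WORD);
}

/* Return true if any entry in [start, start + cnt) is dirty. */
static bool range_is_dirty(const uint64_t *map, size_t start, size_t cnt)
{
	assert(map != NULL);
	for (size_t i = start; i < start + cnt; i++)
		if (map[i / BITS_PER_WORD] &
		    (UINT64_C(1) << (i % BITS_PER_WORD)))
			return true;
	return false;
}

/* An HPB read is only safe when no cached entry in the range is stale. */
static bool can_use_hpb_read(const uint64_t *map, size_t start, size_t cnt)
{
	return !range_is_dirty(map, start, cnt);
}
```

A read that overlaps even one dirty entry falls back to a normal read,
which costs performance but never returns data through a stale mapping.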

Reviewed-by: Can Guo 
Reviewed-by: Bart Van Assche 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Signed-off-by: Daejun Park 
---
 drivers/scsi/ufs/ufshcd.c |   2 +
 drivers/scsi/ufs/ufshpb.c | 232 ++
 drivers/scsi/ufs/ufshpb.h |   2 +
 3 files changed, 236 insertions(+)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 0ec0ed237140..41554afead63 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -2600,6 +2600,8 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, 
struct scsi_cmnd *cmd)
 
ufshcd_comp_scsi_upiu(hba, lrbp);
 
+   ufshpb_prep(hba, lrbp);
+
err = ufshcd_map_sg(hba, lrbp);
if (err) {
lrbp->cmd = NULL;
diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index 65b7760c0b07..725bcff3a5c7 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c
@@ -31,6 +31,29 @@ bool ufshpb_is_allowed(struct ufs_hba *hba)
return !(hba->ufshpb_dev.hpb_disabled);
 }
 
+static int ufshpb_is_valid_srgn(struct ufshpb_region *rgn,
+struct ufshpb_subregion *srgn)
+{
+   return rgn->rgn_state != HPB_RGN_INACTIVE &&
+   srgn->srgn_state == HPB_SRGN_VALID;
+}
+
+static bool ufshpb_is_read_cmd(struct scsi_cmnd *cmd)
+{
+   return req_op(cmd->request) == REQ_OP_READ;
+}
+
+static bool ufshpb_is_write_or_discard_cmd(struct scsi_cmnd *cmd)
+{
+   return op_is_write(req_op(cmd->request)) ||
+  op_is_discard(req_op(cmd->request));
+}
+
+static bool ufshpb_is_support_chunk(int transfer_len)
+{
+   return transfer_len <= HPB_MULTI_CHUNK_HIGH;
+}
+
 static bool ufshpb_is_general_lun(int lun)
 {
return lun < UFS_UPIU_MAX_UNIT_NUM_ID;
@@ -98,6 +121,215 @@ static void ufshpb_set_state(struct ufshpb_lu *hpb, int 
state)
atomic_set(&hpb->hpb_state, state);
 }
 
+static void ufshpb_set_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
+int srgn_idx, int srgn_offset, int cnt)
+{
+   struct ufshpb_region *rgn;
+   struct ufshpb_subregion *srgn;
+   int set_bit_len;
+   int bitmap_len = hpb->entries_per_srgn;
+
+next_srgn:
+   rgn = hpb->rgn_tbl + rgn_idx;
+   srgn = rgn->srgn_tbl + srgn_idx;
+
+   if ((srgn_offset + cnt) > bitmap_len)
+   set_bit_len = bitmap_len - srgn_offset;
+   else
+   set_bit_len = cnt;
+
+   if (rgn->rgn_state != HPB_RGN_INACTIVE &&
+   srgn->srgn_state == HPB_SRGN_VALID)
+   bitmap_set(srgn->mctx->ppn_dirty, srgn_offset, set_bit_len);
+
+   srgn_offset = 0;
+   if (++srgn_idx == hpb->srgns_per_rgn) {
+   srgn_idx = 0;
+   rgn_idx++;
+   }
+
+   cnt -= set_bit_len;
+   if (cnt > 0)
+   goto next_srgn;
+
+   WARN_ON(cnt < 0);
+}
+
+static bool ufshpb_test_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
+  int srgn_idx, int srgn_offset, int cnt)
+{
+   struct ufshpb_region *rgn;
+   struct ufshpb_subregion *srgn;
+   int bitmap_len = hpb->entries_per_srgn;
+   int bit_len;
+
+next_srgn:
+   rgn = hpb->rgn_tbl + rgn_idx;
+   srgn = rgn->srgn_tbl + srgn_idx;
+
+   if (!ufshpb_is_valid_srgn(rgn, srgn))
+   return true;
+
+   /*
+* If the region state is active, mctx must be allocated.
+* In this case, check whether the region is evicted or
+* mctx allocation failed.
+*/
+   WARN_ON(!srgn->mctx);
+
+   if ((srgn_offset + cnt) > bitmap_len)
+   bit_len = bitmap_len - srgn_offset;
+   else
+   bit_len = cnt;
+
+   if (find_next_bit(srgn->mctx->ppn_dirty,
+ bit_len, srgn_offset) >= srgn_offset)
+   return true;
+
+   srgn_offset = 0;
+   if (++srgn_idx == hpb->srgns_per_rgn) {
+   srgn_idx = 0;
+   rgn_idx++;
+   }
+
+   cnt -= bit_len;
+   if (cnt > 0)
+   goto next_srgn;
+
+   return false;
+}
+
+static u64 ufshpb_get_ppn(struct ufshpb_lu *hpb,
+ struct ufshpb_map_ctx *mctx, int pos, int *error)
+{
+   u64 *ppn_table;
+   struct page *page;
+   int index, offset;
+
+   index = pos / (PAGE_SIZE / HPB_ENTRY_SIZE);
+   offset = pos % (PAGE_SIZE / HPB_ENTRY_SIZE);
+
+   page = mctx->m_page[index];
+   if (unlikely(!page)) {
+

[PATCH 2/6] drivers: crypto: qce: sha: Hold back a block of data to be transferred as part of final

2020-12-18 Thread Thara Gopinath

If the available data to transfer is exactly a multiple of block size, save
the last block to be transferred in qce_ahash_final (with the last block
bit set) if this is indeed the end of the data stream. If not, this saved
block will be transferred as part of the next update. If this block is not
held back and this is indeed the end of the data stream, the digest
obtained will be wrong: qce_ahash_final will see that rctx->buflen is 0
and return without doing anything, so no digest is copied to the
destination result buffer. qce_ahash_final cannot be changed to proceed
when rctx->buflen is 0, because the crypto engine BAM does not allow
zero-length transfers.
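
The hold-back rule amounts to a small piece of arithmetic. The following
is a stand-alone sketch of it, not the driver function:
bytes_to_hold_back() is a hypothetical name, and it assumes that both
the total amount of buffered data and the block size are non-zero.

```c
#include <assert.h>

/*
 * Number of trailing bytes to keep buffered after an update so that
 * the final step always has data to send.
 */
static unsigned int bytes_to_hold_back(unsigned int total,
				       unsigned int blocksize)
{
	unsigned int hash_later;

	assert(total > 0 && blocksize > 0);

	hash_later = total % blocksize;
	/* Exact multiple: keep one whole block instead of nothing. */
	if (hash_later == 0)
		hash_later = blocksize;
	return hash_later;
}
```

For a 64-byte block size, 130 buffered bytes leave 2 bytes for later,
while 128 buffered bytes leave a full 64-byte block rather than zero.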

Signed-off-by: Thara Gopinath 
---
 drivers/crypto/qce/sha.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/crypto/qce/sha.c b/drivers/crypto/qce/sha.c
index b8428da6716d..02d89267a806 100644
--- a/drivers/crypto/qce/sha.c
+++ b/drivers/crypto/qce/sha.c
@@ -193,6 +193,25 @@ static int qce_ahash_update(struct ahash_request *req)
 
/* calculate how many bytes will be hashed later */
hash_later = total % blocksize;
+
+   /*
+* At this point, there is more than one block size of data.  If
+* the available data to transfer is exactly a multiple of block
+* size, save the last block to be transferred in qce_ahash_final
+* (with the last block bit set) if this is indeed the end of data
+* stream. If not this saved block will be transferred as part of
+* next update. If this block is not held back and if this is
+* indeed the end of data stream, the digest obtained will be wrong
+* since qce_ahash_final will see that rctx->buflen is 0 and return
+* doing nothing which in turn means that a digest will not be
+* copied to the destination result buffer.  qce_ahash_final cannot
+* be made to alter this behavior and allowed to proceed if
+* rctx->buflen is 0 because the crypto engine BAM does not allow
+* for zero length transfers.
+*/
+   if (!hash_later)
+   hash_later = blocksize;
+
if (hash_later) {
unsigned int src_offset = req->nbytes - hash_later;
scatterwalk_map_and_copy(rctx->buf, req->src, src_offset,
-- 
2.25.1

[PATCH v15 2/3] scsi: ufs: L2P map management for HPB read

2020-12-18 Thread Daejun Park

This is a patch for managing the L2P map in the HPB module.

The HPB divides logical addresses into several regions. A region consists
of several sub-regions. The sub-region is the basic unit in which L2P
mapping is managed. The driver loads L2P mapping data per sub-region; a
loaded sub-region is said to be in the active state. The HPB driver
unloads L2P mapping data per region; an unloaded region is said to be in
the inactive state.
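
The region/sub-region hierarchy can be pictured as index arithmetic on a
logical page number. The sketch below is an illustration only: struct
hpb_pos and hpb_locate() are hypothetical, and a simple linear layout is
assumed in which every sub-region holds entries_per_srgn entries and
every region holds srgns_per_rgn sub-regions. The driver may compute
these values differently, for example with shifts and masks.

```c
#include <assert.h>
#include <stdint.h>

struct hpb_pos {
	uint32_t rgn_idx;     /* region containing the page */
	uint32_t srgn_idx;    /* sub-region within that region */
	uint32_t srgn_offset; /* entry within that sub-region */
};

/* Split a logical page number into region, sub-region and offset. */
static struct hpb_pos hpb_locate(uint64_t lpn, uint32_t entries_per_srgn,
				 uint32_t srgns_per_rgn)
{
	struct hpb_pos pos;
	uint64_t srgn_global;

	assert(entries_per_srgn > 0 && srgns_per_rgn > 0);

	/* Index of the sub-region counted across all regions. */
	srgn_global = lpn / entries_per_srgn;

	pos.srgn_offset = (uint32_t)(lpn % entries_per_srgn);
	pos.rgn_idx = (uint32_t)(srgn_global / srgns_per_rgn);
	pos.srgn_idx = (uint32_t)(srgn_global % srgns_per_rgn);
	return pos;
}
```

With 4096 entries per sub-region and 4 sub-regions per region (values
chosen only for the example), logical page 20487 lands in region 1,
sub-region 1, at offset 7.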

Sub-region/region candidates to be loaded and unloaded are delivered from
the UFS device. The UFS device delivers the recommended active sub-regions
and inactive regions to the driver using sense data.
The HPB module performs L2P mapping management on the host through the
delivered information.

A pinned region is a pre-set region on the UFS device that is always
in the active state.

The data structures for map data requests and the L2P map use the mempool
API, minimizing allocation overhead while avoiding static allocation.

The minimum size of the memory pool used in the HPB is implemented
as a module parameter, so that it can be configured by the user.

To guarantee a minimum memory pool size of 4MB: ufshpb_host_map_kbytes=4096

The map_work manages active/inactive by 2 "to-do" lists.
Each hpb lun maintains 2 "to-do" lists:
  hpb->lh_inact_rgn - regions to be inactivated, and
  hpb->lh_act_srgn - subregions to be activated
Those lists are maintained on IO completion.

Reviewed-by: Bart Van Assche 
Reviewed-by: Can Guo 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Signed-off-by: Daejun Park 
---
 drivers/scsi/ufs/ufs.h|   36 ++
 drivers/scsi/ufs/ufshcd.c |3 +
 drivers/scsi/ufs/ufshpb.c | 1003 -
 drivers/scsi/ufs/ufshpb.h |   63 ++-
 4 files changed, 1089 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h
index cbab6f54eb12..de89c2182638 100644
--- a/drivers/scsi/ufs/ufs.h
+++ b/drivers/scsi/ufs/ufs.h
@@ -472,6 +472,41 @@ struct utp_cmd_rsp {
u8 sense_data[UFS_SENSE_SIZE];
 };
 
+struct ufshpb_active_field {
+   __be16 active_rgn;
+   __be16 active_srgn;
+};
+#define HPB_ACT_FIELD_SIZE 4
+
+/**
+ * struct utp_hpb_rsp - Response UPIU structure
+ * @residual_transfer_count: Residual transfer count DW-3
+ * @reserved1: Reserved double words DW-4 to DW-7
+ * @sense_data_len: Sense data length DW-8 U16
+ * @desc_type: Descriptor type of sense data
+ * @additional_len: Additional length of sense data
+ * @hpb_op: HPB operation type
+ * @reserved2: Reserved field
+ * @active_rgn_cnt: Active region count
+ * @inactive_rgn_cnt: Inactive region count
+ * @hpb_active_field: Recommended to read HPB region and subregion
+ * @hpb_inactive_field: To be inactivated HPB region and subregion
+ */
+struct utp_hpb_rsp {
+   __be32 residual_transfer_count;
+   __be32 reserved1[4];
+   __be16 sense_data_len;
+   u8 desc_type;
+   u8 additional_len;
+   u8 hpb_op;
+   u8 reserved2;
+   u8 active_rgn_cnt;
+   u8 inactive_rgn_cnt;
+   struct ufshpb_active_field hpb_active_field[2];
+   __be16 hpb_inactive_field[2];
+};
+#define UTP_HPB_RSP_SIZE 40
+
 /**
  * struct utp_upiu_rsp - general upiu response structure
  * @header: UPIU header structure DW-0 to DW-2
@@ -482,6 +517,7 @@ struct utp_upiu_rsp {
struct utp_upiu_header header;
union {
struct utp_cmd_rsp sr;
+   struct utp_hpb_rsp hr;
struct utp_upiu_query qr;
};
 };
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index d8eb45ca485a..0ec0ed237140 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -4945,6 +4945,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct 
ufshcd_lrb *lrbp)
 */
pm_runtime_get_noresume(hba->dev);
}
+
+   if (scsi_status == SAM_STAT_GOOD)
+   ufshpb_rsp_upiu(hba, lrbp);
break;
case UPIU_TRANSACTION_REJECT_UPIU:
/* TODO: handle Reject UPIU Response */
diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index a87f5e9ddb05..65b7760c0b07 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c
@@ -16,11 +16,73 @@
 #include "ufshpb.h"
 #include "../sd.h"
 
+/* memory management */
+static struct kmem_cache *ufshpb_mctx_cache;
+static mempool_t *ufshpb_mctx_pool;
+static mempool_t *ufshpb_page_pool;
+/* A cache size of 2MB can cache ppn in the 1GB range. */
+static unsigned int ufshpb_host_map_kbytes = 2048;
+static int tot_active_srgn_pages;
+
+static struct workqueue_struct *ufshpb_wq;
+
 bool ufshpb_is_allowed(struct ufs_hba *hba)
 {
return !(hba->ufshpb_dev.hpb_disabled);
 }
 
+static bool ufshpb_is_general_lun(int lun)
+{
+   return lun < UFS_UPIU_MAX_UNIT_NUM_ID;
+}
+
+static bool
+ufshpb_is_pinned_region(struct ufshpb_lu *hpb, int rgn_idx)
+{
+   if (hpb->lu_pinn

[PATCH 5/6] drivers: crypto: qce: Remove src_tbl from qce_cipher_reqctx

2020-12-18 Thread Thara Gopinath

src_tbl is unused; remove it from struct qce_cipher_reqctx.

Signed-off-by: Thara Gopinath 
---
 drivers/crypto/qce/cipher.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/crypto/qce/cipher.h b/drivers/crypto/qce/cipher.h
index cffa9fc628ff..850f257d00f3 100644
--- a/drivers/crypto/qce/cipher.h
+++ b/drivers/crypto/qce/cipher.h
@@ -40,7 +40,6 @@ struct qce_cipher_reqctx {
struct scatterlist result_sg;
struct sg_table dst_tbl;
struct scatterlist *dst_sg;
-   struct sg_table src_tbl;
struct scatterlist *src_sg;
unsigned int cryptlen;
struct skcipher_request fallback_req;   // keep at the end
-- 
2.25.1

[PATCH 4/6] drivers: crypto: qce: common: Set data unit size to message length for AES XTS transformation

2020-12-18 Thread Thara Gopinath

Set the register REG_ENCR_XTS_DU_SIZE to cryptlen for AES XTS
transformation. Anything else causes the engine to return
wrong results.

Signed-off-by: Thara Gopinath 
---
 drivers/crypto/qce/common.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/qce/common.c b/drivers/crypto/qce/common.c
index 5006e74c40cd..7ae0b779563f 100644
--- a/drivers/crypto/qce/common.c
+++ b/drivers/crypto/qce/common.c
@@ -294,15 +294,15 @@ static void qce_xtskey(struct qce_device *qce, const u8 
*enckey,
 {
u32 xtskey[QCE_MAX_CIPHER_KEY_SIZE / sizeof(u32)] = {0};
unsigned int xtsklen = enckeylen / (2 * sizeof(u32));
-   unsigned int xtsdusize;
 
qce_cpu_to_be32p_array((__be32 *)xtskey, enckey + enckeylen / 2,
   enckeylen / 2);
qce_write_array(qce, REG_ENCR_XTS_KEY0, xtskey, xtsklen);
 
-   /* xts du size 512B */
-   xtsdusize = min_t(u32, QCE_SECTOR_SIZE, cryptlen);
-   qce_write(qce, REG_ENCR_XTS_DU_SIZE, xtsdusize);
+   /* Set data unit size to cryptlen. Anything else causes
+* crypto engine to return back incorrect results.
+*/
+   qce_write(qce, REG_ENCR_XTS_DU_SIZE, cryptlen);
 }
 
 static int qce_setup_regs_skcipher(struct crypto_async_request *async_req,
-- 
2.25.1

[PATCH 3/6] drivers: crypto: qce: skcipher: Fix regressions found during fuzz testing

2020-12-18 Thread Thara Gopinath

This patch contains the following fixes for the supported encryption
algorithms in the Qualcomm crypto engine (CE):
1. Return unsupported if key1 = key2 for the AES XTS algorithm, since CE
does not support this and the operation causes the engine to hang.
2. Return unsupported if any two of the three keys are the same for DES3
algorithms, since CE does not support this and the operation causes the
engine to hang.
3. Return unsupported for 0 length plain texts, since the crypto engine
BAM DMA does not support 0 length data.
4. ECB messages do not have an IV; hence set the ivsize to 0.
5. Ensure that the data passed for ECB/CBC encryption/decryption is
blocksize aligned. Otherwise the CE hangs on the operation.
6. Allow messages of length less than 512 bytes for all encryption
algorithms other than AES XTS. The recommendation is only for AES XTS
to have data size greater than 512 bytes.
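
The key checks in items 1 and 2 can be sketched in stand-alone form.
This is an illustration under stated assumptions, not the driver code:
xts_key_halves_equal() and des3_has_duplicate_subkey() are hypothetical
helpers, an XTS key is assumed to be two equal-length halves, and a
3DES key is assumed to be three consecutive 8-byte DES keys. The sketch
compares bytes with memcmp(), whereas the patch compares 32-bit words.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DES_SUBKEY_LEN 8 /* one DES key; a 3DES key is three of these */

/* XTS: the key is two halves of equal length; report when they match. */
static bool xts_key_halves_equal(const uint8_t *key, size_t keylen)
{
	size_t half = keylen / 2;

	assert(key != NULL && keylen % 2 == 0);
	return memcmp(key, key + half, half) == 0;
}

/* 3DES: report when any two of K1, K2, K3 are identical. */
static bool des3_has_duplicate_subkey(const uint8_t *key)
{
	const uint8_t *k1 = key;
	const uint8_t *k2 = key + DES_SUBKEY_LEN;
	const uint8_t *k3 = key + 2 * DES_SUBKEY_LEN;

	assert(key != NULL);
	return memcmp(k1, k2, DES_SUBKEY_LEN) == 0 ||
	       memcmp(k2, k3, DES_SUBKEY_LEN) == 0 ||
	       memcmp(k1, k3, DES_SUBKEY_LEN) == 0;
}
```

A setkey routine built on such checks would reject the key before it is
ever programmed into the engine, which is what avoids the hang.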

Signed-off-by: Thara Gopinath 
---
 drivers/crypto/qce/skcipher.c | 68 ++-
 1 file changed, 60 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index a2d3da0ad95f..936bfb7c769b 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -167,16 +167,32 @@ static int qce_skcipher_setkey(struct crypto_skcipher 
*ablk, const u8 *key,
struct crypto_tfm *tfm = crypto_skcipher_tfm(ablk);
struct qce_cipher_ctx *ctx = crypto_tfm_ctx(tfm);
unsigned long flags = to_cipher_tmpl(ablk)->alg_flags;
+   unsigned int __keylen;
int ret;
 
if (!key || !keylen)
return -EINVAL;
 
-   switch (IS_XTS(flags) ? keylen >> 1 : keylen) {
+   /*
+* AES XTS key1 = key2 not supported by crypto engine.
+* Revisit to request a fallback cipher in this case.
+*/
+   if (IS_XTS(flags)) {
+   __keylen = keylen >> 1;
+   if (!memcmp(key, key + __keylen, __keylen))
+   return -EINVAL;
+   } else {
+   __keylen = keylen;
+   }
+   switch (__keylen) {
case AES_KEYSIZE_128:
case AES_KEYSIZE_256:
memcpy(ctx->enc_key, key, keylen);
break;
+   case AES_KEYSIZE_192:
+   break;
+   default:
+   return -EINVAL;
}
 
ret = crypto_skcipher_setkey(ctx->fallback, key, keylen);
@@ -204,12 +220,27 @@ static int qce_des3_setkey(struct crypto_skcipher *ablk, 
const u8 *key,
   unsigned int keylen)
 {
struct qce_cipher_ctx *ctx = crypto_skcipher_ctx(ablk);
+   u32 _key[6];
int err;
 
err = verify_skcipher_des3_key(ablk, key);
if (err)
return err;
 
+   /*
+* The crypto engine does not support any two keys
+* being the same for triple des algorithms. The
+* verify_skcipher_des3_key does not check for all the
+* below conditions. Return -ENOKEY in case any two keys
+* are the same. Revisit to see if a fallback cipher
+* is needed to handle this condition.
+*/
+   memcpy(_key, key, DES3_EDE_KEY_SIZE);
+   if (!((_key[0] ^ _key[2]) | (_key[1] ^ _key[3])) ||
+   !((_key[2] ^ _key[4]) | (_key[3] ^ _key[5])) ||
+   !((_key[0] ^ _key[4]) | (_key[1] ^ _key[5])))
+   return -ENOKEY;
+
ctx->enc_keylen = keylen;
memcpy(ctx->enc_key, key, keylen);
return 0;
@@ -221,6 +252,7 @@ static int qce_skcipher_crypt(struct skcipher_request *req, 
int encrypt)
struct qce_cipher_ctx *ctx = crypto_skcipher_ctx(tfm);
struct qce_cipher_reqctx *rctx = skcipher_request_ctx(req);
struct qce_alg_template *tmpl = to_cipher_tmpl(tfm);
+   unsigned int blocksize = crypto_skcipher_blocksize(tfm);
int keylen;
int ret;
 
@@ -228,14 +260,34 @@ static int qce_skcipher_crypt(struct skcipher_request 
*req, int encrypt)
rctx->flags |= encrypt ? QCE_ENCRYPT : QCE_DECRYPT;
keylen = IS_XTS(rctx->flags) ? ctx->enc_keylen >> 1 : ctx->enc_keylen;
 
-   /* qce is hanging when AES-XTS request len > QCE_SECTOR_SIZE and
-* is not a multiple of it; pass such requests to the fallback
+   /* CE does not handle 0 length messages */
+   if (!req->cryptlen)
+   return -EINVAL;
+
+   /*
+* ECB and CBC algorithms require message lengths to be
+* multiples of block size.
+* TODO: The spec says AES CBC mode for certain versions
+* of crypto engine can handle partial blocks as well.
+* Test and enable such messages.
+*/
+   if (IS_ECB(rctx->flags) || IS_CBC(rctx->flags))
+   if (!IS_ALIGNED(req->cryptlen, blocksize))
+   return -EINVAL;
+
+   /*
+* Conditions for requesting a fallback cipher
+* AES-192 (not supported by crypto engine (CE))
+* AES-XTS request with len <= 512 byte (not recommended to use

[PATCH 6/6] drivers: crypto: qce: Remove totallen and offset in qce_start

2020-12-18 Thread Thara Gopinath

totallen is used to get the size of the data to be transformed.
This is also available via nbytes or cryptlen in the qce_sha_reqctx
and qce_cipher_reqctx. Similarly, offset conveys nothing for the supported
encryption and authentication transformations and is always 0.
Remove these two redundant parameters in qce_start.

Signed-off-by: Thara Gopinath 
---
 drivers/crypto/qce/common.c   | 17 +++--
 drivers/crypto/qce/common.h   |  3 +--
 drivers/crypto/qce/sha.c  |  2 +-
 drivers/crypto/qce/skcipher.c |  2 +-
 4 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/drivers/crypto/qce/common.c b/drivers/crypto/qce/common.c
index 7ae0b779563f..e292d8453ef6 100644
--- a/drivers/crypto/qce/common.c
+++ b/drivers/crypto/qce/common.c
@@ -139,8 +139,7 @@ static u32 qce_auth_cfg(unsigned long flags, u32 key_size)
return cfg;
 }
 
-static int qce_setup_regs_ahash(struct crypto_async_request *async_req,
-   u32 totallen, u32 offset)
+static int qce_setup_regs_ahash(struct crypto_async_request *async_req)
 {
struct ahash_request *req = ahash_request_cast(async_req);
struct crypto_ahash *ahash = __crypto_ahash_cast(async_req->tfm);
@@ -305,8 +304,7 @@ static void qce_xtskey(struct qce_device *qce, const u8 
*enckey,
qce_write(qce, REG_ENCR_XTS_DU_SIZE, cryptlen);
 }
 
-static int qce_setup_regs_skcipher(struct crypto_async_request *async_req,
-u32 totallen, u32 offset)
+static int qce_setup_regs_skcipher(struct crypto_async_request *async_req)
 {
struct skcipher_request *req = skcipher_request_cast(async_req);
struct qce_cipher_reqctx *rctx = skcipher_request_ctx(req);
@@ -366,7 +364,7 @@ static int qce_setup_regs_skcipher(struct 
crypto_async_request *async_req,
 
qce_write(qce, REG_ENCR_SEG_CFG, encr_cfg);
qce_write(qce, REG_ENCR_SEG_SIZE, rctx->cryptlen);
-   qce_write(qce, REG_ENCR_SEG_START, offset & 0x);
+   qce_write(qce, REG_ENCR_SEG_START, 0);
 
if (IS_CTR(flags)) {
qce_write(qce, REG_CNTR_MASK, ~0);
@@ -375,7 +373,7 @@ static int qce_setup_regs_skcipher(struct 
crypto_async_request *async_req,
qce_write(qce, REG_CNTR_MASK2, ~0);
}
 
-   qce_write(qce, REG_SEG_SIZE, totallen);
+   qce_write(qce, REG_SEG_SIZE, rctx->cryptlen);
 
/* get little endianness */
config = qce_config_reg(qce, 1);
@@ -387,17 +385,16 @@ static int qce_setup_regs_skcipher(struct 
crypto_async_request *async_req,
 }
 #endif
 
-int qce_start(struct crypto_async_request *async_req, u32 type, u32 totallen,
- u32 offset)
+int qce_start(struct crypto_async_request *async_req, u32 type)
 {
switch (type) {
 #ifdef CONFIG_CRYPTO_DEV_QCE_SKCIPHER
case CRYPTO_ALG_TYPE_SKCIPHER:
-   return qce_setup_regs_skcipher(async_req, totallen, offset);
+   return qce_setup_regs_skcipher(async_req);
 #endif
 #ifdef CONFIG_CRYPTO_DEV_QCE_SHA
case CRYPTO_ALG_TYPE_AHASH:
-   return qce_setup_regs_ahash(async_req, totallen, offset);
+   return qce_setup_regs_ahash(async_req);
 #endif
default:
return -EINVAL;
diff --git a/drivers/crypto/qce/common.h b/drivers/crypto/qce/common.h
index 85ba16418a04..3bc244bcca2d 100644
--- a/drivers/crypto/qce/common.h
+++ b/drivers/crypto/qce/common.h
@@ -94,7 +94,6 @@ struct qce_alg_template {
 void qce_cpu_to_be32p_array(__be32 *dst, const u8 *src, unsigned int len);
 int qce_check_status(struct qce_device *qce, u32 *status);
 void qce_get_version(struct qce_device *qce, u32 *major, u32 *minor, u32 *step);
-int qce_start(struct crypto_async_request *async_req, u32 type, u32 totallen,
- u32 offset);
+int qce_start(struct crypto_async_request *async_req, u32 type);
 
 #endif /* _COMMON_H_ */
diff --git a/drivers/crypto/qce/sha.c b/drivers/crypto/qce/sha.c
index 02d89267a806..141cfe14574d 100644
--- a/drivers/crypto/qce/sha.c
+++ b/drivers/crypto/qce/sha.c
@@ -107,7 +107,7 @@ static int qce_ahash_async_req_handle(struct crypto_async_request *async_req)
 
qce_dma_issue_pending(&qce->dma);
 
-   ret = qce_start(async_req, tmpl->crypto_alg_type, 0, 0);
+   ret = qce_start(async_req, tmpl->crypto_alg_type);
if (ret)
goto error_terminate;
 
diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index 936bfb7c769b..2f327640b4de 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -143,7 +143,7 @@ qce_skcipher_async_req_handle(struct crypto_async_request *async_req)
 
qce_dma_issue_pending(&qce->dma);
 
-   ret = qce_start(async_req, tmpl->crypto_alg_type, req->cryptlen, 0);
+   ret = qce_start(async_req, tmpl->crypto_alg_type);
if (ret)
goto error_terminate;
 
-- 
2.25.1

[PATCH v15 1/3] scsi: ufs: Introduce HPB feature

2020-12-18 Thread Daejun Park

This patch adds HPB initialization and HPB function calls to the UFS core
driver.

NAND flash-based storage devices, including UFS, have mechanisms to
translate logical addresses of IO requests to the corresponding physical
addresses of the flash storage.
In UFS, Logical-address-to-Physical-address (L2P) map data, which is
required to identify the physical address for the requested IOs, can only
be partially stored in SRAM from NAND flash. Due to this partial loading,
accessing the flash address area where the L2P information for that address
is not loaded in the SRAM can result in serious performance degradation.

The basic concept of HPB is to cache L2P mapping entries in host system
memory so that both the physical block address (PBA) and the logical block
address (LBA) can be delivered in the HPB READ command.
The HPB READ command allows data to be read faster than with a regular read
command in UFS, since it provides the physical address (HPB Entry) of the
desired logical block in addition to its logical address. The UFS device can
access the physical block in NAND directly without searching for and
uploading the L2P mapping table. This improves read performance because the
NAND read operation for uploading the L2P mapping table is removed.
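
As a rough illustration of the idea above (not the data structures used by
this patch), a host-side L2P cache can be pictured as a table indexed by LBA.
On a hit the host could attach the cached physical address to the read; on a
miss it falls back to a normal READ. All names below (hpb_cache,
hpb_cache_store, hpb_lookup, HPB_ENTRIES) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define HPB_ENTRIES 1024u	/* hypothetical size of the host-side cache */

/* Hypothetical host-side cache of L2P entries, indexed by LBA. */
struct hpb_cache {
	uint64_t ppn[HPB_ENTRIES];	/* cached physical address per LBA */
	bool valid[HPB_ENTRIES];	/* true if ppn[] holds a usable entry */
};

static void hpb_cache_init(struct hpb_cache *c)
{
	assert(c != NULL);
	memset(c, 0, sizeof(*c));
}

/* Record an L2P entry that the device reported for this LBA. */
static void hpb_cache_store(struct hpb_cache *c, uint32_t lba, uint64_t ppn)
{
	if (lba >= HPB_ENTRIES)
		return;
	c->ppn[lba] = ppn;
	c->valid[lba] = true;
}

/*
 * Return true and fill *ppn when the mapping is cached (the read could be
 * issued as HPB READ); return false when a normal READ must be used.
 */
static bool hpb_lookup(const struct hpb_cache *c, uint32_t lba, uint64_t *ppn)
{
	if (lba >= HPB_ENTRIES || !c->valid[lba])
		return false;
	*ppn = c->ppn[lba];
	return true;
}
```

In this sketch a lookup miss costs nothing extra on the host; the device
simply performs its own L2P translation as it would without HPB.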

In HPB initialization, the host checks if the UFS device supports the HPB
feature and retrieves the related device capabilities. Then, some HPB
parameters are configured in the device.

We measured the total start-up time of popular applications and observed
the difference made by enabling HPB.
The popular applications are 12 game apps and 24 non-game apps. Each target
application was launched in order. One cycle consists of running the 36
applications in sequence. We repeated the cycle to observe the performance
improvement from L2P mapping cache hits in HPB.

The following is the experiment environment:
 - kernel version: 4.4.0
 - RAM: 8GB
 - UFS 2.1 (64GB)

Result:
+-------+----------+----------+-------+
| cycle | baseline | with HPB | diff  |
+-------+----------+----------+-------+
| 1     | 272.4    | 264.9    | -7.5  |
| 2     | 250.4    | 248.2    | -2.2  |
| 3     | 226.2    | 215.6    | -10.6 |
| 4     | 230.6    | 214.8    | -15.8 |
| 5     | 232.0    | 218.1    | -13.9 |
| 6     | 231.9    | 212.6    | -19.3 |
+-------+----------+----------+-------+

We also measured HPB performance using iozone.
Here is my iozone script:
iozone -r 4k -+n -i2 -ecI -t 16 -l 16 -u 16
-s $IO_RANGE/16 -F mnt/tmp_1 mnt/tmp_2 mnt/tmp_3 mnt/tmp_4 mnt/tmp_5
mnt/tmp_6 mnt/tmp_7 mnt/tmp_8 mnt/tmp_9 mnt/tmp_10 mnt/tmp_11 mnt/tmp_12
mnt/tmp_13 mnt/tmp_14 mnt/tmp_15 mnt/tmp_16

Result:
+----------+--------+---------+
| IO range | HPB on | HPB off |
+----------+--------+---------+
|   1 GB   | 294.8  | 300.87  |
|   4 GB   | 293.51 | 179.35  |
|   8 GB   | 294.85 | 162.52  |
|  16 GB   | 293.45 | 156.26  |
|  32 GB   | 277.4  | 153.25  |
+----------+--------+---------+

Thanks,
Daejun

Reviewed-by: Bart Van Assche 
Reviewed-by: Can Guo 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Signed-off-by: Daejun Park 
---
 drivers/scsi/ufs/Kconfig |   9 +
 drivers/scsi/ufs/Makefile|   1 +
 drivers/scsi/ufs/ufs-sysfs.c |  18 ++
 drivers/scsi/ufs/ufs.h   |  13 +
 drivers/scsi/ufs/ufshcd.c|  48 +++
 drivers/scsi/ufs/ufshcd.h|  23 +-
 drivers/scsi/ufs/ufshpb.c| 562 +++
 drivers/scsi/ufs/ufshpb.h| 167 +++
 8 files changed, 840 insertions(+), 1 deletion(-)
 create mode 100644 drivers/scsi/ufs/ufshpb.c
 create mode 100644 drivers/scsi/ufs/ufshpb.h

diff --git a/drivers/scsi/ufs/Kconfig b/drivers/scsi/ufs/Kconfig
index 3f6dfed4fe84..0dfaccec779d 100644
--- a/drivers/scsi/ufs/Kconfig
+++ b/drivers/scsi/ufs/Kconfig
@@ -181,3 +181,12 @@ config SCSI_UFS_CRYPTO
  Enabling this makes it possible for the kernel to use the crypto
  capabilities of the UFS device (if present) to perform crypto
  operations on data being transferred to/from the device.
+
+config SCSI_UFS_HPB
+   bool "Support UFS Host Performance Booster"
+   depends on SCSI_UFSHCD
+   help
+ The UFS HPB feature improves random read performance. It caches
+ L2P (logical to physical) map of UFS to host DRAM. The driver uses HPB
+ read command by piggybacking physical page number for bypassing FTL (flash
+ translation layer)'s L2P address translation.
diff --git a/drivers/scsi/ufs/Makefile b/drivers/scsi/ufs/Makefile
index 4679af1b564e..663e17cee359 100644
--- a/drivers/scsi/ufs/Makefile
+++ b/drivers/scsi/ufs/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_SCSI_UFSHCD) += ufshcd-core.o
 ufshcd-core-y  += ufshcd.o ufs-sysfs.o
 ufshcd-core-$(CONFIG_SCSI_UFS_BSG) += ufs_bsg.o
 ufshcd-core-$(CONFIG_SCSI_UFS_CRYPTO) += ufshcd-crypto.o
+ufshcd-core-$(CONFIG_SCSI_UFS_HPB) += ufshpb.o
 obj-$(CONFIG_SCSI_UFSHCD_PCI) += ufshcd-pci.o
 obj-$(CONFIG_SCSI_UFSHCD_PLATFORM) += ufshcd-pltfrm.o
 obj-$(CONFIG_SCSI_UFS_HISI) += ufs-hisi.o
diff --git a/drivers/scsi/u

[PATCH 1/6] drivers: crypto: qce: sha: Restore/save sha1_state/sha256_state with qce_sha_reqctx in export/import

2020-12-18 Thread Thara Gopinath

Export and import interfaces save and restore partial transformation
states. The partial states were being stored and restored in struct
sha1_state for sha1/hmac(sha1) transformations and sha256_state for
sha256/hmac(sha256) transformations. This led to a bunch of corner cases
where improper state was being stored and restored. A few of the corner
cases that turned up during testing are:

- wrong byte_count restored if export/import is called twice without h/w
  transaction in between
- wrong buflen restored back if the pending buffer length is exactly the
  block size.
- wrong state restored if buffer length is 0.

To fix these issues, save and restore the entire qce_sha_reqctx structure
instead of parts of it in sha1_state and sha256_state structures.
This in turn simplifies the export and import APIs.
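
The approach described above can be sketched in plain userspace C. This is
only an illustration under assumptions: struct demo_reqctx and the
demo_export()/demo_import() helpers are hypothetical stand-ins, not the
driver's types. The point is that copying the whole context avoids having to
re-derive fields such as buflen from a byte count.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BLOCK_SIZE 64u

/* Hypothetical stand-in for a per-request hash context. */
struct demo_reqctx {
	uint64_t count;			/* bytes accepted so far */
	uint32_t digest[8];		/* intermediate digest words */
	uint8_t buf[DEMO_BLOCK_SIZE];	/* pending, not yet hashed bytes */
	unsigned int buflen;		/* number of valid bytes in buf */
	int first_blk;			/* nonzero until the first block is done */
};

/* Export: copy the whole context so no field must be re-derived later. */
static void demo_export(const struct demo_reqctx *rctx, void *out)
{
	assert(rctx != NULL && out != NULL);
	memcpy(out, rctx, sizeof(*rctx));
}

/* Import: restore the context byte for byte from a previous export. */
static void demo_import(struct demo_reqctx *rctx, const void *in)
{
	assert(rctx != NULL && in != NULL);
	memcpy(rctx, in, sizeof(*rctx));
}
```

For comparison, the removed import code recomputed buflen as
in_count & (blocksize - 1), which yields 0 when exactly one full block is
pending (64 & 63 == 0); a byte-for-byte copy preserves the value 64.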

Signed-off-by: Thara Gopinath 
---
 drivers/crypto/qce/sha.c | 93 
 1 file changed, 8 insertions(+), 85 deletions(-)

diff --git a/drivers/crypto/qce/sha.c b/drivers/crypto/qce/sha.c
index 61c418c12345..b8428da6716d 100644
--- a/drivers/crypto/qce/sha.c
+++ b/drivers/crypto/qce/sha.c
@@ -139,97 +139,20 @@ static int qce_ahash_init(struct ahash_request *req)
 
 static int qce_ahash_export(struct ahash_request *req, void *out)
 {
-   struct crypto_ahash *ahash = crypto_ahash_reqtfm(req);
-   struct qce_sha_reqctx *rctx = ahash_request_ctx(req);
-   unsigned long flags = rctx->flags;
-   unsigned int digestsize = crypto_ahash_digestsize(ahash);
-   unsigned int blocksize =
-   crypto_tfm_alg_blocksize(crypto_ahash_tfm(ahash));
-
-   if (IS_SHA1(flags) || IS_SHA1_HMAC(flags)) {
-   struct sha1_state *out_state = out;
-
-   out_state->count = rctx->count;
-   qce_cpu_to_be32p_array((__be32 *)out_state->state,
-  rctx->digest, digestsize);
-   memcpy(out_state->buffer, rctx->buf, blocksize);
-   } else if (IS_SHA256(flags) || IS_SHA256_HMAC(flags)) {
-   struct sha256_state *out_state = out;
-
-   out_state->count = rctx->count;
-   qce_cpu_to_be32p_array((__be32 *)out_state->state,
-  rctx->digest, digestsize);
-   memcpy(out_state->buf, rctx->buf, blocksize);
-   } else {
-   return -EINVAL;
-   }
-
-   return 0;
-}
-
-static int qce_import_common(struct ahash_request *req, u64 in_count,
-const u32 *state, const u8 *buffer, bool hmac)
-{
-   struct crypto_ahash *ahash = crypto_ahash_reqtfm(req);
struct qce_sha_reqctx *rctx = ahash_request_ctx(req);
-   unsigned int digestsize = crypto_ahash_digestsize(ahash);
-   unsigned int blocksize;
-   u64 count = in_count;
-
-   blocksize = crypto_tfm_alg_blocksize(crypto_ahash_tfm(ahash));
-   rctx->count = in_count;
-   memcpy(rctx->buf, buffer, blocksize);
-
-   if (in_count <= blocksize) {
-   rctx->first_blk = 1;
-   } else {
-   rctx->first_blk = 0;
-   /*
-* For HMAC, there is a hardware padding done when first block
-* is set. Therefore the byte_count must be incremened by 64
-* after the first block operation.
-*/
-   if (hmac)
-   count += SHA_PADDING;
-   }
 
-   rctx->byte_count[0] = (__force __be32)(count & ~SHA_PADDING_MASK);
-   rctx->byte_count[1] = (__force __be32)(count >> 32);
-   qce_cpu_to_be32p_array((__be32 *)rctx->digest, (const u8 *)state,
-  digestsize);
-   rctx->buflen = (unsigned int)(in_count & (blocksize - 1));
+   memcpy(out, rctx, sizeof(struct qce_sha_reqctx));
 
return 0;
 }
 
 static int qce_ahash_import(struct ahash_request *req, const void *in)
 {
-   struct qce_sha_reqctx *rctx;
-   unsigned long flags;
-   bool hmac;
-   int ret;
-
-   ret = qce_ahash_init(req);
-   if (ret)
-   return ret;
-
-   rctx = ahash_request_ctx(req);
-   flags = rctx->flags;
-   hmac = IS_SHA_HMAC(flags);
-
-   if (IS_SHA1(flags) || IS_SHA1_HMAC(flags)) {
-   const struct sha1_state *state = in;
+   struct qce_sha_reqctx *rctx = ahash_request_ctx(req);
 
-   ret = qce_import_common(req, state->count, state->state,
-   state->buffer, hmac);
-   } else if (IS_SHA256(flags) || IS_SHA256_HMAC(flags)) {
-   const struct sha256_state *state = in;
+   memcpy(rctx, in, sizeof(struct qce_sha_reqctx));
 
-   ret = qce_import_common(req, state->count, state->state,
-   state->buf, hmac);
-   }
-
-   return ret;
+   return 0;
 }
 
 static int qce_ahash_update(struct ahash_request *req)
@@ -450,7 +373,7 @@ static const struct qce_ahash_def ahash_de

[PATCH v15 0/3] scsi: ufs: Add Host Performance Booster Support

2020-12-18 Thread Daejun Park

Changelog:

v14 -> v15
1. Remove duplicated sysfs ABI entries in documentation.
2. Add experiment result of HPB performance testing with iozone.

v13 -> v14
1. Clean up code as commented in Greg's review.
2. Add documentation for sysfs entries (from Greg's review).
3. Add experiment result of HPB performance testing. (in this mail)

v12 -> v13
1. Clean up code per comments from Can Guo.
2. Add HPB related descriptor/flag/attributes in sysfs.
3. Change base commit from 5.10/scsi-queue to 5.11/scsi-queue.

v11 -> v12
1. Fixed to return error value when HPB fails to initialize pinned active
region.
2. Fixed to disable HPB feature if HPB fails to allocate essential memory
and workqueue.
3. Fixed to change proper sub-region state when region is already evicted.

v10 -> v11
Add a newline at the end of the last line of the Kconfig file.

v9 -> v10
1. Fixed 64-bit division error
2. Fixed problems commented on in Bart's review.

v8 -> v9
1. Change sysfs initialization.
2. Change reading descriptor during HPB initialization
3. Fixed problems commented on in Bart's review.
4. Change base commit from 5.9/scsi-queue to 5.10/scsi-queue.

v7 -> v8
Remove wrongly added tags.

v6 -> v7
1. Remove UFS feature layer.
2. Cleanup for sparse error.

v5 -> v6
Change base commit to b53293fa662e28ae0cdd40828dc641c09f133405

v4 -> v5
Delete unused macro define.

v3 -> v4
1. Cleanup.

v2 -> v3
1. Add checking input module parameter value.
2. Change base commit from 5.8/scsi-queue to 5.9/scsi-queue.
3. Cleanup for unused variables and label.

v1 -> v2
1. Change the full boilerplate text to SPDX style.
2. Adopt dynamic allocation for sub-region data structure.
3. Cleanup.

NAND flash memory-based storage devices use a Flash Translation Layer (FTL)
to translate logical addresses of I/O requests to the corresponding flash
memory addresses. Mobile storage devices typically have RAM of constrained
size, and thus lack the memory to keep the whole mapping table.
Therefore, mapping tables are partially retrieved from NAND flash on
demand, causing random-read performance degradation.

To improve random read performance, JESD220-3 (HPB v1.0) proposes HPB
(Host Performance Booster), which uses host system memory as a cache for the
FTL mapping table. By using HPB, FTL data can be read from host memory
faster than from NAND flash memory.

The current version only supports the DCM (device control mode).
This patch series consists of 3 parts to support the HPB feature.

1) HPB probe and initialization process
2) READ -> HPB READ using cached map information
3) L2P (logical to physical) map management

In the HPB probe and init process, the device information of the UFS is
queried. After checking supported features, the data structure for the HPB
is initialized according to the device information.

A read I/O in the active sub-region where the map is cached is changed to
HPB READ by the HPB.

The HPB manages the L2P map using information received from the
device. For an active sub-region, the HPB caches the map through a
ufshpb_map request. For an inactive region, the HPB discards the L2P map.
When a write I/O occurs in an active sub-region area, the associated dirty
bitmap is marked as dirty to prevent stale reads.
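
The dirty-bitmap rule in the paragraph above can be sketched as follows. This
is a simplified userspace model, not code from the series: struct
demo_subregion, demo_mark_dirty(), demo_can_use_hpb_read() and DEMO_ENTRIES
are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ENTRIES 256u	/* hypothetical entries per sub-region */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS ((DEMO_ENTRIES + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical per-sub-region state: cached map plus one dirty bit per entry. */
struct demo_subregion {
	bool active;				/* map is cached on the host */
	unsigned long dirty[DEMO_WORDS];	/* set bit: cached entry is stale */
};

/* Called for a write: the cached entry may no longer match the device. */
static void demo_mark_dirty(struct demo_subregion *sr, size_t idx)
{
	assert(idx < DEMO_ENTRIES);
	sr->dirty[idx / BITS_PER_WORD] |= 1UL << (idx % BITS_PER_WORD);
}

/* A read may use the cached entry only if it is active and not dirty. */
static bool demo_can_use_hpb_read(const struct demo_subregion *sr, size_t idx)
{
	assert(idx < DEMO_ENTRIES);
	if (!sr->active)
		return false;
	return !(sr->dirty[idx / BITS_PER_WORD] &
		 (1UL << (idx % BITS_PER_WORD)));
}
```

In this model a dirty entry only disables the fast path for that entry;
reads of other entries in the same sub-region are unaffected.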

HPB is shown to have a performance improvement of 58 - 67% for random read
workload. [1]

[1]:
https://www.usenix.org/conference/hotstorage17/program/presentation/jeong

Daejun Park (3):
  scsi: ufs: Introduce HPB feature
  scsi: ufs: L2P map management for HPB read
  scsi: ufs: Prepare HPB read for cached sub-region

 drivers/scsi/ufs/Kconfig |9 +
 drivers/scsi/ufs/Makefile|1 +
 drivers/scsi/ufs/ufs-sysfs.c |   18 +
 drivers/scsi/ufs/ufs.h   |   49 +
 drivers/scsi/ufs/ufshcd.c|   53 +
 drivers/scsi/ufs/ufshcd.h|   23 +-
 drivers/scsi/ufs/ufshpb.c| 1767 ++
 drivers/scsi/ufs/ufshpb.h|  230 +
 8 files changed, 2149 insertions(+), 1 deletion(-)
 create mode 100644 drivers/scsi/ufs/ufshpb.c
 create mode 100644 drivers/scsi/ufs/ufshpb.h

-- 
2.25.1

[PATCH 0/6] Regression fixes/clean ups in the Qualcomm crypto engine driver

2020-12-18 Thread Thara Gopinath

This patch series is a result of running kernel crypto fuzz tests (by
enabling CONFIG_CRYPTO_MANAGER_EXTRA_TESTS) on the transformations
currently supported via the Qualcomm crypto engine on sdm845.
The first four patches are fixes for various regressions found during
testing. The last two patches are minor clean ups of unused variable
and parameters.

Thara Gopinath (6):
  drivers: crypto: qce: sha: Restore/save sha1_state/sha256_state with
qce_sha_reqctx in export/import
  drivers: crypto: qce: sha: Hold back a block of data to be transferred
as part of final
  drivers: crypto: qce: skcipher: Fix regressions found during fuzz
testing
  drivers: crypto: qce: common: Set data unit size to message length for
AES XTS transformation
  drivers: crypto: qce: Remover src_tbl from qce_cipher_reqctx
  drivers: crypto: qce: Remove totallen and offset in qce_start

 drivers/crypto/qce/cipher.h   |   1 -
 drivers/crypto/qce/common.c   |  25 
 drivers/crypto/qce/common.h   |   3 +-
 drivers/crypto/qce/sha.c  | 114 +-
 drivers/crypto/qce/skcipher.c |  70 ++---
 5 files changed, 101 insertions(+), 112 deletions(-)

-- 
2.25.1

linux-next: Signed-off-by missing for commit in the ipsec tree

2020-12-18 Thread Stephen Rothwell

Hi all,

Commit

  06148d3b3f2e ("xfrm: Fix oops in xfrm_replay_advance_bmp")

is missing a Signed-off-by from its committer.

-- 
Cheers,
Stephen Rothwell



Re: [PATCH] mm/vmscan: DRY cleanup for do_try_to_free_pages()

2020-12-18 Thread Jacob Wen




On 12/19/20 9:21 AM, Chris Down wrote:

Jacob Wen writes:
set_task_reclaim_state() is a function with 3 lines of code of which 
2 lines contain WARN_ON_ONCE.


I am not comfortable with the current repetition.


Ok, but could you please go into _why_ others should feel that way 
too? There are equally also reasons to err on the side of leaving code 
as-is -- since we know it already works, and this code generally has 
pretty high inertia -- and avoid mutation of code without concrete 
description of the benefits.


I don't get your point. The patch doesn't change code of 
set_task_reclaim_state(), so I am fine with the repeated WARN_ON_ONCE.


I mean I prefer removing duplicate code to avoid going down the rabbit 
hole of set_task_reclaim_state().


It's a fundamental principle to me to move the code into its own 
function. I'd like to hear the others' opinions.

Re: kernel BUG at drivers/dma-buf/dma-buf.c:LINE!

2020-12-18 Thread syzbot

syzbot suspects this issue was fixed by commit:

commit e722a295cf493388dae474745d30e91e1a2ec549
Author: Greg Kroah-Hartman 
Date:   Thu Aug 27 12:36:27 2020 +

staging: ion: remove from the tree

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=17d4f13750
start commit:   abb3438d Merge tag 'm68knommu-for-v5.9-rc3' of git://git.k..
git tree:   upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=978db74cb30aa994
dashboard link: https://syzkaller.appspot.com/bug?extid=d6734079f30f7fc39021
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1742859690

If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: staging: ion: remove from the tree

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

[PATCH] device-dax: Fix range release

2020-12-18 Thread Dan Williams

There are multiple locations that open-code the release of the last
range in a device-dax instance. Consolidate this into a new
trim_dev_dax_range() helper.

This also addresses a kmemleak report:

# cat /sys/kernel/debug/kmemleak
[..]
unreferenced object 0x976bd46f6240 (size 64):
   comm "ndctl", pid 23556, jiffies 4299514316 (age 5406.733s)
   hex dump (first 32 bytes):
 00 00 00 00 00 00 00 00 00 00 20 c3 37 00 00 00  .. .7...
 ff ff ff 7f 38 00 00 00 00 00 00 00 00 00 00 00  8...
   backtrace:
 [<064003cf>] __kmalloc_track_caller+0x136/0x379
 [] krealloc+0x67/0x92
 [] __alloc_dev_dax_range+0x73/0x25c
 [<27d58626>] devm_create_dev_dax+0x27d/0x416
 [<434abd43>] __dax_pmem_probe+0x1c9/0x1000 [dax_pmem_core]
 [<83726c1c>] dax_pmem_probe+0x10/0x1f [dax_pmem]
 [] nvdimm_bus_probe+0x9d/0x340 [libnvdimm]
 [] really_probe+0x230/0x48d
 [<6cabd38e>] driver_probe_device+0x122/0x13b
 [<29c7b95a>] device_driver_attach+0x5b/0x60
 [<53e5659b>] bind_store+0xb7/0xc3
 [] drv_attr_store+0x27/0x31
 [<949069c5>] sysfs_kf_write+0x4a/0x57
 [<4a8b5adf>] kernfs_fop_write+0x150/0x1e5
 [] __vfs_write+0x1b/0x34
 [] vfs_write+0xd8/0x1d1
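
The consolidation described in this message can be sketched in userspace C
as follows. It is a simplified model under assumptions: struct demo_dev and
the demo_*() helpers are hypothetical, and free() stands in for the kernel's
kfree(). The single trim helper drops the last range and frees the array
once no ranges remain, so no caller can forget the final free.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_range {
	unsigned long long start, end;
};

/* Hypothetical device holding a growable array of ranges. */
struct demo_dev {
	struct demo_range *ranges;
	int nr_range;
};

/* Append a range; returns 0 on success, -1 if allocation fails. */
static int demo_add_range(struct demo_dev *d, unsigned long long start,
			  unsigned long long end)
{
	struct demo_range *r;

	r = realloc(d->ranges, sizeof(*r) * ((size_t)d->nr_range + 1));
	if (!r)
		return -1;
	r[d->nr_range].start = start;
	r[d->nr_range].end = end;
	d->ranges = r;
	d->nr_range++;
	return 0;
}

/* Drop the last range; free the array when the count reaches zero. */
static void demo_trim_range(struct demo_dev *d)
{
	assert(d->nr_range > 0);
	/* A real driver would also release the backing resource here. */
	if (--d->nr_range == 0) {
		free(d->ranges);
		d->ranges = NULL;
	}
}

/* Releasing everything is just repeated trimming. */
static void demo_free_ranges(struct demo_dev *d)
{
	while (d->nr_range)
		demo_trim_range(d);
}
```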

Reported-by: Jane Chu 
Cc: Zhen Lei 
Signed-off-by: Dan Williams 
---
 drivers/dax/bus.c |   44 +---
 1 file changed, 21 insertions(+), 23 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 9761cb40d4bb..720cd140209f 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -367,19 +367,28 @@ void kill_dev_dax(struct dev_dax *dev_dax)
 }
 EXPORT_SYMBOL_GPL(kill_dev_dax);
 
-static void free_dev_dax_ranges(struct dev_dax *dev_dax)
+static void trim_dev_dax_range(struct dev_dax *dev_dax)
 {
+   int i = dev_dax->nr_range - 1;
+   struct range *range = &dev_dax->ranges[i].range;
struct dax_region *dax_region = dev_dax->region;
-   int i;
 
device_lock_assert(dax_region->dev);
-   for (i = 0; i < dev_dax->nr_range; i++) {
-   struct range *range = &dev_dax->ranges[i].range;
-
-   __release_region(&dax_region->res, range->start,
-   range_len(range));
+   dev_dbg(&dev_dax->dev, "delete range[%d]: %#llx:%#llx\n", i,
+   (unsigned long long)range->start,
+   (unsigned long long)range->end);
+
+   __release_region(&dax_region->res, range->start, range_len(range));
+   if (--dev_dax->nr_range == 0) {
+   kfree(dev_dax->ranges);
+   dev_dax->ranges = NULL;
}
-   dev_dax->nr_range = 0;
+}
+
+static void free_dev_dax_ranges(struct dev_dax *dev_dax)
+{
+   while (dev_dax->nr_range)
+   trim_dev_dax_range(dev_dax);
 }
 
 static void unregister_dev_dax(void *dev)
@@ -804,15 +813,10 @@ static int alloc_dev_dax_range(struct dev_dax *dev_dax, u64 start,
return 0;
 
rc = devm_register_dax_mapping(dev_dax, dev_dax->nr_range - 1);
-   if (rc) {
-   dev_dbg(dev, "delete range[%d]: %pa:%pa\n", dev_dax->nr_range - 1,
-   &alloc->start, &alloc->end);
-   dev_dax->nr_range--;
-   __release_region(res, alloc->start, resource_size(alloc));
-   return rc;
-   }
+   if (rc)
+   trim_dev_dax_range(dev_dax);
 
-   return 0;
+   return rc;
 }
 
 static int adjust_dev_dax_range(struct dev_dax *dev_dax, struct resource *res, resource_size_t size)
@@ -885,12 +889,7 @@ static int dev_dax_shrink(struct dev_dax *dev_dax, resource_size_t size)
if (shrink >= range_len(range)) {
devm_release_action(dax_region->dev,
unregister_dax_mapping, &mapping->dev);
-   __release_region(&dax_region->res, range->start,
-   range_len(range));
-   dev_dax->nr_range--;
-   dev_dbg(dev, "delete range[%d]: %#llx:%#llx\n", i,
-   (unsigned long long) range->start,
-   (unsigned long long) range->end);
+   trim_dev_dax_range(dev_dax);
to_shrink -= shrink;
if (!to_shrink)
break;
@@ -1267,7 +1266,6 @@ static void dev_dax_release(struct device *dev)
put_dax(dax_dev);
free_dev_dax_id(dev_dax);
dax_region_put(dax_region);
-   kfree(dev_dax->ranges);
kfree(dev_dax->pgmap);
kfree(dev_dax);
 }

[PATCH v4] sched/fair: Avoid stale CPU util_est value for schedutil in task dequeue

2020-12-18 Thread Xuewen Yan

From: Xuewen Yan 

CPU (root cfs_rq) estimated utilization (util_est) is currently used in
dequeue_task_fair() to drive frequency selection before it is updated.

with:

CPU_util: rq->cfs.avg.util_avg
CPU_util_est: rq->cfs.avg.util_est
CPU_utilization : max(CPU_util, CPU_util_est)
task_util   : p->se.avg.util_avg
task_util_est   : p->se.avg.util_est

dequeue_task_fair():

/* (1) CPU_util and task_util update + inform schedutil about
   CPU_utilization changes */
for_each_sched_entity() /* 2 loops */
(dequeue_entity() ->) update_load_avg() -> cfs_rq_util_change()
 -> cpufreq_update_util() ->...-> sugov_update_[shared\|single]
 -> sugov_get_util() -> cpu_util_cfs()

/* (2) CPU_util_est and task_util_est update */
util_est_dequeue()

cpu_util_cfs() uses CPU_utilization which could lead to a false (too
high) utilization value for schedutil in task ramp-down or ramp-up
scenarios during task dequeue.

To mitigate the issue split the util_est update (2) into:

 (A) CPU_util_est update in util_est_dequeue()
 (B) task_util_est update in util_est_update()

Place (A) before (1) and keep (B) where (2) is. The latter is necessary
since (B) relies on task_util update in (1).

Fixes: 7f65ea42eb00 ("sched/fair: Add util_est on top of PELT")

Signed-off-by: Xuewen Yan 
Reviewed-by: Dietmar Eggemann 
Reviewed-by: Vincent Guittot 
---
Changes since v3:
-add reviewer
-add more comment details

Changes since v2:
-modify the comment
-move util_est_dequeue above within_margin()
-modify the tab and space

Changes since v1:
-change the util_est_dequeue/update to inline type
-use unsigned int enqueued rather than util_est in util_est_dequeue
-remove "cpu" var

---
 kernel/sched/fair.c | 43 ---
 1 file changed, 28 insertions(+), 15 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ae7ceba..f3a1b7a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3932,6 +3932,22 @@ static inline void util_est_enqueue(struct cfs_rq *cfs_rq,
trace_sched_util_est_cfs_tp(cfs_rq);
 }
 
+static inline void util_est_dequeue(struct cfs_rq *cfs_rq,
+   struct task_struct *p)
+{
+   unsigned int enqueued;
+
+   if (!sched_feat(UTIL_EST))
+   return;
+
+   /* Update root cfs_rq's estimated utilization */
+   enqueued  = cfs_rq->avg.util_est.enqueued;
+   enqueued -= min_t(unsigned int, enqueued, _task_util_est(p));
+   WRITE_ONCE(cfs_rq->avg.util_est.enqueued, enqueued);
+
+   trace_sched_util_est_cfs_tp(cfs_rq);
+}
+
 /*
  * Check if a (signed) value is within a specified (unsigned) margin,
  * based on the observation that:
@@ -3945,23 +3961,16 @@ static inline bool within_margin(int value, int margin)
return ((unsigned int)(value + margin - 1) < (2 * margin - 1));
 }
 
-static void
-util_est_dequeue(struct cfs_rq *cfs_rq, struct task_struct *p, bool task_sleep)
+static inline void util_est_update(struct cfs_rq *cfs_rq,
+  struct task_struct *p,
+  bool task_sleep)
 {
long last_ewma_diff;
struct util_est ue;
-   int cpu;
 
if (!sched_feat(UTIL_EST))
return;
 
-   /* Update root cfs_rq's estimated utilization */
-   ue.enqueued  = cfs_rq->avg.util_est.enqueued;
-   ue.enqueued -= min_t(unsigned int, ue.enqueued, _task_util_est(p));
-   WRITE_ONCE(cfs_rq->avg.util_est.enqueued, ue.enqueued);
-
-   trace_sched_util_est_cfs_tp(cfs_rq);
-
/*
 * Skip update of task's estimated utilization when the task has not
 * yet completed an activation, e.g. being migrated.
@@ -4001,8 +4010,7 @@ static inline bool within_margin(int value, int margin)
 * To avoid overestimation of actual task utilization, skip updates if
 * we cannot grant there is idle time in this CPU.
 */
-   cpu = cpu_of(rq_of(cfs_rq));
-   if (task_util(p) > capacity_orig_of(cpu))
+   if (task_util(p) > capacity_orig_of(cpu_of(rq_of(cfs_rq))))
return;
 
/*
@@ -4085,8 +4093,11 @@ static inline int newidle_balance(struct rq *rq, struct rq_flags *rf)
 util_est_enqueue(struct cfs_rq *cfs_rq, struct task_struct *p) {}
 
 static inline void
-util_est_dequeue(struct cfs_rq *cfs_rq, struct task_struct *p,
-bool task_sleep) {}
+util_est_dequeue(struct cfs_rq *cfs_rq, struct task_struct *p) {}
+
+static inline void
+util_est_update(struct cfs_rq *cfs_rq, struct task_struct *p,
+   bool task_sleep) {}
 static inline void update_misfit_status(struct task_struct *p, struct rq *rq) {}
 
 #endif /* CONFIG_SMP */
@@ -5589,6 +5600,8 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
int idle_h_nr_running = task_has_idle_policy(p);
bool was_sched_idle = sched_idle_rq(rq);
 
+   util_est_dequeue(&rq->cfs, p);

Re: [RFC PATCH 0/5] running kernel mode SIMD with softirqs disabled

2020-12-18 Thread Herbert Xu

On Fri, Dec 18, 2020 at 06:01:01PM +0100, Ard Biesheuvel wrote:
>
> Questions:
> - what did I miss or break horribly?
> - does any of this matter for RT? AIUI, RT runs softirqs from a dedicated
>   kthread, so I don't think it cares.
> - what would be a reasonable upper bound to keep softirqs disabled? I suppose
>   100s of cycles or less is overkill, but I'm not sure how to derive a better
>   answer.
> - could we do the same on x86, now that kernel_fpu_begin/end is no longer
>   expensive?

If this approach works not only would it allow us to support the
synchronous users better, it would also allow us to remove loads
of cruft in the Crypto API that exist solely to support these SIMD
code paths.

So I eagerly await the assessment of the scheduler/RT folks on this
approach.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[PATCH] staging: qlge: Removed duplicate word in comment.

2020-12-18 Thread Daniel West

This patch fixes the checkpatch warning:

WARNING: Possible repeated word: 'and'

Signed-off-by: Daniel West 
---
 drivers/staging/qlge/qlge_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/qlge/qlge_main.c b/drivers/staging/qlge/qlge_main.c
index e6b7baa12cd6..22167eca7c50 100644
--- a/drivers/staging/qlge/qlge_main.c
+++ b/drivers/staging/qlge/qlge_main.c
@@ -3186,7 +3186,7 @@ static void ql_enable_msix(struct ql_adapter *qdev)
 "Running with legacy interrupts.\n");
 }
 
-/* Each vector services 1 RSS ring and and 1 or more
+/* Each vector services 1 RSS ring and 1 or more
  * TX completion rings.  This function loops through
  * the TX completion rings and assigns the vector that
  * will service it.  An example would be if there are
-- 
2.25.1

[PATCH] dt-bindings: rtc: pcf2127: update bindings

2020-12-18 Thread Alexandre Belloni

pcf2127, pcf2129 and pca2129 support start-year and reset-source.

Signed-off-by: Alexandre Belloni 
---
 .../devicetree/bindings/rtc/nxp,pcf2127.yaml  | 54 +++
 .../devicetree/bindings/rtc/trivial-rtc.yaml  |  6 ---
 2 files changed, 54 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/rtc/nxp,pcf2127.yaml

diff --git a/Documentation/devicetree/bindings/rtc/nxp,pcf2127.yaml b/Documentation/devicetree/bindings/rtc/nxp,pcf2127.yaml
new file mode 100644
index 000000000000..daa479b395a6
--- /dev/null
+++ b/Documentation/devicetree/bindings/rtc/nxp,pcf2127.yaml
@@ -0,0 +1,54 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/rtc/nxp,pcf2127.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: NXP PCF2127, PCF2129 and PCA2129 Real Time Clocks
+
+allOf:
+  - $ref: "rtc.yaml#"
+
+maintainers:
+  - Alexandre Belloni 
+
+properties:
+  compatible:
+enum:
+  - nxp,pcf2127
+  - nxp,pcf2129
+  - nxp,pca2129
+
+  reg:
+maxItems: 1
+
+  interrupts:
+maxItems: 1
+
+  start-year: true
+
+  reset-source: true
+
+required:
+  - compatible
+  - reg
+
+additionalProperties: false
+
+examples:
+  - |
+#include <dt-bindings/interrupt-controller/irq.h>
+i2c {
+#address-cells = <1>;
+#size-cells = <0>;
+
+rtc@51 {
+compatible = "nxp,pcf2127";
+reg = <0x51>;
+pinctrl-0 = <&rtc_nint_pins>;
+interrupts-extended = <&gpio1 16 IRQ_TYPE_LEVEL_HIGH>;
+reset-source;
+};
+};
+
+...
diff --git a/Documentation/devicetree/bindings/rtc/trivial-rtc.yaml b/Documentation/devicetree/bindings/rtc/trivial-rtc.yaml
index c7d14de214c4..17816b734a51 100644
--- a/Documentation/devicetree/bindings/rtc/trivial-rtc.yaml
+++ b/Documentation/devicetree/bindings/rtc/trivial-rtc.yaml
@@ -48,12 +48,6 @@ properties:
   - microcrystal,rv3029
   # Real Time Clock
   - microcrystal,rv8523
-  # Real-time clock
-  - nxp,pcf2127
-  # Real-time clock
-  - nxp,pcf2129
-  # Real-time clock
-  - nxp,pca2129
   # Real-time Clock Module
   - pericom,pt7c4338
   # I2C bus SERIAL INTERFACE REAL-TIME CLOCK IC
-- 
2.29.2

Re: [PATCH] KVM/x86: Move definition of __ex to x86.h

2020-12-18 Thread Krish Sadhukhan




On 12/18/20 4:11 AM, Uros Bizjak wrote:

Merge __kvm_handle_fault_on_reboot with its sole user
and move the definition of __ex to a common include to be
shared between VMX and SVM.

Cc: Paolo Bonzini 
Cc: Sean Christopherson 
Signed-off-by: Uros Bizjak 
---
  arch/x86/include/asm/kvm_host.h | 25 -
  arch/x86/kvm/svm/svm.c  |  2 --
  arch/x86/kvm/vmx/vmx_ops.h  |  4 +---
  arch/x86/kvm/x86.h  | 23 +++
  4 files changed, 24 insertions(+), 30 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7e5f33a0d0e2..ff152ee1d63f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1623,31 +1623,6 @@ enum {
  #define kvm_arch_vcpu_memslots_id(vcpu) ((vcpu)->arch.hflags & HF_SMM_MASK ? 1 : 0)
  #define kvm_memslots_for_spte_role(kvm, role) __kvm_memslots(kvm, (role).smm)
  
-asmlinkage void kvm_spurious_fault(void);

-
-/*
- * Hardware virtualization extension instructions may fault if a
- * reboot turns off virtualization while processes are running.
- * Usually after catching the fault we just panic; during reboot
- * instead the instruction is ignored.
- */
-#define __kvm_handle_fault_on_reboot(insn) \
-   "666: \n\t"   \
-   insn "\n\t"   \
-   "jmp   668f \n\t" \
-   "667: \n\t"   \
-   "1: \n\t" \
-   ".pushsection .discard.instr_begin \n\t"  \
-   ".long 1b - . \n\t"   \
-   ".popsection \n\t"\
-   "call  kvm_spurious_fault \n\t"   \
-   "1: \n\t" \
-   ".pushsection .discard.instr_end \n\t"\
-   ".long 1b - . \n\t"   \
-   ".popsection \n\t"\
-   "668: \n\t"   \
-   _ASM_EXTABLE(666b, 667b)
-
  #define KVM_ARCH_WANT_MMU_NOTIFIER
  int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end,
			unsigned flags);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index da7eb4aaf44f..0a72ab9fd568 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -42,8 +42,6 @@
  
  #include "svm.h"
  
-#define __ex(x) __kvm_handle_fault_on_reboot(x)
-
  MODULE_AUTHOR("Qumranet");
  MODULE_LICENSE("GPL");
  
diff --git a/arch/x86/kvm/vmx/vmx_ops.h b/arch/x86/kvm/vmx/vmx_ops.h
index 692b0c31c9c8..7e3cb53c413f 100644
--- a/arch/x86/kvm/vmx/vmx_ops.h
+++ b/arch/x86/kvm/vmx/vmx_ops.h
@@ -4,13 +4,11 @@
  
  #include 
  
-#include 

  #include 
  
  #include "evmcs.h"

  #include "vmcs.h"
-
-#define __ex(x) __kvm_handle_fault_on_reboot(x)
+#include "x86.h"
  
  asmlinkage void vmread_error(unsigned long field, bool fault);
  __attribute__((regparm(0))) void vmread_error_trampoline(unsigned long field,
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index e7ca622a468f..608548d05e84 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -7,6 +7,29 @@
  #include "kvm_cache_regs.h"
  #include "kvm_emulate.h"
  
+asmlinkage void kvm_spurious_fault(void);
+
+/*
+ * Hardware virtualization extension instructions may fault if a
+ * reboot turns off virtualization while processes are running.
+ * Usually after catching the fault we just panic; during reboot
+ * instead the instruction is ignored.
+ */
+#define __ex(insn) \
+   "666:  " insn "\n"  \
+   "  jmp 669f\n"\
+   "667:\n"  \
+   ".pushsection .discard.instr_begin\n" \
+   ".long 667b - .\n"\
+   ".popsection\n"   \
+   "  call kvm_spurious_fault\n" \
+   "668:\n"  \
+   ".pushsection .discard.instr_end\n"   \
+   ".long 668b - .\n"\
+   ".popsection\n"   \
+   "669:\n"  \
+   _ASM_EXTABLE(666b, 667b)
+
  #define KVM_DEFAULT_PLE_GAP   128
  #define KVM_VMX_DEFAULT_PLE_WINDOW4096
  #define KVM_DEFAULT_PLE_WINDOW_GROW   2

Reviewed-by: Krish Sadhukhan

Re: [PATCH] mm/vmscan: DRY cleanup for do_try_to_free_pages()

2020-12-18 Thread Chris Down


Jacob Wen writes:
set_task_reclaim_state() is a function with 3 lines of code of which 2 
lines contain WARN_ON_ONCE.


I am not comfortable with the current repetition.


Ok, but could you please go into _why_ others should feel that way too? There 
are equally also reasons to err on the side of leaving code as-is -- since we 
know it already works, and this code generally has pretty high inertia -- and 
avoid mutation of code without concrete description of the benefits.

Re: [RFC][PATCH 2/3] dma-buf: system_heap: Add pagepool support to system heap

2020-12-18 Thread John Stultz

On Fri, Dec 18, 2020 at 6:36 AM Daniel Vetter  wrote:
> On Thu, Dec 17, 2020 at 11:06:11PM +, John Stultz wrote:
> > Reuse/abuse the pagepool code from the network code to speed
> > up allocation performance.
> >
> > This is similar to the ION pagepool usage, but tries to
> > utilize generic code instead of a custom implementation.
>
> We also have one of these in ttm. I think we should have at most one of
> these for the gpu ecosystem overall, maybe as a helper that can be plugged
> into all the places.
>
> Or I'm kinda missing something, which could be since I only glanced at
> yours for a bit. But it's also called page pool for buffer allocations,
> and I don't think there's that many ways to implement that really :-)

Yea, when I was looking around the ttm one didn't seem quite as
generic as the networking one, which more easily fit in here.

The main benefit for the system heap is not so much the pool itself
(the normal page allocator is pretty good), as it being able to defer
the free and zero the pages in a background thread, so the pool is
effectively filled with pre-zeroed pages.

But I'll take another look at the ttm implementation and see if it can
be re-used or the shared code refactored and pulled out somehow.

thanks
-john
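
The deferred free and zeroing idea described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct zero_pool`, `pool_free()`, `pool_zero_worker()` and `pool_alloc()` are hypothetical names, the "background thread" is modelled as a plain function call, and none of this is the actual dma-buf system heap or page_pool code.

```c
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ  4096
#define POOL_MAX 8

/* Hypothetical pool: "dirty" pages await zeroing, "clean" ones are ready. */
struct zero_pool {
	void *dirty[POOL_MAX];
	void *clean[POOL_MAX];
	int n_dirty;
	int n_clean;
};

/* Free path: defer the work, just queue the page (or release it if full). */
static void pool_free(struct zero_pool *p, void *page)
{
	if (p->n_dirty < POOL_MAX)
		p->dirty[p->n_dirty++] = page;
	else
		free(page);
}

/* What a background worker would do: zero queued pages off the hot path. */
static void pool_zero_worker(struct zero_pool *p)
{
	while (p->n_dirty > 0) {
		void *page = p->dirty[--p->n_dirty];

		if (p->n_clean == POOL_MAX) {
			free(page);
			continue;
		}
		memset(page, 0, PAGE_SZ);
		p->clean[p->n_clean++] = page;
	}
}

/* Alloc path: prefer a pre-zeroed page, else fall back to the allocator. */
static void *pool_alloc(struct zero_pool *p)
{
	if (p->n_clean > 0)
		return p->clean[--p->n_clean];
	return calloc(1, PAGE_SZ);
}
```

The point of the split is that the allocation path never pays for the memset when the pool has clean pages; the cost moves to whoever runs the worker.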

Re: [PATCH -tip V2 10/10] workqueue: Fix affinity of kworkers when attaching into pool

2020-12-18 Thread Lai Jiangshan

On Sat, Dec 19, 2020 at 1:59 AM Valentin Schneider
 wrote:
>
>
> On 18/12/20 17:09, Lai Jiangshan wrote:
> > From: Lai Jiangshan 
> >
> > When worker_attach_to_pool() is called, we should not put the workers
> > to pool->attrs->cpumask when there is not CPU online in it.
> >
> > We have to use wq_online_cpumask in worker_attach_to_pool() to check
> > if pool->attrs->cpumask is valid rather than cpu_online_mask or
> > cpu_active_mask due to gaps between stages in cpu hot[un]plug.
> >
> > So for that late-spawned per-CPU kworker case: the outgoing CPU should have
> > already been cleared from wq_online_cpumask, so it gets its affinity reset
> > to the possible mask and the subsequent wakeup will ensure it's put on an
> > active CPU.
> >
> > To use wq_online_cpumask in worker_attach_to_pool(), we need to protect
> > wq_online_cpumask in wq_pool_attach_mutex and we modify workqueue_online_cpu()
> > and workqueue_offline_cpu() to enlarge wq_pool_attach_mutex protected
> > region. We also put updating wq_online_cpumask and [re|un]bind_workers()
> > in the same wq_pool_attach_mutex protected region to make the update
> > for percpu workqueue atomically.
> >
> > Cc: Qian Cai 
> > Cc: Peter Zijlstra 
> > Cc: Vincent Donnefort 
> > Link: https://lore.kernel.org/lkml/20201210163830.21514-3-valentin.schnei...@arm.com/
> > Acked-by: Valentin Schneider 
>
> So an etiquette thing: I never actually gave an Acked-by. I did say it
> looked good to me, and that probably should've been bundled with a
> Reviewed-by, but it wasn't (I figured I'd wait for v2). Forging is bad,
> m'kay.
>
> When in doubt (e.g. someone says they're ok with your patch but don't give
> any Ack/Reviewed-by), just ask via mail or on IRC.

Hello, Valentin

I'm sorry for not asking for your opinion.  When I saw
"Seems alright to me." I felt hugely encouraged and rushed.

I was in doubt about whether I should promote "Seems alright to me." to
an "Ack".  Instead of asking, I wrongly did it right away.  I knew I might
just be forging, so I added a note to the cover letter:

>Add Valentin's ack for patch 10 because "Seems alright to me." and
>add Valentin's comments to the changelog which is integral.

Anyway, it is my fault and I have learnt from it.

>
> For now, please make this a:
>
> Reviewed-by: Valentin Schneider 

Hello Peter, could you help change it if there is no other
feedback that requires a V3 patchset to be made?

Thanks
Lai

>
> > Acked-by: Tejun Heo 
> > Signed-off-by: Lai Jiangshan 
> > ---
> >  kernel/workqueue.c | 32 +++-
> >  1 file changed, 15 insertions(+), 17 deletions(-)
> >
> > diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> > index 65270729454c..eeb726598f80 100644
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> > @@ -310,7 +310,7 @@ static bool workqueue_freezing;   /* PL: have wqs started freezing? */
> >  /* PL: allowable cpus for unbound wqs and work items */
> >  static cpumask_var_t wq_unbound_cpumask;
> >
> > -/* PL: online cpus (cpu_online_mask with the going-down cpu cleared) */
> > +/* PL&A: online cpus (cpu_online_mask with the going-down cpu cleared) */
> >  static cpumask_var_t wq_online_cpumask;
> >
> >  /* CPU where unbound work was last round robin scheduled from this CPU */
> > @@ -1848,11 +1848,11 @@ static void worker_attach_to_pool(struct worker *worker,
> >  {
> >   mutex_lock(&wq_pool_attach_mutex);
> >
> > - /*
> > -  * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
> > -  * online CPUs.  It'll be re-applied when any of the CPUs come up.
> > -  */
> > - set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> > + /* Is there any cpu in pool->attrs->cpumask online? */
> > + if (cpumask_intersects(pool->attrs->cpumask, wq_online_cpumask))
> > + WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask) < 0);
> > + else
> > + WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0);
> >
> >   /*
> >* The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
> > @@ -5081,13 +5081,12 @@ int workqueue_online_cpu(unsigned int cpu)
> >   int pi;
> >
> >   mutex_lock(&wq_pool_mutex);
> > - cpumask_set_cpu(cpu, wq_online_cpumask);
> >
> > - for_each_cpu_worker_pool(pool, cpu) {
> > - mutex_lock(&wq_pool_attach_mutex);
> > + mutex_lock(&wq_pool_attach_mutex);
> > + cpumask_set_cpu(cpu, wq_online_cpumask);
> > + for_each_cpu_worker_pool(pool, cpu)
> >   rebind_workers(pool);
> > - mutex_unlock(&wq_pool_attach_mutex);
> > - }
> > + mutex_unlock(&wq_pool_attach_mutex);
> >
> >   /* update CPU affinity of workers of unbound pools */
> >   for_each_pool(pool, pi) {
> > @@ -5117,14 +5116,13 @@ int workqueue_offline_cpu(unsigned int cpu)
> >   if (WARN_ON(cpu != smp_processor_id()))
> >   return -1;
> >
> > - for_each_cpu_worker_pool(pool, cpu
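
The affinity decision made in worker_attach_to_pool() by the quoted patch can be modelled in a few lines of plain C. This is a simplified sketch, not kernel code: `mask_t` and `pick_worker_affinity()` are hypothetical stand-ins for `struct cpumask` and the `cpumask_intersects()` check, with bit N of a 64-bit word meaning CPU N.

```c
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t mask_t;

/*
 * Mirror of the decision in the patch: keep the pool's mask only if at
 * least one of its CPUs is online from the workqueue's point of view,
 * otherwise fall back to the mask of all possible CPUs.
 */
static mask_t pick_worker_affinity(mask_t pool_mask, mask_t wq_online,
				   mask_t possible)
{
	if (pool_mask & wq_online)
		return pool_mask;
	return possible;
}
```

For a per-CPU pool whose only CPU has already been cleared from the online mask, the helper returns the possible mask, which matches the "late-spawned per-CPU kworker" case in the changelog.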

Re: WARNING: suspicious RCU usage in modeset_lock

2020-12-18 Thread Tetsuo Handa

On Wed, Dec 16, 2020 at 5:16 PM Paul E. McKenney  wrote:
> In my experience, lockdep will indeed complain if an interrupt handler
> returns while in an RCU read-side critical section.

Can't we add lock status checks at the beginning and the end of interrupt
handler functions (e.g. checking that "struct task_struct"->lockdep_depth
did not change)?
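
A rough userspace model of that suggestion, assuming a per-task nesting counter, might look like the sketch below. `struct task`, `run_handler_checked()` and the two example handlers are hypothetical; the real lockdep state is considerably richer than a single integer.

```c
/* Hypothetical stand-in for the per-task lock nesting counter. */
struct task {
	int lockdep_depth;
};

typedef void (*irq_handler_fn)(struct task *);

/* Run a handler and report whether it returned with the same lock depth. */
static int run_handler_checked(struct task *t, irq_handler_fn handler)
{
	int before = t->lockdep_depth;

	handler(t);
	return t->lockdep_depth == before;
}

/* Example handlers: one balanced, one that "forgets" to unlock. */
static void balanced_handler(struct task *t)
{
	t->lockdep_depth++;	/* lock */
	t->lockdep_depth--;	/* unlock */
}

static void leaky_handler(struct task *t)
{
	t->lockdep_depth++;	/* lock without a matching unlock */
}
```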

Re: [PATCH v3 0/2] add reset-source RTC binding, update pcf2127 driver

2020-12-18 Thread Alexandre Belloni

On Fri, 18 Dec 2020 11:10:52 +0100, Rasmus Villemoes wrote:
> This adds a reset-source RTC DT binding, as suggested by Alexandre,
> and resends Uwe's patch making use of that property in pcf2127 driver
> to avoid the driver exposing a watchdog that doesn't work (and
> potentially shuffling the enumeration of the existing devices that do
> work).
> 
> v3: elide the refactoring patch already in -next (5d78533a0c53 - rtc:
> pcf2127: move watchdog initialisation to a separate function), make
> sure to cc the DT binding list.
> 
> [...]

Applied, thanks!

[1/2] dt-bindings: rtc: add reset-source property
  commit: 320d159e2d63a97a40f24cd6dfda5a57eec65b91
[2/2] rtc: pcf2127: only use watchdog when explicitly available
  commit: 71ac13457d9d1007effde65b54818106b2c2b525

Best regards,
-- 
Alexandre Belloni

Re: [GIT PULL] pwm: Changes for v5.11-rc1

2020-12-18 Thread Thierry Reding

On Fri, Dec 18, 2020 at 12:35:09PM -0800, Linus Torvalds wrote:
> On Fri, Dec 18, 2020 at 8:04 AM Thierry Reding wrote:
> >
> > This is a fairly big release cycle from the PWM framework's point of
> > view.
> 
> Why does all of this have commit dates from the last day?
> 
> It clearly cannot have been in linux-next in this form, at least.
> 
> I pulled and then unpulled. Don't send me stuff that hasn't been in
> next without a _lot_ of explanations for why, most certainly not the
> week before Christmas.

I didn't realize that this would show up as all new commits. The reason
why this happens is because the first commit in the tree is a fix for an
issue for which Uwe had sent an alternative patch to you directly for
inclusion in v5.10.

After going over the patches again as I was preparing the pull request,
I realized that the commit message was no longer accurate, so I changed
the commit message of the first commit, which then caused all of the
subsequent patches (i.e. all of them) to be rewritten.

The only change that hasn't been in linux-next for at least a week is a
bugfix I merged two days ago. The rest should be identical except for
the commit message on that first commit.

For reference, here's a diff on my for-next branch that the pull request
is based on, compared to what it was like a week ago:

$ git diff for-next@{8days}..pwm/for-5.11-rc1
diff --git a/drivers/pwm/pwm-sun4i.c b/drivers/pwm/pwm-sun4i.c
index cc1eb0818648..ce5c4fc8da6f 100644
--- a/drivers/pwm/pwm-sun4i.c
+++ b/drivers/pwm/pwm-sun4i.c
@@ -294,12 +294,8 @@ static int sun4i_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,

ctrl |= BIT_CH(PWM_CLK_GATING, pwm->hwpwm);

-   if (state->enabled) {
+   if (state->enabled)
ctrl |= BIT_CH(PWM_EN, pwm->hwpwm);
-   } else {
-   ctrl &= ~BIT_CH(PWM_EN, pwm->hwpwm);
-   ctrl &= ~BIT_CH(PWM_CLK_GATING, pwm->hwpwm);
-   }

sun4i_pwm_writel(sun4i_pwm, ctrl, PWM_CTRL_REG);

And that corresponds to the topmost patch.

I hope this clarifies things, and sorry for not mentioning this in the
pull request.

Thierry


Re: [PATCH v2 0/6] kernfs: proposed locking and concurrency improvement

2020-12-18 Thread Ian Kent

On Fri, 2020-12-18 at 21:20 +0800, Fox Chen wrote:
> On Fri, Dec 18, 2020 at 7:21 PM Ian Kent  wrote:
> > On Fri, 2020-12-18 at 16:01 +0800, Fox Chen wrote:
> > > On Fri, Dec 18, 2020 at 3:36 PM Ian Kent 
> > > wrote:
> > > > On Thu, 2020-12-17 at 10:14 -0500, Tejun Heo wrote:
> > > > > Hello,
> > > > > 
> > > > > On Thu, Dec 17, 2020 at 07:48:49PM +0800, Ian Kent wrote:
> > > > > > > > What could be done is to make the kernfs node attr_mutex
> > > > > > > > a pointer and dynamically allocate it but even that is too
> > > > > > > > costly a size addition to the kernfs node structure as
> > > > > > > > Tejun has said.
> > > > > > 
> > > > > > I guess the question to ask is, is there really a need to
> > > > > > call kernfs_refresh_inode() from functions that are usually
> > > > > > reading/checking functions.
> > > > > > 
> > > > > > > Would it be sufficient to refresh the inode in the write/set
> > > > > > > operations in (if there's any) places where things like
> > > > > > > setattr_copy() is not already called?
> > > > > > 
> > > > > > Perhaps GKH or Tejun could comment on this?
> > > > > 
> > > > > > My memory is a bit hazy but invalidations on reads is how sysfs
> > > > > > namespace is implemented, so I don't think there's an easy way
> > > > > > around that. The only thing I can think of is embedding the lock
> > > > > > into attrs and doing xchg dance when attaching it.
> > > > 
> > > > > Sounds like you're saying it would be ok to add a lock to the
> > > > > attrs structure, am I correct?
> > > > 
> > > > Assuming it is then, to keep things simple, use two locks.
> > > > 
> > > > > One global lock for the allocation and an attrs lock for all the
> > > > > attrs field updates including the kernfs_refresh_inode() update.
> > > > > 
> > > > > The critical section for the global lock could be reduced and it
> > > > > changed to a spin lock.
> > > > 
> > > > In __kernfs_iattrs() we would have something like:
> > > > 
> > > > > take the allocation lock
> > > > > do the allocated checks
> > > > >   assign if existing attrs
> > > > >   release the allocation lock
> > > > >   return existing if found
> > > > > otherwise
> > > > >   release the allocation lock
> > > > > 
> > > > > allocate and initialize attrs
> > > > > 
> > > > > take the allocation lock
> > > > > check if someone beat us to it
> > > > >   free and grab existing attrs
> > > > > otherwise
> > > > >   assign the new attrs
> > > > > release the allocation lock
> > > > > return attrs
> > > > 
> > > > Add a spinlock to the attrs struct and use it everywhere for
> > > > field updates.
> > > > 
> > > > Am I on the right track or can you see problems with this?
> > > > 
> > > > Ian
> > > > 
> > > 
> > > > umm, we update the inode in kernfs_refresh_inode, right??  So I
> > > > guess the problem is how can we protect the inode when
> > > > kernfs_refresh_inode is called, not the attrs??
> > 
> > But the attrs (which is what's copied from) were protected by the
> > mutex lock (IIUC) so dealing with the inode attributes implies
> > dealing with the kernfs node attrs too.
> > 
> > > For example in kernfs_iop_setattr() the call to setattr_copy()
> > > copies the node attrs to the inode under the same mutex lock. So,
> > > if a read lock is used the copy in kernfs_refresh_inode() is no
> > > longer protected, it needs to be protected in a different way.
> > 
> 
> Ok, I'm actually wondering why the VFS holds exclusive i_rwsem for
> .setattr but no lock for .getattr (misdocumented?? sometimes they
> have as you've found out)? What does it protect against?? Because
> .permission does a similar thing here -- updating inode attributes,
> the goal is to provide the same protection level for .permission as
> for .setattr, am I right???

As far as the documentation goes that's probably my misunderstanding
of it.

It does happen that the VFS makes assumptions about how call backs
are meant to be used.

Read-like call backs, such as .getattr() and .permission(), are meant to
be used, well, like read-like functions, so the VFS should be ok to
take locks or not based on the operation context at hand.

So it's not about the locking for these call backs per se, it's about
the context in which they are called.

For example, in link_path_walk(), at the beginning of the component
lookup loop (essentially for the containing directory at that point),
may_lookup() is called which leads to a call to .permission() without
any inode lock held at that point.

But file opens (possibly following a path walk to resolve a path)
are different.

For example, do_filp_open() calls path_openat() which leads to a
call to open_last_lookups(), which leads to a call to .permission()
along the way. And in this case there are two contexts, an open()
with create or one without create, the former needing the exclusive
inode lock and the latter able to use the shared lock.

So it's about the locking needed for the encompassing operation that
is bei
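
The two-phase allocation scheme quoted earlier in this message (check under the allocation lock, allocate outside it, then re-check before installing) can be sketched in userspace C with a pthread mutex. This is only an illustration: `struct attrs`, `struct node` and `node_attrs()` are hypothetical stand-ins, not the kernfs structures or __kernfs_iattrs() itself, and a mutex is used where the thread suggests a spin lock.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernfs node and its lazily allocated attrs. */
struct attrs {
	int mode;
};

struct node {
	struct attrs *iattr;
};

static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;

static struct attrs *node_attrs(struct node *n)
{
	struct attrs *fresh, *ret;

	/* Fast path: return existing attrs, holding the lock only briefly. */
	pthread_mutex_lock(&alloc_lock);
	ret = n->iattr;
	pthread_mutex_unlock(&alloc_lock);
	if (ret)
		return ret;

	/* Allocate and initialize outside the lock. */
	fresh = calloc(1, sizeof(*fresh));
	if (!fresh)
		return NULL;

	/* Re-check under the lock: someone may have beaten us to it. */
	pthread_mutex_lock(&alloc_lock);
	if (n->iattr) {
		ret = n->iattr;
	} else {
		n->iattr = fresh;
		ret = fresh;
		fresh = NULL;
	}
	pthread_mutex_unlock(&alloc_lock);

	free(fresh);	/* no-op when our allocation was installed */
	return ret;
}
```

Keeping the allocation outside the lock is what allows the critical section to shrink enough for a spin lock; field updates would then take a separate per-attrs lock, which is not modelled here.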

Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-12-18 Thread Thierry Reding

On Fri, Dec 18, 2020 at 04:15:45PM -0600, Rob Herring wrote:
> On Thu, Dec 17, 2020 at 9:00 AM Thierry Reding wrote:
> >
> > On Tue, Nov 10, 2020 at 08:33:09PM +0100, Thierry Reding wrote:
> > > On Fri, Nov 06, 2020 at 04:25:48PM +0100, Thierry Reding wrote:
> > > > On Thu, Nov 05, 2020 at 05:47:21PM +, Robin Murphy wrote:
> > > > > On 2020-11-05 16:43, Thierry Reding wrote:
> > > > > > On Thu, Sep 24, 2020 at 01:27:25PM +0200, Thierry Reding wrote:
> > > > > > > On Tue, Sep 15, 2020 at 02:36:48PM +0200, Thierry Reding wrote:
> > > > > > > > On Mon, Sep 14, 2020 at 04:08:29PM -0600, Rob Herring wrote:
> > > > > > > > > > On Fri, Sep 04, 2020 at 02:59:57PM +0200, Thierry Reding wrote:
> > > > > > > > > > > From: Thierry Reding 
> > > > > > > > > > >
> > > > > > > > > > > Reserved memory regions can be marked as "active" if hardware is
> > > > > > > > > > > expected to access the regions during boot and before the operating
> > > > > > > > > > > system can take control. One example where this is useful is for the
> > > > > > > > > > > operating system to infer whether the region needs to be
> > > > > > > > > > > identity-mapped through an IOMMU.
> > > > > > > > > >
> > > > > > > > > > I like simple solutions, but this hardly seems adequate to solve the
> > > > > > > > > > problem of passing IOMMU setup from bootloader/firmware to the OS. Like
> > > > > > > > > > what is the IOVA that's supposed to be used if identity mapping is not
> > > > > > > > > > used?
> > > > > > > > >
> > > > > > > > > The assumption here is that if the region is not active there is no need
> > > > > > > > > for the IOVA to be specified because the kernel will allocate memory and
> > > > > > > > > assign any IOVA of its choosing.
> > > > > > > >
> > > > > > > > > Also, note that this is not meant as a way of passing IOMMU setup from
> > > > > > > > > the bootloader or firmware to the OS. The purpose of this is to specify
> > > > > > > > > that some region of memory is actively being accessed during boot. The
> > > > > > > > > particular case that I'm looking at is where the bootloader set up a
> > > > > > > > > splash screen and keeps it on during boot. The bootloader has not set up
> > > > > > > > > an IOMMU mapping and the identity mapping serves as a way of keeping the
> > > > > > > > > accesses by the display hardware working during the transitional period
> > > > > > > > > after the IOMMU translations have been enabled by the kernel but before
> > > > > > > > > the kernel display driver has had a chance to set up its own IOMMU
> > > > > > > > > mappings.
> > > > > > > > >
> > > > > > > > > > If you know enough about the regions to assume identity mapping, then
> > > > > > > > > > can't you know if active or not?
> > > > > > > > >
> > > > > > > > > We could alternatively add some property that describes the region as
> > > > > > > > > requiring an identity mapping. But note that we can't make any
> > > > > > > > > assumptions here about the usage of these regions because the IOMMU
> > > > > > > > > driver simply has no way of knowing what they are being used for.
> > > > > > > > >
> > > > > > > > > Some additional information is required in device tree for the IOMMU
> > > > > > > > > driver to be able to make that decision.
> > > > > > > >
> > > > > > > > Rob, can you provide any hints on exactly how you want to move this
> > > > > > > > forward? I don't know in what direction you'd like to proceed.
> > > > > >
> > > > > > Hi Rob,
> > > > > >
> > > > > > > do you have any suggestions on how to proceed with this? I'd like to get
> > > > > > > this moving again because it's something that's been nagging me for some
> > > > > > > months now. It also requires changes across two levels in the bootloader
> > > > > > > stack as well as Linux and it takes quite a bit of work to make all the
> > > > > > > changes, so before I go and rewrite everything I'd like to get the DT
> > > > > > > bindings sorted out first.
> > > > > > >
> > > > > > > So just to summarize why I think this simple solution is good enough: it
> > > > > > > tries to solve a very narrow and simple problem. This is not an attempt
> > > > > > > at describing the firmware's full IOMMU setup to the kernel. In fact, it
> > > > > > > is primarily targeted at cases where the firmware hasn't set up an IOMMU
> > > > > > > at all, and we just want to make sure that when the kernel takes over
> > > > > > and does want to enable the IOMMU, that all the regions that are
> > > > > > actively being accessed by no

Re: [PATCH v2 12/12] ipu3-cio2: Add cio2-bridge to ipu3-cio2 driver

2020-12-18 Thread Laurent Pinchart

Hi Daniel,

On Fri, Dec 18, 2020 at 11:57:54PM +, Daniel Scally wrote:
> Hi Laurent - thanks for the comments
> 
> On 18/12/2020 16:53, Laurent Pinchart wrote:
> >> +static void cio2_bridge_init_property_names(struct cio2_sensor *sensor)
> >> +{
> >> +  strscpy(sensor->prop_names.clock_frequency, "clock-frequency",
> >> +  sizeof(sensor->prop_names.clock_frequency));
> >> +  strscpy(sensor->prop_names.rotation, "rotation",
> >> +  sizeof(sensor->prop_names.rotation));
> >> +  strscpy(sensor->prop_names.bus_type, "bus-type",
> >> +  sizeof(sensor->prop_names.bus_type));
> >> +  strscpy(sensor->prop_names.data_lanes, "data-lanes",
> >> +  sizeof(sensor->prop_names.data_lanes));
> >> +  strscpy(sensor->prop_names.remote_endpoint, "remote-endpoint",
> >> +  sizeof(sensor->prop_names.remote_endpoint));
> >> +  strscpy(sensor->prop_names.link_frequencies, "link-frequencies",
> >> +  sizeof(sensor->prop_names.link_frequencies));
> > 
> > Just curious, was there anything not working correctly with the proposal
> > I made ?
> > 
> > static const struct cio2_property_names prop_names = {
> > .clock_frequency = "clock-frequency",
> > .rotation = "rotation",
> > .bus_type = "bus-type",
> > .data_lanes = "data-lanes",
> > .remote_endpoint = "remote-endpoint",
> > };
> > 
> > static void cio2_bridge_init_property_names(struct cio2_sensor *sensor)
> > {
> > sensor->prop_names = prop_names;
> > }
> > 
> > It generates a warning when the string is too long for the field size,
> > which should help catching issues at compilation time.
> 
> Yes, though I don't know how much of a real-world problem it would have
> been - if you recall we have the issue that the device grabs a reference
> to the software_nodes (after we stopped delaying until after the
> i2c_client is available), which means we can't safely free the
> cio2_bridge struct on module unload. That also means we can't rely on
> those pointers to string literals existing, because if the ipu3-cio2
> module gets unloaded they'll be gone.

But the strings above are not stored as literals in .rodata, they're
copied in prop_names (itself in .rodata), which is then copied to
sensor->prop_names.
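
To illustrate the distinction being drawn here, the following standalone sketch uses a trimmed-down, hypothetical version of the structure (`struct prop_names` with two fields, and made-up array sizes). String literals used to initialize `char` arrays are copied into the object, and struct assignment copies those bytes again, so the destination holds no pointers back into the initializer.

```c
#include <string.h>

/* Hypothetical, trimmed-down version of the property-name table. */
struct prop_names {
	char clock_frequency[16];
	char rotation[9];
};

struct sensor {
	struct prop_names prop_names;
};

/* The literals initialize the char arrays, so the bytes live in this object. */
static const struct prop_names default_names = {
	.clock_frequency = "clock-frequency",
	.rotation = "rotation",
};

static void init_property_names(struct sensor *s)
{
	/* Struct assignment copies the arrays; no pointers to literals are kept. */
	s->prop_names = default_names;
}
```

If a literal were longer than its array, the compiler would warn about the initializer, which is the compile-time check mentioned in the quoted proposal.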

> Shame, as it's way neater.
> 
> >> +static void cio2_bridge_init_swnode_names(struct cio2_sensor *sensor)
> >> +{
> >> +  snprintf(sensor->node_names.remote_port, 7, "port@%u", sensor->ssdb.link);
> >> +  strscpy(sensor->node_names.port, "port@0", sizeof(sensor->node_names.port));
> >> +  strscpy(sensor->node_names.endpoint, "endpoint@0", sizeof(sensor->node_names.endpoint));
> > 
> > I'd wrap lines, but maybe that's because I'm an old-school, 80-columns
> > programmer :-)
> 
> Heh sure, I'll wrap them.
> 
> >> +static int cio2_bridge_connect_sensors(struct cio2_bridge *bridge,
> >> + struct pci_dev *cio2)
> >> +{
> >> +  struct fwnode_handle *fwnode;
> >> +  struct cio2_sensor *sensor;
> >> +  struct acpi_device *adev;
> >> +  unsigned int i;
> >> +  int ret = 0;
> >> +
> >> +  for (i = 0; i < ARRAY_SIZE(cio2_supported_sensors); i++) {
> >> +  const struct cio2_sensor_config *cfg = &cio2_supported_sensors[i];
> >> +
> >> +  for_each_acpi_dev_match(adev, cfg->hid, NULL, -1) {
> >> +  if (bridge->n_sensors >= CIO2_NUM_PORTS) {
> >> +  dev_warn(&cio2->dev, "Exceeded available CIO2 ports\n");
> >> +  /* overflow i so outer loop ceases */
> >> +  i = ARRAY_SIZE(cio2_supported_sensors);
> >> +  break;
> > 
> > Or just
> > 
> > return 0;
> > 
> > ?
> 
> Derp, yes of course.
> 
> 
> >> +/* Data representation as it is in ACPI SSDB buffer */
> >> +struct cio2_sensor_ssdb {
> >> +  u8 version; /*  */
> >> +  u8 sku; /* 0001 */
> >> +  u8 guid_csi2[16];   /* 0002 */
> >> +  u8 devfunction; /* 0003 */
> >> +  u8 bus; /* 0004 */
> >> +  u32 dphylinkenfuses;/* 0005 */
> >> +  u32 clockdiv;   /* 0009 */
> >> +  u8 link;/* 0013 */
> >> +  u8 lanes;   /* 0014 */
> >> +  u32 csiparams[10];  /* 0015 */
> >> +  u32 maxlanespeed;   /* 0019 */
> >> +  u8 sensorcalibfileidx;  /* 0023 */
> >> +  u8 sensorcalibfileidxInMBZ[3];  /* 0024 */
> >> +  u8 romtype; /* 0025 */
> >> +  u8 vcmtype; /* 0026 */
> >> +  u8 platforminfo;/* 0027 */
> > 
> > Why stop at 27 ? :-) I'd either go all the way, or not at all. It's also
> > quite customary to represent offset as hex values, as that's what most
> > hex editors / viewers will show.
> 
> Oops
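
One way to keep offset comments honest is to let the compiler check them. The sketch below assumes the structure is packed and only reproduces the first few fields under a hypothetical name (`struct ssdb_head`); the hex values are derived from the field sizes shown in the quoted patch, not taken from any ACPI documentation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical excerpt of the SSDB layout, assumed packed (GCC/Clang
 * attribute) so that offsets follow directly from the field sizes.
 */
struct ssdb_head {
	uint8_t version;
	uint8_t sku;
	uint8_t guid_csi2[16];
	uint8_t devfunction;
	uint8_t bus;
	uint32_t dphylinkenfuses;
	uint32_t clockdiv;
	uint8_t link;
} __attribute__((packed));

/* Compile-time checks document the byte offsets in hex. */
_Static_assert(offsetof(struct ssdb_head, guid_csi2) == 0x02, "guid_csi2");
_Static_assert(offsetof(struct ssdb_head, devfunction) == 0x12, "devfunction");
_Static_assert(offsetof(struct ssdb_head, dphylinkenfuses) == 0x14, "dphylinkenfuses");
_Static_assert(offsetof(struct ssdb_head, link) == 0x1c, "link");
```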

mmotm 2020-12-18-16-27 uploaded

2020-12-18 Thread akpm

The mm-of-the-moment snapshot 2020-12-18-16-27 has been uploaded to

   https://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

https://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
https://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.


A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

https://github.com/hnaz/linux-mm

The directory https://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is also available at

https://github.com/hnaz/linux-mm



This mmotm tree contains the following patches against 5.10:
(patches marked "*" will be included in linux-next)

  origin.patch
* mm-memcg-bail-early-from-swap-accounting-if-memcg-disabled.patch
* mm-memcg-warning-on-memcg-after-readahead-page-charged.patch
* mm-memcg-remove-unused-definitions.patch
* mm-kvm-account-kvm_vcpu_mmap-to-kmemcg.patch
* mm-memcontrol-rewrite-mem_cgroup_page_lruvec.patch
* epoll-check-for-events-when-removing-a-timed-out-thread-from-the-wait-queue.patch
* epoll-simplify-signal-handling.patch
* epoll-pull-fatal-signal-checks-into-ep_send_events.patch
* epoll-move-eavail-next-to-the-list_empty_careful-check.patch
* epoll-simplify-and-optimize-busy-loop-logic.patch
* epoll-pull-all-code-between-fetch_events-and-send_event-into-the-loop.patch
* epoll-replace-gotos-with-a-proper-loop.patch
* epoll-eliminate-unnecessary-lock-for-zero-timeout.patch
* kasan-drop-unnecessary-gpl-text-from-comment-headers.patch
* kasan-kasan_vmalloc-depends-on-kasan_generic.patch
* kasan-group-vmalloc-code.patch
* kasan-shadow-declarations-only-for-software-modes.patch
* kasan-rename-unpoison_shadow-to-unpoison_range.patch
* kasan-rename-kasan_shadow_-to-kasan_granule_.patch
* kasan-only-build-initc-for-software-modes.patch
* kasan-split-out-shadowc-from-commonc.patch
* kasan-define-kasan_memory_per_shadow_page.patch
* kasan-rename-report-and-tags-files.patch
* kasan-dont-duplicate-config-dependencies.patch
* kasan-hide-invalid-free-check-implementation.patch
* kasan-decode-stack-frame-only-with-kasan_stack_enable.patch
* kasan-arm64-only-init-shadow-for-software-modes.patch
* kasan-arm64-only-use-kasan_depth-for-software-modes.patch
* kasan-arm64-move-initialization-message.patch
* kasan-arm64-rename-kasan_init_tags-and-mark-as-__init.patch
* kasan-rename-addr_has_shadow-to-addr_has_metadata.patch
* kasan-rename-print_shadow_for_address-to-print_memory_metadata.patch
* kasan-rename-shadow-layout-macros-to-meta.patch
* kasan-separate-metadata_fetch_row-for-each-mode.patch
* kasan-introduce-config_kasan_hw_tags.patch
* arm64-enable-armv85-a-asm-arch-option.patch
* arm64-mte-add-in-kernel-mte-helpers.patch
* arm64-mte-reset-the-page-tag-in-page-flags.patch
* arm64-mte-add-in-kernel-tag-fault-handler.patch
* arm64-kasan-allow-enabling-in-kernel-mte.patch
* arm64-mte-convert-gcr_user-into-an-exclude-mask.patch
* arm64-mte-switch-gcr_el1-in-kernel-entry-and-exit.patch
* kasan-mm-untag-page-address-in-free_reserved_area.patch
* arm64-kasan-align-allocations-for-hw_tags.patch
* arm64-kasan-add-arch-layer-for-memory-tagging-helpers.patch
* kasan-define-kasan_granule_size-for-hw_tags.patch
* kasan-x86-s390-update-undef-config_kasan.patch
* kasan-arm64-expand-config_kasan-checks.patch
* kasan-arm64-implement-hw_tags-runtime.patch
* kasan-arm64-print-report-from-tag-fault-handler.patch
* kasan-mm-reset-tags-when-accessing-metadata.patch
* kasan-arm64-enable-config_kasan_hw_tags.patch
* kasan-add-documentation-for-hardware-tag-based-mode.patch
* kselftest-arm64-check-gcr_el1-after-context-switch.patch
* kasan-simplify-quarantine_put-call-site.patch
* kasan-rename-get_alloc-free_info.patch
* kasan-introduce-set_alloc_info.patch
* kasan-arm64-unpoison-stack-only-with-config_kasan_stack.patch
* kasan-allow-vmap_stack-for-hw_tags-mode.patch
* kasan-remove-__kasan_unpoison_stack.patch
* kasan-inline-kasan_reset_tag-for-tag-based-modes.patch
* kasan-inline-random_tag-for-hw_tags.patch
* kasan-open-code-kasan_unpoison_slab.patch
*

Re: [PATCH v2 12/12] ipu3-cio2: Add cio2-bridge to ipu3-cio2 driver

2020-12-18 Thread Daniel Scally

Hi Andy, thanks for the comments

On 18/12/2020 21:17, Andy Shevchenko wrote:
> On Thu, Dec 17, 2020 at 11:43:37PM +, Daniel Scally wrote:
>> Currently on platforms designed for Windows, connections between CIO2 and
>> sensors are not properly defined in DSDT. This patch extends the ipu3-cio2
>> driver to compensate by building software_node connections, parsing the
>> connection properties from the sensor's SSDB buffer.
> 
> ...
> 
>> +sensor->ep_properties[0] = 
>> PROPERTY_ENTRY_U32(sensor->prop_names.bus_type, 4);
> 
> Does 4 have any meaning that can be described by a #define?

It's V4L2_FWNODE_BUS_TYPE_CSI2_DPHY:

https://elixir.bootlin.com/linux/latest/source/drivers/media/v4l2-core/v4l2-fwnode.c#L36

That enum's not in an accessible header, but I can define it in this
module's header

>> +static void cio2_bridge_init_swnode_names(struct cio2_sensor *sensor)
>> +{
>> +snprintf(sensor->node_names.remote_port, 7, "port@%u", 
>> sensor->ssdb.link);
> 
> Hmm... I think you should use actual size of remote_port instead of 7.

Yes ok


>> +strscpy(sensor->node_names.port, "port@0", 
>> sizeof(sensor->node_names.port));
> 
> Yeah, I would rather like to see one point of the definition of the format.
> If it's the same as per OF case, perhaps some generic header (like fwnode.h?) 
> is good for this?
> In this case the 5 in one of the previous patches can also be derived from
> the format.

Okedokey. It is indeed intended to match OF and ACPI case, both of which
mandate that format (though only ACPI's functions seem to enforce it).
fwnode.h seems as good a place as any to me, though I'm not sure there's
anywhere in the driver code for OF or ACPI that would actually use it at
the moment.
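
A minimal userspace sketch of the two review points above (size the buffer from the field itself, and define the name format in one place). The macro names PORT_NAME_FMT and PORT_NAME_BUF_LEN, the struct, and the helper are hypothetical and not taken from the patch or from fwnode.h:

```c
#include <stdio.h>

/* Hypothetical names: the thread only suggests a single definition point. */
#define PORT_NAME_FMT      "port@%u"
/* "port@" plus NUL (sizeof of the literal) plus up to 10 decimal digits. */
#define PORT_NAME_BUF_LEN  (sizeof("port@") + 10)

struct ex_node_names {
	char remote_port[PORT_NAME_BUF_LEN];
};

/* Returns 0 on success, -1 if the formatted name would have been truncated. */
static int ex_init_remote_port_name(struct ex_node_names *names,
				    unsigned int link)
{
	int n = snprintf(names->remote_port, sizeof(names->remote_port),
			 PORT_NAME_FMT, link);

	if (n < 0 || (size_t)n >= sizeof(names->remote_port))
		return -1;
	return 0;
}
```

Because both the buffer length and the snprintf() bound come from the same definition, changing the format cannot silently leave a stale magic number such as 7 behind.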

>> +strscpy(sensor->node_names.endpoint, "endpoint@0", 
>> sizeof(sensor->node_names.endpoint));
> 
> Similar here.
> 
>> +}
> 
> ...
> 
>> +for (i = 0; i < ARRAY_SIZE(cio2_supported_sensors); i++) {
>> +const struct cio2_sensor_config *cfg = 
>> &cio2_supported_sensors[i];
>> +
>> +for_each_acpi_dev_match(adev, cfg->hid, NULL, -1) {
> 
>> +if (bridge->n_sensors >= CIO2_NUM_PORTS) {
>> +dev_warn(&cio2->dev, "Exceeded available CIO2 
>> ports\n");
> 
>> +/* overflow i so outer loop ceases */
>> +i = ARRAY_SIZE(cio2_supported_sensors);
>> +break;
> 
> Why not create a new label below and assign ret here, probably with a comment
> on why it's not an error?

Sure, I can do that, but since it wouldn't need any cleanup I could also
just return 0 here as Laurent suggested (but with a comment explaining why
that's ok as you say) - do you have a preference?

>> +}
> 
> ...
> 
>> +ret = cio2_bridge_read_acpi_buffer(adev, "SSDB",
>> +   &sensor->ssdb,
>> +   
>> sizeof(sensor->ssdb));
>> +if (ret < 0)
> 
> if (ret) (because positive case can be returned just by next conditional).

cio2_bridge_read_acpi_buffer() returns the buffer length on success at
the moment, but I can change it to return 0 and have this be if (ret)

>> +goto err_put_adev;
>> +
>> +if (sensor->ssdb.lanes > 4) {
>> +dev_err(&adev->dev,
>> +"Number of lanes in SSDB is invalid\n");
>> +goto err_put_adev;
>> +}
> 
> ...
> 
>> +dev_info(&cio2->dev, "Found supported sensor %s\n",
>> + acpi_dev_name(adev));
>> +
>> +bridge->n_sensors++;
>> +}
>> +}
> 
>   return 0;

Okedokey

> 
>> +err_free_swnodes:
>> +software_node_unregister_nodes(sensor->swnodes);
>> +err_put_adev:
>> +acpi_dev_put(sensor->adev);
> 
> err_out:

Depends on question above I think

>> +return ret;
>> +}
> 
> ...
> 
>> +enum cio2_sensor_swnodes {
>> +SWNODE_SENSOR_HID,
>> +SWNODE_SENSOR_PORT,
>> +SWNODE_SENSOR_ENDPOINT,
>> +SWNODE_CIO2_PORT,
>> +SWNODE_CIO2_ENDPOINT,
> 
>> +NR_OF_SENSOR_SWNODES
> 
> Perhaps same namespace, i.e.
> 
>   SWNODE_SENSOR_NR

Yep, will do.

Thanks
Dan

Re: [resend/standalone PATCH v4] Add auxiliary bus support

2020-12-18 Thread Alexandre Belloni

On 18/12/2020 19:36:08-0400, Jason Gunthorpe wrote:
> On Fri, Dec 18, 2020 at 10:16:58PM +0100, Alexandre Belloni wrote:
> 
> > But then again, what about non-enumerable devices on the PCI device? I
> > feel this would exactly fit MFD. This is a collection of IPs that exist
> > as standalone but in this case are grouped in a single device.
> 
> So, if mfd had a mfd_device and a mfd bus_type then drivers would need
> to have both a mfd_driver and a platform_driver to bind. Look at
> something like drivers/char/tpm/tpm_tis.c to see how a multi-probe
> driver is structured
> 
> See Mark's remarks about the old of_platform_device, to explain why we
> don't have a 'dt_device' today
> 

So, what would that mfd_driver have that the platform_driver doesn't
already provide?

> > Note that I then have another issue because the kernel doesn't support
> > irq controllers on PCI and this is exactly what my SoC has. But for now,
> > I can just duplicate the irqchip driver in the MFD driver.
> 
> I think Thomas fixed that recently on x86 at least.. 
> 
> Having to put dummy irq chip drivers in MFD anything sounds scary :|
> 

This isn't a dummy driver; it is a real irqchip. What issue is there with
registering an irqchip from MFD?

> > Let me point to drivers/net/ethernet/cadence/macb_pci.c which is a
> > fairly recent example. It does exactly that and I'm not sure you could
> > do it otherwise while still not having to duplicate most of macb_probe.
> 
> Creating a platform_device to avoid restructuring the driver's probe
> and device logic to be generic is a *really* horrible reason to use a
> platform device.
> 

Definitely, but it made it in and seemed reasonable at the time. I
stumbled upon that a while ago because I wanted to remove
platform_data support from the macb driver and this is the last user. I
never got the time to tackle that.

-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

[PATCH] riscv: Trace irq on only interrupt is enabled

2020-12-18 Thread Atish Patra

We should call irq trace only if interrupts are going to be enabled during
exception handling. Otherwise, it results in the following warning during
boot with lock debugging enabled.

[0.00] [ cut here ]
[0.00] DEBUG_LOCKS_WARN_ON(early_boot_irqs_disabled)
[0.00] WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:4085 
lockdep_hardirqs_on_prepare+0x22a/0x22e
[0.00] Modules linked in:
[0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 
5.10.0-00022-ge20097fb37e2-dirty #548
[0.00] epc: c005d5d4 ra : c005d5d4 sp : c1c01e80
[0.00]  gp : c1d456e0 tp : c1c0a980 t0 : 
[0.00]  t1 :  t2 :  s0 : c1c01ea0
[0.00]  s1 : c100f360 a0 : 002d a1 : c00666ee
[0.00]  a2 :  a3 :  a4 : 
[0.00]  a5 :  a6 : c1c6b390 a7 : 300e
[0.00]  s2 : c2384fe8 s3 :  s4 : 0001
[0.00]  s5 : c1c0a980 s6 : c1d48000 s7 : c1613b4c
[0.00]  s8 : 0fff s9 : 8200 s10: c1613b40
[0.00]  s11:  t3 :  t4 : 
[0.00]  t5 : 0001 t6 : 

Fixes: 3c4697982982 ("riscv:Enable LOCKDEP_SUPPORT & fixup 
TRACE_IRQFLAGS_SUPPORT")
Cc: sta...@vger.kernel.org

Signed-off-by: Atish Patra 
---
 arch/riscv/kernel/entry.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 524d918f3601..7dea5ee5a3ac 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -124,15 +124,15 @@ skip_context_tracking:
REG_L a1, (a1)
jr a1
 1:
-#ifdef CONFIG_TRACE_IRQFLAGS
-   call trace_hardirqs_on
-#endif
/*
 * Exceptions run with interrupts enabled or disabled depending on the
 * state of SR_PIE in m/sstatus.
 */
andi t0, s1, SR_PIE
beqz t0, 1f
+#ifdef CONFIG_TRACE_IRQFLAGS
+   call trace_hardirqs_on
+#endif
csrs CSR_STATUS, SR_IE
 
 1:
-- 
2.25.1

Re: [RFC PATCH net-next] bonding: add a vlan+srcmac tx hashing option

2020-12-18 Thread Jay Vosburgh

Jarod Wilson  wrote:

>This comes from an end-user request, where they're running multiple VMs on
>hosts with bonded interfaces connected to some interesting switch topologies,
>where 802.3ad isn't an option. They're currently running a proprietary
>solution that effectively achieves load-balancing of VMs and bandwidth
>utilization improvements with a similar form of transmission algorithm.
>
>Basically, each VM has its own vlan, so it always sends its traffic out
>the same interface, unless that interface fails. Traffic gets split
>between the interfaces, maintaining a consistent path, with failover still
>available if an interface goes down.
>
>This has been rudimentarily tested to provide similar results, suitable for
>them to use to move off their current proprietary solution.
>
>Still on the TODO list, if this even looks sane to begin with, is
>fleshing out Documentation/networking/bonding.rst.

I'm sure you're aware, but any final submission will also need
to include netlink and iproute2 support.

>Cc: Jay Vosburgh 
>Cc: Veaceslav Falico 
>Cc: Andy Gospodarek 
>Cc: "David S. Miller" 
>Cc: Jakub Kicinski 
>Cc: Thomas Davis 
>Cc: net...@vger.kernel.org
>Signed-off-by: Jarod Wilson 
>---
> drivers/net/bonding/bond_main.c| 27 +--
> drivers/net/bonding/bond_options.c |  1 +
> include/linux/netdevice.h  |  1 +
> include/uapi/linux/if_bonding.h|  1 +
> 4 files changed, 28 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 5fe5232cc3f3..151ce8c7a56f 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -164,7 +164,7 @@ module_param(xmit_hash_policy, charp, 0);
> MODULE_PARM_DESC(xmit_hash_policy, "balance-alb, balance-tlb, balance-xor, 
> 802.3ad hashing method; "
>  "0 for layer 2 (default), 1 for layer 3+4, "
>  "2 for layer 2+3, 3 for encap layer 2+3, "
>- "4 for encap layer 3+4");
>+ "4 for encap layer 3+4, 5 for vlan+srcmac");
> module_param(arp_interval, int, 0);
> MODULE_PARM_DESC(arp_interval, "arp interval in milliseconds");
> module_param_array(arp_ip_target, charp, NULL, 0);
>@@ -1434,6 +1434,8 @@ static enum netdev_lag_hash bond_lag_hash_type(struct 
>bonding *bond,
>   return NETDEV_LAG_HASH_E23;
>   case BOND_XMIT_POLICY_ENCAP34:
>   return NETDEV_LAG_HASH_E34;
>+  case BOND_XMIT_POLICY_VLAN_SRCMAC:
>+  return NETDEV_LAG_HASH_VLAN_SRCMAC;
>   default:
>   return NETDEV_LAG_HASH_UNKNOWN;
>   }
>@@ -3494,6 +3496,20 @@ static bool bond_flow_ip(struct sk_buff *skb, struct 
>flow_keys *fk,
>   return true;
> }
> 
>+static inline u32 bond_vlan_srcmac_hash(struct sk_buff *skb)
>+{
>+  struct ethhdr *mac_hdr = (struct ethhdr *)skb_mac_header(skb);
>+  u32 srcmac = mac_hdr->h_source[5];
>+  u16 vlan;
>+
>+  if (!skb_vlan_tag_present(skb))
>+  return srcmac;
>+
>+  vlan = skb_vlan_tag_get(skb);
>+
>+  return srcmac ^ vlan;

For the common configuration wherein multiple VLANs are
configured atop a single interface (and thus by default end up with the
same MAC address), this seems like a fairly weak hash.  The TCI is 16
bits (12 of which are the VID), but only 8 are used from the MAC, which
will be a constant.

Is this algorithm copying the proprietary solution you mention?

-J
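
To make the concern concrete, here is a small userspace model of the quoted hash. It is only a sketch: the function names are made up, and the "hash modulo slave count" selection is a simplifying assumption about how the result is consumed, not a copy of the bonding code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the quoted hash: last byte of the source MAC XOR the VLAN TCI. */
static uint32_t ex_vlan_srcmac_hash(const uint8_t mac[6], bool has_vlan,
				    uint16_t tci)
{
	uint32_t srcmac = mac[5];

	return has_vlan ? (srcmac ^ tci) : srcmac;
}

/* Assumed selection step: pick a slave by hash modulo the slave count. */
static unsigned int ex_pick_slave(uint32_t hash, unsigned int n_slaves)
{
	return hash % n_slaves;
}
```

With one MAC shared by all VLANs, the MAC byte is a constant, so the spread across slaves depends only on the TCI bits. For example, with two slaves, any two VLAN IDs of the same parity land on the same slave.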

>+}
>+
> /* Extract the appropriate headers based on bond's xmit policy */
> static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb,
> struct flow_keys *fk)
>@@ -3501,10 +3517,14 @@ static bool bond_flow_dissect(struct bonding *bond, 
>struct sk_buff *skb,
>   bool l34 = bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER34;
>   int noff, proto = -1;
> 
>-  if (bond->params.xmit_policy > BOND_XMIT_POLICY_LAYER23) {
>+  switch (bond->params.xmit_policy) {
>+  case BOND_XMIT_POLICY_ENCAP23:
>+  case BOND_XMIT_POLICY_ENCAP34:
>   memset(fk, 0, sizeof(*fk));
>   return __skb_flow_dissect(NULL, skb, &flow_keys_bonding,
> fk, NULL, 0, 0, 0, 0);
>+  default:
>+  break;
>   }
> 
>   fk->ports.ports = 0;
>@@ -3556,6 +3576,9 @@ u32 bond_xmit_hash(struct bonding *bond, struct sk_buff 
>*skb)
>   skb->l4_hash)
>   return skb->hash;
> 
>+  if (bond->params.xmit_policy == BOND_XMIT_POLICY_VLAN_SRCMAC)
>+  return bond_vlan_srcmac_hash(skb);
>+
>   if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER2 ||
>   !bond_flow_dissect(bond, skb, &flow))
>   return bond_eth_hash(skb);
>diff --git a/drivers/net/bonding/bond_options.c 
>b/drivers/net/bonding/bond_options.c
>index a4e4e15f574d..9826fe46fca1 100644
>--- a/drivers/net/bonding/bond_op

[PATCH v2] RISC-V: Fix usage of memblock_enforce_memory_limit

2020-12-18 Thread Atish Patra

memblock_enforce_memory_limit accepts the maximum memory size, not the
maximum address that can be handled by the kernel. Fix the function invocation
accordingly.

Fixes: 1bd14a66ee52 ("RISC-V: Remove any memblock representing unusable memory 
area")
Cc: sta...@vger.kernel.org

Reported-by: Bin Meng 
Tested-by: Bin Meng 
Acked-by: Mike Rapoport 
Signed-off-by: Atish Patra 
---
Changes from v1->v2:
1. Added stable-kernel in cc.
2. Added reported/tested by tag.
---
 arch/riscv/mm/init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 13ba533f462b..bf5379135e39 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -176,7 +176,7 @@ void __init setup_bootmem(void)
 * Make sure that any memory beyond mem_start + (-PAGE_OFFSET) is 
removed
 * as it is unusable by kernel.
 */
-   memblock_enforce_memory_limit(mem_start - PAGE_OFFSET);
+   memblock_enforce_memory_limit(-PAGE_OFFSET);
 
/* Reserve from the start of the kernel to the end of the kernel */
memblock_reserve(vmlinux_start, vmlinux_end - vmlinux_start);
-- 
2.25.1
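
The difference between the two expressions can be checked with plain unsigned arithmetic. The values below are assumed for illustration only (a 64-bit PAGE_OFFSET of 0xffffffe000000000 and RAM starting at 0x80000000); they are not taken from the patch.

```c
#include <stdint.h>

/* Assumed example values, chosen for illustration only. */
#define EX_PAGE_OFFSET 0xffffffe000000000ULL
#define EX_MEM_START   0x0000000080000000ULL

/* Old argument: mem_start - PAGE_OFFSET, which is a maximum address. */
static uint64_t ex_old_limit(void)
{
	return EX_MEM_START - EX_PAGE_OFFSET;
}

/* New argument: -PAGE_OFFSET, the size of memory the kernel can map. */
static uint64_t ex_new_limit(void)
{
	return -EX_PAGE_OFFSET;
}
```

Under these assumptions the new limit is 0x2000000000 (128 GiB), while the old expression is larger by exactly mem_start, so the old call let through memory that the kernel cannot use.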

Re: [PATCH 06/22] misc: xlink-pcie: Add documentation for XLink PCIe driver

2020-12-18 Thread mark gross

On Fri, Dec 18, 2020 at 02:59:00PM -0800, Randy Dunlap wrote:
> On 12/1/20 2:34 PM, mgr...@linux.intel.com wrote:
> > From: Srikanth Thokala 
> > 
> > Provide overview of XLink PCIe driver implementation
> > 
> > Cc: linux-...@vger.kernel.org
> > Reviewed-by: Mark Gross 
> > Signed-off-by: Srikanth Thokala 
> > ---
> >  Documentation/vpu/index.rst  |  1 +
> >  Documentation/vpu/xlink-pcie.rst | 91 
> >  2 files changed, 92 insertions(+)
> >  create mode 100644 Documentation/vpu/xlink-pcie.rst
> > 
> 
> Hi--
> 
> For document, chapter, section, etc., headings, please read & use
> Documentation/doc-guide/sphinx.rst:
> 
> * Please stick to this order of heading adornments:
> 
>   1. ``=`` with overline for document title::
> 
>        ==============
>        Document title
>        ==============
> 
>   2. ``=`` for chapters::
> 
>        Chapters
>        ========
> 
>   3. ``-`` for sections::
> 
>        Section
>        -------
> 
>   4. ``~`` for subsections::
> 
>        Subsection
>        ~~~~~~~~~~

Thanks for the help!  I'm new to the Sphinx markup language and appreciate your
advice.  I'll reread that doc-guide.

We'll address the issues on our next posting once the merge window closes.

thanks again for the reviews!

--mark


> 
> > diff --git a/Documentation/vpu/xlink-pcie.rst 
> > b/Documentation/vpu/xlink-pcie.rst
> > new file mode 100644
> > index ..bc64b566989d
> > --- /dev/null
> > +++ b/Documentation/vpu/xlink-pcie.rst
> > @@ -0,0 +1,91 @@
> > +.. SPDX-License-Identifier: GPL-2.0-only
> > +
> > +Kernel driver: xlink-pcie driver
> > +
> > +Supported chips:
> > +  * Intel Edge.AI Computer Vision platforms: Keem Bay
> > +Suffix: Bay
> > +Slave address: 6240
> > +Datasheet: Publicly available at Intel
> > +
> > +Author: Srikanth Thokala srikanth.thok...@intel.com
> > +
> > +-
> > +Introduction:
> 
> No colon at end of chapter/section headings.
> 
> > +-
> > +The xlink-pcie driver in linux-5.4 provides transport layer implementation 
> > for
> 
> Linux 5.4 (?)
> 
> > +the data transfers to support xlink protocol subsystem communication with 
> > the
> 
>  Xlink
> 
> > +peer device. i.e, between remote host system and the local Keem Bay device.
> 
> device, i.e., between the remote host system and
> 
> > +
> > +The Keem Bay device is an ARM based SOC that includes a vision processing
> 
>  ARM-based
> 
> > +unit (VPU) and deep learning, neural network core in the hardware.
> > +The xlink-pcie driver exports a functional device endpoint to the Keem Bay 
> > device
> > +and supports two-way communication with peer device.
> 
>   with the peer device.
> 
> > +
> > +
> > +High-level architecture:
> > +
> > +Remote Host: IA CPU
> > +Local Host: ARM CPU (Keem Bay)::
> > +
> > +
> > ++
> > +|  Remote Host IA CPU  | | Local Host ARM CPU (Keem 
> > Bay) |   |
> > +
> > +==+=+===+===+
> > +|  User App| | User App
> >   |   |
> > +
> > +--+-+---+---+
> > +|   XLink UAPI | | XLink UAPI  
> >   |   |
> > +
> > +--+-+---+---+
> > +|   XLink Core | | XLink Core  
> >   |   |
> > +
> > +--+-+---+---+
> > +|   XLink PCIe | | XLink PCIe  
> >   |   |
> > +
> > +--+-+---+---+
> > +|   XLink-PCIe Remote Host driver  | | XLink-PCIe Local Host 
> > driver  |   |
> > +
> > +--+-+---+---+
> > +
> > |-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:|:|:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:|
> > +
> > +--+-+---+---+
> > +| PCIe Host Controller | | PCIe Device Controller  
> >   | HW|
> > +
> > +--+-+---+---+
> > +   ^ ^
> > +   | |
> > +   |- PCIe x2 Link  -|
> > +
> > +This XLink PCIe driver comprises of two variants:
> > +* Local Host driver
> > +
> > +  * Intended for ARM CPU
> > +  * It is based on PCI Endpoint Framework
> > +  * Driver path: {tre

[PATCH v3 1/2] proc: Allow pid_revalidate() during LOOKUP_RCU

2020-12-18 Thread Stephen Brennan

The pid_revalidate() function requires dropping from RCU into REF lookup
mode. When many threads are resolving paths within /proc in parallel,
this can result in heavy spinlock contention on d_lockref as each thread
tries to grab a reference to the /proc dentry (and drop it shortly
thereafter).

Allow the pid_revalidate() function to execute under LOOKUP_RCU. When
updates must be made to the inode, drop out of RCU and into REF mode.

Signed-off-by: Stephen Brennan 
---

When running ~100 parallel instances of "TZ=/etc/localtime ps -fe
>/dev/null" on a 100-CPU machine, the %sys utilization reaches 90%, and perf
shows the following code path as being responsible for heavy contention on
the d_lockref spinlock:

  walk_component()
lookup_fast()
  unlazy_child()
lockref_get_not_dead(&nd->path.dentry->d_lockref)

By applying this patch, %sys utilization falls to around 60% under the same
workload. Although this particular workload is a bit contrived, we have seen
some monitoring scripts which produced similarly high %sys time due to this
contention.

Changes in v3:
- Rather than call pid_update_inode() with flags, create
  pid_inode_needs_update() to determine whether the call can be skipped.
- Restore the call to the security hook (see next patch).
Changes in v2:
- Remove get_pid_task_rcu_user() and get_proc_task_rcu(), since they were
  unnecessary.
- Remove the call to security_task_to_inode().

 fs/proc/base.c | 35 +--
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index b3422cda2a91..4b246e9bd5df 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1968,6 +1968,20 @@ void pid_update_inode(struct task_struct *task, struct 
inode *inode)
security_task_to_inode(task, inode);
 }
 
+/* See if we can avoid the above call. Assumes RCU lock held */
+static bool pid_inode_needs_update(struct task_struct *task, struct inode 
*inode)
+{
+   kuid_t uid;
+   kgid_t gid;
+
+   if (inode->i_mode & (S_ISUID | S_ISGID))
+   return true;
+   task_dump_owner(task, inode->i_mode, &uid, &gid);
+   if (!uid_eq(uid, inode->i_uid) || !gid_eq(gid, inode->i_gid))
+   return true;
+   return false;
+}
+
 /*
  * Rewrite the inode's ownerships here because the owning task may have
  * performed a setuid(), etc.
@@ -1977,19 +1991,20 @@ static int pid_revalidate(struct dentry *dentry, 
unsigned int flags)
 {
struct inode *inode;
struct task_struct *task;
+   int rv = 0;
 
-   if (flags & LOOKUP_RCU)
-   return -ECHILD;
-
-   inode = d_inode(dentry);
-   task = get_proc_task(inode);
-
+   rcu_read_lock();
+   inode = d_inode_rcu(dentry);
+   task = pid_task(proc_pid(inode), PIDTYPE_PID);
if (task) {
-   pid_update_inode(task, inode);
-   put_task_struct(task);
-   return 1;
+   rv = 1;
+   if ((flags & LOOKUP_RCU) && pid_inode_needs_update(task, inode))
+   rv = -ECHILD;
+   else if (!(flags & LOOKUP_RCU))
+   pid_update_inode(task, inode);
}
-   return 0;
+   rcu_read_unlock();
+   return rv;
 }
 
 static inline bool proc_inode_is_dead(struct inode *inode)
-- 
2.25.1
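
The return-value logic of the reworked pid_revalidate() can be modelled in userspace as follows. This is a sketch only: the flag value, struct, and function name are stand-ins, and the RCU locking and task lookup are reduced to boolean inputs.

```c
#include <errno.h>
#include <stdbool.h>

#define EX_LOOKUP_RCU 0x0040u	/* stand-in flag value for this sketch */

struct ex_result {
	int rv;		/* 1: dentry valid, 0: invalid, -ECHILD: retry in REF mode */
	bool updated;	/* whether the inode would have been rewritten */
};

static struct ex_result ex_pid_revalidate(bool task_alive, unsigned int flags,
					  bool needs_update)
{
	struct ex_result r = { .rv = 0, .updated = false };

	if (!task_alive)
		return r;		/* task is gone: dentry is invalid */

	r.rv = 1;
	if (flags & EX_LOOKUP_RCU) {
		if (needs_update)
			r.rv = -ECHILD;	/* cannot modify the inode in RCU mode */
	} else {
		r.updated = true;	/* REF mode: always refresh the inode */
	}
	return r;
}
```

The fast path is the RCU case with nothing to update: it returns 1 without leaving RCU mode, which is what avoids taking a reference on the dentry.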

[PATCH v3 2/2] proc: ensure security hook is called after exec

2020-12-18 Thread Stephen Brennan

Smack needs its security_task_to_inode() hook to be called when a task
execs a new executable. Store the self_exec_id of the task and call the
hook via pid_update_inode() whenever the exec_id changes.

Signed-off-by: Stephen Brennan 
---

As discussed on the v2 of the patch, this should allow Smack to receive a
security_task_to_inode() call only when the uid/gid changes, or when the task
execs a new binary. I have verified that this doesn't change the performance of
the patch set, and that we do fall out of RCU walk on tasks which have recently
exec'd.

 fs/proc/base.c | 4 +++-
 fs/proc/internal.h | 5 -
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 4b246e9bd5df..ad59e92e8433 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1917,6 +1917,7 @@ struct inode *proc_pid_make_inode(struct super_block * sb,
}
 
task_dump_owner(task, 0, &inode->i_uid, &inode->i_gid);
+   ei->exec_id = task->self_exec_id;
security_task_to_inode(task, inode);
 
 out:
@@ -1965,6 +1966,7 @@ void pid_update_inode(struct task_struct *task, struct 
inode *inode)
task_dump_owner(task, inode->i_mode, &inode->i_uid, &inode->i_gid);
 
inode->i_mode &= ~(S_ISUID | S_ISGID);
+   PROC_I(inode)->exec_id = task->self_exec_id;
security_task_to_inode(task, inode);
 }
 
@@ -1979,7 +1981,7 @@ static bool pid_inode_needs_update(struct task_struct 
*task, struct inode *inode
task_dump_owner(task, inode->i_mode, &uid, &gid);
if (!uid_eq(uid, inode->i_uid) || !gid_eq(gid, inode->i_gid))
return true;
-   return false;
+   return task->self_exec_id != PROC_I(inode)->exec_id;
 }
 
 /*
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index f60b379dcdc7..1df9b039dfc3 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -92,7 +92,10 @@ union proc_op {
 
 struct proc_inode {
struct pid *pid;
-   unsigned int fd;
+   union {
+   unsigned int fd;
+   u32 exec_id;
+   };
union proc_op op;
struct proc_dir_entry *pde;
struct ctl_table_header *sysctl;
-- 
2.25.1

[GIT PULL] power-supply changes for 5.11

2020-12-18 Thread Sebastian Reichel

Hi Linus,

The following changes since commit 3650b228f83adda7e5ee532e2b90429c03f7b9ec:

  Linux 5.10-rc1 (2020-10-25 15:14:11 -0700)

are available in the Git repository at:

  
ssh://g...@gitolite.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply.git
 tags/for-v5.11

for you to fetch changes up to c2362519a04a7307e386e43bc567780d0d7631c7:

  power: supply: Fix a typo in warning message (2020-12-13 01:00:10 +0100)


power supply and reset changes for the v5.11 series

battery/charger driver changes:
 * collie_battery, generic-adc-battery, s3c-adc-battery: convert to GPIO 
descriptors (incl. ARM board files)
 * misc. cleanup and fixes

reset drivers:
 * new poweroff driver for force disabling a regulator
 * Use printk format symbol resolver
 * ocelot: add support for Luton and Jaguar2


Gregory CLEMENT (2):
  dt-bindings: reset: ocelot: Add Luton and Jaguar2 support
  power: reset: ocelot: Add support 2 other MIPS based SoCs

Hans de Goede (1):
  power: supply: axp288_charger: Fix HP Pavilion x2 10 DMI matching

Helge Deller (1):
  power: reset: Use printk format symbol resolver

Linus Walleij (10):
  power: supply: s3c-adc-battery: Convert to GPIO descriptors
  power: supply: collie_battery: Convert to GPIO descriptors
  power: supply: generic-adc-battery: Use GPIO descriptors
  power: supply: bq24190_charger: Drop unused include
  power: supply: bq24735: Drop unused include
  power: supply: ab8500: Use local helper
  power: supply: ab8500: Convert to dev_pm_ops
  power: supply: ab8500_charger: Oneshot threaded IRQs
  power: supply: ab8500_fg: Request all IRQs as threaded
  power: supply: ab8500: Use dev_err_probe() for IIO channels

Masanari Iida (1):
  power: supply: Fix a typo in warning message

Michael Klein (2):
  power: reset: new driver regulator-poweroff
  Documentation: DT: binding documentation for regulator-poweroff

Nigel Christian (1):
  power: supply: pm2301_charger: remove unnecessary variable

Sebastian Krzyszkowiak (5):
  power: supply: bq25890: Use the correct range for IILIM register
  power: supply: max17042_battery: Fix current_{avg,now} hiding with no 
current sense
  power: supply: max17042_battery: Improve accuracy of current_now and 
current_avg readings
  power: supply: max17042_battery: Take r_sns value into account in 
charge_counter
  power: supply: max17042_battery: Export charge termination current 
property

Tian Tao (1):
  power: supply: Fix missing IRQF_ONESHOT as only threaded handler

Timon Baetz (3):
  power: supply: max8997-charger: Use module_platform_driver()
  power: supply: max8997-charger: Fix platform data retrieval
  power: supply: max8997-charger: Improve getting charger status

Tom Rix (1):
  power: supply: wm831x_power: remove unneeded break

Yangtao Li (2):
  power: supply: axp20x_usb_power: fix typo
  power: supply: axp20x_usb_power: Use power efficient workqueue for 
debounce

Zhang Qilong (1):
  power: supply: bq24190_charger: fix reference leak

 .../bindings/power/reset/ocelot-reset.txt  |   4 +-
 .../bindings/power/reset/regulator-poweroff.yaml   |  37 +
 arch/arm/mach-s3c/mach-h1940.c |  12 +-
 arch/arm/mach-s3c/mach-rx1950.c|  11 +-
 arch/arm/mach-sa1100/collie.c  |  21 +++
 drivers/power/reset/Kconfig|   7 +
 drivers/power/reset/Makefile   |   1 +
 drivers/power/reset/ocelot-reset.c |  30 +++-
 drivers/power/reset/qnap-poweroff.c|   8 +-
 drivers/power/reset/regulator-poweroff.c   |  82 +++
 drivers/power/reset/syscon-poweroff.c  |   8 +-
 drivers/power/supply/ab8500_btemp.c|  68 --
 drivers/power/supply/ab8500_charger.c  |  99 ++
 drivers/power/supply/ab8500_fg.c   | 106 +--
 drivers/power/supply/abx500_chargalg.c |  19 +--
 drivers/power/supply/axp20x_usb_power.c|  10 +-
 drivers/power/supply/axp288_charger.c  |  28 ++--
 drivers/power/supply/bq24190_charger.c |  21 ++-
 drivers/power/supply/bq24735-charger.c |   1 -
 drivers/power/supply/bq25890_charger.c |   2 +-
 drivers/power/supply/collie_battery.c  | 151 +++--
 drivers/power/supply/generic-adc-battery.c |  31 ++---
 drivers/power/supply/max17042_battery.c|  23 +++-
 drivers/power/supply/max8997_charger.c |  67 +
 drivers/power/supply/pm2301_charger.c  |   3 +-
 drivers/power/supply/power_supply_sysfs.c  |   2 +-
 drivers/power/supply/s3c_adc_battery.c |  57 
 drivers/power/supply/wm831x_power.c

[GIT PULL] hsi changes for 5.11

2020-12-18 Thread Sebastian Reichel

Hi Linus,

The following changes since commit 3650b228f83adda7e5ee532e2b90429c03f7b9ec:

  Linux 5.10-rc1 (2020-10-25 15:14:11 -0700)

are available in the Git repository at:

  ssh://g...@gitolite.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi.git 
tags/hsi-for-5.11

for you to fetch changes up to 8a77ed6d1fdda752f6b3203391a099f590a9454f:

  HSI: core: fix a kernel-doc markup (2020-12-02 22:35:44 +0100)


HSI changes for the 5.11 series

* misc. cleanups


Jing Xiangfeng (1):
  HSI: omap_ssi: Don't jump to free ID in ssi_add_controller()

Mauro Carvalho Chehab (1):
  HSI: core: fix a kernel-doc markup

 drivers/hsi/controllers/omap_ssi_core.c | 2 +-
 drivers/hsi/hsi_core.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)



Re: [PATCH v2 12/12] ipu3-cio2: Add cio2-bridge to ipu3-cio2 driver

2020-12-18 Thread Daniel Scally

Hi Laurent - thanks for the comments

On 18/12/2020 16:53, Laurent Pinchart wrote:
>> +static void cio2_bridge_init_property_names(struct cio2_sensor *sensor)
>> +{
>> +strscpy(sensor->prop_names.clock_frequency, "clock-frequency",
>> +sizeof(sensor->prop_names.clock_frequency));
>> +strscpy(sensor->prop_names.rotation, "rotation",
>> +sizeof(sensor->prop_names.rotation));
>> +strscpy(sensor->prop_names.bus_type, "bus-type",
>> +sizeof(sensor->prop_names.bus_type));
>> +strscpy(sensor->prop_names.data_lanes, "data-lanes",
>> +sizeof(sensor->prop_names.data_lanes));
>> +strscpy(sensor->prop_names.remote_endpoint, "remote-endpoint",
>> +sizeof(sensor->prop_names.remote_endpoint));
>> +strscpy(sensor->prop_names.link_frequencies, "link-frequencies",
>> +sizeof(sensor->prop_names.link_frequencies));
> 
> Just curious, was there anything not working correctly with the proposal
> I made ?
> 
> static const struct cio2_property_names prop_names = {
>   .clock_frequency = "clock-frequency",
>   .rotation = "rotation",
>   .bus_type = "bus-type",
>   .data_lanes = "data-lanes",
>   .remote_endpoint = "remote-endpoint",
> };
> 
> static void cio2_bridge_init_property_names(struct cio2_sensor *sensor)
> {
>   sensor->prop_names = prop_names;
> }
> 
> It generates a warning when the string is too long for the field size,
> which should help catching issues at compilation time.

Yes, though I don't know how much of a real-world problem it would have
been - if you recall we have the issue that the device grabs a reference
to the software_nodes (after we stopped delaying until after the
i2c_client is available), which means we can't safely free the
cio2_bridge struct on module unload. That also means we can't rely on
those pointers to string literals existing, because if the ipu3-cio2
module gets unloaded they'll be gone.

Shame, as it's way neater.

>> +static void cio2_bridge_init_swnode_names(struct cio2_sensor *sensor)
>> +{
>> +snprintf(sensor->node_names.remote_port, 7, "port@%u", 
>> sensor->ssdb.link);
>> +strscpy(sensor->node_names.port, "port@0", 
>> sizeof(sensor->node_names.port));
>> +strscpy(sensor->node_names.endpoint, "endpoint@0", 
>> sizeof(sensor->node_names.endpoint));
> 
> I'd wrap lines, but maybe that's because I'm an old-school, 80-columns
> programmer :-)

Heh sure, I'll wrap them.

>> +static int cio2_bridge_connect_sensors(struct cio2_bridge *bridge,
>> +   struct pci_dev *cio2)
>> +{
>> +struct fwnode_handle *fwnode;
>> +struct cio2_sensor *sensor;
>> +struct acpi_device *adev;
>> +unsigned int i;
>> +int ret = 0;
>> +
>> +for (i = 0; i < ARRAY_SIZE(cio2_supported_sensors); i++) {
>> +const struct cio2_sensor_config *cfg = 
>> &cio2_supported_sensors[i];
>> +
>> +for_each_acpi_dev_match(adev, cfg->hid, NULL, -1) {
>> +if (bridge->n_sensors >= CIO2_NUM_PORTS) {
>> +dev_warn(&cio2->dev, "Exceeded available CIO2 
>> ports\n");
>> +/* overflow i so outer loop ceases */
>> +i = ARRAY_SIZE(cio2_supported_sensors);
>> +break;
> 
> Or just
> 
>   return 0;
> 
> ?

Derp, yes of course.


>> +/* Data representation as it is in ACPI SSDB buffer */
>> +struct cio2_sensor_ssdb {
>> +u8 version; /*  */
>> +u8 sku; /* 0001 */
>> +u8 guid_csi2[16];   /* 0002 */
>> +u8 devfunction; /* 0003 */
>> +u8 bus; /* 0004 */
>> +u32 dphylinkenfuses;/* 0005 */
>> +u32 clockdiv;   /* 0009 */
>> +u8 link;/* 0013 */
>> +u8 lanes;   /* 0014 */
>> +u32 csiparams[10];  /* 0015 */
>> +u32 maxlanespeed;   /* 0019 */
>> +u8 sensorcalibfileidx;  /* 0023 */
>> +u8 sensorcalibfileidxInMBZ[3];  /* 0024 */
>> +u8 romtype; /* 0025 */
>> +u8 vcmtype; /* 0026 */
>> +u8 platforminfo;/* 0027 */
> 
> Why stop at 27 ? :-) I'd either go all the way, or not at all. It's also
> quite customary to represent offset as hex values, as that's what most
> hex editors / viewers will show.

Oops - that was actually just me debugging...I guess I might actually
finish it, converted to hex. It came in useful reading the DSDT to have
that somewhere easy to refer to.
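
As an aside, the offsets could also be generated rather than typed by hand. A rough userspace sketch, assuming the SSDB buffer is byte-packed and using only the leading fields quoted above; the names ssdb_prefix and ssdb_link_offset() are made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, truncated copy of the leading SSDB fields quoted above.
 * Assumption: the ACPI buffer has no padding, hence the packed attribute. */
struct ssdb_prefix {
	uint8_t version;
	uint8_t sku;
	uint8_t guid_csi2[16];
	uint8_t devfunction;
	uint8_t bus;
	uint32_t dphylinkenfuses;
	uint32_t clockdiv;
	uint8_t link;
	uint8_t lanes;
} __attribute__((packed));

/* Byte offset of the 'link' field; print it with "0x%04zx" to get the
 * hex form that hex editors and viewers show. */
size_t ssdb_link_offset(void)
{
	return offsetof(struct ssdb_prefix, link);
}
```

Calling offsetof() for each member and printing the result in hex gives a table that can be pasted into the comments and stays correct if a field is added or resized.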

> Reviewed-by: Laurent Pinchart 

Nice - thank you!

Re: [PATCH net] bonding: reduce rtnl lock contention in mii monitor thread

2020-12-18 Thread Jarod Wilson

On Tue, Dec 8, 2020 at 3:35 PM Jay Vosburgh  wrote:
>
> Jarod Wilson  wrote:
...
> >The addition of a case BOND_LINK_BACK in bond_miimon_commit() is somewhat
> >separate from the fix for the actual hang, but it eliminates a constant
> >"invalid new link 3 on slave" message seen related to this issue, and it's
> >not actually an invalid state here, so we shouldn't be reporting it as an
> >error.
...
> In principle, bond_miimon_commit should not see _BACK or _FAIL
> state as a new link state, because those states should be managed at the
> bond_miimon_inspect level (as they are the result of updelay and
> downdelay).  These states should not be "committed" in the sense of
> causing notifications or doing actions that require RTNL.
>
> My recollection is that the "invalid new link" messages were the
> result of a bug in de77ecd4ef02, which was fixed in 1899bb325149
> ("bonding: fix state transition issue in link monitoring"), but maybe
> the RTNL problem here induces that in some other fashion.
>
> Either way, I believe this message is correct as-is.

For reference, with 5.10.1 and this script:

#!/bin/sh

slave1=ens4f0
slave2=ens4f1

modprobe -rv bonding
modprobe -v bonding mode=2 miimon=100 updelay=200
ip link set bond0 up
ifenslave bond0 $slave1 $slave2
sleep 5

while :
do
    ip link set $slave1 down
    sleep 1
    ip link set $slave1 up
    sleep 1
done

I get this repeating log output:

[ 9488.262291] sfc :05:00.0 ens4f0: link up at 1Mbps full-duplex (MTU 1500)
[ 9488.339508] bond0: (slave ens4f0): link status up, enabling it in 200 ms
[ 9488.339511] bond0: (slave ens4f0): invalid new link 3 on slave
[ 9488.547643] bond0: (slave ens4f0): link status definitely up, 1 Mbps full duplex
[ 9489.276614] bond0: (slave ens4f0): link status definitely down, disabling slave
[ 9490.273830] sfc :05:00.0 ens4f0: link up at 1Mbps full-duplex (MTU 1500)
[ 9490.315540] bond0: (slave ens4f0): link status up, enabling it in 200 ms
[ 9490.315543] bond0: (slave ens4f0): invalid new link 3 on slave
[ 9490.523641] bond0: (slave ens4f0): link status definitely up, 1 Mbps full duplex
[ 9491.356526] bond0: (slave ens4f0): link status definitely down, disabling slave
[ 9492.285249] sfc :05:00.0 ens4f0: link up at 1Mbps full-duplex (MTU 1500)
[ 9492.291522] bond0: (slave ens4f0): link status up, enabling it in 200 ms
[ 9492.291523] bond0: (slave ens4f0): invalid new link 3 on slave
[ 9492.499604] bond0: (slave ens4f0): link status definitely up, 1 Mbps full duplex
[ 9493.331594] bond0: (slave ens4f0): link status definitely down, disabling slave

"invalid new link 3 on slave" is there every single time.
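
To decode the number: 3 is BOND_LINK_BACK in the bonding uapi header, i.e. the "link is coming back, waiting out updelay" state. The following is a deliberately simplified userspace model of the inspect/commit split discussed above, not the driver code; toy_inspect() and toy_commit() are made-up names and only the link-up path is modelled.

```c
/* Numeric values as in the bonding uapi header (if_bonding.h). */
enum {
	LINK_UP   = 0,	/* BOND_LINK_UP:   link is up and running */
	LINK_FAIL = 1,	/* BOND_LINK_FAIL: link has just gone down */
	LINK_DOWN = 2,	/* BOND_LINK_DOWN: link has been down for a while */
	LINK_BACK = 3,	/* BOND_LINK_BACK: link is coming back */
};

/* Toy "inspect": state proposed for a slave currently in 'cur', given the
 * carrier reading and the number of miimon ticks of updelay left. */
int toy_inspect(int cur, int carrier, int delay_ticks)
{
	if ((cur == LINK_DOWN || cur == LINK_BACK) && carrier)
		return delay_ticks > 0 ? LINK_BACK : LINK_UP;
	return cur;	/* every other transition is left out of this toy */
}

/* Toy "commit": only UP and DOWN are acted on; anything else is what the
 * log reports as "invalid new link". Returns 0 if handled, -1 otherwise. */
int toy_commit(int proposed)
{
	switch (proposed) {
	case LINK_UP:
	case LINK_DOWN:
		return 0;
	default:
		return -1;
	}
}
```

With miimon=100 and updelay=200 there are two ticks to wait, so the first inspect pass after carrier returns proposes LINK_BACK; if that proposal reaches the commit step, the toy commit rejects it, which matches the shape of the message in the log.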

Side note: I'm not actually able to reproduce the repeating "link
status up, enabling it in 200 ms" and never recovering from a downed
link on this host, no clue why it's so reproducible w/another system.

-- 
Jarod Wilson
ja...@redhat.com

Re: [PATCH v2 06/12] software_node: Add support for fwnode_graph*() family of functions

2020-12-18 Thread Daniel Scally



On 18/12/2020 22:13, Daniel Scally wrote:

>>> +   break;
>>> +   }
>>> +
>>> +   /* No more endpoints for that port, so stop passing old */
>>> +   old = NULL;
>>
>> I wonder if you could drop the 'old' variable and use 'endpoint' in the
>> call to software_node_get_next_child(). You could then drop these two
>> lines.
> 
> That won't work, because endpoint would at that point not be a child of
> the port we're passing, and the function relies on it being one:
> 
>   if (!p || list_empty(&p->children) ||
>   (c && list_is_last(&c->entry, &p->children))) {
>   fwnode_handle_put(child);
>   return NULL;
>   }
> 

Wait, that's nonsense of course, because endpoint gets set to NULL when
software_node_get_next_child() finds nothing - I'll double check but
pretty sure you're right.

Re: [resend/standalone PATCH v4] Add auxiliary bus support

2020-12-18 Thread Jason Gunthorpe

On Fri, Dec 18, 2020 at 10:16:58PM +0100, Alexandre Belloni wrote:

> But then again, what about non-enumerable devices on the PCI device? I
> feel this would exactly fit MFD. This is a collection of IPs that exist
> as standalone but in this case are grouped in a single device.

So, if mfd had a mfd_device and a mfd bus_type then drivers would need
to have both a mfd_driver and a platform_driver to bind. Look at
something like drivers/char/tpm/tpm_tis.c to see how a multi-probe
driver is structured

See Mark's remarks about the old of_platform_device, to explain why we
don't have a 'dt_device' today

> Note that I then have another issue because the kernel doesn't support
> irq controllers on PCI and this is exactly what my SoC has. But for now,
> I can just duplicate the irqchip driver in the MFD driver.

I think Thomas fixed that recently on x86 at least.. 

Having to put dummy irq chip drivers in MFD anything sounds scary :|

> Let me point to drivers/net/ethernet/cadence/macb_pci.c which is a
> fairly recent example. It does exactly that and I'm not sure you could
> do it otherwise while still not having to duplicate most of macb_probe.

Creating a platform_device to avoid restructuring the driver's probe
and device logic to be generic is a *really* horrible reason to use a
platform device.

Jason

Re: [PATCH 01/22] Add Vision Processing Unit (VPU) documentation.

2020-12-18 Thread Randy Dunlap

Hi--

On 12/1/20 2:34 PM, mgr...@linux.intel.com wrote:
> From: mark gross 
> 
> 
> Reviewed-by: Mark Gross 
> Signed-off-by: Mark Gross 

My reading of submitting-patches.rst seems to indicate that
the Reviewer and Submitter are probably not the same person.

Are you sure that you reviewed it?


> ---
>  Documentation/index.rst  |   3 +-
>  Documentation/vpu/index.rst  |  16 ++
>  Documentation/vpu/vpu-stack-overview.rst | 267 +++
>  3 files changed, 285 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/vpu/index.rst
>  create mode 100644 Documentation/vpu/vpu-stack-overview.rst
> 
> diff --git a/Documentation/index.rst b/Documentation/index.rst
> index 57719744774c..0a2cc0204e8f 100644
> --- a/Documentation/index.rst
> +++ b/Documentation/index.rst
> @@ -1,4 +1,4 @@
> -.. SPDX-License-Identifier: GPL-2.0
> +.. SPDX-License-Identifier: GPL-2.0-only

That looks both inappropriate for this patch and incorrect AFAICT.

>  
>  
>  .. The Linux Kernel documentation master file, created by
> @@ -137,6 +137,7 @@ needed).
> misc-devices/index
> scheduler/index
> mhi/index
> +   vpu/index
>  
>  Architecture-agnostic documentation
>  ---
> diff --git a/Documentation/vpu/index.rst b/Documentation/vpu/index.rst
> new file mode 100644
> index ..7e290e048910
> --- /dev/null
> +++ b/Documentation/vpu/index.rst
> @@ -0,0 +1,16 @@
> +.. SPDX-License-Identifier: GPL-2.0-only

license-rules.rst says:

For 'GNU General Public License (GPL) version 2 only' use:
  SPDX-License-Identifier: GPL-2.0

> +
> +
> +Vision Processor Unit Documentation
> +
> +
> +This documentation contains information for the Intel VPU stack.
> +
> +.. class:: toc-title
> +
> +Table of contents
> +
> +.. toctree::
> +   :maxdepth: 2
> +
> +   vpu-stack-overview
> diff --git a/Documentation/vpu/vpu-stack-overview.rst 
> b/Documentation/vpu/vpu-stack-overview.rst
> new file mode 100644
> index ..53c06a7d9a52
> --- /dev/null
> +++ b/Documentation/vpu/vpu-stack-overview.rst
> @@ -0,0 +1,267 @@
> +.. SPDX-License-Identifier: GPL-2.0-only

Nope.

> +
> +==
> +Intel VPU architecture
> +==
> +
> +Overview
> +
> +
> +The Intel Movidius acquisition has developed a Vision Processing Unit (VPU)
> +roadmap of products starting with Keem Bay (KMB).  The HW configurations the

s/HW/hardware/

> +VPU can support include:
> +
> +1. Standalone smart camera that does local CV processing in camera

Tell us what CV is before using it.

> +2. Standalone appliance or SBC device connected to a network and tethered

Tell us what SBC is before using it. (yeah, I know)

> +   cameras doing local CV processing
> +3. Embedded in a USB dongle or M.2 as an CV accelerator.
> +4. Multiple VPU enabled SOC's on a PCIE card as a CV accelerator in a larger 
> IA

  PCIe (?)

> +   box or server.
> +
> +Keem Bay is the first instance of this family of products. This document
> +provides an architectural overview of the SW stack supporting the VPU enabled

s/SW/software/

> +products.
> +
> +Keem Bay (KMB) is a Computer Vision AI processing SoC based on ARM A53 CPU 
> that
> +provides Edge neural network acceleration (inference) and includes a Vision
> +Processing Unit (VPU) hardware.  The ARM CPU SubSystem (CPUSS) interfaces
> +locally to the VPU and enables integration/interfacing with a remote host 
> over
> +PCIe or USB or Ethernet interfaces. The interface between the CPUSS and the 
> VPU
> +is implemented with HW FIFOs (Control) and coherent memory mapping (Data) 
> such
> +that zero copy processing can happen within the VPU.
> +
> +The KMB can be used in all 4 of the above classes of designs.
> +
> +We refer to the 'local host' as being the ARM part of the SoC, while the
> +'remote host' as the IA system hosting the KMB device(s).  The KMB SoC boots
> +from an eMMC via uBoot and ARM Linux compatible device tree interface with an
> +expectation to fully boot within hundreds of milliseconds.  There is also
> +support for downloading the kernel and root file system image from a remote
> +host.
> +
> +The eMMC can be updated with standard mender update process.

 Mender

> +See https://github.com/mendersoftware/mender
> +
> +The VPU is started and controlled from the A53 local host.  Its firmware 
> image
> +is loaded using the drive FW helper KAPI's.

s/FW/firmware/

> +
> +The VPU IP FW payload consists of a SPARC ISA RTEMS bootloader and/or
> +application binary.
> +
> +The interface allowing (remote or local) host clients  to access VPU IP

^^drop one space

> +capabilities is realized through an abstracted programming model, which
> +provides Remote Proxy APIs for a

Re: [PATCH RESEND v2] virtio-input: add multi-touch support

2020-12-18 Thread Vasyl Vavrychuk


Hi, Dmitry,

Thanks for your suggestion. I have sent a v3 of the patch where I
have applied it.


Kind regards,
Vasyl

On 09.12.20 00:05, Dmitry Torokhov wrote:



Hi Vasyl,

On Tue, Dec 08, 2020 at 11:01:50PM +0200, Vasyl Vavrychuk wrote:

From: Mathias Crombez 

Without multi-touch slots allocated, ABS_MT_SLOT events will be lost by
input_handle_abs_event.

Signed-off-by: Mathias Crombez 
Signed-off-by: Vasyl Vavrychuk 
Tested-by: Vasyl Vavrychuk 
---
v2: fix patch corrupted by corporate email server

  drivers/virtio/Kconfig| 11 +++
  drivers/virtio/virtio_input.c |  8 
  2 files changed, 19 insertions(+)

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 7b41130d3f35..2cfd5b01d96d 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -111,6 +111,17 @@ config VIRTIO_INPUT

If unsure, say M.

+config VIRTIO_INPUT_MULTITOUCH_SLOTS
+ depends on VIRTIO_INPUT
+ int "Number of multitouch slots"
+ range 0 64
+ default 10
+ help
+  Define the number of multitouch slots used. Default to 10.
+  This parameter is unused if there is no multitouch capability.


I believe the number of slots should be communicated to the guest by
the host, similarly to how the rest of input device capabilities is
transferred, instead of having static compile-time option.


+
+  0 will disable the feature.
+
  config VIRTIO_MMIO
   tristate "Platform bus driver for memory mapped virtio devices"
   depends on HAS_IOMEM && HAS_DMA
diff --git a/drivers/virtio/virtio_input.c b/drivers/virtio/virtio_input.c
index f1f6208edcf5..13f3d90e6c30 100644
--- a/drivers/virtio/virtio_input.c
+++ b/drivers/virtio/virtio_input.c
@@ -7,6 +7,7 @@

  #include 
  #include 
+#include 

  struct virtio_input {
   struct virtio_device   *vdev;
@@ -205,6 +206,7 @@ static int virtinput_probe(struct virtio_device *vdev)
   unsigned long flags;
   size_t size;
   int abs, err;
+ bool is_mt = false;

   if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
   return -ENODEV;
@@ -287,9 +289,15 @@ static int virtinput_probe(struct virtio_device *vdev)
   for (abs = 0; abs < ABS_CNT; abs++) {
   if (!test_bit(abs, vi->idev->absbit))
   continue;
+ if (input_is_mt_value(abs))
+ is_mt = true;
   virtinput_cfg_abs(vi, abs);
   }
   }
+ if (is_mt)
+ input_mt_init_slots(vi->idev,
+ CONFIG_VIRTIO_INPUT_MULTITOUCH_SLOTS,
+ INPUT_MT_DIRECT);


Here errors need to be handled.



   virtio_device_ready(vdev);
   vi->ready = true;
--
2.23.0



Thanks.

--
Dmitry

[PATCH v3] virtio-input: add multi-touch support

2020-12-18 Thread Vasyl Vavrychuk

From: Mathias Crombez 

Without multi-touch slots allocated, ABS_MT_SLOT events will be lost by
input_handle_abs_event.

Implementation is based on uinput_create_device.

Signed-off-by: Mathias Crombez 
Co-developed-by: Vasyl Vavrychuk 
Signed-off-by: Vasyl Vavrychuk 
---
v2: fix patch corrupted by corporate email server
v3: use number of slots from the host

 drivers/virtio/virtio_input.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_input.c b/drivers/virtio/virtio_input.c
index f1f6208edcf5..f643536807dd 100644
--- a/drivers/virtio/virtio_input.c
+++ b/drivers/virtio/virtio_input.c
@@ -7,6 +7,7 @@
 
 #include 
 #include 
+#include 
 
 struct virtio_input {
struct virtio_device   *vdev;
@@ -204,7 +205,7 @@ static int virtinput_probe(struct virtio_device *vdev)
struct virtio_input *vi;
unsigned long flags;
size_t size;
-   int abs, err;
+   int abs, err, nslots;
 
if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
return -ENODEV;
@@ -289,6 +290,13 @@ static int virtinput_probe(struct virtio_device *vdev)
continue;
virtinput_cfg_abs(vi, abs);
}
+
+   if (test_bit(ABS_MT_SLOT, vi->idev->absbit)) {
+   nslots = input_abs_get_max(vi->idev, ABS_MT_SLOT) + 1;
+   err = input_mt_init_slots(vi->idev, nslots, 0);
+   if (err)
+   goto err_mt_init_slots;
+   }
}
 
virtio_device_ready(vdev);
@@ -304,6 +312,7 @@ static int virtinput_probe(struct virtio_device *vdev)
spin_lock_irqsave(&vi->lock, flags);
vi->ready = false;
spin_unlock_irqrestore(&vi->lock, flags);
+err_mt_init_slots:
input_free_device(vi->idev);
 err_input_alloc:
vdev->config->del_vqs(vdev);
-- 
2.23.0

[PATCH v3 RESEND] virtio-input: add multi-touch support

2020-12-18 Thread Vasyl Vavrychuk

From: Mathias Crombez 

Without multi-touch slots allocated, ABS_MT_SLOT events will be lost by
input_handle_abs_event.

Implementation is based on uinput_create_device.

Signed-off-by: Mathias Crombez 
Co-developed-by: Vasyl Vavrychuk 
Signed-off-by: Vasyl Vavrychuk 
---
v2: fix patch corrupted by corporate email server
v3: use number of slots from the host

 drivers/virtio/virtio_input.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_input.c b/drivers/virtio/virtio_input.c
index f1f6208edcf5..f643536807dd 100644
--- a/drivers/virtio/virtio_input.c
+++ b/drivers/virtio/virtio_input.c
@@ -7,6 +7,7 @@
 
 #include 
 #include 
+#include 
 
 struct virtio_input {
struct virtio_device   *vdev;
@@ -204,7 +205,7 @@ static int virtinput_probe(struct virtio_device *vdev)
struct virtio_input *vi;
unsigned long flags;
size_t size;
-   int abs, err;
+   int abs, err, nslots;
 
if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
return -ENODEV;
@@ -289,6 +290,13 @@ static int virtinput_probe(struct virtio_device *vdev)
continue;
virtinput_cfg_abs(vi, abs);
}
+
+   if (test_bit(ABS_MT_SLOT, vi->idev->absbit)) {
+   nslots = input_abs_get_max(vi->idev, ABS_MT_SLOT) + 1;
+   err = input_mt_init_slots(vi->idev, nslots, 0);
+   if (err)
+   goto err_mt_init_slots;
+   }
}
 
virtio_device_ready(vdev);
@@ -304,6 +312,7 @@ static int virtinput_probe(struct virtio_device *vdev)
spin_lock_irqsave(&vi->lock, flags);
vi->ready = false;
spin_unlock_irqrestore(&vi->lock, flags);
+err_mt_init_slots:
input_free_device(vi->idev);
 err_input_alloc:
vdev->config->del_vqs(vdev);
-- 
2.23.0

Re: [PATCH RESEND v2] virtio-input: add multi-touch support

2020-12-18 Thread Vasyl Vavrychuk

On 09.12.20 10:28, Michael S. Tsirkin wrote:

On Tue, Dec 08, 2020 at 11:01:50PM +0200, Vasyl Vavrychuk wrote:

From: Mathias Crombez 
Cc: sta...@vger.kernel.org

I don't believe this is appropriate for stable, looks like
a new feature to me.

Agree, removed.

+config VIRTIO_INPUT_MULTITOUCH_SLOTS
+ depends on VIRTIO_INPUT
+ int "Number of multitouch slots"
+ range 0 64
+ default 10
+ help
+  Define the number of multitouch slots used. Default to 10.
+  This parameter is unused if there is no multitouch capability.
+
+  0 will disable the feature.
+

Most people won't be using this config so the defaults matter. So why 10? 10 
fingers?

And where does 64 come from?

I have sent a v3 where the number of slots is obtained from the host.

+ if (is_mt)
+ input_mt_init_slots(vi->idev,
+ CONFIG_VIRTIO_INPUT_MULTITOUCH_SLOTS,
+ INPUT_MT_DIRECT);

Do we need the number in config space maybe? And maybe with a feature
bit so host can find out whether guest supports MT?

I think it is not applicable in the v3 which I sent, because the number of
slots comes from the host. So now the host controls whether the guest
supports MT.

Re: [PATCH] clk: vc5: Use "idt,voltage-microvolt" instead of "idt,voltage-microvolts"

2020-12-18 Thread Luca Ceresoli

Hi Geert,

On 18/12/20 13:52, Geert Uytterhoeven wrote:
> Commit 45c940184b501fc6 ("dt-bindings: clk: versaclock5: convert to
> yaml") accidentally changed "idt,voltage-microvolts" to
> "idt,voltage-microvolt" in the DT bindings, while the driver still used
> the former.
> 
> Update the driver to match the bindings, as
> Documentation/devicetree/bindings/property-units.txt actually recommends
> using "microvolt".
> 
> Fixes: 260249f929e81d3d ("clk: vc5: Enable addition output configurations of 
> the Versaclock")
> Signed-off-by: Geert Uytterhoeven 
> ---
> There are no upstream users yet, but they are planned for v5.12, so I
> think this should be in v5.11-rc1.
> 
> Thanks!
> ---
>  drivers/clk/clk-versaclock5.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/clk/clk-versaclock5.c b/drivers/clk/clk-versaclock5.c
> index c90460e7ef2153fe..43db67337bc06824 100644
> --- a/drivers/clk/clk-versaclock5.c
> +++ b/drivers/clk/clk-versaclock5.c
> @@ -739,8 +739,8 @@ static int vc5_update_power(struct device_node *np_output,
>  {
>   u32 value;
>  
> - if (!of_property_read_u32(np_output,
> -   "idt,voltage-microvolts", &value)) {
> + if (!of_property_read_u32(np_output, "idt,voltage-microvolt",
> +   &value)) {

Reviewed-by: Luca Ceresoli 

Now the example in the bindings needs the same fix. I guess you are doing
it in your "Miscellaneous fixes and improvements" v2 series; otherwise I
can do that.

Thanks,
-- 
Luca

[PATCH 3/5] ibmvfc: define per-queue state/list locks

2020-12-18 Thread Tyrel Datwyler

Define per-queue locks for protecting queue state and event pool
sent/free lists. The evt list lock is initially redundant but it allows
the driver to be modified in the follow-up patches to relax the queue
locking around submissions and completions.

Signed-off-by: Tyrel Datwyler 
Reviewed-by: Brian King 
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 93 +++---
 drivers/scsi/ibmvscsi/ibmvfc.h |  7 ++-
 2 files changed, 80 insertions(+), 20 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 8de2a25b05ee..69a6401ca504 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -176,8 +176,9 @@ static void ibmvfc_trc_start(struct ibmvfc_event *evt)
struct ibmvfc_mad_common *mad = &evt->iu.mad_common;
struct ibmvfc_fcp_cmd_iu *iu = ibmvfc_get_fcp_iu(vhost, vfc_cmd);
struct ibmvfc_trace_entry *entry;
+   int index = atomic_inc_return(&vhost->trace_index) & 
IBMVFC_TRACE_INDEX_MASK;
 
-   entry = &vhost->trace[vhost->trace_index++];
+   entry = &vhost->trace[index];
entry->evt = evt;
entry->time = jiffies;
entry->fmt = evt->crq.format;
@@ -211,8 +212,10 @@ static void ibmvfc_trc_end(struct ibmvfc_event *evt)
struct ibmvfc_mad_common *mad = &evt->xfer_iu->mad_common;
struct ibmvfc_fcp_cmd_iu *iu = ibmvfc_get_fcp_iu(vhost, vfc_cmd);
struct ibmvfc_fcp_rsp *rsp = ibmvfc_get_fcp_rsp(vhost, vfc_cmd);
-   struct ibmvfc_trace_entry *entry = &vhost->trace[vhost->trace_index++];
+   struct ibmvfc_trace_entry *entry;
+   int index = atomic_inc_return(&vhost->trace_index) & 
IBMVFC_TRACE_INDEX_MASK;
 
+   entry = &vhost->trace[index];
entry->evt = evt;
entry->time = jiffies;
entry->fmt = evt->crq.format;
@@ -805,6 +808,7 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
} while (rc == H_BUSY || H_IS_LONG_BUSY(rc));
 
spin_lock_irqsave(vhost->host->host_lock, flags);
+   spin_lock(vhost->crq.q_lock);
vhost->state = IBMVFC_NO_CRQ;
vhost->logged_in = 0;
 
@@ -821,6 +825,7 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
dev_warn(vhost->dev, "Partner adapter not ready\n");
else if (rc != 0)
dev_warn(vhost->dev, "Couldn't register crq (rc=%d)\n", rc);
+   spin_unlock(vhost->crq.q_lock);
spin_unlock_irqrestore(vhost->host->host_lock, flags);
 
return rc;
@@ -853,10 +858,16 @@ static int ibmvfc_valid_event(struct ibmvfc_event_pool 
*pool,
 static void ibmvfc_free_event(struct ibmvfc_event *evt)
 {
struct ibmvfc_event_pool *pool = &evt->queue->evt_pool;
+   unsigned long flags;
 
BUG_ON(!ibmvfc_valid_event(pool, evt));
BUG_ON(atomic_inc_return(&evt->free) != 1);
+
+   spin_lock_irqsave(&evt->queue->l_lock, flags);
list_add_tail(&evt->queue_list, &evt->queue->free);
+   if (evt->eh_comp)
+   complete(evt->eh_comp);
+   spin_unlock_irqrestore(&evt->queue->l_lock, flags);
 }
 
 /**
@@ -875,12 +886,27 @@ static void ibmvfc_scsi_eh_done(struct ibmvfc_event *evt)
cmnd->scsi_done(cmnd);
}
 
-   if (evt->eh_comp)
-   complete(evt->eh_comp);
-
ibmvfc_free_event(evt);
 }
 
+/**
+ * ibmvfc_complete_purge - Complete failed command list
+ * @purge_list:list head of failed commands
+ *
+ * This function runs completions on commands to fail as a result of a
+ * host reset or platform migration. Caller must hold host_lock.
+ **/
+static void ibmvfc_complete_purge(struct list_head *purge_list)
+{
+   struct ibmvfc_event *evt, *pos;
+
+   list_for_each_entry_safe(evt, pos, purge_list, queue_list) {
+   list_del(&evt->queue_list);
+   ibmvfc_trc_end(evt);
+   evt->done(evt);
+   }
+}
+
 /**
  * ibmvfc_fail_request - Fail request with specified error code
  * @evt:   ibmvfc event struct
@@ -897,10 +923,7 @@ static void ibmvfc_fail_request(struct ibmvfc_event *evt, 
int error_code)
} else
evt->xfer_iu->mad_common.status = 
cpu_to_be16(IBMVFC_MAD_DRIVER_FAILED);
 
-   list_del(&evt->queue_list);
del_timer(&evt->timer);
-   ibmvfc_trc_end(evt);
-   evt->done(evt);
 }
 
 /**
@@ -914,10 +937,14 @@ static void ibmvfc_fail_request(struct ibmvfc_event *evt, 
int error_code)
 static void ibmvfc_purge_requests(struct ibmvfc_host *vhost, int error_code)
 {
struct ibmvfc_event *evt, *pos;
+   unsigned long flags;
 
ibmvfc_dbg(vhost, "Purging all requests\n");
+   spin_lock_irqsave(&vhost->crq.l_lock, flags);
list_for_each_entry_safe(evt, pos, &vhost->crq.sent, queue_list)
ibmvfc_fail_request(evt, error_code);
+   list_splice_init(&vhost->crq.sent, &vhost->purge);
+   spin_unlock_irqrestore(&vhost->crq.l_lock, flags);
 }
 
 /**
@@ -1314,6 +1341,7 @@ static int ibmv

[PATCH 1/5] ibmvfc: define generic queue structure for CRQs

2020-12-18 Thread Tyrel Datwyler

The primary and async CRQs are nearly identical outside of the format
and length of each message entry in the dma mapped page that represents
the queue data. These queues can be represented with a generic queue
structure that uses a union to differentiate between message format of
the mapped page.

This structure will further be leveraged in a follow-up patchset that
introduces Sub-CRQs.

Signed-off-by: Tyrel Datwyler 
Reviewed-by: Brian King 
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 135 +
 drivers/scsi/ibmvscsi/ibmvfc.h |  34 +
 2 files changed, 107 insertions(+), 62 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 42e4d35e0d35..c8e7c4701ac4 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -660,7 +660,7 @@ static void ibmvfc_init_host(struct ibmvfc_host *vhost)
}
 
if (!ibmvfc_set_host_state(vhost, IBMVFC_INITIALIZING)) {
-   memset(vhost->async_crq.msgs, 0, PAGE_SIZE);
+   memset(vhost->async_crq.msgs.async, 0, PAGE_SIZE);
vhost->async_crq.cur = 0;
 
list_for_each_entry(tgt, &vhost->targets, queue)
@@ -713,6 +713,23 @@ static int ibmvfc_send_crq_init_complete(struct 
ibmvfc_host *vhost)
return ibmvfc_send_crq(vhost, 0xC002LL, 0);
 }
 
+/**
+ * ibmvfc_free_queue - Deallocate queue
+ * @vhost: ibmvfc host struct
+ * @queue: ibmvfc queue struct
+ *
+ * Unmaps dma and deallocates page for messages
+ **/
+static void ibmvfc_free_queue(struct ibmvfc_host *vhost,
+ struct ibmvfc_queue *queue)
+{
+   struct device *dev = vhost->dev;
+
+   dma_unmap_single(dev, queue->msg_token, PAGE_SIZE, DMA_BIDIRECTIONAL);
+   free_page((unsigned long)queue->msgs.handle);
+   queue->msgs.handle = NULL;
+}
+
 /**
  * ibmvfc_release_crq_queue - Deallocates data and unregisters CRQ
  * @vhost: ibmvfc host struct
@@ -724,7 +741,7 @@ static void ibmvfc_release_crq_queue(struct ibmvfc_host 
*vhost)
 {
long rc = 0;
struct vio_dev *vdev = to_vio_dev(vhost->dev);
-   struct ibmvfc_crq_queue *crq = &vhost->crq;
+   struct ibmvfc_queue *crq = &vhost->crq;
 
ibmvfc_dbg(vhost, "Releasing CRQ\n");
free_irq(vdev->irq, vhost);
@@ -737,8 +754,8 @@ static void ibmvfc_release_crq_queue(struct ibmvfc_host 
*vhost)
 
vhost->state = IBMVFC_NO_CRQ;
vhost->logged_in = 0;
-   dma_unmap_single(vhost->dev, crq->msg_token, PAGE_SIZE, 
DMA_BIDIRECTIONAL);
-   free_page((unsigned long)crq->msgs);
+
+   ibmvfc_free_queue(vhost, crq);
 }
 
 /**
@@ -778,7 +795,7 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
int rc = 0;
unsigned long flags;
struct vio_dev *vdev = to_vio_dev(vhost->dev);
-   struct ibmvfc_crq_queue *crq = &vhost->crq;
+   struct ibmvfc_queue *crq = &vhost->crq;
 
/* Close the CRQ */
do {
@@ -792,7 +809,7 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
vhost->logged_in = 0;
 
/* Clean out the queue */
-   memset(crq->msgs, 0, PAGE_SIZE);
+   memset(crq->msgs.crq, 0, PAGE_SIZE);
crq->cur = 0;
 
/* And re-open it again */
@@ -1238,6 +1255,7 @@ static void ibmvfc_gather_partition_info(struct 
ibmvfc_host *vhost)
 static void ibmvfc_set_login_info(struct ibmvfc_host *vhost)
 {
struct ibmvfc_npiv_login *login_info = &vhost->login_info;
+   struct ibmvfc_queue *async_crq = &vhost->async_crq;
struct device_node *of_node = vhost->dev->of_node;
const char *location;
 
@@ -1257,7 +1275,8 @@ static void ibmvfc_set_login_info(struct ibmvfc_host 
*vhost)
login_info->max_cmds = cpu_to_be32(max_requests + 
IBMVFC_NUM_INTERNAL_REQ);
login_info->capabilities = cpu_to_be64(IBMVFC_CAN_MIGRATE | 
IBMVFC_CAN_SEND_VF_WWPN);
login_info->async.va = cpu_to_be64(vhost->async_crq.msg_token);
-   login_info->async.len = cpu_to_be32(vhost->async_crq.size * 
sizeof(*vhost->async_crq.msgs));
+   login_info->async.len = cpu_to_be32(async_crq->size *
+   sizeof(*async_crq->msgs.async));
strncpy(login_info->partition_name, vhost->partition_name, 
IBMVFC_MAX_NAME);
strncpy(login_info->device_name,
dev_name(&vhost->host->shost_gendev), IBMVFC_MAX_NAME);
@@ -3230,10 +3249,10 @@ static struct scsi_host_template driver_template = {
  **/
 static struct ibmvfc_async_crq *ibmvfc_next_async_crq(struct ibmvfc_host 
*vhost)
 {
-   struct ibmvfc_async_crq_queue *async_crq = &vhost->async_crq;
+   struct ibmvfc_queue *async_crq = &vhost->async_crq;
struct ibmvfc_async_crq *crq;
 
-   crq = &async_crq->msgs[async_crq->cur];
+   crq = &async_crq->msgs.async[async_crq->cur];
if (crq->valid & 0x80) {
if (++async_crq->cur == async_crq->size)
async_crq->cur =

[PATCH 2/5] ibmvfc: make command event pool queue specific

2020-12-18 Thread Tyrel Datwyler

There is currently a single command event pool per host. In anticipation
of providing multiple queues add a per-queue event pool definition and
reimplement the existing CRQ to use its queue defined event pool for
command submission and completion.

Signed-off-by: Tyrel Datwyler 
Reviewed-by: Brian King 
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 95 ++
 drivers/scsi/ibmvscsi/ibmvfc.h | 10 ++--
 2 files changed, 55 insertions(+), 50 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index c8e7c4701ac4..8de2a25b05ee 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -852,12 +852,11 @@ static int ibmvfc_valid_event(struct ibmvfc_event_pool 
*pool,
  **/
 static void ibmvfc_free_event(struct ibmvfc_event *evt)
 {
-   struct ibmvfc_host *vhost = evt->vhost;
-   struct ibmvfc_event_pool *pool = &vhost->pool;
+   struct ibmvfc_event_pool *pool = &evt->queue->evt_pool;
 
BUG_ON(!ibmvfc_valid_event(pool, evt));
BUG_ON(atomic_inc_return(&evt->free) != 1);
-   list_add_tail(&evt->queue, &vhost->free);
+   list_add_tail(&evt->queue_list, &evt->queue->free);
 }
 
 /**
@@ -898,7 +897,7 @@ static void ibmvfc_fail_request(struct ibmvfc_event *evt, 
int error_code)
} else
evt->xfer_iu->mad_common.status = 
cpu_to_be16(IBMVFC_MAD_DRIVER_FAILED);
 
-   list_del(&evt->queue);
+   list_del(&evt->queue_list);
del_timer(&evt->timer);
ibmvfc_trc_end(evt);
evt->done(evt);
@@ -917,7 +916,7 @@ static void ibmvfc_purge_requests(struct ibmvfc_host 
*vhost, int error_code)
struct ibmvfc_event *evt, *pos;
 
ibmvfc_dbg(vhost, "Purging all requests\n");
-   list_for_each_entry_safe(evt, pos, &vhost->sent, queue)
+   list_for_each_entry_safe(evt, pos, &vhost->crq.sent, queue_list)
ibmvfc_fail_request(evt, error_code);
 }
 
@@ -1292,10 +1291,11 @@ static void ibmvfc_set_login_info(struct ibmvfc_host 
*vhost)
  *
  * Returns zero on success.
  **/
-static int ibmvfc_init_event_pool(struct ibmvfc_host *vhost)
+static int ibmvfc_init_event_pool(struct ibmvfc_host *vhost,
+ struct ibmvfc_queue *queue)
 {
int i;
-   struct ibmvfc_event_pool *pool = &vhost->pool;
+   struct ibmvfc_event_pool *pool = &queue->evt_pool;
 
ENTER;
pool->size = max_requests + IBMVFC_NUM_INTERNAL_REQ;
@@ -1312,6 +1312,9 @@ static int ibmvfc_init_event_pool(struct ibmvfc_host 
*vhost)
return -ENOMEM;
}
 
+   INIT_LIST_HEAD(&queue->sent);
+   INIT_LIST_HEAD(&queue->free);
+
for (i = 0; i < pool->size; ++i) {
struct ibmvfc_event *evt = &pool->events[i];
atomic_set(&evt->free, 1);
@@ -1319,8 +1322,9 @@ static int ibmvfc_init_event_pool(struct ibmvfc_host 
*vhost)
evt->crq.ioba = cpu_to_be64(pool->iu_token + 
(sizeof(*evt->xfer_iu) * i));
evt->xfer_iu = pool->iu_storage + i;
evt->vhost = vhost;
+   evt->queue = queue;
evt->ext_list = NULL;
-   list_add_tail(&evt->queue, &vhost->free);
+   list_add_tail(&evt->queue_list, &queue->free);
}
 
LEAVE;
@@ -1332,14 +1336,15 @@ static int ibmvfc_init_event_pool(struct ibmvfc_host 
*vhost)
  * @vhost: ibmvfc host who owns the event pool
  *
  **/
-static void ibmvfc_free_event_pool(struct ibmvfc_host *vhost)
+static void ibmvfc_free_event_pool(struct ibmvfc_host *vhost,
+  struct ibmvfc_queue *queue)
 {
int i;
-   struct ibmvfc_event_pool *pool = &vhost->pool;
+   struct ibmvfc_event_pool *pool = &queue->evt_pool;
 
ENTER;
for (i = 0; i < pool->size; ++i) {
-   list_del(&pool->events[i].queue);
+   list_del(&pool->events[i].queue_list);
BUG_ON(atomic_read(&pool->events[i].free) != 1);
if (pool->events[i].ext_list)
dma_pool_free(vhost->sg_pool,
@@ -1360,14 +1365,14 @@ static void ibmvfc_free_event_pool(struct ibmvfc_host 
*vhost)
  *
  * Returns a free event from the pool.
  **/
-static struct ibmvfc_event *ibmvfc_get_event(struct ibmvfc_host *vhost)
+static struct ibmvfc_event *ibmvfc_get_event(struct ibmvfc_queue *queue)
 {
struct ibmvfc_event *evt;
 
-   BUG_ON(list_empty(&vhost->free));
-   evt = list_entry(vhost->free.next, struct ibmvfc_event, queue);
+   BUG_ON(list_empty(&queue->free));
+   evt = list_entry(queue->free.next, struct ibmvfc_event, queue_list);
atomic_set(&evt->free, 0);
-   list_del(&evt->queue);
+   list_del(&evt->queue_list);
return evt;
 }
 
@@ -1512,7 +1517,7 @@ static int ibmvfc_send_event(struct ibmvfc_event *evt,
else
BUG();
 
-   list_add_tail(&evt->queue, &vhost->sent);
+   list_add_tail(&evt->queue_list, &

[PATCH 4/5] ibmvfc: complete commands outside the host/queue lock

2020-12-18 Thread Tyrel Datwyler

Drain the command queue and place all commands on a completion list.
Perform command completion on that list outside the host/queue locks.
Further, move purged command completions outside the host_lock as well.
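
The drain-then-complete pattern described above can be sketched in userspace
C roughly as follows. This is only an illustration under stated assumptions,
not the driver code: struct event, struct queue, queue_add_sent() and
drain_and_complete() are hypothetical names, and a C11 atomic_flag spinlock
stands in for the kernel's host/queue locks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an ibmvfc_event; not the driver's struct. */
struct event {
	struct event *next;
	void (*done)(struct event *evt);
	int completed;
};

struct queue {
	atomic_flag lock;      /* plays the role of the queue/list lock */
	struct event *sent;    /* events submitted and awaiting completion */
};

static void queue_lock(struct queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;   /* spin until the flag was previously clear */
}

static void queue_unlock(struct queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

static void queue_init(struct queue *q)
{
	atomic_flag_clear(&q->lock);
	q->sent = NULL;
}

static void queue_add_sent(struct queue *q, struct event *evt)
{
	queue_lock(q);
	evt->next = q->sent;
	q->sent = evt;
	queue_unlock(q);
}

/*
 * Detach every sent event while holding the lock, then run the completion
 * callbacks with the lock dropped. Returns the number of events completed.
 */
static int drain_and_complete(struct queue *q)
{
	struct event *doneq, *evt;
	int n = 0;

	queue_lock(q);
	doneq = q->sent;       /* take the whole list in one step */
	q->sent = NULL;
	queue_unlock(q);

	while ((evt = doneq) != NULL) {
		doneq = evt->next;
		evt->next = NULL;
		assert(evt->done != NULL);
		evt->done(evt);    /* runs unlocked, so it may take q->lock itself */
		n++;
	}
	return n;
}

/* Example completion callback. */
static void mark_done(struct event *evt)
{
	evt->completed = 1;
}
```

Because the callbacks run after the lock is dropped, a completion handler
that needs the lock again does not deadlock, and other CPUs can keep
submitting to the queue while completions are processed.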

Signed-off-by: Tyrel Datwyler 
Reviewed-by: Brian King 
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 58 ++
 drivers/scsi/ibmvscsi/ibmvfc.h |  3 +-
 2 files changed, 47 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 69a6401ca504..b74080489807 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -894,7 +894,7 @@ static void ibmvfc_scsi_eh_done(struct ibmvfc_event *evt)
  * @purge_list:list head of failed commands
  *
  * This function runs completions on commands to fail as a result of a
- * host reset or platform migration. Caller must hold host_lock.
+ * host reset or platform migration.
  **/
 static void ibmvfc_complete_purge(struct list_head *purge_list)
 {
@@ -1407,6 +1407,23 @@ static struct ibmvfc_event *ibmvfc_get_event(struct 
ibmvfc_queue *queue)
return evt;
 }
 
+/**
+ * ibmvfc_locked_done - Calls evt completion with host_lock held
+ * @evt:   ibmvfc evt to complete
+ *
+ * All non-scsi command completion callbacks have the expectation that the
+ * host_lock is held. This callback is used by ibmvfc_init_event to wrap a
+ * MAD evt with the host_lock.
+ **/
+void ibmvfc_locked_done(struct ibmvfc_event *evt)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(evt->vhost->host->host_lock, flags);
+   evt->_done(evt);
+   spin_unlock_irqrestore(evt->vhost->host->host_lock, flags);
+}
+
 /**
  * ibmvfc_init_event - Initialize fields in an event struct that are always
  * required.
@@ -1419,9 +1436,14 @@ static void ibmvfc_init_event(struct ibmvfc_event *evt,
 {
evt->cmnd = NULL;
evt->sync_iu = NULL;
-   evt->crq.format = format;
-   evt->done = done;
evt->eh_comp = NULL;
+   evt->crq.format = format;
+   if (format == IBMVFC_CMD_FORMAT)
+   evt->done = done;
+   else {
+   evt->_done = done;
+   evt->done = ibmvfc_locked_done;
+   }
 }
 
 /**
@@ -1640,7 +1662,9 @@ static void ibmvfc_relogin(struct scsi_device *sdev)
struct ibmvfc_host *vhost = shost_priv(sdev->host);
struct fc_rport *rport = starget_to_rport(scsi_target(sdev));
struct ibmvfc_target *tgt;
+   unsigned long flags;
 
+   spin_lock_irqsave(vhost->host->host_lock, flags);
list_for_each_entry(tgt, &vhost->targets, queue) {
if (rport == tgt->rport) {
ibmvfc_del_tgt(tgt);
@@ -1649,6 +1673,7 @@ static void ibmvfc_relogin(struct scsi_device *sdev)
}
 
ibmvfc_reinit_host(vhost);
+   spin_unlock_irqrestore(vhost->host->host_lock, flags);
 }
 
 /**
@@ -2901,7 +2926,8 @@ static void ibmvfc_handle_async(struct ibmvfc_async_crq 
*crq,
  * @vhost: ibmvfc host struct
  *
  **/
-static void ibmvfc_handle_crq(struct ibmvfc_crq *crq, struct ibmvfc_host 
*vhost)
+static void ibmvfc_handle_crq(struct ibmvfc_crq *crq, struct ibmvfc_host 
*vhost,
+ struct list_head *evt_doneq)
 {
long rc;
struct ibmvfc_event *evt = (struct ibmvfc_event 
*)be64_to_cpu(crq->ioba);
@@ -2972,12 +2998,9 @@ static void ibmvfc_handle_crq(struct ibmvfc_crq *crq, 
struct ibmvfc_host *vhost)
return;
}
 
-   del_timer(&evt->timer);
spin_lock(&evt->queue->l_lock);
-   list_del(&evt->queue_list);
+   list_move_tail(&evt->queue_list, evt_doneq);
spin_unlock(&evt->queue->l_lock);
-   ibmvfc_trc_end(evt);
-   evt->done(evt);
 }
 
 /**
@@ -3364,8 +3387,10 @@ static void ibmvfc_tasklet(void *data)
struct vio_dev *vdev = to_vio_dev(vhost->dev);
struct ibmvfc_crq *crq;
struct ibmvfc_async_crq *async;
+   struct ibmvfc_event *evt, *temp;
unsigned long flags;
int done = 0;
+   LIST_HEAD(evt_doneq);
 
spin_lock_irqsave(vhost->host->host_lock, flags);
spin_lock(vhost->crq.q_lock);
@@ -3379,7 +3404,7 @@ static void ibmvfc_tasklet(void *data)
 
/* Pull all the valid messages off the CRQ */
while ((crq = ibmvfc_next_crq(vhost)) != NULL) {
-   ibmvfc_handle_crq(crq, vhost);
+   ibmvfc_handle_crq(crq, vhost, &evt_doneq);
crq->valid = 0;
wmb();
}
@@ -3392,7 +3417,7 @@ static void ibmvfc_tasklet(void *data)
wmb();
} else if ((crq = ibmvfc_next_crq(vhost)) != NULL) {
vio_disable_interrupts(vdev);
-   ibmvfc_handle_crq(crq, vhost);
+   ibmvfc_handle_crq(crq, vhost, &evt_doneq);
crq->valid = 0;

[PATCH 5/5] ibmvfc: relax locking around ibmvfc_queuecommand

2020-12-18 Thread Tyrel Datwyler

The driver's queuecommand routine is still wrapped to hold the host lock
for the duration of the call. This will become problematic when moving
to multiple queues due to the lock contention preventing asynchronous
submissions to multiple queues. There is no real legitimate reason to
hold the host lock, and previous patches have ensured proper protection
of moving ibmvfc_event objects between free and sent lists.

Signed-off-by: Tyrel Datwyler 
Reviewed-by: Brian King 
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index b74080489807..151e9111ab8a 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -1793,10 +1793,9 @@ static struct ibmvfc_cmd *ibmvfc_init_vfc_cmd(struct 
ibmvfc_event *evt, struct s
  * Returns:
  * 0 on success / other on failure
  **/
-static int ibmvfc_queuecommand_lck(struct scsi_cmnd *cmnd,
-  void (*done) (struct scsi_cmnd *))
+static int ibmvfc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *cmnd)
 {
-   struct ibmvfc_host *vhost = shost_priv(cmnd->device->host);
+   struct ibmvfc_host *vhost = shost_priv(shost);
struct fc_rport *rport = starget_to_rport(scsi_target(cmnd->device));
struct ibmvfc_cmd *vfc_cmd;
struct ibmvfc_fcp_cmd_iu *iu;
@@ -1806,7 +1805,7 @@ static int ibmvfc_queuecommand_lck(struct scsi_cmnd *cmnd,
if (unlikely((rc = fc_remote_port_chkready(rport))) ||
unlikely((rc = ibmvfc_host_chkready(vhost)))) {
cmnd->result = rc;
-   done(cmnd);
+   cmnd->scsi_done(cmnd);
return 0;
}
 
@@ -1814,7 +1813,6 @@ static int ibmvfc_queuecommand_lck(struct scsi_cmnd *cmnd,
evt = ibmvfc_get_event(&vhost->crq);
ibmvfc_init_event(evt, ibmvfc_scsi_done, IBMVFC_CMD_FORMAT);
evt->cmnd = cmnd;
-   cmnd->scsi_done = done;
 
vfc_cmd = ibmvfc_init_vfc_cmd(evt, cmnd->device);
iu = ibmvfc_get_fcp_iu(vhost, vfc_cmd);
@@ -1841,12 +1839,10 @@ static int ibmvfc_queuecommand_lck(struct scsi_cmnd 
*cmnd,
"Failed to map DMA buffer for command. rc=%d\n", 
rc);
 
cmnd->result = DID_ERROR << 16;
-   done(cmnd);
+   cmnd->scsi_done(cmnd);
return 0;
 }
 
-static DEF_SCSI_QCMD(ibmvfc_queuecommand)
-
 /**
  * ibmvfc_sync_completion - Signal that a synchronous command has completed
  * @evt:   ibmvfc event struct
-- 
2.27.0

[PATCH 0/5] ibmvfc: MQ preparatory locking work

2020-12-18 Thread Tyrel Datwyler

The ibmvfc driver in its current form relies heavily on the host_lock. This
patchset introduces a generic queue with its own queue lock and sent/free event
list locks. This generic queue allows the driver to decouple the primary queue
and future subordinate queues from the host lock reducing lock contention while
also relaxing locking for submissions and completions to simply the list lock of
the queue in question.
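
As a rough userspace illustration of the per-queue list lock idea, the sketch
below gives each queue its own lock covering only its free and sent lists.
Everything here is an assumption made for illustration: struct evt,
struct evt_queue and the evt_queue_*() helpers are hypothetical, a C11
atomic_flag replaces the kernel spinlock, and the sketch returns NULL on an
empty pool where the driver uses BUG_ON().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical event; only the list linkage matters here. */
struct evt {
	struct evt *next;
};

/* One instance per queue, so queues never contend on a shared lock. */
struct evt_queue {
	atomic_flag l_lock;    /* protects free and sent of this queue only */
	struct evt *free;
	struct evt *sent;
};

static void list_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;   /* spin */
}

static void list_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Put all n events of the pool on the queue's free list. */
static void evt_queue_init(struct evt_queue *q, struct evt *pool, size_t n)
{
	size_t i;

	atomic_flag_clear(&q->l_lock);
	q->free = NULL;
	q->sent = NULL;
	for (i = 0; i < n; i++) {
		pool[i].next = q->free;
		q->free = &pool[i];
	}
}

/* Take an event off the free list, or NULL if the pool is exhausted. */
static struct evt *evt_queue_get(struct evt_queue *q)
{
	struct evt *e;

	list_lock(&q->l_lock);
	e = q->free;
	if (e)
		q->free = e->next;
	list_unlock(&q->l_lock);
	if (e)
		e->next = NULL;
	return e;
}

/* Record an event as submitted by linking it onto the sent list. */
static void evt_queue_send(struct evt_queue *q, struct evt *e)
{
	assert(e != NULL);
	list_lock(&q->l_lock);
	e->next = q->sent;
	q->sent = e;
	list_unlock(&q->l_lock);
}
```

Submitting on one queue touches only that queue's l_lock, which is what lets
several queues make progress at once without a host-wide lock.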

Tyrel Datwyler (5):
  ibmvfc: define generic queue structure for CRQs
  ibmvfc: make command event pool queue specific
  ibmvfc: define per-queue state/list locks
  ibmvfc: complete commands outside the host/queue lock
  ibmvfc: relax locking around ibmvfc_queuecommand

 drivers/scsi/ibmvscsi/ibmvfc.c | 379 ++---
 drivers/scsi/ibmvscsi/ibmvfc.h |  54 +++--
 2 files changed, 286 insertions(+), 147 deletions(-)

-- 
2.27.0

Re: [PATCH 06/22] misc: xlink-pcie: Add documentation for XLink PCIe driver

2020-12-18 Thread Randy Dunlap

On 12/1/20 2:34 PM, mgr...@linux.intel.com wrote:
> From: Srikanth Thokala 
> 
> Provide overview of XLink PCIe driver implementation
> 
> Cc: linux-...@vger.kernel.org
> Reviewed-by: Mark Gross 
> Signed-off-by: Srikanth Thokala 
> ---
>  Documentation/vpu/index.rst  |  1 +
>  Documentation/vpu/xlink-pcie.rst | 91 
>  2 files changed, 92 insertions(+)
>  create mode 100644 Documentation/vpu/xlink-pcie.rst
> 

Hi--

For document, chapter, section, etc., headings, please read & use
Documentation/doc-guide/sphinx.rst:

* Please stick to this order of heading adornments:

  1. ``=`` with overline for document title::

       ==============
       Document title
       ==============

  2. ``=`` for chapters::

       Chapters
       ========

  3. ``-`` for sections::

       Section
       -------

  4. ``~`` for subsections::

       Subsection
       ~~~~~~~~~~

> diff --git a/Documentation/vpu/xlink-pcie.rst 
> b/Documentation/vpu/xlink-pcie.rst
> new file mode 100644
> index ..bc64b566989d
> --- /dev/null
> +++ b/Documentation/vpu/xlink-pcie.rst
> @@ -0,0 +1,91 @@
> +.. SPDX-License-Identifier: GPL-2.0-only
> +
> +Kernel driver: xlink-pcie driver
> +
> +Supported chips:
> +  * Intel Edge.AI Computer Vision platforms: Keem Bay
> +Suffix: Bay
> +Slave address: 6240
> +Datasheet: Publicly available at Intel
> +
> +Author: Srikanth Thokala srikanth.thok...@intel.com
> +
> +-
> +Introduction:

No colon at end of chapter/section headings.

> +-
> +The xlink-pcie driver in linux-5.4 provides transport layer implementation 
> for

Linux 5.4 (?)

> +the data transfers to support xlink protocol subsystem communication with the

 Xlink

> +peer device. i.e, between remote host system and the local Keem Bay device.

device, i.e., between the remote host system and

> +
> +The Keem Bay device is an ARM based SOC that includes a vision processing

 ARM-based

> +unit (VPU) and deep learning, neural network core in the hardware.
> +The xlink-pcie driver exports a functional device endpoint to the Keem Bay 
> device
> +and supports two-way communication with peer device.

  with the peer device.

> +
> +
> +High-level architecture:
> +
> +Remote Host: IA CPU
> +Local Host: ARM CPU (Keem Bay)::
> +
> +
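> +  +------------------------------------------------------------------------+
> +  |  Remote Host IA CPU              | | Local Host ARM CPU (Keem Bay) |   |
> +  +==================================+=+===============================+===+
> +  |  User App                        | | User App                      |   |
> +  +----------------------------------+-+-------------------------------+---+
> +  |   XLink UAPI                     | | XLink UAPI                    |   |
> +  +----------------------------------+-+-------------------------------+---+
> +  |   XLink Core                     | | XLink Core                    |   |
> +  +----------------------------------+-+-------------------------------+---+
> +  |   XLink PCIe                     | | XLink PCIe                    |   |
> +  +----------------------------------+-+-------------------------------+---+
> +  |   XLink-PCIe Remote Host driver  | | XLink-PCIe Local Host driver  |   |
> +  +----------------------------------+-+-------------------------------+---+
> +  |-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:|:|:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:|
> +  +----------------------------------+-+-------------------------------+---+
> +  | PCIe Host Controller             | | PCIe Device Controller        | HW|
> +  +----------------------------------+-+-------------------------------+---+
> +         ^                                             ^
> +         |                                             |
> +         |------------- PCIe x2 Link  -----------------|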
> +
> +This XLink PCIe driver comprises of two variants:
> +* Local Host driver
> +
> +  * Intended for ARM CPU
> +  * It is based on PCI Endpoint Framework
> +  * Driver path: {tree}/drivers/misc/xlink-pcie/local_host
> +
> +* Remote Host driver
> +
> +   * Intended for IA CPU
> +   * It is a PCIe endpoint driver
> +   * Driver path: {tree}/drivers/misc/xlink-pcie/remote_host
> +
> +XLink PCIe communication between local host and remote host is achieved 
> through
> +ring buffer management and MSI/Doorbell interrupts.
> +
> +The xlink-pcie driver subsystem registers Keem Bay device as an endpoint 
> driver

   registers the

> +and provides standard linux pcie sysfs interface, # 
> /sys/bus/pci/devices/:xx:xx.0

[PATCH v2] PCI: endpoint: Fix NULL pointer dereference for ->get_features()

2020-12-18 Thread Shradha Todi

The get_features op of pci_epc_ops may return NULL, causing a NULL pointer
dereference in the pci_epf_test_bind function. Add a check for the
pci_epc_feature pointer in pci_epf_test_bind before accessing it to
avoid any such NULL pointer dereference, and return -EOPNOTSUPP in case
pci_epc_feature is not found.
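
The shape of the check can be shown with a small userspace sketch. It is not
the endpoint framework API: struct epc_features, struct epc_ops,
epc_get_features() and epf_bind() are hypothetical stand-ins, chosen only to
show an optional callback whose NULL result is turned into an early
-EOPNOTSUPP return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down feature description. */
struct epc_features {
	int linkup_notifier;
};

/* Hypothetical ops table; get_features is optional and may return NULL. */
struct epc_ops {
	const struct epc_features *(*get_features)(void);
};

static const struct epc_features *epc_get_features(const struct epc_ops *ops)
{
	if (!ops || !ops->get_features)
		return NULL;
	return ops->get_features();
}

/* Bail out before the first dereference instead of guarding each use. */
static int epf_bind(const struct epc_ops *ops, int *linkup_notifier)
{
	const struct epc_features *features = epc_get_features(ops);

	assert(linkup_notifier != NULL);
	if (!features)
		return -EOPNOTSUPP;

	*linkup_notifier = features->linkup_notifier;
	return 0;
}

/* Two example controllers: one with features, one that reports none. */
static const struct epc_features demo_features = { 1 };

static const struct epc_features *demo_get_features(void)
{
	return &demo_features;
}

static const struct epc_features *no_features(void)
{
	return NULL;
}
```

Returning early keeps every later use of the pointer unconditional, which is
the same restructuring the diff below applies to pci_epf_test_bind().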

Reviewed-by: Pankaj Dubey 
Signed-off-by: Sriram Dash 
Signed-off-by: Shradha Todi 
---
v2:
 rebase on v1
 v1: https://lore.kernel.org/patchwork/patch/1208269/

 drivers/pci/endpoint/functions/pci-epf-test.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c 
b/drivers/pci/endpoint/functions/pci-epf-test.c
index 66723d5..f1842e6 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -835,13 +835,16 @@ static int pci_epf_test_bind(struct pci_epf *epf)
return -EINVAL;
 
epc_features = pci_epc_get_features(epc, epf->func_no);
-   if (epc_features) {
-   linkup_notifier = epc_features->linkup_notifier;
-   core_init_notifier = epc_features->core_init_notifier;
-   test_reg_bar = pci_epc_get_first_free_bar(epc_features);
-   pci_epf_configure_bar(epf, epc_features);
+   if (!epc_features) {
+   dev_err(&epf->dev, "epc_features not implemented\n");
+   return -EOPNOTSUPP;
}
 
+   linkup_notifier = epc_features->linkup_notifier;
+   core_init_notifier = epc_features->core_init_notifier;
+   test_reg_bar = pci_epc_get_first_free_bar(epc_features);
+   pci_epf_configure_bar(epf, epc_features);
+
epf_test->test_reg_bar = test_reg_bar;
epf_test->epc_features = epc_features;
 
-- 
2.7.4

Re: [PATCH 1/5] dt-bindings: remoteproc: Add PRU consumer bindings

2020-12-18 Thread Rob Herring

On Wed, Dec 16, 2020 at 9:55 AM Grzegorz Jaszczyk
 wrote:
>
> Hi Rob,
>
> On Mon, 14 Dec 2020 at 23:58, Rob Herring  wrote:
> >
> > On Fri, Dec 11, 2020 at 03:29:29PM +0100, Grzegorz Jaszczyk wrote:
> > > From: Suman Anna 
> > >
> > > Add a YAML binding document for PRU consumers. The binding includes
> > > all the common properties that can be used by different PRU consumer
> > > or application nodes and supported by the PRU remoteproc driver.
> > > These are used to configure the PRU hardware for specific user
> > > applications.
> > >
> > > The application nodes themselves should define their own bindings.
> > >
> > > Co-developed-by: Tero Kristo 
> > > Signed-off-by: Tero Kristo 
> > > Signed-off-by: Suman Anna 
> > > Co-developed-by: Grzegorz Jaszczyk 
> > > Signed-off-by: Grzegorz Jaszczyk 
> > > ---
> > >  .../bindings/remoteproc/ti,pru-consumer.yaml  | 64 +++
> > >  1 file changed, 64 insertions(+)
> > >  create mode 100644 
> > > Documentation/devicetree/bindings/remoteproc/ti,pru-consumer.yaml
> > >
> > > diff --git 
> > > a/Documentation/devicetree/bindings/remoteproc/ti,pru-consumer.yaml 
> > > b/Documentation/devicetree/bindings/remoteproc/ti,pru-consumer.yaml
> > > new file mode 100644
> > > index ..2c5c5e2b6159
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/remoteproc/ti,pru-consumer.yaml
> > > @@ -0,0 +1,64 @@
> > > +# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause)
> > > +%YAML 1.2
> > > +---
> > > +$id: http://devicetree.org/schemas/remoteproc/ti,pru-consumer.yaml#
> > > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > > +
> > > +title: Common TI PRU Consumer Binding
> > > +
> > > +maintainers:
> > > +  - Suman Anna 
> > > +
> > > +description: |
> > > +  A PRU application/consumer/user node typically uses one or more PRU 
> > > device
> > > +  nodes to implement a PRU application/functionality. Each 
> > > application/client
> > > +  node would need a reference to at least a PRU node, and optionally 
> > > define
> > > +  some properties needed for hardware/firmware configuration. The below
> > > +  properties are a list of common properties supported by the PRU 
> > > remoteproc
> > > +  infrastructure.
> > > +
> > > +  The application nodes shall define their own bindings like regular 
> > > platform
> > > +  devices, so below are in addition to each node's bindings.
> > > +
> > > +properties:
> > > +  prus:
> >
> > ti,prus
>
> Thank you - I will change and post v2 but with this I will run into
> issues when this binding will be referenced by some consumer YAML
> binding. Running dtbs_check in such case throws:
> ... k3-am654-base-board.dt.yaml: serial@28000: 'ti,prus' does not
> match any of the regexes: 'pinctrl-[0-9]+'
> At the same time, if I remove this property from that node I get:
> ... k3-am654-base-board.dt.yaml: serial@28000: 'ti,prus' is a required 
> property
> as expected.

Sounds like you didn't update 'ti,prus' in whatever schema you include
this one from.

>
> Getting rid of the comma from this property name works around the mentioned
> problem (which is not proper but allows me to correctly test this
> binding): e.g. s/ti,prus/ti-pruss/ or using the previous name without
> a comma.
> It seems to be an issue with dtbs_check itself which we will encounter
> in the future.

If not, can you point me to a branch having this problem?

Rob

[PATCH] PCI: dwc: Change size to u64 for EP outbound iATU

2020-12-18 Thread Shradha Todi

The outbound iATU permits a size greater than 4GB, and support
for that is already available, so allow the EP function to pass
a u64 size instead of truncating it to u32.
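
The effect of the narrower parameter type can be seen in a few lines of
userspace C. The two helpers below are hypothetical and do not reproduce the
driver's register programming; they only compute the last address of a
window so the silent truncation becomes visible.

```c
#include <assert.h>
#include <stdint.h>

/* Last address of a window when the size parameter is only 32 bits wide. */
static uint64_t window_end_u32(uint64_t cpu_addr, uint32_t size)
{
	assert(size != 0);   /* a size of exactly 4 GiB would arrive here as 0 */
	return cpu_addr + size - 1;
}

/* Same computation with a 64-bit size: nothing is lost at the call site. */
static uint64_t window_end_u64(uint64_t cpu_addr, uint64_t size)
{
	assert(size != 0);
	return cpu_addr + size - 1;
}
```

With cpu_addr = 0x80000000 and a requested size of 4 GiB + 4 KiB, the u64
variant ends the window at 0x180000FFF, while the u32 variant receives only
the low 32 bits (0x1000) and ends it at 0x80000FFF.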

Signed-off-by: Shradha Todi 
---
 drivers/pci/controller/dwc/pcie-designware.c | 2 +-
 drivers/pci/controller/dwc/pcie-designware.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware.c 
b/drivers/pci/controller/dwc/pcie-designware.c
index 7eba3b2..6298212 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -325,7 +325,7 @@ void dw_pcie_prog_outbound_atu(struct dw_pcie *pci, int 
index, int type,
 
 void dw_pcie_prog_ep_outbound_atu(struct dw_pcie *pci, u8 func_no, int index,
  int type, u64 cpu_addr, u64 pci_addr,
- u32 size)
+ u64 size)
 {
__dw_pcie_prog_outbound_atu(pci, func_no, index, type,
cpu_addr, pci_addr, size);
diff --git a/drivers/pci/controller/dwc/pcie-designware.h 
b/drivers/pci/controller/dwc/pcie-designware.h
index 28b72fb..bb33f28 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -307,7 +307,7 @@ void dw_pcie_prog_outbound_atu(struct dw_pcie *pci, int 
index,
   u64 size);
 void dw_pcie_prog_ep_outbound_atu(struct dw_pcie *pci, u8 func_no, int index,
  int type, u64 cpu_addr, u64 pci_addr,
- u32 size);
+ u64 size);
 int dw_pcie_prog_inbound_atu(struct dw_pcie *pci, u8 func_no, int index,
 int bar, u64 cpu_addr,
 enum dw_pcie_as_type as_type);
-- 
2.7.4

Re: [PATCH V3 04/10] x86/pks: Preserve the PKRS MSR on context switch

2020-12-18 Thread Thomas Gleixner

On Fri, Dec 18 2020 at 13:58, Dan Williams wrote:
> On Fri, Dec 18, 2020 at 1:06 PM Thomas Gleixner  wrote:
>> kmap_local() is fine. That can work automatically because it's strictly
>> local to the context which does the mapping.
>>
>> kmap() is dubious because it's a 'global' mapping as dictated per
>> HIGHMEM. So doing the RELAXED mode for kmap() is sensible I think to
>> identify cases where the mapped address is really handed to a different
>> execution context. We want to see those cases and analyse whether this
>> can't be solved in a different way. That's why I suggested to do a
>> warning in that case.
>>
>> Also vs. the DAX use case I really meant the code in fs/dax and
>> drivers/dax/ itself which is handling this via dax_read_[un]lock.
>>
>> Does that make more sense?
>
> Yup, got it. The dax code can be precise wrt PKS in a way that
> kmap_local() cannot.

Which makes me wonder whether we should have kmap_local_for_read()
or something like that, which could obviously only be RO enforced for
the real HIGHMEM case or the (for now x86 only) enforced kmap_local()
debug mechanics on 64bit.

So for the !highmem case it would not magically make the existing kernel
mapping RO, but this could be forwarded to the PKS protection. Aside from
that it's a nice annotation in the code.

That could be used right away for all the kmap[_atomic] -> kmap_local
conversions.

Thanks,

tglx
---
 include/linux/highmem-internal.h |   14 ++
 1 file changed, 14 insertions(+)

--- a/include/linux/highmem-internal.h
+++ b/include/linux/highmem-internal.h
@@ -32,6 +32,10 @@ static inline void kmap_flush_tlb(unsign
 #define kmap_prot PAGE_KERNEL
 #endif
 
+#ifndef kmap_prot_ro
+#define kmap_prot_ro PAGE_KERNEL_RO
+#endif
+
 void *kmap_high(struct page *page);
 void kunmap_high(struct page *page);
 void __kmap_flush_unused(void);
@@ -73,6 +77,11 @@ static inline void *kmap_local_page(stru
return __kmap_local_page_prot(page, kmap_prot);
 }
 
+static inline void *kmap_local_page_for_read(struct page *page)
+{
+   return __kmap_local_page_prot(page, kmap_prot_ro);
+}
+
 static inline void *kmap_local_page_prot(struct page *page, pgprot_t prot)
 {
return __kmap_local_page_prot(page, prot);
@@ -169,6 +178,11 @@ static inline void *kmap_local_page_prot
 {
return kmap_local_page(page);
 }
+
+static inline void *kmap_local_page_for_read(struct page *page)
+{
+   return kmap_local_page(page);
+}
 
 static inline void *kmap_local_pfn(unsigned long pfn)
 {

Re: [RFC PATCH v2 2/2] blk-mq: Lockout tagset iter when freeing rqs

2020-12-18 Thread Bart Van Assche

On 12/17/20 3:07 AM, John Garry wrote:
> References to old IO sched requests are currently cleared from the
> tagset when freeing those requests; switching elevator or changing
> request queue depth is such a scenario in which this occurs.
> 
> However, this does not stop the potentially racy behaviour of freeing
> and clearing a request reference between a tagset iterator getting a
> reference to a request and actually dereferencing that request.
> 
> Such a use-after-free can be triggered, as follows:
> 
> ==
> BUG: KASAN: use-after-free in bt_iter+0xa0/0x120
> Read of size 8 at addr 00108d589300 by task fio/3052
> 
> CPU: 32 PID: 3052 Comm: fio Tainted: GW
> 5.10.0-rc4-64839-g2dcf1ee5054f #693
> Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon
> D05 IT21 Nemo 2.0 RC0 04/18/2018
> Call trace:
> dump_backtrace+0x0/0x2d0
> show_stack+0x18/0x68
> dump_stack+0x100/0x16c
> print_address_description.constprop.12+0x6c/0x4e8
> kasan_report+0x130/0x200
> __asan_load8+0x9c/0xd8
> bt_iter+0xa0/0x120
> blk_mq_queue_tag_busy_iter+0x2d8/0x540
> blk_mq_in_flight+0x80/0xb8
> part_stat_show+0xd8/0x238
> dev_attr_show+0x44/0x90
> sysfs_kf_seq_show+0x128/0x1c8
> kernfs_seq_show+0xa0/0xb8
> seq_read_iter+0x1ec/0x6a0
> seq_read+0x1d0/0x250
> kernfs_fop_read+0x70/0x330
> vfs_read+0xe4/0x250
> ksys_read+0xc8/0x178
> __arm64_sys_read+0x44/0x58
> el0_svc_common.constprop.2+0xc4/0x1e8
> do_el0_svc+0x90/0xa0
> el0_sync_handler+0x128/0x178
> el0_sync+0x158/0x180
> 
> This is found experimentally by running fio on 2x SCSI disks - 1x disk
> holds the root partition. Userspace is constantly triggering the tagset
> iter from reading the root (gen)disk partition info. And so if the IO
> sched is constantly changed on the other disk, eventually the UAF occurs,
> as described above.

Hi John,

Something is not clear to me. The above call stack includes
blk_mq_queue_tag_busy_iter(). That function starts with
percpu_ref_tryget(&q->q_usage_counter) and ends with calling
percpu_ref_put(&q->q_usage_counter). So it will only iterate over a tag set
if q->q_usage_counter is live. However, both blk_mq_update_nr_requests()
and elevator_switch() start with freezing the request queue.
blk_mq_freeze_queue() starts with killing q->q_usage_counter and waits
until that counter has dropped to zero. In other words,
blk_mq_queue_tag_busy_iter() should not iterate over a tag set while a tag
set is being freed or reallocated. Does this mean that we do not yet have
a full explanation about why the above call stack can be triggered?
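
The argument above can be modelled with a plain atomic counter. This is a
deliberately simplified sketch and not the kernel's percpu_ref
implementation: struct usage_ref and its helpers are hypothetical, and the
"freeze" is reduced to dropping the initial reference and checking for zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified usage counter; starts at 1 for the owner's base reference. */
struct usage_ref {
	atomic_long count;
};

static void usage_ref_init(struct usage_ref *r)
{
	atomic_init(&r->count, 1);
}

/* Take a reference unless the counter has already dropped to zero. */
static bool usage_ref_tryget(struct usage_ref *r)
{
	long old = atomic_load(&r->count);

	while (old > 0) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

static void usage_ref_put(struct usage_ref *r)
{
	long old = atomic_fetch_sub(&r->count, 1);

	assert(old > 0);
}

/* Start of a freeze: drop the base reference taken at init time. */
static void usage_ref_kill(struct usage_ref *r)
{
	usage_ref_put(r);
}

/* A freeze may only proceed once every user has dropped its reference. */
static bool usage_ref_is_zero(struct usage_ref *r)
{
	return atomic_load(&r->count) == 0;
}
```

In this model an iterator that succeeded with usage_ref_tryget() keeps the
counter above zero, so a freeze started by usage_ref_kill() cannot complete
until the iterator calls usage_ref_put(); once the counter is zero, new
iterators are refused.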

Thanks,

Bart.

Re: [PATCH 04/14] dt-bindings: display: bridge: Add i.MX8qm/qxp pixel combiner binding

2020-12-18 Thread Rob Herring

On Thu, Dec 17, 2020 at 7:48 PM Liu Ying  wrote:
>
> Hi,
>
> On Thu, 2020-12-17 at 12:50 -0600, Rob Herring wrote:
> > On Thu, 17 Dec 2020 17:59:23 +0800, Liu Ying wrote:
> > > This patch adds bindings for i.MX8qm/qxp pixel combiner.
> > >
> > > Signed-off-by: Liu Ying 
> > > ---
> > >  .../display/bridge/fsl,imx8qxp-pixel-combiner.yaml | 160 
> > > +
> > >  1 file changed, 160 insertions(+)
> > >  create mode 100644 
> > > Documentation/devicetree/bindings/display/bridge/fsl,imx8qxp-pixel-combiner.yaml
> > >
> >
> > My bot found errors running 'make dt_binding_check' on your patch:
> >
> > yamllint warnings/errors:
> >
> > dtschema/dtc warnings/errors:
> > Documentation/devicetree/bindings/display/bridge/fsl,imx8qxp-pixel-combiner.example.dts:19:18:
> >  fatal error: dt-bindings/clock/imx8-lpcg.h: No such file or directory
> > >19 | #include <dt-bindings/clock/imx8-lpcg.h>
> >   |  ^~~
> > compilation terminated.
> > make[1]: *** [scripts/Makefile.lib:342: 
> > Documentation/devicetree/bindings/display/bridge/fsl,imx8qxp-pixel-combiner.example.dt.yaml]
> >  Error 1
> > make[1]: *** Waiting for unfinished jobs
> > make: *** [Makefile:1364: dt_binding_check] Error 2
> >
> > See https://patchwork.ozlabs.org/patch/1417599
> >
> > This check can fail if there are any dependencies. The base for a patch
> > series is generally the most recent rc1.
>
> This series can be applied to linux-next/master branch.

I can't know that in order to apply and run checks automatically. I
guessed as much when reviewing this before sending, but I want it to be
abundantly clear what the result of applying this might be, and it
wasn't mentioned in this patch.

Plus linux-next is a base no one can apply patches to, so should you
be sending patches based on it? It's also the merge window, so maybe
wait until rc1 when your dependency is in and the patch can actually
be applied. Also, the drm-misc folks will still need to know they need
to merge rc1 in before this is applied.

Rob

Re: [resend/standalone PATCH v4] Add auxiliary bus support

2020-12-18 Thread Dan Williams

On Fri, Dec 18, 2020 at 1:17 PM Alexandre Belloni
 wrote:
>
> On 18/12/2020 16:58:56-0400, Jason Gunthorpe wrote:
> > On Fri, Dec 18, 2020 at 08:32:11PM +, Mark Brown wrote:
> >
> > > > So, I strongly suspect, MFD should create mfd devices on a MFD bus
> > > > type.
> > >
> > > Historically people did try to create custom bus types, as I have
> > > pointed out before there was then pushback that these were duplicating
> > > the platform bus so everything uses platform bus.
> >
> > Yes, I vaguely remember..
> >
> > I don't know what to say, it seems Greg doesn't share this view of
> > platform devices as a universal device.
> >
> > Reading between the lines, I suppose things would have been happier
> > with some kind of inheritance scheme where platform device remained as
> > only instantiated directly in board files, while drivers could bind to
> > OF/DT/ACPI/FPGA/etc device instantiations with minimal duplication &
> > boilerplate.
> >
> > And maybe that is exactly what we have today with platform devices,
> > though the name is now unfortunate.
> >
> > > I can't tell the difference between what it's doing and what SOF is
> > > doing, the code I've seen is just looking at the system it's running
> > > on and registering a fixed set of client devices.  It looks slightly
> > > different because it's registering a device at a time with some wrapper
> > > functions involved but that's what the code actually does.
> >
> > SOF's aux bus usage in general seems weird to me, but if you think
> > it fits the mfd scheme of primarily describing HW to partition vs
> > describing a SW API then maybe it should use mfd.
> >
> > The only problem with mfd as far as SOF is concerned was Greg was not
> > happy when he saw PCI stuff in the MFD subsystem.
> >
>
> But then again, what about non-enumerable devices on the PCI device? I
> feel this would exactly fit MFD. This is a collection of IPs that exist
> as standalone but in this case are grouped in a single device.
>
> Note that I then have another issue because the kernel doesn't support
> irq controllers on PCI and this is exactly what my SoC has. But for now,
> I can just duplicate the irqchip driver in the MFD driver.
>
> > This whole thing started when Intel first proposed to directly create
> > platform_device's in their ethernet driver and Greg had a quite strong
> > NAK to that.
>
> Let me point to drivers/net/ethernet/cadence/macb_pci.c which is a
> fairly recent example. It does exactly that and I'm not sure you could
> do it otherwise while still not having to duplicate most of macb_probe.
>

This still feels like an orthogonal example to the problem auxiliary-bus is
solving. If a platform-device and a pci-device surface an IP with a
shared programming model that's an argument for a shared library, like
libata to house the commonality. In contrast auxiliary-bus is a
software model for software-defined sub-functionality to be wrapped in
a driver model. It assumes a parent-device / parent-driver hierarchy
that platform-bus and pci-bus do not imply.

Re: [PATCH v2 05/15] ia64: convert to legacy_timer_tick

2020-12-18 Thread John Paul Adrian Glaubitz

Hi Arnd!

On 12/18/20 11:13 PM, John Paul Adrian Glaubitz wrote:
>> I've attached a patch for a partial revert of my original change, this
>> should still work with the final cleanup on top, but restore the loop
>> plus the local_irq_enable()/local_irq_disable() that I dropped from
>> the original code. Does this make a difference?
> 
> I'll give it a try and report back.

Yes. That solves the timer issues. Now there is unfortunately still a
second, unrelated regression with the hpsa driver that was introduced
by one of the ia64 patches in the mm tree from Andrew which makes the
hpsa driver not load at all.

Haven't figured out yet what the problem is.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913

Re: [PATCH v2 0/1] arm64: defconfig: Enable Librem 5 hardware

2020-12-18 Thread Pavel Machek

Hi!

> > > > > Patches are on top of Shawn's imx/defconfig
> > > > 
> > > > Thanks for bringing support for your hardware to the mainline.
> > > > 
> > > > Can I ask phone-de...@vger.kernel.org to be cc-ed for phone-related
> > > > changes?
> > > 
> > > Good point. Done with v3.
> > > 
> > > > How complete is the support?
> > > 
> > > The components enabled should work in 5.11 (there's some LCD/DSI patches
> > > in flight (that's why i did not send the corresponding DT addition yet)
> > > and we need to submit a DT for Evergreen (imx8mq-librem5r4).
> > > 
> > > https://git.sigxcpu.org/cgit/talks/2020-debconf-mobile/plain/talk.pdf
> > > 
> > > is a bit outdated but has some numbers starting on page 24.
> > 
> > Thanks for pointer :-).
> > 
> > > > In particular, what interface do you use to configure audio routing
> > > > for the modem?
> > > 
> > > https://salsa.debian.org/DebianOnMobile-team/callaudiod manages
> > > > that.
> > 
> > Does kernel provide mixer interface for callaudiod to do its job?
> 
> callaudiod handles selecting e.g. earpiece vs. speaker by selecting the
> right pulseaudio ports (it's invoked by calls, the phone call handling
> application, via DBus) and only relies on the codec being an alsa
> device and hence handled by pulseaudio/alsa-ucm.
> 
> Wys (https://source.puri.sm/Librem5/wys) manages the routing between the
> modem and codec by listening to ModemManager's state and connecting audio
> source and sink (again solely via pulseaudio so again just relying on
> modem and codec being alsa devices). Since the modem is not part of the
> SoC on the Librem 5 it's a completely separate device.

Aha, yep, sorry -- I forgot. I was hoping to copy the Librem 5 solution
to Droid 4, but that won't work, as Droid 4 does audio in hardware,
while Librem does it in wys.

Best regards,

Pavel
-- 
http://www.livejournal.com/~pavelmachek



Re: [RFC PATCH 0/13] sparc32: sunset sun4m and sun4d

2020-12-18 Thread Kjetil Oftedal

On 18/12/2020, Sam Ravnborg  wrote:
> The sun4m and sun4d based SPARC machines were very popular in the
> '90s and were then replaced by the more powerful sparc64
> class of machines.
> Today, to the best of my knowledge, only Gentoo supports
> sparc32, and people have moved on to more capable HW.
>
> Cobham Gaisler has variants of the LEON processor that
> run sparc32 - and they are in production today.
>
> With this patchset I propose to sunset sun4m and sun4d and move
> focus to a more streamlined support for LEON.
>
> One downside is that qemu supports sun4m - and we may lose
> some testing possibilities when sun4m is dropped. qemu supports
> LEON to some degree - I have not yet tried it out.
>
> Andreas from Gaisler has indicated that they may be more active
> upstream on sparc32 - and this will only be easier with a kernel
> where the legacy stuff is dropped.
>

This makes me a bit sad. But I guess I haven't had any time to put into
the sparc32 port for many years, so I guess it is time to let go.

But I do believe that by doing this we should make sure we are not
putting ourselves in a position where the sparc kernel-developers don't
have access to any real sparc32 hardware.

SUN machines were at least plentiful. The LEON-family of processors
being targeted towards the rad-hardened market are not so much
available.

Maybe Gaisler can contribute some systems, or make some available remotely?

Best regards,
Kjetil Oftedal

Re: [PATCH v2 06/12] software_node: Add support for fwnode_graph*() family of functions

2020-12-18 Thread Daniel Scally

On 18/12/2020 20:37, Andy Shevchenko wrote:
> On Thu, Dec 17, 2020 at 11:43:31PM +, Daniel Scally wrote:
>> From: Heikki Krogerus 
>>
>> This implements the remaining .graph_* callbacks in the
>> fwnode operations structure for the software nodes. That makes
>> the fwnode_graph*() functions available in the drivers also
>> when software nodes are used.
>>
>> The implementation tries to mimic the "OF graph" as much as
>> possible, but there is no support for the "reg" device
>> property. The ports will need to have the index in their
>> name which starts with "port@" (for example "port@0", "port@1",
>> ...) and endpoints will use the index of the software node
>> that is given to them during creation. The port nodes can
>> also be grouped under a specially named "ports" subnode,
>> just like in DT, if necessary.
>>
>> The remote-endpoints are reference properties under the
>> endpoint nodes that are named "remote-endpoint".
> 
> ...
> 
>> +while ((port = software_node_get_next_child(parent, old))) {
>> +if (!strncmp(to_swnode(port)->node->name, "port", 4))
>> +return port;
>> +old = port;
>> +}
> 
> Dunno if we need defines for port and its length here.

Mmm, maybe a comment?

> ...
> 
>> +ret = kstrtou32(swnode->parent->node->name + 5, 10, &endpoint->port);
> 
> But here at least comment is needed what 5 means ('port@' I suppose).

Ack - I'll add an explanatory comment (and yep, it's 'port@')
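
For readers following along, here is a minimal userspace sketch of what the magic 5 stands for and how a named constant could document it. It is not the patch code: parse_port_index(), PORT_PREFIX and PORT_PREFIX_LEN are hypothetical names, and strtoul() stands in for the kernel's kstrtou32().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names; the patch itself uses the bare literal 5. */
#define PORT_PREFIX     "port@"
#define PORT_PREFIX_LEN (sizeof(PORT_PREFIX) - 1)   /* 5, NUL excluded */

/*
 * Userspace stand-in for
 *     kstrtou32(name + 5, 10, &endpoint->port);
 * Returns 0 and stores the index on success, -EINVAL otherwise.
 */
static int parse_port_index(const char *name, uint32_t *port)
{
	char *end;
	unsigned long val;

	assert(name != NULL && port != NULL);

	if (strncmp(name, PORT_PREFIX, PORT_PREFIX_LEN) != 0)
		return -EINVAL;

	/* Skip "port@" so that only the decimal index remains. */
	name += PORT_PREFIX_LEN;
	if (*name < '0' || *name > '9')
		return -EINVAL;

	errno = 0;
	val = strtoul(name, &end, 10);
	if (errno || *end != '\0' || val > UINT32_MAX)
		return -EINVAL;

	*port = (uint32_t)val;
	return 0;
}
```

With this sketch, "port@1" yields index 1, while "port", "ports" and "port@x" are rejected.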

>> +if (ret)
>> +return ret;
> 
>

[tip: timers/urgent] timekeeping: Fix spelling mistake in Kconfig "fullfill" -> "fulfill"

2020-12-18 Thread tip-bot2 for Colin Ian King

The following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     f6f5cd840ae782680c5e94048c72420e4e6857f9
Gitweb:        https://git.kernel.org/tip/f6f5cd840ae782680c5e94048c72420e4e6857f9
Author:        Colin Ian King 
AuthorDate:    Thu, 17 Dec 2020 17:17:05 
Committer:     Thomas Gleixner 
CommitterDate: Fri, 18 Dec 2020 23:15:00 +01:00

timekeeping: Fix spelling mistake in Kconfig "fullfill" -> "fulfill"

There is a spelling mistake in the Kconfig help text. Fix it.

Signed-off-by: Colin Ian King 
Signed-off-by: Thomas Gleixner 
Acked-by: Linus Walleij 
Link: https://lore.kernel.org/r/20201217171705.57586-1-colin.k...@canonical.com

---
 kernel/time/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index a09b1d6..64051f4 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -141,7 +141,7 @@ config CONTEXT_TRACKING_FORCE
  dynticks working.
 
  This option stands for testing when an arch implements the
- context tracking backend but doesn't yet fullfill all the
+ context tracking backend but doesn't yet fulfill all the
  requirements to make the full dynticks feature working.
  Without the full dynticks, there is no way to test the support
  for context tracking and the subsystems that rely on it: RCU

Re: [PATCH v2 04/12] software_node: Enforce parent before child ordering of nodes arrays

2020-12-18 Thread Daniel Scally

Hi Laurent

On 18/12/2020 16:02, Laurent Pinchart wrote:
> Hi Daniel,
> 
> Thank you for the patch.
> 
> On Thu, Dec 17, 2020 at 11:43:29PM +, Daniel Scally wrote:
>> Registering software_nodes with the .parent member set to point to a
>> currently unregistered software_node has the potential for problems,
>> so enforce parent -> child ordering in arrays passed in to
>> software_node_register_nodes().
>>
>> Software nodes that are children of another software node should be
>> unregistered before their parent. To allow easy unregistering of an array
>> of software_nodes ordered parent to child, reverse the order in which
>> software_node_unregister_nodes() unregisters software_nodes.
>>
>> Suggested-by: Andy Shevchenko 
>> Signed-off-by: Daniel Scally 
>> ---
>> Changes in v2:
>>
>>  - Squashed the patches that originally touched these separately
>>  - Updated documentation
>>
>>  drivers/base/swnode.c | 43 ++-
>>  1 file changed, 30 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/base/swnode.c b/drivers/base/swnode.c
>> index 615a0c93e116..cfd1faea48a7 100644
>> --- a/drivers/base/swnode.c
>> +++ b/drivers/base/swnode.c
>> @@ -692,7 +692,10 @@ swnode_register(const struct software_node *node, struct swnode *parent,
>>   * software_node_register_nodes - Register an array of software nodes
>>   * @nodes: Zero terminated array of software nodes to be registered
>>   *
>> - * Register multiple software nodes at once.
>> + * Register multiple software nodes at once. If any node in the array
>> + * has it's .parent pointer set, then it's parent **must** have been
>> + * registered before it is; either outside of this function or by
>> + * ordering the array such that parent comes before child.
>>   */
>>  int software_node_register_nodes(const struct software_node *nodes)
>>  {
>> @@ -700,33 +703,47 @@ int software_node_register_nodes(const struct software_node *nodes)
>>  int i;
>>  
>>  for (i = 0; nodes[i].name; i++) {
>> -ret = software_node_register(&nodes[i]);
>> -if (ret) {
>> -software_node_unregister_nodes(nodes);
>> -return ret;
>> +const struct software_node *parent = nodes[i].parent;
>> +
>> +if (parent && !software_node_to_swnode(parent)) {
>> +ret = -EINVAL;
>> +goto err_unregister_nodes;
>>  }
>> +
>> +ret = software_node_register(&nodes[i]);
>> +if (ret)
>> +goto err_unregister_nodes;
>>  }
>>  
>>  return 0;
>> +
>> +err_unregister_nodes:
>> +software_node_unregister_nodes(nodes);
>> +return ret;
>>  }
>>  EXPORT_SYMBOL_GPL(software_node_register_nodes);
>>  
>>  /**
>>   * software_node_unregister_nodes - Unregister an array of software nodes
>> - * @nodes: Zero terminated array of software nodes to be unregistered
>> + * @nodes: Zero terminated array of software nodes to be unregistered.
> 
> Not sure if this is needed.

Hah, of course. Hangover from the last version (when I had made that
line two sentences)
> 
>>   *
>> - * Unregister multiple software nodes at once.
>> + * Unregister multiple software nodes at once. If parent pointers are set up
>> + * in any of the software nodes then the array MUST be ordered such that
> 
> I'd either replace **must** above with MUST, or use **must** here. I'm
> not sure if kerneldoc handles emphasis with **must**, if it does that
> seems a bit nicer to me, but it's really up to you.

Honestly I haven't delved into kerneldoc yet, but either way I think
**must** is better in both places - will change.

> Reviewed-by: Laurent Pinchart 

Thank you!
> 
>> + * parents come before their children.
>>   *
>> - * NOTE: Be careful using this call if the nodes had parent pointers set up in
>> - * them before registering.  If so, it is wiser to remove the nodes
>> - * individually, in the correct order (child before parent) instead of relying
>> - * on the sequential order of the list of nodes in the array.
>> + * NOTE: If you are uncertain whether the array is ordered such that
>> + * parents will be unregistered before their children, it is wiser to
>> + * remove the nodes individually, in the correct order (child before
>> + * parent).
>>   */
>>  void software_node_unregister_nodes(const struct software_node *nodes)
>>  {
>> -int i;
>> +unsigned int i = 0;
>> +
>> +while (nodes[i].name)
>> +i++;
>>  
>> -for (i = 0; nodes[i].name; i++)
>> +while (i--)
>>  software_node_unregister(&nodes[i]);
>>  }
>>  EXPORT_SYMBOL_GPL(software_node_unregister_nodes);
>
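
As an illustration of the ordering rule discussed above, here is a small userspace model: registration walks the zero-terminated array forwards and refuses a node whose parent is not yet registered, and unregistration walks it backwards so children go before their parents. This is only a sketch under stated assumptions; struct demo_node and the demo_*() helpers are hypothetical and do not exist in drivers/base/swnode.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct software_node plus its runtime state. */
struct demo_node {
	const char *name;           /* a NULL name terminates the array */
	struct demo_node *parent;
	int registered;
	int children;               /* number of currently registered children */
};

static int demo_register(struct demo_node *n)
{
	if (n->parent) {
		if (!n->parent->registered)
			return -EINVAL;     /* parent must come first */
		n->parent->children++;
	}
	n->registered = 1;
	return 0;
}

static int demo_unregister(struct demo_node *n)
{
	if (!n->registered)
		return 0;
	if (n->children)
		return -EBUSY;          /* would orphan registered children */
	if (n->parent)
		n->parent->children--;
	n->registered = 0;
	return 0;
}

/* Find the terminator, then walk backwards: child before parent. */
static void demo_unregister_nodes(struct demo_node *nodes)
{
	unsigned int i = 0;

	assert(nodes != NULL);

	while (nodes[i].name)
		i++;

	while (i--)
		(void)demo_unregister(&nodes[i]);
}

/* Walk forwards: parent before child; undo everything on failure. */
static int demo_register_nodes(struct demo_node *nodes)
{
	unsigned int i;
	int ret;

	for (i = 0; nodes[i].name; i++) {
		ret = demo_register(&nodes[i]);
		if (ret) {
			demo_unregister_nodes(nodes);
			return ret;
		}
	}
	return 0;
}
```

In this model an array ordered { root, child } registers and unregisters cleanly, while { child, root } fails with -EINVAL and leaves nothing registered.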

Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-12-18 Thread Rob Herring

On Thu, Dec 17, 2020 at 9:00 AM Thierry Reding  wrote:
>
> On Tue, Nov 10, 2020 at 08:33:09PM +0100, Thierry Reding wrote:
> > On Fri, Nov 06, 2020 at 04:25:48PM +0100, Thierry Reding wrote:
> > > On Thu, Nov 05, 2020 at 05:47:21PM +, Robin Murphy wrote:
> > > > On 2020-11-05 16:43, Thierry Reding wrote:
> > > > > On Thu, Sep 24, 2020 at 01:27:25PM +0200, Thierry Reding wrote:
> > > > > > On Tue, Sep 15, 2020 at 02:36:48PM +0200, Thierry Reding wrote:
> > > > > > > On Mon, Sep 14, 2020 at 04:08:29PM -0600, Rob Herring wrote:
> > > > > > > > On Fri, Sep 04, 2020 at 02:59:57PM +0200, Thierry Reding wrote:
> > > > > > > > > From: Thierry Reding 
> > > > > > > > >
> > > > > > > > > > Reserved memory regions can be marked as "active" if hardware is
> > > > > > > > > > expected to access the regions during boot and before the operating
> > > > > > > > > > system can take control. One example where this is useful is for the
> > > > > > > > > > operating system to infer whether the region needs to be identity-
> > > > > > > > > > mapped through an IOMMU.
> > > > > > > >
> > > > > > > > > I like simple solutions, but this hardly seems adequate to solve the
> > > > > > > > > problem of passing IOMMU setup from bootloader/firmware to the OS. Like
> > > > > > > > > what is the IOVA that's supposed to be used if identity mapping is not
> > > > > > > > > used?
> > > > > > >
> > > > > > > > The assumption here is that if the region is not active there is no need
> > > > > > > > for the IOVA to be specified because the kernel will allocate memory and
> > > > > > > > assign any IOVA of its choosing.
> > > > > > > >
> > > > > > > > Also, note that this is not meant as a way of passing IOMMU setup from
> > > > > > > > the bootloader or firmware to the OS. The purpose of this is to specify
> > > > > > > > that some region of memory is actively being accessed during boot. The
> > > > > > > > particular case that I'm looking at is where the bootloader set up a
> > > > > > > > splash screen and keeps it on during boot. The bootloader has not set up
> > > > > > > > an IOMMU mapping and the identity mapping serves as a way of keeping the
> > > > > > > > accesses by the display hardware working during the transitional period
> > > > > > > > after the IOMMU translations have been enabled by the kernel but before
> > > > > > > > the kernel display driver has had a chance to set up its own IOMMU
> > > > > > > > mappings.
> > > > > > >
> > > > > > > > > If you know enough about the regions to assume identity mapping, then
> > > > > > > > > can't you know if active or not?
> > > > > > > >
> > > > > > > > We could alternatively add some property that describes the region as
> > > > > > > > requiring an identity mapping. But note that we can't make any
> > > > > > > > assumptions here about the usage of these regions because the IOMMU
> > > > > > > > driver simply has no way of knowing what they are being used for.
> > > > > > > >
> > > > > > > > Some additional information is required in device tree for the IOMMU
> > > > > > > > driver to be able to make that decision.
> > > > > >
> > > > > > Rob, can you provide any hints on exactly how you want to move this
> > > > > > forward? I don't know in what direction you'd like to proceed.
> > > > >
> > > > > Hi Rob,
> > > > >
> > > > > > do you have any suggestions on how to proceed with this? I'd like to get
> > > > > > this moving again because it's something that's been nagging me for some
> > > > > > months now. It also requires changes across two levels in the bootloader
> > > > > > stack as well as Linux and it takes quite a bit of work to make all the
> > > > > > changes, so before I go and rewrite everything I'd like to get the DT
> > > > > > bindings sorted out first.
> > > > > >
> > > > > > So just to summarize why I think this simple solution is good enough: it
> > > > > > tries to solve a very narrow and simple problem. This is not an attempt
> > > > > > at describing the firmware's full IOMMU setup to the kernel. In fact, it
> > > > > > is primarily targeted at cases where the firmware hasn't set up an IOMMU
> > > > > > at all, and we just want to make sure that when the kernel takes over
> > > > > > and does want to enable the IOMMU, that all the regions that are
> > > > > > actively being accessed by non-quiesced hardware (the most typical
> > > > > > example would be a framebuffer scanning out a splash screen or animation,
> > > > > > but it could equally well be some sort of welcoming tone or music being
> > > > > > played back) are described in device tree.
> > > > >
> > > > > In other words, and this is perhaps better answering y

Re: [PATCH v2 06/12] software_node: Add support for fwnode_graph*() family of functions

2020-12-18 Thread Daniel Scally

Hi Laurent - thanks for comments as always

On 18/12/2020 16:22, Laurent Pinchart wrote:
> Hi Daniel,
> 
> Thank you for the patch.
> 
> On Thu, Dec 17, 2020 at 11:43:31PM +, Daniel Scally wrote:
>> From: Heikki Krogerus 
>>
>> This implements the remaining .graph_* callbacks in the
>> fwnode operations structure for the software nodes. That makes
>> the fwnode_graph*() functions available in the drivers also
>> when software nodes are used.
>>
>> The implementation tries to mimic the "OF graph" as much as
>> possible, but there is no support for the "reg" device
>> property. The ports will need to have the index in their
>> name which starts with "port@" (for example "port@0", "port@1",
>> ...) and endpoints will use the index of the software node
>> that is given to them during creation. The port nodes can
>> also be grouped under a specially named "ports" subnode,
>> just like in DT, if necessary.
>>
>> The remote-endpoints are reference properties under the
>> endpoint nodes that are named "remote-endpoint".
>>
>> Signed-off-by: Heikki Krogerus 
>> Co-developed-by: Daniel Scally 
>> Signed-off-by: Daniel Scally 
>> ---
>> Changes in v2:
>>
>>  - Changed commit to specify port name prefix as port@
>>  - Accounted for that rename in *parse_endpoint()
>>
>>  drivers/base/swnode.c | 110 +-
>>  1 file changed, 109 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/base/swnode.c b/drivers/base/swnode.c
>> index 2b90d380039b..0d14d5ebe441 100644
>> --- a/drivers/base/swnode.c
>> +++ b/drivers/base/swnode.c
>> @@ -540,6 +540,110 @@ software_node_get_reference_args(const struct fwnode_handle *fwnode,
>>  return 0;
>>  }
>>  
>> +static struct fwnode_handle *
>> +swnode_graph_find_next_port(const struct fwnode_handle *parent,
>> +struct fwnode_handle *port)
>> +{
>> +struct fwnode_handle *old = port;
>> +
>> +while ((port = software_node_get_next_child(parent, old))) {
>> +if (!strncmp(to_swnode(port)->node->name, "port", 4))
> 
> Maybe we'll need to limit this to matching on "port" or "port@[0-9]+" to
> avoid false positives, but that can be done later, if needed.

Hmm yeah I guess that's a danger - ok, I'll stick it on the list.
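
A rough userspace sketch of the stricter match suggested above, accepting exactly "port" or "port@" followed by one or more decimal digits. demo_is_port_name() is a hypothetical helper, not something in the series; the posted code uses strncmp(name, "port", 4), which also accepts names such as "ports" or "portal".

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true for "port" or "port@<decimal digits>" only. */
static bool demo_is_port_name(const char *name)
{
	assert(name != NULL);

	if (strncmp(name, "port", 4) != 0)
		return false;
	name += 4;

	if (*name == '\0')
		return true;            /* bare "port" */

	if (*name++ != '@')
		return false;           /* rejects "ports", "portal", ... */

	if (!isdigit((unsigned char)*name))
		return false;           /* rejects "port@" and "port@x" */

	while (isdigit((unsigned char)*name))
		name++;

	return *name == '\0';       /* rejects trailing junk such as "port@1a" */
}
```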


>> +return port;
>> +old = port;
>> +}
>> +
>> +return NULL;
>> +}
>> +
>> +static struct fwnode_handle *
>> +software_node_graph_get_next_endpoint(const struct fwnode_handle *fwnode,
>> +  struct fwnode_handle *endpoint)
>> +{
>> +struct swnode *swnode = to_swnode(fwnode);
>> +struct fwnode_handle *old = endpoint;
>> +struct fwnode_handle *parent;
>> +struct fwnode_handle *port;
>> +
>> +if (!swnode)
>> +return NULL;
>> +
>> +if (endpoint) {
>> +port = software_node_get_parent(endpoint);
> 
> Here the reference count to port is incremented.
> 
>> +parent = software_node_get_parent(port);
>> +} else {
>> +parent = software_node_get_named_child_node(fwnode, "ports");
>> +if (!parent)
>> +parent = software_node_get(&swnode->fwnode);
>> +
>> +port = swnode_graph_find_next_port(parent, NULL);
> 
> But here it isn't, software_node_get_next_child() doesn't deal with
> reference counts.

Not in the kernel as it stands right now, but after patch 01 of this
series it does:

[PATCH v2 01/12] software_node: Fix refcounts in software_node_get_next_child()

I'm not sure that one linked to the thread correctly, but it's here if
you haven't seen it:

https://lore.kernel.org/linux-media/20201217234337.1983732-2-djrsca...@gmail.com/T/#u

The tl;dr of the change is that it will now get() the next node (if
found) and **always** put() if one is passed.
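
To make that contract concrete, here is a toy userspace model of the iterator semantics described above: take a reference on the next child if there is one, and always drop the reference on the child that was passed in. It is a sketch of the described behaviour only, not the code from patch 01/12; struct demo_node and the demo_*() functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted node with a simple sibling list. */
struct demo_node {
	int refs;
	struct demo_node *first_child;
	struct demo_node *next_sibling;
};

static struct demo_node *demo_get(struct demo_node *n)
{
	if (n)
		n->refs++;
	return n;
}

static void demo_put(struct demo_node *n)
{
	if (n) {
		assert(n->refs > 0);
		n->refs--;
	}
}

/*
 * get() the next child (if any), and always put() the one passed in.
 * A full walk therefore leaves every count unchanged; a caller that
 * breaks out early holds one reference on the returned child and has
 * to put() it itself.
 */
static struct demo_node *demo_get_next_child(struct demo_node *parent,
					     struct demo_node *child)
{
	struct demo_node *next;

	next = child ? child->next_sibling : parent->first_child;
	demo_get(next);
	demo_put(child);
	return next;
}
```

Under this model the net effect matches what is claimed for get_next_endpoint(): one reference held on whatever is returned, and no change to nodes that were merely walked past.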


>> +}
>> +
>> +for (; port; port = swnode_graph_find_next_port(parent, port)) {
> 
> So if the loop terminates normally, the reference acquired in the first
> branch of the if will be leaked.
> 
>> +endpoint = software_node_get_next_child(port, old);
>> +if (endpoint) {
>> +fwnode_handle_put(port);
> 
> While in this case the reference not acquired in the second branch of
> the if will be released incorrectly.
> 
> I think it's software_node_get_next_child() that needs to be fixed if
> I'm not mistaken.

I think that's all handled in software_node_get_next_child() as amended
by 01/12. The net effect of get_next_endpoint() should be one refcount
increased for any endpoint returned, and 0 change to parent and any ports.


>> +break;
>> +}
>> +
>> +/* No more endpoints for that port, so stop passing old */
>> +old = NULL;
> 
> I wonder if you could drop the 'old' variable and use 'endpoint' in the
> call to software_node_get_next_child(). You could then drop these two
> lines.

That won't work, because endpoint would at that point not be a child of
the port we're

Re: [PATCH v2 05/15] ia64: convert to legacy_timer_tick

2020-12-18 Thread John Paul Adrian Glaubitz

Hi Arnd!

On 12/18/20 11:07 PM, Arnd Bergmann wrote:
> Sorry for causing this bug, and thank you for bisecting it
> down to my patch.
> 
> Do you see any other strange behavior with that patch, or is
> this the only symptom you are aware of?

This seems to be the only issue I'm seeing so far. However, as I'm not
able to fully boot the system, I can't rule out other fallout once the
system is running.

>> I'm seeing this backtrace now:
>>
>> [  905.883273] usb 1-2: SerialNumber: A6002001
>> [  905.918170]  sda: sda1 sda2 sda3
>> [  905.920107] sd 0:1:0:0: [sda] Attached SCSI disk
>> [  905.944102] usb-storage 1-2:1.0: USB Mass Storage device detected
>> [  905.944102] scsi host1: usb-storage 1-2:1.0
>> [  905.944102] usbcore: registered new interface driver usb-storage
>> [  905.944117] usbcore: registered new interface driver uas
> 
> The timestamps show that time is moving forward, which is at least
> something. Do you have a feeling for whether the timestamps are
> counting in (roughly) the correct speed, or is it going much faster
> or slower than it should?
> 
> To clarify: the [905.944117] numbers are seconds/microseconds
> since boot, so message would be 906 seconds after the kernel
> started.

No, that would be definitely off. I hadn't had the machine up and running
for 15 minutes. This issue showed right after boot.

>> Begin: Loading essential drivers ... done.
>> Begin: Running /scripts/init-premount ... done.
>> Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
> 
> Ok, so it gets into user space. Is this initramfs or the actual
> read-only root?

This is using an initramfs.

>> [  906.666923] hpsa :05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access HP   LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1
>> [  906.670923] hpsa :05:00.0: device is ready.
>> [  906.670923] hpsa :05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access HP   LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1
>> done.
>> [  906.722166] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
>> [  906.722166] rcu: 2-: (3 ticks this GP) idle=fe6/1/0x4000 softirq=693/698 fqs=4
>> [  906.722166]  (detected by 0, t=6115 jiffies, g=465, q=80)
> This appears to be an 'rcu stall' warning, from print_cpu_stall_info(),
> indicating that timer ticks are missing.

OK.

>> [  909.360108] INFO: task systemd-sysv-ge:200 blocked for more than 127 seconds.
>> [  909.360108]   Not tainted 5.10.0+ #130
>> [  909.360108] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [  909.360108] task:systemd-sysv-ge state:D stack:0 pid:  200 ppid:   189 flags:0x
>> [  909.364108]
>> [  909.364108] Call Trace:
>> [  909.364423]  [] __schedule+0x890/0x21e0
>> [  909.364423] sp=e100487d7b70 bsp=e100487d1748
>> [  909.368423]  [] schedule+0xa0/0x240
>> [  909.368423] sp=e100487d7b90 bsp=e100487d16e0
>> [  909.368558]  [] io_schedule+0x70/0xa0
>> [  909.368558] sp=e100487d7b90 bsp=e100487d16c0
>> [  909.372290]  [] bit_wait_io+0x20/0xe0
>> [  909.372290] sp=e100487d7b90 bsp=e100487d1698
>> [  909.374168] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
>> [  909.376290]  [] __wait_on_bit+0xc0/0x1c0
>> [  909.376290] sp=e100487d7b90 bsp=e100487d1648
>> [  909.374168] rcu: 3-: (2 ticks this GP) idle=19e/1/0x4002 softirq=1581/1581 fqs=2
>> [  909.374168]  (detected by 0, t=5661 jiffies, g=1089, q=3)
>> [  909.376290]  [] out_of_line_wait_on_bit+0x120/0x140
>> [  909.376290] sp=e100487d7b90 bsp=e100487d1610
>> [  909.374168] Task dump for CPU 3:
>> [  909.374168] task:khungtaskd  state:R  running task
> 
> and this seems to be another instance of the same. I would assume that this
> is completely unrelated to the block driver and just happened to trigger
> during the same time the driver was doing something.
> 
> Can you see in your full logs if the "Oops: timer tick before it's due"
> warning triggered at any point?

It's difficult, to be honest. The problem is that the above message spams the
whole kernel buffer to the point that the buffer of the built-in serial console
is filled up. So I'm not sure if I've seen this message.

> I've attached a patch for a partial revert of my original change, this
> should still work with the final cleanup on top, but restore the loop
> plus the local_irq_enable()/local_irq_di
