[PATCH] m68k: add missing __user annotation in get_user()

2020-05-19 Thread Jason Wang
The ptr is a pointer to userspace memory, so we need to annotate it with
__user; otherwise we may get sparse warnings like:

drivers/vhost/vhost.c:1603:13: sparse: sparse: incorrect type in initializer
(different address spaces) @@expected void const *__gu_ptr @@got
unsigned int [noderef] [usertype] *idxp @@
drivers/vhost/vhost.c:1603:13: sparse:expected void const *__gu_ptr
drivers/vhost/vhost.c:1603:13: sparse:got unsigned int [noderef] [usertype] *idxp

Cc: Geert Uytterhoeven 
Cc: linux-m...@lists.linux-m68k.org
Cc: linux-kernel@vger.kernel.org
Reported-by: kbuild test robot 
Signed-off-by: Jason Wang 
---
 arch/m68k/include/asm/uaccess_mm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/m68k/include/asm/uaccess_mm.h 
b/arch/m68k/include/asm/uaccess_mm.h
index 7e85de984df1..9ae9f8d05925 100644
--- a/arch/m68k/include/asm/uaccess_mm.h
+++ b/arch/m68k/include/asm/uaccess_mm.h
@@ -142,7 +142,7 @@ asm volatile ("\n"  \
__get_user_asm(__gu_err, x, ptr, u32, l, r, -EFAULT);   \
break;  \
case 8: {   \
-   const void *__gu_ptr = (ptr);   \
+   const void __user *__gu_ptr = (ptr);\
union { \
u64 l;  \
__typeof__(*(ptr)) t;   \
-- 
2.20.1
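
A minimal userspace sketch of why the annotation matters. It assumes that
__user expands to a noderef pointer in address space 1 when sparse runs
(__CHECKER__ defined) and to nothing otherwise; the exact spelling in the
kernel headers may differ. read_u64_from_user() is a hypothetical stand-in
for the 8-byte case of the m68k get_user() macro, not the real code. Under
sparse, dropping __user from gu_ptr is what would trigger the "different
address spaces" warning quoted in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Assumed definitions: only sparse (__CHECKER__) sees the address space. */
#ifdef __CHECKER__
# define __user  __attribute__((noderef, address_space(1)))
# define __force __attribute__((force))
#else
# define __user
# define __force
#endif

/* Hypothetical stand-in for the 8-byte case of get_user(). */
int read_u64_from_user(uint64_t *dst, const uint64_t __user *ptr)
{
	/* The temporary keeps the __user qualifier, as in the patch. */
	const void __user *gu_ptr = ptr;

	assert(dst != NULL);
	if (!gu_ptr)
		return -EFAULT;

	/* A real kernel would use an exception-protected access here. */
	memcpy(dst, (__force const void *)gu_ptr, sizeof(*dst));
	return 0;
}
```

With a regular compiler both macros vanish, so the sketch builds and runs
like ordinary C; the annotation only has an effect during a sparse run.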



[PATCH] dmaengine: mmp_pdma: Do not warn when IRQ is shared by all chans

2020-05-19 Thread Lubomir Rintel
When there's a single interrupt for all the DMA channels, the
unsuccessful attempt to request separate IRQs emits useless warnings:

  [1.370381] mmp-pdma d400.dma: IRQ index 1 not found
  ...
  [1.412398] mmp-pdma d400.dma: IRQ index 15 not found
  [1.418308] mmp-pdma d400.dma: initialized 16 channels

Avoid that, treating the IRQs as optional.

Signed-off-by: Lubomir Rintel 
---
 drivers/dma/mmp_pdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/mmp_pdma.c b/drivers/dma/mmp_pdma.c
index ad06f260e907..41c542eaa23a 100644
--- a/drivers/dma/mmp_pdma.c
+++ b/drivers/dma/mmp_pdma.c
@@ -1060,7 +1060,7 @@ static int mmp_pdma_probe(struct platform_device *op)
pdev->dma_channels = dma_channels;
 
for (i = 0; i < dma_channels; i++) {
-   if (platform_get_irq(op, i) > 0)
+   if (platform_get_irq_optional(op, i) > 0)
irq_num++;
}
 
-- 
2.26.2
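
The difference between the two lookups can be modelled in userspace as
follows. This is a sketch under stated assumptions, not the platform bus
code: struct model_pdev and all model_* functions are hypothetical, and the
only behaviour modelled is that the non-optional lookup logs a message when
the index is missing while the optional one stays silent.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

struct model_pdev {
	const int *irqs;	/* IRQ numbers described for the device */
	int nr_irqs;
	int warnings;		/* number of "not found" messages logged */
};

/* Model of the optional lookup: a missing index is not an error to log. */
int model_get_irq_optional(struct model_pdev *pdev, int index)
{
	assert(pdev != NULL && index >= 0);
	return index < pdev->nr_irqs ? pdev->irqs[index] : -ENXIO;
}

/* Model of the plain lookup: same result, plus a log message on failure. */
int model_get_irq(struct model_pdev *pdev, int index)
{
	int irq = model_get_irq_optional(pdev, index);

	if (irq < 0) {
		fprintf(stderr, "IRQ index %d not found\n", index);
		pdev->warnings++;
	}
	return irq;
}

/* Mirrors the probe loop: count how many channels have their own IRQ. */
int model_count_channel_irqs(struct model_pdev *pdev, int channels, int optional)
{
	int i, irq_num = 0;

	for (i = 0; i < channels; i++) {
		int irq = optional ? model_get_irq_optional(pdev, i)
				   : model_get_irq(pdev, i);
		if (irq > 0)
			irq_num++;
	}
	return irq_num;
}
```

With one shared IRQ and 16 channels, the plain lookup logs for indexes 1
through 15, which matches the log excerpt in the commit message; the count
of usable IRQs is the same either way.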



Re: [PATCH v3 1/4] dt-bindings: iio: magnetometer: ak8975: convert txt format to yaml

2020-05-19 Thread Jonathan Albrieux
On Tue, May 19, 2020 at 06:37:05PM +0100, Jonathan Cameron wrote:
> On Tue, 19 May 2020 18:44:33 +0200
> Jonathan Albrieux  wrote:
> 
> > On Tue, May 19, 2020 at 07:01:37PM +0300, Andy Shevchenko wrote:
> > > On Tue, May 19, 2020 at 04:03:54PM +0200, Jonathan Albrieux wrote:  
> > > > On Tue, May 19, 2020 at 03:22:07PM +0200, Stephan Gerhold wrote:  
> > > > > On Tue, May 19, 2020 at 02:43:51PM +0200, Jonathan Albrieux wrote:  
> > > 
> > > ...
> > >   
> > > > > > +maintainers:
> > > > > > +  - can't find a mantainer, author is Laxman Dewangan 
> > > > > >   
> > > > > 
> > > > > Should probably add someone here, although I'm not sure who either.
> > > > >   
> > > > 
> > > > Yep I couldn't find a maintainer for that driver..what to do in this 
> > > > case?  
> > > 
> > > Volunteer yourself!
> > >   
> > 
> > While I'd really like to, I have to decline the offer as I currently don't
> > have enough knowledge to become a maintainer :-) but thank you! (Who knows,
> > maybe in a couple of years!) Now I'll make the final edits and will submit a
> > new patchset soon with all the changes
> 
> Don't be so hard on yourself.  We all get thrown in at the deep end :)
> 
> Note that being a driver maintainer (or even just the binding) really
> just means you get cc'd on the patches and I'll make sure you've had time
> to review them if you wish.   Best of all, if you have hardware (and time)
> being able to test them, that is extremely useful (whether you are
> maintaining the driver or not!) 
> 
> I closely review the majority of stuff that comes through IIO and in
> the case of bindings we also have Rob and co. doing an amazing job.
> We have some excellent additional reviewers who review IIO stuff all the
> time, some of which have reviewed your patch I see.  Without them I'd
> never survive the deluge.
> 
> Of course it's entirely your decision, but I'd definitely encourage you
> to give it a go.
> 
> Thanks,
> 
> Jonathan
> 

Thank you for your encouraging words and for the trust! As a tester I will
be very pleased to help with this hardware, but as a maintainer I could
contribute little to nothing at the moment. I'm not being hard on myself;
I really have to focus on the basic concepts first, and I'm lucky enough
to have willing people helping me to do so :-)

Becoming the maintainer right after my first contribution feels like
skipping some fundamental stages. I really hope you understand!

> 
> 
> > 
> > > -- 
> > > With Best Regards,
> > > Andy Shevchenko
> > > 
> > >   
> > 
> > Best regards,
> > Jonathan Albrieux
> 
> 

Best regards,
Jonathan Albrieux


[PATCH] dmaengine: mmp_tdma: share the IRQ line

2020-05-19 Thread Lubomir Rintel
On a MMP2, the DMA interrupt is shared by all channels of the peripheral
DMA controller and the audio DMA controller. Both drivers can identify
their interrupts, but only the PDMA driver marks the line shared:

  [1.185782] mmp-pdma d400.dma: initialized 16 channels
  [1.186808] mmp-tdma d42a0800.adma: IRQ index 1 not found
  [1.194317] genirq: Flags mismatch irq 64.  (tdma) vs. 0080 (pdma)
  [1.197894] mmp-tdma: probe of d42a0800.adma failed with error -16

Let's turn on IRQF_SHARED in the ADMA driver as well.

Signed-off-by: Lubomir Rintel 
---
 drivers/dma/mmp_tdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/mmp_tdma.c b/drivers/dma/mmp_tdma.c
index dbc6a48424fa..960c7c40aef7 100644
--- a/drivers/dma/mmp_tdma.c
+++ b/drivers/dma/mmp_tdma.c
@@ -682,7 +682,7 @@ static int mmp_tdma_probe(struct platform_device *pdev)
if (irq_num != chan_num) {
irq = platform_get_irq(pdev, 0);
ret = devm_request_irq(&pdev->dev, irq,
-   mmp_tdma_int_handler, 0, "tdma", tdev);
+   mmp_tdma_int_handler, IRQF_SHARED, "tdma", tdev);
if (ret)
return ret;
}
-- 
2.26.2
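
The "Flags mismatch" failure can be illustrated with a small userspace
model. This is a sketch, not the genirq implementation: struct
model_irq_line and model_request_irq() are hypothetical, and the only rule
modelled is that a second handler may join a line only when both the
existing and the new requester asked for sharing. MODEL_IRQF_SHARED uses
the value 0x80, in line with the flags shown in the log above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_IRQF_SHARED 0x80UL

struct model_irq_line {
	int users;		/* handlers currently installed */
	unsigned long flags;	/* flags of the first requester */
};

/* Model of requesting a handler on an interrupt line. */
int model_request_irq(struct model_irq_line *line, unsigned long flags)
{
	assert(line != NULL);

	if (line->users > 0) {
		/* Both sides must have agreed to share the line. */
		if (!(line->flags & flags & MODEL_IRQF_SHARED))
			return -EBUSY;
	} else {
		line->flags = flags;
	}

	line->users++;
	return 0;
}
```

In this model, the PDMA driver's shared request succeeds, a later request
with flags 0 fails with -EBUSY (the "error -16" in the log), and the same
request with the shared flag set succeeds.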



RE: [PATCH 0/3] arm64: perf: Add support for Perf NMI interrupts

2020-05-19 Thread Song Bao Hua
> 
> On 5/18/20 11:45 AM, Mark Rutland wrote:
> > Hi all,
> >
> > On Mon, May 18, 2020 at 02:26:00PM +0800, Lecopzer Chen wrote:
> >> HI Sumit,
> >>
> >> Thanks for your information.
> >>
> >> I've already implemented IPI (same as you did [1], little difference
> >> in detail), hardlockup detector and perf in last year(2019) for
> >> debuggability.
> >> And now we tend to upstream to reduce kernel maintaining effort.
> >> I'm glad if someone in ARM can do this work :)
> >>
> >> Hi Julien,
> >>
> >> Does any Arm maintainers can proceed this action?
> > Alexandru (Cc'd) has been rebasing and reworking Julien's patches,
> > which is my preferred approach.
> >
> > I understand that's not quite ready for posting since he's
> > investigating some of the nastier subtleties (e.g. mutual exclusion
> > with the NMI), but maybe we can put the work-in-progress patches
> > somewhere in the mean time.
> >
> > Alexandru, do you have an idea of what needs to be done, and/or when
> > you expect you could post that?
> 
> I'm currently working on rebasing the patches on top of 5.7-rc5, when I have
> something usable I'll post a link (should be a couple of days). After that I 
> will
> address the review comments, and I plan to do a thorough testing because I'm
> not 100% confident that some of the assumptions around the locks that were
> removed are correct. My guess is this will take a few weeks.

+1
It would be awesome if the perf NMI patches could be revived. Right now, it
seems hard to run "perf annotate" on a kernel function that disables local
IRQs.

func()
{
local_irq_save();
.

local_irq_restore();
return;
}

Perf will report all cycles as spent at the very end of func().

Thanks,
Barry

> 
> Thanks,
> Alex
> >
> > Thanks,
> > Mark.
> >
> >> This is really useful in debugging.
> >> Thank you!!
> >>
> >>
> >>
> >> [1] https://lkml.org/lkml/2020/4/24/328
> >>
> >>
> >> Lecopzer
> >>
> >> On Mon, May 18, 2020 at 1:46 PM, Sumit Garg wrote:
> >>> + Julien
> >>>
> >>> Hi Lecopzer,
> >>>
> >>> On Sat, 16 May 2020 at 18:20, Lecopzer Chen 
> wrote:
>  This series implements Perf NMI functionality and depends on Pseudo
>  NMI [1], which has been upstreamed.
> 
>  In arm64 with GICv3, Pseudo NMI was implemented for NMI-like
>  interrupts.
>  That can be extended to Perf NMI, which is the prerequisite for the
>  hard-lockup detector, which already has a standard interface inside Linux.
> 
>  Thus the first step we need to implement perf NMI interface and
>  make sure it works fine.
> 
> >>> This is something that is already implemented via Julien's patch-set
> >>> [1]. Its v4 has been floating since July, 2019 and I couldn't find
> >>> any major blocking comments but not sure why things haven't
> >>> progressed further.
> >>>
> >>> Maybe Julien or Arm maintainers can provide updates on existing
> >>> patch-set [1] and how we should proceed further with this
> >>> interesting feature.
> >>>
> >>> And regarding hard-lockup detection, I have been able to enable it
> >>> based on perf NMI events using Julien's perf patch-set [1]. Have a
> >>> look at the patch here [2].
> >>>
> >>> [1] https://patchwork.kernel.org/cover/11047407/
> >>> [2]
> >>> http://lists.infradead.org/pipermail/linux-arm-kernel/2020-May/73222
> >>> 7.html
> >>>
> >>> -Sumit
> >>>
>  Perf NMI has been tested with dd if=/dev/urandom of=/dev/null, as the
>  link [2] did.
> 
>  [1] https://lkml.org/lkml/2019/1/31/535
>  [2] https://www.linaro.org/blog/debugging-arm-kernels-using-nmifiq
> 
> 
>  Lecopzer Chen (3):
>    arm_pmu: Add support for perf NMI interrupts registration
>    arm64: perf: Support NMI context for perf event ISR
>    arm64: Kconfig: Add support for the Perf NMI
> 
>   arch/arm64/Kconfig | 10 +++
>   arch/arm64/kernel/perf_event.c | 36 ++--
>   drivers/perf/arm_pmu.c | 51
> ++
>   include/linux/perf/arm_pmu.h   |  6 
>   4 files changed, 88 insertions(+), 15 deletions(-)
> 
>  --
>  2.25.1



Re: [PATCH] iommu: Don't call .probe_finalize() under group->mutex

2020-05-19 Thread Yong Wu
On Tue, 2020-05-19 at 15:28 +0200, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> The .probe_finalize() call-back of some IOMMU drivers calls into
> arm_iommu_attach_device(). This function will call back into the
> IOMMU core code, where it tries to take group->mutex again, resulting
> in a deadlock.
> 
> As there is no reason why .probe_finalize() needs to be called under
> that mutex, move it after the lock has been released to fix the
> deadlock.
> 
> Cc: Yong Wu 
> Reported-by: Yong Wu 
> Fixes: deac0b3bed26 ("iommu: Split off default domain allocation from group 
> assignment")
> Signed-off-by: Joerg Roedel 

Tested-by: Yong Wu 

Tested on MediaTek-v1 mt2701 evb board.
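
The ordering problem described in the quoted commit message can be sketched
in userspace as follows. All names are hypothetical; the "mutex" is a
non-recursive flag that reports -EDEADLK on re-entry instead of hanging, so
the model terminates where the real code would deadlock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct model_iommu_group {
	bool locked;	/* stands in for group->mutex */
};

int model_group_lock(struct model_iommu_group *g)
{
	if (g->locked)
		return -EDEADLK;	/* the real mutex would block forever */
	g->locked = true;
	return 0;
}

void model_group_unlock(struct model_iommu_group *g)
{
	assert(g->locked);
	g->locked = false;
}

/* Stand-in for a driver callback that re-enters the core and locks again. */
int model_probe_finalize(struct model_iommu_group *g)
{
	int ret = model_group_lock(g);

	if (ret)
		return ret;
	model_group_unlock(g);
	return 0;
}

int model_probe_device(struct model_iommu_group *g, bool finalize_under_lock)
{
	int ret = model_group_lock(g);

	if (ret)
		return ret;

	if (finalize_under_lock) {
		/* Old order: the callback runs with the lock still held. */
		ret = model_probe_finalize(g);
		model_group_unlock(g);
		return ret;
	}

	/* New order: drop the lock first, then call the callback. */
	model_group_unlock(g);
	return model_probe_finalize(g);
}
```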


Re: [PATCH] dma-fence: add might_sleep annotation to _wait()

2020-05-19 Thread Christian König

Am 19.05.20 um 15:27 schrieb Daniel Vetter:

Do it unconditionally; there's a separate peek function,
dma_fence_is_signaled(), which can be called from atomic context.

v2: Consensus calls for an unconditional might_sleep (Chris,
Christian)

Full audit:
- dma-fence.h: Uses MAX_SCHEDULE_TIMEOUT, good chance this sleeps
- dma-resv.c: Timeout always at least 1
- st-dma-fence.c: Safe to sleep in testcases
- amdgpu_cs.c: Both callers are for variants of the wait ioctl
- amdgpu_device.c: Two callers in vram recover code, both right next
   to mutex_lock.
- amdgpu_vm.c: Use in the vm_wait ioctl, next to _reserve/unreserve
- remaining functions in amdgpu: All for test_ib implementations for
   various engines, caller for that looks all safe (debugfs, driver
   load, reset)
- etnaviv: another wait ioctl
- habanalabs: another wait ioctl
- nouveau_fence.c: hardcoded 15*HZ ... glorious
- nouveau_gem.c: hardcoded 2*HZ ... so not even super consistent, but
   this one does have a WARN_ON :-/ At least this one is only a
   fallback path for when kmalloc fails. Maybe this should be put onto
   some worker list instead of a work per unmap ...
- i915/selftests: Hardcoded HZ / 4 or HZ / 8
- i915/gt/selftests: Going up the callchain looks safe looking at
   nearby callers
- i915/gt/intel_gt_requests.c: Wrapped in a mutex_lock
- i915/gem_i915_gem_wait.c: The i915-version which is called instead
   for i915 fences already has a might_sleep() annotation, so all good

Cc: Alex Deucher 
Cc: Lucas Stach 
Cc: Jani Nikula 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
Cc: Ben Skeggs 
Cc: "VMware Graphics" 
Cc: Oded Gabbay 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: linux-r...@vger.kernel.org
Cc: amd-...@lists.freedesktop.org
Cc: intel-...@lists.freedesktop.org
Cc: Chris Wilson 
Cc: Maarten Lankhorst 
Cc: Christian König 
Signed-off-by: Daniel Vetter 


Reviewed-by: Christian König 


---
  drivers/dma-buf/dma-fence.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 90edf2b281b0..656e9ac2d028 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -208,6 +208,8 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, 
signed long timeout)
if (WARN_ON(timeout < 0))
return -EINVAL;
  
+	might_sleep();
+
trace_dma_fence_wait_start(fence);
if (fence->ops->wait)
ret = fence->ops->wait(fence, intr, timeout);
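
Why the annotation is unconditional can be shown with a small userspace
model. Everything here is a hypothetical sketch: the counters stand in for
the kernel's atomic-context tracking, and model_fence_wait_timeout() only
models the order of the checks, not real fence waiting. The point is that
the check fires even on a call that happens not to sleep, so a bad caller
is caught on every run and not only on the rare slow path.

```c
#include <assert.h>

static int model_atomic_depth;		/* > 0 means "atomic context" */
static int model_sleep_warnings;	/* how many violations were flagged */

void model_atomic_begin(void)
{
	model_atomic_depth++;
}

void model_atomic_end(void)
{
	assert(model_atomic_depth > 0);
	model_atomic_depth--;
}

/* Model of the annotation: flag the caller, whether or not we sleep. */
void model_might_sleep(void)
{
	if (model_atomic_depth > 0)
		model_sleep_warnings++;
}

/* Returns 1 if the fence was already signaled, 0 on (modelled) timeout. */
long model_fence_wait_timeout(int signaled, long timeout)
{
	if (timeout < 0)
		return -22;	/* -EINVAL */

	model_might_sleep();	/* unconditional, before any fast path */

	return signaled ? 1 : 0;
}
```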




RE: [PATCH 3/3] arm64: dts: imx8mp: add mu node

2020-05-19 Thread Aisheng Dong
> From: Fabio Estevam 
> Sent: Wednesday, May 20, 2020 11:07 AM
> 
> Hi Peng,
> 
> On Wed, May 20, 2020 at 12:01 AM Peng Fan  wrote:
> 
> > Nothing specific in i.MX8MP for the mu part, so do we really need add
> > "fsl,imx8mp-mu"?
> 
> It is good practice to add a more specific option.
> 
> Let's say in future a bug is found that affects imx8mp MU, then you could fix 
> the
> MU driver and keep the dtb compatibility.

+1
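
The compatibility argument above can be sketched with a small matching
model. This is an illustration only: struct model_match and
model_match_node() are hypothetical, and the rule modelled is simply that
the node's compatible strings are tried from most specific to most generic.
The quirks field is an assumed placeholder for a future SoC-specific fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct model_match {
	const char *compatible;
	int quirks;	/* hypothetical per-SoC workaround flags */
};

/* Return the driver entry for the most specific string the driver knows. */
const struct model_match *model_match_node(const char *const *node_compat,
					   size_t nr_compat,
					   const struct model_match *table,
					   size_t nr_entries)
{
	size_t i, j;

	assert(node_compat != NULL && table != NULL);

	for (i = 0; i < nr_compat; i++)
		for (j = 0; j < nr_entries; j++)
			if (strcmp(node_compat[i], table[j].compatible) == 0)
				return &table[j];
	return NULL;
}
```

A node that lists "fsl,imx8mp-mu" before "fsl,imx6sx-mu" still binds to a
driver that only knows the generic string, and it picks up the specific
entry once a driver adds one, without any change to the dtb. A node that
lists only the generic string can never be told apart later.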


[RFC PATCH] arm64: dts: rockchip: fix dmas dma-names for rk3308 i2s node

2020-05-19 Thread Johan Jonker
One of the current rk3308 'i2s' nodes has a different dma layout
with only 1 item. Table 9-2 (DMAC1 Request Mapping Table) shows that
there are 2 dma sources available, so fix the dmas and dma-names
for the rk3308 'i2s' node.

10 I2S/PCM_2CH_1 tx High level
11 I2S/PCM_2CH_1 rx High level

Signed-off-by: Johan Jonker 
---
 arch/arm64/boot/dts/rockchip/rk3308.dtsi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/rockchip/rk3308.dtsi 
b/arch/arm64/boot/dts/rockchip/rk3308.dtsi
index ac7f69407..79c1dd1fe 100644
--- a/arch/arm64/boot/dts/rockchip/rk3308.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3308.dtsi
@@ -564,8 +564,8 @@
interrupts = ;
clocks = <&cru SCLK_I2S1_2CH>, <&cru HCLK_I2S1_2CH>;
clock-names = "i2s_clk", "i2s_hclk";
-   dmas = <&dmac1 11>;
-   dma-names = "rx";
+   dmas = <&dmac1 10>, <&dmac1 11>;
+   dma-names = "tx", "rx";
resets = <&cru SRST_I2S1_2CH_M>, <&cru SRST_I2S1_2CH_H>;
reset-names = "reset-m", "reset-h";
status = "disabled";
-- 
2.11.0



[ISSUE] acpi_cpufreq is added and removed frequently

2020-05-19 Thread Feng Li
Hi experts,

On my CentOS 7 system, `udevadm monitor` reports the following events in
rapid succession:

UDEV  [14258.464055] change
/devices/LNXSYSTM:00/device:00/ACPI0004:01/LNXCPU:4d (acpi)
KERNEL[14258.464065] add  /module/acpi_cpufreq (module)
KERNEL[14258.471130] remove   /module/acpi_cpufreq (module)
UDEV  [14258.473672] change
/devices/LNXSYSTM:00/device:00/ACPI0004:01/LNXCPU:4a (acpi)
KERNEL[14258.473684] add  /module/acpi_cpufreq (module)
KERNEL[14258.482001] remove   /module/acpi_cpufreq (module)
UDEV  [14258.485059] change
/devices/LNXSYSTM:00/device:00/ACPI0004:01/LNXCPU:4f (acpi)
KERNEL[14258.485070] add  /module/acpi_cpufreq (module)
KERNEL[14258.495195] remove   /module/acpi_cpufreq (module)


What's wrong with the system?

OS: CentOS 7.6
Kernel: 3.10.0-1062.1.2.el7 / kernel-4.18.0-147

[root@67_95 14:46:21 ~]$lscpu
Architecture:  x86_64
CPU op-mode(s):32-bit, 64-bit
Byte Order:Little Endian
CPU(s):16
On-line CPU(s) list:   0-15
Thread(s) per core:2
Core(s) per socket:4
Socket(s): 2
NUMA node(s):  2
Vendor ID: GenuineIntel
CPU family:6
Model: 85
Model name:Intel(R) Xeon(R) Gold 5122 CPU @ 3.60GHz
Stepping:  4
CPU MHz:   3677.124
CPU max MHz:   3700.
CPU min MHz:   1200.
BogoMIPS:  7200.00
Virtualization:VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache:  1024K
L3 cache:  16896K
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep
mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht
tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs
bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni
pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16
xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch
epb cat_l3 cdp_l3 invpcid_single intel_ppin intel_pt ssbd mba ibrs
ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust
bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f
avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl
xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total
cqm_mbm_local dtherm ida arat pln pts pku ospke spec_ctrl intel_stibp
flush_l1d

Any tips?
Thanks in advance.


[PATCH 2/2] firewire-core: obsolete cast of function callback

2020-05-19 Thread Takashi Sakamoto
This commit removes the cast of the function callback to assist the
effort toward Control Flow Integrity builds.

Reported-by: Oscar Carter 
Reference: 
https://lore.kernel.org/lkml/20200519173425.4724-1-oscar.car...@gmx.com/
Signed-off-by: Takashi Sakamoto 
---
 drivers/firewire/core-cdev.c | 44 +++-
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c
index 6e291d8f3a27..f1e83396dd22 100644
--- a/drivers/firewire/core-cdev.c
+++ b/drivers/firewire/core-cdev.c
@@ -957,7 +957,6 @@ static int ioctl_create_iso_context(struct client *client, 
union ioctl_arg *arg)
 {
struct fw_cdev_create_iso_context *a = &arg->create_iso_context;
struct fw_iso_context *context;
-   fw_iso_callback_t cb;
int ret;
 
BUILD_BUG_ON(FW_CDEV_ISO_CONTEXT_TRANSMIT != FW_ISO_CONTEXT_TRANSMIT ||
@@ -965,32 +964,35 @@ static int ioctl_create_iso_context(struct client 
*client, union ioctl_arg *arg)
 FW_CDEV_ISO_CONTEXT_RECEIVE_MULTICHANNEL !=
FW_ISO_CONTEXT_RECEIVE_MULTICHANNEL);
 
-   switch (a->type) {
-   case FW_ISO_CONTEXT_TRANSMIT:
-   if (a->speed > SCODE_3200 || a->channel > 63)
-   return -EINVAL;
-
-   cb = iso_callback;
-   break;
+   if (a->type != FW_ISO_CONTEXT_RECEIVE_MULTICHANNEL) {
+   fw_iso_callback_t cb;
 
-   case FW_ISO_CONTEXT_RECEIVE:
-   if (a->header_size < 4 || (a->header_size & 3) ||
-   a->channel > 63)
-   return -EINVAL;
+   switch (a->type) {
+   case FW_ISO_CONTEXT_TRANSMIT:
+   if (a->speed > SCODE_3200 || a->channel > 63)
+   return -EINVAL;
 
-   cb = iso_callback;
-   break;
+   cb = iso_callback;
+   break;
 
-   case FW_ISO_CONTEXT_RECEIVE_MULTICHANNEL:
-   cb = (fw_iso_callback_t)iso_mc_callback;
-   break;
+   case FW_ISO_CONTEXT_RECEIVE:
+   if (a->header_size < 4 || (a->header_size & 3) ||
+   a->channel > 63)
+   return -EINVAL;
 
-   default:
-   return -EINVAL;
-   }
+   cb = iso_callback;
+   break;
+   default:
+   return -EINVAL;
+   }
 
-   context = fw_iso_context_create(client->device->card, a->type,
+   context = fw_iso_context_create(client->device->card, a->type,
a->channel, a->speed, a->header_size, cb, client);
+   } else {
+   context = fw_iso_mc_context_create(client->device->card,
+   a->type, a->channel, a->speed, a->header_size,
+   iso_mc_callback, client);
+   }
if (IS_ERR(context))
return PTR_ERR(context);
if (client->version < FW_CDEV_VERSION_AUTO_FLUSH_ISO_OVERFLOW)
-- 
2.25.1



[PATCH 0/2] firewire: obsolete cast of function callback toward CFI

2020-05-19 Thread Takashi Sakamoto
Hi,

Oscar Carter is working on Control Flow Integrity builds, for which any
cast of a function callback is a problem. Unfortunately, the current
firewire-core driver code includes such a cast[1], and Oscar posted some
patches to remove it[2]. The patch itself is good. However, it changes an
existing kernel API, and every driver that uses the API is affected by
the change.

This patchset is an alternative: it adds a new kernel API specific to the
multichannel isoc context. The existing kernel API and its drivers are
left as is.

In practice, no in-kernel driver uses the additional API. Although the
API is exported in this patchset, it would be worth discussing whether to
unexport it.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/firewire/core-cdev.c#n985
[2] https://lore.kernel.org/lkml/20200519173425.4724-1-oscar.car...@gmx.com/

Regards

Takashi Sakamoto (2):
  firewire-core: add kernel API to construct multichannel isoc context
  firewire-core: obsolete cast of function callback

 drivers/firewire/core-cdev.c | 44 +++-
 drivers/firewire/core-iso.c  | 17 ++
 include/linux/firewire.h |  3 +++
 3 files changed, 43 insertions(+), 21 deletions(-)

-- 
2.25.1



[PATCH v2] tty: hvc: Fix data abort due to race in hvc_open

2020-05-19 Thread Raghavendra Rao Ananta
Potentially, hvc_open() can be called in parallel when two tasks call
open() on /dev/hvcX. In such a scenario, if the hp->ops->notifier_add()
callback fails in one of them, that call sets tty->driver_data to NULL;
the parallel hvc_open() can then see this NULL and cause a memory abort.
Hence, do a NULL check at the beginning, before proceeding.

The issue can be easily reproduced by launching two tasks simultaneously
that each call open() on /dev/hvcX.
For example:
$ cat /dev/hvc0 & cat /dev/hvc0 &

Cc: sta...@vger.kernel.org
Signed-off-by: Raghavendra Rao Ananta 
---
 drivers/tty/hvc/hvc_console.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 436cc51c92c3..80709f754cc8 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -350,6 +350,9 @@ static int hvc_open(struct tty_struct *tty, struct file * 
filp)
unsigned long flags;
int rc = 0;

+   if (!hp)
+   return -ENODEV;
+
spin_lock_irqsave(&hp->port.lock, flags);
/* Check and then increment for fast path open. */
if (hp->port.count++ > 0) {
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
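
The guard added by the patch can be sketched in userspace like this. The
struct and function names are hypothetical, and the model covers only the
NULL check itself; it does not model the locking or the timing of the two
racing tasks.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_hvc {
	int count;	/* stands in for hp->port.count */
};

struct model_tty {
	struct model_hvc *driver_data;
};

/* Model of the open path with the early NULL check from the patch. */
int model_hvc_open(struct model_tty *tty)
{
	struct model_hvc *hp;

	assert(tty != NULL);

	hp = tty->driver_data;
	if (!hp)
		return -ENODEV;	/* a failed parallel open cleared the pointer */

	hp->count++;		/* without the check, this would fault on NULL */
	return 0;
}
```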


Re: [PATCH] MIPS: SGI-IP27: Remove not used includes and comment in ip27-timer.c

2020-05-19 Thread Thomas Bogendoerfer
On Wed, May 20, 2020 at 01:12:37PM +0800, Tiezhu Yang wrote:
> After commit 0ce5ebd24d25 ("mfd: ioc3: Add driver for SGI IOC3 chip"),
> the related includes and comment about ioc3 are not used any more in
> ip27-timer.c, remove them.
> 
> Signed-off-by: Tiezhu Yang 
> ---
>  arch/mips/sgi-ip27/ip27-timer.c | 5 -
>  1 file changed, 5 deletions(-)

applied to mips-next.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.[ RFC1925, 2.3 ]


Re: [PATCH] MIPS: ingenic: Add missing include

2020-05-19 Thread Thomas Bogendoerfer
On Tue, May 19, 2020 at 11:22:30PM +0200, Paul Cercueil wrote:
> Add missing include which adds the prototype to plat_time_init().
> 
> Fixes: f932449c11da ("MIPS: ingenic: Drop obsolete code, merge the rest in 
> setup.c")
> Signed-off-by: Paul Cercueil 
> Reported-by: kbuild test robot 
> ---
>  arch/mips/jz4740/setup.c | 1 +
>  1 file changed, 1 insertion(+)

applied to mips-next.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.[ RFC1925, 2.3 ]


[PATCH 1/2] firewire-core: add kernel API to construct multichannel isoc context

2020-05-19 Thread Takashi Sakamoto
In the 1394 OHCI specification, an IR context has several modes, one of
which is 'multiChanMode'. For this mode, the Linux FireWire stack has the
FW_ISO_CONTEXT_RECEIVE_MULTICHANNEL flag, separate from FW_ISO_CONTEXT_RECEIVE,
and an associated internal callback. However, the firewire-core driver code
includes a cast of the function callback for this mode, which hinders the
effort toward Control Flow Integrity builds.

This commit is a preparation to remove the cast. A new kernel API is added
for the mode, and the existing API becomes specific to the
FW_ISO_CONTEXT_RECEIVE and FW_ISO_CONTEXT_TRANSMIT modes. At present, no
in-kernel driver uses the mode, so the additional kernel API has no users.

Reported-by: Oscar Carter 
Reference: 
https://lore.kernel.org/lkml/20200519173425.4724-1-oscar.car...@gmx.com/
Signed-off-by: Takashi Sakamoto 
---
 drivers/firewire/core-iso.c | 17 +
 include/linux/firewire.h|  3 +++
 2 files changed, 20 insertions(+)

diff --git a/drivers/firewire/core-iso.c b/drivers/firewire/core-iso.c
index 185b0b78b3d6..07e967594f27 100644
--- a/drivers/firewire/core-iso.c
+++ b/drivers/firewire/core-iso.c
@@ -152,6 +152,23 @@ struct fw_iso_context *fw_iso_context_create(struct 
fw_card *card,
 }
 EXPORT_SYMBOL(fw_iso_context_create);
 
+struct fw_iso_context *fw_iso_mc_context_create(struct fw_card *card,
+   int type, int channel, int speed, size_t header_size,
+   fw_iso_mc_callback_t callback, void *callback_data)
+{
+   struct fw_iso_context *ctx;
+
+   ctx = fw_iso_context_create(card, type, channel, speed, header_size,
+   NULL, callback_data);
+   if (IS_ERR(ctx))
+   return ctx;
+
+   ctx->callback.mc = callback;
+
+   return ctx;
+}
+EXPORT_SYMBOL(fw_iso_mc_context_create);
+
 void fw_iso_context_destroy(struct fw_iso_context *ctx)
 {
ctx->card->driver->free_iso_context(ctx);
diff --git a/include/linux/firewire.h b/include/linux/firewire.h
index aec8f30ab200..9477814ab12a 100644
--- a/include/linux/firewire.h
+++ b/include/linux/firewire.h
@@ -453,6 +453,9 @@ struct fw_iso_context {
 struct fw_iso_context *fw_iso_context_create(struct fw_card *card,
int type, int channel, int speed, size_t header_size,
fw_iso_callback_t callback, void *callback_data);
+struct fw_iso_context *fw_iso_mc_context_create(struct fw_card *card,
+   int type, int channel, int speed, size_t header_size,
+   fw_iso_mc_callback_t callback, void *callback_data);
 int fw_iso_context_set_channels(struct fw_iso_context *ctx, u64 *channels);
 int fw_iso_context_queue(struct fw_iso_context *ctx,
 struct fw_iso_packet *packet,
-- 
2.25.1
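
The idea behind the new API can be sketched in plain C: keep both callback
types in a union and store each one through its own typed member, so no
function-pointer cast is needed. This is a userspace sketch with
hypothetical model_* names and simplified signatures; only the callback.mc
member name is taken from the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct model_ctx;

/* Two callback shapes with different argument lists (simplified). */
typedef void (*model_iso_cb_t)(struct model_ctx *ctx, uint32_t cycle,
			       size_t header_length, void *header, void *data);
typedef void (*model_iso_mc_cb_t)(struct model_ctx *ctx, uint64_t completed,
				  void *data);

struct model_ctx {
	int multichannel;
	union {
		model_iso_cb_t sc;	/* single-channel callback */
		model_iso_mc_cb_t mc;	/* multichannel callback */
	} callback;
	void *callback_data;
};

/* Each constructor stores through the member that matches the type. */
void model_ctx_init(struct model_ctx *ctx, model_iso_cb_t cb, void *data)
{
	ctx->multichannel = 0;
	ctx->callback.sc = cb;
	ctx->callback_data = data;
}

void model_mc_ctx_init(struct model_ctx *ctx, model_iso_mc_cb_t cb, void *data)
{
	ctx->multichannel = 1;
	ctx->callback.mc = cb;
	ctx->callback_data = data;
}

/* The call site uses the same member, so the indirect call is type-exact. */
void model_complete_mc(struct model_ctx *ctx, uint64_t completed)
{
	assert(ctx->multichannel);
	ctx->callback.mc(ctx, completed, ctx->callback_data);
}

/* Example multichannel callback: record the completion address. */
void model_record_completed(struct model_ctx *ctx, uint64_t completed, void *data)
{
	(void)ctx;
	*(uint64_t *)data = completed;
}
```

Because the pointer is always called through the type it was stored with,
a CFI scheme that checks the callee's prototype at indirect calls has
nothing to complain about.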



RE: [PATCH 2/3] clk: imx8mp: add mu root clk

2020-05-19 Thread Aisheng Dong
> From: Peng Fan 
> Sent: Wednesday, May 20, 2020 10:05 AM
> 
> Add mu root clk for mu mailbox usage.
> 
> Signed-off-by: Peng Fan 

Reviewed-by: Dong Aisheng 

Regards
Aisheng


RE: [PATCH 1/3] arm64: dts: imx8m: add mu node

2020-05-19 Thread Aisheng Dong
> From: Peng Fan 
> Sent: Wednesday, May 20, 2020 10:05 AM
> 
> Add mu node to let A53 could communicate with M Core.
> 
> Signed-off-by: Peng Fan 
> ---
>  arch/arm64/boot/dts/freescale/imx8mm.dtsi | 9 +
> arch/arm64/boot/dts/freescale/imx8mn.dtsi | 9 +
> arch/arm64/boot/dts/freescale/imx8mq.dtsi | 9 +
>  3 files changed, 27 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> b/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> index f3bbefe3e59f..9722f76d8c3f 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> +++ b/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> @@ -771,6 +771,15 @@
>   status = "disabled";
>   };
> 
> + mu: mailbox@30aa {
> + compatible = "fsl,imx6sx-mu";

Usually we also add current SoC compatible string.
compatible = "fsl,imx8mm-mu", "fsl,imx6sx-mu"

> + reg = <0x30aa 0x1>;
> + interrupts = ;
> + clocks = <&clk IMX8MM_CLK_MU_ROOT>;
> + clock-names = "mu";

Undocumented property, drop it

> + #mbox-cells = <2>;
> + };
> +
>   usdhc1: mmc@30b4 {
>   compatible = "fsl,imx8mm-usdhc", 
> "fsl,imx7d-usdhc";
>   reg = <0x30b4 0x1>;
> diff --git a/arch/arm64/boot/dts/freescale/imx8mn.dtsi
> b/arch/arm64/boot/dts/freescale/imx8mn.dtsi
> index fb63a98fdff5..5f30f1d50460 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mn.dtsi
> +++ b/arch/arm64/boot/dts/freescale/imx8mn.dtsi
> @@ -671,6 +671,15 @@
>   status = "disabled";
>   };
> 
> + mu: mailbox@30aa {
> + compatible = "fsl,imx6sx-mu";
> + reg = <0x30aa 0x1>;
> + interrupts = ;
> + clocks = <&clk IMX8MN_CLK_MU_ROOT>;
> + clock-names = "mu";
> + #mbox-cells = <2>;
> + };
> +
>   usdhc1: mmc@30b4 {
>   compatible = "fsl,imx8mn-usdhc", 
> "fsl,imx7d-usdhc";
>   reg = <0x30b4 0x1>;
> diff --git a/arch/arm64/boot/dts/freescale/imx8mq.dtsi
> b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
> index 1d15680a4962..e969fcbbd15f 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mq.dtsi
> +++ b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
> @@ -956,6 +956,15 @@
>   status = "disabled";
>   };
> 
> + mu: mailbox@30aa {
> + compatible = "fsl,imx6sx-mu";
> + reg = <0x30aa 0x1>;
> + interrupts = ;
> + clocks = <&clk IMX8MQ_CLK_MU_ROOT>;
> + clock-names = "mu";
> + #mbox-cells = <2>;
> + };
> +
>   usdhc1: mmc@30b4 {
>   compatible = "fsl,imx8mq-usdhc",
>"fsl,imx7d-usdhc";
> --
> 2.16.4



Re: [PATCH v1 01/25] net: core: device_rename: Use rwsem instead of a seqcount

2020-05-19 Thread Ahmed S. Darwish
Hello Eric,

On Tue, May 19, 2020 at 07:01:38PM -0700, Eric Dumazet wrote:
>
> On 5/19/20 2:45 PM, Ahmed S. Darwish wrote:
> > Sequence counters write paths are critical sections that must never be
> > preempted, and blocking, even for CONFIG_PREEMPTION=n, is not allowed.
> >
> > Commit 5dbe7c178d3f ("net: fix kernel deadlock with interface rename and
> > netdev name retrieval.") handled a deadlock, observed with
> > CONFIG_PREEMPTION=n, where the devnet_rename seqcount read side was
> > infinitely spinning: it got scheduled after the seqcount write side
> > blocked inside its own critical section.
> >
> > To fix that deadlock, among other issues, the commit added a
> > cond_resched() inside the read side section. While this will get the
> > non-preemptible kernel eventually unstuck, the seqcount reader is fully
> > exhausting its slice just spinning -- until TIF_NEED_RESCHED is set.
> >
> > The fix is also still broken: if the seqcount reader belongs to a
> > real-time scheduling policy, it can spin forever and the kernel will
> > livelock.
> >
> > Disabling preemption over the seqcount write side critical section will
> > not work: inside it are a number of GFP_KERNEL allocations and mutex
> > locking through the drivers/base/ :: device_rename() call chain.
> >
> > From all the above, replace the seqcount with a rwsem.
> >
> > Fixes: 5dbe7c178d3f (net: fix kernel deadlock with interface rename and 
> > netdev name retrieval.)
> > Fixes: 30e6c9fa93cf (net: devnet_rename_seq should be a seqcount)
> > Fixes: c91f6df2db49 (sockopt: Change getsockopt() of SO_BINDTODEVICE to 
> > return an interface name)
> > Cc: 
> > Signed-off-by: Ahmed S. Darwish 
> > Reviewed-by: Sebastian Andrzej Siewior 
> > ---
> >  net/core/dev.c | 30 --
> >  1 file changed, 12 insertions(+), 18 deletions(-)
> >
>
> Seems fine to me, assuming rwsem prevent starvation of the writer.
>

Thanks for the review.

AFAIK, due to 5cfd92e12e13 ("locking/rwsem: Adaptive disabling of reader
optimistic spinning"), using a rwsem shouldn't lead to writer starvation
in the contended case.

--
Ahmed S. Darwish
Linutronix GmbH
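
The reader-side problem described in the quoted commit message can be seen
in a single-threaded model. This is a sketch with hypothetical names, not
the kernel seqcount API: the sequence is odd while a write is in progress,
and the reader spins until it is even. The spin is bounded here only so
that the model terminates; that bound is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

struct model_seqcount {
	unsigned int sequence;	/* odd: a write section is in progress */
};

void model_write_begin(struct model_seqcount *s)
{
	s->sequence++;
}

void model_write_end(struct model_seqcount *s)
{
	assert(s->sequence & 1);
	s->sequence++;
}

/*
 * Spin until no write is in progress. Returns false if the writer never
 * finished within max_spins, i.e. the reader only burned CPU time.
 */
bool model_read_begin(const struct model_seqcount *s, unsigned int max_spins,
		      unsigned int *start)
{
	unsigned int spins;

	for (spins = 0; spins <= max_spins; spins++) {
		unsigned int seq = s->sequence;

		if (!(seq & 1)) {
			*start = seq;
			return true;
		}
	}
	return false;
}

/* True if a write happened since model_read_begin() and we must retry. */
bool model_read_retry(const struct model_seqcount *s, unsigned int start)
{
	return s->sequence != start;
}
```

If the writer blocks between model_write_begin() and model_write_end(), as
it can when it sleeps on an allocation or a mutex, the reader makes no
progress however long it spins. A sleeping lock such as a rwsem lets the
reader block instead.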


[PATCHv5 1/5] ext4: mballoc: Add blocks to PA list under same spinlock after allocating blocks

2020-05-19 Thread Ritesh Harjani
ext4_mb_discard_preallocations() only checks each group's
grp->bb_prealloc_list when discarding the group's PAs to free up space
after an allocation request fails. Consider the race below:

Process A                               Process B

1. allocate blocks
                                        1. Fails block allocation from
                                           ext4_mb_regular_allocator()
   ext4_lock_group()
        allocated blocks
        more than ac_o_ex.fe_len
   ext4_unlock_group()
                                        2. Scans the
                                           grp->bb_prealloc_list (under
                                           ext4_lock_group()) and
                                           find nothing and thus return
                                           -ENOSPC.

2. Add the additional blocks to PA list

   ext4_lock_group()
        add blocks to grp->bb_prealloc_list
   ext4_unlock_group()

The above race can be avoided if we add those additional blocks to
grp->bb_prealloc_list at the time of block allocation, while
ext4_lock_group() is still held.
With this, discard-PA will know whether there are actually any blocks
which could be freed from the PA list.
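
To illustrate the idea, here is a minimal userspace sketch, not the
actual mballoc code. The struct group_model type and the helpers
group_lock(), group_unlock(), alloc_with_pa() and discard_pa() are
hypothetical stand-ins; the "lock" is just a flag so that the model
stays single-threaded. It only shows that the surplus blocks reach the
PA accounting before the lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct ext4_group_info. */
struct group_model {
	bool locked;      /* models ext4_lock_group() being held */
	int  free_blocks; /* models grp->bb_free */
	int  pa_blocks;   /* blocks reachable via grp->bb_prealloc_list */
};

static void group_lock(struct group_model *g)
{
	assert(!g->locked);
	g->locked = true;
}

static void group_unlock(struct group_model *g)
{
	assert(g->locked);
	g->locked = false;
}

/*
 * Serve a request for 'wanted' blocks from an extent of 'best' blocks
 * (best >= wanted). The surplus is added to the PA accounting inside the
 * same critical section, so nobody can observe the blocks as gone from
 * free_blocks while they are still missing from the PA list.
 * Returns the number of blocks handed to the caller, or -1 on failure.
 */
static int alloc_with_pa(struct group_model *g, int wanted, int best)
{
	int ret = -1;

	group_lock(g);
	if (best >= wanted && g->free_blocks >= best) {
		g->free_blocks -= best;
		if (wanted < best)
			g->pa_blocks += best - wanted;
		ret = wanted;
	}
	group_unlock(g);
	return ret;
}

/* What a discard thread can reclaim; also runs under the group lock. */
static int discard_pa(struct group_model *g)
{
	int freed;

	group_lock(g);
	freed = g->pa_blocks;
	g->pa_blocks = 0;
	g->free_blocks += freed;
	group_unlock(g);
	return freed;
}
```

With this ordering, a discard that runs right after the allocation
always finds the surplus blocks instead of an empty list.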

Signed-off-by: Ritesh Harjani 
---
 fs/ext4/mballoc.c | 97 ++-
 1 file changed, 62 insertions(+), 35 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 33a69424942c..decc5168d126 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -349,6 +349,7 @@ static void ext4_mb_generate_from_pa(struct super_block 
*sb, void *bitmap,
ext4_group_t group);
 static void ext4_mb_generate_from_freelist(struct super_block *sb, void 
*bitmap,
ext4_group_t group);
+static void ext4_mb_new_preallocation(struct ext4_allocation_context *ac);
 
 static inline void *mb_correct_addr_and_bit(int *bit, void *addr)
 {
@@ -1701,6 +1702,14 @@ static void ext4_mb_use_best_found(struct 
ext4_allocation_context *ac,
sbi->s_mb_last_start = ac->ac_f_ex.fe_start;
spin_unlock(&sbi->s_md_lock);
}
+   /*
+* As we've just preallocated more space than
+* user requested originally, we store allocated
+* space in a special descriptor.
+*/
+   if (ac->ac_o_ex.fe_len < ac->ac_b_ex.fe_len)
+   ext4_mb_new_preallocation(ac);
+
 }
 
 /*
@@ -1949,7 +1958,7 @@ void ext4_mb_simple_scan_group(struct 
ext4_allocation_context *ac,
 
ext4_mb_use_best_found(ac, e4b);
 
-   BUG_ON(ac->ac_b_ex.fe_len != ac->ac_g_ex.fe_len);
+   BUG_ON(ac->ac_f_ex.fe_len != ac->ac_g_ex.fe_len);
 
if (EXT4_SB(sb)->s_mb_stats)
atomic_inc(&EXT4_SB(sb)->s_bal_2orders);
@@ -3675,7 +3684,7 @@ static void ext4_mb_put_pa(struct ext4_allocation_context 
*ac,
 /*
  * creates new preallocated space for given inode
  */
-static noinline_for_stack int
+static noinline_for_stack void
 ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
 {
struct super_block *sb = ac->ac_sb;
@@ -3688,10 +3697,9 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
BUG_ON(ac->ac_o_ex.fe_len >= ac->ac_b_ex.fe_len);
BUG_ON(ac->ac_status != AC_STATUS_FOUND);
BUG_ON(!S_ISREG(ac->ac_inode->i_mode));
+   BUG_ON(ac->ac_pa == NULL);
 
-   pa = kmem_cache_alloc(ext4_pspace_cachep, GFP_NOFS);
-   if (pa == NULL)
-   return -ENOMEM;
+   pa = ac->ac_pa;
 
if (ac->ac_b_ex.fe_len < ac->ac_g_ex.fe_len) {
int winl;
@@ -3735,7 +3743,6 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
pa->pa_pstart = ext4_grp_offs_to_block(sb, &ac->ac_b_ex);
pa->pa_len = ac->ac_b_ex.fe_len;
pa->pa_free = pa->pa_len;
-   atomic_set(&pa->pa_count, 1);
spin_lock_init(&pa->pa_lock);
INIT_LIST_HEAD(&pa->pa_inode_list);
INIT_LIST_HEAD(&pa->pa_group_list);
@@ -3755,21 +3762,17 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
pa->pa_obj_lock = &ei->i_prealloc_lock;
pa->pa_inode = ac->ac_inode;
 
-   ext4_lock_group(sb, ac->ac_b_ex.fe_group);
list_add(&pa->pa_group_list, &grp->bb_prealloc_list);
-   ext4_unlock_group(sb, ac->ac_b_ex.fe_group);
 
spin_lock(pa->pa_obj_lock);
list_add_rcu(&pa->pa_inode_list, &ei->i_prealloc_list);
spin_unlock(pa->pa_obj_lock);
-
-   return 0;
 }
 
 /*
  * creates new preallocated space for locality group inodes belongs to
  */
-static noinline_for_stack int
+static noinline_for_stack void
 ext4_mb_new_group_pa(struct ext4_allocation_context *ac)
 {
struct super_block *sb = ac->ac_sb;
@@ -3781,11 +3784,9 @@ ext4_mb_new_group_pa(struct ext4_allocation_context *ac)
BUG_ON(ac->ac_o_ex.fe_len >= ac->ac_b_ex.fe_len);
BUG_ON(ac->ac_status != AC_STATUS_FOUND);
BUG_ON(!S

[PATCHv5 4/5] ext4: mballoc: Refactor ext4_mb_good_group()

2020-05-19 Thread Ritesh Harjani
The definition of ext4_mb_good_group() was changed some time back
and it now even initializes the buddy cache (via ext4_mb_init_group())
if EXT4_MB_GRP_NEED_INIT() is true for a group.
Note that ext4_mb_init_group() can sleep and so must not be called
with a spinlock held.
This is fine as of now because ext4_mb_good_group() is first called
before loading the buddy bitmap, without ext4_lock_group() held,
and is called again after loading the bitmap, this time with
ext4_lock_group() held.
Still, the whole arrangement is confusing.

So this patch factors out ext4_mb_good_group_nolock(), which should be
called without holding ext4_lock_group().
Also, in later patches we hold the spinlock (ext4_lock_group()) while
doing any calculations which involve grp->bb_free or grp->bb_fragments.
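
The split can be pictured with the following userspace sketch. It is
only a model under stated assumptions: struct grp_model, good_group(),
init_group() and good_group_nolock() are hypothetical names, -EIO is an
arbitrary example error, and the real predicate checks much more than
the free count.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct ext4_group_info. */
struct grp_model {
	int  free;       /* models grp->bb_free */
	bool need_init;  /* models EXT4_MB_GRP_NEED_INIT(grp) */
	bool init_fails; /* test knob: make the (sleeping) init step fail */
	bool locked;     /* models ext4_lock_group() being held */
};

/* Pure predicate: never sleeps, so it is safe with the lock held. */
static bool good_group(const struct grp_model *g, int want)
{
	return g->free >= want;
}

/* Stand-in for ext4_mb_init_group(); may sleep, so no lock may be held. */
static int init_group(struct grp_model *g)
{
	assert(!g->locked);
	if (g->init_fails)
		return -EIO;
	g->need_init = false;
	return 0;
}

/* Returns 1 (good), 0 (not good) or a negative error from the init step. */
static int good_group_nolock(struct grp_model *g, int want)
{
	int ret;

	if (g->free == 0)
		return 0;
	if (g->need_init) {
		ret = init_group(g);
		if (ret)
			return ret;
	}
	return good_group(g, want);
}
```

The point of the design is that only the _nolock variant can return an
error, so the locked predicate can have a plain bool return type.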

Signed-off-by: Ritesh Harjani 
---
 fs/ext4/mballoc.c | 78 ++-
 1 file changed, 50 insertions(+), 28 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 754ff9f65199..c9297c878a90 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2106,15 +2106,14 @@ void ext4_mb_scan_aligned(struct 
ext4_allocation_context *ac,
 }
 
 /*
- * This is now called BEFORE we load the buddy bitmap.
+ * This is also called BEFORE we load the buddy bitmap.
  * Returns either 1 or 0 indicating that the group is either suitable
- * for the allocation or not. In addition it can also return negative
- * error code when something goes wrong.
+ * for the allocation or not.
  */
-static int ext4_mb_good_group(struct ext4_allocation_context *ac,
+static bool ext4_mb_good_group(struct ext4_allocation_context *ac,
ext4_group_t group, int cr)
 {
-   unsigned free, fragments;
+   ext4_grpblk_t free, fragments;
int flex_size = ext4_flex_bg_size(EXT4_SB(ac->ac_sb));
struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
 
@@ -2122,23 +2121,16 @@ static int ext4_mb_good_group(struct 
ext4_allocation_context *ac,
 
free = grp->bb_free;
if (free == 0)
-   return 0;
+   return false;
if (cr <= 2 && free < ac->ac_g_ex.fe_len)
-   return 0;
+   return false;
 
if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp)))
-   return 0;
-
-   /* We only do this if the grp has never been initialized */
-   if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
-   int ret = ext4_mb_init_group(ac->ac_sb, group, GFP_NOFS);
-   if (ret)
-   return ret;
-   }
+   return false;
 
fragments = grp->bb_fragments;
if (fragments == 0)
-   return 0;
+   return false;
 
switch (cr) {
case 0:
@@ -2148,31 +2140,63 @@ static int ext4_mb_good_group(struct 
ext4_allocation_context *ac,
if ((ac->ac_flags & EXT4_MB_HINT_DATA) &&
(flex_size >= EXT4_FLEX_SIZE_DIR_ALLOC_SCHEME) &&
((group % flex_size) == 0))
-   return 0;
+   return false;
 
if ((ac->ac_2order > ac->ac_sb->s_blocksize_bits+1) ||
(free / fragments) >= ac->ac_g_ex.fe_len)
-   return 1;
+   return true;
 
if (grp->bb_largest_free_order < ac->ac_2order)
-   return 0;
+   return false;
 
-   return 1;
+   return true;
case 1:
if ((free / fragments) >= ac->ac_g_ex.fe_len)
-   return 1;
+   return true;
break;
case 2:
if (free >= ac->ac_g_ex.fe_len)
-   return 1;
+   return true;
break;
case 3:
-   return 1;
+   return true;
default:
BUG();
}
 
-   return 0;
+   return false;
+}
+
+/*
+ * This could return negative error code if something goes wrong
+ * during ext4_mb_init_group(). This should not be called with
+ * ext4_lock_group() held.
+ */
+static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac,
+ext4_group_t group, int cr)
+{
+   struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
+   ext4_grpblk_t free;
+   int ret = 0;
+
+   free = grp->bb_free;
+   if (free == 0)
+   goto out;
+   if (cr <= 2 && free < ac->ac_g_ex.fe_len)
+   goto out;
+   if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp)))
+   goto out;
+
+   /* We only do this if the grp has never been initialized */
+   if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
+   ret = ext4_mb_init_group(ac->ac_sb, group, GFP_NOFS);
+   if (ret)
+   return ret;
+   }
+
+   ret = ext4_mb_good_

[PATCHv5 3/5] ext4: mballoc: Introduce pcpu seqcnt for freeing PA to improve ENOSPC handling

2020-05-19 Thread Ritesh Harjani
There could be a race in ext4_mb_discard_group_preallocations()
where the 1st thread iterates through the group's bb_prealloc_list,
removes all the PAs and adds them to the function's local list head.
If a 2nd thread now comes in to discard the group preallocations,
it will see that group->bb_prealloc_list is empty and will return 0.

Consider a case where we have a small number of groups
(e.g. just group 0):
this may even return an -ENOSPC error from ext4_mb_new_blocks()
(which is where ext4_mb_discard_group_preallocations() gets called).
But that is wrong, since the 2nd thread should have waited for the 1st thread
to release all the PAs and then retried the allocation,
given that the 1st thread was going to discard the PAs anyway.

The algorithm using this percpu seq counter is as follows:
1. We sample the percpu discard_pa_seq counter before trying for block
   allocation in ext4_mb_new_blocks().
2. We increment this percpu discard_pa_seq counter when we either allocate
   or free these blocks i.e. while marking those blocks as used/free in
   mb_mark_used()/mb_free_blocks().
3. We also increment this percpu seq counter when we successfully identify
   that the bb_prealloc_list is not empty and hence proceed for discarding
   of those PAs inside ext4_mb_discard_group_preallocations().

To make sure that the regular fast path of block allocation is not
affected, as a small optimization we only sample the percpu seq counter
of the current cpu there. Only when block allocation fails and no blocks
could be freed do we sample the percpu seq counter for all cpus, using
the function ext4_get_discard_pa_seq_sum() below. This happens after making
sure that all the PAs on grp->bb_prealloc_list got freed or that it is empty.

One could argue: why not just check grp->bb_free to
see if there are any free blocks to be allocated? Here are the two
concerns which were discussed:

1. If for some reason the blocks available in the group are not
   appropriate for the allocation logic (e.g.
   EXT4_MB_HINT_GOAL_ONLY, although this is not yet implemented), then
   the retry logic may loop infinitely since grp->bb_free is
   non-zero.

2. Also, before preallocation was combined with block allocation under the
   same ext4_lock_group(), there were a lot of races where grp->bb_free
   could not be relied upon.

Due to the above, this patch uses the discard_pa_seq logic to determine if
we should retry block allocation. If there are n threads
trying to allocate blocks and none of them could allocate or discard
any blocks, then all of those n threads will fail the block
allocation and return an -ENOSPC error (since the seq counter for all of
them will match, as no block allocation/discard was done during that
time).
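
The sampling and retry decision described above can be sketched in
userspace as follows. This is a simplified model, not the kernel code:
the per-cpu variable is replaced by a plain array, and NR_CPUS_MODEL,
seq_inc(), seq_sample_local(), seq_sum() and should_retry() are
hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS_MODEL 4 /* hypothetical CPU count for the model */

/* Models the per-cpu discard_pa_seq counters. */
static uint64_t discard_pa_seq_model[NR_CPUS_MODEL];

/* Alloc/free/discard path: bump only the current CPU's counter. */
static void seq_inc(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	discard_pa_seq_model[cpu]++;
}

/* Fast path: sample only the local counter before allocating. */
static uint64_t seq_sample_local(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	return discard_pa_seq_model[cpu];
}

/* Slow path: sum the counters of all CPUs. */
static uint64_t seq_sum(void)
{
	uint64_t sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		sum += discard_pa_seq_model[cpu];
	return sum;
}

/*
 * Called after an allocation attempt failed. 'freed' is what the discard
 * pass released. Retry if something was freed, or if the global sum moved
 * since the last sample, which means another thread allocated, freed or
 * started discarding in the meantime.
 */
static bool should_retry(int freed, uint64_t *seq)
{
	uint64_t now;

	if (freed)
		return true;
	now = seq_sum();
	if (now != *seq) {
		*seq = now;
		return true;
	}
	return false;
}
```

Once the sum stops moving and nothing gets freed, every waiting thread
sees a matching sequence and gives up, which bounds the retry loop.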

Signed-off-by: Ritesh Harjani 
---
 fs/ext4/mballoc.c | 56 ++-
 1 file changed, 51 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index b75408d72773..754ff9f65199 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -351,6 +351,35 @@ static void ext4_mb_generate_from_freelist(struct 
super_block *sb, void *bitmap,
ext4_group_t group);
 static void ext4_mb_new_preallocation(struct ext4_allocation_context *ac);
 
+/*
+ * The algorithm using this percpu seq counter goes below:
+ * 1. We sample the percpu discard_pa_seq counter before trying for block
+ *allocation in ext4_mb_new_blocks().
+ * 2. We increment this percpu discard_pa_seq counter when we either allocate
+ *or free these blocks i.e. while marking those blocks as used/free in
+ *mb_mark_used()/mb_free_blocks().
+ * 3. We also increment this percpu seq counter when we successfully identify
+ *that the bb_prealloc_list is not empty and hence proceed for discarding
+ *of those PAs inside ext4_mb_discard_group_preallocations().
+ *
+ * Now to make sure that the regular fast path of block allocation is not
+ * affected, as a small optimization we only sample the percpu seq counter
+ * on that cpu. Only when the block allocation fails and when freed blocks
+ * found were 0, that is when we sample percpu seq counter for all cpus using
+ * below function ext4_get_discard_pa_seq_sum(). This happens after making
+ * sure that all the PAs on grp->bb_prealloc_list got freed or if it's empty.
+ */
+static DEFINE_PER_CPU(u64, discard_pa_seq);
+static inline u64 ext4_get_discard_pa_seq_sum(void)
+{
+   int __cpu;
+   u64 __seq = 0;
+
+   for_each_possible_cpu(__cpu)
+   __seq += per_cpu(discard_pa_seq, __cpu);
+   return __seq;
+}
+
 static inline void *mb_correct_addr_and_bit(int *bit, void *addr)
 {
 #if BITS_PER_LONG == 64
@@ -1462,6 +1491,7 @@ static void mb_free_blocks(struct inode *inode, struct 
ext4_buddy *e4b,
mb_check_buddy(e4b);
mb_free_blocks_double(inode, e4b, first, count);
 
+   this_cpu_inc(discard_pa_seq);
 

[PATCHv5 5/5] ext4: mballoc: Use lock for checking free blocks while retrying

2020-05-19 Thread Ritesh Harjani
Currently, while doing block allocation, grp->bb_free may get
modified if a discard is happening in parallel.
For example, consider a case where a lot of threads have
preallocated a lot of blocks and one thread is trying
to discard all of these groups' PAs. It could then happen that
we see bb_free as zero for all of those groups and fail the allocation,
while there would be sufficient space if we freed up all the PAs.

So this patch adds another flag, "EXT4_MB_STRICT_CHECK", which will be set
if we are unable to allocate any blocks in the first try (since we may
not have considered blocks about to be discarded from the PA lists).
During the retry attempt to allocate blocks we will then use ext4_lock_group()
when checking whether the group is good or not.
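
The effect of the flag on the retry decision can be modelled as below.
This is a hedged userspace sketch only: struct ac_model,
MB_STRICT_CHECK_MODEL and should_retry_strict() are hypothetical names,
and the sequence sum is passed in as a parameter instead of being read
from per-cpu counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MB_STRICT_CHECK_MODEL 0x4000u /* models EXT4_MB_STRICT_CHECK */

/* Hypothetical, simplified stand-in for struct ext4_allocation_context. */
struct ac_model {
	unsigned int flags;
};

/*
 * Decide whether to retry after a failed allocation where discard freed
 * nothing. The first failure always retries once with the strict (locked)
 * check enabled; later failures retry only if the sequence sum moved.
 */
static bool should_retry_strict(struct ac_model *ac, uint64_t seq_now,
				uint64_t *seq)
{
	assert(ac && seq);
	if (!(ac->flags & MB_STRICT_CHECK_MODEL) || seq_now != *seq) {
		ac->flags |= MB_STRICT_CHECK_MODEL;
		*seq = seq_now;
		return true;
	}
	return false;
}
```

So the locked check costs nothing on the fast path and is paid only
once an allocation has already failed.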

Signed-off-by: Ritesh Harjani 
---
 fs/ext4/ext4.h  |  2 ++
 fs/ext4/mballoc.c   | 13 -
 include/trace/events/ext4.h |  3 ++-
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index fb37fb3fe689..d185f3bcb9eb 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -150,6 +150,8 @@ enum SHIFT_DIRECTION {
 #define EXT4_MB_USE_ROOT_BLOCKS0x1000
 /* Use blocks from reserved pool */
 #define EXT4_MB_USE_RESERVED   0x2000
+/* Do strict check for free blocks while retrying block allocation */
+#define EXT4_MB_STRICT_CHECK   0x4000
 
 struct ext4_allocation_request {
/* target inode for block we're allocating */
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index c9297c878a90..a9083113a8c0 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2176,9 +2176,13 @@ static int ext4_mb_good_group_nolock(struct 
ext4_allocation_context *ac,
 ext4_group_t group, int cr)
 {
struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
+   struct super_block *sb = ac->ac_sb;
+   bool should_lock = ac->ac_flags & EXT4_MB_STRICT_CHECK;
ext4_grpblk_t free;
int ret = 0;
 
+   if (should_lock)
+   ext4_lock_group(sb, group);
free = grp->bb_free;
if (free == 0)
goto out;
@@ -2186,6 +2190,8 @@ static int ext4_mb_good_group_nolock(struct 
ext4_allocation_context *ac,
goto out;
if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp)))
goto out;
+   if (should_lock)
+   ext4_unlock_group(sb, group);
 
/* We only do this if the grp has never been initialized */
if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
@@ -2194,8 +2200,12 @@ static int ext4_mb_good_group_nolock(struct 
ext4_allocation_context *ac,
return ret;
}
 
+   if (should_lock)
+   ext4_lock_group(sb, group);
ret = ext4_mb_good_group(ac, group, cr);
 out:
+   if (should_lock)
+   ext4_unlock_group(sb, group);
return ret;
 }
 
@@ -4610,7 +4620,8 @@ static bool 
ext4_mb_discard_preallocations_should_retry(struct super_block *sb,
goto out_dbg;
}
seq_retry = ext4_get_discard_pa_seq_sum();
-   if (seq_retry != *seq) {
+   if (!(ac->ac_flags & EXT4_MB_STRICT_CHECK) || seq_retry != *seq) {
+   ac->ac_flags |= EXT4_MB_STRICT_CHECK;
*seq = seq_retry;
ret = true;
}
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 19c87661eeec..0df9efa80b16 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -35,7 +35,8 @@ struct partial_cluster;
{ EXT4_MB_DELALLOC_RESERVED,"DELALLOC_RESV" },  \
{ EXT4_MB_STREAM_ALLOC, "STREAM_ALLOC" },   \
{ EXT4_MB_USE_ROOT_BLOCKS,  "USE_ROOT_BLKS" },  \
-   { EXT4_MB_USE_RESERVED, "USE_RESV" })
+   { EXT4_MB_USE_RESERVED, "USE_RESV" },   \
+   { EXT4_MB_STRICT_CHECK, "STRICT_CHECK" })
 
 #define show_map_flags(flags) __print_flags(flags, "|",
\
{ EXT4_GET_BLOCKS_CREATE,   "CREATE" }, \
-- 
2.21.0



[PATCHv5 0/5] Improve ext4 handling of ENOSPC with multi-threaded use-case

2020-05-19 Thread Ritesh Harjani
Hello All,

Please note that these patches are based on top of mballoc cleanup series [2]
which is also pending review. :)

v4 -> v5:
1. Removed ext4_lock_group() from fastpath and added that in the retry attempt,
so that the performance of fastpath is not affected.

v3 -> v4:
1. Split code cleanups and debug improvements into a separate patch series.
2. Dropped the rcu_barrier() approach since it caused some latency
in my testing of ENOSPC handling.
3. This patch series takes a different approach to improve the multi-threaded
ENOSPC handling in ext4 mballoc code. The mail below gives more details.


Background
==========
Consider a case where your disk is close to full but enough space still
remains for your multi-threaded application to run. When this application's
threads try to write in parallel (e.g. a sparse file followed by mmap writes, or
even fallocate on multiple files), then with the current code of the
ext4 multi-block allocator, the application may get an ENOSPC error in some
cases. Examining disk space at this time, we see there is sufficient space
remaining for the application to continue to run.

Additional info:

1. Our internal test team was easily able to reproduce this ENOSPC error on
   an upstream kernel with a 2GB ext4 image and 64K blocksize. They didn't try
   above 2GB and reported this issue directly to the dev team. On examining the
   free space when the application gets ENOSPC, the free space left was more
   than 50% of the filesystem size in some cases.

2. For debugging/development of these patches, I used the script below [1] to
   trigger this issue quite frequently on a 64K blocksize setup with a 240MB
   ext4 image.


Summary of patches and problem with current design
==================================================
There were 3 main problems which these patches try to address in order to
improve the ENOSPC handling in ext4's multi-block allocator code.

1. Patch-2: Earlier we were checking whether the group is good or not (i.e.
   whether it has enough free blocks to serve the request) without taking
   the group's lock. This could result in a race where, if another thread is
   discarding the group's prealloc list, the allocation thread will not
   consider those about-to-be-freed blocks and will report that the group
   is not fit for allocation, thus eventually failing with an ENOSPC error.

2. Patch-4: The discard PA algorithm only scans the PA list to free up the
   additional blocks which got added to the PA. This is done by the same
   thread-A which at first couldn't allocate any blocks. But there is a window
   where, once the blocks were allocated (say by some other thread-B
   previously), we drop the group's lock and then check whether some of these
   blocks could be added to the prealloc list of the group from which we
   allocated. After that we take the lock again and add these additional
   blocks allocated by thread-B to the PA list. If thread-A scans the PA list
   within this interval, there is a possibility that it won't find any
   blocks added to the PA list and hence may return an ENOSPC error.
   Hence this patch adds those additional blocks to the PA list right
   after the blocks are marked as used, with the same group spinlock held.

3. Patch-3: Introduces a per-cpu discard_pa_seq counter which is incremented
   whenever there is a block allocation/freeing or when the discarding of any
   group's PA list has started. With this we can know when to stop the
   retry logic and return an ENOSPC error if there is actually no free space
   left.
   There is an optimization in the block allocation fast path with this
   approach: before starting the block allocation, we only sample the
   percpu seq count on the current cpu. Only when the allocation fails and
   discard couldn't free up any blocks in any of the groups' PA lists do we
   sample the percpu seq counter sum over all possible cpus to check
   if we need to retry.


Testing:
========
Tested fstests with default bs of 4K and bs == PAGESIZE ("-g auto")
No new failures were reported with this patch series in this testing.

NOTE:
1. This patch series is based on top of mballoc code cleanup patch series
posted at [2].

References:
===========
[v3]: https://lkml.kernel.org/linux-ext4/cover.1588313626.git.rite...@linux.ibm.com/
[1]: https://github.com/riteshharjani/LinuxStudy/blob/master/tools/test_mballoc.sh
[2]: https://lkml.kernel.org/linux-ext4/cover.1589086800.git.rite...@linux.ibm.com/


Ritesh Harjani (5):
  ext4: mballoc: Add blocks to PA list under same spinlock after allocating 
blocks
  ext4: mballoc: Refactor ext4_mb_discard_preallocations()
  ext4: mballoc: Introduce pcpu seqcnt for freeing PA to improve ENOSPC handling
  ext4: mballoc: Refactor ext4_mb_good_group()
  ext4: mballoc: Use lock for checking free blocks while retrying

 fs/ext4/ext4.h  |   2 +
 fs/ext4/mballoc.c   | 247 +++

RE: [PATCH 0/4] arm64: dts: imx8m: dtb aliases update

2020-05-19 Thread Aisheng Dong
> From: Peng Fan 
> Sent: Wednesday, May 20, 2020 10:03 AM
> 
> Minor patchset to update device tree aliases
> 
> Peng Fan (4):
>   arm64: dts: imx8mq: Add mmc aliases
>   arm64: dts: imx8mq: Add ethernet alias
>   arm64: dts: imx8mm: sort the aliases
>   arm64: dts: imx8mp: add i2c aliases

For this patchset,

Reviewed-by: Dong Aisheng 

Regards
Aisheng


[PATCHv5 2/5] ext4: mballoc: Refactor ext4_mb_discard_preallocations()

2020-05-19 Thread Ritesh Harjani
Implement ext4_mb_discard_preallocations_should_retry(),
which we will need in later patches to add more logic,
such as checking for a sequence number match, to decide whether we should
retry block allocation or not.

There should be no functionality change in this patch.

Signed-off-by: Ritesh Harjani 
---
 fs/ext4/mballoc.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index decc5168d126..b75408d72773 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4543,6 +4543,17 @@ static int ext4_mb_discard_preallocations(struct 
super_block *sb, int needed)
return freed;
 }
 
+static bool ext4_mb_discard_preallocations_should_retry(struct super_block *sb,
+   struct ext4_allocation_context *ac)
+{
+   int freed;
+
+   freed = ext4_mb_discard_preallocations(sb, ac->ac_o_ex.fe_len);
+   if (freed)
+   return true;
+   return false;
+}
+
 /*
  * Main entry point into mballoc to allocate blocks
  * it tries to use preallocation first, then falls back
@@ -4551,7 +4562,6 @@ static int ext4_mb_discard_preallocations(struct 
super_block *sb, int needed)
 ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
struct ext4_allocation_request *ar, int *errp)
 {
-   int freed;
struct ext4_allocation_context *ac = NULL;
struct ext4_sb_info *sbi;
struct super_block *sb;
@@ -4656,8 +4666,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
ar->len = ac->ac_b_ex.fe_len;
}
} else {
-   freed  = ext4_mb_discard_preallocations(sb, ac->ac_o_ex.fe_len);
-   if (freed)
+   if (ext4_mb_discard_preallocations_should_retry(sb, ac))
goto repeat;
/*
 * If block allocation fails then the pa allocated above
-- 
2.21.0



[PATCH] ARM: dts: imx: Make tempmon node as child of anatop node

2020-05-19 Thread Anson Huang
i.MX6/7 SoCs' temperature sensor is inside anatop module from HW
perspective, so it should be a child node of anatop.

Signed-off-by: Anson Huang 
---
 arch/arm/boot/dts/imx6qdl.dtsi | 22 +++---
 arch/arm/boot/dts/imx6sl.dtsi  | 20 ++--
 arch/arm/boot/dts/imx6sll.dtsi | 20 ++--
 arch/arm/boot/dts/imx6sx.dtsi  | 20 ++--
 arch/arm/boot/dts/imx6ul.dtsi  | 20 ++--
 arch/arm/boot/dts/imx7s.dtsi   | 20 ++--
 6 files changed, 61 insertions(+), 61 deletions(-)

diff --git a/arch/arm/boot/dts/imx6qdl.dtsi b/arch/arm/boot/dts/imx6qdl.dtsi
index 39d4afd..43d44d5 100644
--- a/arch/arm/boot/dts/imx6qdl.dtsi
+++ b/arch/arm/boot/dts/imx6qdl.dtsi
@@ -69,17 +69,6 @@
};
};
 
-   tempmon: tempmon {
-   compatible = "fsl,imx6q-tempmon";
-   interrupt-parent = <&gpc>;
-   interrupts = <0 49 IRQ_TYPE_LEVEL_HIGH>;
-   fsl,tempmon = <&anatop>;
-   nvmem-cells = <&tempmon_calib>, <&tempmon_temp_grade>;
-   nvmem-cell-names = "calib", "temp_grade";
-   clocks = <&clks IMX6QDL_CLK_PLL3_USB_OTG>;
-   #thermal-sensor-cells = <0>;
-   };
-
ldb: ldb {
#address-cells = <1>;
#size-cells = <0>;
@@ -795,6 +784,17 @@
anatop-min-voltage = <725000>;
anatop-max-voltage = <145>;
};
+
+   tempmon: tempmon {
+   compatible = "fsl,imx6q-tempmon";
+   interrupt-parent = <&gpc>;
+   interrupts = <0 49 IRQ_TYPE_LEVEL_HIGH>;
+   fsl,tempmon = <&anatop>;
+   nvmem-cells = <&tempmon_calib>, 
<&tempmon_temp_grade>;
+   nvmem-cell-names = "calib", 
"temp_grade";
+   clocks = <&clks 
IMX6QDL_CLK_PLL3_USB_OTG>;
+   #thermal-sensor-cells = <0>;
+   };
};
 
usbphy1: usbphy@20c9000 {
diff --git a/arch/arm/boot/dts/imx6sl.dtsi b/arch/arm/boot/dts/imx6sl.dtsi
index 911d8cf..d8efc0a 100644
--- a/arch/arm/boot/dts/imx6sl.dtsi
+++ b/arch/arm/boot/dts/imx6sl.dtsi
@@ -93,16 +93,6 @@
};
};
 
-   tempmon: tempmon {
-   compatible = "fsl,imx6q-tempmon";
-   interrupts = <0 49 IRQ_TYPE_LEVEL_HIGH>;
-   interrupt-parent = <&gpc>;
-   fsl,tempmon = <&anatop>;
-   nvmem-cells = <&tempmon_calib>, <&tempmon_temp_grade>;
-   nvmem-cell-names = "calib", "temp_grade";
-   clocks = <&clks IMX6SL_CLK_PLL3_USB_OTG>;
-   };
-
pmu {
compatible = "arm,cortex-a9-pmu";
interrupt-parent = <&gpc>;
@@ -628,6 +618,16 @@
anatop-min-voltage = <725000>;
anatop-max-voltage = <145>;
};
+
+   tempmon: tempmon {
+   compatible = "fsl,imx6q-tempmon";
+   interrupts = <0 49 IRQ_TYPE_LEVEL_HIGH>;
+   interrupt-parent = <&gpc>;
+   fsl,tempmon = <&anatop>;
+   nvmem-cells = <&tempmon_calib>, 
<&tempmon_temp_grade>;
+   nvmem-cell-names = "calib", 
"temp_grade";
+   clocks = <&clks 
IMX6SL_CLK_PLL3_USB_OTG>;
+   };
};
 
usbphy1: usbphy@20c9000 {
diff --git a/arch/arm/boot/dts/imx6sll.dtsi b/arch/arm/boot/dts/imx6sll.dtsi
index edd3abb..bf7f048 100644
--- a/arch/arm/boot/dts/imx6sll.dtsi
+++ b/arch/arm/boot/dts/imx6sll.dtsi
@@ -105,16 +105,6 @@
clock-output-names = "ipp_di1";
};
 
-   tempmon: temperature-sensor {
-   compatible = "fsl,imx6sll-tempmon", "fsl,imx6sx-tempmon";
-   interrupts = ;
-   interrupt-parent = <&gpc>;
-   fsl,tempmon = <&anatop>;
-   nvmem-cells = <&tempmon_calib>, <&tempmon_temp_grade>;
-   nvmem-cell-names = "calib", "temp_grade";
-   clocks = <&clks IMX6SLL_CLK_PLL3_USB_OTG>;
-   };
-
soc {
#address-cells = <1>;
#size-cells = <1>;
@@ -531,6 +521,16 @@
anatop-max-voltage = <340>;
anatop-enable-bit = <0>;
};
+
+  

Re: [PATCH v4 2/4] mm/memory.c: Update local TLB if PTE entry exists

2020-05-19 Thread maobibo



On 05/20/2020 09:26 AM, Andrew Morton wrote:
> On Tue, 19 May 2020 18:03:28 +0800 Bibo Mao  wrote:
> 
>> If two threads concurrently fault at the same address, the thread that
>> won the race updates the PTE and its local TLB. For now, the other
>> thread gives up, simply does nothing, and continues.
>>
>> It could happen that this second thread triggers another fault, whereby
>> it only updates its local TLB while handling the fault. Instead of
>> triggering another fault, let's directly update the local TLB of the
>> second thread.
>>
>> It is only useful to architectures where software can update TLB, it may
>> bring out some negative effect if update_mmu_cache is used for other
>> purpose also. It seldom happens where multiple threads access the same
>> page at the same time, so the negative effect is limited on other arches.
> 
> I'm still worried about the impact on other architectures.  The
> additional update_mmu_cache() calls won't occur only when multiple
> threads are racing against the same page, I think?  For example,
> insert_pfn() will do this when making a read-only page a writable one.
How about defining ptep_set_access_flags() like this on MIPS? It is the
same as the definition on the riscv platform.

static inline int ptep_set_access_flags(struct vm_area_struct *vma,
					unsigned long address, pte_t *ptep,
					pte_t entry, int dirty)
{
	if (!pte_same(*ptep, entry))
		set_pte_at(vma->vm_mm, address, ptep, entry);
	/*
	 * update_mmu_cache will unconditionally execute, handling both
	 * the case that the PTE changed and the spurious fault case.
	 */
	return true;
}

And keep the following pieces of code unchanged, so the change will be smaller.
@@ -1770,8 +1770,8 @@ static vm_fault_t insert_pfn(struct vm_area_struct *vma, 
unsigned long addr,
}
entry = pte_mkyoung(*pte);
entry = maybe_mkwrite(pte_mkdirty(entry), vma);
-   if (ptep_set_access_flags(vma, addr, pte, entry, 1))
-   update_mmu_cache(vma, addr, pte);
+   ptep_set_access_flags(vma, addr, pte, entry, 1);
+   update_mmu_cache(vma, addr, pte);
}

@@ -2436,17 +2436,16 @@ static inline bool cow_user_page(struct page *dst, 
struct page *src,
entry = pte_mkyoung(vmf->orig_pte);
-   if (ptep_set_access_flags(vma, addr, vmf->pte, entry, 0))
-   update_mmu_cache(vma, addr, vmf->pte);
+   ptep_set_access_flags(vma, addr, vmf->pte, entry, 0);
+   update_mmu_cache(vma, addr, vmf->pte);
}
@@ -2618,8 +2618,8 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
entry = pte_mkyoung(vmf->orig_pte);
entry = maybe_mkwrite(pte_mkdirty(entry), vma);
-   if (ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1))
-   update_mmu_cache(vma, vmf->address, vmf->pte);
+   ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1);
+   update_mmu_cache(vma, vmf->address, vmf->pte);
pte_unmap_unlock(vmf->pte, vmf->ptl);
 }



> 
> Would you have time to add some instrumentation into update_mmu_cache()
> (maybe a tracepoint) and see what effect this change has upon the
> frequency at which update_mmu_cache() is called for a selection of
> workloads?  And add this info to the changelog to set minds at ease?
OK, I will add some instrumentation data in the changelog.

 



Re: [PATCH v1 2/2] mfd: Introduce QTI I2C PMIC controller

2020-05-19 Thread Lee Jones
On Tue, 19 May 2020, Guru Das Srinagesh wrote:

> On Fri, May 15, 2020 at 11:45:20AM +0100, Lee Jones wrote:
> > On Thu, 30 Apr 2020, Guru Das Srinagesh wrote:
> > 
> > > On Wed, Apr 29, 2020 at 08:50:10AM +0100, Lee Jones wrote:
> > > > On Tue, 28 Apr 2020, Guru Das Srinagesh wrote:
> > > > 
> > > > > The Qualcomm Technologies, Inc. I2C PMIC Controller is used by
> > > > > multi-function PMIC devices which communicate over the I2C bus.  The
> > > > > controller enumerates all child nodes as platform devices, and
> > > > > instantiates a regmap interface for them to communicate over the I2C
> > > > > bus.
> > > > > 
> > > > > The controller also controls interrupts for all of the children 
> > > > > platform
> > > > > devices.  The controller handles the summary interrupt by deciphering
> > > > > which peripheral triggered the interrupt, and which of the peripheral
> > > > > interrupts were triggered.  Finally, it calls the interrupt handlers 
> > > > > for
> > > > > each of the virtual interrupts that were registered.
> > > > > 
> > > > > Nicholas Troast is the original author of this driver.
> > > > > 
> > > > > Signed-off-by: Guru Das Srinagesh 
> > > > > ---
> > > > >  drivers/mfd/Kconfig |  11 +
> > > > >  drivers/mfd/Makefile|   1 +
> > > > >  drivers/mfd/qcom-i2c-pmic.c | 737 
> > > > > 
> > > > 
> > > > The vast majority of this driver deals with IRQ handling.  Why can't
> > > > this be split out into its own IRQ Chip driver and moved to
> > > > drivers/irqchip?
> > > 
> > > There appear to be quite a few in-tree MFD drivers that register IRQ
> > > controllers, like this driver does:
> > > 
> > > > $ grep --exclude-dir=.git -rnE "irq_domain_(add|create).+\(" drivers/mfd | wc -l
> > > > 23
> > > 
> > > As a further example, drivers/mfd/stpmic1.c closely resembles this
> > > driver in that it uses both devm_regmap_add_irq_chip() as well as
> > > devm_of_platform_populate().
> > > 
> > > As such, it seems like this driver is in line with some of the
> > > architectural choices that have been accepted in already-merged drivers.
> > > Could you please elaborate on your concerns?
> > 
> > It is true that *basic* IRQ domain support has been added to these
> > drivers in the past.  However, IMHO the support added to this driver
> > goes beyond those realms such that it would justify a driver of its
> > own.
> 
> I am exploring an option to see if the regmap-irq APIs may be used in
> this driver, similar to stpmic1.c. Just to let you know, it might be a
> few days before I am able to post my next patchset as I'll have to make
> the necessary changes and test them out first.

Take your time.

The next release is due imminently, so you have as long as you need.

-- 
Lee Jones [李琼斯]
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH v3 59/75] x86/sev-es: Handle MONITOR/MONITORX Events

2020-05-19 Thread Sean Christopherson
On Tue, Apr 28, 2020 at 05:17:09PM +0200, Joerg Roedel wrote:
> From: Tom Lendacky 
> 
> Implement a handler for #VC exceptions caused by MONITOR and MONITORX
> instructions.
> 
> Signed-off-by: Tom Lendacky 
> [ jroe...@suse.de: Adapt to #VC handling infrastructure ]
> Co-developed-by: Joerg Roedel 
> Signed-off-by: Joerg Roedel 
> ---
>  arch/x86/kernel/sev-es.c | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/arch/x86/kernel/sev-es.c b/arch/x86/kernel/sev-es.c
> index 601554e6360f..1a961714cd1b 100644
> --- a/arch/x86/kernel/sev-es.c
> +++ b/arch/x86/kernel/sev-es.c
> @@ -824,6 +824,22 @@ static enum es_result vc_handle_rdpmc(struct ghcb *ghcb, struct es_em_ctxt *ctxt
>   return ES_OK;
>  }
>  
> +static enum es_result vc_handle_monitor(struct ghcb *ghcb,
> + struct es_em_ctxt *ctxt)
> +{
> + phys_addr_t monitor_pa;
> + pgd_t *pgd;
> +
> + pgd = __va(read_cr3_pa());
> + monitor_pa = vc_slow_virt_to_phys(ghcb, ctxt->regs->ax);
> +
> + ghcb_set_rax(ghcb, monitor_pa);
> + ghcb_set_rcx(ghcb, ctxt->regs->cx);
> + ghcb_set_rdx(ghcb, ctxt->regs->dx);
> +
> + return sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_MONITOR, 0, 0);

Why?  If SVM has the same behavior as VMX, the MONITOR will be disarmed on
VM-Enter, i.e. the VMM can't do anything useful for MONITOR/MWAIT.  I
assume that's the case given that KVM emulates MONITOR/MWAIT as NOPs on
SVM.

> +}
> +
>  static enum es_result vc_handle_exitcode(struct es_em_ctxt *ctxt,
>struct ghcb *ghcb,
>unsigned long exit_code)
> @@ -860,6 +876,9 @@ static enum es_result vc_handle_exitcode(struct es_em_ctxt *ctxt,
>   case SVM_EXIT_WBINVD:
>   result = vc_handle_wbinvd(ghcb, ctxt);
>   break;
> + case SVM_EXIT_MONITOR:
> + result = vc_handle_monitor(ghcb, ctxt);
> + break;
>   case SVM_EXIT_NPF:
>   result = vc_handle_mmio(ghcb, ctxt);
>   break;
> -- 
> 2.17.1
> 


Re: [PATCH 1/2] MAINTAINERS: Add entry for ROHM power management ICs

2020-05-19 Thread Lee Jones
On Wed, 20 May 2020, Matti Vaittinen wrote:

> Add entry for maintaining power management IC drivers for ROHM
> BD71837, BD71847, BD71850, BD71828, BD71878, BD70528 and BD99954.
> 
> Signed-off-by: Matti Vaittinen 
> ---
>  MAINTAINERS | 30 ++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index ecc0749810b0..63a2ca70540e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -14490,6 +14490,12 @@ L:   linux-ser...@vger.kernel.org
>  S:   Odd Fixes
>  F:   drivers/tty/serial/rp2.*
>  
> +ROHM BD99954 CHARGER IC
> +R:   Matti Vaittinen 
> +S:   Supported
> +F:   drivers/power/supply/bd99954-charger.c
> +F:   drivers/power/supply/bd99954-charger.h
> +
>  ROHM BH1750 AMBIENT LIGHT SENSOR DRIVER
>  M:   Tomasz Duszynski 
>  S:   Maintained
> @@ -14507,6 +14513,30 @@ F:   drivers/mfd/bd9571mwv.c
>  F:   drivers/regulator/bd9571mwv-regulator.c
>  F:   include/linux/mfd/bd9571mwv.h
>  
> +ROHM POWER MANAGEMENT IC DEVICE DRIVERS
> +R:   Matti Vaittinen 
> +S:   Supported
> +F:   Documentation/devicetree/bindings/mfd/rohm,bd70528-pmic.txt
> +F:   Documentation/devicetree/bindings/regulator/rohm,bd70528-regulator.txt
> +F:   drivers/clk/clk-bd718x7.c
> +F:   drivers/gpio/gpio-bd70528.c
> +F:   drivers/gpio/gpio-bd71828.c
> +F:   drivers/mfd/rohm-bd70528.c
> +F:   drivers/mfd/rohm-bd71828.c
> +F:   drivers/mfd/rohm-bd718x7.c
> +F:   drivers/power/supply/bd70528-charger.c
> +F:   drivers/regulator/bd70528-regulator.c
> +F:   drivers/regulator/bd71828-regulator.c
> +F:   drivers/regulator/bd718x7-regulator.c
> +F:   drivers/regulator/rohm-regulator.c
> +F:   drivers/rtc/rtc-bd70528.c
> +F:   drivers/watchdog/bd70528_wdt.c
> +F:   include/linux/mfd/rohm-shared.h
> +F:   include/linux/mfd/rohm-bd71828.h
> +F:   include/linux/mfd/rohm-bd70528.h
> +F:   include/linux/mfd/rohm-generic.h
> +F:   include/linux/mfd/rohm-bd718x7.h

How small can you get this list using wildcards?

+F: drivers/clk/clk-bd718x7.c
+F: drivers/gpio/gpio-bd7*
+F: drivers/mfd/rohm-bd7*
+F: drivers/power/supply/bd7*
+F: drivers/regulator/bd7*
+F: drivers/regulator/rohm-regulator.c
+F: drivers/rtc/rtc-bd7*
+F: drivers/watchdog/bd7*
+F: include/linux/mfd/rohm-*

Or

+F: drivers/*/bd7*
+F: drivers/*/*-bd7*
+F: drivers/*/rohm-*
+F: drivers/*/*rohm-*
+F: include/linux/*/rohm-*
+F: include/linux/*/*rohm-*

Not checked either of these.  They are just an example.
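
One way to check candidate patterns like these before posting is to run them against the explicit file list. The following is a minimal userspace sketch, assuming that POSIX fnmatch(3) with FNM_PATHNAME is a close enough stand-in for how the MAINTAINERS "F:" entries treat '*'; that equivalence is an assumption, and the helper names covered() and count_uncovered() are made up for illustration.

```c
#include <fnmatch.h>
#include <stddef.h>

/*
 * Return 1 if path matches at least one of the n patterns, else 0.
 * FNM_PATHNAME keeps '*' from matching '/', which is assumed here to
 * approximate the MAINTAINERS "F:" rules.
 */
static int covered(const char *path, const char *const *pats, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (fnmatch(pats[i], path, FNM_PATHNAME) == 0)
			return 1;
	return 0;
}

/* Count how many of the n_paths paths are matched by no pattern. */
static size_t count_uncovered(const char *const *paths, size_t n_paths,
			      const char *const *pats, size_t n_pats)
{
	size_t i, missing = 0;

	for (i = 0; i < n_paths; i++)
		if (!covered(paths[i], pats, n_pats))
			missing++;
	return missing;
}
```

Under these assumptions, drivers/*/bd7* matches drivers/regulator/bd70528-regulator.c but not drivers/power/supply/bd70528-charger.c, because the latter sits one directory deeper.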

>  ROSE NETWORK LAYER
>  M:   Ralf Baechle 
>  L:   linux-h...@vger.kernel.org

-- 
Lee Jones [李琼斯]
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH v3 51/75] x86/sev-es: Handle MMIO events

2020-05-19 Thread Sean Christopherson
On Tue, Apr 28, 2020 at 05:17:01PM +0200, Joerg Roedel wrote:
> From: Tom Lendacky 
> 
> Add handler for VC exceptions caused by MMIO intercepts. These
> intercepts come along as nested page faults on pages with reserved
> bits set.
> 
> Signed-off-by: Tom Lendacky 
> [ jroe...@suse.de: Adapt to VC handling framework ]
> Co-developed-by: Joerg Roedel 
> Signed-off-by: Joerg Roedel 
> ---

...

> diff --git a/arch/x86/kernel/sev-es.c b/arch/x86/kernel/sev-es.c
> index f4ce3b475464..e3662723ed76 100644
> --- a/arch/x86/kernel/sev-es.c
> +++ b/arch/x86/kernel/sev-es.c
> @@ -294,6 +294,25 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
>   return ES_EXCEPTION;
>  }
>  
> +static phys_addr_t vc_slow_virt_to_phys(struct ghcb *ghcb, unsigned long vaddr)
> +{
> + unsigned long va = (unsigned long)vaddr;
> + unsigned int level;
> + phys_addr_t pa;
> + pgd_t *pgd;
> + pte_t *pte;
> +
> + pgd = pgd_offset(current->active_mm, va);
> + pte = lookup_address_in_pgd(pgd, va, &level);
> + if (!pte)
> + return 0;

'0' is a valid physical address.  It happens to be reserved in the kernel
thanks to L1TF, but using '0' as an error code is ugly.  Not to mention
none of the callers actually check the result.
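
One way to avoid overloading '0' is to report success separately and hand the address back through an out-parameter. The sketch below is a userspace illustration only; lookup_pa(), struct fake_mapping and the local type and macro definitions are hypothetical stand-ins, not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* stand-in for the kernel type */

#define PAGE_SHIFT	12
#define PAGE_MASK	(~((UINT64_C(1) << PAGE_SHIFT) - 1))

/* Hypothetical one-entry "page table": maps one virtual page to a pfn. */
struct fake_mapping {
	uint64_t vpage;		/* page-aligned virtual address */
	uint64_t pfn;		/* page frame number; 0 is a valid value */
	bool present;
};

/*
 * Translate va using map.  Returns true and stores the result in *pa on
 * success; returns false and leaves *pa untouched if nothing is mapped.
 * Physical address 0 is therefore distinguishable from a failed lookup.
 */
static bool lookup_pa(const struct fake_mapping *map, uint64_t va,
		      phys_addr_t *pa)
{
	if (!map->present || (va & PAGE_MASK) != map->vpage)
		return false;

	*pa = (map->pfn << PAGE_SHIFT) | (va & ~PAGE_MASK);
	return true;
}
```

With this shape, callers have to look at the return value before they can use the address at all.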

> +
> + pa = (phys_addr_t)pte_pfn(*pte) << PAGE_SHIFT;
> + pa |= va & ~page_level_mask(level);
> +
> + return pa;
> +}


[PATCH net-next v3 2/2] net: phy: tja11xx: add SQI support

2020-05-19 Thread Oleksij Rempel
This patch implements reading of the Signal Quality Index for better
cable/link troubleshooting.

Signed-off-by: Oleksij Rempel 
---
 drivers/net/phy/nxp-tja11xx.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/drivers/net/phy/nxp-tja11xx.c b/drivers/net/phy/nxp-tja11xx.c
index 0d4f9067ca715..1e79c30ca81a5 100644
--- a/drivers/net/phy/nxp-tja11xx.c
+++ b/drivers/net/phy/nxp-tja11xx.c
@@ -53,6 +53,8 @@
 
 #define MII_COMMSTAT   23
 #define MII_COMMSTAT_LINK_UP   BIT(15)
+#define MII_COMMSTAT_SQI_STATE GENMASK(7, 5)
+#define MII_COMMSTAT_SQI_MAX   7
 
 #define MII_GENSTAT24
 #define MII_GENSTAT_PLL_LOCKED BIT(14)
@@ -329,6 +331,22 @@ static int tja11xx_read_status(struct phy_device *phydev)
return 0;
 }
 
+static int tja11xx_get_sqi(struct phy_device *phydev)
+{
+   int ret;
+
+   ret = phy_read(phydev, MII_COMMSTAT);
+   if (ret < 0)
+   return ret;
+
+   return FIELD_GET(MII_COMMSTAT_SQI_STATE, ret);
+}
+
+static int tja11xx_get_sqi_max(struct phy_device *phydev)
+{
+   return MII_COMMSTAT_SQI_MAX;
+}
+
 static int tja11xx_get_sset_count(struct phy_device *phydev)
 {
return ARRAY_SIZE(tja11xx_hw_stats);
@@ -683,6 +701,8 @@ static struct phy_driver tja11xx_driver[] = {
.config_aneg= tja11xx_config_aneg,
.config_init= tja11xx_config_init,
.read_status= tja11xx_read_status,
+   .get_sqi= tja11xx_get_sqi,
+   .get_sqi_max= tja11xx_get_sqi_max,
.suspend= genphy_suspend,
.resume = genphy_resume,
.set_loopback   = genphy_loopback,
@@ -699,6 +719,8 @@ static struct phy_driver tja11xx_driver[] = {
.config_aneg= tja11xx_config_aneg,
.config_init= tja11xx_config_init,
.read_status= tja11xx_read_status,
+   .get_sqi= tja11xx_get_sqi,
+   .get_sqi_max= tja11xx_get_sqi_max,
.suspend= genphy_suspend,
.resume = genphy_resume,
.set_loopback   = genphy_loopback,
@@ -715,6 +737,8 @@ static struct phy_driver tja11xx_driver[] = {
.config_aneg= tja11xx_config_aneg,
.config_init= tja11xx_config_init,
.read_status= tja11xx_read_status,
+   .get_sqi= tja11xx_get_sqi,
+   .get_sqi_max= tja11xx_get_sqi_max,
.match_phy_device = tja1102_p0_match_phy_device,
.suspend= genphy_suspend,
.resume = genphy_resume,
@@ -736,6 +760,8 @@ static struct phy_driver tja11xx_driver[] = {
.config_aneg= tja11xx_config_aneg,
.config_init= tja11xx_config_init,
.read_status= tja11xx_read_status,
+   .get_sqi= tja11xx_get_sqi,
+   .get_sqi_max= tja11xx_get_sqi_max,
.match_phy_device = tja1102_p1_match_phy_device,
.suspend= genphy_suspend,
.resume = genphy_resume,
-- 
2.26.2
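
For readers less familiar with GENMASK()/FIELD_GET(), the SQI extraction in tja11xx_get_sqi() amounts to masking bits 7:5 of the register value and shifting them down. A minimal userspace sketch, with sqi_from_commstat() and the local macros as hypothetical stand-ins for the kernel helpers:

```c
#include <stdint.h>

/* Bits 7:5, i.e. the value GENMASK(7, 5) expands to. */
#define COMMSTAT_SQI_MASK	0x00e0u
#define COMMSTAT_SQI_SHIFT	5

/*
 * Mirrors the error convention of tja11xx_get_sqi(): a negative input
 * stands for a failed register read and is passed through unchanged.
 */
static int sqi_from_commstat(int reg)
{
	if (reg < 0)
		return reg;

	return (int)(((uint32_t)reg & COMMSTAT_SQI_MASK) >> COMMSTAT_SQI_SHIFT);
}
```

The result is always in the range 0..7 for a successful read, which is why MII_COMMSTAT_SQI_MAX is 7.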



[PATCH net-next v3 0/2] provide KAPI for SQI

2020-05-19 Thread Oleksij Rempel
These patches extend the ethtool netlink interface to export the Signal
Quality Index (SQI). SQI is provided by 100Base-T1 PHYs and can be used for
cable diagnostics. Unlike a typical cable test, this value can only be
used after the link is established.

changes v3:
- rename __ethtool_get_sqi* to linkstate_get_sqi* and move these
  functions to net/ethtool/linkstate.c
- protect linkstate_get_sqi* with locking

changes v2:
- use u32 instead of u8 for SQI
- add SQI_MAX field and callbacks
- some style fixes in the rst.
- do not convert index to shifted index.

Oleksij Rempel (2):
  ethtool: provide UAPI for PHY Signal Quality Index (SQI)
  net: phy: tja11xx: add SQI support

 Documentation/networking/ethtool-netlink.rst |  6 +-
 drivers/net/phy/nxp-tja11xx.c| 26 +++
 include/linux/phy.h  |  2 +
 include/uapi/linux/ethtool_netlink.h |  2 +
 net/ethtool/linkstate.c  | 75 +++-
 5 files changed, 108 insertions(+), 3 deletions(-)

-- 
2.26.2



[PATCH net-next v3 1/2] ethtool: provide UAPI for PHY Signal Quality Index (SQI)

2020-05-19 Thread Oleksij Rempel
Signal Quality Index is a mandatory value required by "OPEN Alliance
SIG" for the 100Base-T1 PHYs [1]. This indicator can be used for cable
integrity diagnostics and for investigating other noise sources, and it is
implemented by at least two vendors: NXP[2] and TI[3].

[1] http://www.opensig.org/download/document/218/Advanced_PHY_features_for_automotive_Ethernet_V1.0.pdf
[2] https://www.nxp.com/docs/en/data-sheet/TJA1100.pdf
[3] https://www.ti.com/product/DP83TC811R-Q1

Signed-off-by: Oleksij Rempel 
---
 Documentation/networking/ethtool-netlink.rst |  6 +-
 include/linux/phy.h  |  2 +
 include/uapi/linux/ethtool_netlink.h |  2 +
 net/ethtool/linkstate.c  | 75 +++-
 4 files changed, 82 insertions(+), 3 deletions(-)

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index eed46b6aa07df..7e651ea33eabb 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -454,10 +454,12 @@ Request contents:
 
 Kernel response contents:
 
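-  ====================================  ======  ==========================
+  ====================================  ======  ============================
   ``ETHTOOL_A_LINKSTATE_HEADER``        nested  reply header
   ``ETHTOOL_A_LINKSTATE_LINK``          bool    link state (up/down)
-  ====================================  ======  ==========================
+  ``ETHTOOL_A_LINKSTATE_SQI``           u32     Current Signal Quality Index
+  ``ETHTOOL_A_LINKSTATE_SQI_MAX``       u32     Max support SQI value
+  ====================================  ======  ============================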
 
 For most NIC drivers, the value of ``ETHTOOL_A_LINKSTATE_LINK`` returns
 carrier flag provided by ``netif_carrier_ok()`` but there are drivers which
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 59344db43fcb1..950ba479754bd 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -706,6 +706,8 @@ struct phy_driver {
struct ethtool_tunable *tuna,
const void *data);
int (*set_loopback)(struct phy_device *dev, bool enable);
+   int (*get_sqi)(struct phy_device *dev);
+   int (*get_sqi_max)(struct phy_device *dev);
 };
 #define to_phy_driver(d) container_of(to_mdio_common_driver(d), \
  struct phy_driver, mdiodrv)
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 2881af411f761..e6f109b76c9aa 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -232,6 +232,8 @@ enum {
ETHTOOL_A_LINKSTATE_UNSPEC,
ETHTOOL_A_LINKSTATE_HEADER, /* nest - _A_HEADER_* */
ETHTOOL_A_LINKSTATE_LINK,   /* u8 */
+   ETHTOOL_A_LINKSTATE_SQI,/* u32 */
+   ETHTOOL_A_LINKSTATE_SQI_MAX,/* u32 */
 
/* add new constants above here */
__ETHTOOL_A_LINKSTATE_CNT,
diff --git a/net/ethtool/linkstate.c b/net/ethtool/linkstate.c
index 2740cde0a182b..7f47ba89054e1 100644
--- a/net/ethtool/linkstate.c
+++ b/net/ethtool/linkstate.c
@@ -2,6 +2,7 @@
 
 #include "netlink.h"
 #include "common.h"
+#include 
 
 struct linkstate_req_info {
struct ethnl_req_info   base;
@@ -10,6 +11,8 @@ struct linkstate_req_info {
 struct linkstate_reply_data {
struct ethnl_reply_data base;
int link;
+   int sqi;
+   int sqi_max;
 };
 
 #define LINKSTATE_REPDATA(__reply_base) \
@@ -20,8 +23,46 @@ linkstate_get_policy[ETHTOOL_A_LINKSTATE_MAX + 1] = {
[ETHTOOL_A_LINKSTATE_UNSPEC]= { .type = NLA_REJECT },
[ETHTOOL_A_LINKSTATE_HEADER]= { .type = NLA_NESTED },
[ETHTOOL_A_LINKSTATE_LINK]  = { .type = NLA_REJECT },
+   [ETHTOOL_A_LINKSTATE_SQI]   = { .type = NLA_REJECT },
+   [ETHTOOL_A_LINKSTATE_SQI_MAX]   = { .type = NLA_REJECT },
 };
 
+static int linkstate_get_sqi(struct net_device *dev)
+{
+   struct phy_device *phydev = dev->phydev;
+   int ret;
+
+   if (!phydev)
+   return -EOPNOTSUPP;
+
+   mutex_lock(&phydev->lock);
+   if (!phydev->drv || !phydev->drv->get_sqi)
+   ret = -EOPNOTSUPP;
+   else
+   ret = phydev->drv->get_sqi(phydev);
+   mutex_unlock(&phydev->lock);
+
+   return ret;
+}
+
+static int linkstate_get_sqi_max(struct net_device *dev)
+{
+   struct phy_device *phydev = dev->phydev;
+   int ret;
+
+   if (!phydev)
+   return -EOPNOTSUPP;
+
+   mutex_lock(&phydev->lock);
+   if (!phydev->drv || !phydev->drv->get_sqi_max)
+   ret = -EOPNOTSUPP;
+   else
+   ret = phydev->drv->get_sqi_max(phydev);
+   mutex_unlock(&phydev->lock);
+
+

Re: [PATCH v2] firewire: Remove function callback casts

2020-05-19 Thread Takashi Sakamoto
Hi,

On Tue, May 19, 2020 at 07:34:25PM +0200, Oscar Carter wrote:
> In an effort to enable -Wcast-function-type in the top-level Makefile to
> support Control Flow Integrity builds, remove all the function callback
> casts.
> 
> To do this, modify the "fw_iso_context_create" function prototype adding
> a new parameter for the multichannel callback. Also, fix all the
> function calls accordingly.
> 
> In the "fw_iso_context_create" function return an error code if both
> callback parameters are NULL and also set the "ctx->callback.sc"
> explicitly to NULL in this case. It is not necessary to set the
> "ctx->callback.mc" variable to NULL because it and "ctx->callback.sc" form
> a union, and setting one implies setting the other one to the same value.
> 
> Signed-off-by: Oscar Carter 
> ---
> Changelog v1->v2
> -Explicitly set the "ctx->callback.sc" variable to NULL and return an error
>  code in "fw_iso_context_create" function if both callback parameters are
>  NULL as Lev R. Oshvang suggested.
> -Modify the commit changelog accordingly.
> 
>  drivers/firewire/core-cdev.c| 12 +++-
>  drivers/firewire/core-iso.c | 14 --
>  drivers/firewire/net.c  |  2 +-
>  drivers/media/firewire/firedtv-fw.c |  3 ++-
>  include/linux/firewire.h|  3 ++-
>  sound/firewire/amdtp-stream.c   |  2 +-
>  sound/firewire/isight.c |  4 ++--
>  7 files changed, 27 insertions(+), 13 deletions(-)

I'm an author of the ALSA firewire stack; thanks for the patch. I agree with
your intention to remove the function callback casts for CFI builds.

In practice, an isochronous context with FW_ISO_CONTEXT_RECEIVE_MULTICHANNEL
is never used by in-kernel drivers. So I propose leaving the current
kernel API (fw_iso_context_create() with fw_iso_callback_t) as is and
instead adding a new kernel API for the multichannel context (e.g.
fw_iso_mc_context_create() with fw_iso_mc_callback_t). This leaves
current drivers untouched and confines the change to the firewire-core
driver, so the existing kernel API does not change.
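
To make the shape of this proposal concrete, here is a minimal userspace sketch of one context type holding a union of two differently typed callbacks, with one constructor per callback type so that no function pointer cast is needed. All names (iso_ctx, iso_ctx_init_sc(), iso_ctx_init_mc() and so on) and the simplified signatures are hypothetical; they are not the firewire-core API.

```c
#include <stddef.h>

struct iso_ctx;

/* Two callback types with different signatures (simplified). */
typedef void (*sc_callback_t)(struct iso_ctx *ctx, unsigned int cycle,
			      void *data);
typedef void (*mc_callback_t)(struct iso_ctx *ctx, unsigned long completed,
			      void *data);

enum iso_ctx_type { ISO_CTX_SINGLE, ISO_CTX_MULTICHANNEL };

struct iso_ctx {
	enum iso_ctx_type type;
	union {
		sc_callback_t sc;
		mc_callback_t mc;
	} callback;
	void *data;
};

/* Existing-style constructor: single-channel callback only. */
static int iso_ctx_init_sc(struct iso_ctx *ctx, sc_callback_t cb, void *data)
{
	if (!cb)
		return -1;
	ctx->type = ISO_CTX_SINGLE;
	ctx->callback.sc = cb;
	ctx->data = data;
	return 0;
}

/* Separate constructor for the multichannel case; no cast required. */
static int iso_ctx_init_mc(struct iso_ctx *ctx, mc_callback_t cb, void *data)
{
	if (!cb)
		return -1;
	ctx->type = ISO_CTX_MULTICHANNEL;
	ctx->callback.mc = cb;
	ctx->data = data;
	return 0;
}

/* Dispatch through the union member that matches the context type. */
static void iso_ctx_complete(struct iso_ctx *ctx, unsigned long value)
{
	if (ctx->type == ISO_CTX_MULTICHANNEL)
		ctx->callback.mc(ctx, value, ctx->data);
	else
		ctx->callback.sc(ctx, (unsigned int)value, ctx->data);
}

/* Example callbacks: each adds to an int counter passed via data. */
static void example_sc(struct iso_ctx *ctx, unsigned int cycle, void *data)
{
	(void)ctx;
	(void)cycle;
	*(int *)data += 1;
}

static void example_mc(struct iso_ctx *ctx, unsigned long completed,
		       void *data)
{
	(void)ctx;
	*(int *)data += (int)completed;
}
```

Existing callers keep using the single-channel constructor unchanged; only the one multichannel user would switch to the second entry point.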

I will post two patches for this proposal later. I'd like you to review
them, and I'd be glad to receive your comments.


Regards

Takashi Sakamoto


Re: [PATCH v2 1/2] hwrng: iproc-rng200 - Set the quality value

2020-05-19 Thread Stephan Mueller
On Tuesday, 19 May 2020, 23:25:51 CEST, Łukasz Stelmach wrote:

Hi Łukasz,

> The value was estimated with ea_iid[1] on 10485760 bytes read from
> the RNG via /dev/hwrng. The min-entropy value calculated using the most
> common value estimate (NIST SP 800-90B[2], section 6.3.1) was 7.964464.

I am sorry, but I think I did not make myself clear: testing post-processed
random numbers with statistical tools does NOT give any idea about the
entropy rate. Thus, all that was assessed is the proper implementation of
the post-processing operation, not the actual noise source.

What needs to happen is that we get access to raw, unconditioned data from
the noise source and analyze that data with the statistical methods.

Ciao
Stephan




[tip:core/kprobes] BUILD SUCCESS 66e9b0717102507e64f638790eaece88765cc9e5

2020-05-19 Thread kbuild test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git  core/kprobes
branch HEAD: 66e9b0717102507e64f638790eaece88765cc9e5  kprobes: Prevent probes in .noinstr.text section

elapsed time: 491m

configs tested: 98
configs skipped: 1

The following configs have been built successfully.
More configs may be tested in the coming days.

arm         defconfig
arm         allyesconfig
arm         allmodconfig
arm         allnoconfig
arm64       allyesconfig
arm64       defconfig
arm64       allmodconfig
arm64       allnoconfig
sparc       allyesconfig
mips        allyesconfig
m68k        allyesconfig
i386        allnoconfig
i386        defconfig
i386        debian-10.3
i386        allyesconfig
ia64        allmodconfig
ia64        defconfig
ia64        allnoconfig
ia64        allyesconfig
m68k        allmodconfig
m68k        allnoconfig
m68k        sun3_defconfig
m68k        defconfig
nds32       defconfig
nds32       allnoconfig
csky        allyesconfig
csky        defconfig
alpha       defconfig
alpha       allyesconfig
nios2       defconfig
nios2       allyesconfig
openrisc    defconfig
c6x         allyesconfig
c6x         allnoconfig
openrisc    allyesconfig
xtensa      allyesconfig
h8300       allyesconfig
h8300       allmodconfig
xtensa      defconfig
arc         defconfig
arc         allyesconfig
sh          allmodconfig
sh          allnoconfig
microblaze  allnoconfig
mips        allnoconfig
mips        allmodconfig
parisc      allnoconfig
parisc      defconfig
parisc      allyesconfig
parisc      allmodconfig
powerpc     defconfig
powerpc     allyesconfig
powerpc     rhel-kconfig
powerpc     allmodconfig
powerpc     allnoconfig
i386        randconfig-a006-20200519
i386        randconfig-a005-20200519
i386        randconfig-a001-20200519
i386        randconfig-a003-20200519
i386        randconfig-a004-20200519
i386        randconfig-a002-20200519
x86_64      randconfig-a003-20200519
x86_64      randconfig-a005-20200519
x86_64      randconfig-a004-20200519
x86_64      randconfig-a006-20200519
x86_64      randconfig-a002-20200519
x86_64      randconfig-a001-20200519
i386        randconfig-a012-20200519
i386        randconfig-a014-20200519
i386        randconfig-a016-20200519
i386        randconfig-a011-20200519
i386        randconfig-a015-20200519
i386        randconfig-a013-20200519
riscv       allyesconfig
riscv       allnoconfig
riscv       defconfig
riscv       allmodconfig
s390        allyesconfig
s390        allnoconfig
s390        allmodconfig
s390        defconfig
x86_64      defconfig
sparc       defconfig
sparc64     defconfig
sparc64     allnoconfig
sparc64     allyesconfig
sparc64     allmodconfig
um          allmodconfig
um          allnoconfig
um          allyesconfig
um          defconfig
x86_64      rhel
x86_64      rhel-7.6
x86_64      rhel-7.6-kselftests
x86_64      rhel-7.2-clear
x86_64      lkp
x86_64      fedora-25
x86_64      kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[tip:core/rcu] BUILD SUCCESS b1fcf9b83c4149c63d1e0c699e85f93cbe28e211

2020-05-19 Thread kbuild test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git  core/rcu
branch HEAD: b1fcf9b83c4149c63d1e0c699e85f93cbe28e211  rcu: Provide __rcu_is_watching()

elapsed time: 491m

configs tested: 92
configs skipped: 1

The following configs have been built successfully.
More configs may be tested in the coming days.

arm         defconfig
arm         allyesconfig
arm         allmodconfig
arm         allnoconfig
arm64       allyesconfig
arm64       defconfig
arm64       allmodconfig
arm64       allnoconfig
sparc       allyesconfig
mips        allyesconfig
m68k        allyesconfig
i386        allnoconfig
i386        allyesconfig
i386        defconfig
i386        debian-10.3
ia64        allmodconfig
ia64        defconfig
ia64        allnoconfig
ia64        allyesconfig
m68k        allmodconfig
m68k        allnoconfig
m68k        sun3_defconfig
m68k        defconfig
nds32       defconfig
nds32       allnoconfig
csky        allyesconfig
csky        defconfig
alpha       defconfig
alpha       allyesconfig
xtensa      allyesconfig
h8300       allyesconfig
h8300       allmodconfig
xtensa      defconfig
nios2       defconfig
nios2       allyesconfig
openrisc    defconfig
c6x         allyesconfig
c6x         allnoconfig
openrisc    allyesconfig
arc         defconfig
arc         allyesconfig
sh          allmodconfig
sh          allnoconfig
microblaze  allnoconfig
mips        allnoconfig
mips        allmodconfig
parisc      allnoconfig
parisc      defconfig
parisc      allyesconfig
parisc      allmodconfig
powerpc     defconfig
powerpc     allyesconfig
powerpc     rhel-kconfig
powerpc     allmodconfig
powerpc     allnoconfig
i386        randconfig-a006-20200519
i386        randconfig-a005-20200519
i386        randconfig-a001-20200519
i386        randconfig-a003-20200519
i386        randconfig-a004-20200519
i386        randconfig-a002-20200519
x86_64      randconfig-a003-20200519
x86_64      randconfig-a005-20200519
x86_64      randconfig-a004-20200519
x86_64      randconfig-a006-20200519
x86_64      randconfig-a002-20200519
x86_64      randconfig-a001-20200519
riscv       allyesconfig
riscv       allnoconfig
riscv       defconfig
riscv       allmodconfig
s390        allyesconfig
s390        allnoconfig
s390        allmodconfig
s390        defconfig
x86_64      defconfig
sparc       defconfig
sparc64     defconfig
sparc64     allnoconfig
sparc64     allyesconfig
sparc64     allmodconfig
um          allnoconfig
um          defconfig
um          allmodconfig
um          allyesconfig
x86_64      rhel
x86_64      rhel-7.6
x86_64      rhel-7.6-kselftests
x86_64      rhel-7.2-clear
x86_64      lkp
x86_64      fedora-25
x86_64      kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


Re: [PATCH v3 25/75] x86/sev-es: Add support for handling IOIO exceptions

2020-05-19 Thread Sean Christopherson
On Tue, Apr 28, 2020 at 05:16:35PM +0200, Joerg Roedel wrote:
> From: Tom Lendacky 
> 
> Add support for decoding and handling #VC exceptions for IOIO events.
> 
> Signed-off-by: Tom Lendacky 
> [ jroe...@suse.de: Adapted code to #VC handling framework ]
> Co-developed-by: Joerg Roedel 
> Signed-off-by: Joerg Roedel 
> ---
>  arch/x86/boot/compressed/sev-es.c |  32 +
>  arch/x86/kernel/sev-es-shared.c   | 202 ++
>  2 files changed, 234 insertions(+)
> 
> diff --git a/arch/x86/boot/compressed/sev-es.c b/arch/x86/boot/compressed/sev-es.c
> index 1241697dd156..17765e471e28 100644
> --- a/arch/x86/boot/compressed/sev-es.c
> +++ b/arch/x86/boot/compressed/sev-es.c
> @@ -23,6 +23,35 @@

...

> +static enum es_result vc_handle_ioio(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
> +{
> + struct pt_regs *regs = ctxt->regs;
> + u64 exit_info_1, exit_info_2;
> + enum es_result ret;
> +
> + ret = vc_ioio_exitinfo(ctxt, &exit_info_1);
> + if (ret != ES_OK)
> + return ret;
> +
> + if (exit_info_1 & IOIO_TYPE_STR) {
> + int df = (regs->flags & X86_EFLAGS_DF) ? -1 : 1;
> + unsigned int io_bytes, exit_bytes;
> + unsigned int ghcb_count, op_count;
> + unsigned long es_base;
> + u64 sw_scratch;
> +
> + /*
> +  * For the string variants with rep prefix the amount of in/out
> +  * operations per #VC exception is limited so that the kernel
> +  * has a chance to take interrupts and re-schedule while the
> +  * instruction is emulated.

Doesn't this also suppress single-step #DBs?

> +  */
> + io_bytes   = (exit_info_1 >> 4) & 0x7;
> + ghcb_count = sizeof(ghcb->shared_buffer) / io_bytes;
> +
> + op_count= (exit_info_1 & IOIO_REP) ? regs->cx : 1;
> + exit_info_2 = min(op_count, ghcb_count);
> + exit_bytes  = exit_info_2 * io_bytes;
> +
> + es_base = insn_get_seg_base(ctxt->regs, INAT_SEG_REG_ES);
> +
> + if (!(exit_info_1 & IOIO_TYPE_IN)) {
> + ret = vc_insn_string_read(ctxt,
> +(void *)(es_base + regs->si),

SEV(-ES) is 64-bit only, why bother with the es_base charade?

> +ghcb->shared_buffer, io_bytes,
> +exit_info_2, df);

df handling is busted, it's always non-zero.  Same goes for the SI/DI
adjustments below.
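
To spell out the problem: df is computed as either -1 or 1, so a plain truth test on it always succeeds. A small self-contained sketch follows; the helper names are hypothetical, and X86_EFLAGS_DF is redefined locally as bit 10 for illustration.

```c
#include <stdbool.h>

#define X86_EFLAGS_DF	(1UL << 10)	/* direction flag, bit 10 of EFLAGS */

/* As in the quoted patch: yields -1 or 1, never 0. */
static int df_as_step(unsigned long flags)
{
	return (flags & X86_EFLAGS_DF) ? -1 : 1;
}

/* What "if (df)" effectively tests in the quoted code: always true. */
static bool patch_treats_as_backward(unsigned long flags)
{
	return df_as_step(flags) != 0;
}

/* One possible fix: test the flag itself rather than the step value. */
static bool is_backward(unsigned long flags)
{
	return (flags & X86_EFLAGS_DF) != 0;
}

/* Advance an index register by nbytes in the direction DF selects. */
static unsigned long advance(unsigned long reg, unsigned long nbytes,
			     unsigned long flags)
{
	return is_backward(flags) ? reg - nbytes : reg + nbytes;
}
```

Comparing the step against zero (df < 0) would work equally well; the point is only that -1 and 1 are both "true" in C.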

> + if (ret)
> + return ret;
> + }
> +
> + sw_scratch = __pa(ghcb) + offsetof(struct ghcb, shared_buffer);
> + ghcb_set_sw_scratch(ghcb, sw_scratch);
> + ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_IOIO,
> +exit_info_1, exit_info_2);
> + if (ret != ES_OK)
> + return ret;

Batching the memory accesses and I/O accesses separately is technically
wrong, e.g. a #DB on a memory access will result in bogus data being shown
in the debugger.  In practice it seems unlikely to matter, but I'm curious
as to why string I/O is supported in the first place.  I didn't think there
was that much string I/O in the kernel?

> +
> + /* Everything went well, write back results */
> + if (exit_info_1 & IOIO_TYPE_IN) {
> + ret = vc_insn_string_write(ctxt,
> + (void *)(es_base + regs->di),
> + ghcb->shared_buffer, io_bytes,
> + exit_info_2, df);
> + if (ret)
> + return ret;
> +
> + if (df)
> + regs->di -= exit_bytes;
> + else
> + regs->di += exit_bytes;
> + } else {
> + if (df)
> + regs->si -= exit_bytes;
> + else
> + regs->si += exit_bytes;
> + }
> +
> + if (exit_info_1 & IOIO_REP)
> + regs->cx -= exit_info_2;
> +
> + ret = regs->cx ? ES_RETRY : ES_OK;
> +
> + } else {
> + int bits = (exit_info_1 & 0x70) >> 1;
> + u64 rax = 0;
> +
> + if (!(exit_info_1 & IOIO_TYPE_IN))
> + rax = lower_bits(regs->ax, bits);
> +
> + ghcb_set_rax(ghcb, rax);
> +
> + ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_IOIO, exit_info_1, 0);
> + if (ret != ES_OK)
> + return ret;
> +
> + if (exit_info_1 & IOIO_TYPE_IN) {
> + if (!ghcb_is_valid_rax(ghcb))
> + return ES_VMM_ERROR;
> + regs->ax = lower_bits(ghcb

Re: [PATCH v4 2/4] kasan: record and print the free track

2020-05-19 Thread Walter Wu
On Wed, 2020-05-20 at 13:14 +0800, Walter Wu wrote:
> > On Wed, May 20, 2020 at 6:03 AM Walter Wu  wrote:
> > >
> > > > > On Tue, May 19, 2020 at 4:25 AM Walter Wu  wrote:
> > > > >
> > > > > > Move free track from slub alloc meta-data to slub free meta-data so
> > > > > > that struct kasan_free_meta is 16 bytes in size. This is a good
> > > > > > size because it is the minimal redzone size and a good alignment
> > > > > > value.
> > > > >
> > > > > For free track in generic KASAN, we do the modification in struct
> > > > > kasan_alloc_meta and kasan_free_meta:
> > > > > - remove free track from kasan_alloc_meta.
> > > > > - add free track into kasan_free_meta.
> > > > >
> > > > > [1]https://bugzilla.kernel.org/show_bug.cgi?id=198437
> > > > >
> > > > > Signed-off-by: Walter Wu 
> > > > > Suggested-by: Dmitry Vyukov 
> > > > > Cc: Andrey Ryabinin 
> > > > > Cc: Dmitry Vyukov 
> > > > > Cc: Alexander Potapenko 
> > > > > ---
> > > > >  mm/kasan/common.c  | 22 ++
> > > > >  mm/kasan/generic.c | 18 ++
> > > > >  mm/kasan/kasan.h   |  7 +++
> > > > >  mm/kasan/report.c  | 20 
> > > > >  mm/kasan/tags.c| 37 +
> > > > >  5 files changed, 64 insertions(+), 40 deletions(-)
> > > > >
> > > > > diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> > > > > index 8bc618289bb1..47b53912f322 100644
> > > > > --- a/mm/kasan/common.c
> > > > > +++ b/mm/kasan/common.c
> > > > > @@ -51,7 +51,7 @@ depot_stack_handle_t kasan_save_stack(gfp_t flags)
> > > > > return stack_depot_save(entries, nr_entries, flags);
> > > > >  }
> > > > >
> > > > > -static inline void set_track(struct kasan_track *track, gfp_t flags)
> > > > > +void kasan_set_track(struct kasan_track *track, gfp_t flags)
> > > > >  {
> > > > > track->pid = current->pid;
> > > > > track->stack = kasan_save_stack(flags);
> > > > > @@ -299,24 +299,6 @@ struct kasan_free_meta *get_free_info(struct 
> > > > > kmem_cache *cache,
> > > > > return (void *)object + cache->kasan_info.free_meta_offset;
> > > > >  }
> > > > >
> > > > > -
> > > > > -static void kasan_set_free_info(struct kmem_cache *cache,
> > > > > -   void *object, u8 tag)
> > > > > -{
> > > > > -   struct kasan_alloc_meta *alloc_meta;
> > > > > -   u8 idx = 0;
> > > > > -
> > > > > -   alloc_meta = get_alloc_info(cache, object);
> > > > > -
> > > > > -#ifdef CONFIG_KASAN_SW_TAGS_IDENTIFY
> > > > > -   idx = alloc_meta->free_track_idx;
> > > > > -   alloc_meta->free_pointer_tag[idx] = tag;
> > > > > -   alloc_meta->free_track_idx = (idx + 1) % KASAN_NR_FREE_STACKS;
> > > > > -#endif
> > > > > -
> > > > > -   set_track(&alloc_meta->free_track[idx], GFP_NOWAIT);
> > > > > -}
> > > > > -
> > > > >  void kasan_poison_slab(struct page *page)
> > > > >  {
> > > > > unsigned long i;
> > > > > @@ -492,7 +474,7 @@ static void *__kasan_kmalloc(struct kmem_cache 
> > > > > *cache, const void *object,
> > > > > KASAN_KMALLOC_REDZONE);
> > > > >
> > > > > if (cache->flags & SLAB_KASAN)
> > > > > > -   set_track(&get_alloc_info(cache, object)->alloc_track, flags);
> > > > > > +   kasan_set_track(&get_alloc_info(cache, object)->alloc_track, flags);
> > > > >
> > > > > return set_tag(object, tag);
> > > > >  }
> > > > > diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
> > > > > index 3372bdcaf92a..763d8a13e0ac 100644
> > > > > --- a/mm/kasan/generic.c
> > > > > +++ b/mm/kasan/generic.c
> > > > > @@ -344,3 +344,21 @@ void kasan_record_aux_stack(void *addr)
> > > > > alloc_info->aux_stack[1] = alloc_info->aux_stack[0];
> > > > > alloc_info->aux_stack[0] = kasan_save_stack(GFP_NOWAIT);
> > > > >  }
> > > > > +
> > > > > +void kasan_set_free_info(struct kmem_cache *cache,
> > > > > +   void *object, u8 tag)
> > > > > +{
> > > > > +   struct kasan_free_meta *free_meta;
> > > > > +
> > > > > +   free_meta = get_free_info(cache, object);
> > > > > +   kasan_set_track(&free_meta->free_track, GFP_NOWAIT);
> > > > > +}
> > > > > +
> > > > > +struct kasan_track *kasan_get_free_track(struct kmem_cache *cache,
> > > > > +   void *object, u8 tag)
> > > > > +{
> > > > > +   struct kasan_free_meta *free_meta;
> > > > > +
> > > > > +   free_meta = get_free_info(cache, object);
> > > > > +   return &free_meta->free_track;
> > > > > +}
> > > > > diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> > > > > index a7391bc83070..ad897ec36545 100644
> > > > > --- a/mm/kasan/kasan.h
> > > > > +++ b/mm/kasan/kasan.h
> > > > > @@ -127,6 +127,9 @@ struct kasan_free_meta {
> > > > >  * Otherwise it might be used for the allocator freelist.
> > > > >  */
> > > > > struct qlist_node quarantine_link;
> > > > > +#ifdef CONFIG_KASAN_GENERIC
> > > > > +   struct kasan_track free_t

Re: seccomp feature development

2020-05-19 Thread Aleksa Sarai
On 2020-05-19, Alexei Starovoitov  wrote:
> On Wed, May 20, 2020 at 11:20:45AM +1000, Aleksa Sarai wrote:
> > No it won't become copy_from_user(), nor will there be a TOCTOU race.
> > 
> > The idea is that seccomp will proactively copy the struct (and
> > recursively any of the struct pointers inside) before the syscall runs
> > -- as this is done by seccomp it doesn't require any copy_from_user()
> > primitives in cBPF. We then run the cBPF filter on the copied struct,
> > just like how cBPF programs currently operate on seccomp_data (how this
> > would be exposed to the cBPF program as part of the seccomp ABI is the
> > topic of discussion here).
> > 
> > Then, when the actual syscall code runs, the struct will have already
> > been copied and the syscall won't copy it again.
> 
> Let's take bpf syscall as an example.
> Are you suggesting that all of syscall logic of conditionally parsing
> the arguments will be copy-pasted into seccomp-syscall infra, then
> it will do copy_from_user() all the data and replace all aligned_u64
> in "union bpf_attr" with kernel copied pointers instead of user pointers
> and make all of bpf syscall's copy_from_user() actions to be conditional ?
> > If seccomp is on, use kernel pointers... if seccomp is off, do copy_from_user?
> And the same idea will be replicated for all syscalls?

This would be done optionally per-syscall. Only syscalls which want to
opt-in to such a mechanism (such as clone3 and openat2) would be
affected. Also, bpf is possibly the least-friendly syscall to pick as an
example of these types of filters -- openat2/clone3 are much simpler to
consider.

The point is that if we both agree that seccomp needs to have a way to
do "deep argument inspection" (filtering based on the struct argument to
a syscall), then some sort of caching mechanism is simply necessary to
solve the problem. Otherwise there's a trivial TOCTOU and seccomp
filtering for such syscalls would be rendered almost useless.
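To make the TOCTOU point concrete, here is a minimal userspace sketch, not kernel code. Every name in it (struct demo_how, DEMO_FLAG_FORBIDDEN, the demo_* functions) is hypothetical and only models the idea: in the racy variant the filter and the syscall body both read caller-controlled memory, while in the snapshot variant both work on a single copy.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an extensible syscall argument struct. */
struct demo_how {
	uint64_t flags;
	uint64_t mode;
};

#define DEMO_FLAG_FORBIDDEN 0x1ull

/* Hypothetical filter: reject any request carrying the forbidden flag. */
static bool demo_filter_allows(const struct demo_how *how)
{
	return !(how->flags & DEMO_FLAG_FORBIDDEN);
}

/* Stand-in for the syscall body: reports the flags it actually acted on. */
static uint64_t demo_syscall_body(const struct demo_how *how)
{
	return how->flags;
}

/*
 * Racy variant: filter and syscall body each read the caller-controlled
 * struct. The racer callback models another thread rewriting the struct
 * between the check and the use.
 */
static bool demo_run_racy(struct demo_how *user,
			  void (*racer)(struct demo_how *),
			  uint64_t *acted_on)
{
	if (!demo_filter_allows(user))
		return false;
	if (racer)
		racer(user);
	*acted_on = demo_syscall_body(user);
	return true;
}

/* Snapshot variant: copy once, then filter and syscall body use the copy. */
static bool demo_run_snapshot(struct demo_how *user,
			      void (*racer)(struct demo_how *),
			      uint64_t *acted_on)
{
	struct demo_how copy;

	memcpy(&copy, user, sizeof(copy));
	if (!demo_filter_allows(&copy))
		return false;
	if (racer)
		racer(user);	/* too late: nobody reads "user" again */
	*acted_on = demo_syscall_body(&copy);
	return true;
}

/* Example racer: sets the forbidden flag after the check has passed. */
static void demo_flip_flag(struct demo_how *user)
{
	user->flags |= DEMO_FLAG_FORBIDDEN;
}
```

With demo_flip_flag as the racer, demo_run_racy ends up acting on the forbidden flag even though the filter approved the call, while demo_run_snapshot acts only on what the filter saw.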

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH




[PATCH] iio: dac: ad5592r-base: Replace indio_dev->mlock with own device lock

2020-05-19 Thread Sergiu Cuciurean
As part of the general cleanup of indio_dev->mlock, this change replaces
it with a local lock on the device's state structure.
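The pattern can be sketched in userspace with pthreads. This is only an analogy under stated assumptions: struct demo_state, demo_hw_write() and demo_write_dac() are hypothetical names, and pthread_mutex_t stands in for the kernel's struct mutex. The point is that the lock lives in the driver's own state next to the data it protects.

```c
#include <pthread.h>
#include <stddef.h>

#define DEMO_NUM_CHANNELS 8

/* Hypothetical driver state: the lock sits beside the cache it guards. */
struct demo_state {
	pthread_mutex_t lock;
	unsigned int cached_dac[DEMO_NUM_CHANNELS];
};

static void demo_state_init(struct demo_state *st)
{
	pthread_mutex_init(&st->lock, NULL);
	for (int i = 0; i < DEMO_NUM_CHANNELS; i++)
		st->cached_dac[i] = 0;
}

/* Stand-in for the bus write; it always succeeds in this sketch. */
static int demo_hw_write(unsigned int chan, unsigned int val)
{
	(void)chan;
	(void)val;
	return 0;
}

/* Write a channel and update the cache under the state's own lock. */
static int demo_write_dac(struct demo_state *st, unsigned int chan,
			  unsigned int val)
{
	int ret;

	if (chan >= DEMO_NUM_CHANNELS)
		return -1;

	pthread_mutex_lock(&st->lock);
	ret = demo_hw_write(chan, val);
	if (!ret)
		st->cached_dac[chan] = val;
	pthread_mutex_unlock(&st->lock);

	return ret;
}
```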

Signed-off-by: Sergiu Cuciurean 
---
 drivers/iio/dac/ad5592r-base.c | 28 +++-
 drivers/iio/dac/ad5592r-base.h |  1 +
 2 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/drivers/iio/dac/ad5592r-base.c b/drivers/iio/dac/ad5592r-base.c
index e2110113e884..10109eb81db2 100644
--- a/drivers/iio/dac/ad5592r-base.c
+++ b/drivers/iio/dac/ad5592r-base.c
@@ -166,10 +166,10 @@ static int ad5592r_reset(struct ad5592r_state *st)
udelay(1);
gpiod_set_value(gpio, 1);
} else {
-   mutex_lock(&iio_dev->mlock);
+   mutex_lock(&st->lock);
/* Writing this magic value resets the device */
st->ops->reg_write(st, AD5592R_REG_RESET, 0xdac);
-   mutex_unlock(&iio_dev->mlock);
+   mutex_unlock(&st->lock);
}
 
udelay(250);
@@ -247,7 +247,7 @@ static int ad5592r_set_channel_modes(struct ad5592r_state *st)
}
}
 
-   mutex_lock(&iio_dev->mlock);
+   mutex_lock(&st->lock);
 
/* Pull down unused pins to GND */
ret = ops->reg_write(st, AD5592R_REG_PULLDOWN, pulldown);
@@ -285,7 +285,7 @@ static int ad5592r_set_channel_modes(struct ad5592r_state *st)
ret = -EIO;
 
 err_unlock:
-   mutex_unlock(&iio_dev->mlock);
+   mutex_unlock(&st->lock);
return ret;
 }
 
@@ -314,11 +314,11 @@ static int ad5592r_write_raw(struct iio_dev *iio_dev,
if (!chan->output)
return -EINVAL;
 
-   mutex_lock(&iio_dev->mlock);
+   mutex_lock(&st->lock);
ret = st->ops->write_dac(st, chan->channel, val);
if (!ret)
st->cached_dac[chan->channel] = val;
-   mutex_unlock(&iio_dev->mlock);
+   mutex_unlock(&st->lock);
return ret;
case IIO_CHAN_INFO_SCALE:
if (chan->type == IIO_VOLTAGE) {
@@ -333,12 +333,12 @@ static int ad5592r_write_raw(struct iio_dev *iio_dev,
else
return -EINVAL;
 
-   mutex_lock(&iio_dev->mlock);
+   mutex_lock(&st->lock);
 
ret = st->ops->reg_read(st, AD5592R_REG_CTRL,
&st->cached_gp_ctrl);
if (ret < 0) {
-   mutex_unlock(&iio_dev->mlock);
+   mutex_unlock(&st->lock);
return ret;
}
 
@@ -360,7 +360,7 @@ static int ad5592r_write_raw(struct iio_dev *iio_dev,
 
ret = st->ops->reg_write(st, AD5592R_REG_CTRL,
 st->cached_gp_ctrl);
-   mutex_unlock(&iio_dev->mlock);
+   mutex_unlock(&st->lock);
 
return ret;
}
@@ -382,7 +382,7 @@ static int ad5592r_read_raw(struct iio_dev *iio_dev,
 
switch (m) {
case IIO_CHAN_INFO_RAW:
-   mutex_lock(&iio_dev->mlock);
+   mutex_lock(&st->lock);
 
if (!chan->output) {
ret = st->ops->read_adc(st, chan->channel, &read_val);
@@ -419,7 +419,7 @@ static int ad5592r_read_raw(struct iio_dev *iio_dev,
} else {
int mult;
 
-   mutex_lock(&iio_dev->mlock);
+   mutex_lock(&st->lock);
 
if (chan->output)
mult = !!(st->cached_gp_ctrl &
@@ -437,7 +437,7 @@ static int ad5592r_read_raw(struct iio_dev *iio_dev,
case IIO_CHAN_INFO_OFFSET:
ret = ad5592r_get_vref(st);
 
-   mutex_lock(&iio_dev->mlock);
+   mutex_lock(&st->lock);
 
if (st->cached_gp_ctrl & AD5592R_REG_CTRL_ADC_RANGE)
*val = (-34365 * 25) / ret;
@@ -450,7 +450,7 @@ static int ad5592r_read_raw(struct iio_dev *iio_dev,
}
 
 unlock:
-   mutex_unlock(&iio_dev->mlock);
+   mutex_unlock(&st->lock);
return ret;
 }
 
@@ -625,6 +625,8 @@ int ad5592r_probe(struct device *dev, const char *name,
iio_dev->info = &ad5592r_info;
iio_dev->modes = INDIO_DIRECT_MODE;
 
+   mutex_init(&st->lock);
+
ad5592r_init_scales(st, ad5592r_get_vref(st));
 
ret = ad5592r_reset(st);
diff --git a/drivers/iio/dac/ad5592r-base.h b/drivers/iio/dac/ad5592r-base.h
index 4774e4cd9c11..23dac2f1ff8a 100644
--- a/drivers/iio/dac/ad5592r-base.h
+++ b/drivers/iio/dac/ad5592r-base.h
@@ -52,6 +52,7 @@ struct ad5592r_state {
struct regulator *reg;
struct gpio_chip gpiochip;
struct mutex gpio_lock; /* Protect cached gpio_out, gpio_val, etc.

Re: [PATCH 2/3] arm64: dts: qcom: Add initial sm6125 SoC support

2020-05-19 Thread Bjorn Andersson
On Tue 19 May 04:18 PDT 2020, Eli Riggs wrote:

> On Mon, 18 May 2020 23:08:48 -0700
> Bjorn Andersson  wrote:
> 
> > Please use dual GPL/BSD license for dts files, if you can.
> 
> Unfortunately the downstream tree I ported has a GPL-2-only header.
> 
> > [...review]
> 
> OK
> 
> > Given that you won't get very far without GCC and e.g.  pinctrl
> > driver I would prefer to see some patches for those as well, to
> > ensure that this will be able to go beyond basic UART.
> 
> Cleaning up my gcc and clk-smd-rpm drivers now, as well as another
> patchset for pm6125, qusb2-phy, dwc3, and sdhci. TLMM in the vague
> future.
> 

Looking forward to reviewing these!

Regards,
Bjorn


Re: [PATCH 00/12] Add cpufreq and cci devfreq for mt8183, and SVS support

2020-05-19 Thread Chanwoo Choi
Hi,

On 5/20/20 2:36 PM, andrew-sh.cheng wrote:
> On Wed, 2020-05-20 at 13:10 +0900, Chanwoo Choi wrote:
>> Hi Andrew,
>>
>> Could you explain the base commit of these patches?
>> When I tried to apply them to v5.7-rc1 for testing,
>> the merge conflict occurs.
>>
>> Thanks,

>> Chanwoo Choi
> 
> Hi Chanwoo Choi,
> 
> My base commit is
> commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
> Author: Linus Torvalds 
> Date:   Sun Apr 12 12:35:55 2020 -0700
> 
> Linux 5.7-rc1
> 
> Could you show me the conflict error?


When I tried to apply first patch with 'git am',
the merge conflict occurred.

git am \[PATCH\ 01_12\]\ OPP\:\ Allow\ required-opps\ even\ if\ the\ device\ doesn\'t\ have\ power-domains.eml
Applying: OPP: Allow required-opps even if the device doesn't have power-domains
error: patch failed: drivers/opp/core.c:755
error: drivers/opp/core.c: patch does not apply
error: patch failed: drivers/opp/of.c:195
error: drivers/opp/of.c: patch does not apply
Patch failed at 0001 OPP: Allow required-opps even if the device doesn't have power-domains
Use 'git am --show-current-patch' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

Regards,
Chanwoo Choi

> 
> BR,
> Andrew-sh.Cheng
>>
>> On 5/20/20 12:42 PM, Andrew-sh.Cheng wrote:
>>> MT8183 supports CPU DVFS and CCI DVFS, and LITTLE cpus and CCI are in the 
>>> same voltage domain.
>>> So, this series is to add drivers to handle the voltage coupling between 
>>> CPU and CCI DVFS.
>>>
>>> For SVS support, need OPP_EVENT_ADJUST_VOLTAGE and corresponding reaction.
>>>
>>> Change since v5:
>>> - Changing dt-binding format to yaml.
>>> - Extending current devfreq passive_governor instead of creating a new
>>> one.
>>> - Resend the dependent patches of Saravana Kannan based on kernel-5.7
>>>
>>>
>>> Andrew-sh.Cheng (6):
>>>   cpufreq: mediatek: add clock and regulator enable for intermediate
>>> clock
>>>   dt-bindings: devfreq: add compatible for mt8183 cci devfreq
>>>   devfreq: add mediatek cci devfreq
>>>   opp: Modify opp API, dev_pm_opp_get_freq(), find freq in opp, even it
>>> is disabled
>>>   cpufreq: mediatek: add opp notification for SVS support
>>>   devfreq: mediatek: cci devfreq register opp notification for SVS
>>> support
>>>
>>> Saravana Kannan (6):
>>>   OPP: Allow required-opps even if the device doesn't have power-domains
>>>   OPP: Add function to look up required OPP's for a given OPP
>>>   OPP: Improve required-opps linking
>>>   PM / devfreq: Cache OPP table reference in devfreq
>>>   PM / devfreq: Add required OPPs support to passive governor
>>>   PM / devfreq: Add cpu based scaling support to passive_governor
>>>
>>>  .../devicetree/bindings/devfreq/mt8183-cci.yaml|  51 
>>>  drivers/cpufreq/mediatek-cpufreq.c | 122 -
>>>  drivers/devfreq/Kconfig|  12 +
>>>  drivers/devfreq/Makefile   |   1 +
>>>  drivers/devfreq/devfreq.c  |   6 +
>>>  drivers/devfreq/governor_passive.c | 298 +++--
>>>  drivers/devfreq/mt8183-cci-devfreq.c   | 233 
>>>  drivers/opp/core.c |  85 +-
>>>  drivers/opp/of.c   | 108 
>>>  drivers/opp/opp.h  |   5 +
>>>  include/linux/devfreq.h|  42 ++-
>>>  include/linux/pm_opp.h |  11 +
>>>  12 files changed, 874 insertions(+), 100 deletions(-)
>>>  create mode 100644 Documentation/devicetree/bindings/devfreq/mt8183-cci.yaml
>>>  create mode 100644 drivers/devfreq/mt8183-cci-devfreq.c
>>>
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


Re: [RFC PATCH 1/4] gpu: dxgkrnl: core code

2020-05-19 Thread Greg KH
On Tue, May 19, 2020 at 01:45:53PM -0400, Sasha Levin wrote:
> On Tue, May 19, 2020 at 07:21:05PM +0200, Greg KH wrote:
> > On Tue, May 19, 2020 at 12:32:31PM -0400, Sasha Levin wrote:
> > > +
> > > +#define DXGK_MAX_LOCK_DEPTH  64
> > > +#define W_MAX_PATH   260
> > 
> > We already have a max path number, why use a different one?
> 
> It's max path for Windows, not Linux (thus the "W_" prefix) :)

Ah, not obvious :)

> Maybe changing it to WIN_MAX_PATH or such will make it better?

Probably.

> > > +#define d3dkmt_handle            u32
> > > +#define d3dgpu_virtual_address   u64
> > > +#define winwchar                 u16
> > > +#define winhandle                u64
> > > +#define ntstatus                 int
> > > +#define winbool                  u32
> > > +#define d3dgpu_size_t            u64
> > 
> > These are all ripe for a simple search/replace in your editor before you
> > do your next version :)
> 
> I've actually attempted that, and reverted that change, mostly because
> the whole 'handle' thing became very confusing.

Yeah, "handles" in windows can be a mess, with some being pointers and
others just integers.  Trying to make a specific typedef for it is
usually the better way overall, that way you can get the compiler to
check for mistakes.  These #defines will not really help with that.
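A small sketch of what compiler-checked handle types could look like. Wrapping each handle in its own single-member struct is one common way to get that checking and is an assumption here, not something taken from the patch; all demo_* names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical handle types: each wraps its integer in its own struct. */
typedef struct { uint32_t v; } demo_kmt_handle;
typedef struct { uint64_t v; } demo_win_handle;

static inline demo_kmt_handle demo_kmt_handle_from(uint32_t raw)
{
	demo_kmt_handle h = { .v = raw };
	return h;
}

static inline demo_win_handle demo_win_handle_from(uint64_t raw)
{
	demo_win_handle h = { .v = raw };
	return h;
}

/*
 * Only accepts a demo_kmt_handle. Passing a demo_win_handle here is a
 * compile error, whereas with "#define d3dkmt_handle u32" and
 * "#define winhandle u64" the 64-bit value would be silently truncated.
 */
static inline int demo_kmt_handle_is_null(demo_kmt_handle h)
{
	return h.v == 0;
}

static inline int demo_win_handle_is_null(demo_win_handle h)
{
	return h.v == 0;
}
```

The wrappers cost nothing at runtime (each struct has the size of its member on common ABIs), but mixing up the two handle kinds no longer compiles.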

But, 'ntstatus' should be ok to just make "int" everywhere, right?

> Note that we have a few 'handles', each with a different size, and thus
> calling get_something_something_handle() type of functions became very
> confusing since it's not clear what handle we're working with in that
> case.

Yeah, typedefs can help there.

> With regards to the rest, I wanted to leave stuff like 'winbool' to
> document the expected ABI between the Windows and Linux side of things.
> Ideally it would be 'bool' or 'u8', but as you see we had to use 'u32'
> here which I feel lessens our ability to have the code document itself.

'bool' probably will not work as I think it's compiler dependent, __u8
is probably best.
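As a userspace sketch of that idea, with uint8_t standing in for __u8: a shared structure can carry the boolean in a fixed-width field, spell out its padding, and pin the layout with compile-time checks. struct demo_abi_msg and its fields are hypothetical and say nothing about the real Windows-side layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message shared between the two sides of an ABI. */
struct demo_abi_msg {
	uint32_t handle;
	uint8_t  enabled;     /* boolean carried as exactly one byte */
	uint8_t  reserved[3]; /* explicit padding, no compiler-chosen gap */
	uint64_t size;
};

/* Fail the build if the layout ever drifts from what the ABI expects. */
_Static_assert(sizeof(struct demo_abi_msg) == 16,
	       "unexpected size for demo_abi_msg");
_Static_assert(offsetof(struct demo_abi_msg, enabled) == 4,
	       "unexpected offset for enabled");
_Static_assert(offsetof(struct demo_abi_msg, size) == 8,
	       "unexpected offset for size");

/* Normalize any truthy value to 1 so the other side sees only 0 or 1. */
static inline void demo_abi_set_enabled(struct demo_abi_msg *m, int on)
{
	m->enabled = on ? 1 : 0;
}
```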

thanks,

greg k-h


[PATCH 2/2] MAINTAINERS: Add maintainer entry for linear ranges helper

2020-05-19 Thread Matti Vaittinen
The linear ranges helpers were refactored out of regulator core
for other drivers to enjoy. Add regulator maintainer Mark Brown as
maintainer and myself as a reviewer.

Signed-off-by: Matti Vaittinen 
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 63a2ca70540e..e103e7db1522 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9720,6 +9720,13 @@ F:   drivers/lightnvm/
 F: include/linux/lightnvm.h
 F: include/uapi/linux/lightnvm.h
 
+LINEAR RANGES HELPERS
+M: Mark Brown 
+R: Matti Vaittinen 
+F: lib/linear_ranges.c
+F: lib/test_linear_ranges.c
+F: include/linux/linear_range.h
+
 LINUX FOR POWER MACINTOSH
 M: Benjamin Herrenschmidt 
 L: linuxppc-...@lists.ozlabs.org
-- 
2.21.0


-- 
Matti Vaittinen, Linux device drivers
ROHM Semiconductors, Finland SWDC
Kiviharjunlenkki 1E
90220 OULU
FINLAND

~~~ "I don't think so," said Rene Descartes. Just then he vanished ~~~
Simon says - in Latin please.
~~~ "non cogito me" dixit Rene Descarte, deinde evanescavit ~~~
Thanks to Simon Glass for the translation =] 


[PATCH V2] dt-bindings: thermal: Convert i.MX to json-schema

2020-05-19 Thread Anson Huang
Convert the i.MX thermal binding to DT schema format using json-schema.

Signed-off-by: Anson Huang 
---
Changes since V1:
- move tempmon node into its parent node anatop in example;
- improve "fsl,tempmon" description.
---
 .../devicetree/bindings/thermal/imx-thermal.txt|  61 -
 .../devicetree/bindings/thermal/imx-thermal.yaml   | 100 +
 2 files changed, 100 insertions(+), 61 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/thermal/imx-thermal.txt
 create mode 100644 Documentation/devicetree/bindings/thermal/imx-thermal.yaml

diff --git a/Documentation/devicetree/bindings/thermal/imx-thermal.txt b/Documentation/devicetree/bindings/thermal/imx-thermal.txt
deleted file mode 100644
index 823e417..000
--- a/Documentation/devicetree/bindings/thermal/imx-thermal.txt
+++ /dev/null
@@ -1,61 +0,0 @@
-* Temperature Monitor (TEMPMON) on Freescale i.MX SoCs
-
-Required properties:
-- compatible : must be one of following:
-  - "fsl,imx6q-tempmon" for i.MX6Q,
-  - "fsl,imx6sx-tempmon" for i.MX6SX,
-  - "fsl,imx7d-tempmon" for i.MX7S/D.
-- interrupts : the interrupt output of the controller:
-  i.MX6Q has one IRQ which will be triggered when temperature is higher than high threshold,
-  i.MX6SX and i.MX7S/D have two more IRQs than i.MX6Q, one is IRQ_LOW and the other is IRQ_PANIC,
-  when temperature is below than low threshold, IRQ_LOW will be triggered, when temperature
-  is higher than panic threshold, system will auto reboot by SRC module.
-- fsl,tempmon : phandle pointer to system controller that contains TEMPMON
-  control registers, e.g. ANATOP on imx6q.
-- nvmem-cells: A phandle to the calibration cells provided by ocotp.
-- nvmem-cell-names: Should be "calib", "temp_grade".
-
-Deprecated properties:
-- fsl,tempmon-data : phandle pointer to fuse controller that contains TEMPMON
-  calibration data, e.g. OCOTP on imx6q.  The details about calibration data
-  can be found in SoC Reference Manual.
-
-Direct access to OCOTP via fsl,tempmon-data is incorrect on some newer chips
-because it does not handle OCOTP clock requirements.
-
-Optional properties:
-- clocks : thermal sensor's clock source.
-
-Example:
-ocotp: ocotp@21bc000 {
-   #address-cells = <1>;
-   #size-cells = <1>;
-   compatible = "fsl,imx6sx-ocotp", "syscon";
-   reg = <0x021bc000 0x4000>;
-   clocks = <&clks IMX6SX_CLK_OCOTP>;
-
-   tempmon_calib: calib@38 {
-   reg = <0x38 4>;
-   };
-
-   tempmon_temp_grade: temp-grade@20 {
-   reg = <0x20 4>;
-   };
-};
-
-tempmon: tempmon {
-   compatible = "fsl,imx6sx-tempmon", "fsl,imx6q-tempmon";
-   interrupts = ;
-   fsl,tempmon = <&anatop>;
-   nvmem-cells = <&tempmon_calib>, <&tempmon_temp_grade>;
-   nvmem-cell-names = "calib", "temp_grade";
-   clocks = <&clks IMX6SX_CLK_PLL3_USB_OTG>;
-};
-
-Legacy method (Deprecated):
-tempmon {
-   compatible = "fsl,imx6q-tempmon";
-   fsl,tempmon = <&anatop>;
-   fsl,tempmon-data = <&ocotp>;
-   clocks = <&clks 172>;
-};
diff --git a/Documentation/devicetree/bindings/thermal/imx-thermal.yaml b/Documentation/devicetree/bindings/thermal/imx-thermal.yaml
new file mode 100644
index 000..894465e
--- /dev/null
+++ b/Documentation/devicetree/bindings/thermal/imx-thermal.yaml
@@ -0,0 +1,100 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/thermal/imx-thermal.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: NXP i.MX Thermal Binding
+
+maintainers:
+  - Shawn Guo 
+  - Anson Huang 
+
+properties:
+  compatible:
+    oneOf:
+      - items:
+          - enum:
+              - fsl,imx6q-tempmon
+              - fsl,imx6sx-tempmon
+              - fsl,imx7d-tempmon
+
+  interrupts:
+    description: |
+      The interrupt output of the controller, the IRQ will be triggered
+      when temperature is higher than high threshold.
+    maxItems: 1
+
+  nvmem-cells:
+    description: |
+      Phandle to the calibration cells provided by ocotp for calibration
+      data and temperature grade.
+    maxItems: 2
+
+  nvmem-cell-names:
+    maxItems: 2
+    items:
+      - const: calib
+      - const: temp_grade
+
+  fsl,tempmon:
+    $ref: '/schemas/types.yaml#/definitions/phandle'
+    description: Phandle to the register map node.
+
+  fsl,tempmon-data:
+    $ref: '/schemas/types.yaml#/definitions/phandle'
+    description: |
+      Deprecated property, phandle pointer to fuse controller that contains
+      TEMPMON calibration data, e.g. OCOTP on imx6q. The details about
+      calibration data can be found in SoC Reference Manual.
+    deprecated: true
+
+  clocks:
+    maxItems: 1
+
+required:
+  - compatible
+  - interrupts
+  - fsl,tempmon
+  - nvmem-cells
+  - nvmem-cell-names
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+#include 
+
+efuse@21bc000 {
+ #address-cells = <1>;
+ 

[PATCH 1/2] MAINTAINERS: Add entry for ROHM power management ICs

2020-05-19 Thread Matti Vaittinen
Add entry for maintaining power management IC drivers for ROHM
BD71837, BD71847, BD71850, BD71828, BD71878, BD70528 and BD99954.

Signed-off-by: Matti Vaittinen 
---
 MAINTAINERS | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index ecc0749810b0..63a2ca70540e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14490,6 +14490,12 @@ L: linux-ser...@vger.kernel.org
 S: Odd Fixes
 F: drivers/tty/serial/rp2.*
 
+ROHM BD99954 CHARGER IC
+R: Matti Vaittinen 
+S: Supported
+F: drivers/power/supply/bd99954-charger.c
+F: drivers/power/supply/bd99954-charger.h
+
 ROHM BH1750 AMBIENT LIGHT SENSOR DRIVER
 M: Tomasz Duszynski 
 S: Maintained
@@ -14507,6 +14513,30 @@ F: drivers/mfd/bd9571mwv.c
 F: drivers/regulator/bd9571mwv-regulator.c
 F: include/linux/mfd/bd9571mwv.h
 
+ROHM POWER MANAGEMENT IC DEVICE DRIVERS
+R: Matti Vaittinen 
+S: Supported
+F: Documentation/devicetree/bindings/mfd/rohm,bd70528-pmic.txt
+F: Documentation/devicetree/bindings/regulator/rohm,bd70528-regulator.txt
+F: drivers/clk/clk-bd718x7.c
+F: drivers/gpio/gpio-bd70528.c
+F: drivers/gpio/gpio-bd71828.c
+F: drivers/mfd/rohm-bd70528.c
+F: drivers/mfd/rohm-bd71828.c
+F: drivers/mfd/rohm-bd718x7.c
+F: drivers/power/supply/bd70528-charger.c
+F: drivers/regulator/bd70528-regulator.c
+F: drivers/regulator/bd71828-regulator.c
+F: drivers/regulator/bd718x7-regulator.c
+F: drivers/regulator/rohm-regulator.c
+F: drivers/rtc/rtc-bd70528.c
+F: drivers/watchdog/bd70528_wdt.c
+F: include/linux/mfd/rohm-shared.h
+F: include/linux/mfd/rohm-bd71828.h
+F: include/linux/mfd/rohm-bd70528.h
+F: include/linux/mfd/rohm-generic.h
+F: include/linux/mfd/rohm-bd718x7.h
+
 ROSE NETWORK LAYER
 M: Ralf Baechle 
 L: linux-h...@vger.kernel.org
-- 
2.21.0




[PATCH 0/2] MAINTAINER entries for few ROHM power devices

2020-05-19 Thread Matti Vaittinen
Add maintainer entries for a few ROHM devices and the linear ranges helpers.

Linear Ranges helpers were refactored out of regulator core to lib so
that other drivers could utilize them too. (I guess power/supply drivers
and possibly clk drivers can benefit from them.) As regulators are
currently the main user, it makes sense for the changes to linear_ranges
to go through Mark's tree.

During the past two years a few ROHM PMIC drivers have been added to
mainline. They deserve a supporter from the ROHM side too :)

Patch 1:
Maintainer entries for a few ROHM IC drivers
Patch 2:
Maintainer entry for linear ranges helpers

---

Matti Vaittinen (2):
  MAINTAINERS: Add entry for ROHM power management ICs
  MAINTAINERS: Add maintainer entry for linear ranges helper

 MAINTAINERS | 37 +
 1 file changed, 37 insertions(+)


base-commit: b9bbe6ed63b2b9f2c9ee5cbd0f2c946a2723f4ce
-- 
2.21.0




Re: [PATCH] input: i8042: Remove special PowerPC handling

2020-05-19 Thread Michael Ellerman
[ + Dmitry & linux-input ]

Nathan Chancellor  writes:
> This causes a build error with CONFIG_WALNUT because kb_cs and kb_data
> were removed in commit 917f0af9e5a9 ("powerpc: Remove arch/ppc and
> include/asm-ppc").
>
> ld.lld: error: undefined symbol: kb_cs
>> referenced by i8042-ppcio.h:28 (drivers/input/serio/i8042-ppcio.h:28)
>> input/serio/i8042.o:(__i8042_command) in archive drivers/built-in.a
>> referenced by i8042-ppcio.h:28 (drivers/input/serio/i8042-ppcio.h:28)
>> input/serio/i8042.o:(__i8042_command) in archive drivers/built-in.a
>> referenced by i8042-ppcio.h:28 (drivers/input/serio/i8042-ppcio.h:28)
>> input/serio/i8042.o:(__i8042_command) in archive drivers/built-in.a
>
> ld.lld: error: undefined symbol: kb_data
>> referenced by i8042.c:309 (drivers/input/serio/i8042.c:309)
>> input/serio/i8042.o:(__i8042_command) in archive drivers/built-in.a
>> referenced by i8042-ppcio.h:33 (drivers/input/serio/i8042-ppcio.h:33)
>> input/serio/i8042.o:(__i8042_command) in archive drivers/built-in.a
>> referenced by i8042.c:319 (drivers/input/serio/i8042.c:319)
>> input/serio/i8042.o:(__i8042_command) in archive drivers/built-in.a
>> referenced 15 more times
>
> Presumably since nobody has noticed this for the last 12 years, there is
> not anyone actually trying to use this driver so we can just remove this
> special walnut code and use the generic header so it builds for all
> configurations.
>
> Fixes: 917f0af9e5a9 ("powerpc: Remove arch/ppc and include/asm-ppc")
> Reported-by: kbuild test robot 
> Signed-off-by: Nathan Chancellor 
> ---
>  drivers/input/serio/i8042-ppcio.h | 57 ---
>  drivers/input/serio/i8042.h   |  2 --
>  2 files changed, 59 deletions(-)
>  delete mode 100644 drivers/input/serio/i8042-ppcio.h

This LGTM.

Acked-by: Michael Ellerman  (powerpc)

I assumed drivers/input/serio would be pretty quiet, but there's
actually some commits to it in linux-next. So perhaps this should go via
the input tree.

Dmitry do you want to take this, or should I take it via powerpc?

Original patch is here:
  https://lore.kernel.org/lkml/20200518181043.3363953-1-natechancel...@gmail.com

cheers

> diff --git a/drivers/input/serio/i8042-ppcio.h b/drivers/input/serio/i8042-ppcio.h
> deleted file mode 100644
> index 391f94d9e47d..
> --- a/drivers/input/serio/i8042-ppcio.h
> +++ /dev/null
> @@ -1,57 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0-only */
> -#ifndef _I8042_PPCIO_H
> -#define _I8042_PPCIO_H
> -
> -
> -#if defined(CONFIG_WALNUT)
> -
> -#define I8042_KBD_IRQ 25
> -#define I8042_AUX_IRQ 26
> -
> -#define I8042_KBD_PHYS_DESC "walnutps2/serio0"
> -#define I8042_AUX_PHYS_DESC "walnutps2/serio1"
> -#define I8042_MUX_PHYS_DESC "walnutps2/serio%d"
> -
> -extern void *kb_cs;
> -extern void *kb_data;
> -
> -#define I8042_COMMAND_REG (*(int *)kb_cs)
> -#define I8042_DATA_REG (*(int *)kb_data)
> -
> -static inline int i8042_read_data(void)
> -{
> - return readb(kb_data);
> -}
> -
> -static inline int i8042_read_status(void)
> -{
> - return readb(kb_cs);
> -}
> -
> -static inline void i8042_write_data(int val)
> -{
> - writeb(val, kb_data);
> -}
> -
> -static inline void i8042_write_command(int val)
> -{
> - writeb(val, kb_cs);
> -}
> -
> -static inline int i8042_platform_init(void)
> -{
> - i8042_reset = I8042_RESET_ALWAYS;
> - return 0;
> -}
> -
> -static inline void i8042_platform_exit(void)
> -{
> -}
> -
> -#else
> -
> -#include "i8042-io.h"
> -
> -#endif
> -
> -#endif /* _I8042_PPCIO_H */
> diff --git a/drivers/input/serio/i8042.h b/drivers/input/serio/i8042.h
> index 38dc27ad3c18..eb376700dfff 100644
> --- a/drivers/input/serio/i8042.h
> +++ b/drivers/input/serio/i8042.h
> @@ -17,8 +17,6 @@
>  #include "i8042-ip22io.h"
>  #elif defined(CONFIG_SNI_RM)
>  #include "i8042-snirm.h"
> -#elif defined(CONFIG_PPC)
> -#include "i8042-ppcio.h"
>  #elif defined(CONFIG_SPARC)
>  #include "i8042-sparcio.h"
>  #elif defined(CONFIG_X86) || defined(CONFIG_IA64)
>
> base-commit: 72bc15d0018ebfbc9c389539d636e2e9a9002b3b
> -- 
> 2.27.0.rc0


RE: [PATCH V2 0/3] ARM: imx: move cpu code to drivers/soc/imx

2020-05-19 Thread Peng Fan
Hi Shawn,

> Subject: Re: [PATCH V2 0/3] ARM: imx: move cpu code to drivers/soc/imx
> 
> On Wed, May 20, 2020 at 8:57 AM Shawn Guo 
> wrote:
> >
> > On Wed, Apr 29, 2020 at 05:17:20PM +0800, peng@nxp.com wrote:
> > > From: Peng Fan 
> > >
> > > V2:
> > > >  Keep i.MX1/2/3/5 cpu type for completeness
> > > >  Correct return value in patch 1/3
> > > >  use CONFIG_ARM to guard compile soc-imx.c in patch 3/3
> > >
> > > V1:
> > > > https://patchwork.kernel.org/cover/11433689/
> > > > RFC version :
> > > > https://patchwork.kernel.org/cover/11336433/
> > >
> > > Nothing changed in v1, just rename to formal patches
> > >
> > > Shawn,
> > >  The original concern has been eliminated in RFC discussion,  so
> > > this patchset is ready to be in next.
> > > Thanks.
> > >
> > > Follow i.MX8, move the soc device register code to drivers/soc/imx
> > > to simplify arch/arm/mach-imx/cpu.c
> > >
> > > I planned to use similar logic as soc-imx8m.c to restructure
> > > soc-imx.c and merged the two files into one. But not sure, so still
> > > keep the logic in cpu.c.
> > >
> > > > One change is that the platform devices are no longer under
> > > > /sys/devices/soc0 after patch 1/3. Actually, ARM64 platform devices,
> > > > such as those on i.MX8/8M, are not under /sys/devices/soc0 either.
> > > > So it should not hurt to leave the platform devices under the platform dir.
> > >
> > > Peng Fan (3):
> > >   ARM: imx: use device_initcall for imx_soc_device_init
> > >   ARM: imx: move cpu definitions into a header
> > >   soc: imx: move cpu code to drivers/soc/imx
> >
> > Applied all, thanks.
> 
> Unfortunately, I have to drop this, as it turns out the series needs a rebase
> onto for-next.  The series conflicts with 'ARM: vf610: report soc info via soc
> device' there.

I just posted out v3 which rebased on latest next tree and resolved the 
conflicts.

Thanks,
Peng.

> 
> Shawn


[PATCH V3 0/3] ARM: imx: move cpu code to drivers/soc/imx

2020-05-19 Thread peng . fan
From: Peng Fan 

V3:
 Rebased to latest next tree
 Resolved the conflicts with vf610 soc patch

V2:
 Keep i.MX1/2/3/5 cpu type for completeness
 Correct return value in patch 1/3
 use CONFIG_ARM to guard compile soc-imx.c in patch 3/3

V1:
https://patchwork.kernel.org/cover/11433689/
RFC version :
https://patchwork.kernel.org/cover/11336433/

Nothing changed in v1, just rename to formal patches

Shawn,
 The original concern has been eliminated in RFC discussion,
 so this patchset is ready to be in next.
Thanks.

Follow i.MX8, move the soc device register code to drivers/soc/imx
to simplify arch/arm/mach-imx/cpu.c

I planned to use logic similar to soc-imx8m.c to restructure soc-imx.c
and merge the two files into one. But I am not sure about that, so I still
keep the logic in cpu.c.

One change is that the platform devices are no longer under
/sys/devices/soc0 after patch 1/3. Actually, ARM64 platform
devices, such as those on i.MX8/8M, are not under /sys/devices/soc0 either.
So it should not hurt to leave the platform devices under the platform dir.

Peng Fan (3):
  ARM: imx: use device_initcall for imx_soc_device_init
  ARM: imx: move cpu definitions into a header
  soc: imx: move cpu code to drivers/soc/imx

 arch/arm/mach-imx/common.h   |   1 -
 arch/arm/mach-imx/cpu.c  | 175 ---
 arch/arm/mach-imx/mach-imx6q.c   |   8 +-
 arch/arm/mach-imx/mach-imx6sl.c  |   8 +-
 arch/arm/mach-imx/mach-imx6sx.c  |   8 +-
 arch/arm/mach-imx/mach-imx6ul.c  |   8 +-
 arch/arm/mach-imx/mach-imx7d.c   |   6 --
 arch/arm/mach-imx/mach-imx7ulp.c |   2 +-
 arch/arm/mach-imx/mach-vf610.c   |   8 +-
 arch/arm/mach-imx/mxc.h  |  28 +-
 drivers/soc/imx/Makefile |   3 +
 drivers/soc/imx/soc-imx.c| 192 +++
 include/soc/imx/cpu.h|  36 
 13 files changed, 238 insertions(+), 245 deletions(-)
 create mode 100644 drivers/soc/imx/soc-imx.c
 create mode 100644 include/soc/imx/cpu.h

-- 
2.16.4



[PATCH V3 2/3] ARM: imx: move cpu definitions into a header

2020-05-19 Thread peng . fan
From: Peng Fan 

The soc device register code will be moved to drivers/soc/imx/,
and that code needs the cpu type definitions. So let's move the cpu
type definitions to a header.

Signed-off-by: Peng Fan 
---
 arch/arm/mach-imx/mxc.h | 28 +---
 include/soc/imx/cpu.h   | 36 
 2 files changed, 37 insertions(+), 27 deletions(-)
 create mode 100644 include/soc/imx/cpu.h

diff --git a/arch/arm/mach-imx/mxc.h b/arch/arm/mach-imx/mxc.h
index 48e6d781f15b..fe2d0f5abfcc 100644
--- a/arch/arm/mach-imx/mxc.h
+++ b/arch/arm/mach-imx/mxc.h
@@ -8,41 +8,15 @@
 #define __ASM_ARCH_MXC_H__
 
 #include 
+#include 
 
 #ifndef __ASM_ARCH_MXC_HARDWARE_H__
 #error "Do not include directly."
 #endif
 
-#define MXC_CPU_MX1    1
-#define MXC_CPU_MX21   21
-#define MXC_CPU_MX25   25
-#define MXC_CPU_MX27   27
-#define MXC_CPU_MX31   31
-#define MXC_CPU_MX35   35
-#define MXC_CPU_MX51   51
-#define MXC_CPU_MX53   53
-#define MXC_CPU_IMX6SL 0x60
-#define MXC_CPU_IMX6DL 0x61
-#define MXC_CPU_IMX6SX 0x62
-#define MXC_CPU_IMX6Q  0x63
-#define MXC_CPU_IMX6UL 0x64
-#define MXC_CPU_IMX6ULL        0x65
-/* virtual cpu id for i.mx6ulz */
-#define MXC_CPU_IMX6ULZ        0x6b
-#define MXC_CPU_IMX6SLL        0x67
-#define MXC_CPU_IMX7D  0x72
-#define MXC_CPU_IMX7ULP        0xff
-
-#define MXC_CPU_VFx10  0x010
-#define MXC_CPU_VF500  0x500
-#define MXC_CPU_VF510  (MXC_CPU_VF500 | MXC_CPU_VFx10)
-#define MXC_CPU_VF600  0x600
-#define MXC_CPU_VF610  (MXC_CPU_VF600 | MXC_CPU_VFx10)
-
 #define IMX_DDR_TYPE_LPDDR2    1
 
 #ifndef __ASSEMBLY__
-extern unsigned int __mxc_cpu_type;
 
 #ifdef CONFIG_SOC_IMX6SL
 static inline bool cpu_is_imx6sl(void)
diff --git a/include/soc/imx/cpu.h b/include/soc/imx/cpu.h
new file mode 100644
index ..42d6aeb951fa
--- /dev/null
+++ b/include/soc/imx/cpu.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#ifndef __IMX_CPU_H__
+#define __IMX_CPU_H__
+
+#define MXC_CPU_MX1    1
+#define MXC_CPU_MX21   21
+#define MXC_CPU_MX25   25
+#define MXC_CPU_MX27   27
+#define MXC_CPU_MX31   31
+#define MXC_CPU_MX35   35
+#define MXC_CPU_MX51   51
+#define MXC_CPU_MX53   53
+#define MXC_CPU_IMX6SL 0x60
+#define MXC_CPU_IMX6DL 0x61
+#define MXC_CPU_IMX6SX 0x62
+#define MXC_CPU_IMX6Q  0x63
+#define MXC_CPU_IMX6UL 0x64
+#define MXC_CPU_IMX6ULL 0x65
+/* virtual cpu id for i.mx6ulz */
+#define MXC_CPU_IMX6ULZ 0x6b
+#define MXC_CPU_IMX6SLL 0x67
+#define MXC_CPU_IMX7D  0x72
+#define MXC_CPU_IMX7ULP 0xff
+
+#define MXC_CPU_VFx10  0x010
+#define MXC_CPU_VF500  0x500
+#define MXC_CPU_VF510  (MXC_CPU_VF500 | MXC_CPU_VFx10)
+#define MXC_CPU_VF600  0x600
+#define MXC_CPU_VF610  (MXC_CPU_VF600 | MXC_CPU_VFx10)
+
+#ifndef __ASSEMBLY__
+extern unsigned int __mxc_cpu_type;
+#endif
+
+#endif
-- 
2.16.4
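
As a rough illustration of how the Vybrid ids in the new header compose, here
is a small user-space sketch. The macro values are copied from the patch; the
demo_* helpers are hypothetical and exist only for this example, they are not
part of the series.

```c
#include <assert.h>
#include <string.h>

/* Values copied from include/soc/imx/cpu.h in the patch above. */
#define MXC_CPU_VFx10	0x010
#define MXC_CPU_VF500	0x500
#define MXC_CPU_VF510	(MXC_CPU_VF500 | MXC_CPU_VFx10)
#define MXC_CPU_VF600	0x600
#define MXC_CPU_VF610	(MXC_CPU_VF600 | MXC_CPU_VFx10)

/* Hypothetical helper: the composed ids are just the base id with one extra bit. */
void demo_check_ids(void)
{
	assert(MXC_CPU_VF510 == 0x510);
	assert(MXC_CPU_VF610 == 0x610);
}

/* Hypothetical helper: map a Vybrid cpu type to the name used as soc_id. */
const char *demo_vf_name(unsigned int cpu_type)
{
	switch (cpu_type) {
	case MXC_CPU_VF500:
		return "VF500";
	case MXC_CPU_VF510:
		return "VF510";
	case MXC_CPU_VF600:
		return "VF600";
	case MXC_CPU_VF610:
		return "VF610";
	default:
		return "Unknown";
	}
}

/* Hypothetical helper: report whether the VFx10 bit is set in a cpu type. */
int demo_vf_has_x10_bit(unsigned int cpu_type)
{
	return (cpu_type & MXC_CPU_VFx10) != 0;
}
```

Because the ids are plain preprocessor constants, both the arch code and a
driver under drivers/soc/imx/ can switch on them once they share the header.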



[PATCH V3 1/3] ARM: imx: use device_initcall for imx_soc_device_init

2020-05-19 Thread peng . fan
From: Peng Fan 

This is preparation to move imx_soc_device_init to drivers/soc/imx/

There is no reason that dt devices must be put under /sys/devices/soc0;
they could also be under /sys/devices/platform, so we can
pass NULL as parent when calling of_platform_default_populate.

Follow soc-imx8.c and soc-imx-scu.c in using device_initcall, which
requires changing the return type of imx_soc_device_init to int.
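
The return-type change can be pictured with a user-space sketch of the same
unwind pattern: return 0 on success, a negative errno on failure, and release
whatever was allocated so far. All demo_* names and the fail_at parameter are
hypothetical; malloc/free stand in for the kernel allocators.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct soc_device_attribute. */
struct demo_soc_attr {
	char *revision;
	char *serial_number;
};

/* Hypothetical allocator wrapper so that a failure can be simulated. */
static char *demo_strdup(const char *s, int fail)
{
	char *copy;

	if (fail)
		return NULL;
	copy = malloc(strlen(s) + 1);
	if (copy)
		strcpy(copy, s);
	return copy;
}

/*
 * Returns 0 on success or a negative errno. fail_at selects which
 * allocation fails (0 = none). On failure *out is left NULL.
 */
int demo_soc_init(struct demo_soc_attr **out, int fail_at)
{
	struct demo_soc_attr *attr;
	int ret;

	assert(out != NULL);
	*out = NULL;

	attr = calloc(1, sizeof(*attr));
	if (!attr)
		return -ENOMEM;

	attr->revision = demo_strdup("1.0", fail_at == 1);
	if (!attr->revision) {
		ret = -ENOMEM;
		goto free_attr;
	}

	attr->serial_number = demo_strdup("0123456789ABCDEF", fail_at == 2);
	if (!attr->serial_number) {
		ret = -ENOMEM;
		goto free_rev;
	}

	*out = attr;
	return 0;

free_rev:
	free(attr->revision);
free_attr:
	free(attr);
	return ret;
}

/* Release everything demo_soc_init() handed out. */
void demo_soc_exit(struct demo_soc_attr *attr)
{
	if (!attr)
		return;
	free(attr->serial_number);
	free(attr->revision);
	free(attr);
}
```

An initcall-style caller only needs the int result, which is why the pointer
return value can go away once nothing uses the device as a parent.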

Signed-off-by: Peng Fan 
---
 arch/arm/mach-imx/common.h   |  1 -
 arch/arm/mach-imx/cpu.c  | 21 ++---
 arch/arm/mach-imx/mach-imx6q.c   |  8 +---
 arch/arm/mach-imx/mach-imx6sl.c  |  8 +---
 arch/arm/mach-imx/mach-imx6sx.c  |  8 +---
 arch/arm/mach-imx/mach-imx6ul.c  |  8 +---
 arch/arm/mach-imx/mach-imx7d.c   |  6 --
 arch/arm/mach-imx/mach-imx7ulp.c |  2 +-
 arch/arm/mach-imx/mach-vf610.c   |  8 +---
 9 files changed, 20 insertions(+), 50 deletions(-)

diff --git a/arch/arm/mach-imx/common.h b/arch/arm/mach-imx/common.h
index 5aa5796cff0e..72c3fcc32910 100644
--- a/arch/arm/mach-imx/common.h
+++ b/arch/arm/mach-imx/common.h
@@ -49,7 +49,6 @@ void imx_aips_allow_unprivileged_access(const char *compat);
 int mxc_device_init(void);
 void imx_set_soc_revision(unsigned int rev);
 void imx_init_revision_from_anatop(void);
-struct device *imx_soc_device_init(void);
 void imx6_enable_rbc(bool enable);
 void imx_gpc_check_dt(void);
 void imx_gpc_set_arm_power_in_lpm(bool power_off);
diff --git a/arch/arm/mach-imx/cpu.c b/arch/arm/mach-imx/cpu.c
index e3d12b21d6f6..75ffcba9f878 100644
--- a/arch/arm/mach-imx/cpu.c
+++ b/arch/arm/mach-imx/cpu.c
@@ -83,7 +83,7 @@ void __init imx_aips_allow_unprivileged_access(
}
 }
 
-struct device * __init imx_soc_device_init(void)
+static int __init imx_soc_device_init(void)
 {
struct soc_device_attribute *soc_dev_attr;
const char *ocotp_compat = NULL;
@@ -97,7 +97,7 @@ struct device * __init imx_soc_device_init(void)
 
soc_dev_attr = kzalloc(sizeof(*soc_dev_attr), GFP_KERNEL);
if (!soc_dev_attr)
-   return NULL;
+   return -ENOMEM;
 
soc_dev_attr->family = "Freescale i.MX";
 
@@ -224,18 +224,24 @@ struct device * __init imx_soc_device_init(void)
soc_dev_attr->revision = kasprintf(GFP_KERNEL, "%d.%d",
   (imx_soc_revision >> 4) & 0xf,
   imx_soc_revision & 0xf);
-   if (!soc_dev_attr->revision)
+   if (!soc_dev_attr->revision) {
+   ret = -ENOMEM;
goto free_soc;
+   }
 
soc_dev_attr->serial_number = kasprintf(GFP_KERNEL, "%016llX", soc_uid);
-   if (!soc_dev_attr->serial_number)
+   if (!soc_dev_attr->serial_number) {
+   ret = -ENOMEM;
goto free_rev;
+   }
 
soc_dev = soc_device_register(soc_dev_attr);
-   if (IS_ERR(soc_dev))
+   if (IS_ERR(soc_dev)) {
+   ret = PTR_ERR(soc_dev);
goto free_serial_number;
+   }
 
-   return soc_device_to_device(soc_dev);
+   return 0;
 
 free_serial_number:
kfree(soc_dev_attr->serial_number);
@@ -243,5 +249,6 @@ struct device * __init imx_soc_device_init(void)
kfree(soc_dev_attr->revision);
 free_soc:
kfree(soc_dev_attr);
-   return NULL;
+   return ret;
 }
+device_initcall(imx_soc_device_init);
diff --git a/arch/arm/mach-imx/mach-imx6q.c b/arch/arm/mach-imx/mach-imx6q.c
index 284bce1112d2..85c084a716ab 100644
--- a/arch/arm/mach-imx/mach-imx6q.c
+++ b/arch/arm/mach-imx/mach-imx6q.c
@@ -245,21 +245,15 @@ static void __init imx6q_axi_init(void)
 
 static void __init imx6q_init_machine(void)
 {
-   struct device *parent;
-
if (cpu_is_imx6q() && imx_get_soc_revision() == IMX_CHIP_REVISION_2_0)
imx_print_silicon_rev("i.MX6QP", IMX_CHIP_REVISION_1_0);
else
imx_print_silicon_rev(cpu_is_imx6dl() ? "i.MX6DL" : "i.MX6Q",
imx_get_soc_revision());
 
-   parent = imx_soc_device_init();
-   if (parent == NULL)
-   pr_warn("failed to initialize soc device\n");
-
imx6q_enet_phy_init();
 
-   of_platform_default_populate(NULL, NULL, parent);
+   of_platform_default_populate(NULL, NULL, NULL);
 
imx_anatop_init();
cpu_is_imx6q() ?  imx6q_pm_init() : imx6dl_pm_init();
diff --git a/arch/arm/mach-imx/mach-imx6sl.c b/arch/arm/mach-imx/mach-imx6sl.c
index e27a6889cc56..f6e87363d605 100644
--- a/arch/arm/mach-imx/mach-imx6sl.c
+++ b/arch/arm/mach-imx/mach-imx6sl.c
@@ -45,13 +45,7 @@ static void __init imx6sl_init_late(void)
 
 static void __init imx6sl_init_machine(void)
 {
-   struct device *parent;
-
-   parent = imx_soc_device_init();
-   if (parent == NULL)
-   pr_warn("failed to initialize soc device\n");
-
-   of_platform_default_populate(NULL, NULL, parent);
+   of_platform_default_populate(NULL, NULL, NULL);
 
if (

[PATCH V3 3/3] soc: imx: move cpu code to drivers/soc/imx

2020-05-19 Thread peng . fan
From: Peng Fan 

Move the soc device register code to drivers/soc/imx to align with
i.MX8.

Signed-off-by: Peng Fan 
---
 arch/arm/mach-imx/cpu.c   | 182 ---
 drivers/soc/imx/Makefile  |   3 +
 drivers/soc/imx/soc-imx.c | 192 ++
 3 files changed, 195 insertions(+), 182 deletions(-)
 create mode 100644 drivers/soc/imx/soc-imx.c

diff --git a/arch/arm/mach-imx/cpu.c b/arch/arm/mach-imx/cpu.c
index 75ffcba9f878..65c7224f5250 100644
--- a/arch/arm/mach-imx/cpu.c
+++ b/arch/arm/mach-imx/cpu.c
@@ -1,25 +1,13 @@
 // SPDX-License-Identifier: GPL-2.0
 #include 
-#include 
 #include 
 #include 
 #include 
 #include 
-#include 
-#include 
-#include 
 
 #include "hardware.h"
 #include "common.h"
 
-#define OCOTP_UID_H    0x420
-#define OCOTP_UID_L    0x410
-
-#define OCOTP_ULP_UID_1 0x4b0
-#define OCOTP_ULP_UID_2 0x4c0
-#define OCOTP_ULP_UID_3 0x4d0
-#define OCOTP_ULP_UID_4 0x4e0
-
 unsigned int __mxc_cpu_type;
 static unsigned int imx_soc_revision;
 
@@ -82,173 +70,3 @@ void __init imx_aips_allow_unprivileged_access(
imx_set_aips(aips_base_addr);
}
 }
-
-static int __init imx_soc_device_init(void)
-{
-   struct soc_device_attribute *soc_dev_attr;
-   const char *ocotp_compat = NULL;
-   struct soc_device *soc_dev;
-   struct device_node *root;
-   struct regmap *ocotp = NULL;
-   const char *soc_id;
-   u64 soc_uid = 0;
-   u32 val;
-   int ret;
-
-   soc_dev_attr = kzalloc(sizeof(*soc_dev_attr), GFP_KERNEL);
-   if (!soc_dev_attr)
-   return -ENOMEM;
-
-   soc_dev_attr->family = "Freescale i.MX";
-
-   root = of_find_node_by_path("/");
-   ret = of_property_read_string(root, "model", &soc_dev_attr->machine);
-   of_node_put(root);
-   if (ret)
-   goto free_soc;
-
-   switch (__mxc_cpu_type) {
-   case MXC_CPU_MX1:
-   soc_id = "i.MX1";
-   break;
-   case MXC_CPU_MX21:
-   soc_id = "i.MX21";
-   break;
-   case MXC_CPU_MX25:
-   soc_id = "i.MX25";
-   break;
-   case MXC_CPU_MX27:
-   soc_id = "i.MX27";
-   break;
-   case MXC_CPU_MX31:
-   soc_id = "i.MX31";
-   break;
-   case MXC_CPU_MX35:
-   soc_id = "i.MX35";
-   break;
-   case MXC_CPU_MX51:
-   soc_id = "i.MX51";
-   break;
-   case MXC_CPU_MX53:
-   soc_id = "i.MX53";
-   break;
-   case MXC_CPU_IMX6SL:
-   ocotp_compat = "fsl,imx6sl-ocotp";
-   soc_id = "i.MX6SL";
-   break;
-   case MXC_CPU_IMX6DL:
-   ocotp_compat = "fsl,imx6q-ocotp";
-   soc_id = "i.MX6DL";
-   break;
-   case MXC_CPU_IMX6SX:
-   ocotp_compat = "fsl,imx6sx-ocotp";
-   soc_id = "i.MX6SX";
-   break;
-   case MXC_CPU_IMX6Q:
-   ocotp_compat = "fsl,imx6q-ocotp";
-   soc_id = "i.MX6Q";
-   break;
-   case MXC_CPU_IMX6UL:
-   ocotp_compat = "fsl,imx6ul-ocotp";
-   soc_id = "i.MX6UL";
-   break;
-   case MXC_CPU_IMX6ULL:
-   ocotp_compat = "fsl,imx6ull-ocotp";
-   soc_id = "i.MX6ULL";
-   break;
-   case MXC_CPU_IMX6ULZ:
-   ocotp_compat = "fsl,imx6ull-ocotp";
-   soc_id = "i.MX6ULZ";
-   break;
-   case MXC_CPU_IMX6SLL:
-   ocotp_compat = "fsl,imx6sll-ocotp";
-   soc_id = "i.MX6SLL";
-   break;
-   case MXC_CPU_IMX7D:
-   ocotp_compat = "fsl,imx7d-ocotp";
-   soc_id = "i.MX7D";
-   break;
-   case MXC_CPU_IMX7ULP:
-   ocotp_compat = "fsl,imx7ulp-ocotp";
-   soc_id = "i.MX7ULP";
-   break;
-   case MXC_CPU_VF500:
-   ocotp_compat = "fsl,vf610-ocotp";
-   soc_id = "VF500";
-   break;
-   case MXC_CPU_VF510:
-   ocotp_compat = "fsl,vf610-ocotp";
-   soc_id = "VF510";
-   break;
-   case MXC_CPU_VF600:
-   ocotp_compat = "fsl,vf610-ocotp";
-   soc_id = "VF600";
-   break;
-   case MXC_CPU_VF610:
-   ocotp_compat = "fsl,vf610-ocotp";
-   soc_id = "VF610";
-   break;
-   default:
-   soc_id = "Unknown";
-   }
-   soc_dev_attr->soc_id = soc_id;
-
-   if (ocotp_compat) {
-   ocotp = syscon_regmap_lookup_by_compatible(ocotp_compat);
-   if (IS_ERR(ocotp))
-   pr_err("%s: failed to find %s regmap!\n", __func__, ocotp_compat);
-   }
-
-   if (!IS_ERR_OR_NULL(ocotp)) {
-   if (__mxc_cpu_

Re: [PATCH v3] perf record: add dummy event during system wide synthesis

2020-05-19 Thread Ian Rogers
On Tue, May 19, 2020 at 6:54 PM Arnaldo Carvalho de Melo
 wrote:
>
> Em Wed, Apr 22, 2020 at 10:36:15AM -0700, Ian Rogers escreveu:
> > During the processing of /proc during event synthesis new processes may
> > start. Add a dummy event if /proc is to be processed, to capture mmaps
> > for starting processes. This reuses the existing logic for
> > initial-delay.
> >
> > v3 fixes the attr test of test-record-C0
> > v2 fixes the dummy event configuration and a branch stack issue.
>
> Something I noticed only now is that this ends up in the perf.data file,
> and we don't need it at all there, i.e.
>
>   # perf record -I
>
> I.e. system wide, asking for registers now ends up with:
>
> [root@quaco ~]# perf record -I
> ^C[ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 2.855 MB perf.data (4902 samples) ]
> [root@quaco ~]# perf evlist
> cycles
> dummy:HG
> [root@quaco ~]# perf evlist -v
> cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: 
> IP|TID|TIME|ID|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1, inherit: 
> 1, freq: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, 
> sample_regs_intr: 0xff0fff
> dummy:HG: type: 1, size: 120, config: 0x9, { sample_period, sample_freq }: 
> 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD|REGS_INTR, read_format: ID, 
> inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, mmap2: 1, 
> comm_exec: 1, ksymbol: 1, bpf_event: 1, sample_regs_intr: 0xff0fff
> [root@quaco ~]#
>
> For perf top is ok to reuse the main evlist, as those are not going to
> hit the disk, but for 'perf record' it pollutes the perf.data file with
> that dummy event.
>
> This was a problem introduced with initial-delay, that IIRC predates the
> side band thread tho, I'll have to think about it, just writing this
> down to revisit this, as may raise some eyebrows by now being more
> exposed.

Agreed. We've had to adjust some tooling like the protobuf convertor
because of this:
https://github.com/google/perf_data_converter/pull/88

Thanks,
Ian

> - Arnaldo
>
> > Suggested-by: Stephane Eranian 
> > Signed-off-by: Ian Rogers 
> > ---
> >  tools/perf/builtin-record.c | 19 +++---
> >  tools/perf/tests/attr/system-wide-dummy | 50 +
> >  tools/perf/tests/attr/test-record-C0| 12 +-
> >  tools/perf/util/evsel.c |  5 ++-
> >  4 files changed, 78 insertions(+), 8 deletions(-)
> >  create mode 100644 tools/perf/tests/attr/system-wide-dummy
> >
> > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> > index 1ab349abe904..8d1e93351298 100644
> > --- a/tools/perf/builtin-record.c
> > +++ b/tools/perf/builtin-record.c
> > @@ -805,19 +805,28 @@ static int record__open(struct record *rec)
> >   int rc = 0;
> >
> >   /*
> > -  * For initial_delay we need to add a dummy event so that we can track
> > -  * PERF_RECORD_MMAP while we wait for the initial delay to enable the
> > -  * real events, the ones asked by the user.
> > +  * For initial_delay or system wide, we need to add a dummy event so
> > +  * that we can track PERF_RECORD_MMAP to cover the delay of waiting or
> > +  * event synthesis.
> >*/
> > - if (opts->initial_delay) {
> > + if (opts->initial_delay || target__has_cpu(&opts->target)) {
> >   if (perf_evlist__add_dummy(evlist))
> >   return -ENOMEM;
> >
> > + /* Disable tracking of mmaps on lead event. */
> >   pos = evlist__first(evlist);
> >   pos->tracking = 0;
> > + /* Set up dummy event. */
> >   pos = evlist__last(evlist);
> >   pos->tracking = 1;
> > - pos->core.attr.enable_on_exec = 1;
> > + /*
> > +  * Enable the dummy event when the process is forked for
> > +  * initial_delay, immediately for system wide.
> > +  */
> > + if (opts->initial_delay)
> > + pos->core.attr.enable_on_exec = 1;
> > + else
> > + pos->immediate = 1;
> >   }
> >
> >   perf_evlist__config(evlist, opts, &callchain_param);
> > diff --git a/tools/perf/tests/attr/system-wide-dummy 
> > b/tools/perf/tests/attr/system-wide-dummy
> > new file mode 100644
> > index 000000000000..eba723cc0d38
> > --- /dev/null
> > +++ b/tools/perf/tests/attr/system-wide-dummy
> > @@ -0,0 +1,50 @@
> > +# Event added by system-wide or CPU perf-record to handle the race of
> > +# processes starting while /proc is processed.
> > +[event]
> > +fd=1
> > +group_fd=-1
> > +cpu=*
> > +pid=-1
> > +flags=8
> > +type=1
> > +size=120
> > +config=9
> > +sample_period=4000
> > +sample_type=455
> > +read_format=4
> > +# Event will be enabled right away.
> > +disabled=0
> > +inherit=1
> > +pinned=0
> > +exclusive=0
> > +exclude_user=0
> > +exclude_kernel=0
> > +exclude_hv=0
> > +exclude_idle=0
> > +mmap=1
> > +comm=1
> > +freq=1
> > +inherit_sta

[PATCH V3 1/8] fs/ext4: Narrow scope of DAX check in setflags

2020-05-19 Thread ira . weiny
From: Ira Weiny 

When preventing DAX and journaling on an inode, use the effective DAX
check rather than the mount option.

This will be required to support per inode DAX flags.

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 
---
 fs/ext4/ioctl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index bfc1281fc4cb..5813e5e73eab 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -393,9 +393,9 @@ static int ext4_ioctl_setflags(struct inode *inode,
if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL)) {
/*
 * Changes to the journaling mode can cause unsafe changes to
-* S_DAX if we are using the DAX mount option.
+* S_DAX if the inode is DAX
 */
-   if (test_opt(inode->i_sb, DAX)) {
+   if (IS_DAX(inode)) {
err = -EBUSY;
goto flags_out;
}
-- 
2.25.1
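
To make the difference between the two checks concrete, here is a small
user-space model. struct demo_inode and both functions are hypothetical;
inode_dax plays the role of what IS_DAX(inode) reports and mount_dax the role
of the mount option.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two pieces of state involved. */
struct demo_inode {
	bool mount_dax;	/* filesystem was mounted with the DAX option */
	bool inode_dax;	/* effective per-inode state */
};

/* Before the patch: any inode on a DAX mount refuses the journaling change. */
int demo_journal_change_old(const struct demo_inode *inode)
{
	assert(inode != NULL);
	return inode->mount_dax ? -EBUSY : 0;
}

/* After the patch: only an inode that is effectively DAX refuses it. */
int demo_journal_change_new(const struct demo_inode *inode)
{
	assert(inode != NULL);
	return inode->inode_dax ? -EBUSY : 0;
}
```

With per-inode DAX flags the two states can differ, and in that case only the
second check gives the intended answer.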



[PATCH V3 5/8] fs/ext4: Only change S_DAX on inode load

2020-05-19 Thread ira . weiny
From: Ira Weiny 

To prevent complications with in-memory inodes, we only set S_DAX on
inode load.  FS_XFLAG_DAX can be changed at any time, and S_DAX will
change after inode eviction and reload.

Add init bool to ext4_set_inode_flags() to indicate if the inode is
being newly initialized.

Assert that S_DAX is not set on an inode which is just being loaded.
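
The intended behaviour can be sketched in user space as below. The flag bit
values, struct demo_inode and demo_set_inode_flags() are hypothetical and only
model the S_DAX handling described here, not the full flag set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define DEMO_S_SYNC	0x1u
#define DEMO_S_DAX	0x2u

/* Hypothetical model of the state the real function consults. */
struct demo_inode {
	unsigned int i_flags;	/* in-memory flags */
	bool want_sync;		/* on-disk sync flag is set */
	bool dax_eligible;	/* result of the should-enable-DAX check */
};

/*
 * Recompute the in-memory flags. S_DAX is preserved if already set and
 * is only newly enabled when init is true, i.e. on inode load.
 */
void demo_set_inode_flags(struct demo_inode *inode, bool init)
{
	unsigned int new_fl = 0;

	assert(inode != NULL);

	if (inode->want_sync)
		new_fl |= DEMO_S_SYNC;

	/* Keep S_DAX if it was set when the inode was loaded. */
	new_fl |= inode->i_flags & DEMO_S_DAX;
	if (init && inode->dax_eligible)
		new_fl |= DEMO_S_DAX;

	inode->i_flags = new_fl;
}
```

A later call with init == false, for example from a setflags path, can update
the other bits but leaves S_DAX exactly as it was at load time.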

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 

---
Changes from V2:
Rework based on moving the encryption patch to the end.

Changes from RFC:
Change J_ASSERT() to WARN_ON_ONCE()
Fix bug which would clear S_DAX incorrectly
---
 fs/ext4/ext4.h   |  2 +-
 fs/ext4/ialloc.c |  2 +-
 fs/ext4/inode.c  | 13 ++---
 fs/ext4/ioctl.c  |  3 ++-
 fs/ext4/super.c  |  4 ++--
 fs/ext4/verity.c |  2 +-
 6 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 1a3daf2d18ef..86a0994332ce 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2692,7 +2692,7 @@ extern int ext4_can_truncate(struct inode *inode);
 extern int ext4_truncate(struct inode *);
 extern int ext4_break_layouts(struct inode *);
 extern int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length);
-extern void ext4_set_inode_flags(struct inode *);
+extern void ext4_set_inode_flags(struct inode *, bool init);
 extern int ext4_alloc_da_blocks(struct inode *inode);
 extern void ext4_set_aops(struct inode *inode);
 extern int ext4_writepage_trans_blocks(struct inode *);
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 4b8c9a9bdf0c..7941c140723f 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -1116,7 +1116,7 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir,
ei->i_block_group = group;
ei->i_last_alloc_group = ~0;
 
-   ext4_set_inode_flags(inode);
+   ext4_set_inode_flags(inode, true);
if (IS_DIRSYNC(inode))
ext4_handle_sync(handle);
if (insert_inode_locked(inode) < 0) {
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d3a4c2ed7a1c..23e42a223235 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4419,11 +4419,13 @@ static bool ext4_should_enable_dax(struct inode *inode)
return false;
 }
 
-void ext4_set_inode_flags(struct inode *inode)
+void ext4_set_inode_flags(struct inode *inode, bool init)
 {
unsigned int flags = EXT4_I(inode)->i_flags;
unsigned int new_fl = 0;
 
+   WARN_ON_ONCE(IS_DAX(inode) && init);
+
if (flags & EXT4_SYNC_FL)
new_fl |= S_SYNC;
if (flags & EXT4_APPEND_FL)
@@ -4434,8 +4436,13 @@ void ext4_set_inode_flags(struct inode *inode)
new_fl |= S_NOATIME;
if (flags & EXT4_DIRSYNC_FL)
new_fl |= S_DIRSYNC;
-   if (ext4_should_enable_dax(inode))
+
+   /* Because of the way inode_set_flags() works we must preserve S_DAX
+* here if already set. */
+   new_fl |= (inode->i_flags & S_DAX);
+   if (init && ext4_should_enable_dax(inode))
new_fl |= S_DAX;
+
if (flags & EXT4_ENCRYPT_FL)
new_fl |= S_ENCRYPTED;
if (flags & EXT4_CASEFOLD_FL)
@@ -4649,7 +4656,7 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 * not initialized on a new filesystem. */
}
ei->i_flags = le32_to_cpu(raw_inode->i_flags);
-   ext4_set_inode_flags(inode);
+   ext4_set_inode_flags(inode, true);
inode->i_blocks = ext4_inode_blocks(raw_inode, ei);
ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl_lo);
if (ext4_has_feature_64bit(sb))
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 5813e5e73eab..145083e8cd1e 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -381,7 +381,8 @@ static int ext4_ioctl_setflags(struct inode *inode,
ext4_clear_inode_flag(inode, i);
}
 
-   ext4_set_inode_flags(inode);
+   ext4_set_inode_flags(inode, false);
+
inode->i_ctime = current_time(inode);
 
err = ext4_mark_iloc_dirty(handle, inode, &iloc);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 7b99c44d0a91..3cb9b48d3cc4 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1348,7 +1348,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 * Update inode->i_flags - S_ENCRYPTED will be enabled,
 * S_DAX may be disabled
 */
-   ext4_set_inode_flags(inode);
+   ext4_set_inode_flags(inode, false);
}
return res;
}
@@ -1375,7 +1375,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 * Update inode->i_flags - S_ENCRYPTED will be enabled,
 * S_DAX may be disabled
 */
-   ext4_set_inode_flags(inode);
+   ext4_set_inode_flags(inode, false);
res = ext4_mark_inode_dir

[PATCH V3 6/8] fs/ext4: Make DAX mount option a tri-state

2020-05-19 Thread ira . weiny
From: Ira Weiny 

We add 'always', 'never', and 'inode' (default).  '-o dax' continues to
operate as before, which is equivalent to 'always'.  This new
functionality is limited to ext4 only.

Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT2_DAX_NEVER and set
it and EXT4_MOUNT_DAX_ALWAYS appropriately for the mode.

We also force EXT4_MOUNT2_DAX_NEVER if !CONFIG_FS_DAX.

Finally, EXT4_MOUNT2_DAX_INODE is used solely to detect if the user
specified that option for printing.
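
As a rough user-space sketch of the option-to-mode mapping described above
(not the ext4 parser itself): the option strings are the ones added by this
patch, while the enum and demo_parse_dax() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tri-state; 'inode' is the default mode. */
enum demo_dax_mode {
	DEMO_DAX_INODE,
	DEMO_DAX_ALWAYS,
	DEMO_DAX_NEVER,
};

/*
 * Map a single mount option string to a mode. Returns 0 on success or
 * -EINVAL if the string is not one of the DAX options.
 */
int demo_parse_dax(const char *opt, enum demo_dax_mode *mode)
{
	assert(opt != NULL && mode != NULL);

	if (!strcmp(opt, "dax") || !strcmp(opt, "dax=always"))
		*mode = DEMO_DAX_ALWAYS;	/* plain 'dax' keeps its old meaning */
	else if (!strcmp(opt, "dax=inode"))
		*mode = DEMO_DAX_INODE;
	else if (!strcmp(opt, "dax=never"))
		*mode = DEMO_DAX_NEVER;
	else
		return -EINVAL;

	return 0;
}
```

A caller would start from DEMO_DAX_INODE and overwrite it only when one of the
options is present, which matches 'inode' being the default.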

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 

---
Changes from V1:
Fix up mounting options to only show an option if specified
Fix remount to prevent dax changes
Isolate behavior to ext4 only

Changes from RFC:
Combine remount check for DAX_NEVER with DAX_ALWAYS
Update ext4_should_enable_dax()
---
 fs/ext4/ext4.h  |  2 ++
 fs/ext4/inode.c |  2 ++
 fs/ext4/super.c | 67 +
 3 files changed, 61 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 86a0994332ce..6235440e4c39 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1168,6 +1168,8 @@ struct ext4_inode_info {
  blocks */
 #define EXT4_MOUNT2_HURD_COMPAT 0x0004 /* Support HURD-castrated
  file systems */
+#define EXT4_MOUNT2_DAX_NEVER  0x0008 /* Do not allow Direct Access */
+#define EXT4_MOUNT2_DAX_INODE  0x0010 /* For printing options only */
 
 #define EXT4_MOUNT2_EXPLICIT_JOURNAL_CHECKSUM  0x0008 /* User explicitly
specified journal checksum */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 23e42a223235..140b1930e2f4 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4400,6 +4400,8 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
 
 static bool ext4_should_enable_dax(struct inode *inode)
 {
+   if (test_opt2(inode->i_sb, DAX_NEVER))
+   return false;
if (!S_ISREG(inode->i_mode))
return false;
if (ext4_should_journal_data(inode))
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 3cb9b48d3cc4..5ba65eb0e2ef 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1512,7 +1512,8 @@ enum {
Opt_usrjquota, Opt_grpjquota, Opt_offusrjquota, Opt_offgrpjquota,
Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota,
Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
-   Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, Opt_dax,
+   Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
+   Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never,
Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
Opt_nowarn_on_error, Opt_mblk_io_submit,
Opt_lazytime, Opt_nolazytime, Opt_debug_want_extra_isize,
@@ -1579,6 +1580,9 @@ static const match_table_t tokens = {
{Opt_nobarrier, "nobarrier"},
{Opt_i_version, "i_version"},
{Opt_dax, "dax"},
+   {Opt_dax_always, "dax=always"},
+   {Opt_dax_inode, "dax=inode"},
+   {Opt_dax_never, "dax=never"},
{Opt_stripe, "stripe=%u"},
{Opt_delalloc, "delalloc"},
{Opt_warn_on_error, "warn_on_error"},
@@ -1726,6 +1730,7 @@ static int clear_qf_name(struct super_block *sb, int qtype)
 #define MOPT_NO_EXT3   0x0200
 #define MOPT_EXT4_ONLY (MOPT_NO_EXT2 | MOPT_NO_EXT3)
 #define MOPT_STRING    0x0400
+#define MOPT_SKIP  0x0800
 
 static const struct mount_opts {
int token;
@@ -1775,7 +1780,13 @@ static const struct mount_opts {
{Opt_min_batch_time, 0, MOPT_GTE0},
{Opt_inode_readahead_blks, 0, MOPT_GTE0},
{Opt_init_itable, 0, MOPT_GTE0},
-   {Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
+   {Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET | MOPT_SKIP},
+   {Opt_dax_always, EXT4_MOUNT_DAX_ALWAYS,
+   MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
+   {Opt_dax_inode, EXT4_MOUNT2_DAX_INODE,
+   MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
+   {Opt_dax_never, EXT4_MOUNT2_DAX_NEVER,
+   MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
{Opt_stripe, 0, MOPT_GTE0},
{Opt_resuid, 0, MOPT_GTE0},
{Opt_resgid, 0, MOPT_GTE0},
@@ -2084,13 +2095,32 @@ static int handle_mount_opt(struct super_block *sb, char *opt, int token,
}
sbi->s_jquota_fmt = m->mount_opt;
 #endif
-   } else if (token == Opt_dax) {
+   } else if (token == Opt_dax || token == Opt_dax_always ||
+  token == Opt_dax_inode || token == Opt_dax_never) {
 #ifdef CONFIG_FS_DAX
-   ext4_msg(sb, KERN_WARNING,
-   "DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
-   sbi->s_mount_opt |= m->mount_opt;
+   switch (token) {
+   case Opt_dax:
+   case Opt_dax_always:

[PATCH V3 4/8] fs/ext4: Update ext4_should_use_dax()

2020-05-19 Thread ira . weiny
From: Ira Weiny 

S_DAX should only be enabled when the underlying block device supports
dax.

Change ext4_should_use_dax() to check for device support prior to the
overriding mount option.

While we are at it, rename the function to ext4_should_enable_dax(), as
this better reflects the ask and matches xfs.

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 

---
Changes from RFC
Change function name to 'should enable'
Clean up bool conversion
Reorder this for better bisect-ability
---
 fs/ext4/inode.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index a10ff12194db..d3a4c2ed7a1c 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4398,10 +4398,8 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
!ext4_test_inode_state(inode, EXT4_STATE_XATTR));
 }
 
-static bool ext4_should_use_dax(struct inode *inode)
+static bool ext4_should_enable_dax(struct inode *inode)
 {
-   if (!test_opt(inode->i_sb, DAX_ALWAYS))
-   return false;
if (!S_ISREG(inode->i_mode))
return false;
if (ext4_should_journal_data(inode))
@@ -4412,7 +4410,13 @@ static bool ext4_should_use_dax(struct inode *inode)
return false;
if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY))
return false;
-   return true;
+   if (!bdev_dax_supported(inode->i_sb->s_bdev,
+   inode->i_sb->s_blocksize))
+   return false;
+   if (test_opt(inode->i_sb, DAX_ALWAYS))
+   return true;
+
+   return false;
 }
 
 void ext4_set_inode_flags(struct inode *inode)
@@ -4430,7 +4434,7 @@ void ext4_set_inode_flags(struct inode *inode)
new_fl |= S_NOATIME;
if (flags & EXT4_DIRSYNC_FL)
new_fl |= S_DIRSYNC;
-   if (ext4_should_use_dax(inode))
+   if (ext4_should_enable_dax(inode))
new_fl |= S_DAX;
if (flags & EXT4_ENCRYPT_FL)
new_fl |= S_ENCRYPTED;
-- 
2.25.1
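
The new check order can be modelled in user space roughly as follows. The
struct and its field names are hypothetical, and only the checks visible in
the hunks above are included; the lines elided between the two hunks are not
modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state the visible checks consult. */
struct demo_inode {
	bool is_regular;	/* S_ISREG() */
	bool journals_data;	/* data journaling is in use */
	bool verity;		/* verity flag is set */
	bool bdev_supports_dax;	/* block device supports DAX */
	bool mount_dax_always;	/* DAX_ALWAYS mount option is set */
};

/*
 * Per-inode exclusions come first, then device support, and the mount
 * option is consulted last.
 */
bool demo_should_enable_dax(const struct demo_inode *inode)
{
	assert(inode != NULL);

	if (!inode->is_regular)
		return false;
	if (inode->journals_data)
		return false;
	if (inode->verity)
		return false;
	if (!inode->bdev_supports_dax)
		return false;
	if (inode->mount_dax_always)
		return true;

	return false;
}
```

Putting the mount option last leaves a single place where a later per-inode
flag check could be added after the device check.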



[PATCH V3 2/8] fs/ext4: Disallow verity if inode is DAX

2020-05-19 Thread ira . weiny
From: Ira Weiny 

Verity and DAX are incompatible.  Changing the DAX mode due to a verity
flag change is wrong without a corresponding address_space_operations
update.

Make the 2 options mutually exclusive by returning an error if DAX was
set first.

(Setting DAX is already disabled if Verity is set first.)

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 

---
Changes from V2:
Remove Section title 'Verity and DAX'

Changes:
remove WARN_ON_ONCE
Add documentation for DAX/Verity exclusivity
---
 Documentation/filesystems/ext4/verity.rst | 3 +++
 fs/ext4/verity.c  | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/Documentation/filesystems/ext4/verity.rst b/Documentation/filesystems/ext4/verity.rst
index 3e4c0ee0e068..e99ff3fd09f7 100644
--- a/Documentation/filesystems/ext4/verity.rst
+++ b/Documentation/filesystems/ext4/verity.rst
@@ -39,3 +39,6 @@ is encrypted as well as the data itself.
 
 Verity files cannot have blocks allocated past the end of the verity
 metadata.
+
+Verity and DAX are not compatible and attempts to set both of these flags
+on a file will fail.
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index dc5ec724d889..f05a09fb2ae4 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -113,6 +113,9 @@ static int ext4_begin_enable_verity(struct file *filp)
handle_t *handle;
int err;
 
+   if (IS_DAX(inode))
+   return -EINVAL;
+
if (ext4_verity_in_progress(inode))
return -EBUSY;
 
-- 
2.25.1



[PATCH V3 8/8] Documentation/dax: Update DAX enablement for ext4

2020-05-19 Thread ira . weiny
From: Ira Weiny 

Update the document to reflect that ext4 and xfs now behave the same.

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 

---
Changes from RFC:
Update with ext2 text...
---
 Documentation/filesystems/dax.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/dax.txt b/Documentation/filesystems/dax.txt
index 735fb4b54117..265c4f808dbf 100644
--- a/Documentation/filesystems/dax.txt
+++ b/Documentation/filesystems/dax.txt
@@ -25,7 +25,7 @@ size when creating the filesystem.
 Currently 3 filesystems support DAX: ext2, ext4 and xfs.  Enabling DAX on them
 is different.
 
-Enabling DAX on ext4 and ext2
+Enabling DAX on ext2
 -
 
 When mounting the filesystem, use the "-o dax" option on the command line or
@@ -33,8 +33,8 @@ add 'dax' to the options in /etc/fstab.  This works to enable DAX on all files
 within the filesystem.  It is equivalent to the '-o dax=always' behavior below.
 
 
-Enabling DAX on xfs

+Enabling DAX on xfs and ext4
+
 
 Summary
 ---
-- 
2.25.1



[PATCH V3 0/8] Enable ext4 support for per-file/directory DAX operations

2020-05-19 Thread ira . weiny
From: Ira Weiny 

Changes from V2:
Rework DAX exclusivity with verity and encryption based on feedback
from Eric

Enable the same per-file DAX support in ext4 as was done for xfs.  This series
builds on and depends on the V11 series for xfs.[1]

This passes the same xfstests test as XFS.

The only issue is that this modifies the old mount option parsing code rather
than waiting for the new parsing code to be finalized.

This series starts with 3 fixes which include making Verity and Encrypt truly
mutually exclusive from DAX.  I think these first 3 patches should be picked up
for 5.8 regardless of what is decided regarding the mount parsing.

[1] https://lore.kernel.org/lkml/20200428002142.404144-1-ira.we...@intel.com/

To: linux-kernel@vger.kernel.org
Cc: "Darrick J. Wong" 
Cc: Dan Williams 
Cc: Dave Chinner 
Cc: Christoph Hellwig 
Cc: "Theodore Y. Ts'o" 
Cc: Jan Kara 
Cc: linux-e...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-fsde...@vger.kernel.org


Ira Weiny (8):
  fs/ext4: Narrow scope of DAX check in setflags
  fs/ext4: Disallow verity if inode is DAX
  fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS
  fs/ext4: Update ext4_should_use_dax()
  fs/ext4: Only change S_DAX on inode load
  fs/ext4: Make DAX mount option a tri-state
  fs/ext4: Introduce DAX inode flag
  Documentation/dax: Update DAX enablement for ext4

 Documentation/filesystems/dax.txt |  6 +-
 Documentation/filesystems/ext4/verity.rst |  3 +
 fs/ext4/ext4.h| 22 +--
 fs/ext4/ialloc.c  |  2 +-
 fs/ext4/inode.c   | 25 +--
 fs/ext4/ioctl.c   | 41 ++--
 fs/ext4/super.c   | 80 ++-
 fs/ext4/verity.c  |  5 +-
 include/uapi/linux/fs.h   |  1 +
 9 files changed, 148 insertions(+), 37 deletions(-)

-- 
2.25.1



[PATCH V3 3/8] fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS

2020-05-19 Thread ira . weiny
From: Ira Weiny 

Rename the flag in preparation for the new tri-state mount option, which
then introduces EXT4_MOUNT_DAX_NEVER.

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 

---
Changes:
New patch
---
 fs/ext4/ext4.h  |  4 ++--
 fs/ext4/inode.c |  2 +-
 fs/ext4/super.c | 12 ++--
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 91eb4381cae5..1a3daf2d18ef 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1123,9 +1123,9 @@ struct ext4_inode_info {
 #define EXT4_MOUNT_MINIX_DF 0x00080 /* Mimics the Minix statfs */
 #define EXT4_MOUNT_NOLOAD  0x00100 /* Don't use existing journal*/
 #ifdef CONFIG_FS_DAX
-#define EXT4_MOUNT_DAX 0x00200 /* Direct Access */
+#define EXT4_MOUNT_DAX_ALWAYS  0x00200 /* Direct Access */
 #else
-#define EXT4_MOUNT_DAX 0
+#define EXT4_MOUNT_DAX_ALWAYS  0
 #endif
 #define EXT4_MOUNT_DATA_FLAGS  0x00C00 /* Mode for data writes: */
 #define EXT4_MOUNT_JOURNAL_DATA 0x00400 /* Write data to journal */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 2a4aae6acdcb..a10ff12194db 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4400,7 +4400,7 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
 
 static bool ext4_should_use_dax(struct inode *inode)
 {
-   if (!test_opt(inode->i_sb, DAX))
+   if (!test_opt(inode->i_sb, DAX_ALWAYS))
return false;
if (!S_ISREG(inode->i_mode))
return false;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index bf5fcb477f66..7b99c44d0a91 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1775,7 +1775,7 @@ static const struct mount_opts {
{Opt_min_batch_time, 0, MOPT_GTE0},
{Opt_inode_readahead_blks, 0, MOPT_GTE0},
{Opt_init_itable, 0, MOPT_GTE0},
-   {Opt_dax, EXT4_MOUNT_DAX, MOPT_SET},
+   {Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
{Opt_stripe, 0, MOPT_GTE0},
{Opt_resuid, 0, MOPT_GTE0},
{Opt_resgid, 0, MOPT_GTE0},
@@ -3982,7 +3982,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 "both data=journal and dioread_nolock");
goto failed_mount;
}
-   if (test_opt(sb, DAX)) {
+   if (test_opt(sb, DAX_ALWAYS)) {
ext4_msg(sb, KERN_ERR, "can't mount with "
 "both data=journal and dax");
goto failed_mount;
@@ -4092,7 +4092,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
goto failed_mount;
}
 
-   if (sbi->s_mount_opt & EXT4_MOUNT_DAX) {
+   if (sbi->s_mount_opt & EXT4_MOUNT_DAX_ALWAYS) {
if (ext4_has_feature_inline_data(sb)) {
ext4_msg(sb, KERN_ERR, "Cannot use DAX on a filesystem"
" that may contain inline data");
@@ -5412,7 +5412,7 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
err = -EINVAL;
goto restore_opts;
}
-   if (test_opt(sb, DAX)) {
+   if (test_opt(sb, DAX_ALWAYS)) {
ext4_msg(sb, KERN_ERR, "can't mount with "
 "both data=journal and dax");
err = -EINVAL;
@@ -5433,10 +5433,10 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
goto restore_opts;
}
 
-   if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX) {
+   if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS) {
ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
"dax flag with busy inodes while remounting");
-   sbi->s_mount_opt ^= EXT4_MOUNT_DAX;
+   sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS;
}
 
if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
-- 
2.25.1
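
Editorial sketch (not part of the patch): the rename above prepares for a
tri-state dax mount option. A minimal userspace model of such a decision is
shown below. The enum, its member names, and should_use_dax() are hypothetical
stand-ins, and the real ext4 helper performs additional checks that are
omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the mount-option state; not the ext4 definitions. */
enum dax_mount_mode {
	DAX_MODE_INODE,		/* no mount-wide choice: defer to a per-inode flag */
	DAX_MODE_ALWAYS,	/* models EXT4_MOUNT_DAX_ALWAYS from the patch */
	DAX_MODE_NEVER,		/* models the planned EXT4_MOUNT_DAX_NEVER */
};

/*
 * Simplified decision: only regular files are candidates, the mount-wide
 * setting wins when it is ALWAYS or NEVER, otherwise the inode flag decides.
 */
static bool should_use_dax(enum dax_mount_mode mode, bool is_regular_file,
			   bool inode_dax_flag)
{
	if (!is_regular_file)
		return false;
	if (mode == DAX_MODE_NEVER)
		return false;
	if (mode == DAX_MODE_ALWAYS)
		return true;
	return inode_dax_flag;
}
```

With only two states (set or clear), a single bit such as EXT4_MOUNT_DAX was
enough; the third state is what motivates the more specific name.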



[PATCH V3 7/8] fs/ext4: Introduce DAX inode flag

2020-05-19 Thread ira . weiny
From: Ira Weiny 

Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.

Set the flag to be user visible and changeable.  Set the flag to be
inherited.  Allow applications to change the flag at any time, except
when VERITY or ENCRYPT is set.

Disallow setting VERITY or ENCRYPT if DAX is set.

Finally, on regular files, flag the inode to not be cached to facilitate
changing S_DAX on the next creation of the inode.

Signed-off-by: Ira Weiny 

---
Change from V2:
Add in making verity and DAX exclusive.
'Squash' in making encryption and DAX exclusive.
Add in EXT4_INODE_DAX flag definition to be compatible with
ext4_[set|test]_inode_flag() bit operations
Use ext4_[set|test]_inode_flag() bit operations to be consistent
with other code.

Change from V0:
Add FS_DAX_FL to include/uapi/linux/fs.h
to be consistent
Move ext4_dax_dontcache() to ext4_ioctl_setflags()
This ensures that it is only set when the flags are going to be
set and not if there is an error
Also this sets don't cache in the FS_IOC_SETFLAGS case

Change from RFC:
use new d_mark_dontcache()
Allow caching if ALWAYS/NEVER is set
Rebased to latest Linus master
Change flag to unused 0x0100
update ext4_should_enable_dax()
---
 fs/ext4/ext4.h  | 14 ++
 fs/ext4/inode.c |  2 +-
 fs/ext4/ioctl.c | 34 +-
 fs/ext4/super.c |  3 +++
 fs/ext4/verity.c|  2 +-
 include/uapi/linux/fs.h |  1 +
 6 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 6235440e4c39..467c30a789b6 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -415,13 +415,16 @@ struct flex_groups {
 #define EXT4_VERITY_FL 0x0010 /* Verity protected inode */
 #define EXT4_EA_INODE_FL   0x0020 /* Inode used for large EA */
 /* 0x0040 was formerly EXT4_EOFBLOCKS_FL */
+
+#define EXT4_DAX_FL0x0100 /* Inode is DAX */
+
 #define EXT4_INLINE_DATA_FL0x1000 /* Inode has inline data. */
 #define EXT4_PROJINHERIT_FL0x2000 /* Create with parents projid */
 #define EXT4_CASEFOLD_FL   0x4000 /* Casefolded file */
 #define EXT4_RESERVED_FL   0x8000 /* reserved for ext4 lib */
 
-#define EXT4_FL_USER_VISIBLE   0x705BDFFF /* User visible flags */
-#define EXT4_FL_USER_MODIFIABLE0x604BC0FF /* User modifiable flags */
+#define EXT4_FL_USER_VISIBLE   0x715BDFFF /* User visible flags */
+#define EXT4_FL_USER_MODIFIABLE0x614BC0FF /* User modifiable flags */
 
 /* Flags we can manipulate with through EXT4_IOC_FSSETXATTR */
 #define EXT4_FL_XFLAG_VISIBLE  (EXT4_SYNC_FL | \
@@ -429,14 +432,16 @@ struct flex_groups {
 EXT4_APPEND_FL | \
 EXT4_NODUMP_FL | \
 EXT4_NOATIME_FL | \
-EXT4_PROJINHERIT_FL)
+EXT4_PROJINHERIT_FL | \
+EXT4_DAX_FL)
 
 /* Flags that should be inherited by new inodes from their parent. */
 #define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
   EXT4_SYNC_FL | EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL |\
-  EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL)
+  EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL |\
+  EXT4_DAX_FL)
 
 /* Flags that are appropriate for regular files (all but dir-specific ones). */
 #define EXT4_REG_FLMASK (~(EXT4_DIRSYNC_FL | EXT4_TOPDIR_FL | EXT4_CASEFOLD_FL |\
@@ -488,6 +493,7 @@ enum {
EXT4_INODE_VERITY   = 20,   /* Verity protected inode */
EXT4_INODE_EA_INODE = 21,   /* Inode used for large EA */
 /* 22 was formerly EXT4_INODE_EOFBLOCKS */
+   EXT4_INODE_DAX  = 24,   /* Inode is DAX */
EXT4_INODE_INLINE_DATA  = 28,   /* Data in inode. */
EXT4_INODE_PROJINHERIT  = 29,   /* Create with parents projid */
EXT4_INODE_RESERVED = 31,   /* reserved for ext4 lib */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 140b1930e2f4..ae61db8b8bae 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4418,7 +4418,7 @@ static bool ext4_should_enable_dax(struct inode *inode)
if (test_opt(inode->i_sb, DAX_ALWAYS))
return true;
 
-   return false;
+   return ext4_test_inode_flag(inode, EXT4_INODE_DAX);
 }
 
 void ext4_set_inode_flags(struct inode *inode, bool init)
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 145083e8cd1e..668b8
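
Editorial sketch (not part of the patch): the patch adds EXT4_INODE_DAX = 24
to the bit-number enum and updates the two user-flag masks. The relation
between the bit number and the mask change can be checked in userspace as
below. The helper names are hypothetical; the mask values are the ones shown
in the diff. The EXT4_DAX_FL literal itself appears shortened in this archive
copy, so the sketch derives the mask from the bit number instead.

```c
#include <assert.h>
#include <stdint.h>

/* Bit number taken from the enum in the patch (EXT4_INODE_DAX = 24). */
#define SKETCH_INODE_DAX_BIT 24u

/* Convert a flag bit number into the corresponding 32-bit flag mask. */
static uint32_t sketch_flag_mask(unsigned int bit)
{
	assert(bit < 32);
	return UINT32_C(1) << bit;
}

/* Bits that differ between an old and a new flag mask. */
static uint32_t sketch_mask_delta(uint32_t old_mask, uint32_t new_mask)
{
	return old_mask ^ new_mask;
}
```

Under these assumptions, both EXT4_FL_USER_VISIBLE (0x705BDFFF to 0x715BDFFF)
and EXT4_FL_USER_MODIFIABLE (0x604BC0FF to 0x614BC0FF) change by exactly the
bit that sketch_flag_mask(24) produces.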

Re: general protection fault in kobject_get (2)

2020-05-19 Thread Greg KH
On Tue, May 19, 2020 at 09:53:16PM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:d00f26b6 Merge git://git.kernel.org/pub/scm/linux/kernel/g..
> git tree:   net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1316343c10
> kernel config:  https://syzkaller.appspot.com/x/.config?x=26d0bd769afe1a2c
> dashboard link: https://syzkaller.appspot.com/bug?extid=407fd358a932bbf639c6
> compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
> 
> Unfortunately, I don't have any reproducer for this crash yet.
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+407fd358a932bbf63...@syzkaller.appspotmail.com
> 
> general protection fault, probably for non-canonical address 0xdc13:  [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0098-0x009f]
> CPU: 1 PID: 16682 Comm: syz-executor.3 Not tainted 5.7.0-rc4-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:kobject_get+0x30/0x150 lib/kobject.c:640
> Code: 53 e8 d4 7e c6 fd 4d 85 e4 0f 84 a2 00 00 00 e8 c6 7e c6 fd 49 8d 7c 24 
> 3c 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 89 fa 
> 83 e2 07 38 d0 7f 08 84 c0 0f 85 e7 00 00 00
> RSP: 0018:c9000772f240 EFLAGS: 00010203
> RAX: dc00 RBX: 85acfca0 RCX: c9000fc67000
> RDX: 0013 RSI: 83acadfa RDI: 009c
> RBP: 0060 R08: 8880a8dfa4c0 R09: ed100a03f403
> R10: 8880501fa017 R11: ed100a03f402 R12: 0060
> R13: c9000772f3c0 R14: 88805d1ec4e8 R15: 88805d1ec580
> FS:  7f1ebed26700() GS:8880ae70() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 004d88f0 CR3: a86c4000 CR4: 001406e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  get_device+0x20/0x30 drivers/base/core.c:2620
>  __ib_get_client_nl_info+0x1d4/0x2a0 drivers/infiniband/core/device.c:1863
>  ib_get_client_nl_info+0x30/0x180 drivers/infiniband/core/device.c:1883
>  nldev_get_chardev+0x52b/0xa40 drivers/infiniband/core/nldev.c:1625
>  rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:195 [inline]
>  rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
>  rdma_nl_rcv+0x586/0x900 drivers/infiniband/core/netlink.c:259
>  netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
>  netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329
>  netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918
>  sock_sendmsg_nosec net/socket.c:652 [inline]
>  sock_sendmsg+0xcf/0x120 net/socket.c:672
>  sys_sendmsg+0x6e6/0x810 net/socket.c:2352
>  ___sys_sendmsg+0x100/0x170 net/socket.c:2406
>  __sys_sendmsg+0xe5/0x1b0 net/socket.c:2439
>  do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
>  entry_SYSCALL_64_after_hwframe+0x49/0xb3
> RIP: 0033:0x45c829
> Code: 0d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 
> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 
> 83 db b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7f1ebed25c78 EFLAGS: 0246 ORIG_RAX: 002e
> RAX: ffda RBX: 004ff720 RCX: 0045c829
> RDX:  RSI: 2200 RDI: 0003
> RBP: 0078bf00 R08:  R09: 
> R10:  R11: 0246 R12: 
> R13: 09ad R14: 004d5f10 R15: 7f1ebed266d4
> Modules linked in:
> ---[ end trace 239938a6c4c3c99f ]---
> RIP: 0010:kobject_get+0x30/0x150 lib/kobject.c:640
> Code: 53 e8 d4 7e c6 fd 4d 85 e4 0f 84 a2 00 00 00 e8 c6 7e c6 fd 49 8d 7c 24 
> 3c 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 89 fa 
> 83 e2 07 38 d0 7f 08 84 c0 0f 85 e7 00 00 00
> RSP: 0018:c9000772f240 EFLAGS: 00010203
> RAX: dc00 RBX: 85acfca0 RCX: c9000fc67000
> RDX: 0013 RSI: 83acadfa RDI: 009c
> RBP: 0060 R08: 8880a8dfa4c0 R09: ed100a03f403
> R10: 8880501fa017 R11: ed100a03f402 R12: 0060
> R13: c9000772f3c0 R14: 88805d1ec4e8 R15: 88805d1ec580
> FS:  7f1ebed26700() GS:8880ae70() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 0073fad4 CR3: a86c4000 CR4: 001406e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400

Looks like an IB/rdma issue, poke those developers please :)
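
Editorial sketch (not part of the reply): the quoted report says "KASAN:
null-ptr-deref in range [0x0098-0x009f]" with RDI ending in 9c and RDX ending
in 13. That is consistent with generic KASAN checking the shadow byte for a
small, near-NULL address. The arithmetic can be reproduced in userspace as
below. The constants are the commonly documented x86-64 generic KASAN values
and are an assumption here, since the register values in this archive copy
appear shortened; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 generic KASAN parameters; copies for illustration only. */
#define SKETCH_KASAN_SHADOW_SCALE_SHIFT 3
#define SKETCH_KASAN_SHADOW_OFFSET UINT64_C(0xdffffc0000000000)

/* Shadow byte address for a given address: (addr >> 3) + offset. */
static uint64_t sketch_kasan_shadow_addr(uint64_t addr)
{
	return (addr >> SKETCH_KASAN_SHADOW_SCALE_SHIFT) + SKETCH_KASAN_SHADOW_OFFSET;
}

/* First byte of the 8-byte granule covered by one shadow byte. */
static uint64_t sketch_granule_start(uint64_t addr)
{
	return addr & ~UINT64_C(7);
}

/* Last byte of the 8-byte granule covered by one shadow byte. */
static uint64_t sketch_granule_end(uint64_t addr)
{
	return addr | UINT64_C(7);
}
```

For the address 0x9c this gives a granule of 0x98 to 0x9f and a shadow index
of 0x13, which matches the range and the RDX value in the report. The shadow
address for such a small input is itself non-canonical, which would explain
why the fault is reported as a general protection fault.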


Re: [RFC PATCH 0/8] Qualcomm Cloud AI 100 driver

2020-05-19 Thread Greg Kroah-Hartman
On Tue, May 19, 2020 at 10:11:35PM -0700, Bjorn Andersson wrote:
> On Tue 19 May 21:59 PDT 2020, Greg Kroah-Hartman wrote:
> 
> > On Tue, May 19, 2020 at 10:41:15PM +0200, Daniel Vetter wrote:
> > > > Ok, that's a decision you are going to have to push upward on, as we
> > > > really can't take this without a working, open, userspace.
> > > 
> > > Uh wut.
> > > 
> > > So the merge criteria for drivers/accel (atm still drivers/misc but I
> > > thought that was interim until more drivers showed up) isn't actually
> > > "totally-not-a-gpu accel driver without open source userspace".
> > > 
> > > Instead it's "totally-not-a-gpu accel driver without open source
> > > userspace" _and_ you have to be best buddies with Greg. Or at least
> > > not be on the naughty company list. Since for habanalabs all you
> > > wanted is a few test cases to exercise the ioctls. Not the entire
> > > userspace.
> > 
> > Habanalabs now has their full library opensourced that their tools use
> > directly, so that's not an argument anymore.
> > 
> > My primary point here is the copyright owner of this code, because of
> > that, I'm not going to object to allowing this to be merged without open
> > userspace code.
> > 
> 
> So because it's copyright Linux Foundation you are going to accept it
> without user space, after all?

Huh, no, the exact opposite, sorry, drop the "not" in that above
sentence.  My bad.

greg k-h


Re: [PATCH v1 2/6] bus: mhi: core: Mark device inactive soon after host issues a shutdown

2020-05-19 Thread kbuild test robot
Hi Bhaumik,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on next-20200519]
[cannot apply to linus/master v5.7-rc6 v5.7-rc5 v5.7-rc4 v5.7-rc6]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Bhaumik-Bhatt/Bug-fixes-and-bootup-and-shutdown-improvements/20200520-083400
base:fb57b1fabcb28f358901b2df90abd2b48abc1ca8
config: riscv-allyesconfig (attached as .config)
compiler: riscv64-linux-gcc (GCC) 9.3.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=riscv

If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>, old ones prefixed by <<):

drivers/bus/mhi/core/main.c: In function 'mhi_intvec_threaded_handler':
>> drivers/bus/mhi/core/main.c:397:8: error: implicit declaration of function 'mhi_is_active' [-Werror=implicit-function-declaration]
397 |   if (!mhi_is_active(mhi_cntrl)) {
|^
cc1: some warnings being treated as errors
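
Editorial sketch (not part of the report): an "implicit declaration" error
means the compiler saw a call before any declaration of the function was
visible in that translation unit. The self-contained example below shows the
usual shape of a fix, which is to make a prototype visible before the first
caller. Every name here is a hypothetical stand-in, and the meaning of
"active" is invented for illustration; the report does not show how
mhi_is_active() is defined or where it is meant to live.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-ins; not the MHI core definitions. */
enum sketch_state { SKETCH_STATE_RESET, SKETCH_STATE_M0, SKETCH_STATE_M3 };

struct sketch_controller {
	enum sketch_state dev_state;
};

/*
 * Prototype visible before the first use. Without this line (or a header
 * providing it), the call in sketch_handler() would be an implicit
 * declaration, which -Werror=implicit-function-declaration rejects.
 */
static bool sketch_is_active(const struct sketch_controller *ctrl);

/* Caller that appears earlier in the file than the helper's definition. */
static int sketch_handler(const struct sketch_controller *ctrl)
{
	if (!sketch_is_active(ctrl))
		return 0;	/* bail out early, as the quoted handler does */
	return 1;
}

/* Invented notion of "active" for the sketch: any state from M0 to M3. */
static bool sketch_is_active(const struct sketch_controller *ctrl)
{
	return ctrl->dev_state >= SKETCH_STATE_M0 &&
	       ctrl->dev_state <= SKETCH_STATE_M3;
}
```

In a patch series the same error can also appear when the helper is added by
a different patch or exists only in a different base tree, which is why the
report suggests using '--base'.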

vim +/mhi_is_active +397 drivers/bus/mhi/core/main.c

   371  
   372  irqreturn_t mhi_intvec_threaded_handler(int irq_number, void *priv)
   373  {
   374  struct mhi_controller *mhi_cntrl = priv;
   375  struct device *dev = &mhi_cntrl->mhi_dev->dev;
   376  enum mhi_state state = MHI_STATE_MAX;
   377  enum mhi_pm_state pm_state = 0;
   378  enum mhi_ee_type ee = 0;
   379  bool handle_rddm = false;
   380  
   381  write_lock_irq(&mhi_cntrl->pm_lock);
   382  if (!MHI_REG_ACCESS_VALID(mhi_cntrl->pm_state)) {
   383  write_unlock_irq(&mhi_cntrl->pm_lock);
   384  goto exit_intvec;
   385  }
   386  
   387  state = mhi_get_mhi_state(mhi_cntrl);
   388  ee = mhi_cntrl->ee;
   389  mhi_cntrl->ee = mhi_get_exec_env(mhi_cntrl);
   390  dev_dbg(dev, "local ee:%s device ee:%s dev_state:%s\n",
   391  TO_MHI_EXEC_STR(mhi_cntrl->ee), TO_MHI_EXEC_STR(ee),
   392  TO_MHI_STATE_STR(state));
   393  
   394   /* If device supports RDDM don't bother processing SYS error */
   395  if (mhi_cntrl->rddm_image) {
   396  /* host may be performing a device power down already */
 > 397  if (!mhi_is_active(mhi_cntrl)) {
   398  write_unlock_irq(&mhi_cntrl->pm_lock);
   399  goto exit_intvec;
   400  }
   401  
   402  if (mhi_cntrl->ee == MHI_EE_RDDM && mhi_cntrl->ee != 
ee) {
   403  /* prevent clients from queueing any more 
packets */
   404  pm_state = mhi_tryset_pm_state(mhi_cntrl,
   405 
MHI_PM_SYS_ERR_DETECT);
   406  if (pm_state == MHI_PM_SYS_ERR_DETECT)
   407  handle_rddm = true;
   408  }
   409  
   410  write_unlock_irq(&mhi_cntrl->pm_lock);
   411  
   412  if (handle_rddm) {
   413  dev_err(dev, "RDDM event occurred!\n");
   414  mhi_cntrl->status_cb(mhi_cntrl, MHI_CB_EE_RDDM);
   415  wake_up_all(&mhi_cntrl->state_event);
   416  }
   417  goto exit_intvec;
   418  }
   419  
   420  if (state == MHI_STATE_SYS_ERR) {
   421  dev_dbg(dev, "System error detected\n");
   422  pm_state = mhi_tryset_pm_state(mhi_cntrl,
   423 MHI_PM_SYS_ERR_DETECT);
   424  }
   425  
   426  write_unlock_irq(&mhi_cntrl->pm_lock);
   427  
   428  if (pm_state == MHI_PM_SYS_ERR_DETECT) {
   429  wake_up_all(&mhi_cntrl->state_event);
   430  
   431  /* For fatal errors, we let controller decide next step 
*/
   432  if (MHI_IN_PBL(ee))
   433  mhi_cntrl->status_cb(mhi_cntrl, 
MHI_CB_FATAL_ERROR);
   434  else
   435  mhi_pm_sys_err_handler(mhi_cntrl);
   436  }
   437  
   438  exit_intvec:
   439  
   440  return IRQ_HANDLED;
   441  }
   442  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[tip:x86/urgent] BUILD SUCCESS d7110a26e5905ec2fe3fc88bc6a538901accb72b

2020-05-19 Thread kbuild test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git  x86/urgent
branch HEAD: d7110a26e5905ec2fe3fc88bc6a538901accb72b  x86/mmiotrace: Use cpumask_available() for cpumask_var_t variables

elapsed time: 486m

configs tested: 98
configs skipped: 74

The following configs have been built successfully.
More configs may be tested in the coming days.

arm defconfig
arm  allyesconfig
arm  allmodconfig
arm   allnoconfig
arm64allyesconfig
arm64   defconfig
arm64allmodconfig
arm64 allnoconfig
sparcallyesconfig
mips allyesconfig
m68k allyesconfig
i386  allnoconfig
i386defconfig
i386  debian-10.3
i386 allyesconfig
ia64 allmodconfig
ia64defconfig
ia64  allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k  allnoconfig
m68k   sun3_defconfig
m68kdefconfig
nds32   defconfig
nds32 allnoconfig
csky allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
nios2   defconfig
nios2allyesconfig
openriscdefconfig
c6x  allyesconfig
c6x   allnoconfig
openrisc allyesconfig
xtensa   allyesconfig
h8300allyesconfig
h8300allmodconfig
xtensa  defconfig
arc defconfig
arc  allyesconfig
sh   allmodconfig
shallnoconfig
microblazeallnoconfig
mips  allnoconfig
mips allmodconfig
pariscallnoconfig
parisc  defconfig
parisc   allyesconfig
parisc   allmodconfig
powerpc defconfig
powerpc  allyesconfig
powerpc  rhel-kconfig
powerpc  allmodconfig
powerpc   allnoconfig
i386 randconfig-a006-20200519
i386 randconfig-a005-20200519
i386 randconfig-a001-20200519
i386 randconfig-a003-20200519
i386 randconfig-a004-20200519
i386 randconfig-a002-20200519
x86_64   randconfig-a003-20200519
x86_64   randconfig-a005-20200519
x86_64   randconfig-a004-20200519
x86_64   randconfig-a006-20200519
x86_64   randconfig-a002-20200519
x86_64   randconfig-a001-20200519
i386 randconfig-a012-20200519
i386 randconfig-a014-20200519
i386 randconfig-a016-20200519
i386 randconfig-a011-20200519
i386 randconfig-a015-20200519
i386 randconfig-a013-20200519
riscvallyesconfig
riscv allnoconfig
riscv   defconfig
riscvallmodconfig
s390 allyesconfig
s390  allnoconfig
s390 allmodconfig
s390defconfig
x86_64  defconfig
sparc   defconfig
sparc64 defconfig
sparc64   allnoconfig
sparc64  allyesconfig
sparc64  allmodconfig
um   allmodconfig
umallnoconfig
um   allyesconfig
um  defconfig
x86_64   rhel
x86_64   rhel-7.6
x86_64rhel-7.6-kselftests
x86_64 rhel-7.2-clear
x86_64lkp
x86_64  fedora-25
x86_64  kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[tip:perf/core] BUILD SUCCESS c50c75e9b87946499a62bffc021e95c87a1d57cd

2020-05-19 Thread kbuild test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git  perf/core
branch HEAD: c50c75e9b87946499a62bffc021e95c87a1d57cd  perf/core: Replace zero-length array with flexible-array

elapsed time: 486m

configs tested: 98
configs skipped: 1

The following configs have been built successfully.
More configs may be tested in the coming days.

arm defconfig
arm  allyesconfig
arm  allmodconfig
arm   allnoconfig
arm64allyesconfig
arm64   defconfig
arm64allmodconfig
arm64 allnoconfig
sparcallyesconfig
mips allyesconfig
m68k allyesconfig
i386  allnoconfig
i386defconfig
i386  debian-10.3
i386 allyesconfig
ia64 allmodconfig
ia64defconfig
ia64  allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k  allnoconfig
m68k   sun3_defconfig
m68kdefconfig
nios2   defconfig
nios2allyesconfig
openriscdefconfig
c6x  allyesconfig
c6x   allnoconfig
openrisc allyesconfig
nds32   defconfig
nds32 allnoconfig
csky allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
h8300allmodconfig
xtensa  defconfig
arc defconfig
arc  allyesconfig
sh   allmodconfig
shallnoconfig
microblazeallnoconfig
mips  allnoconfig
mips allmodconfig
pariscallnoconfig
parisc  defconfig
parisc   allyesconfig
parisc   allmodconfig
powerpc defconfig
powerpc  allyesconfig
powerpc  rhel-kconfig
powerpc  allmodconfig
powerpc   allnoconfig
i386 randconfig-a006-20200519
i386 randconfig-a005-20200519
i386 randconfig-a001-20200519
i386 randconfig-a003-20200519
i386 randconfig-a004-20200519
i386 randconfig-a002-20200519
x86_64   randconfig-a003-20200519
x86_64   randconfig-a005-20200519
x86_64   randconfig-a004-20200519
x86_64   randconfig-a006-20200519
x86_64   randconfig-a002-20200519
x86_64   randconfig-a001-20200519
i386 randconfig-a012-20200519
i386 randconfig-a014-20200519
i386 randconfig-a016-20200519
i386 randconfig-a011-20200519
i386 randconfig-a015-20200519
i386 randconfig-a013-20200519
riscvallyesconfig
riscv allnoconfig
riscv   defconfig
riscvallmodconfig
s390 allyesconfig
s390  allnoconfig
s390 allmodconfig
s390defconfig
x86_64  defconfig
sparc   defconfig
sparc64 defconfig
sparc64   allnoconfig
sparc64  allyesconfig
sparc64  allmodconfig
um   allmodconfig
umallnoconfig
um   allyesconfig
um  defconfig
x86_64   rhel
x86_64   rhel-7.6
x86_64rhel-7.6-kselftests
x86_64 rhel-7.2-clear
x86_64lkp
x86_64  fedora-25
x86_64  kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[tip:locking/core] BUILD SUCCESS db78538c75e49c09b002a2cd96a19ae0c39be771

2020-05-19 Thread kbuild test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git  locking/core
branch HEAD: db78538c75e49c09b002a2cd96a19ae0c39be771  locking/lockdep: Replace zero-length array with flexible-array

elapsed time: 486m

configs tested: 98
configs skipped: 1

The following configs have been built successfully.
More configs may be tested in the coming days.

arm defconfig
arm  allyesconfig
arm  allmodconfig
arm   allnoconfig
arm64allyesconfig
arm64   defconfig
arm64allmodconfig
arm64 allnoconfig
sparcallyesconfig
mips allyesconfig
m68k allyesconfig
i386  allnoconfig
i386 allyesconfig
i386defconfig
i386  debian-10.3
ia64 allmodconfig
ia64defconfig
ia64  allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k  allnoconfig
m68k   sun3_defconfig
m68kdefconfig
nds32   defconfig
nds32 allnoconfig
csky allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
h8300allmodconfig
xtensa  defconfig
nios2   defconfig
nios2allyesconfig
openriscdefconfig
c6x  allyesconfig
c6x   allnoconfig
openrisc allyesconfig
arc defconfig
arc  allyesconfig
sh   allmodconfig
shallnoconfig
microblazeallnoconfig
mips  allnoconfig
mips allmodconfig
pariscallnoconfig
parisc  defconfig
parisc   allyesconfig
parisc   allmodconfig
powerpc defconfig
powerpc  allyesconfig
powerpc  rhel-kconfig
powerpc  allmodconfig
powerpc   allnoconfig
i386 randconfig-a006-20200519
i386 randconfig-a005-20200519
i386 randconfig-a001-20200519
i386 randconfig-a003-20200519
i386 randconfig-a004-20200519
i386 randconfig-a002-20200519
x86_64   randconfig-a003-20200519
x86_64   randconfig-a005-20200519
x86_64   randconfig-a004-20200519
x86_64   randconfig-a006-20200519
x86_64   randconfig-a002-20200519
x86_64   randconfig-a001-20200519
i386 randconfig-a012-20200519
i386 randconfig-a014-20200519
i386 randconfig-a016-20200519
i386 randconfig-a011-20200519
i386 randconfig-a015-20200519
i386 randconfig-a013-20200519
riscvallyesconfig
riscv allnoconfig
riscv   defconfig
riscvallmodconfig
s390 allyesconfig
s390  allnoconfig
s390 allmodconfig
s390defconfig
x86_64  defconfig
sparc   defconfig
sparc64 defconfig
sparc64   allnoconfig
sparc64  allyesconfig
sparc64  allmodconfig
umallnoconfig
um  defconfig
um   allmodconfig
um   allyesconfig
x86_64   rhel
x86_64   rhel-7.6
x86_64rhel-7.6-kselftests
x86_64 rhel-7.2-clear
x86_64lkp
x86_64  fedora-25
x86_64  kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[tip:sched/core] BUILD SUCCESS d505b8af58912ae1e1a211fabc9995b19bd40828

2020-05-19 Thread kbuild test robot
   defconfig
powerpc  allyesconfig
powerpc  rhel-kconfig
powerpc  allmodconfig
powerpc   allnoconfig
i386 randconfig-a006-20200519
i386 randconfig-a005-20200519
i386 randconfig-a001-20200519
i386 randconfig-a003-20200519
i386 randconfig-a004-20200519
i386 randconfig-a002-20200519
x86_64   randconfig-a003-20200519
x86_64   randconfig-a005-20200519
x86_64   randconfig-a004-20200519
x86_64   randconfig-a006-20200519
x86_64   randconfig-a002-20200519
x86_64   randconfig-a001-20200519
i386 randconfig-a012-20200519
i386 randconfig-a014-20200519
i386 randconfig-a016-20200519
i386 randconfig-a011-20200519
i386 randconfig-a015-20200519
i386 randconfig-a013-20200519
riscvallyesconfig
riscv allnoconfig
riscv   defconfig
riscvallmodconfig
s390 allyesconfig
s390  allnoconfig
s390 allmodconfig
s390defconfig
x86_64  defconfig
sparc   defconfig
sparc64 defconfig
sparc64   allnoconfig
sparc64  allyesconfig
sparc64  allmodconfig
um   allmodconfig
umallnoconfig
um   allyesconfig
um  defconfig
x86_64   rhel
x86_64   rhel-7.6
x86_64rhel-7.6-kselftests
x86_64 rhel-7.2-clear
x86_64lkp
x86_64  fedora-25
x86_64  kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[tip:sched/urgent] BUILD SUCCESS 39f23ce07b9355d05a64ae303ce20d1c4b92b957

2020-05-19 Thread kbuild test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git  sched/urgent
branch HEAD: 39f23ce07b9355d05a64ae303ce20d1c4b92b957  sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list

elapsed time: 486m

configs tested: 98
configs skipped: 1

The following configs have been built successfully.
More configs may be tested in the coming days.

arm defconfig
arm  allyesconfig
arm  allmodconfig
arm   allnoconfig
arm64allyesconfig
arm64   defconfig
arm64allmodconfig
arm64 allnoconfig
sparcallyesconfig
mips allyesconfig
m68k allyesconfig
i386  allnoconfig
i386defconfig
i386  debian-10.3
i386 allyesconfig
ia64 allmodconfig
ia64defconfig
ia64  allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k  allnoconfig
m68k   sun3_defconfig
m68kdefconfig
nds32   defconfig
nds32 allnoconfig
csky allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
nios2   defconfig
nios2allyesconfig
openriscdefconfig
c6x  allyesconfig
c6x   allnoconfig
openrisc allyesconfig
xtensa   allyesconfig
h8300allyesconfig
h8300allmodconfig
xtensa  defconfig
arc defconfig
arc  allyesconfig
sh   allmodconfig
shallnoconfig
microblazeallnoconfig
mips  allnoconfig
mips allmodconfig
pariscallnoconfig
parisc  defconfig
parisc   allyesconfig
parisc   allmodconfig
powerpc defconfig
powerpc  allyesconfig
powerpc  rhel-kconfig
powerpc  allmodconfig
powerpc   allnoconfig
i386 randconfig-a006-20200519
i386 randconfig-a005-20200519
i386 randconfig-a001-20200519
i386 randconfig-a003-20200519
i386 randconfig-a004-20200519
i386 randconfig-a002-20200519
x86_64   randconfig-a003-20200519
x86_64   randconfig-a005-20200519
x86_64   randconfig-a004-20200519
x86_64   randconfig-a006-20200519
x86_64   randconfig-a002-20200519
x86_64   randconfig-a001-20200519
i386 randconfig-a012-20200519
i386 randconfig-a014-20200519
i386 randconfig-a016-20200519
i386 randconfig-a011-20200519
i386 randconfig-a015-20200519
i386 randconfig-a013-20200519
riscvallyesconfig
riscv allnoconfig
riscv   defconfig
riscvallmodconfig
s390 allyesconfig
s390  allnoconfig
s390 allmodconfig
s390defconfig
x86_64  defconfig
sparc   defconfig
sparc64 defconfig
sparc64   allnoconfig
sparc64  allyesconfig
sparc64  allmodconfig
um   allmodconfig
umallnoconfig
um   allyesconfig
um  defconfig
x86_64   rhel
x86_64   rhel-7.6
x86_64rhel-7.6-kselftests
x86_64 rhel-7.2-clear
x86_64lkp
x86_64  fedora-25
x86_64  kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


Re: [PATCH v2] /dev/mem: Revoke mappings when a driver claims the region

2020-05-19 Thread Greg KH
On Tue, May 19, 2020 at 11:27:02AM -0700, Dan Williams wrote:
> On Tue, May 19, 2020 at 5:11 AM Greg KH  wrote:
> >
> > On Tue, May 19, 2020 at 12:03:06AM -0700, Dan Williams wrote:
> > > Close the hole of holding a mapping over kernel driver takeover event of
> > > a given address range.
> > >
> > > Commit 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
> > > introduced CONFIG_IO_STRICT_DEVMEM with the goal of protecting the
> > > kernel against scenarios where a /dev/mem user tramples memory that a
> > > kernel driver owns. However, this protection only prevents *new* read(),
> > > write() and mmap() requests. Established mappings prior to the driver
> > > calling request_mem_region() are left alone.
> > >
> > > Especially with persistent memory, and the core kernel metadata that is
> > > stored there, there are plentiful scenarios for a /dev/mem user to
> > > violate the expectations of the driver and cause amplified damage.
> > >
> > > Teach request_mem_region() to find and shoot down active /dev/mem
> > > mappings that it believes it has successfully claimed for the exclusive
> > > use of the driver. Effectively a driver call to request_mem_region()
> > > becomes a hole-punch on the /dev/mem device.
> > >
> > > The typical usage of unmap_mapping_range() is part of
> > > truncate_pagecache() to punch a hole in a file, but in this case the
> > > implementation is only doing the "first half" of a hole punch. Namely it
> > > is just evacuating current established mappings of the "hole", and it
> > > relies on the fact that /dev/mem establishes mappings in terms of
> > > absolute physical address offsets. Once existing mmap users are
> > > invalidated they can attempt to re-establish the mapping, or attempt to
> > > continue issuing read(2) / write(2) to the invalidated extent, but they
> > > will then be subject to the CONFIG_IO_STRICT_DEVMEM checking that can
> > > block those subsequent accesses.
> > >
> > > Cc: Arnd Bergmann 
> > > Cc: Ingo Molnar 
> > > Cc: Kees Cook 
> > > Cc: Russell King 
> > > Cc: Andrew Morton 
> > > Cc: Greg Kroah-Hartman 
> > > Fixes: 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
> > > Signed-off-by: Dan Williams 
> > > ---
> > > Changes since v1 [1]:
> > >
> > > - updated the changelog to describe the usage of unmap_mapping_range().
> > >   No other logic changes:
> > >
> > > [1]: 
> > > http://lore.kernel.org/r/158662721802.1893045.12301414116114602646.st...@dwillia2-desk3.amr.corp.intel.com
> > >
> > > Greg, Andrew,
> > >
> > > I have a regression test for this case now. This was found by an
> > > intermittent data corruption scenario on pmem from a test tool using
> > > /dev/mem.
> >
> > Ick, why are test tools messing around in /dev/mem :)
> 
> Yeah, I'm all for useful tools, just not at the expense of kernel integrity.
> 
> > Anyway, this seems sane to me, want me to take it through my tree?
> 
> Yes please, seems to belong with the driver core.

Ok, will wait for a v3 to handle the issue that was just found in
review.

thanks,

greg k-h
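
Editorial sketch (not part of the thread): the quoted changelog describes
request_mem_region() shooting down established /dev/mem mappings that overlap
the range a driver claims. A toy userspace model of that bookkeeping follows.
It is an illustration only: the struct and function names are hypothetical,
and it invalidates a whole mapping on any overlap, whereas the changelog
describes evacuating just the claimed "hole" via unmap_mapping_range().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy record of one user mapping, keyed by physical address range. */
struct toy_mapping {
	uint64_t start;
	uint64_t len;
	bool valid;
};

/* True when [a, a + alen) and [b, b + blen) share at least one byte. */
static bool toy_ranges_overlap(uint64_t a, uint64_t alen,
			       uint64_t b, uint64_t blen)
{
	return a < b + blen && b < a + alen;
}

/*
 * Model a driver claiming [start, start + len): every still-valid mapping
 * that overlaps the claimed range is revoked. Returns the number revoked.
 */
static size_t toy_claim_region(struct toy_mapping *maps, size_t n,
			       uint64_t start, uint64_t len)
{
	size_t revoked = 0;

	for (size_t i = 0; i < n; i++) {
		if (maps[i].valid &&
		    toy_ranges_overlap(maps[i].start, maps[i].len, start, len)) {
			maps[i].valid = false;
			revoked++;
		}
	}
	return revoked;
}
```

In the changelog's terms, a revoked user may try to map the range again, but
that new request is then subject to the CONFIG_IO_STRICT_DEVMEM checks.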


Re: [PATCH 09/15] device core: Add ability to handle multiple dma offsets

2020-05-19 Thread Greg Kroah-Hartman
On Tue, May 19, 2020 at 04:34:07PM -0400, Jim Quinlan wrote:
> diff --git a/include/linux/device.h b/include/linux/device.h
> index ac8e37cd716a..6cd916860b5f 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -493,6 +493,8 @@ struct dev_links_info {
>   * @bus_dma_limit: Limit of an upstream bridge or bus which imposes a smaller
>   *   DMA limit than the device itself supports.
>   * @dma_pfn_offset: offset of DMA memory range relatively of RAM
> + * @dma_map: Like dma_pfn_offset but used when there are multiple
> + *   pfn offsets for multiple dma-ranges.
>   * @dma_parms:   A low level driver may set these to teach IOMMU code about
>   *   segment limitations.
>   * @dma_pools:   Dma pools (if dma'ble device).
> @@ -578,7 +580,12 @@ struct device {
>allocations such descriptors. */
>   u64 bus_dma_limit;  /* upstream dma constraint */
>   unsigned long   dma_pfn_offset;
> -
> +#ifdef CONFIG_DMA_PFN_OFFSET_MAP
> + const void *dma_offset_map; /* Like dma_pfn_offset, but for
> +  * the unlikely case of multiple
> +  * offsets. If non-null, dma_pfn_offset
> +  * will be 0. */
> +#endif
>   struct device_dma_parameters *dma_parms;
>  
>   struct list_headdma_pools;  /* dma pools (if dma'ble) */

I'll defer to Christoph here, but I thought we were trying to get rid of
stuff like this from struct device, not add new things to it for dma
apis.  And why is it a void *?

thanks,

greg k-h
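
Editorial sketch (not part of the reply): on the "why is it a void *?"
question, one alternative to an opaque pointer is a typed table of ranges
with a lookup helper. The example below is a userspace illustration under
stated assumptions. The struct layout, the field names, and
sketch_cpu_to_dma() are hypothetical and are not taken from the patch, and it
works on byte addresses rather than the pfn offsets the patch comment
mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical typed entry: one CPU address window and its DMA base. */
struct sketch_dma_range {
	uint64_t cpu_start;
	uint64_t dma_start;
	uint64_t size;
};

/*
 * Translate a CPU address through the first range that contains it.
 * Returns false when no range covers the address.
 */
static bool sketch_cpu_to_dma(const struct sketch_dma_range *map, size_t n,
			      uint64_t cpu_addr, uint64_t *dma_addr)
{
	for (size_t i = 0; i < n; i++) {
		if (cpu_addr >= map[i].cpu_start &&
		    cpu_addr - map[i].cpu_start < map[i].size) {
			*dma_addr = map[i].dma_start +
				    (cpu_addr - map[i].cpu_start);
			return true;
		}
	}
	return false;
}
```

A typed pointer lets the compiler check every user of the table, which an
opaque const void * cannot do.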


Re: [PATCH v4 2/4] kasan: record and print the free track

2020-05-19 Thread Walter Wu
> On Wed, May 20, 2020 at 6:03 AM Walter Wu  wrote:
> >
> > > On Tue, May 19, 2020 at 4:25 AM Walter Wu  
> > > wrote:
> > > >
> > > > Move free track from slub alloc meta-data to slub free meta-data in
> > > > order to make struct kasan_free_meta size is 16 bytes. It is a good
> > > > size because it is the minimal redzone size and a good number of
> > > > alignment.
> > > >
> > > > For free track in generic KASAN, we do the modification in struct
> > > > kasan_alloc_meta and kasan_free_meta:
> > > > - remove free track from kasan_alloc_meta.
> > > > - add free track into kasan_free_meta.
> > > >
> > > > [1]https://bugzilla.kernel.org/show_bug.cgi?id=198437
> > > >
> > > > Signed-off-by: Walter Wu 
> > > > Suggested-by: Dmitry Vyukov 
> > > > Cc: Andrey Ryabinin 
> > > > Cc: Dmitry Vyukov 
> > > > Cc: Alexander Potapenko 
> > > > ---
> > > >  mm/kasan/common.c  | 22 ++
> > > >  mm/kasan/generic.c | 18 ++
> > > >  mm/kasan/kasan.h   |  7 +++
> > > >  mm/kasan/report.c  | 20 
> > > >  mm/kasan/tags.c| 37 +
> > > >  5 files changed, 64 insertions(+), 40 deletions(-)
> > > >
> > > > diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> > > > index 8bc618289bb1..47b53912f322 100644
> > > > --- a/mm/kasan/common.c
> > > > +++ b/mm/kasan/common.c
> > > > @@ -51,7 +51,7 @@ depot_stack_handle_t kasan_save_stack(gfp_t flags)
> > > > return stack_depot_save(entries, nr_entries, flags);
> > > >  }
> > > >
> > > > -static inline void set_track(struct kasan_track *track, gfp_t flags)
> > > > +void kasan_set_track(struct kasan_track *track, gfp_t flags)
> > > >  {
> > > > track->pid = current->pid;
> > > > track->stack = kasan_save_stack(flags);
> > > > @@ -299,24 +299,6 @@ struct kasan_free_meta *get_free_info(struct 
> > > > kmem_cache *cache,
> > > > return (void *)object + cache->kasan_info.free_meta_offset;
> > > >  }
> > > >
> > > > -
> > > > -static void kasan_set_free_info(struct kmem_cache *cache,
> > > > -   void *object, u8 tag)
> > > > -{
> > > > -   struct kasan_alloc_meta *alloc_meta;
> > > > -   u8 idx = 0;
> > > > -
> > > > -   alloc_meta = get_alloc_info(cache, object);
> > > > -
> > > > -#ifdef CONFIG_KASAN_SW_TAGS_IDENTIFY
> > > > -   idx = alloc_meta->free_track_idx;
> > > > -   alloc_meta->free_pointer_tag[idx] = tag;
> > > > -   alloc_meta->free_track_idx = (idx + 1) % KASAN_NR_FREE_STACKS;
> > > > -#endif
> > > > -
> > > > -   set_track(&alloc_meta->free_track[idx], GFP_NOWAIT);
> > > > -}
> > > > -
> > > >  void kasan_poison_slab(struct page *page)
> > > >  {
> > > > unsigned long i;
> > > > @@ -492,7 +474,7 @@ static void *__kasan_kmalloc(struct kmem_cache 
> > > > *cache, const void *object,
> > > > KASAN_KMALLOC_REDZONE);
> > > >
> > > > if (cache->flags & SLAB_KASAN)
> > > > -   set_track(&get_alloc_info(cache, object)->alloc_track, 
> > > > flags);
> > > > +   kasan_set_track(&get_alloc_info(cache, 
> > > > object)->alloc_track, flags);
> > > >
> > > > return set_tag(object, tag);
> > > >  }
> > > > diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
> > > > index 3372bdcaf92a..763d8a13e0ac 100644
> > > > --- a/mm/kasan/generic.c
> > > > +++ b/mm/kasan/generic.c
> > > > @@ -344,3 +344,21 @@ void kasan_record_aux_stack(void *addr)
> > > > alloc_info->aux_stack[1] = alloc_info->aux_stack[0];
> > > > alloc_info->aux_stack[0] = kasan_save_stack(GFP_NOWAIT);
> > > >  }
> > > > +
> > > > +void kasan_set_free_info(struct kmem_cache *cache,
> > > > +   void *object, u8 tag)
> > > > +{
> > > > +   struct kasan_free_meta *free_meta;
> > > > +
> > > > +   free_meta = get_free_info(cache, object);
> > > > +   kasan_set_track(&free_meta->free_track, GFP_NOWAIT);
> > > > +}
> > > > +
> > > > +struct kasan_track *kasan_get_free_track(struct kmem_cache *cache,
> > > > +   void *object, u8 tag)
> > > > +{
> > > > +   struct kasan_free_meta *free_meta;
> > > > +
> > > > +   free_meta = get_free_info(cache, object);
> > > > +   return &free_meta->free_track;
> > > > +}
> > > > diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> > > > index a7391bc83070..ad897ec36545 100644
> > > > --- a/mm/kasan/kasan.h
> > > > +++ b/mm/kasan/kasan.h
> > > > @@ -127,6 +127,9 @@ struct kasan_free_meta {
> > > >  * Otherwise it might be used for the allocator freelist.
> > > >  */
> > > > struct qlist_node quarantine_link;
> > > > +#ifdef CONFIG_KASAN_GENERIC
> > > > +   struct kasan_track free_track;
> > > > +#endif
> > > >  };
> > > >
> > > >  struct kasan_alloc_meta *get_alloc_info(struct kmem_cache *cache,
> > > > @@ -168,6 +171,10 @@ void kasan_report_invalid_free(void *object, 
> > > > unsigned long ip);
> > > >  struct page *kasan_addr_to_page(const void *addr);
> > > >

Re: [PATCH 5.6 000/192] 5.6.14-rc2 review

2020-05-19 Thread Greg Kroah-Hartman
On Tue, May 19, 2020 at 01:37:20PM -0600, shuah wrote:
> On 5/18/20 11:47 PM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.6.14 release.
> > There are 192 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Thu, 21 May 2020 05:45:41 +.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > 
> > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.6.14-rc2.gz
> > or in the git tree and branch at:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> > linux-5.6.y
> > and the diffstat can be found below.
> > 
> > thanks,
> > 
> > greg k-h
> > 
> 
> Compiled and booted on my test system. No dmesg regressions.

Thanks for testing all of these and letting me know.

greg k-h


Re: [PATCH 5.6 000/192] 5.6.14-rc2 review

2020-05-19 Thread Greg Kroah-Hartman
On Tue, May 19, 2020 at 09:30:22AM -0700, Guenter Roeck wrote:
> On 5/18/20 10:47 PM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.6.14 release.
> > There are 192 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Thu, 21 May 2020 05:45:41 +.
> > Anything received after that time might be too late.
> > 
> 
> Build results:
>   total: 155 pass: 155 fail: 0
> Qemu test results:
>   total: 431 pass: 431 fail: 0

Great, thanks for testing all of these and letting me know.

greg k-h


Re: [PATCH 06/12] xen-blkfront: add callbacks for PM suspend and hibernation

2020-05-19 Thread kbuild test robot
Hi Anchal,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.7-rc6]
[cannot apply to xen-tip/linux-next tip/irq/core tip/auto-latest next-20200519]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Anchal-Agarwal/Fix-PM-hibernation-in-Xen-guests/20200520-073211
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
03fb3acae4be8a6b680ffedb220a8b6c07260b40
config: x86_64-rhel (attached as .config)
compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot 

All error/warnings (new ones prefixed by >>, old ones prefixed by <<):

drivers/block/xen-blkfront.c: In function 'blkfront_freeze':
>> drivers/block/xen-blkfront.c:2699:30: warning: missing terminating " character
xenbus_dev_error(dev, err, "Hibernation Failed.
^
>> drivers/block/xen-blkfront.c:2699:30: error: missing terminating " character
xenbus_dev_error(dev, err, "Hibernation Failed.
^~~~
>> drivers/block/xen-blkfront.c:2700:4: error: 'The' undeclared (first use in this function)
The ring is still busy");
^~~
drivers/block/xen-blkfront.c:2700:4: note: each undeclared identifier is reported only once for each function it appears in
>> drivers/block/xen-blkfront.c:2700:8: error: expected ')' before 'ring'
The ring is still busy");
^~~~
drivers/block/xen-blkfront.c:2700:26: warning: missing terminating " character
The ring is still busy");
^
drivers/block/xen-blkfront.c:2700:26: error: missing terminating " character
The ring is still busy");
^~~
>> drivers/block/xen-blkfront.c:2704:2: error: expected ';' before '}' token
}
^

vim +2699 drivers/block/xen-blkfront.c

  2672  
  2673  static int blkfront_freeze(struct xenbus_device *dev)
  2674  {
  2675  unsigned int i;
  2676  struct blkfront_info *info = dev_get_drvdata(&dev->dev);
  2677  struct blkfront_ring_info *rinfo;
  2678  /* This would be reasonable timeout as used in 
xenbus_dev_shutdown() */
  2679  unsigned int timeout = 5 * HZ;
  2680  unsigned long flags;
  2681  int err = 0;
  2682  
  2683  info->connected = BLKIF_STATE_FREEZING;
  2684  
  2685  blk_mq_freeze_queue(info->rq);
  2686  blk_mq_quiesce_queue(info->rq);
  2687  
  2688  for_each_rinfo(info, rinfo, i) {
  2689  /* No more gnttab callback work. */
  2690  gnttab_cancel_free_callback(&rinfo->callback);
  2691  /* Flush gnttab callback work. Must be done with no locks 
held. */
  2692  flush_work(&rinfo->work);
  2693  }
  2694  
  2695  for_each_rinfo(info, rinfo, i) {
  2696  spin_lock_irqsave(&rinfo->ring_lock, flags);
  2697  if (RING_FULL(&rinfo->ring)
  2698  || RING_HAS_UNCONSUMED_RESPONSES(&rinfo->ring)) {
> 2699  xenbus_dev_error(dev, err, "Hibernation Failed.
> 2700  The ring is still busy");
  2701  info->connected = BLKIF_STATE_CONNECTED;
  2702  spin_unlock_irqrestore(&rinfo->ring_lock, flags);
  2703  return -EBUSY;
> 2704  }
  2705  spin_unlock_irqrestore(&rinfo->ring_lock, flags);
  2706  }
  2707  /* Kick the backend to disconnect */
  2708  xenbus_switch_state(dev, XenbusStateClosing);
  2709  
  2710  /*
  2711   * We don't want to move forward before the frontend is 
diconnected
  2712   * from the backend cleanly.
  2713   */
  2714  timeout = 
wait_for_completion_timeout(&info->wait_backend_disconnected,
  2715timeout);
  2716  if (!timeout) {
  2717  err = -EBUSY;
  2718  xenbus_dev_error(dev, err, "Freezing timed out;"
  2719   "the device may become inconsistent 
state");
  2720  }
  2721  
  2722  return err;
  2723  }
  2724  
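
The errors reported above come from a string literal that runs across a
source line break, which C does not accept. As a minimal standalone sketch of
one possible way to write such a message (busy_message() is a hypothetical
helper used only for illustration, it is not part of the patch), adjacent
string literals can be used, since the compiler concatenates them:

```c
#include <string.h>

/*
 * A string literal has to be closed on the line where it starts.
 * Adjacent literals are concatenated at compile time, so a long
 * message can still be split across source lines.
 */
static const char *busy_message(void)
{
	return "Hibernation Failed. "
	       "The ring is still busy";
}
```

Keeping the whole message in a single literal on one line would work equally
well and keeps the text easy to grep for.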

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()

2020-05-19 Thread Emmanuel Grumbach
Hi all,



Since I have been involved quite a bit in the firmware debugging
features in iwlwifi, I think I can give a few insights here.

But before this, we need to understand that there are two distinct sources of issues:
1) The firmware may crash while the bus is still alive; you can still
use the bus to get the crash data.
2) The bus is dead. When that happens, the firmware might even be in
good condition, but since the bus is dead, you stop getting any
information about the firmware and, at some point, you conclude that
the firmware is dead. You can't get the crash data that resides on the
other side of the bus (you may have gathered data in the DRAM directly,
but that's a different thing), and you don't have much recovery to do
besides restarting the PCI enumeration.

At Intel, we have unfortunately seen both. The bus issues are obviously
the trickier ones: trickier to detect (because you just get garbage
from any request you issue on the bus) and trickier to handle. One can
argue that the kernel should *not* handle those and should leave this
in userspace's hands. I guess it all depends on what component you ship
to your customer and what your customer asks from you :).
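
To make the two cases a bit more concrete, here is a minimal sketch. It is
not iwlwifi code: classify_fault() and the FAULT_* names are hypothetical,
and it assumes the common PCIe behaviour that reads from a device that has
dropped off the bus come back as all ones. A real driver would need more
than one read to be sure, since a register may legitimately hold that value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of the two failure sources described above. */
enum fault_kind {
	FAULT_NONE,      /* device answers and reports no error */
	FAULT_FW_CRASH,  /* case 1: firmware crashed, bus still usable */
	FAULT_BUS_DEAD,  /* case 2: bus is dead, nothing can be read */
};

/*
 * reg is the value read back from some device register; fw_error stands
 * for whatever indication the driver has that the firmware asserted.
 * Assumption: a dead link makes every 32-bit read return 0xffffffff.
 */
static enum fault_kind classify_fault(uint32_t reg, bool fw_error)
{
	if (reg == UINT32_MAX)
		return FAULT_BUS_DEAD;  /* crash data behind the bus is unreachable */
	if (fw_error)
		return FAULT_FW_CRASH;  /* bus works, crash data can be collected */
	return FAULT_NONE;
}
```

The dead-bus check comes first on purpose: once the bus is gone, any firmware
status read through it cannot be trusted.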



>
> Hi Luis,
>
> On Tue, May 19, 2020 at 7:02 AM Luis Chamberlain  wrote:
> > On Mon, May 18, 2020 at 06:23:33PM -0700, Brian Norris wrote:
> > > On Sat, May 16, 2020 at 6:51 AM Johannes Berg  
> > > wrote:
> > > > In addition, look what we have in iwl_trans_pcie_removal_wk(). If we
> > > > detect that the device is really wedged enough that the only way we can
> > > > still try to recover is by completely unbinding the driver from it, then
> > > > we give userspace a uevent for that. I don't remember exactly how and
> > > > where that gets used (ChromeOS) though, but it'd be nice to have that
> > > > sort of thing as part of the infrastructure, in a sort of two-level
> > > > notification?
> > >
> > > 
> > > We use this on certain devices where we know the underlying hardware
> > > has design issues that may lead to device failure
> >
> > Ah, after reading below I see you meant for iwlwifi.
>
> Sorry, I was replying to Johannes, who I believe had his "we"="Intel"
> hat (as iwlwifi maintainer) on, and was pointing at
> iwl_trans_pcie_removal_wk().
>

This pcie_removal work is for the dead-bus case, my 2) above.

> > If userspace can indeed grow to support this, that would be fantastic.
>
> Well, Chrome OS tailors its user space a bit more to the hardware (and
> kernel/drivers in use) than the average distro might. We already do
> this (for some values of "this") today. Is that "fantastic" to you? :D

I guess it can be fantastic if other vendors also suffer from this. Or
maybe that could be done as part of the PCI bus driver inside the
kernel?

>
> > > -- then when we see
> > > this sort of unrecoverable "firmware-death", we remove the
> > > device[*]+driver, force-reset the PCI device (SBR), and try to
> > > reload/reattach the driver. This all happens by way of a udev rule.
> >
> > So you've sprinkled your own udev event here as part of your kernel delta?
>
> No kernel delta -- the event is there already:
> iwl_trans_pcie_removal_wk()
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/wireless/intel/iwlwifi/pcie/trans.c?h=v5.6#n2027
>
> And you can see our udev rules and scripts, in all their ugly details
> here, if you really care:
> https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/master/net-wireless/iwlwifi_rescan/files/
>
> > > We
> > > also log this sort of stuff (and metrics around it) for bug reports
> > > and health statistics, since we really hope to not see this happen
> > > often.
> >
> > Assuming perfection is ideal but silly. So, what infrastructure do you
> > use for this sort of issue?
>
> We don't yet log firmware crashes generally, but for all our current
> crash reports (including WARN()), they go through this:
> https://chromium.googlesource.com/chromiumos/platform2/+/master/crash-reporter/README.md
>
> For example, look for "cut here" in:
> https://chromium.googlesource.com/chromiumos/platform2/+/master/crash-reporter/anomaly_detector.cc
>
> For other specific metrics (like counting "EVENT=INACCESSIBLE"), we
> use the Chrome UMA system:
> https://chromium.googlesource.com/chromiumos/platform2/+/master/metrics/README.md
>
> I don't imagine the "infrastructure" side of any of that would be
> useful to you, but maybe the client-side gathering can at least show
> you what we do.
>
> > > [*] "We" (user space) don't actually do this...it happens via the
> > > 'remove_when_gone' module parameter abomination found in iwlwifi.
> >
> > BTW is this likely a place on iwlwifi where the firmware likely crashed?
>
> iwl_trans_pcie_removal_wk() is triggered because HW accesses timed out
> in a way that is likely due to a dead PCIe endpoint. It's not directly
> a firmware crash, although there may be firmware crashes reported
> around the same time.

iwl

Re: [PATCH] perf evsel: Get group fd from CPU0 for system wide event

2020-05-19 Thread Jin, Yao

Hi Jiri,

On 5/18/2020 11:28 AM, Jin, Yao wrote:

Hi Jiri,

On 5/15/2020 4:33 PM, Jiri Olsa wrote:

On Fri, May 15, 2020 at 02:04:57PM +0800, Jin, Yao wrote:

SNIP


I think I have found the root cause. It looks like a serious bug in get_group_fd:
an access violation!

For a group that mixes system-wide and per-core events, where the group
leader is a system-wide event, an access violation will happen.

perf_evsel__alloc_fd allocates one FD member for a system-wide event (only
FD(evsel, 0, 0) is valid).

But for a per-core event, perf_evsel__alloc_fd allocates N FD members (N =
ncpus). For example, with ncpus = 8, FD(evsel, 0, 0) to FD(evsel, 7, 0) are
valid.

get_group_fd(struct evsel *evsel, int cpu, int thread)
{
 struct evsel *leader = evsel->leader;

 fd = FD(leader, cpu, thread);    /* access violation may happen here */
}

If leader is system-wide event, only the FD(leader, 0, 0) is valid.

When get_group_fd accesses FD(leader, 1, 0), access violation happens.

My fix is:

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 28683b0eb738..db05b8a1e1a8 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1440,6 +1440,9 @@ static int get_group_fd(struct evsel *evsel, int cpu, int 
thread)
 if (evsel__is_group_leader(evsel))
 return -1;

+   if (leader->core.system_wide && !evsel->core.system_wide)
+   return -2;


so this effectively stops grouping system_wide events with others,
and I think it's correct, how about events that differ in cpumask?



My understanding for events that differ in cpumask is that if the leader's cpumask does not fully
match the evsel's cpumask, then we stop the grouping. Is this understanding correct?


I have run some tests and reached some conclusions:

1. If the group mixes core and uncore events, the system_wide check
can distinguish them.

2. If the group mixes core and uncore events and "-a" is specified, system_wide for the core
event is still false, so the system_wide check can distinguish them too.


3. In my tests, the issue only occurs when we collect a metric that mixes uncore and
core events, so checking system_wide is probably OK.



should we perhaps ensure this before we call open? go through all
groups and check they are on the same cpus?



The issue does not happen most of the time (only for a metric consisting of uncore and core events),
so falling back to ungrouped events when the open call fails looks reasonable.


Thanks
Jin Yao


thanks,
jirka



+
 /*
  * Leader must be already processed/open,
  * if not it's a bug.
@@ -1665,6 +1668,11 @@ static int evsel__open_cpu(struct evsel *evsel, struct 
perf_cpu_map *cpus,
 pid = perf_thread_map__pid(threads, thread);

 group_fd = get_group_fd(evsel, cpu, thread);
+   if (group_fd == -2) {
+   errno = EINVAL;
+   err = -EINVAL;
+   goto out_close;
+   }
  retry_open:
 test_attr__ready();

This triggers perf_evlist__reset_weak_group, and in the second_pass (in
__run_perf_stat) the events are opened successfully.

I have tested this fix on cascadelakex and it works.
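
To illustrate the size mismatch and the check, here is a minimal standalone
sketch. It is a toy model, not the perf code: struct toy_evsel and
toy_group_fd() are hypothetical names, the thread dimension is left out, and
the extra bounds check is my own addition for the model.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an evsel; not the real perf structure. */
struct toy_evsel {
	bool system_wide;               /* true: only fd[0] was allocated */
	int ncpus;                      /* number of valid entries in fd[] */
	const int *fd;                  /* one fd per CPU */
	const struct toy_evsel *leader; /* group leader, may be the evsel itself */
};

/*
 * Returns the leader's fd for this cpu, -1 if evsel is itself the leader,
 * or -2 if the leader has no fd slot for this cpu (the mixed-group case).
 */
static int toy_group_fd(const struct toy_evsel *evsel, int cpu)
{
	const struct toy_evsel *leader = evsel->leader;

	if (leader == evsel)
		return -1;
	if (leader->system_wide && !evsel->system_wide)
		return -2;      /* refuse instead of reading past fd[0] */
	if (cpu < 0 || cpu >= leader->ncpus)
		return -2;      /* generic bounds check, model only */
	return leader->fd[cpu];
}
```

With a system-wide leader that has a single fd and a per-core member on 8
CPUs, asking for cpu 1 now yields -2 rather than an out-of-bounds read, and
the caller can turn that into EINVAL.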

Thanks
Jin Yao





Is this fix OK?

Another thing: do you think we need to rename "evsel->core.system_wide" to
"evsel->core.has_cpumask"?


The name "system_wide" may be misleading.

evsel->core.system_wide = pmu ? pmu->is_uncore : false;

"pmu->is_uncore" is true if the PMU has a "cpumask". But it's not just uncore PMUs that have a cpumask.
Some other PMUs, e.g. cstate_pkg, also have one. So for this case, "has_cpumask" would be a better name.


But I'm not sure whether the change is OK for other cases, e.g. PT, which also uses
"evsel->core.system_wide".


Thanks
Jin Yao


Re: [PATCH v2] drm/exynos: Remove dev_err() on platform_get_irq() failure

2020-05-19 Thread Inki Dae
Hi Tamseel,

The same patch [1] has already been merged, so could you re-post this patch after rebasing
it on top of the exynos-drm-next branch?
After the rebase, only the g2d part would still be valid.

Thanks,
Inki Dae

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=exynos-drm-next&id=fdd79b0db1899f915f489e744a06846284fa3f1e

On 2020-05-19 at 7:49 PM, Tamseel Shams wrote:
> platform_get_irq() will call dev_err() itself on failure,
> so there is no need for the driver to also do this.
> This is detected by coccinelle.
> 
> Also removing unnecessary curly braces around if () statement.
> 
> Signed-off-by: Tamseel Shams 
> ---
> Fixed review comment by j...@perches.com
> 
>  drivers/gpu/drm/exynos/exynos_drm_dsi.c | 4 +---
>  drivers/gpu/drm/exynos/exynos_drm_g2d.c | 1 -
>  drivers/gpu/drm/exynos/exynos_drm_rotator.c | 4 +---
>  drivers/gpu/drm/exynos/exynos_drm_scaler.c  | 4 +---
>  4 files changed, 3 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c 
> b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
> index 902938d2568f..958e2c6a6702 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
> @@ -1809,10 +1809,8 @@ static int exynos_dsi_probe(struct platform_device 
> *pdev)
>   }
>  
>   dsi->irq = platform_get_irq(pdev, 0);
> - if (dsi->irq < 0) {
> - dev_err(dev, "failed to request dsi irq resource\n");
> + if (dsi->irq < 0)
>   return dsi->irq;
> - }
>  
>   irq_set_status_flags(dsi->irq, IRQ_NOAUTOEN);
>   ret = devm_request_threaded_irq(dev, dsi->irq, NULL,
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c 
> b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> index fcee33a43aca..03be31427181 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> @@ -1498,7 +1498,6 @@ static int g2d_probe(struct platform_device *pdev)
>  
>   g2d->irq = platform_get_irq(pdev, 0);
>   if (g2d->irq < 0) {
> - dev_err(dev, "failed to get irq\n");
>   ret = g2d->irq;
>   goto err_put_clk;
>   }
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_rotator.c 
> b/drivers/gpu/drm/exynos/exynos_drm_rotator.c
> index dafa87b82052..2d94afba031e 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_rotator.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_rotator.c
> @@ -293,10 +293,8 @@ static int rotator_probe(struct platform_device *pdev)
>   return PTR_ERR(rot->regs);
>  
>   irq = platform_get_irq(pdev, 0);
> - if (irq < 0) {
> - dev_err(dev, "failed to get irq\n");
> + if (irq < 0)
>   return irq;
> - }
>  
>   ret = devm_request_irq(dev, irq, rotator_irq_handler, 0, dev_name(dev),
>  rot);
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_scaler.c 
> b/drivers/gpu/drm/exynos/exynos_drm_scaler.c
> index 93c43c8d914e..ce1857138f89 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_scaler.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_scaler.c
> @@ -502,10 +502,8 @@ static int scaler_probe(struct platform_device *pdev)
>   return PTR_ERR(scaler->regs);
>  
>   irq = platform_get_irq(pdev, 0);
> - if (irq < 0) {
> - dev_err(dev, "failed to get irq\n");
> + if (irq < 0)
>   return irq;
> - }
>  
>   ret = devm_request_threaded_irq(dev, irq, NULL, scaler_irq_handler,
>   IRQF_ONESHOT, "drm_scaler", scaler);
> 


Re: [RFC PATCH 0/8] Qualcomm Cloud AI 100 driver

2020-05-19 Thread Greg Kroah-Hartman
On Tue, May 19, 2020 at 12:26:01PM -0600, Jeffrey Hugo wrote:
> On 5/19/2020 12:12 PM, Greg Kroah-Hartman wrote:
> > > > Especially given the copyright owner of this code, that would be just
> > > > crazy and foolish to not have open userspace code as well.  Firmware
> > > > would also be wonderful as well, go poke your lawyers about derivative
> > > > work issues and the like for fun conversations :)
> > > 
> > > Those are the kind of conversations I try to avoid  :)
> > 
> > Sounds like you are going to now have to have them, have fun!
> 
> Honestly, I fail to see where you think there is a derivative work, so, I'm
> not really sure what discussions I need to revisit with our lawyers.

Given that we are not lawyers, why don't we leave those types of
discussions up to the lawyers, and not depend on people like me and you
for that?  :)

If your lawyers think that the code division is fine as-is, that's
great, I'd be glad to review it if they add their signed-off-by: on it
verifying that the api divide is approved by them.

thanks!

greg k-h


[PATCH v6 11/12] mmap locking API: convert mmap_sem API comments

2020-05-19 Thread Michel Lespinasse
Convert comments that reference old mmap_sem APIs to reference
corresponding new mmap locking APIs instead.

Signed-off-by: Michel Lespinasse 
---
 Documentation/vm/hmm.rst   |  6 +++---
 arch/alpha/mm/fault.c  |  2 +-
 arch/ia64/mm/fault.c   |  2 +-
 arch/m68k/mm/fault.c   |  2 +-
 arch/microblaze/mm/fault.c |  2 +-
 arch/mips/mm/fault.c   |  2 +-
 arch/nds32/mm/fault.c  |  2 +-
 arch/nios2/mm/fault.c  |  2 +-
 arch/openrisc/mm/fault.c   |  2 +-
 arch/parisc/mm/fault.c |  2 +-
 arch/riscv/mm/fault.c  |  2 +-
 arch/sh/mm/fault.c |  2 +-
 arch/sparc/mm/fault_32.c   |  2 +-
 arch/sparc/mm/fault_64.c   |  2 +-
 arch/xtensa/mm/fault.c |  2 +-
 drivers/android/binder_alloc.c |  4 ++--
 fs/hugetlbfs/inode.c   |  2 +-
 fs/userfaultfd.c   |  2 +-
 mm/filemap.c   |  2 +-
 mm/gup.c   | 12 ++--
 mm/huge_memory.c   |  4 ++--
 mm/khugepaged.c|  2 +-
 mm/ksm.c   |  2 +-
 mm/memory.c|  4 ++--
 mm/mempolicy.c |  2 +-
 mm/migrate.c   |  4 ++--
 mm/mmap.c  |  2 +-
 mm/oom_kill.c  |  8 
 net/ipv4/tcp.c |  2 +-
 29 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
index 4e3e9362afeb..046817505033 100644
--- a/Documentation/vm/hmm.rst
+++ b/Documentation/vm/hmm.rst
@@ -194,15 +194,15 @@ The usage pattern is::
 
  again:
   range.notifier_seq = mmu_interval_read_begin(&interval_sub);
-  down_read(&mm->mmap_sem);
+  mmap_read_lock(mm);
   ret = hmm_range_fault(&range);
   if (ret) {
-  up_read(&mm->mmap_sem);
+  mmap_read_unlock(mm);
   if (ret == -EBUSY)
  goto again;
   return ret;
   }
-  up_read(&mm->mmap_sem);
+  mmap_read_unlock(mm);
 
   take_lock(driver->update);
   if (mmu_interval_read_retry(&ni, range.notifier_seq) {
diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index 36efa778ee1a..c2303a8c2b9f 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -171,7 +171,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
 
-/* No need to up_read(&mm->mmap_sem) as we would
+/* No need to mmap_read_unlock(mm) as we would
 * have already released it in __lock_page_or_retry
 * in mm/filemap.c.
 */
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index 9b95050c2048..0f788992608a 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -169,7 +169,7 @@ ia64_do_page_fault (unsigned long address, unsigned long 
isr, struct pt_regs *re
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
 
-/* No need to up_read(&mm->mmap_sem) as we would
+/* No need to mmap_read_unlock(mm) as we would
 * have already released it in __lock_page_or_retry
 * in mm/filemap.c.
 */
diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
index 650acab0d77d..a94a814ad6ad 100644
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -165,7 +165,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long 
address,
flags |= FAULT_FLAG_TRIED;
 
/*
-* No need to up_read(&mm->mmap_sem) as we would
+* No need to mmap_read_unlock(mm) as we would
 * have already released it in __lock_page_or_retry
 * in mm/filemap.c.
 */
diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c
index 9d7c423dea1d..ebf1ac50b291 100644
--- a/arch/microblaze/mm/fault.c
+++ b/arch/microblaze/mm/fault.c
@@ -239,7 +239,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long 
address,
flags |= FAULT_FLAG_TRIED;
 
/*
-* No need to up_read(&mm->mmap_sem) as we would
+* No need to mmap_read_unlock(mm) as we would
 * have already released it in __lock_page_or_retry
 * in mm/filemap.c.
 */
diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index 9ef2dd39111e..01b168a90434 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -181,7 +181,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, 
unsigned long write,
flags |= FAULT_FLAG_TRIED;
 
/*
-* No need to up_

[PATCH v6 09/12] mmap locking API: add mmap_assert_locked() and mmap_assert_write_locked()

2020-05-19 Thread Michel Lespinasse
Add new APIs to assert that mmap_sem is held.

Using this instead of rwsem_is_locked and lockdep_assert_held[_write]
makes the assertions more tolerant of future changes to the lock type.
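
As a rough illustration of why wrapper assertions help, here is a minimal
single-threaded userspace sketch. It is not kernel code: every toy_* name is
hypothetical, and the "lock" is only a pair of counters so that the
assertions have something to check. Callers only ever use the wrappers, so
the lock representation could change without touching them.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock state; a real implementation would be a rwsem or similar. */
struct toy_lock {
	int readers;
	bool writer;
};

struct toy_mm {
	struct toy_lock mmap_lock;
};

static void toy_mmap_read_lock(struct toy_mm *mm)
{
	assert(!mm->mmap_lock.writer);
	mm->mmap_lock.readers++;
}

static void toy_mmap_read_unlock(struct toy_mm *mm)
{
	assert(mm->mmap_lock.readers > 0);
	mm->mmap_lock.readers--;
}

static void toy_mmap_write_lock(struct toy_mm *mm)
{
	assert(!mm->mmap_lock.writer && mm->mmap_lock.readers == 0);
	mm->mmap_lock.writer = true;
}

static void toy_mmap_write_unlock(struct toy_mm *mm)
{
	assert(mm->mmap_lock.writer);
	mm->mmap_lock.writer = false;
}

/* True if the lock is held in either read or write mode. */
static bool toy_mmap_is_locked(const struct toy_mm *mm)
{
	return mm->mmap_lock.writer || mm->mmap_lock.readers > 0;
}

/* Callers assert through these helpers and never name the lock type. */
static void toy_mmap_assert_locked(const struct toy_mm *mm)
{
	assert(toy_mmap_is_locked(mm));
}

static void toy_mmap_assert_write_locked(const struct toy_mm *mm)
{
	assert(mm->mmap_lock.writer);
}
```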

Signed-off-by: Michel Lespinasse 
---
 arch/x86/events/core.c|  2 +-
 fs/userfaultfd.c  |  6 +++---
 include/linux/mmap_lock.h | 14 ++
 mm/gup.c  |  2 +-
 mm/hmm.c  |  2 +-
 mm/memory.c   |  2 +-
 mm/mmu_notifier.c |  6 +++---
 mm/pagewalk.c |  6 +++---
 mm/util.c |  2 +-
 9 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index a619763e96e1..66559ac4f89e 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2182,7 +2182,7 @@ static void x86_pmu_event_mapped(struct perf_event 
*event, struct mm_struct *mm)
 * For now, this can't happen because all callers hold mmap_sem
 * for write.  If this changes, we'll need a different solution.
 */
-   lockdep_assert_held_write(&mm->mmap_sem);
+   mmap_assert_write_locked(mm);
 
if (atomic_inc_return(&mm->context.perf_rdpmc_allowed) == 1)
on_each_cpu_mask(mm_cpumask(mm), refresh_pce, NULL, 1);
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 9c645eee1a59..12b492409040 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -234,7 +234,7 @@ static inline bool userfaultfd_huge_must_wait(struct 
userfaultfd_ctx *ctx,
pte_t *ptep, pte;
bool ret = true;
 
-   VM_BUG_ON(!rwsem_is_locked(&mm->mmap_sem));
+   mmap_assert_locked(mm);
 
ptep = huge_pte_offset(mm, address, vma_mmu_pagesize(vma));
 
@@ -286,7 +286,7 @@ static inline bool userfaultfd_must_wait(struct 
userfaultfd_ctx *ctx,
pte_t *pte;
bool ret = true;
 
-   VM_BUG_ON(!rwsem_is_locked(&mm->mmap_sem));
+   mmap_assert_locked(mm);
 
pgd = pgd_offset(mm, address);
if (!pgd_present(*pgd))
@@ -405,7 +405,7 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned 
long reason)
 * Coredumping runs without mmap_sem so we can only check that
 * the mmap_sem is held, if PF_DUMPCORE was not set.
 */
-   WARN_ON_ONCE(!rwsem_is_locked(&mm->mmap_sem));
+   mmap_assert_locked(mm);
 
ctx = vmf->vma->vm_userfaultfd_ctx.ctx;
if (!ctx)
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index acac1bf5ecd2..43ef914e6468 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -1,6 +1,8 @@
 #ifndef _LINUX_MMAP_LOCK_H
 #define _LINUX_MMAP_LOCK_H
 
+#include 
+
 #define MMAP_LOCK_INITIALIZER(name) \
.mmap_sem = __RWSEM_INITIALIZER((name).mmap_sem),
 
@@ -73,4 +75,16 @@ static inline void mmap_read_unlock_non_owner(struct 
mm_struct *mm)
up_read_non_owner(&mm->mmap_sem);
 }
 
+static inline void mmap_assert_locked(struct mm_struct *mm)
+{
+   lockdep_assert_held(&mm->mmap_sem);
+   VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_sem), mm);
+}
+
+static inline void mmap_assert_write_locked(struct mm_struct *mm)
+{
+   lockdep_assert_held_write(&mm->mmap_sem);
+   VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_sem), mm);
+}
+
 #endif /* _LINUX_MMAP_LOCK_H */
diff --git a/mm/gup.c b/mm/gup.c
index 631285295950..c1c0b37d0e8f 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1405,7 +1405,7 @@ long populate_vma_page_range(struct vm_area_struct *vma,
VM_BUG_ON(end   & ~PAGE_MASK);
VM_BUG_ON_VMA(start < vma->vm_start, vma);
VM_BUG_ON_VMA(end   > vma->vm_end, vma);
-   VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_sem), mm);
+   mmap_assert_locked(mm);
 
gup_flags = FOLL_TOUCH | FOLL_POPULATE | FOLL_MLOCK;
if (vma->vm_flags & VM_LOCKONFAULT)
diff --git a/mm/hmm.c b/mm/hmm.c
index 280585833adf..660a4bcf932a 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -581,7 +581,7 @@ long hmm_range_fault(struct hmm_range *range)
struct mm_struct *mm = range->notifier->mm;
int ret;
 
-   lockdep_assert_held(&mm->mmap_sem);
+   mmap_assert_locked(mm);
 
do {
/* If range is no longer valid force retry. */
diff --git a/mm/memory.c b/mm/memory.c
index e6dd3309c5a3..20f98ea8968e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1214,7 +1214,7 @@ static inline unsigned long zap_pud_range(struct 
mmu_gather *tlb,
next = pud_addr_end(addr, end);
if (pud_trans_huge(*pud) || pud_devmap(*pud)) {
if (next - addr != HPAGE_PUD_SIZE) {
-   VM_BUG_ON_VMA(!rwsem_is_locked(&tlb->mm->mmap_sem), vma);
+   mmap_assert_locked(tlb->mm);
split_huge_pud(vma, pud, addr);
} else if (zap_huge_pud(tlb, vma, pud, addr))
goto next;
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index cfd0a03bf5cc..24eb9d1ed0a7 100644
--- a/mm/mm

[PATCH v6 02/12] MMU notifier: use the new mmap locking API

2020-05-19 Thread Michel Lespinasse
This use is converted manually ahead of the next patch in the series,
as it requires including a new header which the automated conversion
would miss.

Signed-off-by: Michel Lespinasse 
Reviewed-by: Daniel Jordan 
Reviewed-by: Davidlohr Bueso 
Reviewed-by: Laurent Dufour 
Reviewed-by: Vlastimil Babka 
---
 include/linux/mmu_notifier.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 736f6918335e..2f462710a1a4 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -277,9 +278,9 @@ mmu_notifier_get(const struct mmu_notifier_ops *ops, struct 
mm_struct *mm)
 {
struct mmu_notifier *ret;
 
-   down_write(&mm->mmap_sem);
+   mmap_write_lock(mm);
ret = mmu_notifier_get_locked(ops, mm);
-   up_write(&mm->mmap_sem);
+   mmap_write_unlock(mm);
return ret;
 }
 void mmu_notifier_put(struct mmu_notifier *subscription);
-- 
2.26.2.761.g0e0b3e54be-goog



[PATCH v6 10/12] mmap locking API: rename mmap_sem to mmap_lock

2020-05-19 Thread Michel Lespinasse
Rename the mmap_sem field to mmap_lock. Any new uses of this lock
should now go through the new mmap locking api. The mmap_lock is
still implemented as a rwsem, though this could change in the future.

Signed-off-by: Michel Lespinasse 
Reviewed-by: Vlastimil Babka 
---
 arch/ia64/mm/fault.c  |  4 +--
 arch/x86/mm/fault.c   |  2 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem.c |  2 +-
 include/linux/mm_types.h  |  2 +-
 include/linux/mmap_lock.h | 38 +--
 mm/memory.c   |  2 +-
 mm/mmap.c |  4 +--
 mm/mmu_notifier.c |  2 +-
 8 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index 693f00b117e1..9b95050c2048 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -70,8 +70,8 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, 
struct pt_regs *re
mask = ((((isr >> IA64_ISR_X_BIT) & 1UL) << VM_EXEC_BIT)
| (((isr >> IA64_ISR_W_BIT) & 1UL) << VM_WRITE_BIT));
 
-   /* mmap_sem is performance critical */
-   prefetchw(&mm->mmap_sem);
+   /* mmap_lock is performance critical */
+   prefetchw(&mm->mmap_lock);
 
/*
 * If we're in an interrupt or have no user context, we must not take 
the fault..
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 181f66b9049f..35f530f9dfc0 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1522,7 +1522,7 @@ dotraplinkage void
 do_page_fault(struct pt_regs *regs, unsigned long hw_error_code,
unsigned long address)
 {
-   prefetchw(&current->mm->mmap_sem);
+   prefetchw(&current->mm->mmap_lock);
trace_page_fault_entries(regs, hw_error_code, address);
 
if (unlikely(kmmio_fault(regs, address)))
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index dc9ef302f517..701f3995f621 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -661,7 +661,7 @@ static int etnaviv_gem_userptr_get_pages(struct 
etnaviv_gem_object *etnaviv_obj)
struct etnaviv_gem_userptr *userptr = &etnaviv_obj->userptr;
int ret, pinned = 0, npages = etnaviv_obj->base.size >> PAGE_SHIFT;
 
-   might_lock_read(&current->mm->mmap_sem);
+   might_lock_read(&current->mm->mmap_lock);
 
if (userptr->mm != current->mm)
return -EPERM;
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 4aba6c0c2ba8..d13b90399c16 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -436,7 +436,7 @@ struct mm_struct {
spinlock_t page_table_lock; /* Protects page tables and some
 * counters
 */
-   struct rw_semaphore mmap_sem;
+   struct rw_semaphore mmap_lock;
 
struct list_head mmlist; /* List of maybe swapped mm's. These
  * are globally strung together off
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 43ef914e6468..b5bd86778cca 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -4,67 +4,67 @@
 #include 
 
 #define MMAP_LOCK_INITIALIZER(name) \
-   .mmap_sem = __RWSEM_INITIALIZER((name).mmap_sem),
+   .mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock),
 
 static inline void mmap_init_lock(struct mm_struct *mm)
 {
-   init_rwsem(&mm->mmap_sem);
+   init_rwsem(&mm->mmap_lock);
 }
 
 static inline void mmap_write_lock(struct mm_struct *mm)
 {
-   down_write(&mm->mmap_sem);
+   down_write(&mm->mmap_lock);
 }
 
 static inline void mmap_write_lock_nested(struct mm_struct *mm, int subclass)
 {
-   down_write_nested(&mm->mmap_sem, subclass);
+   down_write_nested(&mm->mmap_lock, subclass);
 }
 
 static inline int mmap_write_lock_killable(struct mm_struct *mm)
 {
-   return down_write_killable(&mm->mmap_sem);
+   return down_write_killable(&mm->mmap_lock);
 }
 
 static inline bool mmap_write_trylock(struct mm_struct *mm)
 {
-   return down_write_trylock(&mm->mmap_sem) != 0;
+   return down_write_trylock(&mm->mmap_lock) != 0;
 }
 
 static inline void mmap_write_unlock(struct mm_struct *mm)
 {
-   up_write(&mm->mmap_sem);
+   up_write(&mm->mmap_lock);
 }
 
 static inline void mmap_write_downgrade(struct mm_struct *mm)
 {
-   downgrade_write(&mm->mmap_sem);
+   downgrade_write(&mm->mmap_lock);
 }
 
 static inline void mmap_read_lock(struct mm_struct *mm)
 {
-   down_read(&mm->mmap_sem);
+   down_read(&mm->mmap_lock);
 }
 
 static inline int mmap_read_lock_killable(struct mm_struct *mm)
 {
-   return down_read_killable(&mm->mmap_sem);
+   return down_read_killable(&mm->mmap_lock);
 }
 
 static inline bool mmap_read_trylock(struct mm_struct *mm)
 {
-   return 

[PATCH v6 08/12] mmap locking API: add MMAP_LOCK_INITIALIZER

2020-05-19 Thread Michel Lespinasse
Define a new initializer for the mmap locking api.
Initially this just evaluates to __RWSEM_INITIALIZER as the API
is defined as wrappers around rwsem.
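
The macro shape can be tried outside the kernel. The following is a
minimal userspace sketch, not the kernel code: toy_lock, toy_mm,
TOY_LOCK_INIT and TOY_MMAP_LOCK_INITIALIZER are hypothetical stand-ins
for rw_semaphore, mm_struct, __RWSEM_INITIALIZER and
MMAP_LOCK_INITIALIZER. It shows a macro that expands to a complete
designated-initializer entry, trailing comma included, so the call
site needs no comma of its own.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct rw_semaphore. */
struct toy_lock {
    int count;
    const char *name;   /* records the expression used to name the lock */
};

/* Plays the role of __RWSEM_INITIALIZER(): stringifies its argument. */
#define TOY_LOCK_INIT(lockname) { .count = 0, .name = #lockname }

/* Hypothetical stand-in for struct mm_struct. */
struct toy_mm {
    int users;
    struct toy_lock mmap_sem;
    int flags;
};

/*
 * Expands to one full initializer entry, comma included. The argument
 * is parenthesized as (name) so that any expression naming the struct
 * can be passed safely.
 */
#define TOY_MMAP_LOCK_INITIALIZER(name) \
    .mmap_sem = TOY_LOCK_INIT((name).mmap_sem),

static struct toy_mm toy_init_mm = {
    .users = 2,
    TOY_MMAP_LOCK_INITIALIZER(toy_init_mm)
    .flags = 1,
};

/* Sanity check: the fields around the macro are initialized as written. */
static inline void toy_check_init_mm(void)
{
    assert(toy_init_mm.users == 2);
    assert(toy_init_mm.mmap_sem.count == 0);
    assert(strcmp(toy_init_mm.mmap_sem.name, "(toy_init_mm).mmap_sem") == 0);
    assert(toy_init_mm.flags == 1);
}
```

Because the field name is hidden inside the macro, a later rename of the
field only has to touch the macro definition, not every static
initializer.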

Signed-off-by: Michel Lespinasse 
Reviewed-by: Laurent Dufour 
Reviewed-by: Vlastimil Babka 
---
 arch/x86/kernel/tboot.c| 2 +-
 drivers/firmware/efi/efi.c | 2 +-
 include/linux/mmap_lock.h  | 3 +++
 mm/init-mm.c   | 2 +-
 4 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index b89f6ac6a0c0..885058325c20 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -90,7 +90,7 @@ static struct mm_struct tboot_mm = {
.pgd= swapper_pg_dir,
.mm_users   = ATOMIC_INIT(2),
.mm_count   = ATOMIC_INIT(1),
-   .mmap_sem   = __RWSEM_INITIALIZER(init_mm.mmap_sem),
+   MMAP_LOCK_INITIALIZER(init_mm)
.page_table_lock =  __SPIN_LOCK_UNLOCKED(init_mm.page_table_lock),
.mmlist = LIST_HEAD_INIT(init_mm.mmlist),
 };
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 911a2bd0f6b7..916313ec8acb 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -54,7 +54,7 @@ struct mm_struct efi_mm = {
.mm_rb  = RB_ROOT,
.mm_users   = ATOMIC_INIT(2),
.mm_count   = ATOMIC_INIT(1),
-   .mmap_sem   = __RWSEM_INITIALIZER(efi_mm.mmap_sem),
+   MMAP_LOCK_INITIALIZER(efi_mm)
.page_table_lock= __SPIN_LOCK_UNLOCKED(efi_mm.page_table_lock),
.mmlist = LIST_HEAD_INIT(efi_mm.mmlist),
.cpu_bitmap = { [BITS_TO_LONGS(NR_CPUS)] = 0},
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index d1826ce42f00..acac1bf5ecd2 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -1,6 +1,9 @@
 #ifndef _LINUX_MMAP_LOCK_H
 #define _LINUX_MMAP_LOCK_H
 
+#define MMAP_LOCK_INITIALIZER(name) \
+   .mmap_sem = __RWSEM_INITIALIZER((name).mmap_sem),
+
 static inline void mmap_init_lock(struct mm_struct *mm)
 {
init_rwsem(&mm->mmap_sem);
diff --git a/mm/init-mm.c b/mm/init-mm.c
index 19603302a77f..fe9c03d8e07b 100644
--- a/mm/init-mm.c
+++ b/mm/init-mm.c
@@ -31,7 +31,7 @@ struct mm_struct init_mm = {
.pgd= swapper_pg_dir,
.mm_users   = ATOMIC_INIT(2),
.mm_count   = ATOMIC_INIT(1),
-   .mmap_sem   = __RWSEM_INITIALIZER(init_mm.mmap_sem),
+   MMAP_LOCK_INITIALIZER(init_mm)
.page_table_lock =  __SPIN_LOCK_UNLOCKED(init_mm.page_table_lock),
.arg_lock   =  __SPIN_LOCK_UNLOCKED(init_mm.arg_lock),
.mmlist = LIST_HEAD_INIT(init_mm.mmlist),
-- 
2.26.2.761.g0e0b3e54be-goog



[PATCH v6 07/12] mmap locking API: add mmap_read_trylock_non_owner()

2020-05-19 Thread Michel Lespinasse
Add a couple APIs used by kernel/bpf/stackmap.c only:
- mmap_read_trylock_non_owner()
- mmap_read_unlock_non_owner() (may be called from a work queue).

It's still not ideal that bpf/stackmap subverts the lock ownership
in this way. Thanks to Peter Zijlstra for suggesting this API as the
least-ugly way of addressing this in the short term.
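
The hand-off pattern can be sketched in plain userspace C. This is a
simplified, single-threaded model under stated assumptions, not the
bpf/stackmap code: toy_mm, toy_work and every toy_* function are
hypothetical, and the lock is a bare counter with no lockdep
equivalent. It shows a read lock taken with a trylock and then either
released in place or handed to a deferred work item that releases it
later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for mm_struct with a toy reader/writer state. */
struct toy_mm {
    int readers;
    bool writer;
};

/* Models mmap_read_trylock_non_owner(): never blocks. */
static inline bool toy_mmap_read_trylock_non_owner(struct toy_mm *mm)
{
    if (mm->writer)
        return false;
    mm->readers++;
    /* The kernel helper also drops lockdep ownership at this point. */
    return true;
}

/* Models mmap_read_unlock_non_owner(): may run in another context. */
static inline void toy_mmap_read_unlock_non_owner(struct toy_mm *mm)
{
    assert(mm->readers > 0);
    mm->readers--;
}

/* Deferred work item: stores the mm rather than a pointer to its lock. */
struct toy_work {
    struct toy_mm *mm;
    bool pending;
};

/* Runs later, standing in for the irq_work callback. */
static inline void toy_work_run(struct toy_work *work)
{
    if (!work->pending)
        return;
    toy_mmap_read_unlock_non_owner(work->mm);
    work->pending = false;
}

/*
 * Returns true if the lookup ran under the lock, false if the caller
 * has to fall back. With work == NULL the lock is dropped right away;
 * otherwise the unlock is deferred to toy_work_run().
 */
static inline bool toy_lookup(struct toy_mm *mm, struct toy_work *work)
{
    if (!toy_mmap_read_trylock_non_owner(mm))
        return false;

    /* ... the real code would walk the VMAs here ... */

    if (work == NULL) {
        toy_mmap_read_unlock_non_owner(mm);
    } else {
        work->mm = mm;
        work->pending = true;
    }
    return true;
}
```

Storing the mm in the work item, rather than a pointer to the lock
inside it, keeps the deferred path on the wrapper API as well.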

Signed-off-by: Michel Lespinasse 
Reviewed-by: Daniel Jordan 
Reviewed-by: Vlastimil Babka 
---
 include/linux/mmap_lock.h | 14 ++
 kernel/bpf/stackmap.c | 17 +
 2 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index a757cb30ae77..d1826ce42f00 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -56,4 +56,18 @@ static inline void mmap_read_unlock(struct mm_struct *mm)
up_read(&mm->mmap_sem);
 }
 
+static inline bool mmap_read_trylock_non_owner(struct mm_struct *mm)
+{
+   if (down_read_trylock(&mm->mmap_sem)) {
+   rwsem_release(&mm->mmap_sem.dep_map, _RET_IP_);
+   return true;
+   }
+   return false;
+}
+
+static inline void mmap_read_unlock_non_owner(struct mm_struct *mm)
+{
+   up_read_non_owner(&mm->mmap_sem);
+}
+
 #endif /* _LINUX_MMAP_LOCK_H */
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 11d41f0c7005..998968659892 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -33,7 +33,7 @@ struct bpf_stack_map {
 /* irq_work to run up_read() for build_id lookup in nmi context */
 struct stack_map_irq_work {
struct irq_work irq_work;
-   struct rw_semaphore *sem;
+   struct mm_struct *mm;
 };
 
 static void do_up_read(struct irq_work *entry)
@@ -44,8 +44,7 @@ static void do_up_read(struct irq_work *entry)
return;
 
work = container_of(entry, struct stack_map_irq_work, irq_work);
-   up_read_non_owner(work->sem);
-   work->sem = NULL;
+   mmap_read_unlock_non_owner(work->mm);
 }
 
 static DEFINE_PER_CPU(struct stack_map_irq_work, up_read_work);
@@ -317,7 +316,7 @@ static void stack_map_get_build_id_offset(struct 
bpf_stack_build_id *id_offs,
 * with build_id.
 */
if (!user || !current || !current->mm || irq_work_busy ||
-   mmap_read_trylock(current->mm) == 0) {
+   !mmap_read_trylock_non_owner(current->mm)) {
/* cannot access current->mm, fall back to ips */
for (i = 0; i < trace_nr; i++) {
id_offs[i].status = BPF_STACK_BUILD_ID_IP;
@@ -342,16 +341,10 @@ static void stack_map_get_build_id_offset(struct 
bpf_stack_build_id *id_offs,
}
 
if (!work) {
-   mmap_read_unlock(current->mm);
+   mmap_read_unlock_non_owner(current->mm);
} else {
-   work->sem = &current->mm->mmap_sem;
+   work->mm = current->mm;
irq_work_queue(&work->irq_work);
-   /*
-* The irq_work will release the mmap_sem with
-* up_read_non_owner(). The rwsem_release() is called
-* here to release the lock from lockdep's perspective.
-*/
-   rwsem_release(&current->mm->mmap_sem.dep_map, _RET_IP_);
}
 }
 
-- 
2.26.2.761.g0e0b3e54be-goog



[PATCH v6 05/12] mmap locking API: convert mmap_sem call sites missed by coccinelle

2020-05-19 Thread Michel Lespinasse
Convert the last few remaining mmap_sem rwsem calls to use the new
mmap locking API. These were missed by coccinelle for some reason
(I think coccinelle does not support some of the preprocessor
constructs in these files?)

Signed-off-by: Michel Lespinasse 
Reviewed-by: Daniel Jordan 
Reviewed-by: Laurent Dufour 
Reviewed-by: Vlastimil Babka 
---
 arch/mips/mm/fault.c   | 10 +-
 arch/riscv/mm/pageattr.c   |  4 ++--
 arch/x86/kvm/mmu/paging_tmpl.h |  8 
 fs/proc/base.c |  6 +++---
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index f8d62cd83b36..9ef2dd39111e 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -97,7 +97,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, 
unsigned long write,
if (user_mode(regs))
flags |= FAULT_FLAG_USER;
 retry:
-   down_read(&mm->mmap_sem);
+   mmap_read_lock(mm);
vma = find_vma(mm, address);
if (!vma)
goto bad_area;
@@ -190,7 +190,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, 
unsigned long write,
}
}
 
-   up_read(&mm->mmap_sem);
+   mmap_read_unlock(mm);
return;
 
 /*
@@ -198,7 +198,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, 
unsigned long write,
  * Fix it, but check if it's kernel or user first..
  */
 bad_area:
-   up_read(&mm->mmap_sem);
+   mmap_read_unlock(mm);
 
 bad_area_nosemaphore:
/* User mode accesses just cause a SIGSEGV */
@@ -250,14 +250,14 @@ static void __kprobes __do_page_fault(struct pt_regs 
*regs, unsigned long write,
 * We ran out of memory, call the OOM killer, and return the userspace
 * (which will retry the fault, or kill us if we got oom-killed).
 */
-   up_read(&mm->mmap_sem);
+   mmap_read_unlock(mm);
if (!user_mode(regs))
goto no_context;
pagefault_out_of_memory();
return;
 
 do_sigbus:
-   up_read(&mm->mmap_sem);
+   mmap_read_unlock(mm);
 
/* Kernel mode? Handle exceptions or die */
if (!user_mode(regs))
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
index 728759eb530a..b9072c043222 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -117,10 +117,10 @@ static int __set_memory(unsigned long addr, int numpages, 
pgprot_t set_mask,
if (!numpages)
return 0;
 
-   down_read(&init_mm.mmap_sem);
+   mmap_read_lock(&init_mm);
ret =  walk_page_range_novma(&init_mm, start, end, &pageattr_ops, NULL,
 &masks);
-   up_read(&init_mm.mmap_sem);
+   mmap_read_unlock(&init_mm);
 
flush_tlb_kernel_range(start, end);
 
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 9bdf9b7d9a96..40e5bb67cc09 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -165,22 +165,22 @@ static int FNAME(cmpxchg_gpte)(struct kvm_vcpu *vcpu, 
struct kvm_mmu *mmu,
unsigned long pfn;
unsigned long paddr;
 
-   down_read(&current->mm->mmap_sem);
+   mmap_read_lock(current->mm);
vma = find_vma_intersection(current->mm, vaddr, vaddr + 
PAGE_SIZE);
if (!vma || !(vma->vm_flags & VM_PFNMAP)) {
-   up_read(&current->mm->mmap_sem);
+   mmap_read_unlock(current->mm);
return -EFAULT;
}
pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
paddr = pfn << PAGE_SHIFT;
table = memremap(paddr, PAGE_SIZE, MEMREMAP_WB);
if (!table) {
-   up_read(&current->mm->mmap_sem);
+   mmap_read_unlock(current->mm);
return -EFAULT;
}
ret = CMPXCHG(&table[index], orig_pte, new_pte);
memunmap(table);
-   up_read(&current->mm->mmap_sem);
+   mmap_read_unlock(current->mm);
}
 
return (ret != orig_pte);
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 9a68032d8d73..a96377557db7 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2314,7 +2314,7 @@ proc_map_files_readdir(struct file *file, struct 
dir_context *ctx)
if (!mm)
goto out_put_task;
 
-   ret = down_read_killable(&mm->mmap_sem);
+   ret = mmap_read_lock_killable(mm);
if (ret) {
mmput(mm);
goto out_put_task;
@@ -2341,7 +2341,7 @@ proc_map_files_readdir(struct file *file, struct 
dir_context *ctx)
p = genradix_ptr_alloc(&fa, nr_files++, GFP_KERNEL);
if (!p) {
ret = -ENOMEM;
-   up_read(&mm->mmap_sem);
+   mmap_read_unlock(mm);
mmput(

[PATCH v6 00/12] Add a new mmap locking API wrapping mmap_sem calls

2020-05-19 Thread Michel Lespinasse
Reposting this patch series on top of v5.7-rc6. I think this is ready
for inclusion into the -mm tree; however there were some minor points
of feedback to address, and it was easier to regenerate a full
version after v5.5 (which only updated patches 09/10 and 10/10) caused
some confusion.


This patch series adds a new mmap locking API replacing the existing
mmap_sem lock and unlock calls. Initially the API is just implemented in
terms of inlined rwsem calls, so it doesn't provide any new functionality.

There are two justifications for the new API:

- At first, it provides an easy hooking point to instrument mmap_sem
  locking latencies independently of any other rwsems.

- In the future, it may be a starting point for replacing the rwsem
  implementation with a different one, such as range locks. This is
  something that is being explored, even though there is no wide consensus
  about this possible direction yet.
  (see https://patchwork.kernel.org/cover/11401483/)
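
To make the first point concrete, here is a rough userspace sketch of
what such a hooking point could look like. Nothing in it is part of
this series: toy_mm, the toy_* functions and the pluggable clock
callback are all hypothetical, and the "lock" is a plain flag that
never blocks. The idea is only that once every caller goes through one
wrapper, timing can be added in that one place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical clock callback; a real hook would read a kernel clock. */
typedef uint64_t (*toy_clock_fn)(void);

/* Hypothetical stand-in for mm_struct plus per-mm lock statistics. */
struct toy_mm {
    bool write_locked;          /* toy lock state */
    uint64_t lock_count;        /* number of write acquisitions */
    uint64_t lock_wait_total;   /* summed acquisition time, in ticks */
    toy_clock_fn now;
};

/* Stand-in for down_write(); a real rwsem could block here. */
static inline void toy_down_write(struct toy_mm *mm)
{
    assert(!mm->write_locked);
    mm->write_locked = true;
}

/* Stand-in for up_write(). */
static inline void toy_up_write(struct toy_mm *mm)
{
    assert(mm->write_locked);
    mm->write_locked = false;
}

/* The single wrapper all callers use, so the timing lives in one spot. */
static inline void toy_mmap_write_lock(struct toy_mm *mm)
{
    uint64_t start = mm->now();

    toy_down_write(mm);
    mm->lock_wait_total += mm->now() - start;
    mm->lock_count++;
}

static inline void toy_mmap_write_unlock(struct toy_mm *mm)
{
    toy_up_write(mm);
}

/* Deterministic fake clock: advances 5 ticks on every reading. */
static inline uint64_t toy_fake_clock(void)
{
    static uint64_t ticks;

    ticks += 5;
    return ticks;
}
```

Call sites are unchanged by the instrumentation; they keep calling the
lock and unlock wrappers and never see the statistics fields.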


Changes since v5.5 of the patchset:

- Applied the changes on top of v5.7-rc6. This was a straight rebase
  except for the changes noted here.

- Re-generated the coccinelle changes (patch 04/12).

- Patch 08/12: use (name) in the MMAP_LOCK_INITIALIZER macro.

- Patch 09/12: use lockdep_assert_held() / lockdep_assert_held_write()
  so that mmap_assert_locked() and mmap_assert_write_locked() get better
  coverage when lockdep is enabled but CONFIG_DEBUG_VM is not.

- Added patches 11 and 12, converting comments that referenced mmap_sem
  rwsem calls or the mmap_sem lock itself, to reference the corresponding
  mmap locking APIs or the mmap_lock itself.
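
The coverage argument for patch 09/12 can be illustrated with a small
userspace model. It is only a sketch under assumptions: TOY_LOCKDEP and
TOY_DEBUG_VM are hypothetical switches standing in for lockdep and
CONFIG_DEBUG_VM, failed checks are counted instead of triggering a
warning or BUG, and the lock is a bare counter. Each helper performs
both kinds of check, so enabling either switch gives some coverage.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical build switches; defaults model "lockdep on, DEBUG_VM off". */
#ifndef TOY_LOCKDEP
#define TOY_LOCKDEP 1
#endif
#ifndef TOY_DEBUG_VM
#define TOY_DEBUG_VM 0
#endif

/* Hypothetical stand-in for mm_struct with a toy lock state. */
struct toy_mm {
    int readers;
    bool writer;
};

/* Counts failed checks instead of warning or crashing. */
static int toy_failed_checks;

#define TOY_CHECK(enabled, cond) \
    do { if ((enabled) && !(cond)) toy_failed_checks++; } while (0)

static inline bool toy_is_locked(const struct toy_mm *mm)
{
    return mm->readers > 0 || mm->writer;
}

/* Models mmap_assert_locked(): held in either mode. */
static inline void toy_mmap_assert_locked(const struct toy_mm *mm)
{
    TOY_CHECK(TOY_LOCKDEP, toy_is_locked(mm));    /* lockdep-style check */
    TOY_CHECK(TOY_DEBUG_VM, toy_is_locked(mm));   /* DEBUG_VM-style check */
}

/* Models mmap_assert_write_locked(). */
static inline void toy_mmap_assert_write_locked(const struct toy_mm *mm)
{
    TOY_CHECK(TOY_LOCKDEP, mm->writer);           /* must be write-held */
    TOY_CHECK(TOY_DEBUG_VM, toy_is_locked(mm));   /* only checks "locked" */
}
```

As in the posted helpers, the DEBUG_VM-style check in the write variant
only verifies that the lock is held at all; distinguishing read from
write is left to the lockdep-style check.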


Changes since v5 of the patchset:

- Patch 09/10: Add both mmap_assert_locked() and mmap_assert_write_locked();
  convert some call sites that were using lockdep assertions to use these
  new APIs instead.


Changes since v4 of the patchset:

- Applied the changes on top of v5.7-rc2. This was a straight rebase
  except for changes noted here.

- Patch 01/10: renamed the mmap_write_downgrade API
  (as suggested by Davidlohr Bueso).

- Patch 05/10: added arch/riscv/mm/pageattr.c changes that had been
  previously missed, as found by the kbuild bot.

- Patch 06/10: use SINGLE_DEPTH_NESTING as suggested by Matthew Wilcox.

- Patch 08/10: change MMAP_LOCK_INITIALIZER definition
  as suggested by Matthew Wilcox.

- Patch 09/10: add mm_assert_locked API as suggested by Matthew Wilcox.


Changes since v3 of the patchset:

- The changes now apply on top of v5.7-rc1. This was a straight rebase
  except for changes noted here.

- Re-generated the coccinelle changes (patch 04/10).

- Patch 06/10: removed the mmap_write_unlock_nested API;
  mmap_write_lock_nested() calls now pair with the regular mmap_write_unlock()
  as was suggested by many people.

- Patch 07/10: removed the mmap_read_release API; this is replaced with
  mmap_read_trylock_non_owner() which pairs with mmap_read_unlock_non_owner()
  Thanks to Peter Zijlstra for the suggestion.


Changes since v2 of the patchset:

- Removed the mmap_is_locked API - v2 had removed all uses of it,
  but the actual function definition was still there unused.
  Thanks to Jason Gunthorpe for noticing the unused mmap_is_locked function.


Changes since v1 of the patchset:

- Manually convert drivers/dma-buf/dma-resv.c ahead of the automated
  coccinelle conversion as this file requires a new include statement.
  Thanks to Intel's kbuild test bot for finding the issue.

- In coccinelle automated conversion, apply a single coccinelle rule
  as suggested by Markus Elfring.

- In manual conversion of sites missed by coccinelle, fix an issue where
  I had used mm_read_unlock (from an older version of my patchset) instead
  of mmap_read_unlock in some arch/mips code.
  This was also identified by Intel's kbuild test bot.

- Do not add a new mmap_is_locked API, and use lockdep_assert_held instead.
  Thanks to Jason Gunthorpe and Matthew Wilcox for the suggestion.


The changes apply on top of v5.7-rc6.

I think these changes are ready for integration into the -mm tree now
(for integration into v5.8). The coccinelle part of the change is
relatively invasive, but can be skipped over on a file by file basis
if it causes any conflicts with other pending changes. The new mmap
locking API can interoperate with new code that is still using direct
rwsem calls, until the last patch in the series which renames mmap_sem
to enforce using the new API. Maybe that last patch could be delayed for
a bit, so that we'd get a chance to convert any new code that locks
mmap_sem in the -rc1 release before applying that last patch.


Michel Lespinasse (12):
  mmap locking API: initial implementation as rwsem wrappers
  MMU notifier: use the new mmap locking API
  DMA reservations: use the new mmap locking API
  mmap locking API: use coccinelle to convert mmap_sem rwsem call sites
  mmap locking API: convert mmap_sem call sites missed by coccinelle
  mmap locking API: convert nested write lock

[PATCH v6 06/12] mmap locking API: convert nested write lock sites

2020-05-19 Thread Michel Lespinasse
Add API for nested write locks and convert the few call sites doing that.

Signed-off-by: Michel Lespinasse 
Reviewed-by: Daniel Jordan 
Reviewed-by: Laurent Dufour 
Reviewed-by: Vlastimil Babka 
---
 arch/um/include/asm/mmu_context.h | 3 ++-
 include/linux/mmap_lock.h | 5 +
 kernel/fork.c | 2 +-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/um/include/asm/mmu_context.h 
b/arch/um/include/asm/mmu_context.h
index 62262c5c7785..17ddd4edf875 100644
--- a/arch/um/include/asm/mmu_context.h
+++ b/arch/um/include/asm/mmu_context.h
@@ -8,6 +8,7 @@
 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -47,7 +48,7 @@ static inline void activate_mm(struct mm_struct *old, struct 
mm_struct *new)
 * when the new ->mm is used for the first time.
 */
__switch_mm(&new->context.id);
-   down_write_nested(&new->mmap_sem, 1);
+   mmap_write_lock_nested(new, SINGLE_DEPTH_NESTING);
uml_setup_stubs(new);
mmap_write_unlock(new);
 }
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 97ac53b66052..a757cb30ae77 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -11,6 +11,11 @@ static inline void mmap_write_lock(struct mm_struct *mm)
down_write(&mm->mmap_sem);
 }
 
+static inline void mmap_write_lock_nested(struct mm_struct *mm, int subclass)
+{
+   down_write_nested(&mm->mmap_sem, subclass);
+}
+
 static inline int mmap_write_lock_killable(struct mm_struct *mm)
 {
return down_write_killable(&mm->mmap_sem);
diff --git a/kernel/fork.c b/kernel/fork.c
index a4db6bc952c6..e702e84897fa 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -499,7 +499,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
/*
 * Not linked in yet - no deadlock potential:
 */
-   down_write_nested(&mm->mmap_sem, SINGLE_DEPTH_NESTING);
+   mmap_write_lock_nested(mm, SINGLE_DEPTH_NESTING);
 
/* No ordering required: file already has been exposed. */
RCU_INIT_POINTER(mm->exe_file, get_mm_exe_file(oldmm));
-- 
2.26.2.761.g0e0b3e54be-goog



[PATCH v6 03/12] DMA reservations: use the new mmap locking API

2020-05-19 Thread Michel Lespinasse
This use is converted manually ahead of the next patch in the series,
as it requires including a new header which the automated conversion
would miss.

Signed-off-by: Michel Lespinasse 
Reviewed-by: Daniel Jordan 
Reviewed-by: Laurent Dufour 
Reviewed-by: Vlastimil Babka 
---
 drivers/dma-buf/dma-resv.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 4264e64788c4..b45f8514dc82 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -34,6 +34,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 /**
@@ -109,7 +110,7 @@ static int __init dma_resv_lockdep(void)
 
dma_resv_init(&obj);
 
-   down_read(&mm->mmap_sem);
+   mmap_read_lock(mm);
ww_acquire_init(&ctx, &reservation_ww_class);
ret = dma_resv_lock(&obj, &ctx);
if (ret == -EDEADLK)
@@ -118,7 +119,7 @@ static int __init dma_resv_lockdep(void)
fs_reclaim_release(GFP_KERNEL);
ww_mutex_unlock(&obj.lock);
ww_acquire_fini(&ctx);
-   up_read(&mm->mmap_sem);
+   mmap_read_unlock(mm);

mmput(mm);
 
-- 
2.26.2.761.g0e0b3e54be-goog



[PATCH v6 12/12] mmap locking API: convert mmap_sem comments

2020-05-19 Thread Michel Lespinasse
Convert comments that reference mmap_sem to reference mmap_lock instead.

Signed-off-by: Michel Lespinasse 
---
 .../admin-guide/mm/numa_memory_policy.rst | 10 ++---
 Documentation/admin-guide/mm/userfaultfd.rst  |  2 +-
 Documentation/filesystems/locking.rst |  2 +-
 Documentation/vm/transhuge.rst|  4 +-
 arch/arc/mm/fault.c   |  2 +-
 arch/arm/kernel/vdso.c|  2 +-
 arch/arm/mm/fault.c   |  2 +-
 arch/ia64/mm/fault.c  |  2 +-
 arch/microblaze/mm/fault.c|  2 +-
 arch/nds32/mm/fault.c |  2 +-
 arch/powerpc/include/asm/pkeys.h  |  2 +-
 arch/powerpc/kvm/book3s_hv_uvmem.c|  6 +--
 arch/powerpc/mm/book3s32/tlb.c|  2 +-
 arch/powerpc/mm/book3s64/hash_pgtable.c   |  4 +-
 arch/powerpc/mm/book3s64/subpage_prot.c   |  2 +-
 arch/powerpc/mm/fault.c   |  8 ++--
 arch/powerpc/mm/pgtable.c |  2 +-
 arch/powerpc/platforms/cell/spufs/file.c  |  6 +--
 arch/riscv/mm/fault.c |  2 +-
 arch/s390/kvm/priv.c  |  2 +-
 arch/s390/mm/fault.c  |  2 +-
 arch/s390/mm/gmap.c   | 32 +++
 arch/s390/mm/pgalloc.c|  2 +-
 arch/sh/mm/cache-sh4.c|  2 +-
 arch/sh/mm/fault.c|  2 +-
 arch/sparc/mm/fault_64.c  |  2 +-
 arch/um/kernel/skas/mmu.c |  2 +-
 arch/um/kernel/tlb.c  |  2 +-
 arch/unicore32/mm/fault.c |  2 +-
 arch/x86/events/core.c|  2 +-
 arch/x86/include/asm/mmu.h|  2 +-
 arch/x86/include/asm/pgtable-3level.h |  8 ++--
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  2 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c|  6 +--
 arch/x86/kernel/ldt.c |  2 +-
 arch/x86/mm/fault.c   | 12 +++---
 drivers/char/mspec.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |  2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  6 +--
 drivers/gpu/drm/i915/i915_perf.c  |  2 +-
 drivers/gpu/drm/ttm/ttm_bo_vm.c   |  6 +--
 drivers/infiniband/core/uverbs_main.c |  2 +-
 drivers/infiniband/hw/hfi1/mmu_rb.c   |  2 +-
 drivers/media/v4l2-core/videobuf-dma-sg.c |  2 +-
 drivers/misc/cxl/cxllib.c |  2 +-
 drivers/misc/sgi-gru/grufault.c   |  8 ++--
 drivers/oprofile/buffer_sync.c|  2 +-
 drivers/staging/android/ashmem.c  |  4 +-
 drivers/staging/comedi/comedi_fops.c  |  2 +-
 drivers/tty/vt/consolemap.c   |  2 +-
 drivers/xen/gntdev.c  |  2 +-
 fs/coredump.c |  4 +-
 fs/exec.c |  2 +-
 fs/ext2/file.c|  2 +-
 fs/ext4/super.c   |  6 +--
 fs/kernfs/file.c  |  4 +-
 fs/proc/base.c|  6 +--
 fs/proc/task_mmu.c|  6 +--
 fs/userfaultfd.c  | 18 -
 fs/xfs/xfs_file.c |  2 +-
 fs/xfs/xfs_inode.c| 14 +++
 fs/xfs/xfs_iops.c |  4 +-
 include/asm-generic/pgtable.h |  6 +--
 include/linux/fs.h|  4 +-
 include/linux/huge_mm.h   |  2 +-
 include/linux/mempolicy.h |  2 +-
 include/linux/mm.h| 10 ++---
 include/linux/mm_types.h  |  2 +-
 include/linux/mmu_notifier.h  |  8 ++--
 include/linux/pagemap.h   |  2 +-
 include/linux/rmap.h  |  2 +-
 include/linux/sched/mm.h  | 10 ++---
 kernel/acct.c |  2 +-
 kernel/cgroup/cpuset.c|  4 +-
 kernel/events/core.c  |  6 +--
 kernel/events/uprobes.c   |  4 +-
 kernel/exit.c |  2 +-
 kernel/relay.c|  2 +-
 kernel/sys.c  |  4 +-
 lib/test_lockup.c |  8 ++--
 mm/filemap.c  | 38 +-
 mm/frame_vector.c |  2 +-
 mm/gup.c  | 38 +-
 mm/huge_memory.c  |  4 +-
 mm/hugetlb.c  |  2 +-
 mm/internal

[PATCH v6 01/12] mmap locking API: initial implementation as rwsem wrappers

2020-05-19 Thread Michel Lespinasse
This change wraps the existing mmap_sem related rwsem calls into a new
mmap locking API. There are two justifications for the new API:

- At first, it provides an easy hooking point to instrument mmap_sem
  locking latencies independently of any other rwsems.

- In the future, it may be a starting point for replacing the rwsem
  implementation with a different one, such as range locks.
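
For readers who want to experiment with the layering outside the
kernel, here is a minimal userspace sketch. It is an illustration under
assumptions, not the code in this patch: toy_rwsem is a hypothetical,
non-thread-safe stand-in for rw_semaphore, toy_mm stands in for
mm_struct, and only the trylock/unlock subset is modeled. It shows
thin inline wrappers that take the mm, hide the lock field, and turn
the int trylock result into a bool.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct rw_semaphore; tracks state, never blocks. */
struct toy_rwsem {
    int readers;    /* number of read holders */
    bool writer;    /* true while write-held */
};

/* Hypothetical stand-in for struct mm_struct. */
struct toy_mm {
    struct toy_rwsem mmap_sem;
};

/* "Raw" primitives playing the role of the rwsem calls. */
static inline void toy_init_rwsem(struct toy_rwsem *sem)
{
    sem->readers = 0;
    sem->writer = false;
}

static inline int toy_down_read_trylock(struct toy_rwsem *sem)
{
    if (sem->writer)
        return 0;
    sem->readers++;
    return 1;
}

static inline int toy_down_write_trylock(struct toy_rwsem *sem)
{
    if (sem->writer || sem->readers > 0)
        return 0;
    sem->writer = true;
    return 1;
}

static inline void toy_up_read(struct toy_rwsem *sem)
{
    assert(sem->readers > 0);
    sem->readers--;
}

static inline void toy_up_write(struct toy_rwsem *sem)
{
    assert(sem->writer);
    sem->writer = false;
}

/* Wrapper layer: callers pass the mm and never name the lock field. */
static inline void toy_mmap_init_lock(struct toy_mm *mm)
{
    toy_init_rwsem(&mm->mmap_sem);
}

static inline bool toy_mmap_read_trylock(struct toy_mm *mm)
{
    return toy_down_read_trylock(&mm->mmap_sem) != 0;
}

static inline void toy_mmap_read_unlock(struct toy_mm *mm)
{
    toy_up_read(&mm->mmap_sem);
}

static inline bool toy_mmap_write_trylock(struct toy_mm *mm)
{
    return toy_down_write_trylock(&mm->mmap_sem) != 0;
}

static inline void toy_mmap_write_unlock(struct toy_mm *mm)
{
    toy_up_write(&mm->mmap_sem);
}
```

Since callers only ever see the wrapper names, the field behind them
can later be renamed or replaced without touching the call sites.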

Signed-off-by: Michel Lespinasse 
Reviewed-by: Daniel Jordan 
Reviewed-by: Davidlohr Bueso 
Reviewed-by: Laurent Dufour 
Reviewed-by: Vlastimil Babka 
---
 include/linux/mm.h|  1 +
 include/linux/mmap_lock.h | 54 +++
 2 files changed, 55 insertions(+)
 create mode 100644 include/linux/mmap_lock.h

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5a323422d783..051ec782bdbb 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
new file mode 100644
index ..97ac53b66052
--- /dev/null
+++ b/include/linux/mmap_lock.h
@@ -0,0 +1,54 @@
+#ifndef _LINUX_MMAP_LOCK_H
+#define _LINUX_MMAP_LOCK_H
+
+static inline void mmap_init_lock(struct mm_struct *mm)
+{
+   init_rwsem(&mm->mmap_sem);
+}
+
+static inline void mmap_write_lock(struct mm_struct *mm)
+{
+   down_write(&mm->mmap_sem);
+}
+
+static inline int mmap_write_lock_killable(struct mm_struct *mm)
+{
+   return down_write_killable(&mm->mmap_sem);
+}
+
+static inline bool mmap_write_trylock(struct mm_struct *mm)
+{
+   return down_write_trylock(&mm->mmap_sem) != 0;
+}
+
+static inline void mmap_write_unlock(struct mm_struct *mm)
+{
+   up_write(&mm->mmap_sem);
+}
+
+static inline void mmap_write_downgrade(struct mm_struct *mm)
+{
+   downgrade_write(&mm->mmap_sem);
+}
+
+static inline void mmap_read_lock(struct mm_struct *mm)
+{
+   down_read(&mm->mmap_sem);
+}
+
+static inline int mmap_read_lock_killable(struct mm_struct *mm)
+{
+   return down_read_killable(&mm->mmap_sem);
+}
+
+static inline bool mmap_read_trylock(struct mm_struct *mm)
+{
+   return down_read_trylock(&mm->mmap_sem) != 0;
+}
+
+static inline void mmap_read_unlock(struct mm_struct *mm)
+{
+   up_read(&mm->mmap_sem);
+}
+
+#endif /* _LINUX_MMAP_LOCK_H */
-- 
2.26.2.761.g0e0b3e54be-goog



Re: [PATCH] s390/sclp_vt220: Fix console name to match device

2020-05-19 Thread Christian Borntraeger


On 19.05.20 20:16, Valentin Vidic wrote:
> Console name reported in /proc/consoles:
> 
>   ttyS1-W- (EC p  )4:65
> 
> does not match device name:
> 
>   crw--w1 root root4,  65 May 17 12:18 /dev/ttysclp0
> 
> so debian-installer gets confused and fails to start.
> 
> Signed-off-by: Valentin Vidic 
> Cc: sta...@vger.kernel.org

This is not as simple. ttyS1 is the console name and ttysclp0 is the tty 
name.
This has mostly historic reasons and it obviously causes problems.
But there is documentation out there that actually describes the use of 
console=ttyS1 console=ttyS0
to have console output on both sclp consoles and there are probably scripts
using ttyS1.

I am wondering. The tty for ttyS0 is named sclp_line0. Does this work in LPAR?


> ---
>  drivers/s390/char/sclp_vt220.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/s390/char/sclp_vt220.c b/drivers/s390/char/sclp_vt220.c
> index 3f9a6ef650fa..3c2ed6d01387 100644
> --- a/drivers/s390/char/sclp_vt220.c
> +++ b/drivers/s390/char/sclp_vt220.c
> @@ -35,8 +35,8 @@
>  #define SCLP_VT220_MINOR 65
>  #define SCLP_VT220_DRIVER_NAME   "sclp_vt220"
>  #define SCLP_VT220_DEVICE_NAME   "ttysclp"
> -#define SCLP_VT220_CONSOLE_NAME  "ttyS"
> -#define SCLP_VT220_CONSOLE_INDEX 1   /* console=ttyS1 */
> +#define SCLP_VT220_CONSOLE_NAME  "ttysclp"
> +#define SCLP_VT220_CONSOLE_INDEX 0   /* console=ttysclp0 */
>  
>  /* Representation of a single write request */
>  struct sclp_vt220_request {
> 


Re: [PATCH v4 00/15] virtio-mem: paravirtualized memory

2020-05-19 Thread teawater
Hi David,

Thanks for your work.
I tried this version with cloud-hypervisor master.  It worked very well.

Best,
Hui

> On May 7, 2020, at 22:01, David Hildenbrand wrote:
> 
> This series is based on v5.7-rc4. The patches are located at:
>https://github.com/davidhildenbrand/linux.git virtio-mem-v4
> 
> This is basically a resend of v3 [1], now based on v5.7-rc4 and retested.
> One patch was reshuffled and two ACKs I had missed were added. The
> rebase did not require any modifications to patches.
> 
> Details about virtio-mem can be found in the cover letter of v2 [2]. A
> basic QEMU implementation was posted yesterday [3].
> 
> [1] https://lkml.kernel.org/r/20200507103119.11219-1-da...@redhat.com
> [2] https://lkml.kernel.org/r/20200311171422.10484-1-da...@redhat.com
> [3] https://lkml.kernel.org/r/20200506094948.76388-1-da...@redhat.com
> 
> v3 -> v4:
> - Move "MAINTAINERS: Add myself as virtio-mem maintainer" to #2
> - Add two ACKs from Andrew (in reply to v2)
> -- "mm: Allow to offline unmovable PageOffline() pages via ..."
> -- "mm/memory_hotplug: Introduce offline_and_remove_memory()"
> 
> v2 -> v3:
> - "virtio-mem: Paravirtualized memory hotplug"
> -- Include "linux/slab.h" to fix build issues
> -- Remember the "region_size", helpful for patch #11
> -- Minor simplification in virtio_mem_overlaps_range()
> -- Use notifier_from_errno() instead of notifier_to_errno() in notifier
> -- More reliable check for added memory when unloading the driver
> - "virtio-mem: Allow to specify an ACPI PXM as nid"
> -- Also print the nid
> - Added patch #11-#15
> 
> David Hildenbrand (15):
>  virtio-mem: Paravirtualized memory hotplug
>  MAINTAINERS: Add myself as virtio-mem maintainer
>  virtio-mem: Allow to specify an ACPI PXM as nid
>  virtio-mem: Paravirtualized memory hotunplug part 1
>  virtio-mem: Paravirtualized memory hotunplug part 2
>  mm: Allow to offline unmovable PageOffline() pages via
>MEM_GOING_OFFLINE
>  virtio-mem: Allow to offline partially unplugged memory blocks
>  mm/memory_hotplug: Introduce offline_and_remove_memory()
>  virtio-mem: Offline and remove completely unplugged memory blocks
>  virtio-mem: Better retry handling
>  virtio-mem: Add parent resource for all added "System RAM"
>  virtio-mem: Drop manual check for already present memory
>  virtio-mem: Unplug subblocks right-to-left
>  virtio-mem: Use -ETXTBSY as error code if the device is busy
>  virtio-mem: Try to unplug the complete online memory block first
> 
> MAINTAINERS |7 +
> drivers/acpi/numa/srat.c|1 +
> drivers/virtio/Kconfig  |   17 +
> drivers/virtio/Makefile |1 +
> drivers/virtio/virtio_mem.c | 1962 +++
> include/linux/memory_hotplug.h  |1 +
> include/linux/page-flags.h  |   10 +
> include/uapi/linux/virtio_ids.h |1 +
> include/uapi/linux/virtio_mem.h |  208 
> mm/memory_hotplug.c |   81 +-
> mm/page_alloc.c |   26 +
> mm/page_isolation.c |9 +
> 12 files changed, 2314 insertions(+), 10 deletions(-)
> create mode 100644 drivers/virtio/virtio_mem.c
> create mode 100644 include/uapi/linux/virtio_mem.h
> 
> -- 
> 2.25.3



Re: [PATCH v12 10/10] KVM: x86: Enable CET virtualization and advertise CET to userspace

2020-05-19 Thread Sean Christopherson
On Wed, May 06, 2020 at 04:21:09PM +0800, Yang Weijiang wrote:
> Set the feature bits so that CET capabilities can be seen in the guest via
> CPUID enumeration. Add CR4.CET bit support in order to allow the guest to
> set the CET master control bit (CR4.CET).
> 
> Signed-off-by: Yang Weijiang 
> ---
>  arch/x86/include/asm/kvm_host.h | 3 ++-
>  arch/x86/kvm/cpuid.c| 5 +++--
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index f68c825e94ad..21f3c89d8c70 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -95,7 +95,8 @@
> | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
> | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
> | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
> -   | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
> +   | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
> +   | X86_CR4_CET))
>  
>  #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
>  
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 984ab2b395b3..333a9e0d7cdf 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -344,7 +344,8 @@ void kvm_set_cpu_caps(void)
>   F(AVX512VBMI) | F(LA57) | 0 /*PKU*/ | 0 /*OSPKE*/ | F(RDPID) |
>   F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
>   F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
> - F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/
> + F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
> + F(SHSTK)
>   );
>   /* Set LA57 based on hardware capability. */
>   if (cpuid_ecx(7) & F(LA57))
> @@ -353,7 +354,7 @@ void kvm_set_cpu_caps(void)
>   kvm_cpu_cap_mask(CPUID_7_EDX,
>   F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
>   F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
> - F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM)
> + F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) | F(IBT)

SHSTK and IBT need to be disabled in vmx_set_cpu_caps() if unrestricted guest
is disabled.  CET won't play nice with emulating arbitrary instructions, e.g.
KVM doesn't enforce ENDBR and doesn't keep SSP up-to-date (and no one is
advocating fully emulating CET).

Paolo also floated the idea of providing a reduced opcode set, e.g. only I/O,
MOV, and ALU instructions, but I don't think that needs to be done in the
initial CET enabling as it's more of a defense-in-depth than a functional
requirement.

No need to respin a new series just for this, it can wait until I've looked
through this version.
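
The constraint described above can be sketched in isolation as follows. This
is a standalone userspace model, not KVM code: the CAP_* bit values,
filter_cet_caps() and cet_caps_demo() are hypothetical names chosen for
illustration, and the real change would presumably live in
vmx_set_cpu_caps() using KVM's own capability helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only (not the CPUID layout). */
#define CAP_SHSTK	(UINT32_C(1) << 0)
#define CAP_IBT		(UINT32_C(1) << 1)
#define CAP_OTHER	(UINT32_C(1) << 2)

/*
 * Drop both CET bits from the advertised set when unrestricted guest is
 * disabled, since arbitrary instructions may then be emulated without
 * ENDBR enforcement or SSP updates. All other bits pass through unchanged.
 */
static uint32_t filter_cet_caps(uint32_t caps, bool unrestricted_guest)
{
	if (!unrestricted_guest)
		caps &= ~(CAP_SHSTK | CAP_IBT);
	return caps;
}

/* Small self-check showing the intended behaviour in both modes. */
static void cet_caps_demo(void)
{
	uint32_t all = CAP_SHSTK | CAP_IBT | CAP_OTHER;

	assert(filter_cet_caps(all, true) == all);
	assert(filter_cet_caps(all, false) == CAP_OTHER);
}
```

SHSTK and IBT are cleared together in this sketch because the paragraph above
asks for both to be disabled under the same condition.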

Original thread: 
https://lkml.kernel.org/r/20200515161919.29249-1-pbonz...@redhat.com

>   );
>  
>   /* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
> -- 
> 2.17.2
> 


Re: [RFC V2] mm/vmstat: Add events for PMD based THP migration without split

2020-05-19 Thread John Hubbard

On 2020-05-19 20:32, Anshuman Khandual wrote:
...

>> How about not being quite so granular on the THP config options, and
>> just guarding these events with the overall CONFIG_TRANSPARENT_HUGEPAGE
>> option, instead of the sub-option CONFIG_ARCH_ENABLE_THP_MIGRATION?
>>
>> I tentatively think it's harmless and not really misleading to have
>> /proc/vmstat showing this in all THP-enabled configurations:
>>
>> thp_pmd_migration_success 0
>> thp_pmd_migration_failure 0
>>
>> ...if THP is enabled, and *whether or not* _THP_MIGRATION is enabled.
>> And this simplifies things a bit. Given how the .config options can get,
>> I think simplifying would be nice.
>>
>> However, I'm ready to be corrected on that, if it's a bad idea for
>> other API reasons perhaps.  Can anyone please comment?
>
> There are no THP migration events to track unless it is enabled. Why
> show these statistics (as 0) when it's not even possible? If config
> simplicity is the only intended rationale here, it might not be the
> case either. These events and their tracking would still need to be
> wrapped with CONFIG_TRANSPARENT_HUGEPAGE otherwise.
>
> If your concern is more towards CONFIG_ARCH_ENABLE_THP_MIGRATION being
> unsuitable or having complex dependencies, then that is how the THP
> migration feature itself is currently implemented, and adding VM events
> does not address that. A possible future patch could solve all of these
> together.
>
> But sure, let's hear what others have to say on this.

Well, I don't want to hold up progress. If it's not very convincing to you,
let's just drop the idea. It was kind of weak. :)



>>> +    THP_PMD_MIGRATION_SUCCESS,
>>> +    THP_PMD_MIGRATION_FAILURE,
>>> +#endif
>>>  #endif
>>>  #ifdef CONFIG_MEMORY_BALLOON
>>>  BALLOON_INFLATE,
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index 7160c1556f79..5325700a3e90 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -1170,6 +1170,18 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>>>  #define ICE_noinline
>>>  #endif
>>> +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
>>> +static inline void thp_migration_success(bool success)
>>
>> I think this should be named
>>
>>     thp_pmd_migration_success()
>>
>> , since that's what you're really counting. Or, you could
>> name the events THP_MIGRATION_SUCCESS|FAILURE. Either way,
>> just so the function name matches the events it's counting.
>
> Makes sense but IMHO we should keep _pmd_ to be more specific.
> Will change the name here as thp_pmd_migration_success().





>>> +{
>>> +    if (success)
>>> +        count_vm_event(THP_PMD_MIGRATION_SUCCESS);
>>> +    else
>>> +        count_vm_event(THP_PMD_MIGRATION_FAILURE);
>>> +}
>>> +#else
>>> +static inline void thp_migration_success(bool success) { }
>>
>> This whole ifdef clause would disappear if my suggestion above is
>
> We will have to protect these with CONFIG_TRANSPARENT_HUGEPAGE as
> the events are still conditionally available.

Yes you are right, of course. And I even worked through that, but then
when I sat down to write a response my fingers typed v1 of my understanding
instead of v2. No one knows why. :) Sorry about the misinformation there.

>> accepted. However, if not, then I believe the convention for this
>> kind of situation is:
>>
>> static inline void thp_migration_success(bool success)
>> {
>> }
>
> AFAIK, we have examples both ways but will change if this is preferred.

Not worth worrying about, but I do recall a few recent code reviews that
all preferred the multi-line version, which is why I suggested it.

Anyway, either way, with the thp_pmd_migration_success() name change, you
can add:

Reviewed-by: John Hubbard


thanks,
--
John Hubbard
NVIDIA
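
For reference, the shape this review converged on (a helper named after the
events it counts, plus a multi-line empty stub when the feature is compiled
out) can be sketched as a standalone userspace model. This is not the kernel
patch: DEMO_THP_MIGRATION stands in for CONFIG_ARCH_ENABLE_THP_MIGRATION, and
the demo_* enum, counter array and demo_count_vm_event() are hypothetical
substitutes for the kernel's VM event machinery.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for CONFIG_ARCH_ENABLE_THP_MIGRATION; remove to get the stub. */
#define DEMO_THP_MIGRATION 1

enum demo_vm_event {
	DEMO_THP_PMD_MIGRATION_SUCCESS,
	DEMO_THP_PMD_MIGRATION_FAILURE,
	DEMO_NR_EVENTS
};

/* Hypothetical global counters, one slot per event. */
static unsigned long demo_vm_events[DEMO_NR_EVENTS];

#ifdef DEMO_THP_MIGRATION
/* Hypothetical replacement for count_vm_event(): bump one counter. */
static void demo_count_vm_event(enum demo_vm_event ev)
{
	assert(ev < DEMO_NR_EVENTS);
	demo_vm_events[ev]++;
}

/* Named after the events it counts, as suggested in the review. */
static inline void thp_pmd_migration_success(bool success)
{
	if (success)
		demo_count_vm_event(DEMO_THP_PMD_MIGRATION_SUCCESS);
	else
		demo_count_vm_event(DEMO_THP_PMD_MIGRATION_FAILURE);
}
#else
/* Multi-line empty stub, the form preferred in the review. */
static inline void thp_pmd_migration_success(bool success)
{
	(void)success;	/* nothing to count when the feature is compiled out */
}
#endif
```

Callers invoke thp_pmd_migration_success() unconditionally; only the helper
definition is guarded, so the call sites stay free of #ifdef blocks.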

