Re: [PATCH v2 4/4] arm64: dts: add support for A1 based Amlogic AD401

2019-09-05 Thread Jianxin Pan
Hi Martin,

Thanks for the review, we really appreciate your time.
Please see my comments below.

On 2019/9/6 4:15, Martin Blumenstingl wrote:
> Hi Jianxin,
> 
> (it's great to see that you and your team are upstreaming this early)
> 
> On Thu, Sep 5, 2019 at 9:08 AM Jianxin Pan  wrote:
> [...]
>> +   memory@0 {
>> +   device_type = "memory";
>> +   reg = <0x0 0x0 0x0 0x800>;
>> +   /*linux,usable-memory = <0x0 0x0 0x0 0x800>;*/
> why do we need that comment here (I don't understand it - why doesn't
> the "reg" property cover this)?
I replaced "linux,usable-memory" with reg, but forgot to remove this comment
line.
I will remove this line in the next version. Thank you.
>> +   };
>> +};
>> +
>> +_AO_B {
>> +   status = "okay";
>> +};
>> diff --git a/arch/arm64/boot/dts/amlogic/meson-a1.dtsi b/arch/arm64/boot/dts/amlogic/meson-a1.dtsi
>> new file mode 100644
>> index ..4d476ac
>> --- /dev/null
>> +++ b/arch/arm64/boot/dts/amlogic/meson-a1.dtsi
>> @@ -0,0 +1,122 @@
>> +// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
>> +/*
>> + * Copyright (c) 2019 Amlogic, Inc. All rights reserved.
>> + */
>> +
>> +#include 
>> +#include 
>> +
>> +/ {
>> +   compatible = "amlogic,a1";
>> +
>> +   interrupt-parent = <>;
>> +   #address-cells = <2>;
>> +   #size-cells = <2>;
>> +
>> +   cpus {
>> +   #address-cells = <0x2>;
>> +   #size-cells = <0x0>;
> only now I notice that all our other .dtsi also use hex values
> (instead of decimal as just a few lines above) here
> do you know if there is a particular reason for this?
> 
I just copied from the previous series, and didn't notice the difference
before.
> [...]
>> +   uart_AO_B: serial@fe002000 {
>> +   compatible = "amlogic,meson-gx-uart",
>> +"amlogic,meson-ao-uart";
>> +reg = <0x0 0xfe002000 0x0 0x18>;
> the indentation of the "reg" property is off here
OK, I will fix it.
> 
> also I'm a bit surprised to see no busses (like aobus, cbus, periphs, ...) 
> here
> aren't there any busses defined in the A1 SoC implementation or
> were you planning to add them later?
Unlike the previous series, there is no Cortex-M3 AO CPU in A1, and there is no
AO/EE power domain.
Most of the registers are on the apb_32b bus. aobus, cbus and periphs are not
used in A1.
> 
> Martin
> 
> .
> 



[tip: x86/cpu] x86/cpu: Add Tiger Lake to Intel family

2019-09-05 Thread tip-bot2 for Gayatri Kammela
The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     6e1c32c5dbb4b90eea8f964c2869d0bde050dbe0
Gitweb:        https://git.kernel.org/tip/6e1c32c5dbb4b90eea8f964c2869d0bde050dbe0
Author:        Gayatri Kammela
AuthorDate:    Thu, 05 Sep 2019 12:30:17 -07:00
Committer:     Ingo Molnar
CommitterDate: Fri, 06 Sep 2019 07:30:39 +02:00

x86/cpu: Add Tiger Lake to Intel family

Add the model numbers/CPUIDs of Tiger Lake mobile and desktop to the
Intel family.

Suggested-by: Tony Luck 
Signed-off-by: Gayatri Kammela 
Signed-off-by: Tony Luck 
Reviewed-by: Tony Luck 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Rahul Tanwar 
Cc: Thomas Gleixner 
Link: https://lkml.kernel.org/r/20190905193020.14707-2-tony.l...@intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/include/asm/intel-family.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h
index 5c05b2d..7c2ef2e 100644
--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -80,6 +80,9 @@
 #define INTEL_FAM6_ICELAKE_L		0x7E
 #define INTEL_FAM6_ICELAKE_NNPI		0x9D
 
+#define INTEL_FAM6_TIGERLAKE_L		0x8C
+#define INTEL_FAM6_TIGERLAKE		0x8D
+
 /* "Small Core" Processors (Atom) */
 
 #define INTEL_FAM6_ATOM_BONNELL		0x1C /* Diamondville, Pineview */


[tip: x86/cpu] x86/cpu: Add Elkhart Lake to Intel family

2019-09-05 Thread tip-bot2 for Gayatri Kammela
The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     0f65605a8d744b3a205d0a2cd8f20707e31fc023
Gitweb:        https://git.kernel.org/tip/0f65605a8d744b3a205d0a2cd8f20707e31fc023
Author:        Gayatri Kammela
AuthorDate:    Thu, 05 Sep 2019 12:30:18 -07:00
Committer:     Ingo Molnar
CommitterDate: Fri, 06 Sep 2019 07:30:39 +02:00

x86/cpu: Add Elkhart Lake to Intel family

Add the model number/CPUID of atom based Elkhart Lake to the Intel
family.

Signed-off-by: Gayatri Kammela 
Signed-off-by: Tony Luck 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Rahul Tanwar 
Cc: Thomas Gleixner 
Link: https://lkml.kernel.org/r/20190905193020.14707-3-tony.l...@intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/include/asm/intel-family.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h
index 7c2ef2e..55568af 100644
--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -106,6 +106,7 @@
 #define INTEL_FAM6_ATOM_GOLDMONT_PLUS	0x7A /* Gemini Lake */
 
 #define INTEL_FAM6_ATOM_TREMONT_D	0x86 /* Jacobsville */
+#define INTEL_FAM6_ATOM_TREMONT		0x96 /* Elkhart Lake */
 
 /* Xeon Phi */
 


[tip: x86/cpu] x86/cpu: Add new Airmont variant to Intel family

2019-09-05 Thread tip-bot2 for Rahul Tanwar
The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     855fa1f362cab2dc7574acb853b0963dd01d6b8d
Gitweb:        https://git.kernel.org/tip/855fa1f362cab2dc7574acb853b0963dd01d6b8d
Author:        Rahul Tanwar
AuthorDate:    Thu, 05 Sep 2019 12:30:19 -07:00
Committer:     Ingo Molnar
CommitterDate: Fri, 06 Sep 2019 07:30:39 +02:00

x86/cpu: Add new Airmont variant to Intel family

Add new Airmont variant CPU model to Intel family.

Signed-off-by: Rahul Tanwar 
Signed-off-by: Tony Luck 
Cc: Gayatri Kammela 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: https://lkml.kernel.org/r/20190905193020.14707-4-tony.l...@intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/include/asm/intel-family.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h
index 55568af..f046225 100644
--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -98,6 +98,7 @@
 
 #define INTEL_FAM6_ATOM_AIRMONT		0x4C /* Cherry Trail, Braswell */
 #define INTEL_FAM6_ATOM_AIRMONT_MID	0x5A /* Moorefield */
+#define INTEL_FAM6_ATOM_AIRMONT_NP	0x75 /* Lightning Mountain */
 
 #define INTEL_FAM6_ATOM_GOLDMONT	0x5C /* Apollo Lake */
 #define INTEL_FAM6_ATOM_GOLDMONT_D	0x5F /* Denverton */


[tip: x86/platform] x86/platform/uv: Fix kmalloc() NULL check routine

2019-09-05 Thread tip-bot2 for Austin Kim
The following commit has been merged into the x86/platform branch of tip:

Commit-ID:     864b23f0169d5bff677e8443a7a90dfd6b090afc
Gitweb:        https://git.kernel.org/tip/864b23f0169d5bff677e8443a7a90dfd6b090afc
Author:        Austin Kim
AuthorDate:    Fri, 06 Sep 2019 08:29:51 +09:00
Committer:     Ingo Molnar
CommitterDate: Fri, 06 Sep 2019 07:36:16 +02:00

x86/platform/uv: Fix kmalloc() NULL check routine

The result of kmalloc() should have been checked ahead of below statement:

pqp = (struct bau_pq_entry *)vp;

Move BUG_ON(!vp) before above statement.

Signed-off-by: Austin Kim 
Cc: Dimitri Sivanich 
Cc: Hedi Berriche 
Cc: Linus Torvalds 
Cc: Mike Travis 
Cc: Peter Zijlstra 
Cc: Russ Anderson 
Cc: Steve Wahl 
Cc: Thomas Gleixner 
Cc: alli...@lohutok.net
Cc: a...@infradead.org
Cc: arm...@tjaldur.nl
Cc: b...@alien8.de
Cc: dvh...@infradead.org
Cc: gre...@linuxfoundation.org
Cc: h...@zytor.com
Cc: k...@umn.edu
Cc: platform-driver-...@vger.kernel.org
Link: https://lkml.kernel.org/r/20190905232951.GA28779@LGEARND20B15
Signed-off-by: Ingo Molnar 
---
 arch/x86/platform/uv/tlb_uv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index 20c389a..5f0a96b 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -1804,9 +1804,9 @@ static void pq_init(int node, int pnode)
 
 	plsize = (DEST_Q_SIZE + 1) * sizeof(struct bau_pq_entry);
 	vp = kmalloc_node(plsize, GFP_KERNEL, node);
-	pqp = (struct bau_pq_entry *)vp;
-	BUG_ON(!pqp);
+	BUG_ON(!vp);
 
+	pqp = (struct bau_pq_entry *)vp;
 	cp = (char *)pqp + 31;
 	pqp = (struct bau_pq_entry *)(((unsigned long)cp >> 5) << 5);
 


[tip: x86/cpu] x86/cpu: Update init data for new Airmont CPU model

2019-09-05 Thread tip-bot2 for Rahul Tanwar
The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     0cc5359d8fd45bc410906e009117e78e2b5b2322
Gitweb:        https://git.kernel.org/tip/0cc5359d8fd45bc410906e009117e78e2b5b2322
Author:        Rahul Tanwar
AuthorDate:    Thu, 05 Sep 2019 12:30:20 -07:00
Committer:     Ingo Molnar
CommitterDate: Fri, 06 Sep 2019 07:30:40 +02:00

x86/cpu: Update init data for new Airmont CPU model

Update properties for newly added Airmont CPU variant.

Signed-off-by: Rahul Tanwar 
Signed-off-by: Tony Luck 
Cc: Gayatri Kammela 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: https://lkml.kernel.org/r/20190905193020.14707-5-tony.l...@intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/cpu/common.c | 1 +
 arch/x86/kernel/cpu/intel.c  | 1 +
 arch/x86/kernel/tsc_msr.c| 5 +
 3 files changed, 7 insertions(+)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index b6a9e27..030e527 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1059,6 +1059,7 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
 	VULNWL_INTEL(CORE_YONAH,		NO_SSB),
 
 	VULNWL_INTEL(ATOM_AIRMONT_MID,		NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
+	VULNWL_INTEL(ATOM_AIRMONT_NP,		NO_L1TF | NO_SWAPGS),
 
 	VULNWL_INTEL(ATOM_GOLDMONT,		NO_MDS | NO_L1TF | NO_SWAPGS),
 	VULNWL_INTEL(ATOM_GOLDMONT_D,		NO_MDS | NO_L1TF | NO_SWAPGS),
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index e2082cc..c2fdc00 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -268,6 +268,7 @@ static void early_init_intel(struct cpuinfo_x86 *c)
 	case INTEL_FAM6_ATOM_SALTWELL_MID:
 	case INTEL_FAM6_ATOM_SALTWELL_TABLET:
 	case INTEL_FAM6_ATOM_SILVERMONT_MID:
+	case INTEL_FAM6_ATOM_AIRMONT_NP:
 		set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC_S3);
 		break;
 	default:
diff --git a/arch/x86/kernel/tsc_msr.c b/arch/x86/kernel/tsc_msr.c
index 067858f..e0cbe4f 100644
--- a/arch/x86/kernel/tsc_msr.c
+++ b/arch/x86/kernel/tsc_msr.c
@@ -58,6 +58,10 @@ static const struct freq_desc freq_desc_ann = {
 	1, { 83300, 100000, 133300, 100000, 0, 0, 0, 0 }
 };
 
+static const struct freq_desc freq_desc_lgm = {
+	1, { 78000, 78000, 78000, 78000, 78000, 78000, 78000, 78000 }
+};
+
 static const struct x86_cpu_id tsc_msr_cpu_ids[] = {
 	INTEL_CPU_FAM6(ATOM_SALTWELL_MID,	freq_desc_pnw),
 	INTEL_CPU_FAM6(ATOM_SALTWELL_TABLET,	freq_desc_clv),
@@ -65,6 +69,7 @@ static const struct x86_cpu_id tsc_msr_cpu_ids[] = {
 	INTEL_CPU_FAM6(ATOM_SILVERMONT_MID,	freq_desc_tng),
 	INTEL_CPU_FAM6(ATOM_AIRMONT,		freq_desc_cht),
 	INTEL_CPU_FAM6(ATOM_AIRMONT_MID,	freq_desc_ann),
+	INTEL_CPU_FAM6(ATOM_AIRMONT_NP,		freq_desc_lgm),
 	{}
 };
 


Re: [git pull] habanalabs pull request for kernel 5.4

2019-09-05 Thread Greg KH
On Fri, Sep 06, 2019 at 07:54:41AM +0300, Oded Gabbay wrote:
> On Fri, Sep 6, 2019 at 7:38 AM Oded Gabbay  wrote:
> >
> > On Thu, Sep 5, 2019 at 11:50 PM Greg KH  wrote:
> > >
> > > On Thu, Sep 05, 2019 at 03:19:34PM +0300, Oded Gabbay wrote:
> > > > Hello Greg,
> > > >
> > > > This is the pull request for habanalabs driver for kernel 5.4.
> > > >
> > > > It contains one major change, the creation of an additional char device
> > > > per PCI device. In addition, there are some small changes and
> > > > improvements.
> > > >
> > > > Please see the tag message for details on what this pull request contains.
> > > >
> > > > Thanks,
> > > > Oded
> > > >
> > > > The following changes since commit 25ec8710d9c2cd4d0446ac60a72d388000d543e6:
> > > >
> > > >   w1: add DS2501, DS2502, DS2505 EPROM device driver (2019-09-04 14:34:31 +0200)
> > > >
> > > > are available in the Git repository at:
> > > >
> > > >   git://people.freedesktop.org/~gabbayo/linux tags/misc-habanalabs-next-2019-09-05
> > >
> > > Is that a signed tag?  It doesn't seem to me like it is, have you always
> > > sent unsigned tags?
> > >
> > > thanks,
> > >
> > > greg k-h
> >
> > It is unsigned. I have never sent you a signed tag.
> >
> > Thanks,
> > Oded
> 
> Just to clarify. I have never sent a signed pull request. I'll look
> now how to do it and re-send this pull request to you.
> My only question is how do you verify my GPG key ? Do I need to
> authenticate it somewhere ?

Ok, for some reason I thought you had sent signed tags in the past, my
fault, sorry about that.

If at all possible, it would be good to get your gpg key into the
kernel.org "web of trust" by having it signed by a kernel.org developer
so that I "know" this is you in some form :)

Now pulled and pushed out.

thanks,

greg k-h


Re: [PATCH v2 2/2] PTP: add support for one-shot output

2019-09-05 Thread Richard Cochran
On Thu, Sep 05, 2019 at 01:03:46PM +0300, Felipe Balbi wrote:
> This is a bit confusing, really. Especially when the comment right above
> those flags states:
> 
> /* PTP_xxx bits, for the flags field within the request structures. */

Agreed, it is confusing.  Go ahead and remove this comment.

> Seems like we will, at least, make it clear which flags are valid for
> which request structures.

Yes, please do make it as clear as you can.

Thanks,
Richard


Re: [PATCH v4 3/4] dt-bindings: Add Qualcomm USB SuperSpeed PHY bindings

2019-09-05 Thread Stephen Boyd
Quoting Jack Pham (2019-09-05 10:58:02)
> Hi Jorge, Bjorn,
> 
> On Thu, Sep 05, 2019 at 09:18:57AM +0200, Jorge Ramirez wrote:
> > On 9/4/19 01:34, Bjorn Andersson wrote:
> > > On Tue 03 Sep 14:45 PDT 2019, Stephen Boyd wrote:
> > >> that would need an of_regulator_get() sort of API that can get the
> > >> regulator out of there? Or to make the connector into a struct device
> > >> that can get the regulator out per some generic connector driver and
> > >> then pass it through to the USB controller when it asks for it. Maybe
> > >> try to prototype that out?
> > >>
> > > 
> > > The examples given in the DT bindings describes the connector as a child
> > > of a PMIC, with of_graph somehow tying it to the various inputs. But in
> > > these examples vbus is handled by implicitly inside the MFD, where
> > > extcon is informed about the plug event they toggle vbus as well.
> > > 
> > > In our case we have a extcon-usb-gpio to detect mode, which per Jorge's
> > > proposal will trickle down to the PHY and become a regulator calls on
> > > either some external regulator or more typically one of the chargers in
> > > the system.
> 
> Interesting you mention extcon-usb-gpio. I thought extcon at least from
> bindings perspective is passé now. Maybe this is what you need (just
> landed in usb-next):
> 
> usb: common: add USB GPIO based connection detection driver
> https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=4602f3bff2669012c1147eecfe74c121765f5c56
> 
> dt-bindings: usb: add binding for USB GPIO based connection detection driver
> https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=f651c73e71f53f65e9846677d79d8e120452b59f
> 
> Fortunately this new driver might check the right boxes for you:
> - usb connector binding
> - ID detect GPIO
> - vbus-supply regulator
> 
> With that, I think you can also keep the connector subnode out of the
> SSPHY node well, and similarly get rid of the vbus toggle handling from
> the PHY driver.
> 
> The big thing missing now is that this driver replaces extcon
> completely, so we'll need handling in dwc3/dwc3-qcom to retrieve the
> role switch state to know when host mode is entered. I saw this a while
> back but don't think it got picked up:
> 
> https://patchwork.kernel.org/patch/10909981/
> 

Yes this looks like the approach that should be taken. One question
though, is this a micro-b connector or a type-c connector on the board?
I thought it was a type-c, so then this USB gpio based connection driver
isn't an exact fit?



Re: [PATCH v2] drm/virtio: Use vmalloc for command buffer allocations.

2019-09-05 Thread Gerd Hoffmann
> +/* How many bytes left in this page. */
> +static unsigned int rest_of_page(void *data)
> +{
> + return PAGE_SIZE - offset_in_page(data);
> +}

Not needed.

> +/* Create sg_table from a vmalloc'd buffer. */
> +static struct sg_table *vmalloc_to_sgt(char *data, uint32_t size, int *sg_ents)
> +{
> + int nents, ret, s, i;
> + struct sg_table *sgt;
> + struct scatterlist *sg;
> + struct page *pg;
> +
> + *sg_ents = 0;
> +
> + sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
> + if (!sgt)
> + return NULL;
> +
> + nents = DIV_ROUND_UP(size, PAGE_SIZE) + 1;

Why +1?

> + ret = sg_alloc_table(sgt, nents, GFP_KERNEL);
> + if (ret) {
> + kfree(sgt);
> + return NULL;
> + }
> +
> + for_each_sg(sgt->sgl, sg, nents, i) {
> + pg = vmalloc_to_page(data);
> + if (!pg) {
> + sg_free_table(sgt);
> + kfree(sgt);
> + return NULL;
> + }
> +
> + s = rest_of_page(data);
> + if (s > size)
> + s = size;

vmalloc memory is page aligned, so:

s = min(PAGE_SIZE, size);

> + sg_set_page(sg, pg, s, offset_in_page(data));

Offset is always zero.

> +
> + size -= s;
> + data += s;
> + *sg_ents += 1;

sg_ents isn't used anywhere.

> +
> + if (size) {
> + sg_unmark_end(sg);
> + } else {
> + sg_mark_end(sg);
> + break;
> + }

That looks a bit strange.  I guess you need only one of the two because
the other is the default?

>  static int virtio_gpu_queue_fenced_ctrl_buffer(struct virtio_gpu_device *vgdev,
>  struct virtio_gpu_vbuffer *vbuf,
>  struct virtio_gpu_ctrl_hdr *hdr,
>  struct virtio_gpu_fence *fence)
>  {
>   struct virtqueue *vq = vgdev->ctrlq.vq;
> + struct scatterlist *vout = NULL, sg;
> + struct sg_table *sgt = NULL;
>   int rc;
> + int outcnt = 0;
> +
> + if (vbuf->data_size) {
> + if (is_vmalloc_addr(vbuf->data_buf)) {
> + sgt = vmalloc_to_sgt(vbuf->data_buf, vbuf->data_size,
> +  );
> + if (!sgt)
> + return -ENOMEM;
> + vout = sgt->sgl;
> + } else {
> + sg_init_one(, vbuf->data_buf, vbuf->data_size);
> + vout = 
> + outcnt = 1;

outcnt must be set in both cases.

> +static int virtio_gpu_queue_ctrl_buffer(struct virtio_gpu_device *vgdev,
> + struct virtio_gpu_vbuffer *vbuf)
> +{
> + return virtio_gpu_queue_fenced_ctrl_buffer(vgdev, vbuf, NULL, NULL);
> +}

Changing virtio_gpu_queue_ctrl_buffer to call
virtio_gpu_queue_fenced_ctrl_buffer should be done in a separate patch.

cheers,
  Gerd



Re: [PATCH v2 3/3] phy: qcom-qmp: Add SM8150 QMP UFS PHY support

2019-09-05 Thread Stephen Boyd
Quoting Vinod Koul (2019-09-05 22:10:17)
> SM8150 UFS PHY is v4 of QMP phy. Add support for V4 QMP phy register
> defines and support for SM8150 QMP UFS PHY.
> 
> Signed-off-by: Vinod Koul 
> Reviewed-by: Bjorn Andersson 
> ---

Reviewed-by: Stephen Boyd 



Re: [PATCH -next] coccinelle: platform_get_irq: Fix parse error

2019-09-05 Thread Stephen Boyd
Quoting YueHaibing (2019-09-05 20:30:06)
> When do coccicheck, I get this error:
> 
> spatch -D report --no-show-diff --very-quiet --cocci-file
> ./scripts/coccinelle/api/platform_get_irq.cocci --include-headers
> --dir . -I ./arch/x86/include -I ./arch/x86/include/generated -I ./include
>  -I ./arch/x86/include/uapi -I ./arch/x86/include/generated/uapi
>  -I ./include/uapi -I ./include/generated/uapi
>  --include ./include/linux/kconfig.h --jobs 192 --chunksize 1
> minus: parse error:
>   File "./scripts/coccinelle/api/platform_get_irq.cocci", line 24, column 9, 
> charpos = 355
>   around = '\(',
>   whole content = if ( ret \( < \| <= \) 0 )
> 
> In commit e56476897448 ("fpga: Remove dev_err() usage
> after platform_get_irq()") log, I found the semantic patch,
> it fix this issue.
> 
> Fixes: 98051ba2b28b ("coccinelle: Add script to check for platform_get_irq() 
> excessive prints")
> Signed-off-by: YueHaibing 
> ---

Hmm I had this earlier but someone asked me to change it.

Reviewed-by: Stephen Boyd 



Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism

2019-09-05 Thread Daniel Lezcano


Hi,

On 06/09/2019 03:48, Ming Lei wrote:

[ ... ]

>> You did not share yet the analysis of the problem (the kernel warnings
>> give the symptoms) and gave the reasoning for the solution. It is hard
>> to understand what you are looking for exactly and how to connect the dots.
> 
> Let me explain it one more time:
>
> When one IRQ flood happens on one CPU:
> 
> 1) softirq handling on this CPU can't make progress
> 
> 2) kernel thread bound to this CPU can't make progress
> 
> For example, network may require softirq to xmit packets, or another irq
> thread for handling keyboards/mice or whatever, or rcu_sched may depend
> on that CPU for making progress, then the irq flood stalls the whole
> system.
> 
>>
>> AFAIU, there are fast medium where the responses to requests are faster
>> than the time to process them, right?
> 
> Usually medium may not be faster than CPU, now we are talking about
> interrupts, which can be originated from lots of devices concurrently,
> for example, in Long Li'test, there are 8 NVMe drives involved.
> 
>>
>> I don't see how detecting IRQ flooding and use a threaded irq is the
>> solution, can you explain?
> 
> When IRQ flood is detected, we reserve a bit little time for providing
> chance to make softirq/threads scheduled by scheduler, then the above
> problem can be avoided.
> 
>>
>> If the responses are coming at a very high rate, whatever the solution
>> (interrupts, threaded interrupts, polling), we are still in the same
>> situation.
> 
> When we moving the interrupt handling into irq thread, other softirq/
> threaded interrupt/thread gets chance to be scheduled, so we can avoid
> to stall the whole system.

Ok, so the real problem is per-CPU bound tasks.

I share Thomas's opinion about a NAPI-like approach.

I do believe you should also rely on IRQ_TIME_ACCOUNTING (maybe get
it optimized) to contribute to the CPU load and enforce task migration
at load balance.

>> My suggestion was initially to see if the interrupt load will be taken
>> into accounts in the cpu load and favorize task migration with the
>> scheduler load balance to a less loaded CPU, thus the CPU processing
>> interrupts will end up doing only that while other CPUs will handle the
>> "threaded" side.
>>
>> Beside that, I'm wondering if the block scheduler should be somehow
>> involved in that [1]
> 
> For NVMe or any multi-queue storage, the default scheduler is 'none',
> which basically does nothing except for submitting IO asap.
> 
> 
> Thanks,
> Ming
> 


-- 
  Linaro.org │ Open source software for ARM SoCs




Re: [PATCH v2 1/2] dt-bindings: milbeaut-m10v-hdmac: Add Socionext Milbeaut HDMAC bindings

2019-09-05 Thread Vinod Koul
On 04-09-19, 01:18, Jassi Brar wrote:
> On Wed, Sep 4, 2019 at 12:51 AM Vinod Koul  wrote:
> >
> > On 18-08-19, 00:17, jassisinghb...@gmail.com wrote:
> > > From: Jassi Brar 
> > >
> > > Document the devicetree bindings for Socionext Milbeaut HDMAC
> > > controller. Controller has upto 8 floating channels, that need
> > > a predefined slave-id to work from a set of slaves.
> > >
> > > Signed-off-by: Jassi Brar 
> > > ---
> > >  .../bindings/dma/milbeaut-m10v-hdmac.txt  | 32 +++
> > >  1 file changed, 32 insertions(+)
> > >  create mode 100644 Documentation/devicetree/bindings/dma/milbeaut-m10v-hdmac.txt
> > >
> > > diff --git a/Documentation/devicetree/bindings/dma/milbeaut-m10v-hdmac.txt b/Documentation/devicetree/bindings/dma/milbeaut-m10v-hdmac.txt
> > > new file mode 100644
> > > index ..f0960724f1c7
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/dma/milbeaut-m10v-hdmac.txt
> > > @@ -0,0 +1,32 @@
> > > +* Milbeaut AHB DMA Controller
> > > +
> > > +Milbeaut AHB DMA controller has transfer capability bellow.
> >
> > s/bellow/below:
> >
> > > + - device to memory transfer
> > > + - memory to device transfer
> > > +
> > > +Required property:
> > > +- compatible:   Should be  "socionext,milbeaut-m10v-hdmac"
> > > +- reg:  Should contain DMA registers location and length.
> > > +- interrupts:   Should contain all of the per-channel DMA interrupts.
> > > + Number of channels is configurable - 2, 4 or 8, so
> > > + the number of interrupts specfied should be {2,4,8}.
> >
> > s/specfied/specified
> >
> Hi Vinod,
>   Do you want me to spin yet another revision for the two typos in text?

Yes that would be easier for me

Thanks

-- 
~Vinod


[PATCH v2 2/3] dt-bindings: phy-qcom-qmp: Add sm8150 UFS phy compatible string

2019-09-05 Thread Vinod Koul
Document "qcom,sm8150-qmp-ufs-phy" compatible string for QMP UFS PHY
found on SM8150.

Signed-off-by: Vinod Koul 
Reviewed-by: Bjorn Andersson 
Reviewed-by: Stephen Boyd 
---
 Documentation/devicetree/bindings/phy/qcom-qmp-phy.txt | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/phy/qcom-qmp-phy.txt b/Documentation/devicetree/bindings/phy/qcom-qmp-phy.txt
index 085fbd676cfc..eac9ad3cbbc8 100644
--- a/Documentation/devicetree/bindings/phy/qcom-qmp-phy.txt
+++ b/Documentation/devicetree/bindings/phy/qcom-qmp-phy.txt
@@ -14,7 +14,8 @@ Required properties:
   "qcom,msm8998-qmp-pcie-phy" for PCIe QMP phy on msm8998,
   "qcom,sdm845-qmp-usb3-phy" for USB3 QMP V3 phy on sdm845,
   "qcom,sdm845-qmp-usb3-uni-phy" for USB3 QMP V3 UNI phy on sdm845,
-  "qcom,sdm845-qmp-ufs-phy" for UFS QMP phy on sdm845.
+  "qcom,sdm845-qmp-ufs-phy" for UFS QMP phy on sdm845,
+  "qcom,sm8150-qmp-ufs-phy" for UFS QMP phy on sm8150.
 
 - reg:
   - index 0: address and length of register set for PHY's common
@@ -57,6 +58,8 @@ Required properties:
"aux", "cfg_ahb", "ref", "com_aux".
For "qcom,sdm845-qmp-ufs-phy" must contain:
"ref", "ref_aux".
+   For "qcom,sm8150-qmp-ufs-phy" must contain:
+   "ref", "ref_aux".
 
  - resets: a list of phandles and reset controller specifier pairs,
   one for each entry in reset-names.
@@ -83,6 +86,8 @@ Required properties:
"phy", "common".
For "qcom,sdm845-qmp-ufs-phy": must contain:
"ufsphy".
+   For "qcom,sm8150-qmp-ufs-phy": must contain:
+   "ufsphy".
 
  - vdda-phy-supply: Phandle to a regulator supply to PHY core block.
  - vdda-pll-supply: Phandle to 1.8V regulator supply to PHY refclk pll block.
-- 
2.20.1



[PATCH v2 3/3] phy: qcom-qmp: Add SM8150 QMP UFS PHY support

2019-09-05 Thread Vinod Koul
SM8150 UFS PHY is v4 of QMP phy. Add support for V4 QMP phy register
defines and support for SM8150 QMP UFS PHY.

Signed-off-by: Vinod Koul 
Reviewed-by: Bjorn Andersson 
---
 drivers/phy/qualcomm/phy-qcom-qmp.c | 125 
 drivers/phy/qualcomm/phy-qcom-qmp.h |  96 +
 2 files changed, 221 insertions(+)

diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c b/drivers/phy/qualcomm/phy-qcom-qmp.c
index 34ff6434da8f..92d3048f2b36 100644
--- a/drivers/phy/qualcomm/phy-qcom-qmp.c
+++ b/drivers/phy/qualcomm/phy-qcom-qmp.c
@@ -164,6 +164,11 @@ static const unsigned int sdm845_ufsphy_regs_layout[] = {
[QPHY_PCS_READY_STATUS] = 0x160,
 };
 
+static const unsigned int sm8150_ufsphy_regs_layout[] = {
+   [QPHY_START_CTRL]   = 0x00,
+   [QPHY_PCS_READY_STATUS] = 0x180,
+};
+
 static const struct qmp_phy_init_tbl msm8996_pcie_serdes_tbl[] = {
QMP_PHY_INIT_CFG(QSERDES_COM_BIAS_EN_CLKBUFLR_EN, 0x1c),
QMP_PHY_INIT_CFG(QSERDES_COM_CLK_ENABLE1, 0x10),
@@ -878,6 +883,93 @@ static const struct qmp_phy_init_tbl msm8998_usb3_pcs_tbl[] = {
QMP_PHY_INIT_CFG(QPHY_V3_PCS_RXEQTRAINING_RUN_TIME, 0x13),
 };
 
+static const struct qmp_phy_init_tbl sm8150_ufsphy_serdes_tbl[] = {
+   QMP_PHY_INIT_CFG(QPHY_POWER_DOWN_CONTROL, 0x01),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_SYSCLK_EN_SEL, 0xd9),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_HSCLK_SEL, 0x11),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_HSCLK_HS_SWITCH_SEL, 0x00),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_LOCK_CMP_EN, 0x01),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_VCO_TUNE_MAP, 0x02),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_PLL_IVCO, 0x0f),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_VCO_TUNE_INITVAL2, 0x00),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_BIN_VCOCAL_HSCLK_SEL, 0x11),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_DEC_START_MODE0, 0x82),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_CP_CTRL_MODE0, 0x06),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_PLL_RCTRL_MODE0, 0x16),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_PLL_CCTRL_MODE0, 0x36),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_LOCK_CMP1_MODE0, 0xff),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_LOCK_CMP2_MODE0, 0x0c),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_BIN_VCOCAL_CMP_CODE1_MODE0, 0xac),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_BIN_VCOCAL_CMP_CODE2_MODE0, 0x1e),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_DEC_START_MODE1, 0x98),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_CP_CTRL_MODE1, 0x06),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_PLL_RCTRL_MODE1, 0x16),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_PLL_CCTRL_MODE1, 0x36),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_LOCK_CMP1_MODE1, 0x32),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_LOCK_CMP2_MODE1, 0x0f),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_BIN_VCOCAL_CMP_CODE1_MODE1, 0xdd),
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_BIN_VCOCAL_CMP_CODE2_MODE1, 0x23),
+
+   /* Rate B */
+   QMP_PHY_INIT_CFG(QSERDES_COM_V4_VCO_TUNE_MAP, 0x06),
+};
+
+static const struct qmp_phy_init_tbl sm8150_ufsphy_tx_tbl[] = {
+   QMP_PHY_INIT_CFG(QSERDES_V4_TX_PWM_GEAR_1_DIVIDER_BAND0_1, 0x06),
+   QMP_PHY_INIT_CFG(QSERDES_V4_TX_PWM_GEAR_2_DIVIDER_BAND0_1, 0x03),
+   QMP_PHY_INIT_CFG(QSERDES_V4_TX_PWM_GEAR_3_DIVIDER_BAND0_1, 0x01),
+   QMP_PHY_INIT_CFG(QSERDES_V4_TX_PWM_GEAR_4_DIVIDER_BAND0_1, 0x00),
+   QMP_PHY_INIT_CFG(QSERDES_V4_TX_LANE_MODE_1, 0x05),
+   QMP_PHY_INIT_CFG(QSERDES_V4_TX_TRAN_DRVR_EMP_EN, 0x0c),
+};
+
+static const struct qmp_phy_init_tbl sm8150_ufsphy_rx_tbl[] = {
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_SIGDET_LVL, 0x24),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_SIGDET_CNTRL, 0x0f),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_SIGDET_DEGLITCH_CNTRL, 0x1e),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_RX_BAND, 0x18),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_UCDR_FASTLOCK_FO_GAIN, 0x0a),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_UCDR_SO_SATURATION_AND_ENABLE, 0x4b),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_UCDR_PI_CONTROLS, 0xf1),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_UCDR_FASTLOCK_COUNT_LOW, 0x80),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_UCDR_PI_CTRL2, 0x80),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_UCDR_FO_GAIN, 0x0c),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_UCDR_SO_GAIN, 0x04),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_RX_TERM_BW, 0x1b),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_RX_EQU_ADAPTOR_CNTRL2, 0x06),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_RX_EQU_ADAPTOR_CNTRL3, 0x04),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_RX_EQU_ADAPTOR_CNTRL4, 0x1d),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_RX_OFFSET_ADAPTOR_CNTRL2, 0x00),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_RX_IDAC_MEASURE_TIME, 0x10),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_RX_IDAC_TSETTLE_LOW, 0xc0),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_RX_IDAC_TSETTLE_HIGH, 0x00),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_RX_MODE_00_LOW, 0x36),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_RX_MODE_00_HIGH, 0x36),
+   QMP_PHY_INIT_CFG(QSERDES_V4_RX_RX_MODE_00_HIGH2, 0xf6),
+   

[PATCH v2 0/3] UFS: Add support for SM8150 UFS

2019-09-05 Thread Vinod Koul
This series adds compatible strings for ufs hc and ufs qmp phy found in
Qualcomm SM8150 SoC. Also update the qmp phy driver with version 4 and
support for ufs phy.

Changes since v1:
 - make the numbers a lower case hex
 - add review tags received

Vinod Koul (3):
  dt-bindings: ufs: Add sm8150 compatible string
  dt-bindings: phy-qcom-qmp: Add sm8150 UFS phy compatible string
  phy: qcom-qmp: Add SM8150 QMP UFS PHY support

 .../devicetree/bindings/phy/qcom-qmp-phy.txt  |   7 +-
 .../devicetree/bindings/ufs/ufshcd-pltfrm.txt |   1 +
 drivers/phy/qualcomm/phy-qcom-qmp.c   | 125 ++
 drivers/phy/qualcomm/phy-qcom-qmp.h   |  96 ++
 4 files changed, 228 insertions(+), 1 deletion(-)

-- 
2.20.1



[PATCH v2 1/3] dt-bindings: ufs: Add sm8150 compatible string

2019-09-05 Thread Vinod Koul
Document "qcom,sm8150-ufshc" compatible string for UFS HC found on
SM8150.

Signed-off-by: Vinod Koul 
Reviewed-by: Bjorn Andersson 
Reviewed-by: Stephen Boyd 
---
 Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt 
b/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
index a74720486ee2..7529e2c26127 100644
--- a/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
+++ b/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
@@ -13,6 +13,7 @@ Required properties:
"qcom,msm8996-ufshc", "qcom,ufshc", "jedec,ufs-2.0"
"qcom,msm8998-ufshc", "qcom,ufshc", "jedec,ufs-2.0"
"qcom,sdm845-ufshc", "qcom,ufshc", "jedec,ufs-2.0"
+   "qcom,sm8150-ufshc", "qcom,ufshc", "jedec,ufs-2.0"
 - interrupts: 
 - reg   : 
 
-- 
2.20.1



Re: [PATCH -next] coccinelle: platform_get_irq: Fix parse error

2019-09-05 Thread Julia Lawall



On Fri, 6 Sep 2019, YueHaibing wrote:

> When running coccicheck, I get this error:
>
> spatch -D report --no-show-diff --very-quiet --cocci-file
> ./scripts/coccinelle/api/platform_get_irq.cocci --include-headers
> --dir . -I ./arch/x86/include -I ./arch/x86/include/generated -I ./include
>  -I ./arch/x86/include/uapi -I ./arch/x86/include/generated/uapi
>  -I ./include/uapi -I ./include/generated/uapi
>  --include ./include/linux/kconfig.h --jobs 192 --chunksize 1
> minus: parse error:
>   File "./scripts/coccinelle/api/platform_get_irq.cocci", line 24, column 9, 
> charpos = 355
>   around = '\(',
>   whole content = if ( ret \( < \| <= \) 0 )
>
> The log of commit e56476897448 ("fpga: Remove dev_err() usage
> after platform_get_irq()") contains the semantic patch, and
> it fixes this issue.

Thanks very much for reporting the problem.

Acked-by: Julia Lawall 


>
> Fixes: 98051ba2b28b ("coccinelle: Add script to check for platform_get_irq() 
> excessive prints")
> Signed-off-by: YueHaibing 
> ---
>  scripts/coccinelle/api/platform_get_irq.cocci | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/scripts/coccinelle/api/platform_get_irq.cocci 
> b/scripts/coccinelle/api/platform_get_irq.cocci
> index f6e1afc..06b6a95 100644
> --- a/scripts/coccinelle/api/platform_get_irq.cocci
> +++ b/scripts/coccinelle/api/platform_get_irq.cocci
> @@ -21,7 +21,7 @@ platform_get_irq
>  platform_get_irq_byname
>  )(E, ...);
>
> -if ( ret \( < \| <= \) 0 )
> +if ( \( ret < 0 \| ret <= 0 \) )
>  {
>  (
>  if (ret != -EPROBE_DEFER)
> @@ -47,7 +47,7 @@ platform_get_irq
>  platform_get_irq_byname
>  )(E, ...);
>
> -if ( ret \( < \| <= \) 0 )
> +if ( \( ret < 0 \| ret <= 0 \) )
>  {
>  (
>  -if (ret != -EPROBE_DEFER)
> @@ -74,7 +74,7 @@ platform_get_irq
>  platform_get_irq_byname
>  )(E, ...);
>
> -if ( ret \( < \| <= \) 0 )
> +if ( \( ret < 0 \| ret <= 0 \) )
>  {
>  (
>  if (ret != -EPROBE_DEFER)
> --
> 2.7.4
>
>
>


Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted

2019-09-05 Thread Michael S. Tsirkin
On Mon, Aug 12, 2019 at 02:15:32PM +0200, Christoph Hellwig wrote:
> On Sun, Aug 11, 2019 at 04:55:27AM -0400, Michael S. Tsirkin wrote:
> > On Sun, Aug 11, 2019 at 07:56:07AM +0200, Christoph Hellwig wrote:
> > > So we need a flag on the virtio device, exposed by the
> > > hypervisor (or hardware for hw virtio devices) that says:  hey, I'm real,
> > > don't take a shortcut.
> > 
> > The point here is that it's actually still not real. So we would still
> > use a physical address. However Linux decides that it wants extra
> > security by moving all data through the bounce buffer.  The distinction
> > made is that one can actually give device a physical address of the
> > bounce buffer.
> 
> Sure.  The problem is just that you keep piling hacks on top of hacks.
> We need the per-device flag anyway to properly support hardware virtio
> device in all circumstances.  Instead of coming up with another ad-hoc
> hack to force DMA uses implement that one proper bit and reuse it here.

The flag that you mention literally means "I am a real device" so for
example, you can use VFIO with it. And this device isn't a real one,
and you can't use VFIO with it, even though it's part of a power
system which always has an IOMMU.



Or here's another way to put it: we have a broken device that can only
access physical addresses, not DMA addresses. But to enable SEV Linux
requires DMA API.  So we can still make it work if DMA address happens
to be a physical address (not necessarily of the same page).


This is where dma_addr_is_a_phys_addr() is coming from: it tells us this
weird configuration can still work.  What are we going to do for SEV if
dma_addr_is_a_phys_addr does not apply? Fail probe I guess.


So the proposal is really to make things safe and, to this end,
to add this in probe:

	if (sev_active() &&
	    !dma_addr_is_a_phys_addr(dev) &&
	    !virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM))
		return -EINVAL;


The point is to prevent loading the driver where it would
corrupt guest memory. Put this way, any objections to adding
dma_addr_is_a_phys_addr to the DMA API?





-- 
MST
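
For readers skimming the archive, the probe-time check proposed above can
be read as a small truth table. The sketch below is only an illustration
under stated assumptions: struct probe_env and virtio_probe_check() are
hypothetical stand-ins, and dma_addr_is_a_phys_addr() is the API being
proposed in this thread, not an existing kernel function.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state the probe path would query. */
struct probe_env {
	bool sev_active;         /* guest memory is encrypted */
	bool dma_addr_is_phys;   /* result of the proposed dma_addr_is_a_phys_addr() */
	bool has_iommu_platform; /* device offers VIRTIO_F_IOMMU_PLATFORM */
};

/* Returns 0 when binding the driver is safe, -EINVAL otherwise. */
int virtio_probe_check(const struct probe_env *env)
{
	/*
	 * Refuse only when memory is encrypted, DMA addresses are not
	 * physical addresses, and the device cannot use the DMA API.
	 */
	if (env->sev_active &&
	    !env->dma_addr_is_phys &&
	    !env->has_iommu_platform)
		return -EINVAL;

	return 0;
}
```

Only the combination "encrypted, DMA address is not a physical address,
no VIRTIO_F_IOMMU_PLATFORM" fails; every other combination is allowed to
probe.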


[PATCH] clk: qcom: gcc-qcs404: Use floor ops for sdcc clks

2019-09-05 Thread Vinod Koul
Update the gcc qcs404 clock driver to use floor ops for sdcc clocks. As
discussed in [1], it is a good idea to use floor ops for sdcc clocks as we
don't want the clock rates to be rounded up.

[1]: 
https://lore.kernel.org/linux-arm-msm/20190830195142.103564-1-swb...@chromium.org/

Signed-off-by: Vinod Koul 
---
 drivers/clk/qcom/gcc-qcs404.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/clk/qcom/gcc-qcs404.c b/drivers/clk/qcom/gcc-qcs404.c
index e12c04c09a6a..bd32212f37e6 100644
--- a/drivers/clk/qcom/gcc-qcs404.c
+++ b/drivers/clk/qcom/gcc-qcs404.c
@@ -1057,7 +1057,7 @@ static struct clk_rcg2 sdcc1_apps_clk_src = {
.name = "sdcc1_apps_clk_src",
.parent_names = gcc_parent_names_13,
.num_parents = 5,
-   .ops = &clk_rcg2_ops,
+   .ops = &clk_rcg2_floor_ops,
},
 };
 
@@ -1103,7 +1103,7 @@ static struct clk_rcg2 sdcc2_apps_clk_src = {
.name = "sdcc2_apps_clk_src",
.parent_names = gcc_parent_names_14,
.num_parents = 4,
-   .ops = &clk_rcg2_ops,
+   .ops = &clk_rcg2_floor_ops,
},
 };
 
-- 
2.20.1
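
As a reading aid for the patch above, the difference between the two
rounding policies can be sketched as follows. This is a simplified
illustration, not the clk_rcg2 implementation: pick_ceil() and
pick_floor() are hypothetical helpers, and the table is assumed to be
non-empty and sorted in ascending order.

```c
#include <stddef.h>

/*
 * Ceiling policy: return the lowest table entry that is >= rate.
 * Falls back to the highest entry when rate exceeds the table.
 */
unsigned long pick_ceil(const unsigned long *tbl, size_t n, unsigned long rate)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i] >= rate)
			return tbl[i];

	return tbl[n - 1];
}

/*
 * Floor policy: return the highest table entry that is <= rate.
 * Falls back to the lowest entry when rate is below the table.
 */
unsigned long pick_floor(const unsigned long *tbl, size_t n, unsigned long rate)
{
	unsigned long best = tbl[0];
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i] > rate)
			break;
		best = tbl[i];
	}

	return best;
}
```

With a table of 25, 50, 100 and 200 MHz and a request of 52 MHz, the
ceiling policy selects 100 MHz while the floor policy selects 50 MHz, so
only the floor policy guarantees the clock never runs faster than
requested.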



Re: [git pull] habanalabs pull request for kernel 5.4

2019-09-05 Thread Oded Gabbay
On Fri, Sep 6, 2019 at 7:38 AM Oded Gabbay  wrote:
>
> On Thu, Sep 5, 2019 at 11:50 PM Greg KH  wrote:
> >
> > On Thu, Sep 05, 2019 at 03:19:34PM +0300, Oded Gabbay wrote:
> > > Hello Greg,
> > >
> > > This is the pull request for habanalabs driver for kernel 5.4.
> > >
> > > It contains one major change, the creation of an additional char device
> > > per PCI device. In addition, there are some small changes and
> > > improvements.
> > >
> > > Please see the tag message for details on what this pull request contains.
> > >
> > > Thanks,
> > > Oded
> > >
> > > The following changes since commit 
> > > 25ec8710d9c2cd4d0446ac60a72d388000d543e6:
> > >
> > >   w1: add DS2501, DS2502, DS2505 EPROM device driver (2019-09-04 14:34:31 
> > > +0200)
> > >
> > > are available in the Git repository at:
> > >
> > >   git://people.freedesktop.org/~gabbayo/linux 
> > > tags/misc-habanalabs-next-2019-09-05
> >
> > Is that a signed tag?  It doesn't seem to me like it is, have you always
> > sent unsigned tags?
> >
> > thanks,
> >
> > greg k-h
>
> It is unsigned. I have never sent you a signed tag.
>
> Thanks,
> Oded

Just to clarify: I have never sent a signed pull request. I'll look
into how to do it now and re-send this pull request to you.
My only question is how do you verify my GPG key? Do I need to
authenticate it somewhere?

Thanks,
Oded


Re: [RESEND PATCH next v2 0/6] ARM: keystone: update dt and enable cpts support

2019-09-05 Thread santosh . shilimkar

On 9/5/19 12:33 PM, Grygorii Strashko wrote:
> Hi Santosh,
>
> On 06/07/2019 02:48, santosh.shilim...@oracle.com wrote:
>> On 7/5/19 8:12 AM, Grygorii Strashko wrote:
>>> Hi Santosh,
>>>
>>> This series is set of platform changes required to enable NETCP CPTS
>>> reference clock selection and final patch to enable CPTS for Keystone
>>> 66AK2E/L/HK SoCs.
>>>
>>> Those patches were posted already [1] together with driver's changes,
>>> so this is re-send of DT/platform specific changes only, as driver's
>>> changes have been merged already.
>>>
>>> Patches 1-5: CPTS DT nodes update for TI Keystone 2 66AK2HK/E/L SoCs.
>>> Patch 6: enables CPTS for TI Keystone 2 66AK2HK/E/L SoCs.
>>>
>>> [1] https://patchwork.kernel.org/cover/10980037/
>>>
>>> Grygorii Strashko (6):
>>>   ARM: dts: keystone-clocks: add input fixed clocks
>>>   ARM: dts: k2e-clocks: add input ext. fixed clocks tsipclka/b
>>>   ARM: dts: k2e-netcp: add cpts refclk_mux node
>>>   ARM: dts: k2hk-netcp: add cpts refclk_mux node
>>>   ARM: dts: k2l-netcp: add cpts refclk_mux node
>>>   ARM: configs: keystone: enable cpts
>>
>> Will add these for 5.4 queue. Thanks !!
>
> Sry, that I'm disturbing you, but I do not see those patches applied?

Sorry I missed this one. Will queue this up for next merge window.
Will push this out to next early once rc1 is out. If you don't
see it, please ping me.


Regards,
Santosh


Re: WARNING in posix_cpu_timer_del (3)

2019-09-05 Thread Eric Biggers
On Tue, Sep 03, 2019 at 09:38:07AM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:6d028043 Add linux-next specific files for 20190830
> git tree:   linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=179e59de60
> kernel config:  https://syzkaller.appspot.com/x/.config?x=82a6bec43ab0cb69
> dashboard link: https://syzkaller.appspot.com/bug?extid=dabf3198a30ed5a2158f
> compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=13bb454660
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=13af435660
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+dabf3198a30ed5a21...@syzkaller.appspotmail.com
> 
> [ cut here ]
> WARNING: CPU: 1 PID: 9805 at kernel/time/posix-cpu-timers.c:401
> posix_cpu_timer_del+0x2f0/0x3b0 kernel/time/posix-cpu-timers.c:401
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 1 PID: 9805 Comm: syz-executor380 Not tainted 5.3.0-rc6-next-20190830
> #75
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
>  panic+0x2dc/0x755 kernel/panic.c:220
>  __warn.cold+0x2f/0x3c kernel/panic.c:581
>  report_bug+0x289/0x300 lib/bug.c:195
>  fixup_bug arch/x86/kernel/traps.c:179 [inline]
>  fixup_bug arch/x86/kernel/traps.c:174 [inline]
>  do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
>  do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
>  invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1028
> RIP: 0010:posix_cpu_timer_del+0x2f0/0x3b0 kernel/time/posix-cpu-timers.c:401
> Code: 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85
> b5 00 00 00 48 83 bb c8 00 00 00 00 74 16 e8 00 58 0d 00 <0f> 0b e9 87 fe ff
> ff e8 c4 38 48 00 e9 dd fd ff ff e8 ea 57 0d 00
> RSP: 0018:88809ac87a30 EFLAGS: 00010093
> RAX: 888090a6e0c0 RBX: 88809ae762e0 RCX: 11101214dd2a
> RDX:  RSI: 8164fe10 RDI: 88809ae763a8
> RBP: 88809ac87ac0 R08: 0002 R09: 888090a6e958
> R10: fbfff138aef8 R11: 89c577c7 R12: 88809b326100
> R13: 111013590f47 R14: 88809ac87a98 R15: 88809ae76338
>  timer_delete_hook kernel/time/posix-timers.c:978 [inline]
>  itimer_delete kernel/time/posix-timers.c:1021 [inline]
>  exit_itimers+0xdb/0x2e0 kernel/time/posix-timers.c:1041
>  do_exit+0x1980/0x2e60 kernel/exit.c:853
>  do_group_exit+0x135/0x360 kernel/exit.c:983
>  get_signal+0x47c/0x2500 kernel/signal.c:2734
>  do_signal+0x87/0x1700 arch/x86/kernel/signal.c:815
>  exit_to_usermode_loop+0x286/0x380 arch/x86/entry/common.c:159
>  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
>  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
>  do_syscall_64+0x65f/0x760 arch/x86/entry/common.c:300
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x446679
> Code: e8 5c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7
> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff
> 0f 83 0b 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7909e168 EFLAGS: 0246 ORIG_RAX: 00ca
> RAX:  RBX: 00017a92 RCX: 00446679
> RDX:  RSI: 0080 RDI: 006dbc2c
> RBP: 006dbc2c R08: 00037a00 R09: 00037a00
> R10: 7909e180 R11: 0246 R12: 006dbc20
> R13:  R14: 002d R15: 20c49ba5e353f7cf
> Kernel Offset: disabled
> Rebooting in 86400 seconds..

FYI, this is still reproducible on latest linux-next (next-20190904).

- Eric


[GIT PULL] Please pull powerpc/linux.git powerpc-5.3-5 tag

2019-09-05 Thread Michael Ellerman
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Hi Linus,

Please pull some more powerpc fixes for 5.3:

The following changes since commit ed4289e8b48845888ee46377bd2b55884a55e60b:

  Revert "powerpc: slightly improve cache helpers" (2019-07-31 22:56:27 +1000)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
tags/powerpc-5.3-5

for you to fetch changes up to a8318c13e79badb92bc6640704a64cc022a6eb97:

  powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts 
(2019-09-04 22:31:13 +1000)

- --
powerpc fixes for 5.3 #5

One fix for a boot hang on some Freescale machines when PREEMPT is enabled.

Two CVE fixes for bugs in our handling of FP registers and transactional memory,
both of which can result in corrupted FP state, or FP state leaking between
processes.

Thanks to:
  Chris Packham, Christophe Leroy, Gustavo Romero, Michael Neuling.

- --
Christophe Leroy (1):
  powerpc/64e: Drop stale call to smp_processor_id() which hangs SMP startup

Gustavo Romero (2):
  powerpc/tm: Fix FP/VMX unavailable exceptions inside a transaction
  powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts


 arch/powerpc/kernel/process.c | 21 -
 arch/powerpc/mm/nohash/tlb.c  |  1 -
 2 files changed, 4 insertions(+), 18 deletions(-)
-BEGIN PGP SIGNATURE-

iQIzBAEBCAAdFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl1x48cACgkQUevqPMjh
pYBCuRAAo/buJyqBbDaN8qnw1L0gusb8KF3/j9rmoMQYTmwJROtodWnK7Yxf79LI
t8Fj94ENYaEZRazpJ379Yp2qWCIExfmWpv4GdVoLKuZwVi6aM4H5iRTSZ6SSbTY6
Rdseae17IbV4oXwEzYLjYdDtgVdrJbcWEqbdxLkffkn+km35Idz3jD5WeWXx0RQy
H0YtaZfj79caxB6Db78xY/z9ocq4zPNljI2Ghd0bvC7NmsELAVUl0/8RFn6qjRM9
LZM+Oi64Z7JnLz/FRtD/RNZSK4xAwq7vh/BdSzDGiGbaNX1o0OysxQHzv8ePwABC
GE42CqQ326vx4uICkaA7uFJ6F94s6PF1F+6XjiI0hm2STPsn0QuHd+yWKkHXMx23
VU/SvZWK3PC6HJgyCQQvmdY7g/UrcXpA0pr2IUhCnxmT2MrbFceYopSnueoag7An
WAVombwtfaFLRv8g8yV2E0y8K1X3AmfIpZBFK95zMg3uQPTiuA5Z2lFcU6L131g0
pr3K3OkR9vOn5crxOba8osjhwseNcpcvynkT5xzpRewrIpUSkl0tzwwue5jox6NX
KPV2eooGNfEcdYhuum41k+2Ps9y2aNdIXkhAdqXqTArpOdTSjdDNd+CxlHg8ZGl7
S9LffbbZhsxY4++6xSiLhhfAYQi/QjEuL6HQjy+DENEdSc+BmF8=
=nAqV
-END PGP SIGNATURE-


RE: [PATCH 1/4] softirq: implement IRQ flood detection mechanism

2019-09-05 Thread Long Li
>Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
>
>
>On 06/09/2019 03:22, Long Li wrote:
>[ ... ]
>>
>
>> Tracing shows that the CPU was in either hardirq or softirq all the
>> time before warnings. During tests, the system was unresponsive at
>> times.
>>
>> Ming's patch fixed this problem. The system was responsive throughout
>> tests.
>>
>> As for performance hit, both resulted in a small drop in peak IOPS.
>> With IRQ_TIME_ACCOUNTING I see a 3% drop. With Ming's patch it is 1%
>> drop.
>
>Do you mean IRQ_TIME_ACCOUNTING + irq threaded ?

It's just IRQ_TIME_ACCOUNTING.

>
>
>> For the tests, I used the following fio command on 10 NVMe disks: fio
>> --bs=4k --ioengine=libaio --iodepth=128
>> --filename=/dev/nvme0n1:/dev/nvme1n1:/dev/nvme2n1:/dev/nvme3n1:/dev/nvme4n1:/dev/nvme5n1:/dev/nvme6n1:/dev/nvme7n1:/dev/nvme8n1:/dev/nvme9n1
>> --direct=1 --runtime=12000 --numjobs=80 --rw=randread --name=test
>> --group_reporting --gtod_reduce=1
>
>--
> Linaro.org │ Open source software for ARM SoCs
>
>Follow Linaro: Facebook | Twitter | Blog



Re: [PATCH] soundwire: add back ACPI dependency

2019-09-05 Thread Vinod Koul
On 05-09-19, 22:35, Arnd Bergmann wrote:
> Soundwire gained a warning for randconfig builds without
> CONFIG_ACPI during the linux-5.3-rc cycle:
> 
> drivers/soundwire/slave.c:16:12: error: unused function 'sdw_slave_add' 
> [-Werror,-Wunused-function]
> 
> Add the CONFIG_ACPI dependency at the top level now.

Did you run this yesterday or today? I applied Srini's patches to
add DT support for Soundwire a couple of days back, so we should not see
this warning anymore.

> Fixes: 8676b3ca4673 ("soundwire: fix regmap dependencies and align with other 
> serial links")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/soundwire/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/soundwire/Kconfig b/drivers/soundwire/Kconfig
> index f518273cfbe3..c73bfbaa2659 100644
> --- a/drivers/soundwire/Kconfig
> +++ b/drivers/soundwire/Kconfig
> @@ -5,6 +5,7 @@
>  
>  menuconfig SOUNDWIRE
>   tristate "SoundWire support"
> + depends on ACPI
>   help
> SoundWire is a 2-Pin interface with data and clock line ratified
> by the MIPI Alliance. SoundWire is used for transporting data
> -- 
> 2.20.0

-- 
~Vinod


Re: [git pull] habanalabs pull request for kernel 5.4

2019-09-05 Thread Oded Gabbay
On Thu, Sep 5, 2019 at 11:50 PM Greg KH  wrote:
>
> On Thu, Sep 05, 2019 at 03:19:34PM +0300, Oded Gabbay wrote:
> > Hello Greg,
> >
> > This is the pull request for habanalabs driver for kernel 5.4.
> >
> > It contains one major change, the creation of an additional char device
> > per PCI device. In addition, there are some small changes and
> > improvements.
> >
> > Please see the tag message for details on what this pull request contains.
> >
> > Thanks,
> > Oded
> >
> > The following changes since commit 25ec8710d9c2cd4d0446ac60a72d388000d543e6:
> >
> >   w1: add DS2501, DS2502, DS2505 EPROM device driver (2019-09-04 14:34:31 
> > +0200)
> >
> > are available in the Git repository at:
> >
> >   git://people.freedesktop.org/~gabbayo/linux 
> > tags/misc-habanalabs-next-2019-09-05
>
> Is that a signed tag?  It doesn't seem to me like it is, have you always
> sent unsigned tags?
>
> thanks,
>
> greg k-h

It is unsigned. I have never sent you a signed tag.

Thanks,
Oded


[PATCH 3/3] software node: remove separate handling of references

2019-09-05 Thread Dmitry Torokhov
Now that all users of references have moved to reference properties,
we can remove separate handling of references.

Signed-off-by: Dmitry Torokhov 
---
 drivers/base/swnode.c| 31 +--
 include/linux/property.h | 26 ++
 2 files changed, 15 insertions(+), 42 deletions(-)

diff --git a/drivers/base/swnode.c b/drivers/base/swnode.c
index 01325705b8e4..21771b29b641 100644
--- a/drivers/base/swnode.c
+++ b/drivers/base/swnode.c
@@ -568,7 +568,6 @@ software_node_get_reference_args(const struct fwnode_handle 
*fwnode,
 {
struct swnode *swnode = to_swnode(fwnode);
const struct software_node_reference *ref;
-   const struct software_node_ref_args *ref_args;
const struct property_entry *prop;
struct fwnode_handle *refnode;
int i;
@@ -577,30 +576,18 @@ software_node_get_reference_args(const struct 
fwnode_handle *fwnode,
return -ENOENT;
 
prop = property_entry_get(swnode->node->properties, propname);
-   if (prop) {
-   if (prop->type != DEV_PROP_REF)
-   return -EINVAL;
-
-   if (index * sizeof(*ref_args) >= prop->length)
-   return -ENOENT;
-
-   ref_args = prop->is_array ?
-   &prop->pointer.ref[index] : &prop->value.ref;
-   } else {
-   if (!swnode->node->references)
-   return -ENOENT;
+   if (!prop)
+   return -ENOENT;
 
-   for (ref = swnode->node->references; ref->name; ref++)
-   if (!strcmp(ref->name, propname))
-   break;
+   if (prop->type != DEV_PROP_REF)
+   return -EINVAL;
 
-   if (!ref->name || index > (ref->nrefs - 1))
-   return -ENOENT;
+   if (index * sizeof(*ref) >= prop->length)
+   return -ENOENT;
 
-   ref_args = &ref->refs[index];
-   }
+   ref = prop->is_array ? &prop->pointer.ref[index] : &prop->value.ref;
 
-   refnode = software_node_fwnode(ref_args->node);
+   refnode = software_node_fwnode(ref->node);
if (!refnode)
return -ENOENT;
 
@@ -619,7 +606,7 @@ software_node_get_reference_args(const struct fwnode_handle 
*fwnode,
args->nargs = nargs;
 
for (i = 0; i < nargs; i++)
-   args->args[i] = ref_args->args[i];
+   args->args[i] = ref->args[i];
 
return 0;
 }
diff --git a/include/linux/property.h b/include/linux/property.h
index b25440344743..005b90d9e608 100644
--- a/include/linux/property.h
+++ b/include/linux/property.h
@@ -223,12 +223,12 @@ static inline int fwnode_property_count_u64(const struct 
fwnode_handle *fwnode,
 struct software_node;
 
 /**
- * struct software_node_ref_args - Reference property with additional arguments
+ * struct software_node_reference - Named software node reference property
  * @node: Reference to a software node
  * @nargs: Number of elements in @args array
  * @args: Integer arguments
  */
-struct software_node_ref_args {
+struct software_node_reference {
const struct software_node *node;
unsigned int nargs;
u64 args[NR_FWNODE_REFERENCE_ARGS];
@@ -255,7 +255,7 @@ struct property_entry {
const u32 *u32_data;
const u64 *u64_data;
const char * const *str;
-   const struct software_node_ref_args *ref;
+   const struct software_node_reference *ref;
} pointer;
union {
u8 u8_data;
@@ -263,7 +263,7 @@ struct property_entry {
u32 u32_data;
u64 u64_data;
const char *str;
-   struct software_node_ref_args ref;
+   struct software_node_reference ref;
} value;
};
 };
@@ -305,7 +305,7 @@ struct property_entry {
 (struct property_entry) {  \
.name = _name_, \
.length = ARRAY_SIZE(_val_) *   \
-   sizeof(struct software_node_ref_args),  \
+   sizeof(struct software_node_reference), \
.is_array = true,   \
.type = DEV_PROP_REF,   \
.pointer.ref = _val_,   \
@@ -344,7 +344,7 @@ struct property_entry {
 #define PROPERTY_ENTRY_REF(_name_, _ref_, ...) \
 (struct property_entry) {  \
.name = _name_, \
-   .length = sizeof(struct software_node_ref_args),\
+   .length = sizeof(struct software_node_reference),   \
.type = DEV_PROP_REF,   \

[PATCH 2/3] platform/x86: intel_cht_int33fe: use inline reference properties

2019-09-05 Thread Dmitry Torokhov
Now that static device properties allow defining reference properties
together with all other types of properties, instead of managing them
separately, let's adjust the driver.

Signed-off-by: Dmitry Torokhov 
---

Heikki, I do not have this hardware, so if you could try this out
it would be really great.

 drivers/platform/x86/intel_cht_int33fe.c | 46 
 1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/drivers/platform/x86/intel_cht_int33fe.c 
b/drivers/platform/x86/intel_cht_int33fe.c
index 4fbdff48a4b5..91f3c8840fd8 100644
--- a/drivers/platform/x86/intel_cht_int33fe.c
+++ b/drivers/platform/x86/intel_cht_int33fe.c
@@ -50,28 +50,8 @@ struct cht_int33fe_data {
 
 static const struct software_node nodes[];
 
-static const struct software_node_ref_args pi3usb30532_ref = {
-   &nodes[INT33FE_NODE_PI3USB30532]
-};
-
-static const struct software_node_ref_args dp_ref = {
-   &nodes[INT33FE_NODE_DISPLAYPORT]
-};
-
 static struct software_node_ref_args mux_ref;
 
-static const struct software_node_reference usb_connector_refs[] = {
-   { "orientation-switch", 1, &pi3usb30532_ref},
-   { "mode-switch", 1, &pi3usb30532_ref},
-   { "displayport", 1, &dp_ref},
-   { }
-};
-
-static const struct software_node_reference fusb302_refs[] = {
-   { "usb-role-switch", 1, &mux_ref},
-   { }
-};
-
 /*
  * Grrr I severly dislike buggy BIOS-es. At least one BIOS enumerates
  * the max17047 both through the INT33FE ACPI device (it is right there
@@ -107,7 +87,13 @@ static const struct property_entry max17047_props[] = {
{ }
 };
 
-static const struct property_entry fusb302_props[] = {
+/* Not const because we need to update "usb-role-switch" property. */
+static struct property_entry fusb302_props[] = {
+   /*
+* usb-role-switch property must be first as we rely on fixed
+* position to adjust it once we know the real node.
+*/
+   PROPERTY_ENTRY_REF("usb-role-switch", NULL),
PROPERTY_ENTRY_STRING("linux,extcon-name", "cht_wcove_pwrsrc"),
{ }
 };
@@ -131,16 +117,22 @@ static const struct property_entry usb_connector_props[] 
= {
PROPERTY_ENTRY_U32_ARRAY("source-pdos", src_pdo),
PROPERTY_ENTRY_U32_ARRAY("sink-pdos", snk_pdo),
PROPERTY_ENTRY_U32("op-sink-microwatt", 250),
+   PROPERTY_ENTRY_REF("orientation-switch",
+  &nodes[INT33FE_NODE_PI3USB30532]),
+   PROPERTY_ENTRY_REF("mode-switch",
+  &nodes[INT33FE_NODE_PI3USB30532]),
+   PROPERTY_ENTRY_REF("displayport",
+  &nodes[INT33FE_NODE_DISPLAYPORT]),
{ }
 };
 
 static const struct software_node nodes[] = {
-   { "fusb302", NULL, fusb302_props, fusb302_refs },
+   { "fusb302", NULL, fusb302_props },
{ "max17047", NULL, max17047_props },
{ "pi3usb30532" },
{ "displayport" },
{ "usb-role-switch" },
-   { "connector", &nodes[0], usb_connector_props, usb_connector_refs },
+   { "connector", &nodes[0], usb_connector_props },
{ }
 };
 
@@ -174,7 +166,13 @@ static int cht_int33fe_setup_mux(struct cht_int33fe_data 
*data)
 
data->mux = fwnode_handle_get(dev->fwnode);
put_device(dev);
-   mux_ref.node = to_software_node(data->mux);
+
+   /*
+* Update "usb-role-switch" property with real node. Note that we
+* rely on software_node_register_nodes() to use the original
+* instance of properties instead of copying them.
+*/
+   fusb302_props[0].value.ref.node = to_software_node(data->mux);
 
return 0;
 }
-- 
2.23.0.187.g17f5b7556c-goog
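
As a reading aid for this series: the reference lookup in patches 1/3 and
3/3 bounds the index by comparing index * sizeof(entry) against the
property's length in bytes, then picks either the array slot or the
single inline value. The sketch below restates that logic with
simplified, hypothetical types (struct ref_entry, struct ref_prop,
ref_lookup() and NR_ARGS are stand-ins, not the kernel definitions).

```c
#include <errno.h>
#include <stddef.h>

#define NR_ARGS 8	/* stand-in for NR_FWNODE_REFERENCE_ARGS */

/* Simplified stand-in for the per-reference structure. */
struct ref_entry {
	const void *node;
	unsigned int nargs;
	unsigned long long args[NR_ARGS];
};

/* Simplified stand-in for struct property_entry. */
struct ref_prop {
	size_t length;			/* total size of stored references, in bytes */
	int is_array;			/* non-zero when 'array' holds the references */
	const struct ref_entry *array;	/* used when is_array is set */
	struct ref_entry value;		/* used for a single inline reference */
};

/* Returns 0 and sets *out on success, -ENOENT when index is out of range. */
int ref_lookup(const struct ref_prop *prop, unsigned int index,
	       const struct ref_entry **out)
{
	/* length counts bytes, so scale the index before comparing. */
	if (index * sizeof(struct ref_entry) >= prop->length)
		return -ENOENT;

	*out = prop->is_array ? &prop->array[index] : &prop->value;
	return 0;
}
```

For a single inline reference the length is sizeof(struct ref_entry), so
index 0 is the only valid index; for an array the valid range follows
from the array size.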



[PATCH 1/3] software node: implement reference properties

2019-09-05 Thread Dmitry Torokhov
It is possible to store references to software nodes in the same fashion as
other static properties, so that users do not need to define separate
structures:

const struct software_node gpio_bank_b_node = {
	.name = "B",
};

const struct property_entry simone_key_enter_props[] __initconst = {
	PROPERTY_ENTRY_U32("linux,code", KEY_ENTER),
	PROPERTY_ENTRY_STRING("label", "enter"),
	PROPERTY_ENTRY_REF("gpios", &gpio_bank_b_node, 123, GPIO_ACTIVE_LOW),
	{ }
};

Signed-off-by: Dmitry Torokhov 
---
 drivers/base/swnode.c| 34 +++--
 include/linux/property.h | 54 +---
 2 files changed, 65 insertions(+), 23 deletions(-)

diff --git a/drivers/base/swnode.c b/drivers/base/swnode.c
index e7b3aa3bd55a..01325705b8e4 100644
--- a/drivers/base/swnode.c
+++ b/drivers/base/swnode.c
@@ -568,21 +568,39 @@ software_node_get_reference_args(const struct 
fwnode_handle *fwnode,
 {
struct swnode *swnode = to_swnode(fwnode);
const struct software_node_reference *ref;
+   const struct software_node_ref_args *ref_args;
const struct property_entry *prop;
struct fwnode_handle *refnode;
int i;
 
-   if (!swnode || !swnode->node->references)
+   if (!swnode)
return -ENOENT;
 
-   for (ref = swnode->node->references; ref->name; ref++)
-   if (!strcmp(ref->name, propname))
-   break;
+   prop = property_entry_get(swnode->node->properties, propname);
+   if (prop) {
+   if (prop->type != DEV_PROP_REF)
+   return -EINVAL;
 
-   if (!ref->name || index > (ref->nrefs - 1))
-   return -ENOENT;
+   if (index * sizeof(*ref_args) >= prop->length)
+   return -ENOENT;
+
+   ref_args = prop->is_array ?
+   &prop->pointer.ref[index] : &prop->value.ref;
+   } else {
+   if (!swnode->node->references)
+   return -ENOENT;
+
+   for (ref = swnode->node->references; ref->name; ref++)
+   if (!strcmp(ref->name, propname))
+   break;
+
+   if (!ref->name || index > (ref->nrefs - 1))
+   return -ENOENT;
+
+   ref_args = &ref->refs[index];
+   }
 
-   refnode = software_node_fwnode(ref->refs[index].node);
+   refnode = software_node_fwnode(ref_args->node);
if (!refnode)
return -ENOENT;
 
@@ -601,7 +619,7 @@ software_node_get_reference_args(const struct fwnode_handle 
*fwnode,
args->nargs = nargs;
 
for (i = 0; i < nargs; i++)
-   args->args[i] = ref->refs[index].args[i];
+   args->args[i] = ref_args->args[i];
 
return 0;
 }
diff --git a/include/linux/property.h b/include/linux/property.h
index 5a910ad79591..b25440344743 100644
--- a/include/linux/property.h
+++ b/include/linux/property.h
@@ -22,7 +22,8 @@ enum dev_prop_type {
DEV_PROP_U32,
DEV_PROP_U64,
DEV_PROP_STRING,
-   DEV_PROP_MAX,
+   DEV_PROP_REF,
+   DEV_PROP_MAX
 };
 
 enum dev_dma_attr {
@@ -219,6 +220,20 @@ static inline int fwnode_property_count_u64(const struct 
fwnode_handle *fwnode,
return fwnode_property_read_u64_array(fwnode, propname, NULL, 0);
 }
 
+struct software_node;
+
+/**
+ * struct software_node_ref_args - Reference property with additional arguments
+ * @node: Reference to a software node
+ * @nargs: Number of elements in @args array
+ * @args: Integer arguments
+ */
+struct software_node_ref_args {
+   const struct software_node *node;
+   unsigned int nargs;
+   u64 args[NR_FWNODE_REFERENCE_ARGS];
+};
+
 /**
  * struct property_entry - "Built-in" device property representation.
  * @name: Name of the property.
@@ -240,6 +255,7 @@ struct property_entry {
const u32 *u32_data;
const u64 *u64_data;
const char * const *str;
+   const struct software_node_ref_args *ref;
} pointer;
union {
u8 u8_data;
@@ -247,6 +263,7 @@ struct property_entry {
u32 u32_data;
u64 u64_data;
const char *str;
+   struct software_node_ref_args ref;
} value;
};
 };
@@ -284,6 +301,16 @@ struct property_entry {
{ .pointer = { .str = _val_ } },\
 }
 
+#define PROPERTY_ENTRY_REF_ARRAY(_name_, _val_)\
+(struct property_entry) {  \
+   .name = _name_, \
+   .length = ARRAY_SIZE(_val_) *   \
+   sizeof(struct software_node_ref_args),  \
+   .is_array = true,   \
+ 

Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism

2019-09-05 Thread Daniel Lezcano


On 06/09/2019 03:22, Long Li wrote:
[ ... ]
> 

> Tracing shows that the CPU was in either hardirq or softirq all the
> time before warnings. During tests, the system was unresponsive at
> times.
> 
> Ming's patch fixed this problem. The system was responsive throughout
> tests.
> 
> As for performance hit, both resulted in a small drop in peak IOPS.
> With IRQ_TIME_ACCOUNTING I see a 3% drop. With Ming's patch it is 1%
> drop.

Do you mean IRQ_TIME_ACCOUNTING + irq threaded ?


> For the tests, I used the following fio command on 10 NVMe disks: fio
> --bs=4k --ioengine=libaio --iodepth=128
> --filename=/dev/nvme0n1:/dev/nvme1n1:/dev/nvme2n1:/dev/nvme3n1:/dev/nvme4n1:/dev/nvme5n1:/dev/nvme6n1:/dev/nvme7n1:/dev/nvme8n1:/dev/nvme9n1
> --direct=1 --runtime=12000 --numjobs=80 --rw=randread --name=test
> --group_reporting --gtod_reduce=1

-- 
  Linaro.org │ Open source software for ARM SoCs

Follow Linaro:   Facebook |
 Twitter |
 Blog



Re: [PATCH] net/skbuff: silence warnings under memory pressure

2019-09-05 Thread Sergey Senozhatsky
On (09/05/19 12:03), Qian Cai wrote:
> > ---
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index cd51aa7d08a9..89cb47882254 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -2027,8 +2027,11 @@ asmlinkage int vprintk_emit(int facility, int level,
> >     pending_output = (curr_log_seq != log_next_seq);
> >     logbuf_unlock_irqrestore(flags);
> >  
> > +   if (!pending_output)
> > +   return printed_len;
> > +
> >     /* If called from the scheduler, we can not call up(). */
> > -   if (!in_sched && pending_output) {
> > +   if (!in_sched) {
> >     /*
> >      * Disable preemption to avoid being preempted while holding
> >      * console_sem which would prevent anyone from printing to
> > @@ -2043,10 +2046,11 @@ asmlinkage int vprintk_emit(int facility, int level,
> >     if (console_trylock_spinning())
> >     console_unlock();
> >     preempt_enable();
> > -   }
> >  
> > -   if (pending_output)
> > +   wake_up_interruptible(&log_wait);
> > +   } else {
> >     wake_up_klogd();
> > +   }
> >     return printed_len;
> >  }
> >  EXPORT_SYMBOL(vprintk_emit);
> > ---

Qian Cai, any chance you can test that patch?

-ss


Re: [PATCH 3/3] phy: qcom-qmp: Add SM8150 QMP UFS PHY support

2019-09-05 Thread Vinod Koul
On 04-09-19, 16:23, Stephen Boyd wrote:
> Quoting Vinod Koul (2019-09-04 03:08:35)
> > @@ -878,6 +883,93 @@ static const struct qmp_phy_init_tbl 
> > msm8998_usb3_pcs_tbl[] = {
> > QMP_PHY_INIT_CFG(QPHY_V3_PCS_RXEQTRAINING_RUN_TIME, 0x13),
> >  };
> >  
> > +static const struct qmp_phy_init_tbl sm8150_ufsphy_serdes_tbl[] = {
> > +   QMP_PHY_INIT_CFG(QPHY_POWER_DOWN_CONTROL, 0x01),
> > +   QMP_PHY_INIT_CFG(QSERDES_COM_V4_SYSCLK_EN_SEL, 0xD9),
> 
> Can you use lowercase hex?

Sure will update

> > +   QMP_PHY_INIT_CFG(QSERDES_COM_V4_HSCLK_SEL, 0x11),
> > +   QMP_PHY_INIT_CFG(QSERDES_COM_V4_HSCLK_HS_SWITCH_SEL, 0x00),
> > +   QMP_PHY_INIT_CFG(QSERDES_COM_V4_LOCK_CMP_EN, 0x01),
> > +   QMP_PHY_INIT_CFG(QSERDES_COM_V4_VCO_TUNE_MAP, 0x02),
> > +   QMP_PHY_INIT_CFG(QSERDES_COM_V4_PLL_IVCO, 0x0F),
> > +   QMP_PHY_INIT_CFG(QSERDES_COM_V4_VCO_TUNE_INITVAL2, 0x00),
> > +   QMP_PHY_INIT_CFG(QSERDES_COM_V4_BIN_VCOCAL_HSCLK_SEL, 0x11),
> > +   QMP_PHY_INIT_CFG(QSERDES_COM_V4_DEC_START_MODE0, 0x82),
> > +   QMP_PHY_INIT_CFG(QSERDES_COM_V4_CP_CTRL_MODE0, 0x06),
> 
> Gotta love the pile of numbers and register writes...

:D

-- 
~Vinod


[PATCH 1/1] PCI: iproc: Invalidate PAXB address mapping before programming it

2019-09-05 Thread Abhishek Shah
Invalidate PAXB inbound/outbound address mapping each time before
programming it. This is helpful for the cases where we need to
reprogram inbound/outbound address mapping without resetting PAXB.
kexec kernel is one such example.

Signed-off-by: Abhishek Shah 
Reviewed-by: Ray Jui 
Reviewed-by: Vikram Mysore Prakash 
---
 drivers/pci/controller/pcie-iproc.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/drivers/pci/controller/pcie-iproc.c 
b/drivers/pci/controller/pcie-iproc.c
index e3ca46497470..99a9521ba7ab 100644
--- a/drivers/pci/controller/pcie-iproc.c
+++ b/drivers/pci/controller/pcie-iproc.c
@@ -1245,6 +1245,32 @@ static int iproc_pcie_map_dma_ranges(struct iproc_pcie 
*pcie)
return ret;
 }
 
+static void iproc_pcie_invalidate_mapping(struct iproc_pcie *pcie)
+{
+   struct iproc_pcie_ib *ib = &pcie->ib;
+   struct iproc_pcie_ob *ob = &pcie->ob;
+   int idx;
+
+   if (pcie->ep_is_internal)
+   return;
+
+   if (pcie->need_ob_cfg) {
+   /* iterate through all OARR mapping regions */
+   for (idx = ob->nr_windows - 1; idx >= 0; idx--) {
+   iproc_pcie_write_reg(pcie,
+MAP_REG(IPROC_PCIE_OARR0, idx), 0);
+   }
+   }
+
+   if (pcie->need_ib_cfg) {
+   /* iterate through all IARR mapping regions */
+   for (idx = 0; idx < ib->nr_regions; idx++) {
+   iproc_pcie_write_reg(pcie,
+MAP_REG(IPROC_PCIE_IARR0, idx), 0);
+   }
+   }
+}
+
 static int iproce_pcie_get_msi(struct iproc_pcie *pcie,
   struct device_node *msi_node,
   u64 *msi_addr)
@@ -1517,6 +1543,8 @@ int iproc_pcie_setup(struct iproc_pcie *pcie, struct 
list_head *res)
iproc_pcie_perst_ctrl(pcie, true);
iproc_pcie_perst_ctrl(pcie, false);
 
+   iproc_pcie_invalidate_mapping(pcie);
+
if (pcie->need_ob_cfg) {
ret = iproc_pcie_map_ranges(pcie, res);
if (ret) {
-- 
2.17.1



[PATCH 3/4] mmc: host: sdhci-sprd: Add virtual command queue support

2019-09-05 Thread Baolin Wang
Add virtual command queue support.

Signed-off-by: Baolin Wang 
---
 drivers/mmc/host/Kconfig  |1 +
 drivers/mmc/host/sdhci-sprd.c |   16 
 2 files changed, 17 insertions(+)

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index e2a12c3..851e947 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -619,6 +619,7 @@ config MMC_SDHCI_SPRD
depends on ARCH_SPRD
depends on MMC_SDHCI_PLTFM
select MMC_SDHCI_IO_ACCESSORS
+   select MMC_VIRTUAL_CQHCI
help
  This selects the SDIO Host Controller in Spreadtrum
  SoCs, this driver supports R11(IP version: R11P0).
diff --git a/drivers/mmc/host/sdhci-sprd.c b/drivers/mmc/host/sdhci-sprd.c
index 19a2104..ff4886a3 100644
--- a/drivers/mmc/host/sdhci-sprd.c
+++ b/drivers/mmc/host/sdhci-sprd.c
@@ -19,6 +19,7 @@
 #include 
 
 #include "sdhci-pltfm.h"
+#include "cqhci.h"
 
 /* SDHCI_ARGUMENT2 register high 16bit */
 #define SDHCI_SPRD_ARG2_STUFF  GENMASK(31, 16)
@@ -515,6 +516,7 @@ static int sdhci_sprd_probe(struct platform_device *pdev)
 {
struct sdhci_host *host;
struct sdhci_sprd_host *sprd_host;
+   struct cqhci_host *cqv_host;
struct clk *clk;
int ret = 0;
 
@@ -625,6 +627,17 @@ static int sdhci_sprd_probe(struct platform_device *pdev)
 
sprd_host->flags = host->flags;
 
+   cqv_host = devm_kzalloc(&pdev->dev,
+   sizeof(*cqv_host), GFP_KERNEL);
+   if (!cqv_host) {
+   ret = -ENOMEM;
+   goto err_cleanup_host;
+   }
+
+   ret = cqhci_virt_init(cqv_host, host->mmc);
+   if (ret)
+   goto err_cleanup_host;
+
ret = __sdhci_add_host(host);
if (ret)
goto err_cleanup_host;
@@ -685,6 +698,7 @@ static int sdhci_sprd_runtime_suspend(struct device *dev)
struct sdhci_host *host = dev_get_drvdata(dev);
struct sdhci_sprd_host *sprd_host = TO_SPRD_HOST(host);
 
+   cqhci_virt_suspend(host->mmc);
sdhci_runtime_suspend_host(host);
 
clk_disable_unprepare(sprd_host->clk_sdio);
@@ -713,6 +727,8 @@ static int sdhci_sprd_runtime_resume(struct device *dev)
goto clk_disable;
 
sdhci_runtime_resume_host(host, 1);
+   cqhci_virt_resume(host->mmc);
+
return 0;
 
 clk_disable:
-- 
1.7.9.5



[PATCH 4/4] mmc: host: sdhci: Add virtual command queue support

2019-09-05 Thread Baolin Wang
Add cqhci_virt_finalize_request() to help to complete a request
from virtual command queue.

Signed-off-by: Baolin Wang 
---
 drivers/mmc/host/sdhci.c |7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 4e9ebc8..fb5983e 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -32,6 +32,7 @@
 #include 
 
 #include "sdhci.h"
+#include "cqhci.h"
 
 #define DRIVER_NAME "sdhci"
 
@@ -2710,7 +2711,8 @@ static bool sdhci_request_done(struct sdhci_host *host)
 
spin_unlock_irqrestore(&host->lock, flags);
 
-   mmc_request_done(host->mmc, mrq);
+   if (!cqhci_virt_finalize_request(host->mmc, mrq))
+   mmc_request_done(host->mmc, mrq);
 
return false;
 }
@@ -3133,7 +3135,8 @@ static irqreturn_t sdhci_irq(int irq, void *dev_id)
 
/* Process mrqs ready for immediate completion */
for (i = 0; i < SDHCI_MAX_MRQS; i++) {
-   if (mrqs_done[i])
+   if (mrqs_done[i] &&
+   !cqhci_virt_finalize_request(host->mmc, mrqs_done[i]))
mmc_request_done(host->mmc, mrqs_done[i]);
}
 
-- 
1.7.9.5



[PATCH 1/4] mmc: host: cqhci: Move the struct cqhci_slot into header file

2019-09-05 Thread Baolin Wang
The struct cqhci_slot will be used by the virtual command queue introduced
by the following patches, so move it to the header file.

Signed-off-by: Baolin Wang 
---
 drivers/mmc/host/cqhci.c |   10 --
 drivers/mmc/host/cqhci.h |   11 ++-
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/mmc/host/cqhci.c b/drivers/mmc/host/cqhci.c
index f7bdae5..57ff1cc 100644
--- a/drivers/mmc/host/cqhci.c
+++ b/drivers/mmc/host/cqhci.c
@@ -21,16 +21,6 @@
 #define DCMD_SLOT 31
 #define NUM_SLOTS 32
 
-struct cqhci_slot {
-   struct mmc_request *mrq;
-   unsigned int flags;
-#define CQHCI_EXTERNAL_TIMEOUT BIT(0)
-#define CQHCI_COMPLETEDBIT(1)
-#define CQHCI_HOST_CRC BIT(2)
-#define CQHCI_HOST_TIMEOUT BIT(3)
-#define CQHCI_HOST_OTHER   BIT(4)
-};
-
 static inline u8 *get_desc(struct cqhci_host *cq_host, u8 tag)
 {
return cq_host->desc_base + (tag * cq_host->slot_sz);
diff --git a/drivers/mmc/host/cqhci.h b/drivers/mmc/host/cqhci.h
index def76e9..7b07bf24f 100644
--- a/drivers/mmc/host/cqhci.h
+++ b/drivers/mmc/host/cqhci.h
@@ -141,7 +141,16 @@
 struct cqhci_host_ops;
 struct mmc_host;
 struct mmc_request;
-struct cqhci_slot;
+
+struct cqhci_slot {
+   struct mmc_request *mrq;
+   unsigned int flags;
+#define CQHCI_EXTERNAL_TIMEOUT BIT(0)
+#define CQHCI_COMPLETEDBIT(1)
+#define CQHCI_HOST_CRC BIT(2)
+#define CQHCI_HOST_TIMEOUT BIT(3)
+#define CQHCI_HOST_OTHER   BIT(4)
+};
 
 struct cqhci_host {
const struct cqhci_host_ops *ops;
-- 
1.7.9.5



[PATCH 2/4] mmc: Add virtual command queue support

2019-09-05 Thread Baolin Wang
Currently the MMC read/write stack always waits for the previous request to
be completed by mmc_blk_rw_wait() before sending a new request to the
hardware, or queues a work item to complete the request. Both bring context
switching overhead, especially at high I/O per second rates, and hurt I/O
performance.

Thus this patch introduces a virtual command queue interface, which is
similar in idea to the hardware command queue engine and can remove the
context switching. Moreover, we set the queue depth to 2 for the virtual
command queue, which is enough to let the irq handler always trigger
the next request without a context switch and then ask the blk_mq
layer for the next one to get queued, while also avoiding a long
latency.

From the fio testing data in the cover letter, we can see that the virtual
command queue clearly improves performance with a 4K block size:
about 52% for sequential read, about 114% for random read, about 81% for
sequential write, and about 127% for random write.

Moreover, we can expand the virtual command queue interface to
support MMC packed requests or packed commands in the future.

Signed-off-by: Baolin Wang 
---
 drivers/mmc/core/block.c  |   62 
 drivers/mmc/core/mmc.c|   13 +-
 drivers/mmc/core/queue.c  |   25 ++-
 drivers/mmc/host/Kconfig  |8 +
 drivers/mmc/host/Makefile |1 +
 drivers/mmc/host/cqhci-virt.c |  346 +
 drivers/mmc/host/cqhci.h  |   34 
 include/linux/mmc/host.h  |1 +
 8 files changed, 480 insertions(+), 10 deletions(-)
 create mode 100644 drivers/mmc/host/cqhci-virt.c

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 2c71a43..63d487f 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -168,6 +168,11 @@ struct mmc_rpmb_data {
 
 static inline int mmc_blk_part_switch(struct mmc_card *card,
  unsigned int part_type);
+static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
+  struct mmc_card *card,
+  int disable_multi,
+  struct mmc_queue *mq);
+static void mmc_blk_virt_cqe_req_done(struct mmc_request *mrq);
 
 static struct mmc_blk_data *mmc_blk_get(struct gendisk *disk)
 {
@@ -1569,9 +1574,31 @@ static int mmc_blk_cqe_issue_flush(struct mmc_queue *mq, 
struct request *req)
return mmc_blk_cqe_start_req(mq->card->host, mrq);
 }
 
+static int mmc_blk_virt_cqe_issue_rw_rq(struct mmc_queue *mq,
+   struct request *req)
+{
+   struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+   struct mmc_host *host = mq->card->host;
+   int err;
+
+   mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
+   mqrq->brq.mrq.done = mmc_blk_virt_cqe_req_done;
+   mmc_pre_req(host, &mqrq->brq.mrq);
+
+   err = mmc_cqe_start_req(host, &mqrq->brq.mrq);
+   if (err)
+   mmc_post_req(host, &mqrq->brq.mrq, err);
+
+   return err;
+}
+
 static int mmc_blk_cqe_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 {
struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+   struct mmc_host *host = mq->card->host;
+
+   if (host->virt_cqe)
+   return mmc_blk_virt_cqe_issue_rw_rq(mq, req);
 
mmc_blk_data_prep(mq, mqrq, 0, NULL, NULL);
 
@@ -1957,6 +1984,41 @@ static void mmc_blk_urgent_bkops(struct mmc_queue *mq,
mmc_run_bkops(mq->card);
 }
 
+static void mmc_blk_virt_cqe_req_done(struct mmc_request *mrq)
+{
+   struct mmc_queue_req *mqrq =
+   container_of(mrq, struct mmc_queue_req, brq.mrq);
+   struct request *req = mmc_queue_req_to_req(mqrq);
+   struct request_queue *q = req->q;
+   struct mmc_queue *mq = q->queuedata;
+   struct mmc_host *host = mq->card->host;
+   unsigned long flags;
+
+   if (mmc_blk_rq_error(&mqrq->brq) ||
+   mmc_blk_urgent_bkops_needed(mq, mqrq)) {
+   spin_lock_irqsave(&mq->lock, flags);
+   mq->recovery_needed = true;
+   mq->recovery_req = req;
+   spin_unlock_irqrestore(&mq->lock, flags);
+
+   host->cqe_ops->cqe_recovery_start(host);
+
+   schedule_work(&mq->recovery_work);
+   return;
+   }
+
+   mmc_blk_rw_reset_success(mq, req);
+
+   /*
+* Block layer timeouts race with completions which means the normal
+* completion path cannot be used during recovery.
+*/
+   if (mq->in_recovery)
+   mmc_blk_cqe_complete_rq(mq, req);
+   else
+   blk_mq_complete_request(req);
+}
+
 void mmc_blk_mq_complete(struct request *req)
 {
struct mmc_queue *mq = req->q->queuedata;
diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index c880489..316b0a6 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -1852,15 +1852,22 @@ static int mmc_init_card(struct mmc_host *host, u32 

[PATCH 0/4] Add MMC virtual command queue support

2019-09-05 Thread Baolin Wang
Hi All,

Currently the MMC read/write stack always waits for the previous request to
be completed by mmc_blk_rw_wait() before sending a new request to the
hardware, or queues a work item to complete the request. Both bring context
switching overhead, especially at high I/O per second rates, and hurt I/O
performance.

Thus this patch set introduces virtual command queue support and sets
the queue depth to 2, which means we do not need to wait for the previous
request to complete and can keep 2 requests in flight. That is enough to
let the irq handler always trigger the next request without a context
switch and then ask the blk_mq layer for the next one to get queued,
while also avoiding a long latency.
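
As a rough illustration of the depth-2 idea described above, here is a
minimal userspace sketch. It is not the code from this series: struct vq,
vq_submit() and vq_complete() are hypothetical names, and the "hardware"
is reduced to a counter of issued requests. It only shows that, with one
request in flight and one waiting, the completion path can issue the next
request directly.

```c
#include <stdbool.h>
#include <stddef.h>

#define VQ_DEPTH 2	/* queue depth named in the text above */

/* Hypothetical queue: slots[0] is the request currently on the hardware. */
struct vq {
	int slots[VQ_DEPTH];
	size_t count;	/* requests held: in flight plus waiting */
	int issued;	/* how many requests were handed to the hardware */
};

/* Queue a request; returns false when both slots are taken. */
static bool vq_submit(struct vq *q, int req)
{
	if (q->count == VQ_DEPTH)
		return false;
	q->slots[q->count++] = req;
	if (q->count == 1)
		q->issued++;	/* hardware was idle: issue immediately */
	return true;
}

/*
 * Stand-in for the completion interrupt: retire the in-flight request
 * and, if another one is waiting, issue it right away from this context.
 * Returns the retired request id, or -1 if nothing was in flight.
 */
static int vq_complete(struct vq *q)
{
	int done;

	if (q->count == 0)
		return -1;
	done = q->slots[0];
	q->slots[0] = q->slots[1];
	q->count--;
	if (q->count > 0)
		q->issued++;	/* next request starts without a handoff */
	return done;
}
```

In this sketch a third vq_submit() fails until vq_complete() frees a slot,
which is the point where the block layer would be asked for the next
request.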

Moreover, we can expand the virtual command queue interface to support
MMC packed requests or packed commands instead of adding new interfaces,
according to the previous discussion.

Below is some comparison data gathered with the fio tool. I used the fio
command below, changing the '--rw' parameter for each case and enabling the
direct IO flag to measure the actual hardware transfer speed with a 4K
block size.

./fio --filename=/dev/mmcblk0p30 --direct=1 --iodepth=20 --rw=read --bs=4K 
--size=512M --group_reporting --numjobs=20 --name=test_read

My eMMC card is working in HS400 Enhanced strobe mode:
[2.229856] mmc0: new HS400 Enhanced strobe MMC card at address 0001
[2.237566] mmcblk0: mmc0:0001 HBG4a2 29.1 GiB 
[2.242621] mmcblk0boot0: mmc0:0001 HBG4a2 partition 1 4.00 MiB
[2.249110] mmcblk0boot1: mmc0:0001 HBG4a2 partition 2 4.00 MiB
[2.255307] mmcblk0rpmb: mmc0:0001 HBG4a2 partition 3 4.00 MiB, chardev (248:0)

1. Without virtual command queue
I ran each case 3 times and report the average speed.

1) Sequential read:
Speed: 28.9MiB/s, 26.4MiB/s, 30.9MiB/s
Average speed: 28.7MiB/s

2) Random read:
Speed: 18.2MiB/s, 8.9MiB/s, 15.8MiB/s
Average speed: 14.3MiB/s

3) Sequential write:
Speed: 21.1MiB/s, 27.9MiB/s, 25MiB/s
Average speed: 24.7MiB/s

4) Random write:
Speed: 21.5MiB/s, 18.1MiB/s, 18.1MiB/s
Average speed: 19.2MiB/s

2. With virtual command queue
I ran each case 3 times and report the average speed.

1) Sequential read:
Speed: 44.1MiB/s, 42.3MiB/s, 44.4MiB/s
Average speed: 43.6MiB/s

2) Random read:
Speed: 30.6MiB/s, 30.9MiB/s, 30.5MiB/s
Average speed: 30.6MiB/s

3) Sequential write:
Speed: 44.1MiB/s, 45.9MiB/s, 44.2MiB/s
Average speed: 44.7MiB/s

4) Random write:
Speed: 45.1MiB/s, 43.3MiB/s, 42.4MiB/s
Average speed: 43.6MiB/s

From the above data, we can see that the virtual command queue clearly
improves performance.
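
For readers who want to check the arithmetic behind the averages listed
above, a small sketch follows. The helper names mean() and percent_gain()
are made up for illustration; only the throughput figures come from the
fio runs in this mail.

```c
#include <stddef.h>

/* Mean of n throughput samples, in MiB/s. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return n ? sum / (double)n : 0.0;
}

/* Relative gain of `after` over `before`, in percent. */
static double percent_gain(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

Feeding in the reported averages, percent_gain(28.7, 43.6) comes out at
roughly 52 for sequential read and percent_gain(14.3, 30.6) at roughly 114
for random read; the write cases work the same way with 24.7 to 44.7 and
19.2 to 43.6.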

Any comments are welcome. Thanks a lot.

Baolin Wang (4):
  mmc: host: cqhci: Move the struct cqhci_slot into header file
  mmc: Add virtual command queue support
  mmc: host: sdhci-sprd: Add virtual command queue support
  mmc: host: sdhci: Add virtual command queue support

 drivers/mmc/core/block.c  |   62 
 drivers/mmc/core/mmc.c|   13 +-
 drivers/mmc/core/queue.c  |   25 ++-
 drivers/mmc/host/Kconfig  |9 ++
 drivers/mmc/host/Makefile |1 +
 drivers/mmc/host/cqhci-virt.c |  346 +
 drivers/mmc/host/cqhci.c  |   10 --
 drivers/mmc/host/cqhci.h  |   45 +-
 drivers/mmc/host/sdhci-sprd.c |   16 ++
 drivers/mmc/host/sdhci.c  |7 +-
 include/linux/mmc/host.h  |1 +
 11 files changed, 512 insertions(+), 23 deletions(-)
 create mode 100644 drivers/mmc/host/cqhci-virt.c

-- 
1.7.9.5



Re: general protection fault in __apic_accept_irq

2019-09-05 Thread Wanpeng Li
On Thu, 5 Sep 2019 at 21:11, Vitaly Kuznetsov  wrote:
>
> Wanpeng Li  writes:
>
> > On Thu, 5 Sep 2019 at 16:53, syzbot
> >  wrote:
> >>
> >> Hello,
> >>
> >> syzbot found the following crash on:
> >>
> >> HEAD commit:3b47fd5c Merge tag 'nfs-for-5.3-4' of 
> >> git://git.linux-nfs...
> >> git tree:   upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=124af12a60
> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=144488c6c6c6d2b6
> >> dashboard link: 
> >> https://syzkaller.appspot.com/bug?extid=dff25ee91f0c7d5c1695
> >> compiler:   clang version 9.0.0 (/home/glider/llvm/clang
> >> 80fee25776c2fb61e74c1ecb1a523375c2500b69)
> >> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1095467660
> >> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1752fe0a60
> >>
> >> The bug was bisected to:
> >>
> >> commit 0aa67255f54df192d29aec7ac6abb1249d45bda7
> >> Author: Vitaly Kuznetsov 
> >> Date:   Mon Nov 26 15:47:29 2018 +
> >>
> >>  x86/hyper-v: move synic/stimer control structures definitions to
> >> hyperv-tlfs.h
> >>
> >> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=156128c160
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=136128c160
> >>
> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> Reported-by: syzbot+dff25ee91f0c7d5c1...@syzkaller.appspotmail.com
> >> Fixes: 0aa67255f54d ("x86/hyper-v: move synic/stimer control structures
> >> definitions to hyperv-tlfs.h")
> >>
> >> kvm [9347]: vcpu0, guest rIP: 0xcc Hyper-V uhandled wrmsr: 0x4004 data
> >> 0x94
> >> kvm [9347]: vcpu0, guest rIP: 0xcc Hyper-V uhandled wrmsr: 0x4004 data
> >> 0x48c
> >> kvm [9347]: vcpu0, guest rIP: 0xcc Hyper-V uhandled wrmsr: 0x4004 data
> >> 0x4ac
> >> kvm [9347]: vcpu0, guest rIP: 0xcc Hyper-V uhandled wrmsr: 0x4005 data
> >> 0x1520
> >> kvm [9347]: vcpu0, guest rIP: 0xcc Hyper-V uhandled wrmsr: 0x4006 data
> >> 0x15d4
> >> kvm [9347]: vcpu0, guest rIP: 0xcc Hyper-V uhandled wrmsr: 0x4007 data
> >> 0x15c4
> >> kasan: CONFIG_KASAN_INLINE enabled
> >> kasan: GPF could be caused by NULL-ptr deref or user memory access
> >> general protection fault:  [#1] PREEMPT SMP KASAN
> >> CPU: 0 PID: 9347 Comm: syz-executor665 Not tainted 5.3.0-rc7+ #0
> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> >> Google 01/01/2011
> >> RIP: 0010:__apic_accept_irq+0x46/0x740 arch/x86/kvm/lapic.c:1029
> >
> > Thanks for the report, I found the root cause, will send a patch soon.
> >
>
> I'm really interested in how any issue can be caused by 0aa67255f54d as
> we just moved some definitions from a c file to a common header... (ok,
> we did more than that, some structures gained '__packed' but it all
> still seems legitimate to me and I can't recall any problems with
> genuine Hyper-V...)

Yes, the bisect is a false positive; we can focus on fixing the bug.

 Wanpeng


Re: [PATCH] net/skbuff: silence warnings under memory pressure

2019-09-05 Thread Sergey Senozhatsky
On (09/05/19 13:23), Steven Rostedt wrote:
> > I think we can queue significantly much less irq_work-s from printk().
> > 
> > Petr, Steven, what do you think?

[..]
> I mean, really, do we need to keep calling wake up if it
> probably never even executed?

I guess ratelimiting you are talking about ("if it probably never even
executed") would be to check if we have already called wake up on the
log_wait ->head. For that we need to, at least, take log_wait spin_lock
and check that ->head is still in TASK_INTERRUPTIBLE; which is (quite,
but not exactly) close to what wake_up_interruptible() does - it doesn't
wake up the same task twice, it bails out on `p->state & state' check.

Or did I miss something?

-ss


RE: [Patch v3] storvsc: setup 1:1 mapping between hardware queue and CPU queue

2019-09-05 Thread Long Li
>Subject: RE: [Patch v3] storvsc: setup 1:1 mapping between hardware queue
>and CPU queue
>
>From: Long Li  Sent: Thursday, September 5, 2019 3:55
>PM
>>
>> storvsc doesn't use a dedicated hardware queue for a given CPU queue.
>> When issuing I/O, it selects returning CPU (hardware queue)
>> dynamically based on vmbus channel usage across all channels.
>>
>> This patch advertises num_present_cpus() as number of hardware queues.
>> This will have upper layer setup 1:1 mapping between hardware queue
>> and CPU queue and avoid unnecessary locking when issuing I/O.
>>
>> Changes:
>> v2: rely on default upper layer function to map queues. (suggested by
>> Ming Lei
>> )
>> v3: use num_present_cpus() instead of num_online_cpus(). Hyper-v
>> doesn't support hot-add CPUs. (suggested by Michael Kelley
>> )
>
>I've mostly seen the "Changes:" section placed below the "---" so that it
>doesn't clutter up the commit log.  But maybe there's not a strong
>requirement one way or the other as I didn't find anything called out in the
>"Documentation/process"
>directory.

Should I resubmit the patch (but keep it v3)?

>
>Michael
>
>>
>> Signed-off-by: Long Li 
>> ---
>>  drivers/scsi/storvsc_drv.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
>> index b89269120a2d..cf987712041a 100644
>> --- a/drivers/scsi/storvsc_drv.c
>> +++ b/drivers/scsi/storvsc_drv.c
>> @@ -1836,8 +1836,7 @@ static int storvsc_probe(struct hv_device *device,
>>  /*
>>   * Set the number of HW queues we are supporting.
>>   */
>> -if (stor_device->num_sc != 0)
>> -host->nr_hw_queues = stor_device->num_sc + 1;
>> +host->nr_hw_queues = num_present_cpus();
>>
>>  /*
>>   * Set the error handler work queue.
>> --
>> 2.17.1



Re: [PATCH v3 1/2] dt-bindings: PCI: intel: Add YAML schemas for the PCIe RC controller

2019-09-05 Thread Chuan Hua, Lei



On 9/6/2019 4:31 AM, Martin Blumenstingl wrote:

Hi Dilip,

On Wed, Sep 4, 2019 at 12:11 PM Dilip Kota  wrote:
[...]

+properties:
+  compatible:
+const: intel,lgm-pcie

should we add the "snps,dw-pcie" here (and in the example below) as well?
(this is what for example
Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt does)

Thanks for pointing this out. We should add this.


[...]

+  phy-names:
+const: pciephy

the most popular choice in Documentation/devicetree/bindings/pci/ is "pcie-phy"
if Rob is happy with "pciephy" (which is already part of two other
bindings) then I'm happy with "pciephy" as well

Agree.



+  num-lanes:
+description: Number of lanes to use for this port.

are there SoCs with more than 2 lanes?
you can list the allowed values in an enum so "num-lanes = <16>"
causes an error when someone accidentally has this in their .dts (and
runs the dt-bindings validation)
Our SoC (LGM) supports single lane or dual lane. Again, this also depends
on the board. I wonder if we should put this into the board-specific dts.
To make multiple lanes work properly, it also depends on the phy mode.
In my internal version, I put it into the board dts.


[...]

+  reset-assert-ms:

maybe add:
   $ref: /schemas/types.yaml#/definitions/uint32

Agree

+description: |
+  Device reset interval in ms.
+  Some devices need an interval upto 500ms. By default it is 100ms.
+
+required:
+  - compatible
+  - device_type
+  - reg
+  - reg-names
+  - ranges
+  - resets
+  - clocks
+  - phys
+  - phy-names
+  - reset-gpios
+  - num-lanes
+  - linux,pci-domain
+  - interrupts
+  - interrupt-map
+  - interrupt-map-mask
+
+additionalProperties: false
+
+examples:
+  - |
+pcie10:pcie@d0e0 {
+  compatible = "intel,lgm-pcie";
+  device_type = "pci";
+  #address-cells = <3>;
+  #size-cells = <2>;
+  reg = <
+0xd0e0 0x1000
+0xd200 0x80
+0xd0a41000 0x1000
+>;
+  reg-names = "dbi", "config", "app";
+  linux,pci-domain = <0>;
+  max-link-speed = <4>;
+  bus-range = <0x00 0x08>;
+  interrupt-parent = <>;
+  interrupts = <67 1>;
+  #interrupt-cells = <1>;
+  interrupt-map-mask = <0 0 0 0x7>;
+  interrupt-map = <0 0 0 1  27 1>,
+  <0 0 0 2  28 1>,
+  <0 0 0 3  29 1>,
+  <0 0 0 4  30 1>;

is the "1" in the interrupts and interrupt-map properties IRQ_TYPE_EDGE_RISING?
you can use these macros in this example as well, see
Documentation/devicetree/bindings/iio/accel/adi,adxl372.yaml for
example


No. 1 here means index from arch/x86/devicetree.c

static struct of_ioapic_type of_ioapic_type[] =
{
    {
        .out_type    = IRQ_TYPE_EDGE_RISING,
        .trigger    = IOAPIC_EDGE,
        .polarity    = 1,
    },
    {
        .out_type    = IRQ_TYPE_LEVEL_LOW,
        .trigger    = IOAPIC_LEVEL,
        .polarity    = 0,
    },
    {
        .out_type    = IRQ_TYPE_LEVEL_HIGH,
        .trigger    = IOAPIC_LEVEL,
        .polarity    = 1,
    },
    {
        .out_type    = IRQ_TYPE_EDGE_FALLING,
        .trigger    = IOAPIC_EDGE,
        .polarity    = 0,
    },
};

static int dt_irqdomain_alloc(struct irq_domain *domain, unsigned int virq,
              unsigned int nr_irqs, void *arg)
{
    struct irq_fwspec *fwspec = (struct irq_fwspec *)arg;
    struct of_ioapic_type *it;
    struct irq_alloc_info tmp;
    int type_index;

    if (WARN_ON(fwspec->param_count < 2))
        return -EINVAL;

    type_index = fwspec->param[1]; // index.
    if (type_index >= ARRAY_SIZE(of_ioapic_type))
        return -EINVAL;

I would not say this definition is user-friendly, but it is how x86
handles it at the moment.
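
To make the lookup above concrete, here is a small standalone sketch of
how the second interrupt cell selects a table entry. The table order is
copied from the of_ioapic_type[] excerpt quoted in this mail and the
IRQ_TYPE_* values match include/linux/irq.h; the helper
ioapic_cell_to_type() is a hypothetical name used only for illustration.

```c
#include <stddef.h>

/* Trigger type values as defined in include/linux/irq.h. */
enum {
	IRQ_TYPE_EDGE_RISING	= 0x1,
	IRQ_TYPE_EDGE_FALLING	= 0x2,
	IRQ_TYPE_LEVEL_HIGH	= 0x4,
	IRQ_TYPE_LEVEL_LOW	= 0x8,
};

/* Same ordering as the of_ioapic_type[] table quoted above. */
static const unsigned int ioapic_out_type[] = {
	IRQ_TYPE_EDGE_RISING,	/* cell value 0 */
	IRQ_TYPE_LEVEL_LOW,	/* cell value 1 */
	IRQ_TYPE_LEVEL_HIGH,	/* cell value 2 */
	IRQ_TYPE_EDGE_FALLING,	/* cell value 3 */
};

/*
 * Hypothetical helper: translate the second interrupt cell into a
 * trigger type. Returns 0 when the index is out of range, where the
 * kernel code quoted above returns -EINVAL instead.
 */
static unsigned int ioapic_cell_to_type(unsigned int cell)
{
	size_t n = sizeof(ioapic_out_type) / sizeof(ioapic_out_type[0]);

	return cell < n ? ioapic_out_type[cell] : 0;
}
```

With this ordering, the "1" used in the example binding would select
IRQ_TYPE_LEVEL_LOW rather than IRQ_TYPE_EDGE_RISING.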




Martin


Re: [PATCH 1/2] mm/kasan: dump alloc/free stack for page allocator

2019-09-05 Thread Walter Wu
On Thu, 2019-09-05 at 10:03 +0200, Vlastimil Babka wrote:
> On 9/4/19 4:24 PM, Walter Wu wrote:
> > On Wed, 2019-09-04 at 16:13 +0200, Vlastimil Babka wrote:
> >> On 9/4/19 4:06 PM, Walter Wu wrote:
> >>
> >> The THP fix is not required for the rest of the series, it was even merged 
> >> to
> >> mainline separately.
> >>
> >>> And It looks like something is different, because we only need last
> >>> stack of page, so it can decrease memory overhead.
> >>
> >> That would save you depot_stack_handle_t (which is u32) per page. I guess 
> >> that's
> >> nothing compared to KASAN overhead?
> >>
> > If we can use less memory, we can achieve what we want. Why not?
> 
> In my experience to solve some UAFs, it's important to know not only the
> freeing stack, but also the allocating stack. Do they make sense together,
> or not? In some cases, even longer history of alloc/free would be nice :)
> 
We think the free stack alone is enough to find the root cause. Maybe we
can refer to other people's experience and ideas.


> Also by simply recording the free stack in the existing depot handle,
> you might confuse existing page_owner file consumers, who won't know
> that this is a freeing stack.
> 
Don't worry about it.
1. Our feature option has a description of the last stack of a page.
When consumers enable our feature, they should be aware of the change.
2. We print a text message saying whether it is an alloc or free stack
before dumping the stack of the page, so consumers should know what it is.

> All that just doesn't seem to justify saving an u32 per page.

Actually, we want to slim down memory usage rather than increase it, as
discussed in another mail thread. Maybe a maintainer or reviewer can
provide some ideas. That would be great.

> > 
> > 
> 




[RFC PATCH 1/4] Fix: sched/membarrier: private expedited registration check

2019-09-05 Thread Mathieu Desnoyers
Fix a logic flaw in the way membarrier_register_private_expedited()
handles ready state checks for private expedited sync core and private
expedited registrations.

If a private expedited membarrier registration is first performed, and
then a private expedited sync_core registration is performed, the ready
state check will skip the second registration when it really should not.
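
The difference between the two forms of the check can be sketched in
isolation as below. This assumes the requested mask may contain more than
one bit; READY_A and READY_B are placeholder flags for illustration, not
the kernel's MEMBARRIER_STATE_* values, and the helper names are made up.

```c
#include <stdbool.h>

/* Placeholder flag bits, for illustration only. */
#define READY_A	(1u << 0)
#define READY_B	(1u << 1)

/* "Any requested bit is set": the shape of the old test. */
static bool any_bit_set(unsigned int cur, unsigned int want)
{
	return (cur & want) != 0;
}

/* "Every requested bit is set": the shape of the new test. */
static bool all_bits_set(unsigned int cur, unsigned int want)
{
	return (cur & want) == want;
}
```

With cur = READY_A and want = READY_A | READY_B, any_bit_set() reports
true and would skip the work, while all_bits_set() reports false and lets
the missing bit be registered.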

Signed-off-by: Mathieu Desnoyers 
Cc: "Paul E. McKenney" 
Cc: Peter Zijlstra 
Cc: Oleg Nesterov 
Cc: "Eric W. Biederman" 
Cc: Linus Torvalds 
Cc: Russell King - ARM Linux admin 
Cc: Chris Metcalf 
Cc: Christoph Lameter 
Cc: Kirill Tkhai 
Cc: Mike Galbraith 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
---
 kernel/sched/membarrier.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index aa8d75804108..5110d91b1b0e 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -226,7 +226,7 @@ static int membarrier_register_private_expedited(int flags)
 * groups, which use the same mm. (CLONE_VM but not
 * CLONE_THREAD).
 */
-   if (atomic_read(&mm->membarrier_state) & state)
+   if ((atomic_read(&mm->membarrier_state) & state) == state)
return 0;
atomic_or(MEMBARRIER_STATE_PRIVATE_EXPEDITED, &mm->membarrier_state);
if (flags & MEMBARRIER_FLAG_SYNC_CORE)
-- 
2.17.1



[RFC PATCH 0/4] Membarrier fixes/cleanups

2019-09-05 Thread Mathieu Desnoyers
This series of fixes/cleanups is submitted for feedback. It takes care
of membarrier issues recently discussed.

Thanks,

Mathieu

Mathieu Desnoyers (4):
  Fix: sched/membarrier: private expedited registration check
  Cleanup: sched/membarrier: remove redundant check
  Cleanup: sched/membarrier: only sync_core before usermode for same mm
  Fix: sched/membarrier: p->mm->membarrier_state racy load

 include/linux/mm_types.h  |   7 +-
 include/linux/sched/mm.h  |   8 +--
 kernel/sched/core.c   |   1 +
 kernel/sched/membarrier.c | 141 +++---
 kernel/sched/sched.h  |  33 +
 5 files changed, 143 insertions(+), 47 deletions(-)

-- 
2.17.1



[RFC PATCH 4/4] Fix: sched/membarrier: p->mm->membarrier_state racy load

2019-09-05 Thread Mathieu Desnoyers
The membarrier_state field is located within the mm_struct, which
is not guaranteed to exist when used from runqueue-lock-free iteration
on runqueues by the membarrier system call.

Copy the membarrier_state from the mm_struct into the scheduler runqueue
in the scheduler prepare task switch when the scheduler switches between
mm.

When registering membarrier for mm, after setting the registration bit
in the mm membarrier state, issue a synchronize_rcu() to ensure the
scheduler observes the change. In order to take care of the case
where a runqueue keeps executing the target mm without swapping to
other mm, iterate over each runqueue and issue an IPI to copy the
membarrier_state from the mm_struct into each runqueue which has the
same mm whose state has just been modified.

Move the mm membarrier_state field closer to pgd in mm_struct so sched
switch prepare hits a cache line which is already touched by the
scheduler switch_mm.

membarrier_execve() hook now needs to clear the runqueue's membarrier
state in addition to clearing the mm membarrier state, so move its
implementation into the scheduler membarrier code so it can access the
runqueue structure.

As suggested by Linus, move all membarrier.c RCU read-side locks outside
of the for each cpu loops. In fact, it turns out that
membarrier_global_expedited() does not need the RCU read-side critical
section because it does not need to use task_rcu_dereference() anymore,
relying instead on the runqueue's membarrier_state.

Suggested-by: Linus Torvalds 
Signed-off-by: Mathieu Desnoyers 
Cc: "Paul E. McKenney" 
Cc: Peter Zijlstra 
Cc: Oleg Nesterov 
Cc: "Eric W. Biederman" 
Cc: Linus Torvalds 
Cc: Russell King - ARM Linux admin 
Cc: Chris Metcalf 
Cc: Christoph Lameter 
Cc: Kirill Tkhai 
Cc: Mike Galbraith 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
---
 include/linux/mm_types.h  |   7 +-
 include/linux/sched/mm.h  |   6 +-
 kernel/sched/core.c   |   1 +
 kernel/sched/membarrier.c | 141 +++---
 kernel/sched/sched.h  |  33 +
 5 files changed, 141 insertions(+), 47 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6a7a1083b6fb..7020572eb605 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -382,6 +382,9 @@ struct mm_struct {
unsigned long task_size;/* size of task vm space */
unsigned long highest_vm_end;   /* highest vma end address */
pgd_t * pgd;
+#ifdef CONFIG_MEMBARRIER
+   atomic_t membarrier_state;
+#endif
 
/**
 * @mm_users: The number of users including userspace.
@@ -452,9 +455,7 @@ struct mm_struct {
unsigned long flags; /* Must use atomic bitops to access */
 
struct core_state *core_state; /* coredumping support */
-#ifdef CONFIG_MEMBARRIER
-   atomic_t membarrier_state;
-#endif
+
 #ifdef CONFIG_AIO
spinlock_t  ioctx_lock;
struct kioctx_table __rcu   *ioctx_table;
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 8557ec664213..7070dbef8066 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -370,10 +370,8 @@ static inline void membarrier_mm_sync_core_before_usermode(struct mm_struct *mm)
sync_core_before_usermode();
 }
 
-static inline void membarrier_execve(struct task_struct *t)
-{
-   atomic_set(&t->mm->membarrier_state, 0);
-}
+extern void membarrier_execve(struct task_struct *t);
+
 #else
 #ifdef CONFIG_ARCH_HAS_MEMBARRIER_CALLBACKS
 static inline void membarrier_arch_switch_mm(struct mm_struct *prev,
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 010d578118d6..1cffc1aa403c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3038,6 +3038,7 @@ prepare_task_switch(struct rq *rq, struct task_struct *prev,
perf_event_task_sched_out(prev, next);
rseq_preempt(prev);
fire_sched_out_preempt_notifiers(prev, next);
+   membarrier_prepare_task_switch(rq, prev, next);
prepare_task(next);
prepare_arch_switch(next);
 }
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index 7e0a0d6535f3..5744c300d29e 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -30,6 +30,28 @@ static void ipi_mb(void *info)
smp_mb();   /* IPIs should be serializing but paranoid. */
 }
 
+static void ipi_sync_rq_state(void *info)
+{
+   struct mm_struct *mm = (struct mm_struct *) info;
+
+   if (current->mm != mm)
+   return;
+   WRITE_ONCE(this_rq()->membarrier_state,
+  atomic_read(&mm->membarrier_state));
+   /*
+* Issue a memory barrier after setting MEMBARRIER_STATE_GLOBAL_EXPEDITED
+* in the current runqueue to guarantee that no memory access following
+* registration is reordered before registration.
+*/
+   smp_mb();
+}
+
+void 

[RFC PATCH 2/4] Cleanup: sched/membarrier: remove redundant check

2019-09-05 Thread Mathieu Desnoyers
Checking that the number of threads is 1 is redundant with checking
mm_users == 1.

Suggested-by: Oleg Nesterov 
Signed-off-by: Mathieu Desnoyers 
Cc: "Paul E. McKenney" 
Cc: Peter Zijlstra 
Cc: Oleg Nesterov 
Cc: "Eric W. Biederman" 
Cc: Linus Torvalds 
Cc: Russell King - ARM Linux admin 
Cc: Chris Metcalf 
Cc: Christoph Lameter 
Cc: Kirill Tkhai 
Cc: Mike Galbraith 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
---
 kernel/sched/membarrier.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index 5110d91b1b0e..7e0a0d6535f3 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -186,7 +186,7 @@ static int membarrier_register_global_expedited(void)
MEMBARRIER_STATE_GLOBAL_EXPEDITED_READY)
return 0;
atomic_or(MEMBARRIER_STATE_GLOBAL_EXPEDITED, &mm->membarrier_state);
-   if (atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1) {
+   if (atomic_read(&mm->mm_users) == 1) {
/*
 * For single mm user, single threaded process, we can
 * simply issue a memory barrier after setting
@@ -232,7 +232,7 @@ static int membarrier_register_private_expedited(int flags)
if (flags & MEMBARRIER_FLAG_SYNC_CORE)
atomic_or(MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE,
  &mm->membarrier_state);
-   if (!(atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1)) {
+   if (atomic_read(&mm->mm_users) != 1) {
/*
 * Ensure all future scheduler executions will observe the
 * new thread flag state for this process.
-- 
2.17.1



[RFC PATCH 3/4] Cleanup: sched/membarrier: only sync_core before usermode for same mm

2019-09-05 Thread Mathieu Desnoyers
When the prev and next task's mm change, switch_mm() provides the core
serializing guarantees before returning to usermode. The only case
where an explicit core serialization is needed is when the scheduler
keeps the same mm for prev and next.

Suggested-by: Oleg Nesterov 
Signed-off-by: Mathieu Desnoyers 
Cc: "Paul E. McKenney" 
Cc: Peter Zijlstra 
Cc: Oleg Nesterov 
Cc: "Eric W. Biederman" 
Cc: Linus Torvalds 
Cc: Russell King - ARM Linux admin 
Cc: Chris Metcalf 
Cc: Christoph Lameter 
Cc: Kirill Tkhai 
Cc: Mike Galbraith 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
---
 include/linux/sched/mm.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 4a7944078cc3..8557ec664213 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -362,6 +362,8 @@ enum {
 
 static inline void membarrier_mm_sync_core_before_usermode(struct mm_struct *mm)
 {
+   if (current->mm != mm)
+   return;
if (likely(!(atomic_read(&mm->membarrier_state) &
 MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE)))
return;
-- 
2.17.1



Re: [RFC v2 1/3] cpufreq: ti-cpufreq: add support for omap34xx and omap36xx

2019-09-05 Thread Viresh Kumar
On 05-09-19, 07:32, Tony Lindgren wrote:
> Acked-by: Tony Lindgren 

Do you want to pick the series instead as this has lots of DT changes
?

-- 
viresh


[PATCH] net: Remove the source address setting in connect() for UDP

2019-09-05 Thread Enke Chen
The connect() system call for a UDP socket is for setting the destination
address and port. But the current code mistakenly sets the source address
for the socket as well. Remove the source address setting in connect() for
UDP in this patch.

Implications of the bug:

  - Packet drop:

On a multi-homed device, an address assigned to any interface may
qualify as a source address when originating a packet. If needed, the
IP_PKTINFO option can be used to explicitly specify the source address.
But with the source address being mistakenly set for the socket in
connect(), a return packet (for the socket) destined to an interface
address different from that source address would be wrongly dropped
due to address mismatch.

This can be reproduced easily. The dropped packets are shown in the
following output by "netstat -s" for UDP:

  xxx packets to unknown port received

  - Source address selection:

The source address, if unspecified via "bind()" or IP_PKTINFO, should
be determined by routing at the time of packet origination, and not at
the time when the connect() call is made. The difference matters as
routing can change, e.g., by interface down/up events, and using a
source address of a "down" interface is known to be problematic.

There is no backward compatibility issue here as the source address setting
in connect() is not needed anyway.

  - No impact on the source address selection when the source address
is explicitly specified by "bind()", or by the "IP_PKTINFO" option.

  - In the case that the source address is not explicitly specified,
the selection of the source address would be more accurate and
reliable based on the up-to-date routing table.
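
The behaviour discussed above can be observed from userspace with a small
helper like the one below. It is only a sketch: the function name
udp_local_addr_after_connect() is hypothetical, and what getsockname()
reports after connect() depends on the running kernel and its routing
table. On a kernel which sets the source address in connect(), an unbound
socket is expected to report a concrete local address rather than 0.0.0.0.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Connect an unbound UDP socket to dst_ip:dst_port and report the local
 * address the kernel associates with the socket afterwards.
 * Returns 0 on success, -1 on any failure.
 */
static int udp_local_addr_after_connect(const char *dst_ip,
					unsigned short dst_port,
					struct sockaddr_in *local)
{
	struct sockaddr_in dst;
	socklen_t len = sizeof(*local);
	int ret = -1;
	int fd;

	assert(dst_ip != NULL && local != NULL);

	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(dst_port);
	if (inet_pton(AF_INET, dst_ip, &dst.sin_addr) != 1)
		return -1;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	/* No bind() and no IP_PKTINFO: the source address is unspecified. */
	if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) == 0 &&
	    getsockname(fd, (struct sockaddr *)local, &len) == 0)
		ret = 0;

	close(fd);
	return ret;
}
```

Printing local->sin_addr with inet_ntop() before and after applying the
patch should make the difference visible. No packet is sent by connect()
on a UDP socket, so the helper does not need a listening peer.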

Signed-off-by: Enke Chen 
---
 net/ipv4/datagram.c |  7 ---
 net/ipv6/datagram.c | 15 +--
 2 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
index f915abff1350..4065808ec6c1 100644
--- a/net/ipv4/datagram.c
+++ b/net/ipv4/datagram.c
@@ -64,13 +64,6 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
err = -EACCES;
goto out;
}
-   if (!inet->inet_saddr)
-   inet->inet_saddr = fl4->saddr;  /* Update source address */
-   if (!inet->inet_rcv_saddr) {
-   inet->inet_rcv_saddr = fl4->saddr;
-   if (sk->sk_prot->rehash)
-   sk->sk_prot->rehash(sk);
-   }
inet->inet_daddr = fl4->daddr;
inet->inet_dport = usin->sin_port;
sk->sk_state = TCP_ESTABLISHED;
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index ecf440a4f593..80388cd50dc3 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -197,19 +197,6 @@ int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr,
goto out;
 
ipv6_addr_set_v4mapped(inet->inet_daddr, &sk->sk_v6_daddr);
-
-   if (ipv6_addr_any(&np->saddr) ||
-   ipv6_mapped_addr_any(&np->saddr))
-   ipv6_addr_set_v4mapped(inet->inet_saddr, &np->saddr);
-
-   if (ipv6_addr_any(&sk->sk_v6_rcv_saddr) ||
-   ipv6_mapped_addr_any(&sk->sk_v6_rcv_saddr)) {
-   ipv6_addr_set_v4mapped(inet->inet_rcv_saddr,
-  &sk->sk_v6_rcv_saddr);
-   if (sk->sk_prot->rehash)
-   sk->sk_prot->rehash(sk);
-   }
-
goto out;
}
 
@@ -247,7 +234,7 @@ int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr,
 *  destination cache for it.
 */
 
-   err = ip6_datagram_dst_update(sk, true);
+   err = ip6_datagram_dst_update(sk, false);
if (err) {
/* Restore the socket peer info, to keep it consistent with
 * the old socket state
-- 
2.19.1



Re: [RFC v2 1/3] cpufreq: ti-cpufreq: add support for omap34xx and omap36xx

2019-09-05 Thread Viresh Kumar
On 05-09-19, 07:32, Tony Lindgren wrote:
> * H. Nikolaus Schaller  [190904 08:54]:
> > This adds code and tables to read the silicon revision and
> > eFuse (speed binned / 720 MHz grade) bits for selecting
> > opp-v2 table entries.
> > 
> > Since these bits are not always part of the syscon register
> > range (like for am33xx, am43, dra7), we add code to directly
> > read the register values using ioremap() if syscon access fails.
> 
> This is nice :) Seems to work for me based on a quick test
> on at least omap36xx.
> 
> Looks like n900 produces the following though:
> 
> core: _opp_supported_by_regulators: OPP minuV: 127 maxuV: 127, not 
> supported by regulator
> cpu cpu0: _opp_add: OPP not supported by regulators (55000)

That's a DT thing I believe where the voltage doesn't fit what the
regulator can support.

> But presumably that can be further patched. So for this
> patch:
> 
> Acked-by: Tony Lindgren 

Thanks.

-- 
viresh


Re: [PATCH v2] mm: emit tracepoint when RSS changes by threshold

2019-09-05 Thread Joel Fernandes
On Thu, Sep 05, 2019 at 06:15:43PM -0700, Daniel Colascione wrote:
[snip]
> > > > > > > The bigger improvement with the threshold is the number of trace 
> > > > > > > records are
> > > > > > > almost halved by using a threshold. The number of records went 
> > > > > > > from 4.6K to
> > > > > > > 2.6K.
> > > > > >
> > > > > > Steven, would it be feasible to add a generic tracepoint throttling?
> > > > >
> > > > > I might misunderstand this but is the issue here actually throttling
> > > > > of the sheer number of trace records or tracing large enough changes
> > > > > to RSS that user might care about? Small changes happen all the time
> > > > > but we are likely not interested in those. Surely we could postprocess
> > > > > the traces to extract changes large enough to be interesting but why
> > > > > capture uninteresting information in the first place? IOW the
> > > > > throttling here should be based not on the time between traces but on
> > > > > the amount of change of the traced signal. Maybe a generic facility
> > > > > like that would be a good idea?
> > > >
> > > > You mean like add a trigger (or filter) that only traces if a field has
> > > > changed since the last time the trace was hit? Hmm, I think we could
> > > > possibly do that. Perhaps even now with histogram triggers?
> > >
> > > I was thinking along the same lines. The histogram subsystem seems
> > > like a very good fit here. Histogram triggers already let users talk
> > > about specific fields of trace events, aggregate them in configurable
> > > ways, and (importantly, IMHO) create synthetic new trace events that
> > > the kernel emits under configurable conditions.
> >
> > Hmm, I think this tracing feature will be a good idea. But in order not to
> > gate this patch, can we agree on keeping a temporary threshold for this
> > patch? Once such idea is implemented in trace subsystem, then we can remove
> > the temporary filter.
> >
> > As Tim said, we don't want our traces flooded and this is a very useful
> > tracepoint as proven in our internal usage at Android. The threshold filter
> > is just few lines of code.
> 
> I'm not sure the threshold filtering code you've added does the right
> thing: we don't keep state, so if a counter constantly flips between
> one "side" of the TRACE_MM_COUNTER_THRESHOLD and the other, we'll emit
> ftrace events at high frequency. More generally, this filtering
> couples the rate of counter logging to the *value* of the counter ---
> that is, we log ftrace events at different times depending on how much
> memory we happen to have used --- and that's not ideal from a
> predictability POV.
> 
> All things being equal, I'd prefer that we get things upstream as fast
> as possible. But in this case, I'd rather wait for a general-purpose
> filtering facility (whether that facility is based on histogram, eBPF,
> or something else) rather than hardcode one particular fixed filtering
> strategy (which might be suboptimal) for one particular kind of event.
> Is there some special urgency here?
> 
> How about we instead add non-filtered tracepoints for the mm counters?
> These tracepoints will still be free when turned off.
> 
> Having added the basic tracepoints, we can discuss separately how to
> do the rate limiting. Maybe instead of providing direct support for
> the algorithm that I described above, we can just use a BPF program as
> a yes/no predicate for whether to log to ftrace. That'd get us to the
> same place as this patch, but more flexibly, right?

I chatted with Daniel offline and we agreed on removing the threshold, which
is also what Michal wants.

So I'll be resubmitting this patch with the threshold removed, and we'll look
into doing the filtering through more generic mechanisms such as BPF.
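
For reference, the two filtering ideas discussed in this thread can be
compared with a small userspace sketch. This is not the code from the
patch: it assumes the threshold filter fires whenever an update moves the
counter across a multiple of the threshold, and MODEL_THRESHOLD,
crossing_filter() and struct delta_filter are hypothetical names. The
second filter keeps state and follows the suggestion to throttle on the
amount of change of the traced value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_THRESHOLD 128	/* hypothetical value, must be a power of two */

/* Stateless: fires when an update crosses a multiple of the threshold. */
static bool crossing_filter(long old_count, long new_count)
{
	long mask = ~((long)MODEL_THRESHOLD - 1);

	return (old_count & mask) != (new_count & mask);
}

/* Stateful: fires once the counter has moved far enough from the last report. */
struct delta_filter {
	long last_reported;
};

static bool delta_filter_update(struct delta_filter *f, long new_count)
{
	assert(f != NULL);
	if (labs(new_count - f->last_reported) < MODEL_THRESHOLD)
		return false;
	f->last_reported = new_count;
	return true;
}
```

With these definitions, a counter which keeps flipping between 127 and 128
triggers crossing_filter() on every update, while delta_filter_update()
reports at most once until the value has moved by another full threshold.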

thanks all!

 - Joel



Re: [PATCH vfs/for-next] vfs: fix vfs_get_single_reconf_super error handling

2019-09-05 Thread Eric Biggers
On Fri, Aug 30, 2019 at 10:10:24PM -0500, Eric Biggers wrote:
> From: Eric Biggers 
> 
> syzbot reported an invalid free in debugfs_release_dentry().  The
> reproducer tries to mount debugfs with the 'dirsync' option, which is
> not allowed.  The bug is that if reconfigure_super() fails in
> vfs_get_super(), deactivate_locked_super() is called, but also
> fs_context::root is left non-NULL which causes deactivate_super() to be
> called again later.
> 
> Fix it by releasing fs_context::root in the error path.
> 
> Reported-by: syzbot+5aca688dac0796c56...@syzkaller.appspotmail.com
> Fixes: e478b48498a7 ("vfs: Add a single-or-reconfig keying to 
> vfs_get_super()")
> Cc: David Howells 
> Signed-off-by: Eric Biggers 
> ---
>  fs/super.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/super.c b/fs/super.c
> index 0f913376fc4c..99195e15be05 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1194,8 +1194,11 @@ int vfs_get_super(struct fs_context *fc,
>   fc->root = dget(sb->s_root);
>   if (keying == vfs_get_single_reconf_super) {
>   err = reconfigure_super(fc);
> - if (err < 0)
> + if (err < 0) {
> + dput(fc->root);
> + fc->root = NULL;
>   goto error;
> + }
>   }
>   }
>  

Ping.  This is still broken in linux-next.

- Eric
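
For anyone following along, the double release described in the quoted
commit message can be modelled in a few lines of userspace C. This is a
sketch under simplifying assumptions, not the fs/super.c code: the
model_* names are hypothetical, a single counter stands in for the
superblock's active reference, and a non-NULL root_sb stands in for
fs_context::root being set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_sb {
	int active;			/* stands in for the active reference */
};

struct model_fc {
	struct model_sb *root_sb;	/* non-NULL models fc->root being set */
};

static void model_deactivate(struct model_sb *sb)
{
	sb->active--;
}

/* Error path taken when reconfiguration fails. */
static void model_reconfigure_failed(struct model_fc *fc, struct model_sb *sb,
				     bool with_fix)
{
	assert(fc != NULL && sb != NULL);
	if (with_fix)
		fc->root_sb = NULL;	/* models dput(fc->root); fc->root = NULL; */
	model_deactivate(sb);		/* models deactivate_locked_super() */
}

/* Later teardown of the context: drops the superblock if a root is still set. */
static void model_put_context(struct model_fc *fc)
{
	if (fc->root_sb) {
		model_deactivate(fc->root_sb);
		fc->root_sb = NULL;
	}
}
```

Without the fix the counter drops below zero in this model, which
corresponds to the second deactivation. With the fix the teardown finds no
root and leaves the superblock alone.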


Re: linux-next: build failure after merge of the net-next tree

2019-09-05 Thread Masahiro Yamada
On Fri, Sep 6, 2019 at 4:26 AM Andrii Nakryiko
 wrote:
>
> On Tue, Sep 3, 2019 at 11:20 PM Masahiro Yamada
>  wrote:
> >
> > On Wed, Sep 4, 2019 at 3:00 PM Stephen Rothwell  
> > wrote:
> > >
> > > Hi all,
> > >
> > > After merging the net-next tree, today's linux-next build (arm
> > > multi_v7_defconfig) failed like this:
> > >
> > > scripts/link-vmlinux.sh: 74: Bad substitution
> > >
> > > Caused by commit
> > >
> > >   341dfcf8d78e ("btf: expose BTF info through sysfs")
> > >
> > > interacting with commit
> > >
> > >   1267f9d3047d ("kbuild: add $(BASH) to run scripts with bash-extension")
> > >
> > > from the kbuild tree.
> >
> >
> > I knew that they were using bash-extension
> > in the #!/bin/sh script.  :-D
> >
> > In fact, I wrote my patch in order to break their code
> > and  make btf people realize that they were doing wrong.
>
> Was there a specific reason to wait until this would break during
> Stephen's merge, instead of giving me a heads up (or just replying on
> original patch) and letting me fix it and save everyone's time and
> efforts?
>
> Either way, I've fixed the issue in
> https://patchwork.ozlabs.org/patch/1158620/ and will pay way more
> attention to BASH-specific features going forward (I found it pretty
> hard to verify stuff like this, unfortunately). But again, code review
> process is the best place to catch this and I really hope in the
> future we can keep this process productive. Thanks!

I could have pointed it out if I had noticed
it in the review process.

I actually noticed your patch through Stephen's
earlier email (i.e. when it appeared in linux-next).

(I try my best to check kbuild ML, and also search for
my name in LKML in case I am explicitly addressed,
but a large number of emails fall off my filter)

It was somewhat too late when I noticed it.
Of course, I could still have emailed you afterward, or even sent a patch to the btf ML,
but I did not fix that particular instance of breakage
because the same type of breakage already exists elsewhere in the code base.

Then, I applied the all-or-nothing checker because I thought it was
the only way to address the root cause of the problems.

I admit I could have handled the process better.
Sorry if I made people uncomfortable and wasted their time.

Thanks.




> >
> >
> >
> > > The change in the net-next tree turned link-vmlinux.sh into a bash script
> > > (I think).
> > >
> > > I have applied the following patch for today:
> >
> >
> > But, this is a temporary fix only for linux-next.
> >
> > scripts/link-vmlinux.sh does not need to use the
> > bash-extension ${@:2} in the first place.
> >
> > I hope btf people will write the correct code.
>
> I replaced ${@:2} with shift and ${@}, I hope that's a correct fix,
> but if you think it's not, please reply on the patch and let me know.
>
>
> >
> > Thanks.
> >
> >
> >
> >
> > > From: Stephen Rothwell 
> > > Date: Wed, 4 Sep 2019 15:43:41 +1000
> > > Subject: [PATCH] link-vmlinux.sh is now a bash script
> > >
> > > Signed-off-by: Stephen Rothwell 
> > > ---
> > >  Makefile| 4 ++--
> > >  scripts/link-vmlinux.sh | 2 +-
> > >  2 files changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/Makefile b/Makefile
> > > index ac97fb282d99..523d12c5cebe 100644
> > > --- a/Makefile
> > > +++ b/Makefile
> > > @@ -1087,7 +1087,7 @@ ARCH_POSTLINK := $(wildcard 
> > > $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
> > >
> > >  # Final link of vmlinux with optional arch pass after final link
> > >  cmd_link-vmlinux = \
> > > -   $(CONFIG_SHELL) $< $(LD) $(KBUILD_LDFLAGS) $(LDFLAGS_vmlinux) ;   
> > >  \
> > > +   $(BASH) $< $(LD) $(KBUILD_LDFLAGS) $(LDFLAGS_vmlinux) ;\
> > > $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
> > >
> > >  vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) 
> > > FORCE
> > > @@ -1403,7 +1403,7 @@ clean: rm-files := $(CLEAN_FILES)
> > >  PHONY += archclean vmlinuxclean
> > >
> > >  vmlinuxclean:
> > > -   $(Q)$(CONFIG_SHELL) $(srctree)/scripts/link-vmlinux.sh clean
> > > +   $(Q)$(BASH) $(srctree)/scripts/link-vmlinux.sh clean
> > > $(Q)$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) clean)
> > >
> > >  clean: archclean vmlinuxclean
> > > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> > > index f7edb75f9806..ea1f8673869d 100755
> > > --- a/scripts/link-vmlinux.sh
> > > +++ b/scripts/link-vmlinux.sh
> > > @@ -1,4 +1,4 @@
> > > -#!/bin/sh
> > > +#!/bin/bash
> > >  # SPDX-License-Identifier: GPL-2.0
> > >  #
> > >  # link vmlinux
> > > --
> > > 2.23.0.rc1
> > >
> > > --
> > > Cheers,
> > > Stephen Rothwell
> >
> >
> >
> > --
> > Best Regards
> > Masahiro Yamada



--
Best Regards
Masahiro Yamada


Re: [PATCH] net/skbuff: silence warnings under memory pressure

2019-09-05 Thread Sergey Senozhatsky
On (09/05/19 13:14), Steven Rostedt wrote:
> > Hmm, from the article,
> > 
> > https://en.wikipedia.org/wiki/Universal_asynchronous_receiver-transmitter
> > 
> > "Since transmission of a single or multiple characters may take a long time
> > relative to CPU speeds, a UART maintains a flag showing busy status so that 
> > the
> > host system knows if there is at least one character in the transmit buffer 
> > or
> > shift register; "ready for next character(s)" may also be signaled with an
> > interrupt."
> 
> I'm pretty sure all serial consoles do a busy loop on the UART and not
> use interrupts to notify when it's available.

Yes. Besides, we call console drivers with local IRQs disabled.

-ss


Re: [PATCH] rtl8xxxu: add bluetooth co-existence support for single antenna

2019-09-05 Thread Chris Chiu
On Tue, Sep 3, 2019 at 1:37 PM Chris Chiu  wrote:
>
> The RTL8723BU suffers from wifi disconnections while a bluetooth
> device is connected. While wifi is doing tx/rx, bluetooth scans
> return no results. This is because wifi and bluetooth share the same
> single antenna for RF communication and need a mechanism
> to collaborate.
>
> BT information is provided via the packet sent from co-processor to
> host (C2H). It contains the status of BT but the rtl8723bu_handle_c2h
> does not really handle it. And there's no bluetooth coexistence
> mechanism to deal with it.
>
> This commit adds a workqueue to set the tdma configurations and
> coefficient table per the parsed bluetooth link status and given
> wifi connection state. The tdma/coef table comes from the vendor
> driver code of the RTL8192EU and RTL8723BU. However, this commit
> only covers the single antenna scenario, whereas the RTL8192EU is
> dual antenna by default. The rtl8xxxu_parse_rxdesc24 which invokes
> the handle_c2h is only for 8723b and 8192e so the mechanism is
> expected to work on both chips with single antenna. Note RTL8192EU
> dual antenna is not supported.
>
> Signed-off-by: Chris Chiu 
> ---
>  .../net/wireless/realtek/rtl8xxxu/rtl8xxxu.h  |  37 +++
>  .../realtek/rtl8xxxu/rtl8xxxu_8723b.c |   2 -
>  .../wireless/realtek/rtl8xxxu/rtl8xxxu_core.c | 243 +-
>  3 files changed, 275 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h 
> b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
> index 582c2a346cec..22e95b11bfbb 100644
> --- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
> +++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
> @@ -1220,6 +1220,37 @@ enum ratr_table_mode_new {
> RATEID_IDX_BGN_3SS = 14,
>  };
>
> +#define BT_INFO_8723B_1ANT_B_FTP   BIT(7)
> +#define BT_INFO_8723B_1ANT_B_A2DP  BIT(6)
> +#define BT_INFO_8723B_1ANT_B_HID   BIT(5)
> +#define BT_INFO_8723B_1ANT_B_SCO_BUSY  BIT(4)
> +#define BT_INFO_8723B_1ANT_B_ACL_BUSY  BIT(3)
> +#define BT_INFO_8723B_1ANT_B_INQ_PAGE  BIT(2)
> +#define BT_INFO_8723B_1ANT_B_SCO_ESCO  BIT(1)
> +#define BT_INFO_8723B_1ANT_B_CONNECTIONBIT(0)
> +
> +enum _BT_8723B_1ANT_STATUS {
> +   BT_8723B_1ANT_STATUS_NON_CONNECTED_IDLE  = 0x0,
> +   BT_8723B_1ANT_STATUS_CONNECTED_IDLE  = 0x1,
> +   BT_8723B_1ANT_STATUS_INQ_PAGE= 0x2,
> +   BT_8723B_1ANT_STATUS_ACL_BUSY= 0x3,
> +   BT_8723B_1ANT_STATUS_SCO_BUSY= 0x4,
> +   BT_8723B_1ANT_STATUS_ACL_SCO_BUSY= 0x5,
> +   BT_8723B_1ANT_STATUS_MAX
> +};
> +
> +struct rtl8xxxu_btcoex {
> +   u8  bt_status;
> +   boolbt_busy;
> +   boolhas_sco;
> +   boolhas_a2dp;
> +   boolhas_hid;
> +   boolhas_pan;
> +   boolhid_only;
> +   boola2dp_only;
> +   boolc2h_bt_inquiry;
> +};
> +
>  #define RTL8XXXU_RATR_STA_INIT 0
>  #define RTL8XXXU_RATR_STA_HIGH 1
>  #define RTL8XXXU_RATR_STA_MID  2
> @@ -1340,6 +1371,10 @@ struct rtl8xxxu_priv {
>  */
> struct ieee80211_vif *vif;
> struct delayed_work ra_watchdog;
> +   struct work_struct c2hcmd_work;
> +   struct sk_buff_head c2hcmd_queue;
> +   spinlock_t c2hcmd_lock;
> +   struct rtl8xxxu_btcoex bt_coex;
>  };
>
>  struct rtl8xxxu_rx_urb {
> @@ -1486,6 +1521,8 @@ void rtl8xxxu_fill_txdesc_v2(struct ieee80211_hw *hw, 
> struct ieee80211_hdr *hdr,
>  struct rtl8xxxu_txdesc32 *tx_desc32, bool sgi,
>  bool short_preamble, bool ampdu_enable,
>  u32 rts_rate);
> +void rtl8723bu_set_ps_tdma(struct rtl8xxxu_priv *priv,
> +  u8 arg1, u8 arg2, u8 arg3, u8 arg4, u8 arg5);
>
>  extern struct rtl8xxxu_fileops rtl8192cu_fops;
>  extern struct rtl8xxxu_fileops rtl8192eu_fops;
> diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8723b.c 
> b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8723b.c
> index ceffe05bd65b..9ba661b3d767 100644
> --- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8723b.c
> +++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8723b.c
> @@ -1580,9 +1580,7 @@ static void rtl8723b_enable_rf(struct rtl8xxxu_priv 
> *priv)
> /*
>  * Software control, antenna at WiFi side
>  */
> -#ifdef NEED_PS_TDMA
> rtl8723bu_set_ps_tdma(priv, 0x08, 0x00, 0x00, 0x00, 0x00);
> -#endif
>
> rtl8xxxu_write32(priv, REG_BT_COEX_TABLE1, 0x);
> rtl8xxxu_write32(priv, REG_BT_COEX_TABLE2, 0x);
> diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c 
> b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
> index a6f358b9e447..4f72c2d14d44 100644
> --- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
> +++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
> @@ 

[RESEND PATCH v2] Bluetooth: Retry configure request if result is L2CAP_CONF_UNKNOWN

2019-09-05 Thread Andrey Smirnov
Due to:

 * Current implementation of l2cap_config_rsp() dropping BT
   connection if sender of configuration response replied with unknown
   option failure (Result=0x0003/L2CAP_CONF_UNKNOWN)

 * Current implementation of l2cap_build_conf_req() adding
   L2CAP_CONF_RFC(0x04) option to initial configure request sent by
   the Linux host.

devices that do not recognize L2CAP_CONF_RFC, such as Xbox One S
controllers, will get stuck in an endless connect -> configure ->
disconnect loop, never connect and be generally unusable.

To avoid this problem add code to do the following:

 1. Parse the body of response L2CAP_CONF_UNKNOWN and, in case of
unsupported option being RFC, clear L2CAP_FEAT_ERTM and
L2CAP_FEAT_STREAMING from connection's feature mask (in order to
prevent RFC option from being added going forward)

 2. Retry configuration step the same way it's done for
L2CAP_CONF_UNACCEPT
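
Step 1 above can be sketched in plain userspace C as follows. This is a
model of the idea rather than the code added by this patch: the MODEL_*
names and model_handle_unknown_options() are hypothetical, the numeric
values are mirrored from the kernel's l2cap.h, and the retry of step 2 is
left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirrored from l2cap.h; the MODEL_ names are hypothetical. */
#define MODEL_CONF_RFC		0x04
#define MODEL_CONF_HINT		0x80
#define MODEL_FEAT_ERTM		0x00000008
#define MODEL_FEAT_STREAMING	0x00000010

/*
 * Walk the option types listed in an "unknown options" response.
 * Returns 0 if a retry makes sense (feat_mask updated), -1 if the
 * connection should be dropped instead.
 */
static int model_handle_unknown_options(const uint8_t *types, size_t len,
					uint32_t *feat_mask)
{
	size_t i;

	assert(feat_mask != NULL);
	if (len == 0)
		return -1;

	for (i = 0; i < len; i++) {
		/* Hints must never be reported back as unknown. */
		if (types[i] & MODEL_CONF_HINT)
			return -1;
		/* Only the RFC option is handled; anything else is fatal. */
		if (types[i] != MODEL_CONF_RFC)
			return -1;
		/* Stop advertising the modes which need the RFC option. */
		*feat_mask &= ~(uint32_t)(MODEL_FEAT_ERTM | MODEL_FEAT_STREAMING);
	}
	return 0;
}
```

A return value of 0 is the point where the caller would rebuild and resend
the configure request, which no longer carries the RFC option once both
feature bits are cleared.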

Signed-off-by: Andrey Smirnov 
Cc: Pierre-Loup A. Griffais 
Cc: Florian Dollinger 
Cc: Marcel Holtmann 
Cc: Johan Hedberg 
Cc: linux-blueto...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---

Changes since [v1]:

   - Patch simplified to simply clear L2CAP_FEAT_ERTM |
 L2CAP_FEAT_STREAMING from feat_mask when device flags RFC options
 as unknown

[v1] lore.kernel.org/r/20190208025828.30901-1-andrew.smir...@gmail.com

 net/bluetooth/l2cap_core.c | 58 ++
 1 file changed, 58 insertions(+)

diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index dfc1edb168b7..77b65870b064 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -4216,6 +4216,49 @@ static inline int l2cap_config_req(struct l2cap_conn *conn,
return err;
 }
 
+static inline int l2cap_config_rsp_unknown(struct l2cap_conn *conn,
+  struct l2cap_chan *chan,
+  const u8 *data,
+  int len)
+{
+   char req[64];
+
+   if (!len || len > sizeof(req) -  sizeof(struct l2cap_conf_req))
+   return -ECONNRESET;
+
+   while (len--) {
+   const u8 option_type = *data++;
+
+   BT_DBG("chan %p, unknown option type: %u", chan,  option_type);
+
+   /* "...Hints shall not be included in the Response and
+* shall not be the sole cause for rejecting the
+* Request.."
+*/
+   if (option_type & L2CAP_CONF_HINT)
+   return -ECONNRESET;
+
+   switch (option_type) {
+   case L2CAP_CONF_RFC:
+   /* Clearing the following feature should
+* prevent RFC option from being added next
+* connection attempt
+*/
+   conn->feat_mask &= ~(L2CAP_FEAT_ERTM |
+L2CAP_FEAT_STREAMING);
+   break;
+   default:
+   return -ECONNRESET;
+   }
+   }
+
+   len = l2cap_build_conf_req(chan, req, sizeof(req));
+   l2cap_send_cmd(conn, l2cap_get_ident(conn), L2CAP_CONF_REQ, len, req);
+   chan->num_conf_req++;
+
+   return 0;
+}
+
 static inline int l2cap_config_rsp(struct l2cap_conn *conn,
   struct l2cap_cmd_hdr *cmd, u16 cmd_len,
   u8 *data)
@@ -4271,6 +4314,21 @@ static inline int l2cap_config_rsp(struct l2cap_conn *conn,
}
goto done;
 
+   case L2CAP_CONF_UNKNOWN:
+   if (chan->num_conf_rsp <= L2CAP_CONF_MAX_CONF_RSP) {
+   if (l2cap_config_rsp_unknown(conn, chan, rsp->data,
+len) < 0) {
+   l2cap_send_disconn_req(chan, ECONNRESET);
+   goto done;
+   }
+   break;
+   }
+   /* Once, chan->num_conf_rsp goes above
+* L2CAP_CONF_MAX_CONF_RSP we want to go down all the
+* way to default label (just like L2CAP_CONF_UNACCEPT
+* below)
+*/
+   /* fall through */
case L2CAP_CONF_UNACCEPT:
if (chan->num_conf_rsp <= L2CAP_CONF_MAX_CONF_RSP) {
char req[64];
-- 
2.21.0



[PATCH V8 5/5] mmc: host: sdhci-pci: Add Genesys Logic GL975x support

2019-09-05 Thread Ben Chuang
From: Ben Chuang 

Add support for the GL9750 and GL9755 chipsets.

Enable v4 mode and wait 5ms after setting the 1.8V signal enable for GL9750/
GL9755. Fix the value of the SDHCI_MAX_CURRENT register and use the vendor
tuning flow for GL9750.

Co-developed-by: Michael K Johnson 
Signed-off-by: Michael K Johnson 
Signed-off-by: Ben Chuang 
---
 drivers/mmc/host/Kconfig  |   1 +
 drivers/mmc/host/Makefile |   2 +-
 drivers/mmc/host/sdhci-pci-core.c |   2 +
 drivers/mmc/host/sdhci-pci-gli.c  | 355 ++
 drivers/mmc/host/sdhci-pci.h  |   5 +
 5 files changed, 364 insertions(+), 1 deletion(-)
 create mode 100644 drivers/mmc/host/sdhci-pci-gli.c

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 931770f17087..9fbfff514d6c 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -94,6 +94,7 @@ config MMC_SDHCI_PCI
depends on MMC_SDHCI && PCI
select MMC_CQHCI
select IOSF_MBI if X86
+   select MMC_SDHCI_IO_ACCESSORS
help
  This selects the PCI Secure Digital Host Controller Interface.
  Most controllers found today are PCI devices.
diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
index 73578718f119..661445415090 100644
--- a/drivers/mmc/host/Makefile
+++ b/drivers/mmc/host/Makefile
@@ -13,7 +13,7 @@ obj-$(CONFIG_MMC_MXS) += mxs-mmc.o
 obj-$(CONFIG_MMC_SDHCI)+= sdhci.o
 obj-$(CONFIG_MMC_SDHCI_PCI)+= sdhci-pci.o
 sdhci-pci-y+= sdhci-pci-core.o sdhci-pci-o2micro.o sdhci-pci-arasan.o \
-  sdhci-pci-dwc-mshc.o
+  sdhci-pci-dwc-mshc.o sdhci-pci-gli.o
 obj-$(subst m,y,$(CONFIG_MMC_SDHCI_PCI))   += sdhci-pci-data.o
 obj-$(CONFIG_MMC_SDHCI_ACPI)   += sdhci-acpi.o
 obj-$(CONFIG_MMC_SDHCI_PXAV3)  += sdhci-pxav3.o
diff --git a/drivers/mmc/host/sdhci-pci-core.c b/drivers/mmc/host/sdhci-pci-core.c
index 4154ee11b47d..e5835fbf73bc 100644
--- a/drivers/mmc/host/sdhci-pci-core.c
+++ b/drivers/mmc/host/sdhci-pci-core.c
@@ -1682,6 +1682,8 @@ static const struct pci_device_id pci_ids[] = {
SDHCI_PCI_DEVICE(O2, SEABIRD1, o2),
SDHCI_PCI_DEVICE(ARASAN, PHY_EMMC, arasan),
SDHCI_PCI_DEVICE(SYNOPSYS, DWC_MSHC, snps),
+   SDHCI_PCI_DEVICE(GLI, 9750, gl9750),
+   SDHCI_PCI_DEVICE(GLI, 9755, gl9755),
SDHCI_PCI_DEVICE_CLASS(AMD, SYSTEM_SDHCI, PCI_CLASS_MASK, amd),
/* Generic SD host controller */
{PCI_DEVICE_CLASS(SYSTEM_SDHCI, PCI_CLASS_MASK)},
diff --git a/drivers/mmc/host/sdhci-pci-gli.c b/drivers/mmc/host/sdhci-pci-gli.c
new file mode 100644
index ..94462b94abec
--- /dev/null
+++ b/drivers/mmc/host/sdhci-pci-gli.c
@@ -0,0 +1,355 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2019 Genesys Logic, Inc.
+ *
+ * Authors: Ben Chuang 
+ *
+ * Version: v0.9.0 (2019-08-08)
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "sdhci.h"
+#include "sdhci-pci.h"
+
+/*  Genesys Logic extra registers */
+#define SDHCI_GLI_9750_WT 0x800
+#define   SDHCI_GLI_9750_WT_EN  BIT(0)
+#define   GLI_9750_WT_EN_ON0x1
+#define   GLI_9750_WT_EN_OFF   0x0
+
+#define SDHCI_GLI_9750_DRIVING  0x860
+#define   SDHCI_GLI_9750_DRIVING_1GENMASK(11, 0)
+#define   SDHCI_GLI_9750_DRIVING_2GENMASK(27, 26)
+#define   GLI_9750_DRIVING_1_VALUE0xFFF
+#define   GLI_9750_DRIVING_2_VALUE0x3
+
+#define SDHCI_GLI_9750_PLL   0x864
+#define   SDHCI_GLI_9750_PLL_TX2_INVBIT(23)
+#define   SDHCI_GLI_9750_PLL_TX2_DLYGENMASK(22, 20)
+#define   GLI_9750_PLL_TX2_INV_VALUE0x1
+#define   GLI_9750_PLL_TX2_DLY_VALUE0x0
+
+#define SDHCI_GLI_9750_SW_CTRL  0x874
+#define   SDHCI_GLI_9750_SW_CTRL_4GENMASK(7, 6)
+#define   GLI_9750_SW_CTRL_4_VALUE0x3
+
+#define SDHCI_GLI_9750_MISC0x878
+#define   SDHCI_GLI_9750_MISC_TX1_INVBIT(2)
+#define   SDHCI_GLI_9750_MISC_RX_INV BIT(3)
+#define   SDHCI_GLI_9750_MISC_TX1_DLYGENMASK(6, 4)
+#define   GLI_9750_MISC_TX1_INV_VALUE0x0
+#define   GLI_9750_MISC_RX_INV_ON0x1
+#define   GLI_9750_MISC_RX_INV_OFF   0x0
+#define   GLI_9750_MISC_RX_INV_VALUE GLI_9750_MISC_RX_INV_OFF
+#define   GLI_9750_MISC_TX1_DLY_VALUE0x5
+
+#define SDHCI_GLI_9750_TUNING_CONTROL0x540
+#define   SDHCI_GLI_9750_TUNING_CONTROL_EN  BIT(4)
+#define   GLI_9750_TUNING_CONTROL_EN_ON 0x1
+#define   GLI_9750_TUNING_CONTROL_EN_OFF0x0
+#define   SDHCI_GLI_9750_TUNING_CONTROL_GLITCH_1BIT(16)
+#define   SDHCI_GLI_9750_TUNING_CONTROL_GLITCH_2GENMASK(20, 19)
+#define   GLI_9750_TUNING_CONTROL_GLITCH_1_VALUE0x1
+#define   GLI_9750_TUNING_CONTROL_GLITCH_2_VALUE0x2
+
+#define SDHCI_GLI_9750_TUNING_PARAMETERS   0x544
+#define   SDHCI_GLI_9750_TUNING_PARAMETERS_RX_DLYGENMASK(2, 0)
+#define   GLI_9750_TUNING_PARAMETERS_RX_DLY_VALUE0x1
+
+#define 
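
The register definitions above pair each mask (BIT/GENMASK) with a value
to be placed in that field. As a rough userspace sketch of how such a
mask/value pair is typically combined (the kernel provides BIT(),
GENMASK() and FIELD_PREP() for this), consider the following. All sk_*
names are hypothetical stand-ins, and the read-modify-write helper is an
assumption about usage; the driver body is not visible in this excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's BIT() and GENMASK() (32-bit only). */
#define SK_BIT(n)         (UINT32_C(1) << (n))
#define SK_GENMASK(h, l)  ((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))

/* Position of the lowest set bit in a non-zero mask. */
static unsigned int sk_mask_shift(uint32_t mask)
{
	unsigned int shift = 0;

	assert(mask != 0);
	while (!(mask & 1u)) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/* Place 'val' into the field described by 'mask' (FIELD_PREP analogue). */
static uint32_t sk_field_prep(uint32_t mask, uint32_t val)
{
	return (val << sk_mask_shift(mask)) & mask;
}

/* Read-modify-write of one field: clear it, then insert the new value. */
static uint32_t sk_field_update(uint32_t reg, uint32_t mask, uint32_t val)
{
	return (reg & ~mask) | sk_field_prep(mask, val);
}

/* Masks and values copied from the patch (GL9750 driving register). */
#define SDHCI_GLI_9750_DRIVING_1  SK_GENMASK(11, 0)
#define SDHCI_GLI_9750_DRIVING_2  SK_GENMASK(27, 26)
#define GLI_9750_DRIVING_1_VALUE  0xFFF
#define GLI_9750_DRIVING_2_VALUE  0x3

/* Hypothetical helper: compute new contents for the driving register. */
static uint32_t sk_gl9750_driving(uint32_t reg)
{
	reg = sk_field_update(reg, SDHCI_GLI_9750_DRIVING_1,
			      GLI_9750_DRIVING_1_VALUE);
	reg = sk_field_update(reg, SDHCI_GLI_9750_DRIVING_2,
			      GLI_9750_DRIVING_2_VALUE);
	return reg;
}
```

Keeping the mask and the value as separate macros means the shift never
has to be written by hand; it is derived from the mask.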

[PATCH V8 4/5] mmc: sdhci: Export sdhci_abort_tuning function symbol

2019-09-05 Thread Ben Chuang
From: Ben Chuang 

Export the sdhci_abort_tuning() function symbol, which is used by other SD
host controller driver modules.

Signed-off-by: Ben Chuang 
Co-developed-by: Michael K Johnson 
Signed-off-by: Michael K Johnson 
Acked-by: Adrian Hunter 
---
 drivers/mmc/host/sdhci.c | 3 ++-
 drivers/mmc/host/sdhci.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 9106ebc7a422..0f2f110534db 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -2328,7 +2328,7 @@ void sdhci_reset_tuning(struct sdhci_host *host)
 }
 EXPORT_SYMBOL_GPL(sdhci_reset_tuning);
 
-static void sdhci_abort_tuning(struct sdhci_host *host, u32 opcode)
+void sdhci_abort_tuning(struct sdhci_host *host, u32 opcode)
 {
sdhci_reset_tuning(host);
 
@@ -2339,6 +2339,7 @@ static void sdhci_abort_tuning(struct sdhci_host *host, 
u32 opcode)
 
mmc_abort_tuning(host->mmc, opcode);
 }
+EXPORT_SYMBOL_GPL(sdhci_abort_tuning);
 
 /*
  * We use sdhci_send_tuning() because mmc_send_tuning() is not a good fit. 
SDHCI
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index 72601a4d2e95..437bab3af195 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -797,5 +797,6 @@ void sdhci_start_tuning(struct sdhci_host *host);
 void sdhci_end_tuning(struct sdhci_host *host);
 void sdhci_reset_tuning(struct sdhci_host *host);
 void sdhci_send_tuning(struct sdhci_host *host, u32 opcode);
+void sdhci_abort_tuning(struct sdhci_host *host, u32 opcode);
 
 #endif /* __SDHCI_HW_H */
-- 
2.23.0



[PATCH V8 3/5] PCI: Add Genesys Logic, Inc. Vendor ID

2019-09-05 Thread Ben Chuang
From: Ben Chuang 

Add the Genesys Logic, Inc. vendor ID to pci_ids.h.

Signed-off-by: Ben Chuang 
Co-developed-by: Michael K Johnson 
Signed-off-by: Michael K Johnson 
Acked-by: Adrian Hunter 
---
 include/linux/pci_ids.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 70e86148cb1e..4f7e12772a14 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2403,6 +2403,8 @@
 #define PCI_DEVICE_ID_RDC_R60610x6061
 #define PCI_DEVICE_ID_RDC_D10100x1010
 
+#define PCI_VENDOR_ID_GLI  0x17a0
+
 #define PCI_VENDOR_ID_LENOVO   0x17aa
 
 #define PCI_VENDOR_ID_QCOM 0x17cb
-- 
2.23.0



[PATCH V8 2/5] mmc: sdhci: Add PLL Enable support to internal clock setup

2019-09-05 Thread Ben Chuang
From: Ben Chuang 

The GL9750 and GL9755 chipsets, and possibly others, require PLL Enable
setup as part of the internal clock setup as described in 3.2.1 Internal
Clock Setup Sequence of SD Host Controller Simplified Specification
Version 4.20.

Signed-off-by: Ben Chuang 
Co-developed-by: Michael K Johnson 
Signed-off-by: Michael K Johnson 
Acked-by: Adrian Hunter 
---
 drivers/mmc/host/sdhci.c | 23 +++
 drivers/mmc/host/sdhci.h |  1 +
 2 files changed, 24 insertions(+)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index bed0760a6c2a..9106ebc7a422 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1653,6 +1653,29 @@ void sdhci_enable_clk(struct sdhci_host *host, u16 clk)
udelay(10);
}
 
+   if (host->version >= SDHCI_SPEC_410 && host->v4_mode) {
+   clk |= SDHCI_CLOCK_PLL_EN;
+   clk &= ~SDHCI_CLOCK_INT_STABLE;
+   sdhci_writew(host, clk, SDHCI_CLOCK_CONTROL);
+
+   /* Wait max 150 ms */
+   timeout = ktime_add_ms(ktime_get(), 150);
+   while (1) {
+   bool timedout = ktime_after(ktime_get(), timeout);
+
+   clk = sdhci_readw(host, SDHCI_CLOCK_CONTROL);
+   if (clk & SDHCI_CLOCK_INT_STABLE)
+   break;
+   if (timedout) {
+   pr_err("%s: PLL clock never stabilised.\n",
+  mmc_hostname(host->mmc));
+   sdhci_dumpregs(host);
+   return;
+   }
+   udelay(10);
+   }
+   }
+
clk |= SDHCI_CLOCK_CARD_EN;
sdhci_writew(host, clk, SDHCI_CLOCK_CONTROL);
 }
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index 199712e7adbb..72601a4d2e95 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -114,6 +114,7 @@
 #define  SDHCI_DIV_HI_MASK 0x300
 #define  SDHCI_PROG_CLOCK_MODE 0x0020
 #define  SDHCI_CLOCK_CARD_EN   0x0004
+#define  SDHCI_CLOCK_PLL_EN0x0008
 #define  SDHCI_CLOCK_INT_STABLE0x0002
 #define  SDHCI_CLOCK_INT_EN0x0001
 
-- 
2.23.0
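
The PLL enable step added above follows a common pattern: write the
enable bit, then poll a status bit until it is set or a deadline passes.
The sketch below models that pattern in userspace with a simulated
device and clock. The sk_* names and the device model are hypothetical;
only the bit values and the 150 ms / 10 us figures come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_CLOCK_PLL_EN      0x0008  /* bit values as defined in the patch */
#define SK_CLOCK_INT_STABLE  0x0002

/* Hypothetical device model: reports "stable" after a number of reads. */
struct sk_dev {
	uint16_t clk;                    /* last value written */
	unsigned int reads_until_stable; /* reads left before stable is set */
	uint64_t now_us;                 /* simulated monotonic clock */
};

static void sk_writew(struct sk_dev *d, uint16_t v)
{
	d->clk = v;
}

static uint16_t sk_readw(struct sk_dev *d)
{
	if (d->reads_until_stable == 0)
		return d->clk | SK_CLOCK_INT_STABLE;
	d->reads_until_stable--;
	return d->clk;
}

static void sk_udelay(struct sk_dev *d, unsigned int us)
{
	d->now_us += us;
}

/* Returns true once the stable bit is seen, false after the timeout. */
static bool sk_enable_pll(struct sk_dev *d, uint16_t clk,
			  unsigned int timeout_ms)
{
	uint64_t deadline;

	clk |= SK_CLOCK_PLL_EN;
	clk &= (uint16_t)~SK_CLOCK_INT_STABLE;
	sk_writew(d, clk);

	deadline = d->now_us + (uint64_t)timeout_ms * 1000;
	for (;;) {
		/* Sample the clock first so a late success still counts. */
		bool timedout = d->now_us > deadline;

		if (sk_readw(d) & SK_CLOCK_INT_STABLE)
			return true;
		if (timedout)
			return false;
		sk_udelay(d, 10);
	}
}
```

Checking the deadline before reading the register, as the patch does,
avoids reporting a timeout when the bit became set just as the deadline
expired.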



[PATCH V8 1/5] mmc: sdhci: Change timeout of loop for checking internal clock stable

2019-09-05 Thread Ben Chuang
From: Ben Chuang 

According to section 3.2.1, "Internal Clock Setup Sequence", of the SD Host
Controller Simplified Specification Version 4.20, the timeout of the loop
that waits for the internal clock to become stable is defined as 150 ms.

Signed-off-by: Ben Chuang 
Co-developed-by: Michael K Johnson 
Signed-off-by: Michael K Johnson 
Acked-by: Adrian Hunter 
---
 drivers/mmc/host/sdhci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 59acf8e3331e..bed0760a6c2a 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1636,8 +1636,8 @@ void sdhci_enable_clk(struct sdhci_host *host, u16 clk)
clk |= SDHCI_CLOCK_INT_EN;
sdhci_writew(host, clk, SDHCI_CLOCK_CONTROL);
 
-   /* Wait max 20 ms */
-   timeout = ktime_add_ms(ktime_get(), 20);
+   /* Wait max 150 ms */
+   timeout = ktime_add_ms(ktime_get(), 150);
while (1) {
bool timedout = ktime_after(ktime_get(), timeout);
 
-- 
2.23.0



[PATCH V8 0/5] Add Genesys Logic GL975x support

2019-09-05 Thread Ben Chuang
From: Ben Chuang 

The patches modify internal clock setup to match SD Host Controller
Simplified Specifications 4.20 and support Genesys Logic GL9750/GL9755
chipsets.

v8:
 refine code in sdhci-pci-gli.c
 - remove duplicate assignment 
 - remove redundant delay
 - use '!!'(not not) logical operator to refine the true/false condition
 - check end condition after outer loop
 - add comments for delay 5ms in sdhci_gli_voltage_switch()
 - merge two logical conditions to one line

v7:
 - remove condition define CONFIG_MMC_SDHCI_IO_ACCESSORS from sdhci-pci-gli.c
 - making the accessors(MMC_SDHCI_IO_ACCESSORS) always available when selecting
   MMC_SDHCI_PCI in Kconfig

V6:
 - export sdhci_abort_tuning() function symbol
 - use C-style comments
 - use BIT, FIELD_{GET,PREP} and GENMASK to define bit fields of register
 - use host->ops->platform_execute_tuning instead of mmc->ops->execute_tuning
 - call sdhci_reset() instead of duplicating the code in sdhci_gl9750_reset
 - remove .hw_reset 
 - use condition define CONFIG_MMC_SDHCI_IO_ACCESSORS for read_l

V5:
 - add "change timeout of loop .." to a patch
 - fix typo "verndor" to "vendor"

V4:
 - change name from sdhci_gli_reset to sdhci_gl9750_reset
 - fix sdhci_reset to sdhci_gl9750_reset in sdhci_gl9750_ops
 - fix sdhci_gli_reset to sdhci_reset in sdhci_gl9755_ops
 
V3:
 - change usleep_range to udelay
 - add Genesys Logic PCI Vendor ID to a patch
 - separate the Genesys Logic specific part to a patch

V2:
 - change udelay to usleep_range

Ben Chuang (5):
  mmc: sdhci: Change timeout of loop for checking internal clock stable
  mmc: sdhci: Add PLL Enable support to internal clock setup
  PCI: Add Genesys Logic, Inc. Vendor ID
  mmc: sdhci: Export sdhci_abort_tuning function symbol
  mmc: host: sdhci-pci: Add Genesys Logic GL975x support

 drivers/mmc/host/Kconfig  |   1 +
 drivers/mmc/host/Makefile |   2 +-
 drivers/mmc/host/sdhci-pci-core.c |   2 +
 drivers/mmc/host/sdhci-pci-gli.c  | 355 ++
 drivers/mmc/host/sdhci-pci.h  |   5 +
 drivers/mmc/host/sdhci.c  |  30 ++-
 drivers/mmc/host/sdhci.h  |   2 +
 include/linux/pci_ids.h   |   2 +
 8 files changed, 395 insertions(+), 4 deletions(-)
 create mode 100644 drivers/mmc/host/sdhci-pci-gli.c

-- 
2.23.0



[PATCH v4 0/4] KVM: X86: Some tracepoint enhancements

2019-09-05 Thread Peter Xu
v4:
- pick r-b
- swap the last two patches [Sean]

v3:
- use unsigned int for vcpu id [Sean]
- a new patch to fix ple_window type [Sean]

v2:
- fix commit messages, change format of ple window tracepoint [Sean]
- rebase [Wanpeng]

Each small patch explains itself.  I noticed these while tracing
some IRQ paths and found them helpful, at least to me.

Please have a look.  Thanks,

Peter Xu (4):
  KVM: X86: Trace vcpu_id for vmexit
  KVM: X86: Remove tailing newline for tracepoints
  KVM: VMX: Change ple_window type to unsigned int
  KVM: X86: Tune PLE Window tracepoint

 arch/x86/kvm/svm.c | 16 
 arch/x86/kvm/trace.h   | 34 ++
 arch/x86/kvm/vmx/vmx.c | 18 ++
 arch/x86/kvm/vmx/vmx.h |  2 +-
 arch/x86/kvm/x86.c |  2 +-
 5 files changed, 34 insertions(+), 38 deletions(-)

-- 
2.21.0



[PATCH v4 3/4] KVM: VMX: Change ple_window type to unsigned int

2019-09-05 Thread Peter Xu
The VMX ple_window is 32 bits wide, so logically it can overflow an
int.  The module parameter is declared as unsigned int, which is
good, but the dynamic variable is not.  Switch all the
ple_window references to use unsigned int.

The tracepoint changes will also affect SVM, but SVM is using an even
smaller width (16 bits) so it's always fine.

Suggested-by: Sean Christopherson 
Signed-off-by: Peter Xu 
---
 arch/x86/kvm/trace.h   | 9 +
 arch/x86/kvm/vmx/vmx.c | 4 ++--
 arch/x86/kvm/vmx/vmx.h | 2 +-
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 8a7570f8c943..afe8d269c16c 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -891,14 +891,15 @@ TRACE_EVENT(kvm_pml_full,
 );
 
 TRACE_EVENT(kvm_ple_window,
-   TP_PROTO(bool grow, unsigned int vcpu_id, int new, int old),
+   TP_PROTO(bool grow, unsigned int vcpu_id, unsigned int new,
+unsigned int old),
TP_ARGS(grow, vcpu_id, new, old),
 
TP_STRUCT__entry(
__field(bool,  grow )
__field(unsigned int,   vcpu_id )
-   __field( int,   new )
-   __field( int,   old )
+   __field(unsigned int,   new )
+   __field(unsigned int,   old )
),
 
TP_fast_assign(
@@ -908,7 +909,7 @@ TRACE_EVENT(kvm_ple_window,
__entry->old= old;
),
 
-   TP_printk("vcpu %u: ple_window %d (%s %d)",
+   TP_printk("vcpu %u: ple_window %u (%s %u)",
  __entry->vcpu_id,
  __entry->new,
  __entry->grow ? "grow" : "shrink",
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 42ed3faa6af8..b172b675d420 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5227,7 +5227,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu 
*vcpu)
 static void grow_ple_window(struct kvm_vcpu *vcpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
-   int old = vmx->ple_window;
+   unsigned int old = vmx->ple_window;
 
vmx->ple_window = __grow_ple_window(old, ple_window,
ple_window_grow,
@@ -5242,7 +5242,7 @@ static void grow_ple_window(struct kvm_vcpu *vcpu)
 static void shrink_ple_window(struct kvm_vcpu *vcpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
-   int old = vmx->ple_window;
+   unsigned int old = vmx->ple_window;
 
vmx->ple_window = __shrink_ple_window(old, ple_window,
  ple_window_shrink,
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 82d0bc3a4d52..64d5a4890aa9 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -253,7 +253,7 @@ struct vcpu_vmx {
struct nested_vmx nested;
 
/* Dynamic PLE window. */
-   int ple_window;
+   unsigned int ple_window;
bool ple_window_dirty;
 
bool req_immediate_exit;
-- 
2.21.0



[PATCH v4 2/4] KVM: X86: Remove tailing newline for tracepoints

2019-09-05 Thread Peter Xu
It's done by TP_printk() already.

Reviewed-by: Krish Sadhukhan 
Reviewed-by: Sean Christopherson 
Signed-off-by: Peter Xu 
---
 arch/x86/kvm/trace.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 20d6cac9f157..8a7570f8c943 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -1323,7 +1323,7 @@ TRACE_EVENT(kvm_avic_incomplete_ipi,
__entry->index = index;
),
 
-   TP_printk("vcpu=%u, icrh:icrl=%#010x:%08x, id=%u, index=%u\n",
+   TP_printk("vcpu=%u, icrh:icrl=%#010x:%08x, id=%u, index=%u",
  __entry->vcpu, __entry->icrh, __entry->icrl,
  __entry->id, __entry->index)
 );
@@ -1348,7 +1348,7 @@ TRACE_EVENT(kvm_avic_unaccelerated_access,
__entry->vec = vec;
),
 
-   TP_printk("vcpu=%u, offset=%#x(%s), %s, %s, vec=%#x\n",
+   TP_printk("vcpu=%u, offset=%#x(%s), %s, %s, vec=%#x",
  __entry->vcpu,
  __entry->offset,
  __print_symbolic(__entry->offset, kvm_trace_symbol_apic),
-- 
2.21.0



[PATCH v4 4/4] KVM: X86: Tune PLE Window tracepoint

2019-09-05 Thread Peter Xu
The PLE window tracepoint triggers even if the window is not changed,
and the wording can be a bit confusing too.  One example line:

  kvm_ple_window: vcpu 0: ple_window 4096 (shrink 4096)

It easily leads people to think "the window is now 4096, which has
shrunk", but in truth the value didn't change at all (4096).

Let's only dump this message if the value really changed, and make
the message even simpler, like:

  kvm_ple_window: vcpu 4 old 4096 new 8192 (growed)

Signed-off-by: Peter Xu 
---
 arch/x86/kvm/svm.c | 16 
 arch/x86/kvm/trace.h   | 22 ++
 arch/x86/kvm/vmx/vmx.c | 14 --
 arch/x86/kvm/x86.c |  2 +-
 4 files changed, 23 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d685491fce4d..d5cb6b5a9254 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1269,11 +1269,11 @@ static void grow_ple_window(struct kvm_vcpu *vcpu)
pause_filter_count_grow,
pause_filter_count_max);
 
-   if (control->pause_filter_count != old)
+   if (control->pause_filter_count != old) {
mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
-
-   trace_kvm_ple_window_grow(vcpu->vcpu_id,
- control->pause_filter_count, old);
+   trace_kvm_ple_window_update(vcpu->vcpu_id,
+   control->pause_filter_count, old);
+   }
 }
 
 static void shrink_ple_window(struct kvm_vcpu *vcpu)
@@ -1287,11 +1287,11 @@ static void shrink_ple_window(struct kvm_vcpu *vcpu)
pause_filter_count,
pause_filter_count_shrink,
pause_filter_count);
-   if (control->pause_filter_count != old)
+   if (control->pause_filter_count != old) {
mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
-
-   trace_kvm_ple_window_shrink(vcpu->vcpu_id,
-   control->pause_filter_count, old);
+   trace_kvm_ple_window_update(vcpu->vcpu_id,
+   control->pause_filter_count, old);
+   }
 }
 
 static __init int svm_hardware_setup(void)
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index afe8d269c16c..ae924566c401 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -890,37 +890,27 @@ TRACE_EVENT(kvm_pml_full,
TP_printk("vcpu %d: PML full", __entry->vcpu_id)
 );
 
-TRACE_EVENT(kvm_ple_window,
-   TP_PROTO(bool grow, unsigned int vcpu_id, unsigned int new,
-unsigned int old),
-   TP_ARGS(grow, vcpu_id, new, old),
+TRACE_EVENT(kvm_ple_window_update,
+   TP_PROTO(unsigned int vcpu_id, unsigned int new, unsigned int old),
+   TP_ARGS(vcpu_id, new, old),
 
TP_STRUCT__entry(
-   __field(bool,  grow )
__field(unsigned int,   vcpu_id )
__field(unsigned int,   new )
__field(unsigned int,   old )
),
 
TP_fast_assign(
-   __entry->grow   = grow;
__entry->vcpu_id= vcpu_id;
__entry->new= new;
__entry->old= old;
),
 
-   TP_printk("vcpu %u: ple_window %u (%s %u)",
- __entry->vcpu_id,
- __entry->new,
- __entry->grow ? "grow" : "shrink",
- __entry->old)
+   TP_printk("vcpu %u old %u new %u (%s)",
+ __entry->vcpu_id, __entry->old, __entry->new,
+ __entry->old < __entry->new ? "growed" : "shrinked")
 );
 
-#define trace_kvm_ple_window_grow(vcpu_id, new, old) \
-   trace_kvm_ple_window(true, vcpu_id, new, old)
-#define trace_kvm_ple_window_shrink(vcpu_id, new, old) \
-   trace_kvm_ple_window(false, vcpu_id, new, old)
-
 TRACE_EVENT(kvm_pvclock_update,
TP_PROTO(unsigned int vcpu_id, struct pvclock_vcpu_time_info *pvclock),
TP_ARGS(vcpu_id, pvclock),
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b172b675d420..1dbb63ffdd6d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5233,10 +5233,11 @@ static void grow_ple_window(struct kvm_vcpu *vcpu)
ple_window_grow,
ple_window_max);
 
-   if (vmx->ple_window != old)
+   if (vmx->ple_window != old) {
vmx->ple_window_dirty = true;
-
-   trace_kvm_ple_window_grow(vcpu->vcpu_id, vmx->ple_window, old);
+   trace_kvm_ple_window_update(vcpu->vcpu_id,
+   vmx->ple_window, old);
+   }
 }
 
 static void shrink_ple_window(struct 
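
The behaviour this patch aims for can be sketched in userspace as
follows: emit nothing when the window is unchanged, and otherwise derive
the direction from the old and new values instead of passing a flag. The
function name is hypothetical; the message layout is the one from the
new TP_printk() above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the proposed message only when the window actually changed.
 * Returns the number of characters written, or 0 if nothing was emitted.
 */
static int sk_trace_ple_window_update(char *buf, size_t len,
				      unsigned int vcpu_id,
				      unsigned int new_win,
				      unsigned int old_win)
{
	if (new_win == old_win) {
		if (len)
			buf[0] = '\0';
		return 0;
	}
	return snprintf(buf, len, "vcpu %u old %u new %u (%s)",
			vcpu_id, old_win, new_win,
			old_win < new_win ? "growed" : "shrinked");
}
```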

[PATCH v4 1/4] KVM: X86: Trace vcpu_id for vmexit

2019-09-05 Thread Peter Xu
Tracing the ID helps to pair vmenters and vmexits for guests with
multiple vCPUs.

Reviewed-by: Krish Sadhukhan 
Reviewed-by: Sean Christopherson 
Signed-off-by: Peter Xu 
---
 arch/x86/kvm/trace.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index b5c831e79094..20d6cac9f157 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -232,17 +232,20 @@ TRACE_EVENT(kvm_exit,
__field(u32,isa )
__field(u64,info1   )
__field(u64,info2   )
+   __field(unsigned int,   vcpu_id )
),
 
TP_fast_assign(
__entry->exit_reason= exit_reason;
__entry->guest_rip  = kvm_rip_read(vcpu);
__entry->isa= isa;
+   __entry->vcpu_id= vcpu->vcpu_id;
kvm_x86_ops->get_exit_info(vcpu, &__entry->info1,
   &__entry->info2);
),
 
-   TP_printk("reason %s rip 0x%lx info %llx %llx",
+   TP_printk("vcpu %u reason %s rip 0x%lx info %llx %llx",
+ __entry->vcpu_id,
 (__entry->isa == KVM_ISA_VMX) ?
 __print_symbolic(__entry->exit_reason, VMX_EXIT_REASONS) :
 __print_symbolic(__entry->exit_reason, SVM_EXIT_REASONS),
-- 
2.21.0



Re: WARNING in hso_free_net_device

2019-09-05 Thread Hui Peng



On 9/5/2019 7:24 AM, Andrey Konovalov wrote:
> On Thu, Sep 5, 2019 at 4:20 AM Hui Peng  wrote:
>>
>> Can you guys have  a look at the attached patch?
> 
> Let's try it:
> 
> #syz test: https://github.com/google/kasan.git eea39f24
> 
> FYI: there are two more reports coming from this driver, which might
> (or might not) have the same root cause. One of them has a suggested
> fix by Oliver.
> 
> https://syzkaller.appspot.com/bug?extid=67b2bd0e34f952d0321e
> https://syzkaller.appspot.com/bug?extid=93f2f45b19519b289613
> 

I think they are different, though similar.
This one results from unregistering a network device.
Those two result from unregistering a tty device.

>>
>> On 9/4/19 6:41 PM, Stephen Hemminger wrote:
>>> On Wed, 4 Sep 2019 16:27:50 -0400
>>> Hui Peng  wrote:
>>>
>>>> Hi, all:
>>>>
>>>> I looked at the bug a little.
>>>>
>>>> The issue is that in the error handling code, hso_free_net_device
>>>> unregisters the net_device (hso_net->net) by calling unregister_netdev.
>>>> In the error handling code path, hso_net->net has not been registered yet.
>>>>
>>>> I think there are two ways to solve the issue:
>>>>
>>>> 1. fix it in drivers/net/usb/hso.c to avoid unregistering the
>>>> net_device when it is still not registered
>>>>
>>>> 2. fix it in unregister_netdev. We can add a field in net_device to
>>>> record whether it is registered, and make unregister_netdev return if
>>>> the net_device is not registered yet.
>>>>
>>>> What do you guys think ?
>>> #1
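
Option #1 can be sketched as below: the driver's cleanup path only
unregisters a device that was actually registered. This is a userspace
model with hypothetical sk_* types and functions, not the hso driver
code or the attached patch, and the explicit 'registered' flag is an
assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device; only what the sketch needs. */
struct sk_netdev {
	bool registered;
	int unregister_calls;	/* counts calls, for illustration */
	bool freed;
};

static int sk_register_netdev(struct sk_netdev *dev)
{
	dev->registered = true;
	return 0;
}

/* Models the precondition that triggers the warning when violated. */
static void sk_unregister_netdev(struct sk_netdev *dev)
{
	assert(dev->registered);
	dev->registered = false;
	dev->unregister_calls++;
}

static void sk_free_netdev(struct sk_netdev *dev)
{
	dev->freed = true;
}

/* Option 1: the cleanup path unregisters only what was registered. */
static void sk_hso_free_net_device(struct sk_netdev *dev)
{
	if (!dev)
		return;
	if (dev->registered)
		sk_unregister_netdev(dev);
	sk_free_netdev(dev);
}
```

With this shape the same cleanup function is safe on both the normal
teardown path and the probe error path.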


Re: [PATCH] KVM: LAPIC: Fix SynIC Timers inject timer interrupt w/o LAPIC present

2019-09-05 Thread Wanpeng Li
On Thu, 5 Sep 2019 at 21:16, Vitaly Kuznetsov  wrote:
>
> Wanpeng Li  writes:
>
> > From: Wanpeng Li 
> >
> > Reported by syzkaller:
> >
> >   kasan: GPF could be caused by NULL-ptr deref or user memory access
> >   general protection fault:  [#1] PREEMPT SMP KASAN
> >   RIP: 0010:__apic_accept_irq+0x46/0x740 arch/x86/kvm/lapic.c:1029
> >   Call Trace:
> >   kvm_apic_set_irq+0xb4/0x140 arch/x86/kvm/lapic.c:558
> >   stimer_notify_direct arch/x86/kvm/hyperv.c:648 [inline]
> >   stimer_expiration arch/x86/kvm/hyperv.c:659 [inline]
> >   kvm_hv_process_stimers+0x594/0x1650 arch/x86/kvm/hyperv.c:686
> >   vcpu_enter_guest+0x2b2a/0x54b0 arch/x86/kvm/x86.c:7896
> >   vcpu_run+0x393/0xd40 arch/x86/kvm/x86.c:8152
> >   kvm_arch_vcpu_ioctl_run+0x636/0x900 arch/x86/kvm/x86.c:8360
> >   kvm_vcpu_ioctl+0x6cf/0xaf0 
> > arch/x86/kvm/../../../virt/kvm/kvm_main.c:2765
> >
> > The testcase programs HV_X64_MSR_STIMERn_CONFIG/HV_X64_MSR_STIMERn_COUNT
> > while there is no lapic in the kernel. The counter values are small
> > enough that kvm_hv_process_stimers() injects this already-expired
> > timer interrupt into the guest through the lapic in the kernel, which
> > triggers the NULL dereference. This patch fixes it by checking
> > lapic_in_kernel() and discarding the injection if it returns 0.
> >
> > Reported-by: syzbot+dff25ee91f0c7d5c1...@syzkaller.appspotmail.com
> > Cc: Paolo Bonzini 
> > Cc: Radim Krčmář 
> > Cc: Vitaly Kuznetsov 
> > Signed-off-by: Wanpeng Li 
> > ---
> >  arch/x86/kvm/hyperv.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> > index c10a8b1..461fcc5 100644
> > --- a/arch/x86/kvm/hyperv.c
> > +++ b/arch/x86/kvm/hyperv.c
> > @@ -645,7 +645,9 @@ static int stimer_notify_direct(struct 
> > kvm_vcpu_hv_stimer *stimer)
> >   .vector = stimer->config.apic_vector
> >   };
> >
> > - return !kvm_apic_set_irq(vcpu, , NULL);
> > + if (lapic_in_kernel(vcpu))
> > + return !kvm_apic_set_irq(vcpu, , NULL);
> > + return 0;
> >  }
> >
> >  static void stimer_expiration(struct kvm_vcpu_hv_stimer *stimer)
>
> Hm, but this basically means direct mode synthetic timers won't work
> when LAPIC is not in kernel but the feature will still be advertised to
> the guest, not good. Shall we stop advertizing it? Something like
> (completely untested):
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 3f5ad84853fb..1dfa594eaab6 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1856,7 +1856,13 @@ int kvm_vcpu_ioctl_get_hv_cpuid(struct kvm_vcpu *vcpu, 
> struct kvm_cpuid2 *cpuid,
>
> ent->edx |= HV_FEATURE_FREQUENCY_MSRS_AVAILABLE;
> ent->edx |= HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;
> -   ent->edx |= HV_STIMER_DIRECT_MODE_AVAILABLE;
> +
> +   /*
> +* Direct Synthetic timers only make sense with 
> in-kernel
> +* LAPIC
> +*/
> +   if (lapic_in_kernel(vcpu))
> +   ent->edx |= HV_STIMER_DIRECT_MODE_AVAILABLE;
>
> break;

Thanks, I will fold this into v2. syzkaller didn't even check the cpuid,
so I will still keep the part that discards the injection.

   Wanpeng
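
The two checks discussed in this thread can be modelled together in a
small userspace sketch: guard the injection path against a missing
in-kernel LAPIC, and only advertise direct-mode timers when one exists.
Every sk_* name is hypothetical, and the feature bit value is a
placeholder rather than the real HV_STIMER_DIRECT_MODE_AVAILABLE value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_lapic {
	int pending_vector;
};

struct sk_vcpu {
	struct sk_lapic *apic;	/* NULL when the LAPIC lives in userspace */
};

/* Placeholder bit, not the real kernel definition. */
#define SK_STIMER_DIRECT_MODE_AVAILABLE (1u << 0)

static bool sk_lapic_in_kernel(const struct sk_vcpu *vcpu)
{
	return vcpu->apic != NULL;
}

/* Dereferences vcpu->apic, so it needs an in-kernel LAPIC. */
static int sk_apic_set_irq(struct sk_vcpu *vcpu, int vector)
{
	vcpu->apic->pending_vector = vector;
	return 1;
}

/* Same shape as the patched stimer_notify_direct(). */
static int sk_stimer_notify_direct(struct sk_vcpu *vcpu, int vector)
{
	if (sk_lapic_in_kernel(vcpu))
		return !sk_apic_set_irq(vcpu, vector);
	return 0;	/* no in-kernel LAPIC: drop the injection */
}

/* Same idea as the suggested CPUID change: advertise only with a LAPIC. */
static unsigned int sk_hv_features_edx(const struct sk_vcpu *vcpu)
{
	unsigned int edx = 0;

	if (sk_lapic_in_kernel(vcpu))
		edx |= SK_STIMER_DIRECT_MODE_AVAILABLE;
	return edx;
}
```

The guard covers callers that never look at CPUID, while the CPUID
change keeps well-behaved guests from enabling the feature at all.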


Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism

2019-09-05 Thread Ming Lei
Hi Daniel,

On Thu, Sep 05, 2019 at 12:37:13PM +0200, Daniel Lezcano wrote:
> 
> Hi Ming,
> 
> On 05/09/2019 11:06, Ming Lei wrote:
> > On Wed, Sep 04, 2019 at 07:31:48PM +0200, Daniel Lezcano wrote:
> >> Hi,
> >>
> >> On 04/09/2019 19:07, Bart Van Assche wrote:
> >>> On 9/3/19 12:50 AM, Daniel Lezcano wrote:
> >>>> On 03/09/2019 09:28, Ming Lei wrote:
> >>>>> On Tue, Sep 03, 2019 at 08:40:35AM +0200, Daniel Lezcano wrote:
> >>>>>> It is a scheduler problem then ?
> >>>>>
> >>>>> Scheduler can do nothing if the CPU is taken completely by handling
> >>>>> interrupt & softirq, so seems not a scheduler problem, IMO.
> >>>>
> >>>> Why? If there is a irq pressure on one CPU reducing its capacity, the
> >>>> scheduler will balance the tasks on another CPU, no?
> >>>
> >>> Only if CONFIG_IRQ_TIME_ACCOUNTING has been enabled. However, I don't
> >>> know any Linux distro that enables that option. That's probably because
> >>> that option introduces two rdtsc() calls in each interrupt. Given the
> >>> overhead introduced by this option, I don't think this is the solution
> >>> Ming is looking for.
> >>
> >> Was this overhead reported somewhere ?
> > 
> > The syscall of gettimeofday() calls ktime_get_real_ts64() which finally
> > calls tk_clock_read() which calls rdtsc too.
> > 
> > But gettimeofday() is often used in fast path, and block IO_STAT needs to
> > read it too.
> > 
> >>
> >>> See also irqtime_account_irq() in kernel/sched/cputime.c.
> >>
> >> From my POV, this framework could be interesting to detect this situation.
> > 
> > Now we are talking about IRQ_TIME_ACCOUNTING instead of IRQ_TIMINGS, and the
> > former one could be used to implement the detection. And the only sharing
> > should be the read of timestamp.
> 
> You did not share yet the analysis of the problem (the kernel warnings
> give the symptoms) and gave the reasoning for the solution. It is hard
> to understand what you are looking for exactly and how to connect the dots.

Let me explain it one more time:

When an IRQ flood happens on one CPU:

1) softirq handling on this CPU can't make progress

2) kernel threads bound to this CPU can't make progress

For example, networking may need softirq to xmit packets, another irq
thread may be handling keyboards/mice or whatever, or rcu_sched may depend
on that CPU to make progress; the irq flood then stalls the whole
system.

> 
> AFAIU, there are fast medium where the responses to requests are faster
> than the time to process them, right?

Usually the medium is not faster than the CPU, but we are talking about
interrupts, which can originate from lots of devices concurrently.
For example, in Long Li's test, there are 8 NVMe drives involved.

> 
> I don't see how detecting IRQ flooding and use a threaded irq is the
> solution, can you explain?

When an IRQ flood is detected, we reserve a little time to give the
scheduler a chance to run softirqs/threads, so the above problem can
be avoided.

> 
> If the responses are coming at a very high rate, whatever the solution
> (interrupts, threaded interrupts, polling), we are still in the same
> situation.

When we move the interrupt handling into an irq thread, other softirqs,
threaded interrupts and threads get a chance to be scheduled, so we can
avoid stalling the whole system.

> 
> My suggestion was initially to see if the interrupt load will be taken
> into accounts in the cpu load and favorize task migration with the
> scheduler load balance to a less loaded CPU, thus the CPU processing
> interrupts will end up doing only that while other CPUs will handle the
> "threaded" side.
> 
> Beside that, I'm wondering if the block scheduler should be somehow
> involved in that [1]

For NVMe or any multi-queue storage, the default scheduler is 'none',
which basically does nothing except for submitting IO asap.


Thanks,
Ming
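
One way to picture the detection step is a smoothed estimate of how much
of each sampling window a CPU spends in hard interrupt context, compared
against a threshold. The sketch below is only an illustration of that
idea in userspace. The averaging rule, the 90% threshold and all sk_*
names are assumptions, not taken from the patch under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_FLOOD_THRESHOLD_PCT 90	/* assumed threshold */

struct sk_irq_load {
	unsigned int avg_pct;	/* smoothed share of time spent in hard IRQ */
};

/* Fold one sampling window into the average: new = (3 * old + sample) / 4. */
static void sk_irq_load_update(struct sk_irq_load *l, uint64_t irq_ns,
			       uint64_t window_ns)
{
	unsigned int sample;

	assert(window_ns > 0 && irq_ns <= window_ns);
	sample = (unsigned int)(irq_ns * 100 / window_ns);
	l->avg_pct = (3 * l->avg_pct + sample) / 4;
}

/* Feed 'n' identical windows in which 'pct' percent was spent in IRQs. */
static void sk_irq_load_feed(struct sk_irq_load *l, unsigned int pct,
			     unsigned int n)
{
	while (n--)
		sk_irq_load_update(l, pct, 100);
}

/* A flood is flagged only when the smoothed load stays above the threshold. */
static bool sk_irq_flood_detected(const struct sk_irq_load *l)
{
	return l->avg_pct >= SK_FLOOD_THRESHOLD_PCT;
}
```

Smoothing means a single busy window does not trip the detector; only a
sustained load does, which is the case where softirqs and threads bound
to that CPU would otherwise starve.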


[PATCH -tip v3 0/2] x86: kprobes: Prohibit kprobes on Xen/KVM emulate prefixes

2019-09-05 Thread Masami Hiramatsu
Hi,

Here is the 3rd version of patches to handle Xen/KVM emulate
prefix by x86 instruction decoder.

These patches allow the x86 instruction decoder to decode
Xen and KVM emulate prefixes correctly, and prohibit kprobes from
probing on them.

Josh reported that objtool can not decode such specially
prefixed instructions, and I found that we also have to
prohibit kprobes from probing on such instructions.

This series can be applied on -tip master branch which
has merged Josh's objtool/perf sharing common x86 insn
decoder series.

In the 2nd version, I added KVM emulate prefix support and generalized
the interface. (insn_has_xen_prefix -> insn_has_emulate_prefix)
Also, I added insn.emulate_prefix_size for those prefixes because
that prefix is NOT an x86 instruction prefix, and the next instruction
of those emulate prefixes can have x86 instruction prefix. So we
can not use insn.prefix for it.

In this 3rd version, I just fixed tools/perf/check-headers.sh so
that it can ignore the difference of xen/prefix header path.

Thank you,

---

Masami Hiramatsu (2):
  x86: xen: insn: Decode Xen and KVM emulate-prefix signature
  x86: kprobes: Prohibit probing on instruction which has emulate prefix


 arch/x86/include/asm/insn.h |6 +
 arch/x86/include/asm/xen/interface.h|7 --
 arch/x86/include/asm/xen/prefix.h   |   10 +
 arch/x86/kernel/kprobes/core.c  |4 +++
 arch/x86/lib/insn.c |   36 +++
 tools/arch/x86/include/asm/insn.h   |6 +
 tools/arch/x86/include/asm/xen/prefix.h |   10 +
 tools/arch/x86/lib/insn.c   |   36 +++
 tools/objtool/sync-check.sh |3 ++-
 tools/perf/check-headers.sh |2 +-
 10 files changed, 116 insertions(+), 4 deletions(-)
 create mode 100644 arch/x86/include/asm/xen/prefix.h
 create mode 100644 tools/arch/x86/include/asm/xen/prefix.h

--
Masami Hiramatsu (Linaro) 


[PATCH -tip v3 2/2] x86: kprobes: Prohibit probing on instruction which has emulate prefix

2019-09-05 Thread Masami Hiramatsu
Prohibit probing on instructions which have XEN_EMULATE_PREFIX
or KVM's emulate prefix. Since that prefix is a marker for Xen
and KVM, if kprobe's int3 overwrites the marker, it no longer
works as expected.

Signed-off-by: Masami Hiramatsu 
---
 arch/x86/kernel/kprobes/core.c |4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 43fc13c831af..4f13af7cbcdb 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -351,6 +351,10 @@ int __copy_instruction(u8 *dest, u8 *src, u8 *real, struct 
insn *insn)
kernel_insn_init(insn, dest, MAX_INSN_SIZE);
insn_get_length(insn);
 
+   /* We can not probe force emulate prefixed instruction */
+   if (insn_has_emulate_prefix(insn))
+   return 0;
+
/* Another subsystem puts a breakpoint, failed to recover */
if (insn->opcode.bytes[0] == BREAKPOINT_INSTRUCTION)
return 0;
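
As a rough userspace illustration of the check this series adds, the
sketch below recognises the two signatures at the start of an
instruction buffer and refuses to probe when one is present. The Xen
bytes are those of XEN_EMULATE_PREFIX in the series; the KVM bytes are
ud2a followed by the ASCII string "kvm". The sk_* functions are
hypothetical and far simpler than the real decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ud2a followed by "xen", as in XEN_EMULATE_PREFIX. */
static const unsigned char sk_xen_prefix[] = { 0x0f, 0x0b, 0x78, 0x65, 0x6e };
/* ud2a followed by "kvm". */
static const unsigned char sk_kvm_prefix[] = { 0x0f, 0x0b, 'k', 'v', 'm' };

/* Return the length of a leading emulate prefix, or 0 if there is none. */
static size_t sk_emulate_prefix_size(const unsigned char *buf, size_t len)
{
	if (len >= sizeof(sk_xen_prefix) &&
	    !memcmp(buf, sk_xen_prefix, sizeof(sk_xen_prefix)))
		return sizeof(sk_xen_prefix);
	if (len >= sizeof(sk_kvm_prefix) &&
	    !memcmp(buf, sk_kvm_prefix, sizeof(sk_kvm_prefix)))
		return sizeof(sk_kvm_prefix);
	return 0;
}

/* A probe is allowed only if the instruction carries no emulate prefix. */
static int sk_can_probe(const unsigned char *buf, size_t len)
{
	return sk_emulate_prefix_size(buf, len) == 0;
}
```

A plain ud2a that is not followed by either string is not treated as a
prefix, so it remains an ordinary instruction in this model.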



[PATCH -tip v3 1/2] x86: xen: insn: Decode Xen and KVM emulate-prefix signature

2019-09-05 Thread Masami Hiramatsu
Decode Xen and KVM's emulate-prefix signature in the x86 insn decoder.
It is called "prefix" but is actually not an x86 instruction prefix, so
this adds an insn.emulate_prefix_size field instead of reusing
insn.prefixes.

If the x86 decoder finds the special byte sequence of
XEN_EMULATE_PREFIX or 'ud2a; .ascii "kvm"', it just counts the
length, sets insn.emulate_prefix_size and folds it into the next
instruction. In other words, the signature and the next instruction
are treated as a single instruction.

Reported-by: Josh Poimboeuf 
Signed-off-by: Masami Hiramatsu 
Acked-by: Josh Poimboeuf 
---
 Changes in v3:
  - Fix perf's check script too.
 Changes in v2:
  - Generalize the emulate-prefix handling not only for Xen but KVM.
  - Introduce insn.emulate_prefix_size instead of using insn.prefixes.
---
 arch/x86/include/asm/insn.h |6 +
 arch/x86/include/asm/xen/interface.h|7 --
 arch/x86/include/asm/xen/prefix.h   |   10 +
 arch/x86/lib/insn.c |   36 +++
 tools/arch/x86/include/asm/insn.h   |6 +
 tools/arch/x86/include/asm/xen/prefix.h |   10 +
 tools/arch/x86/lib/insn.c   |   36 +++
 tools/objtool/sync-check.sh |3 ++-
 tools/perf/check-headers.sh |2 +-
 9 files changed, 112 insertions(+), 4 deletions(-)
 create mode 100644 arch/x86/include/asm/xen/prefix.h
 create mode 100644 tools/arch/x86/include/asm/xen/prefix.h

diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h
index 154f27be8bfc..5c1ae3eff9d4 100644
--- a/arch/x86/include/asm/insn.h
+++ b/arch/x86/include/asm/insn.h
@@ -45,6 +45,7 @@ struct insn {
struct insn_field immediate2;   /* for 64bit imm or seg16 */
};
 
+   int emulate_prefix_size;
insn_attr_t attr;
unsigned char opnd_bytes;
unsigned char addr_bytes;
@@ -128,6 +129,11 @@ static inline int insn_is_evex(struct insn *insn)
return (insn->vex_prefix.nbytes == 4);
 }
 
+static inline int insn_has_emulate_prefix(struct insn *insn)
+{
+   return !!insn->emulate_prefix_size;
+}
+
 /* Ensure this instruction is decoded completely */
 static inline int insn_complete(struct insn *insn)
 {
diff --git a/arch/x86/include/asm/xen/interface.h 
b/arch/x86/include/asm/xen/interface.h
index 62ca03ef5c65..fe33a9798708 100644
--- a/arch/x86/include/asm/xen/interface.h
+++ b/arch/x86/include/asm/xen/interface.h
@@ -379,12 +379,15 @@ struct xen_pmu_arch {
  * Prefix forces emulation of some non-trapping instructions.
  * Currently only CPUID.
  */
+#include 
+
 #ifdef __ASSEMBLY__
-#define XEN_EMULATE_PREFIX .byte 0x0f,0x0b,0x78,0x65,0x6e ;
+#define XEN_EMULATE_PREFIX .byte __XEN_EMULATE_PREFIX ;
 #define XEN_CPUID  XEN_EMULATE_PREFIX cpuid
 #else
-#define XEN_EMULATE_PREFIX ".byte 0x0f,0x0b,0x78,0x65,0x6e ; "
+#define XEN_EMULATE_PREFIX ".byte " __XEN_EMULATE_PREFIX_STR " ; "
 #define XEN_CPUID  XEN_EMULATE_PREFIX "cpuid"
+
 #endif
 
 #endif /* _ASM_X86_XEN_INTERFACE_H */
diff --git a/arch/x86/include/asm/xen/prefix.h 
b/arch/x86/include/asm/xen/prefix.h
new file mode 100644
index ..f901be0d7a95
--- /dev/null
+++ b/arch/x86/include/asm/xen/prefix.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _TOOLS_ASM_X86_XEN_PREFIX_H
+#define _TOOLS_ASM_X86_XEN_PREFIX_H
+
+#include 
+
+#define __XEN_EMULATE_PREFIX  0x0f,0x0b,0x78,0x65,0x6e
+#define __XEN_EMULATE_PREFIX_STR  __stringify(__XEN_EMULATE_PREFIX)
+
+#endif
diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c
index 0b5862ba6a75..b7eb50187db9 100644
--- a/arch/x86/lib/insn.c
+++ b/arch/x86/lib/insn.c
@@ -13,6 +13,9 @@
 #include 
 #include 
 
+/* For special Xen prefix */
+#include 
+
 /* Verify next sizeof(t) bytes can be on the same instruction */
 #define validate_next(t, insn, n)  \
((insn)->next_byte + sizeof(t) + n <= (insn)->end_kaddr)
@@ -58,6 +61,37 @@ void insn_init(struct insn *insn, const void *kaddr, int 
buf_len, int x86_64)
insn->addr_bytes = 4;
 }
 
+static const insn_byte_t xen_prefix[] = { __XEN_EMULATE_PREFIX };
+/* See handle_ud()@arch/x86/kvm/x86.c */
+static const insn_byte_t kvm_prefix[] = "\xf\xbkvm";
+
+static int __insn_get_emulate_prefix(struct insn *insn,
+const insn_byte_t *prefix, size_t len)
+{
+   size_t i;
+
+   for (i = 0; i < len; i++) {
+   if (peek_nbyte_next(insn_byte_t, insn, i) != prefix[i])
+   goto err_out;
+   }
+
+   insn->emulate_prefix_size = len;
+   insn->next_byte += len;
+
+   return 1;
+
+err_out:
+   return 0;
+}
+
+static void insn_get_emulate_prefix(struct insn *insn)
+{
+   if (__insn_get_emulate_prefix(insn, xen_prefix, sizeof(xen_prefix)))
+   return;
+
+   __insn_get_emulate_prefix(insn, kvm_prefix, sizeof(kvm_prefix));
+}
+
 /**
  * insn_get_prefixes - scan x86 
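
For readers following the thread, the signature check described in the
commit message can be sketched in userspace C. This is only a minimal
illustration, not the kernel decoder: match_sig(), emulate_prefix_size()
and can_probe() are hypothetical names, and the signature bytes are the
ones quoted in the patch (ud2a followed by "xen" or "kvm").

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Signature bytes as quoted in the patch: ud2a (0x0f 0x0b) + "xen" / "kvm". */
static const uint8_t xen_sig[] = { 0x0f, 0x0b, 0x78, 0x65, 0x6e };
static const uint8_t kvm_sig[] = { 0x0f, 0x0b, 'k', 'v', 'm' };

/* Return len if buf starts with sig, otherwise 0. */
static size_t match_sig(const uint8_t *buf, size_t buf_len,
                        const uint8_t *sig, size_t len)
{
    assert(buf && sig);
    if (buf_len < len || memcmp(buf, sig, len) != 0)
        return 0;
    return len;
}

/* Hypothetical helper: size of the emulate prefix at buf, or 0 if none. */
static size_t emulate_prefix_size(const uint8_t *buf, size_t buf_len)
{
    size_t n = match_sig(buf, buf_len, xen_sig, sizeof(xen_sig));

    if (!n)
        n = match_sig(buf, buf_len, kvm_sig, sizeof(kvm_sig));
    return n;
}

/* Same rule as the kprobes patch: refuse to probe a prefixed instruction. */
static int can_probe(const uint8_t *buf, size_t buf_len)
{
    return emulate_prefix_size(buf, buf_len) == 0;
}
```

The sketch spells both signatures out as byte arrays, so sizeof() counts
exactly five bytes each. With a string literal, sizeof() would also
count the terminating NUL.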

[PATCH v3 4/9] padata: make padata_do_parallel find alternate callback CPU

2019-09-05 Thread Daniel Jordan
padata_do_parallel currently returns -EINVAL if the callback CPU isn't
in the callback cpumask.

pcrypt tries to prevent this situation by keeping its own callback
cpumask in sync with padata's and checks that the callback CPU it passes
to padata is valid.  Make padata handle this instead.

padata_do_parallel now takes a pointer to the callback CPU and updates
it for the caller if an alternate CPU is used.  Overall behavior in
terms of which callback CPUs are chosen stays the same.

Prepares for removal of the padata cpumask notifier in pcrypt, which
will fix a lockdep complaint about nested acquisition of the CPU hotplug
lock later in the series.

Signed-off-by: Daniel Jordan 
Acked-by: Steffen Klassert 
Cc: Herbert Xu 
Cc: Lai Jiangshan 
Cc: Peter Zijlstra 
Cc: Tejun Heo 
Cc: linux-cry...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 crypto/pcrypt.c| 33 ++---
 include/linux/padata.h |  2 +-
 kernel/padata.c| 27 ---
 3 files changed, 23 insertions(+), 39 deletions(-)

diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
index d67293063c7f..efca962ab12a 100644
--- a/crypto/pcrypt.c
+++ b/crypto/pcrypt.c
@@ -57,35 +57,6 @@ struct pcrypt_aead_ctx {
unsigned int cb_cpu;
 };
 
-static int pcrypt_do_parallel(struct padata_priv *padata, unsigned int *cb_cpu,
- struct padata_pcrypt *pcrypt)
-{
-   unsigned int cpu_index, cpu, i;
-   struct pcrypt_cpumask *cpumask;
-
-   cpu = *cb_cpu;
-
-   rcu_read_lock_bh();
-   cpumask = rcu_dereference_bh(pcrypt->cb_cpumask);
-   if (cpumask_test_cpu(cpu, cpumask->mask))
-   goto out;
-
-   if (!cpumask_weight(cpumask->mask))
-   goto out;
-
-   cpu_index = cpu % cpumask_weight(cpumask->mask);
-
-   cpu = cpumask_first(cpumask->mask);
-   for (i = 0; i < cpu_index; i++)
-   cpu = cpumask_next(cpu, cpumask->mask);
-
-   *cb_cpu = cpu;
-
-out:
-   rcu_read_unlock_bh();
-   return padata_do_parallel(pcrypt->pinst, padata, cpu);
-}
-
 static int pcrypt_aead_setkey(struct crypto_aead *parent,
  const u8 *key, unsigned int keylen)
 {
@@ -157,7 +128,7 @@ static int pcrypt_aead_encrypt(struct aead_request *req)
   req->cryptlen, req->iv);
aead_request_set_ad(creq, req->assoclen);
 
-   err = pcrypt_do_parallel(padata, >cb_cpu, );
+   err = padata_do_parallel(pencrypt.pinst, padata, >cb_cpu);
if (!err)
return -EINPROGRESS;
 
@@ -199,7 +170,7 @@ static int pcrypt_aead_decrypt(struct aead_request *req)
   req->cryptlen, req->iv);
aead_request_set_ad(creq, req->assoclen);
 
-   err = pcrypt_do_parallel(padata, >cb_cpu, );
+   err = padata_do_parallel(pdecrypt.pinst, padata, >cb_cpu);
if (!err)
return -EINPROGRESS;
 
diff --git a/include/linux/padata.h b/include/linux/padata.h
index 839d9319920a..f7851f8e2190 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -154,7 +154,7 @@ struct padata_instance {
 extern struct padata_instance *padata_alloc_possible(const char *name);
 extern void padata_free(struct padata_instance *pinst);
 extern int padata_do_parallel(struct padata_instance *pinst,
- struct padata_priv *padata, int cb_cpu);
+ struct padata_priv *padata, int *cb_cpu);
 extern void padata_do_serial(struct padata_priv *padata);
 extern int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
  cpumask_var_t cpumask);
diff --git a/kernel/padata.c b/kernel/padata.c
index 58728cd7f40c..9a17922ec436 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -94,17 +94,19 @@ static void padata_parallel_worker(struct work_struct 
*parallel_work)
  *
  * @pinst: padata instance
  * @padata: object to be parallelized
- * @cb_cpu: cpu the serialization callback function will run on,
- *  must be in the serial cpumask of padata(i.e. cpumask.cbcpu).
+ * @cb_cpu: pointer to the CPU that the serialization callback function should
+ *  run on.  If it's not in the serial cpumask of @pinst
+ *  (i.e. cpumask.cbcpu), this function selects a fallback CPU and if
+ *  none found, returns -EINVAL.
  *
  * The parallelization callback function will run with BHs off.
  * Note: Every object which is parallelized by padata_do_parallel
  * must be seen by padata_do_serial.
  */
 int padata_do_parallel(struct padata_instance *pinst,
-  struct padata_priv *padata, int cb_cpu)
+  struct padata_priv *padata, int *cb_cpu)
 {
-   int target_cpu, err;
+   int i, cpu, cpu_index, target_cpu, err;
struct padata_parallel_queue *queue;
struct parallel_data *pd;
 
@@ -116,8 +118,19 @@ int padata_do_parallel(struct padata_instance *pinst,
if 
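
The fallback selection that moves from pcrypt into padata can be
sketched in userspace C as follows. This is a rough illustration under
stated assumptions, not the kernel code: a 64-bit integer stands in for
the callback cpumask, and mask_weight(), mask_next() and
select_cb_cpu() are hypothetical names. The steps mirror the removed
pcrypt_do_parallel(): keep the CPU if it is in the mask, otherwise pick
the (cpu % weight)-th set bit, and report -EINVAL if the mask is empty.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Count set bits in the mask (stand-in for cpumask_weight()). */
static int mask_weight(uint64_t mask)
{
    int n = 0;

    for (; mask; mask &= mask - 1)
        n++;
    return n;
}

/* Lowest set bit strictly above 'after' (-1 starts from bit 0); 64 if none. */
static int mask_next(int after, uint64_t mask)
{
    for (int cpu = after + 1; cpu < 64; cpu++)
        if (mask & (UINT64_C(1) << cpu))
            return cpu;
    return 64;
}

/* Keep *cb_cpu if allowed, else choose a fallback and update *cb_cpu. */
static int select_cb_cpu(uint64_t cb_mask, int *cb_cpu)
{
    int weight, cpu_index, cpu, i;

    assert(cb_cpu && *cb_cpu >= 0);

    if (*cb_cpu < 64 && (cb_mask & (UINT64_C(1) << *cb_cpu)))
        return 0;

    weight = mask_weight(cb_mask);
    if (!weight)
        return -EINVAL;

    /* Walk to the cpu_index-th set bit of the mask. */
    cpu_index = *cb_cpu % weight;
    cpu = mask_next(-1, cb_mask);
    for (i = 0; i < cpu_index; i++)
        cpu = mask_next(cpu, cb_mask);

    *cb_cpu = cpu;
    return 0;
}
```

Passing the CPU by pointer lets the caller learn which CPU was actually
chosen, which is what allows pcrypt to drop its own copy of the mask.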

Re: [Patch V8 7/8] usb: gadget: Add UDC driver for tegra XUSB device mode controller

2019-09-05 Thread Chunfeng Yun
On Thu, 2019-09-05 at 09:57 +0530, Nagarjuna Kristam wrote:
> 
> On 04-09-2019 16:00, Chunfeng Yun wrote:
> > On Wed, 2019-09-04 at 13:53 +0530, Nagarjuna Kristam wrote:
> >> This patch adds UDC driver for tegra XUSB 3.0 device mode controller.
> >> XUSB device mode controller supports SS, HS and FS modes
> >>
> >> Based on work by:
> >>   Mark Kuo 
> >>   Hui Fu 
> >>   Andrew Bresticker 
> >>
> >> Signed-off-by: Nagarjuna Kristam 
> >> Acked-by: Thierry Reding 
> >> ---
> >>  drivers/usb/gadget/udc/Kconfig  |   12 +
> >>  drivers/usb/gadget/udc/Makefile |1 +
> >>  drivers/usb/gadget/udc/tegra-xudc.c | 3791 
> >> +++
> >>  3 files changed, 3804 insertions(+)
> >>  create mode 100644 drivers/usb/gadget/udc/tegra-xudc.c
> >>
> >> diff --git a/drivers/usb/gadget/udc/Kconfig 
> >> b/drivers/usb/gadget/udc/Kconfig
> >> index d354036..0fe7577 100644
> >> --- a/drivers/usb/gadget/udc/Kconfig
> >> +++ b/drivers/usb/gadget/udc/Kconfig
> >> @@ -441,6 +441,18 @@ config USB_GADGET_XILINX
> >>  dynamically linked module called "udc-xilinx" and force all
> >>  gadget drivers to also be dynamically linked.
> >>  
> >> +config USB_TEGRA_XUDC
> >> +  tristate "NVIDIA Tegra Superspeed USB 3.0 Device Controller"
> >> +  depends on ARCH_TEGRA || COMPILE_TEST
> >> +  depends on PHY_TEGRA_XUSB
> >> +  select USB_CONN_GPIO
> > To me, needn't select this driver, without it, the driver still build
> > pass
> > 
> Yes compilation passes with out this. Added select for getting USB_CONN_GPIO
> driver functionality.
We can enable it in defconfig instead. If one day type-C or some other
mechanism does the dual-role switch, the select will no longer be needed.

> 
> >> +  help
> >> +   Enables NVIDIA Tegra USB 3.0 device mode controller driver.
> >> +
> >> +   Say "y" to link the driver statically, or "m" to build a
> >> +   dynamically linked module called "tegra_xudc" and force all
> >> +   gadget drivers to also be dynamically linked.
> >> +
> >>  source "drivers/usb/gadget/udc/aspeed-vhub/Kconfig"
> >>  




Re: [PATCH -tip v2 1/2] x86: xen: insn: Decode Xen and KVM emulate-prefix signature

2019-09-05 Thread Masami Hiramatsu
On Thu, 5 Sep 2019 20:13:50 -0500
Josh Poimboeuf  wrote:

> On Fri, Sep 06, 2019 at 09:50:19AM +0900, Masami Hiramatsu wrote:
> > --- a/tools/objtool/sync-check.sh
> > +++ b/tools/objtool/sync-check.sh
> > @@ -4,6 +4,7 @@
> >  FILES='
> >  arch/x86/include/asm/inat_types.h
> >  arch/x86/include/asm/orc_types.h
> > +arch/x86/include/asm/xen/prefix.h
> >  arch/x86/lib/x86-opcode-map.txt
> >  arch/x86/tools/gen-insn-attr-x86.awk
> >  '
> > @@ -46,6 +47,6 @@ done
> >  check arch/x86/include/asm/inat.h '-I "^#include 
> > [\"<]\(asm/\)*inat_types.h[\">]"'
> >  check arch/x86/include/asm/insn.h '-I "^#include 
> > [\"<]\(asm/\)*inat.h[\">]"'
> >  check arch/x86/lib/inat.c '-I "^#include 
> > [\"<]\(../include/\)*asm/insn.h[\">]"'
> > -check arch/x86/lib/insn.c '-I "^#include 
> > [\"<]\(../include/\)*asm/in\(at\|sn\).h[\">]"'
> > +check arch/x86/lib/insn.c '-I "^#include 
> > [\"<]\(../include/\)*asm/in\(at\|sn\).h[\">]" -I "^#include 
> > [\"<]\(../include/\)*asm/xen/prefix.h[\">]"'
> 
> Unfortunately perf also has a similar sync check script:
> tools/perf/check-headers.sh.  So you'll also need to add the above
> changes there.

Oops, I thought it was integrated... OK, I'll update this patch.

> 
> Otherwise
> 
> Acked-by: Josh Poimboeuf 

Thanks!

> 
> -- 
> Josh


-- 
Masami Hiramatsu 


[PATCH V2 2/2] clk: imx8mn: Use common 1443X/1416X PLL clock structure

2019-09-05 Thread Anson Huang
Use the common 1443X/1416X PLL clock structure to save a lot
of duplicated code in the i.MX8MN clock driver.

Signed-off-by: Anson Huang 
---
Changes since V1:
- Changes according to patch 1/2, now PLL table/structure is in pll14xx 
driver.
---
 drivers/clk/imx/clk-imx8mn.c  | 89 +--
 drivers/clk/imx/clk-pll14xx.c |  2 +
 2 files changed, 12 insertions(+), 79 deletions(-)

diff --git a/drivers/clk/imx/clk-imx8mn.c b/drivers/clk/imx/clk-imx8mn.c
index cc65c13..91b6da8 100644
--- a/drivers/clk/imx/clk-imx8mn.c
+++ b/drivers/clk/imx/clk-imx8mn.c
@@ -39,75 +39,6 @@ enum {
NR_PLLS,
 };
 
-static const struct imx_pll14xx_rate_table imx8mn_pll1416x_tbl[] = {
-   PLL_1416X_RATE(18U, 225, 3, 0),
-   PLL_1416X_RATE(16U, 200, 3, 0),
-   PLL_1416X_RATE(15U, 375, 3, 1),
-   PLL_1416X_RATE(14U, 350, 3, 1),
-   PLL_1416X_RATE(12U, 300, 3, 1),
-   PLL_1416X_RATE(10U, 250, 3, 1),
-   PLL_1416X_RATE(8U,  200, 3, 1),
-   PLL_1416X_RATE(75000U,  250, 2, 2),
-   PLL_1416X_RATE(7U,  350, 3, 2),
-   PLL_1416X_RATE(6U,  300, 3, 2),
-};
-
-static const struct imx_pll14xx_rate_table imx8mn_audiopll_tbl[] = {
-   PLL_1443X_RATE(393216000U, 262, 2, 3, 9437),
-   PLL_1443X_RATE(361267200U, 361, 3, 3, 17511),
-};
-
-static const struct imx_pll14xx_rate_table imx8mn_videopll_tbl[] = {
-   PLL_1443X_RATE(65000U, 325, 3, 2, 0),
-   PLL_1443X_RATE(59400U, 198, 2, 2, 0),
-};
-
-static const struct imx_pll14xx_rate_table imx8mn_drampll_tbl[] = {
-   PLL_1443X_RATE(65000U, 325, 3, 2, 0),
-};
-
-static struct imx_pll14xx_clk imx8mn_audio_pll = {
-   .type = PLL_1443X,
-   .rate_table = imx8mn_audiopll_tbl,
-   .rate_count = ARRAY_SIZE(imx8mn_audiopll_tbl),
-};
-
-static struct imx_pll14xx_clk imx8mn_video_pll = {
-   .type = PLL_1443X,
-   .rate_table = imx8mn_videopll_tbl,
-   .rate_count = ARRAY_SIZE(imx8mn_videopll_tbl),
-};
-
-static struct imx_pll14xx_clk imx8mn_dram_pll = {
-   .type = PLL_1443X,
-   .rate_table = imx8mn_drampll_tbl,
-   .rate_count = ARRAY_SIZE(imx8mn_drampll_tbl),
-};
-
-static struct imx_pll14xx_clk imx8mn_arm_pll = {
-   .type = PLL_1416X,
-   .rate_table = imx8mn_pll1416x_tbl,
-   .rate_count = ARRAY_SIZE(imx8mn_pll1416x_tbl),
-};
-
-static struct imx_pll14xx_clk imx8mn_gpu_pll = {
-   .type = PLL_1416X,
-   .rate_table = imx8mn_pll1416x_tbl,
-   .rate_count = ARRAY_SIZE(imx8mn_pll1416x_tbl),
-};
-
-static struct imx_pll14xx_clk imx8mn_vpu_pll = {
-   .type = PLL_1416X,
-   .rate_table = imx8mn_pll1416x_tbl,
-   .rate_count = ARRAY_SIZE(imx8mn_pll1416x_tbl),
-};
-
-static struct imx_pll14xx_clk imx8mn_sys_pll = {
-   .type = PLL_1416X,
-   .rate_table = imx8mn_pll1416x_tbl,
-   .rate_count = ARRAY_SIZE(imx8mn_pll1416x_tbl),
-};
-
 static const char * const pll_ref_sels[] = { "osc_24m", "dummy", "dummy", 
"dummy", };
 static const char * const audio_pll1_bypass_sels[] = {"audio_pll1", 
"audio_pll1_ref_sel", };
 static const char * const audio_pll2_bypass_sels[] = {"audio_pll2", 
"audio_pll2_ref_sel", };
@@ -409,16 +340,16 @@ static int imx8mn_clocks_probe(struct platform_device 
*pdev)
clks[IMX8MN_SYS_PLL2_REF_SEL] = imx_clk_mux("sys_pll2_ref_sel", base + 
0x104, 0, 2, pll_ref_sels, ARRAY_SIZE(pll_ref_sels));
clks[IMX8MN_SYS_PLL3_REF_SEL] = imx_clk_mux("sys_pll3_ref_sel", base + 
0x114, 0, 2, pll_ref_sels, ARRAY_SIZE(pll_ref_sels));
 
-   clks[IMX8MN_AUDIO_PLL1] = imx_clk_pll14xx("audio_pll1", 
"audio_pll1_ref_sel", base, _audio_pll);
-   clks[IMX8MN_AUDIO_PLL2] = imx_clk_pll14xx("audio_pll2", 
"audio_pll2_ref_sel", base + 0x14, _audio_pll);
-   clks[IMX8MN_VIDEO_PLL1] = imx_clk_pll14xx("video_pll1", 
"video_pll1_ref_sel", base + 0x28, _video_pll);
-   clks[IMX8MN_DRAM_PLL] = imx_clk_pll14xx("dram_pll", "dram_pll_ref_sel", 
base + 0x50, _dram_pll);
-   clks[IMX8MN_GPU_PLL] = imx_clk_pll14xx("gpu_pll", "gpu_pll_ref_sel", 
base + 0x64, _gpu_pll);
-   clks[IMX8MN_VPU_PLL] = imx_clk_pll14xx("vpu_pll", "vpu_pll_ref_sel", 
base + 0x74, _vpu_pll);
-   clks[IMX8MN_ARM_PLL] = imx_clk_pll14xx("arm_pll", "arm_pll_ref_sel", 
base + 0x84, _arm_pll);
-   clks[IMX8MN_SYS_PLL1] = imx_clk_pll14xx("sys_pll1", "sys_pll1_ref_sel", 
base + 0x94, _sys_pll);
-   clks[IMX8MN_SYS_PLL2] = imx_clk_pll14xx("sys_pll2", "sys_pll2_ref_sel", 
base + 0x104, _sys_pll);
-   clks[IMX8MN_SYS_PLL3] = imx_clk_pll14xx("sys_pll3", "sys_pll3_ref_sel", 
base + 0x114, _sys_pll);
+   clks[IMX8MN_AUDIO_PLL1] = imx_clk_pll14xx("audio_pll1", 
"audio_pll1_ref_sel", base, _1443x_pll);
+   clks[IMX8MN_AUDIO_PLL2] = imx_clk_pll14xx("audio_pll2", 

[PATCH V2 1/2] clk: imx8mm: Move 1443X/1416X PLL clock structure to common place

2019-09-05 Thread Anson Huang
Many i.MX8M SoCs use the same 1443X/1416X PLLs, such as i.MX8MM,
i.MX8MN and later i.MX8M SoCs. Moving these PLL definitions
to the pll14xx driver saves a lot of duplicated code on each
platform.

Meanwhile, there is no need to define a PLL clock structure for every
module that uses the same type of PLL: audio/video/dram use the
1443X PLL and arm/gpu/vpu/sys use the 1416X PLL, so defining two PLL
clock structures, one per group, is enough.

Signed-off-by: Anson Huang 
---
Changes since V1:
- Move 1443X/1416X PLL clock table/structure to pll14xx driver.
---
 drivers/clk/imx/clk-imx8mm.c  | 87 +--
 drivers/clk/imx/clk-pll14xx.c | 30 +++
 drivers/clk/imx/clk.h |  3 ++
 3 files changed, 43 insertions(+), 77 deletions(-)

diff --git a/drivers/clk/imx/clk-imx8mm.c b/drivers/clk/imx/clk-imx8mm.c
index 2758e3f..9649250 100644
--- a/drivers/clk/imx/clk-imx8mm.c
+++ b/drivers/clk/imx/clk-imx8mm.c
@@ -26,73 +26,6 @@ static u32 share_count_disp;
 static u32 share_count_pdm;
 static u32 share_count_nand;
 
-static const struct imx_pll14xx_rate_table imx8mm_pll1416x_tbl[] = {
-   PLL_1416X_RATE(18U, 225, 3, 0),
-   PLL_1416X_RATE(16U, 200, 3, 0),
-   PLL_1416X_RATE(12U, 300, 3, 1),
-   PLL_1416X_RATE(10U, 250, 3, 1),
-   PLL_1416X_RATE(8U,  200, 3, 1),
-   PLL_1416X_RATE(75000U,  250, 2, 2),
-   PLL_1416X_RATE(7U,  350, 3, 2),
-   PLL_1416X_RATE(6U,  300, 3, 2),
-};
-
-static const struct imx_pll14xx_rate_table imx8mm_audiopll_tbl[] = {
-   PLL_1443X_RATE(393216000U, 262, 2, 3, 9437),
-   PLL_1443X_RATE(361267200U, 361, 3, 3, 17511),
-};
-
-static const struct imx_pll14xx_rate_table imx8mm_videopll_tbl[] = {
-   PLL_1443X_RATE(65000U, 325, 3, 2, 0),
-   PLL_1443X_RATE(59400U, 198, 2, 2, 0),
-};
-
-static const struct imx_pll14xx_rate_table imx8mm_drampll_tbl[] = {
-   PLL_1443X_RATE(65000U, 325, 3, 2, 0),
-};
-
-static struct imx_pll14xx_clk imx8mm_audio_pll = {
-   .type = PLL_1443X,
-   .rate_table = imx8mm_audiopll_tbl,
-   .rate_count = ARRAY_SIZE(imx8mm_audiopll_tbl),
-};
-
-static struct imx_pll14xx_clk imx8mm_video_pll = {
-   .type = PLL_1443X,
-   .rate_table = imx8mm_videopll_tbl,
-   .rate_count = ARRAY_SIZE(imx8mm_videopll_tbl),
-};
-
-static struct imx_pll14xx_clk imx8mm_dram_pll = {
-   .type = PLL_1443X,
-   .rate_table = imx8mm_drampll_tbl,
-   .rate_count = ARRAY_SIZE(imx8mm_drampll_tbl),
-};
-
-static struct imx_pll14xx_clk imx8mm_arm_pll = {
-   .type = PLL_1416X,
-   .rate_table = imx8mm_pll1416x_tbl,
-   .rate_count = ARRAY_SIZE(imx8mm_pll1416x_tbl),
-};
-
-static struct imx_pll14xx_clk imx8mm_gpu_pll = {
-   .type = PLL_1416X,
-   .rate_table = imx8mm_pll1416x_tbl,
-   .rate_count = ARRAY_SIZE(imx8mm_pll1416x_tbl),
-};
-
-static struct imx_pll14xx_clk imx8mm_vpu_pll = {
-   .type = PLL_1416X,
-   .rate_table = imx8mm_pll1416x_tbl,
-   .rate_count = ARRAY_SIZE(imx8mm_pll1416x_tbl),
-};
-
-static struct imx_pll14xx_clk imx8mm_sys_pll = {
-   .type = PLL_1416X,
-   .rate_table = imx8mm_pll1416x_tbl,
-   .rate_count = ARRAY_SIZE(imx8mm_pll1416x_tbl),
-};
-
 static const char *pll_ref_sels[] = { "osc_24m", "dummy", "dummy", "dummy", };
 static const char *audio_pll1_bypass_sels[] = {"audio_pll1", 
"audio_pll1_ref_sel", };
 static const char *audio_pll2_bypass_sels[] = {"audio_pll2", 
"audio_pll2_ref_sel", };
@@ -396,16 +329,16 @@ static int imx8mm_clocks_probe(struct platform_device 
*pdev)
clks[IMX8MM_SYS_PLL2_REF_SEL] = imx_clk_mux("sys_pll2_ref_sel", base + 
0x104, 0, 2, pll_ref_sels, ARRAY_SIZE(pll_ref_sels));
clks[IMX8MM_SYS_PLL3_REF_SEL] = imx_clk_mux("sys_pll3_ref_sel", base + 
0x114, 0, 2, pll_ref_sels, ARRAY_SIZE(pll_ref_sels));
 
-   clks[IMX8MM_AUDIO_PLL1] = imx_clk_pll14xx("audio_pll1", 
"audio_pll1_ref_sel", base, _audio_pll);
-   clks[IMX8MM_AUDIO_PLL2] = imx_clk_pll14xx("audio_pll2", 
"audio_pll2_ref_sel", base + 0x14, _audio_pll);
-   clks[IMX8MM_VIDEO_PLL1] = imx_clk_pll14xx("video_pll1", 
"video_pll1_ref_sel", base + 0x28, _video_pll);
-   clks[IMX8MM_DRAM_PLL] = imx_clk_pll14xx("dram_pll", "dram_pll_ref_sel", 
base + 0x50, _dram_pll);
-   clks[IMX8MM_GPU_PLL] = imx_clk_pll14xx("gpu_pll", "gpu_pll_ref_sel", 
base + 0x64, _gpu_pll);
-   clks[IMX8MM_VPU_PLL] = imx_clk_pll14xx("vpu_pll", "vpu_pll_ref_sel", 
base + 0x74, _vpu_pll);
-   clks[IMX8MM_ARM_PLL] = imx_clk_pll14xx("arm_pll", "arm_pll_ref_sel", 
base + 0x84, _arm_pll);
-   clks[IMX8MM_SYS_PLL1] = imx_clk_pll14xx("sys_pll1", "sys_pll1_ref_sel", 
base + 0x94, _sys_pll);
-   clks[IMX8MM_SYS_PLL2] = imx_clk_pll14xx("sys_pll2", "sys_pll2_ref_sel", 
base + 0x104, _sys_pll);
-   
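
The sharing idea in this patch can be sketched in plain C: one
descriptor per PLL type, referenced by every clock of that type. This
is a simplified illustration, not the pll14xx driver. The types
pll_rate, pll_desc and pll_clock and the helper find_rate() are
hypothetical, and the table only carries the two audio rows quoted in
the diff above.

```c
#include <assert.h>
#include <stddef.h>

enum pll_type { PLL_1416X, PLL_1443X };

/* One rate entry: target rate plus the m/p/s/k divider settings. */
struct pll_rate {
    unsigned long rate;
    unsigned int m, p, s, k;
};

/* Descriptor shared by all PLLs of one type. */
struct pll_desc {
    enum pll_type type;
    const struct pll_rate *rates;
    size_t count;
};

/* The two audio rows quoted in the patch. */
static const struct pll_rate rates_1443x[] = {
    { 393216000UL, 262, 2, 3, 9437 },
    { 361267200UL, 361, 3, 3, 17511 },
};

static const struct pll_desc shared_1443x = {
    PLL_1443X, rates_1443x, sizeof(rates_1443x) / sizeof(rates_1443x[0]),
};

/* Every 1443X clock points at the same descriptor instead of its own copy. */
struct pll_clock {
    const char *name;
    const struct pll_desc *desc;
};

static const struct pll_clock clocks[] = {
    { "audio_pll1", &shared_1443x },
    { "audio_pll2", &shared_1443x },
    { "video_pll1", &shared_1443x },
};

/* Look up a rate entry in a descriptor; NULL if the rate is not listed. */
static const struct pll_rate *find_rate(const struct pll_desc *d,
                                        unsigned long rate)
{
    assert(d);
    for (size_t i = 0; i < d->count; i++)
        if (d->rates[i].rate == rate)
            return &d->rates[i];
    return NULL;
}
```

With this layout, adding another SoC that uses the same PLL type only
needs a new pointer to the shared descriptor, not a new table.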

[PATCH RESEND v3 2/5] KVM: LAPIC: Periodically revaluate to get conservative lapic_timer_advance_ns

2019-09-05 Thread Wanpeng Li
From: Wanpeng Li 

Even on realtime CPUs, cache line bounces, frequency scaling, the presence
of higher-priority RT tasks, etc. can still cause varying response times.
Take these interferences into account by periodically reevaluating whether
the lapic_timer_advance_ns value is still the best one: do nothing if it
is, otherwise recalculate it. Set lapic_timer_advance_ns to the minimal
conservative value of all the estimated values.

Testing on Skylake server, cat vcpu*/lapic_timer_advance_ns, before patch:
1628
4161
4321
3236
...

Testing on Skylake server, cat vcpu*/lapic_timer_advance_ns, after patch:
1553
1499
1509
1489
...

Testing on Haswell desktop, cat vcpu*/lapic_timer_advance_ns, before patch:
4617
3641
4102
4577
...
Testing on Haswell desktop, cat vcpu*/lapic_timer_advance_ns, after patch:
2775
2892
2764
2775
...

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/lapic.c | 37 ++---
 arch/x86/kvm/lapic.h |  2 ++
 2 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 2f4a48a..12ade70 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -70,6 +70,7 @@
 /* step-by-step approximation to mitigate fluctuation */
 #define LAPIC_TIMER_ADVANCE_ADJUST_STEP 8
 #define LAPIC_TIMER_ADVANCE_FILTER 1
+#define LAPIC_TIMER_ADVANCE_RECALC_PERIOD (600 * HZ)
 
 static inline int apic_test_vector(int vec, void *bitmap)
 {
@@ -1484,13 +1485,24 @@ static inline void __wait_lapic_expire(struct kvm_vcpu 
*vcpu, u64 guest_cycles)
 static inline void adjust_lapic_timer_advance(struct kvm_vcpu *vcpu,
  s64 advance_expire_delta)
 {
-   struct kvm_lapic *apic = vcpu->arch.apic;
-   u32 timer_advance_ns = apic->lapic_timer.timer_advance_ns, ns;
+   struct kvm_timer *ktimer = >arch.apic->lapic_timer;
+   u32 timer_advance_ns = ktimer->timer_advance_ns, ns;
 
if (abs(advance_expire_delta) > LAPIC_TIMER_ADVANCE_FILTER)
/* filter out drastic fluctuations */
return;
 
+   /* periodic revaluate */
+   if (unlikely(ktimer->timer_advance_adjust_done)) {
+   ktimer->recalc_timer_advance_ns = jiffies +
+   LAPIC_TIMER_ADVANCE_RECALC_PERIOD;
+   if (abs(advance_expire_delta) > 
LAPIC_TIMER_ADVANCE_ADJUST_DONE) {
+   timer_advance_ns = LAPIC_TIMER_ADVANCE_ADJUST_INIT;
+   ktimer->timer_advance_adjust_done = false;
+   } else
+   return;
+   }
+
/* too early */
if (advance_expire_delta < 0) {
ns = -advance_expire_delta * 100ULL;
@@ -1503,17 +1515,24 @@ static inline void adjust_lapic_timer_advance(struct 
kvm_vcpu *vcpu,
timer_advance_ns += ns;
}
 
-   timer_advance_ns = (apic->lapic_timer.timer_advance_ns *
+   timer_advance_ns = (ktimer->timer_advance_ns *
(LAPIC_TIMER_ADVANCE_ADJUST_STEP - 1) + advance_expire_delta) /
LAPIC_TIMER_ADVANCE_ADJUST_STEP;
 
if (abs(advance_expire_delta) < LAPIC_TIMER_ADVANCE_ADJUST_DONE)
-   apic->lapic_timer.timer_advance_adjust_done = true;
+   ktimer->timer_advance_adjust_done = true;
if (unlikely(timer_advance_ns > 5000)) {
timer_advance_ns = LAPIC_TIMER_ADVANCE_ADJUST_INIT;
-   apic->lapic_timer.timer_advance_adjust_done = false;
+   ktimer->timer_advance_adjust_done = false;
+   }
+
+   ktimer->timer_advance_ns = timer_advance_ns;
+
+   if (ktimer->timer_advance_adjust_done) {
+   if (ktimer->min_timer_advance_ns > timer_advance_ns)
+   ktimer->min_timer_advance_ns = timer_advance_ns;
+   ktimer->timer_advance_ns = ktimer->min_timer_advance_ns;
}
-   apic->lapic_timer.timer_advance_ns = timer_advance_ns;
 }
 
 static void __kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
@@ -1532,7 +1551,8 @@ static void __kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
if (guest_tsc < tsc_deadline)
__wait_lapic_expire(vcpu, tsc_deadline - guest_tsc);
 
-   if (unlikely(!apic->lapic_timer.timer_advance_adjust_done))
+   if (unlikely(!apic->lapic_timer.timer_advance_adjust_done) ||
+   time_before(apic->lapic_timer.recalc_timer_advance_ns, jiffies))
adjust_lapic_timer_advance(vcpu, 
apic->lapic_timer.advance_expire_delta);
 }
 
@@ -2310,9 +2330,12 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int 
timer_advance_ns)
if (timer_advance_ns == -1) {
apic->lapic_timer.timer_advance_ns = 
LAPIC_TIMER_ADVANCE_ADJUST_INIT;
apic->lapic_timer.timer_advance_adjust_done = false;
+   apic->lapic_timer.recalc_timer_advance_ns = jiffies;
+   apic->lapic_timer.min_timer_advance_ns = UINT_MAX;
} else {
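
The two ideas in this patch, a periodic recheck and keeping the minimum
of the settled estimates, can be sketched in userspace C. This is a
minimal illustration and not the KVM code: struct advance_state and the
helpers are hypothetical, and plain tick counters stand in for jiffies.
tick_before() uses the same signed-difference trick as the kernel's
time_before(), so it stays correct when the counter wraps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct advance_state {
    uint32_t min_ns;          /* smallest settled estimate seen so far */
    unsigned long recalc_at;  /* tick after which a recheck is due */
};

/* Wraparound-safe "a is before b". */
static bool tick_before(unsigned long a, unsigned long b)
{
    return (long)(a - b) < 0;
}

static void advance_state_init(struct advance_state *st, unsigned long now)
{
    assert(st);
    st->min_ns = UINT32_MAX;
    st->recalc_at = now;
}

/* True once the recheck time lies in the past. */
static bool recalc_due(const struct advance_state *st, unsigned long now)
{
    return tick_before(st->recalc_at, now);
}

/*
 * Record a settled estimate, schedule the next recheck and return the
 * conservative (minimum) value to use from now on.
 */
static uint32_t settle(struct advance_state *st, uint32_t estimate_ns,
                       unsigned long now, unsigned long period)
{
    if (estimate_ns < st->min_ns)
        st->min_ns = estimate_ns;
    st->recalc_at = now + period;
    return st->min_ns;
}
```

Returning the minimum rather than the latest estimate is what makes the
value conservative: a later, larger estimate never raises it.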

[PATCH RESEND v3 1/5] KVM: LAPIC: Tune lapic_timer_advance_ns smoothly

2019-09-05 Thread Wanpeng Li
From: Wanpeng Li 

Use a moving average of the per-vCPU lapic_timer_advance_ns to tune it
smoothly. Filter out drastic fluctuations, which prevented smooth tuning
before; the threshold is assumed to be LAPIC_TIMER_ADVANCE_FILTER cycles.

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/lapic.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index e904ff0..2f4a48a 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -69,6 +69,7 @@
 #define LAPIC_TIMER_ADVANCE_ADJUST_INIT 1000
 /* step-by-step approximation to mitigate fluctuation */
 #define LAPIC_TIMER_ADVANCE_ADJUST_STEP 8
+#define LAPIC_TIMER_ADVANCE_FILTER 1
 
 static inline int apic_test_vector(int vec, void *bitmap)
 {
@@ -1484,23 +1485,28 @@ static inline void adjust_lapic_timer_advance(struct 
kvm_vcpu *vcpu,
  s64 advance_expire_delta)
 {
struct kvm_lapic *apic = vcpu->arch.apic;
-   u32 timer_advance_ns = apic->lapic_timer.timer_advance_ns;
-   u64 ns;
+   u32 timer_advance_ns = apic->lapic_timer.timer_advance_ns, ns;
+
+   if (abs(advance_expire_delta) > LAPIC_TIMER_ADVANCE_FILTER)
+   /* filter out drastic fluctuations */
+   return;
 
/* too early */
if (advance_expire_delta < 0) {
ns = -advance_expire_delta * 100ULL;
do_div(ns, vcpu->arch.virtual_tsc_khz);
-   timer_advance_ns -= min((u32)ns,
-   timer_advance_ns / LAPIC_TIMER_ADVANCE_ADJUST_STEP);
+   timer_advance_ns -= ns;
} else {
/* too late */
ns = advance_expire_delta * 100ULL;
do_div(ns, vcpu->arch.virtual_tsc_khz);
-   timer_advance_ns += min((u32)ns,
-   timer_advance_ns / LAPIC_TIMER_ADVANCE_ADJUST_STEP);
+   timer_advance_ns += ns;
}
 
+   timer_advance_ns = (apic->lapic_timer.timer_advance_ns *
+   (LAPIC_TIMER_ADVANCE_ADJUST_STEP - 1) + advance_expire_delta) /
+   LAPIC_TIMER_ADVANCE_ADJUST_STEP;
+
if (abs(advance_expire_delta) < LAPIC_TIMER_ADVANCE_ADJUST_DONE)
apic->lapic_timer.timer_advance_adjust_done = true;
if (unlikely(timer_advance_ns > 5000)) {
-- 
2.7.4
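
As an aside, the smoothing step named in the commit message can be
sketched as a generic weighted moving average in userspace C. This is
an illustration of the technique only, not the exact arithmetic of the
patch: smooth_advance_ns() is a hypothetical helper, and the step of 8
in the usage note matches LAPIC_TIMER_ADVANCE_ADJUST_STEP.

```c
#include <assert.h>
#include <stdint.h>

/*
 * One smoothing step: keep (step - 1)/step of the old value and blend
 * in 1/step of the new sample. A larger step reacts more slowly.
 */
static uint32_t smooth_advance_ns(uint32_t old_ns, uint32_t sample_ns,
                                  uint32_t step)
{
    uint64_t sum;

    assert(step > 0);  /* guards the division below */
    sum = (uint64_t)old_ns * (step - 1) + sample_ns;
    return (uint32_t)(sum / step);
}
```

With a step of 8, an old value of 1000 and a new sample of 200 yield
(7 * 1000 + 200) / 8 = 900, so a single outlier moves the estimate by
only one eighth of the difference.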



[PATCH RESEND 4/5] KVM: VMX: Stop the preemption timer during vCPU reset

2019-09-05 Thread Wanpeng Li
From: Wanpeng Li 

The hrtimer used to emulate the lapic timer is stopped during
vCPU reset; the preemption timer should be stopped as well.

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/vmx/vmx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 570a233..f794929 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4162,6 +4162,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool 
init_event)
 
vcpu->arch.microcode_version = 0x1ULL;
vmx->vcpu.arch.regs[VCPU_REGS_RDX] = get_rdx_init_val();
+   vmx->hv_deadline_tsc = -1;
kvm_set_cr8(vcpu, 0);
 
if (!init_event) {
-- 
2.7.4



[PATCH RESEND 3/5] KVM: LAPIC: Micro optimize IPI latency

2019-09-05 Thread Wanpeng Li
From: Wanpeng Li 

This patch optimizes the virtual IPI emulation sequence:

    write ICR2              write ICR2
    write ICR               read ICR2
    read ICR        ==>     send virtual IPI
    read ICR2               write ICR
    send virtual IPI

This reduces the kvm-unit-tests/vmexit.flat IPI test latency (from the
sender sending the IPI to the sender receiving the ACK) from 3319 cycles
to 3203 cycles on a Skylake server.

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/lapic.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 12ade70..34fd299 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1200,10 +1200,8 @@ void kvm_apic_set_eoi_accelerated(struct kvm_vcpu *vcpu, 
int vector)
 }
 EXPORT_SYMBOL_GPL(kvm_apic_set_eoi_accelerated);
 
-static void apic_send_ipi(struct kvm_lapic *apic)
+static void apic_send_ipi(struct kvm_lapic *apic, u32 icr_low, u32 icr_high)
 {
-   u32 icr_low = kvm_lapic_get_reg(apic, APIC_ICR);
-   u32 icr_high = kvm_lapic_get_reg(apic, APIC_ICR2);
struct kvm_lapic_irq irq;
 
irq.vector = icr_low & APIC_VECTOR_MASK;
@@ -1940,8 +1938,9 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, 
u32 val)
}
case APIC_ICR:
/* No delay here, so we always clear the pending bit */
-   kvm_lapic_set_reg(apic, APIC_ICR, val & ~(1 << 12));
-   apic_send_ipi(apic);
+   val &= ~(1 << 12);
+   apic_send_ipi(apic, val, kvm_lapic_get_reg(apic, APIC_ICR2));
+   kvm_lapic_set_reg(apic, APIC_ICR, val);
break;
 
case APIC_ICR2:
-- 
2.7.4
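
The reordering can be made concrete with a small userspace model that
counts register reads. This is a sketch under stated assumptions, not
the KVM code: struct fake_apic and all helpers are hypothetical, and
"sending" an IPI just records the two ICR halves.

```c
#include <assert.h>
#include <stdint.h>

enum { REG_ICR, REG_ICR2, REG_COUNT };

struct fake_apic {
    uint32_t regs[REG_COUNT];
    unsigned int reads;            /* register reads performed so far */
    uint32_t sent_low, sent_high;  /* last IPI that was "sent" */
};

static uint32_t reg_read(struct fake_apic *a, int reg)
{
    a->reads++;
    return a->regs[reg];
}

static void reg_write(struct fake_apic *a, int reg, uint32_t val)
{
    a->regs[reg] = val;
}

static void send_ipi(struct fake_apic *a, uint32_t low, uint32_t high)
{
    a->sent_low = low;
    a->sent_high = high;
}

/* Old order: store ICR, then read both halves back before sending. */
static void icr_write_old(struct fake_apic *a, uint32_t val)
{
    assert(a);
    reg_write(a, REG_ICR, val & ~(1u << 12));
    send_ipi(a, reg_read(a, REG_ICR), reg_read(a, REG_ICR2));
}

/* New order: send from the value in hand, read only ICR2, store ICR last. */
static void icr_write_new(struct fake_apic *a, uint32_t val)
{
    assert(a);
    val &= ~(1u << 12);
    send_ipi(a, val, reg_read(a, REG_ICR2));
    reg_write(a, REG_ICR, val);
}
```

Both variants send the same IPI and leave the same ICR value behind.
The new one saves one register read and issues the IPI before the
store.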



[PATCH v2 5/5] KVM: hyperv: Fix Direct Synthetic timers assert an interrupt w/o lapic_in_kernel

2019-09-05 Thread Wanpeng Li
From: Wanpeng Li 

Reported by syzkaller:

kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault:  [#1] PREEMPT SMP KASAN
RIP: 0010:__apic_accept_irq+0x46/0x740 arch/x86/kvm/lapic.c:1029
Call Trace:
kvm_apic_set_irq+0xb4/0x140 arch/x86/kvm/lapic.c:558
stimer_notify_direct arch/x86/kvm/hyperv.c:648 [inline]
stimer_expiration arch/x86/kvm/hyperv.c:659 [inline]
kvm_hv_process_stimers+0x594/0x1650 arch/x86/kvm/hyperv.c:686
vcpu_enter_guest+0x2b2a/0x54b0 arch/x86/kvm/x86.c:7896
vcpu_run+0x393/0xd40 arch/x86/kvm/x86.c:8152
kvm_arch_vcpu_ioctl_run+0x636/0x900 arch/x86/kvm/x86.c:8360
kvm_vcpu_ioctl+0x6cf/0xaf0 
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2765

The testcase programs HV_X64_MSR_STIMERn_CONFIG/HV_X64_MSR_STIMERn_COUNT
while there is no lapic in the kernel. The counter values are small
enough that kvm_hv_process_stimers() injects the already-expired
timer interrupt into the guest through the in-kernel lapic, which triggers
the NULL dereference. Fix it by not advertising direct mode synthetic
timers and by discarding the injection when the lapic is not in the kernel.

Reported-by: syzbot+dff25ee91f0c7d5c1...@syzkaller.appspotmail.com
Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Cc: Vitaly Kuznetsov 
Signed-off-by: Wanpeng Li 
---
v1 -> v2:
 * don't advertise direct mode synthetic timers when lapic is not in kernel

 arch/x86/kvm/hyperv.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index c10a8b1..069e655 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -645,7 +645,9 @@ static int stimer_notify_direct(struct kvm_vcpu_hv_stimer 
*stimer)
.vector = stimer->config.apic_vector
};
 
-   return !kvm_apic_set_irq(vcpu, , NULL);
+   if (lapic_in_kernel(vcpu))
+   return !kvm_apic_set_irq(vcpu, , NULL);
+   return 0;
 }
 
 static void stimer_expiration(struct kvm_vcpu_hv_stimer *stimer)
@@ -1849,7 +1851,13 @@ int kvm_vcpu_ioctl_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 
ent->edx |= HV_FEATURE_FREQUENCY_MSRS_AVAILABLE;
ent->edx |= HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;
-   ent->edx |= HV_STIMER_DIRECT_MODE_AVAILABLE;
+
+   /*
+* Direct Synthetic timers only make sense with in-kernel
+* LAPIC
+*/
+   if (lapic_in_kernel(vcpu))
+   ent->edx |= HV_STIMER_DIRECT_MODE_AVAILABLE;
 
break;
 
-- 
2.7.4
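
For readers following the fix, the guard added to stimer_notify_direct()
can be modelled in user space roughly as below. This is only a sketch
under stated assumptions: struct lapic_model, struct vcpu_model,
lapic_in_kernel_model(), deliver_irq() and notify_direct() are
hypothetical stand-ins and not KVM code. It shows that delivery is
skipped, and 0 returned, when there is no LAPIC state to dereference.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-vCPU LAPIC state. */
struct lapic_model {
	int last_vector;
};

/* Hypothetical stand-in for a vCPU; apic is NULL without an in-kernel LAPIC. */
struct vcpu_model {
	struct lapic_model *apic;
};

static bool lapic_in_kernel_model(const struct vcpu_model *vcpu)
{
	return vcpu->apic != NULL;
}

/* Models interrupt delivery: dereferences apic unconditionally, returns 1 on success. */
static int deliver_irq(struct vcpu_model *vcpu, int vector)
{
	vcpu->apic->last_vector = vector;
	return 1;
}

/* Mirrors the patched control flow: 0 on success or when the injection is discarded. */
static int notify_direct(struct vcpu_model *vcpu, int vector)
{
	if (lapic_in_kernel_model(vcpu))
		return !deliver_irq(vcpu, vector);
	return 0;
}
```

Without the check, the NULL apic pointer would be dereferenced in
deliver_irq(), which is the model's analogue of the reported crash.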



Re: [PATCH] ASoC: fsl_sai: Implement set_bclk_ratio

2019-09-05 Thread Nicolin Chen
On Sat, Aug 31, 2019 at 12:59:10AM +0300, Daniel Baluta wrote:
> From: Viorel Suman 
> 
> This is to allow machine drivers to set a certain bitclk rate
> which might not be exactly rate * frame size.

Just a quick thought of mine: slot_width and slots could be
set via set_dai_tdm_slot() actually, while set_bclk_ratio()
would override that one with your change. I'm not sure which
one could be more important...so would you mind elaborating
your use case?

Thanks
Nicolin


> 
> Cc: NXP Linux Team 
> Signed-off-by: Viorel Suman 
> Signed-off-by: Daniel Baluta 
> ---
>  sound/soc/fsl/fsl_sai.c | 21 +++--
>  sound/soc/fsl/fsl_sai.h |  1 +
>  2 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
> index fe126029f4e3..e896b577b1f7 100644
> --- a/sound/soc/fsl/fsl_sai.c
> +++ b/sound/soc/fsl/fsl_sai.c
> @@ -137,6 +137,16 @@ static int fsl_sai_set_dai_tdm_slot(struct snd_soc_dai *cpu_dai, u32 tx_mask,
>   return 0;
>  }
>  
> +static int fsl_sai_set_dai_bclk_ratio(struct snd_soc_dai *dai,
> +   unsigned int ratio)
> +{
> + struct fsl_sai *sai = snd_soc_dai_get_drvdata(dai);
> +
> + sai->bclk_ratio = ratio;
> +
> + return 0;
> +}
> +
>  static int fsl_sai_set_dai_sysclk_tr(struct snd_soc_dai *cpu_dai,
>   int clk_id, unsigned int freq, int fsl_dir)
>  {
> @@ -423,8 +433,14 @@ static int fsl_sai_hw_params(struct snd_pcm_substream *substream,
>   slot_width = sai->slot_width;
>  
>   if (!sai->is_slave_mode) {
> - ret = fsl_sai_set_bclk(cpu_dai, tx,
> - slots * slot_width * params_rate(params));
> + if (sai->bclk_ratio)
> + ret = fsl_sai_set_bclk(cpu_dai, tx,
> +sai->bclk_ratio *
> +params_rate(params));
> + else
> + ret = fsl_sai_set_bclk(cpu_dai, tx,
> +slots * slot_width *
> +params_rate(params));
>   if (ret)
>   return ret;
>  
> @@ -640,6 +656,7 @@ static void fsl_sai_shutdown(struct snd_pcm_substream *substream,
>  }
>  
>  static const struct snd_soc_dai_ops fsl_sai_pcm_dai_ops = {
> + .set_bclk_ratio = fsl_sai_set_dai_bclk_ratio,
>   .set_sysclk = fsl_sai_set_dai_sysclk,
>   .set_fmt= fsl_sai_set_dai_fmt,
>   .set_tdm_slot   = fsl_sai_set_dai_tdm_slot,
> diff --git a/sound/soc/fsl/fsl_sai.h b/sound/soc/fsl/fsl_sai.h
> index 3a3f6f8e5595..f96f8d97489d 100644
> --- a/sound/soc/fsl/fsl_sai.h
> +++ b/sound/soc/fsl/fsl_sai.h
> @@ -177,6 +177,7 @@ struct fsl_sai {
>   unsigned int mclk_streams;
>   unsigned int slots;
>   unsigned int slot_width;
> + unsigned int bclk_ratio;
>  
>   const struct fsl_sai_soc_data *soc_data;
>   struct snd_dmaengine_dai_dma_data dma_params_rx;
> -- 
> 2.17.1
> 
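
To make the override question concrete, the bit clock selection in the
quoted hw_params hunk can be sketched as below. This is a minimal model,
assuming a hypothetical struct sai_model and helper sai_bclk_hz() that
only mirror the three fields discussed here; it is not the driver code.

```c
#include <stdint.h>

/* Hypothetical subset of the driver state discussed in this thread. */
struct sai_model {
	unsigned int slots;
	unsigned int slot_width;
	unsigned int bclk_ratio;	/* 0 means "not set by the machine driver" */
};

/* Returns the bit clock in Hz for the given sample rate. */
static uint64_t sai_bclk_hz(const struct sai_model *sai, unsigned int rate)
{
	/* A non-zero ratio takes precedence over the TDM slot configuration. */
	if (sai->bclk_ratio)
		return (uint64_t)sai->bclk_ratio * rate;
	return (uint64_t)sai->slots * sai->slot_width * rate;
}
```

With slots = 2 and slot_width = 32 at 48000 Hz the model yields
3072000 Hz, while a ratio of 50 yields 2400000 Hz regardless of what
set_dai_tdm_slot() configured, which is the precedence being asked about.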


[PATCH] ASoC: bdw-rt5677: channel constraint support

2019-09-05 Thread Brent Lu
BDW boards using this machine driver support only stereo capture and
playback. Implement a constraint to enforce it.

Signed-off-by: Brent Lu 
---
 sound/soc/intel/boards/bdw-rt5677.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/sound/soc/intel/boards/bdw-rt5677.c b/sound/soc/intel/boards/bdw-rt5677.c
index 4a4d335..a312b55 100644
--- a/sound/soc/intel/boards/bdw-rt5677.c
+++ b/sound/soc/intel/boards/bdw-rt5677.c
@@ -22,6 +22,8 @@
 
 #include "../../codecs/rt5677.h"
 
+#define DUAL_CHANNEL 2
+
 struct bdw_rt5677_priv {
struct gpio_desc *gpio_hp_en;
struct snd_soc_component *component;
@@ -245,6 +247,36 @@ static int bdw_rt5677_init(struct snd_soc_pcm_runtime *rtd)
return 0;
 }
 
+static const unsigned int channels[] = {
+   DUAL_CHANNEL,
+};
+
+static const struct snd_pcm_hw_constraint_list constraints_channels = {
+   .count = ARRAY_SIZE(channels),
+   .list = channels,
+   .mask = 0,
+};
+
+static int bdw_fe_startup(struct snd_pcm_substream *substream)
+{
+   struct snd_pcm_runtime *runtime = substream->runtime;
+
+   /*
+* On this platform for PCM device we support,
+* stereo
+*/
+
+   runtime->hw.channels_max = DUAL_CHANNEL;
+   snd_pcm_hw_constraint_list(runtime, 0, SNDRV_PCM_HW_PARAM_CHANNELS,
+  &constraints_channels);
+
+   return 0;
+}
+
+static const struct snd_soc_ops bdw_rt5677_fe_ops = {
+   .startup = bdw_fe_startup,
+};
+
 /* broadwell digital audio interface glue - connects codec <--> CPU */
 SND_SOC_DAILINK_DEF(dummy,
DAILINK_COMP_ARRAY(COMP_DUMMY()));
@@ -273,6 +305,7 @@ static struct snd_soc_dai_link bdw_rt5677_dais[] = {
},
.dpcm_capture = 1,
.dpcm_playback = 1,
+   .ops = &bdw_rt5677_fe_ops,
SND_SOC_DAILINK_REG(fe, dummy, platform),
},
 
-- 
2.7.4
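
As an illustration of what a list constraint on the channel count
expresses, here is a small user-space sketch. It assumes a hypothetical
helper channels_allowed() and does not call the ALSA API; it only checks
a requested channel count against the same one-entry list as the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define DUAL_CHANNEL 2

/* Same contents as the channels[] list in the patch. */
static const unsigned int channels[] = {
	DUAL_CHANNEL,
};

/* Hypothetical helper: true if the requested channel count is in the list. */
static bool channels_allowed(unsigned int requested)
{
	for (size_t i = 0; i < sizeof(channels) / sizeof(channels[0]); i++) {
		if (channels[i] == requested)
			return true;
	}
	return false;
}
```

Under this model mono and multi-channel requests are rejected and only
stereo passes, which matches the intent stated in the commit message.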



Re: [PATCH] Revert "net: get rid of an signed integer overflow in ip_idents_reserve()"

2019-09-05 Thread Shaokun Zhang
Hi Eric,

On 2019/7/26 17:58, Eric Dumazet wrote:
> 
> 
> On 7/26/19 11:17 AM, Shaokun Zhang wrote:
>> From: Yang Guo 
>>
>> There is a significant performance regression with the following
>> commit-id 
>> ("net: get rid of an signed integer overflow in ip_idents_reserve()").
>>
>>
> 
> So, you jump around and took ownership of this issue, while some of us
> are already working on it ?
> 

Any update about this issue?

Thanks,
Shaokun

> Have you first checked that current UBSAN versions will not complain anymore ?
> 
> A revert adding back the original issue would be silly, performance of
> benchmarks is nice but secondary.
> 
> 
> 



Re: [PATCH] ASoC: fsl_sai: Fix noise when using EDMA

2019-09-05 Thread Nicolin Chen
On Fri, Aug 30, 2019 at 11:09:00PM +0300, Daniel Baluta wrote:
> From: Mihai Serban 
> 
> EDMA requires the period size to be multiple of maxburst. Otherwise the
> remaining bytes are not transferred and thus noise is produced.
> 
> We can handle this issue by adding a constraint on
> SNDRV_PCM_HW_PARAM_PERIOD_SIZE to be multiple of tx/rx maxburst value.
> 
> Cc: NXP Linux Team 
> Signed-off-by: Mihai Serban 
> Signed-off-by: Daniel Baluta 
> ---
>  sound/soc/fsl/fsl_sai.c | 15 +++
>  sound/soc/fsl/fsl_sai.h |  1 +
>  2 files changed, 16 insertions(+)
> 
> diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
> index 728307acab90..fe126029f4e3 100644
> --- a/sound/soc/fsl/fsl_sai.c
> +++ b/sound/soc/fsl/fsl_sai.c
> @@ -612,6 +612,16 @@ static int fsl_sai_startup(struct snd_pcm_substream *substream,
>  FSL_SAI_CR3_TRCE_MASK,
>  FSL_SAI_CR3_TRCE);
>  
> + /*
> +  * some DMA controllers need period size to be a multiple of
> +  * tx/rx maxburst
> +  */
> + if (sai->soc_data->use_constraint_period_size)
> + snd_pcm_hw_constraint_step(substream->runtime, 0,
> +SNDRV_PCM_HW_PARAM_PERIOD_SIZE,
> +tx ? sai->dma_params_tx.maxburst :
> +sai->dma_params_rx.maxburst);

I feel that PERIOD_SIZE could be used for some other cases than
being related to maxburst
  
>  static const struct of_device_id fsl_sai_ids[] = {
> diff --git a/sound/soc/fsl/fsl_sai.h b/sound/soc/fsl/fsl_sai.h
> index b89b0ca26053..3a3f6f8e5595 100644
> --- a/sound/soc/fsl/fsl_sai.h
> +++ b/sound/soc/fsl/fsl_sai.h
> @@ -157,6 +157,7 @@
>  
>  struct fsl_sai_soc_data {
>   bool use_imx_pcm;
> + bool use_constraint_period_size;

so maybe the soc specific flag here could be something like
bool use_edma;

What do you think?
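
For reference, the effect of restricting the period size to a multiple
of maxburst can be sketched as below. The helpers period_remainder() and
period_round_down() are hypothetical and only illustrate the arithmetic;
they are not part of the driver or of ALSA, and units are whatever the
constraint in the patch uses.

```c
/* Amount left over when period_size is not a multiple of maxburst. */
static unsigned int period_remainder(unsigned int period_size,
				     unsigned int maxburst)
{
	return maxburst ? period_size % maxburst : 0;
}

/* Largest multiple of maxburst that does not exceed period_size. */
static unsigned int period_round_down(unsigned int period_size,
				      unsigned int maxburst)
{
	if (maxburst == 0)
		return period_size;
	return period_size - (period_size % maxburst);
}
```

For example, a period size of 1000 with maxburst 6 leaves a remainder
of 4, which is the untransferred tail described in the commit message;
a step constraint would only admit sizes such as 996 or 1002.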


RE: [PATCH 1/4] softirq: implement IRQ flood detection mechanism

2019-09-05 Thread Long Li
>Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
>
>
>Hi Ming,
>
>On 05/09/2019 11:06, Ming Lei wrote:
>> On Wed, Sep 04, 2019 at 07:31:48PM +0200, Daniel Lezcano wrote:
>>> Hi,
>>>
>>> On 04/09/2019 19:07, Bart Van Assche wrote:
>>>> On 9/3/19 12:50 AM, Daniel Lezcano wrote:
>>>>> On 03/09/2019 09:28, Ming Lei wrote:
>>>>>> On Tue, Sep 03, 2019 at 08:40:35AM +0200, Daniel Lezcano wrote:
>>>>>>> It is a scheduler problem then ?
>>>>>>
>>>>>> Scheduler can do nothing if the CPU is taken completely by
>>>>>> handling interrupt & softirq, so seems not a scheduler problem, IMO.
>>>>>
>>>>> Why? If there is a irq pressure on one CPU reducing its capacity,
>>>>> the scheduler will balance the tasks on another CPU, no?
>>>>
>>>> Only if CONFIG_IRQ_TIME_ACCOUNTING has been enabled. However, I
>>>> don't know any Linux distro that enables that option. That's
>>>> probably because that option introduces two rdtsc() calls in each
>>>> interrupt. Given the overhead introduced by this option, I don't
>>>> think this is the solution Ming is looking for.
>>>
>>> Was this overhead reported somewhere ?
>>
>> The syscall of gettimeofday() calls ktime_get_real_ts64() which
>> finally calls tk_clock_read() which calls rdtsc too.
>>
>> But gettimeofday() is often used in fast path, and block IO_STAT needs
>> to read it too.
>>
>>>
>>>> See also irqtime_account_irq() in kernel/sched/cputime.c.
>>>
>>> From my POV, this framework could be interesting to detect this situation.
>>
>> Now we are talking about IRQ_TIME_ACCOUNTING instead of IRQ_TIMINGS,
>> and the former one could be used to implement the detection. And the
>> only sharing should be the read of timestamp.
>
>You did not share yet the analysis of the problem (the kernel warnings give
>the symptoms) and gave the reasoning for the solution. It is hard to
>understand what you are looking for exactly and how to connect the dots.
>
>AFAIU, there are fast medium where the responses to requests are faster
>than the time to process them, right?
>
>I don't see how detecting IRQ flooding and use a threaded irq is the solution,
>can you explain?
>
>If the responses are coming at a very high rate, whatever the solution
>(interrupts, threaded interrupts, polling), we are still in the same situation.
>
>My suggestion was initially to see if the interrupt load will be taken into
>accounts in the cpu load and favorize task migration with the scheduler load
>balance to a less loaded CPU, thus the CPU processing interrupts will end up
>doing only that while other CPUs will handle the "threaded" side.
>
>Beside that, I'm wondering if the block scheduler should be somehow
>involved in that [1]
>
>  -- Daniel

Hi Daniel

I want to share some test results with IRQ_TIME_ACCOUNTING. During tests, the 
kernel had warnings on RCU stall and soft lockup:

An example of an RCU stall:
[ 3016.148250] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 3016.148299] rcu: 66-: (1 GPs behind) idle=cc2/0/0x3 softirq=10037/10037 fqs=4717
[ 3016.148299]  (detected by 27, t=15011 jiffies, g=45173, q=17194)
[ 3016.148299] Sending NMI from CPU 27 to CPUs 66:
[ 3016.148299] NMI backtrace for cpu 66
[ 3016.148299] CPU: 66 PID: 0 Comm: swapper/66 Tainted: G L 5.3.0-rc6+ #68
[ 3016.148299] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  05/18/2018
[ 3016.148299] RIP: 0010:0x9c4740013003
[ 3016.148299] Code: Bad RIP value.
[ 3016.148299] RSP: 0018:9c4759acc8d0 EFLAGS: 0046
[ 3016.148299] RAX:  RBX: 0080 RCX: 0001000b
[ 3016.148299] RDX: 00fb RSI: 9c4740013000 RDI: 9c4759acc920
[ 3016.148299] RBP: 9c4759acc920 R08: 0008 R09: 01484a845c6de350
[ 3016.148299] R10: 9c4759accd30 R11: 0001 R12: 00fb
[ 3016.148299] R13: 0042 R14: 8a7d9b771f80 R15: 01e1
[ 3016.148299] FS:  () GS:8afd9f88() knlGS:
[ 3016.148299] CS:  0010 DS:  ES:  CR0: 80050033
[ 3016.148299] CR2: 9c4740012fd9 CR3: 00208b9bc000 CR4: 003406e0
[ 3016.148299] Call Trace:
[ 3016.148299]  
[ 3016.148299]  ? __send_ipi_mask+0x145/0x2e0
[ 3016.148299]  ? __send_ipi_one+0x3a/0x60
[ 3016.148299]  ? hv_send_ipi+0x10/0x30
[ 3016.148299]  ? generic_exec_single+0x63/0xe0
[ 3016.148299]  ? smp_call_function_single_async+0x1f/0x40
[ 3016.148299]  ? blk_mq_complete_request+0xdf/0xf0
[ 3016.148299]  ? nvme_irq+0x144/0x240 [nvme]
[ 3016.148299]  ? tick_sched_do_timer+0x80/0x80
[ 3016.148299]  ? __handle_irq_event_percpu+0x40/0x190
[ 3016.148299]  ? handle_irq_event_percpu+0x30/0x70
[ 3016.148299]  ? handle_irq_event+0x36/0x60
[ 3016.148299]  ? handle_edge_irq+0x7e/0x190
[ 3016.148299]  ? handle_irq+0x1c/0x30
[ 3016.148299]  ? do_IRQ+0x49/0xd0
[ 3016.148299]  ? common_interrupt+0xf/0xf
[ 3016.148299]  ? common_interrupt+0xa/0xf
[ 3016.148299]  ? __do_softirq+0x76/0x2e3

Re: [PATCH v2] mm: emit tracepoint when RSS changes by threshold

2019-09-05 Thread Daniel Colascione
On Thu, Sep 5, 2019 at 5:59 PM Joel Fernandes  wrote:
> On Thu, Sep 05, 2019 at 10:50:27AM -0700, Daniel Colascione wrote:
> > On Thu, Sep 5, 2019 at 10:35 AM Steven Rostedt  wrote:
> > > On Thu, 5 Sep 2019 09:03:01 -0700
> > > Suren Baghdasaryan  wrote:
> > >
> > > > On Thu, Sep 5, 2019 at 7:43 AM Michal Hocko  wrote:
> > > > >
> > > > > [Add Steven]
> > > > >
> > > > > On Wed 04-09-19 12:28:08, Joel Fernandes wrote:
> > > > > > > On Wed, Sep 4, 2019 at 11:38 AM Michal Hocko wrote:
> > > > > > > >
> > > > > > > > On Wed 04-09-19 11:32:58, Joel Fernandes wrote:
> > > > > > [...]
> > > > > > > > > but also for reducing
> > > > > > > > > tracing noise. Flooding the traces makes it less useful for long traces and
> > > > > > > > > post-processing of traces. IOW, the overhead reduction is a bonus.
> > > > > > > >
> > > > > > > > This is not really anything special for this tracepoint though.
> > > > > > > > Basically any tracepoint in a hot path is in the same situation and I do
> > > > > > > > not see a point why each of them should really invent its own way to
> > > > > > > > throttle. Maybe there is some way to do that in the tracing subsystem
> > > > > > > > directly.
> > > > > > >
> > > > > > > I am not sure if there is a way to do this easily. Add to that, the fact that
> > > > > > > you still have to call into trace events. Why call into it at all, if you can
> > > > > > > filter in advance and have a sane filtering default?
> > > > > > >
> > > > > > > The bigger improvement with the threshold is the number of trace records are
> > > > > > > almost halved by using a threshold. The number of records went from 4.6K to
> > > > > > > 2.6K.
> > > > >
> > > > > Steven, would it be feasible to add a generic tracepoint throttling?
> > > >
> > > > I might misunderstand this but is the issue here actually throttling
> > > > of the sheer number of trace records or tracing large enough changes
> > > > to RSS that user might care about? Small changes happen all the time
> > > > but we are likely not interested in those. Surely we could postprocess
> > > > the traces to extract changes large enough to be interesting but why
> > > > capture uninteresting information in the first place? IOW the
> > > > throttling here should be based not on the time between traces but on
> > > > the amount of change of the traced signal. Maybe a generic facility
> > > > like that would be a good idea?
> > >
> > > You mean like add a trigger (or filter) that only traces if a field has
> > > changed since the last time the trace was hit? Hmm, I think we could
> > > possibly do that. Perhaps even now with histogram triggers?
> >
> > I was thinking along the same lines. The histogram subsystem seems
> > like a very good fit here. Histogram triggers already let users talk
> > about specific fields of trace events, aggregate them in configurable
> > ways, and (importantly, IMHO) create synthetic new trace events that
> > the kernel emits under configurable conditions.
>
> Hmm, I think this tracing feature will be a good idea. But in order not to
> gate this patch, can we agree on keeping a temporary threshold for this
> patch? Once such an idea is implemented in the trace subsystem, we can remove
> the temporary filter.
>
> As Tim said, we don't want our traces flooded and this is a very useful
> tracepoint as proven in our internal usage at Android. The threshold filter
> is just a few lines of code.

I'm not sure the threshold filtering code you've added does the right
thing: we don't keep state, so if a counter constantly flips between
one "side" of the TRACE_MM_COUNTER_THRESHOLD and the other, we'll emit
ftrace events at high frequency. More generally, this filtering
couples the rate of counter logging to the *value* of the counter ---
that is, we log ftrace events at different times depending on how much
memory we happen to have used --- and that's not ideal from a
predictability POV.

All things being equal, I'd prefer that we get things upstream as fast
as possible. But in this case, I'd rather wait for a general-purpose
filtering facility (whether that facility is based on histogram, eBPF,
or something else) rather than hardcode one particular fixed filtering
strategy (which might be suboptimal) for one particular kind of event.
Is there some special urgency here?

How about we instead add non-filtered tracepoints for the mm counters?
These tracepoints will still be free when turned off.

Having added the basic tracepoints, we can discuss separately how to
do the rate limiting. Maybe instead of providing direct support for
the algorithm that I described above, we can just use a BPF program as
a yes/no predicate for whether to log to ftrace. That'd get us to the
same place as this patch, but more flexibly, right?
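
The concern about a stateless threshold can be illustrated with a small
user-space model. This is only a sketch: THRESHOLD, the bucket test in
emit_stateless() and the delta test in emit_stateful() are assumptions
made for illustration and are not taken from the patch. The model
assumes non-negative counter values.

```c
#include <stdbool.h>
#include <stdlib.h>

#define THRESHOLD 128L	/* hypothetical value, not taken from the patch */

/* Stateless: emit when old and new values fall in different THRESHOLD-sized buckets. */
static bool emit_stateless(long old_val, long new_val)
{
	return (old_val / THRESHOLD) != (new_val / THRESHOLD);
}

/* Stateful: remembers the value that was last emitted. */
struct delta_filter {
	long last_emitted;
};

/* Emit only when the value moved at least THRESHOLD away from the last emitted one. */
static bool emit_stateful(struct delta_filter *f, long new_val)
{
	if (labs(new_val - f->last_emitted) < THRESHOLD)
		return false;
	f->last_emitted = new_val;
	return true;
}

/* Counts stateless events while a counter flips between lo and hi. */
static int count_stateless(long lo, long hi, int flips)
{
	int events = 0;
	long cur = lo;

	for (int i = 0; i < flips; i++) {
		long next = (cur == lo) ? hi : lo;

		events += emit_stateless(cur, next);
		cur = next;
	}
	return events;
}

/* Counts stateful events for the same flipping counter. */
static int count_stateful(long lo, long hi, int flips)
{
	struct delta_filter f = { .last_emitted = lo };
	int events = 0;
	long cur = lo;

	for (int i = 0; i < flips; i++) {
		cur = (cur == lo) ? hi : lo;
		events += emit_stateful(&f, cur);
	}
	return events;
}
```

A counter flipping between 127 and 128 produces one event per flip in
the stateless model and none in the stateful one, even though the value
barely changes. That is the kind of value-dependent logging rate a
general-purpose filter could avoid.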


Re: [PATCH -tip v2 1/2] x86: xen: insn: Decode Xen and KVM emulate-prefix signature

2019-09-05 Thread Josh Poimboeuf
On Fri, Sep 06, 2019 at 09:50:19AM +0900, Masami Hiramatsu wrote:
> --- a/tools/objtool/sync-check.sh
> +++ b/tools/objtool/sync-check.sh
> @@ -4,6 +4,7 @@
>  FILES='
>  arch/x86/include/asm/inat_types.h
>  arch/x86/include/asm/orc_types.h
> +arch/x86/include/asm/xen/prefix.h
>  arch/x86/lib/x86-opcode-map.txt
>  arch/x86/tools/gen-insn-attr-x86.awk
>  '
> @@ -46,6 +47,6 @@ done
>  check arch/x86/include/asm/inat.h '-I "^#include [\"<]\(asm/\)*inat_types.h[\">]"'
>  check arch/x86/include/asm/insn.h '-I "^#include [\"<]\(asm/\)*inat.h[\">]"'
>  check arch/x86/lib/inat.c '-I "^#include [\"<]\(../include/\)*asm/insn.h[\">]"'
> -check arch/x86/lib/insn.c '-I "^#include [\"<]\(../include/\)*asm/in\(at\|sn\).h[\">]"'
> +check arch/x86/lib/insn.c '-I "^#include [\"<]\(../include/\)*asm/in\(at\|sn\).h[\">]" -I "^#include [\"<]\(../include/\)*asm/xen/prefix.h[\">]"'

Unfortunately perf also has a similar sync check script:
tools/perf/check-headers.sh.  So you'll also need to add the above
changes there.

Otherwise

Acked-by: Josh Poimboeuf 

-- 
Josh


Re: [PATCH] xfs: Use WARN_ON_ONCE for bailout mount-operation

2019-09-05 Thread Darrick J. Wong
On Fri, Aug 30, 2019 at 10:41:10AM +0900, Austin Kim wrote:
> If CONFIG_BUG is enabled, BUG() is executed and the system crashes,
> so the bailout path for the failed mount never gets to run.
> 
> Using WARN_ON_ONCE rather than BUG can prevent this situation.
> 
> Signed-off-by: Austin Kim 

Looks ok,
Reviewed-by: Darrick J. Wong 

--D

> ---
>  fs/xfs/xfs_mount.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 322da69..3ab2acf 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -214,7 +214,7 @@ xfs_initialize_perag(
>  
>   spin_lock(>m_perag_lock);
>   if (radix_tree_insert(>m_perag_tree, index, pag)) {
> - BUG();
> + WARN_ON_ONCE(1);
>   spin_unlock(>m_perag_lock);
>   radix_tree_preload_end();
>   error = -EEXIST;
> -- 
> 2.6.2
> 


RE: [PATCH 1/2] clk: imx8mm: Move 1443X/1416X PLL clock structure to common place

2019-09-05 Thread Anson Huang
Hi, Leonard

> On 05.09.2019 12:59, Anson Huang wrote:
> > Many i.MX8M SoCs use same 1443X/1416X PLL, such as i.MX8MM, i.MX8MN
> > and later i.MX8M SoCs, moving these PLL definitions to common place
> > can save a lot of duplicated code on each platform.
> 
> There are lots of similarities between imx8m clocks, do you plan to
> combine them further?

I will consider it later, maybe we can create a new clock file named clk-imx8m.c
as common clock for i.MX8M SoCs which are similar.

> 
> > Meanwhile, no need to define PLL clock structure for every module
> > which uses same type of PLL, e.g., audio/video/dram use 1443X PLL,
> > arm/gpu/vpu/sys use 1416X PLL, define 2 PLL clock structure for each
> > group is enough.
> 
> > diff --git a/drivers/clk/imx/clk.c b/drivers/clk/imx/clk.c
> 
> > +const struct imx_pll14xx_rate_table imx_pll1416x_tbl[] = {
> > +   PLL_1416X_RATE(18U, 225, 3, 0),
> > +   PLL_1416X_RATE(16U, 200, 3, 0),
> > +   PLL_1416X_RATE(12U, 300, 3, 1),
> > +   PLL_1416X_RATE(10U, 250, 3, 1),
> > +   PLL_1416X_RATE(8U,  200, 3, 1),
> > +   PLL_1416X_RATE(75000U,  250, 2, 2),
> > +   PLL_1416X_RATE(7U,  350, 3, 2),
> > +   PLL_1416X_RATE(6U,  300, 3, 2),
> > +};
> > +
> > +const struct imx_pll14xx_rate_table imx_pll1443x_tbl[] = {
> > +   PLL_1443X_RATE(65000U, 325, 3, 2, 0),
> > +   PLL_1443X_RATE(59400U, 198, 2, 2, 0),
> > +   PLL_1443X_RATE(393216000U, 262, 2, 3, 9437),
> > +   PLL_1443X_RATE(361267200U, 361, 3, 3, 17511),
> > +};
> > +
> > +struct imx_pll14xx_clk imx_1443x_pll = {
> > +   .type = PLL_1443X,
> > +   .rate_table = imx_pll1443x_tbl,
> > +   .rate_count = ARRAY_SIZE(imx_pll1443x_tbl),
> > +};
> > +
> > +struct imx_pll14xx_clk imx_1416x_pll = {
> > +   .type = PLL_1416X,
> > +   .rate_table = imx_pll1416x_tbl,
> > +   .rate_count = ARRAY_SIZE(imx_pll1416x_tbl),
> > +};
> 
> Perhaps these consts should be in clk-pll14xx.c? That way they won't be
> compiled for imx6 as well.

Makes sense, I will do it in V2.

Thanks,
Anson




Re: [PATCH v2] mm: emit tracepoint when RSS changes by threshold

2019-09-05 Thread Joel Fernandes
On Thu, Sep 05, 2019 at 10:50:27AM -0700, Daniel Colascione wrote:
> On Thu, Sep 5, 2019 at 10:35 AM Steven Rostedt  wrote:
> > On Thu, 5 Sep 2019 09:03:01 -0700
> > Suren Baghdasaryan  wrote:
> >
> > > On Thu, Sep 5, 2019 at 7:43 AM Michal Hocko  wrote:
> > > >
> > > > [Add Steven]
> > > >
> > > > On Wed 04-09-19 12:28:08, Joel Fernandes wrote:
> > > > > > On Wed, Sep 4, 2019 at 11:38 AM Michal Hocko wrote:
> > > > > > >
> > > > > > > On Wed 04-09-19 11:32:58, Joel Fernandes wrote:
> > > > > [...]
> > > > > > > > but also for reducing
> > > > > > > > tracing noise. Flooding the traces makes it less useful for long traces and
> > > > > > > > post-processing of traces. IOW, the overhead reduction is a bonus.
> > > > > > >
> > > > > > > This is not really anything special for this tracepoint though.
> > > > > > > Basically any tracepoint in a hot path is in the same situation and I do
> > > > > > > not see a point why each of them should really invent its own way to
> > > > > > > throttle. Maybe there is some way to do that in the tracing subsystem
> > > > > > > directly.
> > > > > >
> > > > > > I am not sure if there is a way to do this easily. Add to that, the fact that
> > > > > > you still have to call into trace events. Why call into it at all, if you can
> > > > > > filter in advance and have a sane filtering default?
> > > > > >
> > > > > > The bigger improvement with the threshold is the number of trace records are
> > > > > > almost halved by using a threshold. The number of records went from 4.6K to
> > > > > > 2.6K.
> > > >
> > > > Steven, would it be feasible to add a generic tracepoint throttling?
> > >
> > > I might misunderstand this but is the issue here actually throttling
> > > of the sheer number of trace records or tracing large enough changes
> > > to RSS that user might care about? Small changes happen all the time
> > > but we are likely not interested in those. Surely we could postprocess
> > > the traces to extract changes large enough to be interesting but why
> > > capture uninteresting information in the first place? IOW the
> > > throttling here should be based not on the time between traces but on
> > > the amount of change of the traced signal. Maybe a generic facility
> > > like that would be a good idea?
> >
> > You mean like add a trigger (or filter) that only traces if a field has
> > changed since the last time the trace was hit? Hmm, I think we could
> > possibly do that. Perhaps even now with histogram triggers?
> 
> I was thinking along the same lines. The histogram subsystem seems
> like a very good fit here. Histogram triggers already let users talk
> about specific fields of trace events, aggregate them in configurable
> ways, and (importantly, IMHO) create synthetic new trace events that
> the kernel emits under configurable conditions.

Hmm, I think this tracing feature will be a good idea. But in order not to
gate this patch, can we agree on keeping a temporary threshold for this
patch? Once such an idea is implemented in the trace subsystem, we can remove
the temporary filter.

As Tim said, we don't want our traces flooded and this is a very useful
tracepoint as proven in our internal usage at Android. The threshold filter
is just a few lines of code.

thanks,

 - Joel



Re: [RFC PATCH 1/2] Fix: sched/membarrier: p->mm->membarrier_state racy load

2019-09-05 Thread Mathieu Desnoyers
- On Sep 4, 2019, at 2:26 PM, Peter Zijlstra pet...@infradead.org wrote:

> On Wed, Sep 04, 2019 at 01:12:53PM -0400, Mathieu Desnoyers wrote:
>> - On Sep 4, 2019, at 12:09 PM, Peter Zijlstra pet...@infradead.org wrote:
>> 
>> > On Wed, Sep 04, 2019 at 11:19:00AM -0400, Mathieu Desnoyers wrote:
>> >> - On Sep 3, 2019, at 4:36 PM, Linus Torvalds torva...@linux-foundation.org wrote:
>> > 
>> >> > I wonder if the easiest model might be to just use a percpu variable
>> >> > instead for the membarrier stuff? It's not like it has to be in
>> >> > 'struct task_struct' at all, I think. We only care about the current
>> >> > runqueues, and those are percpu anyway.
>> >> 
>> >> One issue here is that membarrier iterates over all runqueues without
>> >> grabbing any runqueue lock. If we copy that state from mm to rq on
>> >> sched switch prepare, we would need to ensure we have the proper
>> >> memory barriers between:
>> >> 
>> >> prior user-space memory accesses  /  setting the runqueue membarrier state
>> >> 
>> >> and
>> >> 
>> >> setting the runqueue membarrier state / following user-space memory accesses
>> >> 
>> >> Copying the membarrier state into the task struct leverages the fact that
>> >> we have documented and guaranteed those barriers around the rq->curr update
>> >> in the scheduler.
>> > 
>> > Should be the same as the barriers we already rely on for rq->curr, no?
>> > That is, if we put this before switch_mm() then we have
>> > smp_mb__after_spinlock() and switch_mm() itself.
>> 
>> Yes, I think we can piggy-back on the already documented barriers documented
>> around rq->curr store.
>> 
>> > Also, if we place mm->membarrier_state in the same cacheline as mm->pgd
>> > (which switch_mm() is bound to load) then we should be fine, I think.
>> 
>> Yes, if we make sure membarrier_prepare_task_switch only updates the
>> rq->membarrier_state if prev->mm != next->mm, we should be able to avoid
>> loading next->mm->membarrier_state when switch_mm() is not invoked.
>> 
>> I'll prepare RFC patch implementing this approach.
> 
> Thinking about this a bit; switching it 'on' still requires some
> thinking. Consider register on an already threaded process of which
> multiple threads are 'current' on multiple CPUs. In that case none of
> the rq bits will be set.
> 
> Not even synchronize_rcu() is sufficient to force it either, since we
> only update on switch_mm() and nothing guarantees we pass that.
> 
> One possible approach would be to IPI broadcast (after setting the
> ->mm->membarrier_state) and having the IPI update the rq state from
> 'current->mm'.
> 
> But possibly I'm just confusing everyone again. I'm not having a good day
> today.

Indeed, switching 'on' requires some care. I implemented the IPI-based
approach as per your suggestion.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


[PATCH -tip v2 2/2] x86: kprobes: Prohibit probing on instruction which has emulate prefix

2019-09-05 Thread Masami Hiramatsu
Prohibit probing on instructions which have XEN_EMULATE_PREFIX
or KVM's emulate prefix. That prefix is a marker for Xen and KVM,
so if kprobes overwrites the marker with int3, the emulation
doesn't work as expected.

Signed-off-by: Masami Hiramatsu 
---
 arch/x86/kernel/kprobes/core.c |4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 43fc13c831af..4f13af7cbcdb 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -351,6 +351,10 @@ int __copy_instruction(u8 *dest, u8 *src, u8 *real, struct insn *insn)
kernel_insn_init(insn, dest, MAX_INSN_SIZE);
insn_get_length(insn);
 
+   /* We can not probe force emulate prefixed instruction */
+   if (insn_has_emulate_prefix(insn))
+   return 0;
+
/* Another subsystem puts a breakpoint, failed to recover */
if (insn->opcode.bytes[0] == BREAKPOINT_INSTRUCTION)
return 0;



[PATCH -tip v2 0/2] x86: kprobes: Prohibit kprobes on Xen/KVM emulate prefixes

2019-09-05 Thread Masami Hiramatsu
Hi,

Here is the 2nd version of patches to handle Xen/KVM emulate
prefix by x86 instruction decoder.

These patches allow the x86 instruction decoder to decode the
Xen and KVM emulate prefixes correctly, and prohibit kprobes from
probing them.

Josh reported that objtool can not decode such specially
prefixed instructions, and I found that we also have to
prohibit kprobes from probing such instructions.

This series can be applied on -tip master branch which
has merged Josh's objtool/perf sharing common x86 insn
decoder series.

In this version, I added KVM emulate prefix support and generalized
the interface. (insn_has_xen_prefix -> insn_has_emulate_prefix)
Also, I added insn.emulate_prefix_size for those prefixes because
an emulate prefix is NOT an x86 instruction prefix, and the instruction
following an emulate prefix can have its own x86 instruction prefixes.
So we can not use insn.prefixes for it.

Thank you,

---

Masami Hiramatsu (2):
  x86: xen: insn: Decode Xen and KVM emulate-prefix signature
  x86: kprobes: Prohibit probing on instruction which has emulate prefix


 arch/x86/include/asm/insn.h |6 +
 arch/x86/include/asm/xen/interface.h|7 --
 arch/x86/include/asm/xen/prefix.h   |   10 +
 arch/x86/kernel/kprobes/core.c  |4 +++
 arch/x86/lib/insn.c |   36 +++
 tools/arch/x86/include/asm/insn.h   |6 +
 tools/arch/x86/include/asm/xen/prefix.h |   10 +
 tools/arch/x86/lib/insn.c   |   36 +++
 tools/objtool/sync-check.sh |3 ++-
 9 files changed, 115 insertions(+), 3 deletions(-)
 create mode 100644 arch/x86/include/asm/xen/prefix.h
 create mode 100644 tools/arch/x86/include/asm/xen/prefix.h

--
Masami Hiramatsu (Linaro) 
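
As a rough illustration of the signature matching described in this
series, the sketch below checks a byte buffer for either marker. The
Xen bytes are the ones defined as __XEN_EMULATE_PREFIX in patch 1/2 and
the KVM bytes follow from 'ud2a; .ascii "kvm"'. The helper
emulate_prefix_size() is a hypothetical name used here for illustration
and is not the decoder function from the patch.

```c
#include <stddef.h>
#include <string.h>

/* ud2a (0x0f 0x0b) followed by the ASCII bytes "xen" or "kvm". */
static const unsigned char xen_prefix[] = { 0x0f, 0x0b, 0x78, 0x65, 0x6e };
static const unsigned char kvm_prefix[] = { 0x0f, 0x0b, 'k', 'v', 'm' };

/*
 * Returns the size of the emulate prefix at the start of buf,
 * or 0 if buf does not start with one of the two signatures.
 */
static int emulate_prefix_size(const unsigned char *buf, size_t len)
{
	if (len >= sizeof(xen_prefix) &&
	    !memcmp(buf, xen_prefix, sizeof(xen_prefix)))
		return (int)sizeof(xen_prefix);
	if (len >= sizeof(kvm_prefix) &&
	    !memcmp(buf, kvm_prefix, sizeof(kvm_prefix)))
		return (int)sizeof(kvm_prefix);
	return 0;
}
```

A caller in the style of insn_has_emulate_prefix() would then treat any
non-zero size as "do not probe here", since writing int3 over the first
byte would destroy the signature.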


[PATCH -tip v2 1/2] x86: xen: insn: Decode Xen and KVM emulate-prefix signature

2019-09-05 Thread Masami Hiramatsu
Decode Xen and KVM's emulate-prefix signature in the x86 insn decoder.
It is called "prefix" but it is actually not an x86 instruction prefix,
so this adds an insn.emulate_prefix_size field instead of reusing
insn.prefixes.

If the x86 decoder finds the special instruction sequence of
XEN_EMULATE_PREFIX or 'ud2a; .ascii "kvm"', it just counts the
length, sets insn.emulate_prefix_size and folds it into the next
instruction. In other words, the signature and the next instruction
are treated as a single instruction.

Reported-by: Josh Poimboeuf 
Signed-off-by: Masami Hiramatsu 
---
 Changes in v2:
  - Generalize the emulate-prefix handling not only for Xen but KVM.
  - Introduce insn.emulate_prefix_size instead of using insn.prefixes.
---
 arch/x86/include/asm/insn.h |6 +
 arch/x86/include/asm/xen/interface.h|7 --
 arch/x86/include/asm/xen/prefix.h   |   10 +
 arch/x86/lib/insn.c |   36 +++
 tools/arch/x86/include/asm/insn.h   |6 +
 tools/arch/x86/include/asm/xen/prefix.h |   10 +
 tools/arch/x86/lib/insn.c   |   36 +++
 tools/objtool/sync-check.sh |3 ++-
 8 files changed, 111 insertions(+), 3 deletions(-)
 create mode 100644 arch/x86/include/asm/xen/prefix.h
 create mode 100644 tools/arch/x86/include/asm/xen/prefix.h

diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h
index 154f27be8bfc..5c1ae3eff9d4 100644
--- a/arch/x86/include/asm/insn.h
+++ b/arch/x86/include/asm/insn.h
@@ -45,6 +45,7 @@ struct insn {
    struct insn_field immediate2;   /* for 64bit imm or seg16 */
    };
 
+   int emulate_prefix_size;
    insn_attr_t attr;
    unsigned char opnd_bytes;
    unsigned char addr_bytes;
@@ -128,6 +129,11 @@ static inline int insn_is_evex(struct insn *insn)
return (insn->vex_prefix.nbytes == 4);
 }
 
+static inline int insn_has_emulate_prefix(struct insn *insn)
+{
+   return !!insn->emulate_prefix_size;
+}
+
 /* Ensure this instruction is decoded completely */
 static inline int insn_complete(struct insn *insn)
 {
diff --git a/arch/x86/include/asm/xen/interface.h b/arch/x86/include/asm/xen/interface.h
index 62ca03ef5c65..fe33a9798708 100644
--- a/arch/x86/include/asm/xen/interface.h
+++ b/arch/x86/include/asm/xen/interface.h
@@ -379,12 +379,15 @@ struct xen_pmu_arch {
  * Prefix forces emulation of some non-trapping instructions.
  * Currently only CPUID.
  */
+#include 
+
 #ifdef __ASSEMBLY__
-#define XEN_EMULATE_PREFIX .byte 0x0f,0x0b,0x78,0x65,0x6e ;
+#define XEN_EMULATE_PREFIX .byte __XEN_EMULATE_PREFIX ;
 #define XEN_CPUID  XEN_EMULATE_PREFIX cpuid
 #else
-#define XEN_EMULATE_PREFIX ".byte 0x0f,0x0b,0x78,0x65,0x6e ; "
+#define XEN_EMULATE_PREFIX ".byte " __XEN_EMULATE_PREFIX_STR " ; "
 #define XEN_CPUID  XEN_EMULATE_PREFIX "cpuid"
+
 #endif
 
 #endif /* _ASM_X86_XEN_INTERFACE_H */
diff --git a/arch/x86/include/asm/xen/prefix.h b/arch/x86/include/asm/xen/prefix.h
new file mode 100644
index ..f901be0d7a95
--- /dev/null
+++ b/arch/x86/include/asm/xen/prefix.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _TOOLS_ASM_X86_XEN_PREFIX_H
+#define _TOOLS_ASM_X86_XEN_PREFIX_H
+
+#include 
+
+#define __XEN_EMULATE_PREFIX  0x0f,0x0b,0x78,0x65,0x6e
+#define __XEN_EMULATE_PREFIX_STR  __stringify(__XEN_EMULATE_PREFIX)
+
+#endif
diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c
index 0b5862ba6a75..b7eb50187db9 100644
--- a/arch/x86/lib/insn.c
+++ b/arch/x86/lib/insn.c
@@ -13,6 +13,9 @@
 #include 
 #include 
 
+/* For special Xen prefix */
+#include 
+
 /* Verify next sizeof(t) bytes can be on the same instruction */
 #define validate_next(t, insn, n)  \
((insn)->next_byte + sizeof(t) + n <= (insn)->end_kaddr)
@@ -58,6 +61,37 @@ void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64)
insn->addr_bytes = 4;
 }
 
+static const insn_byte_t xen_prefix[] = { __XEN_EMULATE_PREFIX };
+/* See handle_ud()@arch/x86/kvm/x86.c */
+static const insn_byte_t kvm_prefix[] = "\xf\xbkvm";
+
+static int __insn_get_emulate_prefix(struct insn *insn,
+                                     const insn_byte_t *prefix, size_t len)
+{
+   size_t i;
+
+   for (i = 0; i < len; i++) {
+       if (peek_nbyte_next(insn_byte_t, insn, i) != prefix[i])
+           goto err_out;
+   }
+
+   insn->emulate_prefix_size = len;
+   insn->next_byte += len;
+
+   return 1;
+
+err_out:
+   return 0;
+}
+
+static void insn_get_emulate_prefix(struct insn *insn)
+{
+   if (__insn_get_emulate_prefix(insn, xen_prefix, sizeof(xen_prefix)))
+       return;
+
+   __insn_get_emulate_prefix(insn, kvm_prefix, sizeof(kvm_prefix));
+}
+
 /**
  * insn_get_prefixes - scan x86 instruction prefix bytes
  * @insn:   insn containing instruction
@@ -76,6 +110,8 @@ void insn_get_prefixes(struct insn 
