[PATCH 2/2] drm/vc4: hdmi: Convert to the new clock request API

2021-04-13 Thread Maxime Ripard

The new clock request API allows us to increase the rate of the HSM
clock to match our pixel rate requirements and to decrease it again when
we're done, resulting in better power efficiency.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_hdmi.c | 19 ++++++++++++-------
 drivers/gpu/drm/vc4/vc4_hdmi.h |  3 +++
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 1fda574579af..244053de6150 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -473,7 +473,9 @@ static void vc4_hdmi_encoder_post_crtc_powerdown(struct drm_encoder *encoder,
   HDMI_READ(HDMI_VID_CTL) & ~VC4_HD_VID_CTL_ENABLE);
 
clk_disable_unprepare(vc4_hdmi->pixel_bvb_clock);
+   clk_request_done(vc4_hdmi->bvb_req);
clk_disable_unprepare(vc4_hdmi->hsm_clock);
+   clk_request_done(vc4_hdmi->hsm_req);
clk_disable_unprepare(vc4_hdmi->pixel_clock);
 
ret = pm_runtime_put(&vc4_hdmi->pdev->dev);
@@ -778,9 +780,9 @@ static void vc4_hdmi_encoder_pre_crtc_configure(struct drm_encoder *encoder,
 * pixel clock, but HSM ends up being the limiting factor.
 */
hsm_rate = max_t(unsigned long, 12000, (pixel_rate / 100) * 101);
-   ret = clk_set_min_rate(vc4_hdmi->hsm_clock, hsm_rate);
-   if (ret) {
-   DRM_ERROR("Failed to set HSM clock rate: %d\n", ret);
+   vc4_hdmi->hsm_req = clk_request_start(vc4_hdmi->hsm_clock, hsm_rate);
+   if (IS_ERR(vc4_hdmi->hsm_req)) {
+   DRM_ERROR("Failed to set HSM clock rate: %ld\n", PTR_ERR(vc4_hdmi->hsm_req));
return;
}
 
@@ -797,10 +799,11 @@ static void vc4_hdmi_encoder_pre_crtc_configure(struct drm_encoder *encoder,
 * FIXME: When the pixel freq is 594MHz (4k60), this needs to be setup
 * at 300MHz.
 */
-   ret = clk_set_min_rate(vc4_hdmi->pixel_bvb_clock,
-  (hsm_rate > VC4_HSM_MID_CLOCK ? 15000 : 7500));
-   if (ret) {
-   DRM_ERROR("Failed to set pixel bvb clock rate: %d\n", ret);
+   vc4_hdmi->bvb_req = clk_request_start(vc4_hdmi->pixel_bvb_clock,
+ (hsm_rate > VC4_HSM_MID_CLOCK ? 15000 : 7500));
+   if (IS_ERR(vc4_hdmi->bvb_req)) {
+   DRM_ERROR("Failed to set pixel bvb clock rate: %ld\n", PTR_ERR(vc4_hdmi->bvb_req));
+   clk_request_done(vc4_hdmi->hsm_req);
clk_disable_unprepare(vc4_hdmi->hsm_clock);
clk_disable_unprepare(vc4_hdmi->pixel_clock);
return;
@@ -809,6 +812,8 @@ static void vc4_hdmi_encoder_pre_crtc_configure(struct drm_encoder *encoder,
ret = clk_prepare_enable(vc4_hdmi->pixel_bvb_clock);
if (ret) {
DRM_ERROR("Failed to turn on pixel bvb clock: %d\n", ret);
+   clk_request_done(vc4_hdmi->bvb_req);
+   clk_request_done(vc4_hdmi->hsm_req);
clk_disable_unprepare(vc4_hdmi->hsm_clock);
clk_disable_unprepare(vc4_hdmi->pixel_clock);
return;
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h b/drivers/gpu/drm/vc4/vc4_hdmi.h
index 3cebd1fd00fc..9ac4a2c751df 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.h
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.h
@@ -167,6 +167,9 @@ struct vc4_hdmi {
 
struct reset_control *reset;
 
+   struct clk_request *bvb_req;
+   struct clk_request *hsm_req;
+
struct debugfs_regset32 hdmi_regset;
struct debugfs_regset32 hd_regset;
 };
-- 
2.30.2

[PATCH 0/2] clk: Implement a clock request API

2021-04-13 Thread Maxime Ripard

Hi,

This is a follow-up of the discussion here:
https://lore.kernel.org/linux-clk/20210319150355.xzw7ikwdaga2dwhv@gilmour/

This implements a mechanism to raise and lower clock rates based on consumer
workloads, with an example of such an implementation for the RaspberryPi4 HDMI
controller.

There's a couple of things worth discussing:

  - The name is in conflict with clk_request_rate, and even though it feels
like the right name to me, we should probably avoid any confusion

  - The code so far implements a policy of always going for the lowest rate
possible. While we don't have a use-case for something else, this should
maybe be made more flexible?

Let me know what you think
Maxime

Maxime Ripard (2):
  clk: Introduce a clock request API
  drm/vc4: hdmi: Convert to the new clock request API

 drivers/clk/clk.c  | 121 +
 drivers/gpu/drm/vc4/vc4_hdmi.c |  19 --
 drivers/gpu/drm/vc4/vc4_hdmi.h |   3 +
 include/linux/clk.h|   4 ++
 4 files changed, 140 insertions(+), 7 deletions(-)

-- 
2.30.2
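
As a rough illustration of the mechanism the cover letter describes, the
following userspace sketch models a clock that runs at the lowest rate
satisfying every outstanding request and falls back to a floor rate once the
requests are dropped. All names here (sketch_clk, sketch_request_start,
sketch_request_done) are hypothetical and are not the API introduced by
patch 1/2; only the "lowest rate possible" policy is taken from the cover
letter.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_REQS 8

/* Hypothetical stand-in for a clock; not the kernel's struct clk. */
struct sketch_clk {
	unsigned long floor_rate;            /* rate used with no requests */
	unsigned long reqs[SKETCH_MAX_REQS]; /* 0 means the slot is free */
	unsigned long rate;                  /* current effective rate */
};

/* Policy from the cover letter: lowest rate that satisfies every request. */
static void sketch_clk_update(struct sketch_clk *clk)
{
	unsigned long rate = clk->floor_rate;
	size_t i;

	for (i = 0; i < SKETCH_MAX_REQS; i++)
		if (clk->reqs[i] > rate)
			rate = clk->reqs[i];
	clk->rate = rate;
}

/* Returns a slot index (the "request"), or -1 if no slot is free. */
static int sketch_request_start(struct sketch_clk *clk, unsigned long min_rate)
{
	size_t i;

	assert(min_rate > 0);
	for (i = 0; i < SKETCH_MAX_REQS; i++) {
		if (clk->reqs[i] == 0) {
			clk->reqs[i] = min_rate;
			sketch_clk_update(clk);
			return (int)i;
		}
	}
	return -1;
}

/* Drop a request; the clock falls back to what the others still need. */
static void sketch_request_done(struct sketch_clk *clk, int req)
{
	assert(req >= 0 && req < SKETCH_MAX_REQS);
	clk->reqs[req] = 0;
	sketch_clk_update(clk);
}
```

In this model a consumer pairs each start with a done, much like the HDMI
patch pairs clk_request_start() with clk_request_done() on its error and
power-down paths.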

Re: cocci script hints request

2021-04-13 Thread Fabio Aiuto

On Tue, Apr 13, 2021 at 11:56:20AM +0200, Julia Lawall wrote:
> 
> 
> On Tue, 13 Apr 2021, Fabio Aiuto wrote:
> 
> > Hi,
> >
> > I would like to improve the following coccinelle script:
> >
> > @@
> > expression a, fmt;
> > expression list var_args;
> > @@
> >
> > -   DBG_871X_LEVEL(a, fmt, var_args);
> > +   printk(fmt, var_args);
> >
> > I would like to replace the DBG_871X_LEVEL macro with printk, but
> > I can't find a way to add a KERN_* constant prefix to the fmt
> > argument in the + code line. If I try this
> >
> > @@
> > expression a, fmt;
> > expression list var_args;
> > @@
> >
> > -   DBG_871X_LEVEL(a, fmt, var_args);
> > +   printk(KERN_DEBUG fmt, var_args);
> >
> > plus: parse error:
> >   File "../test.cocci", line 94, column 20, charpos = 1171
> >   around = 'fmt',
> >   whole content = + printk(KERN_DEBUG fmt, var_args);
> >
> > how could I do this?
> 
> Although I certainly agree with Greg, I'll answer the question from a
> technical point of view.
> 
> I'm not sure that that kind of compound string is supported for a
> metavariable.  It is possible to get around this problem using a python
> script.  If you ever need to do this for a better reason, you can take a
> look at demos/pythontococci.cocci in the Coccinelle source code
> distribution.
> 
> julia

thank you, this helps a lot!

fabio

Re: cocci script hints request

2021-04-13 Thread Julia Lawall

On Tue, 13 Apr 2021, Fabio Aiuto wrote:

> Hi,
>
> I would like to improve the following coccinelle script:
>
> @@
> expression a, fmt;
> expression list var_args;
> @@
>
> -   DBG_871X_LEVEL(a, fmt, var_args);
> +   printk(fmt, var_args);
>
> I would like to replace the DBG_871X_LEVEL macro with printk, but
> I can't find a way to add a KERN_* constant prefix to the fmt
> argument in the + code line. If I try this
>
> @@
> expression a, fmt;
> expression list var_args;
> @@
>
> -   DBG_871X_LEVEL(a, fmt, var_args);
> +   printk(KERN_DEBUG fmt, var_args);
>
> plus: parse error:
>   File "../test.cocci", line 94, column 20, charpos = 1171
>   around = 'fmt',
>   whole content = +   printk(KERN_DEBUG fmt, var_args);
>
> how could I do this?

Although I certainly agree with Greg, I'll answer the question from a
technical point of view.

I'm not sure that that kind of compound string is supported for a
metavariable.  It is possible to get around this problem using a python
script.  If you ever need to do this for a better reason, you can take a
look at demos/pythontococci.cocci in the Coccinelle source code
distribution.

julia
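
For context on why the prefix is awkward to express: in C,
printk(KERN_DEBUG "...") works because the level is itself a string literal
and adjacent string literals are merged by the compiler, so the prefix only
composes with a literal format and not with an arbitrary expression. A
minimal userspace sketch of that, where SKETCH_KERN_DEBUG and both helper
functions are hypothetical stand-ins rather than kernel code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for KERN_DEBUG: an SOH byte followed by the level. */
#define SKETCH_KERN_DEBUG "\001" "7"

/* Adjacent string literals are merged at compile time into one format. */
static const char *sketch_debug_fmt(void)
{
	return SKETCH_KERN_DEBUG "module init ret = %d\n";
}

/* Returns the level digit carried by a prefixed format, or -1 if none. */
static int sketch_fmt_level(const char *fmt)
{
	assert(fmt != NULL);
	if (fmt[0] == '\001' && fmt[1] != '\0')
		return fmt[1] - '0';
	return -1;
}
```

Writing SKETCH_KERN_DEBUG next to a variable instead of a literal would not
compile, which is the same constraint the semantic patch runs into when fmt
is an expression metavariable.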

Re: cocci script hints request

2021-04-13 Thread Greg KH

On Tue, Apr 13, 2021 at 11:24:56AM +0200, Fabio Aiuto wrote:
> On Tue, Apr 13, 2021 at 11:11:38AM +0200, Greg KH wrote:
> > On Tue, Apr 13, 2021 at 11:04:01AM +0200, Fabio Aiuto wrote:
> > > Hi,
> > > 
> > > I would like to improve the following coccinelle script:
> > > 
> > > @@
> > > expression a, fmt;
> > > expression list var_args;
> > > @@
> > > 
> > > -   DBG_871X_LEVEL(a, fmt, var_args);
> > > +   printk(fmt, var_args);
> > > 
> > > I would like to replace the DBG_871X_LEVEL macro with printk,
> > 
> > No you really do not, you want to change that to a dev_*() call instead
> > depending on the "level" of the message.
> > 
> > No "raw" printk() calls please, I will just reject them :)
> > 
> > thanks,
> > 
> > greg k-h
> 
> but there are very few occurrences of DBG_871X_LEVEL in module init functions:

Then do those "by hand", if they really are needed.

Drivers, when they are working properly, are totally quiet.

> 
> static int __init rtw_drv_entry(void)
> {
> int ret;
> 
> DBG_871X_LEVEL(_drv_always_, "module init start\n");

Horrible, please remove.

> dump_drv_version(RTW_DBGDUMP);
> #ifdef BTCOEXVERSION
> DBG_871X_LEVEL(_drv_always_, "rtl8723bs BT-Coex version = %s\n", BTCOEXVERSION);

Not needed at all.

> #endif /*  BTCOEXVERSION */
> 
> sdio_drvpriv.drv_registered = true;
> 
> ret = sdio_register_driver(&sdio_drvpriv.r871xs_drv);
> if (ret != 0) {
> sdio_drvpriv.drv_registered = false;
> rtw_ndev_notifier_unregister();
> }
> 
> DBG_871X_LEVEL(_drv_always_, "module init ret =%d\n", ret);

Again, not needed; this is noise and if someone really needs to debug
this, they can use the built-in kernel ftrace logic instead.

> return ret;
> }
> 
> where I don't have a device available... shall I pass NULL to
> first argument?

No, that would be a mess :)

I bet almost all of these can be removed if they are like the above
examples as we do not need a lot of "look, the code got here!" type of
messages at all.

> Another question: may I use netdev_dbg in case of rtl8723bs?

Yes please, that is even better and recommended.

thanks,

greg k-h

Re: cocci script hints request

2021-04-13 Thread Fabio Aiuto

On Tue, Apr 13, 2021 at 11:11:38AM +0200, Greg KH wrote:
> On Tue, Apr 13, 2021 at 11:04:01AM +0200, Fabio Aiuto wrote:
> > Hi,
> > 
> > I would like to improve the following coccinelle script:
> > 
> > @@
> > expression a, fmt;
> > expression list var_args;
> > @@
> > 
> > -   DBG_871X_LEVEL(a, fmt, var_args);
> > +   printk(fmt, var_args);
> > 
> > I would like to replace the DBG_871X_LEVEL macro with printk,
> 
> No you really do not, you want to change that to a dev_*() call instead
> depending on the "level" of the message.
> 
> No "raw" printk() calls please, I will just reject them :)
> 
> thanks,
> 
> greg k-h

but there are very few occurrences of DBG_871X_LEVEL in module init functions:

static int __init rtw_drv_entry(void)
{
int ret;

DBG_871X_LEVEL(_drv_always_, "module init start\n");
dump_drv_version(RTW_DBGDUMP);
#ifdef BTCOEXVERSION
DBG_871X_LEVEL(_drv_always_, "rtl8723bs BT-Coex version = %s\n", BTCOEXVERSION);
#endif /*  BTCOEXVERSION */

sdio_drvpriv.drv_registered = true;

ret = sdio_register_driver(&sdio_drvpriv.r871xs_drv);
if (ret != 0) {
sdio_drvpriv.drv_registered = false;
rtw_ndev_notifier_unregister();
}

DBG_871X_LEVEL(_drv_always_, "module init ret =%d\n", ret);
return ret;
}

where I don't have a device available... shall I pass NULL to
first argument?

Another question: may I use netdev_dbg in case of rtl8723bs?

thank you,

fabio

Re: cocci script hints request

2021-04-13 Thread Greg KH

On Tue, Apr 13, 2021 at 11:04:01AM +0200, Fabio Aiuto wrote:
> Hi,
> 
> I would like to improve the following coccinelle script:
> 
> @@
> expression a, fmt;
> expression list var_args;
> @@
> 
> -   DBG_871X_LEVEL(a, fmt, var_args);
> +   printk(fmt, var_args);
> 
> I would like to replace the DBG_871X_LEVEL macro with printk,

No you really do not, you want to change that to a dev_*() call instead
depending on the "level" of the message.

No "raw" printk() calls please, I will just reject them :)

thanks,

greg k-h

cocci script hints request

2021-04-13 Thread Fabio Aiuto

Hi,

I would like to improve the following coccinelle script:

@@
expression a, fmt;
expression list var_args;
@@

-   DBG_871X_LEVEL(a, fmt, var_args);
+   printk(fmt, var_args);

I would like to replace the DBG_871X_LEVEL macro with printk, but
I can't find a way to add a KERN_* constant prefix to the fmt
argument in the + code line. If I try this

@@
expression a, fmt;
expression list var_args;
@@

-   DBG_871X_LEVEL(a, fmt, var_args);
+   printk(KERN_DEBUG fmt, var_args);

plus: parse error: 
  File "../test.cocci", line 94, column 20, charpos = 1171
  around = 'fmt',
  whole content = + printk(KERN_DEBUG fmt, var_args);

how could I do this?

thank you in advance,

fabio

Re: [PATCH] kernel:irq:manage: request threaded irq with a specified priority

2021-04-13 Thread Thomas Gleixner

On Tue, Apr 13 2021 at 14:19, Song Chen wrote:
> In general, irq handler thread will be assigned a default priority which
> is MAX_RT_PRIO/2, as a result, no one can preempt others.
>
> Here is the case I found in a real project, an interrupt int_a is
> coming, wakes up its handler handler_a and handler_a wakes up a
> userspace RT process task_a.
>
> However, if another irq handler handler_b which has nothing to do
> with any RT tasks is running when int_a is coming, handler_a can't
> preempt handler_b, as a result, task_a can't be woken up immediately
> as expected until handler_b gives up cpu voluntarily. In this case,
> determinism breaks.

It breaks because the system designer failed to assign proper priorities
to the irq threads int_a, int_b and to the user space process task_a.

That's not solvable at the kernel level.

Thanks,

tglx
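
For reference, the priority of an IRQ thread can be assigned from user space
by the system integrator, for example with chrt or sched_setscheduler(2) on
the thread's PID. A minimal sketch of the latter, assuming a Linux system and
sufficient privileges; the helper name sketch_set_fifo_prio is hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Give the task identified by pid the SCHED_FIFO policy at prio.
 * Returns 0 on success or a negative errno value. Changing another
 * task's real-time priority usually needs CAP_SYS_NICE.
 */
static int sketch_set_fifo_prio(pid_t pid, int prio)
{
	struct sched_param param = { .sched_priority = prio };
	int min = sched_get_priority_min(SCHED_FIFO);
	int max = sched_get_priority_max(SCHED_FIFO);

	assert(min <= max);
	if (prio < min || prio > max)
		return -EINVAL;	/* rejected before touching the task */

	if (sched_setscheduler(pid, SCHED_FIFO, &param) == -1)
		return -errno;
	return 0;
}
```

With something like this, the thread for int_a and task_a can be placed
above the thread for int_b without any kernel change.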

Re: [syzbot] BUG: unable to handle kernel paging request in bpf_trace_run2

2021-04-13 Thread Dmitry Vyukov

On Thu, Apr 1, 2021 at 8:01 PM syzbot wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit befe6d946551d65cddbd32b9cb0170b0249fd5ed
> Author: Steven Rostedt (VMware) 
> Date:   Wed Nov 18 14:34:05 2020 +
>
> tracepoint: Do not fail unregistering a probe due to memory failure
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=14f0260ed0
> start commit:   12450081 libbpf: Fix native endian assumption when parsing..
> git tree:   bpf
> kernel config:  https://syzkaller.appspot.com/x/.config?x=5ac0d21536db480b
> dashboard link: https://syzkaller.appspot.com/bug?extid=cc36fd07553c0512f5f7
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1365d2c390
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16d5f08d90
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: tracepoint: Do not fail unregistering a probe due to memory failure
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Looks reasonable:

#syz fix: tracepoint: Do not fail unregistering a probe due to memory failure

[PATCH v2 05/12] usb: dwc2: Add exit clock gating from session request interrupt

2021-04-13 Thread Artur Petrosyan

Added clock gating exit flow from the session
request interrupt handler according to the programming guide.

Signed-off-by: Artur Petrosyan 
---
 Changes in v2:
 - None

 drivers/usb/dwc2/core_intr.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/dwc2/core_intr.c b/drivers/usb/dwc2/core_intr.c
index c764407e7633..550c52c1a0c7 100644
--- a/drivers/usb/dwc2/core_intr.c
+++ b/drivers/usb/dwc2/core_intr.c
@@ -316,12 +316,19 @@ static void dwc2_handle_session_req_intr(struct dwc2_hsotg *hsotg)
hsotg->lx_state);
 
if (dwc2_is_device_mode(hsotg)) {
-   if (hsotg->lx_state == DWC2_L2 && hsotg->in_ppd) {
-   ret = dwc2_exit_partial_power_down(hsotg, 0,
-  true);
-   if (ret)
-   dev_err(hsotg->dev,
-   "exit power_down failed\n");
+   if (hsotg->lx_state == DWC2_L2) {
+   if (hsotg->in_ppd) {
+   ret = dwc2_exit_partial_power_down(hsotg, 0,
+  true);
+   if (ret)
+   dev_err(hsotg->dev,
+   "exit power_down failed\n");
+   }
+
+   /* Exit gadget mode clock gating. */
+   if (hsotg->params.power_down ==
+   DWC2_POWER_DOWN_PARAM_NONE && hsotg->bus_suspended)
+   dwc2_gadget_exit_clock_gating(hsotg, 0);
}
 
/*
-- 
2.25.1

[PATCH 05/12] usb: dwc2: Add exit clock gating from session request interrupt

2021-04-13 Thread Artur Petrosyan

Added clock gating exit flow from the session
request interrupt handler according to the programming guide.

Signed-off-by: Artur Petrosyan 
---
 drivers/usb/dwc2/core_intr.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/dwc2/core_intr.c b/drivers/usb/dwc2/core_intr.c
index c764407e7633..550c52c1a0c7 100644
--- a/drivers/usb/dwc2/core_intr.c
+++ b/drivers/usb/dwc2/core_intr.c
@@ -316,12 +316,19 @@ static void dwc2_handle_session_req_intr(struct dwc2_hsotg *hsotg)
hsotg->lx_state);
 
if (dwc2_is_device_mode(hsotg)) {
-   if (hsotg->lx_state == DWC2_L2 && hsotg->in_ppd) {
-   ret = dwc2_exit_partial_power_down(hsotg, 0,
-  true);
-   if (ret)
-   dev_err(hsotg->dev,
-   "exit power_down failed\n");
+   if (hsotg->lx_state == DWC2_L2) {
+   if (hsotg->in_ppd) {
+   ret = dwc2_exit_partial_power_down(hsotg, 0,
+  true);
+   if (ret)
+   dev_err(hsotg->dev,
+   "exit power_down failed\n");
+   }
+
+   /* Exit gadget mode clock gating. */
+   if (hsotg->params.power_down ==
+   DWC2_POWER_DOWN_PARAM_NONE && hsotg->bus_suspended)
+   dwc2_gadget_exit_clock_gating(hsotg, 0);
}
 
/*
-- 
2.25.1

[PATCH] kernel:irq:manage: request threaded irq with a specified priority

2021-04-12 Thread Song Chen

In general, irq handler thread will be assigned a default priority which
is MAX_RT_PRIO/2, as a result, no one can preempt others.

Here is the case I found in a real project, an interrupt int_a is
coming, wakes up its handler handler_a and handler_a wakes up a
userspace RT process task_a.

However, if another irq handler handler_b which has nothing to do
with any RT tasks is running when int_a is coming, handler_a can't
preempt handler_b, as a result, task_a can't be woken up immediately
as expected until handler_b gives up cpu voluntarily. In this case,
determinism breaks.

Therefore, this patch introduces a new API to give drivers a chance to
assign expected priorities to their irq handler threads.

Signed-off-by: Song Chen 
---
 include/linux/interrupt.h  |  7 +
 include/linux/sched.h  |  1 +
 include/linux/sched/prio.h |  1 +
 kernel/irq/manage.c| 64 +++---
 kernel/sched/core.c| 11 
 5 files changed, 80 insertions(+), 4 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 967e257..5ab9169 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -121,6 +121,7 @@ struct irqaction {
unsigned long   thread_mask;
const char  *name;
struct proc_dir_entry   *dir;
+   int prio;
 } cacheline_internodealigned_in_smp;
 
 extern irqreturn_t no_action(int cpl, void *dev_id);
@@ -136,6 +137,12 @@ extern irqreturn_t no_action(int cpl, void *dev_id);
 #define IRQ_NOTCONNECTED   (1U << 31)
 
 extern int __must_check
+request_threaded_irq_with_prio(unsigned int irq, irq_handler_t handler,
+irq_handler_t thread_fn,
+unsigned long flags, const char *name, void *dev,
+int prio);
+
+extern int __must_check
 request_threaded_irq(unsigned int irq, irq_handler_t handler,
 irq_handler_t thread_fn,
 unsigned long flags, const char *name, void *dev);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ef00bb2..50edae9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1711,6 +1711,7 @@ extern int sched_setscheduler(struct task_struct *, int, const struct sched_para
 extern int sched_setscheduler_nocheck(struct task_struct *, int, const struct sched_param *);
 extern void sched_set_fifo(struct task_struct *p);
 extern void sched_set_fifo_low(struct task_struct *p);
+extern void sched_set_fifo_with_prio(struct task_struct *p, int prio);
 extern void sched_set_normal(struct task_struct *p, int nice);
 extern int sched_setattr(struct task_struct *, const struct sched_attr *);
 extern int sched_setattr_nocheck(struct task_struct *, const struct sched_attr *);
diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
index ab83d85..1e1186e 100644
--- a/include/linux/sched/prio.h
+++ b/include/linux/sched/prio.h
@@ -15,6 +15,7 @@
 
 #define MAX_RT_PRIO100
 
+#define DEFAULT_RT_PRIO(MAX_RT_PRIO / 2)
 #define MAX_PRIO   (MAX_RT_PRIO + NICE_WIDTH)
 #define DEFAULT_PRIO   (MAX_RT_PRIO + NICE_WIDTH / 2)
 
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 21ea370..111b8ce 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1394,7 +1394,7 @@ setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary)
if (IS_ERR(t))
return PTR_ERR(t);
 
-   sched_set_fifo(t);
+   sched_set_fifo_with_prio(t, new->prio);
 
/*
 * We keep the reference to the task struct even if
@@ -2032,7 +2032,7 @@ const void *free_nmi(unsigned int irq, void *dev_id)
 }
 
 /**
- * request_threaded_irq - allocate an interrupt line
+ * request_threaded_irq_with_prio - allocate an interrupt line
  * @irq: Interrupt line to allocate
  * @handler: Function to be called when the IRQ occurs.
  *   Primary handler for threaded interrupts
@@ -2043,6 +2043,7 @@ const void *free_nmi(unsigned int irq, void *dev_id)
  * @irqflags: Interrupt type flags
  * @devname: An ascii name for the claiming device
  * @dev_id: A cookie passed back to the handler function
+ * @prio: priority of the irq handler thread
  *
  * This call allocates interrupt resources and enables the
  * interrupt line and IRQ handling. From the point this
@@ -2067,15 +2068,18 @@ const void *free_nmi(unsigned int irq, void *dev_id)
  * If your interrupt is shared you must pass a non NULL dev_id
  * as this is required when freeing the interrupt.
  *
+ * If you want to assign a priority for your irq handler thread
+ * instead of default value, you need to supply @prio.
+ *
  * Flags:
  *
  * IRQF_SHARED Interrupt is shared
  * IRQF_TRIGGER_*  Specify active edge(s) or level
  *
  */
-int request_threaded_irq(unsigned int irq, irq_handler_t handler,
+int request_threaded_irq_with_prio(uns

Re: [PATCH 2/2] KVM: x86: Fix split-irqchip vs interrupt injection window request

2021-04-12 Thread Sean Christopherson

On Fri, Apr 09, 2021, Lai Jiangshan wrote:
> On Fri, Nov 27, 2020 at 7:26 PM Paolo Bonzini  wrote:
> >
> > kvm_cpu_accept_dm_intr and kvm_vcpu_ready_for_interrupt_injection are
> > a hodge-podge of conditions, hacked together to get something that
> > more or less works.  But what is actually needed is much simpler;
> > in both cases the fundamental question is, do we have a place to stash
> > an interrupt if userspace does KVM_INTERRUPT?
> >
> > In userspace irqchip mode, that is !vcpu->arch.interrupt.injected.
> > Currently kvm_event_needs_reinjection(vcpu) covers it, but it is
> > unnecessarily restrictive.
> >
> > In split irqchip mode it's a bit more complicated, we need to check
> > kvm_apic_accept_pic_intr(vcpu) (the IRQ window exit is basically an INTACK
> > cycle and thus requires ExtINTs not to be masked) as well as
> > !pending_userspace_extint(vcpu).  However, there is no need to
> > check kvm_event_needs_reinjection(vcpu), since split irqchip keeps
> > pending ExtINT state separate from event injection state, and checking
> > kvm_cpu_has_interrupt(vcpu) is wrong too since ExtINT has higher
> > priority than APIC interrupts.  In fact the latter fixes a bug:
> > when userspace requests an IRQ window vmexit, an interrupt in the
> > local APIC can cause kvm_cpu_has_interrupt() to be true and thus
> > kvm_vcpu_ready_for_interrupt_injection() to return false.  When this
> > happens, vcpu_run does not exit to userspace but the interrupt window
> > vmexits keep occurring.  The VM loops without any hope of making progress.
> >
> > Once we try to fix these with something like
> >
> >  return kvm_arch_interrupt_allowed(vcpu) &&
> > -!kvm_cpu_has_interrupt(vcpu) &&
> > -!kvm_event_needs_reinjection(vcpu) &&
> > -kvm_cpu_accept_dm_intr(vcpu);
> > +(!lapic_in_kernel(vcpu)
> > + ? !vcpu->arch.interrupt.injected
> > + : (kvm_apic_accept_pic_intr(vcpu)
> > +&& !pending_userspace_extint(v)));
> >
> > we realize two things.  First, thanks to the previous patch the complex
> > conditional can reuse !kvm_cpu_has_extint(vcpu).  Second, the interrupt
> > window request in vcpu_enter_guest()
> >
> > bool req_int_win =
> > dm_request_for_irq_injection(vcpu) &&
> > kvm_cpu_accept_dm_intr(vcpu);
> >
> > should be kept in sync with kvm_vcpu_ready_for_interrupt_injection():
> > it is unnecessary to ask the processor for an interrupt window
> > if we would not be able to return to userspace.  Therefore, the
> > complex conditional is really the correct implementation of
> > kvm_cpu_accept_dm_intr(vcpu).  It all makes sense:
> >
> > - we can accept an interrupt from userspace if there is a place
> >   to stash it (and, for irqchip split, ExtINTs are not masked).
> >   Interrupts from userspace _can_ be accepted even if right now
> >   EFLAGS.IF=0.
> 
> Hello, Paolo
> 
> If userspace does KVM_INTERRUPT, vcpu->arch.interrupt.injected is
> set immediately, and in inject_pending_event(), we have
> 
> else if (!vcpu->arch.exception.pending) {
> if (vcpu->arch.nmi_injected) {
> kvm_x86_ops.set_nmi(vcpu);
> can_inject = false;
> } else if (vcpu->arch.interrupt.injected) {
> kvm_x86_ops.set_irq(vcpu);
> can_inject = false;
> }
> }
> 
> I'm curious whether kvm_x86_ops.set_irq() here can queue the IRQ
> with EFLAGS.IF=0. If not, which code prevents it?

The interrupt is only directly injected if the local APIC is _not_ in-kernel.
If userspace is managing the local APIC, my understanding is that userspace is
also responsible for honoring EFLAGS.IF, though KVM aids userspace by updating
vcpu->run->ready_for_interrupt_injection when exiting to userspace.  When
userspace is modeling the local APIC, that resolves to
kvm_vcpu_ready_for_interrupt_injection():

return kvm_arch_interrupt_allowed(vcpu) &&
kvm_cpu_accept_dm_intr(vcpu);

where kvm_arch_interrupt_allowed() checks EFLAGS.IF (and an edge case related to
nested virtualization).  KVM also captures EFLAGS.IF in vcpu->run->if_flag.
For whatever reason, QEMU checks both vcpu->run flags before injecting an IRQ,
maybe to handle a case where QEMU itself clears EFLAGS.IF?
 
> I'm asking about this because I just noticed that an interrupt can
> be queued while an exception is pending, and this patch relaxes it
> even more.
> 
> Note: interrupt can NOT be queued when exception pending
> until 664f8e26b00c7 ("KVM: X86: Fix loss of exception which
> has not yet been injected") which I think is dangerous.
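
The userspace-side check described in the reply above (consulting both
vcpu->run flags before injecting an IRQ) can be sketched as follows. The
struct is a hypothetical mirror of just the two struct kvm_run fields
mentioned in the reply, and sketch_can_inject is an illustrative name, not
QEMU's code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the two struct kvm_run fields discussed above. */
struct sketch_run {
	uint8_t ready_for_interrupt_injection;
	uint8_t if_flag;
};

/* Userspace-side gate: only issue KVM_INTERRUPT when both flags are set. */
static bool sketch_can_inject(const struct sketch_run *run)
{
	assert(run != NULL);
	return run->ready_for_interrupt_injection && run->if_flag;
}
```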

Re: BUG: unable to handle kernel paging request in bpf_check

2021-04-12 Thread Alexei Starovoitov

On Mon, Apr 12, 2021 at 12:11 AM Hao Sun  wrote:
>
> Besides, another similar bug occurred while fault injection was enabled.
> 
> BUG: unable to handle kernel paging request in bpf_prog_alloc_no_stats
> 
> RAX: ffda RBX: 0059c080 RCX: 0047338d
> RDX: 0078 RSI: 2300 RDI: 0005
> RBP: 7f7e3c38fc90 R08:  R09: 
> R10:  R11: 0246 R12: 0004
> R13: 7ffed3a1dd6f R14: 7ffed3a1df10 R15: 7f7e3c38fdc0
> BUG: unable to handle page fault for address: 91f2077ed028
> #PF: supervisor write access in kernel mode
> #PF: error_code(0x0002) - not-present page
> PGD 1810067 P4D 1810067 PUD 1915067 PMD 3b907067 PTE 0
> Oops: 0002 [#1] SMP
> CPU: 3 PID: 17344 Comm: executor Not tainted 5.12.0-rc6+ #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.13.0-1ubuntu1.1 04/01/2014
> RIP: 0010:bpf_prog_alloc_no_stats+0x251/0x6e0 kernel/bpf/core.c:94

Both crashes don't make much sense.
There are !null checks in both cases.
I suspect it's a kmsan bug.
Most likely kmsan_map_kernel_range_noflush is doing something wrong.
No idea where that function lives. I don't see it in the kernel sources.

Re: [PATCH 2/2] drm/ingenic: Don't request full modeset if property is not modified

2021-04-12 Thread Paul Cercueil


Can I have an ACK for this patch?

Cheers,
-Paul

On Mon, Mar 29, 2021 at 18:50, Paul Cercueil wrote:

Avoid requesting a full modeset if the sharpness property is not
modified, because then we don't actually need it.

Fixes: fc1acf317b01 ("drm/ingenic: Add support for the IPU")
Cc:  # 5.8+
Signed-off-by: Paul Cercueil 
---
 drivers/gpu/drm/ingenic/ingenic-ipu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ingenic/ingenic-ipu.c b/drivers/gpu/drm/ingenic/ingenic-ipu.c

index 3b1091e7c0cd..95b665c4a7b0 100644
--- a/drivers/gpu/drm/ingenic/ingenic-ipu.c
+++ b/drivers/gpu/drm/ingenic/ingenic-ipu.c
@@ -640,10 +640,12 @@ ingenic_ipu_plane_atomic_set_property(struct drm_plane *plane,

 {
struct ingenic_ipu *ipu = plane_to_ingenic_ipu(plane);
struct drm_crtc_state *crtc_state;
+   bool mode_changed;

if (property != ipu->sharpness_prop)
return -EINVAL;

+   mode_changed = val != ipu->sharpness;
ipu->sharpness = val;

if (state->crtc) {
@@ -651,7 +653,7 @@ ingenic_ipu_plane_atomic_set_property(struct drm_plane *plane,

if (WARN_ON(!crtc_state))
return -EINVAL;

-   crtc_state->mode_changed = true;
+   crtc_state->mode_changed |= mode_changed;
}

return 0;
--
2.30.2

[PATCH 5.11 177/210] net: hns3: clear VF down state bit before request link status

2021-04-12 Thread Greg Kroah-Hartman

From: Guangbin Huang 

[ Upstream commit ed7bedd2c3ca040f1e8ea02c6590a93116b1ec78 ]

Currently, the VF down state bit is cleared after the VF sends the
link status request command. The problem is that when the VF gets
the link status reply from the PF, the down state bit may still be
set to 1. In this case, the link status reply from the PF is
ignored and the VF link status is always set to down.

To fix this problem, clear VF down state bit before VF requests
link status.

Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Guangbin Huang 
Signed-off-by: Huazhong Tan 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index 674b3a22e91f..3bd7bc794677 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -2575,14 +2575,14 @@ static int hclgevf_ae_start(struct hnae3_handle *handle)
 {
struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
 
+   clear_bit(HCLGEVF_STATE_DOWN, &hdev->state);
+
hclgevf_reset_tqp_stats(handle);
 
hclgevf_request_link_info(hdev);
 
hclgevf_update_link_mode(hdev);
 
-   clear_bit(HCLGEVF_STATE_DOWN, &hdev->state);
-
return 0;
 }
 
-- 
2.30.2
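
The ordering problem the commit message describes can be modeled in a few
lines of userspace C. This is only a sketch under simplifying assumptions:
the PF's reply is modeled as arriving immediately after the request, and
sketch_vf, sketch_link_reply and sketch_start are hypothetical names rather
than driver functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_vf {
	bool down;	/* models HCLGEVF_STATE_DOWN */
	bool link_up;	/* link state as seen by the VF */
};

/* A link status reply from the PF is ignored while the VF is marked down. */
static void sketch_link_reply(struct sketch_vf *vf, bool pf_link_up)
{
	vf->link_up = vf->down ? false : pf_link_up;
}

/* clear_first selects the patched ordering; false is the old ordering. */
static void sketch_start(struct sketch_vf *vf, bool clear_first, bool pf_link_up)
{
	assert(vf != NULL);
	if (clear_first)
		vf->down = false;
	sketch_link_reply(vf, pf_link_up);	/* reply modeled as immediate */
	if (!clear_first)
		vf->down = false;
}
```

With the old ordering the reply is dropped and the link stays down; with the
bit cleared first, the reply is honored.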

[PATCH 5.11 159/210] scsi: ufs: core: Fix wrong Task Tag used in task management request UPIUs

2021-04-12 Thread Greg Kroah-Hartman

From: Can Guo 

[ Upstream commit 4b42d557a8add52b9a9924fb31e40a218aab7801 ]

In __ufshcd_issue_tm_cmd(), it is not correct to use hba->nutrs + req->tag
as the Task Tag in a TMR UPIU. Directly use req->tag as the Task Tag.

Fixes: e293313262d3 ("scsi: ufs: Fix broken task management command implementation")
Link: https://lore.kernel.org/r/1617262750-4864-3-git-send-email-c...@codeaurora.org
Reviewed-by: Bart Van Assche 
Signed-off-by: Can Guo 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ufs/ufshcd.c | 30 +++++++++++++-----------------
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index c801f88007dd..e53a3f89e863 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6363,38 +6363,34 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
DECLARE_COMPLETION_ONSTACK(wait);
    struct request *req;
unsigned long flags;
-   int free_slot, task_tag, err;
+   int task_tag, err;
 
/*
-* Get free slot, sleep if slots are unavailable.
-* Even though we use wait_event() which sleeps indefinitely,
-* the maximum wait time is bounded by %TM_CMD_TIMEOUT.
+* blk_get_request() is used here only to get a free tag.
 */
req = blk_get_request(q, REQ_OP_DRV_OUT, 0);
if (IS_ERR(req))
return PTR_ERR(req);
 
req->end_io_data = &wait;
-   free_slot = req->tag;
-   WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);
ufshcd_hold(hba, false);
 
spin_lock_irqsave(host->host_lock, flags);
-   task_tag = hba->nutrs + free_slot;
blk_mq_start_request(req);
 
+   task_tag = req->tag;
treq->req_header.dword_0 |= cpu_to_be32(task_tag);
 
-   memcpy(hba->utmrdl_base_addr + free_slot, treq, sizeof(*treq));
-   ufshcd_vops_setup_task_mgmt(hba, free_slot, tm_function);
+   memcpy(hba->utmrdl_base_addr + task_tag, treq, sizeof(*treq));
+   ufshcd_vops_setup_task_mgmt(hba, task_tag, tm_function);
 
/* send command to the controller */
-   __set_bit(free_slot, &hba->outstanding_tasks);
+   __set_bit(task_tag, &hba->outstanding_tasks);
 
/* Make sure descriptors are ready before ringing the task doorbell */
wmb();
 
-   ufshcd_writel(hba, 1 << free_slot, REG_UTP_TASK_REQ_DOOR_BELL);
+   ufshcd_writel(hba, 1 << task_tag, REG_UTP_TASK_REQ_DOOR_BELL);
/* Make sure that doorbell is committed immediately */
wmb();
 
@@ -6414,24 +6410,24 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
ufshcd_add_tm_upiu_trace(hba, task_tag, "tm_complete_err");
dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
__func__, tm_function);
-   if (ufshcd_clear_tm_cmd(hba, free_slot))
-   dev_WARN(hba->dev, "%s: unable clear tm cmd (slot %d) after timeout\n",
-   __func__, free_slot);
+   if (ufshcd_clear_tm_cmd(hba, task_tag))
+   dev_WARN(hba->dev, "%s: unable to clear tm cmd (slot %d) after timeout\n",
+   __func__, task_tag);
err = -ETIMEDOUT;
} else {
err = 0;
-   memcpy(treq, hba->utmrdl_base_addr + free_slot, sizeof(*treq));
+   memcpy(treq, hba->utmrdl_base_addr + task_tag, sizeof(*treq));
 
ufshcd_add_tm_upiu_trace(hba, task_tag, "tm_complete");
}
 
spin_lock_irqsave(hba->host->host_lock, flags);
-   __clear_bit(free_slot, &hba->outstanding_tasks);
+   __clear_bit(task_tag, &hba->outstanding_tasks);
spin_unlock_irqrestore(hba->host->host_lock, flags);
 
+   ufshcd_release(hba);
blk_put_request(req);
 
-   ufshcd_release(hba);
return err;
 }
 
-- 
2.30.2
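
Note on the patch above: the change uses one value, req->tag, for the UPIU Task Tag, the descriptor slot, the outstanding_tasks bit, and the doorbell bit. A minimal sketch of that bookkeeping follows, assuming a simplified stand-in for the driver state. struct tm_model and the helpers old_task_tag(), new_task_tag(), and issue_tm() are hypothetical names for illustration and are not part of ufshcd.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant driver state; not struct ufs_hba. */
struct tm_model {
    int nutrs;                        /* number of transfer request slots */
    int nutmrs;                       /* number of task management slots */
    unsigned long outstanding_tasks;  /* one bit per busy TM slot */
    uint32_t doorbell;                /* models REG_UTP_TASK_REQ_DOOR_BELL */
};

/* Before the patch: the Task Tag written to the UPIU was offset by nutrs. */
static int old_task_tag(const struct tm_model *m, int req_tag)
{
    return m->nutrs + req_tag;
}

/* After the patch: the block layer tag is used unchanged everywhere. */
static int new_task_tag(const struct tm_model *m, int req_tag)
{
    assert(req_tag >= 0 && req_tag < m->nutmrs);
    return req_tag;
}

/* Mark the slot busy and ring the doorbell bit for the same index. */
static void issue_tm(struct tm_model *m, int task_tag)
{
    m->outstanding_tasks |= 1UL << task_tag;
    m->doorbell |= (uint32_t)1 << task_tag;
}
```

With nutrs = 32 and a block layer tag of 3, the old scheme would put 35 in the UPIU while slot 3 was used for the doorbell; the new scheme uses 3 in both places.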

[PATCH 5.11 158/210] scsi: ufs: core: Fix task management request completion timeout

2021-04-12 Thread Greg Kroah-Hartman

From: Can Guo 

[ Upstream commit 1235fc569e0bf541ddda0a1224d4c6fa6d914890 ]

ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn = ufshcd_compl_tm()),
but since blk_mq_tagset_busy_iter() only iterates over all reserved tags
and requests which are not in IDLE state, ufshcd_compl_tm() never gets a
chance to run. Thus, TMR always ends up with completion timeout. Fix it by
calling blk_mq_start_request() in __ufshcd_issue_tm_cmd().

Link: https://lore.kernel.org/r/1617262750-4864-2-git-send-email-c...@codeaurora.org
Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs")
Reviewed-by: Bart Van Assche 
Signed-off-by: Can Guo 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ufs/ufshcd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 16e1bd1aa49d..c801f88007dd 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6381,6 +6381,7 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 
spin_lock_irqsave(host->host_lock, flags);
task_tag = hba->nutrs + free_slot;
+   blk_mq_start_request(req);
 
treq->req_header.dword_0 |= cpu_to_be32(task_tag);
 
-- 
2.30.2
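
Note on the patch above: the commit message says the tag iterator skips requests that are still in the IDLE state, so the completion callback is never invoked unless the request has been started. The sketch below models only that filtering rule in user space. The names model_rq, busy_iter(), start_request(), and compl_tm() are hypothetical, and the real blk_mq_tagset_busy_iter() does considerably more than this.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified request states; the kernel's own state enum is not reproduced here. */
enum rq_state { RQ_IDLE, RQ_IN_FLIGHT };

struct model_rq {
    enum rq_state state;
    int completed;          /* set by the completion callback */
};

typedef void (*busy_fn)(struct model_rq *rq);

/* Call fn on every request that has left the idle state; return the count visited. */
static int busy_iter(struct model_rq *rqs, size_t n, busy_fn fn)
{
    int visited = 0;

    assert(fn != NULL);
    for (size_t i = 0; i < n; i++) {
        if (rqs[i].state == RQ_IDLE)
            continue;       /* idle requests are invisible to the iterator */
        fn(&rqs[i]);
        visited++;
    }
    return visited;
}

/* Models the effect of starting a request: it leaves the idle state. */
static void start_request(struct model_rq *rq)
{
    rq->state = RQ_IN_FLIGHT;
}

/* Models the completion callback run by the interrupt handler. */
static void compl_tm(struct model_rq *rq)
{
    rq->completed = 1;
}
```

In this model a request that was allocated but never started is skipped by busy_iter(), which corresponds to the completion timeout described in the commit message.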

[PATCH 5.11 146/210] net/mlx5: Dont request more than supported EQs

2021-04-12 Thread Greg Kroah-Hartman

From: Daniel Jurgens 

[ Upstream commit a7b76002ae78cd230ee652ccdfedf21aa94fcecc ]

Calculating the number of completion EQs based on the number of
available IRQ vectors doesn't work now that all async EQs share one IRQ.
Thus the max number of EQs can be exceeded on systems with more than
approximately 256 CPUs. Take this into account when calculating the
number of available completion EQs.

Fixes: 81bfa206032a ("net/mlx5: Use a single IRQ for all async EQs")
Signed-off-by: Daniel Jurgens 
Reviewed-by: Parav Pandit 
Signed-off-by: Saeed Mahameed 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index fc0afa03d407..b5f48efebd71 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -928,13 +928,24 @@ void mlx5_core_eq_free_irqs(struct mlx5_core_dev *dev)
mutex_unlock(&table->lock);
 }
 
+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
+#define MLX5_MAX_ASYNC_EQS 4
+#else
+#define MLX5_MAX_ASYNC_EQS 3
+#endif
+
 int mlx5_eq_table_create(struct mlx5_core_dev *dev)
 {
struct mlx5_eq_table *eq_table = dev->priv.eq_table;
+   int num_eqs = MLX5_CAP_GEN(dev, max_num_eqs) ?
+ MLX5_CAP_GEN(dev, max_num_eqs) :
+ 1 << MLX5_CAP_GEN(dev, log_max_eq);
int err;
 
eq_table->num_comp_eqs =
-   mlx5_irq_get_num_comp(eq_table->irq_table);
+   min_t(int,
+ mlx5_irq_get_num_comp(eq_table->irq_table),
+ num_eqs - MLX5_MAX_ASYNC_EQS);
 
err = create_async_eqs(dev);
if (err) {
-- 
2.30.2
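
Note on the patch above: the new limit is the smaller of the completion IRQ count and the device EQ budget minus the async EQs. The arithmetic can be sketched in isolation as below. The function names total_eqs() and num_comp_eqs() and the MODEL_MAX_ASYNC_EQS macro are hypothetical; the value 3 matches the patch's non-ODP case, and the log_max_eq value of 8 in the example after the block is an assumption chosen to match the "approximately 256 CPUs" remark.

```c
#include <assert.h>

/* 3 in the patch without on-demand paging, 4 with it. */
#define MODEL_MAX_ASYNC_EQS 3

/* Total EQs the device supports: the explicit cap if nonzero, else 2^log_max_eq. */
static int total_eqs(int max_num_eqs, int log_max_eq)
{
    return max_num_eqs ? max_num_eqs : 1 << log_max_eq;
}

/* Completion EQs: bounded by both the IRQ count and the remaining EQ budget. */
static int num_comp_eqs(int comp_irqs, int max_num_eqs, int log_max_eq)
{
    int budget = total_eqs(max_num_eqs, log_max_eq) - MODEL_MAX_ASYNC_EQS;

    assert(budget > 0);
    return comp_irqs < budget ? comp_irqs : budget;
}
```

For example, with 320 completion vectors, max_num_eqs reported as 0, and log_max_eq of 8, the budget is 256 - 3 = 253, so 253 completion EQs are requested instead of 320.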

[PATCH 5.10 143/188] scsi: ufs: core: Fix wrong Task Tag used in task management request UPIUs

2021-04-12 Thread Greg Kroah-Hartman

From: Can Guo 

[ Upstream commit 4b42d557a8add52b9a9924fb31e40a218aab7801 ]

In __ufshcd_issue_tm_cmd(), it is not correct to use hba->nutrs + req->tag
as the Task Tag in a TMR UPIU. Directly use req->tag as the Task Tag.

Fixes: e293313262d3 ("scsi: ufs: Fix broken task management command implementation")
Link: https://lore.kernel.org/r/1617262750-4864-3-git-send-email-c...@codeaurora.org
Reviewed-by: Bart Van Assche 
Signed-off-by: Can Guo 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ufs/ufshcd.c | 30 +-
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 7e1168ee2474..4215d9a8e5de 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6256,38 +6256,34 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
DECLARE_COMPLETION_ONSTACK(wait);
    struct request *req;
unsigned long flags;
-   int free_slot, task_tag, err;
+   int task_tag, err;
 
/*
-* Get free slot, sleep if slots are unavailable.
-* Even though we use wait_event() which sleeps indefinitely,
-* the maximum wait time is bounded by %TM_CMD_TIMEOUT.
+* blk_get_request() is used here only to get a free tag.
 */
req = blk_get_request(q, REQ_OP_DRV_OUT, 0);
if (IS_ERR(req))
return PTR_ERR(req);
 
req->end_io_data = &wait;
-   free_slot = req->tag;
-   WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);
ufshcd_hold(hba, false);
 
spin_lock_irqsave(host->host_lock, flags);
-   task_tag = hba->nutrs + free_slot;
blk_mq_start_request(req);
 
+   task_tag = req->tag;
treq->req_header.dword_0 |= cpu_to_be32(task_tag);
 
-   memcpy(hba->utmrdl_base_addr + free_slot, treq, sizeof(*treq));
-   ufshcd_vops_setup_task_mgmt(hba, free_slot, tm_function);
+   memcpy(hba->utmrdl_base_addr + task_tag, treq, sizeof(*treq));
+   ufshcd_vops_setup_task_mgmt(hba, task_tag, tm_function);
 
/* send command to the controller */
-   __set_bit(free_slot, &hba->outstanding_tasks);
+   __set_bit(task_tag, &hba->outstanding_tasks);
 
/* Make sure descriptors are ready before ringing the task doorbell */
wmb();
 
-   ufshcd_writel(hba, 1 << free_slot, REG_UTP_TASK_REQ_DOOR_BELL);
+   ufshcd_writel(hba, 1 << task_tag, REG_UTP_TASK_REQ_DOOR_BELL);
/* Make sure that doorbell is committed immediately */
wmb();
 
@@ -6307,24 +6303,24 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
ufshcd_add_tm_upiu_trace(hba, task_tag, "tm_complete_err");
dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
__func__, tm_function);
-   if (ufshcd_clear_tm_cmd(hba, free_slot))
-   dev_WARN(hba->dev, "%s: unable clear tm cmd (slot %d) after timeout\n",
-   __func__, free_slot);
+   if (ufshcd_clear_tm_cmd(hba, task_tag))
+   dev_WARN(hba->dev, "%s: unable to clear tm cmd (slot %d) after timeout\n",
+   __func__, task_tag);
err = -ETIMEDOUT;
} else {
err = 0;
-   memcpy(treq, hba->utmrdl_base_addr + free_slot, sizeof(*treq));
+   memcpy(treq, hba->utmrdl_base_addr + task_tag, sizeof(*treq));
 
ufshcd_add_tm_upiu_trace(hba, task_tag, "tm_complete");
}
 
spin_lock_irqsave(hba->host->host_lock, flags);
-   __clear_bit(free_slot, &hba->outstanding_tasks);
+   __clear_bit(task_tag, &hba->outstanding_tasks);
spin_unlock_irqrestore(hba->host->host_lock, flags);
 
+   ufshcd_release(hba);
blk_put_request(req);
 
-   ufshcd_release(hba);
return err;
 }
 
-- 
2.30.2

[PATCH 5.10 142/188] scsi: ufs: core: Fix task management request completion timeout

2021-04-12 Thread Greg Kroah-Hartman

From: Can Guo 

[ Upstream commit 1235fc569e0bf541ddda0a1224d4c6fa6d914890 ]

ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn = ufshcd_compl_tm()),
but since blk_mq_tagset_busy_iter() only iterates over all reserved tags
and requests which are not in IDLE state, ufshcd_compl_tm() never gets a
chance to run. Thus, TMR always ends up with completion timeout. Fix it by
calling blk_mq_start_request() in __ufshcd_issue_tm_cmd().

Link: https://lore.kernel.org/r/1617262750-4864-2-git-send-email-c...@codeaurora.org
Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs")
Reviewed-by: Bart Van Assche 
Signed-off-by: Can Guo 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ufs/ufshcd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 97d9d5d99adc..7e1168ee2474 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6274,6 +6274,7 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 
spin_lock_irqsave(host->host_lock, flags);
task_tag = hba->nutrs + free_slot;
+   blk_mq_start_request(req);
 
treq->req_header.dword_0 |= cpu_to_be32(task_tag);
 
-- 
2.30.2

[PATCH 5.10 157/188] net: hns3: clear VF down state bit before request link status

2021-04-12 Thread Greg Kroah-Hartman

From: Guangbin Huang 

[ Upstream commit ed7bedd2c3ca040f1e8ea02c6590a93116b1ec78 ]

Currently, the VF down state bit is cleared after the VF sends the
link status request command. The problem is that when the VF receives
the link status reply from the PF, the down state bit may still be set
to 1. In this case, the link status reply from the PF is ignored and
the VF link status is always set to down.

To fix this problem, clear the VF down state bit before the VF requests
link status.

Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Guangbin Huang 
Signed-off-by: Huazhong Tan 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index dc5d150a9c54..ac6980acb6f0 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -2554,14 +2554,14 @@ static int hclgevf_ae_start(struct hnae3_handle *handle)
 {
struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
 
+   clear_bit(HCLGEVF_STATE_DOWN, &hdev->state);
+
hclgevf_reset_tqp_stats(handle);
 
hclgevf_request_link_info(hdev);
 
hclgevf_update_link_mode(hdev);
 
-   clear_bit(HCLGEVF_STATE_DOWN, &hdev->state);
-
return 0;
 }
 
-- 
2.30.2

[PATCH 5.10 133/188] net/mlx5: Dont request more than supported EQs

2021-04-12 Thread Greg Kroah-Hartman

From: Daniel Jurgens 

[ Upstream commit a7b76002ae78cd230ee652ccdfedf21aa94fcecc ]

Calculating the number of completion EQs based on the number of
available IRQ vectors doesn't work now that all async EQs share one IRQ.
Thus the max number of EQs can be exceeded on systems with more than
approximately 256 CPUs. Take this into account when calculating the
number of available completion EQs.

Fixes: 81bfa206032a ("net/mlx5: Use a single IRQ for all async EQs")
Signed-off-by: Daniel Jurgens 
Reviewed-by: Parav Pandit 
Signed-off-by: Saeed Mahameed 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 8ebfe782f95e..ccd53a7a2b80 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -926,13 +926,24 @@ void mlx5_core_eq_free_irqs(struct mlx5_core_dev *dev)
mutex_unlock(&table->lock);
 }
 
+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
+#define MLX5_MAX_ASYNC_EQS 4
+#else
+#define MLX5_MAX_ASYNC_EQS 3
+#endif
+
 int mlx5_eq_table_create(struct mlx5_core_dev *dev)
 {
struct mlx5_eq_table *eq_table = dev->priv.eq_table;
+   int num_eqs = MLX5_CAP_GEN(dev, max_num_eqs) ?
+ MLX5_CAP_GEN(dev, max_num_eqs) :
+ 1 << MLX5_CAP_GEN(dev, log_max_eq);
int err;
 
eq_table->num_comp_eqs =
-   mlx5_irq_get_num_comp(eq_table->irq_table);
+   min_t(int,
+ mlx5_irq_get_num_comp(eq_table->irq_table),
+ num_eqs - MLX5_MAX_ASYNC_EQS);
 
err = create_async_eqs(dev);
if (err) {
-- 
2.30.2

[PATCH 5.4 070/111] net/mlx5: Dont request more than supported EQs

2021-04-12 Thread Greg Kroah-Hartman

From: Daniel Jurgens 

[ Upstream commit a7b76002ae78cd230ee652ccdfedf21aa94fcecc ]

Calculating the number of completion EQs based on the number of
available IRQ vectors doesn't work now that all async EQs share one IRQ.
Thus the max number of EQs can be exceeded on systems with more than
approximately 256 CPUs. Take this into account when calculating the
number of available completion EQs.

Fixes: 81bfa206032a ("net/mlx5: Use a single IRQ for all async EQs")
Signed-off-by: Daniel Jurgens 
Reviewed-by: Parav Pandit 
Signed-off-by: Saeed Mahameed 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 0a20938b4aad..30a2ee3c40a0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -926,13 +926,24 @@ void mlx5_core_eq_free_irqs(struct mlx5_core_dev *dev)
mutex_unlock(&table->lock);
 }
 
+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
+#define MLX5_MAX_ASYNC_EQS 4
+#else
+#define MLX5_MAX_ASYNC_EQS 3
+#endif
+
 int mlx5_eq_table_create(struct mlx5_core_dev *dev)
 {
struct mlx5_eq_table *eq_table = dev->priv.eq_table;
+   int num_eqs = MLX5_CAP_GEN(dev, max_num_eqs) ?
+ MLX5_CAP_GEN(dev, max_num_eqs) :
+ 1 << MLX5_CAP_GEN(dev, log_max_eq);
int err;
 
eq_table->num_comp_eqs =
-   mlx5_irq_get_num_comp(eq_table->irq_table);
+   min_t(int,
+ mlx5_irq_get_num_comp(eq_table->irq_table),
+ num_eqs - MLX5_MAX_ASYNC_EQS);
 
err = create_async_eqs(dev);
if (err) {
-- 
2.30.2

[PATCH 5.4 089/111] net: hns3: clear VF down state bit before request link status

2021-04-12 Thread Greg Kroah-Hartman

From: Guangbin Huang 

[ Upstream commit ed7bedd2c3ca040f1e8ea02c6590a93116b1ec78 ]

Currently, the VF down state bit is cleared after the VF sends the
link status request command. The problem is that when the VF receives
the link status reply from the PF, the down state bit may still be set
to 1. In this case, the link status reply from the PF is ignored and
the VF link status is always set to down.

To fix this problem, clear the VF down state bit before the VF requests
link status.

Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Guangbin Huang 
Signed-off-by: Huazhong Tan 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index 9b09dd95e878..fc275d4f484c 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -2140,14 +2140,14 @@ static int hclgevf_ae_start(struct hnae3_handle *handle)
 {
struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
 
+   clear_bit(HCLGEVF_STATE_DOWN, &hdev->state);
+
hclgevf_reset_tqp_stats(handle);
 
hclgevf_request_link_info(hdev);
 
hclgevf_update_link_mode(hdev);
 
-   clear_bit(HCLGEVF_STATE_DOWN, &hdev->state);
-
return 0;
 }
 
-- 
2.30.2

[PATCH 5.4 080/111] scsi: ufs: core: Fix wrong Task Tag used in task management request UPIUs

2021-04-12 Thread Greg Kroah-Hartman

From: Can Guo 

[ Upstream commit 4b42d557a8add52b9a9924fb31e40a218aab7801 ]

In __ufshcd_issue_tm_cmd(), it is not correct to use hba->nutrs + req->tag
as the Task Tag in a TMR UPIU. Directly use req->tag as the Task Tag.

Fixes: e293313262d3 ("scsi: ufs: Fix broken task management command implementation")
Link: https://lore.kernel.org/r/1617262750-4864-3-git-send-email-c...@codeaurora.org
Reviewed-by: Bart Van Assche 
Signed-off-by: Can Guo 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ufs/ufshcd.c | 30 +-
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 3f20270f0ca0..b81eebc7e2df 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5695,35 +5695,31 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
DECLARE_COMPLETION_ONSTACK(wait);
    struct request *req;
unsigned long flags;
-   int free_slot, task_tag, err;
+   int task_tag, err;
 
/*
-* Get free slot, sleep if slots are unavailable.
-* Even though we use wait_event() which sleeps indefinitely,
-* the maximum wait time is bounded by %TM_CMD_TIMEOUT.
+* blk_get_request() is used here only to get a free tag.
 */
req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED);
req->end_io_data = &wait;
-   free_slot = req->tag;
-   WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);
ufshcd_hold(hba, false);
 
spin_lock_irqsave(host->host_lock, flags);
-   task_tag = hba->nutrs + free_slot;
blk_mq_start_request(req);
 
+   task_tag = req->tag;
treq->req_header.dword_0 |= cpu_to_be32(task_tag);
 
-   memcpy(hba->utmrdl_base_addr + free_slot, treq, sizeof(*treq));
-   ufshcd_vops_setup_task_mgmt(hba, free_slot, tm_function);
+   memcpy(hba->utmrdl_base_addr + task_tag, treq, sizeof(*treq));
+   ufshcd_vops_setup_task_mgmt(hba, task_tag, tm_function);
 
/* send command to the controller */
-   __set_bit(free_slot, &hba->outstanding_tasks);
+   __set_bit(task_tag, &hba->outstanding_tasks);
 
/* Make sure descriptors are ready before ringing the task doorbell */
wmb();
 
-   ufshcd_writel(hba, 1 << free_slot, REG_UTP_TASK_REQ_DOOR_BELL);
+   ufshcd_writel(hba, 1 << task_tag, REG_UTP_TASK_REQ_DOOR_BELL);
/* Make sure that doorbell is committed immediately */
wmb();
 
@@ -5743,24 +5739,24 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
ufshcd_add_tm_upiu_trace(hba, task_tag, "tm_complete_err");
dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
__func__, tm_function);
-   if (ufshcd_clear_tm_cmd(hba, free_slot))
-   dev_WARN(hba->dev, "%s: unable clear tm cmd (slot %d) after timeout\n",
-   __func__, free_slot);
+   if (ufshcd_clear_tm_cmd(hba, task_tag))
+   dev_WARN(hba->dev, "%s: unable to clear tm cmd (slot %d) after timeout\n",
+   __func__, task_tag);
err = -ETIMEDOUT;
} else {
err = 0;
-   memcpy(treq, hba->utmrdl_base_addr + free_slot, sizeof(*treq));
+   memcpy(treq, hba->utmrdl_base_addr + task_tag, sizeof(*treq));
 
ufshcd_add_tm_upiu_trace(hba, task_tag, "tm_complete");
}
 
spin_lock_irqsave(hba->host->host_lock, flags);
-   __clear_bit(free_slot, &hba->outstanding_tasks);
+   __clear_bit(task_tag, &hba->outstanding_tasks);
spin_unlock_irqrestore(hba->host->host_lock, flags);
 
+   ufshcd_release(hba);
blk_put_request(req);
 
-   ufshcd_release(hba);
return err;
 }
 
-- 
2.30.2

[PATCH 5.4 079/111] scsi: ufs: core: Fix task management request completion timeout

2021-04-12 Thread Greg Kroah-Hartman

From: Can Guo 

[ Upstream commit 1235fc569e0bf541ddda0a1224d4c6fa6d914890 ]

ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn = ufshcd_compl_tm()),
but since blk_mq_tagset_busy_iter() only iterates over all reserved tags
and requests which are not in IDLE state, ufshcd_compl_tm() never gets a
chance to run. Thus, TMR always ends up with completion timeout. Fix it by
calling blk_mq_start_request() in __ufshcd_issue_tm_cmd().

Link: https://lore.kernel.org/r/1617262750-4864-2-git-send-email-c...@codeaurora.org
Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs")
Reviewed-by: Bart Van Assche 
Signed-off-by: Can Guo 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ufs/ufshcd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index e7e6405401dd..3f20270f0ca0 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5710,6 +5710,7 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 
spin_lock_irqsave(host->host_lock, flags);
task_tag = hba->nutrs + free_slot;
+   blk_mq_start_request(req);
 
treq->req_header.dword_0 |= cpu_to_be32(task_tag);
 
-- 
2.30.2

Re: BUG: unable to handle kernel paging request in bpf_check

2021-04-12 Thread Hao Sun

Besides, another similar bug occurred while fault injection was enabled.

BUG: unable to handle kernel paging request in bpf_prog_alloc_no_stats

RAX: ffda RBX: 0059c080 RCX: 0047338d
RDX: 0078 RSI: 2300 RDI: 0005
RBP: 7f7e3c38fc90 R08:  R09: 
R10:  R11: 0246 R12: 0004
R13: 7ffed3a1dd6f R14: 7ffed3a1df10 R15: 7f7e3c38fdc0
BUG: unable to handle page fault for address: 91f2077ed028
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 1810067 P4D 1810067 PUD 1915067 PMD 3b907067 PTE 0
Oops: 0002 [#1] SMP
CPU: 3 PID: 17344 Comm: executor Not tainted 5.12.0-rc6+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:bpf_prog_alloc_no_stats+0x251/0x6e0 kernel/bpf/core.c:94
Code: 45 b0 4c 8d 78 28 4d 8b a5 20 03 00 00 41 8b 85 a8 0f 00 00 89
45 c8 48 83 7d a8 00 0f 85 2e 03 00 00 4c 89 ff e8 4f 18 60 00 <4c> 89
20 4d 85 e4 0f 85 27 03 00 00 49 89 1f 4d 85 e4 74 0c 49 f7
RSP: 0018:89f2077cfaa8 EFLAGS: 00010286
RAX: 91f2077ed028 RBX: 096680024de8 RCX: 91f2077ed028
RDX: 99f2077ed028 RSI: 0008 RDI: 89f2077ed028
RBP: 89f2077cfb28 R08: d7eb800f R09: 888b7ffd3000
R10: 037a R11:  R12: 
R13: 888b1465aad8 R14: 04c3 R15: 89f2077ed028
FS:  7f7e3c390700() GS:888b7fd0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 91f2077ed028 CR3: 44802004 CR4: 00770ee0
PKRU: 5554
Call Trace:
 bpf_prog_alloc+0x74/0x310 kernel/bpf/core.c:119
 bpf_prog_load kernel/bpf/syscall.c:2162 [inline]
 __do_sys_bpf+0x11af3/0x17290 kernel/bpf/syscall.c:4393
 __se_sys_bpf+0x8e/0xa0 kernel/bpf/syscall.c:4351
 __x64_sys_bpf+0x4a/0x70 kernel/bpf/syscall.c:4351
 do_syscall_64+0xa2/0x120 arch/x86/entry/common.c:48
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x47338d
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:7f7e3c38fc58 EFLAGS: 0246 ORIG_RAX: 0141
RAX: ffda RBX: 0059c080 RCX: 0047338d
RDX: 0078 RSI: 2300 RDI: 0005
RBP: 7f7e3c38fc90 R08:  R09: 
R10:  R11: 0246 R12: 0004
R13: 7ffed3a1dd6f R14: 7ffed3a1df10 R15: 7f7e3c38fdc0
Modules linked in:
Dumping ftrace buffer:
   (ftrace buffer empty)
CR2: 91f2077ed028
---[ end trace bc1de9e0e1b51e8c ]---
RIP: 0010:bpf_prog_alloc_no_stats+0x251/0x6e0 kernel/bpf/core.c:94
Code: 45 b0 4c 8d 78 28 4d 8b a5 20 03 00 00 41 8b 85 a8 0f 00 00 89
45 c8 48 83 7d a8 00 0f 85 2e 03 00 00 4c 89 ff e8 4f 18 60 00 <4c> 89
20 4d 85 e4 0f 85 27 03 00 00 49 89 1f 4d 85 e4 74 0c 49 f7
RSP: 0018:89f2077cfaa8 EFLAGS: 00010286
RAX: 91f2077ed028 RBX: 096680024de8 RCX: 91f2077ed028
RDX: 99f2077ed028 RSI: 0008 RDI: 89f2077ed028
RBP: 89f2077cfb28 R08: d7eb800f R09: 888b7ffd3000
R10: 037a R11:  R12: 
R13: 888b1465aad8 R14: 04c3 R15: 89f2077ed028
FS:  7f7e3c390700() GS:888b7fd0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 91f2077ed028 CR3: 44802004 CR4: 00770ee0
PKRU: 5554

The following system call sequence (Syzlang format) can reproduce the crash:
# {Threaded:false Collide:false Repeat:true RepeatTimes:0 Procs:1
Slowdown:1 Sandbox:none Fault:true FaultCall:0 FaultNth:4 Leak:false
NetInjection:true NetDevices:true NetReset:true Cgroups:true
BinfmtMisc:true CloseFDs:true KCSAN:false DevlinkPCI:true USB:true
VhciInjection:true Wifi:true IEEE802154:true Sysctl:true
UseTmpDir:true HandleSegv:true Repro:false Trace:false}

bpf$BPF_PROG_WITH_BTFID_LOAD(0x5, &(0x7f000300)=@bpf_ext={0x1c,
0x8, &(0x7f0001c0)=@raw=[@initr0={0x18, 0x0, 0x0, 0x0,
0x4953b92f0467cc49, 0x0, 0x0, 0x0, 0xdbd689758db6b4a7}, @func={0x85,
0x0, 0x1, 0x0, 0x1}, @exit, @generic={0xd3c15618b9efaeff, 0x0, 0x0,
0x0, 0xc0fc52df13f3fbec}, @map_val={0x18, 0x0, 0x2, 0x0, 0x0, 0x0,
0x0, 0x0, 0xf7a72204b1b46d92}, @jmp], &(0x7f000200)='GPL\x00',
0x0, 0x0, 0x0, 0x0, 0x9, [], 0x0, 0x0, 0x0, 0x8, 0x0, 0x0, 0x10, 0x0,
0x0, 0x0, 0x0}, 0x78)

Using syz-execprog can run this reproduction program directly:
 ./syz-execprog -repeat 0 -procs 1 -slowdown 1 -fault_call 0
-fault_nth 4 -enable tun -enable netdev -enable resetnet -enable
cgroups -enable binfmt-misc -enable close_fds -enable devlinkpci
-enable usb -enable vhci -enabl

Re: BUG: unable to handle kernel paging request in __build_skb

2021-04-11 Thread Willem de Bruijn

On Sun, Apr 11, 2021 at 9:31 PM Hao Sun  wrote:
>
> Hi
>
> When using Healer(https://github.com/SunHao-0/healer/tree/dev) to fuzz
> the Linux kernel, I found the following bug report, but I'm not sure
> about this.
> Sorry, I do not have a reproducing program for this bug.
> I hope that the stack trace information in the crash log can help you
> locate the problem.
>
> Here are the details:
> commit:   4ebaab5fb428374552175aa39832abf5cedb916a
> version:   linux 5.12
> git tree:kmsan
>
> ==
> RAX: ffda RBX: 0059c080 RCX: 0047338d
> RDX: 0010 RSI: 20002400 RDI: 0003
> RBP: 7fb6512c2c90 R08:  R09: 
> R10:  R11: 0246 R12: 0005
> R13: 7fffbb36285f R14: 7fffbb362a00 R15: 7fb6512c2dc0
> BUG: unable to handle page fault for address: a73d01c96a40
> #PF: supervisor write access in kernel mode
> #PF: error_code(0x0002) - not-present page
> PGD 1810067 P4D 1810067 PUD 1915067 PMD 4b84067 PTE 0
> Oops: 0002 [#1] SMP
> CPU: 0 PID: 6273 Comm: syz-executor Not tainted 5.12.0-rc6+ #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> RIP: 0010:memset_erms+0x9/0x10 arch/x86/lib/memset_64.S:64
> Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6
> f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1  aa
> 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01
> RSP: 0018:9f3d01c9b930 EFLAGS: 00010082
> RAX: a73d01c96a00 RBX: 0020 RCX: 0020
> RDX: 0020 RSI:  RDI: a73d01c96a40
> RBP: 9f3d01c9b960 R08: c239000f R09: a73d01c96a40
> R10: 7dee4e6b R11: b2000782 R12: 
> R13: 0020 R14:  R15: 9f3d01c96a40
> FS:  7fb6512c3700() GS:97407fa0() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: a73d01c96a40 CR3: 30087005 CR4: 00770ef0
> PKRU: 5554
> Call Trace:
>  kmsan_internal_unpoison_shadow+0x1d/0x70 mm/kmsan/kmsan.c:110
>  __msan_memset+0x64/0xb0 mm/kmsan/kmsan_instr.c:130
>  __build_skb_around net/core/skbuff.c:209 [inline]
>  __build_skb+0x34b/0x520 net/core/skbuff.c:243
>  netlink_alloc_large_skb net/netlink/af_netlink.c:1193 [inline]
>  netlink_sendmsg+0xdc1/0x14d0 net/netlink/af_netlink.c:1902
>  sock_sendmsg_nosec net/socket.c:654 [inline]
>  sock_sendmsg net/socket.c:674 [inline]

I don't have an idea what might be up, but some context:

This happens in __build_skb_around at

memset(shinfo, 0, offsetof(struct skb_shared_info, dataref));

on vmalloc'd memory in netlink_alloc_large_skb:

data = vmalloc(size);
if (data == NULL)
return NULL;

skb = __build_skb(data, size);
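
For readers unfamiliar with the layout: the shared info block lives at the tail of the data buffer, so the memset above writes near the end of the vmalloc'd region. The following is a rough user-space sketch of that placement, assuming a simplified structure and a 64-byte alignment. struct model_shared_info, MODEL_DATA_ALIGN, model_shinfo(), and model_build_around() are hypothetical stand-ins; the real struct skb_shared_info and SKB_DATA_ALIGN differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_ALIGN 64
#define MODEL_DATA_ALIGN(x) (((x) + (MODEL_ALIGN - 1)) & ~(size_t)(MODEL_ALIGN - 1))

/* Hypothetical stand-in; the real struct skb_shared_info has other fields. */
struct model_shared_info {
    unsigned char flags;
    unsigned short nr_frags;
    void *frag_list;
    int dataref;            /* fields from here on are not zeroed by the memset */
    void *destructor_arg;
};

/* The shared info sits in the last aligned chunk of the buffer. */
static struct model_shared_info *model_shinfo(void *data, size_t size)
{
    assert(size >= MODEL_DATA_ALIGN(sizeof(struct model_shared_info)));
    size -= MODEL_DATA_ALIGN(sizeof(struct model_shared_info));
    return (struct model_shared_info *)((unsigned char *)data + size);
}

/* Zero the leading fields up to dataref, then take the first data reference. */
static struct model_shared_info *model_build_around(void *data, size_t size)
{
    struct model_shared_info *shinfo = model_shinfo(data, size);

    memset(shinfo, 0, offsetof(struct model_shared_info, dataref));
    shinfo->dataref = 1;
    return shinfo;
}
```

In this model a 4096-byte buffer places the shared info at offset 4096 - 64, which is why a fault on this write points at the mapping of the buffer's tail rather than at its start.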

Re: [git pull] habanalabs pull request for kernel 5.13

2021-04-10 Thread Greg KH

On Sat, Apr 10, 2021 at 11:01:42PM +0300, Oded Gabbay wrote:
> Hi Greg,
> 
> This is habanalabs pull request for the merge window of kernel 5.13.
> It contains changes and new features, support for new firmware.
> Details are in the tag.
> 
> Thanks,
> Oded
> 
> The following changes since commit b195b20b7145bcae22ad261abc52d68336f5e913:
> 
>   Merge tag 'extcon-next-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-next (2021-04-08 08:45:30 +0200)

Pulled and pushed out, thanks.

greg k-h

[git pull] habanalabs pull request for kernel 5.13

2021-04-10 Thread Oded Gabbay

Hi Greg,

This is habanalabs pull request for the merge window of kernel 5.13.
It contains changes and new features, support for new firmware.
Details are in the tag.

Thanks,
Oded

The following changes since commit b195b20b7145bcae22ad261abc52d68336f5e913:

  Merge tag 'extcon-next-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-next (2021-04-08 08:45:30 +0200)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux.git tags/misc-habanalabs-next-2021-04-10

for you to fetch changes up to b575a7673e3d0396992fc72fce850723d39264e3:

  habanalabs: print f/w boot unknown error (2021-04-09 14:10:32 +0300)


This tag contains habanalabs driver changes for v5.13:

- Add support to reset device after the user closes the file descriptor.
  Because we support a single user, we can reset the device (if needed)
  after a user closes its file descriptor to make sure the device is in
  idle and clean state for the next user.

- Add a new feature to allow the user to wait on interrupt. This is needed
  for future ASICs

- Replace GFP_ATOMIC with GFP_KERNEL wherever possible and add code to
  support failure of allocating with GFP_ATOMIC.

- Update code to support the latest firmware image:
  - More security features are done in the firmware
  - Remove hard-coded assumptions and replace them with values that are
sent to the firmware on loading.
  - Print device unusable error
  - Reset device in case the communication between driver and firmware
gets out of sync.
  - Support new PCI device ids for secured GAUDI.

- Expose current power draw through the INFO IOCTL.

- Support resetting the device upon a request from the BMC (through F/W).

- Always use only a single MSI in GAUDI, due to H/W limitation.

- Improve data-path code by taking out code from spinlock protection.

- Allow user to specify custom timeout per Command Submission.

- Some enhancements to debugfs.

- Various minor changes and improvements.


Alon Mizrahi (1):
  habanalabs: add custom timeout flag per cs

Bharat Jauhari (1):
  habanalabs: move dram scrub to free sequence

Koby Elbaz (2):
  habanalabs: improve utilization calculation
  habanalabs: support DEVICE_UNUSABLE error indication from FW

Oded Gabbay (11):
  habanalabs: reset after device is actually released
  habanalabs: fail reset if device is not idle
  habanalabs: reset_upon_device_release is for bring-up
  habanalabs: print if device is used on FD close
  habanalabs: change default CS timeout to 30 seconds
  habanalabs: use correct define for 32-bit max value
  habanalabs/gaudi: always use single-msi mode
  habanalabs/gaudi: add debugfs to DMA from the device
  habanalabs: remove the store jobs array from CS IOCTL
  habanalabs: use strscpy instead of sprintf and strlcpy
  habanalabs: print f/w boot unknown error

Ofir Bitton (13):
  habanalabs: add reset support when user closes FD
  habanalabs: enable all IRQs for user interrupt support
  habanalabs: wait for interrupt support
  habanalabs: use a single FW loading bringup flag
  habanalabs/gaudi: update extended async event header
  habanalabs: replace GFP_ATOMIC with GFP_KERNEL
  habanalabs: debugfs access to user mapped host addresses
  habanalabs/gaudi: reset device upon BMC request
  habanalabs/gaudi: unsecure TPC cfg status registers
  habanalabs/gaudi: Update async events header
  habanalabs: move relevant datapath work outside cs lock
  habanalabs/gaudi: derive security status from pci id
  habanalabs/gaudi: skip iATU if F/W security is enabled

Ohad Sharabi (6):
  habanalabs: reset device in case of sync error
  habanalabs: skip DISABLE PCI packet to FW on heartbeat
  habanalabs: update hl_boot_if.h
  habanalabs: support legacy and new pll indexes
  habanalabs: send dynamic msi-x indexes to f/w
  habanalabs: update to latest F/W communication header

Sagiv Ozeri (2):
  habanalabs: support HW blocks vm show
  habanalabs: return current power via INFO IOCTL

Tomer Tayar (1):
  habanalabs/gaudi: clear QM errors only if not in stop_on_err mode

Yang Li (1):
  habanalabs: Switch to using the new API kobj_to_dev()

farah kassabri (3):
  habanalabs: set max asid to 2
  habanalabs: avoid soft lockup bug upon mapping error
  habanalabs/gaudi: sync stream add protection to SOB reset flow

 .../ABI/testing/debugfs-driver-habanalabs  |  70 +++-
 drivers/misc/habanalabs/common/command_buffer.c|  12 +-
 .../misc/habanalabs/common/command_submission.c| 368 +
 drivers/misc/habanalabs/common/context.c   |  14 +-
 drivers/misc/habanalabs/common/debugfs.c   | 224 +++--
 drivers/misc/habanal

Re: [PULL REQUEST] i2c for 5.12

2021-04-10 Thread pr-tracker-bot

The pull request you sent on Sat, 10 Apr 2021 13:00:24 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git i2c/for-current

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/12a0cf7241f9ee6b9b62e4c5aad53c43f46817a4

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

[PULL REQUEST] i2c for 5.12

2021-04-10 Thread Wolfram Sang

Linus,

here is a mixture of driver and documentation bugfixes for I2C.

Please pull.

Thanks,

   Wolfram


The following changes since commit 1e28eed17697bcf343c6743f0028cc3b5dd88bf0:

  Linux 5.12-rc3 (2021-03-14 14:41:02 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git i2c/for-current

for you to fetch changes up to df8a39f2911a4c7769e0f760509f556a9e9d37af:

  i2c: imx: mention Oleksij as maintainer of the binding docs (2021-04-08 22:39:12 +0200)


Andy Shevchenko (1):
  i2c: designware: Adjust bus_freq_hz when refuse high speed mode set

Bhaskar Chowdhury (1):
  i2c: stm32f4: Mundane typo fix

Hao Fang (1):
  i2c: hix5hd2: use the correct HiSilicon copyright

Krzysztof Kozlowski (1):
  i2c: exynos5: correct top kerneldoc

Wolfram Sang (4):
  i2c: turn recovery error on init to debug
  i2c: imx: drop me as maintainer of binding docs
  i2c: gpio: update email address in binding docs
  i2c: imx: mention Oleksij as maintainer of the binding docs

周琰杰 (Zhou Yanjie) (1):
  I2C: JZ4780: Fix bug for Ingenic X1000.


with much appreciated quality assurance from

Alain Volmat (1):
  (Rev.) i2c: stm32f4: Mundane typo fix

Barry Song (1):
  (Rev.) i2c: designware: Adjust bus_freq_hz when refuse high speed mode set

Klaus Kudielka (1):
  (Test) i2c: turn recovery error on init to debug

Oleksij Rempel (1):
  (Rev.) i2c: imx: mention Oleksij as maintainer of the binding docs

Pierre-Yves MORDRET (1):
  (Rev.) i2c: stm32f4: Mundane typo fix

Rob Herring (1):
  (Rev.) i2c: imx: mention Oleksij as maintainer of the binding docs

杨文龙 (Yang Wenlong) (1):
  (Test) I2C: JZ4780: Fix bug for Ingenic X1000.

 Documentation/devicetree/bindings/i2c/i2c-gpio.yaml | 2 +-
 Documentation/devicetree/bindings/i2c/i2c-imx.yaml  | 2 +-
 drivers/i2c/busses/i2c-designware-master.c  | 1 +
 drivers/i2c/busses/i2c-exynos5.c| 2 +-
 drivers/i2c/busses/i2c-hix5hd2.c| 2 +-
 drivers/i2c/busses/i2c-jz4780.c | 4 ++--
 drivers/i2c/busses/i2c-stm32f4.c| 2 +-
 drivers/i2c/i2c-core-base.c | 7 ---
 8 files changed, 12 insertions(+), 10 deletions(-)



pull request: linux-firmware: update cxgb4 firmware to 1.25.4.0

2021-04-09 Thread Raju Rangoju

Hi,

Can you please pull the new firmware from the following URL?
git://git.chelsio.net/pub/git/linux-firmware.git for-upstream

The following changes since commit af1ca28f03287b0c60682ab37cc684c773de853f:

  amdgpu: add arcturus firmware (2021-04-05 10:40:08 -0400)

are available in the git repository at:

  git://git.chelsio.net/pub/git/linux-firmware.git for-upstream

for you to fetch changes up to 7daedba9b02bfbb47da89c4dae6f20c91a5e5402:

  cxgb4: Update firmware to revision 1.25.4.0 (2021-04-09 07:45:27 -0700)


Raju Rangoju (1):
  cxgb4: Update firmware to revision 1.25.4.0

 WHENCE  |   6 +++---
 cxgb4/configs/t6-config-default.txt |   8 ++--
 cxgb4/t4fw-1.24.17.0.bin| Bin 568832 -> 0 bytes
 cxgb4/t4fw-1.25.4.0.bin | Bin 0 -> 569856 bytes
 cxgb4/t5fw-1.24.17.0.bin| Bin 672768 -> 0 bytes
 cxgb4/t5fw-1.25.4.0.bin | Bin 0 -> 675328 bytes
 cxgb4/t6fw-1.24.17.0.bin| Bin 727040 -> 0 bytes
 cxgb4/t6fw-1.25.4.0.bin | Bin 0 -> 728064 bytes
 8 files changed, 9 insertions(+), 5 deletions(-)
 delete mode 100644 cxgb4/t4fw-1.24.17.0.bin
 create mode 100644 cxgb4/t4fw-1.25.4.0.bin
 delete mode 100644 cxgb4/t5fw-1.24.17.0.bin
 create mode 100644 cxgb4/t5fw-1.25.4.0.bin
 delete mode 100644 cxgb4/t6fw-1.24.17.0.bin
 create mode 100644 cxgb4/t6fw-1.25.4.0.bin

Re: [PATCH 2/2] KVM: x86: Fix split-irqchip vs interrupt injection window request

2021-04-09 Thread Lai Jiangshan

On Fri, Nov 27, 2020 at 7:26 PM Paolo Bonzini  wrote:
>
> kvm_cpu_accept_dm_intr and kvm_vcpu_ready_for_interrupt_injection are
> a hodge-podge of conditions, hacked together to get something that
> more or less works.  But what is actually needed is much simpler;
> in both cases the fundamental question is, do we have a place to stash
> an interrupt if userspace does KVM_INTERRUPT?
>
> In userspace irqchip mode, that is !vcpu->arch.interrupt.injected.
> Currently kvm_event_needs_reinjection(vcpu) covers it, but it is
> unnecessarily restrictive.
>
> In split irqchip mode it's a bit more complicated, we need to check
> kvm_apic_accept_pic_intr(vcpu) (the IRQ window exit is basically an INTACK
> cycle and thus requires ExtINTs not to be masked) as well as
> !pending_userspace_extint(vcpu).  However, there is no need to
> check kvm_event_needs_reinjection(vcpu), since split irqchip keeps
> pending ExtINT state separate from event injection state, and checking
> kvm_cpu_has_interrupt(vcpu) is wrong too since ExtINT has higher
> priority than APIC interrupts.  In fact the latter fixes a bug:
> when userspace requests an IRQ window vmexit, an interrupt in the
> local APIC can cause kvm_cpu_has_interrupt() to be true and thus
> kvm_vcpu_ready_for_interrupt_injection() to return false.  When this
> happens, vcpu_run does not exit to userspace but the interrupt window
> vmexits keep occurring.  The VM loops without any hope of making progress.
>
> Once we try to fix these with something like
>
>  return kvm_arch_interrupt_allowed(vcpu) &&
> -!kvm_cpu_has_interrupt(vcpu) &&
> -!kvm_event_needs_reinjection(vcpu) &&
> -kvm_cpu_accept_dm_intr(vcpu);
> +(!lapic_in_kernel(vcpu)
> + ? !vcpu->arch.interrupt.injected
> + : (kvm_apic_accept_pic_intr(vcpu)
> +&& !pending_userspace_extint(v)));
>
> we realize two things.  First, thanks to the previous patch the complex
> conditional can reuse !kvm_cpu_has_extint(vcpu).  Second, the interrupt
> window request in vcpu_enter_guest()
>
> bool req_int_win =
> dm_request_for_irq_injection(vcpu) &&
> kvm_cpu_accept_dm_intr(vcpu);
>
> should be kept in sync with kvm_vcpu_ready_for_interrupt_injection():
> it is unnecessary to ask the processor for an interrupt window
> if we would not be able to return to userspace.  Therefore, the
> complex conditional is really the correct implementation of
> kvm_cpu_accept_dm_intr(vcpu).  It all makes sense:
>
> - we can accept an interrupt from userspace if there is a place
>   to stash it (and, for irqchip split, ExtINTs are not masked).
>   Interrupts from userspace _can_ be accepted even if right now
>   EFLAGS.IF=0.

Hello, Paolo

If userspace does KVM_INTERRUPT, vcpu->arch.interrupt.injected is
set immediately, and in inject_pending_event(), we have

	else if (!vcpu->arch.exception.pending) {
		if (vcpu->arch.nmi_injected) {
			kvm_x86_ops.set_nmi(vcpu);
			can_inject = false;
		} else if (vcpu->arch.interrupt.injected) {
			kvm_x86_ops.set_irq(vcpu);
			can_inject = false;
		}
	}

I'm curious whether the kvm_x86_ops.set_irq() call here can queue the
irq while EFLAGS.IF=0. If it cannot, which code prevents it?

I'm asking because I just noticed that an interrupt can be queued
while an exception is pending, and this patch relaxes that even
more.

Note: an interrupt could NOT be queued while an exception was pending
until 664f8e26b00c7 ("KVM: X86: Fix loss of exception which
has not yet been injected"), which I think is dangerous.

Thanks
Lai

>
> - in order to tell userspace we will inject its interrupt ("IRQ
>   window open" i.e. kvm_vcpu_ready_for_interrupt_injection), both
>   KVM and the vCPU need to be ready to accept the interrupt.
>
> ... and this is what the patch implements.
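
The two conditions described in the quoted commit message can be modelled as a pair of predicates. The following is a minimal standalone sketch, not the KVM code: the struct and every `model_` name are hypothetical, and each boolean field merely stands in for the kernel helper of the same name.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical, simplified stand-in for the vCPU state named in the patch
 * description. These fields only mirror the helpers by name; they are not
 * the real KVM structures.
 */
struct vcpu_model {
	bool lapic_in_kernel;          /* true: split irqchip, false: userspace irqchip */
	bool interrupt_injected;       /* models vcpu->arch.interrupt.injected */
	bool apic_accepts_pic_intr;    /* models kvm_apic_accept_pic_intr() */
	bool pending_userspace_extint; /* models pending_userspace_extint() */
	bool arch_interrupt_allowed;   /* models kvm_arch_interrupt_allowed() */
};

/* Is there a place to stash an interrupt if userspace does KVM_INTERRUPT? */
bool model_accept_dm_intr(const struct vcpu_model *v)
{
	if (!v->lapic_in_kernel)
		return !v->interrupt_injected;
	return v->apic_accepts_pic_intr && !v->pending_userspace_extint;
}

/* "IRQ window open": both KVM and the vCPU must be ready. */
bool model_ready_for_interrupt_injection(const struct vcpu_model *v)
{
	return v->arch_interrupt_allowed && model_accept_dm_intr(v);
}

/* Userspace irqchip, nothing injected yet, but interrupts are not allowed. */
void model_kvm_example(void)
{
	struct vcpu_model v = { .arch_interrupt_allowed = false };

	assert(model_accept_dm_intr(&v));                 /* can be stashed */
	assert(!model_ready_for_interrupt_injection(&v)); /* window not open */
}
```

In this model an interrupt from userspace can be accepted while the vCPU itself is not ready, which matches the first bullet of the quoted description; the window is only reported open once both predicates hold.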
>
> Reported-by: David Woodhouse 
> Analyzed-by: David Woodhouse 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Paolo Bonzini 
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/irq.c  |  2 +-
>  arch/x86/kvm/x86.c  | 17 +++--
>  3 files changed, 9 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index d44858b69353..ddaf3e01a854 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1655,6 +1655,7 @@ int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
>  int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte);
>  int kvm_cpu_has_inject

[PATCH v3 11/14] usb: dwc2: Fix session request interrupt handler

2021-04-08 Thread Artur Petrosyan

According to the programming guide, in host mode the port
power must be turned on in the session request
interrupt handler.

Cc: 
Fixes: 21795c826a45 ("usb: dwc2: exit hibernation on session request")
Signed-off-by: Artur Petrosyan 
Acked-by: Minas Harutyunyan 
---
 drivers/usb/dwc2/core_intr.c | 8 
 1 file changed, 8 insertions(+)
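
The read-modify-write that the patch performs on HPRT0 can be sketched in userspace as follows. This is only an illustration under stated assumptions: the bit positions are illustrative, the `MODEL_` and `model_` names are hypothetical, and the masking step assumes that a helper such as dwc2_read_hprt0() strips the write-1-to-clear status bits so that writing the value back does not clear them by accident.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions; assumptions, not an authoritative layout. */
#define MODEL_HPRT0_CONNDET    (1u << 1)
#define MODEL_HPRT0_ENA        (1u << 2)
#define MODEL_HPRT0_ENACHG     (1u << 3)
#define MODEL_HPRT0_OVRCURRCHG (1u << 5)
#define MODEL_HPRT0_PWR        (1u << 12)

/* Bits assumed to be write-1-to-clear. */
#define MODEL_HPRT0_W1C (MODEL_HPRT0_CONNDET | MODEL_HPRT0_ENA | \
			 MODEL_HPRT0_ENACHG | MODEL_HPRT0_OVRCURRCHG)

/* Return the register value with the write-1-to-clear bits masked out. */
uint32_t model_read_hprt0(uint32_t raw)
{
	return raw & ~MODEL_HPRT0_W1C;
}

/* Compute the value to write back in order to turn the port power on. */
uint32_t model_hprt0_power_on(uint32_t raw)
{
	return model_read_hprt0(raw) | MODEL_HPRT0_PWR;
}

void model_hprt0_example(void)
{
	/* A pending connect-detect bit must not be written back as 1. */
	uint32_t next = model_hprt0_power_on(MODEL_HPRT0_CONNDET);

	assert(next & MODEL_HPRT0_PWR);
	assert(!(next & MODEL_HPRT0_CONNDET));
}
```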

diff --git a/drivers/usb/dwc2/core_intr.c b/drivers/usb/dwc2/core_intr.c
index 0a7f9330907f..8c0152b514be 100644
--- a/drivers/usb/dwc2/core_intr.c
+++ b/drivers/usb/dwc2/core_intr.c
@@ -307,6 +307,7 @@ static void dwc2_handle_conn_id_status_change_intr(struct dwc2_hsotg *hsotg)
 static void dwc2_handle_session_req_intr(struct dwc2_hsotg *hsotg)
 {
int ret;
+   u32 hprt0;
 
/* Clear interrupt */
dwc2_writel(hsotg, GINTSTS_SESSREQINT, GINTSTS);
@@ -328,6 +329,13 @@ static void dwc2_handle_session_req_intr(struct dwc2_hsotg *hsotg)
 * established
 */
dwc2_hsotg_disconnect(hsotg);
+   } else {
+   /* Turn on the port power bit. */
+   hprt0 = dwc2_read_hprt0(hsotg);
+   hprt0 |= HPRT0_PWR;
+   dwc2_writel(hsotg, hprt0, HPRT0);
+   /* Connect hcd after port power is set. */
+   dwc2_hcd_connect(hsotg);
}
 }
 
-- 
2.25.1

Re: [RFC PATCH 04/18] virt/mshv: request version ioctl

2021-04-07 Thread Wei Liu

On Wed, Apr 07, 2021 at 04:02:56PM +0200, Vitaly Kuznetsov wrote:
> Wei Liu  writes:
> 
> > On Wed, Apr 07, 2021 at 09:38:21AM +0200, Vitaly Kuznetsov wrote:
> >
> >> One more thought: it is probably a good idea to introduce selftests for
> >> /dev/mshv (similar to KVM's selftests in
> >> /tools/testing/selftests/kvm). Selftests don't really need a stable ABI
> >> as they live in the same linux.git and can be updated in the same patch
> >> series which changes /dev/mshv behavior. Selftests are very useful for
> >> checking there are no regressions, especially in the situation when
> >> there's no publicly available userspace for /dev/mshv.
> >
> > I think this can wait until we merge the first implementation in tree.
> > There are still a lot of moving parts. Our (currently limited) internal
> > test cases need more cleaning up before they are ready. I certainly
> > don't want to distract Nuno from getting the foundation right.
> >
> 
> I'm absolutely fine with this approach, selftests are a nice add-on, not
> a requirement for the initial implementation. Also, to make them more
> useful to mere mortals, a doc on how to run Linux as root Hyper-V
> partition would come handy)

Making this system easier for others to use and consume is on our radar.
Currently you need the Windows bootloader and a not-yet-released loader to
load the hypervisor. We're making progress in bringing in GRUB.

Needless to say there are technical and non-technical challenges for
this work, so don't expect it to happen very soon. :-)

Wei.

> 
> -- 
> Vitaly
>

Re: [RFC PATCH 04/18] virt/mshv: request version ioctl

2021-04-07 Thread Vitaly Kuznetsov

Wei Liu  writes:

> On Wed, Apr 07, 2021 at 09:38:21AM +0200, Vitaly Kuznetsov wrote:
>
>> One more thought: it is probably a good idea to introduce selftests for
>> /dev/mshv (similar to KVM's selftests in
>> /tools/testing/selftests/kvm). Selftests don't really need a stable ABI
>> as they live in the same linux.git and can be updated in the same patch
>> series which changes /dev/mshv behavior. Selftests are very useful for
>> checking there are no regressions, especially in the situation when
>> there's no publicly available userspace for /dev/mshv.
>
> I think this can wait until we merge the first implementation in tree.
> There are still a lot of moving parts. Our (currently limited) internal
> test cases need more cleaning up before they are ready. I certainly
> don't want to distract Nuno from getting the foundation right.
>

I'm absolutely fine with this approach, selftests are a nice add-on, not
a requirement for the initial implementation. Also, to make them more
useful to mere mortals, a doc on how to run Linux as root Hyper-V
partition would come handy)

-- 
Vitaly

Re: [RFC PATCH 04/18] virt/mshv: request version ioctl

2021-04-07 Thread Wei Liu

On Wed, Apr 07, 2021 at 09:38:21AM +0200, Vitaly Kuznetsov wrote:
> Nuno Das Neves  writes:
> 
> > On 3/5/2021 1:18 AM, Vitaly Kuznetsov wrote:
> >> Nuno Das Neves  writes:
> >> 
> >>> On 2/9/2021 5:11 AM, Vitaly Kuznetsov wrote:
> >>>> Nuno Das Neves  writes:
> >>>>
> >> ...
> >>>>> +
> >>>>> +3.1 MSHV_REQUEST_VERSION
> >>>>> +
> >>>>> +:Type: /dev/mshv ioctl
> >>>>> +:Parameters: pointer to a u32
> >>>>> +:Returns: 0 on success
> >>>>> +
> >>>>> +Before issuing any other ioctls, a MSHV_REQUEST_VERSION ioctl must be called to
> >>>>> +establish the interface version with the kernel module.
> >>>>> +
> >>>>> +The caller should pass the MSHV_VERSION as an argument.
> >>>>> +
> >>>>> +The kernel module will check which interface versions it supports and return 0
> >>>>> +if one of them matches.
> >>>>> +
> >>>>> +This /dev/mshv file descriptor will remain 'locked' to that version as long as
> >>>>> +it is open - this ioctl can only be called once per open.
> >>>>> +
> >>>>
> >>>> KVM used to have KVM_GET_API_VERSION too but this turned out to be not
> >>>> very convenient so we use capabilities (KVM_CHECK_EXTENSION/KVM_ENABLE_CAP)
> >>>> instead.
> >>>>
> >>>
> >>> The goal of MSHV_REQUEST_VERSION is to support changes to APIs in the 
> >>> core set.
> >>> When we add new features/ioctls beyond the core we can use an 
> >>> extension/capability
> >>> approach like KVM.
> >>>
> >> 
> >> Driver versions are a very bad idea from a distribution/stable kernel point
> >> of view, as they presume that the history is linear. It is not.
> >> 
> >> Imagine you have the following history upstream:
> >> 
> >> MSHV_REQUEST_VERSION = 1
> >> <100 commits with features/fixes>
> >> MSHV_REQUEST_VERSION = 2
> >> 
> >> MSHV_REQUEST_VERSION = 3
> >> 
> >> Now I'm a linux distribution / stable kernel maintainer. My kernel is at
> >> MSHV_REQUEST_VERSION = 1. Now I want to backport 1 feature from between
> >> VER=1 and VER=2 and another feature from between VER=2 and VER=3. My
> >> history now looks like
> >> 
> >> MSHV_REQUEST_VERSION = 1
> >> <5 commits from between VER=1 and VER=2>
> >>Which version should I declare here 
> >> <5 commits from between VER=2 and VER=3>
> >>Which version should I declare here 
> >> 
> >> If I keep VER=1 then userspace will think that I don't have any extra
> >> features added and just won't use them. If I change VER to 2/3, it'll
> >> think I have *all* features from between these versions.
> >> 
> >> The only reasonable way to manage this is to attach a "capability" to
> >> every ABI change and expose this capability *in the same commit which
> >> introduces the change to the ABI*. This way userspace will know exactly
> >> which ioctls are available and what their interfaces are.
> >> 
> >> Also, trying to define "core set" is hard but you don't really need
> >> to.
> >> 
> >
> > We've had some internal discussion on this.
> >
> > There is bound to be some iteration before this ABI is stable, since even 
> > the
> > underlying Microsoft hypervisor interfaces aren't stable just yet.
> >
> > It might make more sense to just have an IOCTL to check if the API is 
> > stable yet.
> > This would be analogous to checking if KVM_GET_API_VERSION returns 12.
> >
> > How does this sound as a proposal?
> > An MSHV_CHECK_EXTENSION ioctl to query extensions to the core /dev/mshv API.
> >
> > It takes a single argument, an integer named MSHV_CAP_* corresponding to
> > the extension to check the existence of.
> >
> > The ioctl will return 0 if the extension is unsupported, or a positive 
> > integer
> > if supported.
> >
> > We can initially include a capability called MSHV_CAP_CORE_API_STABLE.
> > If supported, the core APIs are stable.
> 
> This sounds reasonable, I'd suggest you reserve MSHV_CAP_CORE_API_STABLE
> right away but don't expose it yet so it's clear the API is not yet
> stable. Test userspace you have may always assume it's running with the
> latest kernel.
> 
> Also, please be clear about the fact that /dev/mshv doesn't
> provide a stable API yet so nobody builds an application on top of
> it.
> 

Very good discussion and suggestions. Thank you Vitaly.

> One more thought: it is probably a good idea to introduce selftests for
> /dev/mshv (similar to KVM's selftests in
> /tools/testing/selftests/kvm). Selftests don't really need a stable ABI
> as they live in the same linux.git and can be updated in the same patch
> series which changes /dev/mshv behavior. Selftests are very useful for
> checking there are no regressions, especially in the situation when
> there's no publicly available userspace for /dev/mshv.

I think this can wait until we merge the first implementation in tree.
There are still a lot of moving parts. Our (currently limited) internal
test cases need more cleaning up before they are ready. I certainly
don't want to distract Nuno from getting the foundation right.

Wei.

Re: [RFC PATCH 04/18] virt/mshv: request version ioctl

2021-04-07 Thread Vitaly Kuznetsov

Nuno Das Neves  writes:

> On 3/5/2021 1:18 AM, Vitaly Kuznetsov wrote:
>> Nuno Das Neves  writes:
>> 
>>> On 2/9/2021 5:11 AM, Vitaly Kuznetsov wrote:
>>>> Nuno Das Neves  writes:
>>>>
>> ...
>>>>> +
>>>>> +3.1 MSHV_REQUEST_VERSION
>>>>> +
>>>>> +:Type: /dev/mshv ioctl
>>>>> +:Parameters: pointer to a u32
>>>>> +:Returns: 0 on success
>>>>> +
>>>>> +Before issuing any other ioctls, a MSHV_REQUEST_VERSION ioctl must be called to
>>>>> +establish the interface version with the kernel module.
>>>>> +
>>>>> +The caller should pass the MSHV_VERSION as an argument.
>>>>> +
>>>>> +The kernel module will check which interface versions it supports and return 0
>>>>> +if one of them matches.
>>>>> +
>>>>> +This /dev/mshv file descriptor will remain 'locked' to that version as long as
>>>>> +it is open - this ioctl can only be called once per open.
>>>>> +
>>>>
>>>> KVM used to have KVM_GET_API_VERSION too but this turned out to be not
>>>> very convenient so we use capabilities (KVM_CHECK_EXTENSION/KVM_ENABLE_CAP)
>>>> instead.
>>>>
>>>
>>> The goal of MSHV_REQUEST_VERSION is to support changes to APIs in the core 
>>> set.
>>> When we add new features/ioctls beyond the core we can use an 
>>> extension/capability
>>> approach like KVM.
>>>
>> 
>> Driver versions are a very bad idea from a distribution/stable kernel point
>> of view, as they presume that the history is linear. It is not.
>> 
>> Imagine you have the following history upstream:
>> 
>> MSHV_REQUEST_VERSION = 1
>> <100 commits with features/fixes>
>> MSHV_REQUEST_VERSION = 2
>> 
>> MSHV_REQUEST_VERSION = 3
>> 
>> Now I'm a linux distribution / stable kernel maintainer. My kernel is at
>> MSHV_REQUEST_VERSION = 1. Now I want to backport 1 feature from between
>> VER=1 and VER=2 and another feature from between VER=2 and VER=3. My
>> history now looks like
>> 
>> MSHV_REQUEST_VERSION = 1
>> <5 commits from between VER=1 and VER=2>
>>Which version should I declare here 
>> <5 commits from between VER=2 and VER=3>
>>Which version should I declare here 
>> 
>> If I keep VER=1 then userspace will think that I don't have any extra
>> features added and just won't use them. If I change VER to 2/3, it'll
>> think I have *all* features from between these versions.
>> 
>> The only reasonable way to manage this is to attach a "capability" to
>> every ABI change and expose this capability *in the same commit which
>> introduces the change to the ABI*. This way userspace will know exactly
>> which ioctls are available and what their interfaces are.
>> 
>> Also, trying to define "core set" is hard but you don't really need
>> to.
>> 
>
> We've had some internal discussion on this.
>
> There is bound to be some iteration before this ABI is stable, since even the
> underlying Microsoft hypervisor interfaces aren't stable just yet.
>
> It might make more sense to just have an IOCTL to check if the API is stable 
> yet.
> This would be analogous to checking if KVM_GET_API_VERSION returns 12.
>
> How does this sound as a proposal?
> An MSHV_CHECK_EXTENSION ioctl to query extensions to the core /dev/mshv API.
>
> It takes a single argument, an integer named MSHV_CAP_* corresponding to
> the extension to check the existence of.
>
> The ioctl will return 0 if the extension is unsupported, or a positive integer
> if supported.
>
> We can initially include a capability called MSHV_CAP_CORE_API_STABLE.
> If supported, the core APIs are stable.

This sounds reasonable, I'd suggest you reserve MSHV_CAP_CORE_API_STABLE
right away but don't expose it yet so it's clear the API is not yet
stable. Test userspace you have may always assume it's running with the
latest kernel.

Also, please be clear about the fact that /dev/mshv doesn't
provide a stable API yet so nobody builds an application on top of
it.

One more thought: it is probably a good idea to introduce selftests for
/dev/mshv (similar to KVM's selftests in
/tools/testing/selftests/kvm). Selftests don't really need a stable ABI
as they live in the same linux.git and can be updated in the same patch
series which changes /dev/mshv behavior. Selftests are very useful for
checking there are no regressions, especially in the situation when
there's no publicly available userspace for /dev/mshv.

-- 
Vitaly

Re: [RFC PATCH 04/18] virt/mshv: request version ioctl

2021-04-06 Thread Nuno Das Neves



On 3/5/2021 1:18 AM, Vitaly Kuznetsov wrote:
> Nuno Das Neves  writes:
> 
>> On 2/9/2021 5:11 AM, Vitaly Kuznetsov wrote:
>>> Nuno Das Neves  writes:
>>>
> ...
>>>> +
>>>> +3.1 MSHV_REQUEST_VERSION
>>>> +
>>>> +:Type: /dev/mshv ioctl
>>>> +:Parameters: pointer to a u32
>>>> +:Returns: 0 on success
>>>> +
>>>> +Before issuing any other ioctls, a MSHV_REQUEST_VERSION ioctl must be called to
>>>> +establish the interface version with the kernel module.
>>>> +
>>>> +The caller should pass the MSHV_VERSION as an argument.
>>>> +
>>>> +The kernel module will check which interface versions it supports and return 0
>>>> +if one of them matches.
>>>> +
>>>> +This /dev/mshv file descriptor will remain 'locked' to that version as long as
>>>> +it is open - this ioctl can only be called once per open.
>>>> +
>>>
>>> KVM used to have KVM_GET_API_VERSION too but this turned out to be not
>>> very convenient so we use capabilities (KVM_CHECK_EXTENSION/KVM_ENABLE_CAP)
>>> instead.
>>>
>>
>> The goal of MSHV_REQUEST_VERSION is to support changes to APIs in the core 
>> set.
>> When we add new features/ioctls beyond the core we can use an 
>> extension/capability
>> approach like KVM.
>>
> 
> Driver versions are a very bad idea from a distribution/stable kernel point
> of view, as they presume that the history is linear. It is not.
> 
> Imagine you have the following history upstream:
> 
> MSHV_REQUEST_VERSION = 1
> <100 commits with features/fixes>
> MSHV_REQUEST_VERSION = 2
> 
> MSHV_REQUEST_VERSION = 3
> 
> Now I'm a linux distribution / stable kernel maintainer. My kernel is at
> MSHV_REQUEST_VERSION = 1. Now I want to backport 1 feature from between
> VER=1 and VER=2 and another feature from between VER=2 and VER=3. My
> history now looks like
> 
> MSHV_REQUEST_VERSION = 1
> <5 commits from between VER=1 and VER=2>
>Which version should I declare here 
> <5 commits from between VER=2 and VER=3>
>Which version should I declare here 
> 
> If I keep VER=1 then userspace will think that I don't have any extra
> features added and just won't use them. If I change VER to 2/3, it'll
> think I have *all* features from between these versions.
> 
> The only reasonable way to manage this is to attach a "capability" to
> every ABI change and expose this capability *in the same commit which
> introduces the change to the ABI*. This way userspace will know exactly
> which ioctls are available and what their interfaces are.
> 
> Also, trying to define "core set" is hard but you don't really need
> to.
> 

We've had some internal discussion on this.

There is bound to be some iteration before this ABI is stable, since even the
underlying Microsoft hypervisor interfaces aren't stable just yet.

It might make more sense to just have an IOCTL to check if the API is stable 
yet.
This would be analogous to checking if KVM_GET_API_VERSION returns 12.

How does this sound as a proposal?
An MSHV_CHECK_EXTENSION ioctl to query extensions to the core /dev/mshv API.

It takes a single argument, an integer named MSHV_CAP_* corresponding to
the extension to check the existence of.

The ioctl will return 0 if the extension is unsupported, or a positive integer
if supported.

We can initially include a capability called MSHV_CAP_CORE_API_STABLE.
If supported, the core APIs are stable.
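
To make the proposed return convention concrete, here is a minimal userspace model of such a capability query. It is a sketch under assumptions, not the /dev/mshv interface: the capability numbers, the `model_kernel` type and the `MODEL_CAP_FEATURE_*` names are hypothetical; only the name MSHV_CAP_CORE_API_STABLE and the "0 if unsupported, positive if supported" rule come from the proposal above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability numbers, chosen only for this illustration. */
enum model_cap {
	MODEL_CAP_CORE_API_STABLE = 1, /* reserved; not exposed while the API moves */
	MODEL_CAP_FEATURE_A = 2,       /* imagined feature added upstream first */
	MODEL_CAP_FEATURE_B = 3,       /* imagined feature added upstream later */
};

/* A kernel build is modelled as the list of capabilities it exposes. */
struct model_kernel {
	const int *caps;
	size_t ncaps;
};

/* Returns 0 if the extension is unsupported, a positive integer otherwise. */
int model_check_extension(const struct model_kernel *k, int cap)
{
	for (size_t i = 0; i < k->ncaps; i++)
		if (k->caps[i] == cap)
			return 1;
	return 0;
}

/* A distribution kernel that backported only the later feature. */
void model_mshv_example(void)
{
	const int distro_caps[] = { MODEL_CAP_FEATURE_B };
	struct model_kernel distro = { distro_caps, 1 };

	assert(model_check_extension(&distro, MODEL_CAP_FEATURE_B) > 0);
	assert(model_check_extension(&distro, MODEL_CAP_FEATURE_A) == 0);
	assert(model_check_extension(&distro, MODEL_CAP_CORE_API_STABLE) == 0);
}
```

The example kernel exposes a later feature without an earlier one, which is the non-linear backport case that a single version number cannot describe.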

[PATCH 04/11] staging: rtl8188eu: use actual request type as parameter

2021-04-06 Thread Martin Kaiser

At the moment, usbctrl_vendorreq's requesttype parameter must be set to 1
for reading and 0 for writing. It's then converted to the actual
bmRequestType for the USB control request. We can simplify the code and
avoid this conversion if the caller passes the actual bmRequestType.

We already have defines for the read and write request types. Move them to
usb_ops_linux.c, they're used only inside this file. Replace the numeric
values with USB constants to make their meaning clearer.

Signed-off-by: Martin Kaiser 
---
 .../staging/rtl8188eu/include/usb_ops_linux.h |  3 --
 .../staging/rtl8188eu/os_dep/usb_ops_linux.c  | 52 +++
 2 files changed, 20 insertions(+), 35 deletions(-)
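
As a worked check of the constants, the sketch below recomputes the two request types from the standard bmRequestType fields and confirms that they equal the numeric values being removed (0xC0 and 0x40). The `MODEL_` names are hypothetical and are defined locally so that the example builds in userspace without kernel headers; the field values are the standard USB ones.

```c
#include <assert.h>
#include <stdint.h>

/* bmRequestType fields: direction (bit 7), type (bits 6..5), recipient (bits 4..0). */
#define MODEL_USB_DIR_OUT      0x00
#define MODEL_USB_DIR_IN       0x80
#define MODEL_USB_TYPE_VENDOR  (0x02 << 5)
#define MODEL_USB_RECIP_DEVICE 0x00

#define MODEL_VENQT_READ \
	(MODEL_USB_DIR_IN | MODEL_USB_TYPE_VENDOR | MODEL_USB_RECIP_DEVICE)
#define MODEL_VENQT_WRITE \
	(MODEL_USB_DIR_OUT | MODEL_USB_TYPE_VENDOR | MODEL_USB_RECIP_DEVICE)

/* Returns 1 for a vendor read, 0 for a vendor write, -1 for anything else. */
int model_reqtype_direction(uint8_t reqtype)
{
	if (reqtype == MODEL_VENQT_READ)
		return 1;
	if (reqtype == MODEL_VENQT_WRITE)
		return 0;
	return -1;
}

void model_reqtype_example(void)
{
	assert(MODEL_VENQT_READ == 0xC0);
	assert(MODEL_VENQT_WRITE == 0x40);
	/* The old "1 means read" convention is no longer a valid request type. */
	assert(model_reqtype_direction(0x01) == -1);
}
```

A caller that still passed the old 0/1 flag would hit the invalid branch, which corresponds to the -EINVAL path the patch adds.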

diff --git a/drivers/staging/rtl8188eu/include/usb_ops_linux.h b/drivers/staging/rtl8188eu/include/usb_ops_linux.h
index 70d729742839..4e0e48cb5c8e 100644
--- a/drivers/staging/rtl8188eu/include/usb_ops_linux.h
+++ b/drivers/staging/rtl8188eu/include/usb_ops_linux.h
@@ -16,9 +16,6 @@
 
 #define RTW_USB_BULKOUT_TIME	5000	/* ms */
 
-#define REALTEK_USB_VENQT_READ		0xC0
-#define REALTEK_USB_VENQT_WRITE		0x40
-
 #define ALIGNMENT_UNIT			16
 #define MAX_VENDOR_REQ_CMD_SIZE		254	/* 8188cu SIE Support */
 #define MAX_USB_IO_CTL_SIZE		(MAX_VENDOR_REQ_CMD_SIZE + ALIGNMENT_UNIT)
diff --git a/drivers/staging/rtl8188eu/os_dep/usb_ops_linux.c b/drivers/staging/rtl8188eu/os_dep/usb_ops_linux.c
index b760636f03d3..205a15dd67a5 100644
--- a/drivers/staging/rtl8188eu/os_dep/usb_ops_linux.c
+++ b/drivers/staging/rtl8188eu/os_dep/usb_ops_linux.c
@@ -10,6 +10,9 @@
 #include 
 #include 
 
+#define REALTEK_USB_VENQT_READ		(USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_DEVICE)
+#define REALTEK_USB_VENQT_WRITE	(USB_DIR_OUT | USB_TYPE_VENDOR | USB_RECIP_DEVICE)
+
 #define REALTEK_USB_VENQT_CMD_REQ  0x05
 #define REALTEK_USB_VENQT_CMD_IDX  0x00
 
@@ -202,13 +205,12 @@ unsigned int ffaddr2pipehdl(struct dvobj_priv *pdvobj, 
u32 addr)
 }
 
 static int
-usbctrl_vendorreq(struct adapter *adapt, u16 value, void *pdata, u16 len, u8 requesttype)
+usbctrl_vendorreq(struct adapter *adapt, u16 value, void *pdata, u16 len, u8 reqtype)
 {
struct dvobj_priv *dvobjpriv = adapter_to_dvobj(adapt);
struct usb_device *udev = dvobjpriv->pusbdev;
unsigned int pipe;
int status = 0;
-   u8 reqtype;
u8 *pIo_buf;
int vendorreq_times = 0;
 
@@ -242,13 +244,14 @@ usbctrl_vendorreq(struct adapter *adapt, u16 value, void 
*pdata, u16 len, u8 req
while (++vendorreq_times <= MAX_USBCTRL_VENDORREQ_TIMES) {
memset(pIo_buf, 0, len);
 
-   if (requesttype == 0x01) {
+   if (reqtype == REALTEK_USB_VENQT_READ) {
pipe = usb_rcvctrlpipe(udev, 0);/* read_in */
-   reqtype =  REALTEK_USB_VENQT_READ;
-   } else {
+   } else if (reqtype == REALTEK_USB_VENQT_WRITE) {
pipe = usb_sndctrlpipe(udev, 0);/* write_out */
-   reqtype =  REALTEK_USB_VENQT_WRITE;
memcpy(pIo_buf, pdata, len);
+   } else {
+   status = -EINVAL;
+   goto free_buf;
}
 
status = usb_control_msg(udev, pipe, REALTEK_USB_VENQT_CMD_REQ,
@@ -256,11 +259,11 @@ usbctrl_vendorreq(struct adapter *adapt, u16 value, void 
*pdata, u16 len, u8 req
 pIo_buf, len, 
RTW_USB_CONTROL_MSG_TIMEOUT);
 
if (status == len) {   /*  Success this control transfer. */
-   if (requesttype == 0x01)
+   if (reqtype == REALTEK_USB_VENQT_READ)
memcpy(pdata, pIo_buf,  len);
} else { /*  error cases */
DBG_88E("reg 0x%x, usb %s %u fail, status:%d 
value=0x%x, vendorreq_times:%d\n",
-   value, (requesttype == 0x01) ? "read" : "write",
+   value, (reqtype == REALTEK_USB_VENQT_READ) ? 
"read" : "write",
len, status, *(u32 *)pdata, vendorreq_times);
 
if (status < 0) {
@@ -270,7 +273,7 @@ usbctrl_vendorreq(struct adapter *adapt, u16 value, void 
*pdata, u16 len, u8 req

adapt->HalData->srestpriv.wifi_error_status = USB_VEN_REQ_CMD_FAIL;
} else { /*  status != len && status >= 0 */
if (status > 0) {
-   if (requesttype == 0x01) {
+   if (reqtype == REALTEK_USB_VENQT_READ) {
/*  For Control read transfer, 
we have to copy the read data from pIo_buf to pdata. */

[PATCH V2 07/18] i2c: imx-lpi2c: manage irq resource request/release in runtime pm

2021-04-06 Thread Clark Wang

From: Fugang Duan 

Manage irq resource request/release in runtime pm to save irq domain's
power.

Signed-off-by: Frank Li 
Signed-off-by: Fugang Duan 
Signed-off-by: Clark Wang 
Reviewed-by: Frank Li 
---
V2 changes:
 - Change to use request_irq/free_irq.
---
 drivers/i2c/busses/i2c-imx-lpi2c.c | 30 ++
 1 file changed, 18 insertions(+), 12 deletions(-)
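
The ordering this change introduces (clocks first and then the irq on resume, the irq first and then the clocks on suspend) can be modelled with a small state machine. This is a hedged userspace sketch, not the driver: the `model_dev` type, its fields and the failure knob are hypothetical. Note also that releasing the clocks again when the irq request fails is an assumption of this sketch; the patch as posted simply returns the error.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state; each flag stands in for a real resource. */
struct model_dev {
	bool clocks_on;        /* stands in for prepared/enabled clocks */
	bool irq_requested;    /* stands in for a successful request_irq() */
	bool fail_irq_request; /* test knob: make the irq request fail */
};

int model_runtime_resume(struct model_dev *d)
{
	d->clocks_on = true;

	if (d->fail_irq_request) {
		/* Assumption of this sketch: undo the clocks on failure. */
		d->clocks_on = false;
		return -1;
	}

	d->irq_requested = true;
	return 0;
}

int model_runtime_suspend(struct model_dev *d)
{
	/* Release the irq before the clocks, mirroring the resume order. */
	d->irq_requested = false;
	d->clocks_on = false;
	return 0;
}

void model_lpi2c_example(void)
{
	struct model_dev d = { .fail_irq_request = false };

	assert(model_runtime_resume(&d) == 0);
	assert(d.clocks_on && d.irq_requested);
	assert(model_runtime_suspend(&d) == 0);
	assert(!d.clocks_on && !d.irq_requested);
}
```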

diff --git a/drivers/i2c/busses/i2c-imx-lpi2c.c b/drivers/i2c/busses/i2c-imx-lpi2c.c
index 89b7b0795f51..333209ba81c1 100644
--- a/drivers/i2c/busses/i2c-imx-lpi2c.c
+++ b/drivers/i2c/busses/i2c-imx-lpi2c.c
@@ -94,6 +94,7 @@ enum lpi2c_imx_pincfg {
 
 struct lpi2c_imx_struct {
 	struct i2c_adapter	adapter;
+	int			irq;
 	struct clk		*clk_per;
 	struct clk		*clk_ipg;
 	void __iomem		*base;
@@ -571,7 +572,7 @@ static int lpi2c_imx_probe(struct platform_device *pdev)
 {
struct lpi2c_imx_struct *lpi2c_imx;
unsigned int temp;
-   int irq, ret;
+   int ret;
 
lpi2c_imx = devm_kzalloc(&pdev->dev, sizeof(*lpi2c_imx), GFP_KERNEL);
if (!lpi2c_imx)
@@ -581,9 +582,9 @@ static int lpi2c_imx_probe(struct platform_device *pdev)
if (IS_ERR(lpi2c_imx->base))
return PTR_ERR(lpi2c_imx->base);
 
-   irq = platform_get_irq(pdev, 0);
-   if (irq < 0)
-   return irq;
+   lpi2c_imx->irq = platform_get_irq(pdev, 0);
+   if (lpi2c_imx->irq < 0)
+   return lpi2c_imx->irq;
 
lpi2c_imx->adapter.owner= THIS_MODULE;
lpi2c_imx->adapter.algo = &lpi2c_imx_algo;
@@ -609,13 +610,6 @@ static int lpi2c_imx_probe(struct platform_device *pdev)
if (ret)
lpi2c_imx->bitrate = I2C_MAX_STANDARD_MODE_FREQ;
 
-   ret = devm_request_irq(&pdev->dev, irq, lpi2c_imx_isr, 0,
-  pdev->name, lpi2c_imx);
-   if (ret) {
-   dev_err(&pdev->dev, "can't claim irq %d\n", irq);
-   return ret;
-   }
-
i2c_set_adapdata(&lpi2c_imx->adapter, lpi2c_imx);
platform_set_drvdata(pdev, lpi2c_imx);
 
@@ -668,6 +662,7 @@ static int __maybe_unused lpi2c_runtime_suspend(struct device *dev)
 {
struct lpi2c_imx_struct *lpi2c_imx = dev_get_drvdata(dev);
 
+   free_irq(lpi2c_imx->irq, lpi2c_imx);
lpi2c_imx_clocks_unprepare(lpi2c_imx);
pinctrl_pm_select_sleep_state(dev);
 
@@ -677,10 +672,21 @@ static int __maybe_unused lpi2c_runtime_suspend(struct 
device *dev)
 static int __maybe_unused lpi2c_runtime_resume(struct device *dev)
 {
struct lpi2c_imx_struct *lpi2c_imx = dev_get_drvdata(dev);
+   int ret = 0;
 
pinctrl_pm_select_default_state(dev);
+   ret = lpi2c_imx_clocks_prepare(lpi2c_imx);
+   if (ret)
+   return ret;
 
-   return lpi2c_imx_clocks_prepare(lpi2c_imx);
+   ret = request_irq(lpi2c_imx->irq, lpi2c_imx_isr, 0,
+  dev_name(dev), lpi2c_imx);
+   if (ret) {
+   dev_err(dev, "can't claim irq %d\n", lpi2c_imx->irq);
+   return ret;
+   }
+
+   return ret;
 }
 
 static const struct dev_pm_ops lpi2c_pm_ops = {
-- 
2.25.1

[PATCH 2/2] Bluetooth: Do not set cur_adv_instance in adv param MGMT request

2021-04-05 Thread Daniel Winkler

We set hdev->cur_adv_instance in the adv param MGMT request to allow the
callback to the hci param request to set the tx power to the correct
instance. Now that the callbacks use the advertising handle from the hci
request (as they should), this workaround is no longer necessary.

Furthermore, this change resolves a race condition that is more
prevalent when using the extended advertising MGMT calls - if
hdev->cur_adv_instance is set in the params request, then when the data
request is called, we believe our new instance is already active. This
treats it as an update and immediately schedules the instance with the
controller, which has a potential race with the software rotation adv
update. By not setting hdev->cur_adv_instance too early, the new
instance is queued as it should be, to be used when the rotation comes
around again.

This change is tested on harrison peak to confirm that it resolves the
race condition on registration, and that there is no regression in
single- and multi-advertising automated tests.

Reviewed-by: Miao-chen Chou 
Signed-off-by: Daniel Winkler 
---

 net/bluetooth/mgmt.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c
index 09e099c419f251..59f8016c486626 100644
--- a/net/bluetooth/mgmt.c
+++ b/net/bluetooth/mgmt.c
@@ -7979,7 +7979,6 @@ static int add_ext_adv_params(struct sock *sk, struct 
hci_dev *hdev,
goto unlock;
}
 
-   hdev->cur_adv_instance = cp->instance;
    /* Submit request for advertising params if ext adv available */
if (ext_adv_capable(hdev)) {
hci_req_init(&req, hdev);
-- 
2.31.0.208.g409f899ff0-goog

[PATCH 5.11 043/152] mptcp: init mptcp request socket earlier

2021-04-05 Thread Greg Kroah-Hartman

From: Paolo Abeni 

[ Upstream commit d8b59efa64060d17b7b61f97d891de2d9f2bd9f0 ]

The mptcp subflow route_req() callback performs the subflow
req initialization after the route_req() check. If the latter
fails, mptcp-specific bits of the current request sockets
are left uninitialized.

The above causes bad things at req socket disposal time, when
the mptcp resources are cleared.

This change addresses the issue by splitting subflow_init_req()
into the actual initialization and the mptcp-specific checks.
The initialization is moved before any possibly failing check.
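
The ordering described above can be modelled in userspace. This is only a minimal sketch, not the mptcp code: every `sketch_*` name is hypothetical, `token_initialized` stands in for whatever mptcp_token_init_request() sets up, and the "route" and "MD5SIG" failures are reduced to integer flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the mptcp-specific part of a request socket. */
struct sketch_subflow_req {
	int mp_capable;
	int mp_join;
	void *msk;
	int token_initialized;
};

/* Cannot fail: it only puts every field into a known state. */
static void sketch_init_req(struct sketch_subflow_req *req)
{
	req->mp_capable = 0;
	req->mp_join = 0;
	req->msk = NULL;
	req->token_initialized = 1;
}

/* May fail: models the MD5SIG check that was split out of the init. */
static int sketch_check_req(const struct sketch_subflow_req *req,
			    int md5sig_enabled)
{
	(void)req;
	return md5sig_enabled ? -1 : 0;
}

/* Models route_req(): initialize first, then run the steps that can fail. */
static int sketch_route_req(struct sketch_subflow_req *req, int route_ok,
			    int md5sig_enabled)
{
	sketch_init_req(req);
	if (!route_ok)
		return -1;
	return sketch_check_req(req, md5sig_enabled);
}

/* Disposal is only safe if the init ran, whatever failed afterwards. */
static int sketch_dispose_is_safe(const struct sketch_subflow_req *req)
{
	return req->token_initialized == 1 && req->msk == NULL;
}
```

With this ordering, a request whose route lookup fails still reaches disposal with initialized fields, which is the property the change is after.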

Reported-by: Christoph Paasch 
Fixes: 7ea851d19b23 ("tcp: merge 'init_req' and 'route_req' functions")
Signed-off-by: Paolo Abeni 
Signed-off-by: Mat Martineau 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 net/mptcp/subflow.c | 40 
 1 file changed, 16 insertions(+), 24 deletions(-)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 6c0205816a5d..f97f29df4505 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -92,7 +92,7 @@ static struct mptcp_sock *subflow_token_join_request(struct 
request_sock *req,
return msk;
 }
 
-static int __subflow_init_req(struct request_sock *req, const struct sock 
*sk_listener)
+static void subflow_init_req(struct request_sock *req, const struct sock 
*sk_listener)
 {
struct mptcp_subflow_request_sock *subflow_req = mptcp_subflow_rsk(req);
 
@@ -100,16 +100,6 @@ static int __subflow_init_req(struct request_sock *req, 
const struct sock *sk_li
subflow_req->mp_join = 0;
subflow_req->msk = NULL;
mptcp_token_init_request(req);
-
-#ifdef CONFIG_TCP_MD5SIG
-   /* no MPTCP if MD5SIG is enabled on this socket or we may run out of
-* TCP option space.
-*/
-   if (rcu_access_pointer(tcp_sk(sk_listener)->md5sig_info))
-   return -EINVAL;
-#endif
-
-   return 0;
 }
 
 /* Init mptcp request socket.
@@ -117,20 +107,23 @@ static int __subflow_init_req(struct request_sock *req, 
const struct sock *sk_li
  * Returns an error code if a JOIN has failed and a TCP reset
  * should be sent.
  */
-static int subflow_init_req(struct request_sock *req,
-   const struct sock *sk_listener,
-   struct sk_buff *skb)
+static int subflow_check_req(struct request_sock *req,
+const struct sock *sk_listener,
+struct sk_buff *skb)
 {
struct mptcp_subflow_context *listener = mptcp_subflow_ctx(sk_listener);
struct mptcp_subflow_request_sock *subflow_req = mptcp_subflow_rsk(req);
struct mptcp_options_received mp_opt;
-   int ret;
 
pr_debug("subflow_req=%p, listener=%p", subflow_req, listener);
 
-   ret = __subflow_init_req(req, sk_listener);
-   if (ret)
-   return 0;
+#ifdef CONFIG_TCP_MD5SIG
+   /* no MPTCP if MD5SIG is enabled on this socket or we may run out of
+* TCP option space.
+*/
+   if (rcu_access_pointer(tcp_sk(sk_listener)->md5sig_info))
+   return -EINVAL;
+#endif
 
mptcp_get_options(skb, &mp_opt);
 
@@ -205,10 +198,7 @@ int mptcp_subflow_init_cookie_req(struct request_sock *req,
struct mptcp_options_received mp_opt;
int err;
 
-   err = __subflow_init_req(req, sk_listener);
-   if (err)
-   return err;
-
+   subflow_init_req(req, sk_listener);
mptcp_get_options(skb, &mp_opt);
 
if (mp_opt.mp_capable && mp_opt.mp_join)
@@ -248,12 +238,13 @@ static struct dst_entry *subflow_v4_route_req(const 
struct sock *sk,
int err;
 
tcp_rsk(req)->is_mptcp = 1;
+   subflow_init_req(req, sk);
 
dst = tcp_request_sock_ipv4_ops.route_req(sk, skb, fl, req);
if (!dst)
return NULL;
 
-   err = subflow_init_req(req, sk, skb);
+   err = subflow_check_req(req, sk, skb);
if (err == 0)
return dst;
 
@@ -273,12 +264,13 @@ static struct dst_entry *subflow_v6_route_req(const 
struct sock *sk,
int err;
 
tcp_rsk(req)->is_mptcp = 1;
+   subflow_init_req(req, sk);
 
dst = tcp_request_sock_ipv6_ops.route_req(sk, skb, fl, req);
if (!dst)
return NULL;
 
-   err = subflow_init_req(req, sk, skb);
+   err = subflow_check_req(req, sk, skb);
if (err == 0)
return dst;
 
-- 
2.30.1

Re: [PATCH v5 1/2] scsi: ufs: Fix task management request completion timeout

2021-04-01 Thread Bart Van Assche

On 4/1/21 12:39 AM, Can Guo wrote:
> ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn = ufshcd_compl_tm()),
> but since blk_mq_tagset_busy_iter() only iterates over all reserved tags
> and requests which are not in IDLE state, ufshcd_compl_tm() never gets a
> chance to run. Thus, TMR always ends up with completion timeout. Fix it by
> calling blk_mq_start_request() in  __ufshcd_issue_tm_cmd().

Reviewed-by: Bart Van Assche

Re: [syzbot] BUG: unable to handle kernel paging request in bpf_trace_run2

2021-04-01 Thread syzbot

syzbot suspects this issue was fixed by commit:

commit befe6d946551d65cddbd32b9cb0170b0249fd5ed
Author: Steven Rostedt (VMware) 
Date:   Wed Nov 18 14:34:05 2020 +

tracepoint: Do not fail unregistering a probe due to memory failure

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=14f0260ed0
start commit:   12450081 libbpf: Fix native endian assumption when parsing..
git tree:   bpf
kernel config:  https://syzkaller.appspot.com/x/.config?x=5ac0d21536db480b
dashboard link: https://syzkaller.appspot.com/bug?extid=cc36fd07553c0512f5f7
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1365d2c390
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16d5f08d90

If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: tracepoint: Do not fail unregistering a probe due to memory failure

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Re: [PATCH v2] nvme-tcp: Check if request has started before processing it

2021-04-01 Thread Sagi Grimberg





>> request tag can't be zero? I forget...
>
> Of course it can.  But the reserved tags are before the normal tags,
> so 0 would be a reserved tag for nvme.

Right.

[PATCH v5 2/2] scsi: ufs: Fix wrong Task Tag used in task management request UPIUs

2021-04-01 Thread Can Guo

In __ufshcd_issue_tm_cmd(), it is not right to use hba->nutrs + req->tag as
the Task Tag in one TMR UPIU. Directly use req->tag as the Task Tag.

Fixes: e293313262d3 ("scsi: ufs: Fix broken task management command 
implementation")

Reviewed-by: Bart Van Assche 
Signed-off-by: Can Guo 
---
 drivers/scsi/ufs/ufshcd.c | 30 +-
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index d4f8cb2..ce5f3fea 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6446,38 +6446,34 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
DECLARE_COMPLETION_ONSTACK(wait);
    struct request *req;
unsigned long flags;
-   int free_slot, task_tag, err;
+   int task_tag, err;
 
/*
-* Get free slot, sleep if slots are unavailable.
-* Even though we use wait_event() which sleeps indefinitely,
-* the maximum wait time is bounded by %TM_CMD_TIMEOUT.
+* blk_get_request() is used here only to get a free tag.
 */
req = blk_get_request(q, REQ_OP_DRV_OUT, 0);
if (IS_ERR(req))
return PTR_ERR(req);
 
req->end_io_data = &wait;
-   free_slot = req->tag;
-   WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);
ufshcd_hold(hba, false);
 
spin_lock_irqsave(host->host_lock, flags);
-   task_tag = hba->nutrs + free_slot;
blk_mq_start_request(req);
 
+   task_tag = req->tag;
treq->req_header.dword_0 |= cpu_to_be32(task_tag);
 
-   memcpy(hba->utmrdl_base_addr + free_slot, treq, sizeof(*treq));
-   ufshcd_vops_setup_task_mgmt(hba, free_slot, tm_function);
+   memcpy(hba->utmrdl_base_addr + task_tag, treq, sizeof(*treq));
+   ufshcd_vops_setup_task_mgmt(hba, task_tag, tm_function);
 
/* send command to the controller */
-   __set_bit(free_slot, &hba->outstanding_tasks);
+   __set_bit(task_tag, &hba->outstanding_tasks);
 
/* Make sure descriptors are ready before ringing the task doorbell */
wmb();
 
-   ufshcd_writel(hba, 1 << free_slot, REG_UTP_TASK_REQ_DOOR_BELL);
+   ufshcd_writel(hba, 1 << task_tag, REG_UTP_TASK_REQ_DOOR_BELL);
/* Make sure that doorbell is committed immediately */
wmb();
 
@@ -6497,24 +6493,24 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
__func__, tm_function);
-   if (ufshcd_clear_tm_cmd(hba, free_slot))
-   dev_WARN(hba->dev, "%s: unable clear tm cmd (slot %d) 
after timeout\n",
-   __func__, free_slot);
+   if (ufshcd_clear_tm_cmd(hba, task_tag))
+   dev_WARN(hba->dev, "%s: unable to clear tm cmd (slot 
%d) after timeout\n",
+   __func__, task_tag);
err = -ETIMEDOUT;
} else {
err = 0;
-   memcpy(treq, hba->utmrdl_base_addr + free_slot, sizeof(*treq));
+   memcpy(treq, hba->utmrdl_base_addr + task_tag, sizeof(*treq));
 
ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_COMP);
}
 
spin_lock_irqsave(hba->host->host_lock, flags);
-   __clear_bit(free_slot, &hba->outstanding_tasks);
+   __clear_bit(task_tag, &hba->outstanding_tasks);
spin_unlock_irqrestore(hba->host->host_lock, flags);
 
+   ufshcd_release(hba);
blk_put_request(req);
 
-   ufshcd_release(hba);
return err;
 }
 
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.

[PATCH v5 1/2] scsi: ufs: Fix task management request completion timeout

2021-04-01 Thread Can Guo

ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn = ufshcd_compl_tm()),
but since blk_mq_tagset_busy_iter() only iterates over all reserved tags
and requests which are not in IDLE state, ufshcd_compl_tm() never gets a
chance to run. Thus, TMR always ends up with completion timeout. Fix it by
calling blk_mq_start_request() in  __ufshcd_issue_tm_cmd().
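
The effect described above can be shown with a small userspace model. It is a sketch under assumptions, not the block layer: the `sketch_*` names are hypothetical, reserved tags are ignored, and the iterator simply skips requests that are still idle.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_state { SKETCH_IDLE, SKETCH_IN_FLIGHT };

/* Hypothetical stand-in for a request with a completion flag. */
struct sketch_rq {
	enum sketch_state state;
	int completed;
};

typedef void (*sketch_iter_fn)(struct sketch_rq *rq, void *priv);

/* Visits only requests that have been started; returns how many it saw. */
static unsigned int sketch_busy_iter(struct sketch_rq *rqs, size_t nr,
				     sketch_iter_fn fn, void *priv)
{
	unsigned int visited = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (rqs[i].state == SKETCH_IDLE)
			continue;	/* never handed to fn */
		fn(&rqs[i], priv);
		visited++;
	}
	return visited;
}

/* Models the one effect of starting a request that matters here. */
static void sketch_start_request(struct sketch_rq *rq)
{
	rq->state = SKETCH_IN_FLIGHT;
}

/* Models the completion callback passed to the iterator. */
static void sketch_compl_tm(struct sketch_rq *rq, void *priv)
{
	(void)priv;
	rq->completed = 1;
}
```

In this model a request that was allocated but never started is invisible to the iterator, so its completion callback never runs and the waiter can only time out.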

Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and 
free TMFs")

Signed-off-by: Can Guo 
---
 drivers/scsi/ufs/ufshcd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index b49555fa..d4f8cb2 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6464,6 +6464,7 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 
spin_lock_irqsave(host->host_lock, flags);
task_tag = hba->nutrs + free_slot;
+   blk_mq_start_request(req);
 
treq->req_header.dword_0 |= cpu_to_be32(task_tag);
 
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.

Re: [PATCH v4 2/2] scsi: ufs: Fix wrong Task Tag used in task management request UPIUs

2021-04-01 Thread Can Guo


On 2021-04-01 14:44, Daejun Park wrote:
> Hi, Can Guo
>
>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> ...
>>
>> req->end_io_data = &wait;
>> -   free_slot = req->tag;
>> WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);
> I think this line should be removed.

Oh, yes, will remove it in next version.

Thanks,
Can Guo.

> Thanks,
> Daejun

RE: [PATCH v4 2/2] scsi: ufs: Fix wrong Task Tag used in task management request UPIUs

2021-04-01 Thread Daejun Park

Hi, Can Guo

> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
...
>  
>   req->end_io_data = &wait;
> - free_slot = req->tag;
>   WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);
I think this line should be removed.

Thanks,
Daejun

Re: [PATCH v2] nvme-tcp: Check if request has started before processing it

2021-03-31 Thread Christoph Hellwig

On Wed, Mar 31, 2021 at 03:24:49PM -0700, Sagi Grimberg wrote:
>
>>> What we can do, though, is checking the 'state' field in the tcp
>>> request, and only allow completions for commands which are in a state
>>> allowing for completions.
>>>
>>> Let's see if I can whip up a patch.
>>
>> That would be great.  BTW in the crash dump I am looking at now, it
>> looks like pdu->command_id was zero in nvme_tcp_recv_data(), and
>> blk_mq_tag_to_rq() returned a request struct that had not been used.
>> So I think we do need to check that the tag was actually allocated.
>
> request tag can't be zero? I forget...

Of course it can.  But the reserved tags are before the normal tags,
so 0 would be a reserved tag for nvme.

Re: [PATCH v4 1/2] scsi: ufs: Fix task management request completion timeout

2021-03-31 Thread Bart Van Assche

On 3/31/21 9:45 AM, Avri Altman wrote:
>> ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn =
>> ufshcd_compl_tm()),
>> but since blk_mq_tagset_busy_iter() only iterates over all reserved tags
>> and requests which are not in IDLE state, ufshcd_compl_tm() never gets a
>> chance to run. Thus, TMR always ends up with completion timeout. Fix it by
>> calling blk_mq_start_request() in  __ufshcd_issue_tm_cmd().
>>
>> Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and
>> free TMFs")
>>
>> Signed-off-by: Can Guo 
>> ---
>>  drivers/scsi/ufs/ufshcd.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>> index b49555fa..d4f8cb2 100644
>> --- a/drivers/scsi/ufs/ufshcd.c
>> +++ b/drivers/scsi/ufs/ufshcd.c
>> @@ -6464,6 +6464,7 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba
>> *hba,
>>
>> spin_lock_irqsave(host->host_lock, flags);
>> task_tag = hba->nutrs + free_slot;
>> +   blk_mq_start_request(req);
> Maybe just set req->state to MQ_RQ_IN_FLIGHT
> Without all other irrelevant initializations such as add timeout etc.

Hmm ... I'm not sure that any of the actions performed by
blk_mq_start_request() are irrelevant in this context. Additionally, no
other block or SCSI driver sets MQ_RQ_IN_FLIGHT directly.

Thanks,

Bart.

Re: [PATCH v4 1/2] scsi: ufs: Fix task management request completion timeout

2021-03-31 Thread Can Guo


On 2021-04-01 00:45, Avri Altman wrote:
>> ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn =
>> ufshcd_compl_tm()),
>> but since blk_mq_tagset_busy_iter() only iterates over all reserved tags
>> and requests which are not in IDLE state, ufshcd_compl_tm() never gets a
>> chance to run. Thus, TMR always ends up with completion timeout. Fix it by
>> calling blk_mq_start_request() in  __ufshcd_issue_tm_cmd().
>>
>> Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and
>> free TMFs")
>>
>> Signed-off-by: Can Guo
>> ---
>>  drivers/scsi/ufs/ufshcd.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>> index b49555fa..d4f8cb2 100644
>> --- a/drivers/scsi/ufs/ufshcd.c
>> +++ b/drivers/scsi/ufs/ufshcd.c
>> @@ -6464,6 +6464,7 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba
>> *hba,
>>
>> spin_lock_irqsave(host->host_lock, flags);
>> task_tag = hba->nutrs + free_slot;
>> +   blk_mq_start_request(req);
> Maybe just set req->state to MQ_RQ_IN_FLIGHT
> Without all other irrelevant initializations such as add timeout etc.

I don't see any other drivers do that, is it appropriate
to call WRITE_ONCE(rq->state, MQ_RQ_IN_FLIGHT) outside
block layer?

Thanks,
Can Guo.

> Thanks,
> Avri
>>
>> treq->req_header.dword_0 |= cpu_to_be32(task_tag);
>>
>> --
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
>> Linux Foundation Collaborative Project.

Re: [PATCH v2] nvme-tcp: Check if request has started before processing it

2021-03-31 Thread Sagi Grimberg





>>>> It is, but in this situation, the controller is sending a second
>>>> completion that results in a use-after-free, which makes the
>>>> transport irrelevant. Unless there is some other flow (which is unclear
>>>> to me) that causes this which is a bug that needs to be fixed rather
>>>> than hidden with a safeguard.
>>>
>>> The kernel should not crash regardless of any network traffic that is
>>> sent to the system.  It should not be possible to either intentionally
>>> or mistakenly construct packets that will deny service in this way.
>>
>> This is not specific to nvme-tcp. I can build an rdma or pci controller
>> that can trigger the same crash... I saw a similar patch from Hannes
>> implemented in the scsi level, and not the individual scsi transports..
>
> If scsi wants this too, this could be made generic at the blk-mq level.
> We just need to make something like blk_mq_tag_to_rq(), but return NULL
> if the request isn't started.

Makes sense...

>> I would also mention, that a crash is not even the scariest issue that
>> we can see here, because if the request happened to be reused we are
>> in the silent data corruption realm...
>
> If this does happen, I think we have to come up with some way to
> mitigate it. We're not utilizing the full 16 bits of the command_id, so
> maybe we can append something like a generation sequence number that can
> be checked for validity.


That's actually a great idea. scsi needs unique tags so it encodes the
hwq in the upper 16 bits giving the actual tag the lower 16 bits which
is more than enough for a single queue. We can do the same with
a gencnt that will increment in both submission and completion and we
can validate against it.

This will be useful for all transports, so maintaining it in
nvme_req(rq)->genctr and introducing a helper like:
rq = nvme_find_tag(tagset, cqe->command_id)
That will filter genctr, locate the request.

Also:
nvme_validate_request_gen(rq, cqe->command_id) that would
compare against it.


And then a helper to set the command_id like:
cmd->common.command_id = nvme_request_command_id(rq)
that will both increment the genctr and build a command_id
from it.
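
To make the three helpers concrete, here is a rough userspace sketch. It is not driver code: the 4-bit generation / 12-bit tag split, the `sketch_*` names and the `started` flag are all assumptions made for illustration, and the request array stands in for the tagset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed split of the 16-bit command_id: 4 bits generation, 12 bits tag. */
#define SKETCH_TAG_BITS	12u
#define SKETCH_GEN_BITS	4u
#define SKETCH_TAG_MASK	((1u << SKETCH_TAG_BITS) - 1u)
#define SKETCH_GEN_MASK	((1u << SKETCH_GEN_BITS) - 1u)

/* Hypothetical stand-in for the per-request driver data. */
struct sketch_req {
	uint8_t genctr;	/* bumped on every submission and completion */
	int started;
};

static uint16_t sketch_tag_from_cid(uint16_t cid)
{
	return cid & SKETCH_TAG_MASK;
}

static uint8_t sketch_gen_from_cid(uint16_t cid)
{
	return (uint8_t)((cid >> SKETCH_TAG_BITS) & SKETCH_GEN_MASK);
}

/* Submission side: bump the generation and fold it into the command_id. */
static uint16_t sketch_command_id(struct sketch_req *rq, uint16_t tag)
{
	assert(tag <= SKETCH_TAG_MASK);
	rq->genctr = (uint8_t)((rq->genctr + 1u) & SKETCH_GEN_MASK);
	rq->started = 1;
	return (uint16_t)(((unsigned int)rq->genctr << SKETCH_TAG_BITS) | tag);
}

/* Completion side: strip the generation, locate the request, validate it. */
static struct sketch_req *sketch_find_tag(struct sketch_req *reqs, size_t nr,
					  uint16_t cid)
{
	uint16_t tag = sketch_tag_from_cid(cid);
	struct sketch_req *rq;

	if (tag >= nr)
		return NULL;
	rq = &reqs[tag];
	if (!rq->started || rq->genctr != sketch_gen_from_cid(cid))
		return NULL;	/* stale, duplicate or never-issued id */
	return rq;
}

/* Bump the generation again so a second completion no longer matches. */
static int sketch_complete(struct sketch_req *reqs, size_t nr, uint16_t cid)
{
	struct sketch_req *rq = sketch_find_tag(reqs, nr, cid);

	if (!rq)
		return -1;
	rq->genctr = (uint8_t)((rq->genctr + 1u) & SKETCH_GEN_MASK);
	rq->started = 0;
	return 0;
}
```

With a counter this small a stale id can match again after wraparound, so this narrows the window for a bogus completion rather than closing it.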

Thoughts?

Re: [PATCH v2] nvme-tcp: Check if request has started before processing it

2021-03-31 Thread Sagi Grimberg





>> What we can do, though, is checking the 'state' field in the tcp
>> request, and only allow completions for commands which are in a state
>> allowing for completions.
>>
>> Let's see if I can whip up a patch.
>
> That would be great.  BTW in the crash dump I am looking at now, it
> looks like pdu->command_id was zero in nvme_tcp_recv_data(), and
> blk_mq_tag_to_rq() returned a request struct that had not been used.
> So I think we do need to check that the tag was actually allocated.

request tag can't be zero? I forget...

Re: [PATCH v2] nvme-tcp: Check if request has started before processing it

2021-03-31 Thread Ewan D. Milne

On Wed, 2021-03-31 at 09:11 +0200, Hannes Reinecke wrote:
> On 3/31/21 1:28 AM, Keith Busch wrote:
> > On Tue, Mar 30, 2021 at 10:34:25AM -0700, Sagi Grimberg wrote:
> > > 
> > > > > It is, but in this situation, the controller is sending a
> > > > > second
> > > > > completion that results in a use-after-free, which makes the
> > > > > transport irrelevant. Unless there is some other flow (which
> > > > > is
> > > > > unclear
> > > > > to me) that causes this which is a bug that needs to be fixed
> > > > > rather
> > > > > than hidden with a safeguard.
> > > > > 
> > > > 
> > > > The kernel should not crash regardless of any network traffic
> > > > that is
> > > > sent to the system.  It should not be possible to either
> > > > intentionally
> > > > or mistakenly construct packets that will deny service in this
> > > > way.
> > > 
> > > This is not specific to nvme-tcp. I can build an rdma or pci
> > > controller
> > > that can trigger the same crash... I saw a similar patch from
> > > Hannes
> > > implemented in the scsi level, and not the individual scsi
> > > transports..
> > 
> > If scsi wants this too, this could be made generic at the blk-mq
> > level.
> > We just need to make something like blk_mq_tag_to_rq(), but return
> > NULL
> > if the request isn't started.
> >  
> > > I would also mention, that a crash is not even the scariest issue
> > > that
> > > we can see here, because if the request happened to be reused we
> > > are
> > > in the silent data corruption realm...
> > 
> > If this does happen, I think we have to come up with some way to
> > mitigate it. We're not utilizing the full 16 bits of the
> > command_id, so
> > maybe we can append something like a generation sequence number
> > that can
> > be checked for validity.
> > 
> 
> ... which will be near impossible.
> We can protect against crashing on invalid frames.
> We can _not_ protect against maliciously crafted packets referencing
> any
> random _existing_ tag; that's what TLS is for.
> 
> What we can do, though, is checking the 'state' field in the tcp
> request, and only allow completions for commands which are in a state
> allowing for completions.
> 
> Let's see if I can whip up a patch.

That would be great.  BTW in the crash dump I am looking at now, it
looks like pdu->command_id was zero in nvme_tcp_recv_data(), and
blk_mq_tag_to_rq() returned a request struct that had not been used.
So I think we do need to check that the tag was actually allocated.

-Ewan

> 
> Cheers,
> 
> Hannes

RE: [PATCH v4 1/2] scsi: ufs: Fix task management request completion timeout

2021-03-31 Thread Avri Altman

> ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn =
> ufshcd_compl_tm()),
> but since blk_mq_tagset_busy_iter() only iterates over all reserved tags
> and requests which are not in IDLE state, ufshcd_compl_tm() never gets a
> chance to run. Thus, TMR always ends up with completion timeout. Fix it by
> calling blk_mq_start_request() in  __ufshcd_issue_tm_cmd().
> 
> Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and
> free TMFs")
> 
> Signed-off-by: Can Guo 
> ---
>  drivers/scsi/ufs/ufshcd.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index b49555fa..d4f8cb2 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -6464,6 +6464,7 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba
> *hba,
> 
> spin_lock_irqsave(host->host_lock, flags);
> task_tag = hba->nutrs + free_slot;
> +   blk_mq_start_request(req);
Maybe just set req->state to MQ_RQ_IN_FLIGHT
Without all other irrelevant initializations such as add timeout etc.

Thanks,
Avri
> 
> treq->req_header.dword_0 |= cpu_to_be32(task_tag);
> 
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
> Linux Foundation Collaborative Project.

Re: [PATCH v2] nvme-tcp: Check if request has started before processing it

2021-03-31 Thread Hannes Reinecke

On 3/31/21 1:28 AM, Keith Busch wrote:
> On Tue, Mar 30, 2021 at 10:34:25AM -0700, Sagi Grimberg wrote:
>>
>>>> It is, but in this situation, the controller is sending a second
>>>> completion that results in a use-after-free, which makes the
>>>> transport irrelevant. Unless there is some other flow (which is
>>>> unclear
>>>> to me) that causes this which is a bug that needs to be fixed rather
>>>> than hidden with a safeguard.
>>>>
>>>
>>> The kernel should not crash regardless of any network traffic that is
>>> sent to the system.  It should not be possible to either intentionally
>>> or mistakenly construct packets that will deny service in this way.
>>
>> This is not specific to nvme-tcp. I can build an rdma or pci controller
>> that can trigger the same crash... I saw a similar patch from Hannes
>> implemented in the scsi level, and not the individual scsi transports..
> 
> If scsi wants this too, this could be made generic at the blk-mq level.
> We just need to make something like blk_mq_tag_to_rq(), but return NULL
> if the request isn't started.
>  
>> I would also mention, that a crash is not even the scariest issue that
>> we can see here, because if the request happened to be reused we are
>> in the silent data corruption realm...
> 
> If this does happen, I think we have to come up with some way to
> mitigate it. We're not utilizing the full 16 bits of the command_id, so
> maybe we can append something like a generation sequence number that can
> be checked for validity.
> 
... which will be near impossible.
We can protect against crashing on invalid frames.
We can _not_ protect against maliciously crafted packets referencing any
random _existing_ tag; that's what TLS is for.

What we can do, though, is checking the 'state' field in the tcp
request, and only allow completions for commands which are in a state
allowing for completions.

Let's see if I can whip up a patch.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
h...@suse.de  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

[PATCH v4 2/2] scsi: ufs: Fix wrong Task Tag used in task management request UPIUs

2021-03-30 Thread Can Guo

In __ufshcd_issue_tm_cmd(), it is not right to use hba->nutrs + req->tag as
the Task Tag in one TMR UPIU. Directly use req->tag as the Task Tag.

Fixes: e293313262d3 ("scsi: ufs: Fix broken task management command 
implementation")

Reviewed-by: Bart Van Assche 
Signed-off-by: Can Guo 
---
 drivers/scsi/ufs/ufshcd.c | 29 +
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index d4f8cb2..cdd8c3d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6446,38 +6446,35 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
DECLARE_COMPLETION_ONSTACK(wait);
    struct request *req;
unsigned long flags;
-   int free_slot, task_tag, err;
+   int task_tag, err;
 
/*
-* Get free slot, sleep if slots are unavailable.
-* Even though we use wait_event() which sleeps indefinitely,
-* the maximum wait time is bounded by %TM_CMD_TIMEOUT.
+* blk_get_request() is used here only to get a free tag.
 */
req = blk_get_request(q, REQ_OP_DRV_OUT, 0);
if (IS_ERR(req))
return PTR_ERR(req);
 
req->end_io_data = &wait;
-   free_slot = req->tag;
WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);
ufshcd_hold(hba, false);
 
spin_lock_irqsave(host->host_lock, flags);
-   task_tag = hba->nutrs + free_slot;
blk_mq_start_request(req);
 
+   task_tag = req->tag;
treq->req_header.dword_0 |= cpu_to_be32(task_tag);
 
-   memcpy(hba->utmrdl_base_addr + free_slot, treq, sizeof(*treq));
-   ufshcd_vops_setup_task_mgmt(hba, free_slot, tm_function);
+   memcpy(hba->utmrdl_base_addr + task_tag, treq, sizeof(*treq));
+   ufshcd_vops_setup_task_mgmt(hba, task_tag, tm_function);
 
/* send command to the controller */
-   __set_bit(free_slot, &hba->outstanding_tasks);
+   __set_bit(task_tag, &hba->outstanding_tasks);
 
/* Make sure descriptors are ready before ringing the task doorbell */
wmb();
 
-   ufshcd_writel(hba, 1 << free_slot, REG_UTP_TASK_REQ_DOOR_BELL);
+   ufshcd_writel(hba, 1 << task_tag, REG_UTP_TASK_REQ_DOOR_BELL);
/* Make sure that doorbell is committed immediately */
wmb();
 
@@ -6497,24 +6494,24 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
__func__, tm_function);
-   if (ufshcd_clear_tm_cmd(hba, free_slot))
-   dev_WARN(hba->dev, "%s: unable clear tm cmd (slot %d) 
after timeout\n",
-   __func__, free_slot);
+   if (ufshcd_clear_tm_cmd(hba, task_tag))
+   dev_WARN(hba->dev, "%s: unable to clear tm cmd (slot 
%d) after timeout\n",
+   __func__, task_tag);
err = -ETIMEDOUT;
} else {
err = 0;
-   memcpy(treq, hba->utmrdl_base_addr + free_slot, sizeof(*treq));
+   memcpy(treq, hba->utmrdl_base_addr + task_tag, sizeof(*treq));
 
ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_COMP);
}
 
spin_lock_irqsave(hba->host->host_lock, flags);
-   __clear_bit(free_slot, &hba->outstanding_tasks);
+   __clear_bit(task_tag, &hba->outstanding_tasks);
spin_unlock_irqrestore(hba->host->host_lock, flags);
 
+   ufshcd_release(hba);
blk_put_request(req);
 
-   ufshcd_release(hba);
return err;
 }
 
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.

[PATCH v4 1/2] scsi: ufs: Fix task management request completion timeout

2021-03-30 Thread Can Guo

ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn = ufshcd_compl_tm()),
but since blk_mq_tagset_busy_iter() only iterates over all reserved tags
and requests which are not in IDLE state, ufshcd_compl_tm() never gets a
chance to run. Thus, TMR always ends up with completion timeout. Fix it by
calling blk_mq_start_request() in  __ufshcd_issue_tm_cmd().

Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and 
free TMFs")

Signed-off-by: Can Guo 
---
 drivers/scsi/ufs/ufshcd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index b49555fa..d4f8cb2 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6464,6 +6464,7 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 
spin_lock_irqsave(host->host_lock, flags);
task_tag = hba->nutrs + free_slot;
+   blk_mq_start_request(req);
 
treq->req_header.dword_0 |= cpu_to_be32(task_tag);
 
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.

Re: [PATCH v2] nvme-tcp: Check if request has started before processing it

2021-03-30 Thread Keith Busch

On Tue, Mar 30, 2021 at 10:34:25AM -0700, Sagi Grimberg wrote:
> 
> > > It is, but in this situation, the controller is sending a second
> > > completion that results in a use-after-free, which makes the
> > > transport irrelevant. Unless there is some other flow (which is
> > > unclear
> > > to me) that causes this which is a bug that needs to be fixed rather
> > > than hidden with a safeguard.
> > > 
> > 
> > The kernel should not crash regardless of any network traffic that is
> > sent to the system.  It should not be possible to either intentionally
> > or mistakenly construct packets that will deny service in this way.
> 
> This is not specific to nvme-tcp. I can build an rdma or pci controller
> that can trigger the same crash... I saw a similar patch from Hannes
> implemented in the scsi level, and not the individual scsi transports..

If scsi wants this too, this could be made generic at the blk-mq level.
We just need to make something like blk_mq_tag_to_rq(), but return NULL
if the request isn't started.
 
> I would also mention, that a crash is not even the scariest issue that
> we can see here, because if the request happened to be reused we are
> in the silent data corruption realm...

If this does happen, I think we have to come up with some way to
mitigate it. We're not utilizing the full 16 bits of the command_id, so
maybe we can append something like a generation sequence number that can
be checked for validity.
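
To make the two ideas above concrete, here is a rough user-space sketch.
It is only an illustration under stated assumptions: the 12-bit tag /
4-bit generation split, model_req, model_cid(), model_find_rq() and
model_complete() are all hypothetical names and values, not existing
nvme or blk-mq interfaces. The lookup returns NULL when the request is
not started or when the generation in the command_id does not match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_BITS	12u			/* assumed split of the 16-bit id */
#define TAG_MASK	((1u << TAG_BITS) - 1u)
#define GEN_MASK	0xFu			/* 4 generation bits (assumption) */

struct model_req {
	uint8_t gen;		/* bumped each time the tag is (re)issued */
	int started;		/* non-zero while the request is in flight */
};

/* Issue a request: bump its generation and build the 16-bit command id. */
static uint16_t model_cid(struct model_req *rq, uint16_t tag)
{
	rq->gen = (uint8_t)((rq->gen + 1) & GEN_MASK);
	rq->started = 1;
	return (uint16_t)((rq->gen << TAG_BITS) | (tag & TAG_MASK));
}

/*
 * Map a command id from a completion back to a request. Returns NULL for
 * an out-of-range tag, a request that is not started, or a stale
 * generation (for example a duplicate completion for a reused tag).
 */
static struct model_req *model_find_rq(struct model_req *table,
					uint16_t ntags, uint16_t cid)
{
	uint16_t tag = cid & TAG_MASK;
	uint8_t gen = (cid >> TAG_BITS) & GEN_MASK;
	struct model_req *rq;

	if (tag >= ntags)
		return NULL;
	rq = &table[tag];
	if (!rq->started || rq->gen != gen)
		return NULL;
	return rq;
}

/* Complete a request so that later completions for it are rejected. */
static void model_complete(struct model_req *rq)
{
	rq->started = 0;
}
```

With only four generation bits the counter wraps after 16 reuses of a
tag, so in this sketch the check narrows the window for a stale
completion rather than closing it.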

Re: [PATCH v2] nvme-tcp: Check if request has started before processing it

2021-03-30 Thread Sagi Grimberg

> > It is, but in this situation, the controller is sending a second
> > completion that results in a use-after-free, which makes the
> > transport irrelevant. Unless there is some other flow (which is unclear
> > to me) that causes this which is a bug that needs to be fixed rather
> > than hidden with a safeguard.
> >
>
> The kernel should not crash regardless of any network traffic that is
> sent to the system.  It should not be possible to either intentionally
> or mistakenly construct packets that will deny service in this way.

This is not specific to nvme-tcp. I can build an rdma or pci controller
that can trigger the same crash... I saw a similar patch from Hannes
implemented in the scsi level, and not the individual scsi transports..

I would also mention, that a crash is not even the scariest issue that
we can see here, because if the request happened to be reused we are
in the silent data corruption realm...

Re: [PATCH v2] nvme-tcp: Check if request has started before processing it

2021-03-30 Thread Ewan D. Milne

On Mon, 2021-03-15 at 10:16 -0700, Sagi Grimberg wrote:
> > Hi Sagi,
> > 
> > On Fri, Mar 05, 2021 at 11:57:30AM -0800, Sagi Grimberg wrote:
> > > Daniel, again, there is nothing specific about this to nvme-tcp,
> > > this is a safeguard against a funky controller (or a different
> > > bug that is hidden by this).
> > 
> > As far as I can tell, the main difference between nvme-tcp and FC/NVMe
> > is that nvme-tcp has no FW or big driver which filters out some noise
> > from a misbehaving controller. I haven't really checked the other
> > transports but I wouldn't be surprised if they share the same
> > properties as FC/NVMe.
> > 
> > > The same can happen in any other transport so I would suggest that if
> > > this is a safeguard we want to put in place, we should make it a
> > > generic one.
> > > 
> > > i.e. nvme_tag_to_rq() that _all_ transports call consistently.
> > 
> > Okay, I'll review all the relevant code and see what could be made more
> > generic and consistent.
> > 
> > Though I think nvme-tcp plays in a different league as it is exposed to
> > normal networking traffic and this is a very hostile environment.
> 
> It is, but in this situation, the controller is sending a second
> completion that results in a use-after-free, which makes the
> transport irrelevant. Unless there is some other flow (which is
> unclear
> to me) that causes this which is a bug that needs to be fixed rather
> than hidden with a safeguard.
> 

The kernel should not crash regardless of any network traffic that is
sent to the system.  It should not be possible to either intentionally
or mistakenly construct packets that will deny service in this way.

-Ewan

[PATCH v2 4/5] phy: cadence-torrent: Explicitly request exclusive reset control

2021-03-30 Thread Kishon Vijay Abraham I

No functional change. Since the reset controls obtained in
Torrent are used exclusively by the Torrent device, use the
exclusive reset control request API calls.

Signed-off-by: Kishon Vijay Abraham I 
Reviewed-by: Swapnil Jakhade 
---
 drivers/phy/cadence/phy-cadence-torrent.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/phy/cadence/phy-cadence-torrent.c 
b/drivers/phy/cadence/phy-cadence-torrent.c
index 5ee1657f5a1c..ff8bb4b724c0 100644
--- a/drivers/phy/cadence/phy-cadence-torrent.c
+++ b/drivers/phy/cadence/phy-cadence-torrent.c
@@ -2264,7 +2264,7 @@ static int cdns_torrent_reset(struct cdns_torrent_phy 
*cdns_phy)
return PTR_ERR(cdns_phy->phy_rst);
}
 
-   cdns_phy->apb_rst = devm_reset_control_get_optional(dev, "torrent_apb");
+   cdns_phy->apb_rst = devm_reset_control_get_optional_exclusive(dev, 
"torrent_apb");
if (IS_ERR(cdns_phy->apb_rst)) {
dev_err(dev, "%s: failed to get apb reset\n",
dev->of_node->full_name);
-- 
2.17.1

[PATCH 2/2] drm/ingenic: Don't request full modeset if property is not modified

2021-03-29 Thread Paul Cercueil

Avoid requesting a full modeset if the sharpness property is not
modified, because then we don't actually need it.

Fixes: fc1acf317b01 ("drm/ingenic: Add support for the IPU")
Cc:  # 5.8+
Signed-off-by: Paul Cercueil 
---
 drivers/gpu/drm/ingenic/ingenic-ipu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ingenic/ingenic-ipu.c 
b/drivers/gpu/drm/ingenic/ingenic-ipu.c
index 3b1091e7c0cd..95b665c4a7b0 100644
--- a/drivers/gpu/drm/ingenic/ingenic-ipu.c
+++ b/drivers/gpu/drm/ingenic/ingenic-ipu.c
@@ -640,10 +640,12 @@ ingenic_ipu_plane_atomic_set_property(struct drm_plane 
*plane,
 {
struct ingenic_ipu *ipu = plane_to_ingenic_ipu(plane);
struct drm_crtc_state *crtc_state;
+   bool mode_changed;
 
if (property != ipu->sharpness_prop)
return -EINVAL;
 
+   mode_changed = val != ipu->sharpness;
ipu->sharpness = val;
 
if (state->crtc) {
@@ -651,7 +653,7 @@ ingenic_ipu_plane_atomic_set_property(struct drm_plane 
*plane,
if (WARN_ON(!crtc_state))
return -EINVAL;
 
-   crtc_state->mode_changed = true;
+   crtc_state->mode_changed |= mode_changed;
}
 
return 0;
-- 
2.30.2
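
The behaviour change in the patch above can be sketched in isolation.
This is a simplified user-space model, not the driver code: model_ipu,
model_crtc_state and model_set_sharpness() are hypothetical names. It
shows that a full modeset is flagged only when the value actually
changes, and that "|=" never clears a flag set earlier in the same
commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins for the driver and CRTC state (assumptions). */
struct model_crtc_state {
	bool mode_changed;
};

struct model_ipu {
	unsigned int sharpness;
};

/*
 * Store the new sharpness and request a full modeset only if the value
 * differs from the current one. crtc_state may be NULL when the plane
 * is not attached to a CRTC.
 */
static void model_set_sharpness(struct model_ipu *ipu,
				struct model_crtc_state *crtc_state,
				unsigned int val)
{
	bool mode_changed = val != ipu->sharpness;

	ipu->sharpness = val;

	if (crtc_state)
		crtc_state->mode_changed |= mode_changed;
}
```

Using "|=" rather than plain assignment matters when another property
in the same commit has already requested a modeset: writing an unchanged
sharpness value must not cancel that request.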

[PATCH v2 07/13] intel_gna: add request component

2021-03-24 Thread Maciej Kwapulinski

From: Tomasz Jankowski 

The scoring work submitted to the GNA driver is implemented as a
list of requests that will be processed by the hardware.

Signed-off-by: Tomasz Jankowski 
Tested-by: Savo Novakovic 
Co-developed-by: Anisha Dattatraya Kulkarni 
Signed-off-by: Anisha Dattatraya Kulkarni 
Co-developed-by: Jianxun Zhang 
Signed-off-by: Jianxun Zhang 
Co-developed-by: Maciej Kwapulinski 
Signed-off-by: Maciej Kwapulinski 
---
 drivers/misc/intel/gna/Kbuild|   2 +-
 drivers/misc/intel/gna/gna_device.c  |   6 +
 drivers/misc/intel/gna/gna_device.h  |   6 +
 drivers/misc/intel/gna/gna_mem.c |   3 +
 drivers/misc/intel/gna/gna_request.c | 347 +++
 drivers/misc/intel/gna/gna_request.h |  61 +
 6 files changed, 424 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/intel/gna/gna_request.c
 create mode 100644 drivers/misc/intel/gna/gna_request.h

diff --git a/drivers/misc/intel/gna/Kbuild b/drivers/misc/intel/gna/Kbuild
index e5cd953d83b2..5dbbd3f0a543 100644
--- a/drivers/misc/intel/gna/Kbuild
+++ b/drivers/misc/intel/gna/Kbuild
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
-intel_gna-y := gna_device.o gna_driver.o gna_mem.o gna_hw.o
+intel_gna-y := gna_device.o gna_driver.o gna_mem.o gna_request.o gna_hw.o
 
 obj-$(CONFIG_INTEL_GNA) += intel_gna.o
diff --git a/drivers/misc/intel/gna/gna_device.c 
b/drivers/misc/intel/gna/gna_device.c
index 9838d003426f..14ce24fd18ff 100644
--- a/drivers/misc/intel/gna/gna_device.c
+++ b/drivers/misc/intel/gna/gna_device.c
@@ -6,6 +6,7 @@
 
 #include "gna_device.h"
 #include "gna_driver.h"
+#include "gna_request.h"
 
 #define GNA_DEV_HWID_CNL   0x5A11
 #define GNA_DEV_HWID_EHL   0x4511
@@ -118,6 +119,11 @@ static int gna_dev_init(struct gna_private *gna_priv, 
struct pci_dev *pcidev,
idr_init(&gna_priv->memory_idr);
mutex_init(&gna_priv->memidr_lock);
 
+   atomic_set(&gna_priv->request_count, 0);
+
+   mutex_init(&gna_priv->reqlist_lock);
+   INIT_LIST_HEAD(&gna_priv->request_list);
+
return 0;
 
 err_pci_drvdata_unset:
diff --git a/drivers/misc/intel/gna/gna_device.h 
b/drivers/misc/intel/gna/gna_device.h
index 799788d70033..b54d0ea9b9ef 100644
--- a/drivers/misc/intel/gna/gna_device.h
+++ b/drivers/misc/intel/gna/gna_device.h
@@ -6,6 +6,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -44,6 +45,11 @@ struct gna_private {
struct gna_mmu_object mmu;
struct mutex mmu_lock;
 
+   struct list_head request_list;
+   /* protects request_list */
+   struct mutex reqlist_lock;
+   atomic_t request_count;
+
/* memory objects' store */
struct idr memory_idr;
/* lock protecting memory_idr */
diff --git a/drivers/misc/intel/gna/gna_mem.c b/drivers/misc/intel/gna/gna_mem.c
index f3828b503ff6..ce1691d68edb 100644
--- a/drivers/misc/intel/gna/gna_mem.c
+++ b/drivers/misc/intel/gna/gna_mem.c
@@ -17,6 +17,7 @@
 #include "gna_device.h"
 #include "gna_driver.h"
 #include "gna_mem.h"
+#include "gna_request.h"
 
 static void gna_mmu_init(struct gna_private *gna_priv)
 {
@@ -392,6 +393,8 @@ static void gna_memory_release(struct work_struct *work)
 
mo = container_of(work, struct gna_memory_object, work);
 
+   gna_delete_memory_requests(mo->memory_id, mo->gna_priv);
+
mo->user_ptr = NULL;
 
wake_up_interruptible(&mo->waitq);
diff --git a/drivers/misc/intel/gna/gna_request.c 
b/drivers/misc/intel/gna/gna_request.c
new file mode 100644
index ..383871eaebab
--- /dev/null
+++ b/drivers/misc/intel/gna/gna_request.c
@@ -0,0 +1,347 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2017-2021 Intel Corporation
+
+#include 
+#include 
+#include 
+#include 
+
+#include "gna_device.h"
+#include "gna_driver.h"
+#include "gna_request.h"
+
+static struct gna_request *gna_request_create(struct gna_file_private 
*file_priv,
+  struct gna_compute_cfg *compute_cfg)
+{
+   struct gna_request *score_request;
+   struct gna_private *gna_priv;
+
+   gna_priv = file_priv->gna_priv;
+   if (IS_ERR(gna_priv))
+   return NULL;
+
+   score_request = kzalloc(sizeof(*score_request), GFP_KERNEL);
+   if (!score_request)
+   return NULL;
+   kref_init(&score_request->refcount);
+
+   dev_dbg(&gna_priv->pdev->dev, "layer_base %d layer_count %d\n",
+   compute_cfg->layer_base, compute_cfg->layer_count);
+
+   score_request->request_id = atomic_inc_return(&gna_priv->request_count);
+   score_request->compute_cfg = *compute_cfg;
+   score_request->fd = file_priv->fd;
+   score_request->gna_priv = gna_priv;
+   score_request->state = NEW;
+   init_waitqueue_head(&score_request->waitq);
+
+   return score_request;
+}
+
+/*
+ * returns true if [inner_offset, inner_size) is embraced by [0, outer_size). 
False otherwise.
+ */
+static bool gna_validate_ranges(u64 outer_size,

Re: [PATCH] mmc: block: use REQ_HIPRI flag to complete request directly in own complete workqueue

2021-03-23 Thread Christoph Hellwig

On Tue, Mar 23, 2021 at 11:40:47PM +0800, xiang fei wrote:
> Before the commit "40d09b53bfc557af7481b9d80f060a7ac9c7d314", block I/O 
> request
> is completed in mmc_blk_mq_complete_work() and there is no problem.
> But after the commit, block I/O request is completed in softirq and it
> may cause the preemptoff
> problem as above.

I see how they are executed in softirq context, but I don't see how the
above commit could have introduced that.  It literally just refactored the
existing checks.

> The use of REQ_HIPRI flag is intended to execute rq->q->mq_ops->complete() in
> mmc_blk_mq_complete_work(), not in softirq.
> I just think it can avoid the preemptoff problem and not change too much.
> Maybe there is  a better way to solve the problem.

Well, there isn't really much of a point in bouncing to a softirq
context.  The patch below cleans the mmc completion code up to
avoid that internally, but for that it uses an API that right now
isn't intended for that kind of use.  I'm not sure yet if it just
needs updated documentation or maybe a different API.  Jens, any
comments?  Sagi, this might also make sense for nvme-tcp, doesn't it?

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index d666e24fbe0e0a..7ad7a4efd10481 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1364,17 +1364,28 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, 
struct mmc_queue_req *mqrq,
 
 #define MMC_CQE_RETRIES 2
 
-static void mmc_blk_cqe_complete_rq(struct mmc_queue *mq, struct request *req)
+static void mmc_blk_cqe_complete_rq(struct request *req)
 {
struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
struct mmc_request *mrq = &mqrq->brq.mrq;
struct request_queue *q = req->q;
+   struct mmc_queue *mq = q->queuedata;
struct mmc_host *host = mq->card->host;
enum mmc_issue_type issue_type = mmc_issue_type(mq, req);
unsigned long flags;
bool put_card;
int err;
 
+   /*
+* Block layer timeouts race with completions which means the normal
+* completion path cannot be used during recovery.
+*/
+   if (!mq->in_recovery) {
+   if (unlikely(blk_should_fake_timeout(req->q)))
+   return;
+   blk_mq_set_request_complete(req);
+   }
+
mmc_cqe_post_req(host, mrq);
 
if (mrq->cmd && mrq->cmd->error)
@@ -1437,17 +1448,8 @@ static void mmc_blk_cqe_req_done(struct mmc_request *mrq)
struct mmc_queue_req *mqrq = container_of(mrq, struct mmc_queue_req,
  brq.mrq);
struct request *req = mmc_queue_req_to_req(mqrq);
-   struct request_queue *q = req->q;
-   struct mmc_queue *mq = q->queuedata;
 
-   /*
-* Block layer timeouts race with completions which means the normal
-* completion path cannot be used during recovery.
-*/
-   if (mq->in_recovery)
-   mmc_blk_cqe_complete_rq(mq, req);
-   else if (likely(!blk_should_fake_timeout(req->q)))
-   blk_mq_complete_request(req);
+   mmc_blk_cqe_complete_rq(req);
 }
 
 static int mmc_blk_cqe_start_req(struct mmc_host *host, struct mmc_request 
*mrq)
@@ -1864,6 +1866,16 @@ static void mmc_blk_mq_complete_rq(struct mmc_queue *mq, 
struct request *req)
struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
unsigned int nr_bytes = mqrq->brq.data.bytes_xfered;
 
+   /*
+* Block layer timeouts race with completions which means the normal
+* completion path cannot be used during recovery.
+*/
+   if (!mq->in_recovery) {
+   if (unlikely(blk_should_fake_timeout(req->q)))
+   return;
+   blk_mq_set_request_complete(req);
+   }
+
if (nr_bytes) {
if (blk_update_request(req, BLK_STS_OK, nr_bytes))
blk_mq_requeue_request(req, true);
@@ -1920,24 +1932,7 @@ static void mmc_blk_hsq_req_done(struct mmc_request *mrq)
 
mmc_blk_rw_reset_success(mq, req);
 
-   /*
-* Block layer timeouts race with completions which means the normal
-* completion path cannot be used during recovery.
-*/
-   if (mq->in_recovery)
-   mmc_blk_cqe_complete_rq(mq, req);
-   else if (likely(!blk_should_fake_timeout(req->q)))
-   blk_mq_complete_request(req);
-}
-
-void mmc_blk_mq_complete(struct request *req)
-{
-   struct mmc_queue *mq = req->q->queuedata;
-
-   if (mq->use_cqe)
-   mmc_blk_cqe_complete_rq(mq, req);
-   else if (likely(!blk_should_fake_timeout(req->q)))
-   mmc_blk_mq_complete_rq(mq, req);
+   mmc_blk_cqe_complete_rq(req);
 }
 
 static void mmc_blk_mq_poll_completion(struct mmc_q

Re:Re: [PATCH] mmc: block: use REQ_HIPRI flag to complete request directly in own complete workqueue

2021-03-23 Thread xiang fei

At 2021-02-06 00:22:21, "Christoph Hellwig"  wrote:
>On Fri, Feb 05, 2021 at 03:24:06PM +0100, Ulf Hansson wrote:
>> On Thu, 21 Jan 2021 at 09:13, Liu Xiang  wrote:
>> >
>> > After commit "40d09b53bfc557af7481b9d80f060a7ac9c7d314", request is
>> > completed in softirq. This may cause the system to suffer bad preemptoff
>> > time.
>> > The mmc driver has its own complete workqueue, but it can not work
>> > well now.
>> > The REQ_HIPRI flag can be used to complete request directly in its own
>> > complete workqueue and the preemptoff problem could be avoided.
>>
>> I am trying to understand all of the problem, but I don't quite get
>> it, sorry. Would it be possible for you to extend the description in
>> the commit message a bit?
>
>Yes, the message sounds weird.  The mentioned commit should obviously
>not make any difference for drivers not using it.
>
>> More exactly, what will happen if we tag a request with REQ_HIPRI
>> before completing it? Apologize for my ignorance, but I am currently a
>> bit overwhelmed with work, so I didn't have the time to really look it
>> up myself.
>
>Drivers must never set REQ_HIPRI!  This is a flag that is set by
>the submitter, and actually cleared for most drivers that don't support
>it by the block layer.


Sorry for not describing the problem clearly in the commit message.
I enabled CONFIG_PREEMPT_TRACER and ran an iozone test on the mmc driver.
This is the test result on mainline:

# tracer: preemptoff
#
# preemptoff latency trace v1.1.5 on 5.11.0-rc4-ga95f9eb3f6cf
# --------------------------------------------------------------------
# latency: 1130504 us, #51247/49333412, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
#    -----------------
#    | task: ksoftirqd/3-27 (uid:0 nice:0 policy:0 rt_prio:0)
#    -----------------
#  => started at: __do_softirq
#  => ended at:   __do_softirq
#
#
#                    _------=> CPU#
#                   / _-----=> irqs-off
#                  | / _----=> need-resched
#                  || / _---=> hardirq/softirq
#                  ||| / _--=> preempt-depth
#                  |||| /     delay
#  cmd     pid     ||||| time  |   caller
#     \   /        |||||  \    |   /
ksoftirq-27      3.Ns1 1055999us : end_page_writeback <-ext4_finish_bio
ksoftirq-27      3.Ns1 1056000us : test_clear_page_writeback <-end_page_writeback
ksoftirq-27      3.Ns1 1056002us : page_mapping <-test_clear_page_writeback
ksoftirq-27      3.Ns1 1056003us : lock_page_memcg <-test_clear_page_writeback
ksoftirq-27      3.Ns1 1056004us : __rcu_read_lock <-lock_page_memcg
ksoftirq-27      3.Ns1 1056006us : _raw_spin_lock_irqsave <-test_clear_page_writeback
ksoftirq-27      3dNs1 1056007us : preempt_count_add <-_raw_spin_lock_irqsave
ksoftirq-27      3dNs2 1056009us : preempt_count_add <-percpu_counter_add_batch
ksoftirq-27      3dNs3 1056010us : preempt_count_sub <-percpu_counter_add_batch
ksoftirq-27      3dNs2 1056012us : preempt_count_add <-percpu_counter_add_batch
ksoftirq-27      3dNs3 1056013us : preempt_count_sub <-percpu_counter_add_batch
ksoftirq-27      3dNs2 1056014us : preempt_count_add <-percpu_counter_add_batch
ksoftirq-27      3dNs3 1056016us : preempt_count_sub <-percpu_counter_add_batch
ksoftirq-27      3dNs2 1056017us : preempt_count_add <-percpu_counter_add_batch
ksoftirq-27      3dNs3 1056018us : preempt_count_sub <-percpu_counter_add_batch
ksoftirq-27      3dNs2 1056020us : mem_cgroup_wb_domain <-test_clear_page_writeback
ksoftirq-27      3dNs2 1056021us : _raw_spin_unlock_irqrestore <-test_clear_page_writeback
ksoftirq-27      3.Ns2 1056023us : preempt_count_sub <-_raw_spin_unlock_irqrestore
ksoftirq-27      3dNs1 1056025us : __mod_lruvec_state <-test_clear_page_writeback
ksoftirq-27      3dNs1 1056026us : __mod_node_page_state <-__mod_lruvec_state
ksoftirq-27      3dNs1 1056027us : __mod_memcg_lruvec_state <-__mod_lruvec_state
ksoftirq-27      3dNs1 1056029us : __mod_memcg_state <-__mod_memcg_lruvec_state
ksoftirq-27      3.Ns1 1056031us : dec_zone_page_state <-test_clear_page_writeback
ksoftirq-27      3.Ns1 1056032us : inc_node_page_state <-test_clear_page_writeback
ksoftirq-27      3.Ns1 1056033us : __unlock_page_memcg <-test_clear_page_writeback
ksoftirq-27      3.Ns1 1056035us : __rcu_read_unlock <-__unlock_page_memcg
ksoftirq-27      3.Ns1 1056036us : wake_up_page_bit <-end_page_writeback
ksoftirq-27      3.Ns1 1056037us : _raw_spin_lock_irqsave <-wake_up_page_bit
ksoftirq-27      3dNs1 1056039us : preempt_count_add <-_raw_spin_lock_irqsave
ksoftirq-27      3dNs2 1056041us : __wake_up_locked_key_bookmark <-wake_up_page_bit
ksoftirq-27      3dNs2 1056042us : __wake_up_common
&l

Re: [PATCH] usb: gadget: Stall OS descriptor request for unsupported functions

2021-03-22 Thread Wesley Cheng




On 3/22/2021 11:25 PM, Jack Pham wrote:
> Hi Wesley,
> 
> On Mon, Mar 22, 2021 at 06:50:17PM -0700, Wesley Cheng wrote:
>> From: Chandana Kishori Chiluveru 
>>
>> Hosts which request "OS descriptors" from gadgets do so during
>> the enumeration phase and before the configuration is set with
>> SET_CONFIGURATION. Composite driver supports OS descriptor
>> handling in composite_setup function. This requires to pass
>> signature field, vendor code, compatibleID and subCompatibleID
>> from user space.
>>
>> For USB compositions that contain functions which don't implement os
>> descriptors, Windows is sending vendor specific requests for os
>> descriptors and composite driver handling this request with invalid
>> data. With this invalid info host resetting the bus and never
>> selecting the configuration and leading enumeration issue.
>>
>> Fix this by bailing out from the OS descriptor setup request
>> handling if the functions does not have OS descriptors compatibleID.
>>
>> Signed-off-by: Chandana Kishori Chiluveru 
>> Signed-off-by: Wesley Cheng 
>> ---
>>  drivers/usb/gadget/composite.c | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c
>> index 72a9797..473edda6 100644
>> --- a/drivers/usb/gadget/composite.c
>> +++ b/drivers/usb/gadget/composite.c
>> @@ -1945,6 +1945,12 @@ composite_setup(struct usb_gadget *gadget, const 
>> struct usb_ctrlrequest *ctrl)
>>  buf[6] = w_index;
>>  /* Number of ext compat interfaces */
>>  count = count_ext_compat(os_desc_cfg);
>> +/*
>> + * Bailout if device does not
>> + * have ext_compat interfaces.
>> + */
>> +if (count == 0)
>> +break;
>>  buf[8] = count;
>>  count *= 24; /* 24 B/ext compat desc */
>>  count += 16; /* header */
> 
> Do we still need this fix? IIRC we had this change in our downstream
> kernel to fix the case when dynamically re-configuring ConfigFS, i.e.
> changing the composition of functions wherein none of the interfaces
> support OS Descriptors, so this causes count_ext_compat() to return
> 0 and results in the issue described in $SUBJECT.
> 
Hi Jack,

You're correct.  We can address this from the userspace side as well.

> But I think this is more of a problem of an improperly configured
> ConfigFS gadget. If userspace instead removes the config from the
> gadget's os_desc subdirectory that should cause cdev->os_desc_config to
> be set to NULL and hence composite_setup() should never enter this
> handling at all, right?

Sure, I'll go with fixing it in the userspace, since the support to
stall the OS desc is already present in the composite driver as you
mentioned.  Thanks for the input.

Thanks
Wesley Cheng

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

Re: [PATCH] usb: gadget: Stall OS descriptor request for unsupported functions

2021-03-22 Thread Jack Pham

Hi Wesley,

On Mon, Mar 22, 2021 at 06:50:17PM -0700, Wesley Cheng wrote:
> From: Chandana Kishori Chiluveru 
> 
> Hosts which request "OS descriptors" from gadgets do so during
> the enumeration phase and before the configuration is set with
> SET_CONFIGURATION. Composite driver supports OS descriptor
> handling in composite_setup function. This requires to pass
> signature field, vendor code, compatibleID and subCompatibleID
> from user space.
> 
> For USB compositions that contain functions which don't implement os
> descriptors, Windows is sending vendor specific requests for os
> descriptors and composite driver handling this request with invalid
> data. With this invalid info host resetting the bus and never
> selecting the configuration and leading enumeration issue.
> 
> Fix this by bailing out from the OS descriptor setup request
> handling if the functions does not have OS descriptors compatibleID.
> 
> Signed-off-by: Chandana Kishori Chiluveru 
> Signed-off-by: Wesley Cheng 
> ---
>  drivers/usb/gadget/composite.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c
> index 72a9797..473edda6 100644
> --- a/drivers/usb/gadget/composite.c
> +++ b/drivers/usb/gadget/composite.c
> @@ -1945,6 +1945,12 @@ composite_setup(struct usb_gadget *gadget, const 
> struct usb_ctrlrequest *ctrl)
>   buf[6] = w_index;
>   /* Number of ext compat interfaces */
>   count = count_ext_compat(os_desc_cfg);
> + /*
> +  * Bailout if device does not
> +  * have ext_compat interfaces.
> +  */
> + if (count == 0)
> + break;
>   buf[8] = count;
>   count *= 24; /* 24 B/ext compat desc */
>   count += 16; /* header */

Do we still need this fix? IIRC we had this change in our downstream
kernel to fix the case when dynamically re-configuring ConfigFS, i.e.
changing the composition of functions wherein none of the interfaces
support OS Descriptors, so this causes count_ext_compat() to return
0 and results in the issue described in $SUBJECT.

But I think this is more of a problem of an improperly configured
ConfigFS gadget. If userspace instead removes the config from the
gadget's os_desc subdirectory that should cause cdev->os_desc_config to
be set to NULL and hence composite_setup() should never enter this
handling at all, right?

Jack
-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

[PATCH] usb: gadget: Stall OS descriptor request for unsupported functions

2021-03-22 Thread Wesley Cheng

From: Chandana Kishori Chiluveru 

Hosts which request "OS descriptors" from gadgets do so during
the enumeration phase, before the configuration is set with
SET_CONFIGURATION. The composite driver supports OS descriptor
handling in composite_setup(). This requires the signature field,
vendor code, compatibleID and subCompatibleID to be passed in
from user space.

For USB compositions that contain functions which don't implement OS
descriptors, Windows sends vendor-specific requests for OS
descriptors and the composite driver answers them with invalid
data. Given this invalid data, the host resets the bus and never
selects a configuration, so enumeration fails.

Fix this by bailing out of the OS descriptor setup request
handling if the functions do not have an OS descriptor compatibleID.

Signed-off-by: Chandana Kishori Chiluveru 
Signed-off-by: Wesley Cheng 
---
 drivers/usb/gadget/composite.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c
index 72a9797..473edda6 100644
--- a/drivers/usb/gadget/composite.c
+++ b/drivers/usb/gadget/composite.c
@@ -1945,6 +1945,12 @@ composite_setup(struct usb_gadget *gadget, const struct 
usb_ctrlrequest *ctrl)
buf[6] = w_index;
/* Number of ext compat interfaces */
count = count_ext_compat(os_desc_cfg);
+   /*
+* Bailout if device does not
+* have ext_compat interfaces.
+*/
+   if (count == 0)
+   break;
buf[8] = count;
count *= 24; /* 24 B/ext compat desc */
count += 16; /* header */
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
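
The length arithmetic touched by this patch can be shown on its own. The
sketch below is a user-space illustration only: model_ext_compat_len()
and the two macro names are hypothetical, while the 16-byte header and
24 bytes per descriptor are taken from the comments in the diff context.
It assumes the caller treats a negative return as "stall the request".

```c
#include <assert.h>
#include <stddef.h>

#define EXT_COMPAT_HEADER_LEN	16	/* header, per the diff context */
#define EXT_COMPAT_DESC_LEN	24	/* 24 B/ext compat desc */

/*
 * Return the total length of the extended compat ID response for
 * `count` interfaces, or -1 when there are none, in which case the
 * caller is expected to stall instead of answering with an empty,
 * invalid descriptor.
 */
static int model_ext_compat_len(int count)
{
	if (count <= 0)
		return -1;
	return count * EXT_COMPAT_DESC_LEN + EXT_COMPAT_HEADER_LEN;
}
```

For example, one interface gives 40 bytes and two give 64, while zero
interfaces produces no response length at all.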

RE: [PATCH v2] usb: cdns3: Optimize DMA request buffer allocation

2021-03-22 Thread Sanket Parmar

Hi Peter,

> On 21-03-17 20:13:59, Sanket Parmar wrote:
> > dma_alloc_coherent() might fail on the platform with a small
> > DMA region.
> >
> > To avoid such failure in cdns3_prepare_aligned_request_buf(),
> > dma_alloc_coherent() is replaced with dma_alloc_noncoherent()
> > to allocate aligned request buffer of dynamic length.
> >
> > Reported-by: Aswath Govindraju 
> > Signed-off-by: Sanket Parmar 
> > ---
> >
> > Changelog:
> > v2:
> > - used dma_*_noncoherent() APIs
> > - changed the commit log
> >
> >  drivers/usb/cdns3/cdns3-gadget.c | 30 -
> -
> >  drivers/usb/cdns3/cdns3-gadget.h |  2 ++
> >  2 files changed, 26 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-
> gadget.c
> > index 0b892a2..126087b 100644
> > --- a/drivers/usb/cdns3/cdns3-gadget.c
> > +++ b/drivers/usb/cdns3/cdns3-gadget.c
> > @@ -819,9 +819,15 @@ void cdns3_gadget_giveback(struct
> cdns3_endpoint *priv_ep,
> > priv_ep->dir);
> >
> > if ((priv_req->flags & REQUEST_UNALIGNED) &&
> > -   priv_ep->dir == USB_DIR_OUT && !request->status)
> > +   priv_ep->dir == USB_DIR_OUT && !request->status) {
> > +   /* Make DMA buffer CPU accessible */
> > +   dma_sync_single_for_cpu(priv_dev->sysdev,
> > +   priv_req->aligned_buf->dma,
> > +   priv_req->aligned_buf->size,
> > +   priv_req->aligned_buf->dir);
> > memcpy(request->buf, priv_req->aligned_buf->buf,
> >request->length);
> > +   }
> >
> > priv_req->flags &= ~(REQUEST_PENDING | REQUEST_UNALIGNED);
> > /* All TRBs have finished, clear the counter */
> > @@ -883,8 +889,8 @@ static void cdns3_free_aligned_request_buf(struct
> work_struct *work)
> >  * interrupts.
> >  */
> > spin_unlock_irqrestore(&priv_dev->lock, flags);
> > -   dma_free_coherent(priv_dev->sysdev, buf->size,
> > - buf->buf, buf->dma);
> > +   dma_free_noncoherent(priv_dev->sysdev, buf-
> >size,
> > + buf->buf, buf->dma, buf->dir);
> > kfree(buf);
> > spin_lock_irqsave(&priv_dev->lock, flags);
> > }
> > @@ -911,10 +917,13 @@ static int
> cdns3_prepare_aligned_request_buf(struct cdns3_request *priv_req)
> > return -ENOMEM;
> >
> > buf->size = priv_req->request.length;
> > +   buf->dir = usb_endpoint_dir_in(priv_ep->endpoint.desc) ?
> > +   DMA_TO_DEVICE : DMA_FROM_DEVICE;
> >
> > -   buf->buf = dma_alloc_coherent(priv_dev->sysdev,
> > +   buf->buf = dma_alloc_noncoherent(priv_dev->sysdev,
> >   buf->size,
> >   &buf->dma,
> > + buf->dir,
> >   GFP_ATOMIC);
> > if (!buf->buf) {
> > kfree(buf);
> > @@ -936,10 +945,18 @@ static int
> cdns3_prepare_aligned_request_buf(struct cdns3_request *priv_req)
> > }
> >
> > if (priv_ep->dir == USB_DIR_IN) {
> > +   /* Make DMA buffer CPU accessible */
> > +   dma_sync_single_for_cpu(priv_dev->sysdev,
> > +   buf->dma, buf->size, buf->dir);
> > memcpy(buf->buf, priv_req->request.buf,
> >priv_req->request.length);
> > }
> >
> > +   /* Transfer DMA buffer ownership back to device */
> > +   dma_sync_single_for_device(priv_dev->sysdev,
> > +   buf->dma, buf->size, buf->dir);
> > +
> > +
> 
> One more blank line.
> 
> Otherwise, it seems OK for me.

I have removed this blank line. A new patch has already been posted.

> 
> > priv_req->flags |= REQUEST_UNALIGNED;
> > trace_cdns3_prepare_aligned_request(priv_req);
> >
> > @@ -3088,9 +3105,10 @@ static void cdns3_gadget_exit(struct cdns *cdns)
> > struct cdns3_aligned_buf *buf;
> >
> > buf = cdns3_next_align_buf(&priv_dev->aligned_buf_li

[PATCH v3] usb: cdns3: Optimize DMA request buffer allocation

2021-03-22 Thread Sanket Parmar

dma_alloc_coherent() might fail on platforms with a small
DMA region.

To avoid such a failure in cdns3_prepare_aligned_request_buf(),
replace dma_alloc_coherent() with dma_alloc_noncoherent() when
allocating the aligned request buffer, whose length is dynamic.

Reported-by: Aswath Govindraju 
Signed-off-by: Sanket Parmar 
---

Changelog:
v3:
- removed extra blank line

v2:
- used dma_*_noncoherent() APIs
- changed the commit log

 drivers/usb/cdns3/cdns3-gadget.c | 29 +++--
 drivers/usb/cdns3/cdns3-gadget.h |  2 ++
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
index 0b892a2..9b1bd41 100644
--- a/drivers/usb/cdns3/cdns3-gadget.c
+++ b/drivers/usb/cdns3/cdns3-gadget.c
@@ -819,9 +819,15 @@ void cdns3_gadget_giveback(struct cdns3_endpoint *priv_ep,
priv_ep->dir);
 
if ((priv_req->flags & REQUEST_UNALIGNED) &&
-   priv_ep->dir == USB_DIR_OUT && !request->status)
+   priv_ep->dir == USB_DIR_OUT && !request->status) {
+   /* Make DMA buffer CPU accessible */
+   dma_sync_single_for_cpu(priv_dev->sysdev,
+   priv_req->aligned_buf->dma,
+   priv_req->aligned_buf->size,
+       priv_req->aligned_buf->dir);
memcpy(request->buf, priv_req->aligned_buf->buf,
   request->length);
+   }
 
priv_req->flags &= ~(REQUEST_PENDING | REQUEST_UNALIGNED);
/* All TRBs have finished, clear the counter */
@@ -883,8 +889,8 @@ static void cdns3_free_aligned_request_buf(struct 
work_struct *work)
 * interrupts.
 */
spin_unlock_irqrestore(&priv_dev->lock, flags);
-   dma_free_coherent(priv_dev->sysdev, buf->size,
- buf->buf, buf->dma);
+   dma_free_noncoherent(priv_dev->sysdev, buf->size,
+ buf->buf, buf->dma, buf->dir);
kfree(buf);
spin_lock_irqsave(&priv_dev->lock, flags);
}
@@ -911,10 +917,13 @@ static int cdns3_prepare_aligned_request_buf(struct 
cdns3_request *priv_req)
return -ENOMEM;
 
buf->size = priv_req->request.length;
+   buf->dir = usb_endpoint_dir_in(priv_ep->endpoint.desc) ?
+   DMA_TO_DEVICE : DMA_FROM_DEVICE;
 
-   buf->buf = dma_alloc_coherent(priv_dev->sysdev,
+   buf->buf = dma_alloc_noncoherent(priv_dev->sysdev,
  buf->size,
  &buf->dma,
+ buf->dir,
  GFP_ATOMIC);
if (!buf->buf) {
kfree(buf);
@@ -936,10 +945,17 @@ static int cdns3_prepare_aligned_request_buf(struct cdns3_request *priv_req)
}
 
if (priv_ep->dir == USB_DIR_IN) {
+   /* Make DMA buffer CPU accessible */
+   dma_sync_single_for_cpu(priv_dev->sysdev,
+   buf->dma, buf->size, buf->dir);
memcpy(buf->buf, priv_req->request.buf,
   priv_req->request.length);
}
 
+   /* Transfer DMA buffer ownership back to device */
+   dma_sync_single_for_device(priv_dev->sysdev,
+   buf->dma, buf->size, buf->dir);
+
priv_req->flags |= REQUEST_UNALIGNED;
trace_cdns3_prepare_aligned_request(priv_req);
 
@@ -3088,9 +3104,10 @@ static void cdns3_gadget_exit(struct cdns *cdns)
struct cdns3_aligned_buf *buf;
 
buf = cdns3_next_align_buf(&priv_dev->aligned_buf_list);
-   dma_free_coherent(priv_dev->sysdev, buf->size,
+   dma_free_noncoherent(priv_dev->sysdev, buf->size,
  buf->buf,
- buf->dma);
+ buf->dma,
+ buf->dir);
 
list_del(&buf->list);
kfree(buf);
diff --git a/drivers/usb/cdns3/cdns3-gadget.h b/drivers/usb/cdns3/cdns3-gadget.h
index ecf9b91..c5660f2 100644
--- a/drivers/usb/cdns3/cdns3-gadget.h
+++ b/drivers/usb/cdns3/cdns3-gadget.h
@@ -12,6 +12,7 @@
 #ifndef __LINUX_CDNS3_GADGET
 #define __LINUX_CDNS3_GADGET
 #include 
+#include 
 
 /*
  * USBSS-DEV register interface.
@@ -1205,6 +1206,7 @@ struct cdns3_aligned_buf {
void*buf;
dma_addr_t  dma;
u32 size;
+   enum dma_data_direction dir;
unsignedin_use:1;
struct list_headlist;
 };
-- 
2.4.5
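
The buffer hand-off that this patch adds can be modelled in userspace. The sketch below is not driver code: every `model_*` name, the `MODEL_*` constants and the ownership enum are made up for illustration, and the assumption that a freshly allocated buffer starts out owned by the device is mine, not something stated in the patch. It only shows the rule the patch follows: pick the DMA direction from the endpoint direction, sync for the CPU before memcpy(), and sync for the device before the transfer is queued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins for illustration; not the kernel's definitions. */
enum model_dma_dir { MODEL_DMA_TO_DEVICE, MODEL_DMA_FROM_DEVICE };
enum model_owner { OWNER_DEVICE, OWNER_CPU };

#define MODEL_USB_DIR_IN 0x80	/* bit 7 of bEndpointAddress marks an IN endpoint */

struct model_aligned_buf {
	unsigned char data[64];
	enum model_dma_dir dir;
	enum model_owner owner;
};

/* IN endpoint: the controller reads the buffer, hence "to device". */
static enum model_dma_dir model_pick_dir(uint8_t ep_addr)
{
	return (ep_addr & MODEL_USB_DIR_IN) ? MODEL_DMA_TO_DEVICE
					    : MODEL_DMA_FROM_DEVICE;
}

/* Stand-ins for dma_sync_single_for_cpu() / dma_sync_single_for_device(). */
static void model_sync_for_cpu(struct model_aligned_buf *b)
{
	b->owner = OWNER_CPU;
}

static void model_sync_for_device(struct model_aligned_buf *b)
{
	b->owner = OWNER_DEVICE;
}

/* The CPU may only touch the buffer while it owns it. */
static void model_cpu_write(struct model_aligned_buf *b, const void *src,
			    size_t len)
{
	assert(b->owner == OWNER_CPU);
	assert(len <= sizeof(b->data));
	memcpy(b->data, src, len);
}

/* Mirrors the order of calls in cdns3_prepare_aligned_request_buf(). */
static void model_prepare(struct model_aligned_buf *b, uint8_t ep_addr,
			  const void *src, size_t len)
{
	b->dir = model_pick_dir(ep_addr);
	b->owner = OWNER_DEVICE;	/* assumed initial state of the model */

	if (ep_addr & MODEL_USB_DIR_IN) {
		model_sync_for_cpu(b);
		model_cpu_write(b, src, len);
	}

	/* Always hand the buffer back before the transfer is queued. */
	model_sync_for_device(b);
}
```

Dropping the model_sync_for_cpu() call in model_prepare() trips the assertion in model_cpu_write(), which is the class of mistake the added dma_sync_single_for_cpu() calls are there to prevent.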

Re: [PATCH v2] usb: cdns3: Optimize DMA request buffer allocation

2021-03-20 Thread Peter Chen

On 21-03-17 20:13:59, Sanket Parmar wrote:
> dma_alloc_coherent() might fail on the platform with a small
> DMA region.
> 
> To avoid such failure in cdns3_prepare_aligned_request_buf(),
> dma_alloc_coherent() is replaced with dma_alloc_noncoherent()
> to allocate aligned request buffer of dynamic length.
> 
> Reported-by: Aswath Govindraju 
> Signed-off-by: Sanket Parmar 
> ---
> 
> Changelog:
> v2:
> - used dma_*_noncoherent() APIs
> - changed the commit log
> 
>  drivers/usb/cdns3/cdns3-gadget.c | 30 --
>  drivers/usb/cdns3/cdns3-gadget.h |  2 ++
>  2 files changed, 26 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/usb/cdns3/cdns3-gadget.c 
> b/drivers/usb/cdns3/cdns3-gadget.c
> index 0b892a2..126087b 100644
> --- a/drivers/usb/cdns3/cdns3-gadget.c
> +++ b/drivers/usb/cdns3/cdns3-gadget.c
> @@ -819,9 +819,15 @@ void cdns3_gadget_giveback(struct cdns3_endpoint 
> *priv_ep,
>   priv_ep->dir);
>  
>   if ((priv_req->flags & REQUEST_UNALIGNED) &&
> - priv_ep->dir == USB_DIR_OUT && !request->status)
> + priv_ep->dir == USB_DIR_OUT && !request->status) {
> + /* Make DMA buffer CPU accessible */
> + dma_sync_single_for_cpu(priv_dev->sysdev,
> + priv_req->aligned_buf->dma,
> + priv_req->aligned_buf->size,
> + priv_req->aligned_buf->dir);
>   memcpy(request->buf, priv_req->aligned_buf->buf,
>  request->length);
> + }
>  
>   priv_req->flags &= ~(REQUEST_PENDING | REQUEST_UNALIGNED);
>   /* All TRBs have finished, clear the counter */
> @@ -883,8 +889,8 @@ static void cdns3_free_aligned_request_buf(struct 
> work_struct *work)
>* interrupts.
>*/
>   spin_unlock_irqrestore(&priv_dev->lock, flags);
> - dma_free_coherent(priv_dev->sysdev, buf->size,
> -   buf->buf, buf->dma);
> + dma_free_noncoherent(priv_dev->sysdev, buf->size,
> +   buf->buf, buf->dma, buf->dir);
>   kfree(buf);
>   spin_lock_irqsave(&priv_dev->lock, flags);
>   }
> @@ -911,10 +917,13 @@ static int cdns3_prepare_aligned_request_buf(struct 
> cdns3_request *priv_req)
>   return -ENOMEM;
>  
>   buf->size = priv_req->request.length;
> + buf->dir = usb_endpoint_dir_in(priv_ep->endpoint.desc) ?
> + DMA_TO_DEVICE : DMA_FROM_DEVICE;
>  
> - buf->buf = dma_alloc_coherent(priv_dev->sysdev,
> + buf->buf = dma_alloc_noncoherent(priv_dev->sysdev,
> buf->size,
> &buf->dma,
> +   buf->dir,
> GFP_ATOMIC);
>   if (!buf->buf) {
>   kfree(buf);
> @@ -936,10 +945,18 @@ static int cdns3_prepare_aligned_request_buf(struct 
> cdns3_request *priv_req)
>   }
>  
>   if (priv_ep->dir == USB_DIR_IN) {
> + /* Make DMA buffer CPU accessible */
> + dma_sync_single_for_cpu(priv_dev->sysdev,
> + buf->dma, buf->size, buf->dir);
>   memcpy(buf->buf, priv_req->request.buf,
>  priv_req->request.length);
>   }
>  
> + /* Transfer DMA buffer ownership back to device */
> + dma_sync_single_for_device(priv_dev->sysdev,
> + buf->dma, buf->size, buf->dir);
> +
> +

There is one extra blank line here.

Otherwise, it looks OK to me.

>   priv_req->flags |= REQUEST_UNALIGNED;
>   trace_cdns3_prepare_aligned_request(priv_req);
>  
> @@ -3088,9 +3105,10 @@ static void cdns3_gadget_exit(struct cdns *cdns)
>   struct cdns3_aligned_buf *buf;
>  
>   buf = cdns3_next_align_buf(&priv_dev->aligned_buf_list);
> - dma_free_coherent(priv_dev->sysdev, buf->size,
> + dma_free_noncoherent(priv_dev->sysdev, buf->size,
> buf->buf,
> -   buf->dma);
> +   buf->dma,
> +   buf->dir);
>  
>   list_del(&buf->list);
>   kfree(buf);

Re: [PATCH v2] usb: cdns3: Optimize DMA request buffer allocation

2021-03-20 Thread Peter Chen

On 21-03-18 07:32:45, Christoph Hellwig wrote:
> On Wed, Mar 17, 2021 at 08:13:59PM +0100, Sanket Parmar wrote:
> > dma_alloc_coherent() might fail on the platform with a small
> > DMA region.
> > 
> > To avoid such failure in cdns3_prepare_aligned_request_buf(),
> > dma_alloc_coherent() is replaced with dma_alloc_noncoherent()
> > to allocate aligned request buffer of dynamic length.
> > 
> > Reported-by: Aswath Govindraju 
> > Signed-off-by: Sanket Parmar 
> 
> Looks good to me:
> 
> Reviewed-by: Christoph Hellwig 

Hi Christoph,

I would like to confirm: does dma_alloc_noncoherent() allocate less
than PAGE_SIZE of memory when the requested buffer size is small
(e.g., 64 bytes)?

-- 

Thanks,
Peter Chen
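
To make the concern behind this question concrete, here is a purely arithmetic sketch. It assumes a 4 KiB page and assumes that the allocator rounds every request up to whole pages; neither assumption is confirmed anywhere in this thread, and `MODEL_PAGE_SIZE` and the `model_*` helpers are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for the illustration; real systems may differ. */
#define MODEL_PAGE_SIZE 4096UL

static_assert((MODEL_PAGE_SIZE & (MODEL_PAGE_SIZE - 1)) == 0,
	      "page size must be a power of two for the mask trick below");

/* Round a request up to the next multiple of the page size. */
static size_t model_page_align(size_t size)
{
	return (size + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
}

/* Bytes left unused if the allocation is page-granular. */
static size_t model_wasted_bytes(size_t size)
{
	return model_page_align(size) - size;
}
```

Under those assumptions a 64-byte request would occupy one full page and leave 4032 bytes unused, which is why the allocation granularity matters for small aligned buffers.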

[PATCH v7 07/13] phy: cadence: Sierra: Explicitly request exclusive reset control

2021-03-19 Thread Kishon Vijay Abraham I

No functional change. Since the reset controls obtained in
Sierra are used exclusively by the Sierra device, use the
exclusive reset control request API calls.

Signed-off-by: Kishon Vijay Abraham I 
Reviewed-by: Philipp Zabel 
---
 drivers/phy/cadence/phy-cadence-sierra.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/phy/cadence/phy-cadence-sierra.c b/drivers/phy/cadence/phy-cadence-sierra.c
index 935f165404e4..44c52a0842dc 100644
--- a/drivers/phy/cadence/phy-cadence-sierra.c
+++ b/drivers/phy/cadence/phy-cadence-sierra.c
@@ -514,14 +514,14 @@ static int cdns_sierra_phy_get_resets(struct cdns_sierra_phy *sp,
 {
struct reset_control *rst;
 
-   rst = devm_reset_control_get(dev, "sierra_reset");
+   rst = devm_reset_control_get_exclusive(dev, "sierra_reset");
if (IS_ERR(rst)) {
dev_err(dev, "failed to get reset\n");
return PTR_ERR(rst);
}
sp->phy_rst = rst;
 
-   rst = devm_reset_control_get_optional(dev, "sierra_apb");
+   rst = devm_reset_control_get_optional_exclusive(dev, "sierra_apb");
if (IS_ERR(rst)) {
dev_err(dev, "failed to get apb reset\n");
return PTR_ERR(rst);
-- 
2.17.1

RE: [PATCH 04/11] i2c: imx-lpi2c: manage irq resource request/release in runtime pm

2021-03-19 Thread Aisheng Dong

> > > @@ -665,6 +659,14 @@ static int __maybe_unused
> > > lpi2c_runtime_resume(struct device *dev)
> > >   dev_err(dev, "can't enable I2C ipg clock, ret=%d\n", ret);
> > >   }
> > >
> > > + ret = devm_request_irq(dev, lpi2c_imx->irq, lpi2c_imx_isr,
> >
> > I guess unnecessary to use devm in rpm
> 
> devm_request_irq() will use device resource management.
> Other resource like clk and struct space are all managed by devres.
> Maybe we can still use devm_ to let devres manage irq here?
> 

devm_xxx is usually used to free resources automatically when probe
fails, when the driver is unbound, or when the device is unregistered.
It is not meant for runtime PM.
I would prefer using request_irq()/free_irq() directly in the runtime
PM callbacks.

BTW, the current lpi2c_imx_remove() does not seem to ensure that the
device is in the runtime-suspended state after removal.
If the framework can't guarantee that, the driver has to do it.
Anyway, that's a separate issue and needs a separate patch.

Regards
Aisheng

> Thanks.
> 
> Best Regards,
> Clark Wang
> 
> 
> >
> > > +IRQF_NO_SUSPEND,
> > > +dev_name(dev), lpi2c_imx);
> > > + if (ret) {
> > > + dev_err(dev, "can't claim irq %d\n", lpi2c_imx->irq);
> > > + return ret;
> > > + }
> > > +
> > >   return ret;
> > >  }
> > >
> > > --
> > > 2.25.1

RE: [PATCH 04/11] i2c: imx-lpi2c: manage irq resource request/release in runtime pm

2021-03-19 Thread Clark Wang


> -----Original Message-----
> From: Aisheng Dong 
> Sent: Friday, March 19, 2021 12:54
> To: Clark Wang ; shawn...@kernel.org;
> s.ha...@pengutronix.de
> Cc: ker...@pengutronix.de; feste...@gmail.com; dl-linux-imx  i...@nxp.com>; sumit.sem...@linaro.org; christian.koe...@amd.com;
> linux-...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; linux-
> ker...@vger.kernel.org
> Subject: RE: [PATCH 04/11] i2c: imx-lpi2c: manage irq resource
> request/release in runtime pm
> 
> > From: Clark Wang 
> > Sent: Wednesday, March 17, 2021 2:54 PM
> >
> > Manage irq resource request/release in runtime pm to save irq domain's
> > power.
> >
> > Signed-off-by: Frank Li 
> > Signed-off-by: Fugang Duan 
> > Reviewed-by: Frank Li 
> > ---
> >  drivers/i2c/busses/i2c-imx-lpi2c.c | 26 ++
> >  1 file changed, 14 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/i2c/busses/i2c-imx-lpi2c.c
> > b/drivers/i2c/busses/i2c-imx-lpi2c.c
> > index 664fcc0dba51..e718bb6b2387 100644
> > --- a/drivers/i2c/busses/i2c-imx-lpi2c.c
> > +++ b/drivers/i2c/busses/i2c-imx-lpi2c.c
> > @@ -94,6 +94,7 @@ enum lpi2c_imx_pincfg {
> >
> >  struct lpi2c_imx_struct {
> > struct i2c_adapter  adapter;
> > +   int irq;
> > struct clk  *clk_per;
> > struct clk  *clk_ipg;
> > void __iomem*base;
> > @@ -543,7 +544,7 @@ static int lpi2c_imx_probe(struct platform_device
> > *pdev)  {
> > struct lpi2c_imx_struct *lpi2c_imx;
> > unsigned int temp;
> > -   int irq, ret;
> > +   int ret;
> >
> > lpi2c_imx = devm_kzalloc(&pdev->dev, sizeof(*lpi2c_imx),
> GFP_KERNEL);
> > if (!lpi2c_imx)
> > @@ -553,9 +554,9 @@ static int lpi2c_imx_probe(struct platform_device
> > *pdev)
> > if (IS_ERR(lpi2c_imx->base))
> > return PTR_ERR(lpi2c_imx->base);
> >
> > -   irq = platform_get_irq(pdev, 0);
> > -   if (irq < 0)
> > -   return irq;
> > +   lpi2c_imx->irq = platform_get_irq(pdev, 0);
> > +   if (lpi2c_imx->irq < 0)
> > +   return lpi2c_imx->irq;
> >
> > lpi2c_imx->adapter.owner= THIS_MODULE;
> > lpi2c_imx->adapter.algo = &lpi2c_imx_algo;
> > @@ -581,14 +582,6 @@ static int lpi2c_imx_probe(struct platform_device
> > *pdev)
> > if (ret)
> > lpi2c_imx->bitrate = I2C_MAX_STANDARD_MODE_FREQ;
> >
> > -   ret = devm_request_irq(&pdev->dev, irq, lpi2c_imx_isr,
> > -  IRQF_NO_SUSPEND,
> > -  pdev->name, lpi2c_imx);
> > -   if (ret) {
> > -   dev_err(&pdev->dev, "can't claim irq %d\n", irq);
> > -   return ret;
> > -   }
> > -
> > i2c_set_adapdata(&lpi2c_imx->adapter, lpi2c_imx);
> > platform_set_drvdata(pdev, lpi2c_imx);
> >
> > @@ -640,6 +633,7 @@ static int __maybe_unused
> > lpi2c_runtime_suspend(struct device *dev)  {
> > struct lpi2c_imx_struct *lpi2c_imx = dev_get_drvdata(dev);
> >
> > +   devm_free_irq(dev, lpi2c_imx->irq, lpi2c_imx);
> > clk_disable_unprepare(lpi2c_imx->clk_ipg);
> > clk_disable_unprepare(lpi2c_imx->clk_per);
> > pinctrl_pm_select_idle_state(dev);
> > @@ -665,6 +659,14 @@ static int __maybe_unused
> > lpi2c_runtime_resume(struct device *dev)
> > dev_err(dev, "can't enable I2C ipg clock, ret=%d\n", ret);
> > }
> >
> > +   ret = devm_request_irq(dev, lpi2c_imx->irq, lpi2c_imx_isr,
> 
> I guess unnecessary to use devm in rpm

devm_request_irq() uses device resource management.
Other resources, such as the clocks and the driver struct memory, are
all managed by devres.
Maybe we can still use devm_ to let devres manage the irq here?

Thanks.

Best Regards,
Clark Wang


> 
> > +  IRQF_NO_SUSPEND,
> > +  dev_name(dev), lpi2c_imx);
> > +   if (ret) {
> > +   dev_err(dev, "can't claim irq %d\n", lpi2c_imx->irq);
> > +   return ret;
> > +   }
> > +
> > return ret;
> >  }
> >
> > --
> > 2.25.1




RE: [PATCH 04/11] i2c: imx-lpi2c: manage irq resource request/release in runtime pm

2021-03-18 Thread Aisheng Dong

> From: Clark Wang 
> Sent: Wednesday, March 17, 2021 2:54 PM
> 
> Manage irq resource request/release in runtime pm to save irq domain's
> power.
> 
> Signed-off-by: Frank Li 
> Signed-off-by: Fugang Duan 
> Reviewed-by: Frank Li 
> ---
>  drivers/i2c/busses/i2c-imx-lpi2c.c | 26 ++
>  1 file changed, 14 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-imx-lpi2c.c
> b/drivers/i2c/busses/i2c-imx-lpi2c.c
> index 664fcc0dba51..e718bb6b2387 100644
> --- a/drivers/i2c/busses/i2c-imx-lpi2c.c
> +++ b/drivers/i2c/busses/i2c-imx-lpi2c.c
> @@ -94,6 +94,7 @@ enum lpi2c_imx_pincfg {
> 
>  struct lpi2c_imx_struct {
>   struct i2c_adapter  adapter;
> + int irq;
>   struct clk  *clk_per;
>   struct clk  *clk_ipg;
>   void __iomem*base;
> @@ -543,7 +544,7 @@ static int lpi2c_imx_probe(struct platform_device
> *pdev)  {
>   struct lpi2c_imx_struct *lpi2c_imx;
>   unsigned int temp;
> - int irq, ret;
> + int ret;
> 
>   lpi2c_imx = devm_kzalloc(&pdev->dev, sizeof(*lpi2c_imx), GFP_KERNEL);
>   if (!lpi2c_imx)
> @@ -553,9 +554,9 @@ static int lpi2c_imx_probe(struct platform_device
> *pdev)
>   if (IS_ERR(lpi2c_imx->base))
>   return PTR_ERR(lpi2c_imx->base);
> 
> - irq = platform_get_irq(pdev, 0);
> - if (irq < 0)
> - return irq;
> + lpi2c_imx->irq = platform_get_irq(pdev, 0);
> + if (lpi2c_imx->irq < 0)
> + return lpi2c_imx->irq;
> 
>   lpi2c_imx->adapter.owner= THIS_MODULE;
>   lpi2c_imx->adapter.algo = &lpi2c_imx_algo;
> @@ -581,14 +582,6 @@ static int lpi2c_imx_probe(struct platform_device
> *pdev)
>   if (ret)
>   lpi2c_imx->bitrate = I2C_MAX_STANDARD_MODE_FREQ;
> 
> - ret = devm_request_irq(&pdev->dev, irq, lpi2c_imx_isr,
> -IRQF_NO_SUSPEND,
> -pdev->name, lpi2c_imx);
> - if (ret) {
> - dev_err(&pdev->dev, "can't claim irq %d\n", irq);
> - return ret;
> - }
> -
>   i2c_set_adapdata(&lpi2c_imx->adapter, lpi2c_imx);
>   platform_set_drvdata(pdev, lpi2c_imx);
> 
> @@ -640,6 +633,7 @@ static int __maybe_unused
> lpi2c_runtime_suspend(struct device *dev)  {
>   struct lpi2c_imx_struct *lpi2c_imx = dev_get_drvdata(dev);
> 
> + devm_free_irq(dev, lpi2c_imx->irq, lpi2c_imx);
>   clk_disable_unprepare(lpi2c_imx->clk_ipg);
>   clk_disable_unprepare(lpi2c_imx->clk_per);
>   pinctrl_pm_select_idle_state(dev);
> @@ -665,6 +659,14 @@ static int __maybe_unused
> lpi2c_runtime_resume(struct device *dev)
>   dev_err(dev, "can't enable I2C ipg clock, ret=%d\n", ret);
>   }
> 
> + ret = devm_request_irq(dev, lpi2c_imx->irq, lpi2c_imx_isr,

I guess it is unnecessary to use devm in runtime PM.

> +IRQF_NO_SUSPEND,
> +dev_name(dev), lpi2c_imx);
> + if (ret) {
> + dev_err(dev, "can't claim irq %d\n", lpi2c_imx->irq);
> + return ret;
> + }
> +
>   return ret;
>  }
> 
> --
> 2.25.1

Re: BUG: unable to handle kernel paging request in __kvm_mmu_prepare_zap_page

2021-03-18 Thread Dmitry Vyukov

> Sean Christopherson wrote:
> On Mon, Feb 25, 2019 at 06:50:05AM -0800, syzbot wrote:
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit: 94a47529a645 Add linux-next specific files for 20190222
> > git tree: linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=13c1c692c0
> > kernel config: https://syzkaller.appspot.com/x/.config?x=51cd1c8c39e8a207
> > dashboard link: https://syzkaller.appspot.com/bug?extid=222746e0104bbb617d51
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11fcba7cc0
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+222746...@syzkaller.appspotmail.com
> >
> > BUG: unable to handle kernel paging request at 88809c0e1000
> > #PF error: [PROT] [WRITE] [RSVD]
> > PGD b201067 P4D b201067 PUD 21067 PMD 80009c0001e3
> > Oops: 000b [#1] PREEMPT SMP KASAN
> > CPU: 1 PID: 7863 Comm: syz-executor.2 Not tainted 5.0.0-rc7-next-20190222
> > #41
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > RIP: 0010:__write_once_size include/linux/compiler.h:224 [inline]
> > RIP: 0010:__update_clear_spte_fast arch/x86/kvm/mmu.c:558 [inline]
> > RIP: 0010:mmu_spte_clear_no_track arch/x86/kvm/mmu.c:859 [inline]
> > RIP: 0010:drop_parent_pte arch/x86/kvm/mmu.c:2051 [inline]
> > RIP: 0010:kvm_mmu_unlink_parents arch/x86/kvm/mmu.c:2645 [inline]
> > RIP: 0010:__kvm_mmu_prepare_zap_page+0x1ee/0x11a0 arch/x86/kvm/mmu.c:2683
> > Code: f8 30 60 00 48 89 de 4c 89 e7 e8 bd ad fe ff 4c 89 e0 48 b9 00 00 00
> > 00 00 fc ff df 48 c1 e8 03 80 3c 08 00 0f 85 a5 0d 00 00 <49> c7 04 24 00 00
> > 00 00 e8 c5 30 60 00 4c 89 ea 4c 89 fe 48 89 df
> > RSP: 0018:88809e96faf0 EFLAGS: 00010246
> > RAX: 11101381c200 RBX: 888098149820 RCX: dc00
> > RDX:  RSI: 810ed2f4 RDI: 0007
> > RBP: 88809e96fbd0 R08: 888094052700 R09: ed100e339d92
> > R10: ed100e339d91 R11: 8880719cec8b R12: 88809c0e1000
> > R13: 88809e96fb70 R14: c90006a39000 R15: 88809e96fb68
> > FS: 0239b940() GS:8880ae90() knlGS:
> > CS: 0010 DS:  ES:  CR0: 80050033
> > CR2: 88809c0e1000 CR3: a143a000 CR4: 001426e0
> > DR0:  DR1:  DR2: 
> > DR3:  DR6: fffe0ff0 DR7: 0400
> > Call Trace:
> > __kvm_mmu_zap_all+0x1f6/0x350 arch/x86/kvm/mmu.c:5856
>
> This strongly suggests the recent MMU zapping changes[1] are to blame,
> but I haven't had any luck reproducing the bug.
>
> Paolo, were you able to make any headway on the kvm-unit-test issues
> that are potentially due to the MMU changes?
>
> [1] https://patchwork.kernel.org/cover/10798425/

Something has fixed this; let's assume it was the following commit
(I did not find a better candidate):

#syz fix:
KVM: x86: fix handling of role.cr4_pae and rename it to 'gpte_size'


> > kvm_mmu_zap_all+0x18/0x20 arch/x86/kvm/mmu.c:5870
> > kvm_arch_flush_shadow_all+0x16/0x20 arch/x86/kvm/x86.c:9473
> > kvm_mmu_notifier_release+0x63/0xb0
> > arch/x86/kvm/../../../virt/kvm/kvm_main.c:499
> > mmu_notifier_unregister+0x137/0x440 mm/mmu_notifier.c:356
> > kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:745 [inline]
> > kvm_put_kvm+0x553/0xc70 arch/x86/kvm/../../../virt/kvm/kvm_main.c:770
> > kvm_vcpu_release+0x7b/0xa0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2500
> > __fput+0x2e5/0x8d0 fs/file_table.c:278
> > fput+0x16/0x20 fs/file_table.c:309
> > task_work_run+0x14a/0x1c0 kernel/task_work.c:113
> > tracehook_notify_resume include/linux/tracehook.h:188 [inline]
> > exit_to_usermode_loop+0x273/0x2c0 arch/x86/entry/common.c:166
> > prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
> > syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
> > do_syscall_64+0x52d/0x610 arch/x86/entry/common.c:293
> > entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x411d31
> > Code: 75 14 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 94 19 00 00 c3 48
> > 83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48 89
> > c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
> > RSP: 002b:7ffe9da10de0 EFLAGS: 0293 ORIG_RAX: 0003
> > RAX:  RBX: 0006 RCX: 00411d31
> > RDX:  RSI: 00740528 RDI: 00

Re: BUG: unable to handle kernel paging request in workingset_age_nonresident

2021-03-18 Thread Dmitry Vyukov

On Tue, Dec 1, 2020 at 8:40 AM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:c6b11acc Add linux-next specific files for 20201130
> git tree:   linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=114b94e950
> kernel config:  https://syzkaller.appspot.com/x/.config?x=b5e03844e9b34d37
> dashboard link: https://syzkaller.appspot.com/bug?extid=a59e7ceb87a83c5233df
> compiler:   gcc (GCC) 10.1.0-syz 20200507
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=150fed8b50
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1726291d50
>
> The issue was bisected to:
>
> commit 76761ffa9ea1ddca78e817bf7eec5fcb0378a00c
> Author: Alex Shi 
> Date:   Sun Nov 29 23:58:06 2020 +
>
> mm/memcg: bail out early when !memcg in mem_cgroup_lruvec

This patch was removed from linux-next, so let's close the report:

#syz invalid


> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=122ff44550
> final oops: https://syzkaller.appspot.com/x/report.txt?x=112ff44550
> console output: https://syzkaller.appspot.com/x/log.txt?x=162ff44550
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+a59e7ceb87a83c523...@syzkaller.appspotmail.com
> Fixes: 76761ffa9ea1 ("mm/memcg: bail out early when !memcg in 
> mem_cgroup_lruvec")
>
> BUG: unable to handle page fault for address: 81417c79
> #PF: supervisor write access in kernel mode
> #PF: error_code(0x0003) - permissions violation
> PGD b08f067 P4D b08f067 PUD b090063 PMD 14001e1
> Oops: 0003 [#1] PREEMPT SMP KASAN
> CPU: 1 PID: 8503 Comm: syz-executor118 Not tainted 
> 5.10.0-rc5-next-20201130-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> RIP: 0010:mem_cgroup_lruvec include/linux/memcontrol.h:630 [inline]
> RIP: 0010:parent_lruvec include/linux/memcontrol.h:1560 [inline]
> RIP: 0010:workingset_age_nonresident+0x179/0x1c0 mm/workingset.c:242
> Code: 85 db 0f 85 c8 fe ff ff 5b 5d 41 5c 41 5d 41 5e 41 5f e9 6a 67 cf ff e8 
> 65 67 cf ff 49 8d 9d 18 4d 00 00 eb b3 e8 57 67 cf ff <4c> 89 ab c0 00 00 00 
> eb c7 e8 69 35 12 00 e9 d3 fe ff ff e8 5f 35
> RSP: 0018:c9000112f4c0 EFLAGS: 00010093
> RAX:  RBX: 81417bb9 RCX: 
> RDX: 88801eee5040 RSI: 81a159f9 RDI: 81417c79
> RBP: dc00 R08: 0001 R09: 88813dbf
> R10: ed1027b7 R11:  R12: 8e7911d0
> R13: 88813fffb000 R14: 0001 R15: 8e7910b0
> FS:  02581880() GS:8880b9f0() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 81417c79 CR3: 13a2f000 CR4: 001506e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  workingset_eviction+0x452/0x9b0 mm/workingset.c:266
>  __remove_mapping+0x867/0xd20 mm/vmscan.c:927
>  shrink_page_list+0x246a/0x5e80 mm/vmscan.c:1431
>  reclaim_pages+0x3e2/0xcd0 mm/vmscan.c:2148
>  madvise_cold_or_pageout_pte_range+0x1615/0x2880 mm/madvise.c:473
>  walk_pmd_range mm/pagewalk.c:89 [inline]
>  walk_pud_range mm/pagewalk.c:160 [inline]
>  walk_p4d_range mm/pagewalk.c:193 [inline]
>  walk_pgd_range mm/pagewalk.c:229 [inline]
>  __walk_page_range+0xda4/0x1e20 mm/pagewalk.c:331
>  walk_page_range+0x1be/0x450 mm/pagewalk.c:427
>  madvise_pageout_page_range mm/madvise.c:526 [inline]
>  madvise_pageout+0x21b/0x390 mm/madvise.c:562
>  madvise_vma mm/madvise.c:943 [inline]
>  do_madvise.part.0+0x9f2/0x1ed0 mm/madvise.c:1142
>  do_madvise mm/madvise.c:1168 [inline]
>  __do_sys_madvise mm/madvise.c:1168 [inline]
>  __se_sys_madvise mm/madvise.c:1166 [inline]
>  __x64_sys_madvise+0x113/0x150 mm/madvise.c:1166
>  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x440279
> Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 
> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 
> 83 7b 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7ffe7c1ab298 EFLAGS: 0246 ORIG_RAX: 001c
> RAX: ffda RBX: 004002c8 RCX: 00440279
> RDX: 0015 RSI: 0063 RDI: 2000
> RBP: 006ca018 R08:  R09: 
> R10: 0004 R11: 0246 R12: 00401a80
> R13: 00401b10 R14:  R15: 
> Modules linked in:
> CR2: 81417c79
> ---[ end trace 89bcebda47215cf6 ]---
> RIP: 0010:mem_cgroup_lruvec include/linux/memcontrol.h:630 [inline]
> RIP: 0010:parent_lruvec include/linux/memcontrol.h:1560 [inline]
> RIP: 0010:workingset_age_nonresident+0x179/0x1c0 mm/workingset.c:242
> Code: 85 db 0f 85 c8 fe ff ff 5b 5d 41 5c 41 5d 41 5e

RE: [PATCH 4/4] phy: cadence-torrent: Explicitly request exclusive reset control

2021-03-18 Thread Swapnil Kashinath Jakhade




> -----Original Message-----
> From: Kishon Vijay Abraham I 
> Sent: Wednesday, March 10, 2021 9:25 PM
> To: Kishon Vijay Abraham I ; Vinod Koul
> ; Rob Herring ; Philipp Zabel
> ; Swapnil Kashinath Jakhade
> 
> Cc: linux-kernel@vger.kernel.org; devicet...@vger.kernel.org; Lokesh Vutla
> ; linux-...@lists.infradead.org
> Subject: [PATCH 4/4] phy: cadence-torrent: Explicitly request exclusive reset
> control
> 
> EXTERNAL MAIL
> 
> 
> No functional change. Since the reset controls obtained in
> Torrent is exclusively used by the Torrent device, use
> exclusive reset control request API calls.
> 
> Signed-off-by: Kishon Vijay Abraham I 
> ---
>  drivers/phy/cadence/phy-cadence-torrent.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Reviewed-by: Swapnil Jakhade 

Thanks & regards,
Swapnil

> diff --git a/drivers/phy/cadence/phy-cadence-torrent.c
> b/drivers/phy/cadence/phy-cadence-torrent.c
> index 5ee1657f5a1c..ff8bb4b724c0 100644
> --- a/drivers/phy/cadence/phy-cadence-torrent.c
> +++ b/drivers/phy/cadence/phy-cadence-torrent.c
> @@ -2264,7 +2264,7 @@ static int cdns_torrent_reset(struct
> cdns_torrent_phy *cdns_phy)
>   return PTR_ERR(cdns_phy->phy_rst);
>   }
> 
> - cdns_phy->apb_rst = devm_reset_control_get_optional(dev,
> "torrent_apb");
> + cdns_phy->apb_rst =
> devm_reset_control_get_optional_exclusive(dev, "torrent_apb");
>   if (IS_ERR(cdns_phy->apb_rst)) {
>   dev_err(dev, "%s: failed to get apb reset\n",
>   dev->of_node->full_name);
> --
> 2.17.1

Re: [PATCH v2] usb: cdns3: Optimize DMA request buffer allocation

2021-03-18 Thread Christoph Hellwig

On Wed, Mar 17, 2021 at 08:13:59PM +0100, Sanket Parmar wrote:
> dma_alloc_coherent() might fail on the platform with a small
> DMA region.
> 
> To avoid such failure in cdns3_prepare_aligned_request_buf(),
> dma_alloc_coherent() is replaced with dma_alloc_noncoherent()
> to allocate aligned request buffer of dynamic length.
> 
> Reported-by: Aswath Govindraju 
> Signed-off-by: Sanket Parmar 

Looks good to me:

Reviewed-by: Christoph Hellwig

[PATCH v2] usb: cdns3: Optimize DMA request buffer allocation

2021-03-17 Thread Sanket Parmar

dma_alloc_coherent() might fail on platforms with a small
DMA region.

To avoid such failure in cdns3_prepare_aligned_request_buf(),
dma_alloc_coherent() is replaced with dma_alloc_noncoherent()
to allocate aligned request buffer of dynamic length.

Reported-by: Aswath Govindraju 
Signed-off-by: Sanket Parmar 
---

Changelog:
v2:
- used dma_*_noncoherent() APIs
- changed the commit log

 drivers/usb/cdns3/cdns3-gadget.c | 30 --
 drivers/usb/cdns3/cdns3-gadget.h |  2 ++
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
index 0b892a2..126087b 100644
--- a/drivers/usb/cdns3/cdns3-gadget.c
+++ b/drivers/usb/cdns3/cdns3-gadget.c
@@ -819,9 +819,15 @@ void cdns3_gadget_giveback(struct cdns3_endpoint *priv_ep,
priv_ep->dir);
 
if ((priv_req->flags & REQUEST_UNALIGNED) &&
-   priv_ep->dir == USB_DIR_OUT && !request->status)
+   priv_ep->dir == USB_DIR_OUT && !request->status) {
+   /* Make DMA buffer CPU accessible */
+   dma_sync_single_for_cpu(priv_dev->sysdev,
+   priv_req->aligned_buf->dma,
+   priv_req->aligned_buf->size,
+       priv_req->aligned_buf->dir);
memcpy(request->buf, priv_req->aligned_buf->buf,
   request->length);
+   }
 
priv_req->flags &= ~(REQUEST_PENDING | REQUEST_UNALIGNED);
/* All TRBs have finished, clear the counter */
@@ -883,8 +889,8 @@ static void cdns3_free_aligned_request_buf(struct work_struct *work)
 * interrupts.
 */
spin_unlock_irqrestore(&priv_dev->lock, flags);
-   dma_free_coherent(priv_dev->sysdev, buf->size,
- buf->buf, buf->dma);
+   dma_free_noncoherent(priv_dev->sysdev, buf->size,
+ buf->buf, buf->dma, buf->dir);
kfree(buf);
spin_lock_irqsave(&priv_dev->lock, flags);
}
@@ -911,10 +917,13 @@ static int cdns3_prepare_aligned_request_buf(struct cdns3_request *priv_req)
return -ENOMEM;
 
buf->size = priv_req->request.length;
+   buf->dir = usb_endpoint_dir_in(priv_ep->endpoint.desc) ?
+   DMA_TO_DEVICE : DMA_FROM_DEVICE;
 
-   buf->buf = dma_alloc_coherent(priv_dev->sysdev,
+   buf->buf = dma_alloc_noncoherent(priv_dev->sysdev,
  buf->size,
  &buf->dma,
+ buf->dir,
  GFP_ATOMIC);
if (!buf->buf) {
kfree(buf);
@@ -936,10 +945,18 @@ static int cdns3_prepare_aligned_request_buf(struct cdns3_request *priv_req)
}
 
if (priv_ep->dir == USB_DIR_IN) {
+   /* Make DMA buffer CPU accessible */
+   dma_sync_single_for_cpu(priv_dev->sysdev,
+   buf->dma, buf->size, buf->dir);
memcpy(buf->buf, priv_req->request.buf,
   priv_req->request.length);
}
 
+   /* Transfer DMA buffer ownership back to device */
+   dma_sync_single_for_device(priv_dev->sysdev,
+   buf->dma, buf->size, buf->dir);
+
+
priv_req->flags |= REQUEST_UNALIGNED;
trace_cdns3_prepare_aligned_request(priv_req);
 
@@ -3088,9 +3105,10 @@ static void cdns3_gadget_exit(struct cdns *cdns)
struct cdns3_aligned_buf *buf;
 
buf = cdns3_next_align_buf(&priv_dev->aligned_buf_list);
-   dma_free_coherent(priv_dev->sysdev, buf->size,
+   dma_free_noncoherent(priv_dev->sysdev, buf->size,
  buf->buf,
- buf->dma);
+ buf->dma,
+ buf->dir);
 
list_del(&buf->list);
kfree(buf);
diff --git a/drivers/usb/cdns3/cdns3-gadget.h b/drivers/usb/cdns3/cdns3-gadget.h
index ecf9b91..c5660f2 100644
--- a/drivers/usb/cdns3/cdns3-gadget.h
+++ b/drivers/usb/cdns3/cdns3-gadget.h
@@ -12,6 +12,7 @@
 #ifndef __LINUX_CDNS3_GADGET
 #define __LINUX_CDNS3_GADGET
 #include 
+#include 
 
 /*
  * USBSS-DEV register interface.
@@ -1205,6 +1206,7 @@ struct cdns3_aligned_buf {
void*buf;
dma_addr_t  dma;
u32 size;
+   enum dma_data_direction dir;
unsignedin_use:1;
struct list_headlist;
 };
-- 
2.4.5

[tip: sched/core] rseq, ptrace: Add PTRACE_GET_RSEQ_CONFIGURATION request

2021-03-17 Thread tip-bot2 for Piotr Figiel

The following commit has been merged into the sched/core branch of tip:

Commit-ID: 90f093fa8ea48e5d991332cee160b761423d55c1
Gitweb:        https://git.kernel.org/tip/90f093fa8ea48e5d991332cee160b761423d55c1
Author:Piotr Figiel 
AuthorDate:Fri, 26 Feb 2021 14:51:56 +01:00
Committer: Thomas Gleixner 
CommitterDate: Wed, 17 Mar 2021 16:15:39 +01:00

rseq, ptrace: Add PTRACE_GET_RSEQ_CONFIGURATION request

For userspace checkpoint and restore (C/R) a way of getting process state
containing RSEQ configuration is needed.

There are two ways this information is going to be used:
 - to re-enable RSEQ for threads which had it enabled before C/R
 - to detect if a thread was in a critical section during C/R

Since C/R preserves TLS memory and addresses, the RSEQ ABI will be
restored using the address registered before C/R.

Detection whether the thread is in a critical section during C/R is needed
to enforce behavior of RSEQ abort during C/R. Attaching with ptrace()
before registers are dumped itself doesn't cause RSEQ abort.
Restoring the instruction pointer within the critical section is
problematic because rseq_cs may get cleared before the control is passed
to the migrated application code leading to RSEQ invariants not being
preserved. C/R code will use RSEQ ABI address to find the abort handler
to which the instruction pointer needs to be set.

To achieve the above goals, expose the RSEQ ABI address and the signature
value with the new ptrace request PTRACE_GET_RSEQ_CONFIGURATION.

This new ptrace request can also be used by debuggers so they are aware
of stops within restartable sequences in progress.

Signed-off-by: Piotr Figiel 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Thomas Gleixner 
Reviewed-by: Michal Miroslaw 
Reviewed-by: Mathieu Desnoyers 
Acked-by: Oleg Nesterov 
Link: https://lkml.kernel.org/r/20210226135156.1081606-1-fig...@google.com
---
 include/uapi/linux/ptrace.h | 10 ++
 kernel/ptrace.c | 25 +
 2 files changed, 35 insertions(+)

diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index 83ee45f..3747bf8 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -102,6 +102,16 @@ struct ptrace_syscall_info {
};
 };
 
+#define PTRACE_GET_RSEQ_CONFIGURATION  0x420f
+
+struct ptrace_rseq_configuration {
+   __u64 rseq_abi_pointer;
+   __u32 rseq_abi_size;
+   __u32 signature;
+   __u32 flags;
+   __u32 pad;
+};
+
 /*
  * These values are stored in task->ptrace_message
  * by tracehook_report_syscall_* to describe the current syscall-stop.
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 821cf17..c71270a 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include/* for syscall_get_* */
 
@@ -779,6 +780,24 @@ static int ptrace_peek_siginfo(struct task_struct *child,
return ret;
 }
 
+#ifdef CONFIG_RSEQ
+static long ptrace_get_rseq_configuration(struct task_struct *task,
+ unsigned long size, void __user *data)
+{
+   struct ptrace_rseq_configuration conf = {
+   .rseq_abi_pointer = (u64)(uintptr_t)task->rseq,
+   .rseq_abi_size = sizeof(*task->rseq),
+   .signature = task->rseq_sig,
+   .flags = 0,
+   };
+
+   size = min_t(unsigned long, size, sizeof(conf));
+   if (copy_to_user(data, &conf, size))
+   return -EFAULT;
+   return sizeof(conf);
+}
+#endif
+
 #ifdef PTRACE_SINGLESTEP
 #define is_singlestep(request) ((request) == PTRACE_SINGLESTEP)
 #else
@@ -1222,6 +1241,12 @@ int ptrace_request(struct task_struct *child, long request,
ret = seccomp_get_metadata(child, addr, datavp);
break;
 
+#ifdef CONFIG_RSEQ
+   case PTRACE_GET_RSEQ_CONFIGURATION:
+   ret = ptrace_get_rseq_configuration(child, addr, datavp);
+   break;
+#endif
+
default:
break;
}
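
The size handling in ptrace_get_rseq_configuration() above can be modeled in userspace. This is only a sketch, not the kernel code: struct rseq_conf mirrors the field layout from the patch, while get_conf() and the use of memcpy() in place of copy_to_user() are assumptions made for illustration. It shows that the helper copies at most the caller's buffer size but always returns the full structure size, so a caller can tell whether it received a truncated copy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as struct ptrace_rseq_configuration in the patch. */
struct rseq_conf {
	uint64_t rseq_abi_pointer;
	uint32_t rseq_abi_size;
	uint32_t signature;
	uint32_t flags;
	uint32_t pad;
};

/*
 * Hypothetical userspace stand-in for ptrace_get_rseq_configuration():
 * copy min(size, sizeof(*conf)) bytes into the caller's buffer and
 * return sizeof(*conf) regardless of how much was copied.
 */
static long get_conf(const struct rseq_conf *conf, void *data, size_t size)
{
	assert(conf != NULL && data != NULL);

	if (size > sizeof(*conf))
		size = sizeof(*conf);
	memcpy(data, conf, size);
	return (long)sizeof(*conf);
}
```

A caller that passes a smaller buffer gets the leading fields only; comparing the return value with its own buffer size tells it how much of the structure it is missing.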

[tip: sched/core] rseq, ptrace: Add PTRACE_GET_RSEQ_CONFIGURATION request

2021-03-17 Thread tip-bot2 for Piotr Figiel

The following commit has been merged into the sched/core branch of tip:

Commit-ID: 2c406d3f436db1deea55ec44cc4c3c0861c3c185
Gitweb:
https://git.kernel.org/tip/2c406d3f436db1deea55ec44cc4c3c0861c3c185
Author:Piotr Figiel 
AuthorDate:Fri, 26 Feb 2021 14:51:56 +01:00
Committer: Peter Zijlstra 
CommitterDate: Wed, 17 Mar 2021 14:05:40 +01:00

rseq, ptrace: Add PTRACE_GET_RSEQ_CONFIGURATION request

For userspace checkpoint and restore (C/R) a way of getting process state
containing RSEQ configuration is needed.

There are two ways this information is going to be used:
 - to re-enable RSEQ for threads which had it enabled before C/R
 - to detect if a thread was in a critical section during C/R

Since C/R preserves TLS memory and addresses RSEQ ABI will be restored
using the address registered before C/R.

Detection whether the thread is in a critical section during C/R is needed
to enforce behavior of RSEQ abort during C/R. Attaching with ptrace()
before registers are dumped itself doesn't cause RSEQ abort.
Restoring the instruction pointer within the critical section is
problematic because rseq_cs may get cleared before the control is passed
to the migrated application code leading to RSEQ invariants not being
preserved. C/R code will use RSEQ ABI address to find the abort handler
to which the instruction pointer needs to be set.

To achieve above goals expose the RSEQ ABI address and the signature value
with the new ptrace request PTRACE_GET_RSEQ_CONFIGURATION.

This new ptrace request can also be used by debuggers so they are aware
of stops within restartable sequences in progress.

Signed-off-by: Piotr Figiel 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Michal Miroslaw 
Reviewed-by: Mathieu Desnoyers 
Acked-by: Oleg Nesterov 
Link: https://lkml.kernel.org/r/20210226135156.1081606-1-fig...@google.com
---
 include/uapi/linux/ptrace.h | 10 ++
 kernel/ptrace.c | 25 +
 2 files changed, 35 insertions(+)

diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index 83ee45f..3747bf8 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -102,6 +102,16 @@ struct ptrace_syscall_info {
};
 };
 
+#define PTRACE_GET_RSEQ_CONFIGURATION  0x420f
+
+struct ptrace_rseq_configuration {
+   __u64 rseq_abi_pointer;
+   __u32 rseq_abi_size;
+   __u32 signature;
+   __u32 flags;
+   __u32 pad;
+};
+
 /*
  * These values are stored in task->ptrace_message
  * by tracehook_report_syscall_* to describe the current syscall-stop.
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 821cf17..c71270a 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include/* for syscall_get_* */
 
@@ -779,6 +780,24 @@ static int ptrace_peek_siginfo(struct task_struct *child,
return ret;
 }
 
+#ifdef CONFIG_RSEQ
+static long ptrace_get_rseq_configuration(struct task_struct *task,
+ unsigned long size, void __user *data)
+{
+   struct ptrace_rseq_configuration conf = {
+   .rseq_abi_pointer = (u64)(uintptr_t)task->rseq,
+   .rseq_abi_size = sizeof(*task->rseq),
+   .signature = task->rseq_sig,
+   .flags = 0,
+   };
+
+   size = min_t(unsigned long, size, sizeof(conf));
+   if (copy_to_user(data, &conf, size))
+   return -EFAULT;
+   return sizeof(conf);
+}
+#endif
+
 #ifdef PTRACE_SINGLESTEP
 #define is_singlestep(request) ((request) == PTRACE_SINGLESTEP)
 #else
@@ -1222,6 +1241,12 @@ int ptrace_request(struct task_struct *child, long request,
ret = seccomp_get_metadata(child, addr, datavp);
break;
 
+#ifdef CONFIG_RSEQ
+   case PTRACE_GET_RSEQ_CONFIGURATION:
+   ret = ptrace_get_rseq_configuration(child, addr, datavp);
+   break;
+#endif
+
default:
break;
}

[PATCH 27/36] scsi: isci: request: Fix doc-rot issue relating to 'ireq' param

2021-03-17 Thread Lee Jones

Fixes the following W=1 kernel build warning(s):

 drivers/scsi/isci/request.c:496: warning: Function parameter or member 'ireq' not described in 'scu_sata_request_construct_task_context'
 drivers/scsi/isci/request.c:496: warning: Excess function parameter 'sci_req' description in 'scu_sata_request_construct_task_context'

Cc: Artur Paszkiewicz 
Cc: "James E.J. Bottomley" 
Cc: "Martin K. Petersen" 
Cc: linux-s...@vger.kernel.org
Signed-off-by: Lee Jones 
---
 drivers/scsi/isci/request.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/isci/request.c b/drivers/scsi/isci/request.c
index 49ab2555c0cdf..593b38c59924a 100644
--- a/drivers/scsi/isci/request.c
+++ b/drivers/scsi/isci/request.c
@@ -481,7 +481,7 @@ static void 
scu_ssp_task_request_construct_task_context(struct isci_request *ire
  * scu_sata_request_construct_task_context()
  * This method is will fill in the SCU Task Context for any type of SATA
  *request.  This is called from the various SATA constructors.
- * @sci_req: The general IO request object which is to be used in
+ * @ireq: The general IO request object which is to be used in
  *constructing the SCU task context.
  * @task_context: The buffer pointer for the SCU task context which is being
  *constructed.
-- 
2.27.0

[PATCH 20/36] scsi: isci: request: Fix a myriad of kernel-doc issues

2021-03-17 Thread Lee Jones

Fixes the following W=1 kernel build warning(s):

 drivers/scsi/isci/request.c:211: warning: wrong kernel-doc identifier on line:
 drivers/scsi/isci/request.c:414: warning: wrong kernel-doc identifier on line:
 drivers/scsi/isci/request.c:472: warning: Function parameter or member 'ireq' not described in 'scu_ssp_task_request_construct_task_context'
 drivers/scsi/isci/request.c:472: warning: expecting prototype for The(). Prototype was for scu_ssp_task_request_construct_task_context() instead
 drivers/scsi/isci/request.c:501: warning: Function parameter or member 'ireq' not described in 'scu_sata_request_construct_task_context'
 drivers/scsi/isci/request.c:501: warning: expecting prototype for This method is will fill in the SCU Task Context for any type of SATA(). Prototype was for scu_sata_request_construct_task_context() instead
 drivers/scsi/isci/request.c:597: warning: Cannot understand  *
 drivers/scsi/isci/request.c:785: warning: expecting prototype for sci_req_tx_bytes(). Prototype was for SCU_TASK_CONTEXT_SRAM() instead
 drivers/scsi/isci/request.c:1399: warning: Cannot understand  *
 drivers/scsi/isci/request.c:1446: warning: Cannot understand  *
 drivers/scsi/isci/request.c:2465: warning: Function parameter or member 'task' not described in 'isci_request_process_response_iu'
 drivers/scsi/isci/request.c:2465: warning: Excess function parameter 'sas_task' description in 'isci_request_process_response_iu'
 drivers/scsi/isci/request.c:2501: warning: Function parameter or member 'task' not described in 'isci_request_set_open_reject_status'
 drivers/scsi/isci/request.c:2524: warning: Function parameter or member 'idev' not described in 'isci_request_handle_controller_specific_errors'
 drivers/scsi/isci/request.c:2524: warning: Function parameter or member 'task' not described in 'isci_request_handle_controller_specific_errors'
 drivers/scsi/isci/request.c:3337: warning: Function parameter or member 'idev' not described in 'isci_io_request_build'
 drivers/scsi/isci/request.c:3337: warning: Excess function parameter 'sci_device' description in 'isci_io_request_build'

Cc: Artur Paszkiewicz 
Cc: "James E.J. Bottomley" 
Cc: "Martin K. Petersen" 
Cc: linux-s...@vger.kernel.org
Signed-off-by: Lee Jones 
---
 drivers/scsi/isci/request.c | 58 ++---
 1 file changed, 28 insertions(+), 30 deletions(-)

diff --git a/drivers/scsi/isci/request.c b/drivers/scsi/isci/request.c
index 58e62162882f2..49ab2555c0cdf 100644
--- a/drivers/scsi/isci/request.c
+++ b/drivers/scsi/isci/request.c
@@ -207,11 +207,8 @@ static void sci_task_request_build_ssp_task_iu(struct 
isci_request *ireq)
SCI_CONTROLLER_INVALID_IO_TAG;
 }
 
-/**
+/*
  * This method is will fill in the SCU Task Context for any type of SSP 
request.
- * @sci_req:
- * @task_context:
- *
  */
 static void scu_ssp_request_construct_task_context(
struct isci_request *ireq,
@@ -410,10 +407,8 @@ static void scu_ssp_ireq_dif_strip(struct isci_request 
*ireq, u8 type, u8 op)
tc->ref_tag_seed_gen = 0;
 }
 
-/**
+/*
  * This method is will fill in the SCU Task Context for a SSP IO request.
- * @sci_req:
- *
  */
 static void scu_ssp_io_request_construct_task_context(struct isci_request 
*ireq,
  enum dma_data_direction 
dir,
@@ -456,17 +451,16 @@ static void 
scu_ssp_io_request_construct_task_context(struct isci_request *ireq,
 }
 
 /**
- * This method will fill in the SCU Task Context for a SSP Task request.  The
- *following important settings are utilized: -# priority ==
- *SCU_TASK_PRIORITY_HIGH.  This ensures that the task request is issued
- *ahead of other task destined for the same Remote Node. -# task_type ==
- *SCU_TASK_TYPE_IOREAD.  This simply indicates that a normal request type
- *(i.e. non-raw frame) is being utilized to perform task management. -#
- *    control_frame == 1.  This ensures that the proper endianess is set so
- *that the bytes are transmitted in the right order for a task frame.
- * @sci_req: This parameter specifies the task request object being
- *constructed.
- *
+ * scu_ssp_task_request_construct_task_context() - This method will fill in
+ *the SCU Task Context for a SSP Task request.  The following important
+ *settings are utilized: -# priority == SCU_TASK_PRIORITY_HIGH.  This
+ *ensures that the task request is issued ahead of other task destined
+ *for the same Remote Node. -# task_type == SCU_TASK_TYPE_IOREAD.  This
+ *simply indicates that a normal request type (i.e. non-raw frame) is
+ *being utilized to perform task management. -#control_frame == 1.  This
+ *ensures that the proper endianess is set so that the bytes are
+ *transmitted i

[PATCH 04/11] i2c: imx-lpi2c: manage irq resource request/release in runtime pm

2021-03-16 Thread Clark Wang

From: Fugang Duan 

Manage irq resource request/release in runtime pm to save irq domain's
power.

Signed-off-by: Frank Li 
Signed-off-by: Fugang Duan 
Reviewed-by: Frank Li 
---
 drivers/i2c/busses/i2c-imx-lpi2c.c | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/i2c/busses/i2c-imx-lpi2c.c 
b/drivers/i2c/busses/i2c-imx-lpi2c.c
index 664fcc0dba51..e718bb6b2387 100644
--- a/drivers/i2c/busses/i2c-imx-lpi2c.c
+++ b/drivers/i2c/busses/i2c-imx-lpi2c.c
@@ -94,6 +94,7 @@ enum lpi2c_imx_pincfg {
 
 struct lpi2c_imx_struct {
struct i2c_adapter  adapter;
+   int irq;
struct clk  *clk_per;
struct clk  *clk_ipg;
void __iomem*base;
@@ -543,7 +544,7 @@ static int lpi2c_imx_probe(struct platform_device *pdev)
 {
struct lpi2c_imx_struct *lpi2c_imx;
unsigned int temp;
-   int irq, ret;
+   int ret;
 
lpi2c_imx = devm_kzalloc(&pdev->dev, sizeof(*lpi2c_imx), GFP_KERNEL);
if (!lpi2c_imx)
@@ -553,9 +554,9 @@ static int lpi2c_imx_probe(struct platform_device *pdev)
if (IS_ERR(lpi2c_imx->base))
return PTR_ERR(lpi2c_imx->base);
 
-   irq = platform_get_irq(pdev, 0);
-   if (irq < 0)
-   return irq;
+   lpi2c_imx->irq = platform_get_irq(pdev, 0);
+   if (lpi2c_imx->irq < 0)
+   return lpi2c_imx->irq;
 
lpi2c_imx->adapter.owner= THIS_MODULE;
lpi2c_imx->adapter.algo = &lpi2c_imx_algo;
@@ -581,14 +582,6 @@ static int lpi2c_imx_probe(struct platform_device *pdev)
if (ret)
lpi2c_imx->bitrate = I2C_MAX_STANDARD_MODE_FREQ;
 
-   ret = devm_request_irq(&pdev->dev, irq, lpi2c_imx_isr,
-  IRQF_NO_SUSPEND,
-  pdev->name, lpi2c_imx);
-   if (ret) {
-   dev_err(&pdev->dev, "can't claim irq %d\n", irq);
-   return ret;
-   }
-
i2c_set_adapdata(&lpi2c_imx->adapter, lpi2c_imx);
platform_set_drvdata(pdev, lpi2c_imx);
 
@@ -640,6 +633,7 @@ static int __maybe_unused lpi2c_runtime_suspend(struct 
device *dev)
 {
struct lpi2c_imx_struct *lpi2c_imx = dev_get_drvdata(dev);
 
+   devm_free_irq(dev, lpi2c_imx->irq, lpi2c_imx);
clk_disable_unprepare(lpi2c_imx->clk_ipg);
clk_disable_unprepare(lpi2c_imx->clk_per);
pinctrl_pm_select_idle_state(dev);
@@ -665,6 +659,14 @@ static int __maybe_unused lpi2c_runtime_resume(struct 
device *dev)
dev_err(dev, "can't enable I2C ipg clock, ret=%d\n", ret);
}
 
+   ret = devm_request_irq(dev, lpi2c_imx->irq, lpi2c_imx_isr,
+  IRQF_NO_SUSPEND,
+  dev_name(dev), lpi2c_imx);
+   if (ret) {
+   dev_err(dev, "can't claim irq %d\n", lpi2c_imx->irq);
+   return ret;
+   }
+
return ret;
 }
 
-- 
2.25.1

[PATCH][next] ath11k: qmi: Fix spelling mistake "requeqst" -> "request"

2021-03-16 Thread Colin King

From: Colin Ian King 

There is a spelling mistake in an ath11k_warn message. Fix it.

Signed-off-by: Colin Ian King 
---
 drivers/net/wireless/ath/ath11k/qmi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath11k/qmi.c 
b/drivers/net/wireless/ath/ath11k/qmi.c
index a612e279ea5b..b5e34d670715 100644
--- a/drivers/net/wireless/ath/ath11k/qmi.c
+++ b/drivers/net/wireless/ath/ath11k/qmi.c
@@ -2514,7 +2514,7 @@ static int ath11k_qmi_event_load_bdf(struct ath11k_qmi *qmi)
 
ret = ath11k_qmi_request_target_cap(ab);
if (ret < 0) {
-   ath11k_warn(ab, "failed to requeqst qmi target capabilities: %d\n",
+   ath11k_warn(ab, "failed to request qmi target capabilities: %d\n",
ret);
return ret;
}
-- 
2.30.2

Re: [PATCH net-next 3/3] net: ipa: extend the INDICATION_REGISTER request

2021-03-15 Thread Manivannan Sadhasivam

On Mon, Mar 15, 2021 at 10:21:12AM -0500, Alex Elder wrote:
> The specified format of the INDICATION_REGISTER QMI request message
> has been extended to support two more optional fields:
>   endpoint_desc_ind:
> sender wishes to receive endpoint descriptor information via
> an IPA ENDP_DESC indication QMI message
>   bw_change_ind:
> sender wishes to receive bandwidth change information via
> an IPA BW_CHANGE indication QMI message
> 
> Add definitions that permit these fields to be formatted and parsed
> by the QMI library code.
> 
> Signed-off-by: Alex Elder 

Acked-by: Manivannan Sadhasivam 

Thanks,
Mani

> ---
>  drivers/net/ipa/ipa_qmi_msg.c | 40 +++
>  drivers/net/ipa/ipa_qmi_msg.h |  6 +-
>  2 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ipa/ipa_qmi_msg.c b/drivers/net/ipa/ipa_qmi_msg.c
> index e4a6efbe9bd00..6838e8065072b 100644
> --- a/drivers/net/ipa/ipa_qmi_msg.c
> +++ b/drivers/net/ipa/ipa_qmi_msg.c
> @@ -70,6 +70,46 @@ struct qmi_elem_info ipa_indication_register_req_ei[] = {
>   .offset = offsetof(struct ipa_indication_register_req,
>  ipa_mhi_ready_ind),
>   },
> + {
> + .data_type  = QMI_OPT_FLAG,
> + .elem_len   = 1,
> + .elem_size  =
> + sizeof_field(struct ipa_indication_register_req,
> +  endpoint_desc_ind_valid),
> + .tlv_type   = 0x13,
> + .offset = offsetof(struct ipa_indication_register_req,
> +endpoint_desc_ind_valid),
> + },
> + {
> + .data_type  = QMI_UNSIGNED_1_BYTE,
> + .elem_len   = 1,
> + .elem_size  =
> + sizeof_field(struct ipa_indication_register_req,
> +  endpoint_desc_ind),
> + .tlv_type   = 0x13,
> + .offset = offsetof(struct ipa_indication_register_req,
> +endpoint_desc_ind),
> + },
> + {
> + .data_type  = QMI_OPT_FLAG,
> + .elem_len   = 1,
> + .elem_size  =
> + sizeof_field(struct ipa_indication_register_req,
> +  bw_change_ind_valid),
> + .tlv_type   = 0x14,
> + .offset = offsetof(struct ipa_indication_register_req,
> +bw_change_ind_valid),
> + },
> + {
> + .data_type  = QMI_UNSIGNED_1_BYTE,
> + .elem_len   = 1,
> + .elem_size  =
> + sizeof_field(struct ipa_indication_register_req,
> +  bw_change_ind),
> + .tlv_type   = 0x14,
> + .offset = offsetof(struct ipa_indication_register_req,
> +bw_change_ind),
> + },
>   {
>   .data_type  = QMI_EOTI,
>   },
> diff --git a/drivers/net/ipa/ipa_qmi_msg.h b/drivers/net/ipa/ipa_qmi_msg.h
> index 12b6621f4b0e6..3233d145fd87c 100644
> --- a/drivers/net/ipa/ipa_qmi_msg.h
> +++ b/drivers/net/ipa/ipa_qmi_msg.h
> @@ -24,7 +24,7 @@
>   * information for each field.  The qmi_send_*() interfaces require
>   * the message size to be provided.
>   */
> -#define IPA_QMI_INDICATION_REGISTER_REQ_SZ   12  /* -> server handle */
> +#define IPA_QMI_INDICATION_REGISTER_REQ_SZ   20  /* -> server handle */
>  #define IPA_QMI_INDICATION_REGISTER_RSP_SZ   7   /* <- server handle */
>  #define IPA_QMI_INIT_DRIVER_REQ_SZ   162 /* client handle -> */
>  #define IPA_QMI_INIT_DRIVER_RSP_SZ   25  /* client handle <- */
> @@ -44,6 +44,10 @@ struct ipa_indication_register_req {
>   u8 data_usage_quota_reached;
>   u8 ipa_mhi_ready_ind_valid;
>   u8 ipa_mhi_ready_ind;
> + u8 endpoint_desc_ind_valid;
> + u8 endpoint_desc_ind;
> + u8 bw_change_ind_valid;
> + u8 bw_change_ind;
>  };
>  
>  /* The response to a IPA_QMI_INDICATION_REGISTER request consists only of
> -- 
> 2.27.0
>

Re: [PATCH 2/2] usb: cdns3: Optimize DMA request buffer allocation

2021-03-15 Thread Peter Chen

On 21-03-15 15:51:04, Sanket Parmar wrote:
> > > +
> > >   priv_req->flags |= REQUEST_UNALIGNED;
> > >   trace_cdns3_prepare_aligned_request(priv_req);
> > >
> > > @@ -3088,11 +3113,11 @@ static void cdns3_gadget_exit(struct cdns
> > *cdns)
> > >   struct cdns3_aligned_buf *buf;
> > >
> > >   buf = cdns3_next_align_buf(&priv_dev->aligned_buf_list);
> > > - dma_free_coherent(priv_dev->sysdev, buf->size,
> > > -   buf->buf,
> > > -   buf->dma);
> > > + dma_unmap_single(priv_dev->sysdev, buf->dma, buf->size,
> > > + buf->dir);
> > 
> > It only needs to DMA unmap after DMA has completed, this buf will not be
> > used, otherwise, the kfree below will cause issue.
> 
> This part is not clear.  Aligned DMA buffer is allocated and mapped in 
> cdns3_prepare_aligned_request_buf()
> and put into aligned_buf_list. While unloading the gadget, We need to undo 
> the same if aligned_buf_list is not
> empty.  Am I missing something here? 

My point is that this unmap operation is useless since there is no user of
the aligned buf, and kfree is called afterwards. You could also keep it, as it
does no harm.

> 
> Also, I will post v2 of this patch which uses dma_*_noncoherent APIs 
> suggested by Christoph Hellwig.

-- 

Thanks,
Peter Chen

Re: [PATCH v2] nvme-tcp: Check if request has started before processing it

2021-03-15 Thread Sagi Grimberg





> Hi Sagi,
>
> On Fri, Mar 05, 2021 at 11:57:30AM -0800, Sagi Grimberg wrote:
>> Daniel, again, there is nothing specific about this to nvme-tcp,
>> this is a safeguard against a funky controller (or a different
>> bug that is hidden by this).
>
> As far as I can tell, the main difference between nvme-tcp and FC/NVMe
> is that nvme-tcp does not have a FW or a big driver which filters out
> some noise from a misbehaving controller. I haven't really checked the
> other transports but I wouldn't be surprised if they share the same
> properties as FC/NVMe.
>
>> The same can happen in any other transport so I would suggest that if
>> this is a safeguard we want to put in place, we should make it a
>> generic one.
>>
>> i.e. nvme_tag_to_rq() that _all_ transports call consistently.
>
> Okay, I'll review all the relevant code and see what could be made more
> generic and consistent.
>
> Though I think nvme-tcp plays in a different league as it is exposed to
> normal networking traffic and this is a very hostile environment.

It is, but in this situation, the controller is sending a second
completion that results in a use-after-free, which makes the
transport irrelevant. Unless there is some other flow (which is unclear
to me) that causes this, which would be a bug that needs to be fixed
rather than hidden with a safeguard.

RE: [PATCH 2/2] usb: cdns3: Optimize DMA request buffer allocation

2021-03-15 Thread Sanket Parmar

> 
> On 21-03-09 06:19:40, Sanket Parmar wrote:
> > dma_alloc_coherent() might fail on the platform with a small DMA region.
> >
> > To avoid such failure in cdns3_prepare_aligned_request_buf(),
> > dma_alloc_coherent() is replaced with kmalloc and dma_map API to
> > allocate aligned request buffer of dynamic length.
> >
> > Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver")
> 
> Same comment as for the 1st patch: this is not a bug fix.

I will remove this. 

> 
> > Reported-by: Aswath Govindraju 
> > Signed-off-by: Sanket Parmar 
> > ---
> >  drivers/usb/cdns3/cdns3-gadget.c |   73 +--
> --
> >  drivers/usb/cdns3/cdns3-gadget.h |2 +
> >  2 files changed, 51 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-
> gadget.c
> > index 5f51215..b4955ce 100644
> > --- a/drivers/usb/cdns3/cdns3-gadget.c
> > +++ b/drivers/usb/cdns3/cdns3-gadget.c
> > @@ -818,10 +818,26 @@ void cdns3_gadget_giveback(struct
> cdns3_endpoint *priv_ep,
> > usb_gadget_unmap_request_by_dev(priv_dev->sysdev, request,
> >     priv_ep->dir);
> >
> > -   if ((priv_req->flags & REQUEST_UNALIGNED) &&
> > -   priv_ep->dir == USB_DIR_OUT && !request->status)
> > -   memcpy(request->buf, priv_req->aligned_buf->buf,
> > -  request->length);
> > +   if ((priv_req->flags & REQUEST_UNALIGNED) && priv_req-
> >aligned_buf) {
> > +   struct cdns3_aligned_buf *buf;
> > +
> > +   buf = priv_req->aligned_buf;
> > +   dma_unmap_single(priv_dev->sysdev, buf->dma, buf->size,
> > +   buf->dir);
> > +   priv_req->flags &= ~REQUEST_UNALIGNED;
> > +
> > +   if (priv_ep->dir == USB_DIR_OUT && !request->status) {
> > +   memcpy(request->buf, priv_req->aligned_buf->buf,
> > +  request->length);
> > +   }
> > +
> > +   trace_cdns3_free_aligned_request(priv_req);
> > +   priv_req->aligned_buf->in_use = 0;
> > +   queue_work(system_freezable_wq,
> > +  &priv_dev->aligned_buf_wq);
> > +   priv_req->aligned_buf = NULL;
> > +
> > +   }
> >
> > priv_req->flags &= ~(REQUEST_PENDING | REQUEST_UNALIGNED);
> > /* All TRBs have finished, clear the counter */
> > @@ -883,8 +899,7 @@ static void cdns3_free_aligned_request_buf(struct
> work_struct *work)
> >  * interrupts.
> >  */
> > spin_unlock_irqrestore(&priv_dev->lock, flags);
> > -   dma_free_coherent(priv_dev->sysdev, buf->size,
> > - buf->buf, buf->dma);
> > +   kfree(buf->buf);
> > kfree(buf);
> > spin_lock_irqsave(&priv_dev->lock, flags);
> > }
> > @@ -910,27 +925,16 @@ static int
> cdns3_prepare_aligned_request_buf(struct cdns3_request *priv_req)
> > if (!buf)
> > return -ENOMEM;
> >
> > -   buf->size = priv_req->request.length;
> > +   buf->size = usb_endpoint_dir_out(priv_ep->endpoint.desc)
> ?
> > +   usb_ep_align(&(priv_ep->endpoint),
> priv_req->request.length)
> > +   : priv_req->request.length;
> >
> > -   buf->buf = dma_alloc_coherent(priv_dev->sysdev,
> > - buf->size,
> > - &buf->dma,
> > - GFP_ATOMIC);
> > +   buf->buf = kmalloc(buf->size, GFP_ATOMIC);
> > if (!buf->buf) {
> > kfree(buf);
> > return -ENOMEM;
> > }
> >
> > -   if (priv_req->aligned_buf) {
> > -   trace_cdns3_free_aligned_request(priv_req);
> > -   priv_req->aligned_buf->in_use = 0;
> > -   queue_work(system_freezable_wq,
> > -  &priv_dev->aligned_buf_wq);
> > -   }
> > -
> > -   buf->in_use = 1;
> > -   priv_req->aligned_buf = buf;
> > -
>

[PATCH net-next 3/3] net: ipa: extend the INDICATION_REGISTER request

2021-03-15 Thread Alex Elder

The specified format of the INDICATION_REGISTER QMI request message
has been extended to support two more optional fields:
  endpoint_desc_ind:
sender wishes to receive endpoint descriptor information via
an IPA ENDP_DESC indication QMI message
  bw_change_ind:
sender wishes to receive bandwidth change information via
an IPA BW_CHANGE indication QMI message

Add definitions that permit these fields to be formatted and parsed
by the QMI library code.

Signed-off-by: Alex Elder 
---
 drivers/net/ipa/ipa_qmi_msg.c | 40 +++
 drivers/net/ipa/ipa_qmi_msg.h |  6 +-
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ipa/ipa_qmi_msg.c b/drivers/net/ipa/ipa_qmi_msg.c
index e4a6efbe9bd00..6838e8065072b 100644
--- a/drivers/net/ipa/ipa_qmi_msg.c
+++ b/drivers/net/ipa/ipa_qmi_msg.c
@@ -70,6 +70,46 @@ struct qmi_elem_info ipa_indication_register_req_ei[] = {
.offset = offsetof(struct ipa_indication_register_req,
   ipa_mhi_ready_ind),
},
+   {
+   .data_type  = QMI_OPT_FLAG,
+   .elem_len   = 1,
+   .elem_size  =
+   sizeof_field(struct ipa_indication_register_req,
+endpoint_desc_ind_valid),
+   .tlv_type   = 0x13,
+   .offset = offsetof(struct ipa_indication_register_req,
+  endpoint_desc_ind_valid),
+   },
+   {
+   .data_type  = QMI_UNSIGNED_1_BYTE,
+   .elem_len   = 1,
+   .elem_size  =
+   sizeof_field(struct ipa_indication_register_req,
+endpoint_desc_ind),
+   .tlv_type   = 0x13,
+   .offset = offsetof(struct ipa_indication_register_req,
+  endpoint_desc_ind),
+   },
+   {
+   .data_type  = QMI_OPT_FLAG,
+   .elem_len   = 1,
+   .elem_size  =
+   sizeof_field(struct ipa_indication_register_req,
+bw_change_ind_valid),
+   .tlv_type   = 0x14,
+   .offset = offsetof(struct ipa_indication_register_req,
+  bw_change_ind_valid),
+   },
+   {
+   .data_type  = QMI_UNSIGNED_1_BYTE,
+   .elem_len   = 1,
+   .elem_size  =
+   sizeof_field(struct ipa_indication_register_req,
+bw_change_ind),
+   .tlv_type   = 0x14,
+   .offset = offsetof(struct ipa_indication_register_req,
+  bw_change_ind),
+   },
{
.data_type  = QMI_EOTI,
},
diff --git a/drivers/net/ipa/ipa_qmi_msg.h b/drivers/net/ipa/ipa_qmi_msg.h
index 12b6621f4b0e6..3233d145fd87c 100644
--- a/drivers/net/ipa/ipa_qmi_msg.h
+++ b/drivers/net/ipa/ipa_qmi_msg.h
@@ -24,7 +24,7 @@
  * information for each field.  The qmi_send_*() interfaces require
  * the message size to be provided.
  */
-#define IPA_QMI_INDICATION_REGISTER_REQ_SZ 12  /* -> server handle */
+#define IPA_QMI_INDICATION_REGISTER_REQ_SZ 20  /* -> server handle */
 #define IPA_QMI_INDICATION_REGISTER_RSP_SZ 7   /* <- server handle */
 #define IPA_QMI_INIT_DRIVER_REQ_SZ 162 /* client handle -> */
 #define IPA_QMI_INIT_DRIVER_RSP_SZ 25  /* client handle <- */
@@ -44,6 +44,10 @@ struct ipa_indication_register_req {
u8 data_usage_quota_reached;
u8 ipa_mhi_ready_ind_valid;
u8 ipa_mhi_ready_ind;
+   u8 endpoint_desc_ind_valid;
+   u8 endpoint_desc_ind;
+   u8 bw_change_ind_valid;
+   u8 bw_change_ind;
 };
 
 /* The response to a IPA_QMI_INDICATION_REGISTER request consists only of
-- 
2.27.0
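
The change of IPA_QMI_INDICATION_REGISTER_REQ_SZ from 12 to 20 can be checked with a little arithmetic. The sketch below assumes that each optional one-byte field goes on the wire as a TLV with a 1-byte type, a 2-byte length and a 1-byte value, and that the *_valid flags only mark presence in the C structure and are not transmitted. It also assumes the message carried three such fields before this patch and five after it. The helper name and the constants are made up for illustration and are not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed QMI TLV framing for an optional u8 field. */
#define QMI_TLV_TYPE_SIZE	1	/* tlv_type byte */
#define QMI_TLV_LEN_SIZE	2	/* length of the value that follows */
#define QMI_U8_VALUE_SIZE	1	/* QMI_UNSIGNED_1_BYTE payload */

/* Hypothetical helper: encoded size of a message with n optional u8 fields. */
static size_t qmi_u8_tlv_msg_size(size_t n_fields)
{
	const size_t per_field = QMI_TLV_TYPE_SIZE + QMI_TLV_LEN_SIZE +
				 QMI_U8_VALUE_SIZE;

	assert(n_fields <= SIZE_MAX / per_field);	/* no overflow */
	return n_fields * per_field;
}
```

Under these assumptions, three fields encode to 12 bytes and five fields to 20 bytes, which matches the old and new values of the macro.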

[PATCH 5.10 190/290] s390/dasd: fix hanging IO request during DASD driver unbind

2021-03-15 Thread gregkh

From: Greg Kroah-Hartman 

From: Stefan Haberland 

commit 66f669a272898feb1c69b770e1504aa2ec7723d1 upstream.

Prevent an IO request from being built during a device shutdown initiated
by a driver unbind. Such a request can never be processed or canceled and
will hang forever, which also leads to a hanging unbind.

Fix this by checking not only that the device is in READY state but also
that no device offline has been initiated before building a new IO request.

Fixes: e443343e509a ("s390/dasd: blk-mq conversion")

Cc:  # v4.14+
Signed-off-by: Stefan Haberland 
Tested-by: Bjoern Walk 
Reviewed-by: Jan Hoeppner 
Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/s390/block/dasd.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -3087,7 +3087,8 @@ static blk_status_t do_dasd_request(stru

basedev = block->base;
spin_lock_irq(&dq->lock);
-   if (basedev->state < DASD_STATE_READY) {
+   if (basedev->state < DASD_STATE_READY ||
+   test_bit(DASD_FLAG_OFFLINE, &basedev->flags)) {
DBF_DEV_EVENT(DBF_ERR, basedev,
      "device not ready for request %p", req);
rc = BLK_STS_IOERR;

[PATCH 5.11 184/306] s390/dasd: fix hanging IO request during DASD driver unbind

2021-03-15 Thread gregkh

From: Greg Kroah-Hartman 

From: Stefan Haberland 

commit 66f669a272898feb1c69b770e1504aa2ec7723d1 upstream.

Prevent an IO request from being built during a device shutdown initiated
by a driver unbind. Such a request can never be processed or canceled and
will hang forever, which also leads to a hanging unbind.

Fix this by checking not only that the device is in READY state but also
that no device offline has been initiated before building a new IO request.

Fixes: e443343e509a ("s390/dasd: blk-mq conversion")

Cc:  # v4.14+
Signed-off-by: Stefan Haberland 
Tested-by: Bjoern Walk 
Reviewed-by: Jan Hoeppner 
Signed-off-by: Jens Axboe 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/s390/block/dasd.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -3068,7 +3068,8 @@ static blk_status_t do_dasd_request(stru

basedev = block->base;
spin_lock_irq(&dq->lock);
-   if (basedev->state < DASD_STATE_READY) {
+   if (basedev->state < DASD_STATE_READY ||
+   test_bit(DASD_FLAG_OFFLINE, &basedev->flags)) {
DBF_DEV_EVENT(DBF_ERR, basedev,
      "device not ready for request %p", req);
rc = BLK_STS_IOERR;
