Re: Mistake in include IS_ENABLED(CONFIG_LIVEPATCH)

2017-02-10 Thread Jiri Kosina
On Sat, 11 Feb 2017, Denys Fedoryshchenko wrote:

> I noticed that sample of livepatch is not working in 4.9.9, because in
> include,
> linux/livepatch.h
> it is:
> #if IS_ENABLED(CONFIG_LIVEPATCH)
> 
> while config option is:
> CONFIG_HAVE_LIVEPATCH=y
> 
> After editing livepatch.h sample module compiles fine
> 
> Probably that's just a typo?

There are two config variables. CONFIG_HAVE_LIVEPATCH is set by those 
architectures for which livepatching implementation exists.

CONFIG_LIVEPATCH is the actual config option turning the support in kernel 
on/off.

What you are seeing is that if you have kernel configuration that has 
livepatching (CONFIG_LIVEPATCH) turned off, the sample module doesn't 
compile for it either. I'd say it's not unexpected behavior.

-- 
Jiri Kosina
SUSE Labs
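
The two-symbol arrangement described above can be sketched in plain C. This
is only an illustration under stated assumptions: the #define lines stand in
for a generated kernel configuration like the reporter's, the function names
are made up for this note, and the #ifdef test is a simplification of the
kernel's IS_ENABLED() macro, which also accounts for options built as
modules.

```c
#include <stdbool.h>

/*
 * Stand-ins for the generated configuration (assumption): the
 * architecture advertises livepatch capability, but the feature itself
 * has not been switched on.
 */
#define CONFIG_HAVE_LIVEPATCH 1
/* CONFIG_LIVEPATCH is deliberately left undefined. */

/* Hypothetical helper: can this architecture support livepatching? */
bool arch_can_livepatch(void)
{
#ifdef CONFIG_HAVE_LIVEPATCH
	return true;
#else
	return false;
#endif
}

/*
 * Hypothetical helper: would declarations guarded by the CONFIG_LIVEPATCH
 * check be visible to a module built against this configuration?
 */
bool livepatch_api_available(void)
{
#ifdef CONFIG_LIVEPATCH
	return true;
#else
	return false;
#endif
}
```

With this configuration arch_can_livepatch() is true while
livepatch_api_available() is false, which matches the situation where the
sample module does not build until CONFIG_LIVEPATCH is enabled.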



Re: [PATCH] ASoC: fsl_sai: support more than 2 channels

2017-02-10 Thread Nicolin Chen
On Fri, Feb 10, 2017 at 07:42:43PM +0100, Alexandre Belloni wrote:
> The FSL SAI can support up to 32 channels using TDM. Report that value so
> they can actually be used.
> 
> Tested using 8 channels.
> 
> Signed-off-by: Alexandre Belloni 

Acked-by: Nicolin Chen 

> ---
>  sound/soc/fsl/fsl_sai.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
> index 9fadf7e31c5f..18e5ce81527d 100644
> --- a/sound/soc/fsl/fsl_sai.c
> +++ b/sound/soc/fsl/fsl_sai.c
> @@ -668,7 +668,7 @@ static struct snd_soc_dai_driver fsl_sai_dai = {
>   .playback = {
>   .stream_name = "CPU-Playback",
>   .channels_min = 1,
> - .channels_max = 2,
> + .channels_max = 32,
>   .rate_min = 8000,
>   .rate_max = 192000,
>   .rates = SNDRV_PCM_RATE_KNOT,
> @@ -677,7 +677,7 @@ static struct snd_soc_dai_driver fsl_sai_dai = {
>   .capture = {
>   .stream_name = "CPU-Capture",
>   .channels_min = 1,
> - .channels_max = 2,
> + .channels_max = 32,
>   .rate_min = 8000,
>   .rate_max = 192000,
>   .rates = SNDRV_PCM_RATE_KNOT,
> -- 
> 2.11.0
> 


Re: [PULL] IIO fixes for 4.10 set 3 - a couple of regression fixes.

2017-02-10 Thread Greg Kroah-Hartman
On Fri, Feb 10, 2017 at 11:35:35PM +0100, Peter Rosin wrote:
> > On Sun, Feb 05, 2017 at 10:35:02AM +, Jonathan Cameron wrote:
> >> The following changes since commit 
> >> 5c113b5e0082e90d2e1c7b12e96a7b8cf0623e27:
> >> 
> >>   iio: dht11: Use usleep_range instead of msleep for start signal 
> >> (2017-01-22 13:35:40 +)
> >> 
> >> are available in the git repository at:
> >> 
> >>   git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio.git 
> >> tags/iio-fixes-for-4.10c
> > 
> > It's a bit late for 4.10 for me, can I just pull this into my -next
> > branch and will they get to 4.10.1 properly?  Meaning, do they have cc:
> > stable markings on them?  Or do you want to fix that up and resend this
> > request?
> 
> Hi Greg,
> 
> You should ask Ken Lin who has the HW and who is apparently affected.
> I think it's bad that you are willing to have a known regression hit
> v4.10 when all was fine in v4.9. Or perhaps you didn't realize that
> the regression was from this cycle?
> 
> The fixes are obvious. I don't understand your hesitation.

My "hesitation" is that I'm about to get on a plane for a day or so and
don't have the time to get this to Linus before 4.10-final is out this
Sunday.  Getting it in a week later should be ok, we all make mistakes,
as long as we fix them it's all good, and for 4.10.1 should be ok.

thanks,

greg k-h


Re: [PATCH] device-dax: don't set kobj parent during cdev init

2017-02-10 Thread Greg Kroah-Hartman
On Fri, Feb 10, 2017 at 02:25:35PM -0800, Dan Williams wrote:
> On Fri, Feb 10, 2017 at 12:17 PM, Greg Kroah-Hartman
>  wrote:
> > On Fri, Feb 10, 2017 at 11:41:20AM -0800, Dan Williams wrote:
> >> On Fri, Feb 10, 2017 at 11:19 AM, Logan Gunthorpe  
> >> wrote:
> >> > I copied this code and per feedback from Greg Kroah-Hartman [1] the
> >> > cdev's kobject's parent should not be set to the related device.
> >> > This should have minor consequences but isn't doing what anyone
> >> > expects it to.
> >> >
> >> > This patch then fixes device-dax so it doesn't make the same mistake.
> >> >
> >> > [1] https://lkml.org/lkml/2017/2/10/370
> >> >
> >> > Signed-off-by: Logan Gunthorpe 
> >>
> >> Thanks for following up with this fix, but this causes a
> >> use-after-free regression:
> >>
> >>  general protection fault:  [#1] SMP DEBUG_PAGEALLOC
> >>  [..]
> >>  Call Trace:
> >>   vsnprintf+0x2d7/0x500
> >>   snprintf+0x49/0x60
> >>   dev_vprintk_emit+0x68/0x230
> >>   ? debug_lockdep_rcu_enabled+0x1d/0x20
> >>   ? trace_hardirqs_off+0xd/0x10
> >>   ? cmpxchg_double_slab.isra.70+0x15a/0x1c0
> >>   ? __slab_free+0x134/0x290
> >>   dev_printk_emit+0x4e/0x70
> >>   __dynamic_dev_dbg+0xc8/0x110
> >>   ? __lock_acquire+0x33d/0x1290
> >>   dax_dev_huge_fault+0xee/0x570 [dax]
> >>   __handle_mm_fault+0x5aa/0x10a0
> >>   handle_mm_fault+0x154/0x350
> >>   ? handle_mm_fault+0x3c/0x350
> >>   __do_page_fault+0x26b/0x4c0
> >>   trace_do_page_fault+0x58/0x270
> >>   do_async_page_fault+0x1a/0xa0
> >>   async_page_fault+0x28/0x30
> >>
> >> I added this reference explicitly so the parent struct device has the
> >> correct lifetime after this feedback from Al.
> >>
> >>https://lists.01.org/pipermail/linux-nvdimm/2016-August/006563.html
> >>
> >> ...so I'm wondering what the actual problem is with setting cdev->parent?
> >
> > It shouldn't do anything at all.  The kobject in a cdev isn't a "normal"
> > kobject, it doesn't show up in sysfs, or anywhere else.  It's used for
> > an internal representation to the cdev code (a kmap) to look up the
> > object to call when userspace opens the device node in a quick manner.
> >
> > Now changing from initialize/add to just register, does do different
> > things, perhaps that is the issue here.  Just try removing the
> > cdev->kobject parent stuff and see if that causes a problem or not.
> >
> 
>  That doesn't help.  I rely on the "kobject_get(p->kobj.parent);" in
> cdev_add() to pin my device and cdev_default_release() to free it.

"pin it" where?  Why do you need this?  That feels really "odd" to me...


Re: [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline

2017-02-10 Thread luca abeni
Hi Daniel,

On Fri, 10 Feb 2017 20:48:11 +0100
Daniel Bristot de Oliveira  wrote:

> During the activation, CBS checks if it can reuse the current task's
> runtime and period. If the deadline of the task is in the past, CBS
> cannot use the runtime, and so it replenishes the task. This rule
> works fine for implicit deadline tasks (deadline == period), and the
> CBS was designed for implicit deadline tasks. However, a task with
> constrained deadline (deadline < period) might be awakened after the
> deadline, but before the next period. In this case, replenishing the
> task would allow it to run for runtime / deadline. As in this case
> deadline < period, CBS enables a task to run for more than the
> runtime/period. In a heavily loaded system, this can cause a domino
> effect, making other tasks miss their deadlines.

I think you are right: SCHED_DEADLINE implements the original CBS
algorithm here, but uses relative deadlines different from periods in
other places (while the original algorithm only considered relative
deadlines equal to periods).
And this mix is dangerous... I think your fix is correct, and cures a
real problem.



Thanks,
Luca


> 
> To avoid this problem, in the activation of a constrained deadline
> task after the deadline but before the next period, throttle the
> task and set the replenishing timer to the begin of the next period,
> unless it is boosted.
> 
> Reproducer:
> 
>  --- %< ---
>   int main (int argc, char **argv)
>   {
>   int ret;
>   int flags = 0;
>   unsigned long l = 0;
>   struct timespec ts;
>   struct sched_attr attr;
> 
>   memset(&attr, 0, sizeof(attr));
>   attr.size = sizeof(attr);
> 
>   attr.sched_policy   = SCHED_DEADLINE;
>   attr.sched_runtime  = 2 * 1000 * 1000;          /* 2 ms */
>   attr.sched_deadline = 2 * 1000 * 1000;          /* 2 ms */
>   attr.sched_period   = 2 * 1000 * 1000 * 1000;   /* 2 s */
> 
>   ts.tv_sec = 0;
>   ts.tv_nsec = 2000 * 1000;   /* 2 ms */
> 
>   ret = sched_setattr(0, &attr, flags);
> 
>   if (ret < 0) {
>   perror("sched_setattr");
>   exit(-1);
>   }
> 
>   for(;;) {
>   /* XXX: you may need to adjust the loop */
>   for (l = 0; l < 15; l++);
>   /*
>    * The idea is to go to sleep right before the deadline
>    * and then wake up before the next period to receive
>    * a new replenishment.
>    */
>   nanosleep(&ts, NULL);
>   }
> 
>   exit(0);
>   }
>   --- >% ---  
> 
> On my box, this reproducer uses almost 50% of the CPU time, which is
> obviously wrong for a task with 2/2000 reservation.
> 
> Signed-off-by: Daniel Bristot de Oliveira 
> Cc: Ingo Molnar 
> Cc: Peter Zijlstra 
> Cc: Juri Lelli 
> Cc: Tommaso Cucinotta 
> Cc: Luca Abeni 
> Cc: Steven Rostedt 
> Cc: linux-kernel@vger.kernel.org
> ---
>  kernel/sched/deadline.c | 44
>  1 file changed, 44 insertions(+)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 3c94d85..b74d40e 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -694,6 +694,36 @@ void init_dl_task_timer(struct sched_dl_entity *dl_se)
>  	timer->function = dl_task_timer;
>  }
>  
> +/* During the activation, CBS checks if it can reuse the current task's
> + * runtime and period. If the deadline of the task is in the past, CBS
> + * cannot use the runtime, and so it replenishes the task. This rule
> + * works fine for implicit deadline tasks (deadline == period), and the
> + * CBS was designed for implicit deadline tasks. However, a task with
> + * constrained deadline (deadline < period) might be awakened after the
> + * deadline, but before the next period. In this case, replenishing the
> + * task would allow it to run for runtime / deadline. As in this case
> + * deadline < period, CBS enables a task to run for more than the
> + * runtime / period. In a heavily loaded system, this can cause a domino
> + * effect, making other tasks miss their deadlines.
> + *
> + * To avoid this problem, in the activation of a constrained deadline
> + * task after the deadline but before the next period, throttle the
> + * task and set the replenishing timer to the beginning of the next
> + * period, unless it is boosted.
> + */
> +static inline void dl_check_constrained_dl(struct sched_dl_entity *dl_se)
> +{
> +	struct task_struct *p = dl_task_of(dl_se);
> +	struct rq *rq = rq_of_dl_rq(dl_rq_of_se(dl_se));
> +
> +	if (dl_time_before(dl_se->deadline, rq_clock(rq)) &&
> +	    dl_time_before(rq_clock(rq), dl_next_period(dl_se))) {
> +		if (unlikely(dl_se->dl_boosted || !start_dl_timer(p)))
> +			return;
> +		dl_se->dl_throttled = 1;
> +	}
>
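
The numbers in the changelog can be checked with a small userspace model.
This is a sketch under stated assumptions, not the scheduler code: struct
dl_model and the helper names are invented for this note, times are in
nanoseconds, and only the window test (after the deadline, before the next
period) and the two bandwidth ratios are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of a deadline entity (times in ns). */
struct dl_model {
	uint64_t runtime;      /* budget per activation */
	uint64_t rel_deadline; /* relative deadline */
	uint64_t period;       /* activation period */
	uint64_t abs_deadline; /* current absolute deadline */
};

/* Wraparound-safe "a is before b" comparison on 64-bit clocks. */
static bool time_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/* Start of the next period: activation instant plus one period. */
uint64_t next_period(const struct dl_model *se)
{
	return se->abs_deadline - se->rel_deadline + se->period;
}

/* True when a wakeup at `now` is after the deadline but before the next period. */
bool wakes_in_throttle_window(const struct dl_model *se, uint64_t now)
{
	return time_before(se->abs_deadline, now) &&
	       time_before(now, next_period(se));
}

/* Bandwidth in parts per million: budget divided by an interval. */
uint64_t bandwidth_ppm(uint64_t runtime, uint64_t interval)
{
	return runtime * 1000000ULL / interval;
}
```

With the reproducer's parameters (2 ms runtime, 2 ms deadline, 2 s period)
the reserved bandwidth runtime / period is 1000 ppm, that is 0.1%, while
runtime / deadline is 1000000 ppm, that is 100%. A wakeup 3 ms after
activation falls inside the window in which the patch throttles the task.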

Re: [PATCH 1/2] sched/deadline: Replenishment timer should fire in the next period

2017-02-10 Thread luca abeni
Hi Daniel,

On Fri, 10 Feb 2017 20:48:10 +0100
Daniel Bristot de Oliveira  wrote:

[...]
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 70ef2b1..3c94d85 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -505,10 +505,15 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
>  	}
>  }
>  
> +static inline u64 dl_next_period(struct sched_dl_entity *dl_se)
> +{
> +	return dl_se->deadline - dl_se->dl_deadline + dl_se->dl_period;
> +}
> +
> +
>  /*
>   * If the entity depleted all its runtime, and if we want it to sleep
>   * while waiting for some new execution time to become available, we
> - * set the bandwidth enforcement timer to the replenishment instant
> + * set the bandwidth replenishment timer to the replenishment instant
>   * and try to activate it.
>   *
>   * Notice that it is important for the caller to know if the timer
> @@ -530,7 +535,7 @@ static int start_dl_timer(struct task_struct *p)
>* that it is actually coming from rq->clock and not from
>* hrtimer's time base reading.
>*/
> - act = ns_to_ktime(dl_se->deadline);
> + act = ns_to_ktime(dl_next_period(dl_se));

Looks like there is a real bug in the code, and your fix looks correct
to me. I think it should be committed.


Thanks,
Luca


>   now = hrtimer_cb_get_time(timer);
>   delta = ktime_to_ns(now) - rq_clock(rq);
>   act = ktime_add_ns(act, delta);



Re: [GIT PULL] PCI fixes for v4.10

2017-02-10 Thread Yinghai Lu
On Fri, Feb 10, 2017 at 6:39 PM, Yinghai Lu  wrote:
> Ashok,
>
> Can you ask your QA guys to check only the attached patch and commit 68db9bc?

Cleaner patches: that is now split into two small patches.

Thanks

Yinghai
From 68db9bc814362e7f24371c27d12a4f34477d9356 Mon Sep 17 00:00:00 2001
From: Lukas Wunner 
Date: Fri, 28 Oct 2016 10:52:06 +0200
Subject: PCI: pciehp: Add runtime PM support for PCIe hotplug ports

Linux 4.8 added support for runtime suspending PCIe ports to D3hot with
commit 006d44e49a25 ("PCI: Add runtime PM support for PCIe ports"), but
excluded hotplug ports.  Those are now afforded runtime PM by the present
commit.

Hotplug ports require a few extra considerations:

- The configuration space of the port remains accessible in D3hot, so all
  the functions to read or modify the Slot Status and Slot Control
  registers need not be modified.  Even turning on slot power doesn't seem
  to require the port to be in D0, at least the PCIe spec doesn't say so
  and I confirmed that by testing with a Thunderbolt controller.

- However D0 is required to access devices on the secondary bus.  This
  happens in pciehp_check_link_status() and pciehp_configure_device() (both
  called from board_added()) and in pciehp_unconfigure_device() (called
  from remove_board()), so acquire a runtime PM ref for their invocation.

- The hotplug port stays active as long as it has active children.  If all
  hotplugged devices below the port runtime suspend, the port is allowed to
  runtime suspend as well.  Plug and unplug detection continues to work in
  D3hot.

- Hotplug interrupts are delivered in-band, so while the hotplug port
  itself is allowed to go to D3hot, its parent ports must stay in D0 for
  interrupts to come through.  Add a corresponding restriction to
  pci_dev_check_d3cold().

- Runtime PM may only be allowed if the hotplug port is handled natively by
  the OS.  On ACPI systems, the port may alternatively be handled by the
  firmware and things break if the OS puts the port into D3 behind the
  firmware's back:  E.g. Thunderbolt hotplug ports on non-Macs are handled
  by Intel's firmware in System Management Mode and the firmware is known
  to access devices on the port's secondary bus without checking first if
  the port is in D0: https://bugzilla.kernel.org/show_bug.cgi?id=53811

Signed-off-by: Lukas Wunner 
Signed-off-by: Bjorn Helgaas 
Reviewed-by: Rafael J. Wysocki 
CC: Mika Westerberg 

diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index efe69e8..ffd3fe6 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "../pci.h"
 #include "pciehp.h"
@@ -98,6 +99,7 @@ static int board_added(struct slot *p_slot)
 	pciehp_green_led_blink(p_slot);
 
 	/* Check link training status */
+	pm_runtime_get_sync(&ctrl->pcie->port->dev);
 	retval = pciehp_check_link_status(ctrl);
 	if (retval) {
 		ctrl_err(ctrl, "Failed to check link status\n");
@@ -118,12 +120,14 @@ static int board_added(struct slot *p_slot)
 		if (retval != -EEXIST)
 			goto err_exit;
 	}
+	pm_runtime_put(&ctrl->pcie->port->dev);
 
 	pciehp_green_led_on(p_slot);
 	pciehp_set_attention_status(p_slot, 0);
 	return 0;
 
 err_exit:
+	pm_runtime_put(&ctrl->pcie->port->dev);
 	set_slot_off(ctrl, p_slot);
 	return retval;
 }
@@ -137,7 +141,9 @@ static int remove_board(struct slot *p_slot)
 	int retval;
 	struct controller *ctrl = p_slot->ctrl;
 
+	pm_runtime_get_sync(&ctrl->pcie->port->dev);
 	retval = pciehp_unconfigure_device(p_slot);
+	pm_runtime_put(&ctrl->pcie->port->dev);
 	if (retval)
 		return retval;
 
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d86351a..1eb622c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2245,13 +2245,10 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge)
 			return false;
 
 		/*
-		 * Hotplug interrupts cannot be delivered if the link is down,
-		 * so parents of a hotplug port must stay awake. In addition,
-		 * hotplug ports handled by firmware in System Management Mode
+		 * Hotplug ports handled by firmware in System Management Mode
 		 * may not be put into D3 by the OS (Thunderbolt on non-Macs).
-		 * For simplicity, disallow in general for now.
 		 */
-		if (bridge->is_hotplug_bridge)
+		if (bridge->is_hotplug_bridge && !pciehp_is_native(bridge))
 			return false;
 
 		if (pci_bridge_d3_force)
@@ -2283,7 +2280,10 @@ static int pci_dev_check_d3cold(struct pci_dev *dev, void *data)
 	 !pci_pme_capable(dev, PCI_D3cold)) ||
 
 	/* If it is a bridge it must be allowed to go to D3. */
-	!pci_power_manageable(dev))
+	!pci_power_manageable(dev) ||
+
+	/* Hotplug interrupts cannot be delivered if the link is down. */
+	dev->is_hotplug_bridge)
 
 		*d3cold_ok = false;
 
Subject: [PATCH] PCI, pciehp: clean and reuse set_slot_off

Move out led setting, and reuse it in remove_board.

Signed-off-by: Yinghai Lu 

---
 drivers/pci/hotplug/p
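
The reference handling that commit 68db9bc adds to board_added() follows a
common shape: take a reference before touching the secondary bus and drop
it on every exit path. A minimal mock of that shape, assuming nothing about
the real runtime PM core (struct mock_port and the mock_* helpers are
invented for this note, and the counter only stands in for a usage count):

```c
#include <stdbool.h>

/* Mock port: `usage` stands in for a runtime PM usage count. */
struct mock_port {
	int usage;
	bool link_ok;       /* result of the simulated link check */
	bool configure_ok;  /* result of the simulated device configuration */
};

static void mock_get(struct mock_port *port) { port->usage++; }
static void mock_put(struct mock_port *port) { port->usage--; }

/* Returns 0 on success, -1 on failure; usage is balanced on both paths. */
int mock_board_added(struct mock_port *port)
{
	int retval = 0;

	mock_get(port);         /* keep the port awake while children are accessed */
	if (!port->link_ok) {
		retval = -1;
		goto err_exit;
	}
	if (!port->configure_ok) {
		retval = -1;
		goto err_exit;
	}
	mock_put(port);         /* success path drops the reference */
	return 0;

err_exit:
	mock_put(port);         /* error path drops it as well */
	return retval;
}
```

Whatever the outcome, the count returns to its starting value, so the mock
port is free to suspend again once its children are idle.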

[PATCH] HID: intel-ish-hid: constify device_type structure

2017-02-10 Thread Bhumika Goyal
Declare device_type structure as const as it is only stored in the
type field of a device structure. This field is of type const, so add
const to the declaration of device_type structure.

File size before: drivers/hid/intel-ish-hid/ishtp/bus.o
   text    data     bss     dec     hex filename
   4260     336      16    4612    1204 hid/intel-ish-hid/ishtp/bus.o

File size after: drivers/hid/intel-ish-hid/ishtp/bus.o
   text    data     bss     dec     hex filename
   4324     272      16    4612    1204 hid/intel-ish-hid/ishtp/bus.o

Signed-off-by: Bhumika Goyal 
---
 drivers/hid/intel-ish-hid/ishtp/bus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hid/intel-ish-hid/ishtp/bus.c 
b/drivers/hid/intel-ish-hid/ishtp/bus.c
index f4cbc74..5f382fe 100644
--- a/drivers/hid/intel-ish-hid/ishtp/bus.c
+++ b/drivers/hid/intel-ish-hid/ishtp/bus.c
@@ -358,7 +358,7 @@ static void ishtp_cl_dev_release(struct device *dev)
kfree(to_ishtp_cl_device(dev));
 }
 
-static struct device_type ishtp_cl_device_type = {
+static const struct device_type ishtp_cl_device_type = {
.release= ishtp_cl_dev_release,
 };
 
-- 
1.9.1
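
The reasoning in the changelog can be shown with a miniature of the
pattern. This is a sketch with made-up types (mini_device and
mini_device_type are not the kernel's definitions): because the device only
ever holds a pointer to const, the type object itself can be declared const
without any caller changing.

```c
/* Hypothetical miniature of the pattern; not the kernel's definitions. */
struct mini_device;

struct mini_device_type {
	const char *name;
	void (*release)(struct mini_device *dev);
};

struct mini_device {
	const struct mini_device_type *type; /* pointer to const: never written through */
	int released;
};

static void mini_release(struct mini_device *dev)
{
	dev->released = 1;
}

/* const allows the toolchain to place this object in read-only data. */
static const struct mini_device_type mini_type = {
	.name = "mini",
	.release = mini_release,
};

/* Stores the address of the const object in the device's type field. */
void mini_device_init(struct mini_device *dev)
{
	dev->type = &mini_type;
	dev->released = 0;
}

/* Calls the release hook through the pointer to const. */
void mini_device_put(struct mini_device *dev)
{
	if (dev->type && dev->type->release)
		dev->type->release(dev);
}
```

This is consistent with the size output above, where 64 bytes leave the
data column and 64 bytes appear in the text column while the total stays
the same.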



Re: [PATCH] staging: rtl8192u: Fix brace placement

2017-02-10 Thread Greg KH
On Sat, Feb 11, 2017 at 10:46:27AM +0530, simran singhal wrote:
> Fix brace placement errors caught by checkpatch.pl ERROR: that open
> brace { should be on the previous line
> 
> Signed-off-by: simran singhal 
> ---
>  .../staging/rtl8192u/ieee80211/rtl819x_BAProc.c| 90 
> --
>  1 file changed, 30 insertions(+), 60 deletions(-)

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- You sent multiple patches, yet no indication of which ones should be
  applied in which order.  Greg could just guess, but if you are
  receiving this email, he guessed wrong and the patches didn't apply.
  Please read the section entitled "The canonical patch format" in the
  kernel file, Documentation/SubmittingPatches for a description of how
  to do this so that Greg has a chance to apply these correctly.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot


Re: [PATCH] staging: rtl8192u: Removing multiple blank lines

2017-02-10 Thread Greg KH
On Sat, Feb 11, 2017 at 09:34:12AM +0530, SIMRAN SINGHAL wrote:
> Multiple patches ...?

You sent me lots of patches, how am I supposed to know which one to
apply in what order?

> Can you please clarify what all patches you are including in "Multiple 
> Patches".

Everything you have sent me.

> And the Order you should go for is the order in which I submitted them.

Email does not guarantee "in order" delivery at all.  And how do you
know how my emails are sorted?  That is why you are supposed to number
your patches.  Please read Documentation/SubmittingPatches for how to do
this properly.

thanks,

greg k-h


Re: [PATCH] staging:vt6656:channel.h: fix function definition argument without identifier name issue

2017-02-10 Thread Greg KH
On Sat, Feb 11, 2017 at 07:48:35AM +0530, Arushi Singhal wrote:
> Hi
> Sorry Greg, but how is this not applying to your mailing list?

Your patch did not apply to my staging-next git tree.  Probably because
someone else already did this same work before you did.  Try rebasing
your patch on my staging-next branch of staging.git on git.kernel.org
and see for yourself.

thanks,

greg k-h


[PATCH] rbd: constify device_type structure

2017-02-10 Thread Bhumika Goyal
Declare device_type structure as const as it is only stored in the
type field of a device structure. This field is of type const, so add
const to the declaration of device_type structure.

File size before:
   text    data     bss     dec     hex filename
  61546   11610     208   73364   11e94 drivers/block/rbd.o

File size after:
   text    data     bss     dec     hex filename
  61610   11578     208   73396   11eb4 drivers/block/rbd.o

Signed-off-by: Bhumika Goyal 
---
 drivers/block/rbd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 36d2b9f..dfc708b 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -4779,7 +4779,7 @@ static ssize_t rbd_image_refresh(struct device *dev,
 
 static void rbd_dev_release(struct device *dev);
 
-static struct device_type rbd_device_type = {
+static const struct device_type rbd_device_type = {
.name   = "rbd",
.groups = rbd_attr_groups,
.release= rbd_dev_release,
-- 
1.9.1



Re: [PATCH v4] fork: free vmapped stacks in cache when cpus are offline

2017-02-10 Thread Michal Hocko
On Sat 11-02-17 08:40:38, Hoeun Ryu wrote:
>  Using virtually mapped stack, kernel stacks are allocated via vmalloc.
> In the current implementation, two stacks per cpu can be cached when
> tasks are freed and the cached stacks are used again in task duplications.
> but the cached stacks may remain unfreed even when cpu are offline.
>  By adding a cpu hotplug callback to free the cached stacks when a cpu
> goes offline, the pages of the cached stacks are not wasted.
> 
> Signed-off-by: Hoeun Ryu 

Acked-by: Michal Hocko 

> ---
> v4:
>  use CPUHP_BP_PREPARE_DYN state for cpuhp setup
>  fix minor coding style
> v3:
>  fix misuse of per-cpu api
>  fix location of function definition within CONFIG_VMAP_STACK
> v2:
>  remove cpuhp callback for `startup`, only `teardown` callback is installed.
> 
>  kernel/fork.c | 23 +++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 937ba59..61634d7 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -168,6 +168,24 @@ void __weak arch_release_thread_stack(unsigned long 
> *stack)
>   */
>  #define NR_CACHED_STACKS 2
>  static DEFINE_PER_CPU(struct vm_struct *, cached_stacks[NR_CACHED_STACKS]);
> +
> +static int free_vm_stack_cache(unsigned int cpu)
> +{
> + struct vm_struct **cached_vm_stacks = per_cpu_ptr(cached_stacks, cpu);
> + int i;
> +
> + for (i = 0; i < NR_CACHED_STACKS; i++) {
> + struct vm_struct *vm_stack = cached_vm_stacks[i];
> +
> + if (!vm_stack)
> + continue;
> +
> + vfree(vm_stack->addr);
> + cached_vm_stacks[i] = NULL;
> + }
> +
> + return 0;
> +}
>  #endif
>  
>  static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int 
> node)
> @@ -456,6 +474,11 @@ void __init fork_init(void)
>   for (i = 0; i < UCOUNT_COUNTS; i++) {
>   init_user_ns.ucount_max[i] = max_threads/2;
>   }
> +
> +#ifdef CONFIG_VMAP_STACK
> + cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "fork:vmstack_cache",
> +   NULL, free_vm_stack_cache);
> +#endif
>  }
>  
>  int __weak arch_dup_task_struct(struct task_struct *dst,
> -- 
> 2.7.4
> 

-- 
Michal Hocko
SUSE Labs
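
The teardown callback in the patch can be modelled in userspace to see the
intended behaviour. This is a sketch under assumptions: a plain
two-dimensional array stands in for the per-CPU variable, malloc()/free()
stand in for the vmalloc-based stack allocation, and NR_CPUS_MODEL and the
function names are invented for this note.

```c
#include <stdlib.h>

#define NR_CPUS_MODEL 4      /* hypothetical CPU count for the model */
#define NR_CACHED_STACKS 2   /* same constant as in the patch */

/* Userspace stand-in for the per-CPU array of cached stacks. */
static void *cached_stacks[NR_CPUS_MODEL][NR_CACHED_STACKS];

/* Caches a newly allocated stack in the first free slot; 0 on success. */
int cache_stack(unsigned int cpu, size_t size)
{
	for (int i = 0; i < NR_CACHED_STACKS; i++) {
		if (!cached_stacks[cpu][i]) {
			cached_stacks[cpu][i] = malloc(size);
			return cached_stacks[cpu][i] ? 0 : -1;
		}
	}
	return -1; /* cache already full */
}

/* Teardown analogue: release every cached stack of one CPU. */
int free_stack_cache(unsigned int cpu)
{
	for (int i = 0; i < NR_CACHED_STACKS; i++) {
		free(cached_stacks[cpu][i]); /* free(NULL) is a no-op */
		cached_stacks[cpu][i] = NULL;
	}
	return 0;
}

/* Number of occupied cache slots on one CPU, for inspection. */
int cached_count(unsigned int cpu)
{
	int n = 0;

	for (int i = 0; i < NR_CACHED_STACKS; i++)
		if (cached_stacks[cpu][i])
			n++;
	return n;
}
```

Calling free_stack_cache() for a CPU empties only that CPU's slots, which
is the behaviour the hotplug teardown callback is meant to provide when a
CPU goes offline.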


[GIT PULL] SCSI fixes for 4.10-rc7

2017-02-10 Thread James Bottomley
Six fairly small fixes.  None is a real show stopper: two are automation
detected problems (one memory leak, one use after free) and the other four
each fix something that has been a significant source of annoyance to
someone.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Bart Van Assche (1):
  scsi: qla2xxx: Fix a recently introduced memory leak

Dave Carroll (1):
  scsi: aacraid: Fix INTx/MSI-x issue with older controllers

Mauricio Faria de Oliveira (1):
  scsi: qla2xxx: Avoid that issuing a LIP triggers a kernel crash

Ram Pai (1):
  scsi: mpt3sas: Force request partial completion alignment

Steffen Maier (1):
  scsi: zfcp: fix use-after-free by not tracing WKA port open/close on 
failed send

ojab (1):
  scsi: mpt3sas: disable ASPM for MPI2 controllers

And the diffstat:

 drivers/s390/scsi/zfcp_fsf.c |  8 
 drivers/scsi/aacraid/comminit.c  |  8 ++--
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 18 ++
 drivers/scsi/qla2xxx/qla_isr.c   |  3 ++-
 drivers/scsi/qla2xxx/qla_os.c|  2 +-
 5 files changed, 31 insertions(+), 8 deletions(-)

With full diff below.

James

---

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 75f820ca..27ff38f 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -1583,7 +1583,7 @@ static void zfcp_fsf_open_wka_port_handler(struct 
zfcp_fsf_req *req)
 int zfcp_fsf_open_wka_port(struct zfcp_fc_wka_port *wka_port)
 {
struct zfcp_qdio *qdio = wka_port->adapter->qdio;
-   struct zfcp_fsf_req *req = NULL;
+   struct zfcp_fsf_req *req;
int retval = -EIO;
 
spin_lock_irq(&qdio->req_q_lock);
@@ -1612,7 +1612,7 @@ int zfcp_fsf_open_wka_port(struct zfcp_fc_wka_port 
*wka_port)
zfcp_fsf_req_free(req);
 out:
spin_unlock_irq(&qdio->req_q_lock);
-   if (req && !IS_ERR(req))
+   if (!retval)
zfcp_dbf_rec_run_wka("fsowp_1", wka_port, req->req_id);
return retval;
 }
@@ -1638,7 +1638,7 @@ static void zfcp_fsf_close_wka_port_handler(struct 
zfcp_fsf_req *req)
 int zfcp_fsf_close_wka_port(struct zfcp_fc_wka_port *wka_port)
 {
struct zfcp_qdio *qdio = wka_port->adapter->qdio;
-   struct zfcp_fsf_req *req = NULL;
+   struct zfcp_fsf_req *req;
int retval = -EIO;
 
spin_lock_irq(&qdio->req_q_lock);
@@ -1667,7 +1667,7 @@ int zfcp_fsf_close_wka_port(struct zfcp_fc_wka_port 
*wka_port)
zfcp_fsf_req_free(req);
 out:
spin_unlock_irq(&qdio->req_q_lock);
-   if (req && !IS_ERR(req))
+   if (!retval)
zfcp_dbf_rec_run_wka("fscwp_1", wka_port, req->req_id);
return retval;
 }
diff --git a/drivers/scsi/aacraid/comminit.c b/drivers/scsi/aacraid/comminit.c
index 4f56b10..5b48bed 100644
--- a/drivers/scsi/aacraid/comminit.c
+++ b/drivers/scsi/aacraid/comminit.c
@@ -50,9 +50,13 @@ struct aac_common aac_config = {
 
 static inline int aac_is_msix_mode(struct aac_dev *dev)
 {
-   u32 status;
+   u32 status = 0;
 
-   status = src_readl(dev, MUnit.OMR);
+   if (dev->pdev->device == PMC_DEVICE_S6 ||
+   dev->pdev->device == PMC_DEVICE_S7 ||
+   dev->pdev->device == PMC_DEVICE_S8) {
+   status = src_readl(dev, MUnit.OMR);
+   }
return (status & AAC_INT_MODE_MSIX);
 }
 
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 75f3fce..0b5b423 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -51,6 +51,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -4657,6 +4658,7 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 
msix_index, u32 reply)
struct MPT3SAS_DEVICE *sas_device_priv_data;
u32 response_code = 0;
unsigned long flags;
+   unsigned int sector_sz;
 
mpi_reply = mpt3sas_base_get_reply_virt_addr(ioc, reply);
scmd = _scsih_scsi_lookup_get_clear(ioc, smid);
@@ -4715,6 +4717,20 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 
msix_index, u32 reply)
}
 
xfer_cnt = le32_to_cpu(mpi_reply->TransferCount);
+
+   /* In case of bogus fw or device, we could end up having
+* unaligned partial completion. We can force alignment here,
+* then scsi-ml does not need to handle this misbehavior.
+*/
+   sector_sz = scmd->device->sector_size;
+   if (unlikely(scmd->request->cmd_type == REQ_TYPE_FS && sector_sz &&
+xfer_cnt % sector_sz)) {
+   sdev_printk(KERN_INFO, scmd->device,
+   "unaligned partial completion avoided (xfer_cnt=%u, 
sector_sz=%u)\n",
+   xfer_cnt, sector_sz);
+   xfer_cnt = round_down(xfer_cnt, sector_sz);
+   }
+
scsi_set_resid(scmd, scsi_bufflen(scm
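
The alignment step in the mpt3sas hunk amounts to rounding the transfer
count down to a whole number of sectors. A standalone sketch of that
arithmetic, with the caveat that align_transfer() is a hypothetical helper
written for this note and uses a plain remainder rather than the kernel's
round_down() macro:

```c
/*
 * Largest multiple of sector_sz that does not exceed xfer_cnt.
 * A zero sector size leaves the count untouched, mirroring the
 * "sector_sz &&" guard in the patch.
 */
unsigned int align_transfer(unsigned int xfer_cnt, unsigned int sector_sz)
{
	if (sector_sz == 0)
		return xfer_cnt;
	return xfer_cnt - (xfer_cnt % sector_sz);
}
```

For example, a reported transfer of 5000 bytes with 512-byte sectors is
trimmed to 4608 bytes (nine sectors), while an already aligned count such
as 4096 is returned unchanged.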

Re: [PATCH] f2fs: introduce nid cache

2017-02-10 Thread Chao Yu
On 2017/2/9 9:28, Jaegeuk Kim wrote:
> On 02/08, Chao Yu wrote:
>> On 2017/2/7 15:24, Chao Yu wrote:
>>> Hi Jaegeuk,
>>>
>>> Happy Chinese New Year! :)
>>>
>>> On 2017/1/24 12:35, Jaegeuk Kim wrote:
>>>> Hi Chao,
>>>>
>>>> On 01/22, Chao Yu wrote:
>>>>> In scenario of intensively node allocation, free nids will be ran out
>>>>> soon, then it needs to stop to load free nids by traversing NAT blocks,
>>>>> in worse case, if NAT blocks does not be cached in memory, it generates
>>>>> IOs which slows down our foreground operations.
>>>>>
>>>>> In order to speed up node allocation, in this patch we introduce a new
>>>>> option named "nid cache", when turns on this option, it will load all
>>>>> nat entries in NAT blocks when doing mount, and organize all free nids
>>>>> in a bitmap, for any operations related to free nid, we will query and
>>>>> set the new prebuilded bitmap instead of reading and lookuping NAT
>>>>> blocks, so performance of node allocation can be improved.
>>>>>
>>>>
>>>> How does this affect mount time and memory consumption?
>>>
>>> Sorry for the delay.
>>>
>>> Let me figure out some numbers later.
>>
>> a. mount time
>>
>> I choose slow device (Kingston 16GB SD card) to see how this option affect 
>> mount
>> time when there is not enough bandwidth in low level,
>>
>> Before the test, I change readahead window size of NAT pages from 
>> FREE_NID_PAGES
>> * 8 to sbi->blocks_per_seg for better ra performance, so the result is:
>>
>> time mount -t f2fs -o nid_cache /dev/sde /mnt/f2fs/
>>
>> before:
>> real 0m0.204s
>> user 0m0.004s
>> sys  0m0.020s
>>
>> after:
>> real 0m3.792s
> 
> Oops, we can't accept this even only for 16GB, right? :(

Pengyang Hou helped test this patch on 64GB UFS; the mount time results are:

Before: 110 ms
After:  770 ms

These test results show that we had better not enable the nid_cache option by
default upstream, since it clearly slows down the mount procedure, but users
can still decide whether to use it depending on their requirements, e.g.:
a. For the read-only case, this option is not needed at all.
b. For batch node allocation/deletion, this option is recommended.

> 
>> user 0m0.000s
>> sys  0m0.140s
>>
>> b. memory consumption
>>
>> For 16GB size image, there is total 34 NAT pages, so memory footprint is:
>> 34 / 2 * 512 * 455 / 8 = 495040 bytes = 483.4 KB
>>
>> Increasing of memory footprint is linear with total user valid blocks in 
>> image,
>> and at most it will eat 3900 * 8 * 455 / 8 = 1774500 bytes = 1732.9 KB
> 
> How about adding two bitmaps for whole NAT pages and storing the bitmaps in
> checkpoint pack, which needs at most two blocks additionally?
> 
> 1. full-assigned NAT bitmap, where 1 means there is no free nids.
> 2. empty NAT bitmap, where 1 means whole there-in nids are free.
> 
> With these bitmaps, build_free_nids() can scan from 0'th NAT block by:
> 
>   if (full-assigned NAT)
>           skip;
>   else if (empty NAT)
>           add_free_nid(all);
>   else
>           read NAT page and add_free_nid();
> 
> The flush_nat_entries() has to change its bitmaps accordingly.
> 
> With this approach, I expect we can reuse nids as much as possible while
> getting cached NAT pages more effectively.

Good idea! :)

Another approach, which does not need to change the disk layout, is:

We can allocate a free_nid_bitmap[NAT_BLOCKS_COUNT][455] array, where each bitmap
indicates the usage of free nids in one NAT block, and introduce another
nat_block_bitmap[NAT_BLOCKS_COUNT] to indicate whether each NAT block is loaded
or not. If it is loaded, we can do the lookup in free_nid_bitmap directly. So I
expect that we will load each NAT block from disk at most once. It will:
- not increase mount latency
- after loading NAT blocks from disk, build their bitmaps in memory to
reduce lookup time the second time
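
A minimal userspace sketch of the lazily-loaded bitmap idea described above,
not actual f2fs code. It assumes 455 nids per NAT block as in this thread;
NAT_BLOCKS_COUNT is set to a small illustrative value, the per-block bitmap is
stored as packed bytes, the "loaded" flags are a plain bool array rather than a
bitmap, and load_nat_block() is a hypothetical stand-in for reading a NAT block
from disk (it just pretends odd offsets are free).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NAT_ENTRY_PER_BLOCK 455   /* nids per NAT block, as in the thread */
#define NAT_BLOCKS_COUNT    8     /* illustrative value only */
#define BITMAP_BYTES        ((NAT_ENTRY_PER_BLOCK + 7) / 8)

/* One bit per nid: 1 = free, 0 = in use. */
static uint8_t free_nid_bitmap[NAT_BLOCKS_COUNT][BITMAP_BYTES];
/* One flag per NAT block: has its bitmap been built yet? */
static bool nat_block_loaded[NAT_BLOCKS_COUNT];
/* Counts simulated disk reads so the "at most once" property is visible. */
static unsigned int disk_reads;

/*
 * Hypothetical stand-in for reading one NAT block from disk:
 * pretend every even offset is in use and every odd offset is free.
 */
static void load_nat_block(unsigned int blk)
{
	unsigned int off;

	disk_reads++;
	memset(free_nid_bitmap[blk], 0, BITMAP_BYTES);
	for (off = 1; off < NAT_ENTRY_PER_BLOCK; off += 2)
		free_nid_bitmap[blk][off / 8] |= 1u << (off % 8);
	nat_block_loaded[blk] = true;
}

/* Returns true if nid is free; loads the owning NAT block on first use. */
static bool nid_is_free(unsigned int nid)
{
	unsigned int blk = nid / NAT_ENTRY_PER_BLOCK;
	unsigned int off = nid % NAT_ENTRY_PER_BLOCK;

	if (blk >= NAT_BLOCKS_COUNT)
		return false;
	if (!nat_block_loaded[blk])
		load_nat_block(blk);
	return (free_nid_bitmap[blk][off / 8] & (1u << (off % 8))) != 0;
}
```

Nothing is read at "mount" time in this sketch; the first nid_is_free() call
for a block pays for one load, and later lookups in the same block only touch
the in-memory bitmap.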

Thoughts? Which one is preferred?

Thanks,

> 
> Thanks,
> 
>>
>> Thanks,
>>
>>>
>>>> IMO, if those do not raise huge concerns, we would be able to consider
>>>> just replacing current free nid list with this bitmap.
>>>
>>> Yup, I agree with you.
>>>
>>> Thanks,
>>>
> 
> .
> 



[PATCH] staging: rtl8192u: Fix RETURN_VOID warnings

2017-02-10 Thread simran singhal
Fix 'void function return statements are not generally useful'
checkpatch.pl warnings.

Signed-off-by: simran singhal 
---
 drivers/staging/rtl8192u/ieee80211/rtl819x_BAProc.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/staging/rtl8192u/ieee80211/rtl819x_BAProc.c 
b/drivers/staging/rtl8192u/ieee80211/rtl819x_BAProc.c
index c09f3ad..f02eb8e 100644
--- a/drivers/staging/rtl8192u/ieee80211/rtl819x_BAProc.c
+++ b/drivers/staging/rtl8192u/ieee80211/rtl819x_BAProc.c
@@ -258,7 +258,6 @@ static void ieee80211_send_ADDBAReq(struct ieee80211_device 
*ieee,
else {
IEEE80211_DEBUG(IEEE80211_DL_ERR, "alloc skb error in function 
%s()\n", __func__);
}
-   return;
 }
 
 
/
@@ -308,7 +307,6 @@ static void ieee80211_send_DELBA(struct ieee80211_device 
*ieee, u8 *dst,
else {
IEEE80211_DEBUG(IEEE80211_DL_ERR, "alloc skb error in function 
%s()\n", __func__);
}
-   return ;
 }
 
 
/
@@ -708,5 +706,4 @@ void RxBaInactTimeout(unsigned long data)
&pRxTs->RxAdmittedBARecord,
RX_DIR,
DELBA_REASON_TIMEOUT);
-   return ;
 }
-- 
2.7.4



[PATCH] staging: rtl8192u: Fix brace placement

2017-02-10 Thread simran singhal
Fix brace placement errors caught by checkpatch.pl ERROR: that open
brace { should be on the previous line

Signed-off-by: simran singhal 
---
 .../staging/rtl8192u/ieee80211/rtl819x_BAProc.c| 90 --
 1 file changed, 30 insertions(+), 60 deletions(-)

diff --git a/drivers/staging/rtl8192u/ieee80211/rtl819x_BAProc.c 
b/drivers/staging/rtl8192u/ieee80211/rtl819x_BAProc.c
index 91ea77d..c09f3ad 100644
--- a/drivers/staging/rtl8192u/ieee80211/rtl819x_BAProc.c
+++ b/drivers/staging/rtl8192u/ieee80211/rtl819x_BAProc.c
@@ -46,15 +46,13 @@ static u8 TxTsDeleteBA(struct ieee80211_device *ieee, 
PTX_TS_RECORD pTxTs)
u8  bSendDELBA = false;
 
// Delete pending BA
-   if (pPendingBa->bValid)
-   {
+   if (pPendingBa->bValid) {
DeActivateBAEntry(ieee, pPendingBa);
bSendDELBA = true;
}
 
// Delete admitted BA
-   if (pAdmittedBa->bValid)
-   {
+   if (pAdmittedBa->bValid) {
DeActivateBAEntry(ieee, pAdmittedBa);
bSendDELBA = true;
}
@@ -74,8 +72,7 @@ static u8 RxTsDeleteBA(struct ieee80211_device *ieee, 
PRX_TS_RECORD pRxTs)
PBA_RECORD  pBa = &pRxTs->RxAdmittedBARecord;
u8  bSendDELBA = false;
 
-   if (pBa->bValid)
-   {
+   if (pBa->bValid) {
DeActivateBAEntry(ieee, pBa);
bSendDELBA = true;
}
@@ -115,14 +112,12 @@ static struct sk_buff *ieee80211_ADDBA(struct 
ieee80211_device *ieee, u8 *Dst, P
u16 len = ieee->tx_headroom + 9;
//category(1) + action field(1) + Dialog Token(1) + BA Parameter Set(2) 
+  BA Timeout Value(2) +  BA Start SeqCtrl(2)(or StatusCode(2))
IEEE80211_DEBUG(IEEE80211_DL_TRACE | IEEE80211_DL_BA, ">%s(), 
frame(%d) sentd to:%pM, ieee->dev:%p\n", __func__, type, Dst, ieee->dev);
-   if (pBA == NULL)
-   {
+   if (pBA == NULL) {
IEEE80211_DEBUG(IEEE80211_DL_ERR, "pBA is NULL\n");
return NULL;
}
skb = dev_alloc_skb(len + sizeof( struct rtl_80211_hdr_3addr)); //need 
to add something others? FIXME
-   if (skb == NULL)
-   {
+   if (skb == NULL) {
IEEE80211_DEBUG(IEEE80211_DL_ERR, "can't alloc skb for 
ADDBA_REQ\n");
return NULL;
}
@@ -146,8 +141,7 @@ static struct sk_buff *ieee80211_ADDBA(struct 
ieee80211_device *ieee, u8 *Dst, P
// Dialog Token
*tag ++= pBA->DialogToken;
 
-   if (ACT_ADDBARSP == type)
-   {
+   if (ACT_ADDBARSP == type) {
// Status Code
printk("=>to send ADDBARSP\n");
 
@@ -163,8 +157,7 @@ static struct sk_buff *ieee80211_ADDBA(struct 
ieee80211_device *ieee, u8 *Dst, P
put_unaligned_le16(pBA->BaTimeoutValue, tag);
tag += 2;
 
-   if (ACT_ADDBAREQ == type)
-   {
+   if (ACT_ADDBAREQ == type) {
// BA Start SeqCtrl
memcpy(tag, (u8 *)&(pBA->BaStartSeqCtrl), 2);
tag += 2;
@@ -209,8 +202,7 @@ static struct sk_buff *ieee80211_DELBA(
DelbaParamSet.field.TID = pBA->BaParamSet.field.TID;
 
skb = dev_alloc_skb(len + sizeof( struct rtl_80211_hdr_3addr)); //need 
to add something others? FIXME
-   if (skb == NULL)
-   {
+   if (skb == NULL) {
IEEE80211_DEBUG(IEEE80211_DL_ERR, "can't alloc skb for 
ADDBA_REQ\n");
return NULL;
}
@@ -257,15 +249,13 @@ static void ieee80211_send_ADDBAReq(struct 
ieee80211_device *ieee,
struct sk_buff *skb;
skb = ieee80211_ADDBA(ieee, dst, pBA, 0, ACT_ADDBAREQ); //construct 
ACT_ADDBAREQ frames so set statuscode zero.
 
-   if (skb)
-   {
+   if (skb) {
softmac_mgmt_xmit(skb, ieee);
//add statistic needed here.
//and skb will be freed in softmac_mgmt_xmit(), so omit all 
dev_kfree_skb_any() outside softmac_mgmt_xmit()
//WB
}
-   else
-   {
+   else {
IEEE80211_DEBUG(IEEE80211_DL_ERR, "alloc skb error in function 
%s()\n", __func__);
}
return;
@@ -284,13 +274,11 @@ static void ieee80211_send_ADDBARsp(struct 
ieee80211_device *ieee, u8 *dst,
 {
struct sk_buff *skb;
skb = ieee80211_ADDBA(ieee, dst, pBA, StatusCode, ACT_ADDBARSP); 
//construct ACT_ADDBARSP frames
-   if (skb)
-   {
+   if (skb) {
softmac_mgmt_xmit(skb, ieee);
//same above
}
-   else
-   {
+   else {
IEEE80211_DEBUG(IEEE80211_DL_ERR, "alloc skb error in function 
%s()\n", __func__);
}
 
@@ -313,13 +301,11 @@ static void ieee80211_send_DELBA(struct ieee80211_device 
*ieee, u8 *dst,
 {
struct sk_buff *skb;
skb = ieee80211_DELBA(ieee, dst, pBA, TxRxSelect, ReasonCode); 
//construct ACT_ADDBARSP frames
-   if (skb)
-   {
+   if (skb) {
 

Mistake in include IS_ENABLED(CONFIG_LIVEPATCH)

2017-02-10 Thread Denys Fedoryshchenko

Hello,

I noticed that the livepatch sample is not working in 4.9.9, because the
include linux/livepatch.h has:
#if IS_ENABLED(CONFIG_LIVEPATCH)

while config option is:
CONFIG_HAVE_LIVEPATCH=y

After editing livepatch.h, the sample module compiles fine.

Probably that's just a typo?


[PATCH] staging: dgnc: dgnc_tty.c: fix argument list alignment issue.

2017-02-10 Thread Nathan Howard
Fix checkpatch.pl issue of the form:
"CHECK: Alignment should match open parenthesis".

Signed-off-by: Nathan Howard 
---
 drivers/staging/dgnc/dgnc_tty.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/dgnc/dgnc_tty.c b/drivers/staging/dgnc/dgnc_tty.c
index 1e10c0f..c63e591 100644
--- a/drivers/staging/dgnc/dgnc_tty.c
+++ b/drivers/staging/dgnc/dgnc_tty.c
@@ -971,9 +971,10 @@ static int dgnc_tty_open(struct tty_struct *tty, struct 
file *file)
 * touched safely, the close routine will signal the
 * ch_flags_wait to wake us back up.
 */
-   rc = wait_event_interruptible(ch->ch_flags_wait,
-   (((ch->ch_tun.un_flags |
-  ch->ch_pun.un_flags) & UN_CLOSING) == 0));
+   rc = wait_event_interruptible(
+   ch->ch_flags_wait,
+   (((ch->ch_tun.un_flags |
+   ch->ch_pun.un_flags) & UN_CLOSING) == 0));
 
/* If ret is non-zero, user ctrl-c'ed us */
if (rc)
@@ -1193,7 +1194,8 @@ static int dgnc_block_til_ready(struct tty_struct *tty,
 (old_flags != (ch->ch_tun.un_flags |
ch->ch_pun.un_flags)));
else
-   retval = wait_event_interruptible(ch->ch_flags_wait,
+   retval = wait_event_interruptible(
+   ch->ch_flags_wait,
(old_flags != ch->ch_flags));
 
/*
-- 
2.7.4



Re: [PATCH v2] arm64: dts: Enable ir-spi in the tm2 and tm2e boards

2017-02-10 Thread Andi Shyti
Hi Javier,

On Fri, Feb 10, 2017 at 11:04:50AM -0300, Javier Martinez Canillas wrote:
> On 02/09/2017 11:22 PM, Andi Shyti wrote:
...
> > +   irda_regulator: irda-regulator {
> > +   compatible = "regulator-fixed";
> > +   enable-active-high;
> > +   gpio = <&gpr3 3 GPIO_ACTIVE_HIGH>;
> > +   regulator-name = "irda_regulator";
> 
> How is this regulator named in the board schematics? My
> understanding is that regulator-name should match this.
> 
> I don't have access to this so it may be "irda_regulator"
> although I was expecting something more like "VDD_IRDA".

This is not a real regulator.

This is an external regulator which is enabled with a gpio
(GPR3[3]). The regulator-fixed allows me to use the regulator API
even though I would only need to control a gpio (with the gpio
API). I prefer using regulator to keep the same interface no
matter how the irda is connected.

About the name, I have full freedom to choose, as of course it's
not documented in the exynos5433 datasheet. Perhaps I could call
it irda-gpio-regulator to make it clearer?

> Patch looks good to me though:
> 
> Reviewed-by: Javier Martinez Canillas 

Thanks,
Andi


Re: module: Optimize search_module_extables()

2017-02-10 Thread Jessica Yu

+++ Peter Zijlstra [08/02/17 15:48 +0100]:


While looking through the __ex_table stuff I found that we do a linear
lookup of the module. Also fix up a comment.

Signed-off-by: Peter Zijlstra (Intel) 


Applied, thanks.

Hm. A quick scan through module.c still shows a couple of places that use
similar linear lookups, and may benefit from the same __module_address
optimization. But I'll save that for a separate patch..
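
A rough userspace sketch of the idea behind the optimization: first find the
one module whose address range contains the faulting address, then search only
that module's exception table, instead of walking every module. Everything
here is hypothetical (struct toy_module, toy_module_address(), the sorted
array); how __module_address() locates the module internally is not shown in
this thread, so binary search over a sorted array is used purely as a stand-in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct module. */
struct toy_module {
	unsigned long base;            /* start of the module's text */
	unsigned long size;            /* length of the module's text */
	const unsigned long *extable;  /* sorted faulting addresses */
	size_t num_exentries;
};

/*
 * Find the module whose [base, base + size) range contains addr.
 * mods[] must be sorted by base and must not overlap.
 */
static const struct toy_module *
toy_module_address(const struct toy_module *mods, size_t n, unsigned long addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < mods[mid].base)
			hi = mid;
		else if (addr - mods[mid].base >= mods[mid].size)
			lo = mid + 1;
		else
			return &mods[mid];
	}
	return NULL;
}

/* Search only the owning module's table, mirroring the shape of the patch. */
static const unsigned long *
toy_search_module_extables(const struct toy_module *mods, size_t n,
			   unsigned long addr)
{
	const struct toy_module *mod = toy_module_address(mods, n, addr);
	size_t lo = 0, hi;

	if (!mod || !mod->num_exentries)
		return NULL;

	hi = mod->num_exentries;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (mod->extable[mid] == addr)
			return &mod->extable[mid];
		if (mod->extable[mid] < addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}
```

An address outside every module returns NULL without touching any exception
table, which is the case the old per-module loop handled least efficiently.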

Jessica


---
kernel/module.c | 27 ++-
1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 3d8f126208e3..7bcdc35dbf95 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4165,22 +4165,23 @@ const struct exception_table_entry 
*search_module_extables(unsigned long addr)
struct module *mod;

preempt_disable();
-   list_for_each_entry_rcu(mod, &modules, list) {
-   if (mod->state == MODULE_STATE_UNFORMED)
-   continue;
-   if (mod->num_exentries == 0)
-   continue;
+   mod = __module_address(addr);
+   if (!mod)
+   goto out;

-   e = search_extable(mod->extable,
-  mod->extable + mod->num_exentries - 1,
-  addr);
-   if (e)
-   break;
-   }
+   if (!mod->num_exentries)
+   goto out;
+
+   e = search_extable(mod->extable,
+  mod->extable + mod->num_exentries - 1,
+  addr);
+out:
preempt_enable();

-   /* Now, if we found one, we are running inside it now, hence
-  we cannot unload the module, hence no refcnt needed. */
+   /*
+* Now, if we found one, we are running inside it now, hence
+* we cannot unload the module, hence no refcnt needed.
+*/
return e;
}



[PATCH] staging: greybus: arpc.h: remove duplicate line.

2017-02-10 Thread Nathan Howard
Fix checkpatch.pl issue of the form:
"CHECK: Please don't use multiple blank lines".

Signed-off-by: Nathan Howard 
---
 drivers/staging/greybus/arpc.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/staging/greybus/arpc.h b/drivers/staging/greybus/arpc.h
index 7fbddfc..c0b63c0 100644
--- a/drivers/staging/greybus/arpc.h
+++ b/drivers/staging/greybus/arpc.h
@@ -74,7 +74,6 @@ struct arpc_response_message {
__u8result; /* Result of RPC */
 } __packed;
 
-
 /* ARPC requests */
 #define ARPC_TYPE_CPORT_CONNECTED  0x01
 #define ARPC_TYPE_CPORT_QUIESCE0x02
-- 
2.7.4



Re: [PATCH] staging: rtl8192u: Removing multiple blank lines

2017-02-10 Thread SIMRAN SINGHAL
Multiple patches ...?
Can you please clarify which patches you mean by "multiple patches"?

The order to apply them in is the order in which I submitted them.

On Sat, Feb 11, 2017 at 7:48 AM, SIMRAN SINGHAL
 wrote:
> Multiple patches ...?
> Can you please clarify what all patches you are including in "Multiple
> Patches".
>
> And the Order you should go for is the order in which I submitted them.
>
> On Feb 10, 2017 19:34, "Greg KH"  wrote:
>
> On Thu, Feb 09, 2017 at 06:02:12PM +0530, simran singhal wrote:
>> This patch fixes the checkpatch warning by removing multiple blank
>> lines.
>> CHECK: Please don't use multiple blank lines
>>
>> Signed-off-by: simran singhal 
>> ---
>>  drivers/staging/rtl8192u/ieee80211/ieee80211_crypt_ccmp.c | 12
>> 
>>  1 file changed, 12 deletions(-)
>
> Hi,
>
> This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
> a patch that has triggered this response.  He used to manually respond
> to these common problems, but in order to save his sanity (he kept
> writing the same thing over and over, yet to different people), I was
> created.  Hopefully you will not take offence and will fix the problem
> in your patch and resubmit it so that it can be accepted into the Linux
> kernel tree.
>
> You are receiving this message because of the following common error(s)
> as indicated below:
>
> - You sent multiple patches, yet no indication of which ones should be
>   applied in which order.  Greg could just guess, but if you are
>   receiving this email, he guessed wrong and the patches didn't apply.
>   Please read the section entitled "The canonical patch format" in the
>   kernel file, Documentation/SubmittingPatches for a description of how
>   to do this so that Greg has a chance to apply these correctly.
>
> If you wish to discuss this problem further, or you have questions about
> how to resolve this issue, please feel free to respond to this email and
> Greg will reply once he has dug out from the pending patches received
> from other developers.
>
> thanks,
>
> greg k-h's patch email bot
>
>


[PATCH] block/loop: fix race between I/O and set_status

2017-02-10 Thread Ming Lei
Inside set_status, the transfer needs to be set up again, so
we have to drain I/O before the transition; otherwise an
oops like the following may be triggered:

divide error:  [#1] SMP KASAN
CPU: 0 PID: 2935 Comm: loop7 Not tainted 4.10.0-rc7+ #213
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
01/01/2011
task: 88006ba1e840 task.stack: 880067338000
RIP: 0010:transfer_xor+0x1d1/0x440 drivers/block/loop.c:110
RSP: 0018:88006733f108 EFLAGS: 00010246
RAX:  RBX: 8800688d7000 RCX: 0059
RDX:  RSI: 11000d743f43 RDI: 880068891c08
RBP: 88006733f160 R08: 8800688d7001 R09: 
R10:  R11:  R12: 8800688d7000
R13: 880067b7d000 R14: dc00 R15: 
FS:  () GS:88006d00()
knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 006c17e0 CR3: 66e3b000 CR4: 001406f0
Call Trace:
 lo_do_transfer drivers/block/loop.c:251 [inline]
 lo_read_transfer drivers/block/loop.c:392 [inline]
 do_req_filebacked drivers/block/loop.c:541 [inline]
 loop_handle_cmd drivers/block/loop.c:1677 [inline]
 loop_queue_work+0xda0/0x49b0 drivers/block/loop.c:1689
 kthread_worker_fn+0x4c3/0xa30 kernel/kthread.c:630
 kthread+0x326/0x3f0 kernel/kthread.c:227
 ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
Code: 03 83 e2 07 41 29 df 42 0f b6 04 30 4d 8d 44 24 01 38 d0 7f 08
84 c0 0f 85 62 02 00 00 44 89 f8 41 0f b6 48 ff 25 ff 01 00 00 99 
7d c8 48 63 d2 48 03 55 d0 48 89 d0 48 89 d7 48 c1 e8 03 83
RIP: transfer_xor+0x1d1/0x440 drivers/block/loop.c:110 RSP:
88006733f108
---[ end trace 0166f7bd3b0c0933 ]---

Reported-by: Dmitry Vyukov 
Cc: sta...@vger.kernel.org
Signed-off-by: Ming Lei 
---
 drivers/block/loop.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index ed5259510857..4b52a1690329 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1097,9 +1097,12 @@ loop_set_status(struct loop_device *lo, const struct 
loop_info64 *info)
if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE)
return -EINVAL;
 
+   /* I/O need to be drained during transfer transition */
+   blk_mq_freeze_queue(lo->lo_queue);
+
err = loop_release_xfer(lo);
if (err)
-   return err;
+   goto exit;
 
if (info->lo_encrypt_type) {
unsigned int type = info->lo_encrypt_type;
@@ -1114,12 +1117,14 @@ loop_set_status(struct loop_device *lo, const struct 
loop_info64 *info)
 
err = loop_init_xfer(lo, xfer, info);
if (err)
-   return err;
+   goto exit;
 
if (lo->lo_offset != info->lo_offset ||
lo->lo_sizelimit != info->lo_sizelimit)
-   if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit))
-   return -EFBIG;
+   if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit)) {
+   err = -EFBIG;
+   goto exit;
+   }
 
loop_config_discard(lo);
 
@@ -1156,7 +1161,9 @@ loop_set_status(struct loop_device *lo, const struct 
loop_info64 *info)
/* update dio if lo_offset or transfer is changed */
__loop_update_dio(lo, lo->use_dio);
 
-   return 0;
+ exit:
+   blk_mq_unfreeze_queue(lo->lo_queue);
+   return err;
 }
 
 static int
-- 
2.7.4
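
The control flow of the fix can be sketched in userspace as follows. This is
only an illustration of the "freeze, reconfigure, always unfreeze through one
exit label" shape; struct toy_queue, struct toy_loop, toy_set_status() and the
fail_step parameter are all hypothetical and do not correspond to the real
block layer API.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a request queue: just a freeze counter. */
struct toy_queue {
	int freeze_depth;        /* > 0 means no new I/O may enter */
};

static void toy_freeze_queue(struct toy_queue *q)
{
	q->freeze_depth++;
}

static void toy_unfreeze_queue(struct toy_queue *q)
{
	q->freeze_depth--;
}

struct toy_loop {
	struct toy_queue queue;
	unsigned int key_size;   /* transfer state that the I/O path reads */
};

/*
 * Reconfigure the transfer while the queue is frozen. Every path after the
 * freeze leaves through "exit", so the queue is unfrozen on errors as well.
 * fail_step simulates a failure at step 1 or 2 (0 means success).
 */
static int toy_set_status(struct toy_loop *lo, unsigned int new_key_size,
			  int fail_step)
{
	int err = 0;

	toy_freeze_queue(&lo->queue);

	if (fail_step == 1) {    /* e.g. releasing the old transfer failed */
		err = -EINVAL;
		goto exit;
	}

	lo->key_size = new_key_size;

	if (fail_step == 2)      /* e.g. the new size does not fit */
		err = -EFBIG;

exit:
	toy_unfreeze_queue(&lo->queue);
	return err;
}
```

Replacing each early "return err" with "goto exit" is what keeps the
freeze/unfreeze calls balanced; an early return after the freeze would leave
the queue frozen.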



Re: [PATCH] checkpatch: add warning on %pk instead of %pK usage

2017-02-10 Thread Joe Perches
On Sat, 2017-02-11 at 01:32 +, Roberts, William C wrote:
> 
> > > By "normal" I'm referring to things that call into pointer(), just
> > > casually looking I see bstr_printf vsnprintf kvasprintf, which would
> > > be easy enough to add
> > > 
> > > > What do you think is missing?  sn?printf ? That's easy to add.
> > > 
> > > The problem starts to get hairy when we think of how often folks roll
> > > their own logging macros (see some small sampling at the end).
> > > 
> > > I think we would want to add DEBUG DBG and sn?printf and maybe
> > > consider dropping the \b on the regex so it's a bit more matchy but
> > > still shouldn't end up matching on any ASM as you pointed out in the V2 
> > > nack.
> > > 
> > > Ill break this down into:
> > > 1. the patch as I know you'll take it, as you wrote it :-P 2. Adding
> > > to the logging macros 3. exploring making it less matchy
> 
> -Kees and Andrew they likely don't care about the rest of this...
> 
> I have been working up a regex (I suck at these) to match C functions that 
> have an invalid
> %p format string and take arguments:
> http://www.regexr.com/3f92k
> 
> This could be a way to get better coverage in a more generic approach, 
> thoughts?

Maybe this: (attached too because Evolution is a bad email client)

It's still kind of hacky, but it does find multiple line
statements like:

+   printf(KERN_INFO
+  "a %pX",
+  foo);
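
For readers who do not read Perl regexes fluently, here is a small C sketch of
what the format-string check amounts to once a quoted string has been
extracted. It is an approximation written for illustration, not a translation
of checkpatch: find_bad_p_extension() is a hypothetical name, and the list of
accepted letters is copied from the character class in the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Extension letters accepted after %p, copied from the patch's regex. */
static const char valid_ext[] = "FfSsBKRraEhMmIiUDdgVCbGN";

/*
 * Returns a pointer to the offending character following "%p",
 * or NULL if no invalid %p extension is found in fmt.
 */
static const char *find_bad_p_extension(const char *fmt)
{
	const char *s = fmt;

	while ((s = strchr(s, '%')) != NULL) {
		s++;
		if (*s == '%') {         /* literal "%%", skip it */
			s++;
			continue;
		}
		/* skip width/precision characters: '*', digits, '.' */
		while (*s == '*' || *s == '.' || isdigit((unsigned char)*s))
			s++;
		if (*s != 'p')
			continue;
		s++;
		/* a non-word character (or end of string) ends a plain %p */
		if (*s == '\0' || !(isalnum((unsigned char)*s) || *s == '_'))
			continue;
		if (!strchr(valid_ext, *s))
			return s;
	}
	return NULL;
}
```

With this sketch, "%pK" and a bare "%p" pass, while "%pk" and "%pX" are
reported, which is the behavior the commit message asks for.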

---
Subject: [PATCH] checkpatch: Add ability to find bad uses of vsprintf %p 
extensions

%pK was at least once misused as %pk in an out-of-tree module.
This led to some security concerns.  Add the ability to track
single and multiple line statements for misuses of %p.

Signed-off-by: Joe Perches 
---
 scripts/checkpatch.pl | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index ad5ea5c545b2..0eaf6b8580d6 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -5676,6 +5676,32 @@ sub process {
}
}
 
+   # check for vsprintf extension %p misuses
+   if ($^V && $^V ge 5.10.0 &&
+   defined $stat &&
+   $stat =~ /^\+(?![^\{]*\{\s*).*\b(\w+)\s*\(.*$String\s*,/s &&
+   $1 !~ /^_*volatile_*$/) {
+   my $bad_extension = "";
+   my $lc = $stat =~ tr@\n@@;
+   $lc = $lc + $linenr;
+   for (my $count = $linenr; $count <= $lc; $count++) {
+   my $fmt = get_quoted_string($lines[$count - 1], 
raw_line($count, 0));
+   $fmt =~ s/%%//g;
+   if ($fmt =~ 
/(\%[\*\d\.]*p(?![\WFfSsBKRraEhMmIiUDdgVCbGN]).)/) {
+   $bad_extension = $1;
+   last;
+   }
+   }
+   if ($bad_extension ne "") {
+   my $stat_real = raw_line($linenr, 0);
+   for (my $count = $linenr + 1; $count <= $lc; 
$count++) {
+   $stat_real = $stat_real . "\n" . 
raw_line($count, 0);
+   }
+   WARN("VSPRINTF_POINTER_EXTENSION",
+"Invalid vsprintf pointer extension 
'$bad_extension'\n" . "$here\n$stat_real\n");
+   }
+   }
+
 # Check for misused memsets
if ($^V && $^V ge 5.10.0 &&
defined $stat &&
-- 
From 3bd6868711efeb587c5c48e060c415a150fccaca Mon Sep 17 00:00:00 2001
Message-Id: <3bd6868711efeb587c5c48e060c415a150fccaca.1486783224.git@perches.com>
From: Joe Perches 
Date: Fri, 10 Feb 2017 19:17:42 -0800
Subject: [PATCH] checkpatch: Add ability to find bad uses of vsprintf %p
 extensions

%pK was at least once misused as %pk in an out-of-tree module.
This led to some security concerns.  Add the ability to track
single and multiple line statements for misuses of %p.

Signed-off-by: Joe Perches 
---
 scripts/checkpatch.pl | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index ad5ea5c545b2..0eaf6b8580d6 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -5676,7 +5676,32 @@ sub process {
 			}
 		}
 
+		# check for vsprintf extension %p misuses
+		if ($^V && $^V ge 5.10.0 &&
+		defined $stat &&
+		$stat =~ /^\+(?![^\{]*\{\s*).*\b(\w+)\s*\(.*$String\s*,/s &&
+		$1 !~ /^_*volatile_*$/) {
+			my $bad_extension = "";
+			my $lc = $stat =~ tr@\n@@;
+			$lc = $lc + $linenr;
+		for (my $count = $linenr; $count <= $lc; $count++) {
+my $fmt = get_quoted_string($lines[$count - 1], raw_line($count, 0));
+$fmt =~ s/%%//g;
+if ($fmt =~ /(\%[\*\d\.]*p(?![\WFfSsBKRraEhMmIiUDdgVCbGN]).)

Re: [PATCH 3/3] DT: add Faraday Tec. as vendor

2017-02-10 Thread Joel Stanley
On Fri, Feb 10, 2017 at 11:46 PM, Linus Walleij
 wrote:
> On Wed, Feb 8, 2017 at 9:00 PM, Hans Ulli Kroll
>  wrote:
>
>> add Faraday Technology Corporation as vendor faraday for DT
>>
>> Signed-off-by: Hans Ulli Kroll 
>
> Reviewed-by: Linus Walleij 
>
> I think I should use this for the PCI block as well, looking over some
> code and the root hub is using Faraday's PCI ID.

Acked-by: Joel Stanley 

This string is already used by the ftgmac100 Ethernet driver. Thanks
for adding it in.

Cheers,

Joel


Re: [PATCH v4] drivers/misc: Add Aspeed LPC control driver

2017-02-10 Thread Joel Stanley
Hey Greg,

On Sat, Feb 11, 2017 at 1:00 AM, Greg KH  wrote:
> On Wed, Feb 08, 2017 at 10:42:47AM +1100, Cyril Bur wrote:
>> In order to manage server systems, there is typically another processor
>> known as a BMC (Baseboard Management Controller) which is responsible
>> for powering the server and other various elements, sometimes fans,
>> often the system flash.
>
> Without some other reviewed-by: or at least tested-by lines here, I'm
> not going to take this.  Go poke your fellow ppc people to do some work
> here, it shouldn't be up to me to do it for them :(

We're on it. Thanks for your review so far.

By the way, this is a driver for an ARM SoC, not a PPC chip.
Nevertheless I'm sure those PPC people will still give us a review or
two.

Cheers,

Joel


Re: [GIT PULL] PCI fixes for v4.10

2017-02-10 Thread Yinghai Lu
On Thu, Feb 9, 2017 at 12:11 PM, Bjorn Helgaas  wrote:
> On Thu, Feb 09, 2017 at 09:09:50AM -0600, Bjorn Helgaas wrote:
>> [+cc Ashok, Keith]
>>
>> On Thu, Feb 09, 2017 at 05:06:48AM +0100, Lukas Wunner wrote:
>> > On Wed, Feb 08, 2017 at 01:22:56PM -0600, Bjorn Helgaas wrote:
>> > > Bjorn Helgaas (1):
>> > >   Revert "PCI: pciehp: Add runtime PM support for PCIe hotplug ports"
>> >
>> > What's the rationale for reverting this?
>> >
>> > You've received patches to fix the issue on both affected machines,
>> > so a revert seems unnecessary:
>> >
>> > https://patchwork.kernel.org/patch/9557113/
>> > https://patchwork.kernel.org/patch/9562007/
>>
>> I don't think we've gotten to the root cause of the problem yet,
>> and I don't want to throw in fixes at the last minute without a better
>> understanding of it.
>>
>> PCIe hotplug hardware is not very complicated, it hasn't changed in
>> many years, and at least for the Intel hardware in question, is
>> generally pretty well-tested with Windows.  So I want to be careful
>> about asserting that this new piece of hardware is broken.
>
> I apologize: I had quirks on the brain, but neither of the patches
> above is device-specific.  So neither is claiming broken hardware.
>
> However, 9557113 claims we get unwanted PME interrupts if the slot is
> occupied when we suspend to D3hot.  This is what I want to explore
> further, because that hardware behavior doesn't really make sense to
> me.
>
> 9562007 apparently fixes something, but at this point it's a debugging
> patch (no changelog or signed-off-by) so not a candidate for tossing
> into v4.10 at this late date.

Agreed. It needs more test coverage.

I found more problems.

Actually, we don't need 9557113: even with it, we still saw a link up
when powering off slots with some cards.

Please check the updated version of 9562007, which fixes the power on/off
link up problem.

Ashok,

Could you ask your QA guys to check with only the attached patch and commit 68db9bc?

Thanks

Yinghai
Subject:[PATCH v2] PCI, pciechp: Only power on/off slots when it is D0
Powering on via /sys has a problem:
sca05-0a81fd7f:~ # echo 1 > /sys/bus/pci/slots/7/power
[  300.949937] pci_hotplug: power_write_file: power = 1
[  300.955502] pciehp :73:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 17f1
[  300.982557] pciehp :73:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[  300.991171] pciehp :73:00.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
[  301.33] pciehp :73:00.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[  301.009274] pciehp :73:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[  301.662172] pciehp :73:00.0:pcie004: pciehp_check_link_active: lnk_status = f083
[  301.670827] pciehp :73:00.0:pcie004: pending interrupts 0x0108 from Slot Status
[  301.679376] pciehp :73:00.0:pcie004: Slot(7): Link Up
[  301.685463] pciehp :73:00.0:pcie004: Slot(7): Link Up event ignored; already powering on
[  301.685508] pciehp :73:00.0:pcie004: pciehp_check_link_active: lnk_status = f083
[  302.005967] pciehp :73:00.0:pcie004: pciehp_check_link_status: lnk_status = f083
[  302.014859] pci :74:00.0: [15b3:1003] type 00 class 0x0c0600

Another slot with a different card still shows an extra link up on power off,
even with the can_wake patch.

sca05-0a81fd7f:~ # echo 0 > /sys/bus/pci/slots/1/power 
[ 6116.873632] pci_hotplug: power_write_file: power = 0
[ 6116.879198] pciehp :16:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 11f1
[ 6116.888730] pciehp :16:00.0:pcie004: pciehp_unconfigure_device: domain:bus:dev = :17:00
[ 6116.898464] pci :17:00.0: PME# disabled
[ 6116.903541] pci :17:00.0: freeing pci_dev info
[ 6116.909662] pciehp :16:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 6116.918277] pciehp :16:00.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
[ 6116.982048] pciehp :16:00.0:pcie004: pending interrupts 0x0108 from Slot Status
[ 6116.990608] pciehp :16:00.0:pcie004: Slot(1): Link Down
[ 6116.996876] pciehp :16:00.0:pcie004: Slot(1): Link Down event ignored; already powering off
[ 6117.961521] pciehp :16:00.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
[ 6117.970575] pciehp :16:00.0:pcie004: pending interrupts 0x0018 from Slot Status
[ 6117.970581] pciehp :16:00.0:pcie004: Slot(1): Card present
[ 6117.985660] pciehp :16:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 17f1
[ 6117.995825] pciehp :16:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 6118.005489] pciehp :16:00.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
[ 6118.014628] pciehp :16:00.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[ 6118.023880] pciehp :16:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 6118.602855] pciehp :16:00.0:pcie004: pciehp_check_link_active: lnk_status = f103
[ 6118.611507] pciehp :

[PATCH v6 2/8] devicetree: property-units: Add uWh and uAh units

2017-02-10 Thread Liam Breck
From: Matt Ranostay 

Add entries for microwatt-hours and microamp-hours.

Cc: Rob Herring 
Cc: Mark Rutland 
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Matt Ranostay 
Signed-off-by: Liam Breck 
Acked-by: Sebastian Reichel 
Acked-by: Rob Herring 
---
 Documentation/devicetree/bindings/property-units.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/property-units.txt 
b/Documentation/devicetree/bindings/property-units.txt
index 12278d7..0849618 100644
--- a/Documentation/devicetree/bindings/property-units.txt
+++ b/Documentation/devicetree/bindings/property-units.txt
@@ -25,8 +25,10 @@ Distance
 Electricity
 
 -microamp  : micro amps
+-microamp-hours : micro amp-hours
 -ohms  : Ohms
 -micro-ohms: micro Ohms
+-microwatt-hours: micro Watt-hours
 -microvolt : micro volts
 
 Temperature
-- 
2.9.3



Re: Is it really safe to use workqueues to drive expedited grace periods?

2017-02-10 Thread Tejun Heo
Hello, Paul.

On Fri, Feb 10, 2017 at 01:21:58PM -0800, Paul E. McKenney wrote:
> So RCU's expedited grace periods have been using workqueues for a
> little while, and things seem to be working.  But as usual, I worry...
> Is this use subject to some sort of deadlock where RCU's workqueue cannot
> start running until after a grace period completes, but that grace
> period is the one needing the workqueue?  Note that there are ways to
> set up your kernel so that all RCU grace periods are expedited.
> 
> Should I be worried?  If not, what prevents this from being a problem,
> especially given that workqueue handlers are allowed to wait for RCU
> grace periods to complete?

A per-cpu (normal) workqueue's concurrency is regulated automatically
so that there is at least one worker running for the worker pool on a
given CPU.

Let's say there are two work items queued on a workqueue.  The first
one is something which will do synchronize_rcu() and the second is the
expedited grace period work item.  When the first one runs
synchronize_rcu(), it'd block.  If there are no other work items
running at the time, workqueue will dispatch another worker so that
there's at least one actively running, which in this case will be the
expedited rcu grace period work item.

The dispatching of a new worker can be delayed by two things - memory
pressure preventing creation of a new worker and the workqueue hitting
maximum concurrency limit.

If expedited RCU grace period is something that memory reclaim path
may depend on, the workqueue that it executes on should have
WQ_MEM_RECLAIM set, which will guarantee that there's at least one
worker (across all CPUs) which is ready to serve the work items on
that workqueue regardless of memory pressure.

The latter, concurrency limit, would only matter if the RCU work items
use system_wq.  system_wq's concurrency limit is very high (512 per
CPU), but it is theoretically possible to fill all up with work items
doing synchronize_rcu() with the expedited RCU work item scheduled
behind it.  The system would already be in a very messed up state
outside the RCU situation tho.

Thanks.

-- 
tejun


Re: [RFC PATCH 2/2] mm/sparse: add last_section_nr in sparse_init() to reduce some iteration cycle

2017-02-10 Thread Tejun Heo
Hello,

On Sat, Feb 11, 2017 at 10:18:29AM +0800, Wei Yang wrote:
> During sparse_init(), the code iterates over every possible section. On x86_64,
> that is always (2^19) sections even when there is not much memory. For example, a
> typical 4G machine has only (2^5) to (2^6) present sections. This change
> benefits systems with less memory more.
> 
> This patch calculates the last section number from the highest pfn and uses
> this as the boundary of iteration.

* How much does this actually matter?  Can you measure the impact?

* Do we really need to add full reverse iterator to just get the
  highest section number?

Thanks.

-- 
tejun


[RFC PATCH 2/2] mm/sparse: add last_section_nr in sparse_init() to reduce some iteration cycle

2017-02-10 Thread Wei Yang
During sparse_init(), the code iterates over every possible section. On x86_64,
that is always (2^19) sections even when there is not much memory. For example, a
typical 4G machine has only (2^5) to (2^6) present sections. This change
benefits systems with less memory more.

This patch calculates the last section number from the highest pfn and uses
this as the boundary of iteration.

Signed-off-by: Wei Yang 
---
 mm/sparse.c | 32 +---
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/mm/sparse.c b/mm/sparse.c
index 1e168bf2779a..d72f390d9e61 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -468,18 +468,20 @@ void __weak __meminit vmemmap_populate_print_last(void)
 
 /**
  *  alloc_usemap_and_memmap - memory alloction for pageblock flags and vmemmap
- *  @map: usemap_map for pageblock flags or mmap_map for vmemmap
+ *  @data: usemap_map for pageblock flags or mmap_map for vmemmap
  */
 static void __init alloc_usemap_and_memmap(void (*alloc_func)
(void *, unsigned long, unsigned long,
-   unsigned long, int), void *data)
+   unsigned long, int),
+   void *data,
+   unsigned long last_section_nr)
 {
unsigned long pnum;
unsigned long map_count;
int nodeid_begin = 0;
unsigned long pnum_begin = 0;
 
-   for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
+   for (pnum = 0; pnum <= last_section_nr; pnum++) {
struct mem_section *ms;
 
if (!present_section_nr(pnum))
@@ -490,7 +492,7 @@ static void __init alloc_usemap_and_memmap(void (*alloc_func)
break;
}
map_count = 1;
-   for (pnum = pnum_begin + 1; pnum < NR_MEM_SECTIONS; pnum++) {
+   for (pnum = pnum_begin + 1; pnum <= last_section_nr; pnum++) {
struct mem_section *ms;
int nodeid;
 
@@ -503,16 +505,14 @@ static void __init alloc_usemap_and_memmap(void (*alloc_func)
continue;
}
/* ok, we need to take cake of from pnum_begin to pnum - 1*/
-   alloc_func(data, pnum_begin, pnum,
-   map_count, nodeid_begin);
+   alloc_func(data, pnum_begin, pnum, map_count, nodeid_begin);
/* new start, update count etc*/
nodeid_begin = nodeid;
pnum_begin = pnum;
map_count = 1;
}
/* ok, last chunk */
-   alloc_func(data, pnum_begin, NR_MEM_SECTIONS,
-   map_count, nodeid_begin);
+   alloc_func(data, pnum_begin, pnum, map_count, nodeid_begin);
 }
 
 /*
@@ -526,6 +526,9 @@ void __init sparse_init(void)
unsigned long *usemap;
unsigned long **usemap_map;
int size;
+   unsigned long last_section_nr;
+   int i;
+   unsigned long last_pfn = 0;
 #ifdef CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER
int size2;
struct page **map_map;
@@ -537,6 +540,11 @@ void __init sparse_init(void)
/* Setup pageblock_order for HUGETLB_PAGE_SIZE_VARIABLE */
set_pageblock_order();
 
+   for_each_mem_pfn_range_rev(i, NUMA_NO_NODE, NULL,
+   &last_pfn, NULL)
+   break;
+   last_section_nr = pfn_to_section_nr(last_pfn);
+
/*
 * map is using big page (aka 2M in x86 64 bit)
 * usemap is less one page (aka 24 bytes)
@@ -553,7 +561,8 @@ void __init sparse_init(void)
if (!usemap_map)
panic("can not allocate usemap_map\n");
alloc_usemap_and_memmap(sparse_early_usemaps_alloc_node,
-   (void *)usemap_map);
+   (void *)usemap_map,
+   last_section_nr);
 
 #ifdef CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER
size2 = sizeof(struct page *) * NR_MEM_SECTIONS;
@@ -561,10 +570,11 @@ void __init sparse_init(void)
if (!map_map)
panic("can not allocate map_map\n");
alloc_usemap_and_memmap(sparse_early_mem_maps_alloc_node,
-   (void *)map_map);
+   (void *)map_map,
+   last_section_nr);
 #endif
 
-   for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
+   for (pnum = 0; pnum <= last_section_nr; pnum++) {
if (!present_section_nr(pnum))
continue;
 
-- 
2.11.0
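
For readers following the numbers in the changelog above, here is a rough
userspace sketch of the arithmetic. It is not part of the patch. The
constants are assumed x86_64 values (4 KiB pages, 128 MiB sections, 46
physical address bits), and sections_to_scan() is a hypothetical helper that
only counts how many iterations a loop bounded by "pnum <= last_section_nr"
would make.

```c
#include <assert.h>

/* Assumed x86_64 values; other architectures and configs differ. */
#define PAGE_SHIFT        12	/* 4 KiB pages */
#define SECTION_SIZE_BITS 27	/* 128 MiB per section */
#define MAX_PHYSMEM_BITS  46
#define PFN_SECTION_SHIFT (SECTION_SIZE_BITS - PAGE_SHIFT)
#define NR_MEM_SECTIONS   (1UL << (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS))

/* Same shift the kernel helper of this name performs. */
static unsigned long pfn_to_section_nr(unsigned long pfn)
{
	return pfn >> PFN_SECTION_SHIFT;
}

/* Hypothetical helper: iterations of a loop "pnum <= last_section_nr". */
static unsigned long sections_to_scan(unsigned long last_pfn)
{
	unsigned long last_section_nr = pfn_to_section_nr(last_pfn);

	/* The bounded loop is only safe while this holds. */
	assert(last_section_nr < NR_MEM_SECTIONS);
	return last_section_nr + 1;
}
```

With memory ending at 4 GiB the highest pfn is 1 << 20, which lands in
section 32, so the bounded loop visits 33 sections instead of all 2^19.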



[RFC PATCH 1/2] mm/memblock: introduce for_each_mem_pfn_range_rev()

2017-02-10 Thread Wei Yang
This patch introduces the helper function for_each_mem_pfn_range_rev() for
later use.

Signed-off-by: Wei Yang 
---
 include/linux/memblock.h | 18 ++
 mm/memblock.c| 39 ++-
 2 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 5b759c9acf97..87a0ebe18606 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -203,6 +203,8 @@ int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn,
unsigned long  *end_pfn);
 void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
  unsigned long *out_end_pfn, int *out_nid);
+void __next_mem_pfn_range_rev(int *idx, int nid, unsigned long *out_start_pfn,
+ unsigned long *out_end_pfn, int *out_nid);
 
 /**
  * for_each_mem_pfn_range - early memory pfn range iterator
@@ -217,6 +219,22 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
 #define for_each_mem_pfn_range(i, nid, p_start, p_end, p_nid)  \
for (i = -1, __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid); \
 i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
+
+/**
+ * for_each_mem_pfn_range_rev - early memory pfn range rev-iterator
+ * @i: an integer used as loop variable
+ * @nid: node selector, %NUMA_NO_NODE for all nodes
+ * @p_start: ptr to ulong for start pfn of the range, can be %NULL
+ * @p_end: ptr to ulong for end pfn of the range, can be %NULL
+ * @p_nid: ptr to int for nid of the range, can be %NULL
+ *
+ * Walks over configured memory ranges in reverse order.
+ */
+#define for_each_mem_pfn_range_rev(i, nid, p_start, p_end, p_nid)  \
+   for (i = (int)INT_MAX,  \
+ __next_mem_pfn_range_rev(&i, nid, p_start, p_end, p_nid); \
+i != (int)INT_MAX; \
+ __next_mem_pfn_range_rev(&i, nid, p_start, p_end, p_nid))
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
 /**
diff --git a/mm/memblock.c b/mm/memblock.c
index 7608bc305936..79490005ecd6 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1075,7 +1075,7 @@ void __init_memblock __next_mem_range_rev(u64 *idx, int nid, ulong flags,
 
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 /*
- * Common iterator interface used to define for_each_mem_range().
+ * Common iterator interface used to define for_each_mem_pfn_range().
  */
 void __init_memblock __next_mem_pfn_range(int *idx, int nid,
unsigned long *out_start_pfn,
@@ -1105,6 +1105,43 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,
*out_nid = r->nid;
 }
 
+/*
+ * Common rev-iterator interface used to define for_each_mem_pfn_range_rev().
+ */
+void __init_memblock __next_mem_pfn_range_rev(int *idx, int nid,
+   unsigned long *out_start_pfn,
+   unsigned long *out_end_pfn, int *out_nid)
+{
+   struct memblock_type *type = &memblock.memory;
+   struct memblock_region *r;
+
+   if (WARN_ONCE(nid == MAX_NUMNODES, "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n"))
+   nid = NUMA_NO_NODE;
+
+   if (*idx == (int)INT_MAX)
+   *idx = type->cnt;
+
+   while (--*idx >= 0) {
+   r = &type->regions[*idx];
+
+   if (PFN_UP(r->base) >= PFN_DOWN(r->base + r->size))
+   continue;
+   if (nid == NUMA_NO_NODE || nid == r->nid)
+   break;
+   }
+   if (*idx < 0) {
+   *idx = (int)INT_MAX;
+   return;
+   }
+
+   if (out_start_pfn)
+   *out_start_pfn = PFN_UP(r->base);
+   if (out_end_pfn)
+   *out_end_pfn = PFN_DOWN(r->base + r->size);
+   if (out_nid)
+   *out_nid = r->nid;
+}
+
 /**
  * memblock_set_node - set node ID on memblock regions
  * @base: base of area to set node ID for
-- 
2.11.0
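
To make the iteration protocol in this patch easier to follow, here is a
small userspace model of it. This is only a sketch: struct region, the
regions[] table and the helper names are stand-ins invented for
illustration, not memblock itself, and the node filter is left out. What it
keeps from the patch is the INT_MAX sentinel (used both as "not started" and
as "finished"), the walk from the last region downwards, and the skipping of
regions that do not contain a whole page.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define PAGE_SHIFT  12
#define PAGE_SIZE   (1ULL << PAGE_SHIFT)
#define PFN_UP(x)   (((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)

struct region {			/* stand-in for struct memblock_region */
	unsigned long long base;
	unsigned long long size;
};

/* Hypothetical sample data: two usable ranges and one sub-page sliver. */
static const struct region regions[] = {
	{ 0x0000000000001000ULL, 0x000000000009e000ULL },
	{ 0x0000000000100000ULL, 0x00000000bff00000ULL },
	{ 0x0000000100000800ULL, 0x0000000000000400ULL },	/* no whole page */
};
static const int region_cnt = sizeof(regions) / sizeof(regions[0]);

/* Step *idx to the previous region that holds at least one whole page. */
static void next_pfn_range_rev(int *idx, unsigned long long *start_pfn,
			       unsigned long long *end_pfn)
{
	const struct region *r = NULL;

	assert(idx != NULL);
	if (*idx == INT_MAX)		/* first call: start past the end */
		*idx = region_cnt;

	while (--*idx >= 0) {
		r = &regions[*idx];
		if (PFN_UP(r->base) < PFN_DOWN(r->base + r->size))
			break;
	}
	if (*idx < 0) {			/* ran off the front: signal the end */
		*idx = INT_MAX;
		return;
	}
	if (start_pfn)
		*start_pfn = PFN_UP(r->base);
	if (end_pfn)
		*end_pfn = PFN_DOWN(r->base + r->size);
}

#define for_each_pfn_range_rev(i, p_start, p_end)			\
	for (i = INT_MAX, next_pfn_range_rev(&i, p_start, p_end);	\
	     i != INT_MAX;						\
	     next_pfn_range_rev(&i, p_start, p_end))

/* Take only the first (highest) range, as sparse_init() does in 2/2. */
static unsigned long long highest_end_pfn(void)
{
	unsigned long long end = 0;
	int i;

	for_each_pfn_range_rev(i, NULL, &end)
		break;
	return end;
}

/* Count every range the reverse walk reports. */
static int count_ranges(void)
{
	int i, n = 0;

	for_each_pfn_range_rev(i, NULL, NULL)
		n++;
	return n;
}
```

In this model the sliver above 4 GiB is skipped, so the first range reported
is the one ending at pfn 0xc0000, and the full walk reports two ranges.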



Re: [PATCH v2 2/5] time: mark syscore_ops as __ro_after_init

2017-02-10 Thread John Stultz
On Fri, Feb 10, 2017 at 5:37 PM, Jess Frazelle  wrote:
> Marked syscore_ops structs as __ro_after_init when register_syscore_ops was
> called only during init. Most of the caller functions were already annotated as
> __init.
> unregister_syscore_ops() was never called on these ops.
> This protects the data structure from accidental corruption.
>
> Suggested-by: Kees Cook 
> Signed-off-by: Jess Frazelle 
> Acked-by: Rik van Riel 

Thanks for sending this out. Looks reasonable to me. I'll queue it for
testing, targeting for 4.12.

thanks
-john


[PATCH v13 07/12] usb: ehci: use bus->sysdev for DMA configuration

2017-02-10 Thread Peter Chen
Set the DMA device for ehci from sysdev. sysdev points to the device that
is known to the system firmware or hardware.

Cc: Arnd Bergmann 
Cc: Sriram Dash 
Signed-off-by: Peter Chen 
Acked-by: Alan Stern 
---
 drivers/usb/host/ehci-mem.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/host/ehci-mem.c b/drivers/usb/host/ehci-mem.c
index 4de4301..9b7e639 100644
--- a/drivers/usb/host/ehci-mem.c
+++ b/drivers/usb/host/ehci-mem.c
@@ -138,7 +138,7 @@ static void ehci_mem_cleanup (struct ehci_hcd *ehci)
ehci->sitd_pool = NULL;
 
if (ehci->periodic)
-   dma_free_coherent (ehci_to_hcd(ehci)->self.controller,
+   dma_free_coherent(ehci_to_hcd(ehci)->self.sysdev,
ehci->periodic_size * sizeof (u32),
ehci->periodic, ehci->periodic_dma);
ehci->periodic = NULL;
@@ -155,7 +155,7 @@ static int ehci_mem_init (struct ehci_hcd *ehci, gfp_t flags)
 
/* QTDs for control/bulk/intr transfers */
ehci->qtd_pool = dma_pool_create ("ehci_qtd",
-   ehci_to_hcd(ehci)->self.controller,
+   ehci_to_hcd(ehci)->self.sysdev,
sizeof (struct ehci_qtd),
32 /* byte alignment (for hw parts) */,
4096 /* can't cross 4K */);
@@ -165,7 +165,7 @@ static int ehci_mem_init (struct ehci_hcd *ehci, gfp_t flags)
 
/* QHs for control/bulk/intr transfers */
ehci->qh_pool = dma_pool_create ("ehci_qh",
-   ehci_to_hcd(ehci)->self.controller,
+   ehci_to_hcd(ehci)->self.sysdev,
sizeof(struct ehci_qh_hw),
32 /* byte alignment (for hw parts) */,
4096 /* can't cross 4K */);
@@ -179,7 +179,7 @@ static int ehci_mem_init (struct ehci_hcd *ehci, gfp_t flags)
 
/* ITD for high speed ISO transfers */
ehci->itd_pool = dma_pool_create ("ehci_itd",
-   ehci_to_hcd(ehci)->self.controller,
+   ehci_to_hcd(ehci)->self.sysdev,
sizeof (struct ehci_itd),
32 /* byte alignment (for hw parts) */,
4096 /* can't cross 4K */);
@@ -189,7 +189,7 @@ static int ehci_mem_init (struct ehci_hcd *ehci, gfp_t flags)
 
/* SITD for full/low speed split ISO transfers */
ehci->sitd_pool = dma_pool_create ("ehci_sitd",
-   ehci_to_hcd(ehci)->self.controller,
+   ehci_to_hcd(ehci)->self.sysdev,
sizeof (struct ehci_sitd),
32 /* byte alignment (for hw parts) */,
4096 /* can't cross 4K */);
@@ -199,7 +199,7 @@ static int ehci_mem_init (struct ehci_hcd *ehci, gfp_t flags)
 
/* Hardware periodic table */
ehci->periodic = (__le32 *)
-   dma_alloc_coherent (ehci_to_hcd(ehci)->self.controller,
+   dma_alloc_coherent(ehci_to_hcd(ehci)->self.sysdev,
ehci->periodic_size * sizeof(__le32),
&ehci->periodic_dma, flags);
if (ehci->periodic == NULL) {
-- 
2.7.4



[PATCH v13 06/12] usb: xhci: use bus->sysdev for DMA configuration

2017-02-10 Thread Peter Chen
From: Arnd Bergmann 

For the xhci-hcd platform device, the DMA parameters are not all
configured properly, notably the dma ops for dwc3 devices. So, set
the DMA device for xhci from sysdev. sysdev points to the device that
is known to the system firmware or hardware.

Cc: Baolin Wang 
Cc: Vivek Gautam 
Cc: Alexander Sverdlin 
Cc: Mathias Nyman 

Signed-off-by: Arnd Bergmann 
Signed-off-by: Sriram Dash 
---
Hi Baolin, Vivek and Alexander,
I removed your Tested-by tags because I added one change: sysdev is now
set for the shared hcd too. If your testing shows this change works for
you or has no effect, please consider adding your Tested-by tag again.
Thanks.

 drivers/usb/host/xhci-mem.c  | 12 ++--
 drivers/usb/host/xhci-plat.c | 35 +++
 drivers/usb/host/xhci.c  | 15 +++
 3 files changed, 44 insertions(+), 18 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index ba1853f4..032a702 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -586,7 +586,7 @@ static void xhci_free_stream_ctx(struct xhci_hcd *xhci,
unsigned int num_stream_ctxs,
struct xhci_stream_ctx *stream_ctx, dma_addr_t dma)
 {
-   struct device *dev = xhci_to_hcd(xhci)->self.controller;
+   struct device *dev = xhci_to_hcd(xhci)->self.sysdev;
size_t size = sizeof(struct xhci_stream_ctx) * num_stream_ctxs;
 
if (size > MEDIUM_STREAM_ARRAY_SIZE)
@@ -614,7 +614,7 @@ static struct xhci_stream_ctx *xhci_alloc_stream_ctx(struct xhci_hcd *xhci,
unsigned int num_stream_ctxs, dma_addr_t *dma,
gfp_t mem_flags)
 {
-   struct device *dev = xhci_to_hcd(xhci)->self.controller;
+   struct device *dev = xhci_to_hcd(xhci)->self.sysdev;
size_t size = sizeof(struct xhci_stream_ctx) * num_stream_ctxs;
 
if (size > MEDIUM_STREAM_ARRAY_SIZE)
@@ -1686,7 +1686,7 @@ void xhci_slot_copy(struct xhci_hcd *xhci,
 static int scratchpad_alloc(struct xhci_hcd *xhci, gfp_t flags)
 {
int i;
-   struct device *dev = xhci_to_hcd(xhci)->self.controller;
+   struct device *dev = xhci_to_hcd(xhci)->self.sysdev;
int num_sp = HCS_MAX_SCRATCHPAD(xhci->hcs_params2);
 
xhci_dbg_trace(xhci, trace_xhci_dbg_init,
@@ -1758,7 +1758,7 @@ static void scratchpad_free(struct xhci_hcd *xhci)
 {
int num_sp;
int i;
-   struct device *dev = xhci_to_hcd(xhci)->self.controller;
+   struct device *dev = xhci_to_hcd(xhci)->self.sysdev;
 
if (!xhci->scratchpad)
return;
@@ -1831,7 +1831,7 @@ void xhci_free_command(struct xhci_hcd *xhci,
 
 void xhci_mem_cleanup(struct xhci_hcd *xhci)
 {
-   struct device   *dev = xhci_to_hcd(xhci)->self.controller;
+   struct device   *dev = xhci_to_hcd(xhci)->self.sysdev;
int size;
int i, j, num_ports;
 
@@ -2373,7 +2373,7 @@ static int xhci_setup_port_arrays(struct xhci_hcd *xhci, gfp_t flags)
 int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
 {
dma_addr_t  dma;
-   struct device   *dev = xhci_to_hcd(xhci)->self.controller;
+   struct device   *dev = xhci_to_hcd(xhci)->self.sysdev;
unsigned intval, val2;
u64 val_64;
struct xhci_segment *seg;
diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
index 6d33b42..4ecb3fd 100644
--- a/drivers/usb/host/xhci-plat.c
+++ b/drivers/usb/host/xhci-plat.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -148,6 +149,7 @@ static int xhci_plat_probe(struct platform_device *pdev)
 {
const struct of_device_id *match;
const struct hc_driver  *driver;
+   struct device   *sysdev;
struct xhci_hcd *xhci;
struct resource *res;
struct usb_hcd  *hcd;
@@ -164,22 +166,39 @@ static int xhci_plat_probe(struct platform_device *pdev)
if (irq < 0)
return -ENODEV;
 
+   /*
+* sysdev must point to a device that is known to the system firmware
+* or PCI hardware. We handle these three cases here:
+* 1. xhci_plat comes from firmware
+* 2. xhci_plat is child of a device from firmware (dwc3-plat)
+* 3. xhci_plat is grandchild of a pci device (dwc3-pci)
+*/
+   sysdev = &pdev->dev;
+   if (sysdev->parent && !sysdev->of_node && sysdev->parent->of_node)
+   sysdev = sysdev->parent;
+#ifdef CONFIG_PCI
+   else if (sysdev->parent && sysdev->parent->parent &&
+sysdev->parent->parent->bus == &pci_bus_type)
+   sysdev = sysdev->parent->parent;
+#endif
+
/* Try to set 64-bit DMA first */
-   if (!pdev->dev.dma_mask)
+   if (WARN_ON(!sysdev->dma_mask))
/* Platform did not initialize dma_mask */
-   ret = dma_coerce_mask_and_coherent(&pdev->dev,
+   ret = dma_coerce_mask_and_
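

As a reading aid for the three cases listed in the xhci_plat_probe() comment
above, here is a small userspace model of the sysdev selection. It is a
sketch only: the structs are cut-down stand-ins for the kernel's struct
device, struct device_node and struct bus_type, pick_sysdev() is a
hypothetical name, and the CONFIG_PCI guard from the patch is dropped so the
PCI branch is always present.

```c
#include <assert.h>
#include <stddef.h>

struct bus_type { const char *name; };
struct device_node { const char *full_name; };

struct device {			/* reduced stand-in for struct device */
	struct device *parent;
	struct device_node *of_node;
	struct bus_type *bus;
};

static struct bus_type pci_bus_type = { "pci" };

/* Hypothetical helper mirroring the selection logic in the hunk above. */
static struct device *pick_sysdev(struct device *dev)
{
	struct device *sysdev = dev;

	assert(dev != NULL);
	/* case 2: child of a device that came from firmware */
	if (sysdev->parent && !sysdev->of_node && sysdev->parent->of_node)
		sysdev = sysdev->parent;
	/* case 3: grandchild of a PCI device */
	else if (sysdev->parent && sysdev->parent->parent &&
		 sysdev->parent->parent->bus == &pci_bus_type)
		sysdev = sysdev->parent->parent;
	/* case 1: the device itself came from firmware, keep it */
	return sysdev;
}
```

A device with its own of_node is returned unchanged, a node-less child of a
firmware device resolves to its parent, and a node-less grandchild of a PCI
device resolves to the PCI device.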

[PATCH v13 10/12] ARM: dts: imx6qdl: Enable usb node children with

2017-02-10 Thread Peter Chen
From: Joshua Clayton 

Give usb nodes #address-cells and #size-cells attributes, so that a child
node representing a permanently connected device such as an onboard hub may
be addressed with a reg attribute.

Signed-off-by: Joshua Clayton 
Signed-off-by: Peter Chen 
---
 arch/arm/boot/dts/imx6qdl.dtsi | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm/boot/dts/imx6qdl.dtsi b/arch/arm/boot/dts/imx6qdl.dtsi
index 89b834f..a00de77 100644
--- a/arch/arm/boot/dts/imx6qdl.dtsi
+++ b/arch/arm/boot/dts/imx6qdl.dtsi
@@ -936,6 +936,8 @@
 
usbh1: usb@02184200 {
compatible = "fsl,imx6q-usb", "fsl,imx27-usb";
+   #address-cells = <1>;
+   #size-cells = <0>;
reg = <0x02184200 0x200>;
interrupts = <0 40 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clks IMX6QDL_CLK_USBOH3>;
@@ -950,6 +952,8 @@
 
usbh2: usb@02184400 {
compatible = "fsl,imx6q-usb", "fsl,imx27-usb";
+   #address-cells = <1>;
+   #size-cells = <0>;
reg = <0x02184400 0x200>;
interrupts = <0 41 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clks IMX6QDL_CLK_USBOH3>;
@@ -963,6 +967,8 @@
 
usbh3: usb@02184600 {
compatible = "fsl,imx6q-usb", "fsl,imx27-usb";
+   #address-cells = <1>;
+   #size-cells = <0>;
reg = <0x02184600 0x200>;
interrupts = <0 42 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clks IMX6QDL_CLK_USBOH3>;
-- 
2.7.4



[PATCH v13 02/12] power: add power sequence library

2017-02-10 Thread Peter Chen
We have a well-known problem: some devices need to go through a power
sequence before the related host can recognize them. Typical
examples are hard-wired mmc devices and usb devices.

This power sequence is hard to describe in the device tree and to handle in
the related host driver, so we have created a common power sequence
library to cover this requirement. The core code supplies
some common helpers for host drivers, and individual power sequence
libraries handle the various kinds of power sequence for devices. The pwrseq
libraries always need to allocate an extra instance for compatible
string matching.

pwrseq_generic is intended as a general-purpose power sequence; it
currently handles gpios and clocks, and can cover other controls in the
future. The host driver just needs to call of_pwrseq_on/of_pwrseq_off
if only one power sequence is needed; otherwise it calls of_pwrseq_on_list
/of_pwrseq_off_list instead (e.g., the USB hub driver).

A new power sequence library can add its compatible string
to pwrseq_of_match_table; the pwrseq core will then match it against the
DT's and choose this library at runtime.

Signed-off-by: Peter Chen 
Tested-by: Maciej S. Szmigiero 
Tested-by: Joshua Clayton 
Reviewed-by: Matthias Kaehlcke 
Tested-by: Matthias Kaehlcke 
---
 Documentation/power/power-sequence/design.rst |  54 +
 MAINTAINERS   |   9 +
 drivers/power/Kconfig |   1 +
 drivers/power/Makefile|   1 +
 drivers/power/pwrseq/Kconfig  |  20 ++
 drivers/power/pwrseq/Makefile |   2 +
 drivers/power/pwrseq/core.c   | 335 ++
 drivers/power/pwrseq/pwrseq_generic.c | 234 ++
 include/linux/power/pwrseq.h  |  81 +++
 9 files changed, 737 insertions(+)
 create mode 100644 Documentation/power/power-sequence/design.rst
 create mode 100644 drivers/power/pwrseq/Kconfig
 create mode 100644 drivers/power/pwrseq/Makefile
 create mode 100644 drivers/power/pwrseq/core.c
 create mode 100644 drivers/power/pwrseq/pwrseq_generic.c
 create mode 100644 include/linux/power/pwrseq.h

diff --git a/Documentation/power/power-sequence/design.rst b/Documentation/power/power-sequence/design.rst
new file mode 100644
index 000..554608e
--- /dev/null
+++ b/Documentation/power/power-sequence/design.rst
@@ -0,0 +1,54 @@
+
+Power Sequence Library
+
+
+:Date: Feb, 2017
+:Author: Peter Chen 
+
+
+Introduction
+
+
+We have a well-known problem: a device may need to go through a power
+sequence before the related host can recognize it. Typical
+examples are hard-wired mmc devices and usb devices. The host controller
+can't know what kind of device is on its bus if the power
+sequence has not been done, since which device driver's probe is called
+is determined at runtime according to enumeration results. Besides,
+devices may have custom power sequences, so a power sequence library
+that is independent of the devices is needed.
+
+Design
+
+
+The power sequence library consists of the core file and custom power
+sequence libraries. The core file exports interfaces that are called by
+the host controller driver to carry out a power sequence, and by custom
+power sequence library files to register their power sequence instances on
+the global power sequence list. A custom power sequence library creates a
+power sequence instance and implements a custom power sequence.
+
+Since the power sequence describes hardware design, the description is
+located in the board description file, e.g. the device tree dts file. A
+specific power sequence belongs to a device, so its description
+is under the device node; please refer to:
+Documentation/devicetree/bindings/power/pwrseq/pwrseq-generic.txt
+
+A custom power sequence library allocates one power sequence instance at
+boot using postcore_initcall; this statically allocated instance is
+compared with the device-tree (DT) node to see whether this library can be
+used for the node. When it matches, the core API will
+try to get the resources (->get, implemented in each library) for the power
+sequence; if all resources are obtained, it will try to allocate another
+instance for the next possible request from a host driver.
+
+Then the host controller driver can carry out power sequence on for this
+DT node; the library will do the corresponding operations, like opening
+clocks, toggling gpios, etc. The power sequence off routine will close and
+free the resources, and is called when the parent is removed. The power
+sequence suspend and resume routines can be called from the host driver's
+suspend and resume routines if needed.
+
+The exported interfaces
+.. kernel-doc:: drivers/power/pwrseq/core.c
+   :export:
diff --git a/MAINTAINERS b/MAINTAINERS
index 187b961..e5cbf7d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9854,6 +9854,15 @@ F:   include/linux/pm_*
 F: include/linux/

[PATCH v13 00/12] power: add power sequence library

2017-02-10 Thread Peter Chen
Hi all,

This is a follow-up for my last power sequence framework patch set [1],
updated according to Rob Herring's and Ulf Hansson's comments [2]. The
various power sequence instances are added at postcore_initcall. The match
criterion is the compatible string first; if the compatible string does not
match between the dts and a library, the generic power sequence is tried.
 
The host driver just needs to call of_pwrseq_on/of_pwrseq_off
if only one power sequence instance is needed; if more power sequences
are used, call of_pwrseq_on_list/of_pwrseq_off_list instead (e.g., the USB
hub driver).

In future, if there are special power sequence requirements, the special
power sequence library can be created.
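
The matching order described above (compatible string first, generic power
sequence as the fallback) can be sketched in userspace roughly as follows.
This is an illustration only: struct pwrseq_lib, the libs[] table and
pwrseq_match() are hypothetical stand-ins, not the structures or functions
from patch 2/12, and "pwrseq_hub_example" is an invented library name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct pwrseq_lib {		/* hypothetical stand-in, not struct pwrseq */
	const char *compatible;	/* NULL marks the generic fallback */
	const char *name;
};

/* Hypothetical table: one specific library plus the generic one. */
static const struct pwrseq_lib libs[] = {
	{ "usb424,2513", "pwrseq_hub_example" },
	{ NULL,          "pwrseq_generic" },
};

/* Prefer an exact compatible match; otherwise fall back to generic. */
static const struct pwrseq_lib *pwrseq_match(const char *node_compatible)
{
	const struct pwrseq_lib *generic = NULL;
	size_t i;

	assert(node_compatible != NULL);
	for (i = 0; i < sizeof(libs) / sizeof(libs[0]); i++) {
		if (!libs[i].compatible) {
			generic = &libs[i];
			continue;
		}
		if (strcmp(libs[i].compatible, node_compatible) == 0)
			return &libs[i];
	}
	return generic;
}
```

A node whose compatible string is in the table gets the specific library;
any other node ends up with the generic one, so the host driver never has to
choose a library itself.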

This patch set was tested on the i.mx6 sabresx evk using a dts change; I used
two hot-plug devices to simulate this use case. The related binding
change is in patch [1/6]. The udoo board changes were tested
using my last power sequence patch set [3].

Besides hard-wired MMC and USB devices, I find the USB ULPI PHY also
needs to power itself on before it can be found by the ULPI bus.

Changes for v13:
- Add more design descriptions at design doc and fix one build error
  introduced by v12 wrongly [Patch 2/12]
- Add the last three dts patches which were forgotten at last series
- Move the comment for usb_create_shared_hcd to correct place [Patch 3/12]
- Add sysdev for shared hcd too for xhci-plat.c [Patch 6/12]

Rafael, if the first two power sequence patches are ok for you, would you
consider accepting these first? The other USB patches can go through the USB
tree at v4.12-rc1.

Changes for v12:
- Add design doc and more comments at generic power sequence source file
  [Patch 2/9]
- Introduce four Arnd Bergmann patches and one of my ehci related patches.
  These patches are used to get the proper DT/firmware information in USB
  code, and this information is needed for power sequence operation in USB.
  With these five patches, my chipidea hack patch in the previous patch set
  can be removed. [Patch 3-7/9]
- Add -ENOENT judgement to avoid USB error if no power sequence library is
  chosen [9/9]

Changes for v11:
- Fix warning: (USB) selects POWER_SEQUENCE which has unmet direct dependencies (OF)
- Delete redundant copyright statement.
- Change pr_warn to pr_debug at pwrseq_find_available_instance
- Refine kerneldoc
- %s/ENONET/ENOENT
- Allocate pwrseq list node before carrying out power sequence on
- Add mutex_lock/mutex_unlock for pwrseq node browse at pwrseq_find_available_instance
- Add pwrseq_suspend/resume for API both single instance and list 
- Add .pwrseq_suspend/resume for pwrseq_generic.c
- Add pwrseq_suspend_list and pwrseq_resume_list for USB hub suspend
  and resume routine

Changes for v10:
- Improve the kernel-doc for power sequence core, including exported APIs and
  main structure. [Patch 2/8]
- Change Kconfig, and let the user choose power sequence. [Patch 2/8]
- Delete EXPORT_SYMBOL and make the related APIs local; these APIs are not
  intended to be exported currently. [Patch 2/8]
- Select POWER_SEQUENCE at USB core's Kconfig. [Patch 4/8]

Changes for v9:
- Add Vaibhav Hiremath's reviewed-by [Patch 4/8]
- Rebase to v4.9-rc1

Changes for v8:
- Allocate one extra pwrseq instance if pwrseq_get has succeeded; this avoids
  the problem with preallocated instances, where the number of instances is
  decided at compile time. Thanks to Heiko Stuebner for the suggestion [Patch 2/8]
- Delete pwrseq_compatible_sample.c, whose purpose was to demo the compatible
  match method. [Patch 2/8]
- Add Maciej S. Szmigiero's tested-by. [Patch 7/8]

Changes for v7:
- Create the various power sequence instances at postcore_initcall, and match
  the instance with the node using the compatible string. The benefit of this
  is that the host driver doesn't need to consider which pwrseq instance needs
  to be used, as the pwrseq core will match it; however, it wastes some memory
  if few power sequence instances are used. [Patch 2/8]
- Add pwrseq_compatible_sample.c to test matching pwrseq using device_id.
  [Patch 2/8]
- Fix the comments Vaibhav Hiremath adds for error path for clock and do not
  use device_node for parameters at pwrseq_on. [Patch 2/8]
- Simplify the caller to use power sequence, follows Alan's comments [Patch 4/8]
- Tested three pwrseq instances together using both specific compatible string
  and generic libraries.

Changes for v6:
- Add Matthias Kaehlcke's Reviewed-by and Tested-by. (patch [2/6])
- Change chipidea core of_node assignment for coming user. (patch [5/6])
- Applies Joshua Clayton's three dts changes for two boards,
  the USB device's reg has only #address-cells, but without #size-cells.

Changes for v5:
- Delete pwrseq_register/pwrseq_unregister, which is useless currently
- Fix the linker error when the pwrseq user is compiled as module

Changes for v4:
- Create the patch on next-20160722 
- Fix the of_node not being NULL after the chipidea driver is unbound [Patch 5/6]
- Using more friendly wait method for reset gpio [Patch 2/6]
- Support mu

[PATCH v13 12/12] ARM: dts: imx6q-evi: Fix onboard hub reset line

2017-02-10 Thread Peter Chen
From: Joshua Clayton 

Previously the onboard hub was made to work by treating its
reset gpio as a regulator enable.
Get rid of that kludge now that pwrseq has added reset gpio support.
Move the pin muxing of the hub reset pin into the usbh1 group.

Signed-off-by: Joshua Clayton 
Signed-off-by: Peter Chen 
---
 arch/arm/boot/dts/imx6q-evi.dts | 25 +++--
 1 file changed, 7 insertions(+), 18 deletions(-)

diff --git a/arch/arm/boot/dts/imx6q-evi.dts b/arch/arm/boot/dts/imx6q-evi.dts
index 7c7c1a8..79a0bd5 100644
--- a/arch/arm/boot/dts/imx6q-evi.dts
+++ b/arch/arm/boot/dts/imx6q-evi.dts
@@ -54,18 +54,6 @@
reg = <0x1000 0x4000>;
};
 
-   reg_usbh1_vbus: regulator-usbhubreset {
-   compatible = "regulator-fixed";
-   regulator-name = "usbh1_vbus";
-   regulator-min-microvolt = <500>;
-   regulator-max-microvolt = <500>;
-   enable-active-high;
-   startup-delay-us = <2>;
-   pinctrl-names = "default";
-   pinctrl-0 = <&pinctrl_usbh1_hubreset>;
-   gpio = <&gpio7 12 GPIO_ACTIVE_HIGH>;
-   };
-
reg_usb_otg_vbus: regulator-usbotgvbus {
compatible = "regulator-fixed";
regulator-name = "usb_otg_vbus";
@@ -207,12 +195,18 @@
 };
 
 &usbh1 {
-   vbus-supply = <&reg_usbh1_vbus>;
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_usbh1>;
dr_mode = "host";
disable-over-current;
status = "okay";
+
+   usb2415host: hub@1 {
+   compatible = "usb424,2513";
+   reg = <1>;
+   reset-gpios = <&gpio7 12 GPIO_ACTIVE_LOW>;
+   reset-duration-us = <3000>;
+   };
 };
 
 &usbotg {
@@ -468,11 +462,6 @@
MX6QDL_PAD_GPIO_3__USB_H1_OC 0x1b0b0
/* usbh1_b OC */
MX6QDL_PAD_GPIO_0__GPIO1_IO00 0x1b0b0
-   >;
-   };
-
-   pinctrl_usbh1_hubreset: usbh1hubresetgrp {
-   fsl,pins = <
MX6QDL_PAD_GPIO_17__GPIO7_IO12 0x1b0b0
>;
};
-- 
2.7.4



[PATCH v13 03/12] usb: separate out sysdev pointer from usb_bus

2017-02-10 Thread Peter Chen
From: Arnd Bergmann 

For the xhci-hcd platform device, the DMA parameters are not all
configured properly, notably the dma ops for dwc3 devices.

The idea here is that you pass in the parent of_node along with
the child device pointer, so it would behave exactly like the
parent already does. The difference is that it also handles all
the other attributes besides the mask.

sysdev will represent the physical device, as seen from firmware
or bus. Splitting the usb_bus->controller field into the
Linux-internal device (used for the sysfs hierarchy, for printks
and for power management) and a new pointer (used for DMA,
DT enumeration and phy lookup) probably covers all that we really
need.

Signed-off-by: Arnd Bergmann 
Signed-off-by: Sriram Dash 
Tested-by: Baolin Wang 
Tested-by: Brian Norris 
Tested-by: Alexander Sverdlin 
Tested-by: Vivek Gautam 
Signed-off-by: Mathias Nyman 
Cc: Felipe Balbi 
Cc: Grygorii Strashko 
Cc: Sinjan Kumar 
Cc: David Fisher 
Cc: Catalin Marinas 
Cc: "Thang Q. Nguyen" 
Cc: Yoshihiro Shimoda 
Cc: Stephen Boyd 
Cc: Bjorn Andersson 
Cc: Ming Lei 
Cc: Jon Masters 
Cc: Dann Frazier 
Cc: Peter Chen 
Cc: Leo Li 
---
 drivers/usb/core/buffer.c | 12 +++
 drivers/usb/core/hcd.c| 80 ++-
 drivers/usb/core/usb.c| 18 +--
 include/linux/usb.h   |  1 +
 include/linux/usb/hcd.h   |  3 ++
 5 files changed, 64 insertions(+), 50 deletions(-)

diff --git a/drivers/usb/core/buffer.c b/drivers/usb/core/buffer.c
index b9bf6e2..b64568c 100644
--- a/drivers/usb/core/buffer.c
+++ b/drivers/usb/core/buffer.c
@@ -66,7 +66,7 @@ int hcd_buffer_create(struct usb_hcd *hcd)
int i, size;
 
if (!IS_ENABLED(CONFIG_HAS_DMA) ||
-   (!hcd->self.controller->dma_mask &&
+   (!is_device_dma_capable(hcd->self.sysdev) &&
 !(hcd->driver->flags & HCD_LOCAL_MEM)))
return 0;
 
@@ -75,7 +75,7 @@ int hcd_buffer_create(struct usb_hcd *hcd)
if (!size)
continue;
snprintf(name, sizeof(name), "buffer-%d", size);
-   hcd->pool[i] = dma_pool_create(name, hcd->self.controller,
+   hcd->pool[i] = dma_pool_create(name, hcd->self.sysdev,
size, size, 0);
if (!hcd->pool[i]) {
hcd_buffer_destroy(hcd);
@@ -130,7 +130,7 @@ void *hcd_buffer_alloc(
 
/* some USB hosts just use PIO */
if (!IS_ENABLED(CONFIG_HAS_DMA) ||
-   (!bus->controller->dma_mask &&
+   (!is_device_dma_capable(bus->sysdev) &&
 !(hcd->driver->flags & HCD_LOCAL_MEM))) {
*dma = ~(dma_addr_t) 0;
return kmalloc(size, mem_flags);
@@ -140,7 +140,7 @@ void *hcd_buffer_alloc(
if (size <= pool_max[i])
return dma_pool_alloc(hcd->pool[i], mem_flags, dma);
}
-   return dma_alloc_coherent(hcd->self.controller, size, dma, mem_flags);
+   return dma_alloc_coherent(hcd->self.sysdev, size, dma, mem_flags);
 }
 
 void hcd_buffer_free(
@@ -157,7 +157,7 @@ void hcd_buffer_free(
return;
 
if (!IS_ENABLED(CONFIG_HAS_DMA) ||
-   (!bus->controller->dma_mask &&
+   (!is_device_dma_capable(bus->sysdev) &&
 !(hcd->driver->flags & HCD_LOCAL_MEM))) {
kfree(addr);
return;
@@ -169,5 +169,5 @@ void hcd_buffer_free(
return;
}
}
-   dma_free_coherent(hcd->self.controller, size, addr, dma);
+   dma_free_coherent(hcd->self.sysdev, size, addr, dma);
 }
diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index 612fab6..2342c1f 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -1073,6 +1073,7 @@ static void usb_deregister_bus (struct usb_bus *bus)
 static int register_root_hub(struct usb_hcd *hcd)
 {
struct device *parent_dev = hcd->self.controller;
+   struct device *sysdev = hcd->self.sysdev;
struct usb_device *usb_dev = hcd->self.root_hub;
const int devnum = 1;
int retval;
@@ -1119,7 +1120,7 @@ static int register_root_hub(struct usb_hcd *hcd)
/* Did the HC die before the root hub was registered? */
if (HCD_DEAD(hcd))
usb_hc_died (hcd);  /* This time clean up */
-   usb_dev->dev.of_node = parent_dev->of_node;
+   usb_dev->dev.of_node = sysdev->of_node;
}
mutex_unlock(&usb_bus_idr_lock);
 
@@ -1465,19 +1466,19 @@ void usb_hcd_unmap_urb_for_dma(struct usb_hcd *hcd, struct urb *urb)
dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
if (IS_ENABLED(CONFIG_HAS_DMA) &&
(urb->transfer_flags & URB_DMA_MAP_SG))
-   dma_unmap_sg(hcd->self.controller,
+   dma_unmap_sg(hcd->self.sysdev,
urb->sg,
ur

[PATCH v13 04/12] usb: chipidea: use bus->sysdev for DMA configuration

2017-02-10 Thread Peter Chen
From: Arnd Bergmann 

Set the dma for chipidea from sysdev. This is inherited from its
parent node. Also, do not set dma mask for child as it is not required
now.

Signed-off-by: Arnd Bergmann 
Signed-off-by: Sriram Dash 
Acked-by: Peter Chen 
Signed-off-by: Mathias Nyman 
---
 drivers/usb/chipidea/core.c |  3 ---
 drivers/usb/chipidea/host.c |  3 ++-
 drivers/usb/chipidea/udc.c  | 10 ++
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/usb/chipidea/core.c b/drivers/usb/chipidea/core.c
index 79ad8e9..b4a78b2 100644
--- a/drivers/usb/chipidea/core.c
+++ b/drivers/usb/chipidea/core.c
@@ -783,9 +783,6 @@ struct platform_device *ci_hdrc_add_device(struct device 
*dev,
}
 
pdev->dev.parent = dev;
-   pdev->dev.dma_mask = dev->dma_mask;
-   pdev->dev.dma_parms = dev->dma_parms;
-   dma_set_coherent_mask(&pdev->dev, dev->coherent_dma_mask);
 
ret = platform_device_add_resources(pdev, res, nres);
if (ret)
diff --git a/drivers/usb/chipidea/host.c b/drivers/usb/chipidea/host.c
index 915f3e9..18cb8e4 100644
--- a/drivers/usb/chipidea/host.c
+++ b/drivers/usb/chipidea/host.c
@@ -123,7 +123,8 @@ static int host_start(struct ci_hdrc *ci)
if (usb_disabled())
return -ENODEV;
 
-   hcd = usb_create_hcd(&ci_ehci_hc_driver, ci->dev, dev_name(ci->dev));
+   hcd = __usb_create_hcd(&ci_ehci_hc_driver, ci->dev->parent,
+  ci->dev, dev_name(ci->dev), NULL);
if (!hcd)
return -ENOMEM;
 
diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
index f88e915..1fb5235 100644
--- a/drivers/usb/chipidea/udc.c
+++ b/drivers/usb/chipidea/udc.c
@@ -423,7 +423,8 @@ static int _hardware_enqueue(struct ci_hw_ep *hwep, struct 
ci_hw_req *hwreq)
 
hwreq->req.status = -EALREADY;
 
-   ret = usb_gadget_map_request(&ci->gadget, &hwreq->req, hwep->dir);
+   ret = usb_gadget_map_request_by_dev(ci->dev->parent,
+   &hwreq->req, hwep->dir);
if (ret)
return ret;
 
@@ -603,7 +604,8 @@ static int _hardware_dequeue(struct ci_hw_ep *hwep, struct 
ci_hw_req *hwreq)
list_del_init(&node->td);
}
 
-   usb_gadget_unmap_request(&hwep->ci->gadget, &hwreq->req, hwep->dir);
+   usb_gadget_unmap_request_by_dev(hwep->ci->dev->parent,
+   &hwreq->req, hwep->dir);
 
hwreq->req.actual += actual;
 
@@ -1899,13 +1901,13 @@ static int udc_start(struct ci_hdrc *ci)
INIT_LIST_HEAD(&ci->gadget.ep_list);
 
/* alloc resources */
-   ci->qh_pool = dma_pool_create("ci_hw_qh", dev,
+   ci->qh_pool = dma_pool_create("ci_hw_qh", dev->parent,
   sizeof(struct ci_hw_qh),
   64, CI_HDRC_PAGE_SIZE);
if (ci->qh_pool == NULL)
return -ENOMEM;
 
-   ci->td_pool = dma_pool_create("ci_hw_td", dev,
+   ci->td_pool = dma_pool_create("ci_hw_td", dev->parent,
   sizeof(struct ci_hw_td),
   64, CI_HDRC_PAGE_SIZE);
if (ci->td_pool == NULL) {
-- 
2.7.4
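
For readers following the series, the controller/sysdev split used above can be sketched in plain userspace C. This is only a model under stated assumptions: struct device and struct usb_bus are cut down to the fields the sketch needs, and bus_uses_pio() is a hypothetical helper name, not a kernel function. The capability check follows the shape of is_device_dma_capable() (a mask must exist and be non-zero).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct device; only the fields used here. */
struct device {
	struct device *parent;
	uint64_t *dma_mask;	/* NULL when the device cannot do DMA */
};

/* Simplified stand-in for struct usb_bus. */
struct usb_bus {
	struct device *controller;	/* device that owns the USB bus */
	struct device *sysdev;		/* device on the system bus, used for DMA */
};

/* A usable mask must exist and be non-zero. */
static bool is_device_dma_capable(const struct device *dev)
{
	return dev->dma_mask != NULL && *dev->dma_mask != 0;
}

/* Hypothetical helper: true when buffers should come from plain memory (PIO). */
bool bus_uses_pio(const struct usb_bus *bus, bool has_local_mem)
{
	assert(bus != NULL && bus->sysdev != NULL);
	return !is_device_dma_capable(bus->sysdev) && !has_local_mem;
}
```

With a child device that has no DMA mask of its own, pointing sysdev at the parent keeps the DMA path available, which is roughly what passing ci->dev->parent achieves in the patch.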



[PATCH v13 11/12] ARM: dts: imx6qdl-udoo.dtsi: fix onboard USB HUB property

2017-02-10 Thread Peter Chen
The current dts describes the USB HUB's properties in the USB
controller's entry, which is improper. The USB HUB should be a child
node under the USB controller, with the power sequence properties under
it. In addition, use a GPIO pinctrl setting for the USB2415's reset pin.

Signed-off-by: Peter Chen 
Signed-off-by: Joshua Clayton 
Tested-by: Maciej S. Szmigiero 
---
 arch/arm/boot/dts/imx6qdl-udoo.dtsi | 26 --
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/arch/arm/boot/dts/imx6qdl-udoo.dtsi 
b/arch/arm/boot/dts/imx6qdl-udoo.dtsi
index c96c91d..a173de2 100644
--- a/arch/arm/boot/dts/imx6qdl-udoo.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-udoo.dtsi
@@ -9,6 +9,8 @@
  *
  */
 
+#include 
+
 / {
aliases {
backlight = &backlight;
@@ -58,17 +60,6 @@
#address-cells = <1>;
#size-cells = <0>;
 
-   reg_usb_h1_vbus: regulator@0 {
-   compatible = "regulator-fixed";
-   reg = <0>;
-   regulator-name = "usb_h1_vbus";
-   regulator-min-microvolt = <500>;
-   regulator-max-microvolt = <500>;
-   enable-active-high;
-   startup-delay-us = <2>; /* USB2415 requires a POR of 1 
us minimum */
-   gpio = <&gpio7 12 0>;
-   };
-
reg_panel: regulator@1 {
compatible = "regulator-fixed";
reg = <1>;
@@ -188,7 +179,7 @@
 
pinctrl_usbh: usbhgrp {
fsl,pins = <
-   MX6QDL_PAD_GPIO_17__GPIO7_IO12 0x8000
+   MX6QDL_PAD_GPIO_17__GPIO7_IO12  0x1b0b0
MX6QDL_PAD_NANDF_CS2__CCM_CLKO2 0x130b0
>;
};
@@ -259,9 +250,16 @@
 &usbh1 {
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_usbh>;
-   vbus-supply = <&reg_usb_h1_vbus>;
-   clocks = <&clks IMX6QDL_CLK_CKO>;
status = "okay";
+
+   usb2415: hub@1 {
+   compatible = "usb424,2514";
+   reg = <1>;
+
+   clocks = <&clks IMX6QDL_CLK_CKO>;
+   reset-gpios = <&gpio7 12 GPIO_ACTIVE_LOW>;
+   reset-duration-us = <3000>;
+   };
 };
 
 &usdhc3 {
-- 
2.7.4



[PATCH v13 09/12] usb: core: add power sequence handling for USB devices

2017-02-10 Thread Peter Chen
Some hard-wired USB devices need a power sequence before they can work
normally. A typical power sequence is: enable the USB PHY clock, toggle
the reset pin, etc. The current Linux USB driver has no code to do this,
so some hard-wired USB devices may work abnormally or may not be
recognized by the controller at all.

This patch calls the power sequence library APIs to carry out the
power sequence. The power-on sequence runs at hub probe for all devices
under that hub (including the root hub). At hub_disconnect, the
power-off sequence runs for every device on the powered-on list.

Signed-off-by: Peter Chen 
Tested-by: Joshua Clayton 
Tested-by: Maciej S. Szmigiero 
Reviewed-by: Vaibhav Hiremath 
---
 drivers/usb/Kconfig|  1 +
 drivers/usb/core/hub.c | 49 +
 drivers/usb/core/hub.h |  1 +
 3 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/Kconfig b/drivers/usb/Kconfig
index fbe493d..706f261 100644
--- a/drivers/usb/Kconfig
+++ b/drivers/usb/Kconfig
@@ -40,6 +40,7 @@ config USB
tristate "Support for Host-side USB"
depends on USB_ARCH_HAS_HCD
select USB_COMMON
+   select POWER_SEQUENCE
select NLS  # for UTF-8 strings
---help---
  Universal Serial Bus (USB) is a specification for a serial bus
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index a56c75e..5b40c48 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1616,6 +1617,7 @@ static void hub_disconnect(struct usb_interface *intf)
hub->error = 0;
hub_quiesce(hub, HUB_DISCONNECT);
 
+   of_pwrseq_off_list(&hub->pwrseq_on_list);
mutex_lock(&usb_port_peer_mutex);
 
/* Avoid races with recursively_mark_NOTATTACHED() */
@@ -1643,12 +1645,42 @@ static void hub_disconnect(struct usb_interface *intf)
kref_put(&hub->kref, hub_release);
 }
 
+#ifdef CONFIG_OF
+static int hub_of_pwrseq_on(struct usb_hub *hub)
+{
+   struct device *parent;
+   struct usb_device *hdev = hub->hdev;
+   struct device_node *np;
+   int ret;
+
+   if (hdev->parent)
+   parent = &hdev->dev;
+   else
+   parent = bus_to_hcd(hdev->bus)->self.sysdev;
+
+   for_each_child_of_node(parent->of_node, np) {
+   ret = of_pwrseq_on_list(np, &hub->pwrseq_on_list);
+   /* Maybe no power sequence library is chosen */
+   if (ret && ret != -ENOENT)
+   return ret;
+   }
+
+   return 0;
+}
+#else
+static int hub_of_pwrseq_on(struct usb_hub *hub)
+{
+   return 0;
+}
+#endif
+
 static int hub_probe(struct usb_interface *intf, const struct usb_device_id 
*id)
 {
struct usb_host_interface *desc;
struct usb_endpoint_descriptor *endpoint;
struct usb_device *hdev;
struct usb_hub *hub;
+   int ret = -ENODEV;
 
desc = intf->cur_altsetting;
hdev = interface_to_usbdev(intf);
@@ -1753,6 +1785,7 @@ static int hub_probe(struct usb_interface *intf, const 
struct usb_device_id *id)
INIT_DELAYED_WORK(&hub->leds, led_work);
INIT_DELAYED_WORK(&hub->init_work, NULL);
INIT_WORK(&hub->events, hub_event);
+   INIT_LIST_HEAD(&hub->pwrseq_on_list);
usb_get_intf(intf);
usb_get_dev(hdev);
 
@@ -1766,11 +1799,14 @@ static int hub_probe(struct usb_interface *intf, const 
struct usb_device_id *id)
if (id->driver_info & HUB_QUIRK_CHECK_PORT_AUTOSUSPEND)
hub->quirk_check_port_auto_suspend = 1;
 
-   if (hub_configure(hub, endpoint) >= 0)
-   return 0;
+   if (hub_configure(hub, endpoint) >= 0) {
+   ret = hub_of_pwrseq_on(hub);
+   if (!ret)
+   return 0;
+   }
 
hub_disconnect(intf);
-   return -ENODEV;
+   return ret;
 }
 
 static int
@@ -3584,14 +3620,19 @@ static int hub_suspend(struct usb_interface *intf, 
pm_message_t msg)
 
/* stop hub_wq and related activity */
hub_quiesce(hub, HUB_SUSPEND);
-   return 0;
+   return pwrseq_suspend_list(&hub->pwrseq_on_list);
 }
 
 static int hub_resume(struct usb_interface *intf)
 {
struct usb_hub *hub = usb_get_intfdata(intf);
+   int ret;
 
dev_dbg(&intf->dev, "%s\n", __func__);
+   ret = pwrseq_resume_list(&hub->pwrseq_on_list);
+   if (ret)
+   return ret;
+
hub_activate(hub, HUB_RESUME);
return 0;
 }
diff --git a/drivers/usb/core/hub.h b/drivers/usb/core/hub.h
index 34c1a7e..cd86f91 100644
--- a/drivers/usb/core/hub.h
+++ b/drivers/usb/core/hub.h
@@ -78,6 +78,7 @@ struct usb_hub {
struct delayed_work init_work;
struct work_struct  events;
struct usb_port **ports;
+   struct list_headpwrseq_on_list; /* powered pwrseq node list */
 };
 
 /**
-- 
2.7.4



[PATCH] net: ethernet: ti: cpsw: return NET_XMIT_DROP if skb_padto failed

2017-02-10 Thread Ivan Khoronzhuk
If skb_padto failed the skb has been dropped already, so it was
consumed, but it doesn't mean it was sent, thus no need to update
queue tx time, etc. So, return NET_XMIT_DROP as more appropriate.

Signed-off-by: Ivan Khoronzhuk 
---
Based on net-next/master

 drivers/net/ethernet/ti/cpsw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 4d1c0c3..503fa8a 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1604,7 +1604,7 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff 
*skb,
if (skb_padto(skb, CPSW_MIN_PACKET_SIZE)) {
cpsw_err(priv, tx_err, "packet pad failed\n");
ndev->stats.tx_dropped++;
-   return NETDEV_TX_OK;
+   return NET_XMIT_DROP;
}
 
if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP &&
-- 
2.7.4
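
The distinction the commit message draws (consumed but not sent) can be sketched in userspace C. Everything here is illustrative: struct frame, frame_padto(), start_xmit() and the 60-byte minimum are assumptions for the sketch, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MIN_PACKET_SIZE 60	/* illustrative minimum, not taken from the driver */

enum xmit_result {
	XMIT_OK = 0,	/* frame was handed on for transmission */
	XMIT_DROP = 1,	/* frame was consumed but never sent */
};

struct tx_stats {
	unsigned long tx_packets;
	unsigned long tx_dropped;
};

/* Hypothetical frame buffer: len bytes used out of cap bytes of storage. */
struct frame {
	unsigned char *data;
	size_t len;
	size_t cap;
};

/* Zero-pad the frame up to min_len; fails when the buffer is too small. */
static int frame_padto(struct frame *f, size_t min_len)
{
	if (f->len >= min_len)
		return 0;
	if (f->cap < min_len)
		return -1;
	memset(f->data + f->len, 0, min_len - f->len);
	f->len = min_len;
	return 0;
}

enum xmit_result start_xmit(struct frame *f, struct tx_stats *stats)
{
	assert(f != NULL && stats != NULL);

	if (frame_padto(f, MIN_PACKET_SIZE)) {
		stats->tx_dropped++;
		return XMIT_DROP;	/* consumed, not sent: skip tx bookkeeping */
	}
	stats->tx_packets++;
	return XMIT_OK;
}
```

A caller that only updates its "last transmit" bookkeeping on XMIT_OK will then leave it untouched for padded frames that failed.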



[PATCH v13 05/12] usb: ehci: fsl: use bus->sysdev for DMA configuration

2017-02-10 Thread Peter Chen
From: Arnd Bergmann 

For the dual role ehci fsl driver, sysdev will handle the dma
config.

Signed-off-by: Arnd Bergmann 
Signed-off-by: Sriram Dash 
Signed-off-by: Mathias Nyman 
---
 drivers/usb/host/ehci-fsl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/host/ehci-fsl.c b/drivers/usb/host/ehci-fsl.c
index 3733aab..4a08b70 100644
--- a/drivers/usb/host/ehci-fsl.c
+++ b/drivers/usb/host/ehci-fsl.c
@@ -96,8 +96,8 @@ static int fsl_ehci_drv_probe(struct platform_device *pdev)
}
irq = res->start;
 
-   hcd = usb_create_hcd(&fsl_ehci_hc_driver, &pdev->dev,
-   dev_name(&pdev->dev));
+   hcd = __usb_create_hcd(&fsl_ehci_hc_driver, pdev->dev.parent,
+  &pdev->dev, dev_name(&pdev->dev), NULL);
if (!hcd) {
retval = -ENOMEM;
goto err1;
-- 
2.7.4



[PATCH v13 01/12] binding-doc: power: pwrseq-generic: add binding doc for generic power sequence library

2017-02-10 Thread Peter Chen
Add binding doc for generic power sequence library.

Signed-off-by: Peter Chen 
Acked-by: Philipp Zabel 
Acked-by: Rob Herring 
---
 .../bindings/power/pwrseq/pwrseq-generic.txt   | 48 ++
 1 file changed, 48 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/power/pwrseq/pwrseq-generic.txt

diff --git a/Documentation/devicetree/bindings/power/pwrseq/pwrseq-generic.txt 
b/Documentation/devicetree/bindings/power/pwrseq/pwrseq-generic.txt
new file mode 100644
index 000..ebf0d47
--- /dev/null
+++ b/Documentation/devicetree/bindings/power/pwrseq/pwrseq-generic.txt
@@ -0,0 +1,48 @@
+The generic power sequence library
+
+Some hard-wired devices (eg USB/MMC) need to do power sequence before
+the device can be enumerated on the bus, the typical power sequence
+like: enable USB PHY clock, toggle reset pin, etc. But current
+Linux device driver lacks of such code to do it, it may cause some
+hard-wired devices works abnormal or can't be recognized by
+controller at all. The power sequence will be done before this device
+can be found at the bus.
+
+The power sequence properties is under the device node.
+
+Optional properties:
+- clocks: the input clocks for device.
+- reset-gpios: Should specify the GPIO for reset.
+- reset-duration-us: the duration in microsecond for assert reset signal.
+
+Below is the example of USB power sequence properties on USB device
+nodes which have two level USB hubs.
+
+&usbotg1 {
+   vbus-supply = <&reg_usb_otg1_vbus>;
+   pinctrl-names = "default";
+   pinctrl-0 = <&pinctrl_usb_otg1_id>;
+   status = "okay";
+
+   #address-cells = <1>;
+   #size-cells = <0>;
+   genesys: hub@1 {
+   compatible = "usb5e3,608";
+   reg = <1>;
+
+   clocks = <&clks IMX6SX_CLK_CKO>;
+   reset-gpios = <&gpio4 5 GPIO_ACTIVE_LOW>; /* hub reset pin */
+   reset-duration-us = <10>;
+
+   #address-cells = <1>;
+   #size-cells = <0>;
+   asix: ethernet@1 {
+   compatible = "usbb95,1708";
+   reg = <1>;
+
+   clocks = <&clks IMX6SX_CLK_IPG>;
+   reset-gpios = <&gpio4 6 GPIO_ACTIVE_LOW>; /* 
ethernet_rst */
+   reset-duration-us = <15>;
+   };
+   };
+};
-- 
2.7.4



[PATCH v13 08/12] binding-doc: usb: usb-device: add optional properties for power sequence

2017-02-10 Thread Peter Chen
Add optional properties for power sequence.

Signed-off-by: Peter Chen 
Acked-by: Rob Herring 
---
 Documentation/devicetree/bindings/usb/usb-device.txt | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/usb/usb-device.txt 
b/Documentation/devicetree/bindings/usb/usb-device.txt
index 1c35e7b..3661dd2 100644
--- a/Documentation/devicetree/bindings/usb/usb-device.txt
+++ b/Documentation/devicetree/bindings/usb/usb-device.txt
@@ -13,6 +13,10 @@ Required properties:
 - reg: the port number which this device is connecting to, the range
   is 1-31.
 
+Optional properties:
+power sequence properties, see
+Documentation/devicetree/bindings/power/pwrseq/pwrseq-generic.txt for detail
+
 Example:
 
 &usb1 {
@@ -21,8 +25,12 @@ Example:
#address-cells = <1>;
#size-cells = <0>;
 
-   hub: genesys@1 {
+   genesys: hub@1 {
compatible = "usb5e3,608";
reg = <1>;
+
+   clocks = <&clks IMX6SX_CLK_CKO>;
+   reset-gpios = <&gpio4 5 GPIO_ACTIVE_LOW>; /* hub reset pin */
+   reset-duration-us = <10>;
};
 }
-- 
2.7.4



[PATCH] usb: musb: add code comment for clarification

2017-02-10 Thread Gustavo A. R. Silva

Add code comment to make it clear that the fall-through is intentional.
Read the link for more details: https://lkml.org/lkml/2017/2/9/292

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/usb/musb/musb_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/usb/musb/musb_core.c b/drivers/usb/musb/musb_core.c
index 892088f..1aec986 100644
--- a/drivers/usb/musb/musb_core.c
+++ b/drivers/usb/musb/musb_core.c
@@ -1869,6 +1869,7 @@ static void musb_pm_runtime_check_session(struct  
musb *musb)


return;
}
+   /* fall through */
case MUSB_QUIRK_A_DISCONNECT_19:
if (musb->quirk_retries--) {
musb_dbg(musb,
--
2.5.0



Re: [PATCH 2/3] Bluetooth: cmtp: fix possible might sleep error in cmtp_session

2017-02-10 Thread Brian Norris
Hi,

On Tue, Jan 24, 2017 at 12:07:50PM +0800, Jeffy Chen wrote:
> It looks like cmtp_session has same pattern as the issue reported in
> old rfcomm:
> 
>   while (1) {
>   set_current_state(TASK_INTERRUPTIBLE);
>   if (condition)
>   break;
>   // may call might_sleep here
>   schedule();
>   }
>   __set_current_state(TASK_RUNNING);
> 
> Which fixed at:
>   dfb2fae Bluetooth: Fix nested sleeps
> 
> So let's fix it at the same way, also follow the suggestion of:
> https://lwn.net/Articles/628628/
> 
> Signed-off-by: Jeffy Chen 
> ---
> 
>  net/bluetooth/cmtp/core.c | 21 ++---
>  1 file changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/net/bluetooth/cmtp/core.c b/net/bluetooth/cmtp/core.c
> index 9e59b66..6b03f2b 100644
> --- a/net/bluetooth/cmtp/core.c
> +++ b/net/bluetooth/cmtp/core.c
> @@ -280,16 +280,16 @@ static int cmtp_session(void *arg)
>   struct cmtp_session *session = arg;
>   struct sock *sk = session->sock->sk;
>   struct sk_buff *skb;
> - wait_queue_t wait;
> + DEFINE_WAIT_FUNC(wait, woken_wake_function);
>  
>   BT_DBG("session %p", session);
>  
>   set_user_nice(current, -15);
>  
> - init_waitqueue_entry(&wait, current);
>   add_wait_queue(sk_sleep(sk), &wait);
>   while (1) {
> - set_current_state(TASK_INTERRUPTIBLE);
> + /* Ensure session->terminate is updated */
> + smp_mb__before_atomic();
>  
>   if (atomic_read(&session->terminate))
>   break;
> @@ -306,9 +306,8 @@ static int cmtp_session(void *arg)
>  
>   cmtp_process_transmit(session);
>  
> - schedule();
> + wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
>   }
> - __set_current_state(TASK_RUNNING);
>   remove_wait_queue(sk_sleep(sk), &wait);
>  
>   down_write(&cmtp_session_sem);
> @@ -393,7 +392,11 @@ int cmtp_add_connection(struct cmtp_connadd_req *req, 
> struct socket *sock)
>   err = cmtp_attach_device(session);
>   if (err < 0) {
>   atomic_inc(&session->terminate);
> - wake_up_process(session->task);
> +
> + /* Ensure session->terminate is updated */
> + smp_mb__after_atomic();
> +

Same comment about the barrier.

> + wake_up_interruptible(sk_sleep(session->sock->sk));
>   up_write(&cmtp_session_sem);
>   return err;
>   }
> @@ -431,7 +434,11 @@ int cmtp_del_connection(struct cmtp_conndel_req *req)
>  
>   /* Stop session thread */
>   atomic_inc(&session->terminate);
> - wake_up_process(session->task);
> +
> + /* Ensure session->terminate is updated */
> + smp_mb__after_atomic();

And again.

But otherwise I think this looks OK, again with the caveat that I don't
know Bluetooth/CMTP that well:

Reviewed-by: Brian Norris 

> +
> + wake_up_interruptible(sk_sleep(session->sock->sk));
>   } else
>   err = -ENOENT;
>  
> -- 
> 2.1.4
> 
> 


Re: Linux 4.9.6 ( Restore IO-APIC irq_chip retrigger callback , breaks my box )

2017-02-10 Thread Gabriel C



On 11.02.2017 00:17, Gabriel C wrote:




Btw, how far in the boot process is the machine when this happens?


Right after :

Uncompressing Linux.
Booting the kernel..

So early..



After lots more boots .. I found out sometimes it gets to :

..

[4.656826] Key type dns_resolver registered

..

next line(s) in all my logs would be :

..

[4.657507] microcode: sig=0x106a5, pf=0x1, revision=0x19
[4.658678] microcode: Microcode Update Driver: v2.01 
, Peter Oruba

..

So maybe some sort of race in the microcode code? But that would be strange.


Re: [PATCH 1/3] Bluetooth: bnep: fix possible might sleep error in bnep_session

2017-02-10 Thread Brian Norris
Hi,

On Tue, Jan 24, 2017 at 12:07:49PM +0800, Jeffy Chen wrote:
> It looks like bnep_session has same pattern as the issue reported in
> old rfcomm:
> 
>   while (1) {
>   set_current_state(TASK_INTERRUPTIBLE);
>   if (condition)
>   break;
>   // may call might_sleep here
>   schedule();
>   }
>   __set_current_state(TASK_RUNNING);
> 
> Which fixed at:
>   dfb2fae Bluetooth: Fix nested sleeps
> 
> So let's fix it at the same way, also follow the suggestion of:
> https://lwn.net/Articles/628628/
> 
> Signed-off-by: Jeffy Chen 
> ---
> 
>  net/bluetooth/bnep/core.c | 15 +--
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/net/bluetooth/bnep/core.c b/net/bluetooth/bnep/core.c
> index fbf251f..da04d51 100644
> --- a/net/bluetooth/bnep/core.c
> +++ b/net/bluetooth/bnep/core.c
> @@ -484,16 +484,16 @@ static int bnep_session(void *arg)
>   struct net_device *dev = s->dev;
>   struct sock *sk = s->sock->sk;
>   struct sk_buff *skb;
> - wait_queue_t wait;
> + DEFINE_WAIT_FUNC(wait, woken_wake_function);
>  
>   BT_DBG("");
>  
>   set_user_nice(current, -15);
>  
> - init_waitqueue_entry(&wait, current);
>   add_wait_queue(sk_sleep(sk), &wait);
>   while (1) {
> - set_current_state(TASK_INTERRUPTIBLE);
> + /* Ensure session->terminate is updated */
> + smp_mb__before_atomic();
>  
>   if (atomic_read(&s->terminate))
>   break;
> @@ -515,9 +515,8 @@ static int bnep_session(void *arg)
>   break;
>   netif_wake_queue(dev);
>  
> - schedule();
> + wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
>   }
> - __set_current_state(TASK_RUNNING);
>   remove_wait_queue(sk_sleep(sk), &wait);
>  
>   /* Cleanup session */
> @@ -666,7 +665,11 @@ int bnep_del_connection(struct bnep_conndel_req *req)
>   s = __bnep_get_session(req->dst);
>   if (s) {
>   atomic_inc(&s->terminate);
> - wake_up_process(s->task);
> +
> + /* Ensure session->terminate is updated */
> + smp_mb__after_atomic();
> +

__wake_up() suggests:

 * It may be assumed that this function implies a write memory barrier before
 * changing the task state if and only if any tasks are woken up.

so the above barrier is probably unnecessary. I'm not so sure about the
one before atomic_read(); seems fine.

Other than that, I think this looks ok:

Reviewed-by: Brian Norris 

But I haven't been testing BNEP.

Brian

> + wake_up_interruptible(sk_sleep(s->sock->sk));
>   } else
>   err = -ENOENT;
>  
> -- 
> 2.1.4
> 
> 


[PATCH v2 4/5] staging: set msi_domain_ops as __ro_after_init

2017-02-10 Thread Jess Frazelle
Marked msi_domain_ops structs as __ro_after_init when called only during init.
This protects the data structure from accidental corruption.

Suggested-by: Kees Cook 
Signed-off-by: Jess Frazelle 
---
 drivers/staging/fsl-mc/bus/irq-gic-v3-its-fsl-mc-msi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/fsl-mc/bus/irq-gic-v3-its-fsl-mc-msi.c 
b/drivers/staging/fsl-mc/bus/irq-gic-v3-its-fsl-mc-msi.c
index 6b1cd574644f..0e2c1b5e13b7 100644
--- a/drivers/staging/fsl-mc/bus/irq-gic-v3-its-fsl-mc-msi.c
+++ b/drivers/staging/fsl-mc/bus/irq-gic-v3-its-fsl-mc-msi.c
@@ -51,7 +51,7 @@ static int its_fsl_mc_msi_prepare(struct irq_domain 
*msi_domain,
return msi_info->ops->msi_prepare(msi_domain->parent, dev, nvec, info);
 }

-static struct msi_domain_ops its_fsl_mc_msi_ops = {
+static struct msi_domain_ops its_fsl_mc_msi_ops __ro_after_init = {
.msi_prepare = its_fsl_mc_msi_prepare,
 };

--
2.11.0



[PATCH v2 5/5] x86: set msi_domain_ops as __ro_after_init

2017-02-10 Thread Jess Frazelle
Marked msi_domain_ops structs as __ro_after_init when called only during init.
This protects the data structure from accidental corruption.

Suggested-by: Kees Cook 
Signed-off-by: Jess Frazelle 
---
 arch/x86/kernel/apic/msi.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 015bbf30e3e3..27783a1e7166 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -121,7 +121,7 @@ void pci_msi_set_desc(msi_alloc_info_t *arg, struct 
msi_desc *desc)
 }
 EXPORT_SYMBOL_GPL(pci_msi_set_desc);

-static struct msi_domain_ops pci_msi_domain_ops = {
+static struct msi_domain_ops pci_msi_domain_ops __ro_after_init = {
.get_hwirq  = pci_msi_get_hwirq,
.msi_prepare= pci_msi_prepare,
.set_desc   = pci_msi_set_desc,
@@ -207,7 +207,7 @@ static int dmar_msi_init(struct irq_domain *domain,
return 0;
 }

-static struct msi_domain_ops dmar_msi_domain_ops = {
+static struct msi_domain_ops dmar_msi_domain_ops __ro_after_init = {
.get_hwirq  = dmar_msi_get_hwirq,
.msi_init   = dmar_msi_init,
 };
@@ -304,7 +304,7 @@ static void hpet_msi_free(struct irq_domain *domain,
irq_clear_status_flags(virq, IRQ_MOVE_PCNTXT);
 }

-static struct msi_domain_ops hpet_msi_domain_ops = {
+static struct msi_domain_ops hpet_msi_domain_ops __ro_after_init = {
.get_hwirq  = hpet_msi_get_hwirq,
.msi_init   = hpet_msi_init,
.msi_free   = hpet_msi_free,
--
2.11.0



[PATCH v2 2/5] time: mark syscore_ops as __ro_after_init

2017-02-10 Thread Jess Frazelle
Marked syscore_ops structs as __ro_after_init when register_syscore_ops was
called only during init. Most of the caller functions were already annotated as
__init.
unregister_syscore_ops() was never called on these ops.
This protects the data structure from accidental corruption.

Suggested-by: Kees Cook 
Signed-off-by: Jess Frazelle 
Acked-by: Rik van Riel 
---
 kernel/time/sched_clock.c | 2 +-
 kernel/time/timekeeping.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index a26036d37a38..5df2fc07300b 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -289,7 +289,7 @@ static void sched_clock_resume(void)
rd->read_sched_clock = cd.actual_read_sched_clock;
 }

-static struct syscore_ops sched_clock_ops = {
+static struct syscore_ops sched_clock_ops __ro_after_init = {
.suspend= sched_clock_suspend,
.resume = sched_clock_resume,
 };
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index db087d7e106d..467e3021723a 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1756,7 +1756,7 @@ int timekeeping_suspend(void)
 }

 /* sysfs resume/suspend bits for timekeeping */
-static struct syscore_ops timekeeping_syscore_ops = {
+static struct syscore_ops timekeeping_syscore_ops __ro_after_init = {
.resume = timekeeping_resume,
.suspend= timekeeping_suspend,
 };
--
2.11.0



[PATCH v2 3/5] pci: set msi_domain_ops as __ro_after_init

2017-02-10 Thread Jess Frazelle
Marked msi_domain_ops structs as __ro_after_init when called only during init.
This protects the data structure from accidental corruption.

Suggested-by: Kees Cook 
Signed-off-by: Jess Frazelle 
---
 drivers/pci/host/pci-hyperv.c | 2 +-
 drivers/pci/host/vmd.c| 2 +-
 drivers/pci/msi.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index 3efcc7bdc5fb..f05b93689d8f 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -958,7 +958,7 @@ static irq_hw_number_t hv_msi_domain_ops_get_hwirq(struct 
msi_domain_info *info,
return arg->msi_hwirq;
 }

-static struct msi_domain_ops hv_msi_ops = {
+static struct msi_domain_ops hv_msi_ops __ro_after_init = {
.get_hwirq  = hv_msi_domain_ops_get_hwirq,
.msi_prepare= pci_msi_prepare,
.set_desc   = pci_msi_set_desc,
diff --git a/drivers/pci/host/vmd.c b/drivers/pci/host/vmd.c
index 18ef1a93c10a..152c461538e4 100644
--- a/drivers/pci/host/vmd.c
+++ b/drivers/pci/host/vmd.c
@@ -253,7 +253,7 @@ static void vmd_set_desc(msi_alloc_info_t *arg, struct 
msi_desc *desc)
arg->desc = desc;
 }

-static struct msi_domain_ops vmd_msi_domain_ops = {
+static struct msi_domain_ops vmd_msi_domain_ops __ro_after_init = {
.get_hwirq  = vmd_get_hwirq,
.msi_init   = vmd_msi_init,
.msi_free   = vmd_msi_free,
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 50c5003295ca..93141d5e2d1c 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1413,7 +1413,7 @@ static void pci_msi_domain_set_desc(msi_alloc_info_t *arg,
 #define pci_msi_domain_set_descNULL
 #endif

-static struct msi_domain_ops pci_msi_domain_ops_default = {
+static struct msi_domain_ops pci_msi_domain_ops_default __ro_after_init = {
.set_desc   = pci_msi_domain_set_desc,
.msi_check  = pci_msi_domain_check_cap,
.handle_error   = pci_msi_domain_handle_error,
--
2.11.0



[PATCH v2 1/5] irq: set {msi_domain,syscore}_ops as __ro_after_init

2017-02-10 Thread Jess Frazelle
Marked msi_domain_ops structs as __ro_after_init when called only during init.
Marked syscore_ops structs as __ro_after_init when register_syscore_ops was
called only during init. Most of the caller functions were already annotated as
__init.
unregister_syscore_ops() was never called on these syscore_ops.
This protects the data structure from accidental corruption.

Suggested-by: Kees Cook 
Signed-off-by: Jess Frazelle 
---
 kernel/irq/generic-chip.c | 2 +-
 kernel/irq/msi.c  | 2 +-
 kernel/irq/pm.c   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/irq/generic-chip.c b/kernel/irq/generic-chip.c
index ee32870079c9..cca63dbaabea 100644
--- a/kernel/irq/generic-chip.c
+++ b/kernel/irq/generic-chip.c
@@ -623,7 +623,7 @@ static void irq_gc_shutdown(void)
}
 }

-static struct syscore_ops irq_gc_syscore_ops = {
+static struct syscore_ops irq_gc_syscore_ops __ro_after_init = {
.suspend = irq_gc_suspend,
.resume = irq_gc_resume,
.shutdown = irq_gc_shutdown,
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index ee230063f033..0e5b723f710f 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -217,7 +217,7 @@ static int msi_domain_ops_check(struct irq_domain *domain,
return 0;
 }

-static struct msi_domain_ops msi_domain_ops_default = {
+static struct msi_domain_ops msi_domain_ops_default __ro_after_init = {
.get_hwirq  = msi_domain_ops_get_hwirq,
.msi_init   = msi_domain_ops_init,
.msi_check  = msi_domain_ops_check,
diff --git a/kernel/irq/pm.c b/kernel/irq/pm.c
index cea1de0161f1..d6b889bed323 100644
--- a/kernel/irq/pm.c
+++ b/kernel/irq/pm.c
@@ -185,7 +185,7 @@ static void irq_pm_syscore_resume(void)
resume_irqs(true);
 }

-static struct syscore_ops irq_pm_syscore_ops = {
+static struct syscore_ops irq_pm_syscore_ops __ro_after_init = {
.resume = irq_pm_syscore_resume,
 };

--
2.11.0
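
A rough userspace analogy for what the annotation buys: fill in a table during initialisation, then drop write permission on the memory that holds it. This is not how the kernel implements __ro_after_init; it is a sketch assuming a Linux/POSIX system, and struct demo_ops, ops_alloc(), ops_seal() and demo_suspend() are hypothetical names.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical ops table, loosely shaped like the ones in the series. */
struct demo_ops {
	int (*suspend)(void);
	void (*resume)(void);
};

/* Allocate one private, writable page for the ops table; NULL on failure. */
struct demo_ops *ops_alloc(void)
{
	long page = sysconf(_SC_PAGESIZE);
	void *mem;

	assert(page > 0 && (size_t)page >= sizeof(struct demo_ops));
	mem = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (mem == MAP_FAILED)
		return NULL;
	return mem;
}

/* Once "init" is done, drop write permission; later writes would fault. */
int ops_seal(struct demo_ops *ops)
{
	assert(ops != NULL);
	return mprotect(ops, (size_t)sysconf(_SC_PAGESIZE), PROT_READ);
}

/* Trivial callback used to populate the table. */
int demo_suspend(void)
{
	return 0;
}
```

After ops_seal() the callbacks can still be read and called, but a stray write to the table faults instead of silently corrupting it.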



RE: [PATCH] checkpatch: add warning on %pk instead of %pK usage

2017-02-10 Thread Roberts, William C

> > By "normal" I'm referring to things that call into pointer(), just
> > casually looking I see bstr_printf vsnprintf kvasprintf, which would
> > be easy enough to add
> >
> > > What do you think is missing?  sn?printf ? That's easy to add.
> >
> > The problem starts to get hairy when we think of how often folks roll
> > their own logging macros (see some small sampling at the end).
> >
> > I think we would want to add DEBUG DBG and sn?printf and maybe
> > consider dropping the \b on the regex so it's a bit more matchy but
> > still shouldn't end up matching on any ASM as you pointed out in the V2 
> > nack.
> >
> > Ill break this down into:
> > 1. the patch as I know you'll take it, as you wrote it :-P 2. Adding
> > to the logging macros 3. exploring making it less matchy

-Kees and Andrew they likely don't care about the rest of this...

I have been working up a regex (I suck at these) to match C functions that
have an invalid %p format string and take arguments:
http://www.regexr.com/3f92k

This could be a way to get better coverage in a more generic approach, thoughts?
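
To make the target of the check concrete, here is a small C sketch of the narrow case from the subject line: spotting a lowercase "%pk" in a format string. It is not the checkpatch regex and not the linked one. has_lowercase_pk() is a hypothetical helper, and it ignores flags and field widths between '%' and 'p'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true when fmt contains a "%pk" conversion (lowercase k).
 * "%%" is treated as a literal percent sign, so "%%pk" does not match.
 */
bool has_lowercase_pk(const char *fmt)
{
	size_t i;

	assert(fmt != NULL);
	for (i = 0; fmt[i] != '\0'; i++) {
		if (fmt[i] != '%')
			continue;
		if (fmt[i + 1] == '%') {
			i++;	/* skip the escaped percent */
			continue;
		}
		if (fmt[i + 1] == 'p' && fmt[i + 2] == 'k')
			return true;
	}
	return false;
}
```

The hard part raised in the thread is not this check but finding the format strings in the first place, since they are often hidden behind driver-local logging macros.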



Tracebacks in -next due to 'of: fix of_node leak caused in of_find_node_opts_by_path'

2017-02-10 Thread Guenter Roeck
Hi,

I see a number of tracebacks in test runs on qemu-next, all related to omap
configurations.

Here is an example:

[0.00] OF: ERROR: Bad of_node_put() on /ocp@6800
[0.00] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW 
4.10.0-rc7-next-20170210 #1
[0.00] Hardware name: Generic OMAP3-GP (Flattened Device Tree)
[0.00] [] (unwind_backtrace) from [] 
(show_stack+0x10/0x14)
[0.00] [] (show_stack) from [] 
(dump_stack+0x98/0xac)
[0.00] [] (dump_stack) from [] 
(kobject_release+0x48/0x7c)
[0.00] [] (kobject_release) from [] 
(__of_translate_address+0xb0/0x2cc)
[0.00] [] (__of_translate_address) from [] 
(__of_address_to_resource+0x28/0xb4)
[0.00] [] (__of_address_to_resource) from [] 
(of_address_to_resource+0x70/0x80)
[0.00] [] (of_address_to_resource) from [] 
(of_syscon_register+0x88/0x22c)
[0.00] [] (of_syscon_register) from [] 
(syscon_node_to_regmap+0x90/0x94)
[0.00] [] (syscon_node_to_regmap) from [] 
(omap_control_init+0x50/0xd8)
[0.00] [] (omap_control_init) from [] 
(omap_clk_init+0x3c/0x70)
[0.00] [] (omap_clk_init) from [] 
(__omap_sync32k_timer_init+0x20/0x2b4)
[0.00] [] (__omap_sync32k_timer_init) from [] 
(omap3_secure_sync32k_timer_init+0x3c/0x48)
[0.00] [] (omap3_secure_sync32k_timer_init) from [] 
(start_kernel+0x244/0x38c)
[0.00] [] (start_kernel) from [<8020807c>] (0x8020807c)
[0.00] Clocking rate (Crystal/Core/MPU): 26.0/332/500 MHz

There are several such messages with different call paths.

A log with all tracebacks is available at
http://kerneltests.org/builders/qemu-arm-next/builds/627/steps/qemubuildcommand/logs/stdio

Bisect points to commit 'of: fix of_node leak caused in
of_find_node_opts_by_path'. Bisect log is attached.

It is going to be interesting to learn if the patch introduces a problem
or if it exposes one.

Guenter

---
# bad: [632571b1bee00494aef749512d9f3290dfba0ead] Add linux-next specific files 
for 20170210
# good: [d5adbfcd5f7bcc6fa58a41c5c5ada0e5c826ce2c] Linux 4.10-rc7
git bisect start 'HEAD' 'v4.10-rc7'
# good: [0bd52e1bdcb050ad5bef5d8e93838d40fd44ac4b] Merge remote-tracking branch 
'crypto/master'
git bisect good 0bd52e1bdcb050ad5bef5d8e93838d40fd44ac4b
# bad: [6431424e2adf1b48333f7bd54cd5be8fef3953d7] Merge remote-tracking branch 
'tip/auto-latest'
git bisect bad 6431424e2adf1b48333f7bd54cd5be8fef3953d7
# good: [37f0c524925ae1f8fb62e39b3330357b1dc090bf] Merge remote-tracking branch 
'sound/for-next'
git bisect good 37f0c524925ae1f8fb62e39b3330357b1dc090bf
# good: [18e1b83f8219ccc8051e80384f90682739bf19c4] Merge remote-tracking branch 
'mfd/for-mfd-next'
git bisect good 18e1b83f8219ccc8051e80384f90682739bf19c4
# good: [7b844fd09215b87ed67ad69a3a5e09858f761dbb] Merge remote-tracking branch 
'iommu/next'
git bisect good 7b844fd09215b87ed67ad69a3a5e09858f761dbb
# good: [11e891d2e00d4c9408c8a35712538d1003e3f549] Merge branch 'sched/core'
git bisect good 11e891d2e00d4c9408c8a35712538d1003e3f549
# bad: [059a17407b8136363594f1d8c9fa53ac6ca6ac2a] Merge remote-tracking branch 
'spi/for-next'
git bisect bad 059a17407b8136363594f1d8c9fa53ac6ca6ac2a
# good: [9cfda694080954aa2be700ccadcedd0c5c15277a] Merge remote-tracking 
branches 'spi/topic/mpc52xx', 'spi/topic/ppc4xx', 'spi/topic/pxa2xx', 
'spi/topic/rockchip' and 'spi/topic/rspi' into spi-next
git bisect good 9cfda694080954aa2be700ccadcedd0c5c15277a
# good: [4b741bc35962ccf93b798a233512850c48c2646e] dt-bindings: net: remove 
reference to fixed link support
git bisect good 4b741bc35962ccf93b798a233512850c48c2646e
# good: [6160be71baba3dd80501c90ab44e0f17d5854721] Merge remote-tracking 
branches 'spi/topic/s3c64xx', 'spi/topic/sh-msiof', 'spi/topic/slave' and 
'spi/topic/topcliff-pch' into spi-next
git bisect good 6160be71baba3dd80501c90ab44e0f17d5854721
# good: [2a9bcff7f0d3883f5381f0fd8232990013002f92] Merge remote-tracking branch 
'audit/next'
git bisect good 2a9bcff7f0d3883f5381f0fd8232990013002f92
# bad: [e553f539f2af39db5e3b2c273cc1a22d34be49ad] of: make 
of_device_make_bus_id() static
git bisect bad e553f539f2af39db5e3b2c273cc1a22d34be49ad
# bad: [0549bde0fcb11a95773e7dc4121738b9e653abf4] of: fix of_node leak caused 
in of_find_node_opts_by_path
git bisect bad 0549bde0fcb11a95773e7dc4121738b9e653abf4
# first bad commit: [0549bde0fcb11a95773e7dc4121738b9e653abf4] of: fix of_node 
leak caused in of_find_node_opts_by_path


Is it really safe to use workqueues to drive expedited grace periods?

2017-02-10 Thread Paul E. McKenney
Hello!

So RCU's expedited grace periods have been using workqueues for a
little while, and things seem to be working.  But as usual, I worry...
Is this use subject to some sort of deadlock where RCU's workqueue cannot
start running until after a grace period completes, but that grace
period is the one needing the workqueue?  Note that there are ways to
set up your kernel so that all RCU grace periods are expedited.

Should I be worried?  If not, what prevents this from being a problem,
especially given that workqueue handlers are allowed to wait for RCU
grace periods to complete?

Thanx, Paul



Re: [PATCH 3/3] Bluetooth: hidp: fix possible might sleep error in hidp_session_thread

2017-02-10 Thread Brian Norris
Hi Jeffy,

I'm really not an expert on bluetooth or HIDP, but I can't bring myself
to say that this is correct. I still think you have a problem.

On Tue, Jan 24, 2017 at 12:07:51PM +0800, Jeffy Chen wrote:
> It looks like hidp_session_thread has same pattern as the issue reported in
> old rfcomm:
> 
>   while (1) {
>   set_current_state(TASK_INTERRUPTIBLE);
>   if (condition)
>   break;
>   // may call might_sleep here
>   schedule();
>   }
>   __set_current_state(TASK_RUNNING);
> 
> Which was fixed in:
>   dfb2fae Bluetooth: Fix nested sleeps
> 
> So let's fix it in the same way, also following the suggestion of:
> https://lwn.net/Articles/628628/
> 
> Signed-off-by: Jeffy Chen 
> ---
> 
>  net/bluetooth/hidp/core.c | 23 +++
>  1 file changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
> index 0bec458..43d6e6a 100644
> --- a/net/bluetooth/hidp/core.c
> +++ b/net/bluetooth/hidp/core.c
> @@ -36,6 +36,7 @@
>  #define VERSION "1.2"
>  
>  static DECLARE_RWSEM(hidp_session_sem);
> +static DECLARE_WAIT_QUEUE_HEAD(hidp_session_wq);
>  static LIST_HEAD(hidp_session_list);
>  
>  static unsigned char hidp_keycode[256] = {
> @@ -1068,12 +1069,15 @@ static int hidp_session_start_sync(struct 
> hidp_session *session)
>   * Wake up session thread and notify it to stop. This is asynchronous and
>   * returns immediately. Call this whenever a runtime error occurs and you 
> want
>   * the session to stop.
> - * Note: wake_up_process() performs any necessary memory-barriers for us.
>   */
>  static void hidp_session_terminate(struct hidp_session *session)
>  {
>   atomic_inc(&session->terminate);
> - wake_up_process(session->task);
> +
> + /* Ensure session->terminate is updated */
> + smp_mb__after_atomic();
> +
> + wake_up_interruptible(&hidp_session_wq);

So, you're adding a whole new wait queue here.

>  }
>  
>  /*
> @@ -1180,7 +1184,9 @@ static void hidp_session_run(struct hidp_session 
> *session)
>   struct sock *ctrl_sk = session->ctrl_sock->sk;
>   struct sock *intr_sk = session->intr_sock->sk;
>   struct sk_buff *skb;
> + DEFINE_WAIT_FUNC(wait, woken_wake_function);
>  
> + add_wait_queue(&hidp_session_wq, &wait);
>   for (;;) {
>   /*
>* This thread can be woken up two ways:
> @@ -1188,12 +1194,10 @@ static void hidp_session_run(struct hidp_session 
> *session)
>*session->terminate flag and wakes this thread up.
>*  - Via modifying the socket state of ctrl/intr_sock. This
>*thread is woken up by ->sk_state_changed().
> -  *
> -  * Note: set_current_state() performs any necessary
> -  * memory-barriers for us.
>*/
> - set_current_state(TASK_INTERRUPTIBLE);
>  
> + /* Ensure session->terminate is updated */
> + smp_mb__before_atomic();
>   if (atomic_read(&session->terminate))
>   break;
>  
> @@ -1227,11 +1231,14 @@ static void hidp_session_run(struct hidp_session 
> *session)
>   hidp_process_transmit(session, &session->ctrl_transmit,
> session->ctrl_sock);
>  
> - schedule();
> + wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);

And you're waiting on it here.

But you're already on two other wait queues (hidp_session_thread()). So
the nice WQ_FLAG_WOKEN handling will only happen if you get woken via
the new hidp_session_wq queue. But what about the other two? Seems like
again you might have a race condition that would lead you to
(temporarily, at least?) missing a wake-up attempt.

I'm not really sure what the best way to resolve this would be. My best
guess would be to either consolidate the use of these wait queues, or
else roll a version of wait_woken() to handle 2 or more wait heads...

Am I wrong? I easily could be.

Brian

>   }
> + remove_wait_queue(&hidp_session_wq, &wait);
>  
>   atomic_inc(&session->terminate);
> - set_current_state(TASK_RUNNING);
> +
> + /* Ensure session->terminate is updated */
> + smp_mb__after_atomic();
>  }
>  
>  /*
> -- 
> 2.1.4
> 
> 


RE: [Resend PATCH 1/2 v3] pci-hyperv: properly handle pci bus remove

2017-02-10 Thread Long Li
Hi Bjorn,

This patch and the other one in the series ([Resend PATCH 2/2 v3] pci-hyperv: 
lock pci bus on device eject) have been Acked.

Is there anything else that should be done before it can be merged? Please let
me know.

Thanks

Long

> -Original Message-
> From: KY Srinivasan
> Sent: Friday, January 27, 2017 10:42 AM
> To: Long Li ; Haiyang Zhang
> ; Bjorn Helgaas 
> Cc: de...@linuxdriverproject.org; linux-...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Long Li 
> Subject: RE: [Resend PATCH 1/2 v3] pci-hyperv: properly handle pci bus
> remove
> 
> 
> 
> > -Original Message-
> > From: Long Li [mailto:lon...@exchange.microsoft.com]
> > Sent: Monday, January 23, 2017 9:45 PM
> > To: KY Srinivasan ; Haiyang Zhang
> > ; Bjorn Helgaas 
> > Cc: de...@linuxdriverproject.org; linux-...@vger.kernel.org; linux-
> > ker...@vger.kernel.org; Long Li 
> > Subject: [Resend PATCH 1/2 v3] pci-hyperv: properly handle pci bus
> > remove
> >
> > [This sender failed our fraud detection checks and may not be who they
> > appear to be. Learn about spoofing at
> > http://aka.ms/LearnAboutSpoofing]
> >
> > From: Long Li 
> >
> > hv_pci_devices_present is called in hv_pci_remove when we remove a PCI
> > device from host (e.g. by disabling SRIOV on a device). In
> > hv_pci_remove, the bus is already removed before the call, so we don't
> > need to rescan the bus in the workqueue scheduled from
> > hv_pci_devices_present. By introducing status hv_pcibus_removed, we
> can avoid this situation.
> >
> > Signed-off-by: Long Li 
> > Reported-by: Xiaofeng Wang 
> Acked-by: K. Y. Srinivasan 
> > ---
> >  drivers/pci/host/pci-hyperv.c | 20 +---
> >  1 file changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/pci/host/pci-hyperv.c
> > b/drivers/pci/host/pci-hyperv.c index a8deeca..4a37598 100644
> > --- a/drivers/pci/host/pci-hyperv.c
> > +++ b/drivers/pci/host/pci-hyperv.c
> > @@ -348,6 +348,7 @@ enum hv_pcibus_state {
> > hv_pcibus_init = 0,
> > hv_pcibus_probed,
> > hv_pcibus_installed,
> > +   hv_pcibus_removed,
> > hv_pcibus_maximum
> >  };
> >
> > @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct
> > work_struct *work)
> > put_pcichild(hpdev, hv_pcidev_ref_initial);
> > }
> >
> > -   /* Tell the core to rescan bus because there may have been changes.
> */
> > -   if (hbus->state == hv_pcibus_installed) {
> > +   switch (hbus->state) {
> > +   case hv_pcibus_installed:
> > +   /*
> > +* Tell the core to rescan bus
> > +* because there may have been changes.
> > +*/
> > pci_lock_rescan_remove();
> > pci_scan_child_bus(hbus->pci_bus);
> > pci_unlock_rescan_remove();
> > -   } else {
> > +   break;
> > +
> > +   case hv_pcibus_init:
> > +   case hv_pcibus_probed:
> > survey_child_resources(hbus);
> > +   break;
> > +
> > +   default:
> > +   break;
> > }
> >
> > up(&hbus->enum_sem);
> > @@ -2163,6 +2175,7 @@ static int hv_pci_probe(struct hv_device *hdev,
> > hbus = kzalloc(sizeof(*hbus), GFP_KERNEL);
> > if (!hbus)
> > return -ENOMEM;
> > +   hbus->state = hv_pcibus_init;
> >
> > /*
> >  * The PCI bus "domain" is what is called "segment" in ACPI
> > and @@ -2305,6 +2318,7 @@ static int hv_pci_remove(struct hv_device
> *hdev)
> > pci_stop_root_bus(hbus->pci_bus);
> > pci_remove_root_bus(hbus->pci_bus);
> > pci_unlock_rescan_remove();
> > +   hbus->state = hv_pcibus_removed;
> > }
> >
> > ret = hv_send_resources_released(hdev);
> > --
> > 1.8.5.6
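
The state handling in the quoted diff can be sketched in plain C as
follows. The enum values are copied from the patch; the demo_* names are
hypothetical labels introduced only for illustration, and the functions
return a label instead of calling into the PCI core.

```c
#include <assert.h>

/* States copied from the quoted diff. */
enum hv_pcibus_state {
	hv_pcibus_init = 0,
	hv_pcibus_probed,
	hv_pcibus_installed,
	hv_pcibus_removed,
	hv_pcibus_maximum
};

/* Hypothetical labels for what the work item does; not kernel identifiers. */
enum demo_action {
	DEMO_SURVEY_RESOURCES,
	DEMO_RESCAN_BUS,
	DEMO_DO_NOTHING
};

/* Before the patch: anything that is not "installed" falls into the else branch. */
enum demo_action demo_action_before(enum hv_pcibus_state state)
{
	assert(state <= hv_pcibus_maximum);
	if (state == hv_pcibus_installed)
		return DEMO_RESCAN_BUS;
	return DEMO_SURVEY_RESOURCES;
}

/* After the patch: a removed bus is neither rescanned nor surveyed. */
enum demo_action demo_action_after(enum hv_pcibus_state state)
{
	assert(state <= hv_pcibus_maximum);
	switch (state) {
	case hv_pcibus_installed:
		return DEMO_RESCAN_BUS;
	case hv_pcibus_init:
	case hv_pcibus_probed:
		return DEMO_SURVEY_RESOURCES;
	default:
		return DEMO_DO_NOTHING;
	}
}
```

The only input for which the two functions differ among the real states is
hv_pcibus_removed, which is the case the patch description is about.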



Re: [PATCH 0/2] net: ethernet: ti: cpsw: fix susp/resume

2017-02-10 Thread Ivan Khoronzhuk
On Fri, Feb 10, 2017 at 12:05:07PM -0600, Grygorii Strashko wrote:
> 
> 
> On 02/09/2017 07:45 PM, David Miller wrote:
> >From: Ivan Khoronzhuk 
> >Date: Fri, 10 Feb 2017 00:54:24 +0200
> >
> >>On Thu, Feb 09, 2017 at 05:21:26PM -0500, David Miller wrote:
> >>>From: Ivan Khoronzhuk 
> >>>Date: Thu,  9 Feb 2017 02:07:34 +0200
> >>>
> These two patches fix suspend/resume chain.
> >>>
> >>>Patch 2 doesn't apply cleanly to the 'net' tree, please
> >>>respin this series.
> >>
> >>Strange, I've just checked it on net-next/master, it was applied w/o any
> >>warnings.
> >
> >It makes no sense to test "net-next" when I am telling you that it is
> >the "net" tree it doesn't apply to.
> >
> >This is a bug fix, so it should be targetting the "net" tree.
> >
> 
> Looks like the first fix is for net, but the second one is for net-next
> I do not see
> 03fd01ad0eead23eb79294b6fb4d71dcac493855
> "net: ethernet: ti: cpsw: don't duplicate ndev_running"
> in net.

There is a dependency: both are for net-next, and only the first is for the
net tree.

> 
> -- 
> regards,
> -grygorii


[PATCH] Staging: media: platform: bcm2835 - style fix

2017-02-10 Thread Derek Robson
Changed permissions to octal style
Found using checkpatch

Signed-off-by: Derek Robson 
---
 drivers/staging/media/platform/bcm2835/bcm2835-camera.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/media/platform/bcm2835/bcm2835-camera.c 
b/drivers/staging/media/platform/bcm2835/bcm2835-camera.c
index ca15a698e018..7ef9147ddef7 100644
--- a/drivers/staging/media/platform/bcm2835/bcm2835-camera.c
+++ b/drivers/staging/media/platform/bcm2835/bcm2835-camera.c
@@ -61,9 +61,9 @@ MODULE_PARM_DESC(video_nr, "videoX start numbers, -1 is 
autodetect");
 
 static int max_video_width = MAX_VIDEO_MODE_WIDTH;
 static int max_video_height = MAX_VIDEO_MODE_HEIGHT;
-module_param(max_video_width, int, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
+module_param(max_video_width, int, 0644);
 MODULE_PARM_DESC(max_video_width, "Threshold for video mode");
-module_param(max_video_height, int, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
+module_param(max_video_height, int, 0644);
 MODULE_PARM_DESC(max_video_height, "Threshold for video mode");
 
 /* Gstreamer bug https://bugzilla.gnome.org/show_bug.cgi?id=726521
@@ -76,7 +76,7 @@ MODULE_PARM_DESC(max_video_height, "Threshold for video 
mode");
  * result).
  */
 static int gst_v4l2src_is_broken;
-module_param(gst_v4l2src_is_broken, int, S_IRUSR | S_IWUSR | S_IRGRP | 
S_IROTH);
+module_param(gst_v4l2src_is_broken, int, 0644);
 MODULE_PARM_DESC(gst_v4l2src_is_broken, "If non-zero, enable workaround for 
Gstreamer");
 
 /* global device data array */
-- 
2.11.1
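
For reference, the equivalence this style fix relies on can be checked in
userspace. This is a minimal sketch: S_IRUGO is a kernel-only helper, so
DEMO_S_IRUGO below is a hypothetical stand-in rebuilt from the POSIX bits,
and the function names are made up for the example.

```c
#include <assert.h>
#include <sys/stat.h>

/* Userspace stand-in for the kernel-only S_IRUGO helper. */
#define DEMO_S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)

/* The symbolic form that the patch removes. */
unsigned int demo_perm_symbolic(void)
{
	return S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH;
}

/* Returns 1 when a symbolic mode and an octal literal describe the same bits. */
int demo_perm_equal(unsigned int symbolic, unsigned int octal)
{
	assert((octal & ~07777u) == 0);	/* only permission bits expected */
	return symbolic == octal;
}
```

With these helpers, demo_perm_equal(demo_perm_symbolic(), 0644) holds, as
does demo_perm_equal(DEMO_S_IRUGO, 0444).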



Re: [PATCH] net: ethernet: ti: netcp_core: return netdev_tx_t in xmit

2017-02-10 Thread Ivan Khoronzhuk
On Fri, Feb 10, 2017 at 02:45:21PM -0500, David Miller wrote:
> From: Ivan Khoronzhuk 
> Date: Thu,  9 Feb 2017 16:24:14 +0200
> 
> > @@ -1300,7 +1301,7 @@ static int netcp_ndo_start_xmit(struct sk_buff *skb, 
> > struct net_device *ndev)
> > dev_warn(netcp->ndev_dev, "padding failed (%d), packet 
> > dropped\n",
> >  ret);
> > tx_stats->tx_dropped++;
> > -   return ret;
> > +   return NETDEV_TX_BUSY;
> > }
> > skb->len = NETCP_MIN_PACKET_SIZE;
> > }
> > @@ -1329,7 +1330,7 @@ static int netcp_ndo_start_xmit(struct sk_buff *skb, 
> > struct net_device *ndev)
> > if (desc)
> > netcp_free_tx_desc_chain(netcp, desc, sizeof(*desc));
> > dev_kfree_skb(skb);
> > -   return ret;
> > +   return NETDEV_TX_BUSY;
> >  }
> 
> I really think these should be returning NET_XMIT_DROP.

Yes, it seems a few more changes are needed here then. I will send a new
version later.


[PATCH] Staging: media: lirc - style fix

2017-02-10 Thread Derek Robson
Changed permissions to octal across whole driver
Found by checkpatch

Signed-off-by: Derek Robson 
---
 drivers/staging/media/lirc/lirc_sasem.c | 2 +-
 drivers/staging/media/lirc/lirc_sir.c   | 8 
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/media/lirc/lirc_sasem.c 
b/drivers/staging/media/lirc/lirc_sasem.c
index b0c176e14b6b..ac69fe1e2d44 100644
--- a/drivers/staging/media/lirc/lirc_sasem.c
+++ b/drivers/staging/media/lirc/lirc_sasem.c
@@ -158,7 +158,7 @@ static int debug;
 MODULE_AUTHOR(MOD_AUTHOR);
 MODULE_DESCRIPTION(MOD_DESC);
 MODULE_LICENSE("GPL");
-module_param(debug, int, S_IRUGO | S_IWUSR);
+module_param(debug, int, 0644);
 MODULE_PARM_DESC(debug, "Debug messages: 0=no, 1=yes (default: no)");
 
 static void delete_context(struct sasem_context *context)
diff --git a/drivers/staging/media/lirc/lirc_sir.c 
b/drivers/staging/media/lirc/lirc_sir.c
index c75ae43095ba..426753edac1c 100644
--- a/drivers/staging/media/lirc/lirc_sir.c
+++ b/drivers/staging/media/lirc/lirc_sir.c
@@ -826,14 +826,14 @@ MODULE_AUTHOR("Milan Pikula");
 #endif
 MODULE_LICENSE("GPL");
 
-module_param(io, int, S_IRUGO);
+module_param(io, int, 0444);
 MODULE_PARM_DESC(io, "I/O address base (0x3f8 or 0x2f8)");
 
-module_param(irq, int, S_IRUGO);
+module_param(irq, int, 0444);
 MODULE_PARM_DESC(irq, "Interrupt (4 or 3)");
 
-module_param(threshold, int, S_IRUGO);
+module_param(threshold, int, 0444);
 MODULE_PARM_DESC(threshold, "space detection threshold (3)");
 
-module_param(debug, bool, S_IRUGO | S_IWUSR);
+module_param(debug, bool, 0644);
 MODULE_PARM_DESC(debug, "Enable debugging messages");
-- 
2.11.1



Re: [PATCH v1 0/5] md: use bio_clone_fast()

2017-02-10 Thread Shaohua Li
On Fri, Feb 10, 2017 at 06:56:12PM +0800, Ming Lei wrote:
> Hi,
> 
> This patchset replaces bio_clone() with bio_clone_fast() in
> bio_clone_mddev() because:
> 
> 1) bio_clone_mddev() is used in raid normal I/O and isn't in
> resync I/O path, and all the direct access to bvec table in
> raid happens on resync I/O only except for write behind of raid1.
> Write behind is treated specially, so the replacement is safe.
> 
> 2) for write behind, bio_clone() is kept, but this patchset
> introduces bio_clone_bioset_partial() to just clone one specific 
> bvecs range instead of whole table. Then write behind is improved
> too.

Thanks! this patch set looks good to me.
Jens,
can you look at the first patch? If it's ok, I'll carry it in my tree.

Thanks,
Shaohua
 
> V1:
>   1) don't introduce bio_clone_slow_mddev_partial()
>   2) return failure if mddev->bio_set can't be created
>   3) remove check in bio_clone_mddev() as suggested by
>   Christoph Hellwig.
>   4) rename bio_clone_mddev() as bio_clone_fast_mddev()
> 
> 
> Ming Lei (5):
>   block: introduce bio_clone_bioset_partial()
>   md/raid1: use bio_clone_bioset_partial() in case of write behind
>   md: fail if mddev->bio_set can't be created
>   md: remove unnecessary check on mddev
>   md: fast clone bio in bio_clone_mddev()
> 
>  block/bio.c | 61 
> +
>  drivers/md/faulty.c |  2 +-
>  drivers/md/md.c | 14 ++--
>  drivers/md/md.h |  4 ++--
>  drivers/md/raid1.c  | 26 ---
>  drivers/md/raid10.c | 11 +-
>  drivers/md/raid5.c  |  4 ++--
>  include/linux/bio.h | 11 --
>  8 files changed, 92 insertions(+), 41 deletions(-)
> 
> -- 
> 2.7.4
> 
> Thanks,
> Ming


[PATCH v2 1/9] sysctl: fix lax sysctl_check_table() sanity check

2017-02-10 Thread Luis R. Rodriguez
Commit 7c60c48f58a7 ("sysctl: Improve the sysctl sanity checks")
improved sanity checks considerably, however the enhancements on
sysctl_check_table() meant adding a functional change so that
only the last table entry's sanity error is propagated. It also
changed the way errors were propagated so that each new check
reset the err value, this means only last sanity check computed
is used for an error. This has been in the kernel since v3.4 days.

Fix this by carrying on errors from previous checks and iterations
as we traverse the table and ensuring we keep any error from previous
checks. We keep iterating on the table even if an error is found so
we can complain for all errors found in one shot. This works as
-EINVAL is always returned on error anyway, and the check for error
is any non-zero value.

Fixes: 7c60c48f58a7 ("sysctl: Improve the sysctl sanity checks")
Signed-off-by: Luis R. Rodriguez 
---
 fs/proc/proc_sysctl.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index d4e37acd4821..d22ee738d2eb 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -1036,7 +1036,7 @@ static int sysctl_check_table(const char *path, struct 
ctl_table *table)
int err = 0;
for (; table->procname; table++) {
if (table->child)
-   err = sysctl_err(path, table, "Not a file");
+   err |= sysctl_err(path, table, "Not a file");
 
if ((table->proc_handler == proc_dostring) ||
(table->proc_handler == proc_dointvec) ||
@@ -1047,15 +1047,15 @@ static int sysctl_check_table(const char *path, struct 
ctl_table *table)
(table->proc_handler == proc_doulongvec_minmax) ||
(table->proc_handler == proc_doulongvec_ms_jiffies_minmax)) 
{
if (!table->data)
-   err = sysctl_err(path, table, "No data");
+   err |= sysctl_err(path, table, "No data");
if (!table->maxlen)
-   err = sysctl_err(path, table, "No maxlen");
+   err |= sysctl_err(path, table, "No maxlen");
}
if (!table->proc_handler)
-   err = sysctl_err(path, table, "No proc_handler");
+   err |= sysctl_err(path, table, "No proc_handler");
 
if ((table->mode & (S_IRUGO|S_IWUGO)) != table->mode)
-   err = sysctl_err(path, table, "bogus .mode 0%o",
+   err |= sysctl_err(path, table, "bogus .mode 0%o",
table->mode);
}
return err;
-- 
2.11.0
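
The remark that OR-ing works because -EINVAL is always the error value can
be illustrated with a small userspace sketch. The function name and the
array-of-results interface are assumptions made for the example; they are
not part of fs/proc/proc_sysctl.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: each entry of results is 0 (check passed) or
 * -EINVAL (check failed). Every entry is visited, and any failure is
 * kept in the accumulated value.
 */
int demo_accumulate(const int *results, size_t n)
{
	int err = 0;
	size_t i;

	assert(results != NULL);
	for (i = 0; i < n; i++)
		err |= results[i];	/* 0 | x == x, and x | x == x */
	return err;
}
```

Note that this only yields a meaningful errno when all failures share one
value. OR-ing two different negative error codes would produce an
unrelated number.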



[PATCH v2 9/9] test_sysctl: test against int proc_dointvec() array support

2017-02-10 Thread Luis R. Rodriguez
Add a few initial respective tests for an array:

  o Echoing values separated by spaces works
  o Echoing only first elements will set first elements
  o Confirm PAGE_SIZE limit still applies even if an array is used

Signed-off-by: Luis R. Rodriguez 
---
 lib/test_sysctl.c| 13 +
 tools/testing/selftests/sysctl/sysctl.sh | 89 
 2 files changed, 102 insertions(+)

diff --git a/lib/test_sysctl.c b/lib/test_sysctl.c
index 1654f41961b7..603c24b2f6cb 100644
--- a/lib/test_sysctl.c
+++ b/lib/test_sysctl.c
@@ -35,6 +35,7 @@ static int i_one_hundred = 100;
 struct test_sysctl_data {
int int_0001;
int int_0002;
+   int int_0003[4];
 
unsigned int uint_0001;
 
@@ -45,6 +46,11 @@ static struct test_sysctl_data test_data = {
.int_0001 = 60,
.int_0002 = 1,
 
+   .int_0003[0] = 0,
+   .int_0003[1] = 1,
+   .int_0003[2] = 2,
+   .int_0003[3] = 3,
+
.uint_0001 = 314,
 
.string_0001 = "(none)",
@@ -69,6 +75,13 @@ static struct ctl_table test_table[] = {
.proc_handler   = proc_dointvec,
},
{
+   .procname   = "int_0003",
+   .data   = &test_data.int_0003,
+   .maxlen = sizeof(test_data.int_0003),
+   .mode   = 0644,
+   .proc_handler   = proc_dointvec,
+   },
+   {
.procname   = "uint_0001",
.data   = &test_data.uint_0001,
.maxlen = sizeof(unsigned int),
diff --git a/tools/testing/selftests/sysctl/sysctl.sh 
b/tools/testing/selftests/sysctl/sysctl.sh
index eedfba6f0a57..963d572155b1 100755
--- a/tools/testing/selftests/sysctl/sysctl.sh
+++ b/tools/testing/selftests/sysctl/sysctl.sh
@@ -26,6 +26,7 @@ ALL_TESTS="0001:1:1"
 ALL_TESTS="$ALL_TESTS 0002:1:1"
 ALL_TESTS="$ALL_TESTS 0003:1:1"
 ALL_TESTS="$ALL_TESTS 0004:1:1"
+ALL_TESTS="$ALL_TESTS 0005:3:1"
 
 test_modprobe()
 {
@@ -78,6 +79,10 @@ test_reqs()
echo "$0: You need getconf installed"
exit 1
fi
+   if ! which diff 2> /dev/null > /dev/null; then
+   echo "$0: You need diff installed"
+   exit 1
+   fi
 }
 
 function load_req_mod()
@@ -137,6 +142,12 @@ verify()
return 0
 }
 
+verify_diff_w()
+{
+   echo "$TEST_STR" | diff -w -u - $1 2>&1 > /dev/null
+   return $?
+}
+
 test_rc()
 {
if [[ $rc != 0 ]]; then
@@ -317,6 +328,74 @@ run_limit_digit_int()
test_rc
 }
 
+# You used an int array
+run_limit_digit_int_array()
+{
+   echo -n "Testing array works as expected ... "
+   TEST_STR="4 3 2 1"
+   echo -n $TEST_STR > $TARGET
+
+   if ! verify_diff_w "${TARGET}"; then
+   echo "FAIL" >&2
+   rc=1
+   else
+   echo "ok"
+   fi
+   test_rc
+
+   echo -n "Testing skipping trailing array elements works ... "
+   # Do not reset_vals, carry on the values from the last test.
+   # If we only echo in two digits the last two are left intact
+   TEST_STR="100 101"
+   echo -n $TEST_STR > $TARGET
+   # After we echo in, to help diff we need to set on TEST_STR what
+   # we expect the result to be.
+   TEST_STR="100 101 2 1"
+
+   if ! verify_diff_w "${TARGET}"; then
+   echo "FAIL" >&2
+   rc=1
+   else
+   echo "ok"
+   fi
+   test_rc
+
+   echo -n "Testing PAGE_SIZE limit on array works ... "
+   # Do not reset_vals, carry on the values from the last test.
+   # Even if you use an int array, you are still restricted to
+   # MAX_DIGITS, this is a known limitation. Test limit works.
+   LIMIT=$((MAX_DIGITS -1))
+   TEST_STR="9"
+   (perl -e 'print " " x '$LIMIT';'; echo "${TEST_STR}") | \
+   dd of="${TARGET}" 2>/dev/null
+
+   TEST_STR="9 101 2 1"
+   if ! verify_diff_w "${TARGET}"; then
+   echo "FAIL" >&2
+   rc=1
+   else
+   echo "ok"
+   fi
+   test_rc
+
+   echo -n "Testing exceeding PAGE_SIZE limit fails as expected ... "
+   # Do not reset_vals, carry on the values from the last test.
+   # Now go over limit.
+   LIMIT=$((MAX_DIGITS))
+   TEST_STR="7"
+   (perl -e 'print " " x '$LIMIT';'; echo "${TEST_STR}") | \
+   dd of="${TARGET}" 2>/dev/null
+
+   TEST_STR="7 101 2 1"
+   if verify_diff_w "${TARGET}"; then
+   echo "FAIL" >&2
+   rc=1
+   else
+   echo "ok"
+   fi
+   test_rc
+}
+
 # You are using an unsigned int
 run_limit_digit_uint()
 {
@@ -477,6 +556,15 @@ sysctl_test_0004()
run_limit_digit_uint
 }
 
+sysctl_test_0005()
+{
+   TARGET="${SYSCTL}/int_0003"
+   reset_vals
+   ORIG=$(cat "${TARGET}")
+
+   run_limit_digit_int_array
+}
+
 list_tests()
 {
echo "Test ID list:"
@@ -489,6 +
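
The array behaviour the new tests expect (leading elements overwritten,
trailing elements left intact) can be modelled in userspace as below. This
is a sketch only: demo_write_int_array() is a hypothetical name, it does no
range checking, and it is not how proc_dointvec() is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical model of writing "v0 v1 ..." to an int array entry:
 * parsed values overwrite the leading elements, the rest stay untouched.
 * Returns the number of elements written.
 */
size_t demo_write_int_array(int *vals, size_t n, const char *input)
{
	size_t written = 0;

	assert(vals != NULL && input != NULL);
	while (written < n) {
		char *end;
		long v = strtol(input, &end, 10);	/* skips leading spaces */

		if (end == input)	/* no more numbers in the input */
			break;
		vals[written++] = (int)v;
		input = end;
	}
	return written;
}
```

Starting from {4, 3, 2, 1}, writing "100 101" through this model leaves
{100, 101, 2, 1}, which matches the value the second test case compares
against.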

[rcu:rcu/dev 31/38] kernel/rcu/rcu_segcblist.h:77:2: error: implicit declaration of function 'prefetch'

2017-02-10 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
rcu/dev
head:   1f7c9e1bb76b7dc50e515bd6ce9b3a8526377d17
commit: 0e629d6798567fe31bcf7e16ba5c5affcad15059 [31/38] rcu: Abstract 
multi-tail callback list handling
config: parisc-allyesconfig (attached as .config)
compiler: hppa-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget 
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
 -O ~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout 0e629d6798567fe31bcf7e16ba5c5affcad15059
# save the attached .config to linux build tree
make.cross ARCH=parisc 

All errors (new ones prefixed by >>):

   In file included from kernel/rcu/tree.h:32:0,
from kernel/rcu/tree_trace.c:46:
   kernel/rcu/rcu_segcblist.h: In function 'rcu_cblist_dequeue':
>> kernel/rcu/rcu_segcblist.h:77:2: error: implicit declaration of function 
>> 'prefetch' [-Werror=implicit-function-declaration]
 prefetch(rhp);
 ^~~~
   cc1: some warnings being treated as errors

vim +/prefetch +77 kernel/rcu/rcu_segcblist.h

71  {
72  struct rcu_head *rhp;
73  
74  rhp = rclp->head;
75  if (!rhp)
76  return NULL;
  > 77  prefetch(rhp);
78  rclp->len--;
79  rclp->head = rhp->next;
80  if (!rclp->head)

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


[PATCH v2 4/9] test_sysctl: add dedicated proc sysctl test driver

2017-02-10 Thread Luis R. Rodriguez
Although we have had tools/testing/selftests/sysctl/ with two
test cases, these use existing kernel sysctl interfaces. We want
to expand test coverage, but we can't just be looking for random
safe production values to poke at. Instead, dedicate a test
driver for debugging purposes and port the existing scripts to
use it. This will make it easier for further tests to be added.

Signed-off-by: Luis R. Rodriguez 
---
 lib/Kconfig.debug   |  11 +++
 lib/Makefile|   1 +
 lib/test_sysctl.c   | 106 
 tools/testing/selftests/sysctl/config   |   1 +
 tools/testing/selftests/sysctl/run_numerictests |   4 +-
 tools/testing/selftests/sysctl/run_stringtests  |   4 +-
 6 files changed, 123 insertions(+), 4 deletions(-)
 create mode 100644 lib/test_sysctl.c
 create mode 100644 tools/testing/selftests/sysctl/config

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 64c03b07ad2f..d753fac41f78 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1975,6 +1975,17 @@ config TEST_FIRMWARE
 
  If unsure, say N.
 
+config TEST_SYSCTL
+   tristate "sysctl test driver"
+   default n
+   depends on PROC_SYSCTL
+   help
+ This builds the "test_sysctl" module. This driver enables to test the
+ proc sysctl interfaces available to drivers safely without affecting
+ production knobs which might alter system functionality.
+
+ If unsure, say N.
+
 config TEST_UDELAY
tristate "udelay test driver"
default n
diff --git a/lib/Makefile b/lib/Makefile
index 445a39c21f46..ac832c440d41 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -45,6 +45,7 @@ obj-$(CONFIG_TEST_HEXDUMP) += test_hexdump.o
 obj-y += kstrtox.o
 obj-$(CONFIG_TEST_BPF) += test_bpf.o
 obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o
+obj-$(CONFIG_TEST_SYSCTL) += test_sysctl.o
 obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o
 obj-$(CONFIG_TEST_KASAN) += test_kasan.o
 obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o
diff --git a/lib/test_sysctl.c b/lib/test_sysctl.c
new file mode 100644
index ..9b9ae1a95ab3
--- /dev/null
+++ b/lib/test_sysctl.c
@@ -0,0 +1,106 @@
+/*
+ * proc sysctl test driver
+ *
+ * Copyright (C) 2017 Luis R. Rodriguez 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of copyleft-next (version 0.3.1 or later) as published
+ * at http://copyleft-next.org/.
+ */
+
+/*
+ * This module provides an interface to the the proc sysctl interfaces.  This
+ * driver requires CONFIG_PROC_SYSCTL. It will not normally be loaded by the
+ * system unless explicitly requested by name. You can also build this driver
+ * into your kernel.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int i_zero;
+static int i_one_hundred = 100;
+
+struct test_sysctl_data {
+   int int_0001;
+   char string_0001[65];
+};
+
+static struct test_sysctl_data test_data = {
+   .int_0001 = 60,
+   .string_0001 = "(none)",
+};
+
+/* These are all under /proc/sys/debug/test_sysctl/ */
+static struct ctl_table test_table[] = {
+   {
+   .procname   = "int_0001",
+   .data   = &test_data.int_0001,
+   .maxlen = sizeof(int),
+   .mode   = 0644,
+   .proc_handler   = proc_dointvec_minmax,
+   .extra1 = &i_zero,
+   .extra2 = &i_one_hundred,
+   },
+   {
+   .procname   = "string_0001",
+   .data   = &test_data.string_0001,
+   .maxlen = sizeof(test_data.string_0001),
+   .mode   = 0644,
+   .proc_handler   = proc_dostring,
+   },
+   { }
+};
+
+static struct ctl_table test_sysctl_table[] = {
+   {
+   .procname   = "test_sysctl",
+   .maxlen = 0,
+   .mode   = 0555,
+   .child  = test_table,
+   },
+   { }
+};
+
+static struct ctl_table test_sysctl_root_table[] = {
+   {
+   .procname   = "debug",
+   .maxlen = 0,
+   .mode   = 0555,
+   .child  = test_sysctl_table,
+   },
+   { }
+};
+
+static struct ctl_table_header *test_sysctl_header;
+
+static int __init test_sysctl_init(void)
+{
+   test_sysctl_header = register_sysctl_table(test_sysctl_root_table);
+   if (!test_sysctl_header)
+   return -ENOMEM;
+   return 0;
+}
+late_initcall(test_sysctl_init);
+
+static void __exit test_sysctl_exit(void)
+{
+   if (test_sysctl_header)
+   unregister_sysctl_table(test_sysctl_header);
+}
+
+module_exit(test_sysctl_exit);
+
+MODULE_AUTHOR("Luis R. Rodriguez ");
+

[PATCH v2 2/9] sysctl: add proper unsigned int support

2017-02-10 Thread Luis R. Rodriguez
Commit e7d316a02f6838 ("sysctl: handle error writing UINT_MAX to u32
fields") added proc_douintvec() to start adding support for unsigned
int. This, however, was only half the work needed; all of the following
issues are present with the current implementation:

  o Printing the values shows a negative value; this happens because
    proc_douintvec() uses do_proc_dointvec(), which uses proc_put_long()
  o We can easily wrap around the int values: UINT_MAX is 4294967295,
if we echo in 4294967295 + 1 we end up with 0, using 4294967295 + 2
we end up with 1.
  o We echo negative values in and they are accepted
  o sysctl_check_table() was never extended for proc_douintvec()

Fix all these issues by adding our own do_proc_douintvec() and adding
proc_douintvec() to sysctl_check_table().
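
The wrap-around and negative-input issues listed above can be reproduced
in userspace. The sketch below assumes a 32-bit unsigned int; the demo_*
names and the choice of -EINVAL as the error code are assumptions for the
example, not the kernel's implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Naive conversion: silently truncates, so UINT_MAX + 1 becomes 0. */
unsigned int demo_parse_uint_naive(const char *s)
{
	assert(s != NULL);
	return (unsigned int)strtoull(s, NULL, 10);
}

/* Checked conversion: rejects negative input and anything above UINT_MAX. */
int demo_parse_uint_checked(const char *s, unsigned int *out)
{
	unsigned long long v;
	char *end;

	assert(s != NULL && out != NULL);
	while (isspace((unsigned char)*s))
		s++;
	if (*s == '-')		/* strtoull() would accept and negate this */
		return -EINVAL;
	errno = 0;
	v = strtoull(s, &end, 10);
	if (end == s)		/* no digits at all */
		return -EINVAL;
	if (errno == ERANGE || v > UINT_MAX)
		return -EINVAL;
	*out = (unsigned int)v;
	return 0;
}
```

With a 32-bit unsigned int, the naive version turns "4294967296" into 0
and "4294967297" into 1, while the checked version accepts "4294967295"
and rejects both larger values as well as "-1".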

Historically sysctl proc helpers have supported arrays. Due to the
complexity this adds, though, we've taken a step back to evaluate array
users to determine if it's worth upkeeping for unsigned int. An
evaluation using Coccinelle has been done to perform a grammatical
search to ask ourselves:

  o How many sysctl proc_dointvec() (int) users exist which likely
should be moved over to proc_douintvec() (unsigned int) ?
Answer: about 8
- Of these how many are array users ?
Answer: Probably only 1
  o How many sysctl array users exist ?
Answer: about 12

This last question gives us an idea just how popular arrays: they
are not. Array support should probably just be kept for strings.

The identified uint ports are:

drivers/infiniband/core/ucma.c - max_backlog
drivers/infiniband/core/iwcm.c - default_backlog
net/core/sysctl_net_core.c - rps_sock_flow_sysctl()
net/netfilter/nf_conntrack_timestamp.c - nf_conntrack_timestamp -- bool
net/netfilter/nf_conntrack_acct.c - nf_conntrack_acct -- bool
net/netfilter/nf_conntrack_ecache.c - nf_conntrack_events -- bool
net/netfilter/nf_conntrack_helper.c - nf_conntrack_helper -- bool
net/phonet/sysctl.c - proc_local_port_range()

The only possible array user is proc_local_port_range(), but it does not
seem worth adding array support just for this, given that the range
support works just as well. Unsigned int support is desirable mainly
when you *need* more than INT_MAX, or when int min/max support does not
suffice for your ranges.

If you forget and by mistake register an unsigned int proc entry with an
array, the driver will fail and you will get something like the following:

sysctl table check failed: debug/test_sysctl//uint_0002 array now allowed
CPU: 2 PID: 1342 Comm: modprobe Tainted: GW   E 
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Call Trace:
 dump_stack+0x63/0x81
 __register_sysctl_table+0x350/0x650
 ? kmem_cache_alloc_trace+0x107/0x240
 __register_sysctl_paths+0x1b3/0x1e0
 ? 0xc005f000
 register_sysctl_table+0x1f/0x30
 test_sysctl_init+0x10/0x1000 [test_sysctl]
 do_one_initcall+0x52/0x1a0
 ? kmem_cache_alloc_trace+0x107/0x240
 do_init_module+0x5f/0x200
 load_module+0x1867/0x1bd0
 ? __symbol_put+0x60/0x60
 SYSC_finit_module+0xdf/0x110
 SyS_finit_module+0xe/0x10
 entry_SYSCALL_64_fastpath+0x1e/0xad
RIP: 0033:0x7f042b22d119
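
For reference, the registration-time check that produces the message
above boils down to a size comparison. The following is a userspace
model only: struct model_entry and model_check_array() are hypothetical
stand-ins for struct ctl_table and the new check, and the
is_uint_handler flag stands in for comparing proc_handler against
proc_douintvec.

```c
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for struct ctl_table. */
struct model_entry {
	const char *procname;
	size_t maxlen;
	int is_uint_handler;	/* models proc_handler == proc_douintvec */
};

/* Returns 0 if the entry is acceptable, -1 if it describes an array. */
static int model_check_array(const struct model_entry *e)
{
	/* A single unsigned int is the only size the uint handler takes. */
	if (e->is_uint_handler && e->maxlen != sizeof(unsigned int))
		return -1;
	return 0;
}
```

An entry whose maxlen is sizeof(unsigned int) passes; one backed by a
two element array has twice that maxlen and is refused.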


Cc: Subash Abhinov Kasiviswanathan 
Cc: Heinrich Schuchardt 
Cc: Kees Cook 
Cc: "David S. Miller" 
Cc: Ingo Molnar 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Fixes: e7d316a02f68 ("sysctl: handle error writing UINT_MAX to u32 fields")
Signed-off-by: Luis R. Rodriguez 
---
 fs/proc/proc_sysctl.c |  15 +
 kernel/sysctl.c   | 161 --
 2 files changed, 170 insertions(+), 6 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index d22ee738d2eb..73696a73a1ec 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -1031,6 +1031,18 @@ static int sysctl_err(const char *path, struct ctl_table *table, char *fmt, ...)
return -EINVAL;
 }
 
+static int sysctl_check_table_array(const char *path, struct ctl_table *table)
+{
+   int err = 0;
+
+   if (table->proc_handler == proc_douintvec) {
+   if (table->maxlen != sizeof(unsigned int))
+   err |= sysctl_err(path, table, "array now allowed");
+   }
+
+   return err;
+}
+
 static int sysctl_check_table(const char *path, struct ctl_table *table)
 {
int err = 0;
@@ -1040,6 +1052,7 @@ static int sysctl_check_table(const char *path, struct ctl_table *table)
 
if ((table->proc_handler == proc_dostring) ||
(table->proc_handler == proc_dointvec) ||
+   (table->proc_handler == proc_douintvec) ||
(table->proc_handler == proc_dointvec_minmax) ||
(table->proc_handler == proc_dointvec_jiffies) ||
(table->proc_handler == proc_dointvec_userhz_jiffies) ||
@@ -1050,6 +1063,8 @@ static int sysctl_check_table(const char *path, struct ctl_table *table)
err |= sysctl_err(path, table, "No data");
 

[PATCH v2 7/9] test_sysctl: add simple proc_dointvec() case

2017-02-10 Thread Luis R. Rodriguez
Test against a simple proc_dointvec() case. While at it, add
a test against INT_MAX: make sure INT_MAX works and INT_MAX+1
fails. Also test that negative values work.
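
The three expectations above can be modelled in userspace as follows.
This is only a sketch of the intended semantics, not the kernel parser;
parse_int_strict() is a hypothetical helper and the example values
assume a 32-bit int.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Userspace model: parse a decimal string into an int, refusing
 * anything that does not fit. *out is only written on success.
 */
static int parse_int_strict(const char *s, int *out)
{
	char *end;
	long long v;

	errno = 0;
	v = strtoll(s, &end, 10);
	if (end == s || *end != '\0')
		return -EINVAL;	/* no digits, or trailing garbage */
	if (errno == ERANGE || v > INT_MAX || v < INT_MIN)
		return -EINVAL;	/* does not fit in an int */
	*out = (int)v;
	return 0;
}
```

Feeding it "2147483647" succeeds, "2147483648" is refused, and "-3" is
accepted, which mirrors what the new shell test checks through
/proc/sys.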

Signed-off-by: Luis R. Rodriguez 
---
 lib/test_sysctl.c| 11 ++
 tools/testing/selftests/sysctl/sysctl.sh | 62 
 2 files changed, 73 insertions(+)

diff --git a/lib/test_sysctl.c b/lib/test_sysctl.c
index 9b9ae1a95ab3..c36a024d7351 100644
--- a/lib/test_sysctl.c
+++ b/lib/test_sysctl.c
@@ -34,11 +34,15 @@ static int i_one_hundred = 100;
 
 struct test_sysctl_data {
int int_0001;
+   int int_0002;
+
char string_0001[65];
 };
 
 static struct test_sysctl_data test_data = {
.int_0001 = 60,
+   .int_0002 = 1,
+
.string_0001 = "(none)",
 };
 
@@ -54,6 +58,13 @@ static struct ctl_table test_table[] = {
.extra2 = &i_one_hundred,
},
{
+   .procname   = "int_0002",
+   .data   = &test_data.int_0002,
+   .maxlen = sizeof(int),
+   .mode   = 0644,
+   .proc_handler   = proc_dointvec,
+   },
+   {
.procname   = "string_0001",
.data   = &test_data.string_0001,
.maxlen = sizeof(test_data.string_0001),
diff --git a/tools/testing/selftests/sysctl/sysctl.sh b/tools/testing/selftests/sysctl/sysctl.sh
index 14b9d875db42..45fd2ee5739c 100755
--- a/tools/testing/selftests/sysctl/sysctl.sh
+++ b/tools/testing/selftests/sysctl/sysctl.sh
@@ -24,6 +24,7 @@ TEST_FILE=$(mktemp)
 # we have tons of space.
 ALL_TESTS="0001:1:1"
 ALL_TESTS="$ALL_TESTS 0002:1:1"
+ALL_TESTS="$ALL_TESTS 0003:1:1"
 
 test_modprobe()
 {
@@ -52,6 +53,9 @@ function allow_user_defaults()
if [ -z $MAX_DIGITS ]; then
MAX_DIGITS=$(($PAGE_SIZE/8))
fi
+   if [ -z $INT_MAX ]; then
+   INT_MAX=$(getconf INT_MAX)
+   fi
 }
 
 test_reqs()
@@ -92,6 +96,9 @@ reset_vals()
int_0001)
VAL="60"
;;
+   int_0002)
+   VAL="1"
+   ;;
string_0001)
VAL="(none)"
;;
@@ -261,6 +268,48 @@ run_limit_digit()
test_rc
 }
 
+# You are using an int
+run_limit_digit_int()
+{
+   echo -n "Testing INT_MAX works ..."
+   reset_vals
+   TEST_STR="$INT_MAX"
+   echo -n $TEST_STR > $TARGET
+
+   if ! verify "${TARGET}"; then
+   echo "FAIL" >&2
+   rc=1
+   else
+   echo "ok"
+   fi
+   test_rc
+
+   echo -n "Testing INT_MAX + 1 will fail as expected..."
+   reset_vals
+   TEST_STR=$(($INT_MAX+1))
+   echo -n $TEST_STR > $TARGET 2> /dev/null
+
+   if verify "${TARGET}"; then
+   echo "FAIL" >&2
+   rc=1
+   else
+   echo "ok"
+   fi
+   test_rc
+
+   echo -n "Testing negative values will work as expected..."
+   reset_vals
+   TEST_STR="-3"
+   echo -n $TEST_STR > $TARGET 2> /dev/null
+   if ! verify "${TARGET}"; then
+   echo "FAIL" >&2
+   rc=1
+   else
+   echo "ok"
+   fi
+   test_rc
+}
+
 run_stringtests()
 {
echo -n "Writing entire sysctl in short writes ... "
@@ -354,6 +403,18 @@ sysctl_test_0002()
run_stringtests
 }
 
+sysctl_test_0003()
+{
+   TARGET="${SYSCTL}/int_0002"
+   reset_vals
+   ORIG=$(cat "${TARGET}")
+   TEST_STR=$(( $ORIG + 1 ))
+
+   run_numerictests
+   run_limit_digit
+   run_limit_digit_int
+}
+
 list_tests()
 {
echo "Test ID list:"
@@ -364,6 +425,7 @@ list_tests()
echo
echo "0001 x $(get_test_count 0001) - tests proc_dointvec_minmax()"
echo "0002 x $(get_test_count 0002) - tests proc_dostring()"
+   echo "0003 x $(get_test_count 0003) - tests proc_dointvec()"
 }
 
 test_reqs
-- 
2.11.0



[PATCH v2 5/9] test_sysctl: add generic script to expand on tests

2017-02-10 Thread Luis R. Rodriguez
This adds a generic script to let us more easily add more test
cases. Since we really have only two types of test cases, just
fold them into one file. Each test unit now lives in its own
function:

  # ./sysctl.sh -l
Test ID list:

TEST_ID x NUM_TEST
TEST_ID:   Test ID
NUM_TESTS: Number of recommended times to run the test

0001 x 1 - tests proc_dointvec_minmax()
0002 x 1 - tests proc_dostring()

For now we start off with what we had before, and run each test only once.
We can now watch a test case until it fails:

./sysctl.sh -w 0002

We can also run a test case x number of times, say we want to run
a test case 100 times:

./sysctl.sh -c 0001 100

To run a test case only once, for example:

./sysctl.sh -s 0002

The default settings are specified at the top of sysctl.sh.

Signed-off-by: Luis R. Rodriguez 
---
 tools/testing/selftests/sysctl/Makefile |   3 +-
 tools/testing/selftests/sysctl/common_tests | 109 --
 tools/testing/selftests/sysctl/run_numerictests |  10 -
 tools/testing/selftests/sysctl/run_stringtests  |  77 
 tools/testing/selftests/sysctl/sysctl.sh| 459 
 5 files changed, 460 insertions(+), 198 deletions(-)
 delete mode 100644 tools/testing/selftests/sysctl/common_tests
 delete mode 100755 tools/testing/selftests/sysctl/run_numerictests
 delete mode 100755 tools/testing/selftests/sysctl/run_stringtests
 create mode 100755 tools/testing/selftests/sysctl/sysctl.sh

diff --git a/tools/testing/selftests/sysctl/Makefile b/tools/testing/selftests/sysctl/Makefile
index b3c33e071f10..95c320b354e8 100644
--- a/tools/testing/selftests/sysctl/Makefile
+++ b/tools/testing/selftests/sysctl/Makefile
@@ -4,8 +4,7 @@
 # No binaries, but make sure arg-less "make" doesn't trigger "run_tests".
 all:
 
-TEST_PROGS := run_numerictests run_stringtests
-TEST_FILES := common_tests
+TEST_PROGS := sysctl.sh
 
 include ../lib.mk
 
diff --git a/tools/testing/selftests/sysctl/common_tests b/tools/testing/selftests/sysctl/common_tests
deleted file mode 100644
index 17d534b1b7b4..
--- a/tools/testing/selftests/sysctl/common_tests
+++ /dev/null
@@ -1,109 +0,0 @@
-#!/bin/sh
-
-TEST_FILE=$(mktemp)
-
-echo "== Testing sysctl behavior against ${TARGET} =="
-
-set_orig()
-{
-   echo "${ORIG}" > "${TARGET}"
-}
-
-set_test()
-{
-   echo "${TEST_STR}" > "${TARGET}"
-}
-
-verify()
-{
-   local seen
-   seen=$(cat "$1")
-   if [ "${seen}" != "${TEST_STR}" ]; then
-   return 1
-   fi
-   return 0
-}
-
-trap 'set_orig; rm -f "${TEST_FILE}"' EXIT
-
-rc=0
-
-echo -n "Writing test file ... "
-echo "${TEST_STR}" > "${TEST_FILE}"
-if ! verify "${TEST_FILE}"; then
-   echo "FAIL" >&2
-   exit 1
-else
-   echo "ok"
-fi
-
-echo -n "Checking sysctl is not set to test value ... "
-if verify "${TARGET}"; then
-   echo "FAIL" >&2
-   exit 1
-else
-   echo "ok"
-fi
-
-echo -n "Writing sysctl from shell ... "
-set_test
-if ! verify "${TARGET}"; then
-   echo "FAIL" >&2
-   exit 1
-else
-   echo "ok"
-fi
-
-echo -n "Resetting sysctl to original value ... "
-set_orig
-if verify "${TARGET}"; then
-   echo "FAIL" >&2
-   exit 1
-else
-   echo "ok"
-fi
-
-# Now that we've validated the sanity of "set_test" and "set_orig",
-# we can use those functions to set starting states before running
-# specific behavioral tests.
-
-echo -n "Writing entire sysctl in single write ... "
-set_orig
-dd if="${TEST_FILE}" of="${TARGET}" bs=4096 2>/dev/null
-if ! verify "${TARGET}"; then
-   echo "FAIL" >&2
-   rc=1
-else
-   echo "ok"
-fi
-
-echo -n "Writing middle of sysctl after synchronized seek ... "
-set_test
-dd if="${TEST_FILE}" of="${TARGET}" bs=1 seek=1 skip=1 2>/dev/null
-if ! verify "${TARGET}"; then
-   echo "FAIL" >&2
-   rc=1
-else
-   echo "ok"
-fi
-
-echo -n "Writing beyond end of sysctl ... "
-set_orig
-dd if="${TEST_FILE}" of="${TARGET}" bs=20 seek=2 2>/dev/null
-if verify "${TARGET}"; then
-echo "FAIL" >&2
-rc=1
-else
-echo "ok"
-fi
-
-echo -n "Writing sysctl with multiple long writes ... "
-set_orig
-(perl -e 'print "A" x 50;'; echo "${TEST_STR}") | \
-   dd of="${TARGET}" bs=50 2>/dev/null
-if verify "${TARGET}"; then
-   echo "FAIL" >&2
-   rc=1
-else
-   echo "ok"
-fi
diff --git a/tools/testing/selftests/sysctl/run_numerictests b/tools/testing/selftests/sysctl/run_numerictests
deleted file mode 100755
index cdfeef96568c..
--- a/tools/testing/selftests/sysctl/run_numerictests
+++ /dev/null
@@ -1,10 +0,0 @@
-#!/bin/sh
-
-SYSCTL="/proc/sys/debug/test_sysctl/"
-TARGET="${SYSCTL}/int_0001"
-ORIG=$(cat "${TARGET}")
-TEST_STR=$(( $ORIG + 1 ))
-
-. ./common_tests
-
-exit $rc
diff --git a/tools/testing/selftests/sysctl/run_stringtests b/tools/testing/selftests/sysctl/run_stringtests
deleted file mode 100755
index 15a646b2c527..
--- a/tools/testing/selftests/sysctl/run_stringtests
+++ /dev/null
@@ -

[PATCH v2 8/9] test_sysctl: add simple proc_douintvec() case

2017-02-10 Thread Luis R. Rodriguez
Test against a simple proc_douintvec() case. While at it, add
a test against UINT_MAX. Make sure UINT_MAX works, that UINT_MAX+1
fails, and that negative values are not accepted.
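
A userspace sketch of the same expectations follows. It is a model of
the intended semantics rather than the kernel code, and
parse_uint_strict() is a hypothetical helper. The explicit sign check
is there because strtoull() accepts a leading '-' and returns the
negated value converted to unsigned, so "-3" would otherwise come back
as a huge positive number.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Userspace model: parse a decimal string into an unsigned int.
 * Negative input and anything above UINT_MAX are refused, and *out is
 * only written on success.
 */
static int parse_uint_strict(const char *s, unsigned int *out)
{
	char *end;
	unsigned long long v;

	while (isspace((unsigned char)*s))
		s++;
	if (*s == '-')
		return -EINVAL;	/* strtoull() would silently negate this */

	errno = 0;
	v = strtoull(s, &end, 10);
	if (end == s || *end != '\0')
		return -EINVAL;	/* no digits, or trailing garbage */
	if (errno == ERANGE || v > UINT_MAX)
		return -EINVAL;	/* does not fit in an unsigned int */
	*out = (unsigned int)v;
	return 0;
}
```

Assuming a 32-bit unsigned int, "4294967295" is accepted while
"4294967296" and "-3" are both refused and leave the stored value
alone.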

Signed-off-by: Luis R. Rodriguez 
---
 lib/test_sysctl.c| 11 ++
 tools/testing/selftests/sysctl/sysctl.sh | 63 
 2 files changed, 74 insertions(+)

diff --git a/lib/test_sysctl.c b/lib/test_sysctl.c
index c36a024d7351..1654f41961b7 100644
--- a/lib/test_sysctl.c
+++ b/lib/test_sysctl.c
@@ -36,6 +36,8 @@ struct test_sysctl_data {
int int_0001;
int int_0002;
 
+   unsigned int uint_0001;
+
char string_0001[65];
 };
 
@@ -43,6 +45,8 @@ static struct test_sysctl_data test_data = {
.int_0001 = 60,
.int_0002 = 1,
 
+   .uint_0001 = 314,
+
.string_0001 = "(none)",
 };
 
@@ -65,6 +69,13 @@ static struct ctl_table test_table[] = {
.proc_handler   = proc_dointvec,
},
{
+   .procname   = "uint_0001",
+   .data   = &test_data.uint_0001,
+   .maxlen = sizeof(unsigned int),
+   .mode   = 0644,
+   .proc_handler   = proc_douintvec,
+   },
+   {
.procname   = "string_0001",
.data   = &test_data.string_0001,
.maxlen = sizeof(test_data.string_0001),
diff --git a/tools/testing/selftests/sysctl/sysctl.sh b/tools/testing/selftests/sysctl/sysctl.sh
index 45fd2ee5739c..eedfba6f0a57 100755
--- a/tools/testing/selftests/sysctl/sysctl.sh
+++ b/tools/testing/selftests/sysctl/sysctl.sh
@@ -25,6 +25,7 @@ TEST_FILE=$(mktemp)
 ALL_TESTS="0001:1:1"
 ALL_TESTS="$ALL_TESTS 0002:1:1"
 ALL_TESTS="$ALL_TESTS 0003:1:1"
+ALL_TESTS="$ALL_TESTS 0004:1:1"
 
 test_modprobe()
 {
@@ -56,6 +57,9 @@ function allow_user_defaults()
if [ -z $INT_MAX ]; then
INT_MAX=$(getconf INT_MAX)
fi
+   if [ -z $UINT_MAX ]; then
+   UINT_MAX=$(getconf UINT_MAX)
+   fi
 }
 
 test_reqs()
@@ -99,6 +103,9 @@ reset_vals()
int_0002)
VAL="1"
;;
+   uint_0001)
+   VAL="314"
+   ;;
string_0001)
VAL="(none)"
;;
@@ -310,6 +317,49 @@ run_limit_digit_int()
test_rc
 }
 
+# You are using an unsigned int
+run_limit_digit_uint()
+{
+   echo -n "Testing UINT_MAX works ..."
+   reset_vals
+   TEST_STR="$UINT_MAX"
+   echo -n $TEST_STR > $TARGET
+
+   if ! verify "${TARGET}"; then
+   echo "FAIL" >&2
+   rc=1
+   else
+   echo "ok"
+   fi
+   test_rc
+
+   echo -n "Testing UINT_MAX + 1 will fail as expected..."
+   reset_vals
+   TEST_STR=$(($UINT_MAX+1))
+   echo -n $TEST_STR > $TARGET 2> /dev/null
+
+   if verify "${TARGET}"; then
+   echo "FAIL" >&2
+   rc=1
+   else
+   echo "ok"
+   fi
+   test_rc
+
+   echo -n "Testing negative values will not work as expected ..."
+   reset_vals
+   TEST_STR="-3"
+   echo -n $TEST_STR > $TARGET 2> /dev/null
+
+   if verify "${TARGET}"; then
+   echo "FAIL" >&2
+   rc=1
+   else
+   echo "ok"
+   fi
+   test_rc
+}
+
 run_stringtests()
 {
echo -n "Writing entire sysctl in short writes ... "
@@ -415,6 +465,18 @@ sysctl_test_0003()
run_limit_digit_int
 }
 
+sysctl_test_0004()
+{
+   TARGET="${SYSCTL}/uint_0001"
+   reset_vals
+   ORIG=$(cat "${TARGET}")
+   TEST_STR=$(( $ORIG + 1 ))
+
+   run_numerictests
+   run_limit_digit
+   run_limit_digit_uint
+}
+
 list_tests()
 {
echo "Test ID list:"
@@ -426,6 +488,7 @@ list_tests()
echo "0001 x $(get_test_count 0001) - tests proc_dointvec_minmax()"
echo "0002 x $(get_test_count 0002) - tests proc_dostring()"
echo "0003 x $(get_test_count 0003) - tests proc_dointvec()"
+   echo "0004 x $(get_test_count 0004) - tests proc_douintvec()"
 }
 
 test_reqs
-- 
2.11.0



[PATCH v2 6/9] test_sysctl: test against PAGE_SIZE for int

2017-02-10 Thread Luis R. Rodriguez
Add the following tests to ensure we do not regress:

  o Test using a buffer full of spaces (PAGE_SIZE-1) followed by a
    single digit works

  o Test using a buffer full of spaces (PAGE_SIZE or over) will fail

As tests increase, instead of unloading the module and reloading it
we can just call a shell reset_vals(), which resets entries to the
values we know the driver sets at init.
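
The idea behind the two tests can be modelled in userspace like this.
It is a deliberately simplified sketch: model_write_digit() is a
hypothetical helper, and the assumption that only the first "limit"
bytes of a write are ever looked at is a modelling choice here, not a
statement about the exact kernel limit.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Userspace model: skip leading spaces inside a bounded window and
 * accept a single digit. Bytes beyond "limit" are never inspected, so
 * a digit hidden behind too many spaces is not seen at all.
 */
static int model_write_digit(const char *buf, size_t len, size_t limit,
			     int *out)
{
	size_t i = 0;

	if (len > limit)
		len = limit;	/* bytes past the limit are dropped */
	while (i < len && buf[i] == ' ')
		i++;
	if (i == len || buf[i] < '0' || buf[i] > '9')
		return -EINVAL;	/* nothing usable inside the window */
	*out = buf[i] - '0';
	return 0;
}
```

With a limit of 16, fifteen spaces followed by "3" are accepted, while
sixteen spaces followed by "4" fail and leave the previous value in
place.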

Signed-off-by: Luis R. Rodriguez 
---
 tools/testing/selftests/sysctl/sysctl.sh | 65 
 1 file changed, 65 insertions(+)

diff --git a/tools/testing/selftests/sysctl/sysctl.sh b/tools/testing/selftests/sysctl/sysctl.sh
index f8f29092063d..14b9d875db42 100755
--- a/tools/testing/selftests/sysctl/sysctl.sh
+++ b/tools/testing/selftests/sysctl/sysctl.sh
@@ -46,6 +46,12 @@ function allow_user_defaults()
if [ -z $SYSCTL ]; then
SYSCTL="/proc/sys/debug/test_sysctl"
fi
+   if [ -z $PAGE_SIZE ]; then
+   PAGE_SIZE=$(getconf PAGESIZE)
+   fi
+   if [ -z $MAX_DIGITS ]; then
+   MAX_DIGITS=$(($PAGE_SIZE/8))
+   fi
 }
 
 test_reqs()
@@ -60,6 +66,10 @@ test_reqs()
echo "$0: You need perl installed"
exit 1
fi
+   if ! which getconf 2> /dev/null > /dev/null; then
+   echo "$0: You need getconf installed"
+   exit 1
+   fi
 }
 
 function load_req_mod()
@@ -74,6 +84,23 @@ function load_req_mod()
fi
 }
 
+reset_vals()
+{
+   VAL=""
+   TRIGGER=$(basename ${TARGET})
+   case "$TRIGGER" in
+   int_0001)
+   VAL="60"
+   ;;
+   string_0001)
+   VAL="(none)"
+   ;;
+   *)
+   ;;
+   esac
+   echo -n $VAL > $TARGET
+}
+
 set_orig()
 {
if [ ! -z $TARGET ]; then
@@ -195,7 +222,42 @@ run_numerictests()
else
echo "ok"
fi
+   test_rc
+}
+
+# Your test must accept digits 3 and 4 to use this
+run_limit_digit()
+{
+   echo -n "Checking ignoring spaces up to PAGE_SIZE works on write ..."
+   reset_vals
+
+   LIMIT=$((MAX_DIGITS -1))
+   TEST_STR="3"
+   (perl -e 'print " " x '$LIMIT';'; echo "${TEST_STR}") | \
+   dd of="${TARGET}" 2>/dev/null
+
+   if ! verify "${TARGET}"; then
+   echo "FAIL" >&2
+   rc=1
+   else
+   echo "ok"
+   fi
+   test_rc
+
+   echo -n "Checking passing PAGE_SIZE of spaces fails on write ..."
+   reset_vals
 
+   LIMIT=$((MAX_DIGITS))
+   TEST_STR="4"
+   (perl -e 'print " " x '$LIMIT';'; echo "${TEST_STR}") | \
+   dd of="${TARGET}" 2>/dev/null
+
+   if verify "${TARGET}"; then
+   echo "FAIL" >&2
+   rc=1
+   else
+   echo "ok"
+   fi
test_rc
 }
 
@@ -271,15 +333,18 @@ run_stringtests()
 sysctl_test_0001()
 {
TARGET="${SYSCTL}/int_0001"
+   reset_vals
ORIG=$(cat "${TARGET}")
TEST_STR=$(( $ORIG + 1 ))
 
run_numerictests
+   run_limit_digit
 }
 
 sysctl_test_0002()
 {
TARGET="${SYSCTL}/string_0001"
+   reset_vals
ORIG=$(cat "${TARGET}")
TEST_STR="Testing sysctl"
# Only string sysctls support seeking/appending.
-- 
2.11.0



[PATCH v2 3/9] sysctl: add unsigned int range support

2017-02-10 Thread Luis R. Rodriguez
To keep parity with regular int interfaces, provide an unsigned
int proc_douintvec_minmax() which allows you to specify a range of
allowed valid numbers.

Adding proc_douintvec_minmax_sysadmin() is easy but we can wait for
an actual user for that.
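
As a rough userspace illustration of the range semantics (not the
kernel code), consider the sketch below. struct uint_range and
uint_range_write() are hypothetical names; a NULL bound means that side
of the range is open, mirroring an unset extra1/extra2. The sketch
performs the width check before narrowing the value, so the range
comparison never sees a truncated number; that ordering is a choice
made in this sketch.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the min/max pair taken from extra1/extra2. */
struct uint_range {
	const unsigned int *min;	/* NULL: no lower bound */
	const unsigned int *max;	/* NULL: no upper bound */
};

/* Stores parsed into *dst only if it fits and lies within the range. */
static int uint_range_write(unsigned long long parsed,
			    const struct uint_range *r, unsigned int *dst)
{
	unsigned int val;

	if (parsed > UINT_MAX)
		return -EINVAL;	/* would wrap when narrowed */
	val = (unsigned int)parsed;
	if ((r->min && val < *r->min) || (r->max && val > *r->max))
		return -ERANGE;	/* outside the allowed range */
	*dst = val;
	return 0;
}
```

For a range of 10 to 100, writing 100 succeeds, 101 and 9 return
-ERANGE, and with no bounds at all UINT_MAX + 1 still returns -EINVAL.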

Cc: Subash Abhinov Kasiviswanathan 
Cc: Heinrich Schuchardt 
Cc: Kees Cook 
Cc: "David S. Miller" 
Cc: Ingo Molnar 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Signed-off-by: Luis R. Rodriguez 
---
 fs/proc/proc_sysctl.c  |  4 ++-
 include/linux/sysctl.h |  3 +++
 kernel/sysctl.c| 66 ++
 3 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 73696a73a1ec..bc231740239d 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -1035,7 +1035,8 @@ static int sysctl_check_table_array(const char *path, struct ctl_table *table)
 {
int err = 0;
 
-   if (table->proc_handler == proc_douintvec) {
+   if ((table->proc_handler == proc_douintvec) ||
+   (table->proc_handler == proc_douintvec_minmax)) {
if (table->maxlen != sizeof(unsigned int))
err |= sysctl_err(path, table, "array now allowed");
}
@@ -1053,6 +1054,7 @@ static int sysctl_check_table(const char *path, struct ctl_table *table)
if ((table->proc_handler == proc_dostring) ||
(table->proc_handler == proc_dointvec) ||
(table->proc_handler == proc_douintvec) ||
+   (table->proc_handler == proc_douintvec_minmax) ||
(table->proc_handler == proc_dointvec_minmax) ||
(table->proc_handler == proc_dointvec_jiffies) ||
(table->proc_handler == proc_dointvec_userhz_jiffies) ||
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index adf4e51cf597..a35d40ecc211 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -47,6 +47,9 @@ extern int proc_douintvec(struct ctl_table *, int,
 void __user *, size_t *, loff_t *);
 extern int proc_dointvec_minmax(struct ctl_table *, int,
void __user *, size_t *, loff_t *);
+extern int proc_douintvec_minmax(struct ctl_table *table, int write,
+void __user *buffer, size_t *lenp,
+loff_t *ppos);
 extern int proc_dointvec_jiffies(struct ctl_table *, int,
 void __user *, size_t *, loff_t *);
 extern int proc_dointvec_userhz_jiffies(struct ctl_table *, int,
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 493bc05e546a..b286f57e9abe 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2533,6 +2533,65 @@ int proc_dointvec_minmax(struct ctl_table *table, int write,
do_proc_dointvec_minmax_conv, &param);
 }
 
+struct do_proc_douintvec_minmax_conv_param {
+   unsigned int *min;
+   unsigned int *max;
+};
+
+static int do_proc_douintvec_minmax_conv(unsigned long *lvalp,
+unsigned int *valp,
+int write, void *data)
+{
+   struct do_proc_douintvec_minmax_conv_param *param = data;
+
+   if (write) {
+   unsigned int val = *lvalp;
+
+   if ((param->min && *param->min > val) ||
+   (param->max && *param->max < val))
+   return -ERANGE;
+
+   if (*lvalp > UINT_MAX)
+   return -EINVAL;
+   *valp = val;
+   } else {
+   unsigned int val = *valp;
+   *lvalp = (unsigned long) val;
+   }
+
+   return 0;
+}
+
+/**
+ * proc_douintvec_minmax - read a vector of unsigned ints with min/max values
+ * @table: the sysctl table
+ * @write: %TRUE if this is a write to the sysctl file
+ * @buffer: the user buffer
+ * @lenp: the size of the user buffer
+ * @ppos: file position
+ *
+ * Reads/writes up to table->maxlen/sizeof(unsigned int) unsigned integer
+ * values from/to the user buffer, treated as an ASCII string. Negative
+ * strings are not allowed.
+ *
+ * This routine will ensure the values are within the range specified by
+ * table->extra1 (min) and table->extra2 (max). There is a final sanity
+ * check for UINT_MAX to avoid having to support wrap around uses from
+ * userspace.
+ *
+ * Returns 0 on success.
+ */
+int proc_douintvec_minmax(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+   struct do_proc_douintvec_minmax_conv_param param = {
+   .min = (unsigned int *) table->extra1,
+   .max = (unsigned int *) table->extra2,
+   };
+   return do_proc_douintvec(table, write, buffer, lenp, ppos,
+do_proc_douintvec_minmax_conv, &param);
+}
+
 static void validate_coredump_safety(void)
 {
 #ifdef CONFIG_COREDUMP
@@ -3041,6 +3100,12 @

[PATCH v2 0/9] sysctl: add and fix proper unsigned int support

2017-02-10 Thread Luis R. Rodriguez
In this v2 I've taken Alexey's recommendation and looked at array users
of the proc sysctl interface, which complicate the interface, to see if
we can instead just simplify the unsigned int implementation. I could
not find any clear candidate. As such I've just ripped out array
support.

Since some future unsigned int proc sysctl users might think there is
array support I've taken measures to do sanity checks on initialization
and warn the kernel if such users creep up. To validate this I ended up
just writing a simple test driver, and extending our tests. In doing this
I also found a really old issue with sysctl_check_table(), and yet another
issue with the first incarnation of proc_douintvec().

I hammered on proc_douintvec() as much as I could, and extended tests for
this to ensure we don't regress should some int users convert over.

I noticed one more issue but did not fix it, as I figured it was worth
discussing: proc_doi*_minmax() handlers have historically allowed users
to register even if their own data does not match the expressed min/max
values. When this happens the value is exposed on /proc/sys but reading
or writing does not work against it. I'm of the opinion that
sysctl_check_table() should just validate this and bail, preventing such
entries from ever creeping up. The only reason I didn't do this is that
it *could* mean some tables don't get registered in some cases -- I
haven't done the vetting. If we're fine with this I can add it later.
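
To illustrate the kind of validation being proposed, here is a
userspace sketch. Nothing like it is part of this series;
struct model_minmax_entry and model_initial_value_ok() are hypothetical
names, and the pointers stand in for an entry's data, extra1 and extra2
fields.

```c
#include <stddef.h>

/* Hypothetical stand-in for an int entry using a minmax handler. */
struct model_minmax_entry {
	const int *data;	/* current value */
	const int *min;		/* NULL: no lower bound */
	const int *max;		/* NULL: no upper bound */
};

/* Returns 1 if the initial value lies within its own bounds, else 0. */
static int model_initial_value_ok(const struct model_minmax_entry *e)
{
	if (!e->data)
		return 0;
	if (e->min && *e->data < *e->min)
		return 0;
	if (e->max && *e->data > *e->max)
		return 0;
	return 1;
}
```

A registration path could refuse entries for which such a check
returns 0; whether that is acceptable for existing tables is exactly
the open question above.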

Luis R. Rodriguez (9):
  sysctl: fix lax sysctl_check_table() sanity check
  sysctl: add proper unsigned int support
  sysctl: add unsigned int range support
  test_sysctl: add dedicated proc sysctl test driver
  test_sysctl: add generic script to expand on tests
  test_sysctl: test against PAGE_SIZE for int
  test_sysctl: add simple proc_dointvec() case
  test_sysctl: add simple proc_douintvec() case
  test_sysctl: test against int proc_dointvec() array support

 fs/proc/proc_sysctl.c   |  27 +-
 include/linux/sysctl.h  |   3 +
 kernel/sysctl.c | 227 +++-
 lib/Kconfig.debug   |  11 +
 lib/Makefile|   1 +
 lib/test_sysctl.c   | 141 +
 tools/testing/selftests/sysctl/Makefile |   3 +-
 tools/testing/selftests/sysctl/common_tests | 109 
 tools/testing/selftests/sysctl/config   |   1 +
 tools/testing/selftests/sysctl/run_numerictests |  10 -
 tools/testing/selftests/sysctl/run_stringtests  |  77 ---
 tools/testing/selftests/sysctl/sysctl.sh| 738 
 12 files changed, 1139 insertions(+), 209 deletions(-)
 create mode 100644 lib/test_sysctl.c
 delete mode 100644 tools/testing/selftests/sysctl/common_tests
 create mode 100644 tools/testing/selftests/sysctl/config
 delete mode 100755 tools/testing/selftests/sysctl/run_numerictests
 delete mode 100755 tools/testing/selftests/sysctl/run_stringtests
 create mode 100755 tools/testing/selftests/sysctl/sysctl.sh

-- 
2.11.0



Re: [RFC/PATCH 2/3] security: Add the Timgad module

2017-02-10 Thread Kees Cook
On Thu, Feb 2, 2017 at 9:04 AM, Djalal Harouni  wrote:
> From: Djalal Harouni 
>
> This adds the Timgad module. Timgad allows to apply restrictions on
> which task is allowed to load or unload kernel modules. Auto-load module
> feature is also handled. The settings can also be applied globally using
> a sysctl interface, this allows to complete the core kernel interface
> "modules_disable" which has only two modes: allow globally or deny
> globally.

To bikeshed on the name: since this is a module loading restriction
LSM, perhaps something more descriptive: ModAutoRestrict or something
like that? (Yes, Yama is poorly named, but initially it was going to
be more than just ptrace restrictions...)

> The feature is useful for sandboxing, embedded systems and Linux
> containers where only some containers/processes that have the
> right privileges are allowed to load/unload modules. Unprivileged

I'd be explicit here and discuss _auto_loading of modules. (Otherwise
people quickly get confused about this vs CAP_SYS_MODULE.) You mention
auto-load later, but I think mentioning it first makes this easier to
understand.

> processes should not be able to load/unload modules nor trigger the
> module auto-load feature. This behaviour was inspired from grsecurity's
> GRKERNSEC_MODHARDEN option.
>
> However I still did not complete the check since this has to be
> discussed first, so any bug here is not from grsecurity, but my bugs and
> on purpose. As this is a preliminary RFC these points are not settled,
> discussion has to happen on what should be the best behaviour and what
> checks should be in place. Currently the settings:
>
> Timgad module can be controled using a global sysctl setting:

typo: controlled

>/proc/sys/kernel/timgad/module_restrict

If this becomes /proc/sys/kernel/mod_autoload_restrict/ then the flag
can just be "enabled". (e.g. see LoadPin LSM.)

> Or using the prctl() interface:
>prctl(PR_TIMGAD_OPTS, PR_TIGMAD_SET_MOD_RESTRICT, value, 0, 0)
>
> *) The per-process prctl() settings are:
> prctl(PR_TIMGAD_OPTS, PR_TIGMAD_SET_MOD_RESTRICT, value, 0, 0)

Excellent, yes, please require the trailing zeros. :)

> Where value means:
>
> 0 - Classic module load and unload permissions, nothing changes.

"Classic module auto-load permissions ..." Nothing can auto-unload as-is.

> 1 - The current process must have CAP_SYS_MODULE to be able to load and
> unload modules. CAP_NET_ADMIN should allow the current process to
> load and unload only netdev aliased modules, not implemented

"... to be able to auto-load modules." Same thing about unloading:
everything needs CAP_SYS_MODULE to unload a module.

> 2 - Current process can not loaded nor unloaded modules.

"... cannot auto-load modules."

>
> *) sysctl interface supports the followin values:
>
> 0 - Classic module load and unload permissions, nothing changes.
>
> 1 - Only privileged processes with CAP_SYS_MODULE should be able to load and
> unload modules.
>
> To be added: processes with CAP_NET_ADMIN should be able to
> load and unload only netdev aliased modules, this is currently not
> supported. Other checks for real root without CAP_SYS_MODULE ? ...
>
> (This should be improved)
>
> 2 - Modules can not be loaded nor unloaded. Once set, this sysctl value
> cannot be changed.
>
> Rules:
> First the prctl() settings are checked, if the access is not denied
> then the global sysctl settings are checked.
>
> As said I will update the permission checks later, this is a preliminary
> RFC.
>
> Cc: Kees Cook 
> Signed-off-by: Djalal Harouni 
> ---
>  include/linux/lsm_hooks.h |   5 +
>  include/uapi/linux/prctl.h|   5 +
>  security/Kconfig  |   1 +
>  security/Makefile |   2 +
>  security/security.c   |   1 +
>  security/timgad/Kconfig   |  10 ++
>  security/timgad/Makefile  |   3 +
>  security/timgad/timgad_core.c | 306 +++
>  security/timgad/timgad_core.h |  53 +++
>  security/timgad/timgad_lsm.c  | 327 
> ++
>  10 files changed, 713 insertions(+)
>  create mode 100644 security/timgad/Kconfig
>  create mode 100644 security/timgad/Makefile
>  create mode 100644 security/timgad/timgad_core.c
>  create mode 100644 security/timgad/timgad_core.h
>  create mode 100644 security/timgad/timgad_lsm.c
>
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index b37e35e..6b83aaa 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -1935,5 +1935,10 @@ void __init loadpin_add_hooks(void);
>  #else
>  static inline void loadpin_add_hooks(void) { };
>  #endif
> +#ifdef CONFIG_SECURITY_TIMGAD
> +extern void __init timgad_add_hooks(void);
> +#else
> +static inline void __init timgad_add_hooks(void) { }
> +#endif
>
>  #endif /* ! __LINUX_LSM_HOOKS_H */
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index a8d0759..6d80eed 100644
> --- 

Re: [PATCH v1] regulator: Add driver for voltage controlled regulators

2017-02-10 Thread Matthias Kaehlcke
On Fri, Feb 10, 2017 at 12:43:48PM -0800, Matthias Kaehlcke wrote:

> The output voltage of a voltage controlled regulator can be controlled
> through the voltage of another regulator. The current version of this
> driver assumes that the output voltage is a linear function of the control
> voltage.
> 
> ...
>
> +static int vctrl_probe(struct platform_device *pdev)
> +{
> ...
> + /* determine if the voltage range of the control supply is continuous */
> + if ((regulator_count_voltages(vctrl->ctrl_supply) == 1) &&
> + regulator_is_supported_voltage(vctrl->ctrl_supply,
> +vrange_ctrl->min_uV,
> +vrange_ctrl->min_uV) &&
> + regulator_is_supported_voltage(vctrl->ctrl_supply,
> +vrange_ctrl->max_uV,
> +vrange_ctrl->max_uV)) {
> + rdesc->continuous_voltage_range = true;
> + rdesc->ops = &vctrl_ops_cont;
> + } else {
> + rdesc->ops = &vctrl_ops_non_cont;
> + }

This creature of indisputable beauty seemed to do the job on my
test system, however I just realized that the condition is BS. It
turns out that on my system the voltage count of 1 stems from the
parent, since the voltage count of the control supply itself is zero.

int regulator_count_voltages(struct regulator *regulator)
{
	struct regulator_dev	*rdev = regulator->rdev;

	if (rdev->desc->n_voltages)
		return rdev->desc->n_voltages;

	if (!rdev->supply)
		return -EINVAL;

	return regulator_count_voltages(rdev->supply);
}

This certainly doesn't help to determine if the regulator has a
continuous voltage range. It seems we need a function that evaluates
rdesc->continuous_voltage_range.
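
For illustration, the fall-through can be reproduced in userspace. This is only a sketch: the mock_* structures and functions are made-up stand-ins for the kernel ones, and mock_own_count() is a hypothetical helper, not an existing regulator API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct mock_desc {
	int n_voltages;
};

struct mock_rdev {
	const struct mock_desc *desc;
	const struct mock_rdev *supply;	/* parent supply, or NULL */
};

/* Same fall-through logic as regulator_count_voltages() quoted above. */
static int mock_count_voltages(const struct mock_rdev *rdev)
{
	if (rdev->desc->n_voltages)
		return rdev->desc->n_voltages;

	if (!rdev->supply)
		return -EINVAL;

	return mock_count_voltages(rdev->supply);
}

/* Count for the regulator itself, ignoring any parent supply. */
static int mock_own_count(const struct mock_rdev *rdev)
{
	return rdev->desc->n_voltages;
}
```

With a control supply that reports zero voltages and a parent that reports one, mock_count_voltages() returns the parent's count while mock_own_count() returns zero, which is the mismatch described above.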

--

Matthias


Re: [PATCH v3 net-next 4/9] sunvnet: add driver stats for ethtool support

2017-02-10 Thread Stephen Hemminger
On Fri, 10 Feb 2017 09:38:20 -0800
Shannon Nelson  wrote:

> +static void vsw_get_ethtool_stats(struct net_device *dev,
> +   struct ethtool_stats *estats, u64 *data)
> +{
> + int i = 0;
> +
> + data[i++] = dev->stats.rx_packets;
> + data[i++] = dev->stats.tx_packets;
> + data[i++] = dev->stats.rx_bytes;
> + data[i++] = dev->stats.tx_bytes;
> + data[i++] = dev->stats.rx_errors;
> + data[i++] = dev->stats.tx_errors;
> + data[i++] = dev->stats.rx_dropped;
> + data[i++] = dev->stats.tx_dropped;
> + data[i++] = dev->stats.multicast;

Please do not duplicate regular network statistics into ethtool.
This doesn't really add any value.


Re: [PATCH v2 3/4] seccomp: Create an action to log before allowing

2017-02-10 Thread Tyler Hicks
On 02/07/2017 06:33 PM, Kees Cook wrote:
> On Thu, Feb 2, 2017 at 9:37 PM, Tyler Hicks  wrote:
>> Add a new action, SECCOMP_RET_LOG, that logs a syscall before allowing
>> the syscall. At the implementation level, this action is identical to
>> the existing SECCOMP_RET_ALLOW action. However, it can be very useful when
>> initially developing a seccomp filter for an application. The developer
>> can set the default action to be SECCOMP_RET_LOG, maybe mark any
>> obviously needed syscalls with SECCOMP_RET_ALLOW, and then put the
>> application through its paces. A list of syscalls that triggered the
>> default action (SECCOMP_RET_LOG) can be easily gleaned from the logs and
>> that list can be used to build the syscall whitelist. Finally, the
>> developer can change the default action to the desired value.
>>
>> This provides a more friendly experience than seeing the application get
>> killed, then updating the filter and rebuilding the app, seeing the
>> application get killed due to a different syscall, then updating the
>> filter and rebuilding the app, etc.
>>
>> The functionality is similar to what's supported by the various LSMs.
>> SELinux has permissive mode, AppArmor has complain mode, SMACK has
>> bring-up mode, etc.
>>
>> SECCOMP_RET_LOG is given a lower value than SECCOMP_RET_ALLOW so that
>> "allow" can be written to the max_action_to_log sysctl in order to get a
>> list of logged actions without the, potentially larger, set of allowed
>> actions.
>>
>> Signed-off-by: Tyler Hicks 
>> ---
>>  Documentation/prctl/seccomp_filter.txt | 6 ++
>>  include/uapi/linux/seccomp.h   | 1 +
>>  kernel/seccomp.c   | 4 
>>  3 files changed, 11 insertions(+)
>>
>> diff --git a/Documentation/prctl/seccomp_filter.txt 
>> b/Documentation/prctl/seccomp_filter.txt
>> index 1e469ef..ba55a91 100644
>> --- a/Documentation/prctl/seccomp_filter.txt
>> +++ b/Documentation/prctl/seccomp_filter.txt
>> @@ -138,6 +138,12 @@ SECCOMP_RET_TRACE:
>> allow use of ptrace, even of other sandboxed processes, without
>> extreme care; ptracers can use this mechanism to escape.)
>>
>> +SECCOMP_RET_LOG:
>> +   Results in the system call being executed after it is logged. This
>> +   should be used by application developers to learn which syscalls 
>> their
>> +   application needs without having to iterate through multiple test and
>> +   development cycles to build the list.
>> +
>>  SECCOMP_RET_ALLOW:
>> Results in the system call being executed.
>>
>> diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
>> index 0f238a4..67f72cd 100644
>> --- a/include/uapi/linux/seccomp.h
>> +++ b/include/uapi/linux/seccomp.h
>> @@ -29,6 +29,7 @@
>>  #define SECCOMP_RET_TRAP   0x0003U /* disallow and force a SIGSYS */
>>  #define SECCOMP_RET_ERRNO  0x0005U /* returns an errno */
>>  #define SECCOMP_RET_TRACE  0x7ff0U /* pass to a tracer or disallow 
>> */
>> +#define SECCOMP_RET_LOG0x7ffeU /* allow after logging */
> 
> This adds to UAPI, so it'd be good to think for a moment about how
> this would work on older kernels: right now, if someone tried to use
> this RET_LOG on an old kernel, it'll get treated like RET_KILL. Is
> this sane?

It is not sane for userspace code to blindly attempt to use a new
feature on an old kernel. One of the main motivations of the
actions_avail sysctl is to allow userspace to be smart about what the
current kernel supports.

I'll be adding logic (requested by Paul) to libseccomp that checks this
sysctl when SECCOMP_RET_LOG is attempted to be used. Programs that don't
use libseccomp will have to do something similar.
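
A rough sketch of what "something similar" could look like for a program that does not use libseccomp. The sysctl path and the action name "log" are assumptions here (neither is spelled out in this thread), and the helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumed sysctl location; not stated in this thread. */
#define ACTIONS_AVAIL_PATH "/proc/sys/kernel/seccomp/actions_avail"

/* Return true if the whitespace-separated list contains exactly `name`. */
static bool action_listed(const char *list, const char *name)
{
	size_t len = strlen(name);
	const char *p = list;

	while (*p) {
		size_t tok;

		p += strspn(p, " \t\n");
		tok = strcspn(p, " \t\n");
		if (tok && tok == len && strncmp(p, name, len) == 0)
			return true;
		p += tok;
	}
	return false;
}

/* Read the sysctl; a missing file is treated as "not supported". */
static bool kernel_supports_action(const char *name)
{
	char buf[256];
	bool found = false;
	FILE *f = fopen(ACTIONS_AVAIL_PATH, "r");

	if (!f)
		return false;
	if (fgets(buf, sizeof(buf), f))
		found = action_listed(buf, name);
	fclose(f);
	return found;
}
```

A filter builder would call kernel_supports_action("log") once at startup and fall back to a different default action when it returns false.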

> 
> I'm also trying to figure out if there is some other solution to this,
> but they all involve tests against an otherwise RET_ALLOW case, which
> I want to avoid. :)
> 
> So, I think, for now, this looks good, but I'd prefer this be
> 0x7ffcU, just to make sure we have not painted ourselves into a
> numerical corner if we for some reason ever need to put something
> between RET_ALLOW and RET_LOG.

That makes sense. I'll do that in v3.
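
To put numbers on the "numerical corner", here is a small sketch. It assumes the full 32-bit action values from the mainline uapi header (the values quoted in this archive appear abbreviated) and that actions are spaced 0x10000 apart because the low 16 bits carry SECCOMP_RET_DATA. The macro and function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed full-width values; the archive above shows them shortened. */
#define RET_TRACE	0x7ff00000U
#define RET_ALLOW	0x7fff0000U
#define RET_LOG_V2	0x7ffe0000U	/* value proposed in this patch */
#define RET_LOG_V3	0x7ffc0000U	/* value suggested in review */
#define ACTION_STEP	0x00010000U	/* low 16 bits hold the data field */

/* Number of unused action values strictly between lo and hi. */
static unsigned int free_slots(uint32_t lo, uint32_t hi)
{
	return (hi - lo) / ACTION_STEP - 1;
}
```

Under those assumptions the v2 value leaves no free action value between RET_LOG and RET_ALLOW, while the suggested value leaves two.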
> 
>>  #define SECCOMP_RET_ALLOW  0x7fffU /* allow */
>>
>>  /* Masks for the return value sections. */
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index 548fb89..8627481 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -650,6 +650,7 @@ static int __seccomp_filter(int this_syscall, const 
>> struct seccomp_data *sd,
>>
>> return 0;
>>
>> +   case SECCOMP_RET_LOG:
> 
> Given my protective feelings about the RET_ALLOW case, can you make
> this a fully separate case statement? I'd rather have RET_ALLOW be
> distinctly separate.

Sure! It actually has to be two different cases now that we're doing the
hot path approach for SECCOMP_RET_ALLOW.

Tyler

> 
>> case SECCOMP_RET_ALLOW:
>> seccomp_log(this_syscall, 0, action);
>> return 0;
>> @@ -934,

Re: [PATCH v2 3/4] seccomp: Create an action to log before allowing

2017-02-10 Thread Kees Cook
On Fri, Feb 10, 2017 at 4:01 PM, Tyler Hicks  wrote:
> On 02/07/2017 06:33 PM, Kees Cook wrote:
>> This adds to UAPI, so it'd be good to think for a moment about how
>> this would work on older kernels: right now, if someone tried to use
>> this RET_LOG on an old kernel, it'll get treated like RET_KILL. Is
>> this sane?
>
> It is not sane for userspace code to blindly attempt to use a new
> feature on an old kernel. One of the main motivations of the
> actions_avail sysctl is to allow userspace to be smart about what the
> current kernel supports.

Yeah, agreed. I mean, userspace could also build a little test
program, toss in RET_LOG, run it and see if it gets SIGSYS. But that's
so much more pain than checking in /proc.

> I'll be adding logic (requested by Paul) to libseccomp that checks this
> sysctl when SECCOMP_RET_LOG is attempted to be used. Programs that don't
> use libseccomp will have to do something similar.

Excellent, I had been meaning to ask if you'd chatted with Paul at
all, since this is an API addition for libseccomp. Speaking of which,
can you CC linux-api@ on the next version too?

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH v2 3/4] seccomp: Create an action to log before allowing

2017-02-10 Thread Tyler Hicks
On 02/10/2017 06:08 PM, Kees Cook wrote:
> On Fri, Feb 10, 2017 at 4:01 PM, Tyler Hicks  wrote:
>> On 02/07/2017 06:33 PM, Kees Cook wrote:
>>> This adds to UAPI, so it'd be good to think for a moment about how
>>> this would work on older kernels: right now, if someone tried to use
>>> this RET_LOG on an old kernel, it'll get treated like RET_KILL. Is
>>> this sane?
>>
>> It is not sane for userspace code to blindly attempt to use a new
>> feature on an old kernel. One of the main motivations of the
>> actions_avail sysctl is to allow userspace to be smart about what the
>> current kernel supports.
> 
> Yeah, agreed. I mean, userspace could also build a little test
> program, toss in RET_LOG, run it and see if it gets SIGSYS. But that's
> so much more pain than checking in /proc.
> 
>> I'll be adding logic (requested by Paul) to libseccomp that checks this
>> sysctl when SECCOMP_RET_LOG is attempted to be used. Programs that don't
>> use libseccomp will have to do something similar.
> 
> Excellent, I had been meaning to ask if you'd chatted with Paul at
> all, since this is an API addition for libseccomp.

We talked through some of it after the initial PR that I submitted to
libseccomp:

  https://github.com/seccomp/libseccomp/pull/64

I'll be updating that as we get closer to a land-able set of kernel patches.

> Speaking of which, can you CC linux-api@ on the next version too?

Yes, good idea!

Tyler






[PATCH 3/3] Input: tsc2004/5 - switch to using generic device properties

2017-02-10 Thread Dmitry Torokhov
Instead of supporting legacy platform data (of which we have no mainline
users) and OF-based properties, let's switch to generic device properties.
This will still allow legacy boards to use the driver (by defining property
sets and attaching them to the drivers) and will simplify probe and make
driver usable on ACPI-based systems as well.

Signed-off-by: Dmitry Torokhov 
---
 drivers/input/touchscreen/tsc200x-core.c | 93 +++-
 include/linux/spi/tsc2005.h  | 34 
 2 files changed, 30 insertions(+), 97 deletions(-)
 delete mode 100644 include/linux/spi/tsc2005.h

diff --git a/drivers/input/touchscreen/tsc200x-core.c 
b/drivers/input/touchscreen/tsc200x-core.c
index 1c14a38e3748..88ea5e1b72ae 100644
--- a/drivers/input/touchscreen/tsc200x-core.c
+++ b/drivers/input/touchscreen/tsc200x-core.c
@@ -27,7 +27,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -114,7 +113,6 @@ struct tsc200x {
struct regulator*vio;
 
struct gpio_desc*reset_gpio;
-   void(*set_reset)(bool enable);
int (*tsc200x_cmd)(struct device *dev, u8 cmd);
int irq;
 };
@@ -227,12 +225,13 @@ static void tsc200x_stop_scan(struct tsc200x *ts)
ts->tsc200x_cmd(ts->dev, TSC200X_CMD_STOP);
 }
 
-static void tsc200x_set_reset(struct tsc200x *ts, bool enable)
+static void tsc200x_reset(struct tsc200x *ts)
 {
-   if (ts->reset_gpio)
-   gpiod_set_value_cansleep(ts->reset_gpio, enable);
-   else if (ts->set_reset)
-   ts->set_reset(enable);
+   if (ts->reset_gpio) {
+   gpiod_set_value_cansleep(ts->reset_gpio, 1);
+   usleep_range(100, 500); /* only 10us required */
+   gpiod_set_value_cansleep(ts->reset_gpio, 0);
+   }
 }
 
 /* must be called with ts->mutex held */
@@ -253,7 +252,7 @@ static void __tsc200x_enable(struct tsc200x *ts)
 {
tsc200x_start_scan(ts);
 
-   if (ts->esd_timeout && (ts->set_reset || ts->reset_gpio)) {
+   if (ts->esd_timeout && ts->reset_gpio) {
ts->last_valid_interrupt = jiffies;
schedule_delayed_work(&ts->esd_work,
round_jiffies_relative(
@@ -310,9 +309,7 @@ static ssize_t tsc200x_selftest_show(struct device *dev,
}
 
/* hardware reset */
-   tsc200x_set_reset(ts, false);
-   usleep_range(100, 500); /* only 10us required */
-   tsc200x_set_reset(ts, true);
+   tsc200x_reset(ts);
 
if (!success)
goto out;
@@ -354,7 +351,7 @@ static umode_t tsc200x_attr_is_visible(struct kobject *kobj,
umode_t mode = attr->mode;
 
if (attr == &dev_attr_selftest.attr) {
-   if (!ts->set_reset && !ts->reset_gpio)
+   if (!ts->reset_gpio)
mode = 0;
}
 
@@ -404,9 +401,7 @@ static void tsc200x_esd_work(struct work_struct *work)
 
tsc200x_update_pen_state(ts, 0, 0, 0);
 
-   tsc200x_set_reset(ts, false);
-   usleep_range(100, 500); /* only 10us required */
-   tsc200x_set_reset(ts, true);
+   tsc200x_reset(ts);
 
enable_irq(ts->irq);
tsc200x_start_scan(ts);
@@ -454,26 +449,12 @@ int tsc200x_probe(struct device *dev, int irq, const 
struct input_id *tsc_id,
  struct regmap *regmap,
  int (*tsc200x_cmd)(struct device *dev, u8 cmd))
 {
-   const struct tsc2005_platform_data *pdata = dev_get_platdata(dev);
-   struct device_node *np = dev->of_node;
-
struct tsc200x *ts;
struct input_dev *input_dev;
-   unsigned int max_x = MAX_12BIT;
-   unsigned int max_y = MAX_12BIT;
-   unsigned int max_p = MAX_12BIT;
-   unsigned int fudge_x = TSC200X_DEF_X_FUZZ;
-   unsigned int fudge_y = TSC200X_DEF_Y_FUZZ;
-   unsigned int fudge_p = TSC200X_DEF_P_FUZZ;
-   unsigned int x_plate_ohm = TSC200X_DEF_RESISTOR;
-   unsigned int esd_timeout;
+   u32 x_plate_ohm;
+   u32 esd_timeout;
int error;
 
-   if (!np && !pdata) {
-   dev_err(dev, "no platform data\n");
-   return -ENODEV;
-   }
-
if (irq <= 0) {
dev_err(dev, "no irq\n");
return -ENODEV;
@@ -487,23 +468,6 @@ int tsc200x_probe(struct device *dev, int irq, const 
struct input_id *tsc_id,
return -ENODEV;
}
 
-   if (pdata) {
-   fudge_x = pdata->ts_x_fudge;
-   fudge_y = pdata->ts_y_fudge;
-   fudge_p = pdata->ts_pressure_fudge;
-   max_x   = pdata->ts_x_max;
-   max_y   = pdata->ts_y_max;
-   max_p   = pdata->ts_pressure_max;
-   x_plate_ohm = pdata->ts_x_plate_ohm;
-   esd_timeout = pdata->esd_timeout_ms;
-   } else {
-   x_plate_ohm = TSC200X_DEF_RESISTOR;
-   of_property_read_u

[PATCH] Documentation: make Makefile.sphinx no-ops quieter

2017-02-10 Thread Jim Davis
Silence the "make[1]: Nothing to be done for ..." messages for the
no-op targets in Makefile.sphinx.

Signed-off-by: Jim Davis 
---
 Documentation/Makefile.sphinx | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/Makefile.sphinx b/Documentation/Makefile.sphinx
index 707c65337ebf..b83d1160aaba 100644
--- a/Documentation/Makefile.sphinx
+++ b/Documentation/Makefile.sphinx
@@ -92,9 +92,13 @@ xmldocs:
 
 # no-ops for the Sphinx toolchain
 sgmldocs:
+   @:
 psdocs:
+   @:
 mandocs:
+   @:
 installmandocs:
+   @:
 
 cleandocs:
$(Q)rm -rf $(BUILDDIR)
-- 
2.9.3



[PATCH 2/3] Input: tsc2004/5 - fix regulator handling

2017-02-10 Thread Dmitry Torokhov
When an optional regulator is missing, the regulator core returns
ERR_PTR(-ENOENT) and not NULL, so the check for a missing regulator is
incorrect. Also, the regulator is not optional, it may simply be missing
from the platform description, so let's use devm_regulator_get() and rely on
the regulator core to give us a dummy supply when a real one is not available.
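
A userspace sketch of why the NULL check never fires. The ERR_PTR helpers below are simplified re-creations of the kernel ones, and mock_get_optional_missing() is a hypothetical stand-in for a lookup that finds no regulator (using the error code named above).

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace re-creation of the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for a lookup that finds no regulator. */
static void *mock_get_optional_missing(void)
{
	return ERR_PTR(-ENOENT);
}
```

The returned pointer is non-NULL, so a test such as "if (ts->vio)" treats the error value as a valid regulator; only IS_ERR() tells the two apart.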

Fixes: d257f2980feb ("Input: tsc2005 - convert to gpiod")
Signed-off-by: Dmitry Torokhov 
---

Sebastian, I am wondering, what regulator this is. If it is IO VDD,
then I think we activate it too late (i.e. we are trying to shut off
the controller before we turn the regulator on). If it is sensor VDD,
then we probably need to mention it, and also add an IO VDD supply as
well.

 drivers/input/touchscreen/tsc200x-core.c | 19 +++
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/drivers/input/touchscreen/tsc200x-core.c 
b/drivers/input/touchscreen/tsc200x-core.c
index b7059ed8872e..1c14a38e3748 100644
--- a/drivers/input/touchscreen/tsc200x-core.c
+++ b/drivers/input/touchscreen/tsc200x-core.c
@@ -527,10 +527,10 @@ int tsc200x_probe(struct device *dev, int irq, const 
struct input_id *tsc_id,
return error;
}
 
-   ts->vio = devm_regulator_get_optional(dev, "vio");
+   ts->vio = devm_regulator_get(dev, "vio");
if (IS_ERR(ts->vio)) {
error = PTR_ERR(ts->vio);
-   dev_err(dev, "vio regulator missing (%d)", error);
+   dev_err(dev, "error acquiring vio regulator: %d", error);
return error;
}
 
@@ -587,12 +587,9 @@ int tsc200x_probe(struct device *dev, int irq, const 
struct input_id *tsc_id,
return error;
}
 
-   /* enable regulator for DT */
-   if (ts->vio) {
-   error = regulator_enable(ts->vio);
-   if (error)
-   return error;
-   }
+   error = regulator_enable(ts->vio);
+   if (error)
+   return error;
 
dev_set_drvdata(dev, ts);
error = sysfs_create_group(&dev->kobj, &tsc200x_attr_group);
@@ -615,8 +612,7 @@ int tsc200x_probe(struct device *dev, int irq, const struct 
input_id *tsc_id,
 err_remove_sysfs:
sysfs_remove_group(&dev->kobj, &tsc200x_attr_group);
 disable_regulator:
-   if (ts->vio)
-   regulator_disable(ts->vio);
+   regulator_disable(ts->vio);
return error;
 }
 EXPORT_SYMBOL_GPL(tsc200x_probe);
@@ -627,8 +623,7 @@ int tsc200x_remove(struct device *dev)
 
sysfs_remove_group(&dev->kobj, &tsc200x_attr_group);
 
-   if (ts->vio)
-   regulator_disable(ts->vio);
+   regulator_disable(ts->vio);
 
return 0;
 }
-- 
2.11.0.483.g087da7b7c-goog



RE: [PATCHv5 0/7] Refactor macvtap to re-use tap functionality by other virtual intefaces

2017-02-10 Thread Grandhi, Sainath


> -Original Message-
> From: David Miller [mailto:da...@davemloft.net]
> Sent: Thursday, February 09, 2017 2:08 PM
> To: Grandhi, Sainath 
> Cc: net...@vger.kernel.org; mah...@bandewar.net; linux-
> ker...@vger.kernel.org
> Subject: Re: [PATCHv5 0/7] Refactor macvtap to re-use tap functionality by
> other virtual intefaces
> 
> From: Sainath Grandhi 
> Date: Wed,  8 Feb 2017 13:37:09 -0800
> 
> > Tap character devices can be implemented on other virtual interfaces
> > like ipvlan, similar to macvtap. Source code for tap functionality in
> > macvtap can be re-used for this purpose.
> >
> > This patch series splits macvtap source into two modules, macvtap and tap.
> > This patch series also includes a patch for implementing tap character
> > device driver based on the IP-VLAN network interface, called ipvtap.
> >
> > These patches are tested on x86 platform.
> 
> I get rejects on patch #7 when I try to apply this to net-next, please respin.

Please check the next version. I have based it on net-next.
There is a change in the "net-next" repo to ipvlan_core.c that has not made it
into the "net" repo.


[PATCHv6 2/7] tap: Renaming tap related APIs, data structures, macros

2017-02-10 Thread Sainath Grandhi
Renaming tap related APIs, data structures and macros in tap.c from macvtap_.* 
to tap_.*

Signed-off-by: Sainath Grandhi 
---
 drivers/net/macvtap_main.c |  18 +--
 drivers/net/tap.c  | 332 ++---
 drivers/vhost/net.c|   3 +-
 include/linux/if_macvlan.h |  17 +--
 include/linux/if_macvtap.h |  10 --
 include/linux/if_tap.h |  23 
 6 files changed, 202 insertions(+), 201 deletions(-)
 delete mode 100644 include/linux/if_macvtap.h
 create mode 100644 include/linux/if_tap.h

diff --git a/drivers/net/macvtap_main.c b/drivers/net/macvtap_main.c
index 96ffa60..548f339 100644
--- a/drivers/net/macvtap_main.c
+++ b/drivers/net/macvtap_main.c
@@ -1,6 +1,6 @@
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -62,7 +62,7 @@ static int macvtap_newlink(struct net *src_net,
 */
vlan->tap_features = TUN_OFFLOADS;
 
-   err = netdev_rx_handler_register(dev, macvtap_handle_frame, vlan);
+   err = netdev_rx_handler_register(dev, tap_handle_frame, vlan);
if (err)
return err;
 
@@ -82,7 +82,7 @@ static void macvtap_dellink(struct net_device *dev,
struct list_head *head)
 {
netdev_rx_handler_unregister(dev);
-   macvtap_del_queues(dev);
+   tap_del_queues(dev);
macvlan_dellink(dev, head);
 }
 
@@ -121,7 +121,7 @@ static int macvtap_device_event(struct notifier_block 
*unused,
 * been registered but before register_netdevice has
 * finished running.
 */
-   err = macvtap_get_minor(vlan);
+   err = tap_get_minor(vlan);
if (err)
return notifier_from_errno(err);
 
@@ -129,7 +129,7 @@ static int macvtap_device_event(struct notifier_block 
*unused,
classdev = device_create(&macvtap_class, &dev->dev, devt,
 dev, tap_name);
if (IS_ERR(classdev)) {
-   macvtap_free_minor(vlan);
+   tap_free_minor(vlan);
return notifier_from_errno(PTR_ERR(classdev));
}
err = sysfs_create_link(&dev->dev.kobj, &classdev->kobj,
@@ -144,10 +144,10 @@ static int macvtap_device_event(struct notifier_block 
*unused,
sysfs_remove_link(&dev->dev.kobj, tap_name);
devt = MKDEV(MAJOR(macvtap_major), vlan->minor);
device_destroy(&macvtap_class, devt);
-   macvtap_free_minor(vlan);
+   tap_free_minor(vlan);
break;
case NETDEV_CHANGE_TX_QUEUE_LEN:
-   if (macvtap_queue_resize(vlan))
+   if (tap_queue_resize(vlan))
return NOTIFY_BAD;
break;
}
@@ -159,7 +159,7 @@ static struct notifier_block macvtap_notifier_block 
__read_mostly = {
.notifier_call  = macvtap_device_event,
 };
 
-extern struct file_operations macvtap_fops;
+extern struct file_operations tap_fops;
 static int macvtap_init(void)
 {
int err;
@@ -169,7 +169,7 @@ static int macvtap_init(void)
if (err)
goto out1;
 
-   cdev_init(&macvtap_cdev, &macvtap_fops);
+   cdev_init(&macvtap_cdev, &tap_fops);
err = cdev_add(&macvtap_cdev, macvtap_major, MACVTAP_NUM_DEVS);
if (err)
goto out2;
diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 6f6228e..15ca2d5 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -24,16 +24,16 @@
 #include 
 
 /*
- * A macvtap queue is the central object of this driver, it connects
+ * A tap queue is the central object of this driver, it connects
  * an open character device to a macvlan interface. There can be
  * multiple queues on one interface, which map back to queues
  * implemented in hardware on the underlying device.
  *
- * macvtap_proto is used to allocate queues through the sock allocation
+ * tap_proto is used to allocate queues through the sock allocation
  * mechanism.
  *
  */
-struct macvtap_queue {
+struct tap_queue {
struct sock sk;
struct socket sock;
struct socket_wq wq;
@@ -47,21 +47,21 @@ struct macvtap_queue {
struct skb_array skb_array;
 };
 
-#define MACVTAP_FEATURES (IFF_VNET_HDR | IFF_MULTI_QUEUE)
+#define TAP_IFFEATURES (IFF_VNET_HDR | IFF_MULTI_QUEUE)
 
-#define MACVTAP_VNET_LE 0x8000
-#define MACVTAP_VNET_BE 0x4000
+#define TAP_VNET_LE 0x8000
+#define TAP_VNET_BE 0x4000
 
 #ifdef CONFIG_TUN_VNET_CROSS_LE
-static inline bool macvtap_legacy_is_little_endian(struct macvtap_queue *q)
+static inline bool tap_legacy_is_little_endian(struct tap_queue *q)
 {
-   return q->flags & MACVTAP_VNET_BE ? false :
+   return q->flags & TAP_VNET_BE ? false :
virtio_legacy_is_little_endian();
 }
 
-static long macvtap_get_vnet_be(struct macvtap_queue *q, int __user *sp)
+static long tap_g

[PATCHv6 1/7] tap: Refactoring macvtap.c

2017-02-10 Thread Sainath Grandhi
macvtap module has code for tap/queue management and link management. This
patch splits the code into macvtap_main.c for link management and tap.c for
tap/queue management. Functionality in tap.c can be re-used for implementing
tap on other virtual interfaces.

Signed-off-by: Sainath Grandhi 
---
 drivers/net/Makefile |   2 +
 drivers/net/macvtap_main.c   | 218 +++
 drivers/net/{macvtap.c => tap.c} | 204 ++--
 include/linux/if_macvtap.h   |  10 ++
 4 files changed, 238 insertions(+), 196 deletions(-)
 create mode 100644 drivers/net/macvtap_main.c
 rename drivers/net/{macvtap.c => tap.c} (84%)
 create mode 100644 include/linux/if_macvtap.h

diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 7336cbd..19b03a9 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -29,6 +29,8 @@ obj-$(CONFIG_GTP) += gtp.o
 obj-$(CONFIG_NLMON) += nlmon.o
 obj-$(CONFIG_NET_VRF) += vrf.o
 
+macvtap-objs := macvtap_main.o tap.o
+
 #
 # Networking Drivers
 #
diff --git a/drivers/net/macvtap_main.c b/drivers/net/macvtap_main.c
new file mode 100644
index 000..96ffa60
--- /dev/null
+++ b/drivers/net/macvtap_main.c
@@ -0,0 +1,218 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * Variables for dealing with macvtaps device numbers.
+ */
+static dev_t macvtap_major;
+#define MACVTAP_NUM_DEVS (1U << MINORBITS)
+
+static const void *macvtap_net_namespace(struct device *d)
+{
+   struct net_device *dev = to_net_dev(d->parent);
+   return dev_net(dev);
+}
+
+static struct class macvtap_class = {
+   .name = "macvtap",
+   .owner = THIS_MODULE,
+   .ns_type = &net_ns_type_operations,
+   .namespace = macvtap_net_namespace,
+};
+static struct cdev macvtap_cdev;
+
+#define TUN_OFFLOADS (NETIF_F_HW_CSUM | NETIF_F_TSO_ECN | NETIF_F_TSO | \
+ NETIF_F_TSO6 | NETIF_F_UFO)
+
+static int macvtap_newlink(struct net *src_net,
+  struct net_device *dev,
+  struct nlattr *tb[],
+  struct nlattr *data[])
+{
+   struct macvlan_dev *vlan = netdev_priv(dev);
+   int err;
+
+   INIT_LIST_HEAD(&vlan->queue_list);
+
+   /* Since macvlan supports all offloads by default, make
+* tap support all offloads also.
+*/
+   vlan->tap_features = TUN_OFFLOADS;
+
+   err = netdev_rx_handler_register(dev, macvtap_handle_frame, vlan);
+   if (err)
+   return err;
+
+   /* Don't put anything that may fail after macvlan_common_newlink
+* because we can't undo what it does.
+*/
+   err = macvlan_common_newlink(src_net, dev, tb, data);
+   if (err) {
+   netdev_rx_handler_unregister(dev);
+   return err;
+   }
+
+   return 0;
+}
+
+static void macvtap_dellink(struct net_device *dev,
+   struct list_head *head)
+{
+   netdev_rx_handler_unregister(dev);
+   macvtap_del_queues(dev);
+   macvlan_dellink(dev, head);
+}
+
+static void macvtap_setup(struct net_device *dev)
+{
+   macvlan_common_setup(dev);
+   dev->tx_queue_len = TUN_READQ_SIZE;
+}
+
+static struct rtnl_link_ops macvtap_link_ops __read_mostly = {
+   .kind   = "macvtap",
+   .setup  = macvtap_setup,
+   .newlink= macvtap_newlink,
+   .dellink= macvtap_dellink,
+};
+
+static int macvtap_device_event(struct notifier_block *unused,
+   unsigned long event, void *ptr)
+{
+   struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+   struct macvlan_dev *vlan;
+   struct device *classdev;
+   dev_t devt;
+   int err;
+   char tap_name[IFNAMSIZ];
+
+   if (dev->rtnl_link_ops != &macvtap_link_ops)
+   return NOTIFY_DONE;
+
+   snprintf(tap_name, IFNAMSIZ, "tap%d", dev->ifindex);
+   vlan = netdev_priv(dev);
+
+   switch (event) {
+   case NETDEV_REGISTER:
+   /* Create the device node here after the network device has
+* been registered but before register_netdevice has
+* finished running.
+*/
+   err = macvtap_get_minor(vlan);
+   if (err)
+   return notifier_from_errno(err);
+
+   devt = MKDEV(MAJOR(macvtap_major), vlan->minor);
+   classdev = device_create(&macvtap_class, &dev->dev, devt,
+dev, tap_name);
+   if (IS_ERR(classdev)) {
+   macvtap_free_minor(vlan);
+   return notifier_from_errno(PTR_ERR(classdev));
+   }
+   err = sysfs_create_l

[PATCH 1/3] Input: tsc2005 - add OF device table

2017-02-10 Thread Dmitry Torokhov
To be prepared for SPI module loading using full compatible strings from
device tree, let's add OF module device table data.

Signed-off-by: Dmitry Torokhov 
---
 drivers/input/touchscreen/tsc2005.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/input/touchscreen/tsc2005.c 
b/drivers/input/touchscreen/tsc2005.c
index f2c5f0e47f77..e02b69f40ad8 100644
--- a/drivers/input/touchscreen/tsc2005.c
+++ b/drivers/input/touchscreen/tsc2005.c
@@ -18,8 +18,9 @@
  * GNU General Public License for more details.
  */
 
-#include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include "tsc200x-core.h"
@@ -77,9 +78,18 @@ static int tsc2005_remove(struct spi_device *spi)
return tsc200x_remove(&spi->dev);
 }
 
+#ifdef CONFIG_OF
+static const struct of_device_id tsc2005_of_match[] = {
+   { .compatible = "ti,tsc2005" },
+   { /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, tsc2005_of_match);
+#endif
+
 static struct spi_driver tsc2005_driver = {
.driver = {
.name   = "tsc2005",
+   .of_match_table = of_match_ptr(tsc2005_of_match),
.pm = &tsc200x_pm_ops,
},
.probe  = tsc2005_probe,
-- 
2.11.0.483.g087da7b7c-goog



Re: [PATCH v2 2/4] seccomp: Add sysctl to configure actions that should be logged

2017-02-10 Thread Kees Cook
On Fri, Feb 10, 2017 at 3:56 PM, Tyler Hicks  wrote:
> On 02/07/2017 06:24 PM, Kees Cook wrote:
>> case SECCOMP_RET_ALLOW:
>> /* Open-coded seccomp_log(), optimized for RET_ALLOW. */
>> if (unlikely(seccomp_max_action_to_log == 0))
>> __audit_seccomp(syscall, signr, action);
>> return 0;
>
> That makes sense.

And, heh, reading it again now, my example should be ==
SECCOMP_RET_ALLOW (which is 0, yes, but eek raw number, bad me).

>>> +/* Largest strlen() of all action names */
>>> +#define SECCOMP_RET_MAX_NAME_LEN   5
>>
>> This feels fragile... though I don't have a good suggestion yet. :P
>
> I agree and I also don't have a good solution. I didn't like having to
> hard code it.

Yeah. Hrmpf. I mean, it could be sizeof(seccomp_actions_avail) ...
that'll always be long enough. :) But it's a bit of stack over-kill,
but ... is that so bad? I dunno.
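
A quick sketch of the trade-off. The list contents below are hypothetical (the real string lives in kernel/seccomp.c and is not shown in this thread), and longest_name() is a made-up helper.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical contents, for illustration only. */
static const char actions_avail[] = "kill trap errno trace log allow";

/* Length of the longest space-separated name in `list`. */
static size_t longest_name(const char *list)
{
	size_t best = 0;

	while (*list) {
		size_t tok;

		list += strspn(list, " ");
		tok = strcspn(list, " ");
		if (tok > best)
			best = tok;
		list += tok;
	}
	return best;
}
```

A buffer declared as char name[sizeof(actions_avail)] is larger than needed, but it can never be too small for any single name, and it keeps working if a longer action name is added later, which a hard-coded length does not.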

>> In the hopes of some day making the sysctl table entirely read-only,
>> can you add some fancy crap here for me? :) See
>> security/yama/yama_lsm.c's yama_dointvec_minmax(), which uses a copy
>> of the sysctl table on the stack.
>
> Will do. I'll deviate slightly from yama_dointvec_minmax(). To make it
> clear that the ctl_table param shouldn't be modified, I'm going to name
> it ro_table and then the stack variable will be named table.

Sounds great, thanks!
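
For reference, the stack-copy pattern being discussed looks roughly like this in userspace form. struct mock_ctl_table and both functions are hypothetical cut-down stand-ins, not the kernel's struct ctl_table or proc_do*() helpers.

```c
#include <stddef.h>

/* Hypothetical cut-down stand-in for struct ctl_table. */
struct mock_ctl_table {
	void *data;
	size_t maxlen;
};

/* Stand-in for a proc_do*() helper: stores the value via table->data. */
static int mock_proc_store(struct mock_ctl_table *table, int value)
{
	if (table->maxlen < sizeof(int))
		return -1;
	*(int *)table->data = value;
	return 0;
}

/* Handler: ro_table is never written; all changes go to a stack copy. */
static int mock_handler(const struct mock_ctl_table *ro_table, int value,
			int *staged)
{
	struct mock_ctl_table table = *ro_table;

	table.data = staged;		/* redirect the copy, not ro_table */
	table.maxlen = sizeof(*staged);
	return mock_proc_store(&table, value);
}
```

Because the parameter is const-qualified, an accidental write to the shared table fails to compile, while the helper still gets a mutable table to work with.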

-Kees

-- 
Kees Cook
Pixel Security


[PATCHv6 4/7] tap: Abstract type of virtual interface from tap implementation

2017-02-10 Thread Sainath Grandhi
The macvlan object is restructured to hold tap related elements in a separate
entity, tap_dev. Upon the NETDEV_REGISTER device event, tap_dev is registered
with idr and fetched again on tap_open. A few of the tap functions are modified
to accept tap_dev as an argument. The tap_dev object includes callbacks to be
used by the underlying virtual interface to take care of tx and rx accounting.
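
The embedding and callback arrangement can be sketched in userspace as follows. The structures are heavily cut-down, hypothetical versions of the ones in this patch, and tap_report_tx_drop() is a made-up name for the generic tap side.

```c
#include <stddef.h>

/* Userspace version of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical cut-down versions of the structures in this patch. */
struct tap_dev {
	void (*count_tx_dropped)(struct tap_dev *tap);
};

struct macvlan_dev {
	unsigned long tx_dropped;
};

struct macvtap_dev {
	struct macvlan_dev vlan;
	struct tap_dev tap;
};

/* Callback: recover the enclosing macvtap_dev from the tap_dev pointer. */
static void macvtap_count_tx_dropped(struct tap_dev *tap)
{
	struct macvtap_dev *vlantap = container_of(tap, struct macvtap_dev, tap);

	vlantap->vlan.tx_dropped++;
}

/* Generic tap code only ever sees a struct tap_dev. */
static void tap_report_tx_drop(struct tap_dev *tap)
{
	if (tap->count_tx_dropped)
		tap->count_tx_dropped(tap);
}
```

The generic code needs no knowledge of macvlan; another interface type would embed its own tap_dev and register its own callbacks in the same way.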

Signed-off-by: Sainath Grandhi 
---
 drivers/net/macvlan.c  |   2 +-
 drivers/net/macvtap_main.c |  71 +---
 drivers/net/tap.c  | 264 -
 include/linux/if_tap.h |  57 +-
 4 files changed, 229 insertions(+), 165 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index cbfc1be..9261722 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -1525,7 +1525,6 @@ static const struct nla_policy 
macvlan_policy[IFLA_MACVLAN_MAX + 1] = {
 int macvlan_link_register(struct rtnl_link_ops *ops)
 {
/* common fields */
-   ops->priv_size  = sizeof(struct macvlan_dev);
ops->validate   = macvlan_validate;
ops->maxtype= IFLA_MACVLAN_MAX;
ops->policy = macvlan_policy;
@@ -1548,6 +1547,7 @@ static struct rtnl_link_ops macvlan_link_ops = {
.newlink= macvlan_newlink,
.dellink= macvlan_dellink,
.get_link_net   = macvlan_get_link_net,
+   .priv_size  = sizeof(struct macvlan_dev),
 };
 
 static int macvlan_device_event(struct notifier_block *unused,
diff --git a/drivers/net/macvtap_main.c b/drivers/net/macvtap_main.c
index 215ab7a..0238df6 100644
--- a/drivers/net/macvtap_main.c
+++ b/drivers/net/macvtap_main.c
@@ -24,6 +24,11 @@
 #include 
 #include 
 
+struct macvtap_dev {
+   struct macvlan_dev vlan;
+   struct tap_devtap;
+};
+
 /*
  * Variables for dealing with macvtaps device numbers.
  */
@@ -46,22 +51,55 @@ static struct cdev macvtap_cdev;
 #define TUN_OFFLOADS (NETIF_F_HW_CSUM | NETIF_F_TSO_ECN | NETIF_F_TSO | \
  NETIF_F_TSO6 | NETIF_F_UFO)
 
+static void macvtap_count_tx_dropped(struct tap_dev *tap)
+{
+   struct macvtap_dev *vlantap = container_of(tap, struct macvtap_dev, 
tap);
+   struct macvlan_dev *vlan = &vlantap->vlan;
+
+   this_cpu_inc(vlan->pcpu_stats->tx_dropped);
+}
+
+static void macvtap_count_rx_dropped(struct tap_dev *tap)
+{
+   struct macvtap_dev *vlantap = container_of(tap, struct macvtap_dev, 
tap);
+   struct macvlan_dev *vlan = &vlantap->vlan;
+
+   macvlan_count_rx(vlan, 0, 0, 0);
+}
+
+static void macvtap_update_features(struct tap_dev *tap,
+   netdev_features_t features)
+{
+   struct macvtap_dev *vlantap = container_of(tap, struct macvtap_dev, 
tap);
+   struct macvlan_dev *vlan = &vlantap->vlan;
+
+   vlan->set_features = features;
+   netdev_update_features(vlan->dev);
+}
+
 static int macvtap_newlink(struct net *src_net,
   struct net_device *dev,
   struct nlattr *tb[],
   struct nlattr *data[])
 {
-   struct macvlan_dev *vlan = netdev_priv(dev);
+   struct macvtap_dev *vlantap = netdev_priv(dev);
int err;
 
-   INIT_LIST_HEAD(&vlan->queue_list);
+   INIT_LIST_HEAD(&vlantap->tap.queue_list);
 
/* Since macvlan supports all offloads by default, make
 * tap support all offloads also.
 */
-   vlan->tap_features = TUN_OFFLOADS;
+   vlantap->tap.tap_features = TUN_OFFLOADS;
 
-   err = netdev_rx_handler_register(dev, tap_handle_frame, vlan);
+   /* Register callbacks for rx/tx drops accounting and updating
+* net_device features
+*/
+   vlantap->tap.count_tx_dropped = macvtap_count_tx_dropped;
+   vlantap->tap.count_rx_dropped = macvtap_count_rx_dropped;
+   vlantap->tap.update_features  = macvtap_update_features;
+
+   err = netdev_rx_handler_register(dev, tap_handle_frame, &vlantap->tap);
if (err)
return err;
 
@@ -74,14 +112,18 @@ static int macvtap_newlink(struct net *src_net,
return err;
}
 
+   vlantap->tap.dev = vlantap->vlan.dev;
+
return 0;
 }
 
 static void macvtap_dellink(struct net_device *dev,
struct list_head *head)
 {
+   struct macvtap_dev *vlantap = netdev_priv(dev);
+
netdev_rx_handler_unregister(dev);
-   tap_del_queues(dev);
+   tap_del_queues(&vlantap->tap);
macvlan_dellink(dev, head);
 }
 
@@ -96,13 +138,14 @@ static struct rtnl_link_ops macvtap_link_ops __read_mostly = {
.setup  = macvtap_setup,
.newlink= macvtap_newlink,
.dellink= macvtap_dellink,
+   .priv_size  = sizeof(struct macvtap_dev),
 };
 
 static int macvtap_device_event(struct notifier_block *unused,
unsigned long event, void *ptr)
 {
struc

[PATCHv6 7/7] ipvtap: IP-VLAN based tap driver

2017-02-10 Thread Sainath Grandhi
This patch adds a tap character device driver that is based on the
IP-VLAN network interface, called ipvtap. An ipvtap device can be created
in the same way as an ipvlan device, using 'type ipvtap', and then accessed
using the tap user space interface.

Signed-off-by: Sainath Grandhi 
---
 drivers/net/Kconfig  |  13 +++
 drivers/net/Makefile |   1 +
 drivers/net/ipvlan/Makefile  |   1 +
 drivers/net/ipvlan/ipvlan.h  |   7 ++
 drivers/net/ipvlan/ipvlan_core.c |   3 +-
 drivers/net/ipvlan/ipvlan_main.c |  27 +++--
 drivers/net/ipvlan/ipvtap.c  | 241 +++
 7 files changed, 280 insertions(+), 13 deletions(-)
 create mode 100644 drivers/net/ipvlan/ipvtap.c

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 5763503..823bc2f 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -166,6 +166,19 @@ config IPVLAN
   To compile this driver as a module, choose M here: the module
   will be called ipvlan.
 
+config IPVTAP
+   tristate "IP-VLAN based tap driver"
+   depends on IPVLAN
+   depends on INET
+   select TAP
+   ---help---
+ This adds a specialized tap character device driver that is based
+ on the IP-VLAN network interface, called ipvtap. An ipvtap device
+ can be added in the same way as a ipvlan device, using 'type
+ ipvtap', and then be accessed through the tap user space interface.
+
+ To compile this driver as a module, choose M here: the module
+ will be called ipvtap.
 
 config VXLAN
tristate "Virtual eXtensible Local Area Network (VXLAN)"
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 7dd86ca..98ed4d9 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -7,6 +7,7 @@
 #
 obj-$(CONFIG_BONDING) += bonding/
 obj-$(CONFIG_IPVLAN) += ipvlan/
+obj-$(CONFIG_IPVTAP) += ipvlan/
 obj-$(CONFIG_DUMMY) += dummy.o
 obj-$(CONFIG_EQUALIZER) += eql.o
 obj-$(CONFIG_IFB) += ifb.o
diff --git a/drivers/net/ipvlan/Makefile b/drivers/net/ipvlan/Makefile
index df79910..8a2c64d 100644
--- a/drivers/net/ipvlan/Makefile
+++ b/drivers/net/ipvlan/Makefile
@@ -3,5 +3,6 @@
 #
 
 obj-$(CONFIG_IPVLAN) += ipvlan.o
+obj-$(CONFIG_IPVTAP) += ipvtap.o
 
 ipvlan-objs := ipvlan_core.o ipvlan_main.o
diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 406ae4f..800a46c 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -135,4 +135,11 @@ struct sk_buff *ipvlan_l3_rcv(struct net_device *dev, struct sk_buff *skb,
  u16 proto);
 unsigned int ipvlan_nf_input(void *priv, struct sk_buff *skb,
 const struct nf_hook_state *state);
+void ipvlan_count_rx(const struct ipvl_dev *ipvlan,
+unsigned int len, bool success, bool mcast);
+int ipvlan_link_new(struct net *src_net, struct net_device *dev,
+   struct nlattr *tb[], struct nlattr *data[]);
+void ipvlan_link_delete(struct net_device *dev, struct list_head *head);
+void ipvlan_link_setup(struct net_device *dev);
+int ipvlan_link_register(struct rtnl_link_ops *ops);
 #endif /* __IPVLAN_H */
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index 8ae335d..1f3295e 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -16,7 +16,7 @@ void ipvlan_init_secret(void)
net_get_random_once(&ipvlan_jhash_secret, sizeof(ipvlan_jhash_secret));
 }
 
-static void ipvlan_count_rx(const struct ipvl_dev *ipvlan,
+void ipvlan_count_rx(const struct ipvl_dev *ipvlan,
unsigned int len, bool success, bool mcast)
 {
if (likely(success)) {
@@ -33,6 +33,7 @@ static void ipvlan_count_rx(const struct ipvl_dev *ipvlan,
this_cpu_inc(ipvlan->pcpu_stats->rx_errs);
}
 }
+EXPORT_SYMBOL_GPL(ipvlan_count_rx);
 
 static u8 ipvlan_get_v6_hash(const void *iaddr)
 {
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 95b18f4..aa8575c 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -496,8 +496,8 @@ static int ipvlan_nl_fillinfo(struct sk_buff *skb,
return ret;
 }
 
-static int ipvlan_link_new(struct net *src_net, struct net_device *dev,
-  struct nlattr *tb[], struct nlattr *data[])
+int ipvlan_link_new(struct net *src_net, struct net_device *dev,
+   struct nlattr *tb[], struct nlattr *data[])
 {
struct ipvl_dev *ipvlan = netdev_priv(dev);
struct ipvl_port *port;
@@ -594,8 +594,9 @@ static int ipvlan_link_new(struct net *src_net, struct net_device *dev,
ipvlan_port_destroy(phy_dev);
return err;
 }
+EXPORT_SYMBOL_GPL(ipvlan_link_new);
 
-static void ipvlan_link_delete(struct net_device *dev, struct list_head *head)
+void ipvlan_link_delete(struct net_device *dev, struct list_head *head)
 {
struct ipvl_dev *ip

[PATCHv6 5/7] tap: Extending tap device create/destroy APIs

2017-02-10 Thread Sainath Grandhi
Extend the tap APIs get/free_minor and create/destroy_cdev to handle more
than one type of virtual interface.
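
Editorial sketch: the idea of keeping one minor-number table per major, and
finding the right table by walking a list of registered majors, can be modelled
in plain userspace C as below. This is only an illustration under stated
assumptions. Every name in it (major_entry, major_register, major_lookup,
minor_get, minor_free, SKETCH_MAX_MINORS) is hypothetical, a fixed array
stands in for the kernel IDR, and all locking and RCU handling is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel symbols. */
#define SKETCH_MAX_MINORS 8

struct major_entry {
	unsigned int major;               /* key used for lookup */
	const char *device_name;          /* e.g. "macvtap" or "ipvtap" */
	void *minors[SKETCH_MAX_MINORS];  /* slot 0 unused: minor 0 means "none" */
	struct major_entry *next;         /* singly linked registry */
};

static struct major_entry *major_list;

/* Register one driver type under its own major number. */
static void major_register(struct major_entry *e)
{
	e->next = major_list;
	major_list = e;
}

/* Walk the registry; returns NULL for an unknown major. */
static struct major_entry *major_lookup(unsigned int major)
{
	struct major_entry *e;

	for (e = major_list; e; e = e->next)
		if (e->major == major)
			return e;
	return NULL;
}

/* Allocate the lowest free minor (>= 1) under the given major.
 * Returns the minor, or -1 if the major is unknown or its table is full.
 */
static int minor_get(unsigned int major, void *dev)
{
	struct major_entry *e = major_lookup(major);
	int i;

	if (!e)
		return -1;
	for (i = 1; i < SKETCH_MAX_MINORS; i++) {
		if (!e->minors[i]) {
			e->minors[i] = dev;
			return i;
		}
	}
	return -1;
}

/* Release a minor so it can be handed out again. */
static void minor_free(unsigned int major, int minor)
{
	struct major_entry *e = major_lookup(major);

	if (e && minor > 0 && minor < SKETCH_MAX_MINORS)
		e->minors[minor] = NULL;
}
```

With two entries registered, each major hands out minors independently, which
is the property the extended API needs so that macvtap and another driver
type can share the same tap core.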

Signed-off-by: Sainath Grandhi 
---
 drivers/net/macvtap_main.c |   6 +--
 drivers/net/tap.c  | 118 +
 include/linux/if_tap.h |   4 +-
 3 files changed, 102 insertions(+), 26 deletions(-)

diff --git a/drivers/net/macvtap_main.c b/drivers/net/macvtap_main.c
index 0238df6..a4bfc10 100644
--- a/drivers/net/macvtap_main.c
+++ b/drivers/net/macvtap_main.c
@@ -163,7 +163,7 @@ static int macvtap_device_event(struct notifier_block *unused,
 * been registered but before register_netdevice has
 * finished running.
 */
-   err = tap_get_minor(&vlantap->tap);
+   err = tap_get_minor(macvtap_major, &vlantap->tap);
if (err)
return notifier_from_errno(err);
 
@@ -171,7 +171,7 @@ static int macvtap_device_event(struct notifier_block *unused,
classdev = device_create(&macvtap_class, &dev->dev, devt,
 dev, tap_name);
if (IS_ERR(classdev)) {
-   tap_free_minor(&vlantap->tap);
+   tap_free_minor(macvtap_major, &vlantap->tap);
return notifier_from_errno(PTR_ERR(classdev));
}
err = sysfs_create_link(&dev->dev.kobj, &classdev->kobj,
@@ -186,7 +186,7 @@ static int macvtap_device_event(struct notifier_block *unused,
sysfs_remove_link(&dev->dev.kobj, tap_name);
devt = MKDEV(MAJOR(macvtap_major), vlantap->tap.minor);
device_destroy(&macvtap_class, devt);
-   tap_free_minor(&vlantap->tap);
+   tap_free_minor(macvtap_major, &vlantap->tap);
break;
case NETDEV_CHANGE_TX_QUEUE_LEN:
if (tap_queue_resize(&vlantap->tap))
diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 7d3e8b1..71bbf0b 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -99,12 +99,17 @@ static struct proto tap_proto = {
 };
 
 #define TAP_NUM_DEVS (1U << MINORBITS)
+
+static LIST_HEAD(major_list);
+
 struct major_info {
+   struct rcu_head rcu;
dev_t major;
struct idr minor_idr;
struct mutex minor_lock;
const char *device_name;
-} macvtap_major;
+   struct list_head next;
+};
 
 #define GOODCOPY_LEN 128
 
@@ -385,44 +390,89 @@ rx_handler_result_t tap_handle_frame(struct sk_buff **pskb)
return RX_HANDLER_CONSUMED;
 }
 
-int tap_get_minor(struct tap_dev *tap)
+static struct major_info *tap_get_major(int major)
+{
+   struct major_info *tap_major;
+
+   list_for_each_entry_rcu(tap_major, &major_list, next) {
+   if (tap_major->major == major)
+   return tap_major;
+   }
+
+   return NULL;
+}
+
+int tap_get_minor(dev_t major, struct tap_dev *tap)
 {
int retval = -ENOMEM;
+   struct major_info *tap_major;
+
+   rcu_read_lock();
+   tap_major = tap_get_major(MAJOR(major));
+   if (!tap_major) {
+   retval = -EINVAL;
+   goto unlock;
+   }
 
-   mutex_lock(&macvtap_major.minor_lock);
-   retval = idr_alloc(&macvtap_major.minor_idr, tap, 1, TAP_NUM_DEVS, GFP_KERNEL);
+   mutex_lock(&tap_major->minor_lock);
+   retval = idr_alloc(&tap_major->minor_idr, tap, 1, TAP_NUM_DEVS, GFP_KERNEL);
if (retval >= 0) {
tap->minor = retval;
} else if (retval == -ENOSPC) {
netdev_err(tap->dev, "Too many tap devices\n");
retval = -EINVAL;
}
-   mutex_unlock(&macvtap_major.minor_lock);
+   mutex_unlock(&tap_major->minor_lock);
+
+unlock:
+   rcu_read_unlock();
return retval < 0 ? retval : 0;
 }
 
-void tap_free_minor(struct tap_dev *tap)
+void tap_free_minor(dev_t major, struct tap_dev *tap)
 {
-   mutex_lock(&macvtap_major.minor_lock);
+   struct major_info *tap_major;
+
+   rcu_read_lock();
+   tap_major = tap_get_major(MAJOR(major));
+   if (!tap_major) {
+   goto unlock;
+   }
+
+   mutex_lock(&tap_major->minor_lock);
if (tap->minor) {
-   idr_remove(&macvtap_major.minor_idr, tap->minor);
+   idr_remove(&tap_major->minor_idr, tap->minor);
tap->minor = 0;
}
-   mutex_unlock(&macvtap_major.minor_lock);
+   mutex_unlock(&tap_major->minor_lock);
+
+unlock:
+   rcu_read_unlock();
 }
 
-static struct tap_dev *dev_get_by_tap_minor(int minor)
+static struct tap_dev *dev_get_by_tap_file(int major, int minor)
 {
struct net_device *dev = NULL;
struct tap_dev *tap;
+   struct major_info *tap_major;
 
-   mutex_lock(&macvtap_major.minor_lock);
-   tap = idr_find(&macvtap_major.minor_idr, minor);
+   rcu_read_lock();
+   tap_major = tap_get_major(ma

[PATCHv6 6/7] tap: tap as an independent module

2017-02-10 Thread Sainath Grandhi
This patch makes tap a separate module so that other types of virtual
interfaces, for example ipvlan, can use it.

Signed-off-by: Sainath Grandhi 
---
 drivers/net/Kconfig   |  7 +++
 drivers/net/Makefile  |  3 +--
 drivers/net/{macvtap_main.c => macvtap.c} |  0
 drivers/net/tap.c | 11 +++
 drivers/vhost/Kconfig |  2 +-
 include/linux/if_tap.h|  4 ++--
 6 files changed, 22 insertions(+), 5 deletions(-)
 rename drivers/net/{macvtap_main.c => macvtap.c} (100%)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index a993cbe..5763503 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -135,6 +135,7 @@ config MACVTAP
tristate "MAC-VLAN based tap driver"
depends on MACVLAN
depends on INET
+   select TAP
help
  This adds a specialized tap character device driver that is based
  on the MAC-VLAN network interface, called macvtap. A macvtap device
@@ -287,6 +288,12 @@ config TUN
 
  If you don't know what to use this for, you don't need it.
 
+config TAP
+   tristate
+   ---help---
+ This option is selected by any driver implementing tap user space
+ interface for a virtual interface to re-use core tap functionality.
+
 config TUN_VNET_CROSS_LE
bool "Support for cross-endian vnet headers on little-endian kernels"
default n
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 19b03a9..7dd86ca 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -21,6 +21,7 @@ obj-$(CONFIG_PHYLIB) += phy/
 obj-$(CONFIG_RIONET) += rionet.o
 obj-$(CONFIG_NET_TEAM) += team/
 obj-$(CONFIG_TUN) += tun.o
+obj-$(CONFIG_TAP) += tap.o
 obj-$(CONFIG_VETH) += veth.o
 obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
 obj-$(CONFIG_VXLAN) += vxlan.o
@@ -29,8 +30,6 @@ obj-$(CONFIG_GTP) += gtp.o
 obj-$(CONFIG_NLMON) += nlmon.o
 obj-$(CONFIG_NET_VRF) += vrf.o
 
-macvtap-objs := macvtap_main.o tap.o
-
 #
 # Networking Drivers
 #
diff --git a/drivers/net/macvtap_main.c b/drivers/net/macvtap.c
similarity index 100%
rename from drivers/net/macvtap_main.c
rename to drivers/net/macvtap.c
diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 71bbf0b..35b55a2 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -312,6 +312,7 @@ void tap_del_queues(struct tap_dev *tap)
/* guarantee that any future tap_set_queue will fail */
tap->numvtaps = MAX_TAP_QUEUES;
 }
+EXPORT_SYMBOL_GPL(tap_del_queues);
 
 rx_handler_result_t tap_handle_frame(struct sk_buff **pskb)
 {
@@ -389,6 +390,7 @@ rx_handler_result_t tap_handle_frame(struct sk_buff **pskb)
kfree_skb(skb);
return RX_HANDLER_CONSUMED;
 }
+EXPORT_SYMBOL_GPL(tap_handle_frame);
 
 static struct major_info *tap_get_major(int major)
 {
@@ -428,6 +430,7 @@ int tap_get_minor(dev_t major, struct tap_dev *tap)
rcu_read_unlock();
return retval < 0 ? retval : 0;
 }
+EXPORT_SYMBOL_GPL(tap_get_minor);
 
 void tap_free_minor(dev_t major, struct tap_dev *tap)
 {
@@ -449,6 +452,7 @@ void tap_free_minor(dev_t major, struct tap_dev *tap)
 unlock:
rcu_read_unlock();
 }
+EXPORT_SYMBOL_GPL(tap_free_minor);
 
 static struct tap_dev *dev_get_by_tap_file(int major, int minor)
 {
@@ -1210,6 +1214,7 @@ int tap_queue_resize(struct tap_dev *tap)
kfree(arrays);
return ret;
 }
+EXPORT_SYMBOL_GPL(tap_queue_resize);
 
 static int tap_list_add(dev_t major, const char *device_name)
 {
@@ -1257,6 +1262,7 @@ int tap_create_cdev(struct cdev *tap_cdev,
 out1:
return err;
 }
+EXPORT_SYMBOL_GPL(tap_create_cdev);
 
 void tap_destroy_cdev(dev_t major, struct cdev *tap_cdev)
 {
@@ -1272,3 +1278,8 @@ void tap_destroy_cdev(dev_t major, struct cdev *tap_cdev)
}
}
 }
+EXPORT_SYMBOL_GPL(tap_destroy_cdev);
+
+MODULE_AUTHOR("Arnd Bergmann ");
+MODULE_AUTHOR("Sainath Grandhi ");
+MODULE_LICENSE("GPL");
diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index 40764ec..cfdecea 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -1,6 +1,6 @@
 config VHOST_NET
tristate "Host kernel accelerator for virtio net"
-   depends on NET && EVENTFD && (TUN || !TUN) && (MACVTAP || !MACVTAP)
+   depends on NET && EVENTFD && (TUN || !TUN) && (TAP || !TAP)
select VHOST
---help---
  This kernel module can be loaded in host kernel to accelerate
diff --git a/include/linux/if_tap.h b/include/linux/if_tap.h
index 362e71c..3482c3c 100644
--- a/include/linux/if_tap.h
+++ b/include/linux/if_tap.h
@@ -1,7 +1,7 @@
 #ifndef _LINUX_IF_TAP_H_
 #define _LINUX_IF_TAP_H_
 
-#if IS_ENABLED(CONFIG_MACVTAP)
+#if IS_ENABLED(CONFIG_TAP)
 struct socket *tap_get_socket(struct file *);
 #else
 #include 
@@ -12,7 +12,7 @@ static inline struct socket *tap_get_socket(struct file *f)
 {
return ERR_PTR(-EINVAL);
 }
-#endif /* CONFIG_MACVTAP */
+#endif /* CONFIG_TAP */
 
 #include 
 #include 
-- 
2.7

[PATCHv6 3/7] tap: Tap character device creation/destroy API

2017-02-10 Thread Sainath Grandhi
This patch provides tap device create/destroy APIs in tap.c.

Signed-off-by: Sainath Grandhi 
---
 drivers/net/macvtap_main.c | 30 +++---
 drivers/net/tap.c  | 62 ++
 include/linux/if_tap.h |  3 +++
 3 files changed, 63 insertions(+), 32 deletions(-)

diff --git a/drivers/net/macvtap_main.c b/drivers/net/macvtap_main.c
index 548f339..215ab7a 100644
--- a/drivers/net/macvtap_main.c
+++ b/drivers/net/macvtap_main.c
@@ -28,7 +28,6 @@
  * Variables for dealing with macvtaps device numbers.
  */
 static dev_t macvtap_major;
-#define MACVTAP_NUM_DEVS (1U << MINORBITS)
 
 static const void *macvtap_net_namespace(struct device *d)
 {
@@ -159,57 +158,46 @@ static struct notifier_block macvtap_notifier_block __read_mostly = {
.notifier_call  = macvtap_device_event,
 };
 
-extern struct file_operations tap_fops;
 static int macvtap_init(void)
 {
int err;
 
-   err = alloc_chrdev_region(&macvtap_major, 0,
-   MACVTAP_NUM_DEVS, "macvtap");
-   if (err)
-   goto out1;
+   err = tap_create_cdev(&macvtap_cdev, &macvtap_major, "macvtap");
 
-   cdev_init(&macvtap_cdev, &tap_fops);
-   err = cdev_add(&macvtap_cdev, macvtap_major, MACVTAP_NUM_DEVS);
if (err)
-   goto out2;
+   goto out1;
 
err = class_register(&macvtap_class);
if (err)
-   goto out3;
+   goto out2;
 
err = register_netdevice_notifier(&macvtap_notifier_block);
if (err)
-   goto out4;
+   goto out3;
 
err = macvlan_link_register(&macvtap_link_ops);
if (err)
-   goto out5;
+   goto out4;
 
return 0;
 
-out5:
-   unregister_netdevice_notifier(&macvtap_notifier_block);
 out4:
-   class_unregister(&macvtap_class);
+   unregister_netdevice_notifier(&macvtap_notifier_block);
 out3:
-   cdev_del(&macvtap_cdev);
+   class_unregister(&macvtap_class);
 out2:
-   unregister_chrdev_region(macvtap_major, MACVTAP_NUM_DEVS);
+   tap_destroy_cdev(macvtap_major, &macvtap_cdev);
 out1:
return err;
 }
 module_init(macvtap_init);
 
-extern struct idr minor_idr;
 static void macvtap_exit(void)
 {
rtnl_link_unregister(&macvtap_link_ops);
unregister_netdevice_notifier(&macvtap_notifier_block);
class_unregister(&macvtap_class);
-   cdev_del(&macvtap_cdev);
-   unregister_chrdev_region(macvtap_major, MACVTAP_NUM_DEVS);
-   idr_destroy(&minor_idr);
+   tap_destroy_cdev(macvtap_major, &macvtap_cdev);
 }
 module_exit(macvtap_exit);
 
diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 15ca2d5..04ba978 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -123,8 +123,12 @@ static struct proto tap_proto = {
 };
 
 #define TAP_NUM_DEVS (1U << MINORBITS)
-static DEFINE_MUTEX(minor_lock);
-DEFINE_IDR(minor_idr);
+struct major_info {
+   dev_t major;
+   struct idr minor_idr;
+   struct mutex minor_lock;
+   const char *device_name;
+} macvtap_major;
 
 #define GOODCOPY_LEN 128
 
@@ -413,26 +417,26 @@ int tap_get_minor(struct macvlan_dev *vlan)
 {
int retval = -ENOMEM;
 
-   mutex_lock(&minor_lock);
-   retval = idr_alloc(&minor_idr, vlan, 1, TAP_NUM_DEVS, GFP_KERNEL);
+   mutex_lock(&macvtap_major.minor_lock);
+   retval = idr_alloc(&macvtap_major.minor_idr, vlan, 1, TAP_NUM_DEVS, GFP_KERNEL);
if (retval >= 0) {
vlan->minor = retval;
} else if (retval == -ENOSPC) {
netdev_err(vlan->dev, "Too many tap devices\n");
retval = -EINVAL;
}
-   mutex_unlock(&minor_lock);
+   mutex_unlock(&macvtap_major.minor_lock);
return retval < 0 ? retval : 0;
 }
 
 void tap_free_minor(struct macvlan_dev *vlan)
 {
-   mutex_lock(&minor_lock);
+   mutex_lock(&macvtap_major.minor_lock);
if (vlan->minor) {
-   idr_remove(&minor_idr, vlan->minor);
+   idr_remove(&macvtap_major.minor_idr, vlan->minor);
vlan->minor = 0;
}
-   mutex_unlock(&minor_lock);
+   mutex_unlock(&macvtap_major.minor_lock);
 }
 
 static struct net_device *dev_get_by_tap_minor(int minor)
@@ -440,13 +444,13 @@ static struct net_device *dev_get_by_tap_minor(int minor)
struct net_device *dev = NULL;
struct macvlan_dev *vlan;
 
-   mutex_lock(&minor_lock);
-   vlan = idr_find(&minor_idr, minor);
+   mutex_lock(&macvtap_major.minor_lock);
+   vlan = idr_find(&macvtap_major.minor_idr, minor);
if (vlan) {
dev = vlan->dev;
dev_hold(dev);
}
-   mutex_unlock(&minor_lock);
+   mutex_unlock(&macvtap_major.minor_lock);
return dev;
 }
 
@@ -1184,3 +1188,39 @@ int tap_queue_resize(struct macvlan_dev *vlan)
kfree(arrays);
return ret;
 }
+
+int tap_create_cdev(struct cdev

[PATCHv6 0/7] Refactor macvtap to re-use tap functionality by other virtual interfaces

2017-02-10 Thread Sainath Grandhi
Tap character devices can be implemented on other virtual interfaces like
ipvlan, similar to macvtap. Source code for tap functionality in macvtap
can be re-used for this purpose.

This patch series splits macvtap source into two modules, macvtap and tap.
This patch series also includes a patch for implementing tap character
device driver based on the IP-VLAN network interface, called ipvtap.
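
Editorial sketch: one way to let a shared tap core serve several device types
is to embed a generic structure in each device-specific one and let the core
call back through function pointers, recovering the outer structure with
container_of(). The userspace model below illustrates that pattern only. All
names (sketch_tap, sketch_macvtap, sketch_container_of and the functions) are
hypothetical stand-ins, not the series' actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Generic tap state: knows nothing about the owning device type. */
struct sketch_tap {
	void (*count_tx_dropped)(struct sketch_tap *tap);
};

/* One concrete device type that embeds the generic part. */
struct sketch_macvtap {
	unsigned long tx_dropped;
	struct sketch_tap tap;
};

/* Device-specific callback: step back from the embedded member to the
 * enclosing structure and update its private counter.
 */
static void sketch_macvtap_tx_dropped(struct sketch_tap *tap)
{
	struct sketch_macvtap *mv =
		sketch_container_of(tap, struct sketch_macvtap, tap);

	mv->tx_dropped++;
}

/* Core code path: reports a drop without knowing the device type. */
static void sketch_tap_drop(struct sketch_tap *tap)
{
	if (tap->count_tx_dropped)
		tap->count_tx_dropped(tap);
}
```

A second device type would embed its own sketch_tap and install a different
callback, and the core function would stay unchanged.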

These patches were tested on an x86 platform.

Sainath Grandhi (7):
  tap: Refactoring macvtap.c
  tap: Renaming tap related APIs, data structures, macros
  tap: Tap character device creation/destroy API
  tap: Abstract type of virtual interface from tap implementation
  tap: Extending tap device create/destroy APIs
  tap: tap as an independent module
  ipvtap: IP-VLAN based tap driver

 drivers/net/Kconfig  |   20 +
 drivers/net/Makefile |2 +
 drivers/net/ipvlan/Makefile  |1 +
 drivers/net/ipvlan/ipvlan.h  |7 +
 drivers/net/ipvlan/ipvlan_core.c |3 +-
 drivers/net/ipvlan/ipvlan_main.c |   27 +-
 drivers/net/ipvlan/ipvtap.c  |  241 +++
 drivers/net/macvlan.c|2 +-
 drivers/net/macvtap.c| 1229 ++--
 drivers/net/tap.c| 1285 ++
 drivers/vhost/Kconfig|2 +-
 drivers/vhost/net.c  |3 +-
 include/linux/if_macvlan.h   |   17 +-
 include/linux/if_tap.h   |   75 +++
 14 files changed, 1706 insertions(+), 1208 deletions(-)
 create mode 100644 drivers/net/ipvlan/ipvtap.c
 create mode 100644 drivers/net/tap.c
 create mode 100644 include/linux/if_tap.h

-- 
2.7.4



Re: [PATCH] checkpatch: add warning on %pk instead of %pK usage

2017-02-10 Thread Joe Perches
On Fri, 2017-02-10 at 23:54 +, Roberts, William C wrote:
> > The problem starts to get hairy when we think of how often folks roll their own
> > logging macros (see some small sampling at the end).

It's not just the "hairy" local macros.

In its current form, checkpatch could not find uses like:

netif_(x, y, z,
"some string with %pk",
args);
and
some_logging_function(arg, "string 1" CONSTANT "string 2", etc...)

if string 2 or CONSTANT had the "%pk" use.

and a bunch of other styles.

This really needs to be verified by the compiler.
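
Editorial sketch: the string-concatenation case can be reproduced in a few
lines of standalone C. The compiler joins adjacent literals, including ones
that come from a macro, into a single format string, while a script that
scans source text line by line only sees the unexpanded tokens. The names
SKETCH_CONSTANT, sketch_fmt, sketch_source_line and sketch_has_spec are
hypothetical, and the strings are only searched, never passed to a printf
function.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical macro standing in for CONSTANT in the example above. */
#define SKETCH_CONSTANT " %pk"

/* What the compiler sees: one literal after macro expansion and
 * adjacent-literal concatenation.
 */
static const char sketch_fmt[] = "string 1" SKETCH_CONSTANT " string 2";

/* What a line-oriented script sees: the raw source text of the same
 * argument, with the macro name still unexpanded.
 */
static const char sketch_source_line[] =
	"\"string 1\" SKETCH_CONSTANT \" string 2\"";

/* Returns 1 if the text contains the given specifier, else 0. */
static int sketch_has_spec(const char *text, const char *spec)
{
	return strstr(text, spec) != NULL;
}
```

Searching sketch_fmt for "%pk" succeeds, while searching sketch_source_line
does not, which is the gap a purely textual check cannot close.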


Re: [PATCH v2 2/4] seccomp: Add sysctl to configure actions that should be logged

2017-02-10 Thread Tyler Hicks
On 02/07/2017 06:24 PM, Kees Cook wrote:
> On Thu, Feb 2, 2017 at 9:37 PM, Tyler Hicks  wrote:
>> Administrators can write to this sysctl to set the maximum seccomp
>> action that should be logged. Any actions with values greater than
>> what's written to the sysctl will not be logged.
>>
>> For example, all SECCOMP_RET_KILL, SECCOMP_RET_TRAP, and
>> SECCOMP_RET_ERRNO actions would be logged if "errno" were written to the
>> sysctl. SECCOMP_RET_TRACE and SECCOMP_RET_ALLOW actions would not be
>> logged since their values are higher than SECCOMP_RET_ERRNO.
>>
>> The path to the sysctl is:
>>
>>  /proc/sys/kernel/seccomp/max_action_to_log
> 
> /me looks for new bikeshed paint.
> 
> How about .../seccomp/action_log ? (And a corresponding
> s/max_action_to_log/action_log/, if that looks readable...) I think
> four words is just too long. :)

Kees and I discussed this a bit over IRC today. We settled on
log_max_action for v3 of the patch set.
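
Editorial sketch: the threshold rule described in the quoted commit message
can be written as a small standalone function. This is a model under stated
assumptions, not the patch's code. The SK_RET_* values mirror the
SECCOMP_RET_* action constants from the UAPI header <linux/seccomp.h> as they
stood at the time of this thread, redefined here so the sketch compiles on
its own. sk_should_log and its task_audited flag are hypothetical; the flag
stands in for "audit is enabled and the task has an audit context".

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the seccomp action values (assumed from the UAPI header). */
#define SK_RET_KILL  0x00000000U
#define SK_RET_TRAP  0x00030000U
#define SK_RET_ERRNO 0x00050000U
#define SK_RET_TRACE 0x7ff00000U
#define SK_RET_ALLOW 0x7fff0000U

/* Returns 1 if an action should be logged, else 0.
 * - Actions at or below the configured maximum are always logged.
 * - Above the maximum, an audited task still logs everything except ALLOW.
 */
static int sk_should_log(uint32_t action, uint32_t log_max, int task_audited)
{
	if (action <= log_max)
		return 1;
	return task_audited && action != SK_RET_ALLOW;
}
```

With the maximum set to the ERRNO value, KILL, TRAP and ERRNO are logged and
TRACE and ALLOW are not, matching the "errno" example in the description.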

> 
>> The actions_avail sysctl can be read to discover the valid action names
>> that can be written to the max_action_to_log sysctl. The actions_avail
>> sysctl is also useful in understanding the ordering of actions used when
>> deciding the maximum action to log.
>>
>> The default setting for the sysctl is to only log SECCOMP_RET_KILL
>> actions which matches the existing behavior.
>>
>> There's one important exception to this sysctl. If a task is
>> specifically being audited, meaning that an audit context has been
>> allocated for the task, seccomp will log all actions other than
>> SECCOMP_RET_ALLOW despite the value of max_action_to_log. This exception
>> preserves the existing auditing behavior of tasks with an allocated
>> audit context.
>>
>> Signed-off-by: Tyler Hicks 
>> ---
>>  include/linux/audit.h |   6 +--
>>  kernel/seccomp.c  | 114 
>> --
>>  2 files changed, 112 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/linux/audit.h b/include/linux/audit.h
>> index f51fca8d..e0d95fc 100644
>> --- a/include/linux/audit.h
>> +++ b/include/linux/audit.h
>> @@ -315,11 +315,7 @@ void audit_core_dumps(long signr);
>>
>>  static inline void audit_seccomp(unsigned long syscall, long signr, int code)
>>  {
>> -   if (!audit_enabled)
>> -   return;
>> -
>> -   /* Force a record to be reported if a signal was delivered. */
>> -   if (signr || unlikely(!audit_dummy_context()))
>> +   if (audit_enabled && unlikely(!audit_dummy_context()))
>> __audit_seccomp(syscall, signr, code);
>>  }
>>
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index 919ad9f..548fb89 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -509,6 +509,24 @@ static void seccomp_send_sigsys(int syscall, int reason)
>>  }
>>  #endif /* CONFIG_SECCOMP_FILTER */
>>
>> +static u32 seccomp_max_action_to_log = SECCOMP_RET_KILL;
>> +
>> +static void seccomp_log(unsigned long syscall, long signr, u32 action)
> 
> Please mark this inline...

Will do.

> 
>> +{
>> +   /* Force an audit message to be emitted when the action is not greater
>> +* than the configured maximum action.
>> +*/
>> +   if (action <= seccomp_max_action_to_log)
>> +   return __audit_seccomp(syscall, signr, action);
>> +
>> +   /* If the action is not an ALLOW action, let the audit subsystem decide
>> +* if it should be audited based on whether the current task itself is
>> +* being audited.
>> +*/
>> +   if (action != SECCOMP_RET_ALLOW)
>> +   return audit_seccomp(syscall, signr, action);
> 
> Based on my thoughts below, this test can actually be removed (making
> the audit_seccomp() call unconditional), since callers will always be
> != RET_ALLOW.

Agreed.

> 
>> +}
>> +
>>  /*
>>   * Secure computing mode 1 allows only read/write/exit/sigreturn.
>>   * To be fully secure this must be combined with rlimit
>> @@ -534,7 +552,7 @@ static void __secure_computing_strict(int this_syscall)
>>  #ifdef SECCOMP_DEBUG
>> dump_stack();
>>  #endif
>> -   audit_seccomp(this_syscall, SIGKILL, SECCOMP_RET_KILL);
>> +   seccomp_log(this_syscall, SIGKILL, SECCOMP_RET_KILL);
>> do_exit(SIGKILL);
>>  }
>>
>> @@ -633,18 +651,19 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
>> return 0;
>>
>> case SECCOMP_RET_ALLOW:
>> +   seccomp_log(this_syscall, 0, action);
>> return 0;
> 
> I am extremely sensitive about anything appearing in the RET_ALLOW
> case, since it's the hot path for seccomp. This adds a full function
> call (which also contains a redundant test: the action IS RET_ALLOW,
> so we'll never call audit_seccomp() in seccomp_log()).
> 
> While the inline request above removes the function call, it's not
> clear to me if gcc is going to do the right thing here, and I'd like
> to assist the branch predictor (likely separate from the o
