date:20160613

[PATCH 1/3] Kbuild: don't add ../../ to include path

2016-06-13 Thread Arnd Bergmann

When we build with O=objdir and objdir is directly below the source tree,
$(srctree) becomes '..'.

When a Makefile adds a CFLAGS option like -Ipath/to/headers and
we are building with a separate object directory, Kbuild tries to
add two -I options, one for the source tree and one for the object
tree. An absolute path is treated as a special case, and don't add
this one twice. This also normally catches -I$(srctree)/$(src)
as $(srctree) usually is an absolute directory like /home/arnd/linux/.

The combination of the two behaviors however results in an invalid
path name to be included: we get both ../$(src) and ../../$(src),
the latter one pointing outside of the source tree, usually to a
nonexisting directory. Building with 'make W=1' makes this obvious:

cc1: error: ../../arch/arm/mach-s3c24xx/include: No such file or directory 
[-Werror=missing-include-dirs]

This adds another special case, treating path names starting with ../
like those starting with / so we don't try to prefix that with
$(srctree).

Signed-off-by: Arnd Bergmann 
---
 scripts/Kbuild.include | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 0f82314621f2..f8b45eb47ed3 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -202,7 +202,7 @@ hdr-inst := -f $(srctree)/scripts/Makefile.headersinst obj
 # Prefix -I with $(srctree) if it is not an absolute path.
 # skip if -I has no parameter
 addtree = $(if $(patsubst -I%,%,$(1)), \
-$(if $(filter-out -I/%,$(1)),$(patsubst -I%,-I$(srctree)/%,$(1))) $(1))
+$(if $(filter-out -I/% -I../%,$(1)),$(patsubst -I%,-I$(srctree)/%,$(1))) $(1))
 
 # Find all -I options and call addtree
 flags = $(foreach o,$($(1)),$(if $(filter -I%,$(o)),$(call addtree,$(o)),$(o)))
-- 
2.7.0

[PATCH 2/3] Kbuild: don't add obj tree in additional includes

2016-06-13 Thread Arnd Bergmann

When building with separate object directories and driver specific
Makefiles that add additional header include paths, Kbuild adjusts
the gcc flags so that we include both the directory in the source
tree and in the object tree.

However, due to another bug I fixed earlier, this did not actually
include the correct directory in the object tree, so we know that
we only really need the source tree here. Also, including the
object tree sometimes causes warnings about nonexisting directories
when the include path only exists in the source.

This changes the logic to only emit the -I argument for the srctree,
not for objects. We still need both $(srctree)/$(src) and $(obj)
though, so I'm adding them manually.

Signed-off-by: Arnd Bergmann 
---
 scripts/Kbuild.include | 2 +-
 scripts/Makefile.lib   | 7 ---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index f8b45eb47ed3..f8f95e32a746 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -202,7 +202,7 @@ hdr-inst := -f $(srctree)/scripts/Makefile.headersinst obj
 # Prefix -I with $(srctree) if it is not an absolute path.
 # skip if -I has no parameter
 addtree = $(if $(patsubst -I%,%,$(1)), \
-$(if $(filter-out -I/% -I../%,$(1)),$(patsubst -I%,-I$(srctree)/%,$(1))) $(1))
+$(if $(filter-out -I/% -I../%,$(1)),$(patsubst -I%,-I$(srctree)/%,$(1)),$(1)))
 
 # Find all -I options and call addtree
 flags = $(foreach o,$($(1)),$(if $(filter -I%,$(o)),$(call addtree,$(o)),$(o)))
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 76494e15417b..0a07f9014944 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -155,9 +155,10 @@ else
 # $(call addtree,-I$(obj)) locates .h files in srctree, from generated .c files
 #   and locates generated .h files
 # FIXME: Replace both with specific CFLAGS* statements in the makefiles
-__c_flags  = $(call addtree,-I$(obj)) $(call flags,_c_flags)
-__a_flags  =  $(call flags,_a_flags)
-__cpp_flags =  $(call flags,_cpp_flags)
+__c_flags  = $(if $(obj),-I$(srctree)/$(src) -I$(obj)) \
+ $(call flags,_c_flags)
+__a_flags  = $(call flags,_a_flags)
+__cpp_flags = $(call flags,_cpp_flags)
 endif
 
 c_flags= -Wp,-MD,$(depfile) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) \
-- 
2.7.0

Re: [RFC 05/18] limits: track and present RLIMIT_NOFILE actual max

2016-06-13 Thread Andy Lutomirski

On Mon, Jun 13, 2016 at 2:13 PM, Topi Miettinen  wrote:
> On 06/13/16 20:40, Andy Lutomirski wrote:
>> On 06/13/2016 12:44 PM, Topi Miettinen wrote:
>>> Track maximum number of files for the process, present current maximum
>>> in /proc/self/limits.
>>
>> The core part should be its own patch.
>>
>> Also, you have this weirdly named (and racy!) function bump_rlimit.
>
> I can change the name if you have better suggestions. rlimit_track_max?
>
> The max value is written often but read seldom, if ever. What kind of
> locking should I use then?

Possibly none, but WRITE_ONCE would be good as would a comment
indicating that your code in intentionally racy.  Or you could use
atomic_cmpxchg if that won't kill performance.

rlimit_track_max sounds like a better name to me.

>
>> Wouldn't this be nicer if you taught the rlimit code to track the
>> *current* usage generically and to derive the max usage from that?
>
> Current rlimit code performs checks against current limits. These are
> typically done early in the calling function and further checks could
> also fail. Thus max should not be updated until much later. Maybe these
> could be combined, but not easily if at all.

I mean:  why not actually show the current value in /proc/pid/limits
and track the max via whatever teaches proc about the current value?

>
>>
>>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>>> index a11eb71..227997b 100644
>>> --- a/fs/proc/base.c
>>> +++ b/fs/proc/base.c
>>> @@ -630,8 +630,8 @@ static int proc_pid_limits(struct seq_file *m,
>>> struct pid_namespace *ns,
>>>  /*
>>>   * print the file header
>>>   */
>>> -   seq_printf(m, "%-25s %-20s %-20s %-10s\n",
>>> -  "Limit", "Soft Limit", "Hard Limit", "Units");
>>> +seq_printf(m, "%-25s %-20s %-20s %-10s %-20s\n",
>>> +   "Limit", "Soft Limit", "Hard Limit", "Units", "Max");
>>
>> What existing programs, if any, does this break?
>
> Using Debian codesearch for /limits" string, I'd check pam_limits and
> rtkit. The max values could be put into a new file if you prefer.

If it actually breaks them, then you need to change the patch so you
don't break them.

Re: [patch] x86/ldt: silence a static checker warning

2016-06-13 Thread Andy Lutomirski

On Sun, Jun 12, 2016 at 11:57 PM, Dan Carpenter
 wrote:
> It likely doesn't make a difference but my static checker complains
> that we put an upper bound on "size" but not a lower bound.  Let's just
> make it unsigned.

Shouldn't oldsize and newsize in write_ldt as well as the "size"
member in ldt_struct change, too?

--Andy

Re: [RFC 05/18] limits: track and present RLIMIT_NOFILE actual max

2016-06-13 Thread Topi Miettinen

On 06/13/16 20:40, Andy Lutomirski wrote:
> On 06/13/2016 12:44 PM, Topi Miettinen wrote:
>> Track maximum number of files for the process, present current maximum
>> in /proc/self/limits.
> 
> The core part should be its own patch.
> 
> Also, you have this weirdly named (and racy!) function bump_rlimit.

I can change the name if you have better suggestions. rlimit_track_max?

The max value is written often but read seldom, if ever. What kind of
locking should I use then?

> Wouldn't this be nicer if you taught the rlimit code to track the
> *current* usage generically and to derive the max usage from that?

Current rlimit code performs checks against current limits. These are
typically done early in the calling function and further checks could
also fail. Thus max should not be updated until much later. Maybe these
could be combined, but not easily if at all.

> 
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index a11eb71..227997b 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -630,8 +630,8 @@ static int proc_pid_limits(struct seq_file *m,
>> struct pid_namespace *ns,
>>  /*
>>   * print the file header
>>   */
>> -   seq_printf(m, "%-25s %-20s %-20s %-10s\n",
>> -  "Limit", "Soft Limit", "Hard Limit", "Units");
>> +seq_printf(m, "%-25s %-20s %-20s %-10s %-20s\n",
>> +   "Limit", "Soft Limit", "Hard Limit", "Units", "Max");
> 
> What existing programs, if any, does this break?

Using Debian codesearch for /limits" string, I'd check pam_limits and
rtkit. The max values could be put into a new file if you prefer.

> 
>>
>>  for (i = 0; i < RLIM_NLIMITS; i++) {
>>  if (rlim[i].rlim_cur == RLIM_INFINITY)
>> @@ -647,9 +647,11 @@ static int proc_pid_limits(struct seq_file *m,
>> struct pid_namespace *ns,
>>  seq_printf(m, "%-20lu ", rlim[i].rlim_max);
>>
>>  if (lnames[i].unit)
>> -seq_printf(m, "%-10s\n", lnames[i].unit);
>> +seq_printf(m, "%-10s", lnames[i].unit);
>>  else
>> -seq_putc(m, '\n');
>> +seq_printf(m, "%-10s", "");
>> +seq_printf(m, "%-20lu\n",
>> +   task->signal->rlim_curmax[i]);
>>  }
>>
>>  return 0;
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 9c48a08..0150380 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -782,6 +782,7 @@ struct signal_struct {
>>   * have no need to disable irqs.
>>   */
>>  struct rlimit rlim[RLIM_NLIMITS];
>> +unsigned long rlim_curmax[RLIM_NLIMITS];
>>
>>  #ifdef CONFIG_BSD_PROCESS_ACCT
>>  struct pacct_struct pacct;/* per-process accounting
>> information */
>> @@ -3376,6 +3377,12 @@ static inline unsigned long rlimit_max(unsigned
>> int limit)
>>  return task_rlimit_max(current, limit);
>>  }
>>
>> +static inline void bump_rlimit(unsigned int limit, unsigned long r)
>> +{
>> +if (READ_ONCE(current->signal->rlim_curmax[limit]) < r)
>> +current->signal->rlim_curmax[limit] = r;
>> +}
>> +
>>  #ifdef CONFIG_CPU_FREQ
>>  struct update_util_data {
>>  void (*func)(struct update_util_data *data,
>>
>

Re: [RFC 02/18] cgroup_pids: track maximum pids

2016-06-13 Thread Tejun Heo

Hello,

On Mon, Jun 13, 2016 at 10:44:09PM +0300, Topi Miettinen wrote:
> Track maximum pids in the cgroup, present it in cgroup pids.current_max.

"max" is often used for maximum limits in cgroup.  I think "watermark"
or "high_watermark" would be a lot clearer.

> @@ -236,6 +246,14 @@ static void pids_free(struct task_struct *task)
>   pids_uncharge(pids, 1);
>  }
>  
> +static void pids_fork(struct task_struct *task)
> +{
> + struct pids_cgroup *pids = css_pids(task_css(task, pids_cgrp_id));
> +
> + if (atomic64_read(&pids->cur_max) < atomic64_read(&pids->counter))
> + atomic64_set(&pids->cur_max, atomic64_read(&pids->counter));
> +}

Wouldn't it make more sense to track high watermark from the charge
functions instead?  I don't get why this requires a separate fork
callback.  Also, racing atomic64_set's are racy.  The counter can end
up with a lower number than it should be.

> @@ -300,6 +326,11 @@ static struct cftype pids_files[] = {
>   .read_s64 = pids_current_read,
>   .flags = CFTYPE_NOT_ON_ROOT,
>   },
> + {
> + .name = "current_max",

Please make this "high_watermark" field in pids.stats file.

Thanks.

-- 
tejun

Re: [RFC 01/18] capabilities: track actually used capabilities

2016-06-13 Thread Andy Lutomirski

On Mon, Jun 13, 2016 at 1:45 PM, Topi Miettinen  wrote:
> On 06/13/16 20:32, Andy Lutomirski wrote:
>> On Mon, Jun 13, 2016 at 12:44 PM, Topi Miettinen  wrote:
>>> Track what capabilities are actually used and present the current
>>> situation in /proc/self/status.
>>
>> What for?
>

>
> Capabilities
> [RFC 01/18] capabilities: track actually used capabilities
>
> Currently, there is no way to know which capabilities are actually used.
> Even
> the source code is only implicit, in-depth knowledge of each capability must
> be used when analyzing a program to judge which capabilities the program
> will
> exercise."
>
> Should I perhaps cite some of this in the commit?

Yes, but you should also clarify what users are supposed to do with
this.  Given ambient capabilities, I suspect that you'll find that
your patch doesn't actually work very well.  For example, if you run a
shell script with ambient caps, then you won't notice caps used by
short-lived helper processes.

>
>>
>> What is the intended behavior on fork()?  Whatever the intended
>> behavior is, there should IMO be a selftest for it.
>>
>> --Andy
>>
>
> The capabilities could be tracked from three points of daemon
> initialization sequence onwards:
> fork()
> setpcap()
> exec()
>
> fork() case would be logical as the /proc entry is per task. But if you
> consider the tools to set the capabilities (for example systemd unit
> files), there can be between fork() and exec() further preparations
> which need more capabilities than the program itself needs.
>
> setpcap() is probably the real point after which we are interested if
> the capabilities are enough.
>
> The amount of setup between setpcap() and exec() is probably very low.

When I asked "what is the intended behavior on fork()?", I mean "what
should CapUsed be after fork()?".  The answer should be about four
words long and should have a test case.  There should maybe also be an
explanation of why the intended behavior is useful.

But, as I said above, I think that you may need to rethink this
entirely to make it useful.  You might need to do it per process tree
or per cgroup or something.

--Andy

[PATCH 0/2] pinctrl: convert palmas and as3722 to tristate

2016-06-13 Thread Paul Gortmaker

As part of a previous review[1], these two drivers (currently bool) were
nominated by their author to be converted to tristate (vs. removing the
existing modular references.)

Upon detecting a non-modular driver making modular references, I don't
immediately convert them to tristate, since it increases functionality
that I can't readily test, and it may not have a sensible use case (e.g.
in the case of core arch support relating to timer ticks or similar.)
So instead the modular references are removed w/o changing the existing
functionality by default.

However there is no reason the original author or an interested user
with the capability to test can't nominate the driver to be tristate
either as the original intent, or as a functional and tested use case.

Here we convert two drivers to tristate and ensure that they can compile
and pass modpost without suffering unresolved symbols:

paul@builder:~/git/linux-head$ ls -l ../arm-build/drivers/pinctrl/*ko
[...]
-rw-rw-r-- 1 paul paul 15585 Jun 13 15:49 
../arm-build/drivers/pinctrl/pinctrl-as3722.ko
-rw-rw-r-- 1 paul paul 25497 Jun 13 15:49 
../arm-build/drivers/pinctrl/pinctrl-palmas.ko
paul@builder:~/git/linux-head$

To be clear, I don't have the hardware required for runtime testing of
the modular instances, and hence that remains to be done by someone
with the hardware and the desire to have the driver(s) modular.

That said, this change won't regress any existing users who are relying
on the current built-in behaviour, so merging doesn't need to be
conditional on obtaining run time testing of the modular instances.

Paul.

[1] 
https://lkml.kernel.org/r/1465267388-17884-1-git-send-email-paul.gortma...@windriver.com
---

Cc: Laxman Dewangan 
Cc: Linus Walleij 
Cc: linux-g...@vger.kernel.org


Paul Gortmaker (2):
  pinctrl: palmas: convert PINCTRL_PALMAS from bool to tristate
  pinctrl: as3722: convert PINCTRL_AS3722 from bool to tristate

 drivers/pinctrl/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

-- 
2.8.4

Re: [PATCH 3.16 000/114] 3.16.36-rc1 review

2016-06-13 Thread Sudip Mukherjee


On Monday 13 June 2016 07:36 PM, Ben Hutchings wrote:

This is the start of the stable review cycle for the 3.16.36 release.
There are 114 patches in this series, which will be posted as responses
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed Jun 15 19:00:00 UTC 2016.
Anything received after that time might be too late.

A combined patch relative to 3.16.35 will be posted as an additional
response to this.  A shortlog and diffstat can be found below.


Hi Ben,
I am not able to find the mail with the combined patch, cant even find 
on lkml also. I think I am missing something. Can you please send it to 
me again.


Regards
Sudip

[PATCH 2/2] pinctrl: as3722: convert PINCTRL_AS3722 from bool to tristate

2016-06-13 Thread Paul Gortmaker

The Kconfig currently controlling compilation of this code is:

config PINCTRL_AS3722
bool "Pinctrl and GPIO driver for ams AS3722 PMIC"

...meaning that it currently is not being built as a module by anyone.

During an audit for non-modular drivers using modular infrastructure
this driver showed up.

But rather than demodularize it, Laxman indicated that it would be
prefereable to instead convert the driver option to tristate.

This does that, and confirms that it will compile and modpost as
such.  However, since I do not have the hardware to confirm that
no new runtime issues exist when modular, that remains untested.

Cc: Laxman Dewangan 
Cc: Linus Walleij 
Cc: linux-g...@vger.kernel.org
Signed-off-by: Paul Gortmaker 
---
 drivers/pinctrl/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pinctrl/Kconfig b/drivers/pinctrl/Kconfig
index a92e61870024..2f805014cc21 100644
--- a/drivers/pinctrl/Kconfig
+++ b/drivers/pinctrl/Kconfig
@@ -35,7 +35,7 @@ config PINCTRL_ADI2
  machine and arch are selected to build.
 
 config PINCTRL_AS3722
-   bool "Pinctrl and GPIO driver for ams AS3722 PMIC"
+   tristate "Pinctrl and GPIO driver for ams AS3722 PMIC"
depends on MFD_AS3722 && GPIOLIB
select PINMUX
select GENERIC_PINCONF
-- 
2.8.4

Re: [RESEND PATCH 1/3] rfkill: Create "rfkill-airplane-mode" LED trigger

2016-06-13 Thread João Paulo Rechi Vita

On 13 June 2016 at 17:01, Pavel Machek  wrote:
> On Mon 2016-06-13 15:59:35, João Paulo Rechi Vita wrote:
>> On 13 June 2016 at 15:00, Pavel Machek  wrote:
>> > Hi!
>> >
>> >> > João, that means you should send a patch to add the ::rfkill suffix.
>> >> >
>> >>
>> >> IMO "airplane" (or maybe "airplane-mode") is a better suffix, as it
>> >> reflects the label on the machine's chassis. I'll name it
>> >> "asus-wireless::airplane" and send this through platform-drivers-x86,
>> >> as this is now contained in the platform-drivers-x86 subsystem. Thanks
>> >> Johannes for your patience and help designing and reviewing the rfkill
>> >> changes, even if not all of them made it through in the end. And
>> >> thanks everyone else involved for the feedback.
>> >
>> > Actually, I'd do '::rfkill', for consistency with other places in
>> > /sys.
>> >
>> > /sys/devices/platform/thinkpad_acpi/rfkill/rfkill1/name
>> > /sys/class/rfkill
>> > /sys/module/rfkill
>> >
>>
>> If we use "rfkill" as a suffix, how do you expect userspace to be able
>> to differentiate between a LED that indicates airplane-mode (LED ON
>> when all radios are OFF) and a LED that indicates the state of a
>> specific radio like WiFi or Bluetooth (LED ON when that specific radio
>> is ON)? If we're going this route we should provide meaningful
>> information here.
>
> '::airplane' has same problem, no?
>

No, because in this case we would not use "airplane" as a suffix for a
LED associated with an individual radio.

> If you want to distinguish that, maybe you can do '::rfkill' for
> everything vs '::rfkill-wifi' for wifi-only and '::rfkill-bt' for
> bluetooth...
>

The problem here is that the "rfkill" name is already associated with
individual rfkill switches under /sys/class/rfkill,
/sys/devices/platform/*/rfkill etc, so I think we're better off
distinguishing "airplane" vs "wifi" vs "bluetooth" etc, to avoid
confusion.

--
João Paulo Rechi Vita
http://about.me/jprvita

[PATCH 1/2] pinctrl: palmas: convert PINCTRL_PALMAS from bool to tristate

2016-06-13 Thread Paul Gortmaker

The Kconfig currently controlling compilation of this code is:

config PINCTRL_PALMAS
bool "Pinctrl driver for the PALMAS Series MFD devices"

...meaning that it currently is not being built as a module by anyone.

During an audit for non-modular drivers using modular infrastructure
this driver showed up.

But rather than demodularize it, Laxman indicated that it would be
prefereable to instead convert the driver option to tristate.

This does that, and confirms that it will compile and modpost as
such.  However, since I do not have the hardware to confirm that
no new runtime issues exist when modular, that remains untested.

Cc: Laxman Dewangan 
Cc: Linus Walleij 
Cc: linux-g...@vger.kernel.org
Signed-off-by: Paul Gortmaker 
---
 drivers/pinctrl/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pinctrl/Kconfig b/drivers/pinctrl/Kconfig
index ea25eeeceef1..a92e61870024 100644
--- a/drivers/pinctrl/Kconfig
+++ b/drivers/pinctrl/Kconfig
@@ -218,7 +218,7 @@ config PINCTRL_MAX77620
  open drain, FPS slots etc.
 
 config PINCTRL_PALMAS
-   bool "Pinctrl driver for the PALMAS Series MFD devices"
+   tristate "Pinctrl driver for the PALMAS Series MFD devices"
depends on OF && MFD_PALMAS
select PINMUX
select GENERIC_PINCONF
-- 
2.8.4

Re: [RFC PATCH V2 1/2] ACPI/PCI: Match PCI config space accessors against platfrom specific ECAM quirks

2016-06-13 Thread Duc Dang

On Mon, Jun 13, 2016 at 8:59 AM, Jeffrey Hugo  wrote:
> On 6/13/2016 9:12 AM, ok...@codeaurora.org wrote:
>>
>> On 2016-06-13 10:29, Gabriele Paoloni wrote:
>>>
>>> Hi Sinan
>>>
 -Original Message-
 From: Sinan Kaya [mailto:ok...@codeaurora.org]
 Sent: 13 June 2016 15:03
 To: Gabriele Paoloni; liudongdong (C); helg...@kernel.org;
 a...@arndb.de; will.dea...@arm.com; catalin.mari...@arm.com;
 raf...@kernel.org; hanjun@linaro.org; lorenzo.pieral...@arm.com;
 jchan...@broadcom.com; t...@semihalf.com
 Cc: robert.rich...@caviumnetworks.com; m...@semihalf.com;
 liviu.du...@arm.com; dda...@caviumnetworks.com; Wangyijing;
 suravee.suthikulpa...@amd.com; msal...@redhat.com; linux-
 p...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; linux-
 a...@vger.kernel.org; linux-kernel@vger.kernel.org; linaro-
 a...@lists.linaro.org; j...@redhat.com; andrea.ga...@linaro.org;
 dhd...@apm.com; jeremy.lin...@arm.com; c...@codeaurora.org; Chenxin
 (Charles); Linuxarm
 Subject: Re: [RFC PATCH V2 1/2] ACPI/PCI: Match PCI config space
 accessors against platfrom specific ECAM quirks

 On 6/13/2016 9:54 AM, Gabriele Paoloni wrote:
 > As you can see here Liudongdong has replaced oem_revision with
 > oem_table_id.
 >
 > Now it seems that there are some platforms that have already shipped
 > using a matching based on the oem_revision (right Jon?)
 >
 > However I guess that if in FW they have defined oem_table_id properly
 > they should be able to use this mechanism without needing to a FW
 update.
 >
 > Can these vendors confirm this?
 >
 > Tomasz do you think this can work for Cavium Thunder?
 >
 > Thanks
 >
 > Gab

 Why not have all three of them?

 The initial approach was OEM id and revision id.

 Jeff Hugo indicated that addition (not removing any other fields) of
 table id
 would make more sense.
>>>
>>>
>>> Mmm from last email of Jeff Hugo on "[RFC PATCH 1/3] pci, acpi: Match
>>> PCI config space accessors against platfrom specific ECAM quirks."
>>>
>>> I quote:
>>>
>>>  "Using the OEM revision
>>>  field does not seem to be appropriate since these are different
>>>  platforms and the revision field appears to be for the purpose of
>>>  tracking differences within a single platform.  Therefore, Cov is
>>>  proposing using the OEM table id as a mechanism to distinguish
>>>  platform A (needs quirk applied) vs platform B (no quirks) from the
>>>  same OEM."
>>>
>>> So it looks to me that he pointed out that using the OEM revision field
>>> is wrong...and this is why I have asked if replacing it with the table
>>> id can work for other vendors
>>>
>>> Thanks
>>>
>>> Gab
>>>
>>
>> I had an internal discussion with jeff and cov before posting on the
>> maillist.
>>
>> I think there is missing info in the email.
>>
>> Usage of oem id + table id + revision is ok.
>>
>> Usage of oem id + revision is not ok as one oem can build multiple chips
>> with the same oem id and revision id but different table id. Otherwise,
>> we can run out of revisions very quickly.
>
>
> Agreed.
>
> I'm sorry for the confusion.  My intent was to point out that revision alone
> appeared insufficient to address all the identified problems, but I believe
> there is still a case for using revision. Table id is useful for
> differentiating between platforms/chips.  Revision is useful for
> differentiation between different versions of a single platform/chip
> assuming the silicon is respun or some other fix is applied.  Both solve
> different scenarios, and I'm not aware of a reason why they could not be
> used together to solve all currently identified cases.

Using OEM ID + Table ID + Revision will work for X-Gene platforms as well.

Regards,
Duc Dang.
>
>>
>>>

 --
 Sinan Kaya
 Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center,
 Inc.
 Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
 Linux Foundation Collaborative Project
>>
>>
>> ___
>> linux-arm-kernel mailing list
>> linux-arm-ker...@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
>
>
> --
> Jeffrey Hugo
> Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
> Foundation Collaborative Project

[PATCH] power_supply: fix return value of get_property

2016-06-13 Thread Rhyland Klein

power_supply_get_property() should ideally return -EAGAIN if it is
called while the power_supply is being registered. There was no way
previously to determine if use_cnt == 0 meant that the power_supply
wasn't fully registered yet, or if it had already been unregistered.

Add a new boolean to the power_supply struct to simply show if
registration is completed.

Signed-off-by: Rhyland Klein 
---
This patch continues what was discussed with the patch
"power_supply: power_supply_read_temp only if use_cnt > 0".
Looking at the thermal code, it looks like we should indeed return
EAGAIN if possible, and since this change is fairly simple, I think
it makes sense to do it.

 drivers/power/power_supply_core.c | 6 +-
 include/linux/power_supply.h  | 1 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/power/power_supply_core.c 
b/drivers/power/power_supply_core.c
index b13cd074c52a..a39a47672979 100644
--- a/drivers/power/power_supply_core.c
+++ b/drivers/power/power_supply_core.c
@@ -491,8 +491,11 @@ int power_supply_get_property(struct power_supply *psy,
enum power_supply_property psp,
union power_supply_propval *val)
 {
-   if (atomic_read(&psy->use_cnt) <= 0)
+   if (atomic_read(&psy->use_cnt) <= 0) {
+   if (!psy->initialized)
+   return -EAGAIN;
return -ENODEV;
+   }
 
return psy->desc->get_property(psy, psp, val);
 }
@@ -775,6 +778,7 @@ __power_supply_register(struct device *parent,
if (rc)
goto create_triggers_failed;
 
+   psy->initialized = true;
/*
 * Update use_cnt after any uevents (most notably from device_add()).
 * We are here still during driver's probe but
diff --git a/include/linux/power_supply.h b/include/linux/power_supply.h
index 751061790626..3965503315ef 100644
--- a/include/linux/power_supply.h
+++ b/include/linux/power_supply.h
@@ -248,6 +248,7 @@ struct power_supply {
struct delayed_work deferred_register_work;
spinlock_t changed_lock;
bool changed;
+   bool initialized;
atomic_t use_cnt;
 #ifdef CONFIG_THERMAL
struct thermal_zone_device *tzd;
-- 
1.9.1

RE: [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival

2016-06-13 Thread Odzioba, Lukasz

On 09-06-16 17:42:00, Dave Hansen wrote:
> Does your workload put large pages in and out of those pvecs, though?
> If your system doesn't have any activity, then all we've shown is that
> they're not a problem when not in use.  But what about when we use them?

It doesn't. To use them extensively I guess we would have to
craft a separate program for each one, which is not trivial.

> Have you, for instance, tried this on a system with memory pressure?

Not then, but here are exemplary snapshots with system using swap to handle 
allocation requests with patch applied: (notation: pages = sum in bytes):
LRU_add  336 = 1344kB
LRU_rotate   158 =  632kB
LRU_deactivate 0 =0kB
LRU_deact_file 0 =0kB
LRU_activate   1 =4kB
---
LRU_add 3262 =13048kB
LRU_rotate   142 =  568kB
LRU_deactivate 0 =0kB
LRU_deact_file 0 =0kB
LRU_activate   6 =   24kB
---
LRU_add 3689 =14756kB
LRU_rotate81 =  324kB
LRU_deactivate 0 =0kB
LRU_deact_file 0 =0kB
LRU_activate  19 =   76kB

While running idle os we have:
LRU_add 1038 = 4152kB
LRU_rotate 0 =0kB
LRU_deactivate 0 =0kB
LRU_deact_file 0 =0kB
LRU_activate   0 =0kB

I know those are not representative in overall.

Thanks,
Lukas

Re: [PATCH 2/5] asus-wmi: Create quirk for airplane_mode LED

2016-06-13 Thread João Paulo Rechi Vita

On 25 May 2016 at 17:24, Darren Hart  wrote:
>

(...)

> I believe this is all still blocked on the underlying RFKILL support. João,
> correct me if I'm mistaken.
>

That was true at the time of this message, but the RFKill
infrastructure that I was planning to use here is not going to be
merged. So the new plan is now to simply expose the LED to userspace
under a meaningful name ("asus-wireless::airplane") and have the
userspace people look for LEDs with this suffix and drive then
accordingly.

I have just sent an updated patchset.

--
João Paulo Rechi Vita
http://about.me/jprvita

Re: [PATCH v5 2/5] ACPI / processor_idle: Add support for Low Power Idle(LPI) states

2016-06-13 Thread Rafael J. Wysocki

On Friday, June 10, 2016 06:38:01 PM Sudeep Holla wrote:
> Hi Rafael,

Hi,
 
> On 11/05/16 16:37, Sudeep Holla wrote:
> > ACPI 6.0 introduced an optional object _LPI that provides an alternate
> > method to describe Low Power Idle states. It defines the local power
> > states for each node in a hierarchical processor topology. The OSPM can
> > use _LPI object to select a local power state for each level of processor
> > hierarchy in the system. They used to produce a composite power state
> > request that is presented to the platform by the OSPM.
> >
> > Since multiple processors affect the idle state for any non-leaf hierarchy
> > node, coordination of idle state requests between the processors is
> > required. ACPI supports two different coordination schemes: Platform
> > coordinated and  OS initiated.
> >
> > This patch adds initial support for Platform coordination scheme of LPI.
> >
> 
> I have added support for autopromote states(basically skip flattening or
> creating composite state). I have also fixed the bug discussed in this
> thread with Prashant. Do you have any other feedback on this version
> that I incorporate before posting next version.

I'd really preferred it if you posted the next version without waiting for
my feedback to the previous one (as the feedback may not be relevant any
more among other things).

Thanks,
Rafael

[PATCH 1/7] asus-wireless: Toggle airplane mode LED

2016-06-13 Thread João Paulo Rechi Vita

In the ASHS device we have the HSWC method, which calls either OWGD or
OWGS, depending on its parameter:

Device (ASHS)
{
Name (_HID, "ATK4002")  // _HID: Hardware ID
Method (HSWC, 1, Serialized)
{
If ((Arg0 < 0x02))
{
OWGD (Arg0)
Return (One)
}
If ((Arg0 == 0x02))
{
Local0 = OWGS ()
If (Local0)
{
Return (0x05)
}
Else
{
Return (0x04)
}
}
If ((Arg0 == 0x03))
{
Return (0xFF)
}
If ((Arg0 == 0x04))
{
OWGD (Zero)
Return (One)
}
If ((Arg0 == 0x05))
{
OWGD (One)
Return (One)
}
If ((Arg0 == 0x80))
{
Return (One)
}
}
Method (_STA, 0, NotSerialized)  // _STA: Status
{
If ((MSOS () >= OSW8))
{
Return (0x0F)
}
Else
{
Return (Zero)
}
}
}

On the Asus laptops that does not have an airplane mode LED, OWGD has an
empty implementation and OWGS simply returns 0. On the ones that have an
airplane mode LED these methods have the following implementation:

Method (OWGD, 1, Serialized)
{
SGPL (0x0203000F, Arg0)
SGPL (0x0203000F, Arg0)
}

Method (OWGS, 0, Serialized)
{
Store (RGPL (0x0203000F), Local0)
Return (Local0)
}

Where OWGD(1) sets the airplane mode LED ON, OWGD(0) set it off, and
OWGS() returns its state.

This commit exposes the airplane mode indicator LED to userspace under
the name asus-wireless::airplane, so it can be driven according to
userspace's policy.

Signed-off-by: João Paulo Rechi Vita 
---
 drivers/platform/x86/Kconfig |  2 +
 drivers/platform/x86/asus-wireless.c | 91 +++-
 2 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index c06bb85..e9d0144 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -604,6 +604,8 @@ config ASUS_WIRELESS
tristate "Asus Wireless Radio Control Driver"
depends on ACPI
depends on INPUT
+   select NEW_LEDS
+   select LEDS_CLASS
---help---
  The Asus Wireless Radio Control handles the airplane mode hotkey
  present on some Asus laptops.
diff --git a/drivers/platform/x86/asus-wireless.c 
b/drivers/platform/x86/asus-wireless.c
index 9ec721e..d617dfd 100644
--- a/drivers/platform/x86/asus-wireless.c
+++ b/drivers/platform/x86/asus-wireless.c
@@ -15,11 +15,78 @@
 #include 
 #include 
 #include 
+#include 
+
+#define ASUS_WIRELESS_LED_STATUS 0x2
+#define ASUS_WIRELESS_LED_OFF 0x4
+#define ASUS_WIRELESS_LED_ON 0x5
 
 struct asus_wireless_data {
struct input_dev *idev;
+   struct acpi_device *adev;
+   struct workqueue_struct *wq;
+   struct work_struct led_work;
+   struct led_classdev led;
+   int led_state;
 };
 
+static u64 asus_wireless_method(acpi_handle handle, const char *method,
+   int param)
+{
+   union acpi_object obj;
+   struct acpi_object_list p;
+   acpi_status s;
+   u64 ret;
+
+   acpi_handle_debug(handle, "Evaluating method %s, parameter %#x\n",
+ method, param);
+   obj.type = ACPI_TYPE_INTEGER;
+   obj.integer.value = param;
+   p.count = 1;
+   p.pointer = &obj;
+
+   s = acpi_evaluate_integer(handle, (acpi_string) method, &p, &ret);
+   if (ACPI_FAILURE(s))
+   acpi_handle_err(handle,
+   "Failed to eval method %s, param %#x (%d)\n",
+   method, param, s);
+   acpi_handle_debug(handle, "%s returned %#x\n", method, (uint) ret);
+   return ret;
+}
+
+static enum led_brightness led_state_get(struct led_classdev *led)
+{
+   struct asus_wireless_data *data;

[PATCH 3/7] asus-wmi: Add quirk_no_rfkill for the Asus N552VW

2016-06-13 Thread João Paulo Rechi Vita

The Asus N552VW has an airplane-mode indicator LED and the WMI WLAN user
bit set, so asus-wmi uses ASUS_WMI_DEVID_WLAN_LED (0x00010002) to store
the wlan state, which has a side-effect of driving the airplane mode
indicator LED in an inverted fashion. quirk_no_rfkill prevents asus-wmi
from registering RFKill switches at all for this laptop and allows
asus-wireless to drive the LED through the ASHS ACPI device.

Signed-off-by: João Paulo Rechi Vita 
---
 drivers/platform/x86/asus-nb-wmi.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/platform/x86/asus-nb-wmi.c 
b/drivers/platform/x86/asus-nb-wmi.c
index 90c8f41..31fcde1 100644
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -319,6 +319,15 @@ static const struct dmi_system_id asus_quirks[] = {
},
.driver_data = &quirk_no_rfkill,
},
+   {
+   .callback = dmi_matched,
+   .ident = "ASUSTeK COMPUTER INC. N552VW",
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "N552VW"),
+   },
+   .driver_data = &quirk_no_rfkill,
+   },
{},
 };
 
-- 
2.5.0

Re: [PATCH] init, fix initcall blacklist for modules

2016-06-13 Thread Rasmus Villemoes

On Mon, Jun 13 2016, Prarit Bhargava  wrote:

> Sorry ... forgot to cc everyone on the last email.
>
> P.
>
> 8<
>
> sprint_symbol_no_offset() returns the string "function_name [module_name]"
> where [module_name] is not printed for built in kernel functions.  This
> means that the initcall blacklisting code will now always fail when

I was and am pretty sure that %pf ends up using
sprint_symbol_no_offset(), so I don't see how this is new. But maybe
"now" doesn't refer to c8cdd2be21?

> comparing module_init() function names.  This patch resolves the issue by
> comparing to the length of the function_name.
>
> Signed-off-by: Prarit Bhargava 
> Cc: Andrew Morton 
> Cc: Thomas Gleixner 
> Cc: Yang Shi 
> Cc: Ingo Molnar 
> Cc: Mel Gorman 
> Cc: Rasmus Villemoes 
> Cc: Kees Cook 
> Cc: Yaowei Bai 
> Cc: Andrey Ryabinin 
> ---
>  init/main.c |   14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/init/main.c b/init/main.c
> index 4c17fda5c2ff..09a795e91efe 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -708,14 +708,26 @@ static bool __init_or_module 
> initcall_blacklisted(initcall_t fn)
>  {
>   struct blacklist_entry *entry;
>   char fn_name[KSYM_SYMBOL_LEN];
> + char *space;
> + int length;
>  
>   if (list_empty(&blacklisted_initcalls))
>   return false;
>  
>   sprint_symbol_no_offset(fn_name, (unsigned long)fn);
> + /*
> +  * fn will be "function_name [module_name]" where [module_name] is not
> +  * displayed for built-in initcall functions.  Strip off the
> +  * [module_name].
> +  */
> + space = strchrnul(fn_name, ' ');
> + if (!space)
> + length = strlen(fn_name);
> + else
> + length = space - fn_name;

strchrnul never returns NULL, so this could just be 'length =
strchrnul(fn_name, ' ') - fn_name;'. But I don't think that's what you
want anyway: Suppose one has blacklisted "init_foobar", and the function
pointer resolves to a completely unrelated "init_foo", we'll end up
falsely also blacklisting that since we're just comparing prefixes.

May I suggest

  strreplace(fn_name, ' ', '\0');

which also seems to match the comment a little better (and eliminates
the extra variables and the hunk below).

>   list_for_each_entry(entry, &blacklisted_initcalls, next) {
> - if (!strcmp(fn_name, entry->buf)) {
> + if (!strncmp(fn_name, entry->buf, length)) {
>   pr_debug("initcall %s blacklisted\n", fn_name);
>   return true;
>   }

Rasmus

Re: [RESEND PATCH 1/3] rfkill: Create "rfkill-airplane-mode" LED trigger

2016-06-13 Thread Pavel Machek

On Mon 2016-06-13 15:59:35, João Paulo Rechi Vita wrote:
> On 13 June 2016 at 15:00, Pavel Machek  wrote:
> > Hi!
> >
> >> > João, that means you should send a patch to add the ::rfkill suffix.
> >> >
> >>
> >> IMO "airplane" (or maybe "airplane-mode") is a better suffix, as it
> >> reflects the label on the machine's chassis. I'll name it
> >> "asus-wireless::airplane" and send this through platform-drivers-x86,
> >> as this is now contained in the platform-drivers-x86 subsystem. Thanks
> >> Johannes for your patience and help designing and reviewing the rfkill
> >> changes, even if not all of them made it through in the end. And
> >> thanks everyone else involved for the feedback.
> >
> > Actually, I'd do '::rfkill', for consistency with other places in
> > /sys.
> >
> > /sys/devices/platform/thinkpad_acpi/rfkill/rfkill1/name
> > /sys/class/rfkill
> > /sys/module/rfkill
> >
> 
> If we use "rfkill" as a suffix, how do you expect userspace to be able
> to differentiate between a LED that indicates airplane-mode (LED ON
> when all radios are OFF) and a LED that indicates the state of a
> specific radio like WiFi or Bluetooth (LED ON when that specific radio
> is ON)? If we're going this route we should provide meaningful
> information here.

'::airplane' has same problem, no?

If you want to distinguish that, maybe you can do '::rfkill' for
everything vs '::rfkill-wifi' for wifi-only and '::rfkill-bt' for
bluetooth...

Best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: [RFC PATCH V2 1/2] ACPI/PCI: Match PCI config space accessors against platfrom specific ECAM quirks

2016-06-13 Thread Duc Dang

On Mon, Jun 13, 2016 at 8:47 AM, Christopher Covington
 wrote:
> Hi Dongdong,
>
> On 06/13/2016 09:02 AM, Dongdong Liu wrote:
>> diff --git a/drivers/acpi/pci_mcfg.c b/drivers/acpi/pci_mcfg.c
>> index d3c3e85..49612b3 100644
>> --- a/drivers/acpi/pci_mcfg.c
>> +++ b/drivers/acpi/pci_mcfg.c
>> @@ -22,6 +22,10 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>> +
>> +/* Root pointer to the mapped MCFG table */
>> +static struct acpi_table_mcfg *mcfg_table;
>>
>>  /* Structure to hold entries from the MCFG table */
>>  struct mcfg_entry {
>> @@ -35,6 +39,38 @@ struct mcfg_entry {
>>  /* List to save mcfg entries */
>>  static LIST_HEAD(pci_mcfg_list);
>>
>> +extern struct pci_cfg_fixup __start_acpi_mcfg_fixups[];
>> +extern struct pci_cfg_fixup __end_acpi_mcfg_fixups[];
>> +
>> +struct pci_ecam_ops *pci_mcfg_get_ops(struct acpi_pci_root *root)
>> +{
>> + int bus_num = root->secondary.start;
>> + int domain = root->segment;
>> + struct pci_cfg_fixup *f;
>> +
>> + if (!mcfg_table)
>> + return &pci_generic_ecam_ops;
>> +
>> + /*
>> +  * Match against platform specific quirks and return corresponding
>> +  * CAM ops.
>> +  *
>> +  * First match against PCI topology  then use OEM ID and
>> +  * OEM revision from MCFG table standard header.
>> +  */
>> + for (f = __start_acpi_mcfg_fixups; f < __end_acpi_mcfg_fixups; f++) {
>> + if ((f->domain == domain || f->domain == PCI_MCFG_DOMAIN_ANY) 
>> &&
>> + (f->bus_num == bus_num || f->bus_num == PCI_MCFG_BUS_ANY) 
>> &&
>> + (!strncmp(f->oem_id, mcfg_table->header.oem_id,
>> +   ACPI_OEM_ID_SIZE)) &&
>> + (!strncmp(f->oem_table_id, mcfg_table->header.oem_table_id,
>> +   ACPI_OEM_TABLE_ID_SIZE)))
>
> This would just be a small convenience, but if the character count used here 
> were
>
> min(strlen(f->oem_id), ACPI_OEM_ID_SIZE)
>
> then the parameters to DECLARE_ACPI_MCFG_FIXUP macro could be substrings and
> wouldn't need to be padded out to the full length.
>
>> + return f->ops;
>> + }
>> + /* No quirks, use ECAM */
>> + return &pci_generic_ecam_ops;
>> +}
>
>> diff --git a/include/linux/pci-acpi.h b/include/linux/pci-acpi.h
>> index 7d63a66..088a1da 100644
>> --- a/include/linux/pci-acpi.h
>> +++ b/include/linux/pci-acpi.h
>> @@ -25,6 +25,7 @@ static inline acpi_status 
>> pci_acpi_remove_pm_notifier(struct acpi_device *dev)
>>  extern phys_addr_t acpi_pci_root_get_mcfg_addr(acpi_handle handle);
>>
>>  extern phys_addr_t pci_mcfg_lookup(u16 domain, struct resource *bus_res);
>> +extern struct pci_ecam_ops *pci_mcfg_get_ops(struct acpi_pci_root *root);
>>
>>  static inline acpi_handle acpi_find_root_bridge_handle(struct pci_dev *pdev)
>>  {
>> @@ -72,6 +73,25 @@ struct acpi_pci_root_ops {
>>   int (*prepare_resources)(struct acpi_pci_root_info *info);
>>  };
>>
>> +struct pci_cfg_fixup {
>> + struct pci_ecam_ops *ops;
>> + char *oem_id;
>> + char *oem_table_id;
>> + int domain;
>> + int bus_num;
>> +};
>> +
>> +#define PCI_MCFG_DOMAIN_ANY  -1
>> +#define PCI_MCFG_BUS_ANY -1
>> +
>> +/* Designate a routine to fix up buggy MCFG */
>> +#define DECLARE_ACPI_MCFG_FIXUP(ops, oem_id, oem_table_id, dom, bus) \
>> + static const struct pci_cfg_fixup   \
>> + __mcfg_fixup_##oem_id##oem_table_id##dom##bus   \
>
> I'm not entirely sure that this is the right fix--I'm pretty blindly
> following a GCC documentation suggestion [1]--but removing the first two
> preprocessor concatenation operators "##" solved the following build error
> for me.
>
> include/linux/pci-acpi.h:90:2: error: pasting "__mcfg_fixup_" and ""QCOM"" 
> does not give a valid preprocessing token
>   __mcfg_fixup_##oem_id##oem_table_id##dom##bus   \

I think the problem is gcc is not happy with quoted string when
processing these tokens
(""QCOM"", the extra "" are added by gcc). So should we not concat
string tokens and
use the fixup definition in v1 of this RFC:
/* Designate a routine to fix up buggy MCFG */
#define DECLARE_ACPI_MCFG_FIXUP(ops, oem_id, rev, dom, bus) \
static const struct pci_cfg_fixup __mcfg_fixup_##system##dom##bus\
 __used __attribute__((__section__(".acpi_fixup_mcfg"), \
aligned((sizeof(void *) =   \
{ ops, oem_id, rev, dom, bus };

Regards,
Duc Dang.


>   ^
> arch/arm64/kernel/pci.c:225:1: note: in expansion of macro 
> ‘DECLARE_ACPI_MCFG_FIXUP’
>  DECLARE_ACPI_MCFG_FIXUP(&pci_32b_ecam_ops, "QCOM", "QDF2432", 
> PCI_MCFG_DOMAIN_ANY, PCI_MCFG_BUS_ANY);
>  ^
> arch/arm64/kernel/pci.c:225:44: error: pasting ""QCOM"" and ""QDF2432"" does 
> not give a valid preprocessing token
>  DECLARE_ACPI_MCFG_FIXUP(&pci_32b_ecam_ops, "QCOM", "QDF2432", 
> PCI_MCFG_DOMAIN_ANY, PCI_MCFG_BUS_ANY);
>

[GIT PULL] percpu fixes for v4.7-rc3

2016-06-13 Thread Tejun Heo

Hello, Linus.

While adding GFP_ATOMIC support to the percpu allocator,
synchronization for fast-path which doesn't require external
allocations was separated into pcpu_lock; unfortunately, it
incorrectly decoupled async paths and percpu chunks could get
destroyed while still being operated on.  This pull request contains
two patches to fix the bug.

Thanks.

The following changes since commit 28165ec7a99be98123aa89540bf2cfc24df19498:

  Merge tag 'armsoc-fixes' of 
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc (2016-05-24 15:50:58 
-0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git for-4.7-fixes

for you to fetch changes up to 6710e594f71ccaad8101bc64321152af7cd9ea28:

  percpu: fix synchronization between synchronous map extension and chunk 
destruction (2016-05-25 11:48:25 -0400)


Tejun Heo (2):
  percpu: fix synchronization between chunk->map_extend_work and chunk 
destruction
  percpu: fix synchronization between synchronous map extension and chunk 
destruction

 mm/percpu.c | 73 +
 1 file changed, 44 insertions(+), 29 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index 0c59684..9903830 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -112,7 +112,7 @@ struct pcpu_chunk {
int map_used;   /* # of map entries used before 
the sentry */
int map_alloc;  /* # of map entries allocated */
int *map;   /* allocation map */
-   struct work_struct  map_extend_work;/* async ->map[] extension */
+   struct list_headmap_extend_list;/* on pcpu_map_extend_chunks */
 
void*data;  /* chunk data */
int first_free; /* no free below this */
@@ -162,10 +162,13 @@ static struct pcpu_chunk *pcpu_reserved_chunk;
 static int pcpu_reserved_chunk_limit;
 
 static DEFINE_SPINLOCK(pcpu_lock); /* all internal data structures */
-static DEFINE_MUTEX(pcpu_alloc_mutex); /* chunk create/destroy, [de]pop */
+static DEFINE_MUTEX(pcpu_alloc_mutex); /* chunk create/destroy, [de]pop, map 
ext */
 
 static struct list_head *pcpu_slot __read_mostly; /* chunk list slots */
 
+/* chunks which need their map areas extended, protected by pcpu_lock */
+static LIST_HEAD(pcpu_map_extend_chunks);
+
 /*
  * The number of empty populated pages, protected by pcpu_lock.  The
  * reserved chunk doesn't contribute to the count.
@@ -395,13 +398,19 @@ static int pcpu_need_to_extend(struct pcpu_chunk *chunk, 
bool is_atomic)
 {
int margin, new_alloc;
 
+   lockdep_assert_held(&pcpu_lock);
+
if (is_atomic) {
margin = 3;
 
if (chunk->map_alloc <
-   chunk->map_used + PCPU_ATOMIC_MAP_MARGIN_LOW &&
-   pcpu_async_enabled)
-   schedule_work(&chunk->map_extend_work);
+   chunk->map_used + PCPU_ATOMIC_MAP_MARGIN_LOW) {
+   if (list_empty(&chunk->map_extend_list)) {
+   list_add_tail(&chunk->map_extend_list,
+ &pcpu_map_extend_chunks);
+   pcpu_schedule_balance_work();
+   }
+   }
} else {
margin = PCPU_ATOMIC_MAP_MARGIN_HIGH;
}
@@ -435,6 +444,8 @@ static int pcpu_extend_area_map(struct pcpu_chunk *chunk, 
int new_alloc)
size_t old_size = 0, new_size = new_alloc * sizeof(new[0]);
unsigned long flags;
 
+   lockdep_assert_held(&pcpu_alloc_mutex);
+
new = pcpu_mem_zalloc(new_size);
if (!new)
return -ENOMEM;
@@ -467,20 +478,6 @@ static int pcpu_extend_area_map(struct pcpu_chunk *chunk, 
int new_alloc)
return 0;
 }
 
-static void pcpu_map_extend_workfn(struct work_struct *work)
-{
-   struct pcpu_chunk *chunk = container_of(work, struct pcpu_chunk,
-   map_extend_work);
-   int new_alloc;
-
-   spin_lock_irq(&pcpu_lock);
-   new_alloc = pcpu_need_to_extend(chunk, false);
-   spin_unlock_irq(&pcpu_lock);
-
-   if (new_alloc)
-   pcpu_extend_area_map(chunk, new_alloc);
-}
-
 /**
  * pcpu_fit_in_area - try to fit the requested allocation in a candidate area
  * @chunk: chunk the candidate area belongs to
@@ -740,7 +737,7 @@ static struct pcpu_chunk *pcpu_alloc_chunk(void)
chunk->map_used = 1;
 
INIT_LIST_HEAD(&chunk->list);
-   INIT_WORK(&chunk->map_extend_work, pcpu_map_extend_workfn);
+   INIT_LIST_HEAD(&chunk->map_extend_list);
chunk->free_size = pcpu_unit_size;
chunk->contig_hint = pcpu_unit_size;
 
@@ -895,6 +892,9 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, 
bool reserved,

[PATCH 6/7] asus-wmi: Add quirk_no_rfkill_wapf4 for the Asus X456UF

2016-06-13 Thread João Paulo Rechi Vita

The Asus X456UF has an airplane-mode indicator LED and the WMI WLAN user
bit set, so asus-wmi uses ASUS_WMI_DEVID_WLAN_LED (0x00010002) to store
the wlan state, which has a side-effect of driving the airplane mode
indicator LED in an inverted fashion.

quirk_no_rfkill prevents asus-wmi from registering RFKill switches at
all for this laptop and allows asus-wireless to drive the LED through
the ASHS ACPI device.  This laptop already has a quirk for setting
WAPF=4, so this commit creates a new quirk, quirk_no_rfkill_wapf4, which
both disables rfkill and sets WAPF=4.

Signed-off-by: João Paulo Rechi Vita 
Reported-by: Carlo Caione 
---
 drivers/platform/x86/asus-nb-wmi.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/platform/x86/asus-nb-wmi.c 
b/drivers/platform/x86/asus-nb-wmi.c
index 7b5c444..87cd60b 100644
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -82,6 +82,11 @@ static struct quirk_entry quirk_no_rfkill = {
.no_rfkill = true,
 };
 
+static struct quirk_entry quirk_no_rfkill_wapf4 = {
+   .wapf = 4,
+   .no_rfkill = true,
+};
+
 static int dmi_matched(const struct dmi_system_id *dmi)
 {
quirks = dmi->driver_data;
@@ -146,7 +151,7 @@ static const struct dmi_system_id asus_quirks[] = {
DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
DMI_MATCH(DMI_PRODUCT_NAME, "X456UF"),
},
-   .driver_data = &quirk_asus_wapf4,
+   .driver_data = &quirk_no_rfkill_wapf4,
},
{
.callback = dmi_matched,
-- 
2.5.0

[PATCH 5/7] asus-wmi: Add quirk_no_rfkill for the Asus Z550MA

2016-06-13 Thread João Paulo Rechi Vita

The Asus Z550MA has an airplane-mode indicator LED and the WMI WLAN user
bit set, so asus-wmi uses ASUS_WMI_DEVID_WLAN_LED (0x00010002) to store
the wlan state, which has a side-effect of driving the airplane mode
indicator LED in an inverted fashion. quirk_no_rfkill prevents asus-wmi
from registering RFKill switches at all for this laptop and allows
asus-wireless to drive the LED through the ASHS ACPI device.

Signed-off-by: João Paulo Rechi Vita 
Reported-by: Ming Shuo Chiu 
---
 drivers/platform/x86/asus-nb-wmi.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/platform/x86/asus-nb-wmi.c 
b/drivers/platform/x86/asus-nb-wmi.c
index efae467..7b5c444 100644
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -337,6 +337,15 @@ static const struct dmi_system_id asus_quirks[] = {
},
.driver_data = &quirk_no_rfkill,
},
+   {
+   .callback = dmi_matched,
+   .ident = "ASUSTeK COMPUTER INC. Z550MA",
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "Z550MA"),
+   },
+   .driver_data = &quirk_no_rfkill,
+   },
{},
 };
 
-- 
2.5.0

[PATCH 0/7] asus-wireless: LED control

2016-06-13 Thread João Paulo Rechi Vita

This series adds support for controlling the airplane-mode indicator LED
present in some Asus laptops. It also creates a quirk in asus-wmi so it does not
create RFKill devices for platforms that use asus-wireless and where there is a
competition for the LED control (see "asus-wmi: Create quirk for airplane_mode
LED" for more details).

João Paulo Rechi Vita (7):
  asus-wireless: Toggle airplane mode LED
  asus-wmi: Create quirk for airplane_mode LED
  asus-wmi: Add quirk_no_rfkill for the Asus N552VW
  asus-wmi: Add quirk_no_rfkill for the Asus U303LB
  asus-wmi: Add quirk_no_rfkill for the Asus Z550MA
  asus-wmi: Add quirk_no_rfkill_wapf4 for the Asus X456UF
  asus-wmi: Add quirk_no_rfkill_wapf4 for the Asus X456UA

 drivers/platform/x86/Kconfig |  2 +
 drivers/platform/x86/asus-nb-wmi.c   | 49 ++-
 drivers/platform/x86/asus-wireless.c | 91 +++-
 drivers/platform/x86/asus-wmi.c  |  8 ++--
 drivers/platform/x86/asus-wmi.h  |  1 +
 5 files changed, 145 insertions(+), 6 deletions(-)

-- 
2.5.0

[PATCH 2/7] asus-wmi: Create quirk for airplane_mode LED

2016-06-13 Thread João Paulo Rechi Vita

Some Asus laptops that have an airplane-mode indicator LED, also have
the WMI WLAN user bit set, and the following bits in their DSDT:

Scope (_SB)
{
  (...)
  Device (ATKD)
  {
(...)
Method (WMNB, 3, Serialized)
{
  (...)
  If (LEqual (IIA0, 0x00010002))
  {
OWGD (IIA1)
Return (One)
  }
}
  }
}

So when asus-wmi uses ASUS_WMI_DEVID_WLAN_LED (0x00010002) to store the
wlan state, it drives the airplane-mode indicator LED (through the call
to OWGD) in an inverted fashion: the LED is ON when airplane mode is OFF
(since wlan is ON), and vice-versa.

This commit creates a quirk to not register a RFKill switch at all for
these laptops, to allow the asus-wireless driver to drive the airplane
mode LED correctly through the ASHS ACPI device. It also adds a match to
that quirk for the Asus X555UB, which is affected by this problem.

Signed-off-by: João Paulo Rechi Vita 
---
 drivers/platform/x86/asus-nb-wmi.c | 13 +
 drivers/platform/x86/asus-wmi.c|  8 +---
 drivers/platform/x86/asus-wmi.h|  1 +
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/platform/x86/asus-nb-wmi.c 
b/drivers/platform/x86/asus-nb-wmi.c
index 091ca7a..90c8f41 100644
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -78,6 +78,10 @@ static struct quirk_entry quirk_asus_x200ca = {
.wapf = 2,
 };
 
+static struct quirk_entry quirk_no_rfkill = {
+   .no_rfkill = true,
+};
+
 static int dmi_matched(const struct dmi_system_id *dmi)
 {
quirks = dmi->driver_data;
@@ -306,6 +310,15 @@ static const struct dmi_system_id asus_quirks[] = {
},
.driver_data = &quirk_asus_x200ca,
},
+   {
+   .callback = dmi_matched,
+   .ident = "ASUSTeK COMPUTER INC. X555UB",
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "X555UB"),
+   },
+   .driver_data = &quirk_no_rfkill,
+   },
{},
 };
 
diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c
index a26dca3..7c093a0 100644
--- a/drivers/platform/x86/asus-wmi.c
+++ b/drivers/platform/x86/asus-wmi.c
@@ -2069,9 +2069,11 @@ static int asus_wmi_add(struct platform_device *pdev)
if (err)
goto fail_leds;
 
-   err = asus_wmi_rfkill_init(asus);
-   if (err)
-   goto fail_rfkill;
+   if (!asus->driver->quirks->no_rfkill) {
+   err = asus_wmi_rfkill_init(asus);
+   if (err)
+   goto fail_rfkill;
+   }
 
/* Some Asus desktop boards export an acpi-video backlight interface,
   stop this from showing up */
diff --git a/drivers/platform/x86/asus-wmi.h b/drivers/platform/x86/asus-wmi.h
index 4da4c8b..5de1df5 100644
--- a/drivers/platform/x86/asus-wmi.h
+++ b/drivers/platform/x86/asus-wmi.h
@@ -38,6 +38,7 @@ struct key_entry;
 struct asus_wmi;
 
 struct quirk_entry {
+   bool no_rfkill;
bool hotplug_wireless;
bool scalar_panel_brightness;
bool store_backlight_power;
-- 
2.5.0

[PATCH v8 3/5] skb_array: array based FIFO for skbs

2016-06-13 Thread Michael S. Tsirkin

A simple array based FIFO of pointers.  Intended for net stack so uses
skbs for type safety. Implemented as a set of wrappers around ptr_ring.

Signed-off-by: Michael S. Tsirkin 
---
 include/linux/skb_array.h | 144 ++
 1 file changed, 144 insertions(+)
 create mode 100644 include/linux/skb_array.h

diff --git a/include/linux/skb_array.h b/include/linux/skb_array.h
new file mode 100644
index 000..c4c0902
--- /dev/null
+++ b/include/linux/skb_array.h
@@ -0,0 +1,144 @@
+/*
+ * Definitions for the 'struct skb_array' datastructure.
+ *
+ * Author:
+ * Michael S. Tsirkin 
+ *
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ *
+ * Limited-size FIFO of skbs. Can be used more or less whenever
+ * sk_buff_head can be used, except you need to know the queue size in
+ * advance.
+ * Implemented as a type-safe wrapper around ptr_ring.
+ */
+
+#ifndef _LINUX_SKB_ARRAY_H
+#define _LINUX_SKB_ARRAY_H 1
+
+#ifdef __KERNEL__
+#include 
+#include 
+#include 
+#endif
+
+struct skb_array {
+   struct ptr_ring ring;
+};
+
+/* Might be slightly faster than skb_array_full below, but callers invoking
+ * this in a loop must use a compiler barrier, for example cpu_relax().
+ */
+static inline bool __skb_array_full(struct skb_array *a)
+{
+   return __ptr_ring_full(&a->ring);
+}
+
+static inline bool skb_array_full(struct skb_array *a)
+{
+   return ptr_ring_full(&a->ring);
+}
+
+static inline int skb_array_produce(struct skb_array *a, struct sk_buff *skb)
+{
+   return ptr_ring_produce(&a->ring, skb);
+}
+
+static inline int skb_array_produce_irq(struct skb_array *a, struct sk_buff 
*skb)
+{
+   return ptr_ring_produce_irq(&a->ring, skb);
+}
+
+static inline int skb_array_produce_bh(struct skb_array *a, struct sk_buff 
*skb)
+{
+   return ptr_ring_produce_bh(&a->ring, skb);
+}
+
+static inline int skb_array_produce_any(struct skb_array *a, struct sk_buff 
*skb)
+{
+   return ptr_ring_produce_any(&a->ring, skb);
+}
+
+/* Might be slightly faster than skb_array_empty below, but callers invoking
+ * this in a loop must take care to use a compiler barrier, for example
+ * cpu_relax().
+ */
+static inline bool __skb_array_empty(struct skb_array *a)
+{
+   return !__ptr_ring_peek(&a->ring);
+}
+
+static inline bool skb_array_empty(struct skb_array *a)
+{
+   return ptr_ring_empty(&a->ring);
+}
+
+static inline struct sk_buff *skb_array_consume(struct skb_array *a)
+{
+   return ptr_ring_consume(&a->ring);
+}
+
+static inline struct sk_buff *skb_array_consume_irq(struct skb_array *a)
+{
+   return ptr_ring_consume_irq(&a->ring);
+}
+
+static inline struct sk_buff *skb_array_consume_any(struct skb_array *a)
+{
+   return ptr_ring_consume_any(&a->ring);
+}
+
+static inline struct sk_buff *skb_array_consume_bh(struct skb_array *a)
+{
+   return ptr_ring_consume_bh(&a->ring);
+}
+
+static inline int __skb_array_len_with_tag(struct sk_buff *skb)
+{
+   if (likely(skb)) {
+   int len = skb->len;
+
+   if (skb_vlan_tag_present(skb))
+   len += VLAN_HLEN;
+
+   return len;
+   } else {
+   return 0;
+   }
+}
+
+static inline int skb_array_peek_len(struct skb_array *a)
+{
+   return PTR_RING_PEEK_CALL(&a->ring, __skb_array_len_with_tag);
+}
+
+static inline int skb_array_peek_len_irq(struct skb_array *a)
+{
+   return PTR_RING_PEEK_CALL_IRQ(&a->ring, __skb_array_len_with_tag);
+}
+
+static inline int skb_array_peek_len_bh(struct skb_array *a)
+{
+   return PTR_RING_PEEK_CALL_BH(&a->ring, __skb_array_len_with_tag);
+}
+
+static inline int skb_array_peek_len_any(struct skb_array *a)
+{
+   return PTR_RING_PEEK_CALL_ANY(&a->ring, __skb_array_len_with_tag);
+}
+
+static inline int skb_array_init(struct skb_array *a, int size, gfp_t gfp)
+{
+   return ptr_ring_init(&a->ring, size, gfp);
+}
+
+static inline void skb_array_cleanup(struct skb_array *a)
+{
+   ptr_ring_cleanup(&a->ring);
+}
+
+#endif /* _LINUX_SKB_ARRAY_H  */
-- 
MST

[PATCH 7/7] asus-wmi: Add quirk_no_rfkill_wapf4 for the Asus X456UA

2016-06-13 Thread João Paulo Rechi Vita

The Asus X456UA has an airplane-mode indicator LED and the WMI WLAN user
bit set, so asus-wmi uses ASUS_WMI_DEVID_WLAN_LED (0x00010002) to store
the wlan state, which has a side-effect of driving the airplane mode
indicator LED in an inverted fashion.

quirk_no_rfkill prevents asus-wmi from registering RFKill switches at
all for this laptop and allows asus-wireless to drive the LED through
the ASHS ACPI device.  This laptop already has a quirk for setting
WAPF=4, so this commit creates a new quirk, quirk_no_rfkill_wapf4, which
both disables rfkill and sets WAPF=4.

Signed-off-by: João Paulo Rechi Vita 
Reported-by: Angela Traeger 
---
 drivers/platform/x86/asus-nb-wmi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/platform/x86/asus-nb-wmi.c 
b/drivers/platform/x86/asus-nb-wmi.c
index 87cd60b..87d618f 100644
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -142,7 +142,7 @@ static const struct dmi_system_id asus_quirks[] = {
DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
DMI_MATCH(DMI_PRODUCT_NAME, "X456UA"),
},
-   .driver_data = &quirk_asus_wapf4,
+   .driver_data = &quirk_no_rfkill_wapf4,
},
{
.callback = dmi_matched,
-- 
2.5.0

Re: [PATCH 1/2] powercap/rapl: handle missing msrs

2016-06-13 Thread Rafael J. Wysocki

On Monday, June 13, 2016 11:53:10 AM jacob wrote:
> Hi Rafael,
> 
> Any feedback? It that is OK, can you take this patch independent of the
> second patch (which is going into tip tree)?

I'll do that.

Thanks,
Rafael

[PATCH 4/7] asus-wmi: Add quirk_no_rfkill for the Asus U303LB

2016-06-13 Thread João Paulo Rechi Vita

The Asus U303LB has an airplane-mode indicator LED and the WMI WLAN user
bit set, so asus-wmi uses ASUS_WMI_DEVID_WLAN_LED (0x00010002) to store
the wlan state, which has a side-effect of driving the airplane mode
indicator LED in an inverted fashion. quirk_no_rfkill prevents asus-wmi
from registering RFKill switches at all for this laptop and allows
asus-wireless to drive the LED through the ASHS ACPI device.

Signed-off-by: João Paulo Rechi Vita 
Reported-by: Mousou Yuu 
---
 drivers/platform/x86/asus-nb-wmi.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/platform/x86/asus-nb-wmi.c 
b/drivers/platform/x86/asus-nb-wmi.c
index 31fcde1..efae467 100644
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -328,6 +328,15 @@ static const struct dmi_system_id asus_quirks[] = {
},
.driver_data = &quirk_no_rfkill,
},
+   {
+   .callback = dmi_matched,
+   .ident = "ASUSTeK COMPUTER INC. U303LB",
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "U303LB"),
+   },
+   .driver_data = &quirk_no_rfkill,
+   },
{},
 };
 
-- 
2.5.0

[PATCH v8 4/5] ptr_ring: resize support

2016-06-13 Thread Michael S. Tsirkin

This adds ring resize support. Seems to be necessary as
users such as tun allow userspace control over queue size.

If resize is used, this costs us ability to peek at queue without
consumer lock - should not be a big deal as peek and consumer are
usually run on the same CPU.

If ring is made bigger, ring contents is preserved.  If ring is made
smaller, extra pointers are passed to an optional destructor callback.

Cleanup function also gains destructor callback such that
all pointers in queue can be cleaned up.

This changes some APIs but we don't have any users yet,
so it won't break bisect.

Signed-off-by: Michael S. Tsirkin 
---
 include/linux/ptr_ring.h | 157 ++-
 1 file changed, 143 insertions(+), 14 deletions(-)

diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index 633406f..562a65e 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -43,9 +43,9 @@ struct ptr_ring {
 };
 
 /* Note: callers invoking this in a loop must use a compiler barrier,
- * for example cpu_relax().
- * Callers don't need to take producer lock - if they don't
- * the next call to __ptr_ring_produce may fail.
+ * for example cpu_relax().  If ring is ever resized, callers must hold
+ * producer_lock - see e.g. ptr_ring_full.  Otherwise, if callers don't hold
+ * producer_lock, the next call to __ptr_ring_produce may fail.
  */
 static inline bool __ptr_ring_full(struct ptr_ring *r)
 {
@@ -54,16 +54,55 @@ static inline bool __ptr_ring_full(struct ptr_ring *r)
 
 static inline bool ptr_ring_full(struct ptr_ring *r)
 {
-   barrier();
-   return __ptr_ring_full(r);
+   bool ret;
+
+   spin_lock(&r->producer_lock);
+   ret = __ptr_ring_full(r);
+   spin_unlock(&r->producer_lock);
+
+   return ret;
+}
+
+static inline bool ptr_ring_full_irq(struct ptr_ring *r)
+{
+   bool ret;
+
+   spin_lock_irq(&r->producer_lock);
+   ret = __ptr_ring_full(r);
+   spin_unlock_irq(&r->producer_lock);
+
+   return ret;
+}
+
+static inline bool ptr_ring_full_any(struct ptr_ring *r)
+{
+   unsigned long flags;
+   bool ret;
+
+   spin_lock_irqsave(&r->producer_lock, flags);
+   ret = __ptr_ring_full(r);
+   spin_unlock_irqrestore(&r->producer_lock, flags);
+
+   return ret;
+}
+
+static inline bool ptr_ring_full_bh(struct ptr_ring *r)
+{
+   bool ret;
+
+   spin_lock_bh(&r->producer_lock);
+   ret = __ptr_ring_full(r);
+   spin_unlock_bh(&r->producer_lock);
+
+   return ret;
 }
 
 /* Note: callers invoking this in a loop must use a compiler barrier,
- * for example cpu_relax().
+ * for example cpu_relax(). Callers must hold producer_lock.
  */
 static inline int __ptr_ring_produce(struct ptr_ring *r, void *ptr)
 {
-   if (__ptr_ring_full(r))
+   if (r->queue[r->producer])
return -ENOSPC;
 
r->queue[r->producer++] = ptr;
@@ -120,20 +159,68 @@ static inline int ptr_ring_produce_bh(struct ptr_ring *r, 
void *ptr)
 /* Note: callers invoking this in a loop must use a compiler barrier,
  * for example cpu_relax(). Callers must take consumer_lock
  * if they dereference the pointer - see e.g. PTR_RING_PEEK_CALL.
- * There's no need for a lock if pointer is merely tested - see e.g.
- * ptr_ring_empty.
+ * If ring is never resized, and if the pointer is merely
+ * tested, there's no need to take the lock - see e.g.  __ptr_ring_empty.
  */
 static inline void *__ptr_ring_peek(struct ptr_ring *r)
 {
return r->queue[r->consumer];
 }
 
-static inline bool ptr_ring_empty(struct ptr_ring *r)
+/* Note: callers invoking this in a loop must use a compiler barrier,
+ * for example cpu_relax(). Callers must take consumer_lock
+ * if the ring is ever resized - see e.g. ptr_ring_empty.
+ */
+static inline bool __ptr_ring_empty(struct ptr_ring *r)
 {
-   barrier();
return !__ptr_ring_peek(r);
 }
 
+static inline bool ptr_ring_empty(struct ptr_ring *r)
+{
+   bool ret;
+
+   spin_lock(&r->consumer_lock);
+   ret = __ptr_ring_empty(r);
+   spin_unlock(&r->consumer_lock);
+
+   return ret;
+}
+
+static inline bool ptr_ring_empty_irq(struct ptr_ring *r)
+{
+   bool ret;
+
+   spin_lock_irq(&r->consumer_lock);
+   ret = __ptr_ring_empty(r);
+   spin_unlock_irq(&r->consumer_lock);
+
+   return ret;
+}
+
+static inline bool ptr_ring_empty_any(struct ptr_ring *r)
+{
+   unsigned long flags;
+   bool ret;
+
+   spin_lock_irqsave(&r->consumer_lock, flags);
+   ret = __ptr_ring_empty(r);
+   spin_unlock_irqrestore(&r->consumer_lock, flags);
+
+   return ret;
+}
+
+static inline bool ptr_ring_empty_bh(struct ptr_ring *r)
+{
+   bool ret;
+
+   spin_lock_bh(&r->consumer_lock);
+   ret = __ptr_ring_empty(r);
+   spin_unlock_bh(&r->consumer_lock);
+
+   return ret;
+}
+
 /* Must only be called after __ptr_ring_peek returned !NULL */
 static inline void __ptr_ring_discard_one(struct ptr_rin

[PATCH v8 5/5] skb_array: resize support

2016-06-13 Thread Michael S. Tsirkin

Update skb_array after ptr_ring API changes.

Signed-off-by: Michael S. Tsirkin 
---
 include/linux/skb_array.h | 33 +
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/include/linux/skb_array.h b/include/linux/skb_array.h
index c4c0902..678bfbf 100644
--- a/include/linux/skb_array.h
+++ b/include/linux/skb_array.h
@@ -63,9 +63,9 @@ static inline int skb_array_produce_any(struct skb_array *a, 
struct sk_buff *skb
return ptr_ring_produce_any(&a->ring, skb);
 }
 
-/* Might be slightly faster than skb_array_empty below, but callers invoking
- * this in a loop must take care to use a compiler barrier, for example
- * cpu_relax().
+/* Might be slightly faster than skb_array_empty below, but only safe if the
+ * array is never resized. Also, callers invoking this in a loop must take care
+ * to use a compiler barrier, for example cpu_relax().
  */
 static inline bool __skb_array_empty(struct skb_array *a)
 {
@@ -77,6 +77,21 @@ static inline bool skb_array_empty(struct skb_array *a)
return ptr_ring_empty(&a->ring);
 }
 
+static inline bool skb_array_empty_bh(struct skb_array *a)
+{
+   return ptr_ring_empty_bh(&a->ring);
+}
+
+static inline bool skb_array_empty_irq(struct skb_array *a)
+{
+   return ptr_ring_empty_irq(&a->ring);
+}
+
+static inline bool skb_array_empty_any(struct skb_array *a)
+{
+   return ptr_ring_empty_any(&a->ring);
+}
+
 static inline struct sk_buff *skb_array_consume(struct skb_array *a)
 {
return ptr_ring_consume(&a->ring);
@@ -136,9 +151,19 @@ static inline int skb_array_init(struct skb_array *a, int 
size, gfp_t gfp)
return ptr_ring_init(&a->ring, size, gfp);
 }
 
+void __skb_array_destroy_skb(void *ptr)
+{
+   kfree_skb(ptr);
+}
+
+int skb_array_resize(struct skb_array *a, int size, gfp_t gfp)
+{
+   return ptr_ring_resize(&a->ring, size, gfp, __skb_array_destroy_skb);
+}
+
 static inline void skb_array_cleanup(struct skb_array *a)
 {
-   ptr_ring_cleanup(&a->ring);
+   ptr_ring_cleanup(&a->ring, __skb_array_destroy_skb);
 }
 
 #endif /* _LINUX_SKB_ARRAY_H  */
-- 
MST

Re: [PATCH v16 6/6] ARM: socfpga: fpga bridge driver support

2016-06-13 Thread atull

On Fri, 10 Jun 2016, Trent Piepho wrote:

> On Fri, 2016-02-05 at 15:30 -0600, at...@opensource.altera.com wrote:
> > Supports Altera SOCFPGA bridges:
> >  * fpga2sdram
> >  * fpga2hps
> >  * hps2fpga
> >  * lwhps2fpga
> > 
> > Allows enabling/disabling the bridges through the FPGA
> > Bridge Framework API functions.
> 
> I'm replying to v16 because it exists on gmane, while v17 appears not
> to.  lkml.org's forward feature appears to be broken so I can't reply to
> that message (no way to get message-id).  But v17 of this patch should
> be the same.  If a v18 was posted, I've not been able to find it.

Hi Trent,

Yes, we're up to v17. V18 will be soon, but v16 is good enough for
the purposes of this review.

> > +
> > +#define ALT_L3_REMAP_OFST  0x0
> > +#define ALT_L3_REMAP_MPUZERO_MSK   0x0001
> > +#define ALT_L3_REMAP_H2F_MSK   0x0008
> > +#define ALT_L3_REMAP_LWH2F_MSK 0x0010
> > +
> > +#define HPS2FPGA_BRIDGE_NAME   "hps2fpga"
> > +#define LWHPS2FPGA_BRIDGE_NAME "lwhps2fpga"
> > +#define FPGA2HPS_BRIDGE_NAME   "fpga2hps"
> > +
> > +struct altera_hps2fpga_data {
> > +   const char *name;
> > +   struct reset_control *bridge_reset;
> > +   struct regmap *l3reg;
> > +   /* The L3 REMAP register is write only, so keep a cached value. */
> > +   unsigned int l3_remap_value;
> > +   unsigned int remap_mask;
> > +   struct clk *clk;
> > +};
> > +
> > +static int alt_hps2fpga_enable_show(struct fpga_bridge *bridge)
> > +{
> > +   struct altera_hps2fpga_data *priv = bridge->priv;
> > +
> > +   return reset_control_status(priv->bridge_reset);
> > +}
> > +
> > +static int _alt_hps2fpga_enable_set(struct altera_hps2fpga_data *priv,
> > +   bool enable)
> > +{
> > +   int ret;
> > +
> > +   /* bring bridge out of reset */
> > +   if (enable)
> > +   ret = reset_control_deassert(priv->bridge_reset);
> > +   else
> > +   ret = reset_control_assert(priv->bridge_reset);
> > +   if (ret)
> > +   return ret;
> > +
> > +   /* Allow bridge to be visible to L3 masters or not */
> > +   if (priv->remap_mask) {
> > +   priv->l3_remap_value |= ALT_L3_REMAP_MPUZERO_MSK;
> 
> Doesn't seem like this belongs here.  I realize the write-only register
> is a problem.  Maybe the syscon driver should be initializing this
> value?
> 
> > +
> > +   if (enable)
> > +   priv->l3_remap_value |= priv->remap_mask;
> > +   else
> > +   priv->l3_remap_value &= ~priv->remap_mask;
> > +
> > +   ret = regmap_write(priv->l3reg, ALT_L3_REMAP_OFST,
> > +  priv->l3_remap_value);
> 
> This isn't going work if more than one bridge is used.  Each bridge has
> its own priv and thus priv->l3_remap_value.  Each bridge's priv will
> have just the bit for it's own remap set.  The 2nd bridge to be enabled
> will turn off the 1st bridge when it re-write the l3 register.
> 
> If all the bridges shared a static global to cache the reg, then this
> problem would be a replaced by a race, since nothing would be managing
> concurrent access to that global from the independent bridge devices.
> 
> How about using the already existing regmap cache ability take care of
> this?  Use regmap_update_bits() to update just the desired bit and let
> remap take care of keeping track caching the register and protecting
> access from multiple users.  It should support that and it should
> support write-only registers, with the creator of the regmap (the syscon
> driver in this case) supplying the initial value of the write-only reg.
> Which is where ALT_L3_REMAP_MPUZERO_MSK could go in.

Please correct me if I'm wrong, but I think that regmap supports
the features you are talking about, but not syscon.

One simple solution would be to take l3_remap_value out of the priv
and let it be shared by all h2f bridges.  That involves the least
amount of change.

> 
> 
> > +   }
> > +
> > +   return ret;
> > +}
> > +
> > +static int alt_hps2fpga_enable_set(struct fpga_bridge *bridge, bool enable)
> > +{
> > +   return _alt_hps2fpga_enable_set(bridge->priv, enable);
> > +}
> > +
> > +static const struct fpga_bridge_ops altera_hps2fpga_br_ops = {
> > +   .enable_set = alt_hps2fpga_enable_set,
> > +   .enable_show = alt_hps2fpga_enable_show,
> > +};
> > +
> > +static struct altera_hps2fpga_data hps2fpga_data  = {
> > +   .name = HPS2FPGA_BRIDGE_NAME,
> > +   .remap_mask = ALT_L3_REMAP_H2F_MSK,
> > +};
> 
> Each of these data structs also includes space for all the private data
> field of the drivers' state.  Seems a bit inefficient if only two of
> them are configuration data.  It also means only one device of each type
> can exists.  If one creates two bridges of the same type they'll
> (silently) share a priv data struct and randomly break.  And the config
> data structs can't be const.

Our hardware doesn't conta

[PATCH v8 2/5] ptr_ring: ring test

2016-06-13 Thread Michael S. Tsirkin

Add ringtest based unit test for ptr ring.

Signed-off-by: Michael S. Tsirkin 
---
 tools/virtio/ringtest/ptr_ring.c | 192 +++
 tools/virtio/ringtest/Makefile   |   5 +-
 2 files changed, 196 insertions(+), 1 deletion(-)
 create mode 100644 tools/virtio/ringtest/ptr_ring.c

diff --git a/tools/virtio/ringtest/ptr_ring.c b/tools/virtio/ringtest/ptr_ring.c
new file mode 100644
index 000..74abd74
--- /dev/null
+++ b/tools/virtio/ringtest/ptr_ring.c
@@ -0,0 +1,192 @@
+#define _GNU_SOURCE
+#include "main.h"
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define SMP_CACHE_BYTES 64
+#define cache_line_size() SMP_CACHE_BYTES
+#define cacheline_aligned_in_smp __attribute__ ((aligned 
(SMP_CACHE_BYTES)))
+#define unlikely(x)(__builtin_expect(!!(x), 0))
+#define ALIGN(x, a) (((x) + (a) - 1) / (a) * (a))
+typedef pthread_spinlock_t  spinlock_t;
+
+typedef int gfp_t;
+static void *kzalloc(unsigned size, gfp_t gfp)
+{
+   void *p = memalign(64, size);
+   if (!p)
+   return p;
+   memset(p, 0, size);
+
+   return p;
+}
+
+static void kfree(void *p)
+{
+   if (p)
+   free(p);
+}
+
+static void spin_lock_init(spinlock_t *lock)
+{
+   int r = pthread_spin_init(lock, 0);
+   assert(!r);
+}
+
+static void spin_lock(spinlock_t *lock)
+{
+   int ret = pthread_spin_lock(lock);
+   assert(!ret);
+}
+
+static void spin_unlock(spinlock_t *lock)
+{
+   int ret = pthread_spin_unlock(lock);
+   assert(!ret);
+}
+
+static void spin_lock_bh(spinlock_t *lock)
+{
+   spin_lock(lock);
+}
+
+static void spin_unlock_bh(spinlock_t *lock)
+{
+   spin_unlock(lock);
+}
+
+static void spin_lock_irq(spinlock_t *lock)
+{
+   spin_lock(lock);
+}
+
+static void spin_unlock_irq(spinlock_t *lock)
+{
+   spin_unlock(lock);
+}
+
+static void spin_lock_irqsave(spinlock_t *lock, unsigned long f)
+{
+   spin_lock(lock);
+}
+
+static void spin_unlock_irqrestore(spinlock_t *lock, unsigned long f)
+{
+   spin_unlock(lock);
+}
+
+#include "../../../include/linux/ptr_ring.h"
+
+static unsigned long long headcnt, tailcnt;
+static struct ptr_ring array cacheline_aligned_in_smp;
+
+/* implemented by ring */
+void alloc_ring(void)
+{
+   int ret = ptr_ring_init(&array, ring_size, 0);
+   assert(!ret);
+}
+
+/* guest side */
+int add_inbuf(unsigned len, void *buf, void *datap)
+{
+   int ret;
+
+   ret = __ptr_ring_produce(&array, buf);
+   if (ret >= 0) {
+   ret = 0;
+   headcnt++;
+   }
+
+   return ret;
+}
+
+/*
+ * ptr_ring API provides no way for producer to find out whether a given
+ * buffer was consumed.  Our tests merely require that a successful get_buf
+ * implies that add_inbuf succeed in the past, and that add_inbuf will succeed,
+ * fake it accordingly.
+ */
+void *get_buf(unsigned *lenp, void **bufp)
+{
+   void *datap;
+
+   if (tailcnt == headcnt || __ptr_ring_full(&array))
+   datap = NULL;
+   else {
+   datap = "Buffer\n";
+   ++tailcnt;
+   }
+
+   return datap;
+}
+
+void poll_used(void)
+{
+   void *b;
+
+   do {
+   if (tailcnt == headcnt || __ptr_ring_full(&array)) {
+   b = NULL;
+   barrier();
+   } else {
+   b = "Buffer\n";
+   }
+   } while (!b);
+}
+
+void disable_call()
+{
+   assert(0);
+}
+
+bool enable_call()
+{
+   assert(0);
+}
+
+void kick_available(void)
+{
+   assert(0);
+}
+
+/* host side */
+void disable_kick()
+{
+   assert(0);
+}
+
+bool enable_kick()
+{
+   assert(0);
+}
+
+void poll_avail(void)
+{
+   void *b;
+
+   do {
+   barrier();
+   b = __ptr_ring_peek(&array);
+   } while (!b);
+}
+
+bool use_buf(unsigned *lenp, void **bufp)
+{
+   void *ptr;
+
+   ptr = __ptr_ring_consume(&array);
+
+   return ptr;
+}
+
+void call_used(void)
+{
+   assert(0);
+}
diff --git a/tools/virtio/ringtest/Makefile b/tools/virtio/ringtest/Makefile
index 6173ada..877a8a4 100644
--- a/tools/virtio/ringtest/Makefile
+++ b/tools/virtio/ringtest/Makefile
@@ -1,6 +1,6 @@
 all:
 
-all: ring virtio_ring_0_9 virtio_ring_poll virtio_ring_inorder noring
+all: ring virtio_ring_0_9 virtio_ring_poll virtio_ring_inorder ptr_ring noring
 
 CFLAGS += -Wall
 CFLAGS += -pthread -O2 -ggdb
@@ -8,6 +8,7 @@ LDFLAGS += -pthread -O2 -ggdb
 
 main.o: main.c main.h
 ring.o: ring.c main.h
+ptr_ring.o: ptr_ring.c main.h ../../../include/linux/ptr_ring.h
 virtio_ring_0_9.o: virtio_ring_0_9.c main.h
 virtio_ring_poll.o: virtio_ring_poll.c virtio_ring_0_9.c main.h
 virtio_ring_inorder.o: virtio_ring_inorder.c virtio_ring_0_9.c main.h
@@ -15,6 +16,7 @@ ring: ring.o main.o
 virtio_ring_0_9: virtio_ring_0_9.o main.o
 virtio_ring_poll: virtio_ring_poll.o main.o
 virtio_ring_inorder: virtio_ring_inorder.o ma

Re: [PATCH v6 09/11] cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES

2016-06-13 Thread Rafael J. Wysocki

On Monday, June 13, 2016 05:01:50 PM Daniel Lezcano wrote:
> On Wed, Jun 08, 2016 at 11:54:29AM -0500, Shreyas B. Prabhu wrote:
> > Use cpuidle's CPUIDLE_STATE_MAX macro instead of powernv specific
> > MAX_POWERNV_IDLE_STATES.
> > 
> > Cc: Rafael J. Wysocki 
> > Cc: Daniel Lezcano 
> > Cc: linux...@vger.kernel.org
> > Suggested-by: Daniel Lezcano 
> > Signed-off-by: Shreyas B. Prabhu 
> > ---
> 
> Acked-by: Daniel Lezcano 

Since this seems to depend on some other patches in the series, I'm expecting
it to go in along with the patches it depends on.

Thanks,
Rafael

[PATCH v8 0/5] skb_array: array based FIFO for skbs

2016-06-13 Thread Michael S. Tsirkin

This is in response to the proposal by Jason to make tun
rx packet queue lockless using a circular buffer.
My testing seems to show that at least for the common usecase
in networking, which isn't lockless, circular buffer
with indices does not perform that well, because
each index access causes a cache line to bounce between
CPUs, and index access causes stalls due to the dependency.

By comparison, an array of pointers where NULL means invalid
and !NULL means valid, can be updated without messing up barriers
at all and does not have this issue.

On the flip side, cache pressure may be caused by using large queues.
tun has a queue of 1000 entries by default and that's 8K.
At this point I'm not sure this can be solved efficiently.
The correct solution might be sizing the queues appropriately.

Here's an implementation of this idea: it can be used more
or less whenever sk_buff_head can be used, except you need
to know the queue size in advance.

As this might be useful outside of networking, I implemented
a generic array of void pointers, with a type-safe wrapper for skbs.

It remains to be seen whether resizing is required, in case it is
I included patches implementing resizing by holding both the
consumer and the producer locks.

I think this code works fine without any extra memory barriers since we
always read and write the same location, so the accesses can not be
reordered.
Multiple writes of the same value into memory would mess things up
for us, I don't think compilers would do it though.
But if people feel it's better to be safe wrt compiler optimizations,
specifying queue as volatile would probably do it in a cleaner way
than converting all accesses to READ_ONCE/WRITE_ONCE. Thoughts?

The only issue is with calls within a loop using the __ptr_ring_XXX
accessors - in theory compiler could hoist accesses out of the loop.

Following volatile-considered-harmful.txt I merely
documented that callers that busy-poll should invoke cpu_relax().
Most people will use the external skb_array_XXX APIs with a spinlock,
so this should not be an issue for them.

Eric Dumazet suggested adding an extra pointer to skb for when
we have a single outstanding packet. I could not figure out
a way to implement this without a shared consumer/producer lock
though, which would cause cache line bounces by itself.

Jesper, Jason, I know that both of you tested this,
please post Tested-by tags for whatever was tested.

changes since v7
fix typos noticed by Jesper Brouer

changes since v6
resize implemented. peek/full calls are no longer lockless

replaced _FIELD macros with _CALL which invoke a function
on the pointer rather than just returning a value

destroy now scans the array and frees all queued skbs

changes since v5
implemented a generic ptr_ring api, and
made skb_array a type-safe wrapper
apis for taking the spinlock in different contexts
following expected usecase in tun
changes since v4 (v3 was never posted)
documentation
dropped SKB_ARRAY_MIN_SIZE heuristic
unit test (in userspace, included as patch 2)

changes since v2:
fixed integer overflow pointed out by Eric.
added some comments.

changes since v1:
fixed bug pointed out by Eric.

Michael S. Tsirkin (5):
  ptr_ring: array based FIFO for pointers
  ptr_ring: ring test
  skb_array: array based FIFO for skbs
  ptr_ring: resize support
  skb_array: resize support

 include/linux/ptr_ring.h | 393 +++
 include/linux/skb_array.h| 169 +
 tools/virtio/ringtest/ptr_ring.c | 192 +++
 tools/virtio/ringtest/Makefile   |   5 +-
 4 files changed, 758 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/ptr_ring.h
 create mode 100644 include/linux/skb_array.h
 create mode 100644 tools/virtio/ringtest/ptr_ring.c

-- 
MST

[PATCH v8 1/5] ptr_ring: array based FIFO for pointers

2016-06-13 Thread Michael S. Tsirkin

A simple array based FIFO of pointers.  Intended for net stack which
commonly has a single consumer/producer.

Signed-off-by: Michael S. Tsirkin 
---
 include/linux/ptr_ring.h | 264 +++
 1 file changed, 264 insertions(+)
 create mode 100644 include/linux/ptr_ring.h

diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
new file mode 100644
index 000..633406f
--- /dev/null
+++ b/include/linux/ptr_ring.h
@@ -0,0 +1,264 @@
+/*
+ * Definitions for the 'struct ptr_ring' datastructure.
+ *
+ * Author:
+ * Michael S. Tsirkin 
+ *
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ *
+ * This is a limited-size FIFO maintaining pointers in FIFO order, with
+ * one CPU producing entries and another consuming entries from a FIFO.
+ *
+ * This implementation tries to minimize cache-contention when there is a
+ * single producer and a single consumer CPU.
+ */
+
+#ifndef _LINUX_PTR_RING_H
+#define _LINUX_PTR_RING_H 1
+
+#ifdef __KERNEL__
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#endif
+
+struct ptr_ring {
+   int producer cacheline_aligned_in_smp;
+   spinlock_t producer_lock;
+   int consumer cacheline_aligned_in_smp;
+   spinlock_t consumer_lock;
+   /* Shared consumer/producer data */
+   /* Read-only by both the producer and the consumer */
+   int size cacheline_aligned_in_smp; /* max entries in queue */
+   void **queue;
+};
+
+/* Note: callers invoking this in a loop must use a compiler barrier,
+ * for example cpu_relax().
+ * Callers don't need to take producer lock - if they don't
+ * the next call to __ptr_ring_produce may fail.
+ */
+static inline bool __ptr_ring_full(struct ptr_ring *r)
+{
+   return r->queue[r->producer];
+}
+
+static inline bool ptr_ring_full(struct ptr_ring *r)
+{
+   barrier();
+   return __ptr_ring_full(r);
+}
+
+/* Note: callers invoking this in a loop must use a compiler barrier,
+ * for example cpu_relax().
+ */
+static inline int __ptr_ring_produce(struct ptr_ring *r, void *ptr)
+{
+   if (__ptr_ring_full(r))
+   return -ENOSPC;
+
+   r->queue[r->producer++] = ptr;
+   if (unlikely(r->producer >= r->size))
+   r->producer = 0;
+   return 0;
+}
+
+static inline int ptr_ring_produce(struct ptr_ring *r, void *ptr)
+{
+   int ret;
+
+   spin_lock(&r->producer_lock);
+   ret = __ptr_ring_produce(r, ptr);
+   spin_unlock(&r->producer_lock);
+
+   return ret;
+}
+
+static inline int ptr_ring_produce_irq(struct ptr_ring *r, void *ptr)
+{
+   int ret;
+
+   spin_lock_irq(&r->producer_lock);
+   ret = __ptr_ring_produce(r, ptr);
+   spin_unlock_irq(&r->producer_lock);
+
+   return ret;
+}
+
+static inline int ptr_ring_produce_any(struct ptr_ring *r, void *ptr)
+{
+   unsigned long flags;
+   int ret;
+
+   spin_lock_irqsave(&r->producer_lock, flags);
+   ret = __ptr_ring_produce(r, ptr);
+   spin_unlock_irqrestore(&r->producer_lock, flags);
+
+   return ret;
+}
+
+static inline int ptr_ring_produce_bh(struct ptr_ring *r, void *ptr)
+{
+   int ret;
+
+   spin_lock_bh(&r->producer_lock);
+   ret = __ptr_ring_produce(r, ptr);
+   spin_unlock_bh(&r->producer_lock);
+
+   return ret;
+}
+
+/* Note: callers invoking this in a loop must use a compiler barrier,
+ * for example cpu_relax(). Callers must take consumer_lock
+ * if they dereference the pointer - see e.g. PTR_RING_PEEK_CALL.
+ * There's no need for a lock if pointer is merely tested - see e.g.
+ * ptr_ring_empty.
+ */
+static inline void *__ptr_ring_peek(struct ptr_ring *r)
+{
+   return r->queue[r->consumer];
+}
+
+static inline bool ptr_ring_empty(struct ptr_ring *r)
+{
+   barrier();
+   return !__ptr_ring_peek(r);
+}
+
+/* Must only be called after __ptr_ring_peek returned !NULL */
+static inline void __ptr_ring_discard_one(struct ptr_ring *r)
+{
+   r->queue[r->consumer++] = NULL;
+   if (unlikely(r->consumer >= r->size))
+   r->consumer = 0;
+}
+
+static inline void *__ptr_ring_consume(struct ptr_ring *r)
+{
+   void *ptr;
+
+   ptr = __ptr_ring_peek(r);
+   if (ptr)
+   __ptr_ring_discard_one(r);
+
+   return ptr;
+}
+
+static inline void *ptr_ring_consume(struct ptr_ring *r)
+{
+   void *ptr;
+
+   spin_lock(&r->consumer_lock);
+   ptr = __ptr_ring_consume(r);
+   spin_unlock(&r->consumer_lock);
+
+   return ptr;
+}
+
+static inline void *ptr_ring_consume_irq(struct ptr_ring *r)
+{
+   void *ptr;
+
+   spin_lock_irq(&r->consumer_lock);
+   ptr = __ptr_ring_consume(r);
+

Re: [PATCH] cpufreq: conservative: Do not use transition notifications

2016-06-13 Thread Rafael J. Wysocki

On Monday, June 13, 2016 08:58:34 PM Viresh Kumar wrote:
> On 13-06-16, 15:36, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki 
> > Subject: [PATCH v2] cpufreq: conservative: Do not use transition 
> > notifications
> > 
> > The conservative governor registers a transition notifier so it
> > can update its internal requested_freq value if it falls out of the
> > policy->min...policy->max range, but requested_freq is not really
> > necessary.
> > 
> > That value is used to track the frequency requested by the governor
> > previously, but policy->cur can be used instead of it and then the
> > governor will not have to worry about updating the tracked value when
> > the current frequency changes independently (for example, as a result
> > of min or max changes).
> > 
> > Accodringly, drop requested_freq from struct cs_policy_dbs_info
> > and modify cs_dbs_timer() to use policy->cur instead of it.
> > While at it, notice that __cpufreq_driver_target() clamps its
> > target_freq argument between policy->min and policy->max, so
> > the callers of it don't have to do that and make additional
> > changes in cs_dbs_timer() in accordance with that.
> > 
> > After these changes the transition notifier used by the conservative
> > governor is not necessary any more, so drop it, which also makes it
> > possible to drop the struct cs_governor definition and simplify the
> > code accordingly.
> > 
> > Signed-off-by: Rafael J. Wysocki 
> > ---
> >  drivers/cpufreq/cpufreq_conservative.c |  103 
> > ++---
> >  1 file changed, 21 insertions(+), 82 deletions(-)
> > 
> 
> Acked-by: Viresh Kumar 

Thanks!

> > -static struct cs_governor cs_gov = {
> > -   .dbs_gov = {
> > -   .gov = CPUFREQ_DBS_GOVERNOR_INITIALIZER("conservative"),
> > -   .kobj_type = { .default_attrs = cs_attributes },
> > -   .gov_dbs_timer = cs_dbs_timer,
> > -   .alloc = cs_alloc,
> > -   .free = cs_free,
> > -   .init = cs_init,
> > -   .exit = cs_exit,
> > -   .start = cs_start,
> > -   },
> > +static struct dbs_governor cs_governor = {
> > +   .gov = CPUFREQ_DBS_GOVERNOR_INITIALIZER("conservative"),
> > +   .kobj_type = { .default_attrs = cs_attributes },
> > +   .gov_dbs_timer = cs_dbs_timer,
> > +   .alloc = cs_alloc,
> > +   .free = cs_free,
> > +   .init = cs_init,
> > +   .exit = cs_exit,
> > +   .start = cs_start,
> >  };
> 
> Though, I am not sure why this change was required :)

This is because struct cs_governor is not necessary any more, since its only
member, usage_count, was only needed because of the notifier.  So dbs_governor
can be used directly here now.

Thanks,
Rafael

Re: [RFC 18/18] proc: present VM_LOCKED memory in /proc/self/maps

2016-06-13 Thread Kees Cook

On Mon, Jun 13, 2016 at 10:44:25PM +0300, Topi Miettinen wrote:
> Add a flag to /proc/self/maps to show that the memory area is locked.
> 
> Signed-off-by: Topi Miettinen 
> ---
>  fs/proc/task_mmu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 4648c7f..8229509 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c

If you change the maps format, you'll need to update task_nommu.c too.

> @@ -313,13 +313,14 @@ show_map_vma(struct seq_file *m, struct vm_area_struct 
> *vma, int is_pid)
>   end -= PAGE_SIZE;
>  
>   seq_setwidth(m, 25 + sizeof(void *) * 6 - 1);

I think the width needs to be adjusted for the new character.

> - seq_printf(m, "%08lx-%08lx %c%c%c%c %08llx %02x:%02x %lu ",
> + seq_printf(m, "%08lx-%08lx %c%c%c%c%c %08llx %02x:%02x %lu ",

Have you checked that no userspace tools that parse "maps" will break with
this flag addition?

>   start,
>   end,
>   flags & VM_READ ? 'r' : '-',
>   flags & VM_WRITE ? 'w' : '-',
>   flags & VM_EXEC ? 'x' : '-',
>   flags & VM_MAYSHARE ? 's' : 'p',
> + flags & VM_LOCKED ? 'l' : '-',

IIUC, the smaps file already includes the locked information in VmFlags as
"lo" (see show_smap_vma_flags), so I think you probably don't want this
patch at all.

-Kees

>   pgoff,
>   MAJOR(dev), MINOR(dev), ino);
>  
> -- 
> 2.8.1

-- 
Kees Cook@outflux.net

Re: [RFC 18/18] proc: present VM_LOCKED memory in /proc/self/maps

2016-06-13 Thread Topi Miettinen

On 06/13/16 20:43, Kees Cook wrote:
> On Mon, Jun 13, 2016 at 10:44:25PM +0300, Topi Miettinen wrote:
>> Add a flag to /proc/self/maps to show that the memory area is locked.
>>
>> Signed-off-by: Topi Miettinen 
>> ---
>>  fs/proc/task_mmu.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>> index 4648c7f..8229509 100644
>> --- a/fs/proc/task_mmu.c
>> +++ b/fs/proc/task_mmu.c
> 
> If you change the maps format, you'll need to update task_nommu.c too.
> 
>> @@ -313,13 +313,14 @@ show_map_vma(struct seq_file *m, struct vm_area_struct 
>> *vma, int is_pid)
>>  end -= PAGE_SIZE;
>>  
>>  seq_setwidth(m, 25 + sizeof(void *) * 6 - 1);
> 
> I think the width needs to be adjusted for the new character.
> 
>> -seq_printf(m, "%08lx-%08lx %c%c%c%c %08llx %02x:%02x %lu ",
>> +seq_printf(m, "%08lx-%08lx %c%c%c%c%c %08llx %02x:%02x %lu ",
> 
> Have you checked that no userspace tools that parse "maps" will break with
> this flag addition?
> 
>>  start,
>>  end,
>>  flags & VM_READ ? 'r' : '-',
>>  flags & VM_WRITE ? 'w' : '-',
>>  flags & VM_EXEC ? 'x' : '-',
>>  flags & VM_MAYSHARE ? 's' : 'p',
>> +flags & VM_LOCKED ? 'l' : '-',
> 
> IIUC, the smaps file already includes the locked information in VmFlags as
> "lo" (see show_smap_vma_flags), so I think you probably don't want this
> patch at all.

Yes. the amount of locked memory is also shown:
Locked:8 kB
VmFlags: rd wr mr mw me lo ac sd

Sorry, I didn't notice that. I'll drop the patch.

-Topi

> 
> -Kees
> 
>>  pgoff,
>>  MAJOR(dev), MINOR(dev), ino);
>>  
>> -- 
>> 2.8.1
>

[RFC] serial: 8250: fix regression in 8250 uart driver

2016-06-13 Thread Dinh Nguyen

Hi Andy,

I saw that you have discovered that commit ec5a11a91eec ("serial: 8250:
Validate dmaengine rx chan meets requirements") introduced a regression
in the 8250 uart driver. For SoCFPGA platform, I am seeing this error:

[5.541751] ttyS0 - failed to request DMA

Reverting the commit ec5a11a91eec removes the error.

I saw that you started the discussion, but I didn't see that a patch was
included[1].

The following patch seems to fix the error, but I'm not sure if it's the
same fix that you had in mind.

Thanks,
Dinh

[1] http://marc.info/?l=linux-serial&m=146254187602862&w=2

---8<
diff --git a/drivers/tty/serial/8250/8250_dma.c
b/drivers/tty/serial/8250/8250_dma.c
index 7f33d1c..847a203 100644
--- a/drivers/tty/serial/8250/8250_dma.c
+++ b/drivers/tty/serial/8250/8250_dma.c
@@ -176,7 +176,7 @@ int serial8250_request_dma(struct uart_8250_port *p)
ret = dma_get_slave_caps(dma->rxchan, &caps);
if (ret)
goto release_rx;
-   if (!caps.cmd_pause || !caps.cmd_terminate ||
+   if ((!caps.cmd_pause || !caps.cmd_terminate) &&
caps.residue_granularity ==
DMA_RESIDUE_GRANULARITY_DESCRIPTOR) {
ret = -EINVAL;
goto release_rx;

Re: [PATCH 00/14] run seccomp after ptrace

2016-06-13 Thread Kees Cook

(Oops, forgot to send this series through the lsm list...)

On Thu, Jun 9, 2016 at 2:01 PM, Kees Cook  wrote:
> There has been a long-standing (and documented) issue with seccomp
> where ptrace can be used to change a syscall out from under seccomp.
> This is a problem for containers and other wider seccomp filtered
> environments where ptrace needs to remain available, as it allows
> for an escape of the seccomp filter.
>
> Since the ptrace attack surface is available for any allowed syscall,
> moving seccomp after ptrace doesn't increase the actually available
> attack surface. And this actually improves tracing since, for
> example, tracers will be notified of syscall entry before seccomp
> sends a SIGSYS, which makes debugging filters much easier.
>
> The per-architecture changes do make one (hopefully small)
> semantic change, which is that since ptrace comes first, it may
> request a syscall be skipped. Running seccomp after this doesn't
> make sense, so if ptrace wants to skip a syscall, it will bail
> out early similarly to how seccomp was. This means that skipped
> syscalls will not be fed through audit, though that likely means
> we're actually avoiding noise this way.
>
> This series first cleans up seccomp to remove the now unneeded
> two-phase entry, fixes the SECCOMP_RET_TRACE hole (same as the
> ptrace hole above), and then reorders seccomp after ptrace on
> each architecture.

Has anyone else had a chance to review this series? I'd like to get it
landed in -next as early as possible in case there are unexpected
problems...

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

Re: [RFC 01/18] capabilities: track actually used capabilities

2016-06-13 Thread Topi Miettinen

On 06/13/16 20:32, Andy Lutomirski wrote:
> On Mon, Jun 13, 2016 at 12:44 PM, Topi Miettinen  wrote:
>> Track what capabilities are actually used and present the current
>> situation in /proc/self/status.
> 
> What for?

Excerpt from the cover letter:

"There are many basic ways to control processes, including capabilities,
cgroups and resource limits. However, there are far fewer ways to find out
useful values for the limits, except blind trial and error.

This patch series attempts to fix that by giving at least a nice starting
point from the actual maximum values. I looked where each limit is checked
and added a call to limit bump nearby.

Capabilities
[RFC 01/18] capabilities: track actually used capabilities

Currently, there is no way to know which capabilities are actually used.
Even
the source code is only implicit, in-depth knowledge of each capability must
be used when analyzing a program to judge which capabilities the program
will
exercise."

Should I perhaps cite some of this in the commit?

>
> What is the intended behavior on fork()?  Whatever the intended
> behavior is, there should IMO be a selftest for it.
>
> --Andy
>

The capabilities could be tracked from three points of daemon
initialization sequence onwards:
fork()
setpcap()
exec()

fork() case would be logical as the /proc entry is per task. But if you
consider the tools to set the capabilities (for example systemd unit
files), there can be between fork() and exec() further preparations
which need more capabilities than the program itself needs.

setpcap() is probably the real point after which we are interested if
the capabilities are enough.

The amount of setup between setpcap() and exec() is probably very low.

-Topi

Re: [PATCH] s390/oprofile: Remove deprecated create_workqueue

2016-06-13 Thread William Cohen

On 06/13/2016 12:29 PM, Robert Richter wrote:
> Heiko,
> 
> On 09.06.16 11:00:56, Heiko Carstens wrote:
>> However I'm wondering if we shouldn't simply remove at least the s390
>> specific hwswampler code from the oprofile module. This would still leave
>> the common code timer based sampling mode for oprofile working on s390.
>>
>> It looks like the oprofile user space utility nowadays (since 2012) uses
>> the kernel perf interface instead of the oprofile interface anyway, if
>> present. So the oprofile module itself doesn't seem to have too many users
>> left.
>>
>> Any opinions?
> 
> yes, the kernel driver is not necessary for oprofile userland for a
> while now. There is no ongoing development any longer, most patches
> are due to changes in the kernel apis.
> 
> So if there is code that needs a larger rework due to other kernel
> changes and there is no user anymore, I am fine with removing the code
> instead of reworking it. I still would just keep existing code as long
> as we can keep it unchanged (some like the lightwight of oprofile,
> esp. in the embedded space). If there is a user of the code, a
> Tested-by would be good for new code changes.
> 
> If there are users of the hwswampler, speak up now. Else, let's just
> remove it.
> 
> -Robert
> 

Hi,

As Robert mentioned the user-space oprofile code that would have used the 
oprofile device driver was removed around August 2014 in preparation for 
oprofile-1.0. The operf command which uses the kernel perf infrastructure and 
does not need for the oprofile kernel driver has been in oprofile since August 
of 2012.  For some architectures it would make sense to simplify things by 
eliminate the oprofile kernel driver.

-Will Cohen

[PATCH] lustre: hide call to Posix ACL in ifdef

2016-06-13 Thread Arnd Bergmann

A call to forget_cached_acl() was recently added to the lustre file
system, but this is only available when CONFIG_FS_POSIX_ACL is
enabled, otherwise the build now fails with:

lustre/llite/file.c: In function 'll_get_acl':
lustre/llite/file.c:3134:2: error: implicit declaration of function 
'forget_cached_acl' [-Werror=implicit-function-declaration]
  forget_cached_acl(inode, type);

This adds one more #ifdef for this call, corresponding to the
other 22 such checks for ACL in lustre.

Signed-off-by: Arnd Bergmann 
Fixes: b788dc51e425 ("staging: lustre: llite: drop acl from cache")
---
 drivers/staging/lustre/lustre/llite/file.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/staging/lustre/lustre/llite/file.c 
b/drivers/staging/lustre/lustre/llite/file.c
index bafa0b701e87..26c6cd60ae1d 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -3131,7 +3131,9 @@ struct posix_acl *ll_get_acl(struct inode *inode, int 
type)
spin_lock(&lli->lli_lock);
/* VFS' acl_permission_check->check_acl will release the refcount */
acl = posix_acl_dup(lli->lli_posix_acl);
+#ifdef CONFIG_FS_POSIX_ACL
forget_cached_acl(inode, type);
+#endif
spin_unlock(&lli->lli_lock);
 
return acl;
-- 
2.7.0

Re: [PATCH] clk: rockchip: add flag CLK_SET_RATE_PARENT for dclk_vop0_div on RK3399

2016-06-13 Thread Doug Anderson

Hi,

On Mon, Jun 13, 2016 at 11:37 AM, Brian Norris  wrote:
> Hi,
>
> On Sun, Jun 12, 2016 at 06:46:51PM +0800, Yakir Yang wrote:
>> On 06/12/2016 05:48 PM, Xing Zheng wrote:
>> >The functions and features VOP0 more complete than VOP1's, we need to
>> >use it dclk_vop0_div operate VPLLI, and let VOP0 as the default primary
>> >screen.
>
> Personally, I'd like a little better description that talks about the
> rates, not just the differences between VOP0 and VOP1. Presumably the
> clock rates needed by VOP0 are not achievable just by these dividers, so
> we need to adjust the PLL?

The idea is that there is a "big" VOP (vop0) and a "little" VOP
(vop1).  The "big" VOP can support higher resolutions and more output
formats but draws a little more power.  The "little" VOP supports
lower resolutions and a more limited set of formats.  If you're
curious, chapter 1 of the rk3399 TRM has a summary of the VOP features
(big and little).

In general, I think the SoC allows dynamic assignment of the VOPs to
the various video devices (eDP, DP, MIPI, HDMI).  So you can output to
two places at once and you get to pick whichever VOP you want for each
output.

The VOPs have three PLL sources: VPLL, CPLL, and GPLL.  Those PLLs are
best described as:
* CPLL - The PLL that runs at 800MHz.
* GPLL - The PLL that runs at ~600MHz (actually 594 MHz).
* VPLL - The PLL that can change rates to make various pixels clocks.

Presumably:
* The little VOP has enough features that you'd want to use it for
most internal laptop / cellphone / tablet panels.
* The big VOP is a good choice for whatever external graphics
connector you have so you can support the widest range of devices.

So if you've got a laptop that happens to have an internal panel and
an external connector, presumably:

* You want to adjust your display timings (hblank, vblank, etc) to
make sure that the internal display can be driven by dividing 800 MHz
or 594 MHz by some integral amount.  As an example, for the Starry
panel I posted recently 
you could make exactly 148.5 (594 / 4) by subtracting 4 from the
horizontal total and adding 15 to the vertical: 1250 * 1980 * 60 Hz =
148.5 MHz

* You want to make sure that the internal display gets assigned the
little VOP so save power / leave flexibility for the external
connector.

* You want to make sure that that the little VOP _doesn't_
accidentally get assigned VPLL even if (at boot) VPLL happens to be at
a rate that would be fine for the panel.  If you accidentally use VPLL
as a parent then you'll have a tougher time changing VPLL later when
an external display is plugged in.

NOTE: If you have things other than a laptop the decisions between
VOP0 and VOP1 get much tougher.

> FWIW, I haven't actually found this patch necessary in my own testing (I
> have eDP running fine without this change), but perhaps with better
> justification, this will make more sense.

It is probable that firmware has already set the PLL up.  It would be
interested to hack your firmware to turn off the display and see if
your behavior changes.  Alternatively, try adding something like this
to hack the VOPs:

--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -816,7 +816,7 @@
<&cru ACLK_VOP1>, <&cru HCLK_VOP1>,
<&cru ARMCLKL>, <&cru ARMCLKB>,
<&cru PLL_GPLL>, <&cru PLL_CPLL>,
-   <&cru PLL_NPLL>,
+   <&cru PLL_NPLL>, <&cru PLL_VPLL>,
<&cru ACLK_PERIHP>, <&cru HCLK_PERIHP>,
<&cru PCLK_PERIHP>,
<&cru ACLK_PERILP0>, <&cru HCLK_PERILP0>,
@@ -827,7 +827,7 @@
<4>,  <2>,
<81600>, <100800>,
 <59400>,  <8>,
-   <10>,
+   <10>,  <9>,
 <15000>,   <7500>,
  <3750>,
 <1>,  <1>,

NOTE also: it seems terribly unlikely that adding CLK_SET_RATE_PARENT
to "vop0" would help with eDP, which really ought to be using vop1,
right?  In testing on my board, I found that eDP is in fact using vop1
with my current patch stack.

---

To summarize all that, I think that the following things would work OK
for a laptop until a better solution comes along:

* Probably VOP0 and VOP1 should both be able to change their parent's rate.

* Somehow adjust the panel rate to one that could be produced by CPLL
/ GPLL.  Presumably we'd want some code to add these extra modes to
simple-panel (?) and some code to know which mode to pick (?).

* make sure VOP0 is assigned to the panel (make this already is forced somehow?)

* make sure VOP0 starts out with the right parent (CPLL / GPLL) using
assigned-clocks in the device tree, so CCF will leave things al

Re: [PATCH 2/8] kexec_file: Generalize kexec_add_buffer.

2016-06-13 Thread Thiago Jung Bauermann

Hi Dave,

Am Montag, 13 Juni 2016, 16:08:19 schrieb Thiago Jung Bauermann:
> Am Montag, 13 Juni 2016, 15:29:39 schrieb Dave Young:
> > On 06/12/16 at 12:10am, Thiago Jung Bauermann wrote:
> > > Allow architectures to specify different memory walking functions for
> > > kexec_add_buffer. Intel uses iomem to track reserved memory ranges,
> > > but PowerPC uses the memblock subsystem.
> > 
> > Can the crashk_res be inserted to iomem_resource so that only one
> > weak function for system ram is needed?
> 
> Sorry, it's not clear to me what you mean by inserting crashk_res into
> iomem_resource, but I can add a bool for_crashkernel to
> arch_walk_system_ram so that it can decide which kind of memory to
> traverse.

This is the patch implementing that idea. What do you think?
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


kexec_file: Generalize kexec_add_buffer.

Allow architectures to specify different memory walking functions for
kexec_add_buffer. Intel uses iomem to track reserved memory ranges,
but PowerPC uses the memblock subsystem.

Signed-off-by: Thiago Jung Bauermann 
Cc: Eric Biederman 
Cc: ke...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index e8acb2b43dd9..be18cb80c14e 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -315,6 +315,9 @@ int __weak arch_kexec_apply_relocations_add(const Elf_Ehdr 
*ehdr,
Elf_Shdr *sechdrs, unsigned int relsec);
 int __weak arch_kexec_apply_relocations(const Elf_Ehdr *ehdr, Elf_Shdr 
*sechdrs,
unsigned int relsec);
+int __weak arch_walk_system_ram(bool for_crashkernel, unsigned long start,
+   unsigned long end, bool top_down, void *data,
+   int (*func)(u64, u64, void *));
 void arch_kexec_protect_crashkres(void);
 void arch_kexec_unprotect_crashkres(void);
 
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index b6eec7527e9f..5d0a6a20b12b 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -428,6 +428,38 @@ static int locate_mem_hole_callback(u64 start, u64 end, 
void *arg)
return locate_mem_hole_bottom_up(start, end, kbuf);
 }
 
+/**
+ * arch_walk_system_ram - call func(data) on free memory regions
+ * @for_crashkernel:   Is this for the crash kernel?
+ * @start: Don't visit memory regions below this address.
+ * @end:   Don't visit memory regions above this address.
+ * @top_down:  Starts from the highest address?
+ * @data:  Argument to pass to @func.
+ * @func:  Function to call for each memory region.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+int __weak arch_walk_system_ram(bool for_crashkernel, unsigned long start,
+   unsigned long end, bool top_down, void *data,
+   int (*func)(u64, u64, void *))
+{
+   int ret;
+
+   if (for_crashkernel)
+   ret = walk_iomem_res_desc(crashk_res.desc,
+ IORESOURCE_SYSTEM_RAM | 
IORESOURCE_BUSY,
+ start, end, data, func);
+   else
+   ret = walk_system_ram_res(start, end, data, func);
+
+   if (ret != 1) {
+   /* A suitable memory range could not be found for buffer */
+   return -EADDRNOTAVAIL;
+   }
+
+   return 0;
+}
+
 /*
  * Helper function for placing a buffer in a kexec segment. This assumes
  * that kexec_mutex is held.
@@ -473,17 +505,14 @@ int kexec_add_buffer(struct kimage *image, char *buffer, 
unsigned long bufsz,
 
/* Walk the RAM ranges and allocate a suitable range for the buffer */
if (image->type == KEXEC_TYPE_CRASH)
-   ret = walk_iomem_res_desc(crashk_res.desc,
-   IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
-   crashk_res.start, crashk_res.end, kbuf,
-   locate_mem_hole_callback);
+   ret = arch_walk_system_ram(true, crashk_res.start,
+  crashk_res.end, top_down, kbuf,
+  locate_mem_hole_callback);
else
-   ret = walk_system_ram_res(0, -1, kbuf,
- locate_mem_hole_callback);
-   if (ret != 1) {
-   /* A suitable memory range could not be found for buffer */
-   return -EADDRNOTAVAIL;
-   }
+   ret = arch_walk_system_ram(false, 0, -1, top_down, kbuf,
+  locate_mem_hole_callback);
+   if (ret)
+   return ret;
 
/* Found a suitable memory range */
ksegment = &image->segment[image->nr_segments];

Re: [PATCH 5/6] x86/ptrace: down with test_thread_flag(TIF_IA32)

2016-06-13 Thread Andy Lutomirski

On Mon, Jun 13, 2016 at 6:50 AM, Oleg Nesterov  wrote:
> To avoid the confusion, let me first say that I am not going to argue
> with these changes, I simply do not understand the problem space enough.
>
> On 06/10, Andy Lutomirski wrote:
>>
>> On Fri, Jun 10, 2016 at 1:07 PM, Oleg Nesterov  wrote:
>> >
>> > IIRC, CRIU can't c/r the 32-bit applications, or this is no longer true?
>> >
>>
>> CRIU has a horrible, nasty, brilliant idea: it will start restoring
>> 32-bit processes by treating them mostly like 64-bit processes.  The
>> restorer will start out 64-bit, set everything up, and long
>> jump/return/sigreturn/whatever back to 32-bit mode.
>
> OK, I see,
>
>> My proposal was
>> that, rather than coming up with nasty hacks to switch the kernel's
>> idea of the task bitness,
>
> Well, I can't resist but to me SA_IA32_ABI/SA_X32_ABI looks like a hack
> too.  We actually shift TIF_*32 into k_sigaction->flags, and the fact
> that we do this per-signal looks, well, interesting ;)

Is anything actually wrong with this, though?

Re: [RFC 05/18] limits: track and present RLIMIT_NOFILE actual max

2016-06-13 Thread Andy Lutomirski


On 06/13/2016 12:44 PM, Topi Miettinen wrote:

Track maximum number of files for the process, present current maximum
in /proc/self/limits.


The core part should be its own patch.

Also, you have this weirdly named (and racy!) function bump_rlimit. 
Wouldn't this be nicer if you taught the rlimit code to track the 
*current* usage generically and to derive the max usage from that?



diff --git a/fs/proc/base.c b/fs/proc/base.c
index a11eb71..227997b 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -630,8 +630,8 @@ static int proc_pid_limits(struct seq_file *m, struct 
pid_namespace *ns,
/*
 * print the file header
 */
-   seq_printf(m, "%-25s %-20s %-20s %-10s\n",
- "Limit", "Soft Limit", "Hard Limit", "Units");
+   seq_printf(m, "%-25s %-20s %-20s %-10s %-20s\n",
+  "Limit", "Soft Limit", "Hard Limit", "Units", "Max");


What existing programs, if any, does this break?



for (i = 0; i < RLIM_NLIMITS; i++) {
if (rlim[i].rlim_cur == RLIM_INFINITY)
@@ -647,9 +647,11 @@ static int proc_pid_limits(struct seq_file *m, struct 
pid_namespace *ns,
seq_printf(m, "%-20lu ", rlim[i].rlim_max);

if (lnames[i].unit)
-   seq_printf(m, "%-10s\n", lnames[i].unit);
+   seq_printf(m, "%-10s", lnames[i].unit);
else
-   seq_putc(m, '\n');
+   seq_printf(m, "%-10s", "");
+   seq_printf(m, "%-20lu\n",
+  task->signal->rlim_curmax[i]);
}

return 0;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 9c48a08..0150380 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -782,6 +782,7 @@ struct signal_struct {
 * have no need to disable irqs.
 */
struct rlimit rlim[RLIM_NLIMITS];
+   unsigned long rlim_curmax[RLIM_NLIMITS];

 #ifdef CONFIG_BSD_PROCESS_ACCT
struct pacct_struct pacct;  /* per-process accounting information */
@@ -3376,6 +3377,12 @@ static inline unsigned long rlimit_max(unsigned int 
limit)
return task_rlimit_max(current, limit);
 }

+static inline void bump_rlimit(unsigned int limit, unsigned long r)
+{
+   if (READ_ONCE(current->signal->rlim_curmax[limit]) < r)
+   current->signal->rlim_curmax[limit] = r;
+}
+
 #ifdef CONFIG_CPU_FREQ
 struct update_util_data {
void (*func)(struct update_util_data *data,

[PATCH] clocksource: nps: fix nps_timer_init return value

2016-06-13 Thread Arnd Bergmann

The CLOCKSOURCE_OF_DECLARE macro ensures that the type of the init
function matches the caller. In case of the new timer-nps driver,
it doesn't match, so we get a warning:

../drivers/clocksource/timer-nps.c:97:208: error: comparison of distinct 
pointer types lacks a cast [-Werror]
 CLOCKSOURCE_OF_DECLARE(ezchip_nps400_clksrc, ezchip,nps400-timer,

This changes the return type to match what the caller expects.

Signed-off-by: Arnd Bergmann 
---
 drivers/clocksource/timer-nps.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/clocksource/timer-nps.c b/drivers/clocksource/timer-nps.c
index d46108920b2c..ae34718f5ab2 100644
--- a/drivers/clocksource/timer-nps.c
+++ b/drivers/clocksource/timer-nps.c
@@ -81,17 +81,19 @@ static void __init nps_setup_clocksource(struct device_node 
*node,
}
 }
 
-static void __init nps_timer_init(struct device_node *node)
+static int __init nps_timer_init(struct device_node *node)
 {
struct clk *clk;
 
clk = of_clk_get(node, 0);
if (IS_ERR(clk)) {
pr_err("Can't get timer clock.\n");
-   return;
+   return PTR_ERR(clk);
}
 
nps_setup_clocksource(node, clk);
+
+   return 0;
 }
 
 CLOCKSOURCE_OF_DECLARE(ezchip_nps400_clksrc, "ezchip,nps400-timer",
-- 
2.7.0

[PATCH] clocksource: kona: avoid bogus warning

2016-06-13 Thread Arnd Bergmann

I could not figure out why, but gcc cannot prove that the
kona_timer_init function always initializes its two outputs,
and we get a warning for the use of the 'lsw' variable later,
which is obviously correct.

drivers/clocksource/bcm_kona_timer.c: In function 'kona_timer_init':
drivers/clocksource/bcm_kona_timer.c:119:13: error: 'lsw' may be used 
uninitialized in this function [-Werror=maybe-uninitialized]

Slightly reordering the loop makes the warning disappear, after
it becomes more obvious to the compiler that the loop is
always entered on the first iteration.

Signed-off-by: Arnd Bergmann 
---
 drivers/clocksource/bcm_kona_timer.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/clocksource/bcm_kona_timer.c 
b/drivers/clocksource/bcm_kona_timer.c
index 70d9c1e482dd..bbbfb03b46dd 100644
--- a/drivers/clocksource/bcm_kona_timer.c
+++ b/drivers/clocksource/bcm_kona_timer.c
@@ -69,7 +69,7 @@ static void kona_timer_disable_and_clear(void __iomem *base)
 static void
 kona_timer_get_counter(void __iomem *timer_base, uint32_t *msw, uint32_t *lsw)
 {
-   int loop_limit = 4;
+   int loop_limit = 3;
 
/*
 * Read 64-bit free running counter
@@ -83,12 +83,12 @@ kona_timer_get_counter(void __iomem *timer_base, uint32_t 
*msw, uint32_t *lsw)
 *  if new hi-word is equal to previously read hi-word then stop.
 */
 
-   while (--loop_limit) {
+   do {
*msw = readl(timer_base + KONA_GPTIMER_STCHI_OFFSET);
*lsw = readl(timer_base + KONA_GPTIMER_STCLO_OFFSET);
if (*msw == readl(timer_base + KONA_GPTIMER_STCHI_OFFSET))
break;
-   }
+   } while (--loop_limit);
if (!loop_limit) {
pr_err("bcm_kona_timer: getting counter failed.\n");
pr_err(" Timer will be impacted\n");
-- 
2.7.0

Re: [PATCH 1/2] liblockdep: Fix compile errors

2016-06-13 Thread Sasha Levin

On 06/11/2016 04:36 AM, Vishal Thanki wrote:
> On Sat, Jun 11, 2016 at 5:53 AM, Sasha Levin  wrote:
>> On 06/09/2016 09:34 AM, Vishal Thanki wrote:
>>> dfaaf3fa0: (Use __jhash_mix() for iterate_chain_key())
>>> Fixed by adding jhash.h with minimal stuff required
>>
>> Can we, instead of copying it over, include jhash.h directly
>> (just like we do for hash.h)?
>>
>>
> That was the first thing I tried, but then it caused more compilation
> errors due to nested header dependencies and I ended up taking the
> only required stuff. Better ideas are welcome.

I just gave it a quick go and didn't see anything beyond needing to take
in linux/unaligned/packed_struct.h as well. What sort of errors did you
hit?


Thanks,
Sasha

Re: [PATCH net-next] net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)

2016-06-13 Thread Matt Wilson

On Mon, Jun 13, 2016 at 11:46:13AM +0300, Netanel Belgazal wrote:
> This is a driver for the forthcoming ENA family of networking devices.

Reviewed-by: Matt Wilson 

> Signed-off-by: Netanel Belgazal 
> ---
>  Documentation/networking/00-INDEX |2 +
>  Documentation/networking/ena.txt  |  330 ++
>  MAINTAINERS   |9 +
>  drivers/net/ethernet/Kconfig  |1 +
>  drivers/net/ethernet/Makefile |1 +
>  drivers/net/ethernet/amazon/Kconfig   |   27 +
>  drivers/net/ethernet/amazon/Makefile  |5 +
>  drivers/net/ethernet/amazon/ena/Makefile  |9 +
>  drivers/net/ethernet/amazon/ena/ena_admin_defs.h  | 1246 
>  drivers/net/ethernet/amazon/ena/ena_com.c | 2768 +
>  drivers/net/ethernet/amazon/ena/ena_com.h | 1051 +++
>  drivers/net/ethernet/amazon/ena/ena_common_defs.h |   52 +
>  drivers/net/ethernet/amazon/ena/ena_eth_com.c |  508 
>  drivers/net/ethernet/amazon/ena/ena_eth_com.h |  160 +
>  drivers/net/ethernet/amazon/ena/ena_eth_io_defs.h |  460 +++
>  drivers/net/ethernet/amazon/ena/ena_ethtool.c |  836 +
>  drivers/net/ethernet/amazon/ena/ena_netdev.c  | 3362 
> +
>  drivers/net/ethernet/amazon/ena/ena_netdev.h  |  327 ++
>  drivers/net/ethernet/amazon/ena/ena_pci_id_tbl.h  |   67 +
>  drivers/net/ethernet/amazon/ena/ena_regs_defs.h   |  133 +
>  drivers/net/ethernet/amazon/ena/ena_sysfs.c   |  264 ++
>  drivers/net/ethernet/amazon/ena/ena_sysfs.h   |   55 +
>  22 files changed, 11673 insertions(+)
>  create mode 100644 Documentation/networking/ena.txt
>  create mode 100644 drivers/net/ethernet/amazon/Kconfig
>  create mode 100644 drivers/net/ethernet/amazon/Makefile
>  create mode 100644 drivers/net/ethernet/amazon/ena/Makefile
>  create mode 100644 drivers/net/ethernet/amazon/ena/ena_admin_defs.h
>  create mode 100644 drivers/net/ethernet/amazon/ena/ena_com.c
>  create mode 100644 drivers/net/ethernet/amazon/ena/ena_com.h
>  create mode 100644 drivers/net/ethernet/amazon/ena/ena_common_defs.h
>  create mode 100644 drivers/net/ethernet/amazon/ena/ena_eth_com.c
>  create mode 100644 drivers/net/ethernet/amazon/ena/ena_eth_com.h
>  create mode 100644 drivers/net/ethernet/amazon/ena/ena_eth_io_defs.h
>  create mode 100644 drivers/net/ethernet/amazon/ena/ena_ethtool.c
>  create mode 100644 drivers/net/ethernet/amazon/ena/ena_netdev.c
>  create mode 100644 drivers/net/ethernet/amazon/ena/ena_netdev.h
>  create mode 100644 drivers/net/ethernet/amazon/ena/ena_pci_id_tbl.h
>  create mode 100644 drivers/net/ethernet/amazon/ena/ena_regs_defs.h
>  create mode 100644 drivers/net/ethernet/amazon/ena/ena_sysfs.c
>  create mode 100644 drivers/net/ethernet/amazon/ena/ena_sysfs.h

[...]

--msw

Re: [RFC 01/18] capabilities: track actually used capabilities

2016-06-13 Thread Andy Lutomirski

On Mon, Jun 13, 2016 at 12:44 PM, Topi Miettinen  wrote:
> Track what capabilities are actually used and present the current
> situation in /proc/self/status.

What for?

What is the intended behavior on fork()?  Whatever the intended
behavior is, there should IMO be a selftest for it.

--Andy

Re: [PATCH v2 4/4] dynamic_debug: add jump label support

2016-06-13 Thread Jason Baron



On 06/13/2016 04:23 PM, Arnd Bergmann wrote:
> On Monday, June 13, 2016 6:05:22 PM CEST Arnd Bergmann wrote:
>> On Friday, June 10, 2016 11:33:07 AM CEST Jason Baron wrote:
>>> On 06/10/2016 05:54 AM, Arnd Bergmann wrote:
 On Friday, May 20, 2016 5:16:36 PM CEST Jason Baron wrote:
> Although dynamic debug is often only used for debug builds, sometimes its
> enabled for production builds as well. Minimize its impact by using jump
> labels. This reduces the text section by 7000+ bytes in the kernel image
> below. It does increase data, but this should only be referenced when
> changing the direction of the branches, and hence usually not in cache.
>
>text data bss dec hex filename
> 8194852  4879776  925696 14000324 d5a0c4 vmlinux.pre
> 8187337  4960224  925696 14073257 d6bda9 vmlinux.post
>
> Signed-off-by: Jason Baron 
> ---

 This causes problems for some of my randconfig builds, when a dynamic
 debug call is used inside of an __exit function:

 `.exit.text' referenced in section `__jump_table' of drivers/built-in.o: 
 defined in discarded section `.exit.text' of drivers/built-in.o
 `.exit.text' referenced in section `__jump_table' of drivers/built-in.o: 
 defined in discarded section `.exit.text' of drivers/built-in.o

>>>
>>> I stuck pr_debug() in a few functions marked with __exit, but did not
>>> reproduce yet. Can you share your .config and gcc --version.
>>>
>>
>> I found these on ARM randconfig builds e.g. this one
>> http://pastebin.com/raw/KjWHxnwU
>>
>> I also have some other patches applied that could have interacted with your
>> change, so if you can't reproduce it easily, let me try it on a plain 
>> linux-next
>> kernel.
>>
>> The compiler I use is  arm-linux-gnueabi-gcc (GCC) 6.0.0 20160323 
>> (experimental)
> 
> Update: on ARM, I have been able to reproduce this with gcc-4.6
> and gcc-4.8, so I'm pretty confident that this is independent of the
> toolchain. However, I have so far failed to reproduce this on x86.
> 
> Looking at the exit_ceph() function, I get these two assembly outputs,
> ARM fails with the link error above:
> 

ok, does this fix things up?

--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -44,7 +44,7 @@
 #endif

 #if (defined(CONFIG_SMP_ON_UP) && !defined(CONFIG_DEBUG_SPINLOCK)) || \
-   defined(CONFIG_GENERIC_BUG)
+   defined(CONFIG_GENERIC_BUG) || defined(CONFIG_JUMP_LABEL)
 #define ARM_EXIT_KEEP(x)   x
 #define ARM_EXIT_DISCARD(x)
 #else


Thanks,

-Jason

Re: 4.7-rc3: Reported regressions from 4.7

2016-06-13 Thread Rafael J. Wysocki

On Monday, June 13, 2016 03:53:35 PM Borislav Petkov wrote:
> On Mon, Jun 13, 2016 at 03:53:20PM +0200, Rafael J. Wysocki wrote:
> > I used kernel BZ entries for two reasons.
> > 
> > First, some of the bugs were in the kernel BZ already, so they could be 
> > added
> > to the tracked list very easily when it was in BZ itself.  Second, some 
> > people
> > actually used BZ entries created by me to work on the bugs going forward 
> > (for
> > storing logs, acpidumps and similar).
> 
> Could be useful too, especially if we mark all entries with some keyword
> like "regression" or somesuch and then search for it to geget all
> regressions.

There is a "regression" flag in the kernel BZ already.  Guess why. ;-)

> @Thorsten: well, if you do use bugzilla, you have everything there and
> ready to use :)

I may be able to find some scripts I used with that a few years ago (although
they aren't pretty).

Thanks,
Rafael

Re: [PATCH v2 1/7] mm/compaction: split freepages without holding the zone lock

2016-06-13 Thread Sasha Levin

On 05/25/2016 10:37 PM, js1...@gmail.com wrote:
> From: Joonsoo Kim 
> 
> We don't need to split freepages with holding the zone lock. It will cause
> more contention on zone lock so not desirable.
> 
> Signed-off-by: Joonsoo Kim 

Hey Joonsoo,

I'm seeing the following corruption/crash which seems to be related to
this patch:

[ 3777.807224] [ cut here ]

[ 3777.807834] WARNING: CPU: 5 PID: 3270 at lib/list_debug.c:62 
__list_del_entry+0x14e/0x280

[ 3777.808562] list_del corruption. next->prev should be ea0004a76120, but 
was ea0004a72120

[ 3777.809498] Modules linked in:

[ 3777.809923] CPU: 5 PID: 3270 Comm: khugepaged Tainted: GW   
4.7.0-rc2-next-20160609-sasha-00024-g30ecaf6 #3101

[ 3777.811014]  1100f9315d7b 0bb7299a 8807c98aec60 
a0035b2b

[ 3777.811816]  0005 fbfff5630bf4 41b58ab3 
aaaf18e0

[ 3777.812662]  a00359bc 9e54d4a0 a8b2ade0 
8807c98aece0

[ 3777.813493] Call Trace:

[ 3777.813796] dump_stack (lib/dump_stack.c:53)
[ 3777.814310] ? arch_local_irq_restore (./arch/x86/include/asm/paravirt.h:134)
[ 3777.814947] ? is_module_text_address (kernel/module.c:4185)
[ 3777.815571] ? __list_del_entry (lib/list_debug.c:60 (discriminator 1))
[ 3777.816174] ? vprintk_default (kernel/printk/printk.c:1886)
[ 3777.816761] ? __list_del_entry (lib/list_debug.c:60 (discriminator 1))
[ 3777.817381] __warn (kernel/panic.c:518)
[ 3777.817867] warn_slowpath_fmt (kernel/panic.c:526)
[ 3777.818428] ? __warn (kernel/panic.c:526)
[ 3777.819001] ? __schedule (kernel/sched/core.c:2858 kernel/sched/core.c:3345)
[ 3777.819541] __list_del_entry (lib/list_debug.c:60 (discriminator 1))
[ 3777.820116] ? __list_add (lib/list_debug.c:45)
[ 3777.820721] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[ 3777.821347] list_del (lib/list_debug.c:78)
[ 3777.821829] __isolate_free_page (mm/page_alloc.c:2514)
[ 3777.822400] ? __zone_watermark_ok (mm/page_alloc.c:2493)
[ 3777.823007] isolate_freepages_block (mm/compaction.c:498)
[ 3777.823629] ? compact_unlock_should_abort (mm/compaction.c:417)
[ 3777.824312] compaction_alloc (mm/compaction.c:1112 mm/compaction.c:1156)
[ 3777.824871] ? isolate_freepages_block (mm/compaction.c:1146)
[ 3777.825512] ? __page_cache_release (mm/swap.c:73)
[ 3777.826127] migrate_pages (mm/migrate.c:1079 mm/migrate.c:1325)
[ 3777.826712] ? __reset_isolation_suitable (mm/compaction.c:1175)
[ 3777.827398] ? isolate_freepages_block (mm/compaction.c:1146)
[ 3777.828109] ? buffer_migrate_page (mm/migrate.c:1301)
[ 3777.828727] compact_zone (mm/compaction.c:1555)
[ 3777.829290] ? compaction_restarting (mm/compaction.c:1476)
[ 3777.829969] ? _raw_spin_unlock_irq (./arch/x86/include/asm/preempt.h:92 
include/linux/spinlock_api_smp.h:171 kernel/locking/spinlock.c:199)
[ 3777.830607] compact_zone_order (mm/compaction.c:1653)
[ 3777.831204] ? kick_process (kernel/sched/core.c:2692)
[ 3777.831774] ? compact_zone (mm/compaction.c:1637)
[ 3777.832336] ? io_schedule_timeout (kernel/sched/core.c:3266)
[ 3777.832934] try_to_compact_pages (mm/compaction.c:1717)
[ 3777.833550] ? compaction_zonelist_suitable (mm/compaction.c:1679)
[ 3777.834265] __alloc_pages_direct_compact (mm/page_alloc.c:3180)
[ 3777.834922] ? get_page_from_freelist (mm/page_alloc.c:3172)
[ 3777.835549] __alloc_pages_slowpath (mm/page_alloc.c:3741)
[ 3777.836210] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:84 
arch/x86/kernel/kvmclock.c:92)
[ 3777.836744] ? __alloc_pages_direct_compact (mm/page_alloc.c:3546)
[ 3777.837429] ? get_page_from_freelist (mm/page_alloc.c:2950)
[ 3777.838072] ? release_pages (mm/swap.c:731)
[ 3777.838610] ? __isolate_free_page (mm/page_alloc.c:2883)
[ 3777.839209] ? ___might_sleep (kernel/sched/core.c:7540 (discriminator 1))
[ 3777.839826] ? __might_sleep (kernel/sched/core.c:7532 (discriminator 14))
[ 3777.840427] __alloc_pages_nodemask (mm/page_alloc.c:3841)
[ 3777.841071] ? rwsem_wake (kernel/locking/rwsem-xadd.c:580)
[ 3777.841608] ? __alloc_pages_slowpath (mm/page_alloc.c:3757)
[ 3777.842253] ? call_rwsem_wake (arch/x86/lib/rwsem.S:129)
[ 3777.842839] ? up_write (kernel/locking/rwsem.c:112)
[ 3777.843350] ? pmdp_huge_clear_flush (mm/pgtable-generic.c:131)
[ 3777.844125] khugepaged_alloc_page (mm/khugepaged.c:752)
[ 3777.844719] collapse_huge_page (mm/khugepaged.c:948)
[ 3777.845332] ? khugepaged_scan_shmem (mm/khugepaged.c:922)
[ 3777.846020] ? __might_sleep (kernel/sched/core.c:7532 (discriminator 14))
[ 3777.846608] ? remove_wait_queue (kernel/sched/wait.c:292)
[ 3777.847181] khugepaged (mm/khugepaged.c:1724 mm/khugepaged.c:1799 
mm/khugepaged.c:1848)
[ 3777.847704] ? _raw_spin_unlock_irq (./arch/x86/include/asm/preempt.h:92 
include/linux/spinlock_api_smp.h:171 kernel/locking/spinlock.c:199)
[ 3777.848297] ? collapse_huge_page (mm/khugepaged.c:1840)
[ 3777.848950] ? io_schedule_timeout (kernel/sched/core.c:3266)
[ 3777.849555] ? default_wake_function (kernel/sched/core.c:3544)
[ 3777.

Re: [PATCH] ARM: mm: fix location of _etext

2016-06-13 Thread Kees Cook

On Wed, Jun 8, 2016 at 4:11 PM, Kees Cook  wrote:
> The _etext position is defined to be the end of the kernel text code,
> and should not include any part of the data segments. This interferes
> with things that might check memory ranges and expect executable code
> up to _etext.
>
> Signed-off-by: Kees Cook 

Can someone give this an Ack? I'd like to land it as it is a
prerequisite to some usercopy hardening work I'm doing.

Thanks!

-Kees

> ---
> arm64 needs this fixed too, but it has other assumptions built onto
> _etext that should be using different markers.
> ---
>  arch/arm/kernel/vmlinux.lds.S | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
> index e2c6da096cef..99420fc1f066 100644
> --- a/arch/arm/kernel/vmlinux.lds.S
> +++ b/arch/arm/kernel/vmlinux.lds.S
> @@ -125,6 +125,8 @@ SECTIONS
>  #ifdef CONFIG_DEBUG_ALIGN_RODATA
> . = ALIGN(1<  #endif
> +   _etext = .; /* End of text section */
> +
> RO_DATA(PAGE_SIZE)
>
> . = ALIGN(4);
> @@ -155,8 +157,6 @@ SECTIONS
>
> NOTES
>
> -   _etext = .; /* End of text and rodata section */
> -
>  #ifdef CONFIG_DEBUG_RODATA
> . = ALIGN(1<  #else
> --
> 2.7.4
>
>
> --
> Kees Cook
> Chrome OS & Brillo Security



-- 
Kees Cook
Chrome OS & Brillo Security

Re: [PATCH v2 4/4] dynamic_debug: add jump label support

2016-06-13 Thread Arnd Bergmann

On Monday, June 13, 2016 6:05:22 PM CEST Arnd Bergmann wrote:
> On Friday, June 10, 2016 11:33:07 AM CEST Jason Baron wrote:
> > On 06/10/2016 05:54 AM, Arnd Bergmann wrote:
> > > On Friday, May 20, 2016 5:16:36 PM CEST Jason Baron wrote:
> > >> Although dynamic debug is often only used for debug builds, sometimes its
> > >> enabled for production builds as well. Minimize its impact by using jump
> > >> labels. This reduces the text section by 7000+ bytes in the kernel image
> > >> below. It does increase data, but this should only be referenced when
> > >> changing the direction of the branches, and hence usually not in cache.
> > >>
> > >>text data bss dec hex filename
> > >> 8194852  4879776  925696 14000324 d5a0c4 vmlinux.pre
> > >> 8187337  4960224  925696 14073257 d6bda9 vmlinux.post
> > >>
> > >> Signed-off-by: Jason Baron 
> > >> ---
> > > 
> > > This causes problems for some of my randconfig builds, when a dynamic
> > > debug call is used inside of an __exit function:
> > > 
> > > `.exit.text' referenced in section `__jump_table' of drivers/built-in.o: 
> > > defined in discarded section `.exit.text' of drivers/built-in.o
> > > `.exit.text' referenced in section `__jump_table' of drivers/built-in.o: 
> > > defined in discarded section `.exit.text' of drivers/built-in.o
> > > 
> > 
> > I stuck pr_debug() in a few functions marked with __exit, but did not
> > reproduce yet. Can you share your .config and gcc --version.
> > 
> 
> I found these on ARM randconfig builds e.g. this one
> http://pastebin.com/raw/KjWHxnwU
> 
> I also have some other patches applied that could have interacted with your
> change, so if you can't reproduce it easily, let me try it on a plain 
> linux-next
> kernel.
> 
> The compiler I use is  arm-linux-gnueabi-gcc (GCC) 6.0.0 20160323 
> (experimental)

Update: on ARM, I have been able to reproduce this with gcc-4.6
and gcc-4.8, so I'm pretty confident that this is independent of the
toolchain. However, I have so far failed to reproduce this on x86.

Looking at the exit_ceph() function, I get these two assembly outputs,
ARM fails with the link error above:

.section.exit.text,"ax",%progbits
.align  2
.syntax unified
.arm
.fpu softvfp
.type   exit_ceph, %function
exit_ceph:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, uses_anonymous_args = 0
mov ip, sp  @,
push{fp, ip, lr, pc}@
sub fp, ip, #4  @,,
sub sp, sp, #8  @,,
.syntax divided
@ 13 "/git/arm-soc/arch/arm/include/asm/jump_label.h" 1
1:
nop
.pushsection __jump_table,  "aw"
.word 1b, .L341, descriptor.39418+20@,
.popsection

@ 0 "" 2
.syntax unified
.L342:
ldr r0, .L344   @,
bl  unregister_filesystem   @
bl  ceph_xattr_exit @
bl  destroy_caches  @
b   .L343   @
.L341:
mov r1, #29 @,
ldr r0, .L344+4 @,
bl  ceph_file_part  @
mov r3, #1072   @ tmp118,
mov r2, #3  @,
stm sp, {r0, r3}@,,
ldr r1, .L344+8 @,
ldr r3, .L344+12@,
ldr r0, .L344+16@,
bl  __dynamic_pr_debug  @
b   .L342   @
.L343:
sub sp, fp, #12 @,,
ldm sp, {fp, sp, pc}@
.L345:
.align  2
.L344:
.word   .LANCHOR2+224
.word   .LC0
.word   .LC69
.word   .LC1
.word   .LANCHOR0+1088
.size   exit_ceph, .-exit_ceph


and x86 has no link error with:

.type   exit_ceph, @function
exit_ceph:
pushq   %rbp#
movq%rsp, %rbp  #,
#APP
# 35 "/git/arm-soc/arch/x86/include/asm/jump_label.h" 1
1:.byte 0x0f,0x1f,0x44,0x00,0
.pushsection __jump_table,  "aw"
 .balign 8
 .quad 1b, .L350, descriptor.39765+40 + 0   #,,
.popsection

# 0 "" 2
#NO_APP
.L351:
movq$ceph_fs_type, %rdi #,
callunregister_filesystem   #
callceph_xattr_exit #
calldestroy_caches  #
popq%rbp#
ret
.L350:
movl$29, %esi   #,
movq$.LC0, %rdi #,
callceph_file_part  #
movl$1072, %r9d #,
movq%rax, %r8   #, D.41790
movq$.LC1, %rcx #,
movl$3, %edx#,
movq$.LC85, %rsi#,
movq$descriptor.39765, %rdi #,
call__dynamic_pr_debug  #
jmp .L351   #
.size   exit_ceph, .-exit_ceph


In both cases, the __jump_table section clearly has a reference to a
discarded section.

Arnd

[PATCH] x86/SVM: Fix implicit declaration issue for __default_cpu_present_to_apicid()

2016-06-13 Thread Suravee Suthikulpanit

The commit 8221c1370056 ("svm: Manage vcpu load/unload when enable AVIC")
introduces a build error due to implicit function declaration
with#ifdef CONFIG_X86_32 and #ifndef CONFIG_X86_LOCAL_APIC.
with Kbuild test robot config file (i386-randconfig-x0-06121009).

This patch fixes the issue by using the default_cpu_present_to_apicid(),
adding necessary function declaration, and exporting the symbol.

Reported-by: kbuild test robot 
Fixes: commit 8221c1370056 ("svm: Manage vcpu load/unload when enable AVIC")
Signed-off-by: Suravee Suthikulpanit 
---
 arch/x86/include/asm/apic.h | 5 +
 arch/x86/kernel/setup.c | 1 +
 arch/x86/kvm/svm.c  | 6 +++---
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index bc27611..beb1d4a 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -631,6 +631,11 @@ extern int default_cpu_present_to_apicid(int mps_cpu);
 extern int default_check_phys_apicid_present(int phys_apicid);
 #endif
 
+#else /* CONFIG_X86_LOCAL_APIC */
+static inline int default_cpu_present_to_apicid(int mps_cpu)
+{
+   return BAD_APICID;
+}
 #endif /* CONFIG_X86_LOCAL_APIC */
 extern void irq_enter(void);
 extern void irq_exit(void);
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index c4e7b39..90168be 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -137,6 +137,7 @@ int default_cpu_present_to_apicid(int mps_cpu)
 {
return __default_cpu_present_to_apicid(mps_cpu);
 }
+EXPORT_SYMBOL_GPL(default_cpu_present_to_apicid);
 
 int default_check_phys_apicid_present(int phys_apicid)
 {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 5b12438..803351a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1402,7 +1402,7 @@ avic_update_iommu(struct kvm_vcpu *vcpu, int cpu, 
phys_addr_t pa, bool r)
 static void avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
 {
u64 entry;
-   int h_physical_id = __default_cpu_present_to_apicid(vcpu->cpu);
+   int h_physical_id = default_cpu_present_to_apicid(vcpu->cpu);
struct vcpu_svm *svm = to_svm(vcpu);
 
if (!kvm_vcpu_apicv_active(vcpu))
@@ -1434,7 +1434,7 @@ static void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
u64 entry;
/* ID = 0xff (broadcast), ID > 0xff (reserved) */
-   int h_physical_id = __default_cpu_present_to_apicid(cpu);
+   int h_physical_id = default_cpu_present_to_apicid(cpu);
struct vcpu_svm *svm = to_svm(vcpu);
 
if (!kvm_vcpu_apicv_active(vcpu))
@@ -4328,7 +4328,7 @@ static void svm_deliver_avic_intr(struct kvm_vcpu *vcpu, 
int vec)
 
if (avic_vcpu_is_running(vcpu))
wrmsrl(SVM_AVIC_DOORBELL,
-  __default_cpu_present_to_apicid(vcpu->cpu));
+  default_cpu_present_to_apicid(vcpu->cpu));
else
kvm_vcpu_wake_up(vcpu);
 }
-- 
1.9.1

Re: [PATCH] brcmfmac: rework function picking free BSS index

2016-06-13 Thread Rafał Miłecki

On 13 June 2016 at 21:30, Arend van Spriel  wrote:
> On 09-06-16 21:16, Arend van Spriel wrote:
>> On 26-05-16 01:44, Rafał Miłecki wrote:
>>> The old implementation was overcomplicated and slightly bugged in some
>>> corner cases.
>>>
>
> [...]
>
>>> New code is simpler, placed in file where it's really used, handles
>>> running out of free BSS-es and allows using 4 interfaces at the same
>>> time. It also looks for the first free BSS instead of one after the last
>>> in use. It works well with current driver (which doesn't allow deleting
>>> interfaces) and should be future proof (if we ever allow deleting).
>>>
>>> Signed-off-by: Rafał Miłecki 
>>> ---
>>>  .../broadcom/brcm80211/brcmfmac/cfg80211.c | 17 ++-
>>>  .../wireless/broadcom/brcm80211/brcmfmac/core.c| 24 
>>> --
>>>  .../wireless/broadcom/brcm80211/brcmfmac/core.h|  1 -
>>>  3 files changed, 16 insertions(+), 26 deletions(-)
>>>
>>> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c 
>>> b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
>>> index 3d09d23..d00eef8 100644
>>> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
>>> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
>>> @@ -541,6 +541,21 @@ brcmf_cfg80211_update_proto_addr_mode(struct 
>>> wireless_dev *wdev)
>>>  ADDR_INDIRECT);
>>>  }
>>>
>>> +static int brcmf_get_first_free_bsscfgidx(struct brcmf_pub *drvr)
>>> +{
>>> +int bsscfgidx;
>>> +
>>> +for (bsscfgidx = 0; bsscfgidx < BRCMF_MAX_IFS; bsscfgidx++) {
>>> +/* bsscfgidx 1 is reserved for legacy P2P */
>>
>> Hi Rafał,
>>
>> A bit late as the patch is already applied, but this reserved index is
>> no longer needed as we removed all trickery that was build on the
>> assumption that the P2P_DEVICE interface was always in bsscfgidx 1.
>> Hence this could be removed.
>
> I tested STA on bsscfgidx=0, AP on bsscfgidx=1, and P2P_DEV on
> bsscfgidx=2. P2P discovery on P2P_DEV interface works as expected so we
> can indeed drop the 'bsscfgidx 1 is reserved' statement. I want to
> verify on an older device before creating a patch.

Thanks for taking look at this.

-- 
Rafał

[PATCH v1 1/1] x86/platform/intel-mid: Add Power Management Unit driver

2016-06-13 Thread Andy Shevchenko

Add Power Management Unit driver to handle power states of South Complex
devices on Intel Tangier. In the future it might be expanded to cover North
Complex devices as well.

With this driver the power state of the host controllers such as SPI, I2C,
UART, eMMC, and DMA would be managed.

Signed-off-by: Andy Shevchenko 
---
 arch/x86/include/asm/intel-mid.h |   8 +
 arch/x86/pci/intel_mid_pci.c |  35 +++-
 arch/x86/platform/intel-mid/Makefile |   2 +-
 arch/x86/platform/intel-mid/pmu.c| 392 +++
 drivers/pci/Makefile |   3 +
 drivers/pci/pci-mid.c|  77 +++
 6 files changed, 515 insertions(+), 2 deletions(-)
 create mode 100644 arch/x86/platform/intel-mid/pmu.c
 create mode 100644 drivers/pci/pci-mid.c

diff --git a/arch/x86/include/asm/intel-mid.h b/arch/x86/include/asm/intel-mid.h
index 7c5af12..0941497 100644
--- a/arch/x86/include/asm/intel-mid.h
+++ b/arch/x86/include/asm/intel-mid.h
@@ -12,9 +12,17 @@
 #define _ASM_X86_INTEL_MID_H
 
 #include 
+#include 
 #include 
 
 extern int intel_mid_pci_init(void);
+int intel_mid_pci_set_power_state(struct pci_dev *pdev, pci_power_t state);
+
+#define INTEL_MID_PMU_LSS_OFFSET   4
+#define INTEL_MID_PMU_LSS_TYPE (1 << 7)
+
+int intel_mid_pmu_get_lss_id(struct pci_dev *pdev);
+
 extern int get_gpio_by_name(const char *name);
 extern void intel_scu_device_register(struct platform_device *pdev);
 extern int __init sfi_parse_mrtc(struct sfi_table_header *table);
diff --git a/arch/x86/pci/intel_mid_pci.c b/arch/x86/pci/intel_mid_pci.c
index ae97f24..399c9d7 100644
--- a/arch/x86/pci/intel_mid_pci.c
+++ b/arch/x86/pci/intel_mid_pci.c
@@ -316,15 +316,48 @@ static void pci_d3delay_fixup(struct pci_dev *dev)
 }
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_d3delay_fixup);
 
-static void mrst_power_off_unused_dev(struct pci_dev *dev)
+static void mid_power_off_dev(struct pci_dev *dev)
 {
+   u16 pmcsr;
+
+   /*
+* Update current state first, otherwise PCI core enforces PCI_D0 in
+* pci_set_power_state() for devices which status was PCI_UNKNOWN.
+*/
+   pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
+   dev->current_state = (pci_power_t __force)(pmcsr & 
PCI_PM_CTRL_STATE_MASK);
+
pci_set_power_state(dev, PCI_D3hot);
 }
+
+static void mrst_power_off_unused_dev(struct pci_dev *dev)
+{
+   mid_power_off_dev(dev);
+}
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0801, 
mrst_power_off_unused_dev);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0809, 
mrst_power_off_unused_dev);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x080C, 
mrst_power_off_unused_dev);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0815, 
mrst_power_off_unused_dev);
 
+static void mrfld_power_off_unused_dev(struct pci_dev *dev)
+{
+   int id;
+
+   if (!pci_soc_mode)
+   return;
+
+   id = intel_mid_pmu_get_lss_id(dev);
+   if (id < 0)
+   return;
+
+   /*
+* This sets only PMCSR bits. The actual power off will happen in
+* arch/x86/platform/intel-mid/pmu.c.
+*/
+   mid_power_off_dev(dev);
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, 
mrfld_power_off_unused_dev);
+
 /*
  * Langwell devices reside at fixed offsets, don't try to move them.
  */
diff --git a/arch/x86/platform/intel-mid/Makefile 
b/arch/x86/platform/intel-mid/Makefile
index 0ce1b19..b89d150 100644
--- a/arch/x86/platform/intel-mid/Makefile
+++ b/arch/x86/platform/intel-mid/Makefile
@@ -1,4 +1,4 @@
-obj-$(CONFIG_X86_INTEL_MID) += intel-mid.o intel_mid_vrtc.o mfld.o mrfl.o
+obj-$(CONFIG_X86_INTEL_MID) += intel-mid.o intel_mid_vrtc.o mfld.o mrfl.o pmu.o
 
 # SFI specific code
 ifdef CONFIG_X86_INTEL_MID
diff --git a/arch/x86/platform/intel-mid/pmu.c 
b/arch/x86/platform/intel-mid/pmu.c
new file mode 100644
index 000..ad279dd
--- /dev/null
+++ b/arch/x86/platform/intel-mid/pmu.c
@@ -0,0 +1,392 @@
+/*
+ * Intel MID Power Management Unit device driver
+ *
+ * Copyright (C) 2016, Intel Corporation
+ *
+ * Author: Andy Shevchenko 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+/* Registers */
+#define PM_STS 0x00
+#define PM_CMD 0x04
+#define PM_ICS 0x08
+#define PM_WKC(x)  (0x10 + (x) * 4)
+#define PM_WKS(x)  (0x18 + (x) * 4)
+#define PM_SSC(x)  (0x20 + (x) * 4)
+#define PM_SSS(x)  (0x30 + (x) * 4)
+
+/* Bits in PM_STS */
+#define PM_STS_BUSY(1 << 8)
+
+/* Bits in PM_CMD */
+#define PM_CMD_CMD(x)  ((x) << 0)
+#define PM_CMD_IOC (1 << 8)
+#define PM_CMD_D3cold  (1 << 21)
+
+/*

Re: [PATCH] gcc-plugins: disable under COMPILE_TEST

2016-06-13 Thread Kees Cook

On Mon, Jun 13, 2016 at 1:40 AM, Sedat Dilek  wrote:
> On Sat, Jun 11, 2016 at 6:12 PM, Kees Cook  wrote:
>> Since adding the gcc plugin development headers is required for the
>> gcc plugin support, we should ease into this new kernel build dependency
>> more slowly. For now, disable the gcc plugins under COMPILE_TEST so that
>> all*config builds will skip it.
>>
>
> [ This might be a bit off-topic - Feel free to answer ]
>
> Hi,
>
> I want to try that new "GCC-plugin" feature.
> Do you have a Git repo for "easy-testing"?

Start with linux-next. It has the basic infrastructure. The
"latent_entropy" plugin is in my kssp tree here:
http://git.kernel.org/cgit/linux/kernel/git/kees/linux.git/log/?h=kspp/gcc-plugins/latent_entropy
though it is not the most up to date version.

> Does the kernel's build-system check for installed "gcc-plugin
> development headers"?

Yes, when the plugins have been selected.

> Which GCC versions support "gcc-plugin" feature?

gcc-4.5 and newer.

> I am here on Ubuntu/precise AMD64 and have gcc-4.6.4 and gcc-4.9.2.

I strongly recommend upgrading to Ubuntu 16.04, but regardless, using
gcc 4.9 should be fine.

> [ Optional ]
> What about the topic and support for "LLVM/Clang and hardening" of the
> Linux-kernel?

I haven't been involved in that project, sorry.

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

Re: [PATCH 4/8] ntb_perf: Wait for link before running test

2016-06-13 Thread Jiang, Dave

On Fri, 2016-06-10 at 16:54 -0600, Logan Gunthorpe wrote:
> Instead of returning immediately with an error when the link is
> down, wait for the link to come up (or the user sends a SIGINT).
> 
> This is to make scripting ntb_perf easier.
> 
> Signed-off-by: Logan Gunthorpe 
Acked-by: Dave Jiang 

> ---
>  drivers/ntb/test/ntb_perf.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/ntb/test/ntb_perf.c
> b/drivers/ntb/test/ntb_perf.c
> index 05a8705..f0784e5 100644
> --- a/drivers/ntb/test/ntb_perf.c
> +++ b/drivers/ntb/test/ntb_perf.c
> @@ -135,6 +135,7 @@ struct perf_ctx {
>   boollink_is_up;
>   struct work_struct  link_cleanup;
>   struct delayed_work link_work;
> + wait_queue_head_t   link_wq;
>   struct dentry   *debugfs_node_dir;
>   struct dentry   *debugfs_run;
>   struct dentry   *debugfs_threads;
> @@ -533,6 +534,7 @@ static void perf_link_work(struct work_struct
> *work)
>   goto out1;
>  
>   perf->link_is_up = true;
> + wake_up(&perf->link_wq);
>  
>   return;
>  
> @@ -653,7 +655,7 @@ static ssize_t debugfs_run_write(struct file
> *filp, const char __user *ubuf,
>   int node, i;
>   DECLARE_WAIT_QUEUE_HEAD(wq);
>  
> - if (!perf->link_is_up)
> + if (wait_event_interruptible(perf->link_wq, perf-
> >link_is_up))
>   return -ENOLINK;
>  
>   if (perf->perf_threads == 0)
> @@ -783,6 +785,7 @@ static int perf_probe(struct ntb_client *client,
> struct ntb_dev *ntb)
>   mutex_init(&perf->run_mutex);
>   spin_lock_init(&perf->db_lock);
>   perf_setup_mw(ntb, perf);
> + init_waitqueue_head(&perf->link_wq);
>   INIT_DELAYED_WORK(&perf->link_work, perf_link_work);
>   INIT_WORK(&perf->link_cleanup, perf_link_cleanup);
>  
> -- 
> 2.1.4
>

Re: [PATCH] gcc-plugins: disable under COMPILE_TEST

2016-06-13 Thread Kees Cook

On Mon, Jun 13, 2016 at 11:32 AM, Austin S. Hemmelgarn
 wrote:
> On 2016-06-12 20:18, Emese Revfy wrote:
>>
>> On Sun, 12 Jun 2016 15:25:39 -0700
>> Kees Cook  wrote:
>>
>>> I don't like this because it means if someone specifically selects
>>> some plugins in their .config, and the headers are missing, the kernel
>>> will successfully compile. For many plugins, this results in a kernel
>>> that lacks the requested security features, and that I really do not
>>> want to have happening. I'm okay leaving these disabled for compile
>>> tests for now. We can revisit this once more distros have plugins
>>> enabled by default.
>>
>>
>> You are right. Your patch is safer.
>>
> Why not make it so that if COMPILE_TEST is enabled, the build warns if it
> can't find the headers, otherwise it fails?  That way, people who are doing
> all*config builds but don't have the headers will still get some build
> coverage, and the people who are enabling it as a security feature will
> still get build failures.

I don't see a clear way to do this, but if you can find a way to make
that happen, please send a patch! :)

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

Re: [PATCH 3/8] ntb_perf: Return results by reading the run file

2016-06-13 Thread Jiang, Dave

On Fri, 2016-06-10 at 16:54 -0600, Logan Gunthorpe wrote:
> Instead of having to watch logs, allow the results to be retrieved
> by reading back the run file. This file will return "running" when
> the test is running and nothing if no tests have been run yet.
> It returns 1 line per thread, and will display an error message if
> the
> corresponding thread returns an error.
> 
> With the above change, the pr_info calls that returned the results
> are
> then changed to pr_debug calls.
> 
> Signed-off-by: Logan Gunthorpe 
Acked-by: Dave Jiang 

> ---
>  drivers/ntb/test/ntb_perf.c | 67
> +
>  1 file changed, 55 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/ntb/test/ntb_perf.c
> b/drivers/ntb/test/ntb_perf.c
> index db4dc61..05a8705 100644
> --- a/drivers/ntb/test/ntb_perf.c
> +++ b/drivers/ntb/test/ntb_perf.c
> @@ -123,6 +123,9 @@ struct pthr_ctx {
>   int src_idx;
>   void*srcs[MAX_SRCS];
>   wait_queue_head_t   *wq;
> + int status;
> + u64 copied;
> + u64 diff_us;
>  };
>  
>  struct perf_ctx {
> @@ -305,7 +308,7 @@ static int perf_move_data(struct pthr_ctx *pctx,
> char __iomem *dst, char *src,
>   }
>  
>   if (use_dma) {
> - pr_info("%s: All DMA descriptors submitted\n",
> current->comm);
> + pr_debug("%s: All DMA descriptors submitted\n",
> current->comm);
>   while (atomic_read(&pctx->dma_sync) != 0) {
>   if (kthread_should_stop())
>   break;
> @@ -317,13 +320,16 @@ static int perf_move_data(struct pthr_ctx
> *pctx, char __iomem *dst, char *src,
>   kdiff = ktime_sub(kstop, kstart);
>   diff_us = ktime_to_us(kdiff);
>  
> - pr_info("%s: copied %llu bytes\n", current->comm, copied);
> + pr_debug("%s: copied %llu bytes\n", current->comm, copied);
>  
> - pr_info("%s: lasted %llu usecs\n", current->comm, diff_us);
> + pr_debug("%s: lasted %llu usecs\n", current->comm, diff_us);
>  
>   perf = div64_u64(copied, diff_us);
>  
> - pr_info("%s: MBytes/s: %llu\n", current->comm, perf);
> + pr_debug("%s: MBytes/s: %llu\n", current->comm, perf);
> +
> + pctx->copied = copied;
> + pctx->diff_us = diff_us;
>  
>   return 0;
>  }
> @@ -345,7 +351,7 @@ static int ntb_perf_thread(void *data)
>   int rc, node, i;
>   struct dma_chan *dma_chan = NULL;
>  
> - pr_info("kthread %s starting...\n", current->comm);
> + pr_debug("kthread %s starting...\n", current->comm);
>  
>   node = dev_to_node(&pdev->dev);
>  
> @@ -575,19 +581,44 @@ static ssize_t debugfs_run_read(struct file
> *filp, char __user *ubuf,
>  {
>   struct perf_ctx *perf = filp->private_data;
>   char *buf;
> - ssize_t ret, out_offset;
> - int running;
> + ssize_t ret, out_off = 0;
> + struct pthr_ctx *pctx;
> + int i;
> + u64 rate;
>  
>   if (!perf)
>   return 0;
>  
> - buf = kmalloc(64, GFP_KERNEL);
> + buf = kmalloc(1024, GFP_KERNEL);
>   if (!buf)
>   return -ENOMEM;
>  
> - running = mutex_is_locked(&perf->run_mutex);
> - out_offset = snprintf(buf, 64, "%d\n", running);
> - ret = simple_read_from_buffer(ubuf, count, offp, buf,
> out_offset);
> + if (mutex_is_locked(&perf->run_mutex)) {
> + out_off = snprintf(buf, 64, "running\n");
> + goto read_from_buf;
> + }
> +
> + for (i = 0; i < MAX_THREADS; i++) {
> + pctx = &perf->pthr_ctx[i];
> +
> + if (pctx->status == -ENODATA)
> + break;
> +
> + if (pctx->status) {
> + out_off += snprintf(buf + out_off, 1024 -
> out_off,
> + "%d: error %d\n", i,
> + pctx->status);
> + continue;
> + }
> +
> + rate = div64_u64(pctx->copied, pctx->diff_us);
> + out_off += snprintf(buf + out_off, 1024 - out_off,
> + "%d: copied %llu bytes in %llu usecs, %llu
> MBytes/s\n",
> + i, pctx->copied, pctx->diff_us, rate);
> + }
> +
> +read_from_buf:
> + ret = simple_read_from_buffer(ubuf, count, offp, buf,
> out_off);
>   kfree(buf);
>  
>   return ret;
> @@ -601,12 +632,20 @@ static void threads_cleanup(struct perf_ctx
> *perf)
>   for (i = 0; i < MAX_THREADS; i++) {
>   pctx = &perf->pthr_ctx[i];
>   if (pctx->thread) {
> - kthread_stop(pctx->thread);
> + pctx->status = kthread_stop(pctx->thread);
>   pctx->thread = NULL;
>   }
>   }
>  }
>  
> +static void perf_clear_thread_status(struct perf_ctx *perf)
> +{
> + int i;
> +
> + for (i = 0; i < MAX_THREADS; i++)
> + perf->pthr_ctx[i].status

Re: [PATCH -next] mtd: nand: sunxi: fix return value check in sunxi_nfc_dma_op_prepare()

2016-06-13 Thread Boris Brezillon

On Mon, 13 Jun 2016 14:27:18 +
weiyj...@163.com wrote:

> From: Wei Yongjun 
> 
> In case of error, the function dmaengine_prep_slave_sg() returns NULL
> pointer not ERR_PTR(). The IS_ERR() test in the return value check
> should be replaced with NULL test.
> 
> Signed-off-by: Wei Yongjun 

Applied.

Thanks,

Boris

> ---
>  drivers/mtd/nand/sunxi_nand.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/mtd/nand/sunxi_nand.c b/drivers/mtd/nand/sunxi_nand.c
> index ef7f6df..653cb3a 100644
> --- a/drivers/mtd/nand/sunxi_nand.c
> +++ b/drivers/mtd/nand/sunxi_nand.c
> @@ -390,8 +390,8 @@ static int sunxi_nfc_dma_op_prepare(struct mtd_info *mtd, 
> const void *buf,
>   return -ENOMEM;
>  
>   dmad = dmaengine_prep_slave_sg(nfc->dmac, sg, 1, tdir, DMA_CTRL_ACK);
> - if (IS_ERR(dmad)) {
> - ret = PTR_ERR(dmad);
> + if (!dmad) {
> + ret = -EINVAL;
>   goto err_unmap_buf;
>   }
> 
> 



-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

Re: [PATCH V9 09/11] ARM64/PCI: ACPI support for legacy IRQs parsing and consolidation with DT code

2016-06-13 Thread Duc Dang

On Mon, Jun 13, 2016 at 3:40 AM, Lorenzo Pieralisi
 wrote:
>
> On Fri, Jun 10, 2016 at 06:36:12PM -0500, Bjorn Helgaas wrote:
> > On Fri, Jun 10, 2016 at 09:55:17PM +0200, Tomasz Nowicki wrote:
> > > To enable PCI legacy IRQs on platforms booting with ACPI, arch code
> > > should include ACPI specific callbacks that parse and set-up the
> > > device IRQ number, equivalent to the DT boot path. Owing to the current
> > > ACPI core scan handlers implementation, ACPI PCI legacy IRQs bindings
> > > cannot be parsed at device add time, since that would trigger ACPI scan
> > > handlers ordering issues depending on how the ACPI tables are defined.
> >
> > Uh, OK :)  I can't figure out exactly what the problem is here -- I
> > don't know where to look if I wanted to fix the scan handler ordering
> > issues, and I don't know how I could tell if it would ever be safe to
> > move this from driver probe-time back to device add-time.
>
> Right, the commit log could have been more informative.
>
> pcibios_add_device() was added in:
>
> commit d1e6dc91b532 ("arm64: Add architectural support for PCI")
>
> whose commit log does not specify why legacy IRQ parsing should
> be done at pcibios_add_device() either, so honestly we had to
> do with the information we have at hand.
>
> > I also notice that x86 and ia64 call acpi_pci_irq_enable() even later,
> > when the driver *enables* the device.  Is there a reason you didn't do
> > it at the same time as x86 and ia64?  This is another of those pcibios
> > hooks that really don't do anything arch-specific, so I can imagine
> > refactoring this somehow, someday.
>
> Yes, with [1], that was the goal, that stopped because [1] does not
> work on x86.
>
> Only DT platform(s) affected by this change are all platforms relying on
> drivers/pci/host/pci-xgene.c (others rely on pci_fixup_irqs() that
> should be removed too), if on those platforms probing IRQs at device
> enable time works ok I can update this patch (it can be done through [1]
> once we figure out what to do with it on x86) and move the IRQ set-up at
> pcibios_enable_device() time.
>
> @Duc: any feedback on this ?

Hi Lorenzo,

The changes to add pcibios_alloc_irq works fine on X-Gene PCIe

I also tried to remove pcibios_alloc_irq and move its code into
pcibios_enable_device
after pci_enable_resource call and legacy IRQ also works.

Can you also point me to the discussion thread or some info. about the
issue on x86 with [1]?
I want to check if there is any more test case I need to verify.

Regards,
Duc Dang.

>
> Thanks,
> Lorenzo
>
> [1] http://www.spinics.net/lists/linux-pci/msg45950.html
>
> > Did we have this conversation before?  It seems vaguely familiar, so I
> > apologize if you already explained this once.
> >
> > > To solve this problem and consolidate FW PCI legacy IRQs parsing in
> > > one single pcibios callback (pending final removal), this patch moves
> > > DT PCI IRQ parsing to the pcibios_alloc_irq() callback (called by
> > > PCI core code at device probe time) and adds ACPI PCI legacy IRQs
> > > parsing to the same callback too, so that FW PCI legacy IRQs parsing
> > > is confined in one single arch callback that can be easily removed
> > > when code parsing PCI legacy IRQs is consolidated and moved to core
> > > PCI code.
> > >
> > > Signed-off-by: Tomasz Nowicki 
> > > Suggested-by: Lorenzo Pieralisi 
> > > ---
> > >  arch/arm64/kernel/pci.c | 11 ---
> > >  1 file changed, 8 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
> > > index d5d3d26..b3b8a2c 100644
> > > --- a/arch/arm64/kernel/pci.c
> > > +++ b/arch/arm64/kernel/pci.c
> > > @@ -51,11 +51,16 @@ int pcibios_enable_device(struct pci_dev *dev, int 
> > > mask)
> > >  }
> > >
> > >  /*
> > > - * Try to assign the IRQ number from DT when adding a new device
> > > + * Try to assign the IRQ number when probing a new device
> > >   */
> > > -int pcibios_add_device(struct pci_dev *dev)
> > > +int pcibios_alloc_irq(struct pci_dev *dev)
> > >  {
> > > -   dev->irq = of_irq_parse_and_map_pci(dev, 0, 0);
> > > +   if (acpi_disabled)
> > > +   dev->irq = of_irq_parse_and_map_pci(dev, 0, 0);
> > > +#ifdef CONFIG_ACPI
> > > +   else
> > > +   return acpi_pci_irq_enable(dev);
> > > +#endif
> > >
> > > return 0;
> > >  }
> > > --
> > > 1.9.1
> > >
> >

Re: [RESEND PATCH 1/3] rfkill: Create "rfkill-airplane-mode" LED trigger

2016-06-13 Thread João Paulo Rechi Vita

On 13 June 2016 at 15:00, Pavel Machek  wrote:
> Hi!
>
>> > João, that means you should send a patch to add the ::rfkill suffix.
>> >
>>
>> IMO "airplane" (or maybe "airplane-mode") is a better suffix, as it
>> reflects the label on the machine's chassis. I'll name it
>> "asus-wireless::airplane" and send this through platform-drivers-x86,
>> as this is now contained in the platform-drivers-x86 subsystem. Thanks
>> Johannes for your patience and help designing and reviewing the rfkill
>> changes, even if not all of them made it through in the end. And
>> thanks everyone else involved for the feedback.
>
> Actually, I'd do '::rfkill', for consistency with other places in
> /sys.
>
> /sys/devices/platform/thinkpad_acpi/rfkill/rfkill1/name
> /sys/class/rfkill
> /sys/module/rfkill
>

If we use "rfkill" as a suffix, how do you expect userspace to be able
to differentiate between a LED that indicates airplane-mode (LED ON
when all radios are OFF) and a LED that indicates the state of a
specific radio like WiFi or Bluetooth (LED ON when that specific radio
is ON)? If we're going this route we should provide meaningful
information here.

--
João Paulo Rechi Vita
http://about.me/jprvita

Re: [PATCH v3 03/13] spi: sun4i: fix FIFO limit

2016-06-13 Thread Maxime Ripard

On Mon, Jun 13, 2016 at 05:46:49PM -, Michal Suchanek wrote:
> When testing SPI without DMA I noticed that filling the FIFO on the
> spi controller causes timeout.
> 
> Always leave room for one byte in the FIFO.
> 
> Signed-off-by: Michal Suchanek 

Acked-by: Maxime Ripard 

Thanks,
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


signature.asc
Description: PGP signature

Re: [PATCH v3 04/13] spi: sunxi: expose maximum transfer size limit

2016-06-13 Thread Maxime Ripard

On Mon, Jun 13, 2016 at 05:46:50PM -, Michal Suchanek wrote:
> The sun4i spi hardware can trasfer at most 63 bytes of data without DMA
> support so report the limitation. Same for sun6i.
> 
> Signed-off-by: Michal Suchanek 

Acked-by: Maxime Ripard 

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


signature.asc
Description: PGP signature

Re: [PATCH v3 00/13] sunxi spi fixes

2016-06-13 Thread Maxime Ripard

On Mon, Jun 13, 2016 at 05:46:48PM -, Michal Suchanek wrote:
> Hello,
> 
> This is update of the sunxi spi patches that should give full-featured SPI
> driver.
> 
> First three patches fix issues with the current driver and can be of use for
> stable kernels so adding cc for those.
> 
> I merged the sun4i and sun6i driver because there several issues that need to
> be fixed in both separately and they are even out of sync wrt some fixes.
> I guess some of the merge patches can be squashed.
> 
> I tested this with A10s Olinuxino Micro. I have no sun6i device so I cannot
> tell if that side was broken by this patchset - especially the last patch that
> adds DMA was afaik never tested on sun6i.

So, you didn't run that code through checkpatch and you rewrite the
whole thing entirely without even testing it... Awesome.

For the record, I'm still very much opposed to such a merge.

The first fixes are very welcome though, can and should go in.

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


signature.asc
Description: PGP signature

Re: [PATCH v3 02/13] spi: sunxi: fix transfer timeout

2016-06-13 Thread Maxime Ripard

On Mon, Jun 13, 2016 at 05:46:49PM -, Michal Suchanek wrote:
> The trasfer timeout is fixed at 1000 ms. Reading a 4Mbyte flash over
> 1MHz SPI bus takes way longer than that. Calculate the timeout from the
> actual time the transfer is supposed to take and multiply by 2 for good
> measure.
> 
> Signed-off-by: Michal Suchanek 

Acked-by: Maxime Ripard 

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


signature.asc
Description: PGP signature

Re: [PATCH v3 01/13] spi: sunxi: set maximum and minimum speed of SPI master

2016-06-13 Thread Maxime Ripard

On Mon, Jun 13, 2016 at 05:46:49PM -, Michal Suchanek wrote:
> The speed limits are unset in the sun4i and sun6i SPI drivers.
> 
> The maximum speed of SPI master is used when maximum speed of SPI slave
> is not specified. Also the __spi_validate function should check that
> transfer speeds do not exceed the master limits.
> 
> The user manual for A10 and A31 specifies maximum
> speed of the SPI clock as 100MHz and minimum as 3kHz.
> 
> Setting the SPI clock to out-of-spec values can lock up the SoC.
> 
> Signed-off-by: Michal Suchanek 
> --
> v2:
> new patch
> v3:
> fix constant style
> ---
>  drivers/spi/spi-sun4i.c | 2 ++
>  drivers/spi/spi-sun6i.c | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/spi/spi-sun4i.c b/drivers/spi/spi-sun4i.c
> index 1ddd9e2..4213508 100644
> --- a/drivers/spi/spi-sun4i.c
> +++ b/drivers/spi/spi-sun4i.c
> @@ -387,6 +387,8 @@ static int sun4i_spi_probe(struct platform_device *pdev)
>   }
>  
>   sspi->master = master;
> + master->max_speed_hz = 100 * 1000 * 1000;
> + master->min_speed_hz =  3 * 1000;
>   master->set_cs = sun4i_spi_set_cs;
>   master->transfer_one = sun4i_spi_transfer_one;
>   master->num_chipselect = 4;
> diff --git a/drivers/spi/spi-sun6i.c b/drivers/spi/spi-sun6i.c
> index 42e2c4b..fe70695 100644
> --- a/drivers/spi/spi-sun6i.c
> +++ b/drivers/spi/spi-sun6i.c
> @@ -386,6 +386,8 @@ static int sun6i_spi_probe(struct platform_device *pdev)
>   }
>  
>   sspi->master = master;
> + master->max_speed_hz = 100 * 1000 * 1000;
> + master->min_speed_hz =  3 * 1000;
>   master->set_cs = sun6i_spi_set_cs;
>   master->transfer_one = sun6i_spi_transfer_one;
>   master->num_chipselect = 4;

I really don't get why you want to do that kind of padding, when no
one does in the rest of the driver, or the rest of the kernel.

Once properly changed,
Acked-by: Maxime Ripard 

Thanks,
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


signature.asc
Description: PGP signature

Re: [very-RFC 0/8] TSN driver for the kernel

2016-06-13 Thread Richard Cochran

On Mon, Jun 13, 2016 at 01:47:13PM +0200, Richard Cochran wrote:
> 3. ALSA support for tunable AD/DA clocks.  The rate of the Listener's
>DA clock must match that of the Talker and the other Listeners.
>Either you adjust it in HW using a VCO or similar, or you do
>adaptive sample rate conversion in the application. (And that is
>another reason for *not* having a shared kernel buffer.)  For the
>Talker, either you adjust the AD clock to match the PTP time, or
>you measure the frequency offset.

Actually, we already have support for tunable clock-like HW elements,
namely the dynamic posix clock API.  It is trivial to write a driver
for VCO or the like.  I am just not too familiar with the latest high
end audio devices.

I have seen audio PLL/multiplier chips that will take, for example, a
10 kHz input and produce your 48 kHz media clock.  With the right HW
design, you can tell your PTP Hardware Clock to produce a 1 PPS,
and you will have a synchronized AVB endpoint.  The software is all
there already.  Somebody should tell the ALSA guys about it.

I don't know if ALSA has anything for sample rate conversion or not,
but haven't seen anything that addresses distributed synchronized
audio applications.

Thanks,
Richard

[ANNOUNCE] Git v2.9.0

2016-06-13 Thread Junio C Hamano

The latest feature release Git v2.9.0 is now available at the
usual places.  It is comprised of 497 non-merge commits since
v2.8.0, contributed by 75 people, 28 of which are new faces.

The tarballs are found at:

https://www.kernel.org/pub/software/scm/git/

The following public repositories all have a copy of the 'v2.9.0'
tag and the 'master' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = git://git.sourceforge.jp/gitroot/git-core/git.git
  url = git://git-core.git.sourceforge.net/gitroot/git-core/git-core
  url = https://github.com/gitster/git

New contributors whose contributions weren't in v2.8.0 are as follows.
Welcome to the Git development community!

  Alexander Rinass, Antonin, Armin Kunaschik, Benjamin Dopplinger,
  Ben Woosley, Erwan Mathoniere, Gabriel Souza Franco, Jacob
  Nisnevich, Jan Durovec, Jean-Noël Avila, Kazuki Yamaguchi,
  Keller Fuchs, Laurent Arnoud, Li Peng, Marios Titas, Mehul Jain,
  Michael Procter, Nikola Forró, Pablo Santiago Blum de Aguiar,
  Pranit Bauva, Ray Zhang, René Nyffenegger, Santiago Torres,
  Saurav Sachidanand, Shin Kojima, Sidhant Sharma [:tk], Stanislav
  Kolotinskiy, and Xiaolong Ye.

Returning contributors who helped this release are as follows.
Thanks for your continued support.

  Adam Dinwoodie, Ævar Arnfjörð Bjarmason, Alexander Kuleshov,
  Alexander Shopov, brian m. carlson, Brian Norris, Changwoo
  Ryu, Christian Couder, David Aguilar, David Turner, Dennis
  Kaarsemaker, Dimitriy Ryazantcev, Elia Pinto, Elijah Newren,
  Eric Sunshine, Eric Wong, Felipe Contreras, Jacob Keller,
  Jean-Noel Avila, Jeff King, Jiang Xin, Johannes Schindelin,
  Johannes Sixt, John Keeping, Junio C Hamano, Karsten Blees,
  Lars Schneider, Linus Torvalds, Luke Diamand, Matthieu Moy,
  Michael Haggerty, Michael J Gruber, Michael Rappazzo, Nguyễn
  Thái Ngọc Duy, Ori Avtalion, Peter Krefting, Ralf Thielow,
  Ramsay Jones, Ray Chen, René Scharfe, Stefan Beller, Stephen
  P. Smith, Sven Strickroth, SZEDER Gábor, Torsten Bögershausen,
  Trần Ngọc Quân, and Vasco Almeida.



Git 2.9 Release Notes
=

Backward compatibility notes


The end-user facing Porcelain level commands in the "git diff" and
"git log" family by default enable the rename detection; you can still
use "diff.renames" configuration variable to disable this.

Merging two branches that have no common ancestor with "git merge" is
by default forbidden now to prevent creating such an unusual merge by
mistake.

The output formats of "git log" that indents the commit log message by
4 spaces now expands HT in the log message by default.  You can use
the "--no-expand-tabs" option to disable this.

"git commit-tree" plumbing command required the user to always sign
its result when the user sets the commit.gpgsign configuration
variable, which was an ancient mistake, which this release corrects.
A script that drives commit-tree, if it relies on this mistake, now
needs to read commit.gpgsign and pass the -S option as necessary.


Updates since v2.8
--

UI, Workflows & Features

 * Comes with git-multimail 1.3.1 (in contrib/).

 * The end-user facing commands like "git diff" and "git log"
   now enable the rename detection by default.

 * The credential.helper configuration variable is cumulative and
   there is no good way to override it from the command line.  As
   a special case, giving an empty string as its value now serves
   as the signal to clear the values specified in various files.

 * A new "interactive.diffFilter" configuration can be used to
   customize the diff shown in "git add -i" sessions.

 * "git p4" now allows P4 author names to be mapped to Git author
   names.

 * "git rebase -x" can be used without passing "-i" option.

 * "git -c credential.= submodule" can now be used to
   propagate configuration variables related to credential helper
   down to the submodules.

 * "git tag" can create an annotated tag without explicitly given an
   "-a" (or "-s") option (i.e. when a tag message is given).  A new
   configuration variable, tag.forceSignAnnotated, can be used to tell
   the command to create signed tag in such a situation.

 * "git merge" used to allow merging two branches that have no common
   base by default, which led to a brand new history of an existing
   project created and then get pulled by an unsuspecting maintainer,
   which allowed an unnecessary parallel history merged into the
   existing project.  The command has been taught not to allow this by
   default, with an escape hatch "--allow-unrelated-histories" option
   to be used in a rare event that merges histories of two projects
   that started their lives independently.

 * "git pull" has been taught to pass the "--allow-unrelated-histories"
   option to underlying "git merge".

 * "git apply -v" learned to report paths

Re: [PATCH] locking/qspinlock: Use atomic_sub_return_release in queued_spin_unlock

2016-06-13 Thread Davidlohr Bueso


On Fri, 03 Jun 2016, Pan Xinhui wrote:


The existing version uses a heavy barrier while only release semantics
is required. So use atomic_sub_return_release instead.

Suggested-by: Peter Zijlstra (Intel) 
Signed-off-by: Pan Xinhui 


I just noticed this change in -tip and, while I know that saving a barrier
in core spinlock paths is perhaps a worthy exception, I cannot help but
wonder if this is the begging of the end for smp__{before,after}_atomic().

[RFC 17/18] limits: track RLIMIT_RTPRIO actual max

2016-06-13 Thread Topi Miettinen

Track maximum RT priority, presented in /proc/self/limits.

Signed-off-by: Topi Miettinen 
---
 kernel/sched/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 817d720..d31a06a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4219,6 +4219,8 @@ change:
balance_callback(rq);
preempt_enable();
 
+   task_bump_rlimit(p, RLIMIT_RTPRIO, attr->sched_priority);
+
return 0;
 }
 
-- 
2.8.1

Re: [Cocci] [PATCH 4/4] coccicheck: add indexing enhancement options

2016-06-13 Thread Julia Lawall



On Mon, 13 Jun 2016, Wolfram Sang wrote:

> 
> > Is there another scripts/coccinelle/ file I can use to test against to demo
> > against glimpse/idutils/gitgrep best?
> 
> I'd think this one may be a candidate:
> 
> scripts/coccinelle/misc/irqf_oneshot.cocci
> 
> Not too many, but quite some matches over the tree.

Seems like a reasonable choice.

julia

[RFC 18/18] proc: present VM_LOCKED memory in /proc/self/maps

2016-06-13 Thread Topi Miettinen

Add a flag to /proc/self/maps to show that the memory area is locked.

Signed-off-by: Topi Miettinen 
---
 fs/proc/task_mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4648c7f..8229509 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -313,13 +313,14 @@ show_map_vma(struct seq_file *m, struct vm_area_struct 
*vma, int is_pid)
end -= PAGE_SIZE;
 
seq_setwidth(m, 25 + sizeof(void *) * 6 - 1);
-   seq_printf(m, "%08lx-%08lx %c%c%c%c %08llx %02x:%02x %lu ",
+   seq_printf(m, "%08lx-%08lx %c%c%c%c%c %08llx %02x:%02x %lu ",
start,
end,
flags & VM_READ ? 'r' : '-',
flags & VM_WRITE ? 'w' : '-',
flags & VM_EXEC ? 'x' : '-',
flags & VM_MAYSHARE ? 's' : 'p',
+   flags & VM_LOCKED ? 'l' : '-',
pgoff,
MAJOR(dev), MINOR(dev), ino);
 
-- 
2.8.1

[RFC 13/18] limits: track RLIMIT_AS actual max

2016-06-13 Thread Topi Miettinen

Track maximum size of address space, presented in /proc/self/limits.

Signed-off-by: Topi Miettinen 
---
 mm/mmap.c   | 4 
 mm/mremap.c | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index 4e683dd..4876c21 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2706,6 +2706,9 @@ static int do_brk(unsigned long addr, unsigned long len)
 out:
perf_event_mmap(vma);
mm->total_vm += len >> PAGE_SHIFT;
+
+   bump_rlimit(RLIMIT_AS, mm->total_vm << PAGE_SHIFT);
+
mm->data_vm += len >> PAGE_SHIFT;
if (flags & VM_LOCKED)
mm->locked_vm += (len >> PAGE_SHIFT);
@@ -2926,6 +2929,7 @@ bool may_expand_vm(struct mm_struct *mm, vm_flags_t 
flags, unsigned long npages)
 void vm_stat_account(struct mm_struct *mm, vm_flags_t flags, long npages)
 {
mm->total_vm += npages;
+   bump_rlimit(RLIMIT_AS, mm->total_vm << PAGE_SHIFT);
 
if (is_exec_mapping(flags))
mm->exec_vm += npages;
diff --git a/mm/mremap.c b/mm/mremap.c
index ade3e13..6be3c01 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -397,6 +397,9 @@ static struct vm_area_struct *vma_to_resize(unsigned long 
addr,
if (vma->vm_flags & VM_LOCKED)
bump_rlimit(RLIMIT_MEMLOCK, (mm->locked_vm << PAGE_SHIFT) +
new_len - old_len);
+   bump_rlimit(RLIMIT_AS, (mm->total_vm << PAGE_SHIFT) +
+   new_len - old_len);
+
return vma;
 }
 
-- 
2.8.1

[RFC 14/18] limits: track RLIMIT_SIGPENDING actual max

2016-06-13 Thread Topi Miettinen

Track maximum number of pending signals, presented in /proc/self/limits.

Signed-off-by: Topi Miettinen 
---
 kernel/signal.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/signal.c b/kernel/signal.c
index 96e9bc4..c8fbccd 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -387,6 +387,8 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t 
flags, int override_rlimi
INIT_LIST_HEAD(&q->list);
q->flags = 0;
q->user = user;
+   /* XXX resource limits apply per task, not per user */
+   bump_rlimit(RLIMIT_SIGPENDING, atomic_read(&user->sigpending));
}
 
return q;
-- 
2.8.1

Re: [PATCH 4/4] coccicheck: add indexing enhancement options

2016-06-13 Thread Julia Lawall



On Mon, 13 Jun 2016, Luis R. Rodriguez wrote:

> On Fri, Jun 10, 2016 at 11:21:28PM +0200, Julia Lawall wrote:
> > 
> > 
> > On Fri, 10 Jun 2016, Luis R. Rodriguez wrote:
> > 
> > > On Fri, Jun 10, 2016 at 11:02:38PM +0200, Julia Lawall wrote:
> > > > 
> > > > 
> > > > On Fri, 10 Jun 2016, Luis R. Rodriguez wrote:
> > > > 
> > > > > Enable indexing optimizations heuristics. Coccinelle has
> > > > > support to make use of its own enhanced "grep" mechanisms
> > > > > instead of using regular grep for searching code 'coccigrep',
> > > > > in practice though this seems to not perform better than
> > > > > regular grep however its expected to help with some use cases
> > > > > so we use that if you have no other indexing options in place
> > > > > available.
> > > > > 
> > > > > Since git has its own index, support for using 'git grep' has been
> > > > > added to Coccinelle, that should on average perform better than
> > > > > using the internal cocci grep, and regular grep. Lastly, Coccinelle
> > > > > has had support for glimpseindex for a long while, however the
> > > > > tool was previously closed source, its now open sourced, and
> > > > > provides the best performance, so support that if we can detect
> > > > > you have a glimpse index.
> > > > > 
> > > > > These tests have been run on an 8 core system:
> > > > > 
> > > > > Before:
> > > > > 
> > > > > $ export COCCI=scripts/coccinelle/free/kfree.cocci
> > > > > $ time make coccicheck MODE=report
> > > > > 
> > > > > Before this patch with no indexing or anything:
> > > > > 
> > > > > real16m22.435s
> > > > > user128m30.060s
> > > > > sys 0m2.712s
> > > > > 
> > > > > Using coccigrep (after this patch if you have no .git):
> > > > > 
> > > > > real16m27.650s
> > > > > user128m47.904s
> > > > > sys 0m2.176s
> > > > > 
> > > > > If you have .git and therefore use gitgrep:
> > > > > 
> > > > > real16m21.220s
> > > > > user129m30.940s
> > > > > sys 0m2.060s
> > > > > 
> > > > > And if you have a .glimpse_index:
> > > > > 
> > > > > real16m14.794s
> > > > > user128m42.356s
> > > > > sys 0m1.880s
> > > > 
> > > > I don't see any convincing differences in these times.
> > > > 
> > > > I believe that Coccinelle's internal grep is always used, even with no 
> > > > option.
> > > 
> > > Ah that would explain it. This uses coccinelle 1.0.5, is the default
> > > there to use --use-coccigrep if no other index is specified ?
> > 
> > It has been the default for a long time.
> > 
> > > > I'm puzzled why glimpse gives no benefit.
> > > 
> > > Well, slightly better.
> > 
> > No, it should be much better.  You would have to look at the standard 
> > error to see if you are getting any benefit.  There should be very few 
> > occurrences of Skipping if you are really using glimpse.  In any case, if 
> > you asked for glimpse and it was not able to provide it, there should be 
> > warning messages at the top of stderr.
> 
> I'll redirect stderr to stdout by default when parmap support is used then.

Usually I put them in different files.

julia

[RFC 15/18] limits: track RLIMIT_MSGQUEUE actual max

2016-06-13 Thread Topi Miettinen

Track maximum size of message queues, presented in /proc/self/limits.

Signed-off-by: Topi Miettinen 
---
 ipc/mqueue.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index ade739f..edccf55 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -287,6 +287,8 @@ static struct inode *mqueue_get_inode(struct super_block 
*sb,
 
/* all is ok */
info->user = get_uid(u);
+   /* XXX resource limits apply per task, not per user */
+   bump_rlimit(RLIMIT_MSGQUEUE, u->mq_bytes);
} else if (S_ISDIR(mode)) {
inc_nlink(inode);
/* Some things misbehave if size == 0 on a directory */
-- 
2.8.1

[RFC 16/18] limits: track RLIMIT_NICE actual max

2016-06-13 Thread Topi Miettinen

Track maximum nice priority, presented in /proc/self/limits.

Signed-off-by: Topi Miettinen 
---
 kernel/sched/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 017d539..817d720 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3692,6 +3692,8 @@ void set_user_nice(struct task_struct *p, long nice)
if (delta < 0 || (delta > 0 && task_running(rq, p)))
resched_curr(rq);
}
+   task_bump_rlimit(p, RLIMIT_NICE, nice_to_rlimit(nice));
+
 out_unlock:
task_rq_unlock(rq, p, &rf);
 }
-- 
2.8.1

[RFC 02/18] cgroup_pids: track maximum pids

2016-06-13 Thread Topi Miettinen

Track maximum pids in the cgroup, present it in cgroup pids.current_max.

Signed-off-by: Topi Miettinen 
---
 kernel/cgroup_pids.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/kernel/cgroup_pids.c b/kernel/cgroup_pids.c
index 303097b..53fb21d 100644
--- a/kernel/cgroup_pids.c
+++ b/kernel/cgroup_pids.c
@@ -48,6 +48,7 @@ struct pids_cgroup {
 * %PIDS_MAX = (%PID_MAX_LIMIT + 1).
 */
atomic64_t  counter;
+   atomic64_t  cur_max;
int64_t limit;
 };
 
@@ -72,6 +73,7 @@ pids_css_alloc(struct cgroup_subsys_state *parent)
 
pids->limit = PIDS_MAX;
atomic64_set(&pids->counter, 0);
+   atomic64_set(&pids->cur_max, 0);
return &pids->css;
 }
 
@@ -182,6 +184,10 @@ static int pids_can_attach(struct cgroup_taskset *tset)
 
pids_charge(pids, 1);
pids_uncharge(old_pids, 1);
+   if (atomic64_read(&pids->cur_max) <
+   atomic64_read(&pids->counter))
+   atomic64_set(&pids->cur_max,
+atomic64_read(&pids->counter));
}
 
return 0;
@@ -202,6 +208,10 @@ static void pids_cancel_attach(struct cgroup_taskset *tset)
 
pids_charge(old_pids, 1);
pids_uncharge(pids, 1);
+   if (atomic64_read(&old_pids->cur_max) <
+   atomic64_read(&old_pids->counter))
+   atomic64_set(&old_pids->cur_max,
+atomic64_read(&old_pids->counter));
}
 }
 
@@ -236,6 +246,14 @@ static void pids_free(struct task_struct *task)
pids_uncharge(pids, 1);
 }
 
+static void pids_fork(struct task_struct *task)
+{
+   struct pids_cgroup *pids = css_pids(task_css(task, pids_cgrp_id));
+
+   if (atomic64_read(&pids->cur_max) < atomic64_read(&pids->counter))
+   atomic64_set(&pids->cur_max, atomic64_read(&pids->counter));
+}
+
 static ssize_t pids_max_write(struct kernfs_open_file *of, char *buf,
  size_t nbytes, loff_t off)
 {
@@ -288,6 +306,14 @@ static s64 pids_current_read(struct cgroup_subsys_state 
*css,
return atomic64_read(&pids->counter);
 }
 
+static s64 pids_current_max_read(struct cgroup_subsys_state *css,
+struct cftype *cft)
+{
+   struct pids_cgroup *pids = css_pids(css);
+
+   return atomic64_read(&pids->cur_max);
+}
+
 static struct cftype pids_files[] = {
{
.name = "max",
@@ -300,6 +326,11 @@ static struct cftype pids_files[] = {
.read_s64 = pids_current_read,
.flags = CFTYPE_NOT_ON_ROOT,
},
+   {
+   .name = "current_max",
+   .read_s64 = pids_current_max_read,
+   .flags = CFTYPE_NOT_ON_ROOT,
+   },
{ } /* terminate */
 };
 
@@ -313,4 +344,5 @@ struct cgroup_subsys pids_cgrp_subsys = {
.free   = pids_free,
.legacy_cftypes = pids_files,
.dfl_cftypes= pids_files,
+   .fork   = pids_fork,
 };
-- 
2.8.1

[RFC 09/18] limits: track RLIMIT_CORE actual max

2016-06-13 Thread Topi Miettinen

Track maximum size of core dump written, presented in /proc/self/limits.

Signed-off-by: Topi Miettinen 
---
 fs/coredump.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/coredump.c b/fs/coredump.c
index 281b768..abedc99 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -784,20 +784,25 @@ int dump_emit(struct coredump_params *cprm, const void 
*addr, int nr)
struct file *file = cprm->file;
loff_t pos = file->f_pos;
ssize_t n;
+   int r = 0;
+
if (cprm->written + nr > cprm->limit)
return 0;
while (nr) {
if (dump_interrupted())
-   return 0;
+   goto err;
n = __kernel_write(file, addr, nr, &pos);
if (n <= 0)
-   return 0;
+   goto err;
file->f_pos = pos;
cprm->written += n;
cprm->pos += n;
nr -= n;
}
-   return 1;
+   r = 1;
+ err:
+   bump_rlimit(RLIMIT_CORE, cprm->written);
+   return r;
 }
 EXPORT_SYMBOL(dump_emit);
 
-- 
2.8.1

[RFC 05/18] limits: track and present RLIMIT_NOFILE actual max

2016-06-13 Thread Topi Miettinen

Track maximum number of files for the process, present current maximum
in /proc/self/limits.

Signed-off-by: Topi Miettinen 
---
 fs/file.c |  4 
 fs/proc/base.c| 10 ++
 include/linux/sched.h |  7 +++
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index 6b1acdf..2d0d206 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -547,6 +547,8 @@ repeat:
}
 #endif
 
+   bump_rlimit(RLIMIT_NOFILE, fd);
+
 out:
spin_unlock(&files->file_lock);
return error;
@@ -857,6 +859,8 @@ __releases(&files->file_lock)
if (tofree)
filp_close(tofree, files);
 
+   bump_rlimit(RLIMIT_NOFILE, fd);
+
return fd;
 
 Ebusy:
diff --git a/fs/proc/base.c b/fs/proc/base.c
index a11eb71..227997b 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -630,8 +630,8 @@ static int proc_pid_limits(struct seq_file *m, struct 
pid_namespace *ns,
/*
 * print the file header
 */
-   seq_printf(m, "%-25s %-20s %-20s %-10s\n",
- "Limit", "Soft Limit", "Hard Limit", "Units");
+   seq_printf(m, "%-25s %-20s %-20s %-10s %-20s\n",
+  "Limit", "Soft Limit", "Hard Limit", "Units", "Max");
 
for (i = 0; i < RLIM_NLIMITS; i++) {
if (rlim[i].rlim_cur == RLIM_INFINITY)
@@ -647,9 +647,11 @@ static int proc_pid_limits(struct seq_file *m, struct 
pid_namespace *ns,
seq_printf(m, "%-20lu ", rlim[i].rlim_max);
 
if (lnames[i].unit)
-   seq_printf(m, "%-10s\n", lnames[i].unit);
+   seq_printf(m, "%-10s", lnames[i].unit);
else
-   seq_putc(m, '\n');
+   seq_printf(m, "%-10s", "");
+   seq_printf(m, "%-20lu\n",
+  task->signal->rlim_curmax[i]);
}
 
return 0;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 9c48a08..0150380 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -782,6 +782,7 @@ struct signal_struct {
 * have no need to disable irqs.
 */
struct rlimit rlim[RLIM_NLIMITS];
+   unsigned long rlim_curmax[RLIM_NLIMITS];
 
 #ifdef CONFIG_BSD_PROCESS_ACCT
struct pacct_struct pacct;  /* per-process accounting information */
@@ -3376,6 +3377,12 @@ static inline unsigned long rlimit_max(unsigned int 
limit)
return task_rlimit_max(current, limit);
 }
 
+static inline void bump_rlimit(unsigned int limit, unsigned long r)
+{
+   if (READ_ONCE(current->signal->rlim_curmax[limit]) < r)
+   current->signal->rlim_curmax[limit] = r;
+}
+
 #ifdef CONFIG_CPU_FREQ
 struct update_util_data {
void (*func)(struct update_util_data *data,
-- 
2.8.1

[RFC 04/18] device_cgroup: track and present accessed devices

2016-06-13 Thread Topi Miettinen

Track what devices are accessed and present them cgroup devices.accessed.

Signed-off-by: Topi Miettinen 
---
 security/device_cgroup.c | 70 +---
 1 file changed, 60 insertions(+), 10 deletions(-)

diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index 03c1652..45aa730 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -48,6 +48,7 @@ struct dev_exception_item {
 struct dev_cgroup {
struct cgroup_subsys_state css;
struct list_head exceptions;
+   struct list_head accessed;
enum devcg_behavior behavior;
 };
 
@@ -90,7 +91,7 @@ free_and_exit:
 /*
  * called under devcgroup_mutex
  */
-static int dev_exception_add(struct dev_cgroup *dev_cgroup,
+static int dev_exception_add(struct list_head *exceptions,
 struct dev_exception_item *ex)
 {
struct dev_exception_item *excopy, *walk;
@@ -101,7 +102,7 @@ static int dev_exception_add(struct dev_cgroup *dev_cgroup,
if (!excopy)
return -ENOMEM;
 
-   list_for_each_entry(walk, &dev_cgroup->exceptions, list) {
+   list_for_each_entry(walk, exceptions, list) {
if (walk->type != ex->type)
continue;
if (walk->major != ex->major)
@@ -115,7 +116,7 @@ static int dev_exception_add(struct dev_cgroup *dev_cgroup,
}
 
if (excopy != NULL)
-   list_add_tail_rcu(&excopy->list, &dev_cgroup->exceptions);
+   list_add_tail_rcu(&excopy->list, exceptions);
return 0;
 }
 
@@ -155,6 +156,16 @@ static void __dev_exception_clean(struct dev_cgroup 
*dev_cgroup)
}
 }
 
+static void dev_accessed_clean(struct dev_cgroup *dev_cgroup)
+{
+   struct dev_exception_item *ex, *tmp;
+
+   list_for_each_entry_safe(ex, tmp, &dev_cgroup->accessed, list) {
+   list_del_rcu(&ex->list);
+   kfree_rcu(ex, rcu);
+   }
+}
+
 /**
  * dev_exception_clean - frees all entries of the exception list
  * @dev_cgroup: dev_cgroup with the exception list to be cleaned
@@ -221,6 +232,7 @@ devcgroup_css_alloc(struct cgroup_subsys_state *parent_css)
if (!dev_cgroup)
return ERR_PTR(-ENOMEM);
INIT_LIST_HEAD(&dev_cgroup->exceptions);
+   INIT_LIST_HEAD(&dev_cgroup->accessed);
dev_cgroup->behavior = DEVCG_DEFAULT_NONE;
 
return &dev_cgroup->css;
@@ -231,6 +243,7 @@ static void devcgroup_css_free(struct cgroup_subsys_state 
*css)
struct dev_cgroup *dev_cgroup = css_to_devcgroup(css);
 
__dev_exception_clean(dev_cgroup);
+   dev_accessed_clean(dev_cgroup);
kfree(dev_cgroup);
 }
 
@@ -272,9 +285,9 @@ static void set_majmin(char *str, unsigned m)
sprintf(str, "%u", m);
 }
 
-static int devcgroup_seq_show(struct seq_file *m, void *v)
+static int devcgroup_seq_show_list(struct seq_file *m, struct dev_cgroup 
*devcgroup,
+  struct list_head *exceptions, bool allow)
 {
-   struct dev_cgroup *devcgroup = css_to_devcgroup(seq_css(m));
struct dev_exception_item *ex;
char maj[MAJMINLEN], min[MAJMINLEN], acc[ACCLEN];
 
@@ -285,14 +298,14 @@ static int devcgroup_seq_show(struct seq_file *m, void *v)
 * - List the exceptions in case the default policy is to deny
 * This way, the file remains as a "whitelist of devices"
 */
-   if (devcgroup->behavior == DEVCG_DEFAULT_ALLOW) {
+   if (allow) {
set_access(acc, ACC_MASK);
set_majmin(maj, ~0);
set_majmin(min, ~0);
seq_printf(m, "%c %s:%s %s\n", type_to_char(DEV_ALL),
   maj, min, acc);
} else {
-   list_for_each_entry_rcu(ex, &devcgroup->exceptions, list) {
+   list_for_each_entry_rcu(ex, exceptions, list) {
set_access(acc, ex->access);
set_majmin(maj, ex->major);
set_majmin(min, ex->minor);
@@ -305,6 +318,36 @@ static int devcgroup_seq_show(struct seq_file *m, void *v)
return 0;
 }
 
+static int devcgroup_seq_show(struct seq_file *m, void *v)
+{
+   struct dev_cgroup *devcgroup = css_to_devcgroup(seq_css(m));
+
+   return devcgroup_seq_show_list(m, devcgroup, &devcgroup->exceptions,
+  devcgroup->behavior == 
DEVCG_DEFAULT_ALLOW);
+}
+
+static int devcgroup_seq_show_accessed(struct seq_file *m, void *v)
+{
+   struct dev_cgroup *devcgroup = css_to_devcgroup(seq_css(m));
+
+   return devcgroup_seq_show_list(m, devcgroup, &devcgroup->accessed, 
false);
+}
+
+static void devcgroup_add_accessed(struct dev_cgroup *dev_cgroup, short type,
+  u32 major, u32 minor, short access)
+{
+   struct dev_exception_item ex;
+
+   ex.type = type;
+   ex.major = major;
+   ex.minor = minor;
+   ex.access = access;
+
+

[RFC 06/18] limits: present RLIMIT_CPU and RLIMIT_RTTIMER current status

2016-06-13 Thread Topi Miettinen

Present current cputimer status in /proc/self/limits.

Signed-off-by: Topi Miettinen 
---
 fs/proc/base.c | 26 --
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 227997b..1df4fc8 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -650,8 +650,30 @@ static int proc_pid_limits(struct seq_file *m, struct 
pid_namespace *ns,
seq_printf(m, "%-10s", lnames[i].unit);
else
seq_printf(m, "%-10s", "");
-   seq_printf(m, "%-20lu\n",
-  task->signal->rlim_curmax[i]);
+
+   switch (i) {
+   case RLIMIT_RTTIME:
+   case RLIMIT_CPU:
+   if (rlim[i].rlim_max == RLIM_INFINITY)
+   seq_printf(m, "%-20s\n", "-");
+   else {
+   unsigned long long utime, ptime;
+   unsigned long psecs;
+   struct task_cputime cputime;
+
+   thread_group_cputimer(task, &cputime);
+   utime = cputime_to_expires(cputime.utime);
+   ptime = utime + 
cputime_to_expires(cputime.stime);
+   psecs = cputime_to_secs(ptime);
+   if (i == RLIMIT_RTTIME)
+   psecs *= USEC_PER_SEC;
+   seq_printf(m, "%-20lu\n", psecs);
+   }
+   break;
+   default:
+   seq_printf(m, "%-20lu\n",
+  task->signal->rlim_curmax[i]);
+   }
}
 
return 0;
-- 
2.8.1

[RFC 08/18] limits: track RLIMIT_DATA actual max

2016-06-13 Thread Topi Miettinen

Track maximum size of data VM, presented in /proc/self/limits.

Signed-off-by: Topi Miettinen 
---
 arch/x86/ia32/ia32_aout.c | 1 +
 fs/binfmt_aout.c  | 1 +
 fs/binfmt_flat.c  | 1 +
 kernel/sys.c  | 2 ++
 mm/mmap.c | 6 +-
 5 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c
index cb26f18..8a7d502 100644
--- a/arch/x86/ia32/ia32_aout.c
+++ b/arch/x86/ia32/ia32_aout.c
@@ -398,6 +398,7 @@ beyond_if:
regs->r8 = regs->r9 = regs->r10 = regs->r11 =
regs->r12 = regs->r13 = regs->r14 = regs->r15 = 0;
set_fs(USER_DS);
+   bump_limit(RLIMIT_DATA, ex.a_data + ex.a_bss);
return 0;
 }
 
diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c
index ae1b540..86c6548 100644
--- a/fs/binfmt_aout.c
+++ b/fs/binfmt_aout.c
@@ -330,6 +330,7 @@ beyond_if:
regs->gp = ex.a_gpvalue;
 #endif
start_thread(regs, ex.a_entry, current->mm->start_stack);
+   bump_limit(RLIMIT_DATA, ex.a_data + ex.a_bss);
return 0;
 }
 
diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c
index caf9e39..e309dad 100644
--- a/fs/binfmt_flat.c
+++ b/fs/binfmt_flat.c
@@ -792,6 +792,7 @@ static int load_flat_file(struct linux_binprm * bprm,
libinfo->lib_list[id].start_brk) +  /* start brk */
stack_len);
 
+   bump_limit(RLIMIT_DATA, data_len + bss_len);
return 0;
 err:
return ret;
diff --git a/kernel/sys.c b/kernel/sys.c
index 89d5be4..6629f6f 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1896,6 +1896,8 @@ static int prctl_set_mm_map(int opt, const void __user 
*addr, unsigned long data
if (prctl_map.auxv_size)
memcpy(mm->saved_auxv, user_auxv, sizeof(user_auxv));
 
+   bump_limit(RLIMIT_DATA, mm->end_data - mm->start_data);
+
up_write(&mm->mmap_sem);
return 0;
 }
diff --git a/mm/mmap.c b/mm/mmap.c
index de2c176..61867de 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -228,6 +228,8 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
goto out;
 
 set_brk:
+   bump_rlimit(RLIMIT_DATA, (brk - mm->start_brk) +
+   (mm->end_data - mm->start_data));
mm->brk = brk;
populate = newbrk > oldbrk && (mm->def_flags & VM_LOCKED) != 0;
up_write(&mm->mmap_sem);
@@ -2924,8 +2926,10 @@ void vm_stat_account(struct mm_struct *mm, vm_flags_t 
flags, long npages)
mm->exec_vm += npages;
else if (is_stack_mapping(flags))
mm->stack_vm += npages;
-   else if (is_data_mapping(flags))
+   else if (is_data_mapping(flags)) {
mm->data_vm += npages;
+   bump_rlimit(RLIMIT_DATA, mm->data_vm << PAGE_SHIFT);
+   }
 }
 
 static int special_mapping_fault(struct vm_area_struct *vma,
-- 
2.8.1

[RFC 03/18] memcontrol: present maximum used memory also for cgroup-v2

2016-06-13 Thread Topi Miettinen

Present maximum used memory in cgroup memory.current_max.

Signed-off-by: Topi Miettinen 
---
 include/linux/page_counter.h |  7 ++-
 mm/memcontrol.c  | 13 +
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/linux/page_counter.h b/include/linux/page_counter.h
index 7e62920..be4de17 100644
--- a/include/linux/page_counter.h
+++ b/include/linux/page_counter.h
@@ -9,9 +9,9 @@ struct page_counter {
atomic_long_t count;
unsigned long limit;
struct page_counter *parent;
+   unsigned long watermark;
 
/* legacy */
-   unsigned long watermark;
unsigned long failcnt;
 };
 
@@ -34,6 +34,11 @@ static inline unsigned long page_counter_read(struct 
page_counter *counter)
return atomic_long_read(&counter->count);
 }
 
+static inline unsigned long page_counter_read_watermark(struct page_counter 
*counter)
+{
+   return counter->watermark;
+}
+
 void page_counter_cancel(struct page_counter *counter, unsigned long nr_pages);
 void page_counter_charge(struct page_counter *counter, unsigned long nr_pages);
 bool page_counter_try_charge(struct page_counter *counter,
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 75e7440..5513771 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4966,6 +4966,14 @@ static u64 memory_current_read(struct 
cgroup_subsys_state *css,
return (u64)page_counter_read(&memcg->memory) * PAGE_SIZE;
 }
 
+static u64 memory_current_max_read(struct cgroup_subsys_state *css,
+  struct cftype *cft)
+{
+   struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+
+   return (u64)page_counter_read_watermark(&memcg->memory) * PAGE_SIZE;
+}
+
 static int memory_low_show(struct seq_file *m, void *v)
 {
struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
@@ -5179,6 +5187,11 @@ static struct cftype memory_files[] = {
.read_u64 = memory_current_read,
},
{
+   .name = "current_max",
+   .flags = CFTYPE_NOT_ON_ROOT,
+   .read_u64 = memory_current_max_read,
+   },
+   {
.name = "low",
.flags = CFTYPE_NOT_ON_ROOT,
.seq_show = memory_low_show,
-- 
2.8.1

[RFC 07/18] limits: track RLIMIT_FSIZE actual max

2016-06-13 Thread Topi Miettinen

Track maximum file size, presented in /proc/self/limits.

Signed-off-by: Topi Miettinen 
---
 fs/attr.c| 2 ++
 mm/filemap.c | 1 +
 2 files changed, 3 insertions(+)

diff --git a/fs/attr.c b/fs/attr.c
index 25b24d0..1b620f7 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -116,6 +116,8 @@ int inode_newsize_ok(const struct inode *inode, loff_t 
offset)
return -ETXTBSY;
}
 
+   bump_rlimit(RLIMIT_FSIZE, offset);
+
return 0;
 out_sig:
send_sig(SIGXFSZ, current, 0);
diff --git a/mm/filemap.c b/mm/filemap.c
index 00ae878..1fa9864 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2447,6 +2447,7 @@ inline ssize_t generic_write_checks(struct kiocb *iocb, 
struct iov_iter *from)
send_sig(SIGXFSZ, current, 0);
return -EFBIG;
}
+   bump_rlimit(RLIMIT_FSIZE, iocb->ki_pos);
iov_iter_truncate(from, limit - (unsigned long)pos);
}
 
-- 
2.8.1

[RFC 11/18] limits: track and present RLIMIT_NPROC actual max

2016-06-13 Thread Topi Miettinen

Track maximum number of processes per user and present it
in /proc/self/limits.

Signed-off-by: Topi Miettinen 
---
 fs/proc/base.c| 4 
 include/linux/sched.h | 1 +
 kernel/fork.c | 5 +
 kernel/sys.c  | 5 +
 4 files changed, 15 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 1df4fc8..02576c6 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -670,6 +670,10 @@ static int proc_pid_limits(struct seq_file *m, struct 
pid_namespace *ns,
seq_printf(m, "%-20lu\n", psecs);
}
break;
+   case RLIMIT_NPROC:
+   seq_printf(m, "%-20d\n",
+  
atomic_read(&task->real_cred->user->max_processes));
+   break;
default:
seq_printf(m, "%-20lu\n",
   task->signal->rlim_curmax[i]);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0150380..feb9bb7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -838,6 +838,7 @@ static inline int signal_group_exit(const struct 
signal_struct *sig)
 struct user_struct {
atomic_t __count;   /* reference count */
atomic_t processes; /* How many processes does this user have? */
+   atomic_t max_processes; /* How many processes has this user had at the 
same time? */
atomic_t sigpending;/* How many pending signals does this user 
have? */
 #ifdef CONFIG_INOTIFY_USER
atomic_t inotify_watches; /* How many inotify watches does this user 
have? */
diff --git a/kernel/fork.c b/kernel/fork.c
index 5c2c355..667290f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1653,6 +1653,11 @@ static struct task_struct *copy_process(unsigned long 
clone_flags,
trace_task_newtask(p, clone_flags);
uprobe_copy_process(p, clone_flags);
 
+   if (atomic_read(&p->real_cred->user->max_processes) <
+   atomic_read(&p->real_cred->user->processes))
+   atomic_set(&p->real_cred->user->max_processes,
+  atomic_read(&p->real_cred->user->processes));
+
return p;
 
 bad_fork_cancel_cgroup:
diff --git a/kernel/sys.c b/kernel/sys.c
index 6629f6f..955cf21 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -439,6 +439,11 @@ static int set_user(struct cred *new)
else
current->flags &= ~PF_NPROC_EXCEEDED;
 
+   if (atomic_read(&new_user->max_processes) <
+   atomic_read(&new_user->processes))
+   atomic_set(&new_user->max_processes,
+  atomic_read(&new_user->processes));
+
free_uid(new->user);
new->user = new_user;
return 0;
-- 
2.8.1

[RFC 10/18] limits: track RLIMIT_STACK actual max

2016-06-13 Thread Topi Miettinen

Track maximum stack size, presented in /proc/self/limits.

Signed-off-by: Topi Miettinen 
---
 mm/mmap.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index 61867de..0963e7f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2019,6 +2019,8 @@ static int acct_stack_growth(struct vm_area_struct *vma, 
unsigned long size, uns
if (security_vm_enough_memory_mm(mm, grow))
return -ENOMEM;
 
+   bump_rlimit(RLIMIT_STACK, actual_size);
+
return 0;
 }
 
-- 
2.8.1

Re: [PATCH 0/2] ARM: dts: NSP: add PL330 support and XMC board

2016-06-13 Thread Florian Fainelli

On 06/07/2016 03:28 PM, Jon Mason wrote:
> Add support for the PL330 DMA engine and the XMC form factor for
> Broadcom Northstar Plus SoCs
> 
> 
> Jon Mason (2):
>   ARM: dts: NSP: Add XMC board support
>   ARM: dts: NSP: Add PL330 support

Series applied, thanks Jon
-- 
Florian

[PATCH] media: s5p-mfc fix memory leak in s5p_mfc_remove()

2016-06-13 Thread Shuah Khan

s5p_mfc_remove() fails to release encoder and decoder video devices.

Signed-off-by: Shuah Khan 
Reviewed-by: Javier Martinez Canillas 
---

Changes since v1:
- Addressed comments from Javier Martinez Canillas and added
  his reviewed by:

 drivers/media/platform/s5p-mfc/s5p_mfc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/media/platform/s5p-mfc/s5p_mfc.c 
b/drivers/media/platform/s5p-mfc/s5p_mfc.c
index 274b4f1..f537b74 100644
--- a/drivers/media/platform/s5p-mfc/s5p_mfc.c
+++ b/drivers/media/platform/s5p-mfc/s5p_mfc.c
@@ -1318,6 +1318,8 @@ static int s5p_mfc_remove(struct platform_device *pdev)
 
video_unregister_device(dev->vfd_enc);
video_unregister_device(dev->vfd_dec);
+   video_device_release(dev->vfd_enc);
+   video_device_release(dev->vfd_dec);
v4l2_device_unregister(&dev->v4l2_dev);
s5p_mfc_release_firmware(dev);
vb2_dma_contig_cleanup_ctx(dev->alloc_ctx[0]);
-- 
2.7.4

[RFC 01/18] capabilities: track actually used capabilities

2016-06-13 Thread Topi Miettinen

Track what capabilities are actually used and present the current
situation in /proc/self/status.

Signed-off-by: Topi Miettinen 
---
 fs/exec.c | 1 +
 fs/proc/array.c   | 1 +
 include/linux/sched.h | 1 +
 kernel/capability.c   | 1 +
 4 files changed, 4 insertions(+)

diff --git a/fs/exec.c b/fs/exec.c
index 887c1c9..ff6f644 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1269,6 +1269,7 @@ void setup_new_exec(struct linux_binprm * bprm)
if (bprm->interp_flags & BINPRM_FLAGS_ENFORCE_NONDUMP)
set_dumpable(current->mm, suid_dumpable);
}
+   cap_clear(current->cap_used);
 
/* An exec changes our domain. We are no longer part of the thread
   group */
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 88c7de1..9ee 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -343,6 +343,7 @@ static inline void task_cap(struct seq_file *m, struct 
task_struct *p)
render_cap_t(m, "CapEff:\t", &cap_effective);
render_cap_t(m, "CapBnd:\t", &cap_bset);
render_cap_t(m, "CapAmb:\t", &cap_ambient);
+   render_cap_t(m, "CapUsd:\t", &p->cap_used);
 }
 
 static inline void task_seccomp(struct seq_file *m, struct task_struct *p)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6e42ada..9c48a08 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1918,6 +1918,7 @@ struct task_struct {
 #ifdef CONFIG_MMU
struct task_struct *oom_reaper_list;
 #endif
+   kernel_cap_tcap_used;   /* Capabilities actually used */
 /* CPU-specific state of this task */
struct thread_struct thread;
 /*
diff --git a/kernel/capability.c b/kernel/capability.c
index 45432b5..aad8854 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -380,6 +380,7 @@ bool ns_capable(struct user_namespace *ns, int cap)
}
 
if (security_capable(current_cred(), ns, cap) == 0) {
+   cap_raise(current->cap_used, cap);
current->flags |= PF_SUPERPRIV;
return true;
}
-- 
2.8.1

[RFC 00/18] Present useful limits to user

2016-06-13 Thread Topi Miettinen

Hello,

There are many basic ways to control processes, including capabilities,
cgroups and resource limits. However, there are far fewer ways to find out
useful values for the limits, except blind trial and error.

This patch series attempts to fix that by giving at least a nice starting
point from the actual maximum values. I looked where each limit is checked
and added a call to limit bump nearby.


Capabilities
[RFC 01/18] capabilities: track actually used capabilities

Currently, there is no way to know which capabilities are actually used. Even
the source code is only implicit, in-depth knowledge of each capability must
be used when analyzing a program to judge which capabilities the program will
exercise.
 
Cgroups
[RFC 02/18] cgroup_pids: track maximum pids
[RFC 03/18] memcontrol: present maximum used memory also for
[RFC 04/18] device_cgroup: track and present accessed devices

For tasks and memory cgroup limits the situation is somewhat better as the
current tasks and memory status can be easily seen with ps(1). However, any
transient tasks or temporary higher memory use might slip from the view.
Device use may be seen with advanced MAC tools, like TOMOYO, but there is no
universal method. Program sources typically give no useful indication about
memory use or how many tasks there could be.
 
Resource limits
[RFC 05/18] limits: track and present RLIMIT_NOFILE actual max
[RFC 06/18] limits: present RLIMIT_CPU and RLIMIT_RTTIMER current
[RFC 07/18] limits: track RLIMIT_FSIZE actual max
[RFC 08/18] limits: track RLIMIT_DATA actual max
[RFC 09/18] limits: track RLIMIT_CORE actual max
[RFC 10/18] limits: track RLIMIT_STACK actual max
[RFC 11/18] limits: track and present RLIMIT_NPROC actual max
[RFC 12/18] limits: track RLIMIT_MEMLOCK actual max
[RFC 13/18] limits: track RLIMIT_AS actual max
[RFC 14/18] limits: track RLIMIT_SIGPENDING actual max
[RFC 15/18] limits: track RLIMIT_MSGQUEUE actual max
[RFC 16/18] limits: track RLIMIT_NICE actual max
[RFC 17/18] limits: track RLIMIT_RTPRIO actual max
[RFC 18/18] proc: present VM_LOCKED memory in /proc/self/maps

Current number of files and current VM usage (data pages, address space size)
could be calculated from available /proc files. Again, any temporarily higher
values could be easily missed. For many limits, there is no way to see what
is the current situation and source code is mostly useless.

As a side note, the resouce limits seem to be in bad shape. For example,
RLIMIT_MEMLOCK is used incoherently and I think VM statistics can miss
some changes. Adding RLIMIT_CODE could be useful.

The current maximum values for the resource limits are now shown in
/proc/task/limits. If this is deemed too confusing for the existing
programs which rely on the exact format, I can change that to a new file.


Finally, the patches work in my testing but I have probably missed finer
lock/RCU details.

-Topi

< 1 2 3 4 5 6 7 8 9 10 >

301 - 400 of 1185 matches

Mail list logo