Re: [PATCH 2/2] security,perf: Allow further restriction of perf_event_open

2016-06-16 Thread Alexander Shishkin
Ben Hutchings  writes:

> When kernel.perf_event_open is set to 3 (or greater), disallow all
> access to performance events by users without CAP_SYS_ADMIN.
> Add a Kconfig symbol CONFIG_SECURITY_PERF_EVENTS_RESTRICT that
> makes this value the default.

So this patch does two things, can it then be made into two patches?

>
> This is based on a similar feature in grsecurity
> (CONFIG_GRKERNSEC_PERF_HARDEN).  This version doesn't include making
> the variable read-only.  It also allows enabling further restriction
> at run-time regardless of whether the default is changed.

This paragraph doesn't seem to belong in the commit message.

What this commit message is missing entirely is the rationale behind
this change other than "grsecurity does the same". Can you please
elaborate?

> Signed-off-by: Ben Hutchings 
> ---
> I made a similar change to Debian's kernel packages in August,
> including the more restrictive default, and no-one has complained yet.

As a debian user, is this a good place to complain? Because it does get
it the way.

Thanks,
--
Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Documentation: fix wrong value in md.txt

2016-06-16 Thread Tiezhu Yang
In the current Documentation/md.txt, the lower limit value of
stripe_cache_size is 16 and the default value is 128, but when
I update kernel to the latest mainline version and RAID5 array
is created by mdadm, then execute the following commands, it
shows an error and a difference respectively.

1) set stripe_cache_size to 16
[root@localhost ~]# echo 16 > /sys/block/md0/md/stripe_cache_size
bash: echo: write error: Invalid argument
2) read the default value of stripe_cache_size
[root@localhost ~]# cat /sys/block/md0/md/stripe_cache_size
256

I read drivers/md/raid5.c and find the following related code:
1) in function 'raid5_set_cache_size':
if (size <= 16 || size > 32768)
return -EINVAL;
2) #define NR_STRIPES   256

So the lower limit value of stripe_cache_size should be 17 and
the default value should be 256.

Signed-off-by: Tiezhu Yang 
---
 Documentation/md.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/md.txt b/Documentation/md.txt
index 1a2ada4..d6e2fcf 100644
--- a/Documentation/md.txt
+++ b/Documentation/md.txt
@@ -602,7 +602,7 @@ These currently include
 
   stripe_cache_size  (currently raid5 only)
   number of entries in the stripe cache.  This is writable, but
-  there are upper and lower limits (32768, 16).  Default is 128.
+  there are upper and lower limits (32768, 17).  Default is 256.
   strip_cache_active (currently raid5 only)
   number of active entries in the stripe cache
   preread_bypass_threshold (currently raid5 only)
-- 
1.8.3.1

[PATCH v2 2/5] firmware: annotate thou shalt not request fw on init or probe

2016-06-16 Thread Luis R. Rodriguez
Thou shalt not make firmware calls early on init or probe.

systemd already ripped support out for the usermode helper
a while ago, there are still users that require the usermode helper,
however systemd's use of the usermode helper exacerbated a long
lasting issue of the fact that we have many drivers that load
firmware on init or probe. Independent of the usermode helper
loading firmware on init or probe is a bad idea for a few reasons.

When the firmware is read directly from the filesystem by the kernel,
if the driver is built-in to the kernel the firmware may not yet be
available, for such uses one could use initramfs and stuff the firmware
inside, or one also use CONFIG_EXTRA_FIRMWARE; however not all distributions
use this, as such generally one cannot count on this. There is another
corner cases to consider, since we are accessing the firmware directly folks
cannot expect new found firmware on a filesystem after we switch off from
an initramfs with pivot_root().

Supporting such situations is possible today but fixing it for good is
really complex due to the large amount of variablity in the boot up
process.

Instead just document the expectations properly and add a grammar rule to
enable folks to check / validate / police if drivers are using the request
firmware API early on init or probe.

The SmPL rule used to check for the probe routine is loose and is
currently defined through a regexp, that can easily be extended to any
other known bus probe routine names. It also uses the new Python
iteration support which allows us to backtrack from a request_firmware*()
call back to a possible probe or init, iteratively. Without iteration
we would only be able to get reports for callers who directly use the
request_firmware*() API on the initial probe or init routine.

There are 4 offenders at this time:

mcgrof@ergon ~/linux-next (git::20160609)$ export 
COCCI=scripts/coccinelle/api/request_firmware.cocci
mcgrof@ergon ~/linux-next (git::20160609)$ make coccicheck MODE=report

drivers/fmc/fmc-fakedev.c: ERROR: driver call request firmware call on its init 
routine on line 321.
drivers/fmc/fmc-write-eeprom.c: ERROR: driver call request firmware call on its 
probe routine on line 136.
drivers/tty/serial/rp2.c: ERROR: driver call request firmware call on its probe 
routine on line 796.
drivers/tty/serial/ucc_uart.c: ERROR: driver call request firmware call on its 
probe routine on line 1246.

I checked and verified all these are valid reports. This list also matches
the same list as in 20150805, so we have fortunately not gotten worse.
Let's keep it that way and also fix these drivers.

v2: changes from v1 [0]:

o This now supports iteration, this extends our coverage on the report

o Update documentation and commit log to accept the fate of not being
  able to remove the usermode helper.

[0] 
https://lkml.kernel.org/r/1440811107-861-1-git-send-email-mcg...@do-not-panic.com

Cc: Alessandro Rubini 
Cc: Kevin Cernekee 
Cc: Daniel Vetter 
Cc: Mimi Zohar 
Cc: David Woodhouse 
Cc: Kees Cook 
Cc: Dmitry Torokhov 
Cc: Ming Lei 
Cc: Jonathan Corbet 
Cc: Julia Lawall 
Cc: Gilles Muller 
Cc: Nicolas Palix 
Cc: Thierry Martinez 
Cc: Michal Marek 
Cc: co...@systeme.lip6.fr
Cc: Alessandro Rubini 
Cc: Kevin Cernekee 
Cc: Greg Kroah-Hartman 
Cc: Jiri Slaby 
Cc: linux-ser...@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-ser...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Signed-off-by: Luis R. Rodriguez 
---
 Documentation/firmware_class/README|  20 
 drivers/base/Kconfig   |   2 +-
 .../request_firmware-avoid-init-probe-init.cocci   | 130 +
 3 files changed, 151 insertions(+), 1 deletion(-)
 create mode 100644 
scripts/coccinelle/api/request_firmware-avoid-init-probe-init.cocci

diff --git a/Documentation/firmware_class/README 
b/Documentation/firmware_class/README
index cafdca8b3b15..056d1cb9d365 100644
--- a/Documentation/firmware_class/README
+++ b/Documentation/firmware_class/README
@@ -93,6 +93,26 @@
user contexts to request firmware asynchronously, but can't be called
in atomic contexts.
 
+Requirements:
+=
+
+You should avoid at all costs requesting firmware on both init and probe paths
+of your device driver. Reason for this is the complexity needed to ensure a
+firmware will be available for a driver early in boot through different
+build configurations. Consider built-in drivers needing firmware early, or
+consider a driver assuming it will only get firmware after pivot_root().
+
+Drivers that really need 

Re: [kernel-hardening] [PATCH 2/2] security,perf: Allow further restriction of perf_event_open

2016-06-16 Thread Kees Cook
Hi guys,

This patch wasn't originally CCed to you (I'm fixing that now). Would
you consider taking this into the perf tree? It's been in active use
in both Debian and Android for a while now.

(If need be, I can resend it.)

Thanks!

-Kees

On Sat, Jun 4, 2016 at 1:49 PM, Jeffrey Vander Stoep  wrote:
> Acked-by: Jeff Vander Stoep 
>
> In addition to Debian, this patch has been merged into AOSP and is a
> requirement for Android:
> https://android-review.googlesource.com/#/q/topic:CONFIG_SECURITY_PERF_EVENTS_RESTRICT
>
>
> On Wed, Apr 13, 2016 at 9:12 AM, Kees Cook  wrote:
>> On Mon, Jan 11, 2016 at 7:23 AM, Ben Hutchings 
>> wrote:
>>> When kernel.perf_event_open is set to 3 (or greater), disallow all
>>> access to performance events by users without CAP_SYS_ADMIN.
>>> Add a Kconfig symbol CONFIG_SECURITY_PERF_EVENTS_RESTRICT that
>>> makes this value the default.
>>>
>>> This is based on a similar feature in grsecurity
>>> (CONFIG_GRKERNSEC_PERF_HARDEN).  This version doesn't include making
>>> the variable read-only.  It also allows enabling further restriction
>>> at run-time regardless of whether the default is changed.
>>>
>>> Signed-off-by: Ben Hutchings 
>>
>> Whoops, I entirely missed this email! Just found it now.
>>
>> Ben, can you resend this with Perf maintainers in CC? This seems
>> sensible enough to me.
>>
>> -Kees
>>
>>> ---
>>> I made a similar change to Debian's kernel packages in August,
>>> including the more restrictive default, and no-one has complained yet.
>>>
>>> Ben.
>>>
>>>  Documentation/sysctl/kernel.txt | 4 +++-
>>>  include/linux/perf_event.h  | 5 +
>>>  kernel/events/core.c| 8 
>>>  security/Kconfig| 9 +
>>>  4 files changed, 25 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/Documentation/sysctl/kernel.txt
>>> b/Documentation/sysctl/kernel.txt
>>> index 88a2c8e..76e2ca8 100644
>>> --- a/Documentation/sysctl/kernel.txt
>>> +++ b/Documentation/sysctl/kernel.txt
>>> @@ -629,12 +629,14 @@ allowed to execute.
>>>  perf_event_paranoid:
>>>
>>>  Controls use of the performance events system by unprivileged
>>> -users (without CAP_SYS_ADMIN).  The default value is 1.
>>> +users (without CAP_SYS_ADMIN).  The default value is 3 if
>>> +CONFIG_SECURITY_PERF_EVENTS_RESTRICT is set, or 1 otherwise.
>>>
>>>   -1: Allow use of (almost) all events by all users
>>>  >=0: Disallow raw tracepoint access by users without CAP_IOC_LOCK
>>>  >=1: Disallow CPU event access by users without CAP_SYS_ADMIN
>>>  >=2: Disallow kernel profiling by users without CAP_SYS_ADMIN
>>> +>=3: Disallow all event access by users without CAP_SYS_ADMIN
>>>
>>>  ==
>>>
>>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>>> index f9828a4..aa72940 100644
>>> --- a/include/linux/perf_event.h
>>> +++ b/include/linux/perf_event.h
>>> @@ -989,6 +989,11 @@ extern int perf_cpu_time_max_percent_handler(struct
>>> ctl_table *table, int write,
>>> loff_t *ppos);
>>>
>>>
>>> +static inline bool perf_paranoid_any(void)
>>> +{
>>> +   return sysctl_perf_event_paranoid > 2;
>>> +}
>>> +
>>>  static inline bool perf_paranoid_tracepoint_raw(void)
>>>  {
>>> return sysctl_perf_event_paranoid > -1;
>>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>>> index cfc227c..85bc810 100644
>>> --- a/kernel/events/core.c
>>> +++ b/kernel/events/core.c
>>> @@ -175,8 +175,13 @@ static struct srcu_struct pmus_srcu;
>>>   *   0 - disallow raw tracepoint access for unpriv
>>>   *   1 - disallow cpu events for unpriv
>>>   *   2 - disallow kernel profiling for unpriv
>>> + *   3 - disallow all unpriv perf event use
>>>   */
>>> +#ifdef CONFIG_SECURITY_PERF_EVENTS_RESTRICT
>>> +int sysctl_perf_event_paranoid __read_mostly = 3;
>>> +#else
>>>  int sysctl_perf_event_paranoid __read_mostly = 1;
>>> +#endif
>>>
>>>  /* Minimum for 512 kiB + 1 user control page */
>>>  int sysctl_perf_event_mlock __read_mostly = 512 + (PAGE_SIZE / 1024); /*
>>> 'free' kiB per user */
>>> @@ -8265,6 +8270,9 @@ SYSCALL_DEFINE5(perf_event_open,
>>> if (flags & ~PERF_FLAG_ALL)
>>> return -EINVAL;
>>>
>>> +   if (perf_paranoid_any() && !capable(CAP_SYS_ADMIN))
>>> +   return -EACCES;
>>> +
>>> err = perf_copy_attr(attr_uptr, );
>>> if (err)
>>> return err;
>>> diff --git a/security/Kconfig b/security/Kconfig
>>> index e452378..30a2603 100644
>>> --- a/security/Kconfig
>>> +++ b/security/Kconfig
>>> @@ -18,6 +18,15 @@ config SECURITY_DMESG_RESTRICT
>>>
>>>   If you are unsure how to answer this question, answer N.
>>>
>>> +config SECURITY_PERF_EVENTS_RESTRICT
>>> +   bool "Restrict unprivileged use of performance events"
>>> +   depends on PERF_EVENTS
>>> +   help
>>> + If you say Y here, the 

Re: [PATCHv3 4/7] Documentation: dt: socfpga: Add Arria10 Ethernet binding

2016-06-16 Thread Thor Thayer

Hi Rob,

On 06/16/2016 01:39 PM, Rob Herring wrote:

On Mon, Jun 13, 2016 at 04:19:09PM -0500, ttha...@opensource.altera.com wrote:

From: Thor Thayer 

Add the device tree bindings needed to support the Altera Ethernet
FIFO buffers on the Arria10 chip.

Signed-off-by: Thor Thayer 
---
v2  No Change
v3  Change to common compatible string based on maintainer comments
 Add local IRQ values.
---
  .../bindings/arm/altera/socfpga-eccmgr.txt |   24 
  1 file changed, 24 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/altera/socfpga-eccmgr.txt 
b/Documentation/devicetree/bindings/arm/altera/socfpga-eccmgr.txt
index 15eb0df..e9febbb 100644
--- a/Documentation/devicetree/bindings/arm/altera/socfpga-eccmgr.txt
+++ b/Documentation/devicetree/bindings/arm/altera/socfpga-eccmgr.txt
@@ -82,6 +82,14 @@ Required Properties:
  - interrupts : Should be single bit error interrupt, then double bit error
interrupt, in this order.

+Ethernet FIFO ECC
+Required Properties:
+- compatible : Should be "altr,socfpga-eth-mac-ecc"
+- reg: Address and size for ECC block registers.



parent is too vague. How about altr,ethernet-mac.


OK. I'll change to this.

+- parent : phandle to parent altr,ethernet-mac node

Thanks for reviewing!


+- interrupts : Should be single bit error interrupt, then double bit error
+   interrupt, in this order.
+
  Example:

eccmgr: eccmgr@ffd06000 {
@@ -108,4 +116,20 @@ Example:
interrupts = <1 IRQ_TYPE_LEVEL_HIGH>,
 <33 IRQ_TYPE_LEVEL_HIGH> ;
};
+
+   emac0-rx-ecc@ff8c0800 {
+   compatible = "altr,socfpga-eth-mac-ecc";
+   reg = <0xff8c0800 0x400>;
+   parent = <>;
+   interrupts = <4 IRQ_TYPE_LEVEL_HIGH>,
+<36 IRQ_TYPE_LEVEL_HIGH>;
+   };
+
+   emac0-tx-ecc@ff8c0c00 {
+   compatible = "altr,socfpga-eth-mac-ecc";
+   reg = <0xff8c0c00 0x400>;
+   parent = <>;
+   interrupts = <5 IRQ_TYPE_LEVEL_HIGH>,
+<37 IRQ_TYPE_LEVEL_HIGH>;
+   };
};
--
1.7.9.5


--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PULL v4.6-rc1] bcache: documentation updates

2016-06-16 Thread Eric Wheeler
On Thu, 9 Jun 2016, Jonathan Corbet wrote:

> On Tue, 31 May 2016 16:02:16 -0700
> Marc MERLIN  wrote:
> 
> > > I don't think I ever saw this, was it sent to me?  
> >  
> > No I think Eric Emailed linux-doc@vger.kernel.org, instead of you directly.
> 
> Eric, can you resend the patches to me, and I'll get them applied?

Now posted to the list.

Thank you for your help!


--
Eric Wheeler


> 
> Thanks,
> 
> jon
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 4/7] Documentation: dt: socfpga: Add Arria10 Ethernet binding

2016-06-16 Thread Rob Herring
On Mon, Jun 13, 2016 at 04:19:09PM -0500, ttha...@opensource.altera.com wrote:
> From: Thor Thayer 
> 
> Add the device tree bindings needed to support the Altera Ethernet
> FIFO buffers on the Arria10 chip.
> 
> Signed-off-by: Thor Thayer 
> ---
> v2  No Change
> v3  Change to common compatible string based on maintainer comments
> Add local IRQ values.
> ---
>  .../bindings/arm/altera/socfpga-eccmgr.txt |   24 
> 
>  1 file changed, 24 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/arm/altera/socfpga-eccmgr.txt 
> b/Documentation/devicetree/bindings/arm/altera/socfpga-eccmgr.txt
> index 15eb0df..e9febbb 100644
> --- a/Documentation/devicetree/bindings/arm/altera/socfpga-eccmgr.txt
> +++ b/Documentation/devicetree/bindings/arm/altera/socfpga-eccmgr.txt
> @@ -82,6 +82,14 @@ Required Properties:
>  - interrupts : Should be single bit error interrupt, then double bit error
>   interrupt, in this order.
>  
> +Ethernet FIFO ECC
> +Required Properties:
> +- compatible : Should be "altr,socfpga-eth-mac-ecc"
> +- reg: Address and size for ECC block registers.
> +- parent : phandle to parent Ethernet node.

parent is too vague. How about altr,ethernet-mac.

> +- interrupts : Should be single bit error interrupt, then double bit error
> + interrupt, in this order.
> +
>  Example:
>  
>   eccmgr: eccmgr@ffd06000 {
> @@ -108,4 +116,20 @@ Example:
>   interrupts = <1 IRQ_TYPE_LEVEL_HIGH>,
><33 IRQ_TYPE_LEVEL_HIGH> ;
>   };
> +
> + emac0-rx-ecc@ff8c0800 {
> + compatible = "altr,socfpga-eth-mac-ecc";
> + reg = <0xff8c0800 0x400>;
> + parent = <>;
> + interrupts = <4 IRQ_TYPE_LEVEL_HIGH>,
> +  <36 IRQ_TYPE_LEVEL_HIGH>;
> + };
> +
> + emac0-tx-ecc@ff8c0c00 {
> + compatible = "altr,socfpga-eth-mac-ecc";
> + reg = <0xff8c0c00 0x400>;
> + parent = <>;
> + interrupts = <5 IRQ_TYPE_LEVEL_HIGH>,
> +  <37 IRQ_TYPE_LEVEL_HIGH>;
> + };
>   };
> -- 
> 1.7.9.5
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear

2016-06-16 Thread Tom Lendacky
On 06/15/2016 08:17 AM, Tom Lendacky wrote:
> On 06/13/2016 08:51 AM, Matt Fleming wrote:
>> On Thu, 09 Jun, at 01:33:30PM, Tom Lendacky wrote:
>>>

[...]

>>
>>> I'll look further into this, but I saw that this area of virtual memory
>>> was mapped un-encrypted and after freeing the boot services the
>>> mappings were somehow reused as un-encrypted for DMA which assumes
>>> (unless using swiotlb) encrypted. This resulted in DMA data being
>>> transferred in as encrypted and then accessed un-encrypted.
>>
>> That the mappings were re-used isn't a surprise.
>>
>> efi_free_boot_services() lifts the reservation that was put in place
>> during efi_reserve_boot_services() and releases the pages to the
>> kernel's memory allocators.
>>
>> What is surprising is that they were marked unencrypted at all.
>> There's nothing special about these pages as far as the __va() region
>> is concerned.
> 
> Right, let me keep looking into this to see if I can pin down what
> was (or is) happening.

Ok, I think this was happening before the commit to build our own
EFI page table structures:

commit 67a9108ed ("x86/efi: Build our own page table structures")

Before this commit the boot services ended up mapped into the kernel
page table entries as un-encrypted during efi_map_regions() and I needed
to change those entries back to encrypted. With your change above,
this appears to no longer be needed.

Thanks,
Tom

> 
> Thanks,
> Tom
> 
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 3/4] i2c: i801: add support of Host Notify

2016-06-16 Thread Wolfram Sang

> (i2c-i801) can be carried over through the input tree. So as long as
> you don't need to have a new feature without users for a short period
> of time, that's fine by me :)

That's fine. I have extremly high trust that the user of the feature
will be added soon ;)



signature.asc
Description: PGP signature


Re: [PATCH v8 3/4] i2c: i801: add support of Host Notify

2016-06-16 Thread Benjamin Tissoires
On Jun 16 2016 or thereabouts, Wolfram Sang wrote:
> > > - removed the .resume hook as upstream changed suspend/resume hooks and 
> > > there
> > >   is no need in the end to re-enable host notify on resume (tested on 
> > > Lenovo
> > >   t440 and t450).
> > 
> > Actually, this hook seemed to be required on the Lenovo T440 (Haswell)
> > but not on the T450 (Broadwell) laptop I have now here.
> > 
> > Wolfram, I can resend the whole series or a follow-up patch to re-enable
> > after resume Host Notify. How do you prefer I deal with that?
> 
> That depends a little how we want to handle patch 4. I am going to apply
> patches 1+2 today to my tree. Then you can just resend patch 3 which I
> hope will get some review soon, but I will pick it up for 4.8 later
> nonetheless. If it is not causing too much dependency hassle, I'd prefer
> that patch 4 goes via Dmitry's input tree.
> 

Works for me. Thanks for picking up the core changes. As soon as this
gets merged, the input part (patch 4), which is independent of patch 3
(i2c-i801) can be carried over through the input tree. So as long as
you don't need to have a new feature without users for a short period
of time, that's fine by me :)

Cheers,
Benjamin
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Revert "mm: rename _count, field of the struct page, to _refcount"

2016-06-16 Thread Michal Hocko
On Thu 16-06-16 13:22:27, Vitaly Kuznetsov wrote:
> Michal Hocko  writes:
> 
> > On Thu 16-06-16 12:30:16, Vitaly Kuznetsov wrote:
> >> Christoph Hellwig  writes:
> >> 
> >> > On Thu, Jun 16, 2016 at 11:22:46AM +0200, Vitaly Kuznetsov wrote:
> >> >> _count -> _refcount rename in commit 0139aa7b7fa12 ("mm: rename _count,
> >> >> field of the struct page, to _refcount") broke kdump. makedumpfile(8) 
> >> >> does
> >> >> stuff like READ_MEMBER_OFFSET("page._count", page._count) and fails. 
> >> >> While
> >> >> it is definitely possible to fix this particular tool I'm not sure about
> >> >> other tools which might be doing the same.
> >> >> 
> >> >> I suggest we remember the "we don't break userspace" rule and revert for
> >> >> 4.7 while it's not too late.
> >> >
> >> > Err, sorry - this is not "userspace".  It's crazy crap digging into
> >> > kernel internal structure.
> >> >
> >> > The rename was absolutely useful, so fix up your stinking pike in kdump.
> >> 
> >> Ok, sure, I'll send a patch to it. I was worried about other tools out
> >> there which e.g. inspect /proc/vmcore. As it is something we support
> >> some conservatism around it is justified.
> >
> > struct page layout as some others that such a tool might depend on has
> > changes several times in the past so I fail to see how is it any
> > different this time.
> 
> IMO this time the change doesn't give us any advantage, it was just a
> rename.

Which would catch all the pending users who are not using the
appropriate API. This is IMHO very useful as the sole purpose of the
change is to catch _all_ users. So the reason is pretty much technicall.

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Revert "mm: rename _count, field of the struct page, to _refcount"

2016-06-16 Thread Vitaly Kuznetsov
Michal Hocko  writes:

> On Thu 16-06-16 12:30:16, Vitaly Kuznetsov wrote:
>> Christoph Hellwig  writes:
>> 
>> > On Thu, Jun 16, 2016 at 11:22:46AM +0200, Vitaly Kuznetsov wrote:
>> >> _count -> _refcount rename in commit 0139aa7b7fa12 ("mm: rename _count,
>> >> field of the struct page, to _refcount") broke kdump. makedumpfile(8) does
>> >> stuff like READ_MEMBER_OFFSET("page._count", page._count) and fails. While
>> >> it is definitely possible to fix this particular tool I'm not sure about
>> >> other tools which might be doing the same.
>> >> 
>> >> I suggest we remember the "we don't break userspace" rule and revert for
>> >> 4.7 while it's not too late.
>> >
>> > Err, sorry - this is not "userspace".  It's crazy crap digging into
>> > kernel internal structure.
>> >
>> > The rename was absolutely useful, so fix up your stinking pike in kdump.
>> 
>> Ok, sure, I'll send a patch to it. I was worried about other tools out
>> there which e.g. inspect /proc/vmcore. As it is something we support
>> some conservatism around it is justified.
>
> struct page layout as some others that such a tool might depend on has
> changes several times in the past so I fail to see how is it any
> different this time.

IMO this time the change doesn't give us any advantage, it was just a
rename.

> struct page is nothing the userspace should depend on.

True but at least makedumpfile(8) is special and even if it's a 'crazy
crap digging into ...' we could avoid breaking it for no technical
reason.

-- 
  Vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Revert "mm: rename _count, field of the struct page, to _refcount"

2016-06-16 Thread Michal Hocko
On Thu 16-06-16 12:30:16, Vitaly Kuznetsov wrote:
> Christoph Hellwig  writes:
> 
> > On Thu, Jun 16, 2016 at 11:22:46AM +0200, Vitaly Kuznetsov wrote:
> >> _count -> _refcount rename in commit 0139aa7b7fa12 ("mm: rename _count,
> >> field of the struct page, to _refcount") broke kdump. makedumpfile(8) does
> >> stuff like READ_MEMBER_OFFSET("page._count", page._count) and fails. While
> >> it is definitely possible to fix this particular tool I'm not sure about
> >> other tools which might be doing the same.
> >> 
> >> I suggest we remember the "we don't break userspace" rule and revert for
> >> 4.7 while it's not too late.
> >
> > Err, sorry - this is not "userspace".  It's crazy crap digging into
> > kernel internal structure.
> >
> > The rename was absolutely useful, so fix up your stinking pike in kdump.
> 
> Ok, sure, I'll send a patch to it. I was worried about other tools out
> there which e.g. inspect /proc/vmcore. As it is something we support
> some conservatism around it is justified.

struct page layout as some others that such a tool might depend on has
changes several times in the past so I fail to see how is it any
different this time. struct page is nothing the userspace should depend
on.

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 1/2] Input: rotary-encoder- Add support for absolute encoder

2016-06-16 Thread Vignesh R
Hi Dmitry,

On Wednesday 25 May 2016 02:14 PM, Vignesh R wrote:
> Hi Dmitry,
> 
> On 05/23/2016 02:48 PM, R, Vignesh wrote:
>>
>>
>> On 5/20/2016 10:04 PM, Dmitry Torokhov wrote:
>>> On Thu, May 19, 2016 at 02:34:00PM +0530, Vignesh R wrote:
 There are rotary-encoders where GPIO lines reflect the actual position
 of the rotary encoder dial. For example, if dial points to 9, then four
 GPIO lines connected to the rotary encoder will read HLLH(1001b = 9).
 Add support for such rotary-encoder.
 The driver relies on rotary-encoder,absolute-encoder DT property to
 detect such encoders.
 Since, GPIO IRQs are not necessary to work with
 such encoders, optional polling mode support is added using
 input_poll_dev skeleton. This is can be used by enabling
 CONFIG_INPUT_GPIO_ROTARY_ENCODER_POLL_MODE_SUPPORT.
>>>
>>> Does this really belong to a rotary encoder and not a new driver that
>>> simply translates gpio-encoded value into ABS* event?
>>>
>>
>> Currently rotary encoder driver only supports incremental/step counting
>> rotary devices. However, the device that is there on am335x-ice is an
>> absolute encoder but, IMO, nevertheless a kind of rotary encoder. The
>> only difference is that there is no need to count steps and the absolute
>> position value is always available as binary encoded state of connected
>> GPIOs.
>> The hardware on am335x-ice is a mechanical rotary encoder switch
>> connected over 4 GPIOs. It is same as binary encoder described at [1]
>> (except there are 4 GPIO lines), so this lead me to add support in
>> rotary-encoder.
>>
>> [1]https://en.wikipedia.org/wiki/Rotary_encoder#Standard_binary_encoding
>>
> 
> Could you please comment on how would you like to support above
> described encoder: As a new driver or with existing driver with new
> compatible/mode setting via DT or as suggest by Uwe in another reply?
> IMHO, supporting using existing driver with new mode/compatible string
> looks a better option as the hardware is a kind of rotary-encoder.
> 

It would be great if you could comment on how would you like to see
support for absolute rotary-encoder(the one I described above)? As a new
driver or handle it by adding new compatible to existing rotary-encoder
driver?


-- 
Regards
Vignesh
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Revert "mm: rename _count, field of the struct page, to _refcount"

2016-06-16 Thread Vlastimil Babka

On 06/16/2016 11:22 AM, Vitaly Kuznetsov wrote:

_count -> _refcount rename in commit 0139aa7b7fa12 ("mm: rename _count,
field of the struct page, to _refcount") broke kdump. makedumpfile(8) does
stuff like READ_MEMBER_OFFSET("page._count", page._count) and fails. While
it is definitely possible to fix this particular tool I'm not sure about
other tools which might be doing the same.

I suggest we remember the "we don't break userspace" rule and revert for
4.7 while it's not too late.


I don't think the rule applies to tools such as kdump and crash, or e.g. 
systemtap, that interact with kernel internals even though they are 
technically "userspace". They have to adapt to new kernel versions all 
the time, the internal APIs are not frozen. Otherwise we would be quite 
limited in evolving the kernel...


So even though the change in question is not essential (field rename) 
and reverting wouldn't really hurt technical progress, this is not a 
sufficient reason, IMO. Thus, NAK.


Vlastimil
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Revert "mm: rename _count, field of the struct page, to _refcount"

2016-06-16 Thread Christoph Hellwig
On Thu, Jun 16, 2016 at 11:22:46AM +0200, Vitaly Kuznetsov wrote:
> _count -> _refcount rename in commit 0139aa7b7fa12 ("mm: rename _count,
> field of the struct page, to _refcount") broke kdump. makedumpfile(8) does
> stuff like READ_MEMBER_OFFSET("page._count", page._count) and fails. While
> it is definitely possible to fix this particular tool I'm not sure about
> other tools which might be doing the same.
> 
> I suggest we remember the "we don't break userspace" rule and revert for
> 4.7 while it's not too late.

Err, sorry - this is not "userspace".  It's crazy crap digging into
kernel internal structure.

The rename was absolutely useful, so fix up your stinking pike in kdump.

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Revert "mm: rename _count, field of the struct page, to _refcount"

2016-06-16 Thread Vitaly Kuznetsov
_count -> _refcount rename in commit 0139aa7b7fa12 ("mm: rename _count,
field of the struct page, to _refcount") broke kdump. makedumpfile(8) does
stuff like READ_MEMBER_OFFSET("page._count", page._count) and fails. While
it is definitely possible to fix this particular tool I'm not sure about
other tools which might be doing the same.

I suggest we remember the "we don't break userspace" rule and revert for
4.7 while it's not too late.

This is a partial revert, useful hunks in drivers which do
page_ref_{sub,add,inc} instead of open coded atomic_* operations stay.

Signed-off-by: Vitaly Kuznetsov 
---
 Documentation/vm/transhuge.txt   | 10 +-
 arch/tile/mm/init.c  |  2 +-
 drivers/block/aoe/aoecmd.c   |  2 +-
 drivers/hwtracing/intel_th/msu.c |  2 +-
 fs/proc/page.c   |  2 +-
 include/linux/mm.h   |  2 +-
 include/linux/mm_types.h | 14 +-
 include/linux/page_ref.h | 26 +-
 include/linux/pagemap.h  |  8 
 kernel/kexec_core.c  |  2 +-
 mm/huge_memory.c |  4 ++--
 mm/internal.h|  2 +-
 mm/page_alloc.c  |  4 ++--
 mm/slub.c|  4 ++--
 mm/vmscan.c  |  4 ++--
 15 files changed, 42 insertions(+), 46 deletions(-)

diff --git a/Documentation/vm/transhuge.txt b/Documentation/vm/transhuge.txt
index 7c871d6..75c774c 100644
--- a/Documentation/vm/transhuge.txt
+++ b/Documentation/vm/transhuge.txt
@@ -394,9 +394,9 @@ hugepage natively. Once finished you can drop the page 
table lock.
 Refcounting on THP is mostly consistent with refcounting on other compound
 pages:
 
-  - get_page()/put_page() and GUP operate in head page's ->_refcount.
+  - get_page()/put_page() and GUP operate in head page's ->_count.
 
-  - ->_refcount in tail pages is always zero: get_page_unless_zero() never
+  - ->_count in tail pages is always zero: get_page_unless_zero() never
 succeed on tail pages.
 
   - map/unmap of the pages with PTE entry increment/decrement ->_mapcount
@@ -426,15 +426,15 @@ requests to split pinned huge page: it expects page count 
to be equal to
 sum of mapcount of all sub-pages plus one (split_huge_page caller must
 have reference for head page).
 
-split_huge_page uses migration entries to stabilize page->_refcount and
+split_huge_page uses migration entries to stabilize page->_count and
 page->_mapcount.
 
 We safe against physical memory scanners too: the only legitimate way
 scanner can get reference to a page is get_page_unless_zero().
 
-All tail pages have zero ->_refcount until atomic_add(). This prevents the
+All tail pages have zero ->_count until atomic_add(). This prevents the
 scanner from getting a reference to the tail page up to that point. After the
-atomic_add() we don't care about the ->_refcount value.  We already known how
+atomic_add() we don't care about the ->_count value.  We already known how
 many references should be uncharged from the head page.
 
 For head page get_page_unless_zero() will succeed and we don't mind. It's
diff --git a/arch/tile/mm/init.c b/arch/tile/mm/init.c
index adce254..a0582b7 100644
--- a/arch/tile/mm/init.c
+++ b/arch/tile/mm/init.c
@@ -679,7 +679,7 @@ static void __init init_free_pfn_range(unsigned long start, 
unsigned long end)
 * Hacky direct set to avoid unnecessary
 * lock take/release for EVERY page here.
 */
-   p->_refcount.counter = 0;
+   p->_count.counter = 0;
p->_mapcount.counter = -1;
}
init_page_count(page);
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index d597e43..437b3a8 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -861,7 +861,7 @@ rqbiocnt(struct request *r)
  * discussion.
  *
  * We cannot use get_page in the workaround, because it insists on a
- * positive page count as a precondition.  So we use _refcount directly.
+ * positive page count as a precondition.  So we use _count directly.
  */
 static void
 bio_pageinc(struct bio *bio)
diff --git a/drivers/hwtracing/intel_th/msu.c b/drivers/hwtracing/intel_th/msu.c
index e8d55a1..0974090 100644
--- a/drivers/hwtracing/intel_th/msu.c
+++ b/drivers/hwtracing/intel_th/msu.c
@@ -1172,7 +1172,7 @@ static void msc_mmap_close(struct vm_area_struct *vma)
if (!atomic_dec_and_mutex_lock(>mmap_count, >buf_mutex))
return;
 
-   /* drop page _refcounts */
+   /* drop page _counts */
for (pg = 0; pg < msc->nr_pages; pg++) {
struct page *page = msc_buffer_get_page(msc, pg);
 
diff --git a/fs/proc/page.c b/fs/proc/page.c
index 3ecd445..712f1b9 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -142,7 +142,7 @@ u64 stable_page_flags(struct page *page)
 
 
/*
-* Caveats on 

Re: [PATCH v8 3/4] i2c: i801: add support of Host Notify

2016-06-16 Thread Wolfram Sang
> > - removed the .resume hook as upstream changed suspend/resume hooks and 
> > there
> >   is no need in the end to re-enable host notify on resume (tested on Lenovo
> >   t440 and t450).
> 
> Actually, this hook seemed to be required on the Lenovo T440 (Haswell)
> but not on the T450 (Broadwell) laptop I have now here.
> 
> Wolfram, I can resend the whole series or a follow-up patch to re-enable
> after resume Host Notify. How do you prefer I deal with that?

That depends a little how we want to handle patch 4. I am going to apply
patches 1+2 today to my tree. Then you can just resend patch 3 which I
hope will get some review soon, but I will pick it up for 4.8 later
nonetheless. If it is not causing too much dependency hassle, I'd prefer
that patch 4 goes via Dmitry's input tree.



signature.asc
Description: PGP signature