date:20161217

Re: [GIT PULL] KVM fixes for 4.10 merge window

2016-12-17 Thread Paolo Bonzini


> On Fri, Dec 16, 2016 at 8:57 AM, Paolo Bonzini  wrote:
> >
> >   git://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus
> 
> This piece-of-shit branch has obviously never been even compile-tested:
> 
>   arch/x86/kernel/kvm.c: In function ‘__kvm_vcpu_is_preempted’:
>   arch/x86/kernel/kvm.c:596:14: error: ‘struct kvm_steal_time’ has no
> member named ‘preempted’
> 
> where commit b94c3698b4b0 ("Revert "x86/kvm: Support the vCPU
> preemption check"") removed the "preempted" field from struct
> kvm_steal_time, but you left this in place:
> 
>   __visible bool __kvm_vcpu_is_preempted(int cpu)
>   {
>   struct kvm_steal_time *src = &per_cpu(steal_time, cpu);
> 
>   return !!src->preempted;
>   }
> 
> And no, that is not a merge artifact in my tree (although that
> function did come in from Ingo). That compile failure comes from your
> very own branch.

Yes, it does.  Well, to be honest I did test this (not just compile-test
it) but I didn't have KVM guest support turned on, only KVM host support.

Sorry, I'll resend it and make sure I do a "make allmodconfig" in the
future (and not send pull requests at 6 PM on Friday).

Paolo

Re: [GIT PULL] KVM fixes for 4.10 merge window

2016-12-17 Thread Paolo Bonzini



- Original Message -
> From: "Pan Xinhui" 
> To: "Linus Torvalds" , "Paolo Bonzini" 
> 
> Cc: "Linux Kernel Mailing List" , "Radim 
> Krčmář" , "KVM list"
> 
> Sent: Saturday, December 17, 2016 4:09:16 AM
> Subject: Re: [GIT PULL] KVM fixes for 4.10 merge window
> 
> 
> 
> On 2016/12/17 03:42, Linus Torvalds wrote:
> > On Fri, Dec 16, 2016 at 8:57 AM, Paolo Bonzini  wrote:
> >>
> >>   git://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus
> >
> > This piece-of-shit branch has obviously never been even compile-tested:
> >
> >   arch/x86/kernel/kvm.c: In function ‘__kvm_vcpu_is_preempted’:
> >   arch/x86/kernel/kvm.c:596:14: error: ‘struct kvm_steal_time’ has no
> > member named ‘preempted’
> >
> hi, Linus
>   Oh, my bad as well. I introduced this struct member and used it in the
> same patch. It would have been better to separate them into two patches.
> I made one fix patch below. Sorry again.

Hi Xinhui, don't worry it's purely my fault. :)

>   I know where the problem is. I wonder if we can set this ->preempted
>   later, after preempted_enable(),
> or just introduce something like write_guest_nosleep (a per-cpu memory section
> in the guest, so there is no page fault or any other cannot-sleep problem)?

Yes there is already kvm_read_guest_inatomic, we can add an equivalent one
for writes.  It will be for 4.11 anyway, so there's time.

Paolo

Re: [PATCH] scsi: esas2r: Fix format string type mistakes

2016-12-17 Thread Bart Van Assche

On 12/16/2016 10:50 PM, Kees Cook wrote:
> diff --git a/drivers/scsi/esas2r/esas2r_ioctl.c b/drivers/scsi/esas2r/esas2r_ioctl.c
> index 3e8483410f61..34976f9a1a10 100644
> --- a/drivers/scsi/esas2r/esas2r_ioctl.c
> +++ b/drivers/scsi/esas2r/esas2r_ioctl.c
> @@ -1301,7 +1301,7 @@ int esas2r_ioctl_handler(void *hostdata, int cmd, void __user *arg)
>   ioctl = kzalloc(sizeof(struct atto_express_ioctl), GFP_KERNEL);
>   if (ioctl == NULL) {
>   esas2r_log(ESAS2R_LOG_WARN,
> -"ioctl_handler kzalloc failed for %d bytes",
> +"ioctl_handler kzalloc failed for %lu bytes",
>  sizeof(struct atto_express_ioctl));
>   return -ENOMEM;
>   }

Please use %zu to format size_t.

Bart.
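
A minimal userspace sketch of the review comment above, not driver code. The struct `fake_ioctl` and the function `format_alloc_failure` are hypothetical stand-ins; only the format string matters. `sizeof` yields a `size_t`, whose width differs between 32-bit and 64-bit targets, so `%zu` is the conversion that matches on both.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver structure; only its size matters. */
struct fake_ioctl {
	char payload[1000];
};

/*
 * Format an allocation-failure message into buf.
 * sizeof() has type size_t, so the matching conversion is %zu;
 * %d and %lu each mismatch size_t on some ABIs.
 */
int format_alloc_failure(char *buf, size_t len)
{
	return snprintf(buf, len, "kzalloc failed for %zu bytes",
			sizeof(struct fake_ioctl));
}
```

The kernel's printk-style formatting accepts `%zu` as well, so the same conversion applies to the esas2r_log() call in the patch.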

Re: [GIT PULL (resend)] readlink cleanup

2016-12-17 Thread Miklos Szeredi

On Sat, Dec 17, 2016 at 12:08 AM, Al Viro  wrote:
> On Fri, Dec 16, 2016 at 11:48:59PM +0100, Miklos Szeredi wrote:
>
>> This is a rework of the readlink cleanup patchset from the last cycle.  Now
>> readlink(2) does the following:
>>
>>  - if i_op->readlink() is non-NULL (only proc and afs mountpoints for now)
>>then it calls that
>>
>>  - otherwise call i_op->get_link()
>>
>>  - signature of ->readlink() now matches that of ->get_link()
>>
>> In particular this last bullet point buys us:
>>
>>  - less complexity, because we already handle the delayed free of the
>>buffer and copying to userspace due to ->get_link() being the normal way
>>to read the symlink
>
> Less complexity where, exactly?  In the caller the life does not become
> any simpler - instead of "call ->readlink() and bugger off" you have
> "call ->readlink() and go through the same motions as in ->get_link()-based
> case".  In the instances it becomes _more_ complex.

Have you looked?  Because in actual fact they don't.

 Theoretically it's either:

 - kmalloc + fill + readlink_copy + kfree  -->  kmalloc + fill +
set_delayed_call

 - declare char[] on stack + fill + readlink_copy --> kmalloc + fill +
set_delayed_call

Presumably it's the second one you are talking about becoming more
complex.  There's exactly one instance of that in the tree and it
actually becomes cleaner after the change.

Current code does:

  - guess max link size to be 50 (very scientifically I'm sure, but no
explanation given)
  - call filler
  - hope it didn't get truncated

Which becomes:

  - call filler which allocates correctly sized buffer.

> What's more, this new signature for ->readlink() makes no sense - instead of
> "symlink traversal does not involve resolving a pathname, so we have to
> fake one for readlink(2)" you get something resembling ->get_link(), which
> would _not_ function as ->get_link() ought to.  But it can be called by the
> same codepath that calls ->get_link(), saving us the burden of returning
> without doing what ->get_link-based case would - we still get to check if
> ->readlink() is there, but we rejoin the common path immediately.  And AFAICS
> that's the _only_ benefit of that signature change - making it possible to
> reuse a few lines that adapt ->get_link() to readlinkat(2) needs.

With the signature change we get a consistent interface for reading
the contents of symlinks.  With that it will never make sense to play
the stupid get_ds/set_ds() games that we've had.  And no need to
duplicate helper functions, like page_readlink() that is exactly the
same as page_getlink() only for the different interface.  And no need
to export readlink_copy() which is something the filesystems never
actually want to care about.

Having different interfaces for the same thing is going to be more
complex.  I just don't get what you are opposed to here.

Thanks,
Miklos
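
The dispatch order described in the pull request (use ->readlink() when it is set, otherwise fall back to ->get_link(), both with the same signature) can be sketched in userspace roughly as follows. Everything here is a simplified assumption: `struct link_ops`, `model_readlink` and the two sample callbacks are hypothetical and do not mirror the real VFS signatures or the delayed-free handling.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model; not the real struct inode_operations. */
struct link_ops {
	const char *(*readlink)(void *ctx);  /* optional override */
	const char *(*get_link)(void *ctx);  /* normal way to read a symlink */
};

/* Sample callback: treats ctx as the stored link target. */
const char *ctx_target(void *ctx)
{
	return ctx;
}

/* Sample callback: a "special" link whose body is computed, not stored. */
const char *magic_target(void *ctx)
{
	(void)ctx;
	return "magic";
}

/*
 * Copy the link body into buf (at most len bytes, no NUL terminator,
 * as readlink(2) does). Returns the byte count, or -1 on failure.
 */
long model_readlink(const struct link_ops *ops, void *ctx,
		    char *buf, size_t len)
{
	const char *body;
	size_t n;

	if (ops->readlink)
		body = ops->readlink(ctx);      /* special case first */
	else if (ops->get_link)
		body = ops->get_link(ctx);      /* common path */
	else
		return -1;
	if (!body)
		return -1;

	n = strlen(body);
	if (n > len)
		n = len;                        /* truncate like readlink(2) */
	memcpy(buf, body, n);
	return (long)n;
}
```

Because both callbacks share one signature in this model, the copy-out code after the dispatch is written once, which is the reuse the thread is arguing about.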

Re: [PATCH 2/2] iio: adc: hx711: Add IIO driver for AVIA HX711

2016-12-17 Thread Matt Ranostay

On Tue, Dec 13, 2016 at 10:02 AM, Andreas Klinger  wrote:
> This is the IIO driver for AVIA HX711 ADC which is mostly used in weighing
> cells.

First off cool that this is finally getting a driver... I'll have to
get the SparkFun breakout and really cheap scale to test :).

>
> The protocol is quite simple and using GPIO's:
> One GPIO is used as clock (SCK) while another GPIO is read (DOUT)
>
> Signed-off-by: Andreas Klinger 
> ---
>  drivers/iio/adc/Kconfig  |  13 +++
>  drivers/iio/adc/Makefile |   1 +
>  drivers/iio/adc/hx711.c  | 269 +++
>  3 files changed, 283 insertions(+)
>  create mode 100644 drivers/iio/adc/hx711.c
>
> diff --git a/drivers/iio/adc/Kconfig b/drivers/iio/adc/Kconfig
> index 932de1f9d1e7..7902b50fcf32 100644
> --- a/drivers/iio/adc/Kconfig
> +++ b/drivers/iio/adc/Kconfig
> @@ -205,6 +205,19 @@ config HI8435
>   This driver can also be built as a module. If so, the module will be
>   called hi8435.
>
> +config HX711
> +   tristate "AVIA HX711 ADC for weight cells"
> +   depends on GPIOLIB
> +   help
> + If you say yes here you get support for AVIA HX711 ADC which is used
> + for weight cells
> +
> + This driver uses two GPIO's, one for setting the clock and the other
> + one for getting the data
> +
> + This driver can also be built as a module. If so, the module will be
> + called hx711.
> +
>  config INA2XX_ADC
> tristate "Texas Instruments INA2xx Power Monitors IIO driver"
> depends on I2C && !SENSORS_INA2XX
> diff --git a/drivers/iio/adc/Makefile b/drivers/iio/adc/Makefile
> index b1aa456e6af3..d46e289900ef 100644
> --- a/drivers/iio/adc/Makefile
> +++ b/drivers/iio/adc/Makefile
> @@ -21,6 +21,7 @@ obj-$(CONFIG_CC10001_ADC) += cc10001_adc.o
>  obj-$(CONFIG_DA9150_GPADC) += da9150-gpadc.o
>  obj-$(CONFIG_EXYNOS_ADC) += exynos_adc.o
>  obj-$(CONFIG_HI8435) += hi8435.o
> +obj-$(CONFIG_HX711) += hx711.o
>  obj-$(CONFIG_IMX7D_ADC) += imx7d_adc.o
>  obj-$(CONFIG_INA2XX_ADC) += ina2xx-adc.o
>  obj-$(CONFIG_LP8788_ADC) += lp8788_adc.o
> diff --git a/drivers/iio/adc/hx711.c b/drivers/iio/adc/hx711.c
> new file mode 100644
> index ..cbc89e467985
> --- /dev/null
> +++ b/drivers/iio/adc/hx711.c
> @@ -0,0 +1,269 @@
> +/*
> + * HX711: analog to digital converter for weight sensor module
> + *
> + * Copyright (c) Andreas Klinger 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define HX711_GAIN_32  2   /* gain = 32 for channel B  */
> +#define HX711_GAIN_64  3   /* gain = 64 for channel A  */
> +#define HX711_GAIN_128 1   /* gain = 128 for channel A */
> +
> +
> +struct hx711_data {
> +   struct device       *dev;
> +   dev_t               devt;
> +   struct gpio_desc    *gpiod_sck;
> +   struct gpio_desc    *gpiod_dout;
> +   int                 gain_pulse;
> +   struct mutex        lock;
> +};
> +
> +static void hx711_reset(struct hx711_data *hx711_data)
> +{
> +   int val;
> +   int i;
> +
> +   val = gpiod_get_value(hx711_data->gpiod_dout);

You could move the val assignment here into the declaration; I don't think
it will hit 80 chars.

> +   if (val) {

move "int i" here to avoid compiler initialization warnings

> +   dev_warn(hx711_data->dev, "RESET-HX711\n");
> +
> +   gpiod_set_value(hx711_data->gpiod_sck, 1);
> +   udelay(80);

IIRC this chip has quite a bit of latency tolerance; can't you use
usleep_range() here? Single-core embedded systems could have an issue with
continuous polling.

> +   gpiod_set_value(hx711_data->gpiod_sck, 0);
> +
> +   for (i = 0; i < 1000; i++) {
> +   val = gpiod_get_value(hx711_data->gpiod_dout);
> +   if (!val)
> +   break;
> +   /* sleep at least 1 ms*/
> +   msleep(1);
> +   }
> +   }
> +}
> +
> +static int hx711_cycle(struct hx711_data *hx711_data)
> +{
> +   int val;
> +
> +   /* if preempted for more than 60us while SCK is high:
> +* hx711 is going in reset
> +* ==> measuring is false
> +*/
> +   preempt_disable();
> +   gpiod_set_value(hx711_data
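
The two-wire protocol described in the patch (SCK driven as clock, DOUT read as data, extra pulses selecting gain) can be modelled without hardware roughly as below. This is a sketch under stated assumptions, not the driver's code: `struct hx711_pins`, `hx711_read_sample` and the `fake_*` simulator are hypothetical, the 24-bit two's-complement sample width is taken from the HX711 being a 24-bit ADC, and sampling DOUT while SCK is high is an assumption about the timing. Delays and the preemption concern from the patch are left out.

```c
#include <stdint.h>

/* Hypothetical GPIO abstraction so the sketch runs without hardware. */
struct hx711_pins {
	void (*set_sck)(void *ctx, int level);
	int (*get_dout)(void *ctx);
	void *ctx;
};

/* Simulated chip for trying the sketch: shifts out `word` MSB first. */
struct fake_hx711 {
	uint32_t word;     /* 24-bit value to present on DOUT */
	int rising_edges;  /* SCK rising edges seen so far */
};

void fake_set_sck(void *ctx, int level)
{
	struct fake_hx711 *f = ctx;

	if (level)
		f->rising_edges++;
}

int fake_get_dout(void *ctx)
{
	struct fake_hx711 *f = ctx;
	int bit = 24 - f->rising_edges;  /* bit index for the current pulse */

	return (bit >= 0 && bit < 24) ? (int)((f->word >> bit) & 1u) : 1;
}

/*
 * Clock out one 24-bit sample, MSB first, then send the extra pulses
 * that select gain/channel for the next conversion (1, 2 or 3, as in
 * the HX711_GAIN_* defines of the patch).
 */
int32_t hx711_read_sample(const struct hx711_pins *p, int gain_pulses)
{
	uint32_t raw = 0;
	int i;

	for (i = 0; i < 24; i++) {
		p->set_sck(p->ctx, 1);
		raw = (raw << 1) | (uint32_t)(p->get_dout(p->ctx) & 1);
		p->set_sck(p->ctx, 0);
	}
	for (i = 0; i < gain_pulses; i++) {
		p->set_sck(p->ctx, 1);
		p->set_sck(p->ctx, 0);
	}

	/* The 24-bit result is two's complement; sign-extend to 32 bits. */
	if (raw & 0x800000u)
		raw |= 0xff000000u;
	return (int32_t)raw;
}
```

Passing the pins as callbacks keeps the bit-shifting logic testable on its own; a real driver would call the gpiod accessors directly and add the timing constraints discussed in the review.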

Re: [TSN RFC v2 5/9] Add TSN header for the driver

2016-12-17 Thread Henrik Austad

On Fri, Dec 16, 2016 at 11:09:38PM +0100, Richard Cochran wrote:
> On Fri, Dec 16, 2016 at 06:59:09PM +0100, hen...@austad.us wrote:
> > +/*
> > + * List of current subtype fields in the common header of AVTPDU
> > + *
> > + * Note: AVTPDU is a remnant of the standards from when it was AVB.
> > + *
> > + * The list has been updated with the recent values from IEEE 1722, draft 16.
> > + */
> > +enum avtp_subtype {
> > +   TSN_61883_IIDC = 0, /* IEC 61883/IIDC Format */
> > +   TSN_MMA_STREAM, /* MMA Streams */
> > +   TSN_AAF,/* AVTP Audio Format */
> > +   TSN_CVF,/* Compressed Video Format */
> > +   TSN_CRF,/* Clock Reference Format */
> > +   TSN_TSCF,   /* Time-Synchronous Control Format */
> > +   TSN_SVF,/* SDI Video Format */
> > +   TSN_RVF,/* Raw Video Format */
> > +   /* 0x08 - 0x6D reserved */
> > +   TSN_AEF_CONTINOUS = 0x6e, /* AES Encrypted Format Continuous */
> > +   TSN_VSF_STREAM, /* Vendor Specific Format Stream */
> > +   /* 0x70 - 0x7e reserved */
> > +   TSN_EF_STREAM = 0x7f,   /* Experimental Format Stream */
> > +   /* 0x80 - 0x81 reserved */
> > +   TSN_NTSCF = 0x82,   /* Non Time-Synchronous Control Format */
> > +   /* 0x83 - 0xeb reserved */
> > +   TSN_ESCF = 0xec,/* ECC Signed Control Format */
> > +   TSN_EECF,   /* ECC Encrypted Control Format */
> > +   TSN_AEF_DISCRETE,   /* AES Encrypted Format Discrete */
> > +   /* 0xef - 0xf9 reserved */
> > +   TSN_ADP = 0xfa, /* AVDECC Discovery Protocol */
> > +   TSN_AECP,   /* AVDECC Enumeration and Control Protocol */
> > +   TSN_ACMP,   /* AVDECC Connection Management Protocol */
> > +   /* 0xfd reserved */
> > +   TSN_MAAP = 0xfe,/* MAAP Protocol */
> > +   TSN_EF_CONTROL, /* Experimental Format Control */
> > +};
> 
> The kernel shouldn't be in the business of assembling media packets.

No, but assembling the packets and shipping frames to a destination is not
necessarily the same thing.

A nice workflow would be to signal to the shim that "I'm sending a 
compressed video format" and then the shim/tsn_core will ship out the 
frames over the network - and then you need to set TSN_CVF as subtype in 
each header.

That does not mean you should do H.264 encode/decode *in* the kernel.

Perhaps this is better placed in include/uapi/tsn.h so that userspace and 
kernel share the same header?

-- 
Henrik Austad
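
The "set TSN_CVF as subtype in each header" step could look roughly like the sketch below. The helper `tsn_stamp_subtype` is hypothetical, and placing the subtype in the first octet of the AVTP common header is an assumption of this sketch rather than something the patch excerpt shows. The enum is a subset of the one quoted above, with values following from its ordering.

```c
#include <stddef.h>
#include <stdint.h>

/* Subset of the enum quoted above; values follow from its ordering. */
enum avtp_subtype {
	TSN_61883_IIDC = 0, /* IEC 61883/IIDC Format */
	TSN_MMA_STREAM,     /* MMA Streams */
	TSN_AAF,            /* AVTP Audio Format */
	TSN_CVF,            /* Compressed Video Format */
};

/*
 * Hypothetical helper: stamp the subtype into a frame's AVTP header.
 * Assumes the subtype is the first octet of the common header.
 * Returns 0 on success, -1 if the buffer is missing or too short.
 */
int tsn_stamp_subtype(uint8_t *hdr, size_t len, enum avtp_subtype subtype)
{
	if (!hdr || len < 1)
		return -1;
	hdr[0] = (uint8_t)subtype;
	return 0;
}
```

The payload itself is never inspected here, which matches the point of the reply: tagging and shipping frames is separate from encoding the media.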



Re: [GIT PULL] kbuild changes for v4.9-rc1

2016-12-17 Thread Jiri Slaby

On 12/16/2016, 08:57 PM, Linus Torvalds wrote:
> On Fri, Dec 16, 2016 at 11:55 AM, Jiri Slaby  wrote:
>>
>> what happened to this? I had to apply this to fix 4.9-pae kernel here.
> 
> Did you actually have to do that?

Yes, disk drivers won't load:
[2.141973] virtio_pci: disagrees about version of symbol mcount
[2.144415] virtio_pci: Unknown symbol mcount (err -22)
[2.164547] virtio_pci: disagrees about version of symbol mcount
[2.166309] virtio_pci: Unknown symbol mcount (err -22)
[2.180651] virtio_pci: disagrees about version of symbol mcount
[2.182823] virtio_pci: Unknown symbol mcount (err -22)
[2.210943] virtio_pci: disagrees about version of symbol mcount
[2.220097] virtio_pci: Unknown symbol mcount (err -22)
[2.220173] ata_piix: disagrees about version of symbol mcount
[2.220174] ata_piix: Unknown symbol mcount (err -22)

and whole machine gets stuck with systemd waiting for /dev/sd*.

> Because a missing CRC shouldn't be fatal in 4.9.
> 
> What was the failure mode?

I am not sure what you mean? The kernel is rpm-ized 4.9 vanilla and this
is the config:
http://kernel.suse.com/cgit/kernel-source/tree/config/i386/default?h=stable

thanks,
-- 
js
suse labs

Re: [TSN RFC v2 0/9] TSN driver for the kernel

2016-12-17 Thread Henrik Austad

Hi Richard,

On Fri, Dec 16, 2016 at 11:05:30PM +0100, Richard Cochran wrote:
> On Fri, Dec 16, 2016 at 06:59:04PM +0100, hen...@austad.us wrote:
> > The driver is directed via ConfigFS as we need userspace to handle
> > stream-reservation (MSRP), discovery and enumeration (IEEE 1722.1) and
> > whatever other management is needed.
> 
> I complained about configfs before, but you didn't listen.

Yes you did, I remember quite well, and no, I didn't listen :)

At the time, there were other issues that I had to address. The 
configfs-part is fairly isolated. As I tried to explain the last round, 
the *reason* I've used ConfigFS thus far, is because it makes it pretty 
easy from userspace to signal the driver to create a new alsa-device.

And the reason I haven't changed configfs, is because so far, that part has 
worked fairly well and has made testing quite easy. At this stage, *this* 
is what is helpful, not a perfect interface. This does not mean that 
configfs is set in stone.

To clarify:
I'm sending out a new set now because, what I have works _fairly_ well for 
testing and a way to see what you can do with AVB. Using spotify to play 
music on random machines is quite entertaining.

It is by no means -done-, nor do I consider it done. I have been tight on 
time, and instead of sitting in an office polishing on some code, I thought 
it better to send out a new (and not done) set of patches so that others 
could see it still being worked on. If this turned out to be noise-only, I 
apologize!

> > 2 new fields in netdev_ops have been introduced, and the Intel
> > igb-driver has been updated (as this an AVB-capable NIC which is
> > available as a PCI-e card).
> 
> The igb hacks show that you are on the wrong track.  We can and should
> be able to support TSN without resorting to driver specific hacks and
> module parameters.

I was not able to find a sane way to change the mode of the NIC, some of 
the settings required to enable Qav-mode must be done when bringing the 
NIC up, so I needed hooks in _probe().

Another element needed is a way for tsn_core to ascertain if a NIC is 
capable of TSN or not (this would be ndo_tsn_capable)

Then finally, you need to update values in a per-tx-queue manner when a new 
stream is ready (hence ndo_tsn_link_configure).

What you mean by 'driver specific hacks' is not obvious though, TSN 
requires a set of fairly standardized parameters (priority code points, 
size of frames to send in a new stream and so on), adding this to the 
hw-registers in the NIC is an operation that will be common for all 
TSN-capable NICs.

> > Before reading on - this is not even beta, but I'd really appreciate if
> > people would comment on the overall architecture and perhaps provide
> > some pointers to where I should improve/fix/update
> 
> As I said before about V1, this architecture stinks. 

I like feedback when it's short, sweet and to the point
2 out of 3 ain't that bad ;)

> You appear to have continued hacking along and posted the same design 
> again.  Did you even address any of the points I raised back then?

So you did raise a lot of good points the last round, and no, I have not 
had the time to address them properly. That does not mean I do not *want* 
to (apart from configfs actually having worked quite nicely thus far and 
'shim' being a name I like ;)

From the last round of discussion:

> 1. A proper userland stack for AVDECC, MAAP, FQTSS, and so on.  The
>OpenAVB project does not offer much beyond simple examples.

Yes, I fully agree, as far as I know, no-one is working on this. That being 
said, I have not paid much attention to the userspace tooling lately. But all 
of this must be handled in userspace, having avdecc in the kernel would be 
an utter nightmare :)

> 2. A user space audio application that puts it all together, making
>   use of the services in #1, the linuxptp gPTP service, the ALSA
>   services, and the network connections.  This program will have all
>   the knowledge about packet formats, AV encodings, and the local HW
>   capabilities.  This program cannot yet be written, as we still need
>   some kernel work in the audio and networking subsystems.

And therein lies the problem. It cannot yet be written, so we have to start 
in *some* end. And as I repeatedly stated in June, I'm at an RFC here, 
trying to spark some interest and lure other developers in :)

Also, I really do not want a media-application to care about _where_ the 
frames are going. Sure, I see the issue of configuring a link, but that can 
be done from _outside_ the media-application. VLC (or aplay, or totem, or 
.. take your pick) should not have to worry about this.

Applications that require finer control over timestamping are easier to 
adapt to AVB than all the others, I'd rather add special knobs for those 
who are interested than adding a set of knobs that -all- applications must 
be aware of.

Could be that we are talking about the same thing, just from different 
perspectives.

Re: [PATCH 3.12 00/38] 3.12.69-stable review

2016-12-17 Thread Jiri Slaby

On 12/14/2016, 01:51 AM, Shuah Khan wrote:
> Compiled and booted on my test system. No dmesg regressions.

On 12/14/2016, 04:42 AM, Guenter Roeck wrote:
> Build results:
> total: 128 pass: 128 fail: 0
> Qemu test results:
> total: 93 pass: 93 fail: 0
>
> Details are available at http://kerneltests.org/builders.

Thank you both!

-- 
js
suse labs

Re: Document accounting of FDs passed over UNIX domain sockets

2016-12-17 Thread Michael Kerrisk (man-pages)

Hi Willy,

On 12/17/2016 08:04 AM, Willy Tarreau wrote:
> Hi Michael,
> 
> On Fri, Dec 16, 2016 at 12:08:33PM +0100, Michael Kerrisk (man-pages) wrote:
>> Hello Willy,
>>
>> Your commit 712f4aad406bb1 ("unix: properly account for FDs passed over 
>> unix sockets") added accounting to ensure that the RLIMIT_NOFILE limit
>> could not be bypassed when passing file descriptors across UNIX
>> domain sockets.
>>
>> Such patches should be CCed to linux-...@vger.kernel.org ;-)
> 
> Yes, I learned this after your presentation at kernel recipes, but this
> patch pre-dates it ;-)

But the note in Documentation/SubmittingPatches predates that ;-)

>> A documentation patch would be great as well, but I had a shot 
>> at cobbling some text together. Does the text below (for the unix(7)
>> man page) look okay?
> 
> I think so, though maybe we can arrange it very slightly given that
> this was considered as a fix for a vulnerability and backported to
> various kernels :
> 
>>ETOOMANYREFS
>>   This  error  can  occur  for sendmsg(2) when sending a file
>>   descriptor as ancillary data over a UNIX domain socket (see
>>   the  description  of  SCM_RIGHTS, above).  It occurs if the
>>   number  of  "in-flight"  file   descriptors   exceeds   the
>>   RLIMIT_NOFILE  resource  limit and the caller does not have
>>   the  CAP_SYS_RESOURCE  capability.   An   in-flight   file
>>   descriptor  is  one that has been sent using sendmsg(2) but
>>   has not yet been accepted in the  recipient  process  using
>>   recvmsg(2).
>>
>>   This error is diagnosed since Linux 4.5.  In earlier kernel
>>   versions, it was possible to place an unlimited  number  of
>>   file descriptors in flight, by sending each file descriptor
>>   with sendmsg(2) and then closing  the  file  descriptor  so
>>   that   it  was  not  accounted  against  the  RLIMIT_NOFILE
>>   resource limit.
> 
> -   resource limit.
> +   resource limit. Some older stable kernels might have
> +   included the same check by backporting the fix from 4.5.
> 
> I've just checked the exact versions containing this, but I don't think
> it's worth providing the list, in my opinion mentioning that it could be
> observed on some older versions is enough to help developers who see it
> in field :
>   - 3.2.78
>   - 3.10.99
>   - 3.12.57
>   - 3.14.63
>   - 3.16.35
>   - 3.18.27
>   - 4.1.19
>   - 4.4.4

Yea. This is a tricky issue that I run into now and then. I've added
some different wording that expresses the same idea you intended.
Thanks for noting this.

Cheers,

Michael




-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
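
For readers unfamiliar with the operation being accounted, here is a minimal sketch of passing one file descriptor as SCM_RIGHTS ancillary data. The helper names `send_fd` and `recv_fd` are hypothetical; the control-message layout follows the usual cmsg(3) pattern. A descriptor is "in flight" between the sendmsg(2) and the matching recvmsg(2), which is the window the ETOOMANYREFS check above counts.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send one file descriptor over a UNIX domain socket. Returns 0 or -1. */
int send_fd(int sock, int fd)
{
	char dummy = 'x';  /* at least one byte of real data must accompany it */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;  /* forces correct alignment of buf */
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	/* Per the text above, this may fail with errno ETOOMANYREFS. */
	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor; returns it, or -1 on failure. */
int recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

A sender that closes its own copy right after send_fd() returns is exactly the pattern that, before the accounting fix, let descriptors pile up in flight without counting against RLIMIT_NOFILE.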

Re: wl1251 NVS calibration data format

2016-12-17 Thread Sebastian Reichel

Hi,

On Fri, Dec 16, 2016 at 12:01:48PM +0100, Pali Rohár wrote:
> Hi! Do you know format of wl1251 NVS calibration data file?
> 
> I found that there is tool for changing NVS file for wl1271 and newer 
> chips (so not for wl1251!) at: https://github.com/gxk/ti-utils
> 
> And wl1271 has in NVS data already place for MAC address. And in wlcore 
> (for wl1271 and newer) there is really kernel code which is doing 
> something with MAC address in NVS, see: 
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/net/wireless/ti/wlcore/boot.c#n352
> 
> So... I would like to know if in wl1251 NVS calibration file is also 
> some place for MAC address or not.
> 
> Default wl1251 NVS calibration file is available in linux-firmware: 
> https://git.kernel.org/cgit/linux/kernel/git/firmware/linux-firmware.git/tree/ti-connectivity/wl1251-nvs.bin

Pandora people [0] have a description of the format at [1].

[0] https://pandorawiki.org/WiFi
[1] http://notaz.gp2x.de/misc/pnd/wl1251/nvs_map.txt

-- Sebastian



Re: [GIT PULL] kbuild changes for v4.9-rc1

2016-12-17 Thread Adam Borowski

On Sat, Dec 17, 2016 at 09:57:47AM +0100, Jiri Slaby wrote:
> On 12/16/2016, 08:57 PM, Linus Torvalds wrote:
> > On Fri, Dec 16, 2016 at 11:55 AM, Jiri Slaby  wrote:
> >>
> >> what happened to this? I had to apply this to fix 4.9-pae kernel here.
> > 
> > Did you actually have to do that?
> 
> Yes, disk drivers won't load:
> [2.141973] virtio_pci: disagrees about version of symbol mcount
> [2.144415] virtio_pci: Unknown symbol mcount (err -22)
> and whole machine gets stuck with systemd waiting for /dev/sd*.
> 
> > Because a missing CRC shouldn't be fatal in 4.9.

Most of us get just a scary-looking warning, but whatever the problem is for
you, it's good to hear this patch works around it.

Whatever the long-term solution will be, for 4.10 an updated[1] version of
this fix is on kbuild/kbuild (and kbuild/for-next).  I guess we'll bother
stable@ once it is merged.

Note that it handles only x86, there's a bunch of other architectures
affected, alpha m68k s390 sparc ia64 might still need fixing.

Meow!

[1]. Turns out there was a missing symbol on 486; people build-test those
but don't try to actually boot, and even when they do, they don't read
warnings.
-- 
Autotools hint: to do a zx-spectrum build on a pdp11 host, type:
  ./configure --host=zx-spectrum --build=pdp11

Re: [PATCH] ALSA: use designated initializers

2016-12-17 Thread Takashi Sakamoto


On Dec 17 2016 09:59, Kees Cook wrote:

Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. These were identified during
allyesconfig builds of x86, arm, and arm64, with most initializer fixes
extracted from grsecurity.

Signed-off-by: Kees Cook 
---
 sound/synth/emux/emux_seq.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)


Reviewed-by: Takashi Sakamoto 


diff --git a/sound/synth/emux/emux_seq.c b/sound/synth/emux/emux_seq.c
index a0209204ae48..55579f6b8cb2 100644
--- a/sound/synth/emux/emux_seq.c
+++ b/sound/synth/emux/emux_seq.c
@@ -33,13 +33,13 @@ static int snd_emux_unuse(void *private_data, struct snd_seq_port_subscribe *inf
  * MIDI emulation operators
  */
 static struct snd_midi_op emux_ops = {
-   snd_emux_note_on,
-   snd_emux_note_off,
-   snd_emux_key_press,
-   snd_emux_terminate_note,
-   snd_emux_control,
-   snd_emux_nrpn,
-   snd_emux_sysex,
+   .note_on = snd_emux_note_on,
+   .note_off = snd_emux_note_off,
+   .key_press = snd_emux_key_press,
+   .note_terminate = snd_emux_terminate_note,
+   .control = snd_emux_control,
+   .nrpn = snd_emux_nrpn,
+   .sysex = snd_emux_sysex,
 };


Regards

Takashi Sakamoto
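
Why designated initializers matter for structure randomization can be shown with a small standalone sketch. `struct demo_midi_op` and the `demo_*` functions are hypothetical stand-ins for struct snd_midi_op and its callbacks; the point is only that each callback is bound by member name rather than by position.

```c
/* Hypothetical two-callback ops table, standing in for struct snd_midi_op. */
struct demo_midi_op {
	int (*note_on)(int note, int vel);
	int (*note_off)(int note, int vel);
};

int demo_note_on(int note, int vel)
{
	return note + vel;
}

int demo_note_off(int note, int vel)
{
	(void)vel;
	return -note;
}

/*
 * Designated form: the initializer order is deliberately the reverse of
 * the declaration order, yet every callback still lands in the right
 * member. A positional initializer would silently swap them if the
 * member layout changed. Members not named are zero-initialized.
 */
const struct demo_midi_op demo_ops = {
	.note_off = demo_note_off,
	.note_on = demo_note_on,
};
```

With the positional form removed by the patch, reordering the struct members would bind each function to the wrong slot without any compiler diagnostic, since all the callbacks here share one type.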

Re: Revised request_key(2) man page for review

2016-12-17 Thread Michael Kerrisk (man-pages)

Hello David,

On 12/15/2016 11:10 AM, David Howells wrote:
> Michael Kerrisk (man-pages)  wrote:
> 
>>>│Is 'keyring' allowed to be 0? Reading the source, it │
>>>│appears so.  In this case, by default,  the  key  is │
>>>│assigned   to   the   session   keyring.   But,  the │
>>>│KEYCTL_SET_REQKEY_KEYRING  also  seems  to  have  an │
>>>│influence here.  What are the details here?  │
> 
> Yes, the destination keyring can be 0.  If you don't specify a destination
> keyring, then:
> 
>  (1) If the key is found to already exist, the serial number is returned, but
>  no extra link is made.
> 
>  (2) If an error occurs other than "this key doesn't exist", then you'll just
>  get the error.
> 
>  (3) If we have to construct a new key, this will be attached to the default
>  keyring (as there's no destination keyring to attach to).

Okay. Please take a look at the revised text that I'll send out
after applying Eugene's patch. (Mail in a few minutes.)

>>># echo 'create user mtk:* *   /bin/keyctl instantiate %k %c %S' \
>>>  > /etc/request-keys.conf
> 
> There's a /etc/request-keys.d/ directory now.

Yes, I'm aware. Did you mean I should fix something on this page?

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

[PATCH] block: loose check on sg gap

2016-12-17 Thread Ming Lei

If the last bvec of the 1st bio and the 1st bvec of the next
bio are physically contiguous, and the latter can be merged
into the last segment of the 1st bio, we should consider that they
don't violate the sg gap (or virt boundary) limit.

Both Vitaly and Dexuan reported that lots of unmergeable small bios
are observed when running mkfs on Hyper-V virtual storage, and that
performance becomes quite low; this patch fixes that performance
issue.

The same issue should exist on NVMe too, since it sets a virt boundary as well.

Reported-by: Vitaly Kuznetsov 
Reported-by: Dexuan Cui 
Tested-by: Dexuan Cui 
Cc: Keith Busch 
Signed-off-by: Ming Lei 
---
 include/linux/blkdev.h | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 286b2a264383..1ce26e771bcc 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1608,6 +1608,25 @@ static inline bool bvec_gap_to_prev(struct request_queue *q,
return __bvec_gap_to_prev(q, bprv, offset);
 }
 
+/*
+ * Check if the two bvecs from two bios can be merged to one segment.
+ * If yes, no need to check gap between the two bios since the 1st bio
+ * and the 1st bvec in the 2nd bio can be handled in one segment.
+ */
+static inline bool bios_segs_mergeable(struct request_queue *q,
+   struct bio *prev, struct bio_vec *prev_last_bv,
+   struct bio_vec *next_first_bv)
+{
+   if (!BIOVEC_PHYS_MERGEABLE(prev_last_bv, next_first_bv))
+   return false;
+   if (!BIOVEC_SEG_BOUNDARY(q, prev_last_bv, next_first_bv))
+   return false;
+   if (prev->bi_seg_back_size + next_first_bv->bv_len >
+   queue_max_segment_size(q))
+   return false;
+   return true;
+}
+
 static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
 struct bio *next)
 {
@@ -1617,7 +1636,8 @@ static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
bio_get_last_bvec(prev, &pb);
bio_get_first_bvec(next, &nb);
 
-   return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
+   if (!bios_segs_mergeable(q, prev, &pb, &nb))
+   return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
}
 
return false;
-- 
2.7.4
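
The three conditions checked by bios_segs_mergeable() can be tried out in userspace with a flat model. This is only a sketch under stated assumptions: `struct model_bvec`, `struct model_limits` and `model_segs_mergeable` are hypothetical, a bvec is reduced to a physical start address plus a non-zero length, and the boundary mask is assumed to have the form 2^n - 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flat model of a bvec: physical start address and length. */
struct model_bvec {
	uint64_t phys;
	uint32_t len;   /* assumed non-zero */
};

/* Hypothetical queue limits; seg_boundary_mask has the form 2^n - 1. */
struct model_limits {
	uint64_t seg_boundary_mask;
	uint32_t max_segment_size;
};

/*
 * Can `next` be appended to the segment that currently ends with `prev`?
 * back_seg_size is the size of that last segment so far (the role played
 * by bi_seg_back_size in the patch).
 */
bool model_segs_mergeable(const struct model_limits *lim,
			  const struct model_bvec *prev,
			  uint32_t back_seg_size,
			  const struct model_bvec *next)
{
	uint64_t first = prev->phys;
	uint64_t last = next->phys + next->len - 1;

	/* 1. The two vectors must be physically contiguous. */
	if (prev->phys + prev->len != next->phys)
		return false;
	/* 2. The merged range must not cross a segment boundary. */
	if ((first | lim->seg_boundary_mask) != (last | lim->seg_boundary_mask))
		return false;
	/* 3. The grown segment must still fit the size limit. */
	if ((uint64_t)back_seg_size + next->len > lim->max_segment_size)
		return false;
	return true;
}
```

Only when all three checks pass does the patch skip the gap test; otherwise bio_will_gap() falls back to __bvec_gap_to_prev() as before.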

Re: [PATCH v1 2/2] firmware: dmi_scan: Pass dmi_entry_point to kexec'ed kernel

2016-12-17 Thread Dave Young

On 12/16/16 at 02:18pm, Andy Shevchenko wrote:
> On Fri, 2016-12-16 at 10:32 +0800, Dave Young wrote:
> > On 12/15/16 at 12:28pm, Jean Delvare wrote:
> > > Hi Andy,
> > > 
> > > On Fri,  2 Dec 2016 21:54:16 +0200, Andy Shevchenko wrote:
> > > > Until now kexec'ed kernel has no clue where to look for DMI entry
> > > > point.
> > > > 
> > > > Pass it via kernel command line parameter in the same way as it's
> > > > done for ACPI
> > > > RSDP.
> > > 
> > > I am no kexec expert but this confuses me. Shouldn't the second
> > > kernel
> > > have access to the EFI systab as the first kernel does? It includes
> > > many more pointers than just ACPI and DMI tables, and it would seem
> > > inconvenient to have to pass all these addresses individually
> > > explicitly.
> > 
> > Yes, in modern linux kernel, kexec has the support for EFI, I think it
> > should work naturally at least in x86_64.
> 
> Thanks for this good news!
> 
> Unfortunately Intel Galileo is 32-bit platform.

Maybe you can try using the efi=noruntime kernel parameter in the kexec/kdump
kernel and see if it works or not.

> 
> -- 
> Andy Shevchenko 
> Intel Finland Oy

Re: [PATCH v1 2/2] firmware: dmi_scan: Pass dmi_entry_point to kexec'ed kernel

2016-12-17 Thread Dave Young

Ccing efi people.

On 12/16/16 at 02:33pm, Jean Delvare wrote:
> On Fri, 16 Dec 2016 14:18:58 +0200, Andy Shevchenko wrote:
> > On Fri, 2016-12-16 at 10:32 +0800, Dave Young wrote:
> > > On 12/15/16 at 12:28pm, Jean Delvare wrote:
> > > > I am no kexec expert but this confuses me. Shouldn't the second
> > > > kernel have access to the EFI systab as the first kernel does? It
> > > > includes many more pointers than just ACPI and DMI tables, and it
> > > > would seem inconvenient to have to pass all these addresses
> > > > individually explicitly.
> > > 
> > > Yes, in modern linux kernel, kexec has the support for EFI, I think it
> > > should work naturally at least in x86_64.
> > 
> > Thanks for this good news!
> > 
> > Unfortunately Intel Galileo is 32-bit platform.
> 
> If it was done for X86_64 then maybe it can be generalized to X86?

For X86_64, we have a new way of mapping EFI runtime memory; the i386
code still uses the old ioremap way. It is impossible to use the same way
as X86_64 since the virtual address space is limited.

But maybe for 32bit the kexec kernel can run in physical mode. I'm not
sure, so I would suggest that Andy first do a test with efi=noruntime for
the kexec 2nd kernel.

Thanks
Dave

> 
> -- 
> Jean Delvare
> SUSE L3 Support

Re: wl1251 NVS calibration data format

2016-12-17 Thread Pali Rohár

On Saturday 17 December 2016 10:37:05 Sebastian Reichel wrote:
> Hi,
> 
> On Fri, Dec 16, 2016 at 12:01:48PM +0100, Pali Rohár wrote:
> > Hi! Do you know format of wl1251 NVS calibration data file?
> > 
> > I found that there is tool for changing NVS file for wl1271 and
> > newer chips (so not for wl1251!) at:
> > https://github.com/gxk/ti-utils
> > 
> > And wl1271 has in NVS data already place for MAC address. And in
> > wlcore (for wl1271 and newer) there is really kernel code which is
> > doing something with MAC address in NVS, see:
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tre
> > e/drivers/net/wireless/ti/wlcore/boot.c#n352
> > 
> > So... I would like to know if in wl1251 NVS calibration file is
> > also some place for MAC address or not.
> > 
> > Default wl1251 NVS calibration file is available in linux-firmware:
> > https://git.kernel.org/cgit/linux/kernel/git/firmware/linux-firmwar
> > e.git/tree/ti-connectivity/wl1251-nvs.bin
> 
> Pandora people [0] have a description of the format at [1].
> 
> [0] https://pandorawiki.org/WiFi
> [1] http://notaz.gp2x.de/misc/pnd/wl1251/nvs_map.txt

Thank you very very much!

I tried to search for something, but I did not find anything.

That description mentions the STA MAC address:

01a   6d   //STA_ADDR_L Register Address.  (STA MAC Address)
01b   54   //
01c   00   //STA_ADDR_L Register
01d   00   //
01e   32   //
01f   28   //
020   00   //STA_ADDR_H Register Data.

STA is presumably an abbreviation for station, so should it really be set
to the MAC address of that chip?

If yes, that could allow us to set a permanent MAC address at the time of
loading & sending the NVS calibration data... Exactly the same way as the
wl1271 and newer drivers work.

I will try to play with the driver to see whether it is really true!

I already looked into original TI's multiplatform HAL driver for wl1251 
chip (big mess) and found there that there is wl1251 command to read mac 
address from chip. It could be done by this wl1251 function:

wl1251_cmd_interrogate(wl, DOT11_STATION_ID, mac, sizeof(*mac))

(same id as for setting permanent mac address, but opposite to read it)

-- 
Pali Rohár
pali.ro...@gmail.com


Re: [PATCH 2/2] mm, oom: do not enfore OOM killer for __GFP_NOFAIL automatically

2016-12-17 Thread Tetsuo Handa

Michal Hocko wrote:
> On Fri 16-12-16 12:31:51, Johannes Weiner wrote:
>>> @@ -3737,6 +3752,16 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int 
>>> order,
>>>  */
>>> WARN_ON_ONCE(order > PAGE_ALLOC_COSTLY_ORDER);
>>>  
>>> +   /*
>>> +* Help non-failing allocations by giving them access to memory
>>> +* reserves but do not use ALLOC_NO_WATERMARKS because this
>>> +* could deplete whole memory reserves which would just make
>>> +* the situation worse
>>> +*/
>>> +   page = __alloc_pages_cpuset_fallback(gfp_mask, order, 
>>> ALLOC_HARDER, ac);
>>> +   if (page)
>>> +   goto got_pg;
>>> +
>>
>> But this should be a separate patch, IMO.
>>
>> Do we observe GFP_NOFS lockups when we don't do this? 
> 
> this is hard to tell but considering users like grow_dev_page we can get
> stuck with a very slow progress I believe. Those allocations could see
> some help.
> 
>> Don't we risk
>> premature exhaustion of the memory reserves, and it's better to wait
>> for other reclaimers to make some progress instead?
> 
> waiting for other reclaimers would be preferable but we should at least
> give these some priority, which is what ALLOC_HARDER should help with.
> 
>> Should we give
>> reserve access to all GFP_NOFS allocations, or just the ones from a
>> reclaim/cleaning context?
> 
> I would focus only for those which are important enough. Which are those
> is a harder question. But certainly those with GFP_NOFAIL are important
> enough.
> 
>> All that should go into the changelog of a separate allocation booster
>> patch, I think.
> 
> The reason I did both in the same patch is to address the concern about
> potential lockups when NOFS|NOFAIL cannot make any progress. I've chosen
> ALLOC_HARDER to give the minimum portion of the reserves so that we do
> not risk other high priority users to be blocked out but still help a
> bit at least and prevent from starvation when other reclaimers are
> faster to consume the reclaimed memory.
> 
> I can extend the changelog of course but I believe that having both
> changes together makes some sense. NOFS|NOFAIL allocations are not all
> that rare and sometimes we really depend on them making a further
> progress.
> 

I feel that allowing access to memory reserves based on __GFP_NOFAIL might not
make sense. My understanding is that the actual I/O operations triggered by I/O
requests from filesystem code are processed by other threads. Even if we grant
access to memory reserves to GFP_NOFS | __GFP_NOFAIL allocations by fs code,
I think it is possible that memory allocations by the underlying bio code
fail to make further progress unless memory reserves are granted there as well.

Below is a typical trace which I observe in an OOM lockup situation (though
this trace is from an OOM stress test using XFS).


[ 1845.187246] MemAlloc: kworker/2:1(14498) flags=0x4208060 switches=323636 seq=48 gfp=0x240(GFP_NOIO) order=0 delay=430400 uninterruptible
[ 1845.187248] kworker/2:1 D12712 14498  2 0x0080
[ 1845.187251] Workqueue: events_freezable_power_ disk_events_workfn
[ 1845.187252] Call Trace:
[ 1845.187253]  ? __schedule+0x23f/0xba0
[ 1845.187254]  schedule+0x38/0x90
[ 1845.187255]  schedule_timeout+0x205/0x4a0
[ 1845.187256]  ? del_timer_sync+0xd0/0xd0
[ 1845.187257]  schedule_timeout_uninterruptible+0x25/0x30
[ 1845.187258]  __alloc_pages_nodemask+0x1035/0x10e0
[ 1845.187259]  ? alloc_request_struct+0x14/0x20
[ 1845.187261]  alloc_pages_current+0x96/0x1b0
[ 1845.187262]  ? bio_alloc_bioset+0x20f/0x2e0
[ 1845.187264]  bio_copy_kern+0xc4/0x180
[ 1845.187265]  blk_rq_map_kern+0x6f/0x120
[ 1845.187268]  __scsi_execute.isra.23+0x12f/0x160
[ 1845.187270]  scsi_execute_req_flags+0x8f/0x100
[ 1845.187271]  sr_check_events+0xba/0x2b0 [sr_mod]
[ 1845.187274]  cdrom_check_events+0x13/0x30 [cdrom]
[ 1845.187275]  sr_block_check_events+0x25/0x30 [sr_mod]
[ 1845.187276]  disk_check_events+0x5b/0x150
[ 1845.187277]  disk_events_workfn+0x17/0x20
[ 1845.187278]  process_one_work+0x1fc/0x750
[ 1845.187279]  ? process_one_work+0x167/0x750
[ 1845.187279]  worker_thread+0x126/0x4a0
[ 1845.187280]  kthread+0x10a/0x140
[ 1845.187281]  ? process_one_work+0x750/0x750
[ 1845.187282]  ? kthread_create_on_node+0x60/0x60
[ 1845.187283]  ret_from_fork+0x2a/0x40


I think that this GFP_NOIO allocation request needs to consume more memory
reserves than a GFP_NOFS allocation request to make progress.
Do we want to add __GFP_NOFAIL to this GFP_NOIO allocation request in order to
allow access to memory reserves, as for GFP_NOFS | __GFP_NOFAIL allocation
requests?

Re: [RFC] minimum gcc version for kernel: raise to gcc-4.3 or 4.6?

2016-12-17 Thread Sebastian Andrzej Siewior

On 2016-12-16 23:00:27 [+0100], Arnd Bergmann wrote:
> On Friday, December 16, 2016 6:00:43 PM CET Sebastian Andrzej Siewior wrote:
> > On 2016-12-16 11:56:21 [+0100], Arnd Bergmann wrote:
> > > The original gcc-4.3 release was in early 2008. If we decide to still
> > > support that, we probably want the first 10 quirks in this series,
> > > while gcc-4.6 (released in 2011) requires none of them.
> > 
> > It this min gcc thingy ARM only?
> 
> This is part of the question that I'm trying to figure out myself.
> 
> Clearly having the same minimum version across all architectures simplifies
> things a lot, because many of the bugs in old versions are architecture
> independent. 

agreed.

> Then again, some architectures implicitly require a new version
> because an old one never existed (e.g. arm64 or risc-v), while some other
> architectures may require an old version.

A new version is understandable. But why is an old version required?
One thing is an enterprise distro that is "current" or "supported" and still
stuck with gcc 4.1 because that is the version they decided to include in
their release. This is sad. But you might want to ask yourself why you want
the latest kernel but an old gcc / binutils.

If you have an architecture that compiles with gcc v4.1 and not with gcc
latest stable / trunk then it is a sign that this port is not supported
properly / not healthy. A different case is something like avr32, which is
not part of upstream gcc due to some legal reason (that was my understanding
a few years ago). That might become a problem for them once large parts of
userland switch to a later C++ standard, which means gcc-5+.

>   Arnd

Sebastian

[PATCH] drivers: remoteproc: constify rproc_ops structures

2016-12-17 Thread Bhumika Goyal

Declare rproc_ops structures as const as they are only passed as an
argument to the function rproc_alloc. This argument is of type const, so
rproc_ops structures having this property can be declared const too.
Done using Coccinelle:

@r1 disable optional_qualifier @
identifier i;
position p;
@@
static struct rproc_ops i@p = {...};

@ok1@
identifier r1.i;
position p;
@@
rproc_alloc(...,&i@p,...)

@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p

@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const
struct rproc_ops i;

File sizes before:
   text    data     bss     dec     hex filename
   1258     416       0    1674     68a remoteproc/omap_remoteproc.o
   2402     240       0    2642     a52 remoteproc/st_remoteproc.o
   2064     272       0    2336     920 remoteproc/st_slim_rproc.o
   2160     240       0    2400     960 remoteproc/wkup_m3_rproc.o

File sizes after:
   text    data     bss     dec     hex filename
   1297     368       0    1665     681 remoteproc/omap_remoteproc.o
   2434     192       0    2626     a42 remoteproc/st_remoteproc.o
   2112     240       0    2352     930 remoteproc/st_slim_rproc.o
   2200     192       0    2392     958 remoteproc/wkup_m3_rproc.o

Signed-off-by: Bhumika Goyal 
---
 drivers/remoteproc/omap_remoteproc.c | 2 +-
 drivers/remoteproc/st_remoteproc.c   | 2 +-
 drivers/remoteproc/st_slim_rproc.c   | 2 +-
 drivers/remoteproc/wkup_m3_rproc.c   | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/remoteproc/omap_remoteproc.c 
b/drivers/remoteproc/omap_remoteproc.c
index fa63bf2..a96ce90 100644
--- a/drivers/remoteproc/omap_remoteproc.c
+++ b/drivers/remoteproc/omap_remoteproc.c
@@ -177,7 +177,7 @@ static int omap_rproc_stop(struct rproc *rproc)
return 0;
 }
 
-static struct rproc_ops omap_rproc_ops = {
+static const struct rproc_ops omap_rproc_ops = {
.start  = omap_rproc_start,
.stop   = omap_rproc_stop,
.kick   = omap_rproc_kick,
diff --git a/drivers/remoteproc/st_remoteproc.c 
b/drivers/remoteproc/st_remoteproc.c
index da4e152..f21787b 100644
--- a/drivers/remoteproc/st_remoteproc.c
+++ b/drivers/remoteproc/st_remoteproc.c
@@ -107,7 +107,7 @@ static int st_rproc_stop(struct rproc *rproc)
return sw_err ?: pwr_err;
 }
 
-static struct rproc_ops st_rproc_ops = {
+static const struct rproc_ops st_rproc_ops = {
.start  = st_rproc_start,
.stop   = st_rproc_stop,
 };
diff --git a/drivers/remoteproc/st_slim_rproc.c 
b/drivers/remoteproc/st_slim_rproc.c
index 507716c..6cfd862 100644
--- a/drivers/remoteproc/st_slim_rproc.c
+++ b/drivers/remoteproc/st_slim_rproc.c
@@ -200,7 +200,7 @@ static void *slim_rproc_da_to_va(struct rproc *rproc, u64 
da, int len)
return va;
 }
 
-static struct rproc_ops slim_rproc_ops = {
+static const struct rproc_ops slim_rproc_ops = {
.start  = slim_rproc_start,
.stop   = slim_rproc_stop,
.da_to_va   = slim_rproc_da_to_va,
diff --git a/drivers/remoteproc/wkup_m3_rproc.c 
b/drivers/remoteproc/wkup_m3_rproc.c
index 18175d0..1ada0e5 100644
--- a/drivers/remoteproc/wkup_m3_rproc.c
+++ b/drivers/remoteproc/wkup_m3_rproc.c
@@ -111,7 +111,7 @@ static void *wkup_m3_rproc_da_to_va(struct rproc *rproc, 
u64 da, int len)
return va;
 }
 
-static struct rproc_ops wkup_m3_rproc_ops = {
+static const struct rproc_ops wkup_m3_rproc_ops = {
.start  = wkup_m3_rproc_start,
.stop   = wkup_m3_rproc_stop,
.da_to_va   = wkup_m3_rproc_da_to_va,
-- 
1.9.1
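
The reasoning in the changelog above (an ops table that is only ever handed
to a function taking a const pointer can itself be const) can be illustrated
outside the kernel. The following is a minimal userspace sketch, not the
remoteproc code: struct demo_ops, struct demo_dev, demo_alloc() and
demo_boot() are hypothetical stand-ins for struct rproc_ops, struct rproc and
rproc_alloc().

```c
#include <stddef.h>

/* Hypothetical stand-in for struct rproc_ops: a table of callbacks. */
struct demo_ops {
	int (*start)(int id);
	int (*stop)(int id);
};

/* Hypothetical stand-in for struct rproc: it keeps only a const pointer. */
struct demo_dev {
	const struct demo_ops *ops;
	int id;
};

static int demo_start(int id)
{
	return id + 1;
}

static int demo_stop(int id)
{
	(void)id;
	return 0;
}

/*
 * Nothing ever writes to this table, so it can be const. The data then
 * typically lands in a read-only section, which is consistent with bytes
 * moving from "data" to "text" in the size output.
 */
static const struct demo_ops demo_dev_ops = {
	.start = demo_start,
	.stop  = demo_stop,
};

/* The const-qualified parameter is what makes the const table acceptable. */
static void demo_alloc(struct demo_dev *dev, const struct demo_ops *ops, int id)
{
	dev->ops = ops;
	dev->id = id;
}

/* Callers go through the const pointer and can only read the table. */
static int demo_boot(const struct demo_dev *dev)
{
	return dev->ops->start(dev->id);
}
```

If demo_alloc() took a plain "struct demo_ops *" instead, passing
&demo_dev_ops would discard the qualifier and the compiler would warn, which
is why the Coccinelle rule only constifies structures whose every use is the
const-qualified argument.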

Re: [PATCH 32/60] block: implement sp version of bvec iterator helpers

2016-12-17 Thread Ming Lei

Hi Guys,

On Sat, Oct 29, 2016 at 7:06 PM, kbuild test robot  wrote:
> Hi Ming,

Thanks for the report!

>
> [auto build test ERROR on linus/master]
> [also build test ERROR on v4.9-rc2 next-20161028]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
> [Suggest to use git(>=2.9.0) format-patch --base= (or --base=auto for 
> convenience) to record what (public, well-known) commit your patch series was 
> built on]
> [Check https://git-scm.com/docs/git-format-patch for more information]
>
> url:
> https://github.com/0day-ci/linux/commits/Ming-Lei/block-support-multipage-bvec/20161029-163910
> config: sparc-defconfig (attached as .config)
> compiler: sparc-linux-gcc (GCC) 6.2.0
> reproduce:
> wget 
> https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
>  -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # save the attached .config to linux build tree
> make.cross ARCH=sparc
>
> All error/warnings (new ones prefixed by >>):
>
>In file included from arch/sparc/include/asm/oplib.h:6:0,
> from arch/sparc/include/asm/pgtable_32.h:21,
> from arch/sparc/include/asm/pgtable.h:6,
> from include/linux/mm.h:68,


This issue seems to be caused by something in the sparc arch code; this patch
only adds '#include <linux/mm.h>' to 'include/linux/bvec.h' in order to use
nth_page().

So Cc sparc list.

Thanks,
Ming

> from include/linux/bvec.h:25,
> from include/linux/blk_types.h:9,
> from include/linux/fs.h:31,
> from include/linux/proc_fs.h:8,
> from arch/sparc/include/asm/prom.h:22,
> from include/linux/of.h:232,
> from arch/sparc/include/asm/openprom.h:14,
> from arch/sparc/include/asm/device.h:9,
> from include/linux/device.h:30,
> from include/linux/node.h:17,
> from include/linux/cpu.h:16,
> from include/linux/stop_machine.h:4,
> from kernel/sched/sched.h:10,
> from kernel/sched/loadavg.c:11:
>>> arch/sparc/include/asm/oplib_32.h:105:39: warning: 'struct 
>>> linux_prom_registers' declared inside parameter list will not be visible 
>>> outside of this definition or declaration
> int prom_startcpu(int cpunode, struct linux_prom_registers *context_table,
>   ^~~~
>arch/sparc/include/asm/oplib_32.h:168:36: warning: 'struct 
> linux_prom_registers' declared inside parameter list will not be visible 
> outside of this definition or declaration
> void prom_apply_obio_ranges(struct linux_prom_registers *obioregs, int 
> nregs);
>^~~~
>arch/sparc/include/asm/oplib_32.h:172:18: warning: 'struct 
> linux_prom_registers' declared inside parameter list will not be visible 
> outside of this definition or declaration
>   struct linux_prom_registers *sbusregs, int nregs);
>  ^~~~
> --
>In file included from arch/sparc/include/asm/oplib.h:6:0,
> from arch/sparc/include/asm/pgtable_32.h:21,
> from arch/sparc/include/asm/pgtable.h:6,
> from include/linux/mm.h:68,
> from include/linux/bvec.h:25,
> from include/linux/blk_types.h:9,
> from include/linux/fs.h:31,
> from include/linux/proc_fs.h:8,
> from arch/sparc/include/asm/prom.h:22,
> from include/linux/of.h:232,
> from arch/sparc/include/asm/openprom.h:14,
> from arch/sparc/prom/mp.c:12:
>>> arch/sparc/include/asm/oplib_32.h:105:39: warning: 'struct 
>>> linux_prom_registers' declared inside parameter list will not be visible 
>>> outside of this definition or declaration
> int prom_startcpu(int cpunode, struct linux_prom_registers *context_table,
>   ^~~~
>arch/sparc/include/asm/oplib_32.h:168:36: warning: 'struct 
> linux_prom_registers' declared inside parameter list will not be visible 
> outside of this definition or declaration
> void prom_apply_obio_ranges(struct linux_prom_registers *obioregs, int 
> nregs);
>^~~~
>arch/sparc/include/asm/oplib_32.h:172:18: warning: 'struct 
> linux_prom_registers' declared inside parameter list will not be visible 
> outside of this definition or declaration
>   struct linux_prom_registers *sbusregs, int nregs);
>  ^~~~
>>> arch/sparc/prom/mp.c:23:1: error: conflicting types for 'prom_startcpu'
> prom_startcpu(int cpunode, struct linux_prom_r

[tip:x86/urgent] x86/mpx: Move bd_addr to mm_context_t

2016-12-17 Thread tip-bot for Mark Rutland

Commit-ID:  cb02de96ec724b84373488dd349e53897ab432f5
Gitweb: http://git.kernel.org/tip/cb02de96ec724b84373488dd349e53897ab432f5
Author: Mark Rutland 
AuthorDate: Fri, 16 Dec 2016 12:40:55 +
Committer:  Thomas Gleixner 
CommitDate: Sat, 17 Dec 2016 12:29:56 +0100

x86/mpx: Move bd_addr to mm_context_t

Currently bd_addr lives in mm_struct, which is otherwise architecture
independent. Architecture-specific data is supposed to live within
mm_context_t (itself contained in mm_struct).

Other x86-specific context like the pkey accounting data lives in
mm_context_t, and there's no reason the MPX data can't also live there.
So as to keep the arch-specific data together, and to set a good example
for others, this patch moves bd_addr into x86's mm_context_t.

Signed-off-by: Mark Rutland 
Acked-by: Dave Hansen 
Cc: Andrew Morton 
Link: 
http://lkml.kernel.org/r/1481892055-24596-1-git-send-email-mark.rutl...@arm.com
Signed-off-by: Thomas Gleixner 

---
 arch/x86/include/asm/mmu.h |  4 
 arch/x86/include/asm/mpx.h |  4 ++--
 arch/x86/mm/mpx.c  | 10 +-
 include/linux/mm_types.h   |  4 
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 72198c6..f9813b6 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -31,6 +31,10 @@ typedef struct {
u16 pkey_allocation_map;
s16 execute_only_pkey;
 #endif
+#ifdef CONFIG_X86_INTEL_MPX
+   /* address of the bounds directory */
+   void __user *bd_addr;
+#endif
 } mm_context_t;
 
 #ifdef CONFIG_SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 7a35495..0b416d4 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -59,7 +59,7 @@ siginfo_t *mpx_generate_siginfo(struct pt_regs *regs);
 int mpx_handle_bd_fault(void);
 static inline int kernel_managing_mpx_tables(struct mm_struct *mm)
 {
-   return (mm->bd_addr != MPX_INVALID_BOUNDS_DIR);
+   return (mm->context.bd_addr != MPX_INVALID_BOUNDS_DIR);
 }
 static inline void mpx_mm_init(struct mm_struct *mm)
 {
@@ -67,7 +67,7 @@ static inline void mpx_mm_init(struct mm_struct *mm)
 * NULL is theoretically a valid place to put the bounds
 * directory, so point this at an invalid address.
 */
-   mm->bd_addr = MPX_INVALID_BOUNDS_DIR;
+   mm->context.bd_addr = MPX_INVALID_BOUNDS_DIR;
 }
 void mpx_notify_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
  unsigned long start, unsigned long end);
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index e4f8009..324e571 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -350,12 +350,12 @@ int mpx_enable_management(void)
 * The copy_xregs_to_kernel() beneath get_xsave_field_ptr() is
 * expected to be relatively expensive. Storing the bounds
 * directory here means that we do not have to do xsave in the
-* unmap path; we can just use mm->bd_addr instead.
+* unmap path; we can just use mm->context.bd_addr instead.
 */
bd_base = mpx_get_bounds_dir();
down_write(&mm->mmap_sem);
-   mm->bd_addr = bd_base;
-   if (mm->bd_addr == MPX_INVALID_BOUNDS_DIR)
+   mm->context.bd_addr = bd_base;
+   if (mm->context.bd_addr == MPX_INVALID_BOUNDS_DIR)
ret = -ENXIO;
 
up_write(&mm->mmap_sem);
@@ -370,7 +370,7 @@ int mpx_disable_management(void)
return -ENXIO;
 
down_write(&mm->mmap_sem);
-   mm->bd_addr = MPX_INVALID_BOUNDS_DIR;
+   mm->context.bd_addr = MPX_INVALID_BOUNDS_DIR;
up_write(&mm->mmap_sem);
return 0;
 }
@@ -947,7 +947,7 @@ static int try_unmap_single_bt(struct mm_struct *mm,
end = bta_end_vaddr;
}
 
-   bde_vaddr = mm->bd_addr + mpx_get_bd_entry_offset(mm, start);
+   bde_vaddr = mm->context.bd_addr + mpx_get_bd_entry_offset(mm, start);
ret = get_bt_addr(mm, bde_vaddr, &bt_addr);
/*
 * No bounds table there, so nothing to unmap.
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 4a8aced..ce70ceb 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -508,10 +508,6 @@ struct mm_struct {
bool tlb_flush_pending;
 #endif
struct uprobes_state uprobes_state;
-#ifdef CONFIG_X86_INTEL_MPX
-   /* address of the bounds directory */
-   void __user *bd_addr;
-#endif
 #ifdef CONFIG_HUGETLB_PAGE
atomic_long_t hugetlb_usage;
 #endif

RE: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-12-17 Thread Li, Liang Z

> On Fri, Dec 16, 2016 at 01:12:21AM +, Li, Liang Z wrote:
> > There still exist the case if the MAX_ORDER is configured to a large
> > value, e.g. 36 for a system with huge amount of memory, then there is only
> 28 bits left for the pfn, which is not enough.
> 
> Not related to the balloon but how would it help to set MAX_ORDER to 36?
> 

My point here is  MAX_ORDER may be configured to a big value.

> What the MAX_ORDER affects is that you won't be able to ask the kernel
> page allocator for contiguous memory bigger than 1<<(MAX_ORDER-1), but
> that's a driver issue not relevant to the amount of RAM. Drivers won't
> suddenly start to ask the kernel allocator to allocate compound pages at
> orders >= 11 just because more RAM was added.
> 
> The higher the MAX_ORDER the slower the kernel runs simply so the smaller
> the MAX_ORDER the better.
> 
> > Should  we limit the MAX_ORDER? I don't think so.
> 
> We shouldn't strictly depend on MAX_ORDER value but it's mostly limited
> already even if configurable at build time.
> 

I didn't know that and will take a look, thanks for your information.


Liang
> We definitely need it to reach at least the hugepage size, then it's mostly
> driver issue, but drivers requiring large contiguous allocations should rely 
> on
> CMA only or vmalloc if they only require it virtually contiguous, and not rely
> on larger MAX_ORDER that would slowdown all kernel allocations/freeing.
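
To put a number on the "1<<(MAX_ORDER-1)" limit mentioned in the quoted text,
here is a small sketch of the arithmetic. max_contig_bytes() is a
hypothetical helper, not a kernel function, and the values used in the note
below (MAX_ORDER of 11, 4 KiB pages) are assumed common defaults rather than
something stated in this thread.

```c
#include <stdint.h>

/*
 * Largest contiguous allocation the page allocator can hand out, in bytes:
 * 2^(max_order - 1) pages of page_size bytes each.
 * The caller must pass max_order >= 1.
 */
static uint64_t max_contig_bytes(unsigned int max_order, uint64_t page_size)
{
	return (UINT64_C(1) << (max_order - 1)) * page_size;
}
```

With the assumed defaults, max_contig_bytes(11, 4096) is 4194304 bytes, i.e.
4 MiB, and each increment of MAX_ORDER doubles that.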

Re: [PATCH] staging: android: ion: return -ENOMEM in ion_cma_heap allocation failure

2016-12-17 Thread Jaewon Kim

2016-12-14 1:04 GMT+09:00 Laura Abbott :
> On 12/08/2016 09:05 PM, Jaewon Kim wrote:
>> Initial Commit 349c9e138551 ("gpu: ion: add CMA heap") returns -1 in 
>> allocation
>> failure. The returned value is passed up to userspace through ioctl. So user 
>> can
>> misunderstand error reason as -EPERM(1) rather than -ENOMEM(12).
>>
>> This patch simply changed this to return -ENOMEM.
>>
>> Signed-off-by: Jaewon Kim 
>> ---
>>  drivers/staging/android/ion/ion_cma_heap.c | 6 ++
>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/staging/android/ion/ion_cma_heap.c 
>> b/drivers/staging/android/ion/ion_cma_heap.c
>> index 6c7de74..22b9582 100644
>> --- a/drivers/staging/android/ion/ion_cma_heap.c
>> +++ b/drivers/staging/android/ion/ion_cma_heap.c
>> @@ -24,8 +24,6 @@
>>  #include "ion.h"
>>  #include "ion_priv.h"
>>
>> -#define ION_CMA_ALLOCATE_FAILED -1
>> -
>>  struct ion_cma_heap {
>>   struct ion_heap heap;
>>   struct device *dev;
>> @@ -59,7 +57,7 @@ static int ion_cma_allocate(struct ion_heap *heap, struct 
>> ion_buffer *buffer,
>>
>>   info = kzalloc(sizeof(struct ion_cma_buffer_info), GFP_KERNEL);
>>   if (!info)
>> - return ION_CMA_ALLOCATE_FAILED;
>> + return -ENOMEM;
>>
>>   info->cpu_addr = dma_alloc_coherent(dev, len, &(info->handle),
>>   GFP_HIGHUSER | __GFP_ZERO);
>> @@ -88,7 +86,7 @@ static int ion_cma_allocate(struct ion_heap *heap, struct 
>> ion_buffer *buffer,
>>   dma_free_coherent(dev, len, info->cpu_addr, info->handle);
>>  err:
>>   kfree(info);
>> - return ION_CMA_ALLOCATE_FAILED;
>> + return -ENOMEM;
>>  }
>>
>>  static void ion_cma_free(struct ion_buffer *buffer)
>>
>
> Happy to see cleanup
>
> Acked-by: Laura Abbott 

Thank you, Laura Abbott. I'm honored to get an Ack from you. I have looked at
many of your patches.
I hope this patch will be mainlined.
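
The confusion described in the changelog can be reproduced in userspace: a
bare -1 has the same value as -EPERM, so anything that decodes the return
value as an errno reports a permission problem. This is only an illustrative
sketch; describe_ret() is a hypothetical helper, and the numeric values are
the Linux ones quoted in the patch description.

```c
#include <errno.h>
#include <string.h>

/* Decode a negative kernel-style return value the way userspace would. */
static const char *describe_ret(int ret)
{
	return strerror(-ret);
}

/* Nonzero when a bare -1 is indistinguishable from -EPERM. */
static int minus_one_is_eperm(void)
{
	return -1 == -EPERM;
}
```

Returning -ENOMEM instead lets the ioctl caller see the real cause of the
failure when it inspects errno.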

Re: wl1251 NVS calibration data format

2016-12-17 Thread Sebastian Reichel

Hi,

On Sat, Dec 17, 2016 at 12:14:50PM +0100, Pali Rohár wrote:
> On Saturday 17 December 2016 10:37:05 Sebastian Reichel wrote:
> > On Fri, Dec 16, 2016 at 12:01:48PM +0100, Pali Rohár wrote:
> > > Hi! Do you know format of wl1251 NVS calibration data file?
> > > 
> > > I found that there is tool for changing NVS file for wl1271 and
> > > newer chips (so not for wl1251!) at:
> > > https://github.com/gxk/ti-utils
> > > 
> > > And wl1271 has in NVS data already place for MAC address. And in
> > > wlcore (for wl1271 and newer) there is really kernel code which is
> > > doing something with MAC address in NVS, see:
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tre
> > > e/drivers/net/wireless/ti/wlcore/boot.c#n352
> > > 
> > > So... I would like to know if in wl1251 NVS calibration file is
> > > also some place for MAC address or not.
> > > 
> > > Default wl1251 NVS calibration file is available in linux-firmware:
> > > https://git.kernel.org/cgit/linux/kernel/git/firmware/linux-firmwar
> > > e.git/tree/ti-connectivity/wl1251-nvs.bin
> > 
> > Pandora people [0] have a description of the format at [1].
> > 
> > [0] https://pandorawiki.org/WiFi
> > [1] http://notaz.gp2x.de/misc/pnd/wl1251/nvs_map.txt
> 
> Thank you very very much!

You are welcome.

> I tried to search for something, but I have not find anything.
> In that description is something about STA mac address:
> 
> 01a   6d   //STA_ADDR_L Register Address.  (STA MAC Address)
> 01b   54   //
> 01c   00   //STA_ADDR_L Register
> 01d   00   //
> 01e   32   //
> 01f   28   //
> 020   00   //STA_ADDR_H Register Data.
> 
> STA would be abbreviation for station and so it should be really set to 
> mac address of that chip?

Yes, STA is a common abbreviation:

https://en.wikipedia.org/wiki/Station_(networking)

> If yes, that could allow us to set permanent MAC address at time when 
> loading & sending NVS calibration data... Exactly same as wl1271 and new 
> drivers are working.
> 
> I will try to play with driver if it is really truth!

Thanks for your work.

> I already looked into original TI's multiplatform HAL driver for wl1251 
> chip (big mess) and found there that there is wl1251 command to read mac 
> address from chip. It could be done by this wl1251 function:
> 
> wl1251_cmd_interrogate(wl, DOT11_STATION_ID, mac, sizeof(*mac))
> 
> (same id as for setting permanent mac address, but opposite to read it)

-- Sebastian



Re: netfilter regression causes lost pings "operation not permitted"

2016-12-17 Thread Florian Westphal

Trevor Cordes  wrote:

Sorry for late reply.

> On 2016-12-07 Trevor Cordes wrote:
> > Bisected down to:
> > 870190a9ec9075205c0fa795a09fa931694a3ff1
> > 7c9664351980aaa6a4b8837a314360b3a4ad382a
> 
> Oh!  I forgot to mention the most important point: iptable_nat module
> MUST be loaded for the bug to show up!
> 
> modprobe iptable_nat
> 
> If you rmmod it, the bug goes away.  Interestingly, the bug occurs even
> if you have every iptables table (including -t nat) completely empty
> (no rules).  All that is required is iptable_nat simply to be loaded.

Pablo, I think stable should revert both patches.

The alternative is for stable to pick up the fixes from the 4.10 tree, but
that requires pulling in rhashtable's new rhlist interface too...

So I think revert is the way to go.

Should I take care of that?

RE: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-12-17 Thread Li, Liang Z

> Subject: Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for
> fast (de)inflating & fast live migration
> 
> On Thu, Dec 15, 2016 at 05:40:45PM -0800, Dave Hansen wrote:
> > On 12/15/2016 05:38 PM, Li, Liang Z wrote:
> > >
> > > Use 52 bits for 'pfn', 12 bits for 'length', when the 12 bits is not long
> enough for the 'length'
> > > Set the 'length' to a special value to indicate the "actual length in 
> > > next 8
> bytes".
> > >
> > > That will be much more simple. Right?
> >
> > Sounds fine to me.
> >
> 
> Sounds fine to me too indeed.
> 
> I'm only wondering what is the major point for compressing gpfn+len in
> 8 bytes in the common case, you already use sg_init_table to send down two
> pages, we could send three as well and avoid all math and bit shifts and ors,
> or not?
> 

Yes, we can use more pages for that.

> I agree with the above because from a performance prospective I tend to
> think the above proposal will run at least theoretically faster because the
> other way is to waste double amount of CPU cache, and bit mangling in the
> encoding and the later decoding on qemu side should be faster than
> accessing an array of double size, but then I'm not sure if it's measurable
> optimization. So I'd be curious to know the exact motivation and if it is to
> reduce the CPU cache usage or if there's some other fundamental reason to
> compress it.
> The header already tells qemu how big is the array payload, couldn't we just
> add more pages if one isn't enough?
> 

The original intention of compressing the PFN and length was to reduce the
memory required. Even though the code has changed a lot from the previous
versions, I think this is still true.

Now we allocate a buffer of a specified size to save the 'PFN|length'
entries. When the buffer is not big enough to save all the page info for a
specified order, a buffer of double the size is allocated. This is what we
want to avoid, because the allocation may fail and allocation takes some time.
For fast live migration, time is a critical factor we have to consider: the
more time it takes, the more unnecessary pages are sent, because live
migration starts before the request for unused pages gets a response.

Thanks

Liang

> Thanks,
> Andrea

Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF

2016-12-17 Thread George Spelvin

BTW, here's some SipHash code I wrote for Linux a while ago.

My target application was ext4 directory hashing, resulting in different
implementation choices, although I still think that a rolled-up
implementation like this is reasonable.  Reducing I-cache impact speeds
up the calling code.

One thing I'd like to suggest you steal is the way it handles the
fetch of the final partial word.  It's a lot smaller and faster than
an 8-way case statement.


#include/* For rol64 */
#include 
#include 
#include 

/* The basic ARX mixing function, taken from Skein */
#define SIP_MIX(a, b, s) ((a) += (b), (b) = rol64(b, s), (b) ^= (a))

/*
 * The complete SipRound.  Note that, when unrolled twice like below,
 * the 32-bit rotates drop out on 32-bit machines.
 */
#define SIP_ROUND(a, b, c, d) \
(SIP_MIX(a, b, 13), SIP_MIX(c, d, 16), (a) = rol64(a, 32), \
 SIP_MIX(c, b, 17), SIP_MIX(a, d, 21), (c) = rol64(c, 32))

/*
 * This is rolled up more than most implementations, resulting in about
 * 55% the code size.  Speed is a few percent slower.  A crude benchmark
 * (for (i=1; i <= max; i++) for (j = 0; j < 4096-i; j++) hash(buf+j, i);)
 * produces the following timings (in usec):
 *
 *            i386    i386     i386  x86_64  x86_64   x86_64   x86_64
 * Length    small  unroll  halfmd4   small  unroll  halfmd4  teahash
 * 1..4       1069    1029     1608     195     160      399      690
 * 1..8       2483    2381     3851     410     360      988     1659
 * 1..12      4303    4152     6207     690     618     1642     2690
 * 1..16      6122    5931     8668     968     876     2363     3786
 * 1..20      8348    8137    11245    1323    1185     3162     5567
 * 1..24     10580   10327    13935    1657    1504     4066     7635
 * 1..28     13211   12956    16803    2069    1871     5028     9759
 * 1..32     15843   15572    19725    2470    2260     6084    11932
 * 1..36     18864   18609    24259    2934    2678     7566    14794
 * 1..1024 5890194 6130242 10264816  881933  881244  3617392  7589036
 *
 * The performance penalty is quite minor, decreasing for long strings,
 * and it's significantly faster than half_md4, so I'm going for the
 * I-cache win.
 */
uint64_t
siphash24(char const *in, size_t len, uint32_t const seed[4])
{
	uint64_t a = 0x736f6d6570736575;	/* somepseu */
	uint64_t b = 0x646f72616e646f6d;	/* dorandom */
	uint64_t c = 0x6c7967656e657261;	/* lygenera */
	uint64_t d = 0x7465646279746573;	/* tedbytes */
	uint64_t m = 0;
	uint8_t padbyte = len;

	/*
	 * Mix in the 128-bit hash seed.  This is in a format convenient
	 * to the ext3/ext4 code.  Please feel free to adapt the
	 * */
	if (seed) {
		m = seed[2] | (uint64_t)seed[3] << 32;
		b ^= m;
		d ^= m;
		m = seed[0] | (uint64_t)seed[1] << 32;
		/* a ^= m; is done in loop below */
		c ^= m;
	}

	/*
	 * By using the same SipRound code for all iterations, we
	 * save space, at the expense of some branch prediction.  But
	 * branch prediction is hard because of variable length anyway.
	 */
	len = len/8 + 3;	/* Now number of rounds to perform */
	do {
		a ^= m;

		switch (--len) {
			unsigned bytes;

		default:	/* Full words */
			d ^= m = get_unaligned_le64(in);
			in += 8;
			break;
		case 2:	/* Final partial word */
			/*
			 * We'd like to do one 64-bit fetch rather than
			 * mess around with bytes, but reading past the end
			 * might hit a protection boundary.  Fortunately,
			 * we know that protection boundaries are aligned,
			 * so we can consider only three cases:
			 * - The remainder occupies zero words
			 * - The remainder fits into one word
			 * - The remainder straddles two words
			 */
			bytes = padbyte & 7;

			if (bytes == 0) {
				m = 0;
			} else {
				unsigned offset = (unsigned)(uintptr_t)in & 7;

				if (offset + bytes <= 8) {
					m = le64_to_cpup((uint64_t const *)
							 (in - offset));
					m >>= 8*offset;
				} else {
					m = get_unaligned_le64(in);
				}
				m &= ((uint64_t)1 << 8*

Re: [PATCH 1/2] dt-bindings: usb: add DT binding for s3c2410 USB device controller

2016-12-17 Thread Sergio Prado

On Tue, Dec 13, 2016 at 12:59:15PM -0600, Rob Herring wrote:
> > +Samsung S3C2410 and compatible USB device controller
> > +
> > +Required properties:
> > + - compatible: Should be one of the following
> > +  "samsung,s3c2410-udc"
> > +  "samsung,s3c2440-udc"
> > + - reg: address and length of the controller memory mapped region
> > + - interrupts: interrupt number for the USB device controller
> > + - clocks: Should reference the bus and host clocks
> > + - clock-names: Should contain two strings
> > +   "usb-bus-gadget" for the USB bus clock
> 
> Pretty sure the h/w clock name in the datasheet does not use the Linux 
> term gadget.

You are right. The datasheet calls it UCLK. In the S3C2410 clock driver
(clk-s3c2410.c), there is a clock alias to UCLK called
"usb-bus-gadget" that was used in the USB device controller's driver.
We can change the driver and the DT binding to use "uclk" to
better reflect the name used in the datasheet. What do you think?

> 
> > +   "usb-device" for the USB device clock
> > +
> > +Optional properties:
> > + - samsung,vbus-gpio: If present, specifies a gpio that needs to be
> > +   activated for the bus to be powered.
> 
> Isn't it the host side that controls Vbus?

Yes. I'll change the description to "specifies a gpio that allows
detecting whether vbus is present (USB is connected)."

> 
> > + - samsung,pullup-gpio: If present, specifies a gpio to control the
> 
> Both GPIOs need to specify the active state.

OK.

> 
> > +   USB D+ pullup.
> > +
> > +usb1: udc@5200 {
> > +   compatible = "samsung,s3c2440-udc";
> > +   reg = <0x5200 0x10>;
> > +   interrupts = <0 0 25 3>;
> > +   clocks = <&clocks UCLK>, <&clocks HCLK_USBD>;
> > +   clock-names = "usb-bus-gadget", "usb-device";
> > +   samsung,pullup-gpio = <&gpc 5 GPIO_ACTIVE_HIGH>;
> > +};
> > -- 
> > 1.9.1
> > 

Best regards,

-- 
Sergio Prado
Embedded Labworks

Re: OOM: Better, but still there on

2016-12-17 Thread Nils Holland

On Sat, Dec 17, 2016 at 01:02:03AM +0100, Michal Hocko wrote:
> On Fri 16-12-16 19:47:00, Nils Holland wrote:
> > 
> > Dec 16 18:56:24 boerne.fritz.box kernel: Purging GPU memory, 37 pages 
> > freed, 10219 pages still pinned.
> > Dec 16 18:56:29 boerne.fritz.box kernel: kthreadd invoked oom-killer: 
> > gfp_mask=0x27080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO|__GFP_NOTRACK), 
> > nodemask=0, order=1, oom_score_adj=0
> > Dec 16 18:56:29 boerne.fritz.box kernel: kthreadd cpuset=/ mems_allowed=0
> [...]
> > Dec 16 18:56:29 boerne.fritz.box kernel: Normal free:41008kB min:41100kB 
> > low:51372kB high:61644kB active_anon:0kB inactive_anon:0kB 
> > active_file:470556kB inactive_file:148kB unevictable:0kB 
> > writepending:1616kB present:897016kB managed:831480kB mlocked:0kB 
> > slab_reclaimable:213172kB slab_unreclaimable:86236kB kernel_stack:1864kB 
> > pagetables:3572kB bounce:0kB free_pcp:532kB local_pcp:456kB free_cma:0kB
> 
> this is a GFP_KERNEL allocation so it cannot use the highmem zone again.
> There is no anonymous memory in this zone but the allocation
> context implies the full reclaim context so the file LRU should be
> reclaimable. For some reason ~470MB of the active file LRU is still
> there. This is quite unexpected. It is harder to tell more without
> further data. It would be great if you could enable reclaim related
> tracepoints:
> 
> mount -t tracefs none /debug/trace
> echo 1 > /debug/trace/events/vmscan/enable
> cat /debug/trace/trace_pipe > trace.log
> 
> should help
> [...]

No problem! I enabled writing the trace data to a file and then tried
to trigger another OOM situation. That worked, this time without a
complete kernel panic, but with only my processes being killed and the
system becoming unresponsive. When that happened, I let it run for
another minute or two so that in case it was still logging something
to the trace file, it could continue to do so for some time longer. Then I
rebooted with the only thing that still worked, i.e. by means of magic
SysRq.

The trace file has actually become rather big (around 21 MB). I didn't
dare to cut anything from it because I didn't want to risk deleting
something that might turn out to be important. So, due to the size, I'm not
attaching the trace file to this message, but a compressed copy
(about 536 KB) can be grabbed at:

http://ftp.tisys.org/pub/misc/trace.log.xz

For reference, here's the OOM report that goes along with this
incident and the trace file:

Dec 17 13:31:06 boerne.fritz.box kernel: Purging GPU memory, 145 pages freed, 
10287 pages still pinned.
Dec 17 13:31:07 boerne.fritz.box kernel: awesome invoked oom-killer: 
gfp_mask=0x25000c0(GFP_KERNEL_ACCOUNT), nodemask=0, order=0, oom_score_adj=0
Dec 17 13:31:07 boerne.fritz.box kernel: awesome cpuset=/ mems_allowed=0
Dec 17 13:31:07 boerne.fritz.box kernel: CPU: 1 PID: 5599 Comm: awesome Not 
tainted 4.9.0-gentoo #3
Dec 17 13:31:07 boerne.fritz.box kernel: Hardware name: TOSHIBA Satellite 
L500/KSWAA, BIOS V1.80 10/28/2009
Dec 17 13:31:07 boerne.fritz.box kernel:  c5a37c18
Dec 17 13:31:07 boerne.fritz.box kernel:  c1433406
Dec 17 13:31:07 boerne.fritz.box kernel:  c5a37d48
Dec 17 13:31:07 boerne.fritz.box kernel:  c5319280
Dec 17 13:31:07 boerne.fritz.box kernel:  c5a37c48
Dec 17 13:31:07 boerne.fritz.box kernel:  c1170011
Dec 17 13:31:07 boerne.fritz.box kernel:  c5a37c9c
Dec 17 13:31:07 boerne.fritz.box kernel:  00200286
Dec 17 13:31:07 boerne.fritz.box kernel:  c5a37c48
Dec 17 13:31:07 boerne.fritz.box kernel:  c1438fff
Dec 17 13:31:07 boerne.fritz.box kernel:  c5a37c4c
Dec 17 13:31:07 boerne.fritz.box kernel:  c72479c0
Dec 17 13:31:07 boerne.fritz.box kernel:  c60dd200
Dec 17 13:31:07 boerne.fritz.box kernel:  c5319280
Dec 17 13:31:07 boerne.fritz.box kernel:  c1ad1899
Dec 17 13:31:07 boerne.fritz.box kernel:  c5a37d48
Dec 17 13:31:07 boerne.fritz.box kernel:  c5a37c8c
Dec 17 13:31:07 boerne.fritz.box kernel:  c1114407
Dec 17 13:31:07 boerne.fritz.box kernel:  c10513a5
Dec 17 13:31:07 boerne.fritz.box kernel:  c5a37c78
Dec 17 13:31:07 boerne.fritz.box kernel:  c11140a1
Dec 17 13:31:07 boerne.fritz.box kernel:  0005
Dec 17 13:31:07 boerne.fritz.box kernel:  
Dec 17 13:31:07 boerne.fritz.box kernel:  
Dec 17 13:31:07 boerne.fritz.box kernel: Call Trace:
Dec 17 13:31:07 boerne.fritz.box kernel:  [] dump_stack+0x47/0x61
Dec 17 13:31:07 boerne.fritz.box kernel:  [] dump_header+0x5f/0x175
Dec 17 13:31:07 boerne.fritz.box kernel:  [] ? ___ratelimit+0x7f/0xe0
Dec 17 13:31:07 boerne.fritz.box kernel:  [] 
oom_kill_process+0x207/0x3c0
Dec 17 13:31:07 boerne.fritz.box kernel:  [] ? 
has_capability_noaudit+0x15/0x20
Dec 17 13:31:07 boerne.fritz.box kernel:  [] ? 
oom_badness.part.13+0xb1/0x120
Dec 17 13:31:07 boerne.fritz.box kernel:  [] out_of_memory+0xd4/0x270
Dec 17 13:31:07 boerne.fritz.box kernel:  [] 
__alloc_pages_nodemask+0xcf5/0xd60
Dec 17 13:31:07 boerne.fritz.box kernel:  [] ? 
skb_queue_purge+0x30/0x30
Dec 17 13:31:07 boerne.fritz.box kernel:  [] 
alloc_skb_with_fr

MAC address in wl1251 NVS data (Was: Re: wl1251 NVS calibration data format)

2016-12-17 Thread Pali Rohár

> On Sat, Dec 17, 2016 at 12:14:50PM +0100, Pali Rohár wrote:
> > > [1] http://notaz.gp2x.de/misc/pnd/wl1251/nvs_map.txt
> > In that description is something about STA mac address:
> > 

019   02   //length

> > 01a   6d   //STA_ADDR_L Register Address.  (STA MAC
> > Address)
> > 01b   54   //
> > 01c   00   //STA_ADDR_L Register
> > 01d   00   //
> > 01e   32   //
> > 01f   28   //
> > 020   00   //STA_ADDR_H Register Data.

021   08   //
022   00   //
023   00   //

So... above data means:

019 - number of words
01a - low bits of offset applied with mask 0xfe
01b - high bits of offset
01c-01f first word
020-023 second word

Interpreted as: at address offset 0x536c are written two words 
0x2832 and 0x0800

wl1271 driver has in linux/drivers/net/wireless/ti/wlcore/boot.c this:

/* update current MAC address to NVS */
nvs_ptr[11] = wl->addresses[0].addr[0];
nvs_ptr[10] = wl->addresses[0].addr[1];
nvs_ptr[6] = wl->addresses[0].addr[2];
nvs_ptr[5] = wl->addresses[0].addr[3];
nvs_ptr[4] = wl->addresses[0].addr[4];
nvs_ptr[3] = wl->addresses[0].addr[5];

Looking at wl1271-nvs.bin file (which is "modified" in kernel by boot.c)

000: 01
001: 6d
002: 54
003: 00
004: 00
005: ef
006: be

Means: at address offset 0x536c is written one word 0xBEEF

007: 01
008: 71
009: 54
00a: ad
00b: de
00c: 00
00d: 00

Means: at address offset 0x5371 is written one word 0xDEAD

The boot.c kernel code above replaces that data with the MAC address, so
the four low bytes of the MAC address are written at address offset 0x536c
and the remaining two bytes at 0x5371. So 00:00:DE:AD:BE:EF

So, conclusion: the address offset for wl1271 (where the MAC address is
written) is exactly the same as for wl1251, which that documentation marks
as the STA_ADDR_L Register.

Btw, our wl1251-nvs.bin found in the Maemo rootfs, which is exactly the
same as the one in the linux-firmware.git tree, contains this data:

019: 02
01a: 6d
01b: 54
01c: 09
01d: 03
01e: 07
01f: 20
020: 00
021: 00
022: 00
023: 00

So the hardcoded MAC address in wl1251-nvs.bin is: 00:00:20:07:03:09, which
is assigned to DIAB. Strange that it is not TI...

-- 
Pali Rohár
pali.ro...@gmail.com


[PATCH] livepatch: fixup klp-convert tool integration

2016-12-17 Thread Konstantin Khlebnikov

I've found some minor problems, this patch fixes:

* save cmd_ld_ko_o into .module.cmd, if_changed_rule doesn't do that
* fix bashisms for debian where /bin/sh is a symlink to /bin/dash
* rename rule_link_module to rule_ld_ko_o, otherwise arg-check inside
  if_changed_rule compares cmd_link_module and cmd_ld_ko_o
* use HOSTLOADLIBES_$module instead of HOSTLDFLAGS: -lelf must be at the end
* check modinfo -F livepatch only if CONFIG_LIVEPATCH is true

I think "modinfo -F" could be replaced with explicit mark in makefile,
for example: LIVEPATCH_module.ko := y (like KASAN_SANITIZE_obj.o := n).

Signed-off-by: Konstantin Khlebnikov 
---
 scripts/Kbuild.include |4 +++-
 scripts/Makefile.modpost   |   24 +++-
 scripts/livepatch/Makefile |2 +-
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 179219845dfc..e299fde3423b 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -247,6 +247,8 @@ endif
 # (needed for the shell)
 make-cmd = $(call escsq,$(subst \#,\\\#,$(subst $$,$$$$,$(cmd_$(1)))))
 
+save-cmd = printf '%s\n' 'cmd_$@ := $(make-cmd)' > $(dot-target).cmd
+
 # Find any prerequisites that is newer than target or that does not exist.
 # PHONY targets skipped in both cases.
 any-prereq = $(filter-out $(PHONY),$?) $(filter-out $(PHONY) $(wildcard $^),$^)
@@ -256,7 +258,7 @@ any-prereq = $(filter-out $(PHONY),$?) $(filter-out 
$(PHONY) $(wildcard $^),$^)
 if_changed = $(if $(strip $(any-prereq) $(arg-check)),   \
@set -e; \
$(echo-cmd) $(cmd_$(1)); \
-   printf '%s\n' 'cmd_$@ := $(make-cmd)' > $(dot-target).cmd, @:)
+   $(save-cmd), @:)
 
 # Execute the command and also postprocess generated .d dependencies file.
 if_changed_dep = $(if $(strip $(any-prereq) $(arg-check) ),  \
diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost
index 916dd347e8f6..5d149d0b05c2 100644
--- a/scripts/Makefile.modpost
+++ b/scripts/Makefile.modpost
@@ -123,24 +123,22 @@ quiet_cmd_ld_ko_o = LD [M]  $@
$(LD) -r $(LDFLAGS) \
  $(KBUILD_LDFLAGS_MODULE) $(LDFLAGS_MODULE) \
  -o $@ $(filter-out FORCE,$^) ; \
-   $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true) ;
+   $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
-ifdef CONFIG_LIVEPATCH
 KLP_CONVERT = scripts/livepatch/klp-convert
-  cmd_klp_convert =
\
-   if [[ -n "`modinfo -F livepatch $@`" ]]; then   \
-   mv $@ $(@:.ko=.klp.o);  \
-   $(KLP_CONVERT) $(@:.ko=.klp.o) $@;  \
-   fi ;
-endif
-
-define rule_link_module
-   $(call echo-cmd,ld_ko_o) $(cmd_ld_ko_o) \
-   $(cmd_klp_convert)
+quiet_cmd_klp_convert = LIVEPATCH $@
+  cmd_klp_convert = mv $@ $(@:.ko=.klp.o); $(KLP_CONVERT) $(@:.ko=.klp.o) 
$@
+
+define rule_ld_ko_o
+   $(call echo-cmd,ld_ko_o) $(cmd_ld_ko_o) ;\
+   $(call save-cmd,ld_ko_o) ;   \
+   $(if $(CONFIG_LIVEPATCH),\
+ if [ -n "`modinfo -F livepatch $@`" ] ; then   \
+   $(call echo-cmd,klp_convert) $(cmd_klp_convert) ; fi)
 endef
 
 $(modules): %.ko :%.o %.mod.o FORCE
-   +$(call if_changed_rule,link_module)
+   +$(call if_changed_rule,ld_ko_o)
 
 targets += $(modules)
 
diff --git a/scripts/livepatch/Makefile b/scripts/livepatch/Makefile
index 221829bb34c7..bd5c1ae553ab 100644
--- a/scripts/livepatch/Makefile
+++ b/scripts/livepatch/Makefile
@@ -4,4 +4,4 @@ always  := $(hostprogs-y)
 klp-convert-objs   := klp-convert.o elf.o
 
 HOSTCFLAGS := -g -I$(INSTALL_HDR_PATH)/include -Wall
-HOSTLDFLAGS:= -lelf
+HOSTLOADLIBES_klp-convert  := -lelf

RE: [RFC 00/10] implement alternative and much simpler id allocator

2016-12-17 Thread Matthew Wilcox

From: Matthew Wilcox
> From: Rasmus Villemoes [mailto:li...@rasmusvillemoes.dk]
> > This sounds good. I think there may still be a lot of users that never
> > allocate more than a handful of IDAs, making a 128 byte allocation still
> > somewhat excessive. One thing I considered was (exactly as it's done for
> > file descriptor tables) to embed a single word in the struct ida and
> > use that initially; I haven't looked closely at newIDA, so I don't know
> > how easy that would be or if its worth the complexity.
> 
> Heh, I was thinking about that too.  The radix tree supports "exceptional
> entries" which have the bottom bit set.  On a 64-bit machine, we could use 62
> of the bits in the radix tree root to store the ID bitmap.  I'm a little wary 
> of the
> potential complexity, but we should try it out.

Test patch here: 
http://git.infradead.org/users/willy/linux-dax.git/shortlog/refs/heads/idr-2016-12-16
It passes the test suite ... which I actually had to adjust because it now 
succeeds in cases where it hadn't (allocating ID 0 without preallocating), and 
it will now fail in cases where it hadn't previously (assuming a single 
preallocation would be enough).  There shouldn't be any examples of that in the 
kernel proper; it was simply me being lazy when I wrote the test suite.
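
As a rough illustration of the embedded-bitmap idea, the sketch below keeps up to 62 IDs in a single 64-bit word and reserves the two low bits as a tag. The tag value, the two-bit reservation and the function names are assumptions for this example only; the test patch linked above is the authoritative version.

```c
#include <stdint.h>

#define TAG_BITS  2			/* hypothetical: low bits reserved as a tag */
#define EXC_TAG   UINT64_C(2)		/* hypothetical "this word is a bitmap" marker */
#define SMALL_IDS (64 - TAG_BITS)	/* 62 IDs fit in the word */

/* Allocate the lowest free ID from the embedded bitmap, or -1 if it is full. */
static int small_ida_alloc(uint64_t *root)
{
	int id;

	for (id = 0; id < SMALL_IDS; id++) {
		uint64_t bit = UINT64_C(1) << (id + TAG_BITS);

		if (!(*root & bit)) {
			*root |= bit | EXC_TAG;
			return id;
		}
	}
	return -1;	/* a real allocator would switch to a full bitmap here */
}

/* Release a previously allocated ID. */
static void small_ida_free(uint64_t *root, int id)
{
	*root &= ~(UINT64_C(1) << (id + TAG_BITS));
}
```

Users that only ever allocate a handful of IDs would never leave this path, which is the case the embedded word is meant to make cheap.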

Re: [PATCH v2 04/11] locking/ww_mutex: Set use_ww_ctx even when locking without a context

2016-12-17 Thread Peter Zijlstra

On Fri, Dec 16, 2016 at 02:17:25PM +0100, Nicolai Hähnle wrote:
> On 06.12.2016 16:25, Peter Zijlstra wrote:
> >On Thu, Dec 01, 2016 at 03:06:47PM +0100, Nicolai Hähnle wrote:
> >
> >>@@ -640,10 +640,11 @@ __mutex_lock_common(struct mutex *lock, long state, 
> >>unsigned int subclass,
> >>struct mutex_waiter waiter;
> >>unsigned long flags;
> >>bool first = false;
> >>-   struct ww_mutex *ww;
> >>int ret;
> >>
> >>-   if (use_ww_ctx) {
> >>+   if (use_ww_ctx && ww_ctx) {
> >>+   struct ww_mutex *ww;
> >>+
> >>ww = container_of(lock, struct ww_mutex, base);
> >>if (unlikely(ww_ctx == READ_ONCE(ww->ctx)))
> >>return -EALREADY;
> >
> >So I don't see the point of removing *ww from the function scope, we can
> >still compute that container_of() even if !ww_ctx, right? That would
> >save a ton of churn below, adding all those struct ww_mutex declarations
> >and container_of() casts.
> >
> >(and note that the container_of() is a fancy NO-OP because base is the
> >first member).
> 
> Sorry for taking so long to get back to you.
> 
> In my experience, the undefined behavior sanitizer in GCC for userspace
> programs complains about merely casting a pointer to the wrong type. I never
> went into the standards rabbit hole to figure out the details. It might be a
> C++ only thing (ubsan cannot tell the difference otherwise anyway), but that
> was the reason for doing the change in this more complicated way.

Note that C only has what C++ calls reinterpret_cast<>(). It cannot
complain about a 'wrong' cast, there is no such thing.

Also, container_of() works, irrespective of what C language says about
it -- note that the kernel in general hard relies on a lot of things C
calls undefined behaviour.

> Are you sure that this is defined behavior in C? If so, I'd be happy to go
> with the version that has less churn.

It should very much work with kernel C.
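
To illustrate the "fancy NO-OP" remark, here is a small userspace sketch. The container_of() below is a simplified stand-in for the kernel macro (no type checking), and the two structs are made-up miniatures, not the real mutex types.

```c
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical miniature of the mutex / ww_mutex relationship. */
struct mutex_like {
	int owner;
};

struct ww_mutex_like {
	struct mutex_like base;	/* first member, so its offset is 0 */
	void *ctx;
};
```

Because base sits at offset 0, container_of(&ww.base, struct ww_mutex_like, base) yields the same address as &ww.base; only the pointer type changes.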

Re: [PATCH -v4 00/10] FUTEX_UNLOCK_PI wobbles

2016-12-17 Thread Peter Zijlstra

On Fri, Dec 16, 2016 at 03:31:40PM -0800, Darren Hart wrote:
> On Tue, Dec 13, 2016 at 09:36:38AM +0100, Peter Zijlstra wrote:
> > That way, when we drop hb->lock to wait, futex and rt_mutex wait state is
> > consistent.
> > 
> > 
> > In any case, it passes our inadequate testing.
> 
> It passed my CI tools/testing/selftests/futex/functional/run.sh. Did you also
> happen to run a fuzz tester?

I did not. I'm not sure how good trinity is at poking holes in futexes.
I would love a domain specific fuzzer for futex, but I suspect it would
end up being me writing it :-(

Re: [PATCH -v4 02/10] futex: Add missing error handling to FUTEX_REQUEUE_PI

2016-12-17 Thread Peter Zijlstra

On Fri, Dec 16, 2016 at 04:06:39PM -0800, Darren Hart wrote:
> On Tue, Dec 13, 2016 at 09:36:40AM +0100, Peter Zijlstra wrote:
> > Thomas spotted that fixup_pi_state_owner() can return errors and we
> > fail to unlock the rt_mutex in that case.
> > 
> 
> We handled this explicitly before Patch 1/10, so can this be rolled into 1/10
> (er 9) as a single commit?

I don't think we did, see how this branch doesn't set pi_mutex.

Re: [alsa-devel] [PATCH v6 1/3] clk: x86: Add Atom PMC platform clocks

2016-12-17 Thread Andy Shevchenko

On Sat, Dec 17, 2016 at 3:33 AM, Stephen Boyd  wrote:
> On 12/15, Pierre-Louis Bossart wrote:

>>Clients use devm_clk_get() with a "pmc_plt_clk_"
>> argument.
>
> This is the problem. Clients should be calling clk_get() like:
>
> clk_get(dev, "signal name in datasheet")
>
> where the first argument is the device and the second argument is
> some string that is meaningful to the device, not the system as a
> whole. The way clkdev is intended is so that the dev argument's
> dev_name() is combined with the con_id that matches some signal
> name in the datasheet. This way when the same IP is put into some
> other chip, the globally unique name doesn't need to change, just
> the device name that's registered with the lookup. Obviously this
> breaks down quite badly when dev_name() isn't stable. Is that
> happening here?

PMC Atom is a PCI device and thus each platform would have a different
dev_name(). Do you want every consumer that wants to work on all of them
to list all of those names, or did I miss something?

So the question is what the clock lookup should look like so that it
works on both CherryTrail and BayTrail.

-- 
With Best Regards,
Andy Shevchenko

Re: [PATCH] rbtree: use designated initializers

2016-12-17 Thread Peter Zijlstra

On Fri, Dec 16, 2016 at 05:02:53PM -0800, Kees Cook wrote:
> Prepare to mark sensitive kernel structures for randomization by making
> sure they're using designated initializers. These were identified during
> allyesconfig builds of x86, arm, and arm64, with most initializer fixes
> extracted from grsecurity.

Works for me.

Acked-by: Peter Zijlstra (Intel) 

One note on these structures, the intent is that GCC value propagation
completely does away with everything and results in inlining the actual
functions. Older versions of GCC had a wee bit of trouble with this, but
recent versions do just that, not a single actual structure should end
up being emitted in the object code.
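
For reference, a minimal standalone sketch of why designated initializers matter once member order may change. The struct and function names are invented for this example; only the .propagate/.copy/.rotate member names mirror the patch quoted in this message.

```c
/* A callback table loosely modelled on rb_augment_callbacks. */
struct callbacks {
	int (*propagate)(int);
	int (*copy)(int);
	int (*rotate)(int);
};

static int do_propagate(int x) { return x + 1; }
static int do_copy(int x)      { return x + 2; }
static int do_rotate(int x)    { return x + 3; }

/* Positional form: silently depends on the member order of the struct. */
static const struct callbacks positional = {
	do_propagate, do_copy, do_rotate
};

/* Designated form: stays correct even if the members are reordered. */
static const struct callbacks designated = {
	.rotate = do_rotate,
	.propagate = do_propagate,
	.copy = do_copy,
};
```

If the members of struct callbacks were shuffled, the positional table would end up with the wrong function in each slot, while the designated one would not change meaning.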

> Signed-off-by: Kees Cook 
> ---
>  include/linux/rbtree_augmented.h | 4 +++-
>  lib/rbtree.c | 4 +++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/rbtree_augmented.h 
> b/include/linux/rbtree_augmented.h
> index d076183e49be..9702b6e183bc 100644
> --- a/include/linux/rbtree_augmented.h
> +++ b/include/linux/rbtree_augmented.h
> @@ -90,7 +90,9 @@ rbname ## _rotate(struct rb_node *rb_old, struct rb_node 
> *rb_new)   \
>   old->rbaugmented = rbcompute(old);  \
>  }\
>  rbstatic const struct rb_augment_callbacks rbname = {
> \
> - rbname ## _propagate, rbname ## _copy, rbname ## _rotate\
> + .propagate = rbname ## _propagate,  \
> + .copy = rbname ## _copy,\
> + .rotate = rbname ## _rotate \
>  };
>  
>  
> diff --git a/lib/rbtree.c b/lib/rbtree.c
> index 1f8b112a7c35..4ba2828a67c0 100644
> --- a/lib/rbtree.c
> +++ b/lib/rbtree.c
> @@ -427,7 +427,9 @@ static inline void dummy_copy(struct rb_node *old, struct 
> rb_node *new) {}
>  static inline void dummy_rotate(struct rb_node *old, struct rb_node *new) {}
>  
>  static const struct rb_augment_callbacks dummy_callbacks = {
> - dummy_propagate, dummy_copy, dummy_rotate
> + .propagate = dummy_propagate,
> + .copy = dummy_copy,
> + .rotate = dummy_rotate
>  };
>  
>  void rb_insert_color(struct rb_node *node, struct rb_root *root)
> -- 
> 2.7.4
> 
> 
> -- 
> Kees Cook
> Nexus Security

[PATCH v3 0/7] Runtime PM for Thunderbolt on Macs

2016-12-17 Thread Lukas Wunner

Power down Thunderbolt controllers on Macs when nothing is plugged in
to save around 2W per controller.

Apple provides an ACPI-based (but nonstandard) mechanism to cut power
and signal hotplug during powerdown.  The usual way to implement such
nonstandard mechanisms seems to be a struct dev_pm_domain.
E.g. vga_switcheroo uses that for Optimus GPUs which control power
with ACPI DSMs.  Hence this third iteration of the series uses that
as well.  In v2 a more complicated approach was employed wherein power
control was exerted by a PCIe port service driver instead.

All the prep work went into 4.9 and 4.10, shrinking this series to just
7 patches:

- The actual "meat" of the series (to borrow a term from Bjorn) is in
  patches [6/7] and [7/7].  These two need an ack from Andreas.

- Patches [1/7] to [3/7] need an ack from Bjorn (and possibly Rafael or
  Mika).  They're fairly small and just add a bit to struct pci_dev
  signifying that a device is part of a Thunderbolt daisy chain, then
  use that bit to modify runtime PM for PCIe ports.  I'm also cc'ing
  Tomas and Amir at Intel Israel, if you guys have comments please shout.

- Patches [4/7] and [5/7] need an ack from Rafael.  Their sole purpose
  is to avoid a gratuitous WARN splat when assigning the struct
  dev_pm_domain.

I've pushed the patches to GitHub to ease reviewing/fetching:
https://github.com/l1k/linux/commits/thunderbolt_runpm_v3

Link to the previous iteration (v2, May 2016):
http://www.spinics.net/lists/linux-pci/msg51158.html

Thanks,

Lukas


Lukas Wunner (7):
  PCI: Recognize Thunderbolt devices
  PCI: Allow runtime PM on Thunderbolt ports
  PCI: Don't block runtime PM for Thunderbolt host hotplug ports
  Revert "PM / Runtime: Remove the exported function
pm_children_suspended()"
  PM: Make requirements of dev_pm_domain_set() more precise
  thunderbolt: Power down controller when idle
  thunderbolt: Runtime suspend NHI when idle

 drivers/base/power/common.c  |  15 +-
 drivers/base/power/runtime.c |   3 +-
 drivers/pci/pci.c|  20 ++-
 drivers/pci/pci.h|   2 +
 drivers/pci/probe.c  |  34 +
 drivers/thunderbolt/Kconfig  |   3 +-
 drivers/thunderbolt/Makefile |   4 +-
 drivers/thunderbolt/nhi.c|   5 +
 drivers/thunderbolt/power.c  | 356 +++
 drivers/thunderbolt/power.h  |  37 +
 drivers/thunderbolt/switch.c |   9 ++
 drivers/thunderbolt/tb.c |  13 ++
 drivers/thunderbolt/tb.h |   2 +
 include/linux/pci.h  |   1 +
 include/linux/pm_runtime.h   |   7 +
 15 files changed, 500 insertions(+), 11 deletions(-)
 create mode 100644 drivers/thunderbolt/power.c
 create mode 100644 drivers/thunderbolt/power.h

-- 
2.10.2

[PATCH v3 2/7] PCI: Allow runtime PM on Thunderbolt ports

2016-12-17 Thread Lukas Wunner

Currently PCIe ports are only allowed to go to D3 if the BIOS is dated
2015 or newer to avoid potential issues with old chipsets.  However for
Thunderbolt we know that even the oldest controller, Light Ridge (2010),
is able to suspend its ports to D3 just fine.

We're about to add runtime PM for Thunderbolt on the Mac.  Apple has
released two EFI security updates in 2015 which encompass all machines
with Thunderbolt, but the achieved power saving should be made available
to users even if they haven't updated their BIOS.  To this end,
special-case Thunderbolt in pci_bridge_d3_possible().

This allows the Thunderbolt controller to power down but the root port
to which the Thunderbolt controller is attached remains in D0 unless
the EFI update is installed.  Users can pass pcie_port_pm=force on the
kernel command line if they cannot install the EFI update but still want
to benefit from the additional power saving of putting the root port
into D3.  In practice, root ports can be suspended to D3 without issues
at least on 2012 Ivy Bridge machines.

If the BIOS cut-off date is ever lowered to 2010, the Thunderbolt
special case can be removed.
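
The resulting decision can be modelled in userspace roughly as follows. This is a simplified sketch, not the kernel function: struct bridge_model, bridge_d3_possible() and the bios_year and force parameters are stand-ins invented for illustration, with force modelling pcie_port_pm=force.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the one struct pci_dev field that matters here. */
struct bridge_model {
	bool is_thunderbolt;	/* set when the device belongs to a Thunderbolt controller */
};

/* Model of the checks: forced, recent BIOS, or Thunderbolt. */
static bool bridge_d3_possible(const struct bridge_model *bridge,
			       int bios_year, bool force)
{
	if (force)			/* models pcie_port_pm=force */
		return true;
	if (bios_year >= 2015)		/* existing BIOS date cut-off */
		return true;
	if (bridge->is_thunderbolt)	/* special case added by this patch */
		return true;
	return false;
}
```

With a 2012 BIOS, a Thunderbolt port passes while an ordinary root port does not, unless force is set, which matches the behaviour described above.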

Cc: Mika Westerberg 
Cc: Rafael J. Wysocki 
Cc: Andreas Noever 
Cc: Tomas Winkler 
Cc: Amir Levy 
Signed-off-by: Lukas Wunner 
---
 drivers/pci/pci.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a881c0d..8ed098d 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2224,7 +2224,7 @@ void pci_config_pm_runtime_put(struct pci_dev *pdev)
  * @bridge: Bridge to check
  *
  * This function checks if it is possible to move the bridge to D3.
- * Currently we only allow D3 for recent enough PCIe ports.
+ * Currently we only allow D3 for recent enough PCIe ports and Thunderbolt.
  */
 bool pci_bridge_d3_possible(struct pci_dev *bridge)
 {
@@ -2258,6 +2258,11 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge)
year >= 2015) {
return true;
}
+
+   /* Even the oldest 2010 Thunderbolt controller supports D3. */
+   if (bridge->is_thunderbolt)
+   return true;
+
break;
}
 
-- 
2.10.2

[PATCH v3 3/7] PCI: Don't block runtime PM for Thunderbolt host hotplug ports

2016-12-17 Thread Lukas Wunner

Hotplug ports generally block their parents from suspending to D3hot as
otherwise their interrupts couldn't be delivered.

An exception are Thunderbolt host controllers:  They have a separate
GPIO pin to side-band signal plug events even if the controller is
powered down or its parent ports are suspended to D3.  They can be told
apart from Thunderbolt controllers in attached devices by checking if
they're situated below a non-Thunderbolt device (typically a root port,
or the downstream port of a PCIe switch in the case of the MacPro6,1).

To enable runtime PM for Thunderbolt on the Mac, the downstream bridges
of a host controller must not block runtime PM on the upstream bridge as
power to the chip is only cut once the upstream bridge has suspended.
Amend the condition in pci_dev_check_d3cold() accordingly.
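
The amended condition can be exercised in isolation with a small model. struct dev_model and hotplug_blocks_d3() are hypothetical names; the upstream pointer plays the role of pci_upstream_bridge().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant struct pci_dev fields. */
struct dev_model {
	struct dev_model *upstream;	/* parent bridge, NULL at the top */
	bool is_hotplug_bridge;
	bool is_thunderbolt;
};

/* True if this hotplug port should keep its parents out of D3cold. */
static bool hotplug_blocks_d3(const struct dev_model *dev)
{
	const struct dev_model *parent, *grandparent;

	if (!dev->is_hotplug_bridge)
		return false;

	parent = dev->upstream;
	grandparent = parent ? parent->upstream : NULL;

	/* Host controller port: Thunderbolt with a non-Thunderbolt grandparent. */
	if (dev->is_thunderbolt && grandparent && !grandparent->is_thunderbolt)
		return false;

	return true;
}
```

A downstream port of the host controller sits two levels below a non-Thunderbolt root port and is exempt; the same kind of port inside a daisy-chained device has a Thunderbolt grandparent and still blocks.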

Cc: Mika Westerberg 
Cc: Rafael J. Wysocki 
Cc: Andreas Noever 
Cc: Tomas Winkler 
Cc: Amir Levy 
Signed-off-by: Lukas Wunner 
---
 drivers/pci/pci.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8ed098d..0b03fe7 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2271,6 +2271,7 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge)
 
 static int pci_dev_check_d3cold(struct pci_dev *dev, void *data)
 {
+   struct pci_dev *parent, *grandparent;
bool *d3cold_ok = data;
 
if (/* The device needs to be allowed to go D3cold ... */
@@ -2284,7 +2285,17 @@ static int pci_dev_check_d3cold(struct pci_dev *dev, 
void *data)
!pci_power_manageable(dev) ||
 
/* Hotplug interrupts cannot be delivered if the link is down. */
-   dev->is_hotplug_bridge)
+   (dev->is_hotplug_bridge &&
+
+   /*
+* Exception:  Thunderbolt host controllers have a pin to
+* side-band signal plug events.  Their hotplug ports are
+* recognizable by having a non-Thunderbolt device as
+* grandparent.
+*/
+   !(dev->is_thunderbolt && (parent = pci_upstream_bridge(dev)) &&
+(grandparent = pci_upstream_bridge(parent)) &&
+   !grandparent->is_thunderbolt)))
 
*d3cold_ok = false;
 
-- 
2.10.2

[PATCH v3 1/7] PCI: Recognize Thunderbolt devices

2016-12-17 Thread Lukas Wunner

We're about to allow runtime PM on Thunderbolt ports in
pci_bridge_d3_possible() and unblock runtime PM for Thunderbolt host
hotplug ports in pci_dev_check_d3cold().  In both cases we need to
uniquely identify if a PCI device belongs to a Thunderbolt controller.

We also have the need to detect presence of a Thunderbolt controller in
drivers/platform/x86/apple-gmux.c because dual GPU MacBook Pros cannot
switch external DP/HDMI ports between GPUs if they have Thunderbolt.

Furthermore, in multiple places in the DRM subsystem we need to detect
whether a GPU is on-board or attached with Thunderbolt.  As an example,
Thunderbolt-attached GPUs shall not be registered with vga_switcheroo.

Intel uses a Vendor-Specific Extended Capability (VSEC) with ID 0x1234
on devices belonging to a Thunderbolt controller which allows us to
recognize them.

Detect presence of this VSEC on device probe and cache it in a newly
added is_thunderbolt bit in struct pci_dev which can then be queried by
pci_bridge_d3_possible(), pci_dev_check_d3cold(), apple-gmux and others.
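
The detection logic can be sketched in userspace like this. struct tb_dev_model and its vsec_ids array are invented for the example (the kernel walks the extended capability list in config space instead); 0x8086 is Intel's PCI vendor ID and 0x1234 is the VSEC ID named above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_INTEL 0x8086	/* PCI_VENDOR_ID_INTEL */
#define VSEC_ID_TBT  0x1234	/* PCI_VSEC_ID_INTEL_TBT in the patch */

/* Hypothetical stand-in for struct pci_dev. */
struct tb_dev_model {
	struct tb_dev_model *upstream;	/* parent bridge, NULL at the top */
	uint16_t vendor;
	const uint16_t *vsec_ids;	/* IDs of the device's vendor-specific caps */
	size_t n_vsec;
	bool is_thunderbolt;
};

static void detect_thunderbolt(struct tb_dev_model *dev)
{
	const struct tb_dev_model *parent;
	size_t i;

	/* Is the device itself part of a Thunderbolt controller? */
	for (i = 0; i < dev->n_vsec; i++)
		if (dev->vendor == VENDOR_INTEL &&
		    dev->vsec_ids[i] == VSEC_ID_TBT)
			dev->is_thunderbolt = true;

	/* Otherwise, is any bridge above it part of one? */
	for (parent = dev->upstream; parent && !dev->is_thunderbolt;
	     parent = parent->upstream)
		if (parent->is_thunderbolt)
			dev->is_thunderbolt = true;
}
```

As in the patch, parents have to be probed before their children for the upward walk to see their is_thunderbolt bit.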

Cc: Andreas Noever 
Cc: Tomas Winkler 
Cc: Amir Levy 
Signed-off-by: Lukas Wunner 
---
 drivers/pci/pci.h   |  2 ++
 drivers/pci/probe.c | 34 ++
 include/linux/pci.h |  1 +
 3 files changed, 37 insertions(+)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index cb17db2..45c2b81 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -3,6 +3,8 @@
 
 #define PCI_FIND_CAP_TTL   48
 
+#define PCI_VSEC_ID_INTEL_TBT  0x1234  /* Thunderbolt */
+
 extern const unsigned char pcie_link_speed[];
 
 bool pcie_cap_has_lnkctl(const struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index e164b5c..891a8fa 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1206,6 +1206,37 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev)
pdev->is_hotplug_bridge = 1;
 }
 
+static void set_pcie_vendor_specific(struct pci_dev *dev)
+{
+   int vsec = 0;
+   u32 header;
+
+   while ((vsec = pci_find_next_ext_capability(dev, vsec,
+   PCI_EXT_CAP_ID_VNDR))) {
+   pci_read_config_dword(dev, vsec + PCI_VNDR_HEADER, &header);
+
+   /* Is the device part of a Thunderbolt controller? */
+   if (dev->vendor == PCI_VENDOR_ID_INTEL &&
+   PCI_VNDR_HEADER_ID(header) == PCI_VSEC_ID_INTEL_TBT)
+   dev->is_thunderbolt = 1;
+   }
+
+   /*
+* Is the device attached with Thunderbolt?  Walk upwards and check for
+* each encountered bridge if it's part of a Thunderbolt controller.
+* Reaching the host bridge means dev is soldered to the mainboard.
+*/
+   if (!dev->is_thunderbolt) {
+   struct pci_dev *parent = dev;
+
+   while ((parent = pci_upstream_bridge(parent)))
+   if (parent->is_thunderbolt) {
+   dev->is_thunderbolt = 1;
+   break;
+   }
+   }
+}
+
 /**
  * pci_ext_cfg_is_aliased - is ext config space just an alias of std config?
  * @dev: PCI device
@@ -1358,6 +1389,9 @@ int pci_setup_device(struct pci_dev *dev)
/* need to have dev->class ready */
dev->cfg_size = pci_cfg_space_size(dev);
 
+   /* need to have dev->cfg_size ready */
+   set_pcie_vendor_specific(dev);
+
/* "Unknown power state" */
dev->current_state = PCI_UNKNOWN;
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e2d1a12..3c775e8 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -358,6 +358,7 @@ struct pci_dev {
unsigned intis_virtfn:1;
unsigned intreset_fn:1;
unsigned intis_hotplug_bridge:1;
+   unsigned intis_thunderbolt:1; /* part of Thunderbolt daisy chain */
unsigned int__aer_firmware_first_valid:1;
unsigned int__aer_firmware_first:1;
unsigned intbroken_intx_masking:1;
-- 
2.10.2
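
As a rough userspace illustration of the detection logic in set_pcie_vendor_specific() above: the sketch below replaces config-space access with a plain array of VSEC IDs. struct toy_pci_dev, its fields and the TOY_* names are assumptions made for the example. Only the vendor ID 0x8086 (Intel) and the VSEC ID 0x1234 come from the PCI ID registry and the patch respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_VENDOR_INTEL	0x8086
#define TOY_VSEC_ID_TBT		0x1234	/* value taken from the patch */

/* Hypothetical model of a PCI device (assumption, not struct pci_dev). */
struct toy_pci_dev {
	struct toy_pci_dev *upstream;	/* upstream bridge, NULL at host bridge */
	uint16_t vendor;
	const uint16_t *vsec_ids;	/* IDs of the VSECs the device exposes */
	size_t num_vsecs;
	bool is_thunderbolt;
};

static void toy_set_vendor_specific(struct toy_pci_dev *dev)
{
	const struct toy_pci_dev *parent;
	size_t i;

	assert(dev != NULL);

	/* Is the device part of a Thunderbolt controller? */
	for (i = 0; i < dev->num_vsecs; i++)
		if (dev->vendor == TOY_VENDOR_INTEL &&
		    dev->vsec_ids[i] == TOY_VSEC_ID_TBT)
			dev->is_thunderbolt = true;

	if (dev->is_thunderbolt)
		return;

	/* Is the device attached with Thunderbolt?  Walk upwards. */
	for (parent = dev->upstream; parent; parent = parent->upstream)
		if (parent->is_thunderbolt) {
			dev->is_thunderbolt = true;
			break;
		}
}
```

The sketch relies on bridges being probed before the devices below them, which is how the upward walk in the patch can find an already cached bit on a parent.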

[PATCH v3 6/7] thunderbolt: Power down controller when idle

2016-12-17 Thread Lukas Wunner

Document and implement Apple's ACPI-based (but nonstandard) pm mechanism
for Thunderbolt.  Briefly, an ACPI method provided by Apple is used to
cut power to the controller.  A GPE is enabled while the controller is
powered down which sideband-signals a plug event, whereupon we reinstate
power using the ACPI method.

This saves 1.7 W on machines with a Light Ridge controller and is
reported to save 4 W on Cactus Ridge 4C and Falcon Ridge 4C.  (I believe
4 W includes the bus power drawn by Apple's Gigabit Ethernet adapter.)
It fixes (at least partially) a power regression introduced in 3.17 by
commit 7bc5a2bad0b8 ("ACPI: Support _OSI("Darwin") correctly").

A Thunderbolt controller appears to the OS as a set of virtual devices:
One upstream bridge, multiple downstream bridges and one NHI (Native
Host Interface).  The upstream and downstream bridges represent a PCIe
switch (see definition of a switch in the PCIe spec).  The NHI device is
used to manage the switch fabric.  Hotplugged devices appear behind the
downstream bridges:

  (Root Port) ---- Upstream Bridge --+-- Downstream Bridge 0 ---- NHI
                                     +-- Downstream Bridge 1 --
                                     +-- Downstream Bridge 2 --
                                     ...

Power is cut to the entire set of devices.  The Linux pm model is
hierarchical and assumes that a child cannot resume before its parent.
To conform to this model, power control must be governed by the
Thunderbolt controller's topmost device, which is the upstream bridge.
The NHI and downstream bridges go to D3hot independently and the
upstream bridge goes to D3cold once all its children have suspended.
This commit only adds runtime pm for the upstream bridge.  Runtime pm
for the NHI is added in a separate commit to signify its independence.
Runtime pm for the downstream bridges is handled by the pcieport driver.

Because Apple's ACPI methods are nonstandard, a struct dev_pm_domain is
used to override the PCI bus pm_ops.  The thunderbolt driver binds to
the NHI, thus the dev_pm_domain is assigned to the upstream bridge when
its grandchild ->probes and evicted when it ->removes.

There are no Thunderbolt specs publicly available from Intel or Apple,
so I've included documentation to the extent that I was able to reverse-
engineer things.  Documentation on the Go2Sx and Ok2Go2Sx pins is
tentative as those are missing on my Light Ridge.  Apple only uses them
on Cactus Ridge 4C.  Someone with such a controller needs to find out
through experimentation if the documentation is accurate and amend it if
necessary.

To maximize power saving, the controller utilizes the PM core's direct-
complete procedure, i.e. it stays suspended during the system sleep
process.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=92111
Cc: Andreas Noever 
Signed-off-by: Lukas Wunner 
---
 drivers/thunderbolt/Kconfig  |   3 +-
 drivers/thunderbolt/Makefile |   4 +-
 drivers/thunderbolt/nhi.c|   3 +
 drivers/thunderbolt/power.c  | 347 +++
 drivers/thunderbolt/power.h  |  37 +
 drivers/thunderbolt/tb.h |   2 +
 6 files changed, 393 insertions(+), 3 deletions(-)
 create mode 100644 drivers/thunderbolt/power.c
 create mode 100644 drivers/thunderbolt/power.h

diff --git a/drivers/thunderbolt/Kconfig b/drivers/thunderbolt/Kconfig
index d35db16..41625cf 100644
--- a/drivers/thunderbolt/Kconfig
+++ b/drivers/thunderbolt/Kconfig
@@ -1,9 +1,10 @@
 menuconfig THUNDERBOLT
tristate "Thunderbolt support for Apple devices"
-   depends on PCI
+   depends on PCI && ACPI
depends on X86 || COMPILE_TEST
select APPLE_PROPERTIES if EFI_STUB && X86
select CRC32
+   select PM
help
  Cactus Ridge Thunderbolt Controller driver
  This driver is required if you want to hotplug Thunderbolt devices on
diff --git a/drivers/thunderbolt/Makefile b/drivers/thunderbolt/Makefile
index 5d1053c..b220825 100644
--- a/drivers/thunderbolt/Makefile
+++ b/drivers/thunderbolt/Makefile
@@ -1,3 +1,3 @@
 obj-${CONFIG_THUNDERBOLT} := thunderbolt.o
-thunderbolt-objs := nhi.o ctl.o tb.o switch.o cap.o path.o tunnel_pci.o eeprom.o
-
+thunderbolt-objs := nhi.o ctl.o tb.o switch.o cap.o path.o tunnel_pci.o \
+   eeprom.o power.o
diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
index a8c2041..88fb2fb 100644
--- a/drivers/thunderbolt/nhi.c
+++ b/drivers/thunderbolt/nhi.c
@@ -605,6 +605,8 @@ static int nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
}
pci_set_drvdata(pdev, tb);
 
+   thunderbolt_power_init(tb);
+
return 0;
 }
 
@@ -612,6 +614,7 @@ static void nhi_remove(struct pci_dev *pdev)
 {
struct tb *tb = pci_get_drvdata(pdev);
struct tb_nhi *nhi = tb->nhi;
+   thunderbolt_power_fini(tb);
thunderbolt_shutdown_and_free(tb);
nhi_shutdown(nhi);
 }
diff --git a/drivers/thunderbolt/power.c b/drive

[PATCH v3 4/7] Revert "PM / Runtime: Remove the exported function pm_children_suspended()"

2016-12-17 Thread Lukas Wunner

This reverts commit 62006c1702b3b1be0c0726949e0ee0ea2326be9c which
removed pm_children_suspended() because it had only a single caller.
We're about to add a second caller, so establish the status quo ante.

Cc: Ulf Hansson 
Cc: Rafael J. Wysocki 
Signed-off-by: Lukas Wunner 
---
 drivers/base/power/runtime.c | 3 +--
 include/linux/pm_runtime.h   | 7 +++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 872eac4..03293c3 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -243,8 +243,7 @@ static int rpm_check_suspend_allowed(struct device *dev)
retval = -EACCES;
else if (atomic_read(&dev->power.usage_count) > 0)
retval = -EAGAIN;
-   else if (!dev->power.ignore_children &&
-   atomic_read(&dev->power.child_count))
+   else if (!pm_children_suspended(dev))
retval = -EBUSY;
 
/* Pending resume requests take precedence over suspends. */
diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
index ca4823e..7de2aa5c 100644
--- a/include/linux/pm_runtime.h
+++ b/include/linux/pm_runtime.h
@@ -66,6 +66,12 @@ static inline void pm_suspend_ignore_children(struct device *dev, bool enable)
dev->power.ignore_children = enable;
 }
 
+static inline bool pm_children_suspended(struct device *dev)
+{
+   return dev->power.ignore_children
+   || !atomic_read(&dev->power.child_count);
+}
+
 static inline void pm_runtime_get_noresume(struct device *dev)
 {
atomic_inc(&dev->power.usage_count);
@@ -161,6 +167,7 @@ static inline void pm_runtime_allow(struct device *dev) {}
 static inline void pm_runtime_forbid(struct device *dev) {}
 
 static inline void pm_suspend_ignore_children(struct device *dev, bool enable) {}
+static inline bool pm_children_suspended(struct device *dev) { return false; }
 static inline void pm_runtime_get_noresume(struct device *dev) {}
 static inline void pm_runtime_put_noidle(struct device *dev) {}
 static inline bool device_run_wake(struct device *dev) { return false; }
-- 
2.10.2

[PATCH v3 5/7] PM: Make requirements of dev_pm_domain_set() more precise

2016-12-17 Thread Lukas Wunner

Since commit 989561de9b51 ("PM / Domains: add setter for dev.pm_domain")
a PM domain may only be assigned to unbound devices.

The motivation was not made explicit in the changelog other than "in the
general case that can cause problems and also [...] we can simplify code
quite a bit if we can always assume that".  Rafael J. Wysocki elaborated
in a mailing list conversation that "setting a PM domain generally
changes the set of PM callbacks for the device and it may not be safe to
call it after the driver has been bound".

The concern seems to be that if a device is put to sleep and its PM
callbacks are changed, the device may end up in an undefined state or
not resume at all.  The real underlying requirement is thus to ensure
that the device is awake and execution of its PM callbacks is prevented
while the PM domain is assigned.  Unbound devices happen to fulfill this
requirement, but bound devices can be made to satisfy it as well:
The caller can prevent execution of PM ops with lock_system_sleep() and
by holding a runtime PM reference to the device.

Accordingly, adjust dev_pm_domain_set() to WARN only if the device is in
the midst of a system sleep transition, or runtime PM is enabled and the
device is either not active or may become inactive imminently (because
it has no active children or its refcount is zero).

The change is required to support runtime PM for Thunderbolt on the Mac,
which poses the unique issue that a child device (the NHI) needs to
assign a PM domain to its grandparent (the upstream bridge).  Because
the grandparent's driver is built-in and the child's driver is a module,
the grandparent is usually already bound when the child probes,
resulting in a WARN splat when calling dev_pm_domain_set().  However the
PM core guarantees both that the grandparent is active and that system
sleep is not commenced until the child has finished probing.  So in this
case it is safe to call dev_pm_domain_set() from the child's ->probe
hook and the WARN splat is entirely gratuitous.

Note that commit e79aee49bcf9 ("PM: Avoid false-positive warnings in
dev_pm_domain_set()") modified the WARN to not apply if a PM domain is
removed.  This is unsafe as it allows removal of the PM domain while
the device is asleep.  The present commit rectifies this.

Cc: Ulf Hansson 
Cc: Tomeu Vizoso 
Cc: Rafael J. Wysocki 
Signed-off-by: Lukas Wunner 
---
 drivers/base/power/common.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/base/power/common.c b/drivers/base/power/common.c
index f6a9ad5..d02c1e0 100644
--- a/drivers/base/power/common.c
+++ b/drivers/base/power/common.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "power.h"
 
@@ -136,8 +137,10 @@ EXPORT_SYMBOL_GPL(dev_pm_domain_detach);
  * @dev: Device whose PM domain is to be set.
  * @pd: PM domain to be set, or NULL.
  *
- * Sets the PM domain the device belongs to. The PM domain of a device needs
- * to be set before its probe finishes (it's bound to a driver).
+ * Sets the PM domain the device belongs to.  The PM domain of a device needs
+ * to be set while the device is awake.  This is guaranteed during ->probe.
+ * Otherwise the caller is responsible for ensuring wakefulness, e.g. by
+ * holding a runtime PM reference as well as invoking lock_system_sleep().
  *
  * This function must be called with the device lock held.
  */
@@ -146,8 +149,12 @@ void dev_pm_domain_set(struct device *dev, struct dev_pm_domain *pd)
if (dev->pm_domain == pd)
return;
 
-   WARN(pd && device_is_bound(dev),
-"PM domains can only be changed for unbound devices\n");
+   WARN(dev->power.is_prepared || dev->power.is_suspended ||
+(pm_runtime_enabled(dev) &&
+ (dev->power.runtime_status != RPM_ACTIVE ||
+  (pm_children_suspended(dev) &&
+   !atomic_read(&dev->power.usage_count)))),
+"PM domains can only be changed for awake devices\n");
dev->pm_domain = pd;
device_pm_check_callbacks(dev);
 }
-- 
2.10.2
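
The new WARN condition can be read as a predicate over a handful of PM fields. Below is a minimal userspace sketch of that predicate, assuming a simplified struct toy_pm_info in place of struct dev_pm_info. All toy_* names are hypothetical, and toy_children_suspended() mirrors the pm_children_suspended() helper restored in patch 4/7.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_rpm_status {
	TOY_RPM_ACTIVE,
	TOY_RPM_RESUMING,
	TOY_RPM_SUSPENDED,
	TOY_RPM_SUSPENDING,
};

/* Hypothetical subset of struct dev_pm_info (assumption). */
struct toy_pm_info {
	bool is_prepared;		/* system sleep transition started */
	bool is_suspended;		/* system sleep: device suspended */
	bool runtime_enabled;
	enum toy_rpm_status runtime_status;
	bool ignore_children;
	int child_count;		/* number of active children */
	int usage_count;		/* runtime PM references held */
};

static bool toy_children_suspended(const struct toy_pm_info *p)
{
	return p->ignore_children || p->child_count == 0;
}

/* True if changing the PM domain now would trigger the WARN. */
static bool toy_domain_change_warns(const struct toy_pm_info *p)
{
	assert(p != NULL);
	return p->is_prepared || p->is_suspended ||
	       (p->runtime_enabled &&
		(p->runtime_status != TOY_RPM_ACTIVE ||
		 (toy_children_suspended(p) && p->usage_count == 0)));
}
```

With these definitions, an active device is considered safe as long as either an active child or a held reference keeps it from suspending, which corresponds to the grandparent case described in the commit message.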

[PATCH v3 7/7] thunderbolt: Runtime suspend NHI when idle

2016-12-17 Thread Lukas Wunner

Runtime suspend the NHI when no Thunderbolt devices have been plugged in
for 10 sec (user-configurable via autosuspend_delay_ms in sysfs).

The NHI is not able to detect plug events while suspended; it relies on
the GPE handler to resume it on hotplug.

After the NHI resumes, it takes about 700 ms until a hotplug event
appears on the RX ring.  In case autosuspend_delay_ms has been reduced
to 0 by the user, we need to wait in tb_resume() to avoid going back to
sleep before we had a chance to detect a hotplugged device.  A runtime
pm ref is held for the duration of tb_handle_hotplug() to keep the NHI
awake while the hotplug event is processed.

Apart from that we acquire a runtime pm ref for each newly allocated
switch (except for the root switch) and drop one when a switch is freed,
thereby ensuring the NHI stays active as long as devices are plugged in.
This behaviour is identical to the macOS driver.

Cc: Andreas Noever 
Signed-off-by: Lukas Wunner 
---
 drivers/thunderbolt/nhi.c|  2 ++
 drivers/thunderbolt/power.c  |  9 +
 drivers/thunderbolt/switch.c |  9 +
 drivers/thunderbolt/tb.c | 13 +
 4 files changed, 33 insertions(+)

diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
index 88fb2fb..319ed81 100644
--- a/drivers/thunderbolt/nhi.c
+++ b/drivers/thunderbolt/nhi.c
@@ -632,6 +632,8 @@ static const struct dev_pm_ops nhi_pm_ops = {
* pci-tunnels stay alive.
*/
.restore_noirq = nhi_resume_noirq,
+   .runtime_suspend = nhi_suspend_noirq,
+   .runtime_resume = nhi_resume_noirq,
 };
 
 static struct pci_device_id nhi_ids[] = {
diff --git a/drivers/thunderbolt/power.c b/drivers/thunderbolt/power.c
index 4d7c6a0..1b5f066 100644
--- a/drivers/thunderbolt/power.c
+++ b/drivers/thunderbolt/power.c
@@ -320,6 +320,12 @@ void thunderbolt_power_init(struct tb *tb)
 
tb->power = power;
 
+   pm_runtime_allow(nhi_dev);
+   pm_runtime_set_autosuspend_delay(nhi_dev, 1);
+   pm_runtime_use_autosuspend(nhi_dev);
+   pm_runtime_mark_last_busy(nhi_dev);
+   pm_runtime_put_autosuspend(nhi_dev);
+
return;
 
 err:
@@ -336,6 +342,9 @@ void thunderbolt_power_fini(struct tb *tb)
if (!power)
return;
 
+   pm_runtime_get(nhi_dev);
+   pm_runtime_forbid(nhi_dev);
+
tb->power = NULL;
dev_pm_domain_set(upstream_dev, NULL);
 
diff --git a/drivers/thunderbolt/switch.c b/drivers/thunderbolt/switch.c
index c6f30b1..422fe6e 100644
--- a/drivers/thunderbolt/switch.c
+++ b/drivers/thunderbolt/switch.c
@@ -5,6 +5,7 @@
  */
 
 #include 
+#include 
 #include 
 
 #include "tb.h"
@@ -326,6 +327,11 @@ void tb_switch_free(struct tb_switch *sw)
if (!sw->is_unplugged)
tb_plug_events_active(sw, false);
 
+   if (sw != sw->tb->root_switch) {
+   pm_runtime_mark_last_busy(&sw->tb->nhi->pdev->dev);
+   pm_runtime_put_autosuspend(&sw->tb->nhi->pdev->dev);
+   }
+
kfree(sw->ports);
kfree(sw->drom);
kfree(sw);
@@ -420,6 +426,9 @@ struct tb_switch *tb_switch_alloc(struct tb *tb, u64 route)
if (tb_plug_events_active(sw, true))
goto err;
 
+   if (tb->root_switch)
+   pm_runtime_get(&tb->nhi->pdev->dev);
+
return sw;
 err:
kfree(sw->ports);
diff --git a/drivers/thunderbolt/tb.c b/drivers/thunderbolt/tb.c
index 24b6d30..a3fedf9 100644
--- a/drivers/thunderbolt/tb.c
+++ b/drivers/thunderbolt/tb.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "tb.h"
 #include "tb_regs.h"
@@ -217,8 +218,11 @@ static void tb_handle_hotplug(struct work_struct *work)
 {
struct tb_hotplug_event *ev = container_of(work, typeof(*ev), work);
struct tb *tb = ev->tb;
+   struct device *dev = &tb->nhi->pdev->dev;
struct tb_switch *sw;
struct tb_port *port;
+
+   pm_runtime_get(dev);
mutex_lock(&tb->lock);
if (!tb->hotplug_active)
goto out; /* during init, suspend or shutdown */
@@ -274,6 +278,8 @@ static void tb_handle_hotplug(struct work_struct *work)
 out:
mutex_unlock(&tb->lock);
kfree(ev);
+   pm_runtime_mark_last_busy(dev);
+   pm_runtime_put_autosuspend(dev);
 }
 
 /**
@@ -433,4 +439,11 @@ void thunderbolt_resume(struct tb *tb)
tb->hotplug_active = true;
mutex_unlock(&tb->lock);
tb_info(tb, "resume finished\n");
+
+   /*
+* If runtime resuming due to a hotplug event (rather than resuming
+* from system sleep), wait for it to arrive. May take about 700 ms.
+*/
+   if (tb->nhi->pdev->dev.power.runtime_status == RPM_RESUMING)
+   msleep(1000);
 }
-- 
2.10.2
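
The reference counting in tb_switch_alloc() and tb_switch_free() above can be pictured with a small userspace model. This is a sketch under simplifying assumptions: the toy_* types and the plain integer counter are hypothetical and stand in for the NHI's runtime PM usage count, and autosuspend delays are ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_switch;

/* Hypothetical model of struct tb (assumption). */
struct toy_tb {
	struct toy_switch *root_switch;
	int nhi_usage_count;	/* stands in for the NHI's runtime PM refs */
};

struct toy_switch {
	struct toy_tb *tb;
};

/* Take a reference for every switch except the root switch. */
static void toy_switch_alloc(struct toy_tb *tb, struct toy_switch *sw)
{
	assert(tb != NULL && sw != NULL);
	sw->tb = tb;
	if (tb->root_switch)	/* root exists, so sw is a plugged-in device */
		tb->nhi_usage_count++;
}

/* Drop the reference again when a non-root switch goes away. */
static void toy_switch_free(struct toy_switch *sw)
{
	assert(sw != NULL && sw->tb != NULL);
	if (sw != sw->tb->root_switch)
		sw->tb->nhi_usage_count--;
}

/* The NHI may only autosuspend once no references are held. */
static bool toy_nhi_may_autosuspend(const struct toy_tb *tb)
{
	return tb->nhi_usage_count == 0;
}
```

In this model the root switch never pins the NHI, so a controller with nothing plugged in is free to suspend.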

Re: OOM: Better, but still there on

2016-12-17 Thread Tetsuo Handa

On 2016/12/17 21:59, Nils Holland wrote:
> On Sat, Dec 17, 2016 at 01:02:03AM +0100, Michal Hocko wrote:
>> mount -t tracefs none /debug/trace
>> echo 1 > /debug/trace/events/vmscan/enable
>> cat /debug/trace/trace_pipe > trace.log
>>
>> should help
>> [...]
> 
> No problem! I enabled writing the trace data to a file and then tried
> to trigger another OOM situation. That worked, this time without a
> complete kernel panic, but with only my processes being killed and the
> system becoming unresponsive. When that happened, I let it run for
> another minute or two so that in case it was still logging something
> to the trace file, it could continue to do so some time longer. Then I
> rebooted with the only thing that still worked, i.e. by means of magic
> SysRequest.

In an OOM situation, writing to a file on disk is unlikely to work. Maybe
logging via network ( "cat /debug/trace/trace_pipe > /dev/udp/$ip/$port"
if you are using bash) works better. (I wish we could do it from the kernel
so that /bin/cat is not disturbed by delays due to page faults.)

If you can configure netconsole for logging OOM killer messages and
a UDP socket for logging trace_pipe messages, udplogger at
https://osdn.net/projects/akari/scm/svn/tree/head/branches/udplogger/
might be suitable for logging both outputs with timestamps into a single file.

Re: [PATCH v1 & v6 1/2] PM/devfreq: add suspend frequency support

2016-12-17 Thread Chanwoo Choi

Hi Lin,

2016-11-24 18:54 GMT+09:00 Chanwoo Choi :
> Hi Lin,
>
> On Nov 24, 2016, 18:28, Chanwoo Choi wrote:
>> Hi Lin,
>>
>> On Nov 24, 2016, 17:34, hl wrote:
>>> Hi Chanwoo Choi,
>>>
>>>
>>> On Nov 24, 2016, 16:16, Chanwoo Choi wrote:
>>>> Hi Lin,
>>>>
>>>> On Nov 24, 2016, 16:34, hl wrote:
>>>>> Hi Chanwoo Choi,
>>>>>
>>>>>  I think dev_pm_opp_get_suspend_opp() already implements most of
>>>>> the function; all we need is to define the node in the dts, like the following:
>>>>>
>>>>> &dmc_opp_table {
>>>>>  opp06 {
>>>>>  opp-suspend;
>>>>>  };
>>>>> };
>>>> Both approaches use the 'opp-suspend' property.
>>>>
>>>> I think that the method to support suspend-opp has to
>>>> guarantee the following conditions:
>>>> - Support all of devfreq's governors.
>>> As MyungJoo Ham suggested, I will set the suspend frequency in
>>> devfreq_suspend_device(),
>>> which will ignore the governor.
>>
>> Other approach already support the all of governors.
>> Before calling the mail, I discussed with Myungjoo Ham.
>> Myungjoo prefer to use the devfreq_suspend/devfreq_resume().
>
> It is not correct expression. We need to wait the reply from Myungjoo
> to clarify this.
>
>>
>> To Myungjoo,
>> Please add your opinion how to support the suspend frequency.
>
>>
>>>> - The devfreq framework has the responsibility to change the
>>>> frequency/voltage for suspend-opp. If we use the
>>>> new devfreq_suspend(), each devfreq device doesn't need to care
>>>> how to support the suspend-opp. The developer of each
>>>> devfreq device just needs to add the 'opp-suspend' property to the OPP entry in the DT file.
>>> Why should the devfreq framework support changing the voltage? I think it
>>> should be handled in the
>>> specific driver. I think devfreq only needs to get the right
>>> frequency, then pass it to
>> No, the frequency should be handled by governor or framework.
>> The each devfreq device has no any responsibility of next frequency/voltage.
>> The governor and core of devfreq can decide the next frequency/voltage.
>> You can refer to the cpufreq subsystem.
>>
>>> specific driver, I think the voltage should be handled in
>>> devfreq->profile->target();
>>
>> The call of devfreq->profile->target() have to be handled by devfreq 
>> framework.
>> If user want to set the suspend frequency, user can add the 'suspend-opp' 
>> property.
>> It think this way is easy.
>>
>> But,
>> If the each devfreq device want to decide the next frequency/voltage only for
>> suspend state. We can check the cpufreq subsystem.
>>
>> If specific devfreq device want to handle the suspend frequency,
>> each devfreq will add the own suspend/resume functions as following:
>>
>>   struct devfreq_dev_profile {
>>   int (*suspend)(struct devfreq *dev);// new function pointer
>>   int (*resume)(struct devfreq *dev); // new function pointer
>>   } a_profile;
>>
>>   a_profile = devfreq_generic_suspend;
>>
>>   The devfreq framework will provide the devfreq_generic_suspend() function.
>>   int devfreq_generic_suspend(struct devfreq *dev) {
>>   ...
>>   devfreq->profile->target(..., devfreq->suspend_freq);
>>   ...
>>   }
>>
>>   or
>>
>>   a_profile = a_devfreq_suspend; // specific function of each devfreq 
>> device
>>
>>   The devfreq_suspend() will call 'devfreq->profile->suspend()' function
>>   instead of devfreq->profile->target();
>>
>>   The devfreq call the 'devfreq->profile->suspend()'
>>   to support the suspend frequency.
>>
>> Regards,
>> Chanwoo Choi
>
> The key difference between two approaches:
>
> Your approach:
> - The each developer should add the 'opp-suspend' property to the dts file.
> - The each devfreq should call the devfreq_suspend_device()
>   to support the suspend frequency.
>
>   If each devfreq doesn't call the devfreq_suspend_device(), devfreq framework
>   can support the suspend frequency.
>
> Other approach:
> - The each developer only should add the 'opp-suspend' property to the dts 
> file
>   without the additional behavior.
>
> In the cpufreq subsystem,
> When support the suspend frequency of cpufreq, we just add 'opp-suspend' 
> property
> without the additional behavior.

I missed the use case of calling devfreq_suspend_device()
before entering suspend mode. We should consider the case where a
devfreq device calls devfreq_suspend_device() directly. Because
devfreq_suspend_device() is an exported function, a devfreq device can
call it on the fly without entering suspend mode.

I stand corrected: your approach is necessary. Sorry for the confusion.
So I made the following patch. It sets the suspend frequency
in devfreq_suspend_device() after stopping the governor,
and it takes all devfreq governors into account.

What do you think?
If you are OK with it, I'll send this patch with you as the author.

 int devfreq_suspend_device(struct devfreq *dev

Re: [PATCH v4] dt-bindings: power: supply: bq24735: reverse the polarity of ac-detect

2016-12-17 Thread Sebastian Reichel

Hi,

On Fri, Dec 16, 2016 at 10:44:00AM +0100, Peter Rosin wrote:
> The ACOK pin on the bq24735 is active-high, of course meaning that when
> AC is OK the pin is high. However, all Tegra dts files have incorrectly
> specified active-high even though the signal is inverted on the Tegra
> boards. This has worked since the Linux driver has also inverted the
> meaning of the GPIO. Fix this situation by simply specifying in the
> bindings what everybody else agrees on; that the ti,ac-detect-gpios is
> active on AC adapter absence.
> 
> Signed-off-by: Peter Rosin 

Thanks for your patch. We are currently in the merge
window and your patch will appear in linux-next once
4.10-rc1 has been tagged by Linus Torvalds.

Until then I queued it into this branch:

https://git.kernel.org/cgit/linux/kernel/git/sre/linux-power-supply.git/log/?h=for-next-next

-- Sebastian



Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF

2016-12-17 Thread Jeffrey Walton

> diff --git a/lib/test_siphash.c b/lib/test_siphash.c
> new file mode 100644
> index ..93549e4e22c5
> --- /dev/null
> +++ b/lib/test_siphash.c
> @@ -0,0 +1,83 @@
> +/* Test cases for siphash.c
> + *
> + * Copyright (C) 2016 Jason A. Donenfeld . All Rights Reserved.
> + *
> + * This file is provided under a dual BSD/GPLv2 license.
> + *
> + * SipHash: a fast short-input PRF
> + * https://131002.net/siphash/
> + *
> + * This implementation is specifically for SipHash2-4.
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +/* Test vectors taken from official reference source available at:
> + * https://131002.net/siphash/siphash24.c
> + */
> +static const u64 test_vectors[64] = {
> +   0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL,
> +   0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL,
> +   0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL,
> +   0x9e0082df0ba9e4b0ULL, 0x7a5dbbc594ddb9f3ULL, 0xf4b32f46226bada7ULL,
> +   0x751e8fbc860ee5fbULL, 0x14ea5627c0843d90ULL, 0xf723ca908e7af2eeULL,
> +   0xa129ca6149be45e5ULL, 0x3f2acc7f57c29bdbULL, 0x699ae9f52cbe4794ULL,
> +   0x4bc1b3f0968dd39cULL, 0xbb6dc91da77961bdULL, 0xbed65cf21aa2ee98ULL,
> +   0xd0f2cbb02e3b67c7ULL, 0x93536795e3a33e88ULL, 0xa80c038ccd5ccec8ULL,
> +   0xb8ad50c6f649af94ULL, 0xbce192de8a85b8eaULL, 0x17d835b85bbb15f3ULL,
> +   0x2f2e6163076bcfadULL, 0xde4daaaca71dc9a5ULL, 0xa6a2506687956571ULL,
> +   0xad87a3535c49ef28ULL, 0x32d892fad841c342ULL, 0x7127512f72f27cceULL,
> +   0xa7f32346f95978e3ULL, 0x12e0b01abb051238ULL, 0x15e034d40fa197aeULL,
> +   0x314dffbe0815a3b4ULL, 0x027990f029623981ULL, 0xcadcd4e59ef40c4dULL,
> +   0x9abfd8766a33735cULL, 0x0e3ea96b5304a7d0ULL, 0xad0c42d6fc585992ULL,
> +   0x187306c89bc215a9ULL, 0xd4a60abcf3792b95ULL, 0xf935451de4f21df2ULL,
> +   0xa9538f0419755787ULL, 0xdb9acddff56ca510ULL, 0xd06c98cd5c0975ebULL,
> +   0xe612a3cb9ecba951ULL, 0xc766e62cfcadaf96ULL, 0xee64435a9752fe72ULL,
> +   0xa192d576b245165aULL, 0x0a8787bf8ecb74b2ULL, 0x81b3e73d20b49b6fULL,
> +   0x7fa8220ba3b2eceaULL, 0x245731c13ca42499ULL, 0xb78dbfaf3a8d83bdULL,
> +   0xea1ad565322a1a0bULL, 0x60e61c23a3795013ULL, 0x6606d7e446282b93ULL,
> +   0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL,
> +   0x958a324ceb064572ULL
> +};
> +static const siphash_key_t test_key =
> +   { 0x0706050403020100ULL , 0x0f0e0d0c0b0a0908ULL };
> +
> +static int __init siphash_test_init(void)
> +{
> +   u8 in[64] __aligned(SIPHASH_ALIGNMENT);
> +   u8 in_unaligned[65];
> +   u8 i;
> +   int ret = 0;
> +
> +   for (i = 0; i < 64; ++i) {
> +   in[i] = i;
> +   in_unaligned[i + 1] = i;
> +   if (siphash(in, i, test_key) != test_vectors[i]) {
> +   pr_info("self-test aligned %u: FAIL\n", i + 1);
> +   ret = -EINVAL;
> +   }
> +   if (siphash_unaligned(in_unaligned + 1, i, test_key) != test_vectors[i]) {
> +   pr_info("self-test unaligned %u: FAIL\n", i + 1);
> +   ret = -EINVAL;
> +   }
> +   }
> +   if (!ret)
> +   pr_info("self-tests: pass\n");
> +   return ret;
> +}
> +
> +static void __exit siphash_test_exit(void)
> +{
> +}
> +
> +module_init(siphash_test_init);
> +module_exit(siphash_test_exit);
> +
> +MODULE_AUTHOR("Jason A. Donenfeld ");
> +MODULE_LICENSE("Dual BSD/GPL");
> --
> 2.11.0
>

I believe the output of SipHash depends upon endianness. Folks who
request a digest through the af_alg interface will likely expect a
byte array.

I think that means on little endian machines, values like element 0
must be byte reversed:

0x726fdb47dd0e0e31ULL => 31,0e,0e,dd,47,db,6f,72

If I am not mistaken, that value (and other tv's) are returned here:

return (v0 ^ v1) ^ (v2 ^ v3);

It may be prudent to include the endian reversal in the test to ensure
big endian machines produce the expected results. Some closely related
testing on an old Apple PowerMac G5 revealed that the result needed to be
reversed before returning it to a caller.

Jeff
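
To make the byte-array form mentioned above concrete, here is a small sketch that serializes a 64-bit value least-significant byte first, independent of host byte order. The helper name toy_u64_to_le_bytes() is made up for this example and is not part of the siphash patch. Whether the af_alg interface would actually want this layout is the open question raised in the mail, not something the sketch settles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Write v into out[] least-significant byte first.  Shifting instead of
 * casting the pointer gives the same result on little and big endian hosts.
 */
static void toy_u64_to_le_bytes(uint64_t v, uint8_t out[8])
{
	size_t i;

	assert(out != NULL);
	for (i = 0; i < 8; i++)
		out[i] = (uint8_t)(v >> (8 * i));
}
```

Applied to test vector 0, 0x726fdb47dd0e0e31ULL, this yields the byte sequence 31 0e 0e dd 47 db 6f 72 quoted above.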

Re: [PATCH v3] power: supply: bq24735-charger: optionally poll the ac-detect gpio

2016-12-17 Thread Sebastian Reichel

Hi,

On Thu, Dec 15, 2016 at 10:28:46AM +0100, Peter Rosin wrote:
> If the ac-detect gpio does not support interrupts, provide a fallback
> to poll the gpio at a configurable interval.
> 
> Signed-off-by: Peter Rosin 

Thanks for your patch. We are currently in the merge
window and your patch will appear in linux-next once
4.10-rc1 has been tagged by Linus Torvalds.

Until then I queued it into this branch:

https://git.kernel.org/cgit/linux/kernel/git/sre/linux-power-supply.git/log/?h=for-next-next

-- Sebastian


Re: [PATCH v2 1/2] mfd: axp20x: Add a few missing defines for AXP288 specific registers

2016-12-17 Thread Chen-Yu Tsai

On Thu, Dec 15, 2016 at 10:07 PM, Hans de Goede  wrote:
> Add defines for the AXP288_POWER_REASON and AXP288_RT_BATT_V_H and
> AXP288_RT_BATT_V_L registers. While at it also move the
> AXP288_TS_ADC_H-AXP288_GP_ADC_L defines, which for some reason were
> in a different place, together with the rest of the AXP288 specific
> defines.
>
> Signed-off-by: Hans de Goede 

Acked-by: Chen-Yu Tsai

Re: [PATCH v2 2/2] mfd: axp20x: Fix axp288 volatile ranges

2016-12-17 Thread Chen-Yu Tsai

On Thu, Dec 15, 2016 at 10:07 PM, Hans de Goede  wrote:
> The axp288 pmic has a lot more volatile registers than we were
> listing in axp288_volatile_ranges, fix this.
>
> Signed-off-by: Hans de Goede 

Acked-by: Chen-Yu Tsai

Re: [PATCH v3 1/2] mfd: axp20x: Add a few missing defines for AXP288 specific registers

2016-12-17 Thread Chen-Yu Tsai

On Sat, Dec 17, 2016 at 4:09 AM, Hans de Goede  wrote:
> Add defines for the AXP288_POWER_REASON and AXP288_RT_BATT_V_H and
> AXP288_RT_BATT_V_L and AXP288_BC_* registers. While at it also move the
> AXP288_TS_ADC_H-AXP288_GP_ADC_L defines, which for some reason were
> in a different place, together with the rest of the AXP288 specific
> defines.
>
> Signed-off-by: Hans de Goede 

Acked-by: Chen-Yu Tsai

Re: [PATCH v3 2/2] mfd: axp20x: Fix axp288 volatile ranges

2016-12-17 Thread Chen-Yu Tsai

On Sat, Dec 17, 2016 at 4:09 AM, Hans de Goede  wrote:
> The axp288 pmic has a lot more volatile registers than we were
> listing in axp288_volatile_ranges, fix this.
>
> Signed-off-by: Hans de Goede 

Acked-by: Chen-Yu Tsai 

FYI, if you're going to add support for the battery charger detection
module later, you would need to add the remaining AXP288_BC_* registers
to the writable table.

Re: [PATCH v1 & v6 1/2] PM/devfreq: add suspend frequency support

2016-12-17 Thread Tobias Jakobi

Hey guys,

Chanwoo Choi wrote:
> Hi Lin,
> 
> 2016-11-24 18:54 GMT+09:00 Chanwoo Choi :
>> Hi Lin,
>>
>> On Nov 24, 2016, 18:28, Chanwoo Choi wrote:
>>> Hi Lin,
>>>
>>> On Nov 24, 2016, 17:34, hl wrote:
>>>> Hi Chanwoo Choi,
>>>>
>>>>
>>>> On Nov 24, 2016, 16:16, Chanwoo Choi wrote:
>>>>> Hi Lin,
>>>>>
>>>>> On Nov 24, 2016, 16:34, hl wrote:
>>>>>> Hi Chanwoo Choi,
>>>>>>
>>>>>>  I think dev_pm_opp_get_suspend_opp() already implements most of
>>>>>> the function; all we need is to define the node in the dts, like the following:
>>>>>>
>>>>>> &dmc_opp_table {
>>>>>>  opp06 {
>>>>>>  opp-suspend;
>>>>>>  };
>>>>>> };
>>>>> Both approaches use the 'opp-suspend' property.
>>>>>
>>>>> I think that the method to support suspend-opp has to
>>>>> guarantee the following conditions:
>>>>> - Support all of devfreq's governors.
>>>> As MyungJoo Ham suggested, I will set the suspend frequency in
>>>> devfreq_suspend_device(),
>>>> which will ignore the governor.
>>>
>>> Other approach already support the all of governors.
>>> Before calling the mail, I discussed with Myungjoo Ham.
>>> Myungjoo prefer to use the devfreq_suspend/devfreq_resume().
>>
>> It is not correct expression. We need to wait the reply from Myungjoo
>> to clarify this.
>>
>>>
>>> To Myungjoo,
>>> Please add your opinion how to support the suspend frequency.
>>
>>>
>>>>> - Devfreq framework has the responsibility to change the
>>>>>   frequency/voltage for suspend-opp. If we use the
>>>>>   new devfreq_suspend(), each devfreq device doesn't care
>>>>>   how to support the suspend-opp. The developer of each
>>>>>   devfreq device just needs to add the 'opp-suspend' property to the OPP
>>>>>   entry in the DT file.
>>>> Why should the devfreq framework support changing the voltage? I think it
>>>> should be handled in the
>>>> specific driver. I think devfreq only needs to make sure it can get the right
>>>> frequency, then pass it to
>>>
>>> No, the frequency should be handled by governor or framework.
>>> The each devfreq device has no any responsibility of next frequency/voltage.
>>> The governor and core of devfreq can decide the next frequency/voltage.
>>> You can refer to the cpufreq subsystem.
>>>
>>>> specific driver, I think the voltage should be handled in
>>>> devfreq->profile->target();
>>>
>>> The call of devfreq->profile->target() have to be handled by devfreq 
>>> framework.
>>> If user want to set the suspend frequency, user can add the 'suspend-opp' 
>>> property.
>>> I think this way is easy.
>>>
>>> But,
>>> If the each devfreq device want to decide the next frequency/voltage only 
>>> for
>>> suspend state. We can check the cpufreq subsystem.
>>>
>>> If specific devfreq device want to handle the suspend frequency,
>>> each devfreq will add the own suspend/resume functions as following:
>>>
>>>   struct devfreq_dev_profile {
>>>           int (*suspend)(struct devfreq *dev);    // new function pointer
>>>           int (*resume)(struct devfreq *dev);     // new function pointer
>>>   } a_profile;
>>>
>>>   a_profile.suspend = devfreq_generic_suspend;
>>>
>>>   The devfreq framework will provide the devfreq_generic_suspend()
>>>   function.
>>>   int devfreq_generic_suspend(struct devfreq *dev) {
>>>           ...
>>>           dev->profile->target(..., dev->suspend_freq);
>>>           ...
>>>   }
>>>
>>>   or
>>>
>>>   a_profile.suspend = a_devfreq_suspend; // specific function of each devfreq device
>>>
>>>   The devfreq_suspend() will call 'devfreq->profile->suspend()' function
>>>   instead of devfreq->profile->target();
>>>
>>>   The devfreq call the 'devfreq->profile->suspend()'
>>>   to support the suspend frequency.
>>>
>>> Regards,
>>> Chanwoo Choi
>>
>> The key difference between two approaches:
>>
>> Your approach:
>> - The each developer should add the 'opp-suspend' property to the dts file.
>> - The each devfreq should call the devfreq_suspend_device()
>>   to support the suspend frequency.
>>
>>   If each devfreq doesn't call the devfreq_suspend_device(), devfreq 
>> framework
>>   can support the suspend frequency.
>>
>> Other approach:
>> - The each developer only should add the 'opp-suspend' property to the dts 
>> file
>>   without the additional behavior.
>>
>> In the cpufreq subsystem,
>> When support the suspend frequency of cpufreq, we just add 'opp-suspend' 
>> property
>> without the additional behavior.
> 
> I'm missing the use-case when using the devfreq_suspend_device()
> before entering the suspend mode. We should consider the case when
> devfreq device
> calls the devfreq_suspend_device() directly. Because devfreq_suspend_device()
> is exported function, each devfreq device call this function on the fly
> without entering the suspend mode.
> 
> I correct my opinion. Your approach is necessary. I'm sorry to confuse you.
> So, I make the following patch. This patch set the suspend frequency
> in devfreq_s

Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF

2016-12-17 Thread George Spelvin

To follow up on my comments that your benchmark results were peculiar,
here's my benchmark code.

It just computes the hash of all n*(n+1)/2 possible non-empty substrings
of a buffer of n (called "max" below) bytes.  "cpb" is "cycles per byte".

(The average length is (n+2)/3, c.f. https://oeis.org/A000292)
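
As a quick sanity check of the counting above, here is a minimal C sketch. The function names are mine and are not part of the benchmark; it only assumes that "cpb" in the tables below is total cycles divided by total bytes hashed, which matches the printed figures (for max=4, 10495 cycles over 20 bytes gives 524.75).

```c
#include <assert.h>
#include <stdint.h>

/* Number of non-empty substrings of an n-byte buffer: n*(n+1)/2. */
static uint64_t substring_count(uint64_t n)
{
	return n * (n + 1) / 2;
}

/*
 * Total bytes hashed, counted the slow way: there are (n - len + 1)
 * substrings of each length len.  The closed form is n*(n+1)*(n+2)/6,
 * which divided by substring_count(n) gives the average (n+2)/3.
 */
static uint64_t substring_bytes(uint64_t n)
{
	uint64_t len, total = 0;

	for (len = 1; len <= n; len++)
		total += len * (n - len + 1);
	assert(total == n * (n + 1) * (n + 2) / 6);
	return total;
}

/* Cycles per byte, assuming cpb = cycles / total bytes hashed. */
static double cycles_per_byte(uint64_t cycles, uint64_t n)
{
	return (double)cycles / (double)substring_bytes(n);
}
```

For n = 1024 this gives 179481600 bytes, so the 1540259472 cycles reported for SipHash on x86-32 work out to roughly 8.58 cycles per byte.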

On x86-32, HSipHash is asymptotically twice the speed of SipHash,
rising to 2.5x for short strings:

SipHash/HSipHash benchmark, sizeof(long) = 4
 SipHash: max=   4 cycles= 10495 cpb=524.7500 (sum=47a4f5554869fa97)
HSipHash: max=   4 cycles=  3400 cpb=170.0000 (sum=146a863e)
 SipHash: max=   8 cycles= 24468 cpb=203.9000 (sum=21c41a86355affcc)
HSipHash: max=   8 cycles=  9237 cpb= 76.9750 (sum=d3b5e0cd)
 SipHash: max=  16 cycles= 94622 cpb=115.9583 (sum=26d816b72721e48f)
HSipHash: max=  16 cycles= 34499 cpb= 42.2782 (sum=16bb7475)
 SipHash: max=  32 cycles=418767 cpb= 69.9811 (sum=dd5a97694b8a832d)
HSipHash: max=  32 cycles=156695 cpb= 26.1857 (sum=eed00fcb)
 SipHash: max=  64 cycles=   2119152 cpb= 46.3101 (sum=a2a725aecc09ed00)
HSipHash: max=  64 cycles=   1008678 cpb= 22.0428 (sum=99b9f4f)
 SipHash: max= 128 cycles=  12728659 cpb= 35.5788 (sum=420878cd20272817)
HSipHash: max= 128 cycles=   5452931 cpb= 15.2419 (sum=f1f4ad18)
 SipHash: max= 256 cycles=  38931946 cpb= 13.7615 (sum=e05dfb28b90dfd98)
HSipHash: max= 256 cycles=  13807312 cpb=  4.8805 (sum=ceeafcc1)
 SipHash: max= 512 cycles= 205537380 cpb=  9.1346 (sum=7d129d4de145fbea)
HSipHash: max= 512 cycles= 103420960 cpb=  4.5963 (sum=7f15a313)
 SipHash: max=1024 cycles=1540259472 cpb=  8.5817 (sum=cca7cbdc778ca8af)
HSipHash: max=1024 cycles= 796090824 cpb=  4.4355 (sum=d8f3374f)

On x86-64, SipHash is consistently faster, asymptotically approaching 2x
for long strings:

SipHash/HSipHash benchmark, sizeof(long) = 8
 SipHash: max=   4 cycles=  2642 cpb=132.1000 (sum=47a4f5554869fa97)
HSipHash: max=   4 cycles=  2498 cpb=124.9000 (sum=146a863e)
 SipHash: max=   8 cycles=  5270 cpb= 43.9167 (sum=21c41a86355affcc)
HSipHash: max=   8 cycles=  7140 cpb= 59.5000 (sum=d3b5e0cd)
 SipHash: max=  16 cycles= 19950 cpb= 24.4485 (sum=26d816b72721e48f)
HSipHash: max=  16 cycles= 23546 cpb= 28.8554 (sum=16bb7475)
 SipHash: max=  32 cycles= 80188 cpb= 13.4004 (sum=dd5a97694b8a832d)
HSipHash: max=  32 cycles=101218 cpb= 16.9148 (sum=eed00fcb)
 SipHash: max=  64 cycles=373286 cpb=  8.1575 (sum=a2a725aecc09ed00)
HSipHash: max=  64 cycles=535568 cpb= 11.7038 (sum=99b9f4f)
 SipHash: max= 128 cycles=   2075224 cpb=  5.8006 (sum=420878cd20272817)
HSipHash: max= 128 cycles=   3336820 cpb=  9.3270 (sum=f1f4ad18)
 SipHash: max= 256 cycles=  14276278 cpb=  5.0463 (sum=e05dfb28b90dfd98)
HSipHash: max= 256 cycles=  28847880 cpb= 10.1970 (sum=ceeafcc1)
 SipHash: max= 512 cycles=  50135180 cpb=  2.2281 (sum=7d129d4de145fbea)
HSipHash: max= 512 cycles=  86145916 cpb=  3.8286 (sum=7f15a313)
 SipHash: max=1024 cycles= 334111900 cpb=  1.8615 (sum=cca7cbdc778ca8af)
HSipHash: max=1024 cycles= 640432452 cpb=  3.5682 (sum=d8f3374f)


Here's the code; compile with -DSELFTEST.  (The main purpose of
printing the sum is to prevent dead code elimination.)


#if SELFTEST
#include <stdint.h>
#include <stdio.h>

static inline uint64_t rol64(uint64_t word, unsigned int shift)
{
	return word << shift | word >> (64 - shift);
}

static inline uint32_t rol32(uint32_t word, unsigned int shift)
{
	return word << shift | word >> (32 - shift);
}

static inline uint64_t get_unaligned_le64(void const *p)
{
	return *(uint64_t const *)p;
}

static inline uint32_t get_unaligned_le32(void const *p)
{
	return *(uint32_t const *)p;
}

static inline uint64_t le64_to_cpup(uint64_t const *p)
{
	return *p;
}

static inline uint32_t le32_to_cpup(uint32_t const *p)
{
	return *p;
}


#else
#include <linux/bitops.h>	/* For rol64 */
#include 
#include 
#include 
#endif

/* The basic ARX mixing function, taken from Skein */
#define SIP_MIX(a, b, s) ((a) += (b), (b) = rol64(b, s), (b) ^= (a))

/*
 * The complete SipRound.  Note that, when unrolled twice like below,
 * the 32-bit rotates drop out on 32-bit machines.
 */
#define SIP_ROUND(a, b, c, d) \
(SIP_MIX(a, b, 13), SIP_MIX(c, d, 16), (a) = rol64(a, 32), \
 SIP_MIX(c, b, 17), SIP_MIX(a, d, 21), (c) = rol64(c, 32))

/*
 * This is rolled up more than most implementations, resulting in about
 * 55% the code size.  Speed is a few percent slower.  A crude benchmark
 * (for (i=1; i <= max; i++) for (j = 0; j < 4096-i; j++) hash(buf+j, i);)
 * produces the following timings (in usec):
 *
 *           i386    i386    i386    x86_64  x86_64  x86_64  x86_64
 * Length    small   unroll  halfmd4 small   unroll  halfmd4 teahash
 * 1..4      1069    1029    1608    195     160     399     690
 * 1..8      2483    2381    3851    410     360     988     1659
 * 1..12     4303    4152    6207    690     618     1642    2690
 * 1..16     6122    5931    866

Re: [PATCH v3] net: macb: Added PCI wrapper for Platform Driver.

2016-12-17 Thread David Miller

From: Bartosz Folta 
Date: Wed, 14 Dec 2016 06:39:15 +

> There are hardware PCI implementations of Cadence GEM network
> controller. This patch will allow to use such hardware with reuse of
> existing Platform Driver.
> 
> Signed-off-by: Bartosz Folta 
> ---
> Changed in v3:
> Fixed dependencies in Kconfig.
> ---
> Changed in v2:
> Respin to net-next. Changed patch formatting.

Applied.

Re: [kernel-hardening] Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF

2016-12-17 Thread Theodore Ts'o

On Fri, Dec 16, 2016 at 09:15:03PM -0500, George Spelvin wrote:
> >> - Ted, Andy Lutomirski and I will try to figure out a construction of
> >>   get_random_long() that we all like.

We don't have to find the most optimal solution right away; we can
approach this incrementally, after all.

So long as we replace get_random_{long,int}() with something which is
(a) strictly better in terms of security given today's use of MD5, and
(b) which is strictly *faster* than the current construction on 32-bit
and 64-bit systems, we can do that, and can try to make it be faster
while maintaining some minimum level of security which is sufficient
for all current users of get_random_{long,int}() and which can be
clearly articulated for future users of get_random_{long,int}().

The main worry at this point I have is benchmarking siphash on a
32-bit system.  It may be that simply batching the chacha20 output so
that we're using the urandom construction more efficiently is the
better way to go, since that *does* meet the criterion of strictly more
secure and strictly faster than the current MD5 solution.  I'm open to
using siphash, but I want to see the 32-bit numbers first.

As far as half-siphash is concerned, it occurs to me that the main
problem will be those users who need to guarantee that output can't be
guessed over a long period of time.  For example, if you have a
long-running process, then the output needs to remain unguessable over
potentially months or years, or else you might be weakening the ASLR
protections.  If on the other hand, the hash table or the process will
be going away in a matter of seconds or minutes, the requirements with
respect to cryptographic strength go down significantly.

Now, maybe this doesn't matter that much if we can guarantee (or make
assumptions) that the attacker doesn't have unlimited access to the
output stream of get_random_{long,int}(), or if it's being used in an
anti-DOS use case where it ultimately only needs to be harder than
alternate ways of attacking the system.

Rekeying every five minutes doesn't necessarily help with respect
to ASLR, but it might reduce the amount of the output stream that
would be available to the attacker in order to be able to attack the
get_random_{long,int}() generator, and it also reduces the value of
doing that attack to only compromising the ASLR for those processes
started within that five minute window.

Cheers,

- Ted

P.S.  I'm using ASLR as an example use case, above; of course we will
need to make similar examinations of the other uses of
get_random_{long,int}().

P.P.S.  We might also want to think about potentially defining
get_random_{long,int}() to be unambiguously strong, and then creating
a get_weak_random_{long,int}() which, on platforms where performance
might be a consideration, uses a weaker algorithm, perhaps with some
kind of rekeying interval.
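
The batching idea mentioned in the message above can be sketched in user-space C as follows. This is only an illustration under stated assumptions: `rand_batch`, `BATCH_WORDS` and `counter_fill` are hypothetical names, the fill callback stands in for a real bulk generator such as one ChaCha20 block per call, and the counter used here for demonstration is not random at all. Rekeying is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_WORDS 8	/* hypothetical batch size */

/* Caller-supplied bulk generator (hypothetical interface). */
typedef void (*bulk_fill_fn)(uint64_t *out, size_t nwords, void *ctx);

struct rand_batch {
	uint64_t words[BATCH_WORDS];
	size_t used;		/* words already handed out */
	bulk_fill_fn fill;
	void *ctx;
	unsigned long refills;	/* calls to fill(), kept for illustration */
};

static void rand_batch_init(struct rand_batch *b, bulk_fill_fn fill, void *ctx)
{
	assert(fill != NULL);
	b->used = BATCH_WORDS;	/* empty: force a refill on first use */
	b->fill = fill;
	b->ctx = ctx;
	b->refills = 0;
}

/* Hand out one word; pay for the expensive generator once per batch. */
static uint64_t rand_batch_next(struct rand_batch *b)
{
	uint64_t r;

	if (b->used == BATCH_WORDS) {
		b->fill(b->words, BATCH_WORDS, b->ctx);
		b->used = 0;
		b->refills++;
	}
	r = b->words[b->used];
	b->words[b->used++] = 0;	/* do not keep handed-out values around */
	return r;
}

/* Demonstration stand-in only: a plain counter, NOT a random source. */
static void counter_fill(uint64_t *out, size_t nwords, void *ctx)
{
	uint64_t *next = ctx;
	size_t i;

	for (i = 0; i < nwords; i++)
		out[i] = (*next)++;
}

/* Draw `draws` words and report how many refills that needed. */
static unsigned long demo_refills(unsigned int draws)
{
	struct rand_batch b;
	uint64_t next = 0;
	unsigned int i;

	rand_batch_init(&b, counter_fill, &next);
	for (i = 0; i < draws; i++) {
		uint64_t r = rand_batch_next(&b);

		assert(r == i);
		(void)r;
	}
	return b.refills;
}
```

With a batch of 8 words, 20 draws cost three generator calls instead of twenty, which is the whole point of batching; whether that beats a per-call hash on 32-bit systems is exactly the benchmark question raised above.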

Re: [PATCH v2 00/11] add support for VBUS max current and min voltage limits AXP20X and AXP22X PMICs

2016-12-17 Thread Sebastian Reichel

Hi Quentin,

On Fri, Dec 09, 2016 at 12:04:08PM +0100, Quentin Schulz wrote:
> The X-Powers AXP209 and AXP20X PMICs are able to set a limit for the
> VBUS power supply for both max current and min voltage supplied. This
> series of patch adds the possibility to set these limits from sysfs.
> 
> Also, the AXP223 PMIC shares most of its behaviour with the AXP221 but
> the former can set the VBUS power supply max current to 100mA, unlike
> the latter. The AXP223 VBUS power supply driver used to probe on the
> AXP221 compatible. This series of patch introduces a new compatible for
> the AXP223 to be able to set the current max limit to 100mA.
> 
> With that new compatible, boards having the AXP223 see their DT updated
> to use the VBUS power supply driver with the correct compatible.
> 
> This series of patch also migrates from of_device_is_compatible function
> to the data field of of_device_id to identify the compatible used to
> probe. This improves the code readability.
> 
> Mostly cosmetic changes in v2 and adding volatile and writeable regs to
> AXP20X and AXP22X MFD cells for the VBUS power supply driver.
> 
> Quentin Schulz (11):
>   power: supply: axp20x_usb_power: use of_device_id data field instead
> of device_is_compatible
>   mfd: axp20x: add volatile and writeable reg ranges for VBUS power
> supply driver
>   power: supply: axp20x_usb_power: set min voltage and max current from
> sysfs
>   Documentation: DT: binding: axp20x_usb_power: add axp223 compatible
>   power: supply: axp20x_usb_power: add 100mA max current limit for
> AXP223
>   mfd: axp20x: add separate MFD cell for AXP223
>   ARM: dtsi: add DTSI for AXP223
>   ARM: dts: sun8i-a33-olinuxino: use AXP223 DTSI
>   ARM: dts: sun8i-a33-sinlinx-sina33: use AXP223 DTSI
>   ARM: dts: sun8i-r16-parrot: use AXP223 DTSI
>   ARM: dtsi: sun8i-reference-design-tablet: use AXP223 DTSI

Thanks for your patchset. We are currently in the merge
window and patches 1 & 3-5 will appear in linux-next once
4.10-rc1 has been tagged by Linus Torvalds.

Until then I queued them into this branch:

https://git.kernel.org/cgit/linux/kernel/git/sre/linux-power-supply.git/log/?h=for-next-next

-- Sebastian


Re: [kernel-hardening] Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF

2016-12-17 Thread Jeffrey Walton

> As far as half-siphash is concerned, it occurs to me that the main
> problem will be those users who need to guarantee that output can't be
> guessed over a long period of time.  For example, if you have a
> long-running process, then the output needs to remain unguessable over
> potentially months or years, or else you might be weakening the ASLR
> protections.  If on the other hand, the hash table or the process will
> be going away in a matter of seconds or minutes, the requirements with
> respect to cryptographic strength go down significantly.

Perhaps SipHash-4-8 should be used instead of SipHash-2-4. I believe
SipHash-4-8 is recommended for the security conscious who want to be
more conservative in their security estimates.

SipHash-4-8 does not add much more processing. If you are clocking
SipHash-2-4 at 2.0 or 2.5 cpb, then SipHash-4-8 will run at 3.0 to
4.0. Both are well below MD5 times. (At least with the data sets I've
tested).

> Now, maybe this doesn't matter that much if we can guarantee (or make
> assumptions) that the attacker doesn't have unlimited access to the
> output stream of get_random_{long,int}(), or if it's being used in an
> anti-DOS use case where it ultimately only needs to be harder than
> alternate ways of attacking the system.
>
> Rekeying every five minutes doesn't necessarily help with respect
> to ASLR, but it might reduce the amount of the output stream that
> would be available to the attacker in order to be able to attack the
> get_random_{long,int}() generator, and it also reduces the value of
> doing that attack to only compromising the ASLR for those processes
> started within that five minute window.

Forgive my ignorance... I did not find reading on using the primitive
in a PRNG. Does anyone know what Aumasson or Bernstein have to say?
Aumasson's site does not seem to discuss the use case:
https://www.google.com/search?q=siphash+rng+site%3A131002.net. (And
their paper only mentions random-number once in a different context).

Making the leap from internal hash tables and short-lived network
packets to the rng case may leave something to be desired, especially
if the bits get used in unanticipated ways, like creating long term
private keys.

Jeff
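
To make the SipHash-2-4 versus SipHash-4-8 comparison concrete, here is a minimal, unoptimized C sketch of SipHash-c-d with the round counts as parameters. It is not the code from the patch under discussion; the function names are mine. The expected value used to check it is the SipHash-2-4 test vector from the SipHash paper (key bytes 00..0f, message bytes 00..0e). The only difference between the two variants is how many times the same round function runs, which is why the cost grows with c and d.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t rotl64(uint64_t x, unsigned int s)
{
	return (x << s) | (x >> (64 - s));
}

/* Read n (at most 8) bytes as a little-endian integer, on any host. */
static uint64_t load_le(const uint8_t *p, size_t n)
{
	uint64_t v = 0;
	size_t i;

	assert(n <= 8);
	for (i = 0; i < n; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* One SipRound on the four-word state. */
static void sipround(uint64_t v[4])
{
	v[0] += v[1]; v[1] = rotl64(v[1], 13); v[1] ^= v[0]; v[0] = rotl64(v[0], 32);
	v[2] += v[3]; v[3] = rotl64(v[3], 16); v[3] ^= v[2];
	v[0] += v[3]; v[3] = rotl64(v[3], 21); v[3] ^= v[0];
	v[2] += v[1]; v[1] = rotl64(v[1], 17); v[1] ^= v[2]; v[2] = rotl64(v[2], 32);
}

/* Absorb one 64-bit message word using c compression rounds. */
static void sip_absorb(uint64_t v[4], uint64_t m, int c)
{
	int r;

	v[3] ^= m;
	for (r = 0; r < c; r++)
		sipround(v);
	v[0] ^= m;
}

/* SipHash-c-d: c compression rounds per word, d finalization rounds. */
static uint64_t siphash_cd(const uint8_t *msg, size_t len,
			   uint64_t k0, uint64_t k1, int c, int d)
{
	uint64_t v[4] = {
		k0 ^ 0x736f6d6570736575ULL,
		k1 ^ 0x646f72616e646f6dULL,
		k0 ^ 0x6c7967656e657261ULL,
		k1 ^ 0x7465646279746573ULL,
	};
	size_t full = len - (len % 8);
	size_t i;
	int r;

	for (i = 0; i < full; i += 8)
		sip_absorb(v, load_le(msg + i, 8), c);
	/* Last word: leftover bytes, with the length mod 256 in the top byte. */
	sip_absorb(v, load_le(msg + full, len - full) | ((uint64_t)len << 56), c);

	v[2] ^= 0xff;
	for (r = 0; r < d; r++)
		sipround(v);
	return v[0] ^ v[1] ^ v[2] ^ v[3];
}

/* Hash the paper's test input (key 00..0f, message 00..0e). */
static uint64_t siphash_paper_vector(int c, int d)
{
	uint8_t key[16], msg[15];
	size_t i;

	for (i = 0; i < sizeof(key); i++)
		key[i] = (uint8_t)i;
	for (i = 0; i < sizeof(msg); i++)
		msg[i] = (uint8_t)i;
	return siphash_cd(msg, sizeof(msg), load_le(key, 8), load_le(key + 8, 8), c, d);
}
```

Calling `siphash_paper_vector(2, 4)` should reproduce the published value 0xa129ca6149be45e5. The cycles-per-byte figures quoted above for the two variants are the poster's own measurements and are not reproduced by this sketch.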

Re: [PATCH] x86/floppy: use designated initializers

2016-12-17 Thread Ingo Molnar


* Kees Cook  wrote:

> Prepare to mark sensitive kernel structures for randomization by making
> sure they're using designated initializers. These were identified during
> allyesconfig builds of x86, arm, and arm64, with most initializer fixes
> extracted from grsecurity.
> 
> Signed-off-by: Kees Cook 
> ---
>  arch/x86/include/asm/floppy.h | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/include/asm/floppy.h b/arch/x86/include/asm/floppy.h
> index 1c7eefe32502..d0e4702883b9 100644
> --- a/arch/x86/include/asm/floppy.h
> +++ b/arch/x86/include/asm/floppy.h
> @@ -229,18 +229,18 @@ static struct fd_routine_l {
>   int (*_dma_setup)(char *addr, unsigned long size, int mode, int io);
>  } fd_routine[] = {
>   {
> - request_dma,
> - free_dma,
> - get_dma_residue,
> - dma_mem_alloc,
> - hard_dma_setup
> + ._request_dma = request_dma,
> + ._free_dma = free_dma,
> + ._get_dma_residue = get_dma_residue,
> + ._dma_mem_alloc = dma_mem_alloc,
> + ._dma_setup = hard_dma_setup
>   },
>   {
> - vdma_request_dma,
> - vdma_nop,
> - vdma_get_dma_residue,
> - vdma_mem_alloc,
> - vdma_dma_setup
> + ._request_dma = vdma_request_dma,
> + ._free_dma = vdma_nop,
> + ._get_dma_residue = vdma_get_dma_residue,
> + ._dma_mem_alloc = vdma_mem_alloc,
> + ._dma_setup = vdma_dma_setup

Please align the two columns vertically while at it.

Thanks,

Ingo
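
For readers unfamiliar with the style being requested, here is a small stand-alone C sketch of designated initializers with the `=` column aligned. The struct and helper names are hypothetical and only loosely modelled on `fd_routine_l`; this is not the floppy code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table of DMA helpers, loosely modelled on fd_routine_l. */
struct dma_ops {
	int  (*request_dma)(unsigned int chan);
	void (*free_dma)(unsigned int chan);
	int  (*get_residue)(unsigned int chan);
};

static int  real_request(unsigned int chan) { return (int)chan; }
static void real_free(unsigned int chan)    { (void)chan; }
static int  real_residue(unsigned int chan) { (void)chan; return 0; }

static int  virt_request(unsigned int chan) { (void)chan; return -1; }
static void virt_nop(unsigned int chan)     { (void)chan; }
static int  virt_residue(unsigned int chan) { (void)chan; return 42; }

static const struct dma_ops dma_ops_table[] = {
	{
		.request_dma = real_request,
		.free_dma    = real_free,
		.get_residue = real_residue,
	},
	{
		/* Order differs from the declaration on purpose: still correct. */
		.get_residue = virt_residue,
		.free_dma    = virt_nop,
		.request_dma = virt_request,
	},
};

/* Bounds-checked lookup into the table. */
static const struct dma_ops *dma_ops_get(size_t idx)
{
	assert(idx < sizeof(dma_ops_table) / sizeof(dma_ops_table[0]));
	return &dma_ops_table[idx];
}
```

Because each member is named, the table keeps working if the members of the struct are reordered, which is what structure layout randomization relies on. A positional initializer would silently bind the wrong function.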

Re: [PATCH 4/5] irda: irnet: Remove unused IRNET_MAJOR define

2016-12-17 Thread David Miller

From: Corentin Labbe 
Date: Thu, 15 Dec 2016 11:42:49 +0100

> The IRNET_MAJOR define is not used, so this patch removes it.
> 
> Signed-off-by: Corentin Labbe 

Applied.

Re: [PATCH 5/5] irda: irnet: add member name to the miscdevice declaration

2016-12-17 Thread David Miller

From: Corentin Labbe 
Date: Thu, 15 Dec 2016 11:42:50 +0100

> Since struct miscdevice has many members, it is dangerous to initialize
> it without member names, relying only on member order.
> 
> This patch adds member names to the init declaration.
> 
> Signed-off-by: Corentin Labbe 

Applied.

Re: [PATCH 1/5] irda: irproc.c: Remove unneeded linux/miscdevice.h include

2016-12-17 Thread David Miller

From: Corentin Labbe 
Date: Thu, 15 Dec 2016 11:42:46 +0100

> irproc.c does not use any miscdevice so this patch removes this
> unnecessary inclusion.
> 
> Signed-off-by: Corentin Labbe 

Applied.

Re: [PATCH 3/5] irnet: ppp: move IRNET_MINOR to include/linux/miscdevice.h

2016-12-17 Thread David Miller

From: Corentin Labbe 
Date: Thu, 15 Dec 2016 11:42:48 +0100

> This patch moves the define for IRNET_MINOR to include/linux/miscdevice.h
> It is better that all minor number definitions are in the same place.
> 
> Signed-off-by: Corentin Labbe 

Applied.

Re: [PATCH 2/5] irda: irnet: Move linux/miscdevice.h include

2016-12-17 Thread David Miller

From: Corentin Labbe 
Date: Thu, 15 Dec 2016 11:42:47 +0100

> The only use of miscdevice is irda_ppp so no need to include
> linux/miscdevice.h for all irda files.
> This patch moves the linux/miscdevice.h include to irnet_ppp.h
> 
> Signed-off-by: Corentin Labbe 

Applied.

[RFC][PATCH] spinlock_debug: report spinlock lockup from unlock

2016-12-17 Thread Sergey Senozhatsky

There is a race window between the point when __spin_lock_debug()
detects spinlock lockup and the time when CPU that caused the
lockup receives its backtrace interrupt.

Before __spin_lock_debug() triggers all_cpu_backtrace() it calls
spin_dump() to printk() the current state of the lock and CPU
backtrace. These printk() calls can take some time to print the
messages to serial console, for instance (we are not talking
about console_unlock() loop and a flood of messages from other
CPUs, but just spin_dump() printk() and serial console).

All those preparation steps can give CPU that caused the lockup
enough time to run away, so when it receives a backtrace interrupt
it can look completely innocent.

The patch extends `struct raw_spinlock' with additional variable
that stores jiffies of successful do_raw_spin_lock() and checks
in debug_spin_unlock() whether the spin_lock has been locked for
too long. So we will have a reliable backtrace from CPU that
locked up and a reliable backtrace from CPU that caused the
lockup.

Missed spin_lock unlock deadline report (example):

BUG: spinlock missed unlock deadline on CPU#0, bash/327
 lock: lock.25562+0x0/0x60, .magic: dead4ead, .owner: bash/327, .owner_cpu: 0
CPU: 0 PID: 327 Comm: bash
Call Trace:
 dump_stack+0x4f/0x65
 spin_dump+0x8a/0x8f
 spin_bug+0x2b/0x2d
 do_raw_spin_unlock+0x92/0xa3
 _raw_spin_unlock+0x27/0x44
 ...

Signed-off-by: Sergey Senozhatsky 
---
 include/linux/spinlock_types.h  | 4 +++-
 kernel/locking/spinlock_debug.c | 5 +
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/spinlock_types.h b/include/linux/spinlock_types.h
index 73548eb13a5d..8972e56eeefb 100644
--- a/include/linux/spinlock_types.h
+++ b/include/linux/spinlock_types.h
@@ -25,6 +25,7 @@ typedef struct raw_spinlock {
 #ifdef CONFIG_DEBUG_SPINLOCK
unsigned int magic, owner_cpu;
void *owner;
+   unsigned long acquire_tstamp;
 #endif
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map dep_map;
@@ -45,7 +46,8 @@ typedef struct raw_spinlock {
 # define SPIN_DEBUG_INIT(lockname) \
.magic = SPINLOCK_MAGIC,\
.owner_cpu = -1,\
-   .owner = SPINLOCK_OWNER_INIT,
+   .owner = SPINLOCK_OWNER_INIT,   \
+   .acquire_tstamp = 0,
 #else
 # define SPIN_DEBUG_INIT(lockname)
 #endif
diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
index 0374a596cffa..daeab4bc86ff 100644
--- a/kernel/locking/spinlock_debug.c
+++ b/kernel/locking/spinlock_debug.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include <linux/jiffies.h>
 
 void __raw_spin_lock_init(raw_spinlock_t *lock, const char *name,
  struct lock_class_key *key)
@@ -27,6 +28,7 @@ void __raw_spin_lock_init(raw_spinlock_t *lock, const char *name,
lock->magic = SPINLOCK_MAGIC;
lock->owner = SPINLOCK_OWNER_INIT;
lock->owner_cpu = -1;
+   lock->acquire_tstamp = 0;
 }
 
 EXPORT_SYMBOL(__raw_spin_lock_init);
@@ -90,6 +92,7 @@ static inline void debug_spin_lock_after(raw_spinlock_t *lock)
 {
lock->owner_cpu = raw_smp_processor_id();
lock->owner = current;
+   lock->acquire_tstamp = jiffies;
 }
 
 static inline void debug_spin_unlock(raw_spinlock_t *lock)
@@ -99,6 +102,8 @@ static inline void debug_spin_unlock(raw_spinlock_t *lock)
SPIN_BUG_ON(lock->owner != current, lock, "wrong owner");
SPIN_BUG_ON(lock->owner_cpu != raw_smp_processor_id(),
lock, "wrong CPU");
+   SPIN_BUG_ON(time_after_eq(jiffies, lock->acquire_tstamp + HZ),
+   lock, "missed unlock deadline");
lock->owner = SPINLOCK_OWNER_INIT;
lock->owner_cpu = -1;
 }
-- 
2.11.0
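
The deadline check in the patch relies on a wrap-safe comparison of tick counters. Here is a minimal user-space C sketch of that idea; `ticks_after_eq`, `DEMO_HZ` and `missed_unlock_deadline` are illustrative names, not kernel symbols. It follows the same signed-difference trick as the kernel's `time_after_eq()`, and like that macro it assumes the usual two's-complement conversion from unsigned to signed.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_HZ 100UL	/* hypothetical ticks per second */

/*
 * Wrap-safe "a is at or after b" for a free-running unsigned tick
 * counter: the difference is interpreted as a signed value, so the
 * result stays correct across a counter wraparound.
 */
static int ticks_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* True once a lock taken at `acquired` has been held for DEMO_HZ ticks. */
static int missed_unlock_deadline(unsigned long now, unsigned long acquired)
{
	return ticks_after_eq(now, acquired + DEMO_HZ);
}

/* A counter that has wrapped past zero still compares as "later". */
static void ticks_selfcheck(void)
{
	assert(ticks_after_eq(5, ULONG_MAX - 5));
	assert(!ticks_after_eq(ULONG_MAX - 5, 5));
}
```

A plain `now >= acquired + DEMO_HZ` would misfire whenever the addition or the counter itself wraps, which is why the signed-difference form is used.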

Re: [PATCH net 0/3] dpaa_eth: a couple of fixes

2016-12-17 Thread David Miller

From: Madalin Bucur 
Date: Thu, 15 Dec 2016 15:13:03 +0200

> This patch set introduces big endian accessors in the dpaa_eth driver
> making sure accesses to the QBMan HW are correct on little endian
> platforms. Removing a redundant Kconfig dependency on FSL_SOC.
> Adding myself as maintainer of the dpaa_eth driver.

Series applied, thanks.
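
For context, a big-endian accessor reads or writes a value in a fixed byte order regardless of the CPU's own order. The sketch below is a portable user-space illustration with made-up names; the driver itself would presumably use the kernel's `be32_to_cpu()` and `cpu_to_be32()` helpers rather than anything like this.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit big-endian value from memory, on any host byte order. */
static uint32_t be32_load(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a 32-bit value to memory in big-endian byte order. */
static void be32_store(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/*
 * Demo: store into a hypothetical 4-byte register image and return the
 * first byte in memory, which is the most significant byte.
 */
static uint8_t be32_first_byte(uint32_t v)
{
	uint8_t reg[4];

	be32_store(reg, v);
	assert(be32_load(reg) == v);
	return reg[0];
}
```

On a little-endian CPU a plain 32-bit store of 0x11223344 would put 0x44 first in memory; the accessor puts 0x11 first, which is what hardware expecting big-endian fields needs.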

Re: [PATCH v2 0/8] power: supply: tps65217: Support USB charger feature

2016-12-17 Thread Sebastian Reichel

Hi,

On Fri, Dec 09, 2016 at 04:48:58PM +0900, Milo Kim wrote:
> TPS65217 device supports two charger inputs - AC and USB.
> Currently, only AC charger is supported. This patch-set adds USB charger 
> feature. Tested on Beaglebone black.
> 
> Patch 1: Main patch
> Patch 2, 3: Clean up for charger driver data
> Patch 4 ~ 8: Naming changes for generic power supply class structure
> 
> v2:
>   Regenerate the patchset for better code review
> 
> Milo Kim (8):
>   power: supply: tps65217: Support USB charger interrupt
>   power: supply: tps65217: Use 'poll_task' on unloading the module

patches look fine, but these two patches must be reordered to fix
bisectability. Otherwise after patch 1 the thread is not properly
killed during driver removal.

-- Sebastian



Re: [PATCH v3 2/2] dt-bindings: power: add bindings for sbs-charger

2016-12-17 Thread Sebastian Reichel

Hi,

On Thu, Nov 24, 2016 at 01:33:43PM +0100, Nicolas Saenz Julienne wrote:
> Adds device tree documentation for SBS charger compliant devices as defined
> here: http://sbs-forum.org/specs/sbc110.pdf
> 
> Signed-off-by: Nicolas Saenz Julienne 
> ---
> v2 -> v3:
> - add part number as compatible
> 
>  .../bindings/power/supply/sbs_sbs-charger.txt  | 24 
> ++
>  1 file changed, 24 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/power/supply/sbs_sbs-charger.txt
> 
> diff --git 
> a/Documentation/devicetree/bindings/power/supply/sbs_sbs-charger.txt 
> b/Documentation/devicetree/bindings/power/supply/sbs_sbs-charger.txt
> new file mode 100644
> index 000..f6b6027
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/power/supply/sbs_sbs-charger.txt
> @@ -0,0 +1,24 @@
> +SBS sbs-charger
> +~~
> +
> +Required properties:
> + - compatible: should contain one of the following:
> + - "lltc,ltc4100"
> + - "sbs,sbs-charger"

That's not what I meant. The idea is to specify "lltc,ltc4100" with
"sbs,sbs-charger" as fallback. Then the driver for now only handles
"sbs,sbs-charger", but if any vendor registers need to be supported
we have a more specific compatible value in DT, that can be used to
identify the device.

> +Optional properties:
> +- interrupt-parent: Should be the phandle for the interrupt controller. Use 
> in
> +conjunction with "interrupts".
> +- interrupts: Interrupt mapping for GPIO IRQ. Use in conjunction with
> +"interrupt-parent". If an interrupt is not provided the driver will 
> switch
> +automatically to polling.
> +
> +Example:
> +
> + ltc4100@9 {
> + compatible = "sbs,sbs-charger";
> + reg = <0x9>;
> + interrupt-parent = <&gpio6>;
> + interrupts = <7 IRQ_TYPE_LEVEL_LOW>;
> + };

So the example would look like

compatible = "lltc,ltc4100", "sbs,sbs-charger";

-- Sebastian
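
The fallback scheme described above can be illustrated with a small user-space C sketch. It is a simplification under stated assumptions: `match_compatible` and the tables are hypothetical, "vendor,other-charger" is an invented string, and the kernel's real device-tree matching is more elaborate. The idea is that a node lists its compatible strings from most to least specific, and a driver binds on the first one it knows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Walk the node's compatible list from most to least specific and
 * return the index of the first entry the driver supports, or -1.
 */
static int match_compatible(const char *const *node_compat, size_t n_node,
			    const char *const *drv_compat, size_t n_drv)
{
	size_t i, j;

	assert(node_compat != NULL && drv_compat != NULL);
	for (i = 0; i < n_node; i++)
		for (j = 0; j < n_drv; j++)
			if (strcmp(node_compat[i], drv_compat[j]) == 0)
				return (int)i;
	return -1;
}

/* Node as suggested in the review: specific part first, generic fallback second. */
static const char *const ltc4100_node[] = { "lltc,ltc4100", "sbs,sbs-charger" };

/* Today's driver only knows the generic string. */
static const char *const generic_driver[] = { "sbs,sbs-charger" };

/* A later driver that also handles vendor registers (hypothetical). */
static const char *const quirk_driver[] = { "lltc,ltc4100", "sbs,sbs-charger" };

/* An unrelated driver; the string is invented for this example. */
static const char *const unrelated_driver[] = { "vendor,other-charger" };
```

With this layout the generic driver binds through the fallback entry today, and a later driver that needs vendor-specific handling can match the more specific string without any change to existing device trees.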



Re: [PATCH v3 0/2] power: supply: add sbs-charger driver

2016-12-17 Thread Sebastian Reichel

Hi,

On Tue, Dec 13, 2016 at 11:41:01AM +0100, Nicolas Saenz Julienne wrote:
> On 24/11/16 13:33, Nicolas Saenz Julienne wrote:
> > Hi,
> > 
> > This series adds support for all SBS compatible battery chargers, as defined
> > here: http://sbs-forum.org/specs/sbc110.pdf.
> > 
> > The first patch changes the sbs-battery device name in order to be able to
> > create a proper supplier/supplied relation between the two of them.
> > 
> > The second introduces the driver.
> > 
> > Regards,
> > Nicolas
> > 
> > changes since v2:
> > - updated driver and dt-binding with Sebastian's comments
> > 
> > changes since v1:
> > - added dt bindings
> > - updated driver with Sebastian's comments
> > - s/Nicola/Nicolas/ in commits
> > 
> > Nicolas Saenz Julienne (2):
> >   power: supply: add sbs-charger driver
> >   dt-bindings: power: add bindings for sbs-charger
> > 
> >  .../bindings/power/supply/sbs_sbs-charger.txt  |  24 ++
> >  drivers/power/supply/Kconfig   |   6 +
> >  drivers/power/supply/Makefile  |   1 +
> >  drivers/power/supply/sbs-charger.c | 275 
> > +
> >  4 files changed, 306 insertions(+)
> >  create mode 100644 
> > Documentation/devicetree/bindings/power/supply/sbs_sbs-charger.txt
> >  create mode 100644 drivers/power/supply/sbs-charger.c
> > 
> Hi,
> any update?

Sorry, I was busy.

-- Sebastian



Re: [PATCH v1 & v6 1/2] PM/devfreq: add suspend frequency support

2016-12-17 Thread Chanwoo Choi

2016-12-18 0:13 GMT+09:00 Tobias Jakobi :
> Hey guys,
>
> Chanwoo Choi wrote:
>> Hi Lin,
>>
>> 2016-11-24 18:54 GMT+09:00 Chanwoo Choi :
>>> Hi Lin,
>>>
>>> On 2016년 11월 24일 18:28, Chanwoo Choi wrote:
 Hi Lin,

 On 2016년 11월 24일 17:34, hl wrote:
> Hi Chanwoo Choi,
>
>
> On 2016年11月24日 16:16, Chanwoo Choi wrote:
>> Hi Lin,
>>
>> On 2016년 11월 24일 16:34, hl wrote:
>>> Hi Chanwoo Choi,
>>>
>>>  I think the dev_pm_opp_get_suspend_opp() have implement most of
>>> the funtion, all we need is just define the node in dts, like following:
>>>
>>> &dmc_opp_table {
>>>  opp06 {
>>>  opp-suspend;
>>>  };
>>> };
>> Two approaches use the 'opp-suspend' property.
>>
>> I think that the method to support suspend-opp have to
>> guarantee following conditions:
>> - Support the all of devfreq's governors.
> As MyungJoo Ham suggestion, i will set the suspend frequency in 
> devfreq_suspend_device(),
> which will ingore governor.

 Other approach already support the all of governors.
 Before calling the mail, I discussed with Myungjoo Ham.
 Myungjoo prefer to use the devfreq_suspend/devfreq_resume().
>>>
>>> It is not correct expression. We need to wait the reply from Myungjoo
>>> to clarify this.
>>>

 To Myungjoo,
 Please add your opinion how to support the suspend frequency.
>>>

>> - Devfreq framework have the responsibility to change the
>>frequency/voltage for suspend-opp. If we uses the
>>new devfreq_suspend(), each devfreq device don't care
>>how to support the suspend-opp. Just the developer of each
>>devfreq device need to add 'opp-suspend' propet to OPP entry in DT 
>> file.
> Why should support change the voltage in devfreq framework, i think it 
> shuold be handle in
> specific driver, i think the devfreq only handle it can get the right 
> frequency, then pass it to

 No, the frequency should be handled by governor or framework.
 The each devfreq device has no any responsibility of next 
 frequency/voltage.
 The governor and core of devfreq can decide the next frequency/voltage.
 You can refer to the cpufreq subsystem.

> specific driver, i think the voltage should handle in the 
> devfreq->profile->target();

 The call to devfreq->profile->target() has to be made by the devfreq
 framework.
 If a user wants to set the suspend frequency, the user can add the
 'opp-suspend' property.
 I think this way is easy.

 But,
 if each devfreq device wants to decide the next frequency/voltage only
 for the suspend state, we can check the cpufreq subsystem.

 If a specific devfreq device wants to handle the suspend frequency,
 each devfreq will add its own suspend/resume functions as follows:

   struct devfreq_dev_profile {
           int (*suspend)(struct devfreq *dev); // new function pointer
           int (*resume)(struct devfreq *dev);  // new function pointer
   } a_profile;

   a_profile.suspend = devfreq_generic_suspend;

   The devfreq framework will provide the devfreq_generic_suspend()
   function:

   int devfreq_generic_suspend(struct devfreq *dev) {
           ...
           dev->profile->target(..., dev->suspend_freq);
           ...
   }

   or

   a_profile.suspend = a_devfreq_suspend; // specific function of each
                                          // devfreq device

   devfreq_suspend() will call the 'devfreq->profile->suspend()'
   function instead of devfreq->profile->target().

   The devfreq core calls 'devfreq->profile->suspend()'
   to support the suspend frequency.

 Regards,
 Chanwoo Choi
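
The optional-callback scheme described above can be sketched in userspace C roughly as follows. This is only an illustration under assumptions: every `demo_` name is hypothetical and is not the devfreq API, and doing nothing when no suspend callback is set is an assumed behaviour, not something the thread specifies.

```c
#include <assert.h>
#include <stddef.h>

struct demo_devfreq;

/* Hypothetical, trimmed-down profile with the proposed optional callback. */
struct demo_profile {
	int (*target)(struct demo_devfreq *df, unsigned long *freq);
	int (*suspend)(struct demo_devfreq *df);	/* optional, may be NULL */
};

struct demo_devfreq {
	const struct demo_profile *profile;
	unsigned long suspend_freq;	/* would come from the 'opp-suspend' OPP */
	unsigned long cur_freq;
};

/* Generic helper: move the device to its suspend frequency via target(). */
static int demo_generic_suspend(struct demo_devfreq *df)
{
	unsigned long freq = df->suspend_freq;

	return df->profile->target(df, &freq);
}

/* Framework entry point: call the profile's suspend callback if present. */
static int demo_devfreq_suspend(struct demo_devfreq *df)
{
	assert(df != NULL && df->profile != NULL);
	if (!df->profile->suspend)
		return 0;	/* assumed: nothing to do without a callback */
	return df->profile->suspend(df);
}

/* Example target() that simply records the requested frequency. */
static int demo_target(struct demo_devfreq *df, unsigned long *freq)
{
	df->cur_freq = *freq;
	return 0;
}
```

A device that is happy with the generic behaviour would set `.suspend = demo_generic_suspend`; a device with special needs would point it at its own function instead.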
>>>
>>> The key difference between the two approaches:
>>>
>>> Your approach:
>>> - Each developer should add the 'opp-suspend' property to the dts file.
>>> - Each devfreq should call devfreq_suspend_device()
>>>   to support the suspend frequency.
>>>
>>>   If each devfreq doesn't call devfreq_suspend_device(), the devfreq
>>>   framework can support the suspend frequency.
>>>
>>> Other approach:
>>> - Each developer only needs to add the 'opp-suspend' property to the
>>>   dts file, without any additional behavior.
>>>
>>> In the cpufreq subsystem,
>>> to support the suspend frequency of cpufreq, we just add the
>>> 'opp-suspend' property, without any additional behavior.
>>
>> I'm missing the use-case of calling devfreq_suspend_device()
>> before entering suspend mode. We should consider the case where
>> a devfreq device
>> calls devfreq_suspend_device() directly. Because devfreq_suspend_device()
>> is an exported function, each devfreq device can call this function on the fly
>> withou

Re: [PATCH] block: loose check on sg gap

2016-12-17 Thread Jens Axboe

On 12/17/2016 03:49 AM, Ming Lei wrote:
> If the last bvec of the 1st bio and the 1st bvec of the next
> bio are physically contiguous, and the latter can be merged
> into the last segment of the 1st bio, we should consider that they
> don't violate the sg gap (or virt boundary) limit.
> 
> Both Vitaly and Dexuan reported that lots of unmergeable small bios
> are observed when running mkfs on Hyper-V virtual storage, and
> performance becomes quite low, so this patch was written to
> fix the performance issue.
> 
> The same issue should exist on NVMe too since it sets a virt boundary too.

It looks pretty reasonable to me. I'll queue it up for some testing,
changes like this always make me a little nervous.

-- 
Jens Axboe
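
For readers following along, the relaxed check described in the quoted patch can be modelled in userspace C roughly like this. It is a sketch, not the block-layer code: `struct seg`, both function names and the idea of working on raw physical addresses are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bio_vec: physical start address and length. */
struct seg {
	uint64_t phys;
	uint32_t len;
};

/* Strict check: any segment edge that is off the virt boundary is a gap. */
static bool gap_strict(const struct seg *prev, const struct seg *next,
		       uint64_t boundary_mask)
{
	assert(prev != NULL && next != NULL);
	return ((prev->phys + prev->len) & boundary_mask) ||
	       (next->phys & boundary_mask);
}

/* Relaxed check: physically contiguous segments never count as a gap. */
static bool gap_relaxed(const struct seg *prev, const struct seg *next,
			uint64_t boundary_mask)
{
	assert(prev != NULL && next != NULL);
	if (prev->phys + prev->len == next->phys)
		return false;
	return gap_strict(prev, next, boundary_mask);
}
```

With a 4 KiB boundary (mask 0xfff), two back-to-back 512-byte segments are rejected by the strict variant but accepted by the relaxed one, which is the small-bio case reported on Hyper-V.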

Re: [PATCH] net: use designated initializers

2016-12-17 Thread David Miller

From: Kees Cook 
Date: Fri, 16 Dec 2016 16:58:58 -0800

> Prepare to mark sensitive kernel structures for randomization by making
> sure they're using designated initializers. These were identified during
> allyesconfig builds of x86, arm, and arm64, with most initializer fixes
> extracted from grsecurity.
> 
> Signed-off-by: Kees Cook 

Applied, although "decnet: " would have been a much better
subsystem prefix.
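
As a small illustration of why the series converts these structures, here is a userspace sketch contrasting positional and designated initializers. `struct demo_ops` and its members are hypothetical and are not taken from any of the patched drivers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table, loosely modelled on kernel callback tables. */
struct demo_ops {
	int (*open)(void);
	int (*close)(void);
	const char *name;
};

static int demo_open(void)  { return 1; }
static int demo_close(void) { return 2; }

/* Positional: silently depends on the declaration order of the members. */
static const struct demo_ops positional = { demo_open, demo_close, "demo" };

/* Designated: each value is bound to a member name, so order is irrelevant. */
static const struct demo_ops designated = {
	.name  = "demo",
	.close = demo_close,
	.open  = demo_open,
};

/* Returns 1 when both tables dispatch to the same callbacks. */
static int same_callbacks(void)
{
	assert(positional.open != NULL && designated.open != NULL);
	return positional.open == designated.open &&
	       positional.close == designated.close;
}
```

If the member order of the structure were shuffled at build time, only the designated form would still bind each callback to the intended member.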

Re: [PATCH] isdn/gigaset: use designated initializers

2016-12-17 Thread David Miller

From: Kees Cook 
Date: Fri, 16 Dec 2016 16:58:06 -0800

> Prepare to mark sensitive kernel structures for randomization by making
> sure they're using designated initializers. These were identified during
> allyesconfig builds of x86, arm, and arm64, with most initializer fixes
> extracted from grsecurity.
> 
> Signed-off-by: Kees Cook 

Applied.

Re: [PATCH] ATM: use designated initializers

2016-12-17 Thread David Miller

From: Kees Cook 
Date: Fri, 16 Dec 2016 16:58:43 -0800

> Prepare to mark sensitive kernel structures for randomization by making
> sure they're using designated initializers. These were identified during
> allyesconfig builds of x86, arm, and arm64, with most initializer fixes
> extracted from grsecurity.
> 
> Signed-off-by: Kees Cook 

Applied.

Re: [PATCH] net/x25: use designated initializers

2016-12-17 Thread David Miller

From: Kees Cook 
Date: Fri, 16 Dec 2016 17:03:39 -0800

> Prepare to mark sensitive kernel structures for randomization by making
> sure they're using designated initializers. These were identified during
> allyesconfig builds of x86, arm, and arm64, with most initializer fixes
> extracted from grsecurity.
> 
> Signed-off-by: Kees Cook 

Applied.

Re: [PATCH] isdn: use designated initializers

2016-12-17 Thread David Miller

From: Kees Cook 
Date: Fri, 16 Dec 2016 17:01:42 -0800

> Prepare to mark sensitive kernel structures for randomization by making
> sure they're using designated initializers. These were identified during
> allyesconfig builds of x86, arm, and arm64, with most initializer fixes
> extracted from grsecurity.
> 
> Signed-off-by: Kees Cook 

Applied.

Re: [PATCH] WAN: use designated initializers

2016-12-17 Thread David Miller

From: Kees Cook 
Date: Fri, 16 Dec 2016 16:59:18 -0800

> Prepare to mark sensitive kernel structures for randomization by making
> sure they're using designated initializers. These were identified during
> allyesconfig builds of x86, arm, and arm64, with most initializer fixes
> extracted from grsecurity.
> 
> Signed-off-by: Kees Cook 

Applied.

Re: [PATCH] bna: use designated initializers

2016-12-17 Thread David Miller

From: Kees Cook 
Date: Fri, 16 Dec 2016 17:00:54 -0800

> Prepare to mark sensitive kernel structures for randomization by making
> sure they're using designated initializers. These were identified during
> allyesconfig builds of x86, arm, and arm64, with most initializer fixes
> extracted from grsecurity.
> 
> Signed-off-by: Kees Cook 

Applied.

Re: OOM: Better, but still there on

2016-12-17 Thread Nils Holland

On Sat, Dec 17, 2016 at 11:44:45PM +0900, Tetsuo Handa wrote:
> On 2016/12/17 21:59, Nils Holland wrote:
> > On Sat, Dec 17, 2016 at 01:02:03AM +0100, Michal Hocko wrote:
> >> mount -t tracefs none /debug/trace
> >> echo 1 > /debug/trace/events/vmscan/enable
> >> cat /debug/trace/trace_pipe > trace.log
> >>
> >> should help
> >> [...]
> > 
> > No problem! I enabled writing the trace data to a file and then tried
> > to trigger another OOM situation. That worked, this time without a
> > complete kernel panic, but with only my processes being killed and the
> > system becoming unresponsive.
> > [...]
> 
> In an OOM situation, writing to a file on disk is unlikely to work. Maybe
> logging via network ( "cat /debug/trace/trace_pipe > /dev/udp/$ip/$port"
> if you are using bash) works better. (I wish we could do it from the kernel
> so that /bin/cat is not disturbed by delays due to page faults.)
> 
> If you can configure netconsole for logging OOM killer messages and
> a UDP socket for logging trace_pipe messages, udplogger at
> https://osdn.net/projects/akari/scm/svn/tree/head/branches/udplogger/
> might be a good fit for logging both outputs with timestamps into a single file.

Thanks for the hint, sounds very sane! I'll try to go that route for
the next log / trace I produce. Of course, if Michal says that the
trace file I've already posted, and which has been logged to file, is
useless and would have been better if I had instead logged to a
different machine via the network, I could also repeat the current
experiment and produce a new file at any time. :-)

Greetings
Nils

Re: usb/core: warning in usb_create_ep_devs/sysfs_create_dir_ns

2016-12-17 Thread Andrey Konovalov

On Fri, Dec 16, 2016 at 7:01 PM, Alan Stern  wrote:
> On Mon, 12 Dec 2016, Andrey Konovalov wrote:
>
>> Hi!
>>
>> While running the syzkaller fuzzer I've got the following error report.
>>
>> On commit 3c49de52d5647cda8b42c4255cf8a29d1e22eff5 (Dev 2).
>>
>> WARNING: CPU: 2 PID: 865 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x8a/0xa0
>> gadgetfs: disconnected
>> sysfs: cannot create duplicate filename
>> '/devices/platform/dummy_hcd.0/usb2/2-1/2-1:64.0/ep_05'
>> Kernel panic - not syncing: panic_on_warn set ...
>>
>> CPU: 2 PID: 865 Comm: kworker/2:1 Not tainted 4.9.0-rc7+ #34
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> Workqueue: usb_hub_wq hub_event
>>  88006bee64c8 81f96b8a 0001 11000d7dcc2c
>>  ed000d7dcc24 0001 41b58ab3 8598b510
>>  81f968f8 850fee20 85cff020 dc00
>> Call Trace:
>>  [< inline >] __dump_stack lib/dump_stack.c:15
>>  [] dump_stack+0x292/0x398 lib/dump_stack.c:51
>>  [] panic+0x1cb/0x3a9 kernel/panic.c:179
>>  [] __warn+0x1c4/0x1e0 kernel/panic.c:542
>>  [] warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:565
>>  [] sysfs_warn_dup+0x8a/0xa0 fs/sysfs/dir.c:30
>>  [] sysfs_create_dir_ns+0x178/0x1d0 fs/sysfs/dir.c:59
>>  [< inline >] create_dir lib/kobject.c:71
>>  [] kobject_add_internal+0x227/0xa60 lib/kobject.c:229
>>  [< inline >] kobject_add_varg lib/kobject.c:366
>>  [] kobject_add+0x139/0x220 lib/kobject.c:411
>>  [] device_add+0x353/0x1660 drivers/base/core.c:1088
>>  [] device_register+0x1d/0x20 drivers/base/core.c:1206
>>  [] usb_create_ep_devs+0x163/0x260
>> drivers/usb/core/endpoint.c:195
>>  [] create_intf_ep_devs+0x13b/0x200
>> drivers/usb/core/message.c:1030
>>  [] usb_set_configuration+0x1083/0x18d0
>> drivers/usb/core/message.c:1937
>
> Hi, Andrey:
>
> Please check whether the patch below fixes this problem.

Hi Alan,

Been testing with your patch for the last day, haven't seen any more
reports or other issues.

Tested-by: Andrey Konovalov 

Thanks!

>
> Alan Stern
>
>
>
> Index: usb-4.x/drivers/usb/core/config.c
> ===
> --- usb-4.x.orig/drivers/usb/core/config.c
> +++ usb-4.x/drivers/usb/core/config.c
> @@ -234,6 +234,16 @@ static int usb_parse_endpoint(struct dev
> if (ifp->desc.bNumEndpoints >= num_ep)
> goto skip_to_next_endpoint_or_interface_descriptor;
>
> +   /* Check for duplicate endpoint addresses */
> +   for (i = 0; i < ifp->desc.bNumEndpoints; ++i) {
> +   if (ifp->endpoint[i].desc.bEndpointAddress ==
> +   d->bEndpointAddress) {
> +   dev_warn(ddev, "config %d interface %d altsetting %d has a duplicate endpoint with address 0x%X, skipping\n",
> +   cfgno, inum, asnum, d->bEndpointAddress);
> +   goto skip_to_next_endpoint_or_interface_descriptor;
> +   }
> +   }
> +
> endpoint = &ifp->endpoint[ifp->desc.bNumEndpoints];
> ++ifp->desc.bNumEndpoints;
>
>
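
The duplicate-address check in the quoted patch boils down to a linear scan over the endpoints already accepted for the interface. A userspace sketch of that idea, assuming a simplified interface structure (`struct demo_intf`, `MAX_EP` and `demo_add_endpoint` are hypothetical names, not USB core symbols):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_EP 30	/* assumed upper bound for this example */

/* Hypothetical, simplified stand-in for the parsed interface state. */
struct demo_intf {
	uint8_t ep_addr[MAX_EP];
	size_t num_ep;
};

/*
 * Add an endpoint address unless it is already present.
 * Returns 0 on success, -1 if the address is a duplicate or the table is full.
 */
static int demo_add_endpoint(struct demo_intf *intf, uint8_t addr)
{
	size_t i;

	assert(intf != NULL);
	if (intf->num_ep >= MAX_EP)
		return -1;
	for (i = 0; i < intf->num_ep; ++i) {
		if (intf->ep_addr[i] == addr)
			return -1;	/* duplicate: skip this descriptor */
	}
	intf->ep_addr[intf->num_ep++] = addr;
	return 0;
}
```

Rejecting the second descriptor at parse time means only one endpoint device per address is ever registered, so the duplicate sysfs name from the report cannot occur.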

Re: [PATCH 1/2] net: ethernet: sxgbe: remove private tx queue lock

2016-12-17 Thread Pavel Machek

On Thu 2016-12-15 23:33:22, Lino Sanfilippo wrote:
> On 15.12.2016 22:32, Lino Sanfilippo wrote:
> 
> > Ah ok. Then maybe priv->hw->dma->stop_tx() does not do the job correctly
> > (stop the tx path properly) and the HW is still active on the tx path
> > while the tx buffers are freed. OTOH stmmac_release() also stops the phy
> > before the tx (and rx) paths are stopped.
> > Did you try to stop the phy first in stmmac_tx_err_work(), too?
> > 
> > Regards,
> > Lino
> > 
> 
> And this is the "sledgehammer" approach: Do a complete shutdown and restart
> of the hardware in case of tx error (against net-next and only
> compile tested).

Wow, thanks a lot. I'll try to get the driver back to the non-working
state, and try it.

I believe I have some idea what is wrong there. (Missing memory barriers).

> +static void stmmac_tx_err_work(struct work_struct *work)
> +{
> + struct stmmac_priv *priv = container_of(work, struct stmmac_priv,
> + tx_err_work);
> + /* restart netdev */
> + rtnl_lock();
> + stmmac_release(priv->dev);
> + stmmac_open(priv->dev);
> + rtnl_unlock();
> +}

Won't this up/down the interface, in a way userspace can observe?

Best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



Re: [PATCH v4 2/3] perf tool: add PERF_RECORD_NAMESPACES to include namespaces related info

2016-12-17 Thread Jiri Olsa

On Fri, Dec 16, 2016 at 12:07:20AM +0530, Hari Bathini wrote:

SNIP

> +
> +int thread__set_namespaces(struct thread *thread, u64 timestamp,
> +struct namespaces_event *event)
> +{
> + struct namespaces *new, *curr = thread__namespaces(thread);
> +
> + new = namespaces__new(event);
> + if (!new)
> + return -ENOMEM;
> +
> + list_add(&new->list, &thread->namespaces_list);
> +
> + if (timestamp && curr) {
> + /*
> +  * setns syscall must have changed few or all the namespaces
> +  * of this thread. Update end time for the namespaces
> +  * previously used.
> +  */
> + curr = list_next_entry(new, list);
> + curr->end_time = timestamp;

hi,
couldn't you just use the curr you got from thread__namespaces?
why retrieve it again via the 'new' pointer?

thanks,
jirka
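
The point can be checked with a tiny userspace imitation of the kernel list. It is a sketch under two assumptions: that thread__namespaces() returns the first entry of the list, and that new entries are added at the head. The names below are hypothetical and only mimic the kernel helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace imitation of the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert 'entry' right after 'head', like the kernel's list_add(). */
static void list_add_head(struct list_head *entry, struct list_head *head)
{
	assert(entry != NULL && head != NULL);
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Hypothetical record; 'list' is first so a plain cast recovers the record. */
struct ns_record {
	struct list_head list;
	unsigned long long end_time;
};

/* First record on the list, or NULL when the list is empty. */
static struct ns_record *first_record(struct list_head *head)
{
	return head->next == head ? NULL : (struct ns_record *)head->next;
}
```

After a head insertion, the entry that follows the new record is exactly the record that was first before, so under these assumptions the second lookup yields the same pointer as the earlier curr.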

Documentation/unaligned-memory-access.txt: fix incorrect comparison operator

2016-12-17 Thread Cihangir Akturk

In the actual implementation, the ether_addr_equal() function tests for
equality to 0 when returning. It seems that commit 0d74c4 overlooked changing
this operator to reflect the actual function.

Signed-off-by: Cihangir Akturk 
---
 Documentation/unaligned-memory-access.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/unaligned-memory-access.txt b/Documentation/unaligned-memory-access.txt
index a445da0..3f76c0c 100644
--- a/Documentation/unaligned-memory-access.txt
+++ b/Documentation/unaligned-memory-access.txt
@@ -151,7 +151,7 @@ bool ether_addr_equal(const u8 *addr1, const u8 *addr2)
 #else
const u16 *a = (const u16 *)addr1;
const u16 *b = (const u16 *)addr2;
-   return ((a[0] ^ b[0]) | (a[1] ^ b[1]) | (a[2] ^ b[2])) != 0;
+   return ((a[0] ^ b[0]) | (a[1] ^ b[1]) | (a[2] ^ b[2])) == 0;
 #endif
 }
 
-- 
2.1.4
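
A userspace sketch of the corrected comparison, for anyone who wants to convince themselves of the operator. `demo_ether_addr_equal` is a hypothetical name; unlike the documented snippet it copies the bytes with memcpy instead of casting, so it is safe to run on any alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * XOR is zero for equal words, so the OR of all three XORs is zero
 * only when every 16-bit word of the two addresses matches.
 */
static bool demo_ether_addr_equal(const uint8_t *addr1, const uint8_t *addr2)
{
	uint16_t a[3], b[3];

	assert(addr1 != NULL && addr2 != NULL);
	/* memcpy sidesteps the unaligned 16-bit loads the document discusses. */
	memcpy(a, addr1, sizeof(a));
	memcpy(b, addr2, sizeof(b));
	return ((a[0] ^ b[0]) | (a[1] ^ b[1]) | (a[2] ^ b[2])) == 0;
}
```

With `!= 0` the function would report true for addresses that differ, which is the opposite of what its name promises.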

[RFC 0/1] New PCI Switch Management Driver

2016-12-17 Thread Logan Gunthorpe

Hi,

[Apologies: this is a resend for some people. Due to a configuration
error the original email was rejected by the mailing lists. I hope
this one makes it!]

We're looking to get some initial feedback on a new driver for
a line of PCIe switches produced and sold by Microsemi.
The goal is to get the process moving to get this code included in
upstream hopefully for 4.11. Facebook is currently gearing up to
use this hardware in its Open Compute Platform and is pushing to
have this driver in the upstream kernel.

The following patch briefly describes the hardware and provides
the first draft of driver code. Currently, the driver works and
has been tested but is not feature complete. Thus, we are not looking
to get it merged immediately. However we would like some early review,
specifically on the interfaces and core concepts so that we don't
do a lot of work down a path the community would reject. Barring any
objections to this RFC, we will flesh out all the features
and provide a completed patch for inclusion in the coming weeks.

Work on a userspace tool, that utilizes this driver, is also being
done at [1]. The tool is currently also a bit of a skeleton and
will be fleshed out assuming there are no serious objections to our
userspace interface. In the end, the tool will be released with a
GPL license.

The patch is based off of the v4.9 release.

Thanks for your review,

Logan

[1] https://github.com/sbates130272/switchtec-user

Logan Gunthorpe (1):
  MicroSemi Switchtec management interface driver

 Documentation/switchtec.txt|  54 +++
 MAINTAINERS|   9 +
 drivers/pci/Kconfig|   1 +
 drivers/pci/Makefile   |   1 +
 drivers/pci/switch/Kconfig |  13 +
 drivers/pci/switch/Makefile|   1 +
 drivers/pci/switch/switchtec.c | 824 +
 drivers/pci/switch/switchtec.h | 119 ++
 8 files changed, 1022 insertions(+)
 create mode 100644 Documentation/switchtec.txt
 create mode 100644 drivers/pci/switch/Kconfig
 create mode 100644 drivers/pci/switch/Makefile
 create mode 100644 drivers/pci/switch/switchtec.c
 create mode 100644 drivers/pci/switch/switchtec.h

--
2.1.4

[RFC 1/1] MicroSemi Switchtec management interface driver

2016-12-17 Thread Logan Gunthorpe

Microsemi's "Switchtec" line of PCI switch devices is already
supported by the kernel with standard PCI switch drivers. However, the
Switchtec device advertises a special management endpoint which
enables some additional functionality. This includes:

 * Packet and Byte Counters
 * Firmware Upgrades
 * Event and Error logs
 * Querying port link status
 * Custom user firmware commands

This patch introduces the switchtec kernel module, which provides
a PCI driver that exposes a char device. The char device provides
userspace access to this interface through read, write and (optionally)
poll calls. Currently no ioctls have been implemented but a couple
may be added in a later revision.

A short text file is provided which documents the switchtec driver
and outlines the semantics of using the char device.
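
As a rough illustration of the write format described in the documentation file of this patch (a 4-byte command followed by up to 1 KB of input data), a userspace helper could pack the buffer like this. It is a sketch only: the `demo_` names are hypothetical, and host byte order for the command is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MRPC_CMD_LEN   4		/* 4-byte command identifier */
#define DEMO_MRPC_MAX_DATA  1024	/* up to 1KB of command specific data */

/*
 * Pack a command id and its input data into 'buf' as one write() payload.
 * Returns the number of bytes to write, or 0 if the data does not fit.
 */
static size_t demo_mrpc_pack(uint8_t *buf, size_t buf_len, uint32_t cmd,
			     const void *data, size_t data_len)
{
	assert(buf != NULL);
	if (data_len > DEMO_MRPC_MAX_DATA ||
	    buf_len < DEMO_MRPC_CMD_LEN + data_len)
		return 0;
	memcpy(buf, &cmd, DEMO_MRPC_CMD_LEN);
	if (data_len)
		memcpy(buf + DEMO_MRPC_CMD_LEN, data, data_len);
	return DEMO_MRPC_CMD_LEN + data_len;
}
```

The returned length would be passed to a single write() on the char device, followed by exactly one read() of at least 4 bytes to collect the status and any output.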

A WIP userspace tool which utilizes this interface is available
at [1]. This tool takes inspiration (and borrows some code) from
nvme-cli [2].

[1] https://github.com/sbates130272/switchtec-user
[2] https://github.com/linux-nvme/nvme-cli

Signed-off-by: Logan Gunthorpe 
Signed-off-by: Stephen Bates 
---
 Documentation/switchtec.txt|  54 +++
 MAINTAINERS|   9 +
 drivers/pci/Kconfig|   1 +
 drivers/pci/Makefile   |   1 +
 drivers/pci/switch/Kconfig |  13 +
 drivers/pci/switch/Makefile|   1 +
 drivers/pci/switch/switchtec.c | 824 +
 drivers/pci/switch/switchtec.h | 119 ++
 8 files changed, 1022 insertions(+)
 create mode 100644 Documentation/switchtec.txt
 create mode 100644 drivers/pci/switch/Kconfig
 create mode 100644 drivers/pci/switch/Makefile
 create mode 100644 drivers/pci/switch/switchtec.c
 create mode 100644 drivers/pci/switch/switchtec.h

diff --git a/Documentation/switchtec.txt b/Documentation/switchtec.txt
new file mode 100644
index 000..04657ce
--- /dev/null
+++ b/Documentation/switchtec.txt
@@ -0,0 +1,54 @@
+
+Linux Switchtec Support
+
+
+Microsemi's "Switchtec" line of PCI switch devices is already
+supported by the kernel with standard PCI switch drivers. However, the
+Switchtec device advertises a special management endpoint which
+enables some additional functionality. This includes:
+
+ * Packet and Byte Counters
+ * Firmware Upgrades
+ * Event and Error logs
+ * Querying port link status
+ * Custom user firmware commands
+
+The switchtec kernel module implements this functionality.
+
+
+
+Interface
+=
+
+The primary means of communicating with the Switchtec management firmware is
+through the Memory-mapped Remote Procedure Call (MRPC) interface.
+Commands are submitted to the interface with a 4-byte command
+identifier and up to 1KB of command specific data. The firmware will
+respond with a 4-byte return code and up to 1KB of command specific
+data. The interface only processes a single command at a time.
+
+
+Userspace Interface
+===
+
+The MRPC interface will be exposed to userspace through a simple char
+device: /dev/switchtec#, one for each management endpoint in the system.
+
+The char device has the following semantics:
+
+ * A write must consist of at least 4 bytes and no more than 1028 bytes.
+   The first four bytes will be interpreted as the command to run and
+   the remainder will be used as the input data. A write will send the
+   command to the firmware to begin processing.
+
+ * Each write must be followed by exactly one read. Any double write will
+   produce an error and any read that doesn't follow a write will
+   produce an error.
+
+ * A read will block until the firmware completes the command and return
+   the four bytes of status plus up to 1024 bytes of output data. (The
+   length will be specified by the size parameter of the read call --
+   reading less than 4 bytes will produce an error.)
+
+ * The poll call will also be supported for userspace applications that
+   need to do other things while waiting for the command to complete.
diff --git a/MAINTAINERS b/MAINTAINERS
index 63cefa6..1e21505 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9288,6 +9288,15 @@ S:   Maintained
 F: Documentation/devicetree/bindings/pci/aardvark-pci.txt
 F: drivers/pci/host/pci-aardvark.c
 
+PCI DRIVER FOR MICROSEMI SWITCHTEC
+M: Kurt Schwemmer 
+M: Stephen Bates 
+M: Logan Gunthorpe 
+L: linux-...@vger.kernel.org
+S: Maintained
+F: Documentation/switchtec.txt
+F: drivers/pci/switch/switchtec*
+
 PCI DRIVER FOR NVIDIA TEGRA
 M: Thierry Reding 
 L: linux-te...@vger.kernel.org
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 6555eb7..f72e8c5 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -133,3 +133,4 @@ config PCI_HYPERV
 
 source "drivers/pci/hotplug/Kconfig"
 source "drivers/pci/host/Kconfig"
+source "drivers/pci/switch/Kconfig"
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 8db5079..15b46dd 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makef

What is the function of arch/x86/purgatory/purgatory.c?

2016-12-17 Thread Larry Finger

While checking the rtlwifi family of drivers using Sparse, I got the following 
warnings:


  CHECK   arch/x86/purgatory/purgatory.c
arch/x86/purgatory/purgatory.c:21:15: warning: symbol 'backup_dest' was not declared. Should it be static?
arch/x86/purgatory/purgatory.c:22:15: warning: symbol 'backup_src' was not declared. Should it be static?
arch/x86/purgatory/purgatory.c:23:15: warning: symbol 'backup_sz' was not declared. Should it be static?
arch/x86/purgatory/purgatory.c:25:4: warning: symbol 'sha256_digest' was not declared. Should it be static?
arch/x86/purgatory/purgatory.c:27:19: warning: symbol 'sha_regions' was not declared. Should it be static?
arch/x86/purgatory/purgatory.c:42:5: warning: symbol 'verify_sha256_digest' was not declared. Should it be static?
arch/x86/purgatory/purgatory.c:61:6: warning: symbol 'purgatory' was not declared. Should it be static?


Upon examination of the routine, I can see that if purgatory() should be
static, then none of the code here will ever be accessed by any part of the
kernel. Is there some bit of magic that is above my understanding, or is this
a useless bit of code that has been forgotten and should be removed?


If the former, then I think there should be declarations so that the clueless
like me are not confused.


Thanks,

Larry

Potential issues (security and otherwise) with the current cgroup-bpf API

2016-12-17 Thread Andy Lutomirski

Hi all-

I apologize for being rather late with this.  I didn't realize that
cgroup-bpf was going to be submitted for Linux 4.10, and I didn't see
it on the linux-api list, so I missed the discussion.

I think that the inet ingress, egress etc filters are a neat feature,
but I think the API has some issues that will bite us down the road
if it becomes stable in its current form.

Most of the problems I see are summarized in this transcript:

# mkdir cg2
# mount -t cgroup2 none cg2
# mkdir cg2/nosockets
# strace cgrp_socket_rule cg2/nosockets/ 0
...
open("cg2/nosockets/", O_RDONLY|O_DIRECTORY) = 3

 You can modify a cgroup after opening it O_RDONLY?

bpf(BPF_PROG_LOAD, {prog_type=0x9 /* BPF_PROG_TYPE_??? */, insn_cnt=2,
insns=0x7fffe3568c10, license="GPL", log_level=1, log_size=262144,
log_buf=0x6020c0, kern_version=0}, 48) = 4

 This is fine.  The bpf() syscall manipulates bpf objects.

bpf(0x8 /* BPF_??? */, 0x7fffe3568bf0, 48) = 0

 This is not so good:

 a) The bpf() syscall is supposed to manipulate bpf objects.  This
is manipulating a cgroup.  There's no reason that a socket creation
filter couldn't be written in a different language (new iptables
table?  Simple list of address families?), but if that happened,
then using bpf() to install it would be entirely nonsensical.

 b) This is starting to be an excessively ugly multiplexer.  Among
other things, it's very unfriendly to seccomp.

# echo $$ >cg2/nosockets/cgroup.procs
# ping 127.0.0.1
ping: socket: Operation not permitted
# ls cg2/nosockets/
cgroup.controllers  cgroup.events  cgroup.procs  cgroup.subtree_control
# cat cg2/nosockets/cgroup.controllers

 Something in cgroupfs should give an indication that this cgroup
 filters socket creation, but there's nothing there.  You should also
 be able to turn the filter off from cgroupfs.

# mkdir cg2/nosockets/sockets
# /home/luto/apps/linux/samples/bpf/cgrp_socket_rule cg2/nosockets/sockets/ 1

 This succeeded, which means that, if this feature is enabled in 4.10,
 then we're stuck with its semantics.  If it returned -EINVAL instead,
 there would be a chance to refine it.

# echo $$ >cg2/nosockets/sockets/cgroup.procs
# ping 127.0.0.1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.029 ms
^C
--- 127.0.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.029/0.029/0.029/0.000 ms

 Bash was inside a cgroup that disallowed socket creation, but socket
 creation wasn't disallowed.  This means that the obvious use of socket
 creation filters in nestable containers fails insecurely.
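
To make the difference concrete, here is a toy userspace model of the two possible semantics. It is not kernel code and does not use the bpf() API; `struct demo_cgroup` and both functions are hypothetical. The first function models the behaviour seen in the transcript, the second models what a nested container setup would probably expect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a cgroup with an optional socket-creation verdict attached. */
struct demo_cgroup {
	const struct demo_cgroup *parent;
	bool has_prog;	/* a filter is attached directly to this cgroup */
	bool allow;	/* verdict of that filter */
};

/* Observed behaviour: the nearest attached program alone decides. */
static bool demo_allow_override(const struct demo_cgroup *cg)
{
	assert(cg != NULL);
	for (; cg; cg = cg->parent)
		if (cg->has_prog)
			return cg->allow;
	return true;	/* no program anywhere: allowed */
}

/* Alternative: every program on the path to the root must allow. */
static bool demo_allow_nested(const struct demo_cgroup *cg)
{
	assert(cg != NULL);
	for (; cg; cg = cg->parent)
		if (cg->has_prog && !cg->allow)
			return false;
	return true;
}
```

For the nosockets/sockets hierarchy from the transcript, the override model allows socket creation in the child while the nested model still denies it.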


There's also a subtle but nasty potential security problem here.
In 4.9 and before, cgroups has only one real effect in the kernel:
resource control. A process in a malicious cgroup could be DoSed,
but that was about the extent of the damage that a malicious cgroup
could do.

In 4.10 with CONFIG_CGROUP_BPF=y, a cgroup can have bpf
programs attached that can do things if various events occur. (Right
now, this means socket operations, but there are plans in the works
to do this for LSM hooks too.) These bpf programs can say yes or no,
but they can also read out various data (including socket payloads!)
and save them away where an attacker can find them. This sounds a
lot like seccomp with a narrower scope but a much stronger ability to
exfiltrate private information.

Unfortunately, while seccomp is very, very careful to prevent
injection of a privileged victim into a malicious sandbox, the
CGROUP_BPF mechanism appears to have no real security model. There
is nothing to prevent a program that's in a malicious cgroup from
running a setuid binary, and there is nothing to prevent a program
that has the ability to move itself or another program into a
malicious cgroup from doing so and then, if needed for exploitation,
exec a setuid binary.

This isn't much of a problem yet because you currently need
CAP_NET_ADMIN to create a malicious sandbox in the first place.  I'm
sure that, in the near future, someone will want to make this stuff
work in containers with delegated cgroup hierarchies, and then there
may be a real problem here.


I've included a few security people on this thread.  The current API
looks abusable, and it would be nice to find all the holes before
4.10 comes out.


(The cgrp_socket_rule source is attached.  You can build it by sticking it
 in samples/bpf and doing:

 $ make headers_install
 $ cd samples/bpf
 $ gcc -o cgrp_socket_rule cgrp_socket_rule.c libbpf.c -I../../usr/include
)

--Andy
/* eBPF example program:
 *
 * - Loads eBPF program
 *
 *   The eBPF program sets the sk_bound_dev_if index in new AF_INET{6}
 *   sockets opened by processes in the cgroup.
 *
 * - Attaches the new program to a cgroup using BPF_PROG_ATTACH
 */

#define _GNU_SOURCE

#include 
#include 
#include 
#include 
#include 
#in

Re: [PATCH v1 & v6 1/2] PM/devfreq: add suspend frequency support

2016-12-17 Thread Tobias Jakobi

Hey Chanwoo,


Chanwoo Choi wrote:
> 2016-12-18 0:13 GMT+09:00 Tobias Jakobi :
>> Hey guys,
>>
>> Chanwoo Choi wrote:
>>> Hi Lin,
>>>
>>> 2016-11-24 18:54 GMT+09:00 Chanwoo Choi :
 Hi Lin,

 On 2016년 11월 24일 18:28, Chanwoo Choi wrote:
> Hi Lin,
>
> On 2016년 11월 24일 17:34, hl wrote:
>> Hi Chanwoo Choi,
>>
>>
>> On 2016年11月24日 16:16, Chanwoo Choi wrote:
>>> Hi Lin,
>>>
>>> On 2016년 11월 24일 16:34, hl wrote:
 Hi Chanwoo Choi,

  I think dev_pm_opp_get_suspend_opp() already implements most of
 the function; all we need is to define the node in the dts, like the
 following:

 &dmc_opp_table {
  opp06 {
  opp-suspend;
  };
 };
>>> Both approaches use the 'opp-suspend' property.
>>>
>>> I think that the method to support suspend-opp has to
>>> guarantee the following conditions:
>>> - Support all of devfreq's governors.
>> As MyungJoo Ham suggested, I will set the suspend frequency in
>> devfreq_suspend_device(),
>> which will ignore the governor.
>
> The other approach already supports all of the governors.
> Before sending the mail, I discussed it with Myungjoo Ham.
> Myungjoo prefers to use devfreq_suspend()/devfreq_resume().

 That was not the correct expression. We need to wait for the reply from
 Myungjoo to clarify this.
>
> To Myungjoo,
> Please add your opinion on how to support the suspend frequency.

>
>>> - The devfreq framework has the responsibility to change the
>>>   frequency/voltage for the suspend-opp. If we use the
>>>   new devfreq_suspend(), each devfreq device doesn't need to care
>>>   how to support the suspend-opp. The developer of each
>>>   devfreq device just needs to add the 'opp-suspend' property to
>>>   the OPP entry in the DT file.
>> Why should the devfreq framework change the voltage? I think it
>> should be handled in the specific driver. I think devfreq should only
>> make sure it gets the right frequency, and then pass it to the
>
> No, the frequency should be handled by the governor or the framework.
> Each devfreq device has no responsibility for the next frequency/voltage.
> The governor and the devfreq core decide the next frequency/voltage.
> You can refer to the cpufreq subsystem.
>
>> specific driver. I think the voltage should be handled in
>> devfreq->profile->target().
>
> The call to devfreq->profile->target() has to be made by the devfreq
> framework.
> If a user wants to set the suspend frequency, the user can add the
> 'opp-suspend' property.
> I think this way is easy.
>
> But,
> if each devfreq device wants to decide the next frequency/voltage only
> for the suspend state, we can check the cpufreq subsystem.
>
> If a specific devfreq device wants to handle the suspend frequency,
> each devfreq will add its own suspend/resume functions as follows:
>
>   struct devfreq_dev_profile {
>           int (*suspend)(struct devfreq *dev); // new function pointer
>           int (*resume)(struct devfreq *dev);  // new function pointer
>   } a_profile;
>
>   a_profile.suspend = devfreq_generic_suspend;
>
>   The devfreq framework will provide the devfreq_generic_suspend()
>   function:
>
>   int devfreq_generic_suspend(struct devfreq *dev) {
>           ...
>           dev->profile->target(..., dev->suspend_freq);
>           ...
>   }
>
>   or
>
>   a_profile.suspend = a_devfreq_suspend; // specific function of each
>                                          // devfreq device
>
>   devfreq_suspend() will call the 'devfreq->profile->suspend()'
>   function instead of devfreq->profile->target().
>
>   The devfreq core calls 'devfreq->profile->suspend()'
>   to support the suspend frequency.
>
> Regards,
> Chanwoo Choi

 The key difference between the two approaches:

 Your approach:
 - Each developer should add the 'opp-suspend' property to the dts file.
 - Each devfreq should call devfreq_suspend_device()
   to support the suspend frequency.

   If each devfreq doesn't call devfreq_suspend_device(), the devfreq
   framework can support the suspend frequency.

 Other approach:
 - Each developer only needs to add the 'opp-suspend' property to the
   dts file, without any additional behavior.

 In the cpufreq subsystem,
 to support the suspend frequency of cpufreq, we just add the
 'opp-suspend' property, without any additional behavior.
>>>
>>> I'm missing the use-case when using the devfreq_suspend_device()
>>> before entering the suspend mode. We should consider the

Re: [PATCH V1] i2c: xgene: Fix missing code of DTB support

2016-12-17 Thread Wolfram Sang

On Wed, Dec 14, 2016 at 02:17:26PM +0700, Tin Huynh wrote:
> In DTB case, i2c-core doesn't create slave device which is installed
> on i2c-xgene bus because of missing code in this driver.
> This patch fixes this issue.
> 
> Signed-off-by: Tin Huynh 

Applied to for-current, thanks!




Re: [PATCH V5] i2c: designware: fix wrong Tx/Rx FIFO for ACPI

2016-12-17 Thread Wolfram Sang

On Wed, Dec 14, 2016 at 04:23:58PM +0700, Tin Huynh wrote:
> ACPI always sets Tx/Rx FIFO to 32. This configuration will
> cause problem if the IP core supports a FIFO size of less than 32.
> The driver should read the FIFO size from the IP and select the smaller
> one of the two.
> 
> Signed-off-by: Tin Huynh 
> 

Applied to for-current, thanks!




[GIT PULL] ARM: exynos: Late mach/soc for v4.10

2016-12-17 Thread Krzysztof Kozlowski

Hi,


After our discussions about not breaking out-of-tree DTBs with the SCU
change in DeviceTree, I prepared an updated pull request without
the questioned changes.

Ten days ago I prepared a tag, pushed it... and apparently forgot to send the
pull request. At least, I don't have such an email in my outbox. Dunno.

So let's send it now, better late than never. With just a few commits (without
the DT SCU changes). These have been sitting in next for a very long time.


Best regards,
Krzysztof


The following changes since commit 1001354ca34179f3db924eb66672442a173147dc:

  Linux 4.9-rc1 (2016-10-15 12:17:50 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git tags/samsung-soc-4.10-2

for you to fetch changes up to da6b21e97e39d42f90ab490ce7b54a0fe2c3fe35:

  ARM: Drop fixed 200 Hz timer requirement from Samsung platforms (2016-12-07 18:42:11 +0200)


Samsung mach/soc update for v4.10:
1. Minor cleanup in smp_operations.
2. Another step in switching s3c24xx to new DMA API.
3. Drop fixed requirement for HZ=200 on Samsung platforms.


Krzysztof Kozlowski (1):
  ARM: Drop fixed 200 Hz timer requirement from Samsung platforms

Pankaj Dubey (1):
  ARM: EXYNOS: Remove smp_init_cpus hook from platsmp.c

Sylwester Nawrocki (1):
  ARM: S3C24XX: Add DMA slave maps for remaining s3c24xx SoCs

 arch/arm/Kconfig   |  3 +-
 arch/arm/mach-exynos/platsmp.c | 31 -
 arch/arm/mach-s3c24xx/common.c | 76 ++
 3 files changed, 77 insertions(+), 33 deletions(-)

Re: Potential issues (security and otherwise) with the current cgroup-bpf API

2016-12-17 Thread Mickaël Salaün


On 17/12/2016 19:18, Andy Lutomirski wrote:
> Hi all-
> 
> I apologize for being rather late with this.  I didn't realize that
> cgroup-bpf was going to be submitted for Linux 4.10, and I didn't see
> it on the linux-api list, so I missed the discussion.
> 
> I think that the inet ingress, egress etc filters are a neat feature,
> but I think the API has some issues that will bite us down the road
> if it becomes stable in its current form.
> 
> Most of the problems I see are summarized in this transcript:
> 
> # mkdir cg2
> # mount -t cgroup2 none cg2
> # mkdir cg2/nosockets
> # strace cgrp_socket_rule cg2/nosockets/ 0
> ...
> open("cg2/nosockets/", O_RDONLY|O_DIRECTORY) = 3
> 
>  You can modify a cgroup after opening it O_RDONLY?

I sent a patch to check the cgroup.procs permission before attaching a
BPF program to it [1], but it was not merged because it is not part of the
current security model (which may not be crystal clear). The thing is
that the current socket/BPF/cgroup feature is only available to a
process with the *global CAP_NET_ADMIN*, and such a process can already
modify the network for every process, so it doesn't make much sense to
check whether it can modify the network for a subset of these processes.

[1] https://lkml.org/lkml/2016/9/19/854

However, requiring a process to open a cgroup *directory* in write mode
may not make sense, because the process does not modify the content of
the cgroup but only uses it as a *reference* in the network stack.
Forcing an open in write mode may prevent the use of this kind of
network-filtering feature on a read-only file system, which does not
necessarily imply a read-only *network configuration*.

Another point of view is that CAP_NET_ADMIN may be an unneeded
privilege if the cgroup migration uses a no_new_privs-like feature,
as I proposed with Landlock [2] (with an extra ptrace_may_access() check).
The proposed new capability for cgroup may be interesting too.

[2] https://lkml.org/lkml/2016/9/14/82

> 
> bpf(BPF_PROG_LOAD, {prog_type=0x9 /* BPF_PROG_TYPE_??? */, insn_cnt=2,
> insns=0x7fffe3568c10, license="GPL", log_level=1, log_size=262144,
> log_buf=0x6020c0, kern_version=0}, 48) = 4
> 
>  This is fine.  The bpf() syscall manipulates bpf objects.
> 
> bpf(0x8 /* BPF_??? */, 0x7fffe3568bf0, 48) = 0
> 
>  This is not so good:
> 
>  a) The bpf() syscall is supposed to manipulate bpf objects.  This
> is manipulating a cgroup.  There's no reason that a socket creation
> filter couldn't be written in a different language (new iptables
> table?  Simple list of address families?), but if that happened,
> then using bpf() to install it would be entirely nonsensical.

Another point of view is to say that the BPF program (called by the
network stack) is using a reference to a set of processes thanks to a
cgroup.

> 
>  b) This is starting to be an excessively ugly multiplexer.  Among
> other things, it's very unfriendly to seccomp.

FWIW, Landlock will have the capability to filter this kind of action.

> 
> # echo $$ >cg2/nosockets/cgroup.procs
> # ping 127.0.0.1
> ping: socket: Operation not permitted
> # ls cg2/nosockets/
> cgroup.controllers  cgroup.events  cgroup.procs  cgroup.subtree_control
> # cat cg2/nosockets/cgroup.controllers
> 
>  Something in cgroupfs should give an indication that this cgroup
>  filters socket creation, but there's nothing there.  You should also
>  be able to turn the filter off from cgroupfs.

Right. Everybody was OK at LPC with adding such information, but it is not
there yet.

> 
> # mkdir cg2/nosockets/sockets
> # /home/luto/apps/linux/samples/bpf/cgrp_socket_rule cg2/nosockets/sockets/ 1
> 
>  This succeeded, which means that, if this feature is enabled in 4.10,
>  then we're stuck with its semantics.  If it returned -EINVAL instead,
>  there would be a chance to refine it.

This is indeed unfortunate.

> 
> # echo $$ >cg2/nosockets/sockets/cgroup.procs
> # ping 127.0.0.1
> PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
> 64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.029 ms
> ^C
> --- 127.0.0.1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.029/0.029/0.029/0.000 ms
> 
>  Bash was inside a cgroup that disallowed socket creation, but socket
>  creation wasn't disallowed.  This means that the obvious use of socket
>  creation filters in nestable containers fails insecurely.
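
The nesting behaviour shown in the transcript above can be modelled with a toy sketch. This is not kernel code: struct toy_cgroup and both functions are hypothetical, and the "every ancestor must allow" variant is just one possible alternative for comparison, not something proposed in this thread.

```c
#include <stddef.h>

/* Toy model only: NO_PROG means no filter is attached to this cgroup. */
enum verdict { NO_PROG = -1, DENY = 0, ALLOW = 1 };

struct toy_cgroup {
    const struct toy_cgroup *parent; /* NULL for the root */
    enum verdict prog;               /* result of the attached filter */
};

/* Behaviour seen in the transcript: the nearest cgroup with its own
 * program decides, so a child filter overrides its parent's filter. */
static enum verdict effective_override(const struct toy_cgroup *cg)
{
    for (; cg != NULL; cg = cg->parent)
        if (cg->prog != NO_PROG)
            return cg->prog;
    return ALLOW; /* no filter anywhere on the path */
}

/* Hypothetical alternative: a denial anywhere on the path to the root
 * wins, so a nested container cannot undo its parent's restriction. */
static enum verdict effective_all_must_allow(const struct toy_cgroup *cg)
{
    for (; cg != NULL; cg = cg->parent)
        if (cg->prog == DENY)
            return DENY;
    return ALLOW;
}
```

Modelling cg2/nosockets as DENY and cg2/nosockets/sockets as ALLOW, the first function allows socket creation in the child (what the ping test shows), while the second would deny it.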
> 
> 
> There's also a subtle but nasty potential security problem here.
> In 4.9 and before, cgroups has only one real effect in the kernel:
> resource control. A process in a malicious cgroup could be DoSed,
> but that was about the extent of the damage that a malicious cgroup
> could do.
> 
> In 4.10 with CONFIG_CGROUP_BPF=y, a cgroup can have bpf
> programs attached that can do things if various events occur. (Right
> now, this means socket operations, but there are plans in the works
> to do this f

[PATCH] staging: rtl8712: changed struct members to __le32

2016-12-17 Thread Jannik Becher

Fixed sparse warning "cast to restricted __le32".
The members of struct recv_stat and struct phy_stat are always little endian.

Signed-off-by: Jannik Becher 
---
 drivers/staging/rtl8712/rtl8712_recv.h | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/staging/rtl8712/rtl8712_recv.h b/drivers/staging/rtl8712/rtl8712_recv.h
index 0b0c273..0352e6f 100644
--- a/drivers/staging/rtl8712/rtl8712_recv.h
+++ b/drivers/staging/rtl8712/rtl8712_recv.h
@@ -50,12 +50,12 @@
 #define REORDER_WAIT_TIME  30 /* (ms)*/
 
 struct recv_stat {
-   unsigned int rxdw0;
-   unsigned int rxdw1;
-   unsigned int rxdw2;
-   unsigned int rxdw3;
-   unsigned int rxdw4;
-   unsigned int rxdw5;
+   __le32 rxdw0;
+   __le32 rxdw1;
+   __le32 rxdw2;
+   __le32 rxdw3;
+   __le32 rxdw4;
+   __le32 rxdw5;
 };
 
 struct phy_cck_rx_status {
@@ -69,14 +69,14 @@ struct phy_cck_rx_status {
 };
 
 struct phy_stat {
-   unsigned int phydw0;
-   unsigned int phydw1;
-   unsigned int phydw2;
-   unsigned int phydw3;
-   unsigned int phydw4;
-   unsigned int phydw5;
-   unsigned int phydw6;
-   unsigned int phydw7;
+   __le32 phydw0;
+   __le32 phydw1;
+   __le32 phydw2;
+   __le32 phydw3;
+   __le32 phydw4;
+   __le32 phydw5;
+   __le32 phydw6;
+   __le32 phydw7;
 };
 #define PHY_STAT_GAIN_TRSW_SHT 0
 #define PHY_STAT_PWDB_ALL_SHT 4
-- 
2.7.4
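
As background for the annotation above: a field marked __le32 holds bytes in little-endian order and has to be converted before the CPU uses it as a number (in the kernel, with le32_to_cpu()). The sketch below is a userspace stand-in for that conversion, not driver code; le32_decode() is a hypothetical name.

```c
#include <stdint.h>

/*
 * Userspace illustration of what le32_to_cpu() achieves: build the
 * value from its bytes, least significant byte first. The result is
 * the same on little-endian and big-endian hosts.
 */
static uint32_t le32_decode(const uint8_t bytes[4])
{
    return (uint32_t)bytes[0] |
           ((uint32_t)bytes[1] << 8) |
           ((uint32_t)bytes[2] << 16) |
           ((uint32_t)bytes[3] << 24);
}
```

Once the members are declared __le32, sparse can flag any place where such a field is used as a plain integer without going through a conversion.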

Re: What is the function of arch/x86/purgatory/purgatory.c?

2016-12-17 Thread Al Viro

On Sat, Dec 17, 2016 at 11:52:05AM -0600, Larry Finger wrote:

> Upon examination of the routine, I can see that if purgatory() should be
> static, then none of the code here will ever be accessed by any part of the
> kernel. Is there some bit of magic that is above my understanding, or is
> this a useless bit of code that has been forgotten and should be removed?

I don't know what is and what is not above your understanding, but grepping
in that area (grep -w purgatory arch/x86/purgatory/*) does catch this:
arch/x86/purgatory/setup-x86_64.S:  call purgatory
which is hardly magic - looks like a function call.  Looking into that
file shows
purgatory_start:
        .code64

        /* Load a gdt so I know what the segment registers are */
        lgdt    gdt(%rip)

        /* load the data segments */
        movl    $0x18, %eax     /* data segment */
        movl    %eax, %ds
        movl    %eax, %es
        movl    %eax, %ss
        movl    %eax, %fs
        movl    %eax, %gs

        /* Setup a stack */
        leaq    lstack_end(%rip), %rsp

        /* Call the C code */
        call purgatory
        jmp entry64

which pretty much confirms that - it's called from purgatory_start().

probably serious conntrack/netfilter panic, 4.8.14, timers and intel turbo

2016-12-17 Thread Denys Fedoryshchenko


Hi,

I recently posted several netfilter-related crashes and didn't get any
answers. One of them started to happen quite often on a loaded NAT box
(17 Gbps), so after trying endless ways to make it stable, I found that
timers often show up in the backtrace. The bug probably also appears on
older releases: I've seen the same kind of backtrace, with a timer fired
for conntrack, on them.

I disabled Intel turbo for the CPUs on this loaded NAT box, and voila, the
panic has not reappeared for the second day!

* by wrmsr -a 0x1a0 0x4000850089

I am not sure timers are the reason, but turbo is probably creating some
condition for the bug.
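
For readers decoding the wrmsr value: MSR 0x1a0 is IA32_MISC_ENABLE, and bit 38 of that register is the turbo-disable bit. The sketch below only checks that bit in a given value; the macro and function names are illustrative, and it does not touch any MSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names; bit 38 of IA32_MISC_ENABLE disables turbo. */
#define MISC_ENABLE_MSR           0x1a0u
#define MISC_ENABLE_TURBO_DISABLE (1ULL << 38)

/* Returns true if the given IA32_MISC_ENABLE value has turbo disabled. */
static bool turbo_disabled(uint64_t misc_enable)
{
    return (misc_enable & MISC_ENABLE_TURBO_DISABLE) != 0;
}
```

The value written above, 0x4000850089, has bit 38 set; the same value without that bit would be 0x850089.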




Here are example backtraces from the last reboots (kernel 4.8.14); the same
kernel worked perfectly without turbo.
The last one is a crash on 4.8.0 that looks painfully similar, on a
totally different workload but with conntrack enabled. It happens much
less often there, so it is harder to reproduce and to test by disabling turbo.

[28904.162607] BUG: unable to handle kernel NULL pointer dereference at 0008
[28904.163210] IP: [] nf_ct_add_to_dying_list+0x55/0x61 [nf_conntrack]
[28904.163745] PGD 0

[28904.164058] Oops: 0002 [#1] SMP
[28904.164323] Modules linked in: nf_nat_pptp nf_nat_proto_gre xt_TCPMSS xt_connmark ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat xt_rateest xt_RATEEST nf_conntrack_pptp nf_conntrack_proto_gre xt_CT xt_set xt_hl xt_tcpudp ip_set_hash_net ip_set nfnetlink iptable_raw iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables netconsole configfs 8021q garp mrp stp llc bonding ixgbe dca

[28904.168132] CPU: 27 PID: 0 Comm: swapper/27 Not tainted 4.8.14-build-0124 #2
[28904.168398] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.1008.031920151331 03/19/2015

[28904.168853] task: 885fa42e8c40 task.stack: 885fa42f
[28904.169114] RIP: 0010:[] [] nf_ct_add_to_dying_list+0x55/0x61 [nf_conntrack]
[28904.169643] RSP: 0018:885fbccc3dd8 EFLAGS: 00010246
[28904.169901] RAX:  RBX: 885fbccc RCX: 885fbccc0010
[28904.170169] RDX: 885f87a1c150 RSI: 0142 RDI: 885fbccc
[28904.170437] RBP: 885fbccc3de8 R08: cbdee177 R09: 0100
[28904.170704] R10: 885fbccc3dd0 R11: 820050c0 R12: 885f87a1c140
[28904.170971] R13: 0005d948 R14: 000ea942 R15: 885f87a1c160
[28904.171237] FS: () GS:885fbccc() knlGS:

[28904.171688] CS: 0010 DS:  ES:  CR0: 80050033
[28904.171964] CR2: 0008 CR3: 00607f006000 CR4: 001406e0

[28904.172231] Stack:
[28904.172482] 885f87a1c140 820a1405 885fbccc3e28 a00abb30
[28904.173182] 0002820a1405 885f87a1c140 885f99a28201
[28904.173884] 820050c8 885fbccc3e58 a00abc62

[28904.174585] Call Trace:
[28904.174835] 
[28904.174912] [] nf_ct_delete_from_lists+0xc9/0xf2 [nf_conntrack]
[28904.175613] [] nf_ct_delete+0x109/0x12c [nf_conntrack]
[28904.175894] [] ? nf_ct_delete+0x12c/0x12c [nf_conntrack]
[28904.176169] [] death_by_timeout+0xd/0xf [nf_conntrack]
[28904.176443] [] call_timer_fn.isra.5+0x17/0x6b
[28904.176714] [] expire_timers+0x6f/0x7e
[28904.176975] [] run_timer_softirq+0x69/0x8b
[28904.177238] [] ? clockevents_program_event+0xd0/0xe8
[28904.177504] [] __do_softirq+0xbd/0x1aa
[28904.177765] [] irq_exit+0x37/0x7c
[28904.178026] [] smp_trace_apic_timer_interrupt+0x7b/0x88
[28904.178300] [] smp_apic_timer_interrupt+0x9/0xb
[28904.178565] [] apic_timer_interrupt+0x7c/0x90
[28904.178835] 
[28904.178907] [] ? mwait_idle+0x64/0x7a
[28904.179436] [] ? atomic_notifier_call_chain+0x13/0x15
[28904.179712] [] arch_cpu_idle+0xa/0xc
[28904.179976] [] default_idle_call+0x27/0x29
[28904.180244] [] cpu_startup_entry+0x11d/0x1c7
[28904.180508] [] start_secondary+0xe8/0xeb
[28904.180767] Code: 80 2f 0b 82 48 89 df e8 da 90 84 e1 48 8b 43 10 49 8d 54 24 10 48 8d 4b 10 49 89 4c 24 18 a8 01 49 89 44 24 10 48 89 53 10 75 04
 89 50 08 c6 03 00 5b 41 5c 5d c3 48 8b 05 10 be 00 00 89 f6

[28904.185546] RIP [] nf_ct_add_to_dying_list+0x55/0x61 [nf_conntrack]
[28904.186065] RSP 
[28904.186319] CR2: 0008
[28904.186593] ---[ end trace 35cbc6c885a5c2d8 ]---
[28904.186860] Kernel panic - not syncing: Fatal exception in interrupt
[28904.187155] Kernel Offset: disabled
[28904.187419] Rebooting in 5 seconds..

[28909.193662] ACPI MEMORY or I/O RESET_REG.



[14125.227611] BUG: unable to handle kernel NULL pointer dereference at (null)
[14125.228215] IP: [] nf_nat_setup_info+0x6d8/0x755 [nf_nat]
[14125.228564] PGD 0

[14125.228882] Oops:  [#1] SMP
[14125.229146] Modules linked in: nf_nat_pptp nf_nat_proto_gre xt_TCPMSS xt_connmark ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat xt_rateest xt_RATEEST nf_conntrack_pptp nf_conntrack_proto_gre xt_CT xt_set xt_hl xt_tcpudp ip_set_hash_net ip_set nfnetlink iptable_raw ipt

Re: What is the function of arch/x86/purgatory/purgatory.c?

2016-12-17 Thread Larry Finger


On 12/17/2016 01:46 PM, Al Viro wrote:

> On Sat, Dec 17, 2016 at 11:52:05AM -0600, Larry Finger wrote:
>
>> Upon examination of the routine, I can see that if purgatory() should be
>> static, then none of the code here will ever be accessed by any part of the
>> kernel. Is there some bit of magic that is above my understanding, or is
>> this a useless bit of code that has been forgotten and should be removed?
>
> I don't know what is and what is not above your understanding, but grepping
> in that area (grep -w purgatory arch/x86/purgatory/*) does catch this:
> arch/x86/purgatory/setup-x86_64.S:  call purgatory
> which is hardly magic - looks like a function call.  Looking into that
> file shows
> purgatory_start:
>         .code64
>
>         /* Load a gdt so I know what the segment registers are */
>         lgdt    gdt(%rip)
>
>         /* load the data segments */
>         movl    $0x18, %eax     /* data segment */
>         movl    %eax, %ds
>         movl    %eax, %es
>         movl    %eax, %ss
>         movl    %eax, %fs
>         movl    %eax, %gs
>
>         /* Setup a stack */
>         leaq    lstack_end(%rip), %rsp
>
>         /* Call the C code */
>         call purgatory
>         jmp entry64
>
> which pretty much confirms that - it's called from purgatory_start().


Thanks for the explanation.

Larry
