date:20190216

Re: [PATCH v3 1/4] dt-bindings: input: touchscreen: goodix: Document AVDD28-supply property

2019-02-16 Thread Jagan Teki

Hi Dmitry and Rob,

On Thu, Feb 14, 2019 at 3:21 AM Rob Herring  wrote:
>
> On Tue, Jan 22, 2019 at 7:39 AM Jagan Teki  wrote:
> >
> > On Mon, Jan 21, 2019 at 9:46 PM Rob Herring  wrote:
> > >
> > > > On Fri, Jan 18, 2019 at 10:01 AM Jagan Teki wrote:
> > > >
> > > > On Wed, Jan 9, 2019 at 7:08 PM Rob Herring  wrote:
> > > > >
> > > > > Please CC DT list if you want bindings reviewed.
> > > >
> > > > Sorry I forgot.
> > > >
> > > > >
> > > > > > On Wed, Jan 9, 2019 at 1:40 AM Dmitry Torokhov wrote:
> > > > > >
> > > > > > On Sat, Dec 15, 2018 at 08:47:59PM +0530, Jagan Teki wrote:
> > > > > > > > Most Goodix CTP controllers are supplied through the AVDD28 pin,
> > > > > > > > which needs to be powered for controllers like the GT5663 on some
> > > > > > > > boards.
> > > > > > > >
> > > > > > > > So, document the supply property so that boards using the
> > > > > > > > GT5663 can enable it via the device tree.
> > > > > > >
> > > > > > > Signed-off-by: Jagan Teki 
> > > > > > > ---
> > > > > > >  Documentation/devicetree/bindings/input/touchscreen/goodix.txt | 
> > > > > > > 1 +
> > > > > > >  1 file changed, 1 insertion(+)
> > > > > > >
> > > > > > > diff --git 
> > > > > > > a/Documentation/devicetree/bindings/input/touchscreen/goodix.txt 
> > > > > > > b/Documentation/devicetree/bindings/input/touchscreen/goodix.txt
> > > > > > > index f7e95c52f3c7..c4622c983e08 100644
> > > > > > > --- 
> > > > > > > a/Documentation/devicetree/bindings/input/touchscreen/goodix.txt
> > > > > > > +++ 
> > > > > > > b/Documentation/devicetree/bindings/input/touchscreen/goodix.txt
> > > > > > > @@ -23,6 +23,7 @@ Optional properties:
> > > > > > >   - touchscreen-inverted-y  : Y axis is inverted (boolean)
> > > > > > >   - touchscreen-swapped-x-y : X and Y axis are swapped (boolean)
> > > > > > >   (swapping is done after inverting 
> > > > > > > the axis)
> > > > > > > + - AVDD28-supply : Analog power supply regulator on AVDD28 
> > > > > > > pin
> > > > > >
> > > > > > > I think we normally use lower case in DT bindings and rarely encode
> > > > > > > voltage in the supply name unless we are dealing with several supplies
> > > > > > > of the same kind, but I'll let Rob comment on this.
> > > > >
> > > > > Yes on lowercase though there are some exceptions.
> > > > >
> > > > > > There's also an AVDD22 supply as well as DVDD12 and VDDIO. So we
> > > > > > probably need to keep the voltage, but the binding is incomplete.
> > > >
> > > > > What is incomplete here? Can you please elaborate?
> > >
> > > > You are missing the 3 other supplies the chip has: AVDD22, DVDD12 and VDDIO.
> >
> > > Though it has other supplies, only AVDD28 is connected in the host
> > > interface design, as in section 9, "Reference Schematic", on page 23 of
> > > the attached manual.
>
> That is for a particular board design. It still has other supplies.
> Just make the binding complete please. You can make them optional. I
> don't care if the driver supports controlling all the supplies or not
> (though Dmitry might).

So, can I do a bulk get and enable of all 4 regulators, either globally in
the driver or specific to that chip? Either way, the regulators which
are not used would be handled via dummy regulators (which we all know).

Or should the regulators which are not in use be obtained via
devm_regulator_get_optional?

Any suggestions?

Re: Reply: [v6] coccinelle: semantic code search for missing put_device()

2019-02-16 Thread Markus Elfring

> But please also refer to the examples of coccinelle, such as:
> http://coccinelle.lip6.fr/rules/kmalloc.html
> and
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/coccinelle/free/pci_free_consistent.cocci

These scripts for the semantic patch language show some software design
possibilities.
They contain implementation details which can also be worth considering
for further development.
Will systematic refactoring become more interesting?


> You will find that there are differences between coccinelle and c.

Would you like to discuss any of them further?

Regards,
Markus

[git pull] Input updates for v5.0-rc6

2019-02-16 Thread Dmitry Torokhov

Hi Linus,

Please pull from:

git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input.git for-linus

to receive updates for the input subsystem:

- tweaks to Elan drivers (both PS/2 and I2C) to support new devices.
  Also revert of one of IDs as that device should really be driven by
  i2c-hid + hid-multitouch

- a few drivers have been switched to the brightness_set_blocking() callback
  because they were either sleeping in their brightness_set()
  implementation or used a workqueue but were not canceling it on unbind.

- ps2-gpio and matrix_keypad needed to [properly] flush their works to
  avoid potential use-after-free on unbind.

- other miscellaneous fixes.

Changelog:
---------

Dmitry Torokhov (6):
  Input: cap11xx - switch to using set_brightness_blocking()
  Input: ps2-gpio - flush TX work when closing port
  Input: matrix_keypad - use flush_delayed_work()
  Input: qt2160 - switch to using brightness_set_blocking()
  Revert "Input: elan_i2c - add ACPI ID for touchpad in ASUS Aspire F5-573G"
  Input: apanel - switch to using brightness_set_blocking()

Gabriel Fernandez (1):
  Input: st-keyscan - fix potential zalloc NULL dereference

Jonathan Bakker (2):
  Input: pwm-vibra - prevent unbalanced regulator
  Input: bma150 - register input device after setting private data

Matti Kurkela (1):
  Input: elantech - enable 3rd button support on Fujitsu CELSIUS H780

Mauro Ciancio (1):
  Input: elan_i2c - add ACPI ID for touchpad in Lenovo V330-15ISK

Paweł Chmiel (1):
  Input: pwm-vibra - stop regulator after disabling pwm, not before

Stefan Agner (1):
  Input: snvs_pwrkey - allow selecting driver for i.MX 7D

Diffstat:


 drivers/input/keyboard/Kconfig |  2 +-
 drivers/input/keyboard/cap11xx.c   | 35 ++---
 drivers/input/keyboard/matrix_keypad.c |  2 +-
 drivers/input/keyboard/qt2160.c| 69 +-
 drivers/input/keyboard/st-keyscan.c|  4 +-
 drivers/input/misc/apanel.c| 24 ++--
 drivers/input/misc/bma150.c|  9 +++--
 drivers/input/misc/pwm-vibra.c | 19 +++---
 drivers/input/mouse/elan_i2c_core.c|  2 +-
 drivers/input/mouse/elantech.c |  9 +
 drivers/input/serio/ps2-gpio.c |  1 +
 11 files changed, 75 insertions(+), 101 deletions(-)

Thanks.


-- 
Dmitry

Re: [PATCH] trace: skip hwasan

2019-02-16 Thread Dmitry Vyukov

On Sun, Feb 17, 2019 at 5:34 AM Qian Cai  wrote:
>
> Enabling the function tracer with CONFIG_KASAN_SW_TAGS=y (hwasan)
> causes the whole system to freeze on ThunderX2 systems with 256 CPUs,
> because there is a burst of too many pointer accesses, and then KASAN will
> dereference each byte of the shadow address for the tag checking, which
> will kill all the CPUs.

Hi Qian,

Could you please elaborate on what exactly happens and what kills the
CPUs, and why? The number of memory accesses should not make any difference.
With hardware support (MTE) it won't be possible to disable
instrumentation (loads and stores check tags themselves), so it would
be useful to keep track of the exact reasons we disable instrumentation, to
know how to deal with them once we have hardware support.
It would be useful to keep this info in the comment in the Makefile.

Thanks

> Signed-off-by: Qian Cai 
> ---
>  kernel/trace/Makefile | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
> index c2b2148bb1d2..fdd547a68385 100644
> --- a/kernel/trace/Makefile
> +++ b/kernel/trace/Makefile
> @@ -28,6 +28,11 @@ ifdef CONFIG_GCOV_PROFILE_FTRACE
>  GCOV_PROFILE := y
>  endif
>
> +# Too much pointer access will kill hwasan.
> +ifdef CONFIG_KASAN_SW_TAGS
> +KASAN_SANITIZE := n
> +endif
> +
>  CFLAGS_trace_benchmark.o := -I$(src)
>  CFLAGS_trace_events_filter.o := -I$(src)
>
> --
> 2.17.2 (Apple Git-113)

[PATCH] usb: core: add option of only authorizing internal devices

2019-02-16 Thread Dmitry Torokhov

On Chrome OS we want to use USBguard to potentially limit access to USB
devices based on policy. However, we do not want to wait for userspace to
come up before initializing fixed USB devices, so as not to regress our boot
times.

This patch adds an option to instruct the kernel to only authorize devices
connected to the internal ports. Previously we could either authorize
all or none (or, by default, we'd only authorize wired devices).

The behavior is controlled via usbcore.authorized_default command line
option.
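
As a rough user-space illustration of the policy values described above (not the kernel code itself), the sketch below models values 0, 1 and 2 and the clamping that the patch applies when a value is written through sysfs. The enum, the function names and the `on_internal_port` flag are all hypothetical, and the legacy default of -1 (authorize everything except wireless USB) is deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative names only; they mirror the documented meanings of
 * authorized_default = 0, 1 and 2. */
enum authorize_policy {
	AUTHORIZE_NONE = 0,
	AUTHORIZE_ALL = 1,
	AUTHORIZE_INTERNAL = 2,
};

/* Same shape as the clamp in the patch's store path: anything above 2
 * falls back to "authorize all". */
static inline enum authorize_policy clamp_policy(unsigned int val)
{
	return val <= AUTHORIZE_INTERNAL ? (enum authorize_policy)val
					 : AUTHORIZE_ALL;
}

/* Assumed decision for a newly connected device under each policy. */
static inline bool authorized_by_default(enum authorize_policy policy,
					 bool on_internal_port)
{
	switch (policy) {
	case AUTHORIZE_NONE:
		return false;
	case AUTHORIZE_ALL:
		return true;
	case AUTHORIZE_INTERNAL:
		return on_internal_port;
	}
	return false;
}

/* Small self-check that doubles as a usage example. */
static inline void check_policy_sketch(void)
{
	assert(clamp_policy(2) == AUTHORIZE_INTERNAL);
	assert(clamp_policy(5) == AUTHORIZE_ALL);
	assert(authorized_by_default(AUTHORIZE_INTERNAL, true));
	assert(!authorized_by_default(AUTHORIZE_INTERNAL, false));
}
```

With policy 2, a device on an internal port is usable immediately, while anything plugged into an external port waits for userspace to authorize it.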

Signed-off-by: Dmitry Torokhov 
---
 .../admin-guide/kernel-parameters.txt |  3 +-
 Documentation/usb/authorization.txt   |  4 +-
 drivers/usb/core/hcd.c| 51 +++
 drivers/usb/core/usb.c| 33 +---
 include/linux/usb/hcd.h   | 10 ++--
 5 files changed, 69 insertions(+), 32 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index aefd358a5ca3..4446919089b9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4675,7 +4675,8 @@
usbcore.authorized_default=
[USB] Default USB device authorization:
(default -1 = authorized except for wireless USB,
-   0 = not authorized, 1 = authorized)
+   0 = not authorized, 1 = authorized, 2 = authorized
+   if device connected to internal port)
 
usbcore.autosuspend=
[USB] The autosuspend time delay (in seconds) used
diff --git a/Documentation/usb/authorization.txt b/Documentation/usb/authorization.txt
index c7e985f05d8f..68c001aca78c 100644
--- a/Documentation/usb/authorization.txt
+++ b/Documentation/usb/authorization.txt
@@ -34,7 +34,9 @@ $ echo 1 > /sys/bus/usb/devices/usbX/authorized_default
 By default, Wired USB devices are authorized by default to
 connect. Wireless USB hosts deauthorize by default all new connected
 devices (this is so because we need to do an authentication phase
-before authorizing).
+before authorizing). Writing "2" to the authorized_default attribute
+causes kernel to only authorize by default devices connected to internal
+USB ports.
 
 
 Example system lockdown (lame)
diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index 487025d31d44..4a78bf191d78 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -373,13 +373,19 @@ static const u8 ss_rh_config_descriptor[] = {
  * -1 is authorized for all devices except wireless (old behaviour)
  * 0 is unauthorized for all devices
  * 1 is authorized for all devices
+ * 2 is authorized for internal devices
  */
-static int authorized_default = -1;
+#define USB_AUTHORIZE_WIRED    -1
+#define USB_AUTHORIZE_NONE 0
+#define USB_AUTHORIZE_ALL  1
+#define USB_AUTHORIZE_INTERNAL 2
+
+static int authorized_default = USB_AUTHORIZE_WIRED;
 module_param(authorized_default, int, S_IRUGO|S_IWUSR);
 MODULE_PARM_DESC(authorized_default,
"Default USB device authorization: 0 is not authorized, 1 is "
-   "authorized, -1 is authorized except for wireless USB (default, "
-   "old behaviour");
+   "authorized, 2 is authorized for internal devices, -1 is "
+   "authorized except for wireless USB (default, old behaviour");
 /*-*/
 
 /**
@@ -884,7 +890,7 @@ static ssize_t authorized_default_show(struct device *dev,
struct usb_hcd *hcd;
 
hcd = bus_to_hcd(usb_bus);
-   return snprintf(buf, PAGE_SIZE, "%u\n", !!HCD_DEV_AUTHORIZED(hcd));
+   return snprintf(buf, PAGE_SIZE, "%u\n", hcd->dev_policy);
 }
 
 static ssize_t authorized_default_store(struct device *dev,
@@ -900,11 +906,8 @@ static ssize_t authorized_default_store(struct device *dev,
hcd = bus_to_hcd(usb_bus);
result = sscanf(buf, "%u\n", &val);
if (result == 1) {
-   if (val)
-   set_bit(HCD_FLAG_DEV_AUTHORIZED, &hcd->flags);
-   else
-   clear_bit(HCD_FLAG_DEV_AUTHORIZED, &hcd->flags);
-
+   hcd->dev_policy = val <= USB_DEVICE_AUTHORIZE_INTERNAL ?
+   val : USB_DEVICE_AUTHORIZE_ALL;
result = size;
} else {
result = -EINVAL;
@@ -2745,18 +2748,26 @@ int usb_add_hcd(struct usb_hcd *hcd,
 
dev_info(hcd->self.controller, "%s\n", hcd->product_desc);
 
-   /* Keep old behaviour if authorized_default is not in [0, 1]. */
-   if (authorized_default < 0 || authorized_default > 1) {
-   if (hcd->wireless)
-   clear_bit(HCD_FLAG_DEV_AUTHORIZED, &hcd->flags);
-   else
-   set_bit(HCD_FLAG_DEV_AUTHORIZED, &hcd->flags);
-   } else {
-   if (authorized

Re: [PATCH] tty: serial: msm_serial: Remove __init from msm_console_setup()

2019-02-16 Thread Bjorn Andersson

On Sat 16 Feb 21:05 PST 2019, Jeffrey Hugo wrote:

> Due to the complexities of modern Qualcomm SoCs, about a half dozen drivers
> must successfully probe before the clocks for the console are present, and
> the console can successfully probe.  Depending on several random factors
> such as probe order and modules vs builtin, msm_serial may not be able to
> successfully probe for some time, at which point, __init annotated functions
> may become unmapped.  If this occurs, msm_console_setup() will be called
> from the probe path, but will no longer exist, resulting in a kernel
> panic.
> 
> Resolve this issue by removing the __init annotation from
> msm_console_setup().

I'm pretty sure I've stumbled upon this several times without knowing
what hit me.

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> 
> Signed-off-by: Jeffrey Hugo 
> ---
>  drivers/tty/serial/msm_serial.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c
> index 736b74f..1090960 100644
> --- a/drivers/tty/serial/msm_serial.c
> +++ b/drivers/tty/serial/msm_serial.c
> @@ -1634,7 +1634,7 @@ static void msm_console_write(struct console *co, const char *s,
>   __msm_console_write(port, s, count, msm_port->is_uartdm);
>  }
>  
> -static int __init msm_console_setup(struct console *co, char *options)
> +static int msm_console_setup(struct console *co, char *options)
>  {
>   struct uart_port *port;
>   int baud = 115200;
> -- 
> Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the
> Code Aurora Forum, a Linux Foundation Collaborative Project.
>

Re: [PATCH v3 0/3] input: goodix - support Goodix gt5688

2019-02-16 Thread Dmitry Torokhov

On Fri, Feb 15, 2019 at 03:11:50PM +0100, Guido Günther wrote:
> Besides the support for gt5688 two more minimal changes piled up so I'm
> folding these in here. One is a doc update for the goodix bindings, the
> other one makes it a bit simpler to figure out configuration problems
> (like lack of touchscreen-size-{x,y} in the device tree).
> 
> I've kept the original subject to (hopefully) not break the series for
> patchwork. All patches can be applied independently. Series is based on
> next-20190208.
> 
> Changes from v2
> * Add 'dt-bindings: input: Refer to touchscreen.txt for goodix ts'
> * Add 'input: goodix: Print values in case of inconsistencies'
>   (sent out separately before, no comments so far)
> * Collect 'Reviewed-By' on 'input: goodix - support Goodix gt5688'

Applied the lot, thank you.

-- 
Dmitry

[GIT PULL] KVM changes for Linux 5.0-rc7

2019-02-16 Thread Paolo Bonzini

Linus,

The following changes since commit d13937116f1e82bf508a6325111b322c30c85eb9:

  Linux 5.0-rc6 (2019-02-10 14:42:20 -0800)

are available in the git repository at:

  https://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus

for you to fetch changes up to 98ae70cc476e82a2c6bb72f941a25f0de226:

  kvm: vmx: Fix entry number check for add_atomic_switch_msr() (2019-02-14 16:22:20 +0100)


A somewhat bigger ARM update, and the usual smattering
of x86 bug fixes.


Christoffer Dall (2):
  KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded
  KVM: arm/arm64: vgic: Always initialize the group of private IRQs

James Morse (1):
  KVM: arm64: Forbid kprobing of the VHE world-switch code

Julien Thierry (3):
  KVM: arm/arm64: vgic: Make vgic_irq->irq_lock a raw_spinlock
  KVM: arm/arm64: vgic: Make vgic_dist->lpi_list_lock a raw_spinlock
  KVM: arm/arm64: vgic: Make vgic_cpu->ap_list_lock a raw_spinlock

Luwei Kang (1):
  KVM: x86: Recompute PID.ON when clearing PID.SN

Marc Zyngier (4):
  arm64: KVM: Don't generate UNDEF when LORegion feature is present
  arm/arm64: KVM: Allow a VCPU to fully reset itself
  arm/arm64: KVM: Don't panic on failure to properly reset system registers
  arm: KVM: Add missing kvm_stage2_has_pmd() helper

Paolo Bonzini (1):
  Merge tag 'kvm-arm-fixes-for-5.0' of git://git.kernel.org/.../kvmarm/kvmarm into kvm-master

Sean Christopherson (1):
  KVM: nVMX: Restore a preemption timer consistency check

Suzuki K Poulose (1):
  KVM: arm64: Relax the restriction on using stage2 PUD huge mapping

Vitaly Kuznetsov (1):
  x86/kvm/nVMX: read from MSR_IA32_VMX_PROCBASED_CTLS2 only when it is available

Xiaoyao Li (1):
  kvm: vmx: Fix entry number check for add_atomic_switch_msr()

 arch/arm/include/asm/kvm_host.h   |  10 +++
 arch/arm/include/asm/stage2_pgtable.h |   5 ++
 arch/arm/kvm/coproc.c |   4 +-
 arch/arm/kvm/reset.c  |  24 +++
 arch/arm64/include/asm/kvm_host.h |  11 
 arch/arm64/kvm/hyp/switch.c   |   5 ++
 arch/arm64/kvm/hyp/sysreg-sr.c|   5 ++
 arch/arm64/kvm/reset.c|  50 +-
 arch/arm64/kvm/sys_regs.c |  50 --
 arch/x86/kvm/vmx/nested.c |  12 +++-
 arch/x86/kvm/vmx/vmx.c|  29 -
 arch/x86/kvm/vmx/vmx.h|  10 +--
 arch/x86/kvm/x86.c|   2 +-
 include/kvm/arm_vgic.h|   6 +-
 virt/kvm/arm/arm.c|  10 +++
 virt/kvm/arm/mmu.c|   9 ++-
 virt/kvm/arm/psci.c   |  36 +--
 virt/kvm/arm/vgic/vgic-debug.c|   4 +-
 virt/kvm/arm/vgic/vgic-init.c |  30 +
 virt/kvm/arm/vgic/vgic-its.c  |  22 +++
 virt/kvm/arm/vgic/vgic-mmio-v2.c  |  14 ++--
 virt/kvm/arm/vgic/vgic-mmio-v3.c  |  12 ++--
 virt/kvm/arm/vgic/vgic-mmio.c |  34 +-
 virt/kvm/arm/vgic/vgic-v2.c   |   4 +-
 virt/kvm/arm/vgic/vgic-v3.c   |   8 +--
 virt/kvm/arm/vgic/vgic.c  | 118 +-
 26 files changed, 331 insertions(+), 193 deletions(-)

Re: [PATCH v2] Input: synaptics_i2c - remove redundant spinlock

2019-02-16 Thread Dmitry Torokhov

On Sun, Feb 17, 2019 at 01:10:53AM -0500, thesve...@gmail.com wrote:
> From: Sven Van Asbroeck 
> 
> Remove a leftover spinlock.
> 
> This was required back when mod_delayed_work() did not exist,
> and had to be implemented with a cancel + queue. See
> commit e7c2f967445d ("workqueue: use mod_delayed_work() instead of
> __cancel + queue")
> 
> schedule_delayed_work() and mod_delayed_work() can now be used
> concurrently. So the spinlock is no longer needed.
> 
> Cc: Tejun Heo 
> Signed-off-by: Sven Van Asbroeck 

Applied, thank you.

> ---
> 
> v2: replace useless synaptics_i2c_reschedule_work() with
>   mod_delayed_work().
> 
>  drivers/input/mouse/synaptics_i2c.c | 22 --
>  1 file changed, 4 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/input/mouse/synaptics_i2c.c b/drivers/input/mouse/synaptics_i2c.c
> index 8538318d332c..fa304648d611 100644
> --- a/drivers/input/mouse/synaptics_i2c.c
> +++ b/drivers/input/mouse/synaptics_i2c.c
> @@ -219,7 +219,6 @@ struct synaptics_i2c {
>   struct i2c_client   *client;
>   struct input_dev*input;
>   struct delayed_work dwork;
> - spinlock_t  lock;
>   int no_data_count;
>   int no_decel_param;
>   int reduce_report_param;
> @@ -369,23 +368,11 @@ static bool synaptics_i2c_get_input(struct synaptics_i2c *touch)
>   return xy_delta || gesture;
>  }
>  
> -static void synaptics_i2c_reschedule_work(struct synaptics_i2c *touch,
> -   unsigned long delay)
> -{
> - unsigned long flags;
> -
> - spin_lock_irqsave(&touch->lock, flags);
> -
> - mod_delayed_work(system_wq, &touch->dwork, delay);
> -
> - spin_unlock_irqrestore(&touch->lock, flags);
> -}
> -
>  static irqreturn_t synaptics_i2c_irq(int irq, void *dev_id)
>  {
>   struct synaptics_i2c *touch = dev_id;
>  
> - synaptics_i2c_reschedule_work(touch, 0);
> + mod_delayed_work(system_wq, &touch->dwork, 0);
>  
>   return IRQ_HANDLED;
>  }
> @@ -461,7 +448,7 @@ static void synaptics_i2c_work_handler(struct work_struct *work)
>* We poll the device once in THREAD_IRQ_SLEEP_SECS and
>* if error is detected, we try to reset and reconfigure the touchpad.
>*/
> - synaptics_i2c_reschedule_work(touch, delay);
> + mod_delayed_work(system_wq, &touch->dwork, delay);
>  }
>  
>  static int synaptics_i2c_open(struct input_dev *input)
> @@ -474,7 +461,7 @@ static int synaptics_i2c_open(struct input_dev *input)
>   return ret;
>  
>   if (polling_req)
> - synaptics_i2c_reschedule_work(touch,
> + mod_delayed_work(system_wq, &touch->dwork,
>   msecs_to_jiffies(NO_DATA_SLEEP_MSECS));
>  
>   return 0;
> @@ -530,7 +517,6 @@ static struct synaptics_i2c *synaptics_i2c_touch_create(struct i2c_client *clien
>   touch->scan_rate_param = scan_rate;
>   set_scan_rate(touch, scan_rate);
>   INIT_DELAYED_WORK(&touch->dwork, synaptics_i2c_work_handler);
> - spin_lock_init(&touch->lock);
>  
>   return touch;
>  }
> @@ -637,7 +623,7 @@ static int __maybe_unused synaptics_i2c_resume(struct device *dev)
>   if (ret)
>   return ret;
>  
> - synaptics_i2c_reschedule_work(touch,
> + mod_delayed_work(system_wq, &touch->dwork,
>   msecs_to_jiffies(NO_DATA_SLEEP_MSECS));
>  
>   return 0;
> -- 
> 2.17.1
> 

-- 
Dmitry

[PATCH] scsi: clean obsolete return values of eh_timed_out

2019-02-16 Thread Avri Altman

Those are no longer in use since commit 242f9dcb8ba6
("block: unify request timeout handling").

Signed-off-by: Avri Altman 
---
 include/scsi/scsi_host.h | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 4047d68..dbbdfbd 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -300,11 +300,7 @@ struct scsi_host_template {
/*
 * This is an optional routine that allows the transport to become
 * involved when a scsi io timer fires. The return value tells the
-* timer routine how to finish the io timeout handling:
-* EH_HANDLED:  I fixed the error, please complete the command
-* EH_RESET_TIMER:  I need more time, reset the timer and
-*  begin counting again
-* EH_DONE: Begin normal error recovery
+* timer routine how to finish the io timeout handling.
 *
 * Status: OPTIONAL
 */
-- 
1.9.1

Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()

2019-02-16 Thread Balbir Singh

On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
> Hi all,
> 
> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> > > rather than just checking that the value is non-zero, e.g.:
> > > 
> > >   static inline int pgd_present(pgd_t pgd)
> > >   {
> > >  -   return !pgd_none(pgd);
> > >  +   return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> > >   }
> > > 
> > > Unfortunately this is broken on big endian, as the result of the
> > > bitwise && is truncated to int, which is always zero because
> 
> (Bitwise "&" of course).
> 
> > Not sure why that should happen, why is the result an int? What
> > causes the result of pgd_t & be64 to be truncated to an int?
> 
> Yes, it's not obvious as written...  It's simply that the return type of
> pgd_present is int.  So it is truncated _after_ the bitwise and.
>

Thanks, I am surprised the compiler does not complain about the truncation
of bits. I wonder if we are missing -Wconversion.

Balbir
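
The truncation discussed in this thread can be reproduced in plain user-space C. The sketch below is only an illustration: `PRESENT_MASK` is a hypothetical stand-in for what `cpu_to_be64(_PAGE_PRESENT)` evaluates to on big endian (assumed here to be the top bit of the 64-bit entry), and the helper names are made up. The `!!` form is one possible way to avoid the problem, not necessarily the fix that was applied.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the big-endian present mask: only the top bit
 * of the 64-bit entry is set. */
#define PRESENT_MASK UINT64_C(0x8000000000000000)

/* Buggy shape: the 64-bit result of the bitwise AND is converted to int by
 * the return statement. The conversion is implementation-defined; GCC and
 * Clang keep the low 32 bits, which are all zero here. */
static inline int present_truncated(uint64_t entry)
{
	return entry & PRESENT_MASK;
}

/* One possible fix: reduce the result to 0 or 1 before it becomes an int. */
static inline int present_normalised(uint64_t entry)
{
	return !!(entry & PRESENT_MASK);
}

/* Small self-check that doubles as a usage example. */
static inline void check_present_helpers(void)
{
	assert(present_truncated(PRESENT_MASK) == 0);  /* present bit lost */
	assert(present_normalised(PRESENT_MASK) == 1); /* present bit kept */
	assert(present_normalised(0) == 0);
}
```

Neither GCC nor Clang enables -Wconversion as part of -Wall, so the buggy variant typically compiles without a diagnostic unless that flag is given explicitly.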

[PATCH v2] Input: synaptics_i2c - remove redundant spinlock

2019-02-16 Thread thesven73

From: Sven Van Asbroeck 

Remove a leftover spinlock.

This was required back when mod_delayed_work() did not exist,
and had to be implemented with a cancel + queue. See
commit e7c2f967445d ("workqueue: use mod_delayed_work() instead of
__cancel + queue")

schedule_delayed_work() and mod_delayed_work() can now be used
concurrently. So the spinlock is no longer needed.

Cc: Tejun Heo 
Signed-off-by: Sven Van Asbroeck 
---

v2: replace useless synaptics_i2c_reschedule_work() with
mod_delayed_work().

 drivers/input/mouse/synaptics_i2c.c | 22 --
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/drivers/input/mouse/synaptics_i2c.c b/drivers/input/mouse/synaptics_i2c.c
index 8538318d332c..fa304648d611 100644
--- a/drivers/input/mouse/synaptics_i2c.c
+++ b/drivers/input/mouse/synaptics_i2c.c
@@ -219,7 +219,6 @@ struct synaptics_i2c {
struct i2c_client   *client;
struct input_dev*input;
struct delayed_work dwork;
-   spinlock_t  lock;
int no_data_count;
int no_decel_param;
int reduce_report_param;
@@ -369,23 +368,11 @@ static bool synaptics_i2c_get_input(struct synaptics_i2c *touch)
return xy_delta || gesture;
 }
 
-static void synaptics_i2c_reschedule_work(struct synaptics_i2c *touch,
- unsigned long delay)
-{
-   unsigned long flags;
-
-   spin_lock_irqsave(&touch->lock, flags);
-
-   mod_delayed_work(system_wq, &touch->dwork, delay);
-
-   spin_unlock_irqrestore(&touch->lock, flags);
-}
-
 static irqreturn_t synaptics_i2c_irq(int irq, void *dev_id)
 {
struct synaptics_i2c *touch = dev_id;
 
-   synaptics_i2c_reschedule_work(touch, 0);
+   mod_delayed_work(system_wq, &touch->dwork, 0);
 
return IRQ_HANDLED;
 }
@@ -461,7 +448,7 @@ static void synaptics_i2c_work_handler(struct work_struct *work)
 * We poll the device once in THREAD_IRQ_SLEEP_SECS and
 * if error is detected, we try to reset and reconfigure the touchpad.
 */
-   synaptics_i2c_reschedule_work(touch, delay);
+   mod_delayed_work(system_wq, &touch->dwork, delay);
 }
 
 static int synaptics_i2c_open(struct input_dev *input)
@@ -474,7 +461,7 @@ static int synaptics_i2c_open(struct input_dev *input)
return ret;
 
if (polling_req)
-   synaptics_i2c_reschedule_work(touch,
+   mod_delayed_work(system_wq, &touch->dwork,
msecs_to_jiffies(NO_DATA_SLEEP_MSECS));
 
return 0;
@@ -530,7 +517,6 @@ static struct synaptics_i2c *synaptics_i2c_touch_create(struct i2c_client *clien
touch->scan_rate_param = scan_rate;
set_scan_rate(touch, scan_rate);
INIT_DELAYED_WORK(&touch->dwork, synaptics_i2c_work_handler);
-   spin_lock_init(&touch->lock);
 
return touch;
 }
@@ -637,7 +623,7 @@ static int __maybe_unused synaptics_i2c_resume(struct device *dev)
if (ret)
return ret;
 
-   synaptics_i2c_reschedule_work(touch,
+   mod_delayed_work(system_wq, &touch->dwork,
msecs_to_jiffies(NO_DATA_SLEEP_MSECS));
 
return 0;
-- 
2.17.1

Re: [PATCH] scripts/sign-file.c: Use CMS if LibreSSL >= 2.6.0 is present

2019-02-16 Thread Alec Ari

Scratch that patch, error is still there on module installation. Sorry
about that! I thought that should have fixed it, I must be missing
something here.

Alec

Re: [PATCH] Input: st1232 - include gpio/consumer.h header for gpiod_set_value_cansleep()

2019-02-16 Thread Dmitry Torokhov

On Wed, Feb 13, 2019 at 12:19:58PM +0100, Martin Kepplinger wrote:
> gpiod_set_value_cansleep() needs its declaration in the corresponding
> header. This fixes build errors like
> 
>drivers/input/touchscreen/st1232.c: In function 'st1232_ts_power':
> >> drivers/input/touchscreen/st1232.c:146:3: error: implicit declaration of
>function 'gpiod_set_value_cansleep'; did you mean 'gpio_set_value_cansleep'?
>[-Werror=implicit-function-declaration]
>   gpiod_set_value_cansleep(ts->reset_gpio, !poweron);
>   ^~~~
>   gpio_set_value_cansleep
> 
> Signed-off-by: Martin Kepplinger 

Thanks Martin, I'll fold it into the original patch introducing gpiod so
we keep bisectability.

-- 
Dmitry

Re: [PATCH] Input: synaptics_i2c - remove redundant spinlock

2019-02-16 Thread Dmitry Torokhov

Hi Sven,

On Mon, Feb 11, 2019 at 08:34:42PM -0500, thesve...@gmail.com wrote:
> @@ -372,13 +371,7 @@ static bool synaptics_i2c_get_input(struct synaptics_i2c 
> *touch)
>  static void synaptics_i2c_reschedule_work(struct synaptics_i2c *touch,
> unsigned long delay)
>  {
> - unsigned long flags;
> -
> - spin_lock_irqsave(&touch->lock, flags);
> -
>   mod_delayed_work(system_wq, &touch->dwork, delay);
> -
> - spin_unlock_irqrestore(&touch->lock, flags);
>  }

This makes synaptics_i2c_reschedule_work() a useless wrapper for
mod_delayed_work(). Can we get rid of it?

Thanks.

-- 
Dmitry

Re: [PATCH][next] mtd: spi-nor: cadence-quadspi: fix spelling mistake: "Couldnt't" -> "Couldn't"

2019-02-16 Thread Vignesh R




On 15/02/19 8:45 PM, Colin King wrote:
> From: Colin Ian King 
> 
> There is a spelling mistake in a dev_error message. Fix it.
> 
> Signed-off-by: Colin Ian King 
> ---
>  drivers/mtd/spi-nor/cadence-quadspi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/mtd/spi-nor/cadence-quadspi.c b/drivers/mtd/spi-nor/cadence-quadspi.c
> index 56512c0368f9..792628750eec 100644
> --- a/drivers/mtd/spi-nor/cadence-quadspi.c
> +++ b/drivers/mtd/spi-nor/cadence-quadspi.c
> @@ -1249,7 +1249,7 @@ static int cqspi_setup_flash(struct cqspi_st *cqspi, struct device_node *np)
>  
>   ddata = of_device_get_match_data(dev);
>   if (!ddata) {
> - dev_err(dev, "Couldnt't find driver data\n");
> + dev_err(dev, "Couldn't find driver data\n");
>   return -EINVAL;
>   }
>   hwcaps.mask = ddata->hwcaps_mask;
> 

Oops, my bad.. Thanks for fixing this!


-- 
Regards
Vignesh

Re: [PATCH v1] Input: st-keyscan - fix potential zalloc NULL dereference

2019-02-16 Thread Dmitry Torokhov

On Tue, Feb 12, 2019 at 04:30:55PM +0100, gabriel.fernan...@st.com wrote:
> From: Gabriel Fernandez 
> 
> This patch fixes the following static checker warning:
> 
> drivers/input/keyboard/st-keyscan.c:156 keyscan_probe()
> error: potential zalloc NULL dereference: 'keypad_data->input_dev'
> 
> Reported-by: Dan Carpenter 
> Signed-off-by: Gabriel Fernandez 

Applied, thank you.

> ---
>  drivers/input/keyboard/st-keyscan.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/input/keyboard/st-keyscan.c b/drivers/input/keyboard/st-keyscan.c
> index babcfb165e4f..3b85631fde91 100644
> --- a/drivers/input/keyboard/st-keyscan.c
> +++ b/drivers/input/keyboard/st-keyscan.c
> @@ -153,6 +153,8 @@ static int keyscan_probe(struct platform_device *pdev)
>  
>   input_dev->id.bustype = BUS_HOST;
>  
> + keypad_data->input_dev = input_dev;
> +
>   error = keypad_matrix_key_parse_dt(keypad_data);
>   if (error)
>   return error;
> @@ -168,8 +170,6 @@ static int keyscan_probe(struct platform_device *pdev)
>  
>   input_set_drvdata(input_dev, keypad_data);
>  
> - keypad_data->input_dev = input_dev;
> -
>   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>   keypad_data->base = devm_ioremap_resource(&pdev->dev, res);
>   if (IS_ERR(keypad_data->base))
> -- 
> 2.17.0
> 

-- 
Dmitry
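
The ordering problem fixed by the st-keyscan patch can be sketched in user-space C. This is a simplified model under stated assumptions: the structures and functions below are hypothetical stand-ins, and it is assumed (based on the static checker warning) that the DT parsing helper reaches the input device through the private data. Unlike the driver, the stand-in helper returns -1 instead of dereferencing NULL, so the difference between the two orderings can be observed safely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct input_dev_sketch {
	int keycodemax;
};

struct keypad_sketch {
	struct input_dev_sketch *input_dev; /* NULL after zeroed allocation */
};

/* Stand-in for a helper that uses kp->input_dev. It reports -1 where the
 * real code would dereference a NULL pointer. */
static inline int parse_needs_input_dev(struct keypad_sketch *kp)
{
	if (kp->input_dev == NULL)
		return -1;
	kp->input_dev->keycodemax = 16;
	return 0;
}

/* Ordering before the fix: the helper runs while the field is still NULL. */
static inline int probe_old_order(struct keypad_sketch *kp,
				  struct input_dev_sketch *dev)
{
	int err = parse_needs_input_dev(kp);

	kp->input_dev = dev;
	return err;
}

/* Ordering after the fix: set the pointer first, then call the helper. */
static inline int probe_fixed_order(struct keypad_sketch *kp,
				    struct input_dev_sketch *dev)
{
	kp->input_dev = dev;
	return parse_needs_input_dev(kp);
}

/* Small self-check that doubles as a usage example. */
static inline void check_probe_order(void)
{
	struct input_dev_sketch dev = { 0 };
	struct keypad_sketch old_kp = { NULL };
	struct keypad_sketch new_kp = { NULL };

	assert(probe_old_order(&old_kp, &dev) == -1);
	assert(probe_fixed_order(&new_kp, &dev) == 0);
}
```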

Re: [PATCH v2] LSM: Ignore "security=" when "lsm=" is specified

2019-02-16 Thread Tetsuo Handa

On 2019/02/14 1:05, Casey Schaufler wrote:
> On 2/12/2019 10:23 AM, Kees Cook wrote:
>> To avoid potential confusion, explicitly ignore "security=" when "lsm=" is
>> used on the command line, and report that it is happening.
>>
>> Suggested-by: Tetsuo Handa 
>> Signed-off-by: Kees Cook 
> 
> Acked-by: Casey Schaufler 
> 

The manual for TOMOYO was updated to follow this change.
SELinux folks and AppArmor folks, can we apply this change?

Fwd: [PATCH] scripts/sign-file.c: Use CMS if LibreSSL >= 2.6.0 is present

2019-02-16 Thread Alec Ari

Hello, it seems it's impossible to add tabs via gmail. The appropriate
patch has been attached to kernel bugzilla #202159

Sorry about this, hitting tab toggles highlighted selection, using
"more indent" didn't do the trick I guess. The patch works here but I
suppose not for the one who reported the bug.

Alec

[PATCH] tty: serial: msm_serial: Remove __init from msm_console_setup()

2019-02-16 Thread Jeffrey Hugo

Due to the complexities of modern Qualcomm SoCs, about a half dozen drivers
must successfully probe before the clocks for the console are present, and
the console can successfully probe.  Depending on several random factors
such as probe order and modules vs builtin, msm_serial may not be able to
successfully probe for some time, at which point, __init annotated functions
may become unmapped.  If this occurs, msm_console_setup() will be called
from the probe path, but will no longer exist, resulting in a kernel
panic.

Resolve this issue by removing the __init annotation from
msm_console_setup().
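
The failure mode can be modelled in userspace roughly as below. This is a sketch under stated assumptions, not kernel code: struct console_model, free_init_sections() and probe_console() are hypothetical names, and the "panic" is represented by a -1 return value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a callback plus a note on where its code lives. */
struct console_model {
	int (*setup)(void);
	bool setup_in_init_section;	/* true if the callback is __init */
};

static bool init_sections_freed;	/* set once init memory is released */

static int demo_setup(void)
{
	return 0;
}

/* Mimics the kernel discarding __init text at the end of boot. */
static void free_init_sections(void)
{
	init_sections_freed = true;
}

/*
 * Returns the callback's result, or -1 where a real kernel would
 * jump into unmapped memory and panic.
 */
static int probe_console(const struct console_model *con)
{
	if (con->setup_in_init_section && init_sections_freed)
		return -1;
	return con->setup();
}
```

A probe that runs before free_init_sections() succeeds either way; only the deferred probe of the __init-annotated callback hits the -1 path, which is why dropping the annotation is enough.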

Signed-off-by: Jeffrey Hugo 
---
 drivers/tty/serial/msm_serial.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c
index 736b74f..1090960 100644
--- a/drivers/tty/serial/msm_serial.c
+++ b/drivers/tty/serial/msm_serial.c
@@ -1634,7 +1634,7 @@ static void msm_console_write(struct console *co, const char *s,
__msm_console_write(port, s, count, msm_port->is_uartdm);
 }
 
-static int __init msm_console_setup(struct console *co, char *options)
+static int msm_console_setup(struct console *co, char *options)
 {
struct uart_port *port;
int baud = 115200;
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

Re: [RFC] On the Current Troubles of Mainlining Loongson Platform Drivers

2019-02-16 Thread Alexandre Oliva

On Feb 11, 2019, Aaro Koskinen  wrote:

> ATA (libata) CS5536 driver is having issues with spurious IRQs and often
> disables IRQs completely during the boot. You should see a warning
> in dmesg. This was the reason for slowness on my FuLoong mini-PC. A
> workaround is to switch to old IDE driver.

Thanks.  I see a NIEN quirk in ide-iops.c that's enabled for the hard
drive model I've got on my yeeloong, but that's not even compiled in my
freeloong builds.  I don't see any changes in libata between 4.19 and
4.20 that could explain the regression either.

I'm afraid there's no observable change in behavior after installing the
proposed patch at
https://lore.kernel.org/linux-mips/20190106124607.gk27...@darkstar.musicnaut.iki.fi/

The kernel still disables irq14 early on, and then runs slow.

Tom, why do you say bisecting this is impossible?  I realize you wrote
you did so for 24 hours non-stop, but...  I'm curious as to what
obstacles you ran into.  It's such a reproducible problem for me that I
can't see how bisecting it might be difficult.

Or were by any chance you talking about the reboot/shutdown problem
then?

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain Engineer                Free Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás-GNUChe

Re: [RFC] On the Current Troubles of Mainlining Loongson Platform Drivers

2019-02-16 Thread Alexandre Oliva

On Feb 11, 2019, Tom Li  wrote:

> We've just identified and confirmed the source of the shutdown problem a
> few days ago on this mailing list.

> You can pick up the patch from:
> https://lore.kernel.org/lkml/20190207205812.ga11...@darkstar.musicnaut.iki.fi/

> A patch has been authored and submitted by Aaro Koskinen, but currently it
> seems stuck in the mailing list because the maintainer worries about 
> regression
> and asks for testing on more MIPS systems, although we believe it's a trivial
> patch.

Thanks.  I've just brought it into my local copies of loongson-community
branches for 4.4, 4.9, 4.14, 4.19, and 4.20, and master, after verifying
that it enables at least 4.19, 4.20 and master to reboot without a power
cycle, and I'll include it in upcoming freeloong builds, except for 3.16
and 3.18, where it doesn't apply.


> In addition, I've discovered and fixed another bug preventing the machine from
> shutting down, this patch has already merged into the Linus tree.

> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8a96669d77897ff3613157bf43f875739205d66d

Hmm, I don't think I've ever hit this one.  Do you happen to know how
far back it might be needed?


> I'll continue working on upstreaming these out-of-tree drivers as my personal
> project. I hope you'll be able to use a fully-functional machine with the 
> mainline
> kernel soon, my current target is Linux 5.3.

Thanks!

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain Engineer                Free Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás-GNUChe

[PATCH v2 4/4] arm64: dts: qcom: msm8998: Add mmcc node

2019-02-16 Thread Jeffrey Hugo

Add MSM8998 Multimedia Clock Controller DT node.

Signed-off-by: Jeffrey Hugo 

---
 arch/arm64/boot/dts/qcom/msm8998.dtsi | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/msm8998.dtsi b/arch/arm64/boot/dts/qcom/msm8998.dtsi
index 19a023986009..14d1ae6366dc 100644
--- a/arch/arm64/boot/dts/qcom/msm8998.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8998.dtsi
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -973,6 +974,19 @@
};
};
 
+   mmcc: clock-controller@c8c {
+   compatible = "qcom,mmcc-msm8998";
+   #clock-cells = <1>;
+   #reset-cells = <1>;
+   #power-domain-cells = <1>;
+   reg = <0x0c8c 0x4>;
+
+   clock-names = "xo",
+ "gpll0";
+   clocks = <&rpmcc RPM_SMD_XO_CLK_SRC>,
+<&gcc GPLL0_OUT_MAIN>;
+   };
+
timer@1792 {
#address-cells = <1>;
#size-cells = <1>;
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

[PATCH] trace: skip hwasan

2019-02-16 Thread Qian Cai

Enabling the function tracer with CONFIG_KASAN_SW_TAGS=y (hwasan) causes
the whole system to freeze on ThunderX2 systems with 256 CPUs. The tracer
generates a burst of pointer accesses, and for each of them KASAN
dereferences the shadow address to check the tag, which overwhelms all
the CPUs.
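
To give a feel for the per-access cost, here is a userspace sketch of a software tag check. It assumes the usual tag-based KASAN layout of one shadow byte per 16-byte granule; the arena, set_tag() and access_ok() are hypothetical names for illustration, and the real checks are emitted by the compiler with the tag carried in the top byte of the pointer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE 16u	/* bytes of memory covered by one shadow byte */
#define ARENA_SIZE   256u	/* size of the toy memory arena */

static uint8_t shadow[ARENA_SIZE / GRANULE_SIZE];	/* one tag per granule */
static unsigned long shadow_loads;			/* counts shadow lookups */

/* Tag every granule overlapping [offset, offset + size); size must be > 0. */
static void set_tag(size_t offset, size_t size, uint8_t tag)
{
	for (size_t g = offset / GRANULE_SIZE;
	     g <= (offset + size - 1) / GRANULE_SIZE; g++)
		shadow[g] = tag;
}

/* Check one access: every granule touched costs one shadow load. */
static bool access_ok(uint8_t ptr_tag, size_t offset, size_t size)
{
	for (size_t g = offset / GRANULE_SIZE;
	     g <= (offset + size - 1) / GRANULE_SIZE; g++) {
		shadow_loads++;
		if (shadow[g] != ptr_tag)
			return false;
	}
	return true;
}
```

Every instrumented access pays at least one extra load, so a hot path like the function tracer multiplies its memory traffic, which is the motivation for excluding kernel/trace from instrumentation.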

Signed-off-by: Qian Cai 
---
 kernel/trace/Makefile | 5 +
 1 file changed, 5 insertions(+)

diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index c2b2148bb1d2..fdd547a68385 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -28,6 +28,11 @@ ifdef CONFIG_GCOV_PROFILE_FTRACE
 GCOV_PROFILE := y
 endif
 
+# Too much pointer access will kill hwasan.
+ifdef CONFIG_KASAN_SW_TAGS
+KASAN_SANITIZE := n
+endif
+
 CFLAGS_trace_benchmark.o := -I$(src)
 CFLAGS_trace_events_filter.o := -I$(src)
 
-- 
2.17.2 (Apple Git-113)

[PATCH v2 3/4] clk: qcom: Add MSM8998 Multimedia Clock Controller (MMCC) driver

2019-02-16 Thread Jeffrey Hugo

Add a driver for the multimedia clock controller found on MSM8998
based devices. This should allow most multimedia device drivers
to probe and control their clocks.

Signed-off-by: Jeffrey Hugo 
---
 drivers/clk/qcom/Kconfig  |9 +
 drivers/clk/qcom/Makefile |1 +
 drivers/clk/qcom/mmcc-msm8998.c   | 2915 +
 include/dt-bindings/clock/qcom,mmcc-msm8998.h |  210 ++
 4 files changed, 3135 insertions(+)
 create mode 100644 drivers/clk/qcom/mmcc-msm8998.c
 create mode 100644 include/dt-bindings/clock/qcom,mmcc-msm8998.h

diff --git a/drivers/clk/qcom/Kconfig b/drivers/clk/qcom/Kconfig
index 1b1ba54e33dd..d917bd7efb02 100644
--- a/drivers/clk/qcom/Kconfig
+++ b/drivers/clk/qcom/Kconfig
@@ -220,6 +220,15 @@ config MSM_GCC_8998
  Say Y if you want to use peripheral devices such as UART, SPI,
  i2c, USB, UFS, SD/eMMC, PCIe, etc.
 
+config MSM_MMCC_8998
+   tristate "MSM8998 Multimedia Clock Controller"
+   select MSM_GCC_8998
+   select QCOM_GDSC
+   help
+ Support for the multimedia clock controller on msm8998 devices.
+ Say Y if you want to support multimedia devices such as display,
+ graphics, video encode/decode, camera, etc.
+
 config QCS_GCC_404
tristate "QCS404 Global Clock Controller"
help
diff --git a/drivers/clk/qcom/Makefile b/drivers/clk/qcom/Makefile
index ee8d0698e370..d6de24b53ad9 100644
--- a/drivers/clk/qcom/Makefile
+++ b/drivers/clk/qcom/Makefile
@@ -36,6 +36,7 @@ obj-$(CONFIG_MSM_GCC_8998) += gcc-msm8998.o
 obj-$(CONFIG_MSM_MMCC_8960) += mmcc-msm8960.o
 obj-$(CONFIG_MSM_MMCC_8974) += mmcc-msm8974.o
 obj-$(CONFIG_MSM_MMCC_8996) += mmcc-msm8996.o
+obj-$(CONFIG_MSM_MMCC_8998) += mmcc-msm8998.o
 obj-$(CONFIG_QCOM_A53PLL) += a53-pll.o
 obj-$(CONFIG_QCOM_CLK_APCS_MSM8916) += apcs-msm8916.o
 obj-$(CONFIG_QCOM_CLK_RPM) += clk-rpm.o
diff --git a/drivers/clk/qcom/mmcc-msm8998.c b/drivers/clk/qcom/mmcc-msm8998.c
new file mode 100644
index ..762e55f682c3
--- /dev/null
+++ b/drivers/clk/qcom/mmcc-msm8998.c
@@ -0,0 +1,2915 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2019, The Linux Foundation. All rights reserved.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "common.h"
+#include "clk-regmap.h"
+#include "clk-regmap-divider.h"
+#include "clk-alpha-pll.h"
+#include "clk-rcg.h"
+#include "clk-branch.h"
+#include "reset.h"
+#include "gdsc.h"
+
+enum {
+   P_XO,
+   P_GPLL0,
+   P_GPLL0_DIV,
+   P_MMPLL0_OUT_EVEN,
+   P_MMPLL1_OUT_EVEN,
+   P_MMPLL3_OUT_EVEN,
+   P_MMPLL4_OUT_EVEN,
+   P_MMPLL5_OUT_EVEN,
+   P_MMPLL6_OUT_EVEN,
+   P_MMPLL7_OUT_EVEN,
+   P_MMPLL10_OUT_EVEN,
+   P_DSI0PLL,
+   P_DSI1PLL,
+   P_DSI0PLL_BYTE,
+   P_DSI1PLL_BYTE,
+   P_HDMIPLL,
+   P_DPVCO,
+   P_DPLINK,
+   P_CORE_BI_PLL_TEST_SE,
+};
+
+static struct clk_fixed_factor gpll0_div = {
+   .mult = 1,
+   .div = 2,
+   .hw.init = &(struct clk_init_data){
+   .name = "mmss_gpll0_div",
+   .parent_data = &(const struct clk_parent_data){
+   .name = "gpll0",
+   .fallback = "gpll0"
+   },
+   .num_parents = 1,
+   .ops = &clk_fixed_factor_ops,
+   },
+};
+
+static const struct clk_div_table post_div_table_fabia_even[] = {
+   { 0x0, 1 },
+   { 0x1, 2 },
+   { 0x3, 4 },
+   { 0x7, 8 },
+   { }
+};
+
+static struct clk_alpha_pll mmpll0 = {
+   .offset = 0xc000,
+   .regs = clk_alpha_pll_regs[CLK_ALPHA_PLL_TYPE_FABIA],
+   .clkr = {
+   .enable_reg = 0x1e0,
+   .enable_mask = BIT(0),
+   .hw.init = &(struct clk_init_data){
+   .name = "mmpll0",
+   .parent_data = &(const struct clk_parent_data){
+   .name = "xo",
+   .fallback = "xo"
+   },
+   .num_parents = 1,
+   .ops = &clk_alpha_pll_fixed_fabia_ops,
+   },
+   },
+};
+
+static struct clk_alpha_pll_postdiv mmpll0_out_even = {
+   .offset = 0xc000,
+   .post_div_shift = 8,
+   .post_div_table = post_div_table_fabia_even,
+   .num_post_div = ARRAY_SIZE(post_div_table_fabia_even),
+   .width = 4,
+   .regs = clk_alpha_pll_regs[CLK_ALPHA_PLL_TYPE_FABIA],
+   .clkr.hw.init = &(struct clk_init_data){
+   .name = "mmpll0_out_even",
+   .parent_hws = (struct clk_hw *[]){ &mmpll0.clkr.hw },
+   .num_parents = 1,
+   .ops = &clk_alpha_pll_postdiv_fabia_ops,
+   },
+};
+
+static struct clk_alpha_pll mmpll1 = {
+   .offset = 0xc050,
+   .regs = clk_alpha_pll_regs[CLK_ALPHA_PLL_TYPE_FABIA],
+   .clkr = {
+

[PATCH v2 2/4] dt-bindings: clock: Add support for the MSM8998 mmcc

2019-02-16 Thread Jeffrey Hugo

Document the multimedia clock controller found on MSM8998

Signed-off-by: Jeffrey Hugo 
---
 .../devicetree/bindings/clock/qcom,mmcc.txt   | 21 +++
 1 file changed, 21 insertions(+)

diff --git a/Documentation/devicetree/bindings/clock/qcom,mmcc.txt b/Documentation/devicetree/bindings/clock/qcom,mmcc.txt
index 8b0f7841af8d..a92f3cbc9736 100644
--- a/Documentation/devicetree/bindings/clock/qcom,mmcc.txt
+++ b/Documentation/devicetree/bindings/clock/qcom,mmcc.txt
@@ -10,11 +10,32 @@ Required properties :
"qcom,mmcc-msm8960"
"qcom,mmcc-msm8974"
"qcom,mmcc-msm8996"
+   "qcom,mmcc-msm8998"
 
 - reg : shall contain base register location and length
 - #clock-cells : shall contain 1
 - #reset-cells : shall contain 1
 
+For MSM8998 only:
+   - clocks: a list of phandles and clock-specifier pairs,
+ one for each entry in clock-names.
+   - clock-names: "xo" for the xo clock.
+  "gpll0" for the global pll 0 clock.
+  "dsi0dsi" for the dsi0 pll dsi clock (required if dsi is
+   enabled, optional otherwise).
+  "dsi0byte" for the dsi0 pll byte clock (required if dsi
+   is enabled, optional otherwise).
+  "dsi1dsi" for the dsi1 pll dsi clock (required if dsi is
+   enabled, optional otherwise).
+  "dsi1byte" for the dsi1 pll byte clock (required if dsi
+   is enabled, optional otherwise).
+  "hdmipll" for the hdmi pll clock (required if hdmi is
+   enabled, optional otherwise).
+  "dpvco" for the displayport pll vco clock (required if
+   dp is enabled, optional otherwise).
+  "dplink" for the displayport pll link clock (required if
+   dp is enabled, optional otherwise).
+
 Optional properties :
 - #power-domain-cells : shall contain 1
 
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

[PATCH v2 1/4] clk: qcom: smd: Add XO clock for MSM8998

2019-02-16 Thread Jeffrey Hugo

The XO clock generally feeds into other clock controllers as the parent
for a lot of clock generators.

Drop the "fake" XO clock in GCC now that it is redundant and will cause a
namespace conflict.

Signed-off-by: Jeffrey Hugo 
---
 drivers/clk/qcom/clk-smd-rpm.c | 24 
 drivers/clk/qcom/gcc-msm8998.c | 17 -
 2 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/drivers/clk/qcom/clk-smd-rpm.c b/drivers/clk/qcom/clk-smd-rpm.c
index 22dd42ad9223..55a622df3a68 100644
--- a/drivers/clk/qcom/clk-smd-rpm.c
+++ b/drivers/clk/qcom/clk-smd-rpm.c
@@ -68,7 +68,7 @@
}
 
 #define __DEFINE_CLK_SMD_RPM_BRANCH(_platform, _name, _active, type, r_id,\
-   stat_id, r, key)  \
+   stat_id, r, key, ignore_unused)   \
static struct clk_smd_rpm _platform##_##_active;  \
static struct clk_smd_rpm _platform##_##_name = { \
.rpm_res_type = (type),   \
@@ -83,6 +83,7 @@
.name = #_name,   \
.parent_names = (const char *[]){ "xo_board" },   \
.num_parents = 1, \
+   .flags = (ignore_unused) ? CLK_IGNORE_UNUSED : 0, \
},\
};\
static struct clk_smd_rpm _platform##_##_active = {   \
@@ -99,6 +100,7 @@
.name = #_active, \
.parent_names = (const char *[]){ "xo_board" },   \
.num_parents = 1, \
+   .flags = (ignore_unused) ? CLK_IGNORE_UNUSED : 0, \
},\
}
 
@@ -108,7 +110,17 @@
 
 #define DEFINE_CLK_SMD_RPM_BRANCH(_platform, _name, _active, type, r_id, r)   \
__DEFINE_CLK_SMD_RPM_BRANCH(_platform, _name, _active, type,  \
-   r_id, 0, r, QCOM_RPM_SMD_KEY_ENABLE)
+   r_id, 0, r, QCOM_RPM_SMD_KEY_ENABLE, false)
+
+/*
+ * Intended for XO clock where we don't want it turned off during late init
+ * if we don't have a consumer by then, but can turn it off later for deep
+ * sleep
+ */
+#define DEFINE_CLK_SMD_RPM_BRANCH_SKIP_UNUSED(_platform, _name, _active, type,\
+ r_id, r)\
+   __DEFINE_CLK_SMD_RPM_BRANCH(_platform, _name, _active, type,  \
+   r_id, 0, r, QCOM_RPM_SMD_KEY_ENABLE, true)
 
 #define DEFINE_CLK_SMD_RPM_QDSS(_platform, _name, _active, type, r_id)   \
__DEFINE_CLK_SMD_RPM(_platform, _name, _active, type, r_id,   \
@@ -117,12 +129,12 @@
 #define DEFINE_CLK_SMD_RPM_XO_BUFFER(_platform, _name, _active, r_id)\
__DEFINE_CLK_SMD_RPM_BRANCH(_platform, _name, _active,\
QCOM_SMD_RPM_CLK_BUF_A, r_id, 0, 1000,\
-   QCOM_RPM_KEY_SOFTWARE_ENABLE)
+   QCOM_RPM_KEY_SOFTWARE_ENABLE, false)
 
 #define DEFINE_CLK_SMD_RPM_XO_BUFFER_PINCTRL(_platform, _name, _active, r_id) \
__DEFINE_CLK_SMD_RPM_BRANCH(_platform, _name, _active,\
QCOM_SMD_RPM_CLK_BUF_A, r_id, 0, 1000,\
-   QCOM_RPM_KEY_PIN_CTRL_CLK_BUFFER_ENABLE_KEY)
+   QCOM_RPM_KEY_PIN_CTRL_CLK_BUFFER_ENABLE_KEY, false)
 
 #define to_clk_smd_rpm(_hw) container_of(_hw, struct clk_smd_rpm, hw)
 
@@ -656,6 +668,8 @@ static const struct rpm_smd_clk_desc rpm_clk_qcs404 = {
 };
 
 /* msm8998 */
+DEFINE_CLK_SMD_RPM_BRANCH_SKIP_UNUSED(msm8998, xo, xo_a, QCOM_SMD_RPM_MISC_CLK,
+ 0, 1920);
 DEFINE_CLK_SMD_RPM(msm8998, snoc_clk, snoc_a_clk, QCOM_SMD_RPM_BUS_CLK, 1);
 DEFINE_CLK_SMD_RPM(msm8998, cnoc_clk, cnoc_a_clk, QCOM_SMD_RPM_BUS_CLK, 2);
 DEFINE_CLK_SMD_RPM(msm8998, ce1_clk, ce1_a_clk, QCOM_SMD_RPM_CE_CLK, 0);
@@ -678,6 +692,8 @@ DEFINE_CLK_SMD_RPM_XO_BUFFER_PINCTRL(msm8998, rf_clk2_pin, rf_clk2_a_pin, 5);
 DEFINE_CLK_SMD_RPM_XO_BUFFER(msm8998, rf_clk3, rf_clk3_a, 6);
 DEFINE_CLK_SMD_RPM_XO_BUFFER_PINCTRL(msm8998, rf_clk3_pin, rf_clk3_a_pin, 6);
 static struct clk_smd_rpm *msm8998_clks[] = {
+   [RPM_SMD_XO_CLK_SRC] = &msm8998_xo,
+   [RPM_SMD_XO_A_CLK_SRC] = &msm8998_xo_a,
[RPM_SMD_SNOC_CLK] = &msm8998_snoc_clk,
[RPM_SMD_SNOC_A_CLK] = &msm8998_snoc_a_clk,
[RPM_SMD_CNOC_CLK] = &msm8998_cnoc_clk,
diff --git a/drivers/clk/qcom/gcc-msm8998.c b/drivers/clk/qcom/gcc-msm8998.c
index c240fba794c7..b98e76cdc56a 100644
--- a/drivers/clk/qcom/gc

[PATCH v2 0/4] MSM8998 Multimedia Clock Controller

2019-02-16 Thread Jeffrey Hugo

The multimedia clock controller (mmcc) is the main clock controller for
the multimedia subsystem and is required to enable things like display and
camera.

Based upon the "Rewrite clk parent handling" series [1] to simplify handling
of the external clocks that feed into the MMCC.

Assumes the "orphan probe defer" change [2] so that exposing XO via the RPM is
safe for the UART in case of late probe per review comments.

Assumes the common clk_hw registration is in place from "Make common clk_hw
registrations" [3]

[1] https://lore.kernel.org/lkml/20190129061021.94775-1-sb...@kernel.org/T/#u
[2] https://lkml.org/lkml/2019/2/11/1895
[3] https://lkml.org/lkml/2019/2/10/132

v2:
-Rebased on the "Rewrite clk parent handling" series and updated to the clk init
mechanisms introduced there.
-Marked XO clk as CLK_IGNORE_UNUSED to avoid the concern about the XO going away
"incorrectly" during late init
-Corrected the name of the XO clock to "xo"
-Dropped the fake XO clock in GCC to prevent a namespace conflict
-Fully enumerated the external clocks (DSI PLLs, etc) in the DT binding
-Cleaned up the weird newlines in the added DT node
-Added DT header file to msm8998 DT for future clients

Jeffrey Hugo (4):
  clk: qcom: smd: Add XO clock for MSM8998
  dt-bindings: clock: Add support for the MSM8998 mmcc
  clk: qcom: Add MSM8998 Multimedia Clock Controller (MMCC) driver
  arm64: dts: qcom: msm8998: Add mmcc node

 .../devicetree/bindings/clock/qcom,mmcc.txt|7 +
 arch/arm64/boot/dts/qcom/msm8998.dtsi  |   15 +
 drivers/clk/qcom/Kconfig   |9 +
 drivers/clk/qcom/Makefile  |1 +
 drivers/clk/qcom/clk-smd-rpm.c |4 +
 drivers/clk/qcom/mmcc-msm8998.c| 2937 
 include/dt-bindings/clock/qcom,mmcc-msm8998.h  |  210 ++
 7 files changed, 3183 insertions(+)
 create mode 100644 drivers/clk/qcom/mmcc-msm8998.c
 create mode 100644 include/dt-bindings/clock/qcom,mmcc-msm8998.h

-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Al Viro

On Sun, Feb 17, 2019 at 03:41:21AM +, Arthur Gautier wrote:
> On Sat, Feb 16, 2019 at 11:47:02PM +, Al Viro wrote:
> > On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:
> > 
> > > What is the actual problem?  We’re not actually demand-faulting this 
> > > data, are we?  Are we just overrunning the buffer because the from_user 
> > > helpers are too clever?  Can we fix it for real by having the fancy 
> > > helpers do *aligned* loads so that they don’t overrun the buffer?  Heck, 
> > > this might be faster, too.
> > 
> > Unaligned _stores_ are not any cheaper, and you'd get one hell of
> > extra arithmetics from trying to avoid both.  Check something
> > like e.g. memcpy() on alpha, where you really have to keep all
> > accesses aligned, both on load and on store side.
> > 
> > Can't we just pad the buffers a bit?  Making sure that name_buf
> > and symlink_buf are _not_ followed by unmapped pages shouldn't
> > be hard.  Both are allocated by kmalloc(), so...
> 
> We cannot change alignment rules here. The input buffer string we're
> reading is coming from a cpio-formatted file and the format is
> defined by cpio(5).
> Nothing much we can do there I'm afraid. Input buffer is defined to
> be 4-byte aligned.

Who says anything about changing the format of the file?  At least
one trivial way to handle that would be this:

diff --git a/init/initramfs.c b/init/initramfs.c
index 7cea802d00ef..edbddfb73106 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -265,8 +265,12 @@ static int __init do_header(void)
state = Collect;
return 0;
}
-   if (S_ISREG(mode) || !body_len)
-   read_into(name_buf, N_ALIGN(name_len), GotName);
+   if (S_ISREG(mode) || !body_len) {
+   collect = collected = name_buf;
+   remains = N_ALIGN(name_len);
+   next_state = GotName;
+   state = Collect;
+   }
return 0;
 }
 
Another would be to have the buffer passed to flush_buffer() (i.e.
the callback of decompress_fn) allocated with 4 bytes of padding
past the part where the unpacked piece of data is placed for the
callback to find.  As in,

diff --git a/lib/decompress_inflate.c b/lib/decompress_inflate.c
index 63b4b7eee138..ca3f7ecc9b35 100644
--- a/lib/decompress_inflate.c
+++ b/lib/decompress_inflate.c
@@ -48,7 +48,7 @@ STATIC int INIT __gunzip(unsigned char *buf, long len,
rc = -1;
if (flush) {
out_len = 0x8000; /* 32 K */
-   out_buf = malloc(out_len);
+   out_buf = malloc(out_len + 4);
} else {
if (!out_len)
out_len = ((size_t)~0) - (size_t)out_buf; /* no limit */

for gunzip/decompress and similar ones for bzip2, etc.  The contents
layout doesn't have anything to do with that...
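
As a rough illustration of where a 4-byte pad comes from, the arithmetic can be sketched in userspace C. The 8-byte load width is an assumption made here (the thread does not state it), and worst_case_overrun() and alloc_padded() are hypothetical helpers, not anything in the tree.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Worst-case number of bytes read past the end of a len-byte buffer when
 * word-sized loads may start at any align-aligned offset inside it.
 */
static size_t worst_case_overrun(size_t len, size_t word, size_t align)
{
	size_t last_start, end;

	if (len == 0)
		return 0;
	last_start = ((len - 1) / align) * align;	/* last legal start */
	end = last_start + word;			/* one past the load */
	return end > len ? end - len : 0;
}

/* Allocate len usable bytes plus enough slack to absorb that overrun. */
static void *alloc_padded(size_t len, size_t word, size_t align)
{
	return malloc(len + worst_case_overrun(len, word, align));
}
```

For a 32 KiB output buffer, 8-byte loads and 4-byte-aligned strings, the helper reports an overrun of 4 bytes, matching the malloc(out_len + 4) in the second diff above.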

[PATCH v3 0/2] Work around for Hisilicon CPPC cpufreq

2019-02-16 Thread Xiongfeng Wang

Hisilicon chips do not support the delivered performance counter register
or the reference performance counter register, but the platform can
calculate the real performance using its own method. This series provides
a workaround for this problem, and other platforms can reuse the same
workaround framework. We reuse the desired performance register to
store the real performance calculated by the platform. After the
platform finishes adjusting the frequency, it reads the real performance
and writes it into the desired performance register. The OS can then use
that value to calculate the real frequency.
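
The last step, turning the value read back from the desired performance register into a frequency, might look roughly like this in isolation. It is a simplified sketch that assumes frequency scales linearly with performance and that highest_perf maps to max_khz; perf_to_khz() is a hypothetical stand-in, and the driver's own cppc_cpufreq_perf_to_khz() may use different reference points.

```c
#include <stdint.h>

/*
 * Scale an abstract CPPC performance value to kHz, assuming frequency
 * grows linearly with performance and that highest_perf corresponds to
 * max_khz. Returns 0 when highest_perf is 0 to avoid dividing by zero.
 */
static uint64_t perf_to_khz(uint64_t perf, uint64_t highest_perf,
			    uint64_t max_khz)
{
	if (highest_perf == 0)
		return 0;
	return perf * max_khz / highest_perf;
}
```

Under those assumptions, a platform that writes back half of highest_perf is reported at half of max_khz.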

Changelog:

v2 -> v3:
Reconstruct 'cppc_get_desired_perf'.
Rename 'cppc_workaround_info' to 'cppc_workaround_oem_info'.
Drop the 'get_rate' member from struct cppc_workaround_oem_info.
Change the 'cppc_wa_get_rate' pointer to a flag that indicates whether
we need to apply the workaround.
Move the new functions to the beginning of cppc_cpufreq.c.


Xiongfeng Wang (2):
  ACPI / CPPC: Add a helper to get desired performance
  cpufreq / cppc: Work around for Hisilicon CPPC cpufreq

 drivers/acpi/cppc_acpi.c   | 39 +
 drivers/cpufreq/cppc_cpufreq.c | 66 ++
 include/acpi/cppc_acpi.h   |  1 +
 3 files changed, 106 insertions(+)

-- 
1.7.12.4

[PATCH v3 2/2] cpufreq / cppc: Work around for Hisilicon CPPC cpufreq

2019-02-16 Thread Xiongfeng Wang

Hisilicon chips do not support the delivered performance counter register
or the reference performance counter register, but the platform can
calculate the real performance using its own method. We reuse the
desired performance register to store the real performance calculated by
the platform. After the platform finishes adjusting the frequency, it reads
the real performance and writes it into the desired performance register.
The OS can then use that value to calculate the real frequency.

Signed-off-by: Xiongfeng Wang 
---
 drivers/cpufreq/cppc_cpufreq.c | 66 ++
 1 file changed, 66 insertions(+)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index fd25c21c..efc0298 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -42,6 +42,67 @@
  */
 static struct cppc_cpudata **all_cpu_data;
 
+struct cppc_workaround_oem_info {
+   char oem_id[ACPI_OEM_ID_SIZE +1];
+   char oem_table_id[ACPI_OEM_TABLE_ID_SIZE + 1];
+   u32 oem_revision;
+};
+
+static bool apply_hisi_workaround;
+
+static struct cppc_workaround_oem_info wa_info[] = {
+   {
+   .oem_id = "HISI  ",
+   .oem_table_id   = "HIP07   ",
+   .oem_revision   = 0,
+   }, {
+   .oem_id = "HISI  ",
+   .oem_table_id   = "HIP08   ",
+   .oem_revision   = 0,
+   }
+};
+
+static unsigned int cppc_cpufreq_perf_to_khz(struct cppc_cpudata *cpu,
+   unsigned int perf);
+
+/*
+ * HISI platform does not support delivered performance counter and
+ * reference performance counter. It can calculate the performance using the
+ * platform specific mechanism. We reuse the desired performance register to
+ * store the real performance calculated by the platform.
+ */
+static unsigned int hisi_cppc_cpufreq_get_rate(unsigned int cpunum)
+{
+   struct cppc_cpudata *cpudata = all_cpu_data[cpunum];
+   u64 desired_perf;
+   int ret;
+
+   ret = cppc_get_desired_perf(cpunum, &desired_perf);
+   if (ret < 0)
+   return -EIO;
+
+   return cppc_cpufreq_perf_to_khz(cpudata, desired_perf);
+}
+
+static void cppc_check_hisi_workaround(void)
+{
+   struct acpi_table_header *tbl;
+   acpi_status status = AE_OK;
+   int i;
+
+   status = acpi_get_table(ACPI_SIG_PCCT, 0, &tbl);
+   if (ACPI_FAILURE(status) || !tbl)
+   return;
+
+   for (i = 0; i < ARRAY_SIZE(wa_info); i++) {
+   if (!memcmp(wa_info[i].oem_id, tbl->oem_id, ACPI_OEM_ID_SIZE) &&
+   !memcmp(wa_info[i].oem_table_id, tbl->oem_table_id, ACPI_OEM_TABLE_ID_SIZE) &&
+   wa_info[i].oem_revision == tbl->oem_revision) {
+   apply_hisi_workaround = true;
+   }
+   }
+}
+
 /* Callback function used to retrieve the max frequency from DMI */
 static void cppc_find_dmi_mhz(const struct dmi_header *dm, void *private)
 {
@@ -334,6 +395,9 @@ static unsigned int cppc_cpufreq_get_rate(unsigned int cpunum)
struct cppc_cpudata *cpu = all_cpu_data[cpunum];
int ret;
 
+   if (apply_hisi_workaround)
+   return hisi_cppc_cpufreq_get_rate(cpunum);
+
ret = cppc_get_perf_ctrs(cpunum, &fb_ctrs_t0);
if (ret)
return ret;
@@ -386,6 +450,8 @@ static int __init cppc_cpufreq_init(void)
goto out;
}
 
+   cppc_check_hisi_workaround();
+
ret = cpufreq_register_driver(&cppc_cpufreq_driver);
if (ret)
goto out;
-- 
1.7.12.4

[PATCH v3 1/2] ACPI / CPPC: Add a helper to get desired performance

2019-02-16 Thread Xiongfeng Wang

This patch adds a helper to get the value of the desired performance
register.

Signed-off-by: Xiongfeng Wang 
---
 drivers/acpi/cppc_acpi.c | 39 +++
 include/acpi/cppc_acpi.h |  1 +
 2 files changed, 40 insertions(+)

diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 217a782..231d447 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -1051,6 +1051,45 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
 }
 
 /**
+ * cppc_get_desired_perf - Get the value of desired performance register.
+ * @cpunum: CPU from which to get desired performance.
+ * @desired_perf: address of a variable to store the returned desired performance
+ *
+ * Return: 0 for success, -EIO otherwise.
+ */
+int cppc_get_desired_perf(int cpunum, u64 *desired_perf)
+{
+   struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum);
+   int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum);
+   struct cpc_register_resource *desired_reg;
+   struct cppc_pcc_data *pcc_ss_data = NULL;
+
+   desired_reg = &cpc_desc->cpc_regs[DESIRED_PERF];
+
+   if (CPC_IN_PCC(desired_reg)) {
+   int ret = 0;
+
+   if (pcc_ss_id < 0)
+   return -EIO;
+   pcc_ss_data = pcc_data[pcc_ss_id];
+
+   down_write(&pcc_ss_data->pcc_lock);
+   if (send_pcc_cmd(pcc_ss_id, CMD_READ) >= 0)
+   cpc_read(cpunum, desired_reg, desired_perf);
+   else
+   ret = -EIO;
+   up_write(&pcc_ss_data->pcc_lock);
+
+   return ret;
+   }
+
+   cpc_read(cpunum, desired_reg, desired_perf);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(cppc_get_desired_perf);
+
+/**
  * cppc_get_perf_caps - Get a CPUs performance capabilities.
  * @cpunum: CPU from which to get capabilities info.
  * @perf_caps: ptr to cppc_perf_caps. See cppc_acpi.h
diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
index 4f34734..ba6fd72 100644
--- a/include/acpi/cppc_acpi.h
+++ b/include/acpi/cppc_acpi.h
@@ -137,6 +137,7 @@ struct cppc_cpudata {
cpumask_var_t shared_cpu_map;
 };
 
+extern int cppc_get_desired_perf(int cpunum, u64 *desired_perf);
 extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
 extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
 extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
-- 
1.7.12.4

Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Arthur Gautier

On Sat, Feb 16, 2019 at 11:47:02PM +, Al Viro wrote:
> On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:
> 
> > What is the actual problem?  We’re not actually demand-faulting this data, 
> > are we?  Are we just overrunning the buffer because the from_user helpers 
> > are too clever?  Can we fix it for real by having the fancy helpers do 
> > *aligned* loads so that they don’t overrun the buffer?  Heck, this might be 
> > faster, too.
> 
> Unaligned _stores_ are not any cheaper, and you'd get one hell of
> extra arithmetics from trying to avoid both.  Check something
> like e.g. memcpy() on alpha, where you really have to keep all
> accesses aligned, both on load and on store side.
> 
> Can't we just pad the buffers a bit?  Making sure that name_buf
> and symlink_buf are _not_ followed by unmapped pages shouldn't
> be hard.  Both are allocated by kmalloc(), so...

We cannot change alignment rules here. The input buffer string we're
reading is coming from a cpio-formatted file and the format is
defined by cpio(5).
Nothing much we can do there I'm afraid. Input buffer is defined to
be 4-byte aligned.

-- 
\o/ Arthur
 G  Gandi.net

Re: [LSF/MM TOPIC] Discuss least bad options for resolving longterm-GUP usage by RDMA

2019-02-16 Thread Christopher Lameter

On Fri, 15 Feb 2019, Ira Weiny wrote:

> > > > for filesystems and processes.  The only problems come in for the things
> > > > which bypass the page cache like O_DIRECT and DAX.
> > >
> > > It makes a lot of sense since the filesystems play COW etc games with the
> > > pages and RDMA is very much like O_DIRECT in that the pages are modified
> > > directly under I/O. It also bypasses the page cache in case you have
> > > not noticed yet.
> >
> > It is quite different, O_DIRECT modifies the physical blocks on the
> > storage, bypassing the memory copy.
> >
>
> Really?  I thought O_DIRECT allowed the block drivers to write to/from user
> space buffers.  But the _storage_ was still under the control of the block
> drivers?

It depends on what you see as the modification target. O_DIRECT uses
memory as a target and source like RDMA. The block device is at the other
end of the handling.

> > RDMA modifies the memory copy.
> >
> > pages are necessary to do RDMA, and those pages have to be flushed to
> > disk.. So I'm not seeing how it can be disconnected from the page
> > cache?
>
> I don't disagree with this.

RDMA does direct access to memory. If that memory is an mmap of a regular
block device then we have a problem (this has not been a standard use case
to my knowledge). The semantics are simply different. RDMA expects memory
to be pinned and to always be readable and writable. The block
device/filesystem expects memory access to be controllable via the page
permissions. In particular, it must be possible to stop access to the page.

This is fundamentally incompatible. RDMA access to such an mmapped section
must preserve the RDMA semantics while the pinning is done and can only
provide the access control after RDMA is finished. Pages in the RDMA range
cannot be handled like normal page cache pages.

This is in particular evident in the DAX case in which we have direct pass
through even to the storage medium. And in this case write through can
replace the page cache.

Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Andy Lutomirski




> On Feb 16, 2019, at 3:47 PM, Al Viro  wrote:
> 
>> On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:
>> 
>> What is the actual problem?  We’re not actually demand-faulting this data, 
>> are we?  Are we just overrunning the buffer because the from_user helpers 
>> are too clever?  Can we fix it for real by having the fancy helpers do 
>> *aligned* loads so that they don’t overrun the buffer?  Heck, this might be 
>> faster, too.
> 
> Unaligned _stores_ are not any cheaper, and you'd get one hell of
> extra arithmetics from trying to avoid both.  Check something
> like e.g. memcpy() on alpha, where you really have to keep all
> accesses aligned, both on load and on store side.

I think we should avoid unaligned loads and do unaligned stores instead.

I would generally expect that unaligned stores are a bit cheaper since they don’t 
need to complete for the comparisons to happen.

But I maintain my claim that this code should not be overrunning its input 
buffer into the next page, since it could have observable side effects.

> 
> Can't we just pad the buffers a bit?  Making sure that name_buf
> and symlink_buf are _not_ followed by unmapped pages shouldn't
> be hard.  Both are allocated by kmalloc(), so...
> 
> What am I missing here?

Re: [net-next, PATCH v2] net: stmmac: use correct define to get rx timestamp on GMAC4

2019-02-16 Thread David Miller

From: Alexandre Torgue 
Date: Thu, 14 Feb 2019 17:03:44 +0100

> In dwmac4_wrback_get_rx_timestamp_status we are looking for an RX timestamp.
> For that, receive descriptors are handled, so we should use defines
> related to receive descriptors. It does not change the functional behavior
> as RDES3_RDES1_VALID=TDES3_RS1V=BIT(26) but it makes code easier to read.
> 
> Signed-off-by: Alexandre Torgue 

Applied.

Re: [PATCH v3 6/6] x86/mm/KASLR: Do not adapt the size of the direct mapping section for SGI UV system

2019-02-16 Thread Baoquan He

Hi Mike,

On 02/16/19 at 10:00pm, Baoquan He wrote:
> On SGI UV system, kernel often hangs when KASLR is enabled. Disabling
> KASLR makes kernel work well.

I wrapped the code which calculates the size of the direct mapping section
into a new function calc_direct_mapping_size() as Ingo suggested. This
code change has passed basic testing, but hasn't been tested on an
SGI UV machine after reproducing the bug, since that needs a UV machine
with a UV module of sufficient size installed.

To reproduce it, we can apply patches 0001~0005. If reproduced, patch
0006 can be applied on top to check if the bug is fixed. Please help check
if the code is OK; if you have a machine, I can run a test on it.

Thanks
Baoquan

> 
> The back trace is:
> 
> kernel BUG at arch/x86/mm/init_64.c:311!
> invalid opcode:  [#1] SMP
> [...]
> RIP: 0010:__init_extra_mapping+0x188/0x196
> [...]
> Call Trace:
>  init_extra_mapping_uc+0x13/0x15
>  map_high+0x67/0x75
>  map_mmioh_high_uv3+0x20a/0x219
>  uv_system_init_hub+0x12d9/0x1496
>  uv_system_init+0x27/0x29
>  native_smp_prepare_cpus+0x28d/0x2d8
>  kernel_init_freeable+0xdd/0x253
>  ? rest_init+0x80/0x80
>  kernel_init+0xe/0x110
>  ret_from_fork+0x2c/0x40
> 
> This is because the SGI UV system need map its MMIOH region to the direct
> mapping section, and the mapping happens in rest_init() which is much
> later than the calling of kernel_randomize_memory() to do mm KASLR. So
> mm KASLR can't count in the size of the MMIOH region when calculate the
> needed size of address space for the direct mapping section.
> 
> When KASLR is disabled, there are 64TB address space for both system RAM
> and the MMIOH regions to share. When KASLR is enabled, the current code
> of mm KASLR only reserves the actual size of system RAM plus extra 10TB
> for the direct mapping. Thus later the MMIOH mapping could go beyond
> the upper bound of the direct mapping to step into VMALLOC or VMEMMAP area.
> Then BUG_ON() in __init_extra_mapping() will be triggered.
> 
> E.g on the SGI UV3 machine where this bug was reported , there are two
> MMIOH regions:
> 
> [1.519001] UV: Map MMIOH0_HI 0xffc - 0x1000
> [1.523001] UV: Map MMIOH1_HI 0x1000 - 0x2000
> 
> They are [16TB-16G, 16TB) and [16TB, 32TB). On this machine, 512G RAM are
> spread out to 1TB regions. Then above two SGI MMIOH regions also will be
> mapped into the direct mapping section.
> 
> To fix it, we need check if it's SGI UV system by calling
> is_early_uv_system() in kernel_randomize_memory(). If yes, do not adapt
> the size of the direct mapping section, just keep it as is, e.g in level-4
> paging mode, 64TB.
> 
> Signed-off-by: Baoquan He 
> ---
>  arch/x86/mm/kaslr.c | 57 +
>  1 file changed, 42 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> index ca12ed4e5239..754b5da91d43 100644
> --- a/arch/x86/mm/kaslr.c
> +++ b/arch/x86/mm/kaslr.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "mm_internal.h"
>  
> @@ -113,15 +114,51 @@ static inline bool kaslr_memory_enabled(void)
>   return kaslr_enabled() && !IS_ENABLED(CONFIG_KASAN);
>  }
>  
> +/*
> + * Even though a huge virtual address space is reserved for the direct
> + * mapping of physical memory, e.g in 4-level paging mode, it's 64TB,
> + * rare system can own enough physical memory to use it up, most are
> + * even less than 1TB. So with KASLR enabled, we adapt the size of
> + * direct mapping area to size of actual physical memory plus the
> + * configured padding CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING.
> + * The left part will be taken out to join memory randomization.
> + *
> + * Note that UV system is an exception, its MMIOH region need be mapped
> + * into the direct mapping area too, while the size can't be got until
> + * rest_init() calling. Hence for UV system, do not adapt the size
> + * of direct mapping area.
> + */
> +static inline unsigned long calc_direct_mapping_size(void)
> +{
> + unsigned long size_tb, memory_tb;
> +
> + /*
> +  * Update Physical memory mapping to available and
> +  * add padding if needed (especially for memory hotplug support).
> +  */
> + memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
> + CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
> +
> + size_tb = 1 << (MAX_PHYSMEM_BITS - TB_SHIFT);
> +
> + /*
> +  * Adapt physical memory region size based on available memory if
> +  * it's not UV system.
> +  */
> + if (memory_tb < size_tb && !is_early_uv_system())
> + size_tb = memory_tb;
> +
> + return size_tb;
> +}
> +
>  /* Initialize base and padding for each memory region randomized with KASLR */
>  void __init kernel_randomize_memory(void)
>  {
> - size_t i;
> - unsigned long vaddr_start, vaddr;
> - unsigned long rand, memory_tb;
> - struct rnd_state rand_state;
> + unsigned long vaddr_start, vaddr, rand;
>
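
For readers following the arithmetic in the quoted patch, here is a minimal
userspace sketch of the size calculation. It is not the kernel code: the
sketch_* names and the constants (4 KiB pages, TB shift of 40, 46 physical
address bits for 4-level paging, a 64-bit unsigned long) are assumptions
chosen to match the numbers given in the commit message.

```c
#include <stdbool.h>

/* Assumed values, chosen to match the commit message (4-level paging). */
#define SKETCH_PAGE_SHIFT        12   /* 4 KiB pages */
#define SKETCH_TB_SHIFT          40   /* 1 TB = 2^40 bytes */
#define SKETCH_MAX_PHYSMEM_BITS  46   /* 2^46 bytes = 64 TB direct mapping */

/* Round-up division, in the spirit of the kernel's DIV_ROUND_UP(). */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical stand-in for calc_direct_mapping_size(): returns the size
 * of the direct mapping section in TB. max_pfn, padding_tb and is_uv are
 * passed in explicitly instead of being read from kernel state.
 */
static unsigned long sketch_direct_mapping_size_tb(unsigned long max_pfn,
                                                   unsigned long padding_tb,
                                                   bool is_uv)
{
    unsigned long memory_tb, size_tb;

    /* Actual RAM rounded up to whole TB, plus the configured padding. */
    memory_tb = SKETCH_DIV_ROUND_UP(max_pfn << SKETCH_PAGE_SHIFT,
                                    1UL << SKETCH_TB_SHIFT) + padding_tb;

    /* Full size of the direct mapping section: 64 TB with these values. */
    size_tb = 1UL << (SKETCH_MAX_PHYSMEM_BITS - SKETCH_TB_SHIFT);

    /* Shrink only on non-UV systems; UV keeps the full section. */
    if (memory_tb < size_tb && !is_uv)
        size_tb = memory_tb;

    return size_tb;
}
```

With RAM spread out to 1TB (max_pfn = 1UL << 28) and 10TB of padding, the
non-UV result is 11TB, which is below the 32TB upper end of the MMIOH1
region from the report. With is_uv set the section stays at 64TB.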

ptrace() with multithreaded tracer

2019-02-16 Thread Niklas Hambüchen

Hello,

it would be awesome if somebody in the know could confirm or refute a suspicion 
on ptrace() that we have.

The man page says:

Attachment and subsequent commands are per thread:
in a multi-threaded process, every thread can be individually attached to a
(potentially different) tracer, or left not attached and thus not debugged.
Therefore, "tracee" always means "(one) thread", never "a (possibly
multithreaded) process".

While the first sentence "Attachment ... [is] per thread" is quite general, the 
rest talks only about the multi-threadedness of the *tracee*.

What about multithreaded *tracers*?

We suspect (and observe program behaviour that supports this) that having one 
thread pA_t1 in a process A become the tracer of some tracee thread pB_t1, and 
then a different thread of A, pA_t2 running a `ptrace(pB_t1, ...)` is illegal 
and results in `ESRCH`.

Is this statement true in general, or are there nuances?

Thanks,
Niklas


PS: We'd be happy to contribute these details to the man page based on an 
answer :)

Re: [GIT PULL] auxdisplay for v5.0-rc7

2019-02-16 Thread pr-tracker-bot

The pull request you sent on Fri, 15 Feb 2019 20:22:26 +0100:

> https://github.com/ojeda/linux.git tags/auxdisplay-for-linus-v5.0-rc7

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/9a7dcde4a661ccad2b641e873b15ce26bf302c4e

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] more nfsd fixes for 5.0

2019-02-16 Thread pr-tracker-bot

The pull request you sent on Fri, 15 Feb 2019 17:01:02 -0500:

> git://linux-nfs.org/~bfields/linux.git tags/nfsd-5.0-2

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/88fe73cb804abc3d209a06f6221a7108d89ff04f

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] Please pull additional NFS client fixes for 5.0

2019-02-16 Thread pr-tracker-bot

The pull request you sent on Fri, 15 Feb 2019 16:47:50 -0500:

> git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-5.0-4

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/55638c520bb7b92999b6f0867ba135b6aeafc8d7

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] Compiler Attributes for v5.0-rc7

2019-02-16 Thread pr-tracker-bot

The pull request you sent on Fri, 15 Feb 2019 20:10:59 +0100:

> https://github.com/ojeda/linux.git tags/compiler-attributes-for-linus-v5.0-rc7

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/0b999ae3614d09d97a1575936bcee884f912b10e

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] ARM: SoC fixes for 5.0

2019-02-16 Thread pr-tracker-bot

The pull request you sent on Sat, 16 Feb 2019 21:20:48 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git tags/armsoc-fixes

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/64c0133eb88a3b0c11c42580a520fe78b71b3932

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [RFC PATCH v1 04/25] printk-rb: add writer interface

2019-02-16 Thread John Ogness

Hi Petr,

I've made changes to the patch that hopefully align with what you are
looking for. I would appreciate it if you could go over it and see if
the changes are in the right direction. And if so, you should decide
whether I should make these kinds of changes for the whole series and
submit a v2 before you continue with the review.

The list of changes:

- Added comments everywhere I think they could be useful. Is it too
  much?

- Renamed struct prb_handle to prb_reserved_entry (more appropriate).

- Fixed up macros as you requested.

- The implementation from prb_commit() has been moved to a new
  prb_commit_all_reserved(). This should resolve the confusion in the
  "failed to push_tail()" code.

- I tried moving calc_next() into prb_reserve(), but it was pure
  insanity. I played with refactoring for a while until I found
  something that I think looks nice. I moved the implementation of
  calc_next() along with its containing loop into a new function
  find_res_ptrs(). This function does what calc_next() and push_tail()
  did. With this solution, I think prb_reserve() looks pretty
  clean. However, the optimization of communicating about the wrap is
  gone. So even though find_res_ptrs() knew if a wrap occurred,
  prb_reserve() figures it out again for itself. If we want the
  optimization, I still think the best approach is the -1,0,1 return
  value of find_res_ptrs().

I'm looking forward to your response.

John Ogness


diff --git a/include/linux/printk_ringbuffer.h b/include/linux/printk_ringbuffer.h
index 4239dc86e029..ab6177c9fe0a 100644
--- a/include/linux/printk_ringbuffer.h
+++ b/include/linux/printk_ringbuffer.h
@@ -25,6 +25,23 @@ struct printk_ringbuffer {
    atomic_t ctx;
 };
 
+/*
+ * struct prb_reserved_entry: Reserved but not yet committed entry.
+ * @rb: The printk_ringbuffer where the entry was reserved.
+ *
+ * This is a handle used by the writer to represent an entry that has been
+ * reserved but not yet committed.
+ *
+ * The structure does not actually store any information about the entry that
+ * has been reserved because this information is not required by the
+ * implementation. The struct could prove useful if extra tracking or even
+ * fundamental changes to the ringbuffer were to be implemented. And as such
+ * would not require changes to writers.
+ */
+struct prb_reserved_entry {
+   struct printk_ringbuffer *rb;
+};
+
 #define DECLARE_STATIC_PRINTKRB_CPULOCK(name)  \
 static struct prb_cpulock name = { \
.owner  = ATOMIC_INIT(-1),  \
@@ -46,6 +63,11 @@ static struct printk_ringbuffer name = { \
.ctx= ATOMIC_INIT(0),   \
 }
 
+/* writer interface */
+char *prb_reserve(struct prb_reserved_entry *e, struct printk_ringbuffer *rb,
+ unsigned int size);
+void prb_commit(struct prb_reserved_entry *e);
+
 /* utility functions */
 void prb_lock(struct prb_cpulock *cpu_lock);
 void prb_unlock(struct prb_cpulock *cpu_lock);
diff --git a/lib/printk_ringbuffer.c b/lib/printk_ringbuffer.c
index 54c750092810..fbe1d92b9b60 100644
--- a/lib/printk_ringbuffer.c
+++ b/lib/printk_ringbuffer.c
@@ -2,6 +2,59 @@
 #include 
 #include 
 
+/*
+ * struct prb_entry: An entry within the ringbuffer.
+ * @size: The size in bytes of the entry or -1 if terminating.
+ * @seq: The unique sequence number of the entry.
+ * @data: The data bytes of the entry.
+ *
+ * The struct is typecasted directly into the ringbuffer data array to access
+ * an entry. The @size specifies the complete size of the entry including any
+ * padding. The next entry will be located at &this_entry + this_entry.size.
+ * The only exception is if the entry is terminating (size = -1). In this case
+ * @seq and @data are invalid and the next entry is at the beginning of the
+ * ringbuffer data array.
+ */
+struct prb_entry {
+   unsigned int size;
+   u64 seq;
+   char data[0];
+};
+
+/* the size and size bitmask of the ringbuffer data array */
+#define PRB_SIZE(rb) (1L << rb->size_bits)
+#define PRB_SIZE_BITMASK(rb) (PRB_SIZE(rb) - 1)
+
+/* given a logical position, return its index in the ringbuffer data array */
+#define PRB_INDEX(rb, lpos) (lpos & PRB_SIZE_BITMASK(rb))
+
+/*
+ * given a logical position, return how many times the data buffer has
+ * wrapped, where logical position 0 begins at index 0 with no wraps
+ */
+#define PRB_WRAPS(rb, lpos) (lpos >> rb->size_bits)
+
+/*
+ * given a logical position, return the logical position that represents the
+ * beginning of the ringbuffer data array for this wrap
+ */
+#define PRB_THIS_WRAP_START_LPOS(rb, lpos) \
+   (PRB_WRAPS(rb, lpos) << rb->size_bits)
+
+/*
+ * given a logical position, return the logical position that represents the
+ * beginning of the ringbuffer data array for the next wrap
+ */
+#define PR
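
The logical-position helpers in the quoted diff are easy to check by hand.
Below is a minimal userspace sketch of the same arithmetic, written as
functions rather than macros so it can be compiled on its own. The sketch_*
names are illustrative and not part of the patch, and size_bits is passed
directly instead of being read from struct printk_ringbuffer.

```c
/* Size in bytes of a data array with 2^size_bits entries. */
static unsigned long sketch_size(unsigned int size_bits)
{
    return 1UL << size_bits;
}

/* Index into the data array for a logical position (cf. PRB_INDEX). */
static unsigned long sketch_index(unsigned int size_bits, unsigned long lpos)
{
    return lpos & (sketch_size(size_bits) - 1);
}

/* How many times the data array has wrapped (cf. PRB_WRAPS). */
static unsigned long sketch_wraps(unsigned int size_bits, unsigned long lpos)
{
    return lpos >> size_bits;
}

/*
 * Logical position of the start of the data array for the wrap that
 * lpos belongs to (cf. PRB_THIS_WRAP_START_LPOS).
 */
static unsigned long sketch_this_wrap_start(unsigned int size_bits,
                                            unsigned long lpos)
{
    return sketch_wraps(size_bits, lpos) << size_bits;
}
```

For a 16-byte array (size_bits = 4), logical position 37 maps to index 5,
lies in wrap 2, and that wrap begins at logical position 32.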

Re: [RFC PATCH 09/27] vfs: Allow mounting to other namespaces

2019-02-16 Thread Al Viro

On Fri, Feb 15, 2019 at 04:08:46PM +, David Howells wrote:
> Currently sys_move_mount() and sys_mount(MS_MOVE) prevent the caller from
> moving a mount into a namespace not their own.  Relax this such that any
> mount can be mounted onto any given mountpoint provided that the source
> mount is either detached or the same namespace as the destination.
> 
> This permits container namespaces to be built from the outside rather than
> from the inside.

I'm looking forward to your analysis of security implications, as well as
the proof that attach_recursive_mnt() won't get confused by that...

Re: [RFC PATCH 08/27] containers, vfs: Honour CONTAINER_NEW_EMPTY_FS_NS

2019-02-16 Thread Al Viro

On Fri, Feb 15, 2019 at 04:08:29PM +, David Howells wrote:

> + mnt_ns = alloc_mnt_ns(container->cred->user_ns, false);
> + if (IS_ERR(mnt_ns)) {
> + ret = PTR_ERR(mnt_ns);
> + goto out_fd;
> + }
> +
> + mnt = real_mount(path->mnt);
> + mnt_add_count(mnt, 1);
> + mnt->mnt_ns = mnt_ns;
> + mnt_ns->root = mnt;
> + mnt_ns->mounts++;
> + list_add(&mnt->mnt_list, &mnt_ns->list);
> +
> + ret = -EBUSY;
> + spin_lock(&container->lock);
> + if (!container->ns->mnt_ns) {
> + container->ns->mnt_ns = mnt_ns;
> + write_seqcount_begin(&container->seq);
> + container->root.mnt = path->mnt;
> + container->root.dentry = path->dentry;
> + write_seqcount_end(&container->seq);
> + path_get(&container->root);
> + mnt_ns = NULL;
> + ret = 0;
> + }

Almost certainly buggered.  Assumptions that we _won't_ get
to absolute root of namespace (it's overmounted and we are
chrooted into it, basically) had been made in quite a few
places.  The thing you are creating is *not* like normal
namespaces in that respect.

Re: [PATCH v2 1/2] leds: Add Intel Cherry Trail Whiskey Cove PMIC LEDs

2019-02-16 Thread Pavel Machek


> >I don't pretend to fully understand it, _but_ hw_pattern should really
> >describe the pattern LED should do, not whether it reacts to charging
> >or not.
> 
> Then we are back to step 1 of the discussion, that we need another
> mechanism outside of the trigger to select if the LED shows the configured
> pattern always, or only when the charger is on.

Yep, sorry.

> These really are 2 orthogonal settings, there is a pattern which can
> be set and the LED can either show that pattern always; or only when
> charging the battery. Note that the same pattern is used in both cases.
> 
> This is why I previously suggested having a custom sysfs hardware_control
> attribute which selects between the "only show pattern when charging"
> mode ("hardware_control=1") and the "always show the pattern" mode
> ("hardware_control=0").

I see... and yes, that would be the easiest solution.

But somehow I see "this LED is controlled by charging state" as
primary and "it shows pulses instead of staying on" as secondary
eye-candy.

This week there was another driver for charger LED.. but that one does
not do pulses. Ideally, we'd like consistent interface to the
userland.

(To make it complex, the other driver supports things like:
  LED solid on -- fully charged
  LED blinking slowly -- charging
  LED blinking fast -- charge error
  LED off -- not charging).

Best regards,

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Al Viro

On Sat, Feb 16, 2019 at 11:47:02PM +, Al Viro wrote:
> On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:
> 
> > What is the actual problem?  We’re not actually demand-faulting this data, 
> > are we?  Are we just overrunning the buffer because the from_user helpers 
> > are too clever?  Can we fix it for real by having the fancy helpers do 
> > *aligned* loads so that they don’t overrun the buffer?  Heck, this might be 
> > faster, too.
> 
> Unaligned _stores_ are not any cheaper, and you'd get one hell of
> extra arithmetics from trying to avoid both.  Check something
> like e.g. memcpy() on alpha, where you really have to keep all
> accesses aligned, both on load and on store side.
> 
> Can't we just pad the buffers a bit?  Making sure that name_buf
> and symlink_buf are _not_ followed by unmapped pages shouldn't
> be hard.  Both are allocated by kmalloc(), so...
> 
> What am I missing here?

... the fact that read_info() might skip copying if everything it
wants is already in the buffer passed by decompressor ;-/  It's
been a while since I looked into that code...

Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Al Viro

On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:

> What is the actual problem?  We’re not actually demand-faulting this data, 
> are we?  Are we just overrunning the buffer because the from_user helpers are 
> too clever?  Can we fix it for real by having the fancy helpers do *aligned* 
> loads so that they don’t overrun the buffer?  Heck, this might be faster, too.

Unaligned _stores_ are not any cheaper, and you'd get one hell of
extra arithmetics from trying to avoid both.  Check something
like e.g. memcpy() on alpha, where you really have to keep all
accesses aligned, both on load and on store side.

Can't we just pad the buffers a bit?  Making sure that name_buf
and symlink_buf are _not_ followed by unmapped pages shouldn't
be hard.  Both are allocated by kmalloc(), so...

What am I missing here?

Re: [RFC] On the Current Troubles of Mainlining Loongson Platform Drivers

2019-02-16 Thread Alexandre Oliva

On Feb 11, 2019, Aaro Koskinen  wrote:

> ATA (libata) CS5536 driver is having issues with spurious IRQs and often
> disables IRQs completely during the boot. You should see a warning
> in dmesg.

Yup, thanks, it shows up first thing during boot.  I hadn't seen that
one in a while.  Thanks.

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás-GNUChe

Re: [PATCH] gpu: drm: radeon: Set DPM_FLAG_NEVER_SKIP when enabling PM-runtime

2019-02-16 Thread Alex Deucher

On Sat, Feb 16, 2019 at 1:01 AM Lukas Wunner  wrote:
>
> On Fri, Feb 15, 2019 at 11:01:04AM -0500, Alex Deucher wrote:
> > On Fri, Feb 15, 2019 at 10:39 AM Rafael J. Wysocki  
> > wrote:
> > > On HP ProBook 4540s, if PM-runtime is enabled in the radeon driver
> > > and the direct-complete optimization is used for the radeon device
> > > during system-wide suspend, the system doesn't resume.
> > >
> > > Preventing direct-complete from being used with the radeon device by
> > > setting the DPM_FLAG_NEVER_SKIP driver flag for it makes the problem
> > > go away, which indicates that direct-complete is not safe for the
> > > radeon driver in general and should not be used with it (at least
> > > for now).
> > >
> > > This fixes a regression introduced by commit c62ec4610c40
> > > ("PM / core: Fix direct_complete handling for devices with no
> > > callbacks") which allowed direct-complete to be applied to
> > > devices without PM callbacks (again) which in turn unlocked
> > > direct-complete for radeon on HP ProBook 4540s.
> >
> > Do other similar drivers like amdgpu and nouveau need the same fix?
> > I'm not too familiar with the direct_complete feature in general.
>
> direct_complete means that a discrete GPU which is in D3cold upon
> entering system sleep is left as is, i.e. it is not woken.  It is
> also expected to still be in D3cold when resuming from system sleep
> from the PM core's point of view.  (If it is in D0uninitialized, the
> GPU's driver needs to ensure it is transitioned to D3cold again.)
>
> I know for a fact that resuming the discrete GPU is not necessary
> on my MacBook Pro with Nvidia GPU.  I'd expect those with AMD GPUs
> to behave the same.  The apple-gmux driver takes care of putting
> the GPU into D3cold on resume from system sleep if it was in D3cold
> when entering system sleep (see drivers/platform/x86/apple-gmux.c,
> gmux_resume()).
>
> I think it is desirable to use direct_complete because it saves power
> (no need to gratuitously wake the GPU upon entering system sleep,
> only to immediately cut its power) and it also speeds up the suspend
> process by about half a second.

Thanks for the info.  It sounds like we need a similar patch for
amdgpu.  With dGPUs controlled by the ACPI ATPX method, I believe the
dGPU is powered up automatically on resume from S3/S4.  I think there
may be a way to change that behavior in some revisions of ATPX (i.e.,
to keep the state across suspend cycles), but it's not the default.
I'm not sure about the newer _PR3 stuff in Hybrid Graphics laptops.  I
think it retains state.  In both radeon and amdgpu we probably need to
check if the system is using ATPX or _PR3 and disable direct complete
for ATPX at least.

Alex

>
> The root cause on the HP ProBook 4540s needs to be debugged, I'd
> suspect a BIOS issue which could be addressed by a quirk, either for
> this particular machine or for a certain class of devices (e.g. all
> machines which use PR3 to transition to D3cold) if that is necessary
> to behave identically to Windows.  Or maybe the atpx vga_switcheroo
> handler needs to be amended to put the GPU into D3cold on resume from
> system sleep if it was runtime suspended before.
>
> Is this machine using s2idle or does it suspend to S3?
>
> Thanks,
>
> Lukas
>
> > > Fixes: c62ec4610c40 ("PM / core: Fix direct_complete handling for devices with no callbacks")
> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=201519
> > > Reported-by: ??  
> > > Tested-by: ??  
> > > Signed-off-by: Rafael J. Wysocki 
> > > ---
> > >  drivers/gpu/drm/radeon/radeon_kms.c |1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > Index: linux-pm/drivers/gpu/drm/radeon/radeon_kms.c
> > > ===
> > > --- linux-pm.orig/drivers/gpu/drm/radeon/radeon_kms.c
> > > +++ linux-pm/drivers/gpu/drm/radeon/radeon_kms.c
> > > @@ -172,6 +172,7 @@ int radeon_driver_load_kms(struct drm_de
> > > }
> > >
> > > if (radeon_is_px(dev)) {
> > > +   dev_pm_set_driver_flags(dev->dev, DPM_FLAG_NEVER_SKIP);
> > > pm_runtime_use_autosuspend(dev->dev);
> > > pm_runtime_set_autosuspend_delay(dev->dev, 5000);
> > > pm_runtime_set_active(dev->dev);

Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Andy Lutomirski



> On Feb 16, 2019, at 2:50 PM, Andy Lutomirski  wrote:
> 
> 
> 
>>> On Feb 16, 2019, at 12:18 PM, Thomas Gleixner  wrote:
>>> 
>>>> On Sat, 16 Feb 2019, Jann Horn wrote:
>>>> On Sat, Feb 16, 2019 at 12:59 AM  wrote:
>>>> When extracting an initramfs, a filename may be near an allocation boundary.
>>>> Should that happen, strncopy_from_user will invoke unsafe_get_user which
>>>> may cross the allocation boundary. Should that happen, unsafe_get_user will
>>>> trigger a page fault, and strncopy_from_user would then bailout to
>>>> byte_at_a_time behavior.
>>>> 
>>>> unsafe_get_user is unsafe by nature, and rely on pagefault to detect 
>>>> boundaries.
>>>> After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel 
>>>> addresses")
>>>> it may no longer rely on pagefault as the new page fault handler would
>>>> trigger a BUG().
>>>> 
>>>> This commit allows unsafe_get_user to explicitly trigger pagefaults and
>>>> handle them directly with the error target label.
>>> 
>>> Oof. So basically the init code is full of things that just call
>>> syscalls instead of using VFS functions (which don't actually exist
>>> for everything), and the VFS syscalls use getname_flags(), which uses
>>> strncpy_from_user(), which can access out-of-bounds pages on
>>> architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and
>>> that in summary means that all the init code is potentially prone to
>>> tripping over this?
>> 
>> Not all init code. It should be only the initramfs decompression.
>> 
>>> I don't particularly like this approach to fixing it, but I also don't
>>> have any better ideas, so I guess unless someone else has a bright
>>> idea, this patch might have to go in.
>> 
>> So we know that this happens in the context of decompress() which calls
>> flush_buffer() for every chunk. flush_buffer() gets the start_address and
>> the length. We also know that the fault can only happen within:
>> 
>>   start_address <= fault_address < start_address + length + 8;
>> 
>> So something like the untested workaround below should cover the initramfs
>> oddity and avoid to weaken the protection for all other cases.
> 
> What is the actual problem?  We’re not actually demand-faulting this data, 
> are we?  Are we just overrunning the buffer because the from_user helpers are 
> too clever?  Can we fix it for real by having the fancy helpers do *aligned* 
> loads so that they don’t overrun the buffer?  Heck, this might be faster, too.

Indeed.  I would argue that the current code is a bug even in the normal case.  
If I lay out my user address space so that I have f,o,o,o,\0 at the end of a 
page and I have non-side-effect-free memory after it (MMIO, userfaultfd, etc), 
then passing a pointer to that “fooo” string to a syscall should *not* overrun 
the buffer.

If I have some time this evening, I’ll see if I can whip up a credible fix.
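
To make the "fooo" case concrete, here is a small sketch of the address
arithmetic only. It is not the kernel's word-at-a-time code, and the
SKETCH_* constants and function names are made up for illustration; a 4096
byte page and 8 byte loads are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumed page size */
#define SKETCH_WORD_SIZE 8UL     /* assumed load width in bytes */

/* True if an 8-byte load starting at addr touches the following page. */
static bool sketch_word_load_crosses_page(uintptr_t addr)
{
    return (addr & (SKETCH_PAGE_SIZE - 1)) >
           SKETCH_PAGE_SIZE - SKETCH_WORD_SIZE;
}

/* Round addr down to the start of the 8-byte word that contains it. */
static uintptr_t sketch_align_down_to_word(uintptr_t addr)
{
    return addr & ~(uintptr_t)(SKETCH_WORD_SIZE - 1);
}
```

If f,o,o,o,\0 occupies the last five bytes of a page, the string starts at
page offset 4091. An unaligned 8-byte load from there covers offsets 4091
through 4098 and reaches into the next page, while the aligned load that
contains the same bytes starts at offset 4088 and ends at 4095, inside the
page.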

Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Andy Lutomirski




> On Feb 16, 2019, at 12:18 PM, Thomas Gleixner  wrote:
> 
>> On Sat, 16 Feb 2019, Jann Horn wrote:
>>> On Sat, Feb 16, 2019 at 12:59 AM  wrote:
>>> When extracting an initramfs, a filename may be near an allocation boundary.
>>> Should that happen, strncopy_from_user will invoke unsafe_get_user which
>>> may cross the allocation boundary. Should that happen, unsafe_get_user will
>>> trigger a page fault, and strncopy_from_user would then bailout to
>>> byte_at_a_time behavior.
>>> 
>>> unsafe_get_user is unsafe by nature, and rely on pagefault to detect 
>>> boundaries.
>>> After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel 
>>> addresses")
>>> it may no longer rely on pagefault as the new page fault handler would
>>> trigger a BUG().
>>> 
>>> This commit allows unsafe_get_user to explicitly trigger pagefaults and
>>> handle them directly with the error target label.
>> 
>> Oof. So basically the init code is full of things that just call
>> syscalls instead of using VFS functions (which don't actually exist
>> for everything), and the VFS syscalls use getname_flags(), which uses
>> strncpy_from_user(), which can access out-of-bounds pages on
>> architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and
>> that in summary means that all the init code is potentially prone to
>> tripping over this?
> 
> Not all init code. It should be only the initramfs decompression.
> 
>> I don't particularly like this approach to fixing it, but I also don't
>> have any better ideas, so I guess unless someone else has a bright
>> idea, this patch might have to go in.
> 
> So we know that this happens in the context of decompress() which calls
> flush_buffer() for every chunk. flush_buffer() gets the start_address and
> the length. We also know that the fault can only happen within:
> 
>start_address <= fault_address < start_address + length + 8;
> 
> So something like the untested workaround below should cover the initramfs
> oddity and avoid to weaken the protection for all other cases.

What is the actual problem?  We’re not actually demand-faulting this data, are 
we?  Are we just overrunning the buffer because the from_user helpers are too 
clever?  Can we fix it for real by having the fancy helpers do *aligned* loads 
so that they don’t overrun the buffer?  Heck, this might be faster, too.
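
For reference, the fault window from Thomas's quoted text can be written as
a small predicate. This is only a sketch of the stated inequality, not the
workaround patch itself; the function name and parameter names are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if fault lies in the window
 *   start <= fault < start + length + 8
 * The subtraction form avoids overflow when start + length + 8 would wrap.
 */
static bool sketch_fault_in_flush_window(uintptr_t start, size_t length,
                                         uintptr_t fault)
{
    return fault >= start && fault - start < length + 8;
}
```

The extra 8 bytes past the end of the chunk correspond to the widest
overrun a single word-sized load can produce.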

Re: [LSF/MM TOPIC] Discuss least bad options for resolving longterm-GUP usage by RDMA

2019-02-16 Thread Dave Chinner

On Fri, Feb 15, 2019 at 03:38:29PM -0800, Ira Weiny wrote:
> On Fri, Feb 15, 2019 at 03:00:31PM -0700, Jason Gunthorpe wrote:
> > On Fri, Feb 15, 2019 at 06:31:36PM +, Christopher Lameter wrote:
> > > On Fri, 15 Feb 2019, Matthew Wilcox wrote:
> > > 
> > > > > Since RDMA is something similar: Can we say that a file that is used 
> > > > > for
> > > > > RDMA should not use the page cache?
> > > >
> > > > That makes no sense.  The page cache is the standard synchronisation 
> > > > point
> > > > for filesystems and processes.  The only problems come in for the things
> > > > which bypass the page cache like O_DIRECT and DAX.
> > > 
> > > It makes a lot of sense since the filesystems play COW etc games with the
> > > pages and RDMA is very much like O_DIRECT in that the pages are modified
> > > directly under I/O. It also bypasses the page cache in case you have
> > > not noticed yet.
> > 
> > It is quite different, O_DIRECT modifies the physical blocks on the
> > storage, bypassing the memory copy.
> >
> 
> Really?  I thought O_DIRECT allowed the block drivers to write to/from user
> space buffers.  But the _storage_ was still under the control of the block
> drivers?

Yup, in a nutshell. Even O_DIRECT on DAX doesn't modify the physical
storage directly - it ends up in the pmem driver and it does a
memcpy() to move the data to/from the physical storage and the user
space buffer. It's exactly the same IO path as moving data to/from
the physical storage into the page cache pages.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com

Re: [PATCH v2] parisc: Fix ptrace syscall number modification

2019-02-16 Thread Dmitry V. Levin

On Sat, Feb 16, 2019 at 05:55:24PM +0100, Helge Deller wrote:
> On 16.02.19 14:10, Dmitry V. Levin wrote:
> > Commit 910cd32e552e ("parisc: Fix and enable seccomp filter support")
> > introduced a regression in ptrace-based syscall tampering: when tracer
> > changes syscall number to -1, the kernel fails to initialize %r28 with
> > -ENOSYS and subsequently fails to return the error code of the failed
> > syscall to userspace.
> > 
> > This erroneous behaviour could be observed with a simple strace syscall
> > fault injection command which is expected to print something like this:
> > 
> > $ strace -a0 -ewrite -einject=write:error=enospc echo hello
> > write(1, "hello\n", 6) = -1 ENOSPC (No space left on device) (INJECTED)
> > write(2, "echo: ", 6) = -1 ENOSPC (No space left on device) (INJECTED)
> > write(2, "write error", 11) = -1 ENOSPC (No space left on device) (INJECTED)
> > write(2, "\n", 1) = -1 ENOSPC (No space left on device) (INJECTED)
> > +++ exited with 1 +++
> > 
> > After commit 910cd32e552ea09caa89cdbe328e468979b030dd it loops printing
> > something like this instead:
> > 
> > write(1, "hello\n", 6../strace: Failed to tamper with process 12345: 
> > unexpectedly got no error (return value 0, error 0)
> > ) = 0 (INJECTED)
> > 
> > This bug was found by strace test suite.
> > 
> > Fixes: 910cd32e552e ("parisc: Fix and enable seccomp filter support")
> > Cc: sta...@vger.kernel.org # v4.5+
> > Signed-off-by: Dmitry V. Levin 
> 
> Thanks, the patch works as expected.
> You may add:
> Tested-by: Helge Deller 
> 
> There is an "out" label a few lines below, which should be removed as well.
> Otherwise you get this warning:
> arch/parisc/kernel/ptrace.c: In function ‘do_syscall_trace_enter’:
> arch/parisc/kernel/ptrace.c:357:1: warning: label ‘out’ defined but not used [-Wunused-label]
> 
> I've fixed it up locally and added the patch to my for-next tree.
> If it's ok for you, I'll push it through the parisc tree.

It's fine with me, thanks!


-- 
ldv



[PATCH] netfilter/ipvs: Fix unused variable warning

2019-02-16 Thread Borislav Petkov

From: Borislav Petkov 

Move the local variable 'ret' into the IP_VS_IPV6 ifdef so that the
unused variable warning doesn't fire for randconfig builds with
CONFIG_IP_VS_IPV6 disabled.

Fixes: 098e13f5b21d ("ipvs: fix dependency on nf_defrag_ipv6")
Cc: Andrea Claudi 
Cc: coret...@netfilter.org
Cc: Florian Westphal 
Cc: Jozsef Kadlecsik 
Cc: Julian Anastasov 
Cc: Li Shuang 
Cc: netfilter-de...@vger.kernel.org
Cc: Pablo Neira Ayuso 
Cc: Simon Horman 
Signed-off-by: Borislav Petkov 
---
 net/netfilter/ipvs/ip_vs_ctl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 86afacb07e5f..ac8d848d7624 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -896,12 +896,13 @@ ip_vs_new_dest(struct ip_vs_service *svc, struct ip_vs_dest_user_kern *udest,
 {
 	struct ip_vs_dest *dest;
 	unsigned int atype, i;
-	int ret = 0;
 
 	EnterFunction(2);
 
 #ifdef CONFIG_IP_VS_IPV6
 	if (udest->af == AF_INET6) {
+		int ret;
+
 		atype = ipv6_addr_type(&udest->addr.in6);
 		if ((!(atype & IPV6_ADDR_UNICAST) ||
 			atype & IPV6_ADDR_LINKLOCAL) &&
-- 
2.19.1

Re: ext4 corruption on alpha with 4.20.0-09062-gd8372ba8ce28

2019-02-16 Thread Meelis Roos


>> The result of the bisection is
>> [88dbcbb3a4847f5e6dfeae952d3105497700c128] blkdev: avoid migration stalls for blkdev pages
>>
>> Is that result relevant for the problem or should I continue bisecting between
>> 4.20.0 and the so far first bad commit?
>
> Can you try reverting the commit and see if it makes the problem go away?

Tried reverting it on top of 5.0.0-rc6-00153-g5ded5871030e and it seems to make
the kernel work - emerge --sync succeeded.

Unfinished further bisection has also not yielded any other bad revisions so
far.

--
Meelis Roos

Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Thomas Gleixner

On Sat, 16 Feb 2019, Thomas Gleixner wrote:

> On Sat, 16 Feb 2019, Jann Horn wrote:
> > On Sat, Feb 16, 2019 at 12:59 AM  wrote:
> > > When extracting an initramfs, a filename may be near an allocation boundary.
> > > Should that happen, strncopy_from_user will invoke unsafe_get_user which
> > > may cross the allocation boundary. Should that happen, unsafe_get_user will
> > > trigger a page fault, and strncopy_from_user would then bailout to
> > > byte_at_a_time behavior.
> > >
> > > unsafe_get_user is unsafe by nature, and rely on pagefault to detect boundaries.
> > > After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
> > > it may no longer rely on pagefault as the new page fault handler would
> > > trigger a BUG().
> > >
> > > This commit allows unsafe_get_user to explicitly trigger pagefaults and
> > > handle them directly with the error target label.
> > 
> > Oof. So basically the init code is full of things that just call
> > syscalls instead of using VFS functions (which don't actually exist
> > for everything), and the VFS syscalls use getname_flags(), which uses
> > strncpy_from_user(), which can access out-of-bounds pages on
> > architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and
> > that in summary means that all the init code is potentially prone to
> > tripping over this?
> 
> Not all init code. It should be only the initramfs decompression.
> 
> > I don't particularly like this approach to fixing it, but I also don't
> > have any better ideas, so I guess unless someone else has a bright
> > idea, this patch might have to go in.
> 
> So we know that this happens in the context of decompress() which calls
> flush_buffer() for every chunk. flush_buffer() gets the start_address and
> the length. We also know that the fault can only happen within:
> 
> start_address <= fault_address < start_address + length + 8;
> 
> So something like the untested workaround below should cover the initramfs
> oddity and avoid to weaken the protection for all other cases.

The other even simpler solution would be to force these functions into the
byte at a time code path during init. Too tired to hack that up now.

Thanks,

tglx
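
For readers following the thread, the window quoted above
(start_address <= fault_address < start_address + length + 8) can be
written as a small predicate. This is only a sketch under assumptions: the
helper name fault_in_flush_window() is hypothetical and does not come from
the posted workaround, and the code is plain userspace C rather than the
actual extable handling.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the posted patch: returns true when
 * fault_address lies in [start_address, start_address + length + 8), i.e.
 * inside the buffer handed to flush_buffer() plus the 8 bytes a
 * word-at-a-time load may read past its end.
 */
static bool fault_in_flush_window(uintptr_t start_address, size_t length,
				  uintptr_t fault_address)
{
	/*
	 * Compare the offset rather than computing start_address + length + 8,
	 * so the check cannot wrap around at the top of the address space.
	 * (length itself is assumed to be far below SIZE_MAX - 8.)
	 */
	return fault_address >= start_address &&
	       fault_address - start_address < length + 8;
}
```

With a 16-byte chunk at 0x1000, the predicate accepts faults at 0x1000
through 0x1017 and rejects 0x0fff and 0x1018.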

Re: [PATCH v2 1/2] leds: Add Intel Cherry Trail Whiskey Cove PMIC LEDs

2019-02-16 Thread Hans de Goede


Hi,

On 2/16/19 10:54 PM, Jacek Anaszewski wrote:

On 2/16/19 8:37 PM, Pavel Machek wrote:

Hi!


I think that should work fine, which means that we can use the timer and
pattern trigger support for the blinking and breathing modes.

That still leaves the switching between user and hw-control modes,
as discussed the hw-controlled mode could be modelled as a new "hardware"
trigger, but then we cannot choose between on/blink/breathing when
in hw-controlled mode. As Pavel mentioned, that would require some
sort of composed trigger, where we have both the hardware and
timer triggers active for example.

I think it might be easier to just allow turning on/off the hardware
control mode through a special "hardware_control" sysfs attribute and
then use the existing timer and pattern triggers for blinking / breathing.


Pattern trigger exposes pattern file by default and hw_pattern if
pattern_set/get ops are provided. Writing them enables software and
hardware pattern respectively.


This is not about software vs hardware pattern.

There are 2 *orthogonal*, separate problems/challenges with this LED controller:

1) It has hardware blinking and breathing, as discussed this can be
controlled through the timer and pattern triggers, so this problem
is solved.

2) It has 2 operating modes:

a) Automatic/hardware controlled, in this mode the LED is turned
off or on (where on can be continuously on, blinking or breathing)
by the hardware itself, when in this mode we / userspace are not
in control of the LED

b) Manual/user controlled mode, in this mode we / userspace can
control the LED.

Currently there is no API in the ledclass to switch a LED from
automatic controlled to user controlled and back. This is what
the proposed hardware trigger was for, to switch to automatic
mode. A problem with this is that we still want to be able
to choose between continuously on, blinking or breathing (when on),
configure the max brightness, etc.


Yes, we do have the API to switch a LED from automatic (hardware
accelerated) control to software control and back. This is pattern
trigger, which exposes two files for setting pattern: pattern
and hw_pattern. Writing pattern file switches the device to software
control mode and writing hw_pattern switches it to the hardware control,
with the possibility of defining device specific ABI syntax to enable
particular pattern (blinking, breathing or even permanently on
in case of this device).


OK, I see. So we would use the hw_pattern for this and the driver
would implement the pattern_set led_classdev callback.

The pattern_set callback would then expect 6 brightness/time tuples
with the following meaning for the time part of each tuple:

tuple0: charging blinking_on_time
tuple1: charging blinking_off_time
tuple2: charging breathing_time
tuple3: manual blinking_on_time
tuple4: manual blinking_off_time
tuple5: manual breathing_time

Only the times in tuples 0-2 or the times in tuples 3-5 can be
non-zero. Having non-zero times for both some charging and some
manual values is not allowed.

If a breathing time is set, all of the other times must be 0.
If blinking_on and blinking_off are used then breathing_time
must be 0.

When configured to blink, blinking_off must be either 0
(continuously on) or the same as blinking_on.


I believe this will work, does this sound ok to you ?


I don't pretend to fully understand it, _but_ hw_pattern should really
describe the pattern LED should do, not whether it reacts to charging
or not.


This is hardware specific and is supposed to have dedicated ABI
documentation. There's no reason to introduce new mechanisms when
existing ones fit. It will still describe a pattern but activated
on some condition.


Right, but we can control the condition, so either we need to make
the condition part of the pattern as in my recent proposal with:

tuple0: charging blinking_on_time
tuple1: charging blinking_off_time
tuple2: charging breathing_time
tuple3: manual blinking_on_time
tuple4: manual blinking_off_time
tuple5: manual breathing_time

as the hw_pattern ABI; or we need to add an extra sysfs file to
set the condition.

So do you prefer the driver to code the condition into the hw_pattern
(see above); or do you prefer a separate sysfs attribute for the condition?

Regards,

Hans
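
To make the constraints of the six-tuple hw_pattern proposal in the message
above concrete, here is a rough sketch of the validation a driver might
perform. Everything in it is an assumption for illustration: struct
pattern_tuple is a local stand-in for a brightness/time pair, and
hw_pattern_valid() and its helpers are hypothetical names, not an existing
kernel API. An all-zero pattern is accepted here, which the proposal does
not address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for one brightness/time tuple (illustrative only). */
struct pattern_tuple {
	uint32_t delta_t;	/* time part of the tuple, e.g. in ms */
	int brightness;
};

/* Offsets inside each group of three tuples (charging: 0-2, manual: 3-5). */
enum { BLINK_ON, BLINK_OFF, BREATHE };

/* True when a group of three tuples has no time set at all. */
static bool group_is_zero(const struct pattern_tuple *t)
{
	return !t[BLINK_ON].delta_t && !t[BLINK_OFF].delta_t &&
	       !t[BREATHE].delta_t;
}

/* Checks the breathing/blinking rules within one group. */
static bool group_times_valid(const struct pattern_tuple *t)
{
	/* A breathing time excludes both blinking times. */
	if (t[BREATHE].delta_t)
		return !t[BLINK_ON].delta_t && !t[BLINK_OFF].delta_t;

	/* blinking_off is either 0 (continuously on) or equal to blinking_on. */
	return !t[BLINK_OFF].delta_t ||
	       t[BLINK_OFF].delta_t == t[BLINK_ON].delta_t;
}

/* Hypothetical validator for the proposed 6-tuple hw_pattern layout. */
static bool hw_pattern_valid(const struct pattern_tuple p[6])
{
	const struct pattern_tuple *charging = &p[0];
	const struct pattern_tuple *manual = &p[3];

	/* Times may be set for the charging group or the manual group, not both. */
	if (!group_is_zero(charging) && !group_is_zero(manual))
		return false;

	return group_times_valid(charging) && group_times_valid(manual);
}
```

For example, a pattern with tuple0 and tuple1 both set to 500 and everything
else 0 passes, while setting both tuple0 and tuple3 is rejected.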

Re: [PATCH v2 1/2] leds: Add Intel Cherry Trail Whiskey Cove PMIC LEDs

2019-02-16 Thread Jacek Anaszewski


On 2/16/19 8:37 PM, Pavel Machek wrote:

Hi!


I think that should work fine, which means that we can use the timer and
pattern trigger support for the blinking and breathing modes.

That still leaves the switching between user and hw-control modes,
as discussed the hw-controlled mode could be modelled as a new "hardware"
trigger, but then we cannot choose between on/blink/breathing when
in hw-controlled mode. As Pavel mentioned, that would require some
sort of composed trigger, where we have both the hardware and
timer triggers active for example.

I think it might be easier to just allow turning on/off the hardware
control mode through a special "hardware_control" sysfs attribute and
then use the existing timer and pattern triggers for blinking / breathing.


Pattern trigger exposes pattern file by default and hw_pattern if
pattern_set/get ops are provided. Writing them enables software and
hardware pattern respectively.


This is not about software vs hardware pattern.

There are 2 *orthogonal*, separate problems/challenges with this LED controller:

1) It has hardware blinking and breathing, as discussed this can be
controlled through the timer and pattern triggers, so this problem
is solved.

2) It has 2 operating modes:

a) Automatic/hardware controlled, in this mode the LED is turned
off or on (where on can be continuously on, blinking or breathing)
by the hardware itself, when in this mode we / userspace are not
in control of the LED

b) Manual/user controlled mode, in this mode we / userspace can
control the LED.

Currently there is no API in the ledclass to switch a LED from
automatic controlled to user controlled and back. This is what
the proposed hardware trigger was for, to switch to automatic
mode. A problem with this is that we still want to be able
to choose between continuously on, blinking or breathing (when on),
configure the max brightness, etc.


Yes, we do have the API to switch a LED from automatic (hardware
accelerated) control to software control and back. This is pattern
trigger, which exposes two files for setting pattern: pattern
and hw_pattern. Writing pattern file switches the device to software
control mode and writing hw_pattern switches it to the hardware control,
with the possibility of defining device specific ABI syntax to enable
particular pattern (blinking, breathing or even permanently on
in case of this device).


OK, I see. So we would use the hw_pattern for this and the driver
would implement the pattern_set led_classdev callback.

The pattern_set callback would then expect 6 brightness/time tuples
with the following meaning for the time part of each tuple:

tuple0: charging blinking_on_time
tuple1: charging blinking_off_time
tuple2: charging breathing_time
tuple3: manual blinking_on_time
tuple4: manual blinking_off_time
tuple5: manual breathing_time

Only the times in tuples 0-2 or the times in tuples 3-5 can be
non-zero. Having non-zero times for both some charging and some
manual values is not allowed.

If a breathing time is set, all of the other times must be 0.
If blinking_on and blinking_off are used then breathing_time
must be 0.

When configured to blink, blinking_off must be either 0
(continuously on) or the same as blinking_on.


I believe this will work, does this sound ok to you ?


I don't pretend to fully understand it, _but_ hw_pattern should really
describe the pattern LED should do, not whether it reacts to charging
or not.


This is hardware specific and is supposed to have dedicated ABI
documentation. There's no reason to introduce new mechanisms when
existing ones fit. It will still describe a pattern but activated
on some condition.

--
Best regards,
Jacek Anaszewski

Re: [PATCH] xarray: Document erasing entries during iteration

2019-02-16 Thread Wei Yang

On Thu, Feb 14, 2019 at 02:33:00PM -0800, Matthew Wilcox wrote:
>On Thu, Feb 14, 2019 at 10:16:52PM +, Wei Yang wrote:
>> On Wed, Feb 13, 2019 at 08:12:58AM -0800, Matthew Wilcox wrote:
>> >The only remaining user of the radix tree in that tree is the IDR.  So
>> >now I'm converting the IDR users over to the XArray as well.
>> 
>> Wow, really a HUGE work.
>
>Yes ... but necessary.  Have to pay down the technical debt.
>
>> >But that isn't what I was talking about.  At the moment, the radix
>> >tree and the XArray use the same data structure.  It has good best-case
>> >performance, but shockingly bad worst-case performance.  So we're looking
>> >at replacing the data structure, which won't require changing any of the
>> >users (maybe the page cache ... that has some pretty intimate knowledge
>> >of exactly how the radix tree works).
>> 
>> Two questions from my curiosity:
>> 
>> 1. Why you come up the idea to replace radix tree with XArray even they
>>use the same data structure?
>
>The radix tree API was just awful to use.  I tried to convert some users
>with their own resizing-array-of-pointers to use the radix tree, and I
>gave up in disgust.  I believe the XArray is much simpler to use.
>
>> 2. The worst-case performance is related to the data structure itself?
>
>Yes.  Consider the case where you store a pointer at its own address in
>the data structure.  It'll allocate 11 nodes occupying one-and-a-half pages
>of RAM in order to store a single pointer.
>

Thanks for your explanation :-)

>> >> BTW, have we compared the performance difference?
>> >
>> >It's in the noise.  Sometimes the XArray does a little better because
>> >the APIs encourage the user to do things in a more efficient way.
>> >Some of the users are improved just because the original author didn't
>> >know about a more efficient way of doing what they wanted to do.
>> 
>> So sometimes XArray does a little worse?
>> 
>> Why this happens whey XArray and radix tree has the same data structure?
>> 
>> Interesting.
>
>I'm not sure there are any cases where the XArray does worse.
>Feel free to run your own measurements ... the test cases in
>tools/testing/radix-tree can always do with being improved ;-) (that
>directory is a bit misnamed as it now features tests for the radix tree,
>IDR, IDA and XArray).

Sure, I will try to play with this.

-- 
Wei Yang
Help you, Help me
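
As a footnote to the worst case mentioned above (11 nodes, about
one-and-a-half pages, to store one pointer at its own address), the
arithmetic can be sketched as below. The assumptions are loud ones: 64
slots per node (a chunk shift of 6, the usual non-small configuration), a
node size of 576 bytes and 4096-byte pages; the constant and function names
are made up for this note. The sketch also counts one node for any index,
whereas a lone entry at index 0 can be kept in the head pointer without a
node.

```c
#include <stdint.h>

#define CHUNK_SHIFT	6	/* assumed: 64 slots per node */
#define NODE_BYTES	576	/* assumed node size on a 64-bit build */
#define PAGE_BYTES	4096	/* assumed page size */

/*
 * Number of tree levels (and therefore nodes on the path) needed to reach
 * a single index when every level resolves CHUNK_SHIFT bits of it.
 */
static unsigned int nodes_for_index(uint64_t index)
{
	unsigned int levels = 1;	/* the level covering indices 0..63 */

	index >>= CHUNK_SHIFT;
	while (index) {
		levels++;
		index >>= CHUNK_SHIFT;
	}
	return levels;
}
```

A kernel pointer used as the index has its top bits set, so all 64 bits are
significant and the path needs ceil(64 / 6) = 11 nodes; under the size
assumptions above that is 11 * 576 = 6336 bytes, a little over 1.5 pages.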

ax25: fix possible use-after-free

2019-02-16 Thread f6bvp

Patch applied successfully on Linux draws-f6bvp 4.14.79-v7+ #1159 SMP Sun Nov 4
17:50:20 GMT 2018 armv7l GNU/Linux.

However, ax25_route_lock_use() and ax25_route_lock_unuse() are not declared, so
the build failed:

make: Entering directory '/usr/src/linux-headers-4.14.79-v7+'
  CC [M]  /usr/src/linux-4.14.y/net/ax25/ax25_ip.o
/usr/src/linux-4.14.y/net/ax25/ax25_ip.c: In function ‘ax25_ip_xmit’:
/usr/src/linux-4.14.y/net/ax25/ax25_ip.c:117:2: error: implicit declaration of function ‘ax25_route_lock_use’ [-Werror=implicit-function-declaration]
  ax25_route_lock_use();
  ^~~
/usr/src/linux-4.14.y/net/ax25/ax25_ip.c:211:2: error: implicit declaration of function ‘ax25_route_lock_unuse’ [-Werror=implicit-function-declaration]
  ax25_route_lock_unuse();
  ^
cc1: some warnings being treated as errors
scripts/Makefile.build:328: recipe for target '/usr/src/linux-4.14.y/net/ax25/ax25_ip.o' failed
make[1]: *** [/usr/src/linux-4.14.y/net/ax25/ax25_ip.o] Error 1
Makefile:1527: recipe for target '_module_/usr/src/linux-4.14.y/net/ax25' failed
make: *** [_module_/usr/src/linux-4.14.y/net/ax25] Error 2
make: Leaving directory '/usr/src/linux-headers-4.14.79-v7+'


Bernard, f6bvp

Re: find_get_entries_tag regression bisected

2019-02-16 Thread Matthew Wilcox

On Sat, Feb 16, 2019 at 09:29:48AM -0800, Matthew Wilcox wrote:
> On Sat, Feb 16, 2019 at 07:35:11AM -0800, Matthew Wilcox wrote:
> > Another way to fix this would be to mask the address in dax_entry_mkclean(),
> > but I think this is cleaner.
> 
> That's clearly rubbish, dax_entry_mkclean() can't possibly mask the
> address.  It might be mis-aligned in another process.  But ... if it's
> misaligned in another process, dax_entry_mkclean() will only clean the first
> PTE associated with the PMD; it won't clean the whole thing.  I think we need
> something like this:

Nope, this isn't enough.  It's _necessary_ to find the processes that
have part of this PMD page mapped, but not the start of it.  But it's
not _sufficient_ because it'll still only mkclean the first PTE.  So we
need a loop.  I'm feeling a bit over my head here.  I may have a go at
a fuller fix, but if someone else wants to have a go at it, be my guest!

(feeling massively unqualified for the task at hand ;-)

[PATCH] dm crypt: fix memory leak in dm_crypt_integrity_io_alloc() error path

2019-02-16 Thread sultan

From: Sultan Alsawaf 

dm_crypt_integrity_io_alloc() allocates space for an integrity payload but
doesn't free it in the error path, leaking memory. Add a bio_integrity_free()
invocation upon error to fix the memory leak.

Signed-off-by: Sultan Alsawaf 
---
 drivers/md/dm-crypt.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index dd538e6b2..f731e1fe0 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -939,8 +939,10 @@ static int dm_crypt_integrity_io_alloc(struct dm_crypt_io *io, struct bio *bio)
 
 	ret = bio_integrity_add_page(bio, virt_to_page(io->integrity_metadata),
 				     tag_len, offset_in_page(io->integrity_metadata));
-	if (unlikely(ret != tag_len))
+	if (unlikely(ret != tag_len)) {
+		bio_integrity_free(bio);
 		return -ENOMEM;
+	}
 
 	return 0;
 }
-- 
2.20.1

Re: [PATCH v2 1/2] leds: Add Intel Cherry Trail Whiskey Cove PMIC LEDs

2019-02-16 Thread Hans de Goede


Hi,

On 2/16/19 8:37 PM, Pavel Machek wrote:

Hi!


I think that should work fine, which means that we can use the timer and
pattern trigger support for the blinking and breathing modes.

That still leaves the switching between user and hw-control modes,
as discussed the hw-controlled mode could be modelled as a new "hardware"
trigger, but then we cannot choose between on/blink/breathing when
in hw-controlled mode. As Pavel mentioned, that would require some
sort of composed trigger, where we have both the hardware and
timer triggers active for example.

I think it might be easier to just allow turning on/off the hardware
control mode through a special "hardware_control" sysfs attribute and
then use the existing timer and pattern triggers for blinking / breathing.


Pattern trigger exposes pattern file by default and hw_pattern if
pattern_set/get ops are provided. Writing them enables software and
hardware pattern respectively.


This is not about software vs hardware pattern.

There are 2 *orthogonal*, separate problems/challenges with this LED controller:

1) It has hardware blinking and breathing, as discussed this can be
controlled through the timer and pattern triggers, so this problem
is solved.

2) It has 2 operating modes:

a) Automatic/hardware controlled, in this mode the LED is turned
off or on (where on can be continuously on, blinking or breathing)
by the hardware itself, when in this mode we / userspace are not
in control of the LED

b) Manual/user controlled mode, in this mode we / userspace can
control the LED.

Currently there is no API in the ledclass to switch a LED from
automatic controlled to user controlled and back. This is what
the proposed hardware trigger was for, to switch to automatic
mode. A problem with this is that we still want to be able
to choose between continuously on, blinking or breathing (when on),
configure the max brightness, etc.


Yes, we do have the API to switch a LED from automatic (hardware
accelerated) control to software control and back. This is pattern
trigger, which exposes two files for setting pattern: pattern
and hw_pattern. Writing pattern file switches the device to software
control mode and writing hw_pattern switches it to the hardware control,
with the possibility of defining device specific ABI syntax to enable
particular pattern (blinking, breathing or even permanently on
in case of this device).


OK, I see. So we would use the hw_pattern for this and the driver
would implement the pattern_set led_classdev callback.

The pattern_set callback would then expect 6 brightness/time tuples
with the following meaning for the time part of each tuple:

tuple0: charging blinking_on_time
tuple1: charging blinking_off_time
tuple2: charging breathing_time
tuple3: manual blinking_on_time
tuple4: manual blinking_off_time
tuple5: manual breathing_time

Only the times in tuples 0-2 or the times in tuples 3-5 can be
non-zero. Having non-zero times for both some charging and some
manual values is not allowed.

If a breathing time is set, all of the other times must be 0.
If blinking_on and blinking_off are used then breathing_time
must be 0.

When configured to blink, blinking_off must be either 0
(continuously on) or the same as blinking_on.


I believe this will work, does this sound ok to you ?


I don't pretend to fully understand it, _but_ hw_pattern should really
describe the pattern LED should do, not whether it reacts to charging
or not.


Then we are back to step 1 of the discussion, that we need another
mechanism outside of the trigger to select if the LED shows the configured
pattern always, or only when the charger is on.

These really are 2 orthogonal settings, there is a pattern which can
be set and the LED can either show that pattern always; or only when
charging the battery. Note that the same pattern is used in both cases.

This is why I previously suggested having a custom sysfs hardware_control
attribute which selects between the "only show pattern when charging"
mode ("hardware_control=1") and the "always show the pattern" mode
("hardware_control=0").

Regards,

Hans

Re: void __iomem *addr should be const

2019-02-16 Thread Hugo Lefeuvre

Hi,

> The const makes perfectly sense and we should have consistent state all
> over the place.

Thanks, I will submit a patch addressing this issue soon.

regards,
 Hugo

-- 
Hugo Lefeuvre (hle)|www.owl.eu.com
RSA4096_ 360B 03B3 BF27 4F4D 7A3F D5E8 14AA 1EB8 A247 3DFD
ed25519_ 37B2 6D38 0B25 B8A2 6B9F 3A65 A36F 5357 5F2D DC4C

Re: [PATCH] staging/android: use multiple futex wait queues

2019-02-16 Thread Hugo Lefeuvre

Hi,

> Have you tested this?

I have finally set up a cuttlefish test env and tested both my first
patch set[0] and this patch (v2).

My first patch set works fine. I have nothing to say about it.

> Noticed any performance speedups or slow downs?

This patch doesn't work.

The boot process goes well. Overall, calls to vsoc_cond_wake are executed
slightly faster. However the system freezes at some point.

The graphical interface generates many calls to vsoc_cond_wake and wait,
so I suspect an issue with the locks. I will investigate this and come back
with an updated patch.

regards,
 Hugo

[0] https://lore.kernel.org/patchwork/patch/1039712/

-- 
Hugo Lefeuvre (hle)|www.owl.eu.com
RSA4096_ 360B 03B3 BF27 4F4D 7A3F D5E8 14AA 1EB8 A247 3DFD
ed25519_ 37B2 6D38 0B25 B8A2 6B9F 3A65 A36F 5357 5F2D DC4C

Re: [PATCH] ARM: dts: rockchip: add chosen node on veyron chromebooks

2019-02-16 Thread Heiko Stuebner

On Saturday, 16 February 2019, 02:13:25 CET, Alexandru M Stan wrote:
> On Fri, Feb 15, 2019 at 3:09 PM Heiko Stübner  wrote:
> >
> > Hi Enric,
> >
> > On Friday, 15 February 2019, 12:51:50 CET, Enric Balletbo i Serra wrote:
> > > In order to use earlycon, the stdout-path property needs to be set
> > > in the chosen node.
> > >
> > > Signed-off-by: Enric Balletbo i Serra 
> >
> > What's the reason for adding this only for the Chromebook variants?
> > Uart2 is pretty much the standard output for all devices, so I'd assume
> > at least all veyron boards should use uart2 as well, making this ideally
> > live in the rk3288-veyron.dtsi instead?
> 
> Yep, all veyriants use uart 2, even when they're not chromebooks.
> Feel free to put it in the other file instead.
> 
> Otherwise it'll make things like the Asus Chromebit (mickey), AOpen
> Chromebox mini (fievel), and AOpen Chromebase (tiger) unhappy
> while debugging.
> 
> The rk3288-veyron-chromebook.dtsi file is more for things that make
> a chromebook portable. Ex: built in display, lid switch, cros-ec.
> 
> Note how we have no uart2 references either in our tree in the
> chromebook specific dtsi:
> https://chromium.googlesource.com/chromiumos/third_party/kernel/+/chromeos-3.14/arch/arm/boot/dts/rk3288-veyron-chromebook.dtsi

thanks for the confirmation :-) .

I've moved the chosen node over to veyron.dtsi and applied the result
for 5.1.

Thanks
Heiko

[GIT PULL] ARM: SoC fixes for 5.0

2019-02-16 Thread Arnd Bergmann

The following changes since commit d13937116f1e82bf508a6325111b322c30c85eb9:

  Linux 5.0-rc6 (2019-02-10 14:42:20 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git tags/armsoc-fixes

for you to fetch changes up to 410d7360541c0e21b58e56b64e5bcdbec9c1d285:

  Merge tag 'omap-for-v5.0/fixes-rc5' of
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into
arm/fixes (2019-02-15 20:39:46 +0100)


This week is a much smaller update, containing fixes only for TI OMAP,
NXP i.MX and Rockchip platforms:

 - omap4 had problems with lost timer interrupts
 - another IRQ handling issue with OMAP5
 - A workaround for a regression in the pwm-omap-dmtimer driver

 - eMMC was broken on the new imx8mq-evk board

 - a fix for new dtc graph warnings and a regulator fix for rock64
 - USB support broke on rk3328-rock64


Arnd Bergmann (5):
  Merge tag 'omap-for-v5.0/fixes-rc4' of
git://git.kernel.org/.../tmlind/linux-omap into fixes
  Merge tag 'v5.0-rockchip-dts32fixes-1' of
git://git.kernel.org/.../mmind/linux-rockchip into arm/fixes
  Merge tag 'v5.0-rockchip-dts64fixes-1' of
git://git.kernel.org/.../mmind/linux-rockchip into arm/fixes
  Merge tag 'imx-fixes-5.0-3' of
git://git.kernel.org/.../shawnguo/linux into arm/fixes
  Merge tag 'omap-for-v5.0/fixes-rc5' of
git://git.kernel.org/.../tmlind/linux-omap into arm/fixes

Carlo Caione (1):
  arm64: dts: imx8mq: Fix boot from eMMC

Dmitry Voytik (1):
  arm64: dts: rockchip: enable usb-host regulators at boot on rk3328-rock64

Enric Balletbo i Serra (1):
  arm64: dts: rockchip: fix graph_port warning on rk3399 bob kevin
and excavator

Johan Jonker (1):
  ARM: dts: rockchip: remove qos_cif1 from rk3188 power-domain

Russell King (1):
  ARM: OMAP2+: fix lack of timer interrupts on CPU1 after hotplug

Tony Lindgren (5):
  clocksource: timer-ti-dm: Fix pwm dmtimer usage of fck reparenting
  ARM: OMAP5+: Fix inverted nirq pin interrupts with irq_set_type
  bus: ti-sysc: Fix timer handling with drop pm_runtime_irq_safe()
  ARM: dts: Configure clock parent for pwm vibra
  Merge branch 'pwm-dmtimer-fixes' into omap-for-v5.0/fixes-v2

Yizhuo (1):
  ARM: OMAP2+: Variable "reg" in function omap4_dsi_mux_pads()
could be uninitialized

 arch/arm/boot/dts/omap4-droid4-xt894.dts   | 11 ++
 arch/arm/boot/dts/omap5-board-common.dtsi  |  9 +++--
 arch/arm/boot/dts/omap5-cm-t54.dts | 12 +-
 arch/arm/boot/dts/rk3188.dtsi  |  1 -
 arch/arm/mach-omap2/cpuidle44xx.c  | 16 ++--
 arch/arm/mach-omap2/display.c  |  7 +++-
 arch/arm/mach-omap2/omap-wakeupgen.c   | 36 +-
 arch/arm64/boot/dts/freescale/imx8mq-evk.dts   | 44 +++---
 arch/arm64/boot/dts/freescale/imx8mq.dtsi  |  2 +
 arch/arm64/boot/dts/rockchip/rk3328-rock64.dts |  2 +
 arch/arm64/boot/dts/rockchip/rk3399-gru-bob.dts|  2 +-
 arch/arm64/boot/dts/rockchip/rk3399-gru-kevin.dts  |  2 +-
 .../dts/rockchip/rk3399-sapphire-excavator.dts |  2 +-
 drivers/bus/ti-sysc.c  |  6 +--
 drivers/clocksource/timer-ti-dm.c  |  5 ++-
 15 files changed, 109 insertions(+), 48 deletions(-)

Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Thomas Gleixner

On Sat, 16 Feb 2019, Jann Horn wrote:
> On Sat, Feb 16, 2019 at 12:59 AM  wrote:
> > When extracting an initramfs, a filename may be near an allocation boundary.
> > Should that happen, strncpy_from_user will invoke unsafe_get_user which
> > may cross the allocation boundary. Should that happen, unsafe_get_user will
> > trigger a page fault, and strncpy_from_user would then bail out to
> > byte_at_a_time behavior.
> >
> > unsafe_get_user is unsafe by nature, and relies on page faults to detect
> > boundaries.
> > After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel
> > addresses")
> > it may no longer rely on page faults, as the new page fault handler would
> > trigger a BUG().
> >
> > This commit allows unsafe_get_user to explicitly trigger pagefaults and
> > handle them directly with the error target label.
> 
> Oof. So basically the init code is full of things that just call
> syscalls instead of using VFS functions (which don't actually exist
> for everything), and the VFS syscalls use getname_flags(), which uses
> strncpy_from_user(), which can access out-of-bounds pages on
> architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and
> that in summary means that all the init code is potentially prone to
> tripping over this?

Not all init code. It should be only the initramfs decompression.

> I don't particularly like this approach to fixing it, but I also don't
> have any better ideas, so I guess unless someone else has a bright
> idea, this patch might have to go in.

So we know that this happens in the context of decompress() which calls
flush_buffer() for every chunk. flush_buffer() gets the start_address and
the length. We also know that the fault can only happen within:

start_address <= fault_address < start_address + length + 8;

So something like the untested workaround below should cover the initramfs
oddity and avoid weakening the protection for all other cases.

Thanks,

tglx

8<---
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -1,5 +1,6 @@
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -161,6 +162,14 @@ static bool bogus_uaccess(struct pt_regs
if (current->kernel_uaccess_faults_ok)
return false;
 
+   /*
+* initramfs decompression can trigger a fault when
+* unsafe_get_user() goes over the boundary of the buffer. That's a
+* valid case for e.g. strncpy_from_user().
+*/
+   if (initramfs_fault_in_decompress(fault_addr))
+   return false;
+
/* This is bad. Refuse the fixup so that we go into die(). */
if (trapnr == X86_TRAP_PF) {
pr_emerg("BUG: pagefault on kernel address 0x%lx in non-whitelisted uaccess\n",
--- a/include/linux/initrd.h
+++ b/include/linux/initrd.h
@@ -1,5 +1,8 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
+#ifndef LINUX_INITRD_H
+#define LINUX_INITRD_H
+
 #define INITRD_MINOR 250 /* shouldn't collide with /dev/ram* too soon ... */
 
 /* 1 = load ramdisk, 0 = don't load */
@@ -25,3 +28,14 @@ extern phys_addr_t phys_initrd_start;
 extern unsigned long phys_initrd_size;
 
 extern unsigned int real_root_dev;
+
+#ifdef CONFIG_BLK_DEV_INITRD
+bool initramfs_fault_in_decompress(unsigned long addr);
+#else
+static inline bool initramfs_fault_in_decompress(unsigned long addr)
+{
+   return false;
+}
+#endif
+
+#endif
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -403,13 +403,27 @@ static __initdata int (*actions[])(void)
[Reset] = do_reset,
 };
 
+static unsigned long flush_start;
+static unsigned long flush_length;
+
+bool initramfs_fault_in_decompress(unsigned long addr)
+{
+   return addr >= flush_start && addr < flush_start + flush_length + 8;
+}
+
 static long __init write_buffer(char *buf, unsigned long len)
 {
+   /* Store address and length for uaccess fault handling */
+   flush_start = (unsigned long) buf;
+   flush_length = len;
+
byte_count = len;
victim = buf;
 
while (!actions[state]())
;
+   /* Clear the uaccess fault handling region */
+   flush_start = flush_length = 0;
return len - byte_count;
 }
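
As a reading aid, the window check from the patch above can be modelled in
userspace. This is only a sketch under assumptions: fault_in_window() mirrors
the predicate in initramfs_fault_in_decompress(), while
fault_in_window_guarded() is a hypothetical variant that is not part of the
patch. It is shown because, with flush_start and flush_length both cleared to
0, the predicate as written still accepts addresses 0 through 7.

```c
#include <stdbool.h>

/* Userspace model of the window that write_buffer() records in the patch. */
static unsigned long flush_start;
static unsigned long flush_length;

/* Same predicate as the patch: start <= addr < start + length + 8. */
static bool fault_in_window(unsigned long addr)
{
	return addr >= flush_start && addr < flush_start + flush_length + 8;
}

/*
 * Hypothetical stricter variant (an assumption, not in the patch):
 * reject every address while no flush is in progress.
 */
static bool fault_in_window_guarded(unsigned long addr)
{
	return flush_length != 0 && fault_in_window(addr);
}
```

Whether the cleared-window case matters in practice depends on whether a
uaccess fault on such a low kernel address can occur outside decompression;
the sketch does not settle that.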

Re: [PATCHv2] random: Make /dev/random wait for crng_ready

2019-02-16 Thread Bernd Edlinger

On 2/16/19 7:23 PM, Theodore Y. Ts'o wrote:
> 
> This really isn't a correct way to fix things; since the blocking_pool
> used for /dev/random and the CRNG state are different things, and are
> fed by different sources of entropy.
> 

It is interesting that random_poll does only look at input_pool.
Could it be that the existing entropy in blocking_pool also matters?

My observation is that _random_read always extracts the requested
number of bytes from the input_pool by calling extract_entropy_user.
By not calling extract_entropy_user it is straightforward to ensure
that the entropy in the input_pool can accumulate until the primary_crng
is initialized.

> What we should do is to have a separate flag which indicates that the
> blocking_pool has been adequately initialized, and set it only when
> the entropy count in the blocking pool is at least 128 bits.  When we get
> woken up by the reader lock, we would transfer entropy from the input
> pool to the blocking pool, and if the pool is not yet initialized,
> and the entropy count is less than 128 bits, we wait until it is.
> 

What happens for me is that single byte reads out of /dev/random are able
to pull so much entropy out of the input_pool that the input_pool
never reaches 128 bits and therefore crng_ready will never be true.

Therefore I want to delay the blocking_pool until the CRNG is fully
initialized.  I hope you agree about that.

When that is done, there are two ways how entropy is transferred
from the input_pool to the blocking_pool, just to confirm I have
not overlooked something:

/* If the input pool is getting full, send some
 * entropy to the blocking pool until it is 75% full.
 */
if (entropy_bits > random_write_wakeup_bits &&
r->initialized &&
r->entropy_total >= 2*random_read_wakeup_bits) {
struct entropy_store *other = &blocking_pool;

if (other->entropy_count <=
3 * other->poolinfo->poolfracbits / 4) {
schedule_work(&other->push_work);
r->entropy_total = 0;
}
}

This rarely happens before crng_ready, because it requires more than
128 bits of entropy, at least in my setup.

The other path is:
_random_read->extract_entropy_user->xfer_secondary_pool

which pulls at least random_read_wakeup_bits / 8 bytes
from the input_pool to the blocking_pool,
and the same amount of entropy is extracted again
from the blocking_pool, so the blocking_pool's entropy_count
effectively stays at zero when this path is taken.

That you don't like, right?

You propose to disable the second path until the first path
has pulled 128 bits into the blocking_pool, right?

Thanks
Bernd.

Re: [PATCH] VMCI: Support upto 64-bit PPNs

2019-02-16 Thread gre...@linuxfoundation.org

On Fri, Feb 15, 2019 at 04:32:47PM +0000, Vishnu DASA wrote:
> diff --git a/include/linux/vmw_vmci_defs.h b/include/linux/vmw_vmci_defs.h
> index b724ef7005de..eaa1e762bf06 100644
> --- a/include/linux/vmw_vmci_defs.h
> +++ b/include/linux/vmw_vmci_defs.h
> @@ -45,6 +45,7 @@
>  #define VMCI_CAPS_GUESTCALL 0x2
>  #define VMCI_CAPS_DATAGRAM  0x4
>  #define VMCI_CAPS_NOTIFICATIONS 0x8
> +#define VMCI_CAPS_PPN64 0x10

BIT()?
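
For readers unfamiliar with the shorthand, the comment presumably suggests
writing the capability flags with the kernel's BIT() macro instead of raw hex
values. A minimal sketch, assuming all four defines shown in the hunk would be
converted (the review only comments on the new one), with a local stand-in for
BIT() so the fragment builds outside the kernel tree:

```c
/* Local stand-in for the kernel's BIT() macro, for a userspace build only. */
#define BIT(nr) (1UL << (nr))

/* Values match the hex constants quoted in the patch hunk. */
#define VMCI_CAPS_GUESTCALL     BIT(1)	/* 0x2 */
#define VMCI_CAPS_DATAGRAM      BIT(2)	/* 0x4 */
#define VMCI_CAPS_NOTIFICATIONS BIT(3)	/* 0x8 */
#define VMCI_CAPS_PPN64         BIT(4)	/* 0x10 */
```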

Re: 4.20.7: pl2303 not working (post-4.19 regression) (limited info so far, not yet bisected)

2019-02-16 Thread Greg KH

On Sat, Feb 16, 2019 at 04:26:30PM +0000, Nix wrote:
> So I just tried to connect up to my ancient Soekris firewall's serial
> console to try to bisect a problem where it stopped booting in 4.20, and
> found I couldn't.
> 
> minicom says:
> 
> minicom: cannot open /dev/ttyUSB0: Input/output error
> 
> and in the dmesg we see
> 
> [705576.028170] pl2303 ttyUSB0: failed to submit interrupt urb: -28
> 
> Booting to 4.19, everything works fine. (A random GalliumOS Chromebook
> running 4.9.4 works fine too, not that that confirmation is terribly
> useful.)
> 
> This is an extremely preliminary report in case it's instantly obvious
> what's going on: I'll do enough investigation to produce an actually
> useful bug report, including bisecting this, after I've bisected the
> *other* non-booting bug, but that might not be until next weekend. (All
> this for a firewall I was trying to decommission! bah :) )

bisection would be great, thanks!

greg k-h

Re: [PATCH] Use the correct style for SPDX License Identifier in all header files across the kernel

2019-02-16 Thread Greg Kroah-Hartman

On Sat, Feb 16, 2019 at 04:04:02PM +0530, Nishad Kamdar wrote:
>  816 files changed, 970 insertions(+), 970 deletions(-)

You have to break this up into subsystems at the very least, no one
person can take this patch, sorry.

greg k-h

Re: [PATCH v2 1/2] leds: Add Intel Cherry Trail Whiskey Cove PMIC LEDs

2019-02-16 Thread Pavel Machek

Hi!

> I think that should work fine, which means that we can use the timer and
> pattern trigger support for the blinking and breathing modes.
> 
> That still leaves the switching between user and hw-control modes,
> as discussed the hw-controlled mode could be modelled as a new "hardware"
> trigger, but then we cannot choose between on/blink/breathing when
> in hw-controlled mode. As Pavel mentioned, that would require some
> sort of composed trigger, where we have both the hardware and
> timer triggers active for example.
> 
> I think it might be easier to just allow turning on/off the hardware
> control mode through a special "hardware_control" sysfs attribute and
> then use the existing timer and pattern triggers for blinking / breathing.
> >>>
> >>>Pattern trigger exposes pattern file by default and hw_pattern if
> >>>pattern_set/get ops are provided. Writing them enables software and
> >>>hardware pattern respectively.
> >>
> >>This is not about software vs hardware pattern.
> >>
> >>There are 2 *orthogonal*, separate problems/challenges with this LED controller:
> >>
> >>1) It has hardware blinking and breathing, as discussed this can be
> >>controlled through the timer and pattern triggers, so this problem
> >>is solved.
> >>
> >>2) It has 2 operating modes:
> >>
> >>a) Automatic/hardware controlled, in this mode the LED is turned
> >>off or on (where on can be continuously on, blinking or breathing)
> >>by the hardware itself, when in this mode we / userspace is not
> >>in control of the LED
> >>
> >>b) Manual/user controlled mode, in this mode we / userspace can
> >>control the LED.
> >>
> >>Currently there is no API in the ledclass to switch a LED from
> >>automatic controlled to user controlled and back. This is what
> >>the proposed hardware trigger was for, to switch to automatic
> >>mode. A problem with this is that we still want to be able
> >>to choose between continuously on, blinking or breathing (when on),
> >>configure the max brightness, etc.
> >
> >Yes, we do have the API to switch a LED from automatic (hardware
> >accelerated) control to software control and back. This is pattern
> >trigger, which exposes two files for setting pattern: pattern
> >and hw_pattern. Writing pattern file switches the device to software
> >control mode and writing hw_pattern switches it to the hardware control,
> >with the possibility of defining device specific ABI syntax to enable
> >particular pattern (blinking, breathing or even permanently on
> >in case of this device).
> 
> OK, I see. So we would use the hw_pattern for this and the driver
> would implement the pattern_set led_classdev callback.
> 
> The pattern_set callback would then expect 6 brightness/time tuples
> with the following meaning for the time part of each tuple
> 
> tuple0: charging blinking_on_time
> tuple1: charging blinking_off_time
> tuple2: charging breathing_time
> tuple3: manual blinking_on_time
> tuple4: manual blinking_off_time
> tuple5: manual breathing_time
> 
> Where only the times in tuples 0-2; or the times in 3-5 can be
> non-zero. Having non zero times for both some charging and some
> manual values is not allowed.
> 
> If a breathing time is set, none of the other times may be non
> 0. If blinking_on and blinking_off are used then breathing_time
> must be 0.
> 
> When configured to blink then blinking_off must be either 0
> (continuously on); or it must be the same as blinking_on.
> 
> 
> I believe this will work, does this sound ok to you ?

I don't pretend to fully understand it, _but_ hw_pattern should really
describe the pattern the LED should display, not whether it reacts to
charging or not.

Best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



xfstest status update - possible regression outside cifs in rc6

2019-02-16 Thread Steve French

Have been looking at the cifs buildbot's automated xfstest results
(azure test bucket in this case), and noticed a possible regression
outside cifs with 5.0-rc6.  Various xfstests started to fail on rc6
(didn't fail on rc5 or earlier, and no cifs changes involved).
Presumably someone else caught more of these in testing another
fs since I noticed various mm/fs patches were reverted a few days ago
and these did fix most of the xfstest regressions with one exception -
I still see xfstest 422 fail on rc6 as of a couple of days ago
(although backing up to rc5 it didn't fail, and it wasn't obvious to
me at first which fs or mm change regressed this, and there were no
cifs changes).

Will see if I can narrow it down, or repro it in a quicker environment.

-- 
Thanks,

Steve

[PATCH] rpmsg: add clock API redirection driver

2019-02-16 Thread Xiang Xiao

From: Yanlin Zhu 

which could redirect clk API from remote to the kernel

Signed-off-by: Yanlin Zhu 
---
 drivers/rpmsg/Kconfig |  10 ++
 drivers/rpmsg/Makefile|   1 +
 drivers/rpmsg/rpmsg_clk.c | 284 ++
 3 files changed, 295 insertions(+)
 create mode 100644 drivers/rpmsg/rpmsg_clk.c

diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
index 13ead55..ed04cb9 100644
--- a/drivers/rpmsg/Kconfig
+++ b/drivers/rpmsg/Kconfig
@@ -15,6 +15,16 @@ config RPMSG_CHAR
  in /dev. They make it possible for user-space programs to send and
  receive rpmsg packets.
 
+config RPMSG_CLK
+   tristate "RPMSG clock API redirection"
+   depends on COMMON_CLK
+   depends on RPMSG
+   help
+ Say Y here to redirect clock API from the remote processor.
+ With this driver, the remote processor could:
+ 1.Reuse the clock driver in the kernel side, or
+ 2.Form a hybrid(kernel plus RTOS) clock tree.
+
 config RPMSG_SYSLOG
tristate "RPMSG syslog redirection"
depends on RPMSG
diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
index bfd22df..0d777b1 100644
--- a/drivers/rpmsg/Makefile
+++ b/drivers/rpmsg/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_RPMSG) += rpmsg_core.o
 obj-$(CONFIG_RPMSG_CHAR)   += rpmsg_char.o
+obj-$(CONFIG_RPMSG_CLK) += rpmsg_clk.o
 obj-$(CONFIG_RPMSG_SYSLOG) += rpmsg_syslog.o
 obj-$(CONFIG_RPMSG_QCOM_GLINK_RPM) += qcom_glink_rpm.o
 obj-$(CONFIG_RPMSG_QCOM_GLINK_NATIVE) += qcom_glink_native.o
diff --git a/drivers/rpmsg/rpmsg_clk.c b/drivers/rpmsg/rpmsg_clk.c
new file mode 100644
index 0000000..0ec0241
--- /dev/null
+++ b/drivers/rpmsg/rpmsg_clk.c
@@ -0,0 +1,284 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2016 Pinecone Inc.
+ *
+ * redirect clk API from remote to the kernel.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#define RPMSG_CLK_ENABLE   0
+#define RPMSG_CLK_DISABLE  1
+#define RPMSG_CLK_SETRATE  2
+#define RPMSG_CLK_SETPHASE 3
+#define RPMSG_CLK_GETPHASE 4
+#define RPMSG_CLK_GETRATE  5
+#define RPMSG_CLK_ROUNDRATE 6
+#define RPMSG_CLK_ISENABLED 7
+
+struct rpmsg_clk_header {
+   u32 command;
+   u32 response;
+   s64 result;
+   u64 cookie;
+} __packed;
+
+struct rpmsg_clk_enable {
+   struct rpmsg_clk_header header;
+   char name[0];
+} __packed;
+
+#define rpmsg_clk_disable  rpmsg_clk_enable
+#define rpmsg_clk_isenabled rpmsg_clk_enable
+
+struct rpmsg_clk_setrate {
+   struct rpmsg_clk_header header;
+   u64 rate;
+   char name[0];
+} __packed;
+
+#define rpmsg_clk_getrate  rpmsg_clk_enable
+#define rpmsg_clk_roundrate rpmsg_clk_setrate
+
+struct rpmsg_clk_setphase {
+   struct rpmsg_clk_header header;
+   u32 degrees;
+   char name[0];
+} __packed;
+
+#define rpmsg_clk_getphase rpmsg_clk_enable
+
+struct rpmsg_clk_res {
+   struct clk  *clk;
+   atomic_t count;
+};
+
+static void rpmsg_clk_release(struct device *dev, void *res)
+{
+   struct rpmsg_clk_res *clkres = res;
+   int count = atomic_read(&clkres->count);
+
+   while (count-- > 0)
+   clk_disable_unprepare(clkres->clk);
+
+   clk_put(clkres->clk);
+}
+
+static int rpmsg_clk_match(struct device *dev, void *res, void *data)
+{
+   struct rpmsg_clk_res *clkres = res;
+
+   return !strcmp(__clk_get_name(clkres->clk), data);
+}
+
+static struct rpmsg_clk_res *
+rpmsg_clk_get_res(struct rpmsg_device *rpdev, const char *name)
+{
+   struct rpmsg_clk_res *clkres;
+   struct clk *clk;
+
+   clkres = devres_find(&rpdev->dev, rpmsg_clk_release,
+rpmsg_clk_match, (void *)name);
+   if (clkres)
+   return clkres;
+
+   clkres = devres_alloc(rpmsg_clk_release, sizeof(*clkres), GFP_KERNEL);
+   if (!clkres)
+   return ERR_PTR(-ENOMEM);
+
+   clk = clk_get(&rpdev->dev, name);
+   if (IS_ERR(clk)) {
+   devres_free(clkres);
+   return ERR_CAST(clk);
+   }
+
+   clkres->clk = clk;
+   devres_add(&rpdev->dev, clkres);
+
+   return clkres;
+}
+
+static int rpmsg_clk_enable_handler(struct rpmsg_device *rpdev,
+   void *data, int len, void *priv, u32 src)
+{
+   struct rpmsg_clk_enable *msg = data;
+   struct rpmsg_clk_res *clkres = rpmsg_clk_get_res(rpdev, msg->name);
+
+   if (!IS_ERR(clkres)) {
+   msg->header.result = clk_prepare_enable(clkres->clk);
+   if (msg->header.result >= 0)
+   atomic_inc(&clkres->count);
+   } else {
+   msg->header.result = PTR_ERR(clkres)

[PATCH V3] Add syslog redirection driver

2019-02-16 Thread Xiang Xiao

Changes vs. V2:

 - Remove memrchr function
 - Output message line by line

Guiding Li (1):
  rpmsg: add syslog redirection driver

 drivers/rpmsg/Kconfig|  12 
 drivers/rpmsg/Makefile   |   1 +
 drivers/rpmsg/rpmsg_syslog.c | 163 +++
 3 files changed, 176 insertions(+)
 create mode 100644 drivers/rpmsg/rpmsg_syslog.c

-- 
2.7.4

[PATCH V3] rpmsg: add syslog redirection driver

2019-02-16 Thread Xiang Xiao

From: Guiding Li 

This driver allows the remote processor to redirect the output of
syslog or printf into the kernel log, which is very useful to see
what happens on the remote side.

Signed-off-by: Guiding Li 
---
 drivers/rpmsg/Kconfig|  12 
 drivers/rpmsg/Makefile   |   1 +
 drivers/rpmsg/rpmsg_syslog.c | 161 +++
 3 files changed, 174 insertions(+)
 create mode 100644 drivers/rpmsg/rpmsg_syslog.c

diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
index d0322b4..13ead55 100644
--- a/drivers/rpmsg/Kconfig
+++ b/drivers/rpmsg/Kconfig
@@ -15,6 +15,18 @@ config RPMSG_CHAR
  in /dev. They make it possible for user-space programs to send and
  receive rpmsg packets.
 
+config RPMSG_SYSLOG
+   tristate "RPMSG syslog redirection"
+   depends on RPMSG
+   help
+ Say Y here to redirect the syslog/printf from remote processor into
+ the kernel log which is very useful to see what happened in the remote
+ side.
+
+ If the remote processor hangs during bootup or panics during runtime,
+ we can even cat /sys/kernel/debug/remoteproc/remoteprocX/trace0 to
+ get the last log which hasn't been output yet.
+
 config RPMSG_QCOM_GLINK_NATIVE
tristate
select RPMSG
diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
index 9aa8595..bfd22df 100644
--- a/drivers/rpmsg/Makefile
+++ b/drivers/rpmsg/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_RPMSG) += rpmsg_core.o
 obj-$(CONFIG_RPMSG_CHAR)   += rpmsg_char.o
+obj-$(CONFIG_RPMSG_SYSLOG) += rpmsg_syslog.o
 obj-$(CONFIG_RPMSG_QCOM_GLINK_RPM) += qcom_glink_rpm.o
 obj-$(CONFIG_RPMSG_QCOM_GLINK_NATIVE) += qcom_glink_native.o
 obj-$(CONFIG_RPMSG_QCOM_GLINK_SMEM) += qcom_glink_smem.o
diff --git a/drivers/rpmsg/rpmsg_syslog.c b/drivers/rpmsg/rpmsg_syslog.c
new file mode 100644
index 0000000..06e3f86
--- /dev/null
+++ b/drivers/rpmsg/rpmsg_syslog.c
@@ -0,0 +1,161 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2017 Pinecone Inc.
+ *
+ * redirect syslog/printf from remote to the kernel.
+ */
+
+#include 
+#include 
+#include 
+
+#define RPMSG_SYSLOG_TRANSFER  0
+#define RPMSG_SYSLOG_TRANSFER_DONE 1
+#define RPMSG_SYSLOG_SUSPEND   2
+#define RPMSG_SYSLOG_RESUME 3
+
+struct rpmsg_syslog_header {
+   u32 command;
+   s32 result;
+} __packed;
+
+struct rpmsg_syslog_transfer {
+   struct rpmsg_syslog_header  header;
+   u32 count;
+   char data[0];
+} __packed;
+
+#define rpmsg_syslog_suspend   rpmsg_syslog_header
+#define rpmsg_syslog_resume rpmsg_syslog_header
+#define rpmsg_syslog_transfer_done rpmsg_syslog_header
+
+struct rpmsg_syslog {
+   char *buf;
+   unsigned int next;
+   unsigned int size;
+};
+
+static int rpmsg_syslog_callback(struct rpmsg_device *rpdev,
+void *data, int len, void *priv_, u32 src)
+{
+   struct rpmsg_syslog *priv = dev_get_drvdata(&rpdev->dev);
+   struct rpmsg_syslog_transfer *msg = data;
+   struct rpmsg_syslog_transfer_done done;
+   char *s, *e, *nl;
+
+   if (msg->header.command != RPMSG_SYSLOG_TRANSFER)
+   return -EINVAL;
+
+   s = msg->data;
+   e = s + msg->count;
+
+   /* redirect the message to kernel log line by line */
+   while ((nl = strnchr(s, e - s, '\n')) != NULL) {
+   *nl = '\0';
+   if (priv->next) {
+   pr_info("%.*s%s\n", priv->next, priv->buf, s);
+   priv->next = 0;
+   } else {
+   pr_info("%s\n", s);
+   }
+   s = nl + 1;
+   }
+
+   /* append the remain to the buffer */
+   if (s != e) {
+   unsigned int size = e - s;
+   unsigned int newsize = priv->next + size;
+
+   if (newsize > priv->size) {
+   char *newbuf;
+
+   newbuf = krealloc(priv->buf, newsize, GFP_KERNEL);
+   if (newbuf) {
+   priv->buf  = newbuf;
+   priv->size = newsize;
+   } else {
+   size = priv->size - priv->next;
+   }
+   }
+
+   strncpy(priv->buf + priv->next, s, size);
+   priv->next += size;
+   s += size;
+   }
+
+   done.command = RPMSG_SYSLOG_TRANSFER_DONE;
+   done.result  = s - msg->data;
+   return rpmsg_send(rpdev->ept, &done, sizeof(done));
+}
+
+static int rpmsg_syslog_probe(struct rpmsg_device *rpdev)
+{
+   struct rpmsg_syslog *priv;
+
+   priv = devm_kzalloc(&rpdev->dev,

Re: void __iomem *addr should be const

2019-02-16 Thread Thomas Gleixner

On Sun, 10 Feb 2019, Hugo Lefeuvre wrote:
> __iomem *addr seems to lack a const qualifier in the ioread* definitions
> from include/asm-generic/iomap.h (other definitions such as
> asm-generic/io.h do have the const).
> 
> This issue was briefly discussed a while ago[0] but the outcome is not
> quite clear to me. Should I submit a patch addressing this issue or is it
> a definitive wontfix?
> 
> This issue triggers warnings in my current work but I would like to avoid
> dropping the const if possible.

The const makes perfect sense and we should have a consistent state all
over the place.

Thanks,

tglx

Re: [PATCH v2 1/2] Provide in-kernel headers for making it easy to extend the kernel

2019-02-16 Thread Manoj



Joel Fernandes writes:

> On Thu, Feb 14, 2019 at 07:19:29PM -0800, Alexei Starovoitov wrote:
>> On Mon, Feb 11, 2019 at 09:35:59AM -0500, Joel Fernandes (Google) wrote:
>> > Introduce in-kernel headers and other artifacts which are made available
>> > as an archive through proc (/proc/kheaders.txz file). This archive makes
>> > it possible to build kernel modules, run eBPF programs, and other
>> > tracing programs that need to extend the kernel for tracing purposes
>> > without any dependency on the file system having headers and build
>> > artifacts.
>> > 
>> > On Android and embedded systems, it is common to switch kernels but not
>> > have kernel headers available on the file system. Raw kernel headers
>> > also cannot be copied into the filesystem like they can be on other
>> > distros, due to licensing and other issues. There's no linux-headers
>> > package on Android. Further once a different kernel is booted, any
>> > headers stored on the file system will no longer be useful. By storing
>> > the headers as a compressed archive within the kernel, we can avoid these
>> > issues that have been a hindrance for a long time.
>> 
>> The set looks good to me and since the main use case is building bpf progs
>> I can route it via bpf-next tree if there are no objections.
>> Masahiro, could you please ack it?
>> 
>
> Yes, eBPF is one of the usecases. After this, I am also planning to send
> patches to BCC so that it can use this feature when compiling C to eBPF.
>

Tested-by: Manoj Rao 

I think this can definitely make it easier to use eBPF on
Android. Thanks for initiating this.

> Thanks!
>
>  - Joel


-- 
Manoj
http://www.mycpu.org

Re: [PATCH v2 1/2] leds: Add Intel Cherry Trail Whiskey Cove PMIC LEDs

2019-02-16 Thread Hans de Goede


Hi,

On 2/16/19 6:02 PM, Jacek Anaszewski wrote:

Hi Hans,

On 2/16/19 12:14 AM, Hans de Goede wrote:

Hi,

On 2/15/19 11:31 PM, Jacek Anaszewski wrote:

On 2/15/19 11:26 PM, Hans de Goede wrote:





I think that should work fine, which means that we can use the timer and
pattern trigger support for the blinking and breathing modes.

That still leaves the switching between user and hw-control modes,
as discussed the hw-controlled mode could be modelled as a new "hardware"
trigger, but then we cannot choose between on/blink/breathing when
in hw-controlled mode. As Pavel mentioned, that would require some
sort of composed trigger, where we have both the hardware and
timer triggers active for example.

I think it might be easier to just allow turning on/off the hardware
control mode through a special "hardware_control" sysfs attribute and
then use the existing timer and pattern triggers for blinking / breathing.


Pattern trigger exposes pattern file by default and hw_pattern if
pattern_set/get ops are provided. Writing them enables software and
hardware pattern respectively.


This is not about software vs hardware pattern.

There are 2 *orthogonal*, separate problems/challenges with this LED controller:

1) It has hardware blinking and breathing, as discussed this can be
controlled through the timer and pattern triggers, so this problem
is solved.

2) It has 2 operating modes:

a) Automatic/hardware controlled, in this mode the LED is turned
off or on (where on can be continuously on, blinking or breathing)
by the hardware itself, when in this mode we / userspace is not
in control of the LED

b) Manual/user controlled mode, in this mode we / userspace can
control the LED.

Currently there is no API in the ledclass to switch a LED from
automatic controlled to user controlled and back. This is what
the proposed hardware trigger was for, to switch to automatic
mode. A problem with this is that we still want to be able
to choose between continuously on, blinking or breathing (when on),
configure the max brightness, etc.


Yes, we do have the API to switch a LED from automatic (hardware
accelerated) control to software control and back. This is pattern
trigger, which exposes two files for setting pattern: pattern
and hw_pattern. Writing pattern file switches the device to software
control mode and writing hw_pattern switches it to the hardware control,
with the possibility of defining device specific ABI syntax to enable
particular pattern (blinking, breathing or even permanently on
in case of this device).


OK, I see. So we would use the hw_pattern for this and the driver
would implement the pattern_set led_classdev callback.

The pattern_set callback would then expect 6 brightness/time tuples
with the following meaning for the time part of each tuple

tuple0: charging blinking_on_time
tuple1: charging blinking_off_time
tuple2: charging breathing_time
tuple3: manual blinking_on_time
tuple4: manual blinking_off_time
tuple5: manual breathing_time

Where only the times in tuples 0-2; or the times in 3-5 can be
non-zero. Having non zero times for both some charging and some
manual values is not allowed.

If a breathing time is set, none of the other times may be non
0. If blinking_on and blinking_off are used then breathing_time
must be 0.

When configured to blink then blinking_off must be either 0
(continuously on); or it must be the same as blinking_on.
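
The rules above could be checked roughly as follows. This is only a sketch of
the proposal as worded here, not driver code: struct pattern_tuple is a local
stand-in for the brightness/time tuples, check_hw_pattern() and its helpers
are hypothetical names, and accepting an all-zero pattern is an assumption
since the text does not cover that case.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_TUPLES 6	/* three charging tuples followed by three manual ones */

/* Local stand-in for one brightness/time tuple written to hw_pattern. */
struct pattern_tuple {
	unsigned int delta_t;	/* time part; 0 means "not used" */
	int brightness;
};

enum { BLINK_ON, BLINK_OFF, BREATHING, TIMES_PER_MODE };

/* True if any of the three times in one group (charging or manual) is set. */
static bool group_is_used(const struct pattern_tuple *g)
{
	return g[BLINK_ON].delta_t || g[BLINK_OFF].delta_t ||
	       g[BREATHING].delta_t;
}

/* Check the breathing/blinking rules within one group of three times. */
static bool group_times_valid(const struct pattern_tuple *g)
{
	unsigned int on = g[BLINK_ON].delta_t;
	unsigned int off = g[BLINK_OFF].delta_t;

	if (g[BREATHING].delta_t)
		return on == 0 && off == 0;	/* breathing excludes blinking */

	/* blinking_off is either 0 (continuously on) or equal to blinking_on */
	return off == 0 || off == on;
}

/* Hypothetical helper: 0 if the tuples follow the rules, -EINVAL otherwise. */
static int check_hw_pattern(const struct pattern_tuple *p, size_t len)
{
	const struct pattern_tuple *charging, *manual;

	if (len != NUM_TUPLES)
		return -EINVAL;

	charging = p;
	manual = p + TIMES_PER_MODE;

	/* Only one of the two groups may carry non-zero times. */
	if (group_is_used(charging) && group_is_used(manual))
		return -EINVAL;
	if (!group_times_valid(charging) || !group_times_valid(manual))
		return -EINVAL;

	return 0;
}
```

For example, a charging blink with equal on and off times passes, while
setting both a charging and a manual time, or a breathing time together with
a blink time, is rejected.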


I believe this will work, does this sound ok to you ?

Regards,

Hans

Re: [v6] coccinelle: semantic code search for missing put_device()

2019-02-16 Thread Markus Elfring

> In a function, for a local variable obtained by of_find_device_by_node(),

My understanding is that a variable cannot be obtained from this function call.
Rather, the return value (a pointer in this use case) can be stored in one.


> v6:
> - to be double sure, replace &id->dev with (T)(&id->dev).

The support for data type casts is another interesting extension for
this source code analysis approach.
Further adjustments might become possible at other places of the presented
SmPL script
after specific clarifications of previously mentioned implementation details.

Regards,
Markus

Re: [LSF/MM TOPIC] FS, MM, and stable trees

2019-02-16 Thread Theodore Y. Ts'o

On Thu, Feb 14, 2019 at 06:48:22PM -0800, James Bottomley wrote:
> Well, we differ on the value of running regression tests, then.  The
> whole point of a test infrastructure is that it's simple to run 'make
> check' in autoconf parlance.  xfstests does provide a useful baseline
> set of regression tests.  However, since my goal is primarily to detect
> problems in the storage path rather than the filesystem, the utility is
> exercising that path, although I fully appreciate that filesystem
> regression tests aren't going to catch every SCSI issue, they do
> provide some level of assurance against bugs.
> 
> Hopefully we can switch over to blktests when it's ready, but in the
> meantime xfstests is way better than nothing.

blktests isn't yet comprehensive, but I think there's value in running
blktests as well as xfstests.  I've been integrating blktests into
{kvm,gce}-xfstests because if the problem is caused by some regression
introduced in the block layer, I'm not wasting time trying to figure
out if it's caused by the block layer or not.  It won't catch
everything, but at least it has some value...

The block/*, loop/* and scsi/* tests in blktests do seem to be in
pretty good shape.  The nvme, nvmeof, and srp tests are *definitely*
not as mature.

- Ted

Re: [PATCHv2] random: Make /dev/random wait for crng_ready

2019-02-16 Thread Theodore Y. Ts'o

On Fri, Feb 15, 2019 at 01:58:20PM +0000, Bernd Edlinger wrote:
> Reading from /dev/random may return data while the getrandom
> syscall is still blocking.
> 
> Those bytes are not yet cryptographically secure.
> 
> The first byte from /dev/random can have as little
> as 8 bits entropy estimation.  Once a read blocks, it will
> block until /proc/sys/kernel/random/read_wakeup_threshold
> bits are available, which is usually 64 bits, but can be
> configured as low as 8 bits.  A select will wake up when
> at least read_wakeup_threshold bits are available.
> Also when constantly reading bytes out of /dev/random
> it will prevent the crng init done event forever.
> 
> Fixed by making read and select on /dev/random wait until
> the crng is fully initialized.
> 
> Signed-off-by: Bernd Edlinger 

This really isn't a correct way to fix things; since the blocking_pool
used for /dev/random and the CRNG state are different things, and are
fed by different sources of entropy.

What we should do is to have a separate flag which indicates that the
blocking_pool has been adequately initialized, and set it only when
the entropy count in the blocking pool is at least 128 bits.  When we get
woken up by the reader lock, we would transfer entropy from the input
pool to the blocking pool, and if the pool is not yet initialized,
and the entropy count is less than 128 bits, we wait until it is.
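
A toy model of that idea, purely as a sketch: struct pool_model and the three
helpers below are hypothetical and are not the kernel's struct entropy_store
or its functions. The only things taken from the text are the separate
"initialized" flag and the 128-bit threshold.

```c
#include <stdbool.h>

#define BLOCKING_POOL_INIT_BITS 128	/* threshold named in the text */

/* Minimal stand-in for the blocking pool state. */
struct pool_model {
	unsigned int entropy_bits;
	bool initialized;	/* the proposed separate flag */
};

/* Credit bits transferred from the input pool; latch the flag at 128 bits. */
static void pool_credit(struct pool_model *p, unsigned int bits)
{
	p->entropy_bits += bits;
	if (!p->initialized && p->entropy_bits >= BLOCKING_POOL_INIT_BITS)
		p->initialized = true;
}

/* A reader may only proceed once the pool has been initialized. */
static bool pool_reader_may_proceed(const struct pool_model *p)
{
	return p->initialized;
}

/* Hand out up to 'bits' of entropy; returns 0 if the reader has to wait. */
static unsigned int pool_extract(struct pool_model *p, unsigned int bits)
{
	if (!pool_reader_may_proceed(p))
		return 0;
	if (bits > p->entropy_bits)
		bits = p->entropy_bits;
	p->entropy_bits -= bits;
	return bits;
}
```

In this model the flag is sticky: early small reads cannot drain the pool
before it has reached 128 bits, and after that point reads behave as before.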

   - Ted

[PATCH] ARM: dts: sun7i: add pinctrl for missing uart mux options

2019-02-16 Thread Mans Rullgard

This adds pinctrl settings for various missing uart options.

Signed-off-by: Mans Rullgard 
---
 arch/arm/boot/dts/sun7i-a20.dtsi | 45 
 1 file changed, 45 insertions(+)

diff --git a/arch/arm/boot/dts/sun7i-a20.dtsi b/arch/arm/boot/dts/sun7i-a20.dtsi
index af5b067a5f83..2295ff5adf48 100644
--- a/arch/arm/boot/dts/sun7i-a20.dtsi
+++ b/arch/arm/boot/dts/sun7i-a20.dtsi
@@ -944,6 +944,31 @@
function = "uart0";
};
 
+   uart0_pf_pins: uart0-pf-pins {
+   pins = "PF2", "PF4";
+   function = "uart0";
+   };
+
+   uart1_pa_pins: uart1-pa-pins {
+   pins = "PA10", "PA11";
+   function = "uart1";
+   };
+
+   uart1_cts_rts_pa_pins: uart1-cts-rts-pa-pins {
+   pins = "PA12", "PA13";
+   function = "uart1";
+   };
+
+   uart2_pa_pins: uart2-pa-pins {
+   pins = "PA2", "PA3";
+   function = "uart2";
+   };
+
+   uart2_cts_rts_pa_pins: uart2-cts-rts-pa-pins {
+   pins = "PA0", "PA1";
+   function = "uart2";
+   };
+
uart2_pi_pins: uart2-pi-pins {
pins = "PI18", "PI19";
function = "uart2";
@@ -969,6 +994,11 @@
function = "uart3";
};
 
+   uart3_cts_rts_ph_pins: uart3-cts-rts-ph-pins {
+   pins = "PH2", "PH3";
+   function = "uart3";
+   };
+
uart4_pg_pins: uart4-pg-pins {
pins = "PG10", "PG11";
function = "uart4";
@@ -979,16 +1009,31 @@
function = "uart4";
};
 
+   uart5_ph_pins: uart5-ph-pins {
+   pins = "PH6", "PH7";
+   function = "uart5";
+   };
+
uart5_pi_pins: uart5-pi-pins {
pins = "PI10", "PI11";
function = "uart5";
};
 
+   uart6_pa_pins: uart6-pa-pins {
+   pins = "PA12", "PA13";
+   function = "uart6";
+   };
+
uart6_pi_pins: uart6-pi-pins {
pins = "PI12", "PI13";
function = "uart6";
};
 
+   uart7_pa_pins: uart7-pa-pins {
+   pins = "PA14", "PA15";
+   function = "uart7";
+   };
+
uart7_pi_pins: uart7-pi-pins {
pins = "PI20", "PI21";
function = "uart7";
-- 
2.20.1

Re: [PATCH 4.20 237/352] igb: Fix an issue that PME is not enabled during runtime suspend

2019-02-16 Thread niveditas98

Bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=202571

The igb commit is fb29f76cc5668d8fc9e2b04ebfaf4de4f62e1866


On Tue, Feb 12, 2019 at 7:21 PM Arvind Sankar  wrote:
>
> After upgrading to 4.20.8, I got a WARN in my dmesg and I suspect this
> change. Reverting it removes this warning.
>
> I am not subscribed to the list, so please Cc me on any follow-up questions.
>
> [   12.457238] ------------[ cut here ]------------
> [   12.457251] PCI PM: State of device not saved by 
> igb_runtime_suspend+0x0/0x80
> [   12.457275] WARNING: CPU: 1 PID: 175 at
> drivers/pci/pci-driver.c:1280 pci_pm_runtime_suspend+0x127/0x130
> [   12.457277] Modules linked in: 8021q zfs(PO) zunicode(PO) zlua(PO)
> zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) zlib_inflate
> x86_pkg_temp_thermal kvm_intel kvm coretemp hwmon irqbypass
> crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64
> crypto_simd cryptd glue_helper intel_cstate intel_uncore
> intel_rapl_perf intel_pch_thermal i2c_i801 ipmi_si ipmi_devintf
> ipmi_msghandler
> [   12.457318] CPU: 1 PID: 175 Comm: kworker/1:1 Tainted: P
> OT 4.20.8 #16
> [   12.457320] Hardware name: Supermicro Super Server/X10SDV-7TP8F,
> BIOS 2.0 06/13/2018
> [   12.457329] Workqueue: pm pm_runtime_work
> [   12.457337] RIP: 0010:pci_pm_runtime_suspend+0x127/0x130
> [   12.457340] Code: f7 0f 00 8b 44 24 04 eb a1 44 39 e8 74 9a 80 3d
> cb 7b d0 00 00 75 91 48 c7 c7 d8 b3 75 a2 c6 05 bb 7b d0 00 01 e8 fe
> 37 cf ff <0f> 0b 31 c0 e9 77 ff ff ff 41 55 41 54 45 31 e4 55 53 48 89
> fb 48
> [   12.457343] RSP: 0018:b300406abd88 EFLAGS: 00010282
> [   12.457347] RAX:  RBX: 965fc6ff20b0 RCX: 
> a2829cb8
> [   12.457350] RDX: 0001 RSI: 0082 RDI: 
> a3755d88
> [   12.457353] RBP: a24a7980 R08: 0001 R09: 
> 0442
> [   12.457355] R10: 0003 R11:  R12: 
> 965fc6ff2000
> [   12.457358] R13:  R14: a1b62550 R15: 
> 
> [   12.457361] FS:  () GS:96671f44()
> knlGS:
> [   12.457365] CS:  0010 DS:  ES:  CR0: 80050033
> [   12.457368] CR2: 7fff7670c11c CR3: 0002c7c0a004 CR4: 
> 003606e0
> [   12.457370] DR0:  DR1:  DR2: 
> 
> [   12.457373] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [   12.457374] Call Trace:
> [   12.457387]  ? __switch_to_asm+0x34/0x70
> [   12.457395]  __rpm_callback+0x70/0x1b0
> [   12.457401]  ? __switch_to_asm+0x34/0x70
> [   12.457406]  ? pci_dev_put+0x20/0x20
> [   12.457412]  rpm_callback+0x66/0x90
> [   12.457418]  ? pci_dev_put+0x20/0x20
> [   12.457424]  rpm_suspend+0x15a/0x570
> [   12.457431]  pm_runtime_work+0x8c/0xa0
> [   12.457439]  process_one_work+0x1d2/0x350
> [   12.457446]  worker_thread+0x28/0x3d0
> [   12.457453]  ? wq_calc_node_cpumask.constprop.50+0x20/0x20
> [   12.457457]  kthread+0x103/0x120
> [   12.457462]  ? kthread_create_on_node+0x90/0x90
> [   12.457468]  ret_from_fork+0x35/0x40
> [   12.457472] ---[ end trace aa262aac85ec967a ]---

Re: [PATCH] random: fix inconsistent spinlock usage

2019-02-16 Thread Theodore Y. Ts'o

On Fri, Feb 15, 2019 at 02:03:06PM -0800, Sultan Alsawaf wrote:
> All users of the struct entropy_store spinlock use the irqsave spinlock 
> variant.
> Spinlock users of the same lock should be consistent in their use of a
> certain spinlock primitive, which makes add_interrupt_randomness()'s spinlock
> usage incorrect.
> 
> Fix the inconsistency by converting add_interrupt_randomness()'s spinlocks to
> use the irqsave primitive.
> 
> Signed-off-by: Sultan Alsawaf 

This isn't a problem; interrupts are off by definition when
add_interrupt_randomness() is called so there's no point using the
irqsave version.

Also, please note that your patches are whitespace damaged, so they
can't be applied directly.  You may want to look into how you are
sending your patches.

Regards,

- Ted

Re: ext4 corruption on alpha with 4.20.0-09062-gd8372ba8ce28

2019-02-16 Thread Theodore Y. Ts'o

On Fri, Feb 15, 2019 at 06:59:48PM +0200, Meelis Roos wrote:
> The result of the bisection is
> [88dbcbb3a4847f5e6dfeae952d3105497700c128] blkdev: avoid migration stalls for 
> blkdev pages
> 
> Is that result relevant for the problem or should I continue bisecting 
> between 4.20.0 and the so far first bad commit?

Can you try reverting the commit and see if it makes the problem go away?

 - Ted

Re: find_get_entries_tag regression bisected

2019-02-16 Thread Matthew Wilcox

On Sat, Feb 16, 2019 at 07:35:11AM -0800, Matthew Wilcox wrote:
> Another way to fix this would be to mask the address in dax_entry_mkclean(),
> but I think this is cleaner.

That's clearly rubbish, dax_entry_mkclean() can't possibly mask the
address.  It might be mis-aligned in another process.  But ... if it's
misaligned in another process, dax_entry_mkclean() will only clean the first
PTE associated with the PMD; it won't clean the whole thing.  I think we need
something like this:

(I'll have to split it apart to give us something to backport)

diff --git a/fs/dax.c b/fs/dax.c
index 6959837cc465..09680aa0481f 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -768,7 +768,7 @@ unsigned long pgoff_address(pgoff_t pgoff, struct 
vm_area_struct *vma)
 
 /* Walk all mappings of a given index of a file and writeprotect them */
 static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
-   unsigned long pfn)
+   pgoff_t end, unsigned long pfn)
 {
struct vm_area_struct *vma;
pte_t pte, *ptep = NULL;
@@ -776,7 +776,7 @@ static void dax_entry_mkclean(struct address_space 
*mapping, pgoff_t index,
spinlock_t *ptl;
 
i_mmap_lock_read(mapping);
-   vma_interval_tree_foreach(vma, &mapping->i_mmap, index, index) {
+   vma_interval_tree_foreach(vma, &mapping->i_mmap, index, end) {
struct mmu_notifier_range range;
unsigned long address;
 
@@ -843,9 +843,9 @@ static void dax_entry_mkclean(struct address_space 
*mapping, pgoff_t index,
 static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev,
struct address_space *mapping, void *entry)
 {
-   unsigned long pfn;
+   unsigned long pfn, index;
long ret = 0;
-   size_t size;
+   unsigned long count;
 
/*
 * A page got tagged dirty in DAX mapping? Something is seriously
@@ -894,17 +894,18 @@ static int dax_writeback_one(struct xa_state *xas, struct 
dax_device *dax_dev,
xas_unlock_irq(xas);
 
/*
-* Even if dax_writeback_mapping_range() was given a wbc->range_start
-* in the middle of a PMD, the 'index' we are given will be aligned to
-* the start index of the PMD, as will the pfn we pull from 'entry'.
+* If dax_writeback_mapping_range() was given a wbc->range_start
+* in the middle of a PMD, the 'index' we are given needs to be
+* aligned to the start index of the PMD.
 * This allows us to flush for PMD_SIZE and not have to worry about
 * partial PMD writebacks.
 */
pfn = dax_to_pfn(entry);
-   size = PAGE_SIZE << dax_entry_order(entry);
+   count = 1UL << dax_entry_order(entry);
+   index = xas->xa_index &~ (count - 1);
 
-   dax_entry_mkclean(mapping, xas->xa_index, pfn);
-   dax_flush(dax_dev, page_address(pfn_to_page(pfn)), size);
+   dax_entry_mkclean(mapping, index, index + count - 1, pfn);
+   dax_flush(dax_dev, page_address(pfn_to_page(pfn)), count * PAGE_SIZE);
/*
 * After we have flushed the cache, we can clear the dirty tag. There
 * cannot be new dirty data in the pfn after the flush has completed as
@@ -917,8 +918,7 @@ static int dax_writeback_one(struct xa_state *xas, struct 
dax_device *dax_dev,
xas_clear_mark(xas, PAGECACHE_TAG_DIRTY);
dax_wake_entry(xas, entry, false);
 
-   trace_dax_writeback_one(mapping->host, xas->xa_index,
-   size >> PAGE_SHIFT);
+   trace_dax_writeback_one(mapping->host, xas->xa_index, count);
return ret;
 
  put_unlocked:
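
The index arithmetic in the last hunk can be checked in isolation. The
following is a userspace sketch only; the helper and struct names are
hypothetical, and 'order' stands for what dax_entry_order() returns.

```c
/* Hypothetical helper mirroring the arithmetic in dax_writeback_one(). */
struct index_range {
	unsigned long first;	/* start index, aligned to the entry size */
	unsigned long last;	/* inclusive end index */
};

struct index_range entry_index_range(unsigned long index, unsigned int order)
{
	unsigned long count = 1UL << order;	/* pages covered by the entry */
	struct index_range r;

	r.first = index & ~(count - 1);		/* round down to the entry start */
	r.last = r.first + count - 1;
	return r;
}
```

For an order-9 entry (512 pages) and an index in the middle of it, the
range covers the whole entry, which is what the interval tree walk needs
to find mappings that start at a different offset.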

[patch v6 6/7] PCI/MSI: Remove obsolete sanity checks for multiple interrupt sets

2019-02-16 Thread Thomas Gleixner

Multiple interrupt sets for affinity spreading are now handled in the core
code and the number of sets and their size is recalculated via a driver
supplied callback.

That avoids the requirement to invoke pci_alloc_irq_vectors_affinity() with
the arguments minvecs and maxvecs set to the same value and the callsite
handling the ENOSPC situation.

Remove the now obsolete sanity checks and the related comments.

Signed-off-by: Thomas Gleixner 

---
 drivers/pci/msi.c |   14 --
 1 file changed, 14 deletions(-)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1035,13 +1035,6 @@ static int __pci_enable_msi_range(struct
if (maxvec < minvec)
return -ERANGE;
 
-   /*
-* If the caller is passing in sets, we can't support a range of
-* vectors. The caller needs to handle that.
-*/
-   if (affd && affd->nr_sets && minvec != maxvec)
-   return -EINVAL;
-
if (WARN_ON_ONCE(dev->msi_enabled))
return -EINVAL;
 
@@ -1093,13 +1086,6 @@ static int __pci_enable_msix_range(struc
if (maxvec < minvec)
return -ERANGE;
 
-   /*
-* If the caller is passing in sets, we can't support a range of
-* supported vectors. The caller needs to handle that.
-*/
-   if (affd && affd->nr_sets && minvec != maxvec)
-   return -EINVAL;
-
if (WARN_ON_ONCE(dev->msix_enabled))
return -EINVAL;

[patch v6 5/7] genirq/affinity: Remove the leftovers of the original set support

2019-02-16 Thread Thomas Gleixner

Now that the NVME driver is converted over to the calc_set() callback, the
workarounds of the original set support can be removed.

Signed-off-by: Thomas Gleixner 
---
 kernel/irq/affinity.c |   20 
 1 file changed, 4 insertions(+), 16 deletions(-)

Index: b/kernel/irq/affinity.c
===
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -264,20 +264,13 @@ irq_create_affinity_masks(unsigned int n
 
/*
 * Simple invocations do not provide a calc_sets() callback. Install
-* the generic one. The check for affd->nr_sets is a temporary
-* workaround and will be removed after the NVME driver is converted
-* over.
+* the generic one.
 */
-   if (!affd->nr_sets && !affd->calc_sets)
+   if (!affd->calc_sets)
affd->calc_sets = default_calc_sets;
 
-   /*
-* If the device driver provided a calc_sets() callback let it
-* recalculate the number of sets and their size. The check will go
-* away once the NVME driver is converted over.
-*/
-   if (affd->calc_sets)
-   affd->calc_sets(affd, affvecs);
+   /* Recalculate the sets */
+   affd->calc_sets(affd, affvecs);
 
if (WARN_ON_ONCE(affd->nr_sets > IRQ_AFFINITY_MAX_SETS))
return NULL;
@@ -344,11 +337,6 @@ unsigned int irq_calc_affinity_vectors(u
 
if (affd->calc_sets) {
set_vecs = maxvec - resv;
-   } else if (affd->nr_sets) {
-   unsigned int i;
-
-   for (i = 0, set_vecs = 0;  i < affd->nr_sets; i++)
-   set_vecs += affd->set_size[i];
} else {
get_online_cpus();
set_vecs = cpumask_weight(cpu_possible_mask);

[patch v6 7/7] genirq/affinity: Add support for non-managed affinity sets

2019-02-16 Thread Thomas Gleixner

Some drivers need an extra set of interrupts which should not be marked
managed, but should get initial interrupt spreading.

Add a bitmap to struct irq_affinity which allows the driver to mark a
particular set of interrupts as non managed. Check the bitmap during
spreading and use the result to mark the interrupts in the sets
accordingly.

The unmanaged interrupts get initial spreading, but user space can change
their affinity later on. For the managed sets, i.e. the corresponding bit
in the mask is not set, there is no change in behaviour.

Usage example:

	struct irq_affinity affd = {
		.pre_vectors	= 2,
		.unmanaged_sets	= 0x02,
		.calc_sets	= drv_calc_sets,
	};


For both interrupt sets the interrupts are properly spread out, but the
second set is not marked managed.
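
A standalone sketch of the marking rule described above, where a set is
managed when its bit in the mask is clear. The function names and the
flat bool array are hypothetical illustrations, not the kernel's
struct irq_affinity_desc handling.

```c
#include <stdbool.h>

/* A set is managed when its bit in the unmanaged bitmap is NOT set. */
bool set_is_managed(unsigned int unmanaged_sets, unsigned int i)
{
	return !(unmanaged_sets & (1U << i));
}

/* Walk the sets in order, starting after the pre_vectors, and record for
 * each vector whether it belongs to a managed set. */
void mark_managed(unsigned int unmanaged_sets, const unsigned int *set_size,
		  unsigned int nr_sets, unsigned int first_vec,
		  bool *is_managed)
{
	unsigned int i, idx, curvec = first_vec;

	for (i = 0; i < nr_sets; i++) {
		unsigned int end = curvec + set_size[i];

		for (idx = curvec; idx < end; idx++)
			is_managed[idx] = set_is_managed(unmanaged_sets, i);
		curvec = end;
	}
}
```

With two pre_vectors, set sizes of 3 and 2, and a mask of 0x02, vectors
2 to 4 end up managed and vectors 5 and 6 do not.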

Signed-off-by: Thomas Gleixner 
---
 include/linux/interrupt.h |2 ++
 kernel/irq/affinity.c |   16 +++-
 2 files changed, 13 insertions(+), 5 deletions(-)

Index: b/include/linux/interrupt.h
===
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -251,6 +251,7 @@ struct irq_affinity_notify {
  * the MSI(-X) vector space
  * @nr_sets:   The number of interrupt sets for which affinity
  * spreading is required
+ * @unmanaged_sets:Bitmap to mark entries in the @set_size array unmanaged
  * @set_size:  Array holding the size of each interrupt set
  * @calc_sets: Callback for calculating the number and size
  * of interrupt sets
@@ -261,6 +262,7 @@ struct irq_affinity {
unsigned int    pre_vectors;
unsigned int    post_vectors;
unsigned int    nr_sets;
+   unsigned int    unmanaged_sets;
unsigned int    set_size[IRQ_AFFINITY_MAX_SETS];
void            (*calc_sets)(struct irq_affinity *, unsigned int nvecs);
void            *priv;
Index: b/kernel/irq/affinity.c
===
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -249,6 +249,8 @@ irq_create_affinity_masks(unsigned int n
unsigned int affvecs, curvec, usedvecs, i;
struct irq_affinity_desc *masks = NULL;
 
+   BUILD_BUG_ON(IRQ_AFFINITY_MAX_SETS > sizeof(affd->unmanaged_sets) * 8);
+
/*
 * Determine the number of vectors which need interrupt affinities
 * assigned. If the pre/post request exhausts the available vectors
@@ -292,7 +294,8 @@ irq_create_affinity_masks(unsigned int n
 * have multiple sets, build each sets affinity mask separately.
 */
for (i = 0, usedvecs = 0; i < affd->nr_sets; i++) {
-   unsigned int this_vecs = affd->set_size[i];
+   bool managed = !(affd->unmanaged_sets & (1U << i));
+   unsigned int idx, this_vecs = affd->set_size[i];
int ret;
 
ret = irq_build_affinity_masks(affd, curvec, this_vecs,
@@ -301,8 +304,15 @@ irq_create_affinity_masks(unsigned int n
kfree(masks);
return NULL;
}
+
+   idx = curvec;
curvec += this_vecs;
usedvecs += this_vecs;
+   if (managed) {
+   /* Mark the managed interrupts */
+   for (; idx < curvec; idx++)
+   masks[idx].is_managed = 1;
+   }
}
 
/* Fill out vectors at the end that don't need affinity */
@@ -313,10 +323,6 @@ irq_create_affinity_masks(unsigned int n
for (; curvec < nvecs; curvec++)
cpumask_copy(&masks[curvec].mask, irq_default_affinity);
 
-   /* Mark the managed interrupts */
-   for (i = affd->pre_vectors; i < nvecs - affd->post_vectors; i++)
-   masks[i].is_managed = 1;
-
return masks;
 }

[patch v6 3/7] genirq/affinity: Add new callback for (re)calculating interrupt sets

2019-02-16 Thread Thomas Gleixner

From: Ming Lei 

The interrupt affinity spreading mechanism supports spreading out
affinities for one or more interrupt sets. An interrupt set contains one or
more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queues of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilities and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via a
pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires a loop
in the driver to determine the maximum number of interrupts which are
provided by the PCI capabilities and the underlying CPU resources.  This
loop would have to be replicated in every driver which wants to utilize
this mechanism. That's unwanted code duplication and error prone.

In order to move this into generic facilities it is required to have a
mechanism, which allows the recalculation of the interrupt sets and their
size, in the core code. As the core code does not have any knowledge about the
underlying device, a driver specific callback is required in struct
irq_affinity, which can be invoked by the core code. The callback gets the
number of available interrupts as an argument, so the driver can calculate the
corresponding number and size of interrupt sets.

At the moment the struct irq_affinity pointer which is handed in from the
driver and passed through to several core functions is marked 'const', but for
the callback to be able to modify the data in the struct it's required to
remove the 'const' qualifier.

Add the optional callback to struct irq_affinity, which allows drivers to
recalculate the number and size of interrupt sets and remove the 'const'
qualifier.

For simple invocations, which do not supply a callback, a default callback
is installed, which just sets nr_sets to 1 and transfers the number of
spreadable vectors to the set_size array at index 0.
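
The default behaviour described in the previous paragraph can be
sketched as follows. This is a userspace illustration: the struct is a
cut-down, hypothetical stand-in for struct irq_affinity, and the
temporary nr_sets guard mentioned below is left out.

```c
#define SKETCH_AFFINITY_MAX_SETS 4

/* Hypothetical, reduced stand-in for struct irq_affinity. */
struct affinity_sketch {
	unsigned int nr_sets;
	unsigned int set_size[SKETCH_AFFINITY_MAX_SETS];
	void (*calc_sets)(struct affinity_sketch *, unsigned int nvecs);
};

/* Default callback: a single set which takes all spreadable vectors. */
void sketch_default_calc_sets(struct affinity_sketch *affd,
			      unsigned int affvecs)
{
	affd->nr_sets = 1;
	affd->set_size[0] = affvecs;
}

/* Core side: install the default when the driver supplied no callback,
 * then let the callback (re)calculate the sets. */
void sketch_recalc(struct affinity_sketch *affd, unsigned int affvecs)
{
	if (!affd->calc_sets)
		affd->calc_sets = sketch_default_calc_sets;
	affd->calc_sets(affd, affvecs);
}
```

Because the core writes to the struct here, the pointer handed in by
the driver cannot be 'const', which is the reason for dropping the
qualifier.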

This is for now guarded by a check for nr_sets != 0 to keep the NVME driver
working until it is converted to the callback mechanism.

To make sure that the driver configuration is correct under all circumstances
the callback is invoked even when there are no interrupts for queues left,
i.e. the pre/post requirements already exhaust the number of available
interrupts.

At the PCI layer irq_create_affinity_masks() has to be invoked even for the
case where the legacy interrupt is used. That ensures that the callback is
invoked and the device driver can adjust to that situation.

[ tglx: Fixed the simple case (no sets required). Moved the sanity check
for nr_sets after the invocation of the callback so it catches
broken drivers. Fixed the kernel doc comments for struct
irq_affinity and de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei 
Signed-off-by: Thomas Gleixner 

---
 drivers/pci/msi.c   |   25 ++--
 drivers/scsi/be2iscsi/be_main.c |2 -
 include/linux/interrupt.h   |   10 +-
 include/linux/pci.h |4 +-
 kernel/irq/affinity.c   |   62 
 5 files changed, 71 insertions(+), 32 deletions(-)

Index: b/drivers/pci/msi.c
===
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -532,7 +532,7 @@ static int populate_msi_sysfs(struct pci
 }
 
 static struct msi_desc *
-msi_setup_entry(struct pci_dev *dev, int nvec, const struct irq_affinity *affd)
+msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity *affd)
 {
struct irq_affinity_desc *masks = NULL;
struct msi_desc *entry;
@@ -597,7 +597,7 @@ static int msi_verify_entries(struct pci
  * which could have been allocated.
  */
 static int msi_capability_init(struct pci_dev *dev, int nvec,
-  const struct irq_affinity *affd)
+  struct irq_affinity *affd)
 {
struct msi_desc *entry;
int ret;
@@ -669,7 +669,7 @@ static void __iomem *msix_map_region(str
 
 static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
  struct msix_entry *entries, int nvec,
- const struct irq_affinity *affd)
+ struct irq_affinity *affd)
 {
struct irq_affinity_desc *curmsk, *masks = NULL;
struct msi_desc *entry;
@@ -736,7 +736,7 @@ static void msix_program_entries(struct
  * requested MSI-X entries with allocated irqs or non-zero for otherwise.
  **/
 static int msix_capability_init(struct pci_dev *dev, struct msix_entry 
*entries,
-   int nvec, const struct irq_affinity *affd)
+   int

[patch v6 4/7] nvme-pci: Simplify interrupt allocation

2019-02-16 Thread Thomas Gleixner

From: Ming Lei 

The NVME PCI driver contains a tedious mechanism for interrupt
allocation, which is necessary to adjust the number and size of interrupt
sets to the maximum available number of interrupts which depends on the
underlying PCI capabilities and the available CPU resources.

It works around the former shortcomings of the PCI and core interrupt
allocation mechanisms in combination with interrupt sets.

The PCI interrupt allocation function accepts a maximum and a
minimum number of interrupts to be allocated and tries to allocate as
many as possible. This worked without driver interaction as long as there
was only a single set of interrupts to handle.

With the addition of support for multiple interrupt sets in the generic
affinity spreading logic, which is invoked from the PCI interrupt
allocation, the adaptive loop in the PCI interrupt allocation did not
work for multiple interrupt sets. The reason is that depending on the
total number of interrupts which the PCI allocation adaptive loop tries
to allocate in each step, the number and the size of the interrupt sets
need to be adapted as well. Due to the way the interrupt sets support was
implemented there was no way for the PCI interrupt allocation code or the
core affinity spreading mechanism to invoke a driver specific function
for adapting the interrupt sets configuration.

As a consequence the driver had to implement another adaptive loop around
the PCI interrupt allocation function and calling that with maximum and
minimum interrupts set to the same value. This ensured that the
allocation either succeeded or immediately failed without any attempt to
adjust the number of interrupts in the PCI code.

The core code now allows drivers to provide a callback to recalculate the
number and the size of interrupt sets during PCI interrupt allocation,
which in turn allows the PCI interrupt allocation function to be called
in the same way as with a single set of interrupts. The PCI code handles
the adaptive loop and the interrupt affinity spreading mechanism invokes
the driver callback to adapt the interrupt set configuration to the
current loop value. This replaces the adaptive loop in the driver
completely.

Implement the NVME specific callback which adjusts the interrupt sets
configuration and remove the adaptive allocation loop.
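
The interaction between the adaptive loop and the driver callback can be
modelled in userspace C as below. Everything here is a hypothetical
sketch: the names, the 'available' parameter standing in for what the
hardware can provide, and in particular the final 'else' branch and the
set assignments of the callback, which are assumptions and are not taken
from the diff below.

```c
/* Hypothetical model of the driver side configuration. */
struct sets_sketch {
	unsigned int pre_vectors;	/* e.g. one for the admin queue */
	unsigned int write_queues;	/* requested write queues */
	unsigned int nr_sets;
	unsigned int set_size[2];	/* [0] default/write, [1] read */
};

/* Driver callback: split 'nrirqs' queue interrupts into write and read sets. */
void sketch_calc_sets(struct sets_sketch *s, unsigned int nrirqs)
{
	unsigned int nr_read;

	if (!nrirqs) {
		nrirqs = 1;
		nr_read = 0;
	} else if (nrirqs == 1 || !s->write_queues) {
		nr_read = 0;
	} else if (s->write_queues >= nrirqs) {
		nr_read = 1;
	} else {
		/* Assumption: the remaining interrupts go to read queues. */
		nr_read = nrirqs - s->write_queues;
	}
	s->set_size[0] = nrirqs - nr_read;
	s->set_size[1] = nr_read;
	s->nr_sets = nr_read ? 2 : 1;
}

/* Core side adaptive loop: start at maxvec, shrink to what is available
 * and recalculate the sets on every attempt. Returns the number of
 * vectors allocated, or -1 if fewer than minvec are available. */
int sketch_alloc_range(struct sets_sketch *s, unsigned int minvec,
		       unsigned int maxvec, unsigned int available)
{
	unsigned int nvec = maxvec;

	for (;;) {
		if (nvec < minvec)
			return -1;
		sketch_calc_sets(s, nvec > s->pre_vectors ?
				    nvec - s->pre_vectors : 0);
		if (nvec <= available)
			return (int)nvec;
		nvec = available;	/* retry with what can be provided */
	}
}
```

The driver only supplies the callback; the retry logic lives in one
place in the core instead of being duplicated in every driver.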

[ tglx: Simplify the callback further and restore the dropped adjustment of
number of sets ]

Signed-off-by: Ming Lei 
Signed-off-by: Thomas Gleixner 

---
 drivers/nvme/host/pci.c |  116 
 1 file changed, 39 insertions(+), 77 deletions(-)

Index: b/drivers/nvme/host/pci.c
===
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2041,41 +2041,42 @@ static int nvme_setup_host_mem(struct nv
return ret;
 }
 
-/* irq_queues covers admin queue */
-static void nvme_calc_io_queues(struct nvme_dev *dev, unsigned int irq_queues)
+/*
+ * nrirqs is the number of interrupts available for write and read
+ * queues. The core already reserved an interrupt for the admin queue.
+ */
+static void nvme_calc_irq_sets(struct irq_affinity *affd, unsigned int nrirqs)
 {
-   unsigned int this_w_queues = write_queues;
-
-   WARN_ON(!irq_queues);
-
-   /*
-* Setup read/write queue split, assign admin queue one independent
-* irq vector if irq_queues is > 1.
-*/
-   if (irq_queues <= 2) {
-   dev->io_queues[HCTX_TYPE_DEFAULT] = 1;
-   dev->io_queues[HCTX_TYPE_READ] = 0;
-   return;
-   }
-
-   /*
-* If 'write_queues' is set, ensure it leaves room for at least
-* one read queue and one admin queue
-*/
-   if (this_w_queues >= irq_queues)
-   this_w_queues = irq_queues - 2;
+   struct nvme_dev *dev = affd->priv;
+   unsigned int nr_read_queues;
 
/*
-* If 'write_queues' is set to zero, reads and writes will share
-* a queue set.
-*/
-   if (!this_w_queues) {
-   dev->io_queues[HCTX_TYPE_DEFAULT] = irq_queues - 1;
-   dev->io_queues[HCTX_TYPE_READ] = 0;
+* If there is no interrupt available for queues, ensure that
+* the default queue is set to 1. The affinity set size is
+* also set to one, but the irq core ignores it for this case.
+*
+* If only one interrupt is available or 'write_queues' == 0, combine
+* write and read queues.
+*
+* If 'write_queues' > 0, ensure it leaves room for at least one read
+* queue.
+*/
+   if (!nrirqs) {
+   nrirqs = 1;
+   nr_read_queues = 0;
+   } else if (nrirqs == 1 || !write_queues) {
+   nr_read_queues = 0;
+   } else if (write_queues >= nrirqs) {
+   nr_read_queues = 1;
} else {
-   dev->io_queues[HCTX_TYPE_DEFAULT] = this_w_q

[patch v6 2/7] genirq/affinity: Store interrupt sets size in struct irq_affinity

2019-02-16 Thread Thomas Gleixner

From: Ming Lei 

The interrupt affinity spreading mechanism supports spreading out
affinities for one or more interrupt sets. An interrupt set contains one
or more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queues of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilities and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via
a pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires a
loop in the driver to determine the maximum number of interrupts which
are provided by the PCI capabilities and the underlying CPU resources.
This loop would have to be replicated in every driver which wants to
utilize this mechanism. That's unwanted code duplication and error
prone.

In order to move this into generic facilities it is required to have a
mechanism, which allows the recalculation of the interrupt sets and
their size, in the core code. As the core code does not have any
knowledge about the underlying device, a driver specific callback will
be added to struct irq_affinity, which will be invoked by the core
code. The callback will get the number of available interrupts as an
argument, so the driver can calculate the corresponding number and size
of interrupt sets.

To support this, two modifications for the handling of struct irq_affinity
are required:

1) The (optional) interrupt sets size information is contained in a
   separate array of integers and struct irq_affinity contains a
   pointer to it.

   This is cumbersome and as the maximum number of interrupt sets is small,
   there is no reason to have separate storage. Moving the size array into
   struct irq_affinity avoids indirections and makes the code simpler.

2) At the moment the struct irq_affinity pointer which is handed in from
   the driver and passed through to several core functions is marked
   'const'.

   With the upcoming callback to recalculate the number and size of
   interrupt sets, it's necessary to remove the 'const'
   qualifier. Otherwise the callback would not be able to update the data.

Implement #1 and store the interrupt sets size in 'struct irq_affinity'.

No functional change.

[ tglx: Fixed the memcpy() size so it won't copy beyond the size of the
source. Fixed the kernel doc comments for struct irq_affinity and
de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei 
Signed-off-by: Thomas Gleixner 

---
 drivers/nvme/host/pci.c   |7 +++
 include/linux/interrupt.h |9 ++---
 kernel/irq/affinity.c |   16 
 3 files changed, 21 insertions(+), 11 deletions(-)

--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2081,12 +2081,11 @@ static void nvme_calc_io_queues(struct n
 static int nvme_setup_irqs(struct nvme_dev *dev, unsigned int nr_io_queues)
 {
struct pci_dev *pdev = to_pci_dev(dev->dev);
-   int irq_sets[2];
struct irq_affinity affd = {
-   .pre_vectors = 1,
-   .nr_sets = ARRAY_SIZE(irq_sets),
-   .sets = irq_sets,
+   .pre_vectors= 1,
+   .nr_sets= 2,
};
+   unsigned int *irq_sets = affd.set_size;
int result = 0;
unsigned int irq_queues, this_p_queues;
 
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -241,20 +241,23 @@ struct irq_affinity_notify {
void (*release)(struct kref *ref);
 };
 
+#define IRQ_AFFINITY_MAX_SETS  4
+
 /**
  * struct irq_affinity - Description for automatic irq affinity assignements
  * @pre_vectors:   Don't apply affinity to @pre_vectors at beginning of
  * the MSI(-X) vector space
  * @post_vectors:  Don't apply affinity to @post_vectors at end of
  * the MSI(-X) vector space
- * @nr_sets:   Length of passed in *sets array
- * @sets:  Number of affinitized sets
+ * @nr_sets:   The number of interrupt sets for which affinity
+ * spreading is required
+ * @set_size:  Array holding the size of each interrupt set
  */
 struct irq_affinity {
unsigned int    pre_vectors;
unsigned int    post_vectors;
unsigned int    nr_sets;
-   unsigned int    *sets;
+   unsigned int    set_size[IRQ_AFFINITY_MAX_SETS];
 };
 
 /**
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -238,9 +238,10 @@ static int irq_build_affinity_masks(cons
  * Returns the irq_affinity_desc pointer or NULL if allocation failed.
  */
 struct irq_affinity_desc *
-irq_create_affinity_masks(unsigned int nvecs, const struct irq_affinity *affd)
+irq_create_affinity_masks(unsigned in

[patch v6 1/7] genirq/affinity: Code consolidation

2019-02-16 Thread Thomas Gleixner

All information and calculations in the interrupt affinity spreading code
are strictly unsigned int, though the code uses int all over the place.

Convert it over to unsigned int.

Signed-off-by: Thomas Gleixner 
---
 include/linux/interrupt.h |   20 +---
 kernel/irq/affinity.c |   56 ++
 2 files changed, 38 insertions(+), 38 deletions(-)

--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -251,10 +251,10 @@ struct irq_affinity_notify {
  * @sets:  Number of affinitized sets
  */
 struct irq_affinity {
-   int pre_vectors;
-   int post_vectors;
-   int nr_sets;
-   int *sets;
+   unsigned int    pre_vectors;
+   unsigned int    post_vectors;
+   unsigned int    nr_sets;
+   unsigned int    *sets;
 };
 
 /**
@@ -314,9 +314,10 @@ extern int
 irq_set_affinity_notifier(unsigned int irq, struct irq_affinity_notify 
*notify);
 
 struct irq_affinity_desc *
-irq_create_affinity_masks(int nvec, const struct irq_affinity *affd);
+irq_create_affinity_masks(unsigned int nvec, const struct irq_affinity *affd);
 
-int irq_calc_affinity_vectors(int minvec, int maxvec, const struct 
irq_affinity *affd);
+unsigned int irq_calc_affinity_vectors(unsigned int minvec, unsigned int 
maxvec,
+  const struct irq_affinity *affd);
 
 #else /* CONFIG_SMP */
 
@@ -350,13 +351,14 @@ irq_set_affinity_notifier(unsigned int i
 }
 
 static inline struct irq_affinity_desc *
-irq_create_affinity_masks(int nvec, const struct irq_affinity *affd)
+irq_create_affinity_masks(unsigned int nvec, const struct irq_affinity *affd)
 {
return NULL;
 }
 
-static inline int
-irq_calc_affinity_vectors(int minvec, int maxvec, const struct irq_affinity 
*affd)
+static inline unsigned int
+irq_calc_affinity_vectors(unsigned int minvec, unsigned int maxvec,
+ const struct irq_affinity *affd)
 {
return maxvec;
 }
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -9,7 +9,7 @@
 #include 
 
 static void irq_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
-                                int cpus_per_vec)
+                                unsigned int cpus_per_vec)
 {
    const struct cpumask *siblmsk;
    int cpu, sibl;
@@ -95,15 +95,17 @@ static int get_nodes_in_cpumask(cpumask_
 }
 
 static int __irq_build_affinity_masks(const struct irq_affinity *affd,
-                                      int startvec, int numvecs, int firstvec,
+                                      unsigned int startvec,
+                                      unsigned int numvecs,
+                                      unsigned int firstvec,
                                       cpumask_var_t *node_to_cpumask,
                                       const struct cpumask *cpu_mask,
                                       struct cpumask *nmsk,
                                       struct irq_affinity_desc *masks)
 {
-   int n, nodes, cpus_per_vec, extra_vecs, done = 0;
-   int last_affv = firstvec + numvecs;
-   int curvec = startvec;
+   unsigned int n, nodes, cpus_per_vec, extra_vecs, done = 0;
+   unsigned int last_affv = firstvec + numvecs;
+   unsigned int curvec = startvec;
    nodemask_t nodemsk = NODE_MASK_NONE;
 
    if (!cpumask_weight(cpu_mask))
@@ -117,18 +119,16 @@ static int __irq_build_affinity_masks(co
     */
    if (numvecs <= nodes) {
        for_each_node_mask(n, nodemsk) {
-           cpumask_or(&masks[curvec].mask,
-                   &masks[curvec].mask,
-                   node_to_cpumask[n]);
+           cpumask_or(&masks[curvec].mask, &masks[curvec].mask,
+                      node_to_cpumask[n]);
            if (++curvec == last_affv)
                curvec = firstvec;
        }
-       done = numvecs;
-       goto out;
+       return numvecs;
    }
 
    for_each_node_mask(n, nodemsk) {
-       int ncpus, v, vecs_to_assign, vecs_per_node;
+       unsigned int ncpus, v, vecs_to_assign, vecs_per_node;
 
        /* Spread the vectors per node */
        vecs_per_node = (numvecs - (curvec - firstvec)) / nodes;
@@ -163,8 +163,6 @@ static int __irq_build_affinity_masks(co
            curvec = firstvec;
        --nodes;
    }
-
-out:
    return done;
 }
 
@@ -174,13 +172,14 @@ static int __irq_build_affinity_masks(co
  * 2) spread other possible CPUs on these vectors
  */
 static int irq_build_affinity_masks(const struct irq_affinity *affd,
-   int startvec, int numvecs, int firstvec,
+   unsigned int startvec, unsigned int numvecs,
+   unsigned int firstvec,
str
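
For readers following the hunk above, the round-robin branch taken when
numvecs <= nodes can be modelled in userspace roughly as below. This is only
a sketch under stated assumptions: uint64_t bitmasks stand in for struct
cpumask, a plain array index stands in for for_each_node_mask(), and the
function name spread_nodes_round_robin() is made up for illustration. It is
not the kernel code itself.

```c
#include <stdint.h>

/*
 * Userspace model of the "numvecs <= nodes" branch: OR each node's CPU
 * mask into the vectors in [firstvec, firstvec + numvecs), wrapping back
 * to firstvec once the last vector has been used.
 *
 * Assumptions (not from the patch): uint64_t replaces struct cpumask and
 * nodes are simply numbered 0..nr_nodes-1.
 */
static unsigned int spread_nodes_round_robin(uint64_t *masks,
                                             const uint64_t *node_to_cpumask,
                                             unsigned int nr_nodes,
                                             unsigned int startvec,
                                             unsigned int numvecs,
                                             unsigned int firstvec)
{
    unsigned int last_affv = firstvec + numvecs;
    unsigned int curvec = startvec;
    unsigned int n;

    for (n = 0; n < nr_nodes; n++) {
        masks[curvec] |= node_to_cpumask[n];
        /* All quantities are unsigned, so the wrap is a plain equality test. */
        if (++curvec == last_affv)
            curvec = firstvec;
    }
    return numvecs;
}
```

With three nodes, one pre-vector and two spreadable vectors (startvec =
firstvec = 1, numvecs = 2), nodes 0 and 2 end up sharing vector 1 while node 1
lands on vector 2, and vector 0 is left untouched.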

[patch v6 0/7] genirq/affinity: Overhaul the multiple interrupt sets support

2019-02-16 Thread Thomas Gleixner

This is the final update to the series with a few corner case fixes
vs. V5 which can be found here:

   https://lkml.kernel.org/r/20190214204755.819014...@linutronix.de

The series applies against:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master

and is also available from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.irq

Changes vs. V5:

  - Change the invocation for the driver callback so it is invoked even
    when there are no interrupts left to spread out. That ensures that the
    driver can adjust to the situation (in case of NVME a single interrupt).

  - Make sure the callback is invoked in the legacy irq fallback case so
    the driver is not in a stale state from a failed MSI[X] allocation
    attempt.

  - Fix the adjustment logic in the NVME driver as pointed out by Ming and
    Marc, plus another corner case I found during testing.

  - Simplify the unmanaged set support
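
The first two points can be illustrated with a small userspace model. It is
a sketch only: struct model_affinity, struct model_queues, the calc_sets
member, model_driver_calc_sets() and model_prepare_sets() are hypothetical
names chosen for this example and are not taken from the series. The point it
tries to show is that calling the driver callback unconditionally, even with
zero spreadable vectors, lets the driver reset whatever it computed during an
earlier, failed attempt.

```c
#define MODEL_MAX_SETS 4

/* Hypothetical stand-in for an affinity descriptor with a driver callback. */
struct model_affinity {
    unsigned int pre_vectors;
    unsigned int post_vectors;
    unsigned int nr_sets;
    unsigned int set_size[MODEL_MAX_SETS];
    void (*calc_sets)(struct model_affinity *affd, unsigned int affvecs);
    void *priv;
};

/* Hypothetical driver state: queue counts derived from the vector count. */
struct model_queues {
    unsigned int read_queues;   /* how many read queues the user asked for */
    unsigned int io_queues[2];  /* [0] = default queues, [1] = read queues */
};

/* Driver side: split the spreadable vectors into a default and a read set. */
static void model_driver_calc_sets(struct model_affinity *affd,
                                   unsigned int affvecs)
{
    struct model_queues *q = affd->priv;
    unsigned int nr_read = q->read_queues;

    if (affvecs < 2)
        nr_read = 0;            /* nothing to split */
    else if (nr_read >= affvecs)
        nr_read = affvecs - 1;  /* always keep one default queue */

    q->io_queues[0] = affvecs - nr_read;
    q->io_queues[1] = nr_read;
    affd->set_size[0] = affvecs - nr_read;
    affd->set_size[1] = nr_read;
    affd->nr_sets = nr_read ? 2 : 1;
}

/* Core side: compute the spreadable vectors and always call the driver. */
static unsigned int model_prepare_sets(struct model_affinity *affd,
                                       unsigned int nvec)
{
    unsigned int resv = affd->pre_vectors + affd->post_vectors;
    unsigned int affvecs = nvec > resv ? nvec - resv : 0;

    /* Invoked even when affvecs == 0 so the driver can adjust its state. */
    if (affd->calc_sets)
        affd->calc_sets(affd, affvecs);
    return affvecs;
}
```

In this model, a first attempt with nine vectors and one pre-vector leaves the
driver with five default and three read queues. A subsequent fallback to a
single vector calls the callback with zero spreadable vectors, which clears
both counts instead of leaving the stale 5/3 split behind.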

Thanks,

tglx

8<-
 drivers/nvme/host/pci.c |  117 +---
 drivers/pci/msi.c   |   39 +
 drivers/scsi/be2iscsi/be_main.c |2 
 include/linux/interrupt.h   |   35 ---
 include/linux/pci.h |4 -
 kernel/irq/affinity.c   |  116 ---
 6 files changed, 153 insertions(+), 160 deletions(-)

Re: [PATCH net-next 1/3] net: stmmac: Fix NAPI poll in TX path when in multi-queue

2019-02-16 Thread Lendacky, Thomas



On 2/15/19 6:35 PM, Florian Fainelli wrote:
> On 2/15/19 5:42 AM, Jose Abreu wrote:
>> Commit 8fce33317023 introduced the concept of NAPI per-channel and
>> independent cleaning of TX path.
>>
>> This is currently breaking performance in some cases. The scenario
>> happens when all packets are being received in Queue 0 but the TX is
>> performed in Queue != 0.
>>
>> Fix this by using different NAPI instances per each TX and RX queue, as
>> suggested by Florian.
>>
>> Signed-off-by: Jose Abreu 
>> Cc: Florian Fainelli 
>> Cc: Joao Pinto 
>> Cc: David S. Miller 
>> Cc: Giuseppe Cavallaro 
>> Cc: Alexandre Torgue 
>> ---
> 
> [snip]
> 
>> -if (work_done < budget && napi_complete_done(napi, work_done)) {
>> -int stat;
>> +priv->xstats.napi_poll++;
>>   
>> +work_done = stmmac_tx_clean(priv, budget, chan);
>> +if (work_done < budget && napi_complete_done(napi, work_done))
> 
> You should not be bounding your TX queue against the NAPI budget, it
> should run unbound and clean as much as it can, which could be the
> entire ring size if that is how many packets you pushed between
> interrupts. That could be the cause of poor performance as well.

Won't returning the budget value cause this napi_poll routine to be called
again, where the driver can continue to clean TX packets? I thought this
just gives other drivers the opportunity to run their napi_poll routines
in between so as not to be starved.
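
To make the question concrete, here is a tiny userspace model of the behaviour
I have in mind. It is only a sketch and none of the names (struct model_ring,
model_tx_poll(), model_run_polls()) are kernel APIs. It assumes that a poll
routine which reports the full budget is treated as having more work and is
polled again, while one that reports less than the budget is done.

```c
/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_ring {
    unsigned int pending;   /* completed TX descriptors awaiting cleanup */
};

/* Clean at most 'budget' descriptors and report how many were cleaned. */
static unsigned int model_tx_poll(struct model_ring *ring, unsigned int budget)
{
    unsigned int done = ring->pending < budget ? ring->pending : budget;

    ring->pending -= done;
    return done;
}

/*
 * Model of the polling loop: keep calling the poll routine while it
 * consumes its whole budget. Returns the number of poll invocations.
 * The 'budget &&' guard only prevents an endless loop for budget == 0.
 */
static unsigned int model_run_polls(struct model_ring *ring,
                                    unsigned int budget)
{
    unsigned int calls = 0;
    unsigned int done;

    do {
        done = model_tx_poll(ring, budget);
        calls++;
        /* Other pollers would get their turn here between rounds. */
    } while (budget && done == budget);

    return calls;
}
```

With 150 pending descriptors and a budget of 64, the model needs three rounds
(64, 64, 22) to drain the ring, so a bounded poll still cleans everything, just
not in one go.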

Thanks,
Tom

>
