Re: [PATCH v2 5/8] KVM: arm/arm64: vgic: Handle mapped level sensitive SPIs

2017-08-28 Thread Auger Eric
Hi Christoffer,

On 29/08/2017 08:45, Christoffer Dall wrote:
> On Tue, Aug 22, 2017 at 04:33:42PM +0200, Auger Eric wrote:
>> Hi Christoffer,
>>
>> On 21/07/2017 14:11, Christoffer Dall wrote:
>>> On Thu, Jun 15, 2017 at 02:52:37PM +0200, Eric Auger wrote:
>>>> Currently, the line level of unmapped level sensitive SPIs is
>>>> toggled down by the maintenance IRQ handler/resamplefd mechanism.
>>>>
>>>> As mapped SPI completion is not trapped, we cannot rely on this
>>>> mechanism and the line level needs to be observed at distributor
>>>> level instead.
>>>>
>>>> This patch handles the physical IRQ case in vgic_validate_injection
>>>> and get the line level of a mapped SPI at distributor level.
>>>>
>>>> Signed-off-by: Eric Auger
>>>>
>>>> ---
>>>>
>>>> v1 -> v2:
>>>> - renamed is_unshared_mapped into is_mapped_spi
>>>> - changes to kvm_vgic_map_phys_irq moved in the previous patch
>>>> - make vgic_validate_injection more readable
>>>> - reword the commit message
>>>> ---
>>>>  virt/kvm/arm/vgic/vgic.c | 16 ++--
>>>>  virt/kvm/arm/vgic/vgic.h |  7 ++-
>>>>  2 files changed, 20 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
>>>> index 075f073..2e35ac7 100644
>>>> --- a/virt/kvm/arm/vgic/vgic.c
>>>> +++ b/virt/kvm/arm/vgic/vgic.c
>>>> @@ -139,6 +139,17 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
>>>>  	kfree(irq);
>>>>  }
>>>>  
>>>> +bool irq_line_level(struct vgic_irq *irq)
>>>> +{
>>>> +	bool line_level = irq->line_level;
>>>> +
>>>> +	if (unlikely(is_mapped_spi(irq)))
>>>> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
>>>> +					      IRQCHIP_STATE_PENDING,
>>>> +					      &line_level));
>>>> +	return line_level;
>>>> +}
>>>> +
>>>>  /**
>>>>   * kvm_vgic_target_oracle - compute the target vcpu for an irq
>>>>   *
>>>> @@ -236,13 +247,14 @@ static void vgic_sort_ap_list(struct kvm_vcpu *vcpu)
>>>>  
>>>>  /*
>>>>   * Only valid injection if changing level for level-triggered IRQs or for a
>>>> - * rising edge.
>>>> + * rising edge. Injection of virtual interrupts associated to physical
>>>> + * interrupts always is valid.
>>>
>>> why?  I don't remember this now, and that means I probably won't in the
>>> future either.
>>
>> Sorry for the late reply.
>>
>> The life cycle of the physically mapped IRQ is as follows:
>> - pIRQ becomes pending
>> - pIRQ is acknowledged by the physical handler and becomes active
>> - vIRQ gets injected as part of the physical handler chain
>>   (VFIO->irqfd kvm_vgic_inject_irq for instance). Linux irq cannot
>>   hit until vIRQ=pIRQ deactivation
>> - guest deactivates the vIRQ which automatically deactivates the pIRQ
>>
>>
>> So to me if we are about to vgic_validate_injection() an injection of a
>> physically mapped vIRQ this means the vgic state is ready to accept it:
>> previous occurrence was deactivated. There cannot be any state
>> inconsistency around the line_level/level.
>>
>> Do you agree?
>>
>> I will add this description at least in the commit message.
> 
> I think the important point is, that even though we don't change the
> level, we still add it to the AP list if not already there, and
> therefore we still do this.
> 
>>>
>>> When I look at this now, I'm thinking, if we're not going to change
>>> anything, why proceed beyond validate injection?
>>
>> don't catch this one. validation always succeeds and then we further
>> handle the IRQ.
> 
> The problem is that the code suggests that we will not change something,
> but in fact, later on, in the caller, we do queue this IRQ if not on the
> AP list, even though there was no state change in the struct IRQ.
> 
> But Marc and I sketched out another proposal which could benefit the
> timer as well.  Let me try to verify if it works and send it to you and
> see if that is an improvement over this one.

OK, looking forward to studying your proposal

Thanks

Eric
> 
> Thanks,
> -Christoffer
> 


Re: linux-next: manual merge of the tty tree with the parisc-hd tree

2017-08-28 Thread Greg KH
On Tue, Aug 29, 2017 at 04:35:13PM +1000, Stephen Rothwell wrote:
> Hi Greg,
> 
> Today's linux-next merge of the tty tree got a conflict in:
> 
>   drivers/tty/serial/8250/8250_gsc.c
> 
> between commit:
> 
>   9e466f101e19 ("parisc/8250_gsc: Fix section mismatches")
> 
> from the parisc-hd tree and commit:
> 
>   0d474f7fad3b ("tty: 8250: constify parisc_device_id")
> 
> from the tty tree.
> 
> I fixed it up (the former is a superset of the latter) and can carry the
> fix as necessary. This is now fixed as far as linux-next is concerned,
> but any non trivial conflicts should be mentioned to your upstream
> maintainer when your tree is submitted for merging.  You may also want
> to consider cooperating with the maintainer of the conflicting tree to
> minimise any particularly complex conflicts.

Thanks for letting me know about this, and the other conflict here.

greg k-h


Re: [PATCH] i2c: aspeed: Retain delay/setup/hold values when configuring bus frequency

2017-08-28 Thread Andrew Jeffery
On Mon, 2017-08-28 at 18:07 +0200, Wolfram Sang wrote:
> On Tue, Aug 15, 2017 at 04:51:02PM +0930, Andrew Jeffery wrote:
> > In addition to the base, low and high clock configuration, the AC timing
> > register #1 on the AST2400 houses fields controlling:
> > 
> > 1. tBUF: Minimum delay between Stop and Start conditions
> > 2. tHDSTA: Hold time for the Start condition
> > 3. tACST: Setup time for Start and Stop conditions, and hold time for the
> >    Repeated Start condition
> > 
> > These values are defined in hardware on the AST2500 and therefore don't
> > need to be set.
> > 
> > aspeed_i2c_init_clk() was performing a direct write of the generated
> > clock values rather than a read/mask/modify/update sequence to retain
> > tBUF, tHDSTA and tACST, and therefore cleared the tBUF, tHDSTA and tACST
> > fields on the AST2400. This resulted in a delay/setup/hold time of 1
> > base clock, which in some configurations is not enough for some devices
> > (e.g. the MAX31785 fan controller, with an APB of 48MHz and a desired
> > bus speed of 100kHz).
> > 
> > Signed-off-by: Andrew Jeffery 
> 
> Applied to for-next, thanks! 

Thanks!

> I even considered for-current but it does
> not apply there. So, I leave the backporting for the interested parties
> :)
> 

It depends on Brendan's clock divisor calculation fix, which appears to
be in for-next but not for-current:

87b59ff8d1d9 i2c: aspeed: add proper support fo 24xx clock params

I'd argue that Brendan's patch should go in for-current as well,
because it fixes a divisor rounding error for the ast2500 (bus is
clocked faster than requested).
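
The read/mask/modify/update sequence described in the quoted commit
message can be sketched as follows. This is only an illustration: the
mask value and the helper name are made up for the example and do not
reflect the AST2400 register layout or the driver's actual code.

```c
#include <stdint.h>

/*
 * Hypothetical layout: the bits owned by the clock calculation.
 * All other bits stand in for tBUF, tHDSTA and tACST and must
 * survive the update.
 */
#define CLK_FIELDS_MASK 0x000fffffu

/* Return the new register value: clock fields replaced, other bits kept. */
static uint32_t update_clk_fields(uint32_t old_reg, uint32_t clk_val)
{
	uint32_t reg = old_reg & ~CLK_FIELDS_MASK;	/* drop old clock bits */

	reg |= clk_val & CLK_FIELDS_MASK;		/* merge in the new ones */
	return reg;
}
```

A direct write corresponds to returning clk_val unchanged, which zeroes
every bit outside the mask; that is the behaviour the patch removes.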

Cheers,

Andrew



Re: [PATCH V3 2/8] drivers: boot_constraint: Add boot_constraints_disable kernel parameter

2017-08-28 Thread Jani Nikula
On Tue, 29 Aug 2017, Greg Kroah-Hartman  wrote:
> On Tue, Aug 01, 2017 at 02:53:43PM +0530, Viresh Kumar wrote:
>> Users must be given an option to discard any constraints set by
>> bootloaders. For example, consider that a constraint is set for the LCD
>> controller's supply and the LCD driver isn't loaded by the kernel. If
>> the user doesn't need to use the LCD device, then he shouldn't be forced
>> to honour the constraint.
>> 
>> We can also think about finer control of such constraints with help of
>> some sysfs files, but a kernel parameter is fine to begin with.
>> 
>> Tested-by: Rajendra Nayak 
>> Signed-off-by: Viresh Kumar 
>> ---
>>  Documentation/admin-guide/kernel-parameters.txt |  3 +++
>>  drivers/base/boot_constraints/core.c| 17 +
>>  2 files changed, 20 insertions(+)
>> 
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
>> b/Documentation/admin-guide/kernel-parameters.txt
>> index d9c171ce4190..0706d1b6004d 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -426,6 +426,9 @@
>>  embedded devices based on command line input.
>>  See Documentation/block/cmdline-partition.txt
>>  
>> +boot_constraints_disable
>> +Do not set any boot constraints for devices.
>
> Shouldn't that be the default?  As really, that is what the situation is
> today, why force everyone to always enable the disable value?  And
> enabling a value to disable something is usually a sign of bad naming...
>
>> +
>>  boot_delay= Milliseconds to delay each printk during boot.
>>  Values larger than 10 seconds (10000) are changed to
>>  no delay (0).
>> diff --git a/drivers/base/boot_constraints/core.c 
>> b/drivers/base/boot_constraints/core.c
>> index 366a05d6d9ba..e0c33b2b216f 100644
>> --- a/drivers/base/boot_constraints/core.c
>> +++ b/drivers/base/boot_constraints/core.c
>> @@ -24,6 +24,17 @@
>>  static LIST_HEAD(constraint_devices);
>>  static DEFINE_MUTEX(constraint_devices_mutex);
>>  
>> +static bool boot_constraints_disabled;
>
> Again, this should only be an "enable" type of option, that kicks in if
> you are using this type of bootloader/kernel interaction.  Don't force
> someone to disable it.

I might add that "disable" type options lead to annoying double
negatives. Regardless of the default, I'd generally prefer "enable" type
options that you enable/disable as needed.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center


Re: [PATCH 4/4] lockdep: Fix workqueue crossrelease annotation

2017-08-28 Thread Byungchul Park
On Wed, Aug 23, 2017 at 01:58:47PM +0200, Peter Zijlstra wrote:
> The new completion/crossrelease annotations interact unfavourably with
> the extant flush_work()/flush_workqueue() annotations.
> 
> The problem is that when a single work class does:
> 
>   wait_for_completion(&C)
> 
> and
> 
>   complete(&C)
> 
> in different executions, we'll build dependencies like:
> 
>   lock_map_acquire(W)
>   complete_acquire(C)
> 
> and
> 
>   lock_map_acquire(W)
>   complete_release(C)
> 
> which results in the dependency chain: W->C->W, which lockdep thinks
> spells deadlock, even though there is no deadlock potential since
> works are run concurrently.
> 
> One possibility would be to change the work 'lock' to recursive-read,
> but that would mean hitting a lockdep limitation on recursive locks.
> Also, unconditionally switching to recursive-read here would fail to
> detect the actual deadlock on single-threaded workqueues, which do
> have a problem with this.
> 
> For now, forcefully disregard these locks for crossrelease.
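
Why the W->C->W chain reads as a deadlock can be seen with a toy
dependency graph. The sketch below is not lockdep; it only records
"held -> acquired" edges between two stand-in classes W and C (names
borrowed from the quoted message) and checks whether following the
edges leads back to the starting class. All identifiers are
hypothetical.

```c
#include <stdbool.h>

enum { W, C, NR_CLASSES };			/* toy lock classes */

static bool dep[NR_CLASSES][NR_CLASSES];	/* dep[a][b]: b taken while a held */

static void add_dep(int held, int acquired)
{
	dep[held][acquired] = true;
}

/* Depth-first search: is 'target' reachable from 'from' along recorded edges? */
static bool reaches(int from, int target, bool *seen)
{
	for (int next = 0; next < NR_CLASSES; next++) {
		if (!dep[from][next])
			continue;
		if (next == target)
			return true;
		if (!seen[next]) {
			seen[next] = true;
			if (reaches(next, target, seen))
				return true;
		}
	}
	return false;
}

/* A class that can reach itself sits on a cycle, which a checker reports. */
static bool has_cycle(int cls)
{
	bool seen[NR_CLASSES] = { false };

	return reaches(cls, cls, seen);
}
```

Recording W -> C alone leaves the graph acyclic; adding C -> W closes
the loop. A checker working purely on this graph cannot tell that the
two edges came from executions that run concurrently.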

Eventually, you pushed this patch to tip tree without any comment.

I don't really understand you.

How does a maintainer choose a very work-around method and avoid
problems rather than fix a root cause? I am very disappointed.

But, I have nothing to do against your will.



Re: [PATCH v2 5/8] KVM: arm/arm64: vgic: Handle mapped level sensitive SPIs

2017-08-28 Thread Christoffer Dall
On Tue, Aug 22, 2017 at 04:33:42PM +0200, Auger Eric wrote:
> Hi Christoffer,
> 
> On 21/07/2017 14:11, Christoffer Dall wrote:
> > On Thu, Jun 15, 2017 at 02:52:37PM +0200, Eric Auger wrote:
> >> Currently, the line level of unmapped level sensitive SPIs is
> >> toggled down by the maintenance IRQ handler/resamplefd mechanism.
> >>
> >> As mapped SPI completion is not trapped, we cannot rely on this
> >> mechanism and the line level needs to be observed at distributor
> >> level instead.
> >>
> >> This patch handles the physical IRQ case in vgic_validate_injection
> >> and get the line level of a mapped SPI at distributor level.
> >>
> >> Signed-off-by: Eric Auger 
> >>
> >> ---
> >>
> >> v1 -> v2:
> >> - renamed is_unshared_mapped into is_mapped_spi
> >> - changes to kvm_vgic_map_phys_irq moved in the previous patch
> >> - make vgic_validate_injection more readable
> >> - reword the commit message
> >> ---
> >>  virt/kvm/arm/vgic/vgic.c | 16 ++--
> >>  virt/kvm/arm/vgic/vgic.h |  7 ++-
> >>  2 files changed, 20 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> >> index 075f073..2e35ac7 100644
> >> --- a/virt/kvm/arm/vgic/vgic.c
> >> +++ b/virt/kvm/arm/vgic/vgic.c
> >> @@ -139,6 +139,17 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
> >>  	kfree(irq);
> >>  }
> >>  
> >> +bool irq_line_level(struct vgic_irq *irq)
> >> +{
> >> +	bool line_level = irq->line_level;
> >> +
> >> +	if (unlikely(is_mapped_spi(irq)))
> >> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
> >> +					      IRQCHIP_STATE_PENDING,
> >> +					      &line_level));
> >> +	return line_level;
> >> +}
> >> +
> >>  /**
> >>   * kvm_vgic_target_oracle - compute the target vcpu for an irq
> >>   *
> >> @@ -236,13 +247,14 @@ static void vgic_sort_ap_list(struct kvm_vcpu *vcpu)
> >>  
> >>  /*
> >>   * Only valid injection if changing level for level-triggered IRQs or for a
> >> - * rising edge.
> >> + * rising edge. Injection of virtual interrupts associated to physical
> >> + * interrupts always is valid.
> > 
> > why?  I don't remember this now, and that means I probably won't in the
> > future either.
> 
> Sorry for the late reply.
> 
> The life cycle of the physically mapped IRQ is as follows:
> - pIRQ becomes pending
> - pIRQ is acknowledged by the physical handler and becomes active
> - vIRQ gets injected as part of the physical handler chain
>   (VFIO->irqfd kvm_vgic_inject_irq for instance). Linux irq cannot
>   hit until vIRQ=pIRQ deactivation
> - guest deactivates the vIRQ which automatically deactivates the pIRQ
> 
> 
> So to me if we are about to vgic_validate_injection() an injection of a
> physically mapped vIRQ this means the vgic state is ready to accept it:
> previous occurrence was deactivated. There cannot be any state
> inconsistency around the line_level/level.
> 
> Do you agree?
> 
> I will add this description at least in the commit message.

I think the important point is, that even though we don't change the
level, we still add it to the AP list if not already there, and
therefore we still do this.

> > 
> > When I look at this now, I'm thinking, if we're not going to change
> > anything, why proceed beyond validate injection?
> 
> don't catch this one. validation always succeeds and then we further
> handle the IRQ.

The problem is that the code suggests that we will not change something,
but in fact, later on, in the caller, we do queue this IRQ if not on the
AP list, even though there was no state change in the struct IRQ.

But Marc and I sketched out another proposal which could benefit the
timer as well.  Let me try to verify if it works and send it to you and
see if that is an improvement over this one.
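
The rule stated in the patch comment can be written down as a small
sketch. The types and names here are stand-ins invented for the
example, not the vgic structures, and the body of the real
vgic_validate_injection() is not shown in this thread.

```c
#include <stdbool.h>

enum toy_trigger { TOY_EDGE, TOY_LEVEL };

struct toy_irq {
	enum toy_trigger config;
	bool line_level;	/* last level recorded for the line */
	bool mapped;		/* backed by a physical interrupt */
};

static bool toy_validate_injection(const struct toy_irq *irq, bool level)
{
	if (irq->mapped)
		return true;			/* always accepted */

	if (irq->config == TOY_LEVEL)
		return irq->line_level != level;	/* only a change counts */

	return level;				/* edge: only a rising edge counts */
}
```

The mapped case returns true even when the level is unchanged, which
is the situation discussed above: validation passes, nothing in the
IRQ state changes, and the caller may still queue the interrupt.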

Thanks,
-Christoffer


Re: [PATCH v2 1/2] driver core: detach device's pm_domain after devres_release_all

2017-08-28 Thread Greg Kroah-Hartman
On Tue, Aug 15, 2017 at 04:36:56PM +0800, Shawn Lin wrote:
> Move dev_pm_domain_detach after devres_release_all to avoid
> accessing device's registers with genpd powered off.

So, what is this going to break that is working already today?  :)

> 
> Signed-off-by: Shawn Lin 
> ---
> 
> Changes in v2: None
> 
>  drivers/base/dd.c   | 35 ++-
>  drivers/base/platform.c | 18 ++
>  2 files changed, 32 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index ad44b40..13dc0ad 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -25,7 +25,9 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> +#include 
>  #include 
>  
>  #include "base.h"
> @@ -356,6 +358,8 @@ static int really_probe(struct device *dev, struct 
> device_driver *drv)
>   int local_trigger_count = atomic_read(&deferred_trigger_count);
>   bool test_remove = IS_ENABLED(CONFIG_DEBUG_TEST_DRIVER_REMOVE) &&
>  !drv->suppress_bind_attrs;
> + struct platform_driver *pdrv;
> + bool do_pm_domain = false;
>  
>   if (defer_all_probes) {
>   /*
> @@ -414,6 +418,16 @@ static int really_probe(struct device *dev, struct 
> device_driver *drv)
>   if (ret)
>   goto probe_failed;
>   } else if (drv->probe) {
> + ret = dev_pm_domain_attach(dev, true);
> + pdrv = to_platform_driver(dev->driver);
> + /* don't fail if just dev_pm_domain_attach failed */
> + if (pdrv->prevent_deferred_probe &&
> + ret == -EPROBE_DEFER) {
> + dev_warn(dev, "probe deferral not supported\n");
> + ret = -ENXIO;
> + goto probe_failed;
> + }
> + do_pm_domain = true;
>   ret = drv->probe(dev);
>   if (ret)
>   goto probe_failed;
> @@ -421,13 +435,17 @@ static int really_probe(struct device *dev, struct 
> device_driver *drv)
>  
>   if (test_remove) {
>   test_remove = false;
> + do_pm_domain = false;
>  
> - if (dev->bus->remove)
> + if (dev->bus->remove) {
>   dev->bus->remove(dev);
> - else if (drv->remove)
> + } else if (drv->remove) {
>   drv->remove(dev);
> -
> + do_pm_domain = true;

Why is this set to true if you have a driver remove function, but not if
you only have a bus remove function?  Why the difference?


> + }
>   devres_release_all(dev);
> + if (do_pm_domain)
> + dev_pm_domain_detach(dev, true);
>   driver_sysfs_remove(dev);
>   dev->driver = NULL;
>   dev_set_drvdata(dev, NULL);
> @@ -458,6 +476,8 @@ static int really_probe(struct device *dev, struct 
> device_driver *drv)
>  pinctrl_bind_failed:
>   device_links_no_driver(dev);
>   devres_release_all(dev);
> + if (do_pm_domain)
> + dev_pm_domain_detach(dev, true);

Can't you just always call this on the error path?

>   driver_sysfs_remove(dev);
>   dev->driver = NULL;
>   dev_set_drvdata(dev, NULL);
> @@ -818,6 +838,7 @@ int driver_attach(struct device_driver *drv)
>  static void __device_release_driver(struct device *dev, struct device 
> *parent)
>  {
>   struct device_driver *drv;
> + bool do_pm_domain = false;
>  
>   drv = dev->driver;
>   if (drv) {
> @@ -855,15 +876,19 @@ static void __device_release_driver(struct device *dev, 
> struct device *parent)
>  
>   pm_runtime_put_sync(dev);
>  
> - if (dev->bus && dev->bus->remove)
> + if (dev->bus && dev->bus->remove) {
>   dev->bus->remove(dev);
> - else if (drv->remove)
> + } else if (drv->remove) {
> + do_pm_domain = true;

Same question here about drivers and bus default functions.

thanks,

greg k-h


Re: [RFC 1/3] sched/fair: add util_est on top of PELT

2017-08-28 Thread Pavan Kondeti
On Fri, Aug 25, 2017 at 3:50 PM, Patrick Bellasi
 wrote:
> The util_avg signal computed by PELT is too variable for some use-cases.
> For example, a big task waking up after a long sleep period will have its
> utilization almost completely decayed. This introduces some latency before
> schedutil will be able to pick the best frequency to run a task.
>



> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index c28b182c9833..8d7bc55f68d5 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -26,6 +26,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  /* task_struct member predeclarations (sorted alphabetically): */
>  struct audit_context;
> @@ -277,6 +278,16 @@ struct load_weight {
> u32 inv_weight;
>  };
>
> +/**
> + * Utilization's Exponential Weighted Moving Average (EWMA)
> + *
> + * Support functions to track an EWMA for the utilization of SEs and RQs. New
> + * samples will be added to the moving average each time a task completes an
> + * activation. Thus the weight is chosen so that the EWMA will be relatively
> + * insensitive to transient changes to the task's workload.
> + */
> +DECLARE_EWMA(util, 0, 4);
> +
>  /*

Should the factor be 1 instead of 0? i.e 25% contribution from the
recent sample.
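
For reference, a minimal fixed-point EWMA step, assuming the second
argument of DECLARE_EWMA is the number of fractional bits and the
third is the reciprocal of the weight (so 4 means each new sample
contributes 1/4). The toy_* names are hypothetical and this is a
sketch of the arithmetic, not the kernel macro.

```c
#include <stdint.h>

#define TOY_PRECISION	0	/* fractional bits kept in the average */
#define TOY_WEIGHT_RCP	4	/* a new sample contributes 1/4 */

/* One EWMA step: avg = avg * 3/4 + sample * 1/4, in fixed point. */
static uint32_t toy_ewma_add(uint32_t avg, uint32_t sample)
{
	uint32_t scaled = sample << TOY_PRECISION;

	if (avg == 0)		/* the first sample seeds the average */
		return scaled;
	return (avg * (TOY_WEIGHT_RCP - 1) + scaled) / TOY_WEIGHT_RCP;
}

/* Drop the fractional bits to read the average back. */
static uint32_t toy_ewma_read(uint32_t avg)
{
	return avg >> TOY_PRECISION;
}
```

Under that reading the 25% contribution comes from the reciprocal
weight of 4, while the other argument only controls how many
fractional bits survive the integer division.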

Thanks,
Pavan


-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project


Re: [PATCH V3 1/8] drivers: Add boot constraints core

2017-08-28 Thread Greg Kroah-Hartman
On Tue, Aug 01, 2017 at 02:53:42PM +0530, Viresh Kumar wrote:
> Some devices are powered ON by the bootloader before the bootloader
> hands over control to Linux. It may be important for those devices to keep
> working until the time a Linux device driver probes the device and
> reconfigures its resources.
> 
> A typical example of that can be the LCD controller, which is used by
> the bootloaders to show image(s) while the platform is booting into
> Linux. The LCD controller can be using some resources, like clk,
> regulators, PM domain, etc, that are shared between several devices.
> These shared resources should be configured to satisfy need of all the
> users. If another device's (X) driver gets probed before the LCD
> controller driver in this case, then it may end up reconfiguring these
> resources to ranges satisfying the current users (only device X) and
> that can make the LCD screen unstable.
> 
> This patch introduces the concept of boot-constraints, which will be set
> by the bootloaders and the kernel will satisfy them until the time
> driver for such a device is probed (successfully or unsuccessfully).
> 
> The list of boot constraint types is empty for now, and will be
> incrementally updated by later patches.
> 
> Only two routines are exposed by the boot constraints core for now:
> 
> - dev_boot_constraint_add(): This shall be called by parts of the kernel
>   (before the device is probed) to set the constraints.
> 
> - dev_boot_constraints_remove(): This is called only by the driver core
>   after a device is probed successfully or unsuccessfully. Special
>   handling is done here for deferred probing.

How is this information getting to the kernel from the bootloader?  I
didn't see where that happened, just a single example driver that
somehow "knew" what had to happen, which seems odd...

This is a lot of new code for no users, I would like to see at least 3
real drivers that are using it before we merge it, as then you have a
chance of getting the user/kernel api correct.

thanks,

greg k-h


linux-next: manual merge of the tty tree with the parisc-hd tree

2017-08-28 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the tty tree got a conflict in:

  drivers/tty/serial/mux.c

between commit:

  fc72b7a3a0d8 ("parisc/mux: Fix section mismatches")

from the parisc-hd tree and commit:

  829374f544b3 ("tty: mux: constify parisc_device_id")

from the tty tree.

I fixed it up (the former is a superset of the latter) and can carry the
fix as necessary. This is now fixed as far as linux-next is concerned,
but any non trivial conflicts should be mentioned to your upstream
maintainer when your tree is submitted for merging.  You may also want
to consider cooperating with the maintainer of the conflicting tree to
minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell


Re: [PATCH V3 2/8] drivers: boot_constraint: Add boot_constraints_disable kernel parameter

2017-08-28 Thread Greg Kroah-Hartman
On Tue, Aug 01, 2017 at 02:53:43PM +0530, Viresh Kumar wrote:
> Users must be given an option to discard any constraints set by
> bootloaders. For example, consider that a constraint is set for the LCD
> controller's supply and the LCD driver isn't loaded by the kernel. If
> the user doesn't need to use the LCD device, then he shouldn't be forced
> to honour the constraint.
> 
> We can also think about finer control of such constraints with help of
> some sysfs files, but a kernel parameter is fine to begin with.
> 
> Tested-by: Rajendra Nayak 
> Signed-off-by: Viresh Kumar 
> ---
>  Documentation/admin-guide/kernel-parameters.txt |  3 +++
>  drivers/base/boot_constraints/core.c| 17 +
>  2 files changed, 20 insertions(+)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
> b/Documentation/admin-guide/kernel-parameters.txt
> index d9c171ce4190..0706d1b6004d 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -426,6 +426,9 @@
>   embedded devices based on command line input.
>   See Documentation/block/cmdline-partition.txt
>  
> + boot_constraints_disable
> + Do not set any boot constraints for devices.

Shouldn't that be the default?  As really, that is what the situation is
today, why force everyone to always enable the disable value?  And
enabling a value to disable something is usually a sign of bad naming...

> +
>   boot_delay= Milliseconds to delay each printk during boot.
>   Values larger than 10 seconds (10000) are changed to
>   no delay (0).
> diff --git a/drivers/base/boot_constraints/core.c 
> b/drivers/base/boot_constraints/core.c
> index 366a05d6d9ba..e0c33b2b216f 100644
> --- a/drivers/base/boot_constraints/core.c
> +++ b/drivers/base/boot_constraints/core.c
> @@ -24,6 +24,17 @@
>  static LIST_HEAD(constraint_devices);
>  static DEFINE_MUTEX(constraint_devices_mutex);
>  
> +static bool boot_constraints_disabled;

Again, this should only be an "enable" type of option, that kicks in if
you are using this type of bootloader/kernel interaction.  Don't force
someone to disable it.

thanks,

greg k-h


Re: [PATCH V3 6/8] drivers: boot_constraint: Add debugfs support

2017-08-28 Thread Greg Kroah-Hartman
On Tue, Aug 01, 2017 at 02:53:47PM +0530, Viresh Kumar wrote:
> This patch adds debugfs support for boot constraints. This is how it
> looks for a "vmmc-supply" constraint for the MMC device.
> 
> $ ls -R /sys/kernel/debug/boot_constraints/
> /sys/kernel/debug/boot_constraints/:
> f723d000.dwmmc0
> 
> /sys/kernel/debug/boot_constraints/f723d000.dwmmc0:
> clk-ciu  pm-domain  supply-vmmc  supply-vmmcaux
> 
> /sys/kernel/debug/boot_constraints/f723d000.dwmmc0/clk-ciu:
> 
> /sys/kernel/debug/boot_constraints/f723d000.dwmmc0/pm-domain:
> 
> /sys/kernel/debug/boot_constraints/f723d000.dwmmc0/supply-vmmc:
> u_volt_max  u_volt_min
> 
> /sys/kernel/debug/boot_constraints/f723d000.dwmmc0/supply-vmmcaux:
> u_volt_max  u_volt_min
> 
> Tested-by: Rajendra Nayak 
> Signed-off-by: Viresh Kumar 

Minor debugfs api interaction nits below:


> ---
>  drivers/base/boot_constraints/clk.c|  4 ++
>  drivers/base/boot_constraints/core.c   | 72 ++
>  drivers/base/boot_constraints/core.h   |  6 +++
>  drivers/base/boot_constraints/pm.c | 12 +-
>  drivers/base/boot_constraints/supply.c | 10 +
>  5 files changed, 102 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/base/boot_constraints/clk.c 
> b/drivers/base/boot_constraints/clk.c
> index b5b1d63c3e76..bdbfcbc2944d 100644
> --- a/drivers/base/boot_constraints/clk.c
> +++ b/drivers/base/boot_constraints/clk.c
> @@ -49,6 +49,9 @@ int constraint_clk_add(struct constraint *constraint, void 
> *data)
>   cclk->clk_info.name = kstrdup_const(clk_info->name, GFP_KERNEL);
>   constraint->private = cclk;
>  
> + /* Debugfs */

That's obvious no need for the comment...

> + constraint_add_debugfs(constraint, clk_info->name);
> +
>   return 0;
>  
>  put_clk:
> @@ -63,6 +66,7 @@ void constraint_clk_remove(struct constraint *constraint)
>  {
>   struct constraint_clk *cclk = constraint->private;
>  
> + constraint_remove_debugfs(constraint);
>   kfree_const(cclk->clk_info.name);
>   clk_disable_unprepare(cclk->clk);
>   clk_put(cclk->clk);
> diff --git a/drivers/base/boot_constraints/core.c 
> b/drivers/base/boot_constraints/core.c
> index 06267f0c88d4..c0e3a85ff85a 100644
> --- a/drivers/base/boot_constraints/core.c
> +++ b/drivers/base/boot_constraints/core.c
> @@ -35,6 +35,76 @@ static int __init constraints_disable(char *str)
>  }
>  early_param("boot_constraints_disable", constraints_disable);
>  
> +/* Debugfs */
> +
> +static struct dentry *rootdir;
> +
> +static void constraint_device_add_debugfs(struct constraint_dev *cdev)
> +{
> + struct device *dev = cdev->dev;
> +
> + cdev->dentry = debugfs_create_dir(dev_name(dev), rootdir);
> + if (!cdev->dentry)
> + dev_err(dev, "Failed to create constraint dev debugfs dir\n");

No, you never need to check the return value of a debugfs call.  You
shouldn't care what happens here, it's just debugfs, a user can't do
anything with this info, and neither should you change your actions
(which you aren't here, which is good, but not true later on...)

So just call the function, save the value, and move on, it's always
going to return a value that you can use in any future debugfs calls, no
need to care.

> +}
> +
> +static void constraint_device_remove_debugfs(struct constraint_dev *cdev)
> +{
> + debugfs_remove_recursive(cdev->dentry);
> +}
> +
> +void constraint_add_debugfs(struct constraint *constraint, const char 
> *suffix)
> +{
> + struct device *dev = constraint->cdev->dev;
> + const char *prefix;
> + char name[NAME_MAX];
> +
> + switch (constraint->type) {
> + case DEV_BOOT_CONSTRAINT_CLK:
> + prefix = "clk";
> + break;
> + case DEV_BOOT_CONSTRAINT_PM:
> + prefix = "pm";
> + break;
> + case DEV_BOOT_CONSTRAINT_SUPPLY:
> + prefix = "supply";
> + break;
> + default:
> + dev_err(dev, "%s: Constraint type (%d) not supported\n",
> + __func__, constraint->type);
> + return;
> + }
> +
> + snprintf(name, NAME_MAX, "%s-%s", prefix, suffix);
> +
> + constraint->dentry = debugfs_create_dir(name, constraint->cdev->dentry);
> + if (!constraint->dentry)
> + dev_err(dev, "Failed to create constraint (%s) debugfs dir\n",
> + name);

Again, you don't care, just call it and move on.

> +}
> +
> +void constraint_remove_debugfs(struct constraint *constraint)
> +{
> + debugfs_remove_recursive(constraint->dentry);
> +}
> +
> +static int __init constraint_debugfs_init(void)
> +{
> + if (boot_constraints_disabled)
> + return -ENODEV;
> +
> + /* Create /sys/kernel/debug/opp directory */
> + rootdir = debugfs_create_dir("boot_constraints", NULL);
> + if (!rootdir) {
> + pr_err("Failed to create root directory\n");
> + return -ENOMEM;

And again, you don't care, call it and move on, don't return an

linux-next: manual merge of the tty tree with the parisc-hd tree

2017-08-28 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the tty tree got a conflict in:

  drivers/tty/serial/8250/8250_gsc.c

between commit:

  9e466f101e19 ("parisc/8250_gsc: Fix section mismatches")

from the parisc-hd tree and commit:

  0d474f7fad3b ("tty: 8250: constify parisc_device_id")

from the tty tree.

I fixed it up (the former is a superset of the latter) and can carry the
fix as necessary. This is now fixed as far as linux-next is concerned,
but any non trivial conflicts should be mentioned to your upstream
maintainer when your tree is submitted for merging.  You may also want
to consider cooperating with the maintainer of the conflicting tree to
minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell


Re: [PATCH] misc: pci_endpoint_test: make boolean no_msi static

2017-08-28 Thread Kishon Vijay Abraham I
Hi Bjorn,

On Wednesday 23 August 2017 03:17 PM, Colin King wrote:
> From: Colin Ian King 
> 
> The boolean no_msi is local to the source and does not need to be in
> global scope, so make it static.
> 
> Cleans up sparse warning:
> symbol 'no_msi' was not declared. Should it be static?
> 
> Signed-off-by: Colin Ian King 

Can you pick this one also?
Acked-by: Kishon Vijay Abraham I 


Thanks
Kishon
> ---
>  drivers/misc/pci_endpoint_test.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/pci_endpoint_test.c 
> b/drivers/misc/pci_endpoint_test.c
> index 1f64d943794d..deb203026496 100644
> --- a/drivers/misc/pci_endpoint_test.c
> +++ b/drivers/misc/pci_endpoint_test.c
> @@ -73,7 +73,7 @@ static DEFINE_IDA(pci_endpoint_test_ida);
>  #define to_endpoint_test(priv) container_of((priv), struct 
> pci_endpoint_test, \
>   miscdev)
>  
> -bool no_msi;
> +static bool no_msi;
>  module_param(no_msi, bool, 0444);
>  MODULE_PARM_DESC(no_msi, "Disable MSI interrupt in pci_endpoint_test");
>  
> 


Re: [PATCH][pci-next] PCI: endpoint: fix incorrect end of table check in while loop

2017-08-28 Thread Kishon Vijay Abraham I


On Monday 28 August 2017 11:55 PM, Bjorn Helgaas wrote:
> On Wed, Aug 23, 2017 at 05:03:03PM +0100, Colin King wrote:
>> From: Colin Ian King 
>>
>> Currently, the while loop will iterate until a matching name is found
>> or until the id pointer wraps around to NULL (the latter is incorrect).
>>
>> The end of a pci_epf_device_id table is terminated with zero'd entries
>> for name and driver_data, so can change the while loop to iterate while
>> the first character in the name is a non-zero character.
>>
>> Detected by CoverityScan, CID#1454557 ("Logically dead code")
>>
>> Fixes: 9e9d6eb48623 ("PCI: endpoint: Add an API to get matching 
>> "pci_epf_device_id")
>> Signed-off-by: Colin Ian King 
> 
> Kishon, do you want to ack this, please?

Acked-by: Kishon Vijay Abraham I 
> 
>> ---
>>  drivers/pci/endpoint/pci-epf-core.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/endpoint/pci-epf-core.c 
>> b/drivers/pci/endpoint/pci-epf-core.c
>> index ae6fac5995e3..70eccc04ee7f 100644
>> --- a/drivers/pci/endpoint/pci-epf-core.c
>> +++ b/drivers/pci/endpoint/pci-epf-core.c
>> @@ -273,7 +273,7 @@ pci_epf_match_device(const struct pci_epf_device_id *id, 
>> struct pci_epf *epf)
>>  if (!id || !epf)
>>  return NULL;
>>  
>> -while (id) {
>> +while (*id->name) {
>>  if (strcmp(epf->name, id->name) == 0)
>>  return id;
>>  id++;
>> -- 
>> 2.14.1
>>
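
For reference, the sentinel-terminated walk described in the commit message can be sketched in plain C as below. This is only an illustration: struct epf_id, match_id() and example_ids are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device-id entry: a name plus driver data. */
struct epf_id {
	char name[20];
	unsigned long driver_data;
};

/*
 * Walk a table whose last entry has an empty name. Testing the pointer
 * itself (while (id)) would not stop at the sentinel, because id++ on a
 * valid pointer does not become NULL at the end of the table.
 */
static const struct epf_id *match_id(const struct epf_id *id, const char *name)
{
	assert(name != NULL);
	if (!id)
		return NULL;

	while (*id->name) {
		if (strcmp(name, id->name) == 0)
			return id;
		id++;
	}
	return NULL;
}

/* Example table; the final entry with an empty name is the terminator. */
static const struct epf_id example_ids[] = {
	{ .name = "pci_epf_test" },
	{ .name = "pci_epf_other", .driver_data = 1 },
	{ .name = "" },
};
```

A lookup for a name that is not in the table stops at the terminator and returns NULL instead of running past the end of the array.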


Re: [PATCH v3 0/3] Define CPU_BIG_ENDIAN or warn for inconsistencies

2017-08-28 Thread Greg KH
On Thu, Jul 06, 2017 at 09:34:18AM -0700, Babu Moger wrote:
> Resending the series per Greg KH's request.
> 
> Found this problem while enabling queued rwlock on SPARC.
> The parameter CONFIG_CPU_BIG_ENDIAN is used to clear the
> specific byte in qrwlock structure. Without this parameter,
> we clear the wrong byte.
> Here is the code in include/asm-generic/qrwlock.h
> 
> static inline u8 *__qrwlock_write_byte(struct qrwlock *lock)
>   {
>  return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN);
>   }
> 
> Also found few more references of this parameter in
> drivers/of/base.c
> drivers/of/fdt.c
> drivers/tty/serial/earlycon.c
> drivers/tty/serial/serial_core.c
> 
> Here is our previous discussion.
> https://lkml.org/lkml/2017/5/24/620
> 
> Based on the discussion, it was decided to add CONFIG_CPU_BIG_ENDIAN
> for all the fixed big endian architecture(frv, h8300, m68k, openrisc,
> parisc and sparc). And warn if there are inconsistencies in this definition.

Did this series ever get picked up by anyone?  I don't know whose tree
it should go through if not, anyone have any ideas?  I guess I could,
but arch-specific stuff is odd...

thanks,

greg k-h
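
To make the "wrong byte" problem from the cover letter concrete, here is a small user-space sketch. It assumes the byte of interest is the least-significant byte of a 32-bit word; host_is_big_endian() and low_byte() are names invented for this illustration and are not kernel helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big-endian host, 0 on a little-endian one (runtime probe). */
static int host_is_big_endian(void)
{
	const uint32_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 0;
}

/*
 * Same arithmetic as the quoted helper: the least-significant byte of a
 * 32-bit word is at offset 0 on little endian and offset 3 on big endian.
 */
static uint8_t *low_byte(uint32_t *word, int big_endian)
{
	assert(big_endian == 0 || big_endian == 1);
	return (uint8_t *)word + 3 * big_endian;
}
```

If the endianness flag passed in does not match the host, the store through the returned pointer lands on the most-significant byte instead, which is the failure mode the cover letter describes when CONFIG_CPU_BIG_ENDIAN is not defined on a big-endian architecture.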


Re: [PATCH net-next v2 00/10] net: dsa: add generic debugfs interface

2017-08-28 Thread Jiri Pirko
Tue, Aug 29, 2017 at 06:38:37AM CEST, da...@davemloft.net wrote:
>From: Vivien Didelot 
>Date: Mon, 28 Aug 2017 15:17:38 -0400
>
>> This patch series adds a generic debugfs interface for the DSA
>> framework, so that all switch devices benefit from it, e.g. Marvell,
>> Broadcom, Microchip or any other DSA driver.
>
>I've been thinking this over and I agree with the feedback given that
>debugfs really isn't appropriate for this.
>
>Please create a DSA device class, and hang these values under
>appropriate sysfs device nodes that can be easily found via
>/sys/class/dsa/ just as easily as they would be /sys/kernel/debug/dsa/
>
>You really intend these values to be consistent across DSA devices,
>and you don't intend to go willy-nilly changing these exported values
>arbitrarily over time.  That's what debugfs is for, throw-away
>stuff.
>
>So please make these proper device sysfs attributes rather than
>debugfs.

As I wrote, I believe that there is a big overlap with devlink and its
dpipe subset. I think we should primarily focus on extending whatever
is needed for dsa there. The iface should be generic for all drivers,
not only dsa. dsa-specific sysfs attributes should be a last-resort
solution; I believe we can avoid them.


[PATCH v8 01/10] powerpc/vas: Define macros, register fields and structures

2017-08-28 Thread Sukadev Bhattiprolu
Define macros for the VAS hardware registers and bit-fields as well
as a couple of data structures needed by the VAS driver.

Signed-off-by: Sukadev Bhattiprolu 
---
Changelog[v8]
- Use u64/u32 instead of the uintXX versions.

Changelog[v7]
- Move the threshold control macros from uapi/asm/vas.h to
  asm/vas.h for now. When we actually have a user space need for
  them, we can move them to uapi/asm/vas.h. With this change,
  uapi/asm/vas.h is empty and can be dropped from this patch.

Changelog[v6]
- Add some fields for FTW windows

Changelog[v4]
- [Michael Neuling] Move VAS code to arch/powerpc; Reorg vas.h and
  vas-internal.h to kernel and uapi versions; rather than creating
  separate properties for window context/address entries in device
  tree, combine them into "reg" properties; drop ->hwirq and irq_port
  fields from vas_window as they are only needed with user space
  windows.
- Drop the error check for CONFIG_PPC_4K_PAGES. Instead in a
  follow-on patch add a "depends on CONFIG_PPC_64K_PAGES".

Changelog[v3]
- Rename winctx->pid to winctx->pidr to reflect that it's a value
  from the PID register (SPRN_PID), not the linux process id.
- Make it easier to split header into kernel/user parts
- To keep user interface simple, use macros rather than enum for
  the threshold-control modes.
- Add a pid field to struct vas_window - needed for user space
  send windows.

Changelog[v2]
- Add an overview of VAS in vas-internal.h
- Get window context parameters from device tree and drop
  unnecessary macros.
---
 arch/powerpc/include/asm/vas.h   |  45 +
 arch/powerpc/platforms/powernv/vas.h | 382 +++
 2 files changed, 427 insertions(+)
 create mode 100644 arch/powerpc/include/asm/vas.h
 create mode 100644 arch/powerpc/platforms/powernv/vas.h

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
new file mode 100644
index 000..ff87e44
--- /dev/null
+++ b/arch/powerpc/include/asm/vas.h
@@ -0,0 +1,45 @@
+/*
+ * Copyright 2016-17 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _MISC_VAS_H
+#define _MISC_VAS_H
+
+/*
+ * Min and max FIFO sizes are based on Version 1.05 Section 3.1.4.25
+ * (Local FIFO Size Register) of the VAS workbook.
+ */
+#define VAS_RX_FIFO_SIZE_MIN   (1 << 10)   /* 1KB */
+#define VAS_RX_FIFO_SIZE_MAX   (8 << 20)   /* 8MB */
+
+/*
+ * Threshold Control Mode: Have paste operation fail if the number of
+ * requests in receive FIFO exceeds a threshold.
+ *
+ * NOTE: No special error code yet if paste is rejected because of these
+ *  limits. So users can't distinguish between this and other errors.
+ */
+#define VAS_THRESH_DISABLED0
+#define VAS_THRESH_FIFO_GT_HALF_FULL   1
+#define VAS_THRESH_FIFO_GT_QTR_FULL2
+#define VAS_THRESH_FIFO_GT_EIGHTH_FULL 3
+
+/*
+ * Co-processor Engine type.
+ */
+enum vas_cop_type {
+   VAS_COP_TYPE_FAULT,
+   VAS_COP_TYPE_842,
+   VAS_COP_TYPE_842_HIPRI,
+   VAS_COP_TYPE_GZIP,
+   VAS_COP_TYPE_GZIP_HIPRI,
+   VAS_COP_TYPE_FTW,
+   VAS_COP_TYPE_MAX,
+};
+
+#endif /* _MISC_VAS_H */
diff --git a/arch/powerpc/platforms/powernv/vas.h 
b/arch/powerpc/platforms/powernv/vas.h
new file mode 100644
index 000..abb545f
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -0,0 +1,382 @@
+/*
+ * Copyright 2016-17 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _VAS_H
+#define _VAS_H
+#include 
+#include 
+#include 
+
+/*
+ * Overview of Virtual Accelerator Switchboard (VAS).
+ *
+ * VAS is a hardware "switchboard" that allows senders and receivers to
+ * exchange messages with _minimal_ kernel involvement. The receivers are
+ * typically NX coprocessor engines that perform compression or encryption
+ * in hardware, but receivers can also be other software threads.
+ *
+ * Senders are user/kernel threads that submit compression/encryption or
+ * other requests to the receivers. Senders must format their messages as
+ * Coprocessor Request Blocks (CRB)s and submit them using the "copy" and
+ * "paste" instructions which were introduced in Power9.
+ *
+ * A Power node can have (up to?) 8 Power chips. There is one instance of
+ * VAS in each Power9 chip. Each instance of VAS has 64K windows or ports,
+ * Senders and receivers must each connect to a separate window before they
+ * can exchange mess

[PATCH v8 03/10] powerpc/vas: Define vas_init() and vas_exit()

2017-08-28 Thread Sukadev Bhattiprolu
Implement vas_init() and vas_exit() functions for a new VAS module.
This VAS module is essentially a library for other device drivers
and kernel users of the NX coprocessors like NX-842 and NX-GZIP.
In the future this will be extended to add support for user space
to access the NX coprocessors.

VAS is currently only supported with 64K page size.

Signed-off-by: Sukadev Bhattiprolu 
---
Changelog[v8]:
- [Michael Ellerman] VAS should be built-in and not a module;
drop the vas_exit() and free_vinst() code since it's not
a module and might need some refcounting. Drop init_done,
->ready and unnecessary "len" fields in vinst; Misc cleanup.

Changelog[v5]:
- [Ben Herrenschmidt]: Create and use platform device tree nodes,
  fix up the "reg" properties for the VAS DT node and use the
  platform device helpers to parse the reg properties; Use linked
  list of VAS instances (don't assume vasids are sequential);
  Use CONFIG_PPC_VAS instead of CONFIG_VAS.

Changelog[v4]:
- [Michael Neuling] Fix some accidental deletions; fix help text
  in Kconfig; change vas_initialized to a function; move from
  drivers/misc to arch/powerpc/kernel
- Drop the vas_window_reset() interface. It is not needed as
  window will be initialized before each use.
- Add a "depends on PPC_64K_PAGES"

Changelog[v3]:
- Zero vas_instances memory on allocation
- [Haren Myneni] Fix description in Kconfig
Changelog[v2]:
- Get HVWC, UWC and window address parameters from device tree.
---
 .../devicetree/bindings/powerpc/ibm,vas.txt|  23 
 MAINTAINERS|   8 ++
 arch/powerpc/platforms/powernv/Kconfig |  14 ++
 arch/powerpc/platforms/powernv/Makefile|   1 +
 arch/powerpc/platforms/powernv/vas-window.c|  19 +++
 arch/powerpc/platforms/powernv/vas.c   | 151 +
 arch/powerpc/platforms/powernv/vas.h   |   2 +
 7 files changed, 218 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/powerpc/ibm,vas.txt
 create mode 100644 arch/powerpc/platforms/powernv/vas-window.c
 create mode 100644 arch/powerpc/platforms/powernv/vas.c

diff --git a/Documentation/devicetree/bindings/powerpc/ibm,vas.txt 
b/Documentation/devicetree/bindings/powerpc/ibm,vas.txt
new file mode 100644
index 000..a096d73
--- /dev/null
+++ b/Documentation/devicetree/bindings/powerpc/ibm,vas.txt
@@ -0,0 +1,23 @@
+* IBM Powerpc Virtual Accelerator Switchboard (VAS)
+
+VAS is a hardware mechanism that allows kernel subsystems and user processes
+to directly submit compression and other requests to Nest accelerators (NX)
+or other coprocessors functions.
+
+Required properties:
+- compatible : should be "ibm,vas".
+- ibm,vas-id : A unique identifier for each instance of VAS in the system
+- reg : Should contain 4 pairs of 64-bit fields specifying the Hypervisor
+  window context start and length, OS/User window context start and length,
+  "Paste address" start and length, "Paste window id" start bit and number
+  of bits)
+
+Example:
+
+   vas@60191 {
+   compatible = "ibm,vas", "ibm,power9-vas";
+   reg = <0x60191 0x200 0x60190 0x1 
0x8 0x1 0x20 0x10>;
+   name = "vas";
+   ibm,vas-id = <0x1>;
+   };
+
diff --git a/MAINTAINERS b/MAINTAINERS
index 1c3feff..ec68732 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6437,6 +6437,14 @@ L:   net...@vger.kernel.org
 S: Supported
 F: drivers/net/ethernet/ibm/ibmvnic.*
 
+IBM Power Virtual Accelerator Switchboard
+M: Sukadev Bhattiprolu
+L: linuxppc-...@lists.ozlabs.org
+S: Supported
+F: arch/powerpc/platforms/powernv/vas*
+F: arch/powerpc/include/asm/vas.h
+F: arch/powerpc/include/uapi/asm/vas.h
+
 IBM Power Virtual Ethernet Device Driver
 M: Thomas Falcon 
 L: net...@vger.kernel.org
diff --git a/arch/powerpc/platforms/powernv/Kconfig 
b/arch/powerpc/platforms/powernv/Kconfig
index 6a6f4ef..3e3bbe9 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -30,3 +30,17 @@ config OPAL_PRD
help
  This enables the opal-prd driver, a facility to run processor
  recovery diagnostics on OpenPower machines
+
+config PPC_VAS
+   bool "IBM Virtual Accelerator Switchboard (VAS)"
+   depends on PPC_POWERNV && PPC_64K_PAGES
+   default y
+   help
+ This enables support for IBM Virtual Accelerator Switchboard (VAS).
+
+ VAS allows accelerators in co-processors like NX-GZIP and NX-842
+ to be accessible to kernel subsystems and user processes.
+
+ VAS adapters are found in POWER9 based systems.
+
+ If unsure, say N.
diff --git a/arch/powerpc/platforms/powernv/Makefile 
b/arch/powerpc/platforms/po

[PATCH v8 10/10] powerpc/vas: Define copy/paste interfaces

2017-08-28 Thread Sukadev Bhattiprolu
Define interfaces (wrappers) to the 'copy' and 'paste' instructions
(which are new in PowerISA 3.0). These are intended to be used
by NX driver(s) to submit Coprocessor Request Blocks (CRBs) to the
NX hardware engines.

Signed-off-by: Sukadev Bhattiprolu 

---
Changelog[v8]:
- [Michael Ellerman] Drop vas_initialized() check; cleanup asm code,
  reuse existing macros, fix old references; add cr0 to clobbers

Changelog[v4]
- Export symbols
Changelog[v3]
- Map raw CR value from paste instruction into an error code.

Conflicts:
arch/powerpc/platforms/powernv/vas.h
---
 MAINTAINERS |  1 +
 arch/powerpc/include/asm/ppc-opcode.h   |  2 ++
 arch/powerpc/include/asm/vas.h  | 12 
 arch/powerpc/platforms/powernv/copy-paste.h | 46 
 arch/powerpc/platforms/powernv/vas-window.c | 47 +
 arch/powerpc/platforms/powernv/vas.h| 18 +--
 6 files changed, 124 insertions(+), 2 deletions(-)
 create mode 100644 arch/powerpc/platforms/powernv/copy-paste.h

diff --git a/MAINTAINERS b/MAINTAINERS
index ec68732..624c67a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6442,6 +6442,7 @@ M:Sukadev Bhattiprolu
 L: linuxppc-...@lists.ozlabs.org
 S: Supported
 F: arch/powerpc/platforms/powernv/vas*
+F: arch/powerpc/platforms/powernv/copy-paste.h
 F: arch/powerpc/include/asm/vas.h
 F: arch/powerpc/include/uapi/asm/vas.h
 
diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index fa9ebae..749336d 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -414,6 +414,8 @@
___PPC_RB(b))
 #define PPC_MSGCLRP(b) stringify_in_c(.long PPC_INST_MSGCLRP | \
___PPC_RB(b))
+#define PPC_PASTE(a, b)stringify_in_c(.long PPC_INST_PASTE | \
+   ___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_POPCNTB(a, s)  stringify_in_c(.long PPC_INST_POPCNTB | \
__PPC_RA(a) | __PPC_RS(s))
 #define PPC_POPCNTD(a, s)  stringify_in_c(.long PPC_INST_POPCNTD | \
diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index efbdde5..dfc97f5 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -145,4 +145,16 @@ struct vas_window *vas_tx_win_open(int vasid, enum 
vas_cop_type cop,
  */
 int vas_win_close(struct vas_window *win);
 
+/*
+ * Copy the co-processor request block (CRB) @crb into the local L2 cache.
+ */
+extern int vas_copy_crb(void *crb, int offset);
+
+/*
+ * Paste a previously copied CRB (see vas_copy_crb()) from the L2 cache to
+ * the hardware address associated with the window @win. @re is expected/
+ * assumed to be true for NX windows.
+ */
+extern int vas_paste_crb(struct vas_window *win, int offset, bool re);
+
 #endif /* _MISC_VAS_H */
diff --git a/arch/powerpc/platforms/powernv/copy-paste.h 
b/arch/powerpc/platforms/powernv/copy-paste.h
new file mode 100644
index 000..c9a5036
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/copy-paste.h
@@ -0,0 +1,46 @@
+/*
+ * Copyright 2016-17 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include 
+
+#define CR0_SHIFT  28
+#define CR0_MASK   0xF
+/*
+ * Copy/paste instructions:
+ *
+ * copy RA,RB
+ * Copy contents of address (RA) + effective_address(RB)
+ * to internal copy-buffer.
+ *
+ * paste RA,RB
+ * Paste contents of internal copy-buffer to the address
+ * (RA) + effective_address(RB)
+ */
+static inline int vas_copy(void *crb, int offset)
+{
+   asm volatile(PPC_COPY(%0, %1)";"
+   :
+   : "b" (offset), "b" (crb)
+   : "memory");
+
+   return 0;
+}
+
+static inline int vas_paste(void *paste_address, int offset)
+{
+   u32 cr;
+
+   cr = 0;
+   asm volatile(PPC_PASTE(%1, %2)";"
+   "mfocrf %0, 0x80;"
+   : "=r" (cr)
+   : "b" (offset), "b" (paste_address)
+   : "memory", "cr0");
+
+   return (cr >> CR0_SHIFT) & CR0_MASK;
+}
diff --git a/arch/powerpc/platforms/powernv/vas-window.c 
b/arch/powerpc/platforms/powernv/vas-window.c
index cd12e44..b02f26d 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -18,6 +18,7 @@
 #include 
 
 #include "vas.h"
+#include "copy-paste.h"
 
 /*
  * Compute the paste address region for the window @window using the
@@ -997,6 +998,52 @@ struct vas_window *vas_tx_win_open(int vasid, enum 
vas_cop_type cop,
 }
 EXPORT_SYMBOL_GP
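
As a side note on the return value of vas_paste() above, the CR0 extraction can be checked in isolation with a user-space sketch. cr0_field() is a name made up for this illustration, and no meaning is assigned here to individual CR0 values.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_SHIFT	28
#define CR0_MASK	0xF

/* CR0 is assumed to occupy the top nibble of a 32-bit CR image. */
static_assert(CR0_SHIFT + 4 == 32, "CR0 is the top 4 bits of a 32-bit CR");

/* Extract the 4-bit CR0 field from a 32-bit CR image. */
static int cr0_field(uint32_t cr)
{
	return (cr >> CR0_SHIFT) & CR0_MASK;
}
```

Only the top nibble survives the shift and mask, so bits belonging to the other CR fields do not affect the result.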

Re: [PATCH net-next v2 00/10] net: dsa: add generic debugfs interface

2017-08-28 Thread Jiri Pirko
Mon, Aug 28, 2017 at 10:08:34PM CEST, and...@lunn.ch wrote:
>> I see this overlaps a lot with DPIPE. Why won't you use that to expose
>> your hw state?
>
>We took a look at dpipe and i talked to you about using it for this
>sort of thing at netconf/netdev. But dpipe has issues displaying the
>sort of information we have. I never figured out how to do two
>dimensional tables. The output of the dpipe command is pretty
>unreadable. A lot of the information being dumped here is not about
>the data pipe, etc.

So improve it. No problem. Also, we can extend it to support what you need.


>
>There is a lot of pushback on debugfs for individual drivers. As i
>said recently to somebody, debugfs is a bit of a wild west. When
>designing this code, we thought about that. This debugfs is not at the
>driver level. It is at the DSA level. All DSA drivers will benefit
>from this code, and all DSA drivers will get the same information
>exposed in debugfs. It is generic, well defined and structured, with
>respect to DSA.

Still, it has *a lot* of overlap with devlink and dpipe. So instead of
making devlink and dpipe work for you, you introduced completely
separated debugfs interface specific to a list of drivers. That is just
wrong. Debugfs is never the correct answer! Please work with us on
devlink and dpipe so they are used for all drivers, mlxsw, dsa and others.

Thanks!


[PATCH v8 05/10] powerpc/vas: Define helpers to init window context

2017-08-28 Thread Sukadev Bhattiprolu
Define helpers to initialize window context registers of the VAS
hardware. These will be used in follow-on patches when opening/closing
VAS windows.

Signed-off-by: Sukadev Bhattiprolu 
---
Changelog[v8]:
- Update comments (ISA references and some cleanup)
- Use 0 or 1 when setting boolean fields with SET_FIELD()
- Don't write to spare/unused registers.
- Use kernel integer types (u64/u32/s32)
Changelog[v6]
- Add support for FTW windows and drop the fault window id
  code since it is not needed for FTW/kernel windows.
Changelog[v5]
- Fix: Copy the FIFO address into LFIFO_BAR register as is (don't
  shift address into bits 8:53).

Changelog[v4]
- [Michael Neuling] Use ilog2(), radix_enabled() helpers;
  drop warning when 32-bit app uses VAS (a follow-on patch
  will check and return error). Set MSR_PR state to 0 for
  kernel (rather than reading from MSR).

Changelog[v3]
- Have caller, rather than init_xlate_regs() reset window regs
  so we don't reset any settings caller may already have set.
- Translation mode should be 0x3 (0b11) not 0x11.
- Skip initializing read-only registers NX_UTIL and NX_UTIL_SE
- Skip initializing adder registers from UWC - they are already
  initialized from the HVWC.
- Check winctx->user_win when setting translation registers
---
 arch/powerpc/platforms/powernv/vas-window.c | 299 
 arch/powerpc/platforms/powernv/vas.h|  55 +
 2 files changed, 354 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/vas-window.c 
b/arch/powerpc/platforms/powernv/vas-window.c
index 642814a2..68dfe53 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "vas.h"
 
@@ -186,6 +187,304 @@ int map_winctx_mmio_bars(struct vas_window *window)
return 0;
 }
 
+/*
+ * Reset all valid registers in the HV and OS/User Window Contexts for
+ * the window identified by @window.
+ *
+ * NOTE: We cannot really use a for loop to reset window context. Not all
+ *  offsets in a window context are valid registers and the valid
+ *  registers are not sequential. And, we can only write to offsets
+ *  with valid registers.
+ */
+void reset_window_regs(struct vas_window *window)
+{
+   write_hvwc_reg(window, VREG(LPID), 0ULL);
+   write_hvwc_reg(window, VREG(PID), 0ULL);
+   write_hvwc_reg(window, VREG(XLATE_MSR), 0ULL);
+   write_hvwc_reg(window, VREG(XLATE_LPCR), 0ULL);
+   write_hvwc_reg(window, VREG(XLATE_CTL), 0ULL);
+   write_hvwc_reg(window, VREG(AMR), 0ULL);
+   write_hvwc_reg(window, VREG(SEIDR), 0ULL);
+   write_hvwc_reg(window, VREG(FAULT_TX_WIN), 0ULL);
+   write_hvwc_reg(window, VREG(OSU_INTR_SRC_RA), 0ULL);
+   write_hvwc_reg(window, VREG(HV_INTR_SRC_RA), 0ULL);
+   write_hvwc_reg(window, VREG(PSWID), 0ULL);
+   write_hvwc_reg(window, VREG(LFIFO_BAR), 0ULL);
+   write_hvwc_reg(window, VREG(LDATA_STAMP_CTL), 0ULL);
+   write_hvwc_reg(window, VREG(LDMA_CACHE_CTL), 0ULL);
+   write_hvwc_reg(window, VREG(LRFIFO_PUSH), 0ULL);
+   write_hvwc_reg(window, VREG(CURR_MSG_COUNT), 0ULL);
+   write_hvwc_reg(window, VREG(LNOTIFY_AFTER_COUNT), 0ULL);
+   write_hvwc_reg(window, VREG(LRX_WCRED), 0ULL);
+   write_hvwc_reg(window, VREG(LRX_WCRED_ADDER), 0ULL);
+   write_hvwc_reg(window, VREG(TX_WCRED), 0ULL);
+   write_hvwc_reg(window, VREG(TX_WCRED_ADDER), 0ULL);
+   write_hvwc_reg(window, VREG(LFIFO_SIZE), 0ULL);
+   write_hvwc_reg(window, VREG(WINCTL), 0ULL);
+   write_hvwc_reg(window, VREG(WIN_STATUS), 0ULL);
+   write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), 0ULL);
+   write_hvwc_reg(window, VREG(TX_RSVD_BUF_COUNT), 0ULL);
+   write_hvwc_reg(window, VREG(LRFIFO_WIN_PTR), 0ULL);
+   write_hvwc_reg(window, VREG(LNOTIFY_CTL), 0ULL);
+   write_hvwc_reg(window, VREG(LNOTIFY_PID), 0ULL);
+   write_hvwc_reg(window, VREG(LNOTIFY_LPID), 0ULL);
+   write_hvwc_reg(window, VREG(LNOTIFY_TID), 0ULL);
+   write_hvwc_reg(window, VREG(LNOTIFY_SCOPE), 0ULL);
+   write_hvwc_reg(window, VREG(NX_UTIL_ADDER), 0ULL);
+
+   /* Skip read-only registers: NX_UTIL and NX_UTIL_SE */
+
+   /*
+* The send and receive window credit adder registers are also
+* accessible from HVWC and have been initialized above. We don't
+* need to initialize from the OS/User Window Context, so skip
+* following calls:
+*
+*  write_uwc_reg(window, VREG(TX_WCRED_ADDER), 0ULL);
+*  write_uwc_reg(window, VREG(LRX_WCRED_ADDER), 0ULL);
+*/
+}
+
+/*
+ * Initialize window context registers related to Address Translation.
+ * These registers are common to send/receive windows although they
+ * differ for user/kernel windows. As we

[PATCH v8 04/10] powerpc/vas: Define helpers to access MMIO regions

2017-08-28 Thread Sukadev Bhattiprolu
Define some helper functions to access the MMIO regions. We use these
in follow-on patches to read/write VAS hardware registers. They are
also used to later issue 'paste' instructions to submit requests to
the NX hardware engines.

Signed-off-by: Sukadev Bhattiprolu 
---
Changelog [v8]:
Minor cleanup of error/debug messages

Changelog [v6]:
- Minor reorg to make setup/cleanup functions more symmetric

Changelog [v5]:
- [Ben Herrenschmidt]: Need cachable mapping for paste regions
  and non-cachable mapping for the MMIO regions. So, just use
  ioremap() for mapping the MMIO regions; use "winctx" instead
  of "wc" to avoid collision with "write combine".

Changelog [v3]:
- Minor reorg/cleanup of map/unmap functions

Changelog [v2]:
- Get HVWC, UWC and paste addresses from window->vinst (i.e DT)
  rather than kernel macros.
---
 arch/powerpc/platforms/powernv/vas-window.c | 174 
 1 file changed, 174 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/vas-window.c 
b/arch/powerpc/platforms/powernv/vas-window.c
index de21acb..642814a2 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -7,11 +7,185 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#define pr_fmt(fmt) "vas: " fmt
+
 #include 
 #include 
+#include 
+#include 
 
 #include "vas.h"
 
+/*
+ * Compute the paste address region for the window @window using the
+ * ->paste_base_addr and ->paste_win_id_shift we got from device tree.
+ */
+static void compute_paste_address(struct vas_window *window, u64 *addr, int 
*len)
+{
+   int winid;
+   u64 base, shift;
+
+   base = window->vinst->paste_base_addr;
+   shift = window->vinst->paste_win_id_shift;
+   winid = window->winid;
+
+   *addr  = base + (winid << shift);
+   if (len)
+   *len = PAGE_SIZE;
+
+   pr_debug("Txwin #%d: Paste addr 0x%llx\n", winid, *addr);
+}
+
+static inline void get_hvwc_mmio_bar(struct vas_window *window,
+   u64 *start, int *len)
+{
+   u64 pbaddr;
+
+   pbaddr = window->vinst->hvwc_bar_start;
+   *start = pbaddr + window->winid * VAS_HVWC_SIZE;
+   *len = VAS_HVWC_SIZE;
+}
+
+static inline void get_uwc_mmio_bar(struct vas_window *window,
+   u64 *start, int *len)
+{
+   u64 pbaddr;
+
+   pbaddr = window->vinst->uwc_bar_start;
+   *start = pbaddr + window->winid * VAS_UWC_SIZE;
+   *len = VAS_UWC_SIZE;
+}
+
+/*
+ * Map the paste bus address of the given send window into kernel address
+ * space. Unlike MMIO regions (map_mmio_region() below), paste region must
+ * be mapped cache-able and is only applicable to send windows.
+ */
+void *map_paste_region(struct vas_window *txwin)
+{
+   int len;
+   void *map;
+   char *name;
+   u64 start;
+
+   name = kasprintf(GFP_KERNEL, "window-v%d-w%d", txwin->vinst->vas_id,
+   txwin->winid);
+   if (!name)
+   goto free_name;
+
+   txwin->paste_addr_name = name;
+   compute_paste_address(txwin, &start, &len);
+
+   if (!request_mem_region(start, len, name)) {
+   pr_devel("%s(): request_mem_region(0x%llx, %d) failed\n",
+   __func__, start, len);
+   goto free_name;
+   }
+
+   map = ioremap_cache(start, len);
+   if (!map) {
+   pr_devel("%s(): ioremap_cache(0x%llx, %d) failed\n", __func__,
+   start, len);
+   goto free_name;
+   }
+
+   pr_devel("Mapped paste addr 0x%llx to kaddr 0x%p\n", start, map);
+   return map;
+
+free_name:
+   kfree(name);
+   return ERR_PTR(-ENOMEM);
+}
+
+
+static void *map_mmio_region(char *name, u64 start, int len)
+{
+   void *map;
+
+   if (!request_mem_region(start, len, name)) {
+   pr_devel("%s(): request_mem_region(0x%llx, %d) failed\n",
+   __func__, start, len);
+   return NULL;
+   }
+
+   map = ioremap(start, len);
+   if (!map) {
+   pr_devel("%s(): ioremap(0x%llx, %d) failed\n", __func__, start,
+   len);
+   return NULL;
+   }
+
+   return map;
+}
+
+static void unmap_region(void *addr, u64 start, int len)
+{
+   iounmap(addr);
+   release_mem_region((phys_addr_t)start, len);
+}
+
+/*
+ * Unmap the paste address region for a window.
+ */
+void unmap_paste_region(struct vas_window *window)
+{
+   int len;
+   u64 busaddr_start;
+
+   if (window->paste_kaddr) {
+   compute_paste_address(window, &busaddr_start, &len);
+   unmap_region(window->paste_kaddr, busaddr_start, len);
+   window->paste_kaddr = NULL;
+   kfree(window->paste_addr_name);
+   window->paste_addr_name = NULL;
+   }
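
The address arithmetic in compute_paste_address() above can be tried out in user space with the sketch below. paste_address() is a hypothetical name, and widening the window id to 64 bits before the shift is a defensive choice made for this sketch, not something the patch itself does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Paste address for a window: the base address plus the window id placed
 * at bit position 'shift'. The id is widened first so that the shift is
 * evaluated in 64 bits rather than in int.
 */
static uint64_t paste_address(uint64_t base, unsigned int winid,
			      unsigned int shift)
{
	assert(shift < 64);
	return base + ((uint64_t)winid << shift);
}
```

The base, window id and shift values used to exercise this are arbitrary; the real ones come from the device tree as described in the comment above compute_paste_address().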

[PATCH v8 02/10] Move GET_FIELD/SET_FIELD to vas.h

2017-08-28 Thread Sukadev Bhattiprolu
Move the GET_FIELD and SET_FIELD macros to vas.h as VAS and other
users of VAS, including NX-842 can use those macros.

There is a lot of related code between the VAS/NX kernel drivers
and skiboot. For consistency, switch the order of parameters in
SET_FIELD to match the order in skiboot.

Signed-off-by: Sukadev Bhattiprolu 
Reviewed-by: Dan Streetman 
---

Changelog[v7]
[Michael Ellerman] Move the macros to  rather than
to 

Changelog[v3]
- Fix order of parameters in nx-842 driver.
---
 arch/powerpc/include/asm/vas.h | 8 
 drivers/crypto/nx/nx-842-powernv.c | 7 ---
 drivers/crypto/nx/nx-842.h | 5 -
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index ff87e44..33e93ca 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -30,6 +30,14 @@
 #define VAS_THRESH_FIFO_GT_EIGHTH_FULL 3
 
 /*
+ * Get/Set bit fields
+ */
+#define GET_FIELD(m, v)(((v) & (m)) >> MASK_LSH(m))
+#define MASK_LSH(m)(__builtin_ffsl(m) - 1)
+#define SET_FIELD(m, v, val)   \
+   (((v) & ~(m)) | ((((typeof(v))(val)) << MASK_LSH(m)) & (m)))
+
+/*
  * Co-processor Engine type.
  */
 enum vas_cop_type {
diff --git a/drivers/crypto/nx/nx-842-powernv.c 
b/drivers/crypto/nx/nx-842-powernv.c
index 1710f80..3abb045 100644
--- a/drivers/crypto/nx/nx-842-powernv.c
+++ b/drivers/crypto/nx/nx-842-powernv.c
@@ -22,6 +22,7 @@
 
 #include 
 #include 
+#include 
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Dan Streetman ");
@@ -424,9 +425,9 @@ static int nx842_powernv_function(const unsigned char *in, 
unsigned int inlen,
 
/* set up CCW */
ccw = 0;
-   ccw = SET_FIELD(ccw, CCW_CT, nx842_ct);
-   ccw = SET_FIELD(ccw, CCW_CI_842, 0); /* use 0 for hw auto-selection */
-   ccw = SET_FIELD(ccw, CCW_FC_842, fc);
+   ccw = SET_FIELD(CCW_CT, ccw, nx842_ct);
+   ccw = SET_FIELD(CCW_CI_842, ccw, 0); /* use 0 for hw auto-selection */
+   ccw = SET_FIELD(CCW_FC_842, ccw, fc);
 
/* set up CRB's CSB addr */
csb_addr = nx842_get_pa(csb) & CRB_CSB_ADDRESS;
diff --git a/drivers/crypto/nx/nx-842.h b/drivers/crypto/nx/nx-842.h
index a4eee3b..30929bd 100644
--- a/drivers/crypto/nx/nx-842.h
+++ b/drivers/crypto/nx/nx-842.h
@@ -100,11 +100,6 @@ static inline unsigned long nx842_get_pa(void *addr)
return page_to_phys(vmalloc_to_page(addr)) + offset_in_page(addr);
 }
 
-/* Get/Set bit fields */
-#define MASK_LSH(m)(__builtin_ffsl(m) - 1)
-#define GET_FIELD(v, m)(((v) & (m)) >> MASK_LSH(m))
-#define SET_FIELD(v, m, val)   (((v) & ~(m)) | (((val) << MASK_LSH(m)) & (m)))
-
 /**
  * This provides the driver's constraints.  Different nx842 implementations
  * may have varying requirements.  The constraints are:
-- 
2.7.4
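
A quick user-space sketch of the macros with the new (mask, value) parameter order follows. It assumes GCC or Clang (for __builtin_ffsl and __typeof__); EX_FIELD_A, EX_FIELD_B and pack_example() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Shift that brings the lowest set bit of mask m down to bit 0. */
#define MASK_LSH(m)		(__builtin_ffsl(m) - 1)
#define GET_FIELD(m, v)		(((v) & (m)) >> MASK_LSH(m))
#define SET_FIELD(m, v, val)	\
	(((v) & ~(m)) | ((((__typeof__(v))(val)) << MASK_LSH(m)) & (m)))

/* Hypothetical field masks inside a register image. */
#define EX_FIELD_A	0x0ff0UL
#define EX_FIELD_B	0xf000UL

/* Pack two values into a register image, mask first as in the new order. */
static unsigned long pack_example(unsigned long a, unsigned long b)
{
	unsigned long reg = 0;

	assert(a <= 0xff && b <= 0xf);
	reg = SET_FIELD(EX_FIELD_A, reg, a);
	reg = SET_FIELD(EX_FIELD_B, reg, b);
	return reg;
}
```

With the mask as the first argument, callers such as the nx-842 hunk above read as SET_FIELD(mask, current_value, new_field_value).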



[PATCH v8 08/10] powerpc/vas: Define vas_win_close() interface

2017-08-28 Thread Sukadev Bhattiprolu
Define the vas_win_close() interface which should be used to close a
send or receive window.

While the hardware configurations required to open send and receive windows
differ, the configuration to close a window is the same for both. So we use
a single interface to close the window.

Signed-off-by: Sukadev Bhattiprolu 
---
Changelog[v8]
- [Michael Ellerman] Set task_state() and pass correct values to
   schedule_timeout().

Changelog[v4]:
- Drop the poll for credits return (we can set the required credit,
  but cannot really find the available credit at a point in time)
- Export the symbol

Changelog[v3]:
- Fix order of parameters in GET_FIELD().
- Update references and sequence for closing/quiescing a window.

Conflicts:
arch/powerpc/platforms/powernv/vas-window.c
---
 arch/powerpc/include/asm/vas.h  |   7 ++
 arch/powerpc/platforms/powernv/vas-window.c | 101 ++--
 2 files changed, 103 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 5ce800a..e124856 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -96,4 +96,11 @@ extern void vas_init_rx_win_attr(struct vas_rx_win_attr 
*rxattr,
 extern struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
struct vas_rx_win_attr *attr);
 
+/*
+ * Close the send or receive window identified by @win. For receive windows
+ * return -EAGAIN if there are active send windows attached to this receive
+ * window.
+ */
+int vas_win_close(struct vas_window *win);
+
 #endif /* _MISC_VAS_H */
diff --git a/arch/powerpc/platforms/powernv/vas-window.c 
b/arch/powerpc/platforms/powernv/vas-window.c
index 783c8c6..39aa0e4 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -130,7 +130,7 @@ static void unmap_region(void *addr, u64 start, int len)
 /*
  * Unmap the paste address region for a window.
  */
-void unmap_paste_region(struct vas_window *window)
+static void unmap_paste_region(struct vas_window *window)
 {
int len;
u64 busaddr_start;
@@ -522,7 +522,7 @@ static int vas_assign_window_id(struct ida *ida)
return winid;
 }
 
-void vas_window_free(struct vas_window *window)
+static void vas_window_free(struct vas_window *window)
 {
int winid = window->winid;
struct vas_instance *vinst = window->vinst;
@@ -560,6 +560,14 @@ static struct vas_window *vas_window_alloc(struct 
vas_instance *vinst)
return ERR_PTR(-ENOMEM);
 }
 
+static void put_rx_win(struct vas_window *rxwin)
+{
+   /* Better not be a send window! */
+   WARN_ON_ONCE(rxwin->tx_win);
+
+   atomic_dec(&rxwin->num_txwins);
+}
+
 /*
  * Get the VAS receive window associated with NX engine identified
  * by @cop and if applicable, @pswid.
@@ -627,7 +635,7 @@ static void set_vinst_win(struct vas_instance *vinst,
  * Clear this window from the table(s) of windows for this VAS instance.
  * See also function header of set_vinst_win().
  */
-void clear_vinst_win(struct vas_window *window)
+static void clear_vinst_win(struct vas_window *window)
 {
int id = window->winid;
struct vas_instance *vinst = window->vinst;
@@ -839,8 +847,91 @@ struct vas_window *vas_rx_win_open(int vasid, enum 
vas_cop_type cop,
 }
 EXPORT_SYMBOL_GPL(vas_rx_win_open);
 
-/* stub for now */
+static void poll_window_busy_state(struct vas_window *window)
+{
+   int busy;
+   u64 val;
+
+retry:
+   /*
+* Poll Window Busy flag
+*/
+   val = read_hvwc_reg(window, VREG(WIN_STATUS));
+   busy = GET_FIELD(VAS_WIN_BUSY, val);
+   if (busy) {
+   val = 0;
+   set_current_state(TASK_UNINTERRUPTIBLE);
+   schedule_timeout(HZ);
+   goto retry;
+   }
+}
+
+static void poll_window_castout(struct vas_window *window)
+{
+   int cached;
+   u64 val;
+
+   /* Cast window context out of the cache */
+retry:
+   val = read_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL));
+   cached = GET_FIELD(VAS_WIN_CACHE_STATUS, val);
+   if (cached) {
+   val = 0ULL;
+   val = SET_FIELD(VAS_CASTOUT_REQ, val, 1);
+   val = SET_FIELD(VAS_PUSH_TO_MEM, val, 0);
+   write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), val);
+
+   set_current_state(TASK_UNINTERRUPTIBLE);
+   schedule_timeout(HZ);
+   goto retry;
+   }
+}
+
+/*
+ * Close a window.
+ *
+ * See Section 1.12.1 of VAS workbook v1.05 for details on closing window:
+ * - Disable new paste operations (unmap paste address)
+ * - Poll for the "Window Busy" bit to be cleared
+ * - Clear the Open/Enable bit for the Window.
+ * - Poll for return of window Credits (implies FIFO empty for Rx win?)
+ * - Unpin and cast window context out of cache
+ *
+ * Besides the h

[PATCH v8 07/10] powerpc/vas: Define vas_rx_win_open() interface

2017-08-28 Thread Sukadev Bhattiprolu
Define the vas_rx_win_open() interface. This interface is intended to be
used by the Nest Accelerator (NX) driver(s) to setup receive windows for
one or more NX engines (which implement compression/encryption algorithms
in the hardware).

Follow-on patches will provide an interface to close the window and to open
a send window that kernel subsystems can use to access the NX engines.

The interface to open a receive window is expected to be invoked for each
instance of VAS in the system.

Signed-off-by: Sukadev Bhattiprolu 
---

Changelog[v8]:
- [Michael Ellerman] Drop vas_initialized() check; use pr_fmt;
  use kernel integer types (u32, u64 etc);
- Dropped (deferred) code that handles user rx windows.

Changelog[v7]:
- vas_rx_win_open() is simplified because the API for FTW windows
  is simplified. We expect the driver to open an rxwin and txwin
  one after the other and we don't get back an rx_win_handle from
  user space so don't have to validate the pid permissions.

Changelog[v6]:
- Add support for FTW windows

Changelog[v4]:
- Export the symbols

Changelog[v3]:
- Fault receive windows must enable interrupts and disable
  notifications. NX Windows are opposite.
- Use macros rather than enum for threshold-control mode
- Ignore irq_ports for in-kernel windows. They are needed for
  user space windows and will be added later
---
 arch/powerpc/include/asm/vas.h  |  46 +
 arch/powerpc/platforms/powernv/vas-window.c | 283 +++-
 arch/powerpc/platforms/powernv/vas.h|  14 ++
 3 files changed, 342 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 33e93ca..5ce800a 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -50,4 +50,50 @@ enum vas_cop_type {
VAS_COP_TYPE_MAX,
 };
 
+/*
+ * Receive window attributes specified by the (in-kernel) owner of window.
+ */
+struct vas_rx_win_attr {
+   void *rx_fifo;
+   int rx_fifo_size;
+   int wcreds_max;
+
+   bool pin_win;
+   bool rej_no_credit;
+   bool tx_wcred_mode;
+   bool rx_wcred_mode;
+   bool tx_win_ord_mode;
+   bool rx_win_ord_mode;
+   bool data_stamp;
+   bool nx_win;
+   bool fault_win;
+   bool user_win;
+   bool notify_disable;
+   bool intr_disable;
+   bool notify_early;
+
+   int lnotify_lpid;
+   int lnotify_pid;
+   int lnotify_tid;
+   u32 pswid;
+
+   int tc_mode;
+};
+
+/*
+ * Helper to initialize receive window attributes to defaults for an
+ * NX window.
+ */
+extern void vas_init_rx_win_attr(struct vas_rx_win_attr *rxattr,
+   enum vas_cop_type cop);
+
+/*
+ * Open a VAS receive window for the instance of VAS identified by @vasid
+ * Use @attr to initialize the attributes of the window.
+ *
+ * Return a handle to the window or ERR_PTR() on error.
+ */
+extern struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
+   struct vas_rx_win_attr *attr);
+
 #endif /* _MISC_VAS_H */
diff --git a/arch/powerpc/platforms/powernv/vas-window.c 
b/arch/powerpc/platforms/powernv/vas-window.c
index bfc9dba..783c8c6 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -14,6 +14,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "vas.h"
 
@@ -531,7 +533,7 @@ void vas_window_free(struct vas_window *window)
vas_release_window_id(&vinst->ida, winid);
 }
 
-struct vas_window *vas_window_alloc(struct vas_instance *vinst)
+static struct vas_window *vas_window_alloc(struct vas_instance *vinst)
 {
int winid;
struct vas_window *window;
@@ -558,6 +560,285 @@ struct vas_window *vas_window_alloc(struct vas_instance 
*vinst)
return ERR_PTR(-ENOMEM);
 }
 
+/*
+ * Get the VAS receive window associated with NX engine identified
+ * by @cop and if applicable, @pswid.
+ *
+ * See also function header of set_vinst_win().
+ */
+struct vas_window *get_vinst_rxwin(struct vas_instance *vinst,
+   enum vas_cop_type cop, u32 pswid)
+{
+   struct vas_window *rxwin;
+
+   mutex_lock(&vinst->mutex);
+
+   if (cop == VAS_COP_TYPE_842 || cop == VAS_COP_TYPE_842_HIPRI)
+   rxwin = vinst->rxwin[cop] ?: ERR_PTR(-EINVAL);
+   else
+   rxwin = ERR_PTR(-EINVAL);
+
+   if (!IS_ERR(rxwin))
+   atomic_inc(&rxwin->num_txwins);
+
+   mutex_unlock(&vinst->mutex);
+
+   return rxwin;
+}
+
+/*
+ * We have two tables of windows in a VAS instance. The first one,
+ * ->windows[], contains all the windows in the instance and allows
+ * looking up a window by its id. It is used to look up send windows
+ * during fault handling and receive windows when pairing user space
+ * send/receive windows.
+ *
+ * The second table, ->r

[PATCH v8 06/10] powerpc/vas: Define helpers to alloc/free windows

2017-08-28 Thread Sukadev Bhattiprolu
Define helpers to allocate/free VAS window objects. These will
be used in follow-on patches when opening/closing windows.
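
The allocate/free pairing can be sketched in userspace as follows. This
is only an illustration under stated assumptions: the demo_* names are
hypothetical, a small fixed table stands in for the IDA, and
demo_err_ptr()/demo_is_err() imitate the kernel's ERR_PTR()/IS_ERR()
convention rather than being the kernel helpers themselves.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_WINDOWS 4      /* hypothetical limit on window ids */
#define DEMO_MAX_ERRNO   4095

/* Userspace imitations of ERR_PTR()/PTR_ERR()/IS_ERR(). */
static inline void *demo_err_ptr(long err) { return (void *)err; }
static inline long demo_ptr_err(const void *p) { return (long)p; }
static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_window { int winid; };

static unsigned char demo_id_used[DEMO_MAX_WINDOWS];

/* Hand out the lowest free id, or -EAGAIN when none is left. */
static int demo_assign_id(void)
{
	for (int i = 0; i < DEMO_MAX_WINDOWS; i++) {
		if (!demo_id_used[i]) {
			demo_id_used[i] = 1;
			return i;
		}
	}
	return -EAGAIN;
}

static void demo_release_id(int id) { demo_id_used[id] = 0; }

static struct demo_window *demo_window_alloc(void)
{
	struct demo_window *w;
	int id = demo_assign_id();

	if (id < 0)
		return demo_err_ptr(id);

	w = calloc(1, sizeof(*w));
	if (!w) {
		/* Do not leak the id when the allocation fails. */
		demo_release_id(id);
		return demo_err_ptr(-ENOMEM);
	}
	w->winid = id;
	return w;
}

static void demo_window_free(struct demo_window *w)
{
	int id = w->winid;      /* read before the object is freed */

	free(w);
	demo_release_id(id);
}
```

The point of the sketch is the error unwinding: every path that fails
after the id was assigned gives the id back.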

Changelog[v8]:
- [Michael Ellerman] Make some functions static; retry if
  ida_get_new() fails with EAGAIN; fix a couple of leaks in ids

Signed-off-by: Sukadev Bhattiprolu 
---
 arch/powerpc/platforms/powernv/vas-window.c | 73 +
 1 file changed, 73 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/vas-window.c 
b/arch/powerpc/platforms/powernv/vas-window.c
index 68dfe53..bfc9dba 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -485,6 +485,79 @@ int init_winctx_regs(struct vas_window *window, struct 
vas_winctx *winctx)
return 0;
 }
 
+static DEFINE_SPINLOCK(vas_ida_lock);
+
+static void vas_release_window_id(struct ida *ida, int winid)
+{
+   spin_lock(&vas_ida_lock);
+   ida_remove(ida, winid);
+   spin_unlock(&vas_ida_lock);
+}
+
+static int vas_assign_window_id(struct ida *ida)
+{
+   int rc, winid;
+
+   do {
+   rc = ida_pre_get(ida, GFP_KERNEL);
+   if (!rc)
+   return -EAGAIN;
+
+   spin_lock(&vas_ida_lock);
+   rc = ida_get_new(ida, &winid);
+   spin_unlock(&vas_ida_lock);
+   } while (rc == -EAGAIN);
+
+   if (rc)
+   return rc;
+
+   if (winid > VAS_WINDOWS_PER_CHIP) {
+   pr_err("Too many (%d) open windows\n", winid);
+   vas_release_window_id(ida, winid);
+   return -EAGAIN;
+   }
+
+   return winid;
+}
+
+void vas_window_free(struct vas_window *window)
+{
+   int winid = window->winid;
+   struct vas_instance *vinst = window->vinst;
+
+   unmap_winctx_mmio_bars(window);
+   kfree(window);
+
+   vas_release_window_id(&vinst->ida, winid);
+}
+
+struct vas_window *vas_window_alloc(struct vas_instance *vinst)
+{
+   int winid;
+   struct vas_window *window;
+
+   winid = vas_assign_window_id(&vinst->ida);
+   if (winid < 0)
+   return ERR_PTR(winid);
+
+   window = kzalloc(sizeof(*window), GFP_KERNEL);
+   if (!window)
+   goto out_free;
+
+   window->vinst = vinst;
+   window->winid = winid;
+
+   if (map_winctx_mmio_bars(window))
+   goto out_free;
+
+   return window;
+
+out_free:
+   kfree(window);
+   vas_release_window_id(&vinst->ida, winid);
+   return ERR_PTR(-ENOMEM);
+}
+
 /* stub for now */
 int vas_win_close(struct vas_window *window)
 {
-- 
2.7.4



[PATCH v8 00/10] Enable VAS

2017-08-28 Thread Sukadev Bhattiprolu
Power9 introduces a hardware subsystem referred to as the Virtual
Accelerator Switchboard (VAS). VAS allows kernel subsystems and user
space processes to directly access the Nest Accelerator (NX) engines
which implement compression and encryption algorithms in the hardware.

NX has been in Power processors since Power7+, but access to the NX
engines was through the 'icswx' instruction which is only available
to the kernel/hypervisor. Starting with Power9, access to the NX
engines is provided to both kernel and user space processes through
VAS.

The switchboard (i.e VAS) multiplexes accesses between "receivers" and
"senders", where the "receivers" are typically the NX engines and
"senders" are the kernel subsystems and user processes that wish to
access the receivers (NX engines).  Once a sender is "connected" to
a receiver through the switchboard, the senders can submit compression/
encryption requests to the hardware using the new (PowerISA 3.0)
"copy" and "paste" instructions.

In the initial OPAL and PowerNV kernel patchsets, the "senders" can
only be kernel subsystems (eg NX-842 driver) and receivers can only
be the NX-842 engine. Follow-on patch sets will allow senders/receivers
to be user-space processes and receivers to be NX-GZIP engines.

Provides:

This kernel patch set configures the VAS subsystems and provides
kernel interfaces to drivers like NX-842 to open receive and send
windows in VAS and to submit compression requests to the NX engine.

Requires:

This patch set needs corresponding VAS/NX skiboot patches which
were merged into skiboot tree. i.e skiboot must include:
commit b503dcf ("vas: Set mmio enable bits in DD2")

Tests:
In-kernel compression requests were tested on DD1 and DD2 POWER9
hardware using compression self-test module and the following
NX-842 patch set from Haren Myneni:

https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-July/160620.html

and by dropping the last parameters to both vas_copy_crb() and
vas_paste_crb() calls in drivers/crypto/nx/nx-842-powernv.c.
See also PATCH 10/10.

Git Tree:

https://github.com/sukadev/linux/ 
Branch: vas-kern-v8

Thanks to input from Ben Herrenschmidt, Michael Neuling, Michael Ellerman
and Haren Myneni.

Changelog[v8]:
- [Michael Ellerman] Use kernel int types (u64, u32 etc); make VAS
  a built-in rather than a module; drop unnecessary fields from
  struct vas_instance; Update ISA references; use 0 or 1 with
  SET_FIELD macros instead of bool; skip writing to SPARE registers;
  minor cleanup of debug/error messages; retry if ida_get_new()
  fails with EAGAIN; fix couple of leaks in ids in error handling;
  drop vas_initialized() check; drop vas_win_id() and vas_paste_addr()
  interfaces as they are not yet used; Set task_state() and fix
  parameter to schedule_timeout(); Reuse existing copy/paste macros;
  drop unnecessary parameters and add cr0 to clobbers list

Changelog[v7]:
- Drop support for user space send/receive FTW windows (will be
  posted separately). Simplifies the rx-win-open interface a bit.
- [Michael Ellerman] Move GET_FIELD/SET_FIELD macros from 
  uapi/asm/vas.h to asm/vas.h.

Changelog[v6]
- Add support for user space send/receive FTW windows
- Add a new, NX-FTW driver which provides the FTW user interface

Changelog[v5]
- [Ben Herrenschmidt] Make VAS a platform device in the device tree
  and use the core platform functions to parse the VAS properties.
  Map the VAS MMIO regions as non-cachable and paste regions as
  cachable. Use CONFIG_PPC_VAS rather than CONFIG_VAS; Don't assume
  VAS ids are sequential.
- Copy the FIFO address as is into LFIFO_BAR (don't shift it).

Changelog[v4]
Comments from Michael Neuling:
- Move VAS code from drivers/misc/vas to arch/powerpc/platforms/powernv
  since VAS only provides interfaces to other drivers like NX-842.
- Drop vas-internal.h and use vas.h in separate dirs for VAS
  internal, kernel API and user API
- Rather than create 6 separate device tree properties windows
  and window context, combine them into 6 "reg" properties.
- Drop vas_window_reset() since windows are reset/cleared before
  being assigned to kernel/users.
- Use ilog2() and radix_enabled() helpers

Changelog[v3]
- Rebase to v4.11-rc1
- Add interfaces to initialize send/receive window attributes to
  defaults that drivers can use (see arch/powerpc/include/asm/vas.h)
- Modify interface vas_paste() to return 0 or error code
- Fix a bug in setting Translation Control Mode (0b11 not 0x11)
- Enable send-window-credit checking 
- Reorg code  in vas_win_close()
- Minor reorgs and tweaks 

[PATCH v8 09/10] powerpc/vas: Define vas_tx_win_open()

2017-08-28 Thread Sukadev Bhattiprolu
Define an interface to open a VAS send window. This interface is
intended to be used by the Nest Accelerator (NX) driver(s) to open
a send window and use it to submit compression/encryption requests
to a VAS receive window.

The receive window, identified by the [vasid, cop] parameters, must
already be open in VAS (i.e connected to an NX engine).

Signed-off-by: Sukadev Bhattiprolu 

---
Changelog[v8]:
- [Michael Ellerman] Drop vas_initialized() check; defer code
  that sets pswid;

Changelog[v7]:
- Initialize txwin->user_win field for FTW windows.

Changelog[v6]:
- Add support for FTW windows

Changelog[v4]:
- [Ben Herrenschmidt] MMIO regions must be mapped non-cached and
  paste regions must be mapped cached. Define/use map_paste_region().

Changelog [v3]:
- Distinguish between hardware PID (SPRN_PID) and Linux pid.
- Use macros rather than enum for threshold-control mode
- Set the pid of send window from attr (needed for user space
  send windows).
- Ignore irq port setting for now. They are needed for user space
  windows and will be added later
---
 arch/powerpc/include/asm/vas.h  |  42 
 arch/powerpc/platforms/powernv/vas-window.c | 156 +++-
 2 files changed, 195 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index e124856..efbdde5 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -81,6 +81,29 @@ struct vas_rx_win_attr {
 };
 
 /*
+ * Window attributes specified by the in-kernel owner of a send window.
+ */
+struct vas_tx_win_attr {
+   enum vas_cop_type cop;
+   int wcreds_max;
+   int lpid;
+   int pidr;   /* hardware PID (from SPRN_PID) */
+   int pid;/* linux process id */
+   int pswid;
+   int rsvd_txbuf_count;
+   int tc_mode;
+
+   bool user_win;
+   bool pin_win;
+   bool rej_no_credit;
+   bool rsvd_txbuf_enable;
+   bool tx_wcred_mode;
+   bool rx_wcred_mode;
+   bool tx_win_ord_mode;
+   bool rx_win_ord_mode;
+};
+
+/*
  * Helper to initialize receive window attributes to defaults for an
  * NX window.
  */
@@ -97,6 +120,25 @@ extern struct vas_window *vas_rx_win_open(int vasid, enum 
vas_cop_type cop,
struct vas_rx_win_attr *attr);
 
 /*
+ * Helper to initialize send window attributes to defaults for an NX window.
+ */
+extern void vas_init_tx_win_attr(struct vas_tx_win_attr *txattr,
+   enum vas_cop_type cop);
+
+/*
+ * Open a VAS send window for the instance of VAS identified by @vasid
+ * and the co-processor type @cop. Use @attr to initialize attributes
+ * of the window.
+ *
+ * Note: The instance of VAS must already have an open receive window for
+ * the coprocessor type @cop.
+ *
+ * Return a handle to the send window or ERR_PTR() on error.
+ */
+struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
+   struct vas_tx_win_attr *attr);
+
+/*
  * Close the send or receive window identified by @win. For receive windows
  * return -EAGAIN if there are active send windows attached to this receive
  * window.
diff --git a/arch/powerpc/platforms/powernv/vas-window.c 
b/arch/powerpc/platforms/powernv/vas-window.c
index 39aa0e4..cd12e44 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -64,7 +64,7 @@ static inline void get_uwc_mmio_bar(struct vas_window *window,
  * space. Unlike MMIO regions (map_mmio_region() below), paste region must
  * be mapped cache-able and is only applicable to send windows.
  */
-void *map_paste_region(struct vas_window *txwin)
+static void *map_paste_region(struct vas_window *txwin)
 {
int len;
void *map;
@@ -100,7 +100,6 @@ void *map_paste_region(struct vas_window *txwin)
return ERR_PTR(-ENOMEM);
 }
 
-
 static void *map_mmio_region(char *name, u64 start, int len)
 {
void *map;
@@ -574,7 +573,7 @@ static void put_rx_win(struct vas_window *rxwin)
  *
  * See also function header of set_vinst_win().
  */
-struct vas_window *get_vinst_rxwin(struct vas_instance *vinst,
+static struct vas_window *get_vinst_rxwin(struct vas_instance *vinst,
enum vas_cop_type cop, u32 pswid)
 {
struct vas_window *rxwin;
@@ -847,6 +846,157 @@ struct vas_window *vas_rx_win_open(int vasid, enum 
vas_cop_type cop,
 }
 EXPORT_SYMBOL_GPL(vas_rx_win_open);
 
+void vas_init_tx_win_attr(struct vas_tx_win_attr *txattr, enum vas_cop_type 
cop)
+{
+   memset(txattr, 0, sizeof(*txattr));
+
+   if (cop == VAS_COP_TYPE_842 || cop == VAS_COP_TYPE_842_HIPRI) {
+   txattr->rej_no_credit = false;
+   txattr->rx_wcred_mode = true;
+   txattr->tx_wcred_mode = true;
+   txattr->rx_win_ord_mode = true;
+   txattr->tx_win_or

Re: [PATCH] Staging: ks7010: Fix hardcoded function names in strings. Warnings reported by checkpatch.pl.

2017-08-28 Thread Greg KH
On Mon, Aug 28, 2017 at 05:32:39PM -0600, Jonathan Whitaker wrote:
> This commit replaces hardcoded function name strings to the more preferred 
> '"%s...", __func__'
> style. These warnings were reported by checkpatch.pl.

Please wrap your changelog text at 72 columns.

And your subject is very odd, please fix that up as well.

> 
> Signed-off-by: Jonathan Whitaker 
> ---
>  drivers/staging/ks7010/ks7010_sdio.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/ks7010/ks7010_sdio.c 
> b/drivers/staging/ks7010/ks7010_sdio.c
> index 9b28ee1..c0e91c3 100644
> --- a/drivers/staging/ks7010/ks7010_sdio.c
> +++ b/drivers/staging/ks7010/ks7010_sdio.c
> @@ -834,7 +834,7 @@ static int ks7010_sdio_probe(struct sdio_func *func,
>   unsigned char byte;
>   int ret;
>  
> - DPRINTK(5, "ks7010_sdio_probe()\n");
> + DPRINTK(5, "%s()\n", __func__);
>  
>   priv = NULL;
>   netdev = NULL;
> @@ -1008,7 +1008,7 @@ static void ks7010_sdio_remove(struct sdio_func *func)
>   struct ks_sdio_card *card;
>   struct ks_wlan_private *priv;
>  
> - DPRINTK(1, "ks7010_sdio_remove()\n");
> + DPRINTK(1, "%s()\n", __func__);

These lines can just be deleted entirely, we have in-kernel
functionality for tracing kernel function calls, no need to have special
debug lines just for that.
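
For readers unfamiliar with the style checkpatch.pl asks for, a minimal
userspace sketch is shown here. DEMO_TRACE and demo_probe() are
hypothetical and only illustrate how __func__ replaces a hardcoded name;
as noted above, in the driver itself such trace lines are better removed.

```c
#include <stdio.h>

/* Hypothetical helper: formats "<caller>()" into buf using __func__. */
#define DEMO_TRACE(buf, len) snprintf((buf), (len), "%s()", __func__)

static int demo_probe(char *buf, size_t len)
{
	/*
	 * __func__ expands to the enclosing function's name, so a later
	 * rename cannot leave a stale string behind.
	 */
	return DEMO_TRACE(buf, len);
}
```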

thanks,

greg k-h


Re: [PATCH] fpga: make xlnx_pr_decoupler_br_ops const

2017-08-28 Thread Michal Simek
On 28.8.2017 19:32, Bhumika Goyal wrote:
> Make this const as it is only passed to a const argument of the function
> fpga_bridge_register.
> 
> Signed-off-by: Bhumika Goyal 
> ---
>  drivers/fpga/xilinx-pr-decoupler.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/fpga/xilinx-pr-decoupler.c 
> b/drivers/fpga/xilinx-pr-decoupler.c
> index e359930..0d77430 100644
> --- a/drivers/fpga/xilinx-pr-decoupler.c
> +++ b/drivers/fpga/xilinx-pr-decoupler.c
> @@ -79,7 +79,7 @@ static int xlnx_pr_decoupler_enable_show(struct fpga_bridge 
> *bridge)
>   return !status;
>  }
>  
> -static struct fpga_bridge_ops xlnx_pr_decoupler_br_ops = {
> +static const struct fpga_bridge_ops xlnx_pr_decoupler_br_ops = {
>   .enable_set = xlnx_pr_decoupler_enable_set,
>   .enable_show = xlnx_pr_decoupler_enable_show,
>  };
> 

Acked-by: Michal Simek 

Thanks,
Michal
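
The reasoning behind the const change can be sketched in plain C. The
demo_* names below are hypothetical and do not reflect the FPGA bridge
API; the sketch only shows that an ops table which is handed to a
function taking a const pointer, and never written afterwards, can
itself be declared const.

```c
#include <stddef.h>

struct demo_bridge_ops {
	int (*enable_set)(int enable);
	int (*enable_show)(void);
};

static int demo_state;

static int demo_enable_set(int enable)
{
	demo_state = !!enable;
	return 0;
}

static int demo_enable_show(void)
{
	return demo_state;
}

/* const: the table is never modified after initialization. */
static const struct demo_bridge_ops demo_ops = {
	.enable_set = demo_enable_set,
	.enable_show = demo_enable_show,
};

static const struct demo_bridge_ops *demo_registered;

/* The registration function only reads the table, so it takes const. */
static int demo_bridge_register(const struct demo_bridge_ops *ops)
{
	if (!ops)
		return -1;
	demo_registered = ops;
	return 0;
}
```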


Re: [PATCH v4 3/3] ARM: dts: exynos: Remove the display-timing and delay from rinato dts

2017-08-28 Thread Krzysztof Kozlowski
On Tue, Aug 29, 2017 at 4:52 AM, Hoegeun Kwon  wrote:
> Hi Krzysztof,
>
> The driver has been merged into exynos-drm-misc.
> Could you please check this patch(3/3).

Hi, OK, no problems for me but it is too late for current cycle so it
will go in for v4.15.

Best regards,
Krzysztof

>
> Best regards,
> Hoegeun
>
>
> On 07/13/2017 11:20 AM, Hoegeun Kwon wrote:
>>
>> The display-timing and delay are included in the panel driver. So it
>> should be removed in dts.
>>
>> Signed-off-by: Hoegeun Kwon 
>> ---
>>   arch/arm/boot/dts/exynos3250-rinato.dts | 22 --
>>   1 file changed, 22 deletions(-)
>>
>> diff --git a/arch/arm/boot/dts/exynos3250-rinato.dts
>> b/arch/arm/boot/dts/exynos3250-rinato.dts
>> index 443e0c9..6b70c8d 100644
>> --- a/arch/arm/boot/dts/exynos3250-rinato.dts
>> +++ b/arch/arm/boot/dts/exynos3250-rinato.dts
>> @@ -242,28 +242,6 @@
>> vci-supply = <&ldo20_reg>;
>> reset-gpios = <&gpe0 1 GPIO_ACTIVE_LOW>;
>> te-gpios = <&gpx0 6 GPIO_ACTIVE_HIGH>;
>> -   power-on-delay= <30>;
>> -   power-off-delay= <120>;
>> -   reset-delay = <5>;
>> -   init-delay = <100>;
>> -   flip-horizontal;
>> -   flip-vertical;
>> -   panel-width-mm = <29>;
>> -   panel-height-mm = <29>;
>> -
>> -   display-timings {
>> -   timing-0 {
>> -   clock-frequency = <460>;
>> -   hactive = <320>;
>> -   vactive = <320>;
>> -   hfront-porch = <1>;
>> -   hback-porch = <1>;
>> -   hsync-len = <1>;
>> -   vfront-porch = <150>;
>> -   vback-porch = <1>;
>> -   vsync-len = <2>;
>> -   };
>> -   };
>> port {
>> dsi_in: endpoint {
>
>


[PATCH 4/4] [media] zr364xx: Fix a typo in a comment line of the file header

2017-08-28 Thread SF Markus Elfring
From: Markus Elfring 
Date: Mon, 28 Aug 2017 22:46:30 +0200

Fix a word in this description.

Signed-off-by: Markus Elfring 
---
 drivers/media/usb/zr364xx/zr364xx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/usb/zr364xx/zr364xx.c 
b/drivers/media/usb/zr364xx/zr364xx.c
index 4cc6d2a9d91f..4ccf71d8b608 100644
--- a/drivers/media/usb/zr364xx/zr364xx.c
+++ b/drivers/media/usb/zr364xx/zr364xx.c
@@ -2,7 +2,7 @@
  * Zoran 364xx based USB webcam module version 0.73
  *
  * Allows you to use your USB webcam with V4L2 applications
- * This is still in heavy developpement !
+ * This is still in heavy development!
  *
  * Copyright (C) 2004  Antoine Jacquet 
  * http://royale.zerezo.com/zr364xx/
-- 
2.14.1



[PATCH 3/4] [media] zr364xx: Adjust ten checks for null pointers

2017-08-28 Thread SF Markus Elfring
From: Markus Elfring 
Date: Mon, 28 Aug 2017 22:40:47 +0200
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The script “checkpatch.pl” pointed information out like the following.

Comparison to NULL could be written !…

Thus fix the affected source code places.
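
A small userspace sketch of the preferred form, with hypothetical demo_*
types standing in for the driver structures:

```c
#include <errno.h>
#include <stddef.h>

struct demo_fmt { const char *name; };
struct demo_cam { const struct demo_fmt *fmt; };

/* "cam->fmt ? ... : ..." instead of "cam->fmt != NULL ? ... : ...". */
static const char *demo_fmt_name(const struct demo_cam *cam)
{
	return cam->fmt ? cam->fmt->name : "";
}

/* "!ptr" instead of "ptr == NULL". */
static int demo_prepare(const struct demo_cam *cam)
{
	if (!cam)
		return -ENODEV;
	if (!cam->fmt)
		return -EINVAL;
	return 0;
}
```

Both spellings compile to the same code; the change is purely about
following the kernel coding style.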

Signed-off-by: Markus Elfring 
---
 drivers/media/usb/zr364xx/zr364xx.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/media/usb/zr364xx/zr364xx.c 
b/drivers/media/usb/zr364xx/zr364xx.c
index 37cd6e20e68a..4cc6d2a9d91f 100644
--- a/drivers/media/usb/zr364xx/zr364xx.c
+++ b/drivers/media/usb/zr364xx/zr364xx.c
@@ -385,9 +385,9 @@ static int buffer_prepare(struct videobuf_queue *vq, struct 
videobuf_buffer *vb,
  vb);
int rc;
 
-   DBG("%s, field=%d, fmt name = %s\n", __func__, field, cam->fmt != NULL ?
-   cam->fmt->name : "");
-   if (cam->fmt == NULL)
+   DBG("%s, field=%d, fmt name = %s\n", __func__, field,
+   cam->fmt ? cam->fmt->name : "");
+   if (!cam->fmt)
return -EINVAL;
 
buf->vb.size = cam->width * cam->height * (cam->fmt->depth >> 3);
@@ -787,7 +787,7 @@ static int zr364xx_vidioc_try_fmt_vid_cap(struct file 
*file, void *priv,
struct zr364xx_camera *cam = video_drvdata(file);
char pixelformat_name[5];
 
-   if (cam == NULL)
+   if (!cam)
return -ENODEV;
 
if (f->fmt.pix.pixelformat != V4L2_PIX_FMT_JPEG) {
@@ -817,7 +817,7 @@ static int zr364xx_vidioc_g_fmt_vid_cap(struct file *file, 
void *priv,
 {
struct zr364xx_camera *cam;
 
-   if (file == NULL)
+   if (!file)
return -ENODEV;
cam = video_drvdata(file);
 
@@ -979,13 +979,13 @@ static void read_pipe_completion(struct urb *purb)
 
pipe_info = purb->context;
_DBG("%s %p, status %d\n", __func__, purb, purb->status);
-   if (pipe_info == NULL) {
+   if (!pipe_info) {
printk(KERN_ERR KBUILD_MODNAME ": no context!\n");
return;
}
 
cam = pipe_info->cam;
-   if (cam == NULL) {
+   if (!cam) {
printk(KERN_ERR KBUILD_MODNAME ": no context!\n");
return;
}
@@ -1069,7 +1069,7 @@ static void zr364xx_stop_readpipe(struct zr364xx_camera 
*cam)
 {
struct zr364xx_pipeinfo *pipe_info;
 
-   if (cam == NULL) {
+   if (!cam) {
printk(KERN_ERR KBUILD_MODNAME ": invalid device\n");
return;
}
@@ -1273,7 +1273,7 @@ static int zr364xx_mmap(struct file *file, struct 
vm_area_struct *vma)
struct zr364xx_camera *cam = video_drvdata(file);
int ret;
 
-   if (cam == NULL) {
+   if (!cam) {
DBG("%s: cam == NULL\n", __func__);
return -ENODEV;
}
@@ -1357,7 +1357,7 @@ static int zr364xx_board_init(struct zr364xx_camera *cam)
 
pipe->transfer_buffer = kzalloc(pipe->transfer_size,
GFP_KERNEL);
-   if (pipe->transfer_buffer == NULL) {
+   if (!pipe->transfer_buffer) {
DBG("out of memory!\n");
return -ENOMEM;
}
@@ -1373,7 +1373,7 @@ static int zr364xx_board_init(struct zr364xx_camera *cam)
DBG("valloc %p, idx %lu, pdata %p\n",
&cam->buffer.frame[i], i,
cam->buffer.frame[i].lpvbits);
-   if (cam->buffer.frame[i].lpvbits == NULL) {
+   if (!cam->buffer.frame[i].lpvbits) {
printk(KERN_INFO KBUILD_MODNAME ": out of memory. Using 
less frames\n");
break;
}
-- 
2.14.1



[PATCH 2/4] [media] zr364xx: Improve a size determination in zr364xx_probe()

2017-08-28 Thread SF Markus Elfring
From: Markus Elfring 
Date: Mon, 28 Aug 2017 22:28:02 +0200

Replace the specification of a data structure by a pointer dereference
as the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer according to the Linux coding style convention.

This issue was detected by using the Coccinelle software.
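
To illustrate why sizeof(*ptr) is considered safer, a minimal userspace
sketch with a hypothetical struct demo_camera, using calloc() in place of
kzalloc():

```c
#include <stdlib.h>

struct demo_camera {
	int width;
	int height;
	char name[32];
};

static struct demo_camera *demo_camera_new(void)
{
	struct demo_camera *cam;

	/*
	 * sizeof(*cam) follows the type of the pointer, so the size stays
	 * correct even if the declaration of cam is changed later.
	 */
	cam = calloc(1, sizeof(*cam));
	return cam;     /* NULL on allocation failure */
}
```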

Signed-off-by: Markus Elfring 
---
 drivers/media/usb/zr364xx/zr364xx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/usb/zr364xx/zr364xx.c 
b/drivers/media/usb/zr364xx/zr364xx.c
index 97af697dcc81..37cd6e20e68a 100644
--- a/drivers/media/usb/zr364xx/zr364xx.c
+++ b/drivers/media/usb/zr364xx/zr364xx.c
@@ -1421,7 +1421,7 @@ static int zr364xx_probe(struct usb_interface *intf,
 le16_to_cpu(udev->descriptor.idVendor),
 le16_to_cpu(udev->descriptor.idProduct));
 
-   cam = kzalloc(sizeof(struct zr364xx_camera), GFP_KERNEL);
+   cam = kzalloc(sizeof(*cam), GFP_KERNEL);
if (!cam)
return -ENOMEM;
 
-- 
2.14.1



Re: [PATCH v8 2/3] PCI: iproc: retry request when CRS returned from EP

2017-08-28 Thread Oza Oza
On Tue, Aug 29, 2017 at 3:17 AM, Bjorn Helgaas  wrote:
> On Thu, Aug 24, 2017 at 10:34:25AM +0530, Oza Pawandeep wrote:
>> PCIe spec r3.1, sec 2.3.2
>> If CRS software visibility is not enabled, the RC must reissue the
>> config request as a new request.
>>
>> - If CRS software visibility is enabled,
>> - for a config read of Vendor ID, the RC must return 0x0001 data
>> - for all other config reads/writes, the RC must reissue the
>>   request
>>
>> iproc PCIe Controller spec:
>> 4.7.3.3. Retry Status On Configuration Cycle
>> Endpoints are allowed to generate retry status on configuration
>> cycles. In this case, the RC needs to re-issue the request. The IP
>> does not handle this because the number of configuration cycles needed
>> will probably be less than the total number of non-posted operations
>> needed.
>>
>> When a retry status is received on the User RX interface for a
>> configuration request that was sent on the User TX interface,
>> it will be indicated with a completion with the CMPL_STATUS field set
>> to 2=CRS, and the user will have to find the address and data values
>> and send a new transaction on the User TX interface.
>> When the internal configuration space returns a retry status during a
>> configuration cycle (user_cscfg = 1) on the Command/Status interface,
>> the pcie_cscrs will assert with the pcie_csack signal to indicate the
>> CRS status.
>> When the CRS Software Visibility Enable register in the Root Control
>> register is enabled, the IP will return the data value to 0x0001 for
>> the Vendor ID value and 0x  (all 1’s) for the rest of the data in
>> the request for reads of offset 0 that return with CRS status.  This
>> is true for both the User RX Interface and for the Command/Status
>> interface.  When CRS Software Visibility is enabled, the CMPL_STATUS
>> field of the completion on the User RX Interface will not be 2=CRS and
>> the pcie_cscrs signal will not assert on the Command/Status interface.
>>
>> Per PCIe r3.1, sec 2.3.2, config requests that receive completions
>> with Configuration Request Retry Status (CRS) should be reissued by
>> the hardware except reads of the Vendor ID when CRS Software
>> Visibility is enabled.
>>
>> This hardware never reissues configuration requests when it receives
>> CRS completions.
>> Note that, neither PCIe host bridge nor PCIe core re-issues the
>> request for any configuration offset.
>>
>> For config reads, this hardware returns CFG_RETRY_STATUS data when
>> it receives a CRS completion for a config read, regardless of the
>> address of the read or the CRS Software Visibility Enable bit.
>>
>> This patch implements iproc_pcie_config_read(), which is called for
>> Stingray. If it receives a CRS completion, it retries the read.
>> On timeout, it returns 0xffffffff.
>> For other iProc-based SoCs, it falls back to the generic PCI APIs.
>>
>> Signed-off-by: Oza Pawandeep 
>>
>> diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c
>> index 61d9be6..37f4adf 100644
>> --- a/drivers/pci/host/pcie-iproc.c
>> +++ b/drivers/pci/host/pcie-iproc.c
>> @@ -68,6 +68,9 @@
>>  #define APB_ERR_EN_SHIFT 0
>>  #define APB_ERR_EN   BIT(APB_ERR_EN_SHIFT)
>>
>> +#define CFG_RETRY_STATUS 0xffff0001
>> +#define CFG_RETRY_STATUS_TIMEOUT_US  500000 /* 500 milli-seconds. */
>> +
>>  /* derive the enum index of the outbound/inbound mapping registers */
>>  #define MAP_REG(base_reg, index)  ((base_reg) + (index) * 2)
>>
>> @@ -473,6 +476,64 @@ static void __iomem *iproc_pcie_map_ep_cfg_reg(struct iproc_pcie *pcie,
>>   return (pcie->base + offset);
>>  }
>>
>> +static unsigned int iproc_pcie_cfg_retry(void __iomem *cfg_data_p)
>> +{
>> + int timeout = CFG_RETRY_STATUS_TIMEOUT_US;
>> + unsigned int data;
>> +
>> + /*
>> +  * As per PCIe spec r3.1, sec 2.3.2, CRS Software
>> +  * Visibility only affects config read of the Vendor ID.
>> +  * For config write or any other config read the Root must
>> +  * automatically re-issue configuration request again as a
>> +  * new request.
>> +  *
>> +  * For config reads, this hardware returns CFG_RETRY_STATUS data when
>> +  * it receives a CRS completion for a config read, regardless of the
>> +  * address of the read or the CRS Software Visibility Enable bit. As a
>> +  * partial workaround for this, we retry in software any read that
>> +  * returns CFG_RETRY_STATUS.
>> +  */
>> + data = readl(cfg_data_p);
>> + while (data == CFG_RETRY_STATUS && timeout--) {
>> + udelay(1);
>> + data = readl(cfg_data_p);
>> + }
>> +
>> + if (data == CFG_RETRY_STATUS)
>> + data = 0xffffffff;
>> +
>> + return data;
>> +}
>> +
>> +static int iproc_pcie_config_read(struct pci_bus *bus, unsigned int devfn,
>> + int where, int size, u32 *val)
>> +{
>> + struct iproc_pcie *pcie = iproc_data(bus);
>> + u

[PATCH 1/4] [media] zr364xx: Delete an error message for a failed memory allocation in two functions

2017-08-28 Thread SF Markus Elfring
From: Markus Elfring 
Date: Mon, 28 Aug 2017 22:23:56 +0200

Omit an extra message for a memory allocation failure in these functions.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/media/usb/zr364xx/zr364xx.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/media/usb/zr364xx/zr364xx.c b/drivers/media/usb/zr364xx/zr364xx.c
index efdcd5bd6a4c..97af697dcc81 100644
--- a/drivers/media/usb/zr364xx/zr364xx.c
+++ b/drivers/media/usb/zr364xx/zr364xx.c
@@ -212,7 +212,5 @@ static int send_control_msg(struct usb_device *udev, u8 request, u16 value,
-   if (!transfer_buffer) {
-   dev_err(&udev->dev, "kmalloc(%d) failed\n", size);
+   if (!transfer_buffer)
return -ENOMEM;
-   }
 
memcpy(transfer_buffer, cp, size);
 
@@ -1427,7 +1425,5 @@ static int zr364xx_probe(struct usb_interface *intf,
-   if (cam == NULL) {
-   dev_err(&udev->dev, "cam: out of memory !\n");
+   if (!cam)
return -ENOMEM;
-   }
 
cam->v4l2_dev.release = zr364xx_release;
err = v4l2_device_register(&intf->dev, &cam->v4l2_dev);
-- 
2.14.1



[PATCH 0/4] [media] zr364xx: Adjustments for some function implementations

2017-08-28 Thread SF Markus Elfring
From: Markus Elfring 
Date: Tue, 29 Aug 2017 07:17:07 +0200

A few update suggestions were taken into account
from static source code analysis.

Markus Elfring (4):
  Delete an error message for a failed memory allocation in two functions
  Improve a size determination in zr364xx_probe()
  Adjust ten checks for null pointers
  Fix a typo in a comment line of the file header

 drivers/media/usb/zr364xx/zr364xx.c | 34 +++---
 1 file changed, 15 insertions(+), 19 deletions(-)

-- 
2.14.1



Re: [PATCH v8 0/3] PCI: iproc: SOC specific fixes

2017-08-28 Thread Oza Oza
On Tue, Aug 29, 2017 at 3:23 AM, Bjorn Helgaas  wrote:
> On Thu, Aug 24, 2017 at 10:34:23AM +0530, Oza Pawandeep wrote:
> >> PCI: iproc: Retry request when CRS returned from EP
> >> The above patch adds support for CRS in the PCI RC driver; if CRS is
> >> not handled at this lower level, user-space PMDs (poll mode drivers)
> >> can time out.
> >>
> >> PCI: iproc: add device shutdown for PCI RC
> >> This fixes the issue where certain PCI endpoints are not detected on
> >> the Stingray SoC after reboot.
>>
>> Changes Since v7:
>> Factor out the ep config access code.
>>
>> Changes Since v6:
>> Rebased patches on top of Lorenzo's patches.
>> Bjorn's comments addressed.
> >> Now the config retry returns 0xffffffff as data.
>> Added reference to PCIe spec and iproc Controller spec in Changelog.
>>
>> Changes Since v5:
>> Ray's comments addressed.
>>
>> Changes Since v4:
>> Bjorn's comments addressed.
>>
>> Changes Since v3:
>> [re-send]
>>
>> Changes Since v2:
>> Fix compilation errors for pcie-iproc-platform.ko which was caught
>> by kbuild.
>>
>> Oza Pawandeep (3):
>>   PCI: iproc: factor-out ep configuration access
>>   PCI: iproc: Retry request when CRS returned from EP
>>   PCI: iproc: add device shutdown for PCI RC
>>
>>  drivers/pci/host/pcie-iproc-platform.c |   8 ++
>>  drivers/pci/host/pcie-iproc.c  | 143 
>> ++---
>>  drivers/pci/host/pcie-iproc.h  |   1 +
>>  3 files changed, 124 insertions(+), 28 deletions(-)
>
> I applied these to pci/host-iproc for v4.14.  Man, this is ugly.
>
> I reworked the changelog to try to make it more readable.  I also tried to
> disable the PCI_EXP_RTCAP_CRSVIS bit, which advertises CRS SV support.  And
> I removed what looked like a duplicate pci_generic_config_read32() call.
> And I added a warning about the fact that we corrupt reads of config
> registers that happen to contain 0xffff0001.
>
> I'm pretty sure I broke something, so please take a look.

Appreciate your time in adding PCI_EXP_RTCAP_CRSVIS and other changes.
I just tested the patch and it works fine, which tells us that the
CRS visibility bit has no effect.

so things look okay to me.
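
For readers following the thread, the software retry being discussed can be
modelled in plain C roughly as below. This is only a userspace sketch, not the
driver code: struct fake_cfg_reg, read_cfg() and the small RETRY_BUDGET are
made up for illustration (the driver uses a time-based limit), and the marker
value 0xffff0001 is an assumption reconstructed from the changelog's
description (Vendor ID 0x0001, all 1's for the rest of the dword).

```c
#include <assert.h>
#include <stdint.h>

#define CFG_RETRY_STATUS 0xffff0001u /* assumed CRS marker, see note above */
#define RETRY_BUDGET     4           /* hypothetical; stands in for the timeout */

/* Hypothetical stand-in for the memory-mapped config data register. */
struct fake_cfg_reg {
	uint32_t value;     /* data returned once the device is ready */
	int crs_reads_left; /* reads that still return the CRS marker */
};

static uint32_t read_cfg(struct fake_cfg_reg *reg)
{
	assert(reg);
	if (reg->crs_reads_left > 0) {
		reg->crs_reads_left--;
		return CFG_RETRY_STATUS;
	}
	return reg->value;
}

/* Retry while the marker is seen; return all 1's once the budget runs out. */
static uint32_t cfg_read_retry(struct fake_cfg_reg *reg)
{
	int budget = RETRY_BUDGET;
	uint32_t data = read_cfg(reg);

	while (data == CFG_RETRY_STATUS && budget--)
		data = read_cfg(reg);

	if (data == CFG_RETRY_STATUS)
		data = 0xffffffffu;

	return data;
}
```

Note the third case a caller can hit: if a register legitimately holds the
marker value, the loop cannot tell it apart from a CRS completion and ends up
returning all 1's, which is the corruption Bjorn's added comment warns about.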

Regards,
Oza.
>
> Incremental diff from your v8 to what's on pci/host-iproc:
>
> diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c
> index cbdabe8a073e..8bd5e544b1c1 100644
> --- a/drivers/pci/host/pcie-iproc.c
> +++ b/drivers/pci/host/pcie-iproc.c
> @@ -69,7 +69,7 @@
>  #define APB_ERR_EN   BIT(APB_ERR_EN_SHIFT)
>
>  #define CFG_RETRY_STATUS 0xffff0001
> -#define CFG_RETRY_STATUS_TIMEOUT_US  500000 /* 500 milli-seconds. */
> +#define CFG_RETRY_STATUS_TIMEOUT_US  500000 /* 500 milliseconds */
>
>  /* derive the enum index of the outbound/inbound mapping registers */
>  #define MAP_REG(base_reg, index)  ((base_reg) + (index) * 2)
> @@ -482,17 +482,21 @@ static unsigned int iproc_pcie_cfg_retry(void __iomem *cfg_data_p)
> unsigned int data;
>
> /*
> -* As per PCIe spec r3.1, sec 2.3.2, CRS Software
> -* Visibility only affects config read of the Vendor ID.
> -* For config write or any other config read the Root must
> -* automatically re-issue configuration request again as a
> -* new request.
> +* As per PCIe spec r3.1, sec 2.3.2, CRS Software Visibility only
> +* affects config reads of the Vendor ID.  For config writes or any
> +* other config reads, the Root may automatically reissue the
> +* configuration request again as a new request.
>  *
> -* For config reads, this hardware returns CFG_RETRY_STATUS data when
> -* it receives a CRS completion for a config read, regardless of the
> -* address of the read or the CRS Software Visibility Enable bit. As a
> +* For config reads, this hardware returns CFG_RETRY_STATUS data
> +* when it receives a CRS completion, regardless of the address of
> +* the read or the CRS Software Visibility Enable bit.  As a
>  * partial workaround for this, we retry in software any read that
>  * returns CFG_RETRY_STATUS.
> +*
> +* Note that a non-Vendor ID config register may have a value of
> +* CFG_RETRY_STATUS.  If we read that, we can't distinguish it from
> +* a CRS completion, so we will incorrectly retry the read and
> +* eventually return the wrong data (0xffffffff).
>  */
> data = readl(cfg_data_p);
> while (data == CFG_RETRY_STATUS && timeout--) {
> @@ -515,10 +519,19 @@ static int iproc_pcie_config_read(struct pci_bus *bus, unsigned int devfn,
> unsigned int busno = bus->number;
> void __iomem *cfg_data_p;
> unsigned int data;
> +   int ret;
>
> -   /* root complex access. */
> -   if (busno == 0)
> -   return pci_generic_config_read32(bus, devfn, where, size, val);
> +   /* root complex access */
> +   if (busno == 0) {
> + 

Re: [Cocci] cocci: remove unnecessary casts of void * while avoiding casts with __user or __force ?

2017-08-28 Thread Julia Lawall


On Mon, 28 Aug 2017, Joe Perches wrote:

> A simple cocci script that removes unnecessary casts of
> a void * will also remove casts with __force or __user

Unfortunately, attributes are currently not supported inside casts.  This
can be done in a hackish way (possible false negatives) as follows:

---

@initialize:ocaml@
@@

let close (p1,p2) =
  let r = (List.hd p1).line_end in
  let l = (List.hd p2).line in
  let rc = (List.hd p1).col_end in
  let lc = (List.hd p2).col in
  r = l && lc = rc+1

@r@
position p1,p2;
expression f,e;
type T;
@@

f(..., // generalize this rule as needed
 (T@p1 *@p2)
 e,...)

@@
position r.p2 : script:ocaml(r.p1) { close(p1,p2) };
position r.p1;
expression e;
type T;
@@

- (T@p1 *@p2)
  e

---

Basically, it assumes that if the type and the * are more than one space
apart then there is something important there, and the cast is not
removed.
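
The adjacency test performed by the OCaml close helper above can be restated
in C as a small sketch, for readers less familiar with OCaml. The struct pos
type below is hypothetical and mirrors only the four position fields the
script reads; it is not a Coccinelle API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the position fields used by the script. */
struct pos {
	int line;     /* line where the token starts */
	int line_end; /* line where the token ends */
	int col;      /* column where the token starts */
	int col_end;  /* column where the token ends */
};

/*
 * True when the '*' starts on the line where the type ends and exactly one
 * column past the type's end column (same comparison as the OCaml helper).
 */
static bool cast_is_close(const struct pos *type_pos, const struct pos *star_pos)
{
	assert(type_pos && star_pos);
	return type_pos->line_end == star_pos->line &&
	       star_pos->col == type_pos->col_end + 1;
}
```

A cast where something such as __user sits between the type and the '*'
pushes the star further right, so the test fails and the cast is kept.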

julia


Re: [PATCH v3 1/3] mfd: Add support for Cherry Trail Dollar Cove TI PMIC

2017-08-28 Thread Takashi Iwai
On Tue, 29 Aug 2017 00:31:15 +0200,
Rafael J. Wysocki wrote:
> 
> On Fri, Aug 25, 2017 at 3:44 PM, Takashi Iwai  wrote:
> > This patch adds the MFD driver for Dollar Cove (TI version) PMIC with
> > ACPI INT33F5 that is found on some Intel Cherry Trail devices.
> > The driver is based on the original work by Intel, found at:
> >   https://github.com/01org/ProductionKernelQuilts
> >
> > This is a minimal version for adding the basic resources.  Currently,
> > only ACPI PMIC opregion and the external power-button are used.
> >
> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=193891
> > Reviewed-by: Mika Westerberg 
> > Reviewed-by: Andy Shevchenko 
> > Signed-off-by: Takashi Iwai 
> 
> I need an ACK from Lee on this one.

Yeah, the MFD patch is a prerequisite for patches 2 and 3, of course...

Lee, could you review patch 1?


thanks,

Takashi


Re: [PATCH] lsm_audit: use get_task_comm

2017-08-28 Thread Richard Guy Briggs
On 2017-08-28 17:54, Paul Moore wrote:
> On Mon, Aug 28, 2017 at 9:58 AM, Geliang Tang  wrote:
> > get_task_comm() copies the task's comm under the task_lock; it's safer
> > than directly using memcpy().
> >
> > Signed-off-by: Geliang Tang 
> > ---
> >  security/lsm_audit.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/security/lsm_audit.c b/security/lsm_audit.c
> > index 28d4c3a..555b1c4 100644
> > --- a/security/lsm_audit.c
> > +++ b/security/lsm_audit.c
> > @@ -221,7 +221,7 @@ static void dump_common_audit_data(struct audit_buffer *ab,
> > BUILD_BUG_ON(sizeof(a->u) > sizeof(void *)*2);
> >
> > audit_log_format(ab, " pid=%d comm=", task_tgid_nr(current));
> > -   audit_log_untrustedstring(ab, memcpy(comm, current->comm, sizeof(comm)));
> > +   audit_log_untrustedstring(ab, get_task_comm(comm, current));
> >
> > switch (a->type) {
> > case LSM_AUDIT_DATA_NONE:
> > @@ -312,7 +312,7 @@ static void dump_common_audit_data(struct audit_buffer *ab,
> > char comm[sizeof(tsk->comm)];
> > audit_log_format(ab, " opid=%d ocomm=", pid);
> > audit_log_untrustedstring(ab,
> > -   memcpy(comm, tsk->comm, sizeof(comm)));
> > +   get_task_comm(comm, tsk));
> 
> [NOTE: adding the linux-audit mailing list to this thread]

There was previously pushback about using get_task_comm() with its
locking, which is why in this particular location, a memcpy was chosen
instead.

This was done in:
5deeb5cece3f9b30c8129786726b9d02c412c8ca rgb 2015-04-14
("lsm: copy comm before calling audit_log to avoid race in string printing")

From that commit:
Using get_task_comm() to get a copy while acquiring the task_lock to prevent
this and to prevent the result from being a mixture of old and new values of
comm would incur potentially unacceptable overhead, considering that the value
can be influenced by userspace and therefore untrusted anyways.
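
As a rough userspace illustration of the lockless approach that commit
describes (not the kernel code; snapshot_comm() is a made-up name), the idea
is to copy the whole fixed-size buffer once and then work only on the copy:

```c
#include <assert.h>
#include <string.h>

#define COMM_LEN 16 /* same size as the kernel's TASK_COMM_LEN */

/*
 * Lockless snapshot: copy the whole buffer in one go and force termination,
 * so that later length checks and printing all see the same bytes even if
 * the source is rewritten afterwards.
 */
static const char *snapshot_comm(char dst[COMM_LEN], const char src[COMM_LEN])
{
	assert(dst && src);
	memcpy(dst, src, COMM_LEN);
	dst[COMM_LEN - 1] = '\0';
	return dst;
}
```

Without a lock the snapshot can still be a mix of old and new bytes if a
writer races with the memcpy() itself; the commit accepts that because the
value is treated as untrusted anyway.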

> This isn't strictly a problem with this patch, but I think we should
> be able to get rid of the 'comm' variable in this if-block and simply
> reuse the 'comm' from the top of the function.  It would be nice to
> include that in this patch.
> 
> Other than that minor nit, this patch looks good to me; if you make
> that small change I'll merge it into the audit/next branch for the
> upcoming merge window.

So, I'd offer a NACK here.

> paul moore

- RGB

--
Richard Guy Briggs 
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635


Re: [PATCH 0/4] irda: move it to drivers/staging so we can delete it

2017-08-28 Thread Greg KH
On Mon, Aug 28, 2017 at 04:46:07PM -0700, Joe Perches wrote:
> On Mon, 2017-08-28 at 16:42 -0700, David Miller wrote:
> > From: Greg Kroah-Hartman 
> > Date: Sun, 27 Aug 2017 17:03:30 +0200
> > 
> > > The IRDA code has long been obsolete and broken.  So, to keep people
> > > from trying to use it, and to prevent people from having to maintain it,
> > > let's move it to drivers/staging/ so that we can delete it entirely from
> > > the kernel in a few releases.
> > 
> > No objection, I'll apply this to net-next, thanks Greg.
> 
> Still needs an update to MAINTAINERS.

Oops, forgot those directories, will send a follow-on patch for that.

greg k-h


Re: [PATCH 4.12 00/99] 4.12.10-stable review

2017-08-28 Thread Greg Kroah-Hartman
On Mon, Aug 28, 2017 at 01:40:29PM -0600, Shuah Khan wrote:
> On 08/28/2017 02:03 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.12.10 release.
> > There are 99 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Wed Aug 30 08:04:17 UTC 2017.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.12.10-rc1.gz
> > or in the git tree and branch at:
> >   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.12.y
> > and the diffstat can be found below.
> > 
> > thanks,
> > 
> > greg k-h
> > 
> 
> Compiled and booted on my test system. No dmesg regressions.

Thanks for testing all of these and letting me know.

greg k-h


Re: [PATCH 4.12 00/99] 4.12.10-stable review

2017-08-28 Thread Greg Kroah-Hartman
On Mon, Aug 28, 2017 at 05:11:03PM -0700, Guenter Roeck wrote:
> On 08/28/2017 01:03 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.12.10 release.
> > There are 99 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Wed Aug 30 08:04:17 UTC 2017.
> > Anything received after that time might be too late.
> > 
> 
> 
> Build results:
>   total: 145 pass: 145 fail: 0
> Qemu test results:
>   total: 122 pass: 122 fail: 0
> 
> Details are available at http://kerneltests.org/builders.

Great, thanks for testing all of these and letting me know.

greg k-h


Re: [PATCH] Revert "xhci: Limit USB2 port wake support for AMD Promontory hosts"

2017-08-28 Thread Kai-Heng Feng
On Mon, Aug 28, 2017 at 6:14 PM, Mathias Nyman
 wrote:
> On 28.08.2017 12:29, Greg KH wrote:
>>
>> On Tue, Aug 22, 2017 at 05:14:47PM +0800, Kai-Heng Feng wrote:
>>>
>>> This reverts commit dec08194ffeccfa1cf085906b53d301930eae18f.
>>>
>>> Commit dec08194ffec ("xhci: Limit USB2 port wake support for AMD
>>> Promontory hosts") makes all high speed USB ports on ASUS PRIME
>>> B350M-A cease to function after enabling runtime PM.
>>>
>>> All boards with this chipset will be affected, so revert the commit.
>>>
>>> Conflicts:
>>> drivers/usb/host/xhci-pci.c
>>> drivers/usb/host/xhci.h
>>
>>
>> Why are these "Conflicts:" lines here, you did fix up the issues, so
>> there shouldn't be any more conflicts.
>>
>> And if you revert this, don't we still have the original problem here?
>>
>
> Adding more people who were involved in the original patch.
>
> Users are now seeing unresponsive USB2 ports with Promontory hosts.
> Is there any update on a better way to solve the original issue?
>
> To me a "dead" USB2 port seems like a much worse issue for a user
> than a BIOS-disabled port waking up on plug/unplug (wake on
> connect/disconnect), so I'm in favor of doing this revert.

At least I can't find "Disable USB2" on my ASUS PRIME B350M-A, so the
new behavior is quite surprising.

>
> But there was a strong push from Promontory developers to get the
> original fix in, and I would like to get some comment from them before
> I do anything about it.

You looped them into the mail thread where I reported the regression two
weeks ago, and there has been no response since then...

>
> Thanks
> -Mathias
>


[PATCH RESEND 2/2] dmaengine: sun6i: support V3s SoC variant

2017-08-28 Thread Icenowy Zheng
From: Icenowy Zheng 

Allwinner V3s has a DMA engine similar to the ones from A31, but with
fewer channels and DRQs.

Add support for it.

Signed-off-by: Icenowy Zheng 
Acked-by: Chen-Yu Tsai 
Acked-by: Rob Herring 
---
 Documentation/devicetree/bindings/dma/sun6i-dma.txt |  1 +
 drivers/dma/sun6i-dma.c | 13 +
 2 files changed, 14 insertions(+)

diff --git a/Documentation/devicetree/bindings/dma/sun6i-dma.txt b/Documentation/devicetree/bindings/dma/sun6i-dma.txt
index 6b267045f522..98fbe1a5c6dd 100644
--- a/Documentation/devicetree/bindings/dma/sun6i-dma.txt
+++ b/Documentation/devicetree/bindings/dma/sun6i-dma.txt
@@ -9,6 +9,7 @@ Required properties:
  "allwinner,sun8i-a23-dma"
  "allwinner,sun8i-a83t-dma"
  "allwinner,sun8i-h3-dma"
+ "allwinner,sun8i-v3s-dma"
 - reg: Should contain the registers base address and length
 - interrupts:  Should contain a reference to the interrupt used by this device
 - clocks:  Should contain a reference to the parent AHB clock
diff --git a/drivers/dma/sun6i-dma.c b/drivers/dma/sun6i-dma.c
index 252b59c1d1d5..bcd496edc70f 100644
--- a/drivers/dma/sun6i-dma.c
+++ b/drivers/dma/sun6i-dma.c
@@ -1040,11 +1040,24 @@ static struct sun6i_dma_config sun8i_h3_dma_cfg = {
.nr_max_vchans   = 34,
 };
 
+/*
+ * The V3s have only 8 physical channels, a maximum DRQ port id of 23,
+ * and a total of 24 usable source and destination endpoints.
+ */
+
+static struct sun6i_dma_config sun8i_v3s_dma_cfg = {
+   .nr_max_channels = 8,
+   .nr_max_requests = 23,
+   .nr_max_vchans   = 24,
+   .gate_needed = true,
+};
+
 static const struct of_device_id sun6i_dma_match[] = {
{ .compatible = "allwinner,sun6i-a31-dma", .data = &sun6i_a31_dma_cfg },
{ .compatible = "allwinner,sun8i-a23-dma", .data = &sun8i_a23_dma_cfg },
{ .compatible = "allwinner,sun8i-a83t-dma", .data = &sun8i_a83t_dma_cfg },
{ .compatible = "allwinner,sun8i-h3-dma", .data = &sun8i_h3_dma_cfg },
+   { .compatible = "allwinner,sun8i-v3s-dma", .data = &sun8i_v3s_dma_cfg },
{ /* sentinel */ }
 };
 MODULE_DEVICE_TABLE(of, sun6i_dma_match);
-- 
2.13.5



[PATCH RESEND 1/2] dmaengine: sun6i: make gate bit in sun8i's DMA engines a common quirk

2017-08-28 Thread Icenowy Zheng
From: Icenowy Zheng 

Originally we enable a special gate bit when the compatible indicates
A23/33.

But according to BSP sources and user manuals, more SoCs will need this
gate bit.

So make it a common quirk configured in the config struct.

Signed-off-by: Icenowy Zheng 
Reviewed-by: Chen-Yu Tsai 
---
 drivers/dma/sun6i-dma.c | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/dma/sun6i-dma.c b/drivers/dma/sun6i-dma.c
index a2358780ab2c..252b59c1d1d5 100644
--- a/drivers/dma/sun6i-dma.c
+++ b/drivers/dma/sun6i-dma.c
@@ -101,6 +101,17 @@ struct sun6i_dma_config {
u32 nr_max_channels;
u32 nr_max_requests;
u32 nr_max_vchans;
+   /*
+* In the datasheets/user manuals of newer Allwinner SoCs, a special
+* bit (bit 2 at register 0x20) is present.
+* It's named "DMA MCLK interface circuit auto gating bit" in the
+* documents, and the footnote of this register says that this bit
+* should be set up when initializing the DMA controller.
+* Allwinner A23/A33 user manuals do not have this bit documented,
+* however these SoCs really have and need this bit, as seen in the
+* BSP kernel source code.
+*/
+   bool gate_needed;
 };
 
 /*
@@ -1009,6 +1020,7 @@ static struct sun6i_dma_config sun8i_a23_dma_cfg = {
.nr_max_channels = 8,
.nr_max_requests = 24,
.nr_max_vchans   = 37,
+   .gate_needed = true,
 };
 
 static struct sun6i_dma_config sun8i_a83t_dma_cfg = {
@@ -1174,13 +1186,7 @@ static int sun6i_dma_probe(struct platform_device *pdev)
goto err_dma_unregister;
}
 
-   /*
-* sun8i variant requires us to toggle a dma gating register,
-* as seen in Allwinner's SDK. This register is not documented
-* in the A23 user manual.
-*/
-   if (of_device_is_compatible(pdev->dev.of_node,
-   "allwinner,sun8i-a23-dma"))
+   if (sdc->cfg->gate_needed)
writel(SUN8I_DMA_GATE_ENABLE, sdc->base + SUN8I_DMA_GATE);
 
return 0;
-- 
2.13.5



[PATCH RESEND 0/2] Allwinner V3s DMA support

2017-08-28 Thread Icenowy Zheng
This is a dedicated patchset of Allwinner V3s DMA support, which used
to be part of the audio codec support patchset.

It's a derivation of the DMA part of v3 of the codec patchset.

Icenowy Zheng (2):
  dmaengine: sun6i: make gate bit in sun8i's DMA engines a common quirk
  dmaengine: sun6i: support V3s SoC variant

 .../devicetree/bindings/dma/sun6i-dma.txt  |  1 +
 drivers/dma/sun6i-dma.c| 33 +-
 2 files changed, 27 insertions(+), 7 deletions(-)

-- 
2.13.5



Re: [PATCH v2 15/30] xfs: Define usercopy region in xfs_inode slab cache

2017-08-28 Thread Darrick J. Wong
On Mon, Aug 28, 2017 at 02:57:14PM -0700, Kees Cook wrote:
> On Mon, Aug 28, 2017 at 2:49 PM, Darrick J. Wong
>  wrote:
> > On Mon, Aug 28, 2017 at 02:34:56PM -0700, Kees Cook wrote:
> >> From: David Windsor 
> >>
> >> The XFS inline inode data, stored in struct xfs_inode_t field
> >> i_df.if_u2.if_inline_data and therefore contained in the xfs_inode slab
> >> cache, needs to be copied to/from userspace.
> >>
> >> cache object allocation:
> >> fs/xfs/xfs_icache.c:
> >> xfs_inode_alloc(...):
> >> ...
> >> ip = kmem_zone_alloc(xfs_inode_zone, KM_SLEEP);
> >>
> >> fs/xfs/libxfs/xfs_inode_fork.c:
> >> xfs_init_local_fork(...):
> >> ...
> >> if (mem_size <= sizeof(ifp->if_u2.if_inline_data))
> >> ifp->if_u1.if_data = ifp->if_u2.if_inline_data;
> >
> > Hmm, what happens when mem_size > sizeof(if_inline_data)?  A slab object
> > will be allocated for ifp->if_u1.if_data which can then be used for
> > readlink in the same manner as the example usage trace below.  Does
> > that allocated object have a need for a usercopy annotation like
> > the one we're adding for if_inline_data?  Or is that already covered
> > elsewhere?
> 
> Yeah, the xfs helper kmem_alloc() is used in the other case, which
> ultimately boils down to a call to kmalloc(), which is entirely
> whitelisted by an earlier patch in the series:
> 
> https://lkml.org/lkml/2017/8/28/1026

Ah.  It would've been helpful to have the first three patches cc'd to
the xfs list.  So basically this series establishes the ability to set
regions within a slab object into which copy_to_user can copy memory
contents, and vice versa.  Have you seen any runtime performance impact?
The overhead looks like it ought to be minimal.
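
To check my understanding, the per-copy test presumably boils down to a
bounds check along these lines. This is a userspace sketch with made-up names
(struct usercopy_window, copy_within_window()), not code from the series:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the part of a slab object that may be copied. */
struct usercopy_window {
	size_t offset; /* start of the window within the object */
	size_t size;   /* length of the window in bytes */
};

/* True when [copy_off, copy_off + copy_len) lies entirely inside the window. */
static bool copy_within_window(const struct usercopy_window *w,
			       size_t copy_off, size_t copy_len)
{
	size_t rel;

	assert(w);
	if (copy_off < w->offset)
		return false;

	/* Work with the offset relative to the window to avoid overflow. */
	rel = copy_off - w->offset;
	if (rel > w->size)
		return false;

	return copy_len <= w->size - rel;
}
```

If that is all it amounts to, a couple of comparisons per copy would fit with
the overhead being small.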

> (It's possible that at some future time we can start segregating
> kernel-only kmallocs from usercopy-able kmallocs, but for now, there
> are no plans for this.)

A pity.  It would be interesting to create no-usercopy versions of the
kmalloc-* slabs and see how much of XFS' memory consumption never
touches userspace buffers. :)

--D

> 
> -Kees
> 
> -- 
> Kees Cook
> Pixel Security


Re: [RFC 2/3] sched/fair: use util_est in LB

2017-08-28 Thread Pavan Kondeti
On Fri, Aug 25, 2017 at 3:50 PM, Patrick Bellasi
 wrote:
> When the scheduler looks at the CPU utilization, the current PELT value
> for a CPU is returned straight away. In certain scenarios this can have
> undesired side effects on task placement.
>



> +/**
> + * cpu_util_est: estimated utilization for the specified CPU
> + * @cpu: the CPU to get the estimated utilization for
> + *
> + * The estimated utilization of a CPU is defined to be the maximum between its
> + * PELT's utilization and the sum of the estimated utilization of the tasks
> + * currently RUNNABLE on that CPU.
> + *
> + * This allows to properly represent the expected utilization of a CPU which
> + * has just got a big task running since a long sleep period. At the same time
> + * however it preserves the benefits of the "blocked load" in describing the
> + * potential for other tasks waking up on the same CPU.
> + *
> + * Return: the estimated utilization for the specified CPU
> + */
> +static inline unsigned long cpu_util_est(int cpu)
> +{
> +   struct sched_avg *sa = &cpu_rq(cpu)->cfs.avg;
> +   unsigned long util = cpu_util(cpu);
> +
> +   if (!sched_feat(UTIL_EST))
> +   return util;
> +
> +   return max(util, util_est(sa, UTIL_EST_LAST));
> +}
> +
>  static inline int task_util(struct task_struct *p)
>  {
> return p->se.avg.util_avg;
> @@ -6007,11 +6033,19 @@ static int cpu_util_wake(int cpu, struct task_struct *p)
>
> /* Task has no contribution or is new */
> if (cpu != task_cpu(p) || !p->se.avg.last_update_time)
> -   return cpu_util(cpu);
> +   return cpu_util_est(cpu);
>
> capacity = capacity_orig_of(cpu);
> util = max_t(long, cpu_rq(cpu)->cfs.avg.util_avg - task_util(p), 0);
>
> +   /*
> +* Estimated utilization tracks only tasks already enqueued, but still
> +* sometimes can return a bigger value than PELT, for example when the
> +* blocked load is negligible wrt the estimated utilization of the
> +* already enqueued tasks.
> +*/
> +   util = max_t(long, util, cpu_util_est(cpu));
> +

We are supposed to discount the task's util from its CPU. But the
cpu_util_est() can potentially return cpu_util() which includes the
task's utilization.
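
A small numeric model may make the concern concrete. This is a userspace
sketch with invented numbers and helper names; it ignores capacity clamping
and reduces cpu_util_est() to the max() in the quoted patch:

```c
#include <assert.h>

static long max_long(long a, long b)
{
	return a > b ? a : b;
}

/* Simplified cpu_util_est(): max of PELT utilization and the runnable estimate. */
static long cpu_util_est_model(long cpu_util, long runnable_est)
{
	return max_long(cpu_util, runnable_est);
}

/* Intended result: CPU utilization with the waking task's share removed. */
static long util_without_task(long cpu_util, long task_util)
{
	assert(task_util >= 0);
	return max_long(cpu_util - task_util, 0);
}

/* Result of the quoted hunk: the discounted value is maxed with cpu_util_est(). */
static long util_wake_as_posted(long cpu_util, long task_util, long runnable_est)
{
	long util = util_without_task(cpu_util, task_util);

	return max_long(util, cpu_util_est_model(cpu_util, runnable_est));
}
```

With a CPU utilization of 600, a task contribution of 400 and nothing else
enqueued, util_without_task() gives 200 but util_wake_as_posted() gives 600,
so the task's contribution is counted again.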

Thanks,
Pavan

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project


Re: [PATCH net-next] hinic: don't build the module by default

2017-08-28 Thread David Miller
From: Vitaly Kuznetsov 
Date: Mon, 28 Aug 2017 15:16:05 +0200

> We probably don't want to enable code supporting particular hardware by
> default e.g. when someone does 'make defconfig'. Other ethernet modules
> don't do it.
> 
> Signed-off-by: Vitaly Kuznetsov 

Applied, thanks.


Re: [PATCH net-next v2 00/10] net: dsa: add generic debugfs interface

2017-08-28 Thread David Miller
From: Vivien Didelot 
Date: Mon, 28 Aug 2017 15:17:38 -0400

> This patch series adds a generic debugfs interface for the DSA
> framework, so that all switch devices benefit from it, e.g. Marvell,
> Broadcom, Microchip or any other DSA driver.

I've been thinking this over and I agree with the feedback given that
debugfs really isn't appropriate for this.

Please create a DSA device class, and hang these values under
appropriate sysfs device nodes that can be easily found via
/sys/class/dsa/ just as easily as they would be /sys/kernel/debug/dsa/

You really intend these values to be consistent across DSA devices,
and you don't intend to go willy-nilly changing these exported values
arbitrarily over time.  That's what debugfs is for, throw-away
stuff.

So please make these proper device sysfs attributes rather than
debugfs.

Thank you.


Re: [RFC 1/3] sched/fair: add util_est on top of PELT

2017-08-28 Thread Pavan Kondeti
Hi Patrick,

On Fri, Aug 25, 2017 at 3:50 PM, Patrick Bellasi
 wrote:
> The util_avg signal computed by PELT is too variable for some use-cases.
> For example, a big task waking up after a long sleep period will have its
> utilization almost completely decayed. This introduces some latency before
> schedutil will be able to pick the best frequency to run a task.
>
> The same issue can affect task placement. Indeed, since the task
> utilization is already decayed at wakeup, when the task is enqueued in a
> CPU, this can result in a CPU running a big task as being temporarily
> represented as being almost empty. This leads to a race condition where
> other tasks can be potentially allocated on a CPU which just started to run
> a big task which slept for a relatively long period.
>
> Moreover, the utilization of a task is, by PELT definition, a continuously
> changing metric. This makes some decisions based on the value of PELT's
> utilization almost instantly outdated.
>
> For all these reasons, a more stable signal could probably do a better job
> of representing the expected/estimated utilization of a SE/RQ. Such a
> signal can be easily created on top of PELT by still using it as an
> estimator which produces values to be aggregated once meaningful events
> happen.
>
> This patch adds a simple implementation of util_est, a new signal built on
> top of PELT's util_avg where:
>
> util_est(se) = max(se::util_avg, f(se::util_avg@dequeue_times))
>

I don't see any wrapper function in this patch that implements this
signal. You want to use this signal in the task placement path as a
replacement of task_util(), right?

Thanks,
Pavan

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project


Re: iov_iter_pipe warning.

2017-08-28 Thread Darrick J. Wong
On Mon, Aug 28, 2017 at 04:31:30PM -0400, Dave Jones wrote:
> On Mon, Aug 07, 2017 at 04:18:18PM -0400, Dave Jones wrote:
>  > On Fri, Apr 28, 2017 at 06:20:25PM +0100, Al Viro wrote:
>  >  > On Fri, Apr 28, 2017 at 12:50:24PM -0400, Dave Jones wrote:
>  >  > > currently running v4.11-rc8-75-gf83246089ca0
>  >  > > 
>  >  > > sunrpc bit is for the other unrelated problem I'm chasing.
>  >  > > 
>  >  > > note also, I saw the backtrace without the fs/splice.c changes.
>  >  > 
>  >  > Interesting...  Could you add this and see if that triggers?
>  >  > 
>  >  > diff --git a/fs/splice.c b/fs/splice.c
>  >  > index 540c4a44756c..12a12d9c313f 100644
>  >  > --- a/fs/splice.c
>  >  > +++ b/fs/splice.c
>  >  > @@ -306,6 +306,9 @@ ssize_t generic_file_splice_read(struct file *in, loff_t *ppos,
>  >  > kiocb.ki_pos = *ppos;
>  >  > ret = call_read_iter(in, &kiocb, &to);
>  >  > if (ret > 0) {
>  >  > +   if (WARN_ON(iov_iter_count(&to) != len - ret))
>  >  > +   printk(KERN_ERR "ops %p: was %zd, left %zd, returned %d\n",
>  >  > +   in->f_op, len, iov_iter_count(&to), ret);
>  >  > *ppos = kiocb.ki_pos;
>  >  > file_accessed(in);
>  >  > } else if (ret < 0) {
>  > 
>  > Hey Al,
>  >  Due to a git stash screw up on my part, I've had this leftover WARN_ON
>  > in my tree for the last couple months. (That screw-up might turn out to be
>  > serendipitous if this is a real bug..)
>  > 
>  > Today I decided to change things up and beat up on xfs for a change, and
>  > was able to trigger this again.
>  > 
>  > Is this check no longer valid, or am I triggering the same bug we chased
>  > down in nfs, but now in xfs ?  (None of the other detritus from that debugging
>  > back in April made it, just those three lines above).
> 
> Revisiting this. I went back and dug out some of the other debug diffs [1]
> from that old thread.
> 
> I can easily trigger this spew on xfs.
> 
> 
> WARNING: CPU: 1 PID: 2251 at fs/splice.c:292 test_it+0xd4/0x1d0
> CPU: 1 PID: 2251 Comm: trinity-c42 Not tainted 4.13.0-rc7-think+ #1 
> task: 880459173a40 task.stack: 88044f7d
> RIP: 0010:test_it+0xd4/0x1d0
> RSP: 0018:88044f7d7878 EFLAGS: 00010283
> RAX:  RBX: 88044f44b968 RCX: 81511ea0
> RDX: 0003 RSI: dc00 RDI: 88044f44ba68
> RBP: 88044f7d78c8 R08: 88046b218ec0 R09: 
> R10: 88044f7d7518 R11:  R12: 1000
> R13: 0001 R14:  R15: 0001
> FS:  7fdbc09b2700() GS:88046b20() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2:  CR3: 000459e1d000 CR4: 001406e0
> Call Trace:
>  generic_file_splice_read+0x414/0x4e0
>  ? opipe_prep.part.14+0x180/0x180
>  ? lockdep_init_map+0xb2/0x2b0
>  ? rw_verify_area+0x65/0x150
>  do_splice_to+0xab/0xc0
>  splice_direct_to_actor+0x1f5/0x540
>  ? generic_pipe_buf_nosteal+0x10/0x10
>  ? do_splice_to+0xc0/0xc0
>  ? rw_verify_area+0x9d/0x150
>  do_splice_direct+0x1b9/0x230
>  ? splice_direct_to_actor+0x540/0x540
>  ? __sb_start_write+0x164/0x1c0
>  ? do_sendfile+0x7b3/0x840
>  do_sendfile+0x428/0x840
>  ? do_compat_pwritev64+0xb0/0xb0
>  ? __might_sleep+0x72/0xe0
>  ? kasan_check_write+0x14/0x20
>  SyS_sendfile64+0xa4/0x120
>  ? SyS_sendfile+0x150/0x150
>  ? mark_held_locks+0x23/0xb0
>  ? do_syscall_64+0xc0/0x3e0
>  ? SyS_sendfile+0x150/0x150
>  do_syscall_64+0x1bc/0x3e0
>  ? syscall_return_slowpath+0x240/0x240
>  ? mark_held_locks+0x23/0xb0
>  ? return_from_SYSCALL_64+0x2d/0x7a
>  ? trace_hardirqs_on_caller+0x182/0x260
>  ? trace_hardirqs_on_thunk+0x1a/0x1c
>  entry_SYSCALL64_slow_path+0x25/0x25
> RIP: 0033:0x7fdbc02dd219
> RSP: 002b:7ffc5024fa48 EFLAGS: 0246
>  ORIG_RAX: 0028
> RAX: ffda RBX: 0028 RCX: 7fdbc02dd219
> RDX: 7fdbbe348000 RSI: 0011 RDI: 0015
> RBP: 7ffc5024faf0 R08: 006d R09: 0094e82f2c730a50
> R10: 1000 R11: 0246 R12: 0002
> R13: 7fdbc0885058 R14: 7fdbc09b2698 R15: 7fdbc0885000
> ---[ end trace a5847ef0f7be7e20 ]---
> asked to read 4096, claims to have read 1
> actual size of data in pipe 4096 
> [0:4096]
> f_op: a058c920, f_flags: 49154, pos: 0/1, size: 0
> 
> 
> I'm still trying to narrow down an exact reproducer, but it seems having
> trinity do a combination of sendfile & writev, with pipes and regular
> files as fd's is the best repro.
> 
> Is this a real problem, or am I chasing ghosts ?  That it doesn't happen
> on ext4 or btrfs is making me wonder...

 I haven't heard of any problems w/ directio xfs lately, but OTOH
I think it's the only filesystem that uses iomap_dio_rw, which would
explain why ext4/btrfs don't have this problem.

Granted that's idle speculation; is t

Re: [PATCH] be2net: Fix some u16 fields appropriately

2017-08-28 Thread David Miller
From: 严海双 
Date: Tue, 29 Aug 2017 09:04:57 +0800

> The GET_TX_COMPL_BITS comes from amap_get which also returns a 32-bit value:

It never returns a value with more than 16-bits of significance for
this specific call.

Please stop trying to be semantically clever when arguing about this
change.

It's not about types, it's about what range of values the struct
member can actually hold.
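
To make the range argument concrete, here is a minimal sketch in plain C.
The helpers get_bits() and get_field16() are hypothetical stand-ins, not
the be2net amap_get code; the point is only that an accessor may return a
32-bit type while a 16-bit wide field can never yield a value above 0xffff,
so storing it in a u16 member loses nothing.

```c
#include <stdint.h>

/*
 * Hypothetical accessor: returns a 32-bit type, like many bit-field
 * helpers do. width is assumed to be in the range 1..31.
 */
static uint32_t get_bits(uint32_t word, unsigned int shift, unsigned int width)
{
	return (word >> shift) & ((1u << width) - 1u);
}

/*
 * Hypothetical 16-bit wide field starting at bit 0. The mask limits the
 * result to 0..0xffff, so the narrowing cast cannot drop set bits.
 */
static uint16_t get_field16(uint32_t word)
{
	return (uint16_t)get_bits(word, 0, 16);
}
```

The return type of the accessor says nothing about the values it can
produce for a given field; the field width does.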


linux-next: manual merge of the md tree with the block tree

2017-08-28 Thread Stephen Rothwell
Hi Shaohua,

Today's linux-next merge of the md tree got a conflict in:

  drivers/md/raid5-ppl.c

between commit:

  74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index")

from the block tree and commit:

  ddc088238cd6 ("md: Runtime support for multiple ppls")

from the md tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell
diff --cc drivers/md/raid5-ppl.c
index 1e237c40d6fa,a98ef172f8e8..
--- a/drivers/md/raid5-ppl.c
+++ b/drivers/md/raid5-ppl.c
@@@ -451,12 -456,25 +456,25 @@@ static void ppl_submit_iounit(struct pp
pplhdr->entries_count = cpu_to_le32(io->entries_count);
pplhdr->checksum = cpu_to_le32(~crc32c_le(~0, pplhdr, PPL_HEADER_SIZE));
  
+   /* Rewind the buffer if current PPL is larger than remaining space */
+   if (log->use_multippl &&
+   log->rdev->ppl.sector + log->rdev->ppl.size - log->next_io_sector <
+   (PPL_HEADER_SIZE + io->pp_size) >> 9)
+   log->next_io_sector = log->rdev->ppl.sector;
+ 
+ 
bio->bi_end_io = ppl_log_endio;
bio->bi_opf = REQ_OP_WRITE | REQ_FUA;
 -  bio->bi_bdev = log->rdev->bdev;
 +  bio_set_dev(bio, log->rdev->bdev);
-   bio->bi_iter.bi_sector = log->rdev->ppl.sector;
+   bio->bi_iter.bi_sector = log->next_io_sector;
bio_add_page(bio, io->header_page, PAGE_SIZE, 0);
  
+   pr_debug("%s: log->current_io_sector: %llu\n", __func__,
+   (unsigned long long)log->next_io_sector);
+ 
+   if (log->use_multippl)
+   log->next_io_sector += (PPL_HEADER_SIZE + io->pp_size) >> 9;
+ 
list_for_each_entry(sh, &io->stripe_list, log_list) {
/* entries for full stripe writes have no partial parity */
if (test_bit(STRIPE_FULL_WRITE, &sh->state))
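
For readers following the resolution, the rewind logic in the hunk above
can be read as a small wrap-around allocator over the PPL area. The sketch
below is an illustration only, under stated assumptions: struct ppl_area
and ppl_place() are hypothetical names, sizes are already in sectors (the
patch converts bytes with ">> 9"), and the multi-PPL mode is assumed to be
always on.

```c
#include <stdint.h>

/* Hypothetical, simplified view of the PPL area on one member disk. */
struct ppl_area {
	uint64_t start;	/* first sector of the area */
	uint64_t size;	/* length of the area in sectors */
	uint64_t next;	/* sector where the next entry will be written */
};

/*
 * Pick the sector for an entry of entry_sectors sectors. If the entry
 * does not fit in what is left of the area, rewind to the start first.
 * Returns the sector the entry is written to and advances the cursor.
 */
static uint64_t ppl_place(struct ppl_area *a, uint64_t entry_sectors)
{
	uint64_t at;

	if (a->start + a->size - a->next < entry_sectors)
		a->next = a->start;

	at = a->next;
	a->next += entry_sectors;
	return at;
}
```

With a 10-sector area starting at sector 100, three 4-sector entries land
at 100, 104 and then 100 again, because only 2 sectors remain after the
second write.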


[PATCH v6] FlexRM support in VFIO platform

2017-08-28 Thread Anup Patel
This patchset primarily adds Broadcom FlexRM reset module for
VFIO platform driver.

The patches are based on Linux-4.13-rc3 and can also be
found at flexrm-vfio-v6 branch of
https://github.com/Broadcom/arm64-linux.git

Changes since v5:
 - Make kconfig option VFIO_PLATFORM_BCMFLEXRM_RESET
   default to ARCH_BCM_IPROC

Changes since v4:
 - Use "--timeout" instead of "timeout--" in
   vfio_platform_bcmflexrm_shutdown()

Changes since v3:
 - Improve "depends on" for Kconfig option
   VFIO_PLATFORM_BCMFLEXRM_RESET
 - Fix typo in pr_warn() called by
   vfio_platform_bcmflexrm_shutdown()
 - Return error from vfio_platform_bcmflexrm_shutdown()
   when FlexRM ring flush timeout happens

Changes since v2:
 - Remove PATCH1 because fixing VFIO no-IOMMU mode is
   a separate topic

Changes since v1:
 - Remove iommu_present() check in vfio_iommu_group_get()
 - Drop PATCH1-to-PATCH3 because IOMMU_CAP_BYPASS is not
   required
 - Move additional comments out of license header in
   vfio_platform_bcmflexrm.c
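
As a note on the v4 change, the difference between "--timeout" and
"timeout--" matters for the "if (!timeout)" check that follows the poll
loop. The sketch below is a user-space illustration, assuming the loop has
the do/while shape used in the driver; the function names and the
ready_after parameter are hypothetical and only model when the hardware
would report completion.

```c
#include <stdbool.h>

/*
 * Poll with a pre-decrement budget. ready_after is the poll number on
 * which the modelled hardware reports completion (0 means never).
 * Returns true when the caller would report a timeout.
 */
static bool timed_out_predec(unsigned int budget, unsigned int ready_after)
{
	unsigned int timeout = budget;
	unsigned int polls = 0;

	do {
		if (ready_after && ++polls >= ready_after)
			break;
	} while (--timeout);

	/* timeout is exactly 0 only when the budget ran out. */
	return !timeout;
}

/* Same loop with a post-decrement budget, for comparison. */
static bool timed_out_postdec(unsigned int budget, unsigned int ready_after)
{
	unsigned int timeout = budget;
	unsigned int polls = 0;

	do {
		if (ready_after && ++polls >= ready_after)
			break;
	} while (timeout--);

	/*
	 * On exhaustion the final decrement wraps the unsigned counter,
	 * so timeout is non-zero and the timeout goes unreported. A break
	 * on the last extra poll leaves timeout at 0 and reports a timeout
	 * even though the flush completed.
	 */
	return !timeout;
}
```

With a budget of 3 and hardware that never completes, the pre-decrement
form reports a timeout while the post-decrement form does not.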

Anup Patel (1):
  vfio: platform: reset: Add Broadcom FlexRM reset module

 drivers/vfio/platform/reset/Kconfig|   9 ++
 drivers/vfio/platform/reset/Makefile   |   1 +
 .../vfio/platform/reset/vfio_platform_bcmflexrm.c  | 100 +
 3 files changed, 110 insertions(+)
 create mode 100644 drivers/vfio/platform/reset/vfio_platform_bcmflexrm.c

-- 
2.7.4



[PATCH v6] vfio: platform: reset: Add Broadcom FlexRM reset module

2017-08-28 Thread Anup Patel
This patch adds Broadcom FlexRM low-level reset for
VFIO platform.

It will do the following:
1. Disable/Deactivate each FlexRM ring
2. Flush each FlexRM ring

The cleanup sequence for FlexRM rings is adapted from
Broadcom FlexRM mailbox driver.

Signed-off-by: Anup Patel 
Reviewed-by: Oza Oza 
Reviewed-by: Scott Branden 
Reviewed-by: Eric Auger 
---
 drivers/vfio/platform/reset/Kconfig|   9 ++
 drivers/vfio/platform/reset/Makefile   |   1 +
 .../vfio/platform/reset/vfio_platform_bcmflexrm.c  | 100 +
 3 files changed, 110 insertions(+)
 create mode 100644 drivers/vfio/platform/reset/vfio_platform_bcmflexrm.c

diff --git a/drivers/vfio/platform/reset/Kconfig b/drivers/vfio/platform/reset/Kconfig
index 705..392e3c0 100644
--- a/drivers/vfio/platform/reset/Kconfig
+++ b/drivers/vfio/platform/reset/Kconfig
@@ -13,3 +13,12 @@ config VFIO_PLATFORM_AMDXGBE_RESET
  Enables the VFIO platform driver to handle reset for AMD XGBE
 
  If you don't know what to do here, say N.
+
+config VFIO_PLATFORM_BCMFLEXRM_RESET
+   tristate "VFIO support for Broadcom FlexRM reset"
+   depends on VFIO_PLATFORM && (ARCH_BCM_IPROC || COMPILE_TEST)
+   default ARCH_BCM_IPROC
+   help
+ Enables the VFIO platform driver to handle reset for Broadcom FlexRM
+
+ If you don't know what to do here, say N.
diff --git a/drivers/vfio/platform/reset/Makefile b/drivers/vfio/platform/reset/Makefile
index 93f4e23..8d9874b 100644
--- a/drivers/vfio/platform/reset/Makefile
+++ b/drivers/vfio/platform/reset/Makefile
@@ -5,3 +5,4 @@ ccflags-y += -Idrivers/vfio/platform
 
 obj-$(CONFIG_VFIO_PLATFORM_CALXEDAXGMAC_RESET) += vfio-platform-calxedaxgmac.o
 obj-$(CONFIG_VFIO_PLATFORM_AMDXGBE_RESET) += vfio-platform-amdxgbe.o
+obj-$(CONFIG_VFIO_PLATFORM_BCMFLEXRM_RESET) += vfio_platform_bcmflexrm.o
diff --git a/drivers/vfio/platform/reset/vfio_platform_bcmflexrm.c b/drivers/vfio/platform/reset/vfio_platform_bcmflexrm.c
new file mode 100644
index 000..966a813
--- /dev/null
+++ b/drivers/vfio/platform/reset/vfio_platform_bcmflexrm.c
@@ -0,0 +1,100 @@
+/*
+ * Copyright (C) 2017 Broadcom
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+/*
+ * This driver provides reset support for Broadcom FlexRM ring manager
+ * to VFIO platform.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vfio_platform_private.h"
+
+/* FlexRM configuration */
+#define RING_REGS_SIZE 0x1
+#define RING_VER_MAGIC 0x76303031
+
+/* Per-Ring register offsets */
+#define RING_VER   0x000
+#define RING_CONTROL   0x034
+#define RING_FLUSH_DONE0x038
+
+/* Register RING_CONTROL fields */
+#define CONTROL_FLUSH_SHIFT5
+
+/* Register RING_FLUSH_DONE fields */
+#define FLUSH_DONE_MASK0x1
+
+static int vfio_platform_bcmflexrm_shutdown(void __iomem *ring)
+{
+   unsigned int timeout;
+
+   /* Disable/inactivate ring */
+   writel_relaxed(0x0, ring + RING_CONTROL);
+
+   /* Flush ring with timeout of 1s */
+   timeout = 1000;
+   writel_relaxed(BIT(CONTROL_FLUSH_SHIFT), ring + RING_CONTROL);
+   do {
+   if (readl_relaxed(ring + RING_FLUSH_DONE) & FLUSH_DONE_MASK)
+   break;
+   mdelay(1);
+   } while (--timeout);
+
+   if (!timeout) {
+   pr_warn("VFIO FlexRM shutdown timeout\n");
+   return -ETIMEDOUT;
+   }
+
+   return 0;
+}
+
+static int vfio_platform_bcmflexrm_reset(struct vfio_platform_device *vdev)
+{
+   int rc = 0;
+   void __iomem *ring;
+   struct vfio_platform_region *reg = &vdev->regions[0];
+
+   /* Map FlexRM ring registers if not mapped */
+   if (!reg->ioaddr) {
+   reg->ioaddr = ioremap_nocache(reg->addr, reg->size);
+   if (!reg->ioaddr)
+   return -ENOMEM;
+   }
+
+   /* Discover and shutdown each FlexRM ring */
+   for (ring = reg->ioaddr;
+ring < (reg->ioaddr + reg->size); ring += RING_REGS_SIZE) {
+   if (readl_relaxed(ring + RING_VER) == RING_VER_MAGIC) {
+   rc = vfio_platform_bcmflexrm_shutdown(ring);
+   if (rc)
+  

Re: [PATCH] initramfs: Fix disabling of initramfs (and its compression)

2017-08-28 Thread Florian Fainelli


On 08/28/2017 08:09 PM, Nicholas Piggin wrote:
> On Mon, 28 Aug 2017 13:03:31 -0700
> Florian Fainelli  wrote:
> 
>> On 05/21/2017 07:46 PM, Nicholas Piggin wrote:
>>> On Sat, 20 May 2017 20:33:35 -0700
>>> Florian Fainelli  wrote:
>>>   
 Commit db2aa7fd15e8 ("initramfs: allow again choice of the embedded
 initram compression algorithm") introduced the possibility to select the
 initramfs compression algorithm from Kconfig and while this is a nice
 feature it broke the use case described below.

 Here is what my build system does:

 - kernel is initially configured not to have an initramfs included
 - build the user space root file system
 - re-configure the kernel to have an initramfs included
 (CONFIG_INITRAMFS_SOURCE="/path/to/romfs") and set relevant
 CONFIG_INITRAMFS options, in my case, no compression option
 (CONFIG_INITRAMFS_COMPRESSION_NONE)
 - kernel is re-built with these options -> kernel+initramfs image is
   copied
 - kernel is re-built again without these options -> kernel image is
   copied

 Building a kernel without an initramfs means setting this option:

 CONFIG_INITRAMFS_SOURCE="" (and this one only)

 whereas building a kernel with an initramfs means setting these options:

 CONFIG_INITRAMFS_SOURCE="/home/fainelli/work/uclinux-rootfs/romfs
 /home/fainelli/work/uclinux-rootfs/misc/initramfs.dev"
 CONFIG_INITRAMFS_ROOT_UID=1000
 CONFIG_INITRAMFS_ROOT_GID=1000
 CONFIG_INITRAMFS_COMPRESSION_NONE=y
 CONFIG_INITRAMFS_COMPRESSION=""

 Commit db2aa7fd15e857891cefbada8348c8d938c7a2bc ("initramfs: allow again
 choice of the embedded initram compression algorithm") is problematic
 because CONFIG_INITRAMFS_COMPRESSION which is used to determine the
 initramfs_data.cpio extension/compression is a string, and due to how
 Kconfig works it will evaluate in order, how to assign it.

 Setting CONFIG_INITRAMFS_COMPRESSION_NONE with
 CONFIG_INITRAMFS_SOURCE="" cannot possibly work (because of the depends
 on INITRAMFS_SOURCE!="" imposed on CONFIG_INITRAMFS_COMPRESSION ) yet we
 still get CONFIG_INITRAMFS_COMPRESSION assigned to ".gz" because
 CONFIG_RD_GZIP=y is set in my kernel, even when there is no initramfs
 being built.

 So we basically end-up generating two initramfs_data.cpio* files, one
 without extension, and one with .gz. This causes usr/Makefile to track
 usr/initramfs_data.cpio.gz, and not usr/initramfs_data.cpio anymore,
 that is also largely problematic after
 9e3596b0c6539e28546ff7c72a06576627068353 ("kbuild: initramfs cleanup,
 set target from Kconfig") because we used to track all possible
 initramfs_data files in the $(targets) variable before that commit.

 The end result is that the kernel with an initramfs clearly does not
 contain what we expect it to, it has a stale initramfs_data.cpio file
 built into it, and we keep re-generating an initramfs_data.cpio.gz file
 which is not the one that we want to include in the kernel image proper.

 The fix consists in hiding CONFIG_INITRAMFS_COMPRESSION when
 CONFIG_INITRAMFS_SOURCE="". This puts us back in a state to the pre-4.10
 behavior where we can properly disable and re-enable initramfs within
 the same kernel .config file, and be in control of what
 CONFIG_INITRAMFS_COMPRESSION is set to.

 Fixes: db2aa7fd15e8 ("initramfs: allow again choice of the embedded 
 initram compression algorithm")
 Fixes: 9e3596b0c653 ("kbuild: initramfs cleanup, set target from Kconfig")
 Signed-off-by: Florian Fainelli   
>>>
>>> This is very thorough, thank you for tracking it down and fixing it.
>>>
>>> I can't say I've worked through the problem in the code, but your
>>> changelog and the proposed fix seem reasonable to me. So for what
>>> it's worth:
>>>
>>> Acked-by: Nicholas Piggin   
>>
>> Well, I am looking at this again, and it's still broken, the same test
>> case is involved, except this time, I am switching between no-initramfs
>> and initramfs with gzip compression (the key thing is using a
>> compression of some sort). The end result is the following:
>>
>> - change stuff in the rootfs
>> - build the kernel with initramfs, CONFIG_INITRAMFS_COMPRESSION_GZIP=y,
>> usr/initramfs_data.cpio.gz gets generated correctly the first time
>> - build the kernel without initramfs,
>> CONFIG_INITRAMFS_COMPRESSION_NONE=y, usr/initramfs_data.cpio gets generated
>>
>> Now back to step 1 add some files, and we can see that
>> usr/initramfs_data.cpio.gz is now stale from before...
>>
>> So while my earlier fix switched the initramfs w/o compression to no
>> initramfs rebuild, now this does not work because we still have two
>> files left to be tracked:
>>
>> usr/initramfs_data.cpio (no compression, or when initramfs is disabled)
>> and usr/initramfs_data.cpio.$(suffix_y)
>>
>> How would you go ab

[PATCH v5 1/2] dt-bindings: i2c: Add Spreadtrum I2C controller documentation

2017-08-28 Thread Baolin Wang
This patch adds the binding documentation for Spreadtrum I2C
controller device.

Signed-off-by: Baolin Wang 
Acked-by: Rob Herring 
---
Changes since v4:
 - No updates.

Changes since v3:
 - Add Ack from RobH.

Changes since v2:
 - Change compatible strings to be SoC specific.

Changes since v1:
 - No updates.
---
 Documentation/devicetree/bindings/i2c/i2c-sprd.txt |   31 
 1 file changed, 31 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/i2c/i2c-sprd.txt

diff --git a/Documentation/devicetree/bindings/i2c/i2c-sprd.txt b/Documentation/devicetree/bindings/i2c/i2c-sprd.txt
new file mode 100644
index 000..60b7cda
--- /dev/null
+++ b/Documentation/devicetree/bindings/i2c/i2c-sprd.txt
@@ -0,0 +1,31 @@
+I2C for Spreadtrum platforms
+
+Required properties:
+- compatible: Should be "sprd,sc9860-i2c".
+- reg: Specify the physical base address of the controller and length
+  of memory mapped region.
+- interrupts: Should contain I2C interrupt.
+- clock-names: Should contain following entries:
+  "i2c" for I2C clock,
+  "source" for I2C source (parent) clock,
+  "enable" for I2C module enable clock.
+- clocks: Should contain a clock specifier for each entry in clock-names.
+- clock-frequency: Contains desired I2C bus clock frequency in Hz.
+- #address-cells: Should be 1 to describe address cells for I2C device address.
+- #size-cells: Should be 0 means no size cell for I2C device address.
+
+Optional properties:
+- Child nodes conforming to I2C bus binding
+
+Examples:
+i2c0: i2c@7050 {
+   compatible = "sprd,sc9860-i2c";
+   reg = <0 0x7050 0 0x1000>;
+   interrupts = ;
+   clock-names = "i2c", "source", "enable";
+   clocks = <&clk_i2c3>, <&ext_26m>, <&clk_ap_apb_gates 11>;
+   clock-frequency = <40>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+};
+
-- 
1.7.9.5



[PATCH v5 2/2] i2c: Add Spreadtrum I2C controller driver

2017-08-28 Thread Baolin Wang
This patch adds the I2C controller driver for Spreadtrum SC9860 platform.

Signed-off-by: Baolin Wang 
Reviewed-by: Andy Shevchenko 
---
Changes since v4:
 - Remove dump registers function.
 - Change 'unsigned int' to 'u32' type.
 - Invert ack logic to make it clear.
 - Modify some comments and error message.

Changes since v3:
 - Use SPDX-License-Identifier tag instead.

Changes since v2:
 - Remove some redundant comments and parens.
 - Define macros instead of magic number.
 - Add some comments to explain clock formula.
 - Change of_clk_get_by_name() to devm_clk_get().
 - Deal with other frequency.
 - Change register definition to lower case.
 - Change is_last_msg to boolean.
 - Other optimization.

Changes since v1:
 - Power on I2C device in probe().
 - Remove redundant macros and use __maybe_unused.
 - Remove redundant 'of_match_ptr'.
 - Modify return values and check the return value for 'clk_prepare_enable'.
---
 drivers/i2c/busses/Kconfig|7 +
 drivers/i2c/busses/Makefile   |1 +
 drivers/i2c/busses/i2c-sprd.c |  646 +
 3 files changed, 654 insertions(+)
 create mode 100644 drivers/i2c/busses/i2c-sprd.c

diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
index 1006b23..64729ac 100644
--- a/drivers/i2c/busses/Kconfig
+++ b/drivers/i2c/busses/Kconfig
@@ -900,6 +900,13 @@ config I2C_SIRF
  This driver can also be built as a module.  If so, the module
  will be called i2c-sirf.
 
+config I2C_SPRD
+   bool "Spreadtrum I2C interface"
+   depends on ARCH_SPRD
+   help
+ If you say yes to this option, support will be included for the
+ Spreadtrum I2C interface.
+
 config I2C_ST
tristate "STMicroelectronics SSC I2C support"
depends on ARCH_STI
diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile
index 1b2fc81..505f74a 100644
--- a/drivers/i2c/busses/Makefile
+++ b/drivers/i2c/busses/Makefile
@@ -89,6 +89,7 @@ obj-$(CONFIG_I2C_SH7760)  += i2c-sh7760.o
 obj-$(CONFIG_I2C_SH_MOBILE)+= i2c-sh_mobile.o
 obj-$(CONFIG_I2C_SIMTEC)   += i2c-simtec.o
 obj-$(CONFIG_I2C_SIRF) += i2c-sirf.o
+obj-$(CONFIG_I2C_SPRD) += i2c-sprd.o
 obj-$(CONFIG_I2C_ST)   += i2c-st.o
 obj-$(CONFIG_I2C_STM32F4)  += i2c-stm32f4.o
 obj-$(CONFIG_I2C_STU300)   += i2c-stu300.o
diff --git a/drivers/i2c/busses/i2c-sprd.c b/drivers/i2c/busses/i2c-sprd.c
new file mode 100644
index 000..22e08ae
--- /dev/null
+++ b/drivers/i2c/busses/i2c-sprd.c
@@ -0,0 +1,646 @@
+/*
+ * Copyright (C) 2017 Spreadtrum Communications Inc.
+ *
+ * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define I2C_CTL0x00
+#define I2C_ADDR_CFG   0x04
+#define I2C_COUNT  0x08
+#define I2C_RX 0x0c
+#define I2C_TX 0x10
+#define I2C_STATUS 0x14
+#define I2C_HSMODE_CFG 0x18
+#define I2C_VERSION0x1c
+#define ADDR_DVD0  0x20
+#define ADDR_DVD1  0x24
+#define ADDR_STA0_DVD  0x28
+#define ADDR_RST   0x2c
+
+/* I2C_CTL */
+#define STP_EN BIT(20)
+#define FIFO_AF_LVL_MASK   GENMASK(19, 16)
+#define FIFO_AF_LVL16
+#define FIFO_AE_LVL_MASK   GENMASK(15, 12)
+#define FIFO_AE_LVL12
+#define I2C_DMA_EN BIT(11)
+#define FULL_INTEN BIT(10)
+#define EMPTY_INTENBIT(9)
+#define I2C_DVD_OPTBIT(8)
+#define I2C_OUT_OPTBIT(7)
+#define I2C_TRIM_OPT   BIT(6)
+#define I2C_HS_MODEBIT(4)
+#define I2C_MODE   BIT(3)
+#define I2C_EN BIT(2)
+#define I2C_INT_EN BIT(1)
+#define I2C_START  BIT(0)
+
+/* I2C_STATUS */
+#define SDA_IN BIT(21)
+#define SCL_IN BIT(20)
+#define FIFO_FULL  BIT(4)
+#define FIFO_EMPTY BIT(3)
+#define I2C_INTBIT(2)
+#define I2C_RX_ACK BIT(1)
+#define I2C_BUSY   BIT(0)
+
+/* ADDR_RST */
+#define I2C_RSTBIT(0)
+
+#define I2C_FIFO_DEEP  12
+#define I2C_FIFO_FULL_THLD 15
+#define I2C_FIFO_EMPTY_THLD4
+#define I2C_DATA_STEP  8
+#define I2C_ADDR_DVD0_CALC(high, low)  \
+   ((((high) & GENMASK(15, 0)) << 16) | ((low) & GENMASK(15, 0)))
+#define I2C_ADDR_DVD1_CALC(high, low)  \
+   (((high) & GENMASK(31, 16)) | (((low) & GENMASK(31, 16)) >> 16))
+
+/* timeout (ms) for pm runtime autosuspend */
+#define SPRD_I2C_PM_TIMEOUT1000
+
+/* SPRD i2c data structure */
+struct sprd_i2c {
+   struct i2c_adapter adap;
+   struct device *dev;
+   void __iomem *base;
+   struct i2c_msg *msg;
+   struct clk *clk;
+   u32 src_clk;
+   u32 bus_freq;
+   struct completion complete;
+

Re: [PATCH] libnvdimm: clean up command definitions

2017-08-28 Thread Dan Williams
On Mon, Aug 28, 2017 at 6:03 PM, Yasunori Goto  wrote:
>> On Mon, Aug 28, 2017 at 1:50 PM, Jerry Hoemann  wrote:
>> >
>> > On Mon, Aug 28, 2017 at 08:45:32AM -0700, Dan Williams wrote:
>> >> Remove the command payloads that do not have an associated libnvdimm
>> >> ioctl. I.e. remove the payloads that would only ever be carried in the
>> >> ND_CMD_CALL envelope. This prevents userspace from growing unnecessary
>> >> dependencies on this kernel header when userspace already has everything
>> >> it needs to craft and send these commands.
>> >
>> > Userspace needs to include linux/ndctl.h to make the call as
>> > that is where nd_cmd_pkg is defined.
>> >
>> > So you want to have some structures defined in ndctl.h and other
>> > defined in the to be created libndctl-nfit.h?  Plus a third header
>> > file for the HPE non-root calls?
>>
>> Yes. ndctl.h exports the ioctl command payloads, everything that goes
>> inside of ND_CMD_CALL is defined by userspace headers. The
>> libndctl-nfit.h header is proposed as a place to land vendor agnostic
>> NFIT-defined payloads, and any vendor specific definitions would
>> remain internal to libndctl as they are today.
>>
>> > Will libndctl-nfit.h be generally available and installed?
>>
>> Yes, that's the plan.
>>
>> > Will it be clean so that other applications can use it to get these
>> > definitions?  Or will it be loaded w/ a bunch of stuff only useful
>> > to your ndctl command?
>>
>> Yes, that's the plan. It's a bug if libndctl-nfit.h is not generically
>> clean for issuing the NFIT root device commands via some ND_CMD_CALL
>> helpers from the base libndctl library.
>>
>> In other words libndctl-nfit.h defines the payload and libndctl
>> defines some general helpers for issuing commands.
>
> Maybe I don't understand your idea yet, let me confirm it.
>
> Certainly, current acpi driver does not need these definitions.
> But, I think nfit_test.ko will need them to emulate these features.
>
> Do you intend that libndctl-nfit.h should be defined at "include/uapi/linux/"
> directory?
> Otherwise, it should be defined at "tools/testing/nvdimm/" or
> "tools/testing/nvdimm/test" ?

nfit_test will need its own internal / private copy of these payloads
in tools/testing/nvdimm/test so it can emulate how the bios behaves.
The include/uapi/linux directory is for user to kernel interface
definitions and these command payloads are purely an interface to bios
/ firmware.


Re: [PATCH v15 4/5] mm: support reporting free page blocks

2017-08-28 Thread Wei Wang

On 08/28/2017 09:33 PM, Michal Hocko wrote:

On Mon 28-08-17 18:08:32, Wei Wang wrote:

This patch adds support to walk through the free page blocks in the
system and report them via a callback function. Some page blocks may
leave the free list after zone->lock is released, so it is the caller's
responsibility to either detect or prevent the use of such pages.

One use example of this patch is to accelerate live migration by skipping
the transfer of free pages reported from the guest. A popular method used
by the hypervisor to track which part of memory is written during live
migration is to write-protect all the guest memory. So, those pages that
are reported as free pages but are written after the report function
returns will be captured by the hypervisor, and they will be added to the
next round of memory transfer.

OK, looks much better. I still have few nits.


+extern void walk_free_mem_block(void *opaque,
+   int min_order,
+   bool (*report_page_block)(void *, unsigned long,
+ unsigned long));
+

please add names to arguments of the prototype


  /*
   * Free reserved pages within range [PAGE_ALIGN(start), end & PAGE_MASK)
   * into the buddy system. The freed pages will be poisoned with pattern
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6d00f74..81eedc7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4762,6 +4762,71 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
show_swap_cache_info();
  }
  
+/**

+ * walk_free_mem_block - Walk through the free page blocks in the system
+ * @opaque: the context passed from the caller
+ * @min_order: the minimum order of free lists to check
+ * @report_page_block: the callback function to report free page blocks

page_block has meaning in the core MM which doesn't strictly match its
usage here. Moreover we are reporting pfn ranges rather than struct page
range. So report_pfn_range would suit better.

[...]

+   for_each_populated_zone(zone) {
+   for (order = MAX_ORDER - 1; order >= min_order; order--) {
+   for (mt = 0; !stop && mt < MIGRATE_TYPES; mt++) {
+   spin_lock_irqsave(&zone->lock, flags);
+   list = &zone->free_area[order].free_list[mt];
+   list_for_each_entry(page, list, lru) {
+   pfn = page_to_pfn(page);
+   stop = report_page_block(opaque, pfn,
+1 << order);
+   if (stop)
+   break;

if (stop) {

spin_unlock_irqrestore(&zone->lock, flags);
return;
}

would be both easier and less error prone. E.g. You wouldn't pointlessly
iterate over remaining orders just to realize there is nothing to be
done for those...
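
The early-return shape suggested above can be sketched in user space as
follows. This is an illustration under assumptions, not the mm code:
struct fake_lock stands in for zone->lock and only counts acquisitions,
NR_ORDERS and NR_TYPES are arbitrary, and walk_lists() and stop_after()
are hypothetical names. The point is that unlocking and returning at the
stop site leaves every nested loop at once, without a stop flag that each
outer loop has to re-check.

```c
#include <stdbool.h>

#define NR_ORDERS 4
#define NR_TYPES  3

/* Hypothetical stand-in for zone->lock: tracks how often it is held. */
struct fake_lock {
	int held;
};

static void fake_lock_acquire(struct fake_lock *l) { l->held++; }
static void fake_lock_release(struct fake_lock *l) { l->held--; }

/* Callback returns true to ask the walker to stop. */
typedef bool (*report_fn)(void *opaque, int order, int type);

/* Walk every (order, type) list; returns how many lists were visited. */
static int walk_lists(struct fake_lock *lock, int min_order,
		      report_fn report, void *opaque)
{
	int visited = 0;

	for (int order = NR_ORDERS - 1; order >= min_order; order--) {
		for (int type = 0; type < NR_TYPES; type++) {
			fake_lock_acquire(lock);
			visited++;
			if (report(opaque, order, type)) {
				/* Drop the lock, then leave all loops. */
				fake_lock_release(lock);
				return visited;
			}
			fake_lock_release(lock);
		}
	}
	return visited;
}

/* Example callback: stop once *opaque reports have been made. */
static bool stop_after(void *opaque, int order, int type)
{
	int *remaining = opaque;

	(void)order;
	(void)type;
	return --(*remaining) == 0;
}
```

Stopping after two reports visits exactly two lists and leaves the lock
released; the remaining orders are never touched.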



Yes, that's better, thanks. I will take other suggestions as well.

Best,
Wei





Re: [PATCH] initramfs: Fix disabling of initramfs (and its compression)

2017-08-28 Thread Nicholas Piggin
On Mon, 28 Aug 2017 13:03:31 -0700
Florian Fainelli  wrote:

> On 05/21/2017 07:46 PM, Nicholas Piggin wrote:
> > On Sat, 20 May 2017 20:33:35 -0700
> > Florian Fainelli  wrote:
> >   
> >> Commit db2aa7fd15e8 ("initramfs: allow again choice of the embedded
> >> initram compression algorithm") introduced the possibility to select the
> >> initramfs compression algorithm from Kconfig and while this is a nice
> >> feature it broke the use case described below.
> >>
> >> Here is what my build system does:
> >>
> >> - kernel is initially configured not to have an initramfs included
> >> - build the user space root file system
> >> - re-configure the kernel to have an initramfs included
> >> (CONFIG_INITRAMFS_SOURCE="/path/to/romfs") and set relevant
> >> CONFIG_INITRAMFS options, in my case, no compression option
> >> (CONFIG_INITRAMFS_COMPRESSION_NONE)
> >> - kernel is re-built with these options -> kernel+initramfs image is
> >>   copied
> >> - kernel is re-built again without these options -> kernel image is
> >>   copied
> >>
> >> Building a kernel without an initramfs means setting this option:
> >>
> >> CONFIG_INITRAMFS_SOURCE="" (and this one only)
> >>
> >> whereas building a kernel with an initramfs means setting these options:
> >>
> >> CONFIG_INITRAMFS_SOURCE="/home/fainelli/work/uclinux-rootfs/romfs
> >> /home/fainelli/work/uclinux-rootfs/misc/initramfs.dev"
> >> CONFIG_INITRAMFS_ROOT_UID=1000
> >> CONFIG_INITRAMFS_ROOT_GID=1000
> >> CONFIG_INITRAMFS_COMPRESSION_NONE=y
> >> CONFIG_INITRAMFS_COMPRESSION=""
> >>
> >> Commit db2aa7fd15e857891cefbada8348c8d938c7a2bc ("initramfs: allow again
> >> choice of the embedded initram compression algorithm") is problematic
> >> because CONFIG_INITRAMFS_COMPRESSION which is used to determine the
> >> initramfs_data.cpio extension/compression is a string, and due to how
> >> Kconfig works it will evaluate in order, how to assign it.
> >>
> >> Setting CONFIG_INITRAMFS_COMPRESSION_NONE with
> >> CONFIG_INITRAMFS_SOURCE="" cannot possibly work (because of the depends
> >> on INITRAMFS_SOURCE!="" imposed on CONFIG_INITRAMFS_COMPRESSION ) yet we
> >> still get CONFIG_INITRAMFS_COMPRESSION assigned to ".gz" because
> >> CONFIG_RD_GZIP=y is set in my kernel, even when there is no initramfs
> >> being built.
> >>
> >> So we basically end-up generating two initramfs_data.cpio* files, one
> >> without extension, and one with .gz. This causes usr/Makefile to track
> >> usr/initramfs_data.cpio.gz, and not usr/initramfs_data.cpio anymore,
> >> that is also largely problematic after
> >> 9e3596b0c6539e28546ff7c72a06576627068353 ("kbuild: initramfs cleanup,
> >> set target from Kconfig") because we used to track all possible
> >> initramfs_data files in the $(targets) variable before that commit.
> >>
> >> The end result is that the kernel with an initramfs clearly does not
> >> contain what we expect it to, it has a stale initramfs_data.cpio file
> >> built into it, and we keep re-generating an initramfs_data.cpio.gz file
> >> which is not the one that we want to include in the kernel image proper.
> >>
> >> The fix consists in hiding CONFIG_INITRAMFS_COMPRESSION when
> >> CONFIG_INITRAMFS_SOURCE="". This puts us back in a state to the pre-4.10
> >> behavior where we can properly disable and re-enable initramfs within
> >> the same kernel .config file, and be in control of what
> >> CONFIG_INITRAMFS_COMPRESSION is set to.
> >>
> >> Fixes: db2aa7fd15e8 ("initramfs: allow again choice of the embedded 
> >> initram compression algorithm")
> >> Fixes: 9e3596b0c653 ("kbuild: initramfs cleanup, set target from Kconfig")
> >> Signed-off-by: Florian Fainelli   
> > 
> > This is very thorough, thank you for tracking it down and fixing it.
> > 
> > I can't say I've worked through the problem in the code, but your
> > changelog and the proposed fix seem reasonable to me. So for what
> > it's worth:
> > 
> > Acked-by: Nicholas Piggin   
> 
> Well, I am looking at this again, and it's still broken, the same test
> case is involved, except this time, I am switching between no-initramfs
> and initramfs with gzip compression (the key thing is using a
> compression of some sort). The end result is the following:
> 
> - change stuff in the rootfs
> - build the kernel with initramfs, CONFIG_INITRAMFS_COMPRESSION_GZIP=y,
> usr/initramfs_data.cpio.gz gets generated correctly the first time
> - build the kernel without initramfs,
> CONFIG_INITRAMFS_COMPRESSION_NONE=y, usr/initramfs_data.cpio gets generated
> 
> Now back to step 1 add some files, and we can see that
> usr/initramfs_data.cpio.gz is now stale from before...
> 
> So while my earlier fix switched the initramfs w/o compression to no
> initramfs rebuild, now this does not work because we still have two
> files left to be tracked:
> 
> usr/initramfs_data.cpio (no compression, or when initramfs is disabled)
> and usr/initramfs_data.cpio.$(suffix_y)
> 
> How would you go about solving this?

I don't see the problem. When I change back to .gz,

Re: [PATCH v15 3/5] virtio-balloon: VIRTIO_BALLOON_F_SG

2017-08-28 Thread Wei Wang

On 08/29/2017 02:03 AM, Michael S. Tsirkin wrote:

On Mon, Aug 28, 2017 at 06:08:31PM +0800, Wei Wang wrote:

Add a new feature, VIRTIO_BALLOON_F_SG, which enables the transfer
of balloon (i.e. inflated/deflated) pages using scatter-gather lists
to the host.

The implementation of the previous virtio-balloon is not very
efficient, because the balloon pages are transferred to the
host one by one. Here is the breakdown of the time in percentage
spent on each step of the balloon inflating process (inflating
7GB of an 8GB idle guest).

1) allocating pages (6.5%)
2) sending PFNs to host (68.3%)
3) address translation (6.1%)
4) madvise (19%)

It takes about 4126ms for the inflating process to complete.
The above profiling shows that the bottlenecks are stage 2)
and stage 4).

This patch optimizes step 2) by transferring pages to the host in
sgs. An sg describes a chunk of guest physically continuous pages.
With this mechanism, step 4) can also be optimized by doing address
translation and madvise() in chunks rather than page by page.
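
To illustrate the chunking idea, here is a minimal userspace sketch, not the
patch's code. It assumes a plain array of 64-bit words in place of the
xbitmap, and the names pfn_chunk, bit_is_set and collect_chunks are
hypothetical. It walks a bit range and reports each run of consecutive set
bits as one (start, page count) chunk. The sg length cap that the patch
applies is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chunk descriptor: first page frame number and page count. */
struct pfn_chunk {
        unsigned long start;
        unsigned long npages;
};

/* Test bit @n in a plain bitmap of 64-bit words (stand-in for the xbitmap). */
static int bit_is_set(const uint64_t *map, unsigned long n)
{
        return (int)((map[n / 64] >> (n % 64)) & 1);
}

/*
 * Collect runs of consecutive set bits in [start, end) into @out.
 * Returns the number of chunks written, which is at most @max.
 */
static size_t collect_chunks(const uint64_t *map, unsigned long start,
                             unsigned long end, struct pfn_chunk *out,
                             size_t max)
{
        size_t n = 0;
        unsigned long i = start;

        while (i < end && n < max) {
                /* Skip clear bits to find the start of the next run. */
                while (i < end && !bit_is_set(map, i))
                        i++;
                if (i == end)
                        break;
                out[n].start = i;
                /* Extend the run for as long as the bits stay set. */
                while (i < end && bit_is_set(map, i))
                        i++;
                out[n].npages = i - out[n].start;
                n++;
        }
        return n;
}
```

With bits 2..5 and 63..65 set, the sketch reports two chunks, so the host
would see two descriptors instead of seven per-page entries.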

With this new feature, the above ballooning process takes ~597ms
resulting in an improvement of ~86%.

TODO: optimize stage 1) by allocating/freeing a chunk of pages
instead of a single page each time.

Signed-off-by: Wei Wang 
Signed-off-by: Liang Li 
Suggested-by: Michael S. Tsirkin 
---
  drivers/virtio/virtio_balloon.c | 171 
  include/uapi/linux/virtio_balloon.h |   1 +
  2 files changed, 155 insertions(+), 17 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index f0b3a0b..8ecc1d4 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -32,6 +32,8 @@
  #include 
  #include 
  #include 
+#include 
+#include 
  
  /*

   * Balloon device works in 4K page units.  So each page is pointed to by
@@ -79,6 +81,9 @@ struct virtio_balloon {
/* Synchronize access/update to this struct virtio_balloon elements */
struct mutex balloon_lock;
  
+	/* The xbitmap used to record balloon pages */

+   struct xb page_xb;
+
/* The array of pfns we tell the Host about. */
unsigned int num_pfns;
__virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
@@ -141,13 +146,111 @@ static void set_page_pfns(struct virtio_balloon *vb,
  page_to_balloon_pfn(page) + i);
  }
  
+static int add_one_sg(struct virtqueue *vq, void *addr, uint32_t size)

+{
+   struct scatterlist sg;
+
+   sg_init_one(&sg, addr, size);
+   return virtqueue_add_inbuf(vq, &sg, 1, vq, GFP_KERNEL);
+}
+
+static void send_balloon_page_sg(struct virtio_balloon *vb,
+struct virtqueue *vq,
+void *addr,
+uint32_t size,
+bool batch)
+{
+   unsigned int len;
+   int err;
+
+   err = add_one_sg(vq, addr, size);
+   /* Sanity check: this can't really happen */
+   WARN_ON(err);

It might be cleaner to detect that add failed due to
ring full and kick then. Just an idea, up to you
whether to do it.


+
+   /* If batching is in use, we batch the sgs till the vq is full. */
+   if (!batch || !vq->num_free) {
+   virtqueue_kick(vq);
+   wait_event(vb->acked, virtqueue_get_buf(vq, &len));
+   /* Release all the entries if there are */

Meaning
Account for all used entries if any
?


+   while (virtqueue_get_buf(vq, &len))
+   ;


Above code is reused below. Add a function?


+   }
+}
+
+/*
+ * Send balloon pages in sgs to host. The balloon pages are recorded in the
+ * page xbitmap. Each bit in the bitmap corresponds to a page of PAGE_SIZE.
+ * The page xbitmap is searched for continuous "1" bits, which correspond
+ * to continuous pages, to chunk into sgs.
+ *
+ * @page_xb_start and @page_xb_end form the range of bits in the xbitmap that
+ * need to be searched.
+ */
+static void tell_host_sgs(struct virtio_balloon *vb,
+ struct virtqueue *vq,
+ unsigned long page_xb_start,
+ unsigned long page_xb_end)
+{
+   unsigned long sg_pfn_start, sg_pfn_end;
+   void *sg_addr;
+   uint32_t sg_len, sg_max_len = round_down(UINT_MAX, PAGE_SIZE);
+
+   sg_pfn_start = page_xb_start;
+   while (sg_pfn_start < page_xb_end) {
+   sg_pfn_start = xb_find_next_bit(&vb->page_xb, sg_pfn_start,
+   page_xb_end, 1);
+   if (sg_pfn_start == page_xb_end + 1)
+   break;
+   sg_pfn_end = xb_find_next_bit(&vb->page_xb, sg_pfn_start + 1,
+ page_xb_end, 0);
+   sg_addr = (void *)pfn_to_kaddr(sg_pfn_start);
+   sg_len = (sg_pfn_end - sg_pfn_start) << PAGE_SHIFT;
+   while (sg_len > sg_m

Re: [RFC PATCH v5 0/5] vfio-pci: Add support for mmapping MSI-X table

2017-08-28 Thread Alexey Kardashevskiy
On 21/08/17 12:47, Alexey Kardashevskiy wrote:
> Folks,
> 
> Ok, people did talk, exchanged ideas, lovely :) What happens now? Do I
> repost this or go back to PCI bus flags or something else? Thanks.


Anyone, any help? How do we proceed with this? Thanks.



> 
> 
> 
> On 14/08/17 19:45, Alexey Kardashevskiy wrote:
>> Folks,
>>
>> Is there anything to change besides those compiler errors and David's
>> comment in 5/5? Or the whole patchset is too bad? Thanks.
>>
>>
>>
>> On 07/08/17 17:25, Alexey Kardashevskiy wrote:
>>> This is a followup for "[PATCH kernel v4 0/6] vfio-pci: Add support for 
>>> mmapping MSI-X table"
>>> http://www.spinics.net/lists/kvm/msg152232.html
>>>
>>> This time it is using "caps" in IOMMU groups. The main question is if PCI
>>> bus flags or IOMMU domains are still better (and which one).
>>
>>>
>>>
>>>
>>> Here is some background:
>>>
>>> Current vfio-pci implementation disallows to mmap the page
>>> containing MSI-X table in case that users can write directly
>>> to MSI-X table and generate an incorrect MSIs.
>>>
>>> However, this will cause some performance issue when there
>>> are some critical device registers in the same page as the
>>> MSI-X table. We have to handle the mmio access to these
>>> registers in QEMU emulation rather than in guest.
>>>
>>> To solve this issue, this series allows to expose MSI-X table
>>> to userspace when hardware enables the capability of interrupt
>>> remapping which can ensure that a given PCI device can only
>>> shoot the MSIs assigned for it. And we introduce a new bus_flags
>>> PCI_BUS_FLAGS_MSI_REMAP to test this capability on PCI side
>>> for different archs.
>>>
>>>
>>> This is based on sha1
>>> 26c5cebfdb6c "Merge branch 'parisc-4.13-4' of 
>>> git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux"
>>>
>>> Please comment. Thanks.
>>>
>>> Changelog:
>>>
>>> v5:
>>> * redid the whole thing via so-called IOMMU group capabilities
>>>
>>> v4:
>>> * rebased on recent upstream
>>> * got all 6 patches from v2 (v3 was missing some)
>>>
>>>
>>>
>>>
>>> Alexey Kardashevskiy (5):
>>>   iommu: Add capabilities to a group
>>>   iommu: Set IOMMU_GROUP_CAP_ISOLATE_MSIX if MSI controller enables IRQ
>>> remapping
>>>   iommu/intel/amd: Set IOMMU_GROUP_CAP_ISOLATE_MSIX if IRQ remapping is
>>> enabled
>>>   powerpc/iommu: Set IOMMU_GROUP_CAP_ISOLATE_MSIX
>>>   vfio-pci: Allow to expose MSI-X table to userspace when safe
>>>
>>>  include/linux/iommu.h| 20 
>>>  include/linux/vfio.h |  1 +
>>>  arch/powerpc/kernel/iommu.c  |  1 +
>>>  drivers/iommu/amd_iommu.c|  3 +++
>>>  drivers/iommu/intel-iommu.c  |  3 +++
>>>  drivers/iommu/iommu.c| 35 +++
>>>  drivers/vfio/pci/vfio_pci.c  | 20 +---
>>>  drivers/vfio/pci/vfio_pci_rdwr.c |  5 -
>>>  drivers/vfio/vfio.c  | 15 +++
>>>  9 files changed, 99 insertions(+), 4 deletions(-)
>>>
>>
>>
> 
> 


-- 
Alexey


Re: [PATCH v4 3/3] ARM: dts: exynos: Remove the display-timing and delay from rinato dts

2017-08-28 Thread Hoegeun Kwon

Hi Krzysztof,

The driver has been merged into exynos-drm-misc.
Could you please check this patch(3/3).

Best regards,
Hoegeun

On 07/13/2017 11:20 AM, Hoegeun Kwon wrote:

The display-timing and delay are included in the panel driver. So it
should be removed in dts.

Signed-off-by: Hoegeun Kwon 
---
  arch/arm/boot/dts/exynos3250-rinato.dts | 22 --
  1 file changed, 22 deletions(-)

diff --git a/arch/arm/boot/dts/exynos3250-rinato.dts b/arch/arm/boot/dts/exynos3250-rinato.dts
index 443e0c9..6b70c8d 100644
--- a/arch/arm/boot/dts/exynos3250-rinato.dts
+++ b/arch/arm/boot/dts/exynos3250-rinato.dts
@@ -242,28 +242,6 @@
vci-supply = <&ldo20_reg>;
reset-gpios = <&gpe0 1 GPIO_ACTIVE_LOW>;
te-gpios = <&gpx0 6 GPIO_ACTIVE_HIGH>;
-   power-on-delay= <30>;
-   power-off-delay= <120>;
-   reset-delay = <5>;
-   init-delay = <100>;
-   flip-horizontal;
-   flip-vertical;
-   panel-width-mm = <29>;
-   panel-height-mm = <29>;
-
-   display-timings {
-   timing-0 {
-   clock-frequency = <460>;
-   hactive = <320>;
-   vactive = <320>;
-   hfront-porch = <1>;
-   hback-porch = <1>;
-   hsync-len = <1>;
-   vfront-porch = <150>;
-   vback-porch = <1>;
-   vsync-len = <2>;
-   };
-   };
  
  		port {

dsi_in: endpoint {




Re: linux-next: Signed-off-by missing for commit in the slave-dma tree

2017-08-28 Thread Vinod Koul
On Tue, Aug 29, 2017 at 09:10:56AM +1000, Stephen Rothwell wrote:
> Hi Vinod,
> 
> Commit
> 
>   966e5e01f420 ("dmaengine: altera: Use macros instead of structs to describe 
> the registers")
> 
> is missing a Signed-off-by from its committer.

Oops, missed fixing that up while running with -i :(.

Fixed now, thanks for pointing out

-- 
~Vinod


[PATCH] docs: highres: fix broken urls

2017-08-28 Thread stephen lu
Some URLs are invalid. I found alternative URLs.

Signed-off-by: stephen lu 
---
 Documentation/timers/highres.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/timers/highres.txt b/Documentation/timers/highres.txt
index e878997..9d88f67 100644
--- a/Documentation/timers/highres.txt
+++ b/Documentation/timers/highres.txt
@@ -4,10 +4,10 @@ High resolution timers and dynamic ticks design notes
 Further information can be found in the paper of the OLS 2006 talk "hrtimers
 and beyond". The paper is part of the OLS 2006 Proceedings Volume 1, which can
 be found on the OLS website:
-http://www.linuxsymposium.org/2006/linuxsymposium_procv1.pdf
+https://www.kernel.org/doc/ols/2006/ols2006v1-pages-333-346.pdf

 The slides to this talk are available from:
-http://tglx.de/projects/hrtimers/ols2006-hrtimers.pdf
+http://www.cs.columbia.edu/~nahum/w6998/papers/ols2006-hrtimers-slides.pdf

 The slides contain five figures (pages 2, 15, 18, 20, 22), which illustrate the
 changes in the time(r) related Linux subsystems. Figure #1 (p. 2) shows the


RE: [PATCH] vsock: only load vmci transport on VMware hypervisor by default

2017-08-28 Thread Dexuan Cui
> From: Dexuan Cui
> Sent: Tuesday, August 22, 2017 21:21
> > ...
> > ...
> > The only problem here would be the potential for a guest and a host app
> to
> > have a conflict wrt port numbers, even though they would be able to
> > operate fine, if restricted to their appropriate transport.
> >
> > Thanks,
> > Jorgen
> 
> Hi Jorgen, Stefan,
> Thank you for the detailed analysis!
> You have a much better understanding than me about the complex
> scenarios. Can you please work out a patch? :-)

Hi Jorgen, Stefan,
May I know your plan for this? 
 
> IMO Linux driver of Hyper-V sockets is the simplest case, as we only have
> the "to host" option (the host side driver of Hyper-V sockets runs on
> Windows kernel and I don't think the other hypervisors emulate
> the full Hyper-V VMBus 4.0, which is required to support Hyper-V sockets).
> 
> -- Dexuan

Thanks,
-- Dexuan


Re: module: Remove const attribute from alias for MODULE_DEVICE_TABLE

2017-08-28 Thread Stefan Agner
On 2017-08-28 10:41, Kees Cook wrote:
> On Mon, Aug 28, 2017 at 10:38 AM, Nick Desaulniers
>  wrote:
>> I think Kees' proposal is a better solution; rather than require all
>> usage of device table to remember to add const, have the macro add it
>> for all users.  Otherwise if you require callers to add it, they may
>> forget.
> 
> And with the coccinelle script, it should be easy to invert the logic
> and remove const from the callers...
> 

I tried to reproduce my findings again but was not successful :-( I must
have changed .config or something in between and drawn wrong
conclusions...

So removing the const in the module.h alias actually did not change
anything... It did not help for drivers which forget to constify... I
think even the alias in module.h was actually illegal according to C
standard:

(C89, 6.2.7p2) "All declarations that refer to the same object or
function shall have compatible type; otherwise the behavior is
undefined."


I guess it would still make sense to constify the structs for most of
the 620 drivers which do not have it const currently. I found some
drivers actually change the table at runtime, e.g.
drivers/net/usb/pegasus.c, so we would have to exclude them.

--
Stefan


linux-next: manual merge of the net-next tree with the net tree

2017-08-28 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  drivers/net/ethernet/marvell/mvpp2.c

between commit:

  4c2286826451 ("net: mvpp2: fix the mac address used when using PPv2.2")

from the net tree and commits:

  09f8397553a2 ("net: mvpp2: introduce per-port nrxqs/ntxqs variables")
  213f428f5056 ("net: mvpp2: add support for TX interrupts and RX queue 
distribution modes")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/net/ethernet/marvell/mvpp2.c
index 4d598ca8503a,fea9ae5b70ba..
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@@ -6504,7 -7248,9 +7248,9 @@@ static int mvpp2_port_probe(struct plat
struct resource *res;
const char *dt_mac_addr;
const char *mac_from;
 -  char hw_mac_addr[ETH_ALEN];
 +  char hw_mac_addr[ETH_ALEN] = {0};
+   unsigned int ntxqs, nrxqs;
+   bool has_tx_irqs;
u32 id;
int features;
int phy_mode;


Re: [PATCH v3 4/5] input: Add MediaTek PMIC keys support

2017-08-28 Thread Chen Zhong
On Mon, 2017-08-28 at 09:57 -0700, Dmitry Torokhov wrote:
> Hi Chen,
> 
> On Fri, Aug 25, 2017 at 02:32:32PM +0800, Chen Zhong wrote:
> > +static int mtk_pmic_key_setup(struct mtk_pmic_keys *keys,
> > +   struct pmic_keys_info *info)
> > +{
> > +   int ret;
> > +
> > +   info->keys = keys;
> > +
> > +   ret = regmap_update_bits(keys->regmap, info->regs->intsel_reg,
> > +info->regs->intsel_mask,
> > +info->regs->intsel_mask);
> > +   if (ret < 0)
> > +   return ret;
> > +
> > +   ret = devm_request_threaded_irq(keys->dev, info->irq, NULL,
> > +   mtk_pmic_keys_irq_handler_thread,
> > +   IRQF_ONESHOT | IRQF_TRIGGER_HIGH,
> > +   "mtk-pmic-keys", info);
> > +   if (ret) {
> > +   dev_err(keys->dev, "Failed to request IRQ: %d: %d\n",
> > +   info->irq, ret);
> > +   return ret;
> > +   }
> > +
> > +   if (info->wakeup)
> > +   irq_set_irq_wake(info->irq, 1);
> 
> Normally we do this in suspend() (and undo in resume()), and I believe
> the driver API is enable_irq_wake() and disable_irq_wake().
> 

Hi Dmitry,

I'll add suspend/resume callback functions and do this with
enable_irq_wake() and disable_irq_wake().

Thank you.


> Thanks.
> 




[PATCH v7 07/11] sparc64: optimized struct page zeroing

2017-08-28 Thread Pavel Tatashin
Add an optimized mm_zero_struct_page(), so struct page's are zeroed without
calling memset(). We do eight to ten regular stores based on the size of
struct page. Compiler optimizes out the conditions of switch() statement.
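
As a rough userspace illustration of the technique (a sketch, not the kernel
macro), the block below switches on a compile-time sizeof() and falls through
to a fixed set of 64-bit stores. The types page72 and page80, the macro
zero_page_like and the helper all_zero are hypothetical, and _Static_assert
stands in for BUILD_BUG_ON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for struct page at two of the allowed sizes. */
struct page72 { uint64_t w[9]; };
struct page80 { uint64_t w[10]; };

/*
 * Zero *(ptr) with plain 64-bit stores. sizeof() is a compile-time
 * constant, so an optimizing compiler can drop the switch and keep
 * only the stores that apply to this size.
 */
#define zero_page_like(ptr) do {                                        \
        uint64_t *_p = (uint64_t *)(ptr);                               \
                                                                        \
        _Static_assert(sizeof(*(ptr)) % 8 == 0, "not a multiple of 8"); \
        _Static_assert(sizeof(*(ptr)) >= 64, "object too small");       \
        _Static_assert(sizeof(*(ptr)) <= 80, "object too large");       \
                                                                        \
        switch (sizeof(*(ptr))) {                                       \
        case 80:                                                        \
                _p[9] = 0;      /* fallthrough */                       \
        case 72:                                                        \
                _p[8] = 0;      /* fallthrough */                       \
        default:                                                        \
                _p[7] = 0;                                              \
                _p[6] = 0;                                              \
                _p[5] = 0;                                              \
                _p[4] = 0;                                              \
                _p[3] = 0;                                              \
                _p[2] = 0;                                              \
                _p[1] = 0;                                              \
                _p[0] = 0;                                              \
        }                                                               \
} while (0)

/* Return 1 if all @n bytes at @p are zero, 0 otherwise. */
static int all_zero(const void *p, size_t n)
{
        const unsigned char *b = p;

        for (size_t i = 0; i < n; i++)
                if (b[i])
                        return 0;
        return 1;
}
```

Filling an object with 0xff via memset(), applying zero_page_like() and then
checking all_zero() is a quick way to confirm every word is covered for a
given size.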

SPARC-M6 with 15T of memory, single thread performance:

                      BASE            FIX             OPTIMIZED_FIX
bootmem_init          28.440467985s   2.305674818s    2.305161615s
free_area_init_nodes  202.845901673s  225.343084508s  172.556506560s

Total                 231.286369658s  227.648759326s  174.861668175s

BASE:  current linux
FIX:   This patch series without "optimized struct page zeroing"
OPTIMIZED_FIX: This patch series including the current patch.

bootmem_init() is where memory for struct pages is zeroed during
allocation. Note, about two seconds in this function is a fixed time: it
does not increase as memory is increased.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 arch/sparc/include/asm/pgtable_64.h | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 6fbd931f0570..cee5cc7ccc51 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -230,6 +230,36 @@ extern unsigned long _PAGE_ALL_SZ_BITS;
 extern struct page *mem_map_zero;
 #define ZERO_PAGE(vaddr)   (mem_map_zero)
 
+/* This macro must be updated when the size of struct page grows above 80
+ * or reduces below 64.
+ * The idea that compiler optimizes out switch() statement, and only
+ * leaves clrx instructions
+ */
+#define mm_zero_struct_page(pp) do {                                   \
+   unsigned long *_pp = (void *)(pp);  \
+   \
+/* Check that struct page is either 64, 72, or 80 bytes */ \
+   BUILD_BUG_ON(sizeof(struct page) & 7);  \
+   BUILD_BUG_ON(sizeof(struct page) < 64); \
+   BUILD_BUG_ON(sizeof(struct page) > 80); \
+   \
+   switch (sizeof(struct page)) {  \
+   case 80:\
+   _pp[9] = 0; /* fallthrough */   \
+   case 72:\
+   _pp[8] = 0; /* fallthrough */   \
+   default:\
+   _pp[7] = 0; \
+   _pp[6] = 0; \
+   _pp[5] = 0; \
+   _pp[4] = 0; \
+   _pp[3] = 0; \
+   _pp[2] = 0; \
+   _pp[1] = 0; \
+   _pp[0] = 0; \
+   }   \
+} while (0)
+
 /* PFNs are real physical page numbers.  However, mem_map only begins to record
  * per-page information starting at pfn_base.  This is to handle systems where
  * the first physical page in the machine is at some huge physical address,
-- 
2.14.1



[PATCH v7 00/11] complete deferred page initialization

2017-08-28 Thread Pavel Tatashin
Changelog:
v7 - v6
- Addressed comments from Michal Hocko
- memblock_discard() patch was removed from this series and integrated
  separately
- Fixed bug reported by kbuild test robot new patch:
  mm: zero reserved and unavailable struct pages
- Removed patch
  x86/mm: reserve only exiting low pages
  As, it is not needed anymore, because of the previous fix
- Re-wrote deferred_init_memmap(), found and fixed an existing bug, where
  page variable is not reset when zone holes present.
- Merged several patches together per Michal request
- Added performance data including raw logs

v6 - v5
- Fixed ARM64 + kasan code, as reported by Ard Biesheuvel
- Tested ARM64 code in qemu and found few more issues, that I fixed in this
  iteration
- Added page roundup/rounddown to x86 and arm zeroing routines to zero the
  whole allocated range, instead of only provided address range.
- Addressed SPARC related comment from Sam Ravnborg
- Fixed section mismatch warnings related to memblock_discard().

v5 - v4
- Fixed build issues reported by kbuild on various configurations

v4 - v3
- Rewrote code to zero struct pages in __init_single_page() as
  suggested by Michal Hocko
- Added code to handle issues related to accessing struct page
  memory before they are initialized.

v3 - v2
- Addressed David Miller comments about one change per patch:
* Splited changes to platforms into 4 patches
* Made "do not zero vmemmap_buf" as a separate patch

v2 - v1
- Per request, added s390 to deferred "struct page" zeroing
- Collected performance data on x86 which proves the importance of
  keeping memset() as prefetch (see below).

SMP machines can benefit from the DEFERRED_STRUCT_PAGE_INIT config option,
which defers initializing struct pages until all cpus have been started so
it can be done in parallel.

However, this feature is sub-optimal, because the deferred page
initialization code expects that the struct pages have already been zeroed,
and the zeroing is done early in boot with a single thread only.  Also, we
access that memory and set flags before struct pages are initialized. All
of this is fixed in this patchset.

In this work we do the following:
- Never read access struct page until it was initialized
- Never set any fields in struct pages before they are initialized
- Zero struct page at the beginning of struct page initialization


==
Performance improvements on x86 machine with 8 nodes:
Intel(R) Xeon(R) CPU E7-8895 v3 @ 2.60GHz and 1T of memory:
                        TIME            SPEED UP
base no deferred:       95.796233s
fix no deferred:        79.978956s      19.77%

base deferred:          77.254713s
fix deferred:           55.050509s      40.34%
==
SPARC M6 3600 MHz with 15T of memory
                        TIME            SPEED UP
base no deferred:       358.335727s
fix no deferred:        302.320936s     18.52%

base deferred:          237.534603s
fix deferred:           182.103003s     30.44%
==
Raw dmesg output with timestamps:
x86 base no deferred:    https://hastebin.com/ofunepurit.scala
x86 base deferred:       https://hastebin.com/ifazegeyas.scala
x86 fix no deferred:     https://hastebin.com/pegocohevo.scala
x86 fix deferred:        https://hastebin.com/ofupevikuk.scala
sparc base no deferred:  https://hastebin.com/ibobeteken.go
sparc base deferred:     https://hastebin.com/fariqimiyu.go
sparc fix no deferred:   https://hastebin.com/muhegoheyi.go
sparc fix deferred:      https://hastebin.com/xadinobutu.go

Pavel Tatashin (11):
  x86/mm: setting fields in deferred pages
  sparc64/mm: setting fields in deferred pages
  mm: deferred_init_memmap improvements
  sparc64: simplify vmemmap_populate
  mm: defining memblock_virt_alloc_try_nid_raw
  mm: zero struct pages during initialization
  sparc64: optimized struct page zeroing
  mm: zero reserved and unavailable struct pages
  x86/kasan: explicitly zero kasan shadow memory
  arm64/kasan: explicitly zero kasan shadow memory
  mm: stop zeroing memory during allocation in vmemmap

 arch/arm64/mm/kasan_init.c  |  42 
 arch/sparc/include/asm/pgtable_64.h |  30 ++
 arch/sparc/mm/init_64.c |  31 +++---
 arch/x86/mm/init_64.c   |   9 +-
 arch/x86/mm/kasan_init_64.c |  66 
 include/linux/bootmem.h |  27 +
 include/linux/memblock.h|  16 +++
 include/linux/mm.h  |  26 +
 mm/memblock.c   |  60 +--
 mm/page_alloc.c | 207 
 mm/sparse-vmemmap.c |  14 +--
 mm/sparse.c |   6 +-
 12 files changed, 406 insertions(+), 128 deletions(-)

-- 
2.14.1



[PATCH v7 09/11] x86/kasan: explicitly zero kasan shadow memory

2017-08-28 Thread Pavel Tatashin
To optimize the performance of struct page initialization,
vmemmap_populate() will no longer zero memory.

We must explicitly zero the memory that is allocated by vmemmap_populate()
for kasan, as this memory does not go through struct page initialization
path.
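
The range computation can be sketched in userspace as follows. This is an
illustration under stated assumptions, not the patch's code: SHADOW_OFFSET is
a made-up value, the names mem_to_shadow, shadow_range and
shadow_range_to_zero are hypothetical, and a single power-of-two page size is
passed in where the patch looks up the mapping size for each address. The
8-to-1 scale matches generic KASAN, where one shadow byte tracks eight bytes
of memory.

```c
#include <assert.h>
#include <stdint.h>

/* Generic KASAN tracks 8 bytes of memory with 1 shadow byte. */
#define SHADOW_SCALE_SHIFT 3

/* Hypothetical shadow offset; the real value depends on the configuration. */
#define SHADOW_OFFSET 0x100000000ULL

/* Half-open shadow address range [start, end). */
struct shadow_range {
        uint64_t start;
        uint64_t end;
};

/* Translate a memory address to the address of its shadow byte. */
static uint64_t mem_to_shadow(uint64_t addr)
{
        return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/*
 * Compute the shadow range covering memory [mem_start, mem_end), widened
 * to whole mapped pages of @pgsz bytes. @pgsz must be a power of two.
 */
static struct shadow_range shadow_range_to_zero(uint64_t mem_start,
                                                uint64_t mem_end,
                                                uint64_t pgsz)
{
        struct shadow_range r;

        r.start = mem_to_shadow(mem_start) & ~(pgsz - 1);
        r.end = (mem_to_shadow(mem_end) + pgsz - 1) & ~(pgsz - 1);
        return r;
}
```

The caller would then clear end - start bytes beginning at start, which is
what the memset() calls in the patch do for each mapped region.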

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 arch/x86/mm/kasan_init_64.c | 66 +
 1 file changed, 66 insertions(+)

diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 02c9d7553409..96fde5bf9597 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -84,6 +84,66 @@ static struct notifier_block kasan_die_notifier = {
 };
 #endif
 
+/*
+ * x86 variant of vmemmap_populate() uses either PMD_SIZE pages or base pages
+ * to map allocated memory.  This routine determines the page size for the given
+ * address from vmemmap.
+ */
+static u64 get_vmemmap_pgsz(u64 addr)
+{
+   pgd_t *pgd;
+   p4d_t *p4d;
+   pud_t *pud;
+   pmd_t *pmd;
+
+   pgd = pgd_offset_k(addr);
+   BUG_ON(pgd_none(*pgd) || pgd_large(*pgd));
+
+   p4d = p4d_offset(pgd, addr);
+   BUG_ON(p4d_none(*p4d) || p4d_large(*p4d));
+
+   pud = pud_offset(p4d, addr);
+   BUG_ON(pud_none(*pud) || pud_large(*pud));
+
+   pmd = pmd_offset(pud, addr);
+   BUG_ON(pmd_none(*pmd));
+
+   if (pmd_large(*pmd))
+   return PMD_SIZE;
+   return PAGE_SIZE;
+}
+
+/*
+ * Memory that was allocated by vmemmap_populate is not zeroed, so we must
+ * zero it here explicitly.
+ */
+static void
+zero_vmemmap_populated_memory(void)
+{
+   u64 i, start, end;
+
+   for (i = 0; i < E820_MAX_ENTRIES && pfn_mapped[i].end; i++) {
+   void *kaddr_start = pfn_to_kaddr(pfn_mapped[i].start);
+   void *kaddr_end = pfn_to_kaddr(pfn_mapped[i].end);
+
+   start = (u64)kasan_mem_to_shadow(kaddr_start);
+   end = (u64)kasan_mem_to_shadow(kaddr_end);
+
+   /* Round to the start end of the mapped pages */
+   start = rounddown(start, get_vmemmap_pgsz(start));
+   end = roundup(end, get_vmemmap_pgsz(start));
+   memset((void *)start, 0, end - start);
+   }
+
+   start = (u64)kasan_mem_to_shadow(_stext);
+   end = (u64)kasan_mem_to_shadow(_end);
+
+   /* Round to the start end of the mapped pages */
+   start = rounddown(start, get_vmemmap_pgsz(start));
+   end = roundup(end, get_vmemmap_pgsz(start));
+   memset((void *)start, 0, end - start);
+}
+
 void __init kasan_early_init(void)
 {
int i;
@@ -146,6 +206,12 @@ void __init kasan_init(void)
load_cr3(init_top_pgt);
__flush_tlb_all();
 
+   /*
+* vmemmap_populate does not zero the memory, so we need to zero it
+* explicitly
+*/
+   zero_vmemmap_populated_memory();
+
/*
 * kasan_zero_page has been used as early shadow memory, thus it may
 * contain some garbage. Now we can clear and write protect it, since
-- 
2.14.1



[PATCH v7 04/11] sparc64: simplify vmemmap_populate

2017-08-28 Thread Pavel Tatashin
Remove duplicated code by using common functions
vmemmap_pud_populate and vmemmap_pgd_populate.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 arch/sparc/mm/init_64.c | 23 ++-
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 12dbba85a2e2..a603d2c9087d 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2611,30 +2611,19 @@ int __meminit vmemmap_populate(unsigned long vstart, unsigned long vend,
vstart = vstart & PMD_MASK;
vend = ALIGN(vend, PMD_SIZE);
for (; vstart < vend; vstart += PMD_SIZE) {
-   pgd_t *pgd = pgd_offset_k(vstart);
+   pgd_t *pgd = vmemmap_pgd_populate(vstart, node);
unsigned long pte;
pud_t *pud;
pmd_t *pmd;
 
-   if (pgd_none(*pgd)) {
-   pud_t *new = vmemmap_alloc_block(PAGE_SIZE, node);
+   if (!pgd)
+   return -ENOMEM;
 
-   if (!new)
-   return -ENOMEM;
-   pgd_populate(&init_mm, pgd, new);
-   }
-
-   pud = pud_offset(pgd, vstart);
-   if (pud_none(*pud)) {
-   pmd_t *new = vmemmap_alloc_block(PAGE_SIZE, node);
-
-   if (!new)
-   return -ENOMEM;
-   pud_populate(&init_mm, pud, new);
-   }
+   pud = vmemmap_pud_populate(pgd, vstart, node);
+   if (!pud)
+   return -ENOMEM;
 
pmd = pmd_offset(pud, vstart);
-
pte = pmd_val(*pmd);
if (!(pte & _PAGE_VALID)) {
void *block = vmemmap_alloc_block(PMD_SIZE, node);
-- 
2.14.1



[PATCH v7 08/11] mm: zero reserved and unavailable struct pages

2017-08-28 Thread Pavel Tatashin
Some memory is reserved but unavailable: not present in memblock.memory
(because not backed by physical pages), but present in memblock.reserved.
Such memory has backing struct pages, but they are not initialized by going
through __init_single_page().

In some cases these struct pages are accessed even if they do not contain
any data. One example is page_to_pfn() might access page->flags if this is
where section information is stored (CONFIG_SPARSEMEM,
SECTION_IN_PAGE_FLAGS).

Since struct pages are zeroed in __init_single_page(), and not during
allocation time, we must zero such struct pages explicitly.

The patch involves adding a new memblock iterator:
for_each_resv_unavail_range(i, p_start, p_end)

Which iterates through reserved && !memory lists, and we zero struct pages
explicitly by calling mm_zero_struct_page().

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 include/linux/memblock.h | 16 
 include/linux/mm.h   |  6 ++
 mm/page_alloc.c  | 30 ++
 3 files changed, 52 insertions(+)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index bae11c7e7bf3..bdd4268f9323 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -237,6 +237,22 @@ unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
for_each_mem_range_rev(i, &memblock.memory, &memblock.reserved, \
   nid, flags, p_start, p_end, p_nid)
 
+/**
+ * for_each_resv_unavail_range - iterate through reserved and unavailable memory
+ * @i: u64 used as loop variable
+ * @flags: pick from blocks based on memory attributes
+ * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
+ * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
+ *
+ * Walks over unavailable but reserved (reserved && !memory) areas of memblock.
+ * Available as soon as memblock is initialized.
+ * Note: because this memory does not belong to any physical node, flags and
+ * nid arguments do not make sense and thus not exported as arguments.
+ */
+#define for_each_resv_unavail_range(i, p_start, p_end) \
+   for_each_mem_range(i, &memblock.reserved, &memblock.memory, \
+  NUMA_NO_NODE, MEMBLOCK_NONE, p_start, p_end, NULL)
+
 static inline void memblock_set_region_flags(struct memblock_region *r,
 unsigned long flags)
 {
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 183ac5e733db..0a440ff8f226 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1968,6 +1968,12 @@ extern int __meminit __early_pfn_to_nid(unsigned long pfn,
struct mminit_pfnnid_cache *state);
 #endif
 
+#ifdef CONFIG_HAVE_MEMBLOCK
+void zero_resv_unavail(void);
+#else
+static inline void __paginginit zero_resv_unavail(void) {}
+#endif
+
 extern void set_dma_reserve(unsigned long new_dma_reserve);
 extern void memmap_init_zone(unsigned long, int, unsigned long,
unsigned long, enum memmap_context);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4d67fe3dd172..484c16fb5f0d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6261,6 +6261,34 @@ void __paginginit free_area_init_node(int nid, unsigned long *zones_size,
free_area_init_core(pgdat);
 }
 
+#ifdef CONFIG_HAVE_MEMBLOCK
+/*
+ * Only struct pages that are backed by physical memory are zeroed and
+ * initialized by going through __init_single_page(). But, there are some
+ * struct pages which are reserved in memblock allocator and their fields
+ * may be accessed (for example page_to_pfn() on some configuration accesses
+ * flags). We must explicitly zero those struct pages.
+ */
+void __paginginit zero_resv_unavail(void)
+{
+   phys_addr_t start, end;
+   unsigned long pfn;
+   u64 i, pgcnt;
+
+   /* Loop through ranges that are reserved, but do not have reported
+* physical memory backing.
+*/
+   pgcnt = 0;
+   for_each_resv_unavail_range(i, &start, &end) {
+   for (pfn = PFN_DOWN(start); pfn < PFN_UP(end); pfn++) {
+   mm_zero_struct_page(pfn_to_page(pfn));
+   pgcnt++;
+   }
+   }
+   pr_info("Reserved but unavailable: %lld pages", pgcnt);
+}
+#endif /* CONFIG_HAVE_MEMBLOCK */
+
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 
 #if MAX_NUMNODES > 1
@@ -6684,6 +6712,7 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
node_set_state(nid, N_MEMORY);
check_for_memory(pgdat, nid);
}
+   zero_resv_unavail();
 }
 
 static int __init cmdline_parse_core(char *p, unsigned long *core)
@@ -6847,6 +6876,7 @@ void __init free_area_init(unsigned long *zones_size)
 {
free_area_init_node(0, zones_size,
__pa(PAGE_OFF

[PATCH v7 10/11] arm64/kasan: explicitly zero kasan shadow memory

2017-08-28 Thread Pavel Tatashin
To optimize the performance of struct page initialization,
vmemmap_populate() will no longer zero memory.

We must explicitly zero the memory that is allocated by vmemmap_populate()
for kasan, as this memory does not go through struct page initialization
path.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 arch/arm64/mm/kasan_init.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 81f03959a4ab..e78a9ecbb687 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -135,6 +135,41 @@ static void __init clear_pgds(unsigned long start,
set_pgd(pgd_offset_k(start), __pgd(0));
 }
 
+/*
+ * Memory that was allocated by vmemmap_populate is not zeroed, so we must
+ * zero it here explicitly.
+ */
+static void
+zero_vmemmap_populated_memory(void)
+{
+   struct memblock_region *reg;
+   u64 start, end;
+
+   for_each_memblock(memory, reg) {
+   start = __phys_to_virt(reg->base);
+   end = __phys_to_virt(reg->base + reg->size);
+
+   if (start >= end)
+   break;
+
+   start = (u64)kasan_mem_to_shadow((void *)start);
+   end = (u64)kasan_mem_to_shadow((void *)end);
+
+   /* Round to the start end of the mapped pages */
+   start = round_down(start, SWAPPER_BLOCK_SIZE);
+   end = round_up(end, SWAPPER_BLOCK_SIZE);
+   memset((void *)start, 0, end - start);
+   }
+
+   start = (u64)kasan_mem_to_shadow(_text);
+   end = (u64)kasan_mem_to_shadow(_end);
+
+   /* Round to the start end of the mapped pages */
+   start = round_down(start, SWAPPER_BLOCK_SIZE);
+   end = round_up(end, SWAPPER_BLOCK_SIZE);
+   memset((void *)start, 0, end - start);
+}
+
 void __init kasan_init(void)
 {
u64 kimg_shadow_start, kimg_shadow_end;
@@ -205,8 +240,15 @@ void __init kasan_init(void)
pfn_pte(sym_to_pfn(kasan_zero_page), PAGE_KERNEL_RO));
 
memset(kasan_zero_page, 0, PAGE_SIZE);
+
cpu_replace_ttbr1(lm_alias(swapper_pg_dir));
 
+   /*
+* vmemmap_populate does not zero the memory, so we need to zero it
+* explicitly
+*/
+   zero_vmemmap_populated_memory();
+
/* At this point kasan is fully initialized. Enable error messages */
init_task.kasan_depth = 0;
pr_info("KernelAddressSanitizer initialized\n");
-- 
2.14.1
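
The range handling in zero_vmemmap_populated_memory() in the patch above
(round the range outwards to the mapping granularity, then memset) can be
tried outside the kernel. The following is only a user-space sketch under
stated assumptions: BLOCK_SIZE, round_down_pow2(), round_up_pow2() and
zero_covering_blocks() are hypothetical names standing in for
SWAPPER_BLOCK_SIZE and the kernel helpers, and the buffer stands in for the
shadow mapping. It is not kernel code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical block size standing in for SWAPPER_BLOCK_SIZE.
 * The rounding helpers below require a power of two. */
#define BLOCK_SIZE 64u

/* Round x down to a multiple of the power-of-two a. */
static size_t round_down_pow2(size_t x, size_t a)
{
	return x & ~(a - 1);
}

/* Round x up to a multiple of the power-of-two a. */
static size_t round_up_pow2(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Zero every block of buf that overlaps the byte range [start, end).
 * Offsets are relative to buf. The caller must make sure that buf is
 * at least round_up_pow2(end, BLOCK_SIZE) bytes long.
 * Returns the number of bytes that were zeroed.
 */
static size_t zero_covering_blocks(unsigned char *buf, size_t start, size_t end)
{
	if (start >= end)
		return 0;

	start = round_down_pow2(start, BLOCK_SIZE);
	end = round_up_pow2(end, BLOCK_SIZE);
	memset(buf + start, 0, end - start);

	return end - start;
}
```

With a 64-byte block, a request for bytes 70..129 zeroes bytes 64..191, so
the zeroed area is always a whole number of blocks, at the cost of touching
some bytes outside the requested range.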



[PATCH v7 02/11] sparc64/mm: setting fields in deferred pages

2017-08-28 Thread Pavel Tatashin
Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to first
initializing struct pages by going through __init_single_page().

With deferred struct page feature enabled there is a case where we set some
fields prior to initializing:

mem_init() {
 register_page_bootmem_info();
 free_all_bootmem();
 ...
}

When register_page_bootmem_info() is called, only non-deferred struct pages
are initialized. But this function goes through some reserved pages which
might be among the deferred pages, and thus are not yet initialized.

mem_init
 register_page_bootmem_info
  register_page_bootmem_info_node
   get_page_bootmem
    .. setting fields here ..
    such as: page->freelist = (void *)type;

 free_all_bootmem()
  free_low_memory_core_early()
   for_each_reserved_mem_region()
    reserve_bootmem_region()
     init_reserved_page() <- Only if this is deferred reserved page
      __init_single_pfn()
       __init_single_page()
        memset(0) <-- Lose the set fields here

We end up with a similar issue as in the previous patch: currently we do
not observe a problem because the memory is zeroed, but if flag asserts are
changed we can start hitting issues.

Also, because in this patch series we will stop zeroing struct page memory
during allocation, we must make sure that struct pages are properly
initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 arch/sparc/mm/init_64.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index b3020a956b87..12dbba85a2e2 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2508,9 +2508,15 @@ void __init mem_init(void)
 {
high_memory = __va(last_valid_pfn << PAGE_SHIFT);
 
-   register_page_bootmem_info();
free_all_bootmem();
 
+   /* Must be done after boot memory is put on freelist, because here we
+* might set fields in deferred struct pages that have not yet been
+* initialized, and free_all_bootmem() initializes all the reserved
+* deferred pages for us.
+*/
+   register_page_bootmem_info();
+
/*
 * Set up the zero page, mark it reserved, so that page count
 * is not manipulated when freeing the page from user ptes.
-- 
2.14.1
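
The ordering hazard described in the commit message above can be reproduced
in a few lines of user-space C. This is a minimal sketch, not kernel code:
struct fake_page, init_single_page(), register_bootmem_info(), old_order()
and new_order() are hypothetical stand-ins for struct page,
__init_single_page() and get_page_bootmem(), and the flag value is
arbitrary.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for struct page. */
struct fake_page {
	unsigned long flags;
	void *freelist;
};

/* Models __init_single_page(): zero the page, then set initial fields. */
static void init_single_page(struct fake_page *p)
{
	memset(p, 0, sizeof(*p));
	p->flags = 1UL;	/* arbitrary "initialized" marker for this sketch */
}

/* Models get_page_bootmem(): records a type in the page. */
static void register_bootmem_info(struct fake_page *p, unsigned long type)
{
	p->freelist = (void *)type;
}

/* Old order: register first, initialize second. The type is wiped. */
static void *old_order(struct fake_page *p, unsigned long type)
{
	register_bootmem_info(p, type);
	init_single_page(p);
	return p->freelist;
}

/* New order: initialize first, register second. The type survives. */
static void *new_order(struct fake_page *p, unsigned long type)
{
	init_single_page(p);
	register_bootmem_info(p, type);
	return p->freelist;
}
```

In this model old_order() always returns NULL because the zeroing step runs
after the field was set, while new_order() returns the recorded type. That
is the same effect the patch gets by moving register_page_bootmem_info()
after free_all_bootmem().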



[PATCH v7 11/11] mm: stop zeroing memory during allocation in vmemmap

2017-08-28 Thread Pavel Tatashin
vmemmap_alloc_block() will no longer zero the block, so zero memory
at its call sites for everything except struct pages.  Struct page memory
is zero'd by struct page initialization.

Replace allocators in sparse-vmemmap to use the non-zeroing version. This
gives us the performance improvement of zeroing the memory in parallel,
at the time struct pages are zeroed.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 include/linux/mm.h  | 11 +++
 mm/sparse-vmemmap.c | 14 +++---
 mm/sparse.c |  6 +++---
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0a440ff8f226..fba540aef1da 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2456,6 +2456,17 @@ static inline void *vmemmap_alloc_block_buf(unsigned 
long size, int node)
return __vmemmap_alloc_block_buf(size, node, NULL);
 }
 
+static inline void *vmemmap_alloc_block_zero(unsigned long size, int node)
+{
+   void *p = vmemmap_alloc_block(size, node);
+
+   if (!p)
+   return NULL;
+   memset(p, 0, size);
+
+   return p;
+}
+
 void vmemmap_verify(pte_t *, int, unsigned long, unsigned long);
 int vmemmap_populate_basepages(unsigned long start, unsigned long end,
   int node);
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index c50b1a14d55e..423d4da85a91 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -41,7 +41,7 @@ static void * __ref __earlyonly_bootmem_alloc(int node,
unsigned long align,
unsigned long goal)
 {
-   return memblock_virt_alloc_try_nid(size, align, goal,
+   return memblock_virt_alloc_try_nid_raw(size, align, goal,
BOOTMEM_ALLOC_ACCESSIBLE, node);
 }
 
@@ -56,11 +56,11 @@ void * __meminit vmemmap_alloc_block(unsigned long size, 
int node)
 
if (node_state(node, N_HIGH_MEMORY))
page = alloc_pages_node(
-   node, GFP_KERNEL | __GFP_ZERO | 
__GFP_RETRY_MAYFAIL,
+   node, GFP_KERNEL | __GFP_RETRY_MAYFAIL,
get_order(size));
else
page = alloc_pages(
-   GFP_KERNEL | __GFP_ZERO | __GFP_RETRY_MAYFAIL,
+   GFP_KERNEL | __GFP_RETRY_MAYFAIL,
get_order(size));
if (page)
return page_address(page);
@@ -188,7 +188,7 @@ pmd_t * __meminit vmemmap_pmd_populate(pud_t *pud, unsigned 
long addr, int node)
 {
pmd_t *pmd = pmd_offset(pud, addr);
if (pmd_none(*pmd)) {
-   void *p = vmemmap_alloc_block(PAGE_SIZE, node);
+   void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node);
if (!p)
return NULL;
pmd_populate_kernel(&init_mm, pmd, p);
@@ -200,7 +200,7 @@ pud_t * __meminit vmemmap_pud_populate(p4d_t *p4d, unsigned 
long addr, int node)
 {
pud_t *pud = pud_offset(p4d, addr);
if (pud_none(*pud)) {
-   void *p = vmemmap_alloc_block(PAGE_SIZE, node);
+   void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node);
if (!p)
return NULL;
pud_populate(&init_mm, pud, p);
@@ -212,7 +212,7 @@ p4d_t * __meminit vmemmap_p4d_populate(pgd_t *pgd, unsigned 
long addr, int node)
 {
p4d_t *p4d = p4d_offset(pgd, addr);
if (p4d_none(*p4d)) {
-   void *p = vmemmap_alloc_block(PAGE_SIZE, node);
+   void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node);
if (!p)
return NULL;
p4d_populate(&init_mm, p4d, p);
@@ -224,7 +224,7 @@ pgd_t * __meminit vmemmap_pgd_populate(unsigned long addr, 
int node)
 {
pgd_t *pgd = pgd_offset_k(addr);
if (pgd_none(*pgd)) {
-   void *p = vmemmap_alloc_block(PAGE_SIZE, node);
+   void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node);
if (!p)
return NULL;
pgd_populate(&init_mm, pgd, p);
diff --git a/mm/sparse.c b/mm/sparse.c
index 7b4be3fd5cac..0e315766ad11 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -441,9 +441,9 @@ void __init sparse_mem_maps_populate_node(struct page 
**map_map,
}
 
size = PAGE_ALIGN(size);
-   map = memblock_virt_alloc_try_nid(size * map_count,
- PAGE_SIZE, __pa(MAX_DMA_ADDRESS),
- BOOTMEM_ALLOC_ACCESSIBLE, nodeid);
+   map = memblock_virt_alloc_try_nid_raw(size * map_count,
+ PAGE_SIZE, __pa(MAX_DMA_ADDRESS),
+ BOOTMEM_ALLOC_ACCESSIBLE, nodeid);

[PATCH v7 05/11] mm: defining memblock_virt_alloc_try_nid_raw

2017-08-28 Thread Pavel Tatashin
* A new variant of memblock_virt_alloc_* allocations:
memblock_virt_alloc_try_nid_raw()
- Does not zero the allocated memory
- Does not panic if request cannot be satisfied

* optimize early system hash allocations

Clients can call alloc_large_system_hash() with flag: HASH_ZERO to specify
that memory that was allocated for system hash needs to be zeroed,
otherwise the memory does not need to be zeroed, and client will initialize
it.

If memory does not need to be zero'd, call the new
memblock_virt_alloc_raw() interface, and thus improve the boot performance.

* debug for raw allocator

When CONFIG_DEBUG_VM is enabled, this patch sets all the memory that is
returned by memblock_virt_alloc_try_nid_raw() to ones to ensure that no
places expect zeroed memory.
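
The contract described above (no zeroing, no panic on failure, and a debug
fill of ones) can be sketched in user space as follows. This is only an
illustration under stated assumptions: alloc_raw(), alloc_zeroed() and
SKETCH_DEBUG are hypothetical names, malloc() stands in for the memblock
allocator, and SKETCH_DEBUG plays the role of CONFIG_DEBUG_VM.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical switch playing the role of CONFIG_DEBUG_VM in this sketch. */
#define SKETCH_DEBUG 1

/*
 * Non-zeroing allocator. Returns NULL on failure instead of aborting.
 * With SKETCH_DEBUG set, the block is filled with ones so that a caller
 * which wrongly expects zeroed memory fails visibly.
 */
static void *alloc_raw(size_t size)
{
	void *p = malloc(size);

	if (p && SKETCH_DEBUG)
		memset(p, 0xff, size);
	return p;
}

/* Zeroing wrapper for the callers that really need zeroed memory. */
static void *alloc_zeroed(size_t size)
{
	void *p = alloc_raw(size);

	if (p)
		memset(p, 0, size);
	return p;
}
```

Callers that initialize every byte themselves would use alloc_raw() and
skip the memset; everyone else keeps the zeroing behavior through
alloc_zeroed().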

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
Acked-by: Michal Hocko 
---
 include/linux/bootmem.h | 27 ++
 mm/memblock.c   | 60 +++--
 mm/page_alloc.c | 15 ++---
 3 files changed, 87 insertions(+), 15 deletions(-)

diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index e223d91b6439..ea30b3987282 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -160,6 +160,9 @@ extern void *__alloc_bootmem_low_node(pg_data_t *pgdat,
 #define BOOTMEM_ALLOC_ANYWHERE (~(phys_addr_t)0)
 
 /* FIXME: Move to memblock.h at a point where we remove nobootmem.c */
+void *memblock_virt_alloc_try_nid_raw(phys_addr_t size, phys_addr_t align,
+ phys_addr_t min_addr,
+ phys_addr_t max_addr, int nid);
 void *memblock_virt_alloc_try_nid_nopanic(phys_addr_t size,
phys_addr_t align, phys_addr_t min_addr,
phys_addr_t max_addr, int nid);
@@ -176,6 +179,14 @@ static inline void * __init memblock_virt_alloc(
NUMA_NO_NODE);
 }
 
+static inline void * __init memblock_virt_alloc_raw(
+   phys_addr_t size,  phys_addr_t align)
+{
+   return memblock_virt_alloc_try_nid_raw(size, align, BOOTMEM_LOW_LIMIT,
+   BOOTMEM_ALLOC_ACCESSIBLE,
+   NUMA_NO_NODE);
+}
+
 static inline void * __init memblock_virt_alloc_nopanic(
phys_addr_t size, phys_addr_t align)
 {
@@ -257,6 +268,14 @@ static inline void * __init memblock_virt_alloc(
return __alloc_bootmem(size, align, BOOTMEM_LOW_LIMIT);
 }
 
+static inline void * __init memblock_virt_alloc_raw(
+   phys_addr_t size,  phys_addr_t align)
+{
+   if (!align)
+   align = SMP_CACHE_BYTES;
+   return __alloc_bootmem_nopanic(size, align, BOOTMEM_LOW_LIMIT);
+}
+
 static inline void * __init memblock_virt_alloc_nopanic(
phys_addr_t size, phys_addr_t align)
 {
@@ -309,6 +328,14 @@ static inline void * __init 
memblock_virt_alloc_try_nid(phys_addr_t size,
  min_addr);
 }
 
+static inline void * __init memblock_virt_alloc_try_nid_raw(
+   phys_addr_t size, phys_addr_t align,
+   phys_addr_t min_addr, phys_addr_t max_addr, int nid)
+{
+   return ___alloc_bootmem_node_nopanic(NODE_DATA(nid), size, align,
+   min_addr, max_addr);
+}
+
 static inline void * __init memblock_virt_alloc_try_nid_nopanic(
phys_addr_t size, phys_addr_t align,
phys_addr_t min_addr, phys_addr_t max_addr, int nid)
diff --git a/mm/memblock.c b/mm/memblock.c
index 91205780e6b1..1f299fb1eb08 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1327,7 +1327,6 @@ static void * __init memblock_virt_alloc_internal(
return NULL;
 done:
ptr = phys_to_virt(alloc);
-   memset(ptr, 0, size);
 
/*
 * The min_count is set to 0 so that bootmem allocated blocks
@@ -1340,6 +1339,45 @@ static void * __init memblock_virt_alloc_internal(
return ptr;
 }
 
+/**
+ * memblock_virt_alloc_try_nid_raw - allocate boot memory block without zeroing
+ * memory and without panicking
+ * @size: size of memory block to be allocated in bytes
+ * @align: alignment of the region and block's size
+ * @min_addr: the lower bound of the memory region from where the allocation
+ *   is preferred (phys address)
+ * @max_addr: the upper bound of the memory region from where the allocation
+ *   is preferred (phys address), or %BOOTMEM_ALLOC_ACCESSIBLE to
+ *   allocate only from memory limited by memblock.current_limit value
+ * @nid: nid of the free area to find, %NUMA_NO_NODE for any node
+ *
+ * Public function, provides additional debug information (including caller
+ * info), if enabled. Does not zero allocated mem

Re: [PATCH V1] pinctrl: qcom: spmi-gpio: Add support for qcom,gpios-disallowed property

2017-08-28 Thread Fenglin Wu

On 8/29/2017 9:51 AM, Shawn Guo wrote:

> On Tue, Aug 29, 2017 at 09:03:02AM +0800, Fenglin Wu wrote:
>> I agree the GPIO's ownership is configurable and it always configured at
>> the very beginning of the device boot up which is not visible by linux
>> kernel drivers/image. Normally, this configuration is fixed in one
>> platform and it's been protected and not allowed to be configured in
>> linux kernel driver. So from linux driver point of view, this is a
>> hardware configuration. I agree the coming patch "spmi: pmic-arb: Move
>> the ownership check to irq_chip callback" would fix the pinctrl-
>> spmi-gpio driver probe failure caused by the ownership mismatch, but
>> this is just hiding the mistake of the kernel configured the GPIOs which
>> not owned by APPS processor.
>
> The kernel does everything just right, using the GPIO that device tree
> tells to use.  If there is something wrong about ownership check, it
> should be fault of that device tree specifies the wrong GPIO, or
> firmware doesn't configure ownership as needed.
>
> Shawn



If you think it is correct for the driver to register pins for GPIOs that
are not owned by the APPS processor, then this patch is not needed.
I agree with the others.
Thanks

Fenglin


>> And these GPIOs will be registered
>> successfully as pinctrl pins and any APPS processor consumer drivers
>> could use this pins. This is not correct even the select_state operation
>> for these pins would failed due to the mode protection in spmi write_cmd
>> calling. I am thinking that not allowing these pins to be register as
>> pinctrl pins should be more straightforward and easy understanding. So I
>> think this patch still have value even the probe failure has been fixed
>> by the coming spmi patch.


--
Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


[PATCH v7 06/11] mm: zero struct pages during initialization

2017-08-28 Thread Pavel Tatashin
Add struct page zeroing as a part of initialization of other fields in
__init_single_page().

Single-threaded performance, collected on an Intel(R) Xeon(R) CPU E7-8895
v3 @ 2.60GHz with 1T of memory (268400646 pages in 8 nodes):

                         BASE            FIX
sparse_init     11.244671836s   0.007199623s
zone_sizes_init  4.879775891s   8.355182299s
                ----------------------------
Total           16.124447727s   8.362381922s

sparse_init is where memory for struct pages is zeroed, and the zeroing
part is moved later in this patch into __init_single_page(), which is
called from zone_sizes_init().

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
Acked-by: Michal Hocko 
---
 include/linux/mm.h | 9 +
 mm/page_alloc.c| 1 +
 2 files changed, 10 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 46b9ac5e8569..183ac5e733db 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -93,6 +93,15 @@ extern int mmap_rnd_compat_bits __read_mostly;
 #define mm_forbids_zeropage(X) (0)
 #endif
 
+/*
+ * On some architectures it is expensive to call memset() for small sizes.
+ * Those architectures should provide their own implementation of "struct page"
+ * zeroing by defining this macro in .
+ */
+#ifndef mm_zero_struct_page
+#define mm_zero_struct_page(pp)  ((void)memset((pp), 0, sizeof(struct page)))
+#endif
+
 /*
  * Default maximum number of active map areas, this limits the number of vmas
  * per mm struct. Users can overwrite this number by sysctl but there is a
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8293815ca85d..4d67fe3dd172 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1169,6 +1169,7 @@ static void free_one_page(struct zone *zone,
 static void __meminit __init_single_page(struct page *page, unsigned long pfn,
unsigned long zone, int nid)
 {
+   mm_zero_struct_page(page);
set_page_links(page, zone, nid, pfn);
init_page_count(page);
page_mapcount_reset(page);
-- 
2.14.1
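
The overridable-macro pattern used by mm_zero_struct_page() in the patch
above can be shown in isolation. What follows is a minimal user-space
sketch, assuming a reduced structure: struct demo_page, zero_demo_page()
and init_demo_page() are hypothetical names, not kernel interfaces.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for struct page. */
struct demo_page {
	unsigned long flags;
	int count;
	void *priv;	/* never set by init_demo_page(); relies on zeroing */
};

/*
 * Default zeroing helper. A platform header included earlier could define
 * zero_demo_page() itself, in which case this fallback is skipped.
 */
#ifndef zero_demo_page
#define zero_demo_page(pp) ((void)memset((pp), 0, sizeof(struct demo_page)))
#endif

/* Zeroing is the first step of initialization, as in the patch. */
static void init_demo_page(struct demo_page *page, unsigned long flags)
{
	zero_demo_page(page);
	page->flags = flags;
	page->count = 1;
}
```

Because the zeroing happens inside the init function, the allocator that
hands out the backing memory no longer has to clear it, and fields that the
init function does not touch still start out as zero.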



[PATCH v7 01/11] x86/mm: setting fields in deferred pages

2017-08-28 Thread Pavel Tatashin
Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to first
initializing struct pages by going through __init_single_page().

With deferred struct page feature enabled, however, we set fields in
register_page_bootmem_info that are subsequently clobbered right after in
free_all_bootmem:

mem_init() {
register_page_bootmem_info();
free_all_bootmem();
...
}

When register_page_bootmem_info() is called, only non-deferred struct pages
are initialized. But this function goes through some reserved pages which
might be among the deferred pages, and thus are not yet initialized.

mem_init
 register_page_bootmem_info
  register_page_bootmem_info_node
   get_page_bootmem
    .. setting fields here ..
    such as: page->freelist = (void *)type;

 free_all_bootmem()
  free_low_memory_core_early()
   for_each_reserved_mem_region()
    reserve_bootmem_region()
     init_reserved_page() <- Only if this is deferred reserved page
      __init_single_pfn()
       __init_single_page()
        memset(0) <-- Lose the set fields here

The result is a latent issue: currently we do not observe a problem because
the memory is explicitly zeroed, but if flag asserts are changed we can
start hitting issues.

Also, because in this patch series we will stop zeroing struct page memory
during allocation, we must make sure that struct pages are properly
initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 arch/x86/mm/init_64.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 62a91e6b1237..3a997352a992 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1174,12 +1174,17 @@ void __init mem_init(void)
 
/* clear_bss() already clear the empty_zero_page */
 
-   register_page_bootmem_info();
-
/* this will put all memory onto the freelists */
free_all_bootmem();
after_bootmem = 1;
 
+   /* Must be done after boot memory is put on freelist, because here we
+* might set fields in deferred struct pages that have not yet been
+* initialized, and free_all_bootmem() initializes all the reserved
+* deferred pages for us.
+*/
+   register_page_bootmem_info();
+
/* Register memory areas for /proc/kcore */
kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR,
 PAGE_SIZE, KCORE_OTHER);
-- 
2.14.1



[PATCH v7 03/11] mm: deferred_init_memmap improvements

2017-08-28 Thread Pavel Tatashin
This patch fixes two issues in deferred_init_memmap

=
In deferred_init_memmap() where all deferred struct pages are initialized
we have a check like this:

if (page->flags) {
VM_BUG_ON(page_zone(page) != zone);
goto free_range;
}

This way we are checking if the current deferred page has already been
initialized. It works, because memory for struct pages has been zeroed, and
the only way flags are not zero is if it went through __init_single_page()
before.  But once we change the current behavior and no longer zero the
memory in the memblock allocator, we cannot trust anything inside
"struct page"es until they are initialized. This patch fixes this.

The deferred_init_memmap() is re-written to loop through only free memory
ranges provided by memblock.

=
This patch fixes another existing issue on systems that have holes in
zones, i.e. CONFIG_HOLES_IN_ZONE is defined.

In for_each_mem_pfn_range() we have code like this:

if (!pfn_valid_within(pfn))
	goto free_range;

Note: 'page' is not set to NULL and is not incremented but 'pfn' advances.
This means that if deferred struct pages are enabled on systems with this
kind of holes, Linux would get memory corruption. I have fixed this issue by
defining a new macro that performs all the necessary operations when we
free the current set of pages.
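
The idea of "flush the pending run and reset all of its state in one place"
can be sketched in plain user-space C. This is not the macro from the patch:
the sketch uses an ordinary function instead, and struct run_state,
flush_run() and walk_range() are hypothetical names. A boolean array stands
in for pfn validity checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical run-tracking state for this sketch. */
struct run_state {
	unsigned long base;	/* first pfn of the pending run */
	unsigned long count;	/* pages in the pending run, 0 if none */
	unsigned long runs;	/* number of non-empty runs flushed so far */
};

/* Flush the pending run, reset the state, return how many pages it held. */
static unsigned long flush_run(struct run_state *st)
{
	unsigned long nr = st->count;

	if (nr)
		st->runs++;	/* a real implementation would free the range here */
	st->base = 0;
	st->count = 0;
	return nr;
}

/*
 * Walk pfns [0, n). valid[pfn] == false marks a hole. Every hole ends the
 * pending run, so a run never spans an invalid pfn.
 * Returns the total number of pages flushed.
 */
static unsigned long walk_range(const bool *valid, unsigned long n,
				struct run_state *st)
{
	unsigned long pfn, total = 0;

	for (pfn = 0; pfn < n; pfn++) {
		if (!valid[pfn]) {
			total += flush_run(st);
			continue;
		}
		if (!st->count)
			st->base = pfn;
		st->count++;
	}
	return total + flush_run(st);
}
```

Keeping the reset inside the flush helper means no caller can advance past
a hole while stale run state is still pending, which is the class of bug
the commit message describes.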

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 mm/page_alloc.c | 161 +++-
 1 file changed, 78 insertions(+), 83 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7a58eb5757e3..c170ac569aec 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1409,14 +1409,17 @@ void clear_zone_contiguous(struct zone *zone)
 }
 
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
-static void __init deferred_free_range(struct page *page,
-   unsigned long pfn, int nr_pages)
+static void __init deferred_free_range(unsigned long pfn,
+  unsigned long nr_pages)
 {
-   int i;
+   struct page *page;
+   unsigned long i;
 
-   if (!page)
+   if (!nr_pages)
return;
 
+   page = pfn_to_page(pfn);
+
/* Free a large naturally-aligned chunk if possible */
if (nr_pages == pageblock_nr_pages &&
(pfn & (pageblock_nr_pages - 1)) == 0) {
@@ -1442,19 +1445,82 @@ static inline void __init 
pgdat_init_report_one_done(void)
complete(&pgdat_init_all_done_comp);
 }
 
+#define DEFERRED_FREE(nr_free, free_base_pfn, page)\
+({ \
+   unsigned long nr = (nr_free);   \
+   \
+   deferred_free_range((free_base_pfn), (nr)); \
+   (free_base_pfn) = 0;\
+   (nr_free) = 0;  \
+   page = NULL;\
+   nr; \
+})
+
+static unsigned long deferred_init_range(int nid, int zid, unsigned long pfn,
+unsigned long end_pfn)
+{
+   struct mminit_pfnnid_cache nid_init_state = { };
+   unsigned long nr_pgmask = pageblock_nr_pages - 1;
+   unsigned long free_base_pfn = 0;
+   unsigned long nr_pages = 0;
+   unsigned long nr_free = 0;
+   struct page *page = NULL;
+
+   for (; pfn < end_pfn; pfn++) {
+   /*
+* First we check if pfn is valid on architectures where it is
+* possible to have holes within pageblock_nr_pages. On systems
+* where it is not possible, this function is optimized out.
+*
+* Then, we check if a current large page is valid by only
+* checking the validity of the head pfn.
+*
+* meminit_pfn_in_nid is checked on systems where pfns can
+* interleave within a node: a pfn is between start and end
+* of a node, but does not belong to this memory node.
+*
+* Finally, we minimize pfn page lookups and scheduler checks by
+* performing it only once every pageblock_nr_pages.
+*/
+   if (!pfn_valid_within(pfn)) {
+   nr_pages += DEFERRED_FREE(nr_free, free_base_pfn, page);
+   } else if (!(pfn & nr_pgmask) && !pfn_valid(pfn)) {
+   nr_pages += DEFERRED_FREE(nr_free, free_base_pfn, page);
+   } else if (!meminit_pfn_in_nid(pfn, nid, &nid_init_state)) {
+   nr_pages += DEFERRED_FREE(nr_free, free_base_pfn, page);
+   } else if (page && (pfn & nr_pgmask))


Re: [RESEND PATCH v4 2/2] i2c: Add Spreadtrum I2C controller driver

2017-08-28 Thread Baolin Wang
Hi Wolfram,

On 28 August 2017 at 23:13, Wolfram Sang  wrote:
>
>> >> + /*
>> >> +  * If we did not get one ACK from slave when writing data, we should
>> >> +  * dump all registers to check I2C status.
>> >
>> > Why? I would say no. NACK from a slave can always happen, e.g. when an
>> > EEPROM is busy erasing a page.
>>
>> For our I2C controller databook, if the master did not get one ACK
>> from slave when writing data to slave, we should send one STOP signal
>> to abort this data transfer or generate one repeated START signal to
>> start one new data transfer cycle. Considering our I2C usage
>
> Yes, so far so good.
>
>> scenarios, we should dump registers to analyze I2C status and notify
>> to user to re-start new data transfer.
>
> I disagree here. You notify the users by returning -EIO. The upper layer
> (e.g. the i2c client driver) will handle it, like the EEPROM driver
> might retry after a while. This all is expected behaviour, so no need to
> print the registers to the logfile.
>
> If you really, really want to keep it, make it debug output. But I think
> the sentence "we should dump all registers" needs to be rephrased.

Makes sense. I will remove the register printing here.

>
>> As I explained before, in our Spreadtrum platform, our regulator
>> driver will depend on I2C driver and the regulator driver uses
>> subsys_initcall() level to initialize. Moreover some other drivers
>> like GPU, they will depend on regulator to set voltage and they also
>> need initialization much earlier.
>>
>> Since it is arch_initcall() level, Andy suggested I should get rid of
>> tristate (use bool) and drop module.h here and all leftovers like
>> MODULE_*() calls including module_exit().
>
> I see. So the driver is really so essential for proper bootup that it is
> not even allowed to be unloaded. I might make an exception here and

Yes.

> allow arch_initcall() then. But I do wonder: did you try deferred
> probing all over the place?

Many modules (like the GPU) need their voltage set early by the regulator,
which depends on I2C (we also need to regulate the voltage for the big
cores ASAP). If we defer setting the voltage for the GPU or other modules,
it may cause system problems. Thanks for your comments.

-- 
Baolin.wang
Best Regards


[PATCH] lib: Closed hash table with low overhead

2017-08-28 Thread Felix Kuehling
This adds a statically sized closed hash table implementation with
low memory and CPU overhead. The API is inspired by kfifo.

Storing, retrieving and deleting data does not involve any dynamic
memory management, which makes it ideal for use in interrupt context.
Static memory usage per entry comprises a 32 or 64 bit hash key, two
bits for occupancy tracking and the value size stored in the table.
No list heads or pointers are needed. Therefore this data structure
should be quite cache-friendly, too.

It uses linear probing and lazy deletion. During lookups free space
is reclaimed and entries relocated to speed up future lookups.

Signed-off-by: Felix Kuehling 
Acked-by: Christian König 
---

This is part of a larger patch series I'm working on to support
demand-paging on AMD Vega10 and similar AMD GPUs. Vega10 generates a
flood of page fault events in its interrupt ring buffer for every
retry. I need a very fast way to discard known page faults in the
interrupt handler to stop processing retry events as early as
possible.

I don't actually need to store any data. All the information I need is
in the hash key itself, which is made up of the fault address and
process ID. I don't need a dynamically resizable data structure
because the hardware will keep retrying. If my hashtable fills up, I
can just discard retry events indiscriminately until space is freed
(by handling older page faults).

In preliminary testing with the builtin self test on a 3.6GHz Core i7,
this code can manage over 120M lookups per second for hits (misses take
about twice as long) with the table 40-50% full. Even at 80-90% full it
remains quite efficient, with hits still almost as fast but misses
taking 8x as long as hits.

I'm hoping that people find this useful and consider including it as a
library routine in the kernel.
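
For readers unfamiliar with the technique, linear probing with lazy
deletion can be sketched in a few dozen lines of user-space C. This is not
the code from lib/chash.c: struct mini_chash and the mini_*() functions are
hypothetical, the table stores keys only, the hash is a plain mask so that
collisions are easy to follow, and the relocation of entries during lookups
mentioned above is left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TBL_BITS 3u
#define TBL_SIZE (1u << TBL_BITS)
#define TBL_MASK (TBL_SIZE - 1u)

struct mini_chash {
	uint32_t keys[TBL_SIZE];
	bool occupied[TBL_SIZE];	/* slot was used at some point */
	bool valid[TBL_SIZE];		/* slot currently holds a live key */
};

static void mini_init(struct mini_chash *t)
{
	memset(t, 0, sizeof(*t));
}

/* Trivial hash for the sketch; a real table would mix the key bits. */
static uint32_t mini_hash(uint32_t key)
{
	return key & TBL_MASK;
}

/* Returns the slot index holding key, or -1 if the key is not present. */
static int mini_find(const struct mini_chash *t, uint32_t key)
{
	uint32_t i, pos = mini_hash(key);

	for (i = 0; i < TBL_SIZE; i++, pos = (pos + 1) & TBL_MASK) {
		if (!t->occupied[pos])
			return -1;	/* never-used slot ends the probe */
		if (t->valid[pos] && t->keys[pos] == key)
			return (int)pos;
	}
	return -1;
}

/* Inserts key, reusing the first deleted slot on the probe path.
 * Returns false only if the table has no free slot. */
static bool mini_insert(struct mini_chash *t, uint32_t key)
{
	uint32_t i, pos = mini_hash(key);
	int free_slot = -1;

	for (i = 0; i < TBL_SIZE; i++, pos = (pos + 1) & TBL_MASK) {
		if (t->valid[pos]) {
			if (t->keys[pos] == key)
				return true;	/* already present */
			continue;
		}
		if (free_slot < 0)
			free_slot = (int)pos;
		if (!t->occupied[pos])
			break;	/* key cannot be stored further along */
	}
	if (free_slot < 0)
		return false;
	t->keys[free_slot] = key;
	t->occupied[free_slot] = true;
	t->valid[free_slot] = true;
	return true;
}

/* Lazy deletion: clear valid but keep occupied so probing continues. */
static bool mini_remove(struct mini_chash *t, uint32_t key)
{
	int pos = mini_find(t, key);

	if (pos < 0)
		return false;
	t->valid[pos] = false;
	return true;
}
```

Inserting 1, 9 and 17 places them in slots 1, 2 and 3, since all three map
to slot 1. Removing 9 leaves slot 2 marked as occupied but not valid, so a
lookup of 17 still probes past it, and a later insert of 25 reuses slot 2.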

 include/linux/chash.h | 358 +
 lib/Kconfig   |  24 ++
 lib/Makefile  |   2 +
 lib/chash.c   | 622 ++
 4 files changed, 1006 insertions(+)
 create mode 100644 include/linux/chash.h
 create mode 100644 lib/chash.c

diff --git a/include/linux/chash.h b/include/linux/chash.h
new file mode 100644
index 000..c89b92b
--- /dev/null
+++ b/include/linux/chash.h
@@ -0,0 +1,358 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef _LINUX_CHASH_H
+#define _LINUX_CHASH_H
+
+#include 
+#include 
+#include 
+#include 
+
+struct __chash_table {
+   u8 bits;
+   u8 key_size;
+   unsigned int value_size;
+   u32 size_mask;
+   unsigned long *occup_bitmap, *valid_bitmap;
+   union {
+   u32 *keys32;
+   u64 *keys64;
+   };
+   u8 *values;
+
+#ifdef CONFIG_CHASH_STATS
+   u64 hits, hits_steps, hits_time_ns;
+   u64 miss, miss_steps, miss_time_ns;
+   u64 relocs, reloc_dist;
+#endif
+};
+
+#define __CHASH_BITMAP_SIZE(bits)  \
+   (((1 << (bits)) + BITS_PER_LONG - 1) / BITS_PER_LONG)
+#define __CHASH_ARRAY_SIZE(bits, size) \
+   ((((size) << (bits)) + sizeof(long) - 1) / sizeof(long))
+
+#define __CHASH_DATA_SIZE(bits, key_size, value_size)  \
+   (__CHASH_BITMAP_SIZE(bits) * 2 +\
+__CHASH_ARRAY_SIZE(bits, key_size) +   \
+__CHASH_ARRAY_SIZE(bits, value_size))
+
+#define STRUCT_CHASH_TABLE(bits, key_size, value_size) \
+   struct {\
+   struct __chash_table table; \
+   unsigned long data  \
+   [__CHASH_DATA_SIZE(bits, key_size, value_size)];\
+   }
+
+/**
+ * struct chash_table - Dynamically allocated closed hash table
+ *
+ * Use this struct for dynamically allocated hash tables (us

Re: linux-next: Signed-off-by missing for commits in the drm tree

2017-08-28 Thread Sinclair Yeh
On Tue, Aug 29, 2017 at 11:44:13AM +1000, Stephen Rothwell wrote:
> Hi Dave,
> 
> Commits
> 
>   461e60ea1119 ("drm/exynos/decon5433: use mode info stored in CRTC to detect 
> i80 mode")
>   ac60944ccf23 ("drm/exynos: consistent use of cpp")
>   e300173f0616 ("drm/vmwgfx: Don't use drm_irq_[un]install")

This one is Reviewed-by: Sinclair Yeh 

>   ef369904aaf7 ("drm/vmwgfx: Move irq bottom half processing to threads")
>   65b97a2bec2f ("drm/vmwgfx: Restart command buffers after errors")
>   5f55be5f306a ("drm/vmwgfx: Support the NOP_ERROR command")
>   1f1a36cc4d49 ("drm/vmwgfx: Fix incorrect command header offset at restart")a

For these four, I thought the author's SOB was enough, but if not, I can add my SOB.

Is it easier for you to add those, or would you like me to rebase and send out
another pull request?

thanks,

Sinclair


Re: [PATCH V1] pinctrl: qcom: spmi-gpio: Add support for qcom,gpios-disallowed property

2017-08-28 Thread Shawn Guo
On Tue, Aug 29, 2017 at 09:03:02AM +0800, Fenglin Wu wrote:
> I agree the GPIO's ownership is configurable and it always configured at
> the very beginning of the device boot up which is not visible by linux
> kernel drivers/image. Normally, this configuration is fixed in one
> platform and it's been protected and not allowed to be configured in
> linux kernel driver. So from linux driver point of view, this is a
> hardware configuration. I agree the coming patch "spmi: pmic-arb: Move
> the ownership check to irq_chip callback" would fix the pinctrl-
> spmi-gpio driver probe failure caused by the ownership mismatch, but
> this is just hiding the mistake of the kernel configured the GPIOs which
> not owned by APPS processor.

The kernel does everything just right, using the GPIO that the device tree
tells it to use.  If the ownership check fails, the fault lies either with
a device tree that specifies the wrong GPIO or with firmware that doesn't
configure ownership as needed.

Shawn

> And these GPIOs will be registered
> successfully as pinctrl pins and any APPS processor consumer drivers
> could use this pins. This is not correct even the select_state operation
> for these pins would failed due to the mode protection in spmi write_cmd
> calling. I am thinking that not allowing these pins to be register as
> pinctrl pins should be more straightforward and easy understanding. So I
> think this patch still have value even the probe failure has been fixed
> by the coming spmi patch.


linux-next: Signed-off-by missing for commits in the drm tree

2017-08-28 Thread Stephen Rothwell
Hi Dave,

Commits

  461e60ea1119 ("drm/exynos/decon5433: use mode info stored in CRTC to detect i80 mode")
  ac60944ccf23 ("drm/exynos: consistent use of cpp")
  e300173f0616 ("drm/vmwgfx: Don't use drm_irq_[un]install")
  ef369904aaf7 ("drm/vmwgfx: Move irq bottom half processing to threads")
  65b97a2bec2f ("drm/vmwgfx: Restart command buffers after errors")
  5f55be5f306a ("drm/vmwgfx: Support the NOP_ERROR command")
  1f1a36cc4d49 ("drm/vmwgfx: Fix incorrect command header offset at restart")

are missing a Signed-off-by from their committers.

-- 
Cheers,
Stephen Rothwell


Re: [PATCH] MIPS: Revert "MIPS: Fix race on setting and getting cpu_online_mask"

2017-08-28 Thread Huacai Chen
I suggest dropping sync_r4k completely, because it is inaccurate. You
can use an IPI to synchronize count/compare instead, as Loongson-3 does.

Huacai

On Mon, Aug 28, 2017 at 6:07 PM, Matija Glavinic Pecotic
 wrote:
> On 08/23/2017 10:21 AM, Matt Redfearn wrote:
>> As noted in the commit message, upstream differs in this area. The
>> hotplug code now waits on a completion event in bringup_wait_for_ap,
>> which is set by the starting CPU in cpuhp_online_idle once it calls
>> cpu_startup_entry. Thus there is no possibility of a race in upstream,
>> and this commit has only re-introduced the deadlock condition, which can
>> be observed on multiple platforms when running a heavy load test at the
>> same time as hotplugging CPUs. See commit 8f46cca1e6c06 ("MIPS: SMP: Fix
>> possibility of deadlock when bringing CPUs online") for details.
>
> I personally do not like the fact that synchronization is implicitly done by 
> the callers, it is the reason why the patch was proposed. As noted before, it 
> is enough someone checks cpu online mask somewhere in between and there is 
> race again.
>
> How about moving synchronise_count_slave before setting the cpu online? Is 
> there dependency it has to be done after completion?
>
> Regards,
>
> Matija
>


[lkp-robot] [x86/idt] 7684b56d00: WARNING:at_arch/x86/kernel/idt.c:#update_intr_gate

2017-08-28 Thread kernel test robot

FYI, we noticed the following commit:

commit: 7684b56d008c9141492e00de32c6c7b9ef0066d2 ("x86/idt: Hide set_intr_gate()")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git WIP.x86/apic

in testcase: trinity
with following parameters:

runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-i386 -enable-kvm -smp 2 -m 320M

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+----------------------------------------------------+------------+------------+
|                                                    | b0f4a9654d | 7684b56d00 |
+----------------------------------------------------+------------+------------+
| boot_successes                                     | 8          | 0          |
| boot_failures                                      | 0          | 34         |
| WARNING:at_arch/x86/kernel/idt.c:#update_intr_gate | 0          | 34         |
| EIP:update_intr_gate                               | 0          | 34         |
| BUG:unable_to_handle_kernel                        | 0          | 34         |
| Oops:#[##]                                         | 0          | 34         |
| EIP:native_irq_enable                              | 0          | 20         |
| Kernel_panic-not_syncing:Fatal_exception           | 0          | 34         |
| EIP:hpet_enable                                    | 0          | 13         |
| EIP:get_page_from_freelist                         | 0          | 1          |
| BUG:kernel_in_stage                                | 0          | 3          |
| BUG:kernel_reboot-without-warning_in_boot_stage    | 0          | 1          |
+----------------------------------------------------+------------+------------+



[0.00] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/idt.c:355 update_intr_gate+0x16/0x21
[0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.13.0-rc6-00088-g7684b56d #66
[0.00] task: 41edee00 task.stack: 41ed4000
[0.00] EIP: update_intr_gate+0x16/0x21
[0.00] EFLAGS: 00210002 CPU: 0
[0.00] EAX: 000e EBX: 420a4380 ECX: 4222c401 EDX: 41a1c930
[0.00] ESI: 3112fa61 EDI: 0209f0e0 EBP: 41ed5f68 ESP: 41ed5f68
[0.00]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[0.00] CR0: 80050033 CR2:  CR3: 0222e000 CR4: 0690
[0.00] Call Trace:
[0.00]  kvm_apf_trap_init+0x17/0x19
[0.00]  trap_init+0x42/0x44
[0.00]  start_kernel+0x1c8/0x3d0
[0.00]  i386_start_kernel+0xa6/0xaa
[0.00]  startup_32_smp+0x164/0x166
[0.00] Code: 41 00 75 02 0f 0b b8 00 70 ec 41 ff 15 60 2d ef 41 5b 5e 
5d c3 55 89 e5 e8 a0 63 ff fe 0f a3 05 00 14 23 42 0f 92 c1 84 c9 74 04 <0f> ff 
eb 05 e8 31 0b fd fe 5d c3 55 89 e5 e8 7f 63 ff fe c7 05
[0.00] random: get_random_bytes called from 
print_oops_end_marker+0x52/0x5f with crng_init=0
[0.00] ---[ end trace f68728a0d3053b52 ]---


To reproduce:

git clone https://github.com/01org/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k  job-script  # job-script is attached in this email



Thanks,
Xiaolong
#
# Automatically generated file; DO NOT EDIT.
# Linux/i386 4.13.0-rc6 Kernel Configuration
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf32-i386"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_BITS_MAX=16
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_32_SMP=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=2
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL

[lkp-robot] [x86/idt] fd1962ea6c: WARNING:at_arch/x86/kernel/idt.c:#update_intr_gate

2017-08-28 Thread kernel test robot

FYI, we noticed the following commit:

commit: fd1962ea6c9cb9be5991f9e304fedb3f9e5135d7 ("x86/idt: Hide set_intr_gate()")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git WIP.x86/apic

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -m 420M

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


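+----------------------------------------------------+------------+------------+
|                                                    | 3eb0edaad2 | fd1962ea6c |
+----------------------------------------------------+------------+------------+
| boot_successes                                     | 12         | 0          |
| boot_failures                                      | 6          | 61         |
| BUG:kernel_hang_in_test_stage                      | 6          | 3          |
| WARNING:at_arch/x86/kernel/idt.c:#update_intr_gate | 0          | 61         |
| BUG:unable_to_handle_kernel                        | 0          | 61         |
| Oops:#[##]                                         | 0          | 61         |
| Kernel_panic-not_syncing:Fatal_exception           | 0          | 61         |
| BUG:kernel_in_stage                                | 0          | 12         |
+----------------------------------------------------+------------+------------+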



[0.00] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/idt.c:355 update_intr_gate+0x11/0x1f
[0.00] Modules linked in:
[0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.13.0-rc6-00089-gfd1962ea #64
[0.00] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[0.00] task: 9b628500 task.stack: 9b60
[0.00] RIP: 0010:update_intr_gate+0x11/0x1f
[0.00] RSP: :9b603ed0 EFLAGS: 00010047
[0.00] RAX: 000e RBX:  RCX: 0001
[0.00] RDX:  RSI: 9ad80040 RDI: 000e
[0.00] RBP: 9b603ed8 R08:  R09: 0001
[0.00] R10: 002f7000 R11: 9c8c93cc R12: 8de65a0e0d00
[0.00] R13: 9badc960 R14: 9bae4780 R15: 
[0.00] FS:  () GS:8de659e0() 
knlGS:
[0.00] CS:  0010 DS:  ES:  CR0: 80050033
[0.00] CR2: 8de653adf000 CR3: 12623000 CR4: 06b0
[0.00] Call Trace:
[0.00]  ? kvm_apf_trap_init+0x1a/0x1c
[0.00]  trap_init+0x4d/0x54
[0.00]  start_kernel+0x1dd/0x402
[0.00]  x86_64_start_reservations+0x24/0x26
[0.00]  x86_64_start_kernel+0x73/0x76
[0.00]  secondary_startup_64+0x9f/0x9f
[0.00] Code: c4 09 83 fb 20 75 eb 48 c7 c7 40 73 3b 9b ff 14 25 50 f8 
64 9b 5b 41 5c 5d c3 e8 ed 7a 35 ff 89 f8 48 0f a3 05 03 4f 1b 00 73 03 <0f> ff 
c3 55 48 89 e5 e8 21 c8 63 fe 5d c3 e8 ce 7a 35 ff 55 31 
[0.00] ---[ end trace be658dd14e22cef1 ]---


To reproduce:

git clone https://github.com/01org/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k  job-script  # job-script is attached in this email



Thanks,
Xiaolong
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.13.0-rc6 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_

[PATCH v2 1/2] media:imx274 device tree binding file

2017-08-28 Thread Soren Brinkmann
From: Leon Luo 

The binding file for imx274 CMOS sensor V4l2 driver

Signed-off-by: Leon Luo 
Acked-by: Sören Brinkmann 
---
 .../devicetree/bindings/media/i2c/imx274.txt   | 32 ++
 1 file changed, 32 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/i2c/imx274.txt

diff --git a/Documentation/devicetree/bindings/media/i2c/imx274.txt 
b/Documentation/devicetree/bindings/media/i2c/imx274.txt
new file mode 100644
index ..9154666d1149
--- /dev/null
+++ b/Documentation/devicetree/bindings/media/i2c/imx274.txt
@@ -0,0 +1,32 @@
+* Sony 1/2.5-Inch 8.51Mp CMOS Digital Image Sensor
+
+The Sony imx274 is a 1/2.5-inch CMOS active pixel digital image sensor with
+an active array size of 3864H x 2202V. It is programmable through I2C
+interface. The I2C address is fixed to 0x1a as per sensor data sheet.
+Image data is sent through MIPI CSI-2, which is configured as 4 lanes
+at 1440 Mbps.
+
+
+Required Properties:
+- compatible: value should be "sony,imx274" for imx274 sensor
+
+Optional Properties:
+- reset-gpios: Sensor reset GPIO
+
+For further reading on port node refer to
+Documentation/devicetree/bindings/media/video-interfaces.txt.
+
+Example:
+   imx274: sensor@1a{
+   compatible = "sony,imx274";
+   reg = <0x1a>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reset-gpios = <&gpio_sensor 0 0>;
+   port@0 {
+   reg = <0>;
+   sensor_out: endpoint {
+   remote-endpoint = <&csiss_in>;
+   };
+   };
+   };
-- 
2.14.1.3.g5766cf452



[PATCH v2 2/2] media:imx274 V4l2 driver for Sony imx274 CMOS sensor

2017-08-28 Thread Soren Brinkmann
From: Leon Luo 

The imx274 is a Sony CMOS image sensor that has 1/2.5 image size.
It supports up to 3840x2160 (4K) 60fps, 1080p 120fps. The interface
is 4-lane MIPI running at 1.44Gbps each.

This driver has been tested on Xilinx ZCU102 platform with a Leopard
LI-IMX274MIPI-FMC camera board.

Support for the following features:
-Resolutions: 3840x2160, 1920x1080, 1280x720
-Frame rate: 3840x2160 : 5 – 60fps
1920x1080 : 5 – 120fps
1280x720 : 5 – 120fps
-Exposure time: 16 – (frame interval) micro-seconds
-Gain: 1x - 180x
-VFLIP: enable/disable
-Test pattern: 12 test patterns

Signed-off-by: Leon Luo 
Tested-by: Sören Brinkmann 
---
v2:
 - Fix Kconfig to not remove existing options
---
 drivers/media/i2c/Kconfig  |7 +
 drivers/media/i2c/Makefile |1 +
 drivers/media/i2c/imx274.c | 1842 
 3 files changed, 1850 insertions(+)
 create mode 100644 drivers/media/i2c/imx274.c

diff --git a/drivers/media/i2c/Kconfig b/drivers/media/i2c/Kconfig
index 94153895fcd4..ad2e70a02363 100644
--- a/drivers/media/i2c/Kconfig
+++ b/drivers/media/i2c/Kconfig
@@ -547,6 +547,13 @@ config VIDEO_APTINA_PLL
 config VIDEO_SMIAPP_PLL
tristate
 
+config VIDEO_IMX274
+   tristate "Sony IMX274 sensor support"
+   depends on I2C && VIDEO_V4L2 && VIDEO_V4L2_SUBDEV_API
+   ---help---
+ This is a V4L2 sensor-level driver for the Sony IMX274
+ CMOS image sensor.
+
 config VIDEO_OV2640
tristate "OmniVision OV2640 sensor support"
depends on VIDEO_V4L2 && I2C
diff --git a/drivers/media/i2c/Makefile b/drivers/media/i2c/Makefile
index c843c181dfb9..f8d57e453936 100644
--- a/drivers/media/i2c/Makefile
+++ b/drivers/media/i2c/Makefile
@@ -92,5 +92,6 @@ obj-$(CONFIG_VIDEO_IR_I2C)  += ir-kbd-i2c.o
 obj-$(CONFIG_VIDEO_ML86V7667)  += ml86v7667.o
 obj-$(CONFIG_VIDEO_OV2659) += ov2659.o
 obj-$(CONFIG_VIDEO_TC358743)   += tc358743.o
+obj-$(CONFIG_VIDEO_IMX274) += imx274.o
 
 obj-$(CONFIG_SDR_MAX2175) += max2175.o
diff --git a/drivers/media/i2c/imx274.c b/drivers/media/i2c/imx274.c
new file mode 100644
index ..fcbb5ad2763c
--- /dev/null
+++ b/drivers/media/i2c/imx274.c
@@ -0,0 +1,1842 @@
+/*
+ * imx274.c - IMX274 CMOS Image Sensor driver
+ *
+ * Copyright (C) 2017, Leopard Imaging, Inc.
+ *
+ * Leon Luo 
+ * Edwin Zou 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int debug;
+module_param(debug, int, 0644);
+MODULE_PARM_DESC(debug, "Debug level (0-2)");
+
+/*
+ * See "SHR, SVR Setting" in datasheet
+ */
+#define IMX274_DEFAULT_FRAME_LENGTH   (4550)
+#define IMX274_MAX_FRAME_LENGTH   (0x000f)
+
+/*
+ * See "Frame Rate Adjustment" in datasheet
+ */
+#define IMX274_PIXCLK_CONST1   (7200)
+#define IMX274_PIXCLK_CONST2   (100)
+
+/*
+ * The input gain is shifted by IMX274_GAIN_SHIFT to get
+ * decimal number. The real gain is
+ * (float)input_gain_value / (1 << IMX274_GAIN_SHIFT)
+ */
+#define IMX274_GAIN_SHIFT  (8)
+#define IMX274_GAIN_SHIFT_MASK ((1 << IMX274_GAIN_SHIFT) - 1)
+
+/*
+ * See "Analog Gain" and "Digital Gain" in datasheet
+ * min gain is 1X
+ * max gain is calculated based on IMX274_GAIN_REG_MAX
+ */
+#define IMX274_GAIN_REG_MAX   (1957)
+#define IMX274_MIN_GAIN   (0x01 << IMX274_GAIN_SHIFT)
+#define IMX274_MAX_ANALOG_GAIN ((2048 << IMX274_GAIN_SHIFT)\
+   / (2048 - IMX274_GAIN_REG_MAX))
+#define IMX274_MAX_DIGITAL_GAIN   (8)
+#define IMX274_DEF_GAIN   (20 << IMX274_GAIN_SHIFT)
+#define IMX274_GAIN_CONST  (2048) /* for gain formula */
+
+/*
+ * 1 line time in us = (HMAX / 72), minimal is 4 lines
+ */
+#define IMX274_MIN_EXPOSURE_TIME   (4 * 260 / 72)
+
+#define IMX274_DEFAULT_MODE   IMX274_MODE_3840X2160
+#define IMX274_MAX_WIDTH   (3840)
+#define IMX274_MAX_HEIGHT  (2160)
+#define IMX274_MAX_FRAME_RATE
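
The fixed-point gain arithmetic described in the comments above can be worked through in a few lines. This is only a sketch under stated assumptions: it reuses the constants visible in this excerpt, shortens their names, and the helpers gain_int() and gain_frac() are hypothetical, not functions from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the excerpt; names shortened for the sketch. */
#define GAIN_SHIFT       8
#define GAIN_SHIFT_MASK  ((1 << GAIN_SHIFT) - 1)
#define GAIN_REG_MAX     1957
#define GAIN_CONST       2048
#define MIN_GAIN         (0x01 << GAIN_SHIFT)
#define MAX_ANALOG_GAIN  ((GAIN_CONST << GAIN_SHIFT) / \
			  (GAIN_CONST - GAIN_REG_MAX))
#define MAX_DIGITAL_GAIN 8

/* 1 line time in us = HMAX / 72, minimum is 4 lines (integer division). */
#define MIN_EXPOSURE_TIME (4 * 260 / 72)

/* Hypothetical helper: integer part of a gain shifted by GAIN_SHIFT. */
static uint32_t gain_int(uint32_t fixed)
{
	return fixed >> GAIN_SHIFT;
}

/* Hypothetical helper: fractional part, in units of 1/256. */
static uint32_t gain_frac(uint32_t fixed)
{
	return fixed & GAIN_SHIFT_MASK;
}
```

With these values MAX_ANALOG_GAIN evaluates to 5761, i.e. 22 + 129/256 (about 22.5x). Multiplying by the maximum digital gain of 8 gives an integer part of 180, which matches the "Gain: 1x - 180x" range in the commit message. MIN_EXPOSURE_TIME truncates to 14 microseconds.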

Re: [PATCH] libnvdimm: clean up command definitions

2017-08-28 Thread Yasunori Goto
> On Mon, Aug 28, 2017 at 1:50 PM, Jerry Hoemann  wrote:
> >
> > On Mon, Aug 28, 2017 at 08:45:32AM -0700, Dan Williams wrote:
> >> Remove the command payloads that do not have an associated libnvdimm
> >> ioctl. I.e. remove the payloads that would only ever be carried in the
> >> ND_CMD_CALL envelope. This prevents userspace from growing unnecessary
> >> dependencies on this kernel header when userspace already has everything
> >> it needs to craft and send these commands.
> >
> > Userspace needs to include linux/ndctl.h to make the call as
> > that is where nd_cmd_pkg is defined.
> >
> > So you want to have some structures defined in ndctl.h and other
> > defined in the to be created libndctl-nfit.h?  Plus a third header
> > file for the HPE non-root calls?
> 
> Yes. ndctl.h exports the ioctl command payloads, everything that goes
> inside of ND_CMD_CALL is defined by userspace headers. The
> libndctl-nfit.h header is proposed as a place to land vendor agnostic
> NFIT-defined payloads, and any vendor specific definitions would
> remain internal to libndctl as they are today.
> 
> > Will libndctl-nfit.h be generally available and installed?
> 
> Yes, that's the plan.
> 
> > Will it be clean so that other applications can use it to get these
> > definitions?  Or will it be loaded w/ a bunch of stuff only useful
> > to your ndctl command?
> 
> Yes, that's the plan. It's a bug if libndctl-nfit.h is not generically
> clean for issuing the NFIT root device commands via some ND_CMD_CALL
> helpers from the base libndctl library.
> 
> In other words libndctl-nfit.h defines the payload and libndctl
> defines some general helpers for issuing commands.

Maybe I don't understand your idea yet, so let me confirm it.

Certainly, the current ACPI driver does not need these definitions.
But I think nfit_test.ko will need them to emulate these features.

Do you intend libndctl-nfit.h to be defined in the "include/uapi/linux/"
directory? Or should it be defined in "tools/testing/nvdimm/" or
"tools/testing/nvdimm/test"?

Thanks,
---
Yasunori Goto



