Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted

2019-04-19 Thread Michael S. Tsirkin
On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote:
> 
> Michael S. Tsirkin  writes:
> 
> > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote:
> >>
> >> Michael S. Tsirkin  writes:
> >>
> >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote:
> >> >> From what I understand of the ACCESS_PLATFORM definition, the host will
> >> >> only ever try to access memory addresses that are supplied to it by the
> >> >> guest, so all of the secure guest memory that the host cares about is
> >> >> accessible:
> >> >>
> >> >> If this feature bit is set to 0, then the device has same access to
> >> >> memory addresses supplied to it as the driver has. In particular,
> >> >> the device will always use physical addresses matching addresses
> >> >> used by the driver (typically meaning physical addresses used by the
> >> >> CPU) and not translated further, and can access any address supplied
> >> >> to it by the driver. When clear, this overrides any
> >> >> platform-specific description of whether device access is limited or
> >> >> translated in any way, e.g. whether an IOMMU may be present.
> >> >>
> >> >> All of the above is true for POWER guests, whether they are secure
> >> >> guests or not.
> >> >>
> >> >> Or are you saying that a virtio device may want to access memory
> >> >> addresses that weren't supplied to it by the driver?
> >> >
> >> > Your logic would apply to IOMMUs as well.  For your mode, there are
> >> > specific encrypted memory regions that the driver has access to but the
> >> > device does not. That seems to violate the constraint.
> >>
> >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that
> >> the device can ignore the IOMMU for all practical purposes I would
> >> indeed say that the logic would apply to IOMMUs as well. :-)
> >>
> >> I guess I'm still struggling with the purpose of signalling to the
> >> driver that the host may not have access to memory addresses that it
> >> will never try to access.
> >
> > For example, one of the benefits is to signal to the host that the driver
> > does not expect the ability to access all memory. If it does, the host can
> > fail initialization gracefully.
> 
> But why would the ability to access all memory be necessary or even
> useful? When would the host access memory that the driver didn't tell it
> to access?

When I say all memory I mean even memory not allowed by the IOMMU.
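For context, the mechanism the RFC touches is vring_use_dma_api(); as a
sketch (not the literal patch -- sev_active() stands in for whatever
"guest memory is encrypted" predicate the final version uses), the shape
is:

static bool vring_use_dma_api(struct virtio_device *vdev)
{
	/* ACCESS_PLATFORM negotiated: always go through the DMA API */
	if (!virtio_has_iommu_quirk(vdev))
		return true;

	/* Xen grant mappings also need the DMA API */
	if (xen_domain())
		return true;

	/* RFC idea: encrypted guests must bounce via the DMA API */
	if (sev_active())
		return true;

	return false;
}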


> >> >> >> > But the name "sev_active" makes me scared because at least AMD
> >> >> >> > guys who were doing the sensible thing and setting ACCESS_PLATFORM
> >> >> >>
> >> >> >> My understanding is, AMD guest-platform knows in advance that their
> >> >> >> guest will run in secure mode and hence sets the flag at the time of
> >> >> >> VM instantiation. Unfortunately we don't have that luxury on our
> >> >> >> platforms.
> >> >> >
> >> >> > Well you do have that luxury. It looks like there are existing
> >> >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy
> >> >> > with how slow that path is. So you are trying to optimize for
> >> >> > them by clearing ACCESS_PLATFORM, and then you have lost the ability
> >> >> > to invoke the DMA API.
> >> >> >
> >> >> > For example if there was another flag just like ACCESS_PLATFORM
> >> >> > just not yet used by anyone, you would be all fine using that right?
> >> >>
> >> >> Yes, a new flag sounds like a great idea. What about the definition
> >> >> below?
> >> >>
> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as
> >> >> VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the
> >> >> exception that the IOMMU is explicitly defined to be off or bypassed
> >> >> when accessing memory addresses supplied to the device by the
> >> >> driver. This flag should be set by the guest if offered, but to
> >> >> allow for backward compatibility, device implementations allow it
> >> >> to be left unset by the guest. It is an error to set both this flag
> >> >> and VIRTIO_F_ACCESS_PLATFORM.
> >> >
> >> > It looks kind of narrow but it's an option.
> >>
> >> Great!
> >>
> >> > I wonder how we'll define what's an iommu though.
> >>
> >> Hm, it didn't occur to me it could be an issue. I'll try.
> 
> I rephrased it in terms of address translation. What do you think of
> this version? The flag name is slightly different too:
> 
> 
> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same
> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set,
> with the exception that address translation is guaranteed to be
> unnecessary when accessing memory addresses supplied to the device
> by the driver. Which is to say, the device will always use physical
> addresses matching addresses used by the driver (typically meaning
> physical addresses used by the CPU) and not translated further. This
> flag should be set by the 

Re: [PATCH 0/8]

2019-04-19 Thread Tyrel Datwyler
On 04/14/2019 08:41 PM, Sam Bobroff wrote:
> On Thu, Apr 11, 2019 at 05:55:33PM -0700, Tyrel Datwyler wrote:
>> On 03/19/2019 07:58 PM, Sam Bobroff wrote:
>>> Hi all,
>>>
>>> This patch set adds support for EEH recovery of hot plugged devices on
>>> pSeries machines. Specifically, devices discovered by PCI rescanning using
>>> /sys/bus/pci/rescan, which includes devices hotplugged by QEMU's device_add
>>> command. (pSeries doesn't currently use slot power control for hotplugging.)
>>
>> Slight nit that it's not that pSeries doesn't support slot power control
>> hotplugging, it's that QEMU pSeries guests don't support it. We most
>> definitely use slot power control for hotplugging in PowerVM pSeries Linux
>> guests. More
> 
> Ah, I think I see what you mean: pSeries can (and does!) use slot power
> control for hotplugging, it's just that Linux doesn't. Right :-) I'll
> change it to "Linux on pSeries doesn't" for the next version.

Not quite. A pSeries Linux PowerVM LPAR does use slot power hotplugging. It
is only the pSeries Linux qemu guest that doesn't. The way that hotplug is
initiated differs between PowerVM and qemu. On PowerVM the command is sent from
the HMC down the RSCT stack, which calls drmgr, whereas with qemu, hotplug
generates an interrupt which logs an event that is picked up by rtas_errd, which
then calls drmgr with the "-V" flag. That flag was a special case that we added
to drmgr to work around slot power hotplugging via rpaphp not working
correctly with qemu guests.

> 
>> specifically we had to work around shortcomings in the rpaphp driver when
>> dealing with QEMU. This being that while at initial glance the design implies
>> that it had multiple devices per PHB in mind, it didn't, and only actually
>> supported a single slot per PHB. Further, when we developed the QEMU pci
>> hotplug feature we had to deal with only having a single PHB per QEMU guest
>> and as a result needed a way to plug multiple pci devices into a single PHB.
>> Hence came the pci rescan workaround in drmgr.
>>
>> Mike Roth and I have had discussions over the years about getting slot power
>> control hotplugging working properly with QEMU, and while I did get the RPA
>> hotplug driver fixed to register all available slots associated with a PHB,
>> EEH remained an issue. So, I'm very happy to see this patchset get that
>> working with the rescan workaround.
>>
>>>
>>> As a side effect this also provides EEH support for devices removed by
>>> /sys/bus/pci/devices/*/remove and re-discovered by writing to
>>> /sys/bus/pci/rescan, on all platforms.
>>
>> +1, this seems like icing on the cake. ;)
> 
> Yes :-)
> 
> Although maybe I should mention that we can't really benefit from this
> on PowerNV *yet* because there seem to be some other problems with
> removing and re-scanning devices: in my tests devices are often unusable
> after being rediscovered.
> 
> (I'm hoping to take a look at that soon.)

Interesting.

> 
>>>
>>> The approach I've taken is to use the fact that the existing
>>> pcibios_bus_add_device() platform hooks (which are used to set up EEH on
>>> Virtual Function devices (VFs)) are actually called for all devices, so I've
>>> widened their scope and made other adjustments necessary to allow them to
>>> work for hotplugged and boot-time devices as well.
>>>
>>> Because some of the changes are in generic PowerPC code, it's
>>> possible that I've disturbed something for another PowerPC platform. I've
>>> tried to minimize this by leaving that code alone as much as possible, and
>>> so there are a few cases where eeh_add_device_{early,late}() or
>>> eeh_add_sysfs_files() is called more than once. I think these can be looked
>>> at later, as duplicate calls are not harmful.
>>>
>>> The patch "Convert PNV_PHB_FLAG_EEH" isn't strictly necessary and I'm not
>>> sure if it's better to keep it, because it simplifies the code, or drop it,
>>> because we may need a separate flag per PHB later on. Thoughts anyone?
>>>
>>> The first patch is a rework of the pcibios_init reordering patch I posted
>>> earlier, which I've included here because it's necessary for this set.
>>>
>>> I have done some testing for PowerNV on Power9 using a modified pnv_php
>>> module and some testing on pSeries with slot power control using a modified
>>> rpaphp module, and the EEH-related parts seem to work.
>>
>> I'm interested in what modifications you made to rpaphp. It's unclear if you
>> are saying rpaphp was modified so that slot power hotplug works with a QEMU
>> pSeries guest? If that's the case it would be optimal to get that upstream
>> and remove the rescan workaround for guests that don't need it.
> 
> Unfortunately no, I didn't do enough work to really get it working.  I
> just wanted to get an idea of how that code path interacted with the EEH
> code I was changing, so that hopefully when we get to fixing it, the EEH
> part will be easier to do.
> 
> 

Re: Avoiding merge conflicts while adding new docs - Was: Re: [PATCH 00/57] Convert files to ReST

2019-04-19 Thread Jonathan Corbet
On Thu, 18 Apr 2019 09:42:23 -0300
Mauro Carvalho Chehab  wrote:

> After thinking a little bit and doing some tests, I think a good solution
> would be to add ":orphan:" markup to the new .rst files that have not yet
> been added to the main body (e.g. something like the enclosed example).

Interesting...I didn't know about that.  Yes, I think it makes sense to
put that in...
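For reference, a minimal sketch of what that looks like at the top of a
not-yet-linked .rst file (the title is just a placeholder):

  :orphan:

  =================
  Placeholder title
  =================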

jon


[PATCH] [v2] x86/mpx: fix recursive munmap() corruption

2019-04-19 Thread Dave Hansen


Changes from v1:
 * Fix compile errors on UML and non-x86 arches
 * Clarify commit message and Fixes about the origin of the
   bug and add the impact to powerpc / uml / unicore32

--

This is a bit of a mess, to put it mildly.  But, it's a bug
that only seems to have shown up in 4.20 and wasn't noticed
until now because nobody uses MPX.

MPX has the arch_unmap() hook inside of munmap() because MPX
uses bounds tables that protect other areas of memory.  When
memory is unmapped, there is also a need to unmap the MPX
bounds tables.  Barring this, unused bounds tables can eat 80%
of the address space.

But, the recursive do_munmap() that gets called via arch_unmap()
wreaks havoc with __do_munmap()'s state.  It can result in
freeing populated page tables, accessing bogus VMA state,
double-freed VMAs and more.

To fix this, call arch_unmap() before __do_munmap() has a chance
to do anything meaningful.  Also, remove the 'vma' argument
and force the MPX code to do its own, independent VMA lookup.
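As a sketch (not the verbatim hunk), the reordered x86 hook ends up
shaped roughly like this, with the vma argument gone and MPX doing its
own lookup inside mpx_notify_unmap():

static inline void arch_unmap(struct mm_struct *mm,
			      unsigned long start, unsigned long end)
{
	if (unlikely(cpu_feature_enabled(X86_FEATURE_MPX)))
		mpx_notify_unmap(mm, start, end);
}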

== UML / unicore32 impact ==

Remove unused 'vma' argument to arch_unmap().  No functional
change.

I compile tested this on UML but not unicore32.

== powerpc impact ==

powerpc uses arch_unmap() to watch for munmap() on the
VDSO and zeroes out 'current->mm->context.vdso_base'.  Moving
arch_unmap() makes this happen earlier in __do_munmap().  But,
'vdso_base' seems to only be used in perf and in the signal
delivery that happens near the return to userspace.  I cannot
find any likely impact to powerpc, other than the zeroing
happening a little earlier.
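For reference, the powerpc hook being described is roughly this (a
simplified sketch, not the exact tree contents):

static inline void arch_unmap(struct mm_struct *mm,
			      unsigned long start, unsigned long end)
{
	if (start <= mm->context.vdso_base && mm->context.vdso_base < end)
		mm->context.vdso_base = 0;
}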

powerpc does not use the 'vma' argument and is unaffected by
its removal.

I compile-tested a 64-bit powerpc defconfig.

== x86 impact ==

For the common success case this is functionally identical to
what was there before.  For the munmap() failure case, it's
possible that some MPX tables will be zapped for memory that
continues to be in use.  But, this is an extraordinarily
unlikely scenario and the harm would be that MPX provides no
protection since the bounds table got reset (zeroed).

I can't imagine anyone doing this:

ptr = mmap();
// use ptr
ret = munmap(ptr);
if (ret)
// oh, there was an error, I'll
// keep using ptr.

Because if you're doing munmap(), you are *done* with the
memory.  There's probably no good data in there _anyway_.

This passes the original reproducer from Richard Biener as
well as the existing mpx selftests.



The long story:

munmap() has a couple of pieces:
1. Find the affected VMA(s)
2. Split the start/end one(s) if necessary
3. Pull the VMAs out of the rbtree
4. Actually zap the memory via unmap_region(), including
   freeing page tables (or queueing them to be freed).
5. Fixup some of the accounting (like fput()) and actually
   free the VMA itself.

This specific ordering was actually introduced by:

dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")

during the 4.20 merge window.  The previous __do_munmap() code
was actually safe because the only thing after arch_unmap() was
remove_vma_list().  arch_unmap() could not see 'vma' in the
rbtree because it was detached, so it is not even capable of
doing operations unsafe for remove_vma_list()'s use of 'vma'.

Richard Biener reported a test that shows this in dmesg:

[1216548.787498] BUG: Bad rss-counter state mm:17ce560b idx:1 val:551
[1216548.787500] BUG: non-zero pgtables_bytes on freeing mm: 24576

What triggered this was the recursive do_munmap() called via
arch_unmap().  It was freeing page tables that had not been
properly zapped.

But, the problem was bigger than this.  For one, arch_unmap()
can free VMAs.  But, the calling __do_munmap() has variables
that *point* to VMAs and obviously can't handle them just
getting freed while the pointer is still in use.

I tried a couple of things here.  First, I tried to fix the page
table freeing problem in isolation, but I then found the VMA
issue.  I also tried having the MPX code return a flag if it
modified the rbtree which would force __do_munmap() to re-walk
to restart.  That spiralled out of control in complexity pretty
fast.

Just moving arch_unmap() and accepting that the bonkers failure
case might eat some bounds tables seems like the simplest viable
fix.

This was also reported in the following kernel bugzilla entry:

https://bugzilla.kernel.org/show_bug.cgi?id=203123

There are some reports that dd2283f2605e ("mm: mmap: zap pages
with read mmap_sem in munmap") triggered this issue.  While that
commit certainly made the issue easier to hit, I believe the
fundamental issue has been with us as long as MPX itself, thus
the Fixes: tag below is for one of the original MPX commits.

Reported-by: Richard Biener 
Reported-by: H.J. Lu 
Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Cc: Yang Shi 
Cc: Michal Hocko 
Cc: Vlastimil Babka 
Cc: Andy Lutomirski 
Cc: x...@kernel.org
Cc: Andrew Morton 
Cc: 

Re: [PATCH V4 3/3] ASoC: fsl_asrc: Unify the supported input and output rate

2019-04-19 Thread Nicolin Chen
On Fri, Apr 19, 2019 at 10:23:56AM +0000, S.j. Wang wrote:
> Unify the supported input and output rate, add the
> 12kHz/24kHz/128kHz to the support list
> 
> Signed-off-by: Shengjiu Wang 
> ---
>  sound/soc/fsl/fsl_asrc.c | 32 +++-
>  1 file changed, 19 insertions(+), 13 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_asrc.c b/sound/soc/fsl/fsl_asrc.c
> index 2c4bbc3499db..0d06e738264a 100644
> --- a/sound/soc/fsl/fsl_asrc.c
> +++ b/sound/soc/fsl/fsl_asrc.c
> @@ -27,13 +27,14 @@
>  dev_dbg(&asrc_priv->pdev->dev, "Pair %c: " fmt, 'A' + index, ##__VA_ARGS__)
>  
>  /* Corresponding to process_option */
> -static int supported_input_rate[] = {
> - 5512, 8000, 11025, 16000, 22050, 32000, 44100, 48000, 64000, 88200,
> - 96000, 176400, 192000,
> +static unsigned int supported_asrc_rate[] = {
> + 5512, 8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000,
> + 64000, 88200, 96000, 128000, 176400, 192000,
>  };
>  
> -static int supported_asrc_rate[] = {
> - 8000, 11025, 16000, 22050, 32000, 44100, 48000, 64000, 88200, 96000, 176400, 192000,
> +static struct snd_pcm_hw_constraint_list fsl_asrc_rate_constraints = {
> + .count = ARRAY_SIZE(supported_asrc_rate),
> + .list = supported_asrc_rate,
>  };
>  
>  /**
> @@ -293,11 +294,11 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
>   ideal = config->inclk == INCLK_NONE;
>  
>   /* Validate input and output sample rates */
> - for (in = 0; in < ARRAY_SIZE(supported_input_rate); in++)
> - if (inrate == supported_input_rate[in])
> + for (in = 0; in < ARRAY_SIZE(supported_asrc_rate); in++)
> + if (inrate == supported_asrc_rate[in])
>   break;

Not sure if we still need it upon having hw_constraint. Maybe m2m
needs it?


Re: [PATCH V4 2/3] ASoC: fsl_asrc: replace the process_option table with function

2019-04-19 Thread Nicolin Chen
On Fri, Apr 19, 2019 at 10:23:53AM +0000, S.j. Wang wrote:

> @@ -289,6 +318,12 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
>   return -EINVAL;
>   }
>  
> + ret = fsl_asrc_sel_proc(inrate, outrate, &pre_proc, &post_proc);

Since the function always returns 0, I am thinking of treating
this function as a lookup function, and then moving this call
right before the register settings -- as we have already made
sure that both inrate and outrate are supported.

> + if (ret) {
> + pair_err("No supported pre-processing options\n");
> + return ret;
> + }

And probably no longer need this error-out. If there's a new
limitation related to this function, I believe we can add it
to the rate validation section as we are doing now -- better
to have rate validation code at one place.

> @@ -380,8 +415,8 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
>   /* Apply configurations for pre- and post-processing */

Here:
-   /* Apply configurations for pre- and post-processing */
+   /* Select and apply configurations for pre- and post-processing */
+   fsl_asrc_sel_proc(inrate, outrate, &pre_proc, &post_proc);
>   regmap_update_bits(asrc_priv->regmap, REG_ASRCFG,
>  ASRCFG_PREMODi_MASK(index) | ASRCFG_POSTMODi_MASK(index),
> -ASRCFG_PREMOD(index, process_option[in][out][0]) |
> -ASRCFG_POSTMOD(index, process_option[in][out][1]));
> +ASRCFG_PREMOD(index, pre_proc) |
> +ASRCFG_POSTMOD(index, post_proc));


Re: [PATCH V4 1/3] ASoC: fsl_asrc: Fix the issue about unsupported rate

2019-04-19 Thread Nicolin Chen
On Fri, Apr 19, 2019 at 10:23:50AM +0000, S.j. Wang wrote:
> When the output sample rate is [8kHz, 30kHz], the limitation
> of the supported ratio range is (1/24, 8). In the driver
> we use (8kHz, 30kHz) instead of [8kHz, 30kHz].
> So this patch is to fix this issue and the potential rounding
> issue with divider.
> 
> Fixes: fff6e03c7b65 ("ASoC: fsl_asrc: add support for 8-30kHz
> output sample rate")
> Cc: 
> Signed-off-by: Shengjiu Wang 
> ---
>  sound/soc/fsl/fsl_asrc.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_asrc.c b/sound/soc/fsl/fsl_asrc.c
> index 0b937924d2e4..5b8adc7fb117 100644
> --- a/sound/soc/fsl/fsl_asrc.c
> +++ b/sound/soc/fsl/fsl_asrc.c
> @@ -282,10 +282,10 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
>   return -EINVAL;
>   }
>  
> - if ((outrate > 8000 && outrate < 30000) &&
> - (outrate/inrate > 24 || inrate/outrate > 8)) {
> - pair_err("exceed supported ratio range [1/24, 8] for \
> - inrate/outrate: %d/%d\n", inrate, outrate);
> + if ((outrate >= 8000 && outrate <= 30000) &&
> + (outrate > 24 * inrate || inrate > 8 * outrate)) {
> + pair_err("exceed supported ratio range (1/24, 8) for inrate/outrate: %d/%d\n",

Using one of the conditions:
if (inrate > 8 * outrate)
pair_err();

This means:
if (inrate <= 8 * outrate)
/* Everything is fine */

So the supported ratio range is still [1/24, 8] right?
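Put as standalone code, the closed-interval check we seem to agree on
would be something like this sketch (not the driver's actual code):

static bool ratio_in_range(int inrate, int outrate)
{
	/* inrate/outrate must stay within [1/24, 8] */
	return outrate <= 24 * inrate && inrate <= 8 * outrate;
}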

Thanks


Re: Linux 5.1-rc5

2019-04-19 Thread Linus Torvalds
On Fri, Apr 19, 2019 at 6:33 AM Martin Schwidefsky
 wrote:
>
> That problem got stuck in my head and I thought more about it. Why not
> emulate the static folding sequence in the s390 page table code?

So this model seems much closer to what x86 does in its folding, where
the pattern is basically

> static inline pX-1d_t *pXd_offset(pXd_t *pXd, unsigned long address)
> {
> if (pXd_folded(pXd))
> return (pX-1d_t *) pXd;
> return (pX-1d_t *) pXd_deref(*pXd) + pXd_index(address);
> }

which is really how the code is designed to work (ie the folded entry
doesn't actually do anything to the page directory pointer, it just
says "ok, we'll use this exact page directory pointer for the next
lower level instead").

And that's very much what allows the generic gup code to load the
entry once, and use a temporary, and as you walk down the chain, if it
is folded it just then uses that (previous) temporary value for the
next level instead. IOW, the lower level page table is hidden inside
the upper level one, and folding just means "don't do any offsets,
don't change any values, just use the entry as-is for the next lower
level".
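As a sketch of that pattern (simplified names, not the exact gup code),
each level loads the entry into a local, and a folded level just reuses
the address of that local as the "table" for the next level:

	pgd_t pgd = READ_ONCE(*pgdp);
	p4d_t *p4dp = p4d_offset(&pgd, addr);	/* folded: (p4d_t *)&pgd */
	p4d_t p4d = READ_ONCE(*p4dp);
	pud_t *pudp = pud_offset(&p4d, addr);	/* folded: (pud_t *)&p4d */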

So I think that's the right thing to do.

Looking at the s390 code, it seems to fold things the other way,
conceptually hiding the upper level inside the lower one, and always
doing the offset thing (but just avoiding the dereference).

Maybe there's some reason why the s390 code does it that way, but I
think your new model is the right one, and hopefully means you can use
the generic page table walking more easily.

Of course, the s390 folding is very different from the x86 one (or the
generic fixed 3-level of 4-level cases). The x86 folding doesn't
depend on the contents of the page tables, it's just entirely static
(well, the 5th level is conditional, but it's conditional on a static
key, not on what is in the page tables). So maybe the old model of
s390 made more sense in that context, but I look at your new suggested
pXd_offset() functions and I go "yeah, that's the way it's supposed to
work".

 Linus


Re: [PATCH v12 09/31] mm: VMA sequence count

2019-04-19 Thread Laurent Dufour

Hi Jerome,

Thanks a lot for reviewing this series.

On 19/04/2019 at 00:48, Jerome Glisse wrote:

On Tue, Apr 16, 2019 at 03:45:00PM +0200, Laurent Dufour wrote:

From: Peter Zijlstra 

Wrap the VMA modifications (vma_adjust/unmap_page_range) with sequence
counts such that we can easily test if a VMA is changed.

The calls to vm_write_begin/end() in unmap_page_range() are
used to detect when a VMA is being unmapped and thus that new page faults
should not be satisfied for this VMA. If the seqcount hasn't changed when
the page tables are locked, this means we are safe to satisfy the page
fault.

The flip side is that we cannot distinguish between a vma_adjust() and
the unmap_page_range() -- where with the former we could have
re-checked the vma bounds against the address.

The VMA's sequence counter is also used to detect change to various VMA's
fields used during the page fault handling, such as:
  - vm_start, vm_end
  - vm_pgoff
  - vm_flags, vm_page_prot
  - vm_policy


^ All above are under mmap write lock ?


Yes, changes are still made under the protection of the mmap_sem.




  - anon_vma


^ This is either under mmap write lock or under page table lock

So my question is do we need the complexity of seqcount_t for this ?


The sequence counter is used to detect write operations done while a
reader (the SPF handler) is running.


The implementation is quite simple (here without the lockdep checks):

static inline void raw_write_seqcount_begin(seqcount_t *s)
{
s->sequence++;
smp_wmb();
}

I can't see why this is too complex here, would you elaborate on this ?
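For comparison, the reader side that pairs with this (a sketch against
this series' vm_sequence field, simplified and without the surrounding
locking) would be:

	unsigned int seq;

	do {
		seq = read_seqcount_begin(&vma->vm_sequence);
		/* speculatively read vm_start/vm_end/vm_flags here */
	} while (read_seqcount_retry(&vma->vm_sequence, seq));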



It seems that using a regular int as counter and also relying on vm_flags
when the vma is unmapped should do the trick.


vm_flags is not enough I guess, as some operations do not impact the
vm_flags at all (resizing, for instance).

Am I missing something ?



vma_delete(struct vm_area_struct *vma)
{
 ...
 /*
  * Make sure the vma is marked as invalid, ie neither read nor write,
  * so that speculative faults back off. A racing speculative fault
  * will either see the flags as 0 or the new seqcount.
  */
 vma->vm_flags = 0;
 smp_wmb();
 vma->seqcount++;
 ...
}


Well I don't think we can safely clear the vm_flags this way when the
VMA is unmapped; I think they are used later when cleanup is done.


Later in this series, the VMA deletion is managed when the VMA is 
unlinked from the RB Tree. That is checked using the vm_rb field's 
value, and managed using RCU.



Then:
speculative_fault_begin(struct vm_area_struct *vma,
 struct spec_vmf *spvmf)
{
 ...
 spvmf->seqcount = vma->seqcount;
 smp_rmb();
 spvmf->vm_flags = vma->vm_flags;
 if (!spvmf->vm_flags) {
 // Back off, the vma is dying ...
 ...
 }
}

bool speculative_fault_commit(struct vm_area_struct *vma,
   struct spec_vmf *spvmf)
{
 ...
 seqcount = vma->seqcount;
 smp_rmb();
 vm_flags = vma->vm_flags;

 if (spvmf->vm_flags != vm_flags || seqcount != spvmf->seqcount) {
 // Something did change for the vma
 return false;
 }
 return true;
}

This would also avoid the lockdep issue described below. But maybe what
I propose is stupid and I will see it after further reviewing things.


That's true that the lockdep is quite annoying here. But it is still
interesting to keep it in the loop to avoid two subsequent
write_seqcount_begin() calls being made in the same context (which would
lead to an even sequence counter value while a write operation is in
progress). So I think it is still a good thing to have lockdep
available here.






Cheers,
Jérôme




Signed-off-by: Peter Zijlstra (Intel) 

[Port to 4.12 kernel]
[Build depends on CONFIG_SPECULATIVE_PAGE_FAULT]
[Introduce vm_write_* inline function depending on
  CONFIG_SPECULATIVE_PAGE_FAULT]
[Fix lock dependency between mapping->i_mmap_rwsem and vma->vm_sequence by
  using vm_raw_write* functions]
[Fix a lock dependency warning in mmap_region() when entering the error
  path]
[move sequence initialisation INIT_VMA()]
[Review the patch description about unmap_page_range()]
Signed-off-by: Laurent Dufour 
---
  include/linux/mm.h   | 44 
  include/linux/mm_types.h |  3 +++
  mm/memory.c  |  2 ++
  mm/mmap.c| 30 +++
  4 files changed, 79 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2ceb1d2869a6..906b9e06f18e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1410,6 +1410,9 @@ struct zap_details {
  static inline void INIT_VMA(struct vm_area_struct *vma)
  {
	INIT_LIST_HEAD(&vma->anon_vma_chain);
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+   seqcount_init(&vma->vm_sequence);
+#endif
  }
  
  struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,

@@ -1534,6 +1537,47 @@ static inline void unmap_shared_mapping_range(struct 

Re: [PATCH v2 06/11] MIPS: mark __fls() as __always_inline

2019-04-19 Thread Mathieu Malaterre
Hi,

On Fri, Apr 19, 2019 at 12:06 PM Masahiro Yamada
 wrote:
>
> This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
> place. We need to eliminate potential issues beforehand.
>
> If it is enabled for mips, the following errors are reported:
>
> arch/mips/mm/sc-mips.o: In function `mips_sc_prefetch_enable.part.2':
> sc-mips.c:(.text+0x98): undefined reference to `mips_gcr_base'
> sc-mips.c:(.text+0x9c): undefined reference to `mips_gcr_base'
> sc-mips.c:(.text+0xbc): undefined reference to `mips_gcr_base'
> sc-mips.c:(.text+0xc8): undefined reference to `mips_gcr_base'
> sc-mips.c:(.text+0xdc): undefined reference to `mips_gcr_base'
> arch/mips/mm/sc-mips.o:sc-mips.c:(.text.unlikely+0x44): more undefined references to `mips_gcr_base'

Tested with success on ppc32/G4. But on CI20 (ci20_defconfig from
master), I get:

  MODPOST vmlinux.o
mipsel-linux-gnu-ld: arch/mips/kernel/traps.o: in function
`addr_gcr_err_control':
/home/mathieu/tmp/linux/linux/ci20/../arch/mips/include/asm/mips-cm.h:169:
undefined reference to `mips_gcr_base'
mipsel-linux-gnu-ld:
/home/mathieu/linux/linux/ci20/../arch/mips/include/asm/mips-cm.h:169:
undefined reference to `mips_gcr_base'
mipsel-linux-gnu-ld: arch/mips/mm/sc-mips.o: in function
`addr_gcr_l2_pft_control':
/home/mathieu/linux/linux/ci20/../arch/mips/include/asm/mips-cm.h:246:
undefined reference to `mips_gcr_base'
mipsel-linux-gnu-ld:
/home/mathieu/linux/linux/ci20/../arch/mips/include/asm/mips-cm.h:246:
undefined reference to `mips_gcr_base'
mipsel-linux-gnu-ld:
/home/mathieu/linux/linux/ci20/../arch/mips/include/asm/mips-cm.h:246:
undefined reference to `mips_gcr_base'
mipsel-linux-gnu-ld:
arch/mips/mm/sc-mips.o:/home/mathieu/linux/linux/ci20/../arch/mips/include/asm/mips-cm.h:246:
more undefined references to `mips_gcr_base' follow


> Signed-off-by: Masahiro Yamada 
> ---
>
> Changes in v2:
>   - new patch
>
>  arch/mips/include/asm/bitops.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
> index 830c93a010c3..6a26ead1c2b6 100644
> --- a/arch/mips/include/asm/bitops.h
> +++ b/arch/mips/include/asm/bitops.h
> @@ -482,7 +482,7 @@ static inline void __clear_bit_unlock(unsigned long nr, volatile unsigned long *
>   * Return the bit position (0..63) of the most significant 1 bit in a word
>   * Returns -1 if no 1 bit exists
>   */
> -static inline unsigned long __fls(unsigned long word)
> +static __always_inline unsigned long __fls(unsigned long word)
>  {
> int num;
>
> --
> 2.17.1
>


[PATCH] vfio-pci/nvlink2: Fix potential VMA leak

2019-04-19 Thread Greg Kurz
If vfio_pci_register_dev_region() fails then we should roll back
previous changes, ie. unmap the ATSD registers.

Signed-off-by: Greg Kurz 
---
 drivers/vfio/pci/vfio_pci_nvlink2.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_nvlink2.c b/drivers/vfio/pci/vfio_pci_nvlink2.c
index 32f695ffe128..50fe3c4f7feb 100644
--- a/drivers/vfio/pci/vfio_pci_nvlink2.c
+++ b/drivers/vfio/pci/vfio_pci_nvlink2.c
@@ -472,6 +472,8 @@ int vfio_pci_ibm_npu2_init(struct vfio_pci_device *vdev)
return 0;
 
 free_exit:
+   if (data->base)
+   memunmap(data->base);
kfree(data);
 
return ret;



[PATCH] powerpc/powernv/npu: Fix reference leak

2019-04-19 Thread Greg Kurz
Since 902bdc57451c, get_pci_dev() calls pci_get_domain_bus_and_slot(). This
has the effect of incrementing the reference count of the PCI device, as
explained in drivers/pci/search.c:

 * Given a PCI domain, bus, and slot/function number, the desired PCI
 * device is located in the list of PCI devices. If the device is
 * found, its reference count is increased and this function returns a
 * pointer to its data structure.  The caller must decrement the
 * reference count by calling pci_dev_put().  If no device is found,
 * %NULL is returned.

Nothing was done to call pci_dev_put() and the reference count of GPU and
NPU PCI devices rockets up.

A natural way to fix this would be to teach the callers about the change,
so that they call pci_dev_put() when done with the pointer. This turns
out to be quite intrusive, as it affects many paths in npu-dma.c,
pci-ioda.c and vfio_pci_nvlink2.c. Also, the issue appeared in 4.16 and
some affected code got moved around since then: it would be problematic
to backport the fix to stable releases.

All that code never cared for reference counting anyway. Call pci_dev_put()
from get_pci_dev() to revert to the previous behavior.

Fixes: 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn")
Cc: sta...@vger.kernel.org # v4.16
Signed-off-by: Greg Kurz 
---
 arch/powerpc/platforms/powernv/npu-dma.c |   15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index e713ade30087..d8f3647e8fb2 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -31,9 +31,22 @@ static DEFINE_SPINLOCK(npu_context_lock);
 static struct pci_dev *get_pci_dev(struct device_node *dn)
 {
struct pci_dn *pdn = PCI_DN(dn);
+   struct pci_dev *pdev;
 
-   return pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
+   pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
   pdn->busno, pdn->devfn);
+
+   /*
+* pci_get_domain_bus_and_slot() increased the reference count of
+* the PCI device, but callers don't need that actually as the PE
+* already holds a reference to the device. Since callers aren't
+* aware of the reference count change, call pci_dev_put() now to
+* avoid leaks.
+*/
+   if (pdev)
+   pci_dev_put(pdev);
+
+   return pdev;
 }
 
 /* Given a NPU device get the associated PCI device. */



Re: [alsa-devel] [PATCH V2] ASoC: fsl_esai: Add pm runtime function

2019-04-19 Thread Mark Brown
On Fri, Apr 19, 2019 at 11:01:21AM +0000, S.j. Wang wrote:

> > fsl_esai_probe(struct platform_device *pdev)
> > return ret;
> > }
> > 
> > +   pm_runtime_enable(&pdev->dev);
> > +

> I just have a question, do I need to add pm_runtime_idle(&pdev->dev)?

It gets used to help drivers get into the correct state on startup; if
you're unsure whether it's 100% required, it shouldn't hurt.
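In other words, the probe-time sequence under discussion is just (a
sketch, assuming a typical platform driver):

	pm_runtime_enable(&pdev->dev);	/* allow runtime PM transitions */
	pm_runtime_idle(&pdev->dev);	/* optional: request an immediate idle
					 * check so the device can suspend */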




Re: [PATCH] kernel/crash: make parse_crashkernel()'s return value more indicant

2019-04-19 Thread Pingfan Liu
On Fri, Apr 19, 2019 at 4:19 PM Thomas Gleixner  wrote:
>
> On Fri, 19 Apr 2019, Pingfan Liu wrote:
>
> > At present, both the return value and crash_size should be checked to
> > guarantee the success of parse_crashkernel().
> > Simplify this by returning negative on failure and zero or positive on
> > success. In case of failure, -EINVAL means bad syntax and -1 means the
> > parsing resulted in crash_size=0.
>
> I'm not entirely sure what you are trying to say here, but '-1' is not an
> improvement at all. We surely are not short of proper error codes, right?
>
The different negative return values are only used by x86. The option
"crashkernel=X,high", which is only used on x86, causes parse_crashkernel()
to return -EINVAL, which then lets parse_crashkernel_high() have a try.

When parsing crashkernel=size@offset and crashkernel=range1:size1,
there are other failure cases for which it is not worth calling
parse_crashkernel_high(). That is what "-1" is aiming for.

First, in parse_crashkernel_mem(), the demanded size may be bigger than
system RAM; this looks like -ENOMEM, but -ENOMEM is normally used for
allocation failures. Second, in parse_crashkernel_mem(), system RAM may
not be inside the range listed by "crashkernel=". Third, crashkernel=0MB
may be given in the option (not done in practice, but we cannot forbid
the user from doing so).

All of these cases can be treated as -EINVAL, but it is hard to define
distinct error codes for them.
> Also I don't see any positive return value > 0. So what is this about:
>
Yes. 0 is enough for success.  I had thought about returning 1 if
@offset is specified in crashkernel, but at present there is no use case
for it.
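For illustration, the caller-side convention being proposed looks
roughly like this sketch (not a verbatim hunk from the patch):

	ret = parse_crashkernel(boot_command_line, total_mem,
				&crash_size, &crash_base);
	if (ret < 0)	/* negative: bad syntax or zero size */
		return;
	/* ret >= 0: crash_size and crash_base are valid */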

> > --- a/arch/ia64/kernel/setup.c
> > +++ b/arch/ia64/kernel/setup.c
> > @@ -277,7 +277,7 @@ static void __init setup_crashkernel(unsigned long total, int *n)
> >
> >   ret = parse_crashkernel(boot_command_line, total,
> >   &size, &base);
> > - if (ret == 0 && size > 0) {
> > + if (ret >= 0) {
>
>   ^^^  
>
> >   if (!memory_region_available(crash_base, crash_size)) {
> > diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
> > index 45a8d0b..0b626e2 100644
> > --- a/arch/powerpc/kernel/fadump.c
> > +++ b/arch/powerpc/kernel/fadump.c
> > @@ -376,7 +376,7 @@ static inline unsigned long fadump_calculate_reserve_size(void)
> >*/
> >   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> >   &size, &base);
> > - if (ret == 0 && size > 0) {
> > + if (ret >= 0) {
>
> and this ?
>
> >   unsigned long max_size;
> >
> >   if (fw_dump.reserve_bootvar)
> > diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
> > index 63f5a93..9f3e61a 100644
> > --- a/arch/powerpc/kernel/machine_kexec.c
> > +++ b/arch/powerpc/kernel/machine_kexec.c
> > @@ -122,7 +122,7 @@ void __init reserve_crashkernel(void)
> >   /* use common parsing */
> >   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> >   &crash_size, &crash_base);
> > - if (ret == 0 && crash_size > 0) {
> > + if (ret >= 0) {
>
> Again.
>
> >   crashk_res.start = crash_base;
> >   crashk_res.end = crash_base + crash_size - 1;
> >   }
> > --- a/arch/sh/kernel/machine_kexec.c
> > +++ b/arch/sh/kernel/machine_kexec.c
> > @@ -157,7 +157,7 @@ void __init reserve_crashkernel(void)
> >
> >   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> >   &crash_size, &crash_base);
> > - if (ret == 0 && crash_size > 0) {
> > + if (ret >= 0) {
>
> And some more.
>
> >   crashk_res.start = crash_base;
> >   crashk_res.end = crash_base + crash_size - 1;
> >   }
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 3d872a5..62d07d4 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -526,11 +526,11 @@ static void __init reserve_crashkernel(void)
> >
> >   /* crashkernel=XM */
> >   ret = parse_crashkernel(boot_command_line, total_mem, &crash_size, &crash_base);
> > - if (ret != 0 || crash_size <= 0) {
> > + if (ret == -EINVAL) {
>
> Without an explanation why this proceeds on error codes other than EINVAL
> this is incomprehensible. Comments exist for a reason.
>
As explained above, this decides whether to let parse_crashkernel_high() try.

> >   /* crashkernel=X,high */
> >   ret = parse_crashkernel_high(boot_command_line, total_mem,
> >    &crash_size, &crash_base);
> > - if (ret != 0 || crash_size <= 0)
> > + if (ret < 0)
> >   return;
> >   high = true;
>
> > @@ -87,7 +87,7 @@ static int __init parse_crashkernel_mem(char *cmdline,
> >   cur = tmp;
> >   if (size >= system_ram) {
> >   pr_warn("crashkernel: invalid size\n");
> > - return -EINVAL;
> > + 

RE: [PATCH 02/13] soc/fsl/bman: map FBPR area in the iommu

2019-04-19 Thread Laurentiu Tudor
Hi Robin,

> -Original Message-
> From: Robin Murphy 
> Sent: Friday, March 29, 2019 4:51 PM
> 
> On 29/03/2019 14:00, laurentiu.tu...@nxp.com wrote:
> > From: Laurentiu Tudor 
> >
> > Add a one-to-one iommu mapping for bman private data memory (FBPR).
> > This is required for BMAN to work without faults behind an iommu.
> >
> > Signed-off-by: Laurentiu Tudor 
> > ---
> >   drivers/soc/fsl/qbman/bman_ccsr.c | 11 +++
> >   1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/soc/fsl/qbman/bman_ccsr.c
> b/drivers/soc/fsl/qbman/bman_ccsr.c
> > index 7c3cc968053c..b209c79511bb 100644
> > --- a/drivers/soc/fsl/qbman/bman_ccsr.c
> > +++ b/drivers/soc/fsl/qbman/bman_ccsr.c
> > @@ -29,6 +29,7 @@
> >*/
> >
> >   #include "bman_priv.h"
> > +#include 
> >
> >   u16 bman_ip_rev;
> >   EXPORT_SYMBOL(bman_ip_rev);
> > @@ -178,6 +179,7 @@ static int fsl_bman_probe(struct platform_device *pdev)
> > int ret, err_irq;
> > struct device *dev = &pdev->dev;
> > struct device_node *node = dev->of_node;
> > +   struct iommu_domain *domain;
> > struct resource *res;
> > u16 id, bm_pool_cnt;
> > u8 major, minor;
> > @@ -225,6 +227,15 @@ static int fsl_bman_probe(struct platform_device *pdev)
> >
> > dev_dbg(dev, "Allocated FBPR 0x%llx 0x%zx\n", fbpr_a, fbpr_sz);
> >
> > +   /* Create an 1-to-1 iommu mapping for FBPR area */
> > +   domain = iommu_get_domain_for_dev(dev);
> 
> If that's expected to be the default domain that you're grabbing, then
> this is *incredibly* fragile. There's nothing to stop the IOVA that you
> forcibly map from being automatically allocated later and causing some
> other DMA mapping to fail noisily and unexpectedly. Furthermore, have
> you tried this with "iommu.passthrough=1"?
> 
> That said, I really don't understand what's going on here anyway :/
> 
> As far as I can tell from qbman_init_private_mem(), fbpr_a comes from
> dma_alloc_coherent() and thus would already be a mapped IOVA - isn't
> this the stuff that Roy converted to nicely use shared-dma-pool regions
> a while ago?
> 

Finally found some time to look into this, sorry for the delay. It seems that
on the code path taken in our case (dma_alloc_coherent() -> dma_alloc_attrs()
-> dma_alloc_from_dev_coherent() -> __dma_alloc_from_coherent()) there's no
call into the iommu layer, thus no mapping in the smmu. I plan to come up with
an RFC patch early next week so we have something concrete to discuss.
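For context, the 1-to-1 mapping attempted by the patch boils down to a
call like the following sketch (hedged -- the RFC mentioned above may
end up doing this differently):

	domain = iommu_get_domain_for_dev(dev);
	if (domain)
		ret = iommu_map(domain, fbpr_a, fbpr_a, fbpr_sz,
				IOMMU_READ | IOMMU_WRITE);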

---
Best Regards, Laurentiu


Re: Linux 5.1-rc5

2019-04-19 Thread Martin Schwidefsky
On Thu, 18 Apr 2019 20:41:44 +0200
Martin Schwidefsky  wrote:

> On Thu, 18 Apr 2019 08:49:32 -0700
> Linus Torvalds  wrote:
> 
> > On Thu, Apr 18, 2019 at 1:02 AM Martin Schwidefsky
> >  wrote:  
> > >
> > > The problematic lines in the generic gup code are these three:
> > >
> > > 1845:   pmdp = pmd_offset(&pud, addr);
> > > 1888:   pudp = pud_offset(&p4d, addr);
> > > 1916:   p4dp = p4d_offset(&pgd, addr);
> > >
> > > Passing the pointer of a *copy* of a page table entry to pxd_offset() does
> > > not work with the page table folding on s390.
> > 
> > Hmm. I wonder why. x86 too does the folding thing for the p4d and pud case.
> > 
> > The folding works with the local copy just the same way it works with
> > the orignal value.  
> 
> The difference is that with the static page table folding pgd_offset()
> does the index calculation of the actual hardware top-level table. With
> dynamic page table folding as s390 is doing it, if the task does not use
> a 5-level page table pgd_offset() will see a pgd_index() of 0, the indexing
> of the actual top-level table is done later with p4d_offset(), pud_offset()
> or pmd_offset(). 
> 
> As an example, with a three level page table we have three indexes x/y/z.
> The common code "thinks" 5 indexing steps, with static folding the index
> sequence is x 0 0 y z. With dynamic folding the sequence is 0 0 x y z.
> By moving the first indexing operation to pgd_offset the static sequence
> does not add an index to a non-dereferenced pointer to a stack variable,
> the dynamic sequence does.

That problem got stuck in my head and I thought more about it. Why not
emulate the static folding sequence in the s390 page table code?

As the table type is encoded in every entry for the region and segment
tables, pgd_offset() can look at the first entry to find the table type
and then do the correct index calculation for the given top-level table.
Like this:

static inline pgd_t *pgd_offset_raw(pgd_t *pgd, unsigned long address)
{
unsigned long rste;
unsigned int shift;

/* Get the first entry of the top level table */
rste = pgd_val(*pgd);
/* Pick up the shift from the table type of the first entry */
shift = ((rste & _REGION_ENTRY_TYPE_MASK) >> 2) * 11 + 20;
return pgd + ((address >> shift) & (PTRS_PER_PGD - 1));
}

#define pgd_offset(mm, address) pgd_offset_raw((mm)->pgd, address)
#define pgd_offset_k(address) pgd_offset(&init_mm, address)

static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
{
if ((pgd_val(*pgd) & _REGION_ENTRY_TYPE_MASK) != _REGION_ENTRY_TYPE_R1)
return (p4d_t *) pgd;
return (p4d_t *) pgd_deref(*pgd) + p4d_index(address);
}

static inline pud_t *pud_offset(p4d_t *p4d, unsigned long address)
{
if ((p4d_val(*p4d) & _REGION_ENTRY_TYPE_MASK) != _REGION_ENTRY_TYPE_R2)
return (pud_t *) p4d;
return (pud_t *) p4d_deref(*p4d) + pud_index(address);
}

static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
if ((pud_val(*pud) & _REGION_ENTRY_TYPE_MASK) != _REGION_ENTRY_TYPE_R3)
return (pmd_t *) pud;
return (pmd_t *) pud_deref(*pud) + pmd_index(address);
}

This needs more thorough testing but in principle it does work. The kernel
boots and survives a kernel compile. The only thing that is slightly off is
that pgd_offset() now has to look at the first table entry to do its job.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[alsa-devel] [PATCH V2] ASoC: fsl_esai: Add pm runtime function

2019-04-19 Thread S.j. Wang


Hi

> 
> 
> Add pm runtime support and move clock handling there.
> fsl_esai_suspend is replaced by pm_runtime_force_suspend.
> fsl_esai_resume is replaced by pm_runtime_force_resume.
> 
> Signed-off-by: Shengjiu Wang 
> ---
> Changes in v2
> -refine the commit comments.
> -move regcache_mark_dirty to runtime suspend.
> 
>  sound/soc/fsl/fsl_esai.c | 141 ++
> -
>  1 file changed, 77 insertions(+), 64 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c index
> bad0dfed6b68..10d2210c91ef 100644
> --- a/sound/soc/fsl/fsl_esai.c
> +++ b/sound/soc/fsl/fsl_esai.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
> 
> @@ -466,30 +467,6 @@ static int fsl_esai_startup(struct snd_pcm_substream *substream,
> struct snd_soc_dai *dai)
> {
> struct fsl_esai *esai_priv = snd_soc_dai_get_drvdata(dai);
> -   int ret;
> -
> -   /*
> -* Some platforms might use the same bit to gate all three or two of
> -* clocks, so keep all clocks open/close at the same time for safety
> -*/
> -   ret = clk_prepare_enable(esai_priv->coreclk);
> -   if (ret)
> -   return ret;
> -   if (!IS_ERR(esai_priv->spbaclk)) {
> -   ret = clk_prepare_enable(esai_priv->spbaclk);
> -   if (ret)
> -   goto err_spbaclk;
> -   }
> -   if (!IS_ERR(esai_priv->extalclk)) {
> -   ret = clk_prepare_enable(esai_priv->extalclk);
> -   if (ret)
> -   goto err_extalck;
> -   }
> -   if (!IS_ERR(esai_priv->fsysclk)) {
> -   ret = clk_prepare_enable(esai_priv->fsysclk);
> -   if (ret)
> -   goto err_fsysclk;
> -   }
> 
> if (!dai->active) {
> /* Set synchronous mode */
> @@ -506,16 +483,6 @@ static int fsl_esai_startup(struct snd_pcm_substream *substream,
> 
> return 0;
> 
> -err_fsysclk:
> -   if (!IS_ERR(esai_priv->extalclk))
> -   clk_disable_unprepare(esai_priv->extalclk);
> -err_extalck:
> -   if (!IS_ERR(esai_priv->spbaclk))
> -   clk_disable_unprepare(esai_priv->spbaclk);
> -err_spbaclk:
> -   clk_disable_unprepare(esai_priv->coreclk);
> -
> -   return ret;
>  }
> 
>  static int fsl_esai_hw_params(struct snd_pcm_substream *substream,
> @@ -576,20 +543,6 @@ static int fsl_esai_hw_params(struct snd_pcm_substream *substream,
> return 0;
>  }
> 
> -static void fsl_esai_shutdown(struct snd_pcm_substream *substream,
> - struct snd_soc_dai *dai)
> -{
> -   struct fsl_esai *esai_priv = snd_soc_dai_get_drvdata(dai);
> -
> -   if (!IS_ERR(esai_priv->fsysclk))
> -   clk_disable_unprepare(esai_priv->fsysclk);
> -   if (!IS_ERR(esai_priv->extalclk))
> -   clk_disable_unprepare(esai_priv->extalclk);
> -   if (!IS_ERR(esai_priv->spbaclk))
> -   clk_disable_unprepare(esai_priv->spbaclk);
> -   clk_disable_unprepare(esai_priv->coreclk);
> -}
> -
>  static int fsl_esai_trigger(struct snd_pcm_substream *substream, int cmd,
> struct snd_soc_dai *dai)
> {
> @@ -658,7 +611,6 @@ static int fsl_esai_trigger(struct snd_pcm_substream *substream, int cmd,
> 
>  static const struct snd_soc_dai_ops fsl_esai_dai_ops = {
> .startup = fsl_esai_startup,
> -   .shutdown = fsl_esai_shutdown,
> .trigger = fsl_esai_trigger,
> .hw_params = fsl_esai_hw_params,
> .set_sysclk = fsl_esai_set_dai_sysclk,
> @@ -947,6 +899,10 @@ static int fsl_esai_probe(struct platform_device *pdev)
> return ret;
> }
> 
> +   pm_runtime_enable(&pdev->dev);
> +

I just have a question, do I need to add pm_runtime_idle(&pdev->dev)?

Best regards
Wang shengjiu


[PATCH V4 3/3] ASoC: fsl_asrc: Unify the supported input and output rate

2019-04-19 Thread S.j. Wang
Unify the supported input and output rate, add the
12kHz/24kHz/128kHz to the support list

Signed-off-by: Shengjiu Wang 
---
 sound/soc/fsl/fsl_asrc.c | 32 +++-
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/sound/soc/fsl/fsl_asrc.c b/sound/soc/fsl/fsl_asrc.c
index 2c4bbc3499db..0d06e738264a 100644
--- a/sound/soc/fsl/fsl_asrc.c
+++ b/sound/soc/fsl/fsl_asrc.c
@@ -27,13 +27,14 @@
	dev_dbg(&asrc_priv->pdev->dev, "Pair %c: " fmt, 'A' + index, ##__VA_ARGS__)
 
 /* Corresponding to process_option */
-static int supported_input_rate[] = {
-   5512, 8000, 11025, 16000, 22050, 32000, 44100, 48000, 64000, 88200,
-   96000, 176400, 192000,
+static unsigned int supported_asrc_rate[] = {
+   5512, 8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000,
+   64000, 88200, 96000, 128000, 176400, 192000,
 };
 
-static int supported_asrc_rate[] = {
-   8000, 11025, 16000, 22050, 32000, 44100, 48000, 64000, 88200, 96000, 176400, 192000,
+static struct snd_pcm_hw_constraint_list fsl_asrc_rate_constraints = {
+   .count = ARRAY_SIZE(supported_asrc_rate),
+   .list = supported_asrc_rate,
 };
 
 /**
@@ -293,11 +294,11 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
ideal = config->inclk == INCLK_NONE;
 
/* Validate input and output sample rates */
-   for (in = 0; in < ARRAY_SIZE(supported_input_rate); in++)
-   if (inrate == supported_input_rate[in])
+   for (in = 0; in < ARRAY_SIZE(supported_asrc_rate); in++)
+   if (inrate == supported_asrc_rate[in])
break;
 
-   if (in == ARRAY_SIZE(supported_input_rate)) {
+   if (in == ARRAY_SIZE(supported_asrc_rate)) {
pair_err("unsupported input sample rate: %dHz\n", inrate);
return -EINVAL;
}
@@ -311,7 +312,7 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
return -EINVAL;
}
 
-   if ((outrate >= 8000 && outrate <= 30000) &&
+   if ((outrate >= 5512 && outrate <= 30000) &&
(outrate > 24 * inrate || inrate > 8 * outrate)) {
	pair_err("exceed supported ratio range (1/24, 8) for inrate/outrate: %d/%d\n",
inrate, outrate);
@@ -490,7 +491,9 @@ static int fsl_asrc_dai_startup(struct snd_pcm_substream *substream,
snd_pcm_hw_constraint_step(substream->runtime, 0,
   SNDRV_PCM_HW_PARAM_CHANNELS, 2);
 
-   return 0;
+
+   return snd_pcm_hw_constraint_list(substream->runtime, 0,
+   SNDRV_PCM_HW_PARAM_RATE, &fsl_asrc_rate_constraints);
 }
 
 static int fsl_asrc_dai_hw_params(struct snd_pcm_substream *substream,
@@ -603,7 +606,6 @@ static int fsl_asrc_dai_probe(struct snd_soc_dai *dai)
return 0;
 }
 
-#define FSL_ASRC_RATES  SNDRV_PCM_RATE_8000_192000
 #define FSL_ASRC_FORMATS   (SNDRV_PCM_FMTBIT_S24_LE | \
 SNDRV_PCM_FMTBIT_S16_LE | \
 SNDRV_PCM_FMTBIT_S20_3LE)
@@ -614,14 +616,18 @@ static int fsl_asrc_dai_probe(struct snd_soc_dai *dai)
.stream_name = "ASRC-Playback",
.channels_min = 1,
.channels_max = 10,
-   .rates = FSL_ASRC_RATES,
+   .rate_min = 5512,
+   .rate_max = 192000,
+   .rates = SNDRV_PCM_RATE_KNOT,
.formats = FSL_ASRC_FORMATS,
},
.capture = {
.stream_name = "ASRC-Capture",
.channels_min = 1,
.channels_max = 10,
-   .rates = FSL_ASRC_RATES,
+   .rate_min = 5512,
+   .rate_max = 192000,
+   .rates = SNDRV_PCM_RATE_KNOT,
.formats = FSL_ASRC_FORMATS,
},
.ops = &fsl_asrc_dai_ops,
-- 
1.9.1



[PATCH V4 2/3] ASoC: fsl_asrc: replace the process_option table with function

2019-04-19 Thread S.j. Wang
When we want to support more sample rate, for example 12kHz/24kHz
we need update the process_option table, if we want to support more
sample rate next time, the table need to be updated again. which
is not flexible.

We got a function fsl_asrc_sel_proc to replace the table, which can
give the pre-processing and post-processing options according to
the sample rate.

Signed-off-by: Shengjiu Wang 
---
 sound/soc/fsl/fsl_asrc.c | 75 +++-
 1 file changed, 55 insertions(+), 20 deletions(-)

diff --git a/sound/soc/fsl/fsl_asrc.c b/sound/soc/fsl/fsl_asrc.c
index 5b8adc7fb117..2c4bbc3499db 100644
--- a/sound/soc/fsl/fsl_asrc.c
+++ b/sound/soc/fsl/fsl_asrc.c
@@ -26,24 +26,6 @@
 #define pair_dbg(fmt, ...) \
	dev_dbg(&asrc_priv->pdev->dev, "Pair %c: " fmt, 'A' + index, ##__VA_ARGS__)
 
-/* Sample rates are aligned with that defined in pcm.h file */
-static const u8 process_option[][12][2] = {
-   /* 8kHz 11.025kHz 16kHz 22.05kHz 32kHz 44.1kHz 48kHz 64kHz 88.2kHz 96kHz 176kHz 192kHz */
-   {{0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0},},  /* 5512Hz */
-   {{0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0},},  /* 8kHz */
-   {{0, 2}, {0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0},},  /* 11025Hz */
-   {{1, 2}, {0, 2}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0},},  /* 16kHz */
-   {{1, 2}, {1, 2}, {0, 2}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0},},  /* 22050Hz */
-   {{1, 2}, {2, 1}, {2, 1}, {0, 2}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0}, {0, 0},},  /* 32kHz */
-   {{2, 2}, {2, 2}, {2, 1}, {2, 1}, {0, 2}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0},},  /* 44.1kHz */
-   {{2, 2}, {2, 2}, {2, 1}, {2, 1}, {0, 2}, {0, 2}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 0}, {0, 0},},  /* 48kHz */
-   {{2, 2}, {2, 2}, {2, 2}, {2, 1}, {1, 2}, {0, 2}, {0, 2}, {0, 1}, {0, 1}, {0, 1}, {0, 1}, {0, 0},},  /* 64kHz */
-   {{2, 2}, {2, 2}, {2, 2}, {2, 2}, {1, 2}, {1, 2}, {1, 2}, {1, 1}, {1, 1}, {1, 1}, {1, 1}, {1, 1},},  /* 88.2kHz */
-   {{2, 2}, {2, 2}, {2, 2}, {2, 2}, {1, 2}, {1, 2}, {1, 2}, {1, 1}, {1, 1}, {1, 1}, {1, 1}, {1, 1},},  /* 96kHz */
-   {{2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 1}, {2, 1}, {2, 1}, {2, 1}, {2, 1},},  /* 176kHz */
-   {{2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 2}, {2, 1}, {2, 1}, {2, 1}, {2, 1}, {2, 1},},  /* 192kHz */
-};
-
 /* Corresponding to process_option */
 static int supported_input_rate[] = {
5512, 8000, 11025, 16000, 22050, 32000, 44100, 48000, 64000, 88200,
@@ -80,6 +62,51 @@
 static unsigned char *clk_map[2];
 
 /**
+ * Select the pre-processing and post-processing options
+ * Unsupported cases: Tsout > 8.125 * Tsin, Tsout > 16.125 * Tsin
+ *
+ * inrate: input sample rate
+ * outrate: output sample rate
+ * pre_proc: return value for pre-processing option
+ * post_proc: return value for post-processing option
+ */
+static int fsl_asrc_sel_proc(int inrate, int outrate,
+int *pre_proc, int *post_proc)
+{
+   bool post_proc_cond2;
+   bool post_proc_cond0;
+
+   /* select pre_proc between [0, 2] */
+   if (inrate * 8 > 33 * outrate)
+   *pre_proc = 2;
+   else if (inrate * 8 > 15 * outrate) {
+   if (inrate > 152000)
+   *pre_proc = 2;
+   else
+   *pre_proc = 1;
+   } else if (inrate < 76000)
+   *pre_proc = 0;
+   else if (inrate > 152000)
+   *pre_proc = 2;
+   else
+   *pre_proc = 1;
+
+   /* Condition for selection of post-processing */
+   post_proc_cond2 = (inrate * 15 > outrate * 16 && outrate < 56000) ||
+ (inrate > 56000 && outrate < 56000);
+   post_proc_cond0 = inrate * 23 < outrate * 8;
+
+   if (post_proc_cond2)
+   *post_proc = 2;
+   else if (post_proc_cond0)
+   *post_proc = 0;
+   else
+   *post_proc = 1;
+
+   return 0;
+}
+
+/**
  * Request ASRC pair
  *
  * It assigns pair by the order of A->C->B because allocation of pair B,
@@ -239,8 +266,10 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
u32 inrate, outrate, indiv, outdiv;
u32 clk_index[2], div[2];
int in, out, channels;
+   int pre_proc, post_proc;
struct clk *clk;
bool ideal;
+   int ret;
 
if (!config) {
pair_err("invalid pair config\n");
@@ -289,6 +318,12 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
return -EINVAL;
}
 
+   ret = fsl_asrc_sel_proc(inrate, outrate, _proc, _proc);
+   if (ret) {
+   

[PATCH V4 1/3] ASoC: fsl_asrc: Fix the issue about unsupported rate

2019-04-19 Thread S.j. Wang
When the output sample rate is [8kHz, 30kHz], the limitation
of the supported ratio range is (1/24, 8). In the driver
we use (8kHz, 30kHz) instead of [8kHz, 30kHz].
So this patch is to fix this issue and the potential rounding
issue with divider.

Fixes: fff6e03c7b65 ("ASoC: fsl_asrc: add support for 8-30kHz
output sample rate")
Cc: 
Signed-off-by: Shengjiu Wang 
---
 sound/soc/fsl/fsl_asrc.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/sound/soc/fsl/fsl_asrc.c b/sound/soc/fsl/fsl_asrc.c
index 0b937924d2e4..5b8adc7fb117 100644
--- a/sound/soc/fsl/fsl_asrc.c
+++ b/sound/soc/fsl/fsl_asrc.c
@@ -282,10 +282,10 @@ static int fsl_asrc_config_pair(struct fsl_asrc_pair *pair)
return -EINVAL;
}
 
-   if ((outrate > 8000 && outrate < 30000) &&
-   (outrate/inrate > 24 || inrate/outrate > 8)) {
-   pair_err("exceed supported ratio range [1/24, 8] for \
-   inrate/outrate: %d/%d\n", inrate, outrate);
+   if ((outrate >= 8000 && outrate <= 30000) &&
+   (outrate > 24 * inrate || inrate > 8 * outrate)) {
+   pair_err("exceed supported ratio range (1/24, 8) for inrate/outrate: %d/%d\n",
+   inrate, outrate);
return -EINVAL;
}
 
-- 
1.9.1



[PATCH V4 0/3] Support more sample rate in asrc

2019-04-19 Thread S.j. Wang
Support more sample rates in ASRC.

Shengjiu Wang (3):
  ASoC: fsl_asrc: Fix the issue about unsupported rate
  ASoC: fsl_asrc: replace the process_option table with function
  ASoC: fsl_asrc: Unify the supported input and output rate

Changes in v4
- add a patch to fix the [8kHz, 30kHz] open-interval issue

Changes in v3
- remove FSL_ASRC_RATES
- refine fsl_asrc_sel_proc according to comments

Changes in v2
- add more comments in code
- add commit "Unify the supported input and output rate"

 sound/soc/fsl/fsl_asrc.c | 113 ---
 1 file changed, 77 insertions(+), 36 deletions(-)

-- 
1.9.1



Re: [PATCH V3 1/2] ASoC: fsl_asrc: replace the process_option table with function

2019-04-19 Thread S.j. Wang
Hi

> 
> 
> On Thu, Apr 18, 2019 at 09:37:06AM +, S.j. Wang wrote:
> > > > > And this is according to IMX6DQRM:
> > > > > Limited support for the case when output sampling rates is
> > > > > between 8kHz and 30kHz. The limitation is the supported ratio
> > > > > (Fsin/Fsout) range as between 1/24 to 8
> > > > >
> > > > > This should cover your 8.125 condition already, even if having
> > > > > an outrate range between [8KHz, 30KHz] check, since an outrate
> > > > > above 30KHz will not have an inrate bigger than 8.125 times of
> > > > > it, given the maximum input rate is 192KHz.
> > > > >
> > > > > So I think that we can just drop that 8.125 condition from your
> > > > > change and there's no need to error out any more.
> > > > >
> > > > No, if outrate=8kHz,  inrate > 88.2kHz, these cases are not supported.
> > > > This is not covered by
> > > >
> > > > if ((outrate > 8000 && outrate < 30000) &&
> > > > (outrate/inrate > 24 || inrate/outrate > 8)) {
> > >
> > > Good catch. The range should be [8KHz, 30KHz] vs. (8KHz, 32KHz) in
> > > the code. Then I think the fix should be at both lines:
> > >
> > > - if ((outrate > 8000 && outrate < 30000) &&
> > > - (outrate/inrate > 24 || inrate/outrate > 8)) {
> > > + if ((outrate >= 8000 && outrate <= 30000) &&
> > > + (outrate > 24 * inrate || inrate > 8 * outrate)) {
> > >
> > > Overall, I think we should fix this instead of adding an extra one,
> > > since it is very likely saying the same thing.
> >
> > Actually if outrate < 8kHz, there will be issue too.
> 
> Here is the thing, the RM doesn't explicitly state that ASRC can support a
> lower output sample rate than 8KHz. And I actually had a concern when
> reviewing your PATCH-2, as the table of supported output sample rate no
> longer matches RM.
> 
> If you've verified a lower output sample rate working solid with the
> process_option function, that means our driver can go beyond the
> limitation mentioned in the RM, then I believe [8KHz, 32KHz] should be
> updated too -- that says we can do:
> -   if ((outrate > 8000 && outrate < 30000) &&
> -   (outrate/inrate > 24 || inrate/outrate > 8)) {
> +   if ((outrate >= 5512 && outrate <= 30000) &&
> +   (outrate > 24 * inrate || inrate > 8 * outrate)) {
> 
> Actually "ourate > 24 * inrate" is kind of pointless for range [5KHz, 32KHz]
> but we can keep it since it matches RM.

Ok, will send v4.

Best regards
Wang shengjiu


Re: [PATCH] powerpc/dts/fsl: add crypto node alias for B4

2019-04-19 Thread Horia Geanta
Has this slipped through?

On 3/20/2019 2:57 PM, Horia Geantă wrote:
> crypto node alias is needed by U-boot to identify the node and
> perform fix-ups, like adding "fsl,sec-era" property.
> 
> Signed-off-by: Horia Geantă 
> ---
>  arch/powerpc/boot/dts/fsl/b4qds.dtsi | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/powerpc/boot/dts/fsl/b4qds.dtsi b/arch/powerpc/boot/dts/fsl/b4qds.dtsi
> index 999efd3bc167..05be919f3545 100644
> --- a/arch/powerpc/boot/dts/fsl/b4qds.dtsi
> +++ b/arch/powerpc/boot/dts/fsl/b4qds.dtsi
> @@ -40,6 +40,7 @@
>   interrupt-parent = <&mpic>;
>  
>   aliases {
> + crypto = &crypto;
>   phy_sgmii_10 = &phy_sgmii_10;
>   phy_sgmii_11 = &phy_sgmii_11;
>   phy_sgmii_1c = &phy_sgmii_1c;
> 


[PATCH v2 11/11] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

2019-04-19 Thread Masahiro Yamada
Commit 60a3cdd06394 ("x86: add optimized inlining") introduced
CONFIG_OPTIMIZE_INLINING, but it has been available only for x86.

The idea is obviously arch-agnostic. This commit moves the config
entry from arch/x86/Kconfig.debug to lib/Kconfig.debug so that all
architectures can benefit from it.

This can make a huge difference in kernel image size especially when
CONFIG_OPTIMIZE_FOR_SIZE is enabled.

For example, I got a 3.5% smaller arm64 kernel for v5.1-rc1.

  dec   file
  18983424  arch/arm64/boot/Image.before
  18321920  arch/arm64/boot/Image.after
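
The mechanism behind the size difference, reduced to a sketch
(simplified from include/linux/compiler_types.h, not the exact macro):

/*
 * With OPTIMIZE_INLINING disabled, the kernel redefines 'inline' to
 * force inlining; with it enabled, 'inline' is only a hint and the
 * compiler may emit one shared out-of-line copy instead of duplicating
 * the body at every call site.
 */
#ifndef CONFIG_OPTIMIZE_INLINING
#define inline inline __attribute__((__always_inline__))
#endif

static inline int example_helper(int x)
{
    return x * 2 + 1; /* may be uninlined when the option is enabled */
}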

This also slightly improves the "Kernel hacking" Kconfig menu as
e61aca5158a8 ("Merge branch 'kconfig-diet' from Dave Hansen") suggested;
this config option would be a good fit in the "compiler option" menu.

Signed-off-by: Masahiro Yamada 
---

Changes in v2:
  - split into a separate patch

 arch/x86/Kconfig   |  3 ---
 arch/x86/Kconfig.debug | 14 --
 include/linux/compiler_types.h |  3 +--
 lib/Kconfig.debug  | 14 ++
 4 files changed, 15 insertions(+), 19 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5ad92419be19..9e93d109a6cb 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -310,9 +310,6 @@ config ZONE_DMA32
 config AUDIT_ARCH
def_bool y if X86_64
 
-config ARCH_SUPPORTS_OPTIMIZED_INLINING
-   def_bool y
-
 config ARCH_SUPPORTS_DEBUG_PAGEALLOC
def_bool y
 
diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 15d0fbe27872..f730680dc818 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -266,20 +266,6 @@ config CPA_DEBUG
---help---
  Do change_page_attr() self-tests every 30 seconds.
 
-config OPTIMIZE_INLINING
-   bool "Allow gcc to uninline functions marked 'inline'"
-   ---help---
- This option determines if the kernel forces gcc to inline the functions
- developers have marked 'inline'. Doing so takes away freedom from gcc to
- do what it thinks is best, which is desirable for the gcc 3.x series of
- compilers. The gcc 4.x series have a rewritten inlining algorithm and
- enabling this option will generate a smaller kernel there. Hopefully
- this algorithm is so good that allowing gcc 4.x and above to make the
- decision will become the default in the future. Until then this option
- is there to test gcc for this.
-
- If unsure, say N.
-
 config DEBUG_ENTRY
bool "Debug low-level entry code"
depends on DEBUG_KERNEL
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index ba814f18cb4c..19e58b9138a0 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -140,8 +140,7 @@ struct ftrace_likely_data {
  * Do not use __always_inline here, since currently it expands to inline again
  * (which would break users of __always_inline).
  */
-#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
-   !defined(CONFIG_OPTIMIZE_INLINING)
+#if !defined(CONFIG_OPTIMIZE_INLINING)
 #define inline inline __attribute__((__always_inline__)) __gnu_inline \
__maybe_unused notrace
 #else
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 0d9e81779e37..f8f284f46c85 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -310,6 +310,20 @@ config HEADERS_CHECK
  exported to $(INSTALL_HDR_PATH) (usually 'usr/include' in
  your build tree), to make sure they're suitable.
 
+config OPTIMIZE_INLINING
+   bool "Allow compiler to uninline functions marked 'inline'"
+   help
+ This option determines if the kernel forces gcc to inline the functions
+ developers have marked 'inline'. Doing so takes away freedom from gcc to
+ do what it thinks is best, which is desirable for the gcc 3.x series of
+ compilers. The gcc 4.x series have a rewritten inlining algorithm and
+ enabling this option will generate a smaller kernel there. Hopefully
+ this algorithm is so good that allowing gcc 4.x and above to make the
+ decision will become the default in the future. Until then this option
+ is there to test gcc for this.
+
+ If unsure, say N.
+
 config DEBUG_SECTION_MISMATCH
bool "Enable full Section mismatch analysis"
help
-- 
2.17.1



[PATCH v2 06/11] MIPS: mark __fls() as __always_inline

2019-04-19 Thread Masahiro Yamada
This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

If it is enabled for mips, the following errors are reported:

arch/mips/mm/sc-mips.o: In function `mips_sc_prefetch_enable.part.2':
sc-mips.c:(.text+0x98): undefined reference to `mips_gcr_base'
sc-mips.c:(.text+0x9c): undefined reference to `mips_gcr_base'
sc-mips.c:(.text+0xbc): undefined reference to `mips_gcr_base'
sc-mips.c:(.text+0xc8): undefined reference to `mips_gcr_base'
sc-mips.c:(.text+0xdc): undefined reference to `mips_gcr_base'
arch/mips/mm/sc-mips.o:sc-mips.c:(.text.unlikely+0x44): more undefined references to `mips_gcr_base'
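
The failure mode, reduced to a sketch (an assumption about the
mechanism, with hypothetical stand-in names): the symbol is only
defined in some configurations, and the branch referencing it is only
eliminated when the guard helper is inlined and constant-folded:

extern unsigned long gcr_base;     /* stand-in for mips_gcr_base */

static inline int cm_present(void) /* folds to 0 only when inlined */
{
    return 0;
}

int prefetch_enable_demo(void)
{
    if (cm_present())             /* dead branch, but only if folded */
        return gcr_base != 0;     /* else: undefined reference at link */
    return -1;
}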

Signed-off-by: Masahiro Yamada 
---

Changes in v2:
  - new patch

 arch/mips/include/asm/bitops.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index 830c93a010c3..6a26ead1c2b6 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -482,7 +482,7 @@ static inline void __clear_bit_unlock(unsigned long nr, volatile unsigned long *
  * Return the bit position (0..63) of the most significant 1 bit in a word
  * Returns -1 if no 1 bit exists
  */
-static inline unsigned long __fls(unsigned long word)
+static __always_inline unsigned long __fls(unsigned long word)
 {
int num;
 
-- 
2.17.1



[PATCH v2 10/11] powerpc/mm/radix: mark __tlbie_pid() and friends as __always_inline

2019-04-19 Thread Masahiro Yamada
This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

If it is enabled for powerpc, the following errors are reported:

arch/powerpc/mm/tlb-radix.c: In function '__tlbie_lpid':
arch/powerpc/mm/tlb-radix.c:148:2: warning: asm operand 3 probably doesn't match constraints
  asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
  ^~~
arch/powerpc/mm/tlb-radix.c:148:2: error: impossible constraint in 'asm'
arch/powerpc/mm/tlb-radix.c: In function '__tlbie_pid':
arch/powerpc/mm/tlb-radix.c:118:2: warning: asm operand 3 probably doesn't match constraints
  asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
  ^~~
arch/powerpc/mm/tlb-radix.c: In function '__tlbiel_pid':
arch/powerpc/mm/tlb-radix.c:104:2: warning: asm operand 3 probably doesn't match constraints
  asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
  ^~~
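
The root cause here (and in the s390 and MIPS patches elsewhere in
this series) reduces to a sketch: an "i" (immediate) asm constraint is
only satisfiable when the operand is a compile-time constant, which it
only becomes once the function is inlined into a caller that passes a
literal:

static inline void tlbie_demo(unsigned long ric)
{
    /* "i" demands an immediate; 'ric' is one only after inlining */
    asm volatile("# tlbie-like instruction, RIC=%0" : : "i" (ric));
}

void flush_demo(void)
{
    tlbie_demo(2); /* fine if inlined; "impossible constraint" if not */
}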

Signed-off-by: Masahiro Yamada 
---

Changes in v2:
  - new patch

 arch/powerpc/mm/tlb-radix.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index a2b2848f0ae3..14ff414d1545 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -90,8 +90,8 @@ void radix__tlbiel_all(unsigned int action)
asm volatile(PPC_INVALIDATE_ERAT "; isync" : : :"memory");
 }
 
-static inline void __tlbiel_pid(unsigned long pid, int set,
-   unsigned long ric)
+static __always_inline void __tlbiel_pid(unsigned long pid, int set,
+unsigned long ric)
 {
unsigned long rb,rs,prs,r;
 
@@ -106,7 +106,7 @@ static inline void __tlbiel_pid(unsigned long pid, int set,
trace_tlbie(0, 1, rb, rs, ric, prs, r);
 }
 
-static inline void __tlbie_pid(unsigned long pid, unsigned long ric)
+static __always_inline void __tlbie_pid(unsigned long pid, unsigned long ric)
 {
unsigned long rb,rs,prs,r;
 
@@ -136,7 +136,7 @@ static inline void __tlbiel_lpid(unsigned long lpid, int set,
trace_tlbie(lpid, 1, rb, rs, ric, prs, r);
 }
 
-static inline void __tlbie_lpid(unsigned long lpid, unsigned long ric)
+static __always_inline void __tlbie_lpid(unsigned long lpid, unsigned long ric)
 {
unsigned long rb,rs,prs,r;
 
@@ -239,7 +239,7 @@ static inline void fixup_tlbie_lpid(unsigned long lpid)
 /*
  * We use 128 set in radix mode and 256 set in hpt mode.
  */
-static inline void _tlbiel_pid(unsigned long pid, unsigned long ric)
+static __always_inline void _tlbiel_pid(unsigned long pid, unsigned long ric)
 {
int set;
 
-- 
2.17.1



[PATCH v2 05/11] mtd: rawnand: vf610_nfc: add initializer to avoid -Wmaybe-uninitialized

2019-04-19 Thread Masahiro Yamada
This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

Kbuild test robot has never reported a -Wmaybe-uninitialized warning
for this, probably because vf610_nfc_run() is inlined by the x86
compiler's inlining heuristic.

If CONFIG_OPTIMIZE_INLINING is enabled for a different architecture
and vf610_nfc_run() is not inlined, the following warning is reported:

drivers/mtd/nand/raw/vf610_nfc.c: In function ‘vf610_nfc_cmd’:
drivers/mtd/nand/raw/vf610_nfc.c:455:3: warning: ‘offset’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   vf610_nfc_rd_from_sram(instr->ctx.data.buf.in + offset,
   ^~~
      nfc->regs + NFC_MAIN_AREA(0) + offset,
      ~~
      trfr_sz, !nfc->data_access);
      ~~~
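
The shape of the fix, as a sketch with hypothetical names: initialize
the variable so that instruction types which never assign it still
leave a defined value for the later read.

static int nfc_cmd_demo(int has_data_instr)
{
    int trfr_sz = 0, offset = 0;  /* was: int offset; (uninitialized) */

    if (has_data_instr) {
        offset = 4;               /* only some paths assign it... */
        trfr_sz = 64;
    }
    return trfr_sz + offset;      /* ...but every path reads it */
}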

Signed-off-by: Masahiro Yamada 
---

Changes in v2:
  - split into a separate patch

 drivers/mtd/nand/raw/vf610_nfc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mtd/nand/raw/vf610_nfc.c b/drivers/mtd/nand/raw/vf610_nfc.c
index a662ca1970e5..19792d725ec2 100644
--- a/drivers/mtd/nand/raw/vf610_nfc.c
+++ b/drivers/mtd/nand/raw/vf610_nfc.c
@@ -364,7 +364,7 @@ static int vf610_nfc_cmd(struct nand_chip *chip,
 {
const struct nand_op_instr *instr;
struct vf610_nfc *nfc = chip_to_nfc(chip);
-   int op_id = -1, trfr_sz = 0, offset;
+   int op_id = -1, trfr_sz = 0, offset = 0;
u32 col = 0, row = 0, cmd1 = 0, cmd2 = 0, code = 0;
bool force8bit = false;
 
-- 
2.17.1



[PATCH v2 09/11] powerpc/mm/radix: mark __radix__flush_tlb_range_psize() as __always_inline

2019-04-19 Thread Masahiro Yamada
This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

If it is enabled for powerpc, the following error is reported:

arch/powerpc/mm/tlb-radix.c: In function '__radix__flush_tlb_range_psize':
arch/powerpc/mm/tlb-radix.c:104:2: error: asm operand 3 probably doesn't match constraints [-Werror]
  asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
  ^~~
arch/powerpc/mm/tlb-radix.c:104:2: error: impossible constraint in 'asm'

Signed-off-by: Masahiro Yamada 
---

Changes in v2:
  - split into a separate patch

 arch/powerpc/mm/tlb-radix.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 6a23b9ebd2a1..a2b2848f0ae3 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -928,7 +928,7 @@ void radix__tlb_flush(struct mmu_gather *tlb)
tlb->need_flush_all = 0;
 }
 
-static inline void __radix__flush_tlb_range_psize(struct mm_struct *mm,
+static __always_inline void __radix__flush_tlb_range_psize(struct mm_struct *mm,
unsigned long start, unsigned long end,
int psize, bool also_pwc)
 {
-- 
2.17.1



[PATCH v2 07/11] ARM: mark setup_machine_tags() stub as __init __noreturn

2019-04-19 Thread Masahiro Yamada
This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

If it is enabled for arm, Clang build results in the following modpost
warning:

WARNING: vmlinux.o(.text+0x1124): Section mismatch in reference from the function setup_machine_tags() to the function .init.text:early_print()
The function setup_machine_tags() references
the function __init early_print().
This is often because setup_machine_tags lacks a __init
annotation or the annotation of early_print is wrong.

Signed-off-by: Masahiro Yamada 
---

Changes in v2:
  - new patch

 arch/arm/kernel/atags.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/kernel/atags.h b/arch/arm/kernel/atags.h
index 201100226301..067e12edc341 100644
--- a/arch/arm/kernel/atags.h
+++ b/arch/arm/kernel/atags.h
@@ -5,7 +5,7 @@ void convert_to_tag_list(struct tag *tags);
 const struct machine_desc *setup_machine_tags(phys_addr_t __atags_pointer,
unsigned int machine_nr);
 #else
-static inline const struct machine_desc *
+static inline const struct machine_desc * __init __noreturn
 setup_machine_tags(phys_addr_t __atags_pointer, unsigned int machine_nr)
 {
early_print("no ATAGS support: can't continue\n");
-- 
2.17.1



[PATCH v2 08/11] powerpc/prom_init: mark prom_getprop() and prom_getproplen() as __init

2019-04-19 Thread Masahiro Yamada
This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

If it is enabled for powerpc, the following modpost warnings are
reported:

WARNING: vmlinux.o(.text.unlikely+0x20): Section mismatch in reference from the function .prom_getprop() to the function .init.text:.call_prom()
The function .prom_getprop() references
the function __init .call_prom().
This is often because .prom_getprop lacks a __init
annotation or the annotation of .call_prom is wrong.

WARNING: vmlinux.o(.text.unlikely+0x3c): Section mismatch in reference from the function .prom_getproplen() to the function .init.text:.call_prom()
The function .prom_getproplen() references
the function __init .call_prom().
This is often because .prom_getproplen lacks a __init
annotation or the annotation of .call_prom is wrong.
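
A minimal reproduction of this class of warning (a sketch, not the
prom_init code): modpost flags a non-__init function that references
an __init one, because .init.text is discarded after boot while the
caller may still be invoked:

#include <linux/init.h>

static int __init boot_only(void)  /* placed in .init.text */
{
    return 42;
}

static int not_annotated(void)     /* placed in .text */
{
    return boot_only();            /* modpost: section mismatch */
}

int probe_demo(void)
{
    return not_annotated();
}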

Signed-off-by: Masahiro Yamada 
---

Changes in v2:
  - split into a separate patch

 arch/powerpc/kernel/prom_init.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index f33ff4163a51..241fe6b7a8cc 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -501,14 +501,14 @@ static int __init prom_next_node(phandle *nodep)
}
 }
 
-static inline int prom_getprop(phandle node, const char *pname,
-  void *value, size_t valuelen)
+static inline int __init prom_getprop(phandle node, const char *pname,
+ void *value, size_t valuelen)
 {
return call_prom("getprop", 4, 1, node, ADDR(pname),
 (u32)(unsigned long) value, (u32) valuelen);
 }
 
-static inline int prom_getproplen(phandle node, const char *pname)
+static inline int __init prom_getproplen(phandle node, const char *pname)
 {
return call_prom("getproplen", 2, 1, node, ADDR(pname));
 }
-- 
2.17.1



[PATCH v2 00/11] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

2019-04-19 Thread Masahiro Yamada


Major changes in v2:
 - Eliminate more errors and warnings
 - Delete 'depends on !MIPS'
 - Split into separate patches

Arnd Bergmann (1):
  ARM: prevent tracing IPI_CPU_BACKTRACE

Masahiro Yamada (10):
  arm64: mark (__)cpus_have_const_cap as __always_inline
  MIPS: mark mult_sh_align_mod() as __always_inline
  s390/cpacf: mark scpacf_query() as __always_inline
  mtd: rawnand: vf610_nfc: add initializer to avoid
-Wmaybe-uninitialized
  MIPS: mark __fls() as __always_inline
  ARM: mark setup_machine_tags() stub as __init __noreturn
  powerpc/prom_init: mark prom_getprop() and prom_getproplen() as __init
  powerpc/mm/radix: mark __radix__flush_tlb_range_psize() as
__always_inline
  powerpc/mm/radix: mark __tlbie_pid() and friends as __always_inline
  compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

 arch/arm/include/asm/hardirq.h  |  1 +
 arch/arm/kernel/atags.h |  2 +-
 arch/arm/kernel/smp.c   |  6 +-
 arch/arm64/include/asm/cpufeature.h |  4 ++--
 arch/mips/include/asm/bitops.h  |  2 +-
 arch/mips/kernel/cpu-bugs64.c   |  4 ++--
 arch/powerpc/kernel/prom_init.c |  6 +++---
 arch/powerpc/mm/tlb-radix.c | 12 ++--
 arch/s390/include/asm/cpacf.h   |  2 +-
 arch/x86/Kconfig|  3 ---
 arch/x86/Kconfig.debug  | 14 --
 drivers/mtd/nand/raw/vf610_nfc.c|  2 +-
 include/linux/compiler_types.h  |  3 +--
 lib/Kconfig.debug   | 14 ++
 14 files changed, 38 insertions(+), 37 deletions(-)

-- 
2.17.1



[PATCH v2 01/11] ARM: prevent tracing IPI_CPU_BACKTRACE

2019-04-19 Thread Masahiro Yamada
From: Arnd Bergmann 

When function tracing for IPIs is enabled, we get a warning for an
overflow of the ipi_types array with the IPI_CPU_BACKTRACE type
as triggered by raise_nmi():

arch/arm/kernel/smp.c: In function 'raise_nmi':
arch/arm/kernel/smp.c:489:2: error: array subscript is above array bounds [-Werror=array-bounds]
  trace_ipi_raise(target, ipi_types[ipinr]);

This is a correct warning as we actually overflow the array here.
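
The overflow, reduced to a sketch (enum abridged from
arch/arm/kernel/smp.c):

#define NR_IPI 7

enum ipi_msg_type {
    IPI_WAKEUP, IPI_TIMER, IPI_RESCHEDULE, IPI_CALL_FUNC,
    IPI_CPU_STOP, IPI_IRQ_WORK, IPI_COMPLETION,
    IPI_CPU_BACKTRACE,  /* == 7 == NR_IPI */
};

static const char *ipi_types[NR_IPI];

/*
 * Tracing an IPI reads ipi_types[ipinr]; for IPI_CPU_BACKTRACE that is
 * ipi_types[7], one element past the end of the array.
 */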

This patch changes raise_nmi() to call __smp_cross_call() instead of
smp_cross_call(), to avoid calling into ftrace. For clarification,
I'm also adding two new code comments describing how this one
is special.

The warning appears to have shown up after patch e7273ff49acf
("ARM: 8488/1: Make IPI_CPU_BACKTRACE a "non-secure" SGI"), which
changed the number assignment from '15' to '8', but as far as I can
tell has existed since the IPI tracepoints were first introduced.
If we decide to backport this patch to stable kernels, we probably
need to backport e7273ff49acf as well.

Fixes: e7273ff49acf ("ARM: 8488/1: Make IPI_CPU_BACKTRACE a "non-secure" SGI")
Fixes: 365ec7b17327 ("ARM: add IPI tracepoints") # v3.17
Signed-off-by: Arnd Bergmann 
[yamada.masah...@socionext.com: rebase on v5.0-rc1]
Signed-off-by: Masahiro Yamada 
---

This is a long-standing issue, and
Arnd posted this patch two years ago:
http://lists.infradead.org/pipermail/linux-arm-kernel/2016-February/409393.html

It no longer applies cleanly, so I rebased it on top of the latest kernel.


Changes in v2: None

 arch/arm/include/asm/hardirq.h | 1 +
 arch/arm/kernel/smp.c  | 6 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/hardirq.h b/arch/arm/include/asm/hardirq.h
index cba23eaa6072..7a88f160b1fb 100644
--- a/arch/arm/include/asm/hardirq.h
+++ b/arch/arm/include/asm/hardirq.h
@@ -6,6 +6,7 @@
 #include 
 #include 
 
+/* number of IPIs _not_ including IPI_CPU_BACKTRACE */
 #define NR_IPI 7
 
 typedef struct {
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index facd4240ca02..c93fe0f256de 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -70,6 +70,10 @@ enum ipi_msg_type {
IPI_CPU_STOP,
IPI_IRQ_WORK,
IPI_COMPLETION,
+   /*
+* CPU_BACKTRACE is special and not included in NR_IPI
+* or traceable with trace_ipi_*
+*/
IPI_CPU_BACKTRACE,
/*
 * SGI8-15 can be reserved by secure firmware, and thus may
@@ -797,7 +801,7 @@ core_initcall(register_cpufreq_notifier);
 
 static void raise_nmi(cpumask_t *mask)
 {
-   smp_cross_call(mask, IPI_CPU_BACKTRACE);
+   __smp_cross_call(mask, IPI_CPU_BACKTRACE);
 }
 
 void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
-- 
2.17.1



[PATCH v2 04/11] s390/cpacf: mark scpacf_query() as __always_inline

2019-04-19 Thread Masahiro Yamada
This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

If it is enabled for s390, the following error is reported:

In file included from arch/s390/crypto/des_s390.c:19:
./arch/s390/include/asm/cpacf.h: In function 'cpacf_query':
./arch/s390/include/asm/cpacf.h:170:2: warning: asm operand 3 probably doesn't match constraints
  asm volatile(
  ^~~
./arch/s390/include/asm/cpacf.h:170:2: error: impossible constraint in 'asm'

Signed-off-by: Masahiro Yamada 
---

Changes in v2:
  - split into a separate patch

 arch/s390/include/asm/cpacf.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/cpacf.h b/arch/s390/include/asm/cpacf.h
index 3cc52e37b4b2..f316de40e51b 100644
--- a/arch/s390/include/asm/cpacf.h
+++ b/arch/s390/include/asm/cpacf.h
@@ -202,7 +202,7 @@ static inline int __cpacf_check_opcode(unsigned int opcode)
}
 }
 
-static inline int cpacf_query(unsigned int opcode, cpacf_mask_t *mask)
+static __always_inline int cpacf_query(unsigned int opcode, cpacf_mask_t *mask)
 {
if (__cpacf_check_opcode(opcode)) {
__cpacf_query(opcode, mask);
-- 
2.17.1



[PATCH v2 03/11] MIPS: mark mult_sh_align_mod() as __always_inline

2019-04-19 Thread Masahiro Yamada
This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

If it is enabled for mips, the following error is reported:

arch/mips/kernel/cpu-bugs64.c: In function 'mult_sh_align_mod.constprop':
arch/mips/kernel/cpu-bugs64.c:33:2: error: asm operand 1 probably doesn't match constraints [-Werror]
  asm volatile(
  ^~~
arch/mips/kernel/cpu-bugs64.c:33:2: error: asm operand 1 probably doesn't match constraints [-Werror]
  asm volatile(
  ^~~
arch/mips/kernel/cpu-bugs64.c:33:2: error: impossible constraint in 'asm'
  asm volatile(
  ^~~
arch/mips/kernel/cpu-bugs64.c:33:2: error: impossible constraint in 'asm'
  asm volatile(
  ^~~

Signed-off-by: Masahiro Yamada 
---

Changes in v2:
  - split into a separate patch

 arch/mips/kernel/cpu-bugs64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/kernel/cpu-bugs64.c b/arch/mips/kernel/cpu-bugs64.c
index bada74af7641..c04b97aace4a 100644
--- a/arch/mips/kernel/cpu-bugs64.c
+++ b/arch/mips/kernel/cpu-bugs64.c
@@ -42,8 +42,8 @@ static inline void align_mod(const int align, const int mod)
: "n"(align), "n"(mod));
 }
 
-static inline void mult_sh_align_mod(long *v1, long *v2, long *w,
-const int align, const int mod)
+static __always_inline void mult_sh_align_mod(long *v1, long *v2, long *w,
+ const int align, const int mod)
 {
unsigned long flags;
int m1, m2;
-- 
2.17.1



[PATCH v2 02/11] arm64: mark (__)cpus_have_const_cap as __always_inline

2019-04-19 Thread Masahiro Yamada
This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place. We need to eliminate potential issues beforehand.

If it is enabled for arm64, the following errors are reported:

In file included from ././include/linux/compiler_types.h:68,
 from :
./arch/arm64/include/asm/jump_label.h: In function 'cpus_have_const_cap':
./include/linux/compiler-gcc.h:120:38: warning: asm operand 0 probably doesn't match constraints
 #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
  ^~~
./arch/arm64/include/asm/jump_label.h:32:2: note: in expansion of macro 'asm_volatile_goto'
  asm_volatile_goto(
  ^
./include/linux/compiler-gcc.h:120:38: error: impossible constraint in 'asm'
 #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
  ^~~
./arch/arm64/include/asm/jump_label.h:32:2: note: in expansion of macro 'asm_volatile_goto'
  asm_volatile_goto(
  ^
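
This is the same class of problem as the asm-constraint patches in
this series, in asm-goto form (a sketch with stand-in names): the key
operand of the jump-label asm must be a link-time constant address,
which the compiler can only prove after inlining:

struct static_key { int enabled; };
static struct static_key caps_ready; /* stand-in for arm64_const_caps_ready */

static inline int branch_demo(struct static_key *key)
{
    /* "i" needs a constant address: satisfied only after inlining */
    asm goto("# runtime-patched nop; key at %c0" : : "i" (key) : : l_yes);
    return 0;
l_yes:
    return 1;
}

int have_cap_demo(void)
{
    return branch_demo(&caps_ready);
}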

Signed-off-by: Masahiro Yamada 
---

Changes in v2:
  - split into a separate patch

 arch/arm64/include/asm/cpufeature.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index e505e1fbd2b9..77d1aa57323e 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -406,7 +406,7 @@ static inline bool cpu_have_feature(unsigned int num)
 }
 
 /* System capability check for constant caps */
-static inline bool __cpus_have_const_cap(int num)
+static __always_inline bool __cpus_have_const_cap(int num)
 {
if (num >= ARM64_NCAPS)
return false;
@@ -420,7 +420,7 @@ static inline bool cpus_have_cap(unsigned int num)
return test_bit(num, cpu_hwcaps);
 }
 
-static inline bool cpus_have_const_cap(int num)
+static __always_inline bool cpus_have_const_cap(int num)
 {
if (static_branch_likely(&arm64_const_caps_ready))
return __cpus_have_const_cap(num);
-- 
2.17.1



Re: [PATCH] kernel/crash: make parse_crashkernel()'s return value more indicant

2019-04-19 Thread Thomas Gleixner
On Fri, 19 Apr 2019, Pingfan Liu wrote:

> At present, both the return value and crash_size must be checked to
> determine whether parse_crashkernel() succeeded.
> Simplify this by returning a negative value on failure and a positive
> value on success. On failure, return -EINVAL for bad syntax and -1 when
> the parsing yields crash_size=0.

I'm not entirely sure what you are trying to say here, but '-1' is not an
improvement at all. We surely are not short of proper error codes, right?

Also I don't see any positive return value > 0. So what is this about:

> --- a/arch/ia64/kernel/setup.c
> +++ b/arch/ia64/kernel/setup.c
> @@ -277,7 +277,7 @@ static void __init setup_crashkernel(unsigned long total, int *n)
>  
>   ret = parse_crashkernel(boot_command_line, total,
>   &size, &base);
> - if (ret == 0 && size > 0) {
> + if (ret >= 0) {

  ^^^  

>   if (!memory_region_available(crash_base, crash_size)) {
> diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
> index 45a8d0b..0b626e2 100644
> --- a/arch/powerpc/kernel/fadump.c
> +++ b/arch/powerpc/kernel/fadump.c
> @@ -376,7 +376,7 @@ static inline unsigned long fadump_calculate_reserve_size(void)
>*/
>   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>   &size, &base);
> - if (ret == 0 && size > 0) {
> + if (ret >= 0) {

and this ?

>   unsigned long max_size;
>  
>   if (fw_dump.reserve_bootvar)
> diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
> index 63f5a93..9f3e61a 100644
> --- a/arch/powerpc/kernel/machine_kexec.c
> +++ b/arch/powerpc/kernel/machine_kexec.c
> @@ -122,7 +122,7 @@ void __init reserve_crashkernel(void)
>   /* use common parsing */
>   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>   &crash_size, &crash_base);
> - if (ret == 0 && crash_size > 0) {
> + if (ret >= 0) {

Again.

>   crashk_res.start = crash_base;
>   crashk_res.end = crash_base + crash_size - 1;
>   }
> --- a/arch/sh/kernel/machine_kexec.c
> +++ b/arch/sh/kernel/machine_kexec.c
> @@ -157,7 +157,7 @@ void __init reserve_crashkernel(void)
>  
>   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>   &crash_size, &crash_base);
> - if (ret == 0 && crash_size > 0) {
> + if (ret >= 0) {

And some more.

>   crashk_res.start = crash_base;
>   crashk_res.end = crash_base + crash_size - 1;
>   }
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 3d872a5..62d07d4 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -526,11 +526,11 @@ static void __init reserve_crashkernel(void)
>  
>   /* crashkernel=XM */
> ret = parse_crashkernel(boot_command_line, total_mem, &crash_size, &crash_base);
> - if (ret != 0 || crash_size <= 0) {
> + if (ret == -EINVAL) {

Without an explanation why this proceeds on error codes other than EINVAL
this is incomprehensible. Comments exist for a reason.

>   /* crashkernel=X,high */
>   ret = parse_crashkernel_high(boot_command_line, total_mem,
>_size, _base);
> - if (ret != 0 || crash_size <= 0)
> + if (ret < 0)
>   return;
>   high = true;

> @@ -87,7 +87,7 @@ static int __init parse_crashkernel_mem(char *cmdline,
>   cur = tmp;
>   if (size >= system_ram) {
>   pr_warn("crashkernel: invalid size\n");
> - return -EINVAL;
> + return -1;

Well, this is incomprehensible as well. The pr_warn() says invalid and then
you change the error code to something magic.

Thanks,

tglx




[PATCH] kernel/crash: make parse_crashkernel()'s return value more indicant

2019-04-19 Thread Pingfan Liu
At present, both the return value and crash_size must be checked to
determine whether parse_crashkernel() succeeded.
Simplify this by returning a negative value on failure and a positive
value on success. On failure, return -EINVAL for bad syntax and -1 when
the parsing yields crash_size=0.
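
As an illustration of the proposed convention, a user-space stand-in
(hypothetical, not the kernel implementation): callers only test the
sign of the return value instead of also checking crash_size.

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* negative on any failure; zero or positive on success */
static int parse_demo(const char *cmdline, unsigned long *size)
{
    const char *p = strstr(cmdline, "crashkernel=");

    if (!p)
        return -EINVAL;             /* bad syntax */
    *size = strtoul(p + strlen("crashkernel="), NULL, 10) << 20;
    return *size ? 0 : -EINVAL;     /* a zero size also fails */
}

int main(void)
{
    unsigned long size;

    if (parse_demo("crashkernel=256M", &size) < 0)
        return 1;
    printf("reserving %lu bytes\n", size);
    return 0;
}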

Signed-off-by: Pingfan Liu 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Tony Luck 
Cc: Fenghua Yu 
Cc: Ralf Baechle 
Cc: Paul Burton 
Cc: James Hogan 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Cc: Yoshinori Sato 
Cc: Rich Felker 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Andrew Morton 
Cc: Julien Thierry 
Cc: Palmer Dabbelt 
Cc: Ard Biesheuvel 
Cc: Florian Fainelli 
Cc: Logan Gunthorpe 
Cc: Robin Murphy 
Cc: Greg Hackmann 
Cc: Stefan Agner 
Cc: Johannes Weiner 
Cc: David Hildenbrand 
Cc: Jens Axboe 
Cc: Thomas Bogendoerfer 
Cc: Greg Kroah-Hartman 
Cc: Hari Bathini 
Cc: Ananth N Mavinakayanahalli 
Cc: Yangtao Li 
Cc: Dave Young 
Cc: Baoquan He 
Cc: x...@kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-i...@vger.kernel.org
Cc: linux-m...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
---
 arch/arm/kernel/setup.c |  2 +-
 arch/arm64/mm/init.c|  2 +-
 arch/ia64/kernel/setup.c|  2 +-
 arch/mips/kernel/setup.c|  2 +-
 arch/powerpc/kernel/fadump.c|  2 +-
 arch/powerpc/kernel/machine_kexec.c |  2 +-
 arch/s390/kernel/setup.c|  2 +-
 arch/sh/kernel/machine_kexec.c  |  2 +-
 arch/x86/kernel/setup.c |  4 ++--
 kernel/crash_core.c | 12 ++--
 10 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 5d78b6a..2feab13 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -997,7 +997,7 @@ static void __init reserve_crashkernel(void)
total_mem = get_total_mem();
ret = parse_crashkernel(boot_command_line, total_mem,
&crash_size, &crash_base);
-   if (ret)
+   if (ret < 0)
return;
 
if (crash_base <= 0) {
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 6bc1350..240918c 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -79,7 +79,7 @@ static void __init reserve_crashkernel(void)
ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
&crash_size, &crash_base);
/* no crashkernel= or invalid value specified */
-   if (ret || !crash_size)
+   if (ret < 0)
return;
 
crash_size = PAGE_ALIGN(crash_size);
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 583a374..99cae33 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -277,7 +277,7 @@ static void __init setup_crashkernel(unsigned long total, int *n)
 
ret = parse_crashkernel(boot_command_line, total,
&size, &base);
-   if (ret == 0 && size > 0) {
+   if (ret >= 0) {
if (!base) {
sort_regions(rsvd_region, *n);
*n = merge_regions(rsvd_region, *n);
diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 8d1dc6c..168571b 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -715,7 +715,7 @@ static void __init mips_parse_crashkernel(void)
total_mem = get_total_mem();
ret = parse_crashkernel(boot_command_line, total_mem,
&crash_size, &crash_base);
-   if (ret != 0 || crash_size <= 0)
+   if (ret < 0)
return;
 
if (!memory_region_available(crash_base, crash_size)) {
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 45a8d0b..0b626e2 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -376,7 +376,7 @@ static inline unsigned long fadump_calculate_reserve_size(void)
 */
ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
&size, &base);
-   if (ret == 0 && size > 0) {
+   if (ret >= 0) {
unsigned long max_size;
 
if (fw_dump.reserve_bootvar)
diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
index 63f5a93..9f3e61a 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -122,7 +122,7 @@ void __init reserve_crashkernel(void)
/* use common parsing */
ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
&crash_size, &crash_base);
-   if (ret == 0 && crash_size > 0) {
+   if (ret >= 0) {
crashk_res.start = crash_base;
crashk_res.end = crash_base + crash_size - 1;
}
diff --git a/arch/s390/kernel/setup.c