date:20171018

Re: [PATCH] mac80211: aggregation: Convert timers to use timer_setup()

2017-10-18 Thread Kees Cook

On Wed, Oct 18, 2017 at 3:29 AM, Johannes Berg wrote:
>> This has been the least trivial timer conversion yet. Given the use of
>> RCU and other things I may not even know about, I'd love to get a close
>> look at this. I *think* this is correct, as it will re-lookup the tid
>> entries when firing the timer.
>
> I'm not really sure why you're doing the lookup again? That seems
> pointless, since you already have the right structure, and already rely
> on it being valid. You can't really get a new struct assigned to the
> same TID without the old one being destroyed.

I couldn't tell what the lifetime expectation was, so I left the
re-lookup. I assumed it was possible to have a timer fire after the
structure had been removed from the station structure.

-Kees

-- 
Kees Cook
Pixel Security

Re: [RFC PATCH] can: m_can: Support higher speed CAN-FD bitrates

2017-10-18 Thread Franklin S Cooper Jr



On 10/18/2017 08:24 AM, Sekhar Nori wrote:
> Hi Marc,
> 
> On Wednesday 18 October 2017 06:14 PM, Marc Kleine-Budde wrote:
>> On 09/21/2017 02:48 AM, Franklin S Cooper Jr wrote:
>>>
>>>
>>> On 09/20/2017 04:37 PM, Mario Hüttel wrote:


 On 09/20/2017 10:19 PM, Franklin S Cooper Jr wrote:
> Hi Wenyou,
>
> On 09/17/2017 10:47 PM, Yang, Wenyou wrote:
>>
>> On 2017/9/14 13:06, Sekhar Nori wrote:
>>> On Thursday 14 September 2017 03:28 AM, Franklin S Cooper Jr wrote:
 On 08/18/2017 02:39 PM, Franklin S Cooper Jr wrote:
> During test transmitting using CAN-FD at high bitrates (4 Mbps) only
> resulted in errors. Scoping the signals I noticed that only a single
> bit
> was being transmitted and with a bit more investigation realized the
> actual
> MCAN IP would go back to initialization mode automatically.
>
> It appears this issue is due to the MCAN needing to use the 
> Transmitter
> Delay Compensation Mode as defined in the MCAN User's Guide. When this
> mode is used the User's Guide indicates that the Transmitter Delay
> Compensation Offset register should be set. The document mentions
> that this
> register should be set to (1/dbitrate)/2*(Func Clk Freq).
>
> Additionally, CAN-CIA's "Bit Time Requirements for CAN FD" document
> indicates
> that this TDC mode is only needed for data bit rates above 2.5 Mbps.
> Therefore, only enable this mode and only set TDCO when the data bit
> rate
> is above 2.5 Mbps.
>
> Signed-off-by: Franklin S Cooper Jr 
> ---
> I'm pretty surprised that this hasn't been implemented already since
> the primary purpose of CAN-FD is to go beyond 1 Mbps and the MCAN IP
> supports up to 10 Mbps.
>
> So it will be nice to get comments from users of this driver to
> understand
> if they have been able to use CAN-FD beyond 2.5 Mbps without this
> patch.
> If they haven't what did they do to get around it if they needed 
> higher
> speeds.
>
> Meanwhile I plan on testing this using a more "realistic" CAN bus to
> ensure
> everything still works at 5 Mbps which is the max speed of my CAN
> transceiver.
 ping. Anyone has any thoughts on this?
>>> I added Dong who authored the m_can driver and Wenyou who added the only
>>> in-kernel user of the driver for any help.
>> I tested it on SAMA5D2 Xplained board both with and without this patch, 
>> both work with the 4M bps data bit rate.
> Thank you for testing this out. It's interesting that you have been able
> to use higher speeds without this patch. What is the CAN transceiver
> being used on the SAMA5D2 Xplained board? I tried looking at the
> schematic but it seems the CAN signals are used on an extension board
> which I can't find the schematic for. Also do you mind sharing your test
> setup? Were you doing a short point to point test?
>
> Thank You,
> Franklin
 Hello Franklin,

 your patch definitely makes sense.

 I forgot the TDC in my patches because it was not present in the
 previous driver versions and because I didn't encounter any
 problems when testing it myself.

 The error is highly dependent on the hardware (transceiver) setup.
 So it is definitely possible that some people don't encounter errors
 without your patch.
>>>
>>> So the Transmission Delay Compensation feature Value register is supposed
>>> to take into consideration the transceiver delay automatically and add
>>> the value of TDCO on top of that. So why would TDCO be dependent on the
>>> transceiver? I've heard conflicting things regarding TDC so any
>>> clarification on what actually impacts it would be appreciated.
>>>
>>> Also part of the issue I'm having is how can we properly configure TDCO?
>>> Configuring TDCO is essentially figuring out what Secondary Sample Point
>>> to use. However, it is unclear what value to set SSP to and which use
>>> cases a given SSP will work or doesn't work. I've seen various
>>> recommendations from Bosch on choosing SSP but ultimately it seems they
>>> suggest "real world testing" to come up with a proper value. Not
>>> setting TDCO causes problems for my device and improperly setting TDCO
>>> causes problems for my device. So it's likely any value I use could end
>>> up breaking something for someone else.
>>>
>>> Currently I'm leaning towards a DT property that can be used for setting SSP.
>>> Perhaps use a generic default value and allow individuals to override it
>>> via DT?
>>
>> Sounds reasonable. What's the status of this series?
> 
> I have had some offline discussions with Franklin on this, and I am not
> fully convinced that DT is the way to go here (although I don't have
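
For reference, the TDCO arithmetic from the original patch description can be sketched in plain C. This is a userspace illustration only, not driver code: the function names and the TDC_MIN_DBITRATE constant are hypothetical, and the formula is the one quoted above, (1/dbitrate)/2*(Func Clk Freq), which reduces to clk / (2 * dbitrate). It says nothing about what SSP value is correct for a given transceiver, which is the open question in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold quoted in the patch description: TDC only above 2.5 Mbps.
 * The macro name is made up for this sketch. */
#define TDC_MIN_DBITRATE 2500000u

/* Return true when the data bitrate is above the quoted threshold. */
bool tdc_needed(uint32_t dbitrate)
{
	return dbitrate > TDC_MIN_DBITRATE;
}

/* (1/dbitrate)/2 * clk_hz == clk_hz / (2 * dbitrate), in clock periods.
 * Returns 0 for a zero bitrate instead of dividing by zero. */
uint32_t tdco_from_bitrate(uint32_t clk_hz, uint32_t dbitrate)
{
	if (dbitrate == 0)
		return 0;
	return (uint32_t)((uint64_t)clk_hz / (2u * (uint64_t)dbitrate));
}
```

With an assumed 40 MHz functional clock and a 4 Mbps data bitrate, tdco_from_bitrate() yields 5 clock periods, i.e. half of the 10-clock bit time.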

Re: [PATCH] mac80211: aggregation: Convert timers to use timer_setup()

2017-10-18 Thread Kees Cook

On Wed, Oct 18, 2017 at 4:37 AM, Johannes Berg wrote:
> On Wed, 2017-10-18 at 13:31 +0200, Johannes Berg wrote:
>> On Wed, 2017-10-18 at 12:29 +0200, Johannes Berg wrote:
>>
>> > Anyway, the change here looks correct to me, so I'll apply it and then
>> > perhaps clean up more. I've only changed "u16 tid" to "u8 tid" since
>> > the valid range is 0-15 (in theory, in practice 0-7).

I started with u8 tid, but I saw it cast to u16 and in a few other
places it was u16, so I went with that ultimately.

>> Well, I guess I'm clearly wrong - it's crashing our test suite left and
>> right.
>>
>> I'll dig a little bit, but I don't have much time today, and will be
>> out for a few days starting tomorrow.
>
> Ok, it's pretty obvious - you never initialize the new fields in tid_tx
> (sta and tid), only in tid_rx. Let's see if it passes with that fixed.

Argh, whoops, thanks for working on this.

-Kees

-- 
Kees Cook
Pixel Security

Re: [PATCH 2/4] arm64: prevent instrumentation of LL/SC atomics

2017-10-18 Thread Will Deacon

On Tue, Oct 17, 2017 at 01:55:16PM +0100, Mark Rutland wrote:
> On Tue, Oct 17, 2017 at 12:38:14PM +0100, Will Deacon wrote:
> > On Tue, Oct 17, 2017 at 12:10:33PM +0100, Mark Rutland wrote:
> > > On Tue, Oct 17, 2017 at 11:58:58AM +0100, Will Deacon wrote:
> > > > On Tue, Oct 17, 2017 at 11:54:54AM +0100, Mark Rutland wrote:
> > > > > On Tue, Oct 17, 2017 at 11:03:15AM +0100, Will Deacon wrote:
> > > > > > On Mon, Oct 16, 2017 at 02:24:38PM +0100, Mark Rutland wrote:
> > > > > > > While we build the LL/SC atomics as a C object file, this does not
> > > > > > > follow the AAPCS. This does not interoperate with other C code, 
> > > > > > > and can
> > > > > > > only be called from special wrapper assembly.
> > > > > > > 
> > > > > > > > Building a kernel with CONFIG_KCOV and CONFIG_ARM64_LSE_ATOMICS 
> > > > > > > results
> > > > > > > > in the compiler inserting calls to __sanitizer_cov_trace_pc 
> > > > > > > within the
> > > > > > > LL/SC atomics. As __sanitizer_cov_trace_pc is built per the 
> > > > > > > AAPCS, these
> > > > > > > calls corrupt register values, resulting in failures at boot time.
> > > > > > > 
> > > > > > > Avoid this (and other similar issues) by opting out of all 
> > > > > > > compiler
> > > > > > > instrumentation. We can opt-in to specific instrumentation in 
> > > > > > > future if
> > > > > > > we want to.
> > > 
> > > > > > > diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
> > > > > > > index a0abc142c92b..af77516f71b2 100644
> > > > > > > --- a/arch/arm64/lib/Makefile
> > > > > > > +++ b/arch/arm64/lib/Makefile
> > > > > > > @@ -17,5 +17,6 @@ CFLAGS_atomic_ll_sc.o   := -fcall-used-x0 
> > > > > > > -ffixed-x1 -ffixed-x2 \
> > > > > > >  -fcall-saved-x10 -fcall-saved-x11 -fcall-saved-x12   
> > > > > > > \
> > > > > > >  -fcall-saved-x13 -fcall-saved-x14 -fcall-saved-x15   
> > > > > > > \
> > > > > > >  -fcall-saved-x18
> > > > > > > +CC_INSTRUMENT_atomic_ll_sc.o := n
> > > > > > 
> > > > > > Does this mean we can lose the "notrace" definition of 
> > > > > > __LL_SC_INLINE
> > > > > > when generating the out-of-line atomics?
> > > > > 
> > > > > Unfortunately not.
> > > > > 
> > > > > I'd missed -pg, since that isn't handled in scripts/Makefile.lib, and
> > > > > doesn't seem to have a makefile-level disable.
> > > > > 
> > > > > I'll see if that can be remedied.
> > > > 
> > > > Thanks. It's a real shame to have a "just use this option to avoid
> > > > instrumentation" if it doesn't actually catch everything. 
> > > 
> > > Agreed; it defeats the purpose of the exercise.
> > > 
> > > > We probably need to think about kprobes too, but not really sure what
> > > > you can do there on a per-file basis.
> > > 
> > > Ugh; that's a much more painful one, yes. :(
> > > 
> > > Does that rely on any compiler options at all? I thought it was all a
> > > runtime thing.
> > > 
> > > Arguably it is somewhat separate from compiler instrumentation, and it
> > > might make sense for that to be a separate option.
> > 
> > Yes, I suppose the problem here is that opting out of dynamic tracing
> > requires function attributes such as notrace and __kprobes, rather than a
> > compiler flag.  If there's no way to say to the compiler "act as though
> > every function in this compilation unit is tagged with this attribute" then
> > we probably can't do anything to solve this easily.
> 
> Unfortunately, I'm not aware of any way to do that short of using a
> linker script to rewrite sections.
> 
> > We should probably add __kprobes to __LL_SC_INLINE though.
> 
> Agreed.
> 
> It's a different case, but kprobes can use atomics behind the scenes
> (e.g. via aarch64_insn_patch_text_cb()), and so those need to be
> blacklisted.
> 
> I'll add a patch to this series, unless you plan to put one together.

Don't mind either way. If you post the next version without, I can just
add it on top.

Will

Re: [PATCH 1/2] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default

2017-10-18 Thread Matthew Wilcox

On Wed, Oct 18, 2017 at 03:36:05PM +0200, Thomas Gleixner wrote:
> Which reminds me that I wanted to convert them to static_key so they are
> zero overhead when disabled. Sigh, why are todo lists growth only?

This is why you need an Outreachy intern -- it gets at least one task
off your todo list, and in the best possible case, it gets a second
person working on your todo list for a long time.

... eventually they start their own todo lists ...

Re: [PATCH] HID: rmi: Check that a device is a RMI device before calling RMI functions

2017-10-18 Thread Jiri Kosina

On Wed, 18 Oct 2017, Benjamin Tissoires wrote:

> > The hid-rmi driver may handle non rmi devices on composite USB devices.
> > Callbacks need to make sure that the current device is a RMI device before
> > calling RMI specific functions. Most callbacks already have this check, but
> > this patch adds checks to the remaining callbacks.
> > 
> > Signed-off-by: Andrew Duggan 
> > ---
> > This is the patch which hopefully will address the X1 tablet dock freeze:
> > http://www.spinics.net/lists/linux-input/msg53582.html
> > 
> > I was not able to test on a composite USB device so I have not confirmed
> > this will fix the reported issues. But, based on the description I think it 
> > will.
> > I also added a check for rmi_raw_event() since it could be possible that 
> > another
> > hid device using one of the same report IDs as an RMI device could result 
> > in calling
> > into uninitialized RMI functions. It was also the only callback left not 
> > checking the
> > RMI_DEVICE flag. I wonder if this explains the attach crash.
> > 
> > Anyway, I would appreciate it if Hendrik or someone else with the device 
> > could test this
> > patch to confirm it fixes the reported behavior.
> 
> Shouldn't rmi_hid_reset() also get the same check?

I think so as well.

> 
> From what I can see, even if the patch doesn't fix the immediate issue, 
> such a patch should definitively go in as those checks are clearly 
> missing.

Agreed; however I'd like to get Hendrik's Tested-by: if possible in case 
this really fixes the issue, so I am not merging it right away.

Hendrik, are you by any chance able to test this patch in a reasonable 
timeframe please?

Thanks!

-- 
Jiri Kosina
SUSE Labs

Re: [patch] mm, slab: only set __GFP_RECLAIMABLE once

2017-10-18 Thread Mel Gorman

On Tue, Oct 17, 2017 at 03:30:01PM -0700, David Rientjes wrote:
> SLAB_RECLAIM_ACCOUNT is a permanent attribute of a slab cache.  Set 
> __GFP_RECLAIMABLE as part of its ->allocflags rather than check the cachep 
> flag on every page allocation.
> 
> Signed-off-by: David Rientjes 

Acked-by: Mel Gorman 

-- 
Mel Gorman
SUSE Labs

Re: [PATCH RFC 00/10] Intel EPT-Based Sub-page Write Protection Support.

2017-10-18 Thread Mihai Donțu

On Wed, 2017-10-18 at 11:35 +0200, Paolo Bonzini wrote:
> On 16/10/2017 02:08, Yi Zhang wrote:
> > > And the introspection facility by Mihai uses a completely
> > > different API for the introspector, based on sockets rather than ioctls.
> > > So I'm not sure this is the right API at all.
> > 
> > Currently, we only block write access. As an example that I know of,
> > we are now using it in a security daemon:
> 
> Understood.  However, I think QEMU is the wrong place to set this up.
> 
> If the kernel wants to protect _itself_, it should use a hypercall.  If
> an introspector appliance wants to protect the guest kernel, it should
> use the socket that connects it to the hypervisor.

We have been looking at using SPP for VMI for quite some time. If a
guest kernel will be able to control it (can it do so with EPT?) then
a simple switch that disables this ability would be useful, as an
introspector wouldn't want the guest it is trying to protect to
interfere with it.

Also, if Intel doesn't have a specific use case for it that requires
separate access to SPP control, then maybe we can fold it into the VMI 
API we are working on?

Thanks,

> > Consider a server running in host user space and a client running in
> > the guest kernel. Yes, they communicate over sockets. The guest kernel
> > wants to protect a special area so that no process, including the
> > kernel itself, can modify it. The client sends the guest physical
> > address over the security socket to the server, and the server
> > installs the protection in KVM. Thus, all write accesses to that
> > guest-specific area are blocked.
> > 
> > The current implementation covers only the second half (maybe third ^_^)
> > of this example: 'How does KVM set write protection on a specific GFN?'
> > 
> > Maybe a user-space tool that uses an ioctl to let the KVM MMU update the
> > write protection is a better choice.

-- 
Mihai Donțu

Re: [PATCH] x86, syscalls: use SYSCALL_DEFINE() macros for sys_modify_ldt()

2017-10-18 Thread Dave Hansen

On 10/18/2017 06:17 AM, Ingo Molnar wrote:
> I have added your:
> 
>   Signed-off-by: Dave Hansen 
> 
> let me know if that's OK.

Yes, that's OK.

Re: [PATCH v3 25/33] tracing: Allow whitespace to surround hist trigger filter

2017-10-18 Thread Tom Zanussi

On Wed, 2017-10-18 at 12:00 +0900, Namhyung Kim wrote:
> Hi Tom,
> 
> On Wed, Oct 04, 2017 at 02:05:17PM -0500, Tom Zanussi wrote:
> > Hi Steve,
> > 
> > On Wed, 2017-10-04 at 14:11 -0400, Steven Rostedt wrote:
> > > On Fri, 22 Sep 2017 15:00:05 -0500
> > > Tom Zanussi  wrote:
> > > 
> > > > The existing code only allows for one space before and after the 'if'
> > > > specifying the filter for a hist trigger.  Add code to make that more
> > > > permissive as far as whitespace goes.  Specifically, we want to allow
> > > > spaces in the trigger itself now that we have additional syntax
> > > > (onmatch/onmax) where spaces are more natural e.g. spaces after commas
> > > > in param lists.
> > > > 
> > > > Signed-off-by: Tom Zanussi 
> > > > ---
> > > >  kernel/trace/trace_events_hist.c | 24 +++-
> > > >  1 file changed, 19 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/kernel/trace/trace_events_hist.c 
> > > > b/kernel/trace/trace_events_hist.c
> > > > index ba42fe2..431f2b2 100644
> > > > --- a/kernel/trace/trace_events_hist.c
> > > > +++ b/kernel/trace/trace_events_hist.c
> > > > @@ -4857,7 +4857,7 @@ static int event_hist_trigger_func(struct 
> > > > event_command *cmd_ops,
> > > > struct synth_event *se;
> > > > const char *se_name;
> > > > bool remove = false;
> > > > -   char *trigger;
> > > > +   char *trigger, *p;
> > > > int ret = 0;
> > > >  
> > > > if (!param)
> > > > @@ -4866,10 +4866,23 @@ static int event_hist_trigger_func(struct 
> > > > event_command *cmd_ops,
> > > > if (glob[0] == '!')
> > > > remove = true;
> > > >  
> > > > -   /* separate the trigger from the filter (k:v [if filter]) */
> > > > > -   trigger = strsep(&param, " \t");
> > > > -   if (!trigger)
> > > > -   return -EINVAL;
> > > > +   /*
> > > > +* separate the trigger from the filter (k:v [if filter])
> > > > +* allowing for whitespace in the trigger
> > > > +*/
> > > > +   trigger = param;
> > > > +   p = strstr(param, " if");
> > > > +   if (!p)
> > > > +   p = strstr(param, "\tif");
> > > > +   if (p) {
> > > > +   if (p == trigger)
> > > > +   return -EINVAL;
> > > > +   param = p + 1;
> > > > +   param = strstrip(param);
> > > > +   *p = '\0';
> > > > +   trigger = strstrip(trigger);
> > > > +   } else
> > > > +   param = NULL;
> > > 
> > > I think you forgot to update this:
> > > 
> > 
> > I was going to but on closer inspection realized the simpler form
> > wouldn't accomplish the same thing - the problem this is trying to solve
> > is to allow bits of whitespace within the trigger (because we now have
> > function-like syntax, which should allow whitespace after commas for
> > instance) and separating the trigger from the filter ('if').  So we
> > explicitly search for 'if' with preceding whitespace, which strsep won't
> > accomplish.
> 
> What if a field name is started with 'if' (like "ifb")?  I think you
> also need to check whitespace after the 'if'.
> 

Yeah, it will unnecessarily truncate the trigger, will fix.

Thanks,

Tom

> Thanks,
> Namhyung
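
The point raised above (a field such as "ifb" must not be mistaken for the 'if' keyword) can be illustrated with a small userspace sketch. This is not the kernel patch: find_if_keyword(), strip() and split_trigger() are hypothetical helpers, with strip() standing in for the kernel's strstrip(). The sketch assumes the rule is that 'if' must have whitespace on both sides, and that the filter keeps its leading "if" as in the quoted patch (param = p + 1).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Find an "if" that has whitespace on both sides; NULL if there is none. */
char *find_if_keyword(char *s)
{
	char *p;

	for (p = s; (p = strstr(p, "if")) != NULL; p += 2) {
		int ws_before = p > s && isspace((unsigned char)p[-1]);
		int ws_after = isspace((unsigned char)p[2]);

		if (ws_before && ws_after)
			return p;
	}
	return NULL;
}

/* Trim leading and trailing whitespace in place. */
char *strip(char *s)
{
	char *end;

	while (isspace((unsigned char)*s))
		s++;
	end = s + strlen(s);
	while (end > s && isspace((unsigned char)end[-1]))
		end--;
	*end = '\0';
	return s;
}

/* Split "trigger [if filter]"; returns -1 when the trigger part is empty. */
int split_trigger(char *param, char **trigger, char **filter)
{
	char *kw = find_if_keyword(param);

	*filter = NULL;
	if (kw) {
		kw[-1] = '\0';		/* end the trigger before the keyword */
		*filter = strip(kw);	/* filter still starts with "if" */
	}
	*trigger = strip(param);
	return **trigger ? 0 : -1;
}
```

A trigger such as "keys=pid, ifb" is left intact by this check, while "keys=ifb, pid  if pid > 0" is split at the standalone keyword.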

Re: [PATCH] staging: irda: resolve sparse errors due to implicit pci_power_t casts

2017-10-18 Thread Greg KH

On Thu, Oct 05, 2017 at 04:38:23PM -0700, Matthew Giassa wrote:
> Explicitly casting pci_power_t types to resolve sparse warnings (shown
> below).
> 
> Also fixing a related logging bug where pci_power_t is cast to unsigned
> (can be negative, i.e. PCI_POWER_ERROR).
> 
> Original sparse report:
> 
> drivers/staging/irda/drivers//vlsi_ir.c:170:51: warning: cast from
> restricted pci_power_t
> drivers/staging/irda/drivers//vlsi_ir.c:1726:39: warning: restricted
> pci_power_t degrades to integer
> drivers/staging/irda/drivers//vlsi_ir.c:1728:45: warning: incorrect type
> in assignment (different base types)
> drivers/staging/irda/drivers//vlsi_ir.c:1728:45:expected restricted
> pci_power_t [usertype] current_state
> drivers/staging/irda/drivers//vlsi_ir.c:1728:45:got int [signed]
> [usertype] event
> drivers/staging/irda/drivers//vlsi_ir.c:1748:29: warning: incorrect type
> in assignment (different base types)
> drivers/staging/irda/drivers//vlsi_ir.c:1748:29:expected restricted
> pci_power_t [usertype] current_state
> drivers/staging/irda/drivers//vlsi_ir.c:1748:29:got int [signed]
> [usertype] event

Please do not line-wrap lines like this, it makes them harder to
understand.

> 
> Warnings no longer present.
> 
> Signed-off-by: Matthew Giassa 
> ---
>  drivers/staging/irda/drivers/vlsi_ir.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/irda/drivers/vlsi_ir.c 
> b/drivers/staging/irda/drivers/vlsi_ir.c
> index 3dff3c5..20ce4d8 100644
> --- a/drivers/staging/irda/drivers/vlsi_ir.c
> +++ b/drivers/staging/irda/drivers/vlsi_ir.c
> @@ -167,7 +167,8 @@ static void vlsi_proc_pdev(struct seq_file *seq, struct 
> pci_dev *pdev)
>  
>   seq_printf(seq, "\n%s (vid/did: [%04x:%04x])\n",
>  pci_name(pdev), (int)pdev->vendor, (int)pdev->device);
> - seq_printf(seq, "pci-power-state: %u\n", (unsigned) 
> pdev->current_state);
> + seq_printf(seq, "pci-power-state: %d\n",
> +(int __force)pdev->current_state);

Ick, using __force is almost always a huge sign that something is wrong
here.  This patch does not look correct because of this.

You did read drivers/staging/irda/TODO, right?

thanks,

greg k-h

Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume

2017-10-18 Thread Ulf Hansson

[...]

>>
>> The reason why pm_runtime_force_* needs to respect the hierarchy of
>> the RPM callbacks, is because otherwise it can't safely update the
>> runtime PM status of the device.
>
> I'm not sure I follow this requirement.  Why is that so?

If the PM domain controls some resources for the device in its RPM
callbacks and the driver controls some other resources in its RPM
callbacks - then these resources need to be managed together.

This follows the behavior of a regular call to
pm_runtime_get|put(), which triggers the RPM callbacks to be invoked.

>
>> And updating the runtime PM status of
>> the device is required to manage the optimized behavior during system
>> resume (avoiding unnecessarily resuming devices).
>
> Well, OK.  The runtime PM status of the device after system resume should
> better reflect its physical state.
>
> [The physical state of the device may not be under the control of the
> kernel in some cases, like in S3 resume on some systems that reset
> devices in the firmware and so on, but let's set that aside.]
>
> However, the runtime PM status of the device may still reflect its state
> if, say, a ->resume_early of the middle layer is called during resume along
> with a driver's ->runtime_resume.  That still can produce the right state
> of the device and all depends on the middle layer.
>
> On the other hand, as I said before, using a middle-layer ->runtime_suspend
> during a system sleep transition may be outright incorrect, say if device
> wakeup settings need to be adjusted by the middle layer (which is the
> case for some of them).
>
> Of course, if the middle layer expects the driver to point its
> system-wide PM callbacks to pm_runtime_force_*, then that's how it goes,
> but the drivers working with this particular middle layer generally
> won't work with other middle layers and may interact incorrectly
> with parents and/or children using the other middle layers.
>
> I guess the problem boils down to having a common set of expectations
> on the driver side and on the middle layer side allowing different
> combinations of these to work together.

Yes!

>
>> Besides the AMBA case, I also realized that we are dealing with PM
>> clocks in the genpd case. For this, genpd relies on the runtime PM
>> status of the device properly reflecting the state of the HW during
>> system-wide PM.
>>
>> In other words, if the driver would change the runtime PM status of
>> the device, without respecting the hierarchy of the runtime PM
>> callbacks, it would lead to genpd making wrong decisions
>> while managing the PM clocks during system-wide PM. So in case you
>> intend to change pm_runtime_force_* this needs to be addressed too.
>
> I've just looked at the genpd code and quite frankly I'm not sure how this
> works, but I'll figure this out. :-)

You may think of it as genpd's RPM callback controlling some device
clocks, while the driver controls some other device resources (pinctrl
for example) from its RPM callback.

These resources need to be managed together, similar to what I described above.

[...]

>> Absolutely agree about the different wake-up settings. However, these
>> issues can be addressed also when using pm_runtime_force_*, at least
>> in general, but then not for PCI.
>
> Well, not for the ACPI PM domain too.
>
> In general, not if the wakeup settings are adjusted by the middle layer.

Correct!

To use pm_runtime_force* for these cases, one would need some
additional information exchange between the driver and the
middle-layer.

>
>> Regarding hibernation, honestly that's not really my area of
>> expertise. Although, I assume the middle-layer and driver can treat
>> that as a separate case, so if it's not suitable to use
>> pm_runtime_force* for that case, then they shouldn't do it.
>
> Well, agreed.
>
> In some simple cases, though, driver callbacks can be reused for hibernation
> too, so it would be good to have a common way to do that too, IMO.

Okay, that makes sense!

>
>> >
>> > Also, quite so often other middle layers interact with PCI directly or
>> > indirectly (eg. a platform device may be a child or a consumer of a PCI
>> > device) and some optimizations need to take that into account (eg. parents
>> > generally need to be accessible when their children are resumed and so on).
>>
>> A device's parent becomes informed when changing the runtime PM status
>> of the device via pm_runtime_force_suspend|resume(), as those call
>> pm_runtime_set_suspended|active().
>
> This requires the parent driver or middle layer to look at the reference
> counter and understand it the same way as pm_runtime_force_*.
>
>> In case that isn't sufficient, what else is needed? Perhaps you can
>> point me to an example so I can understand better?
>
> Say you want to leave the parent suspended after system resume, but the
> child drivers use pm_runtime_force_suspend|resume().  The parent would then
> need to use pm_runtime_force_suspend|resume() too, no?

Actually no.

Re: [PATCH] zswap: Same-filled pages handling

2017-10-18 Thread Matthew Wilcox

On Wed, Oct 18, 2017 at 04:33:43PM +0300, Timofey Titovets wrote:
> 2017-10-18 15:34 GMT+03:00 Matthew Wilcox :
> > On Wed, Oct 18, 2017 at 10:48:32AM +, Srividya Desireddy wrote:
> >> +static void zswap_fill_page(void *ptr, unsigned long value)
> >> +{
> >> + unsigned int pos;
> >> + unsigned long *page;
> >> +
> >> + page = (unsigned long *)ptr;
> >> + if (value == 0)
> >> + memset(page, 0, PAGE_SIZE);
> >> + else {
> >> + for (pos = 0; pos < PAGE_SIZE / sizeof(*page); pos++)
> >> + page[pos] = value;
> >> + }
> >> +}
> >
> > I think you meant:
> >
> > static void zswap_fill_page(void *ptr, unsigned long value)
> > {
> > memset_l(ptr, value, PAGE_SIZE / sizeof(unsigned long));
> > }
> 
> IIRC the kernel has a special zero page, if I understand correctly.
> You could map all zero-filled pages to that zero page and not touch zswap at all.
> (Your situation looks like a KSM case (i.e. KSM can handle pages
> with the same content), but I'm not sure whether that is applicable here.)

You're confused by the word "same".  What Srividya meant was that the
page is filled with a pattern, eg 0xfffefffefffefffe..., not that it is
the same as any other page.
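
To make the "same-filled" distinction concrete, here is a minimal userspace sketch. The names page_same_filled() and fill_page() and the 4096-byte page size are assumptions for illustration; fill_page() is only a stand-in for what memset_l() does in the kernel, not the zswap code itself.

```c
#include <stddef.h>

#define PAGE_BYTES 4096u	/* assumed page size for this sketch */
#define PAGE_WORDS (PAGE_BYTES / sizeof(unsigned long))

/* Return 1 and store the pattern if every word equals the first word. */
int page_same_filled(const unsigned long *page, unsigned long *value)
{
	size_t pos;

	for (pos = 1; pos < PAGE_WORDS; pos++)
		if (page[pos] != page[0])
			return 0;
	*value = page[0];
	return 1;
}

/* Rebuild a same-filled page from its single stored word. */
void fill_page(unsigned long *page, unsigned long value)
{
	size_t pos;

	for (pos = 0; pos < PAGE_WORDS; pos++)
		page[pos] = value;
}
```

Only the one-word pattern needs to be remembered for such a page; whether the page matches any other page in the system is irrelevant, which is what separates this from the KSM case.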

Re: [PATCH v5 3/6] perf: hisi: Add support for HiSilicon SoC L3C PMU driver

2017-10-18 Thread Zhangshaokun

Hi Mark,

Thanks for your further explanation.

On 2017/10/18 21:55, Mark Rutland wrote:
> On Wed, Oct 18, 2017 at 09:33:30PM +0800, Zhangshaokun wrote:
>> On 2017/10/17 23:16, Mark Rutland wrote:
>>> On Tue, Aug 22, 2017 at 04:07:54PM +0800, Shaokun Zhang wrote:
>>>> +static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
>>>> +struct hisi_pmu *l3c_pmu)
>>>> +{
>>>> +  unsigned long long id;
>>>> +  struct resource *res;
>>>> +  acpi_status status;
>>>> +  int cpu;
>>>> +
>>>> +  status = acpi_evaluate_integer(ACPI_HANDLE(&pdev->dev),
>>>> + "_UID", NULL, &id);
>>>> +  if (ACPI_FAILURE(status))
>>>> +  return -EINVAL;
>>>> +
>>>> +  l3c_pmu->id = id;
>>>> +
>>>> +  /*
>>>> +   * Use the SCCL_ID and CCL_ID to identify the L3C PMU, while
>>>> +   * SCCL_ID is in MPIDR[aff2] and CCL_ID is in MPIDR[aff1].
>>>> +   */
>>>> +  if (device_property_read_u32(&pdev->dev, "hisilicon,scl-id",
>>>> +   &l3c_pmu->sccl_id)) {
>>>> +  dev_err(&pdev->dev, "Can not read l3c sccl-id!\n");
>>>> +  return -EINVAL;
>>>> +  }
>>>> +
>>>> +  if (device_property_read_u32(&pdev->dev, "hisilicon,ccl-id",
>>>> +   &l3c_pmu->ccl_id)) {
>>>> +  dev_err(&pdev->dev, "Can not read l3c ccl-id!\n");
>>>> +  return -EINVAL;
>>>> +  }
>>>> +
>>>> +  /* Initialise the associated cpumask of the PMU */
>>>> +  for_each_present_cpu(cpu)
>>>> +  smp_call_function_single(cpu, hisi_l3c_pmu_set_cpumask_by_ccl,
>>>> +   (void *)l3c_pmu, 1);
> 
>>> Rather than a probe-time smp_call_function_single(), can you follow the
>>> qcom l2's approach of associating CPUs with a PMU instance in the
>>> notifier? That will work even if CPUs are brought online very late.
>>
>> A good guidance, but HHA and DDRC PMUs are different from L3C PMU, the former
>> share the same SCCL and the latter share the same SCCL and CCL. I will
>> try to deal with this difference in online notifier.
> 
> FWIW, I think it makes sense for each PMU to have its own notifier
> (perhaps with some shared code that each calls to do the migration).
> 
> I just want to avoid the smp_call_function_single() at probe time, as
> that doesn't work in some cases.
> 

Got it, I shall update the hisi_pmu::associated_cpus only in online
and offline notifiers.

Thanks,
Shaokun

> Thanks,
> Mark.
> 
> .
>

Re: [PATCH v4 3/5] reset: socfpga: use the reset-simple driver

2017-10-18 Thread Andre Przywara

Hi,

On 18/10/17 14:50, Philipp Zabel wrote:
> On Wed, 2017-10-18 at 14:00 +0100, Andre Przywara wrote:
>> Hi,
> 
> Thank you for the review.
> 
>> On 17/10/17 14:03, Philipp Zabel wrote:
>>> Add reset line status readback, inverted status support, and socfpga
>>> device tree quirks to the simple reset driver, and use it to replace
>>> the socfpga driver.
>>>
>>> Signed-off-by: Philipp Zabel 
>>> ---
>>> Changes since v3:
>>>  - Rebased onto reset/next
>>>  - Only warn about missing altr,modrst-offset property on socfpga
>>> ---
>>>  drivers/reset/Kconfig |  10 +--
>>>  drivers/reset/Makefile|   1 -
>>>  drivers/reset/reset-simple.c  |  51 +-
>>>  drivers/reset/reset-simple.h  |   4 ++
>>>  drivers/reset/reset-socfpga.c | 157 
>>> --
>>>  5 files changed, 56 insertions(+), 167 deletions(-)
>>>  delete mode 100644 drivers/reset/reset-socfpga.c
>>>
> [...]
>>> diff --git a/drivers/reset/reset-simple.c b/drivers/reset/reset-simple.c
>>> index a5119457cec61..98ff0f924948e 100644
>>> --- a/drivers/reset/reset-simple.c
>>> +++ b/drivers/reset/reset-simple.c
>>> @@ -68,25 +68,58 @@ static int reset_simple_deassert(struct 
>>> reset_controller_dev *rcdev,
>>> return reset_simple_update(rcdev, id, false);
>>>  }
>>>  
>>> +static int reset_simple_status(struct reset_controller_dev *rcdev,
>>> +  unsigned long id)
>>> +{
>>> +   struct reset_simple_data *data = to_reset_simple_data(rcdev);
>>> +   int reg_width = sizeof(u32);
>>> +   int bank = id / (reg_width * BITS_PER_BYTE);
>>> +   int offset = id % (reg_width * BITS_PER_BYTE);
>>> +   u32 reg;
>>> +
>>> +   reg = readl(data->membase + (bank * reg_width));
>>> +
>>> +   return !(reg & BIT(offset)) ^ !data->status_active_low;
>>> +}
>>> +
>>>  const struct reset_control_ops reset_simple_ops = {
>>> .assert = reset_simple_assert,
>>> .deassert   = reset_simple_deassert,
>>> +   .status = reset_simple_status,
>>>  };
>>>  
>>>  /**
>>>   * struct reset_simple_devdata - simple reset controller properties
>>> + * @reg_offset: offset between base address and first reset register.
>>> + * @nr_resets: number of resets. If not set, default to resource size in bits.
>>>   * @active_low: if true, bits are cleared to assert the reset. Otherwise, bits
>>>   *  are set to assert the reset.
>>> + * @status_active_low: if true, bits read back as cleared while the reset is
>>> + * asserted. Otherwise, bits read back as set while the
>>> + * reset is asserted.
>>>   */
>>>  struct reset_simple_devdata {
>>> +   u32 reg_offset;
>>> +   u32 nr_resets;
>>> bool active_low;
>>> +   bool status_active_low;
>>> +};
>>> +
>>> +#define SOCFPGA_NR_BANKS   8
>>> +
>>> +static const struct reset_simple_devdata reset_simple_socfpga = {
>>> +   .reg_offset = 0x10,
> 
> Here reset_simple_socfpga.reg_offset is set to the default of 0x10.

Ah, right. Sorry, I missed that.

> 
>>> +   .nr_resets = SOCFPGA_NR_BANKS * 32,
>>> +   .status_active_low = true,
>>>  };
>>>  
>>>  static const struct reset_simple_devdata reset_simple_active_low = {
>>> .active_low = true,
>>> +   .status_active_low = true,
>>>  };
>>>  
>>>  static const struct of_device_id reset_simple_dt_ids[] = {
>>> +   { .compatible = "altr,rst-mgr", .data = &reset_simple_socfpga },
>>> { .compatible = "allwinner,sun6i-a31-clock-reset",
>>> .data = &reset_simple_active_low },
>>> { /* sentinel */ },
>>> @@ -99,6 +132,7 @@ static int reset_simple_probe(struct platform_device *pdev)
>>> struct reset_simple_data *data;
>>> void __iomem *membase;
>>> struct resource *res;
>>> +   u32 reg_offset = 0;
>>>  
>>> devdata = of_device_get_match_data(dev);
>>>  
>>> @@ -118,8 +152,23 @@ static int reset_simple_probe(struct platform_device *pdev)
>>> data->rcdev.ops = &reset_simple_ops;
>>> data->rcdev.of_node = dev->of_node;
>>>  
>>> -   if (devdata)
>>> +   if (devdata) {
>>> +   reg_offset = devdata->reg_offset;
> 
> And here reg_offset is set to the default of 0x10 on socfpga.
> 
>>> +   if (devdata->nr_resets)
>>> +   data->rcdev.nr_resets = devdata->nr_resets;
>>> data->active_low = devdata->active_low;
>>> +   data->status_active_low = devdata->status_active_low;
>>> +   }
>>> +
>>> +   if (devdata == &reset_simple_socfpga &&
>>
>> Mmh, this pointer comparison looks a bit dodgy. Isn't
>> of_device_is_compatible() the right solution here?
> 
> My thinking was, the of_device_is_compatible is already called inside
> of_device_get_match_data, so why call it again and not reuse the result?
> 
>> Also semantically, as the property is tied to a certain compatible
>> string (and not to our data structure)?
> 
> I agree with this, though. I'll change this line to say:
> 
> + if (of_device_is_compatible(dev->of_node, "altr,rst-mgr") &&

Yes, t
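
For anyone following the status logic in the quoted reset_simple_status() hunk, here is a minimal user-space sketch of the same bank/offset arithmetic and the XOR against status_active_low. It is only an illustration: fake_reset_data and fake_reset_status are hypothetical names, and a plain array stands in for the readl() on the MMIO region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BITS_PER_BYTE 8

/* Hypothetical stand-in for the driver data: an array replaces the MMIO banks. */
struct fake_reset_data {
    const uint32_t *banks;
    bool status_active_low;
};

/* Returns 1 if reset line `id` reads back as asserted, 0 otherwise. */
static int fake_reset_status(const struct fake_reset_data *data, unsigned long id)
{
    int reg_width = sizeof(uint32_t);
    unsigned long bank = id / (reg_width * BITS_PER_BYTE);
    unsigned long offset = id % (reg_width * BITS_PER_BYTE);
    uint32_t reg;

    assert(data && data->banks);
    reg = data->banks[bank];

    /*
     * !(bit) is 1 when the bit is clear; !status_active_low is 1 for
     * active-high status. The XOR yields 1 exactly when the bit state
     * matches the "asserted" polarity.
     */
    return !(reg & (UINT32_C(1) << offset)) ^ !data->status_active_low;
}
```

With status_active_low set, a cleared bit reports the line as asserted; with it unset, a set bit does.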

Re: [PATCH v7 7/6] fs/epoll: scale nested callbacks

2017-10-18 Thread Jason Baron



On 10/17/2017 11:53 AM, Davidlohr Bueso wrote:
> On Mon, 16 Oct 2017, Jason Baron wrote:
> 
>> Hi,
>>
>> I posted a patch to completely remove the lock here from the
>> ep_poll_safewake() patch here:
>>
>> http://lkml.iu.edu/hypermail/linux/kernel/1710.1/05834.html
> 
> Kernel development never ceases to amaze me. Two complementary
> fixes to a 8+ y/o performance issue on the same day - not that
> nested epolls are that common, but it also comes from two real
> workloads...
> 
> Getting rid of the lock altogether is always the best way.
> 
>>
>> So these are going to conflict. The reason it's safe to remove the lock
>> is that there are loop and depth checks now that are performed during
>> EPOLL_CTL_ADD. Specifically, ep_loop_check(). I would prefer to do these
>> checks once at add time as opposed to at each wakeup (even if they can
>> be scaled better).
> 
> Wrt conflicts, no worries, I'll rebase -- and hopefully we can get
> the dlock stuff in for v4.15 as well.
> 
>>
>> I also have a pending patch to do something similar for
>> poll_readywalk_ncalls, but I think that poll_safewake_ncalls is the most
>> egregious one here?
> 
> The customer's workload issues are for the loop_ncalls and readywalk_ncalls
> lists, so I'd be interested in seeing your patch for the latter. The reason
> your patch above is likely not to help my scenario is that most of the time
> is spent at a dispatcher level doing epoll_wait, not too many EPOLL_CTL_ADDs
> going on afaict.

If there are not many EPOLL_CTL_ADDs, then I wouldn't think loop_ncalls
would be highly contended (since it should only be called from the add
path)?

Thanks,

-Jason


> 
> Thanks,
> Davidlohr
> 
>>
>> Thanks,
>>
>> -Jason
>>
>> On 10/13/2017 11:45 AM, Davidlohr Bueso wrote:
>>> A customer reported massive contention on the ncalls->lock in which
>>> the workload is designed around nested epolls (where the fd is already
>>> an epoll).
>>>
>>> 83.49%  [kernel]   [k] __pv_queued_spin_lock_slowpath
>>>  2.70%  [kernel]   [k] ep_call_nested.constprop.13
>>>  1.94%  [kernel]   [k] _raw_spin_lock_irqsave
>>>  1.83%  [kernel]   [k]
>>> __raw_callee_save___pv_queued_spin_unlock
>>>  1.45%  [kernel]   [k] _raw_spin_unlock_irqrestore
>>>  0.41%  [kernel]   [k] ep_scan_ready_list.isra.8
>>>  0.36%  [kernel]   [k] pvclock_clocksource_read
>>>
>>> The application, running on kvm, is using a shared memory IPC communication
>>> with a pipe wakeup mechanism, and a two level dispatcher both built around
>>> 'epoll_wait'. There is one process per physical core and a full mesh of pipes
>>> between them, so on an 18 core system (the actual case), there are 18*18 pipes
>>> for the IPCs alone.
>>>
>>> This patch proposes replacing the nested calls global linked list with a dlock
>>> interface, for which we can benefit from pcpu lists when doing ep_poll_safewake(),
>>> and hashing for the current task, which provides two benefits:
>>>
>>> 1. Speeds up the process of loop and max-depth checks from O(N) lookups to O(1)
>>>   (albeit possible collisions, which we have to iterate); and,
>>>
>>> 2. Provides smaller lock granularity.
>>>
>>> cpus     before      after      diff
>>>    1    1409370    1344804    -4.58%
>>>    2    1015690    1337053    31.63%
>>>    3     721009    1273254    76.59%
>>>    4     380959    1128931   196.33%
>>>    5     287603    1028362   257.56%
>>>    6     221104     894943   304.76%
>>>    7     173592     976395   462.46%
>>>    8     145860     922584   532.51%
>>>    9     127877     925900   624.05%
>>>   10     112603     791456   602.87%
>>>   11      97926     724296   639.63%
>>>   12      80732     730485   804.82%
>>>
>>> With the exception of a single cpu (where the overhead of jhashing
>>> influences), we get some pretty good raw throughput increase. Similarly,
>>> the amount of time spent decreases immensely as well.
>>>
>>> Passes ltp related testcases.
>>>
>>> Signed-off-by: Davidlohr Bueso 
>>> ---
>>> fs/eventpoll.c | 88 +++---
>>> 1 file changed, 53 insertions(+), 35 deletions(-)
>>>
>>> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
>>> index 2fabd19cdeea..675c97fdc5da 100644
>>> --- a/fs/eventpoll.c
>>> +++ b/fs/eventpoll.c
>>> @@ -22,7 +22,7 @@
>>> #include 
>>> #include 
>>> #include 
>>> -#include 
>>> +#include 
>>> #include 
>>> #include 
>>> #include 
>>> @@ -119,7 +119,7 @@ struct epoll_filefd {
>>>  * and loop cycles.
>>>  */
>>> struct nested_call_node {
>>> -    struct list_head llink;
>>> +    struct dlock_list_node llink;
>>>     void *cookie;
>>>     void *ctx;
>>> };
>>> @@ -129,8 +129,7 @@ struct nested_call_node {
>>>  * maximum recursion dept and loop cycles.
>>>  */
>>> struct nested_calls {
>>> -    struct list_head tasks_call_list;
>>> -    spinlock_t lock;
>>> +    struct dlock_list_heads tasks_call_list;
>>>
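
To make the "smaller lock granularity" point in the quoted changelog concrete, here is a rough user-space sketch. It is not the dlock-list API from the patch. It assumes a fixed array of buckets, each with its own C11 atomic_flag spinlock, and hashes the ctx pointer (the current task in the changelog) to choose a bucket, so a lookup walks one short list under one small lock. All names (nested_calls_sketch, sketch_enter, sketch_leave) are hypothetical, and only the duplicate (loop) check is modelled, not the max-depth check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 16 /* arbitrary for the sketch; must match the 4-bit hash below */

struct call_node {
    struct call_node *next;
    const void *cookie;
    const void *ctx;
};

struct bucket {
    atomic_flag lock;        /* per-bucket spinlock instead of one global lock */
    struct call_node *head;
};

struct nested_calls_sketch {
    struct bucket buckets[NBUCKETS];
};

static void sketch_init(struct nested_calls_sketch *nc)
{
    for (size_t i = 0; i < NBUCKETS; i++) {
        atomic_flag_clear(&nc->buckets[i].lock);
        nc->buckets[i].head = NULL;
    }
}

/* Multiplicative hash of the ctx pointer; the top 4 bits select 0..15. */
static size_t bucket_of(const void *ctx)
{
    uint64_t h = (uint64_t)(uintptr_t)ctx * UINT64_C(0x9E3779B97F4A7C15);
    return (size_t)(h >> 60);
}

static void bucket_lock(struct bucket *b)
{
    while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void bucket_unlock(struct bucket *b)
{
    atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Adds n; returns false if the same (cookie, ctx) pair is already present. */
static bool sketch_enter(struct nested_calls_sketch *nc, struct call_node *n)
{
    struct bucket *b;
    bool ok = true;

    assert(nc && n);
    b = &nc->buckets[bucket_of(n->ctx)];
    bucket_lock(b);
    for (struct call_node *p = b->head; p; p = p->next) {
        if (p->ctx == n->ctx && p->cookie == n->cookie) {
            ok = false; /* loop detected */
            break;
        }
    }
    if (ok) {
        n->next = b->head;
        b->head = n;
    }
    bucket_unlock(b);
    return ok;
}

/* Removes n from its bucket if present. */
static void sketch_leave(struct nested_calls_sketch *nc, struct call_node *n)
{
    struct bucket *b = &nc->buckets[bucket_of(n->ctx)];

    bucket_lock(b);
    for (struct call_node **pp = &b->head; *pp; pp = &(*pp)->next) {
        if (*pp == n) {
            *pp = n->next;
            break;
        }
    }
    bucket_unlock(b);
}
```

Callers hashing to different buckets never touch the same lock, which is the effect the changelog attributes to the per-cpu lists.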

Re: [PATCH v3 29/33] tracing: Add inter-event hist trigger Documentation

2017-10-18 Thread Tom Zanussi

On Tue, 2017-10-17 at 21:36 -0400, Steven Rostedt wrote:
> On Fri, 22 Sep 2017 15:00:09 -0500
> Tom Zanussi  wrote:
> 
> > Add background and details on inter-event hist triggers, including
> > hist variables, synthetic events, and actions.
> 
> One more thing. Make this a separate document. Don't add it to the
> events.txt. Have a histogram.txt or something and just have events.txt
> reference that document.
> 

OK, will do.

Tom

> Thanks!
> 
> -- Steve
> 
> > 
> > Signed-off-by: Tom Zanussi 
> > Signed-off-by: Baohong Liu 
> > ---
> >

Re: [PATCH RFC 00/10] Intel EPT-Based Sub-page Write Protection Support.

2017-10-18 Thread Yi Zhang

On 2017-10-18 at 11:35:12 +0200, Paolo Bonzini wrote:
> >
> > Currently, we only block write access. As one example that I know of,
> > we are now using it in a security daemon:
> 
> Understood.  However, I think QEMU is the wrong place to set this up.
> 
> If the kernel wants to protect _itself_, it should use a hypercall.  If
> an introspector appliance wants to protect the guest kernel, it should
> use the socket that connects it to the hypervisor.
> 
> Paolo
> 

Thanks Paolo,

Yes, that can be corrected. I will think about switching the interface to a
hypercall. How about we keep both interfaces (hypercall + ioctl)? If the
VMM manager has some way to intercept guest kernel memory accesses, the
page protection would act like a hardware watchpoint. Would that be an
easy way to let the VMM manager debug the guest kernel?

Apart from the interface change, could you please help review the rest
of the patch series? Just skip the ioctl patch (patch 7).
Thank you very much Paolo.

Re: [PATCH] Staging: rtl8723bs: core: rtw_recv: fix parenthesis alignment warning in validate_recv_mgnt_frame()

2017-10-18 Thread Greg KH

On Thu, Oct 05, 2017 at 03:45:33PM +0200, Srinivasan Shanmugam wrote:
> Fix parenthesis alignment warning in validate_recv_mgnt_frame()
> 
> Signed-off-by: Srinivasan Shanmugam 
> ---
>  drivers/staging/rtl8723bs/core/rtw_recv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Patch does not apply to my tree :(

Re: [PATCH net 0/3] Fix for BPF devmap percpu allocation splat

2017-10-18 Thread Daniel Borkmann


On 10/18/2017 03:25 PM, Tejun Heo wrote:
> Hello, Daniel.
>
> (cc'ing Dennis)
>
> On Tue, Oct 17, 2017 at 04:55:51PM +0200, Daniel Borkmann wrote:
>> The set fixes a splat in devmap percpu allocation when we alloc
>> the flush bitmap. Patch 1 is a prerequisite for the fix in patch 2,
>> patch 1 is rather small, so if this could be routed via -net, for
>> example, with Tejun's Ack that would be good. Patch 3 gets rid of
>> remaining PCPU_MIN_UNIT_SIZE checks, which are percpu allocator
>> internals and should not be used.
>>
>> Thanks!
>>
>> Daniel Borkmann (3):
>>    mm, percpu: add support for __GFP_NOWARN flag
>
> This looks fine.

Great, thanks!

>>    bpf: fix splat for illegal devmap percpu allocation
>>    bpf: do not test for PCPU_MIN_UNIT_SIZE before percpu allocations
>
> These look okay too but if it helps percpu allocator can expose the
> maximum size / alignment supported to take out the guessing game too.

At least from BPF side there's right now no infra for exposing
max possible alloc sizes for maps to e.g. user space as indication.
There are few users left in the tree, where it would make sense for
having some helpers though:

  arch/tile/kernel/setup.c:729:   if (size < PCPU_MIN_UNIT_SIZE)
  arch/tile/kernel/setup.c:730:   size = PCPU_MIN_UNIT_SIZE;
  drivers/net/ethernet/chelsio/libcxgb/libcxgb_ppm.c:346: unsigned int max = (PCPU_MIN_UNIT_SIZE - sizeof(*pools)) << 3;
  drivers/net/ethernet/chelsio/libcxgb/libcxgb_ppm.c:352: /* make sure per cpu pool fits into PCPU_MIN_UNIT_SIZE */
  drivers/scsi/libfc/fc_exch.c:2488:   /* reduce range so per cpu pool fits into PCPU_MIN_UNIT_SIZE pool */
  drivers/scsi/libfc/fc_exch.c:2489:  pool_exch_range = (PCPU_MIN_UNIT_SIZE - sizeof(*pool)) /

> Also, the reason why PCPU_MIN_UNIT_SIZE is what it is is because
> nobody needed anything bigger.  Increasing the size doesn't really
> cost much at least on 64bit archs.  Is that something we want to be
> considering?

For devmap (and cpumap) itself it wouldn't make sense. For per-cpu
hashtable we could indeed consider it in the future.

Thanks,
Daniel

Re: [PATCH] epoll: avoid calling ep_call_nested() from ep_poll_safewake()

2017-10-18 Thread Jason Baron

On 10/17/2017 11:37 AM, Davidlohr Bueso wrote:
> On Fri, 13 Oct 2017, Jason Baron wrote:
> 
>> The ep_poll_safewake() function is used to wakeup potentially nested epoll
>> file descriptors. The function uses ep_call_nested() to prevent entering
>> the same wake up queue more than once, and to prevent excessively deep
>> wakeup paths (deeper than EP_MAX_NESTS). However, this is not necessary
>> since we are already preventing these conditions during EPOLL_CTL_ADD. This
>> saves extra function calls, and avoids taking a global lock during the
>> ep_call_nested() calls.
> 
> This makes sense.
> 
>>
>> I have, however, left ep_call_nested() for the CONFIG_DEBUG_LOCK_ALLOC
>> case, since ep_call_nested() keeps track of the nesting level, and this is
>> required by the call to spin_lock_irqsave_nested(). It would be nice to
>> remove the ep_call_nested() calls for the CONFIG_DEBUG_LOCK_ALLOC case as
>> well, however it's not clear how to simply pass the nesting level through
>> multiple wake_up() levels without more surgery. In any case, I don't think
>> CONFIG_DEBUG_LOCK_ALLOC is generally used for production. This patch also
>> apparently fixes a workload at Google that Salman Qazi reported by
>> completely removing the poll_safewake_ncalls->lock from wakeup paths.
> 
> I'm a bit curious about the workload (which uses lots of EPOLL_CTL_ADDs) as
> I was tackling the nested epoll scaling issue with loop and readywalk lists
> in mind.
>>

I'm not sure of the details of the workload - perhaps Salman can elaborate
further about it.

It would seem that the safewake would potentially be the most contended
in general in the nested case, because generally you have a few epoll
fds attached to lots of sources doing wakeups. So those sources are all
going to conflict on the safewake lock. The readywalk is used when
performing a 'nested' poll and in general this is likely going to be
called on a few epoll fds. That said, we should remove it too. I will
post a patch to remove it.

The loop lock is used during EPOLL_CTL_ADD to check for loops and deep
wakeup paths and so I would expect this to be less common, but I
wouldn't doubt there are workloads impacted by it. We can potentially, I
think, remove this one too - and the global 'epmutex'. I posted some
ideas a while ago on it:

http://lkml.iu.edu/hypermail//linux/kernel/1501.1/05905.html

We can work through these ideas or others...

Thanks,

-Jason

>> Signed-off-by: Jason Baron 
>> Cc: Alexander Viro 
>> Cc: Andrew Morton 
>> Cc: Salman Qazi 
> 
> Acked-by: Davidlohr Bueso 
> 
>> ---
>> fs/eventpoll.c | 47 ++-
>> 1 file changed, 18 insertions(+), 29 deletions(-)
> 
> Yay for getting rid of some of the callback hell.
> 
> Thanks,
> Davidlohr
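
As a rough illustration of the argument above (validate loops and nesting depth once at EPOLL_CTL_ADD time so the wakeup path need not re-check), here is a toy user-space model. It is not the kernel's ep_loop_check(): ep_node, sketch_ctl_add and the limit SKETCH_MAX_NESTS are hypothetical, and only the chain below the newly added child is measured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_NESTS 4 /* assumed nesting limit, for illustration only */
#define MAX_CHILDREN     4

/* A toy "epoll instance" that watches up to MAX_CHILDREN other instances. */
struct ep_node {
    struct ep_node *children[MAX_CHILDREN];
    size_t nchildren;
};

/*
 * Walks everything reachable from n. Fails if `forbidden` is reachable
 * (adding the edge would close a loop) or if the chain gets too deep.
 */
static bool check_subtree(const struct ep_node *n,
                          const struct ep_node *forbidden, int depth)
{
    if (n == forbidden || depth > SKETCH_MAX_NESTS)
        return false;
    for (size_t i = 0; i < n->nchildren; i++)
        if (!check_subtree(n->children[i], forbidden, depth + 1))
            return false;
    return true;
}

/* Model of the add-time check: parent starts watching child. */
static bool sketch_ctl_add(struct ep_node *parent, struct ep_node *child)
{
    assert(parent && child);
    if (parent->nchildren == MAX_CHILDREN)
        return false;
    if (!check_subtree(child, parent, 1))
        return false;
    parent->children[parent->nchildren++] = child;
    return true;
}
```

Because every edge passes this check before it is recorded, the graph stays acyclic and bounded in depth, so a later wakeup can follow the edges without repeating the test.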

[PATCH v4 4/4] Add the fp_selection_helper to set the fp for print functions

2017-10-18 Thread yuzhoujian

This patch makes all print functions receive the fp, adds the
fp_selection_helper function to select the fp (stdout or the dump_event fp),
and opens the dump file for all print functions. When the perf script is
over, it closes the dump_event file and calculates its size.

Changes since v3:
- free the evsel->priv by zfree()

Changes since v2:
- remove the file_name variable and get the data file name from struct
 perf_session
- remove the per_event_dump_file variable and get the dump_event fp from struct
 perf_evsel
- add the fp_selection_helper function to select the fp (stdout or the dump_event
 fp) and open the dump file for all print functions if evname and last evsel
 name are not the same.
- close the dump file for all the evsels and calculate the dump file's size at
 the end of the perf script.
- solve the segmentation fault generated by perf script --per-event-dump
 --show-mmap-events

Changes since v1:
- modify the dump file name to -script-dump-.txt
 e.g. perf.data-script-dump-cycles.txt, perf.data-script-dump-cs.txt
- split the original patch (Make all those related functions receive the file
 pointer) into two patches, and this is the second part of the original one.

Signed-off-by: yuzhoujian 
---
 tools/perf/builtin-script.c | 438 +---
 tools/perf/util/session.c   |  20 +-
 2 files changed, 273 insertions(+), 185 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 8c297f0..b49d380 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -501,7 +501,7 @@ static int perf_session__check_output_opt(struct perf_session *session)
 }
 
 static void fprint_sample_iregs(struct perf_sample *sample,
- struct perf_event_attr *attr, FILE *fp __maybe_unused)
+ struct perf_event_attr *attr, FILE *fp)
 {
struct regs_dump *regs = &sample->intr_regs;
uint64_t mask = attr->sample_regs_intr;
@@ -512,12 +512,12 @@ static void fprint_sample_iregs(struct perf_sample *sample,
 
for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
u64 val = regs->regs[i++];
-   fprintf(stdout, "%5s:0x%"PRIx64" ", perf_reg_name(r), val);
+   fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r), val);
}
 }
 
 static void fprint_sample_uregs(struct perf_sample *sample,
- struct perf_event_attr *attr, FILE *fp __maybe_unused)
+ struct perf_event_attr *attr, FILE *fp)
 {
struct regs_dump *regs = &sample->user_regs;
uint64_t mask = attr->sample_regs_user;
@@ -526,17 +526,17 @@ static void fprint_sample_uregs(struct perf_sample *sample,
if (!regs || !regs->regs)
return;
 
-   fprintf(stdout, " ABI:%" PRIu64 " ", regs->abi);
+   fprintf(fp, " ABI:%" PRIu64 " ", regs->abi);
 
for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
u64 val = regs->regs[i++];
-   fprintf(stdout, "%5s:0x%"PRIx64" ", perf_reg_name(r), val);
+   fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r), val);
}
 }
 
 static void fprint_sample_start(struct perf_sample *sample,
   struct thread *thread,
-  struct perf_evsel *evsel, FILE *fp __maybe_unused)
+  struct perf_evsel *evsel, FILE *fp)
 {
struct perf_event_attr *attr = &evsel->attr;
unsigned long secs;
@@ -544,25 +544,25 @@ static void fprint_sample_start(struct perf_sample *sample,
 
if (PRINT_FIELD(COMM)) {
if (latency_format)
-   fprintf(stdout, "%8.8s ", thread__comm_str(thread));
+   fprintf(fp, "%8.8s ", thread__comm_str(thread));
else if (PRINT_FIELD(IP) && symbol_conf.use_callchain)
-   fprintf(stdout, "%s ", thread__comm_str(thread));
+   fprintf(fp, "%s ", thread__comm_str(thread));
else
-   fprintf(stdout, "%16s ", thread__comm_str(thread));
+   fprintf(fp, "%16s ", thread__comm_str(thread));
}
 
if (PRINT_FIELD(PID) && PRINT_FIELD(TID))
-   fprintf(stdout, "%5d/%-5d ", sample->pid, sample->tid);
+   fprintf(fp, "%5d/%-5d ", sample->pid, sample->tid);
else if (PRINT_FIELD(PID))
-   fprintf(stdout, "%5d ", sample->pid);
+   fprintf(fp, "%5d ", sample->pid);
else if (PRINT_FIELD(TID))
-   fprintf(stdout, "%5d ", sample->tid);
+   fprintf(fp, "%5d ", sample->tid);
 
if (PRINT_FIELD(CPU)) {
if (latency_format)
-   fprintf(stdout, "%3d ", sample->cpu);
+   fprintf(fp, "%3d ", sample->cpu);
else
-   fprintf(stdout, "[%03d] ", sample->cpu);
+

[PATCH v4 1/4] Add new elements for per-event-dump option

2017-10-18 Thread yuzhoujian

This patch will add two elements for perf_tool struct: per_event_dump
is used to mark the per-event-dump option, last_evsel_name is used
to save last evsel's name. Add a new struct perf_script_evsel to
save evsel's specific data. There are three elements in this new struct:
dump_evsel_fp is used to save the file pointer of the dump_event file,
filename is used to save the file name of the dump_event file, samples
is used to save the number of samples for each evsel. The perf_script_evsel
struct will be saved in the evsel->priv. Add the OPT_BOOLEAN for per-event-dump
in the perf_data_file struct.

Changes since v3:
- remove three elements for perf_evsel struct and create a new struct:
 perf_script_evsel to save them.

Changes since v2:
- add the last_evsel_name for perf_tool struct to save last evsel's name.
- add three elements for perf_evsel struct:dump_event_fp is used to save
 the file pointer of the dump_event file, filename is used to save the file
 name of the dump_event file, samples is used to save the number of samples
 for each evsel.

Changes since v1:
- remove the set for script.tool.per_event_dump variable, since the OPT_BOOLEAN
 will do the same thing.

Signed-off-by: yuzhoujian 
---
 tools/perf/builtin-script.c |  3 +++
 tools/perf/util/evsel.h | 12 
 tools/perf/util/tool.h  |  2 ++
 3 files changed, 17 insertions(+)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7167df2..4ffa716 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2729,6 +2729,7 @@ int cmd_script(int argc, const char **argv)
.cpu_map = process_cpu_map_event,
.ordered_events  = true,
.ordering_requires_timestamps = true,
+   .per_event_dump = false,
},
};
struct perf_data_file file = {
@@ -2799,6 +2800,8 @@ int cmd_script(int argc, const char **argv)
"Show context switch events (if recorded)"),
OPT_BOOLEAN('\0', "show-namespace-events", &script.show_namespace_events,
"Show namespace events (if recorded)"),
+   OPT_BOOLEAN('\0', "per-event-dump", &script.tool.per_event_dump,
+   "print trace output to files named by the monitored events"),
OPT_BOOLEAN('f', "force", &symbol_conf.force, "don't complain, do it"),
OPT_INTEGER(0, "max-blocks", &max_blocks,
"Maximum number of code blocks to dump with brstackinsn"),
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index db65878..2ab0650 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "xyarray.h"
@@ -51,6 +52,17 @@ enum {
PERF_EVSEL__CONFIG_TERM_MAX,
 };
 
+/**
+ * The struct perf_script_evsel is used to save the dump file's name,
+ * dump_evsel_fp and the total number of samples for each evsel when
+ * the per-event-dump option is set.
+ */
+struct perf_script_evsel {
+   char            *filename;
+   FILE            *dump_evsel_fp;
+   unsigned long   samples;
+};
+
 struct perf_evsel_config_term {
struct list_head list;
int type;
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index d549e50..2cbcee4 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -75,6 +75,8 @@ struct perf_tool {
bool ordered_events;
bool ordering_requires_timestamps;
bool namespace_events;
+   bool per_event_dump;
+   const char  *last_evsel_name;
enum show_feature_header show_feat_hdr;
 };
 
-- 
1.8.3.1
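
For context on how the new fields might be used, here is a small stand-alone sketch, not code from this series. It assumes the dump file name follows the pattern suggested by the examples in the v4 4/4 changelog (perf.data-script-dump-cycles.txt) and that output falls back to stdout when no dump file is open. The names script_evsel_sketch, sketch_dump_name and sketch_select_fp are hypothetical, and a fixed-size buffer replaces the char pointer for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Mirrors the three fields described above; layout is simplified. */
struct script_evsel_sketch {
    char filename[256];
    FILE *dump_fp;
    unsigned long samples;
};

/*
 * Builds "<data file>-script-dump-<event>.txt".
 * Returns 0 on success, -1 if the name did not fit in buf.
 */
static int sketch_dump_name(char *buf, size_t len,
                            const char *data_file, const char *evname)
{
    int n;

    assert(buf && data_file && evname);
    n = snprintf(buf, len, "%s-script-dump-%s.txt", data_file, evname);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Picks the stream: the per-event file if dumping is on and it is open. */
static FILE *sketch_select_fp(bool per_event_dump,
                              struct script_evsel_sketch *es)
{
    if (per_event_dump && es && es->dump_fp)
        return es->dump_fp;
    return stdout;
}
```

Every print function can then write to the returned stream with fprintf(), which is why the later patches in the series replace printf().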

[PATCH v4 3/4] Replace printf with fprintf for all print functions

2017-10-18 Thread yuzhoujian

This patch will replace printf with fprintf for all print functions in the
builtin-script in order to support the per-event-dump option.

Changes since v3:
- none

Changes since v2:
- none

Changes since v1:
- remove the fp_selection_helper function for setting the fp argument, and use
 a local variable to do the same thing.

Signed-off-by: yuzhoujian 
---
 tools/perf/builtin-script.c | 178 ++--
 1 file changed, 89 insertions(+), 89 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 4b51dd1..8c297f0 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -512,7 +512,7 @@ static void fprint_sample_iregs(struct perf_sample *sample,
 
for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
u64 val = regs->regs[i++];
-   printf("%5s:0x%"PRIx64" ", perf_reg_name(r), val);
+   fprintf(stdout, "%5s:0x%"PRIx64" ", perf_reg_name(r), val);
}
 }
 
@@ -526,11 +526,11 @@ static void fprint_sample_uregs(struct perf_sample *sample,
if (!regs || !regs->regs)
return;
 
-   printf(" ABI:%" PRIu64 " ", regs->abi);
+   fprintf(stdout, " ABI:%" PRIu64 " ", regs->abi);
 
for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
u64 val = regs->regs[i++];
-   printf("%5s:0x%"PRIx64" ", perf_reg_name(r), val);
+   fprintf(stdout, "%5s:0x%"PRIx64" ", perf_reg_name(r), val);
}
 }
 
@@ -544,25 +544,25 @@ static void fprint_sample_start(struct perf_sample *sample,
 
if (PRINT_FIELD(COMM)) {
if (latency_format)
-   printf("%8.8s ", thread__comm_str(thread));
+   fprintf(stdout, "%8.8s ", thread__comm_str(thread));
else if (PRINT_FIELD(IP) && symbol_conf.use_callchain)
-   printf("%s ", thread__comm_str(thread));
+   fprintf(stdout, "%s ", thread__comm_str(thread));
else
-   printf("%16s ", thread__comm_str(thread));
+   fprintf(stdout, "%16s ", thread__comm_str(thread));
}
 
if (PRINT_FIELD(PID) && PRINT_FIELD(TID))
-   printf("%5d/%-5d ", sample->pid, sample->tid);
+   fprintf(stdout, "%5d/%-5d ", sample->pid, sample->tid);
else if (PRINT_FIELD(PID))
-   printf("%5d ", sample->pid);
+   fprintf(stdout, "%5d ", sample->pid);
else if (PRINT_FIELD(TID))
-   printf("%5d ", sample->tid);
+   fprintf(stdout, "%5d ", sample->tid);
 
if (PRINT_FIELD(CPU)) {
if (latency_format)
-   printf("%3d ", sample->cpu);
+   fprintf(stdout, "%3d ", sample->cpu);
else
-   printf("[%03d] ", sample->cpu);
+   fprintf(stdout, "[%03d] ", sample->cpu);
}
 
if (PRINT_FIELD(TIME)) {
@@ -571,11 +571,11 @@ static void fprint_sample_start(struct perf_sample *sample,
nsecs -= secs * NSEC_PER_SEC;
 
if (nanosecs)
-   printf("%5lu.%09llu: ", secs, nsecs);
+   fprintf(stdout, "%5lu.%09llu: ", secs, nsecs);
else {
char sample_time[32];
timestamp__scnprintf_usec(sample->time, sample_time, sizeof(sample_time));
-   printf("%12s: ", sample_time);
+   fprintf(stdout, "%12s: ", sample_time);
}
}
 }
@@ -612,21 +612,21 @@ static void fprint_sample_brstack(struct perf_sample *sample,
thread__find_addr_map(thread, sample->cpumode, MAP__FUNCTION, to, &alt);
}
 
-   printf(" 0x%"PRIx64, from);
+   fprintf(stdout, " 0x%"PRIx64, from);
if (PRINT_FIELD(DSO)) {
-   printf("(");
+   fprintf(stdout, "(");
map__fprintf_dsoname(alf.map, stdout);
-   printf(")");
+   fprintf(stdout, ")");
}
 
-   printf("/0x%"PRIx64, to);
+   fprintf(stdout, "/0x%"PRIx64, to);
if (PRINT_FIELD(DSO)) {
-   printf("(");
+   fprintf(stdout, "(");
map__fprintf_dsoname(alt.map, stdout);
-   printf(")");
+   fprintf(stdout, ")");
}
 
-   printf("/%c/%c/%c/%d ",
+   fprintf(stdout, "/%c/%c/%c/%d ",
mispred_str( br->entries + i),
br->entries[i].flags.in_tx? 'X' : '-',
br->entries[i].flags.abort? 'A' : '-',
@@ -663,18 +663,18 @@ static void fprint_sample_brstacksym(struct perf_sample *sample,

Re: [PATCH RFC 00/10] Intel EPT-Based Sub-page Write Protection Support.

2017-10-18 Thread Yi Zhang

On 2017-10-18 at 00:09:36 -0700, Christoph Hellwig wrote:
> > We introduced 2 ioctls to let a user application set/get the subpage write
> > protection bitmap per gfn; each gfn corresponds to a bitmap.
> > The user application (qemu, or some other security control daemon) will set
> > the protection bitmap via this ioctl.
> > The API is defined as:
> > struct kvm_subpage {
> > __u64 base_gfn;
> > __u64 npages;
> > /* sub-page write-access bitmap array */
> > __u32 access_map[SUBPAGE_MAX_BITMAP];
> > }sp;
> > kvm_vm_ioctl(s, KVM_SUBPAGES_SET_ACCESS, &sp)
> > kvm_vm_ioctl(s, KVM_SUBPAGES_GET_ACCESS, &sp)
> 
> What is the use case for this feature?

Thanks for your review, Chris,

I have prepared a draft version of a tool embedded in the qemu
command line, which means that we could set/get the subpage protection
via a qemu command.

Attached is the qemu patch. It is a pre-design version; I'm considering
changing the interface to a hypercall, as Paolo advised.

From a369bed5d986dccb3ca36dc5a27c6220ca2d1405 Mon Sep 17 00:00:00 2001
From: Zhang Yi Z 
Date: Tue, 14 Mar 2017 15:11:38 +0800
Subject: [PATCH] x86: Intel Sub-Page Protection support

Signed-off-by: He Chen 
Signed-off-by: Zhang Yi Z 
---
 hmp-commands.hx   | 26 ++
 hmp.c | 26 ++
 hmp.h |  2 ++
 include/sysemu/kvm.h  |  2 ++
 kvm-all.c | 40 
 linux-headers/linux/kvm.h | 15 +++
 qapi-schema.json  | 41 +
 qmp.c | 43 +++
 target/i386/kvm.c | 22 ++
 9 files changed, 217 insertions(+)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 8819281..7a57411 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1766,6 +1766,32 @@ Set QOM property @var{property} of object at location @var{path} to value @var{v
 ETEXI
 
 {
+.name   = "get-subpage",
+.args_type  = "base_gfn:l,npages:l,filename:str",
+.params = "base_gfn npages filename",
+.help   = "get the write-protect bitmap setting of sub-page protection",
+.cmd    = hmp_get_subpage,
+},
+
+STEXI
+@item get-subpage @var{base_gfn} @var{npages} @var{filename}
+Get the write-protect bitmap setting of sub-page protection in the range of @var{base_gfn} to @var{base_gfn} + @var{npages}
+ETEXI
+
+{
+.name   = "set-subpage",
+.args_type  = "base_gfn:l,npages:l,wp_map:i",
+.params = "base_gfn npages wp_map",
+.help   = "set the write-protect bitmap setting of sub-page protection",
+.cmd    = hmp_set_subpage,
+},
+
+STEXI
+@item set-subpage @var{base_gfn} @var{npages} @var{wp_map}
+Set the write-protect bitmap setting of sub-page protection in the range of @var{base_gfn} to @var{base_gfn} + @var{npages}
+ETEXI
+
+{
 .name   = "info",
 .args_type  = "item:s?",
 .params = "[subcommand]",
diff --git a/hmp.c b/hmp.c
index 261843f..7d217e9 100644
--- a/hmp.c
+++ b/hmp.c
@@ -2614,3 +2614,29 @@ void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict)
 }
 qapi_free_GuidInfo(info);
 }
+
+void hmp_get_subpage(Monitor *mon, const QDict *qdict)
+{
+uint64_t base_gfn = qdict_get_int(qdict, "base_gfn");
+uint64_t npages = qdict_get_int(qdict, "npages");
+const char *filename = qdict_get_str(qdict, "filename");
+Error *err = NULL;
+
+monitor_printf(mon, "base_gfn: %" PRIu64 ", npages: %" PRIu64 ", file: %s\n", base_gfn, npages, filename);
+
+qmp_get_subpage(base_gfn, npages, filename, &err);
+hmp_handle_error(mon, &err);
+}
+
+void hmp_set_subpage(Monitor *mon, const QDict *qdict)
+{
+uint64_t base_gfn = qdict_get_int(qdict, "base_gfn");
+uint64_t npages = qdict_get_int(qdict, "npages");
+uint32_t wp_map = qdict_get_int(qdict, "wp_map");
+Error *err = NULL;
+
+monitor_printf(mon, "base_gfn: %" PRIu64 ", npages: %" PRIu64 ", wp_map: %" PRIu32 "\n", base_gfn, npages, wp_map);
+
+qmp_set_subpage(base_gfn, npages, wp_map, &err);
+hmp_handle_error(mon, &err);
+}
diff --git a/hmp.h b/hmp.h
index 799fd37..b72143f 100644
--- a/hmp.h
+++ b/hmp.h
@@ -138,5 +138,7 @@ void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict);
 void hmp_info_dump(Monitor *mon, const QDict *qdict);
 void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict);
 void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict);
+void hmp_get_subpage(Monitor *mon, const QDict *qdict);
+void hmp_set_subpage(Monitor *mon, const QDict *qdict);
 
 #endif
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 24281fc..f7c1340 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -528,4 +528,6 @@ int kvm_set_one_reg(CPUState *cs, uint64_t id, void *source);
  */
 int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target);
 in
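
To illustrate what a per-gfn 32-bit access map could encode, here is a small sketch under stated assumptions, not code from the patch: a 4 KiB page is split into 32 sub-pages of 128 bytes, one bit per sub-page, and a set bit is taken to mean "write allowed". The polarity and the helper names (subpage_index, subpage_write_allowed, write_allowed) are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE    4096u
#define SKETCH_SUBPAGES     32u /* one bit of a 32-bit map per sub-page */
#define SKETCH_SUBPAGE_SIZE (SKETCH_PAGE_SIZE / SKETCH_SUBPAGES) /* 128 bytes */

/* Index (0..31) of the sub-page containing guest physical address gpa. */
static unsigned int subpage_index(uint64_t gpa)
{
    return (unsigned int)((gpa % SKETCH_PAGE_SIZE) / SKETCH_SUBPAGE_SIZE);
}

/* Assumed polarity: a set bit means writes to that sub-page are allowed. */
static bool subpage_write_allowed(uint32_t access_map, uint64_t gpa)
{
    return (access_map >> subpage_index(gpa)) & 1u;
}

/*
 * A write of len bytes that stays within one page is allowed only if
 * every sub-page it touches is writable.
 */
static bool write_allowed(uint32_t access_map, uint64_t gpa, uint32_t len)
{
    assert(len > 0 && (gpa % SKETCH_PAGE_SIZE) + len <= SKETCH_PAGE_SIZE);
    for (unsigned int i = subpage_index(gpa);
         i <= subpage_index(gpa + len - 1); i++)
        if (!((access_map >> i) & 1u))
            return false;
    return true;
}
```

Under these assumptions a map of 0x6 permits writes only to bytes 128..383 of the page, which is the kind of fine-grained protection a security daemon could request.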

[PATCH v4 2/4] Add fp argument to print functions

2017-10-18 Thread yuzhoujian

This patch will add the fp argument to all the print functions so that they can
use different file pointers to print to the screen or dump to a file.

Changes since v3:
- none

Changes since v2:
- none

Changes since v1:
- add the __maybe_unused attribute for the fp argument in all the print
 functions, because the fp is not used in this patch but needed in the later
 patches.
- split the original patch (Makes all those related functions receive the FILE
 pointer) into two simple patches, and this is the first part.

Signed-off-by: yuzhoujian 
---
 tools/perf/builtin-script.c | 190 +++-
 1 file changed, 100 insertions(+), 90 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 4ffa716..4b51dd1 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -500,8 +500,8 @@ static int perf_session__check_output_opt(struct perf_session *session)
return 0;
 }
 
-static void print_sample_iregs(struct perf_sample *sample,
- struct perf_event_attr *attr)
+static void fprint_sample_iregs(struct perf_sample *sample,
+ struct perf_event_attr *attr, FILE *fp __maybe_unused)
 {
struct regs_dump *regs = &sample->intr_regs;
uint64_t mask = attr->sample_regs_intr;
@@ -516,8 +516,8 @@ static void print_sample_iregs(struct perf_sample *sample,
}
 }
 
-static void print_sample_uregs(struct perf_sample *sample,
- struct perf_event_attr *attr)
+static void fprint_sample_uregs(struct perf_sample *sample,
+ struct perf_event_attr *attr, FILE *fp __maybe_unused)
 {
struct regs_dump *regs = &sample->user_regs;
uint64_t mask = attr->sample_regs_user;
@@ -534,9 +534,9 @@ static void print_sample_uregs(struct perf_sample *sample,
}
 }
 
-static void print_sample_start(struct perf_sample *sample,
+static void fprint_sample_start(struct perf_sample *sample,
   struct thread *thread,
-  struct perf_evsel *evsel)
+  struct perf_evsel *evsel, FILE *fp 
__maybe_unused)
 {
struct perf_event_attr *attr = &evsel->attr;
unsigned long secs;
@@ -589,9 +589,10 @@ static void print_sample_start(struct perf_sample *sample,
return br->flags.predicted ? 'P' : 'M';
 }
 
-static void print_sample_brstack(struct perf_sample *sample,
+static void fprint_sample_brstack(struct perf_sample *sample,
 struct thread *thread,
-struct perf_event_attr *attr)
+struct perf_event_attr *attr,
+FILE *fp __maybe_unused)
 {
struct branch_stack *br = sample->branch_stack;
struct addr_location alf, alt;
@@ -633,9 +634,10 @@ static void print_sample_brstack(struct perf_sample 
*sample,
}
 }
 
-static void print_sample_brstacksym(struct perf_sample *sample,
+static void fprint_sample_brstacksym(struct perf_sample *sample,
struct thread *thread,
-   struct perf_event_attr *attr)
+   struct perf_event_attr *attr,
+   FILE *fp __maybe_unused)
 {
struct branch_stack *br = sample->branch_stack;
struct addr_location alf, alt;
@@ -680,9 +682,10 @@ static void print_sample_brstacksym(struct perf_sample 
*sample,
}
 }
 
-static void print_sample_brstackoff(struct perf_sample *sample,
+static void fprint_sample_brstackoff(struct perf_sample *sample,
struct thread *thread,
-   struct perf_event_attr *attr)
+   struct perf_event_attr *attr,
+   FILE *fp __maybe_unused)
 {
struct branch_stack *br = sample->branch_stack;
struct addr_location alf, alt;
@@ -789,9 +792,9 @@ static int grab_bb(u8 *buffer, u64 start, u64 end,
return len;
 }
 
-static void print_jump(uint64_t ip, struct branch_entry *en,
+static void fprint_jump(uint64_t ip, struct branch_entry *en,
   struct perf_insn *x, u8 *inbuf, int len,
-  int insn)
+  int insn, FILE *fp __maybe_unused)
 {
printf("\t%016" PRIx64 "\t%-30s\t#%s%s%s%s",
   ip,
@@ -808,9 +811,10 @@ static void print_jump(uint64_t ip, struct branch_entry 
*en,
putchar('\n');
 }
 
-static void print_ip_sym(struct thread *thread, u8 cpumode, int cpu,
+static void fprint_ip_sym(struct thread *thread, u8 cpumode, int cpu,
 uint64_t addr, struct symbol **lastsym,
-struct perf_event_attr *attr)
+struct perf_event_attr *attr,
+FILE *fp __maybe_unused)
 {
struct addr_location a

[PATCH v4 0/4] perf script: Add script per-event-dump support

2017-10-18 Thread yuzhoujian

Introduce a new option to print trace output to files named by the
monitored events and update perf-script documentation accordingly.

Shown below is output of perf script command with the newly introduced
option.

 $perf record -e cycles -e cs -ag -- sleep 1
 $perf script --per-event-dump
 [ perf script: Wrote 0.051 MB perf.data-script-dump-cycles.txt (76 samples) ]
 [ perf script: Wrote 0.012 MB perf.data-script-dump-cs.txt (69 samples) ]
 $ls
 perf.data-script-dump-cycles.txt perf.data-script-dump-cs.txt

Without per-event-dump support, drawing flamegraphs for different events is
troublesome: you have to monitor one event at a time. With this option, the
trace output is written to files named after the monitored events, so a
flamegraph can be drawn for each event from the file carrying its name.
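
As a rough sketch of the naming scheme visible in the sample output above,
a dump file name can be built from the input file name and the event name.
dump_file_name() is a hypothetical helper written for this illustration; it
is not taken from the patches.

```c
#include <stdio.h>

/*
 * Build "<input>-script-dump-<event>.txt" into buf.
 * Returns 0 on success, -1 if the name does not fit into size bytes.
 */
static int dump_file_name(char *buf, size_t size,
			  const char *input, const char *event)
{
	int n = snprintf(buf, size, "%s-script-dump-%s.txt", input, event);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With input "perf.data" and event "cycles" this yields the first file name
shown in the example run.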

yuzhoujian (4):
Patch 1: Add new elements for per-event-dump option
Patch 2: Add fp argument to print functions
Patch 3: Replace printf with fprintf for all print functions
Patch 4: Add the fp_selection_helper to set the fp for print functions

Changes since v3:
- Patch 1: Remove three elements from the perf_evsel struct and create the
  perf_script_evsel struct to hold them.
- Patch 2: None
- Patch 3: None
- Patch 4: Free the evsel->priv by zfree().

Changes since v2:
- Patch 1: Add last_evsel_name to the per_tool struct and three elements to
  the perf_evsel struct.
- Patch 2: None
- Patch 3: None
- Patch 4: Remove the file_name and per_event_dump_file variables.
  Add the fp_selection_helper function to select the fp and open the dump
  file for all print functions.
  Close the dump file for all the evsels and calculate the dump file's size
  at the end of perf script.
  Fix the segmentation fault triggered by
  perf script --per-event-dump --show-mmap-events.

Changes since v1:
- Patch 1: Remove the set of script.tool.per_event_dump variable.
- Patch 2: Add the __maybe_unused attribute for the fp argument in the second
  patch.
- Patch 3: Remove the fp_selection_helper function for setting the fp argument.
- Patch 2: Split the original second patch (make all the related functions
  receive the FILE pointer) into two patches.
- Patch 4: modify the file name of per-event-dump to
 -script-dump-.txt

 tools/perf/builtin-script.c | 489 ++--
 tools/perf/util/evsel.h |  12 ++
 tools/perf/util/session.c   |  20 +-
 tools/perf/util/tool.h  |   2 +
 4 files changed, 319 insertions(+), 204 deletions(-)

-- 
1.8.3.1

[PATCH v9 2/5] PCI: add resizeable BAR infrastructure v5

2017-10-18 Thread Christian König

From: Christian König 

Just the defines and helper functions to read the possible sizes of a BAR and
update its size.

See 
https://pcisig.com/sites/default/files/specification_documents/ECN_Resizable-BAR_24Apr2008.pdf
and PCIe r3.1, sec 7.22.

This is useful for hardware with large local storage (mostly GFX) which only
exposes 256MB BARs initially to be compatible with 32bit systems.
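
The size encoding used by these helpers (bit 0 = 1MB, bit 19 = 512GB, as the
kernel-doc comments in the patch put it) can be illustrated in userspace as
follows. rbar_size_to_bytes() mirrors the pci_rbar_size_to_bytes() helper
added by the patch; rbar_largest_size() is a hypothetical addition for this
sketch only.

```c
#include <stdint.h>

/* Size encoding n stands for 1MB << n, so 0 = 1MB and 19 = 512GB. */
static uint64_t rbar_size_to_bytes(int size)
{
	return 1ULL << (size + 20);
}

/*
 * Return the largest size encoding set in a possible-sizes bitmask
 * (bit n set means encoding n is supported), or -1 if the mask is empty.
 */
static int rbar_largest_size(uint32_t sizes)
{
	int n;

	for (n = 31; n >= 0; n--)
		if (sizes & (1U << n))
			return n;
	return -1;
}
```

A driver would typically compare the encoding it wants against the mask
before asking for a resize.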

v2: provide read helper as well
v3: improve function names, use unsigned values, add better comments.
v4: move definition, improve commit message, s/bar/BAR/
v5: split out helper to find ctrl reg pos, style fixes, comment fixes,
add pci_rbar_size_to_bytes as well

Signed-off-by: Christian König 
Reviewed-by: Andy Shevchenko 
---
 drivers/pci/pci.c | 104 ++
 drivers/pci/pci.h |   8 
 include/uapi/linux/pci_regs.h |  11 -
 3 files changed, 121 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b4b7eab29400..3aca7393c43c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2957,6 +2957,110 @@ bool pci_acs_path_enabled(struct pci_dev *start,
 }
 
 /**
+ * pci_rbar_find_pos - find position of resize ctrl reg for BAR
+ * @pdev: PCI device
+ * @bar: BAR to find
+ *
+ * Helper to find the position of the ctrl register for a BAR.
+ * Returns -ENOTSUPP if resizeable BARs are not supported at all.
+ * Returns -ENOENT if no ctrl register for the BAR could be found.
+ */
+static int pci_rbar_find_pos(struct pci_dev *pdev, int bar)
+{
+   unsigned int pos, nbars;
+   unsigned int i;
+   u32 ctrl;
+
+   pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_REBAR);
+   if (!pos)
+   return -ENOTSUPP;
+
+   pci_read_config_dword(pdev, pos + PCI_REBAR_CTRL, &ctrl);
+   nbars = (ctrl & PCI_REBAR_CTRL_NBAR_MASK) >> PCI_REBAR_CTRL_NBAR_SHIFT;
+
+   for (i = 0; i < nbars; ++i, pos += 8) {
+   int bar_idx;
+
+   pci_read_config_dword(pdev, pos + PCI_REBAR_CTRL, &ctrl);
+   bar_idx = (ctrl & PCI_REBAR_CTRL_BAR_IDX_MASK) >>
+   PCI_REBAR_CTRL_BAR_IDX_SHIFT;
+   if (bar_idx == bar)
+   return pos;
+   }
+
+   return -ENOENT;
+}
+
+/**
+ * pci_rbar_get_possible_sizes - get possible sizes for BAR
+ * @pdev: PCI device
+ * @bar: BAR to query
+ *
+ * Get the possible sizes of a resizeable BAR as bitmask defined in the spec
+ * (bit 0=1MB, bit 19=512GB). Returns 0 if BAR isn't resizeable.
+ */
+u32 pci_rbar_get_possible_sizes(struct pci_dev *pdev, int bar)
+{
+   u32 cap;
+   int pos;
+
+   pos = pci_rbar_find_pos(pdev, bar);
+   if (pos < 0)
+   return 0;
+
+   pci_read_config_dword(pdev, pos + PCI_REBAR_CAP, &cap);
+   return (cap & PCI_REBAR_CTRL_SIZES_MASK) >>
+   PCI_REBAR_CTRL_SIZES_SHIFT;
+}
+
+/**
+ * pci_rbar_get_current_size - get the current size of a BAR
+ * @pdev: PCI device
+ * @bar: BAR to query
+ *
+ * Read the size of a BAR from the resizeable BAR config.
+ * Returns size if found or negative error code.
+ */
+int pci_rbar_get_current_size(struct pci_dev *pdev, int bar)
+{
+   u32 ctrl;
+   int pos;
+
+   pos = pci_rbar_find_pos(pdev, bar);
+   if (pos < 0)
+   return pos;
+
+   pci_read_config_dword(pdev, pos + PCI_REBAR_CTRL, &ctrl);
+   return (ctrl & PCI_REBAR_CTRL_BAR_SIZE_MASK) >>
+   PCI_REBAR_CTRL_BAR_SIZE_SHIFT;
+}
+
+/**
+ * pci_rbar_set_size - set a new size for a BAR
+ * @pdev: PCI device
+ * @bar: BAR to set size to
+ * @size: new size as defined in the spec (0=1MB, 19=512GB)
+ *
+ * Set the new size of a BAR as defined in the spec.
+ * Returns zero if resizing was successful, error code otherwise.
+ */
+int pci_rbar_set_size(struct pci_dev *pdev, int bar, int size)
+{
+   u32 ctrl;
+   int pos;
+
+   pos = pci_rbar_find_pos(pdev, bar);
+   if (pos < 0)
+   return pos;
+
+   pci_read_config_dword(pdev, pos + PCI_REBAR_CTRL, &ctrl);
+   ctrl &= ~PCI_REBAR_CTRL_BAR_SIZE_MASK;
+   ctrl |= size << PCI_REBAR_CTRL_BAR_SIZE_SHIFT;
+   pci_write_config_dword(pdev, pos + PCI_REBAR_CTRL, ctrl);
+   return 0;
+}
+
+/**
  * pci_swizzle_interrupt_pin - swizzle INTx for device behind bridge
  * @dev: the PCI device
  * @pin: the INTx pin (1=INTA, 2=INTB, 3=INTC, 4=INTD)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 5c475edc78c2..1681895366dc 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -368,4 +368,12 @@ int acpi_get_rc_resources(struct device *dev, const char 
*hid, u16 segment,
  struct resource *res);
 #endif
 
+u32 pci_rbar_get_possible_sizes(struct pci_dev *pdev, int bar);
+int pci_rbar_get_current_size(struct pci_dev *pdev, int bar);
+int pci_rbar_set_size(struct pci_dev *pdev, int bar, int size);
+static inline u64 pci_rbar_size_to_bytes(int size)
+{
+   return 1ULL << (size + 20

[PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5

2017-10-18 Thread Christian König

From: Christian König 

Most BIOSes don't enable this for compatibility reasons.

Manually enable a 64bit BAR of 64GB size so that we have
enough room for PCI devices.

v2: style cleanups, increase size, add resource name, set correct flags,
print message that window was added
v3: add defines for all the magic numbers, style cleanups
v4: add some comment that the BIOS should actually allow this using
_PRS and _SRS.
v5: only enable this if CONFIG_PHYS_ADDR_T_64BIT is set

Signed-off-by: Christian König 
Reviewed-by: Andy Shevchenko 
---
 arch/x86/pci/fixup.c | 80 
 1 file changed, 80 insertions(+)

diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 11e407489db0..7b6bd76713c5 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -618,3 +618,83 @@ static void quirk_apple_mbp_poweroff(struct pci_dev *pdev)
dev_info(dev, "can't work around MacBook Pro poweroff issue\n");
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10, 
quirk_apple_mbp_poweroff);
+
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+
+#define AMD_141b_MMIO_BASE(x)  (0x80 + (x) * 0x8)
+#define AMD_141b_MMIO_BASE_RE_MASK BIT(0)
+#define AMD_141b_MMIO_BASE_WE_MASK BIT(1)
+#define AMD_141b_MMIO_BASE_MMIOBASE_MASK   GENMASK(31,8)
+
+#define AMD_141b_MMIO_LIMIT(x) (0x84 + (x) * 0x8)
+#define AMD_141b_MMIO_LIMIT_MMIOLIMIT_MASK GENMASK(31,8)
+
+#define AMD_141b_MMIO_HIGH(x)  (0x180 + (x) * 0x4)
+#define AMD_141b_MMIO_HIGH_MMIOBASE_MASK   GENMASK(7,0)
+#define AMD_141b_MMIO_HIGH_MMIOLIMIT_SHIFT 16
+#define AMD_141b_MMIO_HIGH_MMIOLIMIT_MASK  GENMASK(23,16)
+
+/*
+ * The PCI Firmware Spec, rev 3.2 notes that ACPI should optionally allow
+ * configuring host bridge windows using the _PRS and _SRS methods.
+ *
+ * But this is rarely implemented, so we manually enable a large 64bit BAR for
+ * PCIe device on AMD Family 15h (Models 30h-3fh) Processors here.
+ */
+static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
+{
+   struct resource *res, *conflict;
+   u32 base, limit, high;
+   unsigned i;
+
+   for (i = 0; i < 8; ++i) {
+   pci_read_config_dword(dev, AMD_141b_MMIO_BASE(i), &base);
+   pci_read_config_dword(dev, AMD_141b_MMIO_HIGH(i), &high);
+
+   /* Is this slot free? */
+   if (!(base & (AMD_141b_MMIO_BASE_RE_MASK |
+ AMD_141b_MMIO_BASE_WE_MASK)))
+   break;
+
+   base >>= 8;
+   base |= high << 24;
+
+   /* Abort if a slot already configures a 64bit BAR. */
+   if (base > 0x1)
+   return;
+   }
+   if (i == 8)
+   return;
+
+   res = kzalloc(sizeof(*res), GFP_KERNEL);
+   if (!res)
+   return;
+
+   res->name = "PCI Bus :00";
+   res->flags = IORESOURCE_PREFETCH | IORESOURCE_MEM |
+   IORESOURCE_MEM_64 | IORESOURCE_WINDOW;
+   res->start = 0x1ull;
+   res->end = 0xfdull - 1;
+
+   /* Just grab the free area behind system memory for this */
+   while ((conflict = request_resource_conflict(&iomem_resource, res)))
+   res->start = conflict->end + 1;
+
+   dev_info(&dev->dev, "adding root bus resource %pR\n", res);
+
+   base = ((res->start >> 8) & AMD_141b_MMIO_BASE_MMIOBASE_MASK) |
+   AMD_141b_MMIO_BASE_RE_MASK | AMD_141b_MMIO_BASE_WE_MASK;
+   limit = ((res->end + 1) >> 8) & AMD_141b_MMIO_LIMIT_MMIOLIMIT_MASK;
+   high = ((res->start >> 40) & AMD_141b_MMIO_HIGH_MMIOBASE_MASK) |
+   ((((res->end + 1) >> 40) << AMD_141b_MMIO_HIGH_MMIOLIMIT_SHIFT)
+    & AMD_141b_MMIO_HIGH_MMIOLIMIT_MASK);
+
+   pci_write_config_dword(dev, AMD_141b_MMIO_HIGH(i), high);
+   pci_write_config_dword(dev, AMD_141b_MMIO_LIMIT(i), limit);
+   pci_write_config_dword(dev, AMD_141b_MMIO_BASE(i), base);
+
+   pci_bus_add_resource(dev->bus, res, 0);
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x141b, pci_amd_enable_64bit_bar);
+
+#endif
-- 
2.11.0

[PATCH v9 3/5] PCI: add functionality for resizing resources v7

2017-10-18 Thread Christian König

From: Christian König 

This allows device drivers to request resizing their BARs.

The function only tries to reprogram the windows of the bridge directly above
the requesting device and only the BAR of the same type (usually mem, 64bit,
prefetchable). This is done to make sure not to disturb other drivers by
changing the BARs of their devices.
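
The "same type" test can be illustrated in userspace. The patch itself uses
the expression (res->flags ^ type) & PCI_RES_TYPE_MASK; the flag values below
are assumed to match include/linux/ioport.h and are repeated here only so the
sketch compiles on its own. same_res_type() is a hypothetical name.

```c
#include <stdbool.h>

/* Illustrative flag values; assumed to match include/linux/ioport.h. */
#define IORESOURCE_IO		0x00000100UL
#define IORESOURCE_MEM		0x00000200UL
#define IORESOURCE_PREFETCH	0x00002000UL
#define IORESOURCE_MEM_64	0x00100000UL

#define PCI_RES_TYPE_MASK \
	(IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH | \
	 IORESOURCE_MEM_64)

/* True when both flag words agree on every bit covered by the type mask. */
static bool same_res_type(unsigned long flags, unsigned long type)
{
	return ((flags ^ type) & PCI_RES_TYPE_MASK) == 0;
}
```

Bits outside the mask are ignored, so two resources compare as the same type
even when unrelated flag bits differ.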

Drivers should use the following sequence to resize their BARs:
1. Disable memory decoding of your device using the PCI cfg dword.
2. Use pci_release_resource() to release all BARs which can move during the
   resize, including the one you want to resize.
3. Call pci_resize_resource() for each BAR you want to resize.
4. Call pci_assign_unassigned_bus_resources() to reassign new locations
   for all BARs which are not resized, but could move.
5. If everything worked as expected, enable memory decoding in your device
   again using the PCI cfg dword.

v2: rebase on changes in rbar support
v3: style cleanups, fail if memory decoding is enabled or resources
still allocated, resize all unused bridge BARs,
drop calling pci_reenable_device
v4: print resources before releasing them, style cleanups,
use pci_rbar_size_to_bytes, use PCI_RES_TYPE_MASK
v5: use next pointer to simplify loop
v6: move reassigning resources on error to driver side
v7: Document in the commit message how to use the new function.

Signed-off-by: Christian König 
---
 drivers/pci/setup-bus.c | 98 +
 drivers/pci/setup-res.c | 58 +
 include/linux/pci.h |  3 ++
 3 files changed, 159 insertions(+)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 37450d9e1132..03af25b3eec4 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1908,6 +1908,104 @@ void pci_assign_unassigned_bridge_resources(struct 
pci_dev *bridge)
 }
 EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
 
+int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
+{
+   struct pci_dev_resource *dev_res;
+   struct pci_dev *next;
+   LIST_HEAD(saved);
+   LIST_HEAD(added);
+   LIST_HEAD(failed);
+   unsigned int i;
+   int ret;
+
+   /* Walk to the root hub, releasing bridge BARs when possible */
+   next = bridge;
+   do {
+   bridge = next;
+   for (i = PCI_BRIDGE_RESOURCES; i < PCI_BRIDGE_RESOURCE_END;
+i++) {
+   struct resource *res = &bridge->resource[i];
+
+   if ((res->flags ^ type) & PCI_RES_TYPE_MASK)
+   continue;
+
+   /* Ignore BARs which are still in use */
+   if (res->child)
+   continue;
+
+   ret = add_to_list(&saved, bridge, res, 0, 0);
+   if (ret)
+   goto cleanup;
+
+   dev_info(&bridge->dev, "BAR %d: releasing %pR\n",
+i, res);
+
+   if (res->parent)
+   release_resource(res);
+   res->start = 0;
+   res->end = 0;
+   break;
+   }
+   if (i == PCI_BRIDGE_RESOURCE_END)
+   break;
+
+   next = bridge->bus ? bridge->bus->self : NULL;
+   } while (next);
+
+   if (list_empty(&saved))
+   return -ENOENT;
+
+   __pci_bus_size_bridges(bridge->subordinate, &added);
+   __pci_bridge_assign_resources(bridge, &added, &failed);
+   BUG_ON(!list_empty(&added));
+
+   if (!list_empty(&failed)) {
+   ret = -ENOSPC;
+   goto cleanup;
+   }
+
+   list_for_each_entry(dev_res, &saved, list) {
+   /* Skip the bridge we just assigned resources for. */
+   if (bridge == dev_res->dev)
+   continue;
+
+   bridge = dev_res->dev;
+   pci_setup_bridge(bridge->subordinate);
+   }
+
+   free_list(&saved);
+   return 0;
+
+cleanup:
+   /* restore size and flags */
+   list_for_each_entry(dev_res, &failed, list) {
+   struct resource *res = dev_res->res;
+
+   res->start = dev_res->start;
+   res->end = dev_res->end;
+   res->flags = dev_res->flags;
+   }
+   free_list(&failed);
+
+   /* Revert to the old configuration */
+   list_for_each_entry(dev_res, &saved, list) {
+   struct resource *res = dev_res->res;
+
+   bridge = dev_res->dev;
+   i = res - bridge->resource;
+
+   res->start = dev_res->start;
+   res->end = dev_res->end;
+   res->flags = dev_res->flags;
+
+   pci_claim_resource(bridge, i);
+   pci_setup_bridge(bridge->subordinate);
+   }
+   free_list(&saved);
+
+   ret

Resizable PCI BAR support V9

2017-10-18 Thread Christian König

Hi everyone,

This is the ninth and hopefully last incarnation of this set of patches. It
enables device drivers to resize and most likely also relocate the PCI BAR of
devices they manage to allow the CPU to access all of the device local memory
at once.

This is very useful for GFX device drivers where the default PCI BAR is only
about 256MB in size for compatibility reasons, but the device can easily have
multiple gigabytes of local memory.

Some changes since the last version:
1. Rebased on drm-next, so should be ready to be merged for 4.15.
2. The fixup of the 64bit root window on AMD Family 15h CPUs/APUs is only
   enabled when we compile a kernel supporting that hw.
3. Some minor error handling improvements for the amdgpu side. We now
   gracefully abort driver loading in case of a critical error instead of
   calling BUG().

Bjorn, any more comments, or can we finally get this into 4.15? I will remove
the version tags from the patches when I send you a pull request if you want
this.

I only work on this as a background task, so sorry for the ~3 month delay
between each version of the patchset.

Regards,
Christian.

Re: [PATCH] Staging: rtl8723bs: Externs should be avoided in .C file

2017-10-18 Thread Greg KH

On Wed, Oct 04, 2017 at 07:21:10PM +0200, Srinivasan Shanmugam wrote:
> Removed all the unnecessary extern from rtl8723bs

That's not what this patch does :(

[PATCH v9 5/5] drm/amdgpu: resize VRAM BAR for CPU access v5

2017-10-18 Thread Christian König

From: Christian König 

Try to resize BAR0 to let CPU access all of VRAM.
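
As a rough illustration of the size calculation, the resizable-BAR size
encoding needed for a given amount of VRAM can be derived as below. This is
not the expression used in the patch, which relies on roundup_pow_of_two()
and order_base_2(); bytes_to_rbar_size() is a hypothetical userspace helper
that assumes the spec encoding where n stands for 1MB << n.

```c
#include <stdint.h>

/*
 * Smallest resizable-BAR size encoding n with (1MB << n) >= bytes.
 * The n < 43 bound keeps the shift below 64 bits.
 */
static unsigned int bytes_to_rbar_size(uint64_t bytes)
{
	unsigned int n = 0;

	while (n < 43 && (1ULL << (n + 20)) < bytes)
		n++;
	return n;
}
```

Sizes that are not a power of two round up, so 3GB of VRAM asks for the 4GB
encoding.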

v2: rebased, style cleanups, disable mem decode before resize,
handle gmc_v9 as well, round size up to power of two.
v3: handle gmc_v6 as well, release and reassign all BARs in the driver.
v4: rename new function to amdgpu_device_resize_fb_bar,
reenable mem decoding only if all resources are assigned.
v5: reorder resource release, return -ENODEV instead of BUG_ON().

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 48 ++
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c  | 12 ++--
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  | 13 ++--
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 13 ++--
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 14 ++---
 6 files changed, 88 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 7ecfc5303f4f..ac4e6f6fb6d8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1850,6 +1850,7 @@ void amdgpu_ttm_placement_from_domain(struct amdgpu_bo 
*abo, u32 domain);
 bool amdgpu_ttm_bo_is_amdgpu_bo(struct ttm_buffer_object *bo);
 void amdgpu_vram_location(struct amdgpu_device *adev, struct amdgpu_mc *mc, 
u64 base);
 void amdgpu_gart_location(struct amdgpu_device *adev, struct amdgpu_mc *mc);
+int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev);
 void amdgpu_ttm_set_active_vram_size(struct amdgpu_device *adev, u64 size);
 int amdgpu_ttm_init(struct amdgpu_device *adev);
 void amdgpu_ttm_fini(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 57addfe9e89b..8f2be5a36625 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -414,6 +414,9 @@ static int amdgpu_doorbell_init(struct amdgpu_device *adev)
return 0;
}
 
+   if (pci_resource_flags(adev->pdev, 2) & IORESOURCE_UNSET)
+   return -EINVAL;
+
/* doorbell bar mapping */
adev->doorbell.base = pci_resource_start(adev->pdev, 2);
adev->doorbell.size = pci_resource_len(adev->pdev, 2);
@@ -732,6 +735,51 @@ int amdgpu_fw_reserve_vram_init(struct amdgpu_device *adev)
return r;
 }
 
+/**
+ * amdgpu_device_resize_fb_bar - try to resize FB BAR
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Try to resize FB BAR to make all VRAM CPU accessible. We try very hard not
+ * to fail, but if any of the BARs is not accessible after the resize we abort
+ * driver loading by returning -ENODEV.
+ */
+int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
+{
+   u64 space_needed = roundup_pow_of_two(adev->mc.real_vram_size);
+   u32 rbar_size = order_base_2(((space_needed >> 20) | 1)) - 1;
+   u16 cmd;
+   int r;
+
+   /* Disable memory decoding while we change the BAR addresses and size */
+   pci_read_config_word(adev->pdev, PCI_COMMAND, &cmd);
+   pci_write_config_word(adev->pdev, PCI_COMMAND,
+ cmd & ~PCI_COMMAND_MEMORY);
+
+   /* Free the VRAM and doorbell BAR, we most likely need to move both. */
+   amdgpu_doorbell_fini(adev);
+   if (adev->asic_type >= CHIP_BONAIRE)
+   pci_release_resource(adev->pdev, 2);
+
+   pci_release_resource(adev->pdev, 0);
+
+   r = pci_resize_resource(adev->pdev, 0, rbar_size);
+   if (r == -ENOSPC)
+   DRM_INFO("Not enough PCI address space for a large BAR.");
+   else if (r && r != -ENOTSUPP)
+   DRM_ERROR("Problem resizing BAR0 (%d).", r);
+
+   pci_assign_unassigned_bus_resources(adev->pdev->bus);
+
+   /* When the doorbell or fb BAR isn't available we have no chance of
+* using the device.
+*/
+   r = amdgpu_doorbell_init(adev);
+   if (r || (pci_resource_flags(adev->pdev, 0) & IORESOURCE_UNSET))
+   return -ENODEV;
+
+   pci_write_config_word(adev->pdev, PCI_COMMAND, cmd);
+}
 
 /*
  * GPU helpers function.
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index f4603a7c8ef3..d2a43db22cff 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -283,6 +283,7 @@ static int gmc_v6_0_mc_init(struct amdgpu_device *adev)
 
u32 tmp;
int chansize, numchan;
+   int r;
 
tmp = RREG32(mmMC_ARB_RAMCFG);
if (tmp & (1 << 11)) {
@@ -324,12 +325,17 @@ static int gmc_v6_0_mc_init(struct amdgpu_device *adev)
break;
}
adev->mc.vram_width = numchan * chansize;
-   /* Could aper size report 0 ? */
-   adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
-   adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
/* size in MB on si */
adev->mc.mc_vram_size = RREG32(mmCONFIG

[PATCH v9 1/5] PCI: add a define for the PCI resource type mask v2

2017-10-18 Thread Christian König

From: Christian König 

We use this mask multiple times in the bus setup.

v2: fix some style nit picks

Signed-off-by: Christian König 
Reviewed-by: Andy Shevchenko 
---
 drivers/pci/pci.h   |  3 +++
 drivers/pci/setup-bus.c | 12 +++-
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 22e061738c6f..5c475edc78c2 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -4,6 +4,9 @@
 #define PCI_FIND_CAP_TTL   48
 
 #define PCI_VSEC_ID_INTEL_TBT  0x1234  /* Thunderbolt */
+#define PCI_RES_TYPE_MASK \
+   (IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH |\
+IORESOURCE_MEM_64)
 
 extern const unsigned char pcie_link_speed[];
 
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 958da7db9033..37450d9e1132 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1523,8 +1523,6 @@ static void pci_bridge_release_resources(struct pci_bus 
*bus,
 {
struct pci_dev *dev = bus->self;
struct resource *r;
-   unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
- IORESOURCE_PREFETCH | IORESOURCE_MEM_64;
unsigned old_flags = 0;
struct resource *b_res;
int idx = 1;
@@ -1567,7 +1565,7 @@ static void pci_bridge_release_resources(struct pci_bus 
*bus,
 */
release_child_resources(r);
if (!release_resource(r)) {
-   type = old_flags = r->flags & type_mask;
+   type = old_flags = r->flags & PCI_RES_TYPE_MASK;
dev_printk(KERN_DEBUG, &dev->dev, "resource %d %pR released\n",
PCI_BRIDGE_RESOURCES + idx, r);
/* keep the old size */
@@ -1758,8 +1756,6 @@ void pci_assign_unassigned_root_bus_resources(struct 
pci_bus *bus)
enum release_type rel_type = leaf_only;
LIST_HEAD(fail_head);
struct pci_dev_resource *fail_res;
-   unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
- IORESOURCE_PREFETCH | IORESOURCE_MEM_64;
int pci_try_num = 1;
enum enable_type enable_local;
 
@@ -1818,7 +1814,7 @@ void pci_assign_unassigned_root_bus_resources(struct 
pci_bus *bus)
 */
list_for_each_entry(fail_res, &fail_head, list)
pci_bus_release_bridge_resources(fail_res->dev->bus,
-fail_res->flags & type_mask,
+fail_res->flags & 
PCI_RES_TYPE_MASK,
 rel_type);
 
/* restore size and flags */
@@ -1862,8 +1858,6 @@ void pci_assign_unassigned_bridge_resources(struct 
pci_dev *bridge)
LIST_HEAD(fail_head);
struct pci_dev_resource *fail_res;
int retval;
-   unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
- IORESOURCE_PREFETCH | IORESOURCE_MEM_64;
 
 again:
__pci_bus_size_bridges(parent, &add_list);
@@ -1889,7 +1883,7 @@ void pci_assign_unassigned_bridge_resources(struct 
pci_dev *bridge)
 */
list_for_each_entry(fail_res, &fail_head, list)
pci_bus_release_bridge_resources(fail_res->dev->bus,
-fail_res->flags & type_mask,
+fail_res->flags & 
PCI_RES_TYPE_MASK,
 whole_subtree);
 
/* restore size and flags */
-- 
2.11.0

Re: libbattery was Re: [RFC PATCH 5/5] power: generic-adc-battery: Add capacity handling

2017-10-18 Thread Pavel Machek

On Wed 2017-10-18 06:22:04, Tony Lindgren wrote:
> * H. Nikolaus Schaller  [171018 05:49]:
> > > Am 18.10.2017 um 14:28 schrieb Pavel Machek :
> > > 
> > > So I started something, it is at.
> > > 
> > > https://github.com/pavelmachek/libbattery
> > > 
> > > My battery on n900 is currently uncalibrated (and charging), still it
> > > gets some kind of estimation:
> > > 
> > > Battery -1 %
> > > Seconds -1
> > > State 1
> > > Voltage 3.88 V
> > > Battery 63 %
> > > 
> > > Of course, there's a lot more work to be done.
> > 
> > Nice start but not a solution to our problem.
> > 
> > Our problem is that people simply expect that for example 
> > https://packages.debian.org/wheezy/xfce/xfce4-battery-plugin
> > displays the battery percentage.
> 
> I think we could make things compatible with various battery apps by
> having libbattery write back the capacity percentage and time remaining
> to the kernel driver via sysfs or a dev entry. Then the kernel interface
> can just display the data to whatever apps.

Hmm. This could be as simple as providing symlink from
/sys/class/power/userland-battery to some place writable by
userspace...

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



RE: [PATCH v9 17/17] tools/wmi: add a sample for dell smbios communication over WMI

2017-10-18 Thread Mario.Limonciello

> -Original Message-
> From: Pali Rohár [mailto:pali.ro...@gmail.com]
> Sent: Wednesday, October 18, 2017 2:29 AM
> To: Greg KH ; Alan Cox 
> Cc: dvh...@infradead.org; Andy Shevchenko ;
> LKML ; platform-driver-...@vger.kernel.org; Andy
> Lutomirski ; quasi...@google.com; r...@rjwysocki.net;
> mj...@google.com; h...@lst.de; Limonciello, Mario
> 
> Subject: Re: [PATCH v9 17/17] tools/wmi: add a sample for dell smbios
> communication over WMI
> 
> On Tuesday 17 October 2017 13:22:01 Mario Limonciello wrote:
> > diff --git a/tools/wmi/dell-smbios-example.c b/tools/wmi/dell-smbios-example.c
> > new file mode 100644
> > index ..69c4dd9c6056
> > --- /dev/null
> > +++ b/tools/wmi/dell-smbios-example.c
> > @@ -0,0 +1,214 @@
> > +/*
> > + *  Sample application for SMBIOS communication over WMI interface
> > + *  Performs the following:
> > + *  - Simple class/select lookup for TPM information
> > + *  - Simple query of known tokens and their values
> > + *  - Simple activation of a token
> > + *
> > + *  Copyright (C) 2017 Dell, Inc.
> > + *
> > + *  This program is free software; you can redistribute it and/or modify
> > + *  it under the terms of the GNU General Public License version 2 as
> > + *  published by the Free Software Foundation.
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +/* if uapi header isn't installed, this might not yet exist */
> > +#ifndef __packed
> > +#define __packed __attribute__((packed))
> > +#endif
> > +#include 
> > +
> > +/* It would be better to discover these using udev, but for a simple
> > + * application they're hardcoded
> > + */
> > +static const char *ioctl_devfs = "/dev/wmi/dell-smbios";
> > +static const char *token_sysfs =
> > +   "/sys/bus/platform/devices/dell-smbios.0/tokens";
> > +static const char *buffer_sysfs =
> > +   "/sys/bus/wmi/devices/A80593CE-A997-11DA-B012-B622A1EF5492/required_buffer_size";
> 
> Greg, Alan, could userspace expect those paths to be part of the kernel
> <--> userspace ABI? Looking e.g. at the "dell-smbios.0" name, I'm not
> sure this is something which is going to stay stable between kernel
> versions and forever be part of the ABI.

In my sample application to be distributed with the kernel these are 
hardcoded paths, but if more dependencies were used, I would
expect all 3 of these paths to be discovered using udev.  
I do include a comment for that point specifically.

> 
> Also, if everything is part of the smbios API, would it not be better to
> provide everything via IOCTL over /dev/wmi/dell-smbios? I think this code
> is too complicated, just because for the correct IOCTL buffer size it needs
> to read other properties via sysfs, etc. To me it does not look like a
> good API for userspace developers.
> 
> --

This does give me an idea: how about having a read on the character device
return the required buffer size instead of needing to find a sysfs
attribute?  This seems more intuitive to me.
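
For comparison, the sysfs-based lookup that such a read() would replace
might look roughly like this in userspace. It is a sketch only: the function
names are hypothetical, and it assumes the required_buffer_size attribute
holds a single decimal number as text.

```c
#include <stdio.h>

/*
 * Parse one unsigned decimal value from an already opened attribute file.
 * Returns 0 on success, -1 if no number could be read.
 */
static int parse_buffer_size(FILE *f, unsigned long long *size)
{
	return fscanf(f, "%llu", size) == 1 ? 0 : -1;
}

/* Open the attribute at path (for example the sysfs file) and parse it. */
static int read_buffer_size(const char *path, unsigned long long *size)
{
	FILE *f = fopen(path, "r");
	int ret;

	if (!f)
		return -1;
	ret = parse_buffer_size(f, size);
	fclose(f);
	return ret;
}
```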

Token information is provided over sysfs for multiple reasons.
1) It's applicable to all dispatchers.  Even if the WMI dispatcher wasn't
used it's useful for userspace to query through.  For example the SMI call
to get tokens in libsmbios can be simplified to just read sysfs files.

2) It's information not coming from ACPI-WMI.  This series is setting a
precedent for how to interact with ACPI-WMI methods from userspace.
Putting random data in the IOCTL that is not used by the ACPI-WMI
method or provided by the WMI bus doesn't fit.

3) It is static information that won't change until you reboot.
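
On the userspace side, the idea above (a read on the character device
returning the required buffer size) would boil down to parsing one number.
A minimal sketch, assuming the size comes back as decimal text with an
optional trailing newline, the way a sysfs attribute usually reads;
parse_required_buffer_size() is a hypothetical helper, and the real driver
could just as well return a binary value:

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the text that required_buffer_size (or,
 * under the proposal, a read() on the character device) is assumed
 * to return. Returns the size in bytes, or -1 if the text is not a
 * positive decimal number.
 */
long parse_required_buffer_size(const char *text)
{
	char *end;
	long size;

	if (!text || *text == '\0')
		return -1;

	errno = 0;
	size = strtol(text, &end, 10);
	if (errno != 0 || end == text || size <= 0)
		return -1;

	/* Accept a single trailing newline, nothing else. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -1;

	return size;
}
```

The caller would then allocate a buffer of that size before issuing the
IOCTL, regardless of whether the number came from sysfs or from the device
node.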

Re: [PATCH v5 3/6] perf: hisi: Add support for HiSilicon SoC L3C PMU driver

2017-10-18 Thread Mark Rutland

On Wed, Oct 18, 2017 at 09:33:30PM +0800, Zhangshaokun wrote:
> On 2017/10/17 23:16, Mark Rutland wrote:
> > On Tue, Aug 22, 2017 at 04:07:54PM +0800, Shaokun Zhang wrote:
> >> +static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
> >> +struct hisi_pmu *l3c_pmu)
> >> +{
> >> +  unsigned long long id;
> >> +  struct resource *res;
> >> +  acpi_status status;
> >> +  int cpu;
> >> +
> >> +  status = acpi_evaluate_integer(ACPI_HANDLE(&pdev->dev),
> >> + "_UID", NULL, &id);
> >> +  if (ACPI_FAILURE(status))
> >> +  return -EINVAL;
> >> +
> >> +  l3c_pmu->id = id;
> >> +
> >> +  /*
> >> +   * Use the SCCL_ID and CCL_ID to identify the L3C PMU, while
> >> +   * SCCL_ID is in MPIDR[aff2] and CCL_ID is in MPIDR[aff1].
> >> +   */
> >> +  if (device_property_read_u32(&pdev->dev, "hisilicon,scl-id",
> >> +   &l3c_pmu->sccl_id)) {
> >> +  dev_err(&pdev->dev, "Can not read l3c sccl-id!\n");
> >> +  return -EINVAL;
> >> +  }
> >> +
> >> +  if (device_property_read_u32(&pdev->dev, "hisilicon,ccl-id",
> >> +   &l3c_pmu->ccl_id)) {
> >> +  dev_err(&pdev->dev, "Can not read l3c ccl-id!\n");
> >> +  return -EINVAL;
> >> +  }
> >> +
> >> +  /* Initialise the associated cpumask of the PMU */
> >> +  for_each_present_cpu(cpu)
> >> +  smp_call_function_single(cpu, hisi_l3c_pmu_set_cpumask_by_ccl,
> >> +   (void *)l3c_pmu, 1);

> > Rather than a probe-time smp_call_function_single(), can you follow the
> > qcom l2's approach of associating CPUs with a PMU instance in the
> > notifier? That will work even if CPUs are brought online very late.
> 
> Good guidance, but the HHA and DDRC PMUs are different from the L3C PMU: the
> former share the same SCCL, while the latter shares the same SCCL and CCL. I
> will try to deal with this difference in the online notifier.

FWIW, I think it makes sense for each PMU to have its own notifier
(perhaps with some shared code that each calls to do the migration).

I just want to avoid the smp_call_function_single() at probe time, as
that doesn't work in some cases.

Thanks,
Mark.
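
For illustration, the per-PMU notifier idea discussed above can be modelled
in plain userspace C. This is only a sketch under stated assumptions: the
comment in the quoted patch says SCCL_ID is in MPIDR[aff2] and CCL_ID is in
MPIDR[aff1]; the function names, the 64-bit mask standing in for a cpumask,
and the limit of 64 CPUs are all made up for the example.

```c
#include <stdint.h>

/* MPIDR_EL1 affinity fields: Aff1 is bits [15:8], Aff2 is bits [23:16]. */
uint32_t mpidr_aff1(uint64_t mpidr)
{
	return (uint32_t)((mpidr >> 8) & 0xff);
}

uint32_t mpidr_aff2(uint64_t mpidr)
{
	return (uint32_t)((mpidr >> 16) & 0xff);
}

/*
 * Hypothetical "CPU online" step for an L3C PMU: the CPU joins the
 * PMU's mask only if both SCCL (aff2) and CCL (aff1) match.
 * Returns the updated mask; CPUs >= 64 are ignored in this sketch.
 */
uint64_t l3c_cpu_online(uint64_t associated_cpus, uint32_t sccl_id,
			uint32_t ccl_id, unsigned int cpu, uint64_t mpidr)
{
	if (cpu >= 64)
		return associated_cpus;
	if (mpidr_aff2(mpidr) == sccl_id && mpidr_aff1(mpidr) == ccl_id)
		associated_cpus |= UINT64_C(1) << cpu;
	return associated_cpus;
}

/*
 * Same step for a PMU shared by a whole SCCL (the HHA/DDRC case
 * mentioned in the thread): only the SCCL has to match.
 */
uint64_t sccl_cpu_online(uint64_t associated_cpus, uint32_t sccl_id,
			 unsigned int cpu, uint64_t mpidr)
{
	if (cpu >= 64)
		return associated_cpus;
	if (mpidr_aff2(mpidr) == sccl_id)
		associated_cpus |= UINT64_C(1) << cpu;
	return associated_cpus;
}
```

Because the match is evaluated when a CPU comes online rather than once at
probe time, a CPU that is brought up late still ends up in the right mask,
which is the property asked for above.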

Re: [PATCH] efi: parse ARM error information value

2017-10-18 Thread Tyler Baicar


On 10/17/2017 3:30 PM, Andy Shevchenko wrote:
> On Tue, 2017-10-17 at 11:23 -0600, Tyler Baicar wrote:
>> ARM errors just print out the error information value, then the
>> value needs to be manually decoded as per the UEFI spec. Add
>> decoding of the ARM error information value so that the kernel
>> logs capture all of the valid information at first glance.
>>
>> ARM error information value decoding is captured in UEFI 2.7
>> spec tables 263-265.
>
> Could it be located in separate file?

Hello Andy,

Thank you for the feedback.

Yes, I can break this out into a different file...we may want to break out all
of the ARM error parsing to that file then though.

>> +   printk("%stransaction type: %s\n", pfx,
>> +  arm_err_trans_type_strs[trans_type]);
>
> Plain printk():s?

This is consistent with the other prints in this CPER code.

>> +#define CPER_ARM_ERR_VALID_TRANSACTION_TYPE    0x0001
>> +#define CPER_ARM_ERR_VALID_OPERATION_TYPE      0x0002
>> +#define CPER_ARM_ERR_VALID_LEVEL               0x0004
>> +#define CPER_ARM_ERR_VALID_PROC_CONTEXT_CORRUPT 0x0008
>> +#define CPER_ARM_ERR_VALID_CORRECTED           0x0010
>> +#define CPER_ARM_ERR_VALID_PRECISE_PC          0x0020
>> +#define CPER_ARM_ERR_VALID_RESTARTABLE_PC      0x0040
>> +#define CPER_ARM_ERR_VALID_PARTICIPATION_TYPE  0x0080
>> +#define CPER_ARM_ERR_VALID_TIME_OUT            0x0100
>> +#define CPER_ARM_ERR_VALID_ADDRESS_SPACE       0x0200
>> +#define CPER_ARM_ERR_VALID_MEM_ATTRIBUTES      0x0400
>> +#define CPER_ARM_ERR_VALID_ACCESS_MODE         0x0800
>
> BIT() is already being used in this file.

I'll convert these to use BIT().

>> +
>> +#define CPER_ARM_ERR_TRANSACTION_SHIFT 16
>> +#define CPER_ARM_ERR_TRANSACTION_MASK  0x3
>
> Mask is mask, so GENMASK()

I'll convert the masks to use GENMASK()

Thanks,
Tyler

--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
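
As a rough illustration of the BIT()/GENMASK() conversion agreed on in the
message above, here is a userspace sketch. The BIT() and GENMASK() macros
below are local stand-ins for the kernel helpers, and
arm_err_transaction_type() is a hypothetical name; it assumes the field is
extracted by shifting first and masking afterwards, which is what a shift
of 16 with a mask of 0x3 suggests.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel's BIT() and GENMASK() helpers. */
#define BIT(n)		(1ULL << (n))
#define GENMASK(h, l)	(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

/* Validity bits, same values as 0x0001 and 0x0002 in the quoted patch. */
#define CPER_ARM_ERR_VALID_TRANSACTION_TYPE	BIT(0)
#define CPER_ARM_ERR_VALID_OPERATION_TYPE	BIT(1)

/* Shift 16 with a two-bit mask (0x3), as in the quoted patch. */
#define CPER_ARM_ERR_TRANSACTION_SHIFT		16
#define CPER_ARM_ERR_TRANSACTION_MASK		GENMASK(1, 0)

/* Extract the transaction type field from an error information value. */
unsigned int arm_err_transaction_type(uint64_t error_info)
{
	return (unsigned int)((error_info >> CPER_ARM_ERR_TRANSACTION_SHIFT) &
			      CPER_ARM_ERR_TRANSACTION_MASK);
}
```

GENMASK(1, 0) evaluates to 0x3, so the converted mask keeps the value from
the patch while making the field width explicit.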

Re: [PATCH] Staging: rtlwifi: phydm: Use setup_timer

2017-10-18 Thread Greg KH

On Sat, Oct 07, 2017 at 10:56:25PM +0530, Srishti Sharma wrote:
> Use setup_timer to combine initialization of a timer with the
> initialization of the timer's function and data fields. Done
> using the following semantic patch by coccinelle.
> 
> @r@
> struct timer_list *l;
> expression f, d;
> @@
> 
> -init_timer(l);
> +setup_timer(l,f,d);
> ...
> 
> (
> - l->function = f;
> ...
> - l->data = d;
> |
> - l->data = d;
> ...
> - l->function = f;
> )
> 
> Signed-off-by: Srishti Sharma 
> Acked-by: Julia Lawall 
> ---
>  drivers/staging/rtlwifi/phydm/phydm_interface.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)

Due to other changes in this same function, this patch no longer
applies, sorry.

greg k-h

Re: Adjusting further size determinations?

2017-10-18 Thread SF Markus Elfring

>> If you want 'security' for kmalloc() then:
>>
>> #define KMALLOC_TYPE(flags) (type *)kmalloc(sizeof (type), flags)
>> #define KMALLOC(ptr, flags) *(ptr) = KMALLOC_TYPE(typeof *(ptr), flags)

Such an approach might help.


>> and change:
>>  ptr = kmalloc(sizeof *ptr, flags);
>> to:
>>  KMALLOC(&ptr, flags);
>>
>> But it is all churn for churn's sake.
> 
> Please don't.

Interesting …


> Coccinelle won't find real problems with kmalloc any more if this is done.

The corresponding source code analysis will become different
(or more challenging) then. Are you still looking for related solutions?

Regards,
Markus
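
For reference, a type-deriving allocation macro of the kind quoted above
can be sketched in userspace as follows. This is not the macro from the
quoted mail: malloc() stands in for kmalloc(), the names MALLOC_TYPE and
MALLOC_INTO are invented for the example, and __typeof__ is a GNU C
extension (GCC, Clang).

```c
#include <stdlib.h>

/* Allocate one object of the named type; malloc() stands in for kmalloc(). */
#define MALLOC_TYPE(type)	((type *)malloc(sizeof(type)))

/*
 * Allocate an object of the type that *ptr points to and store the
 * result in *ptr. The size is derived from the pointer, so it cannot
 * drift away from the declared type.
 */
#define MALLOC_INTO(ptr)	(*(ptr) = MALLOC_TYPE(__typeof__(**(ptr))))

struct point {
	int x;
	int y;
};

/* Small demonstration: allocate, fill, read back, free. */
int alloc_point_sum(int x, int y)
{
	struct point *p;
	int sum;

	if (!MALLOC_INTO(&p))
		return -1;

	p->x = x;
	p->y = y;
	sum = p->x + p->y;
	free(p);
	return sum;
}
```

The trade-off raised in the thread still applies: once the allocation is
hidden behind such a macro, tools that match on the plain kmalloc() call
no longer see it directly.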

Re: [PATCH v4 3/5] reset: socfpga: use the reset-simple driver

2017-10-18 Thread Philipp Zabel

On Wed, 2017-10-18 at 14:00 +0100, Andre Przywara wrote:
> Hi,

Thank you for the review.

> On 17/10/17 14:03, Philipp Zabel wrote:
> > Add reset line status readback, inverted status support, and socfpga
> > device tree quirks to the simple reset driver, and use it to replace
> > the socfpga driver.
> > 
> > Signed-off-by: Philipp Zabel 
> > ---
> > Changes since v3:
> >  - Rebased onto reset/next
> >  - Only warn about missing altr,modrst-offset property on socfpga
> > ---
> >  drivers/reset/Kconfig |  10 +--
> >  drivers/reset/Makefile|   1 -
> >  drivers/reset/reset-simple.c  |  51 +-
> >  drivers/reset/reset-simple.h  |   4 ++
> >  drivers/reset/reset-socfpga.c | 157 
> > --
> >  5 files changed, 56 insertions(+), 167 deletions(-)
> >  delete mode 100644 drivers/reset/reset-socfpga.c
> > 
[...]
> > diff --git a/drivers/reset/reset-simple.c b/drivers/reset/reset-simple.c
> > index a5119457cec61..98ff0f924948e 100644
> > --- a/drivers/reset/reset-simple.c
> > +++ b/drivers/reset/reset-simple.c
> > @@ -68,25 +68,58 @@ static int reset_simple_deassert(struct 
> > reset_controller_dev *rcdev,
> > return reset_simple_update(rcdev, id, false);
> >  }
> >  
> > +static int reset_simple_status(struct reset_controller_dev *rcdev,
> > +  unsigned long id)
> > +{
> > +   struct reset_simple_data *data = to_reset_simple_data(rcdev);
> > +   int reg_width = sizeof(u32);
> > +   int bank = id / (reg_width * BITS_PER_BYTE);
> > +   int offset = id % (reg_width * BITS_PER_BYTE);
> > +   u32 reg;
> > +
> > +   reg = readl(data->membase + (bank * reg_width));
> > +
> > +   return !(reg & BIT(offset)) ^ !data->status_active_low;
> > +}
> > +
> >  const struct reset_control_ops reset_simple_ops = {
> > .assert = reset_simple_assert,
> > .deassert   = reset_simple_deassert,
> > +   .status = reset_simple_status,
> >  };
> >  
> >  /**
> >   * struct reset_simple_devdata - simple reset controller properties
> > + * @reg_offset: offset between base address and first reset register.
> > + * @nr_resets: number of resets. If not set, default to resource size in 
> > bits.
> >   * @active_low: if true, bits are cleared to assert the reset. Otherwise, 
> > bits
> >   *  are set to assert the reset.
> > + * @status_active_low: if true, bits read back as cleared while the reset 
> > is
> > + * asserted. Otherwise, bits read back as set while the
> > + * reset is asserted.
> >   */
> >  struct reset_simple_devdata {
> > +   u32 reg_offset;
> > +   u32 nr_resets;
> > bool active_low;
> > +   bool status_active_low;
> > +};
> > +
> > +#define SOCFPGA_NR_BANKS   8
> > +
> > +static const struct reset_simple_devdata reset_simple_socfpga = {
> > +   .reg_offset = 0x10,

Here reset_simple_socfpga.reg_offset is set to the default of 0x10.

> > +   .nr_resets = SOCFPGA_NR_BANKS * 32,
> > +   .status_active_low = true,
> >  };
> >  
> >  static const struct reset_simple_devdata reset_simple_active_low = {
> > .active_low = true,
> > +   .status_active_low = true,
> >  };
> >  
> >  static const struct of_device_id reset_simple_dt_ids[] = {
> > +   { .compatible = "altr,rst-mgr", .data = &reset_simple_socfpga },
> > { .compatible = "allwinner,sun6i-a31-clock-reset",
> > .data = &reset_simple_active_low },
> > { /* sentinel */ },
> > @@ -99,6 +132,7 @@ static int reset_simple_probe(struct platform_device 
> > *pdev)
> > struct reset_simple_data *data;
> > void __iomem *membase;
> > struct resource *res;
> > +   u32 reg_offset = 0;
> >  
> > devdata = of_device_get_match_data(dev);
> >  
> > @@ -118,8 +152,23 @@ static int reset_simple_probe(struct platform_device 
> > *pdev)
> > data->rcdev.ops = &reset_simple_ops;
> > data->rcdev.of_node = dev->of_node;
> >  
> > -   if (devdata)
> > +   if (devdata) {
> > +   reg_offset = devdata->reg_offset;

And here reg_offset is set to the default of 0x10 on socfpga.

> > +   if (devdata->nr_resets)
> > +   data->rcdev.nr_resets = devdata->nr_resets;
> > data->active_low = devdata->active_low;
> > +   data->status_active_low = devdata->status_active_low;
> > +   }
> > +
> > +   if (devdata == &reset_simple_socfpga &&
> 
> Mmh, this pointer comparison looks a bit dodgy. Isn't
> of_device_is_compatible() the right solution here?

My thinking was, the of_device_is_compatible is already called inside
of_device_get_match_data, so why call it again and not reuse the result?

> Also semantically, as the property is tied to a certain compatible
> string (and not to our data structure)?

I agree with this, though. I'll change this line to say:

+   if (of_device_is_compatible(dev->of_node, "altr,rst-mgr") &&

instead.

> > +   of_property_read_u32(dev->of_node, "altr,modrst-offset",
> > +   dev_warn(dev,
> > +
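
The bank/offset arithmetic and the status expression in the quoted
reset_simple_status() can be checked in isolation. A minimal userspace
sketch, assuming 32-bit registers as in the patch; the register value is
passed in directly instead of being read with readl(), and the function
names are made up for the example:

```c
#include <stdbool.h>
#include <stdint.h>

#define BITS_PER_REG	32u	/* sizeof(u32) * BITS_PER_BYTE in the patch */

/* Index of the 32-bit register that holds reset line `id`. */
unsigned int reset_bank(unsigned long id)
{
	return (unsigned int)(id / BITS_PER_REG);
}

/* Bit position of reset line `id` within its register. */
unsigned int reset_offset(unsigned long id)
{
	return (unsigned int)(id % BITS_PER_REG);
}

/*
 * Same expression as in the patch: returns 1 while the line is
 * asserted. `reg` is the value already read from the bank register.
 */
int reset_status_from_reg(uint32_t reg, unsigned long id,
			  bool status_active_low)
{
	bool bit_clear = !(reg & (UINT32_C(1) << reset_offset(id)));

	return bit_clear ^ !status_active_low;
}
```

With status_active_low false, a set bit reads back as "asserted"; with
status_active_low true, a cleared bit does.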

Re: [PATCH] dmaengine: rcar-dmac: use DMATCRB when xxx_TO_MEM direction

2017-10-18 Thread Geert Uytterhoeven

On Wed, Oct 18, 2017 at 3:46 PM, Laurent Pinchart
 wrote:
> On Wednesday, 18 October 2017 03:01:28 EEST Kuninori Morimoto wrote:
>> >>> Anyway, in all case I can use TCRB in v3 patch,
>> >>> and it needs above explanation.
>> >> If so, I think v1 is enough... ?
>> >> "transfer completed count is important for all case" is no doubt... ?
>> >
>> > That's correct, but I don't think the explanation was detailed and clear
>> > enough. If it was Geert wouldn't have asked for a v2, and you wouldn't
>> > have
>> > agreed to his request :-)
>>
>> OK. Let's follow Vinod's decision.
>>
>> Vinod, I'm happy if you are OK on v1.
>> And I'm happy to create v3 patch which includes detail reason
>> which is explained by Laurent if you want.
>
> I'd be happier with v3 :-)

+1

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH] staging: rtlwifi: remove duplicated macros in comments

2017-10-18 Thread Greg KH

On Thu, Oct 05, 2017 at 04:44:31PM -0700, Matthew Giassa wrote:
> Removing a comment that duplicates definitions for pci_power_t
> enumeration, and pointing to the relevant header file (current comment
> is also missing PCI_POWER_ERROR).
> 
> Signed-off-by: Matthew Giassa 
> ---
>  drivers/staging/rtlwifi/pci.c | 7 +--
>  1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/rtlwifi/pci.c b/drivers/staging/rtlwifi/pci.c
> index 4035b88..2e2cd21 100644
> --- a/drivers/staging/rtlwifi/pci.c
> +++ b/drivers/staging/rtlwifi/pci.c
> @@ -2456,12 +2456,7 @@ void rtl_pci_disconnect(struct pci_dev *pdev)
>  #ifdef CONFIG_PM_SLEEP
>  /***
>   * kernel pci power state define:
> - * PCI_D0 ((pci_power_t __force) 0)
> - * PCI_D1 ((pci_power_t __force) 1)
> - * PCI_D2 ((pci_power_t __force) 2)
> - * PCI_D3hot  ((pci_power_t __force) 3)
> - * PCI_D3cold ((pci_power_t __force) 4)
> - * PCI_UNKNOWN((pci_power_t __force) 5)
> + * Refer to include/linux/pci.h

That's really vague, how about just deleting these lines, and the
previous one as well?

thanks,

greg k-h

Re: [PATCH] dmaengine: rcar-dmac: use DMATCRB when xxx_TO_MEM direction

2017-10-18 Thread Laurent Pinchart

Hi Morimoto-san,

On Wednesday, 18 October 2017 03:01:28 EEST Kuninori Morimoto wrote:
> Hi Vinod, Laurent
> 
> >>> Anyway, in all case I can use TCRB in v3 patch,
> >>> and it needs above explanation.
> >> 
> >> If so, I think v1 is enough... ?
> >> "transfer completed count is important for all case" is no doubt... ?
> > 
> > That's correct, but I don't think the explanation was detailed and clear
> > enough. If it was Geert wouldn't have asked for a v2, and you wouldn't
> > have
> > agreed to his request :-)
> 
> OK. Let's follow Vinod's decision.
> 
> Vinod, I'm happy if you are OK on v1.
> And I'm happy to create v3 patch which includes detail reason
> which is explained by Laurent if you want.

I'd be happier with v3 :-)

-- 
Regards,

Laurent Pinchart

[tip:x86/apic] x86/vector/msi: Select CONFIG_GENERIC_IRQ_RESERVATION_MODE

2017-10-18 Thread tip-bot for Thomas Gleixner

Commit-ID:  c201c91799d687c0a6d8c3272950f51aad5ffebe
Gitweb: https://git.kernel.org/tip/c201c91799d687c0a6d8c3272950f51aad5ffebe
Author: Thomas Gleixner 
AuthorDate: Tue, 17 Oct 2017 09:54:59 +0200
Committer:  Thomas Gleixner 
CommitDate: Wed, 18 Oct 2017 15:38:31 +0200

x86/vector/msi: Select CONFIG_GENERIC_IRQ_RESERVATION_MODE

Select CONFIG_GENERIC_IRQ_RESERVATION_MODE so PCI/MSI domains get the
MSI_FLAG_MUST_REACTIVATE flag set in pci_msi_create_irq_domain().

Remove the explicit setters of this flag in the apic/msi code as they are
no longer required.

Fixes: 4900be83602b ("x86/vector/msi: Switch to global reservation mode")
Reported-and-tested-by: Dexuan Cui 
Signed-off-by: Thomas Gleixner 
Cc: Josh Poulson 
Cc: Mihai Costache 
Cc: Stephen Hemminger 
Cc: Marc Zyngier 
Cc: linux-...@vger.kernel.org
Cc: Haiyang Zhang 
Cc: Simon Xiao 
Cc: Saeed Mahameed 
Cc: Jork Loeser 
Cc: Bjorn Helgaas 
Cc: de...@linuxdriverproject.org
Cc: KY Srinivasan 
Link: https://lkml.kernel.org/r/20171017075600.527569...@linutronix.de

---
 arch/x86/Kconfig   | 1 +
 arch/x86/kernel/apic/msi.c | 5 ++---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 64e99d3..ea4beda 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -95,6 +95,7 @@ config X86
select GENERIC_IRQ_MATRIX_ALLOCATOR if X86_LOCAL_APIC
select GENERIC_IRQ_MIGRATIONif SMP
select GENERIC_IRQ_PROBE
+   select GENERIC_IRQ_RESERVATION_MODE
select GENERIC_IRQ_SHOW
select GENERIC_PENDING_IRQ  if SMP
select GENERIC_SMP_IDLE_THREAD
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 5b6dd1a..9b18be7 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -129,7 +129,7 @@ static struct msi_domain_ops pci_msi_domain_ops = {
 
 static struct msi_domain_info pci_msi_domain_info = {
.flags  = MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
- MSI_FLAG_PCI_MSIX | MSI_FLAG_MUST_REACTIVATE,
+ MSI_FLAG_PCI_MSIX,
.ops= &pci_msi_domain_ops,
.chip   = &pci_msi_controller,
.handler= handle_edge_irq,
@@ -167,8 +167,7 @@ static struct irq_chip pci_msi_ir_controller = {
 
 static struct msi_domain_info pci_msi_ir_domain_info = {
.flags  = MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
- MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX |
- MSI_FLAG_MUST_REACTIVATE,
+ MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX,
.ops= &pci_msi_domain_ops,
.chip   = &pci_msi_ir_controller,
.handler= handle_edge_irq,

[tip:x86/apic] PCI/MSI: Set MSI_FLAG_MUST_REACTIVATE in core code

2017-10-18 Thread tip-bot for Thomas Gleixner

Commit-ID:  25e960efc63852b84d1c3739aef586285b177395
Gitweb: https://git.kernel.org/tip/25e960efc63852b84d1c3739aef586285b177395
Author: Thomas Gleixner 
AuthorDate: Tue, 17 Oct 2017 09:54:58 +0200
Committer:  Thomas Gleixner 
CommitDate: Wed, 18 Oct 2017 15:38:31 +0200

PCI/MSI: Set MSI_FLAG_MUST_REACTIVATE in core code

If interrupt reservation mode is enabled then the PCI/MSI interrupts must
be reactivated after early activation.

Make sure that all callers of pci_msi_create_irq_domain() have the
MSI_FLAG_MUST_REACTIVATE set when reservation mode is enabled.

Signed-off-by: Thomas Gleixner 
Cc: Josh Poulson 
Cc: Mihai Costache 
Cc: Stephen Hemminger 
Cc: Marc Zyngier 
Cc: linux-...@vger.kernel.org
Cc: Haiyang Zhang 
Cc: Dexuan Cui 
Cc: Simon Xiao 
Cc: Saeed Mahameed 
Cc: Jork Loeser 
Cc: Bjorn Helgaas 
Cc: de...@linuxdriverproject.org
Cc: KY Srinivasan 
Link: https://lkml.kernel.org/r/20171017075600.448649...@linutronix.de

---
 drivers/pci/msi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 496ed91..e066071 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1441,6 +1441,8 @@ struct irq_domain *pci_msi_create_irq_domain(struct 
fwnode_handle *fwnode,
pci_msi_domain_update_chip_ops(info);
 
info->flags |= MSI_FLAG_ACTIVATE_EARLY;
+   if (IS_ENABLED(CONFIG_GENERIC_IRQ_RESERVATION_MODE))
+   info->flags |= MSI_FLAG_MUST_REACTIVATE;
 
domain = msi_create_irq_domain(fwnode, info, parent);
if (!domain)

[tip:x86/apic] genirq: Add config option for reservation mode

2017-10-18 Thread tip-bot for Thomas Gleixner

Commit-ID:  2b5175c4fa974b6aa05bbd2ee8d443a8036a1714
Gitweb: https://git.kernel.org/tip/2b5175c4fa974b6aa05bbd2ee8d443a8036a1714
Author: Thomas Gleixner 
AuthorDate: Tue, 17 Oct 2017 09:54:57 +0200
Committer:  Thomas Gleixner 
CommitDate: Wed, 18 Oct 2017 15:38:30 +0200

genirq: Add config option for reservation mode

The interrupt reservation mode requires reactivation of PCI/MSI
interrupts. Create a config option, so the PCI code can set the
corresponding flag when required.

Signed-off-by: Thomas Gleixner 
Cc: Josh Poulson 
Cc: Mihai Costache 
Cc: Stephen Hemminger 
Cc: Marc Zyngier 
Cc: linux-...@vger.kernel.org
Cc: Haiyang Zhang 
Cc: Dexuan Cui 
Cc: Simon Xiao 
Cc: Saeed Mahameed 
Cc: Jork Loeser 
Cc: Bjorn Helgaas 
Cc: de...@linuxdriverproject.org
Cc: KY Srinivasan 
Link: https://lkml.kernel.org/r/20171017075600.369375...@linutronix.de

---
 kernel/irq/Kconfig | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index ac1a3e2..89e3558 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -100,6 +100,9 @@ config IRQ_TIMINGS
 config GENERIC_IRQ_MATRIX_ALLOCATOR
bool
 
+config GENERIC_IRQ_RESERVATION_MODE
+   bool
+
 config IRQ_DOMAIN_DEBUG
bool "Expose hardware/virtual IRQ mapping via debugfs"
depends on IRQ_DOMAIN && DEBUG_FS
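
The effect of the three commits above on the domain flags can be sketched
as a small pure function. This is only an illustration: the bit values
below are placeholders rather than the kernel's definitions, and the
reservation_mode argument stands in for
IS_ENABLED(CONFIG_GENERIC_IRQ_RESERVATION_MODE), which is a compile-time
check in the real code.

```c
#include <stdint.h>

/* Placeholder bit values, for illustration only. */
#define MSI_FLAG_ACTIVATE_EARLY		(UINT32_C(1) << 0)
#define MSI_FLAG_MUST_REACTIVATE	(UINT32_C(1) << 1)

/*
 * Mirrors the flag handling added to pci_msi_create_irq_domain():
 * early activation is always requested, reactivation only when
 * reservation mode is enabled.
 */
uint32_t pci_msi_domain_flags(uint32_t flags, int reservation_mode)
{
	flags |= MSI_FLAG_ACTIVATE_EARLY;
	if (reservation_mode)
		flags |= MSI_FLAG_MUST_REACTIVATE;
	return flags;
}
```

Since the core code now adds the flag, callers such as the x86 apic/msi
code no longer need to set it themselves.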

Re: [PATCH v6 5/5] arm: dts: stm32: remove useless clocksource nodes

2017-10-18 Thread Benjamin Gaignard

2017-10-18 15:21 GMT+02:00 Rob Herring :
> On Wed, Oct 18, 2017 at 7:58 AM, Benjamin Gaignard
>  wrote:
>> 16 bits timers aren't accurate enough to be used as
>> clocksource, remove them from stm32f4 and stm32f7 devicetree.
>
> They aren't useful for anything? Zephyr? u-boot?

I have checked with the teams, and timers (either 16 or 32 bits) are not
used by Zephyr or U-Boot.

>
> Rob

Re: [PATCH v6 1/4] sched/clock: interface to allow timestamps early in boot

2017-10-18 Thread Pavel Tatashin

On Wed, Oct 18, 2017 at 6:01 AM, Dou Liyang  wrote:
> Hi Pasha,
>
> Sorry to reply you so late.
>
> I have tested the TSC sync on our machine with DR (Dynamic Reconfiguration)
>   Linux kernel: Linux-4.14.0-rc5
>   NUMA nodes: 4 node.
>   Use clock_gettime() to reach nano-second accuracy.
>
> It is OK that we setup our reconfigurable with "tsc=unstable".
>

Excellent, thank you very much for confirming this Dou!

Pavel

Re: [PATCH v5 5/6] perf: hisi: Add support for HiSilicon SoC DDRC PMU driver

2017-10-18 Thread Zhangshaokun

Hi Mark,

On 2017/10/17 23:21, Mark Rutland wrote:
> On Tue, Aug 22, 2017 at 04:07:56PM +0800, Shaokun Zhang wrote:
>> This patch adds support for the DDRC PMU driver in HiSilicon SoC chips. Each
>> DDRC has its own control, counter and interrupt registers and is a separate
>> PMU. Each DDRC PMU has 8 fixed-purpose counters which have been
>> mapped to 8 events by hardware, so the driver assumes that the counter index
>> is equal to the event code (0 - 7). An interrupt is supported to
>> handle counter (32-bit) overflow.
>>
>> Reviewed-by: Jonathan Cameron 
>> Signed-off-by: Shaokun Zhang 
>> Signed-off-by: Anurup M 
> 
> I have the same comments for this case as for the other two PMU drivers.
> 

Sure.

Thanks,
Shaokun

> Thanks,
> Mark.
> 
> .
>

Re: [PATCH v3 2/2] livepatch: add atomic replace

2017-10-18 Thread Jiri Kosina

On Wed, 18 Oct 2017, Miroslav Benes wrote:

> 3. Drop immediate. It causes problems only and its advantages on x86_64 
> are theoretical. You would still need to solve the interaction with atomic 
> replace on other architecture with immediate preserved, but that may be 
> easier. Or we can be aggressive and drop immediate completely. The force 
> transition I proposed earlier could achieve the same.

After brief off-thread discussion, I've been thinking about this a bit 
more and I also think that we should claim immediate "an experiment that 
failed", especially as the force functionality (which provides equal 
functionality from the userspace POV) will likely be there soonish.

Thanks,

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH 1/2] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default

2017-10-18 Thread Thomas Gleixner

On Wed, 18 Oct 2017, Ingo Molnar wrote:
> * Thomas Gleixner  wrote:
> 
> > On Wed, 18 Oct 2017, Byungchul Park wrote:
> > >  #ifdef CONFIG_LOCKDEP_CROSSRELEASE
> > > +#ifdef CONFIG_CROSSRELEASE_STACK_TRACE
> > >  #define MAX_XHLOCK_TRACE_ENTRIES 5
> > > +#else
> > > +#define MAX_XHLOCK_TRACE_ENTRIES 1
> > > +#endif
> > >  
> > >  /*
> > >   * This is for keeping locks waiting for commit so that true dependencies
> > > diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> > > index e36e652..5c2ddf2 100644
> > > --- a/kernel/locking/lockdep.c
> > > +++ b/kernel/locking/lockdep.c
> > > @@ -4863,8 +4863,13 @@ static void add_xhlock(struct held_lock *hlock)
> > >   xhlock->trace.nr_entries = 0;
> > >   xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
> > >   xhlock->trace.entries = xhlock->trace_entries;
> > > +#ifdef CONFIG_CROSSRELEASE_STACK_TRACE
> > >   xhlock->trace.skip = 3;
> > >   save_stack_trace(&xhlock->trace);
> > > +#else
> > > + xhlock->trace.nr_entries = 1;
> > > + xhlock->trace.entries[0] = hlock->acquire_ip;
> > > +#endif
> > 
> > Hmm. Would it be possible to have this switchable at boot time via a
> > command line parameter? So in case of a splat with no stack trace, one
> > could just reboot and set something like 'lockdep_fullstack' on the kernel
> > command line to get the full data without having to recompile the kernel.
> 
> Yeah, and I'd suggest keeping the Kconfig option to default-enable that boot 
> option as well - i.e. let's have both.

That makes sense. Like we have with debug objects:
DEBUG_OBJECTS_ENABLE_DEFAULT.

Which reminds me that I wanted to convert them to static_key so they are
zero overhead when disabled. Sigh, why are todo lists growth only?

Thanks,

tglx
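
The combination discussed above (a Kconfig default plus a boot-time
override) can be sketched in userspace like this. Everything here is
hypothetical: the parameter name is the 'lockdep_fullstack' example from
the mail, LOCKDEP_FULLSTACK_DEFAULT stands in for whatever Kconfig symbol
would provide the default, and the command line is handled as a plain
space-separated string rather than through the kernel's parameter
parsing.

```c
#include <stdbool.h>
#include <string.h>

/* Stand-in for a Kconfig-provided default (hypothetical symbol). */
#ifndef LOCKDEP_FULLSTACK_DEFAULT
#define LOCKDEP_FULLSTACK_DEFAULT false
#endif

/* True if `param` appears as a whole space-separated word in cmdline. */
static bool cmdline_has_param(const char *cmdline, const char *param)
{
	size_t len = strlen(param);
	const char *p = cmdline;

	while ((p = strstr(p, param)) != NULL) {
		bool starts_word = (p == cmdline) || (p[-1] == ' ');
		bool ends_word = (p[len] == '\0') || (p[len] == ' ');

		if (starts_word && ends_word)
			return true;
		p += len;
	}
	return false;
}

/* Full stack traces are saved if built in as default or asked for at boot. */
bool lockdep_want_full_stack(const char *cmdline)
{
	return LOCKDEP_FULLSTACK_DEFAULT ||
	       cmdline_has_param(cmdline, "lockdep_fullstack");
}
```

With the default left off, a splat without a stack trace can be followed
up by rebooting with the parameter set, without rebuilding the kernel.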

Re: [PATCH 3/4] iommu/arm-smmu-v3: Use NUMA memory allocations for stream tables and comamnd queues

2017-10-18 Thread Robin Murphy

On 04/10/17 14:53, Ganapatrao Kulkarni wrote:
> Hi Robin,
> 
> 
> On Thu, Sep 21, 2017 at 5:28 PM, Robin Murphy  wrote:
>> [+Christoph and Marek]
>>
>> On 21/09/17 09:59, Ganapatrao Kulkarni wrote:
>>> Introduce smmu_alloc_coherent and smmu_free_coherent functions to
>>> allocate/free dma coherent memory from NUMA node associated with SMMU.
>>> Replace all calls of dmam_alloc_coherent with smmu_alloc_coherent
>>> for SMMU stream tables and command queues.
>>
>> This doesn't work - not only do you lose the 'managed' aspect and risk
>> leaking various tables on probe failure or device removal, but more
>> importantly, unless you add DMA syncs around all the CPU accesses to the
>> tables, you lose the critical 'coherent' aspect, and that's a horribly
>> invasive change that I really don't want to make.
> 
> this implementation is similar to function used to allocate memory for
> translation tables.

The concept is similar, yes, and would work if implemented *correctly*
with the aforementioned comprehensive and hugely invasive changes. The
implementation as presented in this patch, however, is incomplete and
badly broken.

By way of comparison, the io-pgtable implementations contain all the
necessary dma_sync_* calls, never relied on devres, and only have one
DMA direction to worry about (hint: the queues don't all work
identically). There are also a couple of practical reasons for using
streaming mappings with the DMA == phys restriction there - tracking
both the CPU and DMA addresses for each table would significantly
increase the memory overhead, and using the cacheable linear map address
in all cases sidesteps any potential problems with the atomic PTE
updates. Neither of those concerns apply to the SMMUv3 data structures,
which are textbook coherent DMA allocations (being tied to the lifetime
of the device, rather than transient).

> Why do you think it affects stream tables but not page tables?
> At runtime, both tables are accessed by the SMMU only.
> 
> As said in the cover letter, allocating the stream table from the respective
> NUMA node yields around a 30% performance improvement!
> Please suggest if there is any better way to address this issue.

I fully agree that NUMA-aware allocations are a worthwhile thing that we
want. I just don't like the idea of going around individual drivers
replacing coherent API usage with bodged-up streaming mappings - I
really think it's worth making the effort to tackle it once, in the
proper place, in a way that benefits all users together.

Robin.

>>
>> Christoph, Marek; how reasonable do you think it is to expect
>> dma_alloc_coherent() to be inherently NUMA-aware on NUMA-capable
>> systems? SWIOTLB looks fairly straightforward to fix up (for the simple
>> allocation case; I'm not sure it's even worth it for bounce-buffering),
>> but the likes of CMA might be a little trickier...
>>
>> Robin.
>>
>>> Signed-off-by: Ganapatrao Kulkarni 
>>> ---
>>>  drivers/iommu/arm-smmu-v3.c | 57 
>>> -
>>>  1 file changed, 51 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>>> index e67ba6c..bc4ba1f 100644
>>> --- a/drivers/iommu/arm-smmu-v3.c
>>> +++ b/drivers/iommu/arm-smmu-v3.c
>>> @@ -1158,6 +1158,50 @@ static void arm_smmu_init_bypass_stes(u64 *strtab, 
>>> unsigned int nent)
>>>   }
>>>  }
>>>
>>> +static void *smmu_alloc_coherent(struct arm_smmu_device *smmu, size_t size,
>>> + dma_addr_t *dma_handle, gfp_t gfp)
>>> +{
>>> + struct device *dev = smmu->dev;
>>> + void *pages;
>>> + dma_addr_t dma;
>>> + int numa_node = dev_to_node(dev);
>>> +
>>> + pages = alloc_pages_exact_nid(numa_node, size, gfp | __GFP_ZERO);
>>> + if (!pages)
>>> + return NULL;
>>> +
>>> + if (!(smmu->features & ARM_SMMU_FEAT_COHERENCY)) {
>>> + dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
>>> + if (dma_mapping_error(dev, dma))
>>> + goto out_free;
>>> + /*
>>> +  * We depend on the SMMU being able to work with any physical
>>> +  * address directly, so if the DMA layer suggests otherwise by
>>> +  * translating or truncating them, that bodes very badly...
>>> +  */
>>> + if (dma != virt_to_phys(pages))
>>> + goto out_unmap;
>>> + }
>>> +
>>> + *dma_handle = (dma_addr_t)virt_to_phys(pages);
>>> + return pages;
>>> +
>>> +out_unmap:
>>> + dev_err(dev, "Cannot accommodate DMA translation for IOMMU page 
>>> tables\n");
>>> + dma_unmap_single(dev, dma, size, DMA_TO_DEVICE);
>>> +out_free:
>>> + free_pages_exact(pages, size);
>>> + return NULL;
>>> +}
>>> +
>>> +static void smmu_free_coherent(struct arm_smmu_device *smmu, size_t size,
>>> + void *pages, dma_addr_t dma_handle)
>>> +{
>>> + if (!(smmu->features & ARM_SMMU_FEAT_COHERENCY))
>>> + dma_unmap_single(smmu->d

Re: [PATCH v5 4/6] perf: hisi: Add support for HiSilicon SoC HHA PMU driver

2017-10-18 Thread Zhangshaokun

Hi Mark,

On 2017/10/17 23:18, Mark Rutland wrote:
> On Tue, Aug 22, 2017 at 04:07:55PM +0800, Shaokun Zhang wrote:
>> L3 cache coherence is maintained by the Hydra Home Agent (HHA) in HiSilicon
>> SoCs. This patch adds support for the HHA PMU driver. Each HHA has its own
>> control, counter and interrupt registers and is a separate PMU. Each
>> HHA PMU has 16 programmable counters, and each counter is
>> free-running. An interrupt is supported to handle counter (48-bit)
>> overflow.
> 
> My comments here are the same as for the L3C PMU driver.
> 

Sure.

Thanks,
Shaokun

> Thanks,
> Mark.
> 
> .
>

Re: [PATCH v5 3/6] perf: hisi: Add support for HiSilicon SoC L3C PMU driver

2017-10-18 Thread Zhangshaokun

Hi Mark,

On 2017/10/17 23:16, Mark Rutland wrote:
> On Tue, Aug 22, 2017 at 04:07:54PM +0800, Shaokun Zhang wrote:
>> +static int hisi_l3c_pmu_init_irq(struct hisi_pmu *l3c_pmu,
>> + struct platform_device *pdev)
>> +{
>> +int irq, ret;
>> +
>> +/* Read and init IRQ */
>> +irq = platform_get_irq(pdev, 0);
>> +if (irq < 0) {
>> +dev_err(&pdev->dev, "L3C PMU get irq fail; irq:%d\n", irq);
>> +return irq;
>> +}
>> +
>> +ret = devm_request_irq(&pdev->dev, irq, hisi_l3c_pmu_isr,
>> +   IRQF_NOBALANCING | IRQF_NO_THREAD,
>> +   dev_name(&pdev->dev), l3c_pmu);
>> +if (ret < 0) {
>> +dev_err(&pdev->dev,
>> +"Fail to request IRQ:%d ret:%d\n", irq, ret);
>> +return ret;
>> +}
>> +
>> +l3c_pmu->irq = irq;
>> +
>> +return 0;
>> +}
>> +
>> +/*
>> + * Check whether the CPU is associated with this L3C PMU by SCCL_ID
>> + * and CCL_ID, if true, set the associated cpumask of the L3C PMU.
>> + */
>> +static void hisi_l3c_pmu_set_cpumask_by_ccl(void *arg)
>> +{
>> +struct hisi_pmu *l3c_pmu = (struct hisi_pmu *)arg;
>> +u32 ccl_id, sccl_id;
>> +
>> +hisi_read_sccl_and_ccl_id(&sccl_id, &ccl_id);
>> +if (sccl_id == l3c_pmu->sccl_id && ccl_id == l3c_pmu->ccl_id)
>> +cpumask_set_cpu(smp_processor_id(), &l3c_pmu->associated_cpus);
>> +}
> 
> The shared code has hisi_uncore_pmu_set_cpumask_by_sccl(), and it would
> be nice to place this in the same place.
> 
> Otherwise, the same comments apply here.
> 

Ok, shall fix the same issues.

>> +
>> +static const struct acpi_device_id hisi_l3c_pmu_acpi_match[] = {
>> +{ "HISI0213", },
>> +{},
>> +};
>> +MODULE_DEVICE_TABLE(acpi, hisi_l3c_pmu_acpi_match);
>> +
>> +static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
>> +  struct hisi_pmu *l3c_pmu)
>> +{
>> +unsigned long long id;
>> +struct resource *res;
>> +acpi_status status;
>> +int cpu;
>> +
>> +status = acpi_evaluate_integer(ACPI_HANDLE(&pdev->dev),
>> +   "_UID", NULL, &id);
>> +if (ACPI_FAILURE(status))
>> +return -EINVAL;
>> +
>> +l3c_pmu->id = id;
>> +
>> +/*
>> + * Use the SCCL_ID and CCL_ID to identify the L3C PMU, while
>> + * SCCL_ID is in MPIDR[aff2] and CCL_ID is in MPIDR[aff1].
>> + */
>> +if (device_property_read_u32(&pdev->dev, "hisilicon,scl-id",
>> + &l3c_pmu->sccl_id)) {
>> +dev_err(&pdev->dev, "Can not read l3c sccl-id!\n");
>> +return -EINVAL;
>> +}
>> +
>> +if (device_property_read_u32(&pdev->dev, "hisilicon,ccl-id",
>> + &l3c_pmu->ccl_id)) {
>> +dev_err(&pdev->dev, "Can not read l3c ccl-id!\n");
>> +return -EINVAL;
>> +}
>> +
>> +/* Initialise the associated cpumask of the PMU */
>> +for_each_present_cpu(cpu)
>> +smp_call_function_single(cpu, hisi_l3c_pmu_set_cpumask_by_ccl,
>> + (void *)l3c_pmu, 1);
> 
> Ah, so that's why hisi_uncore_pmu_set_cpumask_by_sccl took a void
> pointer.
> 
> Please drop a comment above hisi_uncore_pmu_set_cpumask_by_sccl to cover
> that.
> 
> I think you can drop the void cast here; I don't believe it is
> necessary.
> 

Ok.

> Rather than a probe-time smp_call_function_single(), can you follow the
> qcom l2's approach of associating CPUs with a PMU instance in the
> notifier? That will work even if CPUs are brought online very late.
> 

Good suggestion, but the HHA and DDRC PMUs differ from the L3C PMU: the
former are associated with CPUs by SCCL only, while the latter is associated
by both SCCL and CCL. I will try to handle this difference in the online
notifier.

Thanks,
Shaokun

>> +
>> +res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +l3c_pmu->base = devm_ioremap_resource(&pdev->dev, res);
>> +if (IS_ERR(l3c_pmu->base)) {
>> +dev_err(&pdev->dev, "ioremap failed for l3c_pmu resource\n");
>> +return PTR_ERR(l3c_pmu->base);
>> +}
>> +
>> +return 0;
>> +}
> 
> Thanks,
> Mark.
> 
> .
>

[tip:timers/core] timers: Avoid an unnecessary iteration in __run_timers()

2017-10-18 Thread tip-bot for Zhenzhong Duan

Commit-ID:  c310ce4dcb9df9b2f1be82caff7dae609fe53f72
Gitweb: https://git.kernel.org/tip/c310ce4dcb9df9b2f1be82caff7dae609fe53f72
Author: Zhenzhong Duan 
AuthorDate: Sun, 8 Oct 2017 20:55:59 -0700
Committer:  Thomas Gleixner 
CommitDate: Wed, 18 Oct 2017 15:29:33 +0200

timers: Avoid an unnecessary iteration in __run_timers()

If the base clock is behind jiffies in the soft irq expiry code then the
next timer is retrieved by get_next_timer_interrupt() to avoid incrementing
base clock one by one. If the next timer interrupt is past current jiffies
then the base clock is set to jiffies - 1. At the call site this is
incremented and another iteration through the expiry loop is executed which
checks empty hash buckets.

That's a pointless exercise because it's already known that the next timer
is past jiffies.

Set the base clock in that case to jiffies directly so it gets incremented
to jiffies + 1 at the call site resulting in immediate termination of the
expiry loop.

[ tglx: Massaged changelog and added comment to the code ]

Signed-off-by: Zhenzhong Duan 
Signed-off-by: Thomas Gleixner 
Acked-by: Anna-Maria Gleixner 
Cc: Joe Jin 
Cc: sb...@codeaurora.org
Cc: Srinivas Reddy Eeda 
Cc: john.stu...@linaro.org
Link: https://lkml.kernel.org/r/7086a857-f90c-4616-bbe8-f7696f21626c@default
---
 kernel/time/timer.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 38613ce..ee1a88d 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1560,8 +1560,11 @@ static int collect_expired_timers(struct timer_base *base,
 * jiffies, otherwise forward to the next expiry time:
 */
if (time_after(next, jiffies)) {
-   /* The call site will increment clock! */
-   base->clk = jiffies - 1;
+   /*
+* The call site will increment base->clk and then
+* terminate the expiry loop immediately.
+*/
+   base->clk = jiffies;
return 0;
}
base->clk = next;
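
The effect of the hunk above can be sketched with a small userspace model.
This is not the kernel code: every toy_* name is hypothetical, the wrap-safe
comparisons are reimplemented locally, and the model assumes at most one
pending timer. It only counts how often the expiry loop runs when the base
clock has fallen far behind jiffies.

```c
#include <assert.h>

/* Wrap-safe comparisons modelled on the kernel's time_after() family. */
static int toy_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

static int toy_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

#define TOY_IDLE_DELTA (1UL << 20)	/* assumed "nothing queued" distance */

struct toy_base {
	unsigned long clk;		/* base clock */
	unsigned long next_expiry;	/* the single pending timer */
	int has_timer;
};

/* Simplified stand-in for collect_expired_timers(). */
static int toy_collect(struct toy_base *base, unsigned long jiffies, int old)
{
	if ((long)(jiffies - base->clk) > 2) {
		unsigned long next = base->has_timer ?
			base->next_expiry : jiffies + TOY_IDLE_DELTA;

		if (toy_time_after(next, jiffies)) {
			/* old code: jiffies - 1, patched code: jiffies */
			base->clk = old ? jiffies - 1 : jiffies;
			return 0;
		}
		base->clk = next;
		base->has_timer = 0;	/* the toy timer is consumed here */
	}
	return 1;	/* pretend the hash buckets were scanned */
}

/* Simplified stand-in for the loop in __run_timers(); returns iterations. */
static int toy_run_timers(unsigned long start_clk, unsigned long jiffies,
			  unsigned long next, int old)
{
	struct toy_base base = { start_clk, next, 1 };
	int iterations = 0;

	while (toy_time_after_eq(jiffies, base.clk)) {
		(void)toy_collect(&base, jiffies, old);
		base.clk++;	/* "the call site will increment base->clk" */
		iterations++;
		assert(iterations < 1000);	/* the toy must terminate */
	}
	return iterations;
}
```

With the clock at 100, jiffies at 1000 and the next timer at 2000, the old
assignment costs one extra pass through the loop compared with the patched
one, which is the iteration the commit message describes.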

Re: [PATCH] zswap: Same-filled pages handling

2017-10-18 Thread Timofey Titovets

2017-10-18 15:34 GMT+03:00 Matthew Wilcox :
> On Wed, Oct 18, 2017 at 10:48:32AM +, Srividya Desireddy wrote:
>> +static void zswap_fill_page(void *ptr, unsigned long value)
>> +{
>> + unsigned int pos;
>> + unsigned long *page;
>> +
>> + page = (unsigned long *)ptr;
>> + if (value == 0)
>> + memset(page, 0, PAGE_SIZE);
>> + else {
>> + for (pos = 0; pos < PAGE_SIZE / sizeof(*page); pos++)
>> + page[pos] = value;
>> + }
>> +}
>
> I think you meant:
>
> static void zswap_fill_page(void *ptr, unsigned long value)
> {
> memset_l(ptr, value, PAGE_SIZE / sizeof(unsigned long));
> }
>
> (and you should see significantly better numbers at least on x86;
> I don't know if anyone's done an arm64 version of memset_l yet).
>

IIRC the kernel has a special zero page and, if I understand correctly,
you can map all zero-filled pages to that zero page and not touch zswap at all.
(Your situation looks like a KSM case, i.e. KSM can handle pages with the
same content, but I'm not sure whether that is applicable here.)

Thanks.
-- 
Have a nice day,
Timofey.
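
For readers following the thread, the same-filled page idea can be sketched
in userspace as below. This is an illustration only, not the zswap code:
TOY_PAGE_SIZE and the toy_* helpers are hypothetical names, and the plain
loop stands in for the kernel's memset_l() suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096	/* assumed page size for the sketch */

/*
 * Return 1 and store the repeated word in *value if every word of the
 * page is identical, otherwise return 0.
 */
static int toy_page_same_filled(const void *ptr, unsigned long *value)
{
	const unsigned long *page = ptr;
	size_t i, n = TOY_PAGE_SIZE / sizeof(*page);

	for (i = 1; i < n; i++)
		if (page[i] != page[0])
			return 0;

	*value = page[0];
	return 1;
}

/* Rebuild a page from its repeated word; the zero case uses memset(). */
static void toy_fill_page(void *ptr, unsigned long value)
{
	unsigned long *page = ptr;
	size_t i, n = TOY_PAGE_SIZE / sizeof(*page);

	assert(ptr != NULL);
	if (value == 0) {
		memset(page, 0, TOY_PAGE_SIZE);
		return;
	}
	for (i = 0; i < n; i++)
		page[i] = value;
}
```

Only the single word needs to be stored for such a page; the fill helper
reconstructs the contents on load.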

Re: [PATCH] workqueue: Convert timers to use timer_setup() (part 2)

2017-10-18 Thread Tejun Heo

On Mon, Oct 16, 2017 at 03:58:25PM -0700, Kees Cook wrote:
> In preparation for unconditionally passing the struct timer_list pointer
> to all timer callbacks, switch to using the new timer_setup() and
> from_timer() to pass the timer pointer explicitly. (The prior workqueue
> patch missed a few timers.)
> 
> Cc: Tejun Heo 
> Cc: Lai Jiangshan 
> Signed-off-by: Kees Cook 

Acked-by: Tejun Heo 

Please feel free to route with other timer patches.

Thanks.

-- 
tejun
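
As background to the conversion discussed above: the callback receives a
pointer to the timer itself and recovers its enclosing structure from it,
which is what from_timer() does via container_of(). The sketch below models
that pattern in userspace; struct toy_timer, struct toy_worker and the
toy_* functions are hypothetical stand-ins, not the workqueue code.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(). */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_timer {
	void (*function)(struct toy_timer *);
	unsigned long expires;
};

struct toy_worker {
	int id;
	int fired;
	struct toy_timer idle_timer;	/* embedded, like struct timer_list */
};

/* Callback takes the timer pointer and walks back to its owner. */
static void toy_idle_timeout(struct toy_timer *t)
{
	struct toy_worker *w = toy_container_of(t, struct toy_worker,
						idle_timer);

	w->fired++;
}

/* Stand-in for timer_setup(): record the callback, no data argument. */
static void toy_timer_setup(struct toy_timer *t,
			    void (*fn)(struct toy_timer *))
{
	t->function = fn;
	t->expires = 0;
}

/* Stand-in for the timer core invoking an expired timer. */
static void toy_fire(struct toy_timer *t)
{
	assert(t->function != NULL);
	t->function(t);
}
```

Because the owner is derived from the timer's address, no separate
unsigned long data cookie has to be stored or cast.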

Re: [PATCH 0/4] numa, iommu/smmu: IOMMU/SMMU driver optimization for NUMA systems

2017-10-18 Thread Will Deacon

Hi Ganapat,

On Thu, Sep 21, 2017 at 02:29:18PM +0530, Ganapatrao Kulkarni wrote:
> Adding numa aware memory allocations used for iommu dma allocation and
> memory allocated for SMMU stream tables, page walk tables and command queues.
> 
> With this patch, iperf testing on ThunderX2, with 40G NIC card on
> NODE 1 PCI shown same performance(around 30% improvement) as NODE 0.

Are you planning to repost this series? The idea looks good, but it needs
some rework before it can be merged.

Thanks,

Will

Re: [PATCH] video: fbdev: remove dead igafb driver

2017-10-18 Thread David Miller

From: John Paul Adrian Glaubitz 
Date: Wed, 18 Oct 2017 15:14:27 +0200

> Hi Bartlomiej!
> 
> On 10/18/2017 02:56 PM, Bartlomiej Zolnierkiewicz wrote:
>> igafb driver hasn't compiled since at least kernel v2.6.34 as
>> commit 6016a363f6b5 ("of: unify phandle name in struct device_node")
>> missed updating igafb.c to use dp->phandle instead of dp->node.
> Would it take a lot of work to port the driver to the new interface?
> 
> I'm not sure which SPARC machines use this particular framebuffer, but
> my plans are to fix up all these old framebuffer drivers. I have
> already
> received several Amiga (Zorro) graphics cards for testing the updated
> drivers on Amiga.
> 
> It could be that I actually have this particular SPARC framebuffer in
> my hardware collection.

Unless you have a 32-bit sparc laptop, you don't have a machine that
will use this driver.

Re: [PATCH v5 4/9] drm/bridge: analogix_dp: Fix connector & encoder cleanup

2017-10-18 Thread Andrzej Hajda

On 18.10.2017 14:09, Jeffy Chen wrote:
> Since we are initing connector in the core driver and encoder in the
> plat driver, let's clean them up in the right places.
>
> Signed-off-by: Jeffy Chen 
> ---
>
> Changes in v5: None
>
>  drivers/gpu/drm/bridge/analogix/analogix_dp_core.c |  2 --
>  drivers/gpu/drm/exynos/exynos_dp.c |  7 +--
>  drivers/gpu/drm/rockchip/analogix_dp-rockchip.c| 15 ++-
>  3 files changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> index 74d274b6d31d..3f910ab36ff6 100644
> --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> @@ -1409,7 +1409,6 @@ analogix_dp_bind(struct device *dev, struct drm_device *drm_dev,
>   ret = analogix_dp_create_bridge(drm_dev, dp);
>   if (ret) {
>   DRM_ERROR("failed to create bridge (%d)\n", ret);
> - drm_encoder_cleanup(dp->encoder);
>   goto err_disable_pm_runtime;
>   }
>  
> @@ -1432,7 +1431,6 @@ void analogix_dp_unbind(struct analogix_dp_device *dp)
>  {
>   analogix_dp_bridge_disable(dp->bridge);
>   dp->connector.funcs->destroy(&dp->connector);
> - dp->encoder->funcs->destroy(dp->encoder);
>  
>   if (dp->plat_data->panel) {
>   if (drm_panel_unprepare(dp->plat_data->panel))
> diff --git a/drivers/gpu/drm/exynos/exynos_dp.c b/drivers/gpu/drm/exynos/exynos_dp.c
> index f7e5b2c405ed..33319a858f3a 100644
> --- a/drivers/gpu/drm/exynos/exynos_dp.c
> +++ b/drivers/gpu/drm/exynos/exynos_dp.c
> @@ -185,8 +185,10 @@ static int exynos_dp_bind(struct device *dev, struct device *master, void *data)
>   dp->plat_data.encoder = encoder;
>  
>   dp->adp = analogix_dp_bind(dev, dp->drm_dev, &dp->plat_data);
> - if (IS_ERR(dp->adp))
> + if (IS_ERR(dp->adp)) {
> + dp->encoder.funcs->destroy(&dp->encoder);
>   return PTR_ERR(dp->adp);
> + }
>  
>   return 0;
>  }
> @@ -196,7 +198,8 @@ static void exynos_dp_unbind(struct device *dev, struct device *master,
>  {
>   struct exynos_dp_device *dp = dev_get_drvdata(dev);
>  
> - return analogix_dp_unbind(dp->adp);
> + analogix_dp_unbind(dp->adp);
> + dp->encoder.funcs->destroy(&dp->encoder);
>  }
>  
>  static const struct component_ops exynos_dp_ops = {
> diff --git a/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c b/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
> index fa0365de31d2..c0fb3f3748f4 100644
> --- a/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
> +++ b/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
> @@ -261,13 +261,8 @@ static struct drm_encoder_helper_funcs rockchip_dp_encoder_helper_funcs = {
>   .atomic_check = rockchip_dp_drm_encoder_atomic_check,
>  };
>  
> -static void rockchip_dp_drm_encoder_destroy(struct drm_encoder *encoder)
> -{
> - drm_encoder_cleanup(encoder);
> -}
> -
>  static struct drm_encoder_funcs rockchip_dp_encoder_funcs = {
> - .destroy = rockchip_dp_drm_encoder_destroy,
> + .destroy = drm_encoder_cleanup,
>  };
>  
>  static int rockchip_dp_of_probe(struct rockchip_dp_device *dp)
> @@ -361,12 +356,13 @@ static int rockchip_dp_bind(struct device *dev, struct device *master,
>   dp->psr_state = ~EDP_VSC_PSR_STATE_ACTIVE;
>   INIT_WORK(&dp->psr_work, analogix_dp_psr_work);
>  
> - rockchip_drm_psr_register(&dp->encoder, analogix_dp_psr_set);
> -
>   dp->adp = analogix_dp_bind(dev, dp->drm_dev, &dp->plat_data);
> - if (IS_ERR(dp->adp))
> + if (IS_ERR(dp->adp)) {
> + dp->encoder.funcs->destroy(&dp->encoder);
>   return PTR_ERR(dp->adp);
> + }
>  
> + rockchip_drm_psr_register(&dp->encoder, analogix_dp_psr_set);

You are changing the order of calls here (psr_register now comes after
bind), which does not seem to be related to the patch subject.
Anyway, psr_register can fail and its result is not checked, but that can
be addressed in a separate patch.
So maybe it would be better to leave the order as is, unless there is a
reason to change it in this patch; in that case please explain it in the
commit message.
Besides this:
Reviewed-by: Andrzej Hajda 

 --
Regards
Andrzej

>   return 0;
>  }
>  
> @@ -377,6 +373,7 @@ static void rockchip_dp_unbind(struct device *dev, struct device *master,
>  
>   rockchip_drm_psr_unregister(&dp->encoder);
>   analogix_dp_unbind(dp->adp);
> + dp->encoder.funcs->destroy(&dp->encoder);
>  }
>  
>  static const struct component_ops rockchip_dp_component_ops = {

[for-next][PATCH 0/7] tracing: Some more updates for 4.15

2017-10-18 Thread Steven Rostedt


  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
for-next

Head SHA1: a96a5037ed0f52e2d86739f4a1ef985bd036e575


Peter Zijlstra (4):
  perf/ftrace: Revert ("perf/ftrace: Fix double traces of perf on ftrace:function")
  perf/ftrace: Fix function trace events
  perf/ftrace: Small cleanup
  ftrace: Kill FTRACE_OPS_FL_PER_CPU

Steven Rostedt (VMware) (3):
  tracing, dma-buf: Remove unused trace event dma_fence_annotate_wait_on
  tracing, thermal: Hide devfreq trace events when not in use
  tracing, thermal: Hide cpu cooling trace events when not in use


 drivers/dma-buf/dma-fence.c  |  1 -
 include/linux/ftrace.h   | 83 +++-
 include/linux/perf_event.h   |  2 +-
 include/linux/trace_events.h |  9 -
 include/trace/events/dma_fence.h | 40 ---
 include/trace/events/thermal.h   |  4 ++
 kernel/events/core.c | 13 ++-
 kernel/trace/ftrace.c| 55 +++---
 kernel/trace/trace_event_perf.c  | 82 +++
 kernel/trace/trace_kprobe.c  |  4 +-
 kernel/trace/trace_syscalls.c|  4 +-
 kernel/trace/trace_uprobe.c  |  2 +-
 12 files changed, 90 insertions(+), 209 deletions(-)

Re: [PATCH v2 1/4] dt-bindings: rtc: mediatek: add bindings for MediaTek SoC based RTC

2017-10-18 Thread Yingjoe Chen

On Tue, 2017-10-17 at 17:40 +0800, sean.w...@mediatek.com wrote:
> From: Sean Wang 
> 
> Add device-tree binding for MediaTek SoC based RTC
> 
> Cc: devicet...@vger.kernel.org
> Signed-off-by: Sean Wang 
> Acked-by: Rob Herring 
> ---
>  .../devicetree/bindings/rtc/rtc-mediatek.txt| 21 
> +
>  1 file changed, 21 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/rtc/rtc-mediatek.txt
> 
> diff --git a/Documentation/devicetree/bindings/rtc/rtc-mediatek.txt b/Documentation/devicetree/bindings/rtc/rtc-mediatek.txt
> new file mode 100644
> index 000..09fe8f5
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/rtc/rtc-mediatek.txt


How about changing this filename to match the driver filename change?

Joe.C

RE: Adjusting further size determinations?

2017-10-18 Thread Julia Lawall



On Wed, 18 Oct 2017, David Laight wrote:

> From: SF Markus Elfring
> >  Unpleasant consequences are possible in both cases.
> > >> How much do you care to reduce the failure probability further?
> > >
> > > Zero.
> >
> > I am interested to improve the software situation a bit more here.
>
> There are probably better places to spend your time!
>
> If you want 'security' for kmalloc() then:
>
> #define KMALLOC_TYPE(flags) (type *)kmalloc(sizeof (type), flags)
> #define KMALLOC(ptr, flags) *(ptr) = KMALLOC_TYPE(typeof *(ptr), flags)
>
> and change:
>   ptr = kmalloc(sizeof *ptr, flags);
> to:
>   KMALLOC(&ptr, flags);
>
> But it is all churn for churn's sake.

Please don't.  Coccinelle won't find real problems with kmalloc any more
if this is done.

julia
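
For reference, the macro pair quoted above does not compile as written:
KMALLOC_TYPE() uses `type` without taking it as a parameter, and KMALLOC()
would need the pointed-to type rather than the pointer type. A userspace
sketch of what it appears to intend follows, with malloc() standing in for
kmalloc() and the GNU `__typeof__` extension assumed. The TOY_* macros and
struct toy_item are hypothetical, and the objection above about hiding the
allocation from static checkers applies to this form as well.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate one object of the named type and return a typed pointer. */
#define TOY_ALLOC_TYPE(type)	((type *)malloc(sizeof(type)))

/*
 * Given the address of a pointer variable, allocate an object of the
 * type it points to. __typeof__ is a GNU extension (assumed available).
 */
#define TOY_ALLOC(pptr) \
	(*(pptr) = TOY_ALLOC_TYPE(__typeof__(**(pptr))))

struct toy_item {
	int a;
	long b;
};

static struct toy_item *toy_item_new(int a)
{
	struct toy_item *p;

	TOY_ALLOC(&p);
	/* A real caller would handle failure instead of asserting. */
	assert(p != NULL);
	p->a = a;
	p->b = 0;
	return p;
}
```

The conventional form, ptr = malloc(sizeof *ptr), ties the size to the
pointer just as well and keeps the allocation call visible to tools.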

[for-next][PATCH 2/7] perf/ftrace: Revert ("perf/ftrace: Fix double traces of perf on ftrace:function")

2017-10-18 Thread Steven Rostedt

From: Peter Zijlstra 

Revert commit:

  75e8387685f6 ("perf/ftrace: Fix double traces of perf on ftrace:function")

The reason I instantly stumbled on that patch is that it only addresses the
ftrace situation and doesn't mention the other _5_ places that use this
interface. It doesn't explain why those don't have the problem and if not, why
their solution doesn't work for ftrace.

It doesn't, but this is just putting more duct tape on.

Link: http://lkml.kernel.org/r/20171011080224.200565...@infradead.org

Cc: Zhou Chengming 
Cc: Jiri Olsa 
Cc: Ingo Molnar 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Steven Rostedt (VMware) 
---
 include/linux/perf_event.h  |  2 +-
 include/linux/trace_events.h|  4 ++--
 kernel/events/core.c| 13 -
 kernel/trace/trace_event_perf.c |  4 +---
 kernel/trace/trace_kprobe.c |  4 ++--
 kernel/trace/trace_syscalls.c   |  4 ++--
 kernel/trace/trace_uprobe.c |  2 +-
 7 files changed, 13 insertions(+), 20 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 8e22f24ded6a..569d1b54e201 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1184,7 +1184,7 @@ extern void perf_event_init(void);
 extern void perf_tp_event(u16 event_type, u64 count, void *record,
  int entry_size, struct pt_regs *regs,
  struct hlist_head *head, int rctx,
- struct task_struct *task, struct perf_event *event);
+ struct task_struct *task);
 extern void perf_bp_event(struct perf_event *event, void *data);
 
 #ifndef perf_misc_flags
diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 2e0f22298fe9..a6349b76fd39 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -507,9 +507,9 @@ void perf_trace_run_bpf_submit(void *raw_data, int size, int rctx,
 static inline void
 perf_trace_buf_submit(void *raw_data, int size, int rctx, u16 type,
   u64 count, struct pt_regs *regs, void *head,
-  struct task_struct *task, struct perf_event *event)
+  struct task_struct *task)
 {
-   perf_tp_event(type, count, raw_data, size, regs, head, rctx, task, event);
+   perf_tp_event(type, count, raw_data, size, regs, head, rctx, task);
 }
 #endif
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6bc21e202ae4..b8db80c5513b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7954,15 +7954,16 @@ void perf_trace_run_bpf_submit(void *raw_data, int size, int rctx,
}
}
perf_tp_event(call->event.type, count, raw_data, size, regs, head,
- rctx, task, NULL);
+ rctx, task);
 }
 EXPORT_SYMBOL_GPL(perf_trace_run_bpf_submit);
 
 void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size,
   struct pt_regs *regs, struct hlist_head *head, int rctx,
-  struct task_struct *task, struct perf_event *event)
+  struct task_struct *task)
 {
struct perf_sample_data data;
+   struct perf_event *event;
 
struct perf_raw_record raw = {
.frag = {
@@ -7976,15 +7977,9 @@ void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size,
 
perf_trace_buf_update(record, event_type);
 
-   /* Use the given event instead of the hlist */
-   if (event) {
+   hlist_for_each_entry_rcu(event, head, hlist_entry) {
if (perf_tp_event_match(event, &data, regs))
perf_swevent_event(event, count, &data, regs);
-   } else {
-   hlist_for_each_entry_rcu(event, head, hlist_entry) {
-   if (perf_tp_event_match(event, &data, regs))
-   perf_swevent_event(event, count, &data, regs);
-   }
}
 
/*
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 13ba2d3f6a91..562fa69df5d3 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -306,7 +306,6 @@ static void
 perf_ftrace_function_call(unsigned long ip, unsigned long parent_ip,
  struct ftrace_ops *ops, struct pt_regs *pt_regs)
 {
-   struct perf_event *event;
struct ftrace_entry *entry;
struct hlist_head *head;
struct pt_regs regs;
@@ -330,9 +329,8 @@ perf_ftrace_function_call(unsigned long ip, unsigned long parent_ip,
 
entry->ip = ip;
entry->parent_ip = parent_ip;
-   event = container_of(ops, struct perf_event, ftrace_ops);
perf_trace_buf_submit(entry, ENTRY_SIZE, rctx, TRACE_FN,
- 1, &regs, head, NULL, event);
+ 1, &regs, head, NULL);
 
 #undef ENTRY_SIZE
 }
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index af6134f2e597..996902a526d4 100

[for-next][PATCH 7/7] tracing, thermal: Hide cpu cooling trace events when not in use

2017-10-18 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

As trace events when defined create data structures and functions to
process them, defining trace events when not using them is a waste of
memory.

The trace events thermal_power_cpu_get_power and
thermal_power_cpu_limit are only used when CONFIG_CPU_THERMAL is set.
Make those events only defined when that is set as well.

Link: http://lkml.kernel.org/r/20171013102309.2c4ef...@gandalf.local.home

Cc: Eduardo Valentin 
Acked-by: Javi Merino 
Signed-off-by: Steven Rostedt (VMware) 
---
 include/trace/events/thermal.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/trace/events/thermal.h b/include/trace/events/thermal.h
index 1fdacdb94e77..8af8f130950e 100644
--- a/include/trace/events/thermal.h
+++ b/include/trace/events/thermal.h
@@ -90,6 +90,7 @@ TRACE_EVENT(thermal_zone_trip,
show_tzt_type(__entry->trip_type))
 );
 
+#ifdef CONFIG_CPU_THERMAL
 TRACE_EVENT(thermal_power_cpu_get_power,
TP_PROTO(const struct cpumask *cpus, unsigned long freq, u32 *load,
size_t load_len, u32 dynamic_power, u32 static_power),
@@ -147,6 +148,7 @@ TRACE_EVENT(thermal_power_cpu_limit,
__get_bitmask(cpumask), __entry->freq, __entry->cdev_state,
__entry->power)
 );
+#endif /* CONFIG_CPU_THERMAL */
 
 #ifdef CONFIG_DEVFREQ_THERMAL
 TRACE_EVENT(thermal_power_devfreq_get_power,
-- 
2.13.2

[for-next][PATCH 5/7] ftrace: Kill FTRACE_OPS_FL_PER_CPU

2017-10-18 Thread Steven Rostedt

From: Peter Zijlstra 

The one and only user of FTRACE_OPS_FL_PER_CPU is gone, remove the
lot.

Link: http://lkml.kernel.org/r/20171011080224.372422...@infradead.org

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Steven Rostedt (VMware) 
---
 include/linux/ftrace.h | 83 +-
 kernel/trace/ftrace.c  | 55 -
 2 files changed, 20 insertions(+), 118 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 1f8545caa691..252e334e7b5f 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -102,10 +102,6 @@ ftrace_func_t ftrace_ops_get_func(struct ftrace_ops *ops);
  * ENABLED - set/unset when ftrace_ops is registered/unregistered
  * DYNAMIC - set when ftrace_ops is registered to denote dynamically
  *   allocated ftrace_ops which need special care
- * PER_CPU - set manualy by ftrace_ops user to denote the ftrace_ops
- *   could be controlled by following calls:
- * ftrace_function_local_enable
- * ftrace_function_local_disable
  * SAVE_REGS - The ftrace_ops wants regs saved at each function called
  *and passed to the callback. If this flag is set, but the
  *architecture does not support passing regs
@@ -149,21 +145,20 @@ ftrace_func_t ftrace_ops_get_func(struct ftrace_ops *ops);
 enum {
FTRACE_OPS_FL_ENABLED   = 1 << 0,
FTRACE_OPS_FL_DYNAMIC   = 1 << 1,
-   FTRACE_OPS_FL_PER_CPU   = 1 << 2,
-   FTRACE_OPS_FL_SAVE_REGS = 1 << 3,
-   FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED= 1 << 4,
-   FTRACE_OPS_FL_RECURSION_SAFE= 1 << 5,
-   FTRACE_OPS_FL_STUB  = 1 << 6,
-   FTRACE_OPS_FL_INITIALIZED   = 1 << 7,
-   FTRACE_OPS_FL_DELETED   = 1 << 8,
-   FTRACE_OPS_FL_ADDING= 1 << 9,
-   FTRACE_OPS_FL_REMOVING  = 1 << 10,
-   FTRACE_OPS_FL_MODIFYING = 1 << 11,
-   FTRACE_OPS_FL_ALLOC_TRAMP   = 1 << 12,
-   FTRACE_OPS_FL_IPMODIFY  = 1 << 13,
-   FTRACE_OPS_FL_PID   = 1 << 14,
-   FTRACE_OPS_FL_RCU   = 1 << 15,
-   FTRACE_OPS_FL_TRACE_ARRAY   = 1 << 16,
+   FTRACE_OPS_FL_SAVE_REGS = 1 << 2,
+   FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED= 1 << 3,
+   FTRACE_OPS_FL_RECURSION_SAFE= 1 << 4,
+   FTRACE_OPS_FL_STUB  = 1 << 5,
+   FTRACE_OPS_FL_INITIALIZED   = 1 << 6,
+   FTRACE_OPS_FL_DELETED   = 1 << 7,
+   FTRACE_OPS_FL_ADDING= 1 << 8,
+   FTRACE_OPS_FL_REMOVING  = 1 << 9,
+   FTRACE_OPS_FL_MODIFYING = 1 << 10,
+   FTRACE_OPS_FL_ALLOC_TRAMP   = 1 << 11,
+   FTRACE_OPS_FL_IPMODIFY  = 1 << 12,
+   FTRACE_OPS_FL_PID   = 1 << 13,
+   FTRACE_OPS_FL_RCU   = 1 << 14,
+   FTRACE_OPS_FL_TRACE_ARRAY   = 1 << 15,
 };
 
 #ifdef CONFIG_DYNAMIC_FTRACE
@@ -198,7 +193,6 @@ struct ftrace_ops {
unsigned long   flags;
void*private;
ftrace_func_t   saved_func;
-   int __percpu*disabled;
 #ifdef CONFIG_DYNAMIC_FTRACE
struct ftrace_ops_hash  local_hash;
struct ftrace_ops_hash  *func_hash;
@@ -230,55 +224,6 @@ int register_ftrace_function(struct ftrace_ops *ops);
 int unregister_ftrace_function(struct ftrace_ops *ops);
 void clear_ftrace_function(void);
 
-/**
- * ftrace_function_local_enable - enable ftrace_ops on current cpu
- *
- * This function enables tracing on current cpu by decreasing
- * the per cpu control variable.
- * It must be called with preemption disabled and only on ftrace_ops
- * registered with FTRACE_OPS_FL_PER_CPU. If called without preemption
- * disabled, this_cpu_ptr will complain when CONFIG_DEBUG_PREEMPT is enabled.
- */
-static inline void ftrace_function_local_enable(struct ftrace_ops *ops)
-{
-   if (WARN_ON_ONCE(!(ops->flags & FTRACE_OPS_FL_PER_CPU)))
-   return;
-
-   (*this_cpu_ptr(ops->disabled))--;
-}
-
-/**
- * ftrace_function_local_disable - disable ftrace_ops on current cpu
- *
- * This function disables tracing on current cpu by increasing
- * the per cpu control variable.
- * It must be called with preemption disabled and only on ftrace_ops
- * registered with FTRACE_OPS_FL_PER_CPU. If called without preemption
- * disabled, this_cpu_ptr will complain when CONFIG_DEBUG_PREEMPT is enabled.
- */
-static inline void ftrace_function_local_disable(struct ftrace_ops *ops)
-{
-   if (WARN_ON_ONCE(!(ops->flags & FTRACE_OPS_FL_PER_CPU)))
-   return;
-
-   (*this_cpu_ptr(ops->disabled

Re: [PATCH] cgroup: reorder flexible array members of struct cgroup_root

2017-10-18 Thread Tejun Heo

Hello,

On Mon, Oct 16, 2017 at 11:33:21PM -0700, Nick Desaulniers wrote:
> When compiling arch/x86/boot/compressed/eboot.c with HOSTCC=clang, the
> following warning is observed:
> 
> ./include/linux/cgroup-defs.h:391:16: warning: field 'cgrp' with
> variable sized type 'struct cgroup' not at the end of a struct or class
> is a GNU extension [-Wgnu-variable-sized-type-not-at-end]
> struct cgroup cgrp;
>   ^
> Flexible array members are a C99 feature, but must be the last member of
> a struct. Structs with flexible members composed in other structs must
> also be the final members, unless using GNU C extensions.
> 
> struct cgroup_root's member cgrp is a struct cgroup, struct cgroup's
> member ancestor_ids is a flexible member.

This is silly tho.  We know the root group embedded there won't
have any ancestor_ids.  Also, in general, nothing prevents us from
doing something like the following.

struct outer_struct {
blah blah;
struct inner_struct_with_flexible_array_member inner;
unsigned long storage_for_flexible_array[NR_ENTRIES];
blah blah;
};

I think we should just silence the bogus warning.

Thanks.

-- 
tejun

Re: [PATCH 1/2] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default

2017-10-18 Thread Ingo Molnar


* Thomas Gleixner  wrote:

> On Wed, 18 Oct 2017, Byungchul Park wrote:
> >  #ifdef CONFIG_LOCKDEP_CROSSRELEASE
> > +#ifdef CONFIG_CROSSRELEASE_STACK_TRACE
> >  #define MAX_XHLOCK_TRACE_ENTRIES 5
> > +#else
> > +#define MAX_XHLOCK_TRACE_ENTRIES 1
> > +#endif
> >  
> >  /*
> >   * This is for keeping locks waiting for commit so that true dependencies
> > diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> > index e36e652..5c2ddf2 100644
> > --- a/kernel/locking/lockdep.c
> > +++ b/kernel/locking/lockdep.c
> > @@ -4863,8 +4863,13 @@ static void add_xhlock(struct held_lock *hlock)
> > xhlock->trace.nr_entries = 0;
> > xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
> > xhlock->trace.entries = xhlock->trace_entries;
> > +#ifdef CONFIG_CROSSRELEASE_STACK_TRACE
> > xhlock->trace.skip = 3;
> > save_stack_trace(&xhlock->trace);
> > +#else
> > +   xhlock->trace.nr_entries = 1;
> > +   xhlock->trace.entries[0] = hlock->acquire_ip;
> > +#endif
> 
> Hmm. Would it be possible to have this switchable at boot time via a
> command line parameter? So in case of a splat with no stack trace, one
> could just reboot and set something like 'lockdep_fullstack' on the kernel
> command line to get the full data without having to recompile the kernel.

Yeah, and I'd suggest keeping the Kconfig option to default-enable that boot 
option as well - i.e. let's have both.

Thanks,

Ingo

[for-next][PATCH 6/7] tracing, thermal: Hide devfreq trace events when not in use

2017-10-18 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

As trace events when defined create data structures and functions to
process them, defining trace events when not using them is a waste of
memory.

The trace events thermal_power_devfreq_get_power and
thermal_power_devfreq_limit are only used when CONFIG_DEVFREQ_THERMAL
is set. Make those events only defined when that is set as well.

Link: http://lkml.kernel.org/r/20171013102150.0050c...@gandalf.local.home

Acked-by: Javi Merino 
Signed-off-by: Steven Rostedt (VMware) 
---
 include/trace/events/thermal.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/trace/events/thermal.h b/include/trace/events/thermal.h
index 6cde5b3514c2..1fdacdb94e77 100644
--- a/include/trace/events/thermal.h
+++ b/include/trace/events/thermal.h
@@ -148,6 +148,7 @@ TRACE_EVENT(thermal_power_cpu_limit,
__entry->power)
 );
 
+#ifdef CONFIG_DEVFREQ_THERMAL
 TRACE_EVENT(thermal_power_devfreq_get_power,
TP_PROTO(struct thermal_cooling_device *cdev,
 struct devfreq_dev_status *status, unsigned long freq,
@@ -203,6 +204,7 @@ TRACE_EVENT(thermal_power_devfreq_limit,
__get_str(type), __entry->freq, __entry->cdev_state,
__entry->power)
 );
+#endif /* CONFIG_DEVFREQ_THERMAL */
 #endif /* _TRACE_THERMAL_H */
 
 /* This part must be outside protection */
-- 
2.13.2

[for-next][PATCH 4/7] perf/ftrace: Small cleanup

2017-10-18 Thread Steven Rostedt

From: Peter Zijlstra 

ops->flags _should_ be 0 at this point, so setting the flag using
bitwise or is a bit daft.

Link: http://lkml.kernel.org/r/20171011080224.315585...@infradead.org

Requested-by: Steven Rostedt 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Steven Rostedt (VMware) 
---
 kernel/trace/trace_event_perf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index e73f9ab15939..55d6dff37daf 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -363,7 +363,7 @@ static int perf_ftrace_function_register(struct perf_event *event)
 {
struct ftrace_ops *ops = &event->ftrace_ops;
 
-   ops->flags   |= FTRACE_OPS_FL_RCU;
+   ops->flags   = FTRACE_OPS_FL_RCU;
ops->func= perf_ftrace_function_call;
ops->private = (void *)(unsigned long)nr_cpu_ids;
 
-- 
2.13.2

Re: [PATCH v6 1/3] PCI: rockchip: Add support for pcie wake irq

2017-10-18 Thread Bjorn Helgaas

On Tue, Oct 17, 2017 at 06:03:14PM -0700, Brian Norris wrote:
> On Mon, Oct 16, 2017 at 03:03:50PM -0500, Bjorn Helgaas wrote:
> > On Sat, Oct 14, 2017 at 03:50:56AM +0800, Jeffy Chen wrote:
> > > Add support for PCIE_WAKE pin in rockchip pcie driver.
> > > 
> > > Signed-off-by: Jeffy Chen 
> > > ---
> > > 
> > > Changes in v6:
> > > Fix device_init_wake error handling, and add some comments.
> > > 
> > > Changes in v5:
> > > Rebase
> > > 
> > > Changes in v3:
> > > Fix error handling
> > > 
> > > Changes in v2:
> > > Use dev_pm_set_dedicated_wake_irq
> > > -- Suggested by Brian Norris 
> > > 
> > >  drivers/pci/host/pcie-rockchip.c | 27 +--
> > >  1 file changed, 21 insertions(+), 6 deletions(-)
> > > 
> > > > diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
> > > index 9051c6c8fea4..268513b6c9c4 100644
> > > --- a/drivers/pci/host/pcie-rockchip.c
> > > +++ b/drivers/pci/host/pcie-rockchip.c
> ...
> > > @@ -995,6 +996,17 @@ static int rockchip_pcie_setup_irq(struct 
> > > rockchip_pcie *rockchip)
> > >   return err;
> > >   }
> > >  
> > > + irq = platform_get_irq_byname(pdev, "wakeup");
> > > + if (irq >= 0) {
> > > + /* Must init wakeup before setting dedicated wakeup irq. */
> > > + device_init_wakeup(dev, true);
> > > + err = dev_pm_set_dedicated_wake_irq(dev, irq);
> > > + if (err) {
> > > + dev_err(dev, "failed to setup PCIe wakeup IRQ\n");
> > > + device_init_wakeup(dev, false);
> > > + }
> > > + }
> > 
> > There's nothing Rockchip-specific here, so I'm hoping you can explore
> > putting this support in the PCI core, so any system that describes the
> > WAKE# connection in the DT can benefit.
> 
> I guess it could work to look into pci_create_root_bus(), and
> do something like the following?
> 
>   if (IS_ENABLED(CONFIG_OF) && parent && parent->of_node)
>   ... do OF parsing for generic features like WAKE# ...

That's exactly the sort of thing I was thinking.

[for-next][PATCH 3/7] perf/ftrace: Fix function trace events

2017-10-18 Thread Steven Rostedt

From: Peter Zijlstra 

The function-trace <-> perf interface is a tad messed up. Where all
the other trace <-> perf interfaces use a single trace hook
registration and use per-cpu RCU based hlist to iterate the events,
function-trace actually needs multiple hook registrations in order to
minimize function entry patching when filters are present.

The end result is that we iterate events both on the trace hook and on
the hlist, which results in reporting events multiple times.

Since function-trace cannot use the regular scheme, fix it the other
way around, use singleton hlists.

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Steven Rostedt (VMware) 
---
 include/linux/trace_events.h|  5 +++
 kernel/trace/trace_event_perf.c | 80 +
 2 files changed, 54 insertions(+), 31 deletions(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index a6349b76fd39..ca4e67e466a7 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -173,6 +173,11 @@ enum trace_reg {
TRACE_REG_PERF_UNREGISTER,
TRACE_REG_PERF_OPEN,
TRACE_REG_PERF_CLOSE,
+   /*
+* These (ADD/DEL) use a 'boolean' return value, where 1 (true) means a
+* custom action was taken and the default action is not to be
+* performed.
+*/
TRACE_REG_PERF_ADD,
TRACE_REG_PERF_DEL,
 #endif
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 562fa69df5d3..e73f9ab15939 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -240,27 +240,41 @@ void perf_trace_destroy(struct perf_event *p_event)
 int perf_trace_add(struct perf_event *p_event, int flags)
 {
struct trace_event_call *tp_event = p_event->tp_event;
-   struct hlist_head __percpu *pcpu_list;
-   struct hlist_head *list;
-
-   pcpu_list = tp_event->perf_events;
-   if (WARN_ON_ONCE(!pcpu_list))
-   return -EINVAL;
 
if (!(flags & PERF_EF_START))
p_event->hw.state = PERF_HES_STOPPED;
 
-   list = this_cpu_ptr(pcpu_list);
-   hlist_add_head_rcu(&p_event->hlist_entry, list);
+   /*
+* If TRACE_REG_PERF_ADD returns false; no custom action was performed
+* and we need to take the default action of enqueueing our event on
+* the right per-cpu hlist.
+*/
+   if (!tp_event->class->reg(tp_event, TRACE_REG_PERF_ADD, p_event)) {
+   struct hlist_head __percpu *pcpu_list;
+   struct hlist_head *list;
+
+   pcpu_list = tp_event->perf_events;
+   if (WARN_ON_ONCE(!pcpu_list))
+   return -EINVAL;
+
+   list = this_cpu_ptr(pcpu_list);
+   hlist_add_head_rcu(&p_event->hlist_entry, list);
+   }
 
-   return tp_event->class->reg(tp_event, TRACE_REG_PERF_ADD, p_event);
+   return 0;
 }
 
 void perf_trace_del(struct perf_event *p_event, int flags)
 {
struct trace_event_call *tp_event = p_event->tp_event;
-   hlist_del_rcu(&p_event->hlist_entry);
-   tp_event->class->reg(tp_event, TRACE_REG_PERF_DEL, p_event);
+
+   /*
+* If TRACE_REG_PERF_DEL returns false; no custom action was performed
+* and we need to take the default action of dequeueing our event from
+* the right per-cpu hlist.
+*/
+   if (!tp_event->class->reg(tp_event, TRACE_REG_PERF_DEL, p_event))
+   hlist_del_rcu(&p_event->hlist_entry);
 }
 
 void *perf_trace_buf_alloc(int size, struct pt_regs **regs, int *rctxp)
@@ -307,14 +321,24 @@ perf_ftrace_function_call(unsigned long ip, unsigned long parent_ip,
  struct ftrace_ops *ops, struct pt_regs *pt_regs)
 {
struct ftrace_entry *entry;
-   struct hlist_head *head;
+   struct perf_event *event;
+   struct hlist_head head;
struct pt_regs regs;
int rctx;
 
-   head = this_cpu_ptr(event_function.perf_events);
-   if (hlist_empty(head))
+   if ((unsigned long)ops->private != smp_processor_id())
return;
 
+   event = container_of(ops, struct perf_event, ftrace_ops);
+
+   /*
+* @event->hlist entry is NULL (per INIT_HLIST_NODE), and all
+* the perf code does is hlist_for_each_entry_rcu(), so we can
+* get away with simply setting the @head.first pointer in order
+* to create a singular list.
+*/
+   head.first = &event->hlist_entry;
+
 #define ENTRY_SIZE (ALIGN(sizeof(struct ftrace_entry) + sizeof(u32), \
sizeof(u64)) - sizeof(u32))
 
@@ -330,7 +354,7 @@ perf_ftrace_function_call(unsigned long ip, unsigned long parent_ip,
entry->ip = ip;
entry->parent_ip = parent_ip;
perf_trace_buf_submit(entry, ENTRY_SIZE, rctx, TRACE_FN,
- 1, &regs, head, NULL);
+ 1, &regs, &head, NULL);
 
 #undef ENTRY_SIZE
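
The singleton-list idea in perf_ftrace_function_call() can be tried outside the kernel. The sketch below is only an illustration: `struct node`, `struct list_head`, `count_entries()` and `count_singleton()` are simplified stand-ins invented here, not the kernel's hlist API. It shows that a stack-local head whose `first` pointer refers to one unlinked node is visited exactly once by a plain forward walk.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel's hlist types (illustrative only). */
struct node {
	struct node *next;
};

struct list_head {
	struct node *first;
};

/* Walk the list the way a plain forward iterator would. */
size_t count_entries(const struct list_head *head)
{
	size_t n = 0;
	const struct node *pos;

	for (pos = head->first; pos != NULL; pos = pos->next)
		n++;
	return n;
}

/* Build a one-element list on the stack around an unlinked node. */
size_t count_singleton(struct node *entry)
{
	struct list_head head;

	entry->next = NULL;	/* stands in for the INIT_HLIST_NODE() state */
	head.first = entry;
	return count_entries(&head);
}
```

Because the head lives on the stack, nothing has to be added to or removed from a shared per-cpu list for the duration of the call.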

Re: [PATCH] static_key: Improve uninizialized key warning

2017-10-18 Thread Steven Rostedt

On Wed, 18 Oct 2017 15:19:34 +0200
Ingo Molnar  wrote:

> * Borislav Petkov  wrote:
> 
> > but it doesn't tell me which key it is. So dump its address too:
> > 
> >   static_key_disable_cpuslocked, key 81c32680 used before call to jump_label_init
> 
> Is it possible to print out a symbol instead of an absolute address - does
> that work for data symbols?

It should.

Boris, can you try it with "%pS" ?

-- Steve

Re: [PATCH net 0/3] Fix for BPF devmap percpu allocation splat

2017-10-18 Thread Tejun Heo

Hello, Daniel.

(cc'ing Dennis)

On Tue, Oct 17, 2017 at 04:55:51PM +0200, Daniel Borkmann wrote:
> The set fixes a splat in devmap percpu allocation when we alloc
> the flush bitmap. Patch 1 is a prerequisite for the fix in patch 2,
> patch 1 is rather small, so if this could be routed via -net, for
> example, with Tejun's Ack that would be good. Patch 3 gets rid of
> remaining PCPU_MIN_UNIT_SIZE checks, which are percpu allocator
> internals and should not be used.
> 
> Thanks!
> 
> Daniel Borkmann (3):
>   mm, percpu: add support for __GFP_NOWARN flag

This looks fine.

>   bpf: fix splat for illegal devmap percpu allocation
>   bpf: do not test for PCPU_MIN_UNIT_SIZE before percpu allocations

These look okay too but if it helps percpu allocator can expose the
maximum size / alignment supported to take out the guessing game too.

Also, the reason why PCPU_MIN_UNIT_SIZE is what it is is because
nobody needed anything bigger.  Increasing the size doesn't really
cost much at least on 64bit archs.  Is that something we want to be
considering?

Thanks.

-- 
tejun

Re: [RFC PATCH] can: m_can: Support higher speed CAN-FD bitrates

2017-10-18 Thread Sekhar Nori

Hi Marc,

On Wednesday 18 October 2017 06:14 PM, Marc Kleine-Budde wrote:
> On 09/21/2017 02:48 AM, Franklin S Cooper Jr wrote:
>>
>>
>> On 09/20/2017 04:37 PM, Mario Hüttel wrote:
>>>
>>>
>>> On 09/20/2017 10:19 PM, Franklin S Cooper Jr wrote:
>>>> Hi Wenyou,
>>>>
>>>> On 09/17/2017 10:47 PM, Yang, Wenyou wrote:
>>>>>
>>>>> On 2017/9/14 13:06, Sekhar Nori wrote:
>>>>>> On Thursday 14 September 2017 03:28 AM, Franklin S Cooper Jr wrote:
>>>>>>> On 08/18/2017 02:39 PM, Franklin S Cooper Jr wrote:
>>>>>>>> During test transmitting using CAN-FD at high bitrates (4 Mbps) only
>>>>>>>> resulted in errors. Scoping the signals I noticed that only a single
>>>>>>>> bit
>>>>>>>> was being transmitted and with a bit more investigation realized the
>>>>>>>> actual
>>>>>>>> MCAN IP would go back to initialization mode automatically.
>>>>>>>>
>>>>>>>> It appears this issue is due to the MCAN needing to use the Transmitter
>>>>>>>> Delay Compensation Mode as defined in the MCAN User's Guide. When this
>>>>>>>> mode is used the User's Guide indicates that the Transmitter Delay
>>>>>>>> Compensation Offset register should be set. The document mentions
>>>>>>>> that this
>>>>>>>> register should be set to (1/dbitrate)/2*(Func Clk Freq).
>>>>>>>>
>>>>>>>> Additional CAN-CIA's "Bit Time Requirements for CAN FD" document
>>>>>>>> indicates
>>>>>>>> that this TDC mode is only needed for data bit rates above 2.5 Mbps.
>>>>>>>> Therefore, only enable this mode and only set TDCO when the data bit
>>>>>>>> rate
>>>>>>>> is above 2.5 Mbps.
>>>>>>>>
>>>>>>>> Signed-off-by: Franklin S Cooper Jr 
>>>>>>>> ---
>>>>>>>> I'm pretty surprised that this hasn't been implemented already since
>>>>>>>> the primary purpose of CAN-FD is to go beyond 1 Mbps and the MCAN IP
>>>>>>>> supports up to 10 Mbps.
>>>>>>>>
>>>>>>>> So it will be nice to get comments from users of this driver to
>>>>>>>> understand
>>>>>>>> if they have been able to use CAN-FD beyond 2.5 Mbps without this
>>>>>>>> patch.
>>>>>>>> If they haven't what did they do to get around it if they needed higher
>>>>>>>> speeds.
>>>>>>>>
>>>>>>>> Meanwhile I plan on testing this using a more "realistic" CAN bus to
>>>>>>>> insure
>>>>>>>> everything still works at 5 Mbps which is the max speed of my CAN
>>>>>>>> transceiver.
>>>>>>> ping. Anyone has any thoughts on this?
>>>>>> I added Dong who authored the m_can driver and Wenyou who added the only
>>>>>> in-kernel user of the driver for any help.
>>>>> I tested it on SAMA5D2 Xplained board both with and without this patch,
>>>>> both work with the 4M bps data bit rate.
>>>> Thank you for testing this out. Its interesting that you have been able
>>>> to use higher speeds without this patch. What is the CAN transceiver
>>>> being used on the SAMA5D2 Xplained board? I tried looking at the
>>>> schematic but it seems the CAN signals are used on an extension board
>>>> which I can't find the schematic for. Also do you mind sharing your test
>>>> setup? Were you doing a short point to point test?
>>>>
>>>> Thank You,
>>>> Franklin
>>> Hello Franklin,
>>>
>>> your patch definitely makes sense.
>>>
>>> I forgot the TDC in my patches because it was not present in the
>>> previous driver versions and because I didn't encounter any
>>> problems when testing it myself.
>>>
>>> The error is highly dependent on the hardware (transceiver) setup.
>>> So it is definitely possible that some people don't encounter errors
>>> without your patch.
>>
>> So the Transmission Delay Compensation feature Value register is supposed
>> to take into consideration the transceiver delay automatically and add
>> the value of TDCO on top of that. So why would TDCO be dependent on the
>> transceiver? I've heard conflicting things regarding TDC so any
>> clarification on what actually impacts it would be appreciated.
>>
>> Also part of the issue I'm having is how can we properly configure TDCO?
>> Configuring TDCO is essentially figuring out what Secondary Sample Point
>> to use. However, it is unclear what value to set SSP to and in which use
>> cases a given SSP does or doesn't work. I've seen various
>> recommendations from Bosch on choosing SSP but ultimately it seems they
>> suggest "real world testing" to come up with a proper value. Not
>> setting TDCO causes problems for my device and improperly setting TDCO
>> causes problems for my device. So it's likely any value I use could end
>> up breaking something for someone else.
>>
>> Currently I am leaning toward a DT property that can be used for setting SSP.
>> Perhaps use a generic default value and allow individuals to override it
>> via DT?
> 
> Sounds reasonable. What's the status of this series?

I have had some offline discussions with Franklin on this, and I am not
fully convinced that DT is the way to go here (although Franklin and I
have not reached agreement on that).
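
For reference, the TDCO formula quoted earlier in the thread, (1/dbitrate)/2*(Func Clk Freq), reduces to half the number of functional clock cycles in one data bit. The sketch below only works through that arithmetic, together with the 2.5 Mbps threshold mentioned in the original patch description. The function names, the macro and the 40 MHz clock in the usage note are assumptions made for illustration; none of this is m_can driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold cited in the thread: TDC is only needed above 2.5 Mbps. */
#define TDC_MIN_DBITRATE_BPS 2500000u

/* Hypothetical helper: does this data bit rate call for TDC at all? */
bool tdc_needed(uint32_t dbitrate_bps)
{
	return dbitrate_bps > TDC_MIN_DBITRATE_BPS;
}

/*
 * Hypothetical helper: (1 / dbitrate) / 2 * fclk, rearranged to
 * fclk / (2 * dbitrate) so that it stays in integer arithmetic.
 * The result is a count of functional clock periods.
 */
uint32_t tdco_from_formula(uint32_t fclk_hz, uint32_t dbitrate_bps)
{
	if (dbitrate_bps == 0)
		return 0;
	return (uint32_t)((uint64_t)fclk_hz / (2u * (uint64_t)dbitrate_bps));
}
```

With an assumed 40 MHz functional clock, a 4 Mbps data phase has 10 clock periods per bit, so the formula yields 5. Whether that midpoint is the right secondary sample point for a given transceiver is exactly the open question in this thread.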

There are two components in configuring the secondary sample point. It
is the transceiver loopback delay and an offset (example half of the

[tip:locking/core] locking/arch, powerpc/rtas: Use arch_spin_lock() instead of arch_spin_lock_flags()

2017-10-18 Thread tip-bot for Will Deacon

Commit-ID:  58788a9b6060890e481c8111fac43d065560ebcb
Gitweb: https://git.kernel.org/tip/58788a9b6060890e481c8111fac43d065560ebcb
Author: Will Deacon 
AuthorDate: Wed, 18 Oct 2017 12:51:09 +0100
Committer:  Ingo Molnar 
CommitDate: Wed, 18 Oct 2017 15:15:07 +0200

locking/arch, powerpc/rtas: Use arch_spin_lock() instead of arch_spin_lock_flags()

arch_spin_lock_flags() is an internal part of the spinlock implementation
and is no longer available when SMP=n and DEBUG_SPINLOCK=y, so the PPC
RTAS code fails to compile in this configuration:

   arch/powerpc/kernel/rtas.c: In function 'lock_rtas':
>> arch/powerpc/kernel/rtas.c:81:2: error: implicit declaration of function 'arch_spin_lock_flags' [-Werror=implicit-function-declaration]
 arch_spin_lock_flags(&rtas.lock, flags);
 ^~~~

Since there's no good reason to use arch_spin_lock_flags() here (the code
in question already calls local_irq_save(flags)), switch it over to
arch_spin_lock and get things building again.

Reported-by: kbuild test robot 
Signed-off-by: Will Deacon 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/1508327469-20231-1-git-send-email-will.dea...@arm.com
Signed-off-by: Ingo Molnar 
---
 arch/powerpc/kernel/rtas.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 1643e9e..3f1c4fc 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -78,7 +78,7 @@ static unsigned long lock_rtas(void)
 
local_irq_save(flags);
preempt_disable();
-   arch_spin_lock_flags(&rtas.lock, flags);
+   arch_spin_lock(&rtas.lock);
return flags;
 }

[PATCH] perf vendor events: Add Goldmont Plus V1 event file

2017-10-18 Thread kan . liang

From: Kan Liang 

Add an Intel event file for perf.

Signed-off-by: Kan Liang 
---
 .../pmu-events/arch/x86/goldmontplus/cache.json| 1453 
 .../pmu-events/arch/x86/goldmontplus/frontend.json |   62 +
 .../pmu-events/arch/x86/goldmontplus/memory.json   |   38 +
 .../pmu-events/arch/x86/goldmontplus/other.json|   98 ++
 .../pmu-events/arch/x86/goldmontplus/pipeline.json |  544 
 .../arch/x86/goldmontplus/virtual-memory.json  |  218 +++
 tools/perf/pmu-events/arch/x86/mapfile.csv |1 +
 7 files changed, 2414 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/x86/goldmontplus/cache.json
 create mode 100644 tools/perf/pmu-events/arch/x86/goldmontplus/frontend.json
 create mode 100644 tools/perf/pmu-events/arch/x86/goldmontplus/memory.json
 create mode 100644 tools/perf/pmu-events/arch/x86/goldmontplus/other.json
 create mode 100644 tools/perf/pmu-events/arch/x86/goldmontplus/pipeline.json
 create mode 100644 tools/perf/pmu-events/arch/x86/goldmontplus/virtual-memory.json

diff --git a/tools/perf/pmu-events/arch/x86/goldmontplus/cache.json b/tools/perf/pmu-events/arch/x86/goldmontplus/cache.json
new file mode 100644
index 0000000..b4791b4
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/goldmontplus/cache.json
@@ -0,0 +1,1453 @@
+[
+{
+"CollectPEBSRecord": "1",
+"PublicDescription": "Counts memory requests originating from the core that miss in the L2 cache.",
+"EventCode": "0x2E",
+"Counter": "0,1,2,3",
+"UMask": "0x41",
+"PEBScounters": "0,1,2,3",
+"EventName": "LONGEST_LAT_CACHE.MISS",
+"PDIR_COUNTER": "na",
+"SampleAfterValue": "23",
+"BriefDescription": "L2 cache request misses"
+},
+{
+"CollectPEBSRecord": "1",
+"PublicDescription": "Counts memory requests originating from the core that reference a cache line in the L2 cache.",
+"EventCode": "0x2E",
+"Counter": "0,1,2,3",
+"UMask": "0x4f",
+"PEBScounters": "0,1,2,3",
+"EventName": "LONGEST_LAT_CACHE.REFERENCE",
+"PDIR_COUNTER": "na",
+"SampleAfterValue": "23",
+"BriefDescription": "L2 cache requests"
+},
+{
+"CollectPEBSRecord": "1",
+"PublicDescription": "Counts the number of demand and prefetch transactions that the L2 XQ rejects due to a full or near full condition which likely indicates back pressure from the intra-die interconnect (IDI) fabric. The XQ may reject transactions from the L2Q (non-cacheable requests), L2 misses and L2 write-back victims.",
+"EventCode": "0x30",
+"Counter": "0,1,2,3",
+"UMask": "0x0",
+"PEBScounters": "0,1,2,3",
+"EventName": "L2_REJECT_XQ.ALL",
+"PDIR_COUNTER": "na",
+"SampleAfterValue": "23",
+"BriefDescription": "Requests rejected by the XQ"
+},
+{
+"CollectPEBSRecord": "1",
+"PublicDescription": "Counts the number of demand and L1 prefetcher requests rejected by the L2Q due to a full or nearly full condition which likely indicates back pressure from L2Q. It also counts requests that would have gone directly to the XQ, but are rejected due to a full or nearly full condition, indicating back pressure from the IDI link. The L2Q may also reject transactions from a core to insure fairness between cores, or to delay a core's dirty eviction when the address conflicts with incoming external snoops.",
+"EventCode": "0x31",
+"Counter": "0,1,2,3",
+"UMask": "0x0",
+"PEBScounters": "0,1,2,3",
+"EventName": "CORE_REJECT_L2Q.ALL",
+"PDIR_COUNTER": "na",
+"SampleAfterValue": "23",
+"BriefDescription": "Requests rejected by the L2Q"
+},
+{
+"CollectPEBSRecord": "1",
+"PublicDescription": "Counts when a modified (dirty) cache line is evicted from the data L1 cache and needs to be written back to memory.  No count will occur if the evicted line is clean, and hence does not require a writeback.",
+"EventCode": "0x51",
+"Counter": "0,1,2,3",
+"UMask": "0x1",
+"PEBScounters": "0,1,2,3",
+"EventName": "DL1.REPLACEMENT",
+"PDIR_COUNTER": "na",
+"SampleAfterValue": "23",
+"BriefDescription": "L1 Cache evictions for dirty data"
+},
+{
+"CollectPEBSRecord": "1",
+"PublicDescription": "Counts cycles that fetch is stalled due to an outstanding ICache miss. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes due to an ICache miss.  Note: this event is not the same as the total number of cycles spent retrieving instruction cache lines from the memory hierarchy.",
+"EventCode": "0x86",
+"Counter": "0,1,2,3",
+"UMask": "0x2",
+"PEBScounters": "0,1,2,3",
+"EventName": "FETCH_STALL.ICACHE_FILL_PENDING_CYCLES",
+

Re: [PATCH 1/2] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default

2017-10-18 Thread Thomas Gleixner

On Wed, 18 Oct 2017, Byungchul Park wrote:
>  #ifdef CONFIG_LOCKDEP_CROSSRELEASE
> +#ifdef CONFIG_CROSSRELEASE_STACK_TRACE
>  #define MAX_XHLOCK_TRACE_ENTRIES 5
> +#else
> +#define MAX_XHLOCK_TRACE_ENTRIES 1
> +#endif
>  
>  /*
>   * This is for keeping locks waiting for commit so that true dependencies
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index e36e652..5c2ddf2 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -4863,8 +4863,13 @@ static void add_xhlock(struct held_lock *hlock)
>   xhlock->trace.nr_entries = 0;
>   xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
>   xhlock->trace.entries = xhlock->trace_entries;
> +#ifdef CONFIG_CROSSRELEASE_STACK_TRACE
>   xhlock->trace.skip = 3;
>   save_stack_trace(&xhlock->trace);
> +#else
> + xhlock->trace.nr_entries = 1;
> + xhlock->trace.entries[0] = hlock->acquire_ip;
> +#endif

Hmm. Would it be possible to have this switchable at boot time via a
command line parameter? So in case of a splat with no stack trace, one
could just reboot and set something like 'lockdep_fullstack' on the kernel
command line to get the full data without having to recompile the kernel.

Thanks,

tglx

Re: [PATCH v2 0/8] ARM: sun8i: a83t: Enable AXP813/AXP818 regulators

2017-10-18 Thread Maxime Ripard

On Wed, Oct 18, 2017 at 04:31:30PM +0800, Chen-Yu Tsai wrote:
> Hi everyone,
> 
> This series was originally named "regulator: axp20x: Add support for
> AXP813/818 regulators". It adds support for the X-Powers AXP813/818 [1]
> PMICs' regulators. The series is quite straightforward.
> 
> Changes since v1:
> 
>   - Regulator driver patches were merged and now dropped from the series
> 
>   - Chose simpler names for the regulators
> 
>   - Added SDIO WiFi enablement patches
> 
> Patch 1 adds an axp20x-regulator cell for AXP813, thereby enabling the
> regulators.
> 
> Patch 2 adds a shared dtsi file for the PMIC. This currently contains
> a list of regulator nodes, but will be expanded with Quentin's power
> supply work.
> 
> Patches 3 through 5 add regulator nodes to board dts files for the A83T
> boards that I have. They are not squashed together as each file has
> substantial additions.
> 
> Patch 6 moves the mmc1 pinmux setting over to the dtsi, and sets it by
> default.
> 
> Patches 7 & 8 enable SDIO-based WiFi on the Cubietruck Plus and Banana
> Pi M3.
> 
> Originally my work also included enabling Ethernet. But the Ethernet
> bindings were reverted. Everything can be found here:
> 
> https://github.com/wens/linux/tree/a83t-regulator-wifi-eth
> 
> Please have a look.
> 
> Lee, we need the mfd changes merged in before merging the dts changes.
> Otherwise, mmc would break as vmmc/vqmmc is tied to the PMIC regulators.

Acked-by: Maxime Ripard 

For the whole series.

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com



Re: [PATCH] video: fbdev: remove dead igafb driver

2017-10-18 Thread John Paul Adrian Glaubitz


Hi Bartlomiej!

On 10/18/2017 02:56 PM, Bartlomiej Zolnierkiewicz wrote:

> igafb driver hasn't compiled since at least kernel v2.6.34 as
> commit 6016a363f6b5 ("of: unify phandle name in struct device_node")
> missed updating igafb.c to use dp->phandle instead of dp->node.

Would it take a lot of work to port the driver to the new interface?

I'm not sure which SPARC machines use this particular framebuffer, but
my plans are to fix up all these old framebuffer drivers. I have already
received several Amiga (Zorro) graphics cards for testing the updated
drivers on Amiga.

It could be that I actually have this particular SPARC framebuffer in
my hardware collection.

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913

Re: libbattery was Re: [RFC PATCH 5/5] power: generic-adc-battery: Add capacity handling

2017-10-18 Thread Tony Lindgren

* H. Nikolaus Schaller  [171018 05:49]:
> > Am 18.10.2017 um 14:28 schrieb Pavel Machek :
> > 
> > So I started something, it is at.
> > 
> > https://github.com/pavelmachek/libbattery
> > 
> > My battery on n900 is currently uncalibrated (and charging), still it
> > gets some kind of estimation:
> > 
> > Battery -1 %
> > Seconds -1
> > State 1
> > Voltage 3.88 V
> > Battery 63 %
> > 
> > Of course, there's a lot more work to be done.
> 
> Nice start but not a solution to our problem.
> 
> Our problem is that people simply expect that for example 
> https://packages.debian.org/wheezy/xfce/xfce4-battery-plugin
> displays the battery percentage.

I think we could make things compatible with various battery apps by
having libbattery write back the capacity percentage and time remaining
to the kernel driver via sysfs or a dev entry. Then the kernel interface
can just display the data to whatever apps.

Regards,

Tony

Re: [PATCH v6 5/5] arm: dts: stm32: remove useless clocksource nodes

2017-10-18 Thread Rob Herring

On Wed, Oct 18, 2017 at 7:58 AM, Benjamin Gaignard
 wrote:
> 16 bits timers aren't accurate enough to be used as
> clocksource, remove them from stm32f4 and stm32f7 devicetree.

They aren't useful for anything? Zephyr? u-boot?

Rob

Re: [PATCH v5 2/6] perf: hisi: Add support for HiSilicon SoC uncore PMU driver

2017-10-18 Thread Zhangshaokun

Hi Mark,

Thanks for your comments.

On 2017/10/17 23:06, Mark Rutland wrote:
> Hi,
> 
> Apologies for the delay for this review.
> 
> Largely this seems to look OK, but there are a couple of things which
> stick out.
> 
> On Tue, Aug 22, 2017 at 04:07:53PM +0800, Shaokun Zhang wrote:
>> +int hisi_uncore_pmu_event_init(struct perf_event *event)
>> +{
>> +struct hw_perf_event *hwc = &event->hw;
>> +struct hisi_pmu *hisi_pmu;
>> +
>> +if (event->attr.type != event->pmu->type)
>> +return -ENOENT;
>> +
>> +/*
>> + * We do not support sampling as the counters are all
>> + * shared by all CPU cores in a CPU die(SCCL). Also we
>> + * do not support attach to a task(per-process mode)
>> + */
>> +if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK)
>> +return -EOPNOTSUPP;
>> +
>> +/* counters do not have these bits */
>> +if (event->attr.exclude_user ||
>> +event->attr.exclude_kernel ||
>> +event->attr.exclude_host ||
>> +event->attr.exclude_guest ||
>> +event->attr.exclude_hv ||
>> +event->attr.exclude_idle)
>> +return -EINVAL;
>> +
>> +/*
>> + *  The uncore counters not specific to any CPU, so cannot
>> + *  support per-task
>> + */
>> +if (event->cpu < 0)
>> +return -EINVAL;
>> +
>> +/*
>> + * Validate if the events in group does not exceed the
>> + * available counters in hardware.
>> + */
>> +if (!hisi_validate_event_group(event))
>> +return -EINVAL;
>> +
>> +/*
>> + * We don't assign an index until we actually place the event onto
>> + * hardware. Use -1 to signify that we haven't decided where to put it
>> + * yet.
>> + */
>> +hwc->idx = -1;
>> +hwc->config_base = event->attr.config;
> 
> Are all event codes valid?
> 

No, some event codes are invalid for different PMUs.

> e.g. is it possible that some value passed by the user would cause a
> problem were it written to the hardware?
> 
> I see that you only use the low 8 bits of the config field elsewhere, so
> it might make sense to sanity check that here rather than having to mask
> it elsewhere.

OK, I will add this check; thanks for the suggestion.
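
A minimal sketch of the kind of sanity check being discussed, written as plain userspace C so it can be tried in isolation. It assumes, per the review comment, that only the low 8 bits of the config field carry the event code. The names `HISI_EVENT_CODE_MAX` and `hisi_check_event_code()` are hypothetical and do not come from the posted patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical limit: the event code is assumed to occupy the low 8 bits. */
#define HISI_EVENT_CODE_MAX 0xffu

/*
 * Hypothetical helper: reject any config value that has bits set
 * outside the event-code field, so callers never need to mask later.
 */
int hisi_check_event_code(uint64_t config)
{
	if (config > HISI_EVENT_CODE_MAX)
		return -EINVAL;
	return 0;
}
```

Rejecting out-of-range values up front, rather than masking them silently, keeps the upper bits free for later extension because no existing user can depend on them being ignored.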

> 
> That would make future extension safer, since no-one could be relying on
> passing a dodgy value in.
> 
>> +
>> +hisi_pmu = to_hisi_pmu(event->pmu);
>> +/* Enforce to use the same CPU for all events in this PMU */
>> +event->cpu = hisi_pmu->on_cpu;
> 
> I think you need to check hisi_pmu->on_cpu != -1, otherwise we can
> accidentally create a task-bound event if a cluster is offline, and I'm
> not sure how the perf core code would handle here.
> 

Ok.

>> +
>> +return 0;
>> +}
> 
> [...]
> 
>> +int hisi_uncore_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
>> +{
>> +struct hisi_pmu *hisi_pmu;
>> +
>> +hisi_pmu = hlist_entry_safe(node, struct hisi_pmu, node);
>> +
>> +/*
>> + * If the CPU is associated with the PMU, set it in online_cpus of
>> + * the PMU.
>> + */
>> +if (cpumask_test_cpu(cpu, &hisi_pmu->associated_cpus))
>> +cpumask_set_cpu(cpu, &hisi_pmu->online_cpus);
>> +else
>> +return 0;
> 
> This would be a bit nicer as:
> 
>   if (!cpumask_test_cpu(cpu, &hisi_pmu->associated_cpus))
>   return 0;
> 
>   cpumask_set_cpu(cpu, &hisi_pmu->online_cpus);
> 
> 
> However, I don't think you need hisi_pmu::online_cpus. That's only used
> for the online/offline callbacks, and you can use the
> hisi_pmu::associated_cpus mask in hisi_uncore_pmu_offline_cpu(), and
> avoid altering any mask here.
> 

OK, I shall remove this unnecessary member.
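
One possible reading of the suggestion above, sketched in userspace C: when the owning CPU goes offline, pick a replacement from the CPUs that are both associated with the PMU and currently online, excluding the departing CPU. Plain 64-bit masks stand in for cpumask_t here, and `pick_new_owner()` is a hypothetical name. The kernel's cpumask_any_but() may return any matching CPU, while this sketch returns the lowest-numbered one.

```c
#include <stdint.h>

/*
 * Hypothetical helper: one bit per CPU, CPUs 0..63 only.
 * Returns the lowest CPU set in both masks other than 'cpu',
 * or -1 when no such CPU exists.
 */
int pick_new_owner(uint64_t associated, uint64_t online, unsigned int cpu)
{
	uint64_t candidates = associated & online;
	int i;

	if (cpu < 64)
		candidates &= ~(UINT64_C(1) << cpu);

	for (i = 0; i < 64; i++)
		if (candidates & (UINT64_C(1) << i))
			return i;
	return -1;
}
```

A result of -1 corresponds to the case where the whole cluster is going offline and the PMU is left without an owner.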

>> +
>> +/* If another CPU is already managing this PMU, simply return. */
>> +if (hisi_pmu->on_cpu != -1)
>> +return 0;
>> +
>> +/* Use this CPU in cpumask for event counting */
>> +hisi_pmu->on_cpu = cpu;
>> +
>> +/* Overflow interrupt also should use the same CPU */
>> +WARN_ON(irq_set_affinity(hisi_pmu->irq, cpumask_of(cpu)));
>> +
>> +return 0;
>> +}
>> +
>> +int hisi_uncore_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
>> +{
>> +struct hisi_pmu *hisi_pmu;
>> +cpumask_t pmu_online_cpus;
>> +unsigned int target;
>> +
>> +hisi_pmu = hlist_entry_safe(node, struct hisi_pmu, node);
>> +
>> +/*
>> + * If the CPU is online with the PMU, clear it in online_cpus of
>> + * the PMU.
>> + */
>> +if (!cpumask_test_and_clear_cpu(cpu, &hisi_pmu->online_cpus) ||
>> +(hisi_pmu->on_cpu != cpu))
>> +return 0;
>> +
>> +hisi_pmu->on_cpu = -1;
>> +
>> +/* Any other CPU associated with the PMU is still online */
>> +cpumask_and(&pmu_online_cpus, &hisi_pmu->online_cpus, cpu_online_mask);
>> +target = cpumask_any_but(&pmu_online_cpus, cpu);
>> +if (target >= nr_cpu_ids)
>> +return 0;
> 
> I think

Re: [PATCH] static_key: Improve uninizialized key warning

2017-10-18 Thread Ingo Molnar

* Borislav Petkov  wrote:

> but it doesn't tell me which key it is. So dump its address too:
> 
>   static_key_disable_cpuslocked, key 81c32680 used before call to jump_label_init

Is it possible to print out a symbol instead of an absolute address - does that
work for data symbols?

Thanks,

Ingo

Re: [lkp-robot] [x86/topology] 379a4bb988: dmesg.WARNING:at_arch/x86/events/intel/uncore.c:#uncore_change_type_ctx

2017-10-18 Thread Prarit Bhargava



On 10/12/2017 02:44 AM, kernel test robot wrote:
>  bin/lkp install job.yaml

[root@hpe-dl385gen10-02 lkp-tests]# ls
allot        daemon  filters       include   Makefile  pkg        repo    spec
bin          distro  Gemfile       jobs      monitors  plot       rootfs  stats
cluster      doc     Gemfile.lock  lib       pack      Rakefile   sbin    tests
_config.yml  etc     hosts         lkp-exec  params    README.md  setup   tools
[root@hpe-dl385gen10-02 lkp-tests]#  bin/lkp install ../job.yaml
Not a supported system, cannot install packages.
[root@hpe-dl385gen10-02 lkp-tests]#

Well that's useful.

P.

Re: [PATCH] x86, syscalls: use SYSCALL_DEFINE() macros for sys_modify_ldt()

2017-10-18 Thread Ingo Molnar


* Dave Hansen  wrote:

> 
> We do not have tracepoints for sys_modify_ldt() because we define
> it directly instead of using the normal SYSCALL_DEFINEx() macros.
> 
> However, there is a reason sys_modify_ldt() does not use the macros:
> it has an 'int' return type instead of 'unsigned long'.  This is
> a bug, but it's a bug cemented in the ABI.
> 
> What does this mean?  If we return -EINVAL from a function that
> returns 'int', we have 0x00000000ffffffea in %rax.  But, if we
> return -EINVAL from a function returning 'unsigned long', we end
> up with 0xffffffffffffffea in %rax, which is wrong.
> 
> To work around this and maintain the 'int' behavior while using
> the SYSCALL_DEFINEx() macros, we add a cast to 'unsigned int'
> in both implementations of sys_modify_ldt().
> 
> Cc: x...@kernel.org
> Cc: Andy Lutomirski 
> Cc: Brian Gerst 

I have added your:

  Signed-off-by: Dave Hansen 

let me know if that's OK.

Thanks,

Ingo
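
The difference between the two return types can be reproduced in userspace. The sketch below models the 64-bit return register as a `uint64_t` and assumes the usual x86-64 behaviour that writing a 32-bit result clears the upper half of the register. The function names are invented for illustration and this is not the kernel's syscall code. -EINVAL is written as -22, its value on Linux.

```c
#include <stdint.h>

/* An 'int' result: the 32-bit write leaves the upper half zero. */
uint64_t rax_after_int_return(int32_t ret)
{
	return (uint64_t)(uint32_t)ret;
}

/* A 64-bit 'long' result: the value is sign-extended to all 64 bits. */
uint64_t rax_after_long_return(int64_t ret)
{
	return (uint64_t)ret;
}

/* The workaround: cast to 'unsigned int' before widening the result. */
uint64_t rax_with_unsigned_cast(int32_t ret)
{
	return (uint64_t)(unsigned int)ret;
}
```

For -22 the first and third helpers give 0x00000000ffffffea, while the second gives 0xffffffffffffffea, which is why the cast preserves the existing 'int' ABI behaviour.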

Re: [lkp-robot] [x86/kconfig] 81d3871900: BUG:unable_to_handle_kernel

2017-10-18 Thread Thomas Gleixner

On Wed, 18 Oct 2017, Linus Torvalds wrote:
> On Tue, Oct 17, 2017 at 3:33 AM, Joonsoo Kim  wrote:
> >
> > It looks like a compiler bug. The code of slob_units() tries to read two
> > bytes at 88001c4afffe. That is valid. But the compiler generates
> > wrong code that tries to read four bytes.
> >
> > static slobidx_t slob_units(slob_t *s)
> > {
> >   if (s->units > 0)
> > return s->units;
> >   return 1;
> > }
> >
> > s->units is defined as two bytes in this setup.
> >
> > Wrongly generated code for this part.
> >
> > 'mov 0x0(%rbp), %ebp'
> >
> > %ebp is four bytes.
> >
> > I guess that this wrong four-byte read crosses the valid memory
> > boundary, and that is how this issue happened.
> 
> Hmm. I can see why the compiler would do that (16-bit accesses are
> slow), but it's definitely wrong.
> 
> Does it work ok if that slob_units() code is written as
> 
>   static slobidx_t slob_units(slob_t *s)
>   {
>  int units = READ_ONCE(s->units);
> 
>  if (units > 0)
>  return units;
>  return 1;
>   }
> 
> which might be an acceptable workaround for now?

Discussed exactly that with Peter Zijlstra yesterday, but we came to the
conclusion that this is a whack a mole game. It might fix this slob issue,
but what guarantees that we don't have the same problem in some other
place? Just duct taping this particular instance makes me nervous.

Joonsoo says:

> gcc 4.8 and 4.9 fail to generate proper code. gcc 5.1 and
> the latest version work fine.

> I guess that this problem is related to the corner case of some
> optimization feature since minor code change makes the result
> different. And, with -O2, proper code is generated even if gcc 4.8 is
> used.

So it would be useful to figure out which optimization bit is causing that
and blacklist it for the affected compiler versions.

Thanks,

tglx
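
The boundary problem Joonsoo describes comes down to simple arithmetic: the low 12 bits of the quoted address are 0xffe, so a 2-byte read ends on the last byte of a 4 KiB page while a 4-byte read spills into the next one. The sketch below assumes 4 KiB pages, and `access_crosses_page()` is a hypothetical helper, not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed 4 KiB pages */

/* True if the byte range [addr, addr + len) touches more than one page. */
bool access_crosses_page(uint64_t addr, size_t len)
{
	if (len == 0)
		return false;
	return (addr / SKETCH_PAGE_SIZE) !=
	       ((addr + len - 1) / SKETCH_PAGE_SIZE);
}
```

If the following page is not mapped, only the wider access faults, which matches the report that the generated 32-bit load is what triggers the BUG.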

Re: [PATCH v5 1/2] acpi: apei: remove the unused dead-code for SEA/NMI notification type

2017-10-18 Thread Borislav Petkov

On Wed, Oct 18, 2017 at 08:27:00PM +0800, gengdongjiu wrote:
>   Can this patch (the first one) be applied first?

Sure:

Reviewed-by: Borislav Petkov 

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--

Re: [PATCH] hwmon: (coretemp) remove duplicated coretemp for same core id

2017-10-18 Thread Guenter Roeck

On 10/17/2017 08:21 PM, Shu Wang wrote:

>> From: "Guenter Roeck" 
>> To: shuw...@redhat.com
>> Cc: "fenghua yu" , jdelv...@suse.com, linux-hw...@vger.kernel.org,
>> linux-kernel@vger.kernel.org, ch...@redhat.com, yiz...@redhat.com
>> Sent: Tuesday, October 17, 2017 11:25:50 PM
>> Subject: Re: [PATCH] hwmon: (coretemp) remove duplicated coretemp for same core id
>>
>> On Tue, Oct 17, 2017 at 04:44:50PM +0800, shuw...@redhat.com wrote:
>>> From: Shu Wang 
>>>
>>> Fix kernel warning on my 4cpus 2core_id system. The cpu0 and cpu1 have
>>> same core_id 0, so both cpu0 and cpu1 will try to create file temp2_label
>>> when it's online.
>>
>> What system/cpu is that ?
>>
>> Normally I would assume that each CPU (package) instantiates
>> a separate instance of the driver.
>
> The system is ThinkPad X1 Carbon 3rd laptop, model 20BTS1N70F.
>
>>> - coretemp_cpu_online(cpu=0)
>>>    - create_core_data(cpu=0, attr_no=2)
>>>       - create_core_attrs(attr_no=2)
>>> - coretemp_cpu_online(cpu=1)
>>>    - create_core_data(cpu=1, attr_no=2)
>>>       - create_core_attrs(attr_no=2)
>>>
>>> $ grep -e processor -e 'core id' /proc/cpuinfo
>>> processor   : 0
>>> core id : 0
>>> processor   : 1
>>> core id : 0
>>> processor   : 2
>>> core id : 1
>>> processor   : 3
>>> core id : 1
>>
>> Complete output of /proc/cpuinfo might be helpful.
>
> $ cat /proc/cpuinfo
> processor   : 0
> vendor_id   : GenuineIntel
> cpu family  : 6
> model   : 61
> model name  : Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz

This is a hyperthreading CPU, which should already be handled,
and the problem would affect pretty much everyone. I'll have
to look into this more closely. Is this with the ToT kernel ?

Guenter
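To make the reported sequence concrete, here is a minimal user-space sketch (not the driver's code) of how two logical CPUs that share a core id end up asking for the same attribute slot. It assumes attr_no = core_id + 2, which matches attr_no=2 for core id 0 in the trace above; count_duplicate_attrs(), BASE_ATTR_NO and MAX_ATTRS are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BASE_ATTR_NO 2	/* assumption: attr_no = core_id + 2, as in the trace */
#define MAX_ATTRS 32	/* hypothetical table size for this sketch */

/*
 * Count the CPUs whose attribute slot was already claimed by an
 * earlier CPU with the same core id (e.g. a hyperthread sibling).
 */
size_t count_duplicate_attrs(const int *core_ids, size_t ncpus)
{
	bool claimed[MAX_ATTRS] = { false };
	size_t dups = 0;

	for (size_t cpu = 0; cpu < ncpus; cpu++) {
		int attr_no = core_ids[cpu] + BASE_ATTR_NO;

		assert(attr_no >= 0 && attr_no < MAX_ATTRS);
		if (claimed[attr_no])
			dups++;	/* a second tempN_label for this slot */
		else
			claimed[attr_no] = true;
	}
	return dups;
}
```

With the core ids from the /proc/cpuinfo excerpt above (0, 0, 1, 1), the sketch counts two CPUs that would collide with a sibling. Whether the real driver reaches that point is exactly what remains to be checked.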

Re: [PATCH] net: ipx: mark expected switch fall-through

2017-10-18 Thread David Miller

From: "Gustavo A. R. Silva" 
Date: Mon, 16 Oct 2017 16:53:16 -0500

> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.
> 
> Signed-off-by: Gustavo A. R. Silva 

Applied.

Re: [PATCH] ipv6: mark expected switch fall-throughs

2017-10-18 Thread David Miller

From: "Gustavo A. R. Silva" 
Date: Mon, 16 Oct 2017 16:36:52 -0500

> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.
> 
> Notice that in some cases I placed the "fall through" comment
> on its own line, which is what GCC is expecting to find.
> 
> Signed-off-by: Gustavo A. R. Silva 

Applied.
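For readers unfamiliar with the annotation, a minimal sketch of what such a marking looks like follows. The enum, the flag values and features_for() are hypothetical and are not taken from the ipx or ipv6 patches; only the comment placement, on its own line directly before the next case label, reflects what the commit message describes.

```c
#include <assert.h>

/* Hypothetical levels and feature bits, for illustration only. */
enum opt_level { OPT_FULL, OPT_BASIC, OPT_NONE };

#define FEAT_EXTRA 0x1u
#define FEAT_BASE  0x2u

/* Return the feature mask implied by a level. */
unsigned int features_for(enum opt_level level)
{
	unsigned int mask = 0;

	switch (level) {
	case OPT_FULL:
		mask |= FEAT_EXTRA;
		/* fall through */
	case OPT_BASIC:
		mask |= FEAT_BASE;
		break;
	case OPT_NONE:
		break;
	}
	return mask;
}

/* Small self-check that doubles as a usage example. */
void features_selfcheck(void)
{
	assert(features_for(OPT_FULL) == (FEAT_EXTRA | FEAT_BASE));
	assert(features_for(OPT_BASIC) == FEAT_BASE);
	assert(features_for(OPT_NONE) == 0);
}
```

Without the comment, building with -Wimplicit-fallthrough would be expected to warn at the OPT_BASIC label, since OPT_FULL runs into it without a break.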

[PATCH v2 1/5] ARM: sun8i: r40: add USB host port nodes for R40

2017-10-18 Thread Icenowy Zheng

From: Icenowy Zheng 

Allwinner R40 SoC features a USB OTG port and two USB HOST ports.

Add support for the host ports in the DTSI file.

The OTG controller still cannot work with existing compatibles, and needs
more investigation. So it's not added yet.

Signed-off-by: Icenowy Zheng 
---
Changes in v2:
- Dropped the bogus OHCI resources in EHCI device node.

 arch/arm/boot/dts/sun8i-r40.dtsi | 72 
 1 file changed, 72 insertions(+)

diff --git a/arch/arm/boot/dts/sun8i-r40.dtsi b/arch/arm/boot/dts/sun8i-r40.dtsi
index d5a6745409ae..19f270a9f3b1 100644
--- a/arch/arm/boot/dts/sun8i-r40.dtsi
+++ b/arch/arm/boot/dts/sun8i-r40.dtsi
@@ -173,6 +173,78 @@
#size-cells = <0>;
};
 
+   usbphy: phy@1c13400 {
+   compatible = "allwinner,sun8i-r40-usb-phy";
+   reg = <0x01c13400 0x14>,
+ <0x01c14800 0x4>,
+ <0x01c19800 0x4>,
+ <0x01c1c800 0x4>;
+   reg-names = "phy_ctrl",
+   "pmu0",
+   "pmu1",
+   "pmu2";
+   clocks = <&ccu CLK_USB_PHY0>,
+<&ccu CLK_USB_PHY1>,
+<&ccu CLK_USB_PHY2>;
+   clock-names = "usb0_phy",
+ "usb1_phy",
+ "usb2_phy";
+   resets = <&ccu RST_USB_PHY0>,
+<&ccu RST_USB_PHY1>,
+<&ccu RST_USB_PHY2>;
+   reset-names = "usb0_reset",
+ "usb1_reset",
+ "usb2_reset";
+   status = "disabled";
+   #phy-cells = <1>;
+   };
+
+   ehci1: usb@1c19000 {
+   compatible = "allwinner,sun8i-r40-ehci", "generic-ehci";
+   reg = <0x01c19000 0x100>;
+   interrupts = ;
+   clocks = <&ccu CLK_BUS_EHCI1>;
+   resets = <&ccu RST_BUS_EHCI1>;
+   phys = <&usbphy 1>;
+   phy-names = "usb";
+   status = "disabled";
+   };
+
+   ohci1: usb@1c19400 {
+   compatible = "allwinner,sun8i-r40-ohci", "generic-ohci";
+   reg = <0x01c19400 0x100>;
+   interrupts = ;
+   clocks = <&ccu CLK_BUS_OHCI1>,
+<&ccu CLK_USB_OHCI1>;
+   resets = <&ccu RST_BUS_OHCI1>;
+   phys = <&usbphy 1>;
+   phy-names = "usb";
+   status = "disabled";
+   };
+
+   ehci2: usb@1c1c000 {
+   compatible = "allwinner,sun8i-r40-ehci", "generic-ehci";
+   reg = <0x01c1c000 0x100>;
+   interrupts = ;
+   clocks = <&ccu CLK_BUS_EHCI2>;
+   resets = <&ccu RST_BUS_EHCI2>;
+   phys = <&usbphy 2>;
+   phy-names = "usb";
+   status = "disabled";
+   };
+
+   ohci2: usb@1c1c400 {
+   compatible = "allwinner,sun8i-r40-ohci", "generic-ohci";
+   reg = <0x01c1c400 0x100>;
+   interrupts = ;
+   clocks = <&ccu CLK_BUS_OHCI2>,
+<&ccu CLK_USB_OHCI2>;
+   resets = <&ccu RST_BUS_OHCI2>;
+   phys = <&usbphy 2>;
+   phy-names = "usb";
+   status = "disabled";
+   };
+
ccu: clock@1c20000 {
compatible = "allwinner,sun8i-r40-ccu";
reg = <0x01c20000 0x400>;
-- 
2.13.6

Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume

2017-10-18 Thread Rafael J. Wysocki

On Wednesday, October 18, 2017 1:57:52 PM CEST Ulf Hansson wrote:
> On 18 October 2017 at 02:39, Rafael J. Wysocki  wrote:
> > On Tuesday, October 17, 2017 9:41:16 PM CEST Ulf Hansson wrote:
> >
> > [cut]
> >
> >> >
> >> >> deploying this and from a middle layer point of view, all the trivial
> >> >> cases supports this.
> >> >
> >> > These functions are wrong, however, because they attempt to reuse the
> >> > whole callback *path* instead of just reusing driver callbacks.  The
> >> > *only* reason why it all "works" is because there are no middle layer
> >> > callbacks involved in that now.
> >> >
> >> > If you changed them to reuse driver callbacks only today, nothing would 
> >> > break
> >> > AFAICS.
> >>
> >> Yes, it would.
> >>
> >> First, for example, the amba bus is responsible for the amba bus
> >> clock, but relies on drivers to gate/ungate it during system sleep. In
> >> case the amba drivers don't use the pm_runtime_force_suspend|resume(),
> >> it will explicitly have to start manage the clock during system sleep
> >> themselves. Leading to open coding.
> >
> > Well, I suspected that something like this would surface. ;-)
> >
> > Are there any major reasons why the appended patch (obviously untested) 
> > won't
> > work, then?
> 
> Let me comment on the code, instead of here...
> 
> ...just realized your second reply, so let me reply to that instead
> regarding the patch.
> 
> >
> >> Second, it will introduce a regression in behavior for all users of
> >> pm_runtime_force_suspend|resume(), especially during system resume as
> >> the driver may then end up resuming the device even in case it isn't
> >> needed.
> >
> > How so?
> >
> > I'm talking about a change like in the appended patch, where
> > pm_runtime_force_* simply invoke driver callbacks directly.  What is
> > skipped there is middle-layer stuff which is empty anyway in all cases
> > except for AMBA (if that's all what is lurking below the surface), so
> > I don't quite see how the failure will happen.
> 
> I am afraid changing pm_runtime_force* to only call driver callbacks
> may become fragile. Let me elaborate.
> 
> The reason why pm_runtime_force_* needs to respects the hierarchy of
> the RPM callbacks, is because otherwise it can't safely update the
> runtime PM status of the device.

I'm not sure I follow this requirement.  Why is that so?

> And updating the runtime PM status of
> the device is required to manage the optimized behavior during system
> resume (avoiding to unnecessary resume devices).

Well, OK.  The runtime PM status of the device after system resume should
better reflect its physical state.

[The physical state of the device may not be under the control of the
kernel in some cases, like in S3 resume on some systems that reset
devices in the firmware and so on, but let's set that aside.]

However, the runtime PM status of the device may still reflect its state
if, say, a ->resume_early of the middle layer is called during resume along
with a driver's ->runtime_resume.  That still can produce the right state
of the device and all depends on the middle layer.

On the other hand, as I said before, using a middle-layer ->runtime_suspend
during a system sleep transition may be outright incorrect, say if device
wakeup settings need to be adjusted by the middle layer (which is the
case for some of them).

Of course, if the middle layer expects the driver to point its
system-wide PM callbacks to pm_runtime_force_*, then that's how it goes,
but the drivers working with this particular middle layer generally
won't work with other middle layers and may interact incorrectly
with parents and/or children using the other middle layers.

I guess the problem boils down to having a common set of expectations
on the driver side and on the middle layer side allowing different
combinations of these to work together.

> Besides the AMBA case, I also realized that we are dealing with PM
> clocks in the genpd case. For this, genpd relies on the that runtime
> PM status of the device properly reflects the state of the HW, during
> system-wide PM.
> 
> In other words, if the driver would change the runtime PM status of
> the device, without respecting the hierarchy of the runtime PM
> callbacks, it would lead to that genpd starts taking wrong decisions
> while managing the PM clocks during system-wide PM. So in case you
> intend to change pm_runtime_force_* this needs to be addressed too.

I've just looked at the genpd code and quite frankly I'm not sure how this
works, but I'll figure this out. :-)

> >
> >> I believe I have explained why, also several times by now -
> >> and that's also how far you could take the i2c designware driver at
> >> this point.
> >>
> >> That said, I assume the second part may be addressed in this series,
> >> if these drivers convert to use the "driver PM flags", right?
> >>
> >> However, what about the first case? Is some open coding needed or your
> >> think the amba driver can instruct the amba bus via

[PATCH v2 5/5] ARM: sun8i: v40: enable USB host ports for Banana Pi M2 Berry

2017-10-18 Thread Icenowy Zheng

Banana Pi M2 Berry has an on-board USB Hub that provides 4 USB Type-A
ports, and it's connected to the USB1 port of the SoC.

Enable it.

Signed-off-by: Icenowy Zheng 
---
 arch/arm/boot/dts/sun8i-v40-bananapi-m2-berry.dts | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/arm/boot/dts/sun8i-v40-bananapi-m2-berry.dts 
b/arch/arm/boot/dts/sun8i-v40-bananapi-m2-berry.dts
index fe16fc0eb518..45c17c8c5915 100644
--- a/arch/arm/boot/dts/sun8i-v40-bananapi-m2-berry.dts
+++ b/arch/arm/boot/dts/sun8i-v40-bananapi-m2-berry.dts
@@ -87,6 +87,10 @@
};
 };
 
+&ehci1 {
+   status = "okay";
+};
+
 &i2c0 {
status = "okay";
 
@@ -98,6 +102,10 @@
};
 };
 
+&ohci1 {
+   status = "okay";
+};
+
 #include "axp22x.dtsi"
 
 &reg_aldo3 {
@@ -171,3 +179,8 @@
pinctrl-0 = <&uart0_pb_pins>;
status = "okay";
 };
+
+&usbphy {
+   usb1_vbus-supply = <&reg_vcc5v0>;
+   status = "okay";
+};
-- 
2.13.6

Re: libbattery was Re: [RFC PATCH 5/5] power: generic-adc-battery: Add capacity handling

2017-10-18 Thread Pavel Machek


> So I have three questions:
> a) why do you use float/double instead of fixed point for such
> simple and imprecise calculations?

Cleaner code, and probably faster, too.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



[PATCH v2 4/5] ARM: sun8i: r40: enable USB host for Banana Pi M2 Ultra

2017-10-18 Thread Icenowy Zheng

From: Icenowy Zheng 

Banana Pi M2 Ultra board features two USB host ports, connected to the
two USB host ports on the SoC.

Add support for them.

Signed-off-by: Icenowy Zheng 
---
 arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts 
b/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts
index 035599d870b9..8c5efe2a9881 100644
--- a/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts
+++ b/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts
@@ -93,6 +93,14 @@
};
 };
 
+&ehci1 {
+   status = "okay";
+};
+
+&ehci2 {
+   status = "okay";
+};
+
 &i2c0 {
status = "okay";
 
@@ -180,8 +188,22 @@
status = "okay";
 };
 
+&ohci1 {
+   status = "okay";
+};
+
+&ohci2 {
+   status = "okay";
+};
+
 &uart0 {
pinctrl-names = "default";
pinctrl-0 = <&uart0_pb_pins>;
status = "okay";
 };
+
+&usbphy {
+   usb1_vbus-supply = <&reg_vcc5v0>;
+   usb2_vbus-supply = <&reg_vcc5v0>;
+   status = "okay";
+};
-- 
2.13.6

[PATCH v2 0/5] Allwinner R40 USB host support (DT part)

2017-10-18 Thread Icenowy Zheng

This patchset adds support for the USB host ports on Allwinner R40, and
enables them on Banana Pi M2 Ultra and Berry boards.

The first patch adds USB PHY and EHCI/OHCI nodes to the R40 DTSI.

The second and third patches add the 5V regulator for the two boards, and
the fourth and fifth patches finally add USB host port support.

Icenowy Zheng (5):
  ARM: sun8i: r40: add USB host port nodes for R40
  ARM: sun8i: r40: add 5V regulator for Banana Pi M2 Ultra
  ARM: sun8i: v40: add 5V regulator for Banana Pi M2 Berry
  ARM: sun8i: r40: enable USB host for Banana Pi M2 Ultra
  ARM: sun8i: v40: enable USB host ports for Banana Pi M2 Berry

 arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts | 31 ++
 arch/arm/boot/dts/sun8i-r40.dtsi  | 72 +++
 arch/arm/boot/dts/sun8i-v40-bananapi-m2-berry.dts | 22 +++
 3 files changed, 125 insertions(+)

-- 
2.13.6

[PATCH v2 2/5] ARM: sun8i: r40: add 5V regulator for Banana Pi M2 Ultra

2017-10-18 Thread Icenowy Zheng

On newer revisions of the Banana Pi M2 Ultra boards, the 5V power output
(used by HDMI, SATA and USB) is controlled via a GPIO.

Add the regulator node for it.

Older revisions just have the 5V power output always on, and the GPIO is
reserved on these boards. So it won't affect the older revisions.

Signed-off-by: Icenowy Zheng 
---
 arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts 
b/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts
index 7b52608cebe6..035599d870b9 100644
--- a/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts
+++ b/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts
@@ -78,6 +78,15 @@
};
};
 
+   reg_vcc5v0: vcc5v0 {
+   compatible = "regulator-fixed";
+   regulator-name = "vcc5v0";
+   regulator-min-microvolt = <5000000>;
+   regulator-max-microvolt = <5000000>;
+   gpio = <&pio 7 23 GPIO_ACTIVE_HIGH>; /* PH23 */
+   enable-active-high;
+   };
+
wifi_pwrseq: wifi_pwrseq {
compatible = "mmc-pwrseq-simple";
reset-gpios = <&pio 6 10 GPIO_ACTIVE_LOW>; /* PG10 WIFI_EN */
-- 
2.13.6
