date:20150723

Re: Dealing with the NMI mess

2015-07-23 Thread Willy Tarreau

On Thu, Jul 23, 2015 at 05:31:05PM -0400, Steven Rostedt wrote:
> On Thu, 23 Jul 2015 14:08:59 -0700
> Linus Torvalds  wrote:
> 
> > On Thu, Jul 23, 2015 at 1:49 PM, Andy Lutomirski  
> > wrote:
> > >
> > > Issue A: to return with RF clear, we need to disarm the breakpoint.
> > > If it's limited to the duration of the NMI, that's easy.  If not, when
> > > do we re-arm?  New prepare_exit_to_usermode hook?  Hmm, setting ti
> > > flags during context switch may target the wrong task.
> > 
> > We don't re-arm it.
> > 
> 
> Let me get this straight. The idea is in the #DB handler to detect that
> it was triggered in NMI context, and if so, simply disarm that
> breakpoint permanently, right?
> 
> Nothing should be adding hw breakpoints to NMI code anyway. Sounds
> perfectly reasonable to me. Of course, how we tell we are in NMI
> brings back all the races as we had in the nesting code. We can check
> the per-cpu variable that is set with nmi_enter() and cleared at
> nmi_exit() but what happens if the breakpoint is outside those calls.
> We can check the stack pointer, but then we are back to userspace
> fooling us. Maybe add the DF trick again?

Can't the back link of the TSS tell us where we come from? At least
it should not be manipulable from user-space.

Willy

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Dealing with the NMI mess

2015-07-23 Thread Andy Lutomirski

On Thu, Jul 23, 2015 at 2:46 PM, Willy Tarreau  wrote:
> On Thu, Jul 23, 2015 at 05:31:05PM -0400, Steven Rostedt wrote:
>> On Thu, 23 Jul 2015 14:08:59 -0700
>> Linus Torvalds  wrote:
>>
>> > On Thu, Jul 23, 2015 at 1:49 PM, Andy Lutomirski  
>> > wrote:
>> > >
>> > > Issue A: to return with RF clear, we need to disarm the breakpoint.
>> > > If it's limited to the duration of the NMI, that's easy.  If not, when
>> > > do we re-arm?  New prepare_exit_to_usermode hook?  Hmm, setting ti
>> > > flags during context switch may target the wrong task.
>> >
>> > We don't re-arm it.
>> >
>>
>> Let me get this straight. The idea is in the #DB handler to detect that
>> it was triggered in NMI context, and if so, simply disarm that
>> breakpoint permanently, right?
>>
>> Nothing should be adding hw breakpoints to NMI code anyway. Sounds
>> perfectly reasonable to me. Of course, how we tell we are in NMI
>> brings back all the races as we had in the nesting code. We can check
>> the per-cpu variable that is set with nmi_enter() and cleared at
>> nmi_exit() but what happens if the breakpoint is outside those calls.
>> We can check the stack pointer, but then we are back to userspace
>> fooling us. Maybe add the DF trick again?
>
> Can't the back link of the TSS tell us where we come from? At least
> it should not be manipulable from user-space.

Not on 64-bit -- there are no tasks :)

--Andy

Re: Dealing with the NMI mess

2015-07-23 Thread Andy Lutomirski

On Thu, Jul 23, 2015 at 2:35 PM, Linus Torvalds
 wrote:
> On Thu, Jul 23, 2015 at 2:20 PM, Peter Zijlstra  wrote:
>>
>> So the NMI could trigger userspace debug register faults, and simply
>> disabling them would make the whole debug register thing entirely
>> unreliable.
>
> We could easily set something to re-enable them for when we actually
> return to user space. I'd be ok with just setting the
> _TIF_USER_WORK_MASK.
>
> But even that should not be a requirement for the basic stability and
> core integrity of the kernel. Not like the current horrid mess with
> NMI nesting and ESP fixing etc.
>
> And realistically, nobody will ever even notice. So the whole "ok, we
> can use _TIF_USER_WORK_MASK to re-enable dr7" is a tiny tiny detail
> that is more like cleaning up things, not a core issue.
>

Or we just re-enable them on the way out of NMI (i.e. the very last
thing we do in the NMI handler).  I don't want to break regular
userspace gdb when perf is running.
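
A minimal userspace model of the two ideas in this exchange, not kernel code: the #DB path parks DR7 when the trap arrives in kernel mode with interrupts off, and the very last step of the NMI handler puts it back. The struct, the function names, and the exact disarm condition are assumptions made for illustration; the only architectural fact used is that the low eight DR7 bits are the L0/G0..L3/G3 breakpoint enables.

```c
#include <stdbool.h>
#include <stdint.h>

/* L0/G0 .. L3/G3 enable bits live in DR7 bits 0-7. */
#define DR7_ENABLE_MASK 0xffULL

/* Hypothetical per-CPU state; a real kernel would use per-cpu variables. */
struct cpu_model {
    uint64_t dr7;        /* modelled debug control register */
    uint64_t saved_dr7;  /* value parked by the #DB path */
    bool dr7_parked;     /* true while breakpoints are disarmed */
};

/* Modelled #DB entry: disarm only for kernel mode with interrupts off. */
static void model_debug_trap(struct cpu_model *cpu, bool user_mode,
                             bool irqs_enabled)
{
    if (user_mode || irqs_enabled)
        return; /* ordinary #DB handling, not modelled here */

    if (!cpu->dr7_parked) {
        cpu->saved_dr7 = cpu->dr7;
        cpu->dr7_parked = true;
    }
    cpu->dr7 &= ~DR7_ENABLE_MASK; /* nothing can fire again */
}

/* Modelled last step of the NMI handler: re-arm what was parked. */
static void model_nmi_exit(struct cpu_model *cpu)
{
    if (cpu->dr7_parked) {
        cpu->dr7 = cpu->saved_dr7;
        cpu->dr7_parked = false;
    }
}
```

In this model a userspace debugger's breakpoints survive an NMI that tripped over them, because the restore happens before the return to the interrupted context. Whether the real exit path can do that write safely is exactly what the thread is debating.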

--Andy

Re: [Xen-devel] [Patch V4 1/3] usb: Add Xen pvUSB protocol description

2015-07-23 Thread Pasi Kärkkäinen

On Thu, Jul 23, 2015 at 12:08:01PM -0700, Greg KH wrote:
> 
> Somewhere that people can refer to that describes this public-facing API
> that "must not ever be broken or changed".  If you want to put it in a
> documentation file, or a .h file, I don't care.
> 
> > >>It is used e.g. in SUSE's xen kernel since 2.6.18.
> > >
> > >I am very aware of the amount of Xen crap in SuSE's kernel, don't use
> > >that as an excuse for me to merge it to mainline :)
> > 
> > :-)
> > 
> > Wasn't meant as an excuse, just a hint why the interface can't be the
> > same as for usbip. We have to ensure compatibility with those kernels
> 
> This shouldn't be a kernel/kernel compatibility issue, as the API talks
> between Xen and the OS, not between different OSs, right?
> 
> > and possibly other operating systems (BSD?, Windows?) which already
> > might be using pvUSB with a Dom0 based on the SUSE xen kernel.
> 
> Are there other operating system drivers today that use this API?  Is
> this an API in the Xen core today that we have to support?
> 
> Some more background / descriptions would be nice to have.
>

For example, the Xen "GPLPV" drivers for Windows do have a PVUSB frontend driver.


-- Pasi

 
> thanks,
> 
> greg k-h
> 

mmotm 2015-07-23-14-37 uploaded

2015-07-23 Thread akpm

The mm-of-the-moment snapshot 2015-07-23-14-37 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (4.x
or 4.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.

A git tree which contains the memory management portion of this tree is
maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
by Michal Hocko.  It contains the patches which are between the
"#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series
file, http://www.ozlabs.org/~akpm/mmotm/series.


A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/

To develop on top of mmotm git:

  $ git remote add mmotm git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
  $ git remote update mmotm
  $ git checkout -b topic mmotm/master
  
  $ git send-email mmotm/master.. [...]

To rebase a branch with older patches to a new mmotm release:

  $ git remote update mmotm
  $ git rebase --onto mmotm/master <topic base> topic




The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is available at

http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/

and use of this tree is similar to
http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/, described above.


This mmotm tree contains the following patches against 4.2-rc3:
(patches marked "*" will be included in linux-next)

  origin.patch
  arch-alpha-kernel-systblss-remove-debug-check.patch
* ipc-modify-message-queue-accounting-to-not-take-kernel-data-structures-into-account.patch
* mm-meminit-allow-early_pfn_to_nid-to-be-used-during-runtime.patch
* mm-meminit-replace-rwsem-with-completion.patch
* fs-file-table-reinit-files_statmax_files-after-deferred-memory-initialisation.patch
* mm-vmscan-do-not-wait-for-page-writeback-for-gfp_nofs-allocations.patch
* ocfs2-fix-bug-in-ocfs2_downconvert_thread_do_work.patch
* ocfs2-fix-bug-in-ocfs2_downconvert_thread_do_work-v2.patch
* signal-fix-information-leak-in-copy_siginfo_from_user32.patch
* signal-fix-information-leak-in-copy_siginfo_to_user.patch
* signalfd-fix-information-leak-in-signalfd_copyinfo.patch
* mm-slub-allow-merging-when-slab_debug_free-is-set.patch
* iommu-common-do-not-use-64-bit-constant-0xl-for-computing-align_mask.patch
* fsnotify-fix-oops-in-fsnotify_clear_marks_by_group_flags.patch
* ipc-use-private-shmem-or-hugetlbfs-inodes-for-shm-segments.patch
* mm-memory-failure-unlock_page-before-put_page.patch
* mm-memory-failure-fix-race-in-counting-num_poisoned_pages.patch
* mm-memory-failure-give-up-error-handling-for-non-tail-refcounted-thp.patch
* mm-memory-failure-check-__pg_hwpoison-separately-from-page_flags_check_at_.patch
* kernel-kthreadc-kthread_create_on_node-clarify-documentation.patch
* kernel-kthreadc-kthread_create_on_node-clarify-documentation-fix.patch
* capabilities-ambient-capabilities.patch
* capabilities-add-a-securebit-to-disable-pr_cap_ambient_raise.patch
* fs-optimize-inotify-fsnotify-code-for-unwatched-files.patch
* fsnotify-fix-check-in-inotify-fdinfo-printing.patch
* scripts-spellingtxt-adding-misspelled-word-for-check.patch
* scripts-spellingtxt-adding-misspelled-word-for-check-fix.patch
* kerneldoc-convert-error-messages-to-gnu-error-message-format.patch
* lindent-handle-missing-indent-gracefully.patch
* scripts-decode_stacktrace-fix-arm-architecture-decoding.patch
* ntfs-deletion-of-unnecessary-checks-before-the-function-call-iput.patch
* fs-ext4-fsyncc-generic_file_fsync-call-based-on-barrier-flag.patch
* ocfs2-fix-race-between-dio-and-recover-orphan.patch
* ocfs2-fix-several-issues-of-append-dio.patch
* ocfs2-do-not-bug-if-buffer-not-uptodate-in-__ocfs2_journal_access.patch
* ocfs2-do-not-log-twice-error-messages.patch
* ocfs2-clean-up-unused-local-variables-in-ocfs2_file_write_iter.patch
* ocfs2-set-filesytem-read-only-when-ocfs2_delete_entry-failed.patch
*

Re: [PATCH] Input: LEDs - skip unnamed LEDs

2015-07-23 Thread Dmitry Torokhov

On Thu, Jul 23, 2015 at 11:22:54PM +0200, Pavel Machek wrote:
> On Thu 2015-07-23 13:57:13, Dmitry Torokhov wrote:
> > On Thu, Jul 23, 2015 at 08:19:13AM +0200, Pavel Machek wrote:
> > > On Wed 2015-07-22 15:02:02, Dmitry Torokhov wrote:
> > > > Devices may declare more LEDs than what is known to input-leds
> > > > (HID does this for some devices). Instead of showing ugly warnings
> > > > on connect and, even worse, oopsing on disconnect, let's simply
> > > > ignore LEDs that are not known to us.
> > > > 
> > > > Reported-by: Vlastimil Babka 
> > > > Signed-off-by: Dmitry Torokhov 
> > > > ---
> > > >  drivers/input/input-leds.c |   16 ++--
> > > >  1 file changed, 14 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/input/input-leds.c b/drivers/input/input-leds.c
> > > > index 074a65e..766bf26 100644
> > > > --- a/drivers/input/input-leds.c
> > > > +++ b/drivers/input/input-leds.c
> > > > @@ -71,6 +71,18 @@ static void input_leds_event(struct input_handle 
> > > > *handle, unsigned int type,
> > > >  {
> > > >  }
> > > >  
> > > > +static int input_leds_get_count(struct input_dev *dev)
> > > > +{
> > > > +   unsigned int led_code;
> > > > +   int count = 0;
> > > > +
> > > > > +   for_each_set_bit(led_code, dev->ledbit, LED_CNT)
> > > > > +           if (input_led_info[led_code].name)
> > > > > +                   count++;
> > > > +
> > > > +   return count;
> > > > +}
> > > > +
> > > >  static int input_leds_connect(struct input_handler *handler,
> > > >   struct input_dev *dev,
> > > >   const struct input_device_id *id)
> > > > @@ -81,7 +93,7 @@ static int input_leds_connect(struct input_handler 
> > > > *handler,
> > > > int led_no;
> > > > int error;
> > > >  
> > > > -   num_leds = bitmap_weight(dev->ledbit, LED_CNT);
> > > > +   num_leds = input_leds_get_count(dev);
> > > > if (!num_leds)
> > > > return -ENXIO;
> > > >  
> > > > @@ -112,7 +124,7 @@ static int input_leds_connect(struct input_handler 
> > > > *handler,
> > > > > led->handle = &leds->handle;
> > > > led->code = led_code;
> > > >  
> > > > -   if (WARN_ON(!input_led_info[led_code].name))
> > > > +   if (!input_led_info[led_code].name)
> > > > continue;
> > > >  
> > > > led->cdev.name = kasprintf(GFP_KERNEL, "%s::%s",
> > > >
> > > 
> > > Are you sure? AFAICT you need to fix err_unregister_leds not to
> > > unregister leds with no name...
> > 
> > Well, if we skip unnamed leds and do not include them into total count
> > then we won't need to unregister them.
> 
> I don't get it.
> 
> If there's unnamed led at index 0, and named one at indexes 1 and
> 2.. and there's -ENOMEM registering 2, it will try to unregister leds
> 0 and 1, and crash, no?

There won't be unnamed led in position 0 of the leds array because we
skip over unnamed leds. The issue with original code was that we did not
reduce total number of leds when we encountered unnamed one and thus on
unbind we'd try unregistering non-initialized slots at the end of the
array, but now we account for unnamed leds when we count them, before
allocating the array.
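
The counting argument can be checked with a small userspace sketch. This is an illustration only, not the driver: LED_CNT is shrunk to 8, the name table is hypothetical (index 0 unnamed, indexes 1 and 2 named, matching the scenario in the question), and "registration" is reduced to setting a flag.

```c
#include <stddef.h>

#define LED_CNT 8 /* shrunk for the sketch; not the kernel's value */

/* Hypothetical name table: NULL models an LED code unknown to input-leds. */
static const char *const led_names[LED_CNT] = {
    NULL, "capslock", "scrolllock"
};

struct led_slot {
    unsigned int code;
    int registered;
};

/* Count only the declared LEDs that also have a name. */
static int count_named_leds(unsigned long ledbit)
{
    int count = 0;

    for (unsigned int code = 0; code < LED_CNT; code++)
        if ((ledbit & (1UL << code)) && led_names[code])
            count++;
    return count;
}

/*
 * Fill slots compactly. fail_code simulates a registration failure at
 * that LED code (-1 for no failure). Returns the number of registered
 * slots, or -1 after unwinding.
 */
static int connect_leds(unsigned long ledbit, struct led_slot *slots,
                        int fail_code)
{
    int led_no = 0;

    for (unsigned int code = 0; code < LED_CNT; code++) {
        if (!(ledbit & (1UL << code)) || !led_names[code])
            continue; /* skipped: led_no does not advance */

        slots[led_no].code = code;
        if ((int)code == fail_code) {
            /* Unwind only slots that were really registered. */
            while (--led_no >= 0)
                slots[led_no].registered = 0;
            return -1;
        }
        slots[led_no].registered = 1;
        led_no++;
    }
    return led_no;
}
```

With bits 0, 1 and 2 set, the count is 2 and the named LEDs land in slots 0 and 1. A failure while registering code 2 unwinds slot 0 only, so no uninitialized slot is touched.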

Thanks.

-- 
Dmitry

Re: [PATCH 5/8] power: bq27x00_battery: Renaming for consistency

2015-07-23 Thread Belisko Marek

On Thu, Jul 23, 2015 at 11:15 PM, Pali Rohár  wrote:
> On Thursday 23 July 2015 22:56:26 Belisko Marek wrote:
>> Hi Pali,
>>
>> On Thu, Jul 23, 2015 at 10:15 PM, Pali Rohár 
>> wrote:
>> > On Thursday 23 July 2015 19:03:08 Andrew F. Davis wrote:
>> >> >> -#ifdef CONFIG_BATTERY_BQ27X00_I2C
>> >> >> -MODULE_ALIAS("i2c:bq27000-battery");
>> >> >> +#ifdef CONFIG_BATTERY_BQ27XXX_I2C
>> >> >> +MODULE_ALIAS("i2c:bq27xxx-battery");
>> >> >>
>> >> >>  #endif
>> >> >
>> >> > Why is this MODULE_ALIAS needed? Some lines upper there is
>> >> >
>> >> >  MODULE_DEVICE_TABLE(i2c, bq27xxx_id);
>> >> >
>> >> > which add proper i2c: module alias...
>> >>
>> >> Not sure, looks like it was added in commit
>> >> 8ebb7e9c1a502cfc300618c19c3c6f06fc76d237 which claims that the
>> >> "module won't get loaded automatically" without it, but I have not
>> >> had this problem, so I'm not sure why it's there.
>> >
>> > git grep bq27000-battery show me that only one driver uses that
>> > name: drivers/w1/slaves/w1_bq27000.c
>> >
>> > And more over, it is platform device, not i2c device. So that
>> > commit 8ebb7e9c1a502cfc300618c19c3c6f06fc76d237 is wrong! CCing
>> > Marek.
>>
>> If you look to power/bq27x00 driver then there is I2C part and
>> platform part only
>> both selectable by config. In
>> 8ebb7e9c1a502cfc300618c19c3c6f06fc76d237 was added MODULE_ALIAS to
>> have this driver working as module for both buses. Even if I2C isn't
>> used anywhere
>> I add MODULE_ALIAS also for that.
>>
>> > MODULE_ALIAS("platform:bq27000-battery") is really needed for
>> > w1_bq27000.c but MODULE_ALIAS("i2c:bq27000-battery") should be
>> > removed. It is not used by any board platform code or DT.
>>
>> Not sure if it's good idea to remove it. Somebody outside can use it.
>>
>> > Marek, correct me if I'm wrong.
>> >
>> > --
>> > Pali Rohár
>> > pali.ro...@gmail.com
>>
>> BR,
>>
>> marek
>
> Who, where and why is using the alias i2c:bq27000-battery?
>
> It is obviously wrong to add that alias. MODULE_DEVICE_TABLE(i2c, ...
> macro automatically adds all MODULE_ALIASes for all i2c devices.
Hmm, are you sure? I think that's not true if you look at include/linux/module.h.
Those two macros have completely different purposes.
>
> If there is such a device which needs i2c:bq27000-battery, it should be
> fixed to use the i2c:bq27??? identifier exported by MODULE_DEVICE_TABLE.
Look, if you don't like it, remove the i2c part from the driver and post a patch.
But as I said, somebody can use it (not talking about the vanilla tree).
>
> git grep on kernel tree did not return anything, so there is no usage.
>
> Of course alias for platform:bq27000-battery is needed.
>
> --
> Pali Rohár
> pali.ro...@gmail.com

BR,

marek



-- 
as simple and primitive as possible
-
Marek Belisko - OPEN-NANDRA
Freelance Developer

Ruska Nova Ves 219 | Presov, 08005 Slovak Republic
Tel: +421 915 052 184
skype: marekwhite
twitter: #opennandra
web: http://open-nandra.com

Re: Dealing with the NMI mess

2015-07-23 Thread Linus Torvalds

On Thu, Jul 23, 2015 at 2:20 PM, Peter Zijlstra  wrote:
>
> So the NMI could trigger userspace debug register faults, and simply
> disabling them would make the whole debug register thing entirely
> unreliable.

We could easily set something to re-enable them for when we actually
return to user space. I'd be ok with just setting the
_TIF_USER_WORK_MASK.

But even that should not be a requirement for the basic stability and
core integrity of the kernel. Not like the current horrid mess with
NMI nesting and ESP fixing etc.

And realistically, nobody will ever even notice. So the whole "ok, we
can use _TIF_USER_WORK_MASK to re-enable dr7" is a tiny tiny detail
that is more like cleaning up things, not a core issue.

   Linus

Re: [RFC PATCH] tools lib traceevent: Allow setting an alternative symbol resolver

2015-07-23 Thread Steven Rostedt

On Thu, 23 Jul 2015 18:25:36 -0300
Arnaldo Carvalho de Melo  wrote:

> Like this?

Yep, but some comments.

> diff --git a/tools/lib/traceevent/event-parse.c 
> b/tools/lib/traceevent/event-parse.c
> index cc25f059ab3d..7f9266225f11 100644
> --- a/tools/lib/traceevent/event-parse.c
> +++ b/tools/lib/traceevent/event-parse.c
> @@ -418,7 +418,7 @@ static int func_map_init(struct pevent *pevent)
>  }
>  
>  static struct func_map *
> -find_func(struct pevent *pevent, unsigned long long addr)
> +__find_func(struct pevent *pevent, unsigned long long addr)
>  {
>   struct func_map *func;
>   struct func_map key;
> @@ -434,6 +434,60 @@ find_func(struct pevent *pevent, unsigned long long addr)
>   return func;
>  }
>  
> +struct func_resolver {
> + pevent_func_resolver_t *func;
> + void   *priv;
> + struct func_map map;
> +};
> +
> +/**
> + * pevent_set_function_resolver - set an alternative function resolver
> + * @pevent: handle for the pevent
> + * @resolver: function to be used
> + * @priv: resolver function private state.
> + *
> + * Some tools may have already a way to resolve kernel functions, allow them 
> to
> + * keep using it instead of duplicating all the entries inside
> + * pevent->funclist.
> + */
> +int pevent_set_function_resolver(struct pevent *pevent,
> +  pevent_func_resolver_t *func, void *priv)
> +{
> + struct func_resolver *resolver = malloc(sizeof(*resolver));
> +
> + if (resolver == NULL) {
> + errno = ENOMEM;

Why set errno? Won't a failed malloc set it for us?

> + return -1;
> + }
> +
> + resolver->func = func;
> + resolver->priv = priv;
> +
> + free(pevent->func_resolver);
> + pevent->func_resolver = resolver;

Also I wonder if we should add a way to clear the resolver. That is,
you want to use the default resolver?

Not really a necessity, as I don't see any current programs using it,
but it would complete the interface.
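
A minimal sketch of what a set/clear pair could look like, under stated assumptions: struct handle is a stand-in for struct pevent, and handle_set_resolver() and handle_reset_resolver() are hypothetical names, not existing libtraceevent functions. It also leaves errno alone on allocation failure, relying on POSIX malloc() having set ENOMEM already (ISO C alone does not promise that).

```c
#include <stdlib.h>

typedef char *(resolver_fn)(void *priv, unsigned long long *addrp,
                            char **modp);

struct resolver {
    resolver_fn *func;
    void *priv;
};

/* Stand-in for struct pevent; only the field the sketch needs. */
struct handle {
    struct resolver *resolver;
};

/* Install an alternative resolver, replacing any previous one. */
static int handle_set_resolver(struct handle *h, resolver_fn *func,
                               void *priv)
{
    struct resolver *r = malloc(sizeof(*r));

    if (r == NULL)
        return -1; /* POSIX malloc() has already set errno */

    r->func = func;
    r->priv = priv;

    free(h->resolver);
    h->resolver = r;
    return 0;
}

/* Drop the alternative resolver so lookups use the default path again. */
static void handle_reset_resolver(struct handle *h)
{
    free(h->resolver);
    h->resolver = NULL;
}

/* Toy resolver for demonstration: returns priv as the function name. */
static char *demo_resolver(void *priv, unsigned long long *addrp,
                           char **modp)
{
    (void)addrp;
    *modp = NULL;
    return priv;
}
```

Because the lookup path in the patch already tests the pointer for NULL before dispatching, clearing it is enough to restore the default behaviour in this model.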

-- Steve

> +
> + return 0;
> +}
> +
> +static struct func_map *
> +find_func(struct pevent *pevent, unsigned long long addr)
> +{
> + struct func_map *map;
> +
> + if (!pevent->func_resolver)
> + return __find_func(pevent, addr);
> +
> + map = &pevent->func_resolver->map;
> + map->mod  = NULL;
> + map->addr = addr;
> + map->func = pevent->func_resolver->func(pevent->func_resolver->priv,
> + &map->addr, &map->mod);
> + if (map->func == NULL)
> + return NULL;
> +
> + return map;
> +}
> +
>  /**
>   * pevent_find_function - find a function by a given address
>   * @pevent: handle for the pevent
> @@ -6564,6 +6618,7 @@ void pevent_free(struct pevent *pevent)
>   free(pevent->trace_clock);
>   free(pevent->events);
>   free(pevent->sort_events);
> + free(pevent->func_resolver);
>  
>   free(pevent);
>  }
> diff --git a/tools/lib/traceevent/event-parse.h 
> b/tools/lib/traceevent/event-parse.h
> index 063b1971eb35..416e1bd9fe33 100644
> --- a/tools/lib/traceevent/event-parse.h
> +++ b/tools/lib/traceevent/event-parse.h
> @@ -453,6 +453,10 @@ struct cmdline_list;
>  struct func_map;
>  struct func_list;
>  struct event_handler;
> +struct func_resolver;
> +
> +typedef char *(pevent_func_resolver_t)(void *priv,
> +unsigned long long *addrp, char **modp);
>  
>  struct pevent {
>   int ref_count;
> @@ -481,6 +485,7 @@ struct pevent {
>   int cmdline_count;
>  
>   struct func_map *func_map;
> + struct func_resolver *func_resolver;
>   struct func_list *funclist;
>   unsigned int func_count;
>  
> @@ -611,6 +616,8 @@ enum trace_flag_type {
>   TRACE_FLAG_SOFTIRQ  = 0x10,
>  };
>  
> +int pevent_set_function_resolver(struct pevent *pevent,
> +  pevent_func_resolver_t *func, void *priv);
>  int pevent_register_comm(struct pevent *pevent, const char *comm, int pid);
>  int pevent_register_trace_clock(struct pevent *pevent, const char 
> *trace_clock);
>  int pevent_register_function(struct pevent *pevent, char *name,


Re: Dealing with the NMI mess

2015-07-23 Thread Steven Rostedt

On Thu, 23 Jul 2015 14:08:59 -0700
Linus Torvalds  wrote:

> On Thu, Jul 23, 2015 at 1:49 PM, Andy Lutomirski  wrote:
> >
> > Issue A: to return with RF clear, we need to disarm the breakpoint.
> > If it's limited to the duration of the NMI, that's easy.  If not, when
> > do we re-arm?  New prepare_exit_to_usermode hook?  Hmm, setting ti
> > flags during context switch may target the wrong task.
> 
> We don't re-arm it.
> 

Let me get this straight. The idea is in the #DB handler to detect that
it was triggered in NMI context, and if so, simply disarm that
breakpoint permanently, right?

Nothing should be adding hw breakpoints to NMI code anyway. Sounds
perfectly reasonable to me. Of course, how we tell we are in NMI
brings back all the races as we had in the nesting code. We can check
the per-cpu variable that is set with nmi_enter() and cleared at
nmi_exit() but what happens if the breakpoint is outside those calls.
We can check the stack pointer, but then we are back to userspace
fooling us. Maybe add the DF trick again?

-- Steve

Re: [RFC PATCH] tools lib traceevent: Allow setting an alternative symbol resolver

2015-07-23 Thread Arnaldo Carvalho de Melo

Em Thu, Jul 23, 2015 at 02:11:33PM -0400, Steven Rostedt escreveu:
> On Thu, 23 Jul 2015 14:10:39 -0400
> Steven Rostedt  wrote:
> 
> > On Thu, 23 Jul 2015 14:25:54 -0300
> > Arnaldo Carvalho de Melo  wrote:
> > 
> > > diff --git a/tools/lib/traceevent/event-parse.c 
> > > b/tools/lib/traceevent/event-parse.c
> > > index cc25f059ab3d..2750e7e7efff 100644
> > > --- a/tools/lib/traceevent/event-parse.c
> > > +++ b/tools/lib/traceevent/event-parse.c
> > > @@ -418,7 +418,7 @@ static int func_map_init(struct pevent *pevent)
> > >  }
> > >  
> > >  static struct func_map *
> > > -find_func(struct pevent *pevent, unsigned long long addr)
> > > +__find_func(struct pevent *pevent, unsigned long long addr)
> > >  {
> > >   struct func_map *func;
> > >   struct func_map key;
> > > @@ -434,6 +434,51 @@ find_func(struct pevent *pevent, unsigned long long 
> > > addr)
> > >   return func;
> > >  }
> > >  
> > > +static struct {
> > > + pevent_function_resolver_t *function;
> > > + void   *priv;
> > > +} function_resolver;
> > > +
> > > +/**
> > > + * pevent_set_function_resolver - set an alternative function resolver
> > > + * @resolver - function to be used
> > > + * @priv - resolver function private state.
> > > + *
> > > + * Some tools may have already a way to resolve kernel functions, allow 
> > > them
> > > + * to keep using it instead of duplicating all the entries inside 
> > > pevent->funclist.
> > > + */
> > > +void pevent_set_function_resolver(pevent_function_resolver_t *resolver, 
> > > void *priv)
> > > +{
> > > + function_resolver.function = resolver;
> > > + function_resolver.priv= priv;
> > 
> > What about passing in pevent, and making the allocation here?
> 
> In fact, we could remove the global function_resolver, and allocate
> that here too.

Like this?

commit ce60dbbf3cf352b54ddb44c8e86fde159b4e539e
Author: Arnaldo Carvalho de Melo 
Date:   Wed Jul 22 12:36:55 2015 -0300

tools lib traceevent: Allow setting an alternative symbol resolver

The perf tools have a symbol resolver that includes solving kernel
symbols using either kallsyms or ELF symtabs, and it also is using
libtraceevent to format the trace events fields, including via
subsystem specific plugins, like the "timer" one.

To solve fields like "timer:hrtimer_start"'s "function", libtraceevent
needs a way to map from its value to a function name and addr.

This patch provides a way for tools that already have symbol resolving
facilities to ask libtraceevent to use it when needing to resolve
kernel symbols.

Acked-by: David Ahern 
Cc: Adrian Hunter 
Cc: Borislav Petkov 
Cc: Frederic Weisbecker 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Stephane Eranian 
Cc: Steven Rostedt 
Link: http://lkml.kernel.org/n/tip-fdx1fazols17w5py26ia3...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 

diff --git a/tools/lib/traceevent/event-parse.c 
b/tools/lib/traceevent/event-parse.c
index cc25f059ab3d..7f9266225f11 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -418,7 +418,7 @@ static int func_map_init(struct pevent *pevent)
 }
 
 static struct func_map *
-find_func(struct pevent *pevent, unsigned long long addr)
+__find_func(struct pevent *pevent, unsigned long long addr)
 {
struct func_map *func;
struct func_map key;
@@ -434,6 +434,60 @@ find_func(struct pevent *pevent, unsigned long long addr)
return func;
 }
 
+struct func_resolver {
+   pevent_func_resolver_t *func;
+   void   *priv;
+   struct func_map map;
+};
+
+/**
+ * pevent_set_function_resolver - set an alternative function resolver
+ * @pevent: handle for the pevent
+ * @resolver: function to be used
+ * @priv: resolver function private state.
+ *
+ * Some tools may have already a way to resolve kernel functions, allow them to
+ * keep using it instead of duplicating all the entries inside
+ * pevent->funclist.
+ */
+int pevent_set_function_resolver(struct pevent *pevent,
+pevent_func_resolver_t *func, void *priv)
+{
+   struct func_resolver *resolver = malloc(sizeof(*resolver));
+
+   if (resolver == NULL) {
+   errno = ENOMEM;
+   return -1;
+   }
+
+   resolver->func = func;
+   resolver->priv = priv;
+
+   free(pevent->func_resolver);
+   pevent->func_resolver = resolver;
+
+   return 0;
+}
+
+static struct func_map *
+find_func(struct pevent *pevent, unsigned long long addr)
+{
+   struct func_map *map;
+
+   if (!pevent->func_resolver)
+   return __find_func(pevent, addr);
+
+   map = &pevent->func_resolver->map;
+   map->mod  = NULL;
+   map->addr = addr;
+   map->func = pevent->func_resolver->func(pevent->func_resolver->priv,
+   &map->addr, &map->mod);
+   if (map->func == NULL)
+   return NULL;
+
+

Re: [PATCH 4/5] ARM: dts: qcom: Add ks8851 node for wired ethernet

2015-07-23 Thread Andy Gross

On Tue, Jun 16, 2015 at 01:31:15PM -0700, Stephen Boyd wrote:
> The micrel ks8851 device is present on MSM8960 CDP boards. It is
> connected to two regulators, one controlled via a gpio and
> another controlled via the RPM. Add the gsbi, spi, gpio
> regulator, and micrel ks8851 nodes so that ethernet works
> properly.
> 
> Signed-off-by: Stephen Boyd 
> ---

Applied, Thanks!

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 2/5] ARM: dts: qcom: Add MSM8960 RPM and RPM regulator nodes

2015-07-23 Thread Andy Gross

On Tue, Jun 16, 2015 at 01:31:13PM -0700, Stephen Boyd wrote:
> Add the basic RPM and RPM regulator nodes that boards can fill in
> with their board specific details.
> 
> Signed-off-by: Stephen Boyd 
> ---

Applied, thanks

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH] Input: LEDs - skip unnamed LEDs

2015-07-23 Thread Pavel Machek

On Thu 2015-07-23 13:57:13, Dmitry Torokhov wrote:
> On Thu, Jul 23, 2015 at 08:19:13AM +0200, Pavel Machek wrote:
> > On Wed 2015-07-22 15:02:02, Dmitry Torokhov wrote:
> > > Devices may declare more LEDs than what is known to input-leds
> > > (HID does this for some devices). Instead of showing ugly warnings
> > > on connect and, even worse, oopsing on disconnect, let's simply
> > > ignore LEDs that are not known to us.
> > > 
> > > Reported-by: Vlastimil Babka 
> > > Signed-off-by: Dmitry Torokhov 
> > > ---
> > >  drivers/input/input-leds.c |   16 ++--
> > >  1 file changed, 14 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/input/input-leds.c b/drivers/input/input-leds.c
> > > index 074a65e..766bf26 100644
> > > --- a/drivers/input/input-leds.c
> > > +++ b/drivers/input/input-leds.c
> > > @@ -71,6 +71,18 @@ static void input_leds_event(struct input_handle 
> > > *handle, unsigned int type,
> > >  {
> > >  }
> > >  
> > > +static int input_leds_get_count(struct input_dev *dev)
> > > +{
> > > + unsigned int led_code;
> > > + int count = 0;
> > > +
> > > > + for_each_set_bit(led_code, dev->ledbit, LED_CNT)
> > > > +         if (input_led_info[led_code].name)
> > > > +                 count++;
> > > +
> > > + return count;
> > > +}
> > > +
> > >  static int input_leds_connect(struct input_handler *handler,
> > > struct input_dev *dev,
> > > const struct input_device_id *id)
> > > @@ -81,7 +93,7 @@ static int input_leds_connect(struct input_handler 
> > > *handler,
> > >   int led_no;
> > >   int error;
> > >  
> > > - num_leds = bitmap_weight(dev->ledbit, LED_CNT);
> > > + num_leds = input_leds_get_count(dev);
> > >   if (!num_leds)
> > >   return -ENXIO;
> > >  
> > > @@ -112,7 +124,7 @@ static int input_leds_connect(struct input_handler 
> > > *handler,
> > >   led->handle = >handle;
> > >   led->code = led_code;
> > >  
> > > - if (WARN_ON(!input_led_info[led_code].name))
> > > + if (!input_led_info[led_code].name)
> > >   continue;
> > >  
> > >   led->cdev.name = kasprintf(GFP_KERNEL, "%s::%s",
> > >
> > 
> > Are you sure? AFAICT you need to fix err_unregister_leds not to
> > unregister leds with no name...
> 
> Well, if we skip unnamed leds and do not include them into total count
> then we won't need to unregister them.

I don't get it.

If there's unnamed led at index 0, and named one at indexes 1 and
2.. and there's -ENOMEM registering 2, it will try to unregister leds
0 and 1, and crash, no?
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: Dealing with the NMI mess

2015-07-23 Thread Peter Zijlstra

On Thu, Jul 23, 2015 at 01:38:33PM -0700, Linus Torvalds wrote:

> And the "take them and disable them" is really simple. No "am I in an
> NMI context" thing (because that leads to the whole question about
> "what is NMI context"). That's not the real rule anyway.
> 
> No, make it very simple and straightforward. Make the test be "uhhuh,
> I got a #DB in kernel mode, and interrupts were disabled - I know I'm
> going to return with "ret", so I'm just going to have to disable this
> breakpoint".
> 
> Nothing clever. Nothing subtle. Nothing that needs "this range of
> instructions is magical". No.  Just a very simple rule: if the context
> we return to is kernel mode and interrupts are disabled, we're using
> 'ret', so we cannot suppress debug faults.
> 
> Did I miss something? There were a lot of emails flying around, but I
> *thought* I saw them all..

So the NMI could trigger userspace debug register faults, and simply
disabling them would make the whole debug register thing entirely
unreliable.


[PATCH] noop-iosched: do not attempt to sort requests

2015-07-23 Thread Tahsin Erdogan

Noop scheduler currently dispatches a request by calling
elv_dispatch_sort(). In practice, sorting does not occur because
__elv_next_request() asks the io scheduler to dispatch a request
only when elevator queue is empty.

Also, not reordering requests seems more appropriate for noop. This
change makes the behavior more explicit.

Reviewed-by: Nauman Rafique 
Signed-off-by: Tahsin Erdogan 
---
 block/Kconfig.iosched | 8 
 block/noop-iosched.c  | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/Kconfig.iosched b/block/Kconfig.iosched
index 421bef9..b9e42f8 100644
--- a/block/Kconfig.iosched
+++ b/block/Kconfig.iosched
@@ -7,10 +7,10 @@ config IOSCHED_NOOP
default y
---help---
  The no-op I/O scheduler is a minimal scheduler that does basic merging
- and sorting. Its main uses include non-disk based block devices like
- memory devices, and specialised software or hardware environments
- that do their own scheduling and require only minimal assistance from
- the kernel.
+ only. Its main uses include non-disk based block devices like memory
+ devices, and specialised software or hardware environments that do
+ their own scheduling and require only minimal assistance from the
+ kernel.
 
 config IOSCHED_DEADLINE
tristate "Deadline I/O scheduler"
diff --git a/block/noop-iosched.c b/block/noop-iosched.c
index 3de89d4..f0fec14 100644
--- a/block/noop-iosched.c
+++ b/block/noop-iosched.c
@@ -26,7 +26,7 @@ static int noop_dispatch(struct request_queue *q, int force)
struct request *rq;
rq = list_entry(nd->queue.next, struct request, queuelist);
list_del_init(&rq->queuelist);
-   elv_dispatch_sort(q, rq);
+   elv_dispatch_add_tail(q, rq);
return 1;
}
return 0;
-- 
2.4.3.573.g4eafbef
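
As a rough illustration of the commit message above (this is not block-layer code): when the dispatch list is empty, inserting one request gives the same list whether the insert is sorted or appended at the tail. The `struct req` type and both helper functions below are hypothetical stand-ins, not the elevator API.

```c
#include <stddef.h>

/* Hypothetical stand-in for a queued request; only the sector matters here. */
struct req {
    unsigned long sector;
    struct req *next;
};

/* Sorted insert: keep the list ordered by ascending sector. */
static void dispatch_sorted(struct req **head, struct req *rq)
{
    struct req **pos = head;

    while (*pos && (*pos)->sector <= rq->sector)
        pos = &(*pos)->next;
    rq->next = *pos;
    *pos = rq;
}

/* Tail insert: append without looking at sectors. */
static void dispatch_add_tail(struct req **head, struct req *rq)
{
    struct req **pos = head;

    while (*pos)
        pos = &(*pos)->next;
    rq->next = NULL;
    *pos = rq;
}
```

The two helpers only diverge once the list already holds a request, which is the case the commit message says does not arise for noop.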


Re: [RFC 1/4] mm, compaction: introduce kcompactd

2015-07-23 Thread David Rientjes

On Thu, 23 Jul 2015, Vlastimil Babka wrote:

> > When a khugepaged allocation fails for a node, it could easily kick off 
> > background compaction on that node and revisit the range later, very 
> > similar to how we can kick off background compaction in the page allocator 
> > when async or sync_light compaction fails.
> 
> The revisiting sounds rather complicated. Page allocator doesn't have to do 
> that.
> 

I'm referring to khugepaged having a hugepage allocation fail, the page 
allocator kicking off background compaction, and khugepaged rescanning the 
same memory for which the allocation failed later.

> > The distinction I'm trying to draw is between "periodic" and "background" 
> > compaction.  I think there're usecases for both and we shouldn't be 
> > limiting ourselves to one or the other.
> 
> OK, I understand you think we can have both, and the periodic one would be in
> khugepaged. My main concern is that if we do the periodic one in khugepaged,
> people might oppose adding yet another one as kcompactd. I hope we agree that
> khugepaged is not suitable for all the use cases of the background one.
> 

Yes, absolutely.  I agree that we need the ability to do background 
compaction without requiring CONFIG_TRANSPARENT_HUGEPAGE.

> My secondary concern/opinion is that I would hope that the background
> compaction would be good enough to remove the need for the periodic one. So I
> would try the background one first. But I understand the periodic one is
> simpler to implement.
> On the other hand, it's not as urgent if you can simulate it from userspace.
> With the 15min period you use, there's likely not much overhead saved when
> invoking it from within the kernel? Sure there wouldn't be the synchronization
> with khugepaged activity, but I still wonder if waiting for up to 1 minute
> before khugepaged wakes up can make much difference with the 15min period.
> Hm, your cron job could also perhaps adjust the khugepaged sleep tunable when
> compaction is done, which IIRC results in immediate wakeup.
> 

There are certainly ways to do this from userspace, but the premise is 
that this issue, specifically for users of thp, is significant for 
everyone ;)

The problem that I've encountered with a background-only approach is that 
it doesn't help when you exec a large process that wants to fault most of 
its text and thp immediately cannot be allocated.  This can be a result of 
never having done any compaction at all other than from the page 
allocator, which terminates when a page of the given order is available.  
So on a fragmented machine, all memory faulted is shown in 
thp_fault_fallback and we rely on khugepaged to (slowly) fix this problem 
up for us.  We have shown great improvement in cpu utilization by 
periodically compacting memory today.
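
One way to watch the fallback count mentioned above is to read the `thp_fault_fallback` line from /proc/vmstat. A minimal sketch, assuming the usual one "name value" pair per line layout of that file; the helper name `vmstat_counter` is made up for illustration and the function only parses text handed to it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find "name value" in vmstat-style text.
 * Returns 0 and stores the value on success, -1 if the name is absent.
 */
static int vmstat_counter(const char *text, const char *name,
                          unsigned long long *value)
{
    size_t len = strlen(name);
    const char *line = text;

    while (line && *line) {
        /* Match the whole name, followed by the separating space. */
        if (strncmp(line, name, len) == 0 && line[len] == ' ')
            return sscanf(line + len, "%llu", value) == 1 ? 0 : -1;
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}
```

Passing the contents of /proc/vmstat as `text` before and after a workload, and subtracting the two values, would give the number of fallbacks that workload caused.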

Background compaction arguably wouldn't help that situation because it's 
not fast enough to compact memory simultaneous to the large number of page 
faults, and you can't wait for it to complete at exec().  The result is 
the same: large thp_fault_fallback.

So I can understand the need for both periodic and background compaction 
(and direct compaction for non-thp non-atomic high-order allocations 
today) and I'm perhaps not as convinced as you are that we can eventually 
do without periodic compaction.

It seems to me that the vast majority of this discussion has centered 
around the vehicle that performs the compaction.  We certainly require 
kcompactd for background compaction, and we both agree that we need that 
functionality.

Two issues I want to bring up:

 (1) do non-thp configs benefit from periodic compaction?

 In my experience, no, but perhaps there are other use cases where
 this has been a pain.  The primary candidates, in my opinion,
 would be the networking stack and slub.  Joonsoo reports having to
 workaround issues with high-order slub allocations being too
 expensive.  I'm not sure that would be better served by periodic
 compaction, but it seems like a candidate for background compaction.

 This is why my rfc tied periodic compaction to khugepaged, and we
 have strong evidence that this helps thp and cpu utilization.  For
 periodic compaction to be possible outside of thp, we'd need a use
 case for it.

 (2) does kcompactd have to be per-node?

 I don't see the immediate benefit since direct compaction can
 already scan remote memory and migrate it, khugepaged can do the
 same.  Is there evidence that suggests that a per-node kcompactd
 is significantly better than a single kthread?  I think others
 would be more receptive of a single kthread addition.

My theory is that periodic compaction is only significantly beneficial for 
thp per my rfc, and I think there's a significant advantage for khugepaged 
to be able to trigger this periodic compaction immediately before scanning 
and allocating to avoid waiting potentially for the lengthy 
alloc_sleep_millisecs.

Re: Dealing with the NMI mess

2015-07-23 Thread Steven Rostedt

On Thu, 23 Jul 2015 13:21:16 -0700
Andy Lutomirski  wrote:

> 3. Forbid faults (other than MCE) inside NMI.
> 
> Option 3 is almost easy.  There are really only two kinds of faults
> that can legitimately nest inside NMI: #PF and #DB.  #DB is easy to
> fix (e.g. with my patches or Peter's patches).

What about int3? Which is needed to make ftrace work. This was a
requirement to get rid of stomp-machine when updating ftrace functions,
as well as the rationale for doing the whole NMI nesting work in the
first place.

> 
> What if we went all out and forbade page faults in NMI as well.  There
> are two reasons that I can think of that we might page fault inside an
> NMI:
> 
> a) vmalloc fault.  I think Ingo already half-implemented a rework to
> eliminate vmalloc faults entirely.
> 
> b) User memory access faults.

c) stack tracing faults

I would have NMIs debug deadlocks by printing stack traces. The stack
tracer can page fault, and before the NMI nesting code, while debugging
machines, these stack dumps would randomly reboot the box. While
writing the NMI nesting code I realized why those reboots happened:
the stack trace faulted, and the printk from NMI was slow enough to
have another NMI go off and stomp over the outer NMI's stack, which
led to triple faults and such.

-- Steve

Re: Dealing with the NMI mess

2015-07-23 Thread Willy Tarreau

On Thu, Jul 23, 2015 at 02:13:16PM -0700, Linus Torvalds wrote:
> On Thu, Jul 23, 2015 at 1:52 PM, Willy Tarreau  wrote:
> >
> > What's the worst case that can happen with RF cleared when returning
> > to user space ?
> 
> Not a good idea. We are fine breaking breakpoints on the kernel ("use
> the tracing infrastructure instead"). Breaking it in user space is not
> really an option.

But that wouldn't disable the breakpoint, just make it strike again,
so the user would not be hurt.

> And we really don't need to. We'd only use 'ret' when returning to
> kernel code. And not even for the usual case, only for the "interrupts
> are off" case.  If somebody tries to put a breakpoint on something
> that is used in an irq-off situation, they are doing something very
> specialized, and we can tell them: "sorry, we had to break your use
> case because it's crazy any other way".
> 
> Those kind of people are by definition not "users". They are mucking
> with kernel internals. Breaking them is not a regression.
> 
> Btw, we should still ask Intel for that "fast iret that doesn't
> re-enable NMI". So for possible future CPU's we might let people do
> crazy things again.

I'm just thinking that there should be an option for this : task switching.
You can store the EFLAGS in the TSS, so by preparing a dummy task with
everything needed to emulate iret, we might be able to do it without the
iret instruction. Or is this a stupid idea ? At least now I've well
understood that ugliness is not an excuse for not proposing something :-)

Willy


Re: Dealing with the NMI mess

2015-07-23 Thread Peter Zijlstra

On Thu, Jul 23, 2015 at 01:21:16PM -0700, Andy Lutomirski wrote:
> 3. Forbid faults (other than MCE) inside NMI.
> 
> Option 3 is almost easy.  There are really only two kinds of faults
> that can legitimately nest inside NMI: #PF and #DB.  #DB is easy to
> fix (e.g. with my patches or Peter's patches).
> 
> What if we went all out and forbade page faults in NMI as well.  There
> are two reasons that I can think of that we might page fault inside an
> NMI:
> 
> b) User memory access faults.
> 
> The reason we access user state in general from an NMI is to allow
> perf to capture enough user stack data to let the tooling backtrace
> back to user space.  What if we did it differently?  Instead of
> capturing this data in NMI context, capture it in
> prepare_exit_to_usermode. 

> Peter, can this be done without breaking the perf ABI?  If we were
> designing all of this stuff from scratch right now, I'd suggest doing
> it this way, but I'm not sure whether it makes sense to try to
> retrofit it in.

Not really; but also almost :/

So the thing is that we currently attach the user backtrace to all
events -- and there can be many before we return to userspace again.

So none of those events would have a userspace stack, I'm sure that's
going to confuse the tooling.

OTOH, userspace stacks are a best effort thing, we bail at the first
sign of trouble (eg. the stack page is not there).

Now realistically this 'never' happens, and it would result in
consistently truncated user traces, where your proposal would result in
a whole bunch of events with no user traces and then an 'extra' event
with one.


Re: [PATCH 5/8] power: bq27x00_battery: Renaming for consistency

2015-07-23 Thread Pali Rohár

On Thursday 23 July 2015 22:56:26 Belisko Marek wrote:
> Hi Pali,
> 
> On Thu, Jul 23, 2015 at 10:15 PM, Pali Rohár 
> wrote:
> > On Thursday 23 July 2015 19:03:08 Andrew F. Davis wrote:
> >> >> -#ifdef CONFIG_BATTERY_BQ27X00_I2C
> >> >> -MODULE_ALIAS("i2c:bq27000-battery");
> >> >> +#ifdef CONFIG_BATTERY_BQ27XXX_I2C
> >> >> +MODULE_ALIAS("i2c:bq27xxx-battery");
> >> >> 
> >> >>  #endif
> >> > 
> >> > Why is this MODULE_ALIAS needed? Some lines upper there is
> >> > 
> >> >  MODULE_DEVICE_TABLE(i2c, bq27xxx_id);
> >> > 
> >> > which add proper i2c: module alias...
> >> 
> >> Not sure, looks like it was added in commit
> >> 8ebb7e9c1a502cfc300618c19c3c6f06fc76d237 which claims that the
> >> "module won't get loaded automatically" without it, but I have not
> >> had this problem, so I'm not sure why it's there.
> > 
> > git grep bq27000-battery show me that only one driver uses that
> > name: drivers/w1/slaves/w1_bq27000.c
> > 
> > And more over, it is platform device, not i2c device. So that
> > commit 8ebb7e9c1a502cfc300618c19c3c6f06fc76d237 is wrong! CCing
> > Marek.
> 
> If you look to power/bq27x00 driver then there is I2C part and
> platform part only
> both selectable by config. In
> 8ebb7e9c1a502cfc300618c19c3c6f06fc76d237 was added MODULE_ALIAS to
> have this driver working as module for both buses. Even if I2C isn't
> used anywhere
> I add MODULE_ALIAS also for that.
> 
> > MODULE_ALIAS("platform:bq27000-battery") is really needed for
> > w1_bq27000.c but MODULE_ALIAS("i2c:bq27000-battery") should be
> > removed. It is not used by any board platform code or DT.
> 
> Not sure if it's good idea to remove it. Somebody outside can use it.
> 
> > Marek, correct me if I'm wrong.
> > 
> > --
> > Pali Rohár
> > pali.ro...@gmail.com
> 
> BR,
> 
> marek

Who, where and why is using alias i2c:bq27000-battery??

It is obviously wrong to add that alias. MODULE_DEVICE_TABLE(i2c, ... 
macro automatically adds all MODULE_ALIASes for all i2c devices.

If there is such a device which needs i2c:bq27000-battery, it should be
fixed to use the i2c:bq27??? identifier exported by MODULE_DEVICE_TABLE.

git grep on kernel tree did not return anything, so there is no usage.

Of course alias for platform:bq27000-battery is needed.

-- 
Pali Rohár
pali.ro...@gmail.com


Re: Dealing with the NMI mess

2015-07-23 Thread Linus Torvalds

On Thu, Jul 23, 2015 at 1:52 PM, Willy Tarreau  wrote:
>
> What's the worst case that can happen with RF cleared when returning
> to user space ?

Not a good idea. We are fine breaking breakpoints on the kernel ("use
the tracing infrastructure instead"). Breaking it in user space is not
really an option.

And we really don't need to. We'd only use 'ret' when returning to
kernel code. And not even for the usual case, only for the "interrupts
are off" case.  If somebody tries to put a breakpoint on something
that is used in an irq-off situation, they are doing something very
specialized, and we can tell them: "sorry, we had to break your use
case because it's crazy any other way".
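
Roughly, the rule above could be sketched as follows. This is only an illustration, not code from any patch posted in this thread: the `struct saved_regs` type and the function name are hypothetical, while the bit positions (privilege level in the low two bits of CS, IF as EFLAGS bit 9) are the architectural x86 ones.

```c
#include <stdbool.h>

#define EFLAGS_IF   (1UL << 9)   /* interrupt enable flag */
#define CS_RPL_MASK 0x3UL        /* low two bits of CS: privilege level */

/* Hypothetical, trimmed-down view of the saved exception frame. */
struct saved_regs {
    unsigned long cs;
    unsigned long flags;
};

/*
 * True when the context being returned to is kernel mode with
 * interrupts disabled, i.e. the case where the exit path would use
 * 'ret' and so cannot rely on RF to suppress a repeated #DB.
 */
static bool must_disable_breakpoint(const struct saved_regs *regs)
{
    bool kernel_mode = (regs->cs & CS_RPL_MASK) == 0;
    bool irqs_off = (regs->flags & EFLAGS_IF) == 0;

    return kernel_mode && irqs_off;
}
```

A return to user space, or to kernel code with interrupts enabled, falls outside the predicate and would keep using iret.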

Those kind of people are by definition not "users". They are mucking
with kernel internals. Breaking them is not a regression.

Btw, we should still ask Intel for that "fast iret that doesn't
re-enable NMI". So for possible future CPU's we might let people do
crazy things again.

  Linus

Re: [PATCHv2 2/6] x86, mpx: do not set ->vm_ops on mpx VMAs

2015-07-23 Thread Kirill A. Shutemov

On Thu, Jul 23, 2015 at 01:59:11PM -0700, Andrew Morton wrote:
> On Fri, 17 Jul 2015 14:53:09 +0300 "Kirill A. Shutemov" 
>  wrote:
> 
> > MPX sets up a private anonymous mapping, but uses vma->vm_ops too.
> > This can confuse core VM, as it relies on vma->vm_ops to distinguish
> > file VMAs from anonymous.
> > 
> > As result we will get SIGBUS, because handle_pte_fault() thinks it's
> > file VMA without vm_ops->fault and it doesn't know how to handle the
> > situation properly.
> > 
> > Let's fix that by not setting ->vm_ops.
> > 
> > We don't really need ->vm_ops here: MPX VMA can be detected with VM_MPX
> > flag. And vma_merge() will not merge MPX VMA with non-MPX VMA, because
> > ->vm_flags won't match.
> > 
> > The only thing left is name of VMA. I'm not sure if it's part of ABI, or
> > we can just drop it. The patch keep it by providing arch_vma_name() on x86.
> > 
> > Build tested only.
> 
> mpx.c has changed.
> 
> arch/x86/mm/mpx.c: In function 'try_unmap_single_bt':
> arch/x86/mm/mpx.c:930: error: implicit declaration of function 'is_mpx_vma'
> 
> I'll drop this patch and see what happens.

Ingo has applied an updated version to x86/urgent:

https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=a89652769470d12cd484ee3d3f7bde0742be8d96

-- 
 Kirill A. Shutemov

Re: Dealing with the NMI mess

2015-07-23 Thread Linus Torvalds

On Thu, Jul 23, 2015 at 1:49 PM, Andy Lutomirski  wrote:
>
> Issue A: to return with RF clear, we need to disarm the breakpoint.
> If it's limited to the duration of the NMI, that's easy.  If not, when
> do we re-arm?  New prepare_exit_to_usermode hook?  Hmm, setting ti
> flags during context switch may target the wrong task.

We don't re-arm it.

We can entertain the notion *eventually* to do something clever, but
for now, just say: stability and simplicity is more important.

People can use tracepoints in interrupts-off code (they get rewritten
with 'int3', that's fine), but not instruction breakpoints.

> Issue C: #DB with invalid stack pointer (can happen due to watchpoints
> during SYSCALL entry or SYSRET exit).  I guess we need to ban such
> watchpoints.

... but this is unrelated to NMI, just "syscall is a nasty interface".
Don't we already ban them?

> Issue D: debug exception inside EFI (especially mixed-mode EFI).  We
> can't return using RET, so we need to catch that case.

If NMI code calls EFI code, then it's broken.

> These issues mostly go away if we preemptively disarm DR7 early in NMI
> processing and rearm it at the end.

I'm not *violently* opposed to that, but it's just a band-aid. It
doesn't *fix* anything. You aren't protecting against random DB
exceptions just because somebody put a data breakpoint on the NMI
stack, for example. You still get page faults. Etc etc.

So I think the whole "use ret instead" is a pretty simple approach.
Make that "just work".

Then, if you want to play with dr7 inside NMI to make it more likely
that you can have breakpoints live in irq-off situation, I think
that's a magic special case. It shouldn't be part of the design.
Things should work without it.

 Linus

Re: Dealing with the NMI mess

2015-07-23 Thread Willy Tarreau

On Thu, Jul 23, 2015 at 01:53:34PM -0700, Andy Lutomirski wrote:
> On Thu, Jul 23, 2015 at 1:52 PM, Willy Tarreau  wrote:
> > On Thu, Jul 23, 2015 at 01:38:33PM -0700, Linus Torvalds wrote:
> >> On Thu, Jul 23, 2015 at 1:21 PM, Andy Lutomirski  
> >> wrote:
> >> >
> >> > 2. Forbid IRET inside NMIs.  Doable but maybe not that pretty.
> >> >
> >> > We haven't considered:
> >> >
> >> > 3. Forbid faults (other than MCE) inside NMI.
> >>
> >> I'd really prefer #2. #3 depends on us getting many things right, and
> >> never introducing new cases in the future.
> >>
> >> #2, in contrast, seems to be fairly localized. Yes, RF is an issue,
> >> but returning to user space with RF clear doesn't really seem to be
> >> all that problematic.
> >
> > What's the worst case that can happen with RF cleared when returning
> > to user space ? My understanding is that it's just that we risk to
> > break again on an instruction that had a break point set and which
> > already triggered the breakpoint, right ?
> 
> I assume Linus meant returning to kernel space with RF clear.  Returns
> to userspace have their own fancy logic here, and it's survived for a
> couple of releases, including through an explicit test of RF handling
> :)

Ah you must be right, got it. Yes you want to break into the NMI handler
and you either disable all breakpoints/single-step until the NMI's iret
by clearing DR7, or you loop over and over on the same instruction if
you try to restart the stopped instruction with RF clear. That makes
sense.

Thanks,
Willy


Re: [PATCH 3/5] ARM: dts: qcom: Add MSM8960 CDP RPM regulators

2015-07-23 Thread Andy Gross

On Tue, Jun 16, 2015 at 01:31:14PM -0700, Stephen Boyd wrote:
> Add RPM regulators and configure their constraints on the MSM8960
> CDP so that we can control these supplies.
> 
> Signed-off-by: Stephen Boyd 
> ---

Applied, thanks

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 5/5] ARM: qcom_defconfig: Enable options for KS8851 ethernet

2015-07-23 Thread Andy Gross

On Tue, Jun 16, 2015 at 01:31:16PM -0700, Stephen Boyd wrote:
> Enable the RPM and RPM regulator drivers as well as the KS8851
> ethernet driver so that ethernet works on MSM8960 CDP.
> 
> Signed-off-by: Stephen Boyd 
> ---

Applied, thanks!

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 1/5] ARM: dts: qcom: Replace gpio node with pinctrl node

2015-07-23 Thread Andy Gross

On Tue, Jun 16, 2015 at 01:31:12PM -0700, Stephen Boyd wrote:
> Now that we have a proper pinctrl driver for the gpio block we
> can change the compatible field here and configure the pinmux on
> msm8960 devices.
> 
> Signed-off-by: Stephen Boyd 
> ---

Applied, thanks

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCHv2 2/6] x86, mpx: do not set ->vm_ops on mpx VMAs

2015-07-23 Thread Andrew Morton

On Fri, 17 Jul 2015 14:53:09 +0300 "Kirill A. Shutemov" 
 wrote:

> MPX sets up a private anonymous mapping, but uses vma->vm_ops too.
> This can confuse core VM, as it relies on vma->vm_ops to distinguish
> file VMAs from anonymous.
> 
> As result we will get SIGBUS, because handle_pte_fault() thinks it's
> file VMA without vm_ops->fault and it doesn't know how to handle the
> situation properly.
> 
> Let's fix that by not setting ->vm_ops.
> 
> We don't really need ->vm_ops here: MPX VMA can be detected with VM_MPX
> flag. And vma_merge() will not merge MPX VMA with non-MPX VMA, because
> ->vm_flags won't match.
> 
> The only thing left is name of VMA. I'm not sure if it's part of ABI, or
> we can just drop it. The patch keep it by providing arch_vma_name() on x86.
> 
> Build tested only.

mpx.c has changed.

arch/x86/mm/mpx.c: In function 'try_unmap_single_bt':
arch/x86/mm/mpx.c:930: error: implicit declaration of function 'is_mpx_vma'

I'll drop this patch and see what happens.

Re: [RFC 1/4] mm, compaction: introduce kcompactd

2015-07-23 Thread David Rientjes

On Thu, 23 Jul 2015, Joonsoo Kim wrote:

> > The slub allocator does try to allocate its high-order memory with 
> > __GFP_WAIT before falling back to lower orders if possible.  I would think 
> > that this would be the greatest sign of on-demand memory compaction being 
> > a problem, especially since CONFIG_SLUB is the default, but I haven't seen 
> > such reports.
> 
> In fact, some of our product had trouble with slub's high order
> allocation 5 months ago. At that time, compaction didn't make high order
> page and compaction attempts are frequently deferred. It also causes many
> reclaim to make high order page so I suggested masking out __GFP_WAIT
> and adding __GFP_NO_KSWAPD when trying slub's high order allocation to
> reduce reclaim/compaction overhead. Although using high order page in slub
> has some gains that reducing internal fragmentation and reducing management
> overhead, benefit is marginal compared to the cost at making high order
> page. This solution improves system response time for our case. I planned
> to submit the patch but it is delayed due to my laziness. :)
> 

Hi Joonsoo,

On a fragmented machine I can certainly understand that the overhead 
involved in allocating the high-order page outweighs the benefit later and 
it's better to fallback more quickly to page orders if the cache allows 
it.

I believe that this would be improved by the suggestion of doing 
background synchronous compaction.  So regardless of whether __GFP_WAIT is 
set, if the allocation fails then we can kick off background compaction 
that will hopefully defragment memory for future callers.  That should 
make high-order atomic allocations more successful as well.

Re: [PATCH 3/3] ARM: dts: qcom: Replace gpio node with pinctrl node

2015-07-23 Thread Andy Gross

On Fri, Jun 05, 2015 at 03:52:25PM -0700, Bjorn Andersson wrote:
> Replace the standalone gpio driver with pinctrl-msm as we now have
> msm8660 support there.
> 
> Signed-off-by: Bjorn Andersson 
> ---

Applied, thanks!

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH] Input: LEDs - skip unnamed LEDs

2015-07-23 Thread Dmitry Torokhov

On Thu, Jul 23, 2015 at 08:19:13AM +0200, Pavel Machek wrote:
> On Wed 2015-07-22 15:02:02, Dmitry Torokhov wrote:
> > Devices may declare more LEDs than what is known to input-leds
> > (HID does this for some devices). Instead of showing ugly warnings
> > on connect and, even worse, oopsing on disconnect, let's simply
> > ignore LEDs that are not known to us.
> > 
> > Reported-by: Vlastimil Babka 
> > Signed-off-by: Dmitry Torokhov 
> > ---
> >  drivers/input/input-leds.c |   16 ++--
> >  1 file changed, 14 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/input/input-leds.c b/drivers/input/input-leds.c
> > index 074a65e..766bf26 100644
> > --- a/drivers/input/input-leds.c
> > +++ b/drivers/input/input-leds.c
> > @@ -71,6 +71,18 @@ static void input_leds_event(struct input_handle 
> > *handle, unsigned int type,
> >  {
> >  }
> >  
> > +static int input_leds_get_count(struct input_dev *dev)
> > +{
> > +   unsigned int led_code;
> > +   int count = 0;
> > +
> > +   for_each_set_bit(led_code, dev->ledbit, LED_CNT)
> > +   if (input_led_info[led_code].name)
> > +   count++;
> > +
> > +   return count;
> > +}
> > +
> >  static int input_leds_connect(struct input_handler *handler,
> >   struct input_dev *dev,
> >   const struct input_device_id *id)
> > @@ -81,7 +93,7 @@ static int input_leds_connect(struct input_handler 
> > *handler,
> > int led_no;
> > int error;
> >  
> > -   num_leds = bitmap_weight(dev->ledbit, LED_CNT);
> > +   num_leds = input_leds_get_count(dev);
> > if (!num_leds)
> > return -ENXIO;
> >  
> > @@ -112,7 +124,7 @@ static int input_leds_connect(struct input_handler 
> > *handler,
> > led->handle = >handle;
> > led->code = led_code;
> >  
> > -   if (WARN_ON(!input_led_info[led_code].name))
> > +   if (!input_led_info[led_code].name)
> > continue;
> >  
> > led->cdev.name = kasprintf(GFP_KERNEL, "%s::%s",
> >
> 
> Are you sure? AFAICT you need to fix err_unregister_leds not to
> unregister leds with no name...

Well, if we skip unnamed leds and do not include them into total count
then we won't need to unregister them.
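
The counting rule described above can be sketched outside the kernel as follows. This is only an illustration under stated assumptions: DEMO_LED_CNT, demo_led_names and demo_count_named_leds() are hypothetical stand-ins for LED_CNT, input_led_info[] and input_leds_get_count(), and a plain unsigned long replaces the dev->ledbit bitmap.

```c
#define DEMO_LED_CNT 16  /* hypothetical; the kernel uses LED_CNT */

/* Hypothetical name table: a NULL entry means "LED not known to us". */
static const char *const demo_led_names[DEMO_LED_CNT] = {
    [0] = "numlock",
    [1] = "capslock",
    [2] = "scrolllock",
};

/* Count only the LEDs that the device declares AND that have a name. */
static int demo_count_named_leds(unsigned long ledbits)
{
    int count = 0;

    for (unsigned int code = 0; code < DEMO_LED_CNT; code++)
        if ((ledbits & (1UL << code)) && demo_led_names[code])
            count++;

    return count;
}
```

Because the registration loop skips exactly the same unnamed entries, the number of LEDs registered should match this count, so an error path that unwinds by index would never visit an unnamed LED.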

Thanks.

-- 
Dmitry

Re: [PATCH 5/8] power: bq27x00_battery: Renaming for consistency

2015-07-23 Thread Belisko Marek

Hi Pali,

On Thu, Jul 23, 2015 at 10:15 PM, Pali Rohár  wrote:
> On Thursday 23 July 2015 19:03:08 Andrew F. Davis wrote:
>> >> -#ifdef CONFIG_BATTERY_BQ27X00_I2C
>> >> -MODULE_ALIAS("i2c:bq27000-battery");
>> >> +#ifdef CONFIG_BATTERY_BQ27XXX_I2C
>> >> +MODULE_ALIAS("i2c:bq27xxx-battery");
>> >>
>> >>  #endif
>> >
>> > Why is this MODULE_ALIAS needed? A few lines above there is
>> >
>> >  MODULE_DEVICE_TABLE(i2c, bq27xxx_id);
>> >
>> > which adds the proper i2c: module alias...
>>
>> Not sure, looks like it was added in commit
>> 8ebb7e9c1a502cfc300618c19c3c6f06fc76d237 which claims that the
>> "module won't get loaded automatically" without it, but I have not
>> had this problem, so I'm not sure why it's there.
>>
>
> git grep bq27000-battery shows me that only one driver uses that name:
> drivers/w1/slaves/w1_bq27000.c
>
> Moreover, it is a platform device, not an i2c device. So commit
> 8ebb7e9c1a502cfc300618c19c3c6f06fc76d237 is wrong! CCing Marek.
If you look at the power/bq27x00 driver, it has only an I2C part and a
platform part, both selectable by config. Commit
8ebb7e9c1a502cfc300618c19c3c6f06fc76d237 added MODULE_ALIAS so that the
driver works as a module for both buses. Even though the I2C alias isn't
used anywhere, I added a MODULE_ALIAS for it as well.
>
> MODULE_ALIAS("platform:bq27000-battery") is really needed for
> w1_bq27000.c but MODULE_ALIAS("i2c:bq27000-battery") should be removed.
> It is not used by any board platform code or DT.
Not sure it's a good idea to remove it. Somebody outside the tree could be using it.
>
> Marek, correct me if I'm wrong.
>
> --
> Pali Rohár
> pali.ro...@gmail.com

BR,

marek



-- 
as simple and primitive as possible
-
Marek Belisko - OPEN-NANDRA
Freelance Developer

Ruska Nova Ves 219 | Presov, 08005 Slovak Republic
Tel: +421 915 052 184
skype: marekwhite
twitter: #opennandra
web: http://open-nandra.com

Re: Dealing with the NMI mess

2015-07-23 Thread Andy Lutomirski

On Thu, Jul 23, 2015 at 1:52 PM, Willy Tarreau  wrote:
> On Thu, Jul 23, 2015 at 01:38:33PM -0700, Linus Torvalds wrote:
>> On Thu, Jul 23, 2015 at 1:21 PM, Andy Lutomirski  wrote:
>> >
>> > 2. Forbid IRET inside NMIs.  Doable but maybe not that pretty.
>> >
>> > We haven't considered:
>> >
>> > 3. Forbid faults (other than MCE) inside NMI.
>>
>> I'd really prefer #2. #3 depends on us getting many things right, and
>> never introducing new cases in the future.
>>
>> #2, in contrast, seems to be fairly localized. Yes, RF is an issue,
>> but returning to user space with RF clear doesn't really seem to be
>> all that problematic.
>
> What's the worst case that can happen with RF cleared when returning
> to user space ? My understanding is that it's just that we risk to
> break again on an instruction that had a break point set and which
> already triggered the breakpoint, right ?

I assume Linus meant returning to kernel space with RF clear.  Returns
to userspace have their own fancy logic here, and it's survived for a
couple of releases, including through an explicit test of RF handling
:)

--Andy

Re: [PATCH] input: twl4030-vibra: Fix ERROR: Bad of_node_put() warning

2015-07-23 Thread Dmitry Torokhov

On Thu, Jul 23, 2015 at 10:38:34PM +0200, Marek Belisko wrote:
> Fix following:
> [8.862274] ERROR: Bad of_node_put() on /ocp/i2c@4807/twl@48/audio
> [8.869293] CPU: 0 PID: 1003 Comm: modprobe Not tainted 4.2.0-rc2-letux+ 
> #1175
> [8.876922] Hardware name: Generic OMAP36xx (Flattened Device Tree)
> [8.883514] [] (unwind_backtrace) from [] 
> (show_stack+0x10/0x14)
> [8.891693] [] (show_stack) from [] 
> (dump_stack+0x78/0x94)
> [8.899322] [] (dump_stack) from [] 
> (kobject_release+0x68/0x7c)
> [8.907409] [] (kobject_release) from [] 
> (twl4030_vibra_probe+0x74/0x188 [twl4030_vibra])
> [8.917877] [] (twl4030_vibra_probe [twl4030_vibra]) from 
> [] (platform_drv_probe+0x48/0x90)
> [8.928497] [] (platform_drv_probe) from [] 
> (really_probe+0xd4/0x238)
> [8.937103] [] (really_probe) from [] 
> (driver_probe_device+0x30/0x48)
> [8.945678] [] (driver_probe_device) from [] 
> (__driver_attach+0x68/0x8c)
> [8.954589] [] (__driver_attach) from [] 
> (bus_for_each_dev+0x50/0x84)
> [8.963226] [] (bus_for_each_dev) from [] 
> (bus_add_driver+0xcc/0x1e4)
> [8.971832] [] (bus_add_driver) from [] 
> (driver_register+0x9c/0xe0)
> [8.980255] [] (driver_register) from [] 
> (do_one_initcall+0x100/0x1b8)
> [8.988983] [] (do_one_initcall) from [] 
> (do_init_module+0x58/0x1c0)
> [8.997497] [] (do_init_module) from [] 
> (SyS_init_module+0x54/0x64)
> [9.005950] [] (SyS_init_module) from [] 
> (ret_fast_syscall+0x0/0x54)
> [9.015838] input: twl4030:vibrator as 
> /devices/platform/6800.ocp/4807.i2c/i2c-0/0-0048/4807.i2c:twl@48:audio/input/input2
> 
> The node passed to of_find_node_by_name() is put inside that function, and a
> new node is returned if found. Free the returned node, not the already-freed one.

Hmm, if of_find_node_by_name() "puts" passed in node should we not "get"
it before calling of_find_node_by_name()? The node pointer in question
is simply copied from parent device.
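
The reference counting in question can be sketched with a toy model. Everything here is hypothetical (struct demo_node and the demo_* helpers are not kernel APIs); the only behaviour borrowed from the kernel is the documented contract of of_find_node_by_name(): it drops a reference on the node it is given and returns the match with its reference count raised.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a device tree node; not the kernel's struct device_node. */
struct demo_node {
    int refcount;
    const char *name;
    struct demo_node *next;
};

static struct demo_node *demo_node_get(struct demo_node *n)
{
    if (n)
        n->refcount++;
    return n;
}

static void demo_node_put(struct demo_node *n)
{
    if (n)
        n->refcount--;
}

/*
 * Mimics the lookup contract: the search starts after 'from', a reference
 * on 'from' is dropped, and the match is returned with a reference held.
 */
static struct demo_node *demo_find_by_name(struct demo_node *from,
                                           const char *name)
{
    struct demo_node *n;

    for (n = from ? from->next : NULL; n; n = n->next)
        if (strcmp(n->name, name) == 0) {
            demo_node_get(n);
            break;
        }

    demo_node_put(from);
    return n;
}

/* Returns 1 if a "codec" node follows 'parent', leaving all counts balanced. */
static int demo_check_coexist(struct demo_node *parent)
{
    struct demo_node *found;

    demo_node_get(parent);          /* balances the put inside the lookup */
    found = demo_find_by_name(parent, "codec");
    if (found) {
        demo_node_put(found);       /* release the node we were handed */
        return 1;
    }
    return 0;
}
```

With the extra get in place, both the parent's and the found node's counts end where they started, which is the balance the question above is asking about.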

Thanks.

> 
> Signed-off-by: Marek Belisko 
> ---
>  drivers/input/misc/twl4030-vibra.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/input/misc/twl4030-vibra.c 
> b/drivers/input/misc/twl4030-vibra.c
> index fc17b95..10c4e3d 100644
> --- a/drivers/input/misc/twl4030-vibra.c
> +++ b/drivers/input/misc/twl4030-vibra.c
> @@ -183,7 +183,8 @@ static bool twl4030_vibra_check_coexist(struct 
> twl4030_vibra_data *pdata,
>   if (pdata && pdata->coexist)
>   return true;
>  
> - if (of_find_node_by_name(node, "codec")) {
> + node = of_find_node_by_name(node, "codec");
> + if (node) {
>   of_node_put(node);
>   return true;
>   }
> -- 
> 1.9.1
> 

-- 
Dmitry

Re: [patch] mmap.2: document the munmap exception for underlying page size

2015-07-23 Thread David Rientjes

On Thu, 23 Jul 2015, Michael Kerrisk (man-pages) wrote:

> >> Should we also add a similar comment for the mmap offset?  Currently
> >> the man page says:
> >>
> >> "offset must be a multiple of the page size as returned by
> >>  sysconf(_SC_PAGE_SIZE)."
> >>
> >> For hugetlbfs, I believe the offset must be a multiple of the
> >> hugetlb page size.  A similar comment/exception about using
> >> the "underlying page size" would apply here as well.
> >>
> > 
> > Yes, that makes sense, thanks.  We should also explicitly say that mmap(2) 
> > automatically aligns length to be hugepage aligned if backed by hugetlbfs.
> 
> And, surely, it also does something similar for mmap()'s 'addr'
> argument? 
> 
> I suggest we add a subsection to describe the HugeTLB differences. How 
> about something like:
> 
>Huge page (Huge TLB) mappings
>For  mappings  that  employ  huge pages, the requirements for the
>arguments  of  mmap()  and  munmap()  differ  somewhat  from  the
>requirements for mappings that use the native system page size.
> 
>For mmap(), offset must be a multiple of the underlying huge page
>size.  The system automatically aligns length to be a multiple of
>the underlying huge page size.
> 
>For  munmap(),  addr  and  length  must both be a multiple of the
>underlying huge page size.
> ?
> 

Looks good, please add my acked-by.  The commit that expanded on the 
documentation of this behavior was 
80d6b94bd69a7a49b52bf503ef6a841f43cf5bbb.

Answering from your other email, no, this behavior in the kernel has not 
changed recently but we found it wasn't properly documented so we wanted 
to fix that both in the kernel tree and in the man-pages to make it 
explicit.
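
The alignment rules in the proposed man-page text come down to simple arithmetic, sketched below. The helper names are hypothetical, and the page size is passed in as a parameter because it varies by system; the sketch assumes it is a power of two.

```c
#include <stdbool.h>
#include <stddef.h>

/* Round len up to the next multiple of page_size (a power of two). */
static size_t demo_align_up(size_t len, size_t page_size)
{
    return (len + page_size - 1) & ~(page_size - 1);
}

/* True if value is a multiple of page_size (a power of two). */
static bool demo_is_aligned(size_t value, size_t page_size)
{
    return (value & (page_size - 1)) == 0;
}
```

For example, with a 2 MiB huge page size (an assumption; the real size should be queried on the target system), a requested length of 1 byte rounds up to 2 MiB, and an offset of 4096 fails the alignment check even though it is aligned to a 4 KiB base page.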

Re: Dealing with the NMI mess

2015-07-23 Thread Willy Tarreau

On Thu, Jul 23, 2015 at 01:38:33PM -0700, Linus Torvalds wrote:
> On Thu, Jul 23, 2015 at 1:21 PM, Andy Lutomirski  wrote:
> >
> > 2. Forbid IRET inside NMIs.  Doable but maybe not that pretty.
> >
> > We haven't considered:
> >
> > 3. Forbid faults (other than MCE) inside NMI.
> 
> I'd really prefer #2. #3 depends on us getting many things right, and
> never introducing new cases in the future.
> 
> #2, in contrast, seems to be fairly localized. Yes, RF is an issue,
> but returning to user space with RF clear doesn't really seem to be
> all that problematic.

What's the worst case that can happen with RF cleared when returning
to user space? My understanding is that we just risk breaking again
on an instruction that had a breakpoint set and which already
triggered it, right?

If so the problem probably is whether there's a risk of looping again
without ever getting a chance to execute this instruction normally.
But if the NMIs don't bomb as fast as we can process them, at some
point the instruction should get a chance to be executed, so the
problem doesn't seem dramatic.

That makes me think that I have no idea what happens if we try to
step-trace "int 2", I don't even know if we pass through the NMI
handler.

Willy


Re: [PATCH v2 09/11] ARM: dts: msm8974: Add smem reservation and node

2015-07-23 Thread Andy Gross

On Fri, Jun 26, 2015 at 02:50:17PM -0700, bj...@kryo.se wrote:
> From: Bjorn Andersson 
> 
> Signed-off-by: Bjorn Andersson 
> ---

Applied, thanks.

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


[PATCH] merge_config.sh: Add the ability to perform make with an ARCH

2015-07-23 Thread Dan Murphy

The script does not allow building for different architectures.
It may assume that the developer has set the ARCH as a global
variable.

Add a switch argument to pass in the desired architecture.
Then verify that that architecture is supported in the arch
directory.

If it is not supported, exit; otherwise set it.

Signed-off-by: Dan Murphy 
---
 scripts/kconfig/merge_config.sh | 29 +++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/scripts/kconfig/merge_config.sh b/scripts/kconfig/merge_config.sh
index ec8e203..bdbff4b 100755
--- a/scripts/kconfig/merge_config.sh
+++ b/scripts/kconfig/merge_config.sh
@@ -19,7 +19,16 @@
 #  but WITHOUT ANY WARRANTY; without even the implied warranty of
 #  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 #  See the GNU General Public License for more details.
-
+verify_arch() {
+   cd arch
+   for d in * ; do
+   if [ "$d" = "$_TEST_ARCH" ]; then
+   BUILD_ARCH="ARCH="$d
+   break
+   fi
+   done
+   cd ..
+}
 clean_up() {
rm -f $TMP_FILE
exit
@@ -33,6 +42,7 @@ usage() {
    echo "  -n    use allnoconfig instead of alldefconfig"
    echo "  -r    list redundant entries when merging fragments"
    echo "  -O    dir to put generated output files"
+   echo "  -A    architecture to support for make"
 }
 
 RUNMAKE=true
@@ -71,6 +81,21 @@ while true; do
shift 2
continue
;;
+   "-A")
+   if [ "$2" != "" ]; then
+   _TEST_ARCH=$2
+   verify_arch
+   if [ "$BUILD_ARCH" = "" ]; then
+   echo "ARCH $_TEST_ARCH is not valid" 1>&2
+   exit 1
+   fi
+   else
+   echo "ARCH $_TEST_ARCH is not valid" 1>&2
+   exit 1
+   fi
+   shift 2
+   continue
+   ;;
*)
break
;;
@@ -139,7 +164,7 @@ fi
 # Use the merged file as the starting point for:
 # alldefconfig: Fills in any missing symbols with Kconfig default
 # allnoconfig: Fills in any missing symbols with # CONFIG_* is not set
-make KCONFIG_ALLCONFIG=$TMP_FILE $OUTPUT_ARG $ALLTARGET
+make $BUILD_ARCH KCONFIG_ALLCONFIG=$TMP_FILE $OUTPUT_ARG $ALLTARGET
 
 
 # Check all specified config values took (might have missed-dependency issues)
-- 
1.9.1


llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

2015-07-23 Thread Alexei Starovoitov


On 7/23/15 4:54 AM, He Kuang wrote:

trimmed cc-list, since it's not related to kernel.


Thank you for your guidance, and by referencing your last mail
and other llvm backends, I found setting
BPFMCAsmInfo::SupportsDebugInformation = true in BPFMCAsmInfo.h


thanks! yes. it was missing.


and fixing some unhandled switch cases can make llc output debug_info,


what do you mean ?


but important information is missing in the result:


hmm. I see slightly different picture.
With 'clang -O2 -target bpf -g -S a.c'
I see all the right info inside .s file.
with '-c a.c' for some reasons it produces bogus offset:
   Abbrev Offset: 0x
   Pointer Size:  8
/usr/local/bin/objdump: Warning: Debug info is corrupted, abbrev offset 
() is larger than abbrev section size (4b)


and objdump fails to parse .o
I'm using llvm trunk 3.8. Do you see this as well?


Re: string_escape_mem ESCAPE_SPACE

2015-07-23 Thread Andy Shevchenko

On Thu, 2015-07-23 at 13:36 -0700, Kees Cook wrote:
> On Thu, Jul 23, 2015 at 1:27 PM, Andy Shevchenko
>  wrote:
> > On Thu, 2015-07-23 at 12:59 -0700, Kees Cook wrote:
> > > Hi,
> > > 
> > > I'm curious why ESCAPE_SPACE doesn't escape spaces (0x20)?
> > 
> > Space is a printable character.
> > You perhaps want something like ESCAPE_SPACE | ESCAPE_HEX.
> 
> Yeah, I can get the effect I want with:
> 
> flags = ESCAPE_SPACE | ESCAPE_SPECIAL | ESCAPE_NULL | ESCAPE_HEX;
> esc = "\f\n\r\t\v\\\a\e\0 ";

esc can't contain '\0' in the middle.

So, you would like to convert only space to hex and leave everything
else printable as is?

> 
> This isn't reachable via kasprintf, though (it always has a NULL esc).
> I will consider some options and send patches.

Before doing this, describe your use case in detail, please.

> 
> > >  That is
> > > surprising to me, especially since things like isspace() include
> > > 0x20.
> > 
> > Moreover, there are test cases in test-string_helpers.c module and
> > they are based on the real use cases (before helpers were introduced
> > and users were converted). So, there is no user which expects hex
> > conversion of the printable character if not asked explicitly.
> 
> Yeah, I saw it was testing for space to be excluded. I guess I just
> think the name "ESCAPE_SPACE" is misleading. :)

For sake of name shortness I suppose. The idea is to escape *special*
spaces by this.
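
To make the distinction concrete, here is a minimal userspace sketch of the behaviour described: the "special" whitespace characters \f, \n, \r, \t and \v are escaped, while a plain space (0x20) passes through untouched. The function demo_escape_space() is hypothetical and is not the kernel's string_escape_mem(); it only imitates the ESCAPE_SPACE class.

```c
#include <stddef.h>

/* Store one byte if it fits (keeping room for the NUL); always count it. */
static void demo_emit(char *dst, size_t dst_size, size_t *out, char c)
{
    if (*out + 1 < dst_size)
        dst[*out] = c;
    (*out)++;
}

/*
 * Copy src into dst, escaping \f \n \r \t \v but leaving ' ' as is.
 * Returns the length the full result needs, excluding the NUL.
 */
static size_t demo_escape_space(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    for (; *src; src++) {
        char esc = 0;

        switch (*src) {
        case '\f': esc = 'f'; break;
        case '\n': esc = 'n'; break;
        case '\r': esc = 'r'; break;
        case '\t': esc = 't'; break;
        case '\v': esc = 'v'; break;
        default: break;
        }

        if (esc) {
            demo_emit(dst, dst_size, &out, '\\');
            demo_emit(dst, dst_size, &out, esc);
        } else {
            demo_emit(dst, dst_size, &out, *src);
        }
    }

    if (dst_size)
        dst[out < dst_size ? out : dst_size - 1] = '\0';
    return out;
}
```

Getting a plain space escaped as well would need an additional rule on top of this class, which is what the ESCAPE_HEX suggestion earlier in the thread is about.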

-- 
Andy Shevchenko 
Intel Finland Oy

Re: Dealing with the NMI mess

2015-07-23 Thread Andy Lutomirski

On Thu, Jul 23, 2015 at 1:38 PM, Linus Torvalds
 wrote:
> On Thu, Jul 23, 2015 at 1:21 PM, Andy Lutomirski  wrote:
>>
>> 2. Forbid IRET inside NMIs.  Doable but maybe not that pretty.
>>
>> We haven't considered:
>>
>> 3. Forbid faults (other than MCE) inside NMI.
>
> I'd really prefer #2. #3 depends on us getting many things right, and
> never introducing new cases in the future.
>
> #2, in contrast, seems to be fairly localized. Yes, RF is an issue,
> but returning to user space with RF clear doesn't really seem to be
> all that problematic.
>
> The point of RF is to make forward progress in the face of debug
> register faults, but I don't see what was wrong with the whole
> "disable any debug events that happen with interrupts disabled".
>
> And no, I do *not* believe that we should disable debug faults ahead
> of time. We should take them, disable them, and return with 'ret'. No
> complex "you can't put breakpoints in this region" crap, no magic
> rules, no subtle issues.
>
> I really think your "disallow #DB" is pointless. I think your "prevent
> instruction breakpoints in NMI" is wrong. Let them happen. Take them
> and disable them. Return with RF clear. Go on with your life.
>
> And the "take them and disable them" is really simple. No "am I in an
> NMI context" thing (because that leads to the whole question about
> "what is NMI context"). That's not the real rule anyway.
>
> No, make it very simple and straightforward. Make the test be "uhhuh,
> I got a #DB in kernel mode, and interrupts were disabled - I know I'm
> going to return with "ret", so I'm just going to have to disable this
> breakpoint".
>
> Nothing clever. Nothing subtle. Nothing that needs "this range of
> instructions is magical". No.  Just a very simple rule: if the context
> we return to is kernel mode and interrupts are disabled, we're using
> 'ret', so we cannot suppress debug faults.

There are some subtleties in here.

Issue A: to return with RF clear, we need to disarm the breakpoint.
If it's limited to the duration of the NMI, that's easy.  If not, when
do we re-arm?  New prepare_exit_to_usermode hook?  Hmm, setting ti
flags during context switch may target the wrong task.

Issue B: single-step exception after SYSENTER.  The patches I just
sent fix that, though.

Issue C: #DB with invalid stack pointer (can happen due to watchpoints
during SYSCALL entry or SYSRET exit).  I guess we need to ban such
watchpoints.

Issue D: debug exception inside EFI (especially mixed-mode EFI).  We
can't return using RET, so we need to catch that case.

These issues mostly go away if we preemptively disarm DR7 early in NMI
processing and rearm it at the end.
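
The simple rule quoted above (a #DB that interrupted kernel mode with interrupts disabled means the breakpoint has to be disabled) can be written down as a small predicate. This is a sketch under assumptions, not kernel code: struct demo_regs and both helpers are hypothetical. The only hardware facts relied on are that EFLAGS.IF is bit 9 and that DR7 holds the local/global enable bits for breakpoint n in bits 2n and 2n+1.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_X86_EFLAGS_IF (1u << 9)  /* interrupt-enable flag, bit 9 */

/* Hypothetical snapshot of the interrupted context. */
struct demo_regs {
    bool user_mode;   /* did the #DB interrupt user space? */
    uint32_t flags;   /* saved EFLAGS */
};

/*
 * True when the interrupted context is kernel mode with interrupts off,
 * i.e. the case in which the return would use 'ret' and cannot set RF.
 */
static bool demo_must_disarm_breakpoint(const struct demo_regs *regs)
{
    return !regs->user_mode && !(regs->flags & DEMO_X86_EFLAGS_IF);
}

/* Clear the L<n> and G<n> enable bits for breakpoint n in a DR7 value. */
static uint32_t demo_dr7_disarm(uint32_t dr7, unsigned int n)
{
    return dr7 & ~(3u << (2 * n));
}
```

The sketch deliberately leaves out the open questions listed above, such as when to re-arm and how to handle EFI or an invalid stack pointer; it only captures the test itself.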

--Andy

Re: [PATCH v2 01/11] soc: qcom: Add device tree binding for SMEM

2015-07-23 Thread Andy Gross

On Fri, Jun 26, 2015 at 02:50:09PM -0700, bj...@kryo.se wrote:
> From: Bjorn Andersson 
> 
> Add device tree binding documentation for the Qualcomm Shared Memory
> Manager.
> 
> Signed-off-by: Bjorn Andersson 



> + smem@fa0 {
> + compatible = "qcom,smem";
> +
> + memory-region = <_region>;
> + reg = <0xfc428000 0x4000>;

I'll fixup the address here before applying.

Applied, thanks!

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v2 02/11] soc: qcom: Add Shared Memory Manager driver

2015-07-23 Thread Andy Gross

On Fri, Jun 26, 2015 at 02:50:10PM -0700, bj...@kryo.se wrote:
> From: Bjorn Andersson 
> 
> The Shared Memory Manager driver implements an interface for allocating
> and accessing items in the memory area shared among all of the
> processors in a Qualcomm platform.
> 
> Signed-off-by: Bjorn Andersson 
> ---

Applied, thanks!

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


[PATCH] cpufreq: Avoid attempts to create duplicate symbolic links

2015-07-23 Thread Rafael J. Wysocki

From: Rafael J. Wysocki 

After commit 87549141d516 (cpufreq: Stop migrating sysfs files on
hotplug) there is a problem with CPUs that share cpufreq policy
objects with other CPUs and are initially offline.

Say CPU1 shares a policy with CPU0 which is online and is registered
first.  As part of the registration process, cpufreq_add_dev() is
called for it.  It creates the policy object and a symbolic link
to it from the CPU1's sysfs directory.  If CPU1 is registered
subsequently and it is offline at that time, cpufreq_add_dev() will
attempt to create a symbolic link to the policy object for it, but
that link is present already, so a warning about that will be
triggered.

To avoid that warning, make cpufreq use an additional CPU mask
containing related CPUs that are actually present for each policy
object.  That mask is initialized when the policy object is populated
after its creation (for the first online CPU using it) and it includes
CPUs from the "policy CPUs" mask returned by the cpufreq driver's
->init() callback that are physically present at that time.  Symbolic
links to the policy are created only for the CPUs in that mask.

If cpufreq_add_dev() is invoked for an offline CPU, it checks the
new mask and only creates the symlink if the CPU was not in it (the
CPU is added to the mask at the same time).

In turn, cpufreq_remove_dev() drops the given CPU from the new mask,
removes its symlink to the policy object and returns, unless it is
the CPU owning the policy object.  In that case, the policy object
is moved to a new CPU's sysfs directory or deleted if the CPU being
removed was the last user of the policy.

While at it, notice that cpufreq_remove_dev() can't fail, because
its return value is ignored, so make it ignore return values from
__cpufreq_remove_dev_prepare() and __cpufreq_remove_dev_finish()
and prevent these functions from aborting on errors returned by
__cpufreq_governor().
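
The bookkeeping described above can be modelled with a plain bitmask. This is a simplified sketch, not the patch's code: struct demo_policy, demo_add_cpu() and demo_remove_cpu() are hypothetical, and an unsigned long stands in for the real_cpus cpumask.

```c
#include <stdbool.h>

/* Toy policy object: one bit per related CPU that is actually present. */
struct demo_policy {
    unsigned long real_cpus;
};

/*
 * Record 'cpu' in the mask. Returns true only if the CPU was not there
 * yet, i.e. only then does a symlink still have to be created.
 */
static bool demo_add_cpu(struct demo_policy *p, unsigned int cpu)
{
    unsigned long bit = 1UL << cpu;

    if (p->real_cpus & bit)
        return false;   /* link already exists: avoid the duplicate */

    p->real_cpus |= bit;
    return true;
}

/* Drop 'cpu' from the mask. Returns true if it was the last user. */
static bool demo_remove_cpu(struct demo_policy *p, unsigned int cpu)
{
    p->real_cpus &= ~(1UL << cpu);
    return p->real_cpus == 0;
}
```

In the CPU0/CPU1 scenario from the description, the mask already contains CPU1 when the policy is populated, so the later registration of the offline CPU1 finds its bit set and skips creating a second link.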

Fixes: 87549141d516 (cpufreq: Stop migrating sysfs files on hotplug)
Signed-off-by: Rafael J. Wysocki 
Reported-by: Russell King 
---

This is supposed to replace the other patches sent so far to address the issue
at hand.

---
 drivers/cpufreq/cpufreq.c |   92 --
 include/linux/cpufreq.h   |1 
 2 files changed, 50 insertions(+), 43 deletions(-)

Index: linux-pm/include/linux/cpufreq.h
===
--- linux-pm.orig/include/linux/cpufreq.h
+++ linux-pm/include/linux/cpufreq.h
@@ -62,6 +62,7 @@ struct cpufreq_policy {
/* CPUs sharing clock, require sw coordination */
cpumask_var_t   cpus;   /* Online CPUs only */
cpumask_var_t   related_cpus; /* Online + Offline CPUs */
+   cpumask_var_t   real_cpus; /* Related and present */
 
unsigned intshared_type; /* ACPI: ANY or ALL affected CPUs
should set cpufreq */
Index: linux-pm/drivers/cpufreq/cpufreq.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -1002,7 +1002,7 @@ static int cpufreq_add_dev_symlink(struc
int ret = 0;
 
/* Some related CPUs might not be present (physically hotplugged) */
-   for_each_cpu_and(j, policy->related_cpus, cpu_present_mask) {
+   for_each_cpu(j, policy->real_cpus) {
if (j == policy->kobj_cpu)
continue;
 
@@ -1019,7 +1019,7 @@ static void cpufreq_remove_dev_symlink(s
unsigned int j;
 
/* Some related CPUs might not be present (physically hotplugged) */
-   for_each_cpu_and(j, policy->related_cpus, cpu_present_mask) {
+   for_each_cpu(j, policy->real_cpus) {
if (j == policy->kobj_cpu)
continue;
 
@@ -1163,11 +1163,14 @@ static struct cpufreq_policy *cpufreq_po
if (!zalloc_cpumask_var(>related_cpus, GFP_KERNEL))
goto err_free_cpumask;
 
+   if (!zalloc_cpumask_var(>real_cpus, GFP_KERNEL))
+   goto err_free_rcpumask;
+
ret = kobject_init_and_add(>kobj, _cpufreq, >kobj,
   "cpufreq");
if (ret) {
pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret);
-   goto err_free_rcpumask;
+   goto err_free_real_cpus;
}
 
INIT_LIST_HEAD(>policy_list);
@@ -1184,6 +1187,8 @@ static struct cpufreq_policy *cpufreq_po
 
return policy;
 
+err_free_real_cpus:
+   free_cpumask_var(policy->real_cpus);
 err_free_rcpumask:
free_cpumask_var(policy->related_cpus);
 err_free_cpumask:
@@ -1234,6 +1239,7 @@ static void cpufreq_policy_free(struct c
write_unlock_irqrestore(_driver_lock, flags);
 
cpufreq_policy_put_kobj(policy, notify);
+   free_cpumask_var(policy->real_cpus);
free_cpumask_var(policy->related_cpus);

Re: [PATCH] usb: ulpi: call put_device if device_register fails

2015-07-23 Thread Greg Kroah-Hartman

On Thu, Jul 23, 2015 at 03:08:08PM -0500, Felipe Balbi wrote:
> > I don't like fixes like this because no one now has any pressure to fix
> > it "properly".  Are you doing that work?  If not, who is?
> 
> Heikki is author, I'd expect him to fix it up. We can also revert the
> fix if you prefer, I'm totally fine with that.

Let's leave it alone for now and see what happens...

Re: [PATCH v2 08/11] ARM: dts: msm8974: Add tcsr mutex node

2015-07-23 Thread Andy Gross

On Fri, Jun 26, 2015 at 02:50:16PM -0700, bj...@kryo.se wrote:
> From: Bjorn Andersson 
> 
> Signed-off-by: Bjorn Andersson 
> ---
>  arch/arm/boot/dts/qcom-msm8974.dtsi | 12 
>  1 file changed, 12 insertions(+)
> 

Applied, thanks!

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v1 4/4] mm/memory-failure: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*

2015-07-23 Thread Andrew Morton

On Thu, 16 Jul 2015 01:41:56 + Naoya Horiguchi  
wrote:

> The race condition addressed in commit add05cecef80 ("mm: soft-offline: don't
> free target page in successful page migration") was not closed completely,
> because that can happen not only for soft-offline, but also for hard-offline.
> Consider that a slab page is about to be freed into buddy pool, and then an
> uncorrected memory error hits the page just after entering __free_one_page(),
> then VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP) is triggered,
> despite the fact that it's not necessary because the data on the affected
> page is not consumed.
> 
> To solve it, this patch drops __PG_HWPOISON from page flag checks at
> allocation/free time. I think it's justified because __PG_HWPOISON flags is
> defined to prevent the page from being reused and setting it outside the
> page's alloc-free cycle is a designed behavior (not a bug.)
> 
> And the patch reverts most of the changes from commit add05cecef80 about
> the new refcounting rule of soft-offlined pages, which is no longer necessary.
> 
> ...
>
> --- v4.2-rc2.orig/mm/memory-failure.c
> +++ v4.2-rc2/mm/memory-failure.c
> @@ -1723,6 +1723,9 @@ int soft_offline_page(struct page *page, int flags)
>  
>   get_online_mems();
>  
> + if (get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> + set_migratetype_isolate(page, true);
> +
>   ret = get_any_page(page, pfn, flags);
>   put_online_mems();
>   if (ret > 0) { /* for in-use pages */

This patch gets build-broken by your
mm-page_isolation-make-set-unset_migratetype_isolate-file-local.patch,
which I shall drop.

From: Naoya Horiguchi 
Subject: mm, page_isolation: make set/unset_migratetype_isolate() file-local

Nowadays, set/unset_migratetype_isolate() is defined and used only in
mm/page_isolation, so let's limit the scope within the file.

Signed-off-by: Naoya Horiguchi 
Cc: David Rientjes 
Cc: Vlastimil Babka 
Cc: Joonsoo Kim 
Cc: Minchan Kim 
Signed-off-by: Andrew Morton 
---

 include/linux/page-isolation.h |5 -
 mm/page_isolation.c|5 +++--
 2 files changed, 3 insertions(+), 7 deletions(-)

diff -puN 
include/linux/page-isolation.h~mm-page_isolation-make-set-unset_migratetype_isolate-file-local
 include/linux/page-isolation.h
--- 
a/include/linux/page-isolation.h~mm-page_isolation-make-set-unset_migratetype_isolate-file-local
+++ a/include/linux/page-isolation.h
@@ -65,11 +65,6 @@ undo_isolate_page_range(unsigned long st
 int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
bool skip_hwpoisoned_pages);
 
-/*
- * Internal functions. Changes pageblock's migrate type.
- */
-int set_migratetype_isolate(struct page *page, bool skip_hwpoisoned_pages);
-void unset_migratetype_isolate(struct page *page, unsigned migratetype);
 struct page *alloc_migrate_target(struct page *page, unsigned long private,
int **resultp);
 
diff -puN 
mm/page_isolation.c~mm-page_isolation-make-set-unset_migratetype_isolate-file-local
 mm/page_isolation.c
--- 
a/mm/page_isolation.c~mm-page_isolation-make-set-unset_migratetype_isolate-file-local
+++ a/mm/page_isolation.c
@@ -9,7 +9,8 @@
 #include 
 #include "internal.h"
 
-int set_migratetype_isolate(struct page *page, bool skip_hwpoisoned_pages)
+static int set_migratetype_isolate(struct page *page,
+   bool skip_hwpoisoned_pages)
 {
struct zone *zone;
unsigned long flags, pfn;
@@ -72,7 +73,7 @@ out:
return ret;
 }
 
-void unset_migratetype_isolate(struct page *page, unsigned migratetype)
+static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 {
struct zone *zone;
unsigned long flags, nr_pages;
_


Re: [PATCH] cpufreq: ia64: remove redundant freq_table of acpi_cpufreq_data

2015-07-23 Thread Rafael J. Wysocki

On Monday, July 20, 2015 03:22:45 PM Viresh Kumar wrote:
> On 20-07-15, 14:22, Pan Xinhui wrote:
> > From: Pan Xinhui 
> > 
> > freq_table is now stored as policy->freq_table, so drop the redundant
> > freq_table from struct cpufreq_acpi_io.
> > 
> > Signed-off-by: Pan Xinhui 
> > ---
> >  drivers/cpufreq/ia64-acpi-cpufreq.c | 15 ---
> >  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> Acked-by: Viresh Kumar 

Queued up for 4.3, thanks!


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

[PATCH] input: twl4030-vibra: Fix ERROR: Bad of_node_put() warning

2015-07-23 Thread Marek Belisko

Fix the following:
[8.862274] ERROR: Bad of_node_put() on /ocp/i2c@4807/twl@48/audio
[8.869293] CPU: 0 PID: 1003 Comm: modprobe Not tainted 4.2.0-rc2-letux+ #1175
[8.876922] Hardware name: Generic OMAP36xx (Flattened Device Tree)
[8.883514] [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[8.891693] [] (show_stack) from [] (dump_stack+0x78/0x94)
[8.899322] [] (dump_stack) from [] (kobject_release+0x68/0x7c)
[8.907409] [] (kobject_release) from [] (twl4030_vibra_probe+0x74/0x188 [twl4030_vibra])
[8.917877] [] (twl4030_vibra_probe [twl4030_vibra]) from [] (platform_drv_probe+0x48/0x90)
[8.928497] [] (platform_drv_probe) from [] (really_probe+0xd4/0x238)
[8.937103] [] (really_probe) from [] (driver_probe_device+0x30/0x48)
[8.945678] [] (driver_probe_device) from [] (__driver_attach+0x68/0x8c)
[8.954589] [] (__driver_attach) from [] (bus_for_each_dev+0x50/0x84)
[8.963226] [] (bus_for_each_dev) from [] (bus_add_driver+0xcc/0x1e4)
[8.971832] [] (bus_add_driver) from [] (driver_register+0x9c/0xe0)
[8.980255] [] (driver_register) from [] (do_one_initcall+0x100/0x1b8)
[8.988983] [] (do_one_initcall) from [] (do_init_module+0x58/0x1c0)
[8.997497] [] (do_init_module) from [] (SyS_init_module+0x54/0x64)
[9.005950] [] (SyS_init_module) from [] (ret_fast_syscall+0x0/0x54)
[9.015838] input: twl4030:vibrator as /devices/platform/6800.ocp/4807.i2c/i2c-0/0-0048/4807.i2c:twl@48:audio/input/input2

The node passed to of_find_node_by_name() is put inside that function, and
a new node is returned if one is found. Put the returned node, not the
node that has already been put.
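
To make the reference counting concrete, here is a minimal userspace sketch of the contract described above. It is not kernel code: struct model_node, model_get(), model_put() and model_find_by_name() are hypothetical stand-ins for struct device_node, of_node_get(), of_node_put() and of_find_node_by_name(), and the refcount is a plain int so the imbalance can be observed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct device_node: a name, a refcount, a link. */
struct model_node {
	const char *name;
	int refcount;
	struct model_node *next;
};

static struct model_node *model_get(struct model_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void model_put(struct model_node *n)
{
	if (n)
		n->refcount--;
}

/*
 * Models the contract from the commit message: the search starts after
 * 'from', the reference on 'from' is dropped, and the match (if any) is
 * returned with its refcount raised.
 */
static struct model_node *model_find_by_name(struct model_node *from,
					     const char *name)
{
	struct model_node *np;

	for (np = from ? from->next : NULL; np; np = np->next)
		if (strcmp(np->name, name) == 0)
			break;
	model_get(np);
	model_put(from);
	return np;
}

/* Old pattern: puts the start node a second time and leaks the match. */
static int check_coexist_buggy(struct model_node *start)
{
	if (model_find_by_name(start, "codec")) {
		model_put(start);
		return 1;
	}
	return 0;
}

/* Patched pattern: puts the node that the lookup returned. */
static int check_coexist_fixed(struct model_node *start)
{
	struct model_node *node = model_find_by_name(start, "codec");

	if (node) {
		model_put(node);
		return 1;
	}
	return 0;
}
```

In this model the buggy variant drives the start node's count one below where it should be and leaves the "codec" node with an extra reference, which matches the "Bad of_node_put()" symptom in the log.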

Signed-off-by: Marek Belisko 
---
 drivers/input/misc/twl4030-vibra.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/input/misc/twl4030-vibra.c b/drivers/input/misc/twl4030-vibra.c
index fc17b95..10c4e3d 100644
--- a/drivers/input/misc/twl4030-vibra.c
+++ b/drivers/input/misc/twl4030-vibra.c
@@ -183,7 +183,8 @@ static bool twl4030_vibra_check_coexist(struct twl4030_vibra_data *pdata,
 	if (pdata && pdata->coexist)
 		return true;
 
-	if (of_find_node_by_name(node, "codec")) {
+	node = of_find_node_by_name(node, "codec");
+	if (node) {
 		of_node_put(node);
 		return true;
 	}
-- 
1.9.1


Re: Dealing with the NMI mess

2015-07-23 Thread Linus Torvalds

On Thu, Jul 23, 2015 at 1:21 PM, Andy Lutomirski  wrote:
>
> 2. Forbid IRET inside NMIs.  Doable but maybe not that pretty.
>
> We haven't considered:
>
> 3. Forbid faults (other than MCE) inside NMI.

I'd really prefer #2. #3 depends on us getting many things right, and
never introducing new cases in the future.

#2, in contrast, seems to be fairly localized. Yes, RF is an issue,
but returning to user space with RF clear doesn't really seem to be
all that problematic.

The point of RF is to make forward progress in the face of debug
register faults, but I don't see what was wrong with the whole
"disable any debug events that happen with interrupts disabled".

And no, I do *not* believe that we should disable debug faults ahead
of time. We should take them, disable them, and return with 'ret'. No
complex "you can't put breakpoints in this region" crap, no magic
rules, no subtle issues.

I really think your "disallow #DB" is pointless. I think your "prevent
instruction breakpoints in NMI" is wrong. Let them happen. Take them
and disable them. Return with RF clear. Go on with your life.

And the "take them and disable them" is really simple. No "am I in an
NMI context" thing (because that leads to the whole question about
"what is NMI context"). That's not the real rule anyway.

No, make it very simple and straightforward. Make the test be "uhhuh,
I got a #DB in kernel mode, and interrupts were disabled - I know I'm
going to return with "ret", so I'm just going to have to disable this
breakpoint".

Nothing clever. Nothing subtle. Nothing that needs "this range of
instructions is magical". No.  Just a very simple rule: if the context
we return to is kernel mode and interrupts are disabled, we're using
'ret', so we cannot suppress debug faults.
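
A minimal userspace sketch of that rule, purely for illustration. struct model_regs, returns_with_ret() and model_handle_db() are hypothetical names, not kernel interfaces; the only hardware facts assumed are that the low two bits of CS hold the privilege level, that EFLAGS.IF is bit 9, and that DR7 keeps the local/global enable bits for breakpoint n at bits 2n and 2n+1.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_IF 0x200u	/* interrupt-enable flag, bit 9 of EFLAGS */
#define CS_RPL_MASK   0x3u	/* low two bits of CS: privilege level */

/* Hypothetical cut-down register frame: just what the rule looks at. */
struct model_regs {
	uint16_t cs;
	uint32_t flags;
};

/* The rule: returning to kernel mode with interrupts off means 'ret'. */
static bool returns_with_ret(const struct model_regs *regs)
{
	bool kernel_mode = (regs->cs & CS_RPL_MASK) == 0;
	bool irqs_off = (regs->flags & X86_EFLAGS_IF) == 0;

	return kernel_mode && irqs_off;
}

/*
 * Model of the #DB handler decision: if we will return with 'ret', RF
 * cannot be used to suppress the fault, so clear both enable bits of
 * the breakpoint that fired and return the new DR7 value.
 */
static uint32_t model_handle_db(const struct model_regs *regs,
				uint32_t dr7, unsigned int bp)
{
	if (returns_with_ret(regs))
		dr7 &= ~(0x3u << (2 * bp));
	return dr7;
}
```

The sketch has no notion of "NMI context" at all, which is the point of the rule: the decision depends only on the frame being returned to.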

Did I miss something? There were a lot of emails flying around, but I
*thought* I saw them all..

  Linus

Re: [PATCH] cpufreq: ia64: Fix a memory leak in acpi_cpufreq_cpu_exit

2015-07-23 Thread Rafael J. Wysocki

On Monday, July 20, 2015 02:24:36 PM Pan Xinhui wrote:
> From: Pan Xinhui 
> 
> freq_table should be allocated in ->init and freed in ->exit. However, it
> is never freed. Fix this memory leak in acpi_cpufreq_cpu_exit.
> 
> Signed-off-by: Pan Xinhui 

Queued up for 4.3, thanks!


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: string_escape_mem ESCAPE_SPACE

2015-07-23 Thread Kees Cook

On Thu, Jul 23, 2015 at 1:27 PM, Andy Shevchenko
 wrote:
> On Thu, 2015-07-23 at 12:59 -0700, Kees Cook wrote:
>> Hi,
>>
>> I'm curious why ESCAPE_SPACE doesn't escape spaces (0x20)?
>
> Space is a printable character.
> You perhaps want something like ESCAPE_SPACE | ESCAPE_HEX.

Yeah, I can get the effect I want with:

flags = ESCAPE_SPACE | ESCAPE_SPECIAL | ESCAPE_NULL | ESCAPE_HEX;
esc = "\f\n\r\t\v\\\a\e\0 ";

This isn't reachable via kasprintf, though (it always has a NULL esc).
I will consider some options and send patches.

>>  That is
>> surprising to me, especially since things like isspace() include
>> 0x20.
>
> Moreover, there are test cases in test-string_helpers.c module and they
> are based on the real use cases (before helpers were introduced and
> users were converted). So, there is no user which expects hex conversion
> of the printable character if not asked explicitly.

Yeah, I saw it was testing for space to be excluded. I guess I just
think the name "ESCAPE_SPACE" is misleading. :)
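
For readers following along, a small userspace sketch of what the flag name covers. This is a simplified model and not the code in lib/string_helpers.c: space_escape_letter() and model_escape_space() are hypothetical helpers that escape only the five whitespace control characters ('\f', '\n', '\r', '\t', '\v') and pass 0x20 through untouched, since it is printable.

```c
#include <stddef.h>

/* Returns the escape letter for the five handled characters, else 0. */
static char space_escape_letter(char c)
{
	switch (c) {
	case '\f': return 'f';
	case '\n': return 'n';
	case '\r': return 'r';
	case '\t': return 't';
	case '\v': return 'v';
	default:   return 0;
	}
}

/*
 * Copies the NUL-terminated string src into dst (at most size bytes,
 * always NUL-terminated when size > 0), turning the five characters
 * above into two-character backslash escapes. A plain space is copied
 * as-is. Returns the number of bytes written, excluding the NUL.
 */
static size_t model_escape_space(const char *src, char *dst, size_t size)
{
	size_t out = 0;

	if (size == 0)
		return 0;

	for (; *src; src++) {
		char letter = space_escape_letter(*src);
		size_t need = letter ? 2 : 1;

		if (out + need >= size)	/* keep room for the NUL */
			break;
		if (letter) {
			dst[out++] = '\\';
			dst[out++] = letter;
		} else {
			dst[out++] = *src;
		}
	}
	dst[out] = '\0';
	return out;
}
```

Feeding it "a b<TAB>c<NEWLINE>" leaves the space alone and escapes only the tab and the newline, which is the behaviour the test cases mentioned above expect.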

Thanks!

-Kees

-- 
Kees Cook
Chrome OS Security

Re: [PATCH] intel_pstate: Add get_scaling cpu_defaults param to Knights Landing

2015-07-23 Thread Rafael J. Wysocki

On Tuesday, July 21, 2015 10:41:13 AM Lukasz Anaczkowski wrote:
> Scaling for Knights Landing is the same as the default scaling (10).
> When Knights Landing support was added to the pstate driver, this
> parameter was omitted resulting in a kernel panic during boot.
> 
> Reported-by: Yasuaki Ishimatsu 
> Signed-off-by: Dasaratharaman Chandramouli 
> 
> Signed-off-by: Lukasz Anaczkowski 

Queued up for 4.3, thanks!

> ---
>  drivers/cpufreq/intel_pstate.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 15ada47..fcb929e 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -681,6 +681,7 @@ static struct cpu_defaults knl_params = {
>   .get_max = core_get_max_pstate,
>   .get_min = core_get_min_pstate,
>   .get_turbo = knl_get_turbo_pstate,
> + .get_scaling = core_get_scaling,
>   .set = core_set_pstate,
>   },
>  };
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: [PATCH v2] cpufreq: Remove cpufreq_rwsem

2015-07-23 Thread Rafael J. Wysocki

On Wednesday, July 22, 2015 05:59:11 PM Sebastian Andrzej Siewior wrote:
> cpufreq_rwsem was introduced in commit 6eed9404ab3c4 ("cpufreq: Use
> rwsem for protecting critical sections") in order to replace
> try_module_get() on the cpu-freq driver. That try_module_get() worked
> well until the refcount was so heavily used that module removal became
> more or less impossible.
> 
> Though when looking at the various (undocumented) protection
> mechanisms in that code, the randomly sprinkled around cpufreq_rwsem
> locking sites are superfluous.
> 
> The policy, which is acquired in cpufreq_cpu_get() and released in
> cpufreq_cpu_put() is sufficiently protected already.
> 
>   cpufreq_cpu_get(cpu)
>     /* Protects against concurrent driver removal */
>     read_lock_irqsave(&cpufreq_driver_lock, flags);
>     policy = per_cpu(cpufreq_cpu_data, cpu);
>     kobject_get(&policy->kobj);
>     read_unlock_irqrestore(&cpufreq_driver_lock, flags);
> 
> The reference on the policy serializes versus module unload already:
> 
>   cpufreq_unregister_driver()
> subsys_interface_unregister()
>   __cpufreq_remove_dev_finish()
> per_cpu(cpufreq_cpu_data) = NULL;
>   cpufreq_policy_put_kobj()
> 
> If there is a reference held on the policy, i.e. obtained prior to the
> unregister call, then cpufreq_policy_put_kobj() will wait until that
> reference is dropped. So once subsys_interface_unregister() returns
> there is no policy pointer in flight and no new reference can be
> obtained. So that rwsem protection is useless.
> 
> The other usage of cpufreq_rwsem in show()/store() of the sysfs
> interface is redundant as well because sysfs already does the proper
> kobject_get()/put() pairs.
> 
> That leaves CPU hotplug versus module removal. The current
> down_write() around the write_lock() in cpufreq_unregister_driver() is
> silly at best as it protects actually nothing.
> 
> The trivial solution to this is to prevent hotplug across
> cpufreq_unregister_driver completely.
> 
> Signed-off-by: Sebastian Andrzej Siewior 

Makes sense.

Queued up for 4.3, thanks!


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: [RESEND PATCH] cpupower: Do not change the frequency of offline cpu

2015-07-23 Thread Rafael J. Wysocki

On Thursday, July 23, 2015 12:20:27 PM Shilpasri G Bhat wrote:
> Check if the cpu is online before changing the frequency/governor of
> the cpu.
> 
> Reported-by: Pavaman Subramaniyam 
> Signed-off-by: Shilpasri G Bhat 
> Reviewed-by: Gautham R. Shenoy 
> Acked-by: Thomas Renninger 

Queued up for 4.3, thanks!

Rafael


Re: [PATCH 2/2] cpufreq: Separate CPU device removal from CPU online

2015-07-23 Thread Rafael J. Wysocki

On Thursday, July 23, 2015 12:09:42 PM Viresh Kumar wrote:
> On 23-07-15, 02:04, Rafael J. Wysocki wrote:
> > +static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
> > +{
> > +   unsigned int cpu = dev->id;
> > +   struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
> > +
> > +   pr_debug("%s: adding CPU %u\n", __func__, cpu);
> > +
> > +   if (policy && policy->kobj_cpu != cpu) {
> 
> Why are you comparing cpu against kobj_cpu ? I don't think it can ever
> be false.

It can.  When we're adding a CPU that has a policy already, because it is
"related" to a previously registered CPU.


> > +   int ret;
> > +
> > +   pr_debug("%s: Adding symlink for CPU: %u\n", __func__, cpu);
> 
> dev_dbg

OK

> > +   ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> > +   if (ret) {
> > +   dev_dbg(dev, "%s: Failed to create link (%d)\n",
> 
> dev_err

Well, I'm wondering about this.  Why does this have to be dev_err()?


> > +   __func__, ret);
> > +   return ret;
> > +   }
> > +
> > +   /* Track CPUs for which sysfs links are created */
> > +   cpumask_set_cpu(cpu, policy->linked_cpus);
> > +   }
> > +
> > +   return cpu_online(cpu) ? cpufreq_dev_online(dev, false) : 0;
> > +}
> 
> Looks fine otherwise. Thanks for getting your hands dirty :)
> 
> >  static void cpufreq_offline_prepare(unsigned int cpu)
> >  {
> > struct cpufreq_policy *policy;
> > @@ -2344,31 +2343,35 @@ unlock:
> >  }
> >  EXPORT_SYMBOL(cpufreq_update_policy);
> >  
> > +static void cpufreq_cpu_online(unsigned int cpu)
> > +{
> > +   struct device *dev = get_cpu_device(cpu);
> > +
> > +   if (dev)
> > +   cpufreq_dev_online(dev, true);
> > +}
> 
> What about dropping this wrapper function and ...
> 
> >  static int cpufreq_cpu_callback(struct notifier_block *nfb,
> > unsigned long action, void *hcpu)
> >  {
> > unsigned int cpu = (unsigned long)hcpu;
> > -   struct device *dev;
> >  
> > -   dev = get_cpu_device(cpu);
> 
> ... keeping this as is? And then we can do
> s/cpufreq_dev_online/cpufreq_cpu_online which suits better.

Well, we don't need the dev things for DOWN_PREPARE and POST_DEAD.

We actually only need it in a few places in cpufreq_dev_online(), or
maybe simply cpufreq_online(), so it can take the cpu argument too.

> > -   if (dev) {
> > -   switch (action & ~CPU_TASKS_FROZEN) {
> > -   case CPU_ONLINE:
> > -   cpufreq_add_dev(dev, NULL);
> > -   break;
> > -
> > -   case CPU_DOWN_PREPARE:
> > -   cpufreq_offline_prepare(cpu);
> > -   break;
> > -
> > -   case CPU_POST_DEAD:
> > -   cpufreq_offline_finish(cpu);
> > -   break;
> > -
> > -   case CPU_DOWN_FAILED:
> > -   cpufreq_add_dev(dev, NULL);
> > -   break;
> > -   }
> > +   switch (action & ~CPU_TASKS_FROZEN) {
> > +   case CPU_ONLINE:
> > +   cpufreq_cpu_online(cpu);
> > +   break;
> > +
> > +   case CPU_DOWN_PREPARE:
> > +   cpufreq_offline_prepare(cpu);
> > +   break;
> > +
> > +   case CPU_POST_DEAD:
> > +   cpufreq_offline_finish(cpu);
> > +   break;
> > +
> > +   case CPU_DOWN_FAILED:
> > +   cpufreq_cpu_online(cpu);
> > +   break;
> > }
> > return NOTIFY_OK;
> >  }
> 
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: [PATCH 0/4] Add support for Hyperlinks and Markup on kernel-doc

2015-07-23 Thread Jonathan Corbet

On Thu, 23 Jul 2015 15:16:23 -0300
Danilo Cesar Lemes de Paula  wrote:

> This series adds support for hyperlink cross-references on Docbooks and
> an optional markup syntax for in-source Documentation.

I like the idea; just be warned that it's likely to be a week or two and
one more ocean crossing before I can take a serious look at this...

Thanks,

jon

Re: string_escape_mem ESCAPE_SPACE

2015-07-23 Thread Andy Shevchenko

On Thu, 2015-07-23 at 12:59 -0700, Kees Cook wrote:
> Hi,
> 
> I'm curious why ESCAPE_SPACE doesn't escape spaces (0x20)?

Space is a printable character.
You perhaps want something like ESCAPE_SPACE | ESCAPE_HEX.

>  That is
> surprising to me, especially since things like isspace() include 
> 0x20.

Moreover, there are test cases in test-string_helpers.c module and they
are based on the real use cases (before helpers were introduced and
users were converted). So, there is no user which expects hex conversion
of the printable character if not asked explicitly.

-- 
Andy Shevchenko 
Intel Finland Oy

Re: [PATCH] mm: rename and document alloc_pages_exact_node

2015-07-23 Thread David Rientjes

On Thu, 23 Jul 2015, Christoph Lameter wrote:

> > The only possible downside would be existing users of
> > alloc_pages_node() that are calling it with an offline node.  Since it's a
> > VM_BUG_ON() that would catch that, I think it should be changed to a
> > VM_WARN_ON() and eventually fixed up because it's nonsensical.
> > VM_BUG_ON() here should be avoided.
> 
> The offline node thing could be addressed by using numa_mem_id()?
> 

I was concerned about any callers that were passing an offline node, not 
NUMA_NO_NODE, today.  One of the alloc-node functions has a VM_BUG_ON() 
for it, the other silently calls node_zonelist() on it.

I suppose the final alloc_pages_node() implementation could be

	if (nid == NUMA_NO_NODE || VM_WARN_ON(!node_online(nid)))
		nid = numa_mem_id();

	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
	return __alloc_pages(gfp_mask, order, node_zonelist(nid, gfp_mask));

though.
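
The fallback logic in that snippet can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: model_pick_nid(), MODEL_MAX_NUMNODES, the online[] array and the warnings counter are all hypothetical, standing in for node_online(), MAX_NUMNODES and VM_WARN_ON(); local_nid plays the role of numa_mem_id(). Unlike the snippet, the range check comes before the online lookup so the array access stays in bounds.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NUMA_NO_NODE (-1)	/* same value as the kernel's NUMA_NO_NODE */
#define MODEL_MAX_NUMNODES 4	/* arbitrary size for this model */

/*
 * Picks the node to allocate from:
 *  - NUMA_NO_NODE quietly falls back to the local node;
 *  - an offline node also falls back, but bumps *warnings (VM_WARN_ON);
 *  - an out-of-range node is a hard error (VM_BUG_ON, here an assert).
 */
static int model_pick_nid(int nid, const bool *online, int local_nid,
			  int *warnings)
{
	if (nid == MODEL_NUMA_NO_NODE)
		return local_nid;

	/* Checked first here so the online[] lookup cannot overrun. */
	assert(nid >= 0 && nid < MODEL_MAX_NUMNODES);

	if (!online[nid]) {
		(*warnings)++;
		return local_nid;
	}
	return nid;
}
```

A caller passing an offline node still gets memory from the local node, and the warning count records that the call site needs fixing.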

Re: [PATCH 0/4] ACPI: Update method tracing facility.

2015-07-23 Thread Rafael J. Wysocki

On Thursday, July 23, 2015 01:06:37 PM Lv Zheng wrote:
> This patch updates method tracing facility according to ACPICA 20150717
> release changes.

Can you please explain this in a few more words.

Why do we need to update the interface?

Rafael


Re: [PATCH 1/2] locktorture: Support rtmutex torturing

2015-07-23 Thread Davidlohr Bueso

On Wed, 2015-07-22 at 17:17 -0700, Paul E. McKenney wrote:
> On Wed, Jul 22, 2015 at 02:07:27PM -0700, Davidlohr Bueso wrote:
> > Real time mutexes is one of the few general primitives
> > that we do not have in locktorture. Address this -- a few
> > considerations:
> > 
> > o To spice things up, enable competing thread(s) to become
> > rt, such that we can stress different prio boosting paths
> > in the rtmutex code. Introduce a ->task_boost callback,
> > only used by rtmutex-torturer. Tasks will boost/deboost
> > around every 50k (arbitrarily) lock/unlock operations.
> > 
> > o Hold times are similar to what we have for other locks:
> > only occasionally having longer hold times (per ~200k ops).
> > So we roughly do two full rt boost+deboosting ops with
> > short hold times.
> > 
> > Signed-off-by: Davidlohr Bueso 
> 
> I have queued this one for testing, and by default would push it into
> the 4.4 merge window (the one after next).  Please let me know if you
> want it sooner.

Thanks, although here's a v2 with some small fixes:

- comment when resetting prio s/500k/50k.
- when resetting prio, there's a chance a non-rt task can be converted
(which we certainly don't want). Check trsp for nil.

While I was planning for 4.3, I can certainly wait a few more weeks for
4.4. Either way is fine.

Thanks,
Davidlohr

8<---
[PATCH v2] locktorture: Support rtmutex torturing

Real time mutexes is one of the few general primitives
that we do not have in locktorture. Address this -- a few
considerations:

o To spice things up, enable competing thread(s) to become
rt, such that we can stress different prio boosting paths
in the rtmutex code. Introduce a ->task_boost callback,
only used by rtmutex-torturer. Tasks will boost/deboost
around every 50k (arbitrarily) lock/unlock operations.

o Hold times are similar to what we have for other locks:
only occasionally having longer hold times (per ~200k ops).
So we roughly do two full rt boost+deboosting ops with
short hold times.

Signed-off-by: Davidlohr Bueso 
---
 Documentation/locking/locktorture.txt  |   3 +
 kernel/locking/locktorture.c   | 114 -
 .../selftests/rcutorture/configs/lock/CFLIST   |   3 +-
 .../selftests/rcutorture/configs/lock/LOCK05   |   6 ++
 .../selftests/rcutorture/configs/lock/LOCK05.boot  |   1 +
 5 files changed, 124 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/rcutorture/configs/lock/LOCK05
 create mode 100644 tools/testing/selftests/rcutorture/configs/lock/LOCK05.boot

diff --git a/Documentation/locking/locktorture.txt b/Documentation/locking/locktorture.txt
index 619f2bb..a2ef3a9 100644
--- a/Documentation/locking/locktorture.txt
+++ b/Documentation/locking/locktorture.txt
@@ -52,6 +52,9 @@ torture_type   Type of lock to torture. By default, only spinlocks will
 
 o "mutex_lock": mutex_lock() and mutex_unlock() pairs.
 
+o "rtmutex_lock": rtmutex_lock() and rtmutex_unlock()
+  pairs. Kernel must have CONFIG_RT_MUTEX=y.
+
 o "rwsem_lock": read/write down() and up() semaphore pairs.
 
 torture_runnable  Start locktorture at boot time in the case where the
diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index 3224418..26ddd63 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -17,12 +17,14 @@
  *
  * Copyright (C) IBM Corporation, 2014
  *
- * Author: Paul E. McKenney 
+ * Authors: Paul E. McKenney 
+ *  Davidlohr Bueso 
  * Based on kernel/rcu/torture.c.
  */
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -91,11 +93,13 @@ struct lock_torture_ops {
void (*init)(void);
int (*writelock)(void);
void (*write_delay)(struct torture_random_state *trsp);
+   void (*task_boost)(struct torture_random_state *trsp);
void (*writeunlock)(void);
int (*readlock)(void);
void (*read_delay)(struct torture_random_state *trsp);
void (*readunlock)(void);
-   unsigned long flags;
+
+   unsigned long flags; /* for irq spinlocks */
const char *name;
 };
 
@@ -139,9 +143,15 @@ static void torture_lock_busted_write_unlock(void)
  /* BUGGY, do not use in real life!!! */
 }
 
+static void torture_boost_dummy(struct torture_random_state *trsp)
+{
+   /* Only rtmutexes care about priority */
+}
+
 static struct lock_torture_ops lock_busted_ops = {
.writelock  = torture_lock_busted_write_lock,
.write_delay= torture_lock_busted_write_delay,
+   .task_boost = torture_boost_dummy,
.writeunlock= torture_lock_busted_write_unlock,
.readlock   = NULL,
.read_delay = NULL,
@@ -185,6 +195,7 @@ static void torture_spin_lock_write_unlock(void) __releases(torture_spinlock)
 static struct

Dealing with the NMI mess

2015-07-23 Thread Andy Lutomirski

[moved to a new thread, cc list trimmed]

Hi all-

We've considered two approaches to dealing with NMIs:

1. Allow nesting.  We know quite well how messy that is.

2. Forbid IRET inside NMIs.  Doable but maybe not that pretty.

We haven't considered:

3. Forbid faults (other than MCE) inside NMI.

Option 3 is almost easy.  There are really only two kinds of faults
that can legitimately nest inside NMI: #PF and #DB.  #DB is easy to
fix (e.g. with my patches or Peter's patches).

What if we went all out and forbade page faults in NMI as well?  There
are two reasons that I can think of that we might page fault inside an
NMI:

a) vmalloc fault.  I think Ingo already half-implemented a rework to
eliminate vmalloc faults entirely.

b) User memory access faults.

The reason we access user state in general from an NMI is to allow
perf to capture enough user stack data to let the tooling backtrace
back to user space.  What if we did it differently?  Instead of
capturing this data in NMI context, capture it in
prepare_exit_to_usermode.  That would let us capture user state
*correctly*, which we currently can't really do.  There's a
never-ending series of minor bugs in which we try to guess the user
register state from NMI context, and it sort of works.  In
prepare_exit_to_usermode, we really truly know the user state.
There's a race where an NMI hits during or after
prepare_exit_to_usermode, but maybe that's okay -- just admit defeat
in that case and don't show the user state.  (Realistically, without
CFI data, we're not going to be guaranteed to get the right state
anyway.)

To make this work, we'd have to teach NMI-from-userspace to call the
callback itself.  It would look like:

prepare_exit_to_usermode() {
  ...
  while (blah blah blah) {
    if (cached_flags & TIF_PERF_CAPTURE_USER_STATE)
      perf_capture_user_state();
    ...
  }
  ...
}

and then, on NMI exit, we'd call perf_capture_user_state directly,
since we don't want to enable IRQs or do opportunistic sysret on exit
from NMI.  (Why not?  Because NMIs are still masked, and we don't want
to pay for double-IRET to unmask them, so we really want to leave IRQs
off and IRET straight back to user mode.)

There's an unavoidable race in which we enter user mode with
TIF_PERF_CAPTURE_USER_STATE still set.  In principle, we could
IPI-to-self from the NMI handler to cover that case (mostly -- we
capture the wrong state if we're on our way to an IRET fault), or we
could just check on entry if the flag is still set and, if so, admit
defeat.
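
As a rough illustration of the flag handshake (not a proposal for the actual implementation), here is a userspace model. Everything in it is hypothetical: struct model_task, its capture_pending field (standing in for TIF_PERF_CAPTURE_USER_STATE) and the three model_* functions exist only to show who sets the flag, who consumes it, and where the "admit defeat" case lands.

```c
#include <stdbool.h>

/* Hypothetical per-task state for the model. */
struct model_task {
	bool capture_pending;	/* stands in for TIF_PERF_CAPTURE_USER_STATE */
	int captured;		/* samples taken with known-good user state */
	int lost;		/* samples we gave up on */
};

/* NMI handler: never touches user memory, only asks for a later capture. */
static void model_nmi(struct model_task *t)
{
	t->capture_pending = true;
}

/* Exit-to-usermode work loop: the user register state is fully known here. */
static void model_prepare_exit_to_usermode(struct model_task *t)
{
	while (t->capture_pending) {
		t->capture_pending = false;
		t->captured++;	/* perf_capture_user_state() would run here */
	}
}

/*
 * Entry from user mode: if the flag is still set, the NMI arrived after
 * the exit loop had finished, so the state it wanted is gone.
 */
static void model_enter_from_usermode(struct model_task *t)
{
	if (t->capture_pending) {
		t->capture_pending = false;
		t->lost++;
	}
}
```

An NMI that lands before the exit loop is captured with exact user state; one that lands after it is counted as lost on the next entry, which is the second of the two options described above.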

Peter, can this be done without breaking the perf ABI?  If we were
designing all of this stuff from scratch right now, I'd suggest doing
it this way, but I'm not sure whether it makes sense to try to
retrofit it in.


If we decide to stick with option 2, then I've now convinced myself
that banning all kernel breakpoints and watchpoints during NMI
processing is probably for the best.  Maybe we should go one step
farther and ban all DR7 breakpoints period.  Sure, it will slow down
perf if there are user breakpoints or watchpoints set, but, having
looked at the asm, returning from #DB using RET is, while doable,
distinctly ugly.

--Andy

Re: [PATCH] mm, page_isolation: make set/unset_migratetype_isolate() file-local

2015-07-23 Thread David Rientjes

On Thu, 23 Jul 2015, Naoya Horiguchi wrote:

> Nowadays, set/unset_migratetype_isolate() is defined and used only in
> mm/page_isolation.c, so let's limit the scope to that file.
> 
> Signed-off-by: Naoya Horiguchi 

Acked-by: David Rientjes 

Re: [PATCH 5/8] power: bq27x00_battery: Renaming for consistency

2015-07-23 Thread Pali Rohár

On Thursday 23 July 2015 19:03:08 Andrew F. Davis wrote:
> >> -#ifdef CONFIG_BATTERY_BQ27X00_I2C
> >> -MODULE_ALIAS("i2c:bq27000-battery");
> >> +#ifdef CONFIG_BATTERY_BQ27XXX_I2C
> >> +MODULE_ALIAS("i2c:bq27xxx-battery");
> >> 
> >>  #endif
> > 
> > Why is this MODULE_ALIAS needed? A few lines above there is
> > 
> >  MODULE_DEVICE_TABLE(i2c, bq27xxx_id);
> > 
> > which adds the proper i2c: module alias...
> 
> Not sure, looks like it was added in commit
> 8ebb7e9c1a502cfc300618c19c3c6f06fc76d237 which claims that the
> "module won't get loaded automatically" without it, but I have not
> had this problem, so I'm not sure why it's there.
> 

git grep bq27000-battery shows me that only one driver uses that name:
drivers/w1/slaves/w1_bq27000.c

Moreover, it is a platform device, not an i2c device. So commit
8ebb7e9c1a502cfc300618c19c3c6f06fc76d237 is wrong! CCing Marek.

MODULE_ALIAS("platform:bq27000-battery") is really needed for
w1_bq27000.c but MODULE_ALIAS("i2c:bq27000-battery") should be removed.
It is not used by any board platform code or DT.

Marek, correct me if I'm wrong.

-- 
Pali Rohár
pali.ro...@gmail.com


Re: [PATCH] usb: ulpi: call put_device if device_register fails

2015-07-23 Thread Felipe Balbi

Hi,

On Thu, Jul 23, 2015 at 11:00:29AM -0700, Greg Kroah-Hartman wrote:
> On Thu, Jul 23, 2015 at 12:02:40AM -0500, Felipe Balbi wrote:
> > Hi,
> > 
> > On Wed, Jul 22, 2015 at 08:14:46PM -0700, Greg Kroah-Hartman wrote:
> > > On Wed, Jul 22, 2015 at 09:04:40PM -0500, Felipe Balbi wrote:
> > > > On Wed, Jul 22, 2015 at 02:39:34PM -0700, Greg Kroah-Hartman wrote:
> > > > > On Tue, Jun 23, 2015 at 01:57:38PM +0300, Heikki Krogerus wrote:
> > > > > > On Fri, Jun 19, 2015 at 01:12:36AM +0800, ChengYi He wrote:
> > > > > > > put_device is required to release the last reference to the device.
> > > > > > > 
> > > > > > > Signed-off-by: ChengYi He 
> > > > > > > ---
> > > > > > >  drivers/usb/common/ulpi.c | 4 +++-
> > > > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/usb/common/ulpi.c b/drivers/usb/common/ulpi.c
> > > > > > > index 0e6f968..bd25bdb 100644
> > > > > > > --- a/drivers/usb/common/ulpi.c
> > > > > > > +++ b/drivers/usb/common/ulpi.c
> > > > > > > @@ -184,8 +184,10 @@ static int ulpi_register(struct device *dev, struct ulpi *ulpi)
> > > > > > >   request_module("ulpi:v%04xp%04x", ulpi->id.vendor, ulpi->id.product);
> > > > > > >  
> > > > > > >   ret = device_register(&ulpi->dev);
> > > > > > > - if (ret)
> > > > > > > + if (ret) {
> > > > > > > + put_device(&ulpi->dev);
> > > > > > 
> > > > > > If device_register returns failure, put_device has already been
> > > > > > called. Check device_add in drivers/base/core.c.
> > > > > 
> > > > > Yes, please read the function, which says:
> > > > >  * NOTE: _Never_ directly free @dev after calling this function, even
> > > > >  * if it returned an error! Always use put_device() to give up your
> > > > >  * reference instead.
> > > > > 
> > > > > But, the problem is that the ulpi core doesn't "own" that struct device.
> > > > > It comes from elsewhere.  It comes from somewhere deep down in the dwc3
> > > > > core, which is where I lost the path.  Something needs to be fixed in
> > > > > dwc3_probe() to properly clean up the device if it fails, which is not
> > > > > happening right now.
> > > > > 
> > > > > So this patch would actually cause much bigger problems than fixing
> > > > > anything, so it's wrong, but for a different reason than you are talking
> > > > > about here.
> > > > > 
> > > > > And ugh, the ulpi and dwc code binding together, what a mess, horrid...
> > > > 
> > > > any suggestions ? DWC *is* the one implementing the bus. If there's a
> > > > better way, we can certainly shuffle code around.
> > > 
> > > As dwc is the only thing using the bus, why is it drivers/usb/core/ ?
> > 
> > musb also has a SW-accessible ULPI bus. And, IIRC, so does DWC2 ;-)
> 
> But they aren't calling ulpi_register(), so how can they be using this
> code?

the thing was just added :-) It didn't exist before.

> > > And the error path here is broken, the bus should be creating the device
> > > (i.e. no subsystem should ever be registering a device it did not
> > > create), so that it can properly clean things up when stuff goes wrong.
> > > 
> > > The whole subsys_init() is also a bad feeling that it's not architected
> > > correctly, that shouldn't be needed, which is why I never took that
> > > patch.  Just noticed it came in through yours, I wanted it "broken" so
> > > it would be fixed "properly" and not papered over like this.
> > 
> > I just felt it would be better to 'fix' it for the -rc until it can be
> > fixed *properly*. A follow up fix should incur no visible changes to
> > drivers anyway.
> 
> I don't like fixes like this because no one now has any pressure to fix
> it "properly".  Are you doing that work?  If not, who is?

Heikki is author, I'd expect him to fix it up. We can also revert the
fix if you prefer, I'm totally fine with that.

-- 
balbi



Re: [PATCH] drivers: qcom: Select QCOM_SCM unconditionally for QCOM_PM

2015-07-23 Thread Andy Gross

On Tue, Jul 14, 2015 at 04:54:12PM -0500, Andy Gross wrote:
> On Fri, Jul 10, 2015 at 02:18:00PM -0600, Lina Iyer wrote:
> > Enable QCOM_SCM for QCOM power management driver
> > 
> > Signed-off-by: Lina Iyer 
> 
> Acked-by: Andy Gross 

Applied.

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


string_escape_mem ESCAPE_SPACE

2015-07-23 Thread Kees Cook

Hi,

I'm curious why ESCAPE_SPACE doesn't escape spaces (0x20)? That is
surprising to me, especially since things like isspace() include 0x20.

-Kees

-- 
Kees Cook
Chrome OS Security

[net-next PATCH v1 3/6] net: netcp: Fixes error in oversized memory allocation for statistics storage

2015-07-23 Thread WingMan Kwok

The CPSW driver keeps internally some, but not all, of
the statistics available in the hw statistics modules.  Furthermore,
some of the locations in the hw statistics modules are reserved and
contain no useful information.  Prior to this patch, the driver
allocates memory of the size of the whole hw statistics modules,
instead of the size of statistics-entries-interested-in (i.e. et_stats),
for internal storage.  This patch fixes that.

Signed-off-by: WingMan Kwok 
---
 drivers/net/ethernet/ti/netcp_ethss.c |   46 +++--
 1 file changed, 21 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/ti/netcp_ethss.c 
b/drivers/net/ethernet/ti/netcp_ethss.c
index b954856..3976516 100644
--- a/drivers/net/ethernet/ti/netcp_ethss.c
+++ b/drivers/net/ethernet/ti/netcp_ethss.c
@@ -295,8 +295,6 @@ struct xgbe_hw_stats {
u32 rx_dma_overruns;
 };
 
-#define XGBE10_NUM_STAT_ENTRIES (sizeof(struct xgbe_hw_stats)/sizeof(u32))
-
 struct gbenu_ss_regs {
u32 id_ver;
u32 synce_count;/* NU */
@@ -480,7 +478,6 @@ struct gbenu_hw_stats {
u32 tx_pri7_drop_bcnt;
 };
 
-#define GBENU_NUM_HW_STAT_ENTRIES (sizeof(struct gbenu_hw_stats) / sizeof(u32))
 #define GBENU_HW_STATS_REG_MAP_SZ  0x200
 
 struct gbe_ss_regs {
@@ -615,7 +612,6 @@ struct gbe_hw_stats {
u32 rx_dma_overruns;
 };
 
-#define GBE13_NUM_HW_STAT_ENTRIES (sizeof(struct gbe_hw_stats)/sizeof(u32))
 #define GBE_MAX_HW_STAT_MODS   9
 #define GBE_HW_STATS_REG_MAP_SZ0x100
 
@@ -2555,10 +2551,12 @@ static int set_xgbe_ethss10_priv(struct gbe_priv 
*gbe_dev,
}
gbe_dev->xgbe_serdes_regs = regs;
 
+   gbe_dev->et_stats = xgbe10_et_stats;
+   gbe_dev->num_et_stats = ARRAY_SIZE(xgbe10_et_stats);
+
gbe_dev->hw_stats = devm_kzalloc(gbe_dev->dev,
- XGBE10_NUM_STAT_ENTRIES *
- (gbe_dev->max_num_ports) * sizeof(u64),
- GFP_KERNEL);
+gbe_dev->num_et_stats * sizeof(u64),
+GFP_KERNEL);
if (!gbe_dev->hw_stats) {
dev_err(gbe_dev->dev, "hw_stats memory allocation failed\n");
return -ENOMEM;
@@ -2577,8 +2575,6 @@ static int set_xgbe_ethss10_priv(struct gbe_priv *gbe_dev,
gbe_dev->ale_ports = gbe_dev->max_num_ports;
gbe_dev->host_port = XGBE10_HOST_PORT_NUM;
gbe_dev->ale_entries = XGBE10_NUM_ALE_ENTRIES;
-   gbe_dev->et_stats = xgbe10_et_stats;
-   gbe_dev->num_et_stats = ARRAY_SIZE(xgbe10_et_stats);
gbe_dev->stats_en_mask = (1 << (gbe_dev->max_num_ports)) - 1;
 
/* Subsystem registers */
@@ -2663,10 +2659,12 @@ static int set_gbe_ethss14_priv(struct gbe_priv 
*gbe_dev,
}
gbe_dev->switch_regs = regs;
 
+   gbe_dev->et_stats = gbe13_et_stats;
+   gbe_dev->num_et_stats = ARRAY_SIZE(gbe13_et_stats);
+
gbe_dev->hw_stats = devm_kzalloc(gbe_dev->dev,
- GBE13_NUM_HW_STAT_ENTRIES *
- gbe_dev->max_num_slaves * sizeof(u64),
- GFP_KERNEL);
+gbe_dev->num_et_stats * sizeof(u64),
+GFP_KERNEL);
if (!gbe_dev->hw_stats) {
dev_err(gbe_dev->dev, "hw_stats memory allocation failed\n");
return -ENOMEM;
@@ -2689,8 +2687,6 @@ static int set_gbe_ethss14_priv(struct gbe_priv *gbe_dev,
gbe_dev->ale_ports = gbe_dev->max_num_ports;
gbe_dev->host_port = GBE13_HOST_PORT_NUM;
gbe_dev->ale_entries = GBE13_NUM_ALE_ENTRIES;
-   gbe_dev->et_stats = gbe13_et_stats;
-   gbe_dev->num_et_stats = ARRAY_SIZE(gbe13_et_stats);
gbe_dev->stats_en_mask = GBE13_REG_VAL_STAT_ENABLE_ALL;
 
/* Subsystem registers */
@@ -2717,10 +2713,18 @@ static int set_gbenu_ethss_priv(struct gbe_priv 
*gbe_dev,
void __iomem *regs;
int i, ret;
 
+   gbe_dev->et_stats = gbenu_et_stats;
+
+   if (IS_SS_ID_NU(gbe_dev))
+   gbe_dev->num_et_stats = GBENU_ET_STATS_HOST_SIZE +
+   (gbe_dev->max_num_slaves * GBENU_ET_STATS_PORT_SIZE);
+   else
+   gbe_dev->num_et_stats = GBENU_ET_STATS_HOST_SIZE +
+   GBENU_ET_STATS_PORT_SIZE;
+
gbe_dev->hw_stats = devm_kzalloc(gbe_dev->dev,
- GBENU_NUM_HW_STAT_ENTRIES *
- (gbe_dev->max_num_ports) * sizeof(u64),
- GFP_KERNEL);
+gbe_dev->num_et_stats * sizeof(u64),
+GFP_KERNEL);
if (!gbe_dev->hw_stats) {
dev_err(gbe_dev->dev, "hw_stats memory allocation failed\n");

[net-next PATCH v1 4/6] net: netcp: Consolidates statistics collection code

2015-07-23 Thread WingMan Kwok

Different Keystone2 platforms have different numbers and
layouts of hw statistics modules.  This patch consolidates
the statistics processing of different Keystone2 platforms
for easy maintenance.

Signed-off-by: WingMan Kwok 
---
 drivers/net/ethernet/ti/netcp_ethss.c |   99 ++---
 1 file changed, 54 insertions(+), 45 deletions(-)

diff --git a/drivers/net/ethernet/ti/netcp_ethss.c 
b/drivers/net/ethernet/ti/netcp_ethss.c
index 3976516..b06f210 100644
--- a/drivers/net/ethernet/ti/netcp_ethss.c
+++ b/drivers/net/ethernet/ti/netcp_ethss.c
@@ -1550,70 +1550,79 @@ static int keystone_get_sset_count(struct net_device 
*ndev, int stringset)
}
 }
 
-static void gbe_update_stats(struct gbe_priv *gbe_dev, uint64_t *data)
+static inline void gbe_update_hw_stats_entry(struct gbe_priv *gbe_dev,
+int et_stats_entry)
 {
void __iomem *base = NULL;
u32  __iomem *p;
u32 tmp = 0;
+
+   /* The hw_stats_regs pointers are already
+* properly set to point to the right base:
+*/
+   base = gbe_dev->hw_stats_regs[gbe_dev->et_stats[et_stats_entry].type];
+   p = base + gbe_dev->et_stats[et_stats_entry].offset;
+   tmp = readl(p);
+   gbe_dev->hw_stats[et_stats_entry] += tmp;
+
+   /* write-to-decrement:
+* new register value = old register value - write value
+*/
+   writel(tmp, p);
+}
+
+static void gbe_update_stats(struct gbe_priv *gbe_dev, uint64_t *data)
+{
int i;
 
for (i = 0; i < gbe_dev->num_et_stats; i++) {
-   base = gbe_dev->hw_stats_regs[gbe_dev->et_stats[i].type];
-   p = base + gbe_dev->et_stats[i].offset;
-   tmp = readl(p);
-   gbe_dev->hw_stats[i] = gbe_dev->hw_stats[i] + tmp;
+   gbe_update_hw_stats_entry(gbe_dev, i);
+
if (data)
data[i] = gbe_dev->hw_stats[i];
-   /* write-to-decrement:
-* new register value = old register value - write value
-*/
-   writel(tmp, p);
}
 }
 
-static void gbe_update_stats_ver14(struct gbe_priv *gbe_dev, uint64_t *data)
+static inline void gbe_stats_mod_visible_ver14(struct gbe_priv *gbe_dev,
+  int stats_mod)
 {
-   void __iomem *gbe_statsa = gbe_dev->hw_stats_regs[0];
-   void __iomem *gbe_statsb = gbe_dev->hw_stats_regs[1];
-   u64 *hw_stats = &gbe_dev->hw_stats[0];
-   void __iomem *base = NULL;
-   u32  __iomem *p;
-   u32 tmp = 0, val, pair_size = (gbe_dev->num_et_stats / 2);
-   int i, j, pair;
+   u32 val;
 
-   for (pair = 0; pair < 2; pair++) {
-   val = readl(GBE_REG_ADDR(gbe_dev, switch_regs, stat_port_en));
+   val = readl(GBE_REG_ADDR(gbe_dev, switch_regs, stat_port_en));
 
-   if (pair == 0)
-   val &= ~GBE_STATS_CD_SEL;
-   else
-   val |= GBE_STATS_CD_SEL;
+   switch (stats_mod) {
+   case GBE_STATSA_MODULE:
+   case GBE_STATSB_MODULE:
+   val &= ~GBE_STATS_CD_SEL;
+   break;
+   case GBE_STATSC_MODULE:
+   case GBE_STATSD_MODULE:
+   val |= GBE_STATS_CD_SEL;
+   break;
+   default:
+   return;
+   }
+
+   /* make the stat module visible */
+   writel(val, GBE_REG_ADDR(gbe_dev, switch_regs, stat_port_en));
+}
 
-   /* make the stat modules visible */
-   writel(val, GBE_REG_ADDR(gbe_dev, switch_regs, stat_port_en));
+static void gbe_update_stats_ver14(struct gbe_priv *gbe_dev, uint64_t *data)
+{
+   u32 half_num_et_stats = (gbe_dev->num_et_stats / 2);
+   int et_entry, j, pair;
 
-   for (i = 0; i < pair_size; i++) {
-   j = pair * pair_size + i;
-   switch (gbe_dev->et_stats[j].type) {
-   case GBE_STATSA_MODULE:
-   case GBE_STATSC_MODULE:
-   base = gbe_statsa;
-   break;
-   case GBE_STATSB_MODULE:
-   case GBE_STATSD_MODULE:
-   base  = gbe_statsb;
-   break;
-   }
+   for (pair = 0; pair < 2; pair++) {
+   gbe_stats_mod_visible_ver14(gbe_dev, (pair ?
+ GBE_STATSC_MODULE :
+ GBE_STATSA_MODULE));
+
+   for (j = 0; j < half_num_et_stats; j++) {
+   et_entry = pair * half_num_et_stats + j;
+   gbe_update_hw_stats_entry(gbe_dev, et_entry);
 
-   p = base + gbe_dev->et_stats[j].offset;
-   tmp = readl(p);
-   hw_stats[j] += tmp;
if (data)
-

[net-next PATCH v1 6/6] net: netcp: Adds missing statistics for K2L and K2E

2015-07-23 Thread WingMan Kwok

This patch adds the missing statistics for the host
and slave ports of the CPSW on K2L and K2E platforms.

Signed-off-by: WingMan Kwok 
---
 drivers/net/ethernet/ti/netcp_ethss.c |  177 -
 1 file changed, 174 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/ti/netcp_ethss.c 
b/drivers/net/ethernet/ti/netcp_ethss.c
index aa33066..01a955c 100644
--- a/drivers/net/ethernet/ti/netcp_ethss.c
+++ b/drivers/net/ethernet/ti/netcp_ethss.c
@@ -872,7 +872,7 @@ static const struct netcp_ethtool_stat gbe13_et_stats[] = {
 };
 
 /* This is the size of entries in GBENU_STATS_HOST */
-#define GBENU_ET_STATS_HOST_SIZE   33
+#define GBENU_ET_STATS_HOST_SIZE   52
 
 #define GBENU_STATS_HOST(field)\
 {  \
@@ -881,8 +881,8 @@ static const struct netcp_ethtool_stat gbe13_et_stats[] = {
offsetof(struct gbenu_hw_stats, field)  \
 }
 
-/* This is the size of entries in GBENU_STATS_HOST */
-#define GBENU_ET_STATS_PORT_SIZE   46
+/* This is the size of entries in GBENU_STATS_PORT */
+#define GBENU_ET_STATS_PORT_SIZE   65
 
 #define GBENU_STATS_P1(field)  \
 {  \
@@ -974,7 +974,26 @@ static const struct netcp_ethtool_stat gbenu_et_stats[] = {
GBENU_STATS_HOST(ale_unknown_mcast_bytes),
GBENU_STATS_HOST(ale_unknown_bcast),
GBENU_STATS_HOST(ale_unknown_bcast_bytes),
+   GBENU_STATS_HOST(ale_pol_match),
+   GBENU_STATS_HOST(ale_pol_match_red),
+   GBENU_STATS_HOST(ale_pol_match_yellow),
GBENU_STATS_HOST(tx_mem_protect_err),
+   GBENU_STATS_HOST(tx_pri0_drop),
+   GBENU_STATS_HOST(tx_pri1_drop),
+   GBENU_STATS_HOST(tx_pri2_drop),
+   GBENU_STATS_HOST(tx_pri3_drop),
+   GBENU_STATS_HOST(tx_pri4_drop),
+   GBENU_STATS_HOST(tx_pri5_drop),
+   GBENU_STATS_HOST(tx_pri6_drop),
+   GBENU_STATS_HOST(tx_pri7_drop),
+   GBENU_STATS_HOST(tx_pri0_drop_bcnt),
+   GBENU_STATS_HOST(tx_pri1_drop_bcnt),
+   GBENU_STATS_HOST(tx_pri2_drop_bcnt),
+   GBENU_STATS_HOST(tx_pri3_drop_bcnt),
+   GBENU_STATS_HOST(tx_pri4_drop_bcnt),
+   GBENU_STATS_HOST(tx_pri5_drop_bcnt),
+   GBENU_STATS_HOST(tx_pri6_drop_bcnt),
+   GBENU_STATS_HOST(tx_pri7_drop_bcnt),
/* GBENU Module 1 */
GBENU_STATS_P1(rx_good_frames),
GBENU_STATS_P1(rx_broadcast_frames),
@@ -1021,7 +1040,26 @@ static const struct netcp_ethtool_stat gbenu_et_stats[] 
= {
GBENU_STATS_P1(ale_unknown_mcast_bytes),
GBENU_STATS_P1(ale_unknown_bcast),
GBENU_STATS_P1(ale_unknown_bcast_bytes),
+   GBENU_STATS_P1(ale_pol_match),
+   GBENU_STATS_P1(ale_pol_match_red),
+   GBENU_STATS_P1(ale_pol_match_yellow),
GBENU_STATS_P1(tx_mem_protect_err),
+   GBENU_STATS_P1(tx_pri0_drop),
+   GBENU_STATS_P1(tx_pri1_drop),
+   GBENU_STATS_P1(tx_pri2_drop),
+   GBENU_STATS_P1(tx_pri3_drop),
+   GBENU_STATS_P1(tx_pri4_drop),
+   GBENU_STATS_P1(tx_pri5_drop),
+   GBENU_STATS_P1(tx_pri6_drop),
+   GBENU_STATS_P1(tx_pri7_drop),
+   GBENU_STATS_P1(tx_pri0_drop_bcnt),
+   GBENU_STATS_P1(tx_pri1_drop_bcnt),
+   GBENU_STATS_P1(tx_pri2_drop_bcnt),
+   GBENU_STATS_P1(tx_pri3_drop_bcnt),
+   GBENU_STATS_P1(tx_pri4_drop_bcnt),
+   GBENU_STATS_P1(tx_pri5_drop_bcnt),
+   GBENU_STATS_P1(tx_pri6_drop_bcnt),
+   GBENU_STATS_P1(tx_pri7_drop_bcnt),
/* GBENU Module 2 */
GBENU_STATS_P2(rx_good_frames),
GBENU_STATS_P2(rx_broadcast_frames),
@@ -1068,7 +1106,26 @@ static const struct netcp_ethtool_stat gbenu_et_stats[] 
= {
GBENU_STATS_P2(ale_unknown_mcast_bytes),
GBENU_STATS_P2(ale_unknown_bcast),
GBENU_STATS_P2(ale_unknown_bcast_bytes),
+   GBENU_STATS_P2(ale_pol_match),
+   GBENU_STATS_P2(ale_pol_match_red),
+   GBENU_STATS_P2(ale_pol_match_yellow),
GBENU_STATS_P2(tx_mem_protect_err),
+   GBENU_STATS_P2(tx_pri0_drop),
+   GBENU_STATS_P2(tx_pri1_drop),
+   GBENU_STATS_P2(tx_pri2_drop),
+   GBENU_STATS_P2(tx_pri3_drop),
+   GBENU_STATS_P2(tx_pri4_drop),
+   GBENU_STATS_P2(tx_pri5_drop),
+   GBENU_STATS_P2(tx_pri6_drop),
+   GBENU_STATS_P2(tx_pri7_drop),
+   GBENU_STATS_P2(tx_pri0_drop_bcnt),
+   GBENU_STATS_P2(tx_pri1_drop_bcnt),
+   GBENU_STATS_P2(tx_pri2_drop_bcnt),
+   GBENU_STATS_P2(tx_pri3_drop_bcnt),
+   GBENU_STATS_P2(tx_pri4_drop_bcnt),
+   GBENU_STATS_P2(tx_pri5_drop_bcnt),
+   GBENU_STATS_P2(tx_pri6_drop_bcnt),
+   GBENU_STATS_P2(tx_pri7_drop_bcnt),
/* GBENU Module 3 */
GBENU_STATS_P3(rx_good_frames),
GBENU_STATS_P3(rx_broadcast_frames),
@@ -1115,7 +1172,26 @@ static const struct netcp_ethtool_stat gbenu_et_stats[] 
= {
GBENU_STATS_P3(ale_unknown_mcast_bytes),

[net-next PATCH v1 1/6] net: netcp: Fixes the use of spin_lock_bh in timer function

2015-07-23 Thread WingMan Kwok

This patch fixes a bug in which the timer routine synchronized
against the ethtool-triggered statistics updates with spin_lock_bh().
A timer function is itself a bottom-half, so this should be
spin_lock().

Signed-off-by: WingMan Kwok 
---
 drivers/net/ethernet/ti/netcp_ethss.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/netcp_ethss.c 
b/drivers/net/ethernet/ti/netcp_ethss.c
index 9b7e0a3..cabf977 100644
--- a/drivers/net/ethernet/ti/netcp_ethss.c
+++ b/drivers/net/ethernet/ti/netcp_ethss.c
@@ -2189,14 +2189,15 @@ static void netcp_ethss_timer(unsigned long arg)
netcp_ethss_update_link_state(gbe_dev, slave, NULL);
}
 
-   spin_lock_bh(&gbe_dev->hw_stats_lock);
+   /* A timer runs as a BH, no need to block them */
+   spin_lock(&gbe_dev->hw_stats_lock);
 
if (gbe_dev->ss_version == GBE_SS_VERSION_14)
gbe_update_stats_ver14(gbe_dev, NULL);
else
gbe_update_stats(gbe_dev, NULL);
 
-   spin_unlock_bh(&gbe_dev->hw_stats_lock);
+   spin_unlock(&gbe_dev->hw_stats_lock);
 
gbe_dev->timer.expires  = jiffies + GBE_TIMER_INTERVAL;
add_timer(&gbe_dev->timer);
-- 
1.7.9.5


[net-next PATCH v1 0/6] net: netcp: Bug fixes of CPSW statistics collection

2015-07-23 Thread WingMan Kwok

This patch set contains bug fixes and enhancements of hw ethernet
statistics processing on TI's Keystone2 CPSW ethernet switches.

v1: Removes unused defines in PATCH 3/6 based on reviewer's comment

WingMan Kwok (6):
  net: netcp: Fixes the use of spin_lock_bh in timer function
  net: netcp: Fixes hw statistics module base setting error
  net: netcp: Fixes error in oversized memory allocation for statistics
storage
  net: netcp: Consolidates statistics collection code
  net: netcp: Fixes to CPSW statistics collection
  net: netcp: Adds missing statistics for K2L and K2E

 drivers/net/ethernet/ti/netcp_ethss.c |  403 ++---
 1 file changed, 324 insertions(+), 79 deletions(-)

-- 
1.7.9.5


[net-next PATCH v1 2/6] net: netcp: Fixes hw statistics module base setting error

2015-07-23 Thread WingMan Kwok

This patch fixes an error in the setting of the hw statistics
module base for the K2HK platform.  Although K2HK has 4 hw
statistics modules, only 2 are visible at a time.
Thus when setting up the pointers to the base of the
corresponding hw statistics modules, modules 0 and 2 should
point to one base, while modules 1 and 3 should point to the
other.

Signed-off-by: WingMan Kwok 
---
 drivers/net/ethernet/ti/netcp_ethss.c |6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ti/netcp_ethss.c 
b/drivers/net/ethernet/ti/netcp_ethss.c
index cabf977..b954856 100644
--- a/drivers/net/ethernet/ti/netcp_ethss.c
+++ b/drivers/net/ethernet/ti/netcp_ethss.c
@@ -2675,10 +2675,14 @@ static int set_gbe_ethss14_priv(struct gbe_priv 
*gbe_dev,
gbe_dev->sgmii_port_regs = gbe_dev->ss_regs + GBE13_SGMII_MODULE_OFFSET;
gbe_dev->host_port_regs = gbe_dev->switch_regs + GBE13_HOST_PORT_OFFSET;
 
+   /* K2HK has only 2 hw stats modules visible at a time, so
+* module 0 & 2 points to one base and
+* module 1 & 3 points to the other base
+*/
for (i = 0; i < gbe_dev->max_num_slaves; i++) {
gbe_dev->hw_stats_regs[i] =
gbe_dev->switch_regs + GBE13_HW_STATS_OFFSET +
-   (GBE_HW_STATS_REG_MAP_SZ * i);
+   (GBE_HW_STATS_REG_MAP_SZ * (i & 0x1));
}
 
gbe_dev->ale_reg = gbe_dev->switch_regs + GBE13_ALE_OFFSET;
-- 
1.7.9.5


[net-next PATCH v1 5/6] net: netcp: Fixes to CPSW statistics collection

2015-07-23 Thread WingMan Kwok

In certain applications it's beneficial to allow the CPSW h/w
stats counters to continue to increment even while the kernel
polls them. This patch implements this behavior for both 1G
and 10G ethernet subsystem modules.

Signed-off-by: WingMan Kwok 
---
 drivers/net/ethernet/ti/netcp_ethss.c |   86 -
 1 file changed, 75 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/ti/netcp_ethss.c 
b/drivers/net/ethernet/ti/netcp_ethss.c
index b06f210..aa33066 100644
--- a/drivers/net/ethernet/ti/netcp_ethss.c
+++ b/drivers/net/ethernet/ti/netcp_ethss.c
@@ -642,6 +642,7 @@ struct gbe_priv {
boolenable_ale;
u8  max_num_slaves;
u8  max_num_ports; /* max_num_slaves + 1 */
+   u8  num_stats_mods;
struct netcp_tx_pipetx_pipe;
 
int host_port;
@@ -671,6 +672,7 @@ struct gbe_priv {
struct net_device   *dummy_ndev;
 
u64 *hw_stats;
+   u32 *hw_stats_prev;
const struct netcp_ethtool_stat *et_stats;
int num_et_stats;
/*  Lock for updating the hwstats */
@@ -1550,25 +1552,37 @@ static int keystone_get_sset_count(struct net_device 
*ndev, int stringset)
}
 }
 
+static void gbe_reset_mod_stats(struct gbe_priv *gbe_dev, int stats_mod)
+{
+   void __iomem *base = gbe_dev->hw_stats_regs[stats_mod];
+   u32  __iomem *p_stats_entry;
+   int i;
+
+   for (i = 0; i < gbe_dev->num_et_stats; i++) {
+   if (gbe_dev->et_stats[i].type == stats_mod) {
+   p_stats_entry = base + gbe_dev->et_stats[i].offset;
+   gbe_dev->hw_stats[i] = 0;
+   gbe_dev->hw_stats_prev[i] = readl(p_stats_entry);
+   }
+   }
+}
+
 static inline void gbe_update_hw_stats_entry(struct gbe_priv *gbe_dev,
 int et_stats_entry)
 {
void __iomem *base = NULL;
-   u32  __iomem *p;
-   u32 tmp = 0;
+   u32  __iomem *p_stats_entry;
+   u32 curr, delta;
 
/* The hw_stats_regs pointers are already
 * properly set to point to the right base:
 */
base = gbe_dev->hw_stats_regs[gbe_dev->et_stats[et_stats_entry].type];
-   p = base + gbe_dev->et_stats[et_stats_entry].offset;
-   tmp = readl(p);
-   gbe_dev->hw_stats[et_stats_entry] += tmp;
-
-   /* write-to-decrement:
-* new register value = old register value - write value
-*/
-   writel(tmp, p);
+   p_stats_entry = base + gbe_dev->et_stats[et_stats_entry].offset;
+   curr = readl(p_stats_entry);
+   delta = curr - gbe_dev->hw_stats_prev[et_stats_entry];
+   gbe_dev->hw_stats_prev[et_stats_entry] = curr;
+   gbe_dev->hw_stats[et_stats_entry] += delta;
 }
 
 static void gbe_update_stats(struct gbe_priv *gbe_dev, uint64_t *data)
@@ -1607,6 +1621,12 @@ static inline void gbe_stats_mod_visible_ver14(struct 
gbe_priv *gbe_dev,
writel(val, GBE_REG_ADDR(gbe_dev, switch_regs, stat_port_en));
 }
 
+static void gbe_reset_mod_stats_ver14(struct gbe_priv *gbe_dev, int stats_mod)
+{
+   gbe_stats_mod_visible_ver14(gbe_dev, stats_mod);
+   gbe_reset_mod_stats(gbe_dev, stats_mod);
+}
+
 static void gbe_update_stats_ver14(struct gbe_priv *gbe_dev, uint64_t *data)
 {
u32 half_num_et_stats = (gbe_dev->num_et_stats / 2);
@@ -2560,6 +2580,7 @@ static int set_xgbe_ethss10_priv(struct gbe_priv *gbe_dev,
}
gbe_dev->xgbe_serdes_regs = regs;
 
+   gbe_dev->num_stats_mods = gbe_dev->max_num_ports;
gbe_dev->et_stats = xgbe10_et_stats;
gbe_dev->num_et_stats = ARRAY_SIZE(xgbe10_et_stats);
 
@@ -2571,6 +2592,16 @@ static int set_xgbe_ethss10_priv(struct gbe_priv 
*gbe_dev,
return -ENOMEM;
}
 
+   gbe_dev->hw_stats_prev =
+   devm_kzalloc(gbe_dev->dev,
+gbe_dev->num_et_stats * sizeof(u32),
+GFP_KERNEL);
+   if (!gbe_dev->hw_stats_prev) {
+   dev_err(gbe_dev->dev,
+   "hw_stats_prev memory allocation failed\n");
+   return -ENOMEM;
+   }
+
gbe_dev->ss_version = XGBE_SS_VERSION_10;
gbe_dev->sgmii_port_regs = gbe_dev->ss_regs +
XGBE10_SGMII_MODULE_OFFSET;
@@ -2668,6 +2699,7 @@ static int set_gbe_ethss14_priv(struct gbe_priv *gbe_dev,
}
gbe_dev->switch_regs = regs;
 
+   gbe_dev->num_stats_mods = gbe_dev->max_num_slaves;
gbe_dev->et_stats = gbe13_et_stats;
gbe_dev->num_et_stats = ARRAY_SIZE(gbe13_et_stats);
 
@@ -2679,6 +2711,16 @@ static int set_gbe_ethss14_priv(struct gbe_priv *gbe_dev,
return -ENOMEM;
}
 
+

Re: [PATCH 2/2] locktorture: 'tis a slow death

2015-07-23 Thread Davidlohr Bueso

On Wed, 2015-07-22 at 17:13 -0700, Paul E. McKenney wrote:
> I need to see something more than what I am seeing for me to be able
> to accept this, cute though it unarguably is.

heh I didn't consider copyright for this kind of stuff. And was naive to
think that keeping his (what I assume to be) initials was enough. I've
contacted the author for permission to use his work.

But yeah, I had to laugh when I saw this. Although I probably triggered
some red flag by googling 'torture' and 'weapons' :-)

Thanks,
Davidlohr


[PATCH] backlight: pm8941-wled: Add default-brightness property

2015-07-23 Thread Bjorn Andersson

Add the possibility of specifying the default brightness in DT.

Signed-off-by: Bjorn Andersson 
---

This depends on the patch moving pm8941-wled to backlight [1]. The dt property
is used by several other backlight drivers, so I considered this to be a
"common" property and it's hence not prefixed with "qcom,".

[1] https://lkml.org/lkml/2015/7/21/906

 Documentation/devicetree/bindings/video/backlight/pm8941-wled.txt | 1 +
 drivers/video/backlight/pm8941-wled.c | 4 
 2 files changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/video/backlight/pm8941-wled.txt 
b/Documentation/devicetree/bindings/video/backlight/pm8941-wled.txt
index 424f8444a6cd..37503f8c3620 100644
--- a/Documentation/devicetree/bindings/video/backlight/pm8941-wled.txt
+++ b/Documentation/devicetree/bindings/video/backlight/pm8941-wled.txt
@@ -5,6 +5,7 @@ Required properties:
 - reg: slave address
 
 Optional properties:
+- default-brightness: value from: 0-4095
 - label: The name of the backlight device
 - qcom,cs-out: bool; enable current sink output
 - qcom,cabc: bool; enable content adaptive backlight control
diff --git a/drivers/video/backlight/pm8941-wled.c 
b/drivers/video/backlight/pm8941-wled.c
index c704c3236034..b875e58df0fc 100644
--- a/drivers/video/backlight/pm8941-wled.c
+++ b/drivers/video/backlight/pm8941-wled.c
@@ -373,6 +373,7 @@ static int pm8941_wled_probe(struct platform_device *pdev)
struct backlight_device *bl;
struct pm8941_wled *wled;
struct regmap *regmap;
+   u32 val = 0;
int rc;
 
regmap = dev_get_regmap(pdev->dev.parent, NULL);
@@ -395,8 +396,11 @@ static int pm8941_wled_probe(struct platform_device *pdev)
if (rc)
return rc;
 
+   of_property_read_u32(pdev->dev.of_node, "default-brightness", &val);
+
memset(&props, 0, sizeof(struct backlight_properties));
props.type = BACKLIGHT_RAW;
+   props.brightness = val;
props.max_brightness = PM8941_WLED_REG_VAL_MAX;
bl = devm_backlight_device_register(&pdev->dev, wled->name,
&pdev->dev, wled,
-- 
1.8.2.2


Re: [PATCH v7 1/2] mfd: devicetree: add bindings for Atmel Flexcom

2015-07-23 Thread Boris Brezillon

On Thu, 23 Jul 2015 18:42:55 +0200
Cyrille Pitchen  wrote:

> This patch documents the DT bindings for the Atmel Flexcom which will be
> introduced by sama5d2x SoCs. These bindings will be used by the actual
> Flexcom driver to be sent in another patch.
> 
> Signed-off-by: Cyrille Pitchen 
> ---
>  .../devicetree/bindings/mfd/atmel-flexcom.txt  | 68 
> ++
>  1 file changed, 68 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/mfd/atmel-flexcom.txt
> 
> diff --git a/Documentation/devicetree/bindings/mfd/atmel-flexcom.txt 
> b/Documentation/devicetree/bindings/mfd/atmel-flexcom.txt
> new file mode 100644
> index ..a63226b7a9cb
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/mfd/atmel-flexcom.txt
> @@ -0,0 +1,68 @@
> +* Device tree bindings for Atmel Flexcom (Flexible Serial Communication Unit)
> +
> +The Atmel Flexcom is just a wrapper which embeds a SPI controller, an I2C
> +controller and an USART. Only one function can be used at a time and is 
> chosen
> +at boot time according to the device tree.
> +
> +Required properties:
> +- compatible:Should be "atmel,sama5d2-flexcom"
> +- reg:   Should be the pair (offset, size) for the 
> Flexcom
> + dedicated I/O registers (without USART, TWI or SPI
> + registers).
> +- clocks:Should be the Flexcom peripheral clock from PMC.
> +- #address-cells:Should be <2>
> +- #size-cells:   Should be <1>
> +- ranges:Should be a list of ranges.
> + One range per peripheral wrapped by the Flexcom. So each
> + range is a triplet (child_addr, parent_addr, size). The
> + first u32 of "child_addr" is the value to be set in the
> + Operating Mode bitfield of the Flexcom Mode Register.
> + Then "parent_addr" stores the base address of the
> + corresponding peripheral in the system memory. Finally,
> + "size" is the size of the memory region of this
> + peripheral.
> +
> +Required child:
> +A single available child for the serial controller to enable.
> +
> +Required properties of this child:
> +- reg:   Should be a pair (child_addr, size) with 
> child_addr
> + matching one of the parent ranges.
> +- clocks:Should be the very same phandle as for the parent's one.
> +
> +Other properties remain unchanged. See documentation of the respective 
> device:
> +- ../serial/atmel-usart.txt
> +- ../spi/spi_atmel.txt
> +- ../i2c/i2c-at91.txt
> +
> +Example:
> +
> +flexcom@f8034000 {
> + compatible = "atmel,sama5d2-flexcom";
> + reg = <0xf8034000 0x200>;
> + clocks = <_clk>;
> + #address-cells = <2>;
> + #size-cells = <1>;
> + ranges = <1 0 0xf8034200 0x200  /* opmode 1: USART */
> +   2 0 0xf8034400 0x200  /* opmode 2: SPI */
> +   3 0 0xf8034600 0x200>;/* opmode 3: I2C */
> +
> + spi@f8034400 {

Should be:

spi@2,0 {

> + compatible = "atmel,at91rm9200-spi";
> + reg = <2 0 0x200>;
> + interrupts = <19 IRQ_TYPE_LEVEL_HIGH 7>;
> + pinctrl-names = "default";
> + pinctrl-0 = <_flx0_default>;
> + #address-cells = <1>;
> + #size-cells = <0>;
> + clocks = <_clk>;
> + clock-names = "spi_clk";
> + atmel,fifo-size = <32>;
> +
> + mtd_dataflash@0 {
> + compatible = "atmel,at25f512b";
> + reg = <0>;
> + spi-max-frequency = <2000>;
> + };
> + };
> +};



-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

Re: [PATCH block/for-linus] block: export bio_associate_*() and wbc_account_io()

2015-07-23 Thread Jens Axboe


On 07/23/2015 12:27 PM, Tejun Heo wrote:

bio_associate_blkcg(), bio_associate_current() and wbc_account_io()
are used to implement cgroup writeback support for filesystems and
thus need to be exported.  Export them.

Signed-off-by: Tejun Heo 
Reported-by: Stephen Rothwell 
---
Hello, Jens.

While this change isn't strictly necessary for 4.2, I think it'd be
better to push it through for-linus so that it's there before any
filesystem specific changes are merged.


Might as well shove it in, as it'll make the lives of others easier.

--
Jens Axboe


Re: [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size

2015-07-23 Thread Alex Bennée


Steven Rostedt  writes:

> On Fri, 17 Jul 2015 10:32:15 -0400
> Steven Rostedt  wrote:
>
>
>> This change affects all callers of dynamic_array, not just bitmasks.
>> 
>> >__data_size += __item_length;
>> >  
>> >  #undef __string
>> 
>> BTW, if I revert commit ac01ce1410fc2 "tracing: Make
>> ftrace_print_array_seq compute buf_len" it works again.
>> 
>> I'm going to look into this some more, and maybe the answer is to go
>> back and just pass in buffer length here. I can't see what was broken
>> before that change.
>
> OK, the print_array() code is already being used by the thermal events
> and can't be changed. But we can't make the proposed change because
> that changes the user interface.
>
> What we can change is the sample code!
>
> -- Steve
>
> From 95de1e9721a2f9d05831a53d228e181a33001c55 Mon Sep 17 00:00:00 2001
> From: "Steven Rostedt (Red Hat)" 
> Date: Fri, 17 Jul 2015 14:03:26 -0400
> Subject: [PATCH] tracing: Fix sample output of dynamic arrays
>
> He Kuang noticed that the trace event samples for arrays were broken:
>
> "The output result of trace_foo_bar event in traceevent samples is
>  wrong. This problem can be reproduced as following:
>
>   (Build kernel with SAMPLE_TRACE_EVENTS=m)
>
>   $ insmod trace-events-sample.ko
>
>   $ echo 1 > /sys/kernel/debug/tracing/events/sample-trace/foo_bar/enable
>
>   $ cat /sys/kernel/debug/tracing/trace
>
>   event-sample-980 [000]   43.649559: foo_bar: foo hello 21 0x15
>   BIT1|BIT3|0x10 {0x1,0x6f6f6e53,0xff007970,0x} Snoopy
>  ^^
>  The array length is not right, should be {0x1}.
>   (,)
>
>   event-sample-980 [000]   44.653827: foo_bar: foo hello 22 0x16
>   BIT2|BIT3|0x10
>   {0x1,0x2,0x646e6147,0x666c61,0x,0x,0x750aeffe,0x7}
>   ^^
>  The array length is not right, should be {0x1,0x2}.
>   Gandalf (,)"
>
> This was caused by an update to have __print_array()'s second parameter
> be the count of items in the array and not the size of the array.
>
> As there are already users of __print_array(), it cannot change. But
> the sample code can, and we can also improve the documentation about
> __print_array() and __get_dynamic_array_len().
>
> Link: 
> http://lkml.kernel.org/r/1436839171-31527-2-git-send-email-heku...@huawei.com
>
> Fixes: ac01ce1410fc2 ("tracing: Make ftrace_print_array_seq compute buf_len")
> Reported-by: He Kuang 
> Signed-off-by: Steven Rostedt 
> ---
>  samples/trace_events/trace-events-sample.h | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/samples/trace_events/trace-events-sample.h 
> b/samples/trace_events/trace-events-sample.h
> index 8965d1bb8811..125d6402f64f 100644
> --- a/samples/trace_events/trace-events-sample.h
> +++ b/samples/trace_events/trace-events-sample.h
> @@ -168,7 +168,10 @@
>   *
>   *  For __dynamic_array(int, foo, bar) use __get_dynamic_array(foo)
>   *Use __get_dynamic_array_len(foo) to get the length of the array
> - *saved.
> + *saved. Note, __get_dynamic_array_len() returns the total allocated
> + *length of the dynamic array; __print_array() expects the second
> + *parameter to be the number of elements. To get that, the array length
> + *needs to be divided by the element size.
>   *
>   *  For __string(foo, bar) use __get_str(foo)
>   *
> @@ -288,7 +291,7 @@ TRACE_EVENT(foo_bar,
>   *This prints out the array that is defined by __array in a nice format.
>   */
> __print_array(__get_dynamic_array(list),
> - __get_dynamic_array_len(list),
> + __get_dynamic_array_len(list) / sizeof(int),
>   sizeof(int)),
> __get_str(str), __get_bitmask(cpus))
>  );

Reviewed-by: Alex Bennée 
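
The fix quoted above is a unit conversion: the length helper reports bytes, while the printing helper wants a count of elements. That matches the broken output, where one int (4 bytes) was printed as 4 entries and two ints (8 bytes) as 8. A minimal sketch in plain userspace C, assuming a hypothetical helper name elem_count() that is not part of the tracing API:

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of the kernel tracing API: convert the
 * byte length reported for a dynamic array into an element count, which
 * is what a print routine taking (buffer, count, element size) expects.
 */
static size_t elem_count(size_t len_bytes, size_t elem_size)
{
	if (elem_size == 0)
		return 0;	/* avoid dividing by zero on a bogus size */
	return len_bytes / elem_size;
}
```

With this sketch, elem_count(2 * sizeof(int), sizeof(int)) gives 2, which is the count the sample's second event should have printed.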

-- 
Alex Bennée

[PATCH] checkpatch: Always check block comment styles

2015-07-23 Thread Joe Perches

Some of the block comment tests that are used only for networking are
appropriate for all patches.

For example, these styles are not encouraged:

/*
 block comment without introductory *
*/
and
/*
 * block comment with line terminating */

Remove the networking specific test and add comments.
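
For contrast with the two discouraged forms above, here is a small sketch of the multi-line comment layout the new checks accept. The function and its name are purely illustrative and are not taken from the patch:

```c
/*
 * A block comment in the accepted layout: the opening line holds only
 * the comment start, every following line begins with an asterisk, and
 * the terminator sits on a line of its own.
 */
static int answer(void)
{
	return 42;	/* single-line comments like this one are unaffected */
}
```

The trailing single-line comment is there to show the "inline" case that the third test explicitly skips.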

There are some infrequent false positives where code is lazily
commented out using /* and */ rather than using #if 0/#endif blocks
like:
/* case foo:
case bar: */
case baz:

Signed-off-by: Joe Perches 
---
 scripts/checkpatch.pl | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 137bd1c..34ca400 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2751,6 +2751,8 @@ sub process {
}
}
 
+# Block comment styles
+# Networking with an initial /*
if ($realfile =~ m@^(drivers/net/|net/)@ &&
$prevrawline =~ /^\+[ \t]*\/\*[ \t]*$/ &&
$rawline =~ /^\+[ \t]*\*/ &&
@@ -2759,22 +2761,23 @@ sub process {
 "networking block comments don't use an empty /* line, use /* Comment...\n" . $hereprev);
}
 
-   if ($realfile =~ m@^(drivers/net/|net/)@ &&
-   $prevrawline =~ /^\+[ \t]*\/\*/ &&  #starting /*
+# Block comments use * on subsequent lines
+   if ($prevline =~ /$;[ \t]*$/ && #ends in comment
+   $prevrawline =~ /^\+.*?\/\*/ && #starting /*
$prevrawline !~ /\*\/[ \t]*$/ &&#no trailing */
$rawline =~ /^\+/ &&#line is new
$rawline !~ /^\+[ \t]*\*/) {#no leading *
-   WARN("NETWORKING_BLOCK_COMMENT_STYLE",
-"networking block comments start with * on subsequent lines\n" . $hereprev);
+   WARN("BLOCK_COMMENT_STYLE",
+"Block comments use * on subsequent lines\n" . $hereprev);
}
 
-   if ($realfile =~ m@^(drivers/net/|net/)@ &&
-   $rawline !~ m@^\+[ \t]*\*/[ \t]*$@ &&   #trailing */
+# Block comments use */ on trailing lines
+   if ($rawline !~ m@^\+[ \t]*\*/[ \t]*$@ &&   #trailing */
$rawline !~ m@^\+.*/\*.*\*/[ \t]*$@ &&  #inline /*...*/
$rawline !~ m@^\+.*\*{2,}/[ \t]*$@ &&   #trailing **/
$rawline =~ m@^\+[ \t]*.+\*\/[ \t]*$@) {#non blank */
-   WARN("NETWORKING_BLOCK_COMMENT_STYLE",
-"networking block comments put the trailing */ on a separate line\n" . $herecurr);
+   WARN("BLOCK_COMMENT_STYLE",
+"Block comments use a trailing */ on a separate line\n" . $herecurr);
}
 
 # check for missing blank lines after struct/union declarations



[PATCH] checkpatch: Report the right line # when using --emacs and --file

2015-07-23 Thread Joe Perches

commit 34d8815f9512 ("checkpatch: add --showfile to allow input
via pipe to show filenames") broke the --emacs with --file option.

Fix it.

Signed-off-by: Joe Perches 
---
 scripts/checkpatch.pl | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index c0a95aa..137bd1c 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2166,7 +2166,11 @@ sub process {
if ($showfile) {
$prefix = "$realfile:$realline: "
} elsif ($emacs) {
-   $prefix = "$filename:$linenr: ";
+   if ($file) {
+   $prefix = "$filename:$realline: ";
+   } else {
+   $prefix = "$filename:$linenr: ";
+   }
}
 
if ($found_file) {



Re: [PATCH V2] cpufreq: Fix double addition of sysfs links

2015-07-23 Thread Rafael J. Wysocki

On Thu, Jul 23, 2015 at 7:22 PM, Rafael J. Wysocki  wrote:
> Hi Viresh,
>
> On Thu, Jul 23, 2015 at 8:09 AM, Viresh Kumar  wrote:
>> On 22-07-15, 18:42, Rafael J. Wysocki wrote:
>>> > 3. what happens when 'policy' is NULL at the point when the first (few) 
>>> > CPUs
>>> >are added - how do the symlinks get created later if/when policy 
>>> > becomes
>>> >non-NULL (can it?)
>>>
>>> Yes, it can, and we have a design issue here that bothers me a bit.
>>
>> I replied to Russell with a NO here as the first CPU should have
>> created the policy. BUT...
>>
>>> Namely, we need a driver's ->init callback to populate policy->cpus
>>> for us, but this is not the only thing it is doing, so the concern is
>>> that it may not be able to deal with CPUs that aren't online.
>>
>> ... the first few CPUs could have been offline and so we might not
>> have tried to add the policy at all.. Need to fix that for sure.
>
> Wait here.
>
> The current Linus' tree doesn't have that problem as far as I can tell.
>
> Say cpufreq_interface->add_dev() is called for an offline CPU (say
> CPU2).  It points to cpufreq_add_dev(), so we see that the CPU is
> offline and call add_cpu_dev_symlink() for it.  But the first argument
> we pass to that is per_cpu(cpufreq_cpu_data, cpu) and that is NULL,
> because the policy is not there yet.  So we just return 0 (and the CPU
> has no policy and no link).
>
> Now say cpufreq_interface->add_dev() is called for an online CPU (say
> CPU3).  It goes and creates the policy for it and the driver's
> ->init() tells us that CPU2 is related to it.  So
> cpufreq_add_dev_interface() creates the link for CPU2 and we're fine.
>
> Now say CPU3 was offline too when cpufreq_interface->add_dev() was
> called for it.  We don't create a policy or a link for it.  Now say
> CPU2 becomes online.  cpufreq_cpu_callback() calls cpufreq_add_dev()
> for it and we land in the previous case.
>
> The *broken* case is when CPU2 is online to start with and it had
> created the link for CPU3, so when an offline CPU3 is now being added,
> we try to create the link for it again.  That is the case we need to
> address in -rc without introducing new problems.  The $subject patch
> addresses that issue, but it introduces the above problem.  On the
> other hand, my patch at https://patchwork.kernel.org/patch/6839151/
> should take care of this too (unless it is broken in a way I'm not
> seeing now).

It doesn't address the case when the CPU being removed is the policy owner.

Let me prepare a new version of it and we'll start over from there.

Thanks,
Rafael

Re: [PATCH 2/3] cpufreq: Create links for offline CPUs that got added earlier

2015-07-23 Thread Rafael J. Wysocki

Hi Viresh,

On Thu, Jul 23, 2015 at 10:13 AM, Viresh Kumar  wrote:
> If subsys callback ->add_dev() is called for an offline CPU, before its
> policy is allocated, we will miss adding its sysfs symlink.
>
> Fix this by tracking such CPUs in a separate mask.
>
> Fixes: 9b07109f06a1 ("cpufreq: Fix double addition of sysfs links")
> Signed-off-by: Viresh Kumar 

No, we need to go back to square one.

No fixes of fixes of fixes etc please.

Let me prepare a patch for -rc that won't introduce *new* problems and
we can make major changes as 4.3 material, OK?

Thanks,
Rafael

Re: [PATCH v3 1/2] iio: fix drivers that consider 0 as a valid IRQ in client->irq

2015-07-23 Thread Jonathan Cameron

On 23/07/15 15:38, Octavian Purdila wrote:
> On Thu, Jul 23, 2015 at 5:05 PM,  wrote:
>>
>> Octavian Purdila writes:
>>>
>>> On Fri, Jun 5, 2015 at 4:59 PM, Octavian Purdila
>>>  wrote:

>>>> Since patch "i2c / ACPI: Use 0 to indicate that device does not have
>>>> interrupt assigned" [1], 0 is not a valid i2c client irq anymore, so
>>>> change all driver's checks accordingly.
>>>> The same issue occurs when the device is instantiated via device tree
>>>> with no IRQ, or from the i2c sysfs interface, even before the patch
>>>> above.
>>>> [1]
>>>> http://lkml.kernel.org/g/<1430908148-201129-3-git-send-email-mika.westerb...@linux.intel.com>
>>>> Signed-off-by: Octavian Purdila
>>>> Reviewed-by: Mika Westerberg
>>>
>>>
>>> Hi Jonathan,
>>> Does this look OK to you? If so, could you please ACK the patch so that
>>> Linus can pick it up in his for-next branch if/when needed?
>>> Thanks,
>>> Tavi
>>
>> Hi Tavi,
>> This is fine, but is there a particular rush to get it in?
>> Otherwise I'll just take it through the IIO tree.
>> Acked-by: Jonathan Cameron 
> 
> Hi Jonathan,
> 
> Didn't mean to rush things, I haven't seen any activity on this for
> some time and thought it was forgotten. 
A not entirely false assumption.  I'd marked it in my email as to be
applied then it got buried. oops and sorry about that.
> I was also confused with the
> status of Mika's patch, but now that I learned it was merged in 4.2,
> it's clear to me that this patch needs to go through the IIO tree.
> 
Applied to the togreg branch of iio.git - initially pushed out as
testing for autobuilders to play with it.

Thanks,

Jonathan
> Thanks,
> Tavi
> 

[PATCH v2] x86/entry/32: Remove duplicate initialization of tss.ss1

2015-07-23 Thread Andy Lutomirski

It's statically initialized, so we don't need to dynamically
initialize it too.

Reported-by: Brian Gerst 
Signed-off-by: Andy Lutomirski 
---

Changes since v1: Delete the code :)

arch/x86/kernel/cpu/common.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index e2ed2513a51e..e08eee98a5f8 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1005,14 +1005,6 @@ void enable_sep_cpu(void)
if (IS_ENABLED(CONFIG_X86_32) && !boot_cpu_has(X86_FEATURE_SEP))
goto out;
 
-#ifdef CONFIG_X86_32
-   /*
-* We cache MSR_IA32_SYSENTER_CS's value in the TSS's ss1 field --
-* see the big comment in struct x86_hw_tss's definition.
-*/
-   tss->x86_tss.ss1 = __KERNEL_CS;
-#endif
-
wrmsrl_safe(MSR_IA32_SYSENTER_CS, __KERNEL_CS);
wrmsrl_safe(MSR_IA32_SYSENTER_ESP,
(unsigned long)tss +
-- 
2.4.3


Re: Kernel broken on processors without performance counters

2015-07-23 Thread Jason Baron

On 07/23/2015 10:49 AM, Peter Zijlstra wrote:
> On Thu, Jul 23, 2015 at 04:33:08PM +0200, Peter Zijlstra wrote:
>> Lemme finish this and I'll post it.
> Compile tested on x86_64 only..
>
> Please have a look, I think you said I got some of the logic wrong, I've
> not double checked that.
>
> I'll go write comments and double check things.
>
> ---
>  arch/arm/include/asm/jump_label.h |  22 +++-
>  arch/arm64/include/asm/jump_label.h   |  22 +++-
>  arch/mips/include/asm/jump_label.h|  23 +++-
>  arch/powerpc/include/asm/jump_label.h |  23 +++-
>  arch/s390/include/asm/jump_label.h|  23 +++-
>  arch/sparc/include/asm/jump_label.h   |  38 ++---
>  arch/x86/include/asm/jump_label.h |  24 +++-
>  include/linux/jump_label.h| 101 
> +-
>  kernel/jump_label.c   |  89 +++---
>  9 files changed, 297 insertions(+), 68 deletions(-)
>
> diff --git a/arch/arm/include/asm/jump_label.h 
> b/arch/arm/include/asm/jump_label.h
> index 5f337dc5c108..6c9789b33497 100644
> --- a/arch/arm/include/asm/jump_label.h
> +++ b/arch/arm/include/asm/jump_label.h
> @@ -13,14 +13,32 @@
>  #define JUMP_LABEL_NOP   "nop"
>  #endif
>  
> -static __always_inline bool arch_static_branch(struct static_key *key)
> +static __always_inline bool arch_static_branch(struct static_key *key, bool 
> inv)
>  {
> + unsigned long kval = (unsigned long)key + inv;
> +
>   asm_volatile_goto("1:\n\t"
>JUMP_LABEL_NOP "\n\t"
>".pushsection __jump_table,  \"aw\"\n\t"
>".word 1b, %l[l_yes], %c0\n\t"
>".popsection\n\t"
> -  : :  "i" (key) :  : l_yes);
> +  : :  "i" (kval) :  : l_yes);
> +
> + return false;
> +l_yes:
> + return true;
> +}
> +
> +static __always_inline bool arch_static_branch_jump(struct static_key *key, 
> bool inv)
> +{
> + unsigned long kval = (unsigned long)key + inv;
> +
> + asm_volatile_goto("1:\n\t"
> +  "b %l[l_yes]\n\t"
> +  ".pushsection __jump_table,  \"aw\"\n\t"
> +  ".word 1b, %l[l_yes], %c0\n\t"
> +  ".popsection\n\t"
> +  : :  "i" (kval) :  : l_yes);
>  
>   return false;
>  l_yes:
> diff --git a/arch/arm64/include/asm/jump_label.h 
> b/arch/arm64/include/asm/jump_label.h
> index c0e5165c2f76..e5cda5d75c62 100644
> --- a/arch/arm64/include/asm/jump_label.h
> +++ b/arch/arm64/include/asm/jump_label.h
> @@ -26,14 +26,32 @@
>  
>  #define JUMP_LABEL_NOP_SIZE  AARCH64_INSN_SIZE
>  
> -static __always_inline bool arch_static_branch(struct static_key *key)
> +static __always_inline bool arch_static_branch(struct static_key *key, bool 
> inv)
>  {
> + unsigned long kval = (unsigned long)key + inv;
> +
>   asm goto("1: nop\n\t"
>".pushsection __jump_table,  \"aw\"\n\t"
>".align 3\n\t"
>".quad 1b, %l[l_yes], %c0\n\t"
>".popsection\n\t"
> -  :  :  "i"(key) :  : l_yes);
> +  :  :  "i"(kval) :  : l_yes);
> +
> + return false;
> +l_yes:
> + return true;
> +}
> +
> +static __always_inline bool arch_static_branch_jump(struct static_key *key, 
> bool inv)
> +{
> + unsigned long kval = (unsigned long)key + inv;
> +
> + asm goto("1: b %l[l_yes]\n\t"
> +  ".pushsection __jump_table,  \"aw\"\n\t"
> +  ".align 3\n\t"
> +  ".quad 1b, %l[l_yes], %c0\n\t"
> +  ".popsection\n\t"
> +  :  :  "i"(kval) :  : l_yes);
>  
>   return false;
>  l_yes:
> diff --git a/arch/mips/include/asm/jump_label.h 
> b/arch/mips/include/asm/jump_label.h
> index 608aa57799c8..d9fca6f52a93 100644
> --- a/arch/mips/include/asm/jump_label.h
> +++ b/arch/mips/include/asm/jump_label.h
> @@ -26,14 +26,33 @@
>  #define NOP_INSN "nop"
>  #endif
>  
> -static __always_inline bool arch_static_branch(struct static_key *key)
> +static __always_inline bool arch_static_branch(struct static_key *key, bool 
> inv)
>  {
> + unsigned long kval = (unsigned long)key + inv;
> +
>   asm_volatile_goto("1:\t" NOP_INSN "\n\t"
>   "nop\n\t"
>   ".pushsection __jump_table,  \"aw\"\n\t"
>   WORD_INSN " 1b, %l[l_yes], %0\n\t"
>   ".popsection\n\t"
> - : :  "i" (key) : : l_yes);
> + : :  "i" (kval) : : l_yes);
> +
> + return false;
> +l_yes:
> + return true;
> +}
> +
> +static __always_inline bool arch_static_branch_jump(struct static_key *key, 
> bool inv)
> +{
> + unsigned long kval = (unsigned long)key + inv;
> +
> + asm_volatile_goto("1:\tj %l[l_yes]\n\t"
> + "nop\n\t"
> + ".pushsection __jump_table,  \"aw\"\n\t"
> + WORD_INSN " 1b, %l[l_yes], %0\n\t"
> + ".popsection\n\t"
> + : :  "i" (kval) : : l_yes);
> +
>   return false;
>  l_yes:
>   return true;
> diff
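
The "(unsigned long)key + inv" expression in the quoted hunks stores a one-bit flag in the low bit of an aligned pointer. The decode side is not visible in this excerpt, so the sketch below is only an illustration under that assumption; struct demo_key and the demo_* helpers are hypothetical names, not kernel APIs:

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for struct static_key; any type aligned to 2 bytes or more works. */
struct demo_key {
	int enabled;
};

/* Pack the flag into bit 0, mirroring "(unsigned long)key + inv". */
static uintptr_t demo_encode(const struct demo_key *key, bool inv)
{
	return (uintptr_t)key + (inv ? 1u : 0u);
}

/* Assumed decode step: mask bit 0 off to recover the original pointer. */
static struct demo_key *demo_key_of(uintptr_t kval)
{
	return (struct demo_key *)(kval & ~(uintptr_t)1);
}

/* Assumed decode step: bit 0 alone carries the "inverted" flag. */
static bool demo_inv_of(uintptr_t kval)
{
	return (kval & 1) != 0;
}
```

The trick relies on the key object never living at an odd address, which holds for any structure containing an int or a pointer.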

[PATCH v2] x86/asm/msr: Make wrmsrl() a function

2015-07-23 Thread Andy Lutomirski

As of cf991de2f614 ("x86/asm/msr: Make wrmsrl_safe() a function"),
wrmsrl_safe is a function, but wrmsrl is still a macro.  The wrmsrl
macro performs invalid shifts if the value argument is 32 bits.
This makes it unnecessarily awkward to write code that puts an
unsigned long into an MSR.

To make this work, syscall_init needs tweaking to stop passing
a function pointer to wrmsrl.

Signed-off-by: Andy Lutomirski 
---

Changes since v1: Fix one more warning.

arch/x86/include/asm/msr.h  | 6 --
 arch/x86/include/asm/paravirt.h | 6 +-
 arch/x86/kernel/cpu/common.c| 6 +++---
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 131eec2ca137..714c80755dae 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -185,8 +185,10 @@ static inline void wrmsr(unsigned msr, unsigned low, 
unsigned high)
 #define rdmsrl(msr, val)   \
((val) = native_read_msr((msr)))
 
-#define wrmsrl(msr, val)   \
-   native_write_msr((msr), (u32)((u64)(val)), (u32)((u64)(val) >> 32))
+static inline void wrmsrl(unsigned msr, u64 val)
+{
+   native_write_msr(msr, (u32)val, (u32)(val >> 32));
+}
 
 /* wrmsr with exception handling */
 static inline int wrmsr_safe(unsigned msr, unsigned low, unsigned high)
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index c2be0375bcad..10d0596433f8 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -153,7 +153,11 @@ do {   \
val = paravirt_read_msr(msr, &_err);\
 } while (0)
 
-#define wrmsrl(msr, val)   wrmsr(msr, (u32)((u64)(val)), ((u64)(val))>>32)
+static inline void wrmsrl(unsigned msr, u64 val)
+{
+   wrmsr(msr, (u32)val, (u32)(val>>32));
+}
+
 #define wrmsr_safe(msr, a, b)  paravirt_write_msr(msr, a, b)
 
 /* rdmsr with exception handling */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index ffb0020ada5f..e2ed2513a51e 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1184,13 +1184,13 @@ void syscall_init(void)
 * set CS/DS but only a 32bit target. LSTAR sets the 64bit rip.
 */
wrmsrl(MSR_STAR,  ((u64)__USER32_CS)<<48  | ((u64)__KERNEL_CS)<<32);
-   wrmsrl(MSR_LSTAR, entry_SYSCALL_64);
+   wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64);
 
 #ifdef CONFIG_IA32_EMULATION
-   wrmsrl(MSR_CSTAR, entry_SYSCALL_compat);
+   wrmsrl(MSR_CSTAR, (unsigned long)entry_SYSCALL_compat);
enable_sep_cpu();
 #else
-   wrmsrl(MSR_CSTAR, ignore_sysret);
+   wrmsrl(MSR_CSTAR, (unsigned long)ignore_sysret);
wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)GDT_ENTRY_INVALID_SEG);
wrmsrl_safe(MSR_IA32_SYSENTER_ESP, 0ULL);
wrmsrl_safe(MSR_IA32_SYSENTER_EIP, 0ULL);
-- 
2.4.3
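
As a userspace illustration of why a function is friendlier than the macro here: with a 64-bit parameter, a narrower argument is widened before the shift, so the caller never has to cast. This is only a sketch; struct msr_halves and split_msr_value() are hypothetical names and do not appear in the patch:

```c
#include <stdint.h>

/* The two 32-bit halves that a WRMSR-style write takes. */
struct msr_halves {
	uint32_t low;
	uint32_t high;
};

/*
 * The parameter type does the widening: a 32-bit argument is converted
 * to 64 bits at the call, so shifting by 32 is always well defined.
 */
static inline struct msr_halves split_msr_value(uint64_t val)
{
	struct msr_halves h = {
		.low  = (uint32_t)val,
		.high = (uint32_t)(val >> 32),
	};
	return h;
}
```

Passing a function address, as syscall_init() did, needs an explicit integer cast once the callee has a typed parameter, which is what the last hunk of the patch adds.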


Re: [PATCH] x86/entry/32: Initialize ss1 (SYSENTER_CS shadow) even on non-SEP CPUs

2015-07-23 Thread Andy Lutomirski

On Thu, Jul 23, 2015 at 12:03 PM, Brian Gerst  wrote:
> On Thu, Jul 23, 2015 at 2:56 PM, Andy Lutomirski  wrote:
>> native_load_sp0 relies on this.  I'm not sure why we haven't seen
>> reports of crashes.  Maybe no one tests new kernels on non-SEP CPUs.
>
> It's already statically initialized in cpu_tss.

Indeed.  I'll just delete it, then.

>
> --
> Brian Gerst



-- 
Andy Lutomirski
AMA Capital Management, LLC

Re: [PATCH 0/5] Freq/CPU%/CORE_BUSY% support

2015-07-23 Thread Stephane Eranian

On Thu, Jul 23, 2015 at 4:46 AM, Kan Liang  wrote:
> This patch set supports per-sample freq/CPU%/CORE_BUSY% print in perf
> report -D and --stdio.
> For printing these information, the perf.data file must have been obtained
> by group read and using special events cycles, ref-cycles, msr/tsc/,
> msr/aperf/ or msr/mperf/.
>
>  - Freq (MHz): The frequency during the sample interval. Needs cycles
>ref-cycles event.
>  - CPU%: CPU utilization during the sample interval. Needs ref-cycles and
>msr/tsc/ events.
>  - CORE_BUSY%: actual percent performance (APERF/MPERF%) during the
>sample interval. Needs msr/aperf/ and msr/mperf/ events.
>
> For printing CPU% and CORE_BUSY%, please also apply the kernel patch.
> http://marc.info/?l=linux-kernel&m=143747254926369&w=2
>
> Here is an example:
>
> $ perf record -e
> '{cycles,ref-cycles,msr/tsc/,msr/mperf/,msr/aperf/}:S' ~/tchain_edit
>
> $ perf report --stdio --group --show-freq-perf
>
Based on what I see in the patch, you are assuming that you ALWAYS run
perf report on the same system where you ran perf record. This is a
problem on servers, so you need to have all the information you need in
the perf.data file. You cannot fish the information from sysfs on the
host where perf report runs.

>  Overhead   FREQ MHz   CPU%  CORE_BUSY%
> Command  Shared Object Symbol
>    .  .  ..
> ...    ..
>
> 99.54%  99.54%  99.53%  99.53%  99.53%   2301 96 99
> tchain_edit  tchain_edit   [.] f3
>  0.20%   0.20%   0.20%   0.20%   0.20%   2301 98 99
> tchain_edit  tchain_edit   [.] f2
>  0.05%   0.05%   0.05%   0.05%   0.05%   2300 98 99
> tchain_edit  [kernel.vmlinux]  [k] read_tsc
>
> Kan Liang (5):
>   perf,tools: introduce get_cpu_max_freq
>   perf,tools: Dump per-sample freq/CPU%/CORE_BUSY% in report -D
>   perf,tools: save misc sample read value in struct perf_sample
>   perf,tools: caculate and save freq/CPU%/CORE_BUSY% in he_stat
>   perf,tools: Show freq/CPU%/CORE_BUSY% in perf report --stdio
>
>  tools/perf/Documentation/perf-report.txt | 12 ++
>  tools/perf/builtin-annotate.c|  2 +-
>  tools/perf/builtin-diff.c|  2 +-
>  tools/perf/builtin-report.c  | 24 +++
>  tools/perf/perf.h|  1 +
>  tools/perf/tests/hists_link.c|  4 +-
>  tools/perf/ui/hist.c | 71 
> +---
>  tools/perf/util/cpumap.c | 32 ++
>  tools/perf/util/cpumap.h |  2 +
>  tools/perf/util/event.h  | 11 +
>  tools/perf/util/hist.c   | 51 ---
>  tools/perf/util/hist.h   |  5 +++
>  tools/perf/util/pmu.h|  2 +
>  tools/perf/util/session.c| 50 +++---
>  tools/perf/util/session.h| 28 +
>  tools/perf/util/sort.c   |  3 ++
>  tools/perf/util/sort.h   |  3 ++
>  tools/perf/util/symbol.h |  9 +++-
>  tools/perf/util/util.c   |  2 +
>  19 files changed, 293 insertions(+), 21 deletions(-)
>
> --
> 1.8.3.1
>

Re: [Patch V4 1/3] usb: Add Xen pvUSB protocol description

2015-07-23 Thread Greg KH

On Thu, Jul 23, 2015 at 08:46:17AM +0200, Juergen Gross wrote:
> On 07/23/2015 06:36 AM, Greg KH wrote:
> >On Thu, Jul 23, 2015 at 06:04:39AM +0200, Juergen Gross wrote:
> >>On 07/23/2015 01:46 AM, Greg KH wrote:
> >>>On Tue, Jun 23, 2015 at 08:53:23AM +0200, Juergen Gross wrote:
> >>>>Add the definition of pvUSB protocol used between the pvUSB frontend in
> >>>>a Xen domU and the pvUSB backend in a Xen driver domain (usually Dom0).
> >>>>
> >>>>This header was originally provided by Fujitsu for Xen based on Linux
> >>>>2.6.18.
> >>>>
> >>>>Changes are:
> >>>>- adapt to Linux style guide
> >>>>
> >>>>Signed-off-by: Juergen Gross
> >>>>---
> >>>>  include/xen/interface/io/usbif.h | 252 +++
> >>>
> >>>Why is this a different interface than the existing ones we have today
> >>>(i.e. usbip?)  Where is it documented?  Do the Xen developers /
> >>
> >>The interface definition is living in the Xen git repository for several
> >>years now:
> >>
> >>git://xenbits.xen.org/xen.git -> xen/include/public/io/usbif.h
> >
> >That's a header file, not a document describing the api here.
> 
> I suppose you want to tell me I should add something like:
> 
> Documentation/DocBook/usb/API-struct-urb.html

Somewhere that people can refer to that describes this public-facing API
that "must not ever be broken or changed".  If you want to put it in a
documentation file, or a .h file, I don't care.

> >>It is used e.g. in SUSE's xen kernel since 2.6.18.
> >
> >I am very aware of the amount of Xen crap in SuSE's kernel, don't use
> >that as an excuse for me to merge it to mainline :)
> 
> :-)
> 
> Wasn't meant as an excuse, just a hint why the interface can't be the
> same as for usbip. We have to ensure compatibility with those kernels

This shouldn't be a kernel/kernel compatibility issue, as the api talks
between Xen and the OS, not between different OSs, right?

> and possibly other operating systems (BSD?, Windows?) which already
> might be using pvUSB with a Dom0 based on the SUSE xen kernel.

Are there other operating system drivers today that use this API?  Is
this an API in the Xen core today that we have to support?

Some more background / descriptions would be nice to have.

thanks,

greg k-h

[PATCH 2/5] perf,tools: Dump per-sample freq/CPU%/CORE_BUSY% in report -D

2015-07-23 Thread kan . liang

From: Kan Liang 

The group read results from cycles/ref-cycles/TSC/ASTATE/MSTATE event
can be used to calculate the frequency, CPU Utilization and percent
performance during each sampling period.
This patch shows them in report -D.

Here is an example:

$ perf record -e
'{cycles,ref-cycles,msr/tsc/,msr/mperf/,msr/aperf/}:S' ~/tchain_edit

Here is one sample from perf report -D

1972044565107 0x3498 [0x88]: PERF_RECORD_SAMPLE(IP, 0x2): 10608/10608:
0x4005fd period: 564686 addr: 0
... sample_read:
 group nr 5
. id 0012, value 02143901
. id 0052, value 02143896
. id 0094, value 021e443d
. id 00d4, value 021db984
. id 0114, value 021db964
. Freq 2301 MHz
. CPU% 98%
. CORE_BUSY% 99%
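
The three derived values in the dump can be reproduced from the five counters with integer arithmetic. The sketch below is a standalone illustration, not the patch's code. It assumes the counters appear in the order given on the perf record command line (cycles, ref-cycles, tsc, mperf, aperf) and that the maximum frequency was 2301 MHz, which the dump does not state; the helper names are hypothetical:

```c
#include <stdint.h>

/* Each helper returns 0 when its denominator is 0 (event not counted). */

/* Freq (MHz) = cycles * max frequency / ref-cycles */
static uint64_t sample_freq_mhz(uint64_t cycles, uint64_t ref_cycles,
				uint64_t max_freq_mhz)
{
	return ref_cycles ? (cycles * max_freq_mhz) / ref_cycles : 0;
}

/* CPU% = 100 * ref-cycles / TSC */
static uint64_t sample_cpu_pct(uint64_t ref_cycles, uint64_t tsc)
{
	return tsc ? (100 * ref_cycles) / tsc : 0;
}

/* CORE_BUSY% = 100 * APERF / MPERF */
static uint64_t sample_core_busy_pct(uint64_t aperf, uint64_t mperf)
{
	return mperf ? (100 * aperf) / mperf : 0;
}
```

Feeding in the values from the dump (0x02143901, 0x02143896, 0x021e443d, 0x021db984, 0x021db964) under those assumptions yields 2301 MHz, 98% and 99%, in line with the printed lines.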

Signed-off-by: Kan Liang 
---
 tools/perf/builtin-report.c |  3 +++
 tools/perf/util/pmu.h   |  2 ++
 tools/perf/util/session.c   | 34 +-
 tools/perf/util/session.h   | 38 ++
 4 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 62cce98..0cd0573 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -38,6 +38,8 @@
 
 #include "util/auxtrace.h"
 
+#include "util/pmu.h"
+
 #include 
 #include 
 
@@ -818,6 +820,7 @@ repeat:
symbol_conf.cumulate_callchain = false;
}
 
+   msr_pmu = perf_pmu__find("msr");
cpu_max_freq = get_cpu_max_freq() / 1000;
 
if (setup_sorting() < 0) {
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 7b9c8cf..e3e67aa 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -27,6 +27,8 @@ struct perf_pmu {
struct list_head list;/* ELEM */
 };
 
+struct perf_pmu *msr_pmu;
+
 struct perf_pmu_info {
const char *unit;
double scale;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index ed9dc25..6dd20b5 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -17,6 +17,7 @@
 #include "asm/bug.h"
 #include "auxtrace.h"
 #include "thread-stack.h"
+#include "pmu.h"
 
 static int perf_session__deliver_event(struct perf_session *session,
   union perf_event *event,
@@ -851,8 +852,14 @@ static void perf_evlist__print_tstamp(struct perf_evlist 
*evlist,
printf("%" PRIu64 " ", sample->time);
 }
 
-static void sample_read__printf(struct perf_sample *sample, u64 read_format)
+static void sample_read__printf(struct perf_evlist *evlist,
+   struct perf_sample *sample,
+   u64 read_format)
 {
+   struct perf_evsel *evsel;
+   struct perf_sample_id *sid;
+   u64 data[FREQ_PERF_MAX] = { 0 };
+
printf("... sample_read:\n");
 
if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
@@ -875,10 +882,26 @@ static void sample_read__printf(struct perf_sample 
*sample, u64 read_format)
printf(". id %016" PRIx64
   ", value %016" PRIx64 "\n",
   value->id, value->value);
+
+   sid = perf_evlist__id2sid(evlist, value->id);
+   evsel = sid->evsel;
+   if (evsel != NULL)
+   SET_FREQ_PERF_VALUE(evsel, data,
+   value->value);
}
} else
printf(". id %016" PRIx64 ", value %016" PRIx64 "\n",
sample->read.one.id, sample->read.one.value);
+
+   if (HAS_FREQ(data))
+   printf(". Freq %lu MHz\n",
+  (data[FREQ_PERF_CYCLES] * cpu_max_freq) / 
data[FREQ_PERF_REF_CYCLES]);
+   if (HAS_CPU_U(data))
+   printf(". CPU%% %lu%%\n",
+  (100 * data[FREQ_PERF_REF_CYCLES]) / 
data[FREQ_PERF_TSC]);
+   if (HAS_CORE_BUSY(data))
+   printf(". CORE_BUSY%% %lu%%\n",
+  (100 * data[FREQ_PERF_APERF]) / data[FREQ_PERF_MPERF]);
 }
 
 static void dump_event(struct perf_evlist *evlist, union perf_event *event,
@@ -899,8 +922,8 @@ static void dump_event(struct perf_evlist *evlist, union 
perf_event *event,
   event->header.size, perf_event__name(event->header.type));
 }
 
-static void dump_sample(struct perf_evsel *evsel, union perf_event *event,
-   struct perf_sample *sample)
+static void dump_sample(struct perf_evlist *evlist, struct perf_evsel *evsel,
+   union perf_event *event, struct perf_sample *sample)
 {
u64 sample_type;
 
@@ -938,7 +961,7 @@ static void dump_sample(struct perf_evsel *evsel, union 
perf_event *event,
printf("... transaction: %" PRIx64 "\n", sample->transaction);
 
if (sample_type & PERF_SAMPLE_READ)
-

[PATCH 4/5] perf,tools: calculate and save freq/CPU%/CORE_BUSY% in he_stat

2015-07-23 Thread kan . liang

From: Kan Liang 

Introduce a new hist_iter ops (hist_iter_freq_perf) to calculate
freq/CPU%/CORE_BUSY% when processing samples, and save the results in
hist_entry.

Signed-off-by: Kan Liang 
---
 tools/perf/builtin-annotate.c |  2 +-
 tools/perf/builtin-diff.c |  2 +-
 tools/perf/tests/hists_link.c |  4 ++--
 tools/perf/util/hist.c| 51 ++-
 tools/perf/util/hist.h|  2 ++
 tools/perf/util/sort.h|  3 +++
 tools/perf/util/symbol.h  |  6 +
 7 files changed, 60 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 2c1bec3..06e2f87 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -71,7 +71,7 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
return 0;
}
 
-   he = __hists__add_entry(hists, al, NULL, NULL, NULL, 1, 1, 0, true);
+   he = __hists__add_entry(hists, al, NULL, NULL, NULL, 1, 1, 0, NULL, 
true);
if (he == NULL)
return -ENOMEM;
 
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index daaa7dc..2fffcc4 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -315,7 +315,7 @@ static int hists__add_entry(struct hists *hists,
u64 weight, u64 transaction)
 {
if (__hists__add_entry(hists, al, NULL, NULL, NULL, period, weight,
-  transaction, true) != NULL)
+  transaction, NULL, true) != NULL)
return 0;
return -ENOMEM;
 }
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 8c102b0..5d9f9e3 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -90,7 +90,7 @@ static int add_hist_entries(struct perf_evlist *evlist, 
struct machine *machine)
goto out;
 
he = __hists__add_entry(hists, &al, NULL,
-   NULL, NULL, 1, 1, 0, true);
+   NULL, NULL, 1, 1, 0, NULL, true);
if (he == NULL) {
addr_location__put(&al);
goto out;
@@ -116,7 +116,7 @@ static int add_hist_entries(struct perf_evlist *evlist, 
struct machine *machine)
goto out;
 
he = __hists__add_entry(hists, &al, NULL,
-   NULL, NULL, 1, 1, 0, true);
+   NULL, NULL, 1, 1, 0, NULL, true);
if (he == NULL) {
addr_location__put(&al);
goto out;
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 6f28d53..26b8eea 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -436,7 +436,9 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
  struct symbol *sym_parent,
  struct branch_info *bi,
  struct mem_info *mi,
- u64 period, u64 weight, u64 transaction,
+ u64 period, u64 weight,
+ u64 transaction,
+ struct freq_perf_info *info,
  bool sample_self)
 {
struct hist_entry entry = {
@@ -454,6 +456,9 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
.nr_events = 1,
.period = period,
.weight = weight,
+   .freq = (info != NULL) ? info->freq : 0,
+   .cpu_u = (info != NULL) ? info->cpu_u : 0,
+   .core_busy = (info != NULL) ? info->core_busy : 0,
},
.parent = sym_parent,
.filtered = symbol__parent_filter(sym_parent) | al->filtered,
@@ -481,6 +486,32 @@ iter_add_next_nop_entry(struct hist_entry_iter *iter 
__maybe_unused,
 }
 
 static int
+iter_add_single_freq_perf_entry(struct hist_entry_iter *iter, struct 
addr_location *al)
+{
+   struct perf_evsel *evsel = iter->evsel;
+   struct perf_sample *sample = iter->sample;
+   struct hist_entry *he;
+   struct freq_perf_info info = {0};
+   u64 *data = sample->freq_perf_data;
+
+   if (data[FREQ_PERF_REF_CYCLES] > 0)
+   info.freq = (data[FREQ_PERF_CYCLES] * cpu_max_freq) / 
data[FREQ_PERF_REF_CYCLES];
+   if (data[FREQ_PERF_TSC] > 0)
+   info.cpu_u = (100 * data[FREQ_PERF_REF_CYCLES]) / 
data[FREQ_PERF_TSC];
+   if (data[FREQ_PERF_MPERF] > 0)
+   info.core_busy = (100 * data[FREQ_PERF_APERF]) / 
data[FREQ_PERF_MPERF];
+
+   he = __hists__add_entry(evsel__hists(evsel),

[PATCH 1/5] perf,tools: introduce get_cpu_max_freq

2015-07-23 Thread kan . liang

From: Kan Liang 

Get the CPU max frequency from the first online CPU with
get_cpu_max_freq(), and save the MHz value in cpu_max_freq.

Signed-off-by: Kan Liang 
---
 tools/perf/builtin-report.c |  2 ++
 tools/perf/util/cpumap.c| 32 
 tools/perf/util/cpumap.h|  2 ++
 3 files changed, 36 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 95a4771..62cce98 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -818,6 +818,8 @@ repeat:
symbol_conf.cumulate_callchain = false;
}
 
+   cpu_max_freq = get_cpu_max_freq() / 1000;
+
if (setup_sorting() < 0) {
if (sort_order)
parse_options_usage(report_usage, options, "s", 1);
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 3667e21..548ef13 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -499,3 +499,35 @@ int cpu__setup_cpunode_map(void)
closedir(dir1);
return 0;
 }
+
+unsigned int get_cpu_max_freq(void)
+{
+   const char *mnt;
+   char path[PATH_MAX], tmp;
+   FILE *fp;
+   unsigned int freq;
+   int cpu = 0;
+   int ret;
+
+   mnt = sysfs__mountpoint();
+   if (!mnt)
+   return 0;
+
+   snprintf(path, PATH_MAX, "%s/devices/system/cpu/online", mnt);
+   fp = fopen(path, "r");
+   if (fp) {
+   ret = fscanf(fp, "%u%c", &cpu, &tmp);
+   fclose(fp);
+   if (ret < 1)
+   return 0;
+   }
+
+   snprintf(path, PATH_MAX, 
"%s/devices/system/cpu/cpu%d/cpufreq/cpuinfo_max_freq", mnt, cpu);
+   fp = fopen(path, "r");
+   if (!fp)
+   return 0;
+   ret = fscanf(fp, "%u", &freq);
+   fclose(fp);
+
+   return (ret == 1) ? freq : 0;
+}
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 0af9cec..70ac686 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -56,8 +56,10 @@ static inline bool cpu_map__empty(const struct cpu_map *map)
 int max_cpu_num;
 int max_node_num;
 int *cpunode_map;
+unsigned int cpu_max_freq;
 
 int cpu__setup_cpunode_map(void);
+unsigned int get_cpu_max_freq(void);
 
 static inline int cpu__max_node(void)
 {
-- 
1.8.3.1
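
As a rough illustration of the two steps in get_cpu_max_freq() (find the
first online CPU, then scale its maximum frequency), here is a minimal
standalone sketch. It parses strings instead of opening sysfs so it can
be run anywhere. The helper names first_online_cpu() and khz_to_mhz()
are hypothetical and not part of the patch. It assumes the "online" file
holds a CPU list such as "0-7" or "2,4-6", and relies on
cpuinfo_max_freq being reported in kHz, which is why builtin-report.c
divides by 1000.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: return the first CPU number in a sysfs-style
 * CPU list ("0-7", "2,4-6", "3"), or -1 if the string does not start
 * with a number.
 */
static int first_online_cpu(const char *cpu_list)
{
	unsigned int cpu;

	assert(cpu_list != NULL);

	/* Only the leading number matters; the rest of the list is ignored. */
	if (sscanf(cpu_list, "%u", &cpu) != 1)
		return -1;

	return (int)cpu;
}

/* Hypothetical helper: cpuinfo_max_freq is in kHz, the report wants MHz. */
static unsigned int khz_to_mhz(unsigned int khz)
{
	return khz / 1000;
}
```

Unlike the patch, this sketch reports a parse failure as -1 rather than
falling back to CPU 0; which behaviour is preferable is a design choice
left open here.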


[PATCH 5/5] perf,tools: Show freq/CPU%/CORE_BUSY% in perf report --stdio

2015-07-23 Thread kan . liang

From: Kan Liang 

Show frequency, CPU utilization and percent performance for each symbol
in perf report with --stdio --show-freq-perf.

In a sampling group, only the group leader samples, so only the group
leader's freq needs to be printed with --group.

Here is an example.

$ perf report --stdio --group --show-freq-perf

                                Overhead  FREQ MHz  CPU%  CORE_BUSY%  Command      Shared Object     Symbol
  ......................................  ........  ....  ..........  ...........  ................  ............

  99.54%  99.54%  99.53%  99.53%  99.53%      2301    96          99  tchain_edit  tchain_edit       [.] f3
   0.20%   0.20%   0.20%   0.20%   0.20%      2301    98          99  tchain_edit  tchain_edit       [.] f2
   0.05%   0.05%   0.05%   0.05%   0.05%      2300    98          99  tchain_edit  [kernel.vmlinux]  [k] read_tsc

Signed-off-by: Kan Liang 
---
 tools/perf/Documentation/perf-report.txt | 12 ++
 tools/perf/builtin-report.c  | 19 +
 tools/perf/perf.h|  1 +
 tools/perf/ui/hist.c | 71 +---
 tools/perf/util/hist.h   |  3 ++
 tools/perf/util/session.c|  2 +-
 tools/perf/util/sort.c   |  3 ++
 tools/perf/util/symbol.h |  3 +-
 tools/perf/util/util.c   |  2 +
 9 files changed, 109 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt 
b/tools/perf/Documentation/perf-report.txt
index c33b69f..faa8825 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -303,6 +303,18 @@ OPTIONS
special event -e cpu/mem-loads/ or -e cpu/mem-stores/. See
'perf mem' for simpler access.
 
+--show-freq-perf::
+   Show CPU frequency and performance results from sample reads.
+   To generate the frequency and performance output, the perf.data file
+   must have been obtained by group read, using the special events cycles,
+   ref-cycles, msr/tsc/, msr/aperf/ or msr/mperf/.
+   Freq MHz: The frequency during the sample interval. Needs the cycles
+ and ref-cycles events.
+   CPU%: CPU utilization during the sample interval. Needs the ref-cycles
+ and msr/tsc/ events.
+   CORE_BUSY%: actual percent performance (APERF/MPERF%) during the
+   sample interval. Needs the msr/aperf/ and msr/mperf/ events.
+
 --percent-limit::
Do not show entries which have an overhead under that percent.
(Default: 0).
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 0cd0573..961b848 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -166,6 +166,8 @@ static int process_sample_event(struct perf_tool *tool,
iter.ops = &hist_iter_mem;
else if (symbol_conf.cumulate_callchain)
iter.ops = &hist_iter_cumulative;
+   else if (symbol_conf.show_freq_perf)
+   iter.ops = &hist_iter_freq_perf;
else
iter.ops = &hist_iter_normal;
 
@@ -723,6 +725,8 @@ int cmd_report(int argc, const char **argv, const char 
*prefix __maybe_unused)
OPT_BOOLEAN(0, "demangle-kernel", &symbol_conf.demangle_kernel,
"Enable kernel symbol demangling"),
OPT_BOOLEAN(0, "mem-mode", &report.mem_mode, "mem access profile"),
+   OPT_BOOLEAN(0, "show-freq-perf", &symbol_conf.show_freq_perf,
+   "show CPU frequency and performance info"),
OPT_CALLBACK(0, "percent-limit", &report, "percent",
 "Don't show entries under that percent", parse_percent_limit),
OPT_CALLBACK(0, "percentage", NULL, "relative|absolute",
@@ -735,7 +739,9 @@ int cmd_report(int argc, const char **argv, const char 
*prefix __maybe_unused)
struct perf_data_file file = {
.mode  = PERF_DATA_MODE_READ,
};
+   struct perf_evsel *pos;
int ret = hists__init();
+   bool freq_perf_info[FREQ_PERF_MAX] = {0};
 
if (ret < 0)
return ret;
@@ -823,6 +829,19 @@ repeat:
msr_pmu = perf_pmu__find("msr");
cpu_max_freq = get_cpu_max_freq() / 1000;
 
+   if (symbol_conf.show_freq_perf) {
+   perf_freq = perf_cpu_u = perf_core_busy = false;
+   evlist__for_each(session->evlist, pos) {
+   SET_FREQ_PERF_VALUE(pos, freq_perf_info, true);
+   }
+   if (HAS_FREQ(freq_perf_info))
+   perf_freq = true;
+   if (HAS_CPU_U(freq_perf_info))
+   perf_cpu_u = true;
+   if (HAS_CORE_BUSY(freq_perf_info))
+   perf_core_busy = true;
+   }
+
if (setup_sorting() < 0) {
if (sort_order)
parse_options_usage(report_usage, options, "s", 1);
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 937b16a..87daab8 100644
---

[PATCH 3/5] perf,tools: save misc sample read value in struct perf_sample

2015-07-23 Thread kan . liang

From: Kan Liang 

Save the group read results from cycles/ref-cycles/TSC/APERF/MPERF in
struct perf_sample. The sample processing functions that follow can then
easily use them to calculate freq/CPU%/CORE_BUSY% and add them to hists.

Signed-off-by: Kan Liang 
---
 tools/perf/util/event.h   | 11 +++
 tools/perf/util/session.c | 16 
 tools/perf/util/session.h | 10 --
 3 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index c53f363..f7aabe3 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -176,6 +176,16 @@ enum {
PERF_IP_FLAG_TRACE_BEGIN|\
PERF_IP_FLAG_TRACE_END)
 
+enum perf_freq_perf_index {
+   FREQ_PERF_TSC   = 0,
+   FREQ_PERF_APERF = 1,
+   FREQ_PERF_MPERF = 2,
+   FREQ_PERF_CYCLES= 3,
+   FREQ_PERF_REF_CYCLES= 4,
+
+   FREQ_PERF_MAX
+};
+
 struct perf_sample {
u64 ip;
u32 pid, tid;
@@ -191,6 +201,7 @@ struct perf_sample {
u64 data_src;
u32 flags;
u16 insn_len;
+   u64 freq_perf_data[FREQ_PERF_MAX];
void *raw_data;
struct ip_callchain *callchain;
struct branch_stack *branch_stack;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 6dd20b5..939dfed 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -999,6 +999,8 @@ static int deliver_sample_value(struct perf_evlist *evlist,
struct machine *machine)
 {
struct perf_sample_id *sid = perf_evlist__id2sid(evlist, v->id);
+   struct perf_evsel *evsel;
+   u64 nr = 0;
 
if (sid) {
sample->id = v->id;
@@ -1011,6 +1013,20 @@ static int deliver_sample_value(struct perf_evlist 
*evlist,
return 0;
}
 
+   if (perf_evsel__is_group_leader(sid->evsel)) {
+   evsel = sid->evsel;
+   SET_FREQ_PERF_VALUE(evsel, sample->freq_perf_data,
+   sample->read.group.values[nr].value);
+   evlist__for_each_continue(evlist, evsel) {
+   if ((evsel->leader != sid->evsel) ||
+   (++nr >= sample->read.group.nr))
+   break;
+
+   SET_FREQ_PERF_VALUE(evsel, sample->freq_perf_data,
+   
sample->read.group.values[nr].value);
+   }
+   }
+
return tool->sample(tool, event, sample, sid->evsel, machine);
 }
 
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index df2094d..8c3cae8 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -46,16 +46,6 @@ struct perf_session {
 #define PERF_MSR_APERF 1
 #define PERF_MSR_MPERF 2
 
-enum perf_freq_perf_index {
-   FREQ_PERF_TSC   = 0,
-   FREQ_PERF_APERF = 1,
-   FREQ_PERF_MPERF = 2,
-   FREQ_PERF_CYCLES= 3,
-   FREQ_PERF_REF_CYCLES= 4,
-
-   FREQ_PERF_MAX
-};
-
 #define SET_FREQ_PERF_VALUE(event, array, value)   \
 {  \
if (event->attr.type == msr_pmu->type) {\
-- 
1.8.3.1
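
The loop added to deliver_sample_value() pairs the i-th group member
with the i-th group read value and stops when either runs out. A minimal
standalone sketch of that pairing is below. The SET_FREQ_PERF_VALUE()
macro is only partly visible in this mail, so the slot lookup is
replaced by a hypothetical struct group_member that already carries its
slot; fill_freq_perf_slots() and the SLOT_* names are likewise made up
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Slots mirror enum perf_freq_perf_index from this patch. */
enum freq_perf_slot {
	SLOT_TSC = 0,
	SLOT_APERF,
	SLOT_MPERF,
	SLOT_CYCLES,
	SLOT_REF_CYCLES,
	SLOT_MAX
};

/* Hypothetical stand-in for a group member; slot < 0 means "not tracked". */
struct group_member {
	int slot;
};

/*
 * Copy each group read value into the slot of the member at the same
 * position. Members beyond nr_values, and members with no slot, are
 * skipped, so their entries in out[] keep whatever the caller put there.
 */
static void fill_freq_perf_slots(const struct group_member *members,
				 size_t nr_members,
				 const uint64_t *values, size_t nr_values,
				 uint64_t out[SLOT_MAX])
{
	size_t i;

	assert(members != NULL && values != NULL && out != NULL);

	for (i = 0; i < nr_members && i < nr_values; i++) {
		int slot = members[i].slot;

		if (slot >= 0 && slot < SLOT_MAX)
			out[slot] = values[i];
	}
}
```

Callers would normally zero out[] first, so that a missing event shows
up as 0 and the later divisions can be skipped for it.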


[PATCH 0/5] Freq/CPU%/CORE_BUSY% support

2015-07-23 Thread kan . liang

From: Kan Liang 

(Re-send. Missed "From: Kan Liang " in last mail)
This patch set adds per-sample freq/CPU%/CORE_BUSY% output to perf
report -D and --stdio.
To print this information, the perf.data file must have been recorded
with a group read that includes the special events cycles, ref-cycles,
msr/tsc/, msr/aperf/ or msr/mperf/.

 - Freq (MHz): The frequency during the sample interval. Needs the cycles
   and ref-cycles events.
 - CPU%: CPU utilization during the sample interval. Needs the ref-cycles
   and msr/tsc/ events.
 - CORE_BUSY%: actual percent performance (APERF/MPERF%) during the
   sample interval. Needs the msr/aperf/ and msr/mperf/ events.

To print CPU% and CORE_BUSY%, please also apply the kernel patch:
http://marc.info/?l=linux-kernel&m=143747254926369&w=2

Here is an example:

$ perf record -e
'{cycles,ref-cycles,msr/tsc/,msr/mperf/,msr/aperf/}:S' ~/tchain_edit

$ perf report --stdio --group --show-freq-perf

                                Overhead  FREQ MHz  CPU%  CORE_BUSY%  Command      Shared Object     Symbol
  ......................................  ........  ....  ..........  ...........  ................  ............

  99.54%  99.54%  99.53%  99.53%  99.53%      2301    96          99  tchain_edit  tchain_edit       [.] f3
   0.20%   0.20%   0.20%   0.20%   0.20%      2301    98          99  tchain_edit  tchain_edit       [.] f2
   0.05%   0.05%   0.05%   0.05%   0.05%      2300    98          99  tchain_edit  [kernel.vmlinux]  [k] read_tsc

Kan Liang (5):
  perf,tools: introduce get_cpu_max_freq
  perf,tools: Dump per-sample freq/CPU%/CORE_BUSY% in report -D
  perf,tools: save misc sample read value in struct perf_sample
  perf,tools: calculate and save freq/CPU%/CORE_BUSY% in he_stat
  perf,tools: Show freq/CPU%/CORE_BUSY% in perf report --stdio

 tools/perf/Documentation/perf-report.txt | 12 ++
 tools/perf/builtin-annotate.c|  2 +-
 tools/perf/builtin-diff.c|  2 +-
 tools/perf/builtin-report.c  | 24 +++
 tools/perf/perf.h|  1 +
 tools/perf/tests/hists_link.c|  4 +-
 tools/perf/ui/hist.c | 71 +---
 tools/perf/util/cpumap.c | 32 ++
 tools/perf/util/cpumap.h |  2 +
 tools/perf/util/event.h  | 11 +
 tools/perf/util/hist.c   | 51 ---
 tools/perf/util/hist.h   |  5 +++
 tools/perf/util/pmu.h|  2 +
 tools/perf/util/session.c| 50 +++---
 tools/perf/util/session.h| 28 +
 tools/perf/util/sort.c   |  3 ++
 tools/perf/util/sort.h   |  3 ++
 tools/perf/util/symbol.h |  9 +++-
 tools/perf/util/util.c   |  2 +
 19 files changed, 293 insertions(+), 21 deletions(-)

-- 
1.8.3.1


[PATCH 2/5] perf,tools: Dump per-sample freq/CPU%/CORE_BUSY% in report -D

2015-07-23 Thread Kan Liang

The group read results from the cycles/ref-cycles/TSC/APERF/MPERF events
can be used to calculate the frequency, CPU utilization and percent
performance during each sampling period.
This patch shows them in report -D.

Here is an example:

$ perf record -e
'{cycles,ref-cycles,msr/tsc/,msr/mperf/,msr/aperf/}:S' ~/tchain_edit

Here is one sample from perf report -D

1972044565107 0x3498 [0x88]: PERF_RECORD_SAMPLE(IP, 0x2): 10608/10608:
0x4005fd period: 564686 addr: 0
... sample_read:
 group nr 5
. id 0012, value 02143901
. id 0052, value 02143896
. id 0094, value 021e443d
. id 00d4, value 021db984
. id 0114, value 021db964
. Freq 2301 MHz
. CPU% 98%
. CORE_BUSY% 99%
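
The three derived lines can be checked by hand from the five raw values.
The sketch below redoes the arithmetic with the same integer formulas
the patch uses in sample_read__printf(). Two things are assumed rather
than stated in this mail: that the values appear in the order the events
were given to perf record (cycles, ref-cycles, msr/tsc/, msr/mperf/,
msr/aperf/), and that cpu_max_freq was 2301 MHz on the test machine. The
names compute_freq_perf() and dump_example() are hypothetical.

```c
#include <stdint.h>

struct freq_perf_result {
	uint64_t freq_mhz;
	uint64_t cpu_pct;
	uint64_t core_busy_pct;
};

/* Integer formulas as in the patch; a zero denominator leaves 0. */
static struct freq_perf_result
compute_freq_perf(uint64_t cycles, uint64_t ref_cycles, uint64_t tsc,
		  uint64_t aperf, uint64_t mperf, uint64_t max_mhz)
{
	struct freq_perf_result r = { 0, 0, 0 };

	if (ref_cycles)
		r.freq_mhz = cycles * max_mhz / ref_cycles;
	if (tsc)
		r.cpu_pct = 100 * ref_cycles / tsc;
	if (mperf)
		r.core_busy_pct = 100 * aperf / mperf;
	return r;
}

/* Raw values copied from the sample above, in the assumed event order. */
static struct freq_perf_result dump_example(void)
{
	return compute_freq_perf(0x02143901,	/* cycles */
				 0x02143896,	/* ref-cycles */
				 0x021e443d,	/* msr/tsc/ */
				 0x021db964,	/* msr/aperf/ */
				 0x021db984,	/* msr/mperf/ */
				 2301);		/* assumed cpu_max_freq, MHz */
}
```

Under those assumptions the result matches the dump: 2301 MHz, 98% and
99%. The divisions truncate, which is why an APERF/MPERF ratio just
below 1 prints as 99% rather than 100%.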

Signed-off-by: Kan Liang 
---
 tools/perf/builtin-report.c |  3 +++
 tools/perf/util/pmu.h   |  2 ++
 tools/perf/util/session.c   | 34 +-
 tools/perf/util/session.h   | 38 ++
 4 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 62cce98..0cd0573 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -38,6 +38,8 @@
 
 #include "util/auxtrace.h"
 
+#include "util/pmu.h"
+
 #include <dlfcn.h>
 #include <linux/bitmap.h>
 
@@ -818,6 +820,7 @@ repeat:
symbol_conf.cumulate_callchain = false;
}
 
+   msr_pmu = perf_pmu__find("msr");
cpu_max_freq = get_cpu_max_freq() / 1000;
 
if (setup_sorting() < 0) {
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 7b9c8cf..e3e67aa 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -27,6 +27,8 @@ struct perf_pmu {
struct list_head list;/* ELEM */
 };
 
+struct perf_pmu *msr_pmu;
+
 struct perf_pmu_info {
const char *unit;
double scale;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index ed9dc25..6dd20b5 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -17,6 +17,7 @@
 #include "asm/bug.h"
 #include "auxtrace.h"
 #include "thread-stack.h"
+#include "pmu.h"
 
 static int perf_session__deliver_event(struct perf_session *session,
   union perf_event *event,
@@ -851,8 +852,14 @@ static void perf_evlist__print_tstamp(struct perf_evlist 
*evlist,
printf("%" PRIu64 " ", sample->time);
 }
 
-static void sample_read__printf(struct perf_sample *sample, u64 read_format)
+static void sample_read__printf(struct perf_evlist *evlist,
+   struct perf_sample *sample,
+   u64 read_format)
 {
+   struct perf_evsel *evsel;
+   struct perf_sample_id *sid;
+   u64 data[FREQ_PERF_MAX] = { 0 };
+
printf("... sample_read:\n");
 
if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
@@ -875,10 +882,26 @@ static void sample_read__printf(struct perf_sample 
*sample, u64 read_format)
printf(". id %016" PRIx64
   ", value %016" PRIx64 "\n",
   value->id, value->value);
+
+   sid = perf_evlist__id2sid(evlist, value->id);
+   evsel = sid ? sid->evsel : NULL;
+   if (evsel != NULL)
+   SET_FREQ_PERF_VALUE(evsel, data,
+   value->value);
}
} else
printf(". id %016" PRIx64 ", value %016" PRIx64 "\n",
sample->read.one.id, sample->read.one.value);
+
+   if (HAS_FREQ(data))
+   printf(". Freq %lu MHz\n",
+  (data[FREQ_PERF_CYCLES] * cpu_max_freq) / 
data[FREQ_PERF_REF_CYCLES]);
+   if (HAS_CPU_U(data))
+   printf(". CPU%% %lu%%\n",
+  (100 * data[FREQ_PERF_REF_CYCLES]) / 
data[FREQ_PERF_TSC]);
+   if (HAS_CORE_BUSY(data))
+   printf(". CORE_BUSY%% %lu%%\n",
+  (100 * data[FREQ_PERF_APERF]) / data[FREQ_PERF_MPERF]);
 }
 
 static void dump_event(struct perf_evlist *evlist, union perf_event *event,
@@ -899,8 +922,8 @@ static void dump_event(struct perf_evlist *evlist, union 
perf_event *event,
   event->header.size, perf_event__name(event->header.type));
 }
 
-static void dump_sample(struct perf_evsel *evsel, union perf_event *event,
-   struct perf_sample *sample)
+static void dump_sample(struct perf_evlist *evlist, struct perf_evsel *evsel,
+   union perf_event *event, struct perf_sample *sample)
 {
u64 sample_type;
 
@@ -938,7 +961,7 @@ static void dump_sample(struct perf_evsel *evsel, union 
perf_event *event,
printf("... transaction: %" PRIx64 "\n", sample->transaction);
 
if (sample_type & PERF_SAMPLE_READ)
-
