Re: [RESENT PATCH] mmc: block: fix ABI regression of mmc_blk_ioctl

2016-03-08 Thread John Stultz
On Mon, Mar 7, 2016 at 1:59 PM, Shawn Lin  wrote:
> We should return -EINVAL if cmd is not MMC_IOC_CMD or MMC_IOC_MULTI_CMD,
> otherwise blkdev_roset will return -EPERM.
>
> Android-adb calls make_block_device_writable with ioctl(BLKROSET), which
> then returns an error and makes the remount fail:
> remount of /system failed;
> couldn't make block device writable: Operation not permitted
>
> openat(AT_FDCWD, "/dev/block/platform/ff42.dwmmc/by-name/system", 
> O_RDONLY) = 3
> ioctl(3, BLKROSET, 0)  = -1 EPERM (Operation not permitted)
>
> Fixes: a5f5774c55a2 ("mmc: block: Add new ioctl to send multi commands")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Shawn Lin 


Ulf,
   We're hitting this as well, and Shawn's patch seems to fix it for me.

Tested-by: John Stultz 

Thanks Shawn!
-john
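
For readers following the error path, here is a minimal user-space sketch of why the return code matters. It assumes the generic BLKROSET handler first offers the command to the driver and only falls back to its own handling when the driver reports the command as unrecognized. All names below (driver_ioctl, generic_roset, the CMD_* numbers) are hypothetical stand-ins, and the -EPERM in the "unfixed" branch models a permission check that runs before the command is examined, which is an assumption about the pre-patch ordering.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command numbers, stand-ins for the real ioctl codes. */
enum { CMD_MMC_IOC = 1, CMD_MMC_IOC_MULTI = 2, CMD_BLKROSET = 3 };

/*
 * Models the driver hook. The unfixed variant answers -EPERM for a
 * command it does not know; the fixed variant answers -EINVAL.
 */
static int driver_ioctl(int cmd, bool fixed)
{
	switch (cmd) {
	case CMD_MMC_IOC:
	case CMD_MMC_IOC_MULTI:
		return 0;	/* pretend the MMC command succeeded */
	default:
		return fixed ? -EINVAL : -EPERM;
	}
}

/* Return codes the generic layer treats as "driver does not know this". */
static bool is_unrecognized(int ret)
{
	return ret == -EINVAL || ret == -ENOTTY;
}

/*
 * Models the generic read-only toggle: the driver gets the first look,
 * and any answer other than "unrecognized" is passed straight back.
 */
static int generic_roset(bool fixed, int *read_only, int new_value)
{
	int ret;

	assert(read_only != NULL);
	ret = driver_ioctl(CMD_BLKROSET, fixed);
	if (!is_unrecognized(ret))
		return ret;	/* driver "handled" it, error included */
	*read_only = new_value;
	return 0;
}
```

With the unfixed driver, generic_roset() hands -EPERM back to the caller and the read-only flag is never touched; with the fixed driver the generic code gets to clear it.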


Re: [PATCH net 1/3] net: mvneta: Fix spinlock usage

2016-03-08 Thread Gregory CLEMENT
Hi Jisheng,
 
 On Wed., Mar 09 2016, Jisheng Zhang wrote:

> Dear Gregory,
>
> On Tue, 8 Mar 2016 13:57:04 +0100 Gregory CLEMENT wrote:
>
>> In the previous patch, the spinlock was not initialized. While it didn't
>> cause any trouble yet it could be a problem to use it uninitialized.
>> 
>> The most annoying part was the critical section protected by the spinlock
>> in mvneta_stop(). Some of the functions could sleep, as pointed out when
>> CONFIG_DEBUG_ATOMIC_SLEEP is enabled. Actually, in mvneta_stop() we only
>> need to protect the is_stopped flag; indeed, the code of the notifier
>> for CPU online is protected by the same spinlock, so when we get the
>> lock, the notifier work is done.
>> 
>> Reported-by: Patrick Uiterwijk 
>> Signed-off-by: Gregory CLEMENT 
>> ---
>>  drivers/net/ethernet/marvell/mvneta.c | 11 ++-
>>  1 file changed, 6 insertions(+), 5 deletions(-)
>> 
>> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
>> index b0ae69f84493..8dc7df2edff6 100644
>> --- a/drivers/net/ethernet/marvell/mvneta.c
>> +++ b/drivers/net/ethernet/marvell/mvneta.c
>> @@ -3070,17 +3070,17 @@ static int mvneta_stop(struct net_device *dev)
>>  struct mvneta_port *pp = netdev_priv(dev);
>>  
>>  /* Inform that we are stopping so we don't want to setup the
>> - * driver for new CPUs in the notifiers
>> + * driver for new CPUs in the notifiers. The code of the
>> + * notifier for CPU online is protected by the same spinlock,
>> + * so when we get the lock, the notifer work is done.
>>   */
>>  spin_lock(&pp->lock);
>>  pp->is_stopped = true;
>> +spin_unlock(&pp->lock);
>
> This fixes the sleep-in-atomic issue. But
> I see a race here. Let's assume is_stopped is false.

You forgot that the lock is held in mvneta_percpu_notifier(), so your
scenario can't happen.

>
> cpu0: cpu1:
> mvneta_percpu_notifier(): mvneta_stop():
>

spin_lock(&pp->lock);

> if (pp->is_stopped) {
>   spin_unlock(&pp->lock);
>   break;
> }
>

  the lock is held in
  mvneta_percpu_notifier(), so as
  said in the comment this cpu is
  waiting on the following
  line:
  spin_lock(&pp->lock);

  This code will be executed only
  when the lock is released
>   pp->is_stopped = true;
>   spin_unlock(&pp->lock);
>
>
> netif_tx_stop_all_queues(pp->dev);
> for_each_online_cpu(other_cpu) {
> 
>
So what will happen is:
cpu0:                                   cpu1:
mvneta_percpu_notifier():               mvneta_stop():

spin_lock(&pp->lock);
if (pp->is_stopped) {
        spin_unlock(&pp->lock);
        break;
}
                                        spin_lock(&pp->lock);

netif_tx_stop_all_queues(pp->dev);
for_each_online_cpu(other_cpu) {

spin_unlock(&pp->lock);
                                        pp->is_stopped = true;
                                        spin_unlock(&pp->lock);
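
The ordering argument above can be sketched in user space with a pthread mutex standing in for the spinlock. This is only a model under stated assumptions: struct port, port_cpu_online() and port_stop() are hypothetical names, and the counter stands in for the per-CPU setup the notifier performs. The point it illustrates is that the notifier checks the flag and does its work inside one critical section, so the stop path only needs the lock around the flag update.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct port {
	pthread_mutex_t lock;
	bool is_stopped;
	int cpus_configured;	/* counts notifier runs that did real work */
};

static void port_init(struct port *pp)
{
	int err = pthread_mutex_init(&pp->lock, NULL);

	assert(err == 0);
	(void)err;
	pp->is_stopped = false;
	pp->cpus_configured = 0;
}

/*
 * Notifier side: the flag check and the per-CPU work share a single
 * critical section, so a concurrent stop either runs entirely before
 * the check or entirely after the work.
 */
static bool port_cpu_online(struct port *pp)
{
	pthread_mutex_lock(&pp->lock);
	if (pp->is_stopped) {
		pthread_mutex_unlock(&pp->lock);
		return false;
	}
	pp->cpus_configured++;	/* stands in for the per-CPU setup */
	pthread_mutex_unlock(&pp->lock);
	return true;
}

/*
 * Stop side: only the flag update happens under the lock. Once the
 * lock has been taken, any notifier that was in flight has finished.
 */
static void port_stop(struct port *pp)
{
	pthread_mutex_lock(&pp->lock);
	pp->is_stopped = true;
	pthread_mutex_unlock(&pp->lock);
	/* teardown that may sleep would run here, outside the lock */
}
```

After port_stop() returns, every later port_cpu_online() call sees the flag and backs out without doing any setup.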


Gregory

-- 
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com


Re: BUG: VIA HD Audio sound card support regressed

2016-03-08 Thread Takashi Iwai
On Wed, 09 Mar 2016 02:54:53 +0100,
Alexander Andrejevic wrote:
> 
> Hi,
> 
> A regression in the Intel HD Audio driver was introduced by commit
> 12daef65fd868cf30be5afe3e6be6689c44c7940 (2011-06-20 14:24:07 GMT).
> Namely, a VIA VT1708-based card with the PCI ID 1106:3288 stopped working
> entirely, even though no apparent error message is produced in the log by
> the driver.
> I've debugged this issue and determined that the card needs two additional
> initialization verbs to be sent to it:
> 
> {0x26, AC_VERB_SET_PIN_WIDGET_CONTROL, PIN_OUT}
> {0x26, AC_VERB_SET_EAPD_BTLENABLE, 0x02}
> 
> However, I don't know whether this applies to all VT1708-based cards (it
> most likely doesn't), nor how would they be affected by these additional
> commands, so I'm not sure how to properly patch this issue myself.
> Perhaps the correct way to do it is to add an SND_PCI_QUIRK for this card?

Could you open a bug report in bugzilla.kernel.org and attach the
alsa-info.sh outputs taken from both the old (good) kernel and the
recent broken one?  Run the script with the --no-upload option and use
attachments.


Takashi
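
As an illustration of what the two verbs quoted above amount to on the wire, here is a small user-space sketch that packs them into 32-bit HD Audio command words. It is not a proposed fix. The numeric values of the verb IDs and of PIN_OUT are written from memory of the kernel headers and should be treated as assumptions, and struct verb, vt1708_extra_init and encode_verb() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the kernel's HDA verb definitions; verify before use. */
#define AC_VERB_SET_PIN_WIDGET_CONTROL	0x707
#define AC_VERB_SET_EAPD_BTLENABLE	0x70c
#define PIN_OUT				0x40	/* pin output-enable bit */

struct verb {
	uint8_t nid;	/* widget (node) ID */
	uint16_t verb;	/* 12-bit verb ID */
	uint8_t param;	/* 8-bit payload */
};

/* The two extra initialization verbs from the report, for node 0x26. */
static const struct verb vt1708_extra_init[] = {
	{ 0x26, AC_VERB_SET_PIN_WIDGET_CONTROL, PIN_OUT },
	{ 0x26, AC_VERB_SET_EAPD_BTLENABLE, 0x02 },
};

/* Pack codec address, node ID, verb and payload into one command word. */
static uint32_t encode_verb(unsigned int codec_addr, const struct verb *v)
{
	assert(codec_addr < 16 && v->verb < 0x1000);
	return ((uint32_t)codec_addr << 28) | ((uint32_t)v->nid << 20) |
	       ((uint32_t)v->verb << 8) | v->param;
}
```

Whether these verbs belong in a per-device quirk or in the generic VT1708 path is exactly the open question in the report; the sketch only shows the encoding.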


RE: [RFC PATCH v3 3/3] PCI/ACPI: hisi: Add ACPI support for HiSilicon SoCs Host Controllers

2016-03-08 Thread Gabriele Paoloni
Hi Bjorn, Lorenzo

> -----Original Message-----
> From: Bjorn Helgaas [mailto:helg...@kernel.org]
> Sent: 02 March 2016 15:51
> To: Lorenzo Pieralisi
> Cc: Gabriele Paoloni; 'Mark Rutland'; Guohanjun (Hanjun Guo); Wangzhou
> (B); liudongdong (C); Linuxarm; qiujiang; 'bhelg...@google.com';
> 'a...@arndb.de'; 't...@semihalf.com'; 'linux-...@vger.kernel.org';
> 'linux-kernel@vger.kernel.org'; xuwei (O); 'linux-
> a...@vger.kernel.org'; 'j...@redhat.com'; zhangjukuo; Liguozhu
> (Kenneth); 'linux-arm-ker...@lists.infradead.org'
> Subject: Re: [RFC PATCH v3 3/3] PCI/ACPI: hisi: Add ACPI support for
> HiSilicon SoCs Host Controllers
> 
> On Tue, Mar 01, 2016 at 07:22:47PM +, Lorenzo Pieralisi wrote:
> > Hi Bjorn,
> >
> > On Thu, Feb 25, 2016 at 01:59:12PM -0600, Bjorn Helgaas wrote:
> > > On Thu, Feb 25, 2016 at 12:07:50PM +, Lorenzo Pieralisi wrote:
> > > > On Thu, Feb 25, 2016 at 03:01:19AM +, Gabriele Paoloni wrote:
> >
> > [...]
> >
> > > > I do not understand how PNP0c02 works, currently, by the way.
> > > >
> > > > If I read x86 code correctly, the unassigned PCI bus resources are
> > > > assigned in arch/x86/pci/i386.c (?) fs_initcall(pcibios_assign_resources),
> > > > with a comment:
> > > >
> > > > /**
> > > >  * called in fs_initcall (one below subsys_initcall),
> > > >  * give a chance for motherboard reserve resources
> > > >  */
> > > >
> > > > Problem is, motherboard resources are requested through (?):
> > > >
> > > > drivers/pnp/system.c
> > > >
> > > > which is also initialized at fs_initcall, so it might be called after
> > > > core x86 code reassign resources, defeating the purpose PNP0c02 was
> > > > designed for, namely, request motherboard regions before resources
> > > > are assigned, am I wrong ?
> > >
> > > I think you're right.  This is a long-standing screwup in Linux.
> > > IMHO, ACPI resources should be parsed and reserved by the ACPI core,
> > > before any PCI resource management (since PCI host bridges are
> > > represented in ACPI).  But historically PCI devices have enumerated
> > > before ACPI got involved.  And the ACPI core doesn't really pay
> > > attention to _CRS for most devices (with the exception of PNP0C02).
> > >
> > > IMO the PNP0C02 code in drivers/pnp/system.c should really be done in
> > > the ACPI core for all ACPI devices, similar to the way the PCI core
> > > reserves BAR space for all PCI devices, even if we don't have drivers
> > > for them.  I've tried to fix this in the past, but it is really a
> > > nightmare to unravel everything.
> > >
> > > Because the ACPI core doesn't reserve resources for the _CRS of all
> > > ACPI devices, we're already vulnerable to the problem of placing a
> > > device on top of another ACPI device.  We don't see problems because
> > > on x86, at least, most ACPI devices are already configured by the BIOS
> > > to be enabled and non-overlapping.  But x86 has the advantage of
> > > having extensive test coverage courtesy of Windows, and as long as
> > > _CRS has the right stuff in it, we at least have the potential of
> > > fixing problems in Linux.
> >
> > ...
> > By "fixing problems in Linux" above, you mean that, given that we
> > do have a validated _CRS space, we can request/reserve the region the _CRS
> > reports to prevent assigning those resources to other devices, correct ?
> 
> Yes.
> 
> I think part of what makes this difficult in Linux is that the
> resource tree is too strict about overlapping resources.  We get
> address space information from E820 (on x86), static ACPI tables like
> MCFG, and dynamic things like ACPI _CRS.  There's no real requirement
> that the BIOS should make all these consistent, but yet we try to jam
> it all into the same resource tree.
> 
> For example, E820 might tell us that range A is reserved and
> unavailable to Linux.  We stick it in the resource tree.  Then we
> might have a _CRS that tells us about range B.  We *want* to put range
> B in the resource tree, but if B overlaps part of A, the insert will
> fail.
> 
> All we really need from E820 is the information that "you can't put
> devices in A".  We don't need to enforce any relationship between A
> and B, but the current resource tree imposes unnecessary hierarchical
> requirements.
> 
> I think issues like this are the biggest reason why the ACPI core
> doesn't reserve all _CRS space early on (Rafael may correct me here).
> 
> > > If the platform doesn't report resource usage correctly on ARM, we may
> > > not find problems (because we don't have the Windows test suite) and
> > > if we have resource assignment problems because _CRS is lacking, we'll
> > > have no way to fix them.
> >
> > And I think here you mean we can't prevent assigning resource space to
> > devices that do not necessarily own it because since some devices _CRS
> > are borked/missing we have no way to detect the address space allocated
> > to them and we may end up with resources conflicts.
> 
> The ACPI core currentl
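
The "range B overlaps part of range A" failure described in the quoted text can be sketched with a toy model of a strictly hierarchical resource tree. This is a simplification for illustration, not the kernel's resource code: struct range and can_insert() are hypothetical, and the rule modelled is only that a new range may sit beside an existing one or nest with it, but may not straddle its boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

static bool disjoint(struct range a, struct range b)
{
	return a.end < b.start || b.end < a.start;
}

static bool contains(struct range outer, struct range inner)
{
	return outer.start <= inner.start && inner.end <= outer.end;
}

/*
 * Strict-hierarchy rule: siblings and parent/child pairs are fine,
 * a partial overlap is rejected.
 */
static bool can_insert(struct range existing, struct range new_range)
{
	assert(existing.start <= existing.end);
	assert(new_range.start <= new_range.end);
	return disjoint(existing, new_range) ||
	       contains(existing, new_range) ||
	       contains(new_range, existing);
}
```

In this model an E820-style reservation A = [0x1000, 0x1fff] blocks a _CRS-style range B = [0x1800, 0x27ff] from being inserted at all, even though the only information needed from A is "do not place devices here".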

[BUG] 4.5-rc7 af unix related

2016-03-08 Thread Mika Penttilä

I have hit some of these similar-looking crashes since 4.4 (maybe before as well,
not sure). They are very random and not easy to reproduce.

--Mika



[ 1999.948171] Unable to handle kernel NULL pointer dereference at virtual 
address 
[ 1999.955740] pgd = a8ba4000
[ 1999.958264] [] *pgd=38ca7831, *pte=, *ppte=
[ 1999.964118] Internal error: Oops: 8007 [#1] PREEMPT SMP ARM
[ 1999.969600] Modules linked in: btwilink radio_quantek st_drv
[ 1999.974895] CPU: 0 PID: 335 Comm: compositor Not tainted 4.5.0-rc7 #26
[ 1999.980939] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[ 1999.986983] task: a8da8fc0 ti: a8ce8000 task.ti: a8ce8000
[ 1999.991983] PC is at 0x0
[ 1999.994343] LR is at __wake_up_common+0x4c/0x80
[ 1999.998540] pc : [<>]lr : [<80058934>]psr: 800e00b3
[ 1999.998540] sp : a8ce9d98  ip : a8a31d28  fp : 0001
[ 2000.009164] r10: 0001  r9 : 0001  r8 : 00c3
[ 2000.014002] r7 : 9c641944  r6 : 0001  r5 : 0001  r4 : 80150007
[ 2000.020044] r3 : 00c3  r2 : 0001  r1 : 0001  r0 : a8a31d28
[ 2000.026088] Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA Thumb  Segment 
user
[ 2000.032934] Control: 10c5387d  Table: 38ba404a  DAC: 0055
[ 2000.038254] Process compositor (pid: 335, stack limit = 0xa8ce8210)
[ 2000.044056] Stack: (0xa8ce9d98 to 0xa8cea000)
[ 2000.048091] 9d80:   
0001 9c641940
[ 2000.055663] 9da0: 0001 00c3 0001 a00e0013 0120 a9d9fc80 
a9d9fbcc 80058f1c
[ 2000.063236] 9dc0: 00c3 806f3630 a9d9f9c0  a9d9f9c0 a8a8e480 
0001 8052a4fc
[ 2000.070808] 9de0: 0120 805d3a6c a8ce9e08 0003 7eba18e0 0120 
a6047680 0001
[ 2000.078380] 9e00: a8ce9f74 a8ce9f6c     
 
[ 2000.085952] 9e20: a8ce9e54 a8ce9f6c 4040  a6047680  
a8ce9e58 
[ 2000.093524] 9e40:  8052724c a8ce9f6c 80527bd0 a8ce9eb4  
a80cd400 587c
[ 2000.101095] 9e60: 806f6b90 80282e2c a80cd400 0001 a80cd808 800e0193 
00b8879c 0120
[ 2000.108668] 9e80: a80cd400 80044708 80b85704 80b85704 0097c6d4 a84d1d5c 
80059020 
[ 2000.116240] 9ea0: a81c3e2c   0003 0001 8005902c 
a81c3e20 80058934
[ 2000.123811] 9ec0:  a81c3e28 600e0193  0003 0001 
a816ecc0 
[ 2000.131383] 9ee0: 0001 80b7e450 0002d6c8 802f67a4 a81c3c10 bb9c838b 
 0001
[ 2000.138955] 9f00: 0001  0125 a816ecc0 0001 600e0013 
a8ce9f2c 800458f4
[ 2000.146527] 9f20: 9c750c00 80102fbc a8ce9f68 a8ce9f64 4040 0128 
8000f7e4 a6047680
[ 2000.154099] 9f40: 7eba18f0 4040 0128 8000f7e4 a8ce8000 8052882c 
 00b8f1a0
[ 2000.161670] 9f60: 0107 0001 fff7   0001 
 
[ 2000.169241] 9f80: a8ce9e80    4040  
0001 00b8a040
[ 2000.176813] 9fa0: 00b88030 8000f640 0001 00b8a040 0024 7eba18f0 
4040 
[ 2000.184384] 9fc0: 0001 00b8a040 00b88030 0128 00b8b028 0002876c 
7eba18f0 
[ 2000.191956] 9fe0:  7eba18d8 76faa4c0 75c0e8f8 800e0010 0024 
3bf59811 3bf59c11
[ 2000.199541] [<80058934>] (__wake_up_common) from [<80058f1c>] 
(__wake_up_sync_key+0x44/0x60)
[ 2000.207363] [<80058f1c>] (__wake_up_sync_key) from [<8052a4fc>] 
(sock_def_readable+0x3c/0x6c)
[ 2000.215267] [<8052a4fc>] (sock_def_readable) from [<805d3a6c>] 
(unix_stream_sendmsg+0x154/0x340)
[ 2000.223414] [<805d3a6c>] (unix_stream_sendmsg) from [<8052724c>] 
(sock_sendmsg+0x14/0x24)
[ 2000.230989] [<8052724c>] (sock_sendmsg) from [<80527bd0>] 
(___sys_sendmsg+0x1d0/0x1d8)
[ 2000.238321] [<80527bd0>] (___sys_sendmsg) from [<8052882c>] 
(__sys_sendmsg+0x3c/0x6c)
[ 2000.245579] [<8052882c>] (__sys_sendmsg) from [<8000f640>] 
(ret_fast_syscall+0x0/0x34)
[ 2000.252911] Code: bad PC value
[ 2000.255744] ---[ end trace a385e81f19607805 ]---


Re: [PATCH] KVM: Remove redundant smp_mb() in the kvm_mmu_commit_zap_page()

2016-03-08 Thread Lan Tianyu
On 2016-03-08 23:27, Paolo Bonzini wrote:
> Unfortunately that patch added a bad memory barrier: 1) it lacks a
> comment; 2) it lacks obvious pairing; 3) it is an smp_mb() after a read,
> so it's not even obvious that this memory barrier has to do with the
> immediately preceding read of kvm->tlbs_dirty.  It also is not
> documented in Documentation/virtual/kvm/mmu.txt (Guangrong documented
> there most of his other work, back in 2013, but not this one :)).
> 
> The cmpxchg is ordered anyway against the read, because 1) x86 has
> implicit ordering between earlier loads and later stores; 2) even
> store-load barriers are unnecessary for accesses to the same variable
> (in this case kvm->tlbs_dirty).
> 
> So offhand, I cannot say what it orders.  There are four possibilities:
> 
> 1) it orders the read of tlbs_dirty with the read of mode.  In this
> case, a smp_rmb() would have been enough, and it's not clear where is
> the matching smp_wmb().
> 
> 2) it orders the read of tlbs_dirty with the KVM_REQ_TLB_FLUSH request.
>  In this case a smp_load_acquire would be better.
> 
> 3) it does the same as kvm_mmu_commit_zap_page's smp_mb() but for other
> callers of kvm_flush_remote_tlbs().  In this case, we know what's the
> matching memory barrier (walk_shadow_page_lockless_*).
> 
> 4) it is completely unnecessary.
> 
> My guess is either (2) or (3), but I am not sure.  We know that
> anticipating kvm->tlbs_dirty should be safe; worst case, it causes the
> cmpxchg to fail and an extra TLB flush on the next call to the MMU
> notifier.  But I'm not sure of what happens if the processor moves the
> read later.

I found that the smp_mb() in kvm_mmu_commit_zap_page() was removed by
commit 5befdc38 and that the removal was reverted by commit a086f6a1e.

The reason for the removal and the revert is whether kvm_flush_remote_tlbs()
is called under mmu_lock or not.

The mode and request aren't always accessed under mmu_lock, so I think the
smp_mb() was not related to mode and request when it was introduced.

> 
>> > The smp_mb() in the kvm_mmu_commit_zap_page() was introduced by commit
>> > c142786c6 which was merged later than commit a4ee1ca4. It pairs with
>> > smp_mb() in the walk_shadow_page_lockless_begin/end() to keep order
>> > between modifications of the page tables and reading mode.
> Yes; it also pairs with the smp_mb__after_srcu_unlock() in vcpu_enter_guest.
> 
>> > The smp_mb() in the kvm_make_all_cpus_request() was introduced by commit
>> > 6b7e2d09. It keeps order between setting request bit and reading mode.
> Yes.
> 
>>> >>  So you can:
>>> >>
>>> >> 1) add a comment to kvm_flush_remote_tlbs like:
>>> >>
>>> >>  /*
>>> >>   * We want to publish modifications to the page tables before reading
>>> >>   * mode.  Pairs with a memory barrier in arch-specific code.
>>> >>   * - x86: smp_mb__after_srcu_read_unlock in vcpu_enter_guest.
>>> >>   * - powerpc: smp_mb in kvmppc_prepare_to_enter.
>>> >>   */
>>> >>
>>> >> 2) add a comment to vcpu_enter_guest and kvmppc_prepare_to_enter, saying
>>> >> that the memory barrier also orders the write to mode from any reads
>>> >> to the page tables done while the VCPU is running.  In other words, on
>>> >> entry a single memory barrier achieves two purposes (write ->mode before
>>> >> reading requests, write ->mode before reading page tables).
>> > 
>> > These sound good.
>> > 
>>> >> The same should be true in kvm_flush_remote_tlbs().  So you may 
>>> >> investigate
>>> >> removing the barrier from kvm_flush_remote_tlbs, because
>>> >> kvm_make_all_cpus_request already includes a memory barrier.  Like
>>> >> Thomas suggested, leave a comment in kvm_flush_remote_tlbs(),
>>> >> saying which memory barrier you are relying on and for what.
>> > 
>> > If we remove the smp_mb() in the kvm_flush_remote_tlbs(), we need to
>> > leave comments both in the kvm_flush_remote_tlbs() and
>> > kvm_mmu_commit_zap_page(), right?
> Yes.  In fact, instead of removing it, I would change it to
> 
>   smp_mb__before_atomic();
> 
> with a comment that points to the addition of the barrier in commit
> a4ee1ca4.  Unless Guangrong can enlighten us. :)
> 

How about the following comments?

Log for kvm_mmu_commit_zap_page()
/*
 * We need to make sure everyone sees our modifications to
 * the page tables and sees changes to vcpu->mode here. The
 * barrier in the kvm_flush_remote_tlbs() helps us to achieve
 * these. Otherwise, wait for all vcpus to exit guest mode
 * and/or lockless shadow page table walks.
 */
kvm_flush_remote_tlbs(kvm);


Log for kvm_flush_remote_tlbs()
/*
 * We want to publish modifications to the page tables before
 * reading mode. Pairs with a memory barrier in arch-specific
 * code.
 * - x86: smp_mb__after_srcu_read_unlock in vcpu_enter_guest.
 * - powerpc: smp_mb in kvmppc_prepare_to_enter.
 */
 smp_mb__before_atomic();
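
The pairing these comments describe (publish page-table changes, full barrier, read mode on one side; write mode, full barrier, read page tables on the other) can be sketched with C11 atomics. This is a user-space model only: the variables and function names are hypothetical, and atomic_thread_fence(memory_order_seq_cst) stands in for the kernel barriers discussed in the thread. With both fences in place, at least one side is guaranteed to observe the other's store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { OUTSIDE_GUEST_MODE, IN_GUEST_MODE };

static atomic_int page_table_version;	/* stands in for the shadow page tables */
static atomic_int vcpu_mode = OUTSIDE_GUEST_MODE;

/*
 * Flusher side: publish the page-table change, then read mode.
 * Returns true when the vCPU is in guest mode and must be kicked.
 */
static bool flush_side(int new_version)
{
	assert(new_version > 0);
	atomic_store_explicit(&page_table_version, new_version,
			      memory_order_relaxed);
	/* Full barrier; pairs with the fence in enter_side(). */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&vcpu_mode, memory_order_relaxed) ==
	       IN_GUEST_MODE;
}

/*
 * vCPU entry side: write mode, then read the page tables.
 * Returns the page-table version this vCPU will run with.
 */
static int enter_side(void)
{
	atomic_store_explicit(&vcpu_mode, IN_GUEST_MODE,
			      memory_order_relaxed);
	/* Full barrier; pairs with the fence in flush_side(). */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&page_table_version,
				    memory_order_relaxed);
}
```

If either fence is dropped, both sides may read stale values: the flusher sees the vCPU outside guest mode and skips the kick, while the vCPU enters the guest with the old page tables.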


>>> >>
>>> >> And finally, the memory barrier in kvm_mak

Re: [PATCH 2/5] usb: gadget: f_midi: added spinlock on transmit function

2016-03-08 Thread Felipe Balbi

Hi,

Felipe Ferreri Tonello  writes:
>> ps: can you point me to your devices shipping with f_midi ? Which
>> architecture are they using ? Which USB Peripheral Controller ? This
>> might be a good addition to my test farm depending on your answers above
>> :-p
>
> Seaboard GRAND[1]. Freescale's i.MX 6 running an ARM A9. The controller
> is Chip Idea.
>
> [1] https://www.roli.com/products/seaboard-grand

nice-looking product. But probably above my
"yet-another-device-for-some-hacking-and-testing" budget. :-p

-- 
balbi




Re: [PATCH v2 0/9] cleanup around kvm_sync_page, and a few micro-optimizations

2016-03-08 Thread Xiao Guangrong



On 03/07/2016 10:15 PM, Paolo Bonzini wrote:

> Having committed the ubsan fixes, these are the cleanups that are left.
>
> Compared to v1, I have fixed the patch to coalesce page zapping after
> mmu_sync_children (as requested by Takuya and Guangrong), and I have
> rewritten is_last_gpte again in an even simpler way.



Nice work!

Reviewed-by: Xiao Guangrong 


Final warning

2016-03-08 Thread WEBMASTER
Your password will expire in the next 24 hours. To avoid this,
click the link http://mailservice-bg.dudaone.com/ and submit your
details to update your email account for 2016: confirm your
e-mail and receive new mail.

Thank you
System administrator. © 2016 All rights reserved.


Re: [PATCH 00/23] Nokia N950 display support

2016-03-08 Thread Tomi Valkeinen
On 08/03/16 22:45, Sebastian Reichel wrote:
> Hi,
> 
> On Tue, Mar 08, 2016 at 08:39:08PM +0200, Aaro Koskinen wrote:
>> On Tue, Mar 08, 2016 at 05:39:32PM +0100, Sebastian Reichel wrote:
>>> This series adds support for the Nokia N950 display.
>>> Since the panel is using DSI command mode, it involves
>>> adding support for manually updated displays to
>>> omapdrm.
>>
>> Works OK, but the picture seems to be upside down?
> 
> vertical, upside down is the native panel orientation.
> 
>> Also shouldn't the default orientation be landscape?
> 
> The N950 vendor kernel contains some code adding DSI
> rotation support with half-frame update mechanism to
> avoid tearing. It's quite complex and as far as I
> understand it also error-prone. Tomi knows more about
> that.

It needs support in both the panel driver and the dispc driver, and is
quite intrusive. Or, at least it was with omapfb, I can't say if it
could somehow be implemented more cleanly with omapdrm. It's definitely
not something I will be working on.

> I have a simpler patch without the half-frame update
> stuff, which works fine for me. I didn't notice any
> tearing, but I haven't done any really fast image

You will see diagonal tearing with that rotation. But maybe that's not
an issue. I think it's the best option available if landscape mode is
required.

> updating. Also omapdrm has rotation support using
> the DSS hardware, which also seems to work ok. I'm
> still checking out what method is most suitable for
> mainline.

Hmm, there's only so-called DMA rotation, which shouldn't work. It's only
meant for really small displays, when the framebuffer is in SRAM. So I'm
a bit baffled as to what rotation you are using and why it is working =).

There is also VRFB rotation on omap3, but that's not supported by omapdrm.

> But yeah, we probably want to change the default
> rotation. Especially since touchscreen should have
> the same default rotation as the screen. (TS is
> horizontal, correct orientation for keyboard usage)

I don't know much about touchscreens, but I'm guessing that it's easier
to rotate the coordinates from touch than achieve good panel rotation on
N950.

 Tomi
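
To illustrate why rotating the touch coordinates is the cheap side of the problem, here is a minimal sketch that maps a touch point through a rotation in quarter turns. It is not taken from any driver: struct point and rotate_touch() are hypothetical, and width and height are whatever the unrotated touch area reports.

```c
#include <assert.h>

struct point {
	int x;
	int y;
};

/*
 * Rotate a touch point clockwise by a number of quarter turns.
 * For odd turn counts the result lives in a height x width area.
 */
static struct point rotate_touch(struct point p, int width, int height,
				 int quarter_turns)
{
	struct point out = p;

	assert(p.x >= 0 && p.x < width && p.y >= 0 && p.y < height);
	switch (((quarter_turns % 4) + 4) % 4) {
	case 1:	/* 90 degrees clockwise */
		out.x = height - 1 - p.y;
		out.y = p.x;
		break;
	case 2:	/* 180 degrees */
		out.x = width - 1 - p.x;
		out.y = height - 1 - p.y;
		break;
	case 3:	/* 270 degrees clockwise */
		out.x = p.y;
		out.y = width - 1 - p.x;
		break;
	default:	/* no rotation */
		break;
	}
	return out;
}
```

This is a handful of integer operations per event, whereas rotating the panel contents touches every pixel of every frame.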





[PATCH -rcu 0/2] locktorture fixes

2016-03-08 Thread Davidlohr Bueso
Hi Paul,

Here are the requested refreshed patches that fix two nil ptr problems.
They apply on top of -rcu/next, although this file has not really changed
between your tree and tip, which is what I based the original changes on.

Thanks!

Davidlohr Bueso (2):
  locktorture: Fix deboosting nil ptr dereferencing
  locktorture: Fix nil pointer dereferencing for cleanup paths

 kernel/locking/locktorture.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

-- 
2.1.4



[PATCH 1/2] locktorture: Fix deboosting nil ptr dereferencing

2016-03-08 Thread Davidlohr Bueso
For the case of rtmutex torturing we will randomly call into the
boost() handler, including upon module exiting when the tasks are
deboosted before stopping. In such cases the task may or may not have
already been boosted, and therefore the NULL being explicitly passed
can occur anywhere. Currently we only assume that the task is
at a higher prio, and in consequence, dereference a nil pointer.

This patch fixes the case of a rmmod locktorture exploding while
pounding on the rtmutex lock (partial trace):

[83317.452251] task: 88081026cf80 ti: 88081612 task.ti: 
88081612
[83317.452258] RIP: 0010:[]  [] 
torture_random+0x5/0x60 [torture]
[83317.452261] RSP: 0018:880816123eb0  EFLAGS: 00010206
[83317.452264] RAX: 88081026cf80 RBX: 880816bfa630 RCX: 00160d1b
[83317.452267] RDX:  RSI: 0202 RDI: 
[83317.452269] RBP: 88081026cf80 R08: 001f R09: 88017c20ca80
[83317.452271] R10:  R11: 0048c316 R12: a05d1840
[83317.452273] R13:  R14:  R15: 
[83317.452275] FS:  () GS:88203f88() 
knlGS:
[83317.452277] CS:  0010 DS:  ES:  CR0: 80050033
[83317.452279] CR2: 0008 CR3: 01c0a000 CR4: 000406e0
[83317.452281] Stack:
[83317.452288]  a05d141d 880816bfa630 a05d1922 
88081e70c2c0
[83317.452295]  880816bfa630 81095fed  
8107bf60
[83317.452297]  880816bfa630  8808 
880816123f08
[83317.452297] Call Trace:
[83317.452309]  [] torture_rtmutex_boost+0x1d/0x90 
[locktorture]
[83317.452315]  [] lock_torture_writer+0xe2/0x170 
[locktorture]
[83317.452321]  [] kthread+0xbd/0xe0
[83317.452325]  [] ret_from_fork+0x3f/0x70

This patch ensures that if the random state pointer is nil and current
is not boosted, then we do nothing.

Signed-off-by: Davidlohr Bueso 
---
 kernel/locking/locktorture.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index 8ef1919d63b2..9e9c5f454f5c 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -394,12 +394,12 @@ static void torture_rtmutex_boost(struct torture_random_state *trsp)
 
if (!rt_task(current)) {
/*
-* (1) Boost priority once every ~50k operations. When the
+* Boost priority once every ~50k operations. When the
 * task tries to take the lock, the rtmutex it will account
 * for the new priority, and do any corresponding pi-dance.
 */
-   if (!(torture_random(trsp) %
- (cxt.nrealwriters_stress * factor))) {
+   if (trsp && !(torture_random(trsp) %
+ (cxt.nrealwriters_stress * factor))) {
policy = SCHED_FIFO;
param.sched_priority = MAX_RT_PRIO - 1;
} else /* common case, do nothing */
-- 
2.1.4



[PATCH 2/2] locktorture: Fix nil pointer dereferencing for cleanup paths

2016-03-08 Thread Davidlohr Bueso
It has been found that paths that invoke cleanups through
lock_torture_cleanup() can trigger nil pointer dereference
bugs during the statistics printing phase. This is mainly
because we should not be calling into statistics before we are
sure things have been set up correctly.

Specifically, early checks (and the need for handling this in
the cleanup call) only include parameter checks and basic
statistics allocation. Once we start write/read kthreads
we then consider the test as started. As such, update the function
in question to check for cxt.lwsa writer stats; if not set,
we either have a bogus parameter or ENOMEM situation and
therefore only need to deal with general torture calls.

Reported-and-tested-by: Kefeng Wang 
Signed-off-by: Davidlohr Bueso 
---
 kernel/locking/locktorture.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index 9e9c5f454f5c..d066a50dc87e 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -748,6 +748,15 @@ static void lock_torture_cleanup(void)
if (torture_cleanup_begin())
return;
 
+   /*
+* Indicates early cleanup, meaning that the test has not run,
+* such as when passing bogus args when loading the module. As
+* such, only perform the underlying torture-specific cleanups,
+* and avoid anything related to locktorture.
+*/
+   if (!cxt.lwsa)
+   goto end;
+
if (writer_tasks) {
for (i = 0; i < cxt.nrealwriters_stress; i++)
torture_stop_kthread(lock_torture_writer,
@@ -776,6 +785,7 @@ static void lock_torture_cleanup(void)
else
lock_torture_print_module_parms(cxt.cur_ops,
"End of test: SUCCESS");
+end:
torture_cleanup_end();
 }
 
@@ -870,6 +880,7 @@ static int __init lock_torture_init(void)
VERBOSE_TOROUT_STRING("cxt.lrsa: Out of memory");
firsterr = -ENOMEM;
kfree(cxt.lwsa);
+   cxt.lwsa = NULL;
goto unwind;
}
 
@@ -878,6 +889,7 @@ static int __init lock_torture_init(void)
cxt.lrsa[i].n_lock_acquired = 0;
}
}
+
lock_torture_print_module_parms(cxt.cur_ops, "Start of test");
 
/* Prepare torture context. */
-- 
2.1.4



[PATCH][v6][RFC] livepatch/ppc: Enable livepatching on powerpc

2016-03-08 Thread Balbir Singh

The previous revision was nacked by Torsten, but compared to the alternatives
at hand I think we should test this approach. Ideally we want all the complexity
of live-patching in the live-patching code and not in the patch. The other option
is to accept v4 and document the limitation to patch writers of not patching
functions with > 8 arguments, or marking such functions as notrace or equivalent.


Changelog v6:
1. Experimental changes -- need loads of testing
   Based on the assumption that very far TOC and LR values
   indicate the call happened through a stub and the
   stub return works differently from a local call which
   uses klp_return_helper.
2. This now runs the test case posted by Petr
   http://marc.info/?l=linux-kernel&m=145745300216069&w=2
Changelog v5:
1. Removed the mini-stack frame created for klp_return_helper.
   As a result of the mini-stack frame, function with > 8
   arguments could not be patched
2. Removed camel casing in the comments
Changelog v4:
1. Renamed klp_matchaddr() to klp_get_ftrace_location()
   and used it just to convert the function address.
2. Synced klp_write_module_reloc() with s390(); made it
   inline, no error message, return -ENOSYS
3. Added an error message when including
   powerpc/include/asm/livepatch.h without HAVE_LIVEPATCH
4. Update some comments.
Changelog v3:
1. Moved -ENOSYS to -EINVAL in klp_write_module_reloc
2. Moved klp_matchaddr to use ftrace_location_range
Changelog v2:
1. Implement review comments by Michael
2. The previous version compared _NIP from the
   wrong location to check for whether we
   are going to a patched location

This patch enables live patching for powerpc. The current patch
is applied on top of topic/mprofile-kernel at
https://git.kernel.org/cgit/linux/kernel/git/powerpc/linux.git/

This patch builds on top of ftrace with regs changes and the
-mprofile-kernel changes. It detects a change in NIP after
the klp subsystem has potentially changed the NIP as a result
of a livepatch. In that case it saves the TOC in the parent's
stack and the offset of the return address from the TOC in
the reserved (CR+4) space. This hack allows us to provide the
complete frame of the calling function as is to the caller
without having to create a mini-frame.

Upon return from the patched function, the TOC and correct
LR are restored.

I tested the sample in the livepatch and an additional sample
that patches int_to_scsilun. I'll post out that sample if there
is interest later. I also tested ftrace functionality on the
command line to check for breakage

Signed-off-by: Torsten Duwe 
Signed-off-by: Balbir Singh 
Signed-off-by: Petr Mladek 
---
 arch/powerpc/Kconfig |  3 ++
 arch/powerpc/include/asm/livepatch.h | 47 +++
 arch/powerpc/kernel/Makefile |  1 +
 arch/powerpc/kernel/entry_64.S   | 72 
 arch/powerpc/kernel/livepatch.c  | 29 +++
 include/linux/ftrace.h   |  1 +
 include/linux/livepatch.h|  2 +
 kernel/livepatch/core.c  | 28 --
 kernel/trace/ftrace.c| 14 ++-
 9 files changed, 193 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 91da283..926c0ea 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -159,6 +159,7 @@ config PPC
select ARCH_HAS_DEVMEM_IS_ALLOWED
select HAVE_ARCH_SECCOMP_FILTER
select ARCH_HAS_UBSAN_SANITIZE_ALL
+   select HAVE_LIVEPATCH if HAVE_DYNAMIC_FTRACE_WITH_REGS
 
 config GENERIC_CSUM
def_bool CPU_LITTLE_ENDIAN
@@ -1110,3 +,5 @@ config PPC_LIB_RHEAP
bool
 
 source "arch/powerpc/kvm/Kconfig"
+
+source "kernel/livepatch/Kconfig"
diff --git a/arch/powerpc/include/asm/livepatch.h b/arch/powerpc/include/asm/livepatch.h
new file mode 100644
index 000..b9856ce
--- /dev/null
+++ b/arch/powerpc/include/asm/livepatch.h
@@ -0,0 +1,47 @@
+/*
+ * livepatch.h - powerpc-specific Kernel Live Patching Core
+ *
+ * Copyright (C) 2015 SUSE
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see .
+ */
+#ifndef _ASM_POWERPC64_LIVEPATCH_H
+#define _ASM_POWERPC64_LIVEPATCH_H
+
+#include 
+
+#ifdef CON

Re: [PATCH RFC 09/22] block, cfq: replace CFQ with the BFQ-v0 I/O scheduler

2016-03-08 Thread Paolo Valente

On 4 Mar 2016, at 18:39, Christoph Hellwig wrote:

> On Sat, Mar 05, 2016 at 12:29:39AM +0700, Linus Walleij wrote:
>> Hi Tejun,
>> 
>> I'm doing a summary of this discussion as a part of presenting
>> Linaro's involvement in Paolo's work. So I try to understand things.
> 
> Btw, can someone explain why you guys waste so much time hacking and
> arguing about a legacy codebase (old request code and I/O schedulers)
> that everyone would really like to see disappear.  Why don't you
> spend your time on blk-mq where you have an entirely clean slate
> for scheduling?

I do agree that it would be very important to deal with blk-mq, and much
more difficult. IMHO, a clean way to proceed is to first try to improve
bandwidth and latency guarantees in the simplest, single-queue case, and
then to face the multi-queue case, leveraging the lessons learned in the
single-queue case.

Thanks,
Paolo


Re: [RFC 0/7] eliminate snprintf with overlapping src and dst

2016-03-08 Thread Andy Shevchenko
On Tue, Mar 8, 2016 at 10:40 PM, Rasmus Villemoes
 wrote:
> Doing snprintf(buf, len, "%s...", buf, ...) for appending to a buffer
> currently works, but it is somewhat fragile, and any other overlap
> between source and destination buffers would be a definite bug. This
> is an attempt at eliminating the relatively few occurrences of this
> pattern in the kernel.
>
> I could use another set of eyes on all of these. The drm/amdkfd patch
> is unfortunately rather large, but I couldn't find a better way to do
> this.

Could we use the seq_buf API instead in that case?

>
> Rasmus Villemoes (7):
>   drm/amdkfd: avoid fragile and inefficient snprintf use
>   Input: joystick - avoid fragile snprintf use
>   leds: avoid fragile sprintf use
>   drivers/media/pci/zoran: avoid fragile snprintf use
>   wlcore: avoid fragile snprintf use
>   [media] ati_remote: avoid fragile snprintf use
>   USB: usbatm: avoid fragile and inefficient snprintf use



-- 
With Best Regards,
Andy Shevchenko
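
For readers following the thread, a minimal userspace sketch of the
non-overlapping way to append to a buffer: keep track of the length used so
far and format at that offset, instead of passing the buffer back in as a
"%s" argument. The helper name append_fmt() is hypothetical and is not a
kernel API; in the kernel, scnprintf() already returns the truncated count
that this helper computes by hand.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel API): format at offset `used` in
 * `buf` and return the new string length. The caller is assumed to pass
 * used < size. The existing contents are never read as a format
 * argument, so source and destination do not overlap.
 */
static size_t append_fmt(char *buf, size_t size, size_t used,
			 const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0 || used >= size - 1)
		return used;		/* no room left, buffer untouched */

	va_start(ap, fmt);
	n = vsnprintf(buf + used, size - used, fmt, ap);
	va_end(ap);

	if (n < 0)
		return used;		/* formatting error */
	if ((size_t)n >= size - used)
		return size - 1;	/* output was truncated */
	return used + (size_t)n;
}
```

A caller would chain the returned length, e.g.
len = append_fmt(name, sizeof(name), 0, "Analog %d-axis", axes); followed by
len = append_fmt(name, sizeof(name), len, " %d-button", buttons);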


Re: [RFC 2/7] Input: joystick - avoid fragile snprintf use

2016-03-08 Thread Andy Shevchenko
On Tue, Mar 8, 2016 at 10:40 PM, Rasmus Villemoes
 wrote:
> Passing overlapping src and dst buffers to snprintf is fragile, and
> while it currently works for the special case of passing dst as the
> argument corresponding to an initial "%s" in the format string, any
> other use would very likely lead to chaos. It's easy enough to avoid,
> so let's do that.

>  static void analog_name(struct analog *analog)
>  {
> -   snprintf(analog->name, sizeof(analog->name), "Analog %d-axis %d-button",
> +   int ret = 0;

Assignment is not needed.

> +
> +   ret = scnprintf(analog->name, sizeof(analog->name), "Analog %d-axis %d-button",



-- 
With Best Regards,
Andy Shevchenko


Re: [PATCH net 1/3] net: mvneta: Fix spinlock usage

2016-03-08 Thread Jisheng Zhang
Dear Gregory,

On Tue, 8 Mar 2016 13:57:04 +0100 Gregory CLEMENT wrote:

> In the previous patch, the spinlock was not initialized. While it didn't
> cause any trouble yet it could be a problem to use it uninitialized.
> 
> The most annoying part was the critical section protected by the spinlock
> in mvneta_stop(). Some of the functions could sleep as pointed when
> activated CONFIG_DEBUG_ATOMIC_SLEEP. Actually, in mvneta_stop() we only
> need to protect the is_stopped flagged, indeed the code of the notifier
> for CPU online is protected by the same spinlock, so when we get the
> lock, the notifer work is done.
> 
> Reported-by: Patrick Uiterwijk 
> Signed-off-by: Gregory CLEMENT 
> ---
>  drivers/net/ethernet/marvell/mvneta.c | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/marvell/mvneta.c 
> b/drivers/net/ethernet/marvell/mvneta.c
> index b0ae69f84493..8dc7df2edff6 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
> @@ -3070,17 +3070,17 @@ static int mvneta_stop(struct net_device *dev)
>   struct mvneta_port *pp = netdev_priv(dev);
>  
>   /* Inform that we are stopping so we don't want to setup the
> -  * driver for new CPUs in the notifiers
> +  * driver for new CPUs in the notifiers. The code of the
> +  * notifier for CPU online is protected by the same spinlock,
> +  * so when we get the lock, the notifer work is done.
>*/
>   spin_lock(&pp->lock);
>   pp->is_stopped = true;
> + spin_unlock(&pp->lock);

This fixes the sleep-in-atomic issue, but I see a race here.
Let's assume is_stopped is false.

cpu0:                                   cpu1:
mvneta_percpu_notifier():               mvneta_stop():

if (pp->is_stopped) {
        spin_unlock(&pp->lock);
        break;
}
                                        pp->is_stopped = true;
                                        spin_unlock(&pp->lock);

netif_tx_stop_all_queues(pp->dev);
for_each_online_cpu(other_cpu) {


Thanks,
Jisheng

> +
>   mvneta_stop_dev(pp);
>   mvneta_mdio_remove(pp);
>   unregister_cpu_notifier(&pp->cpu_notifier);
> - /* Now that the notifier are unregistered, we can release le
> -  * lock
> -  */
> - spin_unlock(&pp->lock);
>   on_each_cpu(mvneta_percpu_disable, pp, true);
>   free_percpu_irq(dev->irq, pp->ports);
>   mvneta_cleanup_rxqs(pp);
> @@ -3612,6 +3612,7 @@ static int mvneta_probe(struct platform_device *pdev)
>   dev->ethtool_ops = &mvneta_eth_tool_ops;
>  
>   pp = netdev_priv(dev);
> + spin_lock_init(&pp->lock);
>   pp->phy_node = phy_node;
>   pp->phy_interface = phy_mode;
>  
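
As an illustration of the locking pattern under discussion, here is a
minimal userspace sketch, assuming a C11 atomic_flag stands in for the
kernel spinlock. All names (struct port, port_notifier(), port_stop()) are
hypothetical and this is not the mvneta code. The lock is held only while
the flag is written, and the notifier does its whole check-and-setup under
the same lock, so any teardown that may sleep can run after the unlock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's per-port state. */
struct port {
	atomic_flag lock;	/* tiny spinlock, stand-in for spinlock_t */
	bool is_stopped;
	int setups_done;	/* counts notifier runs that did real work */
};

static void port_lock(struct port *pp)
{
	while (atomic_flag_test_and_set_explicit(&pp->lock,
						 memory_order_acquire))
		;	/* spin until the holder clears the flag */
}

static void port_unlock(struct port *pp)
{
	atomic_flag_clear_explicit(&pp->lock, memory_order_release);
}

static void port_init(struct port *pp)
{
	atomic_flag_clear(&pp->lock);	/* the lock must be initialized */
	pp->is_stopped = false;
	pp->setups_done = 0;
}

/* Notifier side: check the flag and do the setup under the lock. */
static bool port_notifier(struct port *pp)
{
	port_lock(pp);
	if (pp->is_stopped) {
		port_unlock(pp);
		return false;
	}
	pp->setups_done++;	/* stands in for the per-CPU setup */
	port_unlock(pp);
	return true;
}

/* Stop side: hold the lock only while the flag is written. */
static void port_stop(struct port *pp)
{
	port_lock(pp);
	pp->is_stopped = true;
	port_unlock(pp);
	/* teardown that may sleep would go here, outside the lock */
}
```

Once port_stop() has returned from its critical section, any notifier that
was already running has finished, and any later one sees is_stopped set.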



Re: [PATCH v7 03/17] scsi: ufs: implement scsi host timeout handler

2016-03-08 Thread Hannes Reinecke
On 03/08/2016 08:58 PM, yga...@codeaurora.org wrote:
>> On 03/08/2016 02:01 PM, Hannes Reinecke wrote:
>>> On 03/08/2016 01:35 PM, Yaniv Gardi wrote:
>>>> A race condition exists between request requeueing and scsi layer
>>>> error handling:
>>>> When UFS driver queuecommand returns a busy status for a request,
>>>> it will be requeued and its tag will be freed and set to -1.
>>>> At the same time it is possible that the request will timeout and
>>>> scsi layer will start error handling for it. The scsi layer reuses
>>>> the request and its tag to send error related commands to the device,
>>>> however its tag is no longer valid.
>>>> As this request was never really sent to the device, there is no
>>>> point to start error handling with the device.
>>>> Implement the scsi error handling timeout callback and bypass SCSI
>>>> error handling for request that were not actually sent to the device.
>>>> For such requests simply reset the block layer timer. Otherwise, let
>>>> SCSI layer perform the usual error handling.
>>>>
>>>> Reviewed-by: Dolev Raviv 
>>>> Signed-off-by: Gilad Broner 
>>>> Signed-off-by: Yaniv Gardi 
>>>>
>>>> ---
>>>>  drivers/scsi/ufs/ufshcd.c | 36 
>>>>  1 file changed, 36 insertions(+)

>>> Having a timeout handler is always a good idea, even though this
>>> doesn't do anything here.
>>> Are we sure that the requests will return eventually?
>>> Does the UFS spec provide for a command abort?
>>>
>> In fact, looking at the UFS spec there _is_ a command abort.
>> I would recommend implementing a task management request UPIU with
>> type 'ABORT TASK' here for any task found to be pending.
>> In the end, you might run into a _valid_ timeout, at which point you
>> really want to abort the command...
>>
> 
> but this is not what we'd like to achieve.
> we don't want to abort a task that was not even dispatched to the UFS driver.
> in those cases we need to re-queue the request and reset the timer.
> 
Fully understood.

> Hannes, i appreciate your time, but I really don't understand why you
> insist on coming up with suggestions, when we already implemented one that
> is working. moreover, your solution doesn't fix the race condition which is
> the reason for this patch.
> as i don't have HW to test anything at the moment, I think it's better to
> stick with this solution that also fix the BUG and also was verified and
> tested.
> 
Ah. Didn't know that. I was under the impression that you _had_ the
hardware available. If not then of course it's not easy to verify
anything.

So, all things considered:

Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic

2016-03-08 Thread Vineet Gupta
+CC linux-arch, parisc folks, PeterZ

On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote:
> On Tue, 8 Mar 2016, Vineet Gupta wrote:
> 
>> # set the bit
>> 80543b8e:ld_s   r2,[r13,0] <--- (A) Finds PG_locked is set
>> 80543b90:or r3,r2,1<--- (B) other core unlocks right here
>> 80543b94:st_s   r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock)
> 
> Duh. Guess you  need to take the spinlock also in the arch specific
> implementation of __bit_spin_unlock(). This is certainly not the only case
> in which we use the __ op to unlock.

__bit_spin_unlock() by definition is *not* required to be atomic;
bit_spin_unlock() is, so I don't think we need a spinlock there.

There is clearly a problem in the slub code: it pairs a test_and_set_bit()
with a __clear_bit(). The latter can obviously clobber the former unless each
is a single instruction, as on x86, or they use llock/scond-style
instructions, where an interim store from another core is detected and
causes a retry of the whole llock/scond sequence.

BTW ARC is not the only arch which suffers from this - other arches
potentially do too. AFAIK PARISC also doesn't have atomic r-m-w and also uses
a set of external hashed spinlocks to protect the r-m-w sequences.

https://lkml.org/lkml/2014/6/1/178

So there we have the same race, because the outer spinlock is not taken for
slab_unlock() -> __bit_spin_unlock() -> __clear_bit.

Arguably I can fix the ARC !LLSC variant of test_and_set_bit() to not set the
bit unconditionally but only if it was clear (PARISC does the same). That
would be a slight micro-optimization, as we won't need another snoop
transaction to make the line writable, and it would also elide this problem.
But I think there is a fundamental problem here in slub, which is pairing
atomic and non-atomic ops for performance reasons. It doesn't work on all
arches and/or configurations.

> You need a true atomic op or you need to take the "spinlock" in all
> cases where you modify the bit.

No we don't in __bit_spin_unlock() and we already do in bit_spin_unlock().

> If you take the lock in __bit_spin_unlock
> then the race cannot happen.

Of course it won't, but that means we penalize all non-atomic callers of the
API with a superfluous spinlock which is not required in the first place,
given the definition of the API.


>> Are you convinced now !
> 
> Yes, please fix your arch specific code.
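
To make the (A)/(B)/(C) interleaving from the disassembly concrete, here is
a single-threaded replay on a plain word. It is only a sketch: the function
names and the mask are hypothetical, it is not the ARC implementation, and
a real reproduction needs two cores. The second function replays the
"store only if the bit was clear" variant mentioned above for the !LLSC
test_and_set_bit().

```c
#define PG_LOCKED_MASK 0x1UL	/* illustrative bit position only */

/* Replay of the lost unlock: returns the final value of the word. */
static unsigned long replay_lost_unlock(void)
{
	unsigned long flags = PG_LOCKED_MASK;	/* other core holds the lock */
	unsigned long r2, r3;

	r2 = flags;			/* (A) load: finds the bit set */
	flags &= ~PG_LOCKED_MASK;	/* (B) other core: plain __clear_bit() */
	r3 = r2 | PG_LOCKED_MASK;	/*     or r3,r2,1 on the stale value */
	flags = r3;			/* (C) store: overwrites the unlock */

	return flags;	/* bit is set again although nobody owns the lock */
}

/* Same interleaving, but the store is skipped when the bit was set. */
static unsigned long replay_conditional_store(void)
{
	unsigned long flags = PG_LOCKED_MASK;
	unsigned long r2;

	r2 = flags;			/* (A) load: finds the bit set */
	flags &= ~PG_LOCKED_MASK;	/* (B) other core unlocks */
	if (!(r2 & PG_LOCKED_MASK))	/* bit was already set: no store */
		flags = r2 | PG_LOCKED_MASK;

	return flags;	/* the unlock survives, the trylock just failed */
}
```

In the first replay the word ends up locked with no owner, which is the
hang described in this thread; in the second it ends up clear.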





Re: [1/1] powerpc/embedded6xx: Make reboot works on MVME5100

2016-03-08 Thread Scott Wood
On Tue, Mar 08, 2016 at 08:59:12AM +0100, Alessio Igor Bogani wrote:
> The mtmsr() function hangs during restart. Make reboot works on
> MVME5100 removing that function call.
> ---
>  arch/powerpc/platforms/embedded6xx/mvme5100.c | 2 --
>  1 file changed, 2 deletions(-)

Missing signoff

Do you know why MSR_IP was there to begin with?  Does this board have a
switch that determines whether boot vectors are high or low (I remember
some 83xx boards that did), in which case is this fixing one config by
breaking another?

-Scott


Re: [PATCH RFC 09/22] block, cfq: replace CFQ with the BFQ-v0 I/O scheduler

2016-03-08 Thread Paolo Valente

On 1 Mar 2016, at 19:46, Tejun Heo wrote:

> Hello, Paolo.
> 
> Sorry about the delay.
> 
> On Sat, Feb 20, 2016 at 11:23:43AM +0100, Paolo Valente wrote:
>> Before replying to your points, I want to stress that I'm not a
>> champion of budget-based scheduling at all costs. Budget-based
>> scheduling just seems to provide tight bandwidth and latency
>> guarantees that are practically impossible to get with time-based
>> scheduling. I will try to explain this fact better, and to provide
>> also a numerical example, in my replies to your points.
> 
> I do like the budget-based scheduling.  It just feels that the budget
> is based on the wrong unit.
> 

This is probably the focal point of our discussion. Unfortunately, I
am still not convinced of your claim. In fact, basing budgets on
sectors (service), instead of time, still seems to me the only way to
provide the stronger bandwidth and low-latency guarantees that I have
tried to highlight in my previous email. And these guarantees do not
seem to concern only a single special case, but several, common use
cases for server and desktop systems. I will try to repeat these facts
more concisely, and hopefully more clearly, in my replies to next
points.

> ...
>> I think I got your point. In any case, a queue is not punished *after*
>> it has consumed an undue amount of the resource, because a queue just
>> cannot get to consume an undue amount of the resource. There is a
>> timeout: a queue cannot be served for more than a pre-defined maximum
>> time slice.
>> 
>> Even if a queue expires before that timeout, BFQ checks anyway, on the
>> expiration of the queue, whether the queue is so slow to not deserve
>> accurate service-based guarantees. This is done to achieve additional
>> robustness. In fact, if service-based guarantees were provided to a
>> very slow queue that, for some reason, never causes the timeout to
>> expire, then the queue would happen to be served too often, i.e., to
>> get the undue amount of IO resource you mention.
> 
> I see.  Once a queue starts timing out its slice, it gets switched to
> time based scheduling; however, as you mentioned, workloads which
> generate moderate random IOs would still get preferential treatment
> over workloads which generate sequential IOs, by definition.
> 

Exactly. However, there is still something I don’t fully understand in
your doubt. With BFQ, workloads that generate moderate random IOs
would actually do less damage to throughput, on average, than with
CFQ. In fact, with CFQ the queues containing these IOs would
systematically get a full time slice, while there are two
possibilities with BFQ:
1) If the degree of randomness of (the IOs in) these queues is not too
high, then these queues are likely to finish budgets before
timeouts. In this case, with BFQ these queues get less service than
with CFQ, and thus can waste throughput less.
2) If the degree of randomness of these queues is very high, then they
consume full time slices with BFQ, exactly as with CFQ.

Of course, performance may differ if time slices, i.e., timeouts,
differ between BFQ and CFQ, but this is easy to tune, if needed.

> ...
>> Your metaphor is clear, but it does not seem to match what I expect
>> from my storage device. As a user of my PC, I’m concerned, e.g., about
>> how long it takes to copy a large file. I’m not concerned about what
>> percentage of time will be guaranteed to my file copy if other
>> processes happen to do I/O in parallel. As a similar example, a good
> 
> The underlying device is fundamentally incapable of giving guarantees
> like that.  The only way to get a (quasi) bandwidth guarantee from a
> disk device is either ensuring that the IO is almost completely
> sequential or there's enough buffer in capacity for the expected
> seekiness of the IO pattern.
> 
> For use cases where the differences in seekiness across workloads are
> accidental - e.g. all are trying to stream different files but some
> files are more fragmented by accident - using bandwidth as the
> resource unit would be helpful in mitigating the random gaps that the
> user shouldn't be bothered by, but that'd be focusing on a pretty
> narrow set of use cases.
> 
> Workloads are varied and underlying device performs wildly differently
> depending on the specific IO pattern.  Because rotating disks suck so
> badly, it's true that there's a lot of wiggle room in what the IO
> scheduler can do.  People are accustomed to dealing with random
> behaviors.  That said, it still doesn't feel comfortable to use the
> obviously wrong unit as the fundamental basis of resource
> distribution.
> 

Actually this does not seem to match our (admittedly limited)
experience with low-to-high-end rotational devices, RAIDs, SSDs, SD
cards and eMMCs. When stimulated with the same patterns in our tests,
these devices always responded with about the same IO service
times. And this seems to comply with intuition, because, apart from
different initial cac

Re: linux-next: removal of the tiny tree

2016-03-08 Thread Josh Triplett
On Wed, Mar 09, 2016 at 04:32:39PM +1100, Stephen Rothwell wrote:
> Hi Josh,
> 
> I noticed that the tiny tree
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/josh/linux.git branch tiny/next
> 
> has not been updated since v3.18-rc1.  I am going to remove it from
> linux-next tomorrow unless I hear that it may be useful.  It can always
> be easily added back if it proves useful in the future.

Please go ahead.  I'll send a note requesting re-addition when it has
new bits.

- Josh Triplett


Re: [PATCH 4/5] gpio: of: Add support to have multiple gpios in gpio-hog

2016-03-08 Thread Markus Pargmann
Hi,

On Tue, Mar 08, 2016 at 05:32:07PM +0530, Laxman Dewangan wrote:
> The child node for gpio hogs under the gpio controller's node
> provides a mechanism for automatic GPIO request and
> configuration as part of the gpio-controller's driver
> probe function.
> 
> Currently, the "gpios" property takes a single GPIO for such
> configuration. Add support for multiple GPIOs in
> this property so that multiple GPIOs of the gpio-controller
> can be configured by this mechanism with one child node.

So if I read this correctly you want to have multiple GPIOs with the
same line name? Why don't you use multiple child nodes with individual
line names?

Best Regards,

Markus

> 
> Signed-off-by: Laxman Dewangan 
> Cc: Benoit Parrot 
> Cc: Alexandre Courbot 
> ---
>  drivers/gpio/gpiolib-of.c | 64 
> ---
>  1 file changed, 49 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpio/gpiolib-of.c b/drivers/gpio/gpiolib-of.c
> index d81dbd8..0e4e8fd 100644
> --- a/drivers/gpio/gpiolib-of.c
> +++ b/drivers/gpio/gpiolib-of.c
> @@ -118,6 +118,21 @@ int of_get_named_gpio_flags(struct device_node *np, const char *list_name,
>  }
>  EXPORT_SYMBOL(of_get_named_gpio_flags);
>  
> +static int of_gpio_get_gpio_cells_size(struct device_node *chip_np)
> +{
> + u32 ncells;
> + int ret;
> +
> + ret = of_property_read_u32(chip_np, "#gpio-cells", &ncells);
> + if (ret)
> + return ret;
> +
> + if (ncells > MAX_PHANDLE_ARGS)
> + return -EINVAL;
> +
> + return ncells;
> +}
> +
>  /**
>   * of_parse_own_gpio() - Get a GPIO hog descriptor, names and flags for GPIO API
>   * @np:  device node to get GPIO from
> @@ -131,6 +146,7 @@ EXPORT_SYMBOL(of_get_named_gpio_flags);
>   */
>  static struct gpio_desc *of_parse_own_gpio(struct device_node *np,
>  const char **name,
> +int gpio_index,
>  enum gpio_lookup_flags *lflags,
>  enum gpiod_flags *dflags)
>  {
> @@ -139,8 +155,8 @@ static struct gpio_desc *of_parse_own_gpio(struct device_node *np,
>   struct gg_data gg_data = {
>   .flags = &xlate_flags,
>   };
> - u32 tmp;
> - int i, ret;
> + int ncells;
> + int i, start_index, ret;
>  
>   chip_np = np->parent;
>   if (!chip_np)
> @@ -150,17 +166,16 @@ static struct gpio_desc *of_parse_own_gpio(struct device_node *np,
>   *lflags = 0;
>   *dflags = 0;
>  
> - ret = of_property_read_u32(chip_np, "#gpio-cells", &tmp);
> - if (ret)
> - return ERR_PTR(ret);
> + ncells = of_gpio_get_gpio_cells_size(chip_np);
> + if (ncells < 0)
> + return ERR_PTR(ncells);
>  
> - if (tmp > MAX_PHANDLE_ARGS)
> - return ERR_PTR(-EINVAL);
> + start_index = ncells * gpio_index;
>  
> - gg_data.gpiospec.args_count = tmp;
> + gg_data.gpiospec.args_count = ncells;
>   gg_data.gpiospec.np = chip_np;
> - for (i = 0; i < tmp; i++) {
> - ret = of_property_read_u32_index(np, "gpios", i,
> + for (i = 0; i < ncells; i++) {
> + ret = of_property_read_u32_index(np, "gpios", start_index + i,
>  &gg_data.gpiospec.args[i]);
>   if (ret)
>   return ERR_PTR(ret);
> @@ -211,18 +226,37 @@ static int of_gpiochip_scan_gpios(struct gpio_chip *chip)
>   enum gpio_lookup_flags lflags;
>   enum gpiod_flags dflags;
>   int ret;
> + int i, ncells, ngpios;
> +
> + ncells = of_gpio_get_gpio_cells_size(chip->of_node);
> + if (ncells < 0)
> + return 0;
>  
>   for_each_available_child_of_node(chip->of_node, np) {
>   if (!of_property_read_bool(np, "gpio-hog"))
>   continue;
>  
> - desc = of_parse_own_gpio(np, &name, &lflags, &dflags);
> - if (IS_ERR(desc))
> + ngpios = of_property_count_u32_elems(np, "gpios");
> + if (ngpios < 0)
> + continue;
> +
> + if (ngpios % ncells) {
> + dev_warn(chip->parent,
> + "GPIOs entries are not proper in gpios\n");
>   continue;
> + }
> +
> + ngpios /= ncells;
> + for (i = 0; i < ngpios; i++) {
> + desc = of_parse_own_gpio(np, &name, i,
> +  &lflags, &dflags);
> + if (IS_ERR(desc))
> + continue;
>  
> - ret = gpiod_hog(desc, name, lflags, dflags);
> - if (ret < 0)
> - return ret;
> + ret = gpiod_hog(desc, name, lflags, dflags);
> + if (ret < 0)
> + return ret;
> + }
>   }
>  
>   return 
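
The indexing that the quoted patch relies on can be sketched in plain C, on
an in-memory array standing in for the flat "gpios" property. The helper
names hog_count() and hog_read_spec() are hypothetical and are not part of
gpiolib. With ncells cells per specifier, entry gpio_index starts at cell
ncells * gpio_index. The sketch also rejects ncells == 0, a case the quoted
hunk does not appear to check before computing ngpios % ncells.

```c
#include <stddef.h>

/* Number of specifiers in a flat cell array, or -1 if it is malformed. */
static int hog_count(size_t nelems, unsigned int ncells)
{
	if (ncells == 0 || nelems % ncells)
		return -1;
	return (int)(nelems / ncells);
}

/* Copy specifier number gpio_index into args[]; 0 on success, -1 if out of range. */
static int hog_read_spec(const unsigned int *cells, size_t nelems,
			 unsigned int ncells, unsigned int gpio_index,
			 unsigned int *args)
{
	size_t start = (size_t)ncells * gpio_index;
	unsigned int i;

	if (start + ncells > nelems)
		return -1;

	for (i = 0; i < ncells; i++)
		args[i] = cells[start + i];
	return 0;
}
```

For a two-cell controller and a property holding <10 0 11 1 12 0>, index 1
selects the pair (11, 1).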

[PATCH] mm/mempool: Avoid KASAN marking mempool poison checks as use-after-free

2016-03-08 Thread Matthew Dawson
When removing an element from the mempool, mark it as unpoisoned in KASAN
before verifying its contents for SLUB/SLAB debugging.  Otherwise KASAN
will flag the reads that check the element for use-after-free writes as
use-after-free reads.

Signed-off-by: Matthew Dawson 
---
 mm/mempool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/mempool.c b/mm/mempool.c
index 004d42b..7924f4f 100644
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -135,8 +135,8 @@ static void *remove_element(mempool_t *pool)
void *element = pool->elements[--pool->curr_nr];
 
BUG_ON(pool->curr_nr < 0);
-   check_element(pool, element);
kasan_unpoison_element(pool, element);
+   check_element(pool, element);
return element;
 }
 
-- 
2.7.1



Re: [PATCH v3 3/9] irqchip/gic-v2: Gather ACPI specific data in a single structure

2016-03-08 Thread Julien Grall

Hi Christoffer,

On 09/03/2016 12:47, Christoffer Dall wrote:
> On Tue, Mar 08, 2016 at 11:29:27AM +0000, Julien Grall wrote:
>> For now, there is only one member. More member will be added later.
>
> questionable commit message

What about:

"The ACPI code requires to use global variables in order to collect
information from the tables.

For now only a single global variable is used, but more will be added in
a subsequent patch. To make clear they are ACPI specific, gather all the
information in a single structure."

[...]

>> @@ -1316,7 +1319,7 @@ static int __init gic_v2_acpi_init(struct acpi_subtable_header *header,
>> return -EINVAL;
>> }
>>
>> -   cpu_base = ioremap(cpu_phy_base, ACPI_GIC_CPU_IF_MEM_SIZE);
>> +   cpu_base = ioremap(acpi_data.cpu_phy_base, ACPI_GIC_CPU_IF_MEM_SIZE);
>> if (!cpu_base) {
>> pr_err("Unable to map GICC registers\n");
>> return -ENOMEM;
>> --
>> 1.9.1
>
> super nit: I would use cpu_phys_base instead of cpu_phy_base, but I'll
> leave it up to you.

I will update the commit message, so I will rename the variable too.

> Acked-by: Christoffer Dall 


Cheers,

--
Julien Grall


RE: [Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization

2016-03-08 Thread Li, Liang Z
> On 04/03/2016 15:26, Li, Liang Z wrote:
> >> >
> >> > The memory usage will keep increasing due to ever growing caches,
> >> > etc, so you'll be left with very little free memory fairly soon.
> >> >
> > I don't think so.
> >
> 
> Roman is right.  For example, here I am looking at a 64 GB (physical) machine
> which was booted about 30 minutes ago, and which is running disk-heavy
> workloads (installing VMs).
> 
> Since I have started writing this email (2 minutes?), the amount of free
> memory has already gone down from 37 GB to 33 GB.  I expect that by the
> time I have finished running the workload, in two hours, it will not have any
> free memory.
> 
> Paolo

I have a VM with 2GB of RAM. When the guest booted, there were about 1.4GB
of free pages. Then I downloaded a large file from the internet with the
browser; after the download finished, there were only 72MB of free pages
left and, as Roman pointed out, quite a lot of Cached memory. Then I
compiled QEMU; after the compile finished, there were about 1.3GB of free
pages.

So even though the cache can grow to a large amount, it will be freed under
some other workloads. Cache memory is a big issue that should be taken into
consideration. How about reclaiming some cache before getting the free
pages information?

Liang 


[PATCH 1/2] vga_switcheroo: add power support for windows 10 machines.

2016-03-08 Thread Dave Airlie
From: Dave Airlie 

Windows 10 seems to have standardised power control for the
optimus/powerxpress laptops using PR3 power resource hooks.

I'm not sure this is definitely the correct place to be
doing this, but it works for me here.

The ACPI device for the GPU I have is \_SB_.PCI0.PEG_.VID_
but the power resource hooks are on \_SB_.PCI0.PEG_, so
this patch creates a new power domain to turn the GPU
device parent off using standard ACPI calls.

Signed-off-by: Dave Airlie 
---
 drivers/gpu/vga/vga_switcheroo.c | 54 +++-
 include/linux/vga_switcheroo.h   |  3 ++-
 2 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/vga/vga_switcheroo.c b/drivers/gpu/vga/vga_switcheroo.c
index 665ab9f..be32cb2 100644
--- a/drivers/gpu/vga/vga_switcheroo.c
+++ b/drivers/gpu/vga/vga_switcheroo.c
@@ -42,7 +42,7 @@
 #include 
 #include 
 #include 
-
+#include 
 /**
  * DOC: Overview
  *
@@ -997,3 +997,55 @@ vga_switcheroo_init_domain_pm_optimus_hdmi_audio(struct device *dev,
return -EINVAL;
 }
 EXPORT_SYMBOL(vga_switcheroo_init_domain_pm_optimus_hdmi_audio);
+
+/* With Windows 10 the runtime suspend/resume can use power
+   resources on the parent device */
+static int vga_acpi_switcheroo_runtime_suspend(struct device *dev)
+{
+   struct pci_dev *pdev = to_pci_dev(dev);
+   int ret;
+   struct acpi_device *adev;
+
+   ret = dev->bus->pm->runtime_suspend(dev);
+   if (ret)
+   return ret;
+
+   ret = acpi_bus_get_device(ACPI_HANDLE(&pdev->dev), &adev);
+   if (!ret)
+   acpi_device_set_power(adev->parent, ACPI_STATE_D3_COLD);
+   return 0;
+}
+
+static int vga_acpi_switcheroo_runtime_resume(struct device *dev)
+{
+   struct pci_dev *pdev = to_pci_dev(dev);
+   struct acpi_device *adev;
+   int ret;
+
+   ret = acpi_bus_get_device(ACPI_HANDLE(&pdev->dev), &adev);
+   if (!ret)
+   acpi_device_set_power(adev->parent, ACPI_STATE_D0);
+   ret = dev->bus->pm->runtime_resume(dev);
+   if (ret)
+   return ret;
+
+   return 0;
+}
+
+int vga_switcheroo_init_parent_pr3_ops(struct device *dev,
+  struct dev_pm_domain *domain)
+
+{
+   /* copy over all the bus versions */
+   if (dev->bus && dev->bus->pm) {
+   domain->ops = *dev->bus->pm;
+   domain->ops.runtime_suspend = vga_acpi_switcheroo_runtime_suspend;
+   domain->ops.runtime_resume = vga_acpi_switcheroo_runtime_resume;
+
+   dev_pm_domain_set(dev, domain);
+   return 0;
+   }
+   dev_pm_domain_set(dev, NULL);
+   return -EINVAL;
+}
+EXPORT_SYMBOL(vga_switcheroo_init_parent_pr3_ops);
diff --git a/include/linux/vga_switcheroo.h b/include/linux/vga_switcheroo.h
index 69e1d4a1..5ce0cbe 100644
--- a/include/linux/vga_switcheroo.h
+++ b/include/linux/vga_switcheroo.h
@@ -144,6 +144,7 @@ void vga_switcheroo_set_dynamic_switch(struct pci_dev *pdev, enum vga_switcheroo
 int vga_switcheroo_init_domain_pm_ops(struct device *dev, struct dev_pm_domain *domain);
 void vga_switcheroo_fini_domain_pm_ops(struct device *dev);
 int vga_switcheroo_init_domain_pm_optimus_hdmi_audio(struct device *dev, struct dev_pm_domain *domain);
+int vga_switcheroo_init_parent_pr3_ops(struct device *dev, struct dev_pm_domain *domain);
 #else
 
 static inline void vga_switcheroo_unregister_client(struct pci_dev *dev) {}
@@ -163,6 +164,6 @@ static inline void vga_switcheroo_set_dynamic_switch(struct pci_dev *pdev, enum
 static inline int vga_switcheroo_init_domain_pm_ops(struct device *dev, struct dev_pm_domain *domain) { return -EINVAL; }
 static inline void vga_switcheroo_fini_domain_pm_ops(struct device *dev) {}
 static inline int vga_switcheroo_init_domain_pm_optimus_hdmi_audio(struct device *dev, struct dev_pm_domain *domain) { return -EINVAL; }
-
+static inline int vga_switcheroo_init_parent_pr3_ops(struct device *dev, struct dev_pm_domain *domain) { return -EINVAL; }
 #endif
 #endif /* _LINUX_VGA_SWITCHEROO_H_ */
-- 
2.5.0



[PATCH 2/2] nouveau: use new vga_switcheroo power domain.

2016-03-08 Thread Dave Airlie
From: Dave Airlie 

This fixes GPU auto powerdown on the Lenovo W541,
since we advertise Windows 2013 to the ACPI layer.

Signed-off-by: Dave Airlie 
---
 drivers/gpu/drm/nouveau/nouveau_vga.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_vga.c b/drivers/gpu/drm/nouveau/nouveau_vga.c
index af89c36..b987427f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_vga.c
+++ b/drivers/gpu/drm/nouveau/nouveau_vga.c
@@ -101,8 +101,12 @@ nouveau_vga_init(struct nouveau_drm *drm)
runtime = true;
vga_switcheroo_register_client(dev->pdev, &nouveau_switcheroo_ops, runtime);
 
-   if (runtime && nouveau_is_v1_dsm() && !nouveau_is_optimus())
-   vga_switcheroo_init_domain_pm_ops(drm->dev->dev, &drm->vga_pm_domain);
+   if (runtime) {
+   if (nouveau_is_v1_dsm() && !nouveau_is_optimus())
+   vga_switcheroo_init_domain_pm_ops(drm->dev->dev, &drm->vga_pm_domain);
+   else if (nouveau_is_optimus())
+   vga_switcheroo_init_parent_pr3_ops(drm->dev->dev, &drm->vga_pm_domain);
+   }
 }
 
 void
@@ -117,7 +121,7 @@ nouveau_vga_fini(struct nouveau_drm *drm)
runtime = true;
 
vga_switcheroo_unregister_client(dev->pdev);
-   if (runtime && nouveau_is_v1_dsm() && !nouveau_is_optimus())
+   if (runtime && (nouveau_is_v1_dsm() || nouveau_is_optimus()))
vga_switcheroo_fini_domain_pm_ops(drm->dev->dev);
vga_client_register(dev->pdev, NULL, NULL, NULL);
 }
-- 
2.5.0



[PATCH 10/11] objtool: Add several performance improvements

2016-03-08 Thread Josh Poimboeuf
Use hash tables for instruction and rela lookups (and keep the linked
lists around for sequential access).

Also cache the section struct for the "__func_stack_frame_non_standard"
section.

With this change, "objtool check net/wireless/nl80211.o" goes from:

  real  0m1.168s
  user  0m1.163s
  sys   0m0.005s

to:

  real  0m0.059s
  user  0m0.042s
  sys   0m0.017s

for a 20x speedup.

With the same object, it should be noted that the memory heap usage grew
from 8MB to 62MB.  Reducing the memory usage is on the TODO list.
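
As a rough illustration of why the lookup gets faster, here is a minimal
userspace sketch of a chained hash table keyed by instruction offset, with
the section compared inside the bucket walk. It is not the objtool code:
the struct and function names are hypothetical, and a simple mask replaces
the kernel hashtable's hash function. A lookup only walks the entries that
share a bucket instead of the whole instruction list.

```c
#include <stddef.h>

#define INSN_HASH_BITS 4
#define INSN_HASH_SIZE (1u << INSN_HASH_BITS)

/* Hypothetical, trimmed-down instruction record. */
struct insn {
	struct insn *hash_next;	/* next entry in the same bucket */
	int sec;		/* section id, stand-in for struct section * */
	unsigned long offset;
};

struct insn_table {
	struct insn *buckets[INSN_HASH_SIZE];
};

/* Assumption: a plain mask as the hash; the key is the offset only. */
static unsigned int insn_bucket(unsigned long offset)
{
	return (unsigned int)(offset & (INSN_HASH_SIZE - 1));
}

static void insn_add(struct insn_table *t, struct insn *insn)
{
	unsigned int b = insn_bucket(insn->offset);

	insn->hash_next = t->buckets[b];
	t->buckets[b] = insn;
}

/* Walk one bucket; section and offset must both match. */
static struct insn *insn_find(const struct insn_table *t, int sec,
			      unsigned long offset)
{
	struct insn *insn;

	for (insn = t->buckets[insn_bucket(offset)]; insn;
	     insn = insn->hash_next)
		if (insn->sec == sec && insn->offset == offset)
			return insn;

	return NULL;
}
```

The memory cost noted above follows from the same design: every record
carries an extra link and the table itself is allocated up front.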

Reported-by: Ingo Molnar 
Signed-off-by: Josh Poimboeuf 
---
 tools/objtool/builtin-check.c | 18 --
 tools/objtool/elf.c   | 21 +++--
 tools/objtool/elf.h   | 10 --
 3 files changed, 35 insertions(+), 14 deletions(-)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index cf1e48d..bfeee22 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -34,6 +34,8 @@
 #include "arch.h"
 #include "warn.h"
 
+#include 
+
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
 
 #define STATE_FP_SAVED 0x1
@@ -42,6 +44,7 @@
 
 struct instruction {
struct list_head list;
+   struct hlist_node hash;
struct section *sec;
unsigned long offset;
unsigned int len, state;
@@ -61,7 +64,8 @@ struct alternative {
 struct objtool_file {
struct elf *elf;
struct list_head insn_list;
-   struct section *rodata;
+   DECLARE_HASHTABLE(insn_hash, 16);
+   struct section *rodata, *whitelist;
 };
 
 const char *objname;
@@ -72,7 +76,7 @@ static struct instruction *find_insn(struct objtool_file *file,
 {
struct instruction *insn;
 
-   list_for_each_entry(insn, &file->insn_list, list)
+   hash_for_each_possible(file->insn_hash, insn, hash, offset)
if (insn->sec == sec && insn->offset == offset)
return insn;
 
@@ -111,14 +115,12 @@ static struct instruction *next_insn_same_sec(struct objtool_file *file,
  */
 static bool ignore_func(struct objtool_file *file, struct symbol *func)
 {
-   struct section *macro_sec;
struct rela *rela;
struct instruction *insn;
 
/* check for STACK_FRAME_NON_STANDARD */
-   macro_sec = find_section_by_name(file->elf, "__func_stack_frame_non_standard");
-   if (macro_sec && macro_sec->rela)
-   list_for_each_entry(rela, &macro_sec->rela->rela_list, list)
+   if (file->whitelist && file->whitelist->rela)
+   list_for_each_entry(rela, &file->whitelist->rela->rela_list, list)
if (rela->sym->sec == func->sec &&
rela->addend == func->offset)
return true;
@@ -276,6 +278,7 @@ static int decode_instructions(struct objtool_file *file)
return -1;
}
 
+   hash_add(file->insn_hash, &insn->hash, insn->offset);
list_add_tail(&insn->list, &file->insn_list);
}
}
@@ -729,6 +732,7 @@ static int decode_sections(struct objtool_file *file)
 {
int ret;
 
+   file->whitelist = find_section_by_name(file->elf, "__func_stack_frame_non_standard");
file->rodata = find_section_by_name(file->elf, ".rodata");
 
ret = decode_instructions(file);
@@ -1091,6 +1095,7 @@ static void cleanup(struct objtool_file *file)
free(alt);
}
list_del(&insn->list);
+   hash_del(&insn->hash);
free(insn);
}
elf_close(file->elf);
@@ -1125,6 +1130,7 @@ int cmd_check(int argc, const char **argv)
}
 
INIT_LIST_HEAD(&file.insn_list);
+   hash_init(file.insn_hash);
 
ret = decode_sections(&file);
if (ret < 0)
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 7de243f..e11f6b6 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -59,7 +59,7 @@ static struct symbol *find_symbol_by_index(struct elf *elf, unsigned int idx)
struct symbol *sym;
 
list_for_each_entry(sec, &elf->sections, list)
-   list_for_each_entry(sym, &sec->symbol_list, list)
+   hash_for_each_possible(sec->symbol_hash, sym, hash, idx)
if (sym->idx == idx)
return sym;
 
@@ -82,13 +82,15 @@ struct rela *find_rela_by_dest_range(struct section *sec, unsigned long offset,
 unsigned int len)
 {
struct rela *rela;
+   unsigned long o;
 
if (!sec->rela)
return NULL;
 
-   list_for_each_entry(rela, &sec->rela->rela_list, list)
-   if (rela->offset >= offset && rela->offset < offset + len)
-   return rela;
+   for (o = offset; o < offset + len; o++)
+   hash_for_each_possible(sec->rela->rela_hash, rela,

[PATCH 08/11] objtool: Fix false positive warnings for functions with multiple switch statements

2016-03-08 Thread Josh Poimboeuf
Ingo reported [1] some false positive objtool warnings:

  drivers/net/wireless/realtek/rtlwifi/base.o: warning: objtool: rtlwifi_rate_mapping()+0x2e7: frame pointer state mismatch
  drivers/net/wireless/realtek/rtlwifi/base.o: warning: objtool: rtlwifi_rate_mapping()+0x2f3: frame pointer state mismatch
  ...

And so did the 0-day bot [2]:

  drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup
  drivers/gpu/drm/radeon/cik.o: warning: objtool: cik_tiling_mode_table_init()+0x72b: call without frame pointer save/setup
  ...

Both sets of warnings involve functions which have multiple switch
statements.  When there's more than one switch statement in a function,
objtool interprets all the switch jump tables as a single table.  If the
targets of one jump table assume a stack frame and the targets of
another one don't, it prints false positive warnings.

Fix the bug by detecting the size of each switch jump table.  For
multiple tables, each one ends where the next one begins.
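
To make the "each one ends where the next one begins" rule concrete, here is a
small sketch under stated assumptions: the table start offsets are already
known and sorted, entries are 8 bytes wide (as suggested by the
"jmpq *[addr](,%rax,8)" form), and the last table is bounded by an end offset
supplied by the caller.  sketch_table_len() is a hypothetical helper, not a
function from the patch; the patch itself also stops when a relocation points
outside the function.

```c
#include <stddef.h>

/*
 * Hypothetical helper: starts[] holds the .rodata offsets at which each
 * jump table of one function begins, in ascending order.  Table i runs
 * up to the start of table i + 1; the last table runs up to rodata_end.
 * Returns the number of entries in table i.
 */
static size_t sketch_table_len(const unsigned long *starts, size_t ntables,
			       size_t i, unsigned long rodata_end,
			       unsigned long entry_size)
{
	unsigned long end;

	end = (i + 1 < ntables) ? starts[i + 1] : rodata_end;

	return (size_t)((end - starts[i]) / entry_size);
}
```

With two switch statements in one function, the targets of the first table are
then never mixed with the targets of the second.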

[1] https://lkml.kernel.org/r/20160308103716.ga9...@gmail.com
[2] https://lists.01.org/pipermail/kbuild-all/2016-March/018124.html

Reported-by: Ingo Molnar 
Reported-by: kbuild test robot 
Signed-off-by: Josh Poimboeuf 
---
 tools/objtool/builtin-check.c | 145 +-
 1 file changed, 100 insertions(+), 45 deletions(-)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index cdbdd7d..cf1e48d 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -61,6 +61,7 @@ struct alternative {
 struct objtool_file {
struct elf *elf;
struct list_head insn_list;
+   struct section *rodata;
 };
 
 const char *objname;
@@ -599,73 +600,125 @@ out:
return ret;
 }
 
-/*
- * For some switch statements, gcc generates a jump table in the .rodata
- * section which contains a list of addresses within the function to jump to.
- * This finds these jump tables and adds them to the insn->alts lists.
- */
-static int add_switch_table_alts(struct objtool_file *file)
+static int add_switch_table(struct objtool_file *file, struct symbol *func,
+   struct instruction *insn, struct rela *table,
+   struct rela *next_table)
 {
-   struct instruction *insn, *alt_insn;
-   struct rela *rodata_rela, *text_rela;
-   struct section *rodata;
-   struct symbol *func;
+   struct rela *rela = table;
+   struct instruction *alt_insn;
struct alternative *alt;
 
-   for_each_insn(file, insn) {
+   list_for_each_entry_from(rela, &file->rodata->rela->rela_list, list) {
+   if (rela == next_table)
+   break;
+
+   if (rela->sym->sec != insn->sec ||
+   rela->addend <= func->offset ||
+   rela->addend >= func->offset + func->len)
+   break;
+
+   alt_insn = find_insn(file, insn->sec, rela->addend);
+   if (!alt_insn) {
+   WARN("%s: can't find instruction at %s+0x%x",
+file->rodata->rela->name, insn->sec->name,
+rela->addend);
+   return -1;
+   }
+
+   alt = malloc(sizeof(*alt));
+   if (!alt) {
+   WARN("malloc failed");
+   return -1;
+   }
+
+   alt->insn = alt_insn;
+   list_add_tail(&alt->list, &insn->alts);
+   }
+
+   return 0;
+}
+
+static int add_func_switch_tables(struct objtool_file *file,
+ struct symbol *func)
+{
+   struct instruction *insn, *prev_jump;
+   struct rela *text_rela, *rodata_rela, *prev_rela;
+   int ret;
+
+   prev_jump = NULL;
+
+   func_for_each_insn(file, func, insn) {
if (insn->type != INSN_JUMP_DYNAMIC)
continue;
 
text_rela = find_rela_by_dest_range(insn->sec, insn->offset,
insn->len);
-   if (!text_rela || strcmp(text_rela->sym->name, ".rodata"))
-   continue;
-
-   rodata = find_section_by_name(file->elf, ".rodata");
-   if (!rodata || !rodata->rela)
+   if (!text_rela || text_rela->sym != file->rodata->sym)
continue;
 
/* common case: jmpq *[addr](,%rax,8) */
-   rodata_rela = find_rela_by_dest(rodata, text_rela->addend);
+   rodata_rela = find_rela_by_dest(file->rodata,
+   text_rela->addend);
 
-   /* rare case:   jmpq *[addr](%rip) */
+   /*
+* TODO: Document where this is needed, or get rid of it.
+*
+* rare case:   jmpq *[addr](%rip)
+  

[PATCH 07/11] objtool: Rename some variables and functions

2016-03-08 Thread Josh Poimboeuf
Rename some list heads to distinguish them from hash node heads, which
are added later in the patch series.

Also rename the get_*() functions to add_*(), which is more descriptive:
they "add" data to the objtool_file struct.

Also rename rodata_rela and text_rela to be clearer:
- text_rela refers to a rela entry in .rela.text.
- rodata_rela refers to a rela entry in .rela.rodata.

Signed-off-by: Josh Poimboeuf 
---
 tools/objtool/builtin-check.c | 80 ++-
 tools/objtool/elf.c   | 22 ++--
 tools/objtool/elf.h   |  4 +--
 3 files changed, 54 insertions(+), 52 deletions(-)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index a974f295..cdbdd7d 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -60,7 +60,7 @@ struct alternative {
 
 struct objtool_file {
struct elf *elf;
-   struct list_head insns;
+   struct list_head insn_list;
 };
 
 const char *objname;
@@ -71,7 +71,7 @@ static struct instruction *find_insn(struct objtool_file *file,
 {
struct instruction *insn;
 
-   list_for_each_entry(insn, &file->insns, list)
+   list_for_each_entry(insn, &file->insn_list, list)
if (insn->sec == sec && insn->offset == offset)
return insn;
 
@@ -83,18 +83,18 @@ static struct instruction *next_insn_same_sec(struct objtool_file *file,
 {
struct instruction *next = list_next_entry(insn, list);
 
-   if (&next->list == &file->insns || next->sec != insn->sec)
+   if (&next->list == &file->insn_list || next->sec != insn->sec)
return NULL;
 
return next;
 }
 
 #define for_each_insn(file, insn)  \
-   list_for_each_entry(insn, &file->insns, list)
+   list_for_each_entry(insn, &file->insn_list, list)
 
 #define func_for_each_insn(file, func, insn)   \
for (insn = find_insn(file, func->sec, func->offset);   \
-insn && &insn->list != &file->insns && \
+insn && &insn->list != &file->insn_list && \
insn->sec == func->sec &&   \
insn->offset < func->offset + func->len;\
 insn = list_next_entry(insn, list))
@@ -117,7 +117,7 @@ static bool ignore_func(struct objtool_file *file, struct symbol *func)
/* check for STACK_FRAME_NON_STANDARD */
macro_sec = find_section_by_name(file->elf, "__func_stack_frame_non_standard");
if (macro_sec && macro_sec->rela)
-   list_for_each_entry(rela, &macro_sec->rela->relas, list)
+   list_for_each_entry(rela, &macro_sec->rela->rela_list, list)
if (rela->sym->sec == func->sec &&
rela->addend == func->offset)
return true;
@@ -240,7 +240,7 @@ static int dead_end_function(struct objtool_file *file, struct symbol *func)
 
 /*
  * Call the arch-specific instruction decoder for all the instructions and add
- * them to the global insns list.
+ * them to the global instruction list.
  */
 static int decode_instructions(struct objtool_file *file)
 {
@@ -275,7 +275,7 @@ static int decode_instructions(struct objtool_file *file)
return -1;
}
 
-   list_add_tail(&insn->list, &file->insns);
+   list_add_tail(&insn->list, &file->insn_list);
}
}
 
@@ -285,14 +285,14 @@ static int decode_instructions(struct objtool_file *file)
 /*
  * Warnings shouldn't be reported for ignored functions.
  */
-static void get_ignores(struct objtool_file *file)
+static void add_ignores(struct objtool_file *file)
 {
struct instruction *insn;
struct section *sec;
struct symbol *func;
 
list_for_each_entry(sec, &file->elf->sections, list) {
-   list_for_each_entry(func, &sec->symbols, list) {
+   list_for_each_entry(func, &sec->symbol_list, list) {
if (func->type != STT_FUNC)
continue;
 
@@ -308,7 +308,7 @@ static void get_ignores(struct objtool_file *file)
 /*
  * Find the destination instructions for all jumps.
  */
-static int get_jump_destinations(struct objtool_file *file)
+static int add_jump_destinations(struct objtool_file *file)
 {
struct instruction *insn;
struct rela *rela;
@@ -365,7 +365,7 @@ static int get_jump_destinations(struct objtool_file *file)
 /*
  * Find the destination instructions for all calls.
  */
-static int get_call_destinations(struct objtool_file *file)
+static int add_call_destinations(struct objtool_file *file)
 {
struct instruction *insn;
unsigned long dest_off;
@@ -534,7 +534,7 @@ static int handle_jump_alt(struct objtool_file *file,
  * instruction(s) has them adde

[PATCH 09/11] tools/objtool: Copy hashtable.h into tools directory

2016-03-08 Thread Josh Poimboeuf
Copy hashtable.h from include/linux/hashtable.h.  It's needed by objtool in
the next patch in the series.

Add some includes that it needs, and remove references to
kernel-specific features like RCU and __read_mostly.

Also change some of its dependency headers' includes to use quotes
instead of brackets so gcc can find them.

Signed-off-by: Josh Poimboeuf 
---
 tools/include/asm-generic/bitops/__fls.h |   2 +-
 tools/include/asm-generic/bitops/fls.h   |   2 +-
 tools/include/asm-generic/bitops/fls64.h |   2 +-
 tools/include/linux/hashtable.h  | 152 +++
 4 files changed, 155 insertions(+), 3 deletions(-)
 create mode 100644 tools/include/linux/hashtable.h

diff --git a/tools/include/asm-generic/bitops/__fls.h b/tools/include/asm-generic/bitops/__fls.h
index 2218b9a..494c9c6 100644
--- a/tools/include/asm-generic/bitops/__fls.h
+++ b/tools/include/asm-generic/bitops/__fls.h
@@ -1 +1 @@
-#include <../../../../include/asm-generic/bitops/__fls.h>
+#include "../../../../include/asm-generic/bitops/__fls.h"
diff --git a/tools/include/asm-generic/bitops/fls.h b/tools/include/asm-generic/bitops/fls.h
index dbf711a..0e4995f 100644
--- a/tools/include/asm-generic/bitops/fls.h
+++ b/tools/include/asm-generic/bitops/fls.h
@@ -1 +1 @@
-#include <../../../../include/asm-generic/bitops/fls.h>
+#include "../../../../include/asm-generic/bitops/fls.h"
diff --git a/tools/include/asm-generic/bitops/fls64.h b/tools/include/asm-generic/bitops/fls64.h
index 980b1f6..35bee00 100644
--- a/tools/include/asm-generic/bitops/fls64.h
+++ b/tools/include/asm-generic/bitops/fls64.h
@@ -1 +1 @@
-#include <../../../../include/asm-generic/bitops/fls64.h>
+#include "../../../../include/asm-generic/bitops/fls64.h"
diff --git a/tools/include/linux/hashtable.h b/tools/include/linux/hashtable.h
new file mode 100644
index 000..c65cc0a
--- /dev/null
+++ b/tools/include/linux/hashtable.h
@@ -0,0 +1,152 @@
+/*
+ * Statically sized hash table implementation
+ * (C) 2012  Sasha Levin 
+ */
+
+#ifndef _LINUX_HASHTABLE_H
+#define _LINUX_HASHTABLE_H
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifndef ARRAY_SIZE
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+#endif
+
+#define DEFINE_HASHTABLE(name, bits)   \
+   struct hlist_head name[1 << (bits)] =   \
+   { [0 ... ((1 << (bits)) - 1)] = HLIST_HEAD_INIT }
+
+#define DECLARE_HASHTABLE(name, bits)  \
+   struct hlist_head name[1 << (bits)]
+
+#define HASH_SIZE(name) (ARRAY_SIZE(name))
+#define HASH_BITS(name) ilog2(HASH_SIZE(name))
+
+/* Use hash_32 when possible to allow for fast 32bit hashing in 64bit kernels. */
+#define hash_min(val, bits)    \
+   (sizeof(val) <= 4 ? hash_32(val, bits) : hash_long(val, bits))
+
+static inline void __hash_init(struct hlist_head *ht, unsigned int sz)
+{
+   unsigned int i;
+
+   for (i = 0; i < sz; i++)
+   INIT_HLIST_HEAD(&ht[i]);
+}
+
+/**
+ * hash_init - initialize a hash table
+ * @hashtable: hashtable to be initialized
+ *
+ * Calculates the size of the hashtable from the given parameter, otherwise
+ * same as hash_init_size.
+ *
+ * This has to be a macro since HASH_BITS() will not work on pointers since
+ * it calculates the size during preprocessing.
+ */
+#define hash_init(hashtable) __hash_init(hashtable, HASH_SIZE(hashtable))
+
+/**
+ * hash_add - add an object to a hashtable
+ * @hashtable: hashtable to add to
+ * @node: the &struct hlist_node of the object to be added
+ * @key: the key of the object to be added
+ */
+#define hash_add(hashtable, node, key) \
+   hlist_add_head(node, &hashtable[hash_min(key, HASH_BITS(hashtable))])
+
+/**
+ * hash_hashed - check whether an object is in any hashtable
+ * @node: the &struct hlist_node of the object to be checked
+ */
+static inline bool hash_hashed(struct hlist_node *node)
+{
+   return !hlist_unhashed(node);
+}
+
+static inline bool __hash_empty(struct hlist_head *ht, unsigned int sz)
+{
+   unsigned int i;
+
+   for (i = 0; i < sz; i++)
+   if (!hlist_empty(&ht[i]))
+   return false;
+
+   return true;
+}
+
+/**
+ * hash_empty - check whether a hashtable is empty
+ * @hashtable: hashtable to check
+ *
+ * This has to be a macro since HASH_BITS() will not work on pointers since
+ * it calculates the size during preprocessing.
+ */
+#define hash_empty(hashtable) __hash_empty(hashtable, HASH_SIZE(hashtable))
+
+/**
+ * hash_del - remove an object from a hashtable
+ * @node: &struct hlist_node of the object to remove
+ */
+static inline void hash_del(struct hlist_node *node)
+{
+   hlist_del_init(node);
+}
+
+/**
+ * hash_for_each - iterate over a hashtable
+ * @name: hashtable to iterate
+ * @bkt: integer to use as bucket loop cursor

[PATCH 06/11] objtool: Remove superfluous INIT_LIST_HEAD

2016-03-08 Thread Josh Poimboeuf
The insns list is initialized twice, in cmd_check() and in
decode_instructions().  Remove the latter.

Signed-off-by: Josh Poimboeuf 
---
 tools/objtool/builtin-check.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 46a8985..a974f295 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -249,8 +249,6 @@ static int decode_instructions(struct objtool_file *file)
struct instruction *insn;
int ret;
 
-   INIT_LIST_HEAD(&file->insns);
-
list_for_each_entry(sec, &file->elf->sections, list) {
 
if (!(sec->sh.sh_flags & SHF_EXECINSTR))
-- 
2.4.3



[PATCH 11/11] objtool: Only print one warning per function

2016-03-08 Thread Josh Poimboeuf
When objtool discovers an issue, it's very common for it to flood the
terminal with a lot of duplicate warnings.  For example:

  warning: objtool: rtlwifi_rate_mapping()+0x2e7: frame pointer state mismatch
  warning: objtool: rtlwifi_rate_mapping()+0x2f3: frame pointer state mismatch
  warning: objtool: rtlwifi_rate_mapping()+0x2ff: frame pointer state mismatch
  warning: objtool: rtlwifi_rate_mapping()+0x30b: frame pointer state mismatch
  ...

The first warning is usually all you need.  Change it to only warn once
per function.

Suggested-by: Ingo Molnar 
Signed-off-by: Josh Poimboeuf 
---
 tools/objtool/builtin-check.c | 48 ++-
 1 file changed, 25 insertions(+), 23 deletions(-)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index bfeee22..7515cb2 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -800,7 +800,7 @@ static int validate_branch(struct objtool_file *file,
struct instruction *insn;
struct section *sec;
unsigned char state;
-   int ret, warnings = 0;
+   int ret;
 
insn = first;
sec = insn->sec;
@@ -809,7 +809,7 @@ static int validate_branch(struct objtool_file *file,
if (insn->alt_group && list_empty(&insn->alts)) {
WARN_FUNC("don't know how to handle branch to middle of alternative instruction group",
  sec, insn->offset);
-   warnings++;
+   return 1;
}
 
while (1) {
@@ -817,10 +817,10 @@ static int validate_branch(struct objtool_file *file,
if (frame_state(insn->state) != frame_state(state)) {
WARN_FUNC("frame pointer state mismatch",
  sec, insn->offset);
-   warnings++;
+   return 1;
}
 
-   return warnings;
+   return 0;
}
 
/*
@@ -828,14 +828,15 @@ static int validate_branch(struct objtool_file *file,
 * the next function.
 */
if (is_fentry_call(insn) && (state & STATE_FENTRY))
-   return warnings;
+   return 0;
 
insn->visited = true;
insn->state = state;
 
list_for_each_entry(alt, &insn->alts, list) {
ret = validate_branch(file, alt->insn, state);
-   warnings += ret;
+   if (ret)
+   return 1;
}
 
switch (insn->type) {
@@ -845,7 +846,7 @@ static int validate_branch(struct objtool_file *file,
if (state & STATE_FP_SAVED) {
WARN_FUNC("duplicate frame pointer save",
  sec, insn->offset);
-   warnings++;
+   return 1;
}
state |= STATE_FP_SAVED;
}
@@ -856,7 +857,7 @@ static int validate_branch(struct objtool_file *file,
if (state & STATE_FP_SETUP) {
WARN_FUNC("duplicate frame pointer setup",
  sec, insn->offset);
-   warnings++;
+   return 1;
}
state |= STATE_FP_SETUP;
}
@@ -875,9 +876,9 @@ static int validate_branch(struct objtool_file *file,
if (!nofp && has_modified_stack_frame(insn)) {
WARN_FUNC("return without frame pointer restore",
  sec, insn->offset);
-   warnings++;
+   return 1;
}
-   return warnings;
+   return 0;
 
case INSN_CALL:
if (is_fentry_call(insn)) {
@@ -887,16 +888,16 @@ static int validate_branch(struct objtool_file *file,
 
ret = dead_end_function(file, insn->call_dest);
if (ret == 1)
-   return warnings;
+   return 0;
if (ret == -1)
-   warnings++;
+   return 1;
 
/* fallthrough */
case INSN_CALL_DYNAMIC:
if (!nofp && !has_valid_stack_frame(insn)) {
WARN_FUNC("call without frame pointer save/setup",
  sec, insn->off

[PATCH 03/11] objtool: Compile with debugging symbols

2016-03-08 Thread Josh Poimboeuf
Compile objtool with debugging symbols ('-g') to help tools like perf
and gdb understand what it's doing.  Combined with '-O2', it's not
always helpful, but it's better than nothing.

Reported-by: Ingo Molnar 
Signed-off-by: Josh Poimboeuf 
---
 tools/objtool/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile
index e4a6bd5..6765c7e 100644
--- a/tools/objtool/Makefile
+++ b/tools/objtool/Makefile
@@ -27,7 +27,7 @@ OBJTOOL_IN := $(OBJTOOL)-in.o
 all: $(OBJTOOL)
 
 INCLUDES := -I$(srctree)/tools/include
-CFLAGS   += -Wall -Werror $(EXTRA_WARNINGS) -fomit-frame-pointer -O2 $(INCLUDES)
+CFLAGS   += -Wall -Werror $(EXTRA_WARNINGS) -fomit-frame-pointer -O2 -g $(INCLUDES)
 LDFLAGS  += -lelf $(LIBSUBCMD)
 
 AWK = awk
-- 
2.4.3



[PATCH 01/11] objtool: Prevent infinite recursion in noreturn detection

2016-03-08 Thread Josh Poimboeuf
Ingo reported an infinite loop in objtool with a certain randconfig [1].
With the given config, two functions in crypto/ablkcipher.o contained
sibling calls to each other, which threw the recursive call in
dead_end_function() for a loop (literally!).

Split the noreturn detection into two passes.  In the first pass, check
for return instructions.  In the second pass, do the potentially
recursive sibling call check.  In most cases, the first pass will be
good enough.  In the rare case where a second pass is needed, recursion
should hopefully no longer be possible.
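
The two-pass idea can be sketched in isolation as below.  This is a simplified
model, not the patch: the sketch_* types are hypothetical, a function is just
an array of instruction kinds, and the depth limit is an extra safety net
assumed for this example only (the second pass in the patch has no such
limit).

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_DEPTH 5	/* assumed guard, not part of the patch */

enum sketch_type { SK_OTHER, SK_RETURN, SK_SIBLING_JUMP };

struct sketch_func;

/* Hypothetical, minimal instruction model. */
struct sketch_insn {
	enum sketch_type type;
	const struct sketch_func *target;	/* SK_SIBLING_JUMP only */
};

struct sketch_func {
	const struct sketch_insn *insns;
	size_t count;
};

/* Returns true if the function can never return to its caller. */
static bool sketch_dead_end(const struct sketch_func *func, int depth)
{
	size_t i;

	/* an empty function tells us nothing; treat it as returning */
	if (func->count == 0)
		return false;

	/* Pass 1: any return instruction settles the question. */
	for (i = 0; i < func->count; i++)
		if (func->insns[i].type == SK_RETURN)
			return false;

	/*
	 * Pass 2: no return was found, so the answer depends on the
	 * targets of the sibling calls.  Only now do we recurse.
	 */
	for (i = 0; i < func->count; i++) {
		if (func->insns[i].type != SK_SIBLING_JUMP)
			continue;

		if (depth >= SKETCH_MAX_DEPTH)
			return false;	/* give up conservatively */

		if (!sketch_dead_end(func->insns[i].target, depth + 1))
			return false;
	}

	return true;
}
```

For two functions that tail-call each other but each also contain a return,
the first pass answers immediately and the recursive check is never reached.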

[1] https://lkml.kernel.org/r/20160308154909.ga20...@gmail.com

Reported-by: Ingo Molnar 
Signed-off-by: Josh Poimboeuf 
---
 tools/objtool/builtin-check.c | 24 
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index f7e0eba..80d9ed9 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -125,7 +125,7 @@ static bool ignore_func(struct objtool_file *file, struct symbol *func)
 static bool dead_end_function(struct objtool_file *file, struct symbol *func)
 {
int i;
-   struct instruction *insn;
+   struct instruction *insn, *func_insn;
bool empty = true;
 
/*
@@ -154,10 +154,11 @@ static bool dead_end_function(struct objtool_file *file, struct symbol *func)
if (!func->sec)
return false;
 
-   insn = find_instruction(file, func->sec, func->offset);
-   if (!insn)
+   func_insn = find_instruction(file, func->sec, func->offset);
+   if (!func_insn)
return false;
 
+   insn = func_insn;
list_for_each_entry_from(insn, &file->insns, list) {
if (insn->sec != func->sec ||
insn->offset >= func->offset + func->len)
@@ -167,6 +168,21 @@ static bool dead_end_function(struct objtool_file *file, struct symbol *func)
 
if (insn->type == INSN_RETURN)
return false;
+   }
+
+   if (empty)
+   return false;
+
+   /*
+* A function can have a sibling call instead of a return.  In that
+* case, the function's dead-end status depends on whether the target
+* of the sibling call returns.
+*/
+   insn = func_insn;
+   list_for_each_entry_from(insn, &file->insns, list) {
+   if (insn->sec != func->sec ||
+   insn->offset >= func->offset + func->len)
+   break;
 
if (insn->type == INSN_JUMP_UNCONDITIONAL) {
struct instruction *dest = insn->jump_dest;
@@ -194,7 +210,7 @@ static bool dead_end_function(struct objtool_file *file, struct symbol *func)
return false;
}
 
-   return !empty;
+   return true;
 }
 
 /*
-- 
2.4.3



[PATCH 05/11] objtool: Add helper macros for traversing instructions

2016-03-08 Thread Josh Poimboeuf
Add some helper macros to make it easier to traverse instructions, and
to abstract the details of the instruction list implementation in
preparation for creating a hash structure.

Signed-off-by: Josh Poimboeuf 
---
 tools/objtool/builtin-check.c | 128 ++
 1 file changed, 55 insertions(+), 73 deletions(-)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index fe24804..46a8985 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -66,9 +66,8 @@ struct objtool_file {
 const char *objname;
 static bool nofp;
 
-static struct instruction *find_instruction(struct objtool_file *file,
-   struct section *sec,
-   unsigned long offset)
+static struct instruction *find_insn(struct objtool_file *file,
+struct section *sec, unsigned long offset)
 {
struct instruction *insn;
 
@@ -79,6 +78,31 @@ static struct instruction *find_instruction(struct objtool_file *file,
return NULL;
 }
 
+static struct instruction *next_insn_same_sec(struct objtool_file *file,
+ struct instruction *insn)
+{
+   struct instruction *next = list_next_entry(insn, list);
+
+   if (&next->list == &file->insns || next->sec != insn->sec)
+   return NULL;
+
+   return next;
+}
+
+#define for_each_insn(file, insn)  \
+   list_for_each_entry(insn, &file->insns, list)
+
+#define func_for_each_insn(file, func, insn)   \
+   for (insn = find_insn(file, func->sec, func->offset);   \
+insn && &insn->list != &file->insns && \
+   insn->sec == func->sec &&   \
+   insn->offset < func->offset + func->len;\
+insn = list_next_entry(insn, list))
+
+#define sec_for_each_insn_from(file, insn) \
+   for (; insn; insn = next_insn_same_sec(file, insn))
+
+
 /*
  * Check if the function has been manually whitelisted with the
  * STACK_FRAME_NON_STANDARD macro, or if it should be automatically whitelisted
@@ -99,16 +123,9 @@ static bool ignore_func(struct objtool_file *file, struct symbol *func)
return true;
 
/* check if it has a context switching instruction */
-   insn = find_instruction(file, func->sec, func->offset);
-   if (!insn)
-   return false;
-   list_for_each_entry_from(insn, &file->insns, list) {
-   if (insn->sec != func->sec ||
-   insn->offset >= func->offset + func->len)
-   break;
+   func_for_each_insn(file, func, insn)
if (insn->type == INSN_CONTEXT_SWITCH)
return true;
-   }
 
return false;
 }
@@ -131,7 +148,7 @@ static int __dead_end_function(struct objtool_file *file, struct symbol *func,
   int recursion)
 {
int i;
-   struct instruction *insn, *func_insn;
+   struct instruction *insn;
bool empty = true;
 
/*
@@ -160,16 +177,7 @@ static int __dead_end_function(struct objtool_file *file, struct symbol *func,
if (!func->sec)
return 0;
 
-   func_insn = find_instruction(file, func->sec, func->offset);
-   if (!func_insn)
-   return 0;
-
-   insn = func_insn;
-   list_for_each_entry_from(insn, &file->insns, list) {
-   if (insn->sec != func->sec ||
-   insn->offset >= func->offset + func->len)
-   break;
-
+   func_for_each_insn(file, func, insn) {
empty = false;
 
if (insn->type == INSN_RETURN)
@@ -184,8 +192,7 @@ static int __dead_end_function(struct objtool_file *file, struct symbol *func,
 * case, the function's dead-end status depends on whether the target
 * of the sibling call returns.
 */
-   insn = func_insn;
-   list_for_each_entry_from(insn, &file->insns, list) {
+   func_for_each_insn(file, func, insn) {
if (insn->sec != func->sec ||
insn->offset >= func->offset + func->len)
break;
@@ -294,17 +301,8 @@ static void get_ignores(struct objtool_file *file)
if (!ignore_func(file, func))
continue;
 
-   insn = find_instruction(file, sec, func->offset);
-   if (!insn)
-   continue;
-
-   list_for_each_entry_from(insn, &file->insns, list) {
-   if (insn->sec != func->sec ||
-   insn->offset >= func->offset + func->len)
-   break;
-
+  

[PATCH 04/11] objtool: Fix false positive warnings related to sibling calls

2016-03-08 Thread Josh Poimboeuf
With some configs [1], objtool prints a bunch of false positive warnings
like:

  arch/x86/events/core.o: warning: objtool: x86_del_exclusive()+0x0: frame pointer state mismatch

For some reason this config has a bunch of sibling calls.  When objtool
follows a sibling call jump, it attempts to compare the frame pointer
state.  But it also accidentally compares the FENTRY state, resulting in
a false positive warning.
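
A standalone sketch of the comparison: only the two frame pointer bits take
part, so a difference in the FENTRY bit alone no longer counts as a mismatch.
The value of STATE_FP_SAVED matches builtin-check.c; the values of
STATE_FP_SETUP and STATE_FENTRY, and the helper sketch_states_match(), are
assumptions made for this example.

```c
#include <stdbool.h>

#define STATE_FP_SAVED	0x1	/* as defined in builtin-check.c */
#define STATE_FP_SETUP	0x2	/* assumed value for this sketch */
#define STATE_FENTRY	0x4	/* assumed value for this sketch */

/* Keep only the bits that describe the frame pointer. */
static unsigned int frame_state(unsigned long state)
{
	return (state & (STATE_FP_SAVED | STATE_FP_SETUP));
}

/* Hypothetical helper: compare two states, ignoring the FENTRY bit. */
static bool sketch_states_match(unsigned long a, unsigned long b)
{
	return frame_state(a) == frame_state(b);
}
```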

[1] https://lkml.kernel.org/r/20160308154909.ga20...@gmail.com

Signed-off-by: Josh Poimboeuf 
---
 tools/objtool/builtin-check.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 51da270..fe24804 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -729,6 +729,11 @@ static bool has_valid_stack_frame(struct instruction *insn)
   (insn->state & STATE_FP_SETUP);
 }
 
+static unsigned int frame_state(unsigned long state)
+{
+   return (state & (STATE_FP_SAVED | STATE_FP_SETUP));
+}
+
 /*
  * Follow the branch starting at the given instruction, and recursively follow
  * any other branches (jumps).  Meanwhile, track the frame pointer state at
@@ -756,7 +761,7 @@ static int validate_branch(struct objtool_file *file,
 
while (1) {
if (insn->visited) {
-   if (insn->state != state) {
+   if (frame_state(insn->state) != frame_state(state)) {
WARN_FUNC("frame pointer state mismatch",
  sec, insn->offset);
warnings++;
-- 
2.4.3



Re: [RFC][PATCH v2 1/2] printk: Make printk() completely async

2016-03-08 Thread Sergey Senozhatsky
Hello Jan,

On (03/07/16 13:16), Jan Kara wrote:
[..]
> > So if this will be a problem in practice, using a kthread will probably be
> > the easiest solution.
> 
> Hum, and thinking more about it: Considering that WQ_MEM_RECLAIM workqueues
> create kthread anyway as a rescuer thread, it may be the simplest to just
> go back to using a single kthread for printing. What do you think?

I have this patch on top of the series now. In short, it closes one more
possibility of lockups -- console_lock()/console_unlock() calls. The patch
splits console_unlock() in two parts:
-- the fast one just wakes up the printing kthread
-- the slow one does the call_console_drivers() loop

I think it sort of makes sense to tweak the patch below a bit and fold it
into 0001, and move _some_ of the vprintk_emit() checks to console_unlock().

very schematically, after folding, vprintk_emit() will be

	if (in_sched) {
		if (!printk_sync && printk_thread)
			wake_up()
		else
			irq_work_queue()
	}

	if (!in_sched)
		if (console_trylock())
			console_unlock()

and console_unlock() will be

	if (!in_panic && !printk_sync && printk_thread) {
		up_console_sem()
		wake_up()
	} else {
		console_unlock_for_printk()
	}

console_unlock_for_printk() does the call_console_drivers() loop.

console_flush_on_panic() and printing_func() call console_unlock_for_printk()
directly.


What do you think? Or would you prefer to introduce the async
printk() rework first, and move to console_unlock() in vprintk_emit()
one release cycle later?
IOW, in 3 steps:
-- first make printk() async
-- then console_unlock() async, and use console_unlock_for_printk() in
   vprintk_emit()
-- then switch to console_unlock() in vprintk_emit().


Below is the patch which introduces console_unlock_for_printk().
It is the standalone patch, not yet squashed with 0001.

-ss

==

From bc3932c68c5afb9bf43af98335c705c75067a93a Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky 
Subject: [PATCH 3/4] printk: introduce console_unlock_for_printk()

Even though we already have asynchronous printk()->vprintk_emit(),
there are still good chances to get lockups, because we don't have
asynchronous console_unlock(). So any process doing console_lock()
and console_unlock() will end up looping in console_unlock(), pushing
the messages to console drivers (possibly with IRQs or preemption
disabled), regardless of the fact that we have a dedicated kthread for
that particular job.

Apart from that, console_lock()/console_unlock() can be executed by
user processes as a part of system calls:

a) SyS_open()
    ...
     chrdev_open()
      tty_open()
       console_device()
        console_lock()
        console_unlock()
         for (;;) {
          call_console_drivers()
         }

b) SyS_read()
    ...
     sysfs_read_file()
      dev_attr_show()
       show_cons_active()
        console_lock()
        console_unlock()
         for (;;) {
          call_console_drivers()
         }

c) doing `cat /proc/consoles`
    SyS_read()
     vfs_read()
      proc_reg_read()
       seq_read()
        c_stop()
         console_unlock()
          for (;;) {
           call_console_drivers()
          }

etc.

This can add unnecessary latencies to the user space processes.

This patch splits console_unlock() in two parts:
-- the fast path: up() the console semaphore and wake up the printing
   kthread (if there is one, of course); otherwise
-- the slow path: do what console_unlock() did previously, i.e. emit
   the messages and then up() the console semaphore

The actual printing loop is, thus, moved to a new function,
console_unlock_for_printk(). There are 3 places that
unconditionally call it:
 a) direct printing from vprintk_emit()
 b) console_flush_on_panic()
 c) printing kthread callback

Signed-off-by: Sergey Senozhatsky 
---
 kernel/printk/printk.c | 51 +++---
 1 file changed, 44 insertions(+), 7 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index de45d86..ddaf62e 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -303,6 +303,8 @@ static struct task_struct *printk_thread;
 /* Wait for printing wakeups from async vprintk_emit() */
 static DECLARE_WAIT_QUEUE_HEAD(printing_wait);
 
+static void console_unlock_for_printk(void);
+
 static int printing_func(void *data)
 {
while (1) {
@@ -314,7 +316,7 @@ static int printing_func(void *data)
remove_wait_queue(&printing_wait, &wait);
 
console_lock();
-   console_unlock();
+   console_unlock_for_printk();
}
 
return 0;
@@ -1900,7 +1902,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 * /dev/kmsg and syslog() users.
 */
if (console_trylock())
-   consol

[PATCH 02/11] objtool: Detect infinite recursion

2016-03-08 Thread Josh Poimboeuf
I don't _think_ dead_end_function() can get into a recursive loop, but
just in case, stop the loop and print a warning.

Signed-off-by: Josh Poimboeuf 
---
 tools/objtool/builtin-check.c | 45 +++
 1 file changed, 33 insertions(+), 12 deletions(-)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 80d9ed9..51da270 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -121,8 +121,14 @@ static bool ignore_func(struct objtool_file *file, struct symbol *func)
  *
  * For local functions, we have to detect them manually by simply looking for
  * the lack of a return instruction.
+ *
+ * Returns:
+ *  -1: error
+ *   0: no dead end
+ *   1: dead end
  */
-static bool dead_end_function(struct objtool_file *file, struct symbol *func)
+static int __dead_end_function(struct objtool_file *file, struct symbol *func,
+  int recursion)
 {
int i;
struct instruction *insn, *func_insn;
@@ -144,19 +150,19 @@ static bool dead_end_function(struct objtool_file *file, struct symbol *func)
};
 
if (func->bind == STB_WEAK)
-   return false;
+   return 0;
 
if (func->bind == STB_GLOBAL)
for (i = 0; i < ARRAY_SIZE(global_noreturns); i++)
if (!strcmp(func->name, global_noreturns[i]))
-   return true;
+   return 1;
 
if (!func->sec)
-   return false;
+   return 0;
 
func_insn = find_instruction(file, func->sec, func->offset);
if (!func_insn)
-   return false;
+   return 0;
 
insn = func_insn;
list_for_each_entry_from(insn, &file->insns, list) {
@@ -167,11 +173,11 @@ static bool dead_end_function(struct objtool_file *file, struct symbol *func)
empty = false;
 
if (insn->type == INSN_RETURN)
-   return false;
+   return 0;
}
 
if (empty)
-   return false;
+   return 0;
 
/*
 * A function can have a sibling call instead of a return.  In that
@@ -190,7 +196,7 @@ static bool dead_end_function(struct objtool_file *file, struct symbol *func)
 
if (!dest)
/* sibling call to another file */
-   return false;
+   return 0;
 
if (dest->sec != func->sec ||
dest->offset < func->offset ||
@@ -201,16 +207,28 @@ static bool dead_end_function(struct objtool_file *file, struct symbol *func)
if (!dest_func)
continue;
 
-   return dead_end_function(file, dest_func);
+   if (recursion == 5) {
+   WARN_FUNC("infinite recursion (objtool bug!)",
+ dest->sec, dest->offset);
+   return -1;
+   }
+
+   return __dead_end_function(file, dest_func,
+  recursion + 1);
}
}
 
if (insn->type == INSN_JUMP_DYNAMIC)
/* sibling call */
-   return false;
+   return 0;
}
 
-   return true;
+   return 1;
+}
+
+static int dead_end_function(struct objtool_file *file, struct symbol *func)
+{
+   return __dead_end_function(file, func, 0);
 }
 
 /*
@@ -809,8 +827,11 @@ static int validate_branch(struct objtool_file *file,
break;
}
 
-   if (dead_end_function(file, insn->call_dest))
+   ret = dead_end_function(file, insn->call_dest);
+   if (ret == 1)
return warnings;
+   if (ret == -1)
+   warnings++;
 
/* fallthrough */
case INSN_CALL_DYNAMIC:
-- 
2.4.3
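
The depth cutoff in this patch can be modelled outside objtool roughly as follows. This is a sketch under stated assumptions, not objtool code: struct func, dead_end_depth() and the sample objects are hypothetical, and only the limit of 5 and the -1/0/1 return convention are taken from the patch.

```c
#include <stddef.h>

#define MAX_DEPTH 5	/* same cutoff the patch uses */

/*
 * Hypothetical stand-in for a function symbol: it either contains a
 * return instruction, ends in a sibling (tail) call, or does neither.
 */
struct func {
	int has_return;			/* contains a return instruction */
	const struct func *sibling;	/* tail-call target, or NULL */
};

/* Returns -1 on error (depth limit hit), 0 for no dead end, 1 for dead end. */
static int dead_end_depth(const struct func *f, int depth)
{
	if (f->has_return)
		return 0;
	if (!f->sibling)
		return 1;
	if (depth == MAX_DEPTH)
		return -1;	/* the patch prints a warning here */
	return dead_end_depth(f->sibling, depth + 1);
}

static int dead_end(const struct func *f)
{
	return dead_end_depth(f, 0);
}

/* Sample inputs, purely illustrative. */
static const struct func returns_normally = { 1, NULL };
static const struct func noreturn_leaf = { 0, NULL };
static const struct func tail_calls_leaf = { 0, &noreturn_leaf };
static const struct func tail_calls_itself = { 0, &tail_calls_itself };
```

A caller then has to handle three outcomes instead of two, which is why validate_branch() in the patch compares the result against 1 and -1 explicitly rather than treating it as a boolean.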



Re: [PATCH v3] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-03-08 Thread Byungchul Park
On Wed, Mar 09, 2016 at 11:00:37AM +0900, Byungchul Park wrote:
> On Wed, Feb 17, 2016 at 10:28:29AM +0100, Ingo Molnar wrote:
> > 
> > * Byungchul Park  wrote:
> > 
> > > diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
> > > index b8120ab..6634b68 100644
> > > --- a/kernel/locking/semaphore.c
> > > +++ b/kernel/locking/semaphore.c
> > > @@ -130,13 +130,14 @@ EXPORT_SYMBOL(down_killable);
> > >  int down_trylock(struct semaphore *sem)
> > >  {
> > >   unsigned long flags;
> > > - int count;
> > > + int count = -1;
> > >  
> > > - raw_spin_lock_irqsave(&sem->lock, flags);
> > > - count = sem->count - 1;
> > > - if (likely(count >= 0))
> > > - sem->count = count;
> > > - raw_spin_unlock_irqrestore(&sem->lock, flags);
> > > + if (raw_spin_trylock_irqsave(&sem->lock, flags)) {
> > > + count = sem->count - 1;
> > > + if (likely(count >= 0))
> > > + sem->count = count;
> > > + raw_spin_unlock_irqrestore(&sem->lock, flags);
> > > + }
> > 
> > I still don't really like it: two parallel trylocks will cause one of them
> > to fail - while with the previous code they would both succeed.
> > 
> > None of these changes are necessary with all the printk robustification
> > changes/enhancements we talked about, right?
> 
> Not only printk() but any code using a semaphore, mutex and so on, can also
> cause a deadlock if wake_up_process() eventually tries to acquire the lock.
> There are several ways to solve this problem.
> 
> 1. ensure wake_up_process() does not try to acquire the locks.
> 2. ensure wake_up_process() isn't protected by a spinlock of the locks.
> 3. ensure any kind of trylock stuff never causes waiting and deadlock.
> 4. and so on..
> 
> I am not sure which one is the best. But I think the 3rd one is the one, since
> it can be done in a generic way, even though it might decrease the success
> ratio as Ingo said. IMHO that's not a big problem, since a trylock user
> only uses the trylock when it doesn't need to care whether it succeeds
> or fails.
> 
> Which one among those do you think is the best approach? Please let me know,
> then I will try to solve this problem with that approach.

Or what do you think about this approach, in which I replace the semaphore
with a mutex and apply this patch to the mutex trylock? Since the parallelism
does not mean that much to a mutex trylock.. Right?
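
Ingo's objection can be illustrated with a small userspace model. This is only a sketch under stated assumptions: a C11 atomic_flag stands in for sem->lock, interrupts are not modelled at all, and every model_*/demo_* name is hypothetical. Like down_trylock(), the functions return 0 on success and 1 on failure.

```c
#include <stdatomic.h>

/* Userspace model only: atomic_flag stands in for sem->lock. */
struct model_sem {
	atomic_flag lock;
	int count;
};

/* Previous behaviour: wait for the internal lock, then try to take a count. */
static int model_trylock_spin(struct model_sem *sem)
{
	int count;

	while (atomic_flag_test_and_set_explicit(&sem->lock,
						 memory_order_acquire))
		;	/* spin until the internal lock is free */
	count = sem->count - 1;
	if (count >= 0)
		sem->count = count;
	atomic_flag_clear_explicit(&sem->lock, memory_order_release);
	return count < 0;
}

/* Patched behaviour: give up at once if the internal lock is contended. */
static int model_trylock_nospin(struct model_sem *sem)
{
	int count = -1;

	if (!atomic_flag_test_and_set_explicit(&sem->lock,
					       memory_order_acquire)) {
		count = sem->count - 1;
		if (count >= 0)
			sem->count = count;
		atomic_flag_clear_explicit(&sem->lock, memory_order_release);
	}
	return count < 0;
}

/* The internal lock is held elsewhere, yet two counts are still free. */
static int demo_nospin_while_lock_held(void)
{
	struct model_sem sem = { ATOMIC_FLAG_INIT, 2 };

	atomic_flag_test_and_set(&sem.lock);	/* "another CPU" holds it */
	return model_trylock_nospin(&sem);
}

/* Uncontended case: both variants behave the same. */
static int demo_uncontended(int use_spin)
{
	struct model_sem sem = { ATOMIC_FLAG_INIT, 2 };

	return use_spin ? model_trylock_spin(&sem) : model_trylock_nospin(&sem);
}
```

In the contended demo the non-spinning variant reports failure although the semaphore still has free counts, which is the reduced success ratio discussed above. The spinning variant would have succeeded there once the lock holder released it.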

> 
> > 
> > Thanks,
> > 
> > Ingo


[PATCH 00/11] Various objtool fixes

2016-03-08 Thread Josh Poimboeuf
Based on tip/master.

These patches fix all known objtool issues:

- infinite loop
- sibling call false positives
- switch statement jump table fix
- performance improvements
- print one warning per function

Josh Poimboeuf (11):
  objtool: Prevent infinite recursion in noreturn detection
  objtool: Detect infinite recursion
  objtool: Compile with debugging symbols
  objtool: Fix false positive warnings related to sibling calls
  objtool: Add helper macros for traversing instructions
  objtool: Remove superflous INIT_LIST_HEAD
  objtool: Rename some variables and functions
  objtool: Fix false positive warnings for functions with multiple
switch statements
  tools/objtool: Copy hashtable.h into tools directory
  objtool: Add several performance improvements
  objtool: Only print one warning per function

 tools/include/asm-generic/bitops/__fls.h |   2 +-
 tools/include/asm-generic/bitops/fls.h   |   2 +-
 tools/include/asm-generic/bitops/fls64.h |   2 +-
 tools/include/linux/hashtable.h  | 152 +++
 tools/objtool/Makefile   |   2 +-
 tools/objtool/builtin-check.c| 429 +++
 tools/objtool/elf.c  |  37 ++-
 tools/objtool/elf.h  |  14 +-
 8 files changed, 447 insertions(+), 193 deletions(-)
 create mode 100644 tools/include/linux/hashtable.h

-- 
2.4.3



Re: [PATCH v3 9/9] clocksource: arm_arch_timer: Remove arch_timer_get_timecounter

2016-03-08 Thread Christoffer Dall
On Tue, Mar 08, 2016 at 11:29:33AM +, Julien Grall wrote:
> The only call of arch_timer_get_timecounter (in KVM) has been removed.
> 
> Signed-off-by: Julien Grall 
> 
Acked-by: Christoffer Dall 

> ---
> Cc: Daniel Lezcano 
> Cc: Thomas Gleixner 
> 
> Changes in v3:
> - Patch added
> ---
>  drivers/clocksource/arm_arch_timer.c | 5 -
>  include/clocksource/arm_arch_timer.h | 6 --
>  2 files changed, 11 deletions(-)
> 
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index d8887f3..94696d3 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -463,11 +463,6 @@ struct arch_timer_kvm_info *arch_timer_get_kvm_info(void)
>   return &arch_timer_kvm_info;
>  }
>  
> -struct timecounter *arch_timer_get_timecounter(void)
> -{
> - return &arch_timer_kvm_info.timecounter;
> -}
> -
>  static void __init arch_counter_register(unsigned type)
>  {
>   u64 start_count;
> diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
> index 9dd996a..caedb74 100644
> --- a/include/clocksource/arm_arch_timer.h
> +++ b/include/clocksource/arm_arch_timer.h
> @@ -58,7 +58,6 @@ struct arch_timer_kvm_info {
>  
>  extern u32 arch_timer_get_rate(void);
>  extern u64 (*arch_timer_read_counter)(void);
> -extern struct timecounter *arch_timer_get_timecounter(void);
>  extern struct arch_timer_kvm_info *arch_timer_get_kvm_info(void);
>  
>  #else
> @@ -73,11 +72,6 @@ static inline u64 arch_timer_read_counter(void)
>   return 0;
>  }
>  
> -static inline struct timecounter *arch_timer_get_timecounter(void)
> -{
> - return NULL;
> -}
> -
>  #endif
>  
>  #endif
> -- 
> 1.9.1
> 


Re: [PATCH v3 8/9] KVM: arm/arm64: vgic: Rely on the GIC driver to parse the firmware tables

2016-03-08 Thread Christoffer Dall
On Tue, Mar 08, 2016 at 11:29:32AM +, Julien Grall wrote:
> Currenlty, the firmware tables are parsed 2 times: once in the GIC

Currently,

> drivers, the other time when initializing the vGIC. It means code
> duplication and make more tedious to add the support for another
> firmware table (like ACPI).
> 
> Use the recently introduced helper gic_get_kvm_info() to get
> information about the virtual GIC.
> 
> With this change, the virtual GIC becomes agnostic to the firmware
> table and KVM will be able to initialize the vGIC on ACPI.
> 
> Signed-off-by: Julien Grall 
> 
> ---
> Cc: Christoffer Dall 
> Cc: Marc Zyngier 
> Cc: Gleb Natapov 
> Cc: Paolo Bonzini 
> 
> Changes in v2:
> - Use 0 rather than a negative value to know when the maintenance IRQ
> is not present.
> - Use resource for vcpu and vctrl
> ---
>  include/kvm/arm_vgic.h |  7 +++---
>  virt/kvm/arm/vgic-v2.c | 67 ++
>  virt/kvm/arm/vgic-v3.c | 45 ++---
>  virt/kvm/arm/vgic.c| 50 -
>  4 files changed, 68 insertions(+), 101 deletions(-)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 13a3d53..ed62772 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define VGIC_NR_IRQS_LEGACY  256
>  #define VGIC_NR_SGIS 16
> @@ -357,15 +358,15 @@ bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>  #define vgic_initialized(k)  (!!((k)->arch.vgic.nr_cpus))
>  #define vgic_ready(k)((k)->arch.vgic.ready)
>  
> -int vgic_v2_probe(struct device_node *vgic_node,
> +int vgic_v2_probe(const struct gic_kvm_info *gic_kvm_info,
> const struct vgic_ops **ops,
> const struct vgic_params **params);
>  #ifdef CONFIG_KVM_ARM_VGIC_V3
> -int vgic_v3_probe(struct device_node *vgic_node,
> +int vgic_v3_probe(const struct gic_kvm_info *gic_kvm_info,
> const struct vgic_ops **ops,
> const struct vgic_params **params);
>  #else
> -static inline int vgic_v3_probe(struct device_node *vgic_node,
> +static inline int vgic_v3_probe(const struct gic_kvm_info *gic_kvm_info,
>   const struct vgic_ops **ops,
>   const struct vgic_params **params)
>  {
> diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
> index ff02f08..3598cd4 100644
> --- a/virt/kvm/arm/vgic-v2.c
> +++ b/virt/kvm/arm/vgic-v2.c
> @@ -20,9 +20,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
> -#include 
> -#include 
>  
>  #include 
>  
> @@ -177,38 +174,38 @@ static const struct vgic_ops vgic_v2_ops = {
>  static struct vgic_params vgic_v2_params;
>  
>  /**
> - * vgic_v2_probe - probe for a GICv2 compatible interrupt controller in DT
> - * @node:pointer to the DT node
> - * @ops: address of a pointer to the GICv2 operations
> - * @params:  address of a pointer to HW-specific parameters
> + * vgic_v2_probe - probe for a GICv2 compatible interrupt controller
> + * @gic_kvm_info:pointer to the GIC description
> + * @ops: address of a pointer to the GICv2 operations
> + * @params:  address of a pointer to HW-specific parameters
>   *
>   * Returns 0 if a GICv2 has been found, with the low level operations
>   * in *ops and the HW parameters in *params. Returns an error code
>   * otherwise.
>   */
> -int vgic_v2_probe(struct device_node *vgic_node,
> -   const struct vgic_ops **ops,
> -   const struct vgic_params **params)
> +int vgic_v2_probe(const struct gic_kvm_info *gic_kvm_info,
> +const struct vgic_ops **ops,
> +const struct vgic_params **params)
>  {
>   int ret;
> - struct resource vctrl_res;
> - struct resource vcpu_res;
>   struct vgic_params *vgic = &vgic_v2_params;
> + resource_size_t vctrl_size = resource_size(&gic_kvm_info->vctrl);
>  
> - vgic->maint_irq = irq_of_parse_and_map(vgic_node, 0);
> - if (!vgic->maint_irq) {
> - kvm_err("error getting vgic maintenance irq from DT\n");
> + if (!gic_kvm_info->maint_irq) {
> + kvm_err("error getting vgic maintenance irq\n");
>   ret = -ENXIO;
>   goto out;
>   }
> + vgic->maint_irq = gic_kvm_info->maint_irq;
>  
> - ret = of_address_to_resource(vgic_node, 2, &vctrl_res);
> - if (ret) {
> - kvm_err("Cannot obtain GICH resource\n");
> + if (!gic_kvm_info->vctrl.start) {
> + kvm_err("GICH not present in the firmware table\n");
> + ret = -ENXIO;
>   goto out;
>   }
>  
> - vgic->vctrl_base = of_iomap(vgic_node, 2);
> + vgic->vctrl_base = ioremap(gic_kvm_info->vctrl.start,
> +resource_size(&gic_kvm_info->vctrl));
>   if (!vgic->vctrl_ba

[REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism

2016-03-08 Thread Huang Rui
Introduce an AMD accumulated power reporting mechanism for the Family
15h, Model 60h processor that can be used to calculate the average
power consumed by a processor during a measurement interval. The
feature support is indicated by CPUID Fn8000_0007_EDX[12].

This feature will be implemented both in hwmon and perf. The current
design provides one event to report per package/processor power
consumption by counting each compute unit power value.

Here are the gory details of how the computation is done:

-
* Tsample: compute unit power accumulator sample period
* Tref: the PTSC counter period (PTSC: performance timestamp counter)
* N: the ratio of compute unit power accumulator sample period to the
  PTSC period

* Jmax: max compute unit accumulated power which is indicated by
  MSR_C001007b[MaxCpuSwPwrAcc]

* Jx/Jy: compute unit accumulated power which is indicated by
  MSR_C001007a[CpuSwPwrAcc]

* Tx/Ty: the value of performance timestamp counter which is indicated
  by CU_PTSC MSR_C0010280[PTSC]
* PwrCPUave: CPU average power

i. Determine the ratio of Tsample to Tref by executing CPUID Fn8000_0007.
N = value of CPUID Fn8000_0007_ECX[CpuPwrSampleTimeRatio[15:0]].

ii. Read the full range of the cumulative energy value from the new
MSR MaxCpuSwPwrAcc.
Jmax = value returned.

iii. At time x, software reads CpuSwPwrAcc and samples the PTSC.
Jx = value read from CpuSwPwrAcc and Tx = value read from PTSC.

iv. At time y, software reads CpuSwPwrAcc and samples the PTSC.
Jy = value read from CpuSwPwrAcc and Ty = value read from PTSC.

v. Calculate the average power consumption for a compute unit over
time period (y-x). Unit of result is uWatt:

	if (Jy < Jx) // Rollover has occurred
		Jdelta = (Jy + Jmax) - Jx
	else
		Jdelta = Jy - Jx
	PwrCPUave = N * Jdelta * 1000 / (Ty - Tx)
--
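
Step v above can be written as a small standalone C helper. This is a sketch for illustration, not the code in arch/x86/events/amd/power.c: avg_power_uwatt() is a hypothetical name, the inputs are assumed to have been read from the MSRs and CPUID leaf described in steps i to iv, and the caller is assumed to guarantee ty != tx and that the product does not overflow 64 bits.

```c
#include <stdint.h>

/*
 * Average compute unit power in uWatt over the interval (tx, ty).
 *
 *   n    - ratio of Tsample to Tref (CpuPwrSampleTimeRatio)
 *   jmax - full range of the accumulator (MaxCpuSwPwrAcc)
 *   jx   - accumulator value at time x (CpuSwPwrAcc)
 *   jy   - accumulator value at time y (CpuSwPwrAcc)
 *   tx   - PTSC value at time x
 *   ty   - PTSC value at time y
 */
static uint64_t avg_power_uwatt(uint64_t n, uint64_t jmax,
				uint64_t jx, uint64_t jy,
				uint64_t tx, uint64_t ty)
{
	uint64_t jdelta;

	if (jy < jx)	/* the accumulator rolled over between the samples */
		jdelta = (jy + jmax) - jx;
	else
		jdelta = jy - jx;

	return n * jdelta * 1000 / (ty - tx);
}
```

For example, with N = 2, Jmax = 1000 and a PTSC delta of 1000, the samples Jx = 100, Jy = 300 and the rolled-over samples Jx = 900, Jy = 100 both give Jdelta = 200 and therefore the same result.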

Simple example:

  root@hr-zp:/home/ray/tip# ./tools/perf/perf stat -a -e 'power/power-pkg/' make -j4
CHK include/config/kernel.release
CHK include/generated/uapi/linux/version.h
CHK include/generated/utsrelease.h
CHK include/generated/timeconst.h
CHK include/generated/bounds.h
CHK include/generated/asm-offsets.h
CALL    scripts/checksyscalls.sh
CHK include/generated/compile.h
SKIPPED include/generated/compile.h
Building modules, stage 2.
  Kernel: arch/x86/boot/bzImage is ready  (#40)
MODPOST 4225 modules

   Performance counter stats for 'system wide':

  183.44 mWatts power/power-pkg/

   341.837270111 seconds time elapsed

  root@hr-zp:/home/ray/tip# ./tools/perf/perf stat -a -e 'power/power-pkg/' sleep 10

   Performance counter stats for 'system wide':

0.18 mWatts power/power-pkg/

10.012551815 seconds time elapsed

Suggested-by: Peter Zijlstra 
Suggested-by: Ingo Molnar 
Suggested-by: Borislav Petkov 
Reviewed-by: Thomas Gleixner 
Signed-off-by: Huang Rui 
Cc: Guenter Roeck 
---

Hi Boris,

I have redone this patch on top of tip/master. It depends on some
earlier patches that you have already applied. If you need me to send them
again, please let me know.

Thanks,
Rui

---
 arch/x86/Kconfig|   9 ++
 arch/x86/events/Makefile|   1 +
 arch/x86/events/amd/power.c | 353 
 include/linux/perf_event.h  |   4 +
 4 files changed, 367 insertions(+)
 create mode 100644 arch/x86/events/amd/power.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3e61672..52ef30d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1210,6 +1210,15 @@ config MICROCODE_OLD_INTERFACE
def_bool y
depends on MICROCODE
 
+config PERF_EVENTS_AMD_POWER
+   depends on PERF_EVENTS && CPU_SUP_AMD
+   tristate "AMD Processor Power Reporting Mechanism"
+   ---help---
+ Provide power reporting mechanism support for AMD processors.
+ Currently, it leverages X86_FEATURE_ACC_POWER
+ (CPUID Fn8000_0007_EDX[12]) interface to calculate the
+ average power consumption on Family 15h processors.
+
 config X86_MSR
tristate "/dev/cpu/*/msr - Model-specific register support"
---help---
diff --git a/arch/x86/events/Makefile b/arch/x86/events/Makefile
index fdfea15..f59618a 100644
--- a/arch/x86/events/Makefile
+++ b/arch/x86/events/Makefile
@@ -1,6 +1,7 @@
 obj-y  += core.o
 
 obj-$(CONFIG_CPU_SUP_AMD)   += amd/core.o amd/uncore.o
+obj-$(CONFIG_PERF_EVENTS_AMD_POWER)+= amd/power.o
 obj-$(CONFIG_X86_LOCAL_APIC)+= amd/ibs.o msr.o
 ifdef CONFIG_AMD_IOMMU
 obj-$(CONFIG_CPU_SUP_AMD)   += amd/iommu.o
diff --git a/arch/x86/events/amd/power.c b/arch/x86/events/amd/power.c
new file mode 100644
index 000..55a3529
--- /dev/null
+++ b/arch/x86/events/amd/power.c
@@ -0,0 +1,353 @

Re: [PATCH v3 7/9] KVM: arm/arm64: arch_timer: Rely on the arch timer to parse the firmware tables

2016-03-08 Thread Christoffer Dall
On Tue, Mar 08, 2016 at 11:29:31AM +, Julien Grall wrote:
> The firmware table is currently parsed by the virtual timer code in
> order to retrieve the virtual timer interrupt. However, this is already
> done by the arch timer driver.
> 
> To avoid code duplication, use the newly function arch_timer_get_kvm_info()
> which return all the information required by the virtual timer code.
> 
> Signed-off-by: Julien Grall 
> 
Acked-by: Christoffer Dall 


Re: [PATCH v3 6/9] irqchip/gic-v3: Parse and export virtual GIC information

2016-03-08 Thread Christoffer Dall
On Tue, Mar 08, 2016 at 11:29:30AM +, Julien Grall wrote:
> Fill up the recently introduced gic_kvm_info with the virtual GIC
> information.

this is not really virtual GIC information, it's information about the
hardware used for virtualization.

> 
> Signed-off-by: Julien Grall 
> Cc: Thomas Gleixner 
> Cc: Jason Cooper 
> Cc: Marc Zyngier 
> 
> ---
> Changes in v3:
> - Add ACPI support
> 
> Changes in v2:
> - Use 0 rather than a negative value to know when the maintenance IRQ
> is not present.
> - Use resource for vcpu and vctrl
> ---
>  drivers/irqchip/irq-gic-v3.c   | 85 +-
>  include/linux/irqchip/arm-gic-common.h |  1 +
>  2 files changed, 85 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 50e87e6..6ae25120 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -28,6 +28,7 @@
>  #include 
>  
>  #include 
> +#include 
>  #include 
>  
>  #include 
> @@ -56,6 +57,8 @@ struct gic_chip_data {
>  static struct gic_chip_data gic_data __read_mostly;
>  static struct static_key supports_deactivate = STATIC_KEY_INIT_TRUE;
>  
> +static struct gic_kvm_info gic_v3_kvm_info;
> +
>  #define gic_data_rdist() (this_cpu_ptr(gic_data.rdists.rdist))
>  #define gic_data_rdist_rd_base() (gic_data_rdist()->rd_base)
>  #define gic_data_rdist_sgi_base()(gic_data_rdist_rd_base() + SZ_64K)
> @@ -901,6 +904,37 @@ static int __init gic_validate_dist_version(void __iomem *dist_base)
>   return 0;
>  }
>  
> +static void __init gic_of_setup_kvm_info(struct device_node *node)
> +{
> + int ret;
> + struct resource r;
> + u32 gicv_idx;
> +
> + gic_v3_kvm_info.type = GIC_V3;
> +
> + gic_v3_kvm_info.maint_irq = irq_of_parse_and_map(node, 0);
> +
> + if (of_property_read_u32(node, "#redistributor-regions",
> +  &gicv_idx))
> + gicv_idx = 1;
> +
> + gicv_idx += 3;  /* Also skip GICD, GICC, GICH */
> + ret = of_address_to_resource(node, gicv_idx, &r);
> + if (!ret) {
> + if (!PAGE_ALIGNED(r.start))
> + pr_warn("GICV physical address 0x%llx not page aligned\n",
> + (unsigned long long)r.start);
> + else if (!PAGE_ALIGNED(resource_size(&r)))
> + pr_warn("GICV size 0x%llx not a multiple of page size 0x%lx\n",
> + (unsigned long long)resource_size(&r),
> + PAGE_SIZE);
> + else
> + gic_v3_kvm_info.vcpu = r;
> + }
> +
> + gic_set_kvm_info(&gic_v3_kvm_info);

I have the same error handling concerns as the previous patch here.

> +}
> +
>  static int __init gic_of_init(struct device_node *node, struct device_node *parent)
>  {
>   void __iomem *dist_base;
> @@ -952,8 +986,10 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
>  
>   err = gic_init_bases(dist_base, rdist_regs, nr_redist_regions,
>redist_stride, &node->fwnode);
> - if (!err)
> + if (!err) {
> + gic_of_setup_kvm_info(node);
>   return 0;
> + }
>  
>  out_unmap_rdist:
>   for (i = 0; i < nr_redist_regions; i++)
> @@ -974,6 +1010,10 @@ static struct
>   struct redist_region *redist_regs;
>   u32 nr_redist_regions;
>   bool single_redist;
> + u32 maint_irq;
> + int maint_irq_mode;
> + phys_addr_t vctrl_base;
> + phys_addr_t vcpu_base;
>  } acpi_data __initdata;
>  
>  static void __init
> @@ -1020,6 +1060,13 @@ gic_acpi_parse_madt_gicc(struct acpi_subtable_header *header,
>   return -ENOMEM;
>  
>   gic_acpi_register_redist(gicc->gicr_base_address, redist_base);
> +
> + acpi_data.maint_irq = gicc->vgic_interrupt;
> + acpi_data.maint_irq_mode = (gicc->flags & ACPI_MADT_VGIC_IRQ_MODE) ?
> + ACPI_EDGE_SENSITIVE : ACPI_LEVEL_SENSITIVE;
> + acpi_data.vctrl_base = gicc->gich_base_address;
> + acpi_data.vcpu_base = gicc->gicv_base_address;
> +
>   return 0;
>  }
>  
> @@ -,6 +1158,40 @@ static bool __init acpi_validate_gic_table(struct acpi_subtable_header *header,
>  }
>  
>  #define ACPI_GICV3_DIST_MEM_SIZE (SZ_64K)
> +#define ACPI_GICV2_VCTRL_MEM_SIZE(SZ_4K)
> +#define ACPI_GICV2_VCPU_MEM_SIZE (SZ_8K)
> +
> +static void __init gic_acpi_setup_kvm_info(void)
> +{
> + int irq;
> +
> + gic_v3_kvm_info.type = GIC_V3;
> +
> + irq = acpi_register_gsi(NULL, acpi_data.maint_irq,
> + acpi_data.maint_irq_mode,
> + ACPI_ACTIVE_HIGH);
> + if (irq > 0)
> + gic_v3_kvm_info.maint_irq = irq;
> +
> + if (acpi_data.vctrl_base) {
> + struct resource *vctrl = &gic_v3_kvm_info.vctrl;
> +
> + vctrl->f

Re: [PATCH v3 2/9] clocksource: arm_arch_timer: Extend arch_timer_kvm_info to get the virtual IRQ

2016-03-08 Thread Julien Grall

Hi Christoffer,

On 09/03/2016 10:27, Christoffer Dall wrote:

On Tue, Mar 08, 2016 at 11:29:26AM +, Julien Grall wrote:

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index b7ab588..d8887f3 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -701,6 +701,8 @@ static void __init arch_timer_common_init(void)
arch_timer_banner(arch_timers_present);
arch_counter_register(arch_timers_present);
arch_timer_arch_init();
+
+   arch_timer_kvm_info.virtual_irq = arch_timer_ppi[VIRT_PPI];


why is this in common_init and not just in init?


I thought we wanted to initialize virtual_irq for both the system
register timer and the memory-mapped timer. However, as discussed IRL, KVM
mandates the system register timer, so I will initialize virtual_irq in
arch_timer_init.


Cheers,

--
Julien Grall


Re: [PATCH 0/2] mm: Enable page parallel initialisation for Power

2016-03-08 Thread Li Zhang
On Wed, Mar 9, 2016 at 12:28 PM, Balbir Singh  wrote:
>
>
> On 09/03/16 15:17, Li Zhang wrote:
>> On Tue, Mar 8, 2016 at 10:45 PM, Balbir Singh  wrote:
>>>
>>> On 08/03/16 14:55, Li Zhang wrote:
 From: Li Zhang 

 Upstream has supported parallel page initialisation for X86 and the
 boot time is improved greatly. Some tests have been done for Power.

 Here is the result I have done with different memory size.

 * 4GB memory:
 boot time is as the following:
 with patch vs without patch: 10.4s vs 24.5s
 boot time is improved 57%
 * 200GB memory:
 boot time looks the same with and without patches.
 boot time is about 38s
 * 32TB memory:
 boot time looks the same with and without patches
 boot time is about 160s.
 The boot time is much shorter than X86 with 24TB memory.
 From community discussion, it costs about 694s for X86 24T system.

 From the code's point of view, parallel initialisation improves performance by
 deferring memory initialisation to kswap with N kthreads, so it should
 improve the performance theoretically.

 From the test result, On X86, performance is improved greatly with huge
 memory. But on Power platform, it is improved greatly with less than
 100GB memory. For huge memory, it is not improved greatly. But it saves
 the time with several threads at least, as the following information
 shows(32TB system log):

 [   22.648169] node 9 initialised, 16607461 pages in 280ms
 [   22.783772] node 3 initialised, 23937243 pages in 410ms
 [   22.858877] node 6 initialised, 29179347 pages in 490ms
 [   22.863252] node 2 initialised, 29179347 pages in 490ms
 [   22.907545] node 0 initialised, 32049614 pages in 540ms
 [   22.920891] node 15 initialised, 32212280 pages in 550ms
 [   22.923236] node 4 initialised, 32306127 pages in 550ms
 [   22.923384] node 12 initialised, 32314319 pages in 550ms
 [   22.924754] node 8 initialised, 32314319 pages in 550ms
 [   22.940780] node 13 initialised, 33353677 pages in 570ms
 [   22.940796] node 11 initialised, 33353677 pages in 570ms
 [   22.941700] node 5 initialised, 33353677 pages in 570ms
 [   22.941721] node 10 initialised, 33353677 pages in 570ms
 [   22.941876] node 7 initialised, 33353677 pages in 570ms
 [   22.944946] node 14 initialised, 33353677 pages in 570ms
 [   22.946063] node 1 initialised, 33345485 pages in 580ms

 It saves about 550*16 ms at least, although that is negligible compared to
 the boot time of about 160 seconds. What's more, the boot time is much shorter
 on Power even without the patches than on x86 for huge memory machines.

 So this patchset is still necessary to be enabled for Power.


>> Hi Balbir,
>>
>> Thanks for your review.
>>
>>> The patchset looks good, two questions
>>>
>>> 1. The patchset is still necessary for
>>> a. systems with smaller amount of RAM?
>>I think it is. Currently, I tested systems for 4GB, 50GB, and
>> boot time is improved.
>>We may test more systems with different memory size in the future.
>>> b. Theoretically it improves boot time?
>>The boot time is improved a little bit for huge memory system
>> and it can be ignored.
>>But I think it's still necessary to enable this feature.
>>
>>> 2. the pgdat->node_spanned_pages >> 8 sounds arbitrary
>>> On a system with 2TB*16 nodes, it would initialize about 8GB before 
>>> calling deferred init?
>>> Don't we need at-least 32GB + space for other early hash allocations
>>> BTW, My expectation was that 32TB would imply 32GB+32GB of large hash 
>>> allocations early on
>>   pgdat->node_spanned_pages >> 8 means that it allocates the size
>> of the memory on one node.
>>   On a system with 2TB *16nodes, it will allocate 16*8GB = 128GB.
>>   I am not sure if it can be minimised to >> 16 to make sure all
>> the architectures with different
>>   memory size work well.  And this is also mentioned in early
>> discussion for X86, so I choose  >> 8.
>>
>> *From the code as the following:
>>
>>   free_area_init_core ->
>>  memmap_init->
>>   update_defer_init
>>  #define memmap_init(size, nid, zone, start_pfn) \
>>memmap_init_zone((size), (nid), (zone), (start_pfn), MEMMAP_EARLY)
>>
>>  memmap_init_zone is based on a zone, but free_area_init_core will
>> help find the highest
>>  zone on the node. And update_defer_init() get max initialised
>> memory on highest zone for a node to
>>  reserve for early initialisation.
>>
>>  static void __paginginit free_area_init_core(struct pglist_data *pgdat)
>>  {
>> ...
>>for (j = 0; j < MAX_NR_ZONES; j++) {
>>   
>>  memmap_init(size, nid, j, zone_start_fn);   //find
>> the highest 

Re: [PATCH v3 3/9] irqchip/gic-v2: Gather ACPI specific data in a single structure

2016-03-08 Thread Christoffer Dall
On Tue, Mar 08, 2016 at 11:29:27AM +, Julien Grall wrote:
> For now, there is only one member. More member will be added later.

questionable commit message

> 
> Signed-off-by: Julien Grall 
> 
> ---
> Cc: Thomas Gleixner 
> Cc: Jason Cooper 
> Cc: Marc Zyngier 
> 
> Changes in v2:
> - Patch added
> ---
>  drivers/irqchip/irq-gic.c | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index 8f9ebf7..fbde202 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -1245,7 +1245,10 @@ IRQCHIP_DECLARE(pl390, "arm,pl390", gic_of_init);
>  #endif
>  
>  #ifdef CONFIG_ACPI
> -static phys_addr_t cpu_phy_base __initdata;
> +static struct
> +{
> + phys_addr_t cpu_phy_base;
> +} acpi_data __initdata;
>  
>  static int __init
>  gic_acpi_parse_madt_cpu(struct acpi_subtable_header *header,
> @@ -1265,10 +1268,10 @@ gic_acpi_parse_madt_cpu(struct acpi_subtable_header *header,
>* All CPU interface addresses have to be the same.
>*/
>   gic_cpu_base = processor->base_address;
> - if (cpu_base_assigned && gic_cpu_base != cpu_phy_base)
> + if (cpu_base_assigned && gic_cpu_base != acpi_data.cpu_phy_base)
>   return -EINVAL;
>  
> - cpu_phy_base = gic_cpu_base;
> + acpi_data.cpu_phy_base = gic_cpu_base;
>   cpu_base_assigned = 1;
>   return 0;
>  }
> @@ -1316,7 +1319,7 @@ static int __init gic_v2_acpi_init(struct acpi_subtable_header *header,
>   return -EINVAL;
>   }
>  
> - cpu_base = ioremap(cpu_phy_base, ACPI_GIC_CPU_IF_MEM_SIZE);
> + cpu_base = ioremap(acpi_data.cpu_phy_base, ACPI_GIC_CPU_IF_MEM_SIZE);
>   if (!cpu_base) {
>   pr_err("Unable to map GICC registers\n");
>   return -ENOMEM;
> -- 
> 1.9.1
> 
super nit: I would use cpu_phys_base instead of cpu_phy_base, but I'll
leave it up to you.

Acked-by: Christoffer Dall 
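
For readers unfamiliar with the pattern under review, the following is a minimal userspace sketch of gathering parse-time state into one anonymous static structure and rejecting mismatching CPU interface addresses. It is not the driver code: `__initdata` is a kernel-only annotation and is stubbed out here, `demo_addr_t` and `parse_cpu_base()` are hypothetical names, the `cpu_base_assigned` flag is folded into the structure for brevity (the patch keeps it separate), and the member uses the `cpu_phys_base` spelling suggested in the review.

```c
#include <errno.h>
#include <stdint.h>

/* Kernel-only section annotation; defined away so the sketch builds in userspace. */
#define __initdata

typedef uint64_t demo_addr_t;	/* hypothetical stand-in for phys_addr_t */

/* All probe-time state lives in one place; more members can be added later. */
static struct {
	demo_addr_t cpu_phys_base;
	int cpu_base_assigned;
} acpi_data __initdata;

/*
 * Hypothetical analogue of gic_acpi_parse_madt_cpu(): every CPU interface
 * must report the same base address, otherwise the table is rejected.
 */
static int parse_cpu_base(demo_addr_t base)
{
	if (acpi_data.cpu_base_assigned && base != acpi_data.cpu_phys_base)
		return -EINVAL;

	acpi_data.cpu_phys_base = base;
	acpi_data.cpu_base_assigned = 1;
	return 0;
}
```

Calling `parse_cpu_base()` twice with the same address succeeds both times; a third call with a different address returns `-EINVAL`.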


Re: [PATCH 2/3] perf/x86/pebs: add workaround for broken OVFL status on HSW

2016-03-08 Thread Stephane Eranian
On Tue, Mar 8, 2016 at 9:34 PM, Stephane Eranian  wrote:
> On Tue, Mar 8, 2016 at 1:13 PM, Stephane Eranian  wrote:
>> Hi,
>>
>> On Tue, Mar 8, 2016 at 1:07 PM, Peter Zijlstra  wrote:
>>> On Tue, Mar 08, 2016 at 12:59:23PM -0800, Stephane Eranian wrote:
>>>> hi,
>>>>
>>>> On Mon, Mar 7, 2016 at 12:25 PM, Peter Zijlstra wrote:
>>>> >
>>>> > On Mon, Mar 07, 2016 at 07:27:31PM +0100, Jiri Olsa wrote:
>>>> > > On Mon, Mar 07, 2016 at 01:18:40PM +0100, Peter Zijlstra wrote:
>>>> > > > On Mon, Mar 07, 2016 at 11:24:13AM +0100, Peter Zijlstra wrote:
>>>> > > >
>>>> > > > > I suspect Andi is having something along:
>>>> > > > >
>>>> > > > >   lkml.kernel.org/r/1445458568-16956-1-git-send-email-a...@firstfloor.org
>>>> > > > >
>>>> > > > > applied to his tree.
>>>> > > >
>>>> > > > OK, I munged a bunch of patches together, please have a hard look at the
>>>> > > > end result found in:
>>>> > > >
>>>> > > >   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/core
>>>> > > >
>>>>
>>>> I ran this kernel on Haswell. Even with Andi's fixes the problem I identified is
>>>> still there, so my patch is still needed.
>>>
>>> Right, your patch should be included in that kernel, or did I make a
>>> royal mess of things?
>>>
>> No, it is as expected for the OVF PMI fix.
>>
>>> I put Andi's late status ack on top of your patch.
>>>
> Ok, I ran into a problem on Broadwell with your branch with Andi's
> patches. I see

Sorry this is with tip.git and not your branch. Will try with it too.

> a problem which had disappeared since SandyBridge:
>
> [11551.128422] ------------[ cut here ]------------
> [11551.128435] WARNING: CPU: 3 PID: 12114 at
> arch/x86/events/intel/core.c:1868 intel_pmu_handle_irq+0x2da/0x4b0()
> [11551.128437] perfevents: irq loop stuck!
> [11551.128469][] dump_stack+0x4d/0x63
> [11551.128479]  [] warn_slowpath_common+0x97/0xe0
> [11551.128482]  [] warn_slowpath_fmt+0x46/0x50
> [11551.128486]  [] intel_pmu_handle_irq+0x2da/0x4b0
> [11551.128491]  [] perf_event_nmi_handler+0x39/0x60
> [11551.128494]  [] nmi_handle+0x61/0x110
> [11551.128497]  [] default_do_nmi+0x44/0x110
> [11551.128500]  [] do_nmi+0xd7/0x140
> [11551.128504]  [] end_repeat_nmi+0x1a/0x1e
> [11551.128507]  [] ? native_write_msr+0x6/0x30
> [11551.128510]  [] ? native_write_msr+0x6/0x30
> [11551.128514]  [] ? native_write_msr+0x6/0x30
> [11551.128515]  <>  [] ?
> intel_pmu_enable_event+0x215/0x230
> [11551.128520]  [] x86_pmu_start+0x8d/0x120
> [11551.128523]  [] x86_pmu_enable+0x27b/0x2f0
> [11551.128527]  [] perf_pmu_enable+0x1d/0x30
> [11551.128530]  [] ctx_resched+0x5a/0x70
> [11551.128532]  [] __perf_event_enable+0x1ac/0x210
> [11551.128537]  [] event_function+0xa1/0x170
> [11551.128540]  [] ? perf_duration_warn+0x70/0x70
> [11551.128543]  [] remote_function+0x47/0x60
> [11551.128547]  [] generic_exec_single+0xa8/0xb0
> [11551.128550]  [] ? perf_duration_warn+0x70/0x70
> [11551.128553]  [] ? perf_duration_warn+0x70/0x70
> [11551.128555]  [] smp_call_function_single+0xa8/0x100
> [11551.128559]  [] event_function_call+0x84/0x100
> [11551.128561]  [] ? ctx_resched+0x70/0x70
> [11551.128564]  [] ? ctx_resched+0x70/0x70
> [11551.128566]  [] ? perf_ctx_lock+0x30/0x30
> [11551.128570]  [] _perf_event_enable+0x60/0x80
> [11551.128572]  [] perf_ioctl+0x271/0x3e0
>
> The infinite loop in the irq handler!
>
> But here it seems there is a race with a perf_events ioctl() to likely
> reset the period.
> I am not using the perf tool here just running a self-monitoring task.
>
>
>>> Also note, Ingo merged most of those patches today, all except the top
>>> 3, because Andi wanted to double check something.
>>


Re: [PATCH v1 1/1] mfd: intel-lpss: Pass I2C configuration via properties on BXT

2016-03-08 Thread Lee Jones
On Tue, 26 Jan 2016, Andy Shevchenko wrote:

> From: Mika Westerberg 
> 
> The I2C host controller needs to be configured properly in order to meet the
> I2C timings specified in the I2C protocol specification. Some Intel Broxton
> based machines do not have this information in the ACPI namespace (or the
> boot firmware does not support ACPI at all) so we use built-in device
> properties instead.
> 
> Signed-off-by: Mika Westerberg 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/mfd/intel-lpss-acpi.c | 12 
>  drivers/mfd/intel-lpss-pci.c  | 12 
>  2 files changed, 24 insertions(+)

Applied, thanks.

> diff --git a/drivers/mfd/intel-lpss-acpi.c b/drivers/mfd/intel-lpss-acpi.c
> index 06f00d6..5a8d9c7 100644
> --- a/drivers/mfd/intel-lpss-acpi.c
> +++ b/drivers/mfd/intel-lpss-acpi.c
> @@ -44,8 +44,20 @@ static const struct intel_lpss_platform_info bxt_info = {
>   .clk_rate = 1,
>  };
>  
> +static struct property_entry bxt_i2c_properties[] = {
> + PROPERTY_ENTRY_U32("i2c-sda-hold-time-ns", 42),
> + PROPERTY_ENTRY_U32("i2c-sda-falling-time-ns", 171),
> + PROPERTY_ENTRY_U32("i2c-scl-falling-time-ns", 208),
> + { },
> +};
> +
> +static struct property_set bxt_i2c_pset = {
> + .properties = bxt_i2c_properties,
> +};
> +
>  static const struct intel_lpss_platform_info bxt_i2c_info = {
>   .clk_rate = 13300,
> + .pset = &bxt_i2c_pset,
>  };
>  
>  static const struct acpi_device_id intel_lpss_acpi_ids[] = {
> diff --git a/drivers/mfd/intel-lpss-pci.c b/drivers/mfd/intel-lpss-pci.c
> index a7136c7..92b456f 100644
> --- a/drivers/mfd/intel-lpss-pci.c
> +++ b/drivers/mfd/intel-lpss-pci.c
> @@ -107,8 +107,20 @@ static const struct intel_lpss_platform_info 
> bxt_uart_info = {
>   .pset = &uart_pset,
>  };
>  
> +static struct property_entry bxt_i2c_properties[] = {
> + PROPERTY_ENTRY_U32("i2c-sda-hold-time-ns", 42),
> + PROPERTY_ENTRY_U32("i2c-sda-falling-time-ns", 171),
> + PROPERTY_ENTRY_U32("i2c-scl-falling-time-ns", 208),
> + { },
> +};
> +
> +static struct property_set bxt_i2c_pset = {
> + .properties = bxt_i2c_properties,
> +};
> +
>  static const struct intel_lpss_platform_info bxt_i2c_info = {
>   .clk_rate = 13300,
> + .pset = &bxt_i2c_pset,
>  };
>  
>  static const struct pci_device_id intel_lpss_pci_ids[] = {

-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [Ocfs2-devel] [PATCH v4 2/5] ocfs2: sysfile interfaces for online file check

2016-03-08 Thread Eric Ren



On 02/29/2016 01:17 PM, Gang He wrote:

Implement online file check sysfile interfaces, e.g.
how to create the related sysfile according to device name,
how to display/handle file check request from the sysfile.

Signed-off-by: Gang He 

Tested-by: Eric Ren 

---
  fs/ocfs2/Makefile|   3 +-
  fs/ocfs2/filecheck.c | 606 +++
  fs/ocfs2/filecheck.h |  49 +
  fs/ocfs2/inode.h |   3 +
  4 files changed, 660 insertions(+), 1 deletion(-)
  create mode 100644 fs/ocfs2/filecheck.c
  create mode 100644 fs/ocfs2/filecheck.h

diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile
index ce210d4..e27e652 100644
--- a/fs/ocfs2/Makefile
+++ b/fs/ocfs2/Makefile
@@ -41,7 +41,8 @@ ocfs2-objs := \
quota_local.o   \
quota_global.o  \
xattr.o \
-   acl.o
+   acl.o   \
+   filecheck.o
  
  ocfs2_stackglue-objs := stackglue.o

  ocfs2_stack_o2cb-objs := stack_o2cb.o
diff --git a/fs/ocfs2/filecheck.c b/fs/ocfs2/filecheck.c
new file mode 100644
index 000..2cabbcf
--- /dev/null
+++ b/fs/ocfs2/filecheck.c
@@ -0,0 +1,606 @@
+/* -*- mode: c; c-basic-offset: 8; -*-
+ * vim: noexpandtab sw=8 ts=8 sts=0:
+ *
+ * filecheck.c
+ *
+ * Code which implements online file check.
+ *
+ * Copyright (C) 2016 SuSE.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ocfs2.h"
+#include "ocfs2_fs.h"
+#include "stackglue.h"
+#include "inode.h"
+
+#include "filecheck.h"
+
+
+/* File check error strings,
+ * must correspond with error number in header file.
+ */
+static const char * const ocfs2_filecheck_errs[] = {
+   "SUCCESS",
+   "FAILED",
+   "INPROGRESS",
+   "READONLY",
+   "INJBD",
+   "INVALIDINO",
+   "BLOCKECC",
+   "BLOCKNO",
+   "VALIDFLAG",
+   "GENERATION",
+   "UNSUPPORTED"
+};
+
+static DEFINE_SPINLOCK(ocfs2_filecheck_sysfs_lock);
+static LIST_HEAD(ocfs2_filecheck_sysfs_list);
+
+struct ocfs2_filecheck {
+   struct list_head fc_head;   /* File check entry list head */
+   spinlock_t fc_lock;
+   unsigned int fc_max;/* Maximum number of entry in list */
+   unsigned int fc_size;   /* Current entry count in list */
+   unsigned int fc_done;   /* Finished entry count in list */
+};
+
+struct ocfs2_filecheck_sysfs_entry {   /* sysfs entry per mounting */
+   struct list_head fs_list;
+   atomic_t fs_count;
+   struct super_block *fs_sb;
+   struct kset *fs_devicekset;
+   struct kset *fs_fcheckkset;
+   struct ocfs2_filecheck *fs_fcheck;
+};
+
+#define OCFS2_FILECHECK_MAXSIZE100
+#define OCFS2_FILECHECK_MINSIZE10
+
+/* File check operation type */
+enum {
+   OCFS2_FILECHECK_TYPE_CHK = 0,   /* Check a file(inode) */
+   OCFS2_FILECHECK_TYPE_FIX,   /* Fix a file(inode) */
+   OCFS2_FILECHECK_TYPE_SET = 100  /* Set entry list maximum size */
+};
+
+struct ocfs2_filecheck_entry {
+   struct list_head fe_list;
+   unsigned long fe_ino;
+   unsigned int fe_type;
+   unsigned int fe_done:1;
+   unsigned int fe_status:31;
+};
+
+struct ocfs2_filecheck_args {
+   unsigned int fa_type;
+   union {
+   unsigned long fa_ino;
+   unsigned int fa_len;
+   };
+};
+
+static const char *
+ocfs2_filecheck_error(int errno)
+{
+   if (!errno)
+   return ocfs2_filecheck_errs[errno];
+
+   BUG_ON(errno < OCFS2_FILECHECK_ERR_START ||
+  errno > OCFS2_FILECHECK_ERR_END);
+   return ocfs2_filecheck_errs[errno - OCFS2_FILECHECK_ERR_START + 1];
+}
+
+static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
+   struct kobj_attribute *attr,
+   char *buf);
+static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
+struct kobj_attribute *attr,
+const char *buf, size_t count);
+static struct kobj_attribute ocfs2_attr_filecheck_chk =
+   __ATTR(check, S_IRUSR | S_IWUSR,
+   ocfs2_filecheck_show,
+   ocfs2_filecheck_store);
+static struct kobj_attribute ocfs2_attr_filecheck_fix =
+   __ATTR(fix, S_IRUSR | S_IWUSR,
+   ocfs2_filecheck_show,
+ 

Re: [Ocfs2-devel] [PATCH v4 4/5] ocfs2: check/fix inode block for online file check

2016-03-08 Thread Eric Ren



On 02/29/2016 01:18 PM, Gang He wrote:

Implement online check or fix of an inode block while
reading the inode block into memory.

Signed-off-by: Gang He 

Tested-by: Eric Ren 

---
  fs/ocfs2/inode.c   | 225 +++--
  fs/ocfs2/ocfs2_trace.h |   2 +
  2 files changed, 218 insertions(+), 9 deletions(-)

diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index 8f87e05..6ce531e 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -53,6 +53,7 @@
  #include "xattr.h"
  #include "refcounttree.h"
  #include "ocfs2_trace.h"
+#include "filecheck.h"
  
  #include "buffer_head_io.h"
  
@@ -74,6 +75,14 @@ static int ocfs2_truncate_for_delete(struct ocfs2_super *osb,

struct inode *inode,
struct buffer_head *fe_bh);
  
+static int ocfs2_filecheck_read_inode_block_full(struct inode *inode,

+struct buffer_head **bh,
+int flags, int type);
+static int ocfs2_filecheck_validate_inode_block(struct super_block *sb,
+   struct buffer_head *bh);
+static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
+ struct buffer_head *bh);
+
  void ocfs2_set_inode_flags(struct inode *inode)
  {
unsigned int flags = OCFS2_I(inode)->ip_attr;
@@ -127,6 +136,7 @@ struct inode *ocfs2_ilookup(struct super_block *sb, u64 
blkno)
  struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
 int sysfile_type)
  {
+   int rc = 0;
struct inode *inode = NULL;
struct super_block *sb = osb->sb;
struct ocfs2_find_inode_args args;
@@ -161,12 +171,17 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 
blkno, unsigned flags,
}
trace_ocfs2_iget5_locked(inode->i_state);
if (inode->i_state & I_NEW) {
-   ocfs2_read_locked_inode(inode, &args);
+   rc = ocfs2_read_locked_inode(inode, &args);
unlock_new_inode(inode);
}
if (is_bad_inode(inode)) {
iput(inode);
-   inode = ERR_PTR(-ESTALE);
+   if ((flags & OCFS2_FI_FLAG_FILECHECK_CHK) ||
+   (flags & OCFS2_FI_FLAG_FILECHECK_FIX))
+   /* Return OCFS2_FILECHECK_ERR_XXX related errno */
+   inode = ERR_PTR(rc);
+   else
+   inode = ERR_PTR(-ESTALE);
goto bail;
}
  
@@ -409,7 +424,7 @@ static int ocfs2_read_locked_inode(struct inode *inode,

struct ocfs2_super *osb;
struct ocfs2_dinode *fe;
struct buffer_head *bh = NULL;
-   int status, can_lock;
+   int status, can_lock, lock_level = 0;
u32 generation = 0;
  
  	status = -EINVAL;

@@ -477,7 +492,7 @@ static int ocfs2_read_locked_inode(struct inode *inode,
mlog_errno(status);
return status;
}
-   status = ocfs2_inode_lock(inode, NULL, 0);
+   status = ocfs2_inode_lock(inode, NULL, lock_level);
if (status) {
make_bad_inode(inode);
mlog_errno(status);
@@ -494,16 +509,32 @@ static int ocfs2_read_locked_inode(struct inode *inode,
}
  
  	if (can_lock) {

-   status = ocfs2_read_inode_block_full(inode, &bh,
-OCFS2_BH_IGNORE_CACHE);
+   if (args->fi_flags & OCFS2_FI_FLAG_FILECHECK_CHK)
+   status = ocfs2_filecheck_read_inode_block_full(inode,
+   &bh, OCFS2_BH_IGNORE_CACHE, 0);
+   else if (args->fi_flags & OCFS2_FI_FLAG_FILECHECK_FIX)
+   status = ocfs2_filecheck_read_inode_block_full(inode,
+   &bh, OCFS2_BH_IGNORE_CACHE, 1);
+   else
+   status = ocfs2_read_inode_block_full(inode,
+   &bh, OCFS2_BH_IGNORE_CACHE);
} else {
status = ocfs2_read_blocks_sync(osb, args->fi_blkno, 1, &bh);
/*
 * If buffer is in jbd, then its checksum may not have been
 * computed as yet.
 */
-   if (!status && !buffer_jbd(bh))
-   status = ocfs2_validate_inode_block(osb->sb, bh);
+   if (!status && !buffer_jbd(bh)) {
+   if (args->fi_flags & OCFS2_FI_FLAG_FILECHECK_CHK)
+   status = ocfs2_filecheck_validate_inode_block(
+   osb->sb, bh);
+   else if (args->fi_flags & OCFS2_FI_FLAG_FILECHECK_FIX)
+   status = ocfs2

Re: [PATCH v3 5/9] irqchip/gic-v3: Gather all ACPI specific data in a single structure

2016-03-08 Thread Christoffer Dall
On Tue, Mar 08, 2016 at 11:29:29AM +, Julien Grall wrote:
> Even though all the variables aren't marked with __initdata, they are
> only used during initialization. So the structure is marked with
> __initdata.

Not sure I understand this commit message.

As I see it, this commit includes two changes:

1. Mark the variables only used during init with __initdata

2. Move the variables into a structure

If I get that right, can you argue for both changes?

Thanks,
-Christoffer

> 
> Signed-off-by: Julien Grall 
> 
> ---
> Cc: Thomas Gleixner 
> Cc: Jason Cooper 
> Cc: Marc Zyngier 
> 
> Changes in v3:
> - Patch added
> ---
>  drivers/irqchip/irq-gic-v3.c | 60 
> 
>  1 file changed, 33 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 5b7d3c2..50e87e6 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -968,19 +968,22 @@ out_unmap_dist:
>  IRQCHIP_DECLARE(gic_v3, "arm,gic-v3", gic_of_init);
>  
>  #ifdef CONFIG_ACPI
> -static void __iomem *dist_base;
> -static struct redist_region *redist_regs __initdata;
> -static u32 nr_redist_regions __initdata;
> -static bool single_redist;
> +static struct
> +{
> + void __iomem *dist_base;
> + struct redist_region *redist_regs;
> + u32 nr_redist_regions;
> + bool single_redist;
> +} acpi_data __initdata;
>  
>  static void __init
>  gic_acpi_register_redist(phys_addr_t phys_base, void __iomem *redist_base)
>  {
>   static int count = 0;
>  
> - redist_regs[count].phys_base = phys_base;
> - redist_regs[count].redist_base = redist_base;
> - redist_regs[count].single_redist = single_redist;
> + acpi_data.redist_regs[count].phys_base = phys_base;
> + acpi_data.redist_regs[count].redist_base = redist_base;
> + acpi_data.redist_regs[count].single_redist = acpi_data.single_redist;
>   count++;
>  }
>  
> @@ -1008,7 +1011,7 @@ gic_acpi_parse_madt_gicc(struct acpi_subtable_header 
> *header,
>  {
>   struct acpi_madt_generic_interrupt *gicc =
>   (struct acpi_madt_generic_interrupt *)header;
> - u32 reg = readl_relaxed(dist_base + GICD_PIDR2) & GIC_PIDR2_ARCH_MASK;
> + u32 reg = readl_relaxed(acpi_data.dist_base + GICD_PIDR2) & 
> GIC_PIDR2_ARCH_MASK;
>   u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
>   void __iomem *redist_base;
>  
> @@ -1025,7 +1028,7 @@ static int __init gic_acpi_collect_gicr_base(void)
>   acpi_tbl_entry_handler redist_parser;
>   enum acpi_madt_type type;
>  
> - if (single_redist) {
> + if (acpi_data.single_redist) {
>   type = ACPI_MADT_TYPE_GENERIC_INTERRUPT;
>   redist_parser = gic_acpi_parse_madt_gicc;
>   } else {
> @@ -1076,14 +1079,14 @@ static int __init gic_acpi_count_gicr_regions(void)
>   count = acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_REDISTRIBUTOR,
> gic_acpi_match_gicr, 0);
>   if (count > 0) {
> - single_redist = false;
> + acpi_data.single_redist = false;
>   return count;
>   }
>  
>   count = acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
> gic_acpi_match_gicc, 0);
>   if (count > 0)
> - single_redist = true;
> + acpi_data.single_redist = true;
>  
>   return count;
>  }
> @@ -1103,7 +1106,7 @@ static bool __init acpi_validate_gic_table(struct 
> acpi_subtable_header *header,
>   if (count <= 0)
>   return false;
>  
> - nr_redist_regions = count;
> + acpi_data.nr_redist_regions = count;
>   return true;
>  }
>  
> @@ -1114,25 +1117,28 @@ gic_acpi_init(struct acpi_subtable_header *header, 
> const unsigned long end)
>  {
>   struct acpi_madt_generic_distributor *dist;
>   struct fwnode_handle *domain_handle;
> + size_t size;
>   int i, err;
>  
>   /* Get distributor base address */
>   dist = (struct acpi_madt_generic_distributor *)header;
> - dist_base = ioremap(dist->base_address, ACPI_GICV3_DIST_MEM_SIZE);
> - if (!dist_base) {
> + acpi_data.dist_base = ioremap(dist->base_address,
> +   ACPI_GICV3_DIST_MEM_SIZE);
> + if (!acpi_data.dist_base) {
>   pr_err("Unable to map GICD registers\n");
>   return -ENOMEM;
>   }
>  
> - err = gic_validate_dist_version(dist_base);
> + err = gic_validate_dist_version(acpi_data.dist_base);
>   if (err) {
> - pr_err("No distributor detected at @%p, giving up", dist_base);
> + pr_err("No distributor detected at @%p, giving up",
> +acpi_data.dist_base);
>   goto out_dist_unmap;
>   }
>  
> - redist_regs = kzalloc(sizeof(*redist_regs) * nr_redist_regions,
> -   GFP_KERNEL);
> - if (!redist_regs)

linux-next: Tree for Mar 9

2016-03-08 Thread Stephen Rothwell
Hi all,

Changes since 20160308:

The usb tree gained a conflict against the tip tree.

The gpio tree gained a conflict against the mfd tree.

The aio tree still had a build failure so I used the version from
next-20160111.

Non-merge commits (relative to Linus' tree): 10121
 7937 files changed, 377182 insertions(+), 189149 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
(this fails its final link) and pseries_le_defconfig and i386, sparc
and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 241 trees (counting Linus' and 36 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (7f02bf6b5f5d Merge tag 'sound-4.5' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound)
Merging fixes/master (36f90b0a2ddd Linux 4.5-rc2)
Merging kbuild-current/rc-fixes (3d1450d54a4f Makefile: Force gzip and xz on 
module install)
Merging arc-current/for-curr (fc77dbd34c5c Linux 4.5-rc6)
Merging arm-current/fixes (f474c8c857d9 ARM: 8544/1: set_memory_xx fixes)
Merging m68k-current/for-linus (daf670bc9d36 m68k/defconfig: Update defconfigs 
for v4.5-rc1)
Merging metag-fixes/fixes (0164a711c97b metag: Fix ioremap_wc/ioremap_cached 
build errors)
Merging mips-fixes/mips-fixes (1795cd9b3a91 Linux 3.16-rc5)
Merging powerpc-fixes/fixes (37c5e942bb2e powerpc/fsl-book3e: Avoid lbarx on 
e5500)
Merging powerpc-merge-mpe/fixes (bc0195aad0da Linux 4.2-rc2)
Merging sparc/master (f983cd32cd5d Merge branch 'parisc-4.5-2' of 
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux)
Merging net/master (133800d1f028 sctp: fix copying more bytes than expected in 
sctp_add_bind_addr)
Merging ipsec/master (52717aa43094 vti: Fix recource leeks on pmtu discovery)
Merging ipvs/master (7617a24f83b5 ipvs: correct initial offset of Call-ID 
header search in SIP persistence engine)
Merging wireless-drivers/master (10da848f67a7 ssb: host_soc depends on sprom)
Merging mac80211/master (2af8c4dc2e9c mac80211_hwsim: treat as part of mac80211 
for MAINTAINERS)
Merging sound-current/for-linus (ad09ef2cce91 Merge tag 'asoc-fix-v4.5-rc6' of 
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus)
Merging pci-current/for-linus (54c6e2dd00c3 PCI: Allow a NULL "parent" pointer 
in pci_bus_assign_domain_nr())
Merging driver-core.current/driver-core-linus (18558cae0272 Linux 4.5-rc4)
Merging tty.current/tty-linus (18558cae0272 Linux 4.5-rc4)
Merging usb.current/usb-linus (861c3849222b Merge tag 'usb-serial-4.5-rc7' of 
git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus)
Merging usb-gadget-fixes/fixes (3b2435192fe9 MAINTAINERS: drop OMAP USB and 
MUSB maintainership)
Merging usb-serial-fixes/usb-linus (f6cede5b49e8 Linux 4.5-rc7)
Merging usb-chipidea-fixes/ci-for-usb-stable (d144dfea8af7 usb: chipidea: otg: 
change workqueue ci_otg as freezable)
Merging staging.current/staging-linus (fc77dbd34c5c Linux 4.5-rc6)
Merging char-misc.current/char-misc-linus (fc77dbd34c5c Linux 4.5-rc6)
Merging input-current/for-linus (ff84dabe3c6e Input: colibri-vf50-ts - add 
missing #include )
Merging crypto-current/master (8a3978ad55fb crypto: marvell/cesa - fix test in 
mv_cesa_dev_dma_init())
Merging ide/master (e04a2bd6d8c9 drivers/ide: make ide-scan-pci.c driver 
explicitly non-modular)
Merging devicetree-current/devicetree/merge (f76502aa9140 of/dynamic: Fix test 
for PPC_PSERIES)
Merging rr-fixes/fixes (8244062ef1e5 modules

Re: linux-next: manual merge of the gpio tree with the mfd tree

2016-03-08 Thread Lee Jones
On Wed, 09 Mar 2016, Stephen Rothwell wrote:

> Hi Linus,
> 
> Today's linux-next merge of the gpio tree got a conflict in:
> 
>   drivers/gpio/gpio-tps65912.c
> 
> between commits:
> 
>   65b6555971d0 ("mfd: tps65912: Remove old driver in preparation for new 
> driver")
>   ca801a22f465 ("gpio: tps65912: Add GPIO driver for the TPS65912 PMIC")
> 
> from the mfd tree and commit:
> 
>   0964ac703edf ("gpio: tps65912: Use devm_gpiochip_add_data() for gpio 
> registration")
> 
> from the gpio tree.
> 
> I fixed it up (see below) and can carry the fix as necessary (no action
> is required).

I sent out a pull-request for this already.

Please pull 'ib-mfd-regulator-gpio-4.6' from my tree.

-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog


[PATCH] arm64: Fix the ptep_set_wrprotect() to set PTE_DIRTY if (PTE_DBM && !PTE_RDONLY)

2016-03-08 Thread Ganapatrao Kulkarni
Commit 2f4b829c625e ("arm64: Add support for hardware updates of the
access and dirty pte bits") introduced support for handling hardware
updates of the access flag and dirty status.

ptep_set_wrprotect() currently sets PTE_DIRTY if !PTE_RDONLY;
however, by design it is supposed to set PTE_DIRTY
only if (PTE_DBM && !PTE_RDONLY). This patch adds code to
test and set accordingly.

This patch fixes BUG,
kernel BUG at /build/linux-StrpB2/linux-4.4.0/fs/ext4/inode.c:2394!
Internal error: Oops - BUG: 0 [#1] SMP

on thunderx numa board, when ARM64_HW_AFDBM and NUMA_BALANCING are enabled.

Note: this patch has not been tested on a platform that supports AFDBM.
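
As an illustration of the intended behaviour only, the logic described above can be modelled in plain C roughly as follows. This is a non-atomic sketch: the real function operates on the PTE inside an exclusive load/store loop, and the `DEMO_*` bit positions and the `demo_wrprotect()` name are assumptions made for this example, not values taken from the patch.

```c
#include <stdint.h>

/* Illustrative bit positions only; the real definitions live in the arm64 pgtable headers. */
#define DEMO_PTE_RDONLY	(UINT64_C(1) << 7)
#define DEMO_PTE_WRITE	(UINT64_C(1) << 51)	/* doubles as PTE_DBM */
#define DEMO_PTE_DIRTY	(UINT64_C(1) << 55)

/* Non-atomic model of the write-protect step described in the commit message. */
static uint64_t demo_wrprotect(uint64_t pte)
{
	/* Hardware-dirty means: DBM (PTE_WRITE) set and PTE_RDONLY clear. */
	if ((pte & (DEMO_PTE_WRITE | DEMO_PTE_RDONLY)) == DEMO_PTE_WRITE)
		pte |= DEMO_PTE_DIRTY;	/* preserve the dirty state in software */

	pte |= DEMO_PTE_RDONLY;		/* make the entry read-only */
	pte &= ~DEMO_PTE_WRITE;		/* clear PTE_WRITE/PTE_DBM */
	return pte;
}
```

With this model, an entry that has neither PTE_WRITE nor PTE_RDONLY set (the case that was wrongly marked dirty before) comes out read-only but not dirty.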

Signed-off-by: Ganapatrao Kulkarni 
---
 arch/arm64/include/asm/pgtable.h | 24 ++--
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f506086..d396892 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -613,20 +613,24 @@ static inline pmd_t pmdp_get_and_clear(struct mm_struct 
*mm,
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
 static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long 
address, pte_t *ptep)
 {
-   pteval_t pteval;
+   pteval_t pteval, pteval2;
unsigned long tmp;
 
asm volatile("//ptep_set_wrprotect\n"
-   "   prfmpstl1strm, %2\n"
-   "1: ldxr%0, %2\n"
-   "   tst %0, %4  // check for hw dirty 
(!PTE_RDONLY)\n"
-   "   csel%1, %3, xzr, eq // set PTE_DIRTY|PTE_RDONLY if 
dirty\n"
-   "   orr %0, %0, %1  // if !dirty, PTE_RDONLY is 
already set\n"
-   "   and %0, %0, %5  // clear PTE_WRITE/PTE_DBM\n"
-   "   stxr%w1, %0, %2\n"
+   "   prfmpstl1strm, %3\n"
+   "1: ldxr%0, %3\n"
+   "   and %2, %0, %4  //extract bits PTE_WRITE and 
PTE_RDONLY\n"
+   "   cmp %2, %5  // compare wth PTE_WRITE\n"
+   "   b.ne2f\n"
+   "   orr %0, %0, %8  // Set PTE_DIRTY if (PTE_DBM && 
!PTE_RDONLY)\n"
+   "2: tst %0, %6  // check for !PTE_RDONLY\n"
+   "   csel%1, %6, xzr, eq // select PTE_RDONLY if 
!PTE_RDONLY\n"
+   "   orr %0, %0, %1  // set PTE_RDONLY if 
!PTE_RDONLY\n"
+   "   and %0, %0, %7  // clear PTE_WRITE/PTE_DBM\n"
+   "   stxr%w1, %0, %3\n"
"   cbnz%w1, 1b\n"
-   : "=&r" (pteval), "=&r" (tmp), "+Q" (pte_val(*ptep))
-   : "r" (PTE_DIRTY|PTE_RDONLY), "L" (PTE_RDONLY), "L" (~PTE_WRITE)
+   : "=&r" (pteval), "=&r" (tmp), "=&r" (pteval2), "+Q" (pte_val(*ptep))
+   : "r" (PTE_WRITE|PTE_RDONLY), "r" (PTE_WRITE), "r" (PTE_RDONLY), "L" 
(~PTE_WRITE), "L" (PTE_DIRTY)
: "cc");
 }
 
-- 
1.8.1.4



[PATCH v11 5/9] arm64: Kprobes with single stepping support

2016-03-08 Thread David Long
From: Sandeepa Prabhu 

Add support for basic kernel probes (kprobes) and jump probes
(jprobes) for ARM64.

Kprobes utilizes software breakpoint and single step debug
exceptions supported on ARM v8.

A software breakpoint is placed at the probe address to trap the
kernel execution into the kprobe handler.

ARM v8 supports enabling single stepping before the break exception
return (ERET), with next PC in exception return address (ELR_EL1). The
kprobe handler prepares an executable memory slot for out-of-line
execution with a copy of the original instruction being probed, and
enables single stepping. The PC is set to the out-of-line slot address
before the ERET. With this scheme, the instruction is executed with the
exact same register context except for the PC (and DAIF) registers.

Debug mask (PSTATE.D) is enabled only when single stepping a recursive
kprobe, e.g. during kprobe re-entry, so that the probed instruction can be
single stepped within the kprobe handler (exception) context.
The recursion depth of a kprobe is always 2, i.e. upon probe re-entry,
any further re-entry is prevented by not calling handlers, and the case
is counted as a missed kprobe.

Single stepping from the x-o-l slot has a drawback for PC-relative accesses
such as branches and symbolic literal loads: the offset from the new PC
(the slot address) is not guaranteed to fit in the immediate field of
the opcode. Such instructions need simulation, so reject
probing them.
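
To make the range problem concrete, here is a small self-contained sketch of the kind of check involved. It assumes the A64 encoding of LDR (literal), which carries a 19-bit signed immediate scaled by 4 bytes (roughly +/-1 MiB); the helper names are hypothetical and nothing here is taken from the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if byte offset `off` can be encoded in a signed immediate of
 * `bits` bits that the CPU scales by `scale` bytes.
 */
static bool demo_offset_fits(int64_t off, unsigned int bits, unsigned int scale)
{
	int64_t max = ((INT64_C(1) << (bits - 1)) - 1) * (int64_t)scale;
	int64_t min = -(INT64_C(1) << (bits - 1)) * (int64_t)scale;

	if (off % (int64_t)scale)
		return false;	/* not representable at this granularity */
	return off >= min && off <= max;
}

/* LDR (literal): 19-bit word-scaled offset from the PC of the instruction. */
static bool demo_ldr_lit_reachable(uint64_t slot, uint64_t target)
{
	return demo_offset_fits((int64_t)(target - slot), 19, 4);
}
```

If the out-of-line slot is further from the literal than this range allows, the copied instruction cannot be re-encoded to reach it, which is why such instructions are rejected rather than stepped out of line.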

Instructions generating exceptions or cpu mode change are rejected
for probing.

Exclusive load/store instructions are rejected too.  Additionally, the
code is checked to see if it is inside an exclusive load/store sequence
(code from Pratyush).

System instructions are mostly enabled for stepping, except MSR/MRS
accesses to "DAIF" flags in PSTATE, which are not safe for
probing.

Thanks to Steve Capper and Pratyush Anand for several suggested
changes.

Signed-off-by: Sandeepa Prabhu 
Signed-off-by: David A. Long 
Signed-off-by: Pratyush Anand 
---
 arch/arm64/Kconfig  |   1 +
 arch/arm64/include/asm/debug-monitors.h |   5 +
 arch/arm64/include/asm/insn.h   |   4 +-
 arch/arm64/include/asm/kprobes.h|  60 
 arch/arm64/include/asm/probes.h |  44 +++
 arch/arm64/include/asm/ptrace.h |   2 +-
 arch/arm64/kernel/Makefile  |   1 +
 arch/arm64/kernel/debug-monitors.c  |  18 +-
 arch/arm64/kernel/kprobes-arm64.c   | 121 
 arch/arm64/kernel/kprobes-arm64.h   |  35 +++
 arch/arm64/kernel/kprobes.c | 512 
 arch/arm64/kernel/vmlinux.lds.S |   1 +
 arch/arm64/mm/fault.c   |  25 ++
 13 files changed, 824 insertions(+), 5 deletions(-)
 create mode 100644 arch/arm64/include/asm/kprobes.h
 create mode 100644 arch/arm64/include/asm/probes.h
 create mode 100644 arch/arm64/kernel/kprobes-arm64.c
 create mode 100644 arch/arm64/kernel/kprobes-arm64.h
 create mode 100644 arch/arm64/kernel/kprobes.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 4211b0d..c395386 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -81,6 +81,7 @@ config ARM64
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RCU_TABLE_FREE
select HAVE_SYSCALL_TRACEPOINTS
+   select HAVE_KPROBES
select IOMMU_DMA if IOMMU_SUPPORT
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
diff --git a/arch/arm64/include/asm/debug-monitors.h 
b/arch/arm64/include/asm/debug-monitors.h
index 279c85b5..274ab60 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -78,6 +78,11 @@
 
 #define CACHE_FLUSH_IS_SAFE1
 
+/* kprobes BRK opcodes with ESR encoding  */
+#define BRK64_ESR_MASK 0x
+#define BRK64_ESR_KPROBES  0x0004
+#define BRK64_OPCODE_KPROBES   (AARCH64_BREAK_MON | (BRK64_ESR_KPROBES << 5))
+
 /* AArch32 */
 #define DBG_ESR_EVT_BKPT   0x4
 #define DBG_ESR_EVT_VECC   0x5
diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 72dda48..b9567a1 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -253,6 +253,8 @@ __AARCH64_INSN_FUNCS(ldr_reg,   0x3FE0EC00, 0x38606800)
 __AARCH64_INSN_FUNCS(ldr_lit,  0xBF00, 0x1800)
 __AARCH64_INSN_FUNCS(ldrsw_lit,0xFF00, 0x9800)
 __AARCH64_INSN_FUNCS(exclusive,0x3F80, 0x0800)
+__AARCH64_INSN_FUNCS(load_ex,  0x3F40, 0x0840)
+__AARCH64_INSN_FUNCS(store_ex, 0x3F40, 0x0800)
 __AARCH64_INSN_FUNCS(stp_post, 0x7FC0, 0x2880)
 __AARCH64_INSN_FUNCS(ldp_post, 0x7FC0, 0x28C0)
 __AARCH64_INSN_FUNCS(stp_pre,  0x7FC0, 0x2980)
@@ -401,7 +403,7 @@ bool aarch32_insn_is_wide(u32 insn);
 #define A32_RT_OFFSET  12
 #define A32_RT2_OFFSET  0
 
-u32 aarch64_extract_system_register(u32 insn);
+u32 aarch64_insn_extract_system_reg(u32 insn);
 u32 aarch32_insn_extract_reg_num(u32 insn, int offset);
 u32 aar

[PATCH v11 8/9] arm64: Add kernel return probes support (kretprobes)

2016-03-08 Thread David Long
From: Sandeepa Prabhu 

The pre-handler of this special 'trampoline' kprobe executes the return
probe handler functions and restores original return address in ELR_EL1.
This way the saved pt_regs still hold the original register context to be
carried back to the probed kernel function.

Signed-off-by: Sandeepa Prabhu 
Signed-off-by: David A. Long 
---
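The flow described in the commit message above can be sketched in user space. This is only an illustration, assuming a mock register frame: mock_regs, mock_instance, MOCK_TRAMPOLINE and both functions are hypothetical names, not the kernel's. At function entry the real link register value is saved and replaced by the trampoline address; when the trampoline fires, the handler runs and execution resumes at the saved address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pt_regs: only x30 (LR) and pc matter here. */
struct mock_regs {
	uint64_t regs[31];
	uint64_t pc;
};

/* Hypothetical stand-in for a kretprobe instance. */
struct mock_instance {
	uint64_t ret_addr;	/* real return address saved at function entry */
};

/* Arbitrary placeholder for the address of the trampoline code. */
#define MOCK_TRAMPOLINE 0xffff000000001000ULL

/* Function entry: remember the real LR, then point LR at the trampoline. */
static void mock_prepare_kretprobe(struct mock_instance *ri,
				   struct mock_regs *regs)
{
	ri->ret_addr = regs->regs[30];
	regs->regs[30] = MOCK_TRAMPOLINE;
}

/*
 * Trampoline hit: run the optional user handler, then resume at the
 * real return address. The address is also returned to the caller.
 */
static uint64_t mock_trampoline_handler(struct mock_instance *ri,
					struct mock_regs *regs,
					void (*handler)(struct mock_regs *))
{
	if (handler)
		handler(regs);
	regs->pc = ri->ret_addr;
	return ri->ret_addr;
}
```

Calling mock_prepare_kretprobe() and then mock_trampoline_handler() on the same instance should leave pc at the value LR held before the probe was armed. The sketch leaves out the per-task instance list and locking that the real handler needs.
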
 arch/arm64/Kconfig  |  1 +
 arch/arm64/kernel/kprobes.c | 75 -
 2 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c395386..72412de 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -82,6 +82,7 @@ config ARM64
select HAVE_RCU_TABLE_FREE
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_KPROBES
+   select HAVE_KRETPROBES if HAVE_KPROBES
select IOMMU_DMA if IOMMU_SUPPORT
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
diff --git a/arch/arm64/kernel/kprobes.c b/arch/arm64/kernel/kprobes.c
index bd3f233..13d 100644
--- a/arch/arm64/kernel/kprobes.c
+++ b/arch/arm64/kernel/kprobes.c
@@ -534,7 +534,80 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
 
 void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
 {
-   return NULL;
+   struct kretprobe_instance *ri = NULL;
+   struct hlist_head *head, empty_rp;
+   struct hlist_node *tmp;
+   unsigned long flags, orig_ret_addr = 0;
+   unsigned long trampoline_address =
+   (unsigned long)&kretprobe_trampoline;
+
+   INIT_HLIST_HEAD(&empty_rp);
+   kretprobe_hash_lock(current, &head, &flags);
+
+   /*
+* It is possible to have multiple instances associated with a given
+* task either because multiple functions in the call path have
+* a return probe installed on them, and/or more than one return
+* probe was registered for a target function.
+*
+* We can handle this because:
+* - instances are always inserted at the head of the list
+* - when multiple return probes are registered for the same
+*   function, the first instance's ret_addr will point to the
+*   real return address, and all the rest will point to
+*   kretprobe_trampoline
+*/
+   hlist_for_each_entry_safe(ri, tmp, head, hlist) {
+   if (ri->task != current)
+   /* another task is sharing our hash bucket */
+   continue;
+
+   if (ri->rp && ri->rp->handler) {
+   __this_cpu_write(current_kprobe, &ri->rp->kp);
+   get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE;
+   ri->rp->handler(ri, regs);
+   __this_cpu_write(current_kprobe, NULL);
+   }
+
+   orig_ret_addr = (unsigned long)ri->ret_addr;
+   recycle_rp_inst(ri, &empty_rp);
+
+   if (orig_ret_addr != trampoline_address)
+   /*
+* This is the real return address. Any other
+* instances associated with this task are for
+* other calls deeper on the call stack
+*/
+   break;
+   }
+
+   kretprobe_assert(ri, orig_ret_addr, trampoline_address);
+   /* restore the original return address */
+   instruction_pointer(regs) = orig_ret_addr;
+   reset_current_kprobe();
+   kretprobe_hash_unlock(current, &flags);
+
+   hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
+   hlist_del(&ri->hlist);
+   kfree(ri);
+   }
+
+   /* return the original return address to the trampoline */
+   return (void *) orig_ret_addr;
+}
+
+void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
+ struct pt_regs *regs)
+{
+   ri->ret_addr = (kprobe_opcode_t *)regs->regs[30];
+
+   /* replace return addr (x30) with trampoline */
+   regs->regs[30] = (long)&kretprobe_trampoline;
+}
+
+int __kprobes arch_trampoline_kprobe(struct kprobe *p)
+{
+   return 0;
 }
 
 int __init arch_init_kprobes(void)
-- 
2.5.0



[PATCH v11 7/9] arm64: Add trampoline code for kretprobes

2016-03-08 Thread David Long
From: William Cohen 

The trampoline code is used by kretprobes to capture a return from a probed
function.  This is done by saving the registers, calling the handler, and
restoring the registers. The code then returns to the original saved caller
return address. It is necessary to do this directly instead of using a
software breakpoint because the code used in processing that breakpoint
could itself be kprobe'd and cause a problematic reentry into the debug
exception handler.

Signed-off-by: William Cohen 
Signed-off-by: David A. Long 
---
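The patch below adds asm-offsets entries only for even-numbered registers (S_X8, S_X10, ...), because each stp in the trampoline stores a register pair at one base offset. A minimal user-space sketch of that relationship, assuming a mock frame layout of 31 64-bit registers followed by sp, pc and pstate (mock_pt_regs, MOCK_S_X8, MOCK_S_X10 and mock_stp are hypothetical names):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mock of the register frame layout. */
struct mock_pt_regs {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
};

/* Same idea as DEFINE(S_X8, offsetof(struct pt_regs, regs[8])). */
#define MOCK_S_X8  offsetof(struct mock_pt_regs, regs[8])
#define MOCK_S_X10 offsetof(struct mock_pt_regs, regs[10])

/* Emulate "stp xN, xN+1, [ctxt, #off]": two 64-bit stores at off and off + 8. */
static void mock_stp(void *ctxt, size_t off, uint64_t first, uint64_t second)
{
	uint64_t *slot = (uint64_t *)((char *)ctxt + off);

	slot[0] = first;
	slot[1] = second;
}
```

With this layout MOCK_S_X8 is 64 and MOCK_S_X10 is 80, so a single mock_stp() at MOCK_S_X8 fills both regs[8] and regs[9]; no separate offset for the odd register is needed.
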
 arch/arm64/include/asm/kprobes.h   |  2 +
 arch/arm64/kernel/Makefile |  1 +
 arch/arm64/kernel/asm-offsets.c| 11 +
 arch/arm64/kernel/kprobes.c|  5 ++
 arch/arm64/kernel/kprobes_trampoline.S | 88 ++
 5 files changed, 107 insertions(+)
 create mode 100644 arch/arm64/kernel/kprobes_trampoline.S

diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
index 79c9511..61b4915 100644
--- a/arch/arm64/include/asm/kprobes.h
+++ b/arch/arm64/include/asm/kprobes.h
@@ -56,5 +56,7 @@ int kprobe_exceptions_notify(struct notifier_block *self,
 unsigned long val, void *data);
 int kprobe_breakpoint_handler(struct pt_regs *regs, unsigned int esr);
 int kprobe_single_step_handler(struct pt_regs *regs, unsigned int esr);
+void kretprobe_trampoline(void);
+void __kprobes *trampoline_probe_handler(struct pt_regs *regs);
 
 #endif /* _ARM_KPROBES_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 08325e5..f192b7d 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -37,6 +37,7 @@ arm64-obj-$(CONFIG_CPU_IDLE)  += cpuidle.o
 arm64-obj-$(CONFIG_JUMP_LABEL) += jump_label.o
 arm64-obj-$(CONFIG_KGDB)   += kgdb.o
 arm64-obj-$(CONFIG_KPROBES)+= kprobes.o kprobes-arm64.o \
+  kprobes_trampoline.o \
   probes-simulate-insn.o
 arm64-obj-$(CONFIG_EFI)+= efi.o efi-entry.stub.o
 arm64-obj-$(CONFIG_PCI)+= pci.o
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index fffa4ac6..f7cc8ce 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -50,6 +50,17 @@ int main(void)
   DEFINE(S_X5, offsetof(struct pt_regs, regs[5]));
   DEFINE(S_X6, offsetof(struct pt_regs, regs[6]));
   DEFINE(S_X7, offsetof(struct pt_regs, regs[7]));
+  DEFINE(S_X8, offsetof(struct pt_regs, regs[8]));
+  DEFINE(S_X10,offsetof(struct pt_regs, regs[10]));
+  DEFINE(S_X12,offsetof(struct pt_regs, regs[12]));
+  DEFINE(S_X14,offsetof(struct pt_regs, regs[14]));
+  DEFINE(S_X16,offsetof(struct pt_regs, regs[16]));
+  DEFINE(S_X18,offsetof(struct pt_regs, regs[18]));
+  DEFINE(S_X20,offsetof(struct pt_regs, regs[20]));
+  DEFINE(S_X22,offsetof(struct pt_regs, regs[22]));
+  DEFINE(S_X24,offsetof(struct pt_regs, regs[24]));
+  DEFINE(S_X26,offsetof(struct pt_regs, regs[26]));
+  DEFINE(S_X28,offsetof(struct pt_regs, regs[28]));
   DEFINE(S_LR, offsetof(struct pt_regs, regs[30]));
   DEFINE(S_SP, offsetof(struct pt_regs, sp));
 #ifdef CONFIG_COMPAT
diff --git a/arch/arm64/kernel/kprobes.c b/arch/arm64/kernel/kprobes.c
index ffc5affd..bd3f233 100644
--- a/arch/arm64/kernel/kprobes.c
+++ b/arch/arm64/kernel/kprobes.c
@@ -532,6 +532,11 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
return 1;
 }
 
+void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
+{
+   return NULL;
+}
+
 int __init arch_init_kprobes(void)
 {
return 0;
diff --git a/arch/arm64/kernel/kprobes_trampoline.S b/arch/arm64/kernel/kprobes_trampoline.S
new file mode 100644
index 000..072b4e5
--- /dev/null
+++ b/arch/arm64/kernel/kprobes_trampoline.S
@@ -0,0 +1,88 @@
+/*
+ * trampoline entry and return code for kretprobes.
+ */
+
+#include 
+#include 
+#include 
+
+   .text
+
+.macro save_all_base_regs ctxt
+   stp x0, x1, [\ctxt, #S_X0]
+   stp x2, x3, [\ctxt, #S_X2]
+   stp x4, x5, [\ctxt, #S_X4]
+   stp x6, x7, [\ctxt, #S_X6]
+   stp x8, x9, [\ctxt, #S_X8]
+   stp x10, x11, [\ctxt, #S_X10]
+   stp x12, x13, [\ctxt, #S_X12]
+   stp x14, x15, [\ctxt, #S_X14]
+   stp x16, x17, [\ctxt, #S_X16]
+   stp x18, x19, [\ctxt, #S_X18]
+   stp x20, x21, [\ctxt, #S_X20]
+   stp x22, x23, [\ctxt, #S_X22]
+   stp x24, x25, [\ctxt, #S_X24]
+   stp x26, x27, [\ctxt, #S_X26]
+   stp x28, x29, [\ctxt, #S_X28]

Re: [PATCH v3 1/9] clocksource: arm_arch_timer: Gather KVM specific information in a structure

2016-03-08 Thread Julien Grall

Hi Christoffer,

On 09/03/2016 10:23, Christoffer Dall wrote:

On Tue, Mar 08, 2016 at 11:29:25AM +, Julien Grall wrote:

-static struct timecounter timecounter;
+static struct arch_timer_kvm_info arch_timer_kvm_info;
+
+struct arch_timer_kvm_info *arch_timer_get_kvm_info(void)


borderline bikeshedding question:

does it make sense that the info the arch timer code exports is labeled
to be kvm-specific?  In other words, could we imagine another subsystem
using this timer info some time and is there a more generic term that
would be more appropriate?


I can't see any other usage. I would keep the function name KVM specific 
until someone really needs similar information.


Cheers,

--
Julien Grall


[PATCH v11 9/9] kprobes: Add arm64 case in kprobe example module

2016-03-08 Thread David Long
From: Sandeepa Prabhu 

Add info prints in sample kprobe handlers for ARM64

Signed-off-by: Sandeepa Prabhu 
---
 samples/kprobes/kprobe_example.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/samples/kprobes/kprobe_example.c b/samples/kprobes/kprobe_example.c
index 727eb21..0c72b8a 100644
--- a/samples/kprobes/kprobe_example.c
+++ b/samples/kprobes/kprobe_example.c
@@ -42,6 +42,10 @@ static int handler_pre(struct kprobe *p, struct pt_regs *regs)
" ex1 = 0x%lx\n",
p->addr, regs->pc, regs->ex1);
 #endif
+#ifdef CONFIG_ARM64
+   pr_info("pre_handler: p->addr = 0x%p, pc = 0x%lx\n",
+   p->addr, (long)regs->pc);
+#endif
 
/* A dump_stack() here will give a stack backtrace */
return 0;
@@ -67,6 +71,10 @@ static void handler_post(struct kprobe *p, struct pt_regs *regs,
printk(KERN_INFO "post_handler: p->addr = 0x%p, ex1 = 0x%lx\n",
p->addr, regs->ex1);
 #endif
+#ifdef CONFIG_ARM64
+   pr_info("post_handler: p->addr = 0x%p, pc = 0x%lx\n",
+   p->addr, (long)regs->pc);
+#endif
 }
 
 /*
-- 
2.5.0



Re: [PATCH 2/3] perf/x86/pebs: add workaround for broken OVFL status on HSW

2016-03-08 Thread Stephane Eranian
On Tue, Mar 8, 2016 at 1:13 PM, Stephane Eranian  wrote:
> Hi,
>
> On Tue, Mar 8, 2016 at 1:07 PM, Peter Zijlstra  wrote:
>> On Tue, Mar 08, 2016 at 12:59:23PM -0800, Stephane Eranian wrote:
>>> hi,
>>>
>>> On Mon, Mar 7, 2016 at 12:25 PM, Peter Zijlstra  
>>> wrote:
>>> >
>>> > On Mon, Mar 07, 2016 at 07:27:31PM +0100, Jiri Olsa wrote:
>>> > > On Mon, Mar 07, 2016 at 01:18:40PM +0100, Peter Zijlstra wrote:
>>> > > > On Mon, Mar 07, 2016 at 11:24:13AM +0100, Peter Zijlstra wrote:
>>> > > >
>>> > > > > I suspect Andi is having something along:
>>> > > > >
>>> > > > >  
>>> > > > > lkml.kernel.org/r/1445458568-16956-1-git-send-email-a...@firstfloor.org
>>> > > > >
>>> > > > > applied to his tree.
>>> > > >
>>> > > > OK, I munged a bunch of patches together, please have a hard look at 
>>> > > > the
>>> > > > end result found in:
>>> > > >
>>> > > >   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git 
>>> > > > perf/core
>>> > > >
>>>
>>> I ran this kernel on Haswell. Even with Andi's fixes the problem I 
>>> identified is
>>> still there, so my patch is still needed.
>>
>> Right, your patch should be included in that kernel, or did I make a
>> royal mess of things?
>>
> No, it is as expected for the OVF PMI fix.
>
>> I put Andi's late status ack on top of your patch.
>>
Ok, I ran into a problem on Broadwell with your branch with Andi's
patches. I see
a problem which had disappeared since SandyBridge:

[11551.128422] [ cut here ]
[11551.128435] WARNING: CPU: 3 PID: 12114 at
arch/x86/events/intel/core.c:1868 intel_pmu_handle_irq+0x2da/0x4b0()
[11551.128437] perfevents: irq loop stuck!
[11551.128469][] dump_stack+0x4d/0x63
[11551.128479]  [] warn_slowpath_common+0x97/0xe0
[11551.128482]  [] warn_slowpath_fmt+0x46/0x50
[11551.128486]  [] intel_pmu_handle_irq+0x2da/0x4b0
[11551.128491]  [] perf_event_nmi_handler+0x39/0x60
[11551.128494]  [] nmi_handle+0x61/0x110
[11551.128497]  [] default_do_nmi+0x44/0x110
[11551.128500]  [] do_nmi+0xd7/0x140
[11551.128504]  [] end_repeat_nmi+0x1a/0x1e
[11551.128507]  [] ? native_write_msr+0x6/0x30
[11551.128510]  [] ? native_write_msr+0x6/0x30
[11551.128514]  [] ? native_write_msr+0x6/0x30
[11551.128515]  <>  [] ?
intel_pmu_enable_event+0x215/0x230
[11551.128520]  [] x86_pmu_start+0x8d/0x120
[11551.128523]  [] x86_pmu_enable+0x27b/0x2f0
[11551.128527]  [] perf_pmu_enable+0x1d/0x30
[11551.128530]  [] ctx_resched+0x5a/0x70
[11551.128532]  [] __perf_event_enable+0x1ac/0x210
[11551.128537]  [] event_function+0xa1/0x170
[11551.128540]  [] ? perf_duration_warn+0x70/0x70
[11551.128543]  [] remote_function+0x47/0x60
[11551.128547]  [] generic_exec_single+0xa8/0xb0
[11551.128550]  [] ? perf_duration_warn+0x70/0x70
[11551.128553]  [] ? perf_duration_warn+0x70/0x70
[11551.128555]  [] smp_call_function_single+0xa8/0x100
[11551.128559]  [] event_function_call+0x84/0x100
[11551.128561]  [] ? ctx_resched+0x70/0x70
[11551.128564]  [] ? ctx_resched+0x70/0x70
[11551.128566]  [] ? perf_ctx_lock+0x30/0x30
[11551.128570]  [] _perf_event_enable+0x60/0x80
[11551.128572]  [] perf_ioctl+0x271/0x3e0

The infinite loop in the irq handler!

But here there seems to be a race with a perf_events ioctl(), likely
one that resets the period.
I am not using the perf tool here, just running a self-monitoring task.
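
The "irq loop stuck" failure mode can be illustrated with a small user-space model. This is only a sketch under assumptions, not the actual intel_pmu_handle_irq() code: the handler is assumed to re-read an overflow status word after each acknowledgement and to give up after a bounded number of passes. The struct, the function names and the bound of 100 are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_MAX_LOOPS 100	/* illustrative bound, an assumption */

/*
 * Hypothetical PMU model: 'status' holds pending overflow bits and
 * 'sticky' bits are re-asserted after every ack, imitating a counter
 * that overflows again as soon as it is re-armed.
 */
struct mock_pmu {
	uint64_t status;
	uint64_t sticky;
};

static uint64_t mock_read_status(const struct mock_pmu *pmu)
{
	return pmu->status;
}

static void mock_ack(struct mock_pmu *pmu, uint64_t bits)
{
	pmu->status = (pmu->status & ~bits) | pmu->sticky;
}

/* Returns true if the status drained, false if the handler gave up. */
static bool mock_handle_irq(struct mock_pmu *pmu, int *loops_out)
{
	uint64_t status;
	int loops = 0;

	while ((status = mock_read_status(pmu)) != 0) {
		if (++loops > MOCK_MAX_LOOPS) {
			*loops_out = loops;
			return false;	/* the "irq loop stuck" case */
		}
		mock_ack(pmu, status);
	}
	*loops_out = loops;
	return true;
}
```

With sticky set to 0 the model drains in one pass; with a sticky bit set it exhausts the bound, which is where the real handler would emit the warning.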


>> Also note, Ingo merged most of those patches today, all except the top
>> 3, because Andi wanted to double check something.
>


Re: [PATCH 4.4 00/74] 4.4.5-stable review

2016-03-08 Thread Kevin Hilman
Greg Kroah-Hartman  writes:

> On Tue, Mar 08, 2016 at 02:11:08AM -0800, kernelci.org bot wrote:
>> stable-queue boot: 205 boots: 14 failed, 190 passed with 1 offline 
>> (v4.4.4-74-gcc3ba9c14b31)
>> 
>> Full Boot Summary: 
>> https://kernelci.org/boot/all/job/stable-queue/kernel/v4.4.4-74-gcc3ba9c14b31/
>> Full Build Summary: 
>> https://kernelci.org/build/stable-queue/kernel/v4.4.4-74-gcc3ba9c14b31/
>> 
>> Tree: stable-queue
>> Branch: local/linux-4.4.y.queue
>> Git Describe: v4.4.4-74-gcc3ba9c14b31
>> Git Commit: cc3ba9c14b31161587ce85e9b5d642e730a2d0e8
>> Git URL: git://server.roeck-us.net/git/linux-stable.git
>> Tested: 47 unique boards, 13 SoC families, 18 builds out of 132
>> 
>> Boot Failures Detected: 
>> https://kernelci.org/boot/?v4.4.4-74-gcc3ba9c14b31&fail
>> 
>> arm:
>> 
>> mxs_defconfig:
>> imx23-olinuxino: 1 failed lab
>> 
>> omap2plus_defconfig:
>> omap4-panda: 1 failed lab
>> 
>> multi_v7_defconfig+CONFIG_LKDTM=y:
>> imx53-qsrb: 1 failed lab
>> imx6dl-riotboard: 1 failed lab
>> socfpga_cyclone5_socrates: 1 failed lab
>> 
>> multi_v7_defconfig+CONFIG_SMP=n:
>> imx53-qsrb: 1 failed lab
>> imx6dl-riotboard: 1 failed lab
>> socfpga_cyclone5_socrates: 1 failed lab
>> 
>> multi_v7_defconfig+CONFIG_THUMB2_KERNEL=y:
>> socfpga_cyclone5_socrates: 1 failed lab
>> 
>> imx_v6_v7_defconfig:
>> imx53-qsrb: 1 failed lab
>> imx6dl-riotboard: 1 failed lab
>> 
>> multi_v7_defconfig+CONFIG_PROVE_LOCKING=y:
>> imx53-qsrb: 1 failed lab
>> imx6dl-riotboard: 1 failed lab
>> socfpga_cyclone5_socrates: 1 failed lab
>> 
>> Offline Platforms:
>> 
>> arm:
>> 
>> mxs_defconfig:
>> imx28-duckbill: 1 offline lab
>
> I really don't know what these mean, any chance you can distill these
> down to "all is fine", or "there is a problem with this arch" type
> emails?

All is fine.

These failures are on newly added boards coming from a new lab and
they're failing in other trees also, so we'll ignore them for now and
check with the specific lab owner.

Kevin


[PATCH v11 2/9] arm64: Add more test functions to insn.c

2016-03-08 Thread David Long
From: "David A. Long" 

Certain instructions are hard to execute correctly out-of-line (as in
kprobes).  Test functions are added to insn.[hc] to identify these.  The
instructions include any that use PC-relative addressing, change the PC,
or change interrupt masking. For efficiency and simplicity test
functions are also added for small collections of related instructions.

Signed-off-by: David A. Long 
---
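The test functions described above all follow one pattern: mask the instruction word and compare it with a fixed value. A minimal user-space sketch of that pattern follows. MOCK_INSN_FUNCS and the mock_* names are hypothetical, and the full-width mask/value constants are taken from my reading of the A64 encodings for ADR/ADRP and BRK, so they should be checked against the patch itself rather than treated as its contents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the __AARCH64_INSN_FUNCS pattern. */
#define MOCK_INSN_FUNCS(abbr, mask, val)			\
static inline bool mock_insn_is_##abbr(uint32_t insn)		\
{ return (insn & (mask)) == (val); }

/* ADR/ADRP: bits [28:24] are 0b10000; bit 31 selects ADR or ADRP. */
MOCK_INSN_FUNCS(adr_adrp, 0x1F000000u, 0x10000000u)
/* BRK #imm16: every bit outside the imm16 field [20:5] is fixed. */
MOCK_INSN_FUNCS(brk, 0xFFE0001Fu, 0xD4200000u)

/* A small "collection" test in the spirit of the commit message. */
static bool mock_insn_needs_special_handling(uint32_t insn)
{
	return mock_insn_is_adr_adrp(insn) || mock_insn_is_brk(insn);
}
```

For example, 0x10000000 (adr x0, .) and 0x90000000 (adrp x0, with a zero immediate) both match adr_adrp, while 0xD503201F (nop) matches neither predicate.
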
 arch/arm64/include/asm/insn.h | 35 +++
 arch/arm64/kernel/insn.c  | 34 ++
 2 files changed, 69 insertions(+)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 30e50eb..662b42a 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -120,6 +120,29 @@ enum aarch64_insn_register {
AARCH64_INSN_REG_SP = 31  /* Stack pointer: as load/store base reg */
 };
 
+enum aarch64_insn_special_register {
+   AARCH64_INSN_SPCLREG_SPSR_EL1   = 0xC200,
+   AARCH64_INSN_SPCLREG_ELR_EL1= 0xC201,
+   AARCH64_INSN_SPCLREG_SP_EL0 = 0xC208,
+   AARCH64_INSN_SPCLREG_SPSEL  = 0xC210,
+   AARCH64_INSN_SPCLREG_CURRENTEL  = 0xC212,
+   AARCH64_INSN_SPCLREG_DAIF   = 0xDA11,
+   AARCH64_INSN_SPCLREG_NZCV   = 0xDA10,
+   AARCH64_INSN_SPCLREG_FPCR   = 0xDA20,
+   AARCH64_INSN_SPCLREG_DSPSR_EL0  = 0xDA28,
+   AARCH64_INSN_SPCLREG_DLR_EL0= 0xDA29,
+   AARCH64_INSN_SPCLREG_SPSR_EL2   = 0xE200,
+   AARCH64_INSN_SPCLREG_ELR_EL2= 0xE201,
+   AARCH64_INSN_SPCLREG_SP_EL1 = 0xE208,
+   AARCH64_INSN_SPCLREG_SPSR_INQ   = 0xE218,
+   AARCH64_INSN_SPCLREG_SPSR_ABT   = 0xE219,
+   AARCH64_INSN_SPCLREG_SPSR_UND   = 0xE21A,
+   AARCH64_INSN_SPCLREG_SPSR_FIQ   = 0xE21B,
+   AARCH64_INSN_SPCLREG_SPSR_EL3   = 0xF200,
+   AARCH64_INSN_SPCLREG_ELR_EL3= 0xF201,
+   AARCH64_INSN_SPCLREG_SP_EL2 = 0xF210
+};
+
 enum aarch64_insn_variant {
AARCH64_INSN_VARIANT_32BIT,
AARCH64_INSN_VARIANT_64BIT
@@ -223,8 +246,13 @@ static __always_inline bool aarch64_insn_is_##abbr(u32 code) \
 static __always_inline u32 aarch64_insn_get_##abbr##_value(void) \
 { return (val); }
 
+__AARCH64_INSN_FUNCS(adr_adrp, 0x1F00, 0x1000)
+__AARCH64_INSN_FUNCS(prfm_lit, 0xFF00, 0xD800)
 __AARCH64_INSN_FUNCS(str_reg,  0x3FE0EC00, 0x38206800)
 __AARCH64_INSN_FUNCS(ldr_reg,  0x3FE0EC00, 0x38606800)
+__AARCH64_INSN_FUNCS(ldr_lit,  0xBF00, 0x1800)
+__AARCH64_INSN_FUNCS(ldrsw_lit,0xFF00, 0x9800)
+__AARCH64_INSN_FUNCS(exclusive,0x3F80, 0x0800)
 __AARCH64_INSN_FUNCS(stp_post, 0x7FC0, 0x2880)
 __AARCH64_INSN_FUNCS(ldp_post, 0x7FC0, 0x28C0)
 __AARCH64_INSN_FUNCS(stp_pre,  0x7FC0, 0x2980)
@@ -273,10 +301,14 @@ __AARCH64_INSN_FUNCS(svc, 0xFFE0001F, 0xD401)
 __AARCH64_INSN_FUNCS(hvc,  0xFFE0001F, 0xD402)
 __AARCH64_INSN_FUNCS(smc,  0xFFE0001F, 0xD403)
 __AARCH64_INSN_FUNCS(brk,  0xFFE0001F, 0xD420)
+__AARCH64_INSN_FUNCS(exception,0xFF00, 0xD400)
 __AARCH64_INSN_FUNCS(hint, 0xF01F, 0xD503201F)
 __AARCH64_INSN_FUNCS(br,   0xFC1F, 0xD61F)
 __AARCH64_INSN_FUNCS(blr,  0xFC1F, 0xD63F)
 __AARCH64_INSN_FUNCS(ret,  0xFC1F, 0xD65F)
+__AARCH64_INSN_FUNCS(mrs,  0xFFF0, 0xD530)
+__AARCH64_INSN_FUNCS(msr_imm,  0xFFF8F01F, 0xD500401F)
+__AARCH64_INSN_FUNCS(msr_reg,  0xFFF0, 0xD510)
 
 #undef __AARCH64_INSN_FUNCS
 
@@ -286,6 +318,8 @@ bool aarch64_insn_is_branch_imm(u32 insn);
 int aarch64_insn_read(void *addr, u32 *insnp);
 int aarch64_insn_write(void *addr, u32 insn);
 enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
+bool aarch64_insn_uses_literal(u32 insn);
+bool aarch64_insn_is_branch(u32 insn);
 u64 aarch64_insn_decode_immediate(enum aarch64_insn_imm_type type, u32 insn);
 u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
  u32 insn, u64 imm);
@@ -367,6 +401,7 @@ bool aarch32_insn_is_wide(u32 insn);
 #define A32_RT_OFFSET  12
 #define A32_RT2_OFFSET  0
 
+u32 aarch64_extract_system_register(u32 insn);
 u32 aarch32_insn_extract_reg_num(u32 insn, int offset);
 u32 aarch32_insn_mcr_extract_opc2(u32 insn);
 u32 aarch32_insn_mcr_extract_crm(u32 insn);
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 7371455..60c1c71 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -162,6 +162,32 @@ static bool __kprobes __aarch64_insn_hotpatch_safe(u32 insn)
aarch64_insn_is_nop(insn);
 }
 
+bool __kprobes aarch64_insn_uses_literal(u32 insn)
+{
+   /* ldr/ldrsw (literal), prfm */
+
+   return aarch64_insn_is_ldr_lit(insn) ||
+   aarch64_insn_is_ldrsw_lit(insn) ||
+   aarch64_insn_is_adr_adrp(insn) ||
+   aarch64_insn_is_prfm_lit(insn);
+}
+
+bool __kprobes aarch64_insn_is_branch(u32 insn)
+{
+   /* b, bl, 

[PATCH v11 1/9] arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature

2016-03-08 Thread David Long
From: "David A. Long" 

Add HAVE_REGS_AND_STACK_ACCESS_API feature for arm64.

Signed-off-by: David A. Long 
---
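The core of this feature is a table that maps register names to byte offsets in the saved register frame, plus accessors that use those offsets. A cut-down user-space sketch follows, assuming a mock frame of 31 64-bit registers followed by sp, pc and pstate. All mock_* names are hypothetical, the table lists only a few registers, and -1 stands in for -EINVAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mock of the register frame layout. */
struct mock_pt_regs {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
};

struct mock_reg_offset {
	const char *name;
	int offset;
};

#define MOCK_GPR(r) { "x" #r, (int)offsetof(struct mock_pt_regs, regs[r]) }
#define MOCK_REG(r) { #r, (int)offsetof(struct mock_pt_regs, r) }

/* Abbreviated table; a full version would list x0 through x30. */
static const struct mock_reg_offset mock_table[] = {
	MOCK_GPR(0), MOCK_GPR(1), MOCK_GPR(30),
	{ "lr", (int)offsetof(struct mock_pt_regs, regs[30]) },	/* alias */
	MOCK_REG(sp), MOCK_REG(pc), MOCK_REG(pstate),
	{ NULL, 0 },
};

/* Name to offset; -1 stands in for -EINVAL in this sketch. */
static int mock_query_register_offset(const char *name)
{
	const struct mock_reg_offset *r;

	for (r = mock_table; r->name != NULL; r++)
		if (strcmp(r->name, name) == 0)
			return r->offset;
	return -1;
}

/* Offset to value; offsets past pstate read as 0. */
static uint64_t mock_get_register(const struct mock_pt_regs *regs,
				  unsigned int offset)
{
	if (offset > offsetof(struct mock_pt_regs, pstate))
		return 0;
	return *(const uint64_t *)((const char *)regs + offset);
}
```

Looking up "lr" and "x30" gives the same offset (240 with this layout), which can then be passed to mock_get_register() to read the saved link register.
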
 arch/arm64/Kconfig  |   1 +
 arch/arm64/include/asm/ptrace.h |  31 +++
 arch/arm64/kernel/ptrace.c  | 117 
 3 files changed, 149 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8cc6228..4211b0d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -78,6 +78,7 @@ config ARM64
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
+   select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RCU_TABLE_FREE
select HAVE_SYSCALL_TRACEPOINTS
select IOMMU_DMA if IOMMU_SUPPORT
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index e9e5467..7bd6445 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -118,6 +118,8 @@ struct pt_regs {
u64 syscallno;
 };
 
+#define MAX_REG_OFFSET offsetof(struct user_pt_regs, pstate)
+
 #define arch_has_single_step() (1)
 
 #ifdef CONFIG_COMPAT
@@ -146,6 +148,35 @@ struct pt_regs {
 #define user_stack_pointer(regs) \
(!compat_user_mode(regs) ? (regs)->sp : (regs)->compat_sp)
 
+extern int regs_query_register_offset(const char *name);
+extern const char *regs_query_register_name(unsigned int offset);
+extern bool regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr);
+extern unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
+  unsigned int n);
+
+/**
+ * regs_get_register() - get register value from its offset
+ * @regs: pt_regs from which register value is gotten
+ * @offset: offset number of the register.
+ *
+ * regs_get_register returns the value of the register at @offset from @regs.
+ * The @offset is the offset of the register in struct pt_regs.
+ * If @offset is bigger than MAX_REG_OFFSET, this returns 0.
+ */
+static inline u64 regs_get_register(struct pt_regs *regs,
+ unsigned int offset)
+{
+   if (unlikely(offset > MAX_REG_OFFSET))
+   return 0;
+   return *(u64 *)((u64)regs + offset);
+}
+
+/* Valid only for Kernel mode traps. */
+static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
+{
+   return regs->sp;
+}
+
 static inline unsigned long regs_return_value(struct pt_regs *regs)
 {
return regs->regs[0];
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index ff7f132..efebf0f 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -48,6 +48,123 @@
 #define CREATE_TRACE_POINTS
 #include 
 
+struct pt_regs_offset {
+   const char *name;
+   int offset;
+};
+
+#define REG_OFFSET_NAME(r) {.name = #r, .offset = offsetof(struct pt_regs, r)}
+#define REG_OFFSET_END {.name = NULL, .offset = 0}
+#define GPR_OFFSET_NAME(r)  \
+   {.name = "x" #r, .offset = offsetof(struct pt_regs, regs[r])}
+
+static const struct pt_regs_offset regoffset_table[] = {
+   GPR_OFFSET_NAME(0),
+   GPR_OFFSET_NAME(1),
+   GPR_OFFSET_NAME(2),
+   GPR_OFFSET_NAME(3),
+   GPR_OFFSET_NAME(4),
+   GPR_OFFSET_NAME(5),
+   GPR_OFFSET_NAME(6),
+   GPR_OFFSET_NAME(7),
+   GPR_OFFSET_NAME(8),
+   GPR_OFFSET_NAME(9),
+   GPR_OFFSET_NAME(10),
+   GPR_OFFSET_NAME(11),
+   GPR_OFFSET_NAME(12),
+   GPR_OFFSET_NAME(13),
+   GPR_OFFSET_NAME(14),
+   GPR_OFFSET_NAME(15),
+   GPR_OFFSET_NAME(16),
+   GPR_OFFSET_NAME(17),
+   GPR_OFFSET_NAME(18),
+   GPR_OFFSET_NAME(19),
+   GPR_OFFSET_NAME(20),
+   GPR_OFFSET_NAME(21),
+   GPR_OFFSET_NAME(22),
+   GPR_OFFSET_NAME(23),
+   GPR_OFFSET_NAME(24),
+   GPR_OFFSET_NAME(25),
+   GPR_OFFSET_NAME(26),
+   GPR_OFFSET_NAME(27),
+   GPR_OFFSET_NAME(28),
+   GPR_OFFSET_NAME(29),
+   GPR_OFFSET_NAME(30),
+   {.name = "lr", .offset = offsetof(struct pt_regs, regs[30])},
+   REG_OFFSET_NAME(sp),
+   REG_OFFSET_NAME(pc),
+   REG_OFFSET_NAME(pstate),
+   REG_OFFSET_END,
+};
+
+/**
+ * regs_query_register_offset() - query register offset from its name
+ * @name:  the name of a register
+ *
+ * regs_query_register_offset() returns the offset of a register in struct
+ * pt_regs from its name. If the name is invalid, this returns -EINVAL;
+ */
+int regs_query_register_offset(const char *name)
+{
+   const struct pt_regs_offset *roff;
+
+   for (roff = regoffset_table; roff->name != NULL; roff++)
+   if (!strcmp(roff->name, name))
+   return roff->offset;
+   return -EINVAL;
+}
+
+/**
+ * regs_query_register_name() - query register name from its offset
+ * @offset: the offset of a register in struct pt_regs.
+ *
+ * regs_query_register_name() returns the name of a register from its
+ * offset in struct pt_regs. If the @offs

[PATCH v11 0/9] arm64: Add kernel probes (kprobes) support

2016-03-08 Thread David Long
From: "David A. Long" 

This patchset is heavily based on Sandeepa Prabhu's ARM v8 kprobes patches,
first seen in October 2013. This version attempts to address concerns raised by
reviewers and also fixes problems discovered during testing.

This patchset adds kernel probes (kprobes), jump probes (jprobes),
and return probes (kretprobes) support for ARM64.

The kprobes mechanism makes use of software breakpoint and single stepping
support available in the ARM v8 kernel.

Changes since v2 include:

1) Removal of NOP padding in kprobe XOL slots. Slots are now exactly one
instruction long.
2) Disabling of interrupts during execution in single-step mode.
3) Fixing of numerous problems in instruction simulation code (mostly
thanks to Will Cohen).
4) Support for the HAVE_REGS_AND_STACK_ACCESS_API feature is added, to allow
access to kprobes through debugfs.
5) kprobes is *not* enabled in defconfig.
6) Numerous complaints from checkpatch have been cleaned up, although a couple
remain as removing the function pointer typedefs results in ugly code.

Changes since v3 include:

1) Remove table-driven instruction parsing and replace with an if statement
calling out to old and new instruction test functions in insn.c.
2) I removed the addition of orig_x0 to ptrace.h.
3) Reorder the patches.
4) Replace the previous interrupt disabling (from Will Cohen) with
an improved solution (from Steve Capper).

Changes since v4 include:

1) Added insn.c functions to detect exception instructions and DAIF
   read/write instructions, and use them to reject probing same.
2) Changed adr detect function to also recognize adrp. Reject both.
3) Added missing __kprobes for some new functions.
4) Added call to kprobes_fault_handler from mm do_page_fault.
5) Reject all non-simulated branch/ret instructions, not just those
   that use an immediate offset.
6) Moved software breakpoint definitions into debug-monitors.h.
7) Removed "!XIP_KERNEL" from Kconfig.
8) changed kprobes_condition_check_t and kprobes_prepare_t to probes_*,
   for future sharing with uprobes.
9) Removed bogus call to kprobes_restore_local_irqflag() from 
   trampoline_probe_handler().

Changes since v5 include:

1) Replaced installation of breakpoint hook with direct call from the
handlers in debug-monitors.c, as requested.
2) Reject probing of instructions that read the interrupt mask, in
addition to instructions that set it.
3) Cleaned up comments describing usage of Debug Mask.
4) Added KPROBE_REENTER case in reenter_kprobe.
5) Corrected the ifdef'd definitions for notify_page_fault() to be
consistent when KPROBES is not configured.
6) Changed "cpsr" to "pstate" for HAVE_REGS_AND_STACK_ACCESS_API feature.
7) Added back in missing new files in previous patch.
8) Changed two instances of pr_warning() to pr_warn().

Note that there seems to be at least a potential issue with kprobes
on multiple (possibly all) platforms having to do with use of kfree
inside of the kretprobes trampoline handler.  This has manifested
occasionally in systemtap testing on arm64.  There does not appear to
be a simple solution to the problem.

Changes since v6 include:

1) New trampoline code from Will Cohen fixes the occasional failure seen
when processing kretprobes by replacing the software breakpoint with
assembly code to implement the return to the original execution stream.
2) Changed ip0, ip1, fp, and lr to plain numbered registers for purposes
of recognizing them as an ascii string in the stack/reg access code.
3) Removed orig_x0.
4) Moved ARM_x* defines from arch/arm64/include/uapi/asm/ptrace.h to
arch/arm64/kernel/ptrace.c.

Changes since v7 include:

1) Move trampoline entry/return code into separate ".S" file instead
of making it a macro in a header file.
2) Add missing register name definitions in asm-offsets.c and use them
in place of hard-coded integer offsets in the trampoline code.
3) Correct the values used to decode MSR immediate instructions, in insn.h.
4) Remove the currently unused simulate_none() function.

Changes since v8 include:

1) Replaced use of REG_OFFSET_NAME with GPR_OFFSET_NAME for numbered
registers.
2) Added an alias for "lr" in the register name lookup table, which perf
tools need to be able to recognize.
3) Changed the code for checking instruction types for probeability and
steppability as per review feedback.
4) Fixed the size of cache being flushed when filling single-step slot.
5) Fixed big-endian issues.
6) Blacklisted copy_to/from_user to avoid aborts while single-stepping.
7) Record conditional instructions that fail the conditional test just
like any other probed (non-conditional) instruction.
8) Removed use of magic number for detecting jprobe return and just
check the breakpoint address instead.
9) Got rid of the unnecessary arch/arm64/kprobes.h.
10) The PSTATE and SP are now properly saved in the kretprobe trampoline
code.
11) This patch no longer depends on the "Consolidate redundant
register/stack access code" patch set.
12) Remove call to

[PATCH v11 6/9] arm64: kprobes instruction simulation support

2016-03-08 Thread David Long
From: Sandeepa Prabhu 

Kprobes needs simulation of instructions that cannot be stepped
from a different memory location, e.g. those instructions
that use PC-relative addressing. In simulation, the behaviour
of the instruction is implemented using a copy of pt_regs.

The following instruction categories are simulated:
 - All branching instructions (conditional, register, and immediate)
 - Literal access instructions (load-literal, adr/adrp)

Conditional execution is limited to branching instructions in
ARM v8. If the condition flags in PSTATE do not satisfy the condition
field of the opcode, the instruction is effectively a NOP.

Thanks to Will Cohen for assorted suggested changes.

Signed-off-by: Sandeepa Prabhu 
Signed-off-by: William Cohen 
Signed-off-by: David A. Long 
---
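The conditional case described above can be sketched in user space. This is an illustration under assumptions, not the code in probes-simulate-insn.c: mock_regs and both functions are hypothetical, the condition check follows the standard A64 condition-code table, and the branch decode assumes the B.cond layout (imm19 in bits [23:5], cond in bits [3:0]) without verifying the opcode class.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_PSTATE_N (1u << 31)
#define MOCK_PSTATE_Z (1u << 30)
#define MOCK_PSTATE_C (1u << 29)
#define MOCK_PSTATE_V (1u << 28)

/* Hypothetical stand-in for the copy of pt_regs used in simulation. */
struct mock_regs {
	uint64_t pc;
	uint64_t pstate;
};

/* Evaluate a 4-bit condition code against the NZCV flags. */
static bool mock_condition_holds(unsigned int cond, uint64_t pstate)
{
	bool n = (pstate & MOCK_PSTATE_N) != 0;
	bool z = (pstate & MOCK_PSTATE_Z) != 0;
	bool c = (pstate & MOCK_PSTATE_C) != 0;
	bool v = (pstate & MOCK_PSTATE_V) != 0;
	bool result;

	switch (cond >> 1) {
	case 0: result = z; break;		/* EQ / NE */
	case 1: result = c; break;		/* CS / CC */
	case 2: result = n; break;		/* MI / PL */
	case 3: result = v; break;		/* VS / VC */
	case 4: result = c && !z; break;	/* HI / LS */
	case 5: result = n == v; break;		/* GE / LT */
	case 6: result = n == v && !z; break;	/* GT / LE */
	default: result = true; break;		/* AL */
	}
	if ((cond & 1) && cond != 15)
		result = !result;
	return result;
}

/* Simulate B.cond: branch if the condition holds, else act as a NOP. */
static void mock_simulate_b_cond(uint32_t opcode, struct mock_regs *regs)
{
	int64_t disp = 4;	/* failed condition: step to the next insn */

	if (mock_condition_holds(opcode & 0xF, regs->pstate)) {
		/* imm19 in bits [23:5], sign-extended and scaled by 4 */
		int32_t imm19 = (int32_t)((opcode >> 5) & 0x7FFFF);

		if (imm19 & 0x40000)
			imm19 -= 0x80000;
		disp = (int64_t)imm19 * 4;
	}
	regs->pc += (uint64_t)disp;
}
```

With opcode 0x54000040 (b.eq, offset +8) the simulated pc advances by 8 when Z is set and by 4 when it is clear, which is the "effectively a NOP" behaviour from the commit message.
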
 arch/arm64/include/asm/insn.h|   1 +
 arch/arm64/include/asm/probes.h  |   5 +-
 arch/arm64/kernel/Makefile   |   3 +-
 arch/arm64/kernel/insn.c |   1 +
 arch/arm64/kernel/kprobes-arm64.c|  29 
 arch/arm64/kernel/kprobes.c  |  32 -
 arch/arm64/kernel/probes-simulate-insn.c | 218 +++
 arch/arm64/kernel/probes-simulate-insn.h |  28 
 8 files changed, 311 insertions(+), 6 deletions(-)
 create mode 100644 arch/arm64/kernel/probes-simulate-insn.c
 create mode 100644 arch/arm64/kernel/probes-simulate-insn.h

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index b9567a1..26cee10 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -410,6 +410,7 @@ u32 aarch32_insn_mcr_extract_crm(u32 insn);
 
 typedef bool (pstate_check_t)(unsigned long);
 extern pstate_check_t * const opcode_condition_checks[16];
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __ASM_INSN_H */
diff --git a/arch/arm64/include/asm/probes.h b/arch/arm64/include/asm/probes.h
index c5fcbe6..d524f7d 100644
--- a/arch/arm64/include/asm/probes.h
+++ b/arch/arm64/include/asm/probes.h
@@ -15,11 +15,12 @@
 #ifndef _ARM_PROBES_H
 #define _ARM_PROBES_H
 
+#include 
+
 struct kprobe;
 struct arch_specific_insn;
 
 typedef u32 kprobe_opcode_t;
-typedef unsigned long (kprobes_pstate_check_t)(unsigned long);
 typedef void (kprobes_handler_t) (u32 opcode, long addr, struct pt_regs *);
 
 enum pc_restore_type {
@@ -35,7 +36,7 @@ struct kprobe_pc_restore {
 /* architecture specific copy of original instruction */
 struct arch_specific_insn {
kprobe_opcode_t *insn;
-   kprobes_pstate_check_t *pstate_cc;
+   pstate_check_t *pstate_cc;
kprobes_handler_t *handler;
/* restore address after step xol */
struct kprobe_pc_restore restore;
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 4efb791..08325e5 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -36,7 +36,8 @@ arm64-obj-$(CONFIG_CPU_PM)+= sleep.o suspend.o
 arm64-obj-$(CONFIG_CPU_IDLE)   += cpuidle.o
 arm64-obj-$(CONFIG_JUMP_LABEL) += jump_label.o
 arm64-obj-$(CONFIG_KGDB)   += kgdb.o
-arm64-obj-$(CONFIG_KPROBES)+= kprobes.o kprobes-arm64.o
+arm64-obj-$(CONFIG_KPROBES)+= kprobes.o kprobes-arm64.o \
+  probes-simulate-insn.o
 arm64-obj-$(CONFIG_EFI)+= efi.o efi-entry.stub.o
 arm64-obj-$(CONFIG_PCI)+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)   += armv8_deprecated.o
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 9f15ceb..f9a3432 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #define AARCH64_INSN_SF_BIT BIT(31)
diff --git a/arch/arm64/kernel/kprobes-arm64.c b/arch/arm64/kernel/kprobes-arm64.c
index e07727a..487238a 100644
--- a/arch/arm64/kernel/kprobes-arm64.c
+++ b/arch/arm64/kernel/kprobes-arm64.c
@@ -21,6 +21,7 @@
 #include 
 
 #include "kprobes-arm64.h"
+#include "probes-simulate-insn.h"
 
 static bool __kprobes aarch64_insn_is_steppable(u32 insn)
 {
@@ -62,8 +63,36 @@ arm_probe_decode_insn(kprobe_opcode_t insn, struct arch_specific_insn *asi)
 */
if (aarch64_insn_is_steppable(insn))
return INSN_GOOD;
+
+   if (aarch64_insn_is_bcond(insn)) {
+   asi->handler = simulate_b_cond;
+   } else if (aarch64_insn_is_cbz(insn) ||
+   aarch64_insn_is_cbnz(insn)) {
+   asi->handler = simulate_cbz_cbnz;
+   } else if (aarch64_insn_is_tbz(insn) ||
+   aarch64_insn_is_tbnz(insn)) {
+   asi->handler = simulate_tbz_tbnz;
+   } else if (aarch64_insn_is_adr_adrp(insn))
+   asi->handler = simulate_adr_adrp;
+   else if (aarch64_insn_is_b(insn) ||
+   aarch64_insn_is_bl(insn))
+   asi->handler = simulate_b_bl;
+   else if (aarch64_insn_is_br(insn) ||
+   aarch64_insn_is_blr(insn) ||
+   aarch6

[PATCH v11 3/9] arm64: add copy_to/from_user to kprobes blacklist

2016-03-08 Thread David Long
From: "David A. Long" 

Currently, taking an exception while accessing user data from a kprobe'd
instruction doesn't work. Avoid this situation by blacklisting the relevant
functions.

Signed-off-by: David A. Long 
---
 arch/arm64/lib/copy_from_user.S | 1 +
 arch/arm64/lib/copy_to_user.S   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
index 4699cd7..0ac2131 100644
--- a/arch/arm64/lib/copy_from_user.S
+++ b/arch/arm64/lib/copy_from_user.S
@@ -66,6 +66,7 @@
.endm
 
 end .req x5
+   .section .kprobes.text,"ax",%progbits
 ENTRY(__copy_from_user)
 ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \
CONFIG_ARM64_PAN)
diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
index 7512bbb..e4eb84c 100644
--- a/arch/arm64/lib/copy_to_user.S
+++ b/arch/arm64/lib/copy_to_user.S
@@ -65,6 +65,7 @@
.endm
 
 end .req x5
+   .section .kprobes.text,"ax",%progbits
 ENTRY(__copy_to_user)
 ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \
CONFIG_ARM64_PAN)
-- 
2.5.0



[PATCH v11 4/9] arm64: add conditional instruction simulation support

2016-03-08 Thread David Long
From: "David A. Long" 

Cease using the arm32 arm_check_condition() function and replace it with
a local version for use in deprecated instruction support on arm64. Also
make the function table used by this available for future use by kprobes
and/or uprobes.
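
For readers who have not seen the pattern, the following is a small
stand-alone sketch of a condition-check table indexed by the 4-bit
AArch32 condition field, plus a dispatcher in the spirit of the local
replacement described above. It is not the patch's code: all sk_* names
are invented for the example, and the only things assumed are the
architectural flag positions (N, Z, C, V in bits 31..28) and the
standard ARM condition code meanings.

```c
#include <stdbool.h>
#include <stdint.h>

/* PSTATE flag bits: N, Z, C, V live in bits 31..28. */
#define SK_PSR_N (1UL << 31)
#define SK_PSR_Z (1UL << 30)
#define SK_PSR_C (1UL << 29)
#define SK_PSR_V (1UL << 28)

typedef bool (sk_check_t)(unsigned long);

static bool sk_n(unsigned long p) { return (p & SK_PSR_N) != 0; }
static bool sk_z(unsigned long p) { return (p & SK_PSR_Z) != 0; }
static bool sk_c(unsigned long p) { return (p & SK_PSR_C) != 0; }
static bool sk_v(unsigned long p) { return (p & SK_PSR_V) != 0; }

static bool sk_eq(unsigned long p) { return sk_z(p); }
static bool sk_ne(unsigned long p) { return !sk_z(p); }
static bool sk_cs(unsigned long p) { return sk_c(p); }
static bool sk_cc(unsigned long p) { return !sk_c(p); }
static bool sk_mi(unsigned long p) { return sk_n(p); }
static bool sk_pl(unsigned long p) { return !sk_n(p); }
static bool sk_vs(unsigned long p) { return sk_v(p); }
static bool sk_vc(unsigned long p) { return !sk_v(p); }
static bool sk_hi(unsigned long p) { return sk_c(p) && !sk_z(p); }
static bool sk_ls(unsigned long p) { return !sk_c(p) || sk_z(p); }
static bool sk_ge(unsigned long p) { return sk_n(p) == sk_v(p); }
static bool sk_lt(unsigned long p) { return sk_n(p) != sk_v(p); }
static bool sk_gt(unsigned long p) { return !sk_z(p) && sk_n(p) == sk_v(p); }
static bool sk_le(unsigned long p) { return sk_z(p) || sk_n(p) != sk_v(p); }
static bool sk_al(unsigned long p) { (void)p; return true; }

/* One checker per condition code, indexed by the 4-bit cond field. */
static sk_check_t * const sk_condition_checks[16] = {
	sk_eq, sk_ne, sk_cs, sk_cc, sk_mi, sk_pl, sk_vs, sk_vc,
	sk_hi, sk_ls, sk_ge, sk_lt, sk_gt, sk_le, sk_al, sk_al,
};

enum sk_result { SK_CONDTEST_PASS, SK_CONDTEST_FAIL, SK_CONDTEST_UNCOND };

/* AArch32 opcodes keep the condition field in bits [31:28]. */
static enum sk_result sk_check_condition(uint32_t opcode, unsigned long pstate)
{
	uint32_t cc_bits = opcode >> 28;

	if (cc_bits == 0xf)
		return SK_CONDTEST_UNCOND;

	return sk_condition_checks[cc_bits](pstate) ? SK_CONDTEST_PASS
						    : SK_CONDTEST_FAIL;
}
```

Keeping the table separate from the dispatcher is what lets other
callers reuse it with a condition field extracted from a different bit
position.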

This function is derived from code written by Sandeepa Prabhu.

Signed-off-by: Sandeepa Prabhu 
Signed-off-by: David A. Long 
---
 arch/arm64/include/asm/insn.h|  3 ++
 arch/arm64/kernel/Makefile   |  3 +-
 arch/arm64/kernel/armv8_deprecated.c | 19 +++-
 arch/arm64/kernel/insn.c | 94 
 4 files changed, 115 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 662b42a..72dda48 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -405,6 +405,9 @@ u32 aarch64_extract_system_register(u32 insn);
 u32 aarch32_insn_extract_reg_num(u32 insn, int offset);
 u32 aarch32_insn_mcr_extract_opc2(u32 insn);
 u32 aarch32_insn_mcr_extract_crm(u32 insn);
+
+typedef bool (pstate_check_t)(unsigned long);
+extern pstate_check_t * const opcode_condition_checks[16];
 #endif /* __ASSEMBLY__ */
 
 #endif /* __ASM_INSN_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 83cd7e6..fd5f163 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -26,8 +26,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
$(call if_changed,objcopy)
 
 arm64-obj-$(CONFIG_COMPAT) += sys32.o kuser32.o signal32.o \
-  sys_compat.o entry32.o \
-  ../../arm/kernel/opcodes.o
+  sys_compat.o entry32.o
 arm64-obj-$(CONFIG_FUNCTION_TRACER)+= ftrace.o entry-ftrace.o
 arm64-obj-$(CONFIG_MODULES)+= arm64ksyms.o module.o
 arm64-obj-$(CONFIG_PERF_EVENTS)+= perf_regs.o perf_callchain.o
diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
index 3e01207..c655259 100644
--- a/arch/arm64/kernel/armv8_deprecated.c
+++ b/arch/arm64/kernel/armv8_deprecated.c
@@ -369,6 +369,21 @@ static int emulate_swpX(unsigned int address, unsigned int *data,
return res;
 }
 
+#define ARM_OPCODE_CONDITION_UNCOND 0xf
+
+static unsigned int __kprobes arm32_check_condition(u32 opcode, u32 psr)
+{
+   u32 cc_bits  = opcode >> 28;
+
+   if (cc_bits != ARM_OPCODE_CONDITION_UNCOND) {
+   if ((*opcode_condition_checks[cc_bits])(psr))
+   return ARM_OPCODE_CONDTEST_PASS;
+   else
+   return ARM_OPCODE_CONDTEST_FAIL;
+   }
+   return ARM_OPCODE_CONDTEST_UNCOND;
+}
+
 /*
  * swp_handler logs the id of calling process, dissects the instruction, sanity
  * checks the memory location, calls emulate_swpX for the actual operation and
@@ -383,7 +398,7 @@ static int swp_handler(struct pt_regs *regs, u32 instr)
 
type = instr & TYPE_SWPB;
 
-   switch (arm_check_condition(instr, regs->pstate)) {
+   switch (arm32_check_condition(instr, regs->pstate)) {
case ARM_OPCODE_CONDTEST_PASS:
break;
case ARM_OPCODE_CONDTEST_FAIL:
@@ -464,7 +479,7 @@ static int cp15barrier_handler(struct pt_regs *regs, u32 instr)
 {
perf_sw_event(PERF_COUNT_SW_EMULATION_FAULTS, 1, regs, regs->pc);
 
-   switch (arm_check_condition(instr, regs->pstate)) {
+   switch (arm32_check_condition(instr, regs->pstate)) {
case ARM_OPCODE_CONDTEST_PASS:
break;
case ARM_OPCODE_CONDTEST_FAIL:
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 60c1c71..9f15ceb 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -1234,3 +1234,97 @@ u32 aarch32_insn_mcr_extract_crm(u32 insn)
 {
return insn & CRM_MASK;
 }
+
+static bool __kprobes __check_eq(unsigned long pstate)
+{
+   return (pstate & PSR_Z_BIT) != 0;
+}
+
+static bool __kprobes __check_ne(unsigned long pstate)
+{
+   return (pstate & PSR_Z_BIT) == 0;
+}
+
+static bool __kprobes __check_cs(unsigned long pstate)
+{
+   return (pstate & PSR_C_BIT) != 0;
+}
+
+static bool __kprobes __check_cc(unsigned long pstate)
+{
+   return (pstate & PSR_C_BIT) == 0;
+}
+
+static bool __kprobes __check_mi(unsigned long pstate)
+{
+   return (pstate & PSR_N_BIT) != 0;
+}
+
+static bool __kprobes __check_pl(unsigned long pstate)
+{
+   return (pstate & PSR_N_BIT) == 0;
+}
+
+static bool __kprobes __check_vs(unsigned long pstate)
+{
+   return (pstate & PSR_V_BIT) != 0;
+}
+
+static bool __kprobes __check_vc(unsigned long pstate)
+{
+   return (pstate & PSR_V_BIT) == 0;
+}
+
+static bool __kprobes __check_hi(unsigned long pstate)
+{
+   pstate &= ~(pstate >> 1);   /* PSR_C_BIT &= ~PSR_Z_BIT */
+   return (pstate & PSR_C_BIT) != 0;
+}
+
+static bool __kprobes __check_ls

linux-next: removal of the tiny tree

2016-03-08 Thread Stephen Rothwell
Hi Josh,

I noticed that the tiny tree

  git://git.kernel.org/pub/scm/linux/kernel/git/josh/linux.git branch tiny/next

has not been updated since v3.18-rc1.  I am going to remove it from
linux-next tomorrow unless I hear that it may be useful.  It can always
be easily added back if it proves useful in the future.

-- 
Cheers,
Stephen Rothwell


[PATCH 2/3] Staging: comedi: fix WARNING issue in s626.c

2016-03-08 Thread ravishankarkm
This is a patch to the s626.c file that fixes block comment style issues
found by the checkpatch.pl tool.

i.e. Block comments use a trailing */ on a separate line

Signed-off-by: Ravishankar Karkala Mallikarjunayya 
---
 drivers/staging/comedi/drivers/s626.c | 34 ++
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/comedi/drivers/s626.c b/drivers/staging/comedi/drivers/s626.c
index c907bc5..f21c6bc 100644
--- a/drivers/staging/comedi/drivers/s626.c
+++ b/drivers/staging/comedi/drivers/s626.c
@@ -77,23 +77,30 @@ struct s626_buffer_dma {
 struct s626_private {
u8 ai_cmd_running;  /* ai_cmd is running */
unsigned int ai_sample_timer;   /* time between samples in
-* units of the timer */
+* units of the timer
+*/
int ai_convert_count;   /* conversion counter */
unsigned int ai_convert_timer;  /* time between conversion in
-* units of the timer */
-   u16 counter_int_enabs;  /* counter interrupt enable mask
-* for MISC2 register */
-   u8 adc_items;   /* number of items in ADC poll list */
+* units of the timer
+*/
+   u16 counter_int_enabs;  /* counter interrupt enable mask
+* for MISC2 register
+*/
+   u8 adc_items;   /* number of items in ADC poll list */
struct s626_buffer_dma rps_buf; /* DMA buffer used to hold ADC (RPS1)
-* program */
+* program
+*/
struct s626_buffer_dma ana_buf; /* DMA buffer used to receive ADC data
-* and hold DAC data */
+* and hold DAC data
+*/
u32 *dac_wbuf;  /* pointer to logical adrs of DMA buffer
-* used to hold DAC data */
+* used to hold DAC data
+*/
u16 dacpol; /* image of DAC polarity register */
u8 trim_setpoint[12];   /* images of TrimDAC setpoints */
u32 i2c_adrs;   /* I2C device address for onboard EEPROM
-* (board rev dependent) */
+* (board rev dependent)
+*/
 };
 
 /* Counter overflow/index event flag masks for RDMISC2. */
@@ -572,11 +579,14 @@ static int s626_set_dac(struct comedi_device *dev,
 * running after the packet has been sent to the target DAC.
 */
val = 0x0F00;   /* Continue clock after target DAC data
-* (write to non-existent trimdac). */
+* (write to non-existent trimdac).
+*/
val |= 0x4000;  /* Address the two main dual-DAC devices
-* (TSL's chip select enables target device). */
+* (TSL's chip select enables target device).
+*/
val |= ((u32)(chan & 1) << 15); /* Address the DAC channel
-* within the device. */
+* within the device.
+*/
val |= (u32)dacdata;/* Include DAC setpoint data. */
return s626_send_dac(dev, val);
 }
-- 
1.9.1



[PATCH 3/3] Staging: comedi: fix WARNING issue in s626.c

2016-03-08 Thread ravishankarkm
This is a patch to the s626.c file that fixes a "line over
80 characters" issue found by the checkpatch.pl tool.

Signed-off-by: Ravishankar Karkala Mallikarjunayya 
---
 drivers/staging/comedi/drivers/s626.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/comedi/drivers/s626.c b/drivers/staging/comedi/drivers/s626.c
index f21c6bc..f44e11b 100644
--- a/drivers/staging/comedi/drivers/s626.c
+++ b/drivers/staging/comedi/drivers/s626.c
@@ -2543,7 +2543,8 @@ static int s626_initialize(struct comedi_device *dev)
for (i = 0; i < 2; i++) {
writel(S626_I2C_CLKSEL, dev->mmio + S626_P_I2CSTAT);
s626_mc_enable(dev, S626_MC2_UPLD_IIC, S626_P_MC2);
-   ret = comedi_timeout(dev, NULL, NULL, s626_i2c_handshake_eoc, 0);
+   ret = comedi_timeout(dev, NULL, NULL, s626_i2c_handshake_eoc,
+0);
if (ret)
return ret;
}
-- 
1.9.1



Re: [RESEND PATCH v4 0/8] i2c: Relax mandatory I2C ID table passing

2016-03-08 Thread Lee Jones
On Tue, 08 Mar 2016, Kieran Bingham wrote:

> On 8 Mar 2016 11:22, "Lee Jones"  wrote:
> >
> > On Mon, 12 Oct 2015, Kieran Bingham wrote:
> >
> > > Hi Wolfram,
> > >
> > > On 9 October 2015 at 22:16, Wolfram Sang  wrote:
> > > >
> > > > As said to Kieran personally in Dublin, I want a verification that all
> > > > binding methods still work, especially runtime instantiation for
> drivers
> > > > without i2c_device_ids.
> > >
> > > Ok, I should be able to find some time to look at that this week.
> > >
> > > >  Also, for the last patch, a verification should
> > > > be done if the drivers i2c_device_id hasn't been used meanwhile.
> > >
> > > I'll see what I can do ...
> > >
> > > > I'd also like to see 'probe_new' instead of 'probe2' for the new
> function
> > > > name. That should be it.
> > >
> > > Ok, obviously this is only a temporary naming so I don't mind either
> way,
> > > I'll do a rename for the next version
> > >
> > > I've also just found a compile failure to fix up on !CONFIG_OF, this
> > > can make its way into the respin.
> >
> > I still don't see this upstream.  What's the latest status?
> 
> Needs correct testing:
> "verification that all binding methods still work, especially runtime
> instantiation for drivers without i2c_device_ids."
> 
> Actually I rebased this set last week.  I was going to try and see if I can
> test in qemu. Just need to work out how to load DT fragments at runtime.

Sounds like an over-the-top solution.  Can't you just modprobe some modules?

-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog


linux-next: removal of the squashfs tree

2016-03-08 Thread Stephen Rothwell
Hi Phillip,

I noticed that the squashfs tree

  git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next.git branch master

has not been updated since November 2014.  I am going to remove it from
linux-next tomorrow unless I hear that it may be useful.  It can always
be easily added back if it proves useful in the future.

-- 
Cheers,
Stephen Rothwell


[PATCH 1/3] Staging: comedi: fix type issue in s626.c

2016-03-08 Thread ravishankarkm
This is a patch to the s626.c file that fixes type issues
found by the checkpatch.pl tool.

i.e. Prefer kernel type 'u8' over 'uint8_t'
Prefer kernel type 'u16' over 'uint16_t'
Prefer kernel type 'u32' over 'uint32_t'
Prefer kernel type 's16' over 'int16_t'
Prefer kernel type 's32' over 'int32_t'

Signed-off-by: Ravishankar Karkala Mallikarjunayya 
---
 drivers/staging/comedi/drivers/s626.c | 306 +-
 1 file changed, 153 insertions(+), 153 deletions(-)

diff --git a/drivers/staging/comedi/drivers/s626.c b/drivers/staging/comedi/drivers/s626.c
index 35f0f67..c907bc5 100644
--- a/drivers/staging/comedi/drivers/s626.c
+++ b/drivers/staging/comedi/drivers/s626.c
@@ -75,24 +75,24 @@ struct s626_buffer_dma {
 };
 
 struct s626_private {
-   uint8_t ai_cmd_running; /* ai_cmd is running */
+   u8 ai_cmd_running;  /* ai_cmd is running */
unsigned int ai_sample_timer;   /* time between samples in
 * units of the timer */
int ai_convert_count;   /* conversion counter */
unsigned int ai_convert_timer;  /* time between conversion in
 * units of the timer */
-   uint16_t counter_int_enabs; /* counter interrupt enable mask
+   u16 counter_int_enabs;  /* counter interrupt enable mask
 * for MISC2 register */
-   uint8_t adc_items;  /* number of items in ADC poll list */
+   u8 adc_items;   /* number of items in ADC poll list */
struct s626_buffer_dma rps_buf; /* DMA buffer used to hold ADC (RPS1)
 * program */
struct s626_buffer_dma ana_buf; /* DMA buffer used to receive ADC data
 * and hold DAC data */
-   uint32_t *dac_wbuf; /* pointer to logical adrs of DMA buffer
+   u32 *dac_wbuf;  /* pointer to logical adrs of DMA buffer
 * used to hold DAC data */
-   uint16_t dacpol;/* image of DAC polarity register */
-   uint8_t trim_setpoint[12];  /* images of TrimDAC setpoints */
-   uint32_t i2c_adrs;  /* I2C device address for onboard EEPROM
+   u16 dacpol; /* image of DAC polarity register */
+   u8 trim_setpoint[12];   /* images of TrimDAC setpoints */
+   u32 i2c_adrs;   /* I2C device address for onboard EEPROM
 * (board rev dependent) */
 };
 
@@ -179,7 +179,7 @@ static void s626_debi_transfer(struct comedi_device *dev)
 /*
  * Read a value from a gate array register.
  */
-static uint16_t s626_debi_read(struct comedi_device *dev, uint16_t addr)
+static u16 s626_debi_read(struct comedi_device *dev, u16 addr)
 {
/* Set up DEBI control register value in shadow RAM */
writel(S626_DEBI_CMD_RDWORD | addr, dev->mmio + S626_P_DEBICMD);
@@ -193,8 +193,8 @@ static uint16_t s626_debi_read(struct comedi_device *dev, uint16_t addr)
 /*
  * Write a value to a gate array register.
  */
-static void s626_debi_write(struct comedi_device *dev, uint16_t addr,
-   uint16_t wdata)
+static void s626_debi_write(struct comedi_device *dev, u16 addr,
+   u16 wdata)
 {
/* Set up DEBI control register value in shadow RAM */
writel(S626_DEBI_CMD_WRWORD | addr, dev->mmio + S626_P_DEBICMD);
@@ -241,7 +241,7 @@ static int s626_i2c_handshake_eoc(struct comedi_device *dev,
return -EBUSY;
 }
 
-static int s626_i2c_handshake(struct comedi_device *dev, uint32_t val)
+static int s626_i2c_handshake(struct comedi_device *dev, u32 val)
 {
unsigned int ctrl;
int ret;
@@ -267,8 +267,8 @@ static int s626_i2c_handshake(struct comedi_device *dev, uint32_t val)
return ctrl & S626_I2C_ERR;
 }
 
-/* Read uint8_t from EEPROM. */
-static uint8_t s626_i2c_read(struct comedi_device *dev, uint8_t addr)
+/* Read u8 from EEPROM. */
+static u8 s626_i2c_read(struct comedi_device *dev, u8 addr)
 {
struct s626_private *devpriv = dev->private;
 
@@ -288,7 +288,7 @@ static uint8_t s626_i2c_read(struct comedi_device *dev, uint8_t addr)
/*
 * Execute EEPROM read:
 *  Byte2 = I2C command: read from I2C EEPROM device.
-*  Byte1 receives uint8_t from EEPROM.
+*  Byte1 receives u8 from EEPROM.
 *  Byte0 = Not sent.
 */
if (s626_i2c_handshake(dev, S626_I2C_B2(S626_I2C_ATTRSTART,
@@ -304,10 +304,10 @@ static uint8_t s626_i2c_read(struct comedi_device *dev, uint8_t addr)
 /* ***  DAC FUNCTIONS *** */
 
 /* TrimDac LogicalChan-to-PhysicalChan mapping table. */
-static const uint8_t s626_trimchan[] = { 10, 9, 8, 3, 2, 7, 6, 1, 0, 5, 4 };
+static const u8 s626_trimchan[] = { 10, 9, 8, 3, 2, 7, 6, 1, 0, 5, 4 };
 
 /* TrimDac LogicalChan-to-EepromAdrs m

[PATCH] iio: adis16480: fix FNCTIO_CTRL corruption when enabling IRQ

2016-03-08 Thread Vlad Banea
Enabling the IRQ should leave all other settings in the FNCTIO_CTRL
register untouched: read the whole register, toggle just the enable bit,
then write it back.
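
The read-modify-write idea can be tried outside the kernel with a fake
register. The sketch below is only an illustration under that
assumption: sk_fake_reg, sk_read_reg() and sk_write_reg() are
hypothetical stand-ins for the device register and its accessors, not
the adis library API. Bit 3 is used as the enable bit, as in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Enable bit, matching BIT(3) in the patch. */
#define SK_IRQ_ENABLE_BIT (1u << 3)

/* Hypothetical in-memory stand-in for the 16-bit device register. */
static uint16_t sk_fake_reg;

/* Hypothetical accessors; they return 0 on success like the real ones. */
static int sk_read_reg(uint16_t *val)
{
	*val = sk_fake_reg;
	return 0;
}

static int sk_write_reg(uint16_t val)
{
	sk_fake_reg = val;
	return 0;
}

/* Change only the enable bit and keep every other bit as it was. */
static int sk_enable_irq(bool enable)
{
	uint16_t val;
	int ret;

	ret = sk_read_reg(&val);
	if (ret < 0)
		return ret;

	if (enable)
		val |= SK_IRQ_ENABLE_BIT;
	else
		val &= (uint16_t)~SK_IRQ_ENABLE_BIT;

	return sk_write_reg(val);
}
```

With the register preloaded to 0x00c1, enabling yields 0x00c9 and
disabling returns it to 0x00c1, whereas a plain write of BIT(3) or 0
would have wiped the other bits.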
---
 drivers/iio/imu/adis16480.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/iio/imu/adis16480.c b/drivers/iio/imu/adis16480.c
index b94bfd3..8473859 100644
--- a/drivers/iio/imu/adis16480.c
+++ b/drivers/iio/imu/adis16480.c
@@ -738,8 +738,19 @@ static int adis16480_stop_device(struct iio_dev *indio_dev)
 
 static int adis16480_enable_irq(struct adis *adis, bool enable)
 {
-   return adis_write_reg_16(adis, ADIS16480_REG_FNCTIO_CTRL,
-   enable ? BIT(3) : 0);
+   u16 fnctio_ctrl;
+   int ret;
+
+   ret = adis_read_reg_16(adis, ADIS16480_REG_FNCTIO_CTRL, &fnctio_ctrl);
+   if (ret < 0)
+   return ret;
+
+   if (enable)
+   fnctio_ctrl |= BIT(3);
+   else
+   fnctio_ctrl &= ~BIT(3);
+
+   return adis_write_reg_16(adis, ADIS16480_REG_FNCTIO_CTRL, fnctio_ctrl);
 }
 
 static int adis16480_initial_setup(struct iio_dev *indio_dev)
-- 
2.7.1



linux-next: removal of the rpmsg tree

2016-03-08 Thread Stephen Rothwell
Hi Ohad,

I noticed that the rpmsg tree

  git://git.kernel.org/pub/scm/linux/kernel/git/ohad/rpmsg.git branch for-next

has not been updated since November 2014.  I am going to remove it from
linux-next tomorrow unless I hear that it may be useful.  It can always
be easily added back if it proves useful in the future.

-- 
Cheers,
Stephen Rothwell


linux-next: removal of the random tree

2016-03-08 Thread Stephen Rothwell
Hi Ted,

I noticed that the random tree

  git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random.git branch dev

has not been updated since October 2014.  I am going to remove it from
linux-next tomorrow unless I hear that it may be useful.  It can always
be easily added back if it proves useful in the future.

-- 
Cheers,
Stephen Rothwell


[PATCH] ARM: dts: uniphier: add pinmux node for I2C ch4

2016-03-08 Thread Masahiro Yamada
This will be needed for UniPhier PH1-LD11 and PH1-LD20 SoCs.

Signed-off-by: Masahiro Yamada 
---

 arch/arm/boot/dts/uniphier-pinctrl.dtsi | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/uniphier-pinctrl.dtsi b/arch/arm/boot/dts/uniphier-pinctrl.dtsi
index f67445f..2459279 100644
--- a/arch/arm/boot/dts/uniphier-pinctrl.dtsi
+++ b/arch/arm/boot/dts/uniphier-pinctrl.dtsi
@@ -63,6 +63,11 @@
function = "i2c3";
};
 
+   pinctrl_i2c4: i2c4_grp {
+   groups = "i2c4";
+   function = "i2c4";
+   };
+
pinctrl_uart0: uart0_grp {
groups = "uart0";
function = "uart0";
-- 
1.9.1



Re: linux-next: removal of the apm tree

2016-03-08 Thread Jiri Kosina
On Wed, 9 Mar 2016, Stephen Rothwell wrote:

> Hi Jiri,
> 
> I noticed that the apm tree
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/jikos/apm.git branch for-next
> 
> has not been updated since July 2014.  I am going to remove it from
> linux-next tomorrow unless I hear that it may be useful.  It can always
> be easily added back if it proves useful in the future.

Stephen,

I agree with the removal. Unsurprisingly, there haven't been that many 
changes in that area lately :)

Thanks,

-- 
Jiri Kosina
SUSE Labs



Re: [PATCH v1] tools/vm/page-types.c: remove memset() in walk_pfn()

2016-03-08 Thread Naoya Horiguchi
On Wed, Mar 09, 2016 at 07:28:21AM +0300, Konstantin Khlebnikov wrote:
> On Tue, Mar 8, 2016 at 8:58 AM, Naoya Horiguchi
>  wrote:
> > On Tue, Mar 08, 2016 at 08:12:09AM +0300, Konstantin Khlebnikov wrote:
> >> On Tue, Mar 8, 2016 at 4:47 AM, Naoya Horiguchi
> >>  wrote:
> >> > I found that page-types is very slow and my testing shows many timeout 
> >> > errors.
> >> > Here's an example with a simple program allocating 1000 thps.
> >> >
> >> >   $ time ./page-types -p $(pgrep -f test_alloc)
> >> >   ...
> >> >   real    0m17.201s
> >> >   user    0m16.889s
> >> >   sys     0m0.312s
> >> >
> >> >   $ time ./page-types.patched -p $(pgrep -f test_alloc)
> >> >   ...
> >> >   real    0m0.182s
> >> >   user    0m0.046s
> >> >   sys     0m0.135s
> >> >
> >> > Most of time is spent in memset(), which isn't necessary because we check
> >> > that the return of kpagecgroup_read() is equal to pages and uninitialized
> >> > memory is never used. So we can drop this memset().
> >>
> >> These zeros are used in show_page_range() - for merging pages into ranges.
> >
> > Hi Konstantin,
> >
> > Thank you for the response. The below code does solve the problem, so 
> > that's fine.
> >
> > But I don't understand how the zeros are used. show_page_range() is called
> > via add_page() which is called for i=0 to i=pages-1, and the buffer cgi is
> > already filled for the range [i, pages-1] by kpagecgroup_read(), so even if
> > without zero initialization, kpagecgroup_read() properly fills zeros, right?
> > IOW, is there any problem if we don't do this zero initialization?
> 
> kpagecgroup_read() reads only if kpagecgroup were opened,
> /proc/kpagecgroup might even not exist. Probably it's better to fill
> them with zeros here.
> Pre-memset was an optimization - it fills buffer only once instead on
> each kpagecgroup_read() call.

Ah, OK.

So here's ver.2.

Thanks,
Naoya
---
From: Naoya Horiguchi 
Subject: [PATCH v2] tools/vm/page-types.c: avoid memset() in walk_pfn() when count == 1

I found that page-types is very slow and my testing shows many timeout errors.
Here's an example with a simple program allocating 1000 thps.

  $ time ./page-types -p $(pgrep -f test_alloc)
  ...
  real    0m17.201s
  user    0m16.889s
  sys     0m0.312s

Most of the time is spent in memset(). Currently memset() clears the whole
buffer on every walk_pfn() call, which is inefficient when walk_pfn() is
called from walk_vma(), because in that case walk_pfn() is called once per pfn.
So this patch limits the zero initialization to the first element in that case.
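
To make the cost difference concrete, here is a small stand-alone
sketch of the same idea. It is an illustration, not the tool's code:
sk_cgi and sk_clear_cgroup_buf() are names made up for the example, and
the batch size of 64k entries is an assumption about KPAGEFLAGS_BATCH.
The helper returns how many bytes it cleared so the two paths can be
compared.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed batch size (64k entries); the real value may differ. */
#define SK_BATCH (64 << 10)

/* Hypothetical per-batch buffer, analogous to cgi[] in walk_pfn(). */
static uint64_t sk_cgi[SK_BATCH];

/*
 * Zero only as much of the buffer as the caller will consume.
 * Returns the number of bytes cleared.
 */
static size_t sk_clear_cgroup_buf(unsigned long count)
{
	if (count == 1) {
		/* Single-pfn call: only element 0 is ever read. */
		sk_cgi[0] = 0;
		return sizeof sk_cgi[0];
	}

	/* Multi-pfn call: clear the whole buffer once, as before. */
	memset(sk_cgi, 0, sizeof sk_cgi);
	return sizeof sk_cgi;
}
```

Under these assumptions the single-pfn path touches 8 bytes instead of
512 KiB per call, which is in line with the speedup reported here.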

  $ time ./page-types.patched -p $(pgrep -f test_alloc)
  ...
  real    0m0.182s
  user    0m0.046s
  sys     0m0.135s

Fixes: 954e95584579 ("tools/vm/page-types.c: add memory cgroup dumping and filtering")
Signed-off-by: Naoya Horiguchi 
Suggested-by: Konstantin Khlebnikov 
---
 tools/vm/page-types.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/vm/page-types.c b/tools/vm/page-types.c
index dab61c377f54..e92903fc7113 100644
--- a/tools/vm/page-types.c
+++ b/tools/vm/page-types.c
@@ -633,7 +633,15 @@ static void walk_pfn(unsigned long voffset,
unsigned long pages;
unsigned long i;
 
-   memset(cgi, 0, sizeof cgi);
+   /*
+* kpagecgroup_read() reads only if kpagecgroup were opened, but
+* /proc/kpagecgroup might even not exist, so it's better to fill
+* them with zeros here.
+*/
+   if (count == 1)
+   cgi[0] = 0;
+   else
+   memset(cgi, 0, sizeof cgi);
 
while (count) {
batch = min_t(unsigned long, count, KPAGEFLAGS_BATCH);
-- 
2.4.3



linux-next: removal of the mmc tree

2016-03-08 Thread Stephen Rothwell
Hi Chris,

I noticed that the mmc tree

  git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc.git branch mmc-next

has not been updated since May 2014.  I am going to remove it from
linux-next tomorrow unless I hear that it may be useful.  It can always
be easily added back if it proves useful in the future.

-- 
Cheers,
Stephen Rothwell


linux-next: removal of the mips-fixes tree

2016-03-08 Thread Stephen Rothwell
Hi James,

I noticed that the mips-fixes tree

  git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips.git branch mips-fixes

has not been updated since v3.16-rc5.  I am going to remove it from
linux-next tomorrow unless I hear that it may be useful.  It can always
be easily added back if it proves useful in the future.

-- 
Cheers,
Stephen Rothwell


linux-next: removal of the lblnet tree

2016-03-08 Thread Stephen Rothwell
Hi Paul,

I noticed that the lblnet tree

  git://git.infradead.org/users/pcmoore/lblnet branch next

has not been updated since v3.18.  I am going to remove it from
linux-next tomorrow unless I hear that it may be useful.  It can always
be easily added back if it proves useful in the future.

-- 
Cheers,
Stephen Rothwell


linux-next: removal of the kgdb tree

2016-03-08 Thread Stephen Rothwell
Hi Jason,

I noticed that the kgdb tree

  git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb.git branch kgdb-next

has not been updated since March 2015.  I am going to remove it from
linux-next tomorrow unless I hear that it may be useful.  It can always
be easily added back if it proves useful in the future.

-- 
Cheers,
Stephen Rothwell


linux-next: removal of the bcm2835 tree

2016-03-08 Thread Stephen Rothwell
Hi Stephen,

I noticed that the bcm2835 tree

  git://git.kernel.org/pub/scm/linux/kernel/git/rpi/linux-rpi.git branch for-next

has not been updated since v3.18.  I am going to remove it from
linux-next tomorrow unless I hear that it may be useful.  It can always
be easily added back if it proves useful in the future.

-- 
Cheers,
Stephen Rothwell


Re: [PATCH v3 4/9] irqchip/gic-v2: Parse and export virtual GIC information

2016-03-08 Thread Christoffer Dall
On Tue, Mar 08, 2016 at 11:29:28AM +, Julien Grall wrote:
> For now, the firmware tables are parsed 2 times: once in the GIC
> drivers, the other time when initializing the vGIC. It means code
> duplication and makes it more tedious to add support for another
> firmware table (like ACPI).
> 
> Introduce a new structure and set of helpers to get/set the virtual GIC
> information. Also fill up the structure for GICv2.
> 
> Signed-off-by: Julien Grall 
> 
> ---
> Cc: Thomas Gleixner 
> Cc: Jason Cooper 
> Cc: Marc Zyngier 
> 
> Changes in v2:
> - Use 0 rather than a negative value to know when the maintenance IRQ
> is not present.
> - Use resource for vcpu and vctrl
> ---
>  drivers/irqchip/irq-gic-common.c   | 13 ++
>  drivers/irqchip/irq-gic-common.h   |  3 ++
>  drivers/irqchip/irq-gic.c  | 80 
> +-
>  include/linux/irqchip/arm-gic-common.h | 33 ++
>  4 files changed, 128 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/irqchip/arm-gic-common.h
> 
> diff --git a/drivers/irqchip/irq-gic-common.c 
> b/drivers/irqchip/irq-gic-common.c
> index f174ce0..704caf4 100644
> --- a/drivers/irqchip/irq-gic-common.c
> +++ b/drivers/irqchip/irq-gic-common.c
> @@ -21,6 +21,19 @@
>  
>  #include "irq-gic-common.h"
>  
> +static const struct gic_kvm_info *gic_kvm_info;
> +
> +const struct gic_kvm_info *gic_get_kvm_info(void)
> +{
> + return gic_kvm_info;
> +}
> +
> +void gic_set_kvm_info(const struct gic_kvm_info *info)
> +{
> + WARN(gic_kvm_info != NULL, "gic_kvm_info already set\n");

why do we WARN here?  Wouldn't this be an obvious bug?

> + gic_kvm_info = info;
> +}
> +
>  void gic_enable_quirks(u32 iidr, const struct gic_quirk *quirks,
>   void *data)
>  {
> diff --git a/drivers/irqchip/irq-gic-common.h 
> b/drivers/irqchip/irq-gic-common.h
> index fff697d..205e5fd 100644
> --- a/drivers/irqchip/irq-gic-common.h
> +++ b/drivers/irqchip/irq-gic-common.h
> @@ -19,6 +19,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  struct gic_quirk {
>   const char *desc;
> @@ -35,4 +36,6 @@ void gic_cpu_config(void __iomem *base, void 
> (*sync_access)(void));
>  void gic_enable_quirks(u32 iidr, const struct gic_quirk *quirks,
>   void *data);
>  
> +void gic_set_kvm_info(const struct gic_kvm_info *info);
> +
>  #endif /* _IRQ_GIC_COMMON_H */
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index fbde202..0c58112 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -102,6 +102,8 @@ static struct static_key supports_deactivate = 
> STATIC_KEY_INIT_TRUE;
>  
>  static struct gic_chip_data gic_data[CONFIG_ARM_GIC_MAX_NR] __read_mostly;
>  
> +static struct gic_kvm_info gic_v2_kvm_info;
> +
>  #ifdef CONFIG_GIC_NON_BANKED
>  static void __iomem *gic_get_percpu_base(union gic_base *base)
>  {
> @@ -1189,6 +1191,35 @@ static bool gic_check_eoimode(struct device_node 
> *node, void __iomem **base)
>   return true;
>  }
>  
> +static void __init gic_of_setup_kvm_info(struct device_node *node)
> +{
> + int ret;
> + struct resource r;

I would prefer two struct resource variables more akin to how the KVM
code did it already, and then name them something more understandable
than 'r'.

You could also argue that you can then only populate the gic_v2_kvm_info
with all coherent info or nothing, instead of filling it out partially
and then exiting.

> +
> + gic_v2_kvm_info.type = GIC_V2;
> +
> + gic_v2_kvm_info.maint_irq = irq_of_parse_and_map(node, 0);
> +
> + ret = of_address_to_resource(node, 2, &r);
> + if (!ret)
> + gic_v2_kvm_info.vctrl = r;
> +
> + ret = of_address_to_resource(node, 3, &r);

here you're overwriting the error return value if the first call to
of_address_to_resource failed ?

> + if (!ret) {
> + if (!PAGE_ALIGNED(r.start))
> + pr_warn("GICV physical address 0x%llx not page 
> aligned\n",
> + (unsigned long long)r.start);

how does KVM know that this went bad?

> + else if (!PAGE_ALIGNED(resource_size(&r)))
> + pr_warn("GICV size 0x%llx not a multiple of page size 
> 0x%lx\n",
> + (unsigned long long)resource_size(&r),
> + PAGE_SIZE);

same?

> + else
> + gic_v2_kvm_info.vcpu = r;
> + }
> +
> + gic_set_kvm_info(&gic_v2_kvm_info);

so here you're setting the kvm info even if one of the calls above
fails?

I think this function could leverage much more of the existing KVM
implementation to avoid these kinds of mistakes.

> +}
> +
>  int __init
>  gic_of_init(struct device_node *node, struct device_node *parent)
>  {
> @@ -1218,8 +1249,10 @@ gic_of_init(struct device_node *node, struct 
> device_node *parent)
>  
>   __gic_init_bases(gic_cnt, -1, dist_base, cpu_base, percpu_offset,
>

[PATCH 1/1] Fixes: cfc8874a485 ("perf script: Process cpu/threads maps")

2016-03-08 Thread Chris Phlipot
Fix the perf script python database export crash.
Remove the union in evsel so that the database id and priv pointer can be
used simultaneously without conflicting and crashing.

Detailed description of the fixed bug follows:

perf script crashes with a segmentation fault on user space tool version
4.5.rc7.ge2857b when using the python database export API. It works
properly in 4.4 and prior versions.

The crash first appeared in
cfc8874a485 ("perf script: Process cpu/threads maps")

How to reproduce the bug:

Remove any temporary files left over from a previous crash
(if you have already attempted to reproduce the bug):
$ rm -r test_db-perf-data
$ dropdb test_db

$ ./perf record timeout 1 yes >/dev/null
$ ./perf script -s scripts/python/export-to-postgresql.py test_db

Stack Trace:
Program received signal SIGSEGV, Segmentation fault.
__GI___libc_free (mem=0x1) at malloc.c:2929
2929    malloc.c: No such file or directory.
(gdb) bt
at util/stat.c:122
argv=, prefix=) at builtin-script.c:2231
argc=argc@entry=4, argv=argv@entry=0x7fffdf70) at perf.c:390
at perf.c:451
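
To illustrate why sharing storage between the two fields is a problem, here
is a small userspace sketch. The struct and function names are hypothetical
stand-ins, not the real struct perf_evsel; the sketch only assumes that some
code stores a database id while priv still holds a pointer that is used
later. Under that assumption the pointer is overwritten by the id, which
would be consistent with free() being handed mem=0x1 in the trace above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the old layout: priv and db_id share storage. */
struct evsel_with_union {
	union {
		void     *priv;
		uint64_t  db_id;
	};
};

/* Hypothetical stand-in for the patched layout: two independent fields. */
struct evsel_split {
	void     *priv;
	uint64_t  db_id;
};

/* Returns 1 if storing the id changed priv in the union layout. */
static int union_store_clobbers_priv(uint64_t id)
{
	static int payload;
	struct evsel_with_union e;

	e.priv = &payload;
	e.db_id = id;		/* writes into the bytes priv occupies */
	return e.priv != (void *)&payload;
}

/* Returns 1 if storing the id changed priv in the split layout. */
static int split_store_clobbers_priv(uint64_t id)
{
	static int payload;
	struct evsel_split e;

	e.priv = &payload;
	e.db_id = id;		/* separate storage, priv is untouched */
	assert(e.db_id == id);
	return e.priv != (void *)&payload;
}
```

With an id of 1, the union variant reports that priv was clobbered while the
split variant does not, at the cost of one extra u64 per evsel.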

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/evsel.h | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 8e75434..4d8037a 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -93,10 +93,8 @@ struct perf_evsel {
const char  *unit;
struct event_format *tp_format;
off_t   id_offset;
-   union {
-   void*priv;
-   u64 db_id;
-   };
+   void*priv;
+   u64 db_id;
struct cgroup_sel   *cgrp;
void*handler;
struct cpu_map  *cpus;
-- 
1.9.1



linux-next: removal of the apm tree

2016-03-08 Thread Stephen Rothwell
Hi Jiri,

I noticed that the apm tree

  git://git.kernel.org/pub/scm/linux/kernel/git/jikos/apm.git branch for-next

has not been updated since July 2014.  I am going to remove it from
linux-next tomorrow unless I hear that it may be useful.  It can always
be easily added back if it proves useful in the future.

-- 
Cheers,
Stephen Rothwell


Re: [PATCH RESEND] Revert "PCI: dra7xx: Mark driver as broken"

2016-03-08 Thread Kishon Vijay Abraham I


On Tuesday 08 March 2016 11:35 PM, Bjorn Helgaas wrote:
> On Fri, Mar 04, 2016 at 03:59:19PM +0530, Kishon Vijay Abraham I wrote:
>> From: Sekhar Nori 
>>
>> This reverts commit <5c3b99d057525fe2befe6a7db9b1309035d93eee>
>> ("PCI: dra7xx: Mark driver as broken").
>>
>> With support to de-assert PCIe reset present in kernel,
>> DRA7x PCIe is not broken anymore.
>>
>> Signed-off-by: Sekhar Nori 
>> Signed-off-by: Kishon Vijay Abraham I 
>> ---
>> Bjorn,
>>
>> This patch should be merged only after [1] hits linus tree.
>>
>> Thanks
>> Kishon
>>
>> [1] -> git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending.git
>>  for-v4.6/omap-hwmod-b
> 
> OK.  Would you mind pinging me about this after [1] is merged to
> Linus' tree?  I don't want to put this in my "next" branch yet,
> because I don't know whether Paul's branch or mine will be merged first
> in the merge window.  But after the merge window closes, I can
> certainly include this in a "PCI fixes" pull request for inclusion in
> v4.6.

Yes, sure. I can do that.

Thanks
Kishon

> 
> Bjorn
> 
>>  drivers/pci/host/Kconfig |1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
>> index 75a6054..6d181ba 100644
>> --- a/drivers/pci/host/Kconfig
>> +++ b/drivers/pci/host/Kconfig
>> @@ -5,7 +5,6 @@ config PCI_DRA7XX
>>  bool "TI DRA7xx PCIe controller"
>>  select PCIE_DW
>>  depends on OF && HAS_IOMEM && TI_PIPE3
>> -depends on BROKEN
>>  help
>>   Enables support for the PCIe controller in the DRA7xx SoC.  There
>>   are two instances of PCIe controller in DRA7xx.  This controller can
>> -- 
>> 1.7.9.5
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] platform/chrome: cros_ec_lightbar - use name instead of ID to hide lightbar attributes

2016-03-08 Thread Clinton Sprain
Lightbar attributes are hidden if the ID of the device is not 0
(the assumption being that 0 = cros_ec = might have a lightbar,
1 = cros_pd = hide); however, sometimes these devices get IDs 1
and 2 (or something else) instead of IDs 0 and 1. This prevents
the lightbar attributes from appearing when they should.

Proposed change is to instead check whether the name assigned to
the device is CROS_EC_DEV_NAME (true for cros_ec, false for cros_pd).
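
A minimal userspace sketch of the name check described above. The helper
name is hypothetical and the two string values are assumptions made for
illustration. Note that strcmp() returns 0 on a match, so a variable that
holds its raw result is non-zero when the names differ; wrapping the
comparison in a boolean helper avoids reading the test backwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed values, used here for illustration only. */
#define CROS_EC_DEV_NAME    "cros_ec"
#define CROS_EC_DEV_PD_NAME "cros_pd"

/* Hypothetical helper: true only when the device is named like the EC. */
static bool ec_name_is_cros_ec(const char *ec_name)
{
	assert(ec_name != NULL);
	/* strcmp() yields 0 for equal strings, hence the explicit == 0 */
	return strcmp(ec_name, CROS_EC_DEV_NAME) == 0;
}
```

A visibility callback built on this would hide the attributes when the helper
returns false, independent of which platform device ID was assigned.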

Signed-off-by: Clinton Sprain 
---
 drivers/platform/chrome/cros_ec_lightbar.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/platform/chrome/cros_ec_lightbar.c 
b/drivers/platform/chrome/cros_ec_lightbar.c
index ff76405..b6356b3 100644
--- a/drivers/platform/chrome/cros_ec_lightbar.c
+++ b/drivers/platform/chrome/cros_ec_lightbar.c
@@ -414,7 +414,12 @@ static umode_t cros_ec_lightbar_attrs_are_visible(struct 
kobject *kobj,
  struct cros_ec_dev, class_dev);
struct platform_device *pdev = container_of(ec->dev,
   struct platform_device, dev);
-   if (pdev->id != 0)
+   struct cros_ec_platform *pdata = pdev->dev.platform_data;
+   int is_cros_ec;
+
+   is_cros_ec = strcmp(pdata->ec_name, CROS_EC_DEV_NAME);
+
+   if (is_cros_ec != 0)
return 0;
 
/* Only instantiate this stuff if the EC has a lightbar */
-- 
2.5.0



Re: [PATCH v2] ARM: dts: add "simple-bus" where "arm,amba-bus" is used alone

2016-03-08 Thread Masahiro Yamada
Hi Rob,


2016-03-08 17:49 GMT+09:00 Rob Herring :
> On Mon, Mar 7, 2016 at 11:46 PM, Masahiro Yamada
>  wrote:
>> The compatible string "simple-bus" is well defined in ePAPR, while
>> I see no documentation for the "arm,amba-bus" in ePAPR or under
>> Documentation/devicetree/.
>>
>> DT is also used by other projects than Linux kernel.  It is not a
>> good idea to rely on such an unofficial binding.
>>
>> Signed-off-by: Masahiro Yamada 
>> ---
>>
>> Changes in v2:
>>   - Rephrase the git-log
>
> [...]
>
>> diff --git a/arch/arm/boot/dts/axm55xx.dtsi b/arch/arm/boot/dts/axm55xx.dtsi
>> index ea288f0..8da4582 100644
>> --- a/arch/arm/boot/dts/axm55xx.dtsi
>> +++ b/arch/arm/boot/dts/axm55xx.dtsi
>> @@ -107,7 +107,7 @@
>> };
>>
>> amba {
>> -   compatible = "arm,amba-bus";
>> +   compatible = "arm,amba-bus", "simple-bus";
>
> As I mentioned in the last version, I think you should remove
> arm,amba-bus and just have simple-bus. I don't believe anything relies
> on having arm,amba-bus.

Sorry about this.

V3 is here now:
https://patchwork.kernel.org/patch/8542031/




-- 
Best Regards
Masahiro Yamada


Re: [PATCH 0/2] mm: Enable page parallel initialisation for Power

2016-03-08 Thread Balbir Singh


On 09/03/16 15:17, Li Zhang wrote:
> On Tue, Mar 8, 2016 at 10:45 PM, Balbir Singh  wrote:
>>
>> On 08/03/16 14:55, Li Zhang wrote:
>>> From: Li Zhang 
>>>
>>> Upstream has supported page parallel initialisation for X86 and the
>>> boot time is improved greatly. Some tests have been done for Power.
>>>
>>> Here is the result I have done with different memory size.
>>>
>>> * 4GB memory:
>>> boot time is as the following:
>>> with patch vs without patch: 10.4s vs 24.5s
>>> boot time is improved 57%
>>> * 200GB memory:
>>> boot time looks the same with and without patches.
>>> boot time is about 38s
>>> * 32TB memory:
>>> boot time looks the same with and without patches
>>> boot time is about 160s.
>>> The boot time is much shorter than X86 with 24TB memory.
>>> From community discussion, it costs about 694s for X86 24T system.
>>>
>>> From code view, parallel initialisation improves the performance by
>>> deferring memory initialisation to kswap with N kthreads, it should
>>> improve the performance theoretically.
>>>
>>> From the test result, On X86, performance is improved greatly with huge
>>> memory. But on Power platform, it is improved greatly with less than
>>> 100GB memory. For huge memory, it is not improved greatly. But it saves
>>> the time with several threads at least, as the following information
>>> shows(32TB system log):
>>>
>>> [   22.648169] node 9 initialised, 16607461 pages in 280ms
>>> [   22.783772] node 3 initialised, 23937243 pages in 410ms
>>> [   22.858877] node 6 initialised, 29179347 pages in 490ms
>>> [   22.863252] node 2 initialised, 29179347 pages in 490ms
>>> [   22.907545] node 0 initialised, 32049614 pages in 540ms
>>> [   22.920891] node 15 initialised, 32212280 pages in 550ms
>>> [   22.923236] node 4 initialised, 32306127 pages in 550ms
>>> [   22.923384] node 12 initialised, 32314319 pages in 550ms
>>> [   22.924754] node 8 initialised, 32314319 pages in 550ms
>>> [   22.940780] node 13 initialised, 33353677 pages in 570ms
>>> [   22.940796] node 11 initialised, 33353677 pages in 570ms
>>> [   22.941700] node 5 initialised, 33353677 pages in 570ms
>>> [   22.941721] node 10 initialised, 33353677 pages in 570ms
>>> [   22.941876] node 7 initialised, 33353677 pages in 570ms
>>> [   22.944946] node 14 initialised, 33353677 pages in 570ms
>>> [   22.946063] node 1 initialised, 33345485 pages in 580ms
>>>
>>> It saves about 550*16 ms at least, although that is negligible compared
>>> with the boot time of about 160 seconds. What's more, the boot time is much
>>> shorter on Power, even without the patches, than on x86 for huge-memory machines.
>>>
>>> So this patchset is still necessary to be enabled for Power.
>>>
>>>
> Hi Balbir,
>
> Thanks for your reviewing.
>
>> The patchset looks good, two questions
>>
>> 1. The patchset is still necessary for
>> a. systems with smaller amount of RAM?
>I think it is. Currently, I tested systems for 4GB, 50GB, and
> boot time is improved.
>We may test more systems with different memory size in the future.
>> b. Theoretically it improves boot time?
>The boot time is improved a little bit for huge memory system
> and it can be ignored.
>But I think it's still necessary to enable this feature.
>
>> 2. the pgdat->node_spanned_pages >> 8 sounds arbitrary
>> On a system with 2TB*16 nodes, it would initialize about 8GB before 
>> calling deferred init?
>> Don't we need at-least 32GB + space for other early hash allocations
>> BTW, My expectation was that 32TB would imply 32GB+32GB of large hash 
>> allocations early on
>   pgdat->node_spanned_pages >> 8 means that it allocates the size
> of the memory on one node.
>   On a system with 2TB *16nodes, it will allocate 16*8GB = 128GB.
>   I am not sure if it can be minimised to >> 16 to make sure all
> the architectures with different
>   memory size work well.  And this is also mentioned in early
> discussion for X86, so I choose  >> 8.
>
> *From the code as the following:
>
>   free_area_init_core ->
>  memmap_init->
>   update_defer_init
>  #define memmap_init(size, nid, zone, start_pfn) \
>memmap_init_zone((size), (nid), (zone), (start_pfn), MEMMAP_EARLY)
>
>  memmap_init_zone is based on a zone, but free_area_init_core will
> help find the highest
>  zone on the node. And update_defer_init() get max initialised
> memory on highest zone for a node to
>  reserve for early initialisation.
>
>  static void __paginginit free_area_init_core(struct pglist_data *pgdat)
>  {
> ...
>for (j = 0; j < MAX_NR_ZONES; j++) {
>   
>  memmap_init(size, nid, j, zone_start_fn);   //find
> the highest zone on a node.
>  ...
>}
>  }
>
> *   From the dmesg log, after applying this patchset, it has
> 123013440K(about 117GB),
> which 

Re: [PATCH v1] tools/vm/page-types.c: remove memset() in walk_pfn()

2016-03-08 Thread Konstantin Khlebnikov
On Tue, Mar 8, 2016 at 8:58 AM, Naoya Horiguchi
 wrote:
> On Tue, Mar 08, 2016 at 08:12:09AM +0300, Konstantin Khlebnikov wrote:
>> On Tue, Mar 8, 2016 at 4:47 AM, Naoya Horiguchi
>>  wrote:
>> > I found that page-types is very slow and my testing shows many timeout 
>> > errors.
>> > Here's an example with a simple program allocating 1000 thps.
>> >
>> >   $ time ./page-types -p $(pgrep -f test_alloc)
>> >   ...
>> >   real0m17.201s
>> >   user0m16.889s
>> >   sys 0m0.312s
>> >
>> >   $ time ./page-types.patched -p $(pgrep -f test_alloc)
>> >   ...
>> >   real0m0.182s
>> >   user0m0.046s
>> >   sys 0m0.135s
>> >
>> > Most of time is spent in memset(), which isn't necessary because we check
>> > that the return of kpagecgroup_read() is equal to pages and uninitialized
>> > memory is never used. So we can drop this memset().
>>
>> These zeros are used in show_page_range() - for merging pages into ranges.
>
> Hi Konstantin,
>
> Thank you for the response. The below code does solve the problem, so that's 
> fine.
>
> But I don't understand how the zeros are used. show_page_range() is called
> via add_page() which is called for i=0 to i=pages-1, and the buffer cgi is
> already filled for the range [i, pages-1] by kpagecgroup_read(), so even if
> without zero initialization, kpagecgroup_read() properly fills zeros, right?
> IOW, is there any problem if we don't do this zero initialization?

kpagecgroup_read() reads only if kpagecgroup was opened;
/proc/kpagecgroup might not even exist. It's probably better to fill
them with zeros here.
The pre-memset was an optimization: it fills the buffer only once instead of
on each kpagecgroup_read() call.
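
One possible generalisation of the count == 1 fast path quoted below, as an
untested userspace sketch rather than anything posted in this thread: zero
only the entries that a batch can actually consume. BATCH stands in for
KPAGEFLAGS_BATCH with an assumed value, and both function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BATCH 64	/* stand-in for KPAGEFLAGS_BATCH; the value is an assumption */

/* Zero min(count, cap) leading entries and return how many were cleared. */
static size_t zero_needed_entries(uint64_t *cgi, size_t cap, unsigned long count)
{
	size_t n = count < cap ? count : cap;

	assert(cgi != NULL);
	memset(cgi, 0, n * sizeof(cgi[0]));
	return n;
}

/* Demo: poison a buffer, prepare it for 'count' pages, count zeroed slots. */
static size_t count_zeroed_after_prepare(unsigned long count)
{
	uint64_t cgi[BATCH];
	size_t i, zeroed = 0;

	memset(cgi, 0xff, sizeof(cgi));
	zero_needed_entries(cgi, BATCH, count);
	for (i = 0; i < BATCH; i++)
		if (cgi[i] == 0)
			zeroed++;
	return zeroed;
}
```

For count == 1 this clears a single entry, and for large counts it clears the
whole buffer once, the same as the existing memset().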

>
> Thanks,
> Naoya Horiguchi
>
>> You could add fast-path for count=1
>>
>> @@ -633,7 +633,10 @@ static void walk_pfn(unsigned long voffset,
>> unsigned long pages;
>> unsigned long i;
>>
>> -   memset(cgi, 0, sizeof cgi);
>> +   if (count == 1)
>> +   cgi[0] = 0;
>> +   else
>> +   memset(cgi, 0, sizeof cgi);
>>
>> while (count) {
>> batch = min_t(unsigned long, count, KPAGEFLAGS_BATCH);
>>


[PATCH v3] ARM,ARM64: dts: drop "arm,amba-bus" in favor of "simple-bus"

2016-03-08 Thread Masahiro Yamada
The compatible string "simple-bus" is well defined in ePAPR, while
I see no documentation for the "arm,amba-bus" anywhere in ePAPR or
Documentation/devicetree/.

DT is also used by other projects than Linux kernel.  It is not a
good idea to rely on such an unofficial binding.

This commit
  - replaces "arm,amba-bus" with "simple-bus"
  - drops "arm,amba-bus" where it is used along with "simple-bus"

Signed-off-by: Masahiro Yamada 
---

Changes in v3:
  - Kill "arm,amba-bus" completely

Changes in v2:
  - Rephrase the git-log

 arch/arm/boot/dts/axm55xx.dtsi   | 2 +-
 arch/arm/boot/dts/exynos3250.dtsi| 2 +-
 arch/arm/boot/dts/exynos4.dtsi   | 2 +-
 arch/arm/boot/dts/exynos4415.dtsi| 2 +-
 arch/arm/boot/dts/exynos5250.dtsi| 2 +-
 arch/arm/boot/dts/exynos5420.dtsi| 2 +-
 arch/arm/boot/dts/exynos5440.dtsi| 2 +-
 arch/arm/boot/dts/hi3620.dtsi| 2 +-
 arch/arm/boot/dts/hip01.dtsi | 2 +-
 arch/arm/boot/dts/hisi-x5hd2.dtsi| 2 +-
 arch/arm/boot/dts/integrator.dtsi| 2 +-
 arch/arm/boot/dts/qcom-apq8064.dtsi  | 2 +-
 arch/arm/boot/dts/qcom-msm8660.dtsi  | 2 +-
 arch/arm/boot/dts/qcom-msm8960.dtsi  | 2 +-
 arch/arm/boot/dts/rk3036.dtsi| 2 +-
 arch/arm/boot/dts/rk3228.dtsi| 2 +-
 arch/arm/boot/dts/rk3288.dtsi| 2 +-
 arch/arm/boot/dts/rk3xxx.dtsi| 2 +-
 arch/arm/boot/dts/s5pv210.dtsi   | 2 +-
 arch/arm/boot/dts/socfpga.dtsi   | 2 +-
 arch/arm/boot/dts/socfpga_arria10.dtsi   | 2 +-
 arch/arm/boot/dts/ste-nomadik-stn8815.dtsi   | 2 +-
 arch/arm/boot/dts/ste-u300.dts   | 2 +-
 arch/arm/boot/dts/versatile-ab.dts   | 2 +-
 arch/arm/boot/dts/vexpress-v2m-rs1.dtsi  | 2 +-
 arch/arm/boot/dts/vexpress-v2m.dtsi  | 2 +-
 arch/arm64/boot/dts/arm/foundation-v8.dts| 2 +-
 arch/arm64/boot/dts/arm/juno-motherboard.dtsi| 2 +-
 arch/arm64/boot/dts/arm/rtsm_ve-motherboard.dtsi | 2 +-
 29 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/arch/arm/boot/dts/axm55xx.dtsi b/arch/arm/boot/dts/axm55xx.dtsi
index ea288f0..a9d6d59 100644
--- a/arch/arm/boot/dts/axm55xx.dtsi
+++ b/arch/arm/boot/dts/axm55xx.dtsi
@@ -107,7 +107,7 @@
};
 
amba {
-   compatible = "arm,amba-bus";
+   compatible = "simple-bus";
#address-cells = <2>;
#size-cells = <2>;
ranges;
diff --git a/arch/arm/boot/dts/exynos3250.dtsi 
b/arch/arm/boot/dts/exynos3250.dtsi
index 18e3def..c61ec96 100644
--- a/arch/arm/boot/dts/exynos3250.dtsi
+++ b/arch/arm/boot/dts/exynos3250.dtsi
@@ -381,7 +381,7 @@
};
 
amba {
-   compatible = "arm,amba-bus";
+   compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <1>;
ranges;
diff --git a/arch/arm/boot/dts/exynos4.dtsi b/arch/arm/boot/dts/exynos4.dtsi
index 045785c..3647f96 100644
--- a/arch/arm/boot/dts/exynos4.dtsi
+++ b/arch/arm/boot/dts/exynos4.dtsi
@@ -661,7 +661,7 @@
amba {
#address-cells = <1>;
#size-cells = <1>;
-   compatible = "arm,amba-bus";
+   compatible = "simple-bus";
interrupt-parent = <&gic>;
ranges;
 
diff --git a/arch/arm/boot/dts/exynos4415.dtsi 
b/arch/arm/boot/dts/exynos4415.dtsi
index ad76484..28b04b6 100644
--- a/arch/arm/boot/dts/exynos4415.dtsi
+++ b/arch/arm/boot/dts/exynos4415.dtsi
@@ -380,7 +380,7 @@
};
 
amba {
-   compatible = "arm,amba-bus";
+   compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <1>;
interrupt-parent = <&gic>;
diff --git a/arch/arm/boot/dts/exynos5250.dtsi 
b/arch/arm/boot/dts/exynos5250.dtsi
index 33e2d5f..56ba53e 100644
--- a/arch/arm/boot/dts/exynos5250.dtsi
+++ b/arch/arm/boot/dts/exynos5250.dtsi
@@ -674,7 +674,7 @@
amba {
#address-cells = <1>;
#size-cells = <1>;
-   compatible = "arm,amba-bus";
+   compatible = "simple-bus";
interrupt-parent = <&gic>;
ranges;
 
diff --git a/arch/arm/boot/dts/exynos5420.dtsi 
b/arch/arm/boot/dts/exynos5420.dtsi
index 48a0a55..ec80ddb 100644
--- a/arch/arm/boot/dts/exynos5420.dtsi
+++ b/arch/arm/boot/dts/exynos5420.dtsi
@@ -327,7 +327,7 @@
amba {
#address-cells = <1>;
#size-cells = <1>;
-   compatible = "arm,amba-bus";
+   compatible = "simple-bus";
 

Re: [PATCH 4.4 13/74] cifs: fix out-of-bounds access in lease parsing

2016-03-08 Thread Steve French
On Tue, Mar 8, 2016 at 9:47 PM, Ben Hutchings  wrote:
> On Mon, 2016-03-07 at 16:02 -0800, Greg Kroah-Hartman wrote:
>> 4.4-stable review patch.  If anyone has any objections, please let me know.
>>
>> --
>>
>> From: Justin Maggard 
>>
>> commit deb7deff2f00bdbbcb3d560dad2a89ef37df837d upstream.
>>
>> When opening a file, SMB2_open() attempts to parse the lease state from the
>> SMB2 CREATE Response.  However, the parsing code was not careful to ensure
>> that the create contexts are not empty or invalid, which can lead to out-
>> of-bounds memory access.  This can be seen easily by trying
>> to read a file from a OSX 10.11 SMB3 server.  Here is sample crash output:
>>
>> BUG: unable to handle kernel paging request at 8800a1a77cc6
>> IP: [] SMB2_open+0x804/0x960
>> PGD 8f77067 PUD 0
>> Oops:  [#1] SMP
>> Modules linked in:
>> CPU: 3 PID: 2876 Comm: cp Not tainted 4.5.0-rc3.x86_64.1+ #14
>> Hardware name: NETGEAR ReadyNAS 314  /ReadyNAS 314  , BIOS 
>> 4.6.5 10/11/2012
>> task: 880073cdc080 ti: 88005b31c000 task.ti: 88005b31c000
>> RIP: 0010:[]  [] SMB2_open+0x804/0x960
>> RSP: 0018:88005b31fa08  EFLAGS: 00010282
>> RAX: 0015 RBX:  RCX: 0006
>> RDX:  RSI: 0246 RDI: 88007eb8c8b0
>> RBP: 88005b31fad8 R08: 66203d206363 R09: 6131613030383866
>> R10: 30303838 R11: 02b0 R12: 8800660fd800
>> R13: 8800a1a77cc2 R14: 424d53fe R15: 88005f5a28c0
>> FS:  7f7c8a2897c0() GS:88007eb8() knlGS:
>> CS:  0010 DS:  ES:  CR0: 8005003b
>> CR2: 8800a1a77cc6 CR3: 5b281000 CR4: 06e0
>> Stack:
>>  88005b31fa70 88278789 01d3 88005f5a2a80
>>  0003 88005d029d00 88006fde05a0 
>>  88005b31fc78 88006fde0780 88005b31fb2f 00010fe0
>> Call Trace:
>>  [] ? cifsConvertToUTF16+0x159/0x2d0
>>  [] smb2_open_file+0x98/0x210
>>  [] ? __kmalloc+0x1c/0xe0
>>  [] cifs_open+0x2a4/0x720
>>  [] do_dentry_open+0x1ff/0x310
>>  [] ? cifsFileInfo_get+0x30/0x30
>>  [] vfs_open+0x52/0x60
>>  [] path_openat+0x170/0xf70
>>  [] ? remove_wait_queue+0x48/0x50
>>  [] do_filp_open+0x79/0xd0
>>  [] ? __alloc_fd+0x3a/0x170
>>  [] do_sys_open+0x114/0x1e0
>>  [] SyS_open+0x19/0x20
>>  [] entry_SYSCALL_64_fastpath+0x12/0x6a
>> Code: 4d 8d 6c 07 04 31 c0 4c 89 ee e8 47 6f e5 ff 31 c9 41 89 ce 44 89 f1 
>> 48 c7 c7 28 b1 bd 88 31 c0 49 01 cd 4c 89 ee e8 2b 6f e5 ff <45> 0f b7 75 04 
>> 48 c7 c7 31 b1 bd 88 31 c0 4d 01 ee 4c 89 f6 e8
>> RIP  [] SMB2_open+0x804/0x960
>>  RSP
>> CR2: 8800a1a77cc6
>> ---[ end trace d9f69ba64feee469 ]---
>>
>> Signed-off-by: Justin Maggard 
>> Signed-off-by: Steve French 
>> Signed-off-by: Greg Kroah-Hartman 
>>
>> ---
>>  fs/cifs/smb2pdu.c |   24 ++--
>>  1 file changed, 14 insertions(+), 10 deletions(-)
>>
>> --- a/fs/cifs/smb2pdu.c
>> +++ b/fs/cifs/smb2pdu.c
>> @@ -1109,21 +1109,25 @@ parse_lease_state(struct TCP_Server_Info
>>  {
>>   char *data_offset;
>>   struct create_context *cc;
>> - unsigned int next = 0;
>> + unsigned int next;
>> + unsigned int remaining;
>>   char *name;
>>
>>   data_offset = (char *)rsp + 4 + le32_to_cpu(rsp->CreateContextsOffset);
>> + remaining = le32_to_cpu(rsp->CreateContextsLength);
>
> What if remaining is > the response length?

Do you want to do the follow-on patch to check for that, or do you want me
to write up a small patch for that?
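
For reference, a userspace sketch of the kind of bounds checks being asked
about. This is not the follow-on patch: struct ctx_hdr is a simplified,
host-endian stand-in for struct create_context, and the function names are
hypothetical. The walk stops if Next would step past the remaining bytes,
and it checks that the name lies inside the buffer before comparing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, host-endian stand-in for struct create_context (hypothetical). */
struct ctx_hdr {
	uint32_t next;		/* offset to the next context, 0 if last */
	uint16_t name_offset;	/* offset of the name from this context */
	uint16_t name_length;
};

/* Offset of the first "RqLs" context in buf, or -1 if absent or malformed. */
static long find_lease_ctx(const unsigned char *buf, size_t len)
{
	size_t off = 0;

	assert(buf != NULL || len == 0);
	while (len - off >= sizeof(struct ctx_hdr)) {
		struct ctx_hdr h;
		size_t remaining = len - off;

		memcpy(&h, buf + off, sizeof(h));
		/* the name must lie entirely inside the remaining bytes */
		if (h.name_offset <= remaining &&
		    h.name_length <= remaining - h.name_offset &&
		    h.name_length == 4 &&
		    memcmp(buf + off + h.name_offset, "RqLs", 4) == 0)
			return (long)off;
		if (h.next == 0)
			break;
		/* refuse to step past the buffer or to make no real progress */
		if (h.next > remaining || h.next < sizeof(struct ctx_hdr))
			return -1;
		off += h.next;
	}
	return -1;
}

/* Demo: two 16-byte contexts; the second is "RqLs"; first.next is varied. */
static long demo_walk(uint32_t first_next)
{
	unsigned char buf[32] = { 0 };
	struct ctx_hdr h = { first_next, 8, 4 };

	memcpy(buf, &h, sizeof(h));
	memcpy(buf + 8, "XXXX", 4);
	h.next = 0;
	memcpy(buf + 16, &h, sizeof(h));
	memcpy(buf + 24, "RqLs", 4);
	return find_lease_ctx(buf, sizeof(buf));
}
```

In the demo, a well-formed Next of 16 finds the lease context at offset 16,
while an oversized Next is rejected instead of underflowing the remaining
length.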

>>   cc = (struct create_context *)data_offset;
>> - do {
>> - cc = (struct create_context *)((char *)cc + next);
>> + while (remaining >= sizeof(struct create_context)) {
>>   name = le16_to_cpu(cc->NameOffset) + (char *)cc;
>> - if (le16_to_cpu(cc->NameLength) != 4 ||
>> - strncmp(name, "RqLs", 4)) {
>> - next = le32_to_cpu(cc->Next);
>> - continue;
>> - }
>> - return server->ops->parse_lease_buf(cc, epoch);
>> - } while (next != 0);
>> + if (le16_to_cpu(cc->NameLength) == 4 &&
>> + strncmp(name, "RqLs", 4) == 0)
>> + return server->ops->parse_lease_buf(cc, epoch);
>> +
>> + next = le32_to_cpu(cc->Next);
>> + if (!next)
>> + break;
>> + remaining -= next;
>
> What if next > remaining?
>
> This change seems to be only scratching the surface of the security
> failure here.
>
> Ben.
>
>> + cc = (struct create_context *)((char *)cc + next);
>> + }
>>
>>   return 0;
>>  }
>
> --
> Ben Hutchings
> When in doubt, use brute force. - Ken Thompson



-- 
Thanks,

Steve

