Re: v4.2-rc dcache regression, probably 75a6f82a0d10

2015-07-31 Thread Dominique Martinet
Hugh Dickins wrote on Fri, Jul 31, 2015:
> On Fri, 31 Jul 2015, Linus Torvalds wrote:
>> I'd be more suspicious about other effects. For example, it's not at
>> all obvious that the commit in question just changes the order of the
>> flags/inode field accesses, there are potentially bigger changes there.
>> For example, this part (in __d_obtain_alias):
>> 
>> -   tmp->d_inode = inode;
>> -   tmp->d_flags |= add_flags;
>> +   __d_set_inode_and_type(tmp, inode, add_flags);
>> 
>> looks a bit off, because it *used* to just add those flags, but now,
>> through __d_set_inode_and_type, it does
>> 
>> +   dentry->d_inode = inode;
>> +   smp_wmb();
>> +   flags = READ_ONCE(dentry->d_flags);
>> +   flags &= ~(DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU);
>> +   flags |= type_flags;
>> +   WRITE_ONCE(dentry->d_flags, flags);
>> 
>> so it clears DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU.
>> 
>> Is that correct? Maybe, I haven't checked. And maybe it's a big bad
>> bug. Regardless, it sure as hell isn't just changing the order of the
>> access to those fields. That "DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU"
>> clearing came from __d_instantiate(), but now it hits __d_obtain_alias
>> too.

Yeah, had spotted this and tried to fix just this bit, but it didn't
seem to change much for my problem.
Not saying it isn't a bug, but I have no idea what __d_obtain_alias
does and nobody seemed to care about this bit in my previous thread.
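
To make the behavioural difference in the quoted hunks concrete, here is a minimal user-space sketch. It is not kernel code: the SK_* bit values and the helper names old_add_flags() and new_set_type() are made up for illustration, and only the masking logic follows the quoted lines.

```c
#include <stdint.h>

/* Illustrative bit values only; the real DCACHE_* constants are defined
 * in the kernel headers and are not reproduced here. */
#define SK_ENTRY_TYPE 0x00700000u /* stand-in for DCACHE_ENTRY_TYPE */
#define SK_FALLTHRU   0x01000000u /* stand-in for DCACHE_FALLTHRU */

/* Old __d_obtain_alias behaviour: the new flags are simply OR-ed in,
 * so every bit that was already set survives. */
static uint32_t old_add_flags(uint32_t d_flags, uint32_t add_flags)
{
	return d_flags | add_flags;
}

/* Behaviour after routing through __d_set_inode_and_type: the type and
 * fallthru bits are cleared first, then the new flags are OR-ed in. */
static uint32_t new_set_type(uint32_t d_flags, uint32_t type_flags)
{
	d_flags &= ~(SK_ENTRY_TYPE | SK_FALLTHRU);
	return d_flags | type_flags;
}
```

With a dentry that already carries the fallthru bit, old_add_flags() keeps it while new_set_type() drops it, which is the extra effect being questioned.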

> Yes, the one which grabbed my attention is:
> 
> @@ -311,7 +346,7 @@ static void dentry_iput(struct dentry * dentry)
>  {
>   struct inode *inode = dentry->d_inode;
>   if (inode) {
> - dentry->d_inode = NULL;
> + __d_clear_type_and_inode(dentry);
>   hlist_del_init(&dentry->d_u.d_alias);
>   spin_unlock(&dentry->d_lock);
>   spin_unlock(&inode->i_lock);

Oh, missed that one... Running tests with just that over the weekend,
it's a good candidate and the first 10 minutes of tests sound quite
positive!

I think it's wrong to call it a "fix" even if it stops the bug from
reproducing though, the way I understand the intent of the patch, they
want that checking d_flags be enough to take decisions without having to
check d_inode as well - so now things rely on that, it's still going to
be wrong on HEAD... I think?
(I'm running tests at this commit, so I don't have the patches that rely
on that yet)

As I understand things, the fact that lookup managed to get a
being-removed entry from rcu/wherever isn't changed, it's just that it
won't fail as fast and maybe something later will notice the lack of
inode and fall back gracefully instead of that ENOTDIR I've been
tracking -- so that commit makes it possible to hit the bug, but there's
another issue about taking the dentry in the first place?
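
A rough user-space model of the ordering being discussed, using C11 atomics in place of smp_wmb(), READ_ONCE() and WRITE_ONCE(). Everything here is an assumption made for illustration: the struct, the SK_TYPE_MASK value and the sk_* names are hypothetical, and the clear path is only a guess at a mirror image of the quoted set path, since this thread does not show its body.

```c
#include <stdatomic.h>
#include <stddef.h>

#define SK_TYPE_MASK 0x7u /* hypothetical stand-in for the type bits */

struct sk_dentry {
	_Atomic(void *) inode;
	atomic_uint flags;
};

/* Set path: publish the inode first, then the type bits. The release
 * store plays the role of the smp_wmb() in the quoted hunk. */
static void sk_set_inode_and_type(struct sk_dentry *d, void *inode,
				  unsigned type)
{
	unsigned flags;

	atomic_store_explicit(&d->inode, inode, memory_order_relaxed);
	flags = atomic_load_explicit(&d->flags, memory_order_relaxed);
	flags = (flags & ~SK_TYPE_MASK) | type;
	atomic_store_explicit(&d->flags, flags, memory_order_release);
}

/* Clear path (assumed): drop the type bits first, then the inode. */
static void sk_clear_type_and_inode(struct sk_dentry *d)
{
	unsigned flags = atomic_load_explicit(&d->flags, memory_order_relaxed);

	atomic_store_explicit(&d->flags, flags & ~SK_TYPE_MASK,
			      memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&d->inode, NULL, memory_order_relaxed);
}

/* Reader that decides from the flags alone. A reader racing with the
 * clear path can still observe stale type bits and a NULL inode, so
 * callers must be prepared for a NULL return. */
static void *sk_lookup_inode(struct sk_dentry *d)
{
	unsigned flags = atomic_load_explicit(&d->flags, memory_order_acquire);

	if ((flags & SK_TYPE_MASK) == 0)
		return NULL;
	return atomic_load_explicit(&d->inode, memory_order_relaxed);
}
```

The sketch only shows why clearing the type bits changes what a flags-only reader concludes; it says nothing about how the dentry was found in the first place.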


I really need to spend some time to understand vfs/dcache better as a
whole at some point, I've been looking at a small part of it without
context for too long...


Cheers,
-- 
Dominique
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[lkp] [x86/build] b2c51106c75: -18.1% will-it-scale.per_process_ops

2015-07-31 Thread kernel test robot
FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/asm
commit b2c51106c7581866c37ffc77c5d739f3d4b7cbc9 ("x86/build: Fix detection of 
GCC -mpreferred-stack-boundary support")


=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/cpufreq_governor/test:
  wsm/will-it-scale/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/performance/readseek2

commit: 
  f2a50f8b7da45ff2de93a71393e715a2ab9f3b68
  b2c51106c7581866c37ffc77c5d739f3d4b7cbc9

f2a50f8b7da45ff2 b2c51106c7581866c37ffc77c5
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
    879002 ±  0%     -18.1%      720270 ±  7%  will-it-scale.per_process_ops
      0.02 ±  0%     +34.5%        0.02 ±  7%  will-it-scale.scalability
  26153173 ±  0%      +7.0%    27977076 ±  0%  will-it-scale.time.voluntary_context_switches
     15.70 ±  2%     +14.5%       17.98 ±  0%  time.user_time
    370683 ±  0%      +6.2%      393491 ±  0%  vmstat.system.cs
    830343 ± 56%     -54.0%      382128 ± 39%  cpuidle.C1E-NHM.time
    788.25 ± 14%     -21.7%      617.25 ± 16%  cpuidle.C1E-NHM.usage
      3947 ±  2%     +10.6%        4363 ±  3%  slabinfo.kmalloc-192.active_objs
      3947 ±  2%     +10.6%        4363 ±  3%  slabinfo.kmalloc-192.num_objs
   1082762 ±162%    -100.0%        0.00 ± -1%  latency_stats.avg.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
   1082762 ±162%    -100.0%        0.00 ± -1%  latency_stats.max.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
   1082762 ±162%    -100.0%        0.00 ± -1%  latency_stats.sum.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
      2.58 ±  8%     +19.5%        3.09 ±  3%  perf-profile.cpu-cycles._raw_spin_lock_irqsave.finish_wait.__wait_on_bit_lock.__lock_page.find_lock_entry
      7.02 ±  3%      +9.2%        7.67 ±  2%  perf-profile.cpu-cycles._raw_spin_lock_irqsave.prepare_to_wait_exclusive.__wait_on_bit_lock.__lock_page.find_lock_entry
      3.07 ±  2%     +14.8%        3.53 ±  3%  perf-profile.cpu-cycles.finish_wait.__wait_on_bit_lock.__lock_page.find_lock_entry.shmem_getpage_gfp
      3.05 ±  5%      -8.4%        2.79 ±  4%  perf-profile.cpu-cycles.hrtimer_start_range_ns.tick_nohz_stop_sched_tick.__tick_nohz_idle_enter.tick_nohz_idle_enter.cpu_startup_entry
      0.98 ±  3%     -25.1%        0.74 ±  7%  perf-profile.cpu-cycles.is_ftrace_trampoline.print_context_stack.dump_trace.save_stack_trace_tsk.__account_scheduler_latency
      1.82 ± 18%     +46.6%        2.67 ±  3%  perf-profile.cpu-cycles.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.finish_wait.__wait_on_bit_lock.__lock_page
      8.05 ±  3%      +9.5%        8.82 ±  3%  perf-profile.cpu-cycles.prepare_to_wait_exclusive.__wait_on_bit_lock.__lock_page.find_lock_entry.shmem_getpage_gfp
      7.75 ± 34%     -64.5%        2.75 ± 64%  sched_debug.cfs_rq[2]:/.nr_spread_over
      1135 ± 20%     -43.6%      640.75 ± 49%  sched_debug.cfs_rq[3]:/.blocked_load_avg
      1215 ± 21%     -43.1%      691.50 ± 46%  sched_debug.cfs_rq[3]:/.tg_load_contrib
     38.50 ± 21%    +129.9%       88.50 ± 36%  sched_debug.cfs_rq[4]:/.load
     26.00 ± 20%     +98.1%       51.50 ± 46%  sched_debug.cfs_rq[4]:/.runnable_load_avg
    128.25 ± 18%    +227.5%      420.00 ± 43%  sched_debug.cfs_rq[4]:/.utilization_load_avg
      1015 ± 78%    +101.1%        2042 ± 25%  sched_debug.cfs_rq[6]:/.blocked_load_avg
      1069 ± 72%    +100.2%        2140 ± 23%  sched_debug.cfs_rq[6]:/.tg_load_contrib
     88.75 ± 14%     -47.3%       46.75 ± 36%  sched_debug.cfs_rq[9]:/.load
     59.25 ± 23%     -41.4%       34.75 ± 34%  sched_debug.cfs_rq[9]:/.runnable_load_avg
    315.50 ± 45%     -64.6%      111.67 ±  1%  sched_debug.cfs_rq[9]:/.utilization_load_avg
   2246758 ±  7%     +87.6%     4213925 ± 65%  sched_debug.cpu#0.nr_switches
   2249376 ±  7%     +87.4%     4215969 ± 65%  sched_debug.cpu#0.sched_count
   1121438 ±  7%     +81.0%     2030313 ± 61%  sched_debug.cpu#0.sched_goidle
   1151160 ±  7%     +86.5%     2146608 ± 64%  sched_debug.cpu#0.ttwu_count
     33.75 ± 15%     -22.2%       26.25 ±  6%  sched_debug.cpu#1.cpu_load[3]
     33.25 ± 10%     -18.0%       27.25 ±  7%  sched_debug.cpu#1.cpu_load[4]
     40.00 ± 18%     +24.4%       49.75 ± 18%  sched_debug.cpu#10.cpu_load[2]
     39.25 ± 14%     +22.3%       48.00 ± 10%  sched_debug.cpu#10.cpu_load[3]
     39.50 ± 15%     +20.3%       47.50 ±  6%  sched_debug.cpu#10.cpu_load[4]
   5269004 ±  1% 

[lkp] [wmi] a46ad0f13bc: alienware_wmi: alienware-wmi: No known WMI GUID found

2015-07-31 Thread kernel test robot
FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
commit a46ad0f13bc32a9601f3c5dff43fafdc2c598814 ("Add WMI driver for 
controlling AlienFX features on some Alienware products")


The following new message in the kernel log may confuse end users.

[6.640707] alienware_wmi: alienware-wmi: No known WMI GUID found


Thanks,
Ying Huang
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 3.14.0 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_CONSTRUCTORS=y
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
# CONFIG_KERNEL_GZIP is not set
# CONFIG_KERNEL_BZIP2 is not set
CONFIG_KERNEL_LZMA=y
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
# CONFIG_SYSVIPC is not set
# CONFIG_POSIX_MQUEUE is not set
CONFIG_FHANDLE=y
# CONFIG_USELIB is not set
# CONFIG_AUDIT is not set

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_IRQ_DOMAIN=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_HZ_PERIODIC=y
# CONFIG_NO_HZ_IDLE is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set

#
# CPU/Task time and stats accounting
#
# CONFIG_TICK_CPU_ACCOUNTING is not set
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
CONFIG_IRQ_TIME_ACCOUNTING=y
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set

#
# RCU Subsystem
#
CONFIG_TREE_PREEMPT_RCU=y
CONFIG_PREEMPT_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_FANOUT=64
CONFIG_RCU_FANOUT_LEAF=16
# CONFIG_RCU_FANOUT_EXACT is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_RCU_BOOST is not set
CONFIG_RCU_NOCB_CPU=y
# CONFIG_RCU_NOCB_CPU_NONE is not set
CONFIG_RCU_NOCB_CPU_ZERO=y
# CONFIG_RCU_NOCB_CPU_ALL is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=17
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_SUPPORTS_INT128=y
CONFIG_ARCH_WANTS_PROT_NUMA_PROT_NONE=y
CONFIG_CGROUPS=y
CONFIG_CGROUP_DEBUG=y
# CONFIG_CGROUP_FREEZER is not set
# CONFIG_CGROUP_DEVICE is not set
# CONFIG_CPUSETS is not set
# CONFIG_CGROUP_CPUACCT is not set
# CONFIG_RESOURCE_COUNTERS is not set
CONFIG_CGROUP_PERF=y
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
# CONFIG_CFS_BANDWIDTH is not set
CONFIG_RT_GROUP_SCHED=y
# CONFIG_BLK_CGROUP is not set
# CONFIG_CHECKPOINT_RESTORE is not set
# CONFIG_NAMESPACES is not set
CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_ANON_INODES=y
CONFIG_HAVE_UID16=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_EXPERT=y
# CONFIG_UID16 is not set
CONFIG_SYSFS_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_PRINTK=y
CONFIG_BUG=y
# CONFIG_PCSPKR_PLATFORM is not set
# CONFIG_BASE_FULL is not set
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
# CONFIG_SHMEM is not set
# CONFIG_AIO is not 

[lkp] [staging] 68905a14e49: kernel BUG at drivers/base/driver.c:153!

2015-07-31 Thread kernel test robot
FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
commit 68905a14e49c97bf49dacd753e40ddd5b254e2ad ("staging: unisys: Add s-Par 
visornic ethernet driver")


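+------------------------------------------------------------------+------------+------------+
|                                                                  | dbb9d61994 | 68905a14e4 |
+------------------------------------------------------------------+------------+------------+
| boot_successes                                                   | 0          | 0          |
| boot_failures                                                    | 11         | 11         |
| invoked_oom-killer:gfp_mask=0x                                   | 11         |            |
| Mem-Info                                                         | 11         |            |
| Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 11         |            |
| backtrace:lock_torture_stats                                     | 11         |            |
| kernel_BUG_at_drivers/base/driver.c                              | 0          | 11         |
| invalid_opcode                                                   | 0          | 11         |
| RIP:driver_register                                              | 0          | 11         |
| Kernel_panic-not_syncing:Fatal_exception                         | 0          | 11         |
| backtrace:visornic_init                                          | 0          | 11         |
| backtrace:kernel_init_freeable                                   | 0          | 11         |
+------------------------------------------------------------------+------------+------------+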


[   12.273990] GPIO INIT FAIL!!
[   12.275607] [ cut here ]
[   12.276231] kernel BUG at drivers/base/driver.c:153!
[   12.276231] invalid opcode:  [#1] PREEMPT SMP
[   12.276231] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.1.0-rc7-01053-g68905a1 #1
[   12.276231] task: 8819c000 ti: 881a task.ti: 881a
[   12.276231] RIP: 0010:[]  [] driver_register+0xa8/0xe0
[   12.276231] RSP: 0018:881a3e30  EFLAGS: 00010246
[   12.276231] RAX:  RBX: 82faef80 RCX:
[   12.276231] RDX:  RSI: 82fae400 RDI: 82faefe0
[   12.276231] RBP: 881a3e78 R08: 880011279b00 R09:
[   12.276231] R10: 880011279b00 R11:  R12: 82faefe0
[   12.276231] R13:  R14: 8205aab0 R15:
[   12.276231] FS:  () GS:88001380() knlGS:
[   12.276231] CS:  0010 DS:  ES:  CR0: 80050033
[   12.276231] CR2: 7f594be968ec CR3: 02e08000 CR4: 000406b0
[   12.276231] Stack:
[   12.276231]  82055977 0100 8205aab0
[   12.276231]  881a3e78 811d0df6 830c0910
[   12.276231]  880010561d50 881a3e88 8205ac53 881a3f08
[   12.276231] Call Trace:
[   12.276231]  [] ? visorbus_register_visor_driver+0x47/0x100
[   12.276231]  [] ? visornic_change_mtu+0x10/0x10
[   12.276231]  [] ? __kmalloc+0x86/0xa0
[   12.276231]  [] visornic_init+0x1a3/0x1f0
[   

[PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

2015-07-31 Thread Calvin Owens
These objects can be referenced concurrently throughout the driver, so we
need a way to make sure threads can't delete them out from under each
other. This patch adds the refcount, and refactors the code to use it.

Additionally, we cannot iterate over the sas_device_list without
holding the lock, or we risk corrupting random memory if items are
added or deleted as we iterate. This patch refactors _scsih_probe_sas()
to use the sas_device_list in a safe way.
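
The lifetime rule described above can be sketched in user space as follows. This is only an illustration under stated assumptions: it uses C11 atomics instead of the kernel's struct kref, and the sk_* names and the sk_freed counter are hypothetical, not part of the patch.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted sas_device. */
struct sk_device {
	unsigned long long sas_address;
	atomic_int refcount;
};

static int sk_freed; /* counts frees so the behaviour can be observed */

/* A new object starts with one reference, owned by the creator. */
static struct sk_device *sk_device_alloc(unsigned long long addr)
{
	struct sk_device *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->sas_address = addr;
	atomic_init(&d->refcount, 1);
	return d;
}

/* Every additional user takes its own reference. */
static void sk_device_get(struct sk_device *d)
{
	atomic_fetch_add_explicit(&d->refcount, 1, memory_order_relaxed);
}

/* Dropping the last reference frees the object; a concurrent holder
 * therefore never sees it disappear from under it. */
static void sk_device_put(struct sk_device *d)
{
	if (atomic_fetch_sub_explicit(&d->refcount, 1,
				      memory_order_acq_rel) == 1) {
		sk_freed++;
		free(d);
	}
}
```

A lookup that hands out a pointer would call sk_device_get() while still holding the list lock, and the caller would call sk_device_put() when done.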

Cc: Christoph Hellwig 
Cc: Bart Van Assche 
Cc: Joe Lawrence 
Signed-off-by: Calvin Owens 
---
Changes in v3:
* Drop the sas_device_lock while enabling devices, and leave the
  sas_device object on the list, since it may need to be looked up
  there while it is being enabled.
* Drop put() in _scsih_add_device(), because the ->hostdata now keeps a
  reference (this was an oversight in v2).
* Be consistent about calling sas_device_put() while holding the
  sas_device_lock where feasible.
* Take and assert_spin_locked() on the sas_device_lock from the newly
  added __get_sdev_from_target(), add wrapper similar to other lookups
  for callers which do not explicitly take the lock.

Changes in v2:
* Squished patches 1-3 into this one
* s/BUG_ON(!spin_is_locked/assert_spin_locked/g
* Store a pointer to the sas_device object in ->hostdata, to eliminate
  the need for several lookups on the lists.

 drivers/scsi/mpt2sas/mpt2sas_base.h  |  22 +-
 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 467 +--
 drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
 3 files changed, 348 insertions(+), 153 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h 
b/drivers/scsi/mpt2sas/mpt2sas_base.h
index caff8d1..78f41ac 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -238,6 +238,7 @@
  * @flags: MPT_TARGET_FLAGS_XXX flags
  * @deleted: target flaged for deletion
  * @tm_busy: target is busy with TM request.
+ * @sdev: The sas_device associated with this target
  */
 struct MPT2SAS_TARGET {
struct scsi_target *starget;
@@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
u32 flags;
u8  deleted;
u8  tm_busy;
+   struct _sas_device *sdev;
 };
 
 
@@ -376,8 +378,24 @@ struct _sas_device {
u8  phy;
u8  responding;
u8  pfa_led_on;
+   struct kref refcount;
 };
 
+static inline void sas_device_get(struct _sas_device *s)
+{
+   kref_get(&s->refcount);
+}
+
+static inline void sas_device_free(struct kref *r)
+{
+   kfree(container_of(r, struct _sas_device, refcount));
+}
+
+static inline void sas_device_put(struct _sas_device *s)
+{
+   kref_put(&s->refcount, sas_device_free);
+}
+
 /**
  * struct _raid_device - raid volume link list
  * @list: sas device list
@@ -1095,7 +1113,9 @@ struct _sas_node 
*mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
 u16 handle);
 struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct 
MPT2SAS_ADAPTER
 *ioc, u64 sas_address);
-struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
+struct _sas_device *mpt2sas_get_sdev_by_addr(
+struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
+struct _sas_device *__mpt2sas_get_sdev_by_addr(
 struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
 
 void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c 
b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..a2af9a5 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -526,8 +526,61 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
}
 }
 
+static struct _sas_device *
+__mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
+   struct MPT2SAS_TARGET *tgt_priv)
+{
+   struct _sas_device *ret;
+
+   assert_spin_locked(&ioc->sas_device_lock);
+
+   ret = tgt_priv->sdev;
+   if (ret)
+   sas_device_get(ret);
+
+   return ret;
+}
+
+static struct _sas_device *
+mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
+   struct MPT2SAS_TARGET *tgt_priv)
+{
+   struct _sas_device *ret;
+   unsigned long flags;
+
+   spin_lock_irqsave(&ioc->sas_device_lock, flags);
+   ret = __mpt2sas_get_sdev_from_target(ioc, tgt_priv);
+   spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+   return ret;
+}
+
+
+struct _sas_device *
+__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
+u64 sas_address)
+{
+   struct _sas_device *sas_device;
+
+   assert_spin_locked(&ioc->sas_device_lock);
+
+   list_for_each_entry(sas_device, &ioc->sas_device_list, list)
+   if (sas_device->sas_address == sas_address)
+   goto found_device;
+
+   list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
+   if (sas_device->sas_address == 

[PATCH v3 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

2015-07-31 Thread Calvin Owens
The fw_event_work struct is concurrently referenced at shutdown, so
add a refcount to protect it, and refactor the code to use it.

Additionally, refactor _scsih_fw_event_cleanup_queue() such that it
no longer iterates over the list without holding the lock, since
_firmware_event_work() concurrently deletes items from the list.
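
The cleanup pattern described above, detaching one element at a time under the lock instead of walking the list unlocked, can be sketched in user space like this. The queue type, the atomic_flag spinlock and all sk_* names are assumptions made for illustration; the driver itself uses a spinlock, list_head and delayed work.

```c
#include <stdatomic.h>
#include <stddef.h>

struct sk_event {
	struct sk_event *next;
	int id;
};

struct sk_queue {
	atomic_flag lock; /* toy spinlock standing in for fw_event_lock */
	struct sk_event *head;
};

static void sk_q_lock(struct sk_queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock,
						 memory_order_acquire))
		; /* spin until the lock is free */
}

static void sk_q_unlock(struct sk_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Append an event at the tail, under the lock. */
static void sk_enqueue(struct sk_queue *q, struct sk_event *ev)
{
	struct sk_event **pp;

	ev->next = NULL;
	sk_q_lock(q);
	for (pp = &q->head; *pp; pp = &(*pp)->next)
		;
	*pp = ev;
	sk_q_unlock(q);
}

/* Detach and return the first event, or NULL if the queue is empty.
 * The list is only ever touched while the lock is held. */
static struct sk_event *sk_dequeue_next(struct sk_queue *q)
{
	struct sk_event *ev;

	sk_q_lock(q);
	ev = q->head;
	if (ev) {
		q->head = ev->next;
		ev->next = NULL;
	}
	sk_q_unlock(q);
	return ev;
}

/* Drain the queue; per-event work happens outside the lock, on an
 * element no other thread can reach through the list any more. */
static int sk_cleanup_queue(struct sk_queue *q)
{
	struct sk_event *ev;
	int drained = 0;

	while ((ev = sk_dequeue_next(q)) != NULL)
		drained++; /* cancel the work and drop the reference here */
	return drained;
}
```

Because each iteration re-reads the head under the lock, a concurrent worker removing entries cannot leave the cleanup loop holding a stale next pointer.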

Cc: Christoph Hellwig 
Signed-off-by: Calvin Owens 
---

Changes in v3:
* Add a break condition to the REMOVE_UNRESPONDING_DEVICES fw_event,
  which can loop over a sleep forever (5m+ at least) at unloading. I
  don't think anything prevented this before, but taking the fw_event
  object off the list at the top of _firmware_event_work() seems to have
  made it more likely to happen.

Changes in v2:
* Squished patches 4-6 into one patch
* Remove the fw_event from fw_event_list at the start of
  _firmware_event_work()
* Explicitly separate fw_event_list removal from fw_event freeing

 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 112 ---
 1 file changed, 91 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c 
b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index a2af9a5..cdc647d 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -176,9 +176,37 @@ struct fw_event_work {
u8  VP_ID;
u8  ignore;
u16 event;
+   struct kref refcount;
char    event_data[0] __aligned(4);
 };
 
+static void fw_event_work_free(struct kref *r)
+{
+   kfree(container_of(r, struct fw_event_work, refcount));
+}
+
+static void fw_event_work_get(struct fw_event_work *fw_work)
+{
+   kref_get(&fw_work->refcount);
+}
+
+static void fw_event_work_put(struct fw_event_work *fw_work)
+{
+   kref_put(&fw_work->refcount, fw_event_work_free);
+}
+
+static struct fw_event_work *alloc_fw_event_work(int len)
+{
+   struct fw_event_work *fw_event;
+
+   fw_event = kzalloc(sizeof(*fw_event) + len, GFP_ATOMIC);
+   if (!fw_event)
+   return NULL;
+
+   kref_init(&fw_event->refcount);
+   return fw_event;
+}
+
 /* raid transport support */
 static struct raid_template *mpt2sas_raid_template;
 
@@ -2864,36 +2892,39 @@ _scsih_fw_event_add(struct MPT2SAS_ADAPTER *ioc, struct 
fw_event_work *fw_event)
return;
 
spin_lock_irqsave(&ioc->fw_event_lock, flags);
+   fw_event_work_get(fw_event);
list_add_tail(&fw_event->list, &ioc->fw_event_list);
INIT_DELAYED_WORK(&fw_event->delayed_work, _firmware_event_work);
+   fw_event_work_get(fw_event);
queue_delayed_work(ioc->firmware_event_thread,
&fw_event->delayed_work, 0);
spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
 }
 
 /**
- * _scsih_fw_event_free - delete fw_event
+ * _scsih_fw_event_del_from_list - delete fw_event from the list
  * @ioc: per adapter object
  * @fw_event: object describing the event
  * Context: This function will acquire ioc->fw_event_lock.
  *
- * This removes firmware event object from link list, frees associated memory.
+ * If the fw_event is on the fw_event_list, remove it and do a put.
  *
  * Return nothing.
  */
 static void
-_scsih_fw_event_free(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
+_scsih_fw_event_del_from_list(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
 *fw_event)
 {
unsigned long flags;
 
spin_lock_irqsave(&ioc->fw_event_lock, flags);
-   list_del(&fw_event->list);
-   kfree(fw_event);
+   if (!list_empty(&fw_event->list)) {
+   list_del_init(&fw_event->list);
+   fw_event_work_put(fw_event);
+   }
spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
 }
 
-
 /**
  * _scsih_error_recovery_delete_devices - remove devices not responding
  * @ioc: per adapter object
@@ -2908,13 +2939,14 @@ _scsih_error_recovery_delete_devices(struct 
MPT2SAS_ADAPTER *ioc)
if (ioc->is_driver_loading)
return;
 
-   fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+   fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;
 
fw_event->event = MPT2SAS_REMOVE_UNRESPONDING_DEVICES;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+   fw_event_work_put(fw_event);
 }
 
 /**
@@ -2928,12 +2960,29 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER 
*ioc)
 {
struct fw_event_work *fw_event;
 
-   fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+   fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;
fw_event->event = MPT2SAS_PORT_ENABLE_COMPLETE;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+   fw_event_work_put(fw_event);
+}
+
+static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
+{
+   unsigned long flags;
+   

[PATCH v3 0/2] Fixes for memory corruption in mpt2sas

2015-07-31 Thread Calvin Owens
Hello all,

This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.

Changes are noted in the individual patches; I realized putting them in the
cover was probably a bit confusing.

Thanks,
Calvin


Patches in this series:
 [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list
 [PATCH v3 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

Total diffstat:
 drivers/scsi/mpt2sas/mpt2sas_base.h  |  22 +-
 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 579 ++-
 drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
 3 files changed, 439 insertions(+), 174 deletions(-)

Diff showing changes v2 => v3:
http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v2v3.patch

Diff showing changes v1 => v2:
http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v1v2.patch


Re: [PATCHv2 1/1] Documentation: describe how to add a system call

2015-07-31 Thread H. Peter Anvin
On 07/31/2015 09:32 PM, Josh Triplett wrote:
> 
> Sure, agreed.  But I really hope we don't create new kernel ABIs that
> involve constructs like that.
> 

It's worth noting I have pushed for auto-marshalling in general for a
long time, not the least to get rid of the awful syscall(3) wrapper.  I
even built a prototype but didn't have time to productize it.

-hpa




Re: [PATCH] Staging:wilc1000 :Remove typedef from struct

2015-07-31 Thread Sudip Mukherjee
On Fri, Jul 31, 2015 at 01:52:13PM -0700, Greg Kroah-Hartman wrote:
> On Fri, Jul 31, 2015 at 11:08:47AM +0530, Shraddha Barke wrote:
> > This patch fixes the following checkpatch.pl warning:
> > 
> > WARNING: do not add new typedefs
> > 
> > Signed-off-by: Shraddha Barke 
> > ---

> > -typedef enum {
> > +enum {
> > CLASS1_FRAME_TYPE  = 0x00,
> > CLASS2_FRAME_TYPE  = 0x01,
> > CLASS3_FRAME_TYPE  = 0x02,
> 
> Did you test-build this change?
This enum is not used anywhere. So did not affect the build.

regards
sudip


Re: [PATCHv2 1/1] Documentation: describe how to add a system call

2015-07-31 Thread Josh Triplett
On Fri, Jul 31, 2015 at 03:54:56PM -0700, Andy Lutomirski wrote:
> On Fri, Jul 31, 2015 at 3:08 PM,   wrote:
> > On Fri, Jul 31, 2015 at 02:19:29PM -0700, Andy Lutomirski wrote:
> >> On Fri, Jul 31, 2015 at 1:59 PM,   wrote:
> >> > Agreed.  I think the proposal above would be a net improvement, but
> >> > ideally you'd want something that's annotated and generates automatic
> >> > marshalling code.
> >> >
> >>
> >> I assume this is idle musing.  If, however, we were to actually do
> >> this, I'd suggest we seriously consider speaking the Cap'n Proto
> >> serialization format.  It's quite nice, it encodes and decodes *very*
> >> quickly and, unlike TLV schemes, you don't have to read it in order,
> >> making the read-side code less awkward.
> >
> > That seems like *massive* overkill for a kernel<->userspace syscall
> > interface.  I was more thinking about having a few standardized marshal
> > types, and incrementally adding more when more patterns show up.  For a
> > first pass, just automatically running copy_from_user and
> > copy_param_struct on appropriate sets of __user parameters identified as
> > such in a structured text file seems quite sufficient.  (Plus
> > automatically generating syscalls.h from that.)
> 
> If a param struct does the trick, then I agree.  It's when you start
> having lists and other variable-size stuff that it gets messier.

Sure, agreed.  But I really hope we don't create new kernel ABIs that
involve constructs like that.
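
As a sketch of what a generated "copy the param struct" step might look like, here is a user-space model. The thread does not define copy_param_struct, so the semantics below (copy the common prefix, zero-fill the rest, reject unknown non-zero trailing bytes) are an assumption, sk_copy_param_struct is a hypothetical name, and memcpy() stands in for copy_from_user().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Copy a caller-supplied struct of usize bytes into a callee-side
 * struct of ksize bytes.
 *  - older caller (usize < ksize): missing fields read as zero
 *  - newer caller (usize > ksize): extra bytes must all be zero,
 *    otherwise the caller asked for something we do not understand */
static int sk_copy_param_struct(void *dst, size_t ksize,
				const void *src, size_t usize)
{
	size_t n = usize < ksize ? usize : ksize;
	const unsigned char *s = src;
	size_t i;

	for (i = n; i < usize; i++)
		if (s[i] != 0)
			return -E2BIG;

	memcpy(dst, src, n);
	memset((unsigned char *)dst + n, 0, ksize - n);
	return 0;
}
```

A fixed-size struct handled this way avoids the lists and variable-size payloads mentioned above, which is where marshalling gets messier.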

- Josh Triplett


Re: v4.2-rc dcache regression, probably 75a6f82a0d10

2015-07-31 Thread Hugh Dickins
On Fri, 31 Jul 2015, Hugh Dickins wrote:
> On Fri, 31 Jul 2015, Linus Torvalds wrote:
> > 
> > So leave it running a while longer, but maybe it's 4bf46a272647 like
> > Dominique suspects. Although I don't see how that could trigger
> > anything either..
> 
> I restarted with a slightly different version of the load this
> morning, which has sometimes shown the issue more easily - I thought
> it better to restart with a variant than persist with a run that
> might have settled into a protected pattern.  We'll see what that
> shows later on.

It showed nothing useful to this discussion: after an hour and a half
it had hung on some almost-certainly-unrelated issue that I've never
seen before - looked as if a jbd2 transaction never completed, some
tasks waiting for that, some waiting for f_pos_lock held by those
(which I hit when I tried to tail the output log).  Worry about
that another time, if it ever shows up again.

I think I'll try reinstating Al's commit, and hacking out that change
of David's in dentry_iput() that worried me (though the "especially
problematic" remark in his change description suggests that it is
an intentional and necessary change to suit unionmount).  See how
that goes.

Hugh


[PATCH v3 3/5] toshiba_acpi: Refactor *{get, set} functions return value

2015-07-31 Thread Azael Avalos
This patch refactors the return value of the driver *{get, set}
functions, since the driver default error value is -EIO.

All the functions now check for TOS_FAILURE, TOS_NOT_SUPPORTED and
TOS_SUCCESS.

On TOS_FAILURE a pr_err message is printed informing the user of the
error (no change was made to this, except the check was added to the
functions not checking for this).

On TOS_NOT_SUPPORTED we now return -ENODEV immediately (some
functions were returning -EIO and some others were not checking).

On TOS_SUCCESS* we now return 0 (as a side effect, a new success value
was added, since some functions return one instead of zero to
indicate success).

As a special case, the LED functions now check for *FAILURE on
*set, and check for TOS_FAILURE and TOS_SUCCESS on *get with their
"default" return value set to LED_OFF.

Also the {lcd, video}_proc* functions were adapted to reflect these
changes to their parent HCI functions.
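
For illustration only, the mapping described above could be collected in one place. This is a userspace sketch, not code from the patch: the helper name tos_result_to_errno is hypothetical, TOS_SUCCESS is assumed to be 0x0000 (its value is cut off in the diff), and accepting TOS_SUCCESS2 for every caller is a simplification.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Firmware return codes; TOS_SUCCESS is an assumed value. */
#define TOS_SUCCESS		0x0000u
#define TOS_SUCCESS2		0x0001u
#define TOS_FAILURE		0x1000u
#define TOS_NOT_SUPPORTED	0x8000u

/* Hypothetical helper: translate a firmware result into 0 or -errno. */
static int tos_result_to_errno(uint32_t result, const char *what)
{
	if (result == TOS_FAILURE)
		fprintf(stderr, "ACPI call to %s failed\n", what);
	else if (result == TOS_NOT_SUPPORTED)
		return -ENODEV;

	/* Anything that is not a success value falls back to -EIO. */
	return (result == TOS_SUCCESS || result == TOS_SUCCESS2) ? 0 : -EIO;
}
```

With this shape, an unexpected firmware code needs no dedicated branch, since it ends up as -EIO by default.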

Signed-off-by: Azael Avalos 
---
 drivers/platform/x86/toshiba_acpi.c | 177 +---
 1 file changed, 104 insertions(+), 73 deletions(-)

diff --git a/drivers/platform/x86/toshiba_acpi.c b/drivers/platform/x86/toshiba_acpi.c
index 66b596a..7b16d8d 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -93,6 +93,7 @@ MODULE_LICENSE("GPL");
 
 /* Return codes */
 #define TOS_SUCCESS0x
+#define TOS_SUCCESS2   0x0001
 #define TOS_OPEN_CLOSE_OK  0x0044
 #define TOS_FAILURE0x1000
 #define TOS_NOT_SUPPORTED  0x8000
@@ -469,7 +470,8 @@ static void toshiba_illumination_set(struct led_classdev 
*cdev,
 {
struct toshiba_acpi_dev *dev = container_of(cdev,
struct toshiba_acpi_dev, led_dev);
-   u32 state, result;
+   u32 result;
+   u32 state;
 
/* First request : initialize communication. */
if (!sci_open(dev))
@@ -503,7 +505,7 @@ static enum led_brightness toshiba_illumination_get(struct 
led_classdev *cdev)
if (result == TOS_FAILURE || result == TOS_INPUT_DATA_ERROR) {
pr_err("ACPI call for illumination failed\n");
return LED_OFF;
-   } else if (result == TOS_NOT_SUPPORTED) {
+   } else if (result != TOS_SUCCESS) {
return LED_OFF;
}
 
@@ -565,7 +567,7 @@ static int toshiba_kbd_illum_status_set(struct 
toshiba_acpi_dev *dev, u32 time)
return -ENODEV;
}
 
-   return 0;
+   return result == TOS_SUCCESS ? 0 : -EIO;
 }
 
 static int toshiba_kbd_illum_status_get(struct toshiba_acpi_dev *dev, u32 
*time)
@@ -584,21 +586,22 @@ static int toshiba_kbd_illum_status_get(struct 
toshiba_acpi_dev *dev, u32 *time)
return -ENODEV;
}
 
-   return 0;
+   return result == TOS_SUCCESS ? 0 : -EIO;
 }
 
 static enum led_brightness toshiba_kbd_backlight_get(struct led_classdev *cdev)
 {
struct toshiba_acpi_dev *dev = container_of(cdev,
struct toshiba_acpi_dev, kbd_led);
-   u32 state, result;
+   u32 result;
+   u32 state;
 
/* Check the keyboard backlight state */
result = hci_read(dev, HCI_KBD_ILLUMINATION, &state);
if (result == TOS_FAILURE || result == TOS_INPUT_DATA_ERROR) {
pr_err("ACPI call to get the keyboard backlight failed\n");
return LED_OFF;
-   } else if (result == TOS_NOT_SUPPORTED) {
+   } else if (result != TOS_SUCCESS) {
return LED_OFF;
}
 
@@ -610,7 +613,8 @@ static void toshiba_kbd_backlight_set(struct led_classdev 
*cdev,
 {
struct toshiba_acpi_dev *dev = container_of(cdev,
struct toshiba_acpi_dev, kbd_led);
-   u32 state, result;
+   u32 result;
+   u32 state;
 
/* Set the keyboard backlight state */
state = brightness ? 1 : 0;
@@ -640,7 +644,7 @@ static int toshiba_touchpad_set(struct toshiba_acpi_dev 
*dev, u32 state)
return -ENODEV;
}
 
-   return 0;
+   return result == TOS_SUCCESS ? 0 : -EIO;
 }
 
 static int toshiba_touchpad_get(struct toshiba_acpi_dev *dev, u32 *state)
@@ -659,7 +663,7 @@ static int toshiba_touchpad_get(struct toshiba_acpi_dev 
*dev, u32 *state)
return -ENODEV;
}
 
-   return 0;
+   return result == TOS_SUCCESS ? 0 : -EIO;
 }
 
 /* Eco Mode support */
@@ -709,6 +713,8 @@ toshiba_eco_mode_get_status(struct led_classdev *cdev)
if (ACPI_FAILURE(status) || out[0] == TOS_INPUT_DATA_ERROR) {
pr_err("ACPI call to get ECO led failed\n");
return LED_OFF;
+   } else if (out[0] != TOS_SUCCESS) {
+   return LED_OFF;
}
 
return out[2] ? LED_FULL : LED_OFF;
@@ -769,12 +775,15 @@ static int toshiba_accelerometer_get(struct 
toshiba_acpi_dev *dev,
if (ACPI_FAILURE(status) || out[0] == TOS_INPUT_DATA_ERROR) {

[PATCH v3 5/5] toshiba_acpi: Bump driver version to 0.23

2015-07-31 Thread Azael Avalos
Given that some features were added (the /dev/toshiba_acpi device), along
with some clean-ups and minor (cosmetic) changes all over the driver
code, bump the driver version to 0.23 to reflect these changes.

Signed-off-by: Azael Avalos 
---
 drivers/platform/x86/toshiba_acpi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/platform/x86/toshiba_acpi.c b/drivers/platform/x86/toshiba_acpi.c
index 4802fd7..1645bc4 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -31,7 +31,7 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
-#define TOSHIBA_ACPI_VERSION   "0.22"
+#define TOSHIBA_ACPI_VERSION   "0.23"
 #define PROC_INTERFACE_VERSION 1
 
 #include 
-- 
2.4.6



[PATCH v3 4/5] toshiba_acpi: Remove unnecessary checks and returns in HCI/SCI functions

2015-07-31 Thread Azael Avalos
A previous patch added explicit feature checks for support, *SUCCESS*
and *FAILURE to the HCI/SCI *{get, set} functions.

This patch removes some unneeded checks from the driver HCI/SCI
functions. Given that the default error return value is now set to
-EIO, there is no need to check for error values other than the ones
currently checked for.

Signed-off-by: Azael Avalos 
---
 drivers/platform/x86/toshiba_acpi.c | 169 ++--
 1 file changed, 44 insertions(+), 125 deletions(-)

diff --git a/drivers/platform/x86/toshiba_acpi.c b/drivers/platform/x86/toshiba_acpi.c
index 7b16d8d..4802fd7 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -459,8 +459,6 @@ static void toshiba_illumination_available(struct 
toshiba_acpi_dev *dev)
sci_close(dev);
if (ACPI_FAILURE(status))
pr_err("ACPI call to query Illumination support failed\n");
-   else if (out[0] == TOS_NOT_SUPPORTED)
-   return;
else if (out[0] == TOS_SUCCESS)
dev->illumination_supported = 1;
 }
@@ -481,12 +479,8 @@ static void toshiba_illumination_set(struct led_classdev 
*cdev,
state = brightness ? 1 : 0;
result = sci_write(dev, SCI_ILLUMINATION, state);
sci_close(dev);
-   if (result == TOS_FAILURE) {
+   if (result == TOS_FAILURE)
pr_err("ACPI call for illumination failed\n");
-   return;
-   } else if (result == TOS_NOT_SUPPORTED) {
-   return;
-   }
 }
 
 static enum led_brightness toshiba_illumination_get(struct led_classdev *cdev)
@@ -502,7 +496,7 @@ static enum led_brightness toshiba_illumination_get(struct 
led_classdev *cdev)
/* Check the illumination */
result = sci_read(dev, SCI_ILLUMINATION, &state);
sci_close(dev);
-   if (result == TOS_FAILURE || result == TOS_INPUT_DATA_ERROR) {
+   if (result == TOS_FAILURE) {
pr_err("ACPI call for illumination failed\n");
return LED_OFF;
} else if (result != TOS_SUCCESS) {
@@ -527,10 +521,8 @@ static void toshiba_kbd_illum_available(struct 
toshiba_acpi_dev *dev)
 
status = tci_raw(dev, in, out);
sci_close(dev);
-   if (ACPI_FAILURE(status) || out[0] == TOS_INPUT_DATA_ERROR) {
+   if (ACPI_FAILURE(status)) {
pr_err("ACPI call to query kbd illumination support failed\n");
-   } else if (out[0] == TOS_NOT_SUPPORTED) {
-   return;
} else if (out[0] == TOS_SUCCESS) {
/*
 * Check for keyboard backlight timeout max value,
@@ -560,12 +552,10 @@ static int toshiba_kbd_illum_status_set(struct 
toshiba_acpi_dev *dev, u32 time)
 
result = sci_write(dev, SCI_KBD_ILLUM_STATUS, time);
sci_close(dev);
-   if (result == TOS_FAILURE || result == TOS_INPUT_DATA_ERROR) {
+   if (result == TOS_FAILURE)
pr_err("ACPI call to set KBD backlight status failed\n");
-   return -EIO;
-   } else if (result == TOS_NOT_SUPPORTED) {
+   else if (result == TOS_NOT_SUPPORTED)
return -ENODEV;
-   }
 
return result == TOS_SUCCESS ? 0 : -EIO;
 }
@@ -579,12 +569,10 @@ static int toshiba_kbd_illum_status_get(struct 
toshiba_acpi_dev *dev, u32 *time)
 
result = sci_read(dev, SCI_KBD_ILLUM_STATUS, time);
sci_close(dev);
-   if (result == TOS_FAILURE || result == TOS_INPUT_DATA_ERROR) {
+   if (result == TOS_FAILURE)
pr_err("ACPI call to get KBD backlight status failed\n");
-   return -EIO;
-   } else if (result == TOS_NOT_SUPPORTED) {
+   else if (result == TOS_NOT_SUPPORTED)
return -ENODEV;
-   }
 
return result == TOS_SUCCESS ? 0 : -EIO;
 }
@@ -598,7 +586,7 @@ static enum led_brightness toshiba_kbd_backlight_get(struct 
led_classdev *cdev)
 
/* Check the keyboard backlight state */
result = hci_read(dev, HCI_KBD_ILLUMINATION, &state);
-   if (result == TOS_FAILURE || result == TOS_INPUT_DATA_ERROR) {
+   if (result == TOS_FAILURE) {
pr_err("ACPI call to get the keyboard backlight failed\n");
return LED_OFF;
} else if (result != TOS_SUCCESS) {
@@ -619,12 +607,8 @@ static void toshiba_kbd_backlight_set(struct led_classdev 
*cdev,
/* Set the keyboard backlight state */
state = brightness ? 1 : 0;
result = hci_write(dev, HCI_KBD_ILLUMINATION, state);
-   if (result == TOS_FAILURE || result == TOS_INPUT_DATA_ERROR) {
+   if (result == TOS_FAILURE)
pr_err("ACPI call to set KBD Illumination mode failed\n");
-   return;
-   } else if (result == TOS_NOT_SUPPORTED) {
-   return;
-   }
 }
 
 /* TouchPad support */
@@ -637,12 +621,10 @@ static int toshiba_touchpad_set(struct toshiba_acpi_dev 
*dev, u32 state)
 
result = sci_write(dev, SCI_TOUCHPAD, 

[PATCH v3 2/5] toshiba_acpi: Remove "*not supported" feature prints

2015-07-31 Thread Azael Avalos
Currently the driver prints "*not supported" if any of the features
queried are in fact not supported. Let us print the available
features instead.

This patch removes all instances of pr_info printing "*not supported",
and adds a new function called "print_supported_features", which will
print the available laptop features.
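
A rough userspace sketch of the idea, assuming nothing about the actual body of print_supported_features: the *_supported flag names are taken from the driver, while struct toy_dev, the helper names and the printed feature labels are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for the driver state; only a few flags are modelled. */
struct toy_dev {
	unsigned int illumination_supported:1;
	unsigned int kbd_illum_supported:1;
	unsigned int touchpad_supported:1;
	unsigned int eco_supported:1;
	unsigned int accelerometer_supported:1;
};

/* Append " name" to buf without writing past len bytes. */
static void toy_append(char *buf, size_t len, const char *name)
{
	size_t used = strlen(buf);

	if (used < len)
		snprintf(buf + used, len - used, " %s", name);
}

/* Build one line listing only the features that are available. */
static void toy_format_supported_features(const struct toy_dev *dev,
					  char *buf, size_t len)
{
	if (len == 0)
		return;

	snprintf(buf, len, "Supported laptop features:");
	if (dev->illumination_supported)
		toy_append(buf, len, "illumination");
	if (dev->kbd_illum_supported)
		toy_append(buf, len, "keyboard-backlight");
	if (dev->touchpad_supported)
		toy_append(buf, len, "touchpad");
	if (dev->eco_supported)
		toy_append(buf, len, "eco-led");
	if (dev->accelerometer_supported)
		toy_append(buf, len, "accelerometer");
}
```

Listing what is present keeps the log to a single line per device instead of one line per missing feature.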

Signed-off-by: Azael Avalos 
---
 drivers/platform/x86/toshiba_acpi.c | 72 +++--
 1 file changed, 46 insertions(+), 26 deletions(-)

diff --git a/drivers/platform/x86/toshiba_acpi.c b/drivers/platform/x86/toshiba_acpi.c
index d983dc4..66b596a 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -459,7 +459,7 @@ static void toshiba_illumination_available(struct 
toshiba_acpi_dev *dev)
if (ACPI_FAILURE(status))
pr_err("ACPI call to query Illumination support failed\n");
else if (out[0] == TOS_NOT_SUPPORTED)
-   pr_info("Illumination device not available\n");
+   return;
else if (out[0] == TOS_SUCCESS)
dev->illumination_supported = 1;
 }
@@ -483,7 +483,6 @@ static void toshiba_illumination_set(struct led_classdev 
*cdev,
pr_err("ACPI call for illumination failed\n");
return;
} else if (result == TOS_NOT_SUPPORTED) {
-   pr_info("Illumination not supported\n");
return;
}
 }
@@ -505,7 +504,6 @@ static enum led_brightness toshiba_illumination_get(struct 
led_classdev *cdev)
pr_err("ACPI call for illumination failed\n");
return LED_OFF;
} else if (result == TOS_NOT_SUPPORTED) {
-   pr_info("Illumination not supported\n");
return LED_OFF;
}
 
@@ -530,7 +528,7 @@ static void toshiba_kbd_illum_available(struct 
toshiba_acpi_dev *dev)
if (ACPI_FAILURE(status) || out[0] == TOS_INPUT_DATA_ERROR) {
pr_err("ACPI call to query kbd illumination support failed\n");
} else if (out[0] == TOS_NOT_SUPPORTED) {
-   pr_info("Keyboard illumination not available\n");
+   return;
} else if (out[0] == TOS_SUCCESS) {
/*
 * Check for keyboard backlight timeout max value,
@@ -564,7 +562,6 @@ static int toshiba_kbd_illum_status_set(struct 
toshiba_acpi_dev *dev, u32 time)
pr_err("ACPI call to set KBD backlight status failed\n");
return -EIO;
} else if (result == TOS_NOT_SUPPORTED) {
-   pr_info("Keyboard backlight status not supported\n");
return -ENODEV;
}
 
@@ -584,7 +581,6 @@ static int toshiba_kbd_illum_status_get(struct 
toshiba_acpi_dev *dev, u32 *time)
pr_err("ACPI call to get KBD backlight status failed\n");
return -EIO;
} else if (result == TOS_NOT_SUPPORTED) {
-   pr_info("Keyboard backlight status not supported\n");
return -ENODEV;
}
 
@@ -603,7 +599,6 @@ static enum led_brightness toshiba_kbd_backlight_get(struct 
led_classdev *cdev)
pr_err("ACPI call to get the keyboard backlight failed\n");
return LED_OFF;
} else if (result == TOS_NOT_SUPPORTED) {
-   pr_info("Keyboard backlight not supported\n");
return LED_OFF;
}
 
@@ -624,7 +619,6 @@ static void toshiba_kbd_backlight_set(struct led_classdev 
*cdev,
pr_err("ACPI call to set KBD Illumination mode failed\n");
return;
} else if (result == TOS_NOT_SUPPORTED) {
-   pr_info("Keyboard backlight not supported\n");
return;
}
 }
@@ -758,7 +752,7 @@ static void toshiba_accelerometer_available(struct 
toshiba_acpi_dev *dev)
   out[0] == TOS_NOT_INITIALIZED)
pr_err("Accelerometer not initialized\n");
else if (out[0] == TOS_NOT_SUPPORTED)
-   pr_info("Accelerometer not supported\n");
+   return;
else if (out[0] == TOS_SUCCESS)
dev->accelerometer_supported = 1;
 }
@@ -801,7 +795,6 @@ static void toshiba_usb_sleep_charge_available(struct 
toshiba_acpi_dev *dev)
sci_close(dev);
return;
} else if (out[0] == TOS_NOT_SUPPORTED) {
-   pr_info("USB Sleep and Charge not supported\n");
sci_close(dev);
return;
} else if (out[0] == TOS_SUCCESS) {
@@ -814,7 +807,7 @@ static void toshiba_usb_sleep_charge_available(struct 
toshiba_acpi_dev *dev)
if (ACPI_FAILURE(status)) {
pr_err("ACPI call to get USB Sleep and Charge mode failed\n");
} else if (out[0] == TOS_NOT_SUPPORTED) {
-   pr_info("USB Sleep and Charge not supported\n");
+   return;
} else if (out[0] == TOS_SUCCESS) {
dev->usbsc_bat_level = out[2];
/* Flag as 

[PATCH v3 1/5] toshiba_acpi: Change *available functions return type

2015-07-31 Thread Azael Avalos
This patch changes the *available functions return type from int to
void.

The checks for support of their respective features are done inside
such functions and there was no need to return anything as we can
flag the queried feature as supported inside these functions.

The code was adapted accordingly to these changes and two new
variables were created and another was changed from uint to bool.

Also, the function toshiba_accelerometer_supported was renamed to
toshiba_accelerometer_available to keep the naming consistent across
the driver.
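
The pattern can be sketched outside the kernel as follows. This is a simplified model, not the driver code: struct toy_dev and toy_illumination_available are hypothetical, the call_ok parameter stands in for the ACPI call succeeding, and TOS_SUCCESS is assumed to be 0x0000.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOS_SUCCESS		0x0000u	/* assumed value */
#define TOS_NOT_SUPPORTED	0x8000u

struct toy_dev {
	unsigned int illumination_supported:1;
	bool illumination_led_registered;
};

/*
 * Returns nothing: the result of the query is recorded in the device
 * structure itself, so callers test dev->illumination_supported.
 */
static void toy_illumination_available(struct toy_dev *dev, bool call_ok,
					uint32_t out0)
{
	/* Start from a known state so every early exit means "absent". */
	dev->illumination_supported = 0;
	dev->illumination_led_registered = false;

	if (!call_ok)
		return;

	if (out0 == TOS_SUCCESS)
		dev->illumination_supported = 1;
}
```

Resetting the flags first is what makes the void return safe, because a failed query can never leave a stale "supported" value behind.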

Signed-off-by: Azael Avalos 
---
 drivers/platform/x86/toshiba_acpi.c | 129 +---
 1 file changed, 62 insertions(+), 67 deletions(-)

diff --git a/drivers/platform/x86/toshiba_acpi.c b/drivers/platform/x86/toshiba_acpi.c
index f722898..d983dc4 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -187,7 +187,6 @@ struct toshiba_acpi_dev {
unsigned int info_supported:1;
unsigned int tr_backlight_supported:1;
unsigned int kbd_illum_supported:1;
-   unsigned int kbd_led_registered:1;
unsigned int touchpad_supported:1;
unsigned int eco_supported:1;
unsigned int accelerometer_supported:1;
@@ -198,6 +197,10 @@ struct toshiba_acpi_dev {
unsigned int panel_power_on_supported:1;
unsigned int usb_three_supported:1;
unsigned int sysfs_created:1;
+
+   bool kbd_led_registered;
+   bool illumination_led_registered;
+   bool eco_led_registered;
 };
 
 static struct toshiba_acpi_dev *toshiba_acpi;
@@ -439,26 +442,26 @@ static u32 sci_write(struct toshiba_acpi_dev *dev, u32 
reg, u32 in1)
 }
 
 /* Illumination support */
-static int toshiba_illumination_available(struct toshiba_acpi_dev *dev)
+static void toshiba_illumination_available(struct toshiba_acpi_dev *dev)
 {
u32 in[TCI_WORDS] = { SCI_GET, SCI_ILLUMINATION, 0, 0, 0, 0 };
u32 out[TCI_WORDS];
acpi_status status;
 
+   dev->illumination_supported = 0;
+   dev->illumination_led_registered = false;
+
if (!sci_open(dev))
-   return 0;
+   return;
 
status = tci_raw(dev, in, out);
sci_close(dev);
-   if (ACPI_FAILURE(status)) {
+   if (ACPI_FAILURE(status))
pr_err("ACPI call to query Illumination support failed\n");
-   return 0;
-   } else if (out[0] == TOS_NOT_SUPPORTED) {
+   else if (out[0] == TOS_NOT_SUPPORTED)
pr_info("Illumination device not available\n");
-   return 0;
-   }
-
-   return 1;
+   else if (out[0] == TOS_SUCCESS)
+   dev->illumination_supported = 1;
 }
 
 static void toshiba_illumination_set(struct led_classdev *cdev,
@@ -510,41 +513,42 @@ static enum led_brightness 
toshiba_illumination_get(struct led_classdev *cdev)
 }
 
 /* KBD Illumination */
-static int toshiba_kbd_illum_available(struct toshiba_acpi_dev *dev)
+static void toshiba_kbd_illum_available(struct toshiba_acpi_dev *dev)
 {
u32 in[TCI_WORDS] = { SCI_GET, SCI_KBD_ILLUM_STATUS, 0, 0, 0, 0 };
u32 out[TCI_WORDS];
acpi_status status;
 
+   dev->kbd_illum_supported = 0;
+   dev->kbd_led_registered = false;
+
if (!sci_open(dev))
-   return 0;
+   return;
 
status = tci_raw(dev, in, out);
sci_close(dev);
if (ACPI_FAILURE(status) || out[0] == TOS_INPUT_DATA_ERROR) {
pr_err("ACPI call to query kbd illumination support failed\n");
-   return 0;
} else if (out[0] == TOS_NOT_SUPPORTED) {
pr_info("Keyboard illumination not available\n");
-   return 0;
+   } else if (out[0] == TOS_SUCCESS) {
+   /*
+* Check for keyboard backlight timeout max value,
+* previous kbd backlight implementation set this to
+* 0x3c0003, and now the new implementation set this
+* to 0x3c001a, use this to distinguish between them.
+*/
+   if (out[3] == SCI_KBD_TIME_MAX)
+   dev->kbd_type = 2;
+   else
+   dev->kbd_type = 1;
+   /* Get the current keyboard backlight mode */
+   dev->kbd_mode = out[2] & SCI_KBD_MODE_MASK;
+   /* Get the current time (1-60 seconds) */
+   dev->kbd_time = out[2] >> HCI_MISC_SHIFT;
+   /* Flag as supported */
+   dev->kbd_illum_supported = 1;
}
-
-   /*
-* Check for keyboard backlight timeout max value,
-* previous kbd backlight implementation set this to
-* 0x3c0003, and now the new implementation set this
-* to 0x3c001a, use this to distinguish between them.
-*/
-   if (out[3] == SCI_KBD_TIME_MAX)
-   dev->kbd_type = 2;
-   else
-   dev->kbd_type = 1;
-   /* Get 

[PATCH v3 0/5] toshiba_acpi: Refactor *{get, set} and *available functions

2015-07-31 Thread Azael Avalos
These patches change the default return type of the *{get, set} and
*available functions, change the printed messages of the supported
features, remove some unneeded checks and bump the driver version to 0.23.

Changes since v2:
- Introduced a new patch to remove unnecessary checks instead of
  cramming them in the patch series
- Updated patch descriptions
- Reverted some changes on the previous version to avoid compiler
  warnings

Changes since v1:
- Fixed typos in patch 01 description and use newly created
  *led_registered variables to unregister leds
- Adapt *{lcd, video}_proc functions in patch 03 to ensure we are not
  breaking userspace and updated patch description

Azael Avalos (5):
  toshiba_acpi: Change *available functions return type
  toshiba_acpi: Remove "*not supported" feature prints
  toshiba_acpi: Refactor *{get, set} functions return value
  toshiba_acpi: Remove unnecessary checks and returns in HCI/SCI
functions
  toshiba_acpi: Bump driver version to 0.23

 drivers/platform/x86/toshiba_acpi.c | 531 +---
 1 file changed, 248 insertions(+), 283 deletions(-)

-- 
2.4.6



Re: [PATCH v3 01/11] user_ns: 3 new LSM hooks for user namespace operations

2015-07-31 Thread Serge E. Hallyn
On Fri, Jul 31, 2015 at 11:28:56AM +0200, Lukasz Pawelczyk wrote:
> On czw, 2015-07-30 at 16:30 -0500, Serge E. Hallyn wrote:
> > On Fri, Jul 24, 2015 at 12:04:35PM +0200, Lukasz Pawelczyk wrote:
> > > @@ -969,6 +982,7 @@ static int userns_install(struct nsproxy 
> > > *nsproxy, struct ns_common *ns)
> > >  {
> > >   struct user_namespace *user_ns = to_user_ns(ns);
> > >   struct cred *cred;
> > > + int err;
> > >  
> > >   /* Don't allow gaining capabilities by reentering
> > >* the same user namespace.
> > > @@ -986,6 +1000,10 @@ static int userns_install(struct nsproxy 
> > > *nsproxy, struct ns_common *ns)
> > >   if (!ns_capable(user_ns, CAP_SYS_ADMIN))
> > >   return -EPERM;
> > >  
> > > + err = security_userns_setns(nsproxy, user_ns);
> > > + if (err)
> > > + return err;
> > 
> > So at this point the LSM thinks current is in the new ns.  If
> > prepare_creds() fails below, should it be informed of that?
> > (Or am I over-thinking this?)
> > 
> > > +
> > >   cred = prepare_creds();
> > >   if (!cred)
> > >   return -ENOMEM;
> 
> Hmm, the use case for this hook I had in mind was just to allow or
> disallow the operation based on the information passed in arguments.
> Not to register the current in any way so LSM can think it is or isn't
> in the new namespace.
> 
> I think that any other LSM check that would like to know in what
> namespace the current is, would just check that from current's creds.
> Not use some stale and duplicated information the above hook could have
> registered.
> 
> I see no reason for this hook to change the LSM state, only to answer
> the question: allowed/disallowed (eventually return an error cause it
> is unable to give an answer which falls into the disallow category).

How about renaming it "security_userns_may_setns()" for clarity?
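
To make the "may" semantics concrete, here is a toy userspace model, not kernel code: struct toy_userns, its "locked" policy field and both function names are invented for the example. The check takes const pointers and records nothing, so a later failure (standing in for prepare_creds() returning NULL) leaves nothing to undo.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical namespace object with a made-up policy bit. */
struct toy_userns {
	bool locked;
};

/* Pure permission query: allowed (0) or a negative errno. */
static int toy_userns_may_setns(const struct toy_userns *cur,
				const struct toy_userns *target)
{
	if (!cur || !target)
		return -EINVAL;
	if (target->locked)
		return -EPERM;
	return 0;
}

/* Caller: the check comes first, and a later failure needs no rollback. */
static int toy_userns_install(const struct toy_userns *cur,
			      const struct toy_userns *target, bool alloc_ok)
{
	int err = toy_userns_may_setns(cur, target);

	if (err)
		return err;
	if (!alloc_ok)	/* stands in for prepare_creds() failing */
		return -ENOMEM;
	return 0;
}
```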


Re: [PATCH v3 2/5] dt-bindings: Add a binding for Mediatek xHCI host controller

2015-07-31 Thread chunfeng yun
hi,
On Fri, 2015-07-31 at 14:37 +0100, Mark Rutland wrote:
> Hi,
> 
> > > > + - mediatek,usb-wakeup: to access usb wakeup control register
> > > 
> > > What exactly does this property imply?
> > > 
> > There are some control registers for usb wakeup which are put in another
> > module, here to get the node of that module, and then use regmap and
> > syscon to operate it.
> 
> Ok. You need to specify the type of this property (i.e. that it is a
> phandle to a syscon node). The description makes it sound like a boolean.
> 
Is it ok to add a prefix of syscon, and name it syscon-usb-wakeup?

> > 
> > > > + - mediatek,wakeup-src: 1: ip sleep wakeup mode; 2: line state wakeup
> > > > +   mode; others means don't enable wakeup source of usb
> > > 
> > > This sounds like configuration rather than a hardware property. Why do
> > > you think this needs to be in the DT?
> > > 
> > Yes, it's better to put it in the DT. 
> 
> That doesn't answer my question.
> 
> _why_ do you think this needs to be in the DT? What do you think is
> better for it being there?
> 
It is unthoughtful to put it here;
There are different configurations across platforms: a tablet only
needs line-state wakeup (the system can't enter suspend while a usb
cable is plugged in, so it doesn't need ip-sleep wakeup to remotely
wake up the system), while a box only needs ip-sleep wakeup mode. So it
is better to put it in each board's dts.
 
> > 
> > > > + - mediatek,u2port-num: the number should not greater than the number
> > > > +   of phys
> > > 
> > > What exactly does this property imply?
> > > 
> > On some platform, it only makes use of partial usb ports, so disable
> > others to save power.
> 
> What exactly do you mean by "partial USB ports"?
> 
> If a phy isn't wired up, it won't be listed in the phys property, if it
> is then disabling it sounds like a run-time decision.
> 
Yes, you are right.
This confused me a little before. It was originally a property of the
old phy driver and was ported here, so I have not removed it yet.
After I rewrite the phy driver, I will remove it.

Thanks a lot.

> Thanks,
> Mark.




[PATCH] pinctrl: UniPhier: PH1-Pro5: add I2C ch6 pin-mux setting

2015-07-31 Thread Masahiro Yamada
The initial version of this driver failed to add the I2C ch6 pin-mux setting.

Signed-off-by: Masahiro Yamada 
---

 drivers/pinctrl/uniphier/pinctrl-ph1-pro5.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/pinctrl/uniphier/pinctrl-ph1-pro5.c b/drivers/pinctrl/uniphier/pinctrl-ph1-pro5.c
index b35cf4a..9af4559 100644
--- a/drivers/pinctrl/uniphier/pinctrl-ph1-pro5.c
+++ b/drivers/pinctrl/uniphier/pinctrl-ph1-pro5.c
@@ -810,6 +810,8 @@ static const unsigned i2c5b_pins[] = {196, 197};
 static const unsigned i2c5b_muxvals[] = {2, 2};
 static const unsigned i2c5c_pins[] = {215, 216};
 static const unsigned i2c5c_muxvals[] = {2, 2};
+static const unsigned i2c6_pins[] = {101, 102};
+static const unsigned i2c6_muxvals[] = {2, 2};
 static const unsigned nand_pins[] = {19, 20, 21, 22, 23, 24, 25, 28, 29, 30,
 31, 32, 33, 34, 35};
 static const unsigned nand_muxvals[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
@@ -927,6 +929,7 @@ static const struct uniphier_pinctrl_group 
ph1_pro5_groups[] = {
UNIPHIER_PINCTRL_GROUP(i2c5),
UNIPHIER_PINCTRL_GROUP(i2c5b),
UNIPHIER_PINCTRL_GROUP(i2c5c),
+   UNIPHIER_PINCTRL_GROUP(i2c6),
UNIPHIER_PINCTRL_GROUP(uart0),
UNIPHIER_PINCTRL_GROUP(uart0b),
UNIPHIER_PINCTRL_GROUP(uart1),
@@ -1204,6 +1207,7 @@ static const char * const i2c1_groups[] = {"i2c1"};
 static const char * const i2c2_groups[] = {"i2c2"};
 static const char * const i2c3_groups[] = {"i2c3"};
 static const char * const i2c5_groups[] = {"i2c5", "i2c5b", "i2c5c"};
+static const char * const i2c6_groups[] = {"i2c6"};
 static const char * const nand_groups[] = {"nand", "nand_cs1"};
 static const char * const uart0_groups[] = {"uart0", "uart0b"};
 static const char * const uart1_groups[] = {"uart1"};
@@ -1290,6 +1294,7 @@ static const struct uniphier_pinmux_function 
ph1_pro5_functions[] = {
UNIPHIER_PINMUX_FUNCTION(i2c2),
UNIPHIER_PINMUX_FUNCTION(i2c3),
UNIPHIER_PINMUX_FUNCTION(i2c5),
+   UNIPHIER_PINMUX_FUNCTION(i2c6),
UNIPHIER_PINMUX_FUNCTION(nand),
UNIPHIER_PINMUX_FUNCTION(uart0),
UNIPHIER_PINMUX_FUNCTION(uart1),
-- 
1.9.1



Re: [PATCH] powerpc/85xx: add sleep and deep sleep support

2015-07-31 Thread Scott Wood
On Fri, 2015-07-24 at 20:46 +0800, Chenhui Zhao wrote:
> +static void mpc85xx_pmc_set_wake(struct device *dev, void *enable)
>  {
>   int ret;
> + u32 value[2];
> +
> + if (!device_may_wakeup(dev))
> + return;
> +
> + if (!pmc_regs) {
> + dev_err(dev, "%s: PMC is unavailable\n", __func__);
> + return;
> + }
> +
> + ret = of_property_read_u32_array(dev->of_node, "sleep", value, 2);

This will crash on any device without an of_node.
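
A guard along these lines would avoid touching a missing node. This is only a userspace model of the check: struct toy_device, struct toy_node and toy_read_sleep are stand-ins invented here, not the real struct device or OF API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy device-tree node carrying an optional two-cell "sleep" property. */
struct toy_node {
	bool has_sleep;
	uint32_t sleep[2];
};

/* Toy device; of_node is NULL for devices not described in the tree. */
struct toy_device {
	const struct toy_node *of_node;
};

static int toy_read_sleep(const struct toy_device *dev, uint32_t value[2])
{
	/* Bail out before dereferencing a node that is not there. */
	if (!dev->of_node)
		return -ENODEV;
	if (!dev->of_node->has_sleep)
		return -EINVAL;

	value[0] = dev->of_node->sleep[0];
	value[1] = dev->of_node->sleep[1];
	return 0;
}
```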

> + if (ret) {
> + dev_dbg(dev, "%s: Can not find the \"sleep\" property.\n",
> + __func__);
> + return;
> + }
> +
> + if (*(int *)enable)
> + pmc_pmcdr_mask &= ~value[1];
> + else
> + pmc_pmcdr_mask |= value[1];
> +
> + if ((value[1] & 0xe0) && (pmc_flag & PMC_LOSSLESS))
> + pmc_powmgtcsr = POWMGTCSR_LOSSLESS;
> +}

What is 0xe0?
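
One way to make such a literal self-describing is to give it a name. A sketch only: 0xe0 covers bits 7, 6 and 5, but the thread does not say what those bits mean, so TOY_PMCDR_HIGH_BITS is a placeholder name and the TOY_PMC_LOSSLESS value is made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_BIT(n)		(1u << (n))
/* Placeholder name for the bare 0xe0 in the patch: bits 7, 6 and 5. */
#define TOY_PMCDR_HIGH_BITS	(TOY_BIT(7) | TOY_BIT(6) | TOY_BIT(5))
/* Assumed flag value, for illustration only. */
#define TOY_PMC_LOSSLESS	TOY_BIT(0)

/* Same test as in the patch, with the mask spelled out by name. */
static bool toy_wants_lossless(uint32_t sleep_mask, uint32_t pmc_flag)
{
	return (sleep_mask & TOY_PMCDR_HIGH_BITS) &&
	       (pmc_flag & TOY_PMC_LOSSLESS);
}
```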

-Scott



Re: [PATCH 4/4] powerpc: pm: support deep sleep feature on T104x

2015-07-31 Thread Scott Wood
On Fri, 2015-07-31 at 20:53 +0800, Chenhui Zhao wrote:
> diff --git a/arch/powerpc/kernel/fsl_booke_entry_mapping.S b/arch/powerpc/kernel/fsl_booke_entry_mapping.S
> index f22e7e4..32ec426f 100644
> --- a/arch/powerpc/kernel/fsl_booke_entry_mapping.S
> +++ b/arch/powerpc/kernel/fsl_booke_entry_mapping.S
> @@ -170,6 +170,10 @@ skpinv:  addi    r6,r6,1 /* Increment */
>   lis r6,MAS2_VAL(PAGE_OFFSET, BOOK3E_PAGESZ_64M, M_IF_SMP)@h
>   ori r6,r6,MAS2_VAL(PAGE_OFFSET, BOOK3E_PAGESZ_64M, M_IF_SMP)@l
>   mtspr   SPRN_MAS2,r6
> +#ifdef ENTRY_DEEPSLEEP_SETUP
> + LOAD_REG_IMMEDIATE(r8, MEMORY_START)
> + ori r8,r8,(MAS3_SX|MAS3_SW|MAS3_SR)
> +#endif
>   mtspr   SPRN_MAS3,r8
>   tlbwe
>  
> @@ -212,12 +216,18 @@ next_tlb_setup:
>   #error You need to specify the mapping or not use this at all.
>  #endif
>  
> +#ifdef ENTRY_DEEPSLEEP_SETUP
> + LOAD_REG_ADDR(r6, 2f)
> + mfmsr   r7
> + rlwinm  r7,r7,0,~(MSR_IS|MSR_DS)
> +#else
>   lis r7,MSR_KERNEL@h
>   ori r7,r7,MSR_KERNEL@l
>   bl  1f  /* Find our address */
> 1:   mflr    r9
>   rlwimi  r6,r9,0,20,31
>   addi    r6,r6,(2f - 1b)
> +#endif

Could you explain what's going on here?  What does the TLB look like before 
and after?

> +int fsl_dp_iomap(void)

I don't think this needs to be global (see the comment where it gets called), 
but if it must be, this name is too terse.

> +{
> + struct device_node *np;
> + int ret = 0;
> + phys_addr_t ccsr_phy_addr, dcsr_phy_addr;
> +
> + saved_law = NULL;
> + ccsr_base = NULL;
> + dcsr_base = NULL;
> + pld_base = NULL;
> +
> + ccsr_phy_addr = get_immrbase();
> + if (ccsr_phy_addr == -1) {
> + pr_err("%s: Can't get the address of CCSR\n", __func__);
> + ret = -EINVAL;
> + goto ccsr_err;
> + }
> + ccsr_base = ioremap(ccsr_phy_addr, SIZE_2MB);
> + if (!ccsr_base) {
> + ret = -ENOMEM;
> + goto ccsr_err;
> + }
> +
> + dcsr_phy_addr = get_dcsrbase();
> + if (dcsr_phy_addr == -1) {
> + pr_err("%s: Can't get the address of DCSR\n", __func__);
> + ret = -EINVAL;
> + goto dcsr_err;
> + }
> + dcsr_base = ioremap(dcsr_phy_addr, SIZE_1MB);
> + if (!dcsr_base) {
> + ret = -ENOMEM;
> + goto dcsr_err;
> + }

Please just map the device tree nodes you need.

> +
> + np = of_find_compatible_node(NULL, NULL, "fsl,tetra-fpga");
> + if (np) {
> + pld_flag = T1040QDS_TETRA_FLAG;
> + } else {
> + np = of_find_compatible_node(NULL, NULL, "fsl,deepsleep-cpld");

I've already rejected fsl,deepsleep-cpld multiple times when others tried to 
add it to a device tree.

> +{
> + u32 ddr_buff_addr;
> +
> + /*
> +  * DDR training initialization will break 128 bytes at the beginning
> +  * of DDR, therefore, save them so that the bootloader will restore
> +  * them. Assume that DDR is mapped to the address space started with
> +  * CONFIG_PAGE_OFFSET.
> +  */
> + memcpy(ddr_buff, (void *)CONFIG_PAGE_OFFSET, DDR_BUF_SIZE);

That assumption may not be true in all relocatable scenarios.

It'd be a lot simpler to just mark that first page as reserved.

> + /* assume ddr_buff is in the physical address space of 4GB */
> + ddr_buff_addr = (u32)(__pa(ddr_buff) & 0x);
> +
> + /*
> +  * the bootloader will restore the first 128 bytes of DDR from
> +  * the location indicated by the register SPARECR3
> +  */
> + out_be32(ccsr_base + CCSR_SCFG_SPARECR3, ddr_buff_addr);

...yeah, please just mark it reserved.

> +}
> +
> +static void fsl_dp_mp_save(void *ccsr)
> +{
> +  struct fsl_bstr *dst = _bstr;
> +
> +  dst->bstrh = in_be32(ccsr + LCC_BSTRH);
> +  dst->bstrl = in_be32(ccsr + LCC_BSTRL);
> +  dst->bstar = in_be32(ccsr + LCC_BSTAR);
> +  dst->cpu_mask = in_be32(ccsr + DCFG_BASE + DCFG_BRR);
> +}

What is "mp"?

> +static void fsl_dp_law_save(void *ccsr)
> +{
> + int i;
> + struct fsl_law *dst = saved_law;
> + struct fsl_law *src = (void *)(ccsr + CCSR_LAW_BASE);
> +
> + for (i = 0; i < num_laws; i++) {
> + dst->lawbarh = in_be32(&src->lawbarh);
> + dst->lawbarl = in_be32(&src->lawbarl);
> + dst->lawar = in_be32(&src->lawar);
> + dst++;
> + src++;
> + }
> +}

Why wouldn't U-Boot restore these the same way on resume as they are now?

> +int fsl_enter_epu_deepsleep(void)
> +{
> + fsl_dp_ddr_save(ccsr_base);
> +
> + fsl_dp_set_resume_pointer(ccsr_base);
> +
> + fsl_dp_mp_save(ccsr_base);
> + fsl_dp_law_save(ccsr_base);
> + /*  enable Warm Device Reset request. */
> + setbits32(ccsr_base + CCSR_SCFG_DPSLPCR, CCSR_SCFG_DPSLPCR_WDRR_EN);
> +
> + /* set GPIO1_29 as an output pin (not open-drain), and output 0 */
> + 

Re: [PATCH v3 3/5] usb: phy: add usb3.0 phy driver for mt65xx SoCs

2015-07-31 Thread chunfeng yun
On Fri, 2015-07-31 at 19:48 +0530, Kishon Vijay Abraham I wrote:
> Hi,
> 
> On Friday 31 July 2015 05:55 PM, chunfeng yun wrote:
> > hi,
> > On Tue, 2015-07-28 at 11:17 +0530, Kishon Vijay Abraham I wrote:
> >> Hi,
> >>
> >> On Sunday 26 July 2015 08:21 AM, chunfeng yun wrote:
> >>> hi,
> >>> On Wed, 2015-07-22 at 09:21 -0500, Felipe Balbi wrote:
>  Hi,
> 
>  On Wed, Jul 22, 2015 at 10:05:43PM +0800, Chunfeng Yun wrote:
> > support usb3.0 phy of mt65xx SoCs
> >
> > Signed-off-by: Chunfeng Yun 
> 
>  you missed Kishon here.
> 
> >>> Thank you.
> > ---
> >  drivers/phy/Kconfig   |   9 +
> >  drivers/phy/Makefile  |   1 +
> >  drivers/phy/phy-mt65xx-usb3.c | 426 
> > ++
> >  3 files changed, 436 insertions(+)
> >  create mode 100644 drivers/phy/phy-mt65xx-usb3.c
> >
> > diff --git a/drivers/phy/Kconfig b/drivers/phy/Kconfig
> > index c0e6ede..019cf8b 100644
> > --- a/drivers/phy/Kconfig
> > +++ b/drivers/phy/Kconfig
> > @@ -193,6 +193,15 @@ config PHY_HIX5HD2_SATA
> > help
> >   Support for SATA PHY on Hisilicon hix5hd2 Soc.
> >  
> > +config PHY_MT65XX_USB3
> > +   tristate "Mediatek USB3.0 PHY Driver"
> > +   depends on ARCH_MEDIATEK && OF
> > +   select GENERIC_PHY
> > +   help
> > + Say 'Y' here to add support for Mediatek USB3.0 PHY driver
> > + for mt65xx SoCs. it supports two usb2.0 ports and
> > + one usb3.0 port.
> > +
> >  config PHY_SUN4I_USB
> > tristate "Allwinner sunxi SoC USB PHY driver"
> > depends on ARCH_SUNXI && HAS_IOMEM && OF
> > diff --git a/drivers/phy/Makefile b/drivers/phy/Makefile
> > index f344e1b..3ceff2a 100644
> > --- a/drivers/phy/Makefile
> > +++ b/drivers/phy/Makefile
> > @@ -22,6 +22,7 @@ obj-$(CONFIG_TI_PIPE3)+= 
> > phy-ti-pipe3.o
> >  obj-$(CONFIG_TWL4030_USB)  += phy-twl4030-usb.o
> >  obj-$(CONFIG_PHY_EXYNOS5250_SATA)  += phy-exynos5250-sata.o
> >  obj-$(CONFIG_PHY_HIX5HD2_SATA) += phy-hix5hd2-sata.o
> > +obj-$(CONFIG_PHY_MT65XX_USB3)  += phy-mt65xx-usb3.o
> >  obj-$(CONFIG_PHY_SUN4I_USB)+= phy-sun4i-usb.o
> >  obj-$(CONFIG_PHY_SUN9I_USB)+= phy-sun9i-usb.o
> >  obj-$(CONFIG_PHY_SAMSUNG_USB2) += phy-exynos-usb2.o
> > diff --git a/drivers/phy/phy-mt65xx-usb3.c 
> > b/drivers/phy/phy-mt65xx-usb3.c
> > new file mode 100644
> > index 000..5da4534
> > --- /dev/null
> > +++ b/drivers/phy/phy-mt65xx-usb3.c
> > @@ -0,0 +1,426 @@
> > +/*
> > + * Copyright (c) 2015 MediaTek Inc.
> > + * Author: Chunfeng.Yun 
> > + *
> > + * This software is licensed under the terms of the GNU General Public
> > + * License version 2, as published by the Free Software Foundation, and
> > + * may be copied, distributed, and modified under those terms.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> >>
> >> A lot of these #includes are not required. Add only those that are required
> >> by this driver.
> > The dummy header files will be removed later
> > 
> > +#include 
> > +
> > +/*
> > + * for sifslv2 register
> > + * relative to USB3_SIF2_BASE base address
> > + */
> > +#define SSUSB_SIFSLV_SPLLC (0x)
> > +#define SSUSB_SIFSLV_U2PHY_COM_BASE(0x0800)
> >>
> >> Looks like all this base address can come from dt.
> > The phy supports multi-ports, and these are sub-segment registers for
> > port0, and other ports can be calculated from the bases. So I think it's
> > better to use the same base address in dts
> 
> Nope. Except for the register offsets everything else can come from dt.
> > 
> > +#define SSUSB_SIFSLV_U3PHYD_BASE   (0x0900)
> > +#define SSUSB_USB30_PHYA_SIV_B_BASE(0x0b00)
> > +#define SSUSB_SIFSLV_U3PHYA_DA_BASE(0x0c00)
> > +
> > +/*port1 refs. +0x800(refer to port0)*/
> > +#define U3P_PORT_INTERVAL (0x800)  /*based on port0 */
> > +#define U3P_PHY_DELTA(index) ((U3P_PORT_INTERVAL) * (index))
> > +
> > +#define U3P_USBPHYACR0 (SSUSB_SIFSLV_U2PHY_COM_BASE + 0x)
> > +#define PA0_RG_U2PLL_FORCE_ON  (0x1 << 15)
> > +
> > +#define U3P_USBPHYACR2 (SSUSB_SIFSLV_U2PHY_COM_BASE + 0x0008)
> > +#define 

[PATCH -next] ia64: Define ioremap_uc and ioremap_wc

2015-07-31 Thread Guenter Roeck
Commit 3cc2dac5be3f ("drivers/video/fbdev/atyfb: Replace MTRR UC hole
with strong UC") introduces calls to ioremap_wc and ioremap_uc. This
causes build failures with ia64:allmodconfig. Map the missing
functions to ioremap_nocache.

Fixes: 3cc2dac5be3f ("drivers/video/fbdev/atyfb:
Replace MTRR UC hole with strong UC")
Cc: Paul Gortmaker 
Cc: Luis R. Rodriguez 
Signed-off-by: Guenter Roeck 
---
 arch/ia64/include/asm/io.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/ia64/include/asm/io.h b/arch/ia64/include/asm/io.h
index 80a7e34be009..36b65939b6b4 100644
--- a/arch/ia64/include/asm/io.h
+++ b/arch/ia64/include/asm/io.h
@@ -426,6 +426,8 @@ __writeq (unsigned long val, volatile void __iomem *addr)
 
 extern void __iomem * ioremap(unsigned long offset, unsigned long size);
 extern void __iomem * ioremap_nocache (unsigned long offset, unsigned long 
size);
+#define ioremap_wc ioremap_nocache
+#define ioremap_uc ioremap_nocache
 extern void iounmap (volatile void __iomem *addr);
 extern void __iomem * early_ioremap (unsigned long phys_addr, unsigned long 
size);
 #define early_memremap(phys_addr, size)early_ioremap(phys_addr, size)
@@ -436,7 +438,6 @@ static inline void __iomem * ioremap_cache (unsigned long 
phys_addr, unsigned lo
return ioremap(phys_addr, size);
 }
 
-
 /*
  * String version of IO memory access ops:
  */
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH -next] parisc: Define ioremap_uc and ioremap_wc

2015-07-31 Thread Guenter Roeck
Commit 3cc2dac5be3f ("drivers/video/fbdev/atyfb: Replace MTRR UC hole
with strong UC") introduces calls to ioremap_wc and ioremap_uc. This
causes build failures with parisc:allmodconfig. Map the missing
functions to ioremap_nocache.

Fixes: 3cc2dac5be3f ("drivers/video/fbdev/atyfb:
Replace MTRR UC hole with strong UC")
Cc: Luis R. Rodriguez 
Cc: Paul Gortmaker 
Signed-off-by: Guenter Roeck 
---
 arch/parisc/include/asm/io.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/parisc/include/asm/io.h b/arch/parisc/include/asm/io.h
index 8cd0abf28ffb..1a16f1d1075f 100644
--- a/arch/parisc/include/asm/io.h
+++ b/arch/parisc/include/asm/io.h
@@ -137,6 +137,8 @@ static inline void __iomem * ioremap(unsigned long offset, 
unsigned long size)
return __ioremap(offset, size, _PAGE_NO_CACHE);
 }
 #define ioremap_nocache(off, sz)   ioremap((off), (sz))
+#define ioremap_wc ioremap_nocache
+#define ioremap_uc ioremap_nocache
 
 extern void iounmap(const volatile void __iomem *addr);
 
-- 
2.1.4


[PATCH -next] alpha: Define ioremap_uc and ioremap_wc

2015-07-31 Thread Guenter Roeck
Commit 3cc2dac5be3f ("drivers/video/fbdev/atyfb: Replace MTRR UC hole
with strong UC") introduces calls to ioremap_wc and ioremap_uc. This
causes build failures with alpha:allmodconfig. Map the missing functions
to ioremap_nocache.

Fixes: 3cc2dac5be3f ("drivers/video/fbdev/atyfb:
Replace MTRR UC hole with strong UC")
Cc: Paul Gortmaker 
Cc: Luis R. Rodriguez 
Signed-off-by: Guenter Roeck 
---
 arch/alpha/include/asm/io.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/alpha/include/asm/io.h b/arch/alpha/include/asm/io.h
index f05bdb4b1cb9..7edfe6bf0ee6 100644
--- a/arch/alpha/include/asm/io.h
+++ b/arch/alpha/include/asm/io.h
@@ -299,6 +299,9 @@ static inline void __iomem * ioremap_nocache(unsigned long 
offset,
return ioremap(offset, size);
 } 
 
+#define ioremap_wc ioremap_nocache
+#define ioremap_uc ioremap_nocache
+
 static inline void iounmap(volatile void __iomem *addr)
 {
IO_CONCAT(__IO_PREFIX,iounmap)(addr);
-- 
2.1.4


[PATCH v4 1/7] locking/pvqspinlock: Unconditional PV kick with _Q_SLOW_VAL

2015-07-31 Thread Waiman Long
The smp_store_release() is not a full barrier. In order to avoid missed
wakeup, we may need to add memory barrier around locked and cpu state
variables adding to complexity. As the chance of spurious wakeup is very
low, it is easier and safer to just do an unconditional kick at unlock
time.

Signed-off-by: Waiman Long 
---
 kernel/locking/qspinlock_paravirt.h |   11 ---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/kernel/locking/qspinlock_paravirt.h 
b/kernel/locking/qspinlock_paravirt.h
index 15d3733..2dd4b39 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -240,7 +240,6 @@ static void pv_wait_head(struct qspinlock *lock, struct 
mcs_spinlock *node)
cpu_relax();
}
 
-   WRITE_ONCE(pn->state, vcpu_halted);
if (!lp) { /* ONCE */
lp = pv_hash(lock, pn);
/*
@@ -320,9 +319,15 @@ __visible void __pv_queued_spin_unlock(struct qspinlock 
*lock)
/*
 * At this point the memory pointed at by lock can be freed/reused,
 * however we can still use the pv_node to kick the CPU.
+*
+* As smp_store_release() is not a full barrier, adding a check to
+* the node->state doesn't guarantee the checking is really done
+* after clearing the lock byte since they are in 2 separate
+* cachelines and so hardware can reorder them. So either we insert
+* memory barrier here and in the corresponding pv_wait_head()
+* function or we do an unconditional kick which is what is done here.
 */
-   if (READ_ONCE(node->state) == vcpu_halted)
-   pv_kick(node->cpu);
+   pv_kick(node->cpu);
 }
 /*
  * Include the architecture specific callee-save thunk of the
-- 
1.7.1


[PATCH v4 6/7] locking/pvqspinlock: Allow vCPUs kick-ahead

2015-07-31 Thread Waiman Long
Frequent CPU halting (vmexit) and CPU kicking (vmenter) lengthen the
critical section and block forward progress.  This patch implements
a kick-ahead mechanism where the unlocker will kick the queue head
vCPUs as well as up to four additional vCPUs next to the queue head
if they were halted.  The kickings are done after exiting the critical
section to improve parallelism.

The amount of kick-ahead allowed depends on the number of vCPUs
in the VM guest. Currently it allows up to 1 vCPU kick-ahead per
4 vCPUs available up to a maximum of PV_KICK_AHEAD_MAX (4). There
are diminishing returns in increasing the maximum value. The current
value of 4 is a compromise of getting a nice performance boost without
penalizing too much on the one vCPU that is doing all the kickings.
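
The sizing rule above can be written out as a short user-space sketch.
The helper name kick_ahead_count() is hypothetical; only the "1 per 4
vCPUs, capped at PV_KICK_AHEAD_MAX" rule is taken from the description,
and the patch itself computes the value once at init time.

```c
#include <assert.h>

#define PV_KICK_AHEAD_MAX 4	/* cap stated in the commit message */

/*
 * Hypothetical helper: number of extra vCPUs that may be kicked ahead
 * for a guest with ncpus possible CPUs (1 per 4 vCPUs, capped).
 */
static int kick_ahead_count(int ncpus)
{
	int n;

	assert(ncpus >= 0);
	n = ncpus / 4;
	return n < PV_KICK_AHEAD_MAX ? n : PV_KICK_AHEAD_MAX;
}
```

Under this rule a guest with fewer than 4 vCPUs gets no kick-ahead at
all, and both the 32-vCPU and 48-vCPU guests measured below run with
the maximum of 4.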

Linux kernel builds were run in KVM guest on an 8-socket, 4
cores/socket Westmere-EX system and a 4-socket, 8 cores/socket
Haswell-EX system. Both systems are configured to have 32 physical
CPUs. The kernel build times before and after the patch were:

                 Westmere                Haswell
  Patch          32 vCPUs    48 vCPUs    32 vCPUs    48 vCPUs
  -----          --------    --------    --------    --------
  Before patch   3m42.3s     10m27.5s    2m00.7s     17m22.2s
  After patch    3m02.3s      9m35.9s    1m59.8s     16m57.6s

This improves performance quite substantially on Westmere, but not
so much on Haswell.

Signed-off-by: Waiman Long 
---
 kernel/locking/qspinlock_paravirt.h |  108 ---
 1 files changed, 99 insertions(+), 9 deletions(-)

diff --git a/kernel/locking/qspinlock_paravirt.h 
b/kernel/locking/qspinlock_paravirt.h
index 5e140fe..c4cc631 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -54,6 +54,7 @@ enum pv_qlock_stat {
pvstat_kick_time,
pvstat_lock_kick,
pvstat_unlock_kick,
+   pvstat_kick_ahead,
pvstat_pend_lock,
pvstat_pend_fail,
pvstat_spurious,
@@ -75,6 +76,7 @@ static const char * const stat_fsnames[pvstat_num] = {
[pvstat_kick_time]   = "kick_time_count",
[pvstat_lock_kick]   = "lock_kick_count",
[pvstat_unlock_kick] = "unlock_kick_count",
+   [pvstat_kick_ahead]  = "kick_ahead_count",
[pvstat_pend_lock]   = "pending_lock_count",
[pvstat_pend_fail]   = "pending_fail_count",
[pvstat_spurious]= "spurious_wakeup",
@@ -87,7 +89,8 @@ static atomic_t pvstats[pvstat_num];
  * pv_kick_latencies = sum of all pv_kick latencies in ns
  * pv_wake_latencies = sum of all wakeup latencies in ns
  *
- * Avg kick latency   = pv_kick_latencies/(lock_kick_count + unlock_kick_count)
+ * Avg kick latency   = pv_kick_latencies/
+ * (lock_kick_count + unlock_kick_count + kick_ahead_count)
  * Avg wake latency   = pv_wake_latencies/kick_time_count
  * Avg # of hops/hash = hash_hops_count/unlock_kick_count
  */
@@ -219,6 +222,18 @@ static struct pv_hash_entry *pv_lock_hash;
 static unsigned int pv_lock_hash_bits __read_mostly;
 
 /*
+ * Allow kick-ahead of vCPUs at unlock time
+ *
+ * The pv_kick_ahead value is set by a simple formula that 1 vCPU kick-ahead
+ * is allowed per 4 vCPUs available up to a maximum of PV_KICK_AHEAD_MAX.
+ * There are diminishing returns in increasing PV_KICK_AHEAD_MAX. The current
+ * value of 4 is a good compromise that gives a good performance boost without
+ * penalizing the vCPU that is doing the kicking by too much.
+ */
+#define PV_KICK_AHEAD_MAX  4
+static int pv_kick_ahead __read_mostly;
+
+/*
  * Allocate memory for the PV qspinlock hash buckets
  *
  * This function should be called from the paravirt spinlock initialization
@@ -226,7 +241,8 @@ static unsigned int pv_lock_hash_bits __read_mostly;
  */
 void __init __pv_init_lock_hash(void)
 {
-   int pv_hash_size = ALIGN(4 * num_possible_cpus(), PV_HE_PER_LINE);
+   int ncpus = num_possible_cpus();
+   int pv_hash_size = ALIGN(4 * ncpus, PV_HE_PER_LINE);
 
if (pv_hash_size < PV_HE_MIN)
pv_hash_size = PV_HE_MIN;
@@ -240,6 +256,13 @@ void __init __pv_init_lock_hash(void)
   pv_hash_size, 0, HASH_EARLY,
   _lock_hash_bits, NULL,
   pv_hash_size, pv_hash_size);
+   /*
+* Enable the unlock kick ahead mode according to the number of
+* vCPUs available.
+*/
+   pv_kick_ahead = min(ncpus/4, PV_KICK_AHEAD_MAX);
+   if (pv_kick_ahead)
+   pr_info("PV unlock kick ahead max count = %d\n", pv_kick_ahead);
 }
 
 #define for_each_hash_entry(he, offset, hash)  
\
@@ -435,6 +458,28 @@ static void pv_wait_node(struct mcs_spinlock *node)
 }
 
 /*
+ * Helper to get the address of the next kickable node
+ *
+ * The node has to be in the halted state. If the chkonly flag is set,
+ * the CPU state won't be 

[PATCH v4 2/7] locking/pvqspinlock: Add pending bit support

2015-07-31 Thread Waiman Long
Like the native qspinlock, using the pending bit when it is lightly
loaded to acquire the lock is faster than going through the PV queuing
process which is even slower than the native queuing process. It also
avoids loading two additional cachelines (the MCS and PV nodes).

This patch adds the pending bit support for PV qspinlock. The pending
bit code has a smaller spin threshold (1<<10). It will default back
to the queuing method if it cannot acquire the lock within a certain
time limit.

On a VM with 32 vCPUs on a 32-core Westmere-EX box, the kernel
build times on 4.2-rc1 based kernels were:

  Kernel        Build Time    Sys Time
  ------        ----------    --------
  w/o patch     3m28.5s       28m17.5s
  with patch    3m19.3s       23m55.7s

Using a locking microbenchmark on the same system, the locking
rates in (kops/s) were:

  Threads             Rate w/o patch    Rate with patch
  -------             --------------    ---------------
  2 (same socket)     6,515,265         7,077,476
  2 (diff sockets)    2,967,145         4,353,851

Signed-off-by: Waiman Long 
---
 kernel/locking/qspinlock.c  |   27 +-
 kernel/locking/qspinlock_paravirt.h |   67 +++
 2 files changed, 93 insertions(+), 1 deletions(-)

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 38c4920..6518ee9 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -162,6 +162,17 @@ static __always_inline void 
clear_pending_set_locked(struct qspinlock *lock)
WRITE_ONCE(l->locked_pending, _Q_LOCKED_VAL);
 }
 
+/**
+ * clear_pending - clear the pending bit.
+ * @lock: Pointer to queued spinlock structure
+ */
+static __always_inline void clear_pending(struct qspinlock *lock)
+{
+   struct __qspinlock *l = (void *)lock;
+
+   WRITE_ONCE(l->pending, 0);
+}
+
 /*
  * xchg_tail - Put in the new queue tail code word & retrieve previous one
  * @lock : Pointer to queued spinlock structure
@@ -193,6 +204,15 @@ static __always_inline void 
clear_pending_set_locked(struct qspinlock *lock)
 }
 
 /**
+ * clear_pending - clear the pending bit.
+ * @lock: Pointer to queued spinlock structure
+ */
+static __always_inline void clear_pending(struct qspinlock *lock)
+{
+   atomic_add(-_Q_PENDING_VAL, &lock->val);
+}
+
+/**
  * xchg_tail - Put in the new queue tail code word & retrieve previous one
  * @lock : Pointer to queued spinlock structure
  * @tail : The new queue tail code word
@@ -245,6 +265,7 @@ static __always_inline void __pv_wait_head(struct qspinlock 
*lock,
   struct mcs_spinlock *node) { }
 
 #define pv_enabled()   false
+#define pv_pending_lock(l, v)  false
 
 #define pv_init_node   __pv_init_node
 #define pv_wait_node   __pv_wait_node
@@ -286,8 +307,11 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 
val)
 
BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS));
 
-   if (pv_enabled())
+   if (pv_enabled()) {
+   if (pv_pending_lock(lock, val))
+   return; /* Got the lock via pending bit */
goto queue;
+   }
 
if (virt_queued_spin_lock(lock))
return;
@@ -463,6 +487,7 @@ EXPORT_SYMBOL(queued_spin_lock_slowpath);
 #undef pv_wait_node
 #undef pv_kick_node
 #undef pv_wait_head
+#undef pv_pending_lock
 
 #undef  queued_spin_lock_slowpath
 #define queued_spin_lock_slowpath  __pv_queued_spin_lock_slowpath
diff --git a/kernel/locking/qspinlock_paravirt.h 
b/kernel/locking/qspinlock_paravirt.h
index 2dd4b39..5325877 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -22,6 +22,14 @@
 
 #define _Q_SLOW_VAL    (3U << _Q_LOCKED_OFFSET)
 
+/*
+ * Queued Spinlock Spin Threshold
+ *
+ * The vCPU will spin a relatively short time in pending mode before falling
+ * back to queuing.
+ */
+#define PENDING_SPIN_THRESHOLD (SPIN_THRESHOLD >> 5)
+
 enum vcpu_state {
vcpu_running = 0,
vcpu_halted,
@@ -152,6 +160,65 @@ static void pv_init_node(struct mcs_spinlock *node)
 }
 
 /*
+ * Try to acquire the lock and wait using the pending bit
+ */
+static int pv_pending_lock(struct qspinlock *lock, u32 val)
+{
+   int loop = PENDING_SPIN_THRESHOLD;
+   u32 new, old;
+
+   /*
+* wait for in-progress pending->locked hand-overs
+*/
+   while ((val == _Q_PENDING_VAL) && loop) {
+   cpu_relax();
+   val = atomic_read(&lock->val);
+   loop--;
+   }
+
+   /*
+* trylock || pending
+*/
+   for (;; loop--) {
+   if (val & ~_Q_LOCKED_MASK)
+   goto queue;
+   new = _Q_LOCKED_VAL;
+   if (val == new)
+   new |= _Q_PENDING_VAL;
+   old = atomic_cmpxchg(&lock->val, val, new);
+   if (old == val)
+   break;
+   if 

[PATCH v4 4/7] locking/pvqspinlock, x86: Optimize PV unlock code path

2015-07-31 Thread Waiman Long
The unlock function in queued spinlocks was optimized for better
performance on bare metal systems at the expense of virtualized guests.

For x86-64 systems, the unlock call needs to go through a
PV_CALLEE_SAVE_REGS_THUNK() which saves and restores 8 64-bit
registers before calling the real __pv_queued_spin_unlock()
function. The thunk code may also be in a separate cacheline from
__pv_queued_spin_unlock().

This patch optimizes the PV unlock code path by:
 1) Moving the unlock slowpath code from the fastpath into a separate
__pv_queued_spin_unlock_slowpath() function to make the fastpath
as simple as possible.
 2) For x86-64, hand-coded an assembly function to combine the register
saving thunk code with the fastpath code. Only registers that
are used in the fastpath will be saved and restored. If the
fastpath fails, the slowpath function will be called via another
PV_CALLEE_SAVE_REGS_THUNK(). For 32-bit, it falls back to the C
__pv_queued_spin_unlock() code as the thunk saves and restores
only one 32-bit register.
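
The fastpath/slowpath split in item 1 can be sketched in user-space C11
as follows. This is an illustration only: unlock_fast(), unlock_slow()
and slowpath_calls are hypothetical names, the lock is reduced to a
single byte, and the stand-in slowpath merely counts the call and
clears the byte, whereas the real slowpath also has to unhash the lock
and kick the waiting vCPU. LOCKED_VAL = 1 and SLOW_VAL = 3 assume a
locked-byte offset of zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define LOCKED_VAL 1	/* lock held, no halted waiter (assumed encoding) */
#define SLOW_VAL   3	/* lock held, queue head may be halted (assumed) */

static int slowpath_calls;	/* instrumentation for this sketch only */

/* Stand-in slowpath: the real one unhashes the lock and kicks a vCPU. */
static void unlock_slow(_Atomic uint8_t *locked, uint8_t lockval)
{
	assert(lockval == SLOW_VAL);
	slowpath_calls++;
	atomic_store_explicit(locked, 0, memory_order_release);
}

/* Fastpath: one cmpxchg; anything unexpected is left to the slowpath. */
static void unlock_fast(_Atomic uint8_t *locked)
{
	uint8_t expected = LOCKED_VAL;

	if (atomic_compare_exchange_strong_explicit(locked, &expected, 0,
						    memory_order_release,
						    memory_order_relaxed))
		return;
	unlock_slow(locked, expected);	/* expected now holds the old byte */
}
```

Keeping the common case down to a compare-and-exchange plus a return is
what makes it practical to hand-code the x86-64 version with only the
registers that the fastpath actually uses.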

With a microbenchmark of 5M lock-unlock loop, the table below shows
the execution times before and after the patch with different number
of threads in a VM running on a 32-core Westmere-EX box with x86-64
4.2-rc1 based kernels:

  Threads    Before patch    After patch    % Change
  -------    ------------    -----------    --------
     1         134.1 ms        119.1 ms      -11%
     2        1163   ms        967   ms      -17%
     3        3641   ms       3298   ms      -9.4%
     4        4051   ms       3634   ms      -10.3%

Signed-off-by: Waiman Long 
---
 arch/x86/include/asm/qspinlock_paravirt.h |   59 +
 kernel/locking/qspinlock_paravirt.h   |   31 ++-
 2 files changed, 80 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/qspinlock_paravirt.h 
b/arch/x86/include/asm/qspinlock_paravirt.h
index b002e71..3001972 100644
--- a/arch/x86/include/asm/qspinlock_paravirt.h
+++ b/arch/x86/include/asm/qspinlock_paravirt.h
@@ -1,6 +1,65 @@
 #ifndef __ASM_QSPINLOCK_PARAVIRT_H
 #define __ASM_QSPINLOCK_PARAVIRT_H
 
+/*
+ * For x86-64, PV_CALLEE_SAVE_REGS_THUNK() saves and restores 8 64-bit
+ * registers. For i386, however, only 1 32-bit register needs to be saved
+ * and restored. So an optimized version of __pv_queued_spin_unlock() is
+ * hand-coded for 64-bit, but it isn't worthwhile to do it for 32-bit.
+ */
+#ifdef CONFIG_64BIT
+
+PV_CALLEE_SAVE_REGS_THUNK(__pv_queued_spin_unlock_slowpath);
+#define __pv_queued_spin_unlock    __pv_queued_spin_unlock
+#define PV_UNLOCK          "__raw_callee_save___pv_queued_spin_unlock"
+#define PV_UNLOCK_SLOWPATH "__raw_callee_save___pv_queued_spin_unlock_slowpath"
+
+/*
+ * Optimized assembly version of __raw_callee_save___pv_queued_spin_unlock
+ * which combines the register-saving thunk and the body of the following
+ * C code:
+ *
+ * void __pv_queued_spin_unlock(struct qspinlock *lock)
+ * {
+ * struct __qspinlock *l = (void *)lock;
+ * u8 lockval = cmpxchg(&l->locked, _Q_LOCKED_VAL, 0);
+ *
+ * if (likely(lockval == _Q_LOCKED_VAL))
+ * return;
+ * pv_queued_spin_unlock_slowpath(lock, lockval);
+ * }
+ *
+ * For x86-64,
+ *   rdi = lock    (first argument)
+ *   rsi = lockval (second argument)
+ *   rdx = internal variable (set to 0)
+ */
+asm(".pushsection .text;"
+".globl " PV_UNLOCK ";"
+".align 4,0x90;"
+PV_UNLOCK ": "
+"push  %rdx;"
+"mov   $0x1,%eax;"
+"xor   %edx,%edx;"
+"lock cmpxchg %dl,(%rdi);"
+"cmp   $0x1,%al;"
+"jne   .slowpath;"
+"pop   %rdx;"
+"ret;"
+".slowpath: "
+"push   %rsi;"
+"movzbl %al,%esi;"
+"call " PV_UNLOCK_SLOWPATH ";"
+"pop    %rsi;"
+"pop    %rdx;"
+"ret;"
+".size " PV_UNLOCK ", .-" PV_UNLOCK ";"
+".popsection");
+
+#else /* CONFIG_64BIT */
+
+extern void __pv_queued_spin_unlock(struct qspinlock *lock);
 PV_CALLEE_SAVE_REGS_THUNK(__pv_queued_spin_unlock);
 
+#endif /* CONFIG_64BIT */
 #endif
diff --git a/kernel/locking/qspinlock_paravirt.h 
b/kernel/locking/qspinlock_paravirt.h
index 3552aa9..5efcc65 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -516,23 +516,20 @@ static void pv_wait_head(struct qspinlock *lock, struct 
mcs_spinlock *node)
 }
 
 /*
- * PV version of the unlock function to be used in stead of
- * queued_spin_unlock().
+ * PV versions of the unlock fastpath and slowpath functions to be used
+ * instead of queued_spin_unlock().
  */
-__visible void __pv_queued_spin_unlock(struct qspinlock *lock)
+__visible void
+__pv_queued_spin_unlock_slowpath(struct qspinlock *lock, u8 lockval)
 {
struct __qspinlock *l = (void *)lock;
struct pv_node *node;
-   u8 lockval = cmpxchg(&l->locked, _Q_LOCKED_VAL, 0);
 
/*
 * We must not unlock if SLOW, because in that case we must first
 * unhash. 

[PATCH v4 0/7] locking/qspinlock: Enhance pvqspinlock performance

2015-07-31 Thread Waiman Long
v3->v4:
 - Patch 1: add comment about possible racing condition in PV unlock.
 - Patch 2: simplified the pv_pending_lock() function as suggested by
   Davidlohr.
 - Move PV unlock optimization patch forward to patch 4 & rerun
   performance test.

v2->v3:
 - Moved deferred kicking enablement patch forward & move back
   the kick-ahead patch to make the effect of kick-ahead more visible.
 - Reworked patch 6 to make it more readable.
 - Reverted back to use state as a tri-state variable instead of
   adding an additional bistate variable.
 - Added performance data for different values of PV_KICK_AHEAD_MAX.
 - Add a new patch to optimize PV unlock code path performance.

v1->v2:
 - Take out the queued unfair lock patches
 - Add a patch to simplify the PV unlock code
 - Move pending bit and statistics collection patches to the front
 - Keep vCPU kicking in pv_kick_node(), but defer it to unlock time
   when appropriate.
 - Change the wait-early patch to use adaptive spinning to better
   balance the different effects on normal and over-committed guests.
 - Add patch-to-patch performance changes in the patch commit logs.

This patchset tries to improve the performance of both normal and
over-committed VM guests. The kick-ahead and adaptive spinning
patches are inspired by the "Do Virtual Machines Really Scale?" blog
from Sanidhya Kashyap.

Patch 1 simplifies the unlock code by doing unconditional vCPU kick
when _Q_SLOW_VAL is set as the chance of spurious wakeup showing
up in the statistical data that I collected was very low (1 or 2
occasionally).

Patch 2 adds pending bit support to pvqspinlock improving performance
at light load.

Patch 3 allows the collection of various count data that are useful
to see what is happening in the system. They do add a bit of overhead
when enabled slowing performance a tiny bit.

Patch 4 optimizes the PV unlock code path performance for x86-64
architecture.

Patch 5 is an enablement patch for deferring vCPU kickings from the
lock side to the unlock side.

Patch 6 enables multiple vCPU kick-aheads at unlock time, outside of
the critical section, which can improve performance in overcommitted
guests and sometimes even in normal guests.

Patch 7 enables adaptive spinning in the queue nodes. This patch can
lead to pretty big performance increase in over-committed guest at
the expense of a slight performance hit in normal guests.

Patches 2 & 4 improve the performance of the common uncontended and
lightly contended cases. Patches 5-7 are for improving performance in
over-committed VM guests.

Performance measurements were done on 32-CPU Westmere-EX and
Haswell-EX systems. The Westmere-EX system got the most performance
gain from patch 5, whereas the Haswell-EX system got the most gain
from patch 6 for over-committed guests.

The table below shows the Linux kernel build times for various
values of PV_KICK_AHEAD_MAX on an over-committed 48-vCPU guest on
the Westmere-EX system:

  PV_KICK_AHEAD_MAX    Patches 1-5    Patches 1-6
  -----------------    -----------    -----------
          1              9m46.9s       11m10.1s
          2              9m40.2s       10m08.3s
          3              9m36.8s        9m49.8s
          4              9m35.9s        9m38.7s
          5              9m35.1s        9m33.0s
          6              9m35.7s        9m28.5s

With patches 1-5, the performance wasn't very sensitive to different
PV_KICK_AHEAD_MAX values. Adding patch 6 into the mix, however, changes
the picture quite dramatically. There is a performance regression if
PV_KICK_AHEAD_MAX is too small. Starting with a value of 4, increasing
PV_KICK_AHEAD_MAX only gets us a minor benefit.

Waiman Long (7):
  locking/pvqspinlock: Unconditional PV kick with _Q_SLOW_VAL
  locking/pvqspinlock: Add pending bit support
  locking/pvqspinlock: Collect slowpath lock statistics
  locking/pvqspinlock, x86: Optimize PV unlock code path
  locking/pvqspinlock: Enable deferment of vCPU kicking to unlock call
  locking/pvqspinlock: Allow vCPUs kick-ahead
  locking/pvqspinlock: Queue node adaptive spinning

 arch/x86/Kconfig  |7 +
 arch/x86/include/asm/qspinlock_paravirt.h |   59 +++
 kernel/locking/qspinlock.c|   38 ++-
 kernel/locking/qspinlock_paravirt.h   |  546 +++--
 4 files changed, 612 insertions(+), 38 deletions(-)

 v3-to-v4 diff:

 arch/x86/include/asm/qspinlock_paravirt.h |5 +--
 kernel/locking/qspinlock_paravirt.h   |   45 +---
 2 files changed, 29 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/qspinlock_paravirt.h 
b/arch/x86/include/asm/qspinlock_paravirt.h
index 46f0f82..3001972 100644
--- a/arch/x86/include/asm/qspinlock_paravirt.h
+++ b/arch/x86/include/asm/qspinlock_paravirt.h
@@ -9,10 +9,10 @@
  */
 #ifdef CONFIG_64BIT
 
-PV_CALLEE_SAVE_REGS_THUNK(pv_queued_spin_unlock_slowpath);
+PV_CALLEE_SAVE_REGS_THUNK(__pv_queued_spin_unlock_slowpath);
 #define 

[PATCH v4 3/7] locking/pvqspinlock: Collect slowpath lock statistics

2015-07-31 Thread Waiman Long
This patch enables the accumulation of kicking and waiting related
PV qspinlock statistics when the new QUEUED_LOCK_STAT configuration
option is selected. It also enables the collection of kicking and
wakeup latencies which have a heavy dependency on the CPUs being used.

The measured latencies for different CPUs are:

  CPU            Wakeup    Kicking
  ---            ------    -------
  Haswell-EX     89.8us     7.4us
  Westmere-EX    67.6us     9.3us

The measured latencies varied a bit from run-to-run. The wakeup
latency is much higher than the kicking latency.

A sample of statistics counts after a kernel build (no CPU overcommit)
was:

hash_hops_count=43
kick_latencies=5783565492
kick_time_count=640269
lock_kick_count=640238
pending_fail_count=10672
pending_lock_count=2946871
spurious_wakeup=14
unlock_kick_count=38
wait_again_count=4
wait_head_count=41
wait_node_count=640242
wake_latencies=42491684295
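
The averages that the patch's own comment defines (kick latency divided
by lock plus unlock kicks, wake latency divided by kick_time_count) can
be derived from these counters with a sketch like the one below. The
function names are hypothetical and the code is a user-space
illustration, not part of the patch; it assumes the latency sums are in
nanoseconds, as that comment states.

```c
#include <assert.h>
#include <stdint.h>

/* Average pv_kick latency in ns: total / (lock kicks + unlock kicks). */
static uint64_t avg_kick_latency_ns(uint64_t kick_latencies,
				    uint64_t lock_kicks,
				    uint64_t unlock_kicks)
{
	uint64_t kicks = lock_kicks + unlock_kicks;

	assert(kicks > 0);
	return kick_latencies / kicks;
}

/* Average wakeup latency in ns: total / number of timed kicks. */
static uint64_t avg_wake_latency_ns(uint64_t wake_latencies,
				    uint64_t kick_time_count)
{
	assert(kick_time_count > 0);
	return wake_latencies / kick_time_count;
}
```

Applied to the sample above, this gives about 9.0us per kick
(5783565492 / 640276) and about 66.4us per wakeup
(42491684295 / 640269).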

Signed-off-by: Waiman Long 
---
 arch/x86/Kconfig|7 ++
 kernel/locking/qspinlock_paravirt.h |  179 ++-
 2 files changed, 182 insertions(+), 4 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b3a1a5d..e89080b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -685,6 +685,13 @@ config PARAVIRT_SPINLOCKS
 
  If you are unsure how to answer this question, answer Y.
 
+config QUEUED_LOCK_STAT
+   bool "Paravirt queued lock statistics"
+   depends on PARAVIRT && DEBUG_FS && QUEUED_SPINLOCKS
+   ---help---
+ Enable the collection of statistical data on the behavior of
+ paravirtualized queued spinlocks and report them on debugfs.
+
 source "arch/x86/xen/Kconfig"
 
 config KVM_GUEST
diff --git a/kernel/locking/qspinlock_paravirt.h 
b/kernel/locking/qspinlock_paravirt.h
index 5325877..3552aa9 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -44,6 +44,153 @@ struct pv_node {
 };
 
 /*
+ * PV qspinlock statistics
+ */
+enum pv_qlock_stat {
+   pvstat_wait_head,
+   pvstat_wait_node,
+   pvstat_wait_again,
+   pvstat_kick_time,
+   pvstat_lock_kick,
+   pvstat_unlock_kick,
+   pvstat_pend_lock,
+   pvstat_pend_fail,
+   pvstat_spurious,
+   pvstat_hops,
+   pvstat_num  /* Total number of statistics counts */
+};
+
+#ifdef CONFIG_QUEUED_LOCK_STAT
+/*
+ * Collect pvqspinlock statistics
+ */
+#include 
+#include 
+
+static const char * const stat_fsnames[pvstat_num] = {
+   [pvstat_wait_head]   = "wait_head_count",
+   [pvstat_wait_node]   = "wait_node_count",
+   [pvstat_wait_again]  = "wait_again_count",
+   [pvstat_kick_time]   = "kick_time_count",
+   [pvstat_lock_kick]   = "lock_kick_count",
+   [pvstat_unlock_kick] = "unlock_kick_count",
+   [pvstat_pend_lock]   = "pending_lock_count",
+   [pvstat_pend_fail]   = "pending_fail_count",
+   [pvstat_spurious]= "spurious_wakeup",
+   [pvstat_hops]= "hash_hops_count",
+};
+
+static atomic_t pvstats[pvstat_num];
+
+/*
+ * pv_kick_latencies = sum of all pv_kick latencies in ns
+ * pv_wake_latencies = sum of all wakeup latencies in ns
+ *
+ * Avg kick latency   = pv_kick_latencies/(lock_kick_count + unlock_kick_count)
+ * Avg wake latency   = pv_wake_latencies/kick_time_count
+ * Avg # of hops/hash = hash_hops_count/unlock_kick_count
+ */
+static atomic64_t pv_kick_latencies, pv_wake_latencies;
+static DEFINE_PER_CPU(u64, pv_kick_time);
+
+/*
+ * Reset all the statistics counts if set
+ */
+static bool reset_cnts __read_mostly;
+
+/*
+ * Initialize debugfs for the PV qspinlock statistics
+ */
+static int __init pv_qspinlock_debugfs(void)
+{
+   struct dentry *d_pvqlock = debugfs_create_dir("pv-qspinlock", NULL);
+   int i;
+
+   if (!d_pvqlock)
+   pr_warn("Could not create 'pv-qspinlock' debugfs directory\n");
+
+   for (i = 0; i < pvstat_num; i++)
+   debugfs_create_u32(stat_fsnames[i], 0444, d_pvqlock,
+ (u32 *)&pvstats[i]);
+   debugfs_create_u64("kick_latencies", 0444, d_pvqlock,
+  (u64 *)&pv_kick_latencies);
+   debugfs_create_u64("wake_latencies", 0444, d_pvqlock,
+  (u64 *)&pv_wake_latencies);
+   debugfs_create_bool("reset_cnts", 0644, d_pvqlock, (u32 *)&reset_cnts);
+   return 0;
+}
+fs_initcall(pv_qspinlock_debugfs);
+
+/*
+ * Reset all the counts
+ */
+static noinline void pvstat_reset(void)
+{
+   int i;
+
+   for (i = 0; i < pvstat_num; i++)
+   atomic_set(&pvstats[i], 0);
+   atomic64_set(&pv_kick_latencies, 0);
+   atomic64_set(&pv_wake_latencies, 0);
+   reset_cnts = 0;
+}
+
+/*
+ * Increment the PV qspinlock statistics counts
+ */
+static inline void pvstat_inc(enum pv_qlock_stat stat)
+{
+   atomic_inc(&pvstats[stat]);
+   if (unlikely(reset_cnts))
+   pvstat_reset();
+}
+
+/*
+ * PV hash hop 

[PATCH v4 7/7] locking/pvqspinlock: Queue node adaptive spinning

2015-07-31 Thread Waiman Long
In an overcommitted guest where some vCPUs have to be halted to make
forward progress in other areas, it is highly likely that a vCPU later
in the spinlock queue will be spinning while the ones earlier in the
queue would have been halted. The spinning in the later vCPUs is then
just a waste of precious CPU cycles because they are not going to
get the lock soon as the earlier ones have to be woken up and take
their turn to get the lock.

Reducing the spinning threshold is found to improve performance in
an overcommitted VM guest, but decrease performance when there is
no overcommitment.

This patch implements an adaptive spinning mechanism where the vCPU
will call pv_wait() earlier if all the following conditions are true:

 1) the vCPU has not been halted before;
 2) the previous vCPU is in the halted state;
 3) the current vCPU is at least 2 nodes away from the lock holder;
 4) the current vCPU has called pv_wait() many times recently.
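
The four conditions above can be sketched as a single predicate. This is
a minimal userspace illustration, not the code from the patch: the
struct, its field names and the threshold value are hypothetical, and the
last condition is modeled as a simple counter compared against a
threshold.

```c
#include <assert.h>
#include <stdbool.h>

enum vcpu_state { vcpu_running = 0, vcpu_halted };

/* Hypothetical threshold; the real value is not given in this message. */
#define WAIT_HISTORY_THRESHOLD 15

/* Hypothetical per-node view of the facts the four conditions test. */
struct node_view {
	bool halted_before;          /* 1) has this vCPU been halted already? */
	enum vcpu_state prev_state;  /* 2) state of the previous node's vCPU  */
	int nodes_from_holder;       /* 3) queue distance to the lock holder  */
	int wait_history;            /* 4) recent pv_wait() activity counter  */
};

/* Return true only when all four conditions hold. */
static bool should_wait_early(const struct node_view *n)
{
	return !n->halted_before &&
	       n->prev_state == vcpu_halted &&
	       n->nodes_from_holder >= 2 &&
	       n->wait_history > WAIT_HISTORY_THRESHOLD;
}
```

Since every condition must hold, a vCPU that sits right behind the lock
holder, or whose predecessor is still running, keeps spinning for the
full threshold.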

Linux kernel builds were run in a KVM guest on an 8-socket, 4
cores/socket Westmere-EX system and a 4-socket, 8 cores/socket
Haswell-EX system. Both systems are configured to have 32 physical
CPUs. The kernel build times before and after the patch were:

                      Westmere                Haswell
  Patch           32 vCPUs    48 vCPUs    32 vCPUs    48 vCPUs
  -----           --------    --------    --------    --------
  Before patch     3m02.3s     9m35.9s     1m59.8s    16m57.6s
  After patch      3m06.5s     9m38.7s     2m01.5s     9m42.3s

This patch seemed to cause a slight performance degradation
for 32 vCPUs. For 48 vCPUs, there wasn't much change for Westmere,
but a pretty big performance jump for Haswell.

Signed-off-by: Waiman Long 
---
 kernel/locking/qspinlock.c  |5 +-
 kernel/locking/qspinlock_paravirt.h |  132 +-
 2 files changed, 131 insertions(+), 6 deletions(-)

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 94fdd27..da39d43 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -258,7 +258,8 @@ static __always_inline void set_locked(struct qspinlock *lock)
  */
 
 static __always_inline void __pv_init_node(struct mcs_spinlock *node) { }
-static __always_inline void __pv_wait_node(struct mcs_spinlock *node) { }
+static __always_inline void __pv_wait_node(struct mcs_spinlock *node,
+  struct mcs_spinlock *prev) { }
 static __always_inline void __pv_kick_node(struct qspinlock *lock,
   struct mcs_spinlock *node) { }
 static __always_inline void __pv_wait_head(struct qspinlock *lock,
@@ -415,7 +416,7 @@ queue:
prev = decode_tail(old);
WRITE_ONCE(prev->next, node);
 
-   pv_wait_node(node);
+   pv_wait_node(node, prev);
arch_mcs_spin_lock_contended(&node->locked);
}
 
diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index c4cc631..d04911b 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -23,12 +23,47 @@
 #define _Q_SLOW_VAL(3U << _Q_LOCKED_OFFSET)
 
 /*
- * Queued Spinlock Spin Threshold
+ * Queued Spinlock Spin Thresholds
  *
  * The vCPU will spin a relatively short time in pending mode before falling
  * back to queuing.
+ *
+ * Queue Node Adaptive Spinning
+ *
+ * A queue node vCPU will spin less if the following conditions are all true:
+ * 1) vCPU in the previous node is halted && it has not been halted before
+ * 2) it is at least 2 nodes away from the lock holder
+ * 3) there have been a lot of pv_wait() calls in the current vCPU recently
+ *
+ * The last condition is being monitored by the waithist field in the pv_node
+ * structure which tracks the history of pv_wait() relative to slowpath calls.
+ * Each pv_wait will increment this field by PV_WAIT_INC until it exceeds
+ * PV_WAITHIST_MAX. Each slowpath lock call will decrement it by 1 until it
+ * reaches PV_WAITHIST_MIN. If its value is higher than PV_WAITHIST_THRESHOLD,
+ * the vCPU will spin less. The reason for this adaptive spinning is to try
+ * to enable wait-early when overcommitted which should cause a lot more
+ * pv_wait's, but don't use it when it is not.
+ *
+ * The queue node vCPU will monitor the state of the previous node
+ * periodically to see if there is any change. If the previous node is
+ * found to be halted, it will call pv_wait() immediately when wait_early
+ * mode is enabled as the wakeup latency is pretty high. On the other hand,
+ * it won't go to the halted state immediately on entry to pv_wait_node() as
+ * the previous node may be being woken up.
+ *
+ * With PV_WAIT_INC set to 2, each pv_wait() while not in wait-early mode
+ * will increment waithist by 1. Each slowpath call without pv_wait() will
+ * decrement waithist by 1. The threshold is set in a way as to not prefer
+ * enabling wait-early.
  */
-#define PENDING_SPIN_THRESHOLD 

[PATCH v4 5/7] locking/pvqspinlock: Enable deferment of vCPU kicking to unlock call

2015-07-31 Thread Waiman Long
Most of the vCPU kicking is done on the locking side, where the new
lock holder wakes up the queue head vCPU to spin on the lock. However,
there are situations where it may be advantageous to defer the vCPU
kicking to when the lock holder releases the lock.

This patch enables the deferment of vCPU kicking to the unlock function
by adding a new vCPU state (vcpu_hashed) to mark the fact that
 1) _Q_SLOW_VAL is set in the lock, and
 2) the pv_node address is stored in the hash table

This enablement patch, by itself, should not change the performance
of the pvqspinlock code. The actual deferment of vCPU kicks will be
added in a later patch.
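
The state handling described above can be modeled in userspace with C11
atomics. This is only a sketch under assumptions: the helper names are
hypothetical, atomic_exchange() and atomic_compare_exchange_strong()
stand in for the kernel's xchg() and cmpxchg(), and the actual kick and
hash-table work is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum vcpu_state { vcpu_running = 0, vcpu_halted, vcpu_hashed };

/*
 * Waiter side, after waking up: return to vcpu_running only if the
 * state is still plain vcpu_halted, so a vcpu_hashed mark survives.
 */
static void waiter_reset_state(atomic_int *state)
{
	int expected = vcpu_halted;

	atomic_compare_exchange_strong(state, &expected, vcpu_running);
}

/*
 * Lock side: mark the node as running and report whether the waiter
 * was plain halted, i.e. whether a kick has to be issued now.
 */
static bool kick_needed(atomic_int *state)
{
	return atomic_exchange(state, vcpu_running) == vcpu_halted;
}

/*
 * Queue head: set vcpu_hashed and report whether it was set already,
 * in which case the hashing has been done by someone else.
 */
static bool already_hashed(atomic_int *state)
{
	return atomic_exchange(state, vcpu_hashed) == vcpu_hashed;
}
```

The compare-and-exchange on the waiter side is what keeps a vcpu_hashed
mark from being overwritten by an unconditional store of vcpu_running.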

Signed-off-by: Waiman Long 
---
 kernel/locking/qspinlock.c  |6 +++---
 kernel/locking/qspinlock_paravirt.h |   34 --
 2 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 6518ee9..94fdd27 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -259,8 +259,8 @@ static __always_inline void set_locked(struct qspinlock *lock)
 
 static __always_inline void __pv_init_node(struct mcs_spinlock *node) { }
 static __always_inline void __pv_wait_node(struct mcs_spinlock *node) { }
-static __always_inline void __pv_kick_node(struct mcs_spinlock *node) { }
-
+static __always_inline void __pv_kick_node(struct qspinlock *lock,
+  struct mcs_spinlock *node) { }
 static __always_inline void __pv_wait_head(struct qspinlock *lock,
   struct mcs_spinlock *node) { }
 
@@ -464,7 +464,7 @@ queue:
cpu_relax();
 
arch_mcs_spin_unlock_contended(&next->locked);
-   pv_kick_node(next);
+   pv_kick_node(lock, next);
 
 release:
/*
diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index 5efcc65..5e140fe 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -33,6 +33,7 @@
 enum vcpu_state {
vcpu_running = 0,
vcpu_halted,
+   vcpu_hashed,/* vcpu_halted + node stored in hash table */
 };
 
 struct pv_node {
@@ -406,13 +407,17 @@ static void pv_wait_node(struct mcs_spinlock *node)
pv_wait(&pn->state, vcpu_halted);
}
 
+   if (READ_ONCE(node->locked))
+   break;
+
/*
-* Reset the vCPU state to avoid unncessary CPU kicking
+* Reset the vCPU state to running to avoid unnecessary CPU
+* kicking unless vcpu_hashed had already been set. In this
+* case, node->locked should have just been set, and we
+* aren't going to set state to vcpu_halted again.
 */
-   WRITE_ONCE(pn->state, vcpu_running);
+   cmpxchg(&pn->state, vcpu_halted, vcpu_running);
 
-   if (READ_ONCE(node->locked))
-   break;
/*
 * If the locked flag is still not set after wakeup, it is a
 * spurious wakeup and the vCPU should wait again. However,
@@ -431,12 +436,16 @@ static void pv_wait_node(struct mcs_spinlock *node)
 
 /*
  * Called after setting next->locked = 1, used to wake those stuck in
- * pv_wait_node().
+ * pv_wait_node(). Alternatively, it can also defer the kicking to the
+ * unlock function.
  */
-static void pv_kick_node(struct mcs_spinlock *node)
+static void pv_kick_node(struct qspinlock *lock, struct mcs_spinlock *node)
 {
struct pv_node *pn = (struct pv_node *)node;
 
+   if (xchg(&pn->state, vcpu_running) != vcpu_halted)
+   return;
+
/*
 * Note that because node->locked is already set, this actual
 * mcs_spinlock entry could be re-used already.
@@ -446,10 +455,8 @@ static void pv_kick_node(struct mcs_spinlock *node)
 *
 * See the comment in pv_wait_node().
 */
-   if (xchg(&pn->state, vcpu_running) == vcpu_halted) {
-   pvstat_inc(pvstat_lock_kick);
-   pv_kick(pn->cpu);
-   }
+   pvstat_inc(pvstat_lock_kick);
+   pv_kick(pn->cpu);
 }
 
 /*
@@ -471,6 +478,13 @@ static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)
cpu_relax();
}
 
+   if (!lp && (xchg(&pn->state, vcpu_hashed) == vcpu_hashed))
+   /*
+* The hashed table & _Q_SLOW_VAL had been filled
+* by the lock holder.
+*/
+   lp = (struct qspinlock **)-1;
+
if (!lp) { /* ONCE */
lp = pv_hash(lock, pn);
/*
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 4.1 000/267] 4.1.4-stable review

2015-07-31 Thread Guenter Roeck

On 07/31/2015 12:37 PM, Greg Kroah-Hartman wrote:

This is the start of the stable review cycle for the 4.1.4 release.
There are 267 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sun Aug  2 19:39:27 UTC 2015.
Anything received after that time might be too late.



Build results:
total: 141 pass: 141 fail: 0
Qemu test results:
total: 33 pass: 33 fail: 0

Details are available at http://server.roeck-us.net:8010/builders.

Guenter



Re: [PATCH net-next 5/9] openvswitch: Add conntrack action

2015-07-31 Thread Pravin Shelar
On Thu, Jul 30, 2015 at 11:12 AM, Joe Stringer  wrote:
> Expose the kernel connection tracker via OVS. Userspace components can
> make use of the "ct()" action, followed by "recirculate", to populate
> the conntracking state in the OVS flow key, and subsequently match on
> that state.
>
> Example ODP flows allowing traffic from 1->2, only replies from 2->1:
> in_port=1,tcp,action=ct(commit,zone=1),2
> in_port=2,ct_state=-trk,tcp,action=ct(zone=1),recirc(1)
> recirc_id=1,in_port=2,ct_state=+trk+est-new,tcp,action=1
>
> IP fragments are handled by transparently assembling them as part of the
> ct action. The maximum received unit (MRU) size is tracked so that
> refragmentation can occur during output.
>
> IP frag handling contributed by Andy Zhou.
>
> Signed-off-by: Joe Stringer 
> Signed-off-by: Justin Pettit 
> Signed-off-by: Andy Zhou 
> ---
> This can be tested with the corresponding userspace component here:
> https://www.github.com/justinpettit/openvswitch conntrack
> ---
>  include/uapi/linux/openvswitch.h |  41 
>  net/openvswitch/Kconfig  |  11 +
>  net/openvswitch/Makefile |   1 +
>  net/openvswitch/actions.c| 162 -
>  net/openvswitch/conntrack.c  | 480 
> +++
>  net/openvswitch/conntrack.h  |  82 +++
>  net/openvswitch/datapath.c   |  62 +++--
>  net/openvswitch/datapath.h   |   6 +
>  net/openvswitch/flow.c   |   3 +
>  net/openvswitch/flow.h   |   6 +
>  net/openvswitch/flow_netlink.c   |  73 --
>  net/openvswitch/flow_netlink.h   |   4 +-
>  net/openvswitch/vport.c  |   1 +
>  13 files changed, 897 insertions(+), 35 deletions(-)
>  create mode 100644 net/openvswitch/conntrack.c
>  create mode 100644 net/openvswitch/conntrack.h
>
...

> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index e50678d..4a62ed4 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -22,6 +22,7 @@
>  #include 
>  #include 

..
>  static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
> @@ -52,6 +55,16 @@ struct deferred_action {
> struct sw_flow_key pkt_key;
>  };
>
> +struct ovs_frag_data {
> +   struct dst_entry *dst;
> +   struct vport *vport;
> +   struct sw_flow_key *key;
> +   struct ovs_skb_cb cb;
> +   __be16 vlan_proto;
> +};
> +
> +static DEFINE_PER_CPU(struct ovs_frag_data, ovs_frag_data_storage);
> +
>  #define DEFERRED_ACTION_FIFO_SIZE 10
>  struct action_fifo {
> int head;
> @@ -594,14 +607,136 @@ static int set_sctp(struct sk_buff *skb, struct 
> sw_flow_key *flow_key,
> return 0;
>  }
>
> -static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port)
> +/* Given an IP frame, reconstruct its MAC header.  */
> +static void ovs_setup_l2_header(struct sk_buff *skb,
> +   const struct ovs_frag_data *data)
> +{
> +   struct sw_flow_key *key = data->key;
> +
> +   skb_push(skb, ETH_HLEN);
> +   skb_reset_mac_header(skb);
> +
> +   ether_addr_copy(eth_hdr(skb)->h_source, key->eth.src);
> +   ether_addr_copy(eth_hdr(skb)->h_dest, key->eth.dst);
> +   eth_hdr(skb)->h_proto = key->eth.type;
> +
> +   if ((data->key->eth.tci & htons(VLAN_TAG_PRESENT)) &&
> +   !skb_vlan_tag_present(skb))
> +   __vlan_hwaccel_put_tag(skb, data->vlan_proto,
> +  ntohs(key->eth.tci));
> +}
> +
> +static void prepare_frag(struct vport *vport, struct sw_flow_key *key,
> +struct sk_buff *skb)
> +{
> +   unsigned int hlen = ETH_HLEN;
> +   struct ovs_frag_data *data;
> +
> +   data = this_cpu_ptr(&ovs_frag_data_storage);
> +   data->dst = skb_dst(skb);
> +   data->vport = vport;
> +   data->key = key;
> +   data->cb = *OVS_CB(skb);
> +
> +   if (key->eth.tci & htons(VLAN_TAG_PRESENT)) {
> +   if (skb_vlan_tag_present(skb)) {
> +   data->vlan_proto = skb->vlan_proto;
> +   } else {
> +   data->vlan_proto = vlan_eth_hdr(skb)->h_vlan_proto;
> +   hlen += VLAN_HLEN;
> +   }
> +   }
Not all actions keep the flow key up to date, so you can access stale values here.

> +
> +   memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
> +   skb_pull(skb, hlen);
> +}
> +
> +static int ovs_vport_output(struct sock *sock, struct sk_buff *skb)
> +{
> +   struct ovs_frag_data *data = this_cpu_ptr(&ovs_frag_data_storage);
> +   struct vport *vport = data->vport;
> +
> +   skb_dst_drop(skb);
> +   skb_dst_set(skb, dst_clone(data->dst));
> +   *OVS_CB(skb) = data->cb;
> +
> +   ovs_setup_l2_header(skb, data);
> +   ovs_vport_send(vport, skb);
> +
> +   return 0;
> +}
> +
...
> +static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
> + struct sw_flow_key *key)
>  {
> struct 

Re: [PATCH 3.14 000/125] 3.14.49-stable review

2015-07-31 Thread Guenter Roeck

On 07/31/2015 12:40 PM, Greg Kroah-Hartman wrote:

This is the start of the stable review cycle for the 3.14.49 release.
There are 125 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sun Aug  2 19:40:05 UTC 2015.
Anything received after that time might be too late.



Build results:
total: 132 pass: 132 fail: 0
Qemu test results:
total: 32 pass: 32 fail: 0

Details are available at http://server.roeck-us.net:8010/builders.

Guenter



Re: [PATCH 3.10 00/89] 3.10.85-stable review

2015-07-31 Thread Guenter Roeck

On 07/31/2015 12:40 PM, Greg Kroah-Hartman wrote:

This is the start of the stable review cycle for the 3.10.85 release.
There are 89 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sun Aug  2 19:40:13 UTC 2015.
Anything received after that time might be too late.



Build results:
total: 127 pass: 127 fail: 0
Qemu test results:
total: 29 pass: 29 fail: 0

Details are available at http://server.roeck-us.net:8010/builders.

Guenter



Re: Revisit AF_BUS: is it a better way to implement KDBUS?

2015-07-31 Thread cee1
2015-08-01 5:15 GMT+08:00 Andy Lutomirski :
> On Fri, Jul 31, 2015 at 9:25 AM, cee1  wrote:
>> 2015-07-31 2:12 GMT+08:00 Andy Lutomirski :
>>>
>>> I find myself wondering whether an in-kernel *bus* is a good idea at
>>> all.  Creating a bus that unprivileged programs are allowed to
>>> broadcast on (which is kind of the point) opens up big cans of worms.
>>
>> This can be solved in this AF_BUS like this:
>> * Becoming a bus master needs a proper CAP.
>> * Impose a bus endpoint to join multicast address "maddr1" first, if
>> it wants to send to multicast address "maddr2".
>>
>> The bus endpoint sends the request of joining maddr1, and the bus
>> master grants it with replying a cmsg(control message) and setting up
>> a proper eBPF.
>>
>> Next time, the bus endpoint sends to maddr2, the kernel will allow this if:
>> 1) maddr1 & maddr2 == maddr1
>> And 2) the eBPF allows it.
>>  (i.e. the same multicast match logic in this AF_BUS)
>>
>
> I don't understand.
>
> If the endpoint is unprivileged (i.e. random untrusted things can send
> multicast), then you have the scaling problem.  If the endpoint is
> privileged, then it's much less clear to me that this thing is useful.

That means an endpoint has to request the ability to send to a
specific multicast address (i.e. join a multicast group), and it's up to
the bus master whether to grant it or not.
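
The address condition quoted above (condition 1) can be sketched in C as
follows. The function name and the 64-bit address width are assumptions
made for illustration, and the eBPF check (condition 2) is not modeled.
Note that in C the test needs explicit parentheses, because == binds
tighter than &.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: an endpoint that has joined `joined` may send
 * to `dest` only if every bit set in `joined` is also set in `dest`.
 */
static bool maddr_send_allowed(uint64_t joined, uint64_t dest)
{
	return (joined & dest) == joined;
}
```

With this rule, joining 0x1 permits sending to 0x3, while joining 0x3
does not permit sending to 0x1.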



-- 
Regards,

- cee1


Re: [PATCH v9 2/5] mtd: nand: vf610_nfc: add hardware BCH-ECC support

2015-07-31 Thread Brian Norris
On Sat, Aug 01, 2015 at 02:28:06AM +0200, Stefan Agner wrote:
> Did the measurement:
> 
> As is:
> ...
> [   30.955675] mtd_speedtest: testing eraseblock write speed
> [  143.349572] mtd_speedtest: eraseblock write speed is 4641 KiB/s
> [  143.355606] mtd_speedtest: testing eraseblock read speed
> [  183.816690] mtd_speedtest: eraseblock read speed is 12893 KiB/s
> [  185.874702] mtd_speedtest: testing page write speed
> [  302.608719] mtd_speedtest: page write speed is 4468 KiB/s
> [  302.614229] mtd_speedtest: testing page read speed
> [  343.831663] mtd_speedtest: page read speed is 12656 KiB/s
> ...
> 
> Unconditionally read OOB:
> ...
> [   29.076983] mtd_speedtest: testing eraseblock write speed
> [  140.829920] mtd_speedtest: eraseblock write speed is 4667 KiB/s
> [  140.835960] mtd_speedtest: testing eraseblock read speed
> [  181.594498] mtd_speedtest: eraseblock read speed is 12798 KiB/s
> [  183.652793] mtd_speedtest: testing page write speed
> [  299.772069] mtd_speedtest: page write speed is 4492 KiB/s
> [  299.777583] mtd_speedtest: testing page read speed
> [  341.283668] mtd_speedtest: page read speed is 12568 KiB/s
> ...
> 
> And with conditional OOB again, reading OOB if required in
> vf610_nfc_correct_data.
> ...
> [   29.907147] mtd_speedtest: testing eraseblock write speed
> [  141.146171] mtd_speedtest: eraseblock write speed is 4689 KiB/s
> [  141.152185] mtd_speedtest: testing eraseblock read speed
> [  181.644380] mtd_speedtest: eraseblock read speed is 12883 KiB/s
> [  183.703198] mtd_speedtest: testing page write speed
> [  299.423179] mtd_speedtest: page write speed is 4507 KiB/s
> [  299.428671] mtd_speedtest: testing page read speed
> [  340.695925] mtd_speedtest: page read speed is 12640 KiB/s
> [  342.747510] mtd_speedtest: testing 2 page write speed
> ...
> 
> The last test is probably pointless since we never read a empty page in
> the speedtest. So performance hit is measurable but small (somewhat
> below 100KiB/s).
> 
> This is with 64 bytes OOB. Since OOB sizes are only getting bigger, I
> would rather still consider it... What do you think?

If the code isn't that ugly, then keep the conditional OOB. But I'd
consider that performance difference almost negligible, given the noise
for the write tests is on the order of 40 KB/s.
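
The comparison can be checked with a few lines of C. The numbers below
are copied from the three quoted runs (as-is, unconditional OOB read,
conditional OOB read); the helper name is only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Largest minus smallest value in a non-empty array. */
static int spread(const int *v, size_t n)
{
	int lo = v[0], hi = v[0];

	for (size_t i = 1; i < n; i++) {
		if (v[i] < lo)
			lo = v[i];
		if (v[i] > hi)
			hi = v[i];
	}
	return hi - lo;
}

/* KiB/s figures from the quoted mtd_speedtest logs, one entry per run. */
static const int eb_write[]   = { 4641, 4667, 4689 };
static const int page_write[] = { 4468, 4492, 4507 };
static const int eb_read[]    = { 12893, 12798, 12883 };
static const int page_read[]  = { 12656, 12568, 12640 };
```

The write figures, which none of the variants should affect, already
differ by 48 and 39 KiB/s between runs, while the read figures differ by
95 and 88 KiB/s.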

Brian


[PATCH v4 2/4] gpio: brcmstb: Add interrupt and wakeup source support

2015-07-31 Thread Gregory Fong
Uses the gpiolib irqchip helpers.  For this to work, the irq setup
function is called once per bank instead of once per device.  Note
that all known uses of this block have a BCM7120 L2 interrupt
controller as a parent.  Supports interrupts for all GPIOs.

In the IRQ handler, we check for raised IRQs for invalid GPIOs and
warn (ratelimited) if they're encountered.

Also, several drivers (e.g. gpio-keys) allow for GPIOs to be
configured as wakeup sources, and this GPIO controller supports that
through a separate interrupt path.

The de-facto standard DT property "wakeup-source" is checked, since
that indicates whether the GPIO controller hardware can wake.  Uses
the IRQCHIP_MASK_ON_SUSPEND irq_chip flag because UPG GIO doesn't have
any of its own wakeup source configuration.

Aside regarding gpiolib irqchip helpers: It wasn't obvious (to me)
that you can have multiple chained irqchips and associated IRQ domains
for a single parent IRQ, and as long as the xlate function is written
correctly, a GPIO IRQ request ends up checking the correct domain and
will get associated with the correct IRQ.  What helps make this clear
is to read
  drivers/gpio/gpiolib-of.c:
   - of_gpiochip_find_and_xlate()
   - of_get_named_gpiod_flags()
  drivers/gpio/gpiolib.c:
   - gpiochip_find()

Signed-off-by: Gregory Fong 
---
v4:
- when checking parent_irq, use <= 0 or > 0 since 0 is NO_IRQ.

 drivers/gpio/Kconfig|   1 +
 drivers/gpio/gpio-brcmstb.c | 262 +++-
 2 files changed, 257 insertions(+), 6 deletions(-)

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index 8f1fe73..0b77175 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -131,6 +131,7 @@ config GPIO_BRCMSTB
default y if ARCH_BRCMSTB
depends on OF_GPIO && (ARCH_BRCMSTB || COMPILE_TEST)
select GPIO_GENERIC
+   select GPIOLIB_IRQCHIP
help
  Say yes here to enable GPIO support for Broadcom STB (BCM7XXX) SoCs.
 
diff --git a/drivers/gpio/gpio-brcmstb.c b/drivers/gpio/gpio-brcmstb.c
index 4630a81..46cc4e9 100644
--- a/drivers/gpio/gpio-brcmstb.c
+++ b/drivers/gpio/gpio-brcmstb.c
@@ -17,6 +17,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 
 #define GIO_BANK_SIZE   0x20
 #define GIO_ODEN(bank)  (((bank) * GIO_BANK_SIZE) + 0x00)
@@ -34,14 +37,17 @@ struct brcmstb_gpio_bank {
struct bgpio_chip bgc;
struct brcmstb_gpio_priv *parent_priv;
u32 width;
+   struct irq_chip irq_chip;
 };
 
 struct brcmstb_gpio_priv {
struct list_head bank_list;
void __iomem *reg_base;
-   int num_banks;
struct platform_device *pdev;
+   int parent_irq;
int gpio_base;
+   bool can_wake;
+   int parent_wake_irq;
 };
 
 #define MAX_GPIO_PER_BANK   32
@@ -63,6 +69,183 @@ brcmstb_gpio_gc_to_priv(struct gpio_chip *gc)
return bank->parent_priv;
 }
 
+static void brcmstb_gpio_set_imask(struct brcmstb_gpio_bank *bank,
+   unsigned int offset, bool enable)
+{
+   struct bgpio_chip *bgc = &bank->bgc;
+   struct brcmstb_gpio_priv *priv = bank->parent_priv;
+   u32 mask = bgc->pin2mask(bgc, offset);
+   u32 imask;
+   unsigned long flags;
+
+   spin_lock_irqsave(&bgc->lock, flags);
+   imask = bgc->read_reg(priv->reg_base + GIO_MASK(bank->id));
+   if (enable)
+   imask |= mask;
+   else
+   imask &= ~mask;
+   bgc->write_reg(priv->reg_base + GIO_MASK(bank->id), imask);
+   spin_unlock_irqrestore(&bgc->lock, flags);
+}
+
+/*  IRQ chip functions  */
+
+static void brcmstb_gpio_irq_mask(struct irq_data *d)
+{
+   struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
+   struct brcmstb_gpio_bank *bank = brcmstb_gpio_gc_to_bank(gc);
+
+   brcmstb_gpio_set_imask(bank, d->hwirq, false);
+}
+
+static void brcmstb_gpio_irq_unmask(struct irq_data *d)
+{
+   struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
+   struct brcmstb_gpio_bank *bank = brcmstb_gpio_gc_to_bank(gc);
+
+   brcmstb_gpio_set_imask(bank, d->hwirq, true);
+}
+
+static int brcmstb_gpio_irq_set_type(struct irq_data *d, unsigned int type)
+{
+   struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
+   struct brcmstb_gpio_bank *bank = brcmstb_gpio_gc_to_bank(gc);
+   struct brcmstb_gpio_priv *priv = bank->parent_priv;
+   u32 mask = BIT(d->hwirq);
+   u32 edge_insensitive, iedge_insensitive;
+   u32 edge_config, iedge_config;
+   u32 level, ilevel;
+   unsigned long flags;
+
+   switch (type) {
+   case IRQ_TYPE_LEVEL_LOW:
+   level = 0;
+   edge_config = 0;
+   edge_insensitive = 0;
+   break;
+   case IRQ_TYPE_LEVEL_HIGH:
+   level = mask;
+   edge_config = 0;
+   edge_insensitive = 0;
+   break;
+   case IRQ_TYPE_EDGE_FALLING:
+   

[PATCH v4 1/4] dt-bindings: brcmstb-gpio: document properties for wakeup

2015-07-31 Thread Gregory Fong
Some brcmstb GPIO controllers can be used to wake from suspend, so use
the de facto standard property 'wakeup-source' to mark the nodes of
controllers with that capability.

Also document interrupts-extended, which will be used for wakeup
handling because the interrupt parent for the wake IRQ is different
from the regular IRQ.

While we're at it, a few more fixes: We don't actually use the
"interrupt-names" property, so remove it from the listed optional
properties and from the examples.  And since we're modifying the
examples, also follow Brian's suggestions to:
- change #gpio-cells, #interrupt-cells, and brcm,gpio-bank-widths from
  hex to dec
- use phandles

Reviewed-by: Brian Norris 
Acked-by: Florian Fainelli 
Signed-off-by: Gregory Fong 
---
v4: no changes from v3

 .../devicetree/bindings/gpio/brcm,brcmstb-gpio.txt | 35 +-
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/Documentation/devicetree/bindings/gpio/brcm,brcmstb-gpio.txt b/Documentation/devicetree/bindings/gpio/brcm,brcmstb-gpio.txt
index 435f1bc..b405b44 100644
--- a/Documentation/devicetree/bindings/gpio/brcm,brcmstb-gpio.txt
+++ b/Documentation/devicetree/bindings/gpio/brcm,brcmstb-gpio.txt
@@ -33,6 +33,13 @@ Optional properties:
 - interrupt-parent:
 phandle of the parent interrupt controller
 
+- interrupts-extended:
+Alternate form of specifying interrupts and parents that allows for
+multiple parents.  This takes precedence over 'interrupts' and
+'interrupt-parent'.  Wakeup-capable GPIO controllers often route their
+wakeup interrupt lines through a different interrupt controller than the
+primary interrupt line, making this property necessary.
+
 - #interrupt-cells:
 Should be <2>.  The first cell is the GPIO number, the second should specify
 flags.  The following subset of flags is supported:
@@ -47,19 +54,33 @@ Optional properties:
 - interrupt-controller:
 Marks the device node as an interrupt controller
 
-- interrupt-names:
-The name of the IRQ resource used by this controller
+- wakeup-source:
+GPIOs for this controller can be used as a wakeup source
 
 Example:
upg_gio: gpio@f040a700 {
-   #gpio-cells = <0x2>;
-   #interrupt-cells = <0x2>;
+   #gpio-cells = <2>;
+   #interrupt-cells = <2>;
compatible = "brcm,bcm7445-gpio", "brcm,brcmstb-gpio";
gpio-controller;
interrupt-controller;
reg = <0xf040a700 0x80>;
-   interrupt-parent = <0xf>;
+   interrupt-parent = <_intc>;
+   interrupts = <0x6>;
+   brcm,gpio-bank-widths = <32 32 32 24>;
+   };
+
+   upg_gio_aon: gpio@f04172c0 {
+   #gpio-cells = <2>;
+   #interrupt-cells = <2>;
+   compatible = "brcm,bcm7445-gpio", "brcm,brcmstb-gpio";
+   gpio-controller;
+   interrupt-controller;
+   reg = <0xf04172c0 0x40>;
+   interrupt-parent = <_aon_intc>;
interrupts = <0x6>;
-   interrupt-names = "upg_gio";
-   brcm,gpio-bank-widths = <0x20 0x20 0x20 0x18>;
+   interrupts-extended = <_aon_intc 0x6>,
+   <_pm_l2_intc 0x5>;
+   wakeup-source;
+   brcm,gpio-bank-widths = <18 4>;
};
-- 
1.9.1



[PATCH v4 3/4] gpio: brcmstb: support wakeup from S5 cold boot

2015-07-31 Thread Gregory Fong
For wake from S5, we need to:
- register a reboot handler
- set wakeup capability before requesting IRQ so wakeup count is
  incremented
- mask all GPIO IRQs and clear any pending interrupts during driver
  probe, since no driver will yet be registered to handle any IRQs
  carried over from boot at that time, and it's possible that the
  booted kernel does not request the same IRQ anyway.

This means that /sys/.../power/wakeup_count is valid at boot time, and
we can properly account for S5 wakeup stats. e.g.:

  ### After waking from S5 from a GPIO key
  # cat /sys/bus/platform/drivers/brcmstb-gpio/f04172c0.gpio/power/wakeup
  enabled
  # cat /sys/bus/platform/drivers/brcmstb-gpio/f04172c0.gpio/power/wakeup_count
  1

Signed-off-by: Gregory Fong 
---
v4: rename __brcmstb_gpio_irq_set_wake() to brcmstb_gpio_priv_set_wake().

 drivers/gpio/gpio-brcmstb.c | 56 -
 1 file changed, 50 insertions(+), 6 deletions(-)

diff --git a/drivers/gpio/gpio-brcmstb.c b/drivers/gpio/gpio-brcmstb.c
index 46cc4e9..9ea86d2 100644
--- a/drivers/gpio/gpio-brcmstb.c
+++ b/drivers/gpio/gpio-brcmstb.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define GIO_BANK_SIZE   0x20
 #define GIO_ODEN(bank)  (((bank) * GIO_BANK_SIZE) + 0x00)
@@ -48,6 +49,7 @@ struct brcmstb_gpio_priv {
int gpio_base;
bool can_wake;
int parent_wake_irq;
+   struct notifier_block reboot_notifier;
 };
 
 #define MAX_GPIO_PER_BANK   32
@@ -167,10 +169,9 @@ static int brcmstb_gpio_irq_set_type(struct irq_data *d, unsigned int type)
return 0;
 }
 
-static int brcmstb_gpio_irq_set_wake(struct irq_data *d, unsigned int enable)
+static int brcmstb_gpio_priv_set_wake(struct brcmstb_gpio_priv *priv,
+   unsigned int enable)
 {
-   struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
-   struct brcmstb_gpio_priv *priv = brcmstb_gpio_gc_to_priv(gc);
int ret = 0;
 
/*
@@ -188,6 +189,14 @@ static int brcmstb_gpio_irq_set_wake(struct irq_data *d, 
unsigned int enable)
return ret;
 }
 
+static int brcmstb_gpio_irq_set_wake(struct irq_data *d, unsigned int enable)
+{
+   struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
+   struct brcmstb_gpio_priv *priv = brcmstb_gpio_gc_to_priv(gc);
+
+   return brcmstb_gpio_priv_set_wake(priv, enable);
+}
+
 static irqreturn_t brcmstb_gpio_wake_irq_handler(int irq, void *data)
 {
struct brcmstb_gpio_priv *priv = data;
@@ -246,6 +255,19 @@ static void brcmstb_gpio_irq_handler(unsigned int irq, 
struct irq_desc *desc)
chained_irq_exit(chip, desc);
 }
 
+static int brcmstb_gpio_reboot(struct notifier_block *nb,
+   unsigned long action, void *data)
+{
+   struct brcmstb_gpio_priv *priv =
+   container_of(nb, struct brcmstb_gpio_priv, reboot_notifier);
+
+   /* Enable GPIO for S5 cold boot */
+   if (action == SYS_POWER_OFF)
+   brcmstb_gpio_priv_set_wake(priv, 1);
+
+   return NOTIFY_DONE;
+}
+
 /* Make sure that the number of banks matches up between properties */
 static int brcmstb_gpio_sanity_check_banks(struct device *dev,
struct device_node *np, struct resource *res)
@@ -285,6 +307,12 @@ static int brcmstb_gpio_remove(struct platform_device 
*pdev)
if (ret)
dev_err(&pdev->dev, "gpiochip_remove fail in cleanup\n");
}
+   if (priv->reboot_notifier.notifier_call) {
+   ret = unregister_reboot_notifier(&priv->reboot_notifier);
+   if (ret)
+   dev_err(&pdev->dev,
+   "failed to unregister reboot notifier\n");
+   }
return ret;
 }
 
@@ -342,7 +370,16 @@ static int brcmstb_gpio_irq_setup(struct platform_device 
*pdev,
dev_warn(dev,
"Couldn't get wake IRQ - GPIOs will not be able 
to wake from sleep");
} else {
-   int err = devm_request_irq(dev, priv->parent_wake_irq,
+   int err;
+
+   /*
+* Set wakeup capability before requesting wakeup
+* interrupt, so we can process boot-time "wakeups"
+* (e.g., from S5 cold boot)
+*/
+   device_set_wakeup_capable(dev, true);
+   device_wakeup_enable(dev);
+   err = devm_request_irq(dev, priv->parent_wake_irq,
brcmstb_gpio_wake_irq_handler, 0,
"brcmstb-gpio-wake", priv);
 
@@ -351,8 +388,9 @@ static int brcmstb_gpio_irq_setup(struct platform_device 
*pdev,
return err;
}
 
-   device_set_wakeup_capable(dev, true);
-   device_wakeup_enable(dev);
+   

[PATCH v4 4/4] ARM: dts: brcmstb: add BCM7445 GPIO nodes

2015-07-31 Thread Gregory Fong
Need the aon_pm_l2_intc and irq0_aon_intc descriptions, so included
those as well.

Signed-off-by: Gregory Fong 
---
New in v4.

 arch/arm/boot/dts/bcm7445.dtsi | 50 ++
 1 file changed, 50 insertions(+)

diff --git a/arch/arm/boot/dts/bcm7445.dtsi b/arch/arm/boot/dts/bcm7445.dtsi
index 58dcd66..3b6b175 100644
--- a/arch/arm/boot/dts/bcm7445.dtsi
+++ b/arch/arm/boot/dts/bcm7445.dtsi
@@ -109,6 +109,20 @@
brcm,int-fwd-mask = <0x7>;
};
 
+   irq0_aon_intc: interrupt-controller@417280 {
+   compatible = "brcm,bcm7120-l2-intc";
+   reg = <0x417280 0x8>;
+   interrupt-parent = <>;
+   #interrupt-cells = <1>;
+   interrupt-controller;
+   interrupts = ,
+,
+;
+   brcm,int-map-mask = <0x1e3 0x1800 0x10>;
+   brcm,int-fwd-mask = <0x0>;
+   brcm,irq-can-wake;
+   };
+
hif_intr2_intc: interrupt-controller@3e1000 {
compatible = "brcm,l2-intc";
reg = <0x3e1000 0x30>;
@@ -119,6 +133,16 @@
interrupt-names = "hif";
};
 
+aon_pm_l2_intc: interrupt-controller@410640 {
+   compatible = "brcm,l2-intc";
+   reg = <0x410640 0x30>;
+   interrupt-controller;
+   #interrupt-cells = <1>;
+   interrupts = ;
+   interrupt-parent = <>;
+   brcm,irq-can-wake;
+   };
+
nand: nand@3e2800 {
status = "disabled";
#address-cells = <1>;
@@ -167,6 +191,32 @@
#phy-cells = <0>;
};
};
+
+   upg_gio: gpio@40a700 {
+   compatible = "brcm,bcm7445-gpio", "brcm,brcmstb-gpio";
+   reg = <0x40a700 0x80>;
+   #gpio-cells = <2>;
+   #interrupt-cells = <2>;
+   gpio-controller;
+   interrupt-controller;
+   interrupt-parent = <_intc>;
+   interrupts = <6>;
+   brcm,gpio-bank-widths = <32 32 32 24>;
+   };
+
+   upg_gio_aon: gpio@4172c0 {
+   compatible = "brcm,bcm7445-gpio", "brcm,brcmstb-gpio";
+   reg = <0x4172c0 0x40>;
+   #gpio-cells = <2>;
+   #interrupt-cells = <2>;
+   gpio-controller;
+   interrupt-controller;
+   interrupts-extended = <&irq0_aon_intc 0x6>,
+ <&aon_pm_l2_intc 0x5>;
+   wakeup-source;
+   brcm,gpio-bank-widths = <18 4>;
+   };
+
};
 
smpboot {
-- 
1.9.1
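
A hedged sketch of how the two GPIO nodes above hang together: with the
0x20-byte bank size and the 32-line-per-bank limit from gpio-brcmstb.c,
the size of the reg window should give the same bank count as the
brcm,gpio-bank-widths list. The demo_* names are hypothetical, and the
range check on each width is an assumption, not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_GIO_BANK_SIZE     0x20u /* same value as GIO_BANK_SIZE in gpio-brcmstb.c */
#define DEMO_MAX_GPIO_PER_BANK 32u   /* same value as MAX_GPIO_PER_BANK */

/*
 * Returns the total number of GPIO lines described by a node, or -1 if
 * the register window and the bank-width list disagree or a width is
 * out of range.
 */
static int demo_count_gpios(uint32_t reg_size, const uint32_t *widths,
			    size_t num_widths)
{
	size_t i;
	int total = 0;

	assert(widths != NULL);

	/* One register bank per entry in the width list. */
	if (reg_size / DEMO_GIO_BANK_SIZE != num_widths)
		return -1;

	for (i = 0; i < num_widths; i++) {
		if (widths[i] == 0 || widths[i] > DEMO_MAX_GPIO_PER_BANK)
			return -1;
		total += (int)widths[i];
	}
	return total;
}
```

For upg_gio (reg size 0x80, widths 32 32 32 24) this gives four banks and
120 lines; for upg_gio_aon (reg size 0x40, widths 18 4) two banks and 22
lines.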



[PATCH v4 0/4] GPIO support for BRCMSTB

2015-07-31 Thread Gregory Fong
Adds interrupt support for the GPIO controller (UPG GIO) used on Broadcom's
various BRCMSTB SoCs (BCM7XXX and others).  For all existing hardware, this
block hooks up to the BCM7120 L2 IRQ controller and so will require
CONFIG_BCM7120_L2_IRQ=y.

New in v4:
- add nodes for the BRCMSTB GPIO controller to the BCM7445 dts file
- a few improvements suggested by Linus Walleij on v3
- remove unused 'irq' argument to brcmstb_gpio_irq_bank_handler()

The following are not included in this patchset:
- Initial device tree bindings (merged from v1 to GPIO tree)
- Initial GPIO support w/o interrupts (merged from v2 to GPIO tree)
- ARM Kconfig changes (merged from v2 to arm-soc tree)
- fix for null ptr deref in driver remove (merged from v3)

Previous versions:
v1: https://lkml.org/lkml/2015/5/6/199
v2: https://lkml.org/lkml/2015/5/28/853
v3: https://lkml.org/lkml/2015/6/17/960

Gregory Fong (4):
  dt-bindings: brcmstb-gpio: document properties for wakeup
  gpio: brcmstb: Add interrupt and wakeup source support
  gpio: brcmstb: support wakeup from S5 cold boot
  ARM: dts: brcmstb: add BCM7445 GPIO nodes

 .../devicetree/bindings/gpio/brcm,brcmstb-gpio.txt |  35 ++-
 arch/arm/boot/dts/bcm7445.dtsi |  50 
 drivers/gpio/Kconfig   |   1 +
 drivers/gpio/gpio-brcmstb.c| 306 -
 4 files changed, 379 insertions(+), 13 deletions(-)

-- 
1.9.1



Re: [PATCH] net: switchdev: restrict vid range abstraction

2015-07-31 Thread Vivien Didelot
Hi Scott,

On Jul 29, 2015, at 5:17 PM, Scott Feldman sfel...@gmail.com wrote:

> On Wed, Jul 29, 2015 at 12:14 PM, Vivien Didelot
>  wrote:
>> Hi Scott, David,
>>
>> On Jul 29, 2015, at 2:28 PM, David da...@davemloft.net wrote:
>>
>>> From: Scott Feldman 
>>> Date: Wed, 29 Jul 2015 00:31:44 -0700
>>>
>>>> Since the netlink request (for example vlan add) includes the range,
>>>> I'm not seeing how we can respond with success for the satisfied
>>>> vlans in the range, and also respond with an error for the unsatisfied
>>>> vlans in the range.   In other words, from the netlink msgs
>>>> perspective, we need to treat a vlan range as all-or-nothing.  So in
>>>> your example, if hw can't add vlan 2, we fail the entire request to
>>>> add range 2-5.  This is where the prepare phase checks to make sure
>>>> the entire request can be satisfied before committing to hw.
>>
>> I made this change in order to start restricting the bridge abstraction
>> to switchdev, since IMHO its info flags do not add much value to the
>> switch chip drivers perspective.
>>
>> While a range might be convenient to a user, exposing it to drivers is
>> likely to end up writing the same vid_begin to vid_end for loop.
>>
>>> This was my concern with the change as well.
>>>
>>> The user asked for the range to be installed, so if any portion
>>> of it cannot be done we must not make any changes to the HW
>>> configuration and fail the entire request.
>>
>> I understand the concern with the netlink request.
>>
>> However, this can be confusing to someone. With the previous example:
>>
>> bridge vlan add dev port0 vid 2-5 master
>>
>> must fail for the entire range (due to the single netlink request). But:
>>
>> bridge vlan add dev port0 vid 2 master
>>
>> will silently fall back to software VLAN (assuming that the driver
>> correctly returned -EOPNOTSUPP in the prepare phase). In other words, no
>> changes have been committed to the hardware.
> 
> I see your concern now, I think.  net/bridge/br_netlink.c:br_afspec()
> does the range loop but doesn't rewind if something goes wrong with
> one of the vlans in the range.  The call into switchdev is
> one-at-a-time at that point.  If br_afspec() handled the rewind, would
> this address your concern?  We can keep the range support in the
> switchdev vlan obj, so 'self' can use it.

I am not sure if the rewind is needed. My concern was trying to handle
the fallback to software VLAN for a single VID within a range, so that
we can free a switch chip driver from this bridge-specific notion. But
because of the single netlink request, that does not seem possible.

At which level does this fallback happen exactly?

Thanks,
-v


Re: [RFC] alpha: use asm-generic/io.h

2015-07-31 Thread Guenter Roeck

On 07/31/2015 05:38 PM, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" 
>
> Since alpha does not include asm-generic/io.h it would mean
> alpha folks have to always carefully monitor asm-generic patches
> and before they get merged make sure their own arch implementation
> solution gets added. By using asm-generic/io.h alpha gets sensible
> defaults, in this case ioremap_uc() would be one example, where
> by default it would return NULL, so not implemented. When alpha
> folks get a chance then they can add the appropriate
> implementation.
>
> Reported-by: Guenter Roeck 
> Cc: Paul Gortmaker 
> Signed-off-by: Luis R. Rodriguez 

Doesn't work. Gives me lots of duplicate symbols.

Guenter

> ---
>
> The easy solution *for now* is to just do:
>
> #define ioremap_uc ioremap_nocache
>
> *But* that would have to be done for any other asm-generic/io.h collateral
> evolution, this on the other hand, would get it right for alpha from the
> get-go -- so I ask -- can this please be tested and if it is OK then consider
> it be merged?
>
> If we can't add asm-generic/io.h then the above define replacement would
> be the way to go.
>
>  arch/alpha/include/asm/io.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/alpha/include/asm/io.h b/arch/alpha/include/asm/io.h
> index f05bdb4b1cb9..16a5bda42750 100644
> --- a/arch/alpha/include/asm/io.h
> +++ b/arch/alpha/include/asm/io.h
> @@ -581,4 +581,6 @@ extern void outsl (unsigned long port, const void *src, 
> unsigned long count);
>  
>  #endif /* __KERNEL__ */
>  
> +#include <asm-generic/io.h>
> +
>  #endif /* __ALPHA_IO_H */





Re: [PATCH 0/4] ACPI: Reorganize the core device management/enumeration code

2015-07-31 Thread Rafael J. Wysocki
On Thursday, July 30, 2015 12:54:39 PM Mika Westerberg wrote:
> On Tue, Jul 14, 2015 at 01:16:12AM +0200, Rafael J. Wysocki wrote:
> > Hi,
> > 
> > The size and complexity of drivers/acpi/scan.c has bothered me for quite
> > a while and since I'm working on a feature that will benefit from exposing
> > some additional information under ACPI device objects in sysfs, I've decided
> > to reorganize the code to move some things out of scan.c.
> > 
> > [1/4] Move all ACPI device code related to sysfs into a separate (new) file.
> > [2/4] Move device matching code to bus.c.
> > [3/4] Move device notification, bus operations and driver management code
> > to bus.c.
> > [4/4] Register the ACPI bus type in acpi_bus_init().
> > 
> > No functional changes except for the [4/4] that will cause ACPI to be
> > disabled if the ACPI bus type cannot be registered.
> 
> Just noticed this series. If it is not too late, you can add my
> 
> Reviewed-by: Mika Westerberg 
> 
> to the whole series.

I've merged Lee Jones' MFD branch on top of it and I'd really prefer to
avoid repeating that merge. :-)

Thanks anyway though.

Rafael



Re: [PATCH v5] powerpc/rcpm: add RCPM driver

2015-07-31 Thread Scott Wood
On Fri, 2015-06-26 at 15:44 +0800, yuantian.t...@freescale.com wrote:
> +static void rcpm_v1_set_ip_power(bool enable, u32 *mask)
> +{
> + if (enable)
> + setbits32(_v1_regs->ippdexpcr, *mask);
> + else
> + clrbits32(_v1_regs->ippdexpcr, *mask);
> +}
> +
> +static void rcpm_v2_set_ip_power(bool enable, u32 *mask)
> +{
> + if (enable)
> + setbits32(_v2_regs->ippdexpcr[0], *mask);
> + else
> + clrbits32(_v2_regs->ippdexpcr[0], *mask);
> +}

Why do these take "u32 *mask" instead of "u32 mask"?

-Scott



Re: [PATCH 3/4] powerpc: pm: add EPU FSM configuration for deep sleep

2015-07-31 Thread Scott Wood
On Fri, 2015-07-31 at 20:53 +0800, Chenhui Zhao wrote:
> In the last stage of deep sleep, software will trigger a Finite
> State Machine (FSM) to control the hardware procedure, such as
> board isolation, killing PLLs, removing power, and so on.
> 
> When the system is woken up by an interrupt, the FSM controls the
> hardware to complete the early resume procedure.
> 
> This patch configures the EPU FSM in preparation for deep sleep.
> 
> Signed-off-by: Chenhui Zhao 
> ---
>  arch/powerpc/platforms/85xx/Makefile|   2 +-
>  arch/powerpc/platforms/85xx/sleep_fsm.c | 256 
> 
>  arch/powerpc/platforms/85xx/sleep_fsm.h | 104 +
>  3 files changed, 361 insertions(+), 1 deletion(-)
>  create mode 100644 arch/powerpc/platforms/85xx/sleep_fsm.c
>  create mode 100644 arch/powerpc/platforms/85xx/sleep_fsm.h

When I asked why this was in drivers/platform[1], you said it was to share 
with LS1, and that the values used were the same -- so why did you move it to 
arch/powerpc?

[1] Note that other proposed patches create a drivers/soc/fsl instead of 
drivers/platform/fsl...  We need one of them, not both.

> +void fsl_fsm_setup(void __iomem *base, struct fsm_reg_vals *val)
> +{
> + struct fsm_reg_vals *data = val;
> +
> + BUG_ON(!base || !data);

This BUG_ON is useless.  If one of those is NULL you'll get an oops anyway.

> diff --git a/arch/powerpc/platforms/85xx/sleep_fsm.h 
> b/arch/powerpc/platforms/85xx/sleep_fsm.h
> new file mode 100644
> index 000..2c60b40
> --- /dev/null
> +++ b/arch/powerpc/platforms/85xx/sleep_fsm.h
> @@ -0,0 +1,104 @@
> +/*
> + * Freescale deep sleep FSM (finite-state machine) configuration
> + *
> + * Copyright 2015 Freescale Semiconductor Inc.
> + *
> + * This program is free software; you can redistribute  it and/or modify it
> + * under  the terms of  the GNU General  Public License as published by the
> + * Free Software Foundation;  either version 2 of the  License, or (at your
> + * option) any later version.
> + */
> +#ifndef _FSL_SLEEP_FSM_H
> +#define _FSL_SLEEP_FSM_H
> +
> +#define FSL_STRIDE_4B 4
> +#define FSL_STRIDE_8B 8

Why not just use 4/8 directly?

> 
> +/* Block offsets */
> +#define RCPM_BLOCK_OFFSET 0x00022000
> +#define EPU_BLOCK_OFFSET 0x
> +#define NPC_BLOCK_OFFSET 0x1000

I thought you said OK to not putting these offsets in the kernel source...

-Scott



[RFC] alpha: use asm-generic/io.h

2015-07-31 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Since alpha does not include asm-generic/io.h it would mean
alpha folks have to always carefully monitor asm-generic patches
and before they get merged make sure their own arch implementation
solution gets added. By using asm-generic/io.h alpha gets sensible
defaults, in this case ioremap_uc() would be one example, where
by default it would return NULL, so not implemented. When alpha
folks get a chance then they can add the appropriate
implementation.

Reported-by: Guenter Roeck 
Cc: Paul Gortmaker 
Signed-off-by: Luis R. Rodriguez 
---

The easy solution *for now* is to just do:

#define ioremap_uc ioremap_nocache

*But* that would have to be done for any other asm-generic/io.h collateral
evolution, this on the other hand, would get it right for alpha from the
get-go -- so I ask -- can this please be tested and if it is OK then consider
it be merged?

If we can't add asm-generic/io.h then the above define replacement would
be the way to go.

 arch/alpha/include/asm/io.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/alpha/include/asm/io.h b/arch/alpha/include/asm/io.h
index f05bdb4b1cb9..16a5bda42750 100644
--- a/arch/alpha/include/asm/io.h
+++ b/arch/alpha/include/asm/io.h
@@ -581,4 +581,6 @@ extern void outsl (unsigned long port, const void *src, 
unsigned long count);
 
 #endif /* __KERNEL__ */
 
+#include <asm-generic/io.h>
+
 #endif /* __ALPHA_IO_H */
-- 
2.3.2.209.gd67f9d5.dirty
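
For readers unfamiliar with the "sensible defaults" mechanism, here is a
userspace sketch of the override convention it depends on: the generic
header defines a helper only if no macro of the same name exists, so an
architecture has to announce each helper it provides with a matching
#define. All demo_* names are hypothetical. One plausible reading of a
"duplicate symbols" failure is an arch header that defines helpers
without announcing them, but that is an assumption, not something
confirmed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* --- "arch" part: provides its own nocache mapping and announces it --- */
static char demo_io_window[64]; /* hypothetical stand-in for a device mapping */

static inline void *demo_ioremap_nocache(unsigned long offset, size_t size)
{
	(void)offset;
	assert(size > 0);
	return size <= sizeof(demo_io_window) ? demo_io_window : NULL;
}
#define demo_ioremap_nocache demo_ioremap_nocache

/* --- "generic" part: fills in only what the arch left undefined --- */
#ifndef demo_ioremap_nocache
#define demo_ioremap_nocache demo_ioremap_nocache
static inline void *demo_ioremap_nocache(unsigned long offset, size_t size)
{
	(void)offset; (void)size;
	return NULL;
}
#endif

#ifndef demo_ioremap_uc
#define demo_ioremap_uc demo_ioremap_uc
static inline void *demo_ioremap_uc(unsigned long offset, size_t size)
{
	(void)offset; (void)size;
	return NULL; /* "not implemented" default */
}
#endif
```

Removing the `#define demo_ioremap_nocache demo_ioremap_nocache` line
from the arch part makes the generic part define the function a second
time, and the compiler rejects the redefinition.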



Re: [PATCH v1] hwmon: (nct7802) Add device tree support

2015-07-31 Thread Guenter Roeck

Hi Constantine,

On 07/31/2015 03:00 PM, Constantine Shulyupin wrote:

Please add a description of what you are doing here.


Signed-off-by: Constantine Shulyupin 
---
The first trial.
Question: how to configure local temp4 (EnLTD)?
Allow "temp4_type = <3>" (EnLTD=3-2=1) or "temp4_enable = <1>" or else?


I don't see a reason to disable it. After all, it is always present.

Please make sure you copy the devicetree mailing list and the devicetree
maintainers for discussing devicetree properties. You can use
scripts/get_maintainer.pl to determine who needs to be copied.

The limited scope of the properties suggests that you might plan to submit
further patches to add more properties. Specifically, the chip also has
configurable voltage sensors, fan status, and fan control, for which
you would probably need properties as well. Splitting patch submission
for the properties into multiple chunks will make it difficult to review
for the devicetree maintainers. I would suggest to determine all required
bindings and submit at least the complete bindings document in one go.


---
  .../devicetree/bindings/hwmon/nct7802.txt  | 28 
  .../devicetree/bindings/vendor-prefixes.txt|  1 +
  drivers/hwmon/nct7802.c| 52 +-


You should probably split this into three patches.

- Add vendor ID to vendor prefixes
- Add devicetree properties
- Add implementation


  3 files changed, 71 insertions(+), 10 deletions(-)
  create mode 100644 Documentation/devicetree/bindings/hwmon/nct7802.txt

diff --git a/Documentation/devicetree/bindings/hwmon/nct7802.txt 
b/Documentation/devicetree/bindings/hwmon/nct7802.txt
new file mode 100644
index 000..568d3aa
--- /dev/null
+++ b/Documentation/devicetree/bindings/hwmon/nct7802.txt
@@ -0,0 +1,28 @@
+Nuvoton NCT7802Y Hardware Monitoring IC
+
+Required node properties:
+
+ - "compatible": must be "nuvoton,nct7802"
+ - "reg": I2C bus address of the device
+
+Optional properties:
+
+ - temp1_type
+ - temp2_type
+ - temp3_type


Please use '-' instead of '_', and use full words.
Not sure how to enumerate the different sensors - looking for advice from
devicetree maintainers.


+
+Valid values:
+
+ 0 - disabled
+ 3 - thermal diode
+ 4 - thermistor


The numbering ties into implementation details (sysfs representation). This is
not desirable for devicetree properties, which are supposed to be OS and
implementation independent.

It might make sense to use strings here. 'disabled' seems redundant, in a way -
a temperature sensor might be considered disabled if it is not listed.

Another option might be to have a single property, such as

temperature-sensors = <0, 1, 2, 1>;

where each value indicates one of the sensors, with
0 - disabled
1 - diode
2 - thermistor

I don't really have a strong opinion, though. Again looking for advice
from devicetree maintainers.


+
+Example nct7802 node:
+
+nct7802 {
+   compatible = "nuvoton,nct7802";
+   reg = <0x2a>;
+   temp1_type = <4>;
+   temp2_type = <4>;
+   temp3_type = <4>;
+};
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt 
b/Documentation/devicetree/bindings/vendor-prefixes.txt
index 181b53e..821e000 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -149,6 +149,7 @@ netxeon Shenzhen Netxeon Technology CO., LTD
  newhaven  Newhaven Display International
  nintendo  Nintendo
  nokia Nokia
+nuvotonNuvoton
  nvidiaNVIDIA
  nxp   NXP Semiconductors
  onnn  ON Semiconductor Corp.
diff --git a/drivers/hwmon/nct7802.c b/drivers/hwmon/nct7802.c
index 3ce33d2..2be995d 100644
--- a/drivers/hwmon/nct7802.c
+++ b/drivers/hwmon/nct7802.c
@@ -84,24 +84,30 @@ static ssize_t show_temp_type(struct device *dev, struct 
device_attribute *attr,
return sprintf(buf, "%u\n", (mode >> (2 * sattr->index) & 3) + 2);
  }

+int set_temp_type(struct nct7802_data *data, int index, u8 type)
+{
+   if (index == 2 && type != 4) /* RD3 */
+   return -EINVAL;
+   if ((type > 0 && type < 3) || type > 4)
+   return -EINVAL;
+   return regmap_update_bits(data->regmap, REG_MODE,
+ 3 << 2 * index,
+ (type ? type - 2 : 0) << 2 * index);
+}
+
  static ssize_t store_temp_type(struct device *dev,
   struct device_attribute *attr,
   const char *buf, size_t count)
  {
struct nct7802_data *data = dev_get_drvdata(dev);
struct sensor_device_attribute *sattr = to_sensor_dev_attr(attr);
-   unsigned int type;
+   u8 type;
int err;

-   err = kstrtouint(buf, 0, &type);
+   err = kstrtou8(buf, 0, &type);
if (err < 0)
return err;
-   if (sattr->index == 2 && type != 4) /* RD3 */
-   return -EINVAL;
-   if (type 

Re: [PATCH v9 2/5] mtd: nand: vf610_nfc: add hardware BCH-ECC support

2015-07-31 Thread Stefan Agner
On 2015-08-01 01:47, Brian Norris wrote:
> On Sat, Aug 01, 2015 at 01:35:52AM +0200, Stefan Agner wrote:
>> On 2015-08-01 01:09, Brian Norris wrote:
> 
>> >> +static int vf610_nfc_read_page(struct mtd_info *mtd, struct nand_chip 
>> >> *chip,
>> >> + uint8_t *buf, int oob_required, int page)
>> >> +{
>> >> + int eccsize = chip->ecc.size;
>> >> + int stat;
>> >> +
>> >> + vf610_nfc_read_buf(mtd, buf, eccsize);
>> >> +
>> >> + if (oob_required)
>> >> + vf610_nfc_read_buf(mtd, chip->oob_poi, mtd->oobsize);
>> >
>> > To fix the bitflips issue above, you'll just want to unconditionally
>> > read the OOB (it's fine to ignore 'oob_required') and...
>> >
>> >> +
>> >> + stat = vf610_nfc_correct_data(mtd, buf);
>> >
>> > ...pass in chip->oob_poi as a third argument.
>> >
>>
>> Hm, this probably will have an effect on performance, since we usually
>> omit the OOB if not requested.
> 
> You could test :) I don't really like performance claims without tests.
> (I say this because I added the oob_required flag myself, but just for
> functional purposes, not performance. Many drivers got by just fine by
> always copying the OOB data.)

Did the measurement:

As is:
...
[   30.955675] mtd_speedtest: testing eraseblock write speed
[  143.349572] mtd_speedtest: eraseblock write speed is 4641 KiB/s
[  143.355606] mtd_speedtest: testing eraseblock read speed
[  183.816690] mtd_speedtest: eraseblock read speed is 12893 KiB/s
[  185.874702] mtd_speedtest: testing page write speed
[  302.608719] mtd_speedtest: page write speed is 4468 KiB/s
[  302.614229] mtd_speedtest: testing page read speed
[  343.831663] mtd_speedtest: page read speed is 12656 KiB/s
...

Unconditionally read OOB:
...
[   29.076983] mtd_speedtest: testing eraseblock write speed
[  140.829920] mtd_speedtest: eraseblock write speed is 4667 KiB/s
[  140.835960] mtd_speedtest: testing eraseblock read speed
[  181.594498] mtd_speedtest: eraseblock read speed is 12798 KiB/s
[  183.652793] mtd_speedtest: testing page write speed
[  299.772069] mtd_speedtest: page write speed is 4492 KiB/s
[  299.777583] mtd_speedtest: testing page read speed
[  341.283668] mtd_speedtest: page read speed is 12568 KiB/s
...

And with conditional OOB again, reading OOB if required in
vf610_nfc_correct_data.
...
[   29.907147] mtd_speedtest: testing eraseblock write speed
[  141.146171] mtd_speedtest: eraseblock write speed is 4689 KiB/s
[  141.152185] mtd_speedtest: testing eraseblock read speed
[  181.644380] mtd_speedtest: eraseblock read speed is 12883 KiB/s
[  183.703198] mtd_speedtest: testing page write speed
[  299.423179] mtd_speedtest: page write speed is 4507 KiB/s
[  299.428671] mtd_speedtest: testing page read speed
[  340.695925] mtd_speedtest: page read speed is 12640 KiB/s
[  342.747510] mtd_speedtest: testing 2 page write speed
...

The last test is probably pointless since we never read an empty page in
the speedtest. So the performance hit is measurable but small (somewhat
below 100 KiB/s).

This is with 64 bytes OOB. Since OOB sizes are only getting bigger, I
would rather still consider it... What do you think?

--
Stefan
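
The "somewhat below 100 KiB/s" estimate can be checked with a few lines
of arithmetic on the read figures from the first two runs. The numbers
are copied from the logs in this message; the demo_* names are
hypothetical.

```c
#include <assert.h>

/* Read throughput in KiB/s, copied from the mtd_speedtest logs. */
struct demo_speed {
	int eraseblock_read;
	int page_read;
};

static const struct demo_speed demo_as_is      = { 12893, 12656 };
static const struct demo_speed demo_always_oob = { 12798, 12568 };

/* Absolute slowdown in KiB/s (positive means the second run is slower). */
static int demo_slowdown(int before, int after)
{
	return before - after;
}

/* Slowdown in hundredths of a percent of the baseline, rounded down. */
static int demo_slowdown_bp(int before, int after)
{
	assert(before > 0);
	return (before - after) * 10000 / before;
}
```

Eraseblock reads drop by 95 KiB/s and page reads by 88 KiB/s, which is
roughly 0.7% of the baseline in both cases.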


Re: [PATCH 2/3] PowerPC/mpc85xx: Add hotplug support on E5500 and E500MC cores

2015-07-31 Thread Scott Wood
On Fri, 2015-07-31 at 17:20 +0800, b29...@freescale.com wrote:
> From: Tang Yuantian 
> 
> Freescale E500MC and E5500 core-based platforms, like P4080, T1040,
> support disabling/enabling CPU dynamically.
> This patch adds this feature on those platforms.
> 
> Signed-off-by: Chenhui Zhao 
> Signed-off-by: Tang Yuantian 
> ---
>  arch/powerpc/Kconfig  |  2 +-
>  arch/powerpc/include/asm/smp.h|  1 +
>  arch/powerpc/kernel/smp.c |  5 +
>  arch/powerpc/platforms/85xx/smp.c | 39 
> ---
>  4 files changed, 39 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 5ef2711..dd9e252 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -386,7 +386,7 @@ config SWIOTLB
>  config HOTPLUG_CPU
>   bool "Support for enabling/disabling CPUs"
>   depends on SMP && (PPC_PSERIES || \
> - PPC_PMAC || PPC_POWERNV || (PPC_85xx && !PPC_E500MC))
> + PPC_PMAC || PPC_POWERNV || FSL_SOC_BOOKE)
>   ---help---
> Say Y here to be able to disable and re-enable individual
> CPUs at runtime on SMP machines.



> diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
> index 825663c..bf37d17 100644
> --- a/arch/powerpc/include/asm/smp.h
> +++ b/arch/powerpc/include/asm/smp.h
> @@ -67,6 +67,7 @@ void generic_cpu_die(unsigned int cpu);
>  void generic_set_cpu_dead(unsigned int cpu);
>  void generic_set_cpu_up(unsigned int cpu);
>  int generic_check_cpu_restart(unsigned int cpu);
> +int generic_check_cpu_dead(unsigned int cpu);
>  #endif
>  
>  #ifdef CONFIG_PPC64
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index ec9ec20..2cca27a 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -454,6 +454,11 @@ int generic_check_cpu_restart(unsigned int cpu)
>   return per_cpu(cpu_state, cpu) == CPU_UP_PREPARE;
>  }
>  
> +int generic_check_cpu_dead(unsigned int cpu)
> +{
> + return per_cpu(cpu_state, cpu) == CPU_DEAD;
> +}

Is there a non-generic check_cpu_dead()?

It gets open-coded in generic_cpu_die()... Either open-code it elsewhere, or 
call it check_cpu_dead() and use it everywhere there's a CPU_DEAD check.


> +
>  static bool secondaries_inhibited(void)
>  {
>   return kvm_hv_mode_active();
> diff --git a/arch/powerpc/platforms/85xx/smp.c 
> b/arch/powerpc/platforms/85xx/smp.c
> index 6811a5b..7f0dadb 100644
> --- a/arch/powerpc/platforms/85xx/smp.c
> +++ b/arch/powerpc/platforms/85xx/smp.c
> @@ -42,6 +42,7 @@ struct epapr_spin_table {
>   u32 pir;
>  };
>  
> +#ifdef CONFIG_HOTPLUG_CPU
>  static u64 timebase;
>  static int tb_req;
>  static int tb_valid;
> @@ -111,7 +112,7 @@ static void mpc85xx_take_timebase(void)
>   local_irq_restore(flags);
>  }
>  
> -#ifdef CONFIG_HOTPLUG_CPU
> +#ifndef CONFIG_PPC_E500MC
>  static void e500_cpu_idle(void)

What happens if we bisect to patch 1/3 and run this on e500mc?

Please move the ifdef to that patch.

>  {
>   u32 tmp;
> @@ -127,6 +128,7 @@ static void e500_cpu_idle(void)
>   mtmsr(tmp);
>   isync();
>  }
> +#endif
>  
>  static void qoriq_cpu_dying(void)
>  {
> @@ -144,11 +146,30 @@ static void qoriq_cpu_dying(void)
>  
>   generic_set_cpu_dead(cpu);
>  
> +#ifndef CONFIG_PPC_E500MC
>   e500_cpu_idle();
> +#endif
>  
>   while (1)
>   ;
>  }
> +
> +static void qoriq_real_cpu_die(unsigned int cpu)

Real as opposed to...?

> +{
> + int i;
> +
> + for (i = 0; i < 5; i++) {
> + if (generic_check_cpu_dead(cpu)) {
> + qoriq_pm_ops->cpu_die(cpu);
> +#ifdef CONFIG_PPC64
> + paca[cpu].cpu_start = 0;
> +#endif
> + return;
> + }
> + udelay(10);
> + }
> + pr_err("%s: CPU%d didn't die...\n", __func__, cpu);
> +}

Only 500ms timeout, versus 10sec in generic_cpu_die()?

>  #endif
>  
>  static inline void flush_spin_table(void *spin_table)
> @@ -246,11 +267,7 @@ static int smp_85xx_kick_cpu(int nr)
>   spin_table = phys_to_virt(*cpu_rel_addr);
>  
>   local_irq_save(flags);
> -#ifdef CONFIG_PPC32
>  #ifdef CONFIG_HOTPLUG_CPU
> - /* Corresponding to generic_set_cpu_dead() */
> - generic_set_cpu_up(nr);
> -
>   if (system_state == SYSTEM_RUNNING) {
>   /*
>* To keep it compatible with old boot program which uses
> @@ -263,6 +280,7 @@ static int smp_85xx_kick_cpu(int nr)
>   out_be32(&spin_table->addr_l, 0);
>   flush_spin_table(spin_table);
>  
> + qoriq_pm_ops->cpu_up(nr);

Again, is it possible to get here without a valid qoriq_pm_ops (i.e. is there 
anything stopping the user from trying to initiate CPU hotplug)?

-Scott


Re: [PATCH 3/3] PowerPC/mpc85xx: Add hotplug support on E6500 cores

2015-07-31 Thread Scott Wood
On Fri, 2015-07-31 at 17:20 +0800, b29...@freescale.com wrote:
> diff --git a/arch/powerpc/platforms/85xx/smp.c 
> b/arch/powerpc/platforms/85xx/smp.c
> index 7f0dadb..8652a49 100644
> --- a/arch/powerpc/platforms/85xx/smp.c
> +++ b/arch/powerpc/platforms/85xx/smp.c
> @@ -189,15 +189,22 @@ static inline u32 read_spin_table_addr_l(void 
> *spin_table)
>  static void wake_hw_thread(void *info)
>  {
>   void fsl_secondary_thread_init(void);
> - unsigned long imsr1, inia1;
> + unsigned long imsr, inia;
>   int nr = *(const int *)info;
> -
> - imsr1 = MSR_KERNEL;
> - inia1 = *(unsigned long *)fsl_secondary_thread_init;
> -
> - mttmr(TMRN_IMSR1, imsr1);
> - mttmr(TMRN_INIA1, inia1);
> - mtspr(SPRN_TENS, TEN_THREAD(1));
> + int hw_cpu = get_hard_smp_processor_id(nr);
> + int thread_idx = cpu_thread_in_core(hw_cpu);
> +
> + booting_cpu_hwid = (u32)hw_cpu;

Unnecessary cast.  Please explain why you need booting_cpu_hwid.

> + imsr = MSR_KERNEL;
> + inia = *(unsigned long *)fsl_secondary_thread_init;
> + if (thread_idx == 0) {
> + mttmr(TMRN_IMSR0, imsr);
> + mttmr(TMRN_INIA0, inia);
> + } else {
> + mttmr(TMRN_IMSR1, imsr);
> + mttmr(TMRN_INIA1, inia);
> + }
> + mtspr(SPRN_TENS, TEN_THREAD(thread_idx));

Please rebase this on top of http://patchwork.ozlabs.org/patch/496952/

> + /*
> +  * If both threads are offline, reset core to start.
> +  * When core is up, Thread 0 always gets up first,
> +  * so bind the current logical cpu with Thread 0.
> +  */
> + if (hw_cpu != cpu_first_thread_sibling(hw_cpu)) {
> + int hw_cpu1, hw_cpu2;
> +
> + hw_cpu1 = get_hard_smp_processor_id(primary);
> + hw_cpu2 = get_hard_smp_processor_id(primary + 1);
> + set_hard_smp_processor_id(primary, hw_cpu2);
> + set_hard_smp_processor_id(primary + 1, hw_cpu1);
> + /* get new physical cpu id */
> + hw_cpu = get_hard_smp_processor_id(nr);

NACK as discussed in http://patchwork.ozlabs.org/patch/454944/

-Scott



Re: v4.2-rc dcache regression, probably 75a6f82a0d10

2015-07-31 Thread Hugh Dickins
On Fri, 31 Jul 2015, Linus Torvalds wrote:
> 
> I'd be more suspicious about other effects. For example, it's not at
> all obvious that the commit in question just changes the order of the
> flags/inode field accesses, there are potentially bigger changes there.
> For example, this part (in __d_obtain_alias):
> 
> -   tmp->d_inode = inode;
> -   tmp->d_flags |= add_flags;
> +   __d_set_inode_and_type(tmp, inode, add_flags);
> 
> looks a bit off, because it *used* to just add those flags, but now,
> through __d_set_inode_and_type, it does
> 
> +   dentry->d_inode = inode;
> +   smp_wmb();
> +   flags = READ_ONCE(dentry->d_flags);
> +   flags &= ~(DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU);
> +   flags |= type_flags;
> +   WRITE_ONCE(dentry->d_flags, flags);
> 
> so it clears DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU.
> 
> Is that correct? Maybe, I haven't checked. And maybe it's a big bad
> bug. Regardless, it sure as hell isn't just changing the order of the
> access to those fields. That "DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU"
> clearing came from __d_instantiate(), but now it hits __d_obtain_alias
> too.
> 
> There may be other changes like that for all I know. I didn't look

Yes, the one which grabbed my attention is:

@@ -311,7 +346,7 @@ static void dentry_iput(struct dentry * dentry)
 {
 	struct inode *inode = dentry->d_inode;
 	if (inode) {
-		dentry->d_inode = NULL;
+		__d_clear_type_and_inode(dentry);
 		hlist_del_init(&dentry->d_u.d_alias);
 		spin_unlock(&dentry->d_lock);
 		spin_unlock(&inode->i_lock);

which I think clears the DCACHE_ENTRY_TYPE i.e. makes it DCACHE_MISS_TYPE,
when it was left as is before.  While there might be an RCU lookup in
progress, suddenly finding this to be a negative dentry.  Perhaps -
this is not an area I've visited for years, and I've not followed up
the sequence count protection.
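
To make the difference between these flag updates concrete, here is a
minimal user-space sketch, not kernel code. The SK_* bit values and sk_*
function names are placeholders invented for illustration; the real masks
are defined in include/linux/dcache.h and have different values. It
contrasts the old OR-only update with the clear-then-set update quoted
above, and with clearing the type bits as described for dentry_iput().

```c
/* Placeholder bit layout, for illustration only (not the kernel's values). */
#define SK_ENTRY_TYPE	0x0700u	/* stand-in for the DCACHE_ENTRY_TYPE mask */
#define SK_FALLTHRU	0x0800u	/* stand-in for DCACHE_FALLTHRU */
#define SK_OTHER	0x0020u	/* stand-in for any unrelated flag */
#define SK_TYPE_DIR	0x0100u	/* hypothetical "directory" type value */
#define SK_TYPE_REG	0x0400u	/* hypothetical "regular file" type value */

/* Old __d_obtain_alias behaviour: bits are only ever added. */
unsigned int sk_add_flags(unsigned int flags, unsigned int add_flags)
{
	return flags | add_flags;
}

/*
 * Update as quoted from __d_set_inode_and_type: the old type bits and
 * the fallthru bit are cleared before the new type is ORed in.
 */
unsigned int sk_set_type(unsigned int flags, unsigned int type_flags)
{
	flags &= ~(SK_ENTRY_TYPE | SK_FALLTHRU);
	return flags | type_flags;
}

/*
 * Clearing the type, as described for dentry_iput(): all-zero type bits
 * play the role of the "miss" (negative) type in this sketch.
 */
unsigned int sk_clear_type(unsigned int flags)
{
	return flags & ~SK_ENTRY_TYPE;
}
```

With a stale type and SK_FALLTHRU already set, sk_add_flags() keeps both
while sk_set_type() drops them, and sk_clear_type() leaves a concurrent
reader looking at something that appears negative.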

Hugh


Re: [GIT PULL] extcon fixes for 4.2-rc5

2015-07-31 Thread Greg KH
On Fri, Jul 31, 2015 at 04:28:02PM +0900, Chanwoo Choi wrote:
> Dear Greg,
> 
> This is extcon-fixes pull request for v4.2-rc5. I added detailed description
> of this pull request on below. Please pull extcon with following fixes.
> 
> Best Regards,
> Chanwoo Choi
> 
> The following changes since commit cbfe8fa6cd672011c755c3cd85c9ffd4e2d10a6f:
> 
>   Linux 4.2-rc4 (2015-07-26 12:26:21 -0700)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon.git 
> tags/extcon-fixes-for-4.2-rc5

Pulled and pushed out, thanks.

greg k-h


Unpinning an unpinned lock warning triggered by wake_up_new_task()

2015-07-31 Thread Bart Van Assche
Hello,

Here is another warning that was reported while I was retesting the SRP
initiator driver in kernel v4.2-rc4. Has anyone seen this before ?

Thanks,

Bart.

Jul 31 15:22:42 srp-ini multipathd: 66:32: mark as failed
Jul 31 15:22:42 srp-ini multipathd: mpatha: remaining active paths: 0
Jul 31 15:22:42 srp-ini kernel: sd 12:0:0:2: Parameters changed
Jul 31 15:22:43 srp-ini kernel: [ cut here ]
Jul 31 15:22:43 srp-ini kernel: WARNING: CPU: 0 PID: 358 at 
kernel/locking/lockdep.c:3497 lock_unpin_lock+0x105/0x110()
Jul 31 15:22:43 srp-ini kernel: unpinning an unpinned lock
Jul 31 15:22:43 srp-ini kernel: Modules linked in: dm_queue_length scsi_dh_alua 
af_packet xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT 
nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc 
ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 
nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_filter ip6_tables 
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack 
msr iptable_mangle iptable_raw iptable_filter ip_tables x_tables sg 
x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32c_intel aesni_intel 
aes_x86_64 glue_helper lrw gf128mul 
ablk_helper cryptd microcode pcspkr tg3 libphy lpc_ich mfd_core wmi 
acpi_power_meter ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs 
ib_umad rdma_cm ib_cm iw_cm processor thermal_sys 
hwmon button dm_multipath scsi_dh dm_mod ext4 crc16 mbcache jbd2 mlx4_en ptp 
pps_core mlx4_ib ib_sa ib_mad ib_core ib_addr sd_mod sr_mod cdrom hid_generic 
usbhid hid ahci libahci libata ehci_pci ehci_hcd mlx4_core usbcore scsi_mod 
usb_common autofs4
Jul 31 15:22:43 srp-ini multipathd: mpatha: sdc - tur checker reports path is up
Jul 31 15:22:43 srp-ini multipathd: 8:32: reinstated
Jul 31 15:22:43 srp-ini multipathd: mpatha: remaining active paths: 1
Jul 31 15:22:43 srp-ini multipathd: mpatha: sdr - tur checker reports path is up
Jul 31 15:22:43 srp-ini multipathd: 65:16: reinstated
Jul 31 15:22:43 srp-ini multipathd: mpatha: remaining active paths: 2
Jul 31 15:22:43 srp-ini kernel: CPU: 0 PID: 358 Comm: multipathd Not tainted 
4.2.0-rc4-debug+ #1
Jul 31 15:22:43 srp-ini kernel: Hardware name: Dell Inc. PowerEdge R430/03XKDV, 
BIOS 1.0.2 11/17/2014
Jul 31 15:22:43 srp-ini kernel: 817a80ba 88044faf7d58 
814f9cde 
Jul 31 15:22:43 srp-ini kernel: 88044faf7da8 88044faf7d98 
810746ba 0003
Jul 31 15:22:43 srp-ini kernel: 0001 88044e4742c0 
88047fc15618 0092
Jul 31 15:22:43 srp-ini kernel: Call Trace:
Jul 31 15:22:43 srp-ini kernel: [] dump_stack+0x4c/0x65
Jul 31 15:22:43 srp-ini kernel: [] 
warn_slowpath_common+0x8a/0xc0
Jul 31 15:22:43 srp-ini kernel: [] warn_slowpath_fmt+0x46/0x50
Jul 31 15:22:43 srp-ini kernel: [] lock_unpin_lock+0x105/0x110
Jul 31 15:22:43 srp-ini kernel: [] 
wake_up_new_task+0x171/0x2e0
Jul 31 15:22:43 srp-ini kernel: [] _do_fork+0x162/0x760
Jul 31 15:22:43 srp-ini kernel: [] ? 
lockdep_sys_exit_thunk+0x12/0x14
Jul 31 15:22:43 srp-ini kernel: [] SyS_clone+0x19/0x20
Jul 31 15:22:43 srp-ini kernel: [] 
entry_SYSCALL_64_fastpath+0x16/0x7a
Jul 31 15:22:43 srp-ini kernel: ---[ end trace c764daa6543cd2b6 ]---
Jul 31 15:22:43 srp-ini kernel: device-mapper: multipath: Failing path 8:32.
Jul 31 15:22:43 srp-ini kernel: sd 14:0:0:0: alua: port group 101 state S 
preferred supports tOlUSNA
Jul 31 15:22:44 srp-ini kernel: sd 14:0:0:0: alua: stpg sense code: 04/67/0a
Jul 31 15:22:44 srp-ini kernel: device-mapper: multipath: Failing path 65:16.


Re: [FYI] tux3: Core changes

2015-07-31 Thread Daniel Phillips

On Friday, July 31, 2015 5:00:43 PM PDT, Daniel Phillips wrote:

Note: Hirofumi's email is clear, logical and speaks to the
question. This branch of the thread is largely pointless, though
it essentially says the same thing in non-technical terms. Perhaps
your next response should be to Hirofumi, and perhaps it should be
technical.


Now, let me try to lead the way by being specific. RDMA was raised
as a potential failure case for Tux3 page forking. But the RDMA API
does not let you use memory mmapped by Tux3 as a source or destination
of IO. Instead, it sets up its own pages and hands them out to the
RDMA app from a pool. So no issue. One down, right?

Regards,

Daniel


[PATCH] ACPI / bus: Move duplicate code to a separate new function

2015-07-31 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

After merging commit 712e960f0ee9 (ACPI / PM: Attach ACPI power
domain only once) with commit 1dcc3d3362b0 (ACPI / bus: Move ACPI
bus type registration) there is some duplicate code in
acpi_device_is_first_physical_node() and acpi_companion_match()
that can be moved to a separate routine and called from both
places.

Signed-off-by: Rafael J. Wysocki 
---

On top of linux-pm.git/linux-next.
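
For readers skimming the diff, the shape of the refactoring can be
sketched in plain user-space C. This is only an illustration: the sk_*
types and names are hypothetical stand-ins, a singly linked list replaces
the kernel list, and the mutex that the real helper takes around the list
access is left out.

```c
#include <stdbool.h>
#include <stddef.h>

struct sk_dev {
	int id;
};

struct sk_node {
	const struct sk_dev *dev;
	struct sk_node *next;
};

struct sk_companion {
	struct sk_node *first;	/* NULL when the physical node list is empty */
};

/*
 * Shared helper: return the companion only if @dev is its first
 * physical node, otherwise NULL (empty list or a different first node).
 */
const struct sk_companion *sk_primary_companion(const struct sk_companion *c,
						const struct sk_dev *dev)
{
	if (!c->first || c->first->dev != dev)
		return NULL;
	return c;
}

/* Boolean caller: collapses the pointer result to true/false. */
bool sk_is_first_node(const struct sk_companion *c, const struct sk_dev *dev)
{
	return sk_primary_companion(c, dev) != NULL;
}
```

The pointer-returning helper serves both callers: one wants the companion
itself, the other only needs to know whether it is non-NULL.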

---
 drivers/acpi/bus.c |   51 ++-
 1 file changed, 22 insertions(+), 29 deletions(-)

Index: linux-pm/drivers/acpi/bus.c
===
--- linux-pm.orig/drivers/acpi/bus.c
+++ linux-pm/drivers/acpi/bus.c
@@ -482,6 +482,26 @@ static void acpi_device_remove_notify_ha
  Device Matching
-- 
*/
 
+static struct acpi_device *acpi_primary_dev_companion(struct acpi_device *adev,
+ const struct device *dev)
+{
+   struct mutex *physical_node_lock = &adev->physical_node_lock;
+
+   mutex_lock(physical_node_lock);
+   if (list_empty(&adev->physical_node_list)) {
+   adev = NULL;
+   } else {
+   const struct acpi_device_physical_node *node;
+
+   node = list_first_entry(&adev->physical_node_list,
+   struct acpi_device_physical_node, node);
+   if (node->dev != dev)
+   adev = NULL;
+   }
+   mutex_unlock(physical_node_lock);
+   return adev;
+}
+
 /**
  * acpi_device_is_first_physical_node - Is given dev first physical node
  * @adev: ACPI companion device
@@ -496,19 +516,7 @@ static void acpi_device_remove_notify_ha
 bool acpi_device_is_first_physical_node(struct acpi_device *adev,
const struct device *dev)
 {
-   bool ret = false;
-
-   mutex_lock(&adev->physical_node_lock);
-   if (!list_empty(&adev->physical_node_list)) {
-   const struct acpi_device_physical_node *node;
-
-   node = list_first_entry(&adev->physical_node_list,
-   struct acpi_device_physical_node, node);
-   ret = node->dev == dev;
-   }
-   mutex_unlock(&adev->physical_node_lock);
-
-   return ret;
+   return !!acpi_primary_dev_companion(adev, dev);
 }
 
 /*
@@ -535,7 +543,6 @@ bool acpi_device_is_first_physical_node(
 struct acpi_device *acpi_companion_match(const struct device *dev)
 {
struct acpi_device *adev;
-   struct mutex *physical_node_lock;
 
adev = ACPI_COMPANION(dev);
if (!adev)
@@ -544,21 +551,7 @@ struct acpi_device *acpi_companion_match
if (list_empty(&adev->pnp.ids))
return NULL;
 
-   physical_node_lock = &adev->physical_node_lock;
-   mutex_lock(physical_node_lock);
-   if (list_empty(&adev->physical_node_list)) {
-   adev = NULL;
-   } else {
-   const struct acpi_device_physical_node *node;
-
-   node = list_first_entry(&adev->physical_node_list,
-   struct acpi_device_physical_node, node);
-   if (node->dev != dev)
-   adev = NULL;
-   }
-   mutex_unlock(physical_node_lock);
-
-   return adev;
+   return acpi_primary_dev_companion(adev, dev);
 }
 
 /**



Re: v4.2-rc dcache regression, probably 75a6f82a0d10

2015-07-31 Thread Hugh Dickins
On Fri, 31 Jul 2015, Dominique Martinet wrote:
> Hugh Dickins wrote on Fri, Jul 31, 2015:
> > It will indeed be weird and odd if it confirms that DCACHE_DISCONNECTED
> > revert is good.  I agree that Dominique's 4bf46a272647 seems now more
> > likely, if still unlikely; but that was included in v4.1, and I saw
> > no problem with v4.1 once the rmap_walk() skip was fixed.
> 
> I think it could, actually, and that neither commit is actually bad --
> just that they affect timing enough to raise an issue between d_delete
> (I guess?) and link_path_walk (see last mail in other thread[1])
> 
> It's probably an old race that was very hard to hit because of cache
> coherency.
> Basically, before the wmb/rmb, the dentry was always updated closely to
> its flags, so the other CPU would "usually" get both updates at the same
> time; the barriers make it so the updates are split and it's possible to
> get it, and would explain why I could pick 4bf46a2726 as "the one"
> 
> 
> I'm not sure why the problem wouldn't arise on tmpfs though.
> 
> Hugh, could you try the reproducer I gave in the other thread[2] on both
> filesystems maybe?

Sorry, I probably won't get around to that, to be honest:
it shouldn't need me to run it anyway.

> I need to let the thing run for a while, might need to tune params as
> well. I was trying to fine tune cpu affinity with fewer threads but it's
> not getting anywhere.
> 
> I'll also check if it's getting even easier to reproduce with
> 75a6f82a0d10 (or a recent kernel), who knows... How fast do you hit the
> bug with the commit?

"A number of hours".  I don't have my records in front of me at the
moment, but I think when I was lucky it happened within two hours,
but more commonly around ten or twelve hours.

I just leave it going and get on with other things: yours may be a
_much_ better reproducer.  Though once there's a potential fix, we shall
both need to try it, to report back if our separate cases are fixed.

Hugh

> 
> 
> Thanks,
> -- 
> Dominique
> 
> [1] https://marc.info/?l=linux-fsdevel=143835651005259=2
> [2] https://marc.info/?l=linux-fsdevel=143825706609188=2 


Re: [PATCH 3/4] thermal/powerclamp: add cpu id for skylake h/s

2015-07-31 Thread Radivoje Jovanovic
+ Rui

On Fri, 31 Jul 2015 08:07:45 -0700
Radivoje Jovanovic  wrote:

> From: Radivoje Jovanovic 
> 
> Add support for Intel Skylake H/S
> 
> Signed-off-by: Radivoje Jovanovic 
> ---
>  drivers/thermal/intel_powerclamp.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/thermal/intel_powerclamp.c
> b/drivers/thermal/intel_powerclamp.c index 5820e85..63879a1 100644
> --- a/drivers/thermal/intel_powerclamp.c
> +++ b/drivers/thermal/intel_powerclamp.c
> @@ -698,6 +698,7 @@ static const struct x86_cpu_id
> intel_powerclamp_ids[] __initconst = { { X86_VENDOR_INTEL, 6, 0x4f},
>   { X86_VENDOR_INTEL, 6, 0x56},
>   { X86_VENDOR_INTEL, 6, 0x57},
> + { X86_VENDOR_INTEL, 6, 0x5e},
>   {}
>  };
>  MODULE_DEVICE_TABLE(x86cpu, intel_powerclamp_ids);



Re: [PATCH 4/4] thermal/powerclamp: add cpu id for Skylake u/y

2015-07-31 Thread Radivoje Jovanovic
+Rui

On Fri, 31 Jul 2015 08:07:54 -0700
Radivoje Jovanovic  wrote:

> From: Radivoje Jovanovic 
> 
> Add support for Intel Skylake u/y
> 
> Signed-off-by: Radivoje Jovanovic 
> ---
>  drivers/thermal/intel_powerclamp.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/thermal/intel_powerclamp.c
> b/drivers/thermal/intel_powerclamp.c index 63879a1..6e01723 100644
> --- a/drivers/thermal/intel_powerclamp.c
> +++ b/drivers/thermal/intel_powerclamp.c
> @@ -695,6 +695,7 @@ static const struct x86_cpu_id
> intel_powerclamp_ids[] __initconst = { { X86_VENDOR_INTEL, 6, 0x46},
>   { X86_VENDOR_INTEL, 6, 0x4c},
>   { X86_VENDOR_INTEL, 6, 0x4d},
> + { X86_VENDOR_INTEL, 6, 0x4e},
>   { X86_VENDOR_INTEL, 6, 0x4f},
>   { X86_VENDOR_INTEL, 6, 0x56},
>   { X86_VENDOR_INTEL, 6, 0x57},



Re: [PATCH 1/4] powercap / RAPL: Add support for Skylake H/S

2015-07-31 Thread Rafael J. Wysocki
On Friday, July 31, 2015 04:48:38 PM Radivoje Jovanovic wrote:
> On Sat, 01 Aug 2015 02:12:18 +0200
> Hi Rafael,
> 
> "Rafael J. Wysocki"  wrote:
> > On Friday, July 31, 2015 08:07:10 AM Radivoje Jovanovic wrote:
> > > From: Radivoje Jovanovic 
> > > 
> > > > This patch enables RAPL to support Intel Skylake H/S
> > > 
> > > Signed-off-by: Radivoje Jovanovic 
> > 
> > Jacob, is this series fine by you?
> 
> Jacob is on sabbatical and asked me to sub him for RAPL/Powercap until
> he comes back

OK

I can apply [1-2/4], then, the rest I'm leaving to Rui.


> > 
> > > ---
> > >  drivers/powercap/intel_rapl.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/drivers/powercap/intel_rapl.c
> > > b/drivers/powercap/intel_rapl.c index 482b22d..bca9620 100644
> > > --- a/drivers/powercap/intel_rapl.c
> > > +++ b/drivers/powercap/intel_rapl.c
> > > @@ -1101,6 +1101,7 @@ static const struct x86_cpu_id rapl_ids[]
> > > __initconst = { RAPL_CPU(0x4A, rapl_defaults_tng),/* Tangier */
> > >   RAPL_CPU(0x56, rapl_defaults_core),/* Future Xeon */
> > >   RAPL_CPU(0x5A, rapl_defaults_ann),/* Annidale */
> > > + RAPL_CPU(0x5E, rapl_defaults_core),/* Skylake-H/S */
> > >   RAPL_CPU(0x57, rapl_defaults_hsw_server),/* Knights
> > > Landing */ {}
> > >  };
> > > 
> > 
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [PATCH] [4.2 fix] x86, mpx: do not set ->vm_ops on mpx VMAs

2015-07-31 Thread Greg KH
On Mon, Jul 20, 2015 at 02:29:58PM -0700, Dave Hansen wrote:
> 
> (sorry for the spam, I screwed up the stable@ address).
> 
> BTW, thanks to Kirill for doing this patch!  He posted it to LKML
> but we need to ensure it is picked up for 4.2 and any -stable
> kernels where this commit is applied:
> 
>   6b7339f4: mm: avoid setting up anonymous pages into file mapping
> 
> That broke MPX support because MPX sets a vma->vm_ops on an
> anonymous VMA.  We need this patch to make it work again,
> basically removing MPX's use of ->vm_ops.  Kirill made me aware
> of this long ago, but I didn't double-check that his fix got
> submitted and merged.
> 
> I (Dave) fixed up a minor merge conflict and added the
> try_unmap_single_bt() use of is_mpx_vma() (which were added
> post-4.1).
> 
> Note for -stable: The first hunk may not apply cleanly because of
> other activity in arch/x86/mm/mmap.c, but should be trivial to
> apply by hand.  Hunk #5 on mpx.c is only present on 4.2-rc kernels.

Can someone send a version that is known to apply, you don't want to
rely on me to get it right :)

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [FYI] tux3: Core changes

2015-07-31 Thread Daniel Phillips

On Friday, July 31, 2015 3:27:12 PM PDT, David Lang wrote:
> On Fri, 31 Jul 2015, Daniel Phillips wrote:
>> On Friday, July 31, 2015 11:29:51 AM PDT, David Lang wrote: ...
>
> you weren't asking about any particular feature of Tux, you 
> were asking if we were still willing to push out stuff that 
> breaks for users and fix it later.

I think you left a key word out of my ask: "theoretical".

> Especially for filesystems that can lose the data of whoever 
> is using it, the answer seems to be a clear no.
>
> there may be bugs in what's pushed out that we don't know 
> about. But we don't push out potential data corruption bugs that 
> we do know about (or think we do)
>
> so if you think this should be pushed out with this known 
> corner case that's not handled properly, you have to convince 
> people that it's _so_ improbable that they shouldn't care about 
> it.


There should also be an onus on the person posing the worry
to prove their case beyond a reasonable doubt, which has not been
done in the case we are discussing here. Note: that is a technical
assessment to which a technical response is appropriate.

I do think that we should put a cap on this fencing and make
a real effort to get Tux3 into mainline. We should at least
set a ground rule that a problem should be proved real before it
becomes a reason to derail a project in the way that our project
has been derailed. Otherwise, it's hard to see what interest is
served.

OK, let's get back to the program. I accept your assertion that
we should convince people that the issue is improbable. To do
that, I need a specific issue to address. So far, no such issue
has been provided with specificity. Do you see why this is
frustrating?

Please, community. Give us specific issues to address, or give us
some way out of this eternal limbo. Or better, let's go back to the
old way of doing things in Linux, which is what got us where we
are today. Not this.

Note: Hirofumi's email is clear, logical and speaks to the
question. This branch of the thread is largely pointless, though
it essentially says the same thing in non-technical terms. Perhaps
your next response should be to Hirofumi, and perhaps it should be
technical.

Regards,

Daniel


Re: [PATCH 1/3] Powerpc: mpc85xx: refactor the PM operations

2015-07-31 Thread Scott Wood
On Fri, 2015-07-31 at 17:20 +0800, b29...@freescale.com wrote:
> @@ -71,7 +56,7 @@ static void mpc85xx_give_timebase(void)
>   barrier();
>   tb_req = 0;
>  
> - mpc85xx_timebase_freeze(1);
> + qoriq_pm_ops->freeze_time_base(1);

freeze_time_base() takes a bool.  Use true/false.

>  #ifdef CONFIG_PPC64
>   /*
>* e5500/e6500 have a workaround for erratum A-006958 in place
> @@ -104,7 +89,7 @@ static void mpc85xx_give_timebase(void)
>   while (tb_valid)
>   barrier();
>  
> - mpc85xx_timebase_freeze(0);
> + qoriq_pm_ops->freeze_time_base(0);
>  
>   local_irq_restore(flags);
>  }
> @@ -127,20 +112,10 @@ static void mpc85xx_take_timebase(void)
>  }
>  
>  #ifdef CONFIG_HOTPLUG_CPU
> -static void smp_85xx_mach_cpu_die(void)
> +static void e500_cpu_idle(void)

This is not the function that gets called during normal cpu idle, and it 
shouldn't be named to look like it is.

>  {
> - unsigned int cpu = smp_processor_id();
>   u32 tmp;
>  
> - local_irq_disable();
> - idle_task_exit();
> - generic_set_cpu_dead(cpu);
> - mb();
> -
> - mtspr(SPRN_TCR, 0);
> -
> - cur_cpu_spec->cpu_down_flush();
> -
>   tmp = (mfspr(SPRN_HID0) & ~(HID0_DOZE|HID0_SLEEP)) | HID0_NAP;
>   mtspr(SPRN_HID0, tmp);
>   isync();
> @@ -151,6 +126,25 @@ static void smp_85xx_mach_cpu_die(void)
>   mb();
>   mtmsr(tmp);
>   isync();
> +}
> +
> +static void qoriq_cpu_dying(void)
> +{
> + unsigned int cpu = smp_processor_id();
> +
> + hard_irq_disable();
> + /* mask all irqs to prevent cpu wakeup */
> + qoriq_pm_ops->irq_mask(cpu);
> + idle_task_exit();
> +
> + mtspr(SPRN_TCR, 0);
> + mtspr(SPRN_TSR, mfspr(SPRN_TSR));
> +
> + cur_cpu_spec->cpu_down_flush();
> +
> + generic_set_cpu_dead(cpu);
> +
> + e500_cpu_idle();

Why is something that claims to be applicable to all qoriq directly calling 
an e500v2-specific function?

Could you explain irq_mask()?  Why would there still be IRQs destined for 
this CPU at this point?

> @@ -431,21 +415,9 @@ void __init mpc85xx_smp_init(void)
>   smp_85xx_ops.probe = NULL;
>   }
>  
> - np = of_find_matching_node(NULL, mpc85xx_smp_guts_ids);
> - if (np) {
> - guts = of_iomap(np, 0);
> - of_node_put(np);
> - if (!guts) {
> - pr_err("%s: Could not map guts node address\n",
> - __func__);
> - return;
> - }
> - smp_85xx_ops.give_timebase = mpc85xx_give_timebase;
> - smp_85xx_ops.take_timebase = mpc85xx_take_timebase;
>  #ifdef CONFIG_HOTPLUG_CPU
> - ppc_md.cpu_die = smp_85xx_mach_cpu_die;
> + ppc_md.cpu_die = qoriq_cpu_dying;
>  #endif

Shouldn't you make sure there's a valid qoriq_pm_ops before setting 
cpu_die()?  Or make sure that qoriq_cpu_dying() works regardless.

-Scott



Re: [PATCH 1/4] powercap / RAPL: Add support for Skylake H/S

2015-07-31 Thread Radivoje Jovanovic
Hi Rafael,

On Sat, 01 Aug 2015 02:12:18 +0200
"Rafael J. Wysocki"  wrote:
> On Friday, July 31, 2015 08:07:10 AM Radivoje Jovanovic wrote:
> > From: Radivoje Jovanovic 
> > 
> > This patch enables RAPL to support Intel Skylake H/S
> > 
> > Signed-off-by: Radivoje Jovanovic 
> 
> Jacob, is this series fine by you?

Jacob is on sabbatical and asked me to sub him for RAPL/Powercap until
he comes back

> 
> > ---
> >  drivers/powercap/intel_rapl.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/powercap/intel_rapl.c
> > b/drivers/powercap/intel_rapl.c index 482b22d..bca9620 100644
> > --- a/drivers/powercap/intel_rapl.c
> > +++ b/drivers/powercap/intel_rapl.c
> > @@ -1101,6 +1101,7 @@ static const struct x86_cpu_id rapl_ids[]
> > __initconst = { RAPL_CPU(0x4A, rapl_defaults_tng),/* Tangier */
> > RAPL_CPU(0x56, rapl_defaults_core),/* Future Xeon */
> > RAPL_CPU(0x5A, rapl_defaults_ann),/* Annidale */
> > +   RAPL_CPU(0x5E, rapl_defaults_core),/* Skylake-H/S */
> > RAPL_CPU(0x57, rapl_defaults_hsw_server),/* Knights
> > Landing */ {}
> >  };
> > 
> 



Re: [PATCH v9 2/5] mtd: nand: vf610_nfc: add hardware BCH-ECC support

2015-07-31 Thread Brian Norris
On Sat, Aug 01, 2015 at 01:35:52AM +0200, Stefan Agner wrote:
> On 2015-08-01 01:09, Brian Norris wrote:

> >> +static int vf610_nfc_read_page(struct mtd_info *mtd, struct nand_chip 
> >> *chip,
> >> +  uint8_t *buf, int oob_required, int page)
> >> +{
> >> +  int eccsize = chip->ecc.size;
> >> +  int stat;
> >> +
> >> +  vf610_nfc_read_buf(mtd, buf, eccsize);
> >> +
> >> +  if (oob_required)
> >> +  vf610_nfc_read_buf(mtd, chip->oob_poi, mtd->oobsize);
> > 
> > To fix the bitflips issue above, you'll just want to unconditionally
> > read the OOB (it's fine to ignore 'oob_required') and...
> > 
> >> +
> >> +  stat = vf610_nfc_correct_data(mtd, buf);
> > 
> > ...pass in chip->oob_poi as a third argument.
> > 
> 
> Hm, this probably will have an effect on performance, since we usually
> omit the OOB if not requested.

You could test :) I don't really like performance claims without tests.
(I say this because I added the oob_required flag myself, but just for
functional purposes, not performance. Many drivers got by just fine by
always copying the OOB data.)

> I could fetch the OOB from the NAND
> controllers SRAM only if necessary (if HW ECC status is not ok...). Does
> this sound reasonable?

That does.
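
A rough user-space sketch of that idea, under stated assumptions: the
sk_nfc structure, the buffer sizes, and the ecc_ok flag are invented to
simulate a controller with on-chip SRAM and are not the vf610_nfc
driver's data structures. The OOB bytes are copied up front only when the
caller asks for them, and otherwise only when the ECC status is bad.

```c
#include <stdbool.h>
#include <string.h>

#define SK_PAGE_SIZE	16
#define SK_OOB_SIZE	4

/* Simulated controller state (hypothetical, for illustration only). */
struct sk_nfc {
	unsigned char sram_data[SK_PAGE_SIZE];
	unsigned char sram_oob[SK_OOB_SIZE];
	bool ecc_ok;			/* simulated HW ECC status */
	unsigned int oob_copies;	/* counts OOB fetches from SRAM */
};

void sk_copy_oob(struct sk_nfc *nfc, unsigned char *oob)
{
	memcpy(oob, nfc->sram_oob, SK_OOB_SIZE);
	nfc->oob_copies++;
}

/*
 * Returns 0 when the simulated ECC status is fine, 1 when it is not.
 * In the failing case @oob is always filled in, so the caller has the
 * data it needs for further checks.
 */
int sk_read_page(struct sk_nfc *nfc, unsigned char *buf,
		 unsigned char *oob, bool oob_required)
{
	memcpy(buf, nfc->sram_data, SK_PAGE_SIZE);

	if (oob_required)
		sk_copy_oob(nfc, oob);

	if (nfc->ecc_ok)
		return 0;

	/* Slow path: fetch the OOB only now if it was skipped above. */
	if (!oob_required)
		sk_copy_oob(nfc, oob);

	return 1;
}
```

In the common case (ECC fine, OOB not requested) no OOB copy happens, so
any cost is confined to pages the ECC engine already flagged. Whether that
saving is measurable is exactly what a test would have to show.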

Brian


[PATCH 2/3] ARM: dts: vexpress: Use assigned-clock-parents for sp810

2015-07-31 Thread Stephen Boyd
The sp810 clk driver is calling the clk consumer APIs from
clk_prepare ops to change the parent to a 1 MHz fixed rate clock
for each of the clocks that the driver provides. Use
assigned-clock-parents for this instead of doing it in the driver
to avoid using the consumer API in provider code. This also
allows us to remove the usage of clk provider APIs that take a
struct clk as an argument from the sp810 driver.

Cc: Pawel Moll 
Cc: Linus Walleij 
Cc: Sudeep Holla 
Signed-off-by: Stephen Boyd 
---
 arch/arm/boot/dts/vexpress-v2m-rs1.dtsi | 2 ++
 arch/arm/boot/dts/vexpress-v2m.dtsi | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/arch/arm/boot/dts/vexpress-v2m-rs1.dtsi 
b/arch/arm/boot/dts/vexpress-v2m-rs1.dtsi
index 2efb2058ba49..21b02874bea3 100644
--- a/arch/arm/boot/dts/vexpress-v2m-rs1.dtsi
+++ b/arch/arm/boot/dts/vexpress-v2m-rs1.dtsi
@@ -101,6 +101,8 @@
clock-names = "refclk", "timclk", "apb_pclk";
#clock-cells = <1>;
clock-output-names = "timerclken0", 
"timerclken1", "timerclken2", "timerclken3";
+   assigned-clocks = <_sysctl 0>, <_sysctl 
1>, <_sysctl 3>, <_sysctl 3>;
+   assigned-clock-parents = <_refclk1mhz>, 
<_refclk1mhz>, <_refclk1mhz>, <_refclk1mhz>;
};
 
/* PCI-E I2C bus */
diff --git a/arch/arm/boot/dts/vexpress-v2m.dtsi 
b/arch/arm/boot/dts/vexpress-v2m.dtsi
index cb3090f919a7..e712c0af149b 100644
--- a/arch/arm/boot/dts/vexpress-v2m.dtsi
+++ b/arch/arm/boot/dts/vexpress-v2m.dtsi
@@ -100,6 +100,8 @@
clock-names = "refclk", "timclk", "apb_pclk";
#clock-cells = <1>;
clock-output-names = "timerclken0", 
"timerclken1", "timerclken2", "timerclken3";
+   assigned-clocks = <_sysctl 0>, <_sysctl 
1>, <_sysctl 3>, <_sysctl 3>;
+   assigned-clock-parents = <_refclk1mhz>, 
<_refclk1mhz>, <_refclk1mhz>, <_refclk1mhz>;
};
 
/* PCI-E I2C bus */
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH 3/3] ARM64: dts: vexpress: Use assigned-clock-parents for sp810

2015-07-31 Thread Stephen Boyd
The sp810 clk driver is calling the clk consumer APIs from
clk_prepare ops to change the parent to a 1 MHz fixed rate clock
for each of the clocks that the driver provides. Use
assigned-clock-parents for this instead of doing it in the driver
to avoid using the consumer API in provider code. This also
allows us to remove the usage of clk provider APIs that take a
struct clk as an argument from the sp810 driver.

Cc: Pawel Moll 
Cc: Linus Walleij 
Cc: Sudeep Holla 
Signed-off-by: Stephen Boyd 
---
 arch/arm64/boot/dts/arm/juno-motherboard.dtsi| 2 ++
 arch/arm64/boot/dts/arm/rtsm_ve-motherboard.dtsi | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/arch/arm64/boot/dts/arm/juno-motherboard.dtsi 
b/arch/arm64/boot/dts/arm/juno-motherboard.dtsi
index 021e0f40f419..637e046f0e36 100644
--- a/arch/arm64/boot/dts/arm/juno-motherboard.dtsi
+++ b/arch/arm64/boot/dts/arm/juno-motherboard.dtsi
@@ -136,6 +136,8 @@
clock-names = "refclk", "timclk", 
"apb_pclk";
#clock-cells = <1>;
clock-output-names = "timerclken0", 
"timerclken1", "timerclken2", "timerclken3";
+   assigned-clocks = <_sysctl 0>, 
<_sysctl 1>, <_sysctl 3>, <_sysctl 3>;
+   assigned-clock-parents = 
<_refclk1mhz>, <_refclk1mhz>, <_refclk1mhz>, <_refclk1mhz>;
};
 
apbregs@01 {
diff --git a/arch/arm64/boot/dts/arm/rtsm_ve-motherboard.dtsi 
b/arch/arm64/boot/dts/arm/rtsm_ve-motherboard.dtsi
index c46cbb29f3c6..88a7583ed7a7 100644
--- a/arch/arm64/boot/dts/arm/rtsm_ve-motherboard.dtsi
+++ b/arch/arm64/boot/dts/arm/rtsm_ve-motherboard.dtsi
@@ -74,6 +74,8 @@
clock-names = "refclk", "timclk", "apb_pclk";
#clock-cells = <1>;
clock-output-names = "timerclken0", 
"timerclken1", "timerclken2", "timerclken3";
+   assigned-clocks = <_sysctl 0>, <_sysctl 1>, <_sysctl 3>, <_sysctl 3>;
+   assigned-clock-parents = <_refclk1mhz>, <_refclk1mhz>, <_refclk1mhz>, <_refclk1mhz>;
};
 
aaci@04 {
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [PATCH 1/4] powercap / RAPL: Add support for Skylake H/S

2015-07-31 Thread Rafael J. Wysocki
On Friday, July 31, 2015 08:07:10 AM Radivoje Jovanovic wrote:
> From: Radivoje Jovanovic 
> 
> This patch enables RAPL to support Intel Skylake H/S.
> 
> Signed-off-by: Radivoje Jovanovic 

Jacob, is this series fine by you?

> ---
>  drivers/powercap/intel_rapl.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c
> index 482b22d..bca9620 100644
> --- a/drivers/powercap/intel_rapl.c
> +++ b/drivers/powercap/intel_rapl.c
> @@ -1101,6 +1101,7 @@ static const struct x86_cpu_id rapl_ids[] __initconst = 
> {
>   RAPL_CPU(0x4A, rapl_defaults_tng),/* Tangier */
>   RAPL_CPU(0x56, rapl_defaults_core),/* Future Xeon */
>   RAPL_CPU(0x5A, rapl_defaults_ann),/* Annidale */
> + RAPL_CPU(0x5E, rapl_defaults_core),/* Skylake-H/S */
>   RAPL_CPU(0x57, rapl_defaults_hsw_server),/* Knights Landing */
>   {}
>  };
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


[PATCH 1/3] clk: versatile: Switch to assigned clock parents

2015-07-31 Thread Stephen Boyd
We're removing struct clk from the clk provider API. This code is
calling the consumer APIs to change the parent to a 1 MHz fixed
rate clock for each of the clocks that the driver provides. Move
to using the assigned-clock-parents DT property for this instead.
Because this is an ABI break, detect if the property is missing
and fall back to setting the parent explicitly before the clocks
are registered.
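
The fallback described above can be modelled outside the kernel roughly as
follows. This is only a userspace sketch under stated assumptions: struct
toy_node, toy_has_property() and toy_setup() are hypothetical stand-ins (not
kernel APIs), and the fallback parent index is passed in by the caller because
which index corresponds to the 1 MHz clock is not something this sketch knows.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a device tree node: just a list of property names. */
struct toy_node {
	const char *const *props;
	size_t nprops;
};

/* Hypothetical helper: true if the node carries a property with this name. */
static bool toy_has_property(const struct toy_node *node, const char *name)
{
	for (size_t i = 0; i < node->nprops; i++)
		if (strcmp(node->props[i], name) == 0)
			return true;
	return false;
}

/*
 * Returns how many clocks had their parent forced by the "driver":
 * zero when the DT carries assigned-clock-parents, nclks for an old DT
 * that lacks the property.
 */
static size_t toy_setup(const struct toy_node *node, int *parent,
			size_t nclks, int fallback_parent)
{
	bool deprecated = !toy_has_property(node, "assigned-clock-parents");
	size_t forced = 0;

	for (size_t i = 0; i < nclks; i++) {
		if (deprecated) {
			/* Old DT: pick the parent explicitly before registration. */
			parent[i] = fallback_parent;
			forced++;
		}
	}
	return forced;
}
```

With a new DT the parents are left for the assigned-clock-parents machinery to
set; with an old DT every timerclken output gets the fallback parent.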

Cc: Pawel Moll 
Cc: Linus Walleij 
Cc: Sudeep Holla 
Signed-off-by: Stephen Boyd 
---
 drivers/clk/versatile/clk-sp810.c | 76 ---
 1 file changed, 15 insertions(+), 61 deletions(-)

diff --git a/drivers/clk/versatile/clk-sp810.c 
b/drivers/clk/versatile/clk-sp810.c
index 7fbe4d4bf35e..a1cdef6b0f90 100644
--- a/drivers/clk/versatile/clk-sp810.c
+++ b/drivers/clk/versatile/clk-sp810.c
@@ -33,12 +33,9 @@ struct clk_sp810_timerclken {
 
 struct clk_sp810 {
struct device_node *node;
-   int refclk_index, timclk_index;
void __iomem *base;
spinlock_t lock;
struct clk_sp810_timerclken timerclken[4];
-   struct clk *refclk;
-   struct clk *timclk;
 };
 
 static u8 clk_sp810_timerclken_get_parent(struct clk_hw *hw)
@@ -71,55 +68,7 @@ static int clk_sp810_timerclken_set_parent(struct clk_hw 
*hw, u8 index)
return 0;
 }
 
-/*
- * FIXME - setting the parent every time .prepare is invoked is inefficient.
- * This is better handled by a dedicated clock tree configuration mechanism at
- * init-time.  Revisit this later when such a mechanism exists
- */
-static int clk_sp810_timerclken_prepare(struct clk_hw *hw)
-{
-   struct clk_sp810_timerclken *timerclken = to_clk_sp810_timerclken(hw);
-   struct clk_sp810 *sp810 = timerclken->sp810;
-   struct clk *old_parent = __clk_get_parent(hw->clk);
-   struct clk *new_parent;
-
-   if (!sp810->refclk)
-   sp810->refclk = of_clk_get(sp810->node, sp810->refclk_index);
-
-   if (!sp810->timclk)
-   sp810->timclk = of_clk_get(sp810->node, sp810->timclk_index);
-
-   if (WARN_ON(IS_ERR(sp810->refclk) || IS_ERR(sp810->timclk)))
-   return -ENOENT;
-
-   /* Select fastest parent */
-   if (clk_get_rate(sp810->refclk) > clk_get_rate(sp810->timclk))
-   new_parent = sp810->refclk;
-   else
-   new_parent = sp810->timclk;
-
-   /* Switch the parent if necessary */
-   if (old_parent != new_parent) {
-   clk_prepare(new_parent);
-   clk_set_parent(hw->clk, new_parent);
-   clk_unprepare(old_parent);
-   }
-
-   return 0;
-}
-
-static void clk_sp810_timerclken_unprepare(struct clk_hw *hw)
-{
-   struct clk_sp810_timerclken *timerclken = to_clk_sp810_timerclken(hw);
-   struct clk_sp810 *sp810 = timerclken->sp810;
-
-   clk_put(sp810->timclk);
-   clk_put(sp810->refclk);
-}
-
 static const struct clk_ops clk_sp810_timerclken_ops = {
-   .prepare = clk_sp810_timerclken_prepare,
-   .unprepare = clk_sp810_timerclken_unprepare,
.get_parent = clk_sp810_timerclken_get_parent,
.set_parent = clk_sp810_timerclken_set_parent,
 };
@@ -140,24 +89,18 @@ static void __init clk_sp810_of_setup(struct device_node 
*node)
 {
struct clk_sp810 *sp810 = kzalloc(sizeof(*sp810), GFP_KERNEL);
const char *parent_names[2];
+   int num = ARRAY_SIZE(parent_names);
char name[12];
struct clk_init_data init;
int i;
+   bool deprecated;
 
if (!sp810) {
pr_err("Failed to allocate memory for SP810!\n");
return;
}
 
-   sp810->refclk_index = of_property_match_string(node, "clock-names",
-   "refclk");
-   parent_names[0] = of_clk_get_parent_name(node, sp810->refclk_index);
-
-   sp810->timclk_index = of_property_match_string(node, "clock-names",
-   "timclk");
-   parent_names[1] = of_clk_get_parent_name(node, sp810->timclk_index);
-
-   if (!parent_names[0] || !parent_names[1]) {
+   if (of_clk_parent_fill(node, parent_names, num) != num) {
pr_warn("Failed to obtain parent clocks for SP810!\n");
return;
}
@@ -170,7 +113,9 @@ static void __init clk_sp810_of_setup(struct device_node 
*node)
init.ops = _sp810_timerclken_ops;
init.flags = CLK_IS_BASIC;
init.parent_names = parent_names;
-   init.num_parents = ARRAY_SIZE(parent_names);
+   init.num_parents = num;
+
+   deprecated = !of_find_property(node, "assigned-clock-parents", NULL);
 
for (i = 0; i < ARRAY_SIZE(sp810->timerclken); i++) {
snprintf(name, ARRAY_SIZE(name), "timerclken%d", i);
@@ -179,6 +124,15 @@ static void __init clk_sp810_of_setup(struct device_node 
*node)
sp810->timerclken[i].channel = i;
sp810->timerclken[i].hw.init = 
 
+   /*
+* If DT isn't setting the parent, force 

[PATCH 6/6] staging/lustre: Get rid of inode_dio_write_done and inode_dio_read

2015-07-31 Thread green
From: Oleg Drokin 

These primitives are long deprecated and unused.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/linux/lustre_compat25.h | 5 -
 drivers/staging/lustre/lustre/llite/llite_lib.c   | 5 +
 drivers/staging/lustre/lustre/llite/vvp_io.c  | 5 ++---
 3 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h 
b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
index 157bafb..6b14406 100644
--- a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
+++ b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
@@ -64,11 +64,6 @@
 
 #define LTIME_S(time) (time.tv_sec)
 
-/* inode_dio_wait(i) use as-is for write lock */
-# define inode_dio_write_done(i)   do {} while (0) /* for write unlock */
-# define inode_dio_read(i) atomic_inc(&(i)->i_dio_count)
-/* inode_dio_done(i) use as-is for read unlock */
-
 #ifndef QUOTA_OK
 # define QUOTA_OK 0
 #endif
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c 
b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 39f0b2a..55e2dc6 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1356,11 +1356,8 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr 
*attr, bool hsm_import)
if (!op_data)
return -ENOMEM;
 
-   if (!S_ISDIR(inode->i_mode)) {
-   if (attr->ia_valid & ATTR_SIZE)
-   inode_dio_write_done(inode);
+   if (!S_ISDIR(inode->i_mode))
mutex_unlock(>i_mutex);
-   }
 
memcpy(_data->op_attr, attr, sizeof(*attr));
 
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c 
b/drivers/staging/lustre/lustre/llite/vvp_io.c
index 91bba79..a659962 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -455,12 +455,11 @@ static void vvp_io_setattr_end(const struct lu_env *env,
struct cl_io *io= ios->cis_io;
struct inode *inode = ccc_object_inode(io->ci_obj);
 
-   if (cl_io_is_trunc(io)) {
+   if (cl_io_is_trunc(io))
/* Truncate in memory pages - they must be clean pages
 * because osc has already notified to destroy osc_extents. */
vvp_do_vmtruncate(inode, io->u.ci_setattr.sa_attr.lvb_size);
-   inode_dio_write_done(inode);
-   }
+
mutex_unlock(>i_mutex);
 }
 
-- 
2.1.0



[PATCH 3/6] staging/lustre: Drop FMODE_UNSIGNED_OFFSET define

2015-07-31 Thread green
From: Oleg Drokin 

It's not really used anywhere.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/linux/lustre_compat25.h | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h 
b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
index 9739611..b37856c 100644
--- a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
+++ b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
@@ -90,10 +90,6 @@
 # define NO_QUOTA (-EDQUOT)
 #endif
 
-#ifndef FMODE_UNSIGNED_OFFSET
-#define FMODE_UNSIGNED_OFFSET  ((__force fmode_t)0x2000)
-#endif
-
 #if !defined(_ASM_GENERIC_BITOPS_EXT2_NON_ATOMIC_H_) && !defined(ext2_set_bit)
 # define ext2_set_bit   __test_and_set_bit_le
 # define ext2_clear_bit   __test_and_clear_bit_le
-- 
2.1.0



[PATCH 0/3] Move clk-sp810 to assigned clock parents

2015-07-31 Thread Stephen Boyd
This patch set converts this code to use the assigned-clock-parents
property instead of switching parents from the .prepare op with the
clk consumer API. I can route the dts patches through arm-soc but I'd like to take
the clk patch through clk tree because it removes some usage of the
struct clk based provider APIs that we're trying to get rid of. Also,
this is completely untested, so testing would be appreciated.
Can this be tested with qemu? I haven't tried but I was thinking
that might be an option.

Cc: Pawel Moll 
Cc: Linus Walleij 
Cc: Sudeep Holla 

Stephen Boyd (3):
  clk: versatile: Switch to assigned clock parents
  ARM: dts: vexpress: Use assigned-clock-parents for sp810
  ARM64: dts: vexpress: Use assigned-clock-parents for sp810

 arch/arm/boot/dts/vexpress-v2m-rs1.dtsi  |  2 +
 arch/arm/boot/dts/vexpress-v2m.dtsi  |  2 +
 arch/arm64/boot/dts/arm/juno-motherboard.dtsi|  2 +
 arch/arm64/boot/dts/arm/rtsm_ve-motherboard.dtsi |  2 +
 drivers/clk/versatile/clk-sp810.c| 76 +---
 5 files changed, 23 insertions(+), 61 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH 1/6] staging/lustre: Drop SEEK_* definition checks

2015-07-31 Thread green
From: Oleg Drokin 

SEEK_DATA and SEEK_HOLE are always defined in the kernel, so
drop the definition checks.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/linux/lustre_compat25.h | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h 
b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
index f16ba9c..502c7cc 100644
--- a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
+++ b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
@@ -121,13 +121,6 @@ static inline int ll_quota_off(struct super_block *sb, int 
off, int remount)
 # define NO_QUOTA (-EDQUOT)
 #endif
 
-#ifndef SEEK_DATA
-#define SEEK_DATA  3   /* seek to the next data */
-#endif
-#ifndef SEEK_HOLE
-#define SEEK_HOLE  4   /* seek to the next hole */
-#endif
-
 #ifndef FMODE_UNSIGNED_OFFSET
 #define FMODE_UNSIGNED_OFFSET  ((__force fmode_t)0x2000)
 #endif
-- 
2.1.0



[PATCH 2/6] staging/lustre: remove unused ll_quota_on and ll_quota_off

2015-07-31 Thread green
From: Oleg Drokin 

They are not used anywhere, so they are safe to drop.

Signed-off-by: Oleg Drokin 
---
 .../lustre/lustre/include/linux/lustre_compat25.h  | 31 --
 1 file changed, 31 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h 
b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
index 502c7cc..9739611 100644
--- a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
+++ b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
@@ -69,37 +69,6 @@
 # define inode_dio_read(i) atomic_inc(&(i)->i_dio_count)
 /* inode_dio_done(i) use as-is for read unlock */
 
-static inline int
-ll_quota_on(struct super_block *sb, int off, int ver, char *name, int remount)
-{
-   int rc;
-
-   if (sb->s_qcop->quota_on) {
-   struct path path;
-
-   rc = kern_path(name, LOOKUP_FOLLOW, );
-   if (!rc)
-   return rc;
-   rc = sb->s_qcop->quota_on(sb, off, ver
-   , 
-  );
-   path_put();
-   return rc;
-   } else
-   return -ENOSYS;
-}
-
-static inline int ll_quota_off(struct super_block *sb, int off, int remount)
-{
-   if (sb->s_qcop->quota_off) {
-   return sb->s_qcop->quota_off(sb, off
-   );
-   } else
-   return -ENOSYS;
-}
-
-
-
 #define ll_d_hlist_node hlist_node
 #define ll_d_hlist_empty(list) hlist_empty(list)
 #define ll_d_hlist_entry(ptr, type, name) hlist_entry(ptr.first, type, name)
-- 
2.1.0



[PATCH 4/6] staging/lustre: Use hlist primitives directly

2015-07-31 Thread green
From: Oleg Drokin 

Get rid of ll_d_hlist* compat defines.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/linux/lustre_compat25.h |  9 -
 drivers/staging/lustre/lustre/llite/dcache.c  |  3 +--
 drivers/staging/lustre/lustre/llite/namei.c   | 10 --
 3 files changed, 5 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h 
b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
index b37856c..43fa2a9 100644
--- a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
+++ b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
@@ -69,15 +69,6 @@
 # define inode_dio_read(i) atomic_inc(&(i)->i_dio_count)
 /* inode_dio_done(i) use as-is for read unlock */
 
-#define ll_d_hlist_node hlist_node
-#define ll_d_hlist_empty(list) hlist_empty(list)
-#define ll_d_hlist_entry(ptr, type, name) hlist_entry(ptr.first, type, name)
-#define ll_d_hlist_for_each(tmp, i_dentry) hlist_for_each(tmp, i_dentry)
-#define ll_d_hlist_for_each_entry(dentry, p, i_dentry, alias) \
-   p = NULL; hlist_for_each_entry(dentry, i_dentry, alias)
-
-
-
 #define ll_pagevec_init(pv, cold)   do {} while (0)
 #define ll_pagevec_add(pv, pg)   (0)
 #define ll_pagevec_lru_add_file(pv) do {} while (0)
diff --git a/drivers/staging/lustre/lustre/llite/dcache.c 
b/drivers/staging/lustre/lustre/llite/dcache.c
index 7b008a6..b866859 100644
--- a/drivers/staging/lustre/lustre/llite/dcache.c
+++ b/drivers/staging/lustre/lustre/llite/dcache.c
@@ -250,7 +250,6 @@ void ll_intent_release(struct lookup_intent *it)
 void ll_invalidate_aliases(struct inode *inode)
 {
struct dentry *dentry;
-   struct ll_d_hlist_node *p;
 
LASSERT(inode != NULL);
 
@@ -258,7 +257,7 @@ void ll_invalidate_aliases(struct inode *inode)
   inode->i_ino, inode->i_generation, inode);
 
ll_lock_dcache(inode);
-   ll_d_hlist_for_each_entry(dentry, p, >i_dentry, d_u.d_alias) {
+   hlist_for_each_entry(dentry, >i_dentry, d_u.d_alias) {
CDEBUG(D_DENTRY, "dentry in drop %pd (%p) parent %p inode %p 
flags %d\n",
   dentry, dentry, dentry->d_parent,
   d_inode(dentry), dentry->d_flags);
diff --git a/drivers/staging/lustre/lustre/llite/namei.c 
b/drivers/staging/lustre/lustre/llite/namei.c
index 2ed1e0a..05e7dc8 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -144,10 +144,9 @@ struct inode *ll_iget(struct super_block *sb, ino_t hash,
 static void ll_invalidate_negative_children(struct inode *dir)
 {
struct dentry *dentry, *tmp_subdir;
-   struct ll_d_hlist_node *p;
 
ll_lock_dcache(dir);
-   ll_d_hlist_for_each_entry(dentry, p, >i_dentry, d_u.d_alias) {
+   hlist_for_each_entry(dentry, >i_dentry, d_u.d_alias) {
spin_lock(>d_lock);
if (!list_empty(>d_subdirs)) {
struct dentry *child;
@@ -334,15 +333,14 @@ void ll_i2gids(__u32 *suppgids, struct inode *i1, struct 
inode *i2)
 static struct dentry *ll_find_alias(struct inode *inode, struct dentry *dentry)
 {
struct dentry *alias, *discon_alias, *invalid_alias;
-   struct ll_d_hlist_node *p;
 
-   if (ll_d_hlist_empty(>i_dentry))
+   if (hlist_empty(>i_dentry))
return NULL;
 
discon_alias = invalid_alias = NULL;
 
ll_lock_dcache(inode);
-   ll_d_hlist_for_each_entry(alias, p, >i_dentry, d_u.d_alias) {
+   hlist_for_each_entry(alias, >i_dentry, d_u.d_alias) {
LASSERT(alias != dentry);
 
spin_lock(>d_lock);
@@ -690,7 +688,7 @@ static struct inode *ll_create_node(struct inode *dir, 
struct lookup_intent *it)
goto out;
}
 
-   LASSERT(ll_d_hlist_empty(>i_dentry));
+   LASSERT(hlist_empty(>i_dentry));
 
/* We asked for a lock on the directory, but were granted a
 * lock on the inode.  Since we finally have an inode pointer,
-- 
2.1.0



[PATCH 5/6] staging/lustre: Get rid of ll_pagevec_ macros

2015-07-31 Thread green
From: Oleg Drokin 

They are no-ops anyway.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/linux/lustre_compat25.h | 5 -
 drivers/staging/lustre/lustre/llite/dir.c | 4 
 2 files changed, 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h 
b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
index 43fa2a9..157bafb 100644
--- a/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
+++ b/drivers/staging/lustre/lustre/include/linux/lustre_compat25.h
@@ -69,11 +69,6 @@
 # define inode_dio_read(i) atomic_inc(&(i)->i_dio_count)
 /* inode_dio_done(i) use as-is for read unlock */
 
-#define ll_pagevec_init(pv, cold)   do {} while (0)
-#define ll_pagevec_add(pv, pg)   (0)
-#define ll_pagevec_lru_add_file(pv) do {} while (0)
-
-
 #ifndef QUOTA_OK
 # define QUOTA_OK 0
 #endif
diff --git a/drivers/staging/lustre/lustre/llite/dir.c 
b/drivers/staging/lustre/lustre/llite/dir.c
index 3d746a9..769b611 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -203,7 +203,6 @@ static int ll_dir_filler(void *_hash, struct page *page0)
 
CDEBUG(D_VFSTRACE, "read %d/%d pages\n", nrdpgs, npages);
 
-   ll_pagevec_init(_pvec, 0);
for (i = 1; i < npages; i++) {
unsigned long offset;
int ret;
@@ -228,15 +227,12 @@ static int ll_dir_filler(void *_hash, struct page *page0)
GFP_KERNEL);
if (ret == 0) {
unlock_page(page);
-   if (ll_pagevec_add(_pvec, page) == 0)
-   ll_pagevec_lru_add_file(_pvec);
} else {
CDEBUG(D_VFSTRACE, "page %lu add to page cache failed: 
%d\n",
   offset, ret);
}
page_cache_release(page);
}
-   ll_pagevec_lru_add_file(_pvec);
 
if (page_pool != )
kfree(page_pool);
-- 
2.1.0



[PATCH 0/6] Lustre compat code removal

2015-07-31 Thread green
From: Oleg Drokin 

This is take two for the compat stuff removal.

Oleg Drokin (6):
  staging/lustre: Drop SEEK_* definition checks
  staging/lustre: remove unused ll_quota_on and ll_quota_off
  staging/lustre: Drop FMODE_UNSIGNED_OFFSET define
  staging/lustre: Use hlist primitives directly
  staging/lustre: Get rid of ll_pagevec_ macros
  staging/lustre: Get rid of inode_dio_write_done and inode_dio_read

 .../lustre/lustre/include/linux/lustre_compat25.h  | 61 --
 drivers/staging/lustre/lustre/llite/dcache.c   |  3 +-
 drivers/staging/lustre/lustre/llite/dir.c  |  4 --
 drivers/staging/lustre/lustre/llite/llite_lib.c|  5 +-
 drivers/staging/lustre/lustre/llite/namei.c| 10 ++--
 drivers/staging/lustre/lustre/llite/vvp_io.c   |  5 +-
 6 files changed, 8 insertions(+), 80 deletions(-)

-- 
2.1.0



unpinning an unpinned lock warning triggered by generic_file_fsync()

2015-07-31 Thread Bart Van Assche
Hello,

The warning below was triggered while I was testing SRP initiator changes on
top of v4.2-rc4. I don't think that this warning was triggered by my changes.
Has anyone seen this before?

Thanks,

Bart.

Jul 31 16:02:37 srp-ini multipathd: mpatha: load table [0 5859375088 multipath 
3 queue_if_no_path pg_init_retries 50 1 alua 2 1 queue-length 0 7 1 8:240 1 
8:224 1 65:0 1 66:160 1 66:224 1 67:16 1 66:240 1 queue-length 0 8 1 8:32 1 
8:48 1 8:96 1 65:224 1 65:112 1 65:240 1 66:0 1 67:160 1]
Jul 31 16:02:37 srp-ini multipathd: sdbg [67:160]: path added to devmap mpatha
Jul 31 16:02:37 srp-ini multipathd: sdbh: add path (uevent)
Jul 31 16:02:37 srp-ini kernel: sd 14:0:0:0: alua: port group 101 state S 
preferred supports tOlUSNA
Jul 31 16:02:37 srp-ini kernel: [ cut here ]
Jul 31 16:02:37 srp-ini kernel: WARNING: CPU: 2 PID: 9262 at 
kernel/locking/lockdep.c:3497 lock_unpin_lock+0x105/0x110()
Jul 31 16:02:37 srp-ini kernel: unpinning an unpinned lock
Jul 31 16:02:37 srp-ini kernel: Modules linked in: dm_queue_length scsi_dh_alua 
af_packet xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT 
nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc 
ebtable_filter ebtables ip6table_nat nf_con
ntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw 
ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 
nf_nat_ipv4 nf_nat nf_conntrack msr iptable_mangle iptable_raw iptable_filter 
ip_tables x_tables sg ib_srp scsi_transport_srp x86_pk
g_temp_thermal coretemp crct10dif_pclmul crc32c_intel aesni_intel aes_x86_64 
glue_helper lrw gf128mul ablk_helper cryptd microcode pcspkr tg3 libphy lpc_ich 
mfd_core ib_ipoib wmi acpi_power_meter rdma_ucm ib_ucm ib_uverbs ib_umad 
rdma_cm ib_cm iw_cm processor thermal_sys 
hwmon button dm_multipath scsi_dh dm_mod ext4 crc16 mbcache jbd2 mlx4_en ptp 
pps_core mlx4_ib ib_sa ib_mad ib_core ib_addr sd_mod sr_mod cdrom hid_generic 
usbhid hid ahci libahci libata ehci_pci ehci_hcd mlx4_core usbcore usb_common 
scsi_mod autofs4
Jul 31 16:02:37 srp-ini kernel: CPU: 2 PID: 9262 Comm: kworker/2:98 Not tainted 
4.2.0-rc4-debug+ #1
Jul 31 16:02:37 srp-ini kernel: Hardware name: Dell Inc. PowerEdge R430/03XKDV, 
BIOS 1.0.2 11/17/2014
Jul 31 16:02:37 srp-ini kernel: Workqueue: dio/dm-2 dio_aio_complete_work
Jul 31 16:02:37 srp-ini kernel: 817a80ba 88006ea03a58 
814f9cde 
Jul 31 16:02:37 srp-ini kernel: 88006ea03aa8 88006ea03a98 
810746ba 88006ea03aa8
Jul 31 16:02:37 srp-ini kernel: 0003 88040d962c80 
88047fc95618 0092
Jul 31 16:02:37 srp-ini kernel: Call Trace:
Jul 31 16:02:37 srp-ini kernel: [] dump_stack+0x4c/0x65
Jul 31 16:02:37 srp-ini kernel: [] 
warn_slowpath_common+0x8a/0xc0
Jul 31 16:02:37 srp-ini kernel: [] warn_slowpath_fmt+0x46/0x50
Jul 31 16:02:37 srp-ini kernel: [] ? __lock_is_held+0x4d/0x70
Jul 31 16:02:37 srp-ini kernel: [] lock_unpin_lock+0x105/0x110
Jul 31 16:02:37 srp-ini kernel: [] __schedule+0x414/0xa90
Jul 31 16:02:37 srp-ini kernel: [] schedule+0x3e/0x90
Jul 31 16:02:37 srp-ini kernel: [] 
schedule_preempt_disabled+0x15/0x20
Jul 31 16:02:37 srp-ini kernel: [] 
mutex_lock_nested+0x156/0x3a0
Jul 31 16:02:37 srp-ini kernel: [] ? 
__generic_file_fsync+0x44/0x90
Jul 31 16:02:37 srp-ini kernel: [] 
__generic_file_fsync+0x44/0x90
Jul 31 16:02:37 srp-ini kernel: [] 
generic_file_fsync+0x1d/0x40
Jul 31 16:02:37 srp-ini kernel: [] ext4_sync_file+0x268/0x610 
[ext4]
Jul 31 16:02:37 srp-ini kernel: [] vfs_fsync_range+0x3d/0xb0
Jul 31 16:02:37 srp-ini kernel: [] dio_complete+0x141/0x180
Jul 31 16:02:37 srp-ini kernel: [] 
dio_aio_complete_work+0x27/0x30
Jul 31 16:02:37 srp-ini kernel: [] 
process_one_work+0x1d8/0x7c0
Jul 31 16:02:37 srp-ini kernel: [] ? 
process_one_work+0x14b/0x7c0
Jul 31 16:02:37 srp-ini kernel: [] worker_thread+0x114/0x460
Jul 31 16:02:37 srp-ini kernel: [] ? 
process_one_work+0x7c0/0x7c0
Jul 31 16:02:37 srp-ini kernel: [] kthread+0xf8/0x110
Jul 31 16:02:37 srp-ini kernel: [] ? 
kthread_create_on_node+0x210/0x210
Jul 31 16:02:37 srp-ini kernel: [] ret_from_fork+0x3f/0x70
Jul 31 16:02:37 srp-ini kernel: [] ? 
kthread_create_on_node+0x210/0x210
Jul 31 16:02:37 srp-ini kernel: ---[ end trace 50dfc2f612b77d64 ]---
Jul 31 16:02:38 srp-ini kernel: sd 14:0:0:0: alua: stpg sense code: 04/67/0a
Jul 31 16:02:38 srp-ini kernel: device-mapper: multipath: Failing path 8:240.
Jul 31 16:02:38 srp-ini kernel: sd 13:0:0:0: alua: port group 101 state S 
preferred supports tOlUSNA


Re: [PATCH v5 1/2] perf,kvm/ppc: Add kvm_perf.h for powerpc

2015-07-31 Thread Scott Wood
[Added KVM lists and a couple relevant people]

On Fri, 2015-07-31 at 14:25 +0530, Hemant Kumar wrote:
> On 07/30/2015 03:52 AM, Scott Wood wrote:
> > On Wed, 2015-07-29 at 16:07 +0530, Hemant Kumar wrote:
> > > Hi Scott,
> > > 
> > > On 07/17/2015 01:40 AM, Scott Wood wrote:
> > > > On Thu, 2015-07-16 at 21:18 +0530, Hemant Kumar wrote:
> > > > > > To analyze the exit events with perf, we need kvm_perf.h to be added in
> > > > > > the arch/powerpc directory, where the kvm tracepoints needed to trace
> > > > > > the KVM exit events are defined.
> > > > > > 
> > > > > > This patch adds "kvm_perf_book3s.h" to indicate that the tracepoints are
> > > > > > book3s specific. Generic "kvm_perf.h" then can just include
> > > > > > "kvm_perf_book3s.h".
> > > > > 
> > > > > Signed-off-by: Hemant Kumar 
> > > > > ---
> > > > > Changes:
> > > > > - Not exporting the exit reasons compared to previous patchset
> > > > > (suggested
> > > > > by Paul)
> > > > > 
> > > > >arch/powerpc/include/uapi/asm/kvm_perf.h|  6 ++
> > > > >arch/powerpc/include/uapi/asm/kvm_perf_book3s.h | 14 
> > > > > ++
> > > > >2 files changed, 20 insertions(+)
> > > > >create mode 100644 arch/powerpc/include/uapi/asm/kvm_perf.h
> > > > >create mode 100644 
> > > > > arch/powerpc/include/uapi/asm/kvm_perf_book3s.h
> > > > > 
> > > > > diff --git a/arch/powerpc/include/uapi/asm/kvm_perf.h
> > > > > b/arch/powerpc/include/uapi/asm/kvm_perf.h
> > > > > new file mode 100644
> > > > > index 000..5ed2ff3
> > > > > --- /dev/null
> > > > > +++ b/arch/powerpc/include/uapi/asm/kvm_perf.h
> > > > > @@ -0,0 +1,6 @@
> > > > > +#ifndef _ASM_POWERPC_KVM_PERF_H
> > > > > +#define _ASM_POWERPC_KVM_PERF_H
> > > > > +
> > > > > +#include 
> > > > > +
> > > > > +#endif
> > > > > diff --git a/arch/powerpc/include/uapi/asm/kvm_perf_book3s.h
> > > > > b/arch/powerpc/include/uapi/asm/kvm_perf_book3s.h
> > > > > new file mode 100644
> > > > > index 000..8c8d8c2
> > > > > --- /dev/null
> > > > > +++ b/arch/powerpc/include/uapi/asm/kvm_perf_book3s.h
> > > > > @@ -0,0 +1,14 @@
> > > > > +#ifndef _ASM_POWERPC_KVM_PERF_BOOK3S_H
> > > > > +#define _ASM_POWERPC_KVM_PERF_BOOK3S_H
> > > > > +
> > > > > +#include 
> > > > > +
> > > > > +#define DECODE_STR_LEN 20
> > > > > +
> > > > > +#define VCPU_ID "vcpu_id"
> > > > > +
> > > > > +#define KVM_ENTRY_TRACE "kvm_hv:kvm_guest_enter"
> > > > > +#define KVM_EXIT_TRACE "kvm_hv:kvm_guest_exit"
> > > > > +#define KVM_EXIT_REASON "trap"
> > > > > +
> > > > > +#endif /* _ASM_POWERPC_KVM_PERF_BOOK3S_H */
> > > > Again, why is book3s stuff being presented via uapi as generic
> > > >  with generic symbol names?
> > > > 
> > > > -Scott
> > > Ok.
> > > 
> > > We can change the KVM_ENTRY_TRACE macro to something like
> > > KVM_BOOK3S_ENTRY_TRACE and likewise for KVM_EXIT_TRACE
> > > and KVM_EXIT_REASON
> > What about DECODE_STR_LEN and VCPU_ID?
> 
> DECODE_STR_LEN can be common, we can give a big enough size to it, if
> we need to.
> And, VCPU_ID depends on the field in the tracepoint payload data which is
> specific to that tracepoint. This field is used to maintain the per vcpu record
> and this field gives us the vcpu id. So, yeah, I guess, since, I can't find any
> such field as "vcpu_id" in the kvm_exit tracepoint for book3e, we have to
> make this specific to book3s.

Or maybe we could add kvm_guest_enter/kvm_guest_leave, with vcpu_id, to 
book3e... though the kvm-hv would be a problem for book3s-pr, if anyone cares 
about this feature there.

I'm not sure why the strings are present both in the UAPI header and in
kvm_events_tp[] in kvm-stat.c.

> > Where is this API documented?
> > 
> > >   and then, to resolve the issue of generic
> > > macro names in the userspace side, we can handle it using __weak
> > > modifier.
> > Does userspace get built differently for book3s versus book3e?  For now it'd
> > be fine for userspace to check for book3s and not use the feature if it's
> > book3e.  If and when book3e gains this feature, then userspace can be changed.
> 
> Well, I couldn't find any way to build user space differently for book3s and
> book3e.
> 
> How about keeping this as it is after modifying the tracepoint macro names
> to book3s specific in the uapi? And as and when booke decides to implement
> this feature, a runtime check for event availability can be added then, IMHO.
> 
> What do you think?

What does userspace use, at runtime, to determine if this feature is present 
and whether the book3s symbols should be used?

Deferring the implementation of book3e support is fine, but from a uapi 
perspective it should be discoverable at runtime whether the feature exposed 
by asm/kvm_perf_book3s.h is available.  Otherwise, if it is implemented (or 
even if it isn't), you have the potential for user confusion if an older perf 
tool is used.  This sort of discovery is done all the time in the KVM APIs 
themselves.
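
One way a tool could do such a runtime check is to probe tracefs for the event before relying on the book3s-specific symbols. This is only a minimal sketch: tracepoint_available() is a hypothetical helper (not something in perf), and the location of the tracefs "events" directory is an assumption that depends on how the system mounts tracefs.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 1 if <events_dir>/<system>/<event>/id is
 * readable, 0 otherwise.  events_dir is the tracefs "events" directory;
 * where it lives is an assumption and varies between systems.
 */
static int tracepoint_available(const char *events_dir, const char *system,
                                const char *event)
{
    char path[512];
    int n = snprintf(path, sizeof(path), "%s/%s/%s/id",
                     events_dir, system, event);

    if (n < 0 || (size_t)n >= sizeof(path))
        return 0;   /* path did not fit; treat the event as unavailable */

    return access(path, R_OK) == 0;
}
```

A caller would pass the event names discussed in this thread (for example kvm_guest_enter) and fall back to not using the feature when the helper returns 0, which also covers an older perf tool running on a kernel without the tracepoints.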

FWIW, on x86 

Re: [PATCH v9 2/5] mtd: nand: vf610_nfc: add hardware BCH-ECC support

2015-07-31 Thread Stefan Agner
Hi Brian,

On 2015-08-01 01:09, Brian Norris wrote:

>> +static inline int vf610_nfc_correct_data(struct mtd_info *mtd, uint8_t *dat)
>> +{
>> +struct vf610_nfc *nfc = mtd_to_nfc(mtd);
>> +u8 ecc_status;
>> +u8 ecc_count;
>> +int flip;
>> +
>> +ecc_status = __raw_readb(nfc->regs + ECC_SRAM_ADDR * 8 + ECC_OFFSET);
>> +ecc_count = ecc_status & ECC_ERR_COUNT;
>> +if (!(ecc_status & ECC_STATUS_MASK))
>> +return ecc_count;
>> +
>> +/*
>> + * On an erased page, bit count should be zero or at least
>> + * less than half of the ECC strength
>> + */
>> +flip = count_written_bits(dat, nfc->chip.ecc.size, ecc_count);
> 
> Sorry I didn't notice this earlier, but it appears you are falling into
> the same trap that almost everyone else is -- it is not sufficient to
> check just the page area; you also need to check the OOB. Suppose that
> a MTD user wrote mostly-0xff data to the page, then the page accumulates
> bitflips in the spare area and a few in the page area, such that
> eventually HW ECC can't correct them. If there are few enough zero bits
> in the data area, you will mistakenly think that this is a blank page
> below, and memset() it to 0xff. That would be disastrous!
> 
> Fortunately, your code is otherwise quite well structured and looks
> good. A tip below.
> 
>> +
>> +if (flip > ecc_count && flip > (nfc->chip.ecc.strength / 2))
>> +return -1;
>> +
>> +/* Erased page. */
>> +memset(dat, 0xff, nfc->chip.ecc.size);
>> +return 0;
>> +}
>> +
>> +static int vf610_nfc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
>> +uint8_t *buf, int oob_required, int page)
>> +{
>> +int eccsize = chip->ecc.size;
>> +int stat;
>> +
>> +vf610_nfc_read_buf(mtd, buf, eccsize);
>> +
>> +if (oob_required)
>> +vf610_nfc_read_buf(mtd, chip->oob_poi, mtd->oobsize);
> 
> To fix the bitflips issue above, you'll just want to unconditionally
> read the OOB (it's fine to ignore 'oob_required') and...
> 
>> +
>> +stat = vf610_nfc_correct_data(mtd, buf);
> 
> ...pass in chip->oob_poi as a third argument.
> 

Hm, this probably will have an effect on performance, since we usually
omit the OOB if not requested. I could fetch the OOB from the NAND
controllers SRAM only if necessary (if HW ECC status is not ok...). Does
this sound reasonable?
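
For reference, the check Brian describes (count zero bits across the page data and the OOB, and only then declare the page erased) could look roughly like the sketch below. It is plain C with hypothetical names (count_zero_bits, check_erased_page) and a caller-supplied threshold; it is not the driver code, and it assumes the OOB has already been fetched, for example only after the HW ECC status reported a failure.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Count zero ("written") bits in buf.  Stops scanning once the count
 * exceeds limit, so the result may be any value greater than limit.
 */
static int count_zero_bits(const uint8_t *buf, size_t len, int limit)
{
    int zeros = 0;

    for (size_t i = 0; i < len; i++) {
        uint8_t inv = (uint8_t)~buf[i];

        while (inv) {
            inv &= (uint8_t)(inv - 1);  /* clear the lowest set bit */
            zeros++;
        }
        if (zeros > limit)
            break;
    }
    return zeros;
}

/*
 * Treat the page as erased only if data and OOB together contain at most
 * max_zero_bits zero bits.  Returns 0 and resets both buffers to 0xff in
 * that case; returns -1 (buffers untouched) otherwise.
 */
static int check_erased_page(uint8_t *data, size_t data_len,
                             uint8_t *oob, size_t oob_len,
                             int max_zero_bits)
{
    int zeros = count_zero_bits(data, data_len, max_zero_bits);

    if (zeros <= max_zero_bits)
        zeros += count_zero_bits(oob, oob_len, max_zero_bits - zeros);

    if (zeros > max_zero_bits)
        return -1;

    memset(data, 0xff, data_len);
    memset(oob, 0xff, oob_len);
    return 0;
}
```

With this shape, a mostly-0xff data area whose OOB carries real ECC bytes is reported as a failure instead of being silently overwritten with 0xff.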

>> +
>> +if (stat < 0)
>> +mtd->ecc_stats.failed++;
>> +else
>> +mtd->ecc_stats.corrected += stat;
> 
> You've got another problem here: ecc.read_page() should be returning
> 'max_bitflips' here. So, since you have a single ECC region, this block
> should probably be:
> 
>   if (stat < 0) {
>   mtd->ecc_stats.failed++;
>   return 0;
>   } else {
>   mtd->ecc_stats.corrected += stat;
>   return stat;
>   }
> 

Ok, will change that.

--
Stefan

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 06/13] staging/lustre: Drop SEEK_* definition checks

2015-07-31 Thread Oleg Drokin

On Jul 31, 2015, at 7:00 PM, Greg Kroah-Hartman wrote:

> On Thu, Jul 30, 2015 at 06:27:57PM -0400, gr...@linuxhacker.ru wrote:
>> From: Oleg Drokin 
>> 
>> SEEK_DATA and SEEK_HOLE are always defined in the kernel,
>> drop the definition checks
> This code removal doesn't match up with the changelog text, so I can't
> take this :(

Whoops.
Fallout from a mistaken rebase, I guess.
I'll resubmit in a moment.



Re: [PATCH 1/1] suspend: make sync() on suspend-to-RAM build-time optional

2015-07-31 Thread Rafael J. Wysocki
On Friday, July 31, 2015 12:46:17 PM Len Brown wrote:
> From: Len Brown 
> 
> The Linux kernel suspend path has traditionally invoked sys_sync()
> before freezing user threads.
> 
> But sys_sync() can be expensive, and some user-space OS's do not want
> the kernel to pay the cost of sys_sync() on every suspend -- preferring
> to invoke sync() from user-space if/when they want it.
> 
> So make sys_sync on suspend build-time optional.
> 
> The default is unchanged.
> 
> Signed-off-by: Len Brown 

OK, I'm replacing your previous sync patch with this one (modulo some minor
changes in the help text).

Thanks,
Rafael



Re: [PATCH 1/1] suspend: make sync() on suspend-to-RAM optional

2015-07-31 Thread Rafael J. Wysocki
On Friday, July 31, 2015 12:02:36 PM Len Brown wrote:
> On Wed, Jul 22, 2015 at 4:55 AM, Oliver Neukum  wrote:
> > On Wed, 2015-07-22 at 03:25 +0200, Rafael J. Wysocki wrote:
> >> And it is more pain for me to change the user space on each of them to
> >> write to the new sysfs file on every boot than to set a kernel Kconfig
> >> option once.
> >
> > So why at all? If you really need this in sysfs, why not write
> > something like "memfast" into /sys/power/state ?
> 
> We fought this battle, and lost.
> 
> When we came out with "freeze", which is faster than "mem",
> no user-space changed to take advantage of it.

I do think that Chrome is going to use "freeze", so maybe it's not a lost
battle after all?

The problem with "memfast" and similar things is we'd also need "freezefast"
and "standbyfast" then, for consistency if nothing else, which makes little
sense to me.

BTW, it should be noted that the whole "sync in the kernel is better, because
it doesn't race with user space writing to disks" argument was completely
bogus and useless, because in fact the sync in the kernel is done before
freezing user space, which means that it is susceptible to the very same
race condition as the sync from user space.

So if your user space does the sync before suspending, the next one in the
kernel is completely useless.
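
To make the user-space variant concrete, here is a minimal sketch of "sync, then ask for the sleep state". sync_then_suspend() is a hypothetical helper; on a real system state_path would be /sys/power/state, but it is a parameter here so the sketch can be tried against an ordinary file. Anything written by other tasks after the sync() call is not covered, which is the same window the in-kernel sync has.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Flush dirty data from user space, then request a sleep state by writing
 * its name (e.g. "mem" or "freeze") to state_path.
 * Returns 0 on success, -1 on failure.
 */
static int sync_then_suspend(const char *state_path, const char *state)
{
    FILE *f;
    int ok;

    sync();     /* user-space sync; later writes by other tasks can still race */

    f = fopen(state_path, "w");
    if (!f)
        return -1;

    ok = fputs(state, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```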

Thanks,
Rafael



Re: [PATCHv3 10/10] xen/balloon: pre-allocate p2m entries for ballooned pages

2015-07-31 Thread Daniel Kiper
On Thu, Jul 30, 2015 at 06:03:12PM +0100, David Vrabel wrote:
> Pages returned by alloc_xenballooned_pages() will be used for grant
> mapping which will call set_phys_to_machine() (in PV guests).
>
> Ballooned pages are set as INVALID_P2M_ENTRY in the p2m and thus may
> be using the (shared) missing tables and a subsequent
> set_phys_to_machine() will need to allocate new tables.
>
> Since the grant mapping may be done from a context that cannot sleep,
> the p2m entries must already be allocated.
>
> Signed-off-by: David Vrabel 

Reviewed-by: Daniel Kiper 

Daniel

PS FYI, next week I am on vacation.


Re: [PATCH] mm: add the block to the tail of the list in expand()

2015-07-31 Thread Dave Hansen
On 07/31/2015 02:30 AM, Xishi Qiu wrote:
> __free_one_page() will judge whether the next-highest order is free,
> then add the block to the tail or not. So when we split large order block,
> add the small block to the tail, it will reduce fragmentation.

It's an interesting idea, but what does it do in practice?  Can you
measure a decrease in fragmentation?

Further, the comment above the function says:
 * The order of subdivision here is critical for the IO subsystem.
 * Please do not alter this order without good reasons and regression
 * testing.

Has there been regression testing?

Also, this might not do very much good in practice.  If you are
splitting a high-order page, you are doing the split because the
lower-order lists are empty.  So won't that list_add() be to an empty
list most of the time?  Or does the __rmqueue_fallback()
largest->smallest logic dominate?
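
To illustrate the empty-list point: with a circular doubly linked list of the kind the kernel uses, adding at the head and adding at the tail give the same result when the list is empty, and only differ once something is already on it. The sketch below is a stand-alone user-space re-implementation with the same semantics as list_add()/list_add_tail(); it is not the kernel's list.h.

```c
/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Link entry between two known-adjacent nodes. */
static void list_insert_between(struct list_head *entry,
                                struct list_head *prev,
                                struct list_head *next)
{
    next->prev = entry;
    entry->next = next;
    entry->prev = prev;
    prev->next = entry;
}

/* Insert right after head (front of the list). */
static void list_add(struct list_head *entry, struct list_head *head)
{
    list_insert_between(entry, head, head->next);
}

/* Insert right before head (back of the list). */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    list_insert_between(entry, head->prev, head);
}
```

So the proposed change can only matter for allocations where the lower-order free list is non-empty at the time of the split.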


Re: [PATCHv3 09/10] x86/xen: export xen_alloc_p2m_entry()

2015-07-31 Thread Daniel Kiper
On Thu, Jul 30, 2015 at 06:03:11PM +0100, David Vrabel wrote:
> Rename alloc_p2m() to xen_alloc_p2m_entry() and export it.
>
> This is useful for ensuring that a p2m entry is allocated (i.e., not a
> shared missing or identity entry) so that subsequent set_phys_to_machine()
> calls will require no further allocations.
>
> Signed-off-by: David Vrabel 

Reviewed-by: Daniel Kiper 

Daniel


Re: [PATCH 8/9] x86/intel_rdt: Hot cpu support for Cache Allocation

2015-07-31 Thread Vikas Shivappa



On Wed, 29 Jul 2015, Peter Zijlstra wrote:


On Wed, Jul 01, 2015 at 03:21:09PM -0700, Vikas Shivappa wrote:

+/*
+ * cbm_update_msrs() - Updates all the existing IA32_L3_MASK_n MSRs
+ * which are one per CLOSid except IA32_L3_MASK_0 on the current package.
+ */
+static void cbm_update_msrs(void *info)
+{
+   int maxid = boot_cpu_data.x86_cache_max_closid;
+   unsigned int i;
+
+   /*
+* At cpureset, all bits of IA32_L3_MASK_n are set.
+* The index starts from one as there is no need
+* to update IA32_L3_MASK_0 as it belongs to root cgroup
+* whose cache mask is all 1s always.
+*/
+   for (i = 1; i < maxid; i++) {
+   if (ccmap[i].clos_refcnt)
+   cbm_cpu_update((void *)i);
+   }
+}
+
+static inline void intel_rdt_cpu_start(int cpu)
+{
+   struct intel_pqr_state *state = _cpu(pqr_state, cpu);
+
+   state->closid = 0;
+   mutex_lock(_group_mutex);
+   if (rdt_cpumask_update(cpu))
+   smp_call_function_single(cpu, cbm_update_msrs, NULL, 1);
+   mutex_unlock(_group_mutex);
+}


If you were to guard your array with both a mutex and a raw_spinlock
then you can avoid the IPI and use CPU_STARTING.


CPU_ONLINE was just good enough, as the tasks would be ready to be scheduled;
IOW, it's just at the right time.

Couldn't we avoid using the interrupt-disabled time? We don't really need the
*interrupt disabled* CPU_STARTING notification; that can be left for more
important or lock-free code. Or should this change not be a big concern?





+static int intel_rdt_cpu_notifier(struct notifier_block *nb,
+ unsigned long action, void *hcpu)
+{
+   unsigned int cpu  = (unsigned long)hcpu;
+
+   switch (action) {
+   case CPU_DOWN_FAILED:
+   case CPU_ONLINE:
+   intel_rdt_cpu_start(cpu);
+   break;
+   case CPU_DOWN_PREPARE:
+   intel_rdt_cpu_exit(cpu);
+   break;
+   default:
+   break;
+   }
+
+   return NOTIFY_OK;
 }





Re: [PATCHv3 08/10] xen/balloon: use hotplugged pages for foreign mappings etc.

2015-07-31 Thread Daniel Kiper
On Thu, Jul 30, 2015 at 06:03:10PM +0100, David Vrabel wrote:
> alloc_xenballooned_pages() is used to get ballooned pages to back
> foreign mappings etc.  Instead of having to balloon out real pages,
> use (if supported) hotplugged memory.
>
> This makes more memory available to the guest and reduces
> fragmentation in the p2m.
>
> This is only enabled if the xen.balloon.hotplug_unpopulated sysctl is
> set to 1.  This sysctl defaults to 0 in case the udev rules to
> automatically online hotplugged memory do not exist.
>
> Signed-off-by: David Vrabel 

Reviewed-by: Daniel Kiper 

Daniel


Re: [PATCH 1/9] x86/intel_cqm: Modify hot cpu notification handling

2015-07-31 Thread Vikas Shivappa



On Wed, 29 Jul 2015, Peter Zijlstra wrote:


On Wed, Jul 01, 2015 at 03:21:02PM -0700, Vikas Shivappa wrote:

+/*
+ * Temporary cpumask used during hot cpu notification handling. The usage
+ * is serialized by hot cpu locks.
+ */
+static cpumask_t tmp_cpumask;


So the problem with this is that it's 512 bytes on your general distro
config. And this patch set includes at least 3 of them.

So you've just shot 1k5 bytes of .data for no reason.

I know tglx whacked you over the head for this, but is this really worth
it? I mean, nobody sane should care about hotplug performance, so who
cares if we iterate a bunch of cpus on the abysmal slow path called
hotplug.
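
The arithmetic behind those numbers can be sketched as follows. A static cpumask is a bitmap with one bit per possible CPU, rounded up to whole longs. The value 4096 for the CPU limit is an assumption inferred from the 512-byte figure (the mail does not state the config), and BITS_TO_LONGS is redefined locally rather than taken from kernel headers.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Size in bytes of a bitmap holding one bit per CPU for nr_cpus CPUs. */
static size_t cpumask_bytes(size_t nr_cpus)
{
    return BITS_TO_LONGS(nr_cpus) * sizeof(unsigned long);
}
```

With nr_cpus = 4096 this gives 512 bytes per mask, so three static masks come to 1536 bytes, which matches the "1k5" above.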


We did this so that we don't keep looping over every cpu to check whether it
belongs to a particular package, especially as that cost grows linearly as
more and more cpus get added on large systems.

Would it not make sense to use the mask which tells you all the cores on a
particular core's package? I realized I could use topology_core_cpumask only
after seeing tglx's pseudo code, because the name is definitely confusing;
earlier I assumed no such mask existed and hence we had to just loop through.
I know you pointed out not to put the mask on the stack, but the static usage
cost should be reasonable to avoid the cost of looping through all the
available cpus.


Also, as per tglx as well, seeing crappy code doesn't mean we should add more
crap, so why contradict that?







Re: [PATCHv3 06/10] xen/balloon: only hotplug additional memory if required

2015-07-31 Thread Daniel Kiper
On Thu, Jul 30, 2015 at 06:03:08PM +0100, David Vrabel wrote:
> Now that we track the total number of pages (including hotplugged
> regions), it is easy to determine if more memory needs to be
> hotplugged.
>
> Add a new BP_WAIT state to signal that the balloon process needs to
> wait until kicked by the memory add notifier (when the new section is
> onlined by userspace).
>
> Signed-off-by: David Vrabel 

Reviewed-by: Daniel Kiper 

Daniel


Re: [PATCH v9 3/5] mtd: nand: vf610_nfc: add device tree bindings

2015-07-31 Thread Brian Norris
On Fri, Jul 31, 2015 at 06:52:59PM +0200, Stefan Agner wrote:
> Signed-off-by: Bill Pringlemeir 
> Signed-off-by: Stefan Agner 

The rest looks good to me. Thanks!

Reviewed-by: Brian Norris 

> ---
>  .../devicetree/bindings/mtd/vf610-nfc.txt  | 45 ++
>  1 file changed, 45 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/mtd/vf610-nfc.txt
> 
> diff --git a/Documentation/devicetree/bindings/mtd/vf610-nfc.txt b/Documentation/devicetree/bindings/mtd/vf610-nfc.txt
> new file mode 100644
> index 000..cae5f25
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/mtd/vf610-nfc.txt
> @@ -0,0 +1,45 @@
> +Freescale's NAND flash controller (NFC)
> +
> +This variant of the Freescale NAND flash controller (NFC) can be found on
> +Vybrid (vf610), MPC5125, MCF54418 and Kinetis K70.
> +
> +Required properties:
> +- compatible: Should be set to "fsl,vf610-nfc"
> +- reg: address range of the NFC
> +- interrupts: interrupt of the NFC
> +- nand-bus-width: see nand.txt
> +- nand-ecc-mode: see nand.txt
> +- nand-on-flash-bbt: see nand.txt
> +- assigned-clocks: main clock from the SoC, for Vybrid < VF610_CLK_NFC>;
> +- assigned-clock-rates: The NAND bus timing is derived from this clock
> +rate and should not exceed maximum timing for any NAND memory chip
> +in a board stuffing. Typical NAND memory timings derived from this
> +clock are found in the SoC hardware reference manual. Furthermore,
> +there might be restrictions on maximum rates when using hardware ECC.
> +
> +- #address-cells, #size-cells : Must be present if the device has sub-nodes
> +  representing partitions.
> +
> +Required properties for hardware ECC:
> +- nand-ecc-strength: supported strengths are 24 and 32 bit (see nand.txt)
> +- nand-ecc-step-size: step size equals page size, currently only 2k pages are
> +supported
> +
> +Example:
> +
> + nfc: nand@400e {
> + compatible = "fsl,vf610-nfc";
> + #address-cells = <1>;
> + #size-cells = <1>;
> + reg = <0x400e 0x4000>;
> + interrupts = ;
> + clocks = < VF610_CLK_NFC>;
> + clock-names = "nfc";
> + assigned-clocks = < VF610_CLK_NFC>;
> + assigned-clock-rates = <3300>;
> + nand-bus-width = <8>;
> + nand-ecc-mode = "hw";
> + nand-ecc-strength = <32>;
> + nand-ecc-step-size = <2048>;
> + nand-on-flash-bbt;
> + };
> -- 
> 2.4.5
> 


Re: [PATCH v9 3/5] mtd: nand: vf610_nfc: add device tree bindings

2015-07-31 Thread Stefan Agner
On 2015-07-31 18:52, Stefan Agner wrote:
> Signed-off-by: Bill Pringlemeir 
> Signed-off-by: Stefan Agner 

Actually just realized that I forgot to collect the Ack of Shawn, will
add that in next revision as well.

FWIW, dispatching to some recipients has been deferred (e.g. the
mailing lists) due to DNS issues on my side, sorry about that.

--
Stefan



Re: [PATCHv3 01/10] mm: memory hotplug with an existing resource

2015-07-31 Thread Daniel Kiper
On Thu, Jul 30, 2015 at 06:03:03PM +0100, David Vrabel wrote:
> Add add_memory_resource() to add memory using an existing "System RAM"
> resource.  This is useful if the memory region is being located by
> finding a free resource slot with allocate_resource().
>
> Xen guests will make use of this in their balloon driver to hotplug
> arbitrary amounts of memory in response to toolstack requests.
>
> Signed-off-by: David Vrabel 
> Cc: Andrew Morton 

Hmmm... Why did you remove my Reviewed-by line from this patch?

Daniel


Re: [PATCH v2] ARM: dts: vf-colibri: define stdout-path property

2015-07-31 Thread Stefan Agner
Hi Shawn,

Did you notice this change? Just asking since you applied some other
changes already...

--
Stefan

On 2015-07-15 16:50, Stefan Agner wrote:
> Define Vybrid's UART0, connected to the Colibri pinout UART_A, as
> standard output.
> 
> Signed-off-by: Stefan Agner 
> ---
>  arch/arm/boot/dts/vf-colibri-eval-v3.dtsi | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm/boot/dts/vf-colibri-eval-v3.dtsi
> b/arch/arm/boot/dts/vf-colibri-eval-v3.dtsi
> index 77e1a45..2ce6fe5 100644
> --- a/arch/arm/boot/dts/vf-colibri-eval-v3.dtsi
> +++ b/arch/arm/boot/dts/vf-colibri-eval-v3.dtsi
> @@ -9,7 +9,7 @@
>  
>  / {
>   chosen {
> - bootargs = "console=ttyLP0,115200";
> + stdout-path = "serial0:115200n8";
>   };
>  
>   clk16m: clk16m {



Re: [PATCH v7 2/2] ARM: imx: Add suspend codes for imx7D

2015-07-31 Thread Stefan Agner
Hi Shenwei,

The Subject sounds somewhat strange, if you mean your program code, then
you should not use the plural. Or maybe "add suspend states..." or
"support suspend states..."?

On 2015-07-27 21:30, Shenwei Wang wrote:
> IMX7D contains a new version of GPC IP block (GPCv2). It has two
> major functions: power management and wakeup source management.
> 
> GPCv2 provides low power mode control for Cortex-A7 and Cortex-M4
> domains. And it can support WAIT, STOP, and DSM(Deep Sleep Mode) modes.
> After configuring the GPCv2 module, the platform can enter into a
> selected mode either automatically triggered by ARM WFI instruction or
> manually by software. The system will exit the low power states
> by the predefined wakeup sources which are managed by the gpcv2
> irqchip driver.
> 
> This patch adds a new suspend driver to manage the power states on IMX7D.
> It currently supports "SUSPEND_STANDBY" and "SUSPEND_MEM" states.
> 
> Signed-off-by: Shenwei Wang 
> Signed-off-by: Anson Huang 
> ---
>  arch/arm/mach-imx/Kconfig        |   1 +
>  arch/arm/mach-imx/Makefile       |   2 +
>  arch/arm/mach-imx/pm-imx7.c      | 901 +++
>  arch/arm/mach-imx/suspend-imx7.S | 529 +++
>  4 files changed, 1433 insertions(+)
>  create mode 100644 arch/arm/mach-imx/pm-imx7.c
>  create mode 100644 arch/arm/mach-imx/suspend-imx7.S
> 
> diff --git a/arch/arm/mach-imx/Kconfig b/arch/arm/mach-imx/Kconfig
> index 8ceda28..54f8553 100644
> --- a/arch/arm/mach-imx/Kconfig
> +++ b/arch/arm/mach-imx/Kconfig
> @@ -562,6 +562,7 @@ config SOC_IMX7D
>   select ARM_GIC
>   select HAVE_IMX_ANATOP
>   select HAVE_IMX_MMDC
> + select IMX_GPCV2
>   help
>   This enables support for Freescale i.MX7 Dual processor.
>  
> diff --git a/arch/arm/mach-imx/Makefile b/arch/arm/mach-imx/Makefile
> index fb689d8..ca4c566 100644
> --- a/arch/arm/mach-imx/Makefile
> +++ b/arch/arm/mach-imx/Makefile
> @@ -88,6 +88,8 @@ obj-$(CONFIG_SOC_IMX7D) += mach-imx7d.o
>  
>  ifeq ($(CONFIG_SUSPEND),y)
>  AFLAGS_suspend-imx6.o :=-Wa,-march=armv7-a
> +AFLAGS_suspend-imx7.o :=-Wa,-march=armv7-a
> +obj-$(CONFIG_IMX_GPCV2)  += suspend-imx7.o pm-imx7.o
>  obj-$(CONFIG_SOC_IMX6) += suspend-imx6.o

A rather strange ordering, can you keep the AFLAGS near the object file?

>  obj-$(CONFIG_SOC_IMX53) += suspend-imx53.o
>  endif
> diff --git a/arch/arm/mach-imx/pm-imx7.c b/arch/arm/mach-imx/pm-imx7.c
> new file mode 100644
> index 000..9035368
> --- /dev/null
> +++ b/arch/arm/mach-imx/pm-imx7.c
> @@ -0,0 +1,901 @@
> +/*
> + * Copyright (C) 2015 Freescale Semiconductor, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +#define BM_LPCR_A7_AD_L2PGE  0x1
> +#define BM_LPCR_A7_AD_EN_C1_PUP  0x800
> +#define BM_LPCR_A7_AD_EN_C1_IRQ_PUP  0x400
> +#define BM_LPCR_A7_AD_EN_C0_PUP  0x200
> +#define BM_LPCR_A7_AD_EN_C0_IRQ_PUP  0x100
> +#define BM_LPCR_A7_AD_EN_PLAT_PDN  0x10
> +#define BM_LPCR_A7_AD_EN_C1_PDN  0x8
> +#define BM_LPCR_A7_AD_EN_C1_WFI_PDN  0x4
> +#define BM_LPCR_A7_AD_EN_C0_PDN  0x2
> +#define BM_LPCR_A7_AD_EN_C0_WFI_PDN  0x1
> +
> +#define BM_LPCR_A7_BSC_IRQ_SRC_A7_WAKEUP 0x7000
> +#define BM_LPCR_A7_BSC_CPU_CLK_ON_LPM  0x4000
> +#define BM_LPCR_A7_BSC_LPM1  0xc
> +#define BM_LPCR_A7_BSC_LPM0  0x3
> +#define BP_LPCR_A7_BSC_LPM1  2
> +#define BP_LPCR_A7_BSC_LPM0  0
> +
> +#define BM_LPCR_M4_MASK_DSM_TRIGGER  0x8000
> +
> +#define BM_SLPCR_EN_DSM  0x8000
> +#define BM_SLPCR_RBC_EN  0x4000
> +#define BM_SLPCR_VSTBY   0x4
> +#define BM_SLPCR_SBYOS   0x2
> +#define BM_SLPCR_BYPASS_PMIC_READY   0x1
> +
> +#define BM_GPC_PGC_ACK_SEL_A7_DUMMY_PUP_ACK  0x8000
> +#define BM_GPC_PGC_ACK_SEL_A7_DUMMY_PDN_ACK  0x8000

Typically the bit field size + shifts is used here, e.g.

#define BM_LPCR_A7_BSC_IRQ_SRC_A7_WAKEUP  (0x7 << 28)
#define BM_LPCR_A7_BSC_CPU_CLK_ON_LPM     (0x1 << 14)

This is much easier to verify against the data sheet.
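
As a small illustration of that style, the sketch below derives a few masks from a field width and a bit position and uses them to update a register value. The EX_ names are made up for this example; the positions are the ones quoted above (LPM0 at bit 0, LPM1 at bit 2, CPU_CLK_ON_LPM at bit 14), and field_set() is a hypothetical helper, not part of the patch.

```c
#include <stdint.h>

/* Bit positions, then masks built from "field width << position". */
#define EX_BP_LPM0            0
#define EX_BP_LPM1            2
#define EX_BM_LPM0            (0x3u << EX_BP_LPM0)
#define EX_BM_LPM1            (0x3u << EX_BP_LPM1)
#define EX_BM_CPU_CLK_ON_LPM  (0x1u << 14)

/* Replace the field selected by mask/shift in reg with val. */
static uint32_t field_set(uint32_t reg, uint32_t mask, unsigned int shift,
                          uint32_t val)
{
    return (reg & ~mask) | ((val << shift) & mask);
}
```

Written this way, each mask can be compared directly with the "bits [n:m]" column of a register description instead of counting hex digits.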

> +
> +#define GPC_LPCR_A7_BSC  0x0
> +#define GPC_LPCR_A7_AD   0x4
> +#define GPC_LPCR_M4  0x8
> +
> +#define GPC_PGC_CPU_MAPPING  0xec
> +#define GPC_PGC_SCU_TIMING   0x890
> +
> +#define GPC_SLPCR  0x14
> +#define GPC_PGC_ACK_SEL_A7   0x24
> +
> +#define GPC_SLOT0_CFG  0xb0
> +
> +#define GPC_PGC_C0   0x800
> +#define 

[PATCH v9 5/5] ARM: dts: vf-colibri: enable NAND flash controller

2015-07-31 Thread Stefan Agner
Enable NAND access by adding pinmux and NAND flash controller node
to device tree. The NAND chips currently used on the Colibri VF61
require 8-bit ECC per 512 byte page, hence specify 32-bit ECC
strength per 2k page size.

Signed-off-by: Stefan Agner 
---
 arch/arm/boot/dts/vf-colibri.dtsi | 32 
 1 file changed, 32 insertions(+)

diff --git a/arch/arm/boot/dts/vf-colibri.dtsi b/arch/arm/boot/dts/vf-colibri.dtsi
index 68ca125..ab2e74b 100644
--- a/arch/arm/boot/dts/vf-colibri.dtsi
+++ b/arch/arm/boot/dts/vf-colibri.dtsi
@@ -52,6 +52,19 @@
pinctrl-0 = <_i2c0>;
 };
 
+ {
+   assigned-clocks = < VF610_CLK_NFC>;
+   assigned-clock-rates = <3300>;
+   nand-bus-width = <8>;
+   nand-ecc-mode = "hw";
+   nand-ecc-step-size = <2048>;
+   nand-ecc-strength = <32>;
+   nand-on-flash-bbt;
+   pinctrl-names = "default";
+   pinctrl-0 = <_nfc>;
+   status = "okay";
+};
+
  {
pinctrl-names = "default";
pinctrl-0 = <_pwm0>;
@@ -156,6 +169,25 @@
>;
};
 
+   pinctrl_nfc: nfcgrp {
+   fsl,pins = <
+   VF610_PAD_PTD23__NF_IO7 0x28df
+   VF610_PAD_PTD22__NF_IO6 0x28df
+   VF610_PAD_PTD21__NF_IO5 0x28df
+   VF610_PAD_PTD20__NF_IO4 0x28df
+   VF610_PAD_PTD19__NF_IO3 0x28df
+   VF610_PAD_PTD18__NF_IO2 0x28df
+   VF610_PAD_PTD17__NF_IO1 0x28df
+   VF610_PAD_PTD16__NF_IO0 0x28df
+   VF610_PAD_PTB24__NF_WE_B0x28c2
+   VF610_PAD_PTB25__NF_CE0_B   0x28c2
+   VF610_PAD_PTB27__NF_RE_B0x28c2
+   VF610_PAD_PTC26__NF_RB_B0x283d
+   VF610_PAD_PTC27__NF_ALE 0x28c2
+   VF610_PAD_PTC28__NF_CLE 0x28c2
+   >;
+   };
+
pinctrl_pwm0: pwm0grp {
fsl,pins = <
VF610_PAD_PTB0__FTM0_CH00x1182
-- 
2.4.5


