Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Sat, May 16, 2015 at 02:45:27PM +1000, NeilBrown wrote:

> Yes, I've looked lately :-)
> I think that all of RCU-walk, and probably some of REF-walk should happen
> before the filesystem gets to see anything.
> But once you hit a non-positive dentry or the parent of the target name, I'd
> rather hand over to the FS.

... and be ready to get it back when the sucker runs into a symlink.  Unless
you want to handle _those_ in NFS somehow (including an absolute one starting
with /sys/, etc.).

> NFSv4 has the ability to look up multiple components in a single LOOKUP call.
> VFS doesn't give it a chance to try because it wants to go step-by-step, and
> wants each entry in the cache to have an inode etc.

Do tell, how do we deal with .. afterwards if we leave the intermediate ones
without inodes?  We _could_ feed multi-component requests to filesystems
(and NFSv4 isn't the first one to handle that - 9p had been there a lot
earlier), but then you get to
* populate all of them with inodes
* be damn careful to avoid multiple dentries for the same directory
  inode
Look, creating those suckers isn't the worst part; you need to be ready for
e.g. mount(2) or pathname resolution playing with the ones you'd created.
It's not fs-private data structure; pathname resolution might very well span
many filesystem types.

Worse, you get to deal with several multi-component requests jumping into
fs at the same place.  With responses arriving a bit afterwards, and guess
what?  Those requests happen to share bits and pieces of prefixes.  Oh,
and one of them is a rename.  Dealing with just the final components isn't
a problem; you'll need to deal with directory tree in all its fscking glory.
In a way that wouldn't be in too incestuous relationship with the pathwalking
logics in VFS and, by that proxy, such in all other fs types.

In particular, "unknown" for intermediate nodes is a recipe for really
nasty mess.  If the path can rejoin the known universe several components
later... 

Dealing with multi-component lookups isn't impossible and might be a good
idea, but only if all intermediates are populated.  What information does
NFSv4 multi-component lookup give you?  9p one gives an array of FIDs,
one per component, and that is best used as multi-component revalidate
on hot dcache...
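
The "multi-component revalidate" idea can be illustrated with a toy model. This is only a sketch under stated assumptions: `struct comp_id` and `revalidate_prefix()` are hypothetical names, not anything from 9p, NFSv4 or the VFS, and the convention that an id of 0 means "nothing returned for this component" is invented for the example. It shows why every intermediate has to be populated: the first missing or changed component ends the trusted prefix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-component identity, standing in for whatever the
 * server returns for each name in a multi-component walk. */
struct comp_id {
    unsigned long id;   /* 0: server returned nothing for this component */
};

/*
 * Compare cached identities against a fresh multi-component reply.
 * Returns the number of leading components that are still valid; the
 * walk has to fall back to step-by-step lookup from that point on.
 */
size_t revalidate_prefix(const struct comp_id *cached,
                         const struct comp_id *reply, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        /* An unpopulated or changed intermediate ends the valid prefix. */
        if (reply[i].id == 0 || reply[i].id != cached[i].id)
            break;
    }
    return i;
}
```

A caller would keep using the cached dentries for the first `revalidate_prefix()` components and redo the rest one at a time.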
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ 26/48] net:socket: set msg_namelen to 0 if msg_name is passed as NULL in msghdr struct from userland.

2015-05-15 Thread Willy Tarreau
Hi Ben,

On Fri, May 15, 2015 at 10:08:22PM +0100, Ben Hutchings wrote:
> I think you'll also want this related fix:
> 
> commit 91edd096e224941131f896b86838b1e59553696a
> Author: Catalin Marinas 
> Date:   Fri Mar 20 16:48:13 2015 +
> 
> net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour

Ah good catch, I missed it. Now merged and tested.
BTW, I've added your s-o-b on the two other patches.

Thanks!
Willy



[PATCH] checkpatch: Add --strict warning for c99 fixed size typedefs : int_t

2015-05-15 Thread Joe Perches
Using declarations like u_int16_t in kernel code is not preferred.

Suggest the kernel sized types instead of the c99 types when not
in the uapi directory.

Add a $typeC99Typedefs variable for the types to check and
neaten the other typedef variables.

Signed-off-by:  Joe Perches 
---
 scripts/checkpatch.pl | 31 +--
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 89b1df4..9ffccc7 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -347,15 +347,20 @@ our $UTF8 = qr{
| $NON_ASCII_UTF8
 }x;
 
+our $typeC99Typedefs = qr{(?:__)?(?:[us]_?)?int_?(?:8|16|32|64)_t};
 our $typeOtherOSTypedefs = qr{(?x:
u_(?:char|short|int|long) |  # bsd
u(?:nchar|short|int|long)# sysv
 )};
-
-our $typeTypedefs = qr{(?x:
+our $typeKernelTypedefs = qr{(?x:
(?:__)?(?:u|s|be|le)(?:8|16|32|64)|
atomic_t
 )};
+our $typeTypedefs = qr{(?x:
+   $typeC99Typedefs\b|
+   $typeOtherOSTypedefs\b|
+   $typeKernelTypedefs\b
+)};
 
 our $logFunctions = qr{(?x:
printk(?:_ratelimited|_once|)|
@@ -516,7 +521,6 @@ sub build_types {
my $allWithAttr = "(?x:  \n" . join("|\n  ", @typeListWithAttr) . "\n)";
$Modifier   = qr{(?:$Attribute|$Sparse|$mods)};
$BasicType  = qr{
-   (?:$typeOtherOSTypedefs\b)|
(?:$typeTypedefs\b)|
(?:${all}\b)
}x;
@@ -524,7 +528,6 @@ sub build_types {
(?:$Modifier\s+|const\s+)*
(?:
(?:typeof|__typeof__)\s*\([^\)]*\)|
-   (?:$typeOtherOSTypedefs\b)|
(?:$typeTypedefs\b)|
(?:${all}\b)
)
@@ -542,7 +545,6 @@ sub build_types {
(?:
(?:typeof|__typeof__)\s*\([^\)]*\)|
(?:$typeTypedefs\b)|
-   (?:$typeOtherOSTypedefs\b)|
(?:${allWithAttr}\b)
)
(?:\s+$Modifier|\s+const)*
@@ -3264,7 +3266,6 @@ sub process {
$line !~ /\btypedef\s+$Type\s*\(\s*\*?$Ident\s*\)\s*\(/ &&
$line !~ /\btypedef\s+$Type\s+$Ident\s*\(/ &&
$line !~ /\b$typeTypedefs\b/ &&
-   $line !~ /\b$typeOtherOSTypedefs\b/ &&
$line !~ /\b__bitwise(?:__|)\b/) {
WARN("NEW_TYPEDEFS",
 "do not add new typedefs\n" . $herecurr);
@@ -4973,6 +4974,24 @@ sub process {
  "Using weak declarations can have unintended link defects\n" . $herecurr);
}
 
+# check for c99 types like uint8_t used outside of uapi/
+   if ($realfile !~ m@\binclude/uapi/@ &&
+   $line =~ /\b($Declare)\s*$Ident\s*[=;,\[]/) {
+   my $type = $1;
+   if ($type =~ /\b($typeC99Typedefs)\b/) {
+   $type = $1;
+   my $kernel_type = 'u';
+   $kernel_type = 's' if ($type =~ /^_*[si]/);
+   $type =~ /(\d+)/;
+   $kernel_type .= $1;
+   if (CHK("PREFER_KERNEL_TYPES",
+   "Prefer kernel type '$kernel_type' over '$type'\n" . $herecurr) &&
+   $fix) {
+   $fixed[$fixlinenr] =~ s/\b$type\b/$kernel_type/;
+   }
+   }
+   }
+
 # check for sizeof(&)
if ($line =~ /\bsizeof\s*\(\s*\&/) {
WARN("SIZEOF_ADDRESS",
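
For readers who want the mapping step of the new check in isolation, here is a small C rendering of it. It is a sketch, not part of the patch: `c99_to_kernel_type()` is a hypothetical helper, and it assumes the caller has already matched the name against the C99 typedef pattern. It follows the same rule as the Perl above: skip leading underscores, treat a leading `s` or `i` as signed and anything else as unsigned, then append the first run of digits.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Map a C99-style fixed-size typedef name to the kernel spelling,
 * e.g. "uint8_t" to "u8". Returns 0 on success, -1 if the name has no
 * width digits or the output buffer is too small.
 */
int c99_to_kernel_type(const char *c99, char *out, size_t outlen)
{
    const char *p = c99;
    const char *digits;
    size_t ndigits = 0;
    char sign;

    while (*p == '_')               /* same as the /^_*[si]/ test */
        p++;
    sign = (*p == 's' || *p == 'i') ? 's' : 'u';

    digits = p;
    while (*digits && !isdigit((unsigned char)*digits))
        digits++;
    while (isdigit((unsigned char)digits[ndigits]))
        ndigits++;

    /* need room for the sign letter, the digits and the terminator */
    if (ndigits == 0 || ndigits + 2 > outlen)
        return -1;

    out[0] = sign;
    memcpy(out + 1, digits, ndigits);
    out[1 + ndigits] = '\0';
    return 0;
}
```

Note that a plain `int32_t` maps to `s32` because the name starts with `i`, which is what the `[si]` character class in the patch arranges.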




Re: [PATCH 03/18] f2fs crypto: declare some definitions for f2fs encryption feature

2015-05-15 Thread Tom Marshall
On Fri, May 15, 2015 at 06:14:24PM -0700, Jaegeuk Kim wrote:
> On Thu, May 14, 2015 at 09:50:44AM -0700, Tom Marshall wrote:
> > Please keep in mind that I'm also working on transparent
> > compression.  I'm watching this thread closely so that I can
> > implement a compression library alongside the crypto library.  If
> > there is any interest or benefit, I would be glad to work together
> > so that the two can be done cooperatively at the same time.
> 
> I can't imagine quickly how compression code can be shared with crypto.
> The basic approach for compression would be that X pages can be compressed
> into small number of pages, Y, which can be a X to Y mapping.
> But, this per-file encryption supports only 1 to 1 4KB mapping, so that it
> could be quite a simple implementation.

No, I don't intend to share actual code with crypto -- at least not much. 
I'm more interested in looking at how the crypto layer is implemented to
give me clues about how to implement a compression layer.

> Could you elaborate on your approach or design? Or, codes?
> Whatever, IMO, it needs to implement it by any filesystem first.

I don't really have any working code yet.  I will probably get to that in
the coming few weeks.  Right now I'm still working with the ugly VFS
stacking implementation that I posted initially.

The one thing I have done is dismiss the standard compression framing
formats.

zlib (and gzip) are designed for streaming and it is quite difficult to
implement random access on them.  See the example code in the zlib source,
zran.c.  It's not really tenable because 32KB of prior data is required to
be kept as priming information.  Even doing fully encapsulated blocks with
Z_FINISH, there is still no way to skip over data without decompressing it
first to build an index.

lz4 is somewhat better in that blocks are self contained.  But block lengths
must be read sequentially.  This means that reading an arbitrary position in
a file requires a proportional number of reads to find the desired block.

So, I am working with a simple framing format that I threw together.  The
header has a compression method (zlib or lz4), block size, original input
size, and a block map.
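
To make the framing idea concrete, here is a rough sketch of such a header and of how a block map allows random access. Everything in it is an assumption made for illustration: the struct layout, the names, and in particular the choice that the block map stores the compressed length of each block. It is not the format described above, whose details were not posted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum comp_method { COMP_ZLIB = 1, COMP_LZ4 = 2 };   /* hypothetical values */

/* Hypothetical in-memory form of a header with a block map. */
struct comp_header {
    enum comp_method method;
    uint32_t block_size;        /* uncompressed bytes per block */
    uint64_t orig_size;         /* uncompressed file size */
    const uint32_t *block_len;  /* block map: compressed length per block */
};

/* Number of blocks needed to cover orig_size bytes. */
uint64_t comp_nr_blocks(const struct comp_header *h)
{
    return (h->orig_size + h->block_size - 1) / h->block_size;
}

/*
 * Locate the compressed block holding uncompressed offset pos.
 * On success stores the block index and its byte offset within the
 * compressed data area and returns 0. Returns -1 if pos is past EOF.
 */
int comp_find_block(const struct comp_header *h, uint64_t pos,
                    uint64_t *index, uint64_t *data_off)
{
    uint64_t i, off = 0;

    if (h->block_size == 0 || pos >= h->orig_size)
        return -1;

    *index = pos / h->block_size;
    for (i = 0; i < *index; i++)    /* walks the in-memory map, no I/O */
        off += h->block_len[i];
    *data_off = off;
    return 0;
}
```

The point of the map is that finding a block only touches the header, unlike the sequential length reads that lz4 framing needs. Storing cumulative offsets instead of lengths would turn the loop into a single array access.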


Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread NeilBrown
On Sat, 16 May 2015 02:47:18 +0100 Al Viro  wrote:

> On Sat, May 16, 2015 at 11:25:03AM +1000, NeilBrown wrote:
> > But surely those things can be managed with a spinlock.
> > 
> > I think a big part of the problem is that the VFS tries to control
> > filesystems rather than provide services to them.
> 
> What with being the thing syscalls talk to for sending the requests to
> filesystems...  Do you really want to push the pathname resolution into
> fs code?  You've looked at it lately, right?

Yes, I've looked lately :-)
I think that all of RCU-walk, and probably some of REF-walk should happen
before the filesystem gets to see anything.
But once you hit a non-positive dentry or the parent of the target name, I'd
rather hand over to the FS.

NFSv4 has the ability to look up multiple components in a single LOOKUP call.
VFS doesn't give it a chance to try because it wants to go step-by-step, and
wants each entry in the cache to have an inode etc.

The earlier the filesystem gets control, the less completely-general the VFS
needs to be.

> 
> > I'm not convinced that serialising 'lookup' calls is vital.  If two threads
> > find a 'not-validated' dentry, and both try to look up the inode, they
> > will both ultimately get the same struct_inode from the icache, and will
> > both succeed in connecting it to the dentry.  Obviously it would be better to
> > avoid two concurrent NFS "LOOKUP" requests, but that is a problem for NFS to
> > solve.  I suspect that using d_fsdata to point to a pending LOOKUP request
> > would allow the "second" thread to wait for that request to finish.  Other
> > filesystems would take a completely different approach.
> 
> See upthread regarding multiple negative dentries with the same name and fun
> consequences thereof.  There might be _NO_ inode.  At all.  dcache has a large
> negative component and without it you'd get really fucked on NFS as soon
> as you try to compile anything.  Shitloads of headers, looked up in a lot of
> directories.  Most of the lookups ending up negative.  We really do need that
> stuff...

Of course negative dentries are important and having multiple would be
unfortunate.  I don't suggest that for a moment.
I'm suggesting three different states for a dentry: positive, negative, don't
know.  "don't know" is a new state that isn't currently allowed.

While a filesystem is performing 'lookup', doing its own locking or not, the
dentry would be "don't know".  Anything that needed to know would block
somewhere in the filesystem code on whatever lock or waitqueue or whatever
that the filesystem developer felt was appropriate.  On i_mutex if
generic_foo() was in use.

If NFSv4 did a multi-component lookup, the intermediate dentries would be
"don't know" even while they had children.  For local filesystems, that sort
of thing would never happen.  For NFS - which has to allow for random changes
on the server anyway - it is just part of the game.

NeilBrown
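
The three-state proposal can be written down as a toy state machine. This is purely illustrative: the kernel has no such enum, and `toy_dentry`, `toy_dentry_usable()` and `toy_lookup_done()` are hypothetical names. The convention that an inode number of 0 means "no such name" is also an assumption made for the example.

```c
#include <assert.h>

/* Hypothetical three-state model of a dentry. */
enum toy_dentry_state { TOY_UNKNOWN, TOY_NEGATIVE, TOY_POSITIVE };

struct toy_dentry {
    enum toy_dentry_state state;
    unsigned long ino;          /* meaningful only when TOY_POSITIVE */
};

/* Can a walker use this dentry without calling into the filesystem? */
int toy_dentry_usable(const struct toy_dentry *d)
{
    return d->state != TOY_UNKNOWN;
}

/* The filesystem finished its lookup; ino == 0 means "no such name". */
void toy_lookup_done(struct toy_dentry *d, unsigned long ino)
{
    assert(d->state == TOY_UNKNOWN);    /* only "don't know" gets resolved */
    d->ino = ino;
    d->state = ino ? TOY_POSITIVE : TOY_NEGATIVE;
}
```

Anything that finds `toy_dentry_usable()` false would have to block inside the filesystem until `toy_lookup_done()` runs; the blocking itself is not modelled here.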





Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Fri, May 15, 2015 at 08:37:20PM -0700, Linus Torvalds wrote:
> On May 15, 2015 8:17 PM, "Al Viro"  wrote:
> >
> > What for?  All we need is a flag, waitqueue and being woken
> > up when the flag gets cleared.
> 
> You need to have the flag somewhere.
> 
> The child dentry doesn't exist yet.
> 
> That's the point of the hashed entry. It approximates the not-yet-existing
> child dentry that we have *not* added to the parent until after lookup.

Point, but...  A lot of our problems come from the fact that ->i_mutex
doubles as protection against the addition to the list of children, on
top of protection of directory itself.  What if we do the following:
have the normal case of __lookup_hash() (and other callers of lookup_real())
* allocate dentry, marked "in-lookup"
* do dcache lookup, likely to come up empty, _without_ touching
  potential matches' d_lock, i.e. based on __d_lookup_rcu() (under
  rcu_read_lock(), with rename_lock loop around it).  Hold parent's ->d_lock
  while walking the chain, grab refcount in the unlikely case the match had
  been found.  If nothing's found *and* rename_lock hadn't been touched, insert
  the new dentry into hash and list of children before dropping ->d_lock.
* call ->lookup() (still under ->i_mutex, shared)
* clear "in-lookup" bit on _original_ dentry (we might very well
  have returned a different one)
* kick the wait queue of parent's ->i_mutex

I'll need to think about that after I get some sleep, but it smells like
that could be feasible.  Of course, that assumes we'll be able to cope
with hashed-but-currently-in-lookup dentries, but I think it might be
doable with some massage...
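
The sequence in the list above can be modelled in a few lines of single-threaded C. This is a deliberately naive sketch: there is no locking, no RCU, no rename_lock and no real wait queue, and every name in it (`toy_dir`, `toy_lookup_start()`, `toy_lookup_finish()`) is hypothetical. It only shows the bookkeeping: the first caller inserts an entry marked in-lookup, a second caller for the same name finds that entry instead of creating a duplicate, and the bit is cleared when the lookup completes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CHILDREN 8

struct toy_child {
    const char *name;
    int in_lookup;      /* set while ->lookup() would be running */
};

struct toy_dir {
    struct toy_child child[TOY_MAX_CHILDREN];
    size_t nr;
};

/*
 * Search the children; if the name is absent, insert a new entry marked
 * in-lookup. *inserted tells the caller whether it owns the lookup (1)
 * or found an existing entry (0), in which case it would have to wait
 * while in_lookup is set. Returns NULL when the toy table is full.
 */
struct toy_child *toy_lookup_start(struct toy_dir *dir, const char *name,
                                   int *inserted)
{
    size_t i;

    *inserted = 0;
    for (i = 0; i < dir->nr; i++)
        if (strcmp(dir->child[i].name, name) == 0)
            return &dir->child[i];

    if (dir->nr == TOY_MAX_CHILDREN)
        return NULL;

    dir->child[dir->nr].name = name;
    dir->child[dir->nr].in_lookup = 1;
    *inserted = 1;
    return &dir->child[dir->nr++];
}

/* Clear the bit; a real version would also wake up the waiters. */
void toy_lookup_finish(struct toy_child *c)
{
    assert(c->in_lookup);
    c->in_lookup = 0;
}
```

The hard part of the proposal, coping with hashed entries that are still in lookup everywhere else in the dcache, is exactly what this model leaves out.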


[PATCH 1/1] drivers: staging: unisys: visorbus: visorchipset.c: private functions should be declared static

2015-05-15 Thread Tolga Ceylan
visorchipset_file_init() and visorchipset_file_cleanup() functions
do not seem to be used from anywhere else and now are declared
as static. Sparse emitted "not declared" warnings for these two
functions.

Signed-off-by: Tolga Ceylan 
---
 drivers/staging/unisys/visorbus/visorchipset.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/unisys/visorbus/visorchipset.c b/drivers/staging/unisys/visorbus/visorchipset.c
index ca22f49..66ae3d0 100644
--- a/drivers/staging/unisys/visorbus/visorchipset.c
+++ b/drivers/staging/unisys/visorbus/visorchipset.c
@@ -2351,7 +2351,7 @@ static const struct file_operations visorchipset_fops = {
.mmap = visorchipset_mmap,
 };
 
-int
+static int
 visorchipset_file_init(dev_t major_dev, struct visorchannel **controlvm_channel)
 {
int rc = 0;
@@ -2460,7 +2460,7 @@ cleanup:
return rc;
 }
 
-void
+static void
 visorchipset_file_cleanup(dev_t major_dev)
 {
if (file_cdev.ops)
-- 
2.4.0



[PATCH] staging: comedi: fix coding style issues

2015-05-15 Thread Geliang Tang
1) Fixed an error found by checkpatch.pl.
   ERROR: space required after that ',' (ctx:VxV)
   ./drivers/ni_mio_common.c:3764
2) Changed "register 0x%x" to "register=0x%x" to keep the consistency
   of this file.
3) The kernel version is next-20150515, 4.1.0-rc3.

Signed-off-by: Geliang Tang 
---
 drivers/staging/comedi/drivers/ni_mio_common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/comedi/drivers/ni_mio_common.c b/drivers/staging/comedi/drivers/ni_mio_common.c
index 9dfd4e6..6cc304a 100644
--- a/drivers/staging/comedi/drivers/ni_mio_common.c
+++ b/drivers/staging/comedi/drivers/ni_mio_common.c
@@ -3761,7 +3761,7 @@ static unsigned int ni_gpct_to_stc_register(struct comedi_device *dev,
if (reg < ARRAY_SIZE(ni_gpct_to_stc_regmap)) {
regmap = &ni_gpct_to_stc_regmap[reg];
} else {
-   dev_warn(dev->class_dev,"%s: unhandled register 0x%x\n",
+   dev_warn(dev->class_dev, "%s: unhandled register=0x%x\n",
 __func__, reg);
return 0;
}
-- 
2.3.4




[PATCH v11 3/7] cgroup: replace explicit ss_mask checking with for_each_subsys_which

2015-05-15 Thread Aleksa Sarai
Replace the explicit checking against ss_masks inside a for_each_subsys
block with for_each_subsys_which(..., ss_mask), to take advantage of the
more readable macro.

Signed-off-by: Aleksa Sarai 
---
 kernel/cgroup.c | 44 
 1 file changed, 16 insertions(+), 28 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 633d02a..eb5e4b3 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1102,9 +1102,8 @@ static unsigned long cgroup_calc_child_subsys_mask(struct cgroup *cgrp,
while (true) {
unsigned long new_ss_mask = cur_ss_mask;
 
-   for_each_subsys(ss, ssid)
-   if (cur_ss_mask & (1 << ssid))
-   new_ss_mask |= ss->depends_on;
+   for_each_subsys_which(ss, ssid, &cur_ss_mask)
+   new_ss_mask |= ss->depends_on;
 
/*
 * Mask out subsystems which aren't available.  This can
@@ -1242,10 +1241,7 @@ static int rebind_subsystems(struct cgroup_root *dst_root,
 
lockdep_assert_held(_mutex);
 
-   for_each_subsys(ss, ssid) {
-   if (!(ss_mask & (1 << ssid)))
-   continue;
-
+   for_each_subsys_which(ss, ssid, &ss_mask) {
/* if @ss has non-root csses attached to it, can't move */
if (css_next_child(NULL, cgroup_css(>root->cgrp, ss)))
return -EBUSY;
@@ -1282,18 +1278,14 @@ static int rebind_subsystems(struct cgroup_root *dst_root,
 * Nothing can fail from this point on.  Remove files for the
 * removed subsystems and rebind each subsystem.
 */
-   for_each_subsys(ss, ssid)
-   if (ss_mask & (1 << ssid))
-   cgroup_clear_dir(>root->cgrp, 1 << ssid);
+   for_each_subsys_which(ss, ssid, &ss_mask)
+   cgroup_clear_dir(>root->cgrp, 1 << ssid);
 
-   for_each_subsys(ss, ssid) {
+   for_each_subsys_which(ss, ssid, &ss_mask) {
struct cgroup_root *src_root;
struct cgroup_subsys_state *css;
struct css_set *cset;
 
-   if (!(ss_mask & (1 << ssid)))
-   continue;
-
src_root = ss->root;
css = cgroup_css(_root->cgrp, ss);
 
@@ -2567,13 +2559,11 @@ static void cgroup_print_ss_mask(struct seq_file *seq, unsigned long ss_mask)
bool printed = false;
int ssid;
 
-   for_each_subsys(ss, ssid) {
-   if (ss_mask & (1 << ssid)) {
-   if (printed)
-   seq_putc(seq, ' ');
-   seq_printf(seq, "%s", ss->name);
-   printed = true;
-   }
+   for_each_subsys_which(ss, ssid, &ss_mask) {
+   if (printed)
+   seq_putc(seq, ' ');
+   seq_printf(seq, "%s", ss->name);
+   printed = true;
}
if (printed)
seq_putc(seq, '\n');
@@ -2721,11 +2711,12 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 */
buf = strstrip(buf);
while ((tok = strsep(, " "))) {
+   unsigned long tmp_ss_mask = ~cgrp_dfl_root_inhibit_ss_mask;
+
if (tok[0] == '\0')
continue;
-   for_each_subsys(ss, ssid) {
-   if (ss->disabled || strcmp(tok + 1, ss->name) ||
-   ((1 << ss->id) & cgrp_dfl_root_inhibit_ss_mask))
+   for_each_subsys_which(ss, ssid, &tmp_ss_mask) {
+   if (ss->disabled || strcmp(tok + 1, ss->name))
continue;
 
if (*tok == '+') {
@@ -2812,10 +2803,7 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 * still around.  In such cases, wait till it's gone using
 * offline_waitq.
 */
-   for_each_subsys(ss, ssid) {
-   if (!(css_enable & (1 << ssid)))
-   continue;
-
+   for_each_subsys_which(ss, ssid, &css_enable) {
cgroup_for_each_live_child(child, cgrp) {
DEFINE_WAIT(wait);
 
-- 
2.4.1
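
For anyone who has not seen the earlier patch in the series, the shape of a mask-filtered iterator can be sketched in userspace. This is an assumption-laden stand-in, not the kernel macro: `toy_for_each_subsys_which()` is a plain bit-test loop over a made-up table, and the subsystem names and dependency bits are invented for the example.

```c
#include <assert.h>

#define TOY_SUBSYS_COUNT 4

struct toy_subsys {
    const char *name;
    unsigned long depends_on;   /* mask of subsystems this one needs */
};

/* Made-up table: "io" (bit 2) depends on "memory" (bit 1). */
static const struct toy_subsys toy_subsys_table[TOY_SUBSYS_COUNT] = {
    { "cpu", 0 }, { "memory", 0 }, { "io", 1UL << 1 }, { "pids", 0 },
};

/*
 * Run the following statement only for subsystems whose bit is set in
 * *maskp. The body must not be followed by an else.
 */
#define toy_for_each_subsys_which(ss, ssid, maskp)                  \
    for ((ssid) = 0; (ssid) < TOY_SUBSYS_COUNT; (ssid)++)           \
        if ((*(maskp) & (1UL << (ssid))) &&                         \
            ((ss) = &toy_subsys_table[(ssid)]))

/* Same shape as the dependency loop in cgroup_calc_child_subsys_mask(). */
unsigned long toy_add_dependencies(unsigned long mask)
{
    const struct toy_subsys *ss;
    int ssid;
    unsigned long out = mask;

    toy_for_each_subsys_which(ss, ssid, &mask)
        out |= ss->depends_on;
    return out;
}
```

The call sites lose both the explicit `1 << ssid` test and the `continue`, which is the readability gain the patch is after.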



[PATCH v11 7/7] cgroup: implement the PIDs subsystem

2015-05-15 Thread Aleksa Sarai
Adds a new single-purpose PIDs subsystem to limit the number of
tasks that can be forked inside a cgroup. Essentially this is an
implementation of RLIMIT_NPROC that applies to a cgroup rather than a
process tree.

However, it should be noted that organisational operations (adding and
removing tasks from a PIDs hierarchy) will *not* be prevented. Rather,
the number of tasks in the hierarchy cannot exceed the limit through
forking. This is due to the fact that, in the unified hierarchy, attach
cannot fail (and it is not possible for a task to overcome its PIDs
cgroup policy limit by attaching to a child cgroup).

PIDs are fundamentally a global resource, and it is possible to reach
PID exhaustion inside a cgroup without hitting any reasonable kmemcg
policy. Once you've hit PID exhaustion, you're only in a marginally
better state than OOM. This subsystem allows PID exhaustion inside a
cgroup to be prevented.

Signed-off-by: Aleksa Sarai 
---
 CREDITS   |   5 +
 include/linux/cgroup_subsys.h |   5 +
 init/Kconfig  |  16 ++
 kernel/Makefile   |   1 +
 kernel/cgroup_pids.c  | 379 ++
 5 files changed, 406 insertions(+)
 create mode 100644 kernel/cgroup_pids.c

diff --git a/CREDITS b/CREDITS
index 40cc4bf..0727426 100644
--- a/CREDITS
+++ b/CREDITS
@@ -3215,6 +3215,11 @@ S: 69 rue Dunois
 S: 75013 Paris
 S: France
 
+N: Aleksa Sarai
+E: cyp...@cyphar.com
+W: https://www.cyphar.com/
+D: `pids` cgroup subsystem
+
 N: Dipankar Sarma
 E: dipan...@in.ibm.com
 D: RCU
diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index 81b7bdd..32becaf 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -56,6 +56,11 @@ SUBSYS(hugetlb)
  * Subsystems that implement the can_fork() family of callbacks.
  */
 SUBSYS_TAG(CANFORK_START)
+
+#if IS_ENABLED(CONFIG_CGROUP_PIDS)
+SUBSYS(pids)
+#endif
+
 SUBSYS_TAG(CANFORK_END)
 
 /*
diff --git a/init/Kconfig b/init/Kconfig
index dc24dec..24b2563 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -967,6 +967,22 @@ config CGROUP_FREEZER
  Provides a way to freeze and unfreeze all tasks in a
  cgroup.
 
+config CGROUP_PIDS
+   bool "PIDs cgroup subsystem"
+   help
+ Provides enforcement of process number limits in the scope of a
+ cgroup. Any attempt to fork more processes than is allowed in the
+ cgroup will fail. PIDs are fundamentally a global resource because it
+ is fairly trivial to reach PID exhaustion before you reach even a
+ conservative kmemcg limit. As a result, it is possible to grind a
+ system to a halt without being limited by other cgroup policies. The
+ PIDs cgroup subsystem is designed to stop this from happening.
+
+ It should be noted that organisational operations (such as attaching
+ to a cgroup hierarchy) will *not* be blocked by the PIDs subsystem,
+ since the PIDs limit only affects a process's ability to fork, not to
+ attach to a cgroup.
+
 config CGROUP_DEVICE
bool "Device controller for cgroups"
help
diff --git a/kernel/Makefile b/kernel/Makefile
index 0f8f8b0..df5406c 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -55,6 +55,7 @@ obj-$(CONFIG_BACKTRACE_SELF_TEST) += backtracetest.o
 obj-$(CONFIG_COMPAT) += compat.o
 obj-$(CONFIG_CGROUPS) += cgroup.o
 obj-$(CONFIG_CGROUP_FREEZER) += cgroup_freezer.o
+obj-$(CONFIG_CGROUP_PIDS) += cgroup_pids.o
 obj-$(CONFIG_CPUSETS) += cpuset.o
 obj-$(CONFIG_UTS_NS) += utsname.o
 obj-$(CONFIG_USER_NS) += user_namespace.o
diff --git a/kernel/cgroup_pids.c b/kernel/cgroup_pids.c
new file mode 100644
index 000..8457da6
--- /dev/null
+++ b/kernel/cgroup_pids.c
@@ -0,0 +1,379 @@
+/*
+ * Process number limiting controller for cgroups.
+ *
+ * Used to allow a cgroup hierarchy to stop any new processes from fork()ing
+ * after a certain limit is reached.
+ *
+ * Since it is trivial to hit the task limit without hitting any kmemcg limits
+ * in place, PIDs are a fundamental resource. As such, PID exhaustion must be
+ * preventable in the scope of a cgroup hierarchy by allowing resource limiting
+ * of the number of tasks in a cgroup.
+ *
+ * In order to use the `pids` controller, set the maximum number of tasks in
+ * pids.max (this is not available in the root cgroup for obvious reasons). The
+ * number of processes currently in the cgroup is given by pids.current.
+ * Organisational operations are not blocked by cgroup policies, so it is
+ * possible to have pids.current > pids.max. However, it is not possible to
+ * violate a cgroup policy through fork(). fork() will return -EAGAIN if forking
+ * would cause a cgroup policy to be violated.
+ *
+ * To set a cgroup to have no limit, set pids.max to "max". This is the default
+ * for all new cgroups (NB that PID limits are hierarchical, so the most
+ * stringent limit in the hierarchy is followed).
+ *
+ * 
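
The archived copy of the patch is cut off at this point, so the following is not the patch's code. It is a minimal userspace sketch of the behaviour the header comment describes, with hypothetical names throughout (`toy_pids`, `toy_pids_try_charge()`, `toy_pids_charge_nofail()`): a fork-time charge that walks up the hierarchy and fails if any level would exceed its limit, and an attach-time charge that never fails and may therefore leave the count above the limit.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PIDS_MAX (-1L)      /* stands in for writing "max" to pids.max */

struct toy_pids {
    struct toy_pids *parent;
    long count;     /* tasks charged to this group and its descendants */
    long limit;     /* TOY_PIDS_MAX means unlimited */
};

/*
 * Charge one task to p and every ancestor. If any level would exceed
 * its limit, undo the charges made so far and return -1 (the fork would
 * see -EAGAIN); otherwise return 0.
 */
int toy_pids_try_charge(struct toy_pids *p)
{
    struct toy_pids *q, *undo;

    for (q = p; q; q = q->parent) {
        if (q->limit != TOY_PIDS_MAX && q->count + 1 > q->limit)
            goto revert;
        q->count++;
    }
    return 0;

revert:
    /* q is the level that refused; everything below it was charged */
    for (undo = p; undo != q; undo = undo->parent)
        undo->count--;
    return -1;
}

/* Attach-style charge: never fails, so count may end up above limit. */
void toy_pids_charge_nofail(struct toy_pids *p)
{
    for (; p; p = p->parent)
        p->count++;
}
```

Because the walk checks every ancestor, the most stringent limit in the hierarchy is the one that takes effect.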

[PATCH v11 1/7] cgroup: switch to unsigned long for bitmasks

2015-05-15 Thread Aleksa Sarai
Switch the type of all internal cgroup masks to (unsigned long), which
is the correct type for bitmasks. This is in preparation for the
for_each_subsys_which patch.

Signed-off-by: Aleksa Sarai 
---
 kernel/cgroup.c | 39 ---
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 469dd54..15896ed 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -156,7 +156,7 @@ static bool cgrp_dfl_root_visible;
 static bool cgroup_legacy_files_on_dfl;
 
 /* some controllers are not supported in the default hierarchy */
-static unsigned int cgrp_dfl_root_inhibit_ss_mask;
+static unsigned long cgrp_dfl_root_inhibit_ss_mask;
 
 /* The list of hierarchy roots */
 
@@ -186,7 +186,7 @@ static struct cftype cgroup_dfl_base_files[];
 static struct cftype cgroup_legacy_base_files[];
 
 static int rebind_subsystems(struct cgroup_root *dst_root,
-unsigned int ss_mask);
+unsigned long ss_mask);
 static int cgroup_destroy_locked(struct cgroup *cgrp);
 static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss,
  bool visible);
@@ -998,7 +998,7 @@ static struct cgroup *task_cgroup_from_root(struct task_struct *task,
  * update of a tasks cgroup pointer by cgroup_attach_task()
  */
 
-static int cgroup_populate_dir(struct cgroup *cgrp, unsigned int subsys_mask);
+static int cgroup_populate_dir(struct cgroup *cgrp, unsigned long subsys_mask);
 static struct kernfs_syscall_ops cgroup_kf_syscall_ops;
 static const struct file_operations proc_cgroupstats_operations;
 
@@ -1068,11 +1068,11 @@ static void cgroup_put(struct cgroup *cgrp)
  * @subtree_control is to be applied to @cgrp.  The returned mask is always
  * a superset of @subtree_control and follows the usual hierarchy rules.
  */
-static unsigned int cgroup_calc_child_subsys_mask(struct cgroup *cgrp,
- unsigned int subtree_control)
+static unsigned long cgroup_calc_child_subsys_mask(struct cgroup *cgrp,
+ unsigned long subtree_control)
 {
struct cgroup *parent = cgroup_parent(cgrp);
-   unsigned int cur_ss_mask = subtree_control;
+   unsigned long cur_ss_mask = subtree_control;
struct cgroup_subsys *ss;
int ssid;
 
@@ -1082,7 +1082,7 @@ static unsigned int cgroup_calc_child_subsys_mask(struct cgroup *cgrp,
return cur_ss_mask;
 
while (true) {
-   unsigned int new_ss_mask = cur_ss_mask;
+   unsigned long new_ss_mask = cur_ss_mask;
 
for_each_subsys(ss, ssid)
if (cur_ss_mask & (1 << ssid))
@@ -1200,7 +1200,7 @@ static void cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
  * @cgrp: target cgroup
  * @subsys_mask: mask of the subsystem ids whose files should be removed
  */
-static void cgroup_clear_dir(struct cgroup *cgrp, unsigned int subsys_mask)
+static void cgroup_clear_dir(struct cgroup *cgrp, unsigned long subsys_mask)
 {
struct cgroup_subsys *ss;
int i;
@@ -1215,10 +1215,11 @@ static void cgroup_clear_dir(struct cgroup *cgrp, unsigned int subsys_mask)
}
 }
 
-static int rebind_subsystems(struct cgroup_root *dst_root, unsigned int ss_mask)
+static int rebind_subsystems(struct cgroup_root *dst_root,
+unsigned long ss_mask)
 {
struct cgroup_subsys *ss;
-   unsigned int tmp_ss_mask;
+   unsigned long tmp_ss_mask;
int ssid, i, ret;
 
lockdep_assert_held(_mutex);
@@ -1253,7 +1254,7 @@ static int rebind_subsystems(struct cgroup_root *dst_root, unsigned int ss_mask)
 * Just warn about it and continue.
 */
if (cgrp_dfl_root_visible) {
-   pr_warn("failed to create files (%d) while rebinding 0x%x to default root\n",
+   pr_warn("failed to create files (%d) while rebinding 0x%lx to default root\n",
ret, ss_mask);
pr_warn("you may retry by moving them to a different hierarchy and unbinding\n");
}
@@ -1338,7 +1339,7 @@ static int cgroup_show_options(struct seq_file *seq,
 }
 
 struct cgroup_sb_opts {
-   unsigned int subsys_mask;
+   unsigned long subsys_mask;
unsigned int flags;
char *release_agent;
bool cpuset_clone_children;
@@ -1351,7 +1352,7 @@ static int parse_cgroupfs_options(char *data, struct cgroup_sb_opts *opts)
 {
char *token, *o = data;
bool all_ss = false, one_ss = false;
-   unsigned int mask = -1U;
+   unsigned long mask = -1UL;
struct cgroup_subsys *ss;
int nr_opts = 0;
int i;
@@ -1495,7 +1496,7 @@ static int cgroup_remount(struct kernfs_root *kf_root, int *flags, char *data)
int ret = 0;
struct cgroup_root *root = 
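
The archived patch is truncated here. As a side note on the type change itself, the sketch below shows the two details that go with an `unsigned long` mask: building bits with `1UL` rather than a bare `1`, and printing with `%lx` as the patched `pr_warn()` does. The helper names `toy_bit()` and `toy_format_mask()` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

#define TOY_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Single-bit mask of type unsigned long; a bare (1 << n) has type int. */
unsigned long toy_bit(unsigned int n)
{
    assert(n < TOY_BITS_PER_LONG);
    return 1UL << n;
}

/* Format a mask with %lx, matching the argument type. */
int toy_format_mask(char *buf, size_t len, unsigned long mask)
{
    return snprintf(buf, len, "0x%lx", mask);
}
```

With `%x` and an `unsigned long` argument the format and the argument would no longer agree, which is why the format string changes along with the type.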

[PATCH v11 5/7] cgroup: move enum cgroup_subsys_id definition

2015-05-15 Thread Aleksa Sarai
This patch is in preparation for the pids cgroup subsystem patchset.

Signed-off-by: Aleksa Sarai 
---
 include/linux/cgroup.h | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index e7da0aa..35ba593 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -25,6 +25,14 @@
 
 #ifdef CONFIG_CGROUPS
 
+/* define the enumeration of all cgroup subsystems */
+#define SUBSYS(_x) _x ## _cgrp_id,
+enum cgroup_subsys_id {
+#include 
+   CGROUP_SUBSYS_COUNT,
+};
+#undef SUBSYS
+
 struct cgroup_root;
 struct cgroup_subsys;
 struct cgroup;
@@ -40,14 +48,6 @@ extern int cgroupstats_build(struct cgroupstats *stats,
 extern int proc_cgroup_show(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *tsk);
 
-/* define the enumeration of all cgroup subsystems */
-#define SUBSYS(_x) _x ## _cgrp_id,
-enum cgroup_subsys_id {
-#include 
-   CGROUP_SUBSYS_COUNT,
-};
-#undef SUBSYS
-
 /*
  * Per-subsystem/per-cgroup state maintained by the system.  This is the
  * fundamental structural building block that controllers deal with.
-- 
2.4.1
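
The enum being moved is generated with the X-macro technique: one list of names, expanded several times with different definitions of the per-entry macro. The userspace sketch below shows the same technique with a made-up list. `TOY_SUBSYS_LIST`, the `_toy_id` suffix and `toy_subsys_find()` are all hypothetical; the kernel keeps its list in a separate header instead of a list macro.

```c
#include <assert.h>
#include <string.h>

/* Made-up subsystem list; each entry is passed to the macro X. */
#define TOY_SUBSYS_LIST(X) \
    X(cpu)                 \
    X(memory)              \
    X(pids)

/* First expansion: one enumerator per subsystem, plus a trailing count. */
#define TOY_ENUM(_x) _x ## _toy_id,
enum toy_subsys_id {
    TOY_SUBSYS_LIST(TOY_ENUM)
    TOY_SUBSYS_COUNT,
};
#undef TOY_ENUM

/* Second expansion: a name table indexed by the enum. */
#define TOY_NAME(_x) [_x ## _toy_id] = #_x,
static const char *const toy_subsys_name[TOY_SUBSYS_COUNT] = {
    TOY_SUBSYS_LIST(TOY_NAME)
};
#undef TOY_NAME

/* Map a name back to its id; returns TOY_SUBSYS_COUNT when not found. */
enum toy_subsys_id toy_subsys_find(const char *name)
{
    int i;

    for (i = 0; i < TOY_SUBSYS_COUNT; i++)
        if (strcmp(toy_subsys_name[i], name) == 0)
            return (enum toy_subsys_id)i;
    return TOY_SUBSYS_COUNT;
}
```

Moving the enum earlier in the header matters for exactly this reason: anything declared later can then size arrays with the count or refer to individual ids.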



[PATCH v11 6/7] cgroup: allow a cgroup subsystem to reject a fork

2015-05-15 Thread Aleksa Sarai
Add a new cgroup subsystem callback can_fork that states whether the
fork is accepted or rejected by a cgroup policy. In addition, add a
cancel_fork callback so that if an error
occurs later in the forking process, any state modified by can_fork can
be reverted.

Allow for a private opaque pointer to be passed from the cgroup_can_fork
to cgroup_post_fork, allowing for the fork state to be stored by each
subsystem separately.

In order for a subsystem to know that a task associated with a cgroup
hierarchy is being migrated to another hierarchy, add a detach callback
to the subsystem which is run after the migration has been confirmed but
before the old_cset's refcount is dropped. This is necessary in order
for a subsystem to be able to keep a proper count of how many tasks are
associated with that subsystem.

Also add a tagging system for cgroup_subsys.h to allow for CGROUP_
enumerations to be defined and used (as well as CGROUP__COUNT).

This is in preparation for implementing the pids cgroup subsystem.
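
As an illustration only, the can_fork / cancel_fork / fork protocol described
above can be sketched in userspace C. Everything in it (struct task, struct
subsys, the "toy" and "veto" controllers, can_fork_all(), cancel_fork_all()
and post_fork_all()) is a hypothetical stand-in and not the code from this
patch; the real callbacks live in struct cgroup_subsys and are driven by
cgroup_can_fork(), cgroup_cancel_fork() and cgroup_post_fork().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for task_struct and cgroup_subsys. */
struct task { int pid; };

struct subsys {
	int  (*can_fork)(struct task *t, void **privp);
	void (*cancel_fork)(struct task *t, void *priv);
	void (*fork)(struct task *t, void *priv);
};

/* Toy controller: charges one unit per accepted fork, up to a limit. */
static int charged;
static int limit = 2;

static int toy_can_fork(struct task *t, void **privp)
{
	(void)t;
	if (charged >= limit)
		return -EAGAIN;		/* reject the fork */
	charged++;			/* charge up front */
	*privp = &charged;		/* opaque state for later callbacks */
	return 0;
}

static void toy_cancel_fork(struct task *t, void *priv)
{
	(void)t;
	--*(int *)priv;			/* revert what can_fork did */
}

/* Second toy controller: vetoes forks while veto_next is set. */
static int veto_next;

static int veto_can_fork(struct task *t, void **privp)
{
	(void)t;
	(void)privp;
	return veto_next ? -EPERM : 0;
}

#define NSUBSYS 2
static struct subsys subsystems[NSUBSYS] = {
	{ toy_can_fork, toy_cancel_fork, NULL },
	{ veto_can_fork, NULL, NULL },
};

/* Ask every subsystem; on the first rejection undo the earlier ones. */
static int can_fork_all(struct task *t, void *priv[NSUBSYS])
{
	int i, j, ret;

	for (i = 0; i < NSUBSYS; i++) {
		priv[i] = NULL;
		if (!subsystems[i].can_fork)
			continue;
		ret = subsystems[i].can_fork(t, &priv[i]);
		if (ret) {
			for (j = 0; j < i; j++)
				if (subsystems[j].cancel_fork)
					subsystems[j].cancel_fork(t, priv[j]);
			return ret;
		}
	}
	return 0;
}

/* Called when the fork fails later on, after can_fork_all() succeeded. */
static void cancel_fork_all(struct task *t, void *priv[NSUBSYS])
{
	int i;

	for (i = 0; i < NSUBSYS; i++)
		if (subsystems[i].cancel_fork)
			subsystems[i].cancel_fork(t, priv[i]);
}

/* Called once the fork is committed; hands each subsystem its state. */
static void post_fork_all(struct task *t, void *priv[NSUBSYS])
{
	int i;

	for (i = 0; i < NSUBSYS; i++)
		if (subsystems[i].fork)
			subsystems[i].fork(t, priv[i]);
}
```

A caller would run can_fork_all() early in the fork path, cancel_fork_all() on
any later failure, and post_fork_all() once the child is committed. The
per-subsystem private array is what lets state set up in can_fork reach the
later callbacks without a global.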

Signed-off-by: Aleksa Sarai 
---
 include/linux/cgroup.h| 35 +++--
 include/linux/cgroup_subsys.h | 17 
 kernel/cgroup.c   | 90 +++
 kernel/cgroup_freezer.c   |  2 +-
 kernel/fork.c | 17 +++-
 kernel/sched/core.c   |  2 +-
 6 files changed, 149 insertions(+), 14 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 35ba593..ef9d21c 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -28,11 +28,16 @@
 /* define the enumeration of all cgroup subsystems */
 #define SUBSYS(_x) _x ## _cgrp_id,
 enum cgroup_subsys_id {
+#define SUBSYS_TAG(_t) CGROUP_ ## _t, \
+   __unused_tag_ ## _t = CGROUP_ ## _t - 1,
 #include <linux/cgroup_subsys.h>
+#undef SUBSYS_TAG
CGROUP_SUBSYS_COUNT,
 };
 #undef SUBSYS
 
+#define CGROUP_CANFORK_COUNT (CGROUP_CANFORK_END - CGROUP_CANFORK_START)
+
 struct cgroup_root;
 struct cgroup_subsys;
 struct cgroup;
@@ -40,7 +45,12 @@ struct cgroup;
 extern int cgroup_init_early(void);
 extern int cgroup_init(void);
 extern void cgroup_fork(struct task_struct *p);
-extern void cgroup_post_fork(struct task_struct *p);
+extern int cgroup_can_fork(struct task_struct *p,
+  void *ss_private[CGROUP_CANFORK_COUNT]);
+extern void cgroup_cancel_fork(struct task_struct *p,
+  void *ss_private[CGROUP_CANFORK_COUNT]);
+extern void cgroup_post_fork(struct task_struct *p,
+void *old_ss_private[CGROUP_CANFORK_COUNT]);
 extern void cgroup_exit(struct task_struct *p);
 extern int cgroupstats_build(struct cgroupstats *stats,
struct dentry *dentry);
@@ -629,6 +639,14 @@ struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset);
for ((task) = cgroup_taskset_first((tset)); (task); \
 (task) = cgroup_taskset_next((tset)))
 
+/**
+ * tset_get_css - obtain and get css for (tset, subsys_id)
+ * @tset: target taskset
+ * @subsys_id: target subsystem id
+ */
+#define tset_get_css(tset, subsys_id)  \
+   task_get_css(cgroup_taskset_first(tset), subsys_id)
+
 /*
  * Control Group subsystem type.
  * See Documentation/cgroups/cgroups.txt for details
@@ -649,7 +667,9 @@ struct cgroup_subsys {
  struct cgroup_taskset *tset);
void (*attach)(struct cgroup_subsys_state *css,
   struct cgroup_taskset *tset);
-   void (*fork)(struct task_struct *task);
+   int (*can_fork)(struct task_struct *task, void **privatep);
+   void (*cancel_fork)(struct task_struct *task, void *private);
+   void (*fork)(struct task_struct *task, void *private);
void (*exit)(struct cgroup_subsys_state *css,
 struct cgroup_subsys_state *old_css,
 struct task_struct *task);
@@ -970,10 +990,19 @@ struct cgroup_subsys_state *css_tryget_online_from_dir(struct dentry *dentry,
 
 struct cgroup_subsys_state;
 
+#define CGROUP_CANFORK_COUNT 0
+
 static inline int cgroup_init_early(void) { return 0; }
 static inline int cgroup_init(void) { return 0; }
 static inline void cgroup_fork(struct task_struct *p) {}
-static inline void cgroup_post_fork(struct task_struct *p) {}
+static inline int cgroup_can_fork(struct task_struct *p,
+ void *ss_private[CGROUP_CANFORK_COUNT])
+{ return 0; }
+static inline void cgroup_cancel_fork(struct task_struct *p,
+ void *ss_private[CGROUP_CANFORK_COUNT]) {}
+static inline void cgroup_post_fork(struct task_struct *p,
+   void *ss_private[CGROUP_CANFORK_COUNT]) {}
+
 static inline void cgroup_exit(struct task_struct *p) {}
 
 static inline int cgroupstats_build(struct cgroupstats *stats,
diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index 

[PATCH v11 4/7] cgroup, block: implement task_get_css() and use it in bio_associate_current()

2015-05-15 Thread Aleksa Sarai
From: Tejun Heo 

bio_associate_current() currently open codes task_css() and
css_tryget_online() to find and pin $current's blkcg css.  Abstract it
into task_get_css() which is implemented from cgroup side.  As a task
is always associated with an online css for every subsystem except
while the css_set update is propagating, task_get_css() retries till
css_tryget_online() succeeds.

This is a cleanup and shouldn't lead to noticeable behavior changes.
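
The retry idea above can be sketched in userspace with C11 atomics. This is
only an analogy under stated assumptions: struct obj, struct holder,
obj_tryget() and holder_get() are hypothetical names, and there is no RCU
here, so the sketch assumes the objects stay allocated for as long as the
holder may point at them.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; a count of 0 means "going away". */
struct obj {
	atomic_int refcnt;
};

/* Hypothetical owner whose current object may be switched concurrently. */
struct holder {
	_Atomic(struct obj *) cur;
};

/* Take a reference only if the object is still live (count > 0). */
static bool obj_tryget(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old > 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Re-read the pointer and retry until a reference is obtained.  This
 * terminates as long as the holder always ends up pointing at a live
 * object, which mirrors the guarantee the commit message relies on.
 */
static struct obj *holder_get(struct holder *h)
{
	struct obj *o;

	for (;;) {
		o = atomic_load(&h->cur);
		if (obj_tryget(o))
			break;
		/* Lost a race with a switch: look at the new pointer. */
	}
	return o;
}
```

The design point is that a failed tryget is treated as transient rather than
as an error returned to the caller.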

Signed-off-by: Tejun Heo 
Cc: Li Zefan 
Cc: Jens Axboe 
Cc: Vivek Goyal 
---
 block/bio.c| 11 +--
 include/linux/cgroup.h | 25 +
 2 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index f66a4ea..968683e 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1987,7 +1987,6 @@ EXPORT_SYMBOL(bioset_create_nobvec);
 int bio_associate_current(struct bio *bio)
 {
struct io_context *ioc;
-   struct cgroup_subsys_state *css;
 
if (bio->bi_ioc)
return -EBUSY;
@@ -1996,17 +1995,9 @@ int bio_associate_current(struct bio *bio)
if (!ioc)
return -ENOENT;
 
-   /* acquire active ref on @ioc and associate */
get_io_context_active(ioc);
bio->bi_ioc = ioc;
-
-   /* associate blkcg if exists */
-   rcu_read_lock();
-   css = task_css(current, blkio_cgrp_id);
-   if (css && css_tryget_online(css))
-   bio->bi_css = css;
-   rcu_read_unlock();
-
+   bio->bi_css = task_get_css(current, blkio_cgrp_id);
return 0;
 }
 
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index b9cb94c..e7da0aa 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -774,6 +774,31 @@ static inline struct cgroup_subsys_state *task_css(struct task_struct *task,
 }
 
 /**
+ * task_get_css - find and get the css for (task, subsys)
+ * @task: the target task
+ * @subsys_id: the target subsystem ID
+ *
+ * Find the css for the (@task, @subsys_id) combination, increment a
+ * reference on and return it.  This function is guaranteed to return a
+ * valid css.
+ */
+static inline struct cgroup_subsys_state *
+task_get_css(struct task_struct *task, int subsys_id)
+{
+   struct cgroup_subsys_state *css;
+
+   rcu_read_lock();
+   while (true) {
+   css = task_css(task, subsys_id);
+   if (likely(css_tryget_online(css)))
+   break;
+   cpu_relax();
+   }
+   rcu_read_unlock();
+   return css;
+}
+
+/**
  * task_css_is_root - test whether a task belongs to the root css
  * @task: the target task
  * @subsys_id: the target subsystem ID
-- 
2.4.1



[PATCH v11 0/7] cgroups: add pids subsystem

2015-05-15 Thread Aleksa Sarai
This is an updated version of v10 of the pids patchset[1], with a few
quite large-ish updates and improvements:

* Switch all internal cgroup.c bitmasks to use (unsigned long), modify
  for_each_subsys_which() to use for_each_set_bit() [which required some
  other changes]. Also modify for_each_subsys_which() to take the
  pointer to the bitmask as an argument.

* Modify argument order of for_each_subsys_which() so that it follows
  the kernel style of (cursors..., iterable).

* Make naming of prefork / canfork consistent such that all references
  are canfork (such as CGROUP_CANFORK_*).

* A whole bunch of style and naming fixes.

* Remove ->detach and implement it using ->{can,cancel}_attach.

* Remove the cancelfork bitmask -- just do ss->cancel_fork checks since
  it's not a hot path.

* Add subsys_canfork_private{,p}().

* Include Tejun's implementation of task_get_css()[2] so that fork()s
  aren't failed because of a migration operation (which is guaranteed to
  complete in constant time).

* Switch valid input value range for `pids.max` to [0, PIDS_MAX), to
  make the interface more consistent (so that you can't input a value
  that translates to "max" transparently).

* Fixed up a whole bunch of comments that were too pids-specific or not
  explicit enough.

[1]: https://lkml.org/lkml/2015/4/19/39
[2]: http://lkml.kernel.org/g/1428350318-8215-8-git-send-email...@kernel.org

Aleksa Sarai (6):
  cgroup: switch to unsigned long for bitmasks
  cgroup: use bitmask to filter for_each_subsys
  cgroup: replace explicit ss_mask checking with for_each_subsys_which
  cgroup: move enum cgroup_subsys_id definition
  cgroup: allow a cgroup subsystem to reject a fork
  cgroup: implement the PIDs subsystem

Tejun Heo (1):
  cgroup, block: implement task_get_css() and use it in
bio_associate_current()

 CREDITS   |   5 +
 block/bio.c   |  11 +-
 include/linux/cgroup.h|  76 +++--
 include/linux/cgroup_subsys.h |  22 +++
 init/Kconfig  |  16 ++
 kernel/Makefile   |   1 +
 kernel/cgroup.c   | 215 
 kernel/cgroup_freezer.c   |   2 +-
 kernel/cgroup_pids.c  | 379 ++
 kernel/fork.c |  17 +-
 kernel/sched/core.c   |   2 +-
 11 files changed, 652 insertions(+), 94 deletions(-)
 create mode 100644 kernel/cgroup_pids.c

-- 
2.4.1



[PATCH v11 2/7] cgroup: use bitmask to filter for_each_subsys

2015-05-15 Thread Aleksa Sarai
Add a new macro for_each_subsys_which that allows all enabled cgroup
subsystems to be filtered by a bitmask, such that mask & (1 << ssid)
determines if the subsystem is to be processed in the loop body (where
ssid is the unique id of the subsystem).

Also replace the need_forkexit_callback with two separate bitmasks for
each callback to make (ss->{fork,exit}) checks unnecessary.
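
A minimal userspace sketch of the filtering idea follows. It is not the
kernel macro: the kernel version takes a pointer to the bitmask and uses
for_each_set_bit(), while this sketch takes the mask by value and tests each
bit in turn. The table contents, NSUBSYS, count_fork(), build_fork_mask() and
run_fork_callbacks() are all hypothetical.

```c
#include <stddef.h>

#define NSUBSYS 4

/* Hypothetical subsystem descriptor with an optional fork callback. */
struct subsys {
	const char *name;
	void (*fork)(int *counter);
};

static void count_fork(int *counter)
{
	(*counter)++;
}

static struct subsys subsys_table[NSUBSYS] = {
	{ "cpu",     NULL },
	{ "freezer", count_fork },
	{ "memory",  NULL },
	{ "pids",    count_fork },
};

/*
 * Run the loop body only for ids whose bit is set in mask.  The comma
 * expression assigns the cursor and evaluates to 0, so the body (the
 * else branch) runs exactly when the bit test passes.
 */
#define for_each_subsys_which(ss, ssid, mask)				\
	for ((ssid) = 0; (ssid) < NSUBSYS; (ssid)++)			\
		if (!((mask) & (1UL << (ssid))) ||			\
		    (((ss) = &subsys_table[ssid]), 0))			\
			;						\
		else

/* Build the mask once, so the hot path needs no NULL checks. */
static unsigned long build_fork_mask(void)
{
	unsigned long mask = 0;
	int i;

	for (i = 0; i < NSUBSYS; i++)
		mask |= (unsigned long)(subsys_table[i].fork != NULL) << i;
	return mask;
}

/* Call ->fork for every subsystem selected by mask; return the call count. */
static int run_fork_callbacks(unsigned long mask)
{
	struct subsys *ss;
	int ssid, calls = 0;

	for_each_subsys_which(ss, ssid, mask)
		ss->fork(&calls);
	return calls;
}
```

With the table above, bits 1 and 3 are set, so the callback loop touches only
those two entries and never dereferences a NULL ->fork.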

Signed-off-by: Aleksa Sarai 
---
 kernel/cgroup.c | 46 +-
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 15896ed..633d02a 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -175,12 +175,14 @@ static DEFINE_IDR(cgroup_hierarchy_idr);
  */
 static u64 css_serial_nr_next = 1;
 
-/* This flag indicates whether tasks in the fork and exit paths should
+/*
+ * These bitmask flags indicate whether tasks in the fork and exit paths should
  * check for fork/exit handlers to call. This avoids us having to do
  * extra work in the fork/exit path if none of the subsystems need to
  * be called.
  */
-static int need_forkexit_callback __read_mostly;
+static unsigned long have_fork_callback __read_mostly;
+static unsigned long have_exit_callback __read_mostly;
 
 static struct cftype cgroup_dfl_base_files[];
 static struct cftype cgroup_legacy_base_files[];
@@ -409,6 +411,22 @@ static int notify_on_release(const struct cgroup *cgrp)
for ((ssid) = 0; (ssid) < CGROUP_SUBSYS_COUNT &&\
 (((ss) = cgroup_subsys[ssid]) || true); (ssid)++)
 
+
+/**
+ * for_each_subsys_which - filter for_each_subsys with a bitmask
+ * @ss: the iteration cursor
+ * @ssid: the index of @ss, CGROUP_SUBSYS_COUNT after reaching the end
+ * @ss_maskp: a pointer to the bitmask
+ *
+ * The block will only run for cases where the ssid-th bit (1 << ssid) of
+ * mask is set to 1.
+ */
+#define for_each_subsys_which(ss, ssid, ss_maskp)  \
+   for_each_set_bit(ssid, ss_maskp, CGROUP_SUBSYS_COUNT)   \
+   if (((ss) = cgroup_subsys[ssid]) && false)  \
+   ;   \
+   else
+
 /* iterate across the hierarchies */
 #define for_each_root(root)\
list_for_each_entry((root), &cgroup_roots, root_list)
@@ -4932,7 +4950,8 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss, bool early)
 * init_css_set is in the subsystem's root cgroup. */
init_css_set.subsys[ss->id] = css;
 
-   need_forkexit_callback |= ss->fork || ss->exit;
+   have_fork_callback |= (bool)ss->fork << ss->id;
+   have_exit_callback |= (bool)ss->exit << ss->id;
 
/* At system boot, before all subsystems have been
 * registered, no tasks have been forked, so we don't
@@ -5242,11 +5261,8 @@ void cgroup_post_fork(struct task_struct *child)
 * css_set; otherwise, @child might change state between ->fork()
 * and addition to css_set.
 */
-   if (need_forkexit_callback) {
-   for_each_subsys(ss, i)
-   if (ss->fork)
-   ss->fork(child);
-   }
+   for_each_subsys_which(ss, i, &have_fork_callback)
+   ss->fork(child);
 }
 
 /**
@@ -5290,16 +5306,12 @@ void cgroup_exit(struct task_struct *tsk)
cset = task_css_set(tsk);
RCU_INIT_POINTER(tsk->cgroups, &init_css_set);
 
-   if (need_forkexit_callback) {
-   /* see cgroup_post_fork() for details */
-   for_each_subsys(ss, i) {
-   if (ss->exit) {
-   struct cgroup_subsys_state *old_css = cset->subsys[i];
-   struct cgroup_subsys_state *css = task_css(tsk, i);
+   /* see cgroup_post_fork() for details */
+   for_each_subsys_which(ss, i, &have_exit_callback) {
+   struct cgroup_subsys_state *old_css = cset->subsys[i];
+   struct cgroup_subsys_state *css = task_css(tsk, i);
 
-   ss->exit(css, old_css, tsk);
-   }
-   }
+   ss->exit(css, old_css, tsk);
}
 
if (put_cset)
-- 
2.4.1



[PATCH] f2fs crypto: split f2fs_crypto_init/exit with two parts

2015-05-15 Thread Jaegeuk Kim
This patch splits f2fs_crypto_init/exit into two parts: base initialization and
memory allocation.

First, the f2fs module declares the base encryption memory pointers.
Then, internal memory is allocated at the first encrypted inode access.
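
The allocate-on-first-use half of this split can be sketched in userspace as
a double-checked initialization under a mutex. This is an analogy only:
init_lock, bounce_pool, alloc_calls, lazy_initialize() and lazy_destroy() are
hypothetical names, malloc() stands in for the mempool, and C11 atomics are
used for the published pointer so that the unlocked fast-path read is well
defined in userspace.

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic(void *) bounce_pool;	/* published as the very last step */
static int alloc_calls;			/* how often the slow path allocated */

/* Allocate the expensive state on first use; later calls are cheap. */
static int lazy_initialize(void)
{
	void *pool;
	int res = 0;

	/* Fast path: already set up, no lock needed. */
	if (atomic_load(&bounce_pool))
		return 0;

	pthread_mutex_lock(&init_lock);
	/* Re-check under the lock: another thread may have won the race. */
	if (atomic_load(&bounce_pool))
		goto out;

	pool = malloc(4096);
	if (!pool) {
		res = -ENOMEM;
		goto out;
	}
	alloc_calls++;

	/* Publish last, so the fast path never sees half-built state. */
	atomic_store(&bounce_pool, pool);
out:
	pthread_mutex_unlock(&init_lock);
	return res;
}

/* Tear down only what lazy_initialize() allocated. */
static void lazy_destroy(void)
{
	free(atomic_exchange(&bounce_pool, NULL));
}
```

Using the last-allocated pointer as the "initialized" flag is the same choice
the patch makes with f2fs_bounce_page_pool: anything observed as non-NULL on
the fast path implies the earlier steps have completed.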

Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/crypto.c | 90 ++--
 fs/f2fs/crypto_key.c |  8 ++---
 fs/f2fs/f2fs.h   |  6 ++--
 fs/f2fs/super.c  |  8 -
 4 files changed, 65 insertions(+), 47 deletions(-)

diff --git a/fs/f2fs/crypto.c b/fs/f2fs/crypto.c
index 7f1ee9e..31fe051 100644
--- a/fs/f2fs/crypto.c
+++ b/fs/f2fs/crypto.c
@@ -63,7 +63,7 @@ static mempool_t *f2fs_bounce_page_pool;
 static LIST_HEAD(f2fs_free_crypto_ctxs);
 static DEFINE_SPINLOCK(f2fs_crypto_ctx_lock);
 
-struct workqueue_struct *f2fs_read_workqueue;
+static struct workqueue_struct *f2fs_read_workqueue;
 static DEFINE_MUTEX(crypto_init);
 
 static struct kmem_cache *f2fs_crypto_ctx_cachep;
@@ -225,10 +225,7 @@ void f2fs_end_io_crypto_work(struct f2fs_crypto_ctx *ctx, struct bio *bio)
queue_work(f2fs_read_workqueue, &ctx->r.work);
 }
 
-/**
- * f2fs_exit_crypto() - Shutdown the f2fs encryption system
- */
-void f2fs_exit_crypto(void)
+static void f2fs_crypto_destroy(void)
 {
struct f2fs_crypto_ctx *pos, *n;
 
@@ -241,73 +238,90 @@ void f2fs_exit_crypto(void)
if (f2fs_bounce_page_pool)
mempool_destroy(f2fs_bounce_page_pool);
f2fs_bounce_page_pool = NULL;
-   if (f2fs_read_workqueue)
-   destroy_workqueue(f2fs_read_workqueue);
-   f2fs_read_workqueue = NULL;
-   if (f2fs_crypto_ctx_cachep)
-   kmem_cache_destroy(f2fs_crypto_ctx_cachep);
-   f2fs_crypto_ctx_cachep = NULL;
-   if (f2fs_crypt_info_cachep)
-   kmem_cache_destroy(f2fs_crypt_info_cachep);
-   f2fs_crypt_info_cachep = NULL;
 }
 
 /**
- * f2fs_init_crypto() - Set up for f2fs encryption.
+ * f2fs_crypto_initialize() - Set up for f2fs encryption.
  *
  * We only call this when we start accessing encrypted files, since it
  * results in memory getting allocated that wouldn't otherwise be used.
  *
  * Return: Zero on success, non-zero otherwise.
  */
-int f2fs_init_crypto(void)
+int f2fs_crypto_initialize(void)
 {
int i, res = -ENOMEM;
 
+   if (f2fs_bounce_page_pool)
+   return 0;
+
mutex_lock(&crypto_init);
-   if (f2fs_read_workqueue)
+   if (f2fs_bounce_page_pool)
goto already_initialized;
 
-   f2fs_read_workqueue = alloc_workqueue("f2fs_crypto", WQ_HIGHPRI, 0);
-   if (!f2fs_read_workqueue)
-   goto fail;
-
-   f2fs_crypto_ctx_cachep = KMEM_CACHE(f2fs_crypto_ctx,
-   SLAB_RECLAIM_ACCOUNT);
-   if (!f2fs_crypto_ctx_cachep)
-   goto fail;
-
-   f2fs_crypt_info_cachep = KMEM_CACHE(f2fs_crypt_info,
-   SLAB_RECLAIM_ACCOUNT);
-   if (!f2fs_crypt_info_cachep)
-   goto fail;
-
for (i = 0; i < num_prealloc_crypto_ctxs; i++) {
struct f2fs_crypto_ctx *ctx;
 
ctx = kmem_cache_zalloc(f2fs_crypto_ctx_cachep, GFP_KERNEL);
-   if (!ctx) {
-   res = -ENOMEM;
+   if (!ctx)
goto fail;
-   }
list_add(&ctx->free_list, &f2fs_free_crypto_ctxs);
}
 
+   /* must be allocated at the last step to avoid race condition above */
f2fs_bounce_page_pool =
mempool_create_page_pool(num_prealloc_crypto_pages, 0);
-   if (!f2fs_bounce_page_pool) {
-   res = -ENOMEM;
+   if (!f2fs_bounce_page_pool)
goto fail;
-   }
+
 already_initialized:
mutex_unlock(&crypto_init);
return 0;
 fail:
-   f2fs_exit_crypto();
+   f2fs_crypto_destroy();
mutex_unlock(&crypto_init);
return res;
 }
 
+/**
+ * f2fs_exit_crypto() - Shutdown the f2fs encryption system
+ */
+void f2fs_exit_crypto(void)
+{
+   f2fs_crypto_destroy();
+
+   if (f2fs_read_workqueue)
+   destroy_workqueue(f2fs_read_workqueue);
+   if (f2fs_crypto_ctx_cachep)
+   kmem_cache_destroy(f2fs_crypto_ctx_cachep);
+   if (f2fs_crypt_info_cachep)
+   kmem_cache_destroy(f2fs_crypt_info_cachep);
+}
+
+int __init f2fs_init_crypto(void)
+{
+   int res = -ENOMEM;
+
+   f2fs_read_workqueue = alloc_workqueue("f2fs_crypto", WQ_HIGHPRI, 0);
+   if (!f2fs_read_workqueue)
+   goto fail;
+
+   f2fs_crypto_ctx_cachep = KMEM_CACHE(f2fs_crypto_ctx,
+   SLAB_RECLAIM_ACCOUNT);
+   if (!f2fs_crypto_ctx_cachep)
+   goto fail;
+
+   f2fs_crypt_info_cachep = KMEM_CACHE(f2fs_crypt_info,
+   SLAB_RECLAIM_ACCOUNT);
+   if (!f2fs_crypt_info_cachep)
+   goto fail;
+
+   return 0;

Re: [PATCH v10 4/4] cgroups: implement the PIDs subsystem

2015-05-15 Thread Aleksa Sarai
Hi Tejun,

One question RE: defaults for .config. What is the kernel policy for
deciding if a particular subsystem should be made enabled-by-default?

--
Aleksa Sarai (cyphar)
www.cyphar.com


Re: [PATCH 0/4] cpufreq: use generic cpufreq drivers for Exynos5250 platform

2015-05-15 Thread Anand Moon
On 15 May 2015 at 23:18, Javier Martinez Canillas
 wrote:
> Hello Bartlomiej,
>
> On 04/22/2015 07:37 PM, Bartlomiej Zolnierkiewicz wrote:
>>
>> On Tuesday, April 21, 2015 04:45:56 PM Kevin Hilman wrote:
>>> Bartlomiej Zolnierkiewicz  writes:
>>>
>>> > On Monday, April 20, 2015 02:07:33 PM Kevin Hilman wrote:
>>> >> Bartlomiej Zolnierkiewicz  writes:
>>> >>
>>> >> > Hi,
>>> >> >
>>> >> > This patch series removes the use of Exynos5250 specific support
>>> >> > from exynos-cpufreq driver and enables the use of cpufreq-dt driver
>>> >> > for this platform.  The exynos-cpufreq driver itself is also removed
>>> >> > as it is no longer used/needed after Exynos5250 support removal.
>>> >> >
>>> >> > This patch series has been tested on Exynos5250 based Arndale board.
>>> >> >
>>> >> > Depends on:
>>> >> > - next-20150330 branch of linux-next kernel tree
>>> >> > - "[PATCH 0/6] cpufreq: use generic cpufreq drivers for Exynos4210
>>> >> >   platform" [1]
>>> >> > - "[PATCH 0/6] cpufreq: use generic cpufreq drivers for Exynos4x12
>>> >> >   platform" [2]
>>> >> > - "[PATCH] cpufreq: exynos: remove dead ->need_apll_change method" [3]
>>> >>
>>> >> Any chance you could prepare a branch with all the dependencies for easy
>>> >> testing?
>>> >
>>> > All cpufreq changes with needed dependencies are now available in
>>> >
>>> > https://github.com/bzolnier/linux.git
>>> >
>>> > repository and the branch is
>>> >
>>> > next-20150330-generic-cpufreq-exynos5420-5800-v2
>>>
>>> Great, thanks.
>>>
>>> >> Also, The previous version from Thomas was v12, and this one is neither
>>> >> versioned nor has any reference to what may have changed since that
>>> >
>>> > Please note that Thomas' patchset was split on separate parts (this is
>>> > part #3) and heavily modified so the previous versioning was dropped.
>>> >
>>> > The cover letter of part #1 ("[PATCH 0/6] cpufreq: use generic cpufreq
>>> > drivers for Exynos4210 platform") contains detailed changelog on what has
>>> > been changed since Thomas' original v12 patch series.  Individual Thomas'
>>> > patches which were modified by me also contain such information.
>>> >
>>> > Part #2 ("[PATCH 0/6] cpufreq: use generic cpufreq drivers for Exynos4x12
>>> > platform") was entirely new code when compared to Thomas' v12 patchset so
>>> > its cover letter doesn't contain such detailed changelog as part #1.
>>> >
>>> > The newly posted part #4 ("[PATCH 0/8] cpufreq: add generic cpufreq driver
>>> > support for Exynos5250/5800 platforms" 
>>> > https://lkml.org/lkml/2015/4/21/314)
>>> > also contains the detailed changelog.
>>> >
>>> > However for part #3 (this one, "[PATCH 0/4] cpufreq: use generic cpufreq
>>> > drivers for Exynos5250 platform") such summary changelog got missed for
>>> > some reason.  Here it is:
>>> > - split Exynos5250 support from the original patch
>>> > - moved E5250_CPU_DIV[0,1]() macros to clk-exynos5250.c
>>> > - added CPU regulator supply property for Google Spring board
>>> > - removed exynos-cpufreq driver entirely as it is no longer used/needed
>>>
>>> Great, thanks for clarifying.
>>>
>>> >> version.  Also, on v12, I had several comments[1] and wonder if they've
>>> >> been addressed.
>>> >
>>> > All issues previously reported should have been fixed.  If you still see
>>> > some problems please let me know.
>>> >
>>> > [ I see now that exynos5420-arndale-octa.dts, exynos5420-peach-pit.dts,
>>> >   exynos5420-smdk5420.dts and exynos5800-peach-pi.dts should also have
>>> >   been updated to contain CPU cluster regulator supply properties or else
>>> >   if the default vdd_arm/vdd_kfc regulator state is set to too low value
>>> >   there may be problems with stability when switching to higher than
>>> >   default frequencies.  I have posted v2 version of patch #2/8 of part #4
>>> >   and pushed v2 combined branch on github.  Sorry for the inconvenience. ]
>>>
>>> I've now tested your v2 branch with the bL switcher disabled, CPUidle
>>> enabled and CPUfreq enabled.
>>>
>>> With the default governor set to performance, it fails to boot.  The last
>>> kernel messages on the console are:
>>
>> [ Small explanation for people not following the discussion from
>>   the start:
>>
>>   This testing is relevant to part #4 of the rework: "[PATCH 0/8]
>>   cpufreq: add generic cpufreq driver support for Exynos5250/5800
>>   platforms" (https://lkml.org/lkml/2015/4/21/314), not this one
>>   which is part #3 and has no known issues. ]
>>
>
> I know that Exynos5420/5422/5800 is related to part #4 and not #3 but I
> wanted to answer in this thread since here is where Kevin reported the
> issue. I tried your next-20150330-generic-cpufreq-exynos5420-5800-v2-debug
> branch with exynos_defconfig plus CONFIG_BL_SWITCHER disabled and:
>
> CONFIG_ARM_BIG_LITTLE_CPUFREQ=y
> CONFIG_ARM_DT_BL_CPUFREQ=y
>
> By default CONFIG_CPU_FREQ_GOV_PERFORMANCE=y but with that option it fails
> to boot as well on my Exynos5420 Peach Pit so seems to be exactly what Kevin
> reported on 

[PATCH net-next v3 1/2] pci: Add Cavium PCI vendor id

2015-05-15 Thread Aleksey Makarov
Signed-off-by: Aleksey Makarov 
---
 include/linux/pci_ids.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index e63c02a..3633cc6 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2327,6 +2327,8 @@
 #define PCI_DEVICE_ID_ALTIMA_AC91000x03ea
 #define PCI_DEVICE_ID_ALTIMA_AC10030x03eb
 
+#define PCI_VENDOR_ID_CAVIUM   0x177d
+
 #define PCI_VENDOR_ID_BELKIN   0x1799
 #define PCI_DEVICE_ID_BELKIN_F5D7010V7 0x701f
 
-- 
2.4.0



[PATCH net-next v3 0/2] Adding support for Cavium ThunderX network controller

2015-05-15 Thread Aleksey Makarov
This patchset adds support for the Cavium ThunderX network controller.

changes in v3:
 * code cleanup
 * issues discovered by reviewers were addressed

changes in v2:
 * non-generic module parameters removed
 * ethtool support added (nicvf_set_rxnfc())

v2: https://lkml.kernel.org/g/<1415596445-10061-1-git-send-email-r...@kernel.org>
v1: https://lkml.kernel.org/g/<20141030165434.GW20170@rric.localhost>

Aleksey Makarov (1):
  pci: Add Cavium PCI vendor id

Sunil Goutham (1):
  net: Adding support for Cavium ThunderX network controller

 MAINTAINERS|7 +
 drivers/net/ethernet/Kconfig   |1 +
 drivers/net/ethernet/Makefile  |1 +
 drivers/net/ethernet/cavium/Kconfig|   40 +
 drivers/net/ethernet/cavium/Makefile   |5 +
 drivers/net/ethernet/cavium/thunder/Makefile   |   11 +
 drivers/net/ethernet/cavium/thunder/nic.h  |  439 ++
 drivers/net/ethernet/cavium/thunder/nic_main.c |  927 +
 drivers/net/ethernet/cavium/thunder/nic_reg.h  |  214 +++
 .../net/ethernet/cavium/thunder/nicvf_ethtool.c|  625 +
 drivers/net/ethernet/cavium/thunder/nicvf_main.c   | 1337 ++
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 1416 
 drivers/net/ethernet/cavium/thunder/nicvf_queues.h |  378 ++
 drivers/net/ethernet/cavium/thunder/q_struct.h |  702 ++
 drivers/net/ethernet/cavium/thunder/thunder_bgx.c  |  999 ++
 drivers/net/ethernet/cavium/thunder/thunder_bgx.h  |  224 
 include/linux/pci_ids.h|2 +
 17 files changed, 7328 insertions(+)
 create mode 100644 drivers/net/ethernet/cavium/Kconfig
 create mode 100644 drivers/net/ethernet/cavium/Makefile
 create mode 100644 drivers/net/ethernet/cavium/thunder/Makefile
 create mode 100644 drivers/net/ethernet/cavium/thunder/nic.h
 create mode 100644 drivers/net/ethernet/cavium/thunder/nic_main.c
 create mode 100644 drivers/net/ethernet/cavium/thunder/nic_reg.h
 create mode 100644 drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
 create mode 100644 drivers/net/ethernet/cavium/thunder/nicvf_main.c
 create mode 100644 drivers/net/ethernet/cavium/thunder/nicvf_queues.c
 create mode 100644 drivers/net/ethernet/cavium/thunder/nicvf_queues.h
 create mode 100644 drivers/net/ethernet/cavium/thunder/q_struct.h
 create mode 100644 drivers/net/ethernet/cavium/thunder/thunder_bgx.c
 create mode 100644 drivers/net/ethernet/cavium/thunder/thunder_bgx.h

-- 
2.4.0



Re: [PATCH v2] ARM: dts: Add syscon property to the MIPI phy in exynos5420

2015-05-15 Thread Kukjin Kim
On 05/15/15 22:22, Krzysztof Kozlowski wrote:
> 2015-05-14 21:40 GMT+09:00 Tomeu Vizoso :
>> Since e4b3d38088df6f3acd40 ("phy: exynos-video-mipi: Fix regression by
>> adding support for PMU regmap") the syscon property is required in
>> samsung,s5pv210-mipi-video-phy nodes, but this DTS hadn't been updated
>> yet.
>>
>> Signed-off-by: Tomeu Vizoso 
>> Reviewed-by: Javier Martinez Canillas 
>> Cc: Sylwester Nawrocki 
>> Cc: Krzysztof Kozłowski 
>>
>> --
> 
> Triple-dash please, because this won't be cut by "git am".
> 
> Everything else looks fine, thank you.
> Reviewed-by: Krzysztof Kozlowski 
> 
Applied, thanks for your effort, guys.

- Kukjin


Re: [PATCH 0/9] multi_v7_defconfig: Enable options for Exynos Chromebooks

2015-05-15 Thread Kukjin Kim
On 05/15/15 21:00, Krzysztof Kozlowski wrote:
> 2015-05-15 20:37 GMT+09:00 Arnd Bergmann :
>> On Thursday 14 May 2015 17:40:07 Javier Martinez Canillas wrote:
>>> Hello arm-soc maintainers,
>>>
>>> This series is an attempt to reduce the delta between exynos_defconfig
>>> and multi_v7_defconfig. Primarily to enable the needed Kconfig symbols
>>> to make all Exynos Chromebooks peripherals to be working when building
>>> an image using the ARMv7 multi-platform default config.
>>>
>>> Since the policy is now to enable as much as possible, I did build
>>> as a module all the Kconfig symbols that were tristate and only enable
>>> as built-in those that can't be a module because are boolean options.
>>>
>>> A nice side effect of this series is that I found that many drivers
>>> were not working properly when built as a module because the modalias
>>> information was not filled properly or at all. I've posted patches to
>>> fix the issues I found when testing this series.
>>>
>>> The patches have been tested on an Exynos5250 Snow, Exynos5420 Peach
>>> Pit and Exynos5800 Peach Pi Chromebooks but most config options will
>>> be useful for other Exynos5 or other Samsung SoCs.
>>>
>>> The series is composed of the following patches that can be applied on
>>> top of your next/defconfig branch [0].
>>
>> Looks good to me. My preferred approach for merging would be to have
>> Kukjin pick up these patches and send a pull request, along with other
>> defconfig changes he might have for exynos.
> 
OK, I'll do that this weekend.

> I have some other old patches in the same topic - related to important
> stuff for Exynos boards. I'll rebase them and ask Kukjin for picking.
> 
Yeah, OK.

Thanks,
Kukjin


Re: [PATCH 4.0 00/60] 4.0.4-stable review

2015-05-15 Thread Guenter Roeck

On 05/15/2015 04:14 PM, Greg Kroah-Hartman wrote:

This is the start of the stable review cycle for the 4.0.4 release.
There are 60 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sun May 17 23:10:03 UTC 2015.
Anything received after that time might be too late.



Build results:
total: 125 pass: 125 fail: 0
Qemu test results:
total: 30 pass: 30 fail: 0

Details are available at http://server.roeck-us.net:8010/builders.

Guenter



Re: [PATCH 4.0 00/60] 4.0.4-stable review

2015-05-15 Thread Shuah Khan
On 05/15/2015 05:14 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.0.4 release.
> There are 60 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sun May 17 23:10:03 UTC 2015.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.0.4-rc1.gz
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah


-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America (Silicon Valley)
shua...@osg.samsung.com | (970) 217-8978


Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Fri, May 15, 2015 at 07:23:11PM -0700, Linus Torvalds wrote:

> For filesystems that say that they are ok with it, make lookup_slow()
> (and *only* lookup_slow for now) instead take the rwsem for reading,
> but in addition to that, take a hashed mutex.
> 
> By "hashed mutex", I mean having a smallish table of mutexes (say,
> 1024), and just creating a hash based on the name-hash and the parent
> pointer. That way we can avoid all the issues with adding a new lock
> to the dentry itself, or having to allocate a new child dentry just
> for the lock. It *could* cause some cross-directory serialization due
> to hash collisions, but that shouldn't be noticeable if the hash is of
> a reasonable size and quality.

What for?  All we need is a flag, waitqueue and being woken
up when the flag gets cleared.  So let's just use the queue of parent's
->i_mutex and explicitly kick it when removing dentry flag.  We *are*
holding a reference on parent (we need that to hold that sucker shared,
after all), so it's not going away under us...
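
A minimal userspace sketch of the "flag, waitqueue, wake when the flag
clears" idea above, using pthreads in place of kernel primitives.  All names
here (dir_sketch, child_sketch, lookup_begin, lookup_end, lookup_wait) are
hypothetical and not taken from the patchset; the condition variable only
stands in for the wait queue of the parent's ->i_mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a parent directory: one wait queue shared by
 * all of its children, so no per-child lock has to be allocated. */
struct dir_sketch {
	pthread_mutex_t lock;   /* protects the in_lookup flags of children */
	pthread_cond_t  waitq;  /* stands in for the parent's wait queue */
};

#define DIR_SKETCH_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER }

/* Hypothetical stand-in for a child dentry that carries only a flag. */
struct child_sketch {
	struct dir_sketch *parent;
	bool in_lookup;
};

/* Try to claim the lookup.  Returns true if the caller now owns it and
 * must perform the lookup, false if somebody else is already doing it. */
bool lookup_begin(struct child_sketch *c)
{
	bool owner;

	pthread_mutex_lock(&c->parent->lock);
	owner = !c->in_lookup;
	if (owner)
		c->in_lookup = true;
	pthread_mutex_unlock(&c->parent->lock);
	return owner;
}

/* Clear the flag and explicitly kick everybody sleeping on the parent. */
void lookup_end(struct child_sketch *c)
{
	pthread_mutex_lock(&c->parent->lock);
	assert(c->in_lookup);
	c->in_lookup = false;
	pthread_cond_broadcast(&c->parent->waitq);
	pthread_mutex_unlock(&c->parent->lock);
}

/* Sleep until the flag on this particular child has been cleared. */
void lookup_wait(struct child_sketch *c)
{
	pthread_mutex_lock(&c->parent->lock);
	while (c->in_lookup)
		pthread_cond_wait(&c->parent->waitq, &c->parent->lock);
	pthread_mutex_unlock(&c->parent->lock);
}
```

Waiters on unrelated children of the same parent get woken as well and simply
go back to sleep after rechecking their own flag; that is the price of sharing
one queue per directory in this sketch.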

I'm all for gradual transformations, but in this case I suspect
that doing it on per-fs basis isn't the best way to do it; gradual massage
of code using dcache lookups or walking the lists of children in filesystems
(fortunately, it's fairly rare these days, and we only need to care about
the code checking if such a beast is hashed; d_alloc() already places new
dentry on the list of children) would seem to be a better approach.  We'd
also need to audit fs/dcache.c tree-walking-related code itself, but that's
much more limited.


Re: [PATCH 3.10 00/17] 3.10.79-stable review

2015-05-15 Thread Shuah Khan
On 05/15/2015 05:10 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.10.79 release.
> There are 17 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sun May 17 23:09:54 UTC 2015.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.10.79-rc1.gz
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America (Silicon Valley)
shua...@osg.samsung.com | (970) 217-8978


Re: [PATCH 3.14 00/51] 3.14.43-stable review

2015-05-15 Thread Guenter Roeck

On 05/15/2015 04:10 PM, Greg Kroah-Hartman wrote:

This is the start of the stable review cycle for the 3.14.43 release.
There are 51 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sun May 17 23:09:28 UTC 2015.
Anything received after that time might be too late.



Build results:
total: 127 pass: 127 fail: 0
Qemu test results:
total: 30 pass: 30 fail: 0

Details are available at http://server.roeck-us.net:8010/builders.

Guenter



Re: [PATCH 3.14 00/51] 3.14.43-stable review

2015-05-15 Thread Shuah Khan
On 05/15/2015 05:10 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.14.43 release.
> There are 51 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sun May 17 23:09:28 UTC 2015.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.14.43-rc1.gz
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America (Silicon Valley)
shua...@osg.samsung.com | (970) 217-8978


Re: [PATCH 3.10 00/17] 3.10.79-stable review

2015-05-15 Thread Guenter Roeck

On 05/15/2015 04:10 PM, Greg Kroah-Hartman wrote:

This is the start of the stable review cycle for the 3.10.79 release.
There are 17 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sun May 17 23:09:54 UTC 2015.
Anything received after that time might be too late.



Build results:
total: 127 pass: 126 fail: 1
Failed builds:
s390:allmodconfig

Qemu test results:
total: 27 pass: 27 fail: 0

Results are as expected.
Details are available at http://server.roeck-us.net:8010/builders.

Guenter




Re: [PATCH v4 4/5] clk: hi6220: Clock driver support for Hisilicon hi6220 SoC

2015-05-15 Thread Brent Wang
Hello Stephen,

2015-05-16 3:30 GMT+08:00 Stephen Boyd :
> On 05/15, Bintian wrote:
>> On 2015/5/15 8:25, Stephen Boyd wrote:
>> >On 05/05, Bintian Wang wrote:
>> >>diff --git a/drivers/clk/hisilicon/clkdivider-hi6220.c b/drivers/clk/hisilicon/clkdivider-hi6220.c
>> >>+
>> >>+/**
>> >>+ * struct hi6220_clk_divider - divider clock for hi6220
>> >>+ *
>> >>+ * @hw:    handle between common and hardware-specific interfaces
>> >>+ * @reg:   register containing divider
>> >>+ * @shift: shift to the divider bit field
>> >>+ * @width: width of the divider bit field
>> >>+ * @mask:  mask for setting divider rate
>> >>+ * @table: the div table that the divider supports
>> >>+ * @lock:  register lock
>> >>+ */
>> >>+struct hi6220_clk_divider {
>> >>+   struct clk_hw   hw;
>> >>+   void __iomem    *reg;
>> >>+   u8              shift;
>> >>+   u8              width;
>> >>+   u32             mask;
>> >>+   const struct clk_div_table *table;
>> >>+   spinlock_t      *lock;
>> >>+};
>> >
>> >The clk-divider.c code has been made "reusable". Can you please
>> >try to use the functions that it now exposes instead of
>> >copy/pasting it and modifying it to suit your needs? A lot of
>> >this code looks the same.
>> In fact, I discussed this problem with Rob Herring and Mike Turquette
>> in the 96boards internal mail list before.
>>
>> The divider in hi6220 has a mask bit to guarantee writing the correct
>> bits in the register when setting the rate, but there is no rule for
>> deriving the index of this mask bit (e.g. by left-shifting by some fixed
>> number of bits), so I added this divider clock to handle it. We can regard
>> hi6220_clk_divider as a special case of the generic divider clock.
>>
>> If I don't add this divider clock for the hi6220 chip, then I would have to
>> change the core APIs "clk_register_divider" and "clk_register_divider_table",
>> and then many other drivers would need to be updated.
>> So I think just adding this divider clock is a good solution for now.
>
> I think you missed my point. I didn't suggest using
> clk_register_divider or clk_register_divider_table(). I'm
> suggesting to use
>
> unsigned long divider_recalc_rate(struct clk_hw *hw, unsigned long parent_rate,
>                 unsigned int val, const struct clk_div_table *table,
>                 unsigned long flags);
> long divider_round_rate(struct clk_hw *hw, unsigned long rate,
>                 unsigned long *prate, const struct clk_div_table *table,
>                 u8 width, unsigned long flags);
> int divider_get_val(unsigned long rate, unsigned long parent_rate,
>                 const struct clk_div_table *table, u8 width,
>                 unsigned long flags);
Got it and I will prepare next version soon.
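
For illustration only, a simplified userspace sketch of what table-driven
divider helpers of this kind compute.  It is not the clk-divider.c code: the
struct and the sketch_* functions are hypothetical, flags and field width are
ignored, and the table layout assumed is the val = div - 1 one built in the
quoted patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one row of a divider table. */
struct div_entry_sketch {
	unsigned int val;  /* value written to the register bit field */
	unsigned int div;  /* divisor that value selects; 0 ends the table */
};

/* Map a register value to its divisor; returns 0 if it is not listed. */
unsigned int sketch_div_from_val(const struct div_entry_sketch *table,
				 unsigned int val)
{
	assert(table != NULL);
	for (; table->div; table++)
		if (table->val == val)
			return table->div;
	return 0;
}

/* Rate seen at the output for a given register value.  An unknown value
 * is treated as "no division" in this sketch. */
unsigned long sketch_recalc_rate(unsigned long parent_rate, unsigned int val,
				 const struct div_entry_sketch *table)
{
	unsigned int div = sketch_div_from_val(table, val);

	return div ? parent_rate / div : parent_rate;
}

/* Pick the register value giving the highest rate that does not exceed
 * the requested one; returns -1 if every entry is too fast. */
int sketch_get_val(unsigned long rate, unsigned long parent_rate,
		   const struct div_entry_sketch *table)
{
	unsigned long best_rate = 0;
	int best_val = -1;

	assert(table != NULL);
	for (; table->div; table++) {
		unsigned long r = parent_rate / table->div;

		if (r <= rate && (best_val < 0 || r > best_rate)) {
			best_rate = r;
			best_val = (int)table->val;
		}
	}
	return best_val;
}
```

The point of keeping such helpers separate from register access is that a
driver with an unusual write sequence (such as the mask bit discussed above)
can reuse the rate arithmetic and supply only its own read and write paths.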

>
>> >>+   return ERR_PTR(-ENOMEM);
>> >>+   }
>> >>+
>> >>+   for (i = 0; i < max_div; i++) {
>> >>+   table[i].div = min_div + i;
>> >>+   table[i].val = table[i].div - 1;
>> >>+   }
>> >>+
>> >>+   init.name = name;
>> >>+   init.ops = _clkdiv_ops;
>> >>+   init.flags = flags | CLK_IS_BASIC;
>> >
>> >It's basic?
>> I rechecked this flag; it's really useless to us, so I can remove it.
>> But can you tell me in which case I should use it?
>
> I think the basic flag is there for drivers that want to know what type
> of clock they're dealing with when all they have is the struct clk_hw
> pointer. I like to discourage use of this flag in hopes of deleting
> it someday.
>
>>
>> How about just sending this patch for review, not the whole patch set, in
>> the next version?
>>
>
> Yes a single patch is fine. I take it you want the patch to go
> through arm-soc with some Ack from us?
Yes, exactly.
The dts file includes the clock header file, so having this patch go through
arm-soc is a good choice.

Thanks,

Bintian
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> a Linux Foundation Collaborative Project


Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Linus Torvalds
On Fri, May 15, 2015 at 6:55 PM, Al Viro  wrote:
>
> See upthread.  It might be doable (provided that we turn ->i_mutex into
> rwsem, to keep the exclusion with directory _modifiers_), but it'll need
> a really non-trivial code review of a bunch of filesystems, especially ones
> that want to play with the list of children like ceph does.  And things
> like sillyrename and dcache-populating readdir instances, albeit not as scary
> as ceph.  And then there's lustre...

Yup.

I don't think it's viable if we can't do it gradually, and leave
filesystems with the option to basically keep the existing locking.
Because most won't care that deeply anyway, and some have
complications like ceph.

But we might be able to do *some* changes that wouldn't be that
noticeable. For example, something like

 - phase 1:

   Turn i_mutex into an rwsem, change all users to take it for writing

   This part should be pretty much a semantic no-op.

 - phase 2:

   For filesystems that say that they are ok with it, make lookup_slow()
   (and *only* lookup_slow for now) instead take the rwsem for reading,
   but in addition to that, take a hashed mutex.

By "hashed mutex", I mean having a smallish table of mutexes (say,
1024), and just creating a hash based on the name-hash and the parent
pointer. That way we can avoid all the issues with adding a new lock
to the dentry itself, or having to allocate a new child dentry just
for the lock. It *could* cause some cross-directory serialization due
to hash collisions, but that shouldn't be noticeable if the hash is of
a reasonable size and quality.

That would allow lookups (and _only_ lookups) to happen in parallel,
but the hashed mutex would mean that you'd serialize the "same name in
same directory" case. And we'd require filesystems to say "I can
support this concurrent lookup model".

There might be a "phase 3" and so on where we could expand this to
slightly more than just lookup_slow(), but I suspect that even doing
it  *just* there would already catch the bulk of issues. And requiring
filesystems to sign up for it means that we can ignore any ugly cases.

I dunno. The above _sounds_ fairly safe and easy because of how it
limits the impact. But I might be missing something.

Linus


Re: [PATCH/RFC] kbuild: Create a rule for building device tree overlay objects

2015-05-15 Thread Frank Rowand
On 5/15/2015 5:47 PM, Frank Rowand wrote:
> On 5/12/2015 7:33 AM, Pantelis Antoniou wrote:
>> Hi Geert,
>>
>>> On May 12, 2015, at 14:56 , Geert Uytterhoeven wrote:
>>>
>>> This allows to handle device tree overlays like plain device trees.
>>>
>>> Signed-off-by: Geert Uytterhoeven 
>>> ---
>>> Questions:
>>>  - Do we want dtso files under arch//boot/dts/, too?
>>>  - Do we want to move the dts files outside the kernel repository
>>>first?
>>>
>>
>> Oh that’s a nice hornet’s nest you’ve kicked here.
>>
>> arch//boot/dts should not be the place, because overlays are not related
>> to boot per se.
>> As they are right now, they are board (family) specific.
> 
> Aren't overlays meant to describe child boards (capes, shields, whatever) that
> may vary from system to system, but are not expected to be hot-plugged while
> the OS is up?  Or is hot-plug a design goal?

To reply to myself, there is a current discussion about whether to use overlays
to help with a powerpc pci desire to add and delete subtrees:

   http://www.spinics.net/lists/linux-pci/msg40740.html

That sounds like hot-plug to me...

> 
> If no hot-plug, then to me an overlay is just as related to boot as the base
> dts.  It is a mere implementation detail that overlays are "loaded" from
> userspace instead of by the booting kernel (I don't really know the details of
> using overlays, so please correct me if I am wrong about how the kernel becomes
> aware of an overlay).
> 
>>
>> I think we should try to keep an external kernel repo with them for now
>> until we figure out where to put them.
>>
>>> scripts/Makefile.lib | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
>>> index 79e86613712f2230..4b14eef1d4b2ce8f 100644
>>> --- a/scripts/Makefile.lib
>>> +++ b/scripts/Makefile.lib
>>> @@ -292,6 +292,9 @@ cmd_dtc = mkdir -p $(dir ${dtc-tmp}) ; \
>>> $(obj)/%.dtb: $(src)/%.dts FORCE
>>> $(call if_changed_dep,dtc)
>>>
>>> +$(obj)/%.dtbo: $(src)/%.dtso FORCE
>>> +   $(call if_changed_dep,dtc)
>>> +
>>> dtc-tmp = $(subst $(comma),_,$(dot-target).dts.tmp)
>>>
>>> # Bzip2
>>> -- 
>>> 1.9.1
>>
>> Regards
>>
>> — Pantelis
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe devicetree" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 
> 



Re: [PATCH v2 1/2] cpufreq_stats: Adds sysfs file /sys/devices/system/cpu/cpufreq/current_in_state

2015-05-15 Thread Viresh Kumar
On 15-05-15, 17:55, Ruchi Kandoi wrote:
> Some of the hand-held devices support hot-plugging of the cpus and
> when the core is hot-plugged out the
> /sys/devices/system/cpu/cpuX/cpufreq directory is removed too. So it
> won't be possible to share folders by multiple CPUs.

Okay, that is changing now..

http://permalink.gmane.org/gmane.linux.power-management.general/59841

-- 
viresh


Re: [PATCH v2] MIPS64: Support of at least 48 bits of SEGBITS

2015-05-15 Thread Maciej W. Rozycki
On Fri, 15 May 2015, Leonid Yegoshin wrote:

> > Many processors support larger VA space than is utilized by the kernel.
> >A choice was made to reduce the size of the VA space in order to
> > reduce TLB handling overhead.
> >
> > If the true reason for the patch is to enable larger VA space, say that.
> >But is it really required by those processors you mention?  I doubt it.
> 
> Well, I was not aware about many processors capability, I can't find this kind
> of note anywhere.

 The R1 and friends all have a 44-bit virtual address space, so this 
is no news to Linux.  This is noted in  right above the 
change you made there.

  Maciej


Re: [PATCH 2/5] PM / Wakeirq: Add automated device wake IRQ handling

2015-05-15 Thread Felipe Balbi
Hi,

On Fri, May 15, 2015 at 03:25:13PM -0700, Tony Lindgren wrote:



> diff --git a/drivers/base/power/wakeirq.c b/drivers/base/power/wakeirq.c
> new file mode 100644
> index 000..1125481
> --- /dev/null
> +++ b/drivers/base/power/wakeirq.c
> @@ -0,0 +1,276 @@
> +/*
> + * wakeirq.c - Device wakeirq helper functions
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed "as is" WITHOUT ANY WARRANTY of any
> + * kind, whether express or implied; without even the implied warranty
> + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "power.h"
> +
> +/**
> + * dev_pm_attach_wake_irq - Attach device interrupt as a wake IRQ
> + * @dev: Device entry
> + * @irq: Device wake-up capable interrupt
> + * @wirq: Wake irq specific data
> + *
> + * Internal function to attach either a device IO interrupt or a
> + * dedicated wake-up interrupt as a wake IRQ.
> + */
> +static int dev_pm_attach_wake_irq(struct device *dev, int irq,
> +   struct wake_irq *wirq)
> +{
> + unsigned long flags;
> + int err;
> +
> + if (!dev || !wirq)
> + return -EINVAL;
> +
> + if (!dev->power.wakeup) {
> + dev_err(dev, "forgot to call device_init_wakeup?\n");
> + return -EINVAL;
> + }
> +
> + spin_lock_irqsave(&dev->power.lock, flags);
> + if (WARN_ON(dev->power.wakeirq)) {
> + dev_err(dev, "wake irq already initialized\n");

these two can be combined if you can live with a WARN_ONCE() instead:

	if (dev_WARN_ONCE(dev, dev->power.wakeirq,
			  "wake irq already initialized\n")) {
		spin_unlock_irqrestore(&dev->power.lock, flags);
		return -EEXIST;
	}

dev_WARN() needs to be fixed at some point to accept a "condition"
argument :s

But really, no strong feelings.

> +static irqreturn_t handle_threaded_wakeirq(int wakeirq, void *_wirq)
> +{
> + struct wake_irq *wirq = _wirq;
> +
> + /* We don't want RPM_ASYNC or RPM_NOWAIT here */
> + return pm_runtime_resume(wirq->dev) ? IRQ_NONE : IRQ_HANDLED;

I wonder if you should add a pm_runtime_mark_last_busy() here. I guess
not after your previous patch ?

> diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> index 7726200..7191519 100644
> --- a/drivers/base/power/wakeup.c
> +++ b/drivers/base/power/wakeup.c
> @@ -238,6 +239,100 @@ int device_wakeup_enable(struct device *dev)
>  }
>  EXPORT_SYMBOL_GPL(device_wakeup_enable);
>  
> +#ifdef CONFIG_PM_WAKEIRQ
> +
> +/**
> + * device_wakeup_attach_irq - Attach a wakeirq to a wakeup source
> + * @dev: Device to handle
> + * @irq: Device specific wakeirq entry

s/irq/wakeirq to match argument name below ?

> + * Attach a device specific wakeirq to the device specific
> + * wakeup source so the device wakeirq can be configured
> + * automatically for suspend and resume.
> + */
> +int device_wakeup_attach_irq(struct device *dev,
> +  struct wake_irq *wakeirq)
> +{
> + struct wakeup_source *ws;
> + int ret = 0;
> +
> + spin_lock_irq(&dev->power.lock);
> + ws = dev->power.wakeup;
> + if (!ws) {
> + ret = -EINVAL;
> + goto unlock;
> + }
> +
> + if (ws->wakeirq) {
> + ret = -EEXIST;
> + goto unlock;
> + }
> +
> + ws->wakeirq = wakeirq;
> +
> +unlock:
> + spin_unlock_irq(&dev->power.lock);
> +
> + return ret;
> +}



-- 
balbi




Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Fri, May 15, 2015 at 06:47:04PM -0700, Linus Torvalds wrote:

> Now, maybe we could solve it with a new sleeping lock in the dentry
> itself. Maybe we could allocate the new dentry early, add it to the
> directory the usual way, but mark it as being "not ready" (so that
> d_lookup() wouldn't use it). And have the sleeping lock be a new
> sleeping lock in the dentry.

See upthread.  It might be doable (provided that we turn ->i_mutex into
rwsem, to keep the exclusion with directory _modifiers_), but it'll need
a really non-trivial code review of a bunch of filesystems, especially ones
that want to play with the list of children like ceph does.  And things
like sillyrename and dcache-populating readdir instances, albeit not as scary
as ceph.  And then there's lustre...


Re: [PATCH] regulator: Add SPMI regulator driver

2015-05-15 Thread Frank Rowand
On 5/12/2015 2:39 PM, Stephen Boyd wrote:
> Add an SPMI regulator driver for Qualcomm's PM8941 and PM8916
> PMICs. This driver is based largely on code from
> codeaurora.org[1].
> 
> [1] https://www.codeaurora.org/cgit/quic/la/kernel/msm-3.10/tree/drivers/regulator/qpnp-regulator.c?h=msm-3.10
> Cc: David Collins 
> Cc: 
> Signed-off-by: Stephen Boyd 
> ---
>  .../bindings/regulator/qcom,spmi-regulator.txt |  225 +++
>  drivers/regulator/Kconfig                      |   11 +
>  drivers/regulator/Makefile                     |    1 +
>  drivers/regulator/qcom_spmi-regulator.c        | 1750
>  4 files changed, 1987 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/regulator/qcom,spmi-regulator.txt
>  create mode 100644 drivers/regulator/qcom_spmi-regulator.c
> 
> diff --git a/Documentation/devicetree/bindings/regulator/qcom,spmi-regulator.txt b/Documentation/devicetree/bindings/regulator/qcom,spmi-regulator.txt
> new file mode 100644
> index ..b89744da62d0
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/regulator/qcom,spmi-regulator.txt
> @@ -0,0 +1,225 @@
> +Qualcomm SPMI Regulators
> +
> +- compatible:
> + Usage: required
> + Value type: 
> + Definition: must be one of:
> + "qcom,pm8841-regulators"
> + "qcom,pm8916-regulators"
> + "qcom,pm8941-regulators"
> +
> +- interrupts:
> + Usage: optional
> + Value type: 
> + Definition: List of OCP interrupts.
> +
> +- interrupt-names:
> + Usage: required if 'interrupts' property present
> + Value type: 
> + Definition: List of strings defining the names of the
> + interrupts in the 'interrupts' property 1-to-1.
> + Supported values are "ocp-", where
> +  corresponds to a voltage switch
> + type regulator.
> +
> +- vdd_s1-supply:
> +- vdd_s2-supply:
> +- vdd_s3-supply:
> +- vdd_s4-supply:
> +- vdd_s5-supply:
> +- vdd_s6-supply:
> +- vdd_s7-supply:
> +- vdd_s8-supply:
> + Usage: optional (pm8841 only)
> + Value type: 
> + Definition: Reference to regulator supplying the input pin, as
> + described in the data sheet.
> +
> +- vdd_s1-supply:
> +- vdd_s2-supply:
> +- vdd_s3-supply:
> +- vdd_s4-supply:
> +- vdd_l1_l3-supply:
> +- vdd_l2-supply:
> +- vdd_l4_l5_l6-supply:
> +- vdd_l7-supply:
> +- vdd_l8_l11_l14_l15_l16-supply:
> +- vdd_l9_l10_l12_l13_l17_l18-supply:
> + Usage: optional (pm8916 only)
> + Value type: 
> + Definition: Reference to regulator supplying the input pin, as
> + described in the data sheet.
> +
> +- vdd_s1-supply:
> +- vdd_s2-supply:
> +- vdd_s3-supply:
> +- vdd_l1_l3-supply:
> +- vdd_l2_lvs_1_2_3-supply:
> +- vdd_l4_l11-supply:
> +- vdd_l5_l7-supply:
> +- vdd_l6_l12_l14_l15-supply:
> +- vdd_l8_l16_l18_19-supply:
> +- vdd_l9_l10_l17_l22-supply:
> +- vdd_l13_l20_l23_l24-supply:
> +- vdd_l21-supply:
> +- vin_5vs-supply:
> + Usage: optional (pm8941 only)
> + Value type: 
> + Definition: Reference to regulator supplying the input pin, as
> + described in the data sheet.
> +
> +
> +The regulator node houses sub-nodes for each regulator within the device. Each
> +sub-node is identified using the node's name, with valid values listed for each
> +of the PMICs below.
> +
> +pm8841:
> + s1, s2, s3, s4, s5, s6, s7, s8
> +
> +pm8916:
> + s1, s2, s3, s4, l1, l2, l3, l4, l5, l6, l7, l8, l9, l10, l11, l12, l13,
> + l14, l15, l16, l17, l18
> +
> +pm8941:
> + s1, s2, s3, l1, l2, l3, l4, l5, l6, l7, l8, l9, l10, l11, l12, l13, l14,
> + l15, l16, l17, l18, l19, l20, l21, l22, l23, l24, lvs1, lvs2, lvs3,
> + mvs1, mvs2
> +
> +The content of each sub-node is defined by the standard binding for regulators -
> +see regulator.txt - with additional custom properties described below:
> +
> +- qcom,system-load:
> + Usage: optional
> + Value type: 
> + Description: Load in uA present on regulator that is not captured by
> +  any consumer request.
> +
> +- qcom,auto-mode-enable:
> + Usage: optional
> + Value type: 
> + Description: 1 = Enable automatic hardware selection of regulator
> +  mode (HPM vs LPM); not available on boost type
> +  regulators. 0 = Disable auto mode selection.
> +
> +- qcom,bypass-mode-enable:
> + Usage: optional
> + Value type: 
> + Description: 1 = Enable bypass mode for an LDO type regulator so that
> +  it acts like a switch and simply outputs its input
> +  voltage. 0 = Do not enable bypass mode.
> +
> +- qcom,ocp-enable:
> + Usage: optional
> + Value type: 
> + Description: 1 = Allow over current protection (OCP) to be enabled for
> +  voltage switch type regulators so that they latch off
> +  automatically when over 

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Sat, May 16, 2015 at 11:25:03AM +1000, NeilBrown wrote:
> But surely those things can be managed with a spinlock.
> 
> I think a big part of the problem is that the VFS tries to control
> filesystems rather than provide services to them.

What with being the thing syscalls talk to for sending the requests to
filesystems...  Do you really want to push the pathname resolution into
fs code?  You've looked at it lately, right?

> I'm not convinced that serialising 'lookup' calls is vital.  If two threads
> find a 'not-validated' dentry, and both try to look up the inode, they
> will both ultimately get the same struct inode from the icache, and will both
> succeed in connecting it to the dentry.  Obviously it would be better to
> avoid two concurrent NFS "LOOKUP" requests, but that is a problem for NFS to
> solve.  I suspect that using d_fsdata to point to a pending LOOKUP request
> would allow the "second" thread to wait for that request to finish.  Other
> filesystems would take a completely different approach.

See upthread regarding multiple negative dentries with the same name and fun
consequences thereof.  There might be _NO_ inode.  At all.  dcache has a large
negative component and without it you'd get really fucked on NFS as soon
as you try to compile anything.  Shitloads of headers, looked up in a lot of
directories.  Most of the lookups ending up negative.  We really do need that
stuff...
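
To make the negative-entry point concrete, a toy userspace sketch of a name
cache that remembers "no such name" answers.  Nothing here is dcache code: the
structures, the fixed-size table and the pretend backend (in which only
"stdio.h" exists) are all hypothetical.  backend_calls counts how often the
expensive lookup is actually issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS     64
#define NAME_LEN_SKETCH 32

enum entry_state { ENTRY_EMPTY = 0, ENTRY_POSITIVE, ENTRY_NEGATIVE };

struct cache_entry_sketch {
	enum entry_state state;
	char name[NAME_LEN_SKETCH];
	unsigned long ino;          /* meaningful only for positive entries */
};

struct name_cache_sketch {
	struct cache_entry_sketch slots[CACHE_SLOTS];
	unsigned long backend_calls; /* number of expensive lookups issued */
};

/* Pretend backend (think: a round trip to a server).  Only "stdio.h"
 * exists in this sketch. */
bool sketch_backend_lookup(const char *name, unsigned long *ino)
{
	if (strcmp(name, "stdio.h") == 0) {
		*ino = 7;
		return true;
	}
	return false;
}

/* Returns true and fills *ino if the name exists, false if it does not.
 * Both outcomes are cached, so a repeated miss never reaches the backend. */
bool cached_lookup(struct name_cache_sketch *c, const char *name,
		   unsigned long *ino)
{
	struct cache_entry_sketch *free_slot = NULL;
	bool found;

	assert(strlen(name) < NAME_LEN_SKETCH);
	for (size_t i = 0; i < CACHE_SLOTS; i++) {
		struct cache_entry_sketch *e = &c->slots[i];

		if (e->state == ENTRY_EMPTY) {
			if (!free_slot)
				free_slot = e;
			continue;
		}
		if (strcmp(e->name, name) == 0) {
			if (e->state == ENTRY_NEGATIVE)
				return false;   /* cached "no such name" */
			*ino = e->ino;
			return true;
		}
	}

	found = sketch_backend_lookup(name, ino);
	c->backend_calls++;
	if (free_slot) {
		strcpy(free_slot->name, name);
		free_slot->ino = found ? *ino : 0;
		free_slot->state = found ? ENTRY_POSITIVE : ENTRY_NEGATIVE;
	}
	return found;
}
```

A compile that probes a dozen include directories for every header hits mostly
the ENTRY_NEGATIVE branch; without it each of those probes would go to the
backend again.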


Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Linus Torvalds
On Fri, May 15, 2015 at 6:25 PM, NeilBrown  wrote:
>>
>>For example, simply that we only ever have one single dentry for a
>> particular name, and that we only ever have one active lookup per
>> dentry. Those things happen independently of - and before - the server
>> even sees the operation.
>
> But surely those things can be managed with a spinlock.

Some of them could. But no, not in general. Exactly because of that
"only one lookup of this particular dentry". The lookup sleeps, so
when another thread tries to look up the same dentry, it needs to hit
a sleeping lock. And since at that point we don't have the new dentry
yet (much less the inode we're looking up), it is in the only place we
*do* have: the inode of the directory we're looking up.

Now, maybe we could solve it with a new sleeping lock in the dentry
itself. Maybe we could allocate the new dentry early, add it to the
directory the usual way, but mark it as being "not ready" (so that
d_lookup() wouldn't use it). And have the sleeping lock be a new
sleeping lock in the dentry.

But that would be a very big change. And I suspect Al has tons of
reasons why it would have various problems. We do rely on the i_mutex
a fair amount..

> I think a big part of the problem is that the VFS tries to control
> filesystems rather than provide services to them.

Well, most of the time we do aim to provide services (eg the page
cache generally works that way).

But the dentry layer really is pretty important. And it's a *huge*
performance win. Yes, yes, you don't see it when it works, you only
see the bad cases, and there is no way in hell we can let the
low-level filesystem control the dentry tree.

Also, remember: we do have something like 65 different filesystems.
It's a *big* deal to make them all work "well enough". Making the
locking rules unambiguous is pretty primary.

Making one particular filesystem perform well is important to _you_,
but in the big picture...

>>  - readdir(). This is mostly to make it hard for filesystems to do the
>> wrong thing when there is concurrent file creation.
>
> and makes it impossible for them to do the right thing too?

Well, possibly. As mentioned, the readdir case might actually be
pretty trivial, but it just hasn't come up often.

I've literally only ever seen it for samba. I'd take a tested patch. Hint, hint.

>> Again, there tend to be no simple benchmarks or loads that people care
>> about that show this. Most of the time it's fairly hard to see.
>
> Which "this"?  Readdir?  I haven't come across one (NFS READDIR issues are
> all about when to use READDIR_PLUS and when not to).
> Or create-contention-on-i_mutex?  The load my customer had was 'make -j60' on
> a many-cpu machine.

Both/either. The problem is that the "make -j60" case seems to depend
on the filesystem having bad latency - it certainly doesn't show up on
a local filesystem. Maybe that could be simulated some way..

  Linus


Re: Patch "block: destroy bdi before blockdev is unregistered." has been added to the 4.0-stable tree

2015-05-15 Thread NeilBrown
On Fri, 15 May 2015 16:16:32 -0700 Greg KH  wrote:

> On Fri, May 15, 2015 at 04:03:23PM +0900, Sergey Senozhatsky wrote:
> > On (05/14/15 19:18), gre...@linuxfoundation.org wrote:
> > > This is a note to let you know that I've just added the patch titled
> > > 
> > > block: destroy bdi before blockdev is unregistered.
> > > 
> > > to the 4.0-stable tree which can be found at:
> > > 
> > > http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
> > > 
> > > The filename of the patch is:
> > >  block-destroy-bdi-before-blockdev-is-unregistered.patch
> > > and it can be found in the queue-4.0 subdirectory.
> > > 
> > > If you, or anyone else, feels it should not be added to the stable tree,
> > > please let  know about it.
> > > 
> > > 
> > 
> > Hello Greg,
> > 
> > jfi, I think this commit will WARN_ON(). fixed by 
> > https://lkml.org/lkml/2015/5/8/29
> > 
> > 
> > (https://lkml.org/lkml/2015/4/28/568)
> 
> That site isn't working for me, what is the git commit id of the fix for
> this in Linus's tree?
> 
> thanks,
> 
> greg k-h

http://lkml.kernel.org/r/<20150508150924.33c3bca8@notabene.brown>

Unfortunately it isn't with Linus yet, or even in -next.  Jens hasn't replied.

NeilBrown




Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Fri, May 15, 2015 at 05:45:56PM -0700, Linus Torvalds wrote:

> Al, do you have any ideas? Personally, I've wanted to make I_mutex a
> rwsem for a long time, but right now pretty much everything uses it
> for exclusion. For example, filename lookup is clearly just reading
> the directory, so it should take a rwsem for reading, right? No. Not
> the way it is done now. Filename lookup wants the directory inode
> exclusively because that guarantees that we create just one dentry and
> call the filesystem ->lookup only once on that dentry.

rwsem by itself won't do us much good there.  Look: for multiple lookups on
the same existing entry we could try to teach d_splice_alias() to cope,
etc.  But what happens when a bunch of processes looks for the same
inexistent entry?  And no, "who cares about fuckloads of negatives with
the same name" isn't a good answer - suppose we do mkdir() after that.
OK, so we'll find a negative dentry in dcache.  And tell the filesystem
to create the sucker.  Done.  Made it positive.  Now, do we hunt down
all _other_ negative dentries for it?  Or never keep negative ones at
all.  Or slap some kind of ->d_revalidate() there to catch all negative
dentries created before the last mkdir/creat/mknod/symlink/link in given
parent?

One possibility would be a new dentry state - "being looked up".  Hashed,
treated as "fall out of RCU mode" for lazy pathwalk purposes, and places
where we call ->lookup() would (while still holding ->i_mutex on parent
shared) wait for that state to end.  Places where we call ->d_revalidate()
(with or without ->i_mutex on parent) would also wait on those.
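
To make the proposed state a bit more concrete, here is a userspace sketch in
C that uses pthreads in place of kernel primitives.  Everything in it (struct
toy_dentry, toy_lookup(), the state names) is hypothetical; it only models the
idea that the first looker calls into the filesystem while later ones wait for
the "being looked up" state to end.  It is not how the dcache is, or would be,
implemented.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical states for a toy dentry; these are not real dcache flags. */
enum toy_state { TOY_UNKNOWN, TOY_IN_LOOKUP, TOY_NEGATIVE, TOY_POSITIVE };

struct toy_dentry {
	pthread_mutex_t lock;
	pthread_cond_t done;	/* broadcast when TOY_IN_LOOKUP ends */
	enum toy_state state;
	int fs_lookups;		/* how often the "filesystem" was asked */
};

#define TOY_DENTRY_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, TOY_UNKNOWN, 0 }

/* Stand-in for ->lookup(): the name exists iff 'exists' is true. */
static enum toy_state toy_fs_lookup(bool exists)
{
	return exists ? TOY_POSITIVE : TOY_NEGATIVE;
}

static enum toy_state toy_lookup(struct toy_dentry *d, bool exists)
{
	enum toy_state res;

	pthread_mutex_lock(&d->lock);
	if (d->state == TOY_UNKNOWN) {
		/* First looker: mark the dentry and do the slow part unlocked. */
		d->state = TOY_IN_LOOKUP;
		d->fs_lookups++;
		pthread_mutex_unlock(&d->lock);
		res = toy_fs_lookup(exists);
		pthread_mutex_lock(&d->lock);
		d->state = res;
		pthread_cond_broadcast(&d->done);
	} else {
		/* Everyone else waits for the in-lookup state to end. */
		while (d->state == TOY_IN_LOOKUP)
			pthread_cond_wait(&d->done, &d->lock);
	}
	res = d->state;
	pthread_mutex_unlock(&d->lock);
	return res;
}

static void *toy_looker(void *arg)
{
	toy_lookup(arg, false);	/* every thread asks for the same missing name */
	return NULL;
}

/* Run up to 16 concurrent lookups; return how many reached the filesystem. */
static int toy_run(struct toy_dentry *d, int n)
{
	pthread_t tid[16];
	int started = 0;

	for (int i = 0; i < n && i < 16; i++)
		if (pthread_create(&tid[started], NULL, toy_looker, d) == 0)
			started++;
	for (int i = 0; i < started; i++)
		pthread_join(tid[i], NULL);
	return d->fs_lookups;
}
```

In this model a bunch of threads looking for the same nonexistent name end up
sharing one negative result, which is the property the single-dentry rule is
meant to preserve.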

It would need a careful analysis of tree-walkers, though.  Doable, but there
might be dragons.  In case of e.g. ceph - swamp ones, with mirror in the line
of sight...


Re: [PATCH 1/1] power: validate wakeup source before activating it.

2015-05-15 Thread Jin Qian
Hi Rafael,

The latest version is in [PATCHv3 1/3] power: validate wakeup source
before activating it. I changed WARN_ONCE back to WARN, since if
multiple drivers activate uninitialized wakeup_sources, only the
first driver will be flagged and we lose the alert for the other
drivers. The warning should really only happen during driver
development. Hope this is ok.
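
The difference can be shown with a small userspace model in C. The
helpers toy_warn() and toy_warn_once() and the driver names are
hypothetical stand-ins, not the kernel macros; the model only assumes
that the warning has a single call site, as it does in
wakeup_source_activate().

```c
#include <stdbool.h>
#include <stdio.h>

static int warnings_emitted;

/* Model of WARN(): reports every time the condition is true. */
static bool toy_warn(bool cond, const char *who)
{
	if (cond) {
		warnings_emitted++;
		fprintf(stderr, "unregistered wakeup source (%s)\n", who);
	}
	return cond;
}

/* Model of WARN_ONCE() at one call site: reports only the first hit. */
static bool toy_warn_once(bool cond, const char *who)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warnings_emitted++;
		fprintf(stderr, "unregistered wakeup source (%s)\n", who);
	}
	return cond;
}
```

With two offending drivers, the once-only variant still returns true for
both (so the activation is skipped), but only the first one shows up in
the log.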

Thanks,
jin

On Thu, May 14, 2015 at 5:22 PM, Rafael J. Wysocki  wrote:
> On Wednesday, May 06, 2015 03:26:56 PM Jin Qian wrote:
>> A rogue wakeup source not registered in wakeup_sources list is not visible
>> from wakeup_sources_stats_show. Check if the wakeup source is registered
>> properly by looking at the timer struct.
>>
>> Signed-off-by: Jin Qian 
>> ---
>>  drivers/base/power/wakeup.c | 18 ++
>>  1 file changed, 18 insertions(+)
>>
>> diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
>> index 7726200..7b5ad9a 100644
>> --- a/drivers/base/power/wakeup.c
>> +++ b/drivers/base/power/wakeup.c
>> @@ -351,6 +351,20 @@ int device_set_wakeup_enable(struct device *dev, bool 
>> enable)
>>  }
>>  EXPORT_SYMBOL_GPL(device_set_wakeup_enable);
>>
>> +/**
>> + * wakeup_source_not_registered - validate the given wakeup source.
>> + * @ws: Wakeup source to be validated.
>> + */
>> +static bool wakeup_source_not_registered(struct wakeup_source *ws)
>> +{
>> + /*
>> +  * Use timer struct to check if the given source is initialized
>> +  * by wakeup_source_add.
>> +  */
>> + return ws->timer.function != pm_wakeup_timer_fn ||
>> +ws->timer.data != (unsigned long)ws;
>> +}
>> +
>>  /*
>>   * The functions below use the observation that each wakeup event starts a
>>   * period in which the system should not be suspended.  The moment this 
>> period
>> @@ -391,6 +405,10 @@ static void wakeup_source_activate(struct wakeup_source 
>> *ws)
>>  {
>>   unsigned int cec;
>>
>> + if (WARN_ONCE(wakeup_source_not_registered(ws),
>> + "unregistered wakeup source\n"))
>> + return;
>> +
>>   /*
>>* active wakeup source should bring the system
>>* out of PM_SUSPEND_FREEZE state
>
> Queued up for 4.2, thanks!
>
>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [PATCH 3/3] power: add a dummy wakeup_source to record statistics

2015-05-15 Thread Jin Qian
Hi Rafael,

I made a minor change in [PATCHv3 3/3] power: add a dummy
wakeup_source to record statistics. Sorry lkml seems down. I can send
you the link later.

The only diff is that deleted_ws.max_time = max(deleted_ws.max_time,
ws->max_time) instead of adding them up.
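
As a rough userspace illustration of why the maximum should not be
summed, here is a trimmed-down model in C. struct toy_ws_stats and
toy_record() are hypothetical stand-ins for struct wakeup_source and
wakeup_source_record(), with times kept as plain nanosecond counts.

```c
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct wakeup_source; times in ns. */
struct toy_ws_stats {
	int64_t total_time;
	int64_t max_time;
	unsigned long event_count;
};

/* Fold the stats of a source being destroyed into the "deleted" bucket. */
static void toy_record(struct toy_ws_stats *deleted,
		       const struct toy_ws_stats *ws)
{
	if (!ws->event_count)
		return;

	deleted->total_time += ws->total_time;	/* totals add up */
	if (ws->max_time > deleted->max_time)	/* a maximum does not */
		deleted->max_time = ws->max_time;
	deleted->event_count += ws->event_count;
}
```

Summing would report a "longest single activation" that no wakeup source
ever actually had.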

Thanks,
jin


On Fri, May 15, 2015 at 5:41 PM, Rafael J. Wysocki  wrote:
> On Wednesday, April 22, 2015 05:50:12 PM Jin Qian wrote:
>> After a wakeup_source is destroyed, we lost all information such as how
>> long this wakeup_source has been active. Add a dummy wakeup_source to
>> record such info.
>>
>> Signed-off-by: Jin Qian 
>
> That's fine by me.  Queued up for 4.2, thanks!
>
>> ---
>>  drivers/base/power/wakeup.c | 35 +++
>>  1 file changed, 35 insertions(+)
>>
>> diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
>> index bdb45f3..732683c 100644
>> --- a/drivers/base/power/wakeup.c
>> +++ b/drivers/base/power/wakeup.c
>> @@ -59,6 +59,11 @@ static DECLARE_WAIT_QUEUE_HEAD(wakeup_count_wait_queue);
>>
>>  static ktime_t last_read_time;
>>
>> +static struct wakeup_source deleted_ws = {
>> + .name = "deleted",
>> + .lock =  __SPIN_LOCK_UNLOCKED(deleted_ws.lock),
>> +};
>> +
>>  /**
>>   * wakeup_source_prepare - Prepare a new wakeup source for initialization.
>>   * @ws: Wakeup source to prepare.
>> @@ -110,6 +115,33 @@ void wakeup_source_drop(struct wakeup_source *ws)
>>  }
>>  EXPORT_SYMBOL_GPL(wakeup_source_drop);
>>
>> +/*
>> + * Record wakeup_source statistics being deleted into a dummy wakeup_source.
>> + */
>> +static void wakeup_source_record(struct wakeup_source *ws)
>> +{
>> + unsigned long flags;
>> +
>> + spin_lock_irqsave(&deleted_ws.lock, flags);
>> +
>> + if (ws->event_count) {
>> + deleted_ws.total_time =
>> + ktime_add(deleted_ws.total_time, ws->total_time);
>> + deleted_ws.max_time =
>> + ktime_add(deleted_ws.max_time, ws->max_time);
>> + deleted_ws.prevent_sleep_time =
>> + ktime_add(deleted_ws.prevent_sleep_time,
>> +   ws->prevent_sleep_time);
>> + deleted_ws.event_count += ws->event_count;
>> + deleted_ws.active_count += ws->active_count;
>> + deleted_ws.relax_count += ws->relax_count;
>> + deleted_ws.expire_count += ws->expire_count;
>> + deleted_ws.wakeup_count += ws->wakeup_count;
>> + }
>> +
>> + spin_unlock_irqrestore(&deleted_ws.lock, flags);
>> +}
>> +
>>  /**
>>   * wakeup_source_destroy - Destroy a struct wakeup_source object.
>>   * @ws: Wakeup source to destroy.
>> @@ -122,6 +154,7 @@ void wakeup_source_destroy(struct wakeup_source *ws)
>>   return;
>>
>>   wakeup_source_drop(ws);
>> + wakeup_source_record(ws);
>>   kfree(ws->name);
>>   kfree(ws);
>>  }
>> @@ -930,6 +963,8 @@ static int wakeup_sources_stats_show(struct seq_file *m, 
>> void *unused)
>>   print_wakeup_source_stats(m, ws);
>>   rcu_read_unlock();
>>
>> + print_wakeup_source_stats(m, &deleted_ws);
>> +
>>   return 0;
>>  }
>>
>>
>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread NeilBrown
On Fri, 15 May 2015 17:45:56 -0700 Linus Torvalds
 wrote:

> On Fri, May 15, 2015 at 4:30 PM, NeilBrown  wrote:
> >
> > .. and I've been wondering what to do about i_mutex and NFS.  I've had
> > customer reports of slowness in creating files that seems to be due to
> > i_mutex on the directory being held over the whole 'create' RPC, so only one
> > of those can be in flight at the one time.
> > "make  -j" on a large source directory can easily want to create lots of
> > "*.o" files at "the same time".
> >
> > And NFS doesn't need i_mutex at all because the server will provide the
> > needed guarantees.
> 
> So i_mutex on a directory is probably the nastiest lock we have in the fs 
> layer.
> 
> It's used for several different half-related things:
> 
>  - serialize filename creation/deletion
> 
>This is partly for the benefit of the filesystem itself (and not
> helpful for NFS, as you note), but it's also very much about making
> sure we have uniqueness guarantees at the VFS layer too.
> 
>So even with NFS, it's not just "the server provides the needed
> guarantees", because some of the guarantees are really client-local.
> 
>For example, simply that we only ever have one single dentry for a
> particular name, and that we only ever have one active lookup per
> dentry. Those things happen independently of - and before - the server
> even sees the operation.

But surely those things can be managed with a spinlock.

I think a big part of the problem is that the VFS tries to control
filesystems rather than provide services to them.
If NFS was in control it might:
 - ask the dcache to look up a name and get back a dentry: positive, negative
   or not-validated.
 - if positive, NFS returns an error, or uses the inode - depending on
   operation.
 - otherwise send a 'create' request to the server.  At this point it holds
   references to dentries for directory and target, but no locks.
 - If an error is returned, just drop references and return.
 - On successful create, turn the filehandle into an inode and then
   ask the dcache to link the inode with the target dentry.
   If the dentry is still negative or not-validated, this is trivial.
   If it is positive and already has the right inode, again trivial.
   If it has the wrong inode, you certainly have an interesting problem,
   but one that is specific to NFS (or similar filesystems) and one
   that is up to NFS to solve, not up to the VFS to avoid.
   If the syscall doesn't need to return an 'fd', then just drop the
   references and report success.
   If an 'fd' is required, then create a 'deleted' dentry attached to
   the original parent.  As 'mkdir'  doesn't return an fd, that should
   be safe (not that all the fuss about directory dentries having exactly one
   parent is particularly relevant to NFS)

I'm not convinced that serialising 'lookup' calls is vital.  If two threads
find a 'not-validated' dentry, and both try to look up the inode, they
will both ultimately get the same struct inode from the icache, and will both
succeed in connecting it to the dentry.  Obviously it would be better to
avoid two concurrent NFS "LOOKUP" requests, but that is a problem for NFS to
solve.  I suspect that using d_fsdata to point to a pending LOOKUP request
would allow the "second" thread to wait for that request to finish.  Other
filesystems would take a completely different approach.

But with the VFS trying to be in control and trying to accommodate the needs
of wildly different filesystems, I imagine it might not be so easy.

> 
>So the whole local directory tree consistency ends up depending on this.
> 
>  - readdir(). This is mostly to make it hard for filesystems to do the
> wrong thing when there is concurrent file creation.

and makes it impossible for them to do the right thing too?


> 
> I suspect readdir could fairly easily push the i_mutex down from the
> caller and into the filesystem, and then filesystems might narrow down
> the use (or even get rid of it). The initial patch might even be
> automated with coccinelle. However, rather few loads actually have a
> lot of readdir() activity, and samba is probably the only major one.
> I've seen benchmarks where it matters, but they are rare (and I
> haven't seen one in literally years).
> 
> So the readdir case could probably be at least relaxed fairly easily.
> But the thing that tends to hurt on more loads is, as you note, the
> filename lookup/creation/movement case. And that's much harder to fix.
> 
> Al, do you have any ideas? Personally, I've wanted to make I_mutex a
> rwsem for a long time, but right now pretty much everything uses it
> for exclusion. For example, filename lookup is clearly just reading
> the directory, so it should take a rwsem for reading, right? No. Not
> the way it is done now. Filename lookup wants the directory inode
> exclusively because that guarantees that we create just one dentry and
> call the filesystem ->lookup only once on that dentry.
> 
> Again, 

[PATCH v3 1/1] iio: ltr501: Add light channel support

2015-05-15 Thread Kuppuswamy Sathyanarayanan
Added support to calculate lux value from visible
and IR spectrum adc count values. Also added IIO_LIGHT
channel to enable user read the lux value directly
from device using illuminance input ABI.

Signed-off-by: Kuppuswamy Sathyanarayanan 

---
 drivers/iio/light/ltr501.c | 51 ++
 1 file changed, 51 insertions(+)

v2: Changed scan index of light channel to -1
v3: Removed scan type info from light channel

diff --git a/drivers/iio/light/ltr501.c b/drivers/iio/light/ltr501.c
index ca4bf47..d7245c6 100644
--- a/drivers/iio/light/ltr501.c
+++ b/drivers/iio/light/ltr501.c
@@ -66,6 +66,9 @@
 
 #define LTR501_REGMAP_NAME "ltr501_regmap"
 
+#define LTR501_LUX_CONV(vis_coeff, vis_data, ir_coeff, ir_data) \
+   ((vis_coeff * vis_data) - (ir_coeff * ir_data))
+
 static const int int_time_mapping[] = {10, 5, 20, 40};
 
 static const struct reg_field reg_field_it =
@@ -298,6 +301,29 @@ static int ltr501_ps_read_samp_period(struct ltr501_data *data, int *val)
return IIO_VAL_INT;
 }
 
+/* IR and visible spectrum coeff's are given in data sheet */
+static unsigned long ltr501_calculate_lux(u16 vis_data, u16 ir_data)
+{
+   unsigned long ratio, lux;
+
+   if (vis_data == 0)
+   return 0;
+
+   /* multiply numerator by 100 to avoid handling ratio < 1 */
+   ratio = DIV_ROUND_UP(ir_data * 100, ir_data + vis_data);
+
+   if (ratio < 45)
+   lux = LTR501_LUX_CONV(1774, vis_data, -1105, ir_data);
+   else if (ratio >= 45 && ratio < 64)
+   lux = LTR501_LUX_CONV(3772, vis_data, 1336, ir_data);
+   else if (ratio >= 64 && ratio < 85)
+   lux = LTR501_LUX_CONV(1690, vis_data, 169, ir_data);
+   else
+   lux = 0;
+
+   return lux / 1000;
+}
+
 static int ltr501_drdy(struct ltr501_data *data, u8 drdy_mask)
 {
int tries = 100;
@@ -548,7 +574,14 @@ static const struct iio_event_spec ltr501_pxs_event_spec[] = {
.num_event_specs = _evsize,\
 }
 
+#define LTR501_LIGHT_CHANNEL() { \
+   .type = IIO_LIGHT, \
+   .info_mask_separate = BIT(IIO_CHAN_INFO_PROCESSED), \
+   .scan_index = -1, \
+}
+
 static const struct iio_chan_spec ltr501_channels[] = {
+   LTR501_LIGHT_CHANNEL(),
LTR501_INTENSITY_CHANNEL(0, LTR501_ALS_DATA0, IIO_MOD_LIGHT_BOTH, 0,
 ltr501_als_event_spec,
 ARRAY_SIZE(ltr501_als_event_spec)),
@@ -576,6 +609,7 @@ static const struct iio_chan_spec ltr501_channels[] = {
 };
 
 static const struct iio_chan_spec ltr301_channels[] = {
+   LTR501_LIGHT_CHANNEL(),
LTR501_INTENSITY_CHANNEL(0, LTR501_ALS_DATA0, IIO_MOD_LIGHT_BOTH, 0,
 ltr501_als_event_spec,
 ARRAY_SIZE(ltr501_als_event_spec)),
@@ -596,6 +630,23 @@ static int ltr501_read_raw(struct iio_dev *indio_dev,
int ret, i;
 
switch (mask) {
+   case IIO_CHAN_INFO_PROCESSED:
+   if (iio_buffer_enabled(indio_dev))
+   return -EBUSY;
+
+   switch (chan->type) {
+   case IIO_LIGHT:
+   mutex_lock(&data->lock_als);
+   ret = ltr501_read_als(data, buf);
+   mutex_unlock(&data->lock_als);
+   if (ret < 0)
+   return ret;
+   *val = ltr501_calculate_lux(le16_to_cpu(buf[1]),
+   le16_to_cpu(buf[0]));
+   return IIO_VAL_INT;
+   default:
+   return -EINVAL;
+   }
case IIO_CHAN_INFO_RAW:
if (iio_buffer_enabled(indio_dev))
return -EBUSY;
-- 
1.9.1



Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Linus Torvalds
On Fri, May 15, 2015 at 4:38 PM, Dave Chinner  wrote:
>
> Right, because it's cold cache performance that everyone complains
> about.

People really do complain about the hot-cache one too.

Did you read the description of the sample benchmark that Jeremy
described Windows sales people using?

That kind of thing is actually not that unusual, and they can be big
sales tools.

We went through similar things with "mindcraft", then netbench/dbench.
People will run those benchmarks with enough memory (and often tune
things like dirty thresholds etc) explicitly to get rid of the IO
component for benchmarking reasons.

And often they are just nasty marketing benchmarks and not very
meaningful. The "geekbench of filesystem testing", if you will. Fair
enough. But those kinds of things have also been very useful in making
performance better, because the "real" filesystem benchmarks are
usually too nasty to actually run on reasonable machines. So the
fake/bad ones are often good at showing things that don't scale well
(despite being 100% just CPU-bound) because they show some bottleneck.

And sometimes fixing that bottleneck for the non-IO case ends up
helping the IO case too.

So the one samba profile I remember seeing was probably from early
dbench, I'm pretty sure it was Tridge that showed it as a stress-case
for samba on Linux. So we're talking a decade ago, I really can't
claim I remember the details, but I do remember it being readdir()
being 100% CPU-bound. Or rather, it *would* have been 100% CPU-bound,
but due to the inode semaphore (and back then it was i_sem, I think,
now it's i_mutex) it was actually spending most of the time
sleeping/scheduling due to inode semaphore contention. So rather than
scaling perfectly with CPU's, it just took basically one CPU.

Now, samba has probably changed enormously, and maybe it's not a big
deal. But I don't think our filesystem locking has changed at all,
because quite frankly, nobody else seems to see it. It tends to be a
fileserving thing (the Lustre comment kind of feeds into that).

So it might be interesting to have a simple benchmark that people can
run. WITHOUT the IO load. Because really, IO isn't that interesting to
most of us, especially when we then don't even have IO subsystems that
do much parallelism..

I wrote my own (really really stupid) concurrent stat() test just to
get good profiles of where the real problems are. It's nasty - it's
literally just MAX_THREADS pthreads that loop on doing stat() on a list
of files for ten seconds, and then it reports the total number of
loops. But that stupid thing was actually ridiculously useful, not
because the load is meaningful, but because it ended up showing that
we had horribly fragile behavior when we had contention on the dentry
lock.
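
For anyone who wants something to start from, a minimal sketch of that
kind of test in C might look like the following. This is not the
original program: the value of MAX_THREADS, the struct, and the
function names are all assumptions, and the run time is a parameter
rather than a fixed ten seconds.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <sys/stat.h>
#include <time.h>

#define MAX_THREADS 8	/* assumed; the mail does not say how many */

struct stat_bench {
	const char *const *files;	/* paths to stat() */
	int nfiles;
	int seconds;			/* how long each thread runs */
	atomic_ulong loops;		/* passes over the list, all threads */
};

static void *stat_worker(void *arg)
{
	struct stat_bench *b = arg;
	struct stat st;
	unsigned long n = 0;
	time_t end = time(NULL) + b->seconds;

	do {
		for (int i = 0; i < b->nfiles; i++)
			(void)stat(b->files[i], &st);	/* result ignored */
		n++;
	} while (time(NULL) < end);

	atomic_fetch_add(&b->loops, n);
	return NULL;
}

/* Run the workers and return the total number of loops they completed. */
static unsigned long run_stat_bench(const char *const *files, int nfiles,
				    int seconds)
{
	struct stat_bench b = { .files = files, .nfiles = nfiles,
				.seconds = seconds };
	pthread_t tid[MAX_THREADS];
	int started = 0;

	atomic_init(&b.loops, 0);
	for (int i = 0; i < MAX_THREADS; i++)
		if (pthread_create(&tid[started], NULL, stat_worker, &b) == 0)
			started++;
	for (int i = 0; i < started; i++)
		pthread_join(tid[i], NULL);
	return atomic_load(&b.loops);
}
```

Because every thread hits the same small set of paths over and over, the
dentries stay hot in the cache and no IO is involved, so whatever limits
the loop count is CPU or lock contention.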

(That got fixed, although it still ends up sucking when we fall out of
RCU mode - but with Al's upcoming patches that should hopefully be
really really unusual rather than "every time we see a symlink" etc)

 Linus


Re: [PATCH 03/18] f2fs crypto: declare some definitions for f2fs encryption feature

2015-05-15 Thread Jaegeuk Kim
On Thu, May 14, 2015 at 09:50:44AM -0700, Tom Marshall wrote:
> Please keep in mind that I'm also working on transparent
> compression.  I'm watching this thread closely so that I can
> implement a compression library alongside the crypto library.  If
> there is any interest or benefit, I would be glad to work together
> so that the two can be done cooperatively at the same time.

I can't immediately see how the compression code could be shared with crypto.
The basic approach for compression would be to compress X pages into a
smaller number of pages, Y, i.e. an X-to-Y mapping.
But this per-file encryption only supports a 1-to-1 mapping of 4KB pages, so it
can be quite a simple implementation.

Could you elaborate on your approach or design? Or share the code?
In any case, IMO, it needs to be implemented in some filesystem first.

Thanks,

> 
> On 05/13/2015 06:56 PM, Jaegeuk Kim wrote:
> >On Thu, May 14, 2015 at 10:37:21AM +1000, Dave Chinner wrote:
> >>On Tue, May 12, 2015 at 11:48:02PM -0700, Jaegeuk Kim wrote:
> >>>On Wed, May 13, 2015 at 12:02:08PM +1000, Dave Chinner wrote:
> >>>>On Fri, May 08, 2015 at 09:20:38PM -0700, Jaegeuk Kim wrote:
> >>>>>This definitions will be used by inode and superblock for encyption.
> >>>>How much of this crypto stuff is common with or only slightly
> >>>>modified from the ext4 code?  Is the behaviour and features the
> >>>>same? Is the user API and management tools the same?
> >>>>
> >>>>IMO, if there is any amount of overlap, then we should be
> >>>>implementing this stuff as generic code, not propagating the same
> >>>>code through multiple filesystems via copy-n-paste-n-modify. This
> >>>>will simply end up with diverging code, different bugs and feature
> >>>>sets, and none of the implementations will get the review and
> >>>>maintenance they really require...
> >>>>
> >>>>And, FWIW, this is the reason why I originally asked for the ext4
> >>>>encryption code to be pulled up to the VFS: precisely so we didn't
> >>>>end up with a rapid proliferation of individual in-filesystem
> >>>>encryption implementations that are all slightly different...
> >>>Totally agreed!
> >>>
> >>>AFAIK, Ted wants to push the codes as a crypto library into fs/ finally, so
> >>>I believe most part of crypto codes are common.
> >>Can I suggest fs/crypto/ if there are going to be multiple files?
> >No problem at all. I'll do.
> >
> >>>But, in order to realize that quickly, Ted implemented the feature to 
> >>>finalize
> >>>on-disk and in-memory design in EXT4 as a first step.
> >>>Then, I've been catching up and validating its design by implementing it in
> >>>F2FS, which also intends to figure out what crypto codes can be exactly 
> >>>common.
> >>Excellent. That will make it easier and less error prone for other
> >>filesystems to implement it, too!
> >>
> >>>As Ted mentioned before, since next android version tries to use per-file
> >>>encryption, F2FS also needs to support it as quick as possible likewise 
> >>>EXT4.
> >>Fair enough.
> >>
> >>>Meanwhile, surely I've been working on writing patches to push them into 
> >>>fs/;
> >>>currenlty, I did for cryto.c and will do for crypto_key.c and 
> >>>crypto_fname.c.
> >>>But, it needs to think about crypto_policy.c differently, since it may 
> >>>depend
> >>>on how each filesystem stores the policy information respectively; we 
> >>>cannot
> >>>push all the filesystems should use xattrs, right?
> >>All filesystems likely to implement per-file crypto support xattrs,
> >>and this is exactly what xattrs are designed for. e.g. we already
> >>require xattrs for generic security labels, ACLs, etc. Hence
> >>per-file crypto information should also use a common, shared xattr
> >>format. That way it only needs to be implemented once in the generic
> >>code and there's very little (hopefully nothing!) each filesystem
> >>has to customise to store the crypto information for each file.
> >Ok, I see. Let me take a look at that too.
> >Thank you for sharing your thoughts. :)
> >
> >>Cheers,
> >>
> >>Dave.
> >>-- 
> >>Dave Chinner
> >>da...@fromorbit.com


[PATCHv3 3/3] power: add a dummy wakeup_source to record statistics

2015-05-15 Thread Jin Qian
After a wakeup_source is destroyed, we lose all information about it, such
as how long the wakeup_source has been active. Add a dummy wakeup_source to
record such info.

Signed-off-by: Jin Qian 
---
 drivers/base/power/wakeup.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
index 3a915cc..071a5c5 100644
--- a/drivers/base/power/wakeup.c
+++ b/drivers/base/power/wakeup.c
@@ -58,6 +58,11 @@ static DECLARE_WAIT_QUEUE_HEAD(wakeup_count_wait_queue);
 
 static ktime_t last_read_time;
 
+static struct wakeup_source deleted_ws = {
+   .name = "deleted",
+   .lock =  __SPIN_LOCK_UNLOCKED(deleted_ws.lock),
+};
+
 /**
  * wakeup_source_prepare - Prepare a new wakeup source for initialization.
  * @ws: Wakeup source to prepare.
@@ -109,6 +114,34 @@ void wakeup_source_drop(struct wakeup_source *ws)
 }
 EXPORT_SYMBOL_GPL(wakeup_source_drop);
 
+/*
+ * Record wakeup_source statistics being deleted into a dummy wakeup_source.
+ */
+static void wakeup_source_record(struct wakeup_source *ws)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&deleted_ws.lock, flags);
+
+   if (ws->event_count) {
+   deleted_ws.total_time =
+   ktime_add(deleted_ws.total_time, ws->total_time);
+   deleted_ws.prevent_sleep_time =
+   ktime_add(deleted_ws.prevent_sleep_time,
+ ws->prevent_sleep_time);
+   deleted_ws.max_time =
+   ktime_compare(deleted_ws.max_time, ws->max_time) > 0 ?
+   deleted_ws.max_time : ws->max_time;
+   deleted_ws.event_count += ws->event_count;
+   deleted_ws.active_count += ws->active_count;
+   deleted_ws.relax_count += ws->relax_count;
+   deleted_ws.expire_count += ws->expire_count;
+   deleted_ws.wakeup_count += ws->wakeup_count;
+   }
+
+   spin_unlock_irqrestore(&deleted_ws.lock, flags);
+}
+
 /**
  * wakeup_source_destroy - Destroy a struct wakeup_source object.
  * @ws: Wakeup source to destroy.
@@ -121,6 +154,7 @@ void wakeup_source_destroy(struct wakeup_source *ws)
return;
 
wakeup_source_drop(ws);
+   wakeup_source_record(ws);
kfree(ws->name);
kfree(ws);
 }
@@ -929,6 +963,8 @@ static int wakeup_sources_stats_show(struct seq_file *m, void *unused)
print_wakeup_source_stats(m, ws);
rcu_read_unlock();
 
+   print_wakeup_source_stats(m, &deleted_ws);
+
return 0;
 }
 
-- 
2.2.0.rc0.207.ga3a616c



[PATCHv3 1/3] power: validate wakeup source before activating it.

2015-05-15 Thread Jin Qian
A rogue wakeup source not registered in wakeup_sources list is not visible
from wakeup_sources_stats_show. Check if the wakeup source is registered
properly by looking at the timer struct.

Signed-off-by: Jin Qian 
---
 drivers/base/power/wakeup.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
index 7726200..457c04f 100644
--- a/drivers/base/power/wakeup.c
+++ b/drivers/base/power/wakeup.c
@@ -351,6 +351,20 @@ int device_set_wakeup_enable(struct device *dev, bool enable)
 }
 EXPORT_SYMBOL_GPL(device_set_wakeup_enable);
 
+/**
+ * wakeup_source_not_registered - validate the given wakeup source.
+ * @ws: Wakeup source to be validated.
+ */
+static bool wakeup_source_not_registered(struct wakeup_source *ws)
+{
+   /*
+* Use timer struct to check if the given source is initialized
+* by wakeup_source_add.
+*/
+   return ws->timer.function != pm_wakeup_timer_fn ||
+  ws->timer.data != (unsigned long)ws;
+}
+
 /*
  * The functions below use the observation that each wakeup event starts a
  * period in which the system should not be suspended.  The moment this period
@@ -391,6 +405,10 @@ static void wakeup_source_activate(struct wakeup_source *ws)
 {
unsigned int cec;
 
+   if (WARN(wakeup_source_not_registered(ws),
+   "unregistered wakeup source\n"))
+   return;
+
/*
 * active wakeup source should bring the system
 * out of PM_SUSPEND_FREEZE state
-- 
2.2.0.rc0.207.ga3a616c



Re: [PATCH 2/3] power: increment wakeup_count when save_wakeup_count failed.

2015-05-15 Thread Jin Qian
Some wakeup events happen very frequently between reading wakeup_count
and writing it back. They change the wakeup event count, so the write
fails and userspace doesn't continue to suspend. However, such
occurrences are not counted in ws->wakeup_count. I spent quite some
time finding the problematic wakeup event with an inaccurate
wakeup_count : )
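
The race can be modelled in a few lines of userspace C. struct toy_pm
and the two helpers are hypothetical; the success condition mirrors the
"cnt == count && inpr == 0" check in pm_save_wakeup_count() quoted
below, and nothing else about the kernel implementation is implied.

```c
#include <stdbool.h>

/* Toy model of the counters behind wakeup_count. */
struct toy_pm {
	unsigned int event_count;	/* wakeup events registered so far */
	unsigned int in_progress;	/* wakeup events still being handled */
};

/* What userspace gets when it reads wakeup_count. */
static unsigned int toy_read_wakeup_count(const struct toy_pm *pm)
{
	return pm->event_count;
}

/* The write-back succeeds only if nothing happened since the read. */
static bool toy_save_wakeup_count(const struct toy_pm *pm, unsigned int count)
{
	return pm->event_count == count && pm->in_progress == 0;
}
```

A single event landing between the read and the write is enough to make
the write fail, and without the patch that event leaves no trace in the
per-source wakeup_count.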

Thanks,
jin


On Fri, May 15, 2015 at 5:34 PM, Rafael J. Wysocki  wrote:
> On Wednesday, April 22, 2015 05:50:11 PM Jin Qian wrote:
>> user-space aborts suspend attempt if writing wakeup_count failed.
>> Count the write failure towards wakeup_count.
>
> A use case, please?
>
>> Signed-off-by: Jin Qian 
>> ---
>>  drivers/base/power/wakeup.c | 17 +
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
>> index f24c622..bdb45f3 100644
>> --- a/drivers/base/power/wakeup.c
>> +++ b/drivers/base/power/wakeup.c
>> @@ -57,6 +57,8 @@ static LIST_HEAD(wakeup_sources);
>>
>>  static DECLARE_WAIT_QUEUE_HEAD(wakeup_count_wait_queue);
>>
>> +static ktime_t last_read_time;
>> +
>>  /**
>>   * wakeup_source_prepare - Prepare a new wakeup source for initialization.
>>   * @ws: Wakeup source to prepare.
>> @@ -771,10 +773,15 @@ void pm_wakeup_clear(void)
>>  bool pm_get_wakeup_count(unsigned int *count, bool block)
>>  {
>>   unsigned int cnt, inpr;
>> + unsigned long flags;
>>
>>   if (block) {
>>   DEFINE_WAIT(wait);
>>
>> + spin_lock_irqsave(&events_lock, flags);
>> + last_read_time = ktime_get();
>> + spin_unlock_irqrestore(&events_lock, flags);
>> +
>>   for (;;) {
>>   prepare_to_wait(&wakeup_count_wait_queue, &wait,
>>   TASK_INTERRUPTIBLE);
>> @@ -806,6 +813,7 @@ bool pm_save_wakeup_count(unsigned int count)
>>  {
>>   unsigned int cnt, inpr;
>>   unsigned long flags;
>> + struct wakeup_source *ws;
>>
>>   events_check_enabled = false;
>>   spin_lock_irqsave(&events_lock, flags);
>> @@ -813,6 +821,15 @@ bool pm_save_wakeup_count(unsigned int count)
>>   if (cnt == count && inpr == 0) {
>>   saved_count = count;
>>   events_check_enabled = true;
>> + } else {
>> + rcu_read_lock();
>> + list_for_each_entry_rcu(ws, &wakeup_sources, entry) {
>> + if (ws->active ||
>> + ktime_compare(ws->last_time, last_read_time) > 0) {
>> + ws->wakeup_count++;
>> + }
>> + }
>> + rcu_read_unlock();
>>   }
>>   spin_unlock_irqrestore(&events_lock, flags);
>>   return events_check_enabled;
>>
>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 1/2] cpufreq_stats: Adds sysfs file /sys/devices/system/cpu/cpufreq/current_in_state

2015-05-15 Thread Ruchi Kandoi
On Thu, May 14, 2015 at 7:48 PM, Viresh Kumar  wrote:
> I am not replying for concept here, as sched maintainers are in a
> better position for that, but a nit below..
>
> On 14-05-15, 17:12, Ruchi Kandoi wrote:
>> Adds the sysfs file for userspace to initialize the active current
>> values for all the cores at each of the frequencies.
>>
>> The format for storing the values is as follows:
>> echo "CPU:= =,CPU:
>> ..." > /sys/devices/system/cpu/cpufreq/current_in_state
>
> Why this file? And not
> /sys/devices/system/cpu/cpuX/cpufreq/stats/current_in_state ? That way
> you don't have to replicate the same information for all CPUs, as the
> stats folder can be shared by multiple CPUs (which share their
> clock/voltage rails)..

Some hand-held devices support CPU hot-plugging, and when a core is
hot-plugged out, its /sys/devices/system/cpu/cpuX/cpufreq directory is
removed too. So sharing the stats folder between multiple CPUs won't
be possible.

>
> --
> viresh


Re: [PATCHv3 2/4] clk: socfpga: add a clock driver for the Arria 10 platform

2015-05-15 Thread Stephen Boyd
On 05/07, dingu...@opensource.altera.com wrote:
> diff --git a/drivers/clk/socfpga/clk-gate-a10.c b/drivers/clk/socfpga/clk-gate-a10.c
> new file mode 100644
> index 000..fadf6f7
> --- /dev/null
> +++ b/drivers/clk/socfpga/clk-gate-a10.c
> @@ -0,0 +1,187 @@
[...]
> +
> +static int socfpga_clk_prepare(struct clk_hw *hwclk)
> +{
> + struct socfpga_gate_clk *socfpgaclk = to_socfpga_gate_clk(hwclk);
> + struct regmap *sys_mgr_base_addr;
> + int i;
> + u32 hs_timing;
> + u32 clk_phase[2];
> +
> + if (socfpgaclk->clk_phase[0] || socfpgaclk->clk_phase[1]) {
> + sys_mgr_base_addr = syscon_regmap_lookup_by_compatible("altr,sys-mgr");
> + if (IS_ERR(sys_mgr_base_addr)) {

Is there a reason the syscon is grabbed lazily in prepare? Why
not get it before registering this clock?

> + pr_err("%s: failed to find altr,sys-mgr regmap!\n", __func__);
> + return -EINVAL;
> + }
> +
> + for (i = 0; i < 2; i++) {

i < ARRAY_SIZE(clk_phase) ?
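
For illustration only, here is a minimal userspace sketch of what that loop
bound could look like. It assumes the eight cases keep mapping multiples of
45 degrees to 0..7 and everything else to 0, as the switch below does.
phase_pair, phase_to_field() and phases_to_fields() are hypothetical names
that are not part of the patch, and ARRAY_SIZE is redefined locally because
this is not kernel code:

```c
#include <stddef.h>

/* Local stand-in for the kernel's ARRAY_SIZE(); userspace sketch only. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical container so the array size is visible to ARRAY_SIZE(). */
struct phase_pair {
	unsigned int clk_phase[2];
};

/* Map a phase in degrees to its 3-bit field value; unknown phases give 0. */
static unsigned int phase_to_field(unsigned int degrees)
{
	if (degrees >= 360 || degrees % 45 != 0)
		return 0;
	return degrees / 45;
}

/* Convert every entry in place; the bound follows the array definition. */
static void phases_to_fields(struct phase_pair *p)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(p->clk_phase); i++)
		p->clk_phase[i] = phase_to_field(p->clk_phase[i]);
}
```

Whether a division reads better than the explicit switch is a matter of
taste; the point here is only that the loop bound no longer repeats the
literal 2.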

> + switch (socfpgaclk->clk_phase[i]) {
> + case 0:
> + clk_phase[i] = 0;
> + break;
> + case 45:
> + clk_phase[i] = 1;
> + break;
> + case 90:
> + clk_phase[i] = 2;
> + break;
> + case 135:
> + clk_phase[i] = 3;
> + break;
> + case 180:
> + clk_phase[i] = 4;
> + break;
> + case 225:
> + clk_phase[i] = 5;
> + break;
> + case 270:
> + clk_phase[i] = 6;
> + break;
> + case 315:
> + clk_phase[i] = 7;
> + break;
> + default:
> + clk_phase[i] = 0;
> + break;
> + }
> + }
> +
> + hs_timing = SYSMGR_SDMMC_CTRL_SET(clk_phase[0], clk_phase[1]);
> + regmap_write(sys_mgr_base_addr, SYSMGR_SDMMCGRP_CTRL_OFFSET,
> +  hs_timing);
> + }
> + return 0;
> +}
> +
> +static struct clk_ops gateclk_ops = {

const?

> + .prepare = socfpga_clk_prepare,
> + .recalc_rate = socfpga_gate_clk_recalc_rate,
> +};
> +
> diff --git a/drivers/clk/socfpga/clk-periph-a10.c b/drivers/clk/socfpga/clk-periph-a10.c
> new file mode 100644
> index 000..81b9274
> --- /dev/null
> +++ b/drivers/clk/socfpga/clk-periph-a10.c
> @@ -0,0 +1,131 @@
[...]
> + */
> +#include 

Are you using this include?

> +#include 

Are you using this include?

Applies to every file added in this patch.

> +#include 
> +#include 
> +#include 
> +
> +#include "clk.h"
> +
> +#define CLK_MGR_FREE_SHIFT   16
> +#define CLK_MGR_FREE_MASK    0x7
> +
> +#define SOCFPGA_MPU_FREE_CLK "mpu_free_clk"
> +#define SOCFPGA_NOC_FREE_CLK "noc_free_clk"
> +#define SOCFPGA_SDMMC_FREE_CLK   "sdmmc_free_clk"
[..]
> +
> +static __init void __socfpga_periph_init(struct device_node *node,
> + const struct clk_ops *ops)
> +{
[..]
> + init.name = clk_name;
> + init.ops = ops;
> + init.flags = 0;
> +
> + parent_name = of_clk_get_parent_name(node, 0);
> + init.num_parents = 1;
> + init.parent_names = &parent_name;
> +
> + periph_clk->hw.hw.init = &init;
> +
> + clk = clk_register(NULL, &periph_clk->hw.hw);
> + if (WARN_ON(IS_ERR(clk))) {
> + kfree(periph_clk);
> + return;
> + }
> + rc = of_clk_add_provider(node, of_clk_src_simple_get, clk);

Why not check the return value?

> +
> diff --git a/drivers/clk/socfpga/clk-pll-a10.c b/drivers/clk/socfpga/clk-pll-a10.c
> new file mode 100644
> index 000..2adc2f5
> --- /dev/null
> +++ b/drivers/clk/socfpga/clk-pll-a10.c
[..]
> +
> +static u8 clk_pll_get_parent(struct clk_hw *hwclk)
> +{
> + struct socfpga_pll *socfpgaclk = to_socfpga_clk(hwclk);
> + u32 pll_src;
> +
> + pll_src = readl(socfpgaclk->hw.reg);
> +
> + return (pll_src >> CLK_MGR_PLL_CLK_SRC_SHIFT) &
> + CLK_MGR_PLL_CLK_SRC_MASK;
> +}
> +
> +

Nitpick: Single newline please.

> +static struct clk_ops clk_pll_ops = {

const?

> + .recalc_rate = clk_pll_recalc_rate,
> + .get_parent = clk_pll_get_parent,
> +};
> +
> +static __init struct clk *__socfpga_pll_init(struct device_node *node,

__init goes after the return type, doesn't it?

> + const struct clk_ops *ops)
> +{
> + u32 reg;
> + struct clk *clk;
> + struct socfpga_pll *pll_clk;
> + const char *clk_name = node->name;
> + const char 

Re: [PATCH/RFC] kbuild: Create a rule for building device tree overlay objects

2015-05-15 Thread Frank Rowand
On 5/12/2015 7:33 AM, Pantelis Antoniou wrote:
> Hi Geert,
> 
>> On May 12, 2015, at 14:56 , Geert Uytterhoeven  
>> wrote:
>>
>> This allows to handle device tree overlays like plain device trees.
>>
>> Signed-off-by: Geert Uytterhoeven 
>> ---
>> Questions:
>>  - Do we want dtso files under arch//boot/dts/, too?
>>  - Do we want to move the dts files outside the kernel repository
>>first?
>>
> 
> Oh that’s a nice hornet’s nest you’ve kicked here.
> 
> arch//boot/dts should not be the place, cause overlays are not related 
> with boot per se.
> As they are right now are board (family) specific.

Aren't overlays meant to describe child boards (capes, shields, whatever)
that may vary from system to system, but are not expected to be hot-plugged
while the OS is up?  Or is hot-plug a design goal?

If no hot-plug, then to me an overlay is just as related to boot as the base
dts.
It is a mere implementation detail that overlays are "loaded" from userspace
instead of by the booting kernel (I don't really know the details of using
overlays, so please correct me if I am wrong about how the kernel becomes aware
of an overlay).

> 
> I think we should try to keep an external kernel repo with them for now
> until we figure out where to put them.
> 
>> scripts/Makefile.lib | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
>> index 79e86613712f2230..4b14eef1d4b2ce8f 100644
>> --- a/scripts/Makefile.lib
>> +++ b/scripts/Makefile.lib
>> @@ -292,6 +292,9 @@ cmd_dtc = mkdir -p $(dir ${dtc-tmp}) ; \
>> $(obj)/%.dtb: $(src)/%.dts FORCE
>>  $(call if_changed_dep,dtc)
>>
>> +$(obj)/%.dtbo: $(src)/%.dtso FORCE
>> +$(call if_changed_dep,dtc)
>> +
>> dtc-tmp = $(subst $(comma),_,$(dot-target).dts.tmp)
>>
>> # Bzip2
>> -- 
>> 1.9.1
> 
> Regards
> 
> — Pantelis
> 
> --
> To unsubscribe from this list: send the line "unsubscribe devicetree" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



Re: [rtc-linux] [PATCH V2 0/4] da9062: DA9062 driver submission

2015-05-15 Thread Alexandre Belloni
Hi,

On 14/05/2015 at 17:43:53 +0100, S Twiss wrote :
> From: S Twiss 
> 
> This patch set adds support for the Dialog DA9062 Power Management IC.
> 
> In this patch set the following is provided:
>  - [PATCH V2 1/4]: MFD core support  
>  - [PATCH V2 2/4]: BUCK and LDO regulator driver
>  - [PATCH V2 3/4]: Watchdog driver
>  - [PATCH V2 4/4]: Add bindings for all DA9062 components

This patch should actually be the first one to go in, else checkpatch
will complain about compatibles not being found.


-- 
Alexandre Belloni, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Linus Torvalds
On Fri, May 15, 2015 at 4:30 PM, NeilBrown  wrote:
>
> .. and I've been wondering what to do about i_mutex and NFS.  I've had
> customer reports of slowness in creating files that seems to be due to
> i_mutex on the directory being held over the whole 'create' RPC, so only one
> of those can be in flight at the one time.
> "make  -j" on a large source directory can easily want to create lots of
> "*.o" files at "the same time".
>
> And NFS doesn't need i_mutex at all because the server will provide the
> needed guarantees.

So i_mutex on a directory is probably the nastiest lock we have in the fs layer.

It's used for several different half-related things:

 - serialize filename creation/deletion

   This is partly for the benefit of the filesystem itself (and not
helpful for NFS, as you note), but it's also very much about making
sure we have uniqueness guarantees at the VFS layer too.

   So even with NFS, it's not just "the server provides the needed
guarantees", because some of the guarantees are really client-local.

   For example, simply that we only ever have one single dentry for a
particular name, and that we only ever have one active lookup per
dentry. Those things happen independently of - and before - the server
even sees the operation.

   So the whole local directory tree consistency ends up depending on this.

 - readdir(). This is mostly to make it hard for filesystems to do the
wrong thing when there is concurrent file creation.

I suspect readdir could fairly easily push the i_mutex down from the
caller and into the filesystem, and then filesystems might narrow down
the use (or even get rid of it). The initial patch might even be
automated with coccinelle. However, rather few loads actually have a
lot of readdir() activity, and samba is probably the only major one.
I've seen benchmarks where it matters, but they are rare (and I
haven't seen one in literally years).

So the readdir case could probably be at least relaxed fairly easily.
But the thing that tends to hurt on more loads is, as you note, the
filename lookup/creation/movement case. And that's much harder to fix.

Al, do you have any ideas? Personally, I've wanted to make i_mutex a
rwsem for a long time, but right now pretty much everything uses it
for exclusion. For example, filename lookup is clearly just reading
the directory, so it should take a rwsem for reading, right? No. Not
the way it is done now. Filename lookup wants the directory inode
exclusively because that guarantees that we create just one dentry and
call the filesystem ->lookup only once on that dentry.
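
To make that concrete, here is a toy userspace sketch (not VFS code) of a
"find or create" lookup. Everything in it is an assumption made up for
illustration: toy_dir, toy_dentry and toy_lookup() are hypothetical, a
pthread mutex stands in for the directory's i_mutex, and fs_lookups counts
how often the pretend filesystem ->lookup would have been called. Because
the search and the insertion happen under one exclusive lock, a second
lookup of the same name finds the existing entry instead of creating a
duplicate:

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_ENTRIES 16

/* Hypothetical stand-in for a dentry: just a name. */
struct toy_dentry {
	char name[32];
};

/* Hypothetical stand-in for a directory and its child dentries. */
struct toy_dir {
	pthread_mutex_t lock;	/* plays the role of i_mutex */
	struct toy_dentry *entries[TOY_MAX_ENTRIES];
	int nr_entries;
	int fs_lookups;		/* times the "filesystem" was asked */
};

/*
 * Find the entry for @name, creating it if missing. Search and insert
 * happen under the same exclusive lock, so at most one entry per name
 * is ever created. Names are assumed to be shorter than 32 bytes.
 */
static struct toy_dentry *toy_lookup(struct toy_dir *dir, const char *name)
{
	struct toy_dentry *d = NULL;
	int i;

	pthread_mutex_lock(&dir->lock);
	for (i = 0; i < dir->nr_entries; i++) {
		if (strcmp(dir->entries[i]->name, name) == 0) {
			d = dir->entries[i];
			goto out;
		}
	}
	if (dir->nr_entries < TOY_MAX_ENTRIES) {
		d = calloc(1, sizeof(*d));
		if (d) {
			strncpy(d->name, name, sizeof(d->name) - 1);
			dir->entries[dir->nr_entries++] = d;
			dir->fs_lookups++;	/* the one "->lookup" call */
		}
	}
out:
	pthread_mutex_unlock(&dir->lock);
	return d;
}
```

With a shared (read) lock around the same body, two callers could both miss
in the search loop and both insert, which is exactly the duplicate-dentry
problem described above.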

Again, there tend to be no simple benchmarks or loads that people care
about that show this. Most of the time it's fairly hard to see.

  Linus


Re: [PATCH 0/4] ARM: multi_v7_defconfig: Stuff for Exynos

2015-05-15 Thread Krzysztof Kozłowski
2015-05-15 23:12 GMT+09:00 Javier Martinez Canillas:
> Hello Krzysztof,
>
> On 05/15/2015 02:48 PM, Krzysztof Kozlowski wrote:
>> Dear Kukjin,
>>
>> The patchset enables various config options on multi_v7 config
>> for Exynos boards. The first two patches are actually resend.
>>
>> Arnd suggested [0] that this can go through your tree.
>>
>> Patchset is rebased on next-20150515 and Javier's patchset [1]
>> (to avoid conflicts around regulators and clocks).
>>
>> Please let me know if this should be rebased on other commit.
>>
>> [0] http://www.spinics.net/lists/kernel/msg1991518.html
>> [1] http://www.spinics.net/lists/kernel/msg1990767.html
>>
>> Best regards,
>> Krzysztof
>>
>>
>> Krzysztof Kozlowski (4):
>>   ARM: multi_v7_defconfig: Enable CPU idle for Exynos
>>   ARM: multi_v7_defconfig: Enable PMIC and MUIC drivers for Exynos
>> boards
>>   ARM: multi_v7_defconfig: Enable TMU for Exynos
>>   ARM: multi_v7_defconfig: Enable OHCI on Exynos
>>
>>  arch/arm/configs/multi_v7_defconfig | 14 ++
>>  1 file changed, 14 insertions(+)
>>
>
> On a first look, I agree with all the Kconfig symbols but I noticed that
> you are adding them to be built-in while the current policy for multi_v7
> is to build as much as possible as a module.
>
> Could you please add as =m for all the tristate symbols that can be build
> as a module?

Your request makes sense. For many Exynos boards we got used to
compiling it into the kernel, not as a module, but I think the only
reason behind this is convenience. I'll try to switch to modules.

Best regards,
Krzysztof


Re: [patch] regulator: max77686: fix a shift wrapping bug

2015-05-15 Thread Krzysztof Kozlowski
2015-05-16 2:10 GMT+09:00 Joe Perches :
> On Fri, 2015-05-15 at 19:19 +0900, Chanwoo Choi wrote:
>> On 05/15/2015 06:25 PM, Dan Carpenter wrote:
>> > We need to be able to handle more than 32 bits here because "id" can go
>> > up to MAX77686_BUCK9 (34).  ->gpio_enabled is a u64 so that's fine
>> > already.
>> >
>> > Fixes: 3307e9025d29 ('regulator: max77686: Add GPIO control')
>> > Signed-off-by: Dan Carpenter 
>
> Alternate suggested patch below:
>
>> > diff --git a/drivers/regulator/max77686.c b/drivers/regulator/max77686.c
> []
>> > @@ -121,7 +121,7 @@ static unsigned int max77686_map_normal_mode(struct 
>> > max77686_data *max77686,
>> > -   if (max77686->gpio_enabled & (1 << id))
>> > +   if (max77686->gpio_enabled & (1ULL << id))
> []
>> > @@ -277,7 +277,7 @@ static int max77686_of_parse_cb(struct device_node *np,
>> > if (gpio_is_valid(config->ena_gpio)) {
>> > -   max77686->gpio_enabled |= (1 << desc->id);
>> > +   max77686->gpio_enabled |= (1ULL << desc->id);
> []
>> Looks good to me.
>> Reviewed-by: Chanwoo Choi 
>
> This could be better with DECLARE_BITMAP and test_bit/set_bit

Yes, this looks better - it clearly shows the purpose of the
"gpio_enabled" member. Joe or Dan, can you resend with the new solution
and the respective tags? (Cc stable, and Reported-by Dan if the patch
comes from Joe)

Anyway my reviewed-by may stay on for both solutions.
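
For anyone following along, here is a small userspace sketch of the two
approaches. It is not the driver code: gpio_bit64(), gpio_bit32(),
sketch_set_bit() and sketch_test_bit() are hypothetical helpers written for
this mail, and the sketch_* pair only imitates the idea behind the kernel's
set_bit()/test_bit() on a plain array of unsigned long. The first part shows
why bit 34 needs a 64-bit shift, the second shows the bitmap style:

```c
#include <limits.h>
#include <stdint.h>

/* Mask for regulator @id, computed in 64 bits; valid for id < 64. */
static uint64_t gpio_bit64(unsigned int id)
{
	return (uint64_t)1 << id;
}

/* What is left of that mask once it is squeezed through 32 bits. */
static uint32_t gpio_bit32(unsigned int id)
{
	return (uint32_t)gpio_bit64(id);
}

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Userspace imitation of a bitmap setter; not the kernel's set_bit(). */
static void sketch_set_bit(unsigned long *map, unsigned int nr)
{
	map[nr / SKETCH_BITS_PER_LONG] |= 1UL << (nr % SKETCH_BITS_PER_LONG);
}

/* Userspace imitation of a bitmap test; not the kernel's test_bit(). */
static int sketch_test_bit(const unsigned long *map, unsigned int nr)
{
	return (map[nr / SKETCH_BITS_PER_LONG] >>
		(nr % SKETCH_BITS_PER_LONG)) & 1UL;
}
```

With the bitmap style the caller never writes a shift at all, so the width
of the constant stops being something that can go wrong.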

Thanks,
Krzysztof


Re: [PATCH 0/8] MODSIGN: Use PKCS#7 for module signatures [ver #4]

2015-05-15 Thread Rusty Russell
David Howells  writes:
> Hi Rusty,

Hi David,

I try to stick my nose into patches which touch module.c/h: this
doesn't, so I am happy for this to go via another tree (AFAICT it
doesn't even need my ack).

Thanks,
Rusty.

> Here's a set of patches that does the following:
>
>  (1) Extracts both parts of an X.509 AuthorityKeyIdentifier (AKID) extension.
>  We already extract the bit that can match the subjectKeyIdentifier (SKID)
>  of the parent X.509 cert, but we currently ignore the bits that can match
>  the issuer and serialNumber.
>
>  Looks up an X.509 cert by issuer and serialNumber if those are provided 
> in
>  the AKID.  If the keyIdentifier is also provided, checks that the
>  subjectKeyIdentifier of the cert found matches that also.
>
>  If no issuer and serialNumber are provided in the AKID, looks up an X.509
>  cert by SKID using the AKID keyIdentifier.
>
>  This allows module signing to be done with certificates that don't have 
> an
>  SKID by which they can be looked up.
>
>  (2) Makes use of the PKCS#7 facility to provide module signatures.
>
>  sign-file is replaced with a program that generates a PKCS#7 message that
>  has no X.509 certs embedded and that has detached data (the module
>  content) and adds it onto the message with magic string and descriptor.
>
>  (3) The PKCS#7 message (and matching X.509 cert) supply all the information
>  that is needed to select the X.509 cert to be used to verify the 
> signature
>  by standard means (including selection of digest algorithm and public key
>  algorithm).  No kernel-specific magic values are required.
>
>  (4) Makes it possible to get sign-file to just write out a file containing 
> the
>  PKCS#7 signature blob.  This can be used for debugging and potentially 
> for
>  firmware signing.
>
>  (5) Extract the function that does PKCS#7 signature verification on a blob
>  from the module signing code and put it somewhere more general so that
>  other things, such as firmware signing, can make use of it without
>  depending on module config options.
>
> Note that the revised sign-file program no longer supports the "-s 
> "
> option as I'm not sure what the best way to deal with this is.  Do we generate
> a PKCS#7 cert from the signature given, or do we get given a PKCS#7 cert?  I
> lean towards the latter.  Note that David Woodhouse is looking at making
> sign-file work with PKCS#11, so bringing back -s might not be necessary.
>
> They can be found here also:
>
>   
> http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=modsign-pkcs7
>
> and are tagged with:
>
>   modsign-pkcs7-20150515
>
> Should these go via the security tree or your tree?
>
> David
> ---
> David Howells (7):
>   X.509: Extract both parts of the AuthorityKeyIdentifier
>   X.509: Support X.509 lookup by Issuer+Serial form AuthorityKeyIdentifier
>   PKCS#7: Allow detached data to be supplied for signature checking 
> purposes
>   MODSIGN: Provide a utility to append a PKCS#7 signature to a module
>   MODSIGN: Use PKCS#7 messages as module signatures
>   system_keyring.c doesn't need to #include module-internal.h
>   MODSIGN: Extract the blob PKCS#7 signature verifier from module signing
>
> Luis R. Rodriguez (1):
>   sign-file: Add option to only create signature file
>
>
>  Makefile  |2 
>  crypto/asymmetric_keys/Makefile   |8 -
>  crypto/asymmetric_keys/pkcs7_trust.c  |   10 -
>  crypto/asymmetric_keys/pkcs7_verify.c |   80 --
>  crypto/asymmetric_keys/x509_akid.asn1 |   35 ++
>  crypto/asymmetric_keys/x509_cert_parser.c |  142 ++
>  crypto/asymmetric_keys/x509_parser.h  |5 
>  crypto/asymmetric_keys/x509_public_key.c  |   86 --
>  include/crypto/pkcs7.h|3 
>  include/crypto/public_key.h   |4 
>  include/keys/system_keyring.h |5 
>  init/Kconfig  |   28 +-
>  kernel/module_signing.c   |  212 +--
>  kernel/system_keyring.c   |   51 +++-
>  scripts/Makefile  |2 
>  scripts/sign-file |  421 
> -
>  scripts/sign-file.c   |  212 +++
>  17 files changed, 578 insertions(+), 728 deletions(-)
>  create mode 100644 crypto/asymmetric_keys/x509_akid.asn1
>  delete mode 100755 scripts/sign-file
>  create mode 100755 scripts/sign-file.c


Re: [PATCH] PM / sleep: cancel the synchronous restriction of pm-trace

2015-05-15 Thread Rafael J. Wysocki
On Wednesday, May 06, 2015 10:47:24 PM Fu, Zhonghui wrote:
> Some system-hang occur only when multiple device PM methods
> are running asynchronously. So should cancel the synchronization
> of pm-trace, and make it suitable for asynchronous PM environment.
> 
> Signed-off-by: Zhonghui Fu 
> ---
>  drivers/base/power/main.c |   53 +---
>  include/linux/pm-trace.h  |   24 
>  kernel/power/main.c   |2 +
>  3 files changed, 41 insertions(+), 38 deletions(-)
> 
> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> index 3d874ec..40daf48 100644
> --- a/drivers/base/power/main.c
> +++ b/drivers/base/power/main.c
> @@ -476,9 +476,6 @@ static int device_resume_noirq(struct device *dev, 
> pm_message_t state, bool asyn
>   char *info = NULL;
>   int error = 0;
>  
> - TRACE_DEVICE(dev);
> - TRACE_RESUME(0);
> -
>   if (dev->power.syscore || dev->power.direct_complete)
>   goto Out;
>  
> @@ -506,19 +503,21 @@ static int device_resume_noirq(struct device *dev, 
> pm_message_t state, bool asyn
>   callback = pm_noirq_op(dev->driver->pm, state);
>   }
>  
> + TRACE_DEVICE_START(dev);
> + TRACE_RESUME(0);
> +
>   error = dpm_run_callback(callback, dev, state, info);
>   dev->power.is_noirq_suspended = false;
>  
> + TRACE_DEVICE_END();
>   Out:
>   complete_all(&dev->power.completion);
> - TRACE_RESUME(error);
>   return error;
>  }
>  
>  static bool is_async(struct device *dev)
>  {
> - return dev->power.async_suspend && pm_async_enabled
> - && !pm_trace_is_enabled();
> + return dev->power.async_suspend && pm_async_enabled;
>  }
>  
>  static void async_resume_noirq(void *data, async_cookie_t cookie)
> @@ -605,9 +604,6 @@ static int device_resume_early(struct device *dev, 
> pm_message_t state, bool asyn
>   char *info = NULL;
>   int error = 0;
>  
> - TRACE_DEVICE(dev);
> - TRACE_RESUME(0);
> -
>   if (dev->power.syscore || dev->power.direct_complete)
>   goto Out;
>  
> @@ -635,12 +631,14 @@ static int device_resume_early(struct device *dev, 
> pm_message_t state, bool asyn
>   callback = pm_late_early_op(dev->driver->pm, state);
>   }
>  
> + TRACE_DEVICE_START(dev);
> + TRACE_RESUME(0);
> +
>   error = dpm_run_callback(callback, dev, state, info);
>   dev->power.is_late_suspended = false;
>  
> + TRACE_DEVICE_END();
>   Out:
> - TRACE_RESUME(error);
> -
>   pm_runtime_enable(dev);
>   complete_all(&dev->power.completion);
>   return error;
> @@ -734,9 +732,6 @@ static int device_resume(struct device *dev, pm_message_t 
> state, bool async)
>   int error = 0;
>   DECLARE_DPM_WATCHDOG_ON_STACK(wd);
>  
> - TRACE_DEVICE(dev);
> - TRACE_RESUME(0);
> -
>   if (dev->power.syscore)
>   goto Complete;
>  
> @@ -801,9 +796,13 @@ static int device_resume(struct device *dev, 
> pm_message_t state, bool async)
>   }
>  
>   End:
> + TRACE_DEVICE_START(dev);
> + TRACE_RESUME(0);
> +
>   error = dpm_run_callback(callback, dev, state, info);
>   dev->power.is_suspended = false;
>  
> + TRACE_DEVICE_END();
>   Unlock:
>   device_unlock(dev);
>   dpm_watchdog_clear();
> @@ -811,8 +810,6 @@ static int device_resume(struct device *dev, pm_message_t 
> state, bool async)
>   Complete:
>   complete_all(&dev->power.completion);
>  
> - TRACE_RESUME(error);
> -
>   return error;
>  }
>  
> @@ -1017,9 +1014,6 @@ static int __device_suspend_noirq(struct device *dev, 
> pm_message_t state, bool a
>   char *info = NULL;
>   int error = 0;
>  
> - TRACE_DEVICE(dev);
> - TRACE_SUSPEND(0);
> -
>   if (async_error)
>   goto Complete;
>  
> @@ -1052,15 +1046,18 @@ static int __device_suspend_noirq(struct device *dev, 
> pm_message_t state, bool a
>   callback = pm_noirq_op(dev->driver->pm, state);
>   }
>  
> + TRACE_DEVICE_START(dev);
> + TRACE_SUSPEND(0);
> +
>   error = dpm_run_callback(callback, dev, state, info);
>   if (!error)
>   dev->power.is_noirq_suspended = true;
>   else
>   async_error = error;
>  
> + TRACE_DEVICE_END();
>  Complete:
>   complete_all(&dev->power.completion);
> - TRACE_SUSPEND(error);
>   return error;
>  }
>  
> @@ -1161,9 +1158,6 @@ static int __device_suspend_late(struct device *dev, 
> pm_message_t state, bool as
>   char *info = NULL;
>   int error = 0;
>  
> - TRACE_DEVICE(dev);
> - TRACE_SUSPEND(0);
> -
>   __pm_runtime_disable(dev, false);
>  
>   if (async_error)
> @@ -1198,14 +1192,17 @@ static int __device_suspend_late(struct device *dev, 
> pm_message_t state, bool as
>   callback = pm_late_early_op(dev->driver->pm, state);
>   }
>  
> + TRACE_DEVICE_START(dev);
> + TRACE_SUSPEND(0);
> +
>   error = dpm_run_callback(callback, 

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Sat, May 16, 2015 at 01:10:27AM +0100, Al Viro wrote:

> Er...  Remember the clusterfuck around the ->i_size and alignment
> checks on XFS DIO writes?  Just this cycle.  Correctness of XFS
> locking is nothing to boast about - it *is* convoluted as hell and you
> guys are not superhuman enough to reliably spot the problems in that nest
> of horrors.  Nobody is.
> 
> PS: I've no idea whether I'm being insulting or not and frankly, I don't give
> a damn; unlike Linus I hadn't signed off on the "code of conflict" nonsense.
> Anyone who feels like complaining is quite welcome to it.

PPS: everyone is also quite welcome to coming up with a sane locking scheme,
and unlike complaints about insulting tone, _that_ won't be ignored.  Folks,
I'm absolutely serious - if anyone has a good candidate (hell, any candidate),
post it on fsdevel and let's discuss it.  I'll do my best to poke holes in
such and if we end up with something working, I'll be bloody happy.


Re: [PATCH 3/3] power: add a dummy wakeup_source to record statistics

2015-05-15 Thread Rafael J. Wysocki
On Wednesday, April 22, 2015 05:50:12 PM Jin Qian wrote:
> After a wakeup_source is destroyed, we lost all information such as how
> long this wakeup_source has been active. Add a dummy wakeup_source to
> record such info.
> 
> Signed-off-by: Jin Qian 

That's fine by me.  Queued up for 4.2, thanks!

> ---
>  drivers/base/power/wakeup.c | 35 +++
>  1 file changed, 35 insertions(+)
> 
> diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> index bdb45f3..732683c 100644
> --- a/drivers/base/power/wakeup.c
> +++ b/drivers/base/power/wakeup.c
> @@ -59,6 +59,11 @@ static DECLARE_WAIT_QUEUE_HEAD(wakeup_count_wait_queue);
>  
>  static ktime_t last_read_time;
>  
> +static struct wakeup_source deleted_ws = {
> + .name = "deleted",
> + .lock =  __SPIN_LOCK_UNLOCKED(deleted_ws.lock),
> +};
> +
>  /**
>   * wakeup_source_prepare - Prepare a new wakeup source for initialization.
>   * @ws: Wakeup source to prepare.
> @@ -110,6 +115,33 @@ void wakeup_source_drop(struct wakeup_source *ws)
>  }
>  EXPORT_SYMBOL_GPL(wakeup_source_drop);
>  
> +/*
> + * Record wakeup_source statistics being deleted into a dummy wakeup_source.
> + */
> +static void wakeup_source_record(struct wakeup_source *ws)
> +{
> + unsigned long flags;
> +
> + spin_lock_irqsave(&deleted_ws.lock, flags);
> +
> + if (ws->event_count) {
> + deleted_ws.total_time =
> + ktime_add(deleted_ws.total_time, ws->total_time);
> + deleted_ws.max_time =
> + ktime_add(deleted_ws.max_time, ws->max_time);
> + deleted_ws.prevent_sleep_time =
> + ktime_add(deleted_ws.prevent_sleep_time,
> +   ws->prevent_sleep_time);
> + deleted_ws.event_count += ws->event_count;
> + deleted_ws.active_count += ws->active_count;
> + deleted_ws.relax_count += ws->relax_count;
> + deleted_ws.expire_count += ws->expire_count;
> + deleted_ws.wakeup_count += ws->wakeup_count;
> + }
> +
> + spin_unlock_irqrestore(&deleted_ws.lock, flags);
> +}
> +
>  /**
>   * wakeup_source_destroy - Destroy a struct wakeup_source object.
>   * @ws: Wakeup source to destroy.
> @@ -122,6 +154,7 @@ void wakeup_source_destroy(struct wakeup_source *ws)
>   return;
>  
>   wakeup_source_drop(ws);
> + wakeup_source_record(ws);
>   kfree(ws->name);
>   kfree(ws);
>  }
> @@ -930,6 +963,8 @@ static int wakeup_sources_stats_show(struct seq_file *m, 
> void *unused)
>   print_wakeup_source_stats(m, ws);
>   rcu_read_unlock();
>  
> + print_wakeup_source_stats(m, &deleted_ws);
> +
>   return 0;
>  }
>  
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Sat, May 16, 2015 at 09:38:08AM +1000, Dave Chinner wrote:

> > Both readdir() and path component lookup are technically read
> > operations, so why the hell do we use a mutex, rather than just
> > get a read-write lock for reading? Yeah, it's that (d) above. I
> > might trust xfs and ext4 to get their internal exclusions for
> > allocations etc right when called concurrently for the same
> > directory. But the others?
> 
> They just use a write lock for everything and *nothing changes* -
> this is a simple problem to solve.
> 
> The argument "filesystem developers are stupid" is not a
> compelling argument against changing locking. You're just being
> insulting, even though you probably don't realise it.

Er...  Remember the clusterfuck around the ->i_size and alignment
checks on XFS DIO writes?  Just this cycle.  Correctness of XFS
locking is nothing to boast about - it *is* convoluted as hell and you
guys are not superhuman enough to reliably spot the problems in that nest
of horrors.  Nobody is.

PS: I've no idea whether I'm being insulting or not and frankly, I don't give
a damn; unlike Linus I hadn't signed off on the "code of conflict" nonsense.
Anyone who feels like complaining is quite welcome to it.


Re: [PATCH 2/3] power: increment wakeup_count when save_wakeup_count failed.

2015-05-15 Thread Rafael J. Wysocki
On Wednesday, April 22, 2015 05:50:11 PM Jin Qian wrote:
> user-space aborts suspend attempt if writing wakeup_count failed.
> Count the write failure towards wakeup_count.

A use case, please?

> Signed-off-by: Jin Qian 
> ---
>  drivers/base/power/wakeup.c | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> index f24c622..bdb45f3 100644
> --- a/drivers/base/power/wakeup.c
> +++ b/drivers/base/power/wakeup.c
> @@ -57,6 +57,8 @@ static LIST_HEAD(wakeup_sources);
>  
>  static DECLARE_WAIT_QUEUE_HEAD(wakeup_count_wait_queue);
>  
> +static ktime_t last_read_time;
> +
>  /**
>   * wakeup_source_prepare - Prepare a new wakeup source for initialization.
>   * @ws: Wakeup source to prepare.
> @@ -771,10 +773,15 @@ void pm_wakeup_clear(void)
>  bool pm_get_wakeup_count(unsigned int *count, bool block)
>  {
>   unsigned int cnt, inpr;
> + unsigned long flags;
>  
>   if (block) {
>   DEFINE_WAIT(wait);
>  
> + spin_lock_irqsave(&events_lock, flags);
> + last_read_time = ktime_get();
> + spin_unlock_irqrestore(&events_lock, flags);
> +
>   for (;;) {
>   prepare_to_wait(&wakeup_count_wait_queue, &wait,
>   TASK_INTERRUPTIBLE);
> @@ -806,6 +813,7 @@ bool pm_save_wakeup_count(unsigned int count)
>  {
>   unsigned int cnt, inpr;
>   unsigned long flags;
> + struct wakeup_source *ws;
>  
>   events_check_enabled = false;
>   spin_lock_irqsave(&events_lock, flags);
> @@ -813,6 +821,15 @@ bool pm_save_wakeup_count(unsigned int count)
>   if (cnt == count && inpr == 0) {
>   saved_count = count;
>   events_check_enabled = true;
> + } else {
> + rcu_read_lock();
> + list_for_each_entry_rcu(ws, &wakeup_sources, entry) {
> + if (ws->active ||
> + ktime_compare(ws->last_time, last_read_time) > 0) {
> + ws->wakeup_count++;
> + }
> + }
> + rcu_read_unlock();
>   }
>   spin_unlock_irqrestore(&events_lock, flags);
>   return events_check_enabled;
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [PATCH - RESEND] IRQ: don't suspend nested_thread irqs over system suspend.

2015-05-15 Thread Rafael J. Wysocki
On Tuesday, February 24, 2015 01:28:05 PM NeilBrown wrote:
> 
> Nested IRQs can only fire when the parent irq fires.
> So when the parent is suspended, there is no need to suspend
> the child irq.
> 
> Suspending nested irqs can cause a problem if they are suspended or
> resumed in the wrong order.
> If an interrupt fires while the parent is active but the child is
> suspended, then the interrupt will not be acknowledged properly
> and so an interrupt storm can result.
> This is particularly likely if the parent is resumed before
> the child, and the interrupt was raised during suspend.
> 
> Ensuring correct ordering would be possible, but it is simpler
> to just never suspend nested interrupts.
> 
> Signed-off-by: NeilBrown 

We need to make some progress here.

Can you please resend it again for people to have a fresh look (and CC PeterZ
this time too)?

> 
> ---
> This is a resend of a patch sent at the end of January 2015.
> Rafael seemed happy with it, but I received no other response so I'm
> resending.
> 
> Thanks,
> NeilBrown
> 
> 
> diff --git a/kernel/irq/pm.c b/kernel/irq/pm.c
> index 3ca532592704..40cbcfb7fc43 100644
> --- a/kernel/irq/pm.c
> +++ b/kernel/irq/pm.c
> @@ -118,6 +118,8 @@ void suspend_device_irqs(void)
>   unsigned long flags;
>   bool sync;
>  
> + if (irq_settings_is_nested_thread(desc))
> + continue;
>   raw_spin_lock_irqsave(&desc->lock, flags);
>   sync = suspend_device_irq(desc, irq);
>   raw_spin_unlock_irqrestore(&desc->lock, flags);
> @@ -158,6 +160,8 @@ static void resume_irqs(bool want_early)
>  
>   if (!is_early && want_early)
>   continue;
> + if (irq_settings_is_nested_thread(desc))
> + continue;
>  
>   raw_spin_lock_irqsave(&desc->lock, flags);
>   resume_irq(desc, irq);
> 
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [PATCH v4 1/3] PM / Hibernate: prepare for SANITIZE_FREED_PAGES

2015-05-15 Thread Rafael J. Wysocki
On Thursday, May 14, 2015 04:19:46 PM Anisse Astier wrote:
> SANITIZE_FREED_PAGES feature relies on having all pages going through
> the free_pages_prepare path in order to be cleared before being used. In
> the hibernate use case, free pages will automagically appear in the
> system without being cleared, left there by the loading kernel.
> 
> This patch will make sure free pages are cleared on resume; when we'll
> enable SANITIZE_FREED_PAGES. We free the pages just after resume because
> we can't do it later: going through any device resume code might
> allocate some memory and invalidate the free pages bitmap.
> 
> Signed-off-by: Anisse Astier 
> ---
>  kernel/power/hibernate.c |  4 +++-
>  kernel/power/power.h |  2 ++
>  kernel/power/snapshot.c  | 22 ++++++++++++++++++++++
>  3 files changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> index 2329daa..0a73126 100644
> --- a/kernel/power/hibernate.c
> +++ b/kernel/power/hibernate.c
> @@ -305,9 +305,11 @@ static int create_image(int platform_mode)
>   error);
>   /* Restore control flow magically appears here */
>   restore_processor_state();
> - if (!in_suspend)
> + if (!in_suspend) {
>   events_check_enabled = false;
>  
> + clear_free_pages();

Again, why don't you do that at the swsusp_free() time?

> + }
>   platform_leave(platform_mode);
>  
>   Power_up:
> diff --git a/kernel/power/power.h b/kernel/power/power.h
> index ce9b832..6d2d7bf 100644
> --- a/kernel/power/power.h
> +++ b/kernel/power/power.h
> @@ -92,6 +92,8 @@ extern int create_basic_memory_bitmaps(void);
>  extern void free_basic_memory_bitmaps(void);
>  extern int hibernate_preallocate_memory(void);
>  
> +extern void clear_free_pages(void);
> +
>  /**
>   *   Auxiliary structure used for reading the snapshot image data and
>   *   metadata from and writing them to the list of page backup entries
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index 5235dd4..2335130 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -1032,6 +1032,28 @@ void free_basic_memory_bitmaps(void)
>   pr_debug("PM: Basic memory bitmaps freed\n");
>  }
>  
> +void clear_free_pages(void)
> +{
> +#ifdef CONFIG_SANITIZE_FREED_PAGES
> + struct memory_bitmap *bm = free_pages_map;
> + unsigned long pfn;
> +
> + if (WARN_ON(!(free_pages_map)))

One paren too many.

> + return;
> +
> + memory_bm_position_reset(bm);
> + pfn = memory_bm_next_pfn(bm);
> + while (pfn != BM_END_OF_MAP) {
> + if (pfn_valid(pfn))
> + clear_highpage(pfn_to_page(pfn));

Is clear_highpage() also fine for non-highmem pages?

> +
> + pfn = memory_bm_next_pfn(bm);
> + }
> + memory_bm_position_reset(bm);
> + printk(KERN_INFO "PM: free pages cleared after restore\n");
> +#endif /* SANITIZE_FREED_PAGES */
> +}
> +
>  /**
>   *   snapshot_additional_pages - estimate the number of additional pages
>   *   be needed for setting up the suspend image data structures for given
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [RFC/PATCH 6/9] drivers: platform: Configure dma operations at probe time

2015-05-15 Thread Greg Kroah-Hartman
On Fri, May 15, 2015 at 02:00:07AM +0300, Laurent Pinchart wrote:
> Configuring DMA ops at probe time will allow deferring device probe when
> the IOMMU isn't available yet.
> 
> Signed-off-by: Laurent Pinchart 
> ---
>  drivers/base/platform.c | 9 +
>  drivers/of/platform.c   | 7 +++
>  2 files changed, 12 insertions(+), 4 deletions(-)
> 

Acked-by: Greg Kroah-Hartman 


Re: [PATCH v2 0/6] ACPI / EC: Update due to recent changes.

2015-05-15 Thread Rafael J. Wysocki
On Friday, May 15, 2015 02:16:05 PM Lv Zheng wrote:
> This patchset tries to cleanup the EC driver to reflect the recent changes.
> There is a small fix in the patchset to use time_before() instead of
> time_after().
> The last patch removes all non-root-caused MSI quirks so that we may be
> able to identify their root cause if regressions are reported against this
> removal and generate new quirks based on the new code base.
> 
> Lv Zheng (6):
>   ACPI / EC: Update acpi_ec_is_gpe_raised() with new GPE status flag.
>   ACPI / EC: Remove storming threashold enlarging quirk.
>   ACPI / EC: Remove irqs_disabled() check.
>   ACPI / EC: Fix and clean up register access guarding logics.
>   ACPI / EC: Add module params for polling modes.
>   ACPI / EC: Remove non-root-caused busy polling quirks.
> 
>  drivers/acpi/ec.c   |  148 ++-
>  drivers/acpi/internal.h |1 +
>  2 files changed, 69 insertions(+), 80 deletions(-)

All of these make sense to me, so I'm queuing them up for 4.2, thanks!


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [PATCH 1/1] Fixed coding style issues

2015-05-15 Thread Greg KH
On Sat, May 16, 2015 at 02:11:06AM +0200, Pedro Marzo Perez wrote:
> From: pmarzo 

This line doesn't match your from: line in your email client, or the
line below in the signed-off-by part.

Also, please make your subject a bit more descriptive.

> 
> This patch just fixes some errors reported by checkpatch.pl script

What "errors" were reported and fixed?

thanks,

greg k-h


Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Sat, May 16, 2015 at 09:30:22AM +1000, NeilBrown wrote:

> .. and I've been wondering what to do about i_mutex and NFS.  I've had
> customer reports of slowness in creating files that seems to be due to
> i_mutex on the directory being held over the whole 'create' RPC, so only one
> of those can be in flight at the one time.
> "make  -j" on a large source directory can easily want to create lots of
> "*.o" files at "the same time".
> 
> And NFS doesn't need i_mutex at all because the server will provide the
> needed guarantees.

Directory i_mutex is used for a lot more than serialization of fs methods
on said directory - a lot of dcache handling relies upon having it held
around adding dentries/moving them around/making them negative/etc.

Server can't do a damn thing about those, obviously.  Neither can it
do anything about multiple lookups on the same name in the same directory,
just because several processes have arrived at the same time with dcache
cold.  And no, caching dentry before the end of lookup isn't a good idea
either - you'll get tons of messy corner cases.

If anyone has a usable finer-grained locking scheme, I would _love_ to see it.
All I'd seen from Lustre folks in that area was bringing Stross' mythos to
mind - both as in "reduction of NP to P would be handy for analysis" and
"looking into that thing feels like an excellent way of inviting the gibbering
monstrosities from beyond the spacetime to chew on your cortex".


[PATCH 3.10 02/17] nilfs2: fix sanity check of btree level in nilfs_btree_root_broken()

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Ryusuke Konishi 

commit d8fd150fe3935e1692bf57c66691e17409ebb9c1 upstream.

The range check for b-tree level parameter in nilfs_btree_root_broken()
is wrong; it accepts the case of "level == NILFS_BTREE_LEVEL_MAX" even
though the level is limited to values in the range of 0 to
(NILFS_BTREE_LEVEL_MAX - 1).

Since the level parameter is read from storage device and used to index
nilfs_btree_path array whose element count is NILFS_BTREE_LEVEL_MAX, it
can cause memory overrun during btree operations if the boundary value
is set to the level parameter on device.

This fixes the broken sanity check and adds a comment to clarify that
the upper bound NILFS_BTREE_LEVEL_MAX is exclusive.

Signed-off-by: Ryusuke Konishi 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 

---
 fs/nilfs2/btree.c |2 +-
 include/linux/nilfs2_fs.h |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- a/fs/nilfs2/btree.c
+++ b/fs/nilfs2/btree.c
@@ -388,7 +388,7 @@ static int nilfs_btree_root_broken(const
nchildren = nilfs_btree_node_get_nchildren(node);
 
if (unlikely(level < NILFS_BTREE_LEVEL_NODE_MIN ||
-level > NILFS_BTREE_LEVEL_MAX ||
+level >= NILFS_BTREE_LEVEL_MAX ||
 nchildren < 0 ||
 nchildren > NILFS_BTREE_ROOT_NCHILDREN_MAX)) {
pr_crit("NILFS: bad btree root (inode number=%lu): level = %d, 
flags = 0x%x, nchildren = %d\n",
--- a/include/linux/nilfs2_fs.h
+++ b/include/linux/nilfs2_fs.h
@@ -458,7 +458,7 @@ struct nilfs_btree_node {
 /* level */
 #define NILFS_BTREE_LEVEL_DATA  0
 #define NILFS_BTREE_LEVEL_NODE_MIN  (NILFS_BTREE_LEVEL_DATA + 1)
-#define NILFS_BTREE_LEVEL_MAX   14
+#define NILFS_BTREE_LEVEL_MAX   14 /* Max level (exclusive) */
 
 /**
  * struct nilfs_palloc_group_desc - block group descriptor




Re: [PATCH 1/2] drivers/rtc/interface.c: Change rtc_set_mmss() to use time64_t

2015-05-15 Thread John Stultz
On Fri, May 15, 2015 at 2:31 AM, Xunlei Pang  wrote:
> From: Xunlei Pang 
>
> rtc_set_mmss() uses "unsigned long" as its second parameter which
> may have y2038 problem on 32-bit systems.
>
> Change it to use time64_t.
>
> All its call sites will be changed later(there are no problems
> leaving these call sites untouched).

This line isn't super convincing. Better to explain *why* there aren't
problems making this change first, rather than just asserting it, and
to give some confidence you'll address this, point out that sparc is
the only user of this function and will be updated by the following
patch in the series.

thanks
-john


[PATCH 3.10 03/17] mm/memory-failure: call shake_page() when error hits thp tail page

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Naoya Horiguchi 

commit 09789e5de18e4e442870b2d700831f5cb802eb05 upstream.

Currently memory_failure() calls shake_page() to sweep pages out from
pcplists only when the victim page is 4kB LRU page or thp head page.
But we should do this for a thp tail page too.

Consider that a memory error hits a thp tail page whose head page is on
a pcplist when memory_failure() runs.  Then, the current kernel skips the
shake_page() part, so hwpoison_user_mappings() returns without calling
either split_huge_page() or try_to_unmap(), because PageLRU of the thp
head is still cleared due to the skipped shake_page().

As a result, me_huge_page() runs for the thp, which is broken behavior.

One effect is a leak of the thp.  And another is to fail to isolate the
memory error, so later access to the error address causes another MCE,
which kills the processes which used the thp.

This patch fixes this problem by calling shake_page() for thp tail case.

Fixes: 385de35722c9 ("thp: allow a hwpoisoned head page to be put back to LRU")
Signed-off-by: Naoya Horiguchi 
Reviewed-by: Andi Kleen 
Acked-by: Dean Nelson 
Cc: Andrea Arcangeli 
Cc: Hidetoshi Seto 
Cc: Jin Dongming 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 

---
 mm/memory-failure.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1117,10 +1117,10 @@ int memory_failure(unsigned long pfn, in
 * The check (unnecessarily) ignores LRU pages being isolated and
 * walked by the page reclaim code, however that's not a big loss.
 */
-   if (!PageHuge(p) && !PageTransTail(p)) {
-   if (!PageLRU(p))
-   shake_page(p, 0);
-   if (!PageLRU(p)) {
+   if (!PageHuge(p)) {
+   if (!PageLRU(hpage))
+   shake_page(hpage, 0);
+   if (!PageLRU(hpage)) {
/*
 * shake_page could have turned it free.
 */




[PATCH 3.10 11/17] drm/i915: Add missing MacBook Pro models with dual channel LVDS

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Lukas Wunner 

commit 3916e3fd81021fb795bfbdb17f375b6b3685bced upstream.

Single channel LVDS maxes out at 112 MHz. The 15" pre-retina models
shipped with 1440x900 (106 MHz) by default or 1680x1050 (119 MHz)
as a BTO option, both versions used dual channel LVDS even though
the smaller one would have fit into a single channel.

Notes:
  Bug report showing that the MacBookPro8,2 with 1440x900 uses dual
  channel LVDS (this led to it being hardcoded in intel_lvds.c by
  Daniel Vetter with commit 618563e3945b9d0864154bab3c607865b557cecc):
https://bugzilla.kernel.org/show_bug.cgi?id=42842

  If i915.lvds_channel_mode=2 is missing even though the machine needs
  it, every other vertical line is white and consequently, only the left
  half of the screen is visible (verified by myself on a MacBookPro9,1).

  Forum posting concerning a MacBookPro6,2 with 1440x900, author is
  using i915.lvds_channel_mode=2 on the kernel command line, proving
  that the machine uses dual channels:
https://bbs.archlinux.org/viewtopic.php?id=185770

  Chi Mei N154C6-L04 with 1440x900 is a replacement panel for all
  MacBook Pro "A1286" models, and that model number encompasses the
  MacBookPro6,2 / 8,2 / 9,1. Page 17 of the panel's datasheet shows it's
  driven with dual channel LVDS:
http://www.ebay.com/itm/-/400690878560
http://www.everymac.com/ultimate-mac-lookup/?search_keywords=A1286
http://www.taopanel.com/chimei/datasheet/N154C6-L04.pdf

  Those three 15" models, MacBookPro6,2 / 8,2 / 9,1, are the only ones
  with i915 graphics and dual channel LVDS, so that list should be
  complete. And the 8,2 is already in intel_lvds.c.

  Possible motivation to use dual channel LVDS even on the 1440x900
  models: Reduce the number of different parts, i.e. use identical logic
  boards and display cabling on both versions and the only differing
  component is the panel.

Signed-off-by: Lukas Wunner 
Acked-by: Jani Nikula 
[Jani: included notes in the commit message for posterity]
Signed-off-by: Jani Nikula 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/gpu/drm/i915/intel_lvds.c |   18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/i915/intel_lvds.c
+++ b/drivers/gpu/drm/i915/intel_lvds.c
@@ -1007,12 +1007,28 @@ static int intel_dual_link_lvds_callback
 static const struct dmi_system_id intel_dual_link_lvds[] = {
{
.callback = intel_dual_link_lvds_callback,
-   .ident = "Apple MacBook Pro (Core i5/i7 Series)",
+   .ident = "Apple MacBook Pro 15\" (2010)",
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "Apple Inc."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "MacBookPro6,2"),
+   },
+   },
+   {
+   .callback = intel_dual_link_lvds_callback,
+   .ident = "Apple MacBook Pro 15\" (2011)",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Apple Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "MacBookPro8,2"),
},
},
+   {
+   .callback = intel_dual_link_lvds_callback,
+   .ident = "Apple MacBook Pro 15\" (2012)",
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "Apple Inc."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "MacBookPro9,1"),
+   },
+   },
{ } /* terminating entry */
 };
 




[PATCH 3.10 05/17] gpio: unregister gpiochip device before removing it

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Johan Hovold 

commit 01cca93a9491ed95992523ff7e79dd9bfcdea8e0 upstream.

Unregister gpiochip device (used to export information through sysfs)
before removing it internally. This way removal will reverse addition.

Signed-off-by: Johan Hovold 
Signed-off-by: Linus Walleij 
Signed-off-by: Greg Kroah-Hartman 


---
 drivers/gpio/gpiolib.c |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -1265,6 +1265,8 @@ int gpiochip_remove(struct gpio_chip *ch
int status = 0;
unsigned id;
 
+   gpiochip_unexport(chip);
+
spin_lock_irqsave(&gpio_lock, flags);
 
gpiochip_remove_pin_ranges(chip);
@@ -1285,9 +1287,6 @@ int gpiochip_remove(struct gpio_chip *ch
 
spin_unlock_irqrestore(&gpio_lock, flags);
 
-   if (status == 0)
-   gpiochip_unexport(chip);
-
return status;
 }
 EXPORT_SYMBOL_GPL(gpiochip_remove);




[PATCH 3.10 16/17] ACPICA: Tables: Change acpi_find_root_pointer() to use acpi_physical_address.

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Lv Zheng 

commit f254e3c57b9d952e987502aefa0804c177dd2503 upstream.

ACPICA commit 7d9fd64397d7c38899d3dc497525f6e6b044e0e3

OSPMs like Linux expect acpi_find_root_pointer() to return an
acpi_physical_address value. This triggers warnings if sizeof (acpi_size)
isn't equal to sizeof (acpi_physical_address):
  drivers/acpi/osl.c:275:3: warning: passing argument 1 of 
'acpi_find_root_pointer' from incompatible pointer type [enabled by default]
  In file included from include/acpi/acpi.h:64:0,
   from include/linux/acpi.h:36,
   from drivers/acpi/osl.c:41:
  include/acpi/acpixf.h:433:1: note: expected 'acpi_size *' but argument is of 
type 'acpi_physical_address *'
This patch corrects acpi_find_root_pointer().

Link: https://github.com/acpica/acpica/commit/7d9fd643
Signed-off-by: Lv Zheng 
Signed-off-by: Bob Moore 
Signed-off-by: Rafael J. Wysocki 
Signed-off-by: Dirk Behme 
Signed-off-by: George G. Davis 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/acpi/acpica/tbxfroot.c |    7 ++++---
 include/acpi/acpixf.h  |2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

--- a/drivers/acpi/acpica/tbxfroot.c
+++ b/drivers/acpi/acpica/tbxfroot.c
@@ -118,7 +118,7 @@ static acpi_status acpi_tb_validate_rsdp
  *
  
**/
 
-acpi_status acpi_find_root_pointer(acpi_size *table_address)
+acpi_status acpi_find_root_pointer(acpi_physical_address * table_address)
 {
u8 *table_ptr;
u8 *mem_rover;
@@ -176,7 +176,8 @@ acpi_status acpi_find_root_pointer(acpi_
physical_address +=
(u32) ACPI_PTR_DIFF(mem_rover, table_ptr);
 
-   *table_address = physical_address;
+   *table_address =
+   (acpi_physical_address) physical_address;
return_ACPI_STATUS(AE_OK);
}
}
@@ -209,7 +210,7 @@ acpi_status acpi_find_root_pointer(acpi_
(ACPI_HI_RSDP_WINDOW_BASE +
 ACPI_PTR_DIFF(mem_rover, table_ptr));
 
-   *table_address = physical_address;
+   *table_address = (acpi_physical_address) physical_address;
return_ACPI_STATUS(AE_OK);
}
 
--- a/include/acpi/acpixf.h
+++ b/include/acpi/acpixf.h
@@ -177,7 +177,7 @@ acpi_status acpi_load_tables(void);
  */
 acpi_status acpi_reallocate_root_table(void);
 
-acpi_status acpi_find_root_pointer(acpi_size *rsdp_address);
+acpi_status acpi_find_root_pointer(acpi_physical_address *rsdp_address);
 
 acpi_status acpi_unload_table_id(acpi_owner_id id);
 




[PATCH 3.10 07/17] ARM: dts: imx25: Add #pwm-cells to pwm4

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Markus Pargmann 

commit f90d3f0d0a11fa77918fd5497cb616dd2faa8431 upstream.

The property '#pwm-cells' is currently missing. It is not possible to
use pwm4 without this property.

Signed-off-by: Markus Pargmann 
Fixes: 5658a68fb578 ("ARM i.MX25: Add devicetree")
Reviewed-by: Fabio Estevam 
Signed-off-by: Shawn Guo 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/arm/boot/dts/imx25.dtsi |1 +
 1 file changed, 1 insertion(+)

--- a/arch/arm/boot/dts/imx25.dtsi
+++ b/arch/arm/boot/dts/imx25.dtsi
@@ -393,6 +393,7 @@
 
pwm4: pwm@53fc8000 {
compatible = "fsl,imx25-pwm", "fsl,imx27-pwm";
+   #pwm-cells = <2>;
reg = <0x53fc8000 0x4000>;
clocks = <&clks 108>, <&clks 52>;
clock-names = "ipg", "per";




[PATCH v1 1/1] iio: ltr501: Fix proximity threshold boundary check

2015-05-15 Thread Kuppuswamy Sathyanarayanan
Currently, the proximity sensor boundary check is done
inside the switch block but before the first case
label. Since this code will never get executed,
move the check outside the switch statement.

	case IIO_PROXIMITY:
		switch (dir) {
		// The following check has been moved outside the switch block.
		if (val > LTR501_PS_THRESH_MASK)
			return -EINVAL;
		case IIO_EV_DIR_RISING:

Signed-off-by: Kuppuswamy Sathyanarayanan 

---
 drivers/iio/light/ltr501.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iio/light/ltr501.c b/drivers/iio/light/ltr501.c
index ca4bf47..417369b 100644
--- a/drivers/iio/light/ltr501.c
+++ b/drivers/iio/light/ltr501.c
@@ -865,9 +865,9 @@ static int ltr501_write_thresh(struct iio_dev *indio_dev,
return -EINVAL;
}
case IIO_PROXIMITY:
-   switch (dir) {
if (val > LTR501_PS_THRESH_MASK)
return -EINVAL;
+   switch (dir) {
case IIO_EV_DIR_RISING:
mutex_lock(&data->lock_ps);
ret = regmap_bulk_write(data->regmap,
-- 
1.9.1



Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Dave Chinner
On Thu, May 14, 2015 at 04:57:22PM -0700, Jeremy Allison wrote:
> On Thu, May 14, 2015 at 04:24:13PM -0700, Linus Torvalds wrote:
> > On Thu, May 14, 2015 at 3:09 PM, Jeremy Allison  wrote:
> > >
> > > Of course we tell people to just set their filesystems
> > > up using mkfs.xfs -n version=ci :-).
> > 
> > So ASCII-only case-insensitivity is sufficient for you guys?
> 
> No it's not enough really. But for specific Windows apps that
> use restricted namespaces (and there are such) it works.
> 
> ZFS on *BSD does do full case-insensitive lookups (utf8) as part of
> FreeNAS. I think if it's configured as an SMB-only share they turn
> that on.

And some people are using this on Linux:

http://oss.sgi.com/archives/xfs/2014-09/msg00169.html

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


[PATCH 3.10 04/17] xen/console: Update console event channel on resume

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Boris Ostrovsky 

commit b9d934f27c91b878c4b2e64299d6e419a4022f8d upstream.

After a resume the hypervisor/tools may change console event
channel number. We should re-query it.

Signed-off-by: Boris Ostrovsky 
Signed-off-by: David Vrabel 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/tty/hvc/hvc_xen.c |   18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

--- a/drivers/tty/hvc/hvc_xen.c
+++ b/drivers/tty/hvc/hvc_xen.c
@@ -299,11 +299,27 @@ static int xen_initial_domain_console_in
return 0;
 }
 
+static void xen_console_update_evtchn(struct xencons_info *info)
+{
+   if (xen_hvm_domain()) {
+   uint64_t v;
+   int err;
+
+   err = hvm_get_parameter(HVM_PARAM_CONSOLE_EVTCHN, &v);
+   if (!err && v)
+   info->evtchn = v;
+   } else
+   info->evtchn = xen_start_info->console.domU.evtchn;
+}
+
 void xen_console_resume(void)
 {
struct xencons_info *info = vtermno_to_xencons(HVC_COOKIE);
-   if (info != NULL && info->irq)
+   if (info != NULL && info->irq) {
+   if (!xen_initial_domain())
+   xen_console_update_evtchn(info);
rebind_evtchn_irq(info->evtchn, info->irq);
+   }
 }
 
 static void xencons_disconnect_backend(struct xencons_info *info)




[PATCH 3.10 12/17] pinctrl: Dont just pretend to protect pinctrl_maps, do it for real

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Doug Anderson 

commit c5272a28566b00cce79127ad382406e0a8650690 upstream.

Way back, when the world was a simpler place and there was no war, no
evil, and no kernel bugs, there was just a single pinctrl lock.  That
was how the world was when (57291ce pinctrl: core device tree mapping
table parsing support) was written.  In that case, there were
instances where the pinctrl mutex was already held when
pinctrl_register_map() was called, hence a "locked" parameter was
passed to the function to indicate that the mutex was already locked
(so we shouldn't lock it again).

A few years ago in (42fed7b pinctrl: move subsystem mutex to
pinctrl_dev struct), we switched to a separate pinctrl_maps_mutex.
...but (oops) we forgot to re-think about the whole "locked" parameter
for pinctrl_register_map().  Basically the "locked" parameter appears
to still refer to whether the bigger pinctrl_dev mutex is locked, but
we're using it to skip locks of our (now separate) pinctrl_maps_mutex.

That's kind of a bad thing(TM).  Probably nobody noticed because most
of the calls to pinctrl_register_map happen at boot time and we've got
synchronous device probing.  ...and even cases where we're
asynchronous don't end up actually hitting the race too often.  ...but
after banging my head against the wall for a bug that reproduced 1 out
of 1000 reboots and lots of looking through kgdb, I finally noticed
this.

Anyway, we can now safely remove the "locked" parameter and go back to
a war-free, evil-free, and kernel-bug-free world.

Fixes: 42fed7ba44e4 ("pinctrl: move subsystem mutex to pinctrl_dev struct")
Signed-off-by: Doug Anderson 
Signed-off-by: Linus Walleij 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/pinctrl/core.c   |   10 --
 drivers/pinctrl/core.h   |2 +-
 drivers/pinctrl/devicetree.c |2 +-
 3 files changed, 6 insertions(+), 8 deletions(-)

--- a/drivers/pinctrl/core.c
+++ b/drivers/pinctrl/core.c
@@ -1077,7 +1077,7 @@ void devm_pinctrl_put(struct pinctrl *p)
 EXPORT_SYMBOL_GPL(devm_pinctrl_put);
 
 int pinctrl_register_map(struct pinctrl_map const *maps, unsigned num_maps,
-bool dup, bool locked)
+bool dup)
 {
int i, ret;
struct pinctrl_maps *maps_node;
@@ -1145,11 +1145,9 @@ int pinctrl_register_map(struct pinctrl_
maps_node->maps = maps;
}
 
-   if (!locked)
-   mutex_lock(&pinctrl_maps_mutex);
+   mutex_lock(&pinctrl_maps_mutex);
list_add_tail(&maps_node->node, &pinctrl_maps);
-   if (!locked)
-   mutex_unlock(&pinctrl_maps_mutex);
+   mutex_unlock(&pinctrl_maps_mutex);
 
return 0;
 }
@@ -1164,7 +1162,7 @@ int pinctrl_register_map(struct pinctrl_
 int pinctrl_register_mappings(struct pinctrl_map const *maps,
  unsigned num_maps)
 {
-   return pinctrl_register_map(maps, num_maps, true, false);
+   return pinctrl_register_map(maps, num_maps, true);
 }
 
 void pinctrl_unregister_map(struct pinctrl_map const *map)
--- a/drivers/pinctrl/core.h
+++ b/drivers/pinctrl/core.h
@@ -183,7 +183,7 @@ static inline struct pin_desc *pin_desc_
 }
 
 int pinctrl_register_map(struct pinctrl_map const *maps, unsigned num_maps,
-bool dup, bool locked);
+bool dup);
 void pinctrl_unregister_map(struct pinctrl_map const *map);
 
 extern int pinctrl_force_sleep(struct pinctrl_dev *pctldev);
--- a/drivers/pinctrl/devicetree.c
+++ b/drivers/pinctrl/devicetree.c
@@ -92,7 +92,7 @@ static int dt_remember_or_free_map(struc
dt_map->num_maps = num_maps;
list_add_tail(&dt_map->node, &p->dt_maps);
 
-   return pinctrl_register_map(map, num_maps, false, true);
+   return pinctrl_register_map(map, num_maps, false);
 }
 
 struct pinctrl_dev *of_pinctrl_get(struct device_node *np)
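
The locking rule the commit message argues for can be illustrated outside the kernel. This is a minimal userspace sketch, assuming a pthread mutex in place of the kernel mutex; `struct map_node`, `register_map()` and `registered_maps()` are hypothetical names, not pinctrl APIs. The point is only that the function which modifies the list always takes the mutex guarding that list, with no caller-supplied "locked" flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's pinctrl types. */
struct map_node {
	int id;
	struct map_node *next;
};

static pthread_mutex_t maps_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct map_node *maps_head;   /* protected by maps_mutex */
static size_t maps_count;            /* protected by maps_mutex */

/* No "locked" flag: always take the lock that guards the list. */
static void register_map(struct map_node *node)
{
	pthread_mutex_lock(&maps_mutex);
	node->next = maps_head;
	maps_head = node;
	maps_count++;
	pthread_mutex_unlock(&maps_mutex);
}

/* Read the element count under the same lock. */
static size_t registered_maps(void)
{
	size_t n;

	pthread_mutex_lock(&maps_mutex);
	n = maps_count;
	pthread_mutex_unlock(&maps_mutex);
	return n;
}
```

A caller that already holds some other, larger lock can still call `register_map()` safely, because the two locks are distinct.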




[PATCH 3.10 17/17] ACPICA: Utilities: Cleanup to enforce ACPI_PHYSADDR_TO_PTR()/ACPI_PTR_TO_PHYSADDR().

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Lv Zheng 

commit 6d3fd3cc33d50e4c0d0c0bd172de02caaec3127c upstream.

ACPICA commit 154f6d074dd38d6ebc0467ad454454e6c5c9ecdf

Some code converts pointers using the "(acpi_physical_address) x" or
"ACPI_CAST_PTR (t, x)" forms; this patch cleans them up.

Known issues:
1. Cleanup of "(ACPI_PHYSICAL_ADDRESS) x" for a table field
   For conversions around table fields, it is better to fix them together
   with the alignment fix, so this patch doesn't modify such code. Leaving
   them unchanged should cause no functional problem.

Link: https://github.com/acpica/acpica/commit/154f6d07
Signed-off-by: Lv Zheng 
Signed-off-by: Bob Moore 
Signed-off-by: Rafael J. Wysocki 
Signed-off-by: Dirk Behme 
Signed-off-by: George G. Davis 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/acpi/acpica/dsopcode.c |3 +--
 drivers/acpi/acpica/tbinstal.c |5 ++---
 2 files changed, 3 insertions(+), 5 deletions(-)

--- a/drivers/acpi/acpica/dsopcode.c
+++ b/drivers/acpi/acpica/dsopcode.c
@@ -539,8 +539,7 @@ acpi_ds_eval_table_region_operands(struc
return_ACPI_STATUS(AE_NOT_EXIST);
}
 
-   obj_desc->region.address =
-   (acpi_physical_address) ACPI_TO_INTEGER(table);
+   obj_desc->region.address = ACPI_PTR_TO_PHYSADDR(table);
obj_desc->region.length = table->length;
 
ACPI_DEBUG_PRINT((ACPI_DB_EXEC, "RgnObj %p Addr %8.8X%8.8X Len %X\n",
--- a/drivers/acpi/acpica/tbinstal.c
+++ b/drivers/acpi/acpica/tbinstal.c
@@ -301,8 +301,7 @@ struct acpi_table_header *acpi_tb_table_
ACPI_EXCEPTION((AE_INFO, AE_NO_MEMORY,
"%4.4s %p Attempted physical table override failed",
table_header->signature,
-   ACPI_CAST_PTR(void,
- table_desc->address)));
+   ACPI_PHYSADDR_TO_PTR(table_desc->address)));
return (NULL);
}
 
@@ -318,7 +317,7 @@ struct acpi_table_header *acpi_tb_table_
ACPI_INFO((AE_INFO,
   "%4.4s %p %s table override, new table: %p",
   table_header->signature,
-  ACPI_CAST_PTR(void, table_desc->address),
+  ACPI_PHYSADDR_TO_PTR(table_desc->address),
   override_type, new_table));
 
/* We can now unmap/delete the original table (if fully mapped) */
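
The idea of routing every pointer/address conversion through one pair of helpers, rather than ad-hoc casts, can be sketched as follows. The names `PTR_TO_PHYSADDR`, `PHYSADDR_TO_PTR`, `phys_addr_t` and `table_address()` are hypothetical and only modelled on the macros named in the patch; the real ACPICA definitions live in its own headers and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical address type; assumed wide enough for any pointer value. */
typedef uint64_t phys_addr_t;

/* Go through uintptr_t so the conversion is well defined on 32- and 64-bit. */
#define PTR_TO_PHYSADDR(p)  ((phys_addr_t)(uintptr_t)(p))
#define PHYSADDR_TO_PTR(a)  ((void *)(uintptr_t)(a))

struct table_header {
	uint32_t length;
};

/* Record a table's address as an integer via the helper, not a bare cast. */
static phys_addr_t table_address(const struct table_header *table)
{
	return PTR_TO_PHYSADDR(table);
}
```

Keeping the casts in one place means a later change to the address type touches two macros instead of every call site.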




[PATCH 3.10 06/17] gpio: sysfs: fix memory leaks and device hotplug

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Johan Hovold 

commit 483d821108791092798f5d230686868112927044 upstream.

Unregister GPIOs requested through sysfs at chip remove to avoid leaking
the associated memory and sysfs entries.

The stale sysfs entries prevented the gpio numbers from being exported
when the gpio range was later reused (e.g. at device reconnect).

This also fixes the related module-reference leak.

Note that kernfs makes sure that any on-going sysfs operations finish
before the class devices are unregistered and that further accesses
fail.

The chip exported flag is used to prevent gpiod exports during removal.
This also makes it harder to trigger, but does not fix, the related race
between gpiochip_remove and export_store, which is really a race with
gpiod_request that needs to be addressed separately.

Also note that this would prevent the crashes (e.g. NULL-dereferences)
at reconnect that affect pre-3.18 kernels, as well as use-after-free on
operations on open attribute files on pre-3.14 kernels (prior to
kernfs).

Fixes: d8f388d8dc8d ("gpio: sysfs interface")
Signed-off-by: Johan Hovold 
Signed-off-by: Linus Walleij 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/gpio/gpiolib.c |   19 +++
 1 file changed, 19 insertions(+)

--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -752,6 +752,7 @@ static struct class gpio_class = {
  */
 static int gpiod_export(struct gpio_desc *desc, bool direction_may_change)
 {
+   struct gpio_chip*chip;
unsigned long   flags;
int status;
const char  *ioname = NULL;
@@ -769,8 +770,16 @@ static int gpiod_export(struct gpio_desc
return -EINVAL;
}
 
+   chip = desc->chip;
+
mutex_lock(&sysfs_lock);
 
+   /* check if chip is being removed */
+   if (!chip || !chip->exported) {
+   status = -ENODEV;
+   goto fail_unlock;
+   }
+
spin_lock_irqsave(&gpio_lock, flags);
if (!test_bit(FLAG_REQUESTED, &desc->flags) ||
 test_bit(FLAG_EXPORT, &desc->flags)) {
@@ -1040,6 +1049,8 @@ static void gpiochip_unexport(struct gpi
 {
int status;
struct device   *dev;
+   struct gpio_desc *desc;
+   unsigned int i;
 
mutex_lock(&sysfs_lock);
dev = class_find_device(&gpio_class, NULL, chip, match_export);
@@ -1047,6 +1058,7 @@ static void gpiochip_unexport(struct gpi
sysfs_remove_group(&dev->kobj, &gpiochip_attr_group);
put_device(dev);
device_unregister(dev);
+   /* prevent further gpiod exports */
chip->exported = 0;
status = 0;
} else
@@ -1056,6 +1068,13 @@ static void gpiochip_unexport(struct gpi
if (status)
pr_debug("%s: chip %s status %d\n", __func__,
chip->label, status);
+
+   /* unregister gpiod class devices owned by sysfs */
+   for (i = 0; i < chip->ngpio; i++) {
+   desc = &chip->desc[i];
+   if (test_and_clear_bit(FLAG_SYSFS, &desc->flags))
+   gpiod_free(desc);
+   }
 }
 
 static int __init gpiolib_sysfs_init(void)
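
The two halves of the fix (refuse new exports once the chip is going away, then release every line that sysfs still owns) can be modelled in plain C. This is a simplified, single-threaded sketch: `struct chip`, `struct line_desc`, `export_line()` and `unexport_chip()` are hypothetical stand-ins, and `test_and_clear()` is a non-atomic substitute for the kernel's `test_and_clear_bit()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FLAG_SYSFS (1u << 0)   /* line was exported through the sysfs-like path */

struct line_desc {
	unsigned int flags;
	bool requested;
};

struct chip {
	bool exported;
	size_t nlines;
	struct line_desc *lines;
};

/* Non-atomic stand-in for test_and_clear_bit(): return old state, clear it. */
static bool test_and_clear(unsigned int *flags, unsigned int bit)
{
	bool was_set = (*flags & bit) != 0;

	*flags &= ~bit;
	return was_set;
}

/* Refuse new exports once the chip is being removed. */
static int export_line(struct chip *c, size_t i)
{
	if (!c || !c->exported || i >= c->nlines)
		return -1;
	c->lines[i].flags |= FLAG_SYSFS;
	c->lines[i].requested = true;
	return 0;
}

/* On removal: block further exports, then free every sysfs-owned line. */
static size_t unexport_chip(struct chip *c)
{
	size_t freed = 0;

	c->exported = false;
	for (size_t i = 0; i < c->nlines; i++) {
		if (test_and_clear(&c->lines[i].flags, FLAG_SYSFS)) {
			c->lines[i].requested = false;
			freed++;
		}
	}
	return freed;
}
```

The sketch leaves out locking entirely, so it says nothing about the remaining race with gpiod_request that the commit message mentions.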




[PATCH 3.10 08/17] ARM: dts: imx28: Fix AUART4 TX-DMA interrupt name

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Marek Vasut 

commit 4ada77e37a773168fea484899201e272ab44ba8b upstream.

Fix a typo in the TX DMA interrupt name for AUART4.
This patch makes AUART4 operational again.

Signed-off-by: Marek Vasut 
Fixes: f30fb03d4d3a ("ARM: dts: add generic DMA device tree binding for mxs-dma")
Acked-by: Stefan Wahren 
Signed-off-by: Shawn Guo 
Signed-off-by: Greg Kroah-Hartman 

---
 Documentation/devicetree/bindings/dma/fsl-mxs-dma.txt |2 +-
 arch/arm/boot/dts/imx28.dtsi  |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- a/Documentation/devicetree/bindings/dma/fsl-mxs-dma.txt
+++ b/Documentation/devicetree/bindings/dma/fsl-mxs-dma.txt
@@ -38,7 +38,7 @@ dma_apbx: dma-apbx@80024000 {
  80 81 68 69
  70 71 72 73
  74 75 76 77>;
-   interrupt-names = "auart4-rx", "aurat4-tx", "spdif-tx", "empty",
+   interrupt-names = "auart4-rx", "auart4-tx", "spdif-tx", "empty",
  "saif0", "saif1", "i2c0", "i2c1",
  "auart0-rx", "auart0-tx", "auart1-rx", "auart1-tx",
  "auart2-rx", "auart2-tx", "auart3-rx", "auart3-tx";
--- a/arch/arm/boot/dts/imx28.dtsi
+++ b/arch/arm/boot/dts/imx28.dtsi
@@ -691,7 +691,7 @@
  80 81 68 69
  70 71 72 73
  74 75 76 77>;
-   interrupt-names = "auart4-rx", "aurat4-tx", "spdif-tx", "empty",
+   interrupt-names = "auart4-rx", "auart4-tx", "spdif-tx", "empty",
  "saif0", "saif1", "i2c0", "i2c1",
  "auart0-rx", "auart0-tx", "auart1-rx", "auart1-tx",
  "auart2-rx", "auart2-tx", "auart3-rx", "auart3-tx";




[PATCH 3.10 01/17] ocfs2: dlm: fix race between purge and get lock resource

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Junxiao Bi 

commit b1432a2a35565f538586774a03bf277c27fc267d upstream.

There is a race window in dlm_get_lock_resource(), which may return a
lock resource which has been purged.  This will cause the process to
hang forever in dlmlock() as the ast msg can't be handled due to its
lock resource not existing.

dlm_get_lock_resource {
    ...
    spin_lock(&dlm->spinlock);
    tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
    if (tmpres) {
        spin_unlock(&dlm->spinlock);
        race window, dlm_run_purge_list() may run and purge
        the lock resource
        spin_lock(&tmpres->spinlock);
        ...
        spin_unlock(&tmpres->spinlock);
    }
}

Signed-off-by: Junxiao Bi 
Cc: Joseph Qi 
Cc: Mark Fasheh 
Cc: Joel Becker 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 

---
 fs/ocfs2/dlm/dlmmaster.c |   13 +
 1 file changed, 13 insertions(+)

--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -729,6 +729,19 @@ lookup:
if (tmpres) {
spin_unlock(&dlm->spinlock);
spin_lock(&tmpres->spinlock);
+
+   /*
+* Right after dlm spinlock was released, dlm_thread could have
+* purged the lockres. Check if lockres got unhashed. If so
+* start over.
+*/
+   if (hlist_unhashed(&tmpres->hash_node)) {
+   spin_unlock(&tmpres->spinlock);
+   dlm_lockres_put(tmpres);
+   tmpres = NULL;
+   goto lookup;
+   }
+
/* Wait on the thread that is mastering the resource */
if (tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
__dlm_wait_on_lockres(tmpres);
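
The "look up, drop the table lock, re-validate, retry" pattern the patch adds can be shown with a single-threaded model. Everything here is a hypothetical stand-in: a one-entry table instead of the DLM hash, a `hashed` flag instead of `hlist_unhashed()`, and a `race_hook` callback that simulates the purge thread running inside the window. Unlike the real function, which allocates a new lock resource when the lookup fails, this sketch simply returns NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct resource {
	bool hashed;   /* still present in the lookup table */
	int refs;      /* reference count */
};

static struct resource *table_slot;   /* one-entry "hash table" */
static void (*race_hook)(void);       /* test hook: runs inside the window */

/* Look the entry up and take a reference (table lock held in real code). */
static struct resource *lookup(void)
{
	if (table_slot)
		table_slot->refs++;
	return table_slot;
}

/* Stand-in for the purge thread: unhash the entry. */
static void purge(void)
{
	if (table_slot) {
		table_slot->hashed = false;
		table_slot = NULL;
	}
}

/* Return a still-hashed resource, retrying if it was purged in the window. */
static struct resource *get_resource(int *retries)
{
	struct resource *res;

	for (;;) {
		res = lookup();
		if (!res)
			return NULL;
		if (race_hook)
			race_hook();   /* table lock dropped: purge may run here */
		if (res->hashed)
			return res;    /* validated under the per-resource lock */
		res->refs--;       /* drop our reference and start over */
		(*retries)++;
	}
}
```

Without the `res->hashed` check the caller would keep using an entry that is no longer reachable from the table, which is the hang described in the commit message.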




[PATCH 3.10 09/17] ARM: dts: imx23-olinuxino: Fix dr_mode of usb0

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Stefan Wahren 

commit 0fdebe1a2f4d3a8fc03754022fabf8ba95e131a3 upstream.

The dr_mode of usb0 on imx233-olinuxino is left to default "otg".
Since the green LED (GPIO2_1) on imx233-olinuxino is connected to the
same pin as USB_OTG_ID it's possible to disable USB host by LED toggling:

echo 0 > /sys/class/leds/green/brightness
[ 1068.89] ci_hdrc ci_hdrc.0: remove, state 1
[ 1068.89] usb usb1: USB disconnect, device number 1
[ 1068.92] usb 1-1: USB disconnect, device number 2
[ 1068.92] usb 1-1.1: USB disconnect, device number 3
[ 1069.07] usb 1-1.2: USB disconnect, device number 4
[ 1069.45] ci_hdrc ci_hdrc.0: USB bus 1 deregistered
[ 1074.46] ci_hdrc ci_hdrc.0: timeout waiting for 0800 in 11

This patch fixes the issue by setting dr_mode to "host" in the dts file.

Reported-by: Harald Geyer 
Signed-off-by: Stefan Wahren 
Reviewed-by: Fabio Estevam 
Reviewed-by: Marek Vasut 
Acked-by: Peter Chen 
Fixes: b49312948285 ("ARM: dts: imx23-olinuxino: Add USB host support")
Signed-off-by: Shawn Guo 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/arm/boot/dts/imx23-olinuxino.dts |1 +
 1 file changed, 1 insertion(+)

--- a/arch/arm/boot/dts/imx23-olinuxino.dts
+++ b/arch/arm/boot/dts/imx23-olinuxino.dts
@@ -89,6 +89,7 @@
 
ahb@8008 {
usb0: usb@8008 {
+   dr_mode = "host";
vbus-supply = <&reg_usb0_vbus>;
status = "okay";
};




[PATCH 3.14 13/51] Revert "dm crypt: fix deadlock when async crypto algorithm returns -EBUSY"

2015-05-15 Thread Greg Kroah-Hartman
3.14-stable review patch.  If anyone has any objections, please let me know.

--

From: Rabin Vincent 

commit c0403ec0bb5a8c5b267fb7e16021bec0b17e4964 upstream.

This reverts Linux 4.1-rc1 commit 0618764cb25f6fa9fb31152995de42a8a0496475.

The problem which that commit attempts to fix actually lies in the
Freescale CAAM crypto driver not dm-crypt.

dm-crypt uses CRYPTO_TFM_REQ_MAY_BACKLOG.  This means that the crypto
driver should internally backlog requests which arrive when the queue is
full and process them later.  Until the crypto hw's queue becomes full,
the driver returns -EINPROGRESS.  When the crypto hw's queue is full,
the driver returns -EBUSY, and if CRYPTO_TFM_REQ_MAY_BACKLOG is set, is
expected to backlog the request and process it when the hardware has
queue space.  At the point when the driver takes the request from the
backlog and starts processing it, it calls the completion function with
a status of -EINPROGRESS.  The completion function is called (for a
second time, in the case of backlogged requests) with a status/err of 0
when a request is done.

Crypto drivers for hardware without hardware queueing use the
crypto_init_queue(), crypto_enqueue_request(), crypto_dequeue_request()
and crypto_get_backlog() helpers to implement this behaviour correctly,
while others implement this behaviour without these helpers (ccp, for
example).

dm-crypt (before the patch that needs reverting) uses this API
correctly.  It queues up as many requests as the hw queues will allow
(i.e. as long as it gets back -EINPROGRESS from the request function).
Then, when it sees at least one backlogged request (gets -EBUSY), it
waits till that backlogged request is handled (completion gets called
with -EINPROGRESS), and then continues.  The references to
af_alg_wait_for_completion() and af_alg_complete() in that commit's
commit message are irrelevant because those functions only handle one
request at a time, unlike dm-crypt.

The problem is that the Freescale CAAM driver, which that commit
describes as having been tested with, fails to implement the
backlogging behaviour correctly.  In caam_jr_enqueue(), if the hardware
queue is full, it simply returns -EBUSY without backlogging the request.
What the observed deadlock was is not described in the commit message
but it is obviously the wait_for_completion() in crypt_convert() where
dm-crypt would wait for the completion being called with -EINPROGRESS
in the case of backlogged requests.  This completion will never be
completed due to the bug in the CAAM driver.

Commit 0618764cb25 incorrectly made dm-crypt wait for every request,
even when the driver/hardware queues are not full, which means that
dm-crypt will never see -EBUSY.  This means that that commit will cause
a performance regression on all crypto drivers which implement the API
correctly.

Revert it.  Correct backlog handling should be implemented in the CAAM
driver instead.

Cc'ing stable purely because commit 0618764cb25 did.  If for some reason
a stable@ kernel did pick up commit 0618764cb25 it should get reverted.

Signed-off-by: Rabin Vincent 
Reviewed-by: Horia Geanta 
Signed-off-by: Mike Snitzer 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/md/dm-crypt.c |   12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -915,10 +915,11 @@ static int crypt_convert(struct crypt_co
 
switch (r) {
/* async */
-   case -EINPROGRESS:
case -EBUSY:
wait_for_completion(&ctx->restart);
reinit_completion(&ctx->restart);
+   /* fall through*/
+   case -EINPROGRESS:
ctx->req = NULL;
ctx->cc_sector++;
continue;
@@ -1313,8 +1314,10 @@ static void kcryptd_async_done(struct cr
struct dm_crypt_io *io = container_of(ctx, struct dm_crypt_io, ctx);
struct crypt_config *cc = io->cc;
 
-   if (error == -EINPROGRESS)
+   if (error == -EINPROGRESS) {
+   complete(&ctx->restart);
return;
+   }
 
if (!error && cc->iv_gen_ops && cc->iv_gen_ops->post)
error = cc->iv_gen_ops->post(cc, iv_of_dmreq(cc, dmreq), dmreq);
@@ -1325,15 +1328,12 @@ static void kcryptd_async_done(struct cr
mempool_free(req_of_dmreq(cc, dmreq), cc->req_pool);
 
if (!atomic_dec_and_test(&ctx->cc_pending))
-   goto done;
+   return;
 
if (bio_data_dir(io->base_bio) == READ)
kcryptd_crypt_read_done(io);
else
kcryptd_crypt_write_io_submit(io, 1);
-done:
-   if (!completion_done(&ctx->restart))
-   complete(&ctx->restart);
 }
 
 static void kcryptd_crypt(struct work_struct *work)
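
The calling convention described in the commit message (keep submitting while the driver answers -EINPROGRESS, block only when it answers -EBUSY) can be modelled with a toy driver. This is a sketch under stated assumptions: `struct toy_driver`, `toy_submit()`, `toy_wait_for_backlog()` and `submit_all()` are hypothetical, the hardware queue depth of 2 is arbitrary, and the "wait" is a synchronous stand-in for waiting on a completion.

```c
#include <assert.h>
#include <errno.h>

#define HW_QUEUE_DEPTH 2   /* arbitrary depth chosen for the illustration */

struct toy_driver {
	int in_hw;     /* requests currently in the hardware queue */
	int backlog;   /* requests parked in the software backlog */
};

/* Well-behaved driver: -EINPROGRESS while there is room, otherwise
 * backlog the request and report -EBUSY. */
static int toy_submit(struct toy_driver *d)
{
	if (d->in_hw < HW_QUEUE_DEPTH) {
		d->in_hw++;
		return -EINPROGRESS;
	}
	d->backlog++;
	return -EBUSY;
}

/* Stand-in for waiting on the completion: the hardware finishes one
 * request and the driver moves one backlogged request into the free slot. */
static void toy_wait_for_backlog(struct toy_driver *d)
{
	d->in_hw--;
	d->backlog--;
	d->in_hw++;
}

/* Submit n requests; return how many times the caller had to block. */
static int submit_all(struct toy_driver *d, int n)
{
	int waits = 0;

	for (int i = 0; i < n; i++) {
		switch (toy_submit(d)) {
		case -EBUSY:
			toy_wait_for_backlog(d);
			waits++;
			/* fall through */
		case -EINPROGRESS:
			continue;
		default:
			return -1;
		}
	}
	return waits;
}
```

With this shape the caller blocks only for requests that arrive after the queue is full. Waiting on every request, as the reverted commit did, would block once per request regardless of queue state.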



[PATCH 3.10 10/17] ARM: mvebu: armada-xp-openblocks-ax3-4: Disable internal RTC

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Gregory CLEMENT 

commit 750e30d4076ae5e02ad13a376e96c95a2627742c upstream.

There is no crystal connected to the internal RTC on the OpenBlocks
AX3. So let's disable it in order to prevent the kernel probing the
driver uselessly. As a result, this patch removes the following
warning message from the boot log:
"rtc-mv d0010300.rtc: internal RTC not ticking"

Acked-by: Andrew Lunn 
Signed-off-by: Gregory CLEMENT 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts |4 
 1 file changed, 4 insertions(+)

--- a/arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts
+++ b/arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts
@@ -32,6 +32,10 @@
  0xf000 0 0xf000 0x800 /* Device Bus, NOR 128MiB   */>;
 
internal-regs {
+   rtc@10300 {
+   /* No crystal connected to the internal RTC */
+   status = "disabled";
+   };
serial@12000 {
clock-frequency = <25000>;
status = "okay";




[PATCH 3.14 14/51] ARM: dts: imx25: Add #pwm-cells to pwm4

2015-05-15 Thread Greg Kroah-Hartman
3.14-stable review patch.  If anyone has any objections, please let me know.

--

From: Markus Pargmann 

commit f90d3f0d0a11fa77918fd5497cb616dd2faa8431 upstream.

The property '#pwm-cells' is currently missing. It is not possible to
use pwm4 without this property.

Signed-off-by: Markus Pargmann 
Fixes: 5658a68fb578 ("ARM i.MX25: Add devicetree")
Reviewed-by: Fabio Estevam 
Signed-off-by: Shawn Guo 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/arm/boot/dts/imx25.dtsi |1 +
 1 file changed, 1 insertion(+)

--- a/arch/arm/boot/dts/imx25.dtsi
+++ b/arch/arm/boot/dts/imx25.dtsi
@@ -411,6 +411,7 @@
 
pwm4: pwm@53fc8000 {
compatible = "fsl,imx25-pwm", "fsl,imx27-pwm";
+   #pwm-cells = <2>;
reg = <0x53fc8000 0x4000>;
clocks = <&clks 108>, <&clks 52>;
clock-names = "ipg", "per";




[PATCH 3.10 14/17] sound/oss: fix deadlock in sequencer_ioctl(SNDCTL_SEQ_OUTOFBAND)

2015-05-15 Thread Greg Kroah-Hartman
3.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Alexey Khoroshilov 

commit bc26d4d06e337ade069f33d3f4377593b24e6e36 upstream.

A deadlock can be initiated by userspace via ioctl(SNDCTL_SEQ_OUTOFBAND)
on /dev/sequencer with TMR_ECHO midi event.

In this case the control flow is:
sound_ioctl()
-> case SND_DEV_SEQ:
   case SND_DEV_SEQ2:
 sequencer_ioctl()
 -> case SNDCTL_SEQ_OUTOFBAND:
  spin_lock_irqsave(&lock,flags);
  play_event();
  -> case EV_TIMING:
   seq_timing_event()
   -> case TMR_ECHO:
seq_copy_to_input()
-> spin_lock_irqsave(&lock,flags);

It seems that spin_lock_irqsave() around play_event() is not necessary,
because the only other call location in seq_startplay() makes the call
without acquiring spinlock.

So, the patch just removes spinlocks around play_event().
By the way, it removes unreachable code in seq_timing_event(),
since (seq_mode == SEQ_2) case is handled in the beginning.

Compile tested only.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov 
Signed-off-by: Takashi Iwai 
Cc: Willy Tarreau 
Signed-off-by: Greg Kroah-Hartman 

---
 sound/oss/sequencer.c |   12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

--- a/sound/oss/sequencer.c
+++ b/sound/oss/sequencer.c
@@ -683,13 +683,8 @@ static int seq_timing_event(unsigned cha
break;
 
case TMR_ECHO:
-   if (seq_mode == SEQ_2)
-   seq_copy_to_input(event_rec, 8);
-   else
-   {
-   parm = (parm << 8 | SEQ_ECHO);
-   seq_copy_to_input((unsigned char *) &parm, 4);
-   }
+   parm = (parm << 8 | SEQ_ECHO);
+   seq_copy_to_input((unsigned char *) &parm, 4);
break;
 
default:;
@@ -1332,7 +1327,6 @@ int sequencer_ioctl(int dev, struct file
int mode = translate_mode(file);
struct synth_info inf;
struct seq_event_rec event_rec;
-   unsigned long flags;
int __user *p = arg;
 
orig_dev = dev = dev >> 4;
@@ -1487,9 +1481,7 @@ int sequencer_ioctl(int dev, struct file
case SNDCTL_SEQ_OUTOFBAND:
if (copy_from_user(&event_rec, arg, sizeof(event_rec)))
return -EFAULT;
-   spin_lock_irqsave(&lock,flags);
play_event(event_rec.arr);
-   spin_unlock_irqrestore(&lock,flags);
return 0;
 
case SNDCTL_MIDI_INFO:
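
The self-deadlock in the call chain from the commit message comes down to a non-recursive lock being taken twice on the same path. A toy model, assuming a boolean "held" flag in place of a spinlock so that the second acquisition reports failure instead of spinning forever; all function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

static bool lock_held;        /* toy non-recursive lock */
static int input_queue_len;   /* data the lock protects */

/* Returns false where a real spinlock would spin forever. */
static bool toy_trylock(void)
{
	if (lock_held)
		return false;
	lock_held = true;
	return true;
}

static void toy_unlock(void)
{
	lock_held = false;
}

/* Takes the lock itself, like the innermost function in the call chain. */
static bool copy_to_input(void)
{
	if (!toy_trylock())
		return false;
	input_queue_len++;
	toy_unlock();
	return true;
}

/* Before the fix: the caller already holds the lock around the call. */
static bool out_of_band_locked(void)
{
	bool ok;

	(void)toy_trylock();
	ok = copy_to_input();
	toy_unlock();
	return ok;
}

/* After the fix: no lock is held around the call. */
static bool out_of_band_unlocked(void)
{
	return copy_to_input();
}
```

In the locked variant the inner acquisition can never succeed, which on a real spinlock with interrupts disabled means the CPU hangs.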




[PATCH 3.14 17/51] ARM: dts: imx23-olinuxino: Fix polarity of LED GPIO

2015-05-15 Thread Greg Kroah-Hartman
3.14-stable review patch.  If anyone has any objections, please let me know.

--

From: Fabio Estevam 

commit cfe8c59762244251fd9a5e281d48808095ff4090 upstream.

On imx23-olinuxino the LED turns on when a logic high level is applied to
GPIO2_1.

Fix the gpios property accordingly.

Fixes: b34aa1850244 ("ARM: dts: imx23-olinuxino: Remove unneeded "default-on"")
Reported-by: Stefan Wahren 
Signed-off-by: Fabio Estevam 
Tested-by: Stefan Wahren 
Signed-off-by: Shawn Guo 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/arm/boot/dts/imx23-olinuxino.dts |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/arm/boot/dts/imx23-olinuxino.dts
+++ b/arch/arm/boot/dts/imx23-olinuxino.dts
@@ -12,6 +12,7 @@
  */
 
 /dts-v1/;
+#include <dt-bindings/gpio/gpio.h>
 #include "imx23.dtsi"
 
 / {
@@ -120,7 +121,7 @@
 
user {
label = "green";
-   gpios = <&gpio2 1 1>;
+   gpios = <&gpio2 1 GPIO_ACTIVE_HIGH>;
};
};
 };




[PATCH 3.14 19/51] ARM: net fix emit_udiv() for BPF_ALU | BPF_DIV | BPF_K instruction.

2015-05-15 Thread Greg Kroah-Hartman
3.14-stable review patch.  If anyone has any objections, please let me know.

--

From: Nicolas Schichan 

commit 19fc99d0c6ba7d9b65456496b5bb2169d5f74cd0 upstream.

In that case, emit_udiv() will be called with rn == ARM_R0 (r_scratch)
and loading rm first into ARM_R0 will result in jit_udiv() function
being called with the same dividend and divisor. Fix that by loading rn
first into ARM_R1 and then rm into ARM_R0.

Signed-off-by: Nicolas Schichan 
Fixes: aee636c4809f (bpf: do not use reciprocal divide)
Acked-by: Mircea Gherzan 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/arm/net/bpf_jit_32.c |   15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -449,10 +449,21 @@ static inline void emit_udiv(u8 rd, u8 r
return;
}
 #endif
-   if (rm != ARM_R0)
-   emit(ARM_MOV_R(ARM_R0, rm), ctx);
+
+   /*
+* For BPF_ALU | BPF_DIV | BPF_K instructions, rm is ARM_R4
+* (r_A) and rn is ARM_R0 (r_scratch) so load rn first into
+* ARM_R1 to avoid accidentally overwriting ARM_R0 with rm
+* before using it as a source for ARM_R1.
+*
+* For BPF_ALU | BPF_DIV | BPF_X rm is ARM_R4 (r_A) and rn is
+* ARM_R5 (r_X) so there is no particular register overlap
+* issues.
+*/
if (rn != ARM_R1)
emit(ARM_MOV_R(ARM_R1, rn), ctx);
+   if (rm != ARM_R0)
+   emit(ARM_MOV_R(ARM_R0, rm), ctx);
 
ctx->seen |= SEEN_CALL;
emit_mov_i(ARM_R3, (u32)jit_udiv, ctx);
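
The register clobber can be reproduced with a tiny simulated register file. This is a sketch, not JIT code: `regs[]`, `mov()`, `call_udiv()` and the two `udiv_*_order()` functions are hypothetical, and the only convention assumed is the one stated in the patch (dividend in R0, divisor in R1).

```c
#include <assert.h>
#include <stdint.h>

enum { R0, R1, R2, R3, R4, R5, NREGS };

static uint32_t regs[NREGS];   /* simulated register file */

static void mov(int dst, int src)
{
	regs[dst] = regs[src];
}

/* Helper call: dividend in R0, divisor in R1, quotient returned in R0. */
static void call_udiv(void)
{
	regs[R0] = regs[R0] / regs[R1];
}

/* Old order: rm -> R0 first, which clobbers rn when rn == R0. */
static uint32_t udiv_old_order(int rm, int rn)
{
	if (rm != R0)
		mov(R0, rm);
	if (rn != R1)
		mov(R1, rn);
	call_udiv();
	return regs[R0];
}

/* Fixed order: rn -> R1 first, then rm -> R0. */
static uint32_t udiv_new_order(int rm, int rn)
{
	if (rn != R1)
		mov(R1, rn);
	if (rm != R0)
		mov(R0, rm);
	call_udiv();
	return regs[R0];
}
```

With the accumulator in R4 holding 100 and the constant in R0 holding 5, the old order computes 100 / 100 and the fixed order computes 100 / 5.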


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 3.14 15/51] ARM: dts: imx28: Fix AUART4 TX-DMA interrupt name

2015-05-15 Thread Greg Kroah-Hartman
3.14-stable review patch.  If anyone has any objections, please let me know.

--

From: Marek Vasut 

commit 4ada77e37a773168fea484899201e272ab44ba8b upstream.

Fix a typo in the TX DMA interrupt name for AUART4.
This patch makes AUART4 operational again.

Signed-off-by: Marek Vasut 
Fixes: f30fb03d4d3a ("ARM: dts: add generic DMA device tree binding for mxs-dma")
Acked-by: Stefan Wahren 
Signed-off-by: Shawn Guo 
Signed-off-by: Greg Kroah-Hartman 

---
 Documentation/devicetree/bindings/dma/fsl-mxs-dma.txt |2 +-
 arch/arm/boot/dts/imx28.dtsi  |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- a/Documentation/devicetree/bindings/dma/fsl-mxs-dma.txt
+++ b/Documentation/devicetree/bindings/dma/fsl-mxs-dma.txt
@@ -38,7 +38,7 @@ dma_apbx: dma-apbx@80024000 {
  80 81 68 69
  70 71 72 73
  74 75 76 77>;
-   interrupt-names = "auart4-rx", "aurat4-tx", "spdif-tx", "empty",
+   interrupt-names = "auart4-rx", "auart4-tx", "spdif-tx", "empty",
  "saif0", "saif1", "i2c0", "i2c1",
  "auart0-rx", "auart0-tx", "auart1-rx", "auart1-tx",
  "auart2-rx", "auart2-tx", "auart3-rx", "auart3-tx";
--- a/arch/arm/boot/dts/imx28.dtsi
+++ b/arch/arm/boot/dts/imx28.dtsi
@@ -803,7 +803,7 @@
  80 81 68 69
  70 71 72 73
  74 75 76 77>;
-   interrupt-names = "auart4-rx", "aurat4-tx", "spdif-tx", "empty",
+   interrupt-names = "auart4-rx", "auart4-tx", "spdif-tx", "empty",
  "saif0", "saif1", "i2c0", "i2c1",
  "auart0-rx", "auart0-tx", "auart1-rx", "auart1-tx",
  "auart2-rx", "auart2-tx", "auart3-rx", "auart3-tx";




[PATCH 3.14 16/51] ARM: dts: imx23-olinuxino: Fix dr_mode of usb0

2015-05-15 Thread Greg Kroah-Hartman
3.14-stable review patch.  If anyone has any objections, please let me know.

--

From: Stefan Wahren 

commit 0fdebe1a2f4d3a8fc03754022fabf8ba95e131a3 upstream.

The dr_mode of usb0 on imx233-olinuxino is left to default "otg".
Since the green LED (GPIO2_1) on imx233-olinuxino is connected to the
same pin as USB_OTG_ID it's possible to disable USB host by LED toggling:

echo 0 > /sys/class/leds/green/brightness
[ 1068.89] ci_hdrc ci_hdrc.0: remove, state 1
[ 1068.89] usb usb1: USB disconnect, device number 1
[ 1068.92] usb 1-1: USB disconnect, device number 2
[ 1068.92] usb 1-1.1: USB disconnect, device number 3
[ 1069.07] usb 1-1.2: USB disconnect, device number 4
[ 1069.45] ci_hdrc ci_hdrc.0: USB bus 1 deregistered
[ 1074.46] ci_hdrc ci_hdrc.0: timeout waiting for 0800 in 11

This patch fixes the issue by setting dr_mode to "host" in the dts file.

Reported-by: Harald Geyer 
Signed-off-by: Stefan Wahren 
Reviewed-by: Fabio Estevam 
Reviewed-by: Marek Vasut 
Acked-by: Peter Chen 
Fixes: b49312948285 ("ARM: dts: imx23-olinuxino: Add USB host support")
Signed-off-by: Shawn Guo 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/arm/boot/dts/imx23-olinuxino.dts |1 +
 1 file changed, 1 insertion(+)

--- a/arch/arm/boot/dts/imx23-olinuxino.dts
+++ b/arch/arm/boot/dts/imx23-olinuxino.dts
@@ -93,6 +93,7 @@
 
ahb@8008 {
usb0: usb@8008 {
+   dr_mode = "host";
vbus-supply = <&reg_usb0_vbus>;
status = "okay";
};




[PATCH 3.14 18/51] ARM: mvebu: armada-xp-openblocks-ax3-4: Disable internal RTC

2015-05-15 Thread Greg Kroah-Hartman
3.14-stable review patch.  If anyone has any objections, please let me know.

--

From: Gregory CLEMENT 

commit 750e30d4076ae5e02ad13a376e96c95a2627742c upstream.

There is no crystal connected to the internal RTC on the OpenBlocks
AX3. So let's disable it in order to prevent the kernel probing the
driver uselessly. As a result, this patch removes the following
warning message from the boot log:
"rtc-mv d0010300.rtc: internal RTC not ticking"

Acked-by: Andrew Lunn 
Signed-off-by: Gregory CLEMENT 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts |4 
 1 file changed, 4 insertions(+)

--- a/arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts
+++ b/arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts
@@ -69,6 +69,10 @@
};
 
internal-regs {
+   rtc@10300 {
+   /* No crystal connected to the internal RTC */
+   status = "disabled";
+   };
serial@12000 {
clock-frequency = <25000>;
status = "okay";




[PATCH 3.14 02/51] nilfs2: fix sanity check of btree level in nilfs_btree_root_broken()

2015-05-15 Thread Greg Kroah-Hartman
3.14-stable review patch.  If anyone has any objections, please let me know.

--

From: Ryusuke Konishi 

commit d8fd150fe3935e1692bf57c66691e17409ebb9c1 upstream.

The range check for b-tree level parameter in nilfs_btree_root_broken()
is wrong; it accepts the case of "level == NILFS_BTREE_LEVEL_MAX" even
though the level is limited to values in the range of 0 to
(NILFS_BTREE_LEVEL_MAX - 1).

Since the level parameter is read from storage device and used to index
nilfs_btree_path array whose element count is NILFS_BTREE_LEVEL_MAX, it
can cause memory overrun during btree operations if the boundary value
is set to the level parameter on device.

This fixes the broken sanity check and adds a comment to clarify that
the upper bound NILFS_BTREE_LEVEL_MAX is exclusive.

Signed-off-by: Ryusuke Konishi 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 
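
The off-by-one described in the commit message can be sketched in plain
userspace C. This is only an illustration under stated assumptions:
root_level_valid(), record_level() and struct path_slot are hypothetical
names, not nilfs2 code; only the three NILFS_BTREE_LEVEL_* values come from
the patch. An array with NILFS_BTREE_LEVEL_MAX slots has valid indices up to
NILFS_BTREE_LEVEL_MAX - 1, so the upper bound has to be tested with ">=".

```c
#include <stdbool.h>

/* Values as they appear in the patch context. */
#define NILFS_BTREE_LEVEL_DATA      0
#define NILFS_BTREE_LEVEL_NODE_MIN  (NILFS_BTREE_LEVEL_DATA + 1)
#define NILFS_BTREE_LEVEL_MAX       14 /* Max level (exclusive) */

/* Hypothetical stand-in for a per-level path array element. */
struct path_slot {
	int index;
};

/*
 * Hypothetical helper: true if a root level read from storage may be used
 * to index an array of NILFS_BTREE_LEVEL_MAX elements.
 */
static bool root_level_valid(int level)
{
	return level >= NILFS_BTREE_LEVEL_NODE_MIN &&
	       level < NILFS_BTREE_LEVEL_MAX;
}

/*
 * Store index in path[level]. Returns 0 on success, or -1 if the level is
 * rejected and the array is left untouched.
 */
static int record_level(struct path_slot path[NILFS_BTREE_LEVEL_MAX],
			int level, int index)
{
	if (!root_level_valid(level))
		return -1;
	path[level].index = index;
	return 0;
}
```

With the old "level > NILFS_BTREE_LEVEL_MAX" test, a level of 14 would have
passed the check and path[14] would be one element past the end of the array.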

---
 fs/nilfs2/btree.c |2 +-
 include/linux/nilfs2_fs.h |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- a/fs/nilfs2/btree.c
+++ b/fs/nilfs2/btree.c
@@ -388,7 +388,7 @@ static int nilfs_btree_root_broken(const
nchildren = nilfs_btree_node_get_nchildren(node);
 
if (unlikely(level < NILFS_BTREE_LEVEL_NODE_MIN ||
-level > NILFS_BTREE_LEVEL_MAX ||
+level >= NILFS_BTREE_LEVEL_MAX ||
 nchildren < 0 ||
 nchildren > NILFS_BTREE_ROOT_NCHILDREN_MAX)) {
pr_crit("NILFS: bad btree root (inode number=%lu): level = %d, flags = 0x%x, nchildren = %d\n",
--- a/include/linux/nilfs2_fs.h
+++ b/include/linux/nilfs2_fs.h
@@ -458,7 +458,7 @@ struct nilfs_btree_node {
 /* level */
 #define NILFS_BTREE_LEVEL_DATA  0
 #define NILFS_BTREE_LEVEL_NODE_MIN  (NILFS_BTREE_LEVEL_DATA + 1)
-#define NILFS_BTREE_LEVEL_MAX   14
+#define NILFS_BTREE_LEVEL_MAX   14 /* Max level (exclusive) */
 
 /**
  * struct nilfs_palloc_group_desc - block group descriptor




Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Dave Chinner
On Thu, May 14, 2015 at 08:51:12AM -0700, Linus Torvalds wrote:
> On Thu, May 14, 2015 at 4:23 AM, Dave Chinner  wrote:
> >
> > IIRC, ext4 readdir is not slow because of the use of the buffer
> > cache, it's slow because of the way it hashes dirents across blocks
> > on disk.  i.e. it has locality issues, not a caching problem.
> 
> No, you're just worrying about IO. Natural for a filesystem guy, but a
> lot of loads cache really well, and IO isn't an issue.  Yes, there's a
> bad cold-cache case, but that's not when you get inode semaphore
> contention.

Right, because it's cold cache performance that everyone complains
about. E.g. workloads like gluster, ceph, fileservers, openstack
(e.g. swift) etc. are all mostly cold cache directory workloads with
*extremely high* concurrency. Nobody is complaining about cached
readdir performance - concurrency in cold cache directory operations
is what everyone has been asking me for.

In case you missed it, recently the Ceph developers have been
talking about storing file handles in a userspace database and then
using open_by_handle_at() so they can avoid the pain of cold cache
directory lookup overhead (see the O_NOMTIME thread). We have a
serious cold cache lookup problem on directories when people are
looking to bypass the directory structure entirely.

[snip a bunch of rhetoric lacking in technical merit]

> End result: readdir() wastes a *lot* of time on stupid stuff (just
> that physical block number lookup is generally more expensive than
> readdir itself should be), and it does so with excessive locking,
> serializing everything.

Most of the overhead in readdir comes from calling filldir over and
over again for every dirent to copy it into the user buffer. The
overhead is not from looking up the buffer in the cache.

So, I just created close to a million dirents in a directory, and
ran the xfs_io readdir command on it (look, a readdir performance
measurement tool!). I used a ram disk to take IO out of the picture
for the first read, the system has E5-4620 0 @ 2.20GHz CPUs, and I
dropped caches to ensure that there was no cached metadata:

$ sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
$ sudo xfs_io -c readdir /mnt/scratch
read 29545648 bytes from offset 0
28 MiB, 923327 ops, 0. sec (111.011 MiB/sec and 3637694.9201 ops/sec)
$ sudo xfs_io -c readdir /mnt/scratch
read 29545648 bytes from offset 0
28 MiB, 923327 ops, 0. sec (189.864 MiB/sec and 6221628.5056 ops/sec)
$ sudo xfs_io -c readdir /mnt/scratch
read 29545648 bytes from offset 0
28 MiB, 923327 ops, 0. sec (190.156 MiB/sec and 6231201.6629 ops/sec)
$

Reading, decoding and copying dirents at 190MB/s? That's roughly 6
million dirents/second being pulled from cache, and it's doing
roughly 4 million/second cold cache. That's not slow at all.

What *noticeable* performance gains are there to be had here for the
average user? Anything that takes less than a second or two to
complete is not going to be noticeable to a user, and most people
don't have 8-10 million inodes in a directory.

So, what did the profile look like?

  10.07%  [kernel]  [k] __xfs_dir3_data_check
   9.92%  [kernel]  [k] copy_user_generic_string
   7.44%  [kernel]  [k] xfs_dir_ino_validate
   6.83%  [kernel]  [k] filldir
   5.43%  [kernel]  [k] xfs_dir2_leaf_getdents
   4.56%  [kernel]  [k] kallsyms_expand_symbol.constprop.1
   4.38%  [kernel]  [k] _raw_spin_unlock_irqrestore
   4.26%  [kernel]  [k] _raw_spin_unlock_irq
   4.02%  [kernel]  [k] __memcpy
   3.02%  [kernel]  [k] format_decode
   2.36%  [kernel]  [k] xfs_dir2_data_entsize
   2.28%  [kernel]  [k] vsnprintf
   1.99%  [kernel]  [k] __do_softirq
   1.93%  [kernel]  [k] xfs_dir2_data_get_ftype
   1.88%  [kernel]  [k] number.isra.14
   1.84%  [kernel]  [k] _xfs_buf_find
   1.82%  [kernel]  [k] ___might_sleep
   1.61%  [kernel]  [k] strnlen
   1.49%  [kernel]  [k] queue_work_on
   1.48%  [kernel]  [k] string.isra.4
   1.21%  [kernel]  [k] __might_sleep


Oh, I'm running CONFIG_XFS_DEBUG=y, so internal runtime consistency
checks consume most of the CPU (__xfs_dir3_data_check,
xfs_dir_ino_validate). IOWs, real world readdir performance will be
much, much faster than I've demonstrated.

Other than that, the most CPU is spent on copying dirents into the
user buffer (copy_user_generic_string), passing dirents to the user
buffer (filldir) and extracting dirents from the on-disk buffer
(xfs_dir2_leaf_getdents).  Then we have lock contention, ramdisk IO
(memcpy), some vsnprintf stuff (includes format_decode, probably
debug code) and some more dirent information extraction functions.

It's not until we get to _xfs_buf_find() that we see a buffer cache
lookup function, and that's actually consuming less CPU than the
__might_sleep/might_sleep() debug annotations. That puts into
perspective just how little overhead readdir buffer caching
actually has compared to everything else.

IOWs, these numbers indicate that readdir caching overhead has no
real impact on the performance of hot 

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Dave Chinner
On Fri, May 15, 2015 at 03:15:48PM -0600, Andreas Dilger wrote:
> On May 14, 2015, at 5:23 AM, Dave Chinner  wrote:
> > 
> > On Wed, May 13, 2015 at 08:52:59PM -0700, Linus Torvalds wrote:
> >> On Wed, May 13, 2015 at 8:30 PM, Al Viro  wrote:
> >>> 
> >>> Maybe...  I'd like to see the profiles, TBH - especially getxattr() and
> >>> access() frequency on various loads.  Sure, make(1) and cc(1) really care
> >>> about stat() very much, but I wouldn't be surprised if something like
> >>> httpd or samba would be hitting getxattr() a lot...
> >> 
> >> So I haven't seen samba profiles in ages, but iirc we have more
> >> serious problems than trying to speed up basic filename lookup.
> >> 
> >> At least long long ago, inode semaphore contention was a big deal,
> >> largely due to readdir().
> > 
> > It still is - it's the prime reason people still need to create
> > hashed directory structures so that they can get concurrency in
> > directory operations.  IMO, concurrency in directory operations is a
> > more important problem to solve than worrying about readdir speed;
> > in large filesystems readdir and lookup are IO bound operations and
> > so everything serialises on the IO as it's done with the i_mutex
> > held
> 
> We've had a patch[*] to add ext4 parallel directory operations in Lustre for
> a few years, that adds separate locks for each internal tree and leaf block
> instead of using i_mutex, so it scales as the size of the directory grows.
> This definitely improved many-threaded directory create/lookup/unlink
> performance (rename still uses a single lock).

Yup, we can do the same to XFS to implement concurrent modifications.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


[PATCH 3.14 03/51] RDMA/CMA: Canonize IPv4 on IPV6 sockets properly

2015-05-15 Thread Greg Kroah-Hartman
3.14-stable review patch.  If anyone has any objections, please let me know.

--

From: Jason Gunthorpe 

commit 285214409a9e5fceba2215461b4682b6069d8e77 upstream.

When accepting a new IPv4 connect to an IPv6 socket, the CMA tries to
canonize the address family to IPv4, but does not properly process
the listening sockaddr to get the listening port, and does not properly
set the address family of the canonized sockaddr.

Fixes: e51060f08a61 ("IB: IP address based RDMA connection manager")

Reported-By: Yotam Kenneth 
Signed-off-by: Jason Gunthorpe 
Tested-by: Haggai Eran 
Signed-off-by: Doug Ledford 
Signed-off-by: Greg Kroah-Hartman 
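
The idea of the fix can be sketched in userspace C. This is an analogue
under stated assumptions, not the kernel code: storage_get_port() and
canonize_ip4() are hypothetical names, and unlike the kernel helper the
sketch returns 0 for an unknown family instead of calling BUG().

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Return the port (network byte order) held in a generic sockaddr_storage,
 * reading the field that matches the address family actually stored there.
 * Returns 0 for families other than AF_INET and AF_INET6.
 */
static in_port_t storage_get_port(const struct sockaddr_storage *ss)
{
	if (ss->ss_family == AF_INET)
		return ((const struct sockaddr_in *)ss)->sin_port;
	if (ss->ss_family == AF_INET6)
		return ((const struct sockaddr_in6 *)ss)->sin6_port;
	return 0;
}

/*
 * Hypothetical stand-in for the source-address half of the fixed
 * cma_save_ip4_info(): the family is set to AF_INET explicitly instead of
 * being copied from the listener, and the port is fetched through the
 * family-aware helper.
 */
static void canonize_ip4(struct sockaddr_in *out,
			 const struct sockaddr_storage *listener,
			 uint32_t addr_be)
{
	memset(out, 0, sizeof(*out));
	out->sin_family = AF_INET;
	out->sin_addr.s_addr = addr_be;
	out->sin_port = storage_get_port(listener);
}
```

Given a listener stored as a sockaddr_in6, canonize_ip4() still produces a
sockaddr_in whose family is AF_INET and whose port equals the listener's
sin6_port, which is the behaviour the commit message asks for.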

---
 drivers/infiniband/core/cma.c |   27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -859,19 +859,27 @@ static void cma_save_ib_info(struct rdma
memcpy(>sib_addr, >dgid, 16);
 }
 
+static __be16 ss_get_port(const struct sockaddr_storage *ss)
+{
+   if (ss->ss_family == AF_INET)
+   return ((struct sockaddr_in *)ss)->sin_port;
+   else if (ss->ss_family == AF_INET6)
+   return ((struct sockaddr_in6 *)ss)->sin6_port;
+   BUG();
+}
+
 static void cma_save_ip4_info(struct rdma_cm_id *id, struct rdma_cm_id *listen_id,
  struct cma_hdr *hdr)
 {
-   struct sockaddr_in *listen4, *ip4;
+   struct sockaddr_in *ip4;
 
-   listen4 = (struct sockaddr_in *) _id->route.addr.src_addr;
ip4 = (struct sockaddr_in *) >route.addr.src_addr;
-   ip4->sin_family = listen4->sin_family;
+   ip4->sin_family = AF_INET;
ip4->sin_addr.s_addr = hdr->dst_addr.ip4.addr;
-   ip4->sin_port = listen4->sin_port;
+   ip4->sin_port = ss_get_port(_id->route.addr.src_addr);
 
ip4 = (struct sockaddr_in *) >route.addr.dst_addr;
-   ip4->sin_family = listen4->sin_family;
+   ip4->sin_family = AF_INET;
ip4->sin_addr.s_addr = hdr->src_addr.ip4.addr;
ip4->sin_port = hdr->port;
 }
@@ -879,16 +887,15 @@ static void cma_save_ip4_info(struct rdm
 static void cma_save_ip6_info(struct rdma_cm_id *id, struct rdma_cm_id *listen_id,
  struct cma_hdr *hdr)
 {
-   struct sockaddr_in6 *listen6, *ip6;
+   struct sockaddr_in6 *ip6;
 
-   listen6 = (struct sockaddr_in6 *) _id->route.addr.src_addr;
ip6 = (struct sockaddr_in6 *) >route.addr.src_addr;
-   ip6->sin6_family = listen6->sin6_family;
+   ip6->sin6_family = AF_INET6;
ip6->sin6_addr = hdr->dst_addr.ip6;
-   ip6->sin6_port = listen6->sin6_port;
+   ip6->sin6_port = ss_get_port(_id->route.addr.src_addr);
 
ip6 = (struct sockaddr_in6 *) >route.addr.dst_addr;
-   ip6->sin6_family = listen6->sin6_family;
+   ip6->sin6_family = AF_INET6;
ip6->sin6_addr = hdr->src_addr.ip6;
ip6->sin6_port = hdr->port;
 }




[PATCH 3.14 04/51] gpio: unregister gpiochip device before removing it

2015-05-15 Thread Greg Kroah-Hartman
3.14-stable review patch.  If anyone has any objections, please let me know.

--

From: Johan Hovold 

commit 01cca93a9491ed95992523ff7e79dd9bfcdea8e0 upstream.

Unregister gpiochip device (used to export information through sysfs)
before removing it internally. This way removal will reverse addition.

Signed-off-by: Johan Hovold 
Signed-off-by: Linus Walleij 
Signed-off-by: Greg Kroah-Hartman 


---
 drivers/gpio/gpiolib.c |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -1265,6 +1265,8 @@ int gpiochip_remove(struct gpio_chip *ch
int status = 0;
unsigned id;
 
+   gpiochip_unexport(chip);
+
spin_lock_irqsave(_lock, flags);
 
gpiochip_remove_pin_ranges(chip);
@@ -1286,9 +1288,6 @@ int gpiochip_remove(struct gpio_chip *ch
 
spin_unlock_irqrestore(_lock, flags);
 
-   if (status == 0)
-   gpiochip_unexport(chip);
-
return status;
 }
 EXPORT_SYMBOL_GPL(gpiochip_remove);



