[PATCH] dmaengine: edma: No need save/restore interrupt flags during spin_lock in IRQ

2014-04-16 Thread Joel Fernandes
The vchan lock in edma_callback is acquired in hard interrupt context. As
interrupts are already disabled, there's no point in saving and restoring the
interrupt mask bit or CPSR flags.

Get rid of the flags local variable and use spin_lock instead of
spin_lock_irqsave.

Signed-off-by: Joel Fernandes 
---
 drivers/dma/edma.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index 91849aa..25a75e2 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -638,7 +638,6 @@ static void edma_callback(unsigned ch_num, u16 ch_status, 
void *data)
struct edma_chan *echan = data;
struct device *dev = echan->vchan.chan.device->dev;
struct edma_desc *edesc;
-   unsigned long flags;
struct edmacc_param p;
 
edesc = echan->edesc;
@@ -649,7 +648,7 @@ static void edma_callback(unsigned ch_num, u16 ch_status, 
void *data)
 
switch (ch_status) {
case EDMA_DMA_COMPLETE:
-   spin_lock_irqsave(&echan->vchan.lock, flags);
+   spin_lock(&echan->vchan.lock);
 
if (edesc) {
if (edesc->cyclic) {
@@ -665,11 +664,11 @@ static void edma_callback(unsigned ch_num, u16 ch_status, 
void *data)
}
}
 
-   spin_unlock_irqrestore(&echan->vchan.lock, flags);
+   spin_unlock(&echan->vchan.lock);
 
break;
case EDMA_DMA_CC_ERROR:
-   spin_lock_irqsave(&echan->vchan.lock, flags);
+   spin_lock(&echan->vchan.lock);
 
edma_read_slot(EDMA_CHAN_SLOT(echan->slot[0]), &p);
 
@@ -700,7 +699,7 @@ static void edma_callback(unsigned ch_num, u16 ch_status, 
void *data)
edma_trigger_channel(echan->ch_num);
}
 
-   spin_unlock_irqrestore(&echan->vchan.lock, flags);
+   spin_unlock(&echan->vchan.lock);
 
break;
default:
-- 
1.7.9.5
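
A rough userspace sketch of why the save/restore pair buys nothing in this
handler. It is not kernel code: model_irqs_enabled, model_irq_save() and
model_irq_restore() are hypothetical stand-ins for the CPU interrupt-enable
bit and for the two halves of an irqsave/irqrestore pair, and it assumes, as
the description above states, that the callback is entered with interrupts
already off.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU's interrupt-enable bit. */
static bool model_irqs_enabled = true;

/* Mimics the save half: remember the old state, then mask interrupts. */
static unsigned long model_irq_save(void)
{
	unsigned long flags = model_irqs_enabled ? 1UL : 0UL;

	model_irqs_enabled = false;
	return flags;
}

/* Mimics the restore half: put back whatever state was saved. */
static void model_irq_restore(unsigned long flags)
{
	model_irqs_enabled = (flags != 0);
}

/*
 * Runs one save/restore pair starting from the given state and reports
 * whether the pair changed the interrupt state while it was held.
 */
static bool save_restore_changes_state(bool enabled_on_entry)
{
	unsigned long flags;
	bool changed;

	model_irqs_enabled = enabled_on_entry;
	flags = model_irq_save();
	changed = (model_irqs_enabled != enabled_on_entry);
	model_irq_restore(flags);

	/* Either way, the pair leaves the state as it found it. */
	assert(model_irqs_enabled == enabled_on_entry);
	return changed;
}
```

Starting from the interrupts-off state the pair changes nothing, which is the
situation edma_callback is in. Starting from interrupts-on it does mask them,
which is why code taking the same lock outside interrupt context still wants
the irqsave variant.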

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 16/19] VFS: use GFP_NOFS rather than GFP_KERNEL in __d_alloc.

2014-04-16 Thread Dave Chinner
On Thu, Apr 17, 2014 at 10:51:05AM +1000, NeilBrown wrote:
> On Wed, 16 Apr 2014 19:00:51 +1000 Dave Chinner  wrote:
> 
> > On Wed, Apr 16, 2014 at 04:49:41PM +1000, NeilBrown wrote:
> > > On Wed, 16 Apr 2014 16:25:20 +1000 Dave Chinner wrote:
> > > 
> > > > On Wed, Apr 16, 2014 at 02:03:37PM +1000, NeilBrown wrote:
> > > > > __d_alloc can be called with i_mutex held, so it is safer to
> > > > > use GFP_NOFS.
> > > > > 
> > > > > lockdep reports this can deadlock when loop-back NFS is in use,
> > > > > as nfsd may be required to write out for reclaim, and nfsd certainly
> > > > > takes i_mutex.
> > > > 
> > > > But not the same i_mutex as is currently held. To me, this seems
> > > > like a false positive? If you are holding the i_mutex on an inode,
> > > > then you have a reference to the inode and hence memory reclaim
> > > > won't ever take the i_mutex on that inode.
> > > > 
> > > > > FWIW, this sort of false positive was a long-standing problem for
> > > > XFS - we managed to get rid of most of the false positives like this
> > > > by ensuring that only the ilock is taken within memory reclaim and
> > > > memory reclaim can't be entered while we hold the ilock.
> > > > 
> > > > You can't do that with the i_mutex, though
> > > > 
> > > > Cheers,
> > > > 
> > > > Dave.
> > > 
> > > I'm not sure this is a false positive.
> > > You can call __d_alloc when creating a file and so are holding i_mutex on
> > > the directory.
> > > nfsd might also want to access that directory.
> > > 
> > > If there was only 1 nfsd thread, it would need to get i_mutex and do its
> > > thing before replying to that request and so before it could handle the
> > > COMMIT which __d_alloc is waiting for.
> > 
> > That seems wrong - the NFS client in __d_alloc holds a mutex on a
> > NFS client directory inode. The NFS server can't access that
> > specific mutex - it's on the other side of the "network". The NFS
> > server accesses mutexes from local filesystems, so __d_alloc would
> > have to be blocked on a local filesystem inode i_mutex for the nfsd
> > to get hung up behind it...
> 
> I'm not thinking of mutexes on the NFS inodes but the local filesystem inodes
> exactly as you describe below.
> 
> > 
> > However, my confusion comes from the fact that we do GFP_KERNEL
> > memory allocation with the i_mutex held all over the place.
> 
> Do we? 

Yes.

A simple example: fs/xattr.c. Setting or removing an xattr is done
under the i_mutex, yet have a look at the simple_xattr_*
implementation that in-memory/pseudo filesystems can use. They use
GFP_KERNEL for all their allocations.

> Should we?

No, I don't think so, because it means that under heavy filesystem
memory pressure workloads, direct reclaim can effectively be shut off
and only kswapd can free memory.

> Isn't the whole point of GFP_NOFS to use it when holding
> any filesystem lock?

Not the way I understand it - we've always used it in XFS to prevent
known deadlocks due to recursion, not as a big hammer that we hit
everything with. Every filesystem has different recursion deadlock
triggers, so each uses GFP_NOFS differently. Using it as a big hammer
doesn't play well with memory reclaim on filesystem memory pressure
generating workloads...
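
To make the distinction concrete, here is a small userspace model of the
flag relationship being argued about. In kernels of this era GFP_KERNEL is
GFP_NOFS plus the __GFP_FS bit; the MODEL_* bit values and both helper
functions below are invented for illustration and are not the kernel's
definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; only the relationship between the masks matters. */
#define MODEL_GFP_WAIT 0x1u /* caller may sleep */
#define MODEL_GFP_IO   0x2u /* reclaim may start physical I/O */
#define MODEL_GFP_FS   0x4u /* reclaim may call back into filesystem code */

#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_WAIT | MODEL_GFP_IO)

/* Direct reclaim may only re-enter a filesystem when the FS bit is set. */
static bool reclaim_may_enter_fs(unsigned int gfp_mask)
{
	return (gfp_mask & MODEL_GFP_FS) != 0;
}

/*
 * The narrower policy argued for above: drop the FS bit only when the
 * lock currently held is one that filesystem reclaim could also need.
 */
static unsigned int pick_gfp(bool held_lock_taken_in_reclaim)
{
	return held_lock_taken_in_reclaim ? MODEL_GFP_NOFS : MODEL_GFP_KERNEL;
}
```

Under the "big hammer" policy every allocation made with any filesystem lock
held would take the first branch of pick_gfp(), so direct reclaim could
never enter filesystem code from those callers.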

> >   If the
> > problem is:
> > 
> > local fs access -> i_mutex
> > .
> > nfsd -> i_mutex (blocked)
> > .
> > local fs access -> kmalloc(GFP_KERNEL)
> > -> direct reclaim
> > -> nfs_release_page
> > -> 
> >
> > 
> > then why is it just __d_alloc that needs this fix?  Either this is a
> > problem *everywhere* or it's not a problem at all.
> 
> I think it is a problem everywhere that it is a problem :-)



> If you are holding an FS lock, then you should be using GFP_NOFS.

Only if reclaim recursion can cause a deadlock.

> Currently a given filesystem can get away with sometimes using GFP_KERNEL
> because that particular lock never causes contention during reclaim for that
> particular filesystem.

Right, because we directly control what recursion can happen.

What you are doing is changing the global reclaim recursion context.
You're introducing recursion loops between filesystems *of different
types*. IOWs, at no point is it safe for anyone to allow any
recursion because we don't have the state available to determine if
recursion is safe or not.

> Adding loop-back NFS into the mix broadens the number of locks which can
> cause a problem as it creates interdependencies between different filesystems.

And that's a major architectural change to the memory reclaim
hierarchy and that means we're going to be breaking assumptions all
over the place, including in places we didn't know we had
dependencies...

> > If it's a problem everywhere it means that we simply can't allow
> > reclaim from localhost NFS mounts to run from contexts that could
> > block an NFSD. i.e. you cannot run NFS client memory reclaim from
> > 

[PATCH] cgroup: fix the retry path of cgroup_mount()

2014-04-16 Thread Li Zefan
If we hit the retry path, we'll call parse_cgroupfs_options() again,
but the string we pass to it has been modified by the previous call
to this function.

This bug can be observed by:

  # mount -t cgroup -o name=foo,cpuset xxx /mnt && umount /mnt && \
mount -t cgroup -o name=foo,cpuset xxx /mnt
  mount: wrong fs type, bad option, bad superblock on xxx,
 missing codepage or helper program, or other error
  ...

The second mount passed "name=foo,cpuset" to the parser, then it hit
the retry path and called the parser again, but this time the string
passed to the parser was "name=foo".

To fix this, we avoid calling parse_cgroupfs_options() again in this
case.

Signed-off-by: Li Zefan 
---
 kernel/cgroup.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 08c4439..9d6be07 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1580,7 +1580,7 @@ static struct dentry *cgroup_mount(struct 
file_system_type *fs_type,
 */
if (!use_task_css_set_links)
cgroup_enable_task_cg_lists();
-retry:
+
mutex_lock(&cgroup_tree_mutex);
mutex_lock(&cgroup_mutex);
 
@@ -1588,7 +1588,7 @@ retry:
ret = parse_cgroupfs_options(data, &opts);
if (ret)
goto out_unlock;
-
+retry:
/* look for a matching existing root */
if (!opts.subsys_mask && !opts.none && !opts.name) {
cgrp_dfl_root_visible = true;
@@ -1647,9 +1647,9 @@ retry:
if (!atomic_inc_not_zero(&root->cgrp.refcnt)) {
mutex_unlock(&cgroup_mutex);
mutex_unlock(&cgroup_tree_mutex);
-   kfree(opts.release_agent);
-   kfree(opts.name);
msleep(10);
+   mutex_lock(&cgroup_tree_mutex);
+   mutex_lock(&cgroup_mutex);
goto retry;
}
 
-- 
1.8.0.2
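
The failure mode described in this patch can be reproduced in userspace,
assuming the option parser tokenizes its input in place with
strsep()-style semantics. count_options() below is a hypothetical stand-in
for such a parser, not the kernel function.

```c
#define _DEFAULT_SOURCE /* for strsep() on glibc */
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for an option parser that tokenizes in place.
 * Returns the number of non-empty comma-separated tokens seen in buf.
 */
static int count_options(char *buf)
{
	char *cursor = buf;
	char *token;
	int n = 0;

	assert(buf != NULL);

	/* strsep() overwrites each ',' with '\0' as it walks the buffer. */
	while ((token = strsep(&cursor, ",")) != NULL) {
		if (*token)
			n++;
	}
	return n;
}
```

Calling count_options() twice on the same writable buffer holding
"name=foo,cpuset" shows the problem: the first pass sees both options, and
the second pass sees only "name=foo" because the first comma is gone. Parsing
once and jumping back to a point after the parse, as the patch does, avoids
the second pass entirely.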



Re: in kernel 2.6.x, tun/tap nic supports vlan packets

2014-04-16 Thread zhuyj

On 04/17/2014 01:02 PM, Willy Tarreau wrote:
> Hi Zhu,
>
> On Thu, Apr 17, 2014 at 11:35:58AM +0800, zhuyj wrote:
> > Hi, all
> >
> > In kernel 2.6.x, linux depends on nic vlan hardware acceleration to
> > insert/extract vlan tag. In this scene, in kernel 2.6.x
> >
> >                  _____    ________
> >   A             |     | B|        | C
> >  vlan packets-->| tap |->|vlan nic|--->
> >                 |_____|  |________|
> >
> > We hope vlan packets pass through tap and vlan nic from A to c.
> > But in kernel 2.6.x, linux kernel can not extract vlan tag. It depends
> > on nic vlan hardware acceleration. It is well known that tap nic has no
> > vlan acceleration. So in the above scene, vlan packets can not be handled by
> > tap nic. These vlan packets will be discarded in B. They can not arrive
> > at C.
>
> It's not clear to me what you want to achieve. Are you trying to create
> vlan interfaces on top of a tap interface ? Eg: tap1.12, tap1.23 etc ?

Hi, Willy

Yes. These 2 patches are trying to create vlan interfaces on top of a tap
interface.

Zhu Yanjun

> > In kernel 3.x, linux can handle vlan packets. It does not depend on nic vlan
> > hardware acceleration. So the above scene can work well in kernel 3.x.
> >
> > To resolve the above in kernel 2.6.x, we simulated vlan hardware
> > acceleration in the tun/tap driver. Then we followed the logic of commit
> > 4fba4ca4 [vlan: Centralize handling of hardware acceleration] to modify the
> > vlan packet processing in kernel 2.6.x. In the end, the above scene can work
> > well in patched kernel 2.6.x.
> >
> > Please comment on it. Any reply is appreciated.
> >
> > Hi, Willy
> >
> > These 2 patches are for linux2.6.x. These can work well here. Please
> > help to merge them into linux 2.6.32.x. Thanks a lot.
>
> Well, 2.6.32.x is in deep freeze mode and it receives only critical fixes
> once in a while. While I can appreciate that the patch above might solve
> the issue you're facing, I'm wondering if there are not any acceptable
> workarounds for such a deep freeze kernel. Your patch is not huge, but it
> definitely affects a working driver, and I wouldn't like to risk breaking
> the tap driver for other users, and I really don't have the skills to audit
> it completely to ensure this is not the case. And if it breaks, I'll have
> to revert it or seek some help on netdev.
>
> So I'd say that I'd rather not merge it unless I get an Acked-by from some
> netdev people who are willing to help in case of any future regression,
> which is unlikely but still possible.
>
> Just out of curiosity, what is the motivation for ongoing development on
> top of 2.6.32 ? Are there any important deployments that cannot upgrade
> for any specific reason ? I'm asking because most 2.6.32.x kernels that
> are stuffed into embedded boxes very likely come with their own number
> of in-house patches to add whatever feature is needed in such contexts,
> so I'm wondering why having this patch in mainline would help in your
> situation compared to having it into your own patch set only.
>
> Thanks,
> Willy






Re: [Nfs-ganesha-devel] should we change the name/macros of file-private locks?

2014-04-16 Thread Michael Kerrisk (man-pages)
On 04/17/2014 02:31 AM, Jim Lieb wrote:
> On Wednesday, April 16, 2014 13:16:33 Jeremy Allison wrote:
>> On Wed, Apr 16, 2014 at 10:00:46PM +0200, Michael Kerrisk (man-pages) wrote:
>>> [CC += Jeremy Allison]
>>>
>>> On Wed, Apr 16, 2014 at 8:57 PM, Jeff Layton  wrote:
>>>> Sorry to spam so many lists, but I think this needs widespread
>>>> distribution and consensus.
>>>>
>>>> File-private locks have been merged into Linux for v3.15, and *now*
>>>> people are commenting that the name and macro definitions for the new
>>>> file-private locks suck.
>>>>
>>>> ...and I can't even disagree. They do suck.
>>>>
>>>> We're going to have to live with these for a long time, so it's
>>>> important that we be happy with the names before we're stuck with them.
>>>
>>> So, to add my perspective: The existing byte-range locking system has
>>> persisted (despite egregious faults) for well over two decades. One
>>> supposes that Jeff's new improved version might be around
>>> at least as long. With that in mind, and before setting in stone (and
>>> pushing into POSIX) a model of thinking that thousands of programmers
>>> will live with for a long time, it's worth thinking about names.
>>>
>>>> Michael Kerrisk suggested several names but I think the only one that
>>>> doesn't have other issues is "file-associated locks", which can be
>>>> distinguished against "process-associated" locks (aka classic POSIX
>>>> locks).
>>>
>>> The names I have suggested are:
>>> file-associated locks
>>>
>>> or
>>>
>>>file-handle locks
>>>
>>> or (using POSIX terminology)
>>>
>>> file-description locks
>>
>> Thanks for the CC: Michael, but to be honest
>> I don't really care what the name is, I just
>> want the functionality. I can change our build
>> system to cope with detecting it under any name
>> you guys choose :-).
>>
>> Cheers,
>>
>>  Jeremy.
> 
> I and the rest of the nfs-ganesha community are with Jeremy and samba wrt 
> names.  We just want locks that work, i.e. Useful Locks ;)

Yes, sure. The functionality is coming in any case, thanks to Jeff.
The point is: let's make the API as sane as we can. And that's what
this thread is about, so if you have insights or opinions on good 
naming, that would be helpful.

Thanks,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


Re: [RFC PATCH] sched: let task migration destination cpu do active balance

2014-04-16 Thread Alex Shi
On 04/16/2014 08:13 PM, Peter Zijlstra wrote:
> On Wed, Apr 16, 2014 at 07:34:29PM +0800, Alex Shi wrote:
>> Chris Redpath found an issue on active balance:
>> We let the task source cpu, the busiest cpu, do the active balance,
>> while the destination cpu may be idle. Thus we take the busiest cpu's
>> time, but leave the idlest cpu waiting. That is not good for performance.
>>
>> This patch lets the destination cpu do active balance. It will give tasks
>> more running time.
>>
>> Signed-off-by: Alex Shi 
>> ---
>>  kernel/sched/fair.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 9b4c4f3..cccee76 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -6308,7 +6308,7 @@ more_balance:
>>  raw_spin_unlock_irqrestore(&busiest->lock, flags);
>>  
>>  if (active_balance) {
>> -stop_one_cpu_nowait(cpu_of(busiest),
>> +stop_one_cpu_nowait(busiest->push_cpu,
>>  active_load_balance_cpu_stop, busiest,
>>  &busiest->active_balance_work);
>>  }
> 
> This doesn't make sense, the whole point of active balance is that we're
> going to move current, for that to work we have to interrupt the CPU
> current is running on and make sure another task (the stopper task in
> this case) is running, so that the previous current is now a !running
> task and we can move it around.
> 

Sure, you are right. Thanks for the correction!

-- 
Thanks
Alex


Re: [PATCH 1/8] extcon: Add resource-managed extcon register function

2014-04-16 Thread Chanwoo Choi
Hi Sangjung,

Thanks for your contribution.

On 04/16/2014 07:26 PM, Sangjung Woo wrote:
> Add resource-managed extcon device register function for convenience.
> For example, if a extcon device is attached with new
> devm_extcon_dev_register(), that extcon device is automatically
> unregistered on driver detach.
> 
> Signed-off-by: Sangjung Woo 
> ---
>  drivers/extcon/extcon-class.c |   83 
> +
>  include/linux/extcon.h|8 
>  2 files changed, 91 insertions(+)
> 
> diff --git a/drivers/extcon/extcon-class.c b/drivers/extcon/extcon-class.c
> index 7ab21aa..accb49c 100644
> --- a/drivers/extcon/extcon-class.c
> +++ b/drivers/extcon/extcon-class.c
> @@ -32,6 +32,7 @@
>  #include 
>  #include 
>  #include 
> +#include 

It is not necessary because 'device.h' already includes 'gfp.h' header file.

>  
>  /*
>   * extcon_cable_name suggests the standard cable names for commonly used
> @@ -819,6 +820,88 @@ void extcon_dev_unregister(struct extcon_dev *edev)
>  }
>  EXPORT_SYMBOL_GPL(extcon_dev_unregister);
>  
> +
> +/*
> + * Device resource management
> + */
> +

Should delete blank line.

> +struct extcon_devres {
> + struct extcon_dev *edev;
> +};
> +
> +static void devm_extcon_release(struct device *dev, void *res)

The function name needs to change as follows, to stay consistent with the
existing extcon functions:
- devm_extcon_release -> devm_extcon_dev_release()

> +{
> + struct extcon_devres *dr = (struct extcon_devres *)res;
> +
> + extcon_dev_unregister(dr->edev);
> +}
> +
> +static int devm_extcon_match(struct device *dev, void *res, void *data)

ditto.
- devm_extcon_match -> devm_extcon_dev_match

> +{
> + struct extcon_devres *dr = (struct extcon_devres *)res;
> + struct extcon_devres *match = (struct extcon_devres *)data;

I think that this function doesn't need explicit casts because, as far as
I know, the compiler converts void * to the target pointer type automatically.
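
A small standalone sketch of that point: in C (unlike C++), a void * converts
to any object pointer type without a cast. The model_* types and
model_match() are hypothetical stand-ins with the same shape as the callback
in the quoted patch, not the extcon code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the structures in the quoted patch. */
struct model_edev {
	int id;
};

struct model_devres {
	struct model_edev *edev;
};

/* Same shape as a match callback: both arguments arrive as void *. */
static bool model_match(void *res, void *data)
{
	/* No cast needed: C converts void * to any object pointer type. */
	struct model_devres *dr = res;
	struct model_devres *match = data;

	assert(dr != NULL && match != NULL);
	return dr->edev == match->edev;
}
```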

> +
> + return dr->edev == match->edev;
> +}
> +
> +/**
> + * devm_extcon_dev_register() - Resource-managed extcon_dev_register()
> + * @dev: device to allocate extcon device
> + * @edev:the new extcon device to register
> + *
> + * Managed extcon_dev_register() function. If extcon device is attached with
> + * this function, that extcon device is automatically unregistered on driver
> + * detach. Internally this function calls extcon_dev_register() function.
> + * To get more information, refer that function.
> + *
> + * If extcon device is registered with this function and the device needs to 
> be
> + * unregistered separately, devm_extcon_dev_unregister() should be used.
> + *
> + * RETURNS:
> + * 0 on success, negative error number on failure.
> + */
> +int devm_extcon_dev_register(struct device *dev, struct extcon_dev *edev)
> +{
> + struct extcon_devres *dr;

To improve readability, I prefer to change the 'dr' variable name
(e.g., dr -> devres).

> + int rc;

I think 'rc' variable name is ambiguous.
I prefer to change variable name for return value. (rc -> ret)

> +
> + dr = devres_alloc(devm_extcon_release, sizeof(struct extcon_devres),

ditto.
- devm_extcon_release -> devm_extcon_dev_release

We can modify it as follows:
sizeof(struct extcon_devres) -> sizeof(*dr)

> + GFP_KERNEL);
> + if (!dr)
> + return -ENOMEM;
> +
> + rc = extcon_dev_register(edev);
> + if (rc) {
> + devres_free(dr);
> + return rc;
> + }
> +
> + dr->edev = edev;
> + devres_add(dev, dr);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(devm_extcon_dev_register);
> +
> +/**
> + * devm_extcon_dev_unregister() - Resource-managed extcon_dev_unregister()
> + * @dev: device the extcon belongs to
> + * @edev:the extcon device to unregister
> + *
> + * Unregister extcon device that is registered with 
> devm_extcon_dev_register()
> + * function.
> + */
> +void devm_extcon_dev_unregister(struct device *dev, struct extcon_dev *edev)
> +{
> + struct extcon_devres match_dr = { edev };

Should we define 'match_dr' variable? I think it is not necessary.
Maybe it could use 'edev' directly without casting.

> +
> + WARN_ON(devres_destroy(dev, devm_extcon_release,
> + devm_extcon_match, &match_dr));

I think that devres_release() is more appropriate than devres_destroy().

> +
> + extcon_dev_unregister(edev);

If you use devres_release() instead of devres_destroy(), you don't need to
call extcon_dev_unregister() separately because devres_release() would call
the 'release' function.
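
The difference between the two removal styles can be sketched with a
one-slot userspace model. Everything here is hypothetical (struct model_res
and the model_* functions are not the driver-core API); it only mirrors the
documented behaviour that a destroy-style removal skips the release callback
while a release-style removal runs it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-slot model of a device's managed-resource list. */
struct model_res {
	bool present;
	int *unregister_calls; /* bumped each time the release callback runs */
};

/* Plays the role of the release callback that unregisters the device. */
static void model_release_cb(struct model_res *res)
{
	assert(res->unregister_calls != NULL);
	(*res->unregister_calls)++;
}

/* Destroy-style removal: drop the entry without running the callback. */
static int model_destroy(struct model_res *res)
{
	if (!res->present)
		return -1;
	res->present = false;
	return 0;
}

/* Release-style removal: run the callback, then drop the entry. */
static int model_release(struct model_res *res)
{
	if (!res->present)
		return -1;
	model_release_cb(res);
	res->present = false;
	return 0;
}
```

With the destroy style the caller has to unregister by hand afterwards, as
the quoted patch does; with the release style the callback already did it, so
a second explicit unregister would be redundant.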

> +}
> +EXPORT_SYMBOL_GPL(devm_extcon_dev_unregister);
> +
>  #ifdef CONFIG_OF
>  /*
>   * extcon_get_edev_by_phandle - Get the extcon device from devicetree
> diff --git a/include/linux/extcon.h b/include/linux/extcon.h
> index f488145..e1e85a1 100644
> --- a/include/linux/extcon.h
> +++ b/include/linux/extcon.h
> @@ -188,6 +188,14 @@ extern void extcon_dev_unregister(struct extcon_dev 
> *edev);
>  extern 

Re: linux-next: manual merge of the userns tree with Linus' tree

2014-04-16 Thread Al Viro
On Thu, Apr 17, 2014 at 03:06:57PM +1000, Stephen Rothwell wrote:
> Hi Eric,
> 
> Today's linux-next merge of the userns tree got a conflict in
> fs/namespace.c between various commits from Linus' tree and various
> commits from the userns tree.
> 
> I fixed it up (hopefully - see below) and can carry the fix as necessary
> (no action is required).

Various commits include this:
commit 38129a13e6e71f666e0468e99fdd932a687b4d7e
Author: Al Viro 
Date:   Thu Mar 20 21:10:51 2014 -0400

switch mnt_hash to hlist

present in v3.14...  It's been there since before the merge window.


Re: [PATCH] mtd: fsl-quadspi: fix __iomem annotations

2014-04-16 Thread Brian Norris
On Wed, Apr 16, 2014 at 05:21:43PM +0800, Huang Shijie wrote:
> On 2014-04-16 17:15, Brian Norris wrote:
> >Pushed to l2-mtd.git/spinor.
> I think you can rebase patches in the spinor branch to the master
> branch now.

I merged spinor into master for now, so it will be in -next. I think
I'll keep the separate branch, in case we need something stable for
others to pull in.

> After the rebase, we can add the patch for the *defconfig for the
> linux-next.

Yeah, I guess we should send patches. I'll try to do that soon.

Brian


linux-next: build failure after merge of the userns tree

2014-04-16 Thread Stephen Rothwell
Hi Eric,

After merging the userns tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

fs/namespace.c: In function 'new_mountpoint':
fs/namespace.c:725:9: error: implicit declaration of function 'hash' [-Werror=implicit-function-declaration]
  struct list_head *chain = mountpoint_hashtable + hash(NULL, dentry);
         ^
fs/namespace.c:725:28: warning: initialization from incompatible pointer type [enabled by default]
  struct list_head *chain = mountpoint_hashtable + hash(NULL, dentry);
                            ^
fs/namespace.c:741:2: warning: passing argument 2 of 'hlist_add_head' from incompatible pointer type [enabled by default]
  hlist_add_head(&mp->m_hash, chain);
  ^
In file included from include/linux/signal.h:4:0,
                 from include/linux/syscalls.h:72,
                 from fs/namespace.c:11:
include/linux/list.h:637:20: note: expected 'struct hlist_head *' but argument is of type 'struct list_head *'
 static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
                    ^

So clearly my merge conflict resolution was not sufficient.

I will just drop the userns tree for today.  Please give me some help
with the resolutions - or fix this stuff up yourselves.
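
For readers unfamiliar with the two list flavours behind the type mismatch
above, here is a minimal userspace sketch. The structure layouts and
hlist_add_head() follow the well-known kernel definitions but are simplified
copies for illustration, not the contents of include/linux/list.h.

```c
#include <assert.h>
#include <stddef.h>

/* Doubly linked circular list head: two pointers per bucket. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hash-list node and head: the head is a single pointer per bucket. */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Same signature as the prototype quoted in the compiler note above. */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}
```

Because struct hlist_head and struct list_head are unrelated types, a chain
pointer declared as struct list_head * cannot be handed to hlist_add_head(),
which is what the second and third diagnostics complain about.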

-- 
Cheers,
Stephen Rothwell    s...@canb.auug.org.au




linux-next: manual merge of the userns tree with Linus' tree

2014-04-16 Thread Stephen Rothwell
Hi Eric,

Today's linux-next merge of the userns tree got a conflict in
fs/namespace.c between various commits from Linus' tree and various
commits from the userns tree.

I fixed it up (hopefully - see below) and can carry the fix as necessary
(no action is required).

-- 
Cheers,
Stephen Rothwell    s...@canb.auug.org.au

diff --cc fs/namespace.c
index 182bc41cd887,128c051041be..
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@@ -667,13 -632,47 +668,47 @@@ struct vfsmount *lookup_mnt(struct pat
return m;
  }
  
- static struct mountpoint *new_mountpoint(struct dentry *dentry)
+ /*
+  * __is_local_mountpoint - Test to see if dentry is a mountpoint in the
+  * current mount namespace.
+  *
+  * The common case is dentries are not mountpoints at all and that
+  * test is handled inline.  For the slow case when we are actually
+  * dealing with a mountpoint of some kind, walk through all of the
+  * mounts in the current mount namespace and test to see if the dentry
+  * is a mountpoint.
+  *
+  * The mount_hashtable is not usable in the context because we
+  * need to identify all mounts that may be in the current mount
+  * namespace not just a mount that happens to have some specified
+  * parent mount.
+  */
+ bool __is_local_mountpoint(struct dentry *dentry)
+ {
+   struct mnt_namespace *ns = current->nsproxy->mnt_ns;
+   struct mount *mnt;
+   bool is_covered = false;
+ 
+   if (!d_mountpoint(dentry))
+   goto out;
+ 
+   down_read(&namespace_sem);
+   list_for_each_entry(mnt, &ns->list, mnt_list) {
+   is_covered = (mnt->mnt_mountpoint == dentry);
+   if (is_covered)
+   break;
+   }
+   up_read(&namespace_sem);
+ out:
+   return is_covered;
+ }
+ 
+ static struct mountpoint *lookup_mountpoint(struct dentry *dentry)
  {
 -  struct list_head *chain = mountpoint_hashtable + hash(NULL, dentry);
 +  struct hlist_head *chain = mp_hash(dentry);
struct mountpoint *mp;
-   int ret;
  
 -  list_for_each_entry(mp, chain, m_hash) {
 +  hlist_for_each_entry(mp, chain, m_hash) {
if (mp->m_dentry == dentry) {
/* might be worth a WARN_ON() */
if (d_unlinked(dentry))
@@@ -695,7 -702,8 +738,8 @@@ static struct mountpoint *new_mountpoin
  
mp->m_dentry = dentry;
mp->m_count = 1;
 -  list_add(&mp->m_hash, chain);
 +  hlist_add_head(&mp->m_hash, chain);
+   INIT_LIST_HEAD(&mp->m_list);
return mp;
  }
  
@@@ -748,7 -757,8 +793,8 @@@ static void detach_mnt(struct mount *mn
mnt->mnt_parent = mnt;
mnt->mnt_mountpoint = mnt->mnt.mnt_root;
list_del_init(&mnt->mnt_child);
 -  list_del_init(&mnt->mnt_hash);
 +  hlist_del_init_rcu(&mnt->mnt_hash);
+   list_del_init(&mnt->mnt_mp_list);
put_mountpoint(mnt->mnt_mp);
mnt->mnt_mp = NULL;
  }
@@@ -936,9 -943,35 +983,25 @@@ static struct mount *clone_mnt(struct m
return ERR_PTR(err);
  }
  
 -static void delayed_free(struct rcu_head *head)
 -{
 -  struct mount *mnt = container_of(head, struct mount, mnt_rcu);
 -  kfree(mnt->mnt_devname);
 -#ifdef CONFIG_SMP
 -  free_percpu(mnt->mnt_pcp);
 -#endif
 -  kmem_cache_free(mnt_cache, mnt);
 -}
 -
+ static void cleanup_mnt(struct mount *mnt)
+ {
+   fsnotify_vfsmount_delete(&mnt->mnt);
+   dput(mnt->mnt.mnt_root);
+   deactivate_super(mnt->mnt.mnt_sb);
+   mnt_free_id(mnt);
+   complete(mnt->mnt_undone);
 -  call_rcu(&mnt->mnt_rcu, delayed_free);
++  call_rcu(&mnt->mnt_rcu, delayed_free_vfsmnt);
+ }
+ 
+ static void cleanup_mnt_work(struct work_struct *work)
+ {
+   cleanup_mnt(container_of(work, struct mount, mnt_cleanup_work));
+ }
+ 
  static void mntput_no_expire(struct mount *mnt)
  {
- put_again:
+   struct completion undone;
+ 
rcu_read_lock();
mnt_add_count(mnt, -1);
if (likely(mnt->mnt_ns)) { /* shouldn't be the last one */




Re: in kernel 2.6.x, tun/tap nic supports vlan packets

2014-04-16 Thread Willy Tarreau
Hi Zhu,

On Thu, Apr 17, 2014 at 11:35:58AM +0800, zhuyj wrote:
> Hi, all
> 
> In kernel 2.6.x, linux depends on nic vlan hardware acceleration to 
> insert/extract
> vlan tag. In this scene, in kernel 2.6.x
> 
>                  _____    ________
>   A             |     | B|        | C
>  vlan packets-->| tap |->|vlan nic|--->
>                 |_____|  |________|
> 
> We hope vlan packets pass through tap and vlan nic from A to c.
> But in kernel 2.6.x, linux kernel can not extract vlan tag. It depends
> on nic vlan hardware acceleration. It is well known that tap nic has no
> vlan acceleration. So in the above scene, vlan packets can not be handled by
> tap nic. These vlan packets will be discarded in B. They can not arrive 
> at C.

It's not clear to me what you want to achieve. Are you trying to create
vlan interfaces on top of a tap interface ? Eg: tap1.12, tap1.23 etc ?
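
For reference, extracting the tag in software amounts to reading four bytes
after the two MAC addresses. The sketch below is a standalone userspace
illustration of the 802.1Q layout, not the tun/tap or vlan driver code;
vlan_id_of() and the macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_HDR_LEN 14      /* dst MAC + src MAC + EtherType */
#define FRAME_TAG_LEN 4       /* TPID + TCI */
#define TPID_8021Q    0x8100u /* EtherType value that marks an 802.1Q tag */

/*
 * Returns the 12-bit VLAN ID of a tagged Ethernet frame, or -1 if the
 * frame is too short or carries no 802.1Q tag.
 */
static int vlan_id_of(const uint8_t *frame, size_t len)
{
	uint16_t tpid, tci;

	assert(frame != NULL);
	if (len < FRAME_HDR_LEN + FRAME_TAG_LEN)
		return -1;

	/* Bytes 12-13 hold the TPID, bytes 14-15 the TCI, both big-endian. */
	tpid = (uint16_t)((frame[12] << 8) | frame[13]);
	tci = (uint16_t)((frame[14] << 8) | frame[15]);
	if (tpid != TPID_8021Q)
		return -1;

	return tci & 0x0fff; /* low 12 bits are the VLAN ID */
}
```

A frame destined for an interface such as tap1.12 would carry VLAN ID 12 in
those low 12 bits.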

> In kernel 3.x, linux can handle vlan packets. It does not depend on nic vlan
> hardware acceleration. So the above scene can work well in kernel 3.x.
> 
> To resolve the above in kernel 2.6.x, we simulated vlan hardware
> acceleration in the tun/tap driver. Then we followed the logic of commit
> 4fba4ca4 [vlan: Centralize handling of hardware acceleration] to modify the
> vlan packet processing in kernel 2.6.x. In the end, the above scene can work
> well in patched kernel 2.6.x.
> 
> Please comment on it. Any reply is appreciated.
> 
> Hi, Willy
> 
> These 2 patches are for linux2.6.x. These can work well here. Please
> help to merge them into linux 2.6.32.x. Thanks a lot.

Well, 2.6.32.x is in deep freeze mode and it receives only critical fixes
once in a while. While I can appreciate that the patch above might solve
the issue you're facing, I'm wondering if there are not any acceptable
workarounds for such a deep freeze kernel. Your patch is not huge, but it
definitely affects a working driver, and I wouldn't like to risk breaking
the tap driver for other users, and I really don't have the skills to audit
it completely to ensure this is not the case. And if it breaks, I'll have
to revert it or seek some help on netdev.

So I'd say that I'd rather not merge it unless I get an Acked-by from some
netdev people who are willing to help in case of any future regression,
which is unlikely but still possible.

Just out of curiosity, what is the motivation for ongoing development on
top of 2.6.32 ? Are there any important deployments that cannot upgrade
for any specific reason ? I'm asking because most 2.6.32.x kernels that
are stuffed into embedded boxes very likely come with their own number
of in-house patches to add whatever feature is needed in such contexts,
so I'm wondering why having this patch in mainline would help in your
situation compared to having it into your own patch set only.

Thanks,
Willy



Re: [PATCH v2] ARM: dts: imx6q-gk802: Enable HDMI

2014-04-16 Thread Shawn Guo
On Wed, Apr 16, 2014 at 11:23:50PM +0200, Philipp Zabel wrote:
> Signed-off-by: Philipp Zabel 
> ---
> Changes since v1:
>  - Reordered ddc-i2c-bus and status properties
> ---
>  arch/arm/boot/dts/imx6q-gk802.dts | 5 +
>  1 file changed, 5 insertions(+)

Applied.  It should be sent to LAKML rather than LKML, though.

Shawn

> 
> diff --git a/arch/arm/boot/dts/imx6q-gk802.dts 
> b/arch/arm/boot/dts/imx6q-gk802.dts
> index 4a9b4dc..0f0c50b 100644
> --- a/arch/arm/boot/dts/imx6q-gk802.dts
> +++ b/arch/arm/boot/dts/imx6q-gk802.dts
> @@ -48,6 +48,11 @@
>   };
>  };
>  
> + {
> + ddc-i2c-bus = <>;
> + status = "okay";
> +};
> +
>  /* Internal I2C */
>   {
>   pinctrl-names = "default";
> -- 
> 1.9.1
> 
> 
> 



Re: [PATCH 04/19] Make effect of PF_FSTRANS to disable __GFP_FS universal.

2014-04-16 Thread Dave Chinner
On Thu, Apr 17, 2014 at 11:03:50AM +1000, NeilBrown wrote:
> On Wed, 16 Apr 2014 16:17:26 +1000 NeilBrown  wrote:
> 
> > On Wed, 16 Apr 2014 15:37:56 +1000 Dave Chinner  wrote:
> > 
> > > On Wed, Apr 16, 2014 at 02:03:36PM +1000, NeilBrown wrote:
> 
> > > > -   /*
> > > > -* Given that we do not allow direct reclaim to call us, we should
> > > > -* never be called while in a filesystem transaction.
> > > > -*/
> > > > -   if (WARN_ON(current->flags & PF_FSTRANS))
> > > > -   goto redirty;
> > > 
> > > We still need to ensure this rule isn't broken. If it is, the
> > > filesystem will silently deadlock in delayed allocation rather than
> > > gracefully handle the problem with a warning
> > 
> > Hmm... that might be tricky.  The 'new' PF_FSTRANS can definitely be set when
> > xfs_vm_writepage is called and we really want the write to happen.
> > I don't suppose there is any other way to detect if a transaction is
> > happening?
> 
> I've been thinking about this some more
> 
> That code is in xfs_vm_writepage which is only called as ->writepage.
> xfs never calls that directly so it could only possibly be called during
> reclaim?

__filemap_fdatawrite_range or __writeback_single_inode
  do_writepages
    ->writepages
      xfs_vm_writepages
        write_cache_pages
          ->writepage
            xfs_vm_writepage

So explicit data flushes or background writeback still end up in
xfs_vm_writepage.

> We know that doesn't happen, but if it does then PF_MEMALLOC would be set,
> but PF_KSWAPD would not... and you already have a test for that.
> 
> How about every time we set PF_FSTRANS, we store the corresponding
> xfs_trans_t in current->journal_info, and clear that field when PF_FSTRANS is
> cleared.  Then xfs_vm_writepage can test for current->journal_info being
> clear.
> That is the field that several other filesystems use to keep track of the
> 'current' transaction.

The difference is that we have an explicit transaction handle in XFS
which defines the transaction context. i.e. we don't hide
transactions in thread contexts - the transaction defines the atomic
context of the modification being made.

> I don't know what xfs_trans_t we would use in
> xfs_bmapi_allocate_worker, but I suspect you do :-)

The same one we use now.

But that's exactly my point.  i.e. the transaction handle belongs to
the operation being executed, not the thread that is currently
executing it.  We also hand transaction contexts to IO completion,
do interesting things with log space reservations for operations
that require multiple commits to complete and so pass state when
handles are duplicated prior to commit, etc. We still need direct
manipulation and control of the transaction structure, regardless of
where it is stored.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
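
The distinction drawn above, between a transaction found through the
running thread and a transaction carried by the operation itself, can be
sketched in a few lines of userspace C. This is only an illustration under
stated assumptions: struct trans, struct thread_ctx, struct alloc_work and
both lookup helpers are hypothetical names, not the XFS structures or any
kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not the real XFS structures. */
struct trans {
	int id;
};

struct thread_ctx {
	struct trans *journal_info;	/* thread-scoped "current" transaction */
};

/* A unit of work that carries its own transaction handle. */
struct alloc_work {
	struct trans *tp;
};

/* Thread-scoped lookup: only works on the thread that owns the pointer. */
static int trans_id_from_thread(const struct thread_ctx *t)
{
	return t->journal_info ? t->journal_info->id : -1;
}

/* Handle-scoped lookup: works on whichever thread runs the work item. */
static int trans_id_from_work(const struct alloc_work *w)
{
	assert(w != NULL && w->tp != NULL);
	return w->tp->id;
}
```

If the work item is handed to a worker thread whose journal_info is NULL,
trans_id_from_thread() on the worker finds nothing, while
trans_id_from_work() still reaches the transaction. That is the sense in
which the handle belongs to the operation rather than to the thread.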


linux-next: manual merge of the staging tree with the staging.current tree

2014-04-16 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the staging tree got conflicts in
drivers/staging/rtl8723au/core/rtw_ieee80211.c,
drivers/staging/rtl8723au/core/rtw_mlme_ext.c,
drivers/staging/rtl8723au/core/rtw_p2p.c and
drivers/staging/rtl8723au/core/rtw_wlan_util.c between commit
f5d197b614d8 ("staging: rtl8723au: Fix buffer overflow in
rtw_get_wfd_ie()") from the staging.current tree and various commits from
the staging tree.

I fixed it up (I just used the version from the staging tree) and can
carry the fix as necessary (no action is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [PATCH/RFC 00/19] Support loop-back NFS mounts

2014-04-16 Thread Dave Chinner
On Thu, Apr 17, 2014 at 11:50:18AM +1000, NeilBrown wrote:
> On Thu, 17 Apr 2014 11:27:39 +1000 Dave Chinner  wrote:
> 
> > On Thu, Apr 17, 2014 at 10:20:48AM +1000, NeilBrown wrote:
> > > A good example is the deadlock with the flush-* threads.
> > > > flush-* will lock a page, and then call ->writepage.  If ->writepage
> > > > allocates memory it can enter reclaim, call ->releasepage on NFS, and
> > > > block waiting for a COMMIT to complete.
> > > > The COMMIT might already be running, performing fsync on that same
> > > > file that flush-* is flushing.  It locks each page in turn.  When it
> > > > gets to the page that flush-* has locked, it will deadlock.
> > 
> > It's nfs_release_page() again
> > 
> > > In general, if nfsd is allowed to block on local filesystem, and local
> > > filesystem is allowed to block on NFS, then a deadlock can happen.
> > > We would need a clear hierarchy
> > > 
> > >__GFP_NETFS > __GFP_FS > __GFP_IO
> > > 
> > > > for it to work.  I'm not sure the extra level really helps a lot and
> > > > it would be a lot of churn.
> > 
> > I think you are looking at this the wrong way - it's not the other
> > filesystems that have to avoid memory reclaim recursion, it's the
> > NFS client mount that is on loopback that needs to avoid recursion.
> > 
> > > IMO, the fix should be that the NFS client cannot block on messages
> > > sent to the NFSD on the same host during memory reclaim. That is,
> > > nfs_release_page() cannot send commit messages to the server if the
> > > server is on localhost. Instead, it just tells memory reclaim that it
> > > can't reclaim that page.
> > > 
> > > If nfs_release_page() no longer blocks in memory reclaim, then all
> > > these nfsd-gets-blocked-in-GFP_KERNEL-memory-allocation recursion
> > > problems go away. Do the same for all the other memory reclaim
> > > operations in the NFS client, and you've got a solution that should
> > > work without needing to walk all over the rest of the kernel.
> 
> Maybe.
> It is nfs_release_page() today. I wonder if it could be other things another
> day.  I want to be sure I have a solution that really makes sense.

There could be other things, but in the absence of those things,
I don't think that adding another layer to memory reclaim
dependencies for this niche corner case makes a lot of sense. ;)

> However ... the thing that nfs_release_page is doing is sending a COMMIT to
> tell the server to flush to stable storage.  It does that so that if the
> server crashes, then the client can re-send.
> Of course when it is a loop-back mount the client is the server so the COMMIT
> is completely pointless.  If the client notices that it is sending a COMMIT
> to itself, it can simply assume a positive reply.

Yes, that's very true. You might have to treat ->writepage
specially, too, if that can block, say, on the number of outstanding
requests that can be sent to the server.

> You are right, that would make the patch set a lot less intrusive.  I'll give
> it some serious thought - thanks.

No worries. :)

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [raid5] kernel BUG at drivers/md/raid5.c:4374!

2014-04-16 Thread NeilBrown
On Thu, 17 Apr 2014 11:59:59 +0800 Fengguang Wu wrote:

> Shaohua,
> 
> We noticed the below BUG on
> 
> commit e240c1839d11152b0355442f8ac6d2d2d921be36 ("raid5: get_active_stripe avoids device_lock")
> 
> test case: lkp-ws02/micro/dd-write/11HDD-RAID5-cfq-ext4-10dd

Thanks.  We know about this.  I really should push that patch out.
Sorry.

NeilBrown



> 
> 27c0f68f0745218  e240c1839d11152b0355442f8
> ---------------  -------------------------
>          0                +Inf%          1 ~ 0%  TOTAL dmesg.kernel_BUG_at_drivers/md/raid5.c
>          0                +Inf%          1 ~ 0%  TOTAL dmesg.invalid_opcode
>          0                +Inf%          1 ~ 0%  TOTAL dmesg.RIP:handle_active_stripes
>          0                +Inf%          1 ~ 0%  TOTAL dmesg.Kernel_panic-not_syncing:Fatal_exception
> 
> Legend:
>   ~XX%- stddev percent
>   [+-]XX% - change percent
> 
> [  264.260444] kernel BUG at drivers/md/raid5.c:4374!
> [  264.267590] invalid opcode:  [#1] SMP
> [  264.272076] Modules linked in: btrfs microcode ipmi_si ipmi_msghandler acpi_cpufreq processor
> [  264.281514] CPU: 0 PID: 4005 Comm: md0_raid5 Not tainted 3.15.0-rc1-00611-g2e76799 #1
> [  264.289823] Hardware name: Supermicro X8DTN/X8DTN, BIOS 4.6.3 01/06/2010
> [  264.296789] task: 8804151e41a0 ti: 88041672c000 task.ti: 88041672c000
> [  264.304750] RIP: 0010:[]  [] handle_active_stripes.isra.24+0x254/0x360
> [  264.314951] RSP: 0018:88041672dd10  EFLAGS: 00010002
> [  264.320527] RAX: 88021dc46000 RBX: 880220a4e000 RCX: 880220a4e080
> [  264.327926] RDX: 0001 RSI: 88021e0d7010 RDI: 880220a4e000
> [  264.335325] RBP: 88041672dda8 R08:  R09:
> [  264.342724] R10:  R11: ef58 R12:
> [  264.350131] R13: 880220a4e080 R14:  R15: 880220a4e268
> [  264.357530] FS:  () GS:880237c0() knlGS:
> [  264.366099] CS:  0010 DS:  ES:  CR0: 8005003b
> [  264.372111] CR2: 01e21d64 CR3: 0200f000 CR4: 07f0
> [  264.379509] Stack:
> [  264.381780]  8804173d7de0 880220a4e000 88041672dd38 ffd8
> [  264.389907]  1e1bd000 880220a4e090 817fdba2 880220a4e000
> [  264.398043]  0021 880220a4e268  88041672dd78
> [  264.406172] Call Trace:
> [  264.408893]  [] ? do_release_stripe+0xdf/0x158
> [  264.415168]  [] ? __release_stripe+0x15/0x17
> [  264.421266]  [] raid5d+0x3e2/0x4f2
> [  264.426497]  [] ? schedule_timeout+0x2f/0x19f
> [  264.432681]  [] md_thread+0x123/0x139
> [  264.438171]  [] ? __wake_up_sync+0x12/0x12
> [  264.444096]  [] ? md_register_thread+0xd5/0xd5
> [  264.450368]  [] kthread+0xdb/0xe3
> [  264.455511]  [] ? kthread_create_on_node+0x16f/0x16f
> [  264.462310]  [] ret_from_fork+0x7c/0xb0
> [  264.467973]  [] ? kthread_create_on_node+0x16f/0x16f
> [  264.474764] Code: 60 00 00 00 00 48 8b 70 10 48 8b 48 18 48 8d 50 10 48 89 4e 08 48 89 31 48 89 50 10 48 89 50 18 f0 ff 40 50 8b 50 50 ff ca 74 02 <0f> 0b 4a 89 44 d5 98 49 ff c2 49 83 fa 08 0f 85 d7 fd ff ff 41
> [  264.498252] RIP  [] handle_active_stripes.isra.24+0x254/0x360
> [  264.506112]  RSP 
> [  264.509869] ---[ end trace 58f3875ff7b4e923 ]---
> 
> Thanks,
> Fengguang





Re: [PATCH] cif: fix dead code

2014-04-16 Thread Steve French
merged into cifs-2.6.git

On Tue, Apr 15, 2014 at 3:06 AM, Michael Opdenacker wrote:
> This issue was found by Coverity (CID 1202536)
>
> This proposes a fix for a statement that creates dead code.
> The "rc < 0" statement is within code that is run
> with "rc > 0".
>
> It seems like "err < 0" was meant to be used here.
> This way, the error code is returned by the function.
>
> Signed-off-by: Michael Opdenacker 
> ---
>  fs/cifs/file.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> index 8add25538a3b..b6e78632fa97 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -2599,7 +2599,7 @@ cifs_writev(struct kiocb *iocb, const struct iovec *iov,
> ssize_t err;
>
> err = generic_write_sync(file, iocb->ki_pos - rc, rc);
> -   if (rc < 0)
> +   if (err < 0)
> rc = err;
> }
> } else {
> --
> 1.8.3.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
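
The dead-code pattern described in the quoted changelog can be reproduced
in isolation. The sketch below is not the cifs_writev() code: both
functions and the sync_result parameter are hypothetical stand-ins, with rc
playing the byte count (positive on this path) and sync_result playing the
return value of the sync step.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the pattern in the patch: rc is the number of
 * bytes written (> 0 on this path), sync_result is what the sync step
 * returned (0 or a negative errno).
 */
static long finish_write_buggy(long rc, long sync_result)
{
	if (rc > 0) {
		long err = sync_result;

		if (rc < 0)	/* never true here: rc > 0 was just checked */
			rc = err;
	}
	return rc;
}

static long finish_write_fixed(long rc, long sync_result)
{
	if (rc > 0) {
		long err = sync_result;

		if (err < 0)	/* propagate the sync failure to the caller */
			rc = err;
	}
	return rc;
}
```

With a successful write of 4096 bytes and a sync step returning -5, the
first variant still returns 4096 and hides the failure, while the second
returns -5.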



-- 
Thanks,

Steve


Re: [PATCH v2] ARM: mm: support big-endian page tables

2014-04-16 Thread Jianguo Wu
On 2014/4/16 20:28, Marc Zyngier wrote:

> On 16/04/14 03:45, Jianguo Wu wrote:
>> On 2014/4/14 19:14, Marc Zyngier wrote:
>>
>>> On 14/04/14 11:43, Will Deacon wrote:
 (catching up on old email)

 On Tue, Mar 18, 2014 at 07:35:59AM +, Jianguo Wu wrote:
> Could you please take a look at this?

 [...]

> On 2014/2/17 15:05, Jianguo Wu wrote:
>> When enable LPAE and big-endian in a hisilicon board, while specify
>> mem=384M mem=512M@7680M, will get bad page state:
>>
>> Freeing unused kernel memory: 180K (c0466000 - c0493000)
>> BUG: Bad page state in process init  pfn:fa442
>> page:c7749840 count:0 mapcount:-1 mapping:  (null) index:0x0
>> page flags: 0x4400(reserved)
>> Modules linked in:
>> CPU: 0 PID: 1 Comm: init Not tainted 3.10.27+ #66
>> [] (unwind_backtrace+0x0/0x11c) from [] 
>> (show_stack+0x10/0x14)
>> [] (show_stack+0x10/0x14) from [] 
>> (bad_page+0xd4/0x104)
>> [] (bad_page+0xd4/0x104) from [] 
>> (free_pages_prepare+0xa8/0x14c)
>> [] (free_pages_prepare+0xa8/0x14c) from [] 
>> (free_hot_cold_page+0x18/0xf0)
>> [] (free_hot_cold_page+0x18/0xf0) from [] 
>> (handle_pte_fault+0xcf4/0xdc8)
>> [] (handle_pte_fault+0xcf4/0xdc8) from [] 
>> (handle_mm_fault+0xf4/0x120)
>> [] (handle_mm_fault+0xf4/0x120) from [] 
>> (do_page_fault+0xfc/0x354)
>> [] (do_page_fault+0xfc/0x354) from [] 
>> (do_DataAbort+0x2c/0x90)
>> [] (do_DataAbort+0x2c/0x90) from [] 
>> (__dabt_usr+0x34/0x40)

 [...]

>> The bug is happened in cpu_v7_set_pte_ext(ptep, pte):
>> when pte is 64-bit, for little-endian, will store low 32-bit in r2,
>> high 32-bit in r3; for big-endian, will store low 32-bit in r3,
>> high 32-bit in r2, this will cause wrong pfn stored in pte,
>> so we should exchange r2 and r3 for big-endian.

>>
>> Hi Marc,
>> How about this:
>>
>> The bug is happened in cpu_v7_set_pte_ext(ptep, pte):
>> - It tests the L_PTE_NONE in one word on the other, and possibly clear L_PTE_VALID
>>   tst    r3, #1 << (57 - 32) @ L_PTE_NONE
>>   bicne  r2, #L_PTE_VALID
>> - Same for L_PTE_DIRTY, respectively setting L_PTE_RDONLY
>>
>> As for LPAE, the pte is 64-bits, and the value of r2/r3 is depending on the endianness,
>> for little-endian, will store low 32-bit in r2, high 32-bit in r3,
>> for big-endian, will store low 32-bit in r3, high 32-bit in r2,
>> this will cause wrong bit is cleared or set, and get wrong pfn.
>> So we should exchange r2 and r3 for big-endian.
> 
> May I suggest the following instead:
> 
> "An LPAE PTE is a 64bit quantity, passed to cpu_v7_set_pte_ext in the
>  r2 and r3 registers.
>  On an LE kernel, r2 contains the LSB of the PTE, and r3 the MSB.
>  On a BE kernel, the assignment is reversed.
> 
>  Unfortunately, the current code always assumes the LE case,
>  leading to corruption of the PTE when clearing/setting bits.
> 
>  This patch fixes this issue much like it has been done already in the
>  cpu_v7_switch_mm case."
> 

OK, I will send a new version, thanks!

> Cheers,
> 
>   M.
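
The register assignment described in the suggested changelog can be
modelled in plain C to show why a test hard-wired to r3 breaks on a BE
kernel. This is a userspace model, not the proc-v7 assembly: struct
reg_pair and the three helpers are hypothetical, and bit 57 for L_PTE_NONE
is taken from the "#1 << (57 - 32)" operand quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define L_PTE_NONE_BIT 57	/* bit position taken from the thread above */

/* Hypothetical model of the two registers that carry one 64-bit PTE. */
struct reg_pair {
	uint32_t r2;
	uint32_t r3;
};

/*
 * Model of how the 64-bit PTE lands in r2/r3: an LE kernel puts the low
 * word in r2, a BE kernel puts the high word in r2.
 */
static struct reg_pair pass_pte(uint64_t pte, int big_endian)
{
	struct reg_pair regs;
	uint32_t lo = (uint32_t)pte;
	uint32_t hi = (uint32_t)(pte >> 32);

	assert(big_endian == 0 || big_endian == 1);
	regs.r2 = big_endian ? hi : lo;
	regs.r3 = big_endian ? lo : hi;
	return regs;
}

/* Endian-aware test: look at whichever register holds the high word. */
static int pte_none_set(struct reg_pair regs, int big_endian)
{
	uint32_t hi = big_endian ? regs.r2 : regs.r3;

	return (hi >> (L_PTE_NONE_BIT - 32)) & 1;
}

/* What code that always assumes the LE layout would see. */
static int pte_none_set_le_only(struct reg_pair regs)
{
	return (regs.r3 >> (L_PTE_NONE_BIT - 32)) & 1;
}
```

For a PTE with bit 57 set, pte_none_set_le_only() gives the right answer on
the LE layout but inspects the low word on the BE layout and misses the
bit; the same mix-up applies when a bit is cleared or set afterwards.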





[PATCH] staging: rts5139: fixed coding style

2014-04-16 Thread Thomas Tanaka
Fixed checkpatch warnings about lines over 80 characters

Signed-off-by: Thomas Tanaka 
---
 drivers/staging/rts5139/rts51x_fop.c |   21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/rts5139/rts51x_fop.c b/drivers/staging/rts5139/rts51x_fop.c
index 677d18b..cf4e675 100644
--- a/drivers/staging/rts5139/rts51x_fop.c
+++ b/drivers/staging/rts5139/rts51x_fop.c
@@ -70,7 +70,8 @@ static int rts51x_sd_direct_cmnd(struct rts51x_chip *chip,
switch (dir) {
case 0:
/* No data */
-   retval = ext_rts51x_sd_execute_no_data(chip, chip->card2lun[SD_CARD],
+   retval = ext_rts51x_sd_execute_no_data(chip,
+   chip->card2lun[SD_CARD],
cmd_idx, standby, acmd,
rsp_code, arg);
if (retval != TRANSPORT_GOOD)
@@ -83,10 +84,11 @@ static int rts51x_sd_direct_cmnd(struct rts51x_chip *chip,
if (!buf)
TRACE_RET(chip, STATUS_NOMEM);
 
-   retval = ext_rts51x_sd_execute_read_data(chip, chip->card2lun[SD_CARD],
- cmd_idx, cmd12, standby, acmd,
- rsp_code, arg, len, buf,
- cmnd->buf_len, 0);
+   retval = ext_rts51x_sd_execute_read_data(chip,
+   chip->card2lun[SD_CARD],
+   cmd_idx, cmd12, standby, acmd,
+   rsp_code, arg, len, buf,
+   cmnd->buf_len, 0);
if (retval != TRANSPORT_GOOD) {
kfree(buf);
TRACE_RET(chip, STATUS_FAIL);
@@ -117,10 +119,11 @@ static int rts51x_sd_direct_cmnd(struct rts51x_chip *chip,
}
 
retval =
-   ext_rts51x_sd_execute_write_data(chip, chip->card2lun[SD_CARD],
- cmd_idx, cmd12, standby, acmd,
- rsp_code, arg, len, buf,
- cmnd->buf_len, 0);
+   ext_rts51x_sd_execute_write_data(chip,
+   chip->card2lun[SD_CARD],
+   cmd_idx, cmd12, standby, acmd,
+   rsp_code, arg, len, buf,
+   cmnd->buf_len, 0);
if (retval != TRANSPORT_GOOD) {
kfree(buf);
TRACE_RET(chip, STATUS_FAIL);
-- 
1.7.9.5



[sched,rcu] b84c4e08143: +3.1% will-it-scale.per_thread_ops

2014-04-16 Thread Fengguang Wu
Hi Paul,

FYI, this improves will-it-scale/open1 throughput.

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev.2014.04.14a
commit b84c4e08143c98dad4b4d139f08db0b98b0d3ec4 ("sched,rcu: Make cond_resched() report RCU quiescent states")

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
    563496 ~ 0%      +3.1%     581059 ~ 0%  nhm4/micro/will-it-scale/open1
    563496 ~ 0%      +3.1%     581059 ~ 0%  TOTAL will-it-scale.per_thread_ops

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
    756894 ~ 0%      +2.8%     778452 ~ 0%  nhm4/micro/will-it-scale/open1
    756894 ~ 0%      +2.8%     778452 ~ 0%  TOTAL will-it-scale.per_process_ops

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
      0.57 ~ 0%      -2.7%       0.55 ~ 0%  nhm4/micro/will-it-scale/open1
      0.57 ~ 0%      -2.7%       0.55 ~ 0%  TOTAL will-it-scale.scalability

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
    346764 ~ 2%     -74.0%      90164 ~ 1%  nhm4/micro/will-it-scale/open1
    346764 ~ 2%     -74.0%      90164 ~ 1%  TOTAL slabinfo.kmalloc-256.active_objs

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
     10837 ~ 2%     -73.9%       2824 ~ 1%  nhm4/micro/will-it-scale/open1
     10837 ~ 2%     -73.9%       2824 ~ 1%  TOTAL slabinfo.kmalloc-256.active_slabs

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
     10837 ~ 2%     -73.9%       2824 ~ 1%  nhm4/micro/will-it-scale/open1
     10837 ~ 2%     -73.9%       2824 ~ 1%  TOTAL slabinfo.kmalloc-256.num_slabs

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
    346821 ~ 2%     -73.9%      90393 ~ 1%  nhm4/micro/will-it-scale/open1
    346821 ~ 2%     -73.9%      90393 ~ 1%  TOTAL slabinfo.kmalloc-256.num_objs

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
    105961 ~ 1%     -63.0%      39153 ~ 1%  nhm4/micro/will-it-scale/open1
    105961 ~ 1%     -63.0%      39153 ~ 1%  TOTAL meminfo.SUnreclaim

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
     26432 ~ 1%     -62.9%       9814 ~ 1%  nhm4/micro/will-it-scale/open1
     26432 ~ 1%     -62.9%       9814 ~ 1%  TOTAL proc-vmstat.nr_slab_unreclaimable

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
     50298 ~ 0%    +194.3%     148011 ~ 0%  nhm4/micro/will-it-scale/open1
     37020 ~ 0%     +42.6%      52798 ~ 1%  nhm4/micro/will-it-scale/signal1
     87318 ~ 0%    +130.0%     200809 ~ 0%  TOTAL softirqs.RCU

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
    140354 ~ 1%     -47.6%      73490 ~ 0%  nhm4/micro/will-it-scale/open1
    140354 ~ 1%     -47.6%      73490 ~ 0%  TOTAL meminfo.Slab

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
     77391 ~ 1%     -46.7%      41235 ~ 2%  nhm4/micro/will-it-scale/signal1
     77391 ~ 1%     -46.7%      41235 ~ 2%  TOTAL cpuidle.C6-NHM.usage

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
     19871 ~ 2%     -37.6%      12397 ~ 2%  nhm4/micro/will-it-scale/open1
     18497 ~ 1%     -37.5%      11556 ~ 1%  nhm4/micro/will-it-scale/signal1
     38368 ~ 2%     -37.6%      23954 ~ 2%  TOTAL softirqs.SCHED

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
      1.24 ~ 4%     -35.4%       0.80 ~ 3%  nhm4/micro/will-it-scale/open1
      1.24 ~ 4%     -35.4%       0.80 ~ 3%  TOTAL perf-profile.cpu-cycles.do_notify_resume.int_signal.close

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
      1.43 ~ 4%     +41.9%       2.03 ~ 4%  nhm4/micro/will-it-scale/open1
      1.43 ~ 4%     +41.9%       2.03 ~ 4%  TOTAL perf-profile.cpu-cycles.rcu_process_callbacks.__do_softirq.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
      1.27 ~ 3%     -30.0%       0.89 ~ 6%  nhm4/micro/will-it-scale/open1
      1.27 ~ 3%     -30.0%       0.89 ~ 6%  TOTAL perf-profile.cpu-cycles.setup_object.isra.46.new_slab.__slab_alloc.kmem_cache_alloc.get_empty_filp

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
      1.54 ~ 7%     +35.6%       2.09 ~ 8%  nhm4/micro/will-it-scale/open1
      1.54 ~ 7%     +35.6%       2.09 ~ 8%  TOTAL perf-profile.cpu-cycles.kmem_cache_alloc.getname_flags.getname.do_sys_open.sys_open

ad86a04266f9b49  b84c4e08143c98dad4b4d139f
---------------  -------------------------
      4.21 ~ 2%     -29.1%       2.98 ~ 3%  nhm4/micro/will-it-scale/open1
      4.21 ~ 2%     -29.1%       2.98 ~ 3%  TOTAL

[net/sctp] 362d52040c7: +99.0% netperf.Throughput_Mbps

2014-04-16 Thread Fengguang Wu
Hi Daniel,

We noticed the same improvements in netperf SCTP_STREAM test case
as described in your patch changelog.

commit 362d52040c71f6e8d8158be48c812d7729cb8df1 ("Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer"")

bfae23249955819  362d52040c71f6e8d8158be48
---------------  -------------------------
      2.14 ~ 0%     +98.9%       4.25 ~ 0%  kbuildx/micro/netperf/300s-200%-SCTP_STREAM
      2.14 ~ 0%     +99.1%       4.25 ~ 0%  lkp-nex04/micro/netperf/300s-200%-SCTP_STREAM
      2.14 ~ 0%     +99.0%       4.25 ~ 0%  lkp-t410/micro/netperf/300s-200%-SCTP_STREAM
      6.41 ~ 0%     +99.0%      12.76 ~ 0%  TOTAL netperf.Throughput_Mbps

bfae23249955819  362d52040c71f6e8d8158be48
---------------  -------------------------
      2.67 ~26%    -100.0%       0.00 ~ 0%  lkp-nex04/micro/netperf/300s-200%-SCTP_STREAM
      0.92 ~27%    -100.0%       0.00 ~ 0%  lkp-t410/micro/netperf/300s-200%-SCTP_STREAM
      3.59 ~26%    -100.0%       0.00 ~ 0%  TOTAL perf-profile.cpu-cycles.copy_user_generic_string.skb_copy_datagram_iovec.skb_copy_datagram_iovec.sctp_recvmsg.sock_common_recvmsg

bfae23249955819  362d52040c71f6e8d8158be48
---------------  -------------------------
      8.75 ~32%     -80.4%       1.72 ~21%  lkp-nex04/micro/netperf/300s-200%-SCTP_STREAM
      8.75 ~32%     -80.4%       1.72 ~21%  TOTAL perf-profile.cpu-cycles.sctp_packet_transmit.sctp_outq_flush.sctp_outq_uncork.sctp_cmd_interpreter.sctp_do_sm

bfae23249955819  362d52040c71f6e8d8158be48
---------------  -------------------------
      0.00            +Inf%       1.22 ~22%  lkp-nex04/micro/netperf/300s-200%-SCTP_STREAM
      0.00            +Inf%       1.43 ~22%  lkp-t410/micro/netperf/300s-200%-SCTP_STREAM
      0.00            +Inf%       2.65 ~22%  TOTAL perf-profile.cpu-cycles.sctp_packet_transmit.sctp_packet_transmit_chunk.sctp_outq_flush.sctp_outq_uncork.sctp_cmd_interpreter

bfae23249955819  362d52040c71f6e8d8158be48
---------------  -------------------------
      6.26 ~13%    +306.9%      25.46 ~26%  lkp-nex04/micro/netperf/300s-200%-SCTP_STREAM
      6.26 ~13%    +306.9%      25.46 ~26%  TOTAL perf-profile.cpu-cycles._raw_spin_lock_irqsave.clockevents_notify.intel_idle.cpuidle_enter_state.cpuidle_enter

bfae23249955819  362d52040c71f6e8d8158be48
---------------  -------------------------
       966 ~45%     -44.6%        535 ~29%  lkp-t410/micro/netperf/300s-200%-SCTP_STREAM
       966 ~45%     -44.6%        535 ~29%  TOTAL cpuidle.C1-NHM.time

bfae23249955819  362d52040c71f6e8d8158be48
---------------  -------------------------
      0.45 ~33%    +165.0%       1.20 ~14%  lkp-nex04/micro/netperf/300s-200%-SCTP_STREAM
      0.48 ~37%    +235.3%       1.62 ~12%  lkp-t410/micro/netperf/300s-200%-SCTP_STREAM
      0.93 ~35%    +201.3%       2.81 ~13%  TOTAL perf-profile.cpu-cycles.skb_copy_bits.skb_copy.sctp_make_reassembled_event.sctp_ulpq_tail_data.sctp_cmd_interpreter

bfae23249955819  362d52040c71f6e8d8158be48
---------------  -------------------------
      0.80 ~32%     -56.1%       0.35 ~38%  lkp-t410/micro/netperf/300s-200%-SCTP_STREAM
      0.80 ~32%     -56.1%       0.35 ~38%  TOTAL perf-profile.cpu-cycles.menu_select.cpuidle_select.cpu_startup_entry.start_secondary

bfae23249955819  362d52040c71f6e8d8158be48
---------------  -------------------------
      5.08 ~26%     -58.3%       2.12 ~21%  lkp-nex04/micro/netperf/300s-200%-SCTP_STREAM
      1.34 ~27%    +112.1%       2.85 ~ 9%  lkp-t410/micro/netperf/300s-200%-SCTP_STREAM
      6.43 ~26%     -22.7%       4.97 ~14%  TOTAL perf-profile.cpu-cycles.copy_user_generic_string.skb_copy_datagram_iovec.sctp_recvmsg.sock_common_recvmsg.sock_recvmsg

bfae23249955819  362d52040c71f6e8d8158be48
---------------  -------------------------
      0.72 ~41%     +69.4%       1.22 ~25%  lkp-t410/micro/netperf/300s-200%-SCTP_STREAM
      0.72 ~41%     +69.4%       1.22 ~25%  TOTAL perf-profile.cpu-cycles.unmap_single_vma.unmap_vmas.exit_mmap.mmput.do_exit

bfae23249955819  362d52040c71f6e8d8158be48
---------------  -------------------------
    916917 ~ 0%     +79.9%    1649300 ~ 0%  kbuildx/micro/netperf/300s-200%-SCTP_STREAM
  12832444 ~ 1%    +115.7%   27683375 ~ 0%  lkp-nex04/micro/netperf/300s-200%-SCTP_STREAM
  13749362 ~ 0%    +113.3%   29332676 ~ 0%  TOTAL proc-vmstat.pgalloc_normal

bfae23249955819  362d52040c71f6e8d8158be48
---------------  -------------------------
   1175423 ~ 0%     +79.7%    2111948 ~ 0%  kbuildx/micro/netperf/300s-200%-SCTP_STREAM
  12997556 ~ 0%    +115.4%   27996526 ~ 0%  lkp-nex04/micro/netperf/300s-200%-SCTP_STREAM
   1181153 ~ 0%     +79.3%    2118015 ~ 0%  lkp-t410/micro/netperf/300s-200%-SCTP_STREAM
  15354133 ~ 0%    +109.9%   32226491 ~ 0%  TOTAL proc-vmstat.pgfree

bfae23249955819  362d52040c71f6e8d8158be48
---------------  -------------------------
    257597 ~ 0%     +79.3%     461743 ~ 0%

in kernel 2.6.x, tun/tap nic supports vlan packets

2014-04-16 Thread zhuyj

Hi, all

In kernel 2.6.x, Linux depends on NIC VLAN hardware acceleration to
insert/extract the VLAN tag. Consider this scenario in kernel 2.6.x:

                 _____        ________
       A        |     |   B  |        | C
 vlan packets-->| tap |----->|vlan nic|--->
                |_____|      |________|

We want VLAN packets to pass through the tap and the VLAN NIC from A to C.
But in kernel 2.6.x, the kernel cannot extract the VLAN tag in software; it
depends on NIC VLAN hardware acceleration. A tap NIC has no VLAN
acceleration, so in the above scenario VLAN packets cannot be handled by
the tap NIC. These VLAN packets are discarded at B and never arrive at C.

In kernel 3.x, Linux can handle VLAN packets without depending on NIC VLAN
hardware acceleration, so the above scenario works well in kernel 3.x.

To resolve this in kernel 2.6.x, we simulated VLAN hardware acceleration in
the tun/tap driver, then followed the logic of commit 4fba4ca4
[vlan: Centralize handling of hardware acceleration] to modify the VLAN
packet processing in kernel 2.6.x. With these changes, the above scenario
works well in the patched kernel 2.6.x.

Please comment on it. Any reply is appreciated.

Hi, Willy

These 2 patches are for Linux 2.6.x. They work well here. Please help to
merge them into Linux 2.6.32.x. Thanks a lot.

Best Regards!
Zhu Yanjun
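
For readers unfamiliar with what "extracting the VLAN tag" means at the
byte level, here is a minimal userspace sketch of the operation the cover
letter describes. It is not the code from the patch below:
sw_vlan_untag() is a hypothetical helper that works on a raw frame buffer
instead of an skb, and the constants are redefined locally so the example
stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN	6	/* bytes in one MAC address */
#define ETH_HLEN	14	/* dst MAC + src MAC + EtherType */
#define VLAN_HLEN	4	/* 802.1Q tag: TPID (2 bytes) + TCI (2 bytes) */
#define ETH_P_8021Q	0x8100	/* TPID value that marks a tagged frame */

/*
 * Hypothetical helper, not a kernel API: strip one 802.1Q tag from a raw
 * Ethernet frame in place. On success the untagged frame starts at
 * frame + VLAN_HLEN, *tci holds the tag control information and the new
 * frame length is returned. Returns 0 if the frame is too short or is
 * not tagged.
 */
static size_t sw_vlan_untag(uint8_t *frame, size_t len, uint16_t *tci)
{
	uint16_t tpid;

	assert(frame != NULL && tci != NULL);

	if (len < ETH_HLEN + VLAN_HLEN)
		return 0;

	tpid = (uint16_t)((frame[12] << 8) | frame[13]);
	if (tpid != ETH_P_8021Q)
		return 0;

	*tci = (uint16_t)((frame[14] << 8) | frame[15]);

	/* Slide both MAC addresses forward so they overwrite the tag. */
	memmove(frame + VLAN_HLEN, frame, 2 * ETH_ALEN);
	return len - VLAN_HLEN;
}
```

After the call the encapsulated EtherType sits directly behind the two MAC
addresses again, and the low 12 bits of *tci give the VLAN ID. The memmove
of 2 * ETH_ALEN bytes is the same trick vlan_reorder_header() uses in the
patch.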

>From 66db0748fc0f932496100789eb319ca5884c0694 Mon Sep 17 00:00:00 2001
From: Zhu Yanjun 
Date: Wed, 16 Apr 2014 18:19:42 +0800
Subject: [PATCH 1/2] tun/tap: add the feature of vlan rx extraction

Tap is a virtual net device that has no vlan rx untag feature.
So this virtual device can not send/receive vlan packets in
kernel 2.6.x. To make this device support sending/receiving vlan
packets in kernel 2.6.x, a vlan rx extraction feature is simulated
in its driver.

Signed-off-by: Zhu Yanjun 
---
 drivers/net/tun.c |  118 -
 include/linux/netdevice.h |1 +
 net/core/dev.c|   13 +
 3 files changed, 131 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 894ad84..029e6cf 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -69,6 +69,8 @@
 #include 
 #include 
 
+#include 
+
 /* Uncomment to enable debugging */
 /* #define TUN_DEBUG 1 */
 
@@ -426,6 +428,8 @@ static const struct net_device_ops tun_netdev_ops = {
 	.ndo_change_mtu		= tun_net_change_mtu,
 };
 
+static void tap_vlan_rx_register(struct net_device *dev, struct vlan_group *grp);
+
 static const struct net_device_ops tap_netdev_ops = {
 	.ndo_uninit		= tun_net_uninit,
 	.ndo_open		= tun_net_open,
@@ -435,6 +439,7 @@ static const struct net_device_ops tap_netdev_ops = {
 	.ndo_set_multicast_list	= tun_net_mclist,
 	.ndo_set_mac_address	= eth_mac_addr,
 	.ndo_validate_addr	= eth_validate_addr,
+	.ndo_vlan_rx_register   = tap_vlan_rx_register,
 };
 
 /* Initialize net device. */
@@ -464,6 +469,8 @@ static void tun_net_init(struct net_device *dev)
 
 		random_ether_addr(dev->dev_addr);
 
+		dev->features |= NETIF_F_HW_VLAN_RX;
+
 		dev->tx_queue_len = TUN_READQ_SIZE;  /* We prefer our own queue length */
 		break;
 	}
@@ -530,6 +537,105 @@ static inline struct sk_buff *tun_alloc_skb(struct tun_struct *tun,
 	return skb;
 }
 
+static struct sk_buff *vlan_reorder_header(struct sk_buff *skb)
+{
+	if (skb_cow(skb, skb_headroom(skb)) < 0)
+		return NULL;
+	memmove(skb->data - ETH_HLEN, skb->data - VLAN_ETH_HLEN, 2 * ETH_ALEN);
+	skb->mac_header += VLAN_HLEN;
+	return skb;
+}
+
+static void vlan_set_encap_proto(struct sk_buff *skb, struct vlan_hdr *vhdr)
+{
+	__be16 proto;
+	unsigned char *rawp;
+
+	/*
+	 * Was a VLAN packet, grab the encapsulated protocol, which the layer
+	 * three protocols care about.
+	 */
+
+	proto = vhdr->h_vlan_encapsulated_proto;
+	if (ntohs(proto) >= 1536) {
+		skb->protocol = proto;
+		return;
+	}
+
+	rawp = skb->data;
+	if (*(unsigned short *) rawp == 0xFFFF)
+		/*
+		 * This is a magic hack to spot IPX packets. Older Novell
+		 * breaks the protocol design and runs IPX over 802.3 without
+		 * an 802.2 LLC layer. We look for FFFF which isn't a used
+		 * 802.2 SSAP/DSAP. This won't work for fault tolerant netware
+		 * but does for the rest.
+		 */
+		skb->protocol = htons(ETH_P_802_3);
+	else
+		/*
+		 * Real 802.2 LLC
+		 */
+		skb->protocol = htons(ETH_P_802_2);
+}
+
+static void skb_reset_mac_len(struct sk_buff *skb)
+{
+	skb->mac_len = skb->network_header - skb->mac_header;
+}
+
+static struct sk_buff *vlan_untag(struct sk_buff *skb)
+{
+	struct vlan_hdr *vhdr;
+	u16 vlan_tci;
+
+	if (unlikely(vlan_tx_tag_present(skb))) {
+		/* vlan_tci is already set-up so leave this for another time */
+		return skb;
+	}
+
+	skb = skb_share_check(skb, GFP_ATOMIC);
+	if (unlikely(!skb))
+		goto err_free;
+
+	if (unlikely(!pskb_may_pull(skb, VLAN_HLEN)))
+		goto err_free;
+
+	vhdr = (struct vlan_hdr *) skb->data;
+	vlan_tci = 

Re: [PATCH 3/4] ARM: dts: berlin: add the SDHCI nodes for the BG2Q

2014-04-16 Thread Jisheng Zhang
Hi Antoine,

On Wed, 16 Apr 2014 05:40:10 -0700
Antoine Ténart  wrote:

> Add the SDHCI nodes for the Marvell Berlin BG2Q, using the berlin-sdhci
> driver.
> 
> Signed-off-by: Antoine Ténart 
> ---
>  arch/arm/boot/dts/berlin2q.dtsi | 40
>  1 file changed, 40 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/berlin2q.dtsi b/arch/arm/boot/dts/berlin2q.dtsi
> index 5925e6a16749..8f897d461460 100644
> --- a/arch/arm/boot/dts/berlin2q.dtsi
> +++ b/arch/arm/boot/dts/berlin2q.dtsi
> @@ -67,6 +67,14 @@
>   clock-div = <3>;
>   };
>  
> + sdio1clk: sdio1clk {
> + compatible = "fixed-factor-clock";
> + #clock-cells = <0>;
> + clocks = <>;
> + clock-mult = <1>;
> + clock-div = <4>;
> + };
> +
>   soc {
>   compatible = "simple-bus";
>   #address-cells = <1>;
> @@ -75,6 +83,38 @@
>   ranges = <0 0xf700 0x100>;
>   interrupt-parent = <>;
>  
> + sdhci0: sdhci@ab {
> + compatible = "marvell,berlin2q-sdhci";
> + reg = <0xab 0x200>;
> + clocks = <>;
> + interrupts = ;
> + keep-power-in-suspend;
> + enable-sdio-wakeup;
> + broken-cd;
> + status = "disabled";
> + };
> +
> + sdhci1: sdhci@ab0800 {
> + compatible = "marvell,berlin2q-sdhci";
> + reg = <0xab0800 0x200>;
> + clocks = <>;
> + interrupts = ;
> + keep-power-in-suspend;
> + enable-sdio-wakeup;
> + status = "disabled";
> + };
> +
> + sdhci2: sdhci@ab1000 {
> + compatible = "marvell,berlin2q-sdhci";
> + reg = <0xab1000 0x200>;
> + interrupts = ;
> + clocks = <>;
> + keep-power-in-suspend;
> + enable-sdio-wakeup;
> + broken-cd;
> + status = "disabled";
> + };

Could we put sdhci@ab1000 first in the list of sdhci nodes? For two reasons:

1. sdhci@ab and sdhci@ab0800 are called sdhci1 and sdhci2 in Marvell-internal
discussions, so this would keep the names consistent when we upgrade our
Linux kernel to a mainline version.

2. sdhci@ab1000 is always used for eMMC. If sdhci@ab0800 is placed ahead of
sdhci@ab1000 and there is an SD card in it, mmcblock0 would be the SD card
rather than the eMMC.

I don't know whether there is an elegant solution for these two issues. An
alias? Could anyone kindly help?

Thanks in advance,
Jisheng


block: check for dying queue in generic_make_request()

2014-04-16 Thread Asai Thambi S P

The check for a dying queue exists in the request queue interface, but not
for direct use of make_request().

When a mounted device is surprise-removed, block drivers delete the gendisk
and clean up the request queue. As the reference count is non-zero, these
structures continue to exist and any further I/O request is passed on to the
block driver. From the block driver's point of view, however, the device has
been removed and its data structures cleaned up. This check stops I/O to a
non-existent device at the block layer.

Signed-off-by: Asai Thambi S P 
---
 block/blk-core.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index e45b321..cec6bf4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1713,7 +1713,7 @@ generic_make_request_checks(struct bio *bio)
goto end_io;
 
q = bdev_get_queue(bio->bi_bdev);
-   if (unlikely(!q)) {
+   if (unlikely(!q || blk_queue_dying(q))) {
printk(KERN_ERR
   "generic_make_request: Trying to access "
"nonexistent block-device %s (%Lu)\n",
-- 
1.7.1



mtip32xx: Remove dfs_parent after pci unregister

2014-04-16 Thread Asai Thambi S P

In module exit, dfs_parent and its subtree were removed before unregistering
with PCI. When pci_remove() then attempts to remove the debugfs entry for
each device, the entries no longer exist, as dfs_parent and its children
have already been torn down.

Modified to first unregister with PCI and then remove dfs_parent.

Signed-off-by: Asai Thambi S P 
---
 drivers/block/mtip32xx/mtip32xx.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c
index 51628eb..27641bc 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -4939,13 +4939,13 @@ static int __init mtip_init(void)
  */
 static void __exit mtip_exit(void)
 {
-   debugfs_remove_recursive(dfs_parent);
-
/* Release the allocated major block device number. */
unregister_blkdev(mtip_major, MTIP_DRV_NAME);
 
/* Unregister the PCI driver. */
pci_unregister_driver(_pci_driver);
+
+   debugfs_remove_recursive(dfs_parent);
 }
 
 MODULE_AUTHOR("Micron Technology, Inc");
-- 
1.7.1



Re: [PATCH] Staging:Line6:usbdefs.h parenthesis for Marcos

2014-04-16 Thread Greg KH
On Tue, Apr 08, 2014 at 05:11:26PM +0100, Paul McQuade wrote:
> ERROR: Macros with complex values should be enclosed in parenthesis
> 
> Signed-off-by: Paul McQuade 
> ---
>  drivers/staging/line6/usbdefs.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/line6/usbdefs.h b/drivers/staging/line6/usbdefs.h
> index 2d1cc47..48958b5 100644
> --- a/drivers/staging/line6/usbdefs.h
> +++ b/drivers/staging/line6/usbdefs.h
> @@ -40,7 +40,7 @@
>  #define LINE6_DEVID_TONEPORT_UX2  0x4142
>  #define LINE6_DEVID_VARIAX0x534d
>  
> -#define LINE6_BIT(x) LINE6_BIT_ ## x = 1 << LINE6_INDEX_ ## x
> +#define LINE6_BIT(x) (LINE6_BIT_ ## x = 1 << LINE6_INDEX_ ## x)

I love this one, it gets people all the time who don't actually test
their changes...

Hint, it breaks the build, which isn't nice at all.  Please ALWAYS test
build your kernel changes, don't break other people's build boxes...

greg k-h
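
For readers wondering why the parenthesised form fails to build, here is a
minimal sketch. It assumes that LINE6_BIT() is expanded inside an enum body;
the enum below is a cut-down, hypothetical stand-in, not the real contents of
usbdefs.h. Inside an enum the macro has to produce an enumerator definition
of the form "NAME = value", which is not an expression, so wrapping it in
parentheses turns it into a syntax error.

```c
/* Hypothetical, cut-down stand-in for the usbdefs.h pattern. */
#define LINE6_BIT(x) LINE6_BIT_ ## x = 1 << LINE6_INDEX_ ## x

enum {
	LINE6_INDEX_PODXT,
	LINE6_INDEX_VARIAX,

	/* Expands to: LINE6_BIT_PODXT = 1 << LINE6_INDEX_PODXT */
	LINE6_BIT(PODXT),
	LINE6_BIT(VARIAX),
};

/*
 * With the form suggested by the checkpatch error,
 *   #define LINE6_BIT(x) (LINE6_BIT_ ## x = 1 << LINE6_INDEX_ ## x)
 * the enum body would contain "(LINE6_BIT_PODXT = 1 << ...)", which is
 * not a valid enumerator, so the compiler rejects the header.
 */
```

This is one of the cases where the checkpatch rule about complex macro values
does not apply, because the macro body is a declaration fragment rather than
a value.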


mtip32xx: Increase timeout for STANDBY IMMEDIATE command

2014-04-16 Thread Asai Thambi S P

Increased timeout for STANDBY IMMEDIATE command to 2 minutes.

Signed-off-by: Selvan Mani 
Signed-off-by: Asai Thambi S P 
---
 drivers/block/mtip32xx/mtip32xx.c |   66 +++-
 1 files changed, 35 insertions(+), 31 deletions(-)

diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c
index 59c5abe..51628eb 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -1529,6 +1529,37 @@ static inline void ata_swap_string(u16 *buf, unsigned 
int len)
be16_to_cpus([i]);
 }
 
+static void mtip_set_timeout(struct driver_data *dd,
+   struct host_to_dev_fis *fis,
+   unsigned int *timeout, u8 erasemode)
+{
+   switch (fis->command) {
+   case ATA_CMD_DOWNLOAD_MICRO:
+   *timeout = 12; /* 2 minutes */
+   break;
+   case ATA_CMD_SEC_ERASE_UNIT:
+   case 0xFC:
+   if (erasemode)
+   *timeout = ((*(dd->port->identify + 90) * 2) * 6);
+   else
+   *timeout = ((*(dd->port->identify + 89) * 2) * 6);
+   break;
+   case ATA_CMD_STANDBYNOW1:
+   *timeout = 12;  /* 2 minutes */
+   break;
+   case 0xF7:
+   case 0xFA:
+   *timeout = 6;  /* 60 seconds */
+   break;
+   case ATA_CMD_SMART:
+   *timeout = 15000;  /* 15 seconds */
+   break;
+   default:
+   *timeout = MTIP_IOCTL_COMMAND_TIMEOUT_MS;
+   break;
+   }
+}
+
 /*
  * Request the device identity information.
  *
@@ -1644,6 +1675,7 @@ static int mtip_standby_immediate(struct mtip_port *port)
int rv;
struct host_to_dev_fis  fis;
unsigned long start;
+   unsigned int timeout;
 
/* Build the FIS. */
memset(, 0, sizeof(struct host_to_dev_fis));
@@ -1651,6 +1683,8 @@ static int mtip_standby_immediate(struct mtip_port *port)
fis.opts= 1 << 7;
fis.command = ATA_CMD_STANDBYNOW1;
 
+   mtip_set_timeout(port->dd, , , 0);
+
start = jiffies;
rv = mtip_exec_internal_command(port,
,
@@ -1659,7 +1693,7 @@ static int mtip_standby_immediate(struct mtip_port *port)
0,
0,
GFP_ATOMIC,
-   15000);
+   timeout);
dbg_printk(MTIP_DRV_NAME "Time taken to complete standby cmd: %d ms\n",
jiffies_to_msecs(jiffies - start));
if (rv)
@@ -2202,36 +2236,6 @@ static unsigned int implicit_sector(unsigned char 
command,
}
return rv;
 }
-static void mtip_set_timeout(struct driver_data *dd,
-   struct host_to_dev_fis *fis,
-   unsigned int *timeout, u8 erasemode)
-{
-   switch (fis->command) {
-   case ATA_CMD_DOWNLOAD_MICRO:
-   *timeout = 12; /* 2 minutes */
-   break;
-   case ATA_CMD_SEC_ERASE_UNIT:
-   case 0xFC:
-   if (erasemode)
-   *timeout = ((*(dd->port->identify + 90) * 2) * 6);
-   else
-   *timeout = ((*(dd->port->identify + 89) * 2) * 6);
-   break;
-   case ATA_CMD_STANDBYNOW1:
-   *timeout = 12;  /* 2 minutes */
-   break;
-   case 0xF7:
-   case 0xFA:
-   *timeout = 6;  /* 60 seconds */
-   break;
-   case ATA_CMD_SMART:
-   *timeout = 15000;  /* 15 seconds */
-   break;
-   default:
-   *timeout = MTIP_IOCTL_COMMAND_TIMEOUT_MS;
-   break;
-   }
-}
 
 /*
  * Executes a taskfile
-- 
1.7.1



Re: [PATCH 1/4] Staging: lustre: Fixed formatting errors in lib-types.h identified by checkpatch.pl

2014-04-16 Thread Greg KH
On Wed, Apr 02, 2014 at 07:12:56PM +1100, Joshua Baldock wrote:
> Fixed '{' not on same line as struct in file lib-types.h identified by 
> checkpatch.pl
> 
> Signed-off-by: Joshua Baldock 
> ---
>  drivers/staging/lustre/include/linux/lnet/lib-types.h | 12 
>  1 file changed, 4 insertions(+), 8 deletions(-)

This patch didn't apply to my tree for some reason, but the other 4 did,
odd.

greg k-h


Re: [GIT PULL] fbdev fixes for 3.15

2014-04-16 Thread Stephen Rothwell
On Wed, 16 Apr 2014 16:04:39 -0700 Linus Torvalds 
 wrote:
>
> On Wed, Apr 16, 2014 at 2:03 AM, Tomi Valkeinen  wrote:
> >
> > The drivers/video/Kconfig change in this pull request will conflict with the
> > fbdev reorder series, which is not yet in your tree. If that's an issue, I 
> > can
> > resend this without the Kconfig change.
> 
> I was actually hoping/expecting you to just resend the renaming
> rebased on top of 3.15-rc1. Or maybe you could do it on top of this.

Yeah, I was kind of hoping that this movement of files would be done with
sooner rather than later ...

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [PATCH v3 1/4] MAINTAINERS: Add entry for APM X-Gene SoC ethernet driver

2014-04-16 Thread Joe Perches
On Wed, 2014-04-16 at 19:39 -0700, Iyappan Subramanian wrote:
> This patch adds a MAINTAINERS entry for APM X-Gene SoC
> ethernet driver.
[]
> diff --git a/MAINTAINERS b/MAINTAINERS
[]
> @@ -686,6 +686,14 @@ S:   Maintained
>  F:   drivers/net/appletalk/
>  F:   net/appletalk/
>  
> +APPLIED MICRO (APM) X-GENE SOC ETHERNET DRIVER
> +M:   Iyappan Subramanian 
> +M:   Keyur Chudgar 
> +M:   Ravi Patel 
> +S:   Maintained
> +F:   drivers/net/ethernet/apm/xgene

Please add a terminating slash to show this is a directory.

F:  drivers/net/ethernet/apm/xgene/




Re: [PATCH v3 4/4] drivers: net: Add APM X-Gene SoC ethernet driver support.

2014-04-16 Thread Joe Perches
On Wed, 2014-04-16 at 19:39 -0700, Iyappan Subramanian wrote:
> This patch adds network driver for APM X-Gene SoC ethernet.
[]
> diff --git a/drivers/net/ethernet/apm/xgene/Kconfig 
> b/drivers/net/ethernet/apm/xgene/Kconfig
[]
> @@ -0,0 +1,10 @@
> +config NET_XGENE
> + tristate "APM X-Gene SoC Ethernet Driver"
> + select PHYLIB
> + default y

default y? 

Shouldn't this need a depends on too?

> diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c 
> b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c

> +static void xgene_enet_ring_set_type(u32 *ring_cfg, u8 is_bufpool)
> +{

bool is_bufpool?

> +static void xgene_enet_set_ring_id(struct xgene_enet_desc_ring *ring)
> +{
> + u32 ring_id_val;
> + u32 ring_id_buf;
> + u8 is_bufpool = IS_FP(ring->id);

bool?

> +
> + ring_id_val = ring->id & GENMASK(9, 0);
> + ring_id_val |= (1 << 31) & GENMASK(31, 31);

Setting a single bit and masking with the same
bit looks silly.

> +
> + ring_id_buf = (ring->num << 9) & GENMASK(18, 9);
> + ring_id_buf |= ((u32) is_bufpool << 20) & GENMASK(20, 20);

And here.

> + ring_id_buf |= (1U << 21) & GENMASK(21, 21);

This too.
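
To illustrate the three comments above, a small userspace sketch. BIT32()
and GENMASK32() are stand-ins written for this example (assumptions, not the
kernel's own macros), and the two helper functions are hypothetical. It shows
that masking a single shifted bit with a mask of that same bit changes
nothing.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel's BIT() and GENMASK() on 32-bit values. */
#define BIT32(n)        (UINT32_C(1) << (n))
#define GENMASK32(h, l) ((UINT32_MAX << (l)) & (UINT32_MAX >> (31 - (h))))

/* As posted: shift a bit into place, then mask with that same single bit. */
static uint32_t ring_id_val_masked(uint32_t id)
{
	uint32_t val = id & GENMASK32(9, 0);

	val |= (UINT32_C(1) << 31) & GENMASK32(31, 31);
	return val;
}

/* Equivalent result without the redundant mask. */
static uint32_t ring_id_val_plain(uint32_t id)
{
	return (id & GENMASK32(9, 0)) | BIT32(31);
}
```

Both helpers return the same value for any id, which is why the extra
"& GENMASK(31, 31)" only adds noise.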

[]

> +static void xgene_enet_rd_mcx_stats(struct xgene_enet_pdata *pdata,
> + u32 rd_addr, u32 *rd_data)
> +{
[]
> + if (!ret)
> + netdev_err(pdata->ndev, "MCX stats read failed, addr: %04x",
> +rd_addr);

Missing newline

> +/* Start Statistics related functions */
> +static void xgene_gmac_get_rx_stats(struct xgene_enet_pdata *pdata,
> + struct xgene_enet_rx_stats *rx_stat)
> +{
> + xgene_enet_rd_mcx_stats(pdata, RBYT_ADDR, _stat->rx_byte_count);
> + xgene_enet_rd_mcx_stats(pdata, RPKT_ADDR, _stat->rx_packet_count);
> + xgene_enet_rd_mcx_stats(pdata, RDRP_ADDR, _stat->rx_drop_pkt_count);
> + xgene_enet_rd_mcx_stats(pdata, RFCS_ADDR, _stat->rx_fcs_err_count);
> + xgene_enet_rd_mcx_stats(pdata, RFLR_ADDR,
> + _stat->rx_frm_len_err_pkt_count);
> + xgene_enet_rd_mcx_stats(pdata, RALN_ADDR,
> + _stat->rx_alignment_err_pkt_count);
> + xgene_enet_rd_mcx_stats(pdata, ROVR_ADDR,
> + _stat->rx_oversize_pkt_count);
> + xgene_enet_rd_mcx_stats(pdata, RUND_ADDR,
> + _stat->rx_undersize_pkt_count);
> +
> + rx_stat->rx_byte_count &= RX_BYTE_CNTR_MASK;
> + rx_stat->rx_packet_count &= RX_PKT_CNTR_MASK;
> + rx_stat->rx_drop_pkt_count &= RX_DROPPED_PKT_CNTR_MASK;
> + rx_stat->rx_fcs_err_count &= RX_FCS_ERROR_CNTR_MASK;
> + rx_stat->rx_frm_len_err_pkt_count &= RX_LEN_ERR_CNTR_MASK;
> + rx_stat->rx_alignment_err_pkt_count &= RX_ALIGN_ERR_CNTR_MASK;
> + rx_stat->rx_oversize_pkt_count &= RX_OVRSIZE_PKT_CNTR_MASK;
> + rx_stat->rx_undersize_pkt_count &= RX_UNDRSIZE_PKT_CNTR_MASK;
> +}
> +
> +static void xgene_gmac_get_tx_stats(struct xgene_enet_pdata *pdata,
> + struct xgene_enet_tx_stats *tx_stats)
> +{
> + xgene_enet_rd_mcx_stats(pdata, TBYT_ADDR, _stats->tx_byte_count);
> + xgene_enet_rd_mcx_stats(pdata, TPKT_ADDR, _stats->tx_pkt_count);
> + xgene_enet_rd_mcx_stats(pdata, TDRP_ADDR, _stats->tx_drop_frm_count);
> + xgene_enet_rd_mcx_stats(pdata, TFCS_ADDR,
> + _stats->tx_fcs_err_frm_count);
> + xgene_enet_rd_mcx_stats(pdata, TUND_ADDR,
> + _stats->tx_undersize_frm_count);
> +
> + tx_stats->tx_byte_count &= TX_BYTE_CNTR_MASK;
> + tx_stats->tx_pkt_count &= TX_PKT_CNTR_MASK;
> + tx_stats->tx_drop_frm_count &= TX_DROP_FRAME_CNTR_MASK;
> + tx_stats->tx_fcs_err_frm_count &= TX_FCS_ERROR_CNTR_MASK;
> + tx_stats->tx_undersize_frm_count &= TX_UNDSIZE_FRAME_CNTR_MASK;
> +}

Pity about the masks


> +#define RX_BYTE_CNTR_MASK0x7fff
> +#define RX_PKT_CNTR_MASK 0x7fff
> +#define RX_FCS_ERROR_CNTR_MASK   0x
> +#define RX_ALIGN_ERR_CNTR_MASK   0x
> +#define RX_LEN_ERR_CNTR_MASK 0x
> +#define RX_UNDRSIZE_PKT_CNTR_MASK0x
> +#define RX_OVRSIZE_PKT_CNTR_MASK 0x
> +#define RX_DROPPED_PKT_CNTR_MASK 0x
> +#define TX_BYTE_CNTR_MASK0x7fff
> +#define TX_PKT_CNTR_MASK 0x7fff
> +#define TX_DROP_FRAME_CNTR_MASK  0x
> +#define TX_FCS_ERROR_CNTR_MASK   0x0fff
> +#define TX_UNDSIZE_FRAME_CNTR_MASK   0x0fff

Any of these going to possibly overrun their
counter size between polls?
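
One common way to cope with narrow hardware counters, sketched here under
stated assumptions: the counter is free-running, wraps at mask + 1, and is
polled often enough that it cannot wrap more than once between reads. Whether
the X-Gene counters behave like that is not stated in the patch, and both the
mask value and the helper name below are hypothetical.

```c
#include <stdint.h>

/* Hypothetical counter width for illustration; the real masks are the driver's. */
#define EXAMPLE_CNTR_MASK UINT32_C(0x0fff)

/*
 * Delta between two reads of a free-running counter that wraps at mask + 1.
 * 'mask' must be of the form 2^n - 1. The result is only correct if fewer
 * than mask + 1 events happened between the two reads.
 */
static uint32_t cntr_delta(uint32_t prev, uint32_t now, uint32_t mask)
{
	return (now - prev) & mask;
}
```

A driver would accumulate these deltas into wider software counters. If the
poll interval is too long for the narrowest counter, events are silently
lost, which is the concern raised above.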

[]
> +static struct xgene_enet_desc_ring *xgene_enet_create_desc_ring(
> + struct net_device *ndev, u32 ring_num,
> + enum xgene_enet_ring_cfgsize cfgsize, u32 ring_id)
> +{
> + struct xgene_enet_desc_ring *ring;
> + struct xgene_enet_pdata *pdata = netdev_priv(ndev);
> + struct device 

Re: [PATCH RESEND 2/2] staging: binder: Code simplification

2014-04-16 Thread Greg KH
On Tue, Apr 15, 2014 at 12:03:06PM +0200, Mathieu Maret wrote:
> Remove duplicate code
> 
> Signed-off-by: Mathieu Maret 
> ---
>  drivers/staging/android/binder.c | 7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/staging/android/binder.c 
> b/drivers/staging/android/binder.c
> index 3dca577..c29c3c7 100644
> --- a/drivers/staging/android/binder.c
> +++ b/drivers/staging/android/binder.c
> @@ -2686,11 +2686,8 @@ static long binder_ioctl(struct file *filp, unsigned 
> int cmd, unsigned long arg)
>   case BINDER_VERSION:{
>   struct binder_version __user *ver = ubuf;
>  
> - if (size != sizeof(struct binder_version)) {
> - ret = -EINVAL;
> - goto err;
> - }
> - if (put_user(BINDER_CURRENT_PROTOCOL_VERSION,
> + if (size != sizeof(struct binder_version) ||
> + put_user(BINDER_CURRENT_PROTOCOL_VERSION,
>>protocol_version)) {
>   ret = -EINVAL;
>   goto err;

I agree with Dan, the original code was easier to read.

thanks,

greg k-h


Re: [PATCH 3/3] Staging:android:uapi:binder.h __packed

2014-04-16 Thread Greg KH
On Tue, Apr 08, 2014 at 07:02:04PM +0100, Paul McQuade wrote:
> WARNING: __packed is preferred over __attribute__((packed))
> 
> Signed-off-by: Paul McQuade 
> ---
>  drivers/staging/android/uapi/binder.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Not on uapi .h files, sorry :(
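
Some background on why, as far as I understand it: headers under uapi/ are
also compiled by userspace programs, where the kernel-internal __packed
shorthand is not guaranteed to be defined, so they spell out the compiler
attribute. A minimal userspace sketch with hypothetical structs (not taken
from binder.h):

```c
#include <stdint.h>

/* Hypothetical record for illustration; not a structure from binder.h. */
struct example_record {
	uint8_t  tag;
	uint32_t value;
} __attribute__((packed));	/* spelled out so that userspace can compile it */

/* Same fields without the attribute: the compiler may insert padding. */
struct example_record_padded {
	uint8_t  tag;
	uint32_t value;
};
```

On common ABIs the packed variant is 5 bytes and the padded one is 8, so
the attribute is part of the binary interface and has to stay in a form
every consumer of the header understands.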



[PATCH v2] kernel/panic: Add "crash_kexec_post_notifiers" option for kdump after panic_notifers

2014-04-16 Thread Masami Hiramatsu
Add a "crash_kexec_post_notifiers" option to run kdump after running the
panic notifiers and dumping kmsg. This can help in rare situations where
kdump fails because of an unstable crashed kernel or a hardware failure
(memory corruption of critical data/code), or where the 2nd kernel has
already been broken by the 1st kernel (that is broken behavior, but who
can guarantee that the "crashed" kernel works correctly?).

Usage: add "crash_kexec_post_notifiers" to the kernel boot options.

Note that this actually increases the risk of kdump failure.
Set this option only if you care more about getting the notifier and kmsg
output in the rare case where kdump fails than about maximizing kdump's
chance of success.

Changes from v1:
 - Rename late_kdump option to crash_kexec_post_notifiers.
 - Remove unneeded warning message.

Signed-off-by: Masami Hiramatsu 
Cc: Eric Biederman 
Cc: Vivek Goyal 
Cc: Andrew Morton 
Cc: Yoshihiro YUNOMAE 
Cc: Satoru MORIYA 
Cc: Motohiro Kosaki 
Cc: Tomoki Sekiyama 
---
 Documentation/kernel-parameters.txt |8 
 kernel/panic.c  |   25 +++--
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index 03e50b4..1df416b 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2339,6 +2339,14 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
timeout < 0: reboot immediately
Format: 
 
+   crash_kexec_post_notifiers
+   Run kdump after running panic-notifiers and dumping
+   kmsg. This only for the users who doubt kdump always
+   succeeds in any situation.
+   Note that this also increases risks of kdump failure,
+   because some panic notifiers can make the crashed
+   kernel more unstable.
+
parkbd.port=[HW] Parallel port number the keyboard adapter is
connected to, default is 0.
Format: 
diff --git a/kernel/panic.c b/kernel/panic.c
index d02fa9f..0c99c8c 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -32,6 +32,7 @@ static unsigned long tainted_mask;
 static int pause_on_oops;
 static int pause_on_oops_flag;
 static DEFINE_SPINLOCK(pause_on_oops_lock);
+static bool crash_kexec_post_notifiers;
 
 int panic_timeout = CONFIG_PANIC_TIMEOUT;
 EXPORT_SYMBOL_GPL(panic_timeout);
@@ -112,9 +113,13 @@ void panic(const char *fmt, ...)
/*
 * If we have crashed and we have a crash kernel loaded let it handle
 * everything else.
-* Do we want to call this before we try to display a message?
+* If we want to run this after calling panic_notifiers, pass
+* the "crash_kexec_post_notifiers" option to the kernel.
 */
-   crash_kexec(NULL);
+   if (!crash_kexec_post_notifiers)
+   crash_kexec(NULL);
+   else
+   pr_emerg("Warning: crash_kexec_post_notifiers is set.\n");
 
/*
 * Note smp_send_stop is the usual smp shutdown function, which
@@ -131,6 +136,15 @@ void panic(const char *fmt, ...)
 
kmsg_dump(KMSG_DUMP_PANIC);
 
+   /*
+* If you doubt kdump always works fine in any situation,
+* "crash_kexec_post_notifiers" offers you a chance to run 
+* panic_notifiers and dumping kmsg before kdump.
+* Note: since some panic_notifiers can make crashed kernel
+* more unstable, it can increase risks of the kdump failure too.
+*/
+   crash_kexec(NULL);
+
bust_spinlocks(0);
 
if (!panic_blink)
@@ -472,6 +486,13 @@ EXPORT_SYMBOL(__stack_chk_fail);
 core_param(panic, panic_timeout, int, 0644);
 core_param(pause_on_oops, pause_on_oops, int, 0644);
 
+static int __init setup_crash_kexec_post_notifiers(char *s)
+{
+   crash_kexec_post_notifiers = true;
+   return 0;
+}
+early_param("crash_kexec_post_notifiers", setup_crash_kexec_post_notifiers);
+
 static int __init oops_setup(char *s)
 {
if (!s)
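
The ordering this patch sets up in panic() can be modelled in userspace as
follows. This is only a sketch: panic_order_model() and its trace letters are
invented for illustration, smp_send_stop() and the rest of panic() are left
out, and "crash_kernel_loaded" stands for whether a kdump kernel has been
loaded (crash_kexec() is assumed to return without effect otherwise).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace model of the step order in panic() after this patch.
 * Appends one letter per step to 'trace' (len must be at least 1):
 *   K = crash_kexec() boots the crash kernel (does not return)
 *   N = panic notifiers run
 *   D = kmsg_dump() runs
 */
static void panic_order_model(bool post_notifiers, bool crash_kernel_loaded,
			      char *trace, size_t len)
{
	trace[0] = '\0';

	/* First call site: skipped when crash_kexec_post_notifiers is set. */
	if (!post_notifiers && crash_kernel_loaded) {
		strncat(trace, "K", len - strlen(trace) - 1);
		return;		/* kdump takes over before the notifiers */
	}

	strncat(trace, "N", len - strlen(trace) - 1);
	strncat(trace, "D", len - strlen(trace) - 1);

	/* Second call site: reached only if the first one did not fire. */
	if (crash_kernel_loaded)
		strncat(trace, "K", len - strlen(trace) - 1);
}
```

With the option set and a crash kernel loaded the trace is "NDK"; without
the option it is "K", which matches the behavior before the patch.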




Re: [PATCH] FMC: misc_register should not be called while atomic

2014-04-16 Thread Greg KH
On Thu, Apr 17, 2014 at 12:39:42AM +0200, Alessandro Rubini wrote:
> >> --- a/drivers/fmc/fmc-chardev.c
> >> +++ b/drivers/fmc/fmc-chardev.c
> >> @@ -141,8 +141,8 @@ static int fc_probe(struct fmc_device *fmc)
> >>fc->misc.fops = _fops;
> >>fc->misc.name = kstrdup(dev_name(>dev), GFP_KERNEL);
> >>  
> >> -  spin_lock(_lock);
> >>ret = misc_register(>misc);
> >> +  spin_lock(_lock);
> >>if (ret < 0) {
> >>kfree(fc->misc.name);
> >>kfree(fc);
> >> -- 
> >> 1.9.1
> 
> This is already applied, though in a slightly different way. Commit
> 783c2fb1b. I fixed it in Jul 13 2013 after user reports.
> 
> "git tag --contains 783c2fb1b" reports v3.12-rc1 and later

Then why in the world is it being submitted as a patch to be included?
What is going on here?

confused,

greg k-h


Re: [RFC PATCH v3] Use kernfs_break_active_protection() for device online store callbacks

2014-04-16 Thread Li Zhong
On Wed, 2014-04-16 at 11:17 -0400, Tejun Heo wrote:
> Hello,
> 
> On Wed, Apr 16, 2014 at 09:41:40AM +0800, Li Zhong wrote:
> > > If so, that is
> > > an actually possible deadlock, no?
> > 
> > Yes, but it seems to me that it is solved in commit 5e33bc41, which uses
> > lock_device_hotplug_sysfs() to return a restart syscall error if not
> > able to try lock the device_hotplug_lock. That also requires the device
> > removing code path to take the device_hotplug_lock. 
> 
> But that patch only takes out device_hotplug_lock out of the
> dependency graph and does nothing for cpu_add_remove_lock.  It seems
> to be that there still is a deadlock condition involving s_active and
> cpu_add_remove_lock.  Am I missing something here?

It seems to me cpu_add_remove_lock is always taken after
device_hotplug_lock.

So if cpu_add_remove_lock has been acquired by the device removal process,
then the other online/offline process cannot successfully trylock
device_hotplug_lock, and will release s_active with a restart-syscall
error;

if cpu_add_remove_lock has been acquired by the online/offline process, then
it must already hold device_hotplug_lock, which keeps the device removal
process waiting at device_hotplug_lock. So the online/offline process can
release the locks, and finally release s_active soon.

But after some further thinking, I seem to understand your point.
s_active has a lock-order problem with the other series of hotplug-related
locks, so it's better to take s_active out of the dependency chain,
rather than the first of the other series of locks, as you suggested
below.
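
The "trylock and restart" behaviour described above can be sketched in
userspace like this. It is a simplified model, not the kernel code: the
atomic flag stands in for device_hotplug_lock, -EAGAIN stands in for the
restart-syscall error, and all function names are hypothetical.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for device_hotplug_lock; a userspace flag, not the kernel mutex. */
static atomic_flag hotplug_lock = ATOMIC_FLAG_INIT;

static bool hotplug_trylock(void)
{
	/* test_and_set returns the previous state: false means we got the lock */
	return !atomic_flag_test_and_set(&hotplug_lock);
}

static void hotplug_unlock(void)
{
	atomic_flag_clear(&hotplug_lock);
}

/*
 * Model of a store callback that already holds s_active. It only tries the
 * hotplug lock and backs off instead of blocking, so it never waits on a
 * remover that holds the lock and is itself waiting for s_active to drain.
 * Returns 0 on success, -EAGAIN to ask the caller to drop s_active and retry.
 */
static int online_store_model(void (*do_online)(void))
{
	if (!hotplug_trylock())
		return -EAGAIN;

	if (do_online)
		do_online();

	hotplug_unlock();
	return 0;
}
```

This only removes the first lock of the hotplug series from the dependency
graph, which is the limitation being discussed in this thread.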

> 
> Now that kernfs has a proper mechanism to deal with it, wouldn't it
> make more sense to replace 5e33bc41 with proper s_active protection
> breaking?

I'll try this way and send you the code for review.

Thanks,
Zhong

> 
> Thanks.
> 




Re: [PATCH 0/3] Use an alternative to _PAGE_PROTNONE for _PAGE_NUMA v4

2014-04-16 Thread Fengguang Wu
On Tue, Apr 15, 2014 at 03:41:13PM +0100, Mel Gorman wrote:
> Fengguang Wu found that an earlier version crashed on his
> tests. This version passed tests running with DEBUG_VM and
> DEBUG_PAGEALLOC. Fengguang, another test would be appreciated and
> if it helps this series is the mm-numa-use-high-bit-v4r3 branch in
> git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma.git

Hi Mel,

We noticed the changes below. The last_state.is_incomplete_run 0=>1 change
means the test box failed to boot up. Unfortunately we don't have
serial console output for this test box, so it may be hard to find the
root cause. Anyway, I'll try to bisect it to make debugging easier.

Fengguang

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
   864.70 ~ 5%      -27.5%     627.28        snb-drag/sysbench/fileio/600s-100%-1HDD-ext4-64G-1024-seqwr-sync
   178.20 ~104%     -99.9%       0.13        snb-drag/sysbench/fileio/600s-100%-1HDD-xfs-64G-1024-rndwr-sync
  1042.90 ~22%      -39.8%     627.41        TOTAL fileio.request_latency_max_ms

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
        0            +Inf%          1 ~ 0%   lkp-a04/fake/boot/1
        0            +Inf%          1 ~ 0%   TOTAL last_state.is_incomplete_run

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
       10 ~10%    +1560.0%        166        lkp-ws02/micro/dd-write/11HDD-JBOD-cfq-ext4-10dd
       10 ~10%    +1560.0%        166        TOTAL ftrace.writeback_single_inode.sdg.age

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
     9662 ~19%   +17955.3%    1744594        lkp-snb01/micro/hackbench/1600%-process-socket
     4235 ~22%    +6497.6%     279443        lkp-snb01/micro/hackbench/1600%-threads-socket
     3842 ~ 4%    +1468.7%      60278        lkp-snb01/micro/hackbench/50%-process-pipe
    17181 ~ 3%     +393.1%      84722        lkp-snb01/micro/hackbench/50%-process-socket
    34922 ~10%    +6111.0%    2169037        TOTAL cpuidle.POLL.time

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
       34 ~ 1%      +61.8%         56        lkp-ws02/micro/dd-write/11HDD-JBOD-cfq-ext4-10dd
       34 ~ 1%      +61.8%         56        TOTAL ftrace.balance_dirty_pages.sdl.period

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
      312 ~ 0%      -10.4%        280        snb-drag/sysbench/fileio/600s-100%-1HDD-xfs-64G-1024-rndwr-sync
      312 ~ 0%      -10.4%        280        TOTAL vmstat.memory.buff

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
      350 ~ 3%    +3056.4%      11057        lkp-ws02/micro/dd-write/11HDD-JBOD-cfq-ext4-100dd
      340 ~ 2%    +1792.2%       6443        lkp-ws02/micro/dd-write/11HDD-JBOD-cfq-ext4-10dd
      690 ~ 3%    +2433.3%      17500        TOTAL interrupts.125:PCI-MSI-edge.eth1-TxRx-4

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
      310 ~ 1%    +2021.3%       6595        lkp-ws02/micro/dd-write/11HDD-JBOD-cfq-ext4-1dd
      310 ~ 1%    +2021.3%       6595        TOTAL interrupts.127:PCI-MSI-edge.eth1-TxRx-6

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
      157 ~ 4%     +118.4%        344        lkp-sb03/micro/nepim/300s-100%-udp
      157 ~ 4%     +118.4%        344        TOTAL interrupts.101:PCI-MSI-edge.eth1-TxRx-1

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
    40.55 ~ 0%      -50.6%      20.02        lkp-a05/fake/boot/1
    40.55 ~ 0%      -50.6%      20.02        TOTAL boottime.dhcp

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
    60.79 ~ 0%      -47.8%      31.74        lkp-a05/fake/boot/1
    60.79 ~ 0%      -47.8%      31.74        TOTAL boottime.boot

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
      186 ~ 0%      -46.1%        100        lkp-a05/fake/boot/1
      186 ~ 0%      -46.1%        100        TOTAL boottime.idle

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
      157 ~ 3%      +84.7%        290        lkp-a05/micro/iperf/300s-tcp
      157 ~ 3%      +84.7%        290        TOTAL interrupts.50:PCI-MSI-edge.eth0-tx-0

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
   212639 ~ 1%      +48.1%     314887        lkp-sb03/micro/nepim/300s-25%-udp
   212639 ~ 1%      +48.1%     314887        TOTAL interrupts.LOC

          v3.14  685561ea2d015cb90c45504ec
---------------  -------------------------
      207 ~ 2%      +35.8%        282        lkp-a05/micro/iperf/300s-tcp
      207 ~ 2%      +35.8%        282        TOTAL interrupts.47:PCI-MSI-edge.eth0-rx-1
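
For reading the tables above: the middle column appears to be the relative
change from the v3.14 value to the value measured on the tested commit. The
helper below is a sketch of that arithmetic with a hypothetical name, not
the code of the test robot itself. Some rows differ from it in the last
digits, presumably because the printed averages are rounded.

```c
/*
 * Relative change from 'base' to 'head', in percent.
 * Returns 0 and stores the result in *pct, or -1 if base is zero
 * (the tables print that case as +Inf%).
 */
static int percent_change(double base, double head, double *pct)
{
	if (base == 0.0)
		return -1;

	*pct = (head - base) / base * 100.0;
	return 0;
}
```

For the first row, percent_change(864.70, 627.28, &pct) gives roughly -27.5,
in line with the reported value.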


linux-next: manual merge of the crypto tree with Linus' tree

2014-04-16 Thread Stephen Rothwell
Hi Herbert,

Today's linux-next merge of the crypto tree got a conflict in
drivers/crypto/bfin_crc.h between commit 3356c99ea392 ("bfin_crc: Move
architecture independant crc header file out of the blackfin folder")
from Linus' tree and commit 52e6e543f2d8 ("crypto: bfin_crc - access crc
registers by readl and writel functions") from the crypto tree.

I fixed it up (I kept the file under its new name) and can carry the fix
as necessary (no action is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




[f2fs-dev][PATCH] f2fs: introduce raw_nat_from_node_info() to simplfy codes

2014-04-16 Thread Chao Yu
This patch introduces raw_nat_from_node_info() to simplify some code, and also
uses the existing function node_info_from_raw_nat() in place of open-coded
conversions.

Signed-off-by: Chao Yu 
---
 fs/f2fs/node.c |   15 +++
 fs/f2fs/node.h |8 
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index f760793..84f9b7b 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -179,9 +179,7 @@ retry:
write_unlock(_i->nat_tree_lock);
goto retry;
}
-   nat_set_blkaddr(e, le32_to_cpu(ne->block_addr));
-   nat_set_ino(e, le32_to_cpu(ne->ino));
-   nat_set_version(e, ne->version);
+   node_info_from_raw_nat(>ni, ne);
}
write_unlock(_i->nat_tree_lock);
 }
@@ -1755,9 +1753,7 @@ retry:
write_unlock(_i->nat_tree_lock);
goto retry;
}
-   nat_set_blkaddr(ne, le32_to_cpu(raw_ne.block_addr));
-   nat_set_ino(ne, le32_to_cpu(raw_ne.ino));
-   nat_set_version(ne, raw_ne.version);
+   node_info_from_raw_nat(>ni, _ne);
__set_nat_cache_dirty(nm_i, ne);
write_unlock(_i->nat_tree_lock);
}
@@ -1790,7 +1786,6 @@ void flush_nat_entries(struct f2fs_sb_info *sbi)
nid_t nid;
struct f2fs_nat_entry raw_ne;
int offset = -1;
-   block_t new_blkaddr;
 
if (nat_get_blkaddr(ne) == NEW_ADDR)
continue;
@@ -1826,11 +1821,7 @@ to_nat_page:
f2fs_bug_on(!nat_blk);
raw_ne = nat_blk->entries[nid - start_nid];
 flush_now:
-   new_blkaddr = nat_get_blkaddr(ne);
-
-   raw_ne.ino = cpu_to_le32(nat_get_ino(ne));
-   raw_ne.block_addr = cpu_to_le32(new_blkaddr);
-   raw_ne.version = nat_get_version(ne);
+   raw_nat_from_node_info(_ne, >ni);
 
if (offset < 0) {
nat_blk->entries[nid - start_nid] = raw_ne;
diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h
index 5decc1a..41bb65b 100644
--- a/fs/f2fs/node.h
+++ b/fs/f2fs/node.h
@@ -75,6 +75,14 @@ static inline void node_info_from_raw_nat(struct node_info 
*ni,
ni->version = raw_ne->version;
 }
 
+static inline void raw_nat_from_node_info(struct f2fs_nat_entry *raw_ne,
+   struct node_info *ni)
+{
+   raw_ne->ino = cpu_to_le32(ni->ino);
+   raw_ne->block_addr = cpu_to_le32(ni->blk_addr);
+   raw_ne->version = ni->version;
+}
+
 enum nid_type {
FREE_NIDS,  /* indicates the free nid list */
NAT_ENTRIES /* indicates the cached nat entry */
-- 
1.7.9.5
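
A userspace sketch of the pair of helpers, to show that they are inverses of
each other. The structs are cut-down stand-ins (struct raw_nat_entry is a
hypothetical substitute for struct f2fs_nat_entry), and cpu_to_le32() and
le32_to_cpu() are reimplemented here for illustration rather than taken from
the kernel.

```c
#include <stdint.h>
#include <string.h>

/* Cut-down userspace stand-ins; field names follow the diff above. */
struct node_info {
	uint32_t ino;
	uint32_t blk_addr;
	uint8_t  version;
};

struct raw_nat_entry {		/* models the on-disk, little-endian entry */
	uint8_t  version;
	uint32_t ino;
	uint32_t block_addr;
};

/* Portable replacements for the kernel's byte-order helpers. */
static uint32_t cpu_to_le32(uint32_t v)
{
	uint8_t b[4] = { v & 0xff, (v >> 8) & 0xff, (v >> 16) & 0xff, (v >> 24) & 0xff };
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

static uint32_t le32_to_cpu(uint32_t v)
{
	uint8_t b[4];

	memcpy(b, &v, sizeof(b));
	return b[0] | (b[1] << 8) | (b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* In-memory node info to on-disk entry. */
static void raw_nat_from_node_info(struct raw_nat_entry *raw_ne,
				   const struct node_info *ni)
{
	raw_ne->ino = cpu_to_le32(ni->ino);
	raw_ne->block_addr = cpu_to_le32(ni->blk_addr);
	raw_ne->version = ni->version;
}

/* On-disk entry back to in-memory node info. */
static void node_info_from_raw_nat(struct node_info *ni,
				   const struct raw_nat_entry *raw_ne)
{
	ni->ino = le32_to_cpu(raw_ne->ino);
	ni->blk_addr = le32_to_cpu(raw_ne->block_addr);
	ni->version = raw_ne->version;
}
```

Keeping both directions as small inline helpers means the endianness
conversion lives in one place instead of being repeated at each call site.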




linux-next: manual merge of the crypto tree with Linus' tree

2014-04-16 Thread Stephen Rothwell
Hi Herbert,

Today's linux-next merge of the crypto tree got a conflict in
drivers/char/hw_random/Kconfig between commit 2257ffbca73c ("hwrng: msm:
switch Kconfig to ARCH_QCOM depends") from Linus' tree and commits
020016183453 ("hwrng: Turn HW_RANDOM into a menuconfig") and 2d9cab5194c8
("hwrng: Fix a few driver dependencies and defaults") from the crypto
tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc drivers/char/hw_random/Kconfig
index 244759bbd7b7,38cfae686cc4..
--- a/drivers/char/hw_random/Kconfig
+++ b/drivers/char/hw_random/Kconfig
@@@ -342,11 -321,12 +321,12 @@@ config HW_RANDOM_TP
  If unsure, say Y.
  
  config HW_RANDOM_MSM
 -  tristate "Qualcomm MSM Random Number Generator support"
 -  depends on ARCH_MSM
 +  tristate "Qualcomm SoCs Random Number Generator support"
-   depends on HW_RANDOM && ARCH_QCOM
++  depends on ARCH_QCOM
+   default HW_RANDOM
---help---
  This driver provides kernel-side support for the Random Number
 -Generator hardware found on Qualcomm MSM SoCs.
 +Generator hardware found on Qualcomm SoCs.
  
  To compile this driver as a module, choose M here. the
  module will be called msm-rng.




[PATCH v3 1/4] MAINTAINERS: Add entry for APM X-Gene SoC ethernet driver

2014-04-16 Thread Iyappan Subramanian
This patch adds a MAINTAINERS entry for APM X-Gene SoC
ethernet driver.

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Ravi Patel 
Signed-off-by: Keyur Chudgar 
---
 MAINTAINERS |8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 11b3937..bc32a01 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -686,6 +686,14 @@ S: Maintained
 F: drivers/net/appletalk/
 F: net/appletalk/
 
+APPLIED MICRO (APM) X-GENE SOC ETHERNET DRIVER
+M: Iyappan Subramanian 
+M: Keyur Chudgar 
+M: Ravi Patel 
+S: Maintained
+F: drivers/net/ethernet/apm/xgene
+F: Documentation/devicetree/bindings/net/apm-xgene-enet.txt
+
 APTINA CAMERA SENSOR PLL
 M: Laurent Pinchart 
 L: linux-me...@vger.kernel.org
-- 
1.7.9.5



[PATCH v3 0/4] net: Add APM X-Gene SoC Ethernet driver support

2014-04-16 Thread Iyappan Subramanian
Adding APM X-Gene SoC Ethernet driver.

v3: Address comments from v2 review
* cleaned up set_desc and get_desc functions
* added dtb mdio node and phy-handle subnode
* renamed dtb phy-mode to phy-connection-type
* added of_phy_connect call to connect to PHY
* added empty line after last local variable declaration
* removed type casting when not required
* removed inline keyword from source files
* removed CONFIG_CPU_BIG_ENDIAN ifdef

v2
* Completely redesigned ethernet driver
* Added support to work with big endian kernel
* Renamed dtb phyid entry to phy_addr
* Changed dtb local-mac-address entry to byte string format
* Renamed dtb eth8clk entry to menetclk

v1
* Initial version

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Ravi Patel 
Signed-off-by: Keyur Chudgar 
---
Iyappan Subramanian (4):
  MAINTAINERS: Add entry for APM X-Gene SoC ethernet driver
  Documentation: dts: Add bindings for APM X-Gene SoC ethernet driver
  dts: Add bindings for APM X-Gene SoC ethernet driver
  drivers: net: Add APM X-Gene SoC ethernet driver support.

 .../devicetree/bindings/net/apm-xgene-enet.txt |   66 ++
 MAINTAINERS|8 +
 arch/arm64/boot/dts/apm-mustang.dts|4 +
 arch/arm64/boot/dts/apm-storm.dtsi |   27 +-
 drivers/net/ethernet/Kconfig   |1 +
 drivers/net/ethernet/Makefile  |1 +
 drivers/net/ethernet/apm/Kconfig   |1 +
 drivers/net/ethernet/apm/Makefile  |5 +
 drivers/net/ethernet/apm/xgene/Kconfig |   10 +
 drivers/net/ethernet/apm/xgene/Makefile|6 +
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c |  846 +
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.h |  380 
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c   |  952 
 drivers/net/ethernet/apm/xgene/xgene_enet_main.h   |  136 +++
 14 files changed, 2440 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/apm-xgene-enet.txt
 create mode 100644 drivers/net/ethernet/apm/Kconfig
 create mode 100644 drivers/net/ethernet/apm/Makefile
 create mode 100644 drivers/net/ethernet/apm/xgene/Kconfig
 create mode 100644 drivers/net/ethernet/apm/xgene/Makefile
 create mode 100644 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
 create mode 100644 drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
 create mode 100644 drivers/net/ethernet/apm/xgene/xgene_enet_main.c
 create mode 100644 drivers/net/ethernet/apm/xgene/xgene_enet_main.h

-- 
1.7.9.5



[PATCH v3 2/4] Documentation: dts: Add bindings for APM X-Gene SoC ethernet driver

2014-04-16 Thread Iyappan Subramanian
This patch adds documentation for APM X-Gene SoC ethernet DTS binding.

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Ravi Patel 
Signed-off-by: Keyur Chudgar 
---
 .../devicetree/bindings/net/apm-xgene-enet.txt |   66 
 1 file changed, 66 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/apm-xgene-enet.txt

diff --git a/Documentation/devicetree/bindings/net/apm-xgene-enet.txt b/Documentation/devicetree/bindings/net/apm-xgene-enet.txt
new file mode 100644
index 000..9ad6fe1
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/apm-xgene-enet.txt
@@ -0,0 +1,66 @@
+APM X-Gene SoC Ethernet nodes
+
+Ethernet nodes are defined to describe on-chip ethernet interfaces in
+APM X-Gene SoC.
+
+Required properties:
+- compatible:  Should be "apm,xgene-enet"
+- reg: First resource is the ethernet base register set
+   Second resource is the ring base register set
+   Third resource is the ring command register set
+- interrupts:  Ethernet main interrupt
+- clocks:  Reference to the clock entry.
+- local-mac-address:   Ethernet MAC address.
+- phy-connection-type: Ethernet MII mode.
+
+- mdio device tree subnode: When the X-Gene SoC has a phy connected to its local
+   mdio, there must be device tree subnode with the following
+   required properties:
+
+   - compatible: Must be "apm,xgene-mdio".
+   - #address-cells: Must be <1>.
+   - #size-cells: Must be <0>.
+
+   For the phy on the mdio bus, there must be a node with the following
+   fields:
+
+   - reg: phy id used to communicate to phy.
+
+Optional properties:
+- status   : Should be "ok" or "disabled" for enabled/disabled.
+ Default is "ok".
+
+Example:
+   menetclk: menetclk {
+   compatible = "apm,xgene-device-clock";
+   clock-output-names = "menetclk";
+   status = "ok";
+   };
+
+   menet: ethernet@1702 {
+   compatible = "apm,xgene-enet";
+   status = "disabled";
+   reg = <0x0 0x1702 0x0 0xd100>,
+ <0x0 0X1703 0x0 0X400>,
+ <0x0 0X1000 0x0 0X200>;
+   interrupts = <0x0 0x3c 0x4>;
+   clocks = <&menetclk 0>;
+   local-mac-address = [00 01 73 00 00 01];
+   phy-connection-type = "rgmii";
+   phy-handle = <&menetphy>;
+   mdio {
+   compatible = "apm,xgene-mdio";
+   #address-cells = <1>;
+   #size-cells = <0>;
+   menetphy: menetphy@3 {
+   reg = <0x3>;
+   };
+
+   };
+   };
+
+/* Board-specific peripheral configurations */
+
+&menet {
+status = "ok";
+};
-- 
1.7.9.5



Re: Re: [PATCH V5.1] serial/uart/8250: Add tunable RX interrupt trigger I/F of FIFO buffers

2014-04-16 Thread Yoshihiro YUNOMAE

Hi Stephen,

Thank you for your reply.

(2014/04/17 2:04), Stephen Warren wrote:

On 04/15/2014 08:08 PM, Yoshihiro YUNOMAE wrote:

diff --git a/drivers/tty/serial/8250/8250_core.c
b/drivers/tty/serial/8250/8250_core.c



@@ -2275,10 +2276,9 @@ serial8250_do_set_termios(struct uart_port
*port, struct ktermios *termios,



   if (up->capabilities & UART_CAP_FIFO && port->fifosize > 1) {
-fcr = uart_config[port->type].fcr;
-if ((baud < 2400 && !up->dma) || fifo_bug) {
-fcr &= ~UART_FCR_TRIGGER_MASK;
-fcr |= UART_FCR_TRIGGER_1;
+/* NOTE: If fifo_bug is not set, a user can set RX_trigger. */
+if ((baud < 2400 && !up->dma &&
+(up->fcr == uart_config[port->type].fcr)) ||
up->fifo_bug) {
+up->fcr &= ~UART_FCR_TRIGGER_MASK;
+up->fcr |= UART_FCR_TRIGGER_1;
   }
   }


Does the "(up->fcr == uart_config[port->type].fcr)" term prevent the
user from changing the trigger level multiple times? Perhaps this is
intended?


No, this means that if a user changed the FCR value before setting termios,
the changed value is used, because the user expects the changed value to
always be in effect. But I thought this is not straightforward, and it does
not help when the user wants to use the default FCR value.
Could I add an FCR-changed flag (user_changed_fcr) to the uart_8250_port
structure and check the flag here?
Or shouldn't the driver check for user changes at all?


Oh, I wasn't aware that the user could change FCR directly. To be
honest, I'm not sure of the best way to resolve that kind of conflict...


OK. For simplicity, I won't implement the check.
Even if FCR is changed here, users can change it again at any time, so this is
not a big problem, I think.

Thanks,
Yoshihiro YUNOMAE

--
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae...@hitachi.com
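
The condition in the quoted hunk can be modelled in user space to see which cases keep a user-selected trigger level. This is only a sketch of that v5.1 condition, not the final driver logic: pick_fcr() and its parameters are hypothetical names, and the three register constants are copied from include/uapi/linux/serial_reg.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/serial_reg.h. */
#define UART_FCR_TRIGGER_MASK 0xC0
#define UART_FCR_TRIGGER_1    0x00
#define UART_FCR_TRIGGER_8    0x80

/*
 * Hypothetical model of the quoted condition: fall back to a 1-byte RX
 * trigger when the FIFO is known to be buggy, or when the baud rate is
 * low, DMA is not used, and the FCR still holds the port-type default
 * (that is, the user has not changed it).
 */
static inline uint8_t pick_fcr(uint8_t fcr, uint8_t default_fcr,
                               unsigned int baud, bool dma, bool fifo_bug)
{
    bool low_baud_default = baud < 2400 && !dma && fcr == default_fcr;

    if (low_baud_default || fifo_bug) {
        fcr = (uint8_t)(fcr & ~UART_FCR_TRIGGER_MASK);
        fcr |= UART_FCR_TRIGGER_1;
    }
    return fcr;
}
```

With a default of 0x81 (FIFO enabled, trigger at 8 bytes), a user-selected 0x41 survives a low baud rate, while an untouched default is reduced to 0x01. The model also shows the limitation raised in the thread: a user who deliberately wants the default value at a low baud rate cannot be told apart from one who never changed it.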




Re: [PATCH 10/19] NET: set PF_FSTRANS while holding sk_lock

2014-04-16 Thread NeilBrown
On Wed, 16 Apr 2014 09:00:02 -0400 (EDT) David Miller 
wrote:

> From: Eric Dumazet 
> Date: Tue, 15 Apr 2014 22:13:46 -0700
> 
> > For applications handling millions of sockets, this makes a difference.
> 
> Indeed, this really is not acceptable.

As you say...
I've just discovered that I can get rid of the lockdep message (and hence
presumably the deadlock risk) with a well placed:

newsock->sk->sk_allocation = GFP_NOFS;

which surprised me as it seemed to be an explicit GFP_KERNEL allocation that
was mentioned in the lockdep trace.  Obviously these traces require quite
some sophistication to understand.

So - thanks for the feedback, patch can be ignored.

Thanks,
NeilBrown




[PATCH v3 3/4] dts: Add bindings for APM X-Gene SoC ethernet driver

2014-04-16 Thread Iyappan Subramanian
This patch adds bindings for APM X-Gene SoC ethernet driver.

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Ravi Patel 
Signed-off-by: Keyur Chudgar 
---
 arch/arm64/boot/dts/apm-mustang.dts |4 
 arch/arm64/boot/dts/apm-storm.dtsi  |   27 ---
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/boot/dts/apm-mustang.dts b/arch/arm64/boot/dts/apm-mustang.dts
index 1247ca1..e2fb1ef 100644
--- a/arch/arm64/boot/dts/apm-mustang.dts
+++ b/arch/arm64/boot/dts/apm-mustang.dts
@@ -24,3 +24,7 @@
reg = < 0x1 0x 0x0 0x8000 >; /* Updated by 
bootloader */
};
 };
+
+&menet {
+   status = "ok";
+};
diff --git a/arch/arm64/boot/dts/apm-storm.dtsi b/arch/arm64/boot/dts/apm-storm.dtsi
index 93f4b2d..e1e3d5d 100644
--- a/arch/arm64/boot/dts/apm-storm.dtsi
+++ b/arch/arm64/boot/dts/apm-storm.dtsi
@@ -167,14 +167,13 @@
clock-output-names = "ethclk";
};
 
-   eth8clk: eth8clk {
+   menetclk: menetclk {
compatible = "apm,xgene-device-clock";
#clock-cells = <1>;
clocks = < 0>;
-   clock-names = "eth8clk";
reg = <0x0 0x1702C000 0x0 0x1000>;
reg-names = "csr-reg";
-   clock-output-names = "eth8clk";
+   clock-output-names = "menetclk";
};
 
sataphy1clk: sataphy1clk@1f21c000 {
@@ -339,5 +338,27 @@
phys = < 0>;
phy-names = "sata-phy";
};
+
+   menet: ethernet@1702 {
+   compatible = "apm,xgene-enet";
+   status = "disabled";
+   reg = <0x0 0x1702 0x0 0xd100>,
+ <0x0 0X1703 0x0 0X400>,
+ <0x0 0X1000 0x0 0X200>;
+   interrupts = <0x0 0x3c 0x4>;
+   clocks = <&menetclk 0>;
+   local-mac-address = [00 01 73 00 00 01];
+   phy-connection-type = "rgmii";
+   phy-handle = <&menetphy>;
+   mdio {
+   compatible = "apm,xgene-mdio";
+   #address-cells = <1>;
+   #size-cells = <0>;
+   menetphy: menetphy@3 {
+   reg = <0x3>;
+   };
+
+   };
+   };
};
 };
-- 
1.7.9.5



Re: rb tree hrtimer lockup bug (found by perf_fuzzer)

2014-04-16 Thread Greg KH
On Thu, Apr 17, 2014 at 01:00:53AM +0200, Thomas Gleixner wrote:
> On Sat, 5 Apr 2014, Greg KH wrote:
> > On Mon, Mar 31, 2014 at 01:18:34PM +0200, Thomas Gleixner wrote:
> > > On Thu, 27 Mar 2014, Vince Weaver wrote:
> > > > On Wed, 26 Mar 2014, Thomas Gleixner wrote:
> > > > > Ok. So we know now what we are looking for.
> > > > > 
> > > > > [1.579996] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> > > > > ÿ[1.607279] 00:09: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 
> > > > > 115200) is a 16550A
> > > > > [1.615032] kobject: 'ttyS1' (88011772ac10): kobject_release, 
> > > > > parent   (null) (delayed 250)
> > > > > [1.624534] kobject: '(null)' (8801177400f0): kobject_release, 
> > > > > parent   (null) (delayed 500)
> > > > > [1.654213] :00:16.3: ttyS1 at I/O 0xf0e0 (irq = 19, base_baud 
> > > > > = 115200) is a 16550A
> > > > > 
> > > > > [3.294047] Invalid timer base: tmr 880117740150 tmr->base 
> > > > >   (null) base 880118898000
> > > > > 
> > > > > 1634110us : obj: 880117740130 initialized 
> > > > > kobject_delayed_cleanup+0x0/0x90
> > > > > 
> > > > > So that happens in the context of the 8250 serial driver.
> > > > > 
> > > > > ...
> > > > > 
> > > > > Below is a patch which gives us the call path of the unnamed object
> > > > > which causes the crash.
> > > > 
> > > > I've attached the boot log with that patch applied.
> > > 
> > > Vince, can you please disable CONFIG_DEBUG_KOBJECT_RELEASE and remove
> > > all the debug patches to see whether the issue goes away?
> > > 
> > > I had a deeper look down that code path and the issue is, that the
> > > serial core is not compatible with the deferred kobject release.
> > > 
> > > The tty_io layer uses a kobject embedded in its internal tty device
> > > representation and reuses that.
> > 
> > It does?  What kobject is that?  I've dug through the code and I can't
> > find it.  I see where we create a new device in
> > tty_register_device_attr() which is dynamic and should be torn down when
> > free_tty_struct() is called eventually.
> 
> It's not about the dynamic stuff.
>  
> > > So it seems that for whatever reason the tty layer releases ttyS1 and
> > > then initializes it again. So the deferred release will queue the
> > > object for release while the tty layer happily reinitializes it.
> > 
> > That's not good, but I can't find that code path, any hints?
> 
> static int tty_cdev_add(struct tty_driver *driver, dev_t dev,
>   unsigned int index, unsigned int count)
> {
>   /* init here, since reused cdevs cause crashes */
>   cdev_init(&driver->cdevs[index], &tty_fops);
> 
> The comment is interesting ...
> 
> And cdevs is an array of  struct cdev:
> 
> struct cdev {
>   struct kobject kobj;

Those are not "real" kobjects, and are never registered with the kobject
core.

I really need to go rename those one of these days, and just make them a
separate object, as they have nothing to do with a "normal" kobject
other than the reference count and the use of the kobject map stuff.

So if this is showing up as a problem, something else is going on here,
as this should not be an issue at all.

thanks,

greg k-h


Re: [DRIVER CORE] drivers/base/dd.c incorrect pr_debug() parameters

2014-04-16 Thread Greg Kroah-Hartman
On Wed, Apr 16, 2014 at 05:12:30PM -0700, Frank Rowand wrote:
> pr_debug() parameters are reverse order of format string
> 
> Signed-off-by: Frank Rowand 
> ---
> 
>  drivers/base/dd.c |4 2 + 2 - 0 !
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> Index: b/drivers/base/dd.c
> ===

What is this, cvs?  :)

> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -187,8 +187,8 @@ static void driver_bound(struct device *
>   return;
>   }
>  
> - pr_debug("driver: '%s': %s: bound to device '%s'\n", dev_name(dev),
> -  __func__, dev->driver->name);
> + pr_debug("driver: '%s': %s: bound to device '%s'\n", dev->driver->name,
> +  __func__, dev_name(dev));

Thanks, I'll queue this up.

greg k-h
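
The effect of the argument order can be reproduced in user space with snprintf() and the same format string. A minimal sketch: format_bound() is a hypothetical helper, and the driver and device names used with it are made-up sample values.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper using the format string from driver_bound().
 * The arguments are passed in the order the format string names them:
 * driver name, calling function, device name.
 */
static inline void format_bound(char *buf, size_t len, const char *drv,
                                const char *func, const char *dev)
{
    snprintf(buf, len, "driver: '%s': %s: bound to device '%s'",
             drv, func, dev);
}
```

Since all three conversions are %s, swapping the first and last arguments compiles cleanly and only shows up as a misleading log line, which is why the mistake could go unnoticed.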


[PATCH] sched/deadline: Fix memory leak

2014-04-16 Thread Li Zefan
Free cpudl->free_cpus allocated in cpudl_init().

Signed-off-by: Li Zefan 
Cc:  # 3.14
---
 kernel/sched/cpudeadline.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 5b9bb42..ab001b5 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -210,7 +210,5 @@ int cpudl_init(struct cpudl *cp)
  */
 void cpudl_cleanup(struct cpudl *cp)
 {
-   /*
-* nothing to do for the moment
-*/
+   free_cpumask_var(cp->free_cpus);
 }
-- 
1.8.0.2
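
The fix pairs the allocation done in cpudl_init() with a release in cpudl_cleanup(). A user-space sketch of that pairing follows, assuming a plain heap array in place of the kernel's cpumask_var_t (which is a separate allocation only when CONFIG_CPUMASK_OFFSTACK is enabled); the cpudl_model names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct cpudl; only the leaked member is modelled. */
struct cpudl_model {
    unsigned long *free_cpus;   /* stand-in for cpumask_var_t free_cpus */
    size_t nlongs;
};

/* Allocates the mask, as cpudl_init() does; returns 0 on success. */
static inline int cpudl_model_init(struct cpudl_model *cp, size_t nlongs)
{
    cp->free_cpus = calloc(nlongs, sizeof(*cp->free_cpus));
    if (!cp->free_cpus)
        return -1;
    cp->nlongs = nlongs;
    return 0;
}

/* Releases what init allocated; before the fix this step was missing. */
static inline void cpudl_model_cleanup(struct cpudl_model *cp)
{
    free(cp->free_cpus);
    cp->free_cpus = NULL;
    cp->nlongs = 0;
}
```

Clearing the pointer after freeing is a choice of this sketch, not something the patch does; it makes a repeated cleanup harmless in the model.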



Re: [PATCH v2 4/4] drivers: net: Add APM X-Gene SoC ethernet driver support.

2014-04-16 Thread Iyappan Subramanian
Hi,

Please find my response inline.

On Mon, Apr 14, 2014 at 7:05 AM, Ben Dooks  wrote:
> On 12/04/14 04:06, Iyappan Subramanian wrote:
>>
>> This patch adds network driver for APM X-Gene SoC ethernet.
>>
>> Signed-off-by: Iyappan Subramanian 
>> Signed-off-by: Ravi Patel 
>
>
> [snip]
>
>
>> +{
>> +   struct xgene_enet_pdata *pdata = netdev_priv(ndev);
>> +   struct phy_device *phydev;
>> +   unsigned char phy_id[MII_BUS_ID_SIZE+3];
>> +   int ret = 0;
>> +
>> +   phydev = phy_find_first(pdata->mdio_bus);
>> +   if (!phydev) {
>> +   netdev_info(ndev, "no PHY found\n");
>> +   ret = -1;
>> +   goto out;
>> +   }
>> +
>> +   /* attach the mac to the phy */
>> +   snprintf(phy_id, sizeof(phy_id), PHY_ID_FMT, pdata->mdio_bus->id,
>> +pdata->phy_addr);
>> +   phydev = phy_connect(ndev, phy_id,
>> +                        &xgene_enet_mdio_link_change, pdata->phy_mode);
>> +   if (IS_ERR(phydev)) {
>> +   netdev_err(ndev, "Could not attach to PHY\n");
>> +   ret = PTR_ERR(phydev);
>> +   phydev = NULL;
>> +   goto out;
>> +   }
>> +
>
>
> You should be using of_phy_connect or similar so that the
> necessary PHY<>OF data is properly initialised
>

I will use of_phy_connect function to connect to PHY.

>
>> +   netdev_info(ndev, "phy_id=0x%08x phy_drv=\"%s\"",
>> +   phydev->phy_id, phydev->drv->name);
>> +out:
>> +   pdata->phy_link = 0;
>> +   pdata->phy_speed = 0;
>> +   pdata->phy_dev = phydev;
>> +
>> +   return ret;
>> +}
>> +
>
>
>
> [snip]
>
>
>> +struct xgene_enet_desc {
>> +   u64 m0;
>> +   u64 m1;
>> +   u64 m2;
>> +   u64 m3;
>> +};
>> +
>> +struct xgene_enet_desc16 {
>> +   u64 m0;
>> +   u64 m1;
>> +};
>> +
>> +static inline void xgene_enet_cpu_to_le64(struct xgene_enet_desc *desc,
>> + int count)
>> +{
>> +#ifdef CONFIG_CPU_BIG_ENDIAN
>> +   int i;
>> +
>> +   for (i = 0; i < count; i++)
>> +   ((u64 *)desc)[i] = cpu_to_le64(((u64 *)desc)[i]);
>> +#endif
>> +}
>> +
>> +static inline void xgene_enet_le64_to_cpu(struct xgene_enet_desc *desc,
>> + int count)
>> +{
>> +#ifdef CONFIG_CPU_BIG_ENDIAN
>> +   int i;
>> +
>> +   for (i = 0; i < count; i++)
>> +   ((u64 *)desc)[i] = le64_to_cpu(((u64 *)desc)[i]);
>> +#endif
>> +}
>> +
>> +static inline void xgene_enet_desc16_to_le64(struct xgene_enet_desc
>> *desc)
>> +{
>> +#ifdef CONFIG_CPU_BIG_ENDIAN
>> +   ((u64 *)desc)[1] = cpu_to_le64(((u64 *)desc)[1]);
>> +#endif
>> +}
>> +
>> +static inline void xgene_enet_le64_to_desc16(struct xgene_enet_desc
>> *desc)
>> +{
>> +#ifdef CONFIG_CPU_BIG_ENDIAN
>> +   ((u64 *)desc)[1] = le64_to_cpu(((u64 *)desc)[1]);
>> +#endif
>> +}
>
>
> With this do you not risk confusing the hardware by swapping the
> format of the descriptors?
>
> Why not use xxx_to_cpu() and cpu_to_xxx() functions when reading or
> writing the descriptors? It would also remove a lot of the #ifdef in
> this code.

Our hardware expects descriptors in little-endian format.  I will remove the
#ifdef.

>
> --
> Ben Dooks   http://www.codethink.co.uk/
> Senior Engineer Codethink - Providing Genius
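
Ben's point is that the conversion helpers already compile to a no-op on a little-endian CPU, so the #ifdef CONFIG_CPU_BIG_ENDIAN guards add nothing. A user-space sketch of an unconditional conversion follows, assuming glibc's htole64()/le64toh() as stand-ins for cpu_to_le64()/le64_to_cpu(); struct desc_model and the helper names are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <endian.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the 32-byte descriptor (four 64-bit words). */
struct desc_model {
    uint64_t m0, m1, m2, m3;
};

/* CPU order -> device (little-endian) order, with no #ifdef needed. */
static inline void desc_to_le64(struct desc_model *d)
{
    d->m0 = htole64(d->m0);
    d->m1 = htole64(d->m1);
    d->m2 = htole64(d->m2);
    d->m3 = htole64(d->m3);
}

/* Device (little-endian) order -> CPU order. */
static inline void desc_from_le64(struct desc_model *d)
{
    d->m0 = le64toh(d->m0);
    d->m1 = le64toh(d->m1);
    d->m2 = le64toh(d->m2);
    d->m3 = le64toh(d->m3);
}

/* Returns the byte at the lowest address of m0, as the device would see it. */
static inline uint8_t first_byte(const struct desc_model *d)
{
    uint8_t b;

    memcpy(&b, &d->m0, 1);
    return b;
}
```

After desc_to_le64() the least significant byte of m0 sits at the lowest address on either kind of host, which is the layout a little-endian device expects.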


Re: [PATCH] kernel/panic: Add "late_kdump" option for kdump in unstable condition

2014-04-16 Thread Masami Hiramatsu
Thank you for the review!

(2014/04/16 22:48), Vivek Goyal wrote:
> On Mon, Apr 14, 2014 at 01:51:58PM +0900, Masami Hiramatsu wrote:
>> Add a "late_kdump" option to run kdump after running panic
>> notifiers and dump kmsg. This can help rare situations which
>> kdump drops in failure because of unstable crashed kernel
>> or hardware failure (memory corruption on critical data/code),
>> or the 2nd kernel is broken by the 1st kernel (it's a broken
>> behavior, but who can guarantee that the "crashed" kernel
>> works correctly?).
>>
>> Usage: add "late_kdump" to kernel boot option. That's all.
>>
>> Note that this actually increases risks of the failure of
>> kdump. This option should be set only if you worry about
>> the rare case of kdump failure rather than increasing the
>> chance of success.
>>
>> Signed-off-by: Masami Hiramatsu 
>> Cc: Eric Biederman 
>> Cc: Vivek Goyal 
>> Cc: Andrew Morton 
>> Cc: Yoshihiro YUNOMAE 
>> Cc: Satoru MORIYA 
>> Cc: Motohiro Kosaki 
>> Cc: Takenori Nagano 
>> ---
>>  Documentation/kernel-parameters.txt |7 +++
>>  kernel/panic.c  |   24 ++--
>>  2 files changed, 29 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/kernel-parameters.txt 
>> b/Documentation/kernel-parameters.txt
>> index 03e50b4..1ba58da 100644
>> --- a/Documentation/kernel-parameters.txt
>> +++ b/Documentation/kernel-parameters.txt
>> @@ -2339,6 +2339,13 @@ bytes respectively. Such letter suffixes can also be 
>> entirely omitted.
>>  timeout < 0: reboot immediately
>>  Format: 
>>  
>> +late_kdump  Run kdump after running panic-notifiers and dumping
>> +kmsg. This only for the users who doubt kdump always
>> +succeeds in any situation.
>> +Note that this also increases risks of kdump failure,
>> +because some panic notifiers can make the crashed
>> +kernel more unstable.
>> +
> 
> I am wondering if "crash_kexec_post_notifiers" will be a better name
> to represent what we are trying to do here.

OK, I'll rename that.

> 
>>  parkbd.port=[HW] Parallel port number the keyboard adapter is
>>  connected to, default is 0.
>>  Format: 
>> diff --git a/kernel/panic.c b/kernel/panic.c
>> index d02fa9f..bba42b5 100644
>> --- a/kernel/panic.c
>> +++ b/kernel/panic.c
>> @@ -32,6 +32,7 @@ static unsigned long tainted_mask;
>>  static int pause_on_oops;
>>  static int pause_on_oops_flag;
>>  static DEFINE_SPINLOCK(pause_on_oops_lock);
>> +static bool late_kdump;
>>  
>>  int panic_timeout = CONFIG_PANIC_TIMEOUT;
>>  EXPORT_SYMBOL_GPL(panic_timeout);
>> @@ -112,9 +113,14 @@ void panic(const char *fmt, ...)
>>  /*
>>   * If we have crashed and we have a crash kernel loaded let it handle
>>   * everything else.
>> - * Do we want to call this before we try to display a message?
>> + * If we want to call this after we try to display a message, pass
>> + * the "late_kdump" option to the kernel.
>>   */
>> -crash_kexec(NULL);
>> +if (!late_kdump)
>> +crash_kexec(NULL);
>> +else
>> +pr_emerg("Warning: late_kdump option is set. Please DO NOT "
>> +"report bugs about kdump failure with this option.\n");
> 
> I think above message about DO NOT report bugs seems unnecessary. 

OK, so I'll just report that the option is set, as below:
"Warning: crash_kexec_post_notifiers is set.\n"

Thank you again!

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com
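
The ordering change described in the patch can be summarised with a small user-space model. It is a sketch based only on the patch description: panic_order(), the step names, and the crash_kernel_loaded parameter are hypothetical, and the flag is called post_notifiers here after the rename agreed above.

```c
#include <assert.h>
#include <stdbool.h>

/* Steps of the panic path that the option reorders (hypothetical names). */
enum step { STEP_CRASH_KEXEC, STEP_NOTIFIERS, STEP_KMSG_DUMP };

/*
 * Fills 'out' with the steps in the order they would run and returns how
 * many there are. Without the option, a loaded crash kernel takes over
 * immediately, so the notifiers and the kmsg dump never run.
 */
static inline int panic_order(bool post_notifiers, bool crash_kernel_loaded,
                              enum step out[3])
{
    int n = 0;

    if (!post_notifiers && crash_kernel_loaded) {
        out[n++] = STEP_CRASH_KEXEC;   /* does not return in the kernel */
        return n;
    }

    out[n++] = STEP_NOTIFIERS;
    out[n++] = STEP_KMSG_DUMP;

    if (post_notifiers && crash_kernel_loaded)
        out[n++] = STEP_CRASH_KEXEC;   /* kdump runs last with the option */

    return n;
}
```

The model makes the trade-off in the commit message visible: with the option set, kdump depends on the notifiers and the kmsg dump completing in a kernel that has already crashed.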




Re: [PATCH/RFC 00/19] Support loop-back NFS mounts

2014-04-16 Thread NeilBrown
On Thu, 17 Apr 2014 11:27:39 +1000 Dave Chinner  wrote:

> On Thu, Apr 17, 2014 at 10:20:48AM +1000, NeilBrown wrote:
> > A good example is the deadlock with the flush-* threads.
> > flush-* will lock a page, and  then call ->writepage.  If ->writepage
> > allocates memory it can enter reclaim, call ->releasepage on NFS, and block
> > waiting for a COMMIT to complete.
> > The COMMIT might already be running, performing fsync on that same file that
> > flush-* is flushing.  It locks each page in turn.  When it  gets to the page
> > that flush-* has locked, it will deadlock.
> 
> It's nfs_release_page() again
> 
> > In general, if nfsd is allowed to block on local filesystem, and local
> > filesystem is allowed to block on NFS, then a deadlock can happen.
> > We would need a clear hierarchy
> > 
> >__GFP_NETFS > __GFP_FS > __GFP_IO
> > 
> > for it to work.  I'm not sure the extra level really helps a lot and it would
> > be a lot of churn.
> 
> I think you are looking at this the wrong way - it's not the other
> filesystems that have to avoid memory reclaim recursion, it's the
> NFS client mount that is on loopback that needs to avoid recursion.
> 
> IMO, the fix should be that the NFS client cannot block on messages sent to
> the NFSD on the same host during memory reclaim. That is, nfs_release_page()
> cannot send commit messages to the server if the server is on
> localhost. Instead, it just tells memory reclaim that it can't
> reclaim that page.
> 
> If nfs_release_page() no longer blocks in memory reclaim, then all
> these nfsd-gets-blocked-in-GFP_KERNEL-memory-allocation recursion
> problems go away. Do the same for all the other memory reclaim
> operations in the NFS client, and you've got a solution that should
> work without needing to walk all over the rest of the kernel.

Maybe.
It is nfs_release_page() today. I wonder if it could be other things another
day.  I want to be sure I have a solution that really makes sense.

However ... the thing that nfs_release_page is doing is sending a COMMIT to
tell the server to flush to stable storage.  It does that so that if the
server crashes, then the client can re-send.
Of course when it is a loop-back mount the client is the server so the COMMIT
is completely pointless.  If the client notices that it is sending a COMMIT
to itself, it can simply assume a positive reply.

You are right, that would make the patch set a lot less intrusive.  I'll give
it some serious thought - thanks.

NeilBrown




Re: [uClinux-dev] v3.15-rc1 slab allocator broken on m68knommu (coldfire)

2014-04-16 Thread Joonsoo Kim
On Wed, Apr 16, 2014 at 10:44:11AM -0700, Steven King wrote:
> On Wednesday 16 April 2014 9:06:57 am Geert Uytterhoeven wrote:
> > Hi Steven,
> >
> > On Wed, Apr 16, 2014 at 5:47 PM, Steven King  wrote:
> > > --- a/mm/slab.c
> > > +++ b/mm/slab.c
> > > @@ -2572,13 +2572,13 @@ static void *alloc_slabmgmt(struct kmem_cache *cachep,
> > >  return freelist;
> > >  }
> > >
> > > -static inline freelist_idx_t get_free_obj(struct page *page, unsigned char idx)
> > > +static inline freelist_idx_t get_free_obj(struct page *page, unsigned int idx)
> > >  {
> > >  return ((freelist_idx_t *)page->freelist)[idx];
> > >  }
> > >
> > >  static inline void set_free_obj(struct page *page,
> > > -   unsigned char idx, freelist_idx_t val)
> > > +   unsigned int idx, freelist_idx_t val)
> > >  {
> > >  ((freelist_idx_t *)(page->freelist))[idx] = val;
> > >  }
> > >
> > >
> > > then v3.15-rc1 will boot using the slab allocator.
> >
> > Is "idx" ever larger than 255?
> >
> > Gr{oetje,eeting}s,
> 
> Yes.  If I stick
> 
> if (idx > 255)
> pr_info("%s %d\n", __func__, idx);
> 
> in get_free_obj and set_free_obj, I see values for idx up into the 400s.

Hello,

Yes, it's my mistake. idx can be larger than 255 if freelist_idx_t is
unsigned short. So unsigned char idx isn't appropriate here. Your
system's PAGE_SIZE may be 2^13, so freelist_idx_t would be unsigned short
and idx will be larger than 255.

Your fix looks good to me, so could you send it quickly to Pekka with
some description? If you don't have enough time to do it, I can handle it.

Thank you very much for reporting this issue.

Thanks.
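
The truncation can be reproduced in user space. A minimal sketch, assuming freelist_idx_t is unsigned short as in the case described above and that unsigned char is 8 bits wide; the two accessor names are hypothetical variants of get_free_obj().

```c
#include <assert.h>

/* Assumed type for this sketch, matching the case described above. */
typedef unsigned short freelist_idx_t;

/* Old signature: idx is silently reduced modulo 256 at the call. */
static inline freelist_idx_t get_free_obj_char(const freelist_idx_t *freelist,
                                               unsigned char idx)
{
    return freelist[idx];
}

/* Fixed signature: idx keeps its full value. */
static inline freelist_idx_t get_free_obj_int(const freelist_idx_t *freelist,
                                              unsigned int idx)
{
    return freelist[idx];
}
```

An index of 400 passed to the unsigned char variant reads slot 144 (400 - 256), so lookups and updates land on the wrong object without any warning at the call site.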


Re: [PATCH cgroup/for-3.16] cgroup: add documentation about unified hierarchy

2014-04-16 Thread Li Zefan
On 2014/4/17 4:16, Randy Dunlap wrote:
> On 04/16/2014 06:51 AM, Tejun Heo wrote:
>> Hello,
>>
>> On Tue, Apr 15, 2014 at 03:36:29PM -0700, Randy Dunlap wrote:
 +depending on the specific controller.  IOW, hierarchy may be collapsed
>>>
>>> please spell out IOW
>>
>> Updated, but is this really necessary?
> 
> It depends on who your audience is.  I probably think that the audience
> is larger than you think it is.
> 

Yeah, I had to google it. :)



Re: [PATCH 01/27] ARM: EXYNOS: Add Exynos3250 SoC ID

2014-04-16 Thread Chanwoo Choi
Hi Tomasz,

On 04/17/2014 12:53 AM, Tomasz Figa wrote:
> Hi Chanwoo,
> 
> On 14.04.2014 07:13, Chanwoo Choi wrote:
>> On 04/11/2014 05:39 PM, Tomasz Figa wrote:
>>> On 11.04.2014 08:32, Chanwoo Choi wrote:
>>>> On 04/11/2014 10:46 AM, Olof Johansson wrote:
>>>>> On Thu, Apr 10, 2014 at 06:37:12PM +0900, Chanwoo Choi wrote:
>>>>>> diff --git a/arch/arm/plat-samsung/include/plat/cpu.h b/arch/arm/plat-samsung/include/plat/cpu.h
>>>>>> index 5992b8d..3d808f6b 100644
>>>>>> --- a/arch/arm/plat-samsung/include/plat/cpu.h
>>>>>> +++ b/arch/arm/plat-samsung/include/plat/cpu.h
>>>>>> @@ -43,6 +43,9 @@ extern unsigned long samsung_cpu_id;
>>>>>>  #define S5PV210_CPU_ID      0x4311
>>>>>>  #define S5PV210_CPU_MASK    0xF000
>>>>>>
>>>>>> +#define EXYNOS3250_SOC_ID   0xE3472000
>>>>>> +#define EXYNOS3_SOC_MASK    0xF000
>>>>>> +
>>>>>>  #define EXYNOS4210_CPU_ID   0x4321
>>>>>>  #define EXYNOS4212_CPU_ID   0x4322
>>>>>>  #define EXYNOS4412_CPU_ID   0xE4412200
>>>>>> @@ -68,6 +71,7 @@ IS_SAMSUNG_CPU(s5p6440, S5P6440_CPU_ID, S5P64XX_CPU_MASK)
>>>>>>  IS_SAMSUNG_CPU(s5p6450, S5P6450_CPU_ID, S5P64XX_CPU_MASK)
>>>>>>  IS_SAMSUNG_CPU(s5pc100, S5PC100_CPU_ID, S5PC100_CPU_MASK)
>>>>>>  IS_SAMSUNG_CPU(s5pv210, S5PV210_CPU_ID, S5PV210_CPU_MASK)
>>>>>> +IS_SAMSUNG_CPU(exynos3250, EXYNOS3250_SOC_ID, EXYNOS3_SOC_MASK)
>>>>>>  IS_SAMSUNG_CPU(exynos4210, EXYNOS4210_CPU_ID, EXYNOS4_CPU_MASK)
>>>>>>  IS_SAMSUNG_CPU(exynos4212, EXYNOS4212_CPU_ID, EXYNOS4_CPU_MASK)
>>>>>>  IS_SAMSUNG_CPU(exynos4412, EXYNOS4412_CPU_ID, EXYNOS4_CPU_MASK)
>>>>>> @@ -126,6 +130,12 @@ IS_SAMSUNG_CPU(exynos5440, EXYNOS5440_SOC_ID, EXYNOS5_SOC_MASK)
>>>>>>  # define soc_is_s5pv210() 0
>>>>>>  #endif
>>>>>>
>>>>>> +#if defined(CONFIG_SOC_EXYNOS3250)
>>>>>> +# define soc_is_exynos3250() is_samsung_exynos3250()
>>>>>> +#else
>>>>>> +# define soc_is_exynos3250() 0
>>>>>> +#endif
>>>>>
>>>>> In general, I think we have too much code littered with soc_is_()
>>>>> going on, so please try to avoid adding more for this SoC. Especially in
>>>>> cases where you just want to bail out of certain features where we might
>>>>> already have function pointers to control if a function is called or not,
>>>>> such as the firmware interfaces.
>>>>>
>>>>
>>>> Do you prefer dt helper function such as following function instead of new
>>>> soc_is_xx() ?
>>>> - of_machine_is_compatible("samsung,exynos3250")
>>>>
>>>> If you are OK, I'll use of_machine_is_compatible() instead of soc_is_xx().
>>>
>>> First of all, there is still a lot of code in mach-exynos/ using the 
>>> soc_is_xx() macros, so having some SoCs use them and other SoCs use 
>>> of_machine_is_compatible() wouldn't make the code cleaner.
>>>
>>> For now, I wouldn't mind adding soc_is_exynos3250(), but in general such 
>>> code surrounded with if (soc_is_xx()) blocks should be reworked to use 
>>> something better, for example function pointers, as Olof suggested.
>>
>> I considered the 'function pointers' method as a replacement for the
>> soc_is_xxx() macros in the following two cases, but I need a more detailed
>> explanation or example of the "for example function pointers, as Olof
>> suggested" sentence.
>>
>> [case 1]
>> Each Exynos SoC has its own set of function pointers, selected by the
>> compatible name in DT.
>>
>> For example, arch/arm/mach-exynos/firmware.c
>>
>> static const struct firmware_ops exynos_firmware_ops = {
>> .do_idle= exynos_do_idle,
>> .set_cpu_boot_addr= exynos_set_cpu_boot_addr,
>> .cpu_boot= exynos_cpu_boot,
>> };
>> static const struct firmware_ops exynos3250_firmware_ops = {
>> .do_idle= exynos_do_idle,
>> .set_cpu_boot_addr= exynos4212_set_cpu_boot_addr,
>> .cpu_boot= exynos3250_cpu_boot,
>> };
>>
>> static const struct firmware_ops exynos4212_firmware_ops = {
>> .do_idle= exynos_do_idle,
>> .set_cpu_boot_addr= exynos4212_set_cpu_boot_addr,
>> .cpu_boot= exynos4212_cpu_boot,
>> };
>>
>> struct secure_firmware {
>> char *name;
>> const struct firmware_ops *ops;
>> } exynos_secure_firmware[] __initconst = {
>> { "samsung,secure-firmware", &exynos_firmware_ops },
>> { "samsung,exynos3250-secure-firmware", &exynos3250_firmware_ops },
>> { "samsung,exynos4212-secure-firmware", &exynos4212_firmware_ops },
>> };
>>
> 
> This is probably the right solution. Another would be to detect which 
> firmware ops to use by matching root node with particular SoC compatible 
> strings.
> 

OK, I'll modify firmware.c using this method in a separate patch, apart from
the Exynos3250 patchset.
But I want to implement it after the Exynos3250 patchset is completed,
because the Exynos3250 patchset depends on other patches such as the
following, which have not yet been confirmed or merged.

[PATCH Resend] ARM: EXYNOS: Map SYSRAM address through DT
- http://www.spinics.net/lists/arm-kernel/msg323011.html

[PATCH v2 1/3] ARM: EXYNOS: Map PMU address through DT
- 
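
The table-driven selection discussed in [case 1] above can be sketched in
plain userspace C. This is only an illustration under stated assumptions:
struct firmware_ops_sketch, the *_sketch callbacks and pick_firmware_ops()
are hypothetical stand-ins, not the kernel's firmware_ops API, and the
callback return values are arbitrary markers rather than real boot logic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct firmware_ops. */
struct firmware_ops_sketch {
	int (*do_idle)(void);
	int (*cpu_boot)(int cpu);
};

/* Mock callbacks; the return values are arbitrary markers for the demo. */
static int generic_do_idle_sketch(void) { return 0; }
static int generic_cpu_boot_sketch(int cpu) { (void)cpu; return 1; }
static int exynos3250_cpu_boot_sketch(int cpu) { (void)cpu; return 2; }

static const struct firmware_ops_sketch generic_ops = {
	.do_idle  = generic_do_idle_sketch,
	.cpu_boot = generic_cpu_boot_sketch,
};

static const struct firmware_ops_sketch exynos3250_ops = {
	.do_idle  = generic_do_idle_sketch,
	.cpu_boot = exynos3250_cpu_boot_sketch,
};

/* One row per compatible string, as in the table proposed in the mail. */
static const struct {
	const char *compatible;
	const struct firmware_ops_sketch *ops;
} fw_table[] = {
	{ "samsung,secure-firmware",            &generic_ops },
	{ "samsung,exynos3250-secure-firmware", &exynos3250_ops },
};

/* Return the ops for a compatible string, or NULL if nothing matches. */
static const struct firmware_ops_sketch *pick_firmware_ops(const char *compatible)
{
	size_t i;

	assert(compatible != NULL);
	for (i = 0; i < sizeof(fw_table) / sizeof(fw_table[0]); i++)
		if (strcmp(fw_table[i].compatible, compatible) == 0)
			return fw_table[i].ops;
	return NULL;
}
```

Callers then invoke ops->cpu_boot(cpu) without any soc_is_xx() test; the
SoC-specific decision is made once, when the table is searched.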

linux-next: manual merge of the ipsec tree with Linus' tree

2014-04-16 Thread Stephen Rothwell
Hi Steffen,

Today's linux-next merge of the ipsec tree got a conflict in
net/ipv4/ip_vti.c between commit 8d89dcdf80d8 ("vti: don't allow to add
the same tunnel twice") from Linus' tree and commit a32452366b72 ("vti4:
Don't count header length twice") from the ipsec tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc net/ipv4/ip_vti.c
index afcee51b90ed,cd62596e9a87..
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@@ -349,7 -348,7 +349,6 @@@ static int vti_tunnel_init(struct net_d
memcpy(dev->dev_addr, &iph->saddr, 4);
memcpy(dev->broadcast, &iph->daddr, 4);
  
-   dev->hard_header_len= LL_MAX_HEADER + sizeof(struct iphdr);
 -  dev->type   = ARPHRD_TUNNEL;
dev->mtu= ETH_DATA_LEN;
dev->flags  = IFF_NOARP;
dev->iflink = 0;




Re: [PATCH/RFC 00/19] Support loop-back NFS mounts

2014-04-16 Thread Dave Chinner
On Thu, Apr 17, 2014 at 10:20:48AM +1000, NeilBrown wrote:
> A good example is the deadlock with the flush-* threads.
> flush-* will lock a page, and  then call ->writepage.  If ->writepage
> allocates memory it can enter reclaim, call ->releasepage on NFS, and block
> waiting for a COMMIT to complete.
> The COMMIT might already be running, performing fsync on that same file that
> flush-* is flushing.  It locks each page in turn.  When it  gets to the page
> that flush-* has locked, it will deadlock.

It's nfs_release_page() again

> In general, if nfsd is allowed to block on local filesystem, and local
> filesystem is allowed to block on NFS, then a deadlock can happen.
> We would need a clear hierarchy
> 
>__GFP_NETFS > __GFP_FS > __GFP_IO
> 
> for it to work.  I'm not sure the extra level really helps a lot and it would
> be a lot of churn.

I think you are looking at this the wrong way - it's not the other
filesystems that have to avoid memory reclaim recursion, it's the
NFS client mount that is on loopback that needs to avoid recursion.

IMO, the fix should be that the NFS client cannot block on messages sent
to the NFSD on the same host during memory reclaim. That is,
nfs_release_page() cannot send commit messages to the server if the
server is on localhost. Instead, it just tells memory reclaim that it
can't reclaim that page.

If nfs_release_page() no longer blocks in memory reclaim, then all
these nfsd-gets-blocked-in-GFP_KERNEL-memory-allocation recursion
problems go away. Do the same for all the other memory reclaim
operations in the NFS client, and you've got a solution that should
work without needing to walk all over the rest of the kernel.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
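
The behaviour proposed above can be modelled with a small userspace sketch.
Everything here is an assumption made for illustration: the *_sketch types,
the is_loopback flag and the commits_sent counter do not exist in the NFS
client; they only show the decision "refuse the page instead of waiting on a
local nfsd".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an NFS server as seen by the client. */
struct nfs_server_sketch {
	bool is_loopback;	/* server runs on the same host */
};

/* Hypothetical model of a page with an outstanding unstable write. */
struct page_sketch {
	bool needs_commit;
	const struct nfs_server_sketch *server;
};

/* Number of COMMITs the model would have sent and waited for. */
static int commits_sent;

/*
 * Returns true if the page may be reclaimed now. When called from memory
 * reclaim against a loopback server, it never waits for a COMMIT; it just
 * reports that the page cannot be reclaimed.
 */
static bool release_page_sketch(struct page_sketch *page, bool in_reclaim)
{
	assert(page && page->server);

	if (!page->needs_commit)
		return true;
	if (in_reclaim && page->server->is_loopback)
		return false;	/* do not block on the local nfsd */

	commits_sent++;		/* stands in for "send COMMIT and wait" */
	page->needs_commit = false;
	return true;
}
```

In this model reclaim simply moves on to another page when the function
returns false, so nfsd can never end up waiting on its own client.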


Re: [PATCH v5 2/4] arm64: dts: APM X-Gene PCIe device tree nodes

2014-04-16 Thread Jason Gunthorpe
On Thu, Apr 17, 2014 at 01:20:42AM +0100, Liviu Dudau wrote:

> > No spec says you can put config space into the ranges at all, nobody
> > should be doing that today, obviously some cases were missed during
> > review..
> 
> The ePAPR document allows that when ss == 00.

Which do you mean? The 'PCI Bus Binding' spec has fairly specific
language on how ranges should be used and interpreted, and it
precludes doing anything meaningful with config space (like requiring
b,d,f and r to be zeroed when doing compares against ranges, requiring
the ranges to represent the bridge windows, etc).

There is certainly room to invent something (like ECAM mapping) but
nothing is specified in that document.

The ePAPR document I have doesn't talk about PCI..

If you've found a document that defines how it works then that changes
things.. ;)

> > The comment about ECAM was intended as a general guidance on what
> > config space in ranges could/should be used for.
> > 
> > Right now config space shouldn't propagate outside any driver, so you
> > can probably just filter it in your generic code, and make it very hard
> > and obviously wrong for a driver to parse ranges for config space, so
> > we don't get more usages.
> 
> OK, this goes slightly against your email from 26th March:
> 
> "When we talked about this earlier on the DT bindings list the
> consensus seemed to be that configuration MMIO ranges should only be
> used if the underlying memory was exactly ECAM, and was not to be used
> for random configuration related register blocks.
> 
> The rationale being that generic code, upon seeing that ranges entry,
> could just go ahead and assume ECAM mapping."
> 
> What I'm saying is that the only code that will see this ranges entry will
> be the parsing code as if we try to create a resource out of the range
> and add it to the host bridge structure (not driver) we will confuse the
> rest of the pci_host_bridge API. So we cannot do any ECAM accesses (yet?).

Sorry if this seems unclear, what you quoted was from a specification
standpoint - someday defining config space ranges to be the ECAM
window makes the most sense. This is from the direction of precluding
drivers from using it for random purposes.

From a Linux standpoint, there is simply no infrastructure for generic
config access outside the driver, so config space must remain
contained in the driver, and shouldn't leak into the host bridge or
other core structures.

I think the shared code you are working on should simply ignore config
ss ranges entirely, they have no defined meaning..

Regards,
Jason


Re: [PATCH v3 3/3] cgroup: implement cgroup.populated for the default hierarchy

2014-04-16 Thread Li Zefan
> cgroup users often need a way to determine when a cgroup's
> subhierarchy becomes empty so that it can be cleaned up.  cgroup
> currently provides release_agent for it; unfortunately, this mechanism
> is riddled with issues.
> 
> * It delivers events by forking and execing a userland binary
>   specified as the release_agent.  This is a long deprecated method of
>   notification delivery.  It's extremely heavy, slow and cumbersome to
>   integrate with larger infrastructure.
> 
> * There is a single monitoring point at the root.  There's no way to
>   delegate management of a subtree.
> 
> * The event isn't recursive.  It triggers when a cgroup doesn't have
>   any tasks or child cgroups.  Events for internal nodes trigger only
>   after all children are removed.  This again makes it impossible to
>   delegate management of a subtree.
> 
> * Events are filtered from the kernel side.  "notify_on_release" file
>   is used to subscribe to or suppress release event.  This is
>   unnecessarily complicated and probably done this way because event
>   delivery itself was expensive.
> 
> This patch implements interface file "cgroup.populated" which can be
> used to monitor whether the cgroup's subhierarchy has tasks in it or
> not.  Its value is 0 if there is no task in the cgroup and its
> descendants; otherwise, 1, and a kernfs_notify() notification is
> triggered when the value changes, which can be monitored through poll
> and [di]notify.
> 
> This is a lot lighter and simpler and trivially allows delegating
> management of subhierarchy - subhierarchy monitoring can block further
> propagation simply by putting itself or another process in the root of
> the subhierarchy and monitor events that it's interested in from there
> without interfering with monitoring higher in the tree.
> 
> v2: Patch description updated as per Serge.
> 
> v3: "cgroup.subtree_populated" renamed to "cgroup.populated".  The
> subtree_ prefix was a bit confusing because
> "cgroup.subtree_control" uses it to denote the tree rooted at the
> cgroup sans the cgroup itself while the populated state includes
> the cgroup itself.
> 
> Signed-off-by: Tejun Heo 
> Acked-by: Serge Hallyn 
> Cc: Lennart Poettering 

Acked-by: Li Zefan 
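
A userspace monitor for the file described above might look roughly like the
sketch below. It assumes the file holds a single '0' or '1' and that a change
is reported as POLLPRI, which is how kernfs_notify() events are usually seen
through poll(); the helper names are hypothetical and the path is up to the
caller.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Open a "cgroup.populated" style file; returns an fd, or -1 on error. */
static int open_populated(const char *path)
{
	assert(path != NULL);
	return open(path, O_RDONLY);
}

/* Re-read the value from offset 0; returns 0 or 1, or -1 on error. */
static int read_populated(int fd)
{
	char buf[8];
	ssize_t n;

	if (lseek(fd, 0, SEEK_SET) < 0)
		return -1;
	n = read(fd, buf, sizeof(buf) - 1);
	if (n <= 0)
		return -1;
	if (buf[0] == '0')
		return 0;
	if (buf[0] == '1')
		return 1;
	return -1;
}

/*
 * Wait up to timeout_ms for a change notification, then return the new
 * value. Returns -1 on timeout or error. The file should have been read
 * once beforehand so that later changes are reported.
 */
static int wait_populated_change(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI };

	if (poll(&pfd, 1, timeout_ms) <= 0)
		return -1;
	return read_populated(fd);
}
```

A manager for a delegated subtree would open the file in the root of that
subtree, call read_populated() once, and then loop on
wait_populated_change(), removing the cgroup when the value drops to 0.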



Re: [DRIVER CORE] drivers/base/dd.c incorrect pr_debug() parameters

2014-04-16 Thread Joe Perches
On Wed, 2014-04-16 at 18:12 -0700, Frank Rowand wrote:
> On 4/16/2014 5:48 PM, Joe Perches wrote:
> > On Wed, 2014-04-16 at 17:12 -0700, Frank Rowand wrote:
> >> pr_debug() parameters are reverse order of format string
> > 
> > Another way to do this might be to change all the
> > printks/pr_debugs to dev_
> 
> Yes, but if that is done, one may as well do all of drivers/base/*.c
> and drivers/base/*/*.c

Actually, I don't think so, no.

These are the ones that have a struct device attached
to them.  Many of the others are device free and are
prefixed differently, like power and devtmpfs and bus.

> I was only trying to fix incorrectly reported information from one
> pr_debug().  If someone else wants to do a big conversion, they are
> free to.  :-)

Always true...

cheers, Joe



Re: [DRIVER CORE] drivers/base/dd.c incorrect pr_debug() parameters

2014-04-16 Thread Frank Rowand
On 4/16/2014 5:48 PM, Joe Perches wrote:
> On Wed, 2014-04-16 at 17:12 -0700, Frank Rowand wrote:
>> pr_debug() parameters are reverse order of format string
> 
> Another way to do this might be to change all the
> printks/pr_debugs to dev_

Yes, but if that is done, one may as well do all of drivers/base/*.c
and drivers/base/*/*.c

I was only trying to fix incorrectly reported information from one
pr_debug().  If someone else wants to do a big conversion, they are
free to.  :-)

> 
> Something like:
> ---
>  drivers/base/dd.c | 35 +--
>  1 file changed, 17 insertions(+), 18 deletions(-)
> 
< snip >


Re: f2fs: BUG_ON() is triggered when mount valid f2fs filesystem

2014-04-16 Thread Alexey Khoroshilov
Hi,

But wouldn't the ability to trigger a BUG_ON by mounting a crafted image
be considered an issue with security implications?

Regards,
Alexey


On 16.04.2014 16:35, Jaegeuk Kim wrote:
> Hi,
>
> 2014-04-16 (Wed), 13:11 +0400, Andrey Tsyvarev:
>> Hi,
>>
>> With this patch, mounting the image continues to fail (with a similar
>> BUG_ON).
>> But when the image is formatted again (and the steps mentioned in the
>> previous message are performed), mounting it now succeeds.
>>
>> Is this the true purpose of the patch?
> Indeed. The patch fixes the underlying root cause.
> But if you're trying to use the failed image again, you can simply skip
> the erroneous part with:
>
> # mount ... -o disable_roll_forward ...
>
> Once a checkpoint is done after that (by sync, umount, or whatever), the
> image can be mounted without "disable_roll_forward".
>
> Thanks,
>
>> 15.04.2014 15:04, Jaegeuk Kim wrote:
>>> Hi,
>>>
>>> Thank you for the report.
>>> I retrieved the faulty image and found that leftover garbage data
>>> causes this wrong behavior.
>>> So I wrote the following patch, which fills one zero-block during the
>>> checkpoint procedure.
>>> If the underlying device supports discard, I expect that it mostly
>>> won't incur any significant performance regression.
>>>
>>> Could you test this patch?
>>>
>>> From 60588ceb7277aae2a79e7f67f5217d1256720d78 Mon Sep 17 00:00:00 2001
>>> From: Jaegeuk Kim 
>>> Date: Tue, 15 Apr 2014 13:57:55 +0900
>>> Subject: [PATCH] f2fs: avoid to conduct roll-forward due to the remained
>>>   garbage blocks
>>>
>>> f2fs always scans the next chain of direct node blocks.
>>> But some garbage blocks can remain, either because discard is not
>>> supported or because SSR was triggered.
>>> This occasionally causes recovery of wrong inodes that were previously
>>> in use, or BUG_ONs due to reallocating node ids, as follows.
>>>
>>> When mounting this f2fs image:
>>> http://linuxtesting.org/downloads/f2fs_fault_image.zip
>>> BUG_ON is triggered in f2fs driver (messages below are generated on
>>> kernel 3.13.2; for other kernels output is similar):
>>>
>>> kernel BUG at fs/f2fs/node.c:215!
>>>   Call Trace:
>>>   [] recover_inode_page+0x1fd/0x3e0 [f2fs]
>>>   [] ? __lock_page+0x67/0x70
>>>   [] ? autoremove_wake_function+0x50/0x50
>>>   [] recover_fsync_data+0x1398/0x15d0 [f2fs]
>>>   [] ? selinux_d_instantiate+0x1c/0x20
>>>   [] ? d_instantiate+0x5b/0x80
>>>   [] f2fs_fill_super+0xb04/0xbf0 [f2fs]
>>>   [] ? mount_bdev+0x7e/0x210
>>>   [] mount_bdev+0x1c9/0x210
>>>   [] ? validate_superblock+0x210/0x210 [f2fs]
>>>   [] f2fs_mount+0x1d/0x30 [f2fs]
>>>   [] mount_fs+0x47/0x1c0
>>>   [] ? __alloc_percpu+0x10/0x20
>>>   [] vfs_kern_mount+0x72/0x110
>>>   [] do_mount+0x493/0x910
>>>   [] ? strndup_user+0x5b/0x80
>>>   [] SyS_mount+0x90/0xe0
>>>   [] system_call_fastpath+0x16/0x1b
>>>
>>> Found by Linux File System Verification project (linuxtesting.org).
>>>
>>> Reported-by: Andrey Tsyvarev 
>>> Signed-off-by: Jaegeuk Kim 
>>> ---
>>>   fs/f2fs/checkpoint.c |  6 ++
>>>   fs/f2fs/f2fs.h   |  1 +
>>>   fs/f2fs/segment.c| 17 +++--
>>>   3 files changed, 22 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
>>> index 4aa521a..890e23d 100644
>>> --- a/fs/f2fs/checkpoint.c
>>> +++ b/fs/f2fs/checkpoint.c
>>> @@ -762,6 +762,12 @@ static void do_checkpoint(struct f2fs_sb_info *sbi,
>>> bool is_umount)
>>> void *kaddr;
>>> int i;
>>>   
>>> +   /*
>>> +* This avoids to conduct wrong roll-forward operations and uses
>>> +* metapages, so should be called prior to sync_meta_pages below.
>>> +*/
>>> +   discard_next_dnode(sbi);
>>> +
>>> /* Flush all the NAT/SIT pages */
>>> while (get_pages(sbi, F2FS_DIRTY_META))
>>> sync_meta_pages(sbi, META, LONG_MAX);
>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>> index 2ecac83..2c5a5da 100644
>>> --- a/fs/f2fs/f2fs.h
>>> +++ b/fs/f2fs/f2fs.h
>>> @@ -1179,6 +1179,7 @@ int f2fs_issue_flush(struct f2fs_sb_info *);
>>>   void invalidate_blocks(struct f2fs_sb_info *, block_t);
>>>   void refresh_sit_entry(struct f2fs_sb_info *, block_t, block_t);
>>>   void clear_prefree_segments(struct f2fs_sb_info *);
>>> +void discard_next_dnode(struct f2fs_sb_info *);
>>>   int npages_for_summary_flush(struct f2fs_sb_info *);
>>>   void allocate_new_segments(struct f2fs_sb_info *);
>>>   struct page *get_sum_page(struct f2fs_sb_info *, unsigned int);
>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
>>> index 1e264e7..9993f94 100644
>>> --- a/fs/f2fs/segment.c
>>> +++ b/fs/f2fs/segment.c
>>> @@ -335,13 +335,26 @@ static void locate_dirty_segment(struct
>>> f2fs_sb_info *sbi, unsigned int segno)
>>> mutex_unlock(&dirty_i->seglist_lock);
>>>   }
>>>   
>>> -static void f2fs_issue_discard(struct f2fs_sb_info *sbi,
>>> +static int f2fs_issue_discard(struct f2fs_sb_info *sbi,
>>> block_t blkstart, block_t blklen)
>>>   {
>>> sector_t start = SECTOR_FROM_BLOCK(sbi, blkstart);

Re: [PATCH 04/19] Make effect of PF_FSTRANS to disable __GFP_FS universal.

2014-04-16 Thread NeilBrown
On Wed, 16 Apr 2014 16:17:26 +1000 NeilBrown  wrote:

> On Wed, 16 Apr 2014 15:37:56 +1000 Dave Chinner  wrote:
> 
> > On Wed, Apr 16, 2014 at 02:03:36PM +1000, NeilBrown wrote:

> > > - /*
> > > -  * Given that we do not allow direct reclaim to call us, we should
> > > -  * never be called while in a filesystem transaction.
> > > -  */
> > > - if (WARN_ON(current->flags & PF_FSTRANS))
> > > - goto redirty;
> > 
> > We still need to ensure this rule isn't broken. If it is, the
> > filesystem will silently deadlock in delayed allocation rather than
> > gracefully handle the problem with a warning
> 
> Hmm... that might be tricky.  The 'new' PF_FSTRANS can definitely be set when
> xfs_vm_writepage is called and we really want the write to happen.
> I don't suppose there is any other way to detect if a transaction is
> happening?

I've been thinking about this some more

That code is in xfs_vm_writepage which is only called as ->writepage.
xfs never calls that directly so it could only possibly be called during
reclaim?

We know that doesn't happen, but if it does then PF_MEMALLOC would be set,
but PF_KSWAPD would not... and you already have a test for that.

How about every time we set PF_FSTRANS, we store the corresponding
xfs_trans_t in current->journal_info, and clear that field when PF_FSTRANS is
cleared.  Then xfs_vm_writepage can test for current->journal_info being
clear.
That is the field that several other filesystems use to keep track of the
'current' transaction.
??

I don't know what xfs_trans_t we would use in xfs_bmapi_allocate_worker, but
I suspect you do :-)

Thanks,
NeilBrown
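
The journal_info idea above can be sketched as a userspace model. The
task_sketch and trans_sketch types and the helper names are hypothetical; in
the kernel the field would be current->journal_info and the transaction an
xfs_trans_t, and this sketch makes no claim about where XFS would actually
set or clear it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PF_FSTRANS_SKETCH 0x1u	/* stand-in for the real PF_FSTRANS bit */

struct trans_sketch {
	int id;
};

/* Hypothetical model of the two per-task fields involved. */
struct task_sketch {
	unsigned int flags;
	void *journal_info;
};

/* Set the flag and remember the transaction together. */
static void trans_begin_sketch(struct task_sketch *t, struct trans_sketch *tp)
{
	assert(t && tp && t->journal_info == NULL);
	t->flags |= PF_FSTRANS_SKETCH;
	t->journal_info = tp;
}

/* Clear both together, so the two can never disagree. */
static void trans_end_sketch(struct task_sketch *t)
{
	assert(t);
	t->flags &= ~PF_FSTRANS_SKETCH;
	t->journal_info = NULL;
}

/* What a ->writepage could test instead of the PF_FSTRANS flag. */
static bool writepage_allowed_sketch(const struct task_sketch *t)
{
	return t->journal_info == NULL;
}
```

The point of the split is that the flag can then mean "no __GFP_FS
allocations" in general, while the pointer still answers the narrower
question "is this task inside a transaction".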




Re: [PATCH 8/8] extcon: arizona: Use devm_extcon_dev_register()

2014-04-16 Thread Sangjung
To Seung-Woo.


On 04/16/2014 07:44 PM, Seung-Woo Kim wrote:
> Hi,
>
> On 04/16/2014 07:27 PM, Sangjung Woo wrote:
>> Use the resource-managed extcon device register function (i.e.
>> devm_extcon_dev_register()) instead of extcon_dev_register(). If extcon 
>> device
>> is attached with this function, that extcon device is automatically 
>> unregistered
>> on driver detach. That reduces tiresome managing code.
>>
>> Signed-off-by: Sangjung Woo 
>> ---
>>  drivers/extcon/extcon-arizona.c |   13 -
>>  1 file changed, 4 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/extcon/extcon-arizona.c 
>> b/drivers/extcon/extcon-arizona.c
>> index 98a14f6..40e6c0b 100644
>> --- a/drivers/extcon/extcon-arizona.c
>> +++ b/drivers/extcon/extcon-arizona.c
>> @@ -1105,15 +1105,13 @@ static int arizona_extcon_probe(struct 
>> platform_device *pdev)
>>  info = devm_kzalloc(&pdev->dev, sizeof(*info), GFP_KERNEL);
>>  if (!info) {
>>  dev_err(&pdev->dev, "Failed to allocate memory\n");
>> -ret = -ENOMEM;
>> -goto err;
>> +return -ENOMEM;
>>  }
>>  
>>  info->micvdd = devm_regulator_get(arizona->dev, "MICVDD");
>>  if (IS_ERR(info->micvdd)) {
>> -ret = PTR_ERR(info->micvdd);
>>  dev_err(arizona->dev, "Failed to get MICVDD: %d\n", ret);
> Assignment to ret is removed but it is still used here.

You're right.
I will fix and send it as second version.

Thank you for your comment.

BRs,
Sangjung
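
For readers unfamiliar with the resource-managed pattern the patch relies on,
here is a minimal userspace sketch of the idea: every registration records a
release callback on the device, and detach runs them all, newest first. The
managed_dev type and helper names are invented for this example and are not
the kernel's devres API.

```c
#include <assert.h>
#include <stdlib.h>

/* One recorded cleanup action. */
struct devres_node {
	void (*release)(void *data);
	void *data;
	struct devres_node *next;
};

/* Hypothetical device that owns a stack of cleanup actions. */
struct managed_dev {
	struct devres_node *head;
};

/* Demo callback: stores the order in which each resource was released. */
static int release_order;
static void mark_released(void *data)
{
	*(int *)data = ++release_order;
}

/* Record a cleanup action; returns 0 on success, -1 if out of memory. */
static int managed_add(struct managed_dev *dev,
		       void (*release)(void *data), void *data)
{
	struct devres_node *n;

	assert(dev && release);
	n = malloc(sizeof(*n));
	if (!n)
		return -1;
	n->release = release;
	n->data = data;
	n->next = dev->head;	/* push, so release order is reversed */
	dev->head = n;
	return 0;
}

/* Called on detach: run and free every recorded action, newest first. */
static void managed_release_all(struct managed_dev *dev)
{
	while (dev->head) {
		struct devres_node *n = dev->head;

		dev->head = n->next;
		n->release(n->data);
		free(n);
	}
}
```

With this in place a probe function can return early on any error without a
chain of goto labels, which is the code reduction the patch description
refers to.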


Re: [PATCH 16/19] VFS: use GFP_NOFS rather than GFP_KERNEL in __d_alloc.

2014-04-16 Thread NeilBrown
On Wed, 16 Apr 2014 19:00:51 +1000 Dave Chinner  wrote:

> On Wed, Apr 16, 2014 at 04:49:41PM +1000, NeilBrown wrote:
> > On Wed, 16 Apr 2014 16:25:20 +1000 Dave Chinner  wrote:
> > 
> > > On Wed, Apr 16, 2014 at 02:03:37PM +1000, NeilBrown wrote:
> > > > __d_alloc can be called with i_mutex held, so it is safer to
> > > > use GFP_NOFS.
> > > > 
> > > > lockdep reports this can deadlock when loop-back NFS is in use,
> > > > as nfsd may be required to write out for reclaim, and nfsd certainly
> > > > takes i_mutex.
> > > 
> > > But not the same i_mutex as is currently held. To me, this seems
> > > like a false positive? If you are holding the i_mutex on an inode,
> > > then you have a reference to the inode and hence memory reclaim
> > > won't ever take the i_mutex on that inode.
> > > 
> > > FWIW, this sort of false positive was a long-standing problem for
> > > XFS - we managed to get rid of most of the false positives like this
> > > by ensuring that only the ilock is taken within memory reclaim and
> > > memory reclaim can't be entered while we hold the ilock.
> > > 
> > > You can't do that with the i_mutex, though
> > > 
> > > Cheers,
> > > 
> > > Dave.
> > 
> > I'm not sure this is a false positive.
> > You can call __d_alloc when creating a file and so are holding i_mutex on
> > the directory.
> > nfsd might also want to access that directory.
> > 
> > If there was only 1 nfsd thread, it would need to get i_mutex and do its
> > thing before replying to that request and so before it could handle the
> > COMMIT which __d_alloc is waiting for.
> 
> That seems wrong - the NFS client in __d_alloc holds a mutex on a
> NFS client directory inode. The NFS server can't access that
> specific mutex - it's on the other side of the "network". The NFS
> server accesses mutexes from local filesystems, so __d_alloc would
> have to be blocked on a local filesystem inode i_mutex for the nfsd
> to get hung up behind it...

I'm not thinking of mutexes on the NFS inodes but the local filesystem inodes
exactly as you describe below.

> 
> However, my confusion comes from the fact that we do GFP_KERNEL
> memory allocation with the i_mutex held all over the place.

Do we?  Should we?  Isn't the whole point of GFP_NOFS to use it when holding
any filesystem lock?

>   If the
> problem is:
> 
>   local fs access -> i_mutex
> .
>   nfsd -> i_mutex (blocked)
> .
>   local fs access -> kmalloc(GFP_KERNEL)
>   -> direct reclaim
>   -> nfs_release_page
>   -> 
>  
> 
> then why is it just __d_alloc that needs this fix?  Either this is a
> problem *everywhere* or it's not a problem at all.

I think it is a problem everywhere that it is a problem :-)
If you are holding an FS lock, then you should be using GFP_NOFS.
Currently a given filesystem can get away with sometimes using GFP_KERNEL
because that particular lock never causes contention during reclaim for that
particular filesystem.

Adding loop-back NFS into the mix broadens the number of locks which can
cause a problem as it creates interdependencies between different filesystems.

> 
> If it's a problem everywhere it means that we simply can't allow
> reclaim from localhost NFS mounts to run from contexts that could
> block an NFSD. i.e. you cannot run NFS client memory reclaim from
> filesystems that are NFS server exported filesystems.

Well.. you cannot allow NFS client memory reclaim *while holding locks in*
filesystems that are NFS exported.

I think this is most effectively generalised to:
  you cannot allow FS memory reclaim while holding locks in filesystems which
  can be NFS exported

which I think is largely the case already - and lockdep can help us find
those places where we currently do allow FS reclaim while holding an FS lock.

Thanks,
NeilBrown




Re: [DRIVER CORE] drivers/base/dd.c incorrect pr_debug() parameters

2014-04-16 Thread Joe Perches
On Wed, 2014-04-16 at 17:12 -0700, Frank Rowand wrote:
> pr_debug() parameters are reverse order of format string

Another way to do this might be to change all the
printks/pr_debugs to dev_

Something like:
---
 drivers/base/dd.c | 35 +--
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 0605176..454df77 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -182,13 +182,12 @@ late_initcall(deferred_probe_initcall);
 static void driver_bound(struct device *dev)
 {
if (klist_node_attached(&dev->p->knode_driver)) {
-   printk(KERN_WARNING "%s: device %s already bound\n",
-   __func__, kobject_name(>kobj));
+   dev_warn(dev, "%s: device already bound\n", __func__);
return;
}
 
-   pr_debug("driver: '%s': %s: bound to device '%s'\n", dev_name(dev),
-__func__, dev->driver->name);
+   dev_dbg(dev, "%s: driver '%s' bound to device\n",
+   __func__, dev->driver->name);
 
klist_add_tail(&dev->p->knode_driver, &dev->driver->p->klist_devices);
 
@@ -267,8 +266,8 @@ static int really_probe(struct device *dev, struct 
device_driver *drv)
int ret = 0;
 
atomic_inc(&probe_count);
-   pr_debug("bus: '%s': %s: probing driver %s with device %s\n",
-drv->bus->name, __func__, drv->name, dev_name(dev));
+   dev_dbg(dev, "%s: bus: '%s': probing driver '%s'\n",
+   __func__, drv->bus->name, drv->name);
WARN_ON(!list_empty(&dev->devres_head));
 
dev->driver = drv;
@@ -279,8 +278,8 @@ static int really_probe(struct device *dev, struct 
device_driver *drv)
goto probe_failed;
 
if (driver_sysfs_add(dev)) {
-   printk(KERN_ERR "%s: driver_sysfs_add(%s) failed\n",
-   __func__, dev_name(dev));
+   dev_err(dev, "%s: driver '%s' - driver_sysfs_add failed\n",
+   __func__, drv->name);
goto probe_failed;
}
 
@@ -296,8 +295,8 @@ static int really_probe(struct device *dev, struct 
device_driver *drv)
 
driver_bound(dev);
ret = 1;
-   pr_debug("bus: '%s': %s: bound device %s to driver %s\n",
-drv->bus->name, __func__, dev_name(dev), drv->name);
+   dev_dbg(dev, "%s: bus: '%s' - bound to driver '%s'\n",
+   __func__, drv->bus->name, drv->name);
goto done;
 
 probe_failed:
@@ -308,16 +307,16 @@ probe_failed:
 
if (ret == -EPROBE_DEFER) {
/* Driver requested deferred probing */
-   dev_info(dev, "Driver %s requests probe deferral\n", drv->name);
+   dev_info(dev, "driver '%s' requests probe deferral\n",
+drv->name);
driver_deferred_probe_add(dev);
} else if (ret != -ENODEV && ret != -ENXIO) {
/* driver matched but the probe failed */
-   printk(KERN_WARNING
-  "%s: probe of %s failed with error %d\n",
-  drv->name, dev_name(dev), ret);
+   dev_warn(dev, "probe of driver '%s' failed with error %d\n",
+drv->name, ret);
} else {
-   pr_debug("%s: probe of %s rejects match %d\n",
-  drv->name, dev_name(dev), ret);
+   dev_dbg(dev, "probe of driver '%s' rejects match %d\n",
+   drv->name, ret);
}
/*
 * Ignore errors returned by ->probe so that the next driver can try
@@ -375,8 +374,8 @@ int driver_probe_device(struct device_driver *drv, struct 
device *dev)
if (!device_is_registered(dev))
return -ENODEV;
 
-   pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
-drv->bus->name, __func__, dev_name(dev), drv->name);
+   dev_dbg(dev, "%s: bus: '%s' - matched with driver '%s'\n",
+   __func__, drv->bus->name, drv->name);
 
pm_runtime_barrier(dev);
ret = really_probe(dev, drv);
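
Why the dev_ style helps with the argument-order problem can be shown with a
userspace mock. The device_sketch type and dev_msg_sketch() are assumptions
made for this example only (the kernel's dev_dbg() is a different, more
involved macro); the point is that the device name is emitted by the helper,
so the caller has one fewer "%s" argument to get in the wrong order.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct device: only the name matters here. */
struct device_sketch {
	const char *name;
};

/*
 * Format "<device name>: <message>" into buf. Returns the length the full
 * message needs (as snprintf does), or a negative value on error.
 */
static int dev_msg_sketch(char *buf, size_t len,
			  const struct device_sketch *dev,
			  const char *fmt, ...)
{
	va_list ap;
	int n, m;

	assert(buf && dev && dev->name && fmt);

	n = snprintf(buf, len, "%s: ", dev->name);
	if (n < 0 || (size_t)n >= len)
		return n;	/* error, or no room left for the message */

	va_start(ap, fmt);
	m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
	va_end(ap);

	return m < 0 ? m : n + m;
}
```

For example, dev_msg_sketch(buf, sizeof(buf), &d, "bound to driver '%s'",
"e1000") with d.name set to "eth0" yields "eth0: bound to driver 'e1000'".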




Re: [PATCH 2/2] nohz: use delayed iowait accounting to avoid race on idle time stats

2014-04-16 Thread Hidetoshi Seto
(2014/04/16 18:36), Peter Zijlstra wrote:
> On Wed, Apr 16, 2014 at 03:33:06PM +0900, Hidetoshi Seto wrote:
>> So we need 2 operations:
>>   a) remove regression
> 
> What regression; there's never been talk about a regression, just a bug
> found. AFAICT this 'regression' is ever since we introduced NOHZ or
> somesuch, which is very long ago indeed.
> 
> And since it's basically been broken forever, there's no rush whatsoever.

Well, from a customer's view, when he upgrades his foobar enterprise
linux from version 5 to 6 (for example), he will say "it's a regression"
if something that worked well in the previous version is broken in the new
version without any documentation, technical notes, etc.

That's why I used the word "regression" for this bug.

>>   b) implement new iowait accounting mechanism
>>
>> What Frederic mentioned is that we don't need a) once we invent
>> a solution for b). But I doubt it, because a) is still required
>> for stable environments, including some distributors' kernels.
>> It is clear that patches for b) will not be backportable.
>>
>> Still, b) is a disease that has no known cure. There is no reason
>> to wait for work on b) before starting work on a).
> 
> As stated, there is no a). Its been forever broken. There is no urgency.

I just wrote my patches for my customer like above and for my salary ;-)

Thank you for your comments!
I'll post my v4 patch set soon.


Thanks,
H.Seto





Re: Re: [Nfs-ganesha-devel] should we change the name/macros of file-private locks?

2014-04-16 Thread Jim Lieb
On Wednesday, April 16, 2014 13:16:33 Jeremy Allison wrote:
> On Wed, Apr 16, 2014 at 10:00:46PM +0200, Michael Kerrisk (man-pages) wrote:
> > [CC += Jeremy Allison]
> > 
> > On Wed, Apr 16, 2014 at 8:57 PM, Jeff Layton  wrote:
> > > Sorry to spam so many lists, but I think this needs widespread
> > > distribution and consensus.
> > > 
> > > File-private locks have been merged into Linux for v3.15, and *now*
> > > people are commenting that the name and macro definitions for the new
> > > file-private locks suck.
> > > 
> > > ...and I can't even disagree. They do suck.
> > > 
> > > We're going to have to live with these for a long time, so it's
> > > important that we be happy with the names before we're stuck with them.
> > 
> > So, to add my perspective: The existing byte-range locking system has
> > persisted (despite egregious faults) for well over two decades. One
> > supposes that Jeff's new improved version might be around
> > at least as long. With that in mind, and before setting in stone (and
> > pushing into POSIX) a model of thinking that thousands of programmers
> > will live with for a long time, it's worth thinking about names.
> > 
> > > Michael Kerrisk suggested several names but I think the only one that
> > > doesn't have other issues is "file-associated locks", which can be
> > > distinguished against "process-associated" locks (aka classic POSIX
> > > locks).
> > 
> > The names I have suggested are:
> > file-associated locks
> > 
> > or
> > 
> >file-handle locks
> > 
> > or (using POSIX terminology)
> > 
> > file-description locks
> 
> Thanks for the CC: Michael, but to be honest
> I don't really care what the name is, I just
> want the functionality. I can change our build
> system to cope with detecting it under any name
> you guys choose :-).
> 
> Cheers,
> 
>   Jeremy.

I and the rest of the nfs-ganesha community are with Jeremy and samba wrt 
names.  We just want locks that work, i.e. Useful Locks ;)

Jim

-- 
Jim Lieb
Linux Systems Engineer
Panasas Inc.

"If ease of use was the only requirement, we would all be riding tricycles"
- Douglas Engelbart 1925–2013


Re: mm: kernel BUG at mm/huge_memory.c:1829!

2014-04-16 Thread Andrea Arcangeli
Hi Kirill,

On Mon, Apr 14, 2014 at 05:42:18PM +0300, Kirill A. Shutemov wrote:
> I've spent a few days trying to understand the rmap code. And now I think my
> patch is wrong.
> 
> I actually don't see where the walk order requirement comes from. It seems all
> operations (insert, remove, foreach) on anon_vma are serialized with
> anon_vma->root->rwsem. Andrea, could you explain this for me?

It's true the locking protects and freezes the view of all anon_vma
structures associated with the page, but that only guarantees you not
to miss the vma. Not missing the vma is not enough. You can still miss
a pte during the rmap_walk if the order is wrong, because the pte/pmds
are still moving freely under the vmas (absent of the PT lock and the
mmap_sem).

The problem is all the MM operations that copy or move a page mapping
from a source to a destination vma (fork and mremap). They take the
anon_vma lock, insert the destination vma with the proper anon_vma
chains and then they _drop_ the anon_vma lock, and only later do they
start moving ptes and pmds around (by taking the proper PT locks).

anon_vma -> src_vma -> dst_vma

If the order is like above (guaranteed before the interval tree was
introduced) and the rmap_walk of split_huge_page and migrate
encounters the source pte/pmd and splits/unmaps it before it gets
copied, then the copy or move will retain the processed state (regular
pte instead of trans_huge_pmd for split_huge_page or migration pte for
migrate). If instead the pmd/pte was already copied by the time the
src_vma is scanned, then the walk will encounter the copy to process in
the dst_vma too. The rmap_walk can't miss a pte/pmd if the anon_vma chain
is walked in insertion order (i.e. older vma first).

anon_vma -> dst_vma -> src_vma

If the anon_vma walk order is reversed vs the insertion order, things
fall apart because you will scan dst_vma in split_huge_page while it is
still empty, find nothing, and then the trans_huge_pmd is moved or copied
from src_vma to dst_vma by the MM code only holding the PT lock and
mmap_sem for writing (we cannot hold those across the whole duration
of split_huge_page and migrate). So if the rmap_walk order is not
right, the rmap_walk can miss the contents of the dst_vma that was
still empty at the time it was processed.
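
The interleavings described above can be checked with a small user-space
model. This is only a sketch under stated assumptions: the names (toy_vma,
toy_walk_finds_mapping, toy_order_is_safe) are invented for illustration, a
single boolean stands in for the pte/pmd, and "processed" stands in for
split/unmap. It is not kernel code.

```c
#include <stdbool.h>

enum toy_vma { TOY_SRC = 0, TOY_DST = 1 };

/*
 * Toy model: one mapping starts in the source vma and is moved to the
 * destination vma after `move_after` vmas have been scanned (0 = before
 * the walk, 1 = between the two visits, 2 = after the walk).
 * Returns true if the walk processed the mapping at some point.
 */
static bool toy_walk_finds_mapping(const enum toy_vma order[2], int move_after)
{
	bool mapped[2] = { true, false };	/* mapped[TOY_SRC], mapped[TOY_DST] */
	bool processed = false;

	for (int step = 0; step < 2; step++) {
		if (step == move_after) {
			/* the copy/move runs without the anon_vma lock held */
			mapped[TOY_DST] = mapped[TOY_SRC];
			mapped[TOY_SRC] = false;
		}
		if (mapped[order[step]])
			processed = true;	/* a processed mapping stays processed */
	}
	return processed;
}

/* Returns true if no interleaving lets the mapping escape the walk. */
static bool toy_order_is_safe(const enum toy_vma order[2])
{
	for (int move_after = 0; move_after <= 2; move_after++)
		if (!toy_walk_finds_mapping(order, move_after))
			return false;
	return true;
}
```

With the order { TOY_SRC, TOY_DST } every interleaving is covered. With
{ TOY_DST, TOY_SRC } the move that lands between the two visits escapes the
walk, which is the miss described above.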

If the interval tree walk order cannot be fixed without screwing with
the computational complexity of the structure, a more black and white
fix could be to add an anon_vma templist to scan in O(N) after the
interval tree has been scanned, where you add newly inserted vmas.
The templist shall then be flushed back to the interval tree only
after the pte/pmd mangling of the MM operation is completed. That
requires identifying the closure of the critical section for those
problematic MM operations. The main drawback is actually having to
take the anon_vma lock twice, the second time for the flush to the
interval tree.

Looping like in your previous patch would be much simpler if it could
be made reliable, but it looked like it wouldn't close the bug
entirely because any concurrent unmap operation could lead to a false
negative hiding the pmd/pte walk miss (by decreasing page->mapcount
under us).

Comments?

Thanks,
Andrea


Re: [PATCH] thp: close race between split and zap huge pages

2014-04-16 Thread Bob Liu
On Wed, Apr 16, 2014 at 4:42 PM, Kirill A. Shutemov
 wrote:
> On Wed, Apr 16, 2014 at 07:52:29AM +0800, Bob Liu wrote:
>> > *ptl = pmd_lock(mm, pmd);
>> > -   if (pmd_none(*pmd))
>> > +   if (!pmd_present(*pmd))
>> > goto unlock;
>>
>> But I didn't get the idea why pmd_none() was removed?
>
> !pmd_present(*pmd) is a weaker check than pmd_none(*pmd). I mean if
> pmd_none(*pmd) is true then pmd_present(*pmd) is always false.

Oh, yes. That's right.
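
A toy model of the two predicates may make the implication quoted above
easier to see. The bit layout and the toy_* names are assumptions made for
this sketch only; the real pmd_none() and pmd_present() are
architecture-specific and are not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PMD_PRESENT 0x1u	/* hypothetical "present" bit */

/* Toy predicates over a made-up 32-bit entry. */
static bool toy_pmd_none(uint32_t pmd)
{
	return pmd == 0;
}

static bool toy_pmd_present(uint32_t pmd)
{
	return (pmd & TOY_PMD_PRESENT) != 0;
}

/* Old check: bail out only on an empty entry. */
static bool toy_skip_old(uint32_t pmd)
{
	return toy_pmd_none(pmd);
}

/* New check: bail out on anything that is not present. */
static bool toy_skip_new(uint32_t pmd)
{
	return !toy_pmd_present(pmd);
}
```

In this model an entry such as 0x2 (non-zero, present bit clear) passes the
old check but is caught by the new one, while every entry caught by the old
check is still caught.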

BTW, it looks like this bug was introduced by the same reason.
https://lkml.org/lkml/2014/4/16/403

-- 
Regards,
--Bob


Re: [PATCH v5 2/4] arm64: dts: APM X-Gene PCIe device tree nodes

2014-04-16 Thread Liviu Dudau
On Wed, Apr 16, 2014 at 03:21:04PM -0600, Jason Gunthorpe wrote:
> On Wed, Apr 16, 2014 at 06:05:45PM +0100, Liviu Dudau wrote:
> 
> > I have found out that we cannot pass the config ranges from the DT into the
> > pci_host_bridge structure as the PCI framework doesn't have a resource type
> > for config resources. Leaving the translation between range flags and
> > resource type as is (filtered through the IORESOURCE_TYPE_BITS) will lead
> > to a resource type of value zero, which is not recognised by any resource
> > handling API so bridge configuration and bus scanning will barf.
> > 
> > I'm looking for suggestions here, as Jason Gunthorpe suggested that we
> > should be able to parse config ranges if they conform to the ECAM part
> > of the PCI standard.
> 
> The thinking here is the ranges should be well defined and general, it
> isn't a dumping ground for driver specific stuff.
> 
> No spec says you can put config space into the ranges at all, nobody
> should be doing that today, obviously some cases were missed during
> review..

The ePAPR document allows that when ss == 00.

> 
> The comment about ECAM was intended as a general guidance on what
> config space in ranges could/should be used for.
> 
> Right now config space shouldn't propagate outside any driver, so you
> can probably just filter it in your generic code, and make it very hard
> and obviously wrong for a driver to parse ranges for config space, so
> we don't get more usages.

OK, this goes slightly against your email from 26th March:

"When we talked about this earlier on the DT bindings list the
consensus seemed to be that configuration MMIO ranges should only be
used if the underlying memory was exactly ECAM, and was not to be used
for random configuration related register blocks.

The rationale being that generic code, upon seeing that ranges entry,
could just go ahead and assume ECAM mapping."

What I'm saying is that the only code that will see this ranges entry will
be the parsing code, because if we try to create a resource out of the range
and add it to the host bridge structure (not the driver) we will confuse the
rest of the pci_host_bridge API. So we cannot do any ECAM accesses (yet?).
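
The "resource type of value zero" problem can be illustrated with a short
decoder for the space code. This is a sketch, not the kernel's parser: the
TOY_RES_* values and toy_range_type() are hypothetical names, and the only
thing taken as given is the two-bit ss field (bits 25:24 of the phys.hi
cell) with 00 = configuration, 01 = I/O, 10 = 32-bit memory and
11 = 64-bit memory.

```c
#include <stdint.h>

/* Hypothetical resource type values; 0 means "no recognised type". */
#define TOY_RES_NONE 0x000u
#define TOY_RES_IO   0x100u
#define TOY_RES_MEM  0x200u

/*
 * Decode the two-bit space code "ss" (bits 25:24 of the phys.hi cell)
 * into a toy resource type.
 */
static uint32_t toy_range_type(uint32_t phys_hi)
{
	switch ((phys_hi >> 24) & 0x3u) {
	case 0x1:	/* I/O space */
		return TOY_RES_IO;
	case 0x2:	/* 32-bit memory space */
	case 0x3:	/* 64-bit memory space */
		return TOY_RES_MEM;
	default:	/* 0x0: configuration space, no matching type */
		return TOY_RES_NONE;
	}
}
```

A range with ss == 00 falls through to TOY_RES_NONE, so any consumer that
dispatches on the type has nothing to match against.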


Best regards,
Liviu

> 
> Jason

-- 
---
   .oooO
   (   )
\ (  Oooo.
 \_) (   )
  ) /
 (_/

 One small step
   for me ...



Re: [PATCH/RFC 00/19] Support loop-back NFS mounts

2014-04-16 Thread NeilBrown
On Wed, 16 Apr 2014 10:42:07 -0400 Jeff Layton  wrote:

> On Wed, 16 Apr 2014 14:03:35 +1000
> NeilBrown  wrote:
> 

> > Comments, criticisms, etc most welcome.
> > 
> > Thanks,
> > NeilBrown
> > 
> 
> I've only given this a once-over, but the basic concept seems a bit
> flawed. IIUC, the basic idea is to disallow allocations done in knfsd
> threads context from doing fs-based reclaim.
> 
> This seems very heavy-handed, and like it could cause problems on a
> busy NFS server. Those sorts of servers are likely to have a lot of
> data in pagecache and thus we generally want to allow them to do
> writeback when memory is tight.
> 
> It's generally acceptable for knfsd to recurse into local filesystem
> code for writeback. What you want to avoid in this situation is reclaim
> on NFS filesystems that happen to be from knfsd on the same box.
> 
> If you really want to fix this, what may make more sense is trying to
> plumb that information down more granularly. Maybe GFP_NONETFS and/or
> PF_NETFSTRANS flags?

Hi Jeff,
 a few clarifications first:

 1/ These changes probably won't affect a "busy NFS server" at all.  The
PF_FSTRANS flag only gets set in nfsd when it sees a request from the local
host.  Most busy NFS servers would never see that, and so would never set
PF_FSTRANS.

 2/ Setting PF_FSTRANS does not affect where writeback is done.  Direct
reclaim hasn't performed filesystem writeback since 3.2, it is all done
by kswapd (I think direct reclaim still writes to swap sometimes).
The main effects of setting PF_FSTRANS (as modified by this patch set)
are:
  - when reclaim calls ->releasepage  __GFP_FS is not set in the gfp_t arg
  - various caches like dcache, icache etc are not shrunk from
direct reclaim
There are other effects, but I'm less clear on exactly what they mean.
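
The two effects listed above amount to masking one bit out of the
allocation flags for a marked task. The following is a rough user-space
sketch of that idea; the TOY_* constants and both functions are invented
for illustration and their values do not match the kernel's.

```c
#include <stdint.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define TOY_GFP_IO      0x1u
#define TOY_GFP_FS      0x2u
#define TOY_PF_FSTRANS  0x1u

/* Allocation flags that reclaim would see for a task with these flags. */
static uint32_t toy_effective_gfp(uint32_t task_flags, uint32_t gfp)
{
	if (task_flags & TOY_PF_FSTRANS)
		gfp &= ~TOY_GFP_FS;	/* no filesystem recursion from reclaim */
	return gfp;
}

/* Would direct reclaim be allowed to shrink dcache/icache style caches? */
static int toy_may_shrink_fs_caches(uint32_t task_flags, uint32_t gfp)
{
	return (toy_effective_gfp(task_flags, gfp) & TOY_GFP_FS) != 0;
}
```

A task without the flag keeps both bits, so only requests that set
TOY_PF_FSTRANS see the restricted behaviour.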

A flag specific to network filesystems might make sense, but I don't think it
would solve all the deadlocks.

A good example is the deadlock with the flush-* threads.
flush-* will lock a page, and  then call ->writepage.  If ->writepage
allocates memory it can enter reclaim, call ->releasepage on NFS, and block
waiting for a COMMIT to complete.
The COMMIT might already be running, performing fsync on that same file that
flush-* is flushing.  It locks each page in turn.  When it  gets to the page
that flush-* has locked, it will deadlock.

xfs_vm_writepage does allocate memory with __GFP_FS set
   xfs_vm_writepage -> xfs_setfilesize_trans_alloc -> xfs_trans_alloc ->
   _xfs_trans_allo

and I have had this deadlock happen.  To avoid this we need flush-* to ensure
that no memory allocation blocks on NFS.  We could set a PF_NETFSTRANS there,
but as that code really has nothing to do with networks it would seem an odd
place to put a network-fs-specific flag.

In general, if nfsd is allowed to block on a local filesystem, and a local
filesystem is allowed to block on NFS, then a deadlock can happen.
We would need a clear hierarchy

   __GFP_NETFS > __GFP_FS > __GFP_IO

for it to work.  I'm not sure the extra level really helps a lot and it would
be a lot of churn.
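
One possible reading of such a hierarchy, sketched in user-space C: clearing
a level also clears every level above it. The bit values and
toy_clear_from() are assumptions made for this example, and nothing here is
an existing kernel interface.

```c
#include <stdint.h>

/* Hypothetical levels, lowest first: IO < FS < NETFS. */
#define TOY_GFP_IO     0x1u
#define TOY_GFP_FS     0x2u
#define TOY_GFP_NETFS  0x4u

/*
 * Clearing one level also clears every level above it, so a caller that
 * must not recurse into local filesystems can never block on a network
 * filesystem either. `level` must be exactly one of the bits above.
 */
static uint32_t toy_clear_from(uint32_t gfp, uint32_t level)
{
	uint32_t at_or_above = ~(level - 1u);	/* level bit and all higher bits */

	return gfp & ~at_or_above;
}
```

Under this reading, a thread serving a loop-back request would clear only
TOY_GFP_NETFS and could still reclaim through local filesystems, while
local filesystem writeback would clear TOY_GFP_FS and lose TOY_GFP_NETFS
with it.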


Thanks,
NeilBrown





Re: Info: mapping multiple BARs. Your kernel is fine.

2014-04-16 Thread Dave Jones
On Wed, Apr 16, 2014 at 04:56:00PM -0600, Bjorn Helgaas wrote:
 
 > > I'm seeing the exact same message on my thinkpad t430s.
 > > When I try your patch, modesetting no longer works. When it tries
 > > to change to the framebuffer I get a black screen and lockup.
 > > If I boot with nomodeset it locks up when it gets to X.
 > > It all scrolls by too fast to read, but it looks like there's still
 > > a backtrace present.
 > 
 > Ouch, sorry about that.  I do see a bug in my patch (fixed below), but I
 > don't see how that could cause what you're seeing.

updated diff made no difference fwiw.

 > Maybe I could figure
 > out something from this info (this can be from a kernel without my patch):
 > 
 > - dmesg log
 > - output of "find /sys/devices/pnp0 -name id -o -name resources | xargs 
 > grep ."
 > - output of "sudo lspci -s00:00.0 -xxx"

attached from a fedora build of rc1.

Dave

/sys/devices/pnp0/00:00/id:PNP0c01
/sys/devices/pnp0/00:00/resources:state = active
/sys/devices/pnp0/00:00/resources:mem 0x0-0x9
/sys/devices/pnp0/00:00/resources:mem 0xc-0xc3fff
/sys/devices/pnp0/00:00/resources:mem 0xc4000-0xc7fff
/sys/devices/pnp0/00:00/resources:mem 0xc8000-0xcbfff
/sys/devices/pnp0/00:00/resources:mem 0xcc000-0xc
/sys/devices/pnp0/00:00/resources:mem 0xd-0xd3fff
/sys/devices/pnp0/00:00/resources:mem 0xd4000-0xd7fff
/sys/devices/pnp0/00:00/resources:mem 0xd8000-0xdbfff
/sys/devices/pnp0/00:00/resources:mem 0xdc000-0xd
/sys/devices/pnp0/00:00/resources:mem 0xe-0xe3fff
/sys/devices/pnp0/00:00/resources:mem 0xe4000-0xe7fff
/sys/devices/pnp0/00:00/resources:mem 0xe8000-0xebfff
/sys/devices/pnp0/00:00/resources:mem 0xec000-0xe
/sys/devices/pnp0/00:00/resources:mem 0xf-0xf
/sys/devices/pnp0/00:00/resources:mem 0x10-0xbf9f
/sys/devices/pnp0/00:00/resources:mem 0xfec0-0xfed3
/sys/devices/pnp0/00:00/resources:mem 0xfed4c000-0x
/sys/devices/pnp0/00:01/id:PNP0c02
/sys/devices/pnp0/00:01/resources:state = active
/sys/devices/pnp0/00:01/resources:io 0x10-0x1f
/sys/devices/pnp0/00:01/resources:io 0x90-0x9f
/sys/devices/pnp0/00:01/resources:io 0x24-0x25
/sys/devices/pnp0/00:01/resources:io 0x28-0x29
/sys/devices/pnp0/00:01/resources:io 0x2c-0x2d
/sys/devices/pnp0/00:01/resources:io 0x30-0x31
/sys/devices/pnp0/00:01/resources:io 0x34-0x35
/sys/devices/pnp0/00:01/resources:io 0x38-0x39
/sys/devices/pnp0/00:01/resources:io 0x3c-0x3d
/sys/devices/pnp0/00:01/resources:io 0xa4-0xa5
/sys/devices/pnp0/00:01/resources:io 0xa8-0xa9
/sys/devices/pnp0/00:01/resources:io 0xac-0xad
/sys/devices/pnp0/00:01/resources:io 0xb0-0xb5
/sys/devices/pnp0/00:01/resources:io 0xb8-0xb9
/sys/devices/pnp0/00:01/resources:io 0xbc-0xbd
/sys/devices/pnp0/00:01/resources:io 0x50-0x53
/sys/devices/pnp0/00:01/resources:io 0x72-0x77
/sys/devices/pnp0/00:01/resources:io 0x400-0x47f
/sys/devices/pnp0/00:01/resources:io 0x500-0x57f
/sys/devices/pnp0/00:01/resources:io 0x800-0x80f
/sys/devices/pnp0/00:01/resources:io 0x15e0-0x15ef
/sys/devices/pnp0/00:01/resources:io 0x1600-0x167f
/sys/devices/pnp0/00:01/resources:mem 0xf800-0xfbff
/sys/devices/pnp0/00:01/resources:mem disabled
/sys/devices/pnp0/00:01/resources:mem 0xfed1c000-0xfed1
/sys/devices/pnp0/00:01/resources:mem 0xfed1-0xfed13fff
/sys/devices/pnp0/00:01/resources:mem 0xfed18000-0xfed18fff
/sys/devices/pnp0/00:01/resources:mem 0xfed19000-0xfed19fff
/sys/devices/pnp0/00:01/resources:mem 0xfed45000-0xfed4bfff
/sys/devices/pnp0/00:02/id:PNP0103
/sys/devices/pnp0/00:02/resources:state = active
/sys/devices/pnp0/00:02/resources:mem 0xfed0-0xfed003ff
/sys/devices/pnp0/00:03/id:PNP0200
/sys/devices/pnp0/00:03/resources:state = active
/sys/devices/pnp0/00:03/resources:io 0x0-0xf
/sys/devices/pnp0/00:03/resources:io 0x80-0x8f
/sys/devices/pnp0/00:03/resources:io 0xc0-0xdf
/sys/devices/pnp0/00:03/resources:dma 4
/sys/devices/pnp0/00:04/id:PNP0800
/sys/devices/pnp0/00:04/resources:state = active
/sys/devices/pnp0/00:04/resources:io 0x61-0x61
/sys/devices/pnp0/00:05/id:PNP0c04
/sys/devices/pnp0/00:05/resources:state = active
/sys/devices/pnp0/00:05/resources:io 0xf0-0xf0
/sys/devices/pnp0/00:05/resources:irq 13
/sys/devices/pnp0/00:06/id:PNP0b00
/sys/devices/pnp0/00:06/resources:state = active
/sys/devices/pnp0/00:06/resources:io 0x70-0x71
/sys/devices/pnp0/00:06/resources:irq 8
/sys/devices/pnp0/00:07/id:LEN0071
/sys/devices/pnp0/00:07/id:PNP0303
/sys/devices/pnp0/00:07/resources:state = active
/sys/devices/pnp0/00:07/resources:io 0x60-0x60
/sys/devices/pnp0/00:07/resources:io 0x64-0x64
/sys/devices/pnp0/00:07/resources:irq 1
/sys/devices/pnp0/00:08/id:LEN0015
/sys/devices/pnp0/00:08/id:PNP0f13
/sys/devices/pnp0/00:08/resources:state = active
/sys/devices/pnp0/00:08/resources:irq 12
/sys/devices/pnp0/00:09/id:SMO1200
/sys/devices/pnp0/00:09/id:PNP0c31
/sys/devices/pnp0/00:09/resources:state = active
/sys/devices/pnp0/00:09/resources:mem 0xfed4-0xfed44fff

00:00.0 Host bridge: Intel Corporation 3rd Gen Core 

Re: [3.14+] kernel BUG at mm/filemap.c:1347!

2014-04-16 Thread Hugh Dickins
On Wed, 16 Apr 2014, Johannes Weiner wrote:
> Subject: [patch] mm: filemap: update find_get_pages_tag() to deal with shadow
>  entries
> 
> Dave Jones reports the following crash when find_get_pages_tag() runs
> into an exceptional entry:
> 
> kernel BUG at mm/filemap.c:1347!
> RIP: 0010:[]  [] 
> find_get_pages_tag+0x1cb/0x220
> Call Trace:
>  [] ? find_get_pages_tag+0x36/0x220
>  [] pagevec_lookup_tag+0x21/0x30
>  [] filemap_fdatawait_range+0xbe/0x1e0
>  [] filemap_fdatawait+0x27/0x30
>  [] sync_inodes_sb+0x204/0x2a0
>  [] ? wait_for_completion+0xff/0x130
>  [] ? vfs_fsync+0x40/0x40
>  [] sync_inodes_one_sb+0x19/0x20
>  [] iterate_supers+0xb2/0x110
>  [] sys_sync+0x44/0xb0
>  [] ia32_do_call+0x13/0x13
> 
> 1343 /*
> 1344  * This function is never used on a shmem/tmpfs
> 1345  * mapping, so a swap entry won't be found here.
> 1346  */
> 1347 BUG();
> 
> After 0cd6144aadd2 ("mm + fs: prepare for non-page entries in page
> cache radix trees") this comment and BUG() are out of date because
> exceptional entries can now appear in all mappings - as shadows of
> recently evicted pages.
> 
> However, as Hugh Dickins notes,
> 
>   "it is truly surprising for a PAGECACHE_TAG_WRITEBACK (and probably
>any other PAGECACHE_TAG_*) to appear on an exceptional entry.
> 
>I expect it comes down to an occasional race in RCU lookup of the
>radix_tree: lacking absolute synchronization, we might sometimes
>catch an exceptional entry, with the tag which really belongs with
>the unexceptional entry which was there an instant before."
> 
> And indeed, not only is the tree walk lockless, the tags are also read
> in chunks, one radix tree node at a time.  There is plenty of time for
> page reclaim to swoop in and replace a page that was already looked up
> as tagged with a shadow entry.
> 
> Remove the BUG() and update the comment.  While reviewing all other
> lookup sites for whether they properly deal with shadow entries of
> evicted pages, update all the comments and fix memcg file charge
> moving to not miss shmem/tmpfs swapcache pages.
> 
> Reported-by: Dave Jones 
> Signed-off-by: Johannes Weiner 
> Fixes: 0cd6144aadd2 ("mm + fs: prepare for non-page entries in page cache 
> radix trees")

Looks exactly right to me, thanks Hannes.  Good catch in memcontrol.c.

Acked-by: Hugh Dickins 

And I realize now that the tag races which led me to defer to you, are
actually just races we have lived with for years; but before they were
all handled invisibly at the "unlikely(!page)" stage, whereas now they
simply need active handling at the radix_tree_exception stage too.
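
As a rough illustration of "active handling" at that stage, here is a
user-space sketch of a gather loop that skips exceptional entries instead
of treating them as a bug. The slot encoding, TOY_EXCEPTIONAL and the
toy_* functions are hypothetical; this is not the radix tree API.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_EXCEPTIONAL 0x2u	/* hypothetical tag bit in the slot value */

static int toy_is_exceptional(uintptr_t slot)
{
	return (slot & TOY_EXCEPTIONAL) != 0;
}

/*
 * Collect up to `max` ordinary entries from `slots`, skipping empty
 * slots and exceptional entries instead of treating them as a bug.
 * Returns the number of entries stored in `out`.
 */
static size_t toy_gather(const uintptr_t *slots, size_t nr_slots,
			 uintptr_t *out, size_t max)
{
	size_t found = 0;

	for (size_t i = 0; i < nr_slots && found < max; i++) {
		if (slots[i] == 0)
			continue;	/* nothing here, or it vanished under us */
		if (toy_is_exceptional(slots[i]))
			continue;	/* shadow or swap entry: not a page */
		out[found++] = slots[i];
	}
	return found;
}
```

The point of the sketch is only that a racing replacement shows up as one
more case to skip, next to the empty slot that was already tolerated.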

There is, by the way, a separate cleanup that I noticed last night,
while puzzling over the filemap.c:202 bug.  In mm/truncate.c there
are several "We rely upon deletion not changing page->index" comments
(and in mm/filemap.c "Leave page->index set: truncation relies upon it").
I think your indices[] everywhere have ended that reliance?  Whether you
also remove the "WARN_ON(page->index != index)"s is a matter of taste:
it is reassuring to have that checked somewhere, but no longer so
particular to those loops.

> ---
>  mm/filemap.c| 49 -
>  mm/memcontrol.c | 20 
>  mm/truncate.c   |  8 
>  3 files changed, 40 insertions(+), 37 deletions(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index a82fbe4c9e8e..d92c437a79c4 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -906,8 +906,8 @@ EXPORT_SYMBOL(page_cache_prev_hole);
>   * Looks up the page cache slot at @mapping & @offset.  If there is a
>   * page cache page, it is returned with an increased refcount.
>   *
> - * If the slot holds a shadow entry of a previously evicted page, it
> - * is returned.
> + * If the slot holds a shadow entry of a previously evicted page, or a
> + * swap entry from shmem/tmpfs, it is returned.
>   *
>   * Otherwise, %NULL is returned.
>   */
> @@ -928,9 +928,9 @@ repeat:
>   if (radix_tree_deref_retry(page))
>   goto repeat;
>   /*
> -  * Otherwise, shmem/tmpfs must be storing a swap entry
> -  * here as an exceptional entry: so return it without
> -  * attempting to raise page count.
> +  * A shadow entry of a recently evicted page,
> +  * or a swap entry from shmem/tmpfs.  Return
> +  * it without attempting to raise page count.
>*/
>   goto out;
>   }
> @@ -983,8 +983,8 @@ EXPORT_SYMBOL(find_get_page);
>   * page cache page, it is returned locked and with an increased
>   * refcount.
>   *
> - * If the slot holds a shadow entry of a previously evicted page, it
> - * is returned.
> + * If the slot holds a shadow entry of a 

[DRIVER CORE] drivers/base/dd.c incorrect pr_debug() parameters

2014-04-16 Thread Frank Rowand
pr_debug() parameters are in the reverse order of the format string.

Signed-off-by: Frank Rowand 
---

 drivers/base/dd.c |4   2 + 2 - 0 !
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: b/drivers/base/dd.c
===
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -187,8 +187,8 @@ static void driver_bound(struct device *
return;
}
 
-   pr_debug("driver: '%s': %s: bound to device '%s'\n", dev_name(dev),
-__func__, dev->driver->name);
+   pr_debug("driver: '%s': %s: bound to device '%s'\n", dev->driver->name,
+__func__, dev_name(dev));
 
klist_add_tail(&dev->p->knode_driver, &dev->driver->p->klist_devices);
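
The effect of the argument order can be reproduced in user space with
snprintf(). This is a sketch only: toy_format_bound() is a hypothetical
helper that reuses the format string from the hunk above, it is not part
of the driver core.

```c
#include <stdio.h>

/* Hypothetical helper that formats the same message into a buffer. */
static int toy_format_bound(char *buf, size_t len, const char *driver,
			    const char *func, const char *device)
{
	/* arguments follow the order of the '%s' slots: driver, func, device */
	return snprintf(buf, len, "driver: '%s': %s: bound to device '%s'\n",
			driver, func, device);
}
```

Passing the device name first and the driver name last would still compile,
since all three are strings, but the text after "driver:" would then show
the device name, which is the mix-up the patch corrects.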
 


Re: [PATCH] PM / devfreq: Use freq_table for available_frequencies

2014-04-16 Thread Saravana Kannan
On 04/15/2014 11:41 AM, Saravana Kannan wrote:
> Ah, I misunderstood your previous email. I thought you Nack-ed my patch
> and decided to send your own patch to replace mine. Ok, I'll fix up mine
> and send it out.

MyungJoo/Kyungmin,

I sent out an updated patch. Can you please take a look?

Thanks,
Saravana


-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH 2/2] mm/compaction: cleanup isolate_freepages()

2014-04-16 Thread Minchan Kim
Hi Vlastimil,

Below are just nitpicks.

On Tue, Apr 15, 2014 at 11:18:27AM +0200, Vlastimil Babka wrote:
> isolate_freepages() is currently somewhat hard to follow thanks to many
> different pfn variables. Especially misleading is the name 'high_pfn' which
> looks like it is related to the 'low_pfn' variable, but in fact it is not.

Indeed.

> 
> This patch renames the 'high_pfn' variable to a hopefully less confusing name,
> and slightly changes its handling without a functional change. A comment made
> obsolete by recent changes is also updated.

It's a cleanup patch, so if we're fixing things, I'd like to do more.

> 
> Signed-off-by: Vlastimil Babka 
> Cc: Minchan Kim 
> Cc: Mel Gorman 
> Cc: Joonsoo Kim 
> Cc: Bartlomiej Zolnierkiewicz 
> Cc: Michal Nazarewicz 
> Cc: Naoya Horiguchi 
> Cc: Christoph Lameter 
> Cc: Rik van Riel 
> ---
>  mm/compaction.c | 17 -
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 627dc2e..169c7b2 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -671,7 +671,7 @@ static void isolate_freepages(struct zone *zone,
>   struct compact_control *cc)
>  {
>   struct page *page;
> - unsigned long high_pfn, low_pfn, pfn, z_end_pfn;
> + unsigned long pfn, low_pfn, next_free_pfn, z_end_pfn;

Could you add a comment for each variable?

unsigned long pfn; /* scanning cursor */
unsigned long low_pfn; /* lowest pfn free scanner is able to scan */
unsigned long next_free_pfn; /* start pfn for scanning at next turn */
unsigned long z_end_pfn; /* zone's end pfn */


>   int nr_freepages = cc->nr_freepages;
>   struct list_head *freelist = &cc->freepages;
>  
> @@ -688,11 +688,10 @@ static void isolate_freepages(struct zone *zone,
>   low_pfn = ALIGN(cc->migrate_pfn + 1, pageblock_nr_pages);
>  
>   /*
> -  * Take care that if the migration scanner is at the end of the zone
> -  * that the free scanner does not accidentally move to the next zone
> -  * in the next isolation cycle.
> +  * Seed the value for max(next_free_pfn, pfn) updates. If there are
> +  * none, the pfn < low_pfn check will kick in.

   "none" of what? I'd like this to be clearer.

>*/
> - high_pfn = min(low_pfn, pfn);
> + next_free_pfn = 0;
>  
>   z_end_pfn = zone_end_pfn(zone);
>  
> @@ -754,7 +753,7 @@ static void isolate_freepages(struct zone *zone,
>*/
>   if (isolated) {
>   cc->finished_update_free = true;
> - high_pfn = max(high_pfn, pfn);
> + next_free_pfn = max(next_free_pfn, pfn);
>   }
>   }
>  
> @@ -766,9 +765,9 @@ static void isolate_freepages(struct zone *zone,
>* so that compact_finished() may detect this
>*/
>   if (pfn < low_pfn)
> - cc->free_pfn = max(pfn, zone->zone_start_pfn);
> - else
> - cc->free_pfn = high_pfn;
> + next_free_pfn = max(pfn, zone->zone_start_pfn);

Why do we need the max operation?
IOW, what's the problem if we do (next_free_pfn = pfn)?
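
For reference, the tail of the patched function quoted above can be restated
as a pure function. This is a sketch with a hypothetical name
(toy_next_free_pfn); it only shows where the clamp and a plain assignment
would differ, namely when pfn has dropped below zone_start_pfn. Whether
that can happen in practice is the question being asked, and the sketch
does not answer it.

```c
/*
 * Restatement of the final selection: if the scanner ran past low_pfn,
 * restart from pfn clamped to the start of the zone; otherwise keep the
 * highest pfn at which pages were isolated.
 */
static unsigned long toy_next_free_pfn(unsigned long pfn,
				       unsigned long low_pfn,
				       unsigned long next_free_pfn,
				       unsigned long zone_start_pfn)
{
	if (pfn < low_pfn)
		next_free_pfn = pfn > zone_start_pfn ? pfn : zone_start_pfn;
	return next_free_pfn;
}
```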

> +
> + cc->free_pfn = next_free_pfn;
>   cc->nr_freepages = nr_freepages;
>  }
>  
> -- 
> 1.8.4.5
> 

-- 
Kind regards,
Minchan Kim


[PATCH 2/9] ARM: OMAP: dmtimer: Add comments on OMAP1 clock framework

2014-04-16 Thread Joel Fernandes
OMAP1 doesn't support the clock framework, so add a comment where needed
and correct a FIXME.

Signed-off-by: Joel Fernandes 
---
 arch/arm/plat-omap/dmtimer.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/plat-omap/dmtimer.c b/arch/arm/plat-omap/dmtimer.c
index ecd3f97..f5a674c 100644
--- a/arch/arm/plat-omap/dmtimer.c
+++ b/arch/arm/plat-omap/dmtimer.c
@@ -142,8 +142,7 @@ static int omap_dm_timer_prepare(struct omap_dm_timer *timer)
int rc;
 
/*
-* FIXME: OMAP1 devices do not use the clock framework for dmtimers so
-* do not call clk_get() for these devices.
+* Do not call clk_get() for OMAP1 due to no clock framework support.
 */
if (!(timer->capability & OMAP_TIMER_NEEDS_RESET)) {
timer->fclk = clk_get(&timer->pdev->dev, "fck");
@@ -461,6 +460,7 @@ int omap_dm_timer_stop(struct omap_dm_timer *timer)
if (unlikely(!timer))
return -EINVAL;
 
+   /* OMAP1 is not converted to clk framework so avoid clk_get_rate here */
if (!(timer->capability & OMAP_TIMER_NEEDS_RESET))
rate = clk_get_rate(timer->fclk);
 
-- 
1.7.9.5



[PATCH 6/9] ARM: OMAP: dmtimer: Add a write_ctrl function to simplify bit setting

2014-04-16 Thread Joel Fernandes
A common pattern in dmtimer code is to read the control reg, set and reset
certain bits, and write it back. We abstract this pattern and introduce a
new function to do so.
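
The pattern can be tried out in user space with the register modelled as a
plain variable. This is a minimal sketch under that assumption:
toy_write_ctrl(), toy_set_autoreload() and TOY_CTRL_AR are stand-ins
invented for this example, not the driver's functions or bit definitions.

```c
#include <stdint.h>

#define TOY_CTRL_AR 0x2u	/* hypothetical auto-reload bit */

/*
 * Read-modify-write on a register modelled as a plain variable:
 * keep the bits selected by `mask`, then set the bits in `value`.
 * Returns the value written so the caller can cache it.
 */
static uint32_t toy_write_ctrl(uint32_t *reg, uint32_t mask, uint32_t value)
{
	uint32_t l = *reg;

	l &= mask;
	l |= value;
	*reg = l;
	return l;
}

/* Set or clear one bit through the helper. */
static uint32_t toy_set_autoreload(uint32_t *reg, int autoreload)
{
	uint32_t mask = ~0u, val = 0;

	if (autoreload)
		val |= TOY_CTRL_AR;
	else
		mask &= ~TOY_CTRL_AR;
	return toy_write_ctrl(reg, mask, val);
}
```

Note that mask selects the bits to keep, so a caller that only sets bits
passes ~0 and a caller that clears bits removes them from the mask.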

Signed-off-by: Joel Fernandes 
---
 arch/arm/plat-omap/dmtimer.c |   63 --
 1 file changed, 36 insertions(+), 27 deletions(-)

diff --git a/arch/arm/plat-omap/dmtimer.c b/arch/arm/plat-omap/dmtimer.c
index 0e96ad2..782ff10 100644
--- a/arch/arm/plat-omap/dmtimer.c
+++ b/arch/arm/plat-omap/dmtimer.c
@@ -91,6 +91,18 @@ static void omap_dm_timer_write_reg(struct omap_dm_timer *timer, u32 reg,
__omap_dm_timer_write(timer, reg, value, timer->posted);
 }
 
+static u32 omap_dm_timer_write_ctrl(struct omap_dm_timer *timer, u32 mask,
+   u32 value)
+{
+   u32 l;
+
+   l = omap_dm_timer_read_reg(timer, OMAP_TIMER_CTRL_REG);
+   l &= mask;
+   l |= value;
+   omap_dm_timer_write_reg(timer, OMAP_TIMER_CTRL_REG, l);
+   return l;
+}
+
 static void omap_timer_restore_context(struct omap_dm_timer *timer)
 {
omap_dm_timer_write_reg(timer, OMAP_TIMER_WAKEUP_EN_REG,
@@ -521,23 +533,22 @@ EXPORT_SYMBOL_GPL(omap_dm_timer_set_source);
 int omap_dm_timer_set_load(struct omap_dm_timer *timer, int autoreload,
unsigned int load)
 {
-   u32 l;
+   u32 mask = ~0, val = 0;
 
if (unlikely(!timer))
return -EINVAL;
 
omap_dm_timer_enable(timer);
-   l = omap_dm_timer_read_reg(timer, OMAP_TIMER_CTRL_REG);
if (autoreload)
-   l |= OMAP_TIMER_CTRL_AR;
+   val |= OMAP_TIMER_CTRL_AR;
else
-   l &= ~OMAP_TIMER_CTRL_AR;
-   omap_dm_timer_write_reg(timer, OMAP_TIMER_CTRL_REG, l);
+   mask &= ~OMAP_TIMER_CTRL_AR;
+   val = omap_dm_timer_write_ctrl(timer, mask, val);
omap_dm_timer_write_reg(timer, OMAP_TIMER_LOAD_REG, load);
 
omap_dm_timer_write_reg(timer, OMAP_TIMER_TRIGGER_REG, 0);
/* Save the context */
-   timer->context.tclr = l;
+   timer->context.tclr = val;
timer->context.tldr = load;
omap_dm_timer_disable(timer);
return 0;
@@ -577,22 +588,22 @@ EXPORT_SYMBOL_GPL(omap_dm_timer_set_load_start);
 int omap_dm_timer_set_match(struct omap_dm_timer *timer, int enable,
 unsigned int match)
 {
-   u32 l;
+   u32 mask = ~0, val = 0;
 
if (unlikely(!timer))
return -EINVAL;
 
omap_dm_timer_enable(timer);
-   l = omap_dm_timer_read_reg(timer, OMAP_TIMER_CTRL_REG);
if (enable)
-   l |= OMAP_TIMER_CTRL_CE;
+   val |= OMAP_TIMER_CTRL_CE;
else
-   l &= ~OMAP_TIMER_CTRL_CE;
+   mask &= ~OMAP_TIMER_CTRL_CE;
+
omap_dm_timer_write_reg(timer, OMAP_TIMER_MATCH_REG, match);
-   omap_dm_timer_write_reg(timer, OMAP_TIMER_CTRL_REG, l);
+   val = omap_dm_timer_write_ctrl(timer, mask, val);
 
/* Save the context */
-   timer->context.tclr = l;
+   timer->context.tclr = val;
timer->context.tmar = match;
omap_dm_timer_disable(timer);
return 0;
@@ -602,24 +613,23 @@ EXPORT_SYMBOL_GPL(omap_dm_timer_set_match);
 int omap_dm_timer_set_pwm(struct omap_dm_timer *timer, int def_on,
   int toggle, int trigger)
 {
-   u32 l;
+   u32 mask = ~0, val = 0;
 
if (unlikely(!timer))
return -EINVAL;
 
omap_dm_timer_enable(timer);
-   l = omap_dm_timer_read_reg(timer, OMAP_TIMER_CTRL_REG);
-   l &= ~(OMAP_TIMER_CTRL_GPOCFG | OMAP_TIMER_CTRL_SCPWM |
+   mask &= ~(OMAP_TIMER_CTRL_GPOCFG | OMAP_TIMER_CTRL_SCPWM |
   OMAP_TIMER_CTRL_PT | (0x03 << 10));
if (def_on)
-   l |= OMAP_TIMER_CTRL_SCPWM;
+   val |= OMAP_TIMER_CTRL_SCPWM;
if (toggle)
-   l |= OMAP_TIMER_CTRL_PT;
-   l |= trigger << 10;
-   omap_dm_timer_write_reg(timer, OMAP_TIMER_CTRL_REG, l);
+   val |= OMAP_TIMER_CTRL_PT;
+   val |= trigger << 10;
+   val = omap_dm_timer_write_ctrl(timer, mask, val);
 
/* Save the context */
-   timer->context.tclr = l;
+   timer->context.tclr = val;
omap_dm_timer_disable(timer);
return 0;
 }
@@ -627,22 +637,21 @@ EXPORT_SYMBOL_GPL(omap_dm_timer_set_pwm);
 
 int omap_dm_timer_set_prescaler(struct omap_dm_timer *timer, int prescaler)
 {
-   u32 l;
+   u32 mask = ~0, val = 0;
 
if (unlikely(!timer))
return -EINVAL;
 
omap_dm_timer_enable(timer);
-   l = omap_dm_timer_read_reg(timer, OMAP_TIMER_CTRL_REG);
-   l &= ~(OMAP_TIMER_CTRL_PRE | (0x07 << 2));
+   mask &= ~(OMAP_TIMER_CTRL_PRE | (0x07 << 2));
if (prescaler >= 0x00 && prescaler <= 0x07) {
-   l |= OMAP_TIMER_CTRL_PRE;
-   l |= prescaler << 2;
+   val |= OMAP_TIMER_CTRL_PRE;
+   val |= 
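
The mask/val convention used by the callers above can be sketched as a
standalone example. This is only an illustration under stated assumptions, not
the helper from the patch: write_ctrl(), set_autoreload(), fake_ctrl_reg and
the bit positions are hypothetical stand-ins, and the register is simulated
with a plain variable.

```c
#include <stdint.h>

/* Illustrative bit positions; assumptions for this sketch only. */
#define CTRL_AR (1u << 1)
#define CTRL_CE (1u << 6)

/* Hypothetical stand-in for the memory-mapped CTRL register. */
static uint32_t fake_ctrl_reg;

/*
 * Read-modify-write: keep the bits that mask allows, then OR in val.
 * Returns the value written so the caller can save it as context.
 */
static uint32_t write_ctrl(uint32_t mask, uint32_t val)
{
	uint32_t l = fake_ctrl_reg;

	l &= mask;
	l |= val;
	fake_ctrl_reg = l;
	return l;
}

/* Set or clear the auto-reload bit the way the callers above do. */
static uint32_t set_autoreload(int autoreload)
{
	uint32_t mask = ~0u, val = 0;

	if (autoreload)
		val |= CTRL_AR;		/* bit to set */
	else
		mask &= ~CTRL_AR;	/* bit to clear */
	return write_ctrl(mask, val);
}
```

Starting from mask = ~0 and val = 0 means a caller only states the bits it
wants changed; every other bit of the register passes through untouched.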

[PATCH 8/9] ARM: OMAP: dmtimer: Add function to check for timer availability

2014-04-16 Thread Joel Fernandes
Simplify the check for timer availability in at least 4 places by providing a
function to do the same.

Signed-off-by: Joel Fernandes 
---
 arch/arm/plat-omap/dmtimer.c |   24 ++--
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/arch/arm/plat-omap/dmtimer.c b/arch/arm/plat-omap/dmtimer.c
index 8a4a97c..7e806f9 100644
--- a/arch/arm/plat-omap/dmtimer.c
+++ b/arch/arm/plat-omap/dmtimer.c
@@ -704,14 +704,22 @@ int omap_dm_timer_set_int_disable(struct omap_dm_timer *timer, u32 mask)
 }
 EXPORT_SYMBOL_GPL(omap_dm_timer_set_int_disable);
 
+static int is_timer_available(struct omap_dm_timer *timer)
+{
+   if (unlikely(!timer || pm_runtime_suspended(&timer->pdev->dev))) {
+   pr_err("Timer not available or enabled.\n");
+   WARN_ON(1);
+   return 0;
+   }
+   return 1;
+}
+
 unsigned int omap_dm_timer_read_status(struct omap_dm_timer *timer)
 {
unsigned int l;
 
-   if (unlikely(!timer || pm_runtime_suspended(&timer->pdev->dev))) {
-   pr_err("%s: timer not available or enabled.\n", __func__);
+   if (!is_timer_available(timer))
return 0;
-   }
 
l = __raw_readl(timer->irq_stat);
 
@@ -721,7 +729,7 @@ EXPORT_SYMBOL_GPL(omap_dm_timer_read_status);
 
 int omap_dm_timer_write_status(struct omap_dm_timer *timer, unsigned int value)
 {
-   if (unlikely(!timer || pm_runtime_suspended(&timer->pdev->dev)))
+   if (!is_timer_available(timer))
return -EINVAL;
 
__omap_dm_timer_write_status(timer, value);
@@ -732,10 +740,8 @@ EXPORT_SYMBOL_GPL(omap_dm_timer_write_status);
 
 unsigned int omap_dm_timer_read_counter(struct omap_dm_timer *timer)
 {
-   if (unlikely(!timer || pm_runtime_suspended(&timer->pdev->dev))) {
-   pr_err("%s: timer not iavailable or enabled.\n", __func__);
+   if (!is_timer_available(timer))
return 0;
-   }
 
return __omap_dm_timer_read_counter(timer, timer->posted);
 }
@@ -743,10 +749,8 @@ EXPORT_SYMBOL_GPL(omap_dm_timer_read_counter);
 
 int omap_dm_timer_write_counter(struct omap_dm_timer *timer, unsigned int value)
 {
-   if (unlikely(!timer || pm_runtime_suspended(&timer->pdev->dev))) {
-   pr_err("%s: timer not available or enabled.\n", __func__);
+   if (!is_timer_available(timer))
return -EINVAL;
-   }
 
omap_dm_timer_write_reg(timer, OMAP_TIMER_COUNTER_REG, value);
 
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 3/9] ARM: OMAP: dmtimer: Add note to set parent from DT

2014-04-16 Thread Joel Fernandes
Once clock-parents or default-parent support for DT clocks is available,
we should use it to set clock parent and turn clk_set_parent into a NOOP.

Signed-off-by: Joel Fernandes 
---
 arch/arm/plat-omap/dmtimer.c |4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm/plat-omap/dmtimer.c b/arch/arm/plat-omap/dmtimer.c
index f5a674c..4debb3d 100644
--- a/arch/arm/plat-omap/dmtimer.c
+++ b/arch/arm/plat-omap/dmtimer.c
@@ -165,6 +165,10 @@ static int omap_dm_timer_prepare(struct omap_dm_timer *timer)
__omap_dm_timer_enable_posted(timer);
omap_dm_timer_disable(timer);
 
+   /*
+* FIXME: Once DT clock-parents or set-parents support is upstream,
+* this is to become a NOOP.
+*/
return omap_dm_timer_set_source(timer, OMAP_TIMER_SRC_32_KHZ);
 }
 
-- 
1.7.9.5



[PATCH 4/9] ARM: OMAP: dmtimer: Add function to check if timer is running

2014-04-16 Thread Joel Fernandes
In order to move the non-DM-timer-specific code that modifies the "idlect"
mask on OMAP1 out of the dmtimer code and into the OMAP1-specific timer
initialization code, we introduce a new function that can possibly be reused
for other purposes in the future. The function just checks whether a timer is
running based on the timer ID, which should be the same as pdev->id. This
allows us to cleanly separate the timer and non-timer bits and keep the timer
bits in the dmtimer code.

Signed-off-by: Joel Fernandes 
---
 arch/arm/plat-omap/dmtimer.c  |   29 +
 arch/arm/plat-omap/include/plat/dmtimer.h |2 ++
 2 files changed, 31 insertions(+)

diff --git a/arch/arm/plat-omap/dmtimer.c b/arch/arm/plat-omap/dmtimer.c
index 4debb3d..86b2641 100644
--- a/arch/arm/plat-omap/dmtimer.c
+++ b/arch/arm/plat-omap/dmtimer.c
@@ -187,6 +187,35 @@ int omap_dm_timer_reserve_systimer(int id)
return 0;
 }
 
+/*
+ * Check if a timer is running based on timer_id, used for OMAP1 currently.
+ */
+int omap_dm_timer_is_running(int timer_id)
+{
+   int i = 1, ret = 0;
+   struct omap_dm_timer *timer = NULL;
+   unsigned long flags;
+
+   spin_lock_irqsave(_timer_lock, flags);
+   list_for_each_entry(timer, _timer_list, node) {
+   if (i == timer_id) {
+   u32 l;
+   l = omap_dm_timer_read_reg(timer, OMAP_TIMER_CTRL_REG);
+   if (l & OMAP_TIMER_CTRL_ST) {
+   ret = 1;
+   goto done;
+   } else {
+   goto done;
+   }
+   }
+   i++;
+   }
+done:
+   spin_unlock_irqrestore(_timer_lock, flags);
+   return ret;
+}
+EXPORT_SYMBOL_GPL(omap_dm_timer_is_running);
+
 static struct omap_dm_timer *_omap_dm_timer_request(int req_type, void *data)
 {
struct omap_dm_timer *timer = NULL, *t;
diff --git a/arch/arm/plat-omap/include/plat/dmtimer.h b/arch/arm/plat-omap/include/plat/dmtimer.h
index 2861b15..41df0a6 100644
--- a/arch/arm/plat-omap/include/plat/dmtimer.h
+++ b/arch/arm/plat-omap/include/plat/dmtimer.h
@@ -135,6 +135,8 @@ void omap_dm_timer_disable(struct omap_dm_timer *timer);
 
 int omap_dm_timer_get_irq(struct omap_dm_timer *timer);
 
+int omap_dm_timer_is_running(int timer_id);
+
 u32 omap_dm_timer_modify_idlect_mask(u32 inputmask);
 struct clk *omap_dm_timer_get_fclk(struct omap_dm_timer *timer);
 
-- 
1.7.9.5

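
The lookup described in this patch (walk the timer list, stop at the entry
whose 1-based position matches the ID, report its ST bit) can be sketched
outside the kernel roughly as below. The struct, the singly linked list and
the bit position are assumptions for illustration; the patch itself walks a
kernel list under a spinlock.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CTRL_ST (1u << 0)	/* start/stop bit; position is illustrative */

/* Hypothetical, simplified timer record with a cached CTRL value. */
struct fake_timer {
	uint32_t ctrl;
	struct fake_timer *next;
};

/* Return true if the timer_id-th entry (1-based) has its ST bit set. */
static bool timer_is_running(const struct fake_timer *head, int timer_id)
{
	int i = 1;

	for (const struct fake_timer *t = head; t; t = t->next, i++) {
		if (i == timer_id)
			return (t->ctrl & CTRL_ST) != 0;
	}
	return false;	/* no timer with that ID */
}
```

An ID that matches no list entry is reported as "not running" here, which
mirrors the ret = 0 default in the patch.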


[PATCH 9/9] ARM: OMAP: dmtimer: Get rid of check for mem resource error

2014-04-16 Thread Joel Fernandes
The subsequent devm_ioremap_resource will catch it and print an error, so let
it be checked there.

Signed-off-by: Joel Fernandes 
---
 arch/arm/plat-omap/dmtimer.c |4 
 1 file changed, 4 deletions(-)

diff --git a/arch/arm/plat-omap/dmtimer.c b/arch/arm/plat-omap/dmtimer.c
index 7e806f9..1fd30fa 100644
--- a/arch/arm/plat-omap/dmtimer.c
+++ b/arch/arm/plat-omap/dmtimer.c
@@ -810,10 +810,6 @@ static int omap_dm_timer_probe(struct platform_device *pdev)
}
 
mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-   if (unlikely(!mem)) {
-   dev_err(dev, "%s: no memory resource.\n", __func__);
-   return -ENODEV;
-   }
 
timer = devm_kzalloc(dev, sizeof(struct omap_dm_timer), GFP_KERNEL);
if (!timer) {
-- 
1.7.9.5



[PATCH 7/9] ARM: OMAP: dmtimer: Have __omap_dm_timer_load_start set ST bit in CTRL instead of caller

2014-04-16 Thread Joel Fernandes
"load_start" implies start, so it makes sense to set the ST bit in
__omap_dm_timer_load_start instead of callers.

Signed-off-by: Joel Fernandes 
---
 arch/arm/mach-omap2/timer.c   |6 +++---
 arch/arm/plat-omap/dmtimer.c  |1 -
 arch/arm/plat-omap/include/plat/dmtimer.h |1 +
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm/mach-omap2/timer.c b/arch/arm/mach-omap2/timer.c
index 74044aa..dfb19df 100644
--- a/arch/arm/mach-omap2/timer.c
+++ b/arch/arm/mach-omap2/timer.c
@@ -95,7 +95,7 @@ static struct irqaction omap2_gp_timer_irq = {
 static int omap2_gp_timer_set_next_event(unsigned long cycles,
 struct clock_event_device *evt)
 {
-   __omap_dm_timer_load_start(, OMAP_TIMER_CTRL_ST,
+   __omap_dm_timer_load_start(, 0,
   0x - cycles, OMAP_TIMER_POSTED);
 
return 0;
@@ -116,7 +116,7 @@ static void omap2_gp_timer_set_mode(enum clock_event_mode mode,
__omap_dm_timer_write(, OMAP_TIMER_LOAD_REG,
  0x - period, OMAP_TIMER_POSTED);
__omap_dm_timer_load_start(,
-   OMAP_TIMER_CTRL_AR | OMAP_TIMER_CTRL_ST,
+   OMAP_TIMER_CTRL_AR,
0x - period, OMAP_TIMER_POSTED);
break;
case CLOCK_EVT_MODE_ONESHOT:
@@ -469,7 +469,7 @@ static void __init omap2_gptimer_clocksource_init(int gptimer_id,
BUG_ON(res);
 
__omap_dm_timer_load_start(,
-  OMAP_TIMER_CTRL_ST | OMAP_TIMER_CTRL_AR, 0,
+  OMAP_TIMER_CTRL_AR, 0,
   OMAP_TIMER_NONPOSTED);
sched_clock_register(dmtimer_read_sched_clock, 32, clksrc.rate);
 
diff --git a/arch/arm/plat-omap/dmtimer.c b/arch/arm/plat-omap/dmtimer.c
index 782ff10..8a4a97c 100644
--- a/arch/arm/plat-omap/dmtimer.c
+++ b/arch/arm/plat-omap/dmtimer.c
@@ -573,7 +573,6 @@ int omap_dm_timer_set_load_start(struct omap_dm_timer *timer, int autoreload,
} else {
l &= ~OMAP_TIMER_CTRL_AR;
}
-   l |= OMAP_TIMER_CTRL_ST;
 
__omap_dm_timer_load_start(timer, l, load, timer->posted);
 
diff --git a/arch/arm/plat-omap/include/plat/dmtimer.h b/arch/arm/plat-omap/include/plat/dmtimer.h
index 16ea9fd..fe3780a 100644
--- a/arch/arm/plat-omap/include/plat/dmtimer.h
+++ b/arch/arm/plat-omap/include/plat/dmtimer.h
@@ -393,6 +393,7 @@ static inline void __omap_dm_timer_load_start(struct omap_dm_timer *timer,
u32 ctrl, unsigned int load,
int posted)
 {
+   ctrl |= OMAP_TIMER_CTRL_ST;
__omap_dm_timer_write(timer, OMAP_TIMER_COUNTER_REG, load, posted);
__omap_dm_timer_write(timer, OMAP_TIMER_CTRL_REG, ctrl, posted);
 }
-- 
1.7.9.5

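
The effect of this change on callers can be seen in a small standalone
sketch: the helper ORs in the start bit itself, so callers pass only the
remaining control bits. The register struct, the function name and the bit
positions are hypothetical stand-ins, and plain stores replace the posted
register writes of the real helper.

```c
#include <stdint.h>

/* Illustrative bit positions; assumptions for this sketch only. */
#define CTRL_ST (1u << 0)
#define CTRL_AR (1u << 1)

/* Hypothetical stand-in for the timer's COUNTER and CTRL registers. */
struct fake_timer_regs {
	uint32_t counter;
	uint32_t ctrl;
};

/* "load_start" implies start, so the helper sets ST on behalf of callers. */
static void load_start(struct fake_timer_regs *regs, uint32_t ctrl,
		       uint32_t load)
{
	ctrl |= CTRL_ST;
	regs->counter = load;
	regs->ctrl = ctrl;
}
```

A caller that wants auto-reload now passes CTRL_AR alone, and a one-shot
caller passes 0; in both cases the timer ends up started.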


[PATCH 5/9] ARM: OMAP1: dmtimer: Rewrite modify of IDLECT mask to use new is_running function

2014-04-16 Thread Joel Fernandes
While at it, also delete the old definition of the function in dmtimer.c code.
This completes the separation and removal of OMAP1 header dependency in dmtimer
code and removes references to MOD_CONF_CTRL registers in dmtimer.

Signed-off-by: Joel Fernandes 
---
 arch/arm/mach-omap1/include/mach/hardware.h |2 ++
 arch/arm/mach-omap1/timer.c |   26 +++
 arch/arm/plat-omap/dmtimer.c|   48 ---
 arch/arm/plat-omap/include/plat/dmtimer.h   |1 -
 4 files changed, 28 insertions(+), 49 deletions(-)

diff --git a/arch/arm/mach-omap1/include/mach/hardware.h b/arch/arm/mach-omap1/include/mach/hardware.h
index 5875a50..46de040 100644
--- a/arch/arm/mach-omap1/include/mach/hardware.h
+++ b/arch/arm/mach-omap1/include/mach/hardware.h
@@ -70,6 +70,8 @@ static inline u32 omap_cs3_phys(void)
? 0 : OMAP_CS3_PHYS;
 }
 
+__u32 omap_dm_timer_modify_idlect_mask(__u32 inputmask);
+
 #endif /* ifndef __ASSEMBLER__ */
 
 #define OMAP1_IO_OFFSET0x0100  /* Virtual IO = 
0xfefb */
diff --git a/arch/arm/mach-omap1/timer.c b/arch/arm/mach-omap1/timer.c
index 4b9c604..0a039f1 100644
--- a/arch/arm/mach-omap1/timer.c
+++ b/arch/arm/mach-omap1/timer.c
@@ -42,6 +42,32 @@
 
 #define OMAP1_DM_TIMER_COUNT   8
 
+/**
+ * omap_dm_timer_modify_idlect_mask - Check if any running timers use ARMXOR
+ * @inputmask: current value of idlect mask
+ */
+__u32 omap_dm_timer_modify_idlect_mask(__u32 inputmask)
+{
+   int i;
+
+   /* If ARMXOR cannot be idled this function call is unnecessary */
+   if (!(inputmask & (1 << 1)))
+   return inputmask;
+
+   for (i = 1; i <= OMAP1_DM_TIMER_COUNT; i++) {
+   if (omap_dm_timer_is_running(i)) {
+   if (((omap_readl(MOD_CONF_CTRL_1) >> ((i-1) * 2))
+ & 0x03) == 0)
+   inputmask &= ~(1 << 1);
+   else
+   inputmask &= ~(1 << 2);
+   }
+   i++;
+   }
+
+   return inputmask;
+}
+
 static int omap1_dm_timer_set_src(struct platform_device *pdev,
int source)
 {
diff --git a/arch/arm/plat-omap/dmtimer.c b/arch/arm/plat-omap/dmtimer.c
index 86b2641..0e96ad2 100644
--- a/arch/arm/plat-omap/dmtimer.c
+++ b/arch/arm/plat-omap/dmtimer.c
@@ -397,44 +397,6 @@ int omap_dm_timer_get_irq(struct omap_dm_timer *timer)
 }
 EXPORT_SYMBOL_GPL(omap_dm_timer_get_irq);
 
-#if defined(CONFIG_ARCH_OMAP1)
-#include 
-/**
- * omap_dm_timer_modify_idlect_mask - Check if any running timers use ARMXOR
- * @inputmask: current value of idlect mask
- */
-__u32 omap_dm_timer_modify_idlect_mask(__u32 inputmask)
-{
-   int i = 0;
-   struct omap_dm_timer *timer = NULL;
-   unsigned long flags;
-
-   /* If ARMXOR cannot be idled this function call is unnecessary */
-   if (!(inputmask & (1 << 1)))
-   return inputmask;
-
-   /* If any active timer is using ARMXOR return modified mask */
-   spin_lock_irqsave(_timer_lock, flags);
-   list_for_each_entry(timer, _timer_list, node) {
-   u32 l;
-
-   l = omap_dm_timer_read_reg(timer, OMAP_TIMER_CTRL_REG);
-   if (l & OMAP_TIMER_CTRL_ST) {
-   if (((omap_readl(MOD_CONF_CTRL_1) >> (i * 2)) & 0x03) == 0)
-   inputmask &= ~(1 << 1);
-   else
-   inputmask &= ~(1 << 2);
-   }
-   i++;
-   }
-   spin_unlock_irqrestore(_timer_lock, flags);
-
-   return inputmask;
-}
-EXPORT_SYMBOL_GPL(omap_dm_timer_modify_idlect_mask);
-
-#else
-
 struct clk *omap_dm_timer_get_fclk(struct omap_dm_timer *timer)
 {
if (timer && !IS_ERR(timer->fclk))
@@ -443,16 +405,6 @@ struct clk *omap_dm_timer_get_fclk(struct omap_dm_timer *timer)
 }
 EXPORT_SYMBOL_GPL(omap_dm_timer_get_fclk);
 
-__u32 omap_dm_timer_modify_idlect_mask(__u32 inputmask)
-{
-   BUG();
-
-   return 0;
-}
-EXPORT_SYMBOL_GPL(omap_dm_timer_modify_idlect_mask);
-
-#endif
-
 int omap_dm_timer_trigger(struct omap_dm_timer *timer)
 {
if (unlikely(!timer || pm_runtime_suspended(&timer->pdev->dev))) {
diff --git a/arch/arm/plat-omap/include/plat/dmtimer.h b/arch/arm/plat-omap/include/plat/dmtimer.h
index 41df0a6..16ea9fd 100644
--- a/arch/arm/plat-omap/include/plat/dmtimer.h
+++ b/arch/arm/plat-omap/include/plat/dmtimer.h
@@ -137,7 +137,6 @@ int omap_dm_timer_get_irq(struct omap_dm_timer *timer);
 
 int omap_dm_timer_is_running(int timer_id);
 
-u32 omap_dm_timer_modify_idlect_mask(u32 inputmask);
 struct clk *omap_dm_timer_get_fclk(struct omap_dm_timer *timer);
 
 int omap_dm_timer_trigger(struct omap_dm_timer *timer);
-- 
1.7.9.5

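
The IDLECT mask adjustment can be sketched in isolation as below. This is a
hedged illustration, not the code from the patch: the function takes the
MOD_CONF_CTRL_1 value and a per-timer "running" array as plain parameters
(both hypothetical), and it advances the index only in the for statement so
that all eight timers are visited, as in the list walk being replaced.

```c
#include <stdbool.h>
#include <stdint.h>

#define TIMER_COUNT 8

/*
 * inputmask: current idlect mask
 * mod_conf:  value of the clock-source register, two bits per timer
 * running:   running[i - 1] says whether timer i is started (assumed input)
 */
static uint32_t modify_idlect_mask(uint32_t inputmask, uint32_t mod_conf,
				   const bool running[TIMER_COUNT])
{
	/* If bit 1 is already clear there is nothing to adjust. */
	if (!(inputmask & (1u << 1)))
		return inputmask;

	for (int i = 1; i <= TIMER_COUNT; i++) {
		if (!running[i - 1])
			continue;
		/* Two-bit source field of timer i; 0 clears bit 1, else bit 2. */
		if (((mod_conf >> ((i - 1) * 2)) & 0x03) == 0)
			inputmask &= ~(1u << 1);
		else
			inputmask &= ~(1u << 2);
	}
	return inputmask;
}
```

Passing the register value and the running state in as arguments keeps the
sketch testable without hardware; the kernel code reads both directly.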

[PATCH 1/9] ARM: OMAP: dmtimer: Remove setting of clk parent indirectly through platform hook

2014-04-16 Thread Joel Fernandes
There is a platform specific hook just for OMAP1 to set its clk parent.  Remove
this hook and have OMAP1 set its parent in omap1_dm_timer_init.  If OMAP1 is
ever migrated to clock framework, the correct way to do this would be through
clk_set_parent like other platforms.

Signed-off-by: Joel Fernandes 
---
 arch/arm/mach-omap1/timer.c|8 +++-
 arch/arm/plat-omap/dmtimer.c   |8 +++-
 include/linux/platform_data/dmtimer-omap.h |2 --
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/arm/mach-omap1/timer.c b/arch/arm/mach-omap1/timer.c
index bde7a35..4b9c604 100644
--- a/arch/arm/mach-omap1/timer.c
+++ b/arch/arm/mach-omap1/timer.c
@@ -140,7 +140,13 @@ static int __init omap1_dm_timer_init(void)
goto err_free_pdata;
}
 
-   pdata->set_timer_src = omap1_dm_timer_set_src;
+   /*
+* Since OMAP1 doesn't support clock framework, set timer clock
+* source to 32KHz here instead of expecting it to be set by
+* dmtimer code.
+*/
+   omap1_dm_timer_set_src(pdev, 0x01);
+
pdata->timer_capability = OMAP_TIMER_ALWON |
OMAP_TIMER_NEEDS_RESET | OMAP_TIMER_HAS_DSP_IRQ;
 
diff --git a/arch/arm/plat-omap/dmtimer.c b/arch/arm/plat-omap/dmtimer.c
index 869254c..ecd3f97 100644
--- a/arch/arm/plat-omap/dmtimer.c
+++ b/arch/arm/plat-omap/dmtimer.c
@@ -494,12 +494,10 @@ int omap_dm_timer_set_source(struct omap_dm_timer *timer, int source)
return -EINVAL;
 
/*
-* FIXME: Used for OMAP1 devices only because they do not currently
-* use the clock framework to set the parent clock. To be removed
-* once OMAP1 migrated to using clock framework for dmtimers
+* For OMAP1, timer source is already set during omap1_dm_timer_init.
 */
-   if (pdata && pdata->set_timer_src)
-   return pdata->set_timer_src(timer->pdev, source);
+   if (timer->capability & OMAP_TIMER_NEEDS_RESET)
+   return 0;
 
if (IS_ERR(timer->fclk))
return -EINVAL;
diff --git a/include/linux/platform_data/dmtimer-omap.h b/include/linux/platform_data/dmtimer-omap.h
index a19b78d..9f42b06 100644
--- a/include/linux/platform_data/dmtimer-omap.h
+++ b/include/linux/platform_data/dmtimer-omap.h
@@ -21,8 +21,6 @@
 #define __PLATFORM_DATA_DMTIMER_OMAP_H__
 
 struct dmtimer_platform_data {
-   /* set_timer_src - Only used for OMAP1 devices */
-   int (*set_timer_src)(struct platform_device *pdev, int source);
u32 timer_capability;
u32 timer_errata;
int (*get_context_loss_count)(struct device *);
-- 
1.7.9.5



RE: [PATCH 2/2 v3] staging: comedi: addi_apci_1564: fixup and absorb apci1564_reset()

2014-04-16 Thread Hartley Sweeten
On Wednesday, April 16, 2014 4:53 PM, Chase Southwood wrote:
>>On Wednesday, April 16, 2014 6:36 PM, Hartley Sweeten wrote:
>>>On Wednesday, April 16, 2014 4:34 PM, Chase Southwood wrote:
>>>Move apci1564_reset() from hwdrv_apci1564.c to addi_apci_1564.c.  The
>>>function was very messy and failed to reset a couple registers, these
>>>issues were fixed on the move.
>>
>>The commit message needs updated. You are no longer moving the function.
>>
>
> Erm...in the PATCH v3 I just sent, it has been moved as described here, so
> at least for what is happening now, this changelog is accurate.  Would you
> like me to no longer move the function for the time being?

Ah, missed that you moved it before the struct addi_board definition.

Hmmm... I think it would be cleaner if you moved the function after separating
this driver from the addi_common.c file. Then the moved functions can be put
into the driver in "cleaner" locations and avoid any forward declaration junk.

The brute force way to do this is just copy the contents of addi_common.c to
the driver and remove the #include. Then you can move the functions from
the hwdrv_apci1564.c file to the driver, and remove them from the boardinfo,
as needed.

Once you get the addi_common.c stuff localized you should find that much
of it is just NOP code for this driver. Those pieces then just need to be ripped
out.

Have fun...

Hartley



Re: [PATCH] fbdev: fix possible NULL pointer derefernce

2014-04-16 Thread DaeSeok Youn
Hello,

2014-04-16 21:38 GMT+09:00 Jean-Christophe PLAGNIOL-VILLARD:
>
> On Apr 16, 2014, at 5:40 PM, Daeseok Youn  wrote:
>
>>
>> The spec->modedb can be NULL by fb_create_modedb().
>>
>> And also smatch says:
>> drivers/video/fbdev/core/fbmon.c:975 fb_edid_to_monspecs() error:
>> potential null dereference 'specs->modedb'.
>> (fb_create_modedb returns null)
>>
>> Signed-off-by: Daeseok Youn 
>> ---
>> drivers/video/fbdev/core/fbmon.c |3 +++
>> 1 files changed, 3 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/video/fbdev/core/fbmon.c b/drivers/video/fbdev/core/fbmon.c
>> index c204ebe..db274ca 100644
>> --- a/drivers/video/fbdev/core/fbmon.c
>> +++ b/drivers/video/fbdev/core/fbmon.c
>> @@ -966,6 +966,9 @@ void fb_edid_to_monspecs(unsigned char *edid, struct fb_monspecs *specs)
>>
>>   specs->modedb = fb_create_modedb(edid, &specs->modedb_len);
>>
>> + if (!specs->modedb)
>> + return;
>> +
>
> we need to return an error and trace it
Yes, you're right. I will change the return type from void to int and add
error handling where this function (fb_edid_to_monspecs) is called.

I will send an updated patch per your comment.

Thanks for the review.

Daeseok Youn.
>
> Best Regards,
> J.
>>   /*
>>* Workaround for buggy EDIDs that sets that the first
>>* detailed timing is preferred but has not detailed
>> --
>> 1.7.4.4
>>
>
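
The direction agreed on above (return an error instead of silently returning
from a void function) might look roughly like this in a standalone sketch.
All names here are hypothetical stand-ins for the fbmon functions, and
-ENOMEM is only an assumed error code; the thread does not say which one the
updated patch will use.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, reduced version of the monitor specs structure. */
struct fake_monspecs {
	int *modedb;
	int modedb_len;
};

/* Stand-in for the mode database builder; returns NULL on failure. */
static int *create_modedb(int want, int *len)
{
	if (want <= 0) {
		*len = 0;
		return NULL;
	}
	*len = want;
	return calloc((size_t)want, sizeof(int));
}

/* Returns 0 on success or a negative errno so callers can trace failures. */
static int edid_to_monspecs(struct fake_monspecs *specs, int want)
{
	specs->modedb = create_modedb(want, &specs->modedb_len);
	if (!specs->modedb)
		return -ENOMEM;

	/* ... the rest of the parsing may now dereference specs->modedb ... */
	return 0;
}
```

Callers then check the return value and can log or propagate the failure,
which a bare return from a void function does not allow.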


[PATCH 0/9] dmtimer code clean-up first pass

2014-04-16 Thread Joel Fernandes
Here are some minor cleanups for dmtimer code in preparation for moving it out
to drivers.

OMAP1-specific code that previously lived in the dmtimer code is now handled
directly in mach-omap1. Other than this, a few functions have been refactored
to reduce redundancy, along with some minor cleanups.

OMAP1 hasn't been tested so I welcome anyone with HW to test this.

Joel Fernandes (9):
  ARM: OMAP: dmtimer: Remove setting of clk parent indirectly through
platform hook
  ARM: OMAP: dmtimer: Add comments on OMAP1 clock framework
  ARM: OMAP: dmtimer: Add note to set parent from DT
  ARM: OMAP: dmtimer: Add function to check if timer is running
  ARM: OMAP1: dmtimer: Rewrite modify of IDLECT mask to use new
is_running function
  ARM: OMAP: dmtimer: Add a write_ctrl function to simplify bit setting
  ARM: OMAP: dmtimer: Have __omap_dm_timer_load_start set ST bit in
CTRL instead of caller
  ARM: OMAP: dmtimer: Add function to check for timer availability
  ARM: OMAP: dmtimer: Get rid of check for mem resource error

 arch/arm/mach-omap1/include/mach/hardware.h |2 +
 arch/arm/mach-omap1/timer.c |   34 -
 arch/arm/mach-omap2/timer.c |6 +-
 arch/arm/plat-omap/dmtimer.c|  185 +--
 arch/arm/plat-omap/include/plat/dmtimer.h   |4 +-
 include/linux/platform_data/dmtimer-omap.h  |2 -
 6 files changed, 129 insertions(+), 104 deletions(-)

-- 
1.7.9.5



Re: [PATCH] staging: android: binder.c: avoid sparse checker warning: cast removes address space of expression

2014-04-16 Thread Dan Carpenter
On Wed, Apr 16, 2014 at 10:42:17PM +0200, Yves Deweerdt wrote:
> 
> __user should be kept when casting to struct binder_version *
> 
> 

Mathieu sent this one earlier.
http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2014-April/048614.html

His is perfect, except that he missed the space in the ": {", but it's
still ok.

regards,
dan carpenter

> Signed-off-by: Yves Deweerdt 
> ---
>  drivers/staging/android/binder.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/android/binder.c b/drivers/staging/android/binder.c
> index cfe4bc8..0f74e43 100644
> --- a/drivers/staging/android/binder.c
> +++ b/drivers/staging/android/binder.c
> @@ -2683,16 +2683,21 @@ static long binder_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
>   binder_free_thread(proc, thread);
>   thread = NULL;
>   break;
> - case BINDER_VERSION:
> + case BINDER_VERSION: {
> + struct binder_version __user *bv =
> + (struct binder_version __user *)ubuf;
> +
>   if (size != sizeof(struct binder_version)) {
>   ret = -EINVAL;
>   goto err;
>   }
> - if (put_user(BINDER_CURRENT_PROTOCOL_VERSION, &((struct binder_version *)ubuf)->protocol_version)) {
> + if (put_user(BINDER_CURRENT_PROTOCOL_VERSION,
> + &(bv->protocol_version))) {
>   ret = -EINVAL;
>   goto err;
>   }
>   break;
> + }
>   default:
>   ret = -EINVAL;
>   goto err;
> -- 
> 1.8.3.2
> 
> ___
> devel mailing list
> de...@linuxdriverproject.org
> http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: linux-next: build failure after merge of the akpm-current tree

2014-04-16 Thread Stephen Rothwell
Hi Mel, Andrew,

On Wed, 16 Apr 2014 20:21:35 +0100 Mel Gorman  wrote:
>
> On Wed, Apr 16, 2014 at 03:19:56PM +1000, Stephen Rothwell wrote:
> > 
> > After merging the akpm-current tree, today's linux-next build (powerpc
> > ppc64_defconfig) failed like this:
> > 
> > In file included from mm/vmscan.c:50:0:
> > include/linux/swapops.h: In function 'is_swap_pte':
> > include/linux/swapops.h:57:2: error: implicit declaration of function 'pte_present_nonuma' [-Werror=implicit-function-declaration]
> >   return !pte_none(pte) && !pte_present_nonuma(pte) && !pte_file(pte);
> >   ^
> > 
> > Caused by commit 851fe3337768 ("x86: define _PAGE_NUMA by reusing
> > software bits on the PMD and PTE levels").  This build does not have
> > CONFIG_NUMA_BALANCING set.
> > 
> > I have reverted that commit for today.
> 
> Thanks Stephen. A patch that should address the problem is on its way to
> Andrew.

I grabbed it out of mmots and added it to linux-next for today (in case a
new mmotm does not appear in time).
-- 
Cheers,
Stephen Rothwell    s...@canb.auug.org.au




Re: [PATCH] workqueue: add __WQ_FREEZING and remove POOL_FREEZING

2014-04-16 Thread Lai Jiangshan
On 04/17/2014 03:51 AM, Tejun Heo wrote:
> Hello,
> 
> On Tue, Mar 25, 2014 at 05:56:04PM +0800, Lai Jiangshan wrote:
>> freezing is nothing related to pools, but POOL_FREEZING adds a connection,
>> and causes freeze_workqueues_begin() and thaw_workqueues() complicated.
>>
>> Since freezing is workqueue instance attribute, so we introduce __WQ_FREEZING
>> to wq->flags instead and remove POOL_FREEZING.
>>
>> we set __WQ_FREEZING only when freezable(to simplify
>> pwq_adjust_max_active()), make freeze_workqueues_begin() and
>> thaw_workqueues() fast skip non-freezable wq.
> 
> Please wrap the description to 80 columns.
> 
>> @@ -3730,18 +3726,13 @@ static void pwq_unbound_release_workfn(struct work_struct *work)
>>  static void pwq_adjust_max_active(struct pool_workqueue *pwq)
>>  {
>>  struct workqueue_struct *wq = pwq->wq;
>> -bool freezable = wq->flags & WQ_FREEZABLE;
>>  
>> -/* for @wq->saved_max_active */
>> +/* for @wq->saved_max_active and @wq->flags */
>>  lockdep_assert_held(&wq->mutex);
>>  
>> -/* fast exit for non-freezable wqs */
>> -if (!freezable && pwq->max_active == wq->saved_max_active)
>> -return;
>> -
> 
> Why are we removing the above?  Can't we still test __WQ_FREEZING as
> we're holding wq->mutex?  I don't really mind removing the
> optimization but the patch description at least has to explain what's
> going on.

This part came from an older patch: https://lkml.org/lkml/2013/4/3/756
I admit the changelogs of the old patches are bad.
But I still think it would be better to split this into two patches:
(https://lkml.org/lkml/2013/4/3/748 & https://lkml.org/lkml/2013/4/3/756)

The two patches have different aims.

Any thoughts? And sorry that I didn't keep pushing the patches at that time.
Thanks
Lai
Lai

> 
> ...
>>  list_for_each_entry(wq, , list) {
>> +if (!(wq->flags & WQ_FREEZABLE))
>> +continue;
> 
> Ah, okay, you're not calling the function at all if WQ_FREEZABLE is
> not set.  I couldn't really understand what you were trying to say in
> the patch description.  Can you please try to refine the description
> more?  It's better to be verbose and clear than short and difficult to
> understand.
> 
> Thanks.
> 



Re: [PATCH 1/2] mm/compaction: make isolate_freepages start at pageblock boundary

2014-04-16 Thread Minchan Kim
On Tue, Apr 15, 2014 at 11:18:26AM +0200, Vlastimil Babka wrote:
> The compaction freepage scanner implementation in isolate_freepages() starts
> by taking the current cc->free_pfn value as the first pfn. In a for loop, it
> scans from this first pfn to the end of the pageblock, and then subtracts
> pageblock_nr_pages from the first pfn to obtain the first pfn for the next
> for loop iteration.
> 
> This means that when cc->free_pfn starts at offset X rather than being aligned
> on pageblock boundary, the scanner will start at offset X in all scanned
> pageblock, ignoring potentially many free pages. Currently this can happen when
> a) zone's end pfn is not pageblock aligned, or
> b) through zone->compact_cached_free_pfn with CONFIG_HOLES_IN_ZONE enabled and
>a hole spanning the beginning of a pageblock
> 
> This patch fixes the problem by aligning the initial pfn in isolate_freepages()
> to pageblock boundary. This also allows to replace the end-of-pageblock
> alignment within the for loop with a simple pageblock_nr_pages increment.
> 
> Signed-off-by: Vlastimil Babka 
> Reported-by: Heesub Shin 

Acked-by: Minchan Kim 

-stable stuff?

-- 
Kind regards,
Minchan Kim
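
The alignment described in the quoted changelog is ordinary round-down
arithmetic. As a minimal sketch (the function name is hypothetical, and the
block size is assumed to be a power of two), the start of the pageblock
containing a pfn can be computed like this:

```c
#include <assert.h>

/*
 * Round pfn down to the first pfn of its pageblock.
 * pageblock_nr_pages must be a non-zero power of two.
 */
static unsigned long pageblock_start(unsigned long pfn,
				     unsigned long pageblock_nr_pages)
{
	assert(pageblock_nr_pages &&
	       (pageblock_nr_pages & (pageblock_nr_pages - 1)) == 0);
	return pfn & ~(pageblock_nr_pages - 1);
}
```

With 512-page blocks, a scanner that begins at pfn 1000 is moved back to pfn
512, so each later block can be reached by stepping a whole block at a time
and is scanned from its first page rather than from offset 488.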


Re: [PATCH v3] mtd: m25p80: Calculate flash block protect bits based on number of sectors

2014-04-16 Thread Marek Vasut
On Wednesday, April 16, 2014 at 02:37:13 PM, Austin Boyle wrote:
> This patch generalises the calculation of block protect bits based on the
> number of sectors and implements the _is_locked function.
> 
> Existing calculation of block protect bits only works for devices with 64
> sectors or more. This new logic is applicable to the STmicro devices:
> m25p10, p20, p40, p80, p16, pe16, p32, p64, p128.
> Note devices with >64 sectors only allow the protected region to be
> specified to a resolution of 1/64th of the total size (such as m25p64).
> 
> New return codes for ioctl(MEMISLOCKED) have been added to
> uapi/mtd/mtd-abi.h because the _is_locked function can query a region
> which is partially unlocked.
> 
> Added flag to m25p_ids table to indicate if flash protection is supported.
> 
> Added n_sectors and sector_size to m25p flash structure so it can be used
> in block protect bit calculation.
> 
> From: Austin Boyle 
> Signed-off-by: Austin Boyle 
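
For readers unfamiliar with block protect (BP) bits, the following is a rough model of the relationship the description above hints at. It is not taken from the patch. It assumes three BP bits, that protection grows from the top of the flash, that each BP step doubles the protected region, and that the smallest step is one sector or 1/64th of the device, whichever is larger. The function names are hypothetical.

```c
#include <stdint.h>

/*
 * Toy model: number of sectors (counted from the top of the flash) that a
 * given BP value protects.  Assumes each BP step doubles the region and
 * that the finest granularity is max(1 sector, n_sectors / 64).
 */
static uint32_t toy_bp_to_sectors(uint32_t n_sectors, unsigned int bp)
{
	uint32_t unit = n_sectors > 64 ? n_sectors / 64 : 1;
	uint32_t prot;

	if (bp == 0)
		return 0;		/* nothing protected */
	prot = unit << (bp - 1);
	return prot > n_sectors ? n_sectors : prot;
}

/*
 * Smallest BP value (0..7) whose protected region covers at least `want`
 * sectors from the top.  Returns 7 (everything) if nothing smaller fits.
 */
static unsigned int toy_sectors_to_bp(uint32_t n_sectors, uint32_t want)
{
	unsigned int bp;

	for (bp = 0; bp < 7; bp++)
		if (toy_bp_to_sectors(n_sectors, bp) >= want)
			break;
	return bp;
}
```

Under these assumptions a 16-sector part reaches full protection at BP = 5, while a 128-sector part can only protect in steps of two sectors, matching the 1/64th resolution mentioned in the description.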

Acked-by: Marek Vasut 

Thanks!

Best regards,
Marek Vasut


Re: [PATCH] Route keyboard LEDs through the generic LEDs layer.

2014-04-16 Thread Samuel Thibault
Hello,

Dmitry Torokhov, on Sat 12 Apr 2014 18:25:57 -0700, wrote:
> On Fri, Apr 11, 2014 at 08:12:02AM +0200, Samuel Thibault wrote:
> > I'm sorry this went out with a few mistakes.
> > 
> > Samuel Thibault, on Wed 09 Apr 2014 01:33:06 +0200, wrote:
> > > Dmitry Torokhov, on Tue 08 Apr 2014 01:39:40 -0700, wrote:
> > > > It is not about the VT, I am talking about pure input core. If I
> > > > repurpose CapsLock LED for my WiFi indicator I do not want to go into
> > > > other programs and teach them that they should stay away from trying to
> > > > control this LED.
> > > 
> > > Err, but even without talking about repurposing Capslock LED for WiFi...
> > > How is the conflict between the normal capslock behavior and
> > > other programs' action on the underlying input device managed?  I don't think
> > > this patch does not introduce the problem.
> > 
> > I of course meant "I don't think this patch introduces the problem".
> 
> The difference in my eyes is that with old interface both players knew
> that they would be affecting (and potentially interfering) with the
> state of CapsLock LED. With your proposed changes users of old
> interfaces have no idea if they are actually toggling CapsLock LED or if
> they are affecting something that is no longer a CapsLock LED.

Mmm, I thought you were talking about evdev programs (which would always
get to manipulate the hardware capslock leds anyway)?

Are you here talking about KDSETLEDS?  This is a very odd ioctl
actually.  Its name suggests that its purpose is to just light LEDs,
but AFAIK it is essentially used by users to change the status of the
keyboard state, and the LED change is just a side effect for them.  So
if e.g. the numlock LED is repurposed to wifi, should KDSETLEDS really
just switch on the LED (thus defeating the wifi purpose), while the user
probably used setleds +num only to get numlock enabled by default
(and doesn't care about seeing that on the keyboard; on the contrary, he
would prefer to see the wifi state)?

> > > How to decide which one to prioritize?
> > > 
> > > Is it just because console-setup happens to repurpose the capslock LED
> > > key that applications should suddenly not have capslock LED control
> > > at all?  That's contradictory with the use case you have given.
> > 
> > Oops, not the use case you have given, but a typical use-case of wanting
> > to use a program which does nice things with the capslock LED, and then
> > having to teach console-setup not to repurpose the capslock LED.
> 
> I believe that applications should be able to control the state of CapsLock
> and other LEDs and that the affected LED should not be the physical LED
> but rather LED that is currently tied to CapsLock trigger (if any).

Which kind of applications are you talking about?  What I have heard
from users was that they either wanted to have an effect on the physical
LED, or on the keyboard state, but not really "on the LED which shows
the keyboard state", which would just be a side consequence of what they
really want to achieve.

> This
> way everything works as is by default and if I decide to have my
> physical CapsLock LED to be repurposed for Wifi or HDD activity or
> whatever I do not need to teach unrelated applications to stop touching
> it.

Again, which applications are you thinking about precisely?  If a
user has an application which wants to really show something on the
capslock LED, then the user's intent was really the physical LED, and he
shot himself in the foot.  Otherwise, what was the intention of the
application?  I don't understand why an application would want to light
"the LED which shows the capslock state", and not simply the keyboard
capslock status (which thus may or may not be reflected on some physical
LED, depending on whether some LED is plugged into the keyboard capslock
status).

> > > That
> > > leads me into believing that we should not try to push a hard rule, and
> > > just let the user manage what accesses it.  We could indeed make the VT
> > > always take priority, but then that would probably break some existing
> > > applications.
> > 
> > Such as the example above, with the capslock LED.
> 
> I am not saying that VT should have priority, I am saying we need to
> come up with an implementation that does not result in inconsistent
> behavior.

Ok, but for now I don't see in which exact case we'd have an
inconsistency.

Dmitry Torokhov, on Sat 12 Apr 2014 18:30:49 -0700, wrote:
> > I'd say that applications using direct EV_LED interface should just
> > stop doing it. Yes, you can probably use led API and still toggle the
> > led using gpio api behind leds back (not tested, perhaps there are
> > interlocks that prevent that)... and it is same situation with
> > EV_LED. We should just teach applications not to do that.
> > 
> > Would solution where EV_LED would be ignored when there's non-default
> > trigger selected work for you?
> 
> Not ignored but rather routed to the LED that is currently selected 

RE: [PATCH 2/2 v3] staging: comedi: addi_apci_1564: fixup and absorb apci1564_reset()

2014-04-16 Thread Hartley Sweeten
On Wednesday, April 16, 2014 4:34 PM, Chase Southwood wrote:
>
> Move apci1564_reset() from hwdrv_apci1564.c to addi_apci_1564.c.  The
> function was very messy and failed to reset a couple registers, these
> issues were fixed on the move.

The commit message needs to be updated. You are no longer moving the function.

Regards,
Hartley



Re: f2fs: BUG_ON() is triggered when mount valid f2fs filesystem

2014-04-16 Thread Jaegeuk Kim
Hi,

2014-04-16 (Wed), 13:11 +0400, Andrey Tsyvarev:
> Hi,
> 
> With this patch, mounting of the image continues to fail (with a similar
> BUG_ON).
> But when the image is formatted again (and the steps mentioned in the previous
> message are performed),
> mounting it now succeeds.
> 
> Is this the true purpose of the patch?

Indeed. The patch addresses the root cause.
But if you want to use the failed image again, you can simply skip
the erroneous part with:

# mount ... -o disable_roll_forward ...

Once a checkpoint has been written after that (via sync, umount, or
otherwise), the image can be mounted without "disable_roll_forward".

Thanks,

> 
> 15.04.2014 15:04, Jaegeuk Kim wrote:
> > Hi,
> >
> > Thank you for the report.
> > I retrieved the faulty image and found that leftover garbage data
> > causes this wrong behavior.
> > So I wrote the following patch, which fills one block with zeros during the
> > checkpoint procedure.
> > If the underlying device supports discard, I expect that it will not
> > incur any significant performance regression.
> >
> > Could you test this patch?
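
As a reading aid, the failure mode and the fix can be pictured with a toy model. This is only a sketch under loud assumptions: the block layout, the "stamp" field, and all function names are invented here and do not reflect the real f2fs on-disk format or recovery code.

```c
#include <string.h>

#define TOY_NBLOCKS 8

/* Invented layout: a node block carries a validity stamp and a chain link. */
struct toy_node_block {
	unsigned int stamp;	/* hypothetical marker recovery trusts */
	int next;		/* index of the next block, -1 if none */
};

/*
 * Model of roll-forward: start at the block where the next node would be
 * written and follow the chain for as long as blocks carry the expected
 * stamp.  Returns the number of blocks that would be replayed.
 */
static int toy_roll_forward(const struct toy_node_block *blk, int start,
			    unsigned int stamp)
{
	int n = 0, i = start;

	while (i >= 0 && i < TOY_NBLOCKS && n < TOY_NBLOCKS &&
	       blk[i].stamp == stamp) {
		n++;
		i = blk[i].next;
	}
	return n;
}

/* Model of the checkpoint-time fix: wipe the block recovery visits first. */
static void toy_zero_next(struct toy_node_block *blk, int start)
{
	memset(&blk[start], 0, sizeof(blk[start]));
}
```

In this model, stale blocks that still look valid get replayed by toy_roll_forward(). After toy_zero_next() the walk stops immediately, which is the effect the patch aims for by clearing the next node block at checkpoint time.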
> >
> > From 60588ceb7277aae2a79e7f67f5217d1256720d78 Mon Sep 17 00:00:00 2001
> > From: Jaegeuk Kim 
> > Date: Tue, 15 Apr 2014 13:57:55 +0900
> > Subject: [PATCH] f2fs: avoid to conduct roll-forward due to the remained
> >   garbage blocks
> >
> > f2fs always scans the next chain of direct node blocks.
> > But some garbage blocks can remain there, either because discard is not
> > supported or because SSR was triggered.
> > This occasionally leads to recovering wrong inodes that were already in use,
> > or to BUG_ONs caused by reallocating node ids, as follows.
> >
> > When mounting this f2fs image:
> > http://linuxtesting.org/downloads/f2fs_fault_image.zip
> > a BUG_ON is triggered in the f2fs driver (the messages below were generated on
> > kernel 3.13.2; for other kernels the output is similar):
> >
> > kernel BUG at fs/f2fs/node.c:215!
> >   Call Trace:
> >   [] recover_inode_page+0x1fd/0x3e0 [f2fs]
> >   [] ? __lock_page+0x67/0x70
> >   [] ? autoremove_wake_function+0x50/0x50
> >   [] recover_fsync_data+0x1398/0x15d0 [f2fs]
> >   [] ? selinux_d_instantiate+0x1c/0x20
> >   [] ? d_instantiate+0x5b/0x80
> >   [] f2fs_fill_super+0xb04/0xbf0 [f2fs]
> >   [] ? mount_bdev+0x7e/0x210
> >   [] mount_bdev+0x1c9/0x210
> >   [] ? validate_superblock+0x210/0x210 [f2fs]
> >   [] f2fs_mount+0x1d/0x30 [f2fs]
> >   [] mount_fs+0x47/0x1c0
> >   [] ? __alloc_percpu+0x10/0x20
> >   [] vfs_kern_mount+0x72/0x110
> >   [] do_mount+0x493/0x910
> >   [] ? strndup_user+0x5b/0x80
> >   [] SyS_mount+0x90/0xe0
> >   [] system_call_fastpath+0x16/0x1b
> >
> > Found by Linux File System Verification project (linuxtesting.org).
> >
> > Reported-by: Andrey Tsyvarev 
> > Signed-off-by: Jaegeuk Kim 
> > ---
> >   fs/f2fs/checkpoint.c |  6 ++
> >   fs/f2fs/f2fs.h   |  1 +
> >   fs/f2fs/segment.c| 17 +++--
> >   3 files changed, 22 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> > index 4aa521a..890e23d 100644
> > --- a/fs/f2fs/checkpoint.c
> > +++ b/fs/f2fs/checkpoint.c
> > @@ -762,6 +762,12 @@ static void do_checkpoint(struct f2fs_sb_info *sbi,
> > bool is_umount)
> > void *kaddr;
> > int i;
> >   
> > +   /*
> > +* This avoids to conduct wrong roll-forward operations and uses
> > +* metapages, so should be called prior to sync_meta_pages below.
> > +*/
> > +   discard_next_dnode(sbi);
> > +
> > /* Flush all the NAT/SIT pages */
> > while (get_pages(sbi, F2FS_DIRTY_META))
> > sync_meta_pages(sbi, META, LONG_MAX);
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > index 2ecac83..2c5a5da 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -1179,6 +1179,7 @@ int f2fs_issue_flush(struct f2fs_sb_info *);
> >   void invalidate_blocks(struct f2fs_sb_info *, block_t);
> >   void refresh_sit_entry(struct f2fs_sb_info *, block_t, block_t);
> >   void clear_prefree_segments(struct f2fs_sb_info *);
> > +void discard_next_dnode(struct f2fs_sb_info *);
> >   int npages_for_summary_flush(struct f2fs_sb_info *);
> >   void allocate_new_segments(struct f2fs_sb_info *);
> >   struct page *get_sum_page(struct f2fs_sb_info *, unsigned int);
> > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > index 1e264e7..9993f94 100644
> > --- a/fs/f2fs/segment.c
> > +++ b/fs/f2fs/segment.c
> > @@ -335,13 +335,26 @@ static void locate_dirty_segment(struct
> > f2fs_sb_info *sbi, unsigned int segno)
> > mutex_unlock(&dirty_i->seglist_lock);
> >   }
> >   
> > -static void f2fs_issue_discard(struct f2fs_sb_info *sbi,
> > +static int f2fs_issue_discard(struct f2fs_sb_info *sbi,
> > block_t blkstart, block_t blklen)
> >   {
> > sector_t start = SECTOR_FROM_BLOCK(sbi, blkstart);
> > sector_t len = SECTOR_FROM_BLOCK(sbi, blklen);
> > -   blkdev_issue_discard(sbi->sb->s_bdev, start, len, GFP_NOFS, 0);
> > trace_f2fs_issue_discard(sbi->sb, blkstart, blklen);
> > +   return 

[PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit

2014-04-16 Thread Lai Jiangshan
Before the rescuer is picked to run, the works of the @pwq
may be processed by some other workers, and destroy_workqueue()
may be called at the same time. This may result in a nasty situation
where the rescuer exits with a non-empty mayday list.

This is harmless currently: destroy_workqueue() can safely free
them all (the workqueue) together, since the rescuer is stopped.
Neither the rescuer nor the mayday timer can access the mayday list.

But it is nasty and error-prone for future development. Fix it.

Signed-off-by: Lai Jiangshan 
---
 kernel/workqueue.c |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0ee63af..832125f 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2409,12 +2409,6 @@ static int rescuer_thread(void *__rescuer)
 repeat:
set_current_state(TASK_INTERRUPTIBLE);
 
-   if (kthread_should_stop()) {
-   __set_current_state(TASK_RUNNING);
-   rescuer->task->flags &= ~PF_WQ_WORKER;
-   return 0;
-   }
-
/* see whether any pwq is asking for help */
    spin_lock_irq(&wq_mayday_lock);
 
@@ -2459,6 +2453,12 @@ repeat:
 
    spin_unlock_irq(&wq_mayday_lock);
 
+   if (kthread_should_stop()) {
+   __set_current_state(TASK_RUNNING);
+   rescuer->task->flags &= ~PF_WQ_WORKER;
+   return 0;
+   }
+
/* rescuers should never participate in concurrency management */
WARN_ON_ONCE(!(rescuer->flags & WORKER_NOT_RUNNING));
schedule();
-- 
1.7.4.4
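
The reordering in the patch above can be illustrated with a small user-space model. This is a sketch only: struct toy_rescuer and the helper names are invented for illustration, the mayday list is reduced to a counter, and should_stop stands in for kthread_should_stop().

```c
#include <stdbool.h>

/* Invented model of the rescuer's state. */
struct toy_rescuer {
	int pending;		/* entries waiting on the mayday list */
	int processed;		/* entries the rescuer has handled */
	bool should_stop;	/* stands in for kthread_should_stop() */
};

/* Handle everything currently queued. */
static void toy_drain(struct toy_rescuer *r)
{
	while (r->pending > 0) {
		r->pending--;
		r->processed++;
	}
}

/*
 * One loop iteration with the stop check before the drain (old order).
 * Returns true if the thread exits; entries may be left behind.
 */
static bool toy_iter_check_first(struct toy_rescuer *r)
{
	if (r->should_stop)
		return true;
	toy_drain(r);
	return false;
}

/*
 * One loop iteration with the stop check after the drain (new order).
 * Whenever this returns true, the list has already been emptied.
 */
static bool toy_iter_drain_first(struct toy_rescuer *r)
{
	toy_drain(r);
	return r->should_stop;
}
```

If a stop request arrives while three entries are queued, the check-first variant exits with all three still pending, whereas the drain-first variant exits only after processing them.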



[PATCH 2/2] workqueue: fix possible race condition when rescuer VS pwq-release

2014-04-16 Thread Lai Jiangshan
There is a race condition between rescuer_thread() and
pwq_unbound_release_workfn().

The works of the @pwq may be processed by some other workers,
and @pwq may be scheduled for release (because its wq's attrs were changed)
before the rescuer starts to process it. In this case
pwq_unbound_release_workfn() will corrupt the wq->maydays list,
and rescuer_thread() will access corrupted data.

Use get_pwq() to pin it until the rescuer is done with it.

Signed-off-by: Lai Jiangshan 
---
 kernel/workqueue.c |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 832125f..77c29b7 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1916,6 +1916,16 @@ static void send_mayday(struct work_struct *work)
 
/* mayday mayday mayday */
    if (list_empty(&pwq->mayday_node)) {
+   /*
+* pwqs might go away at any time, pin it until
+* rescuer is done with it.
+*
+* Especially a pwq of an unbound wq may be released
+* before wq's destruction when the wq's attr is changed.
+* In this case, pwq_unbound_release_workfn() may execute
+* earlier before rescuer_thread() and corrupt wq->maydays.
+*/
+   get_pwq(pwq);
list_add_tail(&pwq->mayday_node, &wq->maydays);
wake_up_process(wq->rescuer->task);
}
@@ -2438,6 +2448,9 @@ repeat:
 
process_scheduled_works(rescuer);
 
+   /* put the reference grabbed by send_mayday(). */
+   put_pwq(pwq);
+
/*
 * Leave this pool.  If keep_working() is %true, notify a
 * regular worker; otherwise, we end up with 0 concurrency
-- 
1.7.4.4
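
The pinning idea in this patch is the usual "take a reference while the object sits on a list" pattern. A minimal user-space sketch follows. Every name is hypothetical, the reference count is a plain int with no locking, and the release is reduced to setting a flag.

```c
#include <stdbool.h>

/* Invented stand-in for a pool_workqueue. */
struct toy_pwq {
	int refcnt;
	bool on_mayday_list;
	bool released;		/* set when the last reference is dropped */
};

static void toy_get(struct toy_pwq *p)
{
	p->refcnt++;
}

static void toy_put(struct toy_pwq *p)
{
	if (--p->refcnt == 0)
		p->released = true;	/* models the release work running */
}

/* Queue the pwq for rescue and pin it while it is on the list. */
static void toy_send_mayday(struct toy_pwq *p)
{
	if (!p->on_mayday_list) {
		toy_get(p);		/* held until the rescuer is done */
		p->on_mayday_list = true;
	}
}

/* Rescuer side: take the pwq off the list, work on it, then unpin it. */
static void toy_rescue(struct toy_pwq *p)
{
	p->on_mayday_list = false;
	/* ... process the scheduled works here ... */
	toy_put(p);		/* drops the reference from toy_send_mayday() */
}
```

In this model, if the owner drops its own reference after toy_send_mayday() has run (for example because the attributes changed), the object is not released until toy_rescue() has finished with it.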


