date:20140121

Re: [GIT PULL] GPIO bulk changes for v3.14

2014-01-21 Thread Linus Walleij

On Tue, Jan 21, 2014 at 7:11 PM, Linus Torvalds wrote:

> The fact that it doesn't even compile makes me doubt your statement
> that it has been in linux-next. It doesn't even pass a basic
> allmodconfig build.

Hm I rely on the zeroday build, and didn't get any angry compile errors.
I'll double-check with Fengguang to see what's going on here and
that branches get proper buildtesting.

> I see that you tried to fix it in commit 01d7004181c8 ("gpio:
> mcp23s08: depend on OF_GPIO") but screwed up the order of operations.
>
> I fixed it up properly in the merge, but please try to figure out how
> the hell this passed through the cracks.

Argh, thanks for fixing. I see the problem now.

Yours,
Linus Walleij
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] Revert "sched: Fix sleep time double accounting in enqueue entity"

2014-01-21 Thread Vincent Guittot

Paul,

Will you send a patch that adds a comment and moves the "if (wakeup)" logic?

Regards
Vincent

On 22 January 2014 08:45, Vincent Guittot  wrote:
> This reverts commit 282cf499f03ec1754b6c8c945c9674b02631fb0f.
>
> With the current implementation, the load average statistics of a sched entity
> change according to other activity on the CPU, even if this activity happens
> between the running windows of the sched entity and has no influence on the
> running duration of the task.
>
> When a task wakes up on the same CPU, we currently update last_runnable_update
> with the return value of __synchronize_entity_decay without updating
> runnable_avg_sum and runnable_avg_period accordingly. In fact, we have to sync
> the load_contrib of the se with the rq's blocked_load_contrib before removing
> it from the latter (with __synchronize_entity_decay), but we must keep
> last_runnable_update unchanged so that runnable_avg_sum/period are updated
> during the next update_entity_load_avg.
>
> Signed-off-by: Vincent Guittot 
>
> ---
>  kernel/sched/fair.c |8 +---
>  1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index e64b079..6d61f20 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2365,13 +2365,7 @@ static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
> }
> wakeup = 0;
> } else {
> -   /*
> -* Task re-woke on same cpu (or else migrate_task_rq_fair()
> -* would have made count negative); we must be careful to avoid
> -* double-accounting blocked time after synchronizing decays.
> -*/
> -   se->avg.last_runnable_update += __synchronize_entity_decay(se)
> -   << 20;
> +   __synchronize_entity_decay(se);
> }
>
> /* migrated tasks did not contribute to our blocked load */
> --
> 1.7.9.5
>

[PATCH] Revert "sched: Fix sleep time double accounting in enqueue entity"

2014-01-21 Thread Vincent Guittot

This reverts commit 282cf499f03ec1754b6c8c945c9674b02631fb0f.

With the current implementation, the load average statistics of a sched entity
change according to other activity on the CPU, even if this activity happens
between the running windows of the sched entity and has no influence on the
running duration of the task.

When a task wakes up on the same CPU, we currently update last_runnable_update
with the return value of __synchronize_entity_decay without updating
runnable_avg_sum and runnable_avg_period accordingly. In fact, we have to sync
the load_contrib of the se with the rq's blocked_load_contrib before removing
it from the latter (with __synchronize_entity_decay), but we must keep
last_runnable_update unchanged so that runnable_avg_sum/period are updated
during the next update_entity_load_avg.

Signed-off-by: Vincent Guittot 

---
 kernel/sched/fair.c |8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e64b079..6d61f20 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2365,13 +2365,7 @@ static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
}
wakeup = 0;
} else {
-   /*
-* Task re-woke on same cpu (or else migrate_task_rq_fair()
-* would have made count negative); we must be careful to avoid
-* double-accounting blocked time after synchronizing decays.
-*/
-   se->avg.last_runnable_update += __synchronize_entity_decay(se)
-   << 20;
+   __synchronize_entity_decay(se);
}
 
/* migrated tasks did not contribute to our blocked load */
-- 
1.7.9.5
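
The timestamp issue described in the changelog above can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: one decay period is taken to be 2^20 ns (which would explain the `<< 20` in the removed code), and the struct and every `toy_*` name are hypothetical stand-ins, not the scheduler's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one load-tracking period lasts 2^20 ns (about 1 ms). */
#define TOY_PERIOD_SHIFT 20

/* Hypothetical, heavily simplified stand-in for the entity's average state. */
struct toy_avg {
	uint64_t last_runnable_update;  /* ns timestamp of the last update */
	uint64_t pending_decays;        /* periods spent blocked, not yet synced */
};

/* Stand-in for __synchronize_entity_decay(): returns the periods decayed. */
static uint64_t toy_synchronize_decay(struct toy_avg *avg)
{
	uint64_t decays = avg->pending_decays;

	avg->pending_decays = 0;
	return decays;
}

/* Code removed by the revert: the timestamp jumps over the blocked periods. */
static void toy_enqueue_before_revert(struct toy_avg *avg)
{
	avg->last_runnable_update += toy_synchronize_decay(avg) << TOY_PERIOD_SHIFT;
}

/* Code after the revert: decay is synchronized, the timestamp is untouched. */
static void toy_enqueue_after_revert(struct toy_avg *avg)
{
	toy_synchronize_decay(avg);
}

/* Time span the next update would still have to account for. */
static uint64_t toy_next_delta(const struct toy_avg *avg, uint64_t now)
{
	assert(now >= avg->last_runnable_update);
	return now - avg->last_runnable_update;
}
```

In this model, an entity with three pending periods that is next updated five periods after its last timestamp is left with only two periods to account for under the old code, and with all five after the revert.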


Re: top-down balance purpose discussion -- resend

2014-01-21 Thread Alex Shi

On 01/21/2014 10:57 PM, Peter Zijlstra wrote:
> On Tue, Jan 21, 2014 at 10:04:26PM +0800, Alex Shi wrote:
>>
>> Current scheduler load balance is bottom-up mode, each CPU need
>> initiate the balance by self.
>>
>> 1, Like in a integrate computer system, it has smt/core/cpu/numa, 4
>> level scheduler domains. If there is just 2 tasks in whole system that
>> both running on cpu0. Current load balance need to pull task to another
>> smt in smt domain, then pull task to another core, then pull task to
>> another cpu, finally pull task to another numa. Totally it is need 4
>> times task moving to get system balance.
> 
> Except the idle load balancer, and esp. the newidle can totally by-pass
> this.
> 
> If you do the packing right in the newidle pass, you'd get there in 1
> step.

It puts a lot of pressure on me to argue with a great expert like you. I
would very much appreciate any comments and corrections. :)

Yes, newidle balancing relieves this, but it cannot eliminate it. If
a newidle happens in another numa group, it needs just 1 step. But if it
happens in another smt group, it still needs 4 steps. So in general we
still need one or more extra steps before the system is well balanced.

In this example, if a newidle is in the same smallest group, maybe we
should wake up the remotest cpu in the system/llc to avoid extra task
moving in the near future, for best performance.
And for power saving, maybe we'd better kick the task to the smallest
group, then let the remote cpu group idle.
But with the current newidle it is impossible to do this, because newidle
is also bottom-up.
> 
>> Generally, the task moving complexity is
>>  O(nm log n),  n := nr_cpus, m := nr_tasks
>>
>> There is a excellent summary and explanation for this in
>> kernel/sched/fair.c:4605
> 
> Which is a perfectly fine scheme for a busy system.
> 
>> Another weakness of current LB is that every cpu need to get the other
>> cpus' load info repeatedly and try to figure out busiest sched
>> group/queue on every sched domain level. But it just waste time, since
>> it may not conduct a task moving. One of reasons is that cpu can only
>> pull task, not pushing.
> 
> This doesn't make sense.. and in fact, we do a limited amount of 3rd
> party movements.

Yes, but the 3rd party movements are too limited, just for pinned tasks.
> 
> Whatever you do, you have to repeat the information gathering anyhow,
> because it constantly changes.
> 

Yes, it is good to collect the load info once per balance. But if the
balancing cpu is the busiest cpu, the current balance still keeps
collecting every group's load info from the bottom up, and then does
nothing on this imbalanced system. This is bad.

> Trying to serialize that doesn't make any kind of sense. The only thing
> you want is that the system converges.

Sorry, could you give a bit more detail on why 'serialize' makes no sense?
> 
> Skipped the rest because it seems build on a fundament I don't agree
> with. That 4 move thing is just silly for an idle system, and we
> shouldn't do that.
> 
> I also very much do not want a single CPU balancing the entire system,
> that's the anti-thesis of scalable.

Sorry. IMHO, a single cpu can handle balancing for 1000 cpus. And it
is far more scalable than having every cpu in the system do balancing,
since only one cpu needs to pick up the other cpus' load info.

BTW, there is currently no coordination among the cpus' balancing. That's
a bit of a mess. For example, if 2 cpus in a small cpu group both balance
the whole system at the same time, both of them think their own group is
light and want more load, so they have a chance to over-pull load into
their own group. That is bad. Single-cpu balancing has no such problem.

-- 
Thanks
Alex
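
The over-pull concern raised at the end of the message above can be sketched as a toy user-space model. This is an illustration of the scenario as described in the mail, not of how the kernel's load balancer actually behaves: load is reduced to a task count per group, and all `toy_*` names are hypothetical.

```c
#include <assert.h>

/* Hypothetical toy model: a group's load is just its task count. */
struct toy_group {
	int nr_tasks;
};

/*
 * Tasks one balancer decides to pull, based on a snapshot of both loads
 * taken before any task has moved.
 */
static int toy_pull_count(int local_snapshot, int busiest_snapshot)
{
	int imbalance = busiest_snapshot - local_snapshot;

	return imbalance > 0 ? imbalance / 2 : 0;
}

/* Move up to n tasks from src to dst, never more than src holds. */
static void toy_move(struct toy_group *src, struct toy_group *dst, int n)
{
	assert(n >= 0);
	if (n > src->nr_tasks)
		n = src->nr_tasks;
	src->nr_tasks -= n;
	dst->nr_tasks += n;
}

/*
 * Two CPUs of the same light group balance "at the same time": both decide
 * from the same stale snapshot, then both pulls are applied.
 */
static void toy_concurrent_balance(struct toy_group *light,
				   struct toy_group *busy)
{
	int pull_a = toy_pull_count(light->nr_tasks, busy->nr_tasks);
	int pull_b = toy_pull_count(light->nr_tasks, busy->nr_tasks);

	toy_move(busy, light, pull_a);
	toy_move(busy, light, pull_b);
}

/* A single balancer decides once and applies once. */
static void toy_single_balance(struct toy_group *light,
			       struct toy_group *busy)
{
	toy_move(busy, light, toy_pull_count(light->nr_tasks, busy->nr_tasks));
}
```

Starting from 0 and 8 tasks, the concurrent variant ends with the imbalance inverted (8 and 0), while the single balancer ends at 4 and 4.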

Re: [PATCH 1/2] net: dm9000: Read GPR, modify and write

2014-01-21 Thread Chris Ruehl


On Wednesday, January 22, 2014 03:15 PM, David Miller wrote:

> Please do not mix coding style and functional changes.
>
> Please resubmit this entire series once you have addressed
> all feedback.
>
> Thank you.

Thanks for the advice. I will do so.

Chris

Re: Freeing of dev->p

2014-01-21 Thread Jean Delvare

Hi Greg,

On Fri, 10 Jan 2014 07:24:02 -0800, Greg Kroah-Hartman wrote:
> On Fri, Jan 10, 2014 at 03:39:07PM +0100, Jean Delvare wrote:
> > (...)
> > Then I suppose we could inline both functions
> > again, for performance. Well, put in short, really reverting
> > b4028437876866aba4747a655ede00f892089e14 would be the way to go IMHO.
> >
> > Really, while I understand your desire to protect driver core internals
> > from unwanted access, the cost here was simply too high IMHO, both in
> > terms of getting things right and performance. Some drivers are calling
> > dev_get_drvdata() directly or indirectly repeatedly at run-time. They
> > had no reason not to as this used to be so fast, and now it is no
> > longer an inline function, it has conditionals and a double pointer
> > indirection...
> > 
> > Plus, I can't think of anything really bad that could result from
> > accessing driver_data directly, contrary to the other members of struct
> > device_private.
> 
> (...)
> 
> Thanks for the detailed response, I think I'll just revert most of that
> patch and see if it's still workable.

Any news on this?

-- 
Jean Delvare
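
The performance argument quoted above (an out-of-line accessor with conditionals and a double pointer indirection, versus a direct inline access) can be sketched as follows. This is a minimal illustration with hypothetical, simplified structures; the `toy_*` names are assumptions for this example and do not reproduce the real `struct device` or the driver core accessors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical private part, allocated separately from the device. */
struct toy_device_private {
	void *driver_data;
};

/* Layout where driver data lives behind a private pointer. */
struct toy_device_indirect {
	struct toy_device_private *p;   /* may be NULL until allocated */
};

/* Layout where driver data lives in the device itself. */
struct toy_device_direct {
	void *driver_data;
};

/* Out-of-line style accessor: NULL checks plus two dependent loads. */
void *toy_get_drvdata_indirect(const struct toy_device_indirect *dev)
{
	if (dev && dev->p)
		return dev->p->driver_data;
	return NULL;
}

/* Inline style accessor: a single load from the device. */
static inline void *toy_get_drvdata_direct(const struct toy_device_direct *dev)
{
	return dev->driver_data;
}

/* Direct setter; no allocation can fail because there is nothing to allocate. */
static inline void toy_set_drvdata_direct(struct toy_device_direct *dev,
					  void *data)
{
	assert(dev != NULL);
	dev->driver_data = data;
}
```

The direct layout trades encapsulation for a cheaper access path, which is the trade-off the thread is weighing.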

Re: [BISECTED] Linux 3.12.7 introduces page map handling regression

2014-01-21 Thread Steven Noonan

On Wed, Jan 22, 2014 at 12:02:15AM -0500, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 21, 2014 at 07:20:45PM -0800, Steven Noonan wrote:
> > On Tue, Jan 21, 2014 at 06:47:07PM -0800, Linus Torvalds wrote:
> > > On Tue, Jan 21, 2014 at 5:49 PM, Greg Kroah-Hartman wrote:
> 
> Adding extra folks to the party.
> > > >
> > > > Odds are this also shows up in 3.13, right?
> > 
> > Reproduced using 3.13 on the PV guest:
> > 
> > [  368.756763] BUG: Bad page map in process mp  pte:8004a67c6165 pmd:e9b706067
> > [  368.756777] page:ea001299f180 count:0 mapcount:-1 mapping:      (null) index:0x0
> > [  368.756781] page flags: 0x2f8014(referenced|dirty)
> > [  368.756786] addr:7fd1388b7000 vm_flags:00100071 anon_vma:880e9ba15f80 mapping:  (null) index:7fd1388b7
> > [  368.756792] CPU: 29 PID: 618 Comm: mp Not tainted 3.13.0-ec2 #1
> > [  368.756795]  880e9b718958 880e9eaf3cc0 814d8748 7fd1388b7000
> > [  368.756803]  880e9eaf3d08 8116d289
> > [  368.756809]  880e9b7065b8 ea001299f180 7fd1388b8000 880e9eaf3e30
> > [  368.756815] Call Trace:
> > [  368.756825]  [] dump_stack+0x45/0x56
> > [  368.756833]  [] print_bad_pte+0x229/0x250
> > [  368.756837]  [] unmap_single_vma+0x583/0x890
> > [  368.756842]  [] unmap_vmas+0x65/0x90
> > [  368.756847]  [] unmap_region+0xac/0x120
> > [  368.756852]  [] ? vma_rb_erase+0x1c9/0x210
> > [  368.756856]  [] do_munmap+0x280/0x370
> > [  368.756860]  [] vm_munmap+0x41/0x60
> > [  368.756864]  [] SyS_munmap+0x22/0x30
> > [  368.756869]  [] system_call_fastpath+0x1a/0x1f
> > [  368.756872] Disabling lock debugging due to kernel taint
> > [  368.760084] BUG: Bad rss-counter state mm:880e9d079680 idx:0 val:-1
> > [  368.760091] BUG: Bad rss-counter state mm:880e9d079680 idx:1 val:1
> > 
> > > 
> > > Probably. I don't have a Xen PV setup to test with (and very little
> > > interest in setting one up).. And I have a suspicion that it might not
> > > be so much about Xen PV, as perhaps about the kind of hardware.
> > > 
> > > I suspect the issue has something to do with the magic _PAGE_NUMA
> > > tie-in with _PAGE_PRESENT. And then mprotect(PROT_NONE) ends up
> > > removing the _PAGE_PRESENT bit, and now the crazy numa code is
> > > confused.
> > > 
> > > The whole _PAGE_NUMA thing is a f*cking horrible hack, and shares the
> > > bit with _PAGE_PROTNONE, which is why it then has that tie-in to
> > > _PAGE_PRESENT.
> > > 
> > > Adding Andrea to the Cc, because he's the author of that horridness.
> > > Putting Steven's test-case here as an attachement for Andrea, maybe
> > > that makes him go "Ahh, yes, silly case".
> > > 
> > > Also added Kirill, because he was involved the last _PAGE_NUMA debacle.
> > > 
> > > Andrea, you can find the thread on lkml, but it boils down to commit
> > > 1667918b6483 (backported to 3.12.7 as 3d792d616ba4) breaking the
> > > attached test-case (but apparently only under Xen PV). There it
> > > apparently causes a "BUG: Bad page map .." error.
> 
> I *think* it is due to the fact that pmd_numa and pte_numa is getting the
> _raw_ value of PMDs and PTEs. That is - it does not use the pvops interface
> and instead reads the values directly from the page-table. Since the
> page-table is also manipulated by the hypervisor - there are certain
> flags it also sets to do its business. It might be that it uses
> _PAGE_GLOBAL as well - and Linux picks up on that. If it was using
> pte_flags that would invoke the pvops interface.
> 
> Elena, Dariof and George, you guys had been looking at this a bit deeper
> than I have. Does the Xen hypervisor use the _PAGE_GLOBAL for PV guests?
> 
> This not-compiled-totally-bad-patch might shed some light on what I was
> thinking _could_ fix this issue - and IS NOT A FIX - JUST A HACK.
> It does not fix it for PMDs naturally (as there are no PMD paravirt ops
> for that).

Unfortunately the Totally Bad Patch seems to make no difference. I am
still able to repro the issue:

[  346.374929] BUG: Bad page map in process mp  pte:8004ae928065 pmd:e993f9067
[  346.374942] page:ea0012ba4a00 count:0 mapcount:-1 mapping:      (null) index:0x0
[  346.374946] page flags: 0x2f8014(referenced|dirty)
[  346.374951] addr:7f06a9bbb000 vm_flags:00100071 anon_vma:880e9939fe00 mapping:  (null) index:7f06a9bbb
[  346.374956] CPU: 29 PID: 609 Comm: mp Not tainted 3.13.0-ec2+ #1
[  346.374960]  880e9cc38da8 880e991a3cc0 814d8768 7f06a9bbb000
[  346.374967]  880e991a3d08 8116d289
[  346.374972]  880e993f9dd8 ea0012ba4a00 7f06a9bbc000 880e991a3e30
[  346.374979] Call Trace:
[  346.374988]  [] dump_stack+0x

Re: linux rdma 3.14 merge plans

2014-01-21 Thread Nicholas A. Bellinger

Roland & Co,

On Tue, 2014-01-21 at 16:43 -0800, Roland Dreier wrote:
> On Tue, Jan 21, 2014 at 2:00 PM, Or Gerlitz  wrote:
> > Roland, ping! the signature patches were posted > three months ago. We
> > deserve a response from the maintainer that goes beyond "I need to
> > think on that".
> >
> > Responsiveness was stated by Linus to be the #1 requirement from
> > kernel maintainers.
> 
> Or, I'm not sure what response you're after from me.  Linus has also
> said that maintainers should say "no" a lot more
> (http://lwn.net/Articles/571995/) so maybe you want me to say, "No, I
> won't merge this patch set, since it adds a bunch of complexity to
> support a feature no one really cares about."  Is that it? 

The patch set proposed by Sagi + Or is modest in terms of LOC to core IB
code, and includes mostly mlx5 specific driver changes that enables HW
offloads.

> (And yes I
> am skeptical about this stuff — I work at an enterprise storage
> company and even here it's hard to find anyone who cares about
> DIF/DIX, especially offload features that stop it from being
> end-to-end)
> 

My understanding is most HBAs capable of T10 PI offload in DIX PASS +
VERIFY mode are already implementing DIX INSERT + STRIP modes in various
capacities to support legacy environments.

Beyond the DIX INSERT + STRIP case for enterprise storage, the number of
FC + SAS HBAs that already support T10 PI metadata is substantial.

> I'm sure you're not expecting me to say, "Sure, I'll merge it without
> understanding the problem it's solving or how it's doing that,"
> especially given the your recent history of pushing me to merge stuff
> like the IP-RoCE patches back when they broke the userspace ABI.

With the merge window now upon us, there is an understandable reluctance
to merge new features.  Given the amount of time the series has spent on
the list, it is however a good candidate to consider for an exception.

Short of that, are you planning to accept the series for the next round
once the current merge window closes..?

We'd really like to start enabling fabrics with these types of offloads
for v3.15. 

> I'd really rather spend my time on something actually useful like
> cleaning up softroce.
> 

+1 for softroce + T10 PI support!

--nab


linux-next: Tree for Jan 22

2014-01-21 Thread Stephen Rothwell

Hi all,

This tree fails (more than usual) the powerpc allyesconfig build.

Changes since 20140117:

New tree: init (Paul Gortmaker's init.h inclusion cleanup)

Dropped tree: sh (complex merge conflicts against very old commits)
              imx-mxs (complex merge conflicts against the arm tree)

The powerpc tree still had its build failure.

The drm-intel tree gained conflicts against the drm tree.

The drivers-x86 tree gained a conflict against the pm tree.

Non-merge commits (relative to Linus' tree): 6883
 7331 files changed, 332480 insertions(+), 154974 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a
multi_v7_defconfig for arm. After the final fixups (if any), it is also
built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and
allyesconfig (minus CONFIG_PROFILE_ALL_BRANCHES - this fails its final
link) and i386, sparc, sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

I am currently merging 210 trees (counting Linus' and 29 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell 

$ git checkout master
$ git reset --hard stable
Merging origin/master (03d11a0e458d Merge tag 'for-v3.14' of git://git.infradead.org/battery-2.6)
Merging fixes/master (b0031f227e47 Merge tag 's2mps11-build' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator)
Merging kbuild-current/rc-fixes (19514fc665ff arm, kbuild: make "make install" not depend on vmlinux)
Merging arc-current/for-curr (7e22e91102c6 Linux 3.13-rc8)
Merging arm-current/fixes (b25f3e1c3584 ARM: 7938/1: OMAP4/highbank: Flush L2 cache before disabling)
Merging m68k-current/for-linus (56931d73697c m68k/mac: Make SCC reset work more reliably)
Merging metag-fixes/fixes (3b2f64d00c46 Linux 3.11-rc2)
Merging powerpc-merge/merge (b3084f4db3ae powerpc/thp: Fix crash on mremap)
Merging sparc/master (ef350bb7c5e0 Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4)
Merging net/master (7d0d46da750a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
Merging ipsec/master (965cdea82569 dccp: catch failed request_module call in dccp_probe init)
Merging sound-current/for-linus (7552f34a7900 Merge tag 'asoc-v3.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus)
Merging pci-current/for-linus (f0b75693cbb2 MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers)
Merging wireless/master (2eff7c791a18 Merge tag 'nfc-fixes-3.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-fixes)
Merging driver-core.current/driver-core-linus (413541dd66d5 Linux 3.13-rc5)
Merging tty.current/tty-linus (413541dd66d5 Linux 3.13-rc5)
Merging usb.current/usb-linus (413541dd66d5 Linux 3.13-rc5)
Merging staging.current/staging-linus (413541dd66d5 Linux 3.13-rc5)
Merging char-misc.current/char-misc-linus (802eee95bde7 Linux 3.13-rc6)
Merging input-current/for-linus (8e2f2325b73f Input: xpad - add new USB IDs for Logitech F310 and F710)
Merging md-current/for-linus (d47648fcf061 raid5: avoid finding "discard" stripe)
Merging crypto-current/master (efb753b8e013 crypto: ixp4xx - Fix kernel compile error)
Merging ide/master (c2f7d1e103ef ide: pmac: remove unnecessary pci_set_drvdata())
Merging dwmw2/master (5950f0803ca9 pcmcia: remove RPX board stuff)
Merging sh-current/sh-fixes-for-linus (44033109e99c SH: Convert out[bwl] macros to inline functions)
Merging devicetree-current/devicetree/merge (6f041e99fc7b of: Fix NULL dereference in unflatten_and_copy())
Merging rr-fixes/fixes (7122c3e9154b scripts/link-vmlinux.sh: only filter kernel symbols f

Re: [PATCH 1/2] net: dm9000: Read GPR, modify and write

2014-01-21 Thread David Miller


Please do not mix coding style and functional changes.

Please resubmit this entire series once you have addressed
all feedback.

Thank you.

Re: [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-21 Thread Hannes Reinecke

On 01/22/2014 06:20 AM, Joel Becker wrote:
> On Tue, Jan 21, 2014 at 10:04:29PM -0500, Ric Wheeler wrote:
>> One topic that has been lurking forever at the edges is the current
>> 4k limitation for file system block sizes. Some devices in
>> production today and others coming soon have larger sectors and it
>> would be interesting to see if it is time to poke at this topic
>> again.
>>
>> LSF/MM seems to be pretty much the only event of the year that most
>> of the key people will be present, so should be a great topic for a
>> joint session.
> 
> Oh yes, I want in on this.  We handle 4k/16k/64k pages "seamlessly," and
> we would want to do the same for larger sectors.  In theory, our code
> should handle it with the appropriate defines updated.
> 
+1

The shingled drive folks would really love us for this.
Plus it would make life really easy for those types of devices.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

[RESEND PATCH V5 8/8] cpuidle/powernv: Parse device tree to setup idle states

2014-01-21 Thread Preeti U Murthy

Add deep idle states such as nap and fast sleep to the cpuidle state table
only if they are discovered from the device tree during cpuidle initialization.

Signed-off-by: Preeti U Murthy 
---

 drivers/cpuidle/cpuidle-powernv.c |   81 +
 1 file changed, 64 insertions(+), 17 deletions(-)

diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index 90f0c2b..b3face5 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -12,10 +12,17 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
 
+/* Flags and constants used in PowerNV platform */
+
+#define MAX_POWERNV_IDLE_STATES 8
+#define IDLE_USE_INST_NAP       0x0001 /* Use nap instruction */
+#define IDLE_USE_INST_SLEEP     0x0002 /* Use sleep instruction */
+
 struct cpuidle_driver powernv_idle_driver = {
.name = "powernv_idle",
.owner= THIS_MODULE,
@@ -87,7 +94,7 @@ static int fastsleep_loop(struct cpuidle_device *dev,
 /*
  * States for dedicated partition case.
  */
-static struct cpuidle_state powernv_states[] = {
+static struct cpuidle_state powernv_states[MAX_POWERNV_IDLE_STATES] = {
{ /* Snooze */
.name = "snooze",
.desc = "snooze",
@@ -95,20 +102,6 @@ static struct cpuidle_state powernv_states[] = {
.exit_latency = 0,
.target_residency = 0,
.enter = &snooze_loop },
-   { /* NAP */
-   .name = "NAP",
-   .desc = "NAP",
-   .flags = CPUIDLE_FLAG_TIME_VALID,
-   .exit_latency = 10,
-   .target_residency = 100,
-   .enter = &nap_loop },
-{ /* Fastsleep */
-   .name = "fastsleep",
-   .desc = "fastsleep",
-   .flags = CPUIDLE_FLAG_TIME_VALID,
-   .exit_latency = 10,
-   .target_residency = 100,
-   .enter = &fastsleep_loop },
 };
 
 static int powernv_cpuidle_add_cpu_notifier(struct notifier_block *n,
@@ -169,19 +162,73 @@ static int powernv_cpuidle_driver_init(void)
return 0;
 }
 
+static int powernv_add_idle_states(void)
+{
+   struct device_node *power_mgt;
+   struct property *prop;
+   int nr_idle_states = 1; /* Snooze */
+   int dt_idle_states;
+   u32 *flags;
+   int i;
+
+   /* Currently we have snooze statically defined */
+
+   power_mgt = of_find_node_by_path("/ibm,opal/power-mgt");
+   if (!power_mgt) {
+   pr_warn("opal: PowerMgmt Node not found\n");
+   return nr_idle_states;
+   }
+
+   prop = of_find_property(power_mgt, "ibm,cpu-idle-state-flags", NULL);
+   if (!prop) {
+   pr_warn("DT-PowerMgmt: missing ibm,cpu-idle-state-flags\n");
+   return nr_idle_states;
+   }
+
+   dt_idle_states = prop->length / sizeof(u32);
+   flags = (u32 *) prop->value;
+
+   for (i = 0; i < dt_idle_states; i++) {
+
+   if (flags[i] & IDLE_USE_INST_NAP) {
+   /* Add NAP state */
+   strcpy(powernv_states[nr_idle_states].name, "Nap");
+   strcpy(powernv_states[nr_idle_states].desc, "Nap");
+   powernv_states[nr_idle_states].flags = CPUIDLE_FLAG_TIME_VALID;
+   powernv_states[nr_idle_states].exit_latency = 10;
+   powernv_states[nr_idle_states].target_residency = 100;
+   powernv_states[nr_idle_states].enter = &nap_loop;
+   nr_idle_states++;
+   }
+
+   if (flags[i] & IDLE_USE_INST_SLEEP) {
+   /* Add FASTSLEEP state */
+   strcpy(powernv_states[nr_idle_states].name, "FastSleep");
+   strcpy(powernv_states[nr_idle_states].desc, "FastSleep");
+   powernv_states[nr_idle_states].flags = CPUIDLE_FLAG_TIME_VALID;
+   powernv_states[nr_idle_states].exit_latency = 300;
+   powernv_states[nr_idle_states].target_residency = 100;
+   powernv_states[nr_idle_states].enter = &fastsleep_loop;
+   nr_idle_states++;
+   }
+   }
+
+   return nr_idle_states;
+}
+
 /*
  * powernv_idle_probe()
  * Choose state table for shared versus dedicated partition
  */
 static int powernv_idle_probe(void)
 {
-
if (cpuidle_disable != IDLE_NO_OVERRIDE)
return -ENODEV;
 
if (firmware_has_feature(FW_FEATURE_OPALv3)) {
cpuidle_state_table = powernv_states;
-   max_idle_state = ARRAY_SIZE(powernv_states);
+   /* Device tree can indicate more idle states */
+   max_idle_state = powernv_add_idle_states();
} else
return -ENODEV;
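
The table-building loop in the patch above can be tried outside the kernel with a small user-space sketch. It assumes the flag values exactly as they are printed in this archive, replaces the device-tree property with a plain array, and adds a bounds check against the table size that the posted loop does not show; the `toy_*` names and the reduced state structure are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_IDLE_STATES 8
#define TOY_USE_INST_NAP    0x0001u  /* value as printed in the patch */
#define TOY_USE_INST_SLEEP  0x0002u  /* value as printed in the patch */

/* Reduced, hypothetical stand-in for struct cpuidle_state. */
struct toy_idle_state {
	char name[16];
	unsigned int exit_latency;
	unsigned int target_residency;
};

static void toy_fill_state(struct toy_idle_state *st, const char *name,
			   unsigned int exit_latency,
			   unsigned int target_residency)
{
	assert(strlen(name) < sizeof(st->name));
	strcpy(st->name, name);
	st->exit_latency = exit_latency;
	st->target_residency = target_residency;
}

/*
 * Append one state per advertised flag and return the number of populated
 * entries. Slot 0 is assumed to hold the statically defined snooze state.
 * The "nr < TOY_MAX_IDLE_STATES" checks are an addition for this sketch.
 */
static int toy_add_idle_states(struct toy_idle_state *table,
			       const uint32_t *flags, size_t nr_flags)
{
	int nr = 1;
	size_t i;

	for (i = 0; i < nr_flags; i++) {
		if ((flags[i] & TOY_USE_INST_NAP) && nr < TOY_MAX_IDLE_STATES)
			toy_fill_state(&table[nr++], "Nap", 10, 100);
		if ((flags[i] & TOY_USE_INST_SLEEP) && nr < TOY_MAX_IDLE_STATES)
			toy_fill_state(&table[nr++], "FastSleep", 300, 100);
	}
	return nr;
}
```

With one nap entry and one sleep entry in the flags array, the function reports three states (snooze, Nap, FastSleep); with more entries than the table can hold, it stops at the table size.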
 


[RESEND PATCH V5 7/8] cpuidle/powernv: Add "Fast-Sleep" CPU idle state

2014-01-21 Thread Preeti U Murthy

Fast sleep is one of the deep idle states on Power8 in which local timers of
CPUs stop. On PowerPC we do not have an external clock device which can
handle wakeup of such CPUs. Now that we have the support in the tick broadcast
framework for archs that do not sport such a device and the low level support
for fast sleep, enable it in the cpuidle framework on PowerNV.

Signed-off-by: Preeti U Murthy 
---

 arch/powerpc/Kconfig  |2 ++
 arch/powerpc/kernel/time.c|2 +-
 drivers/cpuidle/cpuidle-powernv.c |   42 +
 3 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index fa39517..ec91584 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -129,6 +129,8 @@ config PPC
select GENERIC_CMOS_UPDATE
select GENERIC_TIME_VSYSCALL_OLD
select GENERIC_CLOCKEVENTS
+   select GENERIC_CLOCKEVENTS_BROADCAST
+   select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
select HAVE_MOD_ARCH_SPECIFIC
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index df2989b..95fa5ce 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -106,7 +106,7 @@ struct clock_event_device decrementer_clockevent = {
.irq= 0,
.set_next_event = decrementer_set_next_event,
.set_mode   = decrementer_set_mode,
-   .features   = CLOCK_EVT_FEAT_ONESHOT,
+   .features   = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_C3STOP,
 };
 EXPORT_SYMBOL(decrementer_clockevent);
 
diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index 78fd174..90f0c2b 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -49,6 +50,40 @@ static int nap_loop(struct cpuidle_device *dev,
return index;
 }
 
+static int fastsleep_loop(struct cpuidle_device *dev,
+   struct cpuidle_driver *drv,
+   int index)
+{
+   int cpu = dev->cpu;
+   unsigned long old_lpcr = mfspr(SPRN_LPCR);
+   unsigned long new_lpcr;
+
+   if (unlikely(system_state < SYSTEM_RUNNING))
+   return index;
+
+   new_lpcr = old_lpcr;
+   new_lpcr &= ~(LPCR_MER | LPCR_PECE); /* lpcr[mer] must be 0 */
+
+   /* exit powersave upon external interrupt, but not decrementer
+* interrupt, Emulate sleep.
+*/
+   new_lpcr |= LPCR_PECE0;
+
+   if (clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu)) {
+   new_lpcr |= LPCR_PECE1;
+   mtspr(SPRN_LPCR, new_lpcr);
+   power7_nap();
+   } else {
+   mtspr(SPRN_LPCR, new_lpcr);
+   power7_sleep();
+   }
+   clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu);
+
+   mtspr(SPRN_LPCR, old_lpcr);
+
+   return index;
+}
+
 /*
  * States for dedicated partition case.
  */
@@ -67,6 +102,13 @@ static struct cpuidle_state powernv_states[] = {
.exit_latency = 10,
.target_residency = 100,
.enter = &nap_loop },
+{ /* Fastsleep */
+   .name = "fastsleep",
+   .desc = "fastsleep",
+   .flags = CPUIDLE_FLAG_TIME_VALID,
+   .exit_latency = 10,
+   .target_residency = 100,
+   .enter = &fastsleep_loop },
 };
 
 static int powernv_cpuidle_add_cpu_notifier(struct notifier_block *n,


[RESEND PATCH V5 4/8] powernv/cpuidle: Add context management for Fast Sleep

2014-01-21 Thread Preeti U Murthy

From: Vaidyanathan Srinivasan 

Before adding Fast-Sleep into the cpuidle framework, some low level
support needs to be added to enable it. This includes saving and
restoring of certain registers at entry and exit time of this state
respectively just like we do in the NAP idle state.

Signed-off-by: Vaidyanathan Srinivasan 
[Changelog modified by Preeti U. Murthy ]
Signed-off-by: Preeti U. Murthy 
---

 arch/powerpc/include/asm/processor.h |1 +
 arch/powerpc/kernel/exceptions-64s.S |   10 -
 arch/powerpc/kernel/idle_power7.S|   63 --
 3 files changed, 53 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index b62de43..d660dc3 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -450,6 +450,7 @@ enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_POWERSAVE_OFF};
 
 extern int powersave_nap;  /* set if nap mode can be used in idle loop */
 extern void power7_nap(void);
+extern void power7_sleep(void);
 extern void flush_instruction_cache(void);
 extern void hard_reset_now(void);
 extern void poweroff_now(void);
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 38d5073..b01a9cb 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -121,9 +121,10 @@ BEGIN_FTR_SECTION
cmpwi   cr1,r13,2
/* Total loss of HV state is fatal, we could try to use the
 * PIR to locate a PACA, then use an emergency stack etc...
-* but for now, let's just stay stuck here
+* OPAL v3 based powernv platforms have new idle states
+* which fall in this category.
 */
-   bgt cr1,.
+   bgt cr1,8f
GET_PACA(r13)
 
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
@@ -141,6 +142,11 @@ BEGIN_FTR_SECTION
beq cr1,2f
b   .power7_wakeup_noloss
 2: b   .power7_wakeup_loss
+
+   /* Fast Sleep wakeup on PowerNV */
+8: GET_PACA(r13)
+   b   .power7_wakeup_loss
+
 9:
 END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
 #endif /* CONFIG_PPC_P7_NAP */
diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S
index 3fdef0f..14f78be 100644
--- a/arch/powerpc/kernel/idle_power7.S
+++ b/arch/powerpc/kernel/idle_power7.S
@@ -20,17 +20,27 @@
 
 #undef DEBUG
 
-   .text
+/* Idle state entry routines */
 
-_GLOBAL(power7_idle)
-   /* Now check if user or arch enabled NAP mode */
-   LOAD_REG_ADDRBASE(r3,powersave_nap)
-   lwz r4,ADDROFF(powersave_nap)(r3)
-   cmpwi   0,r4,0
-   beqlr
-   /* fall through */
+#define IDLE_STATE_ENTER_SEQ(IDLE_INST) \
+   /* Magic NAP/SLEEP/WINKLE mode enter sequence */\
+   std r0,0(r1);   \
+   ptesync;\
+   ld  r0,0(r1);   \
+1: cmp cr0,r0,r0;  \
+   bne 1b; \
+   IDLE_INST;  \
+   b   .
 
-_GLOBAL(power7_nap)
+   .text
+
+/*
+ * Pass requested state in r3:
+ * 0 - nap
+ * 1 - sleep
+ */
+_GLOBAL(power7_powersave_common)
+   /* Use r3 to pass state nap/sleep/winkle */
/* NAP is a state loss, we create a regs frame on the
 * stack, fill it up with the state we care about and
 * stick a pointer to it in PACAR1. We really only
@@ -79,8 +89,8 @@ _GLOBAL(power7_nap)
/* Continue saving state */
SAVE_GPR(2, r1)
SAVE_NVGPRS(r1)
-   mfcr    r3
-   std r3,_CCR(r1)
+   mfcr    r4
+   std r4,_CCR(r1)
std r9,_MSR(r1)
std r1,PACAR1(r13)
 
@@ -90,15 +100,30 @@ _GLOBAL(power7_enter_nap_mode)
li  r4,KVM_HWTHREAD_IN_NAP
stb r4,HSTATE_HWTHREAD_STATE(r13)
 #endif
+   cmpwi   cr0,r3,1
+   beq 2f
+   IDLE_STATE_ENTER_SEQ(PPC_NAP)
+   /* No return */
+2: IDLE_STATE_ENTER_SEQ(PPC_SLEEP)
+   /* No return */
 
-   /* Magic NAP mode enter sequence */
-   std r0,0(r1)
-   ptesync
-   ld  r0,0(r1)
-1: cmp cr0,r0,r0
-   bne 1b
-   PPC_NAP
-   b   .
+_GLOBAL(power7_idle)
+   /* Now check if user or arch enabled NAP mode */
+   LOAD_REG_ADDRBASE(r3,powersave_nap)
+   lwz r4,ADDROFF(powersave_nap)(r3)
+   cmpwi   0,r4,0
+   beqlr
+   /* fall through */
+
+_GLOBAL(power7_nap)
+   li  r3,0
+   b   power7_powersave_common
+   /* No return */
+
+_GLOBAL(power7_sleep)
+   li  r3,1
+   b   power7_powersave_common
+   /* No return */
 
 _GLOBAL(power7_wakeup_loss)
ld  r1,PACAR1(r13)


[RESEND PATCH V5 5/8] powermgt: Add OPAL call to resync timebase on wakeup

2014-01-21 Thread Preeti U Murthy

From: Vaidyanathan Srinivasan 

In "Fast-sleep" and deeper power-saving states, the decrementer and
timebase may be stopped, leaving them out of sync with the rest
of the cores in the system.

Add a firmware call that asks the platform to resync the timebase
using low-level platform methods.

Signed-off-by: Vaidyanathan Srinivasan 
Signed-off-by: Preeti U. Murthy 
---

 arch/powerpc/include/asm/opal.h|2 ++
 arch/powerpc/kernel/exceptions-64s.S   |2 +-
 arch/powerpc/kernel/idle_power7.S  |   27 
 arch/powerpc/platforms/powernv/opal-wrappers.S |1 +
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 9a87b44..8c4829f 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -154,6 +154,7 @@ extern int opal_enter_rtas(struct rtas_args *args,
 #define OPAL_FLASH_VALIDATE 76
 #define OPAL_FLASH_MANAGE  77
 #define OPAL_FLASH_UPDATE  78
+#define OPAL_RESYNC_TIMEBASE   79
 #define OPAL_GET_MSG   85
 #define OPAL_CHECK_ASYNC_COMPLETION 86
 
@@ -863,6 +864,7 @@ extern void opal_flash_init(void);
 extern int opal_machine_check(struct pt_regs *regs);
 
 extern void opal_shutdown(void);
+extern int opal_resync_timebase(void);
 
 extern void opal_lpc_init(void);
 
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index b01a9cb..9533d7a 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -145,7 +145,7 @@ BEGIN_FTR_SECTION
 
/* Fast Sleep wakeup on PowerNV */
 8: GET_PACA(r13)
-   b   .power7_wakeup_loss
+   b   .power7_wakeup_tb_loss
 
 9:
 END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S
index 14f78be..c3ab869 100644
--- a/arch/powerpc/kernel/idle_power7.S
+++ b/arch/powerpc/kernel/idle_power7.S
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #undef DEBUG
 
@@ -125,6 +126,32 @@ _GLOBAL(power7_sleep)
b   power7_powersave_common
/* No return */
 
+_GLOBAL(power7_wakeup_tb_loss)
+   ld  r2,PACATOC(r13);
+   ld  r1,PACAR1(r13)
+
+   /* Time base re-sync */
+   li  r0,OPAL_RESYNC_TIMEBASE
+   LOAD_REG_ADDR(r11,opal);
+   ld  r12,8(r11);
+   ld  r2,0(r11);
+   mtctr   r12
+   bctrl
+
+   /* TODO: Check r3 for failure */
+
+   REST_NVGPRS(r1)
+   REST_GPR(2, r1)
+   ld  r3,_CCR(r1)
+   ld  r4,_MSR(r1)
+   ld  r5,_NIP(r1)
+   addi    r1,r1,INT_FRAME_SIZE
+   mtcr    r3
+   mfspr   r3,SPRN_SRR1/* Return SRR1 */
+   mtspr   SPRN_SRR1,r4
+   mtspr   SPRN_SRR0,r5
+   rfid
+
 _GLOBAL(power7_wakeup_loss)
ld  r1,PACAR1(r13)
REST_NVGPRS(r1)
diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
index 719aa5c..a11a87c 100644
--- a/arch/powerpc/platforms/powernv/opal-wrappers.S
+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
@@ -126,5 +126,6 @@ OPAL_CALL(opal_return_cpu, OPAL_RETURN_CPU);
 OPAL_CALL(opal_validate_flash, OPAL_FLASH_VALIDATE);
 OPAL_CALL(opal_manage_flash,   OPAL_FLASH_MANAGE);
 OPAL_CALL(opal_update_flash,   OPAL_FLASH_UPDATE);
+OPAL_CALL(opal_resync_timebase,OPAL_RESYNC_TIMEBASE);
 OPAL_CALL(opal_get_msg,OPAL_GET_MSG);
 OPAL_CALL(opal_check_completion,   OPAL_CHECK_ASYNC_COMPLETION);


[RESEND PATCH V5 6/8] time/cpuidle: Support in tick broadcast framework in the absence of external clock device

2014-01-21 Thread Preeti U Murthy

On some architectures, the local timers stop in certain deep CPU idle states.
An external clock device is used to wake up these CPUs. Kernel support for
waking these CPUs is provided by the tick broadcast framework, which uses the
external clock device as the wakeup source.

However, not all implementations of an architecture provide such an external
clock device; some PowerPC ones do not. This patch adds support in the
broadcast framework to handle the wakeup of CPUs in deep idle states on such
systems by queuing a hrtimer on one of the CPUs, which is meant to handle the
wakeup of CPUs in deep idle states. This CPU is identified as the bc_cpu.

Each time the hrtimer expires, it is reprogrammed for the next wakeup of the
CPUs in deep idle state after handling broadcast. However when a CPU is about
to enter deep idle state with its wakeup time earlier than the time at which
the hrtimer is currently programmed, it *becomes the new bc_cpu* and restarts
the hrtimer on itself. This way the job of doing broadcast is handed around to
the CPUs that ask for the earliest wakeup just before entering deep idle
state. This is consistent with what happens in cases where an external clock
device is present. The smp affinity of this clock device is set to the CPU
with the earliest wakeup.

The important point here is that the bc_cpu cannot enter deep idle state
since it has a hrtimer queued to wakeup the other CPUs in deep idle. Hence it
cannot have its local timer stopped. Therefore for such a CPU, the
BROADCAST_ENTER notification has to fail implying that it cannot enter deep
idle state. On architectures where an external clock device is present, all
CPUs can enter deep idle.

During hotplug of the bc_cpu, the job of doing a broadcast is assigned to the
first cpu in the broadcast mask. This newly nominated bc_cpu is woken up by
an IPI so as to queue the above mentioned hrtimer on it.
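
The handoff rule described above can be sketched as a small standalone model
in C. This is only an illustration under stated assumptions, not the code in
this patch: struct bc_state, broadcast_enter() and NO_CPU are hypothetical
names, time is a plain integer, and the case where no bc_cpu exists yet is
assumed to behave like "earliest wakeup so far". Locking, the hrtimer itself
and hotplug are left out.

```c
#include <stdint.h>

#define NO_CPU (-1)

/* Hypothetical, simplified stand-in for the state kept by the framework. */
struct bc_state {
    int bc_cpu;              /* CPU currently holding the broadcast hrtimer */
    uint64_t bc_next_wakeup; /* time the hrtimer is programmed for */
};

/*
 * Model of a BROADCAST_ENTER request from 'cpu' whose next timer event is
 * at 'next_event'. Returns 0 if the CPU may enter deep idle, non-zero if
 * it must not (it is, or has just become, the broadcast CPU).
 */
int broadcast_enter(struct bc_state *bc, int cpu, uint64_t next_event)
{
    if (bc->bc_cpu == NO_CPU || next_event < bc->bc_next_wakeup) {
        /* Earliest wakeup so far: take over the broadcast duty. */
        bc->bc_cpu = cpu;
        bc->bc_next_wakeup = next_event;
        return 1;
    }
    if (bc->bc_cpu == cpu)
        return 1; /* the current bc_cpu never enters deep idle */
    return 0;
}
```

A caller would treat a non-zero return the way an idle loop treats a failed
BROADCAST_ENTER notification: fall back to a shallower idle state in which
the local timer keeps running.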

Signed-off-by: Preeti U Murthy 
---

 include/linux/clockchips.h   |4 -
 kernel/time/clockevents.c|9 +-
 kernel/time/tick-broadcast.c |  192 ++
 kernel/time/tick-internal.h  |8 +-
 4 files changed, 186 insertions(+), 27 deletions(-)

diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h
index 493aa02..bbda37b 100644
--- a/include/linux/clockchips.h
+++ b/include/linux/clockchips.h
@@ -186,9 +186,9 @@ static inline int tick_check_broadcast_expired(void) { return 0; }
 #endif
 
 #ifdef CONFIG_GENERIC_CLOCKEVENTS
-extern void clockevents_notify(unsigned long reason, void *arg);
+extern int clockevents_notify(unsigned long reason, void *arg);
 #else
-static inline void clockevents_notify(unsigned long reason, void *arg) {}
+static inline int clockevents_notify(unsigned long reason, void *arg) { return 0; }
 #endif
 
 #else /* CONFIG_GENERIC_CLOCKEVENTS_BUILD */
diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 086ad60..d61404e 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -524,12 +524,13 @@ void clockevents_resume(void)
 #ifdef CONFIG_GENERIC_CLOCKEVENTS
 /**
  * clockevents_notify - notification about relevant events
+ * Returns non zero on error.
  */
-void clockevents_notify(unsigned long reason, void *arg)
+int clockevents_notify(unsigned long reason, void *arg)
 {
struct clock_event_device *dev, *tmp;
unsigned long flags;
-   int cpu;
+   int cpu, ret = 0;
 
raw_spin_lock_irqsave(&clockevents_lock, flags);
 
@@ -542,11 +543,12 @@ void clockevents_notify(unsigned long reason, void *arg)
 
case CLOCK_EVT_NOTIFY_BROADCAST_ENTER:
case CLOCK_EVT_NOTIFY_BROADCAST_EXIT:
-   tick_broadcast_oneshot_control(reason);
+   ret = tick_broadcast_oneshot_control(reason);
break;
 
case CLOCK_EVT_NOTIFY_CPU_DYING:
tick_handover_do_timer(arg);
+   tick_handover_broadcast_cpu(arg);
break;
 
case CLOCK_EVT_NOTIFY_SUSPEND:
@@ -585,6 +587,7 @@ void clockevents_notify(unsigned long reason, void *arg)
break;
}
raw_spin_unlock_irqrestore(&clockevents_lock, flags);
+   return ret;
 }
 EXPORT_SYMBOL_GPL(clockevents_notify);
 
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 9532690..1c23912 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "tick-internal.h"
 
@@ -35,6 +36,15 @@ static cpumask_var_t tmpmask;
 static DEFINE_RAW_SPINLOCK(tick_broadcast_lock);
 static int tick_broadcast_force;
 
+/*
+ * Helper variables for handling broadcast in the absence of a
+ * tick_broadcast_device.
+ * */
+static struct hrtimer *bc_hrtimer;
+static int bc_cpu = -1;
+static ktime_t bc_next_wakeup;
+static int hrtimer_initialized = 0;
+
 #ifdef CONFIG_TICK_ONESHOT
 static void tick_broadcast_clear_oneshot(int cpu);
 #else
@@ -528,6 +538,20 @@ static int tick

[RESEND PATCH V5 3/8] cpuidle/ppc: Split timer_interrupt() into timer handling and interrupt handling routines

2014-01-21 Thread Preeti U Murthy

Split timer_interrupt(), the local timer interrupt handler on ppc, into
the routines called during regular interrupt handling and
__timer_interrupt(), which takes care of running local timers and collecting
time-related stats.

This enables callers interested only in running expired local timers to
call __timer_interrupt() directly. One use case is the tick broadcast IPI
handling, in which the sleeping CPUs need to handle the local timers that
have expired.

Signed-off-by: Preeti U Murthy 
---

 arch/powerpc/kernel/time.c |   81 +---
 1 file changed, 46 insertions(+), 35 deletions(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 3ff97db..df2989b 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -478,6 +478,47 @@ void arch_irq_work_raise(void)
 
 #endif /* CONFIG_IRQ_WORK */
 
+void __timer_interrupt(void)
+{
+   struct pt_regs *regs = get_irq_regs();
+   u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
+   struct clock_event_device *evt = &__get_cpu_var(decrementers);
+   u64 now;
+
+   trace_timer_interrupt_entry(regs);
+
+   if (test_irq_work_pending()) {
+   clear_irq_work_pending();
+   irq_work_run();
+   }
+
+   now = get_tb_or_rtc();
+   if (now >= *next_tb) {
+   *next_tb = ~(u64)0;
+   if (evt->event_handler)
+   evt->event_handler(evt);
+   __get_cpu_var(irq_stat).timer_irqs_event++;
+   } else {
+   now = *next_tb - now;
+   if (now <= DECREMENTER_MAX)
+   set_dec((int)now);
+   /* We may have raced with new irq work */
+   if (test_irq_work_pending())
+   set_dec(1);
+   __get_cpu_var(irq_stat).timer_irqs_others++;
+   }
+
+#ifdef CONFIG_PPC64
+   /* collect purr register values often, for accurate calculations */
+   if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
+   struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
+   cu->current_tb = mfspr(SPRN_PURR);
+   }
+#endif
+
+   trace_timer_interrupt_exit(regs);
+}
+
 /*
  * timer_interrupt - gets called when the decrementer overflows,
  * with interrupts disabled.
@@ -486,8 +527,6 @@ void timer_interrupt(struct pt_regs * regs)
 {
struct pt_regs *old_regs;
u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
-   struct clock_event_device *evt = &__get_cpu_var(decrementers);
-   u64 now;
 
/* Ensure a positive value is written to the decrementer, or else
 * some CPUs will continue to take decrementer exceptions.
@@ -519,39 +558,7 @@ void timer_interrupt(struct pt_regs * regs)
old_regs = set_irq_regs(regs);
irq_enter();
 
-   trace_timer_interrupt_entry(regs);
-
-   if (test_irq_work_pending()) {
-   clear_irq_work_pending();
-   irq_work_run();
-   }
-
-   now = get_tb_or_rtc();
-   if (now >= *next_tb) {
-   *next_tb = ~(u64)0;
-   if (evt->event_handler)
-   evt->event_handler(evt);
-   __get_cpu_var(irq_stat).timer_irqs_event++;
-   } else {
-   now = *next_tb - now;
-   if (now <= DECREMENTER_MAX)
-   set_dec((int)now);
-   /* We may have raced with new irq work */
-   if (test_irq_work_pending())
-   set_dec(1);
-   __get_cpu_var(irq_stat).timer_irqs_others++;
-   }
-
-#ifdef CONFIG_PPC64
-   /* collect purr register values often, for accurate calculations */
-   if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
-   struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
-   cu->current_tb = mfspr(SPRN_PURR);
-   }
-#endif
-
-   trace_timer_interrupt_exit(regs);
-
+   __timer_interrupt();
irq_exit();
set_irq_regs(old_regs);
 }
@@ -828,6 +835,10 @@ static void decrementer_set_mode(enum clock_event_mode mode,
 /* Interrupt handler for the timer broadcast IPI */
 void tick_broadcast_ipi_handler(void)
 {
+   u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
+
+   *next_tb = get_tb_or_rtc();
+   __timer_interrupt();
 }
 
 static void register_decrementer_clockevent(int cpu)


[RESEND PATCH V5 1/8] powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI message

2014-01-21 Thread Preeti U Murthy

From: Srivatsa S. Bhat 

The IPI handlers for both PPC_MSG_CALL_FUNC and PPC_MSG_CALL_FUNC_SINGLE map
to a common implementation - generic_smp_call_function_single_interrupt(). So,
we can consolidate them and save one of the IPI message slots, (which are
precious on powerpc, since only 4 of those slots are available).

So, implement the functionality of PPC_MSG_CALL_FUNC_SINGLE using
PPC_MSG_CALL_FUNC itself and release its IPI message slot, so that it can be
used for something else in the future, if desired.

Signed-off-by: Srivatsa S. Bhat 
Signed-off-by: Preeti U. Murthy 
Acked-by: Geoff Levand  [For the PS3 part]
---

 arch/powerpc/include/asm/smp.h  |2 +-
 arch/powerpc/kernel/smp.c   |   12 +---
 arch/powerpc/platforms/cell/interrupt.c |2 +-
 arch/powerpc/platforms/ps3/smp.c|2 +-
 4 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 084e080..9f7356b 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -120,7 +120,7 @@ extern int cpu_to_core_id(int cpu);
  * in /proc/interrupts will be wrong!!! --Troy */
 #define PPC_MSG_CALL_FUNCTION   0
 #define PPC_MSG_RESCHEDULE  1
-#define PPC_MSG_CALL_FUNC_SINGLE   2
+#define PPC_MSG_UNUSED 2
 #define PPC_MSG_DEBUGGER_BREAK  3
 
 /* for irq controllers that have dedicated ipis per message (4) */
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index ac2621a..ee7d76b 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -145,9 +145,9 @@ static irqreturn_t reschedule_action(int irq, void *data)
return IRQ_HANDLED;
 }
 
-static irqreturn_t call_function_single_action(int irq, void *data)
+static irqreturn_t unused_action(int irq, void *data)
 {
-   generic_smp_call_function_single_interrupt();
+   /* This slot is unused and hence available for use, if needed */
return IRQ_HANDLED;
 }
 
@@ -168,14 +168,14 @@ static irqreturn_t debug_ipi_action(int irq, void *data)
 static irq_handler_t smp_ipi_action[] = {
[PPC_MSG_CALL_FUNCTION] =  call_function_action,
[PPC_MSG_RESCHEDULE] = reschedule_action,
-   [PPC_MSG_CALL_FUNC_SINGLE] = call_function_single_action,
+   [PPC_MSG_UNUSED] = unused_action,
[PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action,
 };
 
 const char *smp_ipi_name[] = {
[PPC_MSG_CALL_FUNCTION] =  "ipi call function",
[PPC_MSG_RESCHEDULE] = "ipi reschedule",
-   [PPC_MSG_CALL_FUNC_SINGLE] = "ipi call function single",
+   [PPC_MSG_UNUSED] = "ipi unused",
[PPC_MSG_DEBUGGER_BREAK] = "ipi debugger",
 };
 
@@ -251,8 +251,6 @@ irqreturn_t smp_ipi_demux(void)
generic_smp_call_function_interrupt();
if (all & IPI_MESSAGE(PPC_MSG_RESCHEDULE))
scheduler_ipi();
-   if (all & IPI_MESSAGE(PPC_MSG_CALL_FUNC_SINGLE))
-   generic_smp_call_function_single_interrupt();
if (all & IPI_MESSAGE(PPC_MSG_DEBUGGER_BREAK))
debug_ipi_action(0, NULL);
} while (info->messages);
@@ -280,7 +278,7 @@ EXPORT_SYMBOL_GPL(smp_send_reschedule);
 
 void arch_send_call_function_single_ipi(int cpu)
 {
-   do_message_pass(cpu, PPC_MSG_CALL_FUNC_SINGLE);
+   do_message_pass(cpu, PPC_MSG_CALL_FUNCTION);
 }
 
 void arch_send_call_function_ipi_mask(const struct cpumask *mask)
diff --git a/arch/powerpc/platforms/cell/interrupt.c b/arch/powerpc/platforms/cell/interrupt.c
index 2d42f3b..adf3726 100644
--- a/arch/powerpc/platforms/cell/interrupt.c
+++ b/arch/powerpc/platforms/cell/interrupt.c
@@ -215,7 +215,7 @@ void iic_request_IPIs(void)
 {
iic_request_ipi(PPC_MSG_CALL_FUNCTION);
iic_request_ipi(PPC_MSG_RESCHEDULE);
-   iic_request_ipi(PPC_MSG_CALL_FUNC_SINGLE);
+   iic_request_ipi(PPC_MSG_UNUSED);
iic_request_ipi(PPC_MSG_DEBUGGER_BREAK);
 }
 
diff --git a/arch/powerpc/platforms/ps3/smp.c b/arch/powerpc/platforms/ps3/smp.c
index 4b35166..00d1a7c 100644
--- a/arch/powerpc/platforms/ps3/smp.c
+++ b/arch/powerpc/platforms/ps3/smp.c
@@ -76,7 +76,7 @@ static int __init ps3_smp_probe(void)
 
BUILD_BUG_ON(PPC_MSG_CALL_FUNCTION!= 0);
BUILD_BUG_ON(PPC_MSG_RESCHEDULE   != 1);
-   BUILD_BUG_ON(PPC_MSG_CALL_FUNC_SINGLE != 2);
+   BUILD_BUG_ON(PPC_MSG_UNUSED   != 2);
BUILD_BUG_ON(PPC_MSG_DEBUGGER_BREAK   != 3);
 
for (i = 0; i < MSG_COUNT; i++) {


[RESEND PATCH V5 2/8] powerpc: Implement tick broadcast IPI as a fixed IPI message

2014-01-21 Thread Preeti U Murthy

From: Srivatsa S. Bhat 

For scalability and performance reasons, we want the tick broadcast IPIs
to be handled as efficiently as possible. Fixed IPI messages
are one of the most efficient mechanisms available - they are faster than
the smp_call_function mechanism because the IPI handlers are fixed and hence
they don't involve costly operations such as adding IPI handlers to the target
CPU's function queue, acquiring locks for synchronization etc.

Luckily we have an unused IPI message slot, so use that to implement
tick broadcast IPIs efficiently.
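
To illustrate why a fixed message is cheap, here is a minimal single-threaded
model in C of a pending-bit demux. It is a sketch under assumptions, not the
powerpc implementation: the enum values, ipi_post(), ipi_demux() and the
ipi_hits[] counters are hypothetical, and the real code would need atomic
accesses to the pending word.

```c
#include <stdint.h>

/* Hypothetical message numbers, mirroring the four fixed slots. */
enum ipi_msg {
    MSG_CALL_FUNCTION = 0,
    MSG_RESCHEDULE = 1,
    MSG_TICK_BROADCAST = 2,
    MSG_DEBUGGER_BREAK = 3,
    MSG_COUNT
};

/* Per-message hit counters stand in for the real handlers' work. */
unsigned int ipi_hits[MSG_COUNT];

/* Sender side: mark one message as pending for the target CPU. */
void ipi_post(uint32_t *pending, enum ipi_msg msg)
{
    *pending |= 1u << msg;
}

/*
 * Receiver side: each pending bit maps directly to its fixed handler,
 * so nothing is queued and no lock is taken. Returns the number of
 * distinct messages handled.
 */
int ipi_demux(uint32_t *pending)
{
    uint32_t all = *pending;
    int handled = 0;

    *pending = 0;
    for (int msg = 0; msg < MSG_COUNT; msg++) {
        if (all & (1u << msg)) {
            ipi_hits[msg]++;
            handled++;
        }
    }
    return handled;
}
```

Note that posting the same message twice before the receiver runs collapses
into a single bit, which is acceptable for a tick broadcast because the
handler only needs to run expired timers once.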

Signed-off-by: Srivatsa S. Bhat 
[Functions renamed to tick_broadcast* and Changelog modified by
 Preeti U. Murthy]
Signed-off-by: Preeti U. Murthy 
Acked-by: Geoff Levand  [For the PS3 part]
---

 arch/powerpc/include/asm/smp.h  |2 +-
 arch/powerpc/include/asm/time.h |1 +
 arch/powerpc/kernel/smp.c   |   19 +++
 arch/powerpc/kernel/time.c  |5 +
 arch/powerpc/platforms/cell/interrupt.c |2 +-
 arch/powerpc/platforms/ps3/smp.c|2 +-
 6 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 9f7356b..ff51046 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -120,7 +120,7 @@ extern int cpu_to_core_id(int cpu);
  * in /proc/interrupts will be wrong!!! --Troy */
 #define PPC_MSG_CALL_FUNCTION   0
 #define PPC_MSG_RESCHEDULE  1
-#define PPC_MSG_UNUSED 2
+#define PPC_MSG_TICK_BROADCAST 2
 #define PPC_MSG_DEBUGGER_BREAK  3
 
 /* for irq controllers that have dedicated ipis per message (4) */
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index c1f2676..1d428e6 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -28,6 +28,7 @@ extern struct clock_event_device decrementer_clockevent;
 struct rtc_time;
 extern void to_tm(int tim, struct rtc_time * tm);
 extern void GregorianDay(struct rtc_time *tm);
+extern void tick_broadcast_ipi_handler(void);
 
 extern void generic_calibrate_decr(void);
 
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index ee7d76b..6f06f05 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -145,9 +146,9 @@ static irqreturn_t reschedule_action(int irq, void *data)
return IRQ_HANDLED;
 }
 
-static irqreturn_t unused_action(int irq, void *data)
+static irqreturn_t tick_broadcast_ipi_action(int irq, void *data)
 {
-   /* This slot is unused and hence available for use, if needed */
+   tick_broadcast_ipi_handler();
return IRQ_HANDLED;
 }
 
@@ -168,14 +169,14 @@ static irqreturn_t debug_ipi_action(int irq, void *data)
 static irq_handler_t smp_ipi_action[] = {
[PPC_MSG_CALL_FUNCTION] =  call_function_action,
[PPC_MSG_RESCHEDULE] = reschedule_action,
-   [PPC_MSG_UNUSED] = unused_action,
+   [PPC_MSG_TICK_BROADCAST] = tick_broadcast_ipi_action,
[PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action,
 };
 
 const char *smp_ipi_name[] = {
[PPC_MSG_CALL_FUNCTION] =  "ipi call function",
[PPC_MSG_RESCHEDULE] = "ipi reschedule",
-   [PPC_MSG_UNUSED] = "ipi unused",
+   [PPC_MSG_TICK_BROADCAST] = "ipi tick-broadcast",
[PPC_MSG_DEBUGGER_BREAK] = "ipi debugger",
 };
 
@@ -251,6 +252,8 @@ irqreturn_t smp_ipi_demux(void)
generic_smp_call_function_interrupt();
if (all & IPI_MESSAGE(PPC_MSG_RESCHEDULE))
scheduler_ipi();
+   if (all & IPI_MESSAGE(PPC_MSG_TICK_BROADCAST))
+   tick_broadcast_ipi_handler();
if (all & IPI_MESSAGE(PPC_MSG_DEBUGGER_BREAK))
debug_ipi_action(0, NULL);
} while (info->messages);
@@ -289,6 +292,14 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask)
do_message_pass(cpu, PPC_MSG_CALL_FUNCTION);
 }
 
+void tick_broadcast(const struct cpumask *mask)
+{
+   unsigned int cpu;
+
+   for_each_cpu(cpu, mask)
+   do_message_pass(cpu, PPC_MSG_TICK_BROADCAST);
+}
+
 #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC)
 void smp_send_debugger_break(void)
 {
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index b3dab20..3ff97db 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -825,6 +825,11 @@ static void decrementer_set_mode(enum clock_event_mode mode,
decrementer_set_next_event(DECREMENTER_MAX, dev);
 }
 
+/* Interrupt handler for the timer broadcast IPI */
+void tick_broadcast_ipi_handler(void)
+{
+}
+
 static void register_decrementer_clockevent(int cpu)
 {
struct clock_event_device *dec = &per_cpu(decrementers, cpu);
diff --git a/arch/powerpc/platforms/cell/interrupt.c 
b/arch/powerpc/platforms/cell/interrupt.

[RESEND PATCH V5 0/8] cpuidle/ppc: Enable deep idle states on PowerNV

2014-01-21 Thread Preeti U Murthy

On PowerPC, when CPUs enter certain deep idle states, the local timers stop
and the time base could go out of sync with the rest of the cores in the system.

This patchset adds support to wake up CPUs in such idle states by
broadcasting IPIs to them at their next timer events using the tick broadcast
framework in the Linux kernel. We refer to these IPIs as the tick
broadcast IPIs in this patchset.

However, the tick broadcast framework as it exists today relies on an
external clock device to wake up CPUs in such idle states, and not all
PowerPC implementations provide such a device.

Hence Patch[6/8]:
[time/cpuidle: Support in tick broadcast framework for archs without external
clock device] adds support in the tick broadcast framework for such
use cases by queuing a hrtimer on one of the CPUs, which is meant to handle
the wakeup of CPUs in deep idle states.
This patch was posted separately at: https://lkml.org/lkml/2013/12/12/687.

Patches 1-3 add support in powerpc to hook onto the tick broadcast framework.

The patchset also includes support for resyncing the time base with the rest
of the cores in the system and context management for fast sleep. PATCH[4/8]
and PATCH[5/8] address these issues.

With the required support for deep idle states thus in place, the
patchset adds the "Fast-Sleep" idle state into cpuidle (Patches 7 and 8).
"Fast-Sleep" is a deep idle state on Power8 in which the above mentioned
challenges exist. Fast-Sleep can yield significantly more power
savings than the idle states that we have in cpuidle so far.

This patchset is based on Ben's ppc next branch at commit fac515db45207718
[Merge remote-tracking branch 'scott/next' into next], and the
cpuidle driver for powernv posted by Deepthi Dharwar:
https://lkml.org/lkml/2014/1/14/172. The same patchset, without the merge
conflicts against Ben's ppc next branch resolved, was posted earlier
at http://lkml.org/lkml/2014/1/15/70. This repost resolves those merge
conflicts. Besides, the earlier post was based and tested on a mainline
commit that was quite old.

However, the patchset posted earlier at http://lkml.org/lkml/2014/1/15/70,
along with Deepthi's patches on the cpuidle driver for powernv, applies
cleanly on the mainline kernel at commit 85ce70fdf48aa290b484531
dated Jan 16 2014 and has been tested on the same at the time of this repost.


Changes in V5: The primary change in this version is in Patch[6/8].
As per the discussions on the V4 posting of this patchset, it was decided to
refine the handling of the wakeup of CPUs in fast-sleep as follows:

1. In V4, a polling mechanism was used by the CPU handling broadcast to
find out the time of next wakeup of the CPUs in deep idle states. V5 avoids
polling in the way described under PATCH[6/8] in this patchset.

2. The mechanism of broadcast handling of CPUs in deep idle in the absence of
an external wakeup device should be generic and not arch-specific code. Hence
in this version this functionality has been integrated into the tick
broadcast framework in the kernel, unlike before, where it was handled in
powerpc-specific code.

3. It was suggested that the "broadcast cpu" can be the time keeping cpu
itself. However this has challenges of its own:

 a. The time keeping cpu need not exist when all cpus are idle. Hence there
are phases in time when time keeping cpu is absent. But for the use case that
this patchset is trying to address we rely on the presence of a broadcast cpu
all the time.

 b. The nomination and un-assignment of the time keeping cpu is not protected
by a lock today and need not be as well since such is its use case in the
kernel. However we would need locks if we double up the time keeping cpu as the
broadcast cpu.

Hence the broadcast cpu is independent of the time-keeping cpu. However,
PATCH[6/8] proposes a simpler solution to pick a broadcast cpu in this
version.



Changes in V4: https://lkml.org/lkml/2013/11/29/97

1. Add Fast Sleep CPU idle state on PowerNV.

2. Add the required context management for Fast Sleep and the call to OPAL
to synchronize time base after wakeup from fast sleep.

4. Add parsing of CPU idle states from the device tree to populate the
cpuidle state table.

5. Rename ambiguous functions in the code around waking up of CPUs from fast
sleep.

6. Fixed a bug in re-programming of the hrtimer that is queued to wakeup the
CPUs in fast sleep and modified Changelogs.

7. Added the ARCH_HAS_TICK_BROADCAST option. This signifies that we have an
arch-specific function to perform broadcast.


Changes in V3:
http://thread.gmane.org/gmane.linux.power-management.general/38113

1. Fix the way in which a broadcast ipi is handled on the idling cpus. Timer
handling on a broadcast ipi is being done now without missing out any timer
stats generation.

2. Fix a bug in the programming of the hrtimer meant to do broadcast. Program
it to trigger at the earlier of a "broadcast period", and the next wake

Re: [PATCH] swap: do not skip lowest_bit in scan_swap_map() scan loop

2014-01-21 Thread Hugh Dickins

On Tue, 21 Jan 2014, Jamie Liu wrote:

> In the second half of scan_swap_map()'s scan loop, offset is set to
> si->lowest_bit and then incremented before entering the loop for the
> first time, causing si->swap_map[si->lowest_bit] to be skipped.
> 
> Signed-off-by: Jamie Liu 

Acked-by: Hugh Dickins 

Good catch.  At first I was puzzled that this off-by-one could have
gone unnoticed for so long (ever since 2.6.29); but now I think that
almost always we have a good amount of slack, in those pages duplicated
between swap and swapcache, which can be reclaimed at the vm_swap_full()
check, and so conceal this loss of a single slot.
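
For anyone following along, here is a minimal userspace sketch of the off-by-one, not the kernel code itself. The names scan_preincrement() and scan_fixed() are made up for illustration, and map[] merely stands in for si->swap_map; locking and the latency handling are left out.

```c
/*
 * Toy model of the second half of the scan loop: look for a free (zero)
 * slot in map[], starting at lowest_bit and stopping before scan_base.
 * Both helpers return the offset of the free slot, or -1 if none is found.
 */

/* Old form: offset is incremented before the first test, so the slot at
 * lowest_bit itself is never examined. */
static long scan_preincrement(const unsigned char *map,
			      unsigned long lowest_bit,
			      unsigned long scan_base)
{
	unsigned long offset = lowest_bit;

	while (++offset < scan_base) {
		if (!map[offset])
			return (long)offset;
	}
	return -1;
}

/* Fixed form: test first, increment at the end of each iteration. */
static long scan_fixed(const unsigned char *map,
		       unsigned long lowest_bit,
		       unsigned long scan_base)
{
	unsigned long offset = lowest_bit;

	while (offset < scan_base) {
		if (!map[offset])
			return (long)offset;
		offset++;
	}
	return -1;
}
```

When the only free slot below scan_base sits exactly at lowest_bit, the first helper reports nothing free while the second finds it.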

> ---
>  mm/swapfile.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 612a7c9..6635081 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -616,7 +616,7 @@ scan:
>   }
>   }
>   offset = si->lowest_bit;
> - while (++offset < scan_base) {
> + while (offset < scan_base) {
>   if (!si->swap_map[offset]) {
>   spin_lock(&si->lock);
>   goto checks;
> @@ -629,6 +629,7 @@ scan:
>   cond_resched();
>   latency_ration = LATENCY_LIMIT;
>   }
> + offset++;
>   }
>   spin_lock(&si->lock);
>  
> -- 
> 1.8.5.3
> 
> 

Re: [RFC PATCH V3 2/2] fs : Add sanity checks for block size > PAGE_SIZE

2014-01-21 Thread Raghavendra K T


On 01/22/2014 01:51 AM, Andrew Morton wrote:
> On Tue, 21 Jan 2014 17:00:00 +0530 Raghavendra K T wrote:
>
>> We could hit null pointer dereference error during alloc_page_buffers
>> in : (1) block size > PAGE_SIZE (2) low memory.
>> Add sanity check for that.
>>
>> Signed-off-by: Raghavendra K T 
>> ---
>>   fs/block_dev.c | 1 +
>>   fs/buffer.c| 6 ++
>>   2 files changed, 7 insertions(+)
>>
>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>> index 1e86823..2481d42 100644
>> --- a/fs/block_dev.c
>> +++ b/fs/block_dev.c
>> @@ -1027,6 +1027,7 @@ void bd_set_size(struct block_device *bdev, loff_t size)
>> break;
>> bsize <<= 1;
>> }
>> +   BUG_ON(bsize > PAGE_SIZE);
>> bdev->bd_block_size = bsize;
>> bdev->bd_inode->i_blkbits = blksize_bits(bsize);
>>   }
>
> alloc_page_buffers() will always return NULL if passed size >=
> PAGE_SIZE.  So if we're going to add a check, it would be better to add
> it to alloc_page_buffers() because that will catch errors from the
> widest range of callsites.

In that case how about converting BUG_ON to setting a default value of
PAGE_SIZE for bs in bd_set_size() itself (with a warning)?

> But alloc_page_buffers() is pretty frequently called and I'd be
> inclined to not add any check - most callers will just go oops and that
> will provide basically the same information.

Agree with this concern.

>> --- a/fs/buffer.c
>> +++ b/fs/buffer.c
>> @@ -1571,6 +1571,12 @@ void create_empty_buffers(struct page *page,
>> struct buffer_head *bh, *head, *tail;
>>
>> head = alloc_page_buffers(page, blocksize, 1);
>> +
>> +   /*
>> +* alloc_page_buffers() could return NULL on (1) bs > PAGE_SIZE
>> +* (2) low memory case. Ensure that we don't dereference null ptr
>> +*/
>> +   BUG_ON(!head);
>
> This is unneeded.
>
> - bs > PAGE_SIZE can be checked elsewhere in a direct fashion
>
> - low memory case can't happen - we passed retry=1
>
> - create_empty_buffers() will immediately go oops if head==NULL.
>    That oops contains the same info as is presented by a BUG().

Okay.


Re: [PATCH RFC 00/73] tree-wide: clean up some no longer required #include

2014-01-21 Thread Stephen Rothwell

Hi Paul,

On Tue, 21 Jan 2014 16:22:03 -0500 Paul Gortmaker wrote:
>
> Where: This work exists as a queue of patches that I apply to
> linux-next; since the changes are fixing some things that currently
> can only be found there.  The patch series can be found at:
> 
>http://git.kernel.org/cgit/linux/kernel/git/paulg/init.git
>git://git.kernel.org/pub/scm/linux/kernel/git/paulg/init.git
> 
> I've avoided annoying Stephen with another queue of patches for
> linux-next while the development content was in flux, but now that
> the merge window has opened, and new additions are fewer, perhaps he
> wouldn't mind tacking it on the end...  Stephen?

OK, I have added this to the end of linux-next today - we will see how we
go.  It is called "init".

Thanks for adding your subsystem tree as a participant of linux-next.  As
you may know, this is not a judgment of your code.  The purpose of
linux-next is for integration testing and to lower the impact of
conflicts between subsystems in the next merge window. 

You will need to ensure that the patches/commits in your tree/series have
been:
 * submitted under GPL v2 (or later) and include the Contributor's
Signed-off-by,
 * posted to the relevant mailing list,
 * reviewed by you (or another maintainer of your subsystem tree),
 * successfully unit tested, and 
 * destined for the current or next Linux merge window.

Basically, this should be just what you would send to Linus (or ask him
to fetch).  It is allowed to be rebased if you deem it necessary.

-- 
Cheers,
Stephen Rothwell 
s...@canb.auug.org.au

Legal Stuff:
By participating in linux-next, your subsystem tree contributions are
public and will be included in the linux-next trees.  You may be sent
e-mail messages indicating errors or other issues when the
patches/commits from your subsystem tree are merged and tested in
linux-next.  These messages may also be cross-posted to the linux-next
mailing list, the linux-kernel mailing list, etc.  The linux-next tree
project and IBM (my employer) make no warranties regarding the linux-next
project, the testing procedures, the results, the e-mails, etc.  If you
don't agree to these ground rules, let me know and I'll remove your tree
from participation in linux-next.


Re: [patch 9/9] mm: keep page cache radix tree nodes in check

2014-01-21 Thread Johannes Weiner

On Wed, Jan 22, 2014 at 02:06:07PM +1100, Dave Chinner wrote:
> On Tue, Jan 21, 2014 at 12:50:17AM -0500, Johannes Weiner wrote:
> > On Tue, Jan 21, 2014 at 02:03:58PM +1100, Dave Chinner wrote:
> > > On Mon, Jan 20, 2014 at 06:17:37PM -0500, Johannes Weiner wrote:
> > > > On Fri, Jan 17, 2014 at 11:05:17AM +1100, Dave Chinner wrote:
> > > > > On Fri, Jan 10, 2014 at 01:10:43PM -0500, Johannes Weiner wrote:
> > > > > > +static struct shrinker workingset_shadow_shrinker = {
> > > > > > +   .count_objects = count_shadow_nodes,
> > > > > > +   .scan_objects = scan_shadow_nodes,
> > > > > > +   .seeks = DEFAULT_SEEKS * 4,
> > > > > > +   .flags = SHRINKER_NUMA_AWARE,
> > > > > > +};
> > > > > 
> > > > > Can you add a comment explaining how you calculated the .seeks
> > > > > > value? It's important to document the weightings/importance
> > > > > > we give to slab reclaim so we can determine if it's actually
> > > > > > achieving the desired balance under different loads...
> > > > 
> > > > This is not an exact science, to say the least.
> > > 
> > > I know, that's why I asked it be documented rather than be something
> > > kept in your head.
> > > 
> > > > The shadow entries are mostly self-regulated, so I don't want the
> > > > shrinker to interfere while the machine is just regularly trimming
> > > > caches during normal operation.
> > > > 
> > > > It should only kick in when either a) reclaim is picking up and the
> > > > scan-to-reclaim ratio increases due to mapped pages, dirty cache,
> > > > swapping etc. or b) the number of objects compared to LRU pages
> > > > becomes excessive.
> > > > 
> > > > I think that is what most shrinkers with an elevated seeks value want,
> > > > but this translates very awkwardly (and not completely) to the current
> > > > cost model, and we should probably rework that interface.
> > > > 
> > > > "Seeks" currently encodes 3 ratios:
> > > > 
> > > >   1. the cost of creating an object vs. a page
> > > > 
> > > >   2. the expected number of objects vs. pages
> > > 
> > > It doesn't encode that at all. If it did, then the default value
> > > wouldn't be "2".
> > >
> > > >   3. the cost of reclaiming an object vs. a page
> > > 
> > > Which, when you consider #3 in conjunction with #1, the actual
> > > intended meaning of .seeks is "the cost of replacing this object in
> > > the cache compared to the cost of replacing a page cache page."
> > 
> > But what it actually seems to do is translate scan rate from LRU pages
> > to scan rate in another object pool.  The actual replacement cost
> > varies based on hotness of each set, an in-use object is more
> > expensive to replace than a cold page and vice versa, the dentry and
> > inode shrinkers reflect this by rotating hot objects and refusing to
> > actually reclaim items while they are in active use.
> 
> Right, but so does the page cache when the page referenced bit is
> seen by the LRU scanner. That's a scanned page, so what is passed to
> shrink_slab is a ratio of pages scanned vs pages eligible for
> reclaim. IOWs, the fact that the slab caches rotate rather than
> reclaim is irrelevant - what matters is the same proportional
> pressure is applied to the slab cache that was applied to the page
> cache

Oh, but it does.  You apply the same pressure to both, but the actual
reclaim outcome depends on object valuation measures specific to each
pool (e.g. recently referenced or not), whereas my shrinker takes
sc->nr_to_scan objects and reclaims them without looking at their
individual value, which varies just like the value of slab objects
varies.

I thought I could compensate for the lack of object valuation in the
shadow shrinker by tweaking that fixed pressure factor between page
cache and shadow entries, but I'm no longer convinced this can work.

One thing that does affect the value of shadow entries is the overall
health of the system, memory-wise, so reclaim efficiency would be one
factor that affects individual object value, albeit a secondary one.

The most obvious value factor is whether the shadow entries in a node
are expired or not, but there are potentially 64 of them, potentially
from different zones with different "inactive ages" atomic_t's, so
that is fairly expensive to assess.
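
To make the effect of .seeks concrete, here is a rough userspace sketch of the proportional-pressure arithmetic under discussion. It assumes the shrinker scan target is computed as (4 * pages scanned / seeks) * freeable objects / (LRU pages + 1), which is how shrink_slab() is believed to work in kernels of this era; the helper name shrinker_scan_delta() and its parameters are hypothetical, and only DEFAULT_SEEKS == 2 is taken from the thread.

```c
/*
 * Sketch only: translate page cache scan pressure into a number of
 * objects to scan in another pool. The formula is an assumption:
 *
 *   delta = (4 * nr_pages_scanned / seeks) * freeable / (lru_pages + 1)
 *
 * A larger seeks value means proportionally less pressure on the pool.
 */
#define DEFAULT_SEEKS 2

static unsigned long long shrinker_scan_delta(unsigned long nr_pages_scanned,
					      unsigned long lru_pages,
					      unsigned long freeable,
					      int seeks)
{
	unsigned long long delta;

	delta = (4ULL * nr_pages_scanned) / seeks;	/* scale by object cost */
	delta *= freeable;				/* size of the object pool */
	delta /= lru_pages + 1;				/* relative to the LRU size */
	return delta;
}
```

Under that assumption, a shrinker with .seeks = DEFAULT_SEEKS * 4 is asked to scan a quarter of what a default shrinker with the same number of objects would be asked to scan, regardless of the value of any individual object.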

> > So I am having a hard time deriving a meaningful value out of this
> > definition for my usecase because I want to push back objects based on
> > reclaim efficiency (scan rate vs. reclaim rate).  The other shrinkers
> > with non-standard seek settings reek of magic number as well, which
> > suggests I am not alone with this.
> 
> Right, which is exactly why I'm asking you to document it. I've got
> no idea how other subsystems have come up with their magic numbers
> because they are not documented, and so it's just about impossible
> to determine what the author of the code really needed and hence the
> best way to improve the interface is difficult to determine.
> 
> > I wonder if we can come up with a better interface that allows both
>

[PATCH] drivers: xen: deaggressive selfballoon driver

2014-01-21 Thread Bob Liu

The current xen-selfballoon driver is too aggressive, which may cause OOM to be
triggered more often. E.g. this bug reported by James:
https://lkml.org/lkml/2013/11/21/158

There are two main reasons:
1) The original goal_page didn't consider some pages used by kernel space, like
slab pages and pages used by device drivers.

2) The balloon driver may not give back memory to the guest OS fast enough when
the workload suddenly acquires a lot of physical memory.

In both cases, the guest OS will suffer from memory pressure and OOM may
be triggered.

The fix is to make the xen-selfballoon driver less aggressive by adding an extra
10% of total ram pages to goal_page.
It's more valuable to keep the guest system reliable and responsive than to
balloon out these 10% of pages to XEN.

Signed-off-by: Bob Liu 
---
 drivers/xen/xen-selfballoon.c |   22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/xen/xen-selfballoon.c b/drivers/xen/xen-selfballoon.c
index 21e18c1..745ad79 100644
--- a/drivers/xen/xen-selfballoon.c
+++ b/drivers/xen/xen-selfballoon.c
@@ -175,6 +175,7 @@ static void frontswap_selfshrink(void)
 #endif /* CONFIG_FRONTSWAP */
 
 #define MB2PAGES(mb)   ((mb) << (20 - PAGE_SHIFT))
+#define PAGES2MB(pages) ((pages) >> (20 - PAGE_SHIFT))
 
 /*
  * Use current balloon size, the goal (vm_committed_as), and hysteresis
@@ -525,6 +526,7 @@ EXPORT_SYMBOL(register_xen_selfballooning);
 int xen_selfballoon_init(bool use_selfballooning, bool use_frontswap_selfshrink)
 {
bool enable = false;
+   unsigned long reserve_pages;
 
if (!xen_domain())
return -ENODEV;
@@ -549,6 +551,26 @@ int xen_selfballoon_init(bool use_selfballooning, bool use_frontswap_selfshrink)
if (!enable)
return -ENODEV;
 
+   /*
+* Give selfballoon_reserved_mb a default value(10% of total ram pages)
+* to make selfballoon not so aggressive.
+*
+* There are mainly two reasons:
+* 1) The original goal_page didn't consider some pages used by kernel
+*space, like slab pages and memory used by device drivers.
+*
+* 2) The balloon driver may not give back memory to guest OS fast
+*enough when the workload suddenly acquires a lot of physical memory.
+*
+* In both cases, the guest OS will suffer from memory pressure and
+* OOM killer may be triggered.
+* By reserving extra 10% of total ram pages, we can keep the system
+* much more reliable and responsive in some cases.
+*/
+   if (!selfballoon_reserved_mb) {
+   reserve_pages = totalram_pages / 10;
+   selfballoon_reserved_mb = PAGES2MB(reserve_pages);
+   }
schedule_delayed_work(&selfballoon_worker, selfballoon_interval * HZ);
 
return 0;
-- 
1.7.10.4
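
A small userspace sketch of the default-reserve arithmetic from the patch above, for checking the numbers by hand. It assumes 4 KiB pages (PAGE_SHIFT of 12); default_reserved_mb() is a hypothetical helper that mirrors the new block in xen_selfballoon_init(), not a function in the driver.

```c
#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

#define MB2PAGES(mb)	((mb) << (20 - PAGE_SHIFT))
#define PAGES2MB(pages)	((pages) >> (20 - PAGE_SHIFT))

/*
 * If no reserve was configured (reserved_mb == 0), default to 10% of
 * total RAM, expressed in megabytes and rounded down. A value set by
 * the administrator is left untouched.
 */
static unsigned long default_reserved_mb(unsigned long totalram_pages,
					 unsigned long reserved_mb)
{
	if (!reserved_mb)
		reserved_mb = PAGES2MB(totalram_pages / 10);
	return reserved_mb;
}
```

For a 4 GiB guest (1048576 pages of 4 KiB) this gives a 409 MB reserve, since both the division by ten and the shift round down.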


[PATCHv2 2/2] of: fix of_update_property()

2014-01-21 Thread Xiubo Li

of_update_property() is intended to update a property in a node and,
if the property does not exist, to add it.

The second search for the property may fail to find it, because it may
have been removed by another thread just before the second search began.

Use __of_find_property() and __of_add_property() instead and
move them inside the locked section.

Signed-off-by: Xiubo Li 
Cc: Pantelis Antoniou 
---
 drivers/of/base.c | 36 ++--
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index b86b77a..458072d 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1573,7 +1573,7 @@ int of_update_property(struct device_node *np, struct property *newprop)
 {
struct property **next, *oldprop;
unsigned long flags;
-   int rc, found = 0;
+   int rc = 0;
 
rc = of_property_notify(OF_RECONFIG_UPDATE_PROPERTY, np, newprop);
if (rc)
@@ -1582,36 +1582,28 @@ int of_update_property(struct device_node *np, struct property *newprop)
if (!newprop->name)
return -EINVAL;
 
-   oldprop = of_find_property(np, newprop->name, NULL);
-   if (!oldprop)
-   return of_add_property(np, newprop);
-
raw_spin_lock_irqsave(&devtree_lock, flags);
-   next = &np->properties;
-   while (*next) {
-   if (*next == oldprop) {
-   /* found the node */
-   newprop->next = oldprop->next;
-   *next = newprop;
-   oldprop->next = np->deadprops;
-   np->deadprops = oldprop;
-   found = 1;
-   break;
-   }
-   next = &(*next)->next;
+   oldprop = __of_find_property(np, newprop->name, NULL);
+   if (!oldprop) {
+   /* add the node */
+   rc = __of_add_property(np, newprop);
+   } else {
+   /* replace the node */
+   next = &oldprop;
+   newprop->next = oldprop->next;
+   *next = newprop;
+   oldprop->next = np->deadprops;
+   np->deadprops = oldprop;
}
raw_spin_unlock_irqrestore(&devtree_lock, flags);
 
-   if (!found)
-   return -ENODEV;
-
 #ifdef CONFIG_PROC_DEVICETREE
/* try to add to proc as well if it was initialized */
-   if (np->pde)
+   if (!rc && np->pde)
proc_device_tree_update_prop(np->pde, newprop, oldprop);
 #endif /* CONFIG_PROC_DEVICETREE */
 
-   return 0;
+   return rc;
 }
 
 #if defined(CONFIG_OF_DYNAMIC)
-- 
1.8.4
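
As an illustration of the "find and replace, or else add, in one pass" idea, here is a minimal userspace sketch on a plain singly linked list. struct prop, struct node and update_prop() are simplified, hypothetical stand-ins for the kernel structures, and locking is omitted: in the kernel the whole walk would run with devtree_lock held. Note that the pointer-to-pointer has to address the link stored in the list itself (the head or a predecessor's next field); the address of a local copy of the node pointer would leave the list unchanged.

```c
#include <stddef.h>
#include <string.h>

struct prop {
	const char *name;
	struct prop *next;
};

struct node {
	struct prop *properties;	/* live properties */
	struct prop *deadprops;		/* replaced properties are parked here */
};

/*
 * Replace the property with the same name as newprop, or append newprop
 * if no such property exists. Returns 1 if an old property was replaced,
 * 0 if newprop was appended.
 */
static int update_prop(struct node *np, struct prop *newprop)
{
	struct prop **next = &np->properties;

	while (*next) {
		if (strcmp((*next)->name, newprop->name) == 0) {
			struct prop *oldprop = *next;

			newprop->next = oldprop->next;	/* take over the tail */
			*next = newprop;		/* relink the list itself */
			oldprop->next = np->deadprops;	/* park the old one */
			np->deadprops = oldprop;
			return 1;
		}
		next = &(*next)->next;
	}

	/* not found: *next is the terminating link, so append here */
	newprop->next = NULL;
	*next = newprop;
	return 0;
}
```

Because lookup and modification happen in the same walk, there is no window between "find" and "replace" in which another thread could remove the property, provided the caller holds the lock for the duration.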



[PATCHv2 1/2] of: add __of_add_property() without lock operations

2014-01-21 Thread Xiubo Li

Two places will use the same code for adding a new property to a DT node.
Add __of_add_property() in preparation for fixing the bug in
of_update_property().

Signed-off-by: Xiubo Li 
---
 drivers/of/base.c | 38 --
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index f807d0e..b86b77a 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1469,11 +1469,31 @@ static int of_property_notify(int action, struct device_node *np,
 #endif
 
 /**
+ * __of_add_property - Add a property to a node without lock operations
+ */
+static int __of_add_property(struct device_node *np, struct property *prop)
+{
+   struct property **next;
+
+   prop->next = NULL;
+   next = &np->properties;
+   while (*next) {
+   if (strcmp(prop->name, (*next)->name) == 0)
+   /* duplicate ! don't insert it */
+   return -EEXIST;
+
+   next = &(*next)->next;
+   }
+   *next = prop;
+
+   return 0;
+}
+
+/**
  * of_add_property - Add a property to a node
  */
 int of_add_property(struct device_node *np, struct property *prop)
 {
-   struct property **next;
unsigned long flags;
int rc;
 
@@ -1481,27 +1501,17 @@ int of_add_property(struct device_node *np, struct property *prop)
if (rc)
return rc;
 
-   prop->next = NULL;
raw_spin_lock_irqsave(&devtree_lock, flags);
-   next = &np->properties;
-   while (*next) {
-   if (strcmp(prop->name, (*next)->name) == 0) {
-   /* duplicate ! don't insert it */
-   raw_spin_unlock_irqrestore(&devtree_lock, flags);
-   return -1;
-   }
-   next = &(*next)->next;
-   }
-   *next = prop;
+   rc = __of_add_property(np, prop);
raw_spin_unlock_irqrestore(&devtree_lock, flags);
 
 #ifdef CONFIG_PROC_DEVICETREE
/* try to add to proc as well if it was initialized */
-   if (np->pde)
+   if (!rc && np->pde)
proc_device_tree_add_prop(np->pde, prop);
 #endif /* CONFIG_PROC_DEVICETREE */
 
-   return 0;
+   return rc;
 }
 
 /**
-- 
1.8.4
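
The split into a lock-free helper plus a locking wrapper can be sketched in userspace as follows. This is only an illustration under stated assumptions: struct prop, struct node, add_prop_unlocked() and add_prop() are hypothetical names, and a C11 atomic_flag spinlock stands in for devtree_lock and raw_spin_lock_irqsave().

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

struct prop {
	const char *name;
	struct prop *next;
};

struct node {
	struct prop *properties;
};

/* stand-in for devtree_lock; a real kernel would use a raw spinlock */
static atomic_flag tree_lock = ATOMIC_FLAG_INIT;

/* Caller must hold tree_lock. Appends prop unless the name already exists. */
static int add_prop_unlocked(struct node *np, struct prop *prop)
{
	struct prop **next = &np->properties;

	prop->next = NULL;
	while (*next) {
		if (strcmp(prop->name, (*next)->name) == 0)
			return -EEXIST;	/* duplicate, don't insert it */
		next = &(*next)->next;
	}
	*next = prop;
	return 0;
}

/* Public entry point: takes the lock and delegates to the helper. */
static int add_prop(struct node *np, struct prop *prop)
{
	int rc;

	while (atomic_flag_test_and_set(&tree_lock))
		;	/* spin until the lock is free */
	rc = add_prop_unlocked(np, prop);
	atomic_flag_clear(&tree_lock);

	return rc;
}
```

The benefit of the split is that another function which already holds the lock can call the unlocked helper directly, and the helper has a single exit path per outcome, so the wrapper never has to unlock in more than one place.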



[PATCH 1/2] net/ipv4: queue work on power efficient wq

2014-01-21 Thread Viresh Kumar

The workqueue used in the ipv4 layer has no real dependency on running its work
on the cpu which scheduled it.

On an idle system, it is observed that an idle cpu wakes up many times just to
service this work. It would be better if we can schedule it on a cpu which the
scheduler believes to be the most appropriate one.

This patch replaces normal workqueues with power efficient versions. This
doesn't change existing behavior of code unless CONFIG_WQ_POWER_EFFICIENT is
enabled.

Signed-off-by: Viresh Kumar 
---
Initial support for power-efficient workqueues was added here:
https://lkml.org/lkml/2013/4/24/215

 net/ipv4/devinet.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 646023b..ac2dff3 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -474,7 +474,7 @@ static int __inet_insert_ifa(struct in_ifaddr *ifa, struct nlmsghdr *nlh,
inet_hash_insert(dev_net(in_dev->dev), ifa);
 
cancel_delayed_work(&check_lifetime_work);
-   schedule_delayed_work(&check_lifetime_work, 0);
+   queue_delayed_work(system_power_efficient_wq, &check_lifetime_work, 0);
 
/* Send message first, then call notifier.
   Notifier will trigger FIB update, so that
@@ -684,7 +684,8 @@ static void check_lifetime(struct work_struct *work)
if (time_before(next_sched, now + ADDRCONF_TIMER_FUZZ_MAX))
next_sched = now + ADDRCONF_TIMER_FUZZ_MAX;
 
-   schedule_delayed_work(&check_lifetime_work, next_sched - now);
+   queue_delayed_work(system_power_efficient_wq, &check_lifetime_work,
+   next_sched - now);
 }
 
 static void set_ifa_lifetime(struct in_ifaddr *ifa, __u32 valid_lft,
@@ -842,7 +843,8 @@ static int inet_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh)
ifa = ifa_existing;
set_ifa_lifetime(ifa, valid_lft, prefered_lft);
cancel_delayed_work(&check_lifetime_work);
-   schedule_delayed_work(&check_lifetime_work, 0);
+   queue_delayed_work(system_power_efficient_wq,
+   &check_lifetime_work, 0);
rtmsg_ifa(RTM_NEWADDR, ifa, nlh, NETLINK_CB(skb).portid);
blocking_notifier_call_chain(&inetaddr_chain, NETDEV_UP, ifa);
}
@@ -2322,7 +2324,7 @@ void __init devinet_init(void)
register_gifconf(PF_INET, inet_gifconf);
register_netdevice_notifier(&ip_netdev_notifier);
 
-   schedule_delayed_work(&check_lifetime_work, 0);
+   queue_delayed_work(system_power_efficient_wq, &check_lifetime_work, 0);
 
rtnl_af_register(&inet_af_ops);
 
-- 
1.7.12.rc2.18.g61b472e


[PATCH 2/2] net/neighbour: queue work on power efficient wq

2014-01-21 Thread Viresh Kumar

The workqueue used in the neighbour layer has no real dependency on running its
work on the cpu which scheduled it.

On an idle system, it is observed that an idle cpu wakes up many times just to
service this work. It would be better if we can schedule it on a cpu which the
scheduler believes to be the most appropriate one.

This patch replaces normal workqueues with power efficient versions. This
doesn't change existing behavior of code unless CONFIG_WQ_POWER_EFFICIENT is
enabled.

Signed-off-by: Viresh Kumar 
---
 net/core/neighbour.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index f8012fe..b9e9e0d 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -828,7 +828,7 @@ out:
 * ARP entry timeouts range from 1/2 BASE_REACHABLE_TIME to 3/2
 * BASE_REACHABLE_TIME.
 */
-   schedule_delayed_work(&tbl->gc_work,
+   queue_delayed_work(system_power_efficient_wq, &tbl->gc_work,
  NEIGH_VAR(&tbl->parms, BASE_REACHABLE_TIME) >> 1);
write_unlock_bh(&tbl->lock);
 }
@@ -1565,7 +1565,8 @@ static void neigh_table_init_no_netlink(struct neigh_table *tbl)
 
rwlock_init(&tbl->lock);
INIT_DEFERRABLE_WORK(&tbl->gc_work, neigh_periodic_work);
-   schedule_delayed_work(&tbl->gc_work, tbl->parms.reachable_time);
+   queue_delayed_work(system_power_efficient_wq, &tbl->gc_work,
+   tbl->parms.reachable_time);
setup_timer(&tbl->proxy_timer, neigh_proxy_process, (unsigned long)tbl);
skb_queue_head_init_class(&tbl->proxy_queue,
&neigh_table_proxy_queue_class);
-- 
1.7.12.rc2.18.g61b472e


Re: [PATCH] cpufreq: Align all CPUs to the same frequency if using shared clock

2014-01-21 Thread Viresh Kumar

On 22 January 2014 11:56, Li, Zhuangzhi  wrote:
> I don't think it's a real bug in the bootloader; the bootloader can set CPUs
> to different frequencies according to actual requirements (power saving first
> or performance first).
> The CPU freq policy is initialized in the kernel; if the kernel wants to share
> one CPU policy (using the CPUFREQ_SHARED_TYPE_ALL type), it should first
> ensure that all CPU frequencies are aligned,
> and not depend on the CPU states left by the bootloader, so that the kernel
> has better compatibility.
>
> If the kernel uses the CPUFREQ_SHARED_TYPE_ALL policy, the patch ensures the
> following:
> 1. If all CPUs are in the same P-state, it does nothing at cpufreq
> registration.
> 2. If the CPUs are in different P-states, all the other CPUs are aligned once
> to the current frequency of CPU0 according to the present policy.

I thought, as you are asking kernel to keep same freq on all of them,
then same should be true for bootloaders.

Otherwise it was okay.

Re: [PATCH] drivers/hid/wacom: fixed coding style issues

2014-01-21 Thread Dmitry Torokhov

On Tue, Jan 21, 2014 at 11:42:03PM +0100, Rob Schroer wrote:
> On Tue, Jan 21, 2014 at 01:25:54PM -0800, Joe Perches wrote:
> > On Tue, 2014-01-21 at 13:18 -0800, Dmitry Torokhov wrote:
> > > On Tue, Jan 21, 2014 at 09:29:44PM +0100, Rob Schroer wrote:
> > > > As far as I can see, kstrtoXXX() might be an alternative, but I was just
> > > > fixing coding style issues, no need to break anything IMO.
> > > 
> > > You could do the breaking in a follow up patch ;)
> > 
> > Yes please.
> > 
> > Include the breaking of multiple statements
> > into multiple lines too please like
> > 
> > from:
> > case USB_DEVICE_ID_WACOM_GRAPHIRE_BLUETOOTH:
> > rep_data[0] = 0x03; rep_data[1] = 0x00;
> > 
> > to:
> > case USB_DEVICE_ID_WACOM_GRAPHIRE_BLUETOOTH:
> > rep_data[0] = 0x03;
> > rep_data[1] = 0x00;
> > 
> > 
> 
> Added a cosmetic linebreak, switched an occurrence of sscanf to kstrtoint.
> 
> Signed-off-by: Robin Schroer 
> ---
>  drivers/hid/hid-wacom.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hid/hid-wacom.c b/drivers/hid/hid-wacom.c
> index ebcca0d..5daf80c 100644
> --- a/drivers/hid/hid-wacom.c
> +++ b/drivers/hid/hid-wacom.c
> @@ -336,7 +336,8 @@ static void wacom_set_features(struct hid_device *hdev, u8 speed)
> 
>   switch (hdev->product) {
>   case USB_DEVICE_ID_WACOM_GRAPHIRE_BLUETOOTH:
> - rep_data[0] = 0x03; rep_data[1] = 0x00;
> + rep_data[0] = 0x03;
> + rep_data[1] = 0x00;
>   limit = 3;
>   do {
>   ret = hdev->hid_output_raw_report(hdev, rep_data, 2,
> @@ -404,7 +405,7 @@ static ssize_t wacom_store_speed(struct device *dev,
>   struct hid_device *hdev = container_of(dev, struct hid_device, dev);
>   int new_speed;
> 
> - if (sscanf(buf, "%1d", &new_speed) != 1)
> + if (kstrtoint(buf, 10, &new_speed))
>   return -EINVAL;

I think this should be

error = kstrtoint(buf, 10, &new_speed);
if (error)
return error;
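
To spell out the difference, here is a userspace sketch of the "propagate the parser's error code" pattern. parse_int() is a hypothetical stand-in built on strtol() that only approximates kstrtoint() (0 on success, -EINVAL on malformed input, -ERANGE on overflow, one trailing newline tolerated); store_speed() is likewise a made-up reduction of the sysfs store handler.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical userspace approximation of kstrtoint(). Unlike the kernel
 * helper, strtol() also skips leading whitespace; that is good enough
 * for a sketch.
 */
static int parse_int(const char *s, unsigned int base, int *res)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(s, &end, (int)base);
	if (end == s)
		return -EINVAL;		/* no digits at all */
	if (*end == '\n')
		end++;			/* sysfs writes usually end in a newline */
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */
	if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
		return -ERANGE;		/* does not fit in an int */

	*res = (int)val;
	return 0;
}

/* Return the parser's own error code instead of collapsing it to -EINVAL. */
static int store_speed(const char *buf, int *speed)
{
	int new_speed;
	int error;

	error = parse_int(buf, 10, &new_speed);
	if (error)
		return error;

	if (new_speed != 0 && new_speed != 1)
		return -EINVAL;		/* parsed fine, but not a valid speed */

	*speed = new_speed;
	return 0;
}
```

With this shape the caller can tell an out-of-range number from a malformed one. Also worth keeping in mind: the old "%1d" format consumed a single digit, whereas a strict parser like this one looks at the whole buffer.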

> 
>   if (new_speed == 0 || new_speed == 1) {
> --
> 1.8.4.2
> 
> Well, I hope this works as intended.
> 
> --
> Robin

-- 
Dmitry

fanotify use after free.

2014-01-21 Thread Dave Jones

Jan,

Since yesterday's changes, I see a flood of messages from slub debug during
boot.

=
BUG fanotify_event_info (Not tainted): Poison overwritten
-

Disabling lock debugging due to kernel taint
INFO: 0x880247e45bc8-0x880247e45bcb. First byte 0x0 instead of 0x6b
INFO: Allocated in fanotify_handle_event+0x136/0x390 age=0 cpu=0 pid=293
 __slab_alloc+0x456/0x565
 kmem_cache_alloc+0x1fe/0x260
 fanotify_handle_event+0x136/0x390
 send_to_group+0xd3/0x1c0
 fsnotify+0x1c8/0x340
 open_exec+0xe2/0x120
 load_elf_binary+0x7b7/0x18e0
 search_binary_handler+0x94/0x1b0
 do_execve_common.isra.26+0x5d7/0x7d0
 SyS_execve+0x36/0x50
 stub_execve+0x69/0xa0
INFO: Freed in fanotify_free_event+0x2e/0x40 age=0 cpu=3 pid=290
 __slab_free+0x4a/0x382
 kmem_cache_free+0x1c9/0x210
 fanotify_free_event+0x2e/0x40
 fsnotify_destroy_event+0x21/0x30
 fanotify_read+0x39e/0x5e0
 vfs_read+0x9b/0x160
 SyS_read+0x58/0xb0
 tracesys+0xdd/0xe2
INFO: Slab 0xea00091f9100 objects=20 used=20 fp=0x  (null) flags=0x204080
INFO: Object 0x880247e45b90 @offset=7056 fp=0x880247e44000

Bytes b4 880247e45b80: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a
Object 880247e45b90: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Object 880247e45ba0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Object 880247e45bb0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Object 880247e45bc0: 6b 6b 6b 6b 6b 6b 6b 6b 00 00 00 00 6b 6b 6b a5  kkk.
Redzone 880247e45bd0: bb bb bb bb bb bb bb bb
Padding 880247e45d10: 5a 5a 5a 5a 5a 5a 5a 5a

CPU: 0 PID: 293 Comm: mount Tainted: G B 3.13.0+ #28
 880247e45b90 8c7fe87c 8800874cbb28 9c710632
 88024a776ac0 8800874cbb68 9c194dad 0008
 88020001 880247e45bcc 88024a776ac0 006b
Call Trace:
 [] dump_stack+0x4e/0x7a
 [] print_trailer+0x14d/0x200
 [] check_bytes_and_report+0xcf/0x110
 [] check_object+0x1d7/0x250
 [] ? fanotify_handle_event+0x136/0x390
 [] alloc_debug_processing+0x76/0x118
 [] __slab_alloc+0x456/0x565
 [] ? fanotify_handle_event+0x136/0x390
 [] ? mntput+0x24/0x40
 [] ? terminate_walk+0x69/0x70
 [] ? do_last+0x25e/0x1390
 [] ? inode_permission+0x18/0x50
 [] ? fanotify_handle_event+0x136/0x390
 [] kmem_cache_alloc+0x1fe/0x260
 [] fanotify_handle_event+0x136/0x390
 [] ? path_openat+0xcd/0x6a0
 [] send_to_group+0xd3/0x1c0
 [] ? fsnotify+0x8f/0x340
 [] fsnotify+0x1c8/0x340
 [] do_sys_open+0x19f/0x230
 [] SyS_open+0x1e/0x20
 [] tracesys+0xdd/0xe2
FIX fanotify_event_info: Restoring 0x880247e45bc8-0x880247e45bcb=0x6b
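
For readers not used to these reports, a userspace sketch of what the poison check amounts to. It assumes the usual slab poison values (0x6b for freed bytes, 0xa5 for the last byte of the object); poison_object() and check_poison() are hypothetical helpers, not the slub functions in the trace above.

```c
#include <stddef.h>
#include <string.h>

#define POISON_FREE	0x6b	/* fill pattern for a freed object */
#define POISON_END	0xa5	/* marker in the object's last byte */

/* Fill a freed object with the poison pattern. */
static void poison_object(unsigned char *obj, size_t size)
{
	memset(obj, POISON_FREE, size - 1);
	obj[size - 1] = POISON_END;
}

/*
 * Verify the pattern when the object is handed out again. Returns 1 and
 * stores the first and last corrupted offsets if any byte was modified
 * while the object was free, 0 if the pattern is intact.
 */
static int check_poison(const unsigned char *obj, size_t size,
			size_t *first, size_t *last)
{
	int bad = 0;

	for (size_t i = 0; i < size; i++) {
		unsigned char want = (i == size - 1) ? POISON_END : POISON_FREE;

		if (obj[i] != want) {
			if (!bad)
				*first = i;
			*last = i;
			bad = 1;
		}
	}
	return bad;
}
```

In the dump above the object starts at ...5b90 and the overwritten range is ...5bc8 to ...5bcb, i.e. four zero bytes at offsets 56 to 59 of a 64-byte object, written after fanotify_free_event() had already freed it.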


RE: [PATCH] cpufreq: Align all CPUs to the same frequency if using shared clock

2014-01-21 Thread Li, Zhuangzhi



> -Original Message-
> From: Viresh Kumar [mailto:viresh.ku...@linaro.org]
> Sent: Wednesday, January 22, 2014 1:18 PM
> To: Li, Zhuangzhi
> Cc: Rafael J. Wysocki; cpuf...@vger.kernel.org; linux...@vger.kernel.org;
> Linux Kernel Mailing List; Liu, Chuansheng
> Subject: Re: [PATCH] cpufreq: Align all CPUs to the same frequency if using
> shared clock
> 
> On 21 January 2014 13:42, Viresh Kumar  wrote:
> > On 21 January 2014 12:56, Li, Zhuangzhi  wrote:
> >> Thanks for reviewing.
> >
> > It's my job :)
> >
> >> Sorry for the misunderstanding; on our x86 platform, we want all the
> >> CPUs to share one policy by setting CPUFREQ_SHARED_TYPE_ALL, not to share
> >> one HW clock line.
> >
> > I see.. Then probably your patch makes sense. But it is obviously not
> > required for every platform that exists today.
> >
> > Please update it to do it only for drivers that have set
> > CPUFREQ_SHARED_TYPE_ALL..
> 
> One more thing, who has set different frequencies to these cores?
> I hope kernel hasn't ?
> 
> In that case, probably you are fixing a bootloader bug in kernel?
> What about doing this in bootloader then?
I don't think it's a real bug in the bootloader. The bootloader can set the
CPUs to different frequencies according to actual requirements (power saving
first or performance first).
The CPU frequency policies are initialized in the kernel, so if the kernel
wants to share one CPU policy (using the CPUFREQ_SHARED_TYPE_ALL type), it
should first ensure that all CPU frequencies are aligned,
rather than depend on the P-states the bootloader left the CPUs in. That gives
the kernel better compatibility.

If the kernel uses the CPUFREQ_SHARED_TYPE_ALL policy, the patch ensures the
following:
1. If all CPUs are in the same P-state, it does nothing at cpufreq registration.
2. If the CPUs are in different P-states, all the other CPUs are aligned once
to the current frequency of CPU0, according to the present policy.
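
The two cases above can be sketched in plain C. This is only an illustration of the intended behaviour, not the patch itself: struct cpu_freq_state and align_to_cpu0() are hypothetical names, and a real driver would go through the cpufreq core instead of writing to a table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-CPU frequency state; not a kernel structure. */
struct cpu_freq_state {
	unsigned int khz;	/* current frequency of this CPU, in kHz */
};

/*
 * Align every CPU to the current frequency of CPU0 (index 0).
 * Returns the number of CPUs that had to be changed; 0 means all CPUs
 * were already in the same P-state and nothing was done.
 */
static inline size_t align_to_cpu0(struct cpu_freq_state *cpus, size_t ncpus)
{
	size_t changed = 0;

	assert(cpus != NULL && ncpus > 0);

	for (size_t i = 1; i < ncpus; i++) {
		if (cpus[i].khz != cpus[0].khz) {
			cpus[i].khz = cpus[0].khz;
			changed++;
		}
	}
	return changed;
}
```

The return value makes case 1 (nothing to do) easy to tell apart from case 2 (one-time alignment).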
> 
> --
> virehs

Re: [PATCH] ext4: explain encoding of 34-bit a,c,mtime values

2014-01-21 Thread Darrick J. Wong

On Mon, Nov 11, 2013 at 07:30:18PM -0500, Theodore Ts'o wrote:
> On Sun, Nov 10, 2013 at 02:56:54AM -0500, David Turner wrote:
> > b. Use Andreas's encoding, which is incompatible with pre-1970 files
> > written on 64-bit systems.
> >
> > I don't care about currently-existing post-2038 files, because I believe
> > that nobody has a valid reason to have such files.  However, I do
> > believe that pre-1970 files are probably important to someone.
> > 
> > Despite this, I prefer option (b), because I think the simplicity is
> > valuable, and because I hate to give up date ranges (even ones that I
> > think we'll "never" need). Option (b) is not actually lossy, because we
> > could correct pre-1970 files with e2fsck; under Andreas's encoding,
> > their dates would be in the far future (and thus cannot be legitimate).
> > 
> > Would a patch that does (b) be accepted?  I would accompany it with a
> > patch to e2fsck (which I assume would also go to the ext4 developers
> > mailing list?).
> 
> I agree, I think this is the best way to go.  I'm going to drop your
> earlier patch, and wait for an updated patch from you.  It may miss
> this merge window, but as Andreas has pointed out, we still have a few
> years to get this right.  :-)

Just out of curiosity, did this (updated patch) ever happen?

--D
> 
> Thanks!!
> 
>   - Ted
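
For readers following the encoding question, here is a minimal user-space sketch of one way a signed 32-bit seconds field can be widened to 34 bits with two extra epoch bits. The thread above does not spell out either encoding, so the layout used here is an assumption: the two epoch bits count multiples of 2^32 seconds and are added to the sign-extended 32-bit value. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EPOCH_BITS 2
#define EPOCH_MASK ((1u << EPOCH_BITS) - 1)

/* Combine the on-disk 32-bit seconds with the extra epoch bits. */
static inline int64_t decode_seconds(uint32_t raw_seconds, uint32_t extra)
{
	int64_t secs = (int32_t)raw_seconds;	/* sign-extend the 32-bit field */

	secs += (int64_t)(extra & EPOCH_MASK) << 32;
	return secs;
}

/* Split a 64-bit seconds value into the 32-bit field and the epoch bits. */
static inline void encode_seconds(int64_t secs, uint32_t *raw_seconds,
				  uint32_t *epoch)
{
	/* Representable range under this assumed layout: [-2^31, 2^34 - 2^31). */
	assert(secs >= -(INT64_C(1) << 31));
	assert(secs < (INT64_C(1) << 34) - (INT64_C(1) << 31));

	*raw_seconds = (uint32_t)secs;		/* low 32 bits */
	*epoch = (uint32_t)((secs - (int32_t)*raw_seconds) >> 32) & EPOCH_MASK;
}
```

Under this assumed layout a pre-1970 time keeps epoch bits 0, so whether old files decode correctly depends on what earlier kernels actually wrote into those bits, which is the compatibility point debated above.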

Re: [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable

2014-01-21 Thread Vladimir Davydov

On 01/22/2014 02:22 AM, David Rientjes wrote:
> On Fri, 17 Jan 2014, Vladimir Davydov wrote:
>
>> The name `max_pass' is misleading, because this variable actually keeps
>> the estimate number of freeable objects, not the maximal number of
>> objects we can scan in this pass, which can be twice that. Rename it to
>> reflect its actual meaning.
>>
>> Signed-off-by: Vladimir Davydov 
>> Cc: Andrew Morton 
>> Cc: Mel Gorman 
>> Cc: Michal Hocko 
>> Cc: Johannes Weiner 
>> Cc: Rik van Riel 
>> Cc: Dave Chinner 
>> Cc: Glauber Costa 
> This doesn't compile on linux-next:
>
> mm/vmscan.c: In function ‘shrink_slab_node’:
> mm/vmscan.c:300:23: error: ‘max_pass’ undeclared (first use in this function)
> mm/vmscan.c:300:23: note: each undeclared identifier is reported only once 
> for each function it appears in
>
> because of b01fa2357bca ("mm: vmscan: shrink all slab objects if tight on 
> memory") from an author with a name remarkably similar to yours.

Oh, sorry. I thought it hadn't been committed there yet.

> Could you rebase this series on top of your previous work that is already in 
> -mm?

Sure.

Thanks.

Re: [PATCH 17/24] GFS2: Use RCU/hlist_bl based hash for quotas

2014-01-21 Thread Sasha Levin

On 01/22/2014 12:32 AM, Paul E. McKenney wrote:

> On Mon, Jan 20, 2014 at 12:23:40PM +, Steven Whitehouse wrote:
>> Prior to this patch, GFS2 kept all the quotas for each
>> super block in a single linked list. This is rather slow
>> when there are large numbers of quotas.
>>
>> This patch introduces a hlist_bl based hash table, similar
>> to the one used for glocks. The initial look up of the quota
>> is now lockless in the case where it is already cached,
>> although we still have to take the per quota spinlock in
>> order to bump the ref count. Either way though, this is a
>> big improvement on what was there before.
>>
>> The qd_lock and the per super block list is preserved, for
>> the time being. However it is intended that since this is no
>> longer used for its original role, it should be possible to
>> shrink the number of items on that list in due course and
>> remove the requirement to take qd_lock in qd_get.
>>
>> Signed-off-by: Steven Whitehouse
>> Cc: Abhijith Das
>> Cc: Paul E. McKenney
>
> Interesting!  I thought that Sasha Levin had a hash table in the works,
> but I don't see it, so CCing him.

Indeed, there has been an hlist-based hashtable at include/linux/hashtable.h
for a couple of kernel versions now. However, there's no hlist_bl one.

If there is a plan to add an hlist_bl hashtable for whatever reason, it should
probably be done by expanding hashtable.h so that more places that use hlist_bl
would benefit from it (yes, there are a couple more places that do hlist_bl
hashtables).

Also, do we really want to use hlist_bl here? It doesn't seem like it's being
done to conserve memory, and that's the only reason it should be used. Taking a
single spinlock per bucket is much more efficient than the bit-locking scheme
that hlist_bl uses.
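
To make the trade-off concrete, here is a rough user-space sketch of the idea behind bit locking: the lock lives in bit 0 of the bucket's head pointer, so a bucket needs no separate lock word. It uses C11 atomics and hypothetical names (struct bucket, bucket_trylock() and so on); it is not the kernel's hlist_bl code, only a model of the technique.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define BUCKET_LOCK_BIT ((uintptr_t)1)

/* One word per bucket: first-node pointer with the lock folded into bit 0. */
struct bucket {
	atomic_uintptr_t head;
};

/* Try to set the lock bit; succeeds only if it was previously clear. */
static inline bool bucket_trylock(struct bucket *b)
{
	uintptr_t old = atomic_fetch_or_explicit(&b->head, BUCKET_LOCK_BIT,
						 memory_order_acquire);
	return (old & BUCKET_LOCK_BIT) == 0;
}

/* Spin until the lock bit has been acquired. */
static inline void bucket_lock(struct bucket *b)
{
	while (!bucket_trylock(b))
		;	/* busy-wait; a real implementation would relax the CPU */
}

/* Clear the lock bit; the caller must currently hold the lock. */
static inline void bucket_unlock(struct bucket *b)
{
	uintptr_t old = atomic_fetch_and_explicit(&b->head, ~BUCKET_LOCK_BIT,
						  memory_order_release);
	assert(old & BUCKET_LOCK_BIT);
}

/* Return the first-node pointer with the lock bit masked off. */
static inline void *bucket_first(struct bucket *b)
{
	uintptr_t v = atomic_load_explicit(&b->head, memory_order_acquire);

	return (void *)(v & ~BUCKET_LOCK_BIT);
}
```

The memory saving is the whole point of the scheme: every reader of the head pointer has to mask the lock bit, and lock and unlock are atomic read-modify-write operations on the same word that holds the list head.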

Thanks,
Sasha

Re: [RFC] restore user defined min_free_kbytes when disabling thp

2014-01-21 Thread Han Pingtian

On Tue, Jan 21, 2014 at 10:23:51AM +, Mel Gorman wrote:
> On Tue, Jan 21, 2014 at 05:38:59PM +0800, Han Pingtian wrote:
> > The testcase 'thp04' of LTP will enable THP, do some testing, then
> > disable it if it wasn't enabled. But this will leave a different value
> > of min_free_kbytes if it has been set by admin. So I think it's better
> > to restore the user defined value after disabling THP.
> > 
> 
> Then have LTP record what min_free_kbytes was at the same time THP was
> enabled by the test and restore both settings. It leaves a window where
> an admin can set an alternative value during the test but that would also
> invalidate the test in same cases and gets filed under "don't do that".
> 

Since the value is changed in the kernel, it would be better to
restore it in the kernel, right? :)  I have a v2 patch which restores
the value only if the user hasn't set it again after THP's initialization.
This v2 patch depends on the patch 'mm: show message when updating
min_free_kbytes in thp', which has been added to the -mm tree and can be
found here:

http://ozlabs.org/~akpm/mmotm/broken-out/mm-show-message-when-updating-min_free_kbytes-in-thp.patch

please have a look. Thanks.


>From 8b79586ff9a1d85cbe45102a86888268094ec0ae Mon Sep 17 00:00:00 2001
From: Han Pingtian 
Date: Tue, 21 Jan 2014 17:24:43 +0800
Subject: [PATCH] mm: restore user defined min_free_kbytes when disabling thp

THP increases the value of min_free_kbytes during initialization, which
sometimes changes the user-defined value of min_free_kbytes. So try to
restore the value when disabling THP if it was changed during THP
initialization and has not been changed by the user after that.

Signed-off-by: Han Pingtian 
---
 mm/huge_memory.c |   10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 94a824f..fcb8ce58 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -164,6 +164,16 @@ static int start_khugepaged(void)
 	} else if (khugepaged_thread) {
 		kthread_stop(khugepaged_thread);
 		khugepaged_thread = NULL;
+
+		if (user_min_free_kbytes >= 0 &&
+		    user_min_free_kbytes != min_free_kbytes) {
+			pr_info("restore min_free_kbytes from %d to user "
+				"defined %d when stopping khugepaged\n",
+				min_free_kbytes, user_min_free_kbytes);
+
+			min_free_kbytes = user_min_free_kbytes;
+			setup_per_zone_wmarks();
+		}
 	}
 
 	return err;
-- 
1.7.7.6


[PATCH Resend] PM: Remove unnecessary !!

2014-01-21 Thread Viresh Kumar

A double ! (!!) is normally needed to get 0 or 1 out of an expression. A
comparison always evaluates to 0 or 1, so there is no need to apply a
double ! to it again.

Signed-off-by: Viresh Kumar 
---
 kernel/power/suspend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 62ee437..90b3d93 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -39,7 +39,7 @@ static const struct platform_suspend_ops *suspend_ops;
 
 static bool need_suspend_ops(suspend_state_t state)
 {
-   return !!(state > PM_SUSPEND_FREEZE);
+   return state > PM_SUSPEND_FREEZE;
 }
 
 static DECLARE_WAIT_QUEUE_HEAD(suspend_freeze_wait_head);
-- 
1.7.12.rc2.18.g61b472e
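
As a small stand-alone illustration of the point made in the changelog, the sketch below contrasts a case where !! does real work with one where it is redundant. The enum and function names are hypothetical and only loosely modelled on the patched code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state values, loosely modelled on suspend_state_t. */
enum toy_state { TOY_ON = 0, TOY_FREEZE = 1, TOY_STANDBY = 2, TOY_MEM = 3 };

/* A true relational expression is exactly 1, checked here at compile time. */
static_assert((TOY_MEM > TOY_FREEZE) == 1, "comparison yields exactly 1");

/* Here !! is useful: it collapses any non-zero bit pattern to 1. */
static inline int flag_is_set(unsigned int flags, unsigned int mask)
{
	return !!(flags & mask);
}

/* Here !! would be redundant: the comparison already yields 0 or 1. */
static inline bool above_freeze(enum toy_state state)
{
	return state > TOY_FREEZE;
}
```

Without the !! in flag_is_set(), the function would return the raw masked value (for example 0x40) and not 1.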


Re: [PATCH v4 2/2] usb: dwc3: adapt dwc3 core to use Generic PHY Framework

2014-01-21 Thread Vivek Gautam

Hi,


On Tue, Jan 21, 2014 at 7:30 PM, Roger Quadros  wrote:
> Hi Kishon,
>
> On 01/21/2014 12:11 PM, Kishon Vijay Abraham I wrote:
>> Adapted dwc3 core to use the Generic PHY Framework. So for init, exit,
>> power_on and power_off the following APIs are used phy_init(), phy_exit(),
>> phy_power_on() and phy_power_off().
>>
>> However using the old USB phy library wont be removed till the PHYs of all
>> other SoC's using dwc3 core is adapted to the Generic PHY Framework.
>>
>> Signed-off-by: Kishon Vijay Abraham I 
>> ---
>> Changes from v3:
>> * avoided using quirks
>>
>>  Documentation/devicetree/bindings/usb/dwc3.txt |6 ++-
>>  drivers/usb/dwc3/core.c|   60 
>> 
>>  drivers/usb/dwc3/core.h|7 +++
>>  3 files changed, 71 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt 
>> b/Documentation/devicetree/bindings/usb/dwc3.txt
>> index e807635..471366d 100644
>> --- a/Documentation/devicetree/bindings/usb/dwc3.txt
>> +++ b/Documentation/devicetree/bindings/usb/dwc3.txt
>> @@ -6,11 +6,13 @@ Required properties:
>>   - compatible: must be "snps,dwc3"
>>   - reg : Address and length of the register set for the device
>>   - interrupts: Interrupts used by the dwc3 controller.
>> +
>> +Optional properties:
>>   - usb-phy : array of phandle for the PHY device.  The first element
>> in the array is expected to be a handle to the USB2/HS PHY and
>> the second element is expected to be a handle to the USB3/SS PHY
>> -
>> -Optional properties:
>> + - phys: from the *Generic PHY* bindings
>> + - phy-names: from the *Generic PHY* bindings
>>   - tx-fifo-resize: determines if the FIFO *has* to be reallocated.
>>
>>  This is usually a subnode to DWC3 glue to which it is connected.
>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
>> index e009d4e..036d589 100644
>> --- a/drivers/usb/dwc3/core.c
>> +++ b/drivers/usb/dwc3/core.c
>> @@ -82,6 +82,11 @@ static void dwc3_core_soft_reset(struct dwc3 *dwc)
>>
>>   usb_phy_init(dwc->usb2_phy);
>>   usb_phy_init(dwc->usb3_phy);
>> + if (dwc->usb2_generic_phy)
>> + phy_init(dwc->usb2_generic_phy);
>
> What if phy_init() fails? You need to report and fail. Same applies for all 
> PHY apis in this patch.
>
>> + if (dwc->usb3_generic_phy)
>> + phy_init(dwc->usb3_generic_phy);
>> +
>>   mdelay(100);
>>
>>   /* Clear USB3 PHY reset */
>> @@ -343,6 +348,11 @@ static void dwc3_core_exit(struct dwc3 *dwc)
>>  {
>>   usb_phy_shutdown(dwc->usb2_phy);
>>   usb_phy_shutdown(dwc->usb3_phy);
>> + if (dwc->usb2_generic_phy)
>> + phy_exit(dwc->usb2_generic_phy);
>> + if (dwc->usb3_generic_phy)
>> + phy_exit(dwc->usb3_generic_phy);
>> +
>>  }
>>
>>  #define DWC3_ALIGN_MASK  (16 - 1)
>> @@ -433,6 +443,32 @@ static int dwc3_probe(struct platform_device *pdev)
>>   }
>>   }
>>
>> + dwc->usb2_generic_phy = devm_phy_get(dev, "usb2-phy");
>> + if (IS_ERR(dwc->usb2_generic_phy)) {
>> + ret = PTR_ERR(dwc->usb2_generic_phy);
>> + if (ret == -ENOSYS || ret == -ENODEV) {
>> + dwc->usb2_generic_phy = NULL;
>> + } else if (ret == -EPROBE_DEFER) {
>> + return ret;
>> + } else {
>> + dev_err(dev, "no usb2 phy configured\n");
>> + return ret;
>> + }
>> + }
>> +
>> + dwc->usb3_generic_phy = devm_phy_get(dev, "usb3-phy");
>> + if (IS_ERR(dwc->usb3_generic_phy)) {
>> + ret = PTR_ERR(dwc->usb3_generic_phy);
>> + if (ret == -ENOSYS || ret == -ENODEV) {
>> + dwc->usb3_generic_phy = NULL;
>> + } else if (ret == -EPROBE_DEFER) {
>> + return ret;
>> + } else {
>> + dev_err(dev, "no usb3 phy configured\n");
>> + return ret;
>> + }
>> + }
>> +
>>   dwc->xhci_resources[0].start = res->start;
>>   dwc->xhci_resources[0].end = dwc->xhci_resources[0].start +
>>   DWC3_XHCI_REGS_END;
>> @@ -482,6 +518,11 @@ static int dwc3_probe(struct platform_device *pdev)
>>   usb_phy_set_suspend(dwc->usb2_phy, 0);
>>   usb_phy_set_suspend(dwc->usb3_phy, 0);
>>
>> + if (dwc->usb2_generic_phy)
>> + phy_power_on(dwc->usb2_generic_phy);
>> + if (dwc->usb3_generic_phy)
>> + phy_power_on(dwc->usb3_generic_phy);
>> +
>
> Is it OK to power on the phy before phy_init()?

Isn't phy_init() being done before phy_power_on() in the
core_soft_reset() in this patch ?
Isn't that what you want here ?

>
> I suggest to move phy_init() from core_soft_reset() to here, just before 
> phy_power_on().

core_soft_reset() is called before phy_power_on() itself from
dwc3_core_init(), right ?
will moving the phy_init() here make na

mutual exclusion between clk_prepare_enable/clk_disable_unprepare and clk_set_parent

2014-01-21 Thread Xiaoguang Chen

Hi, Mike

We have hit an issue between clk_prepare_enable/clk_disable_unprepare and
clk_set_parent.

As we know, clk prepare/unprepare grabs the prepare lock, and clk
enable/disable grabs the enable lock. clk_set_parent grabs the prepare
lock.

But there is no lock protection across clk_prepare_enable/clk_disable_unprepare.
For example, clk_disable_unprepare is expanded as clk_disable +
clk_unprepare,

and if the condition below occurs, there will be a problem:

thread 1                            thread 2
call clk_disable_unprepare
1) clk_disable
   get enable lock
   ...
   release enable lock

                                    call clk_set_parent
                                    get prepare lock
                                    set clock's parent to another parent
                                    release prepare lock

2) clk_unprepare
   get prepare lock
   unprepare parent clock  <<--
   release prepare lock

In the above sequence, after thread 1 calls clk_disable, thread 2 changes the
clk's parent to another clock. Then, in step 2, thread 1 unprepares the
clk's new parent instead of the old one. As a result, the old parent is
never unprepared, while the new parent is unprepared even though it was
never prepared.

So how should we use clk_prepare_enable and clk_disable_unprepare?
Should we add a lock to protect this API? For example, we could take the
prepare lock inside the API, like:
clk_disable_unprepare()
{
        clk_prepare_lock();
        clk_disable();
        clk_unprepare();
        clk_prepare_unlock();
}

Is the above sequence OK? If so, I can provide a patch for this.

Thanks
Xiaoguang
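
The interleaving described in this message can be replayed deterministically with a toy model. This is a sketch under stated assumptions, not the kernel's clk framework: struct toy_clk and the toy_* helpers are hypothetical, locking is left out because the steps are replayed in a fixed order, and re-parenting is assumed to switch only the parent pointer without migrating any counts, as the scenario above supposes.

```c
#include <assert.h>
#include <stddef.h>

/* Toy clock node: reference counts propagate up to the parent. */
struct toy_clk {
	struct toy_clk *parent;
	int enable_count;
	int prepare_count;
};

/* Drop one enable reference on the clock and all its ancestors. */
static inline void toy_clk_disable(struct toy_clk *c)
{
	for (; c != NULL; c = c->parent)
		c->enable_count--;
}

/* Drop one prepare reference on the clock and all its ancestors. */
static inline void toy_clk_unprepare(struct toy_clk *c)
{
	for (; c != NULL; c = c->parent)
		c->prepare_count--;
}

/* Assumption: re-parenting only switches the pointer, no count migration. */
static inline void toy_clk_set_parent(struct toy_clk *c, struct toy_clk *p)
{
	assert(c != NULL);
	c->parent = p;
}

/*
 * Replay the interleaving from the message and return the prepare count
 * left behind on the old parent.
 */
static inline int replay_race(struct toy_clk *old_parent,
			      struct toy_clk *new_parent)
{
	struct toy_clk child = {
		.parent = old_parent, .enable_count = 1, .prepare_count = 1,
	};

	toy_clk_disable(&child);		/* thread 1, step 1 */
	toy_clk_set_parent(&child, new_parent);	/* thread 2 */
	toy_clk_unprepare(&child);		/* thread 1, step 2 */
	return old_parent->prepare_count;
}
```

In this model the old parent keeps a stale prepare reference and the new parent's count goes negative, which is the imbalance the message describes. Whether the real framework behaves this way depends on how clk_set_parent handles existing prepare and enable counts.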

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-21 Thread Alexei Starovoitov

On Tue, Jan 21, 2014 at 8:10 PM, dormando  wrote:
>
>
> On Tue, 21 Jan 2014, Alexei Starovoitov wrote:
>
>> On Tue, Jan 21, 2014 at 5:39 PM, dormando  wrote:
>> >
>> > > On Fri, Jan 17, 2014 at 11:16 PM, dormando  wrote:
>> > > >> On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote:
>> > > >> > On Fri, 2014-01-17 at 17:25 -0800, dormando wrote:
>> > > >> > > Hi,
>> > > >> > >
>> > > >> > > Upgraded a few kernels to the latest 3.10 stable tree while 
>> > > >> > > tracking down
>> > > >> > > a rare kernel panic, seems to have introduced a much more 
>> > > >> > > frequent kernel
>> > > >> > > panic. Takes anywhere from 4 hours to 2 days to trigger:
>> > > >> > >
>> > > >> > > <4>[196727.311203] general protection fault:  [#1] SMP
>> > > >> > > <4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP 
>> > > >> > > macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich 
>> > > >> > > microcode
>> ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm 
>> tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp
>> pps_core mdio
>> > > >> > > <4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 
>> > > >> > > 3.10.26 #1
>> > > >> > > <4>[196727.311344] Hardware name: Supermicro 
>> > > >> > > X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013
>> > > >> > > <4>[196727.311364] task: 885e6f069700 ti: 885e6f072000 
>> > > >> > > task.ti: 885e6f072000
>> > > >> > > <4>[196727.311377] RIP: 0010:[]  
>> > > >> > > [] ipv4_dst_destroy+0x4f/0x80
>> > > >> > > <4>[196727.311399] RSP: 0018:885effd23a70  EFLAGS: 00010282
>> > > >> > > <4>[196727.311409] RAX: dead00200200 RBX: 8854c398ecc0 
>> > > >> > > RCX: 0040
>> > > >> > > <4>[196727.311423] RDX: dead00100100 RSI: dead00100100 
>> > > >> > > RDI: dead00200200
>> > > >> > > <4>[196727.311437] RBP: 885effd23a80 R08: 815fd9e0 
>> > > >> > > R09: 885d5a590800
>> > > >> > > <4>[196727.311451] R10:  R11:  
>> > > >> > > R12: 
>> > > >> > > <4>[196727.311464] R13: 81c8c280 R14:  
>> > > >> > > R15: 880e85ee16ce
>> > > >> > > <4>[196727.311510] FS:  () 
>> > > >> > > GS:885effd2() knlGS:
>> > > >> > > <4>[196727.311554] CS:  0010 DS:  ES:  CR0: 
>> > > >> > > 80050033
>> > > >> > > <4>[196727.311581] CR2: 7a46751eb000 CR3: 005e65688000 
>> > > >> > > CR4: 000407e0
>> > > >> > > <4>[196727.311625] DR0:  DR1:  
>> > > >> > > DR2: 
>> > > >> > > <4>[196727.311669] DR3:  DR6: 0ff0 
>> > > >> > > DR7: 0400
>> > > >> > > <4>[196727.311713] Stack:
>> > > >> > > <4>[196727.311733]  8854c398ecc0 8854c398ecc0 
>> > > >> > > 885effd23ab0 815b7f42
>> > > >> > > <4>[196727.311784]  88be6595bc00 8854c398ecc0 
>> > > >> > >  8854c398ecc0
>> > > >> > > <4>[196727.311834]  885effd23ad0 815b86c6 
>> > > >> > > 885d5a590800 8816827821c0
>> > > >> > > <4>[196727.311885] Call Trace:
>> > > >> > > <4>[196727.311907]  
>> > > >> > > <4>[196727.311912]  [] dst_destroy+0x32/0xe0
>> > > >> > > <4>[196727.311959]  [] dst_release+0x56/0x80
>> > > >> > > <4>[196727.311986]  [] tcp_v4_do_rcv+0x2a5/0x4a0
>> > > >> > > <4>[196727.312013]  [] tcp_v4_rcv+0x7da/0x820
>> > > >> > > <4>[196727.312041]  [] ? 
>> > > >> > > ip_rcv_finish+0x360/0x360
>> > > >> > > <4>[196727.312070]  [] ? nf_hook_slow+0x7d/0x150
>> > > >> > > <4>[196727.312097]  [] ? 
>> > > >> > > ip_rcv_finish+0x360/0x360
>> > > >> > > <4>[196727.312125]  [] 
>> > > >> > > ip_local_deliver_finish+0xb2/0x230
>> > > >> > > <4>[196727.312154]  [] 
>> > > >> > > ip_local_deliver+0x4a/0x90
>> > > >> > > <4>[196727.312183]  [] ip_rcv_finish+0x119/0x360
>> > > >> > > <4>[196727.312212]  [] ip_rcv+0x22b/0x340
>> > > >> > > <4>[196727.312242]  [] ? 
>> > > >> > > macvlan_broadcast+0x160/0x160 [macvlan]
>> > > >> > > <4>[196727.312275]  [] 
>> > > >> > > __netif_receive_skb_core+0x512/0x640
>> > > >> > > <4>[196727.312308]  [] ? 
>> > > >> > > kmem_cache_alloc+0x13b/0x150
>> > > >> > > <4>[196727.312338]  [] 
>> > > >> > > __netif_receive_skb+0x21/0x70
>> > > >> > > <4>[196727.312368]  [] 
>> > > >> > > netif_receive_skb+0x31/0xa0
>> > > >> > > <4>[196727.312397]  [] 
>> > > >> > > napi_gro_receive+0xe8/0x140
>> > > >> > > <4>[196727.312433]  [] ixgbe_poll+0x551/0x11f0 
>> > > >> > > [ixgbe]
>> > > >> > > <4>[196727.312463]  [] ? ip_rcv+0x22b/0x340
>> > > >> > > <4>[196727.312491]  [] net_rx_action+0x111/0x210
>> > > >> > > <4>[196727.312521]  [] ? 
>> > > >> > > __netif_receive_skb+0x21/0x70
>> > > >> > > <4>[196727.312552]  [] __do_softirq+0xd0/0x270
>> > > >> > > <4>[196727.312583]  [] call_softirq+0x1c/0x30
>> > > >> > > <4>[196727.312613]  [] do_softirq+0x55/0x90
>> > > >> > > <4>[196727.312640]  [] irq_e

Deadlock between cpu_hotplug_begin and cpu_add_remove_lock

2014-01-21 Thread Paul Mackerras

This arises out of a report from a tester that offlining a CPU never
finished on a system they were testing.  This was on a POWER8 running
a 3.10.x kernel, but the issue is still present in mainline AFAICS.

What I found when I looked at the system was this:

* There was a ppc64_cpu process stuck inside cpu_hotplug_begin(),
  called from _cpu_down(), from cpu_down().  This process was holding
  the cpu_add_remove_lock mutex, since cpu_down() calls
  cpu_maps_update_begin() before calling _cpu_down().  It was stuck
  there because cpu_hotplug.refcount == 1.

* There was a mdadm process trying to acquire the cpu_add_remove_lock
  mutex inside register_cpu_notifier(), called from
  raid5_alloc_percpu() in drivers/md/raid5.c.  That process had
  previously called get_online_cpus, which is why cpu_hotplug.refcount
  was 1.

Result: deadlock.

Thus it seems that the following code is not safe:

get_online_cpus();
register_cpu_notifier(&...);
put_online_cpus();

There are a few different places that do that sort of thing; besides
drivers/md/raid5.c, there are instances in arch/x86/kernel/cpu,
arch/x86/oprofile, drivers/cpufreq/acpi-cpufreq.c,
drivers/oprofile/nmi_timer_int.c and kernel/trace/ring_buffer.c.

My question is this: is it reasonable to call register_cpu_notifier
inside a get/put_online_cpus block?  If so, the deadlock needs to be
fixed; if not, the callers need to be fixed, and the restriction
should be documented.

Regards,
Paul.
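
The two stuck processes can be modelled as a tiny state check. This is a toy sketch with hypothetical names (struct toy_hotplug and its helpers), not kernel code: it only records who holds the mutex and how many readers hold a hotplug reference, and asks whether either side can make progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_OWNER 0

struct toy_hotplug {
	int refcount;		/* models cpu_hotplug.refcount */
	int add_remove_owner;	/* models the holder of cpu_add_remove_lock */
};

/* cpu_hotplug_begin() can only proceed once no reader holds a reference. */
static inline bool hotplug_begin_can_proceed(const struct toy_hotplug *s)
{
	assert(s != NULL);
	return s->refcount == 0;
}

/* register_cpu_notifier() needs cpu_add_remove_lock to be free. */
static inline bool register_notifier_can_proceed(const struct toy_hotplug *s)
{
	assert(s != NULL);
	return s->add_remove_owner == NO_OWNER;
}
```

With one reader holding a reference and the offlining task holding the mutex, both checks fail at once: each side is waiting for something only the other can release.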

linux-next: manual merge of the drivers-x86 tree with the pm tree

2014-01-21 Thread Stephen Rothwell

Hi Matthew,

Today's linux-next merge of the drivers-x86 tree got a conflict in
drivers/platform/x86/mxm-wmi.c between commit 8b48463f8942 ("ACPI: Clean
up inclusions of ACPI header files") from the pm tree and commit
475879d65123 ("drivers: platform: Include appropriate header file in
mxm-wmi.c") from the drivers-x86 tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc drivers/platform/x86/mxm-wmi.c
index 3c59c0a3ee0f,7503d2b9b073..
--- a/drivers/platform/x86/mxm-wmi.c
+++ b/drivers/platform/x86/mxm-wmi.c
@@@ -20,7 -20,9 +20,8 @@@
  #include 
  #include 
  #include 
+ #include 
 -#include 
 -#include 
 +#include 
  
  MODULE_AUTHOR("Dave Airlie");
  MODULE_DESCRIPTION("MXM WMI Driver");



Re: [PATCH 17/24] GFS2: Use RCU/hlist_bl based hash for quotas

2014-01-21 Thread Paul E. McKenney

On Mon, Jan 20, 2014 at 12:23:40PM +, Steven Whitehouse wrote:
> Prior to this patch, GFS2 kept all the quotas for each
> super block in a single linked list. This is rather slow
> when there are large numbers of quotas.
> 
> This patch introduces a hlist_bl based hash table, similar
> to the one used for glocks. The initial look up of the quota
> is now lockless in the case where it is already cached,
> although we still have to take the per quota spinlock in
> order to bump the ref count. Either way though, this is a
> big improvement on what was there before.
> 
> The qd_lock and the per super block list is preserved, for
> the time being. However it is intended that since this is no
> longer used for its original role, it should be possible to
> shrink the number of items on that list in due course and
> remove the requirement to take qd_lock in qd_get.
> 
> Signed-off-by: Steven Whitehouse 
> Cc: Abhijith Das 
> Cc: Paul E. McKenney 

Interesting!  I thought that Sasha Levin had a hash table in the works,
but I don't see it, so CCing him.

A few questions and comments below.

Thanx, Paul

> diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
> index a99f60c..59d99ec 100644
> --- a/fs/gfs2/incore.h
> +++ b/fs/gfs2/incore.h
> @@ -428,10 +428,13 @@ enum {
>  };
> 
>  struct gfs2_quota_data {
> + struct hlist_bl_node qd_hlist;
>   struct list_head qd_list;
>   struct kqid qd_id;
> + struct gfs2_sbd *qd_sbd;
>   struct lockref qd_lockref;
>   struct list_head qd_lru;
> + unsigned qd_hash;
> 
>   unsigned long qd_flags; /* QDF_... */
> 
> @@ -450,6 +453,7 @@ struct gfs2_quota_data {
> 
>   u64 qd_sync_gen;
>   unsigned long qd_last_warn;
> + struct rcu_head qd_rcu;
>  };
> 
>  struct gfs2_trans {
> diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
> index 0650db2..c272e73 100644
> --- a/fs/gfs2/main.c
> +++ b/fs/gfs2/main.c
> @@ -76,6 +76,7 @@ static int __init init_gfs2_fs(void)
> 
>   gfs2_str2qstr(&gfs2_qdot, ".");
>   gfs2_str2qstr(&gfs2_qdotdot, "..");
> + gfs2_quota_hash_init();
> 
>   error = gfs2_sys_init();
>   if (error)
> diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
> index 1b6b367..a1df01d 100644
> --- a/fs/gfs2/quota.c
> +++ b/fs/gfs2/quota.c
> @@ -52,6 +52,10 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
> +#include 
> 
>  #include "gfs2.h"
>  #include "incore.h"
> @@ -67,10 +71,43 @@
>  #include "inode.h"
>  #include "util.h"
> 
> -/* Lock order: qd_lock -> qd->lockref.lock -> lru lock */
> +#define GFS2_QD_HASH_SHIFT  12

Should this be a function of the number of CPUs?  (Might not be an issue
if the really big systems don't use GFS.)

> +#define GFS2_QD_HASH_SIZE   (1 << GFS2_QD_HASH_SHIFT)
> +#define GFS2_QD_HASH_MASK   (GFS2_QD_HASH_SIZE - 1)
> +
> +/* Lock order: qd_lock -> bucket lock -> qd->lockref.lock -> lru lock */
>  static DEFINE_SPINLOCK(qd_lock);
>  struct list_lru gfs2_qd_lru;
> 
> +static struct hlist_bl_head qd_hash_table[GFS2_QD_HASH_SIZE];
> +
> +static unsigned int gfs2_qd_hash(const struct gfs2_sbd *sdp,
> +  const struct kqid qid)
> +{
> + unsigned int h;
> +
> + h = jhash(&sdp, sizeof(struct gfs2_sbd *), 0);
> + h = jhash(&qid, sizeof(struct kqid), h);
> +
> + return h & GFS2_QD_HASH_MASK;
> +}
> +
> +static inline void spin_lock_bucket(unsigned int hash)
> +{
> +hlist_bl_lock(&qd_hash_table[hash]);
> +}
> +
> +static inline void spin_unlock_bucket(unsigned int hash)
> +{
> +hlist_bl_unlock(&qd_hash_table[hash]);
> +}
> +
> +static void gfs2_qd_dealloc(struct rcu_head *rcu)
> +{
> + struct gfs2_quota_data *qd = container_of(rcu, struct gfs2_quota_data, 
> qd_rcu);
> + kmem_cache_free(gfs2_quotad_cachep, qd);
> +}
> +
>  static void gfs2_qd_dispose(struct list_head *list)
>  {
>   struct gfs2_quota_data *qd;
> @@ -87,6 +124,10 @@ static void gfs2_qd_dispose(struct list_head *list)
>   list_del(&qd->qd_list);
>   spin_unlock(&qd_lock);
> 
> + spin_lock_bucket(qd->qd_hash);
> + hlist_bl_del_rcu(&qd->qd_hlist);
> + spin_unlock_bucket(qd->qd_hash);
> +

Good, removed from the RCU-traversed list before invoking call_rcu().

>   gfs2_assert_warn(sdp, !qd->qd_change);
>   gfs2_assert_warn(sdp, !qd->qd_slot_count);
>   gfs2_assert_warn(sdp, !qd->qd_bh_count);
> @@ -95,7 +136,7 @@ static void gfs2_qd_dispose(struct list_head *list)
>   atomic_dec(&sdp->sd_quota_count);
> 
>   /* Delete it from the common reclaim list */
> - kmem_cache_free(gfs2_quotad_cachep, qd);
> + call_rcu(&qd->qd_rcu, gfs2_qd_dealloc);
>   }
>  }
> 
> @@ -165,83 +206,95 @@ static u64 qd2offset(struct gfs2_quota_data *qd)
>   return offset;
>  }
> 
> -static int qd_alloc(struct gfs2_sbd *sdp, struct kqid qid,
> -

[PATCH v3 0/4] X86/KVM: enable Intel MPX for KVM

2014-01-21 Thread Liu, Jinsong

These patches are version 3 of the series to enable Intel MPX for KVM.

Version 1:
  * Add some Intel MPX definitions
  * Fix a cpuid(0x0d, 0) exposing bug; dynamically enable/disable features
    per XCR0
  * vmx and msr handling for MPX support in KVM
  * enable the MPX feature for guests

Version 2:
  * remove the generic MPX definitions; Qiaowei's patch has added the
    definitions on the kernel side
  * add MSR_IA32_BNDCFGS to msrs_to_save

Version 3:
  * rebase on the latest kernel, which includes Qiaowei's common MPX
    definitions pulled from HPA's tree

Thanks,
Jinsong

Re: [RFC PATCH v4 1/2] sysctl: Make neg_one a standard constraint

2014-01-21 Thread David Rientjes

On Mon, 20 Jan 2014, atom...@redhat.com wrote:

> From: Aaron Tomlin 
> 
> Add neg_one to the list of standard constraints.
> 
> Signed-off-by: Aaron Tomlin 
> Acked-by: Rik van Riel 

Acked-by: David Rientjes 

Re: [RFC PATCH v4 2/2] hung_task: Display every hung task warning

2014-01-21 Thread David Rientjes

On Mon, 20 Jan 2014, atom...@redhat.com wrote:

> From: Aaron Tomlin 
> 
> When khungtaskd detects hung tasks, it prints out
> backtraces from a number of those tasks.
> Limiting the number of backtraces being printed
> out can result in the user not seeing the information
> necessary to debug the issue. The hung_task_warnings
> sysctl controls this feature.
> 
> This patch makes it possible for hung_task_warnings
> to accept a special value to print an unlimited
> number of backtraces when khungtaskd detects hung
> tasks.
> 
> The special value is -1. To use this value it is
> necessary to change types from ulong to int.
> 
> Signed-off-by: Aaron Tomlin 
> Reviewed-by: Rik van Riel 

Acked-by: David Rientjes 

Nice documentation updates!
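
The counting rule described in the changelog can be sketched as follows. This is a minimal illustration with hypothetical names (WARNINGS_UNLIMITED, consume_warning()), not the code from the patch: a positive value is a budget that is used up, 0 means stay silent, and -1 never runs out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WARNINGS_UNLIMITED (-1)

/*
 * Decide whether one more backtrace may be printed and account for it.
 * A signed counter is what lets -1 act as a distinct "unlimited" value.
 */
static inline bool consume_warning(int *warnings_left)
{
	assert(warnings_left != NULL);

	if (*warnings_left == 0)
		return false;		/* budget exhausted: print nothing */
	if (*warnings_left > 0)
		(*warnings_left)--;	/* finite budget: use one up */
	return true;			/* -1 is never decremented */
}
```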

Re: [PATCH] clk: export __clk_get_hw for re-use in others

2014-01-21 Thread SeongJae Park

On Wed, Jan 22, 2014 at 1:59 PM, Greg KH  wrote:
> On Wed, Jan 22, 2014 at 12:05:57PM +0900, SeongJae Park wrote:
>> Dear Greg, Mike,
>>
>> May I ask your answer or other opinion, please?
>
> It's the middle of the merge window, it's not time for new development,
> or much time for free-time for me, sorry.  Feel free to fix it the best
> way you know how.

Oops, I forgot about the merge window. Thank you very much for your
kind answer.
Sorry if I bothered you while you are busy.
Since the build problem is not a big deal (it exists only in the
-next tree),
I will wait until the merge window closes and then fix it again if it
still exists.

SeongJae Park.
>
> greg k-h

Re: [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-21 Thread Joel Becker

On Tue, Jan 21, 2014 at 10:04:29PM -0500, Ric Wheeler wrote:
> One topic that has been lurking forever at the edges is the current
> 4k limitation for file system block sizes. Some devices in
> production today and others coming soon have larger sectors and it
> would be interesting to see if it is time to poke at this topic
> again.
> 
> LSF/MM seems to be pretty much the only event of the year that most
> of the key people will be present, so should be a great topic for a
> joint session.

Oh yes, I want in on this.  We handle 4k/16k/64k pages "seamlessly," and
we would want to do the same for larger sectors.  In theory, our code
should handle it with the appropriate defines updated.

Joel


Re: [PATCH] cpufreq: Align all CPUs to the same frequency if using shared clock

2014-01-21 Thread Viresh Kumar

On 21 January 2014 13:42, Viresh Kumar  wrote:
> On 21 January 2014 12:56, Li, Zhuangzhi  wrote:
>> Thanks for reviewing.
>
> Its my job :)
>
>> Sorry for make you misunderstanding, on our x86 platform, we want all the 
>> CPUs share one policy by setting CPUFREQ_SHARED_TYPE_ALL, not share one HW 
>> clock line.
>
> I see.. Then probably your patch makes sense. But it is
> obviously not required for every platform that exists today.
>
> Please update it to do it only for drivers that have set
> CPUFREQ_SHARED_TYPE_ALL..

One more thing: who has set different frequencies for these cores?
I hope the kernel hasn't?

In that case, you are probably fixing a bootloader bug in the kernel?
What about doing this in the bootloader instead?

--
viresh

Re: [PATCH v2] backlight: turn backlight on/off when necessary

2014-01-21 Thread Jingoo Han

On Wednesday, January 22, 2014 2:04 PM, Liu Ying wrote:
> 
> Ping...
> 
> Regards,
> Liu Ying

Please don't send a ping within 2 days.
It is not good practice. You sent the v1 patch 6 months ago.
So why should I review this patch within 2 days?
Please wait.

Best regards,
Jingoo Han

> 
> On 01/20/2014 12:52 PM, Liu Ying wrote:
> > We don't have to turn backlight on/off everytime a blanking
> > or unblanking event comes because the backlight status may
> > have already been what we want. Another thought is that one
> > backlight device may be shared by multiple framebuffers. We
> > don't hope blanking one of the framebuffers may turn the
> > backlight off for all the other framebuffers, since they are
> > likely being active to display something. This patch adds
> > some logics to record each framebuffer's backlight usage to
> > determine the backlight device use count and whether the
> > backlight should be turned on or off. To be more specific,
> > only one unblank operation on a certain blanked framebuffer
> > may increase the backlight device's use count by one, while
> > one blank operation on a certain unblanked framebuffer may
> > decrease the use count by one, because the userspace is
> > likely to unblank a unblanked framebuffer or blank a blanked
> > framebuffer.
> >
> > Signed-off-by: Liu Ying 
> > ---
> > v1 can be found at https://lkml.org/lkml/2013/5/30/139
> >
> > v1->v2:
> > * Make the commit message be more specific about the condition
> >   in which backlight device use count can be increased/decreased.
> > * Correct the setting for bd->props.fb_blank.
> >
> >  drivers/video/backlight/backlight.c |   28 +---
> >  include/linux/backlight.h   |6 ++
> >  2 files changed, 27 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/video/backlight/backlight.c b/drivers/video/backlight/backlight.c
> > index 5d0..42044be 100644
> > --- a/drivers/video/backlight/backlight.c
> > +++ b/drivers/video/backlight/backlight.c
> > @@ -34,13 +34,15 @@ static const char *const backlight_types[] = {
> >defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE))
> >  /* This callback gets called when something important happens inside a
> >   * framebuffer driver. We're looking if that important event is blanking,
> > - * and if it is, we're switching backlight power as well ...
> > + * and if it is and necessary, we're switching backlight power as well ...
> >   */
> >  static int fb_notifier_callback(struct notifier_block *self,
> > unsigned long event, void *data)
> >  {
> > struct backlight_device *bd;
> > struct fb_event *evdata = data;
> > +   int node = evdata->info->node;
> > +   int fb_blank = 0;
> >
> > /* If we aren't interested in this event, skip it immediately ... */
> > if (event != FB_EVENT_BLANK && event != FB_EVENT_CONBLANK)
> > @@ -51,12 +53,24 @@ static int fb_notifier_callback(struct notifier_block *self,
> > if (bd->ops)
> > if (!bd->ops->check_fb ||
> > bd->ops->check_fb(bd, evdata->info)) {
> > -   bd->props.fb_blank = *(int *)evdata->data;
> > -   if (bd->props.fb_blank == FB_BLANK_UNBLANK)
> > -   bd->props.state &= ~BL_CORE_FBBLANK;
> > -   else
> > -   bd->props.state |= BL_CORE_FBBLANK;
> > -   backlight_update_status(bd);
> > +   fb_blank = *(int *)evdata->data;
> > +   if (fb_blank == FB_BLANK_UNBLANK &&
> > +   !bd->fb_bl_on[node]) {
> > +   bd->fb_bl_on[node] = true;
> > +   if (!bd->use_count++) {
> > +   bd->props.state &= ~BL_CORE_FBBLANK;
> > +   bd->props.fb_blank = FB_BLANK_UNBLANK;
> > +   backlight_update_status(bd);
> > +   }
> > +   } else if (fb_blank != FB_BLANK_UNBLANK &&
> > +  bd->fb_bl_on[node]) {
> > +   bd->fb_bl_on[node] = false;
> > +   if (!(--bd->use_count)) {
> > +   bd->props.state |= BL_CORE_FBBLANK;
> > +   bd->props.fb_blank = FB_BLANK_POWERDOWN;
> > +   backlight_update_status(bd);
> > +   }
> > +   }
> > }
> > mutex_unlock(&bd->ops_lock);
> > return 0;
> > diff --git a/include/linux/backlight.h b/include/linux/backlight.h
> > index 5f9cd96..7264742 100644
> > --- a/include/linux/backlight.h
> > +++ b/include/linux/backlight.h
> > @@ -9,6 +9,7 @@
> >  #define _LINUX_BACK

RE: [PATCH] Add HID's to hid-microsoft driver of Surface Type/Touch Cover 2 to fix bug

2014-01-21 Thread Reyad Attiyat

Hello Benjamin,

>>
>> Hi,
>>
>> Thanks for reminding me of hid_have_special_driver[]. I noticed that
>> this device has the HID_DG_CONTACTID and in the comment of the
>> hid_have_special_driver[]
>>
>> * Please note that for multitouch devices (driven by hid-multitouch driver),
>> * there is a proper autodetection and autoloading in place (based on presence
>> * of HID_DG_CONTACTID), so those devices don't need to be added to this list,
>> * as we are doing the right thing in hid_scan_usage().
>>
>> This device should not be driven by hid-multitouch as it does not
>> handle keyboard/mouse input devices.
>> I submitted a new patch below with it added. I believe it should still
>> be part of this array, in case this kind of implementation is
>> fixed/updated.
>
> This implementation is perfectly fine (I am referring to the "fixed/updated"):
> - if your device should be driven by hid-multitouch, then you _don't_
> add it to hid_have_special_driver
> - if your device should not be driven by hid-multitouch, then you
> _need_ to add it to hid_have_special_driver.
>
> Adding the device to hid_have_special_driver prevents the detection of
> the group HID_GRP_MULTITOUCH, so you will not end with a race between
> hid-multitouch and your special hid driver.
>

Thanks for clearing that up. I now understand the proper use of this
array in this circumstance, and I am glad to know that there will be no
race once the device is added.

>>
>> From 291742873dcf181faf9657b41279487f31302c73 Mon Sep 17 00:00:00 2001
>> From: Reyad Attiyat 
>> Date: Tue, 21 Jan 2014 01:22:25 -0600
>> Subject: [PATCH 1/1] Added in HID's for Microsoft Surface Type/Touch cover 2.
>>  This is to fix bug 64811 where this device is detected as a multitouch 
>> device
>>
>
> You are missing a commit message here (the first message you sent
> would fit perfectly here).
>

Sorry about that, I'm new to submitting patches to these mailing lists.

> Other than that, I played a little with the report descriptor pointed
> in the bugzilla.
>
> I think I will be able to handle this touch cover in hid-multitouch,
> but that would require more testings/debugging. Microsoft seems to
> have implemented an indirect (dual) touchpad here, but until we know
> which mode we should put it into, it's going to be tricky to set it up
> correctly.
>
> One last thing, in the bugzilla, in the comment 2 you say: "I still
> have issues with the type cover 2 even with this fix". Are you still
> experiencing those disconnection? If so, maybe we should switch to
> hid-multitouch at some point.
>
I tried some patches that I think you posted to hid-input about hid-multitouch.
The patches added in support for function callbacks to allow for a
generic protocol.
This worked after I changed mt_input_mapping() to set the protocol to
mt_protocol_generic

	 * such as Mouse that might have the same GenericDesktop usages. */
	if (field->application != HID_DG_TOUCHSCREEN &&
	    field->application != HID_DG_PEN &&
	    field->application != HID_DG_TOUCHPAD)
		td->protocols[report_id] = mt_protocol_generic;

I still experience the disconnects with both of these solutions. Do
you have any idea what could cause this?
It seems to happen when I'm typing fast or holding a key. I'm guessing
the only way to fix this properly is
to snoop USB packets in Windows to see how the device is handled there.
Another bug is that the device stays on, lit, in standby mode.

What do you think is the best solution to take? By that I mean should
I keep the patch as part of hid-microsoft?

Re: [BISECTED] Linux 3.12.7 introduces page map handling regression

2014-01-21 Thread Konrad Rzeszutek Wilk

On Tue, Jan 21, 2014 at 07:20:45PM -0800, Steven Noonan wrote:
> On Tue, Jan 21, 2014 at 06:47:07PM -0800, Linus Torvalds wrote:
> > On Tue, Jan 21, 2014 at 5:49 PM, Greg Kroah-Hartman
> >  wrote:

Adding extra folks to the party.
> > >
> > > Odds are this also shows up in 3.13, right?
> 
> Reproduced using 3.13 on the PV guest:
> 
>   [  368.756763] BUG: Bad page map in process mp  pte:8004a67c6165 pmd:e9b706067
>   [  368.756777] page:ea001299f180 count:0 mapcount:-1 mapping:  (null) index:0x0
>   [  368.756781] page flags: 0x2f8014(referenced|dirty)
>   [  368.756786] addr:7fd1388b7000 vm_flags:00100071 anon_vma:880e9ba15f80 mapping:  (null) index:7fd1388b7
>   [  368.756792] CPU: 29 PID: 618 Comm: mp Not tainted 3.13.0-ec2 #1
>   [  368.756795]  880e9b718958 880e9eaf3cc0 814d8748 7fd1388b7000
>   [  368.756803]  880e9eaf3d08 8116d289
>   [  368.756809]  880e9b7065b8 ea001299f180 7fd1388b8000 880e9eaf3e30
>   [  368.756815] Call Trace:
>   [  368.756825]  [] dump_stack+0x45/0x56
>   [  368.756833]  [] print_bad_pte+0x229/0x250
>   [  368.756837]  [] unmap_single_vma+0x583/0x890
>   [  368.756842]  [] unmap_vmas+0x65/0x90
>   [  368.756847]  [] unmap_region+0xac/0x120
>   [  368.756852]  [] ? vma_rb_erase+0x1c9/0x210
>   [  368.756856]  [] do_munmap+0x280/0x370
>   [  368.756860]  [] vm_munmap+0x41/0x60
>   [  368.756864]  [] SyS_munmap+0x22/0x30
>   [  368.756869]  [] system_call_fastpath+0x1a/0x1f
>   [  368.756872] Disabling lock debugging due to kernel taint
>   [  368.760084] BUG: Bad rss-counter state mm:880e9d079680 idx:0 val:-1
>   [  368.760091] BUG: Bad rss-counter state mm:880e9d079680 idx:1 val:1
> 
> > 
> > Probably. I don't have a Xen PV setup to test with (and very little
> > interest in setting one up).. And I have a suspicion that it might not
> > be so much about Xen PV, as perhaps about the kind of hardware.
> > 
> > I suspect the issue has something to do with the magic _PAGE_NUMA
> > tie-in with _PAGE_PRESENT. And then mprotect(PROT_NONE) ends up
> > removing the _PAGE_PRESENT bit, and now the crazy numa code is
> > confused.
> > 
> > The whole _PAGE_NUMA thing is a f*cking horrible hack, and shares the
> > bit with _PAGE_PROTNONE, which is why it then has that tie-in to
> > _PAGE_PRESENT.
> > 
> > Adding Andrea to the Cc, because he's the author of that horridness.
> > Putting Steven's test-case here as an attachement for Andrea, maybe
> > that makes him go "Ahh, yes, silly case".
> > 
> > Also added Kirill, because he was involved the last _PAGE_NUMA debacle.
> > 
> > Andrea, you can find the thread on lkml, but it boils down to commit
> > 1667918b6483 (backported to 3.12.7 as 3d792d616ba4) breaking the
> > attached test-case (but apparently only under Xen PV). There it
> > apparently causes a "BUG: Bad page map .." error.

I *think* it is due to the fact that pmd_numa and pte_numa are getting the _raw_
value of PMDs and PTEs. That is - it does not use the pvops interface
and instead reads the values directly from the page-table. Since the
page-table is also manipulated by the hypervisor - there are certain
flags it also sets to do its business. It might be that it uses
_PAGE_GLOBAL as well - and Linux picks up on that. If it was using
pte_flags that would invoke the pvops interface.

Elena, Dariof and George, you guys had been looking at this a bit deeper
than I have. Does the Xen hypervisor use the _PAGE_GLOBAL for PV guests?

This not-compiled-totally-bad-patch might shed some light on what I was
thinking _could_ fix this issue - and IS NOT A FIX - JUST A HACK.
It does not fix it for PMDs naturally (as there are no PMD paravirt ops
for that).

The other question is - how is AutoNUMA running when it is not enabled?
Shouldn't those _PAGE_NUMA ops be nops when AutoNUMA hasn't even been
turned on?
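
For readers following the flag discussion, here is a minimal userspace sketch of the bit test being argued about. It assumes the NUMA-hint check means "NUMA bit set while PRESENT is clear", which is one reading of the tie-in described above. The bit positions and the `sk_` names are illustrative assumptions, not copied from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sk_pteval_t;	/* stand-in for pteval_t */

/* Illustrative bit positions; assumptions for this sketch only. */
#define SK_PAGE_PRESENT	((sk_pteval_t)1 << 0)
#define SK_PAGE_NUMA	((sk_pteval_t)1 << 8)	/* assumed to share a bit with PROTNONE */

/* Assumed check: a PTE is a NUMA hint when NUMA is set and PRESENT is clear. */
static bool sk_pte_numa(sk_pteval_t flags)
{
	return (flags & (SK_PAGE_NUMA | SK_PAGE_PRESENT)) == SK_PAGE_NUMA;
}

/* What the hack does to the flags: drop the NUMA bit unconditionally. */
static sk_pteval_t sk_strip_numa(sk_pteval_t flags)
{
	return flags & ~SK_PAGE_NUMA;
}
```

Under these assumptions, any agent that sets the shared bit on a non-present entry makes `sk_pte_numa()` return true, and stripping the bit makes the entry look ordinary again. That is also why the hack is not a fix: it hides the bit without explaining who set it.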


diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index ce563be..9fa7088 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -370,12 +370,15 @@ static pteval_t pte_mfn_to_pfn(pteval_t val)
unsigned long pfn = mfn_to_pfn(mfn);
 
pteval_t flags = val & PTE_FLAGS_MASK;
+   /* No AutoNUMA for PV. TODO If Linux sees the PTE having
+* said bit, just ignore it. */
+   if (flags & _PAGE_NUMA)
+   flags = flags & ~_PAGE_NUMA;
if (unlikely(pfn == ~0))
val = flags & ~_PAGE_PRESENT;
else
val = ((pteval_t)pfn << PAGE_SHIFT) | flags;
}
-
return val;
 }
 
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index db09234..a8bc07d 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -644,7 +644,7 @@ static inline int pmd_t

Re: [PATCH v2] backlight: turn backlight on/off when necessary

2014-01-21 Thread Liu Ying

Ping...

Regards,
Liu Ying

On 01/20/2014 12:52 PM, Liu Ying wrote:
> We don't have to turn backlight on/off everytime a blanking
> or unblanking event comes because the backlight status may
> have already been what we want. Another thought is that one
> backlight device may be shared by multiple framebuffers. We
> don't hope blanking one of the framebuffers may turn the
> backlight off for all the other framebuffers, since they are
> likely being active to display something. This patch adds
> some logics to record each framebuffer's backlight usage to
> determine the backlight device use count and whether the
> backlight should be turned on or off. To be more specific,
> only one unblank operation on a certain blanked framebuffer
> may increase the backlight device's use count by one, while
> one blank operation on a certain unblanked framebuffer may
> decrease the use count by one, because the userspace is
> likely to unblank a unblanked framebuffer or blank a blanked
> framebuffer.
> 
> Signed-off-by: Liu Ying 
> ---
> v1 can be found at https://lkml.org/lkml/2013/5/30/139
> 
> v1->v2:
> * Make the commit message be more specific about the condition
>   in which backlight device use count can be increased/decreased.
> * Correct the setting for bd->props.fb_blank.
> 
>  drivers/video/backlight/backlight.c |   28 +---
>  include/linux/backlight.h   |6 ++
>  2 files changed, 27 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/video/backlight/backlight.c b/drivers/video/backlight/backlight.c
> index 5d0..42044be 100644
> --- a/drivers/video/backlight/backlight.c
> +++ b/drivers/video/backlight/backlight.c
> @@ -34,13 +34,15 @@ static const char *const backlight_types[] = {
>defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE))
>  /* This callback gets called when something important happens inside a
>   * framebuffer driver. We're looking if that important event is blanking,
> - * and if it is, we're switching backlight power as well ...
> + * and if it is and necessary, we're switching backlight power as well ...
>   */
>  static int fb_notifier_callback(struct notifier_block *self,
> unsigned long event, void *data)
>  {
> struct backlight_device *bd;
> struct fb_event *evdata = data;
> +   int node = evdata->info->node;
> +   int fb_blank = 0;
> 
> /* If we aren't interested in this event, skip it immediately ... */
> if (event != FB_EVENT_BLANK && event != FB_EVENT_CONBLANK)
> @@ -51,12 +53,24 @@ static int fb_notifier_callback(struct notifier_block *self,
> if (bd->ops)
> if (!bd->ops->check_fb ||
> bd->ops->check_fb(bd, evdata->info)) {
> -   bd->props.fb_blank = *(int *)evdata->data;
> -   if (bd->props.fb_blank == FB_BLANK_UNBLANK)
> -   bd->props.state &= ~BL_CORE_FBBLANK;
> -   else
> -   bd->props.state |= BL_CORE_FBBLANK;
> -   backlight_update_status(bd);
> +   fb_blank = *(int *)evdata->data;
> +   if (fb_blank == FB_BLANK_UNBLANK &&
> +   !bd->fb_bl_on[node]) {
> +   bd->fb_bl_on[node] = true;
> +   if (!bd->use_count++) {
> +   bd->props.state &= ~BL_CORE_FBBLANK;
> +   bd->props.fb_blank = FB_BLANK_UNBLANK;
> +   backlight_update_status(bd);
> +   }
> +   } else if (fb_blank != FB_BLANK_UNBLANK &&
> +  bd->fb_bl_on[node]) {
> +   bd->fb_bl_on[node] = false;
> +   if (!(--bd->use_count)) {
> +   bd->props.state |= BL_CORE_FBBLANK;
> +   bd->props.fb_blank = FB_BLANK_POWERDOWN;
> +   backlight_update_status(bd);
> +   }
> +   }
> }
> mutex_unlock(&bd->ops_lock);
> return 0;
> diff --git a/include/linux/backlight.h b/include/linux/backlight.h
> index 5f9cd96..7264742 100644
> --- a/include/linux/backlight.h
> +++ b/include/linux/backlight.h
> @@ -9,6 +9,7 @@
>  #define _LINUX_BACKLIGHT_H
> 
>  #include 
> +#include 
>  #include 
>  #include 
> 
> @@ -104,6 +105,11 @@ struct backlight_device {
> struct list_head entry;
> 
> struct device dev;
> +
> +   /* Multiple framebuffers may share one backlight device */
> +   bool fb_bl_on[FB_MAX];
> +
> +   int use_count;
>  };
> 
>  static inline void backlight_update_status(struct backlight_device *bd)
> --
> 1.7.9.5
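
The use-count scheme in the commit message can be modelled outside the kernel. This is a minimal sketch, assuming a reduced `struct bl_sketch` in place of `struct backlight_device` and a made-up `SKETCH_FB_MAX` in place of `FB_MAX`. The `lit` field stands in for what `backlight_update_status()` would do to the hardware.

```c
#include <stdbool.h>

#define SKETCH_FB_MAX 32	/* stands in for FB_MAX; value assumed for the sketch */

/* Hypothetical reduction of the fields the patch adds to struct backlight_device. */
struct bl_sketch {
	bool fb_bl_on[SKETCH_FB_MAX];	/* does framebuffer 'node' want the light on? */
	int use_count;			/* number of framebuffers that do */
	bool lit;			/* models the hardware state */
};

/*
 * Handle one blank/unblank event for framebuffer 'node'.
 * Returns true only when the backlight state had to change.
 */
static bool bl_sketch_event(struct bl_sketch *bd, int node, bool unblank)
{
	if (unblank && !bd->fb_bl_on[node]) {
		bd->fb_bl_on[node] = true;
		if (!bd->use_count++) {		/* first user: switch on */
			bd->lit = true;
			return true;
		}
	} else if (!unblank && bd->fb_bl_on[node]) {
		bd->fb_bl_on[node] = false;
		if (!(--bd->use_count)) {	/* last user gone: switch off */
			bd->lit = false;
			return true;
		}
	}
	return false;	/* repeated event, or other framebuffers still active */
}
```

With two framebuffers sharing the device, unblanking both and then blanking only one leaves `lit` true. The light goes off only when the second one is blanked as well, and a repeated unblank of the same framebuffer does not inflate `use_count`.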


Re: [PATCH 68/73] drivers/cpufreq: delete non-required instances of

2014-01-21 Thread Viresh Kumar

On 22 January 2014 02:53, Paul Gortmaker  wrote:
> None of these files are actually using any __init type directives
> and hence don't need to include .  Most are just a
> left over from __devinit and __cpuinit removal, or simply due to
> code getting copied from one driver to the next.
>
> Cc: Kevin Hilman 
> Cc: "Rafael J. Wysocki" 
> Cc: Viresh Kumar 
> Cc: cpuf...@vger.kernel.org
> Cc: linux...@vger.kernel.org
> Signed-off-by: Paul Gortmaker 
> ---
>  drivers/cpufreq/omap-cpufreq.c| 1 -
>  drivers/cpufreq/powernow-k8.c | 1 -
>  drivers/cpufreq/s3c2412-cpufreq.c | 1 -
>  drivers/cpufreq/s3c2440-cpufreq.c | 1 -
>  drivers/cpufreq/spear-cpufreq.c   | 1 -
>  drivers/cpufreq/speedstep-lib.c   | 1 -
>  6 files changed, 6 deletions(-)

We discussed this in the past, where I pointed out that this file is
required by things like module_init. What happened to that query?

Re: [PATCH] clk: export __clk_get_hw for re-use in others

2014-01-21 Thread Greg KH

On Wed, Jan 22, 2014 at 12:05:57PM +0900, SeongJae Park wrote:
> Dear Greg, Mike,
> 
> May I ask your answer or other opinion, please?

It's the middle of the merge window, it's not time for new development,
or much time for free-time for me, sorry.  Feel free to fix it the best
way you know how.

greg k-h

Re: [PATCH] pinctrl: Rename Broadcom Capri pinctrl driver

2014-01-21 Thread Olof Johansson

On Tue, Jan 21, 2014 at 5:35 PM, Matt Porter  wrote:
> On Tue, Jan 21, 2014 at 04:59:35PM -0800, Olof Johansson wrote:
>> Hi,
>>
>>
>> On Tue, Jan 21, 2014 at 2:38 PM, Sherman Yin  wrote:
>> > To be consistent with other Broadcom drivers, the Broadcom Capri pinctrl
>> > driver and its related CONFIG option are renamed to bcm281xx.
>> >
>> > Devicetree compatible string and binding documentation use
>> > "brcm,bcm11351-pinctrl" to match the machine binding here:
>> > Documentation/devicetree/bindings/arm/bcm/bcm11351.txt
>> >
>> > This driver supports pinctrl on BCM11130, BCM11140, BCM11351, BCM28145
>> > and BCM28155 SoCs.
>> >
>> > Signed-off-by: Sherman Yin 
>> > Reviewed-by: Matt Porter 
>> > ---
>> >  ...capri-pinctrl.txt => brcm,bcm11351-pinctrl.txt} |8 +-
>> >  arch/arm/boot/dts/bcm11351.dtsi|2 +-
>> >  arch/arm/configs/bcm_defconfig |2 +-
>> >  drivers/pinctrl/Kconfig|8 +-
>> >  drivers/pinctrl/Makefile   |2 +-
>> >  .../{pinctrl-capri.c => pinctrl-bcm281xx.c}| 1521 ++--
>> >  6 files changed, 775 insertions(+), 768 deletions(-)
>> >  rename Documentation/devicetree/bindings/pinctrl/{brcm,capri-pinctrl.txt => brcm,bcm11351-pinctrl.txt} (98%)
>> >  rename drivers/pinctrl/{pinctrl-capri.c => pinctrl-bcm281xx.c} (25%)
>> >
>> > diff --git a/Documentation/devicetree/bindings/pinctrl/brcm,capri-pinctrl.txt b/Documentation/devicetree/bindings/pinctrl/brcm,bcm11351-pinctrl.txt
>> > similarity index 98%
>> > rename from Documentation/devicetree/bindings/pinctrl/brcm,capri-pinctrl.txt
>> > rename to Documentation/devicetree/bindings/pinctrl/brcm,bcm11351-pinctrl.txt
>> > index 9e9e9ef..c119deb 100644
>> > --- a/Documentation/devicetree/bindings/pinctrl/brcm,capri-pinctrl.txt
>> > +++ b/Documentation/devicetree/bindings/pinctrl/brcm,bcm11351-pinctrl.txt
>> > @@ -1,4 +1,4 @@
>> > -Broadcom Capri Pin Controller
>> > +Broadcom BCM281xx Pin Controller
>> >
>> >  This is a pin controller for the Broadcom BCM281xx SoC family, which includes
>> >  BCM11130, BCM11140, BCM11351, BCM28145, and BCM28155 SoCs.
>> > @@ -7,14 +7,14 @@ BCM11130, BCM11140, BCM11351, BCM28145, and BCM28155 SoCs.
>> >
>> >  Required Properties:
>> >
>> > -- compatible:  Must be "brcm,capri-pinctrl".
>> > +- compatible:  Must be "brcm,bcm11351-pinctrl"
>>
>> Since the original binding is queued for 3.14 (I believe?), if this
>> rename isn't merged for 3.14 then you will still need to accept the
>> old compatible string (binding). You can document it as deprecated,
>> but the driver needs to still probe with it.
>
> Linus had mentioned that he could take a rename in 3.14-rc for this
> driver which is really what we had in mind here. Since the binding
> doesn't become stable until 3.14 is actually released I was under the
> impression that this is ok without keeping a deprecated compatible
> string. I notice that Tomasz had comments about this type of situation
> in http://www.spinics.net/lists/devicetree/msg18010.html

Yes, if the rename goes in before the binding has been in one stable
release then we can make noncompatible changes. Which is why I said if
this isn't merged for 3.14, etc.


-Olof

Re: [PATCH] scripts/gcc-version.sh: handle CC="gcc -m32"

2014-01-21 Thread Rusty Russell

Rusty Russell  writes:
> Michal Marek  writes:
>>> gcc: warning: ‘-mcpu=’ is deprecated; use ‘-mtune=’ or ‘-march=’ instead
>>> gcc: warning: ‘-mcpu=’ is deprecated; use ‘-mtune=’ or ‘-march=’ instead
>>> kernel/bounds.c:1:0: error: CPU you selected does not support x86-64 instruction set
>>>  /*
>>>  ^
>>> kernel/bounds.c:1:0: warning: -mregparm is ignored in 64-bit mode [enabled by default]
>>> make[1]: *** [kernel/bounds.s] Error 1
>>> make: *** [prepare0] Error 2

Sorry, ignore this report.

In case anyone else hits this: it was resolved by installing more 32-bit
headers (Ubuntu's libc6-dev-i386 package, in this case).

Thanks,
Rusty.

Re: [patch] mm: oom_kill: revert 3% system memory bonus for privileged tasks

2014-01-21 Thread David Rientjes

On Thu, 16 Jan 2014, Johannes Weiner wrote:

> > Unfortunately, I think this could potentially be too much of a bonus.  On 
> > your same 32GB machine, if a root process is using 18GB and a user process 
> > is using 14GB, the user process ends up getting selected while the current 
> > discount of 3% still selects the root process.
> > 
> > I do like the idea of scaling this bonus depending on points, however.  I 
> > think it would be better if we could scale the discount but also limit it 
> > to some sane value.
> 
> I just reverted to the /= 4 because we had that for a long time and it
> seemed to work.  I don't really mind either way as long as we get rid
> of that -3%.  Do you have a suggestion?
> 

How about simply using 3% of the root process's points so that root 
processes get some bonus compared to non-root processes with the same 
memory usage and it's scaled to the usage rather than amount of available 
memory?

So rather than points /= 4, we do

	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
		points -= (points * 3) / 100;

instead.  Sound good?
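
To make the 32GB example from this thread concrete, here is a small sketch comparing the three discounts side by side. It works in megabytes rather than pages for readability. The function names are made up for illustration, and the "flat" variant assumes the current bonus is 3% of total memory, as the subject line suggests.

```c
/* Sizes are in megabytes for readability; the real badness score works in pages. */

/* Assumed current behaviour: root tasks get 3% of total memory taken off. */
static long score_flat_3pct(long points, long total, int is_root)
{
	if (is_root)
		points -= (total * 3) / 100;
	return points;
}

/* The older heuristic under discussion: divide a root task's points by four. */
static long score_div4(long points, int is_root)
{
	if (is_root)
		points /= 4;
	return points;
}

/* The proposal above: take 3% of the task's own points off. */
static long score_scaled_3pct(long points, int is_root)
{
	if (is_root)
		points -= (points * 3) / 100;
	return points;
}
```

For a root task at 18432 MB against a user task at 14336 MB on a 32768 MB machine, the flat discount leaves root at 17449 and the scaled discount leaves it at 17880, so root is still the larger score in both cases. Dividing by four drops root to 4608, so the user task would be picked instead.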

Re: [PATCH v9 3/5] qrwlock, x86 - Treat all data type not bigger than long as atomic in x86

2014-01-21 Thread Waiman Long


On 01/21/2014 07:31 PM, Linus Torvalds wrote:

On Tue, Jan 21, 2014 at 8:09 AM, Waiman Long  wrote:

include/linux/compiler.h:

#ifndef __native_word
# ifdef __arch_native_word
#  define __native_word(t)  __arch_native_word(t)
# else
#  define __native_word(t) (sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
# endif
#endif

Do we even really need this?

I'd suggest removing it entirely. You might want to retain the whole

   compiletime_assert_atomic_type()

thing on purely the alpha side, but then it's all inside just the
alpha code, without any need for this "native_word" thing.

And if somebody tries to do a "smp_store_release()" on a random
structure or union, do we care? We're not some nanny state that wants
to give nice warnings for insane code.

   Linus


That sounds good to me too. Peter, what do you think about this?

-Longman
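
For context, the kind of check being debated can be tried in plain userspace C11. This is only a sketch: `NATIVE_WORD` mirrors the generic fallback quoted above under a non-reserved name, and `assert_native_word` is a hypothetical helper, not the kernel's `compiletime_assert_atomic_type()`.

```c
/* Userspace mirror of the generic fallback: int-sized or long-sized types only. */
#define NATIVE_WORD(t) \
	(sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))

/* Hypothetical helper: reject non-native types at compile time (C11). */
#define assert_native_word(t) \
	_Static_assert(NATIVE_WORD(t), "type is not a native word")

struct two_longs { long a, b; };	/* deliberately wider than one word */

assert_native_word(int);
assert_native_word(long);
/* assert_native_word(struct two_longs); would fail to compile */
```

The commented-out line is the "random structure or union" case raised above: with the check in place it becomes a build error, and without it the store simply compiles.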

Re: [PATCH] block: Fix memory leak in rw_copy_check_uvector() handling

2014-01-21 Thread Jens Axboe

On Sun, Jan 19 2014, Christian Engelmayer wrote:
> Fix a memory leak in the error handling path of function sg_io()
> that is used during the processing of scsi ioctl. Memory already
> allocated by rw_copy_check_uvector() needs to be freed correctly.
> Detected by Coverity: CID 1128953.

Applied, thanks.

-- 
Jens Axboe


Re: [PATCH -trivial] mg_disk: Spelling s/finised/finished/

2014-01-21 Thread Jens Axboe

On Tue, Jan 21 2014, Geert Uytterhoeven wrote:
> From: Geert Uytterhoeven 

Applied, thanks.

-- 
Jens Axboe


Re: bio_integrity_verify() bug causing READ verify to be silently skipped

2014-01-21 Thread Jens Axboe

On Tue, Jan 21 2014, Nicholas A. Bellinger wrote:
> On Fri, 2014-01-17 at 16:58 -0500, Martin K. Petersen wrote:
> > > "nab" == Nicholas A Bellinger  writes:
> > 
> > >> That breaks partial completion, though. I'll take a look at Kent's
> > >> changes...
> > 
> > nab> Ping..?  Any updates on a proper bugfix for this..?
> > 
> > I did put your patch in my queue and have been working on a fix for the
> > partial completion case. The latter requires a bit of massaging that
> > interferes with other pending changes.
> > 
> > Given that your patch does address a valid issue I'm OK with Jens
> > putting it in as is. I'll build upon it for my changes.
> > 
> 
> 
> 
> Jens, are you going to pick this one up, or shall I include it in the
> upcoming target-pending/for-next pull request instead..?
> 
> Either way, it needs a CC' to stable for >= v3.10.y.

I'll queue it up, thanks.

-- 
Jens Axboe


Re: [RFC PATCH V3 1/2] null_blk: Null pointer deference problem in alloc_page_buffers

2014-01-21 Thread Jens Axboe

On Tue, Jan 21 2014, Raghavendra K T wrote:
>  If we load the null_blk module with bs=8k we get following oops:
> [ 3819.812190] BUG: unable to handle kernel NULL pointer dereference at 0008
> [ 3819.812387] IP: [] create_empty_buffers+0x28/0xaf
> [ 3819.812527] PGD 219244067 PUD 215a06067 PMD 0
> [ 3819.812640] Oops:  [#1] SMP
> [ 3819.812772] Modules linked in: null_blk(+)
> 
>  Fix that by resetting block size to PAGE_SIZE if it is greater than PAGE_SIZE

Thanks, applied.

-- 
Jens Axboe


Re: [GIT] floppy

2014-01-21 Thread Jens Axboe

On Fri, Jan 17 2014, Jiri Kosina wrote:
> Jens,
> 
> please consider pulling
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/jikos/linux-block.git for-jens
> 
> into your for-3.14/drivers branch to receive
> 
> Jiri Kosina (1):
>   floppy: bail out in open() if drive is not responding to block0 read

Thanks Jiri, pulled.

-- 
Jens Axboe


Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-21 Thread dormando



On Tue, 21 Jan 2014, Alexei Starovoitov wrote:

> On Tue, Jan 21, 2014 at 5:39 PM, dormando  wrote:
> >
> > > On Fri, Jan 17, 2014 at 11:16 PM, dormando  wrote:
> > > >> On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote:
> > > >> > On Fri, 2014-01-17 at 17:25 -0800, dormando wrote:
> > > >> > > Hi,
> > > >> > >
> > > >> > > Upgraded a few kernels to the latest 3.10 stable tree while tracking down
> > > >> > > a rare kernel panic, seems to have introduced a much more frequent kernel
> > > >> > > panic. Takes anywhere from 4 hours to 2 days to trigger:
> > > >> > >
> > > >> > > <4>[196727.311203] general protection fault:  [#1] SMP
> > > >> > > <4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp pps_core mdio
> > > >> > > <4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1
> > > >> > > <4>[196727.311344] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013
> > > >> > > <4>[196727.311364] task: 885e6f069700 ti: 885e6f072000 task.ti: 885e6f072000
> > > >> > > <4>[196727.311377] RIP: 0010:[]  [] ipv4_dst_destroy+0x4f/0x80
> > > >> > > <4>[196727.311399] RSP: 0018:885effd23a70  EFLAGS: 00010282
> > > >> > > <4>[196727.311409] RAX: dead00200200 RBX: 8854c398ecc0 RCX: 0040
> > > >> > > <4>[196727.311423] RDX: dead00100100 RSI: dead00100100 RDI: dead00200200
> > > >> > > <4>[196727.311437] RBP: 885effd23a80 R08: 815fd9e0 R09: 885d5a590800
> > > >> > > <4>[196727.311451] R10:  R11:  R12: 
> > > >> > > <4>[196727.311464] R13: 81c8c280 R14:  R15: 880e85ee16ce
> > > >> > > <4>[196727.311510] FS:  () GS:885effd2() knlGS:
> > > >> > > <4>[196727.311554] CS:  0010 DS:  ES:  CR0: 80050033
> > > >> > > <4>[196727.311581] CR2: 7a46751eb000 CR3: 005e65688000 CR4: 000407e0
> > > >> > > <4>[196727.311625] DR0:  DR1:  DR2: 
> > > >> > > <4>[196727.311669] DR3:  DR6: 0ff0 DR7: 0400
> > > >> > > <4>[196727.311713] Stack:
> > > >> > > <4>[196727.311733]  8854c398ecc0 8854c398ecc0 885effd23ab0 815b7f42
> > > >> > > <4>[196727.311784]  88be6595bc00 8854c398ecc0  8854c398ecc0
> > > >> > > <4>[196727.311834]  885effd23ad0 815b86c6 885d5a590800 8816827821c0
> > > >> > > <4>[196727.311885] Call Trace:
> > > >> > > <4>[196727.311907]  
> > > >> > > <4>[196727.311912]  [] dst_destroy+0x32/0xe0
> > > >> > > <4>[196727.311959]  [] dst_release+0x56/0x80
> > > >> > > <4>[196727.311986]  [] tcp_v4_do_rcv+0x2a5/0x4a0
> > > >> > > <4>[196727.312013]  [] tcp_v4_rcv+0x7da/0x820
> > > >> > > <4>[196727.312041]  [] ? ip_rcv_finish+0x360/0x360
> > > >> > > <4>[196727.312070]  [] ? nf_hook_slow+0x7d/0x150
> > > >> > > <4>[196727.312097]  [] ? ip_rcv_finish+0x360/0x360
> > > >> > > <4>[196727.312125]  [] ip_local_deliver_finish+0xb2/0x230
> > > >> > > <4>[196727.312154]  [] ip_local_deliver+0x4a/0x90
> > > >> > > <4>[196727.312183]  [] ip_rcv_finish+0x119/0x360
> > > >> > > <4>[196727.312212]  [] ip_rcv+0x22b/0x340
> > > >> > > <4>[196727.312242]  [] ? macvlan_broadcast+0x160/0x160 [macvlan]
> > > >> > > <4>[196727.312275]  [] __netif_receive_skb_core+0x512/0x640
> > > >> > > <4>[196727.312308]  [] ? kmem_cache_alloc+0x13b/0x150
> > > >> > > <4>[196727.312338]  [] __netif_receive_skb+0x21/0x70
> > > >> > > <4>[196727.312368]  [] netif_receive_skb+0x31/0xa0
> > > >> > > <4>[196727.312397]  [] napi_gro_receive+0xe8/0x140
> > > >> > > <4>[196727.312433]  [] ixgbe_poll+0x551/0x11f0 [ixgbe]
> > > >> > > <4>[196727.312463]  [] ? ip_rcv+0x22b/0x340
> > > >> > > <4>[196727.312491]  [] net_rx_action+0x111/0x210
> > > >> > > <4>[196727.312521]  [] ? __netif_receive_skb+0x21/0x70
> > > >> > > <4>[196727.312552]  [] __do_softirq+0xd0/0x270
> > > >> > > <4>[196727.312583]  [] call_softirq+0x1c/0x30
> > > >> > > <4>[196727.312613]  [] do_softirq+0x55/0x90
> > > >> > > <4>[196727.312640]  [] irq_exit+0x55/0x60
> > > >> > > <4>[196727.312668]  [] do_IRQ+0x63/0xe0
> > > >> > > <4>[196727.312696]  [] common_interrupt+0x6a/0x6a
> > > >> > > <4>[196727.312722]

Re: linux rdma 3.14 merge plans

2014-01-21 Thread Or Gerlitz

On Wed, Jan 22, 2014 at 2:43 AM, Roland Dreier  wrote:
> On Tue, Jan 21, 2014 at 2:00 PM, Or Gerlitz  wrote:
>> Roland, ping! the signature patches were posted > three months ago. We
>> deserve a response from the maintainer that goes beyond "I need to
>> think on that".
>>
>> Responsiveness was stated by Linus to be the #1 requirement from
>> kernel maintainers.
>
> Or, I'm not sure what response you're after from me.

Roland, what I am after is a r-e-s-p-o-n-s-e from you, and let it
contain whatever justified and/or unjustified mud, as below. We posted
the V0 series on Oct 15 2013 and since then not a word from you,
except for an "I need to think on that" comment last week after we
nudged you a million times.

You can't leave us clueless and up in the air for three whole months
without any comment, concrete or otherwise. There's no way to carry on
kernel development like that. I am old enough to hear and face "no",
"wTF is this", "yTF you do it this way", etc. This has happened a few
times, e.g. with networking patches we sent, and we either improved
things, did them differently, or did whatever needed to be done.

There's no way on earth to accept having your work plainly ignored, and
this is what is happening here. I had no way to get your response below
except by going to LKML. Why?

> Linus has also said that maintainers should say "no" a lot more
> (http://lwn.net/Articles/571995/) so maybe you want me to say, "No, I
> won't merge this patch set, since it adds a bunch of complexity to
> support a feature no one really cares about."  Is that it?  (And yes I
> am skeptical about this stuff — I work at an enterprise storage
> company and even here it's hard to find anyone who cares about
> DIF/DIX, especially offload features that stop it from being
> end-to-end)
>
> I'm sure you're not expecting me to say, "Sure, I'll merge it without
> understanding the problem it's solving or how it's doing that,"
> especially given your recent history of pushing me to merge stuff
> like the IP-RoCE patches back when they broke the userspace ABI.
>
> I'd really rather spend my time on something actually useful like
> cleaning up softroce.
>
>  - R.

Re: uninline rcu_lock_acquire/etc ?

2014-01-21 Thread Paul E. McKenney

On Tue, Jan 21, 2014 at 08:39:09PM +0100, Oleg Nesterov wrote:
> On 01/21, Oleg Nesterov wrote:
> >
> > But I agreed that the code looks simpler with bitfields, so perhaps
> > this patch is better.
> 
> Besides, I guess the major offender is rcu...
> 
> Paul, can't we do something like below? Saves 19.5 kilobytes,
> 
>   -   5255131 2974376 10125312 18354819 1181283 vmlinux
>   +   5235227 2970344 10125312 18330883 117b503 vmlinux
> 
> probably we can also uninline rcu_lockdep_assert()...

Looks mostly plausible, some questions inline below.

Thanx, Paul

> Oleg.
> ---
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 2eef290..58f7a97 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -310,18 +310,34 @@ static inline bool rcu_lockdep_current_cpu_online(void)
>  }
>  #endif /* #else #if defined(CONFIG_HOTPLUG_CPU) && defined(CONFIG_PROVE_RCU) */
> 
> -#ifdef CONFIG_DEBUG_LOCK_ALLOC
> -
> -static inline void rcu_lock_acquire(struct lockdep_map *map)
> +static inline void __rcu_lock_acquire(struct lockdep_map *map, unsigned long ip)
>  {
> - lock_acquire(map, 0, 0, 2, 0, NULL, _THIS_IP_);
> + lock_acquire(map, 0, 0, 2, 0, NULL, ip);
>  }
> 
> -static inline void rcu_lock_release(struct lockdep_map *map)
> +static inline void __rcu_lock_release(struct lockdep_map *map, unsigned long ip)
>  {
>   lock_release(map, 1, _THIS_IP_);
>  }
> 
> +#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PROVE_RCU)
> +extern void rcu_lock_acquire(void);
> +extern void rcu_lock_release(void);
> +extern void rcu_lock_acquire_bh(void);
> +extern void rcu_lock_release_bh(void);
> +extern void rcu_lock_acquire_sched(void);
> +extern void rcu_lock_release_sched(void);
> +#else
> +#define rcu_lock_acquire()   do { } while (0)
> +#define rcu_lock_release()   do { } while (0)
> +#define rcu_lock_acquire_bh()do { } while (0)
> +#define rcu_lock_release_bh()do { } while (0)
> +#define rcu_lock_acquire_sched() do { } while (0)
> +#define rcu_lock_release_sched() do { } while (0)
> +#endif
> +
> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> +
>  extern struct lockdep_map rcu_lock_map;
>  extern struct lockdep_map rcu_bh_lock_map;
>  extern struct lockdep_map rcu_sched_lock_map;
> @@ -419,9 +435,6 @@ static inline int rcu_read_lock_sched_held(void)
> 
>  #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> 
> -# define rcu_lock_acquire(a) do { } while (0)
> -# define rcu_lock_release(a) do { } while (0)
> -
>  static inline int rcu_read_lock_held(void)
>  {
>   return 1;
> @@ -766,11 +779,9 @@ static inline void rcu_preempt_sleep_check(void)
>   */
>  static inline void rcu_read_lock(void)
>  {
> - __rcu_read_lock();
>   __acquire(RCU);
> - rcu_lock_acquire(&rcu_lock_map);
> - rcu_lockdep_assert(rcu_is_watching(),
> -"rcu_read_lock() used illegally while idle");
> + __rcu_read_lock();
> + rcu_lock_acquire();

Not sure why __rcu_read_lock() needs to be in any particular order
with respect to the sparse __acquire(RCU), but should work either way.
Same question about the other reorderings of similar statements.

>  }
> 
>  /*
> @@ -790,11 +801,9 @@ static inline void rcu_read_lock(void)
>   */
>  static inline void rcu_read_unlock(void)
>  {
> - rcu_lockdep_assert(rcu_is_watching(),
> -"rcu_read_unlock() used illegally while idle");
> - rcu_lock_release(&rcu_lock_map);
> - __release(RCU);
> + rcu_lock_release();
>   __rcu_read_unlock();
> + __release(RCU);
>  }
> 
>  /**
> @@ -816,11 +825,9 @@ static inline void rcu_read_unlock(void)
>   */
>  static inline void rcu_read_lock_bh(void)
>  {
> - local_bh_disable();
>   __acquire(RCU_BH);
> - rcu_lock_acquire(&rcu_bh_lock_map);
> - rcu_lockdep_assert(rcu_is_watching(),
> -"rcu_read_lock_bh() used illegally while idle");
> + local_bh_disable();
> + rcu_lock_acquire_bh();
>  }
> 
>  /*
> @@ -830,11 +837,9 @@ static inline void rcu_read_lock_bh(void)
>   */
>  static inline void rcu_read_unlock_bh(void)
>  {
> - rcu_lockdep_assert(rcu_is_watching(),
> -"rcu_read_unlock_bh() used illegally while idle");
> - rcu_lock_release(&rcu_bh_lock_map);
> - __release(RCU_BH);
> + rcu_lock_release_bh();
>   local_bh_enable();
> + __release(RCU_BH);
>  }
> 
>  /**
> @@ -852,9 +857,9 @@ static inline void rcu_read_unlock_bh(void)
>   */
>  static inline void rcu_read_lock_sched(void)
>  {
> - preempt_disable();
>   __acquire(RCU_SCHED);
> - rcu_lock_acquire(&rcu_sched_lock_map);
> + preempt_disable();
> + rcu_lock_acquire_sched();
>   rcu_lockdep_assert(rcu_is_watching(),
>  "rcu_read_lock_sched() used illegally while idle");

The above pair of l
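
The quoted patch turns per-call-site inline lockdep hooks into extern functions so that the body is emitted once. The out-of-line side is not visible in the quoted text, so the following is only a userspace sketch of the general pattern under stated assumptions: every `sketch_*` name is hypothetical, the map structure is a toy stand-in for `struct lockdep_map`, and `__builtin_return_address(0)` (a GCC/Clang builtin) is assumed as the way to recover the caller's address.

```c
/* Hypothetical stand-in for struct lockdep_map; it only records calls. */
struct sketch_map {
	unsigned long acquires;
	unsigned long last_ip;
};

static struct sketch_map sketch_rcu_map;

/* What a header would declare in place of the old inline function. */
extern void sketch_rcu_lock_acquire(void);

/* Inline helper: the caller supplies the instruction pointer to record. */
static inline void sketch_lock_acquire(struct sketch_map *map,
				       unsigned long ip)
{
	map->acquires++;
	map->last_ip = ip;
}

/*
 * Out-of-line wrapper: one copy of the body in the binary instead of one
 * per call site. The return address approximates the call-site address
 * that an inlined "this instruction pointer" macro would have recorded.
 */
__attribute__((noinline)) void sketch_rcu_lock_acquire(void)
{
	sketch_lock_acquire(&sketch_rcu_map,
			    (unsigned long)__builtin_return_address(0));
}
```

The trade-off in this sketch is an extra function call on each acquire in exchange for smaller text, which is the kind of saving the size figures quoted above are measuring.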

[GIT PULL] audit subsystem for 3.14

2014-01-21 Thread Eric Paris

Linus,

Please consider pulling the following audit changes.  Again we stayed
pretty well contained inside the audit system.  Venturing out was fixing
a couple of function prototypes which were inconsistent (didn't hurt
anything, but we used the same value as an int, uint, u32, and I think
even a long in a couple of places).  We also made a couple of minor
changes to when a couple of LSMs called the audit system.  We hoped to
add aarch64 audit support this go round, but it wasn't ready.

There is one merge issue.  Take your code, then convert the prototype
for the first 4 functions changing the "u32 ses" to "unsigned int ses".
(Do not change the u32 secid)

I'm disappearing on vacation on Thursday.  I should have internet
access, but it'll be spotty.  If anything goes wrong please be sure to
cc r...@redhat.com.  He'll make fixing things his top priority.

-Eric

The following changes since commit fc582aef7dcc27a7120cf232c1e76c569c7b6eab:

  Merge tag 'v3.12' (2013-11-22 18:57:54 -0500)

are available in the git repository at:


  git://git.infradead.org/users/eparis/audit.git master

for you to fetch changes up to f3411cb2b2e396a41ed3a439863f028db7140a34:

  audit: whitespace fix in kernel-parameters.txt (2014-01-17 17:15:02 -0500)


AKASHI Takahiro (2):
  audit: correct a type mismatch in audit_syscall_exit()
  audit: Modify a set of system calls in audit class definitions

Dan Duval (2):
  audit: efficiency fix 1: only wake up if queue shorter than backlog limit
  audit: efficiency fix 2: request exclusive wait since all need same resource

Eric Paris (8):
  audit: convert all sessionid declaration to unsigned int
  audit: wait_for_auditd rework for readability
  audit: documentation of audit= kernel parameter
  audit: use define's for audit version
  audit: remove needless switch in AUDIT_SET
  audit: rework AUDIT_TTY_SET to only grab spin_lock once
  audit: reorder AUDIT_TTY_SET arguments
  audit: remove pr_info for every network namespace

Eric W. Biederman (1):
  audit: Simplify and correct audit_log_capset

Gao feng (7):
  audit: remove useless code in audit_enable
  audit: fix incorrect order of log new and old feature
  audit: don't generate audit feature changed log when audit disabled
  audit: use old_lock in audit_set_feature
  audit: don't generate loginuid log when audit disabled
  audit: print error message when fail to create audit socket
  audit: fix incorrect set of audit_sock

Joe Perches (3):
  audit: Use hex_byte_pack_upper
  audit: Use more current logging style
  audit: Convert int limit uses to u32

Paul Davies C (2):
  audit: drop audit_log_abend()
  audit: Added exe field to audit core dump signal log

Richard Guy Briggs (24):
  audit: fix netlink portid naming and types
  audit: restore order of tty and ses fields in log output
  audit: listen in all network namespaces
  audit: reset audit backlog wait time after error recovery
  audit: make use of remaining sleep time from wait_for_auditd
  documentation: document the audit= kernel start-up parameter
  audit: add kernel set-up parameter to override default backlog limit
  audit: clean up AUDIT_GET/SET local variables and future-proof API
  audit: add audit_backlog_wait_time configuration option
  audit: fix incorrect type of sessionid
  audit: allow unlimited backlog queue
  audit: get rid of *NO* daemon at audit_pid=0 message
  audit: log AUDIT_TTY_SET config changes
  audit: refactor audit_receive_msg() to clarify AUDIT_*_RULE* cases
  audit: prevent an older auditd shutdown from orphaning a newer auditd startup
  selinux: call WARN_ONCE() instead of calling audit_log_start()
  smack: call WARN_ONCE() instead of calling audit_log_start()
  audit: drop audit_cmd_lock in AUDIT_USER family of cases
  audit: log on errors from filter user rules
  audit: fix dangling keywords in audit_log_set_loginuid() output
  audit: log task info on feature change
  audit: update MAINTAINERS
  audit: fix location of __net_initdata for audit_net_ops
  audit: whitespace fix in kernel-parameters.txt

Toshiyuki Okajima (1):
  audit: audit_log_start running on auditd should not stop

 Documentation/kernel-parameters.txt |  16 ++
 MAINTAINERS |   3 +-
 drivers/tty/tty_audit.c |   2 +-
 include/asm-generic/audit_change_attr.h |   4 +-
 include/asm-generic/audit_write.h   |   6 +++
 include/linux/audit.h   |  22 
 include/linux/init_task.h   |   2 +-
 include/net/netlabel.h  |   2 +-
 include/net/xfrm.h  |  20 +++
 include/uapi/linux/audit.h  |   8 +++
 kernel/audit.c  | 365

[PATCH v5 8/8] ARM: brcmstb: dts: add a reference DTS for Broadcom 7445

2014-01-21 Thread Marc Carino

Add a sample DTS which will allow bootup of a board populated
with the BCM7445 chip.

Signed-off-by: Marc Carino 
Acked-by: Florian Fainelli 
---
 arch/arm/boot/dts/bcm7445.dts |  111 +
 1 files changed, 111 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/boot/dts/bcm7445.dts

diff --git a/arch/arm/boot/dts/bcm7445.dts b/arch/arm/boot/dts/bcm7445.dts
new file mode 100644
index 000..ffa3305
--- /dev/null
+++ b/arch/arm/boot/dts/bcm7445.dts
@@ -0,0 +1,111 @@
+/dts-v1/;
+/include/ "skeleton.dtsi"
+
+/ {
+   #address-cells = <2>;
+   #size-cells = <2>;
+   model = "Broadcom STB (bcm7445)";
+   compatible = "brcm,bcm7445", "brcm,brcmstb";
+   interrupt-parent = <&gic>;
+
+   chosen {};
+
+   memory {
+   device_type = "memory";
+   reg = <0x00 0x 0x00 0x4000>,
+ <0x00 0x4000 0x00 0x4000>,
+ <0x00 0x8000 0x00 0x4000>;
+   };
+
+   cpus {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   cpu@0 {
+   compatible = "brcm,brahma-b15";
+   device_type = "cpu";
+   reg = <0>;
+   };
+
+   cpu@1 {
+   compatible = "brcm,brahma-b15";
+   device_type = "cpu";
+   reg = <1>;
+   };
+
+   cpu@2 {
+   compatible = "brcm,brahma-b15";
+   device_type = "cpu";
+   reg = <2>;
+   };
+
+   cpu@3 {
+   compatible = "brcm,brahma-b15";
+   device_type = "cpu";
+   reg = <3>;
+   };
+   };
+
+   gic: interrupt-controller@ffd0 {
+   compatible = "brcm,brahma-b15-gic", "arm,cortex-a15-gic";
+   reg = <0x00 0xffd01000 0x00 0x1000>,
+ <0x00 0xffd02000 0x00 0x2000>,
+ <0x00 0xffd04000 0x00 0x2000>,
+ <0x00 0xffd06000 0x00 0x2000>;
+   interrupt-controller;
+   #interrupt-cells = <3>;
+   };
+
+   timer {
+   compatible = "arm,armv7-timer";
+   interrupts = <1 13 0xf08>,
+<1 14 0xf08>,
+<1 11 0xf08>,
+<1 10 0xf08>;
+   };
+
+   rdb {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "simple-bus";
+   ranges = <0 0x00 0xf000 0x100>;
+
+   serial@406b00 {
+   compatible = "ns16550a";
+   reg = <0x406b00 0x20>;
+   reg-shift = <2>;
+   reg-io-width = <4>;
+   interrupts = <0 75 0x4>;
+   clock-frequency = <0x4d3f640>;
+   };
+
+   sun_top_ctrl: syscon@404000 {
+   compatible = "brcm,bcm7445-sun-top-ctrl",
+"syscon";
+   reg = <0x404000 0x51c>;
+   };
+
+   hif_cpubiuctrl: syscon@3e2400 {
+   compatible = "brcm,bcm7445-hif-cpubiuctrl",
+"syscon";
+   reg = <0x3e2400 0x5b4>;
+   };
+
+   hif_continuation: syscon@452000 {
+   compatible = "brcm,bcm7445-hif-continuation",
+"syscon";
+   reg = <0x452000 0x100>;
+   };
+   };
+
+   smpboot {
+   compatible = "brcm,brcmstb-smpboot";
+   syscon-cpu = <&hif_cpubiuctrl 0x88 0x178>;
+   syscon-cont = <&hif_continuation>;
+   };
+
+   reboot {
+   compatible = "brcm,brcmstb-reboot";
+   syscon = <&sun_top_ctrl 0x304 0x308>;
+   };
+};
-- 
1.7.1


[PATCH v5 2/8] power: reset: Add reboot driver for brcmstb

2014-01-21 Thread Marc Carino

Add support for reboot functionality on boards with ARM-based
Broadcom STB chipsets.

Signed-off-by: Marc Carino 
---
 drivers/power/reset/Kconfig  |   10 +++
 drivers/power/reset/Makefile |1 +
 drivers/power/reset/brcmstb-reboot.c |  120 ++
 3 files changed, 131 insertions(+), 0 deletions(-)
 create mode 100644 drivers/power/reset/brcmstb-reboot.c

diff --git a/drivers/power/reset/Kconfig b/drivers/power/reset/Kconfig
index 9b3ea53..31b468b 100644
--- a/drivers/power/reset/Kconfig
+++ b/drivers/power/reset/Kconfig
@@ -6,6 +6,16 @@ menuconfig POWER_RESET
 
  Say Y here to enable board reset and power off
 
+config POWER_RESET_BRCMSTB
+   bool "Broadcom STB reset driver"
+   depends on POWER_RESET && ARCH_BRCMSTB
+   help
+ This driver provides restart support for ARM-based Broadcom STB
+ boards.
+
+ Say Y here if you have an ARM-based Broadcom STB board and you wish
+ to have restart support.
+
 config POWER_RESET_GPIO
bool "GPIO power-off driver"
depends on OF_GPIO && POWER_RESET
diff --git a/drivers/power/reset/Makefile b/drivers/power/reset/Makefile
index 3e6ed88..806d056 100644
--- a/drivers/power/reset/Makefile
+++ b/drivers/power/reset/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_POWER_RESET_QNAP) += qnap-poweroff.o
 obj-$(CONFIG_POWER_RESET_RESTART) += restart-poweroff.o
 obj-$(CONFIG_POWER_RESET_VEXPRESS) += vexpress-poweroff.o
 obj-$(CONFIG_POWER_RESET_XGENE) += xgene-reboot.o
+obj-$(CONFIG_POWER_RESET_BRCMSTB) += brcmstb-reboot.o
diff --git a/drivers/power/reset/brcmstb-reboot.c b/drivers/power/reset/brcmstb-reboot.c
new file mode 100644
index 000..3f23692
--- /dev/null
+++ b/drivers/power/reset/brcmstb-reboot.c
@@ -0,0 +1,120 @@
+/*
+ * Copyright (C) 2013 Broadcom Corporation
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define RESET_SOURCE_ENABLE_REG 1
+#define SW_MASTER_RESET_REG 2
+
+static struct regmap *regmap;
+static u32 rst_src_en;
+static u32 sw_mstr_rst;
+
+static void brcmstb_reboot(enum reboot_mode mode, const char *cmd)
+{
+   int rc;
+   u32 tmp;
+
+   rc = regmap_write(regmap, rst_src_en, 1);
+   if (rc) {
+   pr_err("failed to write rst_src_en (%d)\n", rc);
+   return;
+   }
+
+   rc = regmap_read(regmap, rst_src_en, &tmp);
+   if (rc) {
+   pr_err("failed to read rst_src_en (%d)\n", rc);
+   return;
+   }
+
+   rc = regmap_write(regmap, sw_mstr_rst, 1);
+   if (rc) {
+   pr_err("failed to write sw_mstr_rst (%d)\n", rc);
+   return;
+   }
+
+   rc = regmap_read(regmap, sw_mstr_rst, &tmp);
+   if (rc) {
+   pr_err("failed to read sw_mstr_rst (%d)\n", rc);
+   return;
+   }
+
+   while (1)
+   ;
+}
+
+static int brcmstb_reboot_probe(struct platform_device *pdev)
+{
+   int rc;
+   struct device_node *np = pdev->dev.of_node;
+
+   regmap = syscon_regmap_lookup_by_phandle(np, "syscon");
+   if (IS_ERR(regmap)) {
+   pr_err("failed to get syscon phandle\n");
+   return -EINVAL;
+   }
+
+   rc = of_property_read_u32_index(np, "syscon", RESET_SOURCE_ENABLE_REG,
+   &rst_src_en);
+   if (rc) {
+   pr_err("can't get rst_src_en offset (%d)\n", rc);
+   return -EINVAL;
+   }
+
+   rc = of_property_read_u32_index(np, "syscon", SW_MASTER_RESET_REG,
+   &sw_mstr_rst);
+   if (rc) {
+   pr_err("can't get sw_mstr_rst offset (%d)\n", rc);
+   return -EINVAL;
+   }
+
+   arm_pm_restart = brcmstb_reboot;
+
+   return 0;
+}
+
+static const struct of_device_id of_match[] = {
+   { .compatible = "brcm,brcmstb-reboot", },
+   {},
+};
+
+static struct platform_driver brcmstb_reboot_driver = {
+   .probe = brcmstb_reboot_probe,
+   .driver = {
+   .name = "brcmstb-reboot",
+   .owner = THIS_MODULE,
+   .of_match_table = of_match,
+   },
+};
+
+static int __init brcmstb_reboot_init(void)
+{
+   return platform_driver_probe(&brcmstb_reboot_driver,
+   brcmstb_reboot_probe);
+}
+subsys_initcall(brcmstb_reboot_init);
-- 
1.7.1
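
The reset sequence in brcmstb_reboot() above is: write 1 to the "reset source enable" register, read it back, write 1 to the "software master reset" register, read it back. The sketch below replays that order against an in-memory array so it can be exercised outside the kernel. It is an illustration under assumptions, not the driver: the `sketch_*` names and the array-backed register window are hypothetical, and the read-backs are presumably there to make sure each write has landed before continuing.

```c
#include <stdint.h>

/* Hypothetical register window standing in for the syscon regmap. */
#define SKETCH_REG_WORDS 0x200
static uint32_t sketch_regs[SKETCH_REG_WORDS];

/* Write one 32-bit register; rejects unaligned or out-of-range offsets. */
static int sketch_write(uint32_t offset, uint32_t val)
{
	if (offset % 4 || offset / 4 >= SKETCH_REG_WORDS)
		return -1;
	sketch_regs[offset / 4] = val;
	return 0;
}

/* Read one 32-bit register; same offset checks as sketch_write(). */
static int sketch_read(uint32_t offset, uint32_t *val)
{
	if (offset % 4 || offset / 4 >= SKETCH_REG_WORDS)
		return -1;
	*val = sketch_regs[offset / 4];
	return 0;
}

/*
 * Same order as brcmstb_reboot(): enable the reset source, read it back,
 * trigger the software master reset, read it back. Returns 0 on success
 * instead of spinning forever, so it can be called from a test.
 */
static int sketch_reboot(uint32_t rst_src_en, uint32_t sw_mstr_rst)
{
	uint32_t tmp;

	if (sketch_write(rst_src_en, 1) || sketch_read(rst_src_en, &tmp))
		return -1;
	if (sketch_write(sw_mstr_rst, 1) || sketch_read(sw_mstr_rst, &tmp))
		return -1;
	return 0;
}
```

With the offsets used in the series' example device tree, the call would be sketch_reboot(0x304, 0x308).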


[PATCH v5 1/8] ARM: brcmstb: add infrastructure for ARM-based Broadcom STB SoCs

2014-01-21 Thread Marc Carino

The BCM7xxx series of Broadcom SoCs are used primarily in set-top boxes.

This patch adds machine support for the ARM-based Broadcom SoCs.

Signed-off-by: Marc Carino 
Acked-by: Florian Fainelli 
---
 arch/arm/configs/multi_v7_defconfig |1 +
 arch/arm/mach-bcm/Kconfig   |   14 ++
 arch/arm/mach-bcm/Makefile  |4 +
 arch/arm/mach-bcm/brcmstb.c |  110 
 arch/arm/mach-bcm/brcmstb.h |   38 
 arch/arm/mach-bcm/headsmp-brcmstb.S |   34 
 arch/arm/mach-bcm/hotplug-brcmstb.c |  334 +++
 7 files changed, 535 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/mach-bcm/brcmstb.c
 create mode 100644 arch/arm/mach-bcm/brcmstb.h
 create mode 100644 arch/arm/mach-bcm/headsmp-brcmstb.S
 create mode 100644 arch/arm/mach-bcm/hotplug-brcmstb.c

diff --git a/arch/arm/configs/multi_v7_defconfig b/arch/arm/configs/multi_v7_defconfig
index c1df4e9..7028d11 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -7,6 +7,7 @@ CONFIG_MACH_ARMADA_370=y
 CONFIG_MACH_ARMADA_XP=y
 CONFIG_ARCH_BCM=y
 CONFIG_ARCH_BCM_MOBILE=y
+CONFIG_ARCH_BRCMSTB=y
 CONFIG_GPIO_PCA953X=y
 CONFIG_ARCH_HIGHBANK=y
 CONFIG_ARCH_KEYSTONE=y
diff --git a/arch/arm/mach-bcm/Kconfig b/arch/arm/mach-bcm/Kconfig
index 9fe6d88..2c1ae83 100644
--- a/arch/arm/mach-bcm/Kconfig
+++ b/arch/arm/mach-bcm/Kconfig
@@ -31,6 +31,20 @@ config ARCH_BCM_MOBILE
  BCM11130, BCM11140, BCM11351, BCM28145 and
  BCM28155 variants.
 
+config ARCH_BRCMSTB
+   bool "Broadcom BCM7XXX based boards" if ARCH_MULTI_V7
+   depends on MMU
+   select ARM_GIC
+   select MIGHT_HAVE_PCI
+   select HAVE_SMP
+   select HAVE_ARM_ARCH_TIMER
+   help
+ Say Y if you intend to run the kernel on a Broadcom ARM-based STB
+ chipset.
+
+ This enables support for Broadcom ARM-based set-top box chipsets,
+ including the 7445 family of chips.
+
 endmenu
 
 endif
diff --git a/arch/arm/mach-bcm/Makefile b/arch/arm/mach-bcm/Makefile
index c2ccd5a..b744a12 100644
--- a/arch/arm/mach-bcm/Makefile
+++ b/arch/arm/mach-bcm/Makefile
@@ -13,3 +13,7 @@
 obj-$(CONFIG_ARCH_BCM_MOBILE)  := board_bcm281xx.o bcm_kona_smc.o bcm_kona_smc_asm.o kona.o
 plus_sec := $(call as-instr,.arch_extension sec,+sec)
 AFLAGS_bcm_kona_smc_asm.o  :=-Wa,-march=armv7-a$(plus_sec)
+
+obj-$(CONFIG_ARCH_BRCMSTB) := brcmstb.o
+obj-$(CONFIG_SMP)  += headsmp-brcmstb.o
+obj-$(CONFIG_HOTPLUG_CPU)  += hotplug-brcmstb.o
diff --git a/arch/arm/mach-bcm/brcmstb.c b/arch/arm/mach-bcm/brcmstb.c
new file mode 100644
index 000..7a6093d
--- /dev/null
+++ b/arch/arm/mach-bcm/brcmstb.c
@@ -0,0 +1,110 @@
+/*
+ * Copyright (C) 2013 Broadcom Corporation
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "brcmstb.h"
+
+/***
+ * STB CPU (main application processor)
+ ***/
+
+static const char *brcmstb_match[] __initconst = {
+   "brcm,bcm7445",
+   "brcm,brcmstb",
+   NULL
+};
+
+static void __init brcmstb_init_early(void)
+{
+   add_preferred_console("ttyS", 0, "115200");
+}
+
+/***
+ * SMP boot
+ ***/
+
+#ifdef CONFIG_SMP
+static DEFINE_SPINLOCK(boot_lock);
+
+static void __cpuinit brcmstb_secondary_init(unsigned int cpu)
+{
+   /*
+* Synchronise with the boot thread.
+*/
+   spin_lock(&boot_lock);
+   spin_unlock(&boot_lock);
+}
+
+static int __cpuinit brcmstb_boot_secondary(unsigned int cpu,
+   struct task_struct *idle)
+{
+   /*
+* set synchronisation state between this boot processor
+* and the secondary one
+*/
+   spin_lock(&boot_lock);
+
+   /* Bring up power to the core if necessary */
+   if (brcmstb_cpu_get_power_state(cpu) == 0)
+   brcmstb_cpu_power_on(cpu);
+
+   brcmstb_cpu_boot(cpu);
+
+   /*
+* now the secondary core is starting up let it run its
+* calibrations, then wait for it to finish
+*/
+   spin_unlock(&boot_lock);
+
+   return 0;
+}
+

[PATCH v5 5/8] ARM: brcmstb: add CPU binding for Broadcom Brahma15

2014-01-21 Thread Marc Carino

Add the Broadcom Brahma B15 CPU to the DT CPU binding list.

Signed-off-by: Marc Carino 
Acked-by: Florian Fainelli 
---
 Documentation/devicetree/bindings/arm/cpus.txt |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt
index 9130435..0cd1e25 100644
--- a/Documentation/devicetree/bindings/arm/cpus.txt
+++ b/Documentation/devicetree/bindings/arm/cpus.txt
@@ -163,6 +163,7 @@ nodes to be present and contain the properties described below.
"arm,cortex-r4"
"arm,cortex-r5"
"arm,cortex-r7"
+   "brcm,brahma-b15"
"faraday,fa526"
"intel,sa110"
"intel,sa1100"
-- 
1.7.1


[PATCH v5 6/8] ARM: brcmstb: add misc. DT bindings for brcmstb

2014-01-21 Thread Marc Carino

Document the bindings that the Broadcom STB platform needs
for proper bootup.

Signed-off-by: Marc Carino 
Acked-by: Florian Fainelli 
---
 .../devicetree/bindings/arm/brcm-brcmstb.txt   |   95 
 1 files changed, 95 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/brcm-brcmstb.txt

diff --git a/Documentation/devicetree/bindings/arm/brcm-brcmstb.txt b/Documentation/devicetree/bindings/arm/brcm-brcmstb.txt
new file mode 100644
index 000..3c436cc
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/brcm-brcmstb.txt
@@ -0,0 +1,95 @@
+ARM Broadcom STB platforms Device Tree Bindings
+---
+Boards with Broadcom Brahma15 ARM-based BCM (generally BCM7xxx variants)
+SoC shall have the following DT organization:
+
+Required root node properties:
+- compatible: "brcm,bcm", "brcm,brcmstb"
+
+example:
+/ {
+#address-cells = <2>;
+#size-cells = <2>;
+model = "Broadcom STB (bcm7445)";
+compatible = "brcm,bcm7445", "brcm,brcmstb";
+
+Further, syscon nodes that map platform-specific registers used for general
+system control are required:
+
+- compatible: "brcm,bcm-sun-top-ctrl", "syscon"
+- compatible: "brcm,bcm-hif-cpubiuctrl", "syscon"
+- compatible: "brcm,bcm-hif-continuation", "syscon"
+
+example:
+rdb {
+#address-cells = <1>;
+#size-cells = <1>;
+compatible = "simple-bus";
+ranges = <0 0x00 0xf000 0x100>;
+
+sun_top_ctrl: syscon@404000 {
+compatible = "brcm,bcm7445-sun-top-ctrl", "syscon";
+reg = <0x404000 0x51c>;
+};
+
+hif_cpubiuctrl: syscon@3e2400 {
+compatible = "brcm,bcm7445-hif-cpubiuctrl", "syscon";
+reg = <0x3e2400 0x5b4>;
+};
+
+hif_continuation: syscon@452000 {
+compatible = "brcm,bcm7445-hif-continuation", "syscon";
+reg = <0x452000 0x100>;
+};
+};
+
+Lastly, nodes that allow for support of SMP initialization and reboot are
+required:
+
+smpboot
+---
+Required properties:
+
+- compatible
+The string "brcm,brcmstb-smpboot".
+
+- syscon-cpu
+A phandle / integer array property which lets the BSP know the location
+of certain CPU power-on registers.
+
+The layout of the property is as follows:
+o a phandle to the "hif_cpubiuctrl" syscon node
+o offset to the base CPU power zone register
+o offset to the base CPU reset register
+
+- syscon-cont
+A phandle pointing to the syscon node which describes the CPU boot
+continuation registers.
+o a phandle to the "hif_continuation" syscon node
+
+example:
+smpboot {
+compatible = "brcm,brcmstb-smpboot";
+syscon-cpu = <&hif_cpubiuctrl 0x88 0x178>;
+syscon-cont = <&hif_continuation>;
+};
+
+reboot
+---
+Required properties
+
+- compatible
+The string property "brcm,brcmstb-reboot".
+
+- syscon
+A phandle / integer array that points to the syscon node which describes
+the general system reset registers.
+o a phandle to "sun_top_ctrl"
+o offset to the "reset source enable" register
+o offset to the "software master reset" register
+
+example:
+reboot {
+compatible = "brcm,brcmstb-reboot";
+syscon = <&sun_top_ctrl 0x304 0x308>;
+};
-- 
1.7.1


[PATCH v5 3/8] ARM: brcmstb: add debug UART for earlyprintk support

2014-01-21 Thread Marc Carino

Add the UART definitions needed to support earlyprintk on brcmstb machines.

Signed-off-by: Marc Carino 
Acked-by: Florian Fainelli 
---
 arch/arm/Kconfig.debug |   16 +++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
index 5765abf..666afd7 100644
--- a/arch/arm/Kconfig.debug
+++ b/arch/arm/Kconfig.debug
@@ -94,6 +94,17 @@ choice
depends on ARCH_BCM2835
select DEBUG_UART_PL01X
 
+   config DEBUG_BRCMSTB_UART
+   bool "Use BRCMSTB UART for low-level debug"
+   depends on ARCH_BRCMSTB
+   select DEBUG_UART_8250
+   help
+ Say Y here if you want the debug print routines to direct
+ their output to the first serial port on these devices.
+
+ If you have a Broadcom STB chip and would like early print
+ messages to appear over the UART, select this option.
+
config DEBUG_CLPS711X_UART1
bool "Kernel low-level debugging messages via UART1"
depends on ARCH_CLPS711X
@@ -1008,6 +1019,7 @@ config DEBUG_UART_PHYS
default 0xd4018000 if DEBUG_MMP_UART3
default 0xe000 if ARCH_SPEAR13XX
default 0xfbe0 if ARCH_EBSA110
+   default 0xf0406b00 if DEBUG_BRCMSTB_UART
default 0xf1012000 if DEBUG_MVEBU_UART_ALTERNATE
default 0xf1012000 if ARCH_DOVE || ARCH_KIRKWOOD || ARCH_MV78XX0 || \
ARCH_ORION5X
@@ -1040,6 +1052,7 @@ config DEBUG_UART_VIRT
default 0xf809 if DEBUG_VEXPRESS_UART0_RS1
default 0xfb009000 if DEBUG_REALVIEW_STD_PORT
default 0xfb10c000 if DEBUG_REALVIEW_PB1176_PORT
+   default 0xfc406b00 if DEBUG_BRCMSTB_UART
default 0xfd00 if ARCH_SPEAR3XX || ARCH_SPEAR6XX
default 0xfd00 if ARCH_SPEAR13XX
default 0xfd012000 if ARCH_MV78XX0
@@ -1091,7 +1104,8 @@ config DEBUG_UART_8250_WORD
default y if DEBUG_PICOXCELL_UART || DEBUG_SOCFPGA_UART || \
ARCH_KEYSTONE || \
DEBUG_DAVINCI_DMx_UART0 || DEBUG_DAVINCI_DA8XX_UART1 || \
-   DEBUG_DAVINCI_DA8XX_UART2 || DEBUG_DAVINCI_TNETV107X_UART1
+   DEBUG_DAVINCI_DA8XX_UART2 || DEBUG_DAVINCI_TNETV107X_UART1 || \
+   DEBUG_BRCMSTB_UART
 
 config DEBUG_UART_8250_FLOW_CONTROL
bool "Enable flow control for 8250 UART"
-- 
1.7.1


[PATCH v5 7/8] ARM: brcmstb: gic: add compatible string for Broadcom Brahma15

2014-01-21 Thread Marc Carino

Document the Broadcom Brahma B15 GIC implementation as compatible
with the ARM GIC standard.

Signed-off-by: Marc Carino 
Acked-by: Florian Fainelli 
---
 Documentation/devicetree/bindings/arm/gic.txt |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/gic.txt b/Documentation/devicetree/bindings/arm/gic.txt
index 3dfb0c0..d7409fd 100644
--- a/Documentation/devicetree/bindings/arm/gic.txt
+++ b/Documentation/devicetree/bindings/arm/gic.txt
@@ -15,6 +15,7 @@ Main node required properties:
"arm,cortex-a9-gic"
"arm,cortex-a7-gic"
"arm,arm11mp-gic"
+   "brcm,brahma-b15-gic"
 - interrupt-controller : Identifies the node as an interrupt controller
 - #interrupt-cells : Specifies the number of cells needed to encode an
   interrupt source.  The type shall be a  and the value shall be 3.
-- 
1.7.1


[PATCH v5 0/8] ARM: brcmstb: Add Broadcom STB SoC support

2014-01-21 Thread Marc Carino

This patchset contains the board support package for the
Broadcom BCM7445 ARM-based SoC [1]. These changes contain a
minimal set of code needed for a BCM7445-based board to boot
the Linux kernel.

These changes heavily leverage the OF/devicetree framework.

v5:
- rebased to v3.13 tag
- make UART DT node a child of 'rdb' node
- fix ordering of debug UART entries

v4:
- make a reboot driver and put it in the drivers folder
- rework DT bindings to leverage 'syscon'
- rework BSP code to use 'syscon' for all register mappings
- misc. tweaks per suggestions from v3

v3:
- rebased to v3.13-rc8
- switched to using 'multi_v7_defconfig'
- eliminated dependence on compile-time peripheral register access
- moved DT node iomap out from 'init_early'
- misc. minor cleanups from mailing-list discussion for v2

v2:
- rebased to v3.13-rc1
- moved implementation to 'mach-bcm' folder
- added CPU init for B15

v1:
- initial submission

[1] http://www.broadcom.com/products/Cable/Cable-Set-Top-Box-Solutions/BCM7445

Marc Carino (8):
  ARM: brcmstb: add infrastructure for ARM-based Broadcom STB SoCs
  power: reset: Add reboot driver for brcmstb
  ARM: brcmstb: add debug UART for earlyprintk support
  ARM: do CPU-specific init for Broadcom Brahma15 cores
  ARM: brcmstb: add CPU binding for Broadcom Brahma15
  ARM: brcmstb: add misc. DT bindings for brcmstb
  ARM: brcmstb: gic: add compatible string for Broadcom Brahma15
  ARM: brcmstb: dts: add a reference DTS for Broadcom 7445

 .../devicetree/bindings/arm/brcm-brcmstb.txt   |   95 ++
 Documentation/devicetree/bindings/arm/cpus.txt |1 +
 Documentation/devicetree/bindings/arm/gic.txt  |1 +
 arch/arm/Kconfig.debug |   16 +-
 arch/arm/boot/dts/bcm7445.dts  |  111 +++
 arch/arm/configs/multi_v7_defconfig|1 +
 arch/arm/mach-bcm/Kconfig  |   14 +
 arch/arm/mach-bcm/Makefile |4 +
 arch/arm/mach-bcm/brcmstb.c|  110 +++
 arch/arm/mach-bcm/brcmstb.h|   38 +++
 arch/arm/mach-bcm/headsmp-brcmstb.S|   34 ++
 arch/arm/mach-bcm/hotplug-brcmstb.c|  334 
 arch/arm/mm/proc-v7.S  |   11 +
 drivers/power/reset/Kconfig|   10 +
 drivers/power/reset/Makefile   |1 +
 drivers/power/reset/brcmstb-reboot.c   |  120 +++
 16 files changed, 900 insertions(+), 1 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/brcm-brcmstb.txt
 create mode 100644 arch/arm/boot/dts/bcm7445.dts
 create mode 100644 arch/arm/mach-bcm/brcmstb.c
 create mode 100644 arch/arm/mach-bcm/brcmstb.h
 create mode 100644 arch/arm/mach-bcm/headsmp-brcmstb.S
 create mode 100644 arch/arm/mach-bcm/hotplug-brcmstb.c
 create mode 100644 drivers/power/reset/brcmstb-reboot.c


[PATCH v5 4/8] ARM: do CPU-specific init for Broadcom Brahma15 cores

2014-01-21 Thread Marc Carino

Perform any CPU-specific initialization required on the
Broadcom Brahma-15 core.

Signed-off-by: Marc Carino 
Acked-by: Florian Fainelli 
---
 arch/arm/mm/proc-v7.S |   11 +++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index bd17819..98ea423 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -193,6 +193,7 @@ __v7_cr7mp_setup:
b   1f
 __v7_ca7mp_setup:
 __v7_ca15mp_setup:
+__v7_b15mp_setup:
mov r10, #0
 1:
 #ifdef CONFIG_SMP
@@ -494,6 +495,16 @@ __v7_ca15mp_proc_info:
.size   __v7_ca15mp_proc_info, . - __v7_ca15mp_proc_info
 
/*
+* Broadcom Corporation Brahma-B15 processor.
+*/
+   .type   __v7_b15mp_proc_info, #object
+__v7_b15mp_proc_info:
+   .long   0x420f00f0
+   .long   0xff00
+   __v7_proc __v7_b15mp_setup, hwcaps = HWCAP_IDIV
+   .size   __v7_b15mp_proc_info, . - __v7_b15mp_proc_info
+
+   /*
 * Qualcomm Inc. Krait processors.
 */
.type   __krait_proc_info, #object
-- 
1.7.1


Re: [BISECTED] Linux 3.12.7 introduces page map handling regression

2014-01-21 Thread Steven Noonan

On Tue, Jan 21, 2014 at 06:47:07PM -0800, Linus Torvalds wrote:
> On Tue, Jan 21, 2014 at 5:49 PM, Greg Kroah-Hartman
>  wrote:
> >
> > Odds are this also shows up in 3.13, right?

Reproduced using 3.13 on the PV guest:

[  368.756763] BUG: Bad page map in process mp  pte:8004a67c6165 pmd:e9b706067
[  368.756777] page:ea001299f180 count:0 mapcount:-1 mapping:          (null) index:0x0
[  368.756781] page flags: 0x2f8014(referenced|dirty)
[  368.756786] addr:7fd1388b7000 vm_flags:00100071 anon_vma:880e9ba15f80 mapping:  (null) index:7fd1388b7
[  368.756792] CPU: 29 PID: 618 Comm: mp Not tainted 3.13.0-ec2 #1
[  368.756795]  880e9b718958 880e9eaf3cc0 814d8748 7fd1388b7000
[  368.756803]  880e9eaf3d08 8116d289
[  368.756809]  880e9b7065b8 ea001299f180 7fd1388b8000 880e9eaf3e30
[  368.756815] Call Trace:
[  368.756825]  [] dump_stack+0x45/0x56
[  368.756833]  [] print_bad_pte+0x229/0x250
[  368.756837]  [] unmap_single_vma+0x583/0x890
[  368.756842]  [] unmap_vmas+0x65/0x90
[  368.756847]  [] unmap_region+0xac/0x120
[  368.756852]  [] ? vma_rb_erase+0x1c9/0x210
[  368.756856]  [] do_munmap+0x280/0x370
[  368.756860]  [] vm_munmap+0x41/0x60
[  368.756864]  [] SyS_munmap+0x22/0x30
[  368.756869]  [] system_call_fastpath+0x1a/0x1f
[  368.756872] Disabling lock debugging due to kernel taint
[  368.760084] BUG: Bad rss-counter state mm:880e9d079680 idx:0 val:-1
[  368.760091] BUG: Bad rss-counter state mm:880e9d079680 idx:1 val:1

> 
> Probably. I don't have a Xen PV setup to test with (and very little
> interest in setting one up).. And I have a suspicion that it might not
> be so much about Xen PV, as perhaps about the kind of hardware.
> 
> I suspect the issue has something to do with the magic _PAGE_NUMA
> tie-in with _PAGE_PRESENT. And then mprotect(PROT_NONE) ends up
> removing the _PAGE_PRESENT bit, and now the crazy numa code is
> confused.
> 
> The whole _PAGE_NUMA thing is a f*cking horrible hack, and shares the
> bit with _PAGE_PROTNONE, which is why it then has that tie-in to
> _PAGE_PRESENT.
> 
> Adding Andrea to the Cc, because he's the author of that horridness.
> Putting Steven's test-case here as an attachment for Andrea, maybe
> that makes him go "Ahh, yes, silly case".
> 
> Also added Kirill, because he was involved in the last _PAGE_NUMA debacle.
> 
> Andrea, you can find the thread on lkml, but it boils down to commit
> 1667918b6483 (backported to 3.12.7 as 3d792d616ba4) breaking the
> attached test-case (but apparently only under Xen PV). There it
> apparently causes a "BUG: Bad page map .." error.
> 
> And I suspect this is another of those "this bug is only visible on
> real numa machines, because _PAGE_NUMA isn't actually ever set
> otherwise". That has pretty much guaranteed that it gets basically
> zero testing, which is not a great idea when coupled with that subtle
> sharing of the _PAGE_PROTNONE bit..
> 
> It may be that the whole "Xen PV" thing is a red herring, and that
> Steven only sees it on that one machine because the one he runs as a
> PV guest under is a real NUMA machine, and all the other machines he
> has tried it on haven't been numa. So it *may* be that that "only
> under Xen PV" is a red herring. But that's just a possible guess.

The PV and HVM guests are both on NUMA hosts, but we don't expose NUMA to the
PV guest, so it fakes a NUMA node at startup.

I've also tried running a PV guest on a dual socket host with interleaved
memory:

# dmesg | grep -i -e numa -e node
[0.00] NUMA turned off
[0.00] Faking a node at [mem 0x-0x0005607f]
[0.00] Initmem setup node 0 [mem 0x-0x5607f]
[0.00]   NODE_DATA [mem 0x55d4f2000-0x55d518fff]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x1000-0x0009]
[0.00]   node   0: [mem 0x0010-0x5607f]
[0.00] On node 0 totalpages: 5638047
[0.00] setup_percpu: NR_CPUS:4096 nr_cpumask_bits:16 nr_cpu_ids:16 nr_node_ids:1
[0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=16, Nodes=1
[0.010697] Inode-cache hash table entries: 2097152 (order: 12, 16777216 bytes)
# dmesg | tail -n 21
[  348.467265] BUG: Bad page map in process t  pte:80008a6ef165 pmd:53aa39067
[  348.467280] page:ea000229bbc0 count:0 mapcount:-1 mapping:          (null) index:0x0
[  348.467286] page flags: 0x1ffc14(referenced|dirty)
[  348.467293] addr:7f8c9fca vm_flags:00100071 anon_vma:88053aff19c0 mapping:  (

[PATCH 1/2] net: dm9000: Read GPR, modify and write

2014-01-21 Thread Chris Ruehl

The GPR register should be read, modified, and written back to
activate the PHY. Simply writing 0 to the GPR might overwrite
other register bits that need to be kept.
Also fix some coding-style issues (mostly leading spaces).

Signed-off-by: Chris Ruehl 
---
 drivers/net/ethernet/davicom/dm9000.c |   23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/davicom/dm9000.c b/drivers/net/ethernet/davicom/dm9000.c
index 7080ad6..0349b91 100644
--- a/drivers/net/ethernet/davicom/dm9000.c
+++ b/drivers/net/ethernet/davicom/dm9000.c
@@ -745,9 +745,9 @@ static const struct ethtool_ops dm9000_ethtool_ops = {
.get_link   = dm9000_get_link,
.get_wol= dm9000_get_wol,
.set_wol= dm9000_set_wol,
-   .get_eeprom_len = dm9000_get_eeprom_len,
-   .get_eeprom = dm9000_get_eeprom,
-   .set_eeprom = dm9000_set_eeprom,
+   .get_eeprom_len = dm9000_get_eeprom_len,
+   .get_eeprom = dm9000_get_eeprom,
+   .set_eeprom = dm9000_set_eeprom,
 };
 
 static void dm9000_show_carrier(board_info_t *db,
@@ -795,7 +795,7 @@ dm9000_poll_work(struct work_struct *w)
}
} else
mii_check_media(&db->mii, netif_msg_link(db), 0);
-   
+
if (netif_running(ndev))
dm9000_schedule_poll(db);
 }
@@ -1286,6 +1286,7 @@ dm9000_open(struct net_device *dev)
 {
board_info_t *db = netdev_priv(dev);
unsigned long irqflags = db->irq_res->flags & IRQF_TRIGGER_MASK;
+   int gprval;
 
if (netif_msg_ifup(db))
dev_dbg(db->dev, "enabling %s\n", dev->name);
@@ -1298,9 +1299,15 @@ dm9000_open(struct net_device *dev)
 
irqflags |= IRQF_SHARED;
 
+   gprval = ior(db, DM9000_GPR);
+
/* GPIO0 on pre-activate PHY, Reg 1F is not set by reset */
-   iow(db, DM9000_GPR, 0); /* REG_1F bit0 activate phyxcer */
-   mdelay(1); /* delay needs by DM9000B */
+   if (gprval & (1<<0)) {
+   dev_dbg(db->dev, "Activate PHY GPR: 0x%x\n", gprval);
+   gprval = gprval & ~(1<<0);
+   iow(db, DM9000_GPR, gprval);/* REG_1F bit0 activate phyxcer */
+   mdelay(1); /* delay needs by DM9000B */
+   }
 
/* Initialize DM9000 board */
dm9000_reset(db);
@@ -1314,7 +1321,7 @@ dm9000_open(struct net_device *dev)
 
mii_check_media(&db->mii, netif_msg_link(db), 1);
netif_start_queue(dev);
-   
+
dm9000_schedule_poll(db);
 
return 0;
@@ -1628,7 +1635,7 @@ dm9000_probe(struct platform_device *pdev)
 
if (!is_valid_ether_addr(ndev->dev_addr)) {
/* try reading from mac */
-   
+
mac_src = "chip";
for (i = 0; i < 6; i++)
ndev->dev_addr[i] = ior(db, i+DM9000_PAR);
-- 
1.7.10.4
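
The read-modify-write pattern this patch describes can be sketched in plain C outside the kernel. This is only an illustration under stated assumptions, not the driver code: `fake_regs`, `reg_read`, `reg_write`, and `activate_phy` are hypothetical stand-ins for the chip's register file and the driver's `ior()`/`iow()` helpers. Only the handling of bit 0 in REG_1F mirrors the patch.

```c
#include <stdint.h>

#define GPR_INDEX   0x1f        /* REG_1F, the general purpose register */
#define GPR_PHY_PD  (1u << 0)   /* bit 0 set: PHY is powered down */

/* Hypothetical register file standing in for the real chip. */
static uint8_t fake_regs[256];

static uint8_t reg_read(unsigned int reg)
{
	return fake_regs[reg & 0xff];
}

static void reg_write(unsigned int reg, uint8_t val)
{
	fake_regs[reg & 0xff] = val;
}

/*
 * Clear only the PHY power-down bit and leave every other GPR bit
 * untouched. Returns 1 if a write was needed, 0 if the PHY was
 * already active.
 */
static int activate_phy(void)
{
	uint8_t gpr = reg_read(GPR_INDEX);

	if (!(gpr & GPR_PHY_PD))
		return 0;

	reg_write(GPR_INDEX, gpr & (uint8_t)~GPR_PHY_PD);
	return 1;
}
```

A blind write of 0 would also clear the other GPR bits, which is the side effect the patch avoids.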


[PATCH 2/2] net: dm9000: Only call PHY reset for TYPE-B on shutdown

2014-01-21 Thread Chris Ruehl

An unconditional PHY reset can cause link detection to fail on the
DM9000A after a reboot; only a hard reset recovers it.
This patch checks the chip version and issues the PHY reset
only for the B version of the chip.

Signed-off-by: Chris Ruehl 
---
 drivers/net/ethernet/davicom/dm9000.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/davicom/dm9000.c b/drivers/net/ethernet/davicom/dm9000.c
index 0349b91..55a2e9c 100644
--- a/drivers/net/ethernet/davicom/dm9000.c
+++ b/drivers/net/ethernet/davicom/dm9000.c
@@ -1333,7 +1333,8 @@ dm9000_shutdown(struct net_device *dev)
board_info_t *db = netdev_priv(dev);
 
/* RESET device */
-   dm9000_phy_write(dev, 0, MII_BMCR, BMCR_RESET); /* PHY RESET */
+   if (db->type == TYPE_DM9000B)
+   dm9000_phy_write(dev, 0, MII_BMCR, BMCR_RESET); /* PHY RESET */
iow(db, DM9000_GPR, 0x01);  /* Power-Down PHY */
iow(db, DM9000_IMR, IMR_PAR);   /* Disable all interrupt */
iow(db, DM9000_RCR, 0x00);  /* Disable RX */
-- 
1.7.10.4


[PATCH 2/2 v2] imx27: pinctrl: fix offset calculation in imx_read_2bit

2014-01-21 Thread Chris Ruehl

The offset for the 2-bit register fields was calculated incorrectly;
this patch fixes the problem. The debugfs printout for oconf, iconfa,
iconfb now shows the real values.

Signed-off-by: Chris Ruehl 
---
 drivers/pinctrl/pinctrl-imx1-core.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pinctrl/pinctrl-imx1-core.c b/drivers/pinctrl/pinctrl-imx1-core.c
index 8dfc3dc..59a16b6 100644
--- a/drivers/pinctrl/pinctrl-imx1-core.c
+++ b/drivers/pinctrl/pinctrl-imx1-core.c
@@ -139,7 +139,7 @@ static int imx1_read_2bit(struct imx1_pinctrl *ipctl, unsigned int pin_id,
u32 reg_offset)
 {
void __iomem *reg = imx1_mem(ipctl, pin_id) + reg_offset;
-   int offset = pin_id % 16;
+   int offset = (pin_id % 16) * 2;
 
/* Use the next register if the pin's port pin number is >=16 */
if (pin_id % 32 >= 16)
-- 
1.7.10.4


[PATCH 1/2 v2] imx27: pinctrl: fix wrong offset to ICONFB

2014-01-21 Thread Chris Ruehl

The offset to ICONFB was incorrect; this patch sets the correct value, 0x14.
The dev_dbg in imx1_write_2bit printed the wrong address, so it has been
moved after the address calculation.

Signed-off-by: Chris Ruehl 
---
 drivers/pinctrl/pinctrl-imx1-core.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/pinctrl/pinctrl-imx1-core.c b/drivers/pinctrl/pinctrl-imx1-core.c
index f77914a..8dfc3dc 100644
--- a/drivers/pinctrl/pinctrl-imx1-core.c
+++ b/drivers/pinctrl/pinctrl-imx1-core.c
@@ -45,7 +45,7 @@ struct imx1_pinctrl {
 #define MX1_DDIR 0x00
 #define MX1_OCR 0x04
 #define MX1_ICONFA 0x0c
-#define MX1_ICONFB 0x10
+#define MX1_ICONFB 0x14
 #define MX1_GIUS 0x20
 #define MX1_GPR 0x38
 #define MX1_PUEN 0x40
@@ -97,13 +97,13 @@ static void imx1_write_2bit(struct imx1_pinctrl *ipctl, unsigned int pin_id,
u32 old_val;
u32 new_val;
 
-   dev_dbg(ipctl->dev, "write: register 0x%p offset %d value 0x%x\n",
-   reg, offset, value);
-
/* Use the next register if the pin's port pin number is >=16 */
if (pin_id % 32 >= 16)
reg += 0x04;
 
+   dev_dbg(ipctl->dev, "write: register 0x%p offset %d value 0x%x\n",
+   reg, offset, value);
+
/* Get current state of pins */
old_val = readl(reg);
old_val &= mask;
-- 
1.7.10.4


Re: [patch 9/9] mm: keep page cache radix tree nodes in check

2014-01-21 Thread Dave Chinner

On Tue, Jan 21, 2014 at 12:50:17AM -0500, Johannes Weiner wrote:
> On Tue, Jan 21, 2014 at 02:03:58PM +1100, Dave Chinner wrote:
> > On Mon, Jan 20, 2014 at 06:17:37PM -0500, Johannes Weiner wrote:
> > > On Fri, Jan 17, 2014 at 11:05:17AM +1100, Dave Chinner wrote:
> > > > On Fri, Jan 10, 2014 at 01:10:43PM -0500, Johannes Weiner wrote:
> > > > > +static struct shrinker workingset_shadow_shrinker = {
> > > > > + .count_objects = count_shadow_nodes,
> > > > > + .scan_objects = scan_shadow_nodes,
> > > > > + .seeks = DEFAULT_SEEKS * 4,
> > > > > + .flags = SHRINKER_NUMA_AWARE,
> > > > > +};
> > > > 
> > > > Can you add a comment explaining how you calculated the .seeks
> > > > value? It's important to document the weightings/importance
> > > > we give to slab reclaim so we can determine if it's actually
> > > > achieving the desired balance under different loads...
> > > 
> > > This is not an exact science, to say the least.
> > 
> > I know, that's why I asked it be documented rather than be something
> > kept in your head.
> > 
> > > The shadow entries are mostly self-regulated, so I don't want the
> > > shrinker to interfere while the machine is just regularly trimming
> > > caches during normal operation.
> > > 
> > > It should only kick in when either a) reclaim is picking up and the
> > > scan-to-reclaim ratio increases due to mapped pages, dirty cache,
> > > swapping etc. or b) the number of objects compared to LRU pages
> > > becomes excessive.
> > > 
> > > I think that is what most shrinkers with an elevated seeks value want,
> > > but this translates very awkwardly (and not completely) to the current
> > > cost model, and we should probably rework that interface.
> > > 
> > > "Seeks" currently encodes 3 ratios:
> > > 
> > >   1. the cost of creating an object vs. a page
> > > 
> > >   2. the expected number of objects vs. pages
> > 
> > It doesn't encode that at all. If it did, then the default value
> > wouldn't be "2".
> >
> > >   3. the cost of reclaiming an object vs. a page
> > 
> > Which, when you consider #3 in conjunction with #1, the actual
> > intended meaning of .seeks is "the cost of replacing this object in
> > the cache compared to the cost of replacing a page cache page."
> 
> But what it actually seems to do is translate scan rate from LRU pages
> to scan rate in another object pool.  The actual replacement cost
> varies based on hotness of each set, an in-use object is more
> expensive to replace than a cold page and vice versa, the dentry and
> inode shrinkers reflect this by rotating hot objects and refusing to
> actually reclaim items while they are in active use.

Right, but so does the page cache when the page referenced bit is
seen by the LRU scanner. That's a scanned page, so what is passed to
shrink_slab is a ratio of pages scanned vs pages eligible for
reclaim. IOWs, the fact that the slab caches rotate rather than
reclaim is irrelevant - what matters is the same proportional
pressure is applied to the slab cache that was applied to the page
cache.
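
The proportional pressure described above can be sketched as plain arithmetic. This is a rough model only, assuming the scan-count calculation that shrink_slab() used in kernels of this era (the fraction of LRU pages scanned, scaled by 4 / seeks, applied to the number of freeable objects). The function name `scan_target` and its parameters are hypothetical, and the kernel's deferred-work and batching logic is left out.

```c
#include <stdint.h>

#define DEFAULT_SEEKS 2

/*
 * Number of cache objects to scan so that the object pool sees the
 * same proportional pressure as the page cache did.
 * Assumption: mirrors delta = (4 * scanned / seeks) * freeable / (lru + 1).
 */
static uint64_t scan_target(uint64_t nr_pages_scanned, uint64_t lru_pages,
			    uint64_t freeable, unsigned int seeks)
{
	uint64_t delta = (4 * nr_pages_scanned) / seeks;

	delta *= freeable;
	delta /= lru_pages + 1;
	return delta;
}
```

Under this model, raising .seeks from DEFAULT_SEEKS to DEFAULT_SEEKS * 4 cuts the scan target to a quarter for the same page cache scan activity.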

> So I am having a hard time deriving a meaningful value out of this
> definition for my usecase because I want to push back objects based on
> reclaim efficiency (scan rate vs. reclaim rate).  The other shrinkers
> with non-standard seek settings reek of magic number as well, which
> suggests I am not alone with this.

Right, which is exactly why I'm asking you to document it. I've got
no idea how other subsystems have come up with their magic numbers
because they are not documented, and so it's just about impossible
to determine what the author of the code really needed and hence the
best way to improve the interface is difficult to determine.

> I wonder if we can come up with a better interface that allows both
> traditional cache shrinkers with their own aging, as well as object
> pools that want to push back based on reclaim efficiency.

We probably can, though I'd prefer we don't end up with some
alternative algorithm that is specific to a single shrinker.

So, how do we measure page cache reclaim efficiency? How can that be
communicated to a shrinker? How can we tell a shrinker what measure
to use? How do we tell shrinker authors what measure to use?  How do
we translate that new method into useful scan count information?

> > > but they are not necessarily correlated.  How I would like to
> > > configure the shadow shrinker instead is:
> > > 
> > >   o scan objects when reclaim efficiency is down to 75%, because they
> > > are more valuable than use-once cache but less than workingset
> > > 
> > >   o scan objects when the ratio between them and the number of pages
> > > exceeds 1/32 (one shadow entry for each resident page, up to 64
> > > entries per shrinkable object, assume 50% packing for robustness)
> > > 
> > >   o as the expected balance between objects and lru pages is 1:32,
> > > reclaim one object for every 32 reclaimed LRU pages, instead of
> > > assuming that number of scanne

linux-next: manual merge of the drm-intel tree with the drm tree

2014-01-21 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the drm-intel tree got a conflict in
drivers/gpu/drm/i915/intel_display.c between commit c326c0a9c98c
("drm/i915: Call drm_calc_timestamping_constants() earlier") from the drm
tree and commit bbee18af2a25 ("drm/i915: Prepare to track new pipe config
per pipe") from the drm-intel tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc drivers/gpu/drm/i915/intel_display.c
index 14b024becb91,e1d3ae1212a7..
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@@ -9660,14 -9705,7 +9703,15 @@@ static int __intel_set_mode(struct drm_
/* mode_set/enable/disable functions rely on a correct pipe
 * config. */
to_intel_crtc(crtc)->config = *pipe_config;
+   to_intel_crtc(crtc)->new_config = &to_intel_crtc(crtc)->config;
 +
 +  /*
 +   * Calculate and store various constants which
 +   * are later needed by vblank and swap-completion
 +   * timestamping. They are derived from true hwmode.
 +   */
 +  drm_calc_timestamping_constants(crtc,
 +  &pipe_config->adjusted_mode);
}
  
/* Only after disabling all output pipelines that will be changed can we



Re: [PATCH] clk: export __clk_get_hw for re-use in others

2014-01-21 Thread SeongJae Park

Dear Greg, Mike,

May I ask your answer or other opinion, please?

On Mon, Jan 20, 2014 at 5:07 PM, SeongJae Park  wrote:
> On Mon, Jan 20, 2014 at 4:47 PM, Mike Turquette  wrote:
>> On Sun, Jan 19, 2014 at 9:37 AM, Greg KH  wrote:
>>> On Sun, Jan 19, 2014 at 02:55:07PM +0900, SeongJae Park wrote:
>>>> Following build comes while modprobe process:
>>>> > ERROR: "__clk_get_hw" [drivers/clk/clk-max77686.ko] undefined!
>>>> > make[2]: *** [__modpost] Error 1
>>>> > make[1]: *** [modules] Error 2
>>>>
>>>> Export the symbol to fix it and for other part's usecase.
>>>>
>>>> Signed-off-by: SeongJae Park 
>>>> ---
>>>>  drivers/clk/clk.c | 1 +
>>>>  1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
>>>> index 2b38dc9..3883fba 100644
>>>> --- a/drivers/clk/clk.c
>>>> +++ b/drivers/clk/clk.c
>>>> @@ -575,6 +575,7 @@ struct clk_hw *__clk_get_hw(struct clk *clk)
>>>>  {
>>>>   return !clk ? NULL : clk->hw;
>>>>  }
>>>> +EXPORT_SYMBOL_GPL(__clk_get_hw);
>>>
>>> __ functions should usually only be for "internal" use, why does this
>>> get exported to modules?  Why not just put it in a .h file?
>>
>> It was originally used only within the clock core but it is sensible
>> for hardware-specific clock drivers to use this as well. I plan to
>> audit all of the double-underscore functions in
>> include/linux/clk-provider.h for 3.15.
>>
>> Regards,
>> Mike
>>
> Thank you very much for answering about it, Mike.
>
> I agree Greg's indication and think Mike's explanation is reasonable.
>
> So, I think it would be better to just export the symbol now
> because it would be easier for future functions renaming and
> similar issues were solved in this way in past:
> https://lkml.org/lkml/2013/4/15/50
>
> Or, maybe I can change the client code of __clk_get_hw to not use the function.
>
> What do you think would be better to fix this build error? Or, do you
> have better idea?
> I will respect your opinion.
>
> Thanks and Regards.
> SeongJae Park.
>
>>>
>>> greg k-h

linux-next: manual merge of the drm-intel tree with the drm tree

2014-01-21 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the drm-intel tree got a conflict in
drivers/gpu/drm/i915/i915_irq.c between commit abca9e454498 ("drm: Pass
'flags' from the caller to .get_scanout_position()") from the drm tree
and commit d59a63ad8234 ("drm/i915: Add intel_get_crtc_scanline()") from
the drm-intel tree.

I fixed it up (I think - see below) and can carry the fix as necessary
(no action is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc drivers/gpu/drm/i915/i915_irq.c
index 17d8fcb1b6f7,ffb56a9db9cc..
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@@ -649,8 -675,9 +649,9 @@@ static bool ilk_pipe_in_vblank_locked(s
  }
  
  static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe,
 -  int *vpos, int *hpos,
 +  unsigned int flags, int *vpos, int *hpos,
-   ktime_t *stime, ktime_t *etime)
+   ktime_t *stime, ktime_t *etime,
+   bool adjust)
  {
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_crtc *crtc = dev_priv->pipe_to_crtc_mapping[pipe];
@@@ -788,6 -786,24 +791,24 @@@
return ret;
  }
  
+ static int i915_get_scanout_position(struct drm_device *dev, int pipe,
+int *vpos, int *hpos,
+ktime_t *stime, ktime_t *etime)
+ {
 -  return i915_get_crtc_scanoutpos(dev, pipe, vpos, hpos,
++  return i915_get_crtc_scanoutpos(dev, pipe, 0, vpos, hpos,
+   stime, etime, true);
+ }
+ 
+ int intel_get_crtc_scanline(struct drm_crtc *crtc)
+ {
+   int vpos = 0, hpos = 0;
+ 
 -  i915_get_crtc_scanoutpos(crtc->dev, to_intel_crtc(crtc)->pipe,
++  i915_get_crtc_scanoutpos(crtc->dev, to_intel_crtc(crtc)->pipe, 0,
+&vpos, &hpos, NULL, NULL, false);
+ 
+   return vpos;
+ }
+ 
  static int i915_get_vblank_timestamp(struct drm_device *dev, int pipe,
  int *max_error,
  struct timeval *vblank_time,



[LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-21 Thread Ric Wheeler

One topic that has been lurking forever at the edges is the current 4k
limitation for file system block sizes. Some devices in production today,
and others coming soon, have larger sectors, and it would be interesting
to see if it is time to poke at this topic again.

LSF/MM seems to be pretty much the only event of the year where most of
the key people will be present, so it should be a great topic for a joint
session.


Ric


Re: [PATCH] uapi: convert u64 to __u64 in exported headers

2014-01-21 Thread David Rientjes

On Tue, 21 Jan 2014, Mike Frysinger wrote:

> The u64 type is not defined in any exported kernel headers, so trying
> to use it will lead to build failures.
> 
> Signed-off-by: Mike Frysinger 

Acked-by: David Rientjes 

[PATCH] uapi: dn: pull in ioctl.h header

2014-01-21 Thread Mike Frysinger

This header uses _IOW/_IOR defines but doesn't include ioctl.h for it.
If you try to use this w/out including ioctl.h yourself, it can fail
to build, so add the explicit include.

Signed-off-by: Mike Frysinger 
---
 include/uapi/linux/dn.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/dn.h b/include/uapi/linux/dn.h
index 5fbdd3d..4295c74 100644
--- a/include/uapi/linux/dn.h
+++ b/include/uapi/linux/dn.h
@@ -1,6 +1,7 @@
 #ifndef _LINUX_DN_H
 #define _LINUX_DN_H
 
+#include <linux/ioctl.h>
 #include 
 #include 
 
-- 
1.8.4.3
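
For readers unfamiliar with the macros involved, here is a minimal sketch of why the include matters: `_IOW`/`_IOR` (and the `_IOC_*` decoders) come from the ioctl header, so a header that uses them without pulling it in only builds if the caller happened to include it first. The `DEMO_*` names and `struct demo_arg` are hypothetical and are not taken from dn.h.

```c
#include <linux/ioctl.h>	/* provides _IOW, _IOR, _IOC_SIZE, _IOC_NR, _IOC_TYPE */

/* Hypothetical argument type and ioctl numbers, for illustration only. */
struct demo_arg {
	int value;
};

#define DEMO_IOC_MAGIC	'x'
#define DEMO_SET	_IOW(DEMO_IOC_MAGIC, 1, struct demo_arg)
#define DEMO_GET	_IOR(DEMO_IOC_MAGIC, 2, struct demo_arg)

/* Size of the argument encoded into the DEMO_SET request number. */
static unsigned int demo_set_size(void)
{
	return _IOC_SIZE(DEMO_SET);
}

/* Sequence number encoded into the DEMO_GET request number. */
static unsigned int demo_get_nr(void)
{
	return _IOC_NR(DEMO_GET);
}
```

Dropping the first line reproduces the kind of failure the patch describes, since the macros are then undefined at the point of use.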


[PATCH] uapi: ppp-ioctl.h: pull in ppp_defs.h

2014-01-21 Thread Mike Frysinger

This header uses enum NPmode but doesn't include ppp_defs.h.  If you try
to use this header w/out including the defs header first, it leads to a
build failure.  So add the explicit include to fix it.

Don't know of any packages directly impacted, but noticed while building
some ppp code by hand.

Signed-off-by: Mike Frysinger 
---
 include/uapi/linux/ppp-ioctl.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/ppp-ioctl.h b/include/uapi/linux/ppp-ioctl.h
index 2d9a885..63a23a3 100644
--- a/include/uapi/linux/ppp-ioctl.h
+++ b/include/uapi/linux/ppp-ioctl.h
@@ -12,6 +12,7 @@
 
 #include 
 #include 
+#include <linux/ppp_defs.h>
 
 /*
  * Bit definitions for flags argument to PPPIOCGFLAGS/PPPIOCSFLAGS.
-- 
1.8.4.3


Re: [BISECTED] Linux 3.12.7 introduces page map handling regression

2014-01-21 Thread Linus Torvalds

On Tue, Jan 21, 2014 at 5:49 PM, Greg Kroah-Hartman
 wrote:
>
> Odds are this also shows up in 3.13, right?

Probably. I don't have a Xen PV setup to test with (and very little
interest in setting one up).. And I have a suspicion that it might not
be so much about Xen PV, as perhaps about the kind of hardware.

I suspect the issue has something to do with the magic _PAGE_NUMA
tie-in with _PAGE_PRESENT. And then mprotect(PROT_NONE) ends up
removing the _PAGE_PRESENT bit, and now the crazy numa code is
confused.

The whole _PAGE_NUMA thing is a f*cking horrible hack, and shares the
bit with _PAGE_PROTNONE, which is why it then has that tie-in to
_PAGE_PRESENT.

Adding Andrea to the Cc, because he's the author of that horridness.
Putting Steven's test-case here as an attachment for Andrea, maybe
that makes him go "Ahh, yes, silly case".

Also added Kirill, because he was involved in the last _PAGE_NUMA debacle.

Andrea, you can find the thread on lkml, but it boils down to commit
1667918b6483 (backported to 3.12.7 as 3d792d616ba4) breaking the
attached test-case (but apparently only under Xen PV). There it
apparently causes a "BUG: Bad page map .." error.

And I suspect this is another of those "this bug is only visible on
real numa machines, because _PAGE_NUMA isn't actually ever set
otherwise". That has pretty much guaranteed that it gets basically
zero testing, which is not a great idea when coupled with that subtle
sharing of the _PAGE_PROTNONE bit..

It may be that the whole "Xen PV" thing is a red herring, and that
Steven only sees it on that one machine because the one he runs as a
PV guest under is a real NUMA machine, and all the other machines he
has tried it on haven't been numa. So it *may* be that that "only
under Xen PV" is a red herring. But that's just a possible guess.

Christ, how I hate that _PAGE_NUMA bit. Andrea: the fact that it gets
no testing on any normal machines is a major problem. If it was simple
and straightforward and the code was "obviously correct", it wouldn't
be such a problem, but the _PAGE_NUMA code definitely does not fall
under that "simple and obviously correct" heading.

Guys, any ideas?

Linus
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

void die(const char *what)
{
	perror(what);
	exit(1);
}

int main(int arg, char **argv)
{
	void *p =
	mmap(NULL, 4096, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		die("mmap");

	/* Tickle the page. */
	((char *) p)[0] = 0;

	if (mprotect(p, 4096, PROT_NONE) != 0)
		die("mprotect");

	if (mprotect(p, 4096, PROT_READ) != 0)
		die("mprotect");

	if (munmap(p, 4096) != 0)
		die("munmap");

	return 0;
}
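
The bit sharing suspected in the message above can be pictured with a toy model. This is only a sketch under stated assumptions: the `DEMO_*` bit positions and helper names are illustrative and are not the kernel's definitions. It shows that once one bit serves both purposes, an entry left behind by a PROT_NONE-style change looks, by flags alone, like a NUMA-hinting entry.

```c
/* Illustrative bit positions; assumptions, not copied from kernel headers. */
#define DEMO_PAGE_PRESENT	(1UL << 0)
#define DEMO_PAGE_PROTNONE	(1UL << 8)
#define DEMO_PAGE_NUMA		DEMO_PAGE_PROTNONE	/* same bit, two meanings */

/* "Present" in the loose sense: any of the bits marks a mapped page. */
static int demo_is_present(unsigned long flags)
{
	return (flags & (DEMO_PAGE_PRESENT | DEMO_PAGE_PROTNONE |
			 DEMO_PAGE_NUMA)) != 0;
}

/* NUMA-hinting test: shared bit set while PRESENT is clear. */
static int demo_is_numa(unsigned long flags)
{
	return (flags & (DEMO_PAGE_NUMA | DEMO_PAGE_PRESENT)) == DEMO_PAGE_NUMA;
}

/* What a PROT_NONE-style change does to the flags in this model. */
static unsigned long demo_make_protnone(unsigned long flags)
{
	return (flags & ~DEMO_PAGE_PRESENT) | DEMO_PAGE_PROTNONE;
}
```

In this model the two states can only be told apart with information from outside the entry, which is the sort of subtlety the message complains about.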

[PATCH] uapi: convert u64 to __u64 in exported headers

2014-01-21 Thread Mike Frysinger

The u64 type is not defined in any exported kernel headers, so trying
to use it will lead to build failures.

Signed-off-by: Mike Frysinger 
---
 include/uapi/linux/nfs4.h   | 2 +-
 include/uapi/linux/perf_event.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/nfs4.h b/include/uapi/linux/nfs4.h
index 788128e..35f5f4c 100644
--- a/include/uapi/linux/nfs4.h
+++ b/include/uapi/linux/nfs4.h
@@ -150,7 +150,7 @@
 #define NFS4_SECINFO_STYLE4_CURRENT_FH 0
 #define NFS4_SECINFO_STYLE4_PARENT 1
 
-#define NFS4_MAX_UINT64 (~(u64)0)
+#define NFS4_MAX_UINT64 (~(__u64)0)
 
 /* An NFS4 sessions server must support at least NFS4_MAX_OPS operations.
  * If a compound requires more operations, adjust NFS4_MAX_OPS accordingly.
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 959d454..7a3fed5 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -787,7 +787,7 @@ union perf_mem_data_src {
 #define PERF_MEM_TLB_SHIFT 26
 
 #define PERF_MEM_S(a, s) \
-   (((u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
+   (((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
 
 /*
  * single taken branch record layout:
-- 
1.8.4.3
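
As a rough userspace illustration of the change: `__u64` is available from the exported `<linux/types.h>`, while plain `u64` is a kernel-internal name. The `DEMO_*` macros below are hypothetical stand-ins shaped like the two macros touched by the patch; they are not the real definitions.

```c
#include <linux/types.h>	/* exported header that defines __u64 */

/* Stand-in for an all-ones 64-bit constant. */
#define DEMO_MAX_UINT64		(~(__u64)0)

/* Stand-in for a "widen, then shift" macro; the cast happens before the shift. */
#define DEMO_SHIFT(v, shift)	(((__u64)(v)) << (shift))

static __u64 demo_max_uint64(void)
{
	return DEMO_MAX_UINT64;
}

static __u64 demo_shift(unsigned int v, unsigned int shift)
{
	return DEMO_SHIFT(v, shift);
}
```

Replacing `__u64` with `u64` in either macro would make this file fail to compile outside the kernel tree, which matches the build failure described in the patch.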


Re: [GIT PULL] percpu changes for v3.14-rc1

2014-01-21 Thread Tejun Heo

Hello, Linus.

On Tue, Jan 21, 2014 at 05:51:13PM -0800, Linus Torvalds wrote:
> On Tue, Jan 21, 2014 at 1:48 AM, Tejun Heo  wrote:
> >
> > I messed up the for-3.14 branch (committed stuff to for-next) and had
> > to rebuild for-3.14 by cherry-picking; however, the result is the same
> > as published to the next tree through for-next.
> >
> > The changes are available in the following git branch
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git
> 
> You messed up the pull request too.. The branch name is missing from
> that git line, even if you did mention it a few lines earlier...

Oops, sorry.  The branch is for-3.14.

I have no idea how that happened tho.  That isn't even a part that I
edit.  I did

  git request-pull master git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git for-3.14 > out

and then pulled in that file and added the description on top and diff
at the end.  I still have the "out" file created by the above command
and it also lacks the branch tag, so it definitely wasn't me somehow
deleting it while editing.  If I run the git-request-pull again, it
does have "for-3.14" there with everything else identical.  I wonder
whether git-request-pull somehow skips over branch tag when remote
for-3.14 doesn't match local one?

Ooh, right, that was it.  So, after running git-request-pull for the
first time, I rebuilt for-3.14, did git push -f and then ran
git-request-pull.  At that point, the new for-3.14 hadn't propagated
to git://git.kernel.org yet, so git-request-pull couldn't find the
head which matched the SHA1 and thus omitted printing the branch.  I
wonder whether this is a new behavior.  I saw the warning message
multiple times but ISTR the generated pull request having the branch
name specified on the command line regardless.  Maybe it should just
fail rather than generating pull request w/o branch tag?

Thanks.

-- 
tejun

Re: [PATCH 10/73] powerpc: use device_initcall for registering rtc devices

2014-01-21 Thread Paul Gortmaker

On Tue, Jan 21, 2014 at 6:48 PM, Geoff Levand  wrote:
> Hi Paul,
>
> On Tue, 2014-01-21 at 16:22 -0500, Paul Gortmaker wrote:
>> Currently these two RTC devices are in core platform code
>> where it is not possible for them to be modular.  It will
>> never be modular, so using module_init as an alias for
>> __initcall can be somewhat misleading.
>>
>>  arch/powerpc/kernel/time.c| 2 +-
>>  arch/powerpc/platforms/ps3/time.c | 3 +--
>>  2 files changed, 2 insertions(+), 3 deletions(-)
>
> I tested the PS3 part of this patch and it seems to work OK.
>
> Acked-by: Geoff Levand 

Thanks Geoff for the review and testing; I'll add the ack.

Paul.

[PATCH] swap: do not skip lowest_bit in scan_swap_map() scan loop

2014-01-21 Thread Jamie Liu

In the second half of scan_swap_map()'s scan loop, offset is set to
si->lowest_bit and then incremented before entering the loop for the
first time, causing si->swap_map[si->lowest_bit] to be skipped.

Signed-off-by: Jamie Liu 
---
 mm/swapfile.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 612a7c9..6635081 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -616,7 +616,7 @@ scan:
 		}
 	}
 	offset = si->lowest_bit;
-	while (++offset < scan_base) {
+	while (offset < scan_base) {
 		if (!si->swap_map[offset]) {
 			spin_lock(&si->lock);
 			goto checks;
@@ -629,6 +629,7 @@ scan:
 			cond_resched();
 			latency_ration = LATENCY_LIMIT;
 		}
+		offset++;
 	}
 	spin_lock(&si->lock);
 
-- 
1.8.5.3
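
The off-by-one described in the patch above can be reproduced outside the kernel with a small sketch. The function names and the plain byte array are hypothetical stand-ins for the scan loop and `si->swap_map`; locking and the latency handling are left out.

```c
/*
 * Look for a free (zero) slot in map[lowest_bit .. scan_base - 1].
 * Both helpers return the index found, or -1 if none.
 */

/* Old form: offset is incremented before the first test, so
 * map[lowest_bit] itself is never examined. */
static long scan_old(const unsigned char *map, long lowest_bit, long scan_base)
{
	long offset = lowest_bit;

	while (++offset < scan_base) {
		if (!map[offset])
			return offset;
	}
	return -1;
}

/* Patched form: test the current offset first, increment at the end. */
static long scan_new(const unsigned char *map, long lowest_bit, long scan_base)
{
	long offset = lowest_bit;

	while (offset < scan_base) {
		if (!map[offset])
			return offset;
		offset++;
	}
	return -1;
}
```

With a map whose only free slot sits exactly at `lowest_bit`, the old form reports nothing free while the patched form finds the slot.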


Re: [PATCH] x86, cpu hotplug, use cpumask stack safe variant cpumask_var_t in check_irq_vectors_for_cpu_disable() [v2]

2014-01-21 Thread Chen, Gong

On Mon, Jan 20, 2014 at 01:57:58PM -0500, Prarit Bhargava wrote:
> Subject: [PATCH] x86, cpu hotplug, use cpumask stack safe variant
>  cpumask_var_t in check_irq_vectors_for_cpu_disable() [v2]
> 
> kbuild, 0day kernel build service, outputs the warning:
> 
> arch/x86/kernel/irq.c:333:1: warning: the frame size of 2056 bytes
> is larger than 2048 bytes [-Wframe-larger-than=]
> 
> because check_irq_vectors_for_cpu_disable() allocates two cpumasks on the
> stack.  Fix this by using cpumask_var_t, the cpumask stack safe variant.
> 
> Signed-off-by: Prarit Bhargava 
> Cc: Andi Kleen 
> Cc: Michel Lespinasse 
> Cc: Seiji Aguchi 
> Cc: Yang Zhang 
> Cc: Paul Gortmaker 
> Cc: Janet Morgan 
> Cc: Tony Luck 
> Cc: Ruiv Wang 
> Cc: Gong Chen 
> Cc: H. Peter Anvin 
> Cc: Gong Chen 
> Cc: x...@kernel.org
> Cc: Fengguang Wu 
> 
> [v2]: switch from GFP_KERNEL to GFP_ATOMIC

Reviewed-by: Chen, Gong 





[PATCH v2 0/4] Intel MPX support

2014-01-21 Thread Qiaowei Ren

Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structures and macros as much as possible when
    decoding MPX instructions.

Qiaowei Ren (4):
  x86, mpx: add documentation on Intel MPX
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE
  x86, mpx: extend siginfo structure to include bound violation
    information

 Documentation/x86/intel_mpx.txt    |   76 +++
 arch/x86/Kconfig                   |    4 +
 arch/x86/include/asm/mpx.h         |   63 ++
 arch/x86/include/asm/processor.h   |   16 ++
 arch/x86/kernel/Makefile           |    1 +
 arch/x86/kernel/mpx.c              |  417
 arch/x86/kernel/traps.c            |   61 +-
 include/uapi/asm-generic/siginfo.h |    9 +-
 include/uapi/linux/prctl.h         |    6 +
 kernel/signal.c                    |    4 +
 kernel/sys.c                       |   12 +
 11 files changed, 667 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c

