Re: [GIT PULL] GPIO bulk changes for v3.14
On Tue, Jan 21, 2014 at 7:11 PM, Linus Torvalds wrote:

> The fact that it doesn't even compile makes me doubt your statement
> that it has been in linux-next. It doesn't even pass a basic
> allmodconfig build.

Hm I rely on the zeroday build, and didn't get any angry compile errors.
I'll double-check with Fengguang to see what's going on here and that
branches get proper buildtesting.

> I see that you tried to fix it in commit 01d7004181c8 ("gpio:
> mcp23s08: depend on OF_GPIO") but screwed up the order of operations.
>
> I fixed it up properly in the merge, but please try to figure out how
> the hell this passed through the cracks.

Argh, thanks for fixing. I see the problem now.

Yours,
Linus Walleij
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Revert "sched: Fix sleep time double accounting in enqueue entity"
Paul,

Shall I leave it to you to send a patch that adds a comment and moves the
"if (wakeup)" logic?

Regards
Vincent

On 22 January 2014 08:45, Vincent Guittot wrote:
> This reverts commit 282cf499f03ec1754b6c8c945c9674b02631fb0f.
>
> With the current implementation, the load average statistics of a sched entity
> change according to other activity on the CPU even if this activity is done
> between the running window of the sched entity and has no influence on the
> running duration of the task.
>
> When a task wakes up on the same CPU, we currently update last_runnable_update
> with the return of __synchronize_entity_decay without updating the
> runnable_avg_sum and runnable_avg_period accordingly. In fact, we have to sync
> the load_contrib of the se with the rq's blocked_load_contrib before removing
> it from the latter (with __synchronize_entity_decay) but we must keep
> last_runnable_update unchanged for updating runnable_avg_sum/period during the
> next update_entity_load_avg.
>
> Signed-off-by: Vincent Guittot
> ---
>  kernel/sched/fair.c | 8 +---
>  1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index e64b079..6d61f20 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2365,13 +2365,7 @@ static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
>  		}
>  		wakeup = 0;
>  	} else {
> -		/*
> -		 * Task re-woke on same cpu (or else migrate_task_rq_fair()
> -		 * would have made count negative); we must be careful to avoid
> -		 * double-accounting blocked time after synchronizing decays.
> -		 */
> -		se->avg.last_runnable_update += __synchronize_entity_decay(se)
> -							<< 20;
> +		__synchronize_entity_decay(se);
>  	}
>
>  	/* migrated tasks did not contribute to our blocked load */
> --
> 1.7.9.5
[PATCH] Revert "sched: Fix sleep time double accounting in enqueue entity"
This reverts commit 282cf499f03ec1754b6c8c945c9674b02631fb0f.

With the current implementation, the load average statistics of a sched entity
change according to other activity on the CPU even if this activity is done
between the running window of the sched entity and has no influence on the
running duration of the task.

When a task wakes up on the same CPU, we currently update last_runnable_update
with the return of __synchronize_entity_decay without updating the
runnable_avg_sum and runnable_avg_period accordingly. In fact, we have to sync
the load_contrib of the se with the rq's blocked_load_contrib before removing
it from the latter (with __synchronize_entity_decay) but we must keep
last_runnable_update unchanged for updating runnable_avg_sum/period during the
next update_entity_load_avg.

Signed-off-by: Vincent Guittot
---
 kernel/sched/fair.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e64b079..6d61f20 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2365,13 +2365,7 @@ static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
 		}
 		wakeup = 0;
 	} else {
-		/*
-		 * Task re-woke on same cpu (or else migrate_task_rq_fair()
-		 * would have made count negative); we must be careful to avoid
-		 * double-accounting blocked time after synchronizing decays.
-		 */
-		se->avg.last_runnable_update += __synchronize_entity_decay(se)
-							<< 20;
+		__synchronize_entity_decay(se);
 	}

 	/* migrated tasks did not contribute to our blocked load */
--
1.7.9.5
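The timestamp arithmetic described in the changelog above can be made concrete with a small user-space sketch. This is not kernel code: toy_avg, toy_decay_periods(), toy_wake_old(), toy_wake_new() and toy_pending_ns() are hypothetical names, and modelling the decay count as the blocked time shifted right by 20 is a simplifying assumption. The sketch only shows how much elapsed time the next load-average update would still see under each variant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the timestamp kept in struct sched_avg. */
struct toy_avg {
	uint64_t last_runnable_update;	/* ns */
};

/* Assumption: one decay per whole 2^20 ns (~1 ms) period spent blocked. */
static uint64_t toy_decay_periods(uint64_t blocked_ns)
{
	return blocked_ns >> 20;
}

/* Reverted behaviour: push the timestamp forward by the decayed periods. */
static void toy_wake_old(struct toy_avg *a, uint64_t blocked_ns)
{
	a->last_runnable_update += toy_decay_periods(blocked_ns) << 20;
}

/* Behaviour after the revert: the timestamp is left untouched. */
static void toy_wake_new(struct toy_avg *a, uint64_t blocked_ns)
{
	(void)a;
	(void)blocked_ns;
}

/* Time that the next load-average update would still account for. */
static uint64_t toy_pending_ns(const struct toy_avg *a, uint64_t now)
{
	return now - a->last_runnable_update;
}

/* Small self-check; call it from a main() of your own. */
static void toy_demo(void)
{
	struct toy_avg old_way = { .last_runnable_update = 1000 };
	struct toy_avg new_way = old_way;
	uint64_t blocked = (UINT64_C(5) << 20) + 123;
	uint64_t now = 1000 + blocked;

	toy_wake_old(&old_way, blocked);
	toy_wake_new(&new_way, blocked);

	assert(toy_pending_ns(&old_way, now) == 123);
	assert(toy_pending_ns(&new_way, now) == blocked);
}
```

In this toy model the old code leaves only the sub-period remainder for the next update, while the reverted code keeps the whole blocked interval visible, which is what the changelog asks for so that runnable_avg_sum and runnable_avg_period get updated.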
Re: top-down balance purpose discussion -- resend
On 01/21/2014 10:57 PM, Peter Zijlstra wrote:
> On Tue, Jan 21, 2014 at 10:04:26PM +0800, Alex Shi wrote:
>>
>> Current scheduler load balance is bottom-up mode, each CPU need
>> initiate the balance by self.
>>
>> 1, Like in a integrate computer system, it has smt/core/cpu/numa, 4
>> level scheduler domains. If there is just 2 tasks in whole system that
>> both running on cpu0. Current load balance need to pull task to another
>> smt in smt domain, then pull task to another core, then pull task to
>> another cpu, finally pull task to another numa. Totally it is need 4
>> times task moving to get system balance.
>
> Except the idle load balancer, and esp. the newidle can totally by-pass
> this.
>
> If you do the packing right in the newidle pass, you'd get there in 1
> step.

It puts me under huge pressure to argue with a great expert like you. I
look forward to, and very much appreciate, any comments and corrections. :)

Yes, newidle balancing relieves this, but it cannot eliminate it. If the
newidle happens in another NUMA group, it needs just 1 step. But if it
happens in another SMT group, it still needs 4 steps. So generally we
still need one or more extra steps before the system is well balanced.

In this example, if the newidle CPU is in the same smallest group, maybe
we should wake up the remotest CPU in the system/LLC to avoid extra task
moves in the near future, for best performance. And for power saving,
maybe we'd better kick the task to the smallest group and let the remote
CPU group stay idle. But the current newidle cannot do this, because
newidle is also bottom-up.

>
>> Generally, the task moving complexity is
>> O(nm log n), n := nr_cpus, m := nr_tasks
>>
>> There is a excellent summary and explanation for this in
>> kernel/sched/fair.c:4605
>
> Which is a perfectly fine scheme for a busy system.
>
>> Another weakness of current LB is that every cpu need to get the other
>> cpus' load info repeatedly and try to figure out busiest sched
>> group/queue on every sched domain level. But it just waste time, since
>> it may not conduct a task moving. One of reasons is that cpu can only
>> pull task, not pushing.
>
> This doesn't make sense.. and in fact, we do a limited amount of 3rd
> party movements.

Yes, but the 3rd party movement is too limited; it is only done for
pinned tasks.

>
> Whatever you do, you have to repeat the information gathering anyhow,
> because it constantly changes.
>

Yes, it is good to collect the load info once per balance. But if the
balancing CPU is the busiest CPU, the current balance still keeps
collecting every group's load info from the bottom up, and then does
nothing about the imbalanced system. This is bad.

> Trying to serialize that doesn't make any kind of sense. The only thing
> you want is that the system converges.

Sorry, could you give a bit more detail on why 'serialize' makes no sense?

>
> Skipped the rest because it seems build on a fundament I don't agree
> with. That 4 move thing is just silly for an idle system, and we
> shouldn't do that.
>
> I also very much do not want a single CPU balancing the entire system,
> that's the anti-thesis of scalable.

Sorry. IMHO, a single CPU can handle balancing for 1000 CPUs. And it is
far more scalable than having every CPU balance the system, since only
one CPU needs to collect the other CPUs' load info.

BTW, there is currently no coordination among the CPUs' balancing. That's
a bit of a mess. For example, if 2 CPUs in a small CPU group both balance
the whole system at the same time, both of them think their own group is
lightly loaded and want more load, so they may pull too much load into
their own group. That is bad. Single-CPU balancing has no such problem.

--
Thanks
Alex
Re: [PATCH 1/2] net: dm9000: Read GPR, modify and write
On Wednesday, January 22, 2014 03:15 PM, David Miller wrote:
> Please do not mix coding style and functional changes.
>
> Please resubmit this entire series once you have addressed all
> feedback. Thank you.

Thanks for the advice. I will do so.

Chris
Re: Freeing of dev->p
Hi Greg,

On Fri, 10 Jan 2014 07:24:02 -0800, Greg Kroah-Hartman wrote:
> On Fri, Jan 10, 2014 at 03:39:07PM +0100, Jean Delvare wrote:
> > (...)
> > Then I suppose we could inline both functions
> > again, for performance. Well, put in short, really reverting
> > b4028437876866aba4747a655ede00f892089e14 would be the way to go IMHO.
> >
> > Really, while I understand your desire to protect driver core internals
> > from unwanted access, the cost here was simply too high IMHO, both in
> > terms of getting things right and performance. Some drivers are calling
> > dev_get_drvdata() directly or indirectly repeatedly at run-time. They
> > had no reason not to as this used to be so fast, and now it is no
> > longer an inline function, it has conditionals and a double pointer
> > indirection...
> >
> > Plus, I can't think of anything really bad that could result from
> > accessing driver_data directly, contrary to the other members of struct
> > device_private.
> > (...)
>
> Thanks for the detailed response, I think I'll just revert most of that
> patch and see if it's still workable.

Any news on this?

--
Jean Delvare
Re: [BISECTED] Linux 3.12.7 introduces page map handling regression
On Wed, Jan 22, 2014 at 12:02:15AM -0500, Konrad Rzeszutek Wilk wrote: > On Tue, Jan 21, 2014 at 07:20:45PM -0800, Steven Noonan wrote: > > On Tue, Jan 21, 2014 at 06:47:07PM -0800, Linus Torvalds wrote: > > > On Tue, Jan 21, 2014 at 5:49 PM, Greg Kroah-Hartman > > > wrote: > > Adding extra folks to the party. > > > > > > > > Odds are this also shows up in 3.13, right? > > > > Reproduced using 3.13 on the PV guest: > > > > [ 368.756763] BUG: Bad page map in process mp pte:8004a67c6165 > > pmd:e9b706067 > > [ 368.756777] page:ea001299f180 count:0 mapcount:-1 mapping: > >(null) index:0x0 > > [ 368.756781] page flags: 0x2f8014(referenced|dirty) > > [ 368.756786] addr:7fd1388b7000 vm_flags:00100071 > > anon_vma:880e9ba15f80 mapping: (null) index:7fd1388b7 > > [ 368.756792] CPU: 29 PID: 618 Comm: mp Not tainted 3.13.0-ec2 #1 > > [ 368.756795] 880e9b718958 880e9eaf3cc0 814d8748 > > 7fd1388b7000 > > [ 368.756803] 880e9eaf3d08 8116d289 > > > > [ 368.756809] 880e9b7065b8 ea001299f180 7fd1388b8000 > > 880e9eaf3e30 > > [ 368.756815] Call Trace: > > [ 368.756825] [] dump_stack+0x45/0x56 > > [ 368.756833] [] print_bad_pte+0x229/0x250 > > [ 368.756837] [] unmap_single_vma+0x583/0x890 > > [ 368.756842] [] unmap_vmas+0x65/0x90 > > [ 368.756847] [] unmap_region+0xac/0x120 > > [ 368.756852] [] ? vma_rb_erase+0x1c9/0x210 > > [ 368.756856] [] do_munmap+0x280/0x370 > > [ 368.756860] [] vm_munmap+0x41/0x60 > > [ 368.756864] [] SyS_munmap+0x22/0x30 > > [ 368.756869] [] system_call_fastpath+0x1a/0x1f > > [ 368.756872] Disabling lock debugging due to kernel taint > > [ 368.760084] BUG: Bad rss-counter state mm:880e9d079680 idx:0 > > val:-1 > > [ 368.760091] BUG: Bad rss-counter state mm:880e9d079680 idx:1 > > val:1 > > > > > > > > Probably. I don't have a Xen PV setup to test with (and very little > > > interest in setting one up).. And I have a suspicion that it might not > > > be so much about Xen PV, as perhaps about the kind of hardware. 
> > > > > > I suspect the issue has something to do with the magic _PAGE_NUMA > > > tie-in with _PAGE_PRESENT. And then mprotect(PROT_NONE) ends up > > > removing the _PAGE_PRESENT bit, and now the crazy numa code is > > > confused. > > > > > > The whole _PAGE_NUMA thing is a f*cking horrible hack, and shares the > > > bit with _PAGE_PROTNONE, which is why it then has that tie-in to > > > _PAGE_PRESENT. > > > > > > Adding Andrea to the Cc, because he's the author of that horridness. > > > Putting Steven's test-case here as an attachement for Andrea, maybe > > > that makes him go "Ahh, yes, silly case". > > > > > > Also added Kirill, because he was involved the last _PAGE_NUMA debacle. > > > > > > Andrea, you can find the thread on lkml, but it boils down to commit > > > 1667918b6483 (backported to 3.12.7 as 3d792d616ba4) breaking the > > > attached test-case (but apparently only under Xen PV). There it > > > apparently causes a "BUG: Bad page map .." error. > > I *think* it is due to the fact that pmd_numa and pte_numa is getting the > _raw_ > value of PMDs and PTEs. That is - it does not use the pvops interface > and instead reads the values directly from the page-table. Since the > page-table is also manipulated by the hypervisor - there are certain > flags it also sets to do its business. It might be that it uses > _PAGE_GLOBAL as well - and Linux picks up on that. If it was using > pte_flags that would invoke the pvops interface. > > Elena, Dariof and George, you guys had been looking at this a bit deeper > than I have. Does the Xen hypervisor use the _PAGE_GLOBAL for PV guests? > > This not-compiled-totally-bad-patch might shed some light on what I was > thinking _could_ fix this issue - and IS NOT A FIX - JUST A HACK. > It does not fix it for PMDs naturally (as there are no PMD paravirt ops > for that). Unfortunately the Totally Bad Patch seems to make no difference. 
I am still able to repro the issue: [ 346.374929] BUG: Bad page map in process mp pte:8004ae928065 pmd:e993f9067 [ 346.374942] page:ea0012ba4a00 count:0 mapcount:-1 mapping: (null) index:0x0 [ 346.374946] page flags: 0x2f8014(referenced|dirty) [ 346.374951] addr:7f06a9bbb000 vm_flags:00100071 anon_vma:880e9939fe00 mapping: (null) index:7f06a9bbb [ 346.374956] CPU: 29 PID: 609 Comm: mp Not tainted 3.13.0-ec2+ #1 [ 346.374960] 880e9cc38da8 880e991a3cc0 814d8768 7f06a9bbb000 [ 346.374967] 880e991a3d08 8116d289 [ 346.374972] 880e993f9dd8 ea0012ba4a00 7f06a9bbc000 880e991a3e30 [ 346.374979] Call Trace: [ 346.374988] [] dump_stack+0x
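The bit sharing that Linus describes earlier in this message can be illustrated with a minimal user-space sketch. Everything here is a toy: the TOY_* constants and toy_pte_numa() are hypothetical names, the bit positions (present at bit 0, the shared global/protnone/numa bit at bit 8) are assumptions about the x86 layout of that era, and the check is only modelled on the "NUMA bit set while PRESENT is clear" rule implied by the tie-in to _PAGE_PRESENT. It is not a diagnosis of the bug.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86 bit positions; treat these values as illustrative only. */
#define TOY_PAGE_PRESENT	(UINT64_C(1) << 0)
#define TOY_PAGE_GLOBAL		(UINT64_C(1) << 8)
#define TOY_PAGE_PROTNONE	TOY_PAGE_GLOBAL	/* shares the bit */
#define TOY_PAGE_NUMA		TOY_PAGE_GLOBAL	/* shares the bit */

/* NUMA-hinting test: the NUMA bit is set and PRESENT is clear. */
static bool toy_pte_numa(uint64_t flags)
{
	return (flags & (TOY_PAGE_NUMA | TOY_PAGE_PRESENT)) == TOY_PAGE_NUMA;
}

/* Small self-check; call it from a main() of your own. */
static void toy_demo(void)
{
	/* A present, global mapping is not treated as a NUMA hint. */
	assert(!toy_pte_numa(TOY_PAGE_PRESENT | TOY_PAGE_GLOBAL));

	/* A not-present entry carrying the shared bit looks like one. */
	assert(toy_pte_numa(TOY_PAGE_PROTNONE));

	/* A fully cleared entry does not. */
	assert(!toy_pte_numa(0));
}
```

Because the three names alias one bit in this model, a PROT_NONE entry and a NUMA-hinting entry cannot be told apart by flags alone once PRESENT is cleared, which is the kind of confusion suspected in the thread.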
Re: linux rdma 3.14 merge plans
Roland & Co,

On Tue, 2014-01-21 at 16:43 -0800, Roland Dreier wrote:
> On Tue, Jan 21, 2014 at 2:00 PM, Or Gerlitz wrote:
> > Roland, ping! the signature patches were posted three months ago. We
> > deserve a response from the maintainer that goes beyond "I need to
> > think on that".
> >
> > Responsiveness was stated by Linus to be the #1 requirement from
> > kernel maintainers.
>
> Or, I'm not sure what response you're after from me. Linus has also
> said that maintainers should say "no" a lot more
> (http://lwn.net/Articles/571995/) so maybe you want me to say, "No, I
> won't merge this patch set, since it adds a bunch of complexity to
> support a feature no one really cares about." Is that it?

The patch set proposed by Sagi + Or is modest in terms of LOC to core IB
code, and includes mostly mlx5 specific driver changes that enable HW
offloads.

> (And yes I
> am skeptical about this stuff -- I work at an enterprise storage
> company and even here it's hard to find anyone who cares about
> DIF/DIX, especially offload features that stop it from being
> end-to-end)
>

My understanding is most HBAs capable of T10 PI offload in DIX PASS +
VERIFY mode are already implementing DIX INSERT + STRIP modes in various
capacities to support legacy environments.

Beyond the DIX INSERT + STRIP case for enterprise storage, the number of
FC + SAS HBAs that already support T10 PI metadata is substantial.

> I'm sure you're not expecting me to say, "Sure, I'll merge it without
> understanding the problem it's solving or how it's doing that,"
> especially given your recent history of pushing me to merge stuff
> like the IP-RoCE patches back when they broke the userspace ABI.

With the merge window now upon us, there is an understandable reluctance
to merge new features. Given the amount of time the series has spent on
the list, it is however a good candidate to consider for an exception.

Short of that, are you planning to accept the series for the next round
once the current merge window closes..? We'd really like to start
enabling fabrics with these types of offloads for v3.15.

> I'd really rather spend my time on something actually useful like
> cleaning up softroce.
>

+1 for softroce + T10 PI support!

--nab
linux-next: Tree for Jan 22
Hi all, This tree fails (more than usual) the powerpc allyesconfig build. Changes since 20140117: New tree: init (Paul Gortmaker's init.h inclusion cleanup) Dropped tree: sh (complex merge conflicts against very old commits) imx-mxs (complex merge conflicts against the arm tree) The powerpc tree still had its build failure. The drm-intel tree gained conflicts against the drm tree. The drivers-x86 tree gained a conflict against the pm tree. Non-merge commits (relative to Linus' tree): 6883 7331 files changed, 332480 insertions(+), 154974 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" as mentioned in the FAQ on the wiki (see below). You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a multi_v7_defconfig for arm. After the final fixups (if any), it is also built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and allyesconfig (minus CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc, sparc64 and arm defconfig. These builds also have CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and CONFIG_DEBUG_INFO disabled when necessary. Below is a summary of the state of the merge. I am currently merging 210 trees (counting Linus' and 29 trees of patches pending for Linus' tree). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . 
If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. There is a wiki covering stuff to do with linux-next at http://linux.f-seidel.de/linux-next/pmwiki/ . Thanks to Frank Seidel. -- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (03d11a0e458d Merge tag 'for-v3.14' of git://git.infradead.org/battery-2.6) Merging fixes/master (b0031f227e47 Merge tag 's2mps11-build' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator) Merging kbuild-current/rc-fixes (19514fc665ff arm, kbuild: make "make install" not depend on vmlinux) Merging arc-current/for-curr (7e22e91102c6 Linux 3.13-rc8) Merging arm-current/fixes (b25f3e1c3584 ARM: 7938/1: OMAP4/highbank: Flush L2 cache before disabling) Merging m68k-current/for-linus (56931d73697c m68k/mac: Make SCC reset work more reliably) Merging metag-fixes/fixes (3b2f64d00c46 Linux 3.11-rc2) Merging powerpc-merge/merge (b3084f4db3ae powerpc/thp: Fix crash on mremap) Merging sparc/master (ef350bb7c5e0 Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4) Merging net/master (7d0d46da750a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net) Merging ipsec/master (965cdea82569 dccp: catch failed request_module call in dccp_probe init) Merging sound-current/for-linus (7552f34a7900 Merge tag 'asoc-v3.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus) Merging pci-current/for-linus (f0b75693cbb2 MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers) Merging wireless/master (2eff7c791a18 Merge tag 'nfc-fixes-3.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-fixes) Merging driver-core.current/driver-core-linus (413541dd66d5 Linux 3.13-rc5) Merging tty.current/tty-linus (413541dd66d5 Linux 3.13-rc5) Merging 
usb.current/usb-linus (413541dd66d5 Linux 3.13-rc5) Merging staging.current/staging-linus (413541dd66d5 Linux 3.13-rc5) Merging char-misc.current/char-misc-linus (802eee95bde7 Linux 3.13-rc6) Merging input-current/for-linus (8e2f2325b73f Input: xpad - add new USB IDs for Logitech F310 and F710) Merging md-current/for-linus (d47648fcf061 raid5: avoid finding "discard" stripe) Merging crypto-current/master (efb753b8e013 crypto: ixp4xx - Fix kernel compile error) Merging ide/master (c2f7d1e103ef ide: pmac: remove unnecessary pci_set_drvdata()) Merging dwmw2/master (5950f0803ca9 pcmcia: remove RPX board stuff) Merging sh-current/sh-fixes-for-linus (44033109e99c SH: Convert out[bwl] macros to inline functions) Merging devicetree-current/devicetree/merge (6f041e99fc7b of: Fix NULL dereference in unflatten_and_copy()) Merging rr-fixes/fixes (7122c3e9154b scripts/link-vmlinux.sh: only filter kernel symbols f
Re: [PATCH 1/2] net: dm9000: Read GPR, modify and write
Please do not mix coding style and functional changes. Please resubmit this entire series once you have addressed all feedback. Thank you.
Re: [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes
On 01/22/2014 06:20 AM, Joel Becker wrote:
> On Tue, Jan 21, 2014 at 10:04:29PM -0500, Ric Wheeler wrote:
>> One topic that has been lurking forever at the edges is the current
>> 4k limitation for file system block sizes. Some devices in
>> production today and others coming soon have larger sectors and it
>> would be interesting to see if it is time to poke at this topic
>> again.
>>
>> LSF/MM seems to be pretty much the only event of the year that most
>> of the key people will be present, so should be a great topic for a
>> joint session.
>
> Oh yes, I want in on this. We handle 4k/16k/64k pages "seamlessly," and
> we would want to do the same for larger sectors. In theory, our code
> should handle it with the appropriate defines updated.
>
+1

The shingled drive folks would really love us for this.
Plus it would make life really easy for those types of devices.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
h...@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
[RESEND PATCH V5 8/8] cpuidle/powernv: Parse device tree to setup idle states
Add deep idle states such as nap and fast sleep to the cpuidle state table only if they are discovered from the device tree during cpuidle initialization. Signed-off-by: Preeti U Murthy --- drivers/cpuidle/cpuidle-powernv.c | 81 + 1 file changed, 64 insertions(+), 17 deletions(-) diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c index 90f0c2b..b3face5 100644 --- a/drivers/cpuidle/cpuidle-powernv.c +++ b/drivers/cpuidle/cpuidle-powernv.c @@ -12,10 +12,17 @@ #include #include #include +#include #include #include +/* Flags and constants used in PowerNV platform */ + +#define MAX_POWERNV_IDLE_STATES8 +#define IDLE_USE_INST_NAP 0x0001 /* Use nap instruction */ +#define IDLE_USE_INST_SLEEP0x0002 /* Use sleep instruction */ + struct cpuidle_driver powernv_idle_driver = { .name = "powernv_idle", .owner= THIS_MODULE, @@ -87,7 +94,7 @@ static int fastsleep_loop(struct cpuidle_device *dev, /* * States for dedicated partition case. */ -static struct cpuidle_state powernv_states[] = { +static struct cpuidle_state powernv_states[MAX_POWERNV_IDLE_STATES] = { { /* Snooze */ .name = "snooze", .desc = "snooze", @@ -95,20 +102,6 @@ static struct cpuidle_state powernv_states[] = { .exit_latency = 0, .target_residency = 0, .enter = &snooze_loop }, - { /* NAP */ - .name = "NAP", - .desc = "NAP", - .flags = CPUIDLE_FLAG_TIME_VALID, - .exit_latency = 10, - .target_residency = 100, - .enter = &nap_loop }, -{ /* Fastsleep */ - .name = "fastsleep", - .desc = "fastsleep", - .flags = CPUIDLE_FLAG_TIME_VALID, - .exit_latency = 10, - .target_residency = 100, - .enter = &fastsleep_loop }, }; static int powernv_cpuidle_add_cpu_notifier(struct notifier_block *n, @@ -169,19 +162,73 @@ static int powernv_cpuidle_driver_init(void) return 0; } +static int powernv_add_idle_states(void) +{ + struct device_node *power_mgt; + struct property *prop; + int nr_idle_states = 1; /* Snooze */ + int dt_idle_states; + u32 *flags; + int i; + + /* Currently we have snooze statically 
defined */ + + power_mgt = of_find_node_by_path("/ibm,opal/power-mgt"); + if (!power_mgt) { + pr_warn("opal: PowerMgmt Node not found\n"); + return nr_idle_states; + } + + prop = of_find_property(power_mgt, "ibm,cpu-idle-state-flags", NULL); + if (!prop) { + pr_warn("DT-PowerMgmt: missing ibm,cpu-idle-state-flags\n"); + return nr_idle_states; + } + + dt_idle_states = prop->length / sizeof(u32); + flags = (u32 *) prop->value; + + for (i = 0; i < dt_idle_states; i++) { + + if (flags[i] & IDLE_USE_INST_NAP) { + /* Add NAP state */ + strcpy(powernv_states[nr_idle_states].name, "Nap"); + strcpy(powernv_states[nr_idle_states].desc, "Nap"); + powernv_states[nr_idle_states].flags = CPUIDLE_FLAG_TIME_VALID; + powernv_states[nr_idle_states].exit_latency = 10; + powernv_states[nr_idle_states].target_residency = 100; + powernv_states[nr_idle_states].enter = &nap_loop; + nr_idle_states++; + } + + if (flags[i] & IDLE_USE_INST_SLEEP) { + /* Add FASTSLEEP state */ + strcpy(powernv_states[nr_idle_states].name, "FastSleep"); + strcpy(powernv_states[nr_idle_states].desc, "FastSleep"); + powernv_states[nr_idle_states].flags = CPUIDLE_FLAG_TIME_VALID; + powernv_states[nr_idle_states].exit_latency = 300; + powernv_states[nr_idle_states].target_residency = 100; + powernv_states[nr_idle_states].enter = &fastsleep_loop; + nr_idle_states++; + } + } + + return nr_idle_states; +} + /* * powernv_idle_probe() * Choose state table for shared versus dedicated partition */ static int powernv_idle_probe(void) { - if (cpuidle_disable != IDLE_NO_OVERRIDE) return -ENODEV; if (firmware_has_feature(FW_FEATURE_OPALv3)) { cpuidle_state_table = powernv_states; - max_idle_state = ARRAY_SIZE(powernv_states); + /* Device tree can indicate more idle states */ + max_idle_state = powernv_add_idle_states(); } else return -ENODEV; -- To unsubscribe from this list: send the line "unsubscribe l
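The table-filling loop in the patch above can be tried out in user space with a minimal sketch. The names toy_state, toy_add_state(), toy_add_idle_states() and the TOY_* constants are hypothetical; the flag values and the latency/residency numbers are copied from the patch, and the bounds check against the table size is an addition of this sketch, not something the posted patch does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_STATES		8	/* mirrors MAX_POWERNV_IDLE_STATES */
#define TOY_USE_INST_NAP	0x0001	/* mirrors IDLE_USE_INST_NAP */
#define TOY_USE_INST_SLEEP	0x0002	/* mirrors IDLE_USE_INST_SLEEP */

/* Hypothetical, trimmed-down stand-in for struct cpuidle_state. */
struct toy_state {
	char name[16];
	unsigned int exit_latency;
	unsigned int target_residency;
};

/* Append one state if there is room; returns the new state count. */
static int toy_add_state(struct toy_state *tbl, int n, const char *name,
			 unsigned int latency, unsigned int residency)
{
	if (n >= TOY_MAX_STATES)
		return n;	/* table full: drop the state */
	strncpy(tbl[n].name, name, sizeof(tbl[n].name) - 1);
	tbl[n].name[sizeof(tbl[n].name) - 1] = '\0';
	tbl[n].exit_latency = latency;
	tbl[n].target_residency = residency;
	return n + 1;
}

/* Slot 0 is assumed to hold snooze already, as in the patch. */
static int toy_add_idle_states(struct toy_state *tbl,
			       const uint32_t *flags, int nr_flags)
{
	int n = 1;
	int i;

	for (i = 0; i < nr_flags; i++) {
		if (flags[i] & TOY_USE_INST_NAP)
			n = toy_add_state(tbl, n, "Nap", 10, 100);
		if (flags[i] & TOY_USE_INST_SLEEP)
			n = toy_add_state(tbl, n, "FastSleep", 300, 100);
	}
	return n;
}

/* Small self-check; call it from a main() of your own. */
static void toy_demo(void)
{
	struct toy_state tbl[TOY_MAX_STATES] = { { "snooze", 0, 0 } };
	const uint32_t flags[] = { TOY_USE_INST_NAP, TOY_USE_INST_SLEEP };
	int n = toy_add_idle_states(tbl, flags, 2);

	assert(n == 3);
	assert(strcmp(tbl[1].name, "Nap") == 0);
	assert(strcmp(tbl[2].name, "FastSleep") == 0);
}
```

The return value plays the role of max_idle_state in the patch: with only snooze defined statically, a device tree advertising nap and sleep yields three states. The early return in toy_add_state() is one possible way to keep a long flags property from writing past the fixed-size table.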
[RESEND PATCH V5 7/8] cpuidle/powernv: Add "Fast-Sleep" CPU idle state
Fast sleep is one of the deep idle states on Power8 in which local timers of CPUs stop. On PowerPC we do not have an external clock device which can handle wakeup of such CPUs. Now that we have the support in the tick broadcast framework for archs that do not sport such a device and the low level support for fast sleep, enable it in the cpuidle framework on PowerNV. Signed-off-by: Preeti U Murthy --- arch/powerpc/Kconfig |2 ++ arch/powerpc/kernel/time.c|2 +- drivers/cpuidle/cpuidle-powernv.c | 42 + 3 files changed, 45 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index fa39517..ec91584 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -129,6 +129,8 @@ config PPC select GENERIC_CMOS_UPDATE select GENERIC_TIME_VSYSCALL_OLD select GENERIC_CLOCKEVENTS + select GENERIC_CLOCKEVENTS_BROADCAST + select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST select GENERIC_STRNCPY_FROM_USER select GENERIC_STRNLEN_USER select HAVE_MOD_ARCH_SPECIFIC diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index df2989b..95fa5ce 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -106,7 +106,7 @@ struct clock_event_device decrementer_clockevent = { .irq= 0, .set_next_event = decrementer_set_next_event, .set_mode = decrementer_set_mode, - .features = CLOCK_EVT_FEAT_ONESHOT, + .features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_C3STOP, }; EXPORT_SYMBOL(decrementer_clockevent); diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c index 78fd174..90f0c2b 100644 --- a/drivers/cpuidle/cpuidle-powernv.c +++ b/drivers/cpuidle/cpuidle-powernv.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include @@ -49,6 +50,40 @@ static int nap_loop(struct cpuidle_device *dev, return index; } +static int fastsleep_loop(struct cpuidle_device *dev, + struct cpuidle_driver *drv, + int index) +{ + int cpu = dev->cpu; + unsigned long old_lpcr = mfspr(SPRN_LPCR); + 
unsigned long new_lpcr; + + if (unlikely(system_state < SYSTEM_RUNNING)) + return index; + + new_lpcr = old_lpcr; + new_lpcr &= ~(LPCR_MER | LPCR_PECE); /* lpcr[mer] must be 0 */ + + /* exit powersave upon external interrupt, but not decrementer +* interrupt, Emulate sleep. +*/ + new_lpcr |= LPCR_PECE0; + + if (clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu)) { + new_lpcr |= LPCR_PECE1; + mtspr(SPRN_LPCR, new_lpcr); + power7_nap(); + } else { + mtspr(SPRN_LPCR, new_lpcr); + power7_sleep(); + } + clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu); + + mtspr(SPRN_LPCR, old_lpcr); + + return index; +} + /* * States for dedicated partition case. */ @@ -67,6 +102,13 @@ static struct cpuidle_state powernv_states[] = { .exit_latency = 10, .target_residency = 100, .enter = &nap_loop }, +{ /* Fastsleep */ + .name = "fastsleep", + .desc = "fastsleep", + .flags = CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 10, + .target_residency = 100, + .enter = &fastsleep_loop }, }; static int powernv_cpuidle_add_cpu_notifier(struct notifier_block *n,
[RESEND PATCH V5 4/8] powernv/cpuidle: Add context management for Fast Sleep
From: Vaidyanathan Srinivasan Before adding Fast-Sleep into the cpuidle framework, some low level support needs to be added to enable it. This includes saving and restoring of certain registers at entry and exit time of this state respectively just like we do in the NAP idle state. Signed-off-by: Vaidyanathan Srinivasan [Changelog modified by Preeti U. Murthy ] Signed-off-by: Preeti U. Murthy --- arch/powerpc/include/asm/processor.h |1 + arch/powerpc/kernel/exceptions-64s.S | 10 - arch/powerpc/kernel/idle_power7.S| 63 -- 3 files changed, 53 insertions(+), 21 deletions(-) diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h index b62de43..d660dc3 100644 --- a/arch/powerpc/include/asm/processor.h +++ b/arch/powerpc/include/asm/processor.h @@ -450,6 +450,7 @@ enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_POWERSAVE_OFF}; extern int powersave_nap; /* set if nap mode can be used in idle loop */ extern void power7_nap(void); +extern void power7_sleep(void); extern void flush_instruction_cache(void); extern void hard_reset_now(void); extern void poweroff_now(void); diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 38d5073..b01a9cb 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -121,9 +121,10 @@ BEGIN_FTR_SECTION cmpwi cr1,r13,2 /* Total loss of HV state is fatal, we could try to use the * PIR to locate a PACA, then use an emergency stack etc... -* but for now, let's just stay stuck here +* OPAL v3 based powernv platforms have new idle states +* which fall in this catagory. */ - bgt cr1,. 
+ bgt cr1,8f GET_PACA(r13) #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE @@ -141,6 +142,11 @@ BEGIN_FTR_SECTION beq cr1,2f b .power7_wakeup_noloss 2: b .power7_wakeup_loss + + /* Fast Sleep wakeup on PowerNV */ +8: GET_PACA(r13) + b .power7_wakeup_loss + 9: END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) #endif /* CONFIG_PPC_P7_NAP */ diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S index 3fdef0f..14f78be 100644 --- a/arch/powerpc/kernel/idle_power7.S +++ b/arch/powerpc/kernel/idle_power7.S @@ -20,17 +20,27 @@ #undef DEBUG - .text +/* Idle state entry routines */ -_GLOBAL(power7_idle) - /* Now check if user or arch enabled NAP mode */ - LOAD_REG_ADDRBASE(r3,powersave_nap) - lwz r4,ADDROFF(powersave_nap)(r3) - cmpwi 0,r4,0 - beqlr - /* fall through */ +#defineIDLE_STATE_ENTER_SEQ(IDLE_INST) \ + /* Magic NAP/SLEEP/WINKLE mode enter sequence */\ + std r0,0(r1); \ + ptesync;\ + ld r0,0(r1); \ +1: cmp cr0,r0,r0; \ + bne 1b; \ + IDLE_INST; \ + b . -_GLOBAL(power7_nap) + .text + +/* + * Pass requested state in r3: + * 0 - nap + * 1 - sleep + */ +_GLOBAL(power7_powersave_common) + /* Use r3 to pass state nap/sleep/winkle */ /* NAP is a state loss, we create a regs frame on the * stack, fill it up with the state we care about and * stick a pointer to it in PACAR1. We really only @@ -79,8 +89,8 @@ _GLOBAL(power7_nap) /* Continue saving state */ SAVE_GPR(2, r1) SAVE_NVGPRS(r1) - mfcrr3 - std r3,_CCR(r1) + mfcrr4 + std r4,_CCR(r1) std r9,_MSR(r1) std r1,PACAR1(r13) @@ -90,15 +100,30 @@ _GLOBAL(power7_enter_nap_mode) li r4,KVM_HWTHREAD_IN_NAP stb r4,HSTATE_HWTHREAD_STATE(r13) #endif + cmpwi cr0,r3,1 + beq 2f + IDLE_STATE_ENTER_SEQ(PPC_NAP) + /* No return */ +2: IDLE_STATE_ENTER_SEQ(PPC_SLEEP) + /* No return */ - /* Magic NAP mode enter sequence */ - std r0,0(r1) - ptesync - ld r0,0(r1) -1: cmp cr0,r0,r0 - bne 1b - PPC_NAP - b . 
+_GLOBAL(power7_idle) + /* Now check if user or arch enabled NAP mode */ + LOAD_REG_ADDRBASE(r3,powersave_nap) + lwz r4,ADDROFF(powersave_nap)(r3) + cmpwi 0,r4,0 + beqlr + /* fall through */ + +_GLOBAL(power7_nap) + li r3,0 + b power7_powersave_common + /* No return */ + +_GLOBAL(power7_sleep) + li r3,1 + b power7_powersave_common + /* No return */ _GLOBAL(power7_wakeup_loss) ld r1,PACAR1(r13) -- To unsubscribe from this list: send the lin
[RESEND PATCH V5 5/8] powermgt: Add OPAL call to resync timebase on wakeup
From: Vaidyanathan Srinivasan During "Fast-sleep" and deeper power savings state, decrementer and timebase could be stopped making it out of sync with rest of the cores in the system. Add a firmware call to request platform to resync timebase using low level platform methods. Signed-off-by: Vaidyanathan Srinivasan Signed-off-by: Preeti U. Murthy --- arch/powerpc/include/asm/opal.h|2 ++ arch/powerpc/kernel/exceptions-64s.S |2 +- arch/powerpc/kernel/idle_power7.S | 27 arch/powerpc/platforms/powernv/opal-wrappers.S |1 + 4 files changed, 31 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h index 9a87b44..8c4829f 100644 --- a/arch/powerpc/include/asm/opal.h +++ b/arch/powerpc/include/asm/opal.h @@ -154,6 +154,7 @@ extern int opal_enter_rtas(struct rtas_args *args, #define OPAL_FLASH_VALIDATE76 #define OPAL_FLASH_MANAGE 77 #define OPAL_FLASH_UPDATE 78 +#define OPAL_RESYNC_TIMEBASE 79 #define OPAL_GET_MSG 85 #define OPAL_CHECK_ASYNC_COMPLETION86 @@ -863,6 +864,7 @@ extern void opal_flash_init(void); extern int opal_machine_check(struct pt_regs *regs); extern void opal_shutdown(void); +extern int opal_resync_timebase(void); extern void opal_lpc_init(void); diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index b01a9cb..9533d7a 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -145,7 +145,7 @@ BEGIN_FTR_SECTION /* Fast Sleep wakeup on PowerNV */ 8: GET_PACA(r13) - b .power7_wakeup_loss + b .power7_wakeup_tb_loss 9: END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S index 14f78be..c3ab869 100644 --- a/arch/powerpc/kernel/idle_power7.S +++ b/arch/powerpc/kernel/idle_power7.S @@ -17,6 +17,7 @@ #include #include #include +#include #undef DEBUG @@ -125,6 +126,32 @@ _GLOBAL(power7_sleep) b power7_powersave_common /* No return */ 
+_GLOBAL(power7_wakeup_tb_loss) + ld r2,PACATOC(r13); + ld r1,PACAR1(r13) + + /* Time base re-sync */ + li r0,OPAL_RESYNC_TIMEBASE + LOAD_REG_ADDR(r11,opal); + ld r12,8(r11); + ld r2,0(r11); + mtctr r12 + bctrl + + /* TODO: Check r3 for failure */ + + REST_NVGPRS(r1) + REST_GPR(2, r1) + ld r3,_CCR(r1) + ld r4,_MSR(r1) + ld r5,_NIP(r1) + addir1,r1,INT_FRAME_SIZE + mtcrr3 + mfspr r3,SPRN_SRR1/* Return SRR1 */ + mtspr SPRN_SRR1,r4 + mtspr SPRN_SRR0,r5 + rfid + _GLOBAL(power7_wakeup_loss) ld r1,PACAR1(r13) REST_NVGPRS(r1) diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S index 719aa5c..a11a87c 100644 --- a/arch/powerpc/platforms/powernv/opal-wrappers.S +++ b/arch/powerpc/platforms/powernv/opal-wrappers.S @@ -126,5 +126,6 @@ OPAL_CALL(opal_return_cpu, OPAL_RETURN_CPU); OPAL_CALL(opal_validate_flash, OPAL_FLASH_VALIDATE); OPAL_CALL(opal_manage_flash, OPAL_FLASH_MANAGE); OPAL_CALL(opal_update_flash, OPAL_FLASH_UPDATE); +OPAL_CALL(opal_resync_timebase,OPAL_RESYNC_TIMEBASE); OPAL_CALL(opal_get_msg,OPAL_GET_MSG); OPAL_CALL(opal_check_completion, OPAL_CHECK_ASYNC_COMPLETION); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RESEND PATCH V5 6/8] time/cpuidle: Support in tick broadcast framework in the absence of external clock device
On some architectures, the local timers stop in certain deep CPU idle states, and an external clock device is used to wake up these CPUs. The kernel support for this wakeup is provided by the tick broadcast framework, which uses the external clock device as the wakeup source.

However, not all implementations of all architectures provide such an external clock device; some PowerPC ones do not. This patch adds support to the broadcast framework to handle the wakeup of CPUs in deep idle states on such systems by queuing a hrtimer on one of the CPUs. This CPU is identified as the bc_cpu. Each time the hrtimer expires, it handles the broadcast and is then reprogrammed for the next wakeup of the CPUs in deep idle state.

However, when a CPU is about to enter deep idle state with a wakeup time earlier than the time for which the hrtimer is currently programmed, it *becomes the new bc_cpu* and restarts the hrtimer on itself. This way the job of doing the broadcast is handed around to the CPU that asks for the earliest wakeup just before entering deep idle state. This is consistent with what happens when an external clock device is present: the SMP affinity of that clock device is set to the CPU with the earliest wakeup.

The important point here is that the bc_cpu cannot enter deep idle state, since it has a hrtimer queued to wake up the other CPUs in deep idle and hence cannot have its local timer stopped. Therefore, for such a CPU, the BROADCAST_ENTER notification has to fail, implying that it cannot enter deep idle state. On architectures where an external clock device is present, all CPUs can enter deep idle.

During hotplug of the bc_cpu, the job of doing the broadcast is assigned to the first CPU in the broadcast mask. This newly nominated bc_cpu is woken up by an IPI so as to queue the above-mentioned hrtimer on it.
Signed-off-by: Preeti U Murthy --- include/linux/clockchips.h |4 - kernel/time/clockevents.c|9 +- kernel/time/tick-broadcast.c | 192 ++ kernel/time/tick-internal.h |8 +- 4 files changed, 186 insertions(+), 27 deletions(-) diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h index 493aa02..bbda37b 100644 --- a/include/linux/clockchips.h +++ b/include/linux/clockchips.h @@ -186,9 +186,9 @@ static inline int tick_check_broadcast_expired(void) { return 0; } #endif #ifdef CONFIG_GENERIC_CLOCKEVENTS -extern void clockevents_notify(unsigned long reason, void *arg); +extern int clockevents_notify(unsigned long reason, void *arg); #else -static inline void clockevents_notify(unsigned long reason, void *arg) {} +static inline int clockevents_notify(unsigned long reason, void *arg) {} #endif #else /* CONFIG_GENERIC_CLOCKEVENTS_BUILD */ diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c index 086ad60..d61404e 100644 --- a/kernel/time/clockevents.c +++ b/kernel/time/clockevents.c @@ -524,12 +524,13 @@ void clockevents_resume(void) #ifdef CONFIG_GENERIC_CLOCKEVENTS /** * clockevents_notify - notification about relevant events + * Returns non zero on error. 
*/ -void clockevents_notify(unsigned long reason, void *arg) +int clockevents_notify(unsigned long reason, void *arg) { struct clock_event_device *dev, *tmp; unsigned long flags; - int cpu; + int cpu, ret = 0; raw_spin_lock_irqsave(&clockevents_lock, flags); @@ -542,11 +543,12 @@ void clockevents_notify(unsigned long reason, void *arg) case CLOCK_EVT_NOTIFY_BROADCAST_ENTER: case CLOCK_EVT_NOTIFY_BROADCAST_EXIT: - tick_broadcast_oneshot_control(reason); + ret = tick_broadcast_oneshot_control(reason); break; case CLOCK_EVT_NOTIFY_CPU_DYING: tick_handover_do_timer(arg); + tick_handover_broadcast_cpu(arg); break; case CLOCK_EVT_NOTIFY_SUSPEND: @@ -585,6 +587,7 @@ void clockevents_notify(unsigned long reason, void *arg) break; } raw_spin_unlock_irqrestore(&clockevents_lock, flags); + return ret; } EXPORT_SYMBOL_GPL(clockevents_notify); diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c index 9532690..1c23912 100644 --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -20,6 +20,7 @@ #include #include #include +#include #include "tick-internal.h" @@ -35,6 +36,15 @@ static cpumask_var_t tmpmask; static DEFINE_RAW_SPINLOCK(tick_broadcast_lock); static int tick_broadcast_force; +/* + * Helper variables for handling broadcast in the absence of a + * tick_broadcast_device. + * */ +static struct hrtimer *bc_hrtimer; +static int bc_cpu = -1; +static ktime_t bc_next_wakeup; +static int hrtimer_initialized = 0; + #ifdef CONFIG_TICK_ONESHOT static void tick_broadcast_clear_oneshot(int cpu); #else @@ -528,6 +538,20 @@ static int tick
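The hand-off rule described in the changelog of this patch can be modelled in a few lines. This is a sketch built only from that description, since the tick-broadcast.c hunk is cut off above; struct bc_state, bc_init() and bc_enter() are hypothetical names, and the rule that the first CPU to ask becomes the bc_cpu is an assumption. It omits the hrtimer itself, locking, and hotplug handover.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical model of the broadcast bookkeeping. */
struct bc_state {
    int bc_cpu;             /* -1 while no CPU holds the broadcast duty */
    long long next_wakeup;  /* time the simulated hrtimer is programmed for */
};

static void bc_init(struct bc_state *s)
{
    s->bc_cpu = -1;
    s->next_wakeup = LLONG_MAX;
}

/*
 * Called when 'cpu' wants to enter deep idle and needs waking at 'wakeup'.
 * Returns nonzero when the request must fail, i.e. the CPU is (or has just
 * become) the bc_cpu and so has to keep its local timer running.
 */
static int bc_enter(struct bc_state *s, int cpu, long long wakeup)
{
    assert(cpu >= 0);

    if (s->bc_cpu == -1 || wakeup < s->next_wakeup) {
        /* Earliest wakeup so far: take over the broadcast duty. */
        s->bc_cpu = cpu;
        s->next_wakeup = wakeup;
        return 1;
    }
    /* The current bc_cpu stays out of deep idle; others may enter. */
    return s->bc_cpu == cpu;
}
```

A caller would treat a nonzero return the way fastsleep_loop() treats a failed clockevents_notify(): choose a shallower idle state for that CPU.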
[RESEND PATCH V5 3/8] cpuidle/ppc: Split timer_interrupt() into timer handling and interrupt handling routines
Split timer_interrupt(), which is the local timer interrupt handler on ppc into routines called during regular interrupt handling and __timer_interrupt(), which takes care of running local timers and collecting time related stats. This will enable callers interested only in running expired local timers to directly call into __timer_interupt(). One of the use cases of this is the tick broadcast IPI handling in which the sleeping CPUs need to handle the local timers that have expired. Signed-off-by: Preeti U Murthy --- arch/powerpc/kernel/time.c | 81 +--- 1 file changed, 46 insertions(+), 35 deletions(-) diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index 3ff97db..df2989b 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -478,6 +478,47 @@ void arch_irq_work_raise(void) #endif /* CONFIG_IRQ_WORK */ +void __timer_interrupt(void) +{ + struct pt_regs *regs = get_irq_regs(); + u64 *next_tb = &__get_cpu_var(decrementers_next_tb); + struct clock_event_device *evt = &__get_cpu_var(decrementers); + u64 now; + + trace_timer_interrupt_entry(regs); + + if (test_irq_work_pending()) { + clear_irq_work_pending(); + irq_work_run(); + } + + now = get_tb_or_rtc(); + if (now >= *next_tb) { + *next_tb = ~(u64)0; + if (evt->event_handler) + evt->event_handler(evt); + __get_cpu_var(irq_stat).timer_irqs_event++; + } else { + now = *next_tb - now; + if (now <= DECREMENTER_MAX) + set_dec((int)now); + /* We may have raced with new irq work */ + if (test_irq_work_pending()) + set_dec(1); + __get_cpu_var(irq_stat).timer_irqs_others++; + } + +#ifdef CONFIG_PPC64 + /* collect purr register values often, for accurate calculations */ + if (firmware_has_feature(FW_FEATURE_SPLPAR)) { + struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array); + cu->current_tb = mfspr(SPRN_PURR); + } +#endif + + trace_timer_interrupt_exit(regs); +} + /* * timer_interrupt - gets called when the decrementer overflows, * with interrupts disabled. 
@@ -486,8 +527,6 @@ void timer_interrupt(struct pt_regs * regs) { struct pt_regs *old_regs; u64 *next_tb = &__get_cpu_var(decrementers_next_tb); - struct clock_event_device *evt = &__get_cpu_var(decrementers); - u64 now; /* Ensure a positive value is written to the decrementer, or else * some CPUs will continue to take decrementer exceptions. @@ -519,39 +558,7 @@ void timer_interrupt(struct pt_regs * regs) old_regs = set_irq_regs(regs); irq_enter(); - trace_timer_interrupt_entry(regs); - - if (test_irq_work_pending()) { - clear_irq_work_pending(); - irq_work_run(); - } - - now = get_tb_or_rtc(); - if (now >= *next_tb) { - *next_tb = ~(u64)0; - if (evt->event_handler) - evt->event_handler(evt); - __get_cpu_var(irq_stat).timer_irqs_event++; - } else { - now = *next_tb - now; - if (now <= DECREMENTER_MAX) - set_dec((int)now); - /* We may have raced with new irq work */ - if (test_irq_work_pending()) - set_dec(1); - __get_cpu_var(irq_stat).timer_irqs_others++; - } - -#ifdef CONFIG_PPC64 - /* collect purr register values often, for accurate calculations */ - if (firmware_has_feature(FW_FEATURE_SPLPAR)) { - struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array); - cu->current_tb = mfspr(SPRN_PURR); - } -#endif - - trace_timer_interrupt_exit(regs); - + __timer_interrupt(); irq_exit(); set_irq_regs(old_regs); } @@ -828,6 +835,10 @@ static void decrementer_set_mode(enum clock_event_mode mode, /* Interrupt handler for the timer broadcast IPI */ void tick_broadcast_ipi_handler(void) { + u64 *next_tb = &__get_cpu_var(decrementers_next_tb); + + *next_tb = get_tb_or_rtc(); + __timer_interrupt(); } static void register_decrementer_clockevent(int cpu) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
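The effect of the split can be shown with a small user-space model. It is a sketch, not the kernel code: struct timer_model, run_local_timers() and broadcast_ipi() are hypothetical stand-ins for the per-CPU state, __timer_interrupt() and tick_broadcast_ipi_handler(), and DEC_MAX is an assumed limit. The point it illustrates is that setting next_tb to the current time before calling the timer routine forces the "event is due" branch.

```c
#include <assert.h>
#include <stdint.h>

#define DEC_MAX 0x7fffffffULL  /* assumed decrementer limit */

/* Hypothetical stand-in for the per-CPU timer state. */
struct timer_model {
    uint64_t next_tb;  /* timebase value of the next local event */
    uint64_t dec;      /* last value "written" to the decrementer */
    unsigned events;   /* times the event handler would have run */
    unsigned others;   /* times the timer was only reprogrammed */
};

/* Models the core of __timer_interrupt(): run or reprogram. */
static void run_local_timers(struct timer_model *t, uint64_t now)
{
    assert(t != 0);

    if (now >= t->next_tb) {
        t->next_tb = ~(uint64_t)0;  /* no event pending any more */
        t->events++;
    } else {
        uint64_t delta = t->next_tb - now;

        if (delta <= DEC_MAX)
            t->dec = delta;
        t->others++;
    }
}

/* Models tick_broadcast_ipi_handler(): make the event due, then run it. */
static void broadcast_ipi(struct timer_model *t, uint64_t now)
{
    t->next_tb = now;
    run_local_timers(t, now);
}
```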
[RESEND PATCH V5 1/8] powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI message
From: Srivatsa S. Bhat The IPI handlers for both PPC_MSG_CALL_FUNC and PPC_MSG_CALL_FUNC_SINGLE map to a common implementation - generic_smp_call_function_single_interrupt(). So, we can consolidate them and save one of the IPI message slots, (which are precious on powerpc, since only 4 of those slots are available). So, implement the functionality of PPC_MSG_CALL_FUNC_SINGLE using PPC_MSG_CALL_FUNC itself and release its IPI message slot, so that it can be used for something else in the future, if desired. Signed-off-by: Srivatsa S. Bhat Signed-off-by: Preeti U. Murthy Acked-by: Geoff Levand [For the PS3 part] --- arch/powerpc/include/asm/smp.h |2 +- arch/powerpc/kernel/smp.c | 12 +--- arch/powerpc/platforms/cell/interrupt.c |2 +- arch/powerpc/platforms/ps3/smp.c|2 +- 4 files changed, 8 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h index 084e080..9f7356b 100644 --- a/arch/powerpc/include/asm/smp.h +++ b/arch/powerpc/include/asm/smp.h @@ -120,7 +120,7 @@ extern int cpu_to_core_id(int cpu); * in /proc/interrupts will be wrong!!! 
--Troy */ #define PPC_MSG_CALL_FUNCTION 0 #define PPC_MSG_RESCHEDULE 1 -#define PPC_MSG_CALL_FUNC_SINGLE 2 +#define PPC_MSG_UNUSED 2 #define PPC_MSG_DEBUGGER_BREAK 3 /* for irq controllers that have dedicated ipis per message (4) */ diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index ac2621a..ee7d76b 100644 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -145,9 +145,9 @@ static irqreturn_t reschedule_action(int irq, void *data) return IRQ_HANDLED; } -static irqreturn_t call_function_single_action(int irq, void *data) +static irqreturn_t unused_action(int irq, void *data) { - generic_smp_call_function_single_interrupt(); + /* This slot is unused and hence available for use, if needed */ return IRQ_HANDLED; } @@ -168,14 +168,14 @@ static irqreturn_t debug_ipi_action(int irq, void *data) static irq_handler_t smp_ipi_action[] = { [PPC_MSG_CALL_FUNCTION] = call_function_action, [PPC_MSG_RESCHEDULE] = reschedule_action, - [PPC_MSG_CALL_FUNC_SINGLE] = call_function_single_action, + [PPC_MSG_UNUSED] = unused_action, [PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action, }; const char *smp_ipi_name[] = { [PPC_MSG_CALL_FUNCTION] = "ipi call function", [PPC_MSG_RESCHEDULE] = "ipi reschedule", - [PPC_MSG_CALL_FUNC_SINGLE] = "ipi call function single", + [PPC_MSG_UNUSED] = "ipi unused", [PPC_MSG_DEBUGGER_BREAK] = "ipi debugger", }; @@ -251,8 +251,6 @@ irqreturn_t smp_ipi_demux(void) generic_smp_call_function_interrupt(); if (all & IPI_MESSAGE(PPC_MSG_RESCHEDULE)) scheduler_ipi(); - if (all & IPI_MESSAGE(PPC_MSG_CALL_FUNC_SINGLE)) - generic_smp_call_function_single_interrupt(); if (all & IPI_MESSAGE(PPC_MSG_DEBUGGER_BREAK)) debug_ipi_action(0, NULL); } while (info->messages); @@ -280,7 +278,7 @@ EXPORT_SYMBOL_GPL(smp_send_reschedule); void arch_send_call_function_single_ipi(int cpu) { - do_message_pass(cpu, PPC_MSG_CALL_FUNC_SINGLE); + do_message_pass(cpu, PPC_MSG_CALL_FUNCTION); } void arch_send_call_function_ipi_mask(const struct cpumask *mask) 
diff --git a/arch/powerpc/platforms/cell/interrupt.c b/arch/powerpc/platforms/cell/interrupt.c index 2d42f3b..adf3726 100644 --- a/arch/powerpc/platforms/cell/interrupt.c +++ b/arch/powerpc/platforms/cell/interrupt.c @@ -215,7 +215,7 @@ void iic_request_IPIs(void) { iic_request_ipi(PPC_MSG_CALL_FUNCTION); iic_request_ipi(PPC_MSG_RESCHEDULE); - iic_request_ipi(PPC_MSG_CALL_FUNC_SINGLE); + iic_request_ipi(PPC_MSG_UNUSED); iic_request_ipi(PPC_MSG_DEBUGGER_BREAK); } diff --git a/arch/powerpc/platforms/ps3/smp.c b/arch/powerpc/platforms/ps3/smp.c index 4b35166..00d1a7c 100644 --- a/arch/powerpc/platforms/ps3/smp.c +++ b/arch/powerpc/platforms/ps3/smp.c @@ -76,7 +76,7 @@ static int __init ps3_smp_probe(void) BUILD_BUG_ON(PPC_MSG_CALL_FUNCTION!= 0); BUILD_BUG_ON(PPC_MSG_RESCHEDULE != 1); - BUILD_BUG_ON(PPC_MSG_CALL_FUNC_SINGLE != 2); + BUILD_BUG_ON(PPC_MSG_UNUSED != 2); BUILD_BUG_ON(PPC_MSG_DEBUGGER_BREAK != 3); for (i = 0; i < MSG_COUNT; i++) { -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RESEND PATCH V5 2/8] powerpc: Implement tick broadcast IPI as a fixed IPI message
From: Srivatsa S. Bhat For scalability and performance reasons, we want the tick broadcast IPIs to be handled as efficiently as possible. Fixed IPI messages are one of the most efficient mechanisms available - they are faster than the smp_call_function mechanism because the IPI handlers are fixed and hence they don't involve costly operations such as adding IPI handlers to the target CPU's function queue, acquiring locks for synchronization etc. Luckily we have an unused IPI message slot, so use that to implement tick broadcast IPIs efficiently. Signed-off-by: Srivatsa S. Bhat [Functions renamed to tick_broadcast* and Changelog modified by Preeti U. Murthy] Signed-off-by: Preeti U. Murthy Acked-by: Geoff Levand [For the PS3 part] --- arch/powerpc/include/asm/smp.h |2 +- arch/powerpc/include/asm/time.h |1 + arch/powerpc/kernel/smp.c | 19 +++ arch/powerpc/kernel/time.c |5 + arch/powerpc/platforms/cell/interrupt.c |2 +- arch/powerpc/platforms/ps3/smp.c|2 +- 6 files changed, 24 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h index 9f7356b..ff51046 100644 --- a/arch/powerpc/include/asm/smp.h +++ b/arch/powerpc/include/asm/smp.h @@ -120,7 +120,7 @@ extern int cpu_to_core_id(int cpu); * in /proc/interrupts will be wrong!!! 
--Troy */ #define PPC_MSG_CALL_FUNCTION 0 #define PPC_MSG_RESCHEDULE 1 -#define PPC_MSG_UNUSED 2 +#define PPC_MSG_TICK_BROADCAST 2 #define PPC_MSG_DEBUGGER_BREAK 3 /* for irq controllers that have dedicated ipis per message (4) */ diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h index c1f2676..1d428e6 100644 --- a/arch/powerpc/include/asm/time.h +++ b/arch/powerpc/include/asm/time.h @@ -28,6 +28,7 @@ extern struct clock_event_device decrementer_clockevent; struct rtc_time; extern void to_tm(int tim, struct rtc_time * tm); extern void GregorianDay(struct rtc_time *tm); +extern void tick_broadcast_ipi_handler(void); extern void generic_calibrate_decr(void); diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index ee7d76b..6f06f05 100644 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -35,6 +35,7 @@ #include #include #include +#include #include #include #include @@ -145,9 +146,9 @@ static irqreturn_t reschedule_action(int irq, void *data) return IRQ_HANDLED; } -static irqreturn_t unused_action(int irq, void *data) +static irqreturn_t tick_broadcast_ipi_action(int irq, void *data) { - /* This slot is unused and hence available for use, if needed */ + tick_broadcast_ipi_handler(); return IRQ_HANDLED; } @@ -168,14 +169,14 @@ static irqreturn_t debug_ipi_action(int irq, void *data) static irq_handler_t smp_ipi_action[] = { [PPC_MSG_CALL_FUNCTION] = call_function_action, [PPC_MSG_RESCHEDULE] = reschedule_action, - [PPC_MSG_UNUSED] = unused_action, + [PPC_MSG_TICK_BROADCAST] = tick_broadcast_ipi_action, [PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action, }; const char *smp_ipi_name[] = { [PPC_MSG_CALL_FUNCTION] = "ipi call function", [PPC_MSG_RESCHEDULE] = "ipi reschedule", - [PPC_MSG_UNUSED] = "ipi unused", + [PPC_MSG_TICK_BROADCAST] = "ipi tick-broadcast", [PPC_MSG_DEBUGGER_BREAK] = "ipi debugger", }; @@ -251,6 +252,8 @@ irqreturn_t smp_ipi_demux(void) generic_smp_call_function_interrupt(); if (all & 
IPI_MESSAGE(PPC_MSG_RESCHEDULE)) scheduler_ipi(); + if (all & IPI_MESSAGE(PPC_MSG_TICK_BROADCAST)) + tick_broadcast_ipi_handler(); if (all & IPI_MESSAGE(PPC_MSG_DEBUGGER_BREAK)) debug_ipi_action(0, NULL); } while (info->messages); @@ -289,6 +292,14 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask) do_message_pass(cpu, PPC_MSG_CALL_FUNCTION); } +void tick_broadcast(const struct cpumask *mask) +{ + unsigned int cpu; + + for_each_cpu(cpu, mask) + do_message_pass(cpu, PPC_MSG_TICK_BROADCAST); +} + #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC) void smp_send_debugger_break(void) { diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index b3dab20..3ff97db 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -825,6 +825,11 @@ static void decrementer_set_mode(enum clock_event_mode mode, decrementer_set_next_event(DECREMENTER_MAX, dev); } +/* Interrupt handler for the timer broadcast IPI */ +void tick_broadcast_ipi_handler(void) +{ +} + static void register_decrementer_clockevent(int cpu) { struct clock_event_device *dec = &per_cpu(decrementers, cpu); diff --git a/arch/powerpc/platforms/cell/interrupt.c b/arch/powerpc/platforms/cell/interrupt.
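A fixed IPI message boils down to one bit per message in a pending word plus a fixed dispatch. The following is a minimal sketch of that idea under simplifying assumptions: MSG_BIT() is a plain shift rather than the kernel's IPI_MESSAGE() encoding, the MSG_* names and the hits[] counters are hypothetical, and a real implementation would have to fetch and clear the pending word atomically.

```c
#include <assert.h>

/* Hypothetical message numbers mirroring the four fixed slots. */
enum {
    MSG_CALL_FUNCTION  = 0,
    MSG_RESCHEDULE     = 1,
    MSG_TICK_BROADCAST = 2,
    MSG_DEBUGGER_BREAK = 3,
    MSG_COUNT          = 4
};

#define MSG_BIT(m) (1u << (m))  /* simplified; not the kernel encoding */

/* Per-message counters stand in for the real handlers. */
static unsigned int hits[MSG_COUNT];

/*
 * Drain the pending word: every set bit dispatches straight to its fixed
 * handler, with no queue of callbacks to walk. Returns the number of
 * messages handled.
 */
static unsigned int ipi_demux(unsigned int *pending)
{
    unsigned int handled = 0;

    assert(pending != 0);

    while (*pending) {
        unsigned int all = *pending;
        int m;

        *pending = 0;  /* non-atomic here; fine only in this model */
        for (m = 0; m < MSG_COUNT; m++) {
            if (all & MSG_BIT(m)) {
                hits[m]++;
                handled++;
            }
        }
    }
    return handled;
}
```

Posting a tick broadcast to a CPU then amounts to setting MSG_BIT(MSG_TICK_BROADCAST) in that CPU's pending word and raising the interrupt.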
[RESEND PATCH V5 0/8] cpuidle/ppc: Enable deep idle states on PowerNV
On PowerPC, when CPUs enter certain deep idle states, the local timers stop and the time base could go out of sync with the rest of the cores in the system.

This patchset adds support to wake up CPUs in such idle states by broadcasting IPIs to them at their next timer events using the tick broadcast framework in the Linux kernel. We refer to these IPIs as the tick broadcast IPIs in this patchset.

However, the tick broadcast framework as it exists today makes use of an external clock device to wake up CPUs in such idle states, and not all implementations of PowerPC provide such an external clock device. Hence Patch[6/8]: [time/cpuidle: Support in tick broadcast framework for archs without external clock device] adds support in the tick broadcast framework for such use cases by queuing a hrtimer on one of the CPUs, which is meant to handle the wakeup of CPUs in deep idle states. This patch was posted separately at: https://lkml.org/lkml/2013/12/12/687.

Patches 1-3 add support in powerpc to hook onto the tick broadcast framework.

The patchset also includes support for resyncing of the time base with the rest of the cores in the system and context management for fast sleep. PATCH[4/8] and PATCH[5/8] address these issues.

With the required support for deep idle states thus in place, the patchset adds the "Fast-Sleep" idle state into cpuidle (Patches 7 and 8). "Fast-Sleep" is a deep idle state on Power8 in which the above-mentioned challenges exist. Fast-Sleep can yield significantly more power savings than the idle states that we have in cpuidle so far.

This patchset is based on Ben's ppc next branch at commit fac515db45207718 [Merge remote-tracking branch 'scott/next' into next], and the cpuidle driver for powernv posted by Deepthi Dharwar: https://lkml.org/lkml/2014/1/14/172.

The same patchset, minus the resolving of merge conflicts with Ben's ppc next branch, had been posted earlier at http://lkml.org/lkml/2014/1/15/70.
This Repost resolves these merge conflicts with Ben's ppc next branch. Hence the Repost. Besides, the earlier post was based and tested on a mainline commit that was quite old. However, the patchset posted earlier at http://lkml.org/lkml/2014/1/15/70, along with Deepthi's patches on the cpuidle driver for powernv, applies cleanly on the mainline kernel at commit 85ce70fdf48aa290b484531 dated Jan 16 2014 and has been tested on the same at the time of this Repost.

Changes in V5:

The primary change in this version is in Patch[6/8]. As per the discussions in the V4 posting of this patchset, it was decided to refine handling the wakeup of CPUs in fast-sleep by doing the following:

1. In V4, a polling mechanism was used by the CPU handling broadcast to find out the time of next wakeup of the CPUs in deep idle states. V5 avoids polling in the way described under PATCH[6/8] in this patchset.

2. The mechanism of broadcast handling of CPUs in deep idle in the absence of an external wakeup device should be generic and not arch specific code. Hence in this version this functionality has been integrated into the tick broadcast framework in the kernel, unlike before where it was handled in powerpc specific code.

3. It was suggested that the "broadcast cpu" can be the time keeping cpu itself. However this has challenges of its own:

   a. The time keeping cpu need not exist when all cpus are idle. Hence there are phases in time when the time keeping cpu is absent. But for the use case that this patchset is trying to address we rely on the presence of a broadcast cpu all the time.

   b. The nomination and un-assignment of the time keeping cpu is not protected by a lock today and need not be as well, since such is its use case in the kernel. However we would need locks if we double up the time keeping cpu as the broadcast cpu.

Hence the broadcast cpu is independent of the time-keeping cpu. However PATCH[6/8] proposes a simpler solution to pick a broadcast cpu in this version.
Changes in V4: https://lkml.org/lkml/2013/11/29/97

1. Add Fast Sleep CPU idle state on PowerNV.
2. Add the required context management for Fast Sleep and the call to OPAL to synchronize time base after wakeup from fast sleep.
4. Add parsing of CPU idle states from the device tree to populate the cpuidle state table.
5. Rename ambiguous functions in the code around waking up of CPUs from fast sleep.
6. Fixed a bug in re-programming of the hrtimer that is queued to wakeup the CPUs in fast sleep and modified Changelogs.
7. Added the ARCH_HAS_TICK_BROADCAST option. This signifies that we have a arch specific function to perform broadcast.

Changes in V3: http://thread.gmane.org/gmane.linux.power-management.general/38113

1. Fix the way in which a broadcast ipi is handled on the idling cpus. Timer handling on a broadcast ipi is being done now without missing out any timer stats generation.
2. Fix a bug in the programming of the hrtimer meant to do broadcast. Program it to trigger at the earlier of a "broadcast period", and the next wake
Re: [PATCH] swap: do not skip lowest_bit in scan_swap_map() scan loop
On Tue, 21 Jan 2014, Jamie Liu wrote:

> In the second half of scan_swap_map()'s scan loop, offset is set to
> si->lowest_bit and then incremented before entering the loop for the
> first time, causing si->swap_map[si->lowest_bit] to be skipped.
>
> Signed-off-by: Jamie Liu

Acked-by: Hugh Dickins

Good catch. At first I was puzzled that this off-by-one could have gone
unnoticed for so long (ever since 2.6.29); but now I think that almost
always we have a good amount of slack, in those pages duplicated between
swap and swapcache, which can be reclaimed at the vm_swap_full() check,
and so conceal this loss of a single slot.

> ---
>  mm/swapfile.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 612a7c9..6635081 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -616,7 +616,7 @@ scan:
>                 }
>         }
>         offset = si->lowest_bit;
> -       while (++offset < scan_base) {
> +       while (offset < scan_base) {
>                 if (!si->swap_map[offset]) {
>                         spin_lock(&si->lock);
>                         goto checks;
> @@ -629,6 +629,7 @@ scan:
>                         cond_resched();
>                         latency_ration = LATENCY_LIMIT;
>                 }
> +               offset++;
>         }
>         spin_lock(&si->lock);
>
> --
> 1.8.5.3
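The off-by-one is easy to reproduce outside the kernel. The following is a minimal sketch, assuming a plain byte array in place of si->swap_map (0 meaning "slot free"); scan_old() and scan_fixed() are hypothetical names that keep only the loop shape from before and after the patch, without the locking and latency handling.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Loop shape before the patch: the pre-increment moves past lowest_bit
 * before the first test, so that slot is never examined.
 * Returns the first free offset found, or -1.
 */
static long scan_old(const unsigned char *map, size_t lowest_bit, size_t scan_base)
{
    size_t offset = lowest_bit;

    assert(map != NULL);
    while (++offset < scan_base) {
        if (!map[offset])
            return (long)offset;
    }
    return -1;
}

/*
 * Loop shape after the patch: test first, increment at the end of the
 * body, so lowest_bit itself is examined.
 */
static long scan_fixed(const unsigned char *map, size_t lowest_bit, size_t scan_base)
{
    size_t offset = lowest_bit;

    assert(map != NULL);
    while (offset < scan_base) {
        if (!map[offset])
            return (long)offset;
        offset++;
    }
    return -1;
}
```

With a map whose only free slot sits exactly at lowest_bit, the old shape reports nothing free while the fixed shape finds the slot.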
Re: [RFC PATCH V3 2/2] fs : Add sanity checks for block size > PAGE_SIZE
On 01/22/2014 01:51 AM, Andrew Morton wrote:
> On Tue, 21 Jan 2014 17:00:00 +0530 Raghavendra K T wrote:
>
>> We could hit null pointer dereference error during alloc_page_buffers in :
>> (1) block size > PAGE_SIZE (2) low memory.
>> Add sanity check for that.
>>
>> Signed-off-by: Raghavendra K T
>> ---
>>  fs/block_dev.c | 1 +
>>  fs/buffer.c    | 6 ++
>>  2 files changed, 7 insertions(+)
>>
>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>> index 1e86823..2481d42 100644
>> --- a/fs/block_dev.c
>> +++ b/fs/block_dev.c
>> @@ -1027,6 +1027,7 @@ void bd_set_size(struct block_device *bdev, loff_t size)
>>                  break;
>>          bsize <<= 1;
>>      }
>> +    BUG_ON(bsize > PAGE_SIZE);
>>      bdev->bd_block_size = bsize;
>>      bdev->bd_inode->i_blkbits = blksize_bits(bsize);
>>  }
>
> alloc_page_buffers() will always return NULL if passed size >= PAGE_SIZE.
> So if we're going to add a check, it would be better to add it to
> alloc_page_buffers() because that will catch errors from the widest
> range of callsites.

In that case how about converting BUG_ON to setting a default value of
PAGE_SIZE for bs in bd_set_size() itself (with a warning)?

> But alloc_page_buffers() is pretty frequently called and I'd be
> inclined to not add any check - most callers will just go oops and that
> will provide basically the same information.

Agree with this concern.

>> --- a/fs/buffer.c
>> +++ b/fs/buffer.c
>> @@ -1571,6 +1571,12 @@ void create_empty_buffers(struct page *page,
>>      struct buffer_head *bh, *head, *tail;
>>
>>      head = alloc_page_buffers(page, blocksize, 1);
>> +
>> +    /*
>> +     * alloc_page_buffers() could return NULL on (1) bs > PAGE_SIZE
>> +     * (2) low memory case. Ensure that we don't dereference null ptr
>> +     */
>> +    BUG_ON(!head);
>
> This is unneeded.
>
> - bs > PAGE_SIZE can be checked elsewhere in a direct fashion
>
> - low memory case can't happen - we passed retry=1
>
> - create_empty_buffers() will immediately go oops if head==NULL. That
>   oops contains the same info as is presented by a BUG().

Okay.
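For reference, the alternative floated in this thread (fall back to PAGE_SIZE with a warning instead of BUG_ON) could look roughly like the sketch below. It only illustrates the suggestion and is not code from any posted patch: pick_block_size() and struct bsize_choice are hypothetical, the doubling loop is assumed from the quoted bd_set_size() context, and the clamped flag stands in for the warning.

```c
#include <assert.h>

/* Hypothetical result type: chosen block size plus a "had to clamp" flag. */
struct bsize_choice {
    unsigned int bsize;
    int clamped;  /* nonzero where the kernel version would emit a warning */
};

/*
 * Grow the block size from the logical block size while it still divides
 * the device size, but never beyond the page size. A starting size larger
 * than a page is clamped instead of triggering a BUG.
 */
static struct bsize_choice pick_block_size(unsigned long long size,
                                           unsigned int logical_bsize,
                                           unsigned int page_size)
{
    struct bsize_choice c = { logical_bsize, 0 };

    assert(logical_bsize != 0 && page_size != 0);

    while (c.bsize < page_size) {
        if (size & c.bsize)
            break;
        c.bsize <<= 1;
    }
    if (c.bsize > page_size) {
        c.bsize = page_size;
        c.clamped = 1;
    }
    return c;
}
```

Whether silently clamping is acceptable for a device whose logical block size really exceeds a page is exactly the open question in the discussion; the sketch takes no position on it.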
Re: [PATCH RFC 00/73] tree-wide: clean up some no longer required #include
Hi Paul, On Tue, 21 Jan 2014 16:22:03 -0500 Paul Gortmaker wrote: > > Where: This work exists as a queue of patches that I apply to > linux-next; since the changes are fixing some things that currently > can only be found there. The patch series can be found at: > >http://git.kernel.org/cgit/linux/kernel/git/paulg/init.git >git://git.kernel.org/pub/scm/linux/kernel/git/paulg/init.git > > I've avoided annoying Stephen with another queue of patches for > linux-next while the development content was in flux, but now that > the merge window has opened, and new additions are fewer, perhaps he > wouldn't mind tacking it on the end... Stephen? OK, I have added this to the end of linux-next today - we will see how we go. It is called "init". Thanks for adding your subsystem tree as a participant of linux-next. As you may know, this is not a judgment of your code. The purpose of linux-next is for integration testing and to lower the impact of conflicts between subsystems in the next merge window. You will need to ensure that the patches/commits in your tree/series have been: * submitted under GPL v2 (or later) and include the Contributor's Signed-off-by, * posted to the relevant mailing list, * reviewed by you (or another maintainer of your subsystem tree), * successfully unit tested, and * destined for the current or next Linux merge window. Basically, this should be just what you would send to Linus (or ask him to fetch). It is allowed to be rebased if you deem it necessary. -- Cheers, Stephen Rothwell s...@canb.auug.org.au Legal Stuff: By participating in linux-next, your subsystem tree contributions are public and will be included in the linux-next trees. You may be sent e-mail messages indicating errors or other issues when the patches/commits from your subsystem tree are merged and tested in linux-next. These messages may also be cross-posted to the linux-next mailing list, the linux-kernel mailing list, etc. 
The linux-next tree project and IBM (my employer) make no warranties regarding the linux-next project, the testing procedures, the results, the e-mails, etc. If you don't agree to these ground rules, let me know and I'll remove your tree from participation in linux-next.
Re: [patch 9/9] mm: keep page cache radix tree nodes in check
On Wed, Jan 22, 2014 at 02:06:07PM +1100, Dave Chinner wrote: > On Tue, Jan 21, 2014 at 12:50:17AM -0500, Johannes Weiner wrote: > > On Tue, Jan 21, 2014 at 02:03:58PM +1100, Dave Chinner wrote: > > > On Mon, Jan 20, 2014 at 06:17:37PM -0500, Johannes Weiner wrote: > > > > On Fri, Jan 17, 2014 at 11:05:17AM +1100, Dave Chinner wrote: > > > > > On Fri, Jan 10, 2014 at 01:10:43PM -0500, Johannes Weiner wrote: > > > > > > +static struct shrinker workingset_shadow_shrinker = { > > > > > > + .count_objects = count_shadow_nodes, > > > > > > + .scan_objects = scan_shadow_nodes, > > > > > > + .seeks = DEFAULT_SEEKS * 4, > > > > > > + .flags = SHRINKER_NUMA_AWARE, > > > > > > +}; > > > > > > > > > > Can you add a comment explaining how you calculated the .seeks > > > > > value? It's important to document the weighings/importance > > > > > we give to slab reclaim so we can determine if it's actually > > > > > acheiving the desired balance under different loads... > > > > > > > > This is not an exact science, to say the least. > > > > > > I know, that's why I asked it be documented rather than be something > > > kept in your head. > > > > > > > The shadow entries are mostly self-regulated, so I don't want the > > > > shrinker to interfere while the machine is just regularly trimming > > > > caches during normal operation. > > > > > > > > It should only kick in when either a) reclaim is picking up and the > > > > scan-to-reclaim ratio increases due to mapped pages, dirty cache, > > > > swapping etc. or b) the number of objects compared to LRU pages > > > > becomes excessive. > > > > > > > > I think that is what most shrinkers with an elevated seeks value want, > > > > but this translates very awkwardly (and not completely) to the current > > > > cost model, and we should probably rework that interface. > > > > > > > > "Seeks" currently encodes 3 ratios: > > > > > > > > 1. the cost of creating an object vs. a page > > > > > > > > 2. the expected number of objects vs. 
pages > > > > > > It doesn't encode that at all. If it did, then the default value > > > wouldn't be "2". > > > > > > > 3. the cost of reclaiming an object vs. a page > > > > > > Which, when you consider #3 in conjunction with #1, the actual > > > intended meaning of .seeks is "the cost of replacing this object in > > > the cache compared to the cost of replacing a page cache page." > > > > But what it actually seems to do is translate scan rate from LRU pages > > to scan rate in another object pool. The actual replacement cost > > varies based on hotness of each set, an in-use object is more > > expensive to replace than a cold page and vice versa, the dentry and > > inode shrinkers reflect this by rotating hot objects and refusing to > > actually reclaim items while they are in active use. > > Right, but so does the page cache when the page referenced bit is > seen by the LRU scanner. That's a scanned page, so what is passed to > shrink_slab is a ratio of pages scanned vs pages eligible for > reclaim. IOWs, the fact that the slab caches rotate rather than > reclaim is irrelevant - what matters is the same proportional > pressure is applied to the slab cache that was applied to the page > cache Oh, but it does. You apply the same pressure to both, but the actual reclaim outcome depends on object valuation measures specific to each pool (e.g. recently referenced or not), whereas my shrinker takes sc->nr_to_scan objects and reclaims them without looking at their individual value, which varies just like the value of slab objects varies. I thought I could compensate for the lack of object valuation in the shadow shrinker by tweaking that fixed pressure factor between page cache and shadow entries, but I'm no longer convinced this can work. One thing that does affect the value of shadow entries is the overall health of the system, memory-wise, so reclaim efficiency would be one factor that affects individual object value, albeit a secondary one. 
The most obvious value factor is whether the shadow entries in a node are expired or not, but there are potentially 64 of them, potentially from different zones with different "inactive ages" atomic_t's, so that is fairly expensive to assess. > > So I am having a hard time deriving a meaningful value out of this > > definition for my usecase because I want to push back objects based on > > reclaim efficiency (scan rate vs. reclaim rate). The other shrinkers > > with non-standard seek settings reek of magic number as well, which > > suggests I am not alone with this. > > Right, which is exactly why I'm asking you to document it. I've got > no idea how other subsystems have come up with their magic numbers > because they are not documented, and so it's just about impossible > to determine what the author of the code really needed and hence the > best way to improve the interface is difficult to determine. > > > I wonder if we can come up with a better interface that allows both >
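To make the .seeks discussion concrete, here is a rough userspace sketch of how a seeks value scales the pressure handed to a shrinker. It assumes the proportional formula used by shrink_slab() in mm/vmscan.c of that era (4 * pages scanned / seeks, times the freeable object count, divided by LRU pages + 1); shrinker_scan_delta() is a hypothetical name and the input numbers are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DEFAULT_SEEKS 2

/*
 * Assumed model of the scan target computed for one shrinker:
 *   delta = (4 * nr_pages_scanned / seeks) * freeable / (lru_pages + 1)
 * A larger seeks value means fewer objects are scanned for the same amount
 * of page cache scanning.
 */
uint64_t shrinker_scan_delta(uint64_t nr_pages_scanned, uint64_t lru_pages,
			     uint64_t freeable, unsigned int seeks)
{
	uint64_t delta;

	assert(seeks > 0);
	delta = (4 * nr_pages_scanned) / seeks;
	delta *= freeable;
	return delta / (lru_pages + 1);
}
```

Under this model, DEFAULT_SEEKS * 4 yields a quarter of the scan target that DEFAULT_SEEKS would for the same inputs. It is a fixed ratio and takes no account of the value of individual objects, which is the limitation discussed in the thread.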
[PATCH] drivers: xen: deaggressive selfballoon driver
The current xen-selfballoon driver is too aggressive, which may cause the OOM killer to be triggered more often, e.g. this bug reported by James: https://lkml.org/lkml/2013/11/21/158 There are two main reasons: 1) The original goal_page didn't account for some pages used by kernel space, such as slab pages and pages used by device drivers. 2) The balloon driver may not give memory back to the guest OS fast enough when the workload suddenly acquires a lot of physical memory. In both cases, the guest OS will suffer from memory pressure and OOM may be triggered. The fix is to make the xen-selfballoon driver less aggressive by adding an extra 10% of total RAM pages to goal_page. It is more valuable to keep the guest system reliable and responsive than to balloon out these 10% of pages to Xen. Signed-off-by: Bob Liu --- drivers/xen/xen-selfballoon.c | 22 ++ 1 file changed, 22 insertions(+) diff --git a/drivers/xen/xen-selfballoon.c b/drivers/xen/xen-selfballoon.c index 21e18c1..745ad79 100644 --- a/drivers/xen/xen-selfballoon.c +++ b/drivers/xen/xen-selfballoon.c @@ -175,6 +175,7 @@ static void frontswap_selfshrink(void) #endif /* CONFIG_FRONTSWAP */ #define MB2PAGES(mb) ((mb) << (20 - PAGE_SHIFT)) +#define PAGES2MB(pages) ((pages) >> (20 - PAGE_SHIFT)) /* * Use current balloon size, the goal (vm_committed_as), and hysteresis @@ -525,6 +526,7 @@ EXPORT_SYMBOL(register_xen_selfballooning); int xen_selfballoon_init(bool use_selfballooning, bool use_frontswap_selfshrink) { bool enable = false; + unsigned long reserve_pages; if (!xen_domain()) return -ENODEV; @@ -549,6 +551,26 @@ int xen_selfballoon_init(bool use_selfballooning, bool use_frontswap_selfshrink) if (!enable) return -ENODEV; + /* +* Give selfballoon_reserved_mb a default value(10% of total ram pages) +* to make selfballoon not so aggressive. +* +* There are mainly two reasons: +* 1) The original goal_page didn't consider some pages used by kernel +*space, like slab pages and memory used by device drivers. 
+* +* 2) The balloon driver may not give back memory to guest OS fast +*enough when the workload suddenly aquries a lot of physical memory. +* +* In both cases, the guest OS will suffer from memory pressure and +* OOM killer may be triggered. +* By reserving extra 10% of total ram pages, we can keep the system +* much more reliably and response faster in some cases. +*/ + if (!selfballoon_reserved_mb) { + reserve_pages = totalram_pages / 10; + selfballoon_reserved_mb = PAGES2MB(reserve_pages); + } schedule_delayed_work(&selfballoon_worker, selfballoon_interval * HZ); return 0; -- 1.7.10.4
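The default-reserve arithmetic in the patch can be checked with a small userspace sketch. It assumes 4 KiB pages (PAGE_SHIFT of 12); default_reserved_mb() is a hypothetical helper that mirrors the "only fill in a default when nothing was configured" logic, not a function from the driver.

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define MB2PAGES(mb)	((mb) << (20 - PAGE_SHIFT))
#define PAGES2MB(pages)	((pages) >> (20 - PAGE_SHIFT))

/*
 * Return the reserve in MB: keep a non-zero configured value as it is,
 * otherwise default to 10% of total RAM, rounded down to whole MB.
 */
unsigned long default_reserved_mb(unsigned long totalram_pages,
				  unsigned long reserved_mb)
{
	assert(totalram_pages > 0);
	if (!reserved_mb)
		reserved_mb = PAGES2MB(totalram_pages / 10);
	return reserved_mb;
}
```

For a guest with 1048576 pages (4 GiB at 4 KiB per page), 10% is 104857 pages, which the shift rounds down to 409 MB. An explicitly configured value is left untouched.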
[PATCHv2 2/2] of: fix of_update_property()
of_update_property() is intended to update a property in a node and, if the property does not exist, to add it. The second search for the property may fail to find it, because another thread may have removed it just before the second search began. Use __of_find_property() and __of_add_property() instead, and move them inside the locked region. Signed-off-by: Xiubo Li Cc: Pantelis Antoniou --- drivers/of/base.c | 36 ++-- 1 file changed, 14 insertions(+), 22 deletions(-) diff --git a/drivers/of/base.c b/drivers/of/base.c index b86b77a..458072d 100644 --- a/drivers/of/base.c +++ b/drivers/of/base.c @@ -1573,7 +1573,7 @@ int of_update_property(struct device_node *np, struct property *newprop) { struct property **next, *oldprop; unsigned long flags; - int rc, found = 0; + int rc = 0; rc = of_property_notify(OF_RECONFIG_UPDATE_PROPERTY, np, newprop); if (rc) @@ -1582,36 +1582,28 @@ int of_update_property(struct device_node *np, struct property *newprop) if (!newprop->name) return -EINVAL; - oldprop = of_find_property(np, newprop->name, NULL); - if (!oldprop) - return of_add_property(np, newprop); - raw_spin_lock_irqsave(&devtree_lock, flags); - next = &np->properties; - while (*next) { - if (*next == oldprop) { - /* found the node */ - newprop->next = oldprop->next; - *next = newprop; - oldprop->next = np->deadprops; - np->deadprops = oldprop; - found = 1; - break; - } - next = &(*next)->next; + oldprop = __of_find_property(np, newprop->name, NULL); + if (!oldprop) { + /* add the node */ + rc = __of_add_property(np, newprop); + } else { + /* replace the node */ + next = &oldprop; + newprop->next = oldprop->next; + *next = newprop; + oldprop->next = np->deadprops; + np->deadprops = oldprop; } raw_spin_unlock_irqrestore(&devtree_lock, flags); - if (!found) - return -ENODEV; - #ifdef CONFIG_PROC_DEVICETREE /* try to add to proc as well if it was initialized */ - if (np->pde) + if (!rc && np->pde) proc_device_tree_update_prop(np->pde, newprop, oldprop); #endif /* 
CONFIG_PROC_DEVICETREE */ - return 0; + return rc; } #if defined(CONFIG_OF_DYNAMIC) -- 1.8.4
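A minimal userspace sketch of the "replace if present, otherwise add" operation on a singly linked property list follows. struct prop, struct node and update_prop() are simplified, hypothetical stand-ins for the kernel types and are not the patch itself; in the kernel the whole walk would run under devtree_lock, which is omitted here. One detail worth noting in any such replacement is that the pointer being overwritten has to be the link stored in the list (here *next, where next walks from &np->properties), since writing through the address of a local variable would leave the list unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct prop {
	const char *name;
	struct prop *next;
};

struct node {
	struct prop *properties;	/* live properties */
	struct prop *deadprops;		/* replaced properties, kept around */
};

/*
 * Replace the property with the same name, or append newprop if none
 * exists. Returns the replaced property, or NULL if newprop was appended.
 */
struct prop *update_prop(struct node *np, struct prop *newprop)
{
	struct prop **next;

	assert(np != NULL && newprop != NULL);
	next = &np->properties;
	while (*next) {
		if (strcmp((*next)->name, newprop->name) == 0) {
			struct prop *oldprop = *next;

			/* splice newprop into the list in place of oldprop */
			newprop->next = oldprop->next;
			*next = newprop;
			/* park oldprop on the dead list */
			oldprop->next = np->deadprops;
			np->deadprops = oldprop;
			return oldprop;
		}
		next = &(*next)->next;
	}
	/* not found: append at the tail */
	newprop->next = NULL;
	*next = newprop;
	return NULL;
}
```

Because lookup and modification happen in a single walk, there is no window between "find" and "replace" in which another thread could remove the property, provided the caller holds the lock for the whole call.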
[PATCHv2 1/2] of: add __of_add_property() without lock operations
Two places use the same code for adding a new property to a DT node. Add __of_add_property() in preparation for fixing the bug in of_update_property(). Signed-off-by: Xiubo Li --- drivers/of/base.c | 38 -- 1 file changed, 24 insertions(+), 14 deletions(-) diff --git a/drivers/of/base.c b/drivers/of/base.c index f807d0e..b86b77a 100644 --- a/drivers/of/base.c +++ b/drivers/of/base.c @@ -1469,11 +1469,31 @@ static int of_property_notify(int action, struct device_node *np, #endif /** + * __of_add_property - Add a property to a node without lock operations + */ +static int __of_add_property(struct device_node *np, struct property *prop) +{ + struct property **next; + + prop->next = NULL; + next = &np->properties; + while (*next) { + if (strcmp(prop->name, (*next)->name) == 0) + /* duplicate ! don't insert it */ + return -EEXIST; + + next = &(*next)->next; + } + *next = prop; + + return 0; +} + +/** * of_add_property - Add a property to a node */ int of_add_property(struct device_node *np, struct property *prop) { - struct property **next; unsigned long flags; int rc; @@ -1481,27 +1501,17 @@ int of_add_property(struct device_node *np, struct property *prop) if (rc) return rc; - prop->next = NULL; raw_spin_lock_irqsave(&devtree_lock, flags); - next = &np->properties; - while (*next) { - if (strcmp(prop->name, (*next)->name) == 0) { - /* duplicate ! 
don't insert it */ - raw_spin_unlock_irqrestore(&devtree_lock, flags); - return -1; - } - next = &(*next)->next; - } - *next = prop; + rc = __of_add_property(np, prop); raw_spin_unlock_irqrestore(&devtree_lock, flags); #ifdef CONFIG_PROC_DEVICETREE /* try to add to proc as well if it was initialized */ - if (np->pde) + if (!rc && np->pde) proc_device_tree_add_prop(np->pde, prop); #endif /* CONFIG_PROC_DEVICETREE */ - return 0; + return rc; } /** -- 1.8.4
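The split between a lock-free helper and a locking wrapper can be sketched in userspace as below. struct prop and add_prop_unlocked() are hypothetical simplified stand-ins for struct property and __of_add_property(); the caller is assumed to hold whatever lock protects the list, so the helper itself never takes or drops one and can simply return an error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct prop {
	const char *name;
	struct prop *next;
};

/*
 * Append prop to the list at *head unless a property with the same name
 * already exists. Returns 0 on success or -EEXIST on a duplicate.
 * The caller must hold the lock that protects the list.
 */
int add_prop_unlocked(struct prop **head, struct prop *prop)
{
	struct prop **next;

	assert(head != NULL && prop != NULL);
	prop->next = NULL;
	next = head;
	while (*next) {
		if (strcmp(prop->name, (*next)->name) == 0)
			return -EEXIST;	/* duplicate: don't insert it */
		next = &(*next)->next;
	}
	*next = prop;
	return 0;
}
```

Returning the error from the helper lets a wrapper unlock in exactly one place and then pass the same code on to its caller, which is the shape of the `rc = ...; unlock; return rc;` sequence in the diff.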
[PATCH 1/2] net/ipv4: queue work on power efficient wq
The workqueues used in the ipv4 layer have no real dependency on running on the CPU that scheduled them. On an idle system, an idle CPU is observed to wake up many times just to service this work. It would be better to schedule it on a CPU which the scheduler believes to be the most appropriate one. This patch replaces the normal workqueues with their power-efficient versions. It doesn't change the existing behavior of the code unless CONFIG_WQ_POWER_EFFICIENT is enabled. Signed-off-by: Viresh Kumar --- Initial support for power-efficient workqueues was added here: https://lkml.org/lkml/2013/4/24/215 net/ipv4/devinet.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index 646023b..ac2dff3 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -474,7 +474,7 @@ static int __inet_insert_ifa(struct in_ifaddr *ifa, struct nlmsghdr *nlh, inet_hash_insert(dev_net(in_dev->dev), ifa); cancel_delayed_work(&check_lifetime_work); - schedule_delayed_work(&check_lifetime_work, 0); + queue_delayed_work(system_power_efficient_wq, &check_lifetime_work, 0); /* Send message first, then call notifier. 
Notifier will trigger FIB update, so that @@ -684,7 +684,8 @@ static void check_lifetime(struct work_struct *work) if (time_before(next_sched, now + ADDRCONF_TIMER_FUZZ_MAX)) next_sched = now + ADDRCONF_TIMER_FUZZ_MAX; - schedule_delayed_work(&check_lifetime_work, next_sched - now); + queue_delayed_work(system_power_efficient_wq, &check_lifetime_work, + next_sched - now); } static void set_ifa_lifetime(struct in_ifaddr *ifa, __u32 valid_lft, @@ -842,7 +843,8 @@ static int inet_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh) ifa = ifa_existing; set_ifa_lifetime(ifa, valid_lft, prefered_lft); cancel_delayed_work(&check_lifetime_work); - schedule_delayed_work(&check_lifetime_work, 0); + queue_delayed_work(system_power_efficient_wq, + &check_lifetime_work, 0); rtmsg_ifa(RTM_NEWADDR, ifa, nlh, NETLINK_CB(skb).portid); blocking_notifier_call_chain(&inetaddr_chain, NETDEV_UP, ifa); } @@ -2322,7 +2324,7 @@ void __init devinet_init(void) register_gifconf(PF_INET, inet_gifconf); register_netdevice_notifier(&ip_netdev_notifier); - schedule_delayed_work(&check_lifetime_work, 0); + queue_delayed_work(system_power_efficient_wq, &check_lifetime_work, 0); rtnl_af_register(&inet_af_ops); -- 1.7.12.rc2.18.g61b472e
[PATCH 2/2] net/neighbour: queue work on power efficient wq
The workqueues used in the neighbour layer have no real dependency on running on the CPU that scheduled them. On an idle system, an idle CPU is observed to wake up many times just to service this work. It would be better to schedule it on a CPU which the scheduler believes to be the most appropriate one. This patch replaces the normal workqueues with their power-efficient versions. It doesn't change the existing behavior of the code unless CONFIG_WQ_POWER_EFFICIENT is enabled. Signed-off-by: Viresh Kumar --- net/core/neighbour.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/net/core/neighbour.c b/net/core/neighbour.c index f8012fe..b9e9e0d 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -828,7 +828,7 @@ out: * ARP entry timeouts range from 1/2 BASE_REACHABLE_TIME to 3/2 * BASE_REACHABLE_TIME. */ - schedule_delayed_work(&tbl->gc_work, + queue_delayed_work(system_power_efficient_wq, &tbl->gc_work, NEIGH_VAR(&tbl->parms, BASE_REACHABLE_TIME) >> 1); write_unlock_bh(&tbl->lock); } @@ -1565,7 +1565,8 @@ static void neigh_table_init_no_netlink(struct neigh_table *tbl) rwlock_init(&tbl->lock); INIT_DEFERRABLE_WORK(&tbl->gc_work, neigh_periodic_work); - schedule_delayed_work(&tbl->gc_work, tbl->parms.reachable_time); + queue_delayed_work(system_power_efficient_wq, &tbl->gc_work, + tbl->parms.reachable_time); setup_timer(&tbl->proxy_timer, neigh_proxy_process, (unsigned long)tbl); skb_queue_head_init_class(&tbl->proxy_queue, &neigh_table_proxy_queue_class); -- 1.7.12.rc2.18.g61b472e
Re: [PATCH] cpufreq: Align all CPUs to the same frequency if using shared clock
On 22 January 2014 11:56, Li, Zhuangzhi wrote: > I don't think it's a real bug in bootloader, the bootloader can set CPUs to > different frequencies according to actually requirements(Power saving first > or Performance first), > the CPUs freq policy are initialized in kernel, if the kernel want to share > one CPU policy(using CPUFREQ_SHARED_TYPE_ALL type), it should ensure all CPUs > frequencies aligned first, > don't depend on the bootloader CPUs Pre-states, then the kernel can have > better compatibility. > > If the kernel uses CPUFREQ_SHARED_TYPE_ALL policy, the patch can ensure these: > 1. If all CPUs are in the same P-state, it does nothing when cpufreq > registering > 2. If the CPUs are in different P-states, all the other CPUs are aligned once > to current frequency of CPU0 according to the present policy. I thought, as you are asking kernel to keep same freq on all of them, then same should be true for bootloaders. Otherwise it was okay.
Re: [PATCH] drivers/hid/wacom: fixed coding style issues
On Tue, Jan 21, 2014 at 11:42:03PM +0100, Rob Schroer wrote: > On Tue, Jan 21, 2014 at 01:25:54PM -0800, Joe Perches wrote: > > On Tue, 2014-01-21 at 13:18 -0800, Dmitry Torokhov wrote: > > > On Tue, Jan 21, 2014 at 09:29:44PM +0100, Rob Schroer wrote: > > > > As far as I can see, kstrtoXXX() might be an alternative, but I was just > > > > fixing coding style issues, no need to break anything IMO. > > > > > > You could do the breaking in a follow up patch ;) > > > > Yes please. > > > > Include the breaking of multiple statements > > into multiple lines too please like > > > > from: > > case USB_DEVICE_ID_WACOM_GRAPHIRE_BLUETOOTH: > > rep_data[0] = 0x03; rep_data[1] = 0x00; > > > > to: > > case USB_DEVICE_ID_WACOM_GRAPHIRE_BLUETOOTH: > > rep_data[0] = 0x03; > > rep_data[1] = 0x00; > > > > > > Added a cosmetical linebreak, switched an occurence of sscanf to kstrtoint. > > Signed-off-by: Robin Schroer > --- > drivers/hid/hid-wacom.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/hid/hid-wacom.c b/drivers/hid/hid-wacom.c > index ebcca0d..5daf80c 100644 > --- a/drivers/hid/hid-wacom.c > +++ b/drivers/hid/hid-wacom.c > @@ -336,7 +336,8 @@ static void wacom_set_features(struct hid_device *hdev, > u8 speed) > > switch (hdev->product) { > case USB_DEVICE_ID_WACOM_GRAPHIRE_BLUETOOTH: > - rep_data[0] = 0x03; rep_data[1] = 0x00; > + rep_data[0] = 0x03; > + rep_data[1] = 0x00; > limit = 3; > do { > ret = hdev->hid_output_raw_report(hdev, rep_data, 2, > @@ -404,7 +405,7 @@ static ssize_t wacom_store_speed(struct device *dev, > struct hid_device *hdev = container_of(dev, struct hid_device, dev); > int new_speed; > > - if (sscanf(buf, "%1d", &new_speed) != 1) > + if (kstrtoint(buf, 10, &new_speed)) > return -EINVAL; I think this should be error = kstrtoint(buf, 10, &new_speed); if (error) return error; > > if (new_speed == 0 || new_speed == 1) { > -- > 1.8.4.2 > > Well, I hope this works as intended. 
> > -- > Robin -- Dmitry
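The pattern Dmitry suggests, returning the parser's own error code instead of collapsing everything to -EINVAL, can be sketched in userspace. parse_int() below is a hypothetical stand-in built on strtol() that approximates kstrtoint() (whole string must be a number, one trailing newline allowed, -ERANGE on overflow); unlike kstrtoint(), strtol() also skips leading whitespace. store_speed() is likewise hypothetical, and rejecting values other than 0 and 1 with -EINVAL is an assumption about the surrounding driver code.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Userspace approximation of kstrtoint(): 0 on success, -errno on failure. */
int parse_int(const char *s, int base, int *res)
{
	char *end;
	long val;

	assert(s != NULL && res != NULL);
	errno = 0;
	val = strtol(s, &end, base);
	if (end == s)
		return -EINVAL;		/* no digits at all */
	if (*end == '\n')
		end++;			/* tolerate one trailing newline */
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */
	if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
		return -ERANGE;
	*res = (int)val;
	return 0;
}

/* Store handler shape: propagate the parser's error, then validate. */
int store_speed(const char *buf, int *speed)
{
	int new_speed;
	int error;

	error = parse_int(buf, 10, &new_speed);
	if (error)
		return error;
	if (new_speed != 0 && new_speed != 1)
		return -EINVAL;
	*speed = new_speed;
	return 0;
}
```

Propagating the code lets the caller tell "not a number" apart from "number out of range", which a blanket -EINVAL would hide.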
fanotify use after free.
Jan, since yesterdays changes, on boot I see a flood of messages from slub debug during boot.. = BUG fanotify_event_info (Not tainted): Poison overwritten - Disabling lock debugging due to kernel taint INFO: 0x880247e45bc8-0x880247e45bcb. First byte 0x0 instead of 0x6b INFO: Allocated in fanotify_handle_event+0x136/0x390 age=0 cpu=0 pid=293 __slab_alloc+0x456/0x565 kmem_cache_alloc+0x1fe/0x260 fanotify_handle_event+0x136/0x390 send_to_group+0xd3/0x1c0 fsnotify+0x1c8/0x340 open_exec+0xe2/0x120 load_elf_binary+0x7b7/0x18e0 search_binary_handler+0x94/0x1b0 do_execve_common.isra.26+0x5d7/0x7d0 SyS_execve+0x36/0x50 stub_execve+0x69/0xa0 INFO: Freed in fanotify_free_event+0x2e/0x40 age=0 cpu=3 pid=290 __slab_free+0x4a/0x382 kmem_cache_free+0x1c9/0x210 fanotify_free_event+0x2e/0x40 fsnotify_destroy_event+0x21/0x30 fanotify_read+0x39e/0x5e0 vfs_read+0x9b/0x160 SyS_read+0x58/0xb0 tracesys+0xdd/0xe2 INFO: Slab 0xea00091f9100 objects=20 used=20 fp=0x (null) flags=0x204080 INFO: Object 0x880247e45b90 @offset=7056 fp=0x880247e44000 Bytes b4 880247e45b80: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a Object 880247e45b90: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b Object 880247e45ba0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b Object 880247e45bb0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b Object 880247e45bc0: 6b 6b 6b 6b 6b 6b 6b 6b 00 00 00 00 6b 6b 6b a5 kkk. Redzone 880247e45bd0: bb bb bb bb bb bb bb bb Padding 880247e45d10: 5a 5a 5a 5a 5a 5a 5a 5a CPU: 0 PID: 293 Comm: mount Tainted: GB3.13.0+ #28 880247e45b90 8c7fe87c 8800874cbb28 9c710632 88024a776ac0 8800874cbb68 9c194dad 0008 88020001 880247e45bcc 88024a776ac0 006b Call Trace: [] dump_stack+0x4e/0x7a [] print_trailer+0x14d/0x200 [] check_bytes_and_report+0xcf/0x110 [] check_object+0x1d7/0x250 [] ? fanotify_handle_event+0x136/0x390 [] alloc_debug_processing+0x76/0x118 [] __slab_alloc+0x456/0x565 [] ? fanotify_handle_event+0x136/0x390 [] ? mntput+0x24/0x40 [] ? terminate_walk+0x69/0x70 [] ? 
do_last+0x25e/0x1390 [] ? inode_permission+0x18/0x50 [] ? fanotify_handle_event+0x136/0x390 [] kmem_cache_alloc+0x1fe/0x260 [] fanotify_handle_event+0x136/0x390 [] ? path_openat+0xcd/0x6a0 [] send_to_group+0xd3/0x1c0 [] ? fsnotify+0x8f/0x340 [] fsnotify+0x1c8/0x340 [] do_sys_open+0x19f/0x230 [] SyS_open+0x1e/0x20 [] tracesys+0xdd/0xe2 FIX fanotify_event_info: Restoring 0x880247e45bc8-0x880247e45bcb=0x6b
RE: [PATCH] cpufreq: Align all CPUs to the same frequency if using shared clock
> -Original Message- > From: Viresh Kumar [mailto:viresh.ku...@linaro.org] > Sent: Wednesday, January 22, 2014 1:18 PM > To: Li, Zhuangzhi > Cc: Rafael J. Wysocki; cpuf...@vger.kernel.org; linux...@vger.kernel.org; > Linux Kernel Mailing List; Liu, Chuansheng > Subject: Re: [PATCH] cpufreq: Align all CPUs to the same frequency if using > shared clock > > On 21 January 2014 13:42, Viresh Kumar wrote: > > On 21 January 2014 12:56, Li, Zhuangzhi wrote: > >> Thanks for reviewing. > > > > Its my job :) > > > >> Sorry for make you misunderstanding, on our x86 platform, we want all the > CPUs share one policy by setting CPUFREQ_SHARED_TYPE_ALL, not share one > HW clock line. > > > > I see.. Then probably your patch makes sense. But it is obviously not > > required for every platform that exists today. > > > > Please update it to do it only for drivers that have set > > CPUFREQ_SHARED_TYPE_ALL.. > > One more thing, who has set different frequencies to these cores? > I hope kernel hasn't ? > > In that case, probably you are fixing a bootloader bug in kernel? > What about doing this in bootloader then? I don't think it's a real bug in bootloader, the bootloader can set CPUs to different frequencies according to actually requirements(Power saving first or Performance first), the CPUs freq policy are initialized in kernel, if the kernel want to share one CPU policy(using CPUFREQ_SHARED_TYPE_ALL type), it should ensure all CPUs frequencies aligned first, don't depend on the bootloader CPUs Pre-states, then the kernel can have better compatibility. If the kernel uses CPUFREQ_SHARED_TYPE_ALL policy, the patch can ensure these: 1. If all CPUs are in the same P-state, it does nothing when cpufreq registering 2. If the CPUs are in different P-states, all the other CPUs are aligned once to current frequency of CPU0 according to the present policy. 
> > -- > virehs
Re: [PATCH] ext4: explain encoding of 34-bit a,c,mtime values
On Mon, Nov 11, 2013 at 07:30:18PM -0500, Theodore Ts'o wrote: > On Sun, Nov 10, 2013 at 02:56:54AM -0500, David Turner wrote: > > b. Use Andreas's encoding, which is incompatible with pre-1970 files > > written on 64-bit systems. > > > > I don't care about currently-existing post-2038 files, because I believe > > that nobody has a valid reason to have such files. However, I do > > believe that pre-1970 files are probably important to someone. > > > > Despite this, I prefer option (b), because I think the simplicity is > > valuable, and because I hate to give up date ranges (even ones that I > > think we'll "never" need). Option (b) is not actually lossy, because we > > could correct pre-1970 files with e2fsck; under Andreas's encoding, > > their dates would be in the far future (and thus cannot be legitimate). > > > > Would a patch that does (b) be accepted? I would accompany it with a > > patch to e2fsck (which I assume would also go to the ext4 developers > > mailing list?). > > I agree, I think this is the best way to go. I'm going to drop your > earlier patch, and wait for an updated patch from you. It may miss > this merge window, but as Andreas has pointed out, we still have a few > years to get this right. :-) Just out of curiosity, did this (updated patch) ever happen? --D > > Thanks!! > > - Ted
Re: [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable
On 01/22/2014 02:22 AM, David Rientjes wrote: > On Fri, 17 Jan 2014, Vladimir Davydov wrote: > >> The name `max_pass' is misleading, because this variable actually keeps >> the estimate number of freeable objects, not the maximal number of >> objects we can scan in this pass, which can be twice that. Rename it to >> reflect its actual meaning. >> >> Signed-off-by: Vladimir Davydov >> Cc: Andrew Morton >> Cc: Mel Gorman >> Cc: Michal Hocko >> Cc: Johannes Weiner >> Cc: Rik van Riel >> Cc: Dave Chinner >> Cc: Glauber Costa > This doesn't compile on linux-next: > > mm/vmscan.c: In function ‘shrink_slab_node’: > mm/vmscan.c:300:23: error: ‘max_pass’ undeclared (first use in this function) > mm/vmscan.c:300:23: note: each undeclared identifier is reported only once > for each function it appears in > > because of b01fa2357bca ("mm: vmscan: shrink all slab objects if tight on > memory") from an author with a name remarkably similar to yours. Oh, sorry. I thought it hadn't been committed there yet. > Could you rebase this series on top of your previous work that is already in > -mm? Sure. Thanks.
Re: [PATCH 17/24] GFS2: Use RCU/hlist_bl based hash for quotas
On 01/22/2014 12:32 AM, Paul E. McKenney wrote: On Mon, Jan 20, 2014 at 12:23:40PM +, Steven Whitehouse wrote: >Prior to this patch, GFS2 kept all the quotas for each >super block in a single linked list. This is rather slow >when there are large numbers of quotas. > >This patch introduces a hlist_bl based hash table, similar >to the one used for glocks. The initial look up of the quota >is now lockless in the case where it is already cached, >although we still have to take the per quota spinlock in >order to bump the ref count. Either way though, this is a >big improvement on what was there before. > >The qd_lock and the per super block list is preserved, for >the time being. However it is intended that since this is no >longer used for its original role, it should be possible to >shrink the number of items on that list in due course and >remove the requirement to take qd_lock in qd_get. > >Signed-off-by: Steven Whitehouse >Cc: Abhijith Das >Cc: Paul E. McKenney Interesting! I thought that Sasha Levin had a hash table in the works, but I don't see it, so CCing him. Indeed, there is a hlist based hashtable at include/linux/hashtable.h for couple kernel versions now. However, there's no hlist_bl one. If there is a plan on adding a hlist_bl hashtable for whatever reason, it should probably be done by expanding hashtable.h so that more places that use hlist_bl would benefit from it (yes, there are couple more places that do hlist_bl hashtable). Also, do we really want to use hlist_bl here? It doesn't seem like it's being done to conserve on memory, and that's the only reason it should be used for. Doing a single spinlock per bucket is much more efficient than using the bit locking scheme that hlist_bl does. Thanks, Sasha
Re: [RFC] restore user defined min_free_kbytes when disabling thp
On Tue, Jan 21, 2014 at 10:23:51AM +, Mel Gorman wrote:
> On Tue, Jan 21, 2014 at 05:38:59PM +0800, Han Pingtian wrote:
> > The testcase 'thp04' of LTP will enable THP, do some testing, then
> > disable it if it wasn't enabled. But this will leave a different value
> > of min_free_kbytes if it has been set by admin. So I think it's better
> > to restore the user defined value after disabling THP.
> >
>
> Then have LTP record what min_free_kbytes was at the same time THP was
> enabled by the test and restore both settings. It leaves a window where
> an admin can set an alternative value during the test but that would also
> invalidate the test in same cases and gets filed under "don't do that".
>

Since the value is changed in the kernel, it would be better to restore it
in the kernel, right? :) I have a v2 patch which restores the value only
if it hasn't been set again by the user after THP's initialization. This v2
patch depends on the patch 'mm: show message when updating min_free_kbytes
in thp', which has been added to the -mm tree and can be found here:

http://ozlabs.org/~akpm/mmotm/broken-out/mm-show-message-when-updating-min_free_kbytes-in-thp.patch

Please have a look. Thanks.

>From 8b79586ff9a1d85cbe45102a86888268094ec0ae Mon Sep 17 00:00:00 2001
From: Han Pingtian
Date: Tue, 21 Jan 2014 17:24:43 +0800
Subject: [PATCH] mm: restore user defined min_free_kbytes when disabling thp

thp increases the value of min_free_kbytes during initialization. This
sometimes changes the user defined value of min_free_kbytes. So try to
restore the value when disabling thp if the value has been changed by thp
initialization and hasn't been changed by the user after that.

Signed-off-by: Han Pingtian
---
 mm/huge_memory.c | 10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 94a824f..fcb8ce58 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -164,6 +164,16 @@ static int start_khugepaged(void)
 	} else if (khugepaged_thread) {
 		kthread_stop(khugepaged_thread);
 		khugepaged_thread = NULL;
+
+		if (user_min_free_kbytes >= 0 &&
+		    user_min_free_kbytes != min_free_kbytes) {
+			pr_info("restore min_free_kbytes from %d to user "
+				"defined %d when stopping khugepaged\n",
+				min_free_kbytes, user_min_free_kbytes);
+
+			min_free_kbytes = user_min_free_kbytes;
+			setup_per_zone_wmarks();
+		}
 	}

 	return err;
--
1.7.7.6
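The restore condition in the patch above can be modelled in userspace as follows. This is only a sketch under stated assumptions: the struct and the admin_set()/thp_enable()/thp_disable() helpers are hypothetical, and the model assumes user_min_free_kbytes stays negative until the admin writes the sysctl, which the patch's ">= 0" test suggests but this thread does not spell out.

```c
#include <stdbool.h>

/* Hypothetical userspace model; names mirror the patch, not kernel code. */
struct vm_state {
	int min_free_kbytes;      /* current effective value */
	int user_min_free_kbytes; /* assumed negative until the admin sets it */
};

/* Admin writes the sysctl: both values follow the write. */
static void admin_set(struct vm_state *s, int kbytes)
{
	s->min_free_kbytes = kbytes;
	s->user_min_free_kbytes = kbytes;
}

/* THP init may raise the effective value but leaves the user's record alone. */
static void thp_enable(struct vm_state *s, int recommended)
{
	if (recommended > s->min_free_kbytes)
		s->min_free_kbytes = recommended;
}

/* Mirrors the condition in the patch; returns true if a restore happened. */
static bool thp_disable(struct vm_state *s)
{
	if (s->user_min_free_kbytes >= 0 &&
	    s->user_min_free_kbytes != s->min_free_kbytes) {
		s->min_free_kbytes = s->user_min_free_kbytes;
		return true;
	}
	return false;
}
```

In this model, disabling THP restores the value only when the admin had set one; without a user-defined value the raised setting is left in place.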
[PATCH Resend] PM: Remove unnecessary !!
Double ! or !! are normally required to get 0 or 1 out of an expression. A
comparison always returns 0 or 1 and hence there is no need to apply double !
over it again.

Signed-off-by: Viresh Kumar
---
 kernel/power/suspend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 62ee437..90b3d93 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -39,7 +39,7 @@ static const struct platform_suspend_ops *suspend_ops;

 static bool need_suspend_ops(suspend_state_t state)
 {
-	return !!(state > PM_SUSPEND_FREEZE);
+	return state > PM_SUSPEND_FREEZE;
 }

 static DECLARE_WAIT_QUEUE_HEAD(suspend_freeze_wait_head);
--
1.7.12.rc2.18.g61b472e
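To illustrate the reasoning: in C a relational operator already yields an int that is exactly 0 or 1, so wrapping it in !! changes nothing, whereas !! still matters for values such as a masked bit. A small userspace sketch (the enum values and helper names are illustrative stand-ins, not the kernel's definitions):

```c
#include <stdbool.h>

/* Illustrative stand-ins for the kernel's suspend states (assumed values). */
enum { PM_SUSPEND_ON = 0, PM_SUSPEND_FREEZE = 1, PM_SUSPEND_MEM = 3 };

/* Old form: double negation applied to a value that is already 0 or 1. */
static bool need_suspend_ops_old(int state)
{
	return !!(state > PM_SUSPEND_FREEZE);
}

/* New form: the comparison result is used directly. */
static bool need_suspend_ops_new(int state)
{
	return state > PM_SUSPEND_FREEZE;
}

/* By contrast, !! is still useful for a non-comparison such as a bit mask,
 * where the raw result may be any nonzero value. */
static int flag_is_set(unsigned int flags, unsigned int mask)
{
	return !!(flags & mask);
}
```

Both forms of need_suspend_ops() return the same value for every input; only the masked-bit helper depends on the double negation.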
Re: [PATCH v4 2/2] usb: dwc3: adapt dwc3 core to use Generic PHY Framework
Hi, On Tue, Jan 21, 2014 at 7:30 PM, Roger Quadros wrote: > Hi Kishon, > > On 01/21/2014 12:11 PM, Kishon Vijay Abraham I wrote: >> Adapted dwc3 core to use the Generic PHY Framework. So for init, exit, >> power_on and power_off the following APIs are used phy_init(), phy_exit(), >> phy_power_on() and phy_power_off(). >> >> However using the old USB phy library wont be removed till the PHYs of all >> other SoC's using dwc3 core is adapted to the Generic PHY Framework. >> >> Signed-off-by: Kishon Vijay Abraham I >> --- >> Changes from v3: >> * avoided using quirks >> >> Documentation/devicetree/bindings/usb/dwc3.txt |6 ++- >> drivers/usb/dwc3/core.c| 60 >> >> drivers/usb/dwc3/core.h|7 +++ >> 3 files changed, 71 insertions(+), 2 deletions(-) >> >> diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt >> b/Documentation/devicetree/bindings/usb/dwc3.txt >> index e807635..471366d 100644 >> --- a/Documentation/devicetree/bindings/usb/dwc3.txt >> +++ b/Documentation/devicetree/bindings/usb/dwc3.txt >> @@ -6,11 +6,13 @@ Required properties: >> - compatible: must be "snps,dwc3" >> - reg : Address and length of the register set for the device >> - interrupts: Interrupts used by the dwc3 controller. >> + >> +Optional properties: >> - usb-phy : array of phandle for the PHY device. The first element >> in the array is expected to be a handle to the USB2/HS PHY and >> the second element is expected to be a handle to the USB3/SS PHY >> - >> -Optional properties: >> + - phys: from the *Generic PHY* bindings >> + - phy-names: from the *Generic PHY* bindings >> - tx-fifo-resize: determines if the FIFO *has* to be reallocated. >> >> This is usually a subnode to DWC3 glue to which it is connected. 
>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c >> index e009d4e..036d589 100644 >> --- a/drivers/usb/dwc3/core.c >> +++ b/drivers/usb/dwc3/core.c >> @@ -82,6 +82,11 @@ static void dwc3_core_soft_reset(struct dwc3 *dwc) >> >> usb_phy_init(dwc->usb2_phy); >> usb_phy_init(dwc->usb3_phy); >> + if (dwc->usb2_generic_phy) >> + phy_init(dwc->usb2_generic_phy); > > What if phy_init() fails? You need to report and fail. Same applies for all > PHY apis in this patch. > >> + if (dwc->usb3_generic_phy) >> + phy_init(dwc->usb3_generic_phy); >> + >> mdelay(100); >> >> /* Clear USB3 PHY reset */ >> @@ -343,6 +348,11 @@ static void dwc3_core_exit(struct dwc3 *dwc) >> { >> usb_phy_shutdown(dwc->usb2_phy); >> usb_phy_shutdown(dwc->usb3_phy); >> + if (dwc->usb2_generic_phy) >> + phy_exit(dwc->usb2_generic_phy); >> + if (dwc->usb3_generic_phy) >> + phy_exit(dwc->usb3_generic_phy); >> + >> } >> >> #define DWC3_ALIGN_MASK (16 - 1) >> @@ -433,6 +443,32 @@ static int dwc3_probe(struct platform_device *pdev) >> } >> } >> >> + dwc->usb2_generic_phy = devm_phy_get(dev, "usb2-phy"); >> + if (IS_ERR(dwc->usb2_generic_phy)) { >> + ret = PTR_ERR(dwc->usb2_generic_phy); >> + if (ret == -ENOSYS || ret == -ENODEV) { >> + dwc->usb2_generic_phy = NULL; >> + } else if (ret == -EPROBE_DEFER) { >> + return ret; >> + } else { >> + dev_err(dev, "no usb2 phy configured\n"); >> + return ret; >> + } >> + } >> + >> + dwc->usb3_generic_phy = devm_phy_get(dev, "usb3-phy"); >> + if (IS_ERR(dwc->usb3_generic_phy)) { >> + ret = PTR_ERR(dwc->usb3_generic_phy); >> + if (ret == -ENOSYS || ret == -ENODEV) { >> + dwc->usb3_generic_phy = NULL; >> + } else if (ret == -EPROBE_DEFER) { >> + return ret; >> + } else { >> + dev_err(dev, "no usb3 phy configured\n"); >> + return ret; >> + } >> + } >> + >> dwc->xhci_resources[0].start = res->start; >> dwc->xhci_resources[0].end = dwc->xhci_resources[0].start + >> DWC3_XHCI_REGS_END; >> @@ -482,6 +518,11 @@ static int dwc3_probe(struct platform_device *pdev) >> 
usb_phy_set_suspend(dwc->usb2_phy, 0); >> usb_phy_set_suspend(dwc->usb3_phy, 0); >> >> + if (dwc->usb2_generic_phy) >> + phy_power_on(dwc->usb2_generic_phy); >> + if (dwc->usb3_generic_phy) >> + phy_power_on(dwc->usb3_generic_phy); >> + > > Is it OK to power on the phy before phy_init()? Isn't phy_init() being done before phy_power_on() in the core_soft_reset() in this patch ? Isn't that what you want here ? > > I suggest to move phy_init() from core_soft_reset() to here, just before > phy_power_on(). core_soft_reset() is called before phy_power_on() itself from dwc3_core_init(), right ? will moving the phy_inti() here make na
mutual exclusion between clk_prepare_enable /clk_disable_unprepare and clk_set_parent
Hi, Mike

We hit an issue between clk_prepare_enable/clk_disable_unprepare and
clk_set_parent. As we know, clk prepare/unprepare grabs the prepare lock,
and clk enable/disable grabs the enable lock. clk_set_parent grabs the
prepare lock, but there is no lock protection across
clk_prepare_enable/clk_disable_unprepare. For example, clk_disable_unprepare
is expanded as clk_disable + clk_unprepare, and if the sequence below
occurs, there will be a problem:

thread 1                                thread 2
call clk_disable_unprepare
  1) clk_disable
       get enable lock
       ...
       release enable lock
                                        call clk_set_parent
                                          get prepare lock
                                          set clock's parent to another parent
                                          release prepare lock
  2) clk_unprepare
       get prepare lock
       unprepare parent clock  <<--
       release prepare lock

In the above sequence, after thread 1 calls clk_disable, thread 2 changes
the clk's parent to another clock. Then in thread 1 step 2, it unprepares
the clk's new parent rather than the old parent. As a result the old parent
is never unprepared, while the new parent is unprepared even though it has
not been prepared yet.

So how can we use the clk_prepare_enable and clk_disable_unprepare API?
Should we add a lock to protect it? Suppose we take the prepare lock inside
this API, like

clk_disable_unprepare()
{
	get_prepare_lock();
	clk_disable();
	clk_unprepare();
	clk_prepare_unlock();
}

Is the above sequence OK? If so, I can provide a patch for this.

Thanks
Xiaoguang
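The interleaving described above can be replayed deterministically in a single thread. The sketch below is a hypothetical userspace model, not the common clock framework: the model_* names are invented, and it bakes in the report's assumption that changing the parent does not move prepare counts from the old parent to the new one. Whether the real framework behaves that way is exactly what the question to the maintainer is about.

```c
/* Hypothetical model of prepare counts; not the common clock framework API. */
struct model_clk {
	struct model_clk *parent;
	int prepare_count;
};

/* Preparing a clock also prepares its current parent. */
static void model_prepare(struct model_clk *clk)
{
	if (clk->parent)
		clk->parent->prepare_count++;
	clk->prepare_count++;
}

/* Unpreparing drops the count on whichever parent is current *now*. */
static void model_unprepare(struct model_clk *clk)
{
	clk->prepare_count--;
	if (clk->parent)
		clk->parent->prepare_count--;
}

/* Reparent without migrating counts (the report's assumption). */
static void model_set_parent(struct model_clk *clk, struct model_clk *new_parent)
{
	clk->parent = new_parent;
}

/* Replays the reported interleaving in a single thread. */
static void replay_race(struct model_clk *clk, struct model_clk *new_parent)
{
	/* thread 1, step 1: clk_disable() only touches the enable lock */
	model_set_parent(clk, new_parent); /* thread 2 runs in between */
	model_unprepare(clk);              /* thread 1, step 2 */
}
```

Under these assumptions the old parent keeps a stale prepare count and the new parent's count goes negative, which matches the imbalance the report describes.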
Re: ipv4_dst_destroy panic regression after 3.10.15
On Tue, Jan 21, 2014 at 8:10 PM, dormando wrote: > > > On Tue, 21 Jan 2014, Alexei Starovoitov wrote: > >> On Tue, Jan 21, 2014 at 5:39 PM, dormando wrote: >> > >> > > On Fri, Jan 17, 2014 at 11:16 PM, dormando wrote: >> > > >> On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote: >> > > >> > On Fri, 2014-01-17 at 17:25 -0800, dormando wrote: >> > > >> > > Hi, >> > > >> > > >> > > >> > > Upgraded a few kernels to the latest 3.10 stable tree while >> > > >> > > tracking down >> > > >> > > a rare kernel panic, seems to have introduced a much more >> > > >> > > frequent kernel >> > > >> > > panic. Takes anywhere from 4 hours to 2 days to trigger: >> > > >> > > >> > > >> > > <4>[196727.311203] general protection fault: [#1] SMP >> > > >> > > <4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP >> > > >> > > macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich >> > > >> > > microcode >> ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm >> tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp >> pps_core mdio >> > > >> > > <4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted >> > > >> > > 3.10.26 #1 >> > > >> > > <4>[196727.311344] Hardware name: Supermicro >> > > >> > > X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013 >> > > >> > > <4>[196727.311364] task: 885e6f069700 ti: 885e6f072000 >> > > >> > > task.ti: 885e6f072000 >> > > >> > > <4>[196727.311377] RIP: 0010:[] >> > > >> > > [] ipv4_dst_destroy+0x4f/0x80 >> > > >> > > <4>[196727.311399] RSP: 0018:885effd23a70 EFLAGS: 00010282 >> > > >> > > <4>[196727.311409] RAX: dead00200200 RBX: 8854c398ecc0 >> > > >> > > RCX: 0040 >> > > >> > > <4>[196727.311423] RDX: dead00100100 RSI: dead00100100 >> > > >> > > RDI: dead00200200 >> > > >> > > <4>[196727.311437] RBP: 885effd23a80 R08: 815fd9e0 >> > > >> > > R09: 885d5a590800 >> > > >> > > <4>[196727.311451] R10: R11: >> > > >> > > R12: >> > > >> > > <4>[196727.311464] R13: 81c8c280 R14: >> 
> > >> > > R15: 880e85ee16ce >> > > >> > > <4>[196727.311510] FS: () >> > > >> > > GS:885effd2() knlGS: >> > > >> > > <4>[196727.311554] CS: 0010 DS: ES: CR0: >> > > >> > > 80050033 >> > > >> > > <4>[196727.311581] CR2: 7a46751eb000 CR3: 005e65688000 >> > > >> > > CR4: 000407e0 >> > > >> > > <4>[196727.311625] DR0: DR1: >> > > >> > > DR2: >> > > >> > > <4>[196727.311669] DR3: DR6: 0ff0 >> > > >> > > DR7: 0400 >> > > >> > > <4>[196727.311713] Stack: >> > > >> > > <4>[196727.311733] 8854c398ecc0 8854c398ecc0 >> > > >> > > 885effd23ab0 815b7f42 >> > > >> > > <4>[196727.311784] 88be6595bc00 8854c398ecc0 >> > > >> > > 8854c398ecc0 >> > > >> > > <4>[196727.311834] 885effd23ad0 815b86c6 >> > > >> > > 885d5a590800 8816827821c0 >> > > >> > > <4>[196727.311885] Call Trace: >> > > >> > > <4>[196727.311907] >> > > >> > > <4>[196727.311912] [] dst_destroy+0x32/0xe0 >> > > >> > > <4>[196727.311959] [] dst_release+0x56/0x80 >> > > >> > > <4>[196727.311986] [] tcp_v4_do_rcv+0x2a5/0x4a0 >> > > >> > > <4>[196727.312013] [] tcp_v4_rcv+0x7da/0x820 >> > > >> > > <4>[196727.312041] [] ? >> > > >> > > ip_rcv_finish+0x360/0x360 >> > > >> > > <4>[196727.312070] [] ? nf_hook_slow+0x7d/0x150 >> > > >> > > <4>[196727.312097] [] ? >> > > >> > > ip_rcv_finish+0x360/0x360 >> > > >> > > <4>[196727.312125] [] >> > > >> > > ip_local_deliver_finish+0xb2/0x230 >> > > >> > > <4>[196727.312154] [] >> > > >> > > ip_local_deliver+0x4a/0x90 >> > > >> > > <4>[196727.312183] [] ip_rcv_finish+0x119/0x360 >> > > >> > > <4>[196727.312212] [] ip_rcv+0x22b/0x340 >> > > >> > > <4>[196727.312242] [] ? >> > > >> > > macvlan_broadcast+0x160/0x160 [macvlan] >> > > >> > > <4>[196727.312275] [] >> > > >> > > __netif_receive_skb_core+0x512/0x640 >> > > >> > > <4>[196727.312308] [] ? 
>> > > >> > > kmem_cache_alloc+0x13b/0x150 >> > > >> > > <4>[196727.312338] [] >> > > >> > > __netif_receive_skb+0x21/0x70 >> > > >> > > <4>[196727.312368] [] >> > > >> > > netif_receive_skb+0x31/0xa0 >> > > >> > > <4>[196727.312397] [] >> > > >> > > napi_gro_receive+0xe8/0x140 >> > > >> > > <4>[196727.312433] [] ixgbe_poll+0x551/0x11f0 >> > > >> > > [ixgbe] >> > > >> > > <4>[196727.312463] [] ? ip_rcv+0x22b/0x340 >> > > >> > > <4>[196727.312491] [] net_rx_action+0x111/0x210 >> > > >> > > <4>[196727.312521] [] ? >> > > >> > > __netif_receive_skb+0x21/0x70 >> > > >> > > <4>[196727.312552] [] __do_softirq+0xd0/0x270 >> > > >> > > <4>[196727.312583] [] call_softirq+0x1c/0x30 >> > > >> > > <4>[196727.312613] [] do_softirq+0x55/0x90 >> > > >> > > <4>[196727.312640] [] irq_e
Deadlock between cpu_hotplug_begin and cpu_add_remove_lock
This arises out of a report from a tester that offlining a CPU never
finished on a system they were testing. This was on a POWER8 running a
3.10.x kernel, but the issue is still present in mainline AFAICS.

What I found when I looked at the system was this:

* There was a ppc64_cpu process stuck inside cpu_hotplug_begin(),
  called from _cpu_down(), from cpu_down(). This process was holding
  the cpu_add_remove_lock mutex, since cpu_down() calls
  cpu_maps_update_begin() before calling _cpu_down(). It was stuck
  there because cpu_hotplug.refcount == 1.

* There was a mdadm process trying to acquire the cpu_add_remove_lock
  mutex inside register_cpu_notifier(), called from
  raid5_alloc_percpu() in drivers/md/raid5.c. That process had
  previously called get_online_cpus(), which is why
  cpu_hotplug.refcount was 1.

Result: deadlock.

Thus it seems that the following code is not safe:

	get_online_cpus();
	register_cpu_notifier(&...);
	put_online_cpus();

There are a few different places that do that sort of thing; besides
drivers/md/raid5.c, there are instances in arch/x86/kernel/cpu,
arch/x86/oprofile, drivers/cpufreq/acpi-cpufreq.c,
drivers/oprofile/nmi_timer_int.c and kernel/trace/ring_buffer.c.

My question is this: is it reasonable to call register_cpu_notifier
inside a get/put_online_cpus block? If so, the deadlock needs to be
fixed; if not, the callers need to be fixed, and the restriction
should be documented.

Regards,
Paul.
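The two stuck processes in the report each hold what the other is waiting for. A minimal single-threaded model of that state is sketched below; the struct and helper names are hypothetical, and the model assumes the only outstanding refcount holder is the process that is also trying to take the mutex, as in the report.

```c
#include <stdbool.h>

/* Hypothetical single-threaded model of the two resources in the report. */
struct hotplug_model {
	bool add_remove_lock_held; /* stands in for cpu_add_remove_lock */
	int refcount;              /* stands in for cpu_hotplug.refcount */
};

/* cpu_down() path: holds the mutex, then waits for refcount to reach 0. */
static bool writer_can_finish(const struct hotplug_model *m)
{
	return m->add_remove_lock_held && m->refcount == 0;
}

/* register_cpu_notifier() path: needs the mutex to be free. */
static bool reader_can_register(const struct hotplug_model *m)
{
	return !m->add_remove_lock_held;
}

/* Neither side can make progress: each waits on what the other holds. */
static bool is_deadlocked(const struct hotplug_model *m)
{
	return !writer_can_finish(m) && !reader_can_register(m);
}
```

In this model the state "mutex held, refcount == 1" is the only one where neither path can advance, which is the situation found on the test system.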
linux-next: manual merge of the drivers-x86 tree with the pm tree
Hi Matthew,

Today's linux-next merge of the drivers-x86 tree got a conflict in
drivers/platform/x86/mxm-wmi.c between commit 8b48463f8942 ("ACPI: Clean
up inclusions of ACPI header files") from the pm tree and commit
475879d65123 ("drivers: platform: Include appropriate header file in
mxm-wmi.c") from the drivers-x86 tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

--
Cheers,
Stephen Rothwell s...@canb.auug.org.au

diff --cc drivers/platform/x86/mxm-wmi.c
index 3c59c0a3ee0f,7503d2b9b073..
--- a/drivers/platform/x86/mxm-wmi.c
+++ b/drivers/platform/x86/mxm-wmi.c
@@@ -20,7 -20,9 +20,8 @@@
  #include
  #include
  #include
+ #include
 -#include
 -#include
 +#include

  MODULE_AUTHOR("Dave Airlie");
  MODULE_DESCRIPTION("MXM WMI Driver");
Re: [PATCH 17/24] GFS2: Use RCU/hlist_bl based hash for quotas
On Mon, Jan 20, 2014 at 12:23:40PM +, Steven Whitehouse wrote: > Prior to this patch, GFS2 kept all the quotas for each > super block in a single linked list. This is rather slow > when there are large numbers of quotas. > > This patch introduces a hlist_bl based hash table, similar > to the one used for glocks. The initial look up of the quota > is now lockless in the case where it is already cached, > although we still have to take the per quota spinlock in > order to bump the ref count. Either way though, this is a > big improvement on what was there before. > > The qd_lock and the per super block list is preserved, for > the time being. However it is intended that since this is no > longer used for its original role, it should be possible to > shrink the number of items on that list in due course and > remove the requirement to take qd_lock in qd_get. > > Signed-off-by: Steven Whitehouse > Cc: Abhijith Das > Cc: Paul E. McKenney Interesting! I thought that Sasha Levin had a hash table in the works, but I don't see it, so CCing him. A few questions and comments below. Thanx, Paul > diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h > index a99f60c..59d99ec 100644 > --- a/fs/gfs2/incore.h > +++ b/fs/gfs2/incore.h > @@ -428,10 +428,13 @@ enum { > }; > > struct gfs2_quota_data { > + struct hlist_bl_node qd_hlist; > struct list_head qd_list; > struct kqid qd_id; > + struct gfs2_sbd *qd_sbd; > struct lockref qd_lockref; > struct list_head qd_lru; > + unsigned qd_hash; > > unsigned long qd_flags; /* QDF_... 
*/ > > @@ -450,6 +453,7 @@ struct gfs2_quota_data { > > u64 qd_sync_gen; > unsigned long qd_last_warn; > + struct rcu_head qd_rcu; > }; > > struct gfs2_trans { > diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c > index 0650db2..c272e73 100644 > --- a/fs/gfs2/main.c > +++ b/fs/gfs2/main.c > @@ -76,6 +76,7 @@ static int __init init_gfs2_fs(void) > > gfs2_str2qstr(&gfs2_qdot, "."); > gfs2_str2qstr(&gfs2_qdotdot, ".."); > + gfs2_quota_hash_init(); > > error = gfs2_sys_init(); > if (error) > diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c > index 1b6b367..a1df01d 100644 > --- a/fs/gfs2/quota.c > +++ b/fs/gfs2/quota.c > @@ -52,6 +52,10 @@ > #include > #include > #include > +#include > +#include > +#include > +#include > > #include "gfs2.h" > #include "incore.h" > @@ -67,10 +71,43 @@ > #include "inode.h" > #include "util.h" > > -/* Lock order: qd_lock -> qd->lockref.lock -> lru lock */ > +#define GFS2_QD_HASH_SHIFT 12 Should this be a function of the number of CPUs? (Might not be an issue if the really big systems don't use GFS.) 
> +#define GFS2_QD_HASH_SIZE (1 << GFS2_QD_HASH_SHIFT) > +#define GFS2_QD_HASH_MASK (GFS2_QD_HASH_SIZE - 1) > + > +/* Lock order: qd_lock -> bucket lock -> qd->lockref.lock -> lru lock */ > static DEFINE_SPINLOCK(qd_lock); > struct list_lru gfs2_qd_lru; > > +static struct hlist_bl_head qd_hash_table[GFS2_QD_HASH_SIZE]; > + > +static unsigned int gfs2_qd_hash(const struct gfs2_sbd *sdp, > + const struct kqid qid) > +{ > + unsigned int h; > + > + h = jhash(&sdp, sizeof(struct gfs2_sbd *), 0); > + h = jhash(&qid, sizeof(struct kqid), h); > + > + return h & GFS2_QD_HASH_MASK; > +} > + > +static inline void spin_lock_bucket(unsigned int hash) > +{ > +hlist_bl_lock(&qd_hash_table[hash]); > +} > + > +static inline void spin_unlock_bucket(unsigned int hash) > +{ > +hlist_bl_unlock(&qd_hash_table[hash]); > +} > + > +static void gfs2_qd_dealloc(struct rcu_head *rcu) > +{ > + struct gfs2_quota_data *qd = container_of(rcu, struct gfs2_quota_data, > qd_rcu); > + kmem_cache_free(gfs2_quotad_cachep, qd); > +} > + > static void gfs2_qd_dispose(struct list_head *list) > { > struct gfs2_quota_data *qd; > @@ -87,6 +124,10 @@ static void gfs2_qd_dispose(struct list_head *list) > list_del(&qd->qd_list); > spin_unlock(&qd_lock); > > + spin_lock_bucket(qd->qd_hash); > + hlist_bl_del_rcu(&qd->qd_hlist); > + spin_unlock_bucket(qd->qd_hash); > + Good, removed from the RCU-traversed list before invoking call_rcu(). > gfs2_assert_warn(sdp, !qd->qd_change); > gfs2_assert_warn(sdp, !qd->qd_slot_count); > gfs2_assert_warn(sdp, !qd->qd_bh_count); > @@ -95,7 +136,7 @@ static void gfs2_qd_dispose(struct list_head *list) > atomic_dec(&sdp->sd_quota_count); > > /* Delete it from the common reclaim list */ > - kmem_cache_free(gfs2_quotad_cachep, qd); > + call_rcu(&qd->qd_rcu, gfs2_qd_dealloc); > } > } > > @@ -165,83 +206,95 @@ static u64 qd2offset(struct gfs2_quota_data *qd) > return offset; > } > > -static int qd_alloc(struct gfs2_sbd *sdp, struct kqid qid, > -
[PATCH v3 0/4] X86/KVM: enable Intel MPX for KVM
These patches are version 3 to enable Intel MPX for KVM.

Version 1:
  * Add some Intel MPX definitions
  * Fix a cpuid(0x0d, 0) exposing bug, dynamic per XCR0 features enable/disable
  * vmx and msr handling for MPX support in KVM
  * enable MPX feature for guest

Version 2:
  * remove generic MPX definitions, Qiaowei's patch has added the definitions
    on the kernel side
  * add MSR_IA32_BNDCFGS to msrs_to_save

Version 3:
  * rebase on latest kernel, which includes Qiaowei's MPX common definitions
    pulled from HPA's tree

Thanks,
Jinsong
Re: [RFC PATCH v4 1/2] sysctl: Make neg_one a standard constraint
On Mon, 20 Jan 2014, atom...@redhat.com wrote:
> From: Aaron Tomlin
>
> Add neg_one to the list of standard constraints.
>
> Signed-off-by: Aaron Tomlin
> Acked-by: Rik van Riel

Acked-by: David Rientjes
Re: [RFC PATCH v4 2/2] hung_task: Display every hung task warning
On Mon, 20 Jan 2014, atom...@redhat.com wrote:
> From: Aaron Tomlin
>
> When khungtaskd detects hung tasks, it prints out
> backtraces from a number of those tasks.
> Limiting the number of backtraces being printed
> out can result in the user not seeing the information
> necessary to debug the issue. The hung_task_warnings
> sysctl controls this feature.
>
> This patch makes it possible for hung_task_warnings
> to accept a special value to print an unlimited
> number of backtraces when khungtaskd detects hung
> tasks.
>
> The special value is -1. To use this value it is
> necessary to change types from ulong to int.
>
> Signed-off-by: Aaron Tomlin
> Reviewed-by: Rik van Riel

Acked-by: David Rientjes

Nice documentation updates!
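The counter semantics described in the quoted commit message (a finite budget that runs out, and -1 meaning "never run out") can be sketched in userspace like this. The should_warn() helper is hypothetical and only models the described behaviour; it is not taken from the patch itself.

```c
#include <stdbool.h>

/* Hypothetical model of the warning budget described above. */
static bool should_warn(int *warnings)
{
	if (*warnings == 0)
		return false;  /* budget exhausted: stay quiet */
	if (*warnings > 0)
		(*warnings)--; /* finite budget: consume one warning */
	return true;           /* -1 is never decremented, so it never runs out */
}
```

A signed type is what makes the -1 sentinel distinguishable from a large positive budget, which fits the commit message's note about moving from ulong to int.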
Re: [PATCH] clk: export __clk_get_hw for re-use in others
On Wed, Jan 22, 2014 at 1:59 PM, Greg KH wrote:
> On Wed, Jan 22, 2014 at 12:05:57PM +0900, SeongJae Park wrote:
>> Dear Greg, Mike,
>>
>> May I ask your answer or other opinion, please?
>
> It's the middle of the merge window, it's not time for new development,
> or much time for free-time for me, sorry. Feel free to fix it the best
> way you know how.

Oops, I forgot about the merge window. Thank you very much for your kind
answer. Sorry if I bothered you during a busy time. Since the build problem
is not a big deal (it exists only in the -next tree), I will wait until the
merge window closes and then fix it again if it still exists.

SeongJae Park.

>
> greg k-h
Re: [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes
On Tue, Jan 21, 2014 at 10:04:29PM -0500, Ric Wheeler wrote:
> One topic that has been lurking forever at the edges is the current
> 4k limitation for file system block sizes. Some devices in
> production today and others coming soon have larger sectors and it
> would be interesting to see if it is time to poke at this topic
> again.
>
> LSF/MM seems to be pretty much the only event of the year that most
> of the key people will be present, so should be a great topic for a
> joint session.

Oh yes, I want in on this. We handle 4k/16k/64k pages "seamlessly," and we
would want to do the same for larger sectors. In theory, our code should
handle it with the appropriate defines updated.

Joel
Re: [PATCH] cpufreq: Align all CPUs to the same frequency if using shared clock
On 21 January 2014 13:42, Viresh Kumar wrote:
> On 21 January 2014 12:56, Li, Zhuangzhi wrote:
>> Thanks for reviewing.
>
> Its my job :)
>
>> Sorry for make you misunderstanding, on our x86 platform, we want all the
>> CPUs share one policy by setting CPUFREQ_SHARED_TYPE_ALL, not share one HW
>> clock line.
>
> I see.. Then probably your patch makes sense. But it is
> obviously not required for every platform that exists today.
>
> Please update it to do it only for drivers that have set
> CPUFREQ_SHARED_TYPE_ALL..

One more thing: who has set different frequencies for these cores? I hope
the kernel hasn't? In that case, you are probably fixing a bootloader bug in
the kernel. What about doing this in the bootloader instead?

--
viresh
Re: [PATCH v2] backlight: turn backlight on/off when necessary
On Wednesday, January 22, 2014 2:04 PM, Liu Ying wrote:
>
> Ping...
>
> Regards,
> Liu Ying

Please don't send a ping within 2 days. It is not good practice. You sent
the v1 patch 6 months ago; why should I have to review this one within 2
days? Please wait.

Best regards,
Jingoo Han

>
> On 01/20/2014 12:52 PM, Liu Ying wrote:
> > We don't have to turn backlight on/off everytime a blanking
> > or unblanking event comes because the backlight status may
> > have already been what we want. Another thought is that one
> > backlight device may be shared by multiple framebuffers. We
> > don't hope blanking one of the framebuffers may turn the
> > backlight off for all the other framebuffers, since they are
> > likely being active to display something. This patch adds
> > some logics to record each framebuffer's backlight usage to
> > determine the backlight device use count and whether the
> > backlight should be turned on or off. To be more specific,
> > only one unblank operation on a certain blanked framebuffer
> > may increase the backlight device's use count by one, while
> > one blank operation on a certain unblanked framebuffer may
> > decrease the use count by one, because the userspace is
> > likely to unblank a unblanked framebuffer or blank a blanked
> > framebuffer.
> >
> > Signed-off-by: Liu Ying
> > ---
> > v1 can be found at https://lkml.org/lkml/2013/5/30/139
> >
> > v1->v2:
> > * Make the commit message be more specific about the condition
> >   in which backlight device use count can be increased/decreased.
> > * Correct the setting for bd->props.fb_blank.
> > > > drivers/video/backlight/backlight.c | 28 +--- > > include/linux/backlight.h |6 ++ > > 2 files changed, 27 insertions(+), 7 deletions(-) > > > > diff --git a/drivers/video/backlight/backlight.c > > b/drivers/video/backlight/backlight.c > > index 5d0..42044be 100644 > > --- a/drivers/video/backlight/backlight.c > > +++ b/drivers/video/backlight/backlight.c > > @@ -34,13 +34,15 @@ static const char *const backlight_types[] = { > >defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)) > > /* This callback gets called when something important happens inside a > > * framebuffer driver. We're looking if that important event is blanking, > > - * and if it is, we're switching backlight power as well ... > > + * and if it is and necessary, we're switching backlight power as well ... > > */ > > static int fb_notifier_callback(struct notifier_block *self, > > unsigned long event, void *data) > > { > > struct backlight_device *bd; > > struct fb_event *evdata = data; > > + int node = evdata->info->node; > > + int fb_blank = 0; > > > > /* If we aren't interested in this event, skip it immediately ... 
*/ > > if (event != FB_EVENT_BLANK && event != FB_EVENT_CONBLANK) > > @@ -51,12 +53,24 @@ static int fb_notifier_callback(struct notifier_block > > *self, > > if (bd->ops) > > if (!bd->ops->check_fb || > > bd->ops->check_fb(bd, evdata->info)) { > > - bd->props.fb_blank = *(int *)evdata->data; > > - if (bd->props.fb_blank == FB_BLANK_UNBLANK) > > - bd->props.state &= ~BL_CORE_FBBLANK; > > - else > > - bd->props.state |= BL_CORE_FBBLANK; > > - backlight_update_status(bd); > > + fb_blank = *(int *)evdata->data; > > + if (fb_blank == FB_BLANK_UNBLANK && > > + !bd->fb_bl_on[node]) { > > + bd->fb_bl_on[node] = true; > > + if (!bd->use_count++) { > > + bd->props.state &= ~BL_CORE_FBBLANK; > > + bd->props.fb_blank = > > FB_BLANK_UNBLANK; > > + backlight_update_status(bd); > > + } > > + } else if (fb_blank != FB_BLANK_UNBLANK && > > + bd->fb_bl_on[node]) { > > + bd->fb_bl_on[node] = false; > > + if (!(--bd->use_count)) { > > + bd->props.state |= BL_CORE_FBBLANK; > > + bd->props.fb_blank = > > FB_BLANK_POWERDOWN; > > + backlight_update_status(bd); > > + } > > + } > > } > > mutex_unlock(&bd->ops_lock); > > return 0; > > diff --git a/include/linux/backlight.h b/include/linux/backlight.h > > index 5f9cd96..7264742 100644 > > --- a/include/linux/backlight.h > > +++ b/include/linux/backlight.h > > @@ -9,6 +9,7 @@ > > #define _LINUX_BACK
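The counting rule in the quoted patch (only a blank-to-unblank or unblank-to-blank transition of a given framebuffer changes the use count, and the hardware is touched only on the 0 <-> 1 edges) can be modelled in userspace as follows. The struct, the MODEL_FB_MAX bound and model_fb_event() are hypothetical names for illustration; this is a sketch of the logic, not the driver code.

```c
#include <stdbool.h>

#define MODEL_FB_MAX 32 /* assumed bound, standing in for the kernel's limit */

/* Hypothetical userspace model of the per-framebuffer bookkeeping. */
struct model_backlight {
	bool fb_bl_on[MODEL_FB_MAX]; /* does framebuffer N currently want light? */
	int use_count;               /* number of framebuffers wanting light */
	bool lit;                    /* modelled hardware state */
	int updates;                 /* how often the hardware was touched */
};

/* Returns true when the event changed the modelled hardware state. */
static bool model_fb_event(struct model_backlight *bd, int node, bool unblank)
{
	if (node < 0 || node >= MODEL_FB_MAX)
		return false;

	if (unblank && !bd->fb_bl_on[node]) {
		bd->fb_bl_on[node] = true;
		if (bd->use_count++ == 0) { /* first user: turn on */
			bd->lit = true;
			bd->updates++;
			return true;
		}
	} else if (!unblank && bd->fb_bl_on[node]) {
		bd->fb_bl_on[node] = false;
		if (--bd->use_count == 0) { /* last user gone: turn off */
			bd->lit = false;
			bd->updates++;
			return true;
		}
	}
	return false; /* repeated or irrelevant event: nothing to do */
}
```

With two framebuffers sharing the device, blanking one of them leaves the backlight on in this model, and repeated unblank events for the same framebuffer do not inflate the count.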
RE: [PATCH] Add HID's to hid-microsoft driver of Surface Type/Touch Cover 2 to fix bug
Hello Benjamin, >> >> Hi, >> >> Thanks for reminding me of hid_have_special_driver[]. I noticed that >> this device has the HID_DG_CONTACTID and in the comment of the >> hid_have_sepcial_driver[] >> >> * Please note that for multitouch devices (driven by hid-multitouch driver), >> * there is a proper autodetection and autoloading in place (based on presence >> * of HID_DG_CONTACTID), so those devices don't need to be added to this list, >> * as we are doing the right thing in hid_scan_usage(). >> >> This device should not be driven by hid-multitouch as it does not >> handle keyboard/mouse input devices. >> I submitted a new patch below with it added. I believe it should still >> be part of this array, in case this kind of implementation is >> fixed/updated. > > This implementation is perfectly fine (I am referring to the "fixed/updated"): > - if your device should be driven by hid-multitouch, then you _don't_ > add it to hid_have_special_driver > - if your device should not be driven by hid-multitouch, then you > _need_ to add it to hid_have_special_driver. > > Adding the device to hid_have_special_driver prevents the detection of > the group HID_GRP_MULTITOUCH, so you will not end with a race between > hid-multitouch and your special hid driver. > Thanks for clearing that up. I understand the proper use of this array now, under this circumstance and am glad to know that there will be no race when added. >> >> From 291742873dcf181faf9657b41279487f31302c73 Mon Sep 17 00:00:00 2001 >> From: Reyad Attiyat >> Date: Tue, 21 Jan 2014 01:22:25 -0600 >> Subject: [PATCH 1/1] Added in HID's for Microsoft Surface Type/Touch cover 2. >> This is to fix bug 64811 where this device is detected as a multitouch >> device >> > > You are missing a commit message here (the first message you sent > would fit perfectly here). > Sorry about that, I'm new to submitting patches to these mailing lists. 
> Other than that, I played a little with the report descriptor pointed > in the bugzilla. > > I think I will be able to handle this touch cover in hid-multitouch, > but that would require more testings/debugging. Microsoft seems to > have implemented an indirect (dual) touchpad here, but until we know > which mode we should put it into, it's going to be tricky to set it up > correctly. > > One last thing, in the bugzilla, in the comment 2 you say: "I still > have issues with the type cover 2 even with this fix". Are you still > experiencing those disconnection? If so, maybe we should switch to > hid-multitouch at some point. >
I tried some patches that I think you posted to hid-input about hid-multitouch. The patches added support for function callbacks to allow for a generic protocol. This worked after I changed mt_input_mapping() to set the protocol to mt_protocol_generic:
	 * such as Mouse that might have the same GenericDesktop usages. */
	if (field->application != HID_DG_TOUCHSCREEN &&
	    field->application != HID_DG_PEN &&
	    field->application != HID_DG_TOUCHPAD)
		td->protocols[report_id] = mt_protocol_generic;
I still experience the disconnects with both of these solutions. Do you have any idea what could cause this? It seems to happen when I'm typing fast or holding a key. I'm guessing the only way to fix this properly is to snoop USB packets in Windows to see how the device is handled there. Another bug is that the device stays on, lit, in standby mode. What do you think is the best solution to take? By that I mean, should I keep the patch as part of hid-microsoft? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
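The rule Benjamin describes for hid_have_special_driver[] can be pictured with a small user-space sketch. Everything here is a stand-in: the demo_* names, the group enum and the placeholder IDs are assumptions made for illustration and are not the kernel's own definitions. The only behaviour taken from the thread is that a listed device is never classified as multitouch, so only its special driver can claim it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in device groups; the kernel's real constants are not reproduced here. */
enum demo_group { DEMO_GROUP_GENERIC, DEMO_GROUP_MULTITOUCH };

struct demo_id {
	unsigned short vendor;
	unsigned short product;
};

/* Hypothetical list playing the role of hid_have_special_driver[]. */
static const struct demo_id demo_special[] = {
	{ 0x1234, 0x0001 },	/* placeholder IDs, not real hardware */
};

static bool demo_has_special_driver(struct demo_id id)
{
	for (size_t i = 0; i < sizeof(demo_special) / sizeof(demo_special[0]); i++)
		if (demo_special[i].vendor == id.vendor &&
		    demo_special[i].product == id.product)
			return true;
	return false;
}

/*
 * Listed devices skip the multitouch detection, so a contact-ID usage in
 * their report descriptor never moves them into the multitouch group.
 */
static enum demo_group demo_classify(struct demo_id id, bool has_contact_id)
{
	if (demo_has_special_driver(id))
		return DEMO_GROUP_GENERIC;
	return has_contact_id ? DEMO_GROUP_MULTITOUCH : DEMO_GROUP_GENERIC;
}
```

With this shape there is a single decision point, which is why no race between the two drivers is expected once the device is listed.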
Re: [BISECTED] Linux 3.12.7 introduces page map handling regression
On Tue, Jan 21, 2014 at 07:20:45PM -0800, Steven Noonan wrote: > On Tue, Jan 21, 2014 at 06:47:07PM -0800, Linus Torvalds wrote: > > On Tue, Jan 21, 2014 at 5:49 PM, Greg Kroah-Hartman > > wrote: Adding extra folks to the party. > > > > > > Odds are this also shows up in 3.13, right? > > Reproduced using 3.13 on the PV guest: > > [ 368.756763] BUG: Bad page map in process mp pte:8004a67c6165 > pmd:e9b706067 > [ 368.756777] page:ea001299f180 count:0 mapcount:-1 mapping: >(null) index:0x0 > [ 368.756781] page flags: 0x2f8014(referenced|dirty) > [ 368.756786] addr:7fd1388b7000 vm_flags:00100071 > anon_vma:880e9ba15f80 mapping: (null) index:7fd1388b7 > [ 368.756792] CPU: 29 PID: 618 Comm: mp Not tainted 3.13.0-ec2 #1 > [ 368.756795] 880e9b718958 880e9eaf3cc0 814d8748 > 7fd1388b7000 > [ 368.756803] 880e9eaf3d08 8116d289 > > [ 368.756809] 880e9b7065b8 ea001299f180 7fd1388b8000 > 880e9eaf3e30 > [ 368.756815] Call Trace: > [ 368.756825] [] dump_stack+0x45/0x56 > [ 368.756833] [] print_bad_pte+0x229/0x250 > [ 368.756837] [] unmap_single_vma+0x583/0x890 > [ 368.756842] [] unmap_vmas+0x65/0x90 > [ 368.756847] [] unmap_region+0xac/0x120 > [ 368.756852] [] ? vma_rb_erase+0x1c9/0x210 > [ 368.756856] [] do_munmap+0x280/0x370 > [ 368.756860] [] vm_munmap+0x41/0x60 > [ 368.756864] [] SyS_munmap+0x22/0x30 > [ 368.756869] [] system_call_fastpath+0x1a/0x1f > [ 368.756872] Disabling lock debugging due to kernel taint > [ 368.760084] BUG: Bad rss-counter state mm:880e9d079680 idx:0 > val:-1 > [ 368.760091] BUG: Bad rss-counter state mm:880e9d079680 idx:1 > val:1 > > > > > Probably. I don't have a Xen PV setup to test with (and very little > > interest in setting one up).. And I have a suspicion that it might not > > be so much about Xen PV, as perhaps about the kind of hardware. > > > > I suspect the issue has something to do with the magic _PAGE_NUMA > > tie-in with _PAGE_PRESENT. 
And then mprotect(PROT_NONE) ends up > > removing the _PAGE_PRESENT bit, and now the crazy numa code is > > confused. > > > > The whole _PAGE_NUMA thing is a f*cking horrible hack, and shares the > > bit with _PAGE_PROTNONE, which is why it then has that tie-in to > > _PAGE_PRESENT. > > > > Adding Andrea to the Cc, because he's the author of that horridness. > > Putting Steven's test-case here as an attachement for Andrea, maybe > > that makes him go "Ahh, yes, silly case". > > > > Also added Kirill, because he was involved the last _PAGE_NUMA debacle. > > > > Andrea, you can find the thread on lkml, but it boils down to commit > > 1667918b6483 (backported to 3.12.7 as 3d792d616ba4) breaking the > > attached test-case (but apparently only under Xen PV). There it > > apparently causes a "BUG: Bad page map .." error. I *think* it is due to the fact that pmd_numa and pte_numa is getting the _raw_ value of PMDs and PTEs. That is - it does not use the pvops interface and instead reads the values directly from the page-table. Since the page-table is also manipulated by the hypervisor - there are certain flags it also sets to do its business. It might be that it uses _PAGE_GLOBAL as well - and Linux picks up on that. If it was using pte_flags that would invoke the pvops interface. Elena, Dariof and George, you guys had been looking at this a bit deeper than I have. Does the Xen hypervisor use the _PAGE_GLOBAL for PV guests? This not-compiled-totally-bad-patch might shed some light on what I was thinking _could_ fix this issue - and IS NOT A FIX - JUST A HACK. It does not fix it for PMDs naturally (as there are no PMD paravirt ops for that). The other question is - how is AutoNUMA running when it is not enabled? Shouldn't those _PAGE_NUMA ops be nops when AutoNUMA hasn't even been turned on? 
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c index ce563be..9fa7088 100644 --- a/arch/x86/xen/mmu.c +++ b/arch/x86/xen/mmu.c @@ -370,12 +370,15 @@ static pteval_t pte_mfn_to_pfn(pteval_t val) unsigned long pfn = mfn_to_pfn(mfn); pteval_t flags = val & PTE_FLAGS_MASK; + /* No AutoNUMA for PV. TODO If Linux sees the PTE having +* said bit, just igore it. */ + if (flags & _PAGE_NUMA) + flags = flags & ~_PAGE_NUMA; if (unlikely(pfn == ~0)) val = flags & ~_PAGE_PRESENT; else val = ((pteval_t)pfn << PAGE_SHIFT) | flags; } - return val; } diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index db09234..a8bc07d 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -644,7 +644,7 @@ static inline int pmd_t
Re:[PATCH v2] backlight: turn backlight on/off when necessary
Ping... Regards, Liu Ying On 01/20/2014 12:52 PM, Liu Ying wrote: > We don't have to turn backlight on/off everytime a blanking > or unblanking event comes because the backlight status may > have already been what we want. Another thought is that one > backlight device may be shared by multiple framebuffers. We > don't hope blanking one of the framebuffers may turn the > backlight off for all the other framebuffers, since they are > likely being active to display something. This patch adds > some logics to record each framebuffer's backlight usage to > determine the backlight device use count and whether the > backlight should be turned on or off. To be more specific, > only one unblank operation on a certain blanked framebuffer > may increase the backlight device's use count by one, while > one blank operation on a certain unblanked framebuffer may > decrease the use count by one, because the userspace is > likely to unblank a unblanked framebuffer or blank a blanked > framebuffer. > > Signed-off-by: Liu Ying > --- > v1 can be found at https://lkml.org/lkml/2013/5/30/139 > > v1->v2: > * Make the commit message be more specific about the condition > in which backlight device use count can be increased/decreased. > * Correct the setting for bd->props.fb_blank. > > drivers/video/backlight/backlight.c | 28 +--- > include/linux/backlight.h |6 ++ > 2 files changed, 27 insertions(+), 7 deletions(-) > > diff --git a/drivers/video/backlight/backlight.c > b/drivers/video/backlight/backlight.c > index 5d0..42044be 100644 > --- a/drivers/video/backlight/backlight.c > +++ b/drivers/video/backlight/backlight.c > @@ -34,13 +34,15 @@ static const char *const backlight_types[] = { >defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)) > /* This callback gets called when something important happens inside a > * framebuffer driver. We're looking if that important event is blanking, > - * and if it is, we're switching backlight power as well ... 
> + * and if it is and necessary, we're switching backlight power as well ... > */ > static int fb_notifier_callback(struct notifier_block *self, > unsigned long event, void *data) > { > struct backlight_device *bd; > struct fb_event *evdata = data; > + int node = evdata->info->node; > + int fb_blank = 0; > > /* If we aren't interested in this event, skip it immediately ... */ > if (event != FB_EVENT_BLANK && event != FB_EVENT_CONBLANK) > @@ -51,12 +53,24 @@ static int fb_notifier_callback(struct notifier_block > *self, > if (bd->ops) > if (!bd->ops->check_fb || > bd->ops->check_fb(bd, evdata->info)) { > - bd->props.fb_blank = *(int *)evdata->data; > - if (bd->props.fb_blank == FB_BLANK_UNBLANK) > - bd->props.state &= ~BL_CORE_FBBLANK; > - else > - bd->props.state |= BL_CORE_FBBLANK; > - backlight_update_status(bd); > + fb_blank = *(int *)evdata->data; > + if (fb_blank == FB_BLANK_UNBLANK && > + !bd->fb_bl_on[node]) { > + bd->fb_bl_on[node] = true; > + if (!bd->use_count++) { > + bd->props.state &= ~BL_CORE_FBBLANK; > + bd->props.fb_blank = FB_BLANK_UNBLANK; > + backlight_update_status(bd); > + } > + } else if (fb_blank != FB_BLANK_UNBLANK && > + bd->fb_bl_on[node]) { > + bd->fb_bl_on[node] = false; > + if (!(--bd->use_count)) { > + bd->props.state |= BL_CORE_FBBLANK; > + bd->props.fb_blank = > FB_BLANK_POWERDOWN; > + backlight_update_status(bd); > + } > + } > } > mutex_unlock(&bd->ops_lock); > return 0; > diff --git a/include/linux/backlight.h b/include/linux/backlight.h > index 5f9cd96..7264742 100644 > --- a/include/linux/backlight.h > +++ b/include/linux/backlight.h > @@ -9,6 +9,7 @@ > #define _LINUX_BACKLIGHT_H > > #include > +#include > #include > #include > > @@ -104,6 +105,11 @@ struct backlight_device { > struct list_head entry; > > struct device dev; > + > + /* Multiple framebuffers may share one backlight device */ > + bool fb_bl_on[FB_MAX]; > + > + int use_count; > }; > > static inline void backlight_update_status(struct backlight_device *bd) > -- > 
1.7.9.5
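The use-count bookkeeping in the patch above can be exercised outside the kernel with a sketch like the following. The demo_* names, DEMO_FB_MAX and the `lit` field are assumptions for illustration; the real patch calls backlight_update_status() and updates bd->props instead, and it does not carry the bounds check added here for safety.

```c
#include <stdbool.h>

#define DEMO_FB_MAX 4	/* stand-in for FB_MAX */

struct demo_backlight {
	bool fb_bl_on[DEMO_FB_MAX];	/* per-framebuffer "unblanked" flag */
	int use_count;			/* number of unblanked framebuffers */
	bool lit;			/* stand-in for the hardware state */
};

/* Returns true only when the event changed the backlight's on/off state. */
static bool demo_fb_blank_event(struct demo_backlight *bd, int node, bool unblank)
{
	if (node < 0 || node >= DEMO_FB_MAX)	/* not in the patch; sketch only */
		return false;

	if (unblank && !bd->fb_bl_on[node]) {
		bd->fb_bl_on[node] = true;
		if (bd->use_count++ == 0) {	/* first user: switch on */
			bd->lit = true;
			return true;
		}
	} else if (!unblank && bd->fb_bl_on[node]) {
		bd->fb_bl_on[node] = false;
		if (--bd->use_count == 0) {	/* last user gone: switch off */
			bd->lit = false;
			return true;
		}
	}
	return false;	/* repeated or redundant event: nothing to do */
}
```

Unblanking an already unblanked framebuffer, or blanking one of two active framebuffers, leaves the backlight untouched, which is the behaviour the commit message argues for.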
Re: [PATCH 68/73] drivers/cpufreq: delete non-required instances of
On 22 January 2014 02:53, Paul Gortmaker wrote: > None of these files are actually using any __init type directives > and hence don't need to include . Most are just a > left over from __devinit and __cpuinit removal, or simply due to > code getting copied from one driver to the next. > > Cc: Kevin Hilman > Cc: "Rafael J. Wysocki" > Cc: Viresh Kumar > Cc: cpuf...@vger.kernel.org > Cc: linux...@vger.kernel.org > Signed-off-by: Paul Gortmaker > --- > drivers/cpufreq/omap-cpufreq.c| 1 - > drivers/cpufreq/powernow-k8.c | 1 - > drivers/cpufreq/s3c2412-cpufreq.c | 1 - > drivers/cpufreq/s3c2440-cpufreq.c | 1 - > drivers/cpufreq/spear-cpufreq.c | 1 - > drivers/cpufreq/speedstep-lib.c | 1 - > 6 files changed, 6 deletions(-) We have discussed this in past where I pointed out this file is required by stuff like module_init, what happened to that query ? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] clk: export __clk_get_hw for re-use in others
On Wed, Jan 22, 2014 at 12:05:57PM +0900, SeongJae Park wrote: > Dear Greg, Mike, > > May I ask your answer or other opinion, please? It's the middle of the merge window, it's not time for new development, or much time for free-time for me, sorry. Feel free to fix it the best way you know how. greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] pinctrl: Rename Broadcom Capri pinctrl driver
On Tue, Jan 21, 2014 at 5:35 PM, Matt Porter wrote: > On Tue, Jan 21, 2014 at 04:59:35PM -0800, Olof Johansson wrote: >> Hi, >> >> >> On Tue, Jan 21, 2014 at 2:38 PM, Sherman Yin wrote: >> > To be consistent with other Broadcom drivers, the Broadcom Capri pinctrl >> > driver and its related CONFIG option are renamed to bcm281xx. >> > >> > Devicetree compatible string and binding documentation use >> > "brcm,bcm11351-pinctrl" to match the machine binding here: >> > Documentation/devicetree/bindings/arm/bcm/bcm11351.txt >> > >> > This driver supports pinctrl on BCM11130, BCM11140, BCM11351, BCM28145 >> > and BCM28155 SoCs. >> > >> > Signed-off-by: Sherman Yin >> > Reviewed-by: Matt Porter >> > --- >> > ...capri-pinctrl.txt => brcm,bcm11351-pinctrl.txt} |8 +- >> > arch/arm/boot/dts/bcm11351.dtsi|2 +- >> > arch/arm/configs/bcm_defconfig |2 +- >> > drivers/pinctrl/Kconfig|8 +- >> > drivers/pinctrl/Makefile |2 +- >> > .../{pinctrl-capri.c => pinctrl-bcm281xx.c}| 1521 >> > ++-- >> > 6 files changed, 775 insertions(+), 768 deletions(-) >> > rename Documentation/devicetree/bindings/pinctrl/{brcm,capri-pinctrl.txt >> > => brcm,bcm11351-pinctrl.txt} (98%) >> > rename drivers/pinctrl/{pinctrl-capri.c => pinctrl-bcm281xx.c} (25%) >> > >> > diff --git >> > a/Documentation/devicetree/bindings/pinctrl/brcm,capri-pinctrl.txt >> > b/Documentation/devicetree/bindings/pinctrl/brcm,bcm11351-pinctrl.txt >> > similarity index 98% >> > rename from >> > Documentation/devicetree/bindings/pinctrl/brcm,capri-pinctrl.txt >> > rename to >> > Documentation/devicetree/bindings/pinctrl/brcm,bcm11351-pinctrl.txt >> > index 9e9e9ef..c119deb 100644 >> > --- a/Documentation/devicetree/bindings/pinctrl/brcm,capri-pinctrl.txt >> > +++ b/Documentation/devicetree/bindings/pinctrl/brcm,bcm11351-pinctrl.txt >> > @@ -1,4 +1,4 @@ >> > -Broadcom Capri Pin Controller >> > +Broadcom BCM281xx Pin Controller >> > >> > This is a pin controller for the Broadcom BCM281xx SoC family, which >> > includes >> > BCM11130, 
BCM11140, BCM11351, BCM28145, and BCM28155 SoCs. >> > @@ -7,14 +7,14 @@ BCM11130, BCM11140, BCM11351, BCM28145, and BCM28155 >> > SoCs. >> > >> > Required Properties: >> > >> > -- compatible: Must be "brcm,capri-pinctrl". >> > +- compatible: Must be "brcm,bcm11351-pinctrl" >> >> Since the original binding is queued for 3.14 (I believe?), if this >> rename isn't merged for 3.14 then you will still need to accept the >> old compatible string (binding). You can document it as deprecated, >> but the driver needs to still probe with it. > > Linus had mentioned that he could take a rename in 3.14-rc for this > driver which is really what we had in mind here. Since the binding > doesn't become stable until 3.14 is actually released I was under the > impression that this is ok without keeping a deprecated compatible > string. I notice that Tomasz had comments about this type of situation > in http://www.spinics.net/lists/devicetree/msg18010.html Yes, if the rename goes in before the binding has been in one stable release then we can make noncompatible changes. Which is why I said if this isn't merged for 3.14, etc. -Olof -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
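Olof's point about keeping the old compatible string can be sketched as a match table that carries both names. This is a user-space illustration only: the demo_* types are hypothetical, the kernel would use its own device-match tables, and the deprecated entry is needed only in the case he describes, where the old binding ships in a stable release before the rename lands.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct demo_compat {
	const char *compatible;
	bool deprecated;
};

static const struct demo_compat demo_pinctrl_match[] = {
	{ "brcm,bcm11351-pinctrl", false },
	{ "brcm,capri-pinctrl",    true  },	/* old name, kept so existing device trees still probe */
};

/* Returns the matching entry, or NULL when the string is unknown. */
static const struct demo_compat *demo_match(const char *compatible)
{
	size_t n = sizeof(demo_pinctrl_match) / sizeof(demo_pinctrl_match[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(demo_pinctrl_match[i].compatible, compatible) == 0)
			return &demo_pinctrl_match[i];
	return NULL;
}
```

If the rename is merged before 3.14 is released, as Matt expects, the second entry would simply not be needed.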
Re: [PATCH] scripts/gcc-version.sh: handle CC="gcc -m32"
Rusty Russell writes: > Michal Marek writes: >>> gcc: warning: ‘-mcpu=’ is deprecated; use ‘-mtune=’ or ‘-march=’ instead >>> gcc: warning: ‘-mcpu=’ is deprecated; use ‘-mtune=’ or ‘-march=’ instead >>> kernel/bounds.c:1:0: error: CPU you selected does not support x86-64 >>> instruction set >>> /* >>> ^ >>> kernel/bounds.c:1:0: warning: -mregparm is ignored in 64-bit mode [enabled >>> by default] >>> make[1]: *** [kernel/bounds.s] Error 1 >>> make: *** [prepare0] Error 2 Sorry, ignore this report. In case anyone else hits this: it was resolved by installing more 32-bit headers (Ubuntu's libc6-dev-i386 package, in this case). Thanks, Rusty.
Re: [patch] mm: oom_kill: revert 3% system memory bonus for privileged tasks
On Thu, 16 Jan 2014, Johannes Weiner wrote: > > Unfortunately, I think this could potentially be too much of a bonus. On > > your same 32GB machine, if a root process is using 18GB and a user process > > is using 14GB, the user process ends up getting selected while the current > > discount of 3% still selects the root process. > > > > I do like the idea of scaling this bonus depending on points, however. I > > think it would be better if we could scale the discount but also limit it > > to some sane value. > > I just reverted to the /= 4 because we had that for a long time and it > seemed to work. I don't really mind either way as long as we get rid > of that -3%. Do you have a suggestion? >
How about simply using 3% of the root process's points, so that root processes get some bonus compared to non-root processes with the same memory usage and the bonus is scaled to the usage rather than to the amount of available memory? So rather than points /= 4, we do
	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
		points -= (points * 3) / 100;
instead. Sound good?
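To compare the three heuristics discussed in this thread on the 32GB example, here is a small stand-alone sketch. The demo_* helpers are hypothetical and only mirror the arithmetic quoted above (divide by four, subtract 3% of total memory, subtract 3% of the task's own points); they are not the kernel's scoring function, and all values are plain megabytes.

```c
/* All values share one arbitrary unit (megabytes here). */

/* "points /= 4" for privileged tasks. */
static unsigned long demo_points_div4(unsigned long points, int is_root)
{
	return is_root ? points / 4 : points;
}

/* Bonus of 3% of total memory, clamped so the score cannot underflow. */
static unsigned long demo_points_total3(unsigned long points,
					unsigned long total, int is_root)
{
	unsigned long bonus = total * 3 / 100;

	if (!is_root)
		return points;
	return points > bonus ? points - bonus : 0;
}

/* Proposal from the message above: 3% of the task's own points. */
static unsigned long demo_points_own3(unsigned long points, int is_root)
{
	return is_root ? points - (points * 3) / 100 : points;
}
```

For a root task at 18432 (18GB) against a user task at 14336 (14GB) on a 32768 (32GB) machine, the divide-by-four variant scores the root task below the user task, while both 3% variants still score it higher, matching the outcome described in the quoted text.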
Re: [PATCH v9 3/5] qrwlock, x86 - Treat all data type not bigger than long as atomic in x86
On 01/21/2014 07:31 PM, Linus Torvalds wrote: On Tue, Jan 21, 2014 at 8:09 AM, Waiman Long wrote: include/linux/compiler.h:
	#ifndef __native_word
	# ifdef __arch_native_word
	# define __native_word(t) __arch_native_word(t)
	# else
	# define __native_word(t) (sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
	# endif
	#endif
Do we even really need this? I'd suggest removing it entirely. You might want to retain the whole compiletime_assert_atomic_type() thing on purely the alpha side, but then it's all inside just the alpha code, without any need for this "native_word" thing. And if somebody tries to do a "smp_store_release()" on a random structure or union, do we care? We're not some nanny state that wants to give nice warnings for insane code. Linus
That sounds good to me too. Peter, what do you think about this? -Longman
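For readers unfamiliar with what the size test in the quoted macro accepts and rejects, a minimal C11 sketch follows. The demo_native_word name and struct demo_pair are made up for this illustration; the sketch only shows the check itself and takes no side on whether the kernel should keep it, which is the open question in the thread.

```c
/* Local stand-in, named differently so it cannot clash with kernel macros. */
#define demo_native_word(t) \
	(sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))

/* Two longs are always wider than a single int or long. */
struct demo_pair {
	long a;
	long b;
};

/* Compile-time checks: the build fails if any expectation is wrong. */
_Static_assert(demo_native_word(int), "int should count as a native word");
_Static_assert(demo_native_word(long), "long should count as a native word");
_Static_assert(!demo_native_word(struct demo_pair),
	       "a two-long struct should not count as a native word");
```

A check of this kind turns an accidental atomic access to an oversized type into a build error, which is the "nice warning" Linus considers unnecessary outside the alpha code.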
Re: [PATCH] block: Fix memory leak in rw_copy_check_uvector() handling
On Sun, Jan 19 2014, Christian Engelmayer wrote: > Fix a memory leak in the error handling path of function sg_io() > that is used during the processing of scsi ioctl. Memory already > allocated by rw_copy_check_uvector() needs to be freed correctly. > Detected by Coverity: CID 1128953. Applied, thanks. -- Jens Axboe
Re: [PATCH -trivial] mg_disk: Spelling s/finised/finished/
On Tue, Jan 21 2014, Geert Uytterhoeven wrote: > From: Geert Uytterhoeven Applied, thanks. -- Jens Axboe
Re: bio_integrity_verify() bug causing READ verify to be silently skipped
On Tue, Jan 21 2014, Nicholas A. Bellinger wrote: > On Fri, 2014-01-17 at 16:58 -0500, Martin K. Petersen wrote: > > > "nab" == Nicholas A Bellinger writes: > > > > >> That breaks partial completion, though. I'll take a look at Kent's > > >> changes... > > > > nab> Ping..? Any updates on a proper bugfix for this..? > > > > I did put your patch in my queue and have been working on a fix for the > > partial completion case. The latter requires a bit of massaging that > > interferes with other pending changes. > > > > Given that your patch does address a valid issue I'm OK with Jens > > putting it in as is. I'll build upon it for my changes. > > > > > > Jens, are you going to pick this one up, or shall I include it in the > upcoming target-pending/for-next pull request instead..? > > Either way, it needs a CC' to stable for >= v3.10.y. I'll queue it up, thanks. -- Jens Axboe
Re: [RFC PATCH V3 1/2] null_blk: Null pointer deference problem in alloc_page_buffers
On Tue, Jan 21 2014, Raghavendra K T wrote: > If we load the null_blk module with bs=8k we get following oops: > [ 3819.812190] BUG: unable to handle kernel NULL pointer dereference at > 0008 > [ 3819.812387] IP: [] create_empty_buffers+0x28/0xaf > [ 3819.812527] PGD 219244067 PUD 215a06067 PMD 0 > [ 3819.812640] Oops: [#1] SMP > [ 3819.812772] Modules linked in: null_blk(+) > > Fix that by resetting block size to PAGE_SIZE if it is greater than PAGE_SIZE Thanks, applied. -- Jens Axboe
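The fix described in the patch summary amounts to a clamp on the requested block size. The sketch below shows that idea in isolation; DEMO_PAGE_SIZE and demo_clamp_bs() are assumed names with an assumed 4096-byte page, and the actual patch body is not quoted in this message, so this is not a copy of it.

```c
#define DEMO_PAGE_SIZE 4096u	/* assumed page size; real systems vary */

/* Fall back to the page size when the requested block size is larger. */
static unsigned int demo_clamp_bs(unsigned int bs)
{
	return bs > DEMO_PAGE_SIZE ? DEMO_PAGE_SIZE : bs;
}
```

With bs=8k from the report, the clamp would yield 4096, and smaller sizes such as 512 pass through unchanged.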
Re: [GIT] floppy
On Fri, Jan 17 2014, Jiri Kosina wrote: > Jens, > > please consider pulling > > git://git.kernel.org/pub/scm/linux/kernel/git/jikos/linux-block.git for-jens > > into your for-3.14/drivers branch to receive > > Jiri Kosina (1): > floppy: bail out in open() if drive is not responding to block0 read Thanks Jiri, pulled. -- Jens Axboe
Re: ipv4_dst_destroy panic regression after 3.10.15
On Tue, 21 Jan 2014, Alexei Starovoitov wrote: > On Tue, Jan 21, 2014 at 5:39 PM, dormando wrote: > > > > > On Fri, Jan 17, 2014 at 11:16 PM, dormando wrote: > > > >> On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote: > > > >> > On Fri, 2014-01-17 at 17:25 -0800, dormando wrote: > > > >> > > Hi, > > > >> > > > > > >> > > Upgraded a few kernels to the latest 3.10 stable tree while > > > >> > > tracking down > > > >> > > a rare kernel panic, seems to have introduced a much more frequent > > > >> > > kernel > > > >> > > panic. Takes anywhere from 4 hours to 2 days to trigger: > > > >> > > > > > >> > > <4>[196727.311203] general protection fault: [#1] SMP > > > >> > > <4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP > > > >> > > macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich > > > >> > > microcode > ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm > tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp > pps_core mdio > > > >> > > <4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted > > > >> > > 3.10.26 #1 > > > >> > > <4>[196727.311344] Hardware name: Supermicro > > > >> > > X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013 > > > >> > > <4>[196727.311364] task: 885e6f069700 ti: 885e6f072000 > > > >> > > task.ti: 885e6f072000 > > > >> > > <4>[196727.311377] RIP: 0010:[] > > > >> > > [] ipv4_dst_destroy+0x4f/0x80 > > > >> > > <4>[196727.311399] RSP: 0018:885effd23a70 EFLAGS: 00010282 > > > >> > > <4>[196727.311409] RAX: dead00200200 RBX: 8854c398ecc0 > > > >> > > RCX: 0040 > > > >> > > <4>[196727.311423] RDX: dead00100100 RSI: dead00100100 > > > >> > > RDI: dead00200200 > > > >> > > <4>[196727.311437] RBP: 885effd23a80 R08: 815fd9e0 > > > >> > > R09: 885d5a590800 > > > >> > > <4>[196727.311451] R10: R11: > > > >> > > R12: > > > >> > > <4>[196727.311464] R13: 81c8c280 R14: > > > >> > > R15: 880e85ee16ce > > > >> > > <4>[196727.311510] FS: () > > > >> > > GS:885effd2() 
knlGS: > > > >> > > <4>[196727.311554] CS: 0010 DS: ES: CR0: > > > >> > > 80050033 > > > >> > > <4>[196727.311581] CR2: 7a46751eb000 CR3: 005e65688000 > > > >> > > CR4: 000407e0 > > > >> > > <4>[196727.311625] DR0: DR1: > > > >> > > DR2: > > > >> > > <4>[196727.311669] DR3: DR6: 0ff0 > > > >> > > DR7: 0400 > > > >> > > <4>[196727.311713] Stack: > > > >> > > <4>[196727.311733] 8854c398ecc0 8854c398ecc0 > > > >> > > 885effd23ab0 815b7f42 > > > >> > > <4>[196727.311784] 88be6595bc00 8854c398ecc0 > > > >> > > 8854c398ecc0 > > > >> > > <4>[196727.311834] 885effd23ad0 815b86c6 > > > >> > > 885d5a590800 8816827821c0 > > > >> > > <4>[196727.311885] Call Trace: > > > >> > > <4>[196727.311907] > > > >> > > <4>[196727.311912] [] dst_destroy+0x32/0xe0 > > > >> > > <4>[196727.311959] [] dst_release+0x56/0x80 > > > >> > > <4>[196727.311986] [] tcp_v4_do_rcv+0x2a5/0x4a0 > > > >> > > <4>[196727.312013] [] tcp_v4_rcv+0x7da/0x820 > > > >> > > <4>[196727.312041] [] ? > > > >> > > ip_rcv_finish+0x360/0x360 > > > >> > > <4>[196727.312070] [] ? nf_hook_slow+0x7d/0x150 > > > >> > > <4>[196727.312097] [] ? > > > >> > > ip_rcv_finish+0x360/0x360 > > > >> > > <4>[196727.312125] [] > > > >> > > ip_local_deliver_finish+0xb2/0x230 > > > >> > > <4>[196727.312154] [] ip_local_deliver+0x4a/0x90 > > > >> > > <4>[196727.312183] [] ip_rcv_finish+0x119/0x360 > > > >> > > <4>[196727.312212] [] ip_rcv+0x22b/0x340 > > > >> > > <4>[196727.312242] [] ? > > > >> > > macvlan_broadcast+0x160/0x160 [macvlan] > > > >> > > <4>[196727.312275] [] > > > >> > > __netif_receive_skb_core+0x512/0x640 > > > >> > > <4>[196727.312308] [] ? > > > >> > > kmem_cache_alloc+0x13b/0x150 > > > >> > > <4>[196727.312338] [] > > > >> > > __netif_receive_skb+0x21/0x70 > > > >> > > <4>[196727.312368] [] > > > >> > > netif_receive_skb+0x31/0xa0 > > > >> > > <4>[196727.312397] [] > > > >> > > napi_gro_receive+0xe8/0x140 > > > >> > > <4>[196727.312433] [] ixgbe_poll+0x551/0x11f0 > > > >> > > [ixgbe] > > > >> > > <4>[196727.312463] [] ? 
ip_rcv+0x22b/0x340 > > > >> > > <4>[196727.312491] [] net_rx_action+0x111/0x210 > > > >> > > <4>[196727.312521] [] ? > > > >> > > __netif_receive_skb+0x21/0x70 > > > >> > > <4>[196727.312552] [] __do_softirq+0xd0/0x270 > > > >> > > <4>[196727.312583] [] call_softirq+0x1c/0x30 > > > >> > > <4>[196727.312613] [] do_softirq+0x55/0x90 > > > >> > > <4>[196727.312640] [] irq_exit+0x55/0x60 > > > >> > > <4>[196727.312668] [] do_IRQ+0x63/0xe0 > > > >> > > <4>[196727.312696] [] common_interrupt+0x6a/0x6a > > > >> > > <4>[196727.312722]
Re: linux rdma 3.14 merge plans
On Wed, Jan 22, 2014 at 2:43 AM, Roland Dreier wrote: > On Tue, Jan 21, 2014 at 2:00 PM, Or Gerlitz wrote: >> Roland, ping! the signature patches were posted > three months ago. We >> deserve a response from the maintainer that goes beyond "I need to >> think on that". >> >> Responsiveness was stated by Linus to be the #1 requirement from >> kernel maintainers. > > Or, I'm not sure what response you're after from me. Roland, what I am after is a r-e-s-p-o-n-s-e from you, and let it contain what ever justified and/or unjustified mud as below. We posted the V0 series on Oct 15 2013 and since that time not a word from you, except for an "I need to think on that" comment last week after we nudged million times. You can't leave us clueless in the air for whole three months without any concrete or unconcrete comment. There's no way to carry kernel development like that. I am old enough to hear and face "no" and "wTF is this" or "yTF you do it this way" etc etc, this happened few times with e.g with networking patches we sent and we either improved things or did them differently or whatever needed to be done. There's no way on earth to face plain ignoring of your work, and this is what happens here. I had no way to get your below response except for going to LKML, why? > Linus has also said that maintainers should say "no" a lot more > (http://lwn.net/Articles/571995/) so maybe you want me to say, "No, I > won't merge this patch set, since it adds a bunch of complexity to > support a feature no one really cares about." Is that it? 
(And yes I > am skeptical about this stuff — I work at an enterprise storage > company and even here it's hard to find anyone who cares about > DIF/DIX, especially offload features that stop it from being > end-to-end) > > I'm sure you're not expecting me to say, "Sure, I'll merge it without > understanding the problem it's solving or how it's doing that," > especially given your recent history of pushing me to merge stuff > like the IP-RoCE patches back when they broke the userspace ABI. > > I'd really rather spend my time on something actually useful like > cleaning up softroce. > > - R.
Re: uninline rcu_lock_acquire/etc ?
On Tue, Jan 21, 2014 at 08:39:09PM +0100, Oleg Nesterov wrote: > On 01/21, Oleg Nesterov wrote: > > > > But I agreed that the code looks simpler with bitfields, so perhaps > > this patch is better. > > Besides, I guess the major offender is rcu... > > Paul, can't we do something like below? Saves 19.5 kilobytes, > > - 5255131 2974376 10125312183548191181283 vmlinux > + 5235227 2970344 1012531218330883117b503 vmlinux > > probably we can also uninline rcu_lockdep_assert()... Looks mostly plausible, some questions inline below. Thanx, Paul > Oleg. > --- > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h > index 2eef290..58f7a97 100644 > --- a/include/linux/rcupdate.h > +++ b/include/linux/rcupdate.h > @@ -310,18 +310,34 @@ static inline bool rcu_lockdep_current_cpu_online(void) > } > #endif /* #else #if defined(CONFIG_HOTPLUG_CPU) && defined(CONFIG_PROVE_RCU) > */ > > -#ifdef CONFIG_DEBUG_LOCK_ALLOC > - > -static inline void rcu_lock_acquire(struct lockdep_map *map) > +static inline void __rcu_lock_acquire(struct lockdep_map *map, unsigned long > ip) > { > - lock_acquire(map, 0, 0, 2, 0, NULL, _THIS_IP_); > + lock_acquire(map, 0, 0, 2, 0, NULL, ip); > } > > -static inline void rcu_lock_release(struct lockdep_map *map) > +static inline void __rcu_lock_release(struct lockdep_map *map, unsigned long > ip) > { > lock_release(map, 1, _THIS_IP_); > } > > +#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PROVE_RCU) > +extern void rcu_lock_acquire(void); > +extern void rcu_lock_release(void); > +extern void rcu_lock_acquire_bh(void); > +extern void rcu_lock_release_bh(void); > +extern void rcu_lock_acquire_sched(void); > +extern void rcu_lock_release_sched(void); > +#else > +#define rcu_lock_acquire() do { } while (0) > +#define rcu_lock_release() do { } while (0) > +#define rcu_lock_acquire_bh()do { } while (0) > +#define rcu_lock_release_bh()do { } while (0) > +#define rcu_lock_acquire_sched() do { } while (0) > +#define rcu_lock_release_sched() do { 
} while (0) > +#endif > + > +#ifdef CONFIG_DEBUG_LOCK_ALLOC > + > extern struct lockdep_map rcu_lock_map; > extern struct lockdep_map rcu_bh_lock_map; > extern struct lockdep_map rcu_sched_lock_map; > @@ -419,9 +435,6 @@ static inline int rcu_read_lock_sched_held(void) > > #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */ > > -# define rcu_lock_acquire(a) do { } while (0) > -# define rcu_lock_release(a) do { } while (0) > - > static inline int rcu_read_lock_held(void) > { > return 1; > @@ -766,11 +779,9 @@ static inline void rcu_preempt_sleep_check(void) > */ > static inline void rcu_read_lock(void) > { > - __rcu_read_lock(); > __acquire(RCU); > - rcu_lock_acquire(&rcu_lock_map); > - rcu_lockdep_assert(rcu_is_watching(), > -"rcu_read_lock() used illegally while idle"); > + __rcu_read_lock(); > + rcu_lock_acquire(); Not sure why __rcu_read_lock() needs to be in any particular order with respect to the sparse __acquire(RCU), but should work either way. Same question about the other reorderings of similar statements. 
> } > > /* > @@ -790,11 +801,9 @@ static inline void rcu_read_lock(void) > */ > static inline void rcu_read_unlock(void) > { > - rcu_lockdep_assert(rcu_is_watching(), > -"rcu_read_unlock() used illegally while idle"); > - rcu_lock_release(&rcu_lock_map); > - __release(RCU); > + rcu_lock_release(); > __rcu_read_unlock(); > + __release(RCU); > } > > /** > @@ -816,11 +825,9 @@ static inline void rcu_read_unlock(void) > */ > static inline void rcu_read_lock_bh(void) > { > - local_bh_disable(); > __acquire(RCU_BH); > - rcu_lock_acquire(&rcu_bh_lock_map); > - rcu_lockdep_assert(rcu_is_watching(), > -"rcu_read_lock_bh() used illegally while idle"); > + local_bh_disable(); > + rcu_lock_acquire_bh(); > } > > /* > @@ -830,11 +837,9 @@ static inline void rcu_read_lock_bh(void) > */ > static inline void rcu_read_unlock_bh(void) > { > - rcu_lockdep_assert(rcu_is_watching(), > -"rcu_read_unlock_bh() used illegally while idle"); > - rcu_lock_release(&rcu_bh_lock_map); > - __release(RCU_BH); > + rcu_lock_release_bh(); > local_bh_enable(); > + __release(RCU_BH); > } > > /** > @@ -852,9 +857,9 @@ static inline void rcu_read_unlock_bh(void) > */ > static inline void rcu_read_lock_sched(void) > { > - preempt_disable(); > __acquire(RCU_SCHED); > - rcu_lock_acquire(&rcu_sched_lock_map); > + preempt_disable(); > + rcu_lock_acquire_sched(); > rcu_lockdep_assert(rcu_is_watching(), > "rcu_read_lock_sched() used illegally while idle"); The above pair of l
[GIT PULL] audit subsystem for 3.14
Linus,

Please consider pulling the following audit changes. Again we stayed pretty well contained inside the audit system. Venturing out was fixing a couple of function prototypes which were inconsistent (didn't hurt anything, but we used the same value as an int, uint, u32, and I think even a long in a couple of places). We also made a couple of minor changes to when a couple of LSMs called the audit system. We hoped to add aarch64 audit support this go round, but it wasn't ready.

There is one merge issue. Take your code, then convert the prototype for the first 4 functions changing the "u32 ses" to "unsigned int ses". (Do not change the u32 secid)

I'm disappearing on vacation on Thursday. I should have internet access, but it'll be spotty. If anything goes wrong please be sure to cc r...@redhat.com. He'll make fixing things his top priority.

-Eric

The following changes since commit fc582aef7dcc27a7120cf232c1e76c569c7b6eab:

  Merge tag 'v3.12' (2013-11-22 18:57:54 -0500)

are available in the git repository at:

  git://git.infradead.org/users/eparis/audit.git master

for you to fetch changes up to f3411cb2b2e396a41ed3a439863f028db7140a34:

  audit: whitespace fix in kernel-parameters.txt (2014-01-17 17:15:02 -0500)

AKASHI Takahiro (2):
      audit: correct a type mismatch in audit_syscall_exit()
      audit: Modify a set of system calls in audit class definitions

Dan Duval (2):
      audit: efficiency fix 1: only wake up if queue shorter than backlog limit
      audit: efficiency fix 2: request exclusive wait since all need same resource

Eric Paris (8):
      audit: convert all sessionid declaration to unsigned int
      audit: wait_for_auditd rework for readability
      audit: documentation of audit= kernel parameter
      audit: use define's for audit version
      audit: remove needless switch in AUDIT_SET
      audit: rework AUDIT_TTY_SET to only grab spin_lock once
      audit: reorder AUDIT_TTY_SET arguments
      audit: remove pr_info for every network namespace

Eric W. Biederman (1):
      audit: Simplify and correct audit_log_capset

Gao feng (7):
      audit: remove useless code in audit_enable
      audit: fix incorrect order of log new and old feature
      audit: don't generate audit feature changed log when audit disabled
      audit: use old_lock in audit_set_feature
      audit: don't generate loginuid log when audit disabled
      audit: print error message when fail to create audit socket
      audit: fix incorrect set of audit_sock

Joe Perches (3):
      audit: Use hex_byte_pack_upper
      audit: Use more current logging style
      audit: Convert int limit uses to u32

Paul Davies C (2):
      audit: drop audit_log_abend()
      audit: Added exe field to audit core dump signal log

Richard Guy Briggs (24):
      audit: fix netlink portid naming and types
      audit: restore order of tty and ses fields in log output
      audit: listen in all network namespaces
      audit: reset audit backlog wait time after error recovery
      audit: make use of remaining sleep time from wait_for_auditd
      documentation: document the audit= kernel start-up parameter
      audit: add kernel set-up parameter to override default backlog limit
      audit: clean up AUDIT_GET/SET local variables and future-proof API
      audit: add audit_backlog_wait_time configuration option
      audit: fix incorrect type of sessionid
      audit: allow unlimited backlog queue
      audit: get rid of *NO* daemon at audit_pid=0 message
      audit: log AUDIT_TTY_SET config changes
      audit: refactor audit_receive_msg() to clarify AUDIT_*_RULE* cases
      audit: prevent an older auditd shutdown from orphaning a newer auditd startup
      selinux: call WARN_ONCE() instead of calling audit_log_start()
      smack: call WARN_ONCE() instead of calling audit_log_start()
      audit: drop audit_cmd_lock in AUDIT_USER family of cases
      audit: log on errors from filter user rules
      audit: fix dangling keywords in audit_log_set_loginuid() output
      audit: log task info on feature change
      audit: update MAINTAINERS
      audit: fix location of __net_initdata for audit_net_ops
      audit: whitespace fix in kernel-parameters.txt

Toshiyuki Okajima (1):
      audit: audit_log_start running on auditd should not stop

 Documentation/kernel-parameters.txt     |  16 ++
 MAINTAINERS                             |   3 +-
 drivers/tty/tty_audit.c                 |   2 +-
 include/asm-generic/audit_change_attr.h |   4 +-
 include/asm-generic/audit_write.h       |   6 +++
 include/linux/audit.h                   |  22
 include/linux/init_task.h               |   2 +-
 include/net/netlabel.h                  |   2 +-
 include/net/xfrm.h                      |  20 +++
 include/uapi/linux/audit.h              |   8 +++
 kernel/audit.c                          | 365
[PATCH v5 8/8] ARM: brcmstb: dts: add a reference DTS for Broadcom 7445
Add a sample DTS which will allow bootup of a board populated with the BCM7445 chip.

Signed-off-by: Marc Carino
Acked-by: Florian Fainelli
---
 arch/arm/boot/dts/bcm7445.dts |  111 +
 1 files changed, 111 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/boot/dts/bcm7445.dts

diff --git a/arch/arm/boot/dts/bcm7445.dts b/arch/arm/boot/dts/bcm7445.dts
new file mode 100644
index 000..ffa3305
--- /dev/null
+++ b/arch/arm/boot/dts/bcm7445.dts
@@ -0,0 +1,111 @@
+/dts-v1/;
+/include/ "skeleton.dtsi"
+
+/ {
+	#address-cells = <2>;
+	#size-cells = <2>;
+	model = "Broadcom STB (bcm7445)";
+	compatible = "brcm,bcm7445", "brcm,brcmstb";
+	interrupt-parent = <&gic>;
+
+	chosen {};
+
+	memory {
+		device_type = "memory";
+		reg = <0x00 0x 0x00 0x4000>,
+		      <0x00 0x4000 0x00 0x4000>,
+		      <0x00 0x8000 0x00 0x4000>;
+	};
+
+	cpus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		cpu@0 {
+			compatible = "brcm,brahma-b15";
+			device_type = "cpu";
+			reg = <0>;
+		};
+
+		cpu@1 {
+			compatible = "brcm,brahma-b15";
+			device_type = "cpu";
+			reg = <1>;
+		};
+
+		cpu@2 {
+			compatible = "brcm,brahma-b15";
+			device_type = "cpu";
+			reg = <2>;
+		};
+
+		cpu@3 {
+			compatible = "brcm,brahma-b15";
+			device_type = "cpu";
+			reg = <3>;
+		};
+	};
+
+	gic: interrupt-controller@ffd0 {
+		compatible = "brcm,brahma-b15-gic", "arm,cortex-a15-gic";
+		reg = <0x00 0xffd01000 0x00 0x1000>,
+		      <0x00 0xffd02000 0x00 0x2000>,
+		      <0x00 0xffd04000 0x00 0x2000>,
+		      <0x00 0xffd06000 0x00 0x2000>;
+		interrupt-controller;
+		#interrupt-cells = <3>;
+	};
+
+	timer {
+		compatible = "arm,armv7-timer";
+		interrupts = <1 13 0xf08>,
+			     <1 14 0xf08>,
+			     <1 11 0xf08>,
+			     <1 10 0xf08>;
+	};
+
+	rdb {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		compatible = "simple-bus";
+		ranges = <0 0x00 0xf000 0x100>;
+
+		serial@406b00 {
+			compatible = "ns16550a";
+			reg = <0x406b00 0x20>;
+			reg-shift = <2>;
+			reg-io-width = <4>;
+			interrupts = <0 75 0x4>;
+			clock-frequency = <0x4d3f640>;
+		};
+
+		sun_top_ctrl: syscon@404000 {
+			compatible = "brcm,bcm7445-sun-top-ctrl",
+				     "syscon";
+			reg = <0x404000 0x51c>;
+		};
+
+		hif_cpubiuctrl: syscon@3e2400 {
+			compatible = "brcm,bcm7445-hif-cpubiuctrl",
+				     "syscon";
+			reg = <0x3e2400 0x5b4>;
+		};
+
+		hif_continuation: syscon@452000 {
+			compatible = "brcm,bcm7445-hif-continuation",
+				     "syscon";
+			reg = <0x452000 0x100>;
+		};
+	};
+
+	smpboot {
+		compatible = "brcm,brcmstb-smpboot";
+		syscon-cpu = <&hif_cpubiuctrl 0x88 0x178>;
+		syscon-cont = <&hif_continuation>;
+	};
+
+	reboot {
+		compatible = "brcm,brcmstb-reboot";
+		syscon = <&sun_top_ctrl 0x304 0x308>;
+	};
+};
--
1.7.1
[PATCH v5 2/8] power: reset: Add reboot driver for brcmstb
Add support for reboot functionality on boards with ARM-based Broadcom STB chipsets. Signed-off-by: Marc Carino --- drivers/power/reset/Kconfig | 10 +++ drivers/power/reset/Makefile |1 + drivers/power/reset/brcmstb-reboot.c | 120 ++ 3 files changed, 131 insertions(+), 0 deletions(-) create mode 100644 drivers/power/reset/brcmstb-reboot.c diff --git a/drivers/power/reset/Kconfig b/drivers/power/reset/Kconfig index 9b3ea53..31b468b 100644 --- a/drivers/power/reset/Kconfig +++ b/drivers/power/reset/Kconfig @@ -6,6 +6,16 @@ menuconfig POWER_RESET Say Y here to enable board reset and power off +config POWER_RESET_BRCMSTB + bool "Broadcom STB reset driver" + depends on POWER_RESET && ARCH_BRCMSTB + help + This driver provides restart support for ARM-based Broadcom STB + boards. + + Say Y here if you have an ARM-based Broadcom STB board and you wish + to have restart support. + config POWER_RESET_GPIO bool "GPIO power-off driver" depends on OF_GPIO && POWER_RESET diff --git a/drivers/power/reset/Makefile b/drivers/power/reset/Makefile index 3e6ed88..806d056 100644 --- a/drivers/power/reset/Makefile +++ b/drivers/power/reset/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_POWER_RESET_QNAP) += qnap-poweroff.o obj-$(CONFIG_POWER_RESET_RESTART) += restart-poweroff.o obj-$(CONFIG_POWER_RESET_VEXPRESS) += vexpress-poweroff.o obj-$(CONFIG_POWER_RESET_XGENE) += xgene-reboot.o +obj-$(CONFIG_POWER_RESET_BRCMSTB) += brcmstb-reboot.o diff --git a/drivers/power/reset/brcmstb-reboot.c b/drivers/power/reset/brcmstb-reboot.c new file mode 100644 index 000..3f23692 --- /dev/null +++ b/drivers/power/reset/brcmstb-reboot.c @@ -0,0 +1,120 @@ +/* + * Copyright (C) 2013 Broadcom Corporation + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation version 2. 
+ * + * This program is distributed "as is" WITHOUT ANY WARRANTY of any + * kind, whether express or implied; without even the implied warranty + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#define RESET_SOURCE_ENABLE_REG 1 +#define SW_MASTER_RESET_REG 2 + +static struct regmap *regmap; +static u32 rst_src_en; +static u32 sw_mstr_rst; + +static void brcmstb_reboot(enum reboot_mode mode, const char *cmd) +{ + int rc; + u32 tmp; + + rc = regmap_write(regmap, rst_src_en, 1); + if (rc) { + pr_err("failed to write rst_src_en (%d)\n", rc); + return; + } + + rc = regmap_read(regmap, rst_src_en, &tmp); + if (rc) { + pr_err("failed to read rst_src_en (%d)\n", rc); + return; + } + + rc = regmap_write(regmap, sw_mstr_rst, 1); + if (rc) { + pr_err("failed to write sw_mstr_rst (%d)\n", rc); + return; + } + + rc = regmap_read(regmap, sw_mstr_rst, &tmp); + if (rc) { + pr_err("failed to read sw_mstr_rst (%d)\n", rc); + return; + } + + while (1) + ; +} + +static int brcmstb_reboot_probe(struct platform_device *pdev) +{ + int rc; + struct device_node *np = pdev->dev.of_node; + + regmap = syscon_regmap_lookup_by_phandle(np, "syscon"); + if (IS_ERR(regmap)) { + pr_err("failed to get syscon phandle\n"); + return -EINVAL; + } + + rc = of_property_read_u32_index(np, "syscon", RESET_SOURCE_ENABLE_REG, + &rst_src_en); + if (rc) { + pr_err("can't get rst_src_en offset (%d)\n", rc); + return -EINVAL; + } + + rc = of_property_read_u32_index(np, "syscon", SW_MASTER_RESET_REG, + &sw_mstr_rst); + if (rc) { + pr_err("can't get sw_mstr_rst offset (%d)\n", rc); + return -EINVAL; + } + + arm_pm_restart = brcmstb_reboot; + + return 0; +} + +static const struct of_device_id of_match[] = { + { .compatible = "brcm,brcmstb-reboot", }, + {}, +}; + +static struct 
platform_driver brcmstb_reboot_driver = { + .probe = brcmstb_reboot_probe, + .driver = { + .name = "brcmstb-reboot", + .owner = THIS_MODULE, + .of_match_table = of_match, + }, +}; + +static int __init brcmstb_reboot_init(void) +{ + return platform_driver_probe(&brcmstb_reboot_driver, + brcmstb_reboot_probe); +} +subsys_initcall(brcmstb_reboot_init); -- 1.7.1 -- To unsubscribe from this list: sen
[PATCH v5 1/8] ARM: brcmstb: add infrastructure for ARM-based Broadcom STB SoCs
The BCM7xxx series of Broadcom SoCs are used primarily in set-top boxes. This patch adds machine support for the ARM-based Broadcom SoCs. Signed-off-by: Marc Carino Acked-by: Florian Fainelli --- arch/arm/configs/multi_v7_defconfig |1 + arch/arm/mach-bcm/Kconfig | 14 ++ arch/arm/mach-bcm/Makefile |4 + arch/arm/mach-bcm/brcmstb.c | 110 arch/arm/mach-bcm/brcmstb.h | 38 arch/arm/mach-bcm/headsmp-brcmstb.S | 34 arch/arm/mach-bcm/hotplug-brcmstb.c | 334 +++ 7 files changed, 535 insertions(+), 0 deletions(-) create mode 100644 arch/arm/mach-bcm/brcmstb.c create mode 100644 arch/arm/mach-bcm/brcmstb.h create mode 100644 arch/arm/mach-bcm/headsmp-brcmstb.S create mode 100644 arch/arm/mach-bcm/hotplug-brcmstb.c diff --git a/arch/arm/configs/multi_v7_defconfig b/arch/arm/configs/multi_v7_defconfig index c1df4e9..7028d11 100644 --- a/arch/arm/configs/multi_v7_defconfig +++ b/arch/arm/configs/multi_v7_defconfig @@ -7,6 +7,7 @@ CONFIG_MACH_ARMADA_370=y CONFIG_MACH_ARMADA_XP=y CONFIG_ARCH_BCM=y CONFIG_ARCH_BCM_MOBILE=y +CONFIG_ARCH_BRCMSTB=y CONFIG_GPIO_PCA953X=y CONFIG_ARCH_HIGHBANK=y CONFIG_ARCH_KEYSTONE=y diff --git a/arch/arm/mach-bcm/Kconfig b/arch/arm/mach-bcm/Kconfig index 9fe6d88..2c1ae83 100644 --- a/arch/arm/mach-bcm/Kconfig +++ b/arch/arm/mach-bcm/Kconfig @@ -31,6 +31,20 @@ config ARCH_BCM_MOBILE BCM11130, BCM11140, BCM11351, BCM28145 and BCM28155 variants. +config ARCH_BRCMSTB + bool "Broadcom BCM7XXX based boards" if ARCH_MULTI_V7 + depends on MMU + select ARM_GIC + select MIGHT_HAVE_PCI + select HAVE_SMP + select HAVE_ARM_ARCH_TIMER + help + Say Y if you intend to run the kernel on a Broadcom ARM-based STB + chipset. + + This enables support for Broadcom ARM-based set-top box chipsets, + including the 7445 family of chips. 
+ endmenu endif diff --git a/arch/arm/mach-bcm/Makefile b/arch/arm/mach-bcm/Makefile index c2ccd5a..b744a12 100644 --- a/arch/arm/mach-bcm/Makefile +++ b/arch/arm/mach-bcm/Makefile @@ -13,3 +13,7 @@ obj-$(CONFIG_ARCH_BCM_MOBILE) := board_bcm281xx.o bcm_kona_smc.o bcm_kona_smc_asm.o kona.o plus_sec := $(call as-instr,.arch_extension sec,+sec) AFLAGS_bcm_kona_smc_asm.o :=-Wa,-march=armv7-a$(plus_sec) + +obj-$(CONFIG_ARCH_BRCMSTB) := brcmstb.o +obj-$(CONFIG_SMP) += headsmp-brcmstb.o +obj-$(CONFIG_HOTPLUG_CPU) += hotplug-brcmstb.o diff --git a/arch/arm/mach-bcm/brcmstb.c b/arch/arm/mach-bcm/brcmstb.c new file mode 100644 index 000..7a6093d --- /dev/null +++ b/arch/arm/mach-bcm/brcmstb.c @@ -0,0 +1,110 @@ +/* + * Copyright (C) 2013 Broadcom Corporation + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation version 2. + * + * This program is distributed "as is" WITHOUT ANY WARRANTY of any + * kind, whether express or implied; without even the implied warranty + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "brcmstb.h" + +/*** + * STB CPU (main application processor) + ***/ + +static const char *brcmstb_match[] __initconst = { + "brcm,bcm7445", + "brcm,brcmstb", + NULL +}; + +static void __init brcmstb_init_early(void) +{ + add_preferred_console("ttyS", 0, "115200"); +} + +/*** + * SMP boot + ***/ + +#ifdef CONFIG_SMP +static DEFINE_SPINLOCK(boot_lock); + +static void __cpuinit brcmstb_secondary_init(unsigned int cpu) +{ + /* +* Synchronise with the boot thread. 
+*/ + spin_lock(&boot_lock); + spin_unlock(&boot_lock); +} + +static int __cpuinit brcmstb_boot_secondary(unsigned int cpu, + struct task_struct *idle) +{ + /* +* set synchronisation state between this boot processor +* and the secondary one +*/ + spin_lock(&boot_lock); + + /* Bring up power to the core if necessary */ + if (brcmstb_cpu_get_power_state(cpu) == 0) + brcmstb_cpu_power_on(cpu); + + brcmstb_cpu_boot(cpu); + + /* +* now the secondary core is starting up let it run its +* calibrations, then wait for it to finish +*/ + spin_unlock(&boot_lock); + + return 0; +} +
[PATCH v5 5/8] ARM: brcmstb: add CPU binding for Broadcom Brahma15
Add the Broadcom Brahma B15 CPU to the DT CPU binding list.

Signed-off-by: Marc Carino
Acked-by: Florian Fainelli
---
 Documentation/devicetree/bindings/arm/cpus.txt |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt
index 9130435..0cd1e25 100644
--- a/Documentation/devicetree/bindings/arm/cpus.txt
+++ b/Documentation/devicetree/bindings/arm/cpus.txt
@@ -163,6 +163,7 @@ nodes to be present and contain the properties described below.
 			"arm,cortex-r4"
 			"arm,cortex-r5"
 			"arm,cortex-r7"
+			"brcm,brahma-b15"
 			"faraday,fa526"
 			"intel,sa110"
 			"intel,sa1100"
--
1.7.1
[PATCH v5 6/8] ARM: brcmstb: add misc. DT bindings for brcmstb
Document the bindings that the Broadcom STB platform needs for proper bootup.

Signed-off-by: Marc Carino
Acked-by: Florian Fainelli
---
 .../devicetree/bindings/arm/brcm-brcmstb.txt |   95
 1 files changed, 95 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/brcm-brcmstb.txt

diff --git a/Documentation/devicetree/bindings/arm/brcm-brcmstb.txt b/Documentation/devicetree/bindings/arm/brcm-brcmstb.txt
new file mode 100644
index 000..3c436cc
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/brcm-brcmstb.txt
@@ -0,0 +1,95 @@
+ARM Broadcom STB platforms Device Tree Bindings
+---
+Boards with Broadcom Brahma15 ARM-based BCM (generally BCM7xxx variants)
+SoC shall have the following DT organization:
+
+Required root node properties:
+- compatible: "brcm,bcm", "brcm,brcmstb"
+
+example:
+/ {
+    #address-cells = <2>;
+    #size-cells = <2>;
+    model = "Broadcom STB (bcm7445)";
+    compatible = "brcm,bcm7445", "brcm,brcmstb";
+
+Further, syscon nodes that map platform-specific registers used for general
+system control is required:
+
+- compatible: "brcm,bcm-sun-top-ctrl", "syscon"
+- compatible: "brcm,bcm-hif-cpubiuctrl", "syscon"
+- compatible: "brcm,bcm-hif-continuation", "syscon"
+
+example:
+rdb {
+    #address-cells = <1>;
+    #size-cells = <1>;
+    compatible = "simple-bus";
+    ranges = <0 0x00 0xf000 0x100>;
+
+    sun_top_ctrl: syscon@404000 {
+        compatible = "brcm,bcm7445-sun-top-ctrl", "syscon";
+        reg = <0x404000 0x51c>;
+    };
+
+    hif_cpubiuctrl: syscon@3e2400 {
+        compatible = "brcm,bcm7445-hif-cpubiuctrl", "syscon";
+        reg = <0x3e2400 0x5b4>;
+    };
+
+    hif_continuation: syscon@452000 {
+        compatible = "brcm,bcm7445-hif-continuation", "syscon";
+        reg = <0x452000 0x100>;
+    };
+};
+
+Lastly, nodes that allow for support of SMP initialization and reboot are
+required:
+
+smpboot
+---
+Required properties:
+
+- compatible
+    The string "brcm,brcmstb-smpboot".
+
+- syscon-cpu
+    A phandle / integer array property which lets the BSP know the location
+    of certain CPU power-on registers.
+
+    The layout of the property is as follows:
+        o a phandle to the "hif_cpubiuctrl" syscon node
+        o offset to the base CPU power zone register
+        o offset to the base CPU reset register
+
+- syscon-cont
+    A phandle pointing to the syscon node which describes the CPU boot
+    continuation registers.
+        o a phandle to the "hif_continuation" syscon node
+
+example:
+smpboot {
+    compatible = "brcm,brcmstb-smpboot";
+    syscon-cpu = <&hif_cpubiuctrl 0x88 0x178>;
+    syscon-cont = <&hif_continuation>;
+};
+
+reboot
+---
+Required properties
+
+- compatible
+    The string property "brcm,brcmstb-reboot".
+
+- syscon
+    A phandle / integer array that points to the syscon node which describes
+    the general system reset registers.
+        o a phandle to "sun_top_ctrl"
+        o offset to the "reset source enable" register
+        o offset to the "software master reset" register
+
+example:
+reboot {
+    compatible = "brcm,brcmstb-reboot";
+    syscon = <&sun_top_ctrl 0x304 0x308>;
+};
--
1.7.1
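For readers following the binding, the layout of the reboot node's "syscon" property (one phandle cell followed by two register offsets) can be sketched in plain user-space C. This is only an illustration, not the driver's code: read_u32_index() is a hypothetical stand-in for the kernel's of_property_read_u32_index(), parse_reboot_syscon() is an invented helper name, and the cell array is assumed to hold the raw property cells.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cell positions inside the reboot node's "syscon" property. */
enum {
	SYSCON_PHANDLE_CELL     = 0, /* phandle to "sun_top_ctrl" */
	RESET_SOURCE_ENABLE_REG = 1, /* offset of "reset source enable" */
	SW_MASTER_RESET_REG     = 2, /* offset of "software master reset" */
};

/*
 * Hypothetical user-space stand-in for of_property_read_u32_index():
 * copies cell 'index' into *out, or returns -1 if the property is too short.
 */
static int read_u32_index(const uint32_t *cells, size_t ncells,
			  size_t index, uint32_t *out)
{
	assert(cells != NULL && out != NULL);
	if (index >= ncells)
		return -1;
	*out = cells[index];
	return 0;
}

/* Extract both register offsets; 0 on success, -1 on a short property. */
static int parse_reboot_syscon(const uint32_t *cells, size_t ncells,
			       uint32_t *rst_src_en, uint32_t *sw_mstr_rst)
{
	if (read_u32_index(cells, ncells, RESET_SOURCE_ENABLE_REG, rst_src_en))
		return -1;
	if (read_u32_index(cells, ncells, SW_MASTER_RESET_REG, sw_mstr_rst))
		return -1;
	return 0;
}
```

Given the cells { 0x1, 0x304, 0x308 } (the phandle value 0x1 is made up), the helper yields 0x304 and 0x308, which matches the example reboot node in the binding. The phandle cell itself is skipped here because resolving it to a regmap is the job of the syscon lookup, not of the offset parsing.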
[PATCH v5 3/8] ARM: brcmstb: add debug UART for earlyprintk support
Add the UART definitions needed to support earlyprintk on brcmstb machines.

Signed-off-by: Marc Carino
Acked-by: Florian Fainelli
---
 arch/arm/Kconfig.debug |   16 +++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
index 5765abf..666afd7 100644
--- a/arch/arm/Kconfig.debug
+++ b/arch/arm/Kconfig.debug
@@ -94,6 +94,17 @@ choice
 		depends on ARCH_BCM2835
 		select DEBUG_UART_PL01X
 
+	config DEBUG_BRCMSTB_UART
+		bool "Use BRCMSTB UART for low-level debug"
+		depends on ARCH_BRCMSTB
+		select DEBUG_UART_8250
+		help
+		  Say Y here if you want the debug print routines to direct
+		  their output to the first serial port on these devices.
+
+		  If you have a Broadcom STB chip and would like early print
+		  messages to appear over the UART, select this option.
+
 	config DEBUG_CLPS711X_UART1
 		bool "Kernel low-level debugging messages via UART1"
 		depends on ARCH_CLPS711X
@@ -1008,6 +1019,7 @@ config DEBUG_UART_PHYS
 	default 0xd4018000 if DEBUG_MMP_UART3
 	default 0xe000 if ARCH_SPEAR13XX
 	default 0xfbe0 if ARCH_EBSA110
+	default 0xf0406b00 if DEBUG_BRCMSTB_UART
 	default 0xf1012000 if DEBUG_MVEBU_UART_ALTERNATE
 	default 0xf1012000 if ARCH_DOVE || ARCH_KIRKWOOD || ARCH_MV78XX0 || \
 				ARCH_ORION5X
@@ -1040,6 +1052,7 @@ config DEBUG_UART_VIRT
 	default 0xf809 if DEBUG_VEXPRESS_UART0_RS1
 	default 0xfb009000 if DEBUG_REALVIEW_STD_PORT
 	default 0xfb10c000 if DEBUG_REALVIEW_PB1176_PORT
+	default 0xfc406b00 if DEBUG_BRCMSTB_UART
 	default 0xfd00 if ARCH_SPEAR3XX || ARCH_SPEAR6XX
 	default 0xfd00 if ARCH_SPEAR13XX
 	default 0xfd012000 if ARCH_MV78XX0
@@ -1091,7 +1104,8 @@ config DEBUG_UART_8250_WORD
 	default y if DEBUG_PICOXCELL_UART || DEBUG_SOCFPGA_UART || \
 		ARCH_KEYSTONE || \
 		DEBUG_DAVINCI_DMx_UART0 || DEBUG_DAVINCI_DA8XX_UART1 || \
-		DEBUG_DAVINCI_DA8XX_UART2 || DEBUG_DAVINCI_TNETV107X_UART1
+		DEBUG_DAVINCI_DA8XX_UART2 || DEBUG_DAVINCI_TNETV107X_UART1 || \
+		DEBUG_BRCMSTB_UART
 
 config DEBUG_UART_8250_FLOW_CONTROL
 	bool "Enable flow control for 8250 UART"
--
1.7.1
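The two DEBUG_BRCMSTB_UART defaults in this patch can be cross-checked against the serial@406b00 node in the series' reference DTS: the UART sits at offset 0x406b00 on the register bus, so a physical default of 0xf0406b00 is consistent with a bus window based at 0xf0000000, and a virtual default of 0xfc406b00 with that window being mapped at 0xfc000000. A rough sketch of the arithmetic in C follows; both window bases are inferred from the numbers rather than stated anywhere in the patch, and the macro and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BRCMSTB_UART_OFFSET 0x00406b00u /* serial@406b00 in the reference DTS */
#define BRCMSTB_RDB_PHYS    0xf0000000u /* assumption: inferred window base */
#define BRCMSTB_RDB_VIRT    0xfc000000u /* assumption: inferred static mapping */

/* Translate a child-bus address into the parent address space. */
static uint32_t bus_to_cpu(uint32_t window_base, uint32_t child_addr)
{
	/* Refuse to wrap around the 32-bit address space. */
	assert(child_addr <= UINT32_MAX - window_base);
	return window_base + child_addr;
}
```

Calling bus_to_cpu() with the physical base gives 0xf0406b00 and with the virtual base gives 0xfc406b00, i.e. the DEBUG_UART_PHYS and DEBUG_UART_VIRT defaults added for this platform.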
[PATCH v5 7/8] ARM: brcmstb: gic: add compatible string for Broadcom Brahma15
Document the Broadcom Brahma B15 GIC implementation as compatible with the ARM GIC standard.

Signed-off-by: Marc Carino
Acked-by: Florian Fainelli
---
 Documentation/devicetree/bindings/arm/gic.txt |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/gic.txt b/Documentation/devicetree/bindings/arm/gic.txt
index 3dfb0c0..d7409fd 100644
--- a/Documentation/devicetree/bindings/arm/gic.txt
+++ b/Documentation/devicetree/bindings/arm/gic.txt
@@ -15,6 +15,7 @@ Main node required properties:
 	"arm,cortex-a9-gic"
 	"arm,cortex-a7-gic"
 	"arm,arm11mp-gic"
+	"brcm,brahma-b15-gic"
 - interrupt-controller : Identifies the node as an interrupt controller
 - #interrupt-cells : Specifies the number of cells needed to encode an
   interrupt source.  The type shall be a <u32> and the value shall be 3.
--
1.7.1
[PATCH v5 0/8] ARM: brcmstb: Add Broadcom STB SoC support
This patchset contains the board support package for the Broadcom BCM7445 ARM-based SoC [1]. These changes contain a minimal set of code needed for a BCM7445-based board to boot the Linux kernel.

These changes heavily leverage the OF/devicetree framework.

v5:
- rebased to v3.13 tag
- make UART DT node a child of 'rdb' node
- fix ordering of debug UART entries

v4:
- make a reboot driver and put it in the drivers folder
- rework DT bindings to leverage 'syscon'
- rework BSP code to use 'syscon' for all register mappings
- misc. tweaks per suggestions from v3

v3:
- rebased to v3.13-rc8
- switched to using 'multi_v7_defconfig'
- eliminated dependence on compile-time peripheral register access
- moved DT node iomap out from 'init_early'
- misc. minor cleanups from mailing-list discussion for v2

v2:
- rebased to v3.13-rc1
- moved implementation to 'mach-bcm' folder
- added CPU init for B15

v1:
- initial submission

[1] http://www.broadcom.com/products/Cable/Cable-Set-Top-Box-Solutions/BCM7445

Marc Carino (8):
  ARM: brcmstb: add infrastructure for ARM-based Broadcom STB SoCs
  power: reset: Add reboot driver for brcmstb
  ARM: brcmstb: add debug UART for earlyprintk support
  ARM: do CPU-specific init for Broadcom Brahma15 cores
  ARM: brcmstb: add CPU binding for Broadcom Brahma15
  ARM: brcmstb: add misc. DT bindings for brcmstb
  ARM: brcmstb: gic: add compatible string for Broadcom Brahma15
  ARM: brcmstb: dts: add a reference DTS for Broadcom 7445

 .../devicetree/bindings/arm/brcm-brcmstb.txt   |   95 ++
 Documentation/devicetree/bindings/arm/cpus.txt |    1 +
 Documentation/devicetree/bindings/arm/gic.txt  |    1 +
 arch/arm/Kconfig.debug                         |   16 +-
 arch/arm/boot/dts/bcm7445.dts                  |  111 +++
 arch/arm/configs/multi_v7_defconfig            |    1 +
 arch/arm/mach-bcm/Kconfig                      |   14 +
 arch/arm/mach-bcm/Makefile                     |    4 +
 arch/arm/mach-bcm/brcmstb.c                    |  110 +++
 arch/arm/mach-bcm/brcmstb.h                    |   38 +++
 arch/arm/mach-bcm/headsmp-brcmstb.S            |   34 ++
 arch/arm/mach-bcm/hotplug-brcmstb.c            |  334
 arch/arm/mm/proc-v7.S                          |   11 +
 drivers/power/reset/Kconfig                    |   10 +
 drivers/power/reset/Makefile                   |    1 +
 drivers/power/reset/brcmstb-reboot.c           |  120 +++
 16 files changed, 900 insertions(+), 1 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/brcm-brcmstb.txt
 create mode 100644 arch/arm/boot/dts/bcm7445.dts
 create mode 100644 arch/arm/mach-bcm/brcmstb.c
 create mode 100644 arch/arm/mach-bcm/brcmstb.h
 create mode 100644 arch/arm/mach-bcm/headsmp-brcmstb.S
 create mode 100644 arch/arm/mach-bcm/hotplug-brcmstb.c
 create mode 100644 drivers/power/reset/brcmstb-reboot.c
[PATCH v5 4/8] ARM: do CPU-specific init for Broadcom Brahma15 cores
Perform any CPU-specific initialization required on the Broadcom Brahma-15 core.

Signed-off-by: Marc Carino
Acked-by: Florian Fainelli
---
 arch/arm/mm/proc-v7.S |   11 +++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index bd17819..98ea423 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -193,6 +193,7 @@ __v7_cr7mp_setup:
 	b	1f
 __v7_ca7mp_setup:
 __v7_ca15mp_setup:
+__v7_b15mp_setup:
 	mov	r10, #0
 1:
 #ifdef CONFIG_SMP
@@ -494,6 +495,16 @@ __v7_ca15mp_proc_info:
 	.size	__v7_ca15mp_proc_info, . - __v7_ca15mp_proc_info
 
 	/*
+	 * Broadcom Corporation Brahma-B15 processor.
+	 */
+	.type	__v7_b15mp_proc_info, #object
+__v7_b15mp_proc_info:
+	.long	0x420f00f0
+	.long	0xff00
+	__v7_proc __v7_b15mp_setup, hwcaps = HWCAP_IDIV
+	.size	__v7_b15mp_proc_info, . - __v7_b15mp_proc_info
+
+	/*
 	 * Qualcomm Inc. Krait processors.
 	 */
 	.type	__krait_proc_info, #object
--
1.7.1
Re: [BISECTED] Linux 3.12.7 introduces page map handling regression
On Tue, Jan 21, 2014 at 06:47:07PM -0800, Linus Torvalds wrote: > On Tue, Jan 21, 2014 at 5:49 PM, Greg Kroah-Hartman > wrote: > > > > Odds are this also shows up in 3.13, right? Reproduced using 3.13 on the PV guest: [ 368.756763] BUG: Bad page map in process mp pte:8004a67c6165 pmd:e9b706067 [ 368.756777] page:ea001299f180 count:0 mapcount:-1 mapping: (null) index:0x0 [ 368.756781] page flags: 0x2f8014(referenced|dirty) [ 368.756786] addr:7fd1388b7000 vm_flags:00100071 anon_vma:880e9ba15f80 mapping: (null) index:7fd1388b7 [ 368.756792] CPU: 29 PID: 618 Comm: mp Not tainted 3.13.0-ec2 #1 [ 368.756795] 880e9b718958 880e9eaf3cc0 814d8748 7fd1388b7000 [ 368.756803] 880e9eaf3d08 8116d289 [ 368.756809] 880e9b7065b8 ea001299f180 7fd1388b8000 880e9eaf3e30 [ 368.756815] Call Trace: [ 368.756825] [] dump_stack+0x45/0x56 [ 368.756833] [] print_bad_pte+0x229/0x250 [ 368.756837] [] unmap_single_vma+0x583/0x890 [ 368.756842] [] unmap_vmas+0x65/0x90 [ 368.756847] [] unmap_region+0xac/0x120 [ 368.756852] [] ? vma_rb_erase+0x1c9/0x210 [ 368.756856] [] do_munmap+0x280/0x370 [ 368.756860] [] vm_munmap+0x41/0x60 [ 368.756864] [] SyS_munmap+0x22/0x30 [ 368.756869] [] system_call_fastpath+0x1a/0x1f [ 368.756872] Disabling lock debugging due to kernel taint [ 368.760084] BUG: Bad rss-counter state mm:880e9d079680 idx:0 val:-1 [ 368.760091] BUG: Bad rss-counter state mm:880e9d079680 idx:1 val:1 > > Probably. I don't have a Xen PV setup to test with (and very little > interest in setting one up).. And I have a suspicion that it might not > be so much about Xen PV, as perhaps about the kind of hardware. > > I suspect the issue has something to do with the magic _PAGE_NUMA > tie-in with _PAGE_PRESENT. And then mprotect(PROT_NONE) ends up > removing the _PAGE_PRESENT bit, and now the crazy numa code is > confused. > > The whole _PAGE_NUMA thing is a f*cking horrible hack, and shares the > bit with _PAGE_PROTNONE, which is why it then has that tie-in to > _PAGE_PRESENT. 
> > Adding Andrea to the Cc, because he's the author of that horridness. > Putting Steven's test-case here as an attachement for Andrea, maybe > that makes him go "Ahh, yes, silly case". > > Also added Kirill, because he was involved the last _PAGE_NUMA debacle. > > Andrea, you can find the thread on lkml, but it boils down to commit > 1667918b6483 (backported to 3.12.7 as 3d792d616ba4) breaking the > attached test-case (but apparently only under Xen PV). There it > apparently causes a "BUG: Bad page map .." error. > > And I suspect this is another of those "this bug is only visible on > real numa machines, because _PAGE_NUMA isn't actually ever set > otherwise". That has pretty much guaranteed that it gets basically > zero testing, which is not a great idea when coupled with that subtle > sharing of the _PAGE_PROTNONE bit.. > > It may be that the whole "Xen PV" thing is a red herring, and that > Steven only sees it on that one machine because the one he runs as a > PV guest under is a real NUMA machine, and all the other machines he > has tried it on haven't been numa. So it *may* be that that "only > under Xen PV" is a red herring. But that's just a possible guess. The PV and HVM guests are both on NUMA hosts, but we don't expose NUMA to the PV guest, so it fakes a NUMA node at startup. 
I've also tried running a PV guest on a dual socket host with interleaved memory: # dmesg | grep -i -e numa -e node [0.00] NUMA turned off [0.00] Faking a node at [mem 0x-0x0005607f] [0.00] Initmem setup node 0 [mem 0x-0x5607f] [0.00] NODE_DATA [mem 0x55d4f2000-0x55d518fff] [0.00] Movable zone start for each node [0.00] Early memory node ranges [0.00] node 0: [mem 0x1000-0x0009] [0.00] node 0: [mem 0x0010-0x5607f] [0.00] On node 0 totalpages: 5638047 [0.00] setup_percpu: NR_CPUS:4096 nr_cpumask_bits:16 nr_cpu_ids:16 nr_node_ids:1 [0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=16, Nodes=1 [0.010697] Inode-cache hash table entries: 2097152 (order: 12, 16777216 bytes) # dmesg | tail -n 21 [ 348.467265] BUG: Bad page map in process t pte:80008a6ef165 pmd:53aa39067 [ 348.467280] page:ea000229bbc0 count:0 mapcount:-1 mapping: (null) index:0x0 [ 348.467286] page flags: 0x1ffc14(referenced|dirty) [ 348.467293] addr:7f8c9fca vm_flags:00100071 anon_vma:88053aff19c0 mapping: (
[PATCH 1/2] net: dm9000: Read GPR, modify and write
The GPR register should be read, modified and written back to activate the PHY. Simply writing 0 to the GPR might overwrite other values in the register that need to be kept. Also includes some code style fixes (mostly leading spaces).

Signed-off-by: Chris Ruehl
---
 drivers/net/ethernet/davicom/dm9000.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/davicom/dm9000.c b/drivers/net/ethernet/davicom/dm9000.c
index 7080ad6..0349b91 100644
--- a/drivers/net/ethernet/davicom/dm9000.c
+++ b/drivers/net/ethernet/davicom/dm9000.c
@@ -745,9 +745,9 @@ static const struct ethtool_ops dm9000_ethtool_ops = {
 	.get_link	= dm9000_get_link,
 	.get_wol	= dm9000_get_wol,
 	.set_wol	= dm9000_set_wol,
-	.get_eeprom_len	= dm9000_get_eeprom_len,
-	.get_eeprom	= dm9000_get_eeprom,
-	.set_eeprom	= dm9000_set_eeprom,
+	.get_eeprom_len	= dm9000_get_eeprom_len,
+	.get_eeprom	= dm9000_get_eeprom,
+	.set_eeprom	= dm9000_set_eeprom,
 };
 
 static void dm9000_show_carrier(board_info_t *db,
@@ -795,7 +795,7 @@ dm9000_poll_work(struct work_struct *w)
 		}
 	} else
 		mii_check_media(&db->mii, netif_msg_link(db), 0);
-
+
 	if (netif_running(ndev))
 		dm9000_schedule_poll(db);
 }
@@ -1286,6 +1286,7 @@ dm9000_open(struct net_device *dev)
 {
 	board_info_t *db = netdev_priv(dev);
 	unsigned long irqflags = db->irq_res->flags & IRQF_TRIGGER_MASK;
+	int gprval;
 
 	if (netif_msg_ifup(db))
 		dev_dbg(db->dev, "enabling %s\n", dev->name);
@@ -1298,9 +1299,15 @@ dm9000_open(struct net_device *dev)
 
 	irqflags |= IRQF_SHARED;
 
+	gprval = ior(db, DM9000_GPR);
+
 	/* GPIO0 on pre-activate PHY, Reg 1F is not set by reset */
-	iow(db, DM9000_GPR, 0);	/* REG_1F bit0 activate phyxcer */
-	mdelay(1); /* delay needs by DM9000B */
+	if (gprval & (1<<0)) {
+		dev_dbg(db->dev, "Activate PHY GPR: 0x%x\n", gprval);
+		gprval = gprval & ~(1<<0);
+		iow(db, DM9000_GPR, gprval);/* REG_1F bit0 activate phyxcer */
+		mdelay(1); /* delay needs by DM9000B */
+	}
 
 	/* Initialize DM9000 board */
 	dm9000_reset(db);
@@ -1314,7 +1321,7 @@ dm9000_open(struct net_device *dev)
 
 	mii_check_media(&db->mii, netif_msg_link(db), 1);
 	netif_start_queue(dev);
-
+
 	dm9000_schedule_poll(db);
 
 	return 0;
@@ -1628,7 +1635,7 @@ dm9000_probe(struct platform_device *pdev)
 
 	if (!is_valid_ether_addr(ndev->dev_addr)) {
 		/* try reading from mac */
-
+
 		mac_src = "chip";
 		for (i = 0; i < 6; i++)
 			ndev->dev_addr[i] = ior(db, i+DM9000_PAR);
-- 
1.7.10.4
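The read-modify-write idea in this patch can be sketched outside the driver. The following is a minimal user-space model, not the driver code: `struct demo_chip`, `demo_ior()`, `demo_iow()` and `demo_activate_phy()` are hypothetical stand-ins for the hardware and for the driver's `ior()`/`iow()` accessors, and only the register index 0x1F and the meaning of bit 0 are taken from the comments in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the chip's register file (not the real hardware). */
#define DEMO_GPR        0x1f        /* "Reg 1F" in the patch comments */
#define DEMO_GPR_PHYPD  (1u << 0)   /* bit 0 set: PHY is powered down */

struct demo_chip {
	uint8_t regs[256];
	unsigned int writes;            /* counts register writes */
};

static uint8_t demo_ior(const struct demo_chip *chip, unsigned int reg)
{
	assert(reg < 256);
	return chip->regs[reg];
}

static void demo_iow(struct demo_chip *chip, unsigned int reg, uint8_t val)
{
	assert(reg < 256);
	chip->regs[reg] = val;
	chip->writes++;
}

/*
 * Clear only bit 0 of the GPR and keep every other bit as it was.
 * The write is skipped when bit 0 is already clear.
 * Returns 1 if the register was written, 0 otherwise.
 */
static int demo_activate_phy(struct demo_chip *chip)
{
	uint8_t gprval = demo_ior(chip, DEMO_GPR);

	if (!(gprval & DEMO_GPR_PHYPD))
		return 0;

	demo_iow(chip, DEMO_GPR, (uint8_t)(gprval & ~DEMO_GPR_PHYPD));
	return 1;
}
```

Starting from a GPR value of 0x07, one call leaves 0x06 in the register, whereas an unconditional write of 0 would also have cleared bits 1 and 2.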
[PATCH 2/2] net: dm9000: Only call PHY reset for TYPE-B on shutdown
Calling the PHY reset unconditionally can cause link detection to fail on the DM9000A on reboot; only a hard reset recovers from it. This patch checks the chip version and calls the PHY reset only for the B version of the chip.

Signed-off-by: Chris Ruehl
---
 drivers/net/ethernet/davicom/dm9000.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/davicom/dm9000.c b/drivers/net/ethernet/davicom/dm9000.c
index 0349b91..55a2e9c 100644
--- a/drivers/net/ethernet/davicom/dm9000.c
+++ b/drivers/net/ethernet/davicom/dm9000.c
@@ -1333,7 +1333,8 @@ dm9000_shutdown(struct net_device *dev)
 	board_info_t *db = netdev_priv(dev);
 
 	/* RESET device */
-	dm9000_phy_write(dev, 0, MII_BMCR, BMCR_RESET);	/* PHY RESET */
+	if (db->type == TYPE_DM9000B)
+		dm9000_phy_write(dev, 0, MII_BMCR, BMCR_RESET);	/* PHY RESET */
 	iow(db, DM9000_GPR, 0x01);	/* Power-Down PHY */
 	iow(db, DM9000_IMR, IMR_PAR);	/* Disable all interrupt */
 	iow(db, DM9000_RCR, 0x00);	/* Disable RX */
-- 
1.7.10.4
[PATCH 2/2 v2] imx27: pinctrl: fix offset calculation in imx_read_2bit
The offset for the 2-bit register fields was calculated incorrectly; this patch fixes the problem. The debugfs printout for oconf, iconfa and iconfb now shows the real values.

Signed-off-by: Chris Ruehl
---
 drivers/pinctrl/pinctrl-imx1-core.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pinctrl/pinctrl-imx1-core.c b/drivers/pinctrl/pinctrl-imx1-core.c
index 8dfc3dc..59a16b6 100644
--- a/drivers/pinctrl/pinctrl-imx1-core.c
+++ b/drivers/pinctrl/pinctrl-imx1-core.c
@@ -139,7 +139,7 @@ static int imx1_read_2bit(struct imx1_pinctrl *ipctl, unsigned int pin_id,
 		u32 reg_offset)
 {
 	void __iomem *reg = imx1_mem(ipctl, pin_id) + reg_offset;
-	int offset = pin_id % 16;
+	int offset = (pin_id % 16) * 2;
 
 	/* Use the next register if the pin's port pin number is >=16 */
 	if (pin_id % 32 >= 16)
-- 
1.7.10.4
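To see why the factor of two matters, here is a small stand-alone sketch. It assumes, as the patch context suggests, that each pin owns a 2-bit field and that 16 such fields share one 32-bit register, so a 32-pin port spans two registers. The function names and the shift-and-mask extraction are illustrative assumptions, not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Read the 2-bit field of one pin from a pair of 32-bit registers.
 * Pins 0..15 live in regs[0], pins 16..31 in regs[1].
 */
static unsigned int demo_read_2bit(const uint32_t regs[2],
				   unsigned int pin_in_port)
{
	unsigned int shift;

	assert(pin_in_port < 32);
	shift = (pin_in_port % 16) * 2;	/* two bits per pin */
	return (regs[pin_in_port / 16] >> shift) & 0x3;
}

/* Same read with the unscaled shift, as in the code before the fix. */
static unsigned int demo_read_2bit_unscaled(const uint32_t regs[2],
					    unsigned int pin_in_port)
{
	unsigned int shift;

	assert(pin_in_port < 32);
	shift = pin_in_port % 16;	/* misses the factor of two */
	return (regs[pin_in_port / 16] >> shift) & 0x3;
}
```

With pin 3 configured to the value 2 (bits 7:6 of the first register), the scaled version returns 2, while the unscaled one reads bits 4:3 and returns 0.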
[PATCH 1/2 v2] imx27: pinctrl: fix wrong offset to ICONFB
The offset to ICONFB was incorrect; this patch sets the correct value, 0x14. The dev_dbg in imx1_write_2bit printed the wrong address and has been moved after the address calculation.

Signed-off-by: Chris Ruehl
---
 drivers/pinctrl/pinctrl-imx1-core.c |8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/pinctrl/pinctrl-imx1-core.c b/drivers/pinctrl/pinctrl-imx1-core.c
index f77914a..8dfc3dc 100644
--- a/drivers/pinctrl/pinctrl-imx1-core.c
+++ b/drivers/pinctrl/pinctrl-imx1-core.c
@@ -45,7 +45,7 @@ struct imx1_pinctrl {
 #define MX1_DDIR 0x00
 #define MX1_OCR 0x04
 #define MX1_ICONFA 0x0c
-#define MX1_ICONFB 0x10
+#define MX1_ICONFB 0x14
 #define MX1_GIUS 0x20
 #define MX1_GPR 0x38
 #define MX1_PUEN 0x40
@@ -97,13 +97,13 @@ static void imx1_write_2bit(struct imx1_pinctrl *ipctl, unsigned int pin_id,
 	u32 old_val;
 	u32 new_val;
 
-	dev_dbg(ipctl->dev, "write: register 0x%p offset %d value 0x%x\n",
-		reg, offset, value);
-
 	/* Use the next register if the pin's port pin number is >=16 */
 	if (pin_id % 32 >= 16)
 		reg += 0x04;
 
+	dev_dbg(ipctl->dev, "write: register 0x%p offset %d value 0x%x\n",
+		reg, offset, value);
+
 	/* Get current state of pins */
 	old_val = readl(reg);
 	old_val &= mask;
-- 
1.7.10.4
Re: [patch 9/9] mm: keep page cache radix tree nodes in check
On Tue, Jan 21, 2014 at 12:50:17AM -0500, Johannes Weiner wrote: > On Tue, Jan 21, 2014 at 02:03:58PM +1100, Dave Chinner wrote: > > On Mon, Jan 20, 2014 at 06:17:37PM -0500, Johannes Weiner wrote: > > > On Fri, Jan 17, 2014 at 11:05:17AM +1100, Dave Chinner wrote: > > > > On Fri, Jan 10, 2014 at 01:10:43PM -0500, Johannes Weiner wrote: > > > > > +static struct shrinker workingset_shadow_shrinker = { > > > > > + .count_objects = count_shadow_nodes, > > > > > + .scan_objects = scan_shadow_nodes, > > > > > + .seeks = DEFAULT_SEEKS * 4, > > > > > + .flags = SHRINKER_NUMA_AWARE, > > > > > +}; > > > > > > > > Can you add a comment explaining how you calculated the .seeks > > > > value? It's important to document the weighings/importance > > > > we give to slab reclaim so we can determine if it's actually > > > > acheiving the desired balance under different loads... > > > > > > This is not an exact science, to say the least. > > > > I know, that's why I asked it be documented rather than be something > > kept in your head. > > > > > The shadow entries are mostly self-regulated, so I don't want the > > > shrinker to interfere while the machine is just regularly trimming > > > caches during normal operation. > > > > > > It should only kick in when either a) reclaim is picking up and the > > > scan-to-reclaim ratio increases due to mapped pages, dirty cache, > > > swapping etc. or b) the number of objects compared to LRU pages > > > becomes excessive. > > > > > > I think that is what most shrinkers with an elevated seeks value want, > > > but this translates very awkwardly (and not completely) to the current > > > cost model, and we should probably rework that interface. > > > > > > "Seeks" currently encodes 3 ratios: > > > > > > 1. the cost of creating an object vs. a page > > > > > > 2. the expected number of objects vs. pages > > > > It doesn't encode that at all. If it did, then the default value > > wouldn't be "2". > > > > > 3. the cost of reclaiming an object vs. 
a page > > > > Which, when you consider #3 in conjunction with #1, the actual > > intended meaning of .seeks is "the cost of replacing this object in > > the cache compared to the cost of replacing a page cache page." > > But what it actually seems to do is translate scan rate from LRU pages > to scan rate in another object pool. The actual replacement cost > varies based on hotness of each set, an in-use object is more > expensive to replace than a cold page and vice versa, the dentry and > inode shrinkers reflect this by rotating hot objects and refusing to > actually reclaim items while they are in active use. Right, but so does the page cache when the page referenced bit is seen by the LRU scanner. That's a scanned page, so what is passed to shrink_slab is a ratio of pages scanned vs pages eligible for reclaim. IOWs, the fact that the slab caches rotate rather than reclaim is irrelevant - what matters is the same proportional pressure is applied to the slab cache that was applied to the page cache > So I am having a hard time deriving a meaningful value out of this > definition for my usecase because I want to push back objects based on > reclaim efficiency (scan rate vs. reclaim rate). The other shrinkers > with non-standard seek settings reek of magic number as well, which > suggests I am not alone with this. Right, which is exactly why I'm asking you to document it. I've got no idea how other subsystems have come up with their magic numbers because they are not documented, and so it's just about impossible to determine what the author of the code really needed and hence the best way to improve the interface is difficult to determine. > I wonder if we can come up with a better interface that allows both > traditional cache shrinkers with their own aging, as well as object > pools that want to push back based on reclaim efficiency. We probably can, though I'd prefer we don't end up with some alternative algorithm that is specific to a single shrinker. 
So, how do we measure page cache reclaim efficiency? How can that be communicated to a shrinker? how can we tell a shrinker what measure to use? How do we tell shrinker authors what measure to use? How do we translate that new method useful scan count information? > > > but they are not necessarily correlated. How I would like to > > > configure the shadow shrinker instead is: > > > > > > o scan objects when reclaim efficiency is down to 75%, because they > > > are more valuable than use-once cache but less than workingset > > > > > > o scan objects when the ratio between them and the number of pages > > > exceeds 1/32 (one shadow entry for each resident page, up to 64 > > > entries per shrinkable object, assume 50% packing for robustness) > > > > > > o as the expected balance between objects and lru pages is 1:32, > > > reclaim one object for every 32 reclaimed LRU pages, instead of > > > assuming that number of scanne
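As a rough illustration of the "proportional pressure" being discussed, the sketch below models how a scan target for a shrinker could be derived from page cache scanning. The formula is a simplified paraphrase of how I understand the slab shrinking code of that period to work, not a quote from this thread, and the function and parameter names are hypothetical. The only value taken from the discussion is the default seeks value of 2.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_DEFAULT_SEEKS 2	/* default mentioned in the thread */

/*
 * Model: the fraction of LRU pages that was scanned is applied to the
 * number of freeable objects, scaled by 4 / seeks. A larger seeks value
 * therefore lowers the pressure on the object pool proportionally.
 */
static uint64_t demo_scan_target(uint64_t lru_scanned, uint64_t lru_pages,
				 uint64_t freeable, unsigned int seeks)
{
	uint64_t delta;

	assert(seeks > 0);
	delta = (4 * lru_scanned) / seeks;
	delta *= freeable;
	delta /= lru_pages + 1;		/* +1 avoids division by zero */
	return delta;
}
```

In this model, scanning 1000 of 9999 LRU pages with 5000 freeable objects gives a target of 1000 objects at the default seeks value, and 250 objects at `DEFAULT_SEEKS * 4`, which is the setting used for the shadow node shrinker quoted above.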
linux-next: manual merge of the drm-intel tree with the drm tree
Hi all, Today's linux-next merge of the drm-intel tree got a conflict in drivers/gpu/drm/i915/intel_display.c between commit c326c0a9c98c ("drm/i915: Call drm_calc_timestamping_constants() earlier") from the drm tree and commit bbee18af2a25 ("drm/i915: Prepare to track new pipe config per pipe") from the drm-intel tree. I fixed it up (see below) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwells...@canb.auug.org.au diff --cc drivers/gpu/drm/i915/intel_display.c index 14b024becb91,e1d3ae1212a7.. --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@@ -9660,14 -9705,7 +9703,15 @@@ static int __intel_set_mode(struct drm_ /* mode_set/enable/disable functions rely on a correct pipe * config. */ to_intel_crtc(crtc)->config = *pipe_config; + to_intel_crtc(crtc)->new_config = &to_intel_crtc(crtc)->config; + + /* + * Calculate and store various constants which + * are later needed by vblank and swap-completion + * timestamping. They are derived from true hwmode. + */ + drm_calc_timestamping_constants(crtc, + &pipe_config->adjusted_mode); } /* Only after disabling all output pipelines that will be changed can we pgphEiQSyz2ju.pgp Description: PGP signature
Re: [PATCH] clk: export __clk_get_hw for re-use in others
Dear Greg, Mike, May I ask your answer or other opinion, please? On Mon, Jan 20, 2014 at 5:07 PM, SeongJae Park wrote: > On Mon, Jan 20, 2014 at 4:47 PM, Mike Turquette wrote: >> On Sun, Jan 19, 2014 at 9:37 AM, Greg KH wrote: >>> On Sun, Jan 19, 2014 at 02:55:07PM +0900, SeongJae Park wrote: Following build comes while modprobe process: > ERROR: "__clk_get_hw" [drivers/clk/clk-max77686.ko] undefined! > make[2]: *** [__modpost] Error 1 > make[1]: *** [modules] Error 2 Export the symbol to fix it and for other part's usecase. Signed-off-by: SeongJae Park --- drivers/clk/clk.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c index 2b38dc9..3883fba 100644 --- a/drivers/clk/clk.c +++ b/drivers/clk/clk.c @@ -575,6 +575,7 @@ struct clk_hw *__clk_get_hw(struct clk *clk) { return !clk ? NULL : clk->hw; } +EXPORT_SYMBOL_GPL(__clk_get_hw); >>> >>> __ functions should usually only be for "internal" use, why does this >>> get exported to modules? Why not just put it in a .h file? >> >> It was originally used only within the clock core but it is sensible >> for hardware-specific clock drivers to use this as well. I plan to >> audit all of the double-underscore functions in >> include/linux/clk-provider.h for 3.15. >> >> Regards, >> Mike >> > Thank you very much for answering about it, Mike. > > I agree Greg's indication and think Mike's explanation is reasonable. > > So, I think it would be better to just export the symbol now > because it would be easier for future functions renaming and > similar issues were solved in this way in past: > https://lkml.org/lkml/2013/4/15/50 > > Or, maybe I can change the client code of __clk_get_hw to not use the > function. > > What do you think would be better to fix this build error? Or, do you > have better idea? > I will respect your opinion. > > Thanks and Regards. > SeongJae Park. 
> >>> >>> greg k-h
linux-next: manual merge of the drm-intel tree with the drm tree
Hi all, Today's linux-next merge of the drm-intel tree got a conflict in drivers/gpu/drm/i915/i915_irq.c between commit abca9e454498 ("drm: Pass 'flags' from the caller to .get_scanout_position()") from the drm tree and commit d59a63ad8234 ("drm/i915: Add intel_get_crtc_scanline()") from the drm-intel tree. I fixed it up (I think - see below) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwells...@canb.auug.org.au diff --cc drivers/gpu/drm/i915/i915_irq.c index 17d8fcb1b6f7,ffb56a9db9cc.. --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@@ -649,8 -675,9 +649,9 @@@ static bool ilk_pipe_in_vblank_locked(s } static int i915_get_crtc_scanoutpos(struct drm_device *dev, int pipe, - int *vpos, int *hpos, + unsigned int flags, int *vpos, int *hpos, - ktime_t *stime, ktime_t *etime) + ktime_t *stime, ktime_t *etime, + bool adjust) { struct drm_i915_private *dev_priv = dev->dev_private; struct drm_crtc *crtc = dev_priv->pipe_to_crtc_mapping[pipe]; @@@ -788,6 -786,24 +791,24 @@@ return ret; } + static int i915_get_scanout_position(struct drm_device *dev, int pipe, +int *vpos, int *hpos, +ktime_t *stime, ktime_t *etime) + { - return i915_get_crtc_scanoutpos(dev, pipe, vpos, hpos, ++ return i915_get_crtc_scanoutpos(dev, pipe, 0, vpos, hpos, + stime, etime, true); + } + + int intel_get_crtc_scanline(struct drm_crtc *crtc) + { + int vpos = 0, hpos = 0; + - i915_get_crtc_scanoutpos(crtc->dev, to_intel_crtc(crtc)->pipe, ++ i915_get_crtc_scanoutpos(crtc->dev, to_intel_crtc(crtc)->pipe, 0, +&vpos, &hpos, NULL, NULL, false); + + return vpos; + } + static int i915_get_vblank_timestamp(struct drm_device *dev, int pipe, int *max_error, struct timeval *vblank_time, pgpvLV6E23Jmh.pgp Description: PGP signature
[LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes
One topic that has been lurking forever at the edges is the current 4k limitation for file system block sizes. Some devices in production today and others coming soon have larger sectors and it would be interesting to see if it is time to poke at this topic again. LSF/MM seems to be pretty much the only event of the year at which most of the key people will be present, so this should be a great topic for a joint session.

Ric
Re: [PATCH] uapi: convert u64 to __u64 in exported headers
On Tue, 21 Jan 2014, Mike Frysinger wrote:
> The u64 type is not defined in any exported kernel headers, so trying
> to use it will lead to build failures.
>
> Signed-off-by: Mike Frysinger

Acked-by: David Rientjes
[PATCH] uapi: dn: pull in ioctl.h header
This header uses _IOW/_IOR defines but doesn't include ioctl.h for it. If you try to use this w/out including ioctl.h yourself, it can fail to build, so add the explicit include. Signed-off-by: Mike Frysinger --- include/uapi/linux/dn.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/uapi/linux/dn.h b/include/uapi/linux/dn.h index 5fbdd3d..4295c74 100644 --- a/include/uapi/linux/dn.h +++ b/include/uapi/linux/dn.h @@ -1,6 +1,7 @@ #ifndef _LINUX_DN_H #define _LINUX_DN_H +#include #include #include -- 1.8.4.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] uapi: ppp-ioctl.h: pull in ppp_defs.h
This header uses enum NPmode but doesn't include ppp_defs.h. If you try to use this header w/out including the defs header first, it leads to a build failure. So add the explicit include to fix it. Don't know of any packages directly impacted, but noticed while building some ppp code by hand. Signed-off-by: Mike Frysinger --- include/uapi/linux/ppp-ioctl.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/uapi/linux/ppp-ioctl.h b/include/uapi/linux/ppp-ioctl.h index 2d9a885..63a23a3 100644 --- a/include/uapi/linux/ppp-ioctl.h +++ b/include/uapi/linux/ppp-ioctl.h @@ -12,6 +12,7 @@ #include #include +#include /* * Bit definitions for flags argument to PPPIOCGFLAGS/PPPIOCSFLAGS. -- 1.8.4.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BISECTED] Linux 3.12.7 introduces page map handling regression
On Tue, Jan 21, 2014 at 5:49 PM, Greg Kroah-Hartman wrote: > > Odds are this also shows up in 3.13, right? Probably. I don't have a Xen PV setup to test with (and very little interest in setting one up).. And I have a suspicion that it might not be so much about Xen PV, as perhaps about the kind of hardware. I suspect the issue has something to do with the magic _PAGE_NUMA tie-in with _PAGE_PRESENT. And then mprotect(PROT_NONE) ends up removing the _PAGE_PRESENT bit, and now the crazy numa code is confused. The whole _PAGE_NUMA thing is a f*cking horrible hack, and shares the bit with _PAGE_PROTNONE, which is why it then has that tie-in to _PAGE_PRESENT. Adding Andrea to the Cc, because he's the author of that horridness. Putting Steven's test-case here as an attachement for Andrea, maybe that makes him go "Ahh, yes, silly case". Also added Kirill, because he was involved the last _PAGE_NUMA debacle. Andrea, you can find the thread on lkml, but it boils down to commit 1667918b6483 (backported to 3.12.7 as 3d792d616ba4) breaking the attached test-case (but apparently only under Xen PV). There it apparently causes a "BUG: Bad page map .." error. And I suspect this is another of those "this bug is only visible on real numa machines, because _PAGE_NUMA isn't actually ever set otherwise". That has pretty much guaranteed that it gets basically zero testing, which is not a great idea when coupled with that subtle sharing of the _PAGE_PROTNONE bit.. It may be that the whole "Xen PV" thing is a red herring, and that Steven only sees it on that one machine because the one he runs as a PV guest under is a real NUMA machine, and all the other machines he has tried it on haven't been numa. So it *may* be that that "only under Xen PV" is a red herring. But that's just a possible guess. Christ, how I hate that _PAGE_NUMA bit. Andrea: the fact that it gets no testing on any normal machines is a major problem. 
If it was simple and straightforward and the code was "obviously correct", it wouldn't be such a problem, but the _PAGE_NUMA code definitely does not fall under that "simple and obviously correct" heading. Guys, any ideas? Linus #include #include #include #include void die(const char *what) { perror(what); exit(1); } int main(int arg, char **argv) { void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); if (p == MAP_FAILED) die("mmap"); /* Tickle the page. */ ((char *) p)[0] = 0; if (mprotect(p, 4096, PROT_NONE) != 0) die("mprotect"); if (mprotect(p, 4096, PROT_READ) != 0) die("mprotect"); if (munmap(p, 4096) != 0) die("munmap"); return 0; }
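The header names of the attached test case were lost in this archive copy (the four `#include` lines are empty). The following is a reconstruction of the same mmap/mprotect/munmap sequence as a callable function. The headers are my choice of what the calls need, not necessarily the four the original used, and `demo_protnone_cycle()` is a hypothetical name.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map one anonymous private region, touch it so a page is actually
 * instantiated, flip it to PROT_NONE and back to PROT_READ, then unmap.
 * Returns 0 if every call succeeded, -1 otherwise.
 */
static int demo_protnone_cycle(size_t len)
{
	char *p;

	assert(len > 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Tickle the page. */
	p[0] = 0;

	if (mprotect(p, len, PROT_NONE) != 0 ||
	    mprotect(p, len, PROT_READ) != 0) {
		munmap(p, len);
		return -1;
	}

	return munmap(p, len) != 0 ? -1 : 0;
}
```

Calling `demo_protnone_cycle(4096)` from `main()` reproduces the sequence of the original test case. Note that on an affected kernel the calls are still expected to succeed; the "Bad page map" report shows up in the kernel log, not in the return value.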
[PATCH] uapi: convert u64 to __u64 in exported headers
The u64 type is not defined in any exported kernel headers, so trying to use it will lead to build failures.

Signed-off-by: Mike Frysinger
---
 include/uapi/linux/nfs4.h       | 2 +-
 include/uapi/linux/perf_event.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/nfs4.h b/include/uapi/linux/nfs4.h
index 788128e..35f5f4c 100644
--- a/include/uapi/linux/nfs4.h
+++ b/include/uapi/linux/nfs4.h
@@ -150,7 +150,7 @@
 #define NFS4_SECINFO_STYLE4_CURRENT_FH 0
 #define NFS4_SECINFO_STYLE4_PARENT 1
 
-#define NFS4_MAX_UINT64	(~(u64)0)
+#define NFS4_MAX_UINT64	(~(__u64)0)
 
 /* An NFS4 sessions server must support at least NFS4_MAX_OPS operations.
  * If a compound requires more operations, adjust NFS4_MAX_OPS accordingly.
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 959d454..7a3fed5 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -787,7 +787,7 @@ union perf_mem_data_src {
 #define PERF_MEM_TLB_SHIFT 26
 
 #define PERF_MEM_S(a, s) \
-	(((u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
+	(((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
 
 /*
  * single taken branch record layout:
-- 
1.8.4.3
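A small user-space sketch of why the double-underscore type is the one to use: `__u64` comes from `<linux/types.h>`, which is part of the exported headers, while plain `u64` is a kernel-internal name. The `DEMO_` macros below are hypothetical and only mimic the shape of the two macros touched by the patch; the shift value 26 mirrors `PERF_MEM_TLB_SHIFT` from the diff context. This assumes the kernel's exported headers are installed on the build machine.

```c
#include <assert.h>
#include <linux/types.h>	/* exported header that defines __u64 */

/* Hypothetical macros in the style of the ones fixed by the patch. */
#define DEMO_MAX_UINT64		(~(__u64)0)
#define DEMO_FIELD_SHIFT	26
#define DEMO_FIELD(val)		(((__u64)(val)) << DEMO_FIELD_SHIFT)

/* Shift a small value into its field position as a 64-bit quantity. */
static __u64 demo_pack(unsigned int val)
{
	assert(sizeof(__u64) == 8);
	return DEMO_FIELD(val);
}
```

Replacing `__u64` with `u64` in this file is expected to fail to compile in user space, which is the build failure the commit message describes.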
Re: [GIT PULL] percpu changes for v3.14-rc1
Hello, Linus. On Tue, Jan 21, 2014 at 05:51:13PM -0800, Linus Torvalds wrote: > On Tue, Jan 21, 2014 at 1:48 AM, Tejun Heo wrote: > > > > I messed up the for-3.14 branch (committed stuff to for-next) and had > > to rebuild for-3.14 by cherry-picking; however, the result is the same > > as published to the next tree through for-next. > > > > The changes are available in the following git branch > > > > git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git > > You messed up the pull request too.. The branch name is missing from > that git line, even if you did mention it a few lines earlier... Oops, sorry. The branch is for-3.14. I have no idea how that happened tho. That even isn't a part that I edit. I did git request-pull master git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git for-3.14 > out and then pulled in that file and added the description on top and diff at the end. I still have the "out" file created by the above command and it also lacks the branch tag, so it definitely wasn't me somehow deleting it while editing. If I run the git-request-pull again, it does have "for-3.14" there with everything else identical. I wonder whether git-request-pull somehow skips over branch tag when remote for-3.14 doesn't match local one? Ooh, right, that was it. So, after running git-request-pull for the first time, I rebuilt for-3.14, did git push -f and then ran git-request-pull. At that point, the new for-3.14 hasn't propagated to git://git.kernel.org yet, so git-request-pull couldn't find the head which matched the SHA1 and thus omitted printing the branch. I wonder whether this is a new behavior. I saw the warning message multiple times but ISTR the generated pull request having the branch name specified on the command line regardless. Maybe it should just fail rather than generating pull request w/o branch tag? Thanks. 
-- tejun
Re: [PATCH 10/73] powerpc: use device_initcall for registering rtc devices
On Tue, Jan 21, 2014 at 6:48 PM, Geoff Levand wrote:
> Hi Paul,
>
> On Tue, 2014-01-21 at 16:22 -0500, Paul Gortmaker wrote:
>> Currently these two RTC devices are in core platform code
>> where it is not possible for them to be modular. It will
>> never be modular, so using module_init as an alias for
>> __initcall can be somewhat misleading.
>>
>> arch/powerpc/kernel/time.c| 2 +-
>> arch/powerpc/platforms/ps3/time.c | 3 +--
>> 2 files changed, 2 insertions(+), 3 deletions(-)
>
> I tested the PS3 part of this patch and it seems to work OK.
>
> Acked-by: Geoff Levand

Thanks Geoff for the review and testing; I'll add the ack.

Paul.
--
> ___
> Linuxppc-dev mailing list
> linuxppc-...@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH] swap: do not skip lowest_bit in scan_swap_map() scan loop
In the second half of scan_swap_map()'s scan loop, offset is set to si->lowest_bit and then incremented before entering the loop for the first time, causing si->swap_map[si->lowest_bit] to be skipped.

Signed-off-by: Jamie Liu
---
 mm/swapfile.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 612a7c9..6635081 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -616,7 +616,7 @@ scan:
 		}
 	}
 	offset = si->lowest_bit;
-	while (++offset < scan_base) {
+	while (offset < scan_base) {
 		if (!si->swap_map[offset]) {
 			spin_lock(&si->lock);
 			goto checks;
@@ -629,6 +629,7 @@ scan:
 			cond_resched();
 			latency_ration = LATENCY_LIMIT;
 		}
+		offset++;
 	}
 
 	spin_lock(&si->lock);
-- 
1.8.5.3
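The off-by-one described here is easy to reproduce in isolation. The sketch below uses a plain byte array in place of `si->swap_map`; the function names are hypothetical and the loops keep only the part relevant to the bug (no locking, no latency handling).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Loop shape before the fix: the pre-increment runs before the first
 * test, so map[lowest] is never examined.
 * Returns the first free index found, or -1.
 */
static long demo_scan_preinc(const unsigned char *map, size_t lowest,
			     size_t scan_base)
{
	size_t offset = lowest;

	assert(map != NULL);
	while (++offset < scan_base) {
		if (!map[offset])
			return (long)offset;
	}
	return -1;
}

/* Loop shape after the fix: test first, increment at the end. */
static long demo_scan_fixed(const unsigned char *map, size_t lowest,
			    size_t scan_base)
{
	size_t offset = lowest;

	assert(map != NULL);
	while (offset < scan_base) {
		if (!map[offset])
			return (long)offset;
		offset++;
	}
	return -1;
}
```

For a map of `{1, 0, 1, 1, 0, 1}` with `lowest = 1` and `scan_base = 6`, the pre-increment loop skips the free slot at index 1 and returns 4, while the fixed loop returns 1.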
Re: [PATCH] x86, cpu hotplug, use cpumask stack safe variant cpumask_var_t in check_irq_vectors_for_cpu_disable() [v2]
On Mon, Jan 20, 2014 at 01:57:58PM -0500, Prarit Bhargava wrote:
> Subject: [PATCH] x86, cpu hotplug, use cpumask stack safe variant
>  cpumask_var_t in check_irq_vectors_for_cpu_disable() [v2]
>
> kbuild, 0day kernel build service, outputs the warning:
>
> arch/x86/kernel/irq.c:333:1: warning: the frame size of 2056 bytes
> is larger than 2048 bytes [-Wframe-larger-than=]
>
> because check_irq_vectors_for_cpu_disable() allocates two cpumasks on the
> stack. Fix this by using cpumask_var_t, the cpumask stack safe variant.
>
> Signed-off-by: Prarit Bhargava
> Cc: Andi Kleen
> Cc: Michel Lespinasse
> Cc: Seiji Aguchi
> Cc: Yang Zhang
> Cc: Paul Gortmaker
> Cc: Janet Morgan
> Cc: Tony Luck
> Cc: Ruiv Wang
> Cc: Gong Chen
> Cc: H. Peter Anvin
> Cc: Gong Chen
> Cc: x...@kernel.org
> Cc: Fengguang Wu
>
> [v2]: switch from GFP_KERNEL to GFP_ATOMIC

Reviewed-by: Chen, Gong
[PATCH v2 0/4] Intel MPX support
Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structures and macros as much as possible when decoding MPX instructions.

Qiaowei Ren (4):
  x86, mpx: add documentation on Intel MPX
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE
  x86, mpx: extend siginfo structure to include bound violation information

 Documentation/x86/intel_mpx.txt    |  76 +++
 arch/x86/Kconfig                   |   4 +
 arch/x86/include/asm/mpx.h         |  63 ++
 arch/x86/include/asm/processor.h   |  16 ++
 arch/x86/kernel/Makefile           |   1 +
 arch/x86/kernel/mpx.c              | 417
 arch/x86/kernel/traps.c            |  61 +-
 include/uapi/asm-generic/siginfo.h |   9 +-
 include/uapi/linux/prctl.h         |   6 +
 kernel/signal.c                    |   4 +
 kernel/sys.c                       |  12 +
 11 files changed, 667 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c