Re: [PATCH V2] cpuidle/governors: Fix logic in selection of idle states
On 01/17/2014 05:33 AM, Preeti U Murthy wrote:

The cpuidle governors today are not handling scenarios where no idle state can be chosen. Such scenarios could arise if the user has disabled all the idle states at runtime or the latency requirement from the cpus is very strict.

The menu governor returns 0th index of the idle state table when no other idle state is suitable. This is even when the idle state corresponding to this index is disabled or the latency requirement is strict and the exit_latency of the lowest idle state is also not acceptable. Hence this patch fixes this logic in the menu governor by defaulting to an idle state index of -1 unless any other state is suitable.

The ladder governor needs a few more fixes in addition to that required in the menu governor. When the ladder governor decides to demote the idle state of a CPU, it does not check if the lower idle states are enabled. Add this logic in addition to the logic where it chooses an index of -1 if it can neither promote or demote the idle state of a cpu nor can it choose the current idle state.

The cpuidle_idle_call() will return back if the governor decides upon not entering any idle state. However it cannot return an error code because all archs have the logic today that if the call to cpuidle_idle_call() fails, it means that the cpuidle driver failed to *function*; for instance due to errors during registration. As a result they end up deciding upon a default idle state on their own, which could very well be a deep idle state. This is incorrect in cases where no idle state is suitable.

Besides, for the scenario that this patch is addressing, the call actually succeeds. It's just that no idle state is thought to be suitable by the governors. Under such a circumstance return success code without entering any idle state.

Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com

Changes from V1: https://lkml.org/lkml/2014/1/14/26

1. Change the return code to success from -EINVAL due to the reason mentioned in the changelog.
2. Add logic that the patch is addressing in the ladder governor as well.
3. Added relevant comments and removed redundant logic as suggested in the above thread.
---
 drivers/cpuidle/cpuidle.c          | 15 +-
 drivers/cpuidle/governors/ladder.c | 98 ++--
 drivers/cpuidle/governors/menu.c   |  7 +--
 3 files changed, 89 insertions(+), 31 deletions(-)

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index a55e68f..831b664 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -131,8 +131,9 @@ int cpuidle_idle_call(void)
 	/* ask the governor for the next state */
 	next_state = cpuidle_curr_governor->select(drv, dev);
+
+	dev->last_residency = 0;
 	if (need_resched()) {
-		dev->last_residency = 0;

Why do you need to do this change ? ^

 		/* give the governor an opportunity to reflect on the outcome */
 		if (cpuidle_curr_governor->reflect)
 			cpuidle_curr_governor->reflect(dev, next_state);
@@ -140,6 +141,18 @@ int cpuidle_idle_call(void)
 		return 0;
 	}
 
+	/* Unlike in the need_resched() case, we return here because the
+	 * governor did not find a suitable idle state. However idle is still
+	 * in progress as we are not asked to reschedule. Hence we return
+	 * without enabling interrupts. That will lead to a WARN.
+	 * NOTE: The return code should still be success, since the verdict of this
+	 * call is do not enter any idle state and not a failed call due to
+	 * errors.
+	 */
+	if (next_state < 0)
+		return 0;
+

Returning from here breaks the symmetry of the trace.

 	trace_cpu_idle_rcuidle(next_state, dev->cpu);
 
 	broadcast = !!(drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP);
diff --git a/drivers/cpuidle/governors/ladder.c b/drivers/cpuidle/governors/ladder.c
index 9f08e8c..f495f57 100644
--- a/drivers/cpuidle/governors/ladder.c
+++ b/drivers/cpuidle/governors/ladder.c
@@ -58,6 +58,36 @@ static inline void ladder_do_selection(struct ladder_device *ldev,
 	ldev->last_state_idx = new_idx;
 }
 
+static int can_promote(struct ladder_device *ldev, int last_idx,
+			int last_residency)
+{
+	struct ladder_device_state *last_state;
+
+	last_state = &ldev->states[last_idx];
+	if (last_residency > last_state->threshold.promotion_time) {
+		last_state->stats.promotion_count++;
+		last_state->stats.demotion_count = 0;
+		if (last_state->stats.promotion_count >= last_state->threshold.promotion_count)
+			return 1;
+	}
+	return 0;
+}
+
+static int can_demote(struct ladder_device *ldev, int last_idx,
+			int last_residency)
+{
+	struct ladder_device_state *last_state;
+
+	last_state =
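
The selection rule discussed in this thread can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the kernel code: `struct idle_state`, `select_idle_state()` and `idle_call()` are hypothetical names, states are assumed to be ordered from shallowest to deepest, and only the disabled flag and the exit latency are considered.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cpuidle state (not the kernel struct). */
struct idle_state {
	unsigned int exit_latency_us;	/* time needed to leave the state */
	int disabled;			/* non-zero if the user disabled it */
};

/*
 * Pick the last (assumed deepest) enabled state whose exit latency fits
 * the latency requirement. Returns -1 when no state qualifies instead of
 * silently falling back to index 0.
 */
static int select_idle_state(const struct idle_state *states, size_t count,
			     unsigned int latency_req_us)
{
	int chosen = -1;
	size_t i;

	for (i = 0; i < count; i++) {
		if (states[i].disabled)
			continue;
		if (states[i].exit_latency_us > latency_req_us)
			continue;
		chosen = (int)i;
	}
	return chosen;
}

/*
 * Caller side: a negative index means "do not enter any idle state".
 * That is a valid verdict, so the call still reports success (0).
 */
static int idle_call(const struct idle_state *states, size_t count,
		     unsigned int latency_req_us, int *entered)
{
	int next_state = select_idle_state(states, count, latency_req_us);

	*entered = next_state;
	if (next_state < 0)
		return 0;	/* nothing suitable; not an error */
	/* a real implementation would enter states[next_state] here */
	return 0;
}
```

With all states disabled, or with a latency requirement of 0, `select_idle_state()` yields -1 and `idle_call()` still returns 0, which mirrors the "success without entering any state" behaviour the changelog argues for.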
Re: [PATCH 0/4] powernv: kvm: numa fault improvement
On Wed, Jan 22, 2014 at 1:18 PM, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote:
Paul Mackerras pau...@samba.org writes:
On Mon, Jan 20, 2014 at 03:48:36PM +0100, Alexander Graf wrote:
On 15.01.2014, at 07:36, Liu ping fan kernelf...@gmail.com wrote:
On Thu, Jan 9, 2014 at 8:08 PM, Alexander Graf ag...@suse.de wrote:
On 11.12.2013, at 09:47, Liu Ping Fan kernelf...@gmail.com wrote:

This series is based on Aneesh's series [PATCH -V2 0/5] powerpc: mm: Numa faults support for ppc64

For this series, I apply the same idea from the previous thread [PATCH 0/3] optimize for powerpc _PAGE_NUMA (for which, I still try to get a machine to show numbers)

But for this series, I think that I have a good justification -- the fact of heavy cost when switching context between guest and host, which is well known.

This cover letter isn't really telling me anything. Please put a proper description of what you're trying to achieve, why you're trying to achieve what you're trying and convince your readers that it's a good idea to do it the way you do it.

Sorry for the unclear message. After introducing the _PAGE_NUMA, kvmppc_do_h_enter() can not fill up the hpte for guest. Instead, it should rely on host's kvmppc_book3s_hv_page_fault() to call do_numa_page() to do the numa fault check. This incurs the overhead when exiting from rmode to vmode. My idea is that in kvmppc_do_h_enter(), we do a quick check: if the page is correctly placed, there is no need to exit to vmode (i.e. saving htab, slab switching).

If my supposition is correct, I will CC k...@vger.kernel.org from the next version.

This translates to me as "This is an RFC"?

Yes, I am not quite sure about it. I have no bare-metal to verify it. So I hope at least, from the theory, it is correct.

Paul, could you please give this some thought and maybe benchmark it?

OK, once I get Aneesh to tell me how I get to have ptes with _PAGE_NUMA set in the first place. :)

I guess we want patch 2, which Liu has sent separately and I have reviewed.
http://article.gmane.org/gmane.comp.emulators.kvm.powerpc.devel/8619

I am not sure about the rest of the patches in the series. We definitely don't want to numa migrate on henter. We may want to do that on fault. But even there, IMHO, we should let the host take the fault and do the numa migration instead of doing this in guest context.

My patch does NOT do the numa migration in guest context (h_enter). Instead it just does a pre-check to see whether the numa migration is needed. If needed, the host will take the fault and do the numa migration as it currently does. Otherwise, h_enter can directly set up the hpte without HPTE_V_ABSENT. And since pte_mknuma() is called system-wide periodically, it is more likely that the guest will suffer from HPTE_V_ABSENT. (As in my previous reply, I think we should also place the quick check in kvmppc_hpte_hv_fault.)

Thx, Fan

-aneesh

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 0/8] Add support for PowerPC Hypervisor supplied performance counters
On 01/22/2014 07:02 AM, Michael Ellerman wrote:
On Thu, 2014-01-16 at 15:53 -0800, Cody P Schafer wrote:

These patches add basic pmus for 2 powerpc hypervisor interfaces to obtain performance counters: gpci (get performance counter info) and 24x7.

The counters supplied by these interfaces are continually counting and never need to be (and cannot be) disabled or enabled. They additionally do not generate any interrupts. This makes them in some regards similar to software counters, and as a result their implementation shares some common code (which an initial patch exposes) with the sw counters.

Hi Cody,

Can you please add some more explanation of this series.

In particular why do we need two new PMUs, and how do they relate to each other?

And can you add an example of how I'd actually use them using perf.

Yeah, agreed.
[Bug 67811] PASEMI: Kernel 3.13.0 doesn't boot with a PA6T cpu
Hi All,

Thanks a lot for your effort to solve the boot problems. Unfortunately, this patch doesn't work for the Nemo board. I need the patch created by Olof Johansson.

diff -rupN linux-3.13/arch/powerpc/kernel/head_64.S linux-3.13-nemo/arch/powerpc/kernel/head_64.S
--- linux-3.13/arch/powerpc/kernel/head_64.S	2014-01-05 00:12:14.0 +0100
+++ linux-3.13-nemo/arch/powerpc/kernel/head_64.S	2014-01-05 23:06:13.001618802 +0100
@@ -69,6 +69,13 @@ _GLOBAL(__start)
 	/* NOP this out unconditionally */
 BEGIN_FTR_SECTION
 	FIXUP_ENDIAN
+/* Hack for PWRficient platforms: Due to CFE(?) bug, the 64-bit
+ * word at 0x8 needs to be set to 0. Patch it up here once we're
+ * done executing it (we can be lazy and avoid invalidating
+ * icache)
+ */
+	li	r0,0
+	std	0,8(0)
 	b	.__start_initialization_multiplatform
 END_FTR_SECTION(0, 1)

Is it possible to integrate Olof's patch into the kernel sources?

All the best,
Christian

On 15.01.14 21:01, Christian Zigotzky wrote:

author     Linus Torvalds torva...@linux-foundation.org 2014-01-13 03:59:05 (GMT)
committer  Linus Torvalds torva...@linux-foundation.org 2014-01-13 03:59:05 (GMT)
commit     a6da83f98267bc8ee4e34aa899169991eb0ceb93
           https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=a6da83f98267bc8ee4e34aa899169991eb0ceb93
tree       84c228e0a87475dbdb0f72621c137cce8253131b
parent     061f49ec2d722f485237870f04544d8bec15a778
parent     10348f5976830e5d8f74e8abb04a9a057a5e8478

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

Pull powerpc fix from Ben Herrenschmidt:

Here's one regression fix for 3.13 that I would appreciate if you could still pull in. It was an interesting one to debug, basically it's an old bug that got somewhat exposed by new code breaking the boot on PA Semi boards (yes, it does appear that some people are still using these!)

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Check return value of instance-to-package OF call

Diffstat
-rw-r--r-- arch/powerpc/kernel/prom_init.c 22
1 files changed, 13 insertions, 9 deletions

diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index cb64a6e..078145a 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -1986,19 +1986,23 @@ static void __init prom_init_stdout(void)
 	/* Get the full OF pathname of the stdout device */
 	memset(path, 0, 256);
 	call_prom("instance-to-path", 3, 1, prom.stdout, path, 255);
-	stdout_node = call_prom("instance-to-package", 1, 1, prom.stdout);
-	val = cpu_to_be32(stdout_node);
-	prom_setprop(prom.chosen, "/chosen", "linux,stdout-package",
-		     &val, sizeof(val));
 	prom_printf("OF stdout device is: %s\n", of_stdout_device);
 	prom_setprop(prom.chosen, "/chosen", "linux,stdout-path",
 		     path, strlen(path) + 1);
 
-	/* If it's a display, note it */
-	memset(type, 0, sizeof(type));
-	prom_getprop(stdout_node, "device_type", type, sizeof(type));
-	if (strcmp(type, "display") == 0)
-		prom_setprop(stdout_node, path, "linux,boot-display", NULL, 0);
+	/* instance-to-package fails on PA-Semi */
+	stdout_node = call_prom("instance-to-package", 1, 1, prom.stdout);
+	if (stdout_node != PROM_ERROR) {
+		val = cpu_to_be32(stdout_node);
+		prom_setprop(prom.chosen, "/chosen", "linux,stdout-package",
+			     &val, sizeof(val));
+
+		/* If it's a display, note it */
+		memset(type, 0, sizeof(type));
+		prom_getprop(stdout_node, "device_type", type, sizeof(type));
+		if (strcmp(type, "display") == 0)
+			prom_setprop(stdout_node, path, "linux,boot-display", NULL, 0);
+	}
 }
 
 static int __init
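
The idea behind the prom_init.c fix quoted above (use the firmware result only when the call did not fail) can be shown in a small user-space sketch. Everything here is an assumption made for illustration: `FW_ERROR` stands in for the kernel's `PROM_ERROR`, and `lookup_ok()`, `lookup_fails()`, `struct chosen_props` and `note_stdout_package()` are hypothetical names, not kernel or Open Firmware interfaces.

```c
#include <stdint.h>

#define FW_ERROR ((uint32_t)-1)	/* stand-in for the kernel's PROM_ERROR */

/* Hypothetical record of what would be written under /chosen. */
struct chosen_props {
	int has_stdout_package;
	uint32_t stdout_package;
};

/* Hypothetical firmware lookups: one that works, one that always fails. */
static uint32_t lookup_ok(uint32_t instance)
{
	return instance + 0x100;
}

static uint32_t lookup_fails(uint32_t instance)
{
	(void)instance;
	return FW_ERROR;
}

/*
 * Record the package handle only if the lookup succeeded. On failure,
 * every property that depends on the handle is simply left unset.
 */
static void note_stdout_package(struct chosen_props *chosen,
				uint32_t (*lookup)(uint32_t),
				uint32_t instance)
{
	uint32_t node = lookup(instance);

	if (node == FW_ERROR)
		return;
	chosen->stdout_package = node;
	chosen->has_stdout_package = 1;
}
```

The design point is that the error value is never byte-swapped, stored, or passed on to further firmware calls, which is what the unchecked version effectively did.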
Re: [PATCH 8/8] powerpc: Fix endian issues in crash dump code
Hi Michael,

Not my favourite colour :D What about this instead? We could also add of_property_read_u32(), with an implied index of zero? I don't like the rc handling, but couldn't come up with anything I liked better.

Thanks for pointing that out, I didn't realise we had so many of_property_read_* helpers. I'll be sure to use them from here on :)

Anton
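
A rough user-space sketch of the suggestion above, a single-value reader defined as "index zero" of an indexed reader. This is not the kernel's implementation: `struct property`, `property_read_u32_index()` and `property_read_u32()` are hypothetical names, and the error codes are only loosely modelled on what such helpers tend to return. The byte-wise decode is what keeps the result independent of host endianness, which is the concern of this patch series.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical property: raw big-endian bytes, as stored in a device tree. */
struct property {
	const uint8_t *value;
	size_t length;		/* in bytes */
};

/* Read the 32-bit big-endian cell at the given index. */
static int property_read_u32_index(const struct property *prop, size_t index,
				   uint32_t *out)
{
	const uint8_t *p;

	if (!prop || !prop->value)
		return -EINVAL;
	if ((index + 1) * sizeof(uint32_t) > prop->length)
		return -EOVERFLOW;

	p = prop->value + index * sizeof(uint32_t);
	*out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
	return 0;
}

/* Convenience wrapper with an implied index of zero. */
static int property_read_u32(const struct property *prop, uint32_t *out)
{
	return property_read_u32_index(prop, 0, out);
}
```

Callers then check the return code once and never touch the raw bytes, so no explicit be32_to_cpu() conversions are scattered through the consuming code.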
Re: [PATCH V5 6/8] time/cpuidle: Support in tick broadcast framework in the absence of external clock device
On Wed, 15 Jan 2014, Preeti U Murthy wrote:

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 086ad60..d61404e 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -524,12 +524,13 @@ void clockevents_resume(void)
 #ifdef CONFIG_GENERIC_CLOCKEVENTS
 /**
  * clockevents_notify - notification about relevant events
+ * Returns non zero on error.
  */
-void clockevents_notify(unsigned long reason, void *arg)
+int clockevents_notify(unsigned long reason, void *arg)
 {

The interface change of clockevents_notify wants to be a separate patch.

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 9532690..1c23912 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -20,6 +20,7 @@
 #include <linux/sched.h>
 #include <linux/smp.h>
 #include <linux/module.h>
+#include <linux/slab.h>
 
 #include "tick-internal.h"
 
@@ -35,6 +36,15 @@ static cpumask_var_t tmpmask;
 static DEFINE_RAW_SPINLOCK(tick_broadcast_lock);
 static int tick_broadcast_force;
 
+/*
+ * Helper variables for handling broadcast in the absence of a
+ * tick_broadcast_device.
+ * */
+static struct hrtimer *bc_hrtimer;
+static int bc_cpu = -1;
+static ktime_t bc_next_wakeup;

Why do you need another variable to store the expiry time? The broadcast code already knows it and the hrtimer expiry value gives you the same information for free.

+static int hrtimer_initialized = 0;

What's the point of this hrtimer_initialized dance? Why not simply make the hrtimer static and avoid that altogether? Also adding the initialization into tick_broadcast_oneshot_available() is braindamaged. Why not add this to tick_broadcast_init(), which is the proper place to do it?

Aside of that you are making this hrtimer mode unconditional, which might break existing systems which are not aware of the hrtimer implications.

What you really want is a pseudo clock event device which has the proper functions for handling the timer and you can register it from your architecture code. The broadcast core code needs a few tweaks to avoid the shutdown of the cpu local clock event device, but aside of that the whole thing just falls into place. So architectures can use this if they want and are sure that their low level idle code knows about the deep idle preventing return value of clockevents_notify().

Once that works you can register the hrtimer based broadcast device and a real hardware broadcast device with a higher rating. It just works.

Find an incomplete and nonfunctional concept patch below. It should be simple to make it work for real.

Thanks,

tglx

Index: linux-2.6/include/linux/clockchips.h
===
--- linux-2.6.orig/include/linux/clockchips.h
+++ linux-2.6/include/linux/clockchips.h
@@ -62,6 +62,11 @@ enum clock_event_mode {
 #define CLOCK_EVT_FEAT_DYNIRQ	0x20
 #define CLOCK_EVT_FEAT_PERCPU	0x40
 
+/*
+ * Clockevent device is based on a hrtimer for broadcast
+ */
+#define CLOCK_EVT_FEAT_HRTIMER	0x80
+
 /**
  * struct clock_event_device - clock event device descriptor
  * @event_handler:	Assigned by the framework to be called by the low
@@ -83,6 +88,7 @@ enum clock_event_mode {
  * @name:		ptr to clock event name
  * @rating:		variable to rate clock event devices
  * @irq:		IRQ number (only for non CPU local devices)
+ * @bound_on:		Bound on CPU
  * @cpumask:		cpumask to indicate for which CPUs this device works
  * @list:		list head for the management code
  * @owner:		module reference
@@ -113,6 +119,7 @@ struct clock_event_device {
 	const char		*name;
 	int			rating;
 	int			irq;
+	int			bound_on;
 	const struct cpumask	*cpumask;
 	struct list_head	list;
 	struct module		*owner;
Index: linux-2.6/kernel/time/tick-broadcast-hrtimer.c
===
--- /dev/null
+++ linux-2.6/kernel/time/tick-broadcast-hrtimer.c
@@ -0,0 +1,77 @@
+
+static struct hrtimer bctimer;
+
+static void bc_set_mode(enum clock_event_mode mode,
+			struct clock_event_device *bc)
+{
+	switch (mode) {
+	case CLOCK_EVT_MODE_SHUTDOWN:
+		/*
+		 * Note, we cannot cancel the timer here as we might
+		 * run into the following live lock scenario:
+		 *
+		 * cpu 0		cpu1
+		 * lock(broadcast_lock);
+		 *			hrtimer_interrupt()
+		 *			bc_handler()
+		 *			   tick_handle_oneshot_broadcast();
+		 *			    lock(broadcast_lock);
+		 * hrtimer_cancel()
+
Re: [PATCH RFC] powerpc/mpc85xx: add support for the kmp204x reference board
On 01/21/2014 06:01 PM, Scott Wood wrote:
On Tue, 2014-01-21 at 17:34 +0100, Valentin Longchamp wrote:
On 01/20/2014 11:37 PM, Scott Wood wrote:
On Mon, 2014-01-20 at 17:38 +0100, Valentin Longchamp wrote:
On 01/17/2014 10:48 PM, Scott Wood wrote:

Why isn't the compatible "keymile,kmcoge4", like the model?

Because kmcoge4 is the board that is based on the kmp204x architecture/design. We expect other boards (kmcoge7 for instance) based on the same kmp204x design.

The top-level compatible isn't for the architecture or the design. It's for the board. Surely there's something different about kmcoge7 versus kmcoge4 -- is it visible to software?

There should only be a few differences in the dts between the two boards. Reading the ePAPR my understanding was that compatible is the programming model and that's what I have named above design/architecture while model is the exact model of the device, in this case the exact board name.

In practice, model is more for human consumption (e.g. there may be many variants that all look identical to software). The programming model for an entire board includes everything on it.

You would prefer that I have the model and compatible strictly the same and add any future board into the compatible boards[] from corenet_generic ?

That's how it's usually done. Or, at least provide the board architecture name as a secondary compatible after the board name.

If possible I would like to be able to see the boards that are based on a similar design, that's what I wanted to achieve with this kmp204x name.

Is kmp204x an official name of the architecture, rather than a generalization of kmp2040 and kmp2041? If there were a p2042, and you made a board for it, is there any chance it would be called kmp204x even if it were very different from the p2040/p2041 board?

It's the name we have picked up, but it's not official. We also use km83xx, km82xx and it was derived from that. If the hypothetical p2042 board was different it would then have another name.

In that case, I don't object to it being listed in compatible, though the specific board name should come first.

OK then to sum up both points we would have:

	model = "keymile,kmcoge4";
	compatible = "keymile,kmcoge4", "keymile,kmp204x";

And I would add "keymile,kmcoge4" into the boards[] table.

The device tree describes the hardware, not what driver you want to use. Plus, I don't see any driver that matches "gen,spidev" nor any binding for it, and "gen" doesn't make sense as a vendor prefix. The only instance of that string I can find in the Linux tree is in mgcoge.dts.

Well it comes from mgcoge and that's why I have used this. It's for usage with the spidev driver (driver/spi/spidev.c). I agree that the "gen" brings nothing. Would

	spidev@1 {
		compatible = "spidev";

make more sense ?

It doesn't address any of the other comments.

Can you please explicitly tell me how I should build this node ? What other comments ? Must I be more generic with the name ? Something like :

	spi@1 {
		compatible = "zarlink,30343", "spidev";

Remove "spidev". Any nodes under the SPI controller node will be SPI devices, right? So it doesn't add anything regarding hardware description.

OK. Thank you for the feedback, I will then send a revised patch as soon as I have time.

Valentin
[PATCH v2 1/3] powerpc/pseries: Device tree should only be updated once after suspend/migrate
From: Haren Myneni hb...@us.ibm.com

The current code makes rtas calls for update-nodes, activate-firmware and then update-nodes again. The FW provides the same data for both update-nodes calls. As a result a "proc entry exists" error is reported for the second update while adding device nodes.

This patch makes a single rtas call for update-nodes after activating the FW. It also adds an rtas_busy_delay() loop around the activate-firmware rtas call.

Signed-off-by: Haren Myneni hb...@us.ibm.com
Signed-off-by: Tyrel Datwyler tyr...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/pseries/mobility.c | 26 ++
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index cde4e0a..bde7eba 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -290,13 +290,6 @@ void post_mobility_fixup(void)
 	int rc;
 	int activate_fw_token;
 
-	rc = pseries_devicetree_update(MIGRATION_SCOPE);
-	if (rc) {
-		printk(KERN_ERR "Initial post-mobility device tree update "
-		       "failed: %d\n", rc);
-		return;
-	}
-
 	activate_fw_token = rtas_token("ibm,activate-firmware");
 	if (activate_fw_token == RTAS_UNKNOWN_SERVICE) {
 		printk(KERN_ERR "Could not make post-mobility "
@@ -304,16 +297,17 @@ void post_mobility_fixup(void)
 		return;
 	}
 
-	rc = rtas_call(activate_fw_token, 0, 1, NULL);
-	if (!rc) {
-		rc = pseries_devicetree_update(MIGRATION_SCOPE);
-		if (rc)
-			printk(KERN_ERR "Secondary post-mobility device tree "
-			       "update failed: %d\n", rc);
-	} else {
+	do {
+		rc = rtas_call(activate_fw_token, 0, 1, NULL);
+	} while (rtas_busy_delay(rc));
+
+	if (rc)
 		printk(KERN_ERR "Post-mobility activate-fw failed: %d\n", rc);
-		return;
-	}
+
+	rc = pseries_devicetree_update(MIGRATION_SCOPE);
+	if (rc)
+		printk(KERN_ERR "Post-mobility device tree update "
+		       "failed: %d\n", rc);
 
 	return;
 }
-- 
1.7.12.4
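
The control flow this patch ends up with (retry the firmware call while it reports busy, then update the device tree exactly once) can be modelled outside the kernel. This is a sketch under assumptions, not the pseries code: `struct fake_fw`, `fw_activate()`, `fw_busy_delay()` and `post_resume_fixup()` are hypothetical, `FW_BUSY` is an illustrative status value, and a real busy-delay helper would also sleep before asking for a retry.

```c
#include <stdio.h>

#define FW_BUSY (-2)	/* hypothetical "busy, try again" status */

/* Hypothetical firmware model: busy a few times, then a final status. */
struct fake_fw {
	int busy_left;		/* how many more calls report FW_BUSY */
	int final_rc;		/* status returned once no longer busy */
	int calls;		/* number of activate calls made */
	int tree_updates;	/* number of device tree updates performed */
};

static int fw_activate(struct fake_fw *fw)
{
	fw->calls++;
	if (fw->busy_left > 0) {
		fw->busy_left--;
		return FW_BUSY;
	}
	return fw->final_rc;
}

/* Non-zero means "retry the call"; a real helper would delay here. */
static int fw_busy_delay(int rc)
{
	return rc == FW_BUSY;
}

static int post_resume_fixup(struct fake_fw *fw)
{
	int rc;

	do {
		rc = fw_activate(fw);
	} while (fw_busy_delay(rc));

	if (rc)
		fprintf(stderr, "activate-fw failed: %d\n", rc);

	/* A single device tree update, performed after activation. */
	fw->tree_updates++;
	return rc;
}
```

As in the patch, the update is still attempted when activation fails; only the duplicate update before activation is gone.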
[PATCH v2 0/3] powerpc/pseries: fix issues in suspend/resume code
This patchset fixes a couple of issues encountered in the suspend/resume code base. First, when using the kernel device tree update code, update-nodes is unnecessarily called more than once. Second, the cpu cache lists are not updated after a suspend/resume, which under certain conditions may cause a panic. Finally, since the cache list fix utilizes the in-kernel device tree update code, a means for telling drmgr not to perform a device tree update from userspace is required.

Changes from v1:
- Fixed several commit message typos
- Fixed authorship of first two patches

Haren Myneni (2):
  powerpc/pseries: Device tree should only be updated once after suspend/migrate
  powerpc/pseries: Update dynamic cache nodes for suspend/resume operation

Tyrel Datwyler (1):
  powerpc/pseries: Report in kernel device tree update to drmgr

 arch/powerpc/include/asm/rtas.h           |  4
 arch/powerpc/kernel/rtas.c                | 17 +
 arch/powerpc/kernel/time.c                |  6 ++
 arch/powerpc/platforms/pseries/mobility.c | 26 ++
 arch/powerpc/platforms/pseries/suspend.c  | 25 -
 5 files changed, 61 insertions(+), 17 deletions(-)

-- 
1.7.12.4
[PATCH v2 2/3] powerpc/pseries: Update dynamic cache nodes for suspend/resume operation
From: Haren Myneni hb...@us.ibm.com

pHyp can change cache nodes for suspend/resume operation. The current code updates the device tree after all non boot CPUs are enabled. Hence, we do not modify the cache list based on the latest cache nodes. Also we do not remove cache entries for the primary CPU.

This patch removes the cache list for the boot CPU, updates the device tree before enabling nonboot CPUs and adds cache list for the boot cpu.

Signed-off-by: Haren Myneni hb...@us.ibm.com
Signed-off-by: Tyrel Datwyler tyr...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/rtas.h |  4
 arch/powerpc/kernel/rtas.c      | 17 +
 arch/powerpc/kernel/time.c      |  6 ++
 3 files changed, 27 insertions(+)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 9bd52c6..da9d733 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -283,6 +283,10 @@ extern void pSeries_log_error(char *buf, unsigned int err_type, int fatal);
 
 #ifdef CONFIG_PPC_PSERIES
 extern int pseries_devicetree_update(s32 scope);
+extern void post_mobility_fixup(void);
+extern void update_dynamic_configuration(void);
+#else /* !CONFIG_PPC_PSERIES */
+void update_dynamic_configuration(void) { }
 #endif
 
 #ifdef CONFIG_PPC_RTAS_DAEMON
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 4cf674d..8249eb2 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -43,6 +43,7 @@
 #include <asm/time.h>
 #include <asm/mmu.h>
 #include <asm/topology.h>
+#include "cacheinfo.h"
 
 struct rtas_t rtas = {
 	.lock = __ARCH_SPIN_LOCK_UNLOCKED
@@ -972,6 +973,22 @@ out:
 	free_cpumask_var(offline_mask);
 	return atomic_read(&data.error);
 }
+
+/*
+ * The device tree cache nodes can be modified during suspend/ resume.
+ * So delete all cache entries and recreate them again after the device tree
+ * update.
+ * We already deleted cache entries for notboot CPUs before suspend. So delete
+ * entries for the primary CPU, recreate entries after the device tree update.
+ * We can create entries for nonboot CPU when enable them later.
+ */
+
+void update_dynamic_configuration(void)
+{
+	cacheinfo_cpu_offline(smp_processor_id());
+	post_mobility_fixup();
+	cacheinfo_cpu_online(smp_processor_id());
+}
 #else /* CONFIG_PPC_PSERIES */
 int rtas_ibm_suspend_me(struct rtas_args *args)
 {
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index b3b1441..5f1ca28 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -69,6 +69,7 @@
 #include <asm/vdso_datapage.h>
 #include <asm/firmware.h>
 #include <asm/cputime.h>
+#include <asm/rtas.h>
 
 /* powerpc clocksource/clockevent code */
 
@@ -592,6 +593,11 @@ void arch_suspend_enable_irqs(void)
 	generic_suspend_enable_irqs();
 	if (ppc_md.suspend_enable_irqs)
 		ppc_md.suspend_enable_irqs();
+	/*
+	 * Update configuration which can be modified based on devicetree
+	 * changes during resume.
+	 */
+	update_dynamic_configuration();
 }
 #endif
 
-- 
1.7.12.4
[PATCH v2 3/3] powerpc/pseries: Report in kernel device tree update to drmgr
Traditionally it has been drmgr's responsibility to update the device tree through the /proc/ppc64/ofdt interface after a suspend/resume operation. This patchset however has modified suspend/resume ops to perform that update entirely in the kernel during the resume. Therefore, a mechanism is required for drmgr to determine who is responsible for the update. This patch adds a show function to the hibernate attribute that returns 1 if the kernel updates the device tree after the resume and 0 if drmgr is responsible.

Signed-off-by: Tyrel Datwyler tyr...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/pseries/suspend.c | 25 -
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/suspend.c b/arch/powerpc/platforms/pseries/suspend.c
index 16a2552..723115d 100644
--- a/arch/powerpc/platforms/pseries/suspend.c
+++ b/arch/powerpc/platforms/pseries/suspend.c
@@ -174,7 +174,30 @@ out:
 	return rc;
 }
 
-static DEVICE_ATTR(hibernate, S_IWUSR, NULL, store_hibernate);
+#define USER_DT_UPDATE	0
+#define KERN_DT_UPDATE	1
+
+/**
+ * show_hibernate - Report device tree update responsibilty
+ * @dev:	subsys root device
+ * @attr:	device attribute struct
+ * @buf:	buffer
+ *
+ * Report whether a device tree update is performed by the kernel after a
+ * resume, or if drmgr must coordinate the update from user space.
+ *
+ * Return value:
+ *	0 if drmgr is to initiate update, and 1 otherwise
+ **/
+static ssize_t show_hibernate(struct device *dev,
+				struct device_attribute *attr,
+				char *buf)
+{
+	return sprintf(buf, "%d\n", KERN_DT_UPDATE);
+}
+
+static DEVICE_ATTR(hibernate, S_IWUSR | S_IRUGO,
+		   show_hibernate, store_hibernate);
 
 static struct bus_type suspend_subsys = {
 	.name = "power",
-- 
1.7.12.4
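
On the consumer side, a tool such as drmgr would read the attribute and decide whether it still has to update the device tree itself. The sketch below is a guess at what that could look like and is not taken from drmgr: `enum dt_updater`, `parse_hibernate_attr()` and `read_hibernate_attr()` are hypothetical names, and the sysfs path is left to the caller because it is not spelled out in this patch.

```c
#include <stdio.h>

/* Who performs the device tree update after resume (hypothetical enum). */
enum dt_updater {
	DT_UPDATE_UNKNOWN = -1,	/* unreadable or unexpected content */
	DT_UPDATE_USER = 0,	/* user space (drmgr) must do it */
	DT_UPDATE_KERNEL = 1	/* kernel already did it */
};

/* Interpret the text read from the attribute ("0\n" or "1\n"). */
static enum dt_updater parse_hibernate_attr(const char *text)
{
	int value;

	if (!text || sscanf(text, "%d", &value) != 1)
		return DT_UPDATE_UNKNOWN;
	if (value == 0)
		return DT_UPDATE_USER;
	if (value == 1)
		return DT_UPDATE_KERNEL;
	return DT_UPDATE_UNKNOWN;
}

/*
 * Read the attribute at the given path. A kernel without the show
 * function exposes a write-only file, so opening it for reading fails
 * and the result is DT_UPDATE_UNKNOWN.
 */
static enum dt_updater read_hibernate_attr(const char *path)
{
	char buf[16];
	enum dt_updater who = DT_UPDATE_UNKNOWN;
	FILE *f = fopen(path, "r");

	if (!f)
		return DT_UPDATE_UNKNOWN;
	if (fgets(buf, sizeof(buf), f))
		who = parse_hibernate_attr(buf);
	fclose(f);
	return who;
}
```

A cautious caller would presumably treat DT_UPDATE_UNKNOWN the same as DT_UPDATE_USER, so that older kernels keep the traditional user-space update.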
Re: [PATCH RFC] powerpc/mpc85xx: add support for the kmp204x reference board
On Wed, 2014-01-22 at 17:38 +0100, Valentin Longchamp wrote:
On 01/21/2014 06:01 PM, Scott Wood wrote:
On Tue, 2014-01-21 at 17:34 +0100, Valentin Longchamp wrote:

Can you please explicitly tell me how I should build this node ? What other comments ? Must I be more generic with the name ? Something like :

	spi@1 {
		compatible = "zarlink,30343", "spidev";

Remove "spidev". Any nodes under the SPI controller node will be SPI devices, right? So it doesn't add anything regarding hardware description.

OK. Thank you for the feedback, I will then send a revised patch as soon as I have time.

Oh, and ideally the node name should describe the function of the device -- "spi" as a node name usually means a SPI controller. Maybe ptp_clock@1?

Also, "zarlink" should be added to Documentation/devicetree/bindings/vendor-prefixes.txt

-Scott
Re: [PATCH 3/3] powerpc/fsl: Use the new interface to save or restore registers
On Sun, 2014-01-19 at 23:57 -0600, Wang Dongsheng-B40534 wrote:

Use fsl_cpu_state_save/fsl_cpu_state_restore to save/restore registers. Use the functions to save/restore registers, so we don't need to maintain the code.

Signed-off-by: Wang Dongsheng dongsheng.w...@freescale.com

Is there any functional change with this patchset (e.g. suspend supported on chips where it wasn't before), or is it just cleanup? A cover letter would be useful to describe the purpose of the overall patchset when it isn't obvious.

Yes, just cleanup..

It seems to be introducing complexity rather than removing it. Is this cleanup needed to prepare for adding new functionality?

Plus, I'm skeptical that this is functionally equivalent. It looks like the new code saves a lot more than the old code does. Why?

Actually, I want to take a practical example to push the save/restore patches. And this is also reasonable for 32bit-hibernation, the code is more clean. :) I think I need to change the description of the patch.

+
+	/* Restore base register */
+	li	r4, 0
+	bl	fsl_cpu_state_restore

Why are you calling anything with fsl in the name from code that is supposed to be for all booke?

E200, E300 not support. Support E500, E500v2, E500MC, E5500, E6500. Do you have any suggestions about this?

What about non-FSL booke such as 44x? Or if this file never supported 44x, rename it appropriately.

Currently does not support. ok change the name first, if later support, and then again to modify the name of this function. How about 85xx_cpu_state_restore?

Symbols can't begin with numbers. booke_cpu_state_restore would be better (it would still provide a place for 44x to be added if somebody actually cared about doing so).

I'm still not convinced that asm code is the place to do this, though.

-Scott
Re: [PATCH 0/8] Add support for PowerPC Hypervisor supplied performance counters
On 01/21/2014 05:32 PM, Michael Ellerman wrote: On Thu, 2014-01-16 at 15:53 -0800, Cody P Schafer wrote: These patches add basic pmus for 2 powerpc hypervisor interfaces to obtain performance counters: gpci (get performance counter info) and 24x7. The counters supplied by these interfaces are continually counting and never need to be (and cannot be) disabled or enabled. They additionally do not generate any interrupts. This makes them in some regards similar to software counters, and as a result their implimentation shares some common code (which an initial patch exposes) with the sw counters. Hi Cody, Can you please add some more explanation of this series. Sure In particular why do we need two new PMUs, and how do they relate to each other? These 2 PMUs end up providing access to some cpu, core, and chip level counters not exposed via other interfaces, and additionally allow monitoring the performance of other lpars (guests) on the same host system. Because it provides access to core and chip level counters, this pair of PMUs could be thought of as powerpc's counterpart to x86's uncore events. As an example, processor_bus_utilization_abc and processor_bus_utilization_wxyz (in hv_gpci.h) allow retreval of total cycles and idle cycles for various inter-chip buses. GPCI is an interface that already exists on some power7 machines (depending on the fw version), but is rather in-flexible and code intensive to add additional counters to. The 24x7 interfaces currently are designed to co-exist with the gpci interface while replacing most of gpci's functionality on newer systems. Right now, the 24x7 code I've submitted uses the gpci calls to check if it has permission to access certain classes of counters. And can you add an example of how I'd actually use them using perf. 
# For gpci (formed from reading hv_gpci.h), gets processor_time_in_timebase_cycles
perf stat -e 'hv_gpci/counter_info_version=3,offset=0,length=8,secondary_index=0,starting_index=0x,request=0x10/' -r 0 -a -x ' ' sleep 0.1

# For 24x7, assuming access to hw+fw that supports it, gets a yet-to-be identified counter:
perf stat -e 'hv_24x7/domain=2,offset=8,starting_index=0,lpar=0x/' -r 0 -C 0 -x ' ' sleep 0.1
Re: [PATCH RFC 00/73] tree-wide: clean up some no longer required #include linux/init.h
[Re: [PATCH RFC 00/73] tree-wide: clean up some no longer required #include linux/init.h] On 22/01/2014 (Wed 18:00) Stephen Rothwell wrote: Hi Paul, On Tue, 21 Jan 2014 16:22:03 -0500 Paul Gortmaker paul.gortma...@windriver.com wrote: Where: This work exists as a queue of patches that I apply to linux-next; since the changes are fixing some things that currently can only be found there. The patch series can be found at: http://git.kernel.org/cgit/linux/kernel/git/paulg/init.git git://git.kernel.org/pub/scm/linux/kernel/git/paulg/init.git I've avoided annoying Stephen with another queue of patches for linux-next while the development content was in flux, but now that the merge window has opened, and new additions are fewer, perhaps he wouldn't mind tacking it on the end... Stephen? OK, I have added this to the end of linux-next today - we will see how we go. It is called init. Thanks, it was a great help as it uncovered a few issues in fringe arch that I didn't have toolchains for, and I've fixed all of those up. I've noticed that powerpc has been un-buildable for a while now; I have used this hack patch locally so I could run the ppc defconfigs to check that I didn't break anything. Maybe useful for linux-next in the interim? It is a hack patch -- Not-Signed-off-by: Paul Gortmaker. :) Paul. 
--
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index d27960c89a71..d0f070a2b395 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -560,9 +560,9 @@ extern void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
 			    pmd_t *pmdp);

 #define pmd_move_must_withdraw pmd_move_must_withdraw
-typedef struct spinlock spinlock_t;
-static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl,
-					 spinlock_t *old_pmd_ptl)
+struct spinlock;
+static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
+					 struct spinlock *old_pmd_ptl)
 {
 	/*
 	 * Archs like ppc64 use pgtable to store per pmd
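The hack above replaces a typedef with a bare forward declaration. The thread does not say exactly why the typedef broke the build; one plausible reason is that C before C11 rejects a repeated typedef of the same name, whereas repeating a `struct` forward declaration is always fine. A minimal sketch of the technique follows. All names ending in `_sketch`, and the struct layout, are hypothetical stand-ins rather than the kernel's definitions.

```c
#include <assert.h>

/* Forward declaration only: no typedef is introduced here, so a later
 * header that provides its own spinlock_t typedef cannot clash. */
struct spinlock;

/* Hypothetical stand-in for pmd_move_must_withdraw(). It only passes
 * pointers around, so the incomplete type is all it needs. */
static inline int locks_differ_sketch(struct spinlock *new_ptl,
				      struct spinlock *old_ptl)
{
	return new_ptl != old_ptl;
}

/* The full definition and the single typedef can live elsewhere and
 * appear later in the translation unit. */
struct spinlock {
	int locked;
};
typedef struct spinlock spinlock_t;

/* Small usage check: distinct locks differ, a lock equals itself. */
int demo_forward_decl_sketch(void)
{
	spinlock_t a = { 0 }, b = { 0 };

	assert(locks_differ_sketch(&a, &b) == 1);
	assert(locks_differ_sketch(&a, &a) == 0);
	return 0;
}
```

The design point is that a function taking only pointers to a struct never needs the typedef or the full definition at its declaration site.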
Re: [PATCH] clk: corenet: Update the clock bindings
On Tue, 2014-01-21 at 10:02 +0800, Tang Yuantian wrote:

From: Tang Yuantian yuantian.t...@freescale.com

Main changes include:
- Clarified the clock nodes' version number
- Fixed an issue in example

Signed-off-by: Tang Yuantian yuantian.t...@freescale.com
---
 Documentation/devicetree/bindings/clock/corenet-clock.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/clock/corenet-clock.txt b/Documentation/devicetree/bindings/clock/corenet-clock.txt
index 24711af..d6cadef 100644
--- a/Documentation/devicetree/bindings/clock/corenet-clock.txt
+++ b/Documentation/devicetree/bindings/clock/corenet-clock.txt
@@ -54,6 +54,8 @@ Required properties:
 		It takes parent's clock-frequency as its clock.
 	* fsl,qoriq-sysclk-2.0: for input system clock (v2.0).
 		It takes parent's clock-frequency as its clock.
+	Note: v1.0 and v2.0 are clock version which should align to
+	clockgen node's they belong to which is chassis version.

Instead, how about a note like this near the top of the file:

All references to 1.0 and 2.0 refer to the QorIQ chassis version to which the chip complies.

Chassis Version		Example Chips
---------------		-------------
1.0			p4080, p5020, p5040
2.0			t4240, b4860, t1040

BTW, this binding and the associated driver really should be called qoriq-clock, not corenet-clock. This would match the compatible string, and it doesn't really have much to do with corenet (which is part of the QorIQ chassis v1 and v2, but not *this* part).

Do you know if the chassis v3 clock interface will be similar enough to share a driver?

-Scott
Re: [PATCH 2/3] powerpc/85xx: Provide two functions to save/restore the core registers
On Mon, 2014-01-20 at 20:43 -0600, Wang Dongsheng-B40534 wrote: -Original Message- From: Wood Scott-B07421 Sent: Tuesday, January 21, 2014 9:06 AM To: Wang Dongsheng-B40534 Cc: b...@kernel.crashing.org; Zhao Chenhui-B35336; an...@enomsg.org; linuxppc- d...@lists.ozlabs.org Subject: Re: [PATCH 2/3] powerpc/85xx: Provide two functions to save/restore the core registers On Mon, 2014-01-20 at 00:03 -0600, Wang Dongsheng-B40534 wrote: + /* + * Need to save float-point registers if MSR[FP] = 1. + */ + mfmsr r12 + andi. r12, r12, MSR_FP + beq 1f + do_sr_fpr_regs(save) C code should have already ensured that MSR[FP] is not 1 (and thus the FP context has been saved). Yes, right. But I mean if the FP still use in core save flow, we need to save it. In this process, i don't care what other code do, we need to focus on not losing valuable data. It is not allowed to use FP at that point. If MSR[FP] not active, that is FP not allowed to use. But here is a normal judgment, if MSR[FP] is active, this means that the floating point module is being used. I offer is a function of the interface, we don't know where is the function will be called. Just because we call this function in the context of uncertainty, we need this judgment to ensure that no data is lost. The whole point of calling enable_kernel_fp() in C code before suspending is to ensure that the FP state gets saved. If FP is used after that point it is a bug. If you're worried about such bugs, then clear MSR[FP] after calling enable_kernel_fp(), rather than adding redundant state saving. enable_kernel_fp() calling in MEM suspend flow. Hibernation is different with MEM suspend, and I'm not sure where will call this interface, so we need to ensure the integrity of the core saving. I don't think this code is *redundant*. I trust that the kernel can keep the FP related operations, that's why a judgment is here. 
:) For hibernation, save_processor_state() is called first, which does flush_fp_to_thread() which has a similar effect (though I wonder if it's being called on the correct task for non-SMP). -Scott
Re: [PATCH] mtd: m25p80: Make the name of mtd_info fixed
Hi Hou,

On Mon, Jan 06, 2014 at 02:34:29PM +0800, Hou Zhiqiang wrote:

To give spi flash layout using mtdparts=... in cmdline, we must give mtd_info a fixed name, because the cmdlinepart's parser will match the name given in cmdline with the mtd_info. Now, if an OF node is used, mtd_info's name will be spi->dev->name. It consists of spi_master->bus_num, and the spi_master->bus_num may be dynamically fetched. So, give the mtd_info a new fixed name "name.cs", where name is the name of spi_device_id and cs is the chip-select in spi_dev.

Signed-off-by: Hou Zhiqiang b48...@freescale.com
---
 drivers/mtd/devices/m25p80.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/mtd/devices/m25p80.c b/drivers/mtd/devices/m25p80.c
index eb558e8..d1ed480 100644
--- a/drivers/mtd/devices/m25p80.c
+++ b/drivers/mtd/devices/m25p80.c
@@ -1012,7 +1012,8 @@ static int m25p_probe(struct spi_device *spi)
 	if (data && data->name)
 		flash->mtd.name = data->name;
 	else
-		flash->mtd.name = dev_name(&spi->dev);
+		flash->mtd.name = kasprintf(GFP_KERNEL, "%s.%d",
+					    id->name, spi->chip_select);

Changing the mtd.name may have far-reaching consequences for users who already have mtdparts= command lines. But your concern is probably valid for dynamically-determined bus numbers. Perhaps you can edit this patch to only change the name when the busnum is dynamically-allocated?

This also needs a NULL check (for OOM), and you leak memory on device removal.

 	flash->mtd.type = MTD_NORFLASH;
 	flash->mtd.writesize = 1;

Brian
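The two review points above (check the allocation, free it on removal) can be sketched in plain userspace C. This is only an analogue under stated assumptions: `snprintf` plus `malloc` stands in for `kasprintf()`, `-1` stands in for `-ENOMEM`, and `struct flash_sketch`, `probe_sketch()` and `remove_sketch()` are hypothetical names, not the m25p80 driver's code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the driver's per-device state. */
struct flash_sketch {
	char *name;	/* owned: allocated in probe, freed in remove */
};

/* Userspace analogue of kasprintf(GFP_KERNEL, "%s.%d", ...).
 * Returns NULL if formatting or allocation fails. */
static char *make_name_sketch(const char *id_name, int chip_select)
{
	int len = snprintf(NULL, 0, "%s.%d", id_name, chip_select);
	char *buf;

	if (len < 0)
		return NULL;
	buf = malloc((size_t)len + 1);
	if (!buf)
		return NULL;
	snprintf(buf, (size_t)len + 1, "%s.%d", id_name, chip_select);
	return buf;
}

/* Probe path: the NULL check is the first review point. */
int probe_sketch(struct flash_sketch *flash, const char *id_name, int cs)
{
	flash->name = make_name_sketch(id_name, cs);
	if (!flash->name)
		return -1;	/* kernel code would return -ENOMEM */
	return 0;
}

/* Remove path: freeing the name is the second review point. */
void remove_sketch(struct flash_sketch *flash)
{
	free(flash->name);
	flash->name = NULL;
}

/* Small usage check covering probe followed by remove. */
int demo_probe_remove_sketch(void)
{
	struct flash_sketch f = { NULL };

	assert(probe_sketch(&f, "m25p80", 1) == 0);
	assert(strcmp(f.name, "m25p80.1") == 0);
	remove_sketch(&f);
	return 0;
}
```

In kernel code a device-managed allocation would be another way to avoid the leak, but whether that fits this driver is not something the thread settles.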
RE: [PATCH] clk: corenet: Update the clock bindings
-Original Message-
From: Wood Scott-B07421
Sent: Thursday, January 23, 2014 8:44
To: Tang Yuantian-B29983
Cc: Wood Scott-B07421; ga...@kernel.crashing.org; linuxppc-d...@lists.ozlabs.org; devicet...@vger.kernel.org; Kushwaha Prabhakar-B32579
Subject: Re: [PATCH] clk: corenet: Update the clock bindings

On Tue, 2014-01-21 at 10:02 +0800, Tang Yuantian wrote:

From: Tang Yuantian yuantian.t...@freescale.com

Main changes include:
- Clarified the clock nodes' version number
- Fixed an issue in example

Signed-off-by: Tang Yuantian yuantian.t...@freescale.com
---
 Documentation/devicetree/bindings/clock/corenet-clock.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/clock/corenet-clock.txt b/Documentation/devicetree/bindings/clock/corenet-clock.txt
index 24711af..d6cadef 100644
--- a/Documentation/devicetree/bindings/clock/corenet-clock.txt
+++ b/Documentation/devicetree/bindings/clock/corenet-clock.txt
@@ -54,6 +54,8 @@ Required properties:
 		It takes parent's clock-frequency as its clock.
 	* fsl,qoriq-sysclk-2.0: for input system clock (v2.0).
 		It takes parent's clock-frequency as its clock.
+	Note: v1.0 and v2.0 are clock version which should align to
+	clockgen node's they belong to which is chassis version.

Instead, how about a note like this near the top of the file:

All references to 1.0 and 2.0 refer to the QorIQ chassis version to which the chip complies.

Chassis Version		Example Chips
---------------		-------------
1.0			p4080, p5020, p5040
2.0			t4240, b4860, t1040

Better, I will update.

BTW, this binding and the associated driver really should be called qoriq-clock, not corenet-clock. This would match the compatible string, and it doesn't really have much to do with corenet (which is part of the QorIQ chassis v1 and v2, but not *this* part).

Do you know if the chassis v3 clock interface will be similar enough to share a driver?

Doesn't QorIQ include some low-end SoCs, like p1022, p1020? This driver has nothing to do with these boards.

I have no idea about chassis v3. If it has a similar clock tree, this driver can be shared. Even if the driver can't be used by v3, we can easily add v3 support since it has a different compatible string.

Regards,
Yuantian

-Scott
RE: [PATCH 2/3] powerpc/85xx: Provide two functions to save/restore the core registers
The whole point of calling enable_kernel_fp() in C code before suspending is to ensure that the FP state gets saved. If FP is used after that point it is a bug. If you're worried about such bugs, then clear MSR[FP] after calling enable_kernel_fp(), rather than adding redundant state saving.

enable_kernel_fp() calling in MEM suspend flow. Hibernation is different with MEM suspend, and I'm not sure where will call this interface, so we need to ensure the integrity of the core saving. I don't think this code is *redundant*. I trust that the kernel can keep the FP related operations, that's why a judgment is here. :)

For hibernation, save_processor_state() is called first, which does flush_fp_to_thread() which has a similar effect (though I wonder if it's being called on the correct task for non-SMP).

Yes, thanks, I missed this code. :) But I still think we need to keep this check, because I am providing an API. If you still insist, I can remove the *FP* handling, but I would rather not. :)
RE: [PATCH 3/3] powerpc/fsl: Use the new interface to save or restore registers
Currently does not support. ok change the name first, if later support, and then again to modify the name of this function. How about 85xx_cpu_state_restore?

Symbols can't begin with numbers. booke_cpu_state_restore would be better (it would still provide a place for 44x to be added if somebody actually cared about doing so).

:). Thanks.

-Dongsheng
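To make the naming rule above concrete, here is a small sketch of the basic C identifier rule: the first character must be a letter or underscore, the rest letters, digits or underscores. The function name `is_valid_symbol_sketch` is made up for illustration, and the check deliberately ignores reserved keywords and extended characters.

```c
#include <assert.h>
#include <ctype.h>

/* Returns 1 if s looks like a plain C identifier, 0 otherwise.
 * Keywords and extended characters are not considered. */
int is_valid_symbol_sketch(const char *s)
{
	if (!s || !(isalpha((unsigned char)*s) || *s == '_'))
		return 0;
	for (s++; *s; s++)
		if (!(isalnum((unsigned char)*s) || *s == '_'))
			return 0;
	return 1;
}

/* The two names discussed in the thread. */
void demo_symbol_names_sketch(void)
{
	assert(is_valid_symbol_sketch("85xx_cpu_state_restore") == 0);
	assert(is_valid_symbol_sketch("booke_cpu_state_restore") == 1);
}
```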
RE: [PATCH] mtd: m25p80: Make the name of mtd_info fixed
Hi Brian,

Thanks for your comments!

-Original Message-
From: Brian Norris [mailto:computersforpe...@gmail.com]
Sent: Thursday, January 23, 2014 10:12 AM
To: Hou Zhiqiang-B48286
Cc: linux-...@lists.infradead.org; linuxppc-...@ozlabs.org; Wood Scott-B07421; Hu Mingkai-B21284; Ezequiel Garcia
Subject: Re: [PATCH] mtd: m25p80: Make the name of mtd_info fixed

Hi Hou,

On Mon, Jan 06, 2014 at 02:34:29PM +0800, Hou Zhiqiang wrote:

To give spi flash layout using mtdparts=... in cmdline, we must give mtd_info a fixed name, because the cmdlinepart's parser will match the name given in cmdline with the mtd_info. Now, if an OF node is used, mtd_info's name will be spi->dev->name. It consists of spi_master->bus_num, and the spi_master->bus_num may be dynamically fetched. So, give the mtd_info a new fixed name "name.cs", where name is the name of spi_device_id and cs is the chip-select in spi_dev.

Signed-off-by: Hou Zhiqiang b48...@freescale.com
---
 drivers/mtd/devices/m25p80.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/mtd/devices/m25p80.c b/drivers/mtd/devices/m25p80.c
index eb558e8..d1ed480 100644
--- a/drivers/mtd/devices/m25p80.c
+++ b/drivers/mtd/devices/m25p80.c
@@ -1012,7 +1012,8 @@ static int m25p_probe(struct spi_device *spi)
 	if (data && data->name)
 		flash->mtd.name = data->name;
 	else
-		flash->mtd.name = dev_name(&spi->dev);
+		flash->mtd.name = kasprintf(GFP_KERNEL, "%s.%d",
+					    id->name, spi->chip_select);

Changing the mtd.name may have far-reaching consequences for users who already have mtdparts= command lines. But your concern is probably valid for dynamically-determined bus numbers. Perhaps you can edit this patch to only change the name when the busnum is dynamically-allocated?

It's a good idea, but when mtd_info's name is dynamically allocated, using mtdparts=... on the command line is obviously not valid. Could you tell me what side effects the change of mtd_info's name would bring? Thanks

This also needs a NULL check (for OOM), and you leak memory on device removal.

Yes, it's necessary to check the return value of kasprintf().

 	flash->mtd.type = MTD_NORFLASH;
 	flash->mtd.writesize = 1;

Brian

Thanks,
Hou Zhiqiang
Re: [PATCH V5 6/8] time/cpuidle: Support in tick broadcast framework in the absence of external clock device
Hi Thomas, Thank you very much for the review. On 01/22/2014 06:57 PM, Thomas Gleixner wrote: On Wed, 15 Jan 2014, Preeti U Murthy wrote: diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c index 086ad60..d61404e 100644 --- a/kernel/time/clockevents.c +++ b/kernel/time/clockevents.c @@ -524,12 +524,13 @@ void clockevents_resume(void) #ifdef CONFIG_GENERIC_CLOCKEVENTS /** * clockevents_notify - notification about relevant events + * Returns non zero on error. */ -void clockevents_notify(unsigned long reason, void *arg) +int clockevents_notify(unsigned long reason, void *arg) { The interface change of clockevents_notify wants to be a separate patch. diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c index 9532690..1c23912 100644 --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -20,6 +20,7 @@ #include linux/sched.h #include linux/smp.h #include linux/module.h +#include linux/slab.h #include tick-internal.h @@ -35,6 +36,15 @@ static cpumask_var_t tmpmask; static DEFINE_RAW_SPINLOCK(tick_broadcast_lock); static int tick_broadcast_force; +/* + * Helper variables for handling broadcast in the absence of a + * tick_broadcast_device. + * */ +static struct hrtimer *bc_hrtimer; +static int bc_cpu = -1; +static ktime_t bc_next_wakeup; Why do you need another variable to store the expiry time? The broadcast code already knows it and the hrtimer expiry value gives you the same information for free. The reason was functions like tick_handle_oneshot_broadcast() and tick_broadcast_switch_to_oneshot() were using the tick_broadcast_device.evtdev-next_event to set/get the next wakeups. But since this patchset introduced an explicit hrtimer for archs which did not have such a device, I wanted these functions to use a generic parameter to set/get the next wakeups without having to know about the existence of this hrtimer, if at all. 
And program the hrtimer/tick broadcast device whichever was present only when the next event was to be set. But with your below concept patch, we will not be required to do this. +static int hrtimer_initialized = 0; What's the point of this hrtimer_initialized dance? Why not simply making the hrtimer static and avoid that all together. Also adding the initialization into tick_broadcast_oneshot_available() is braindamaged. Why not adding this to tick_broadcast_init() which is the proper place to do? Right I agree, this hrtimer initialization should have been in tick_broadcast_init() and a simple static declaration would have done the job. Aside of that you are making this hrtimer mode unconditional, which might break existing systems which are not aware of the hrtimer implications. What you really want is a pseudo clock event device which has the proper functions for handling the timer and you can register it from your architecture code. The broadcast core code needs a few tweaks to avoid the shutdown of the cpu local clock event device, but aside of that the whole thing just falls into place. So architectures can use this if they want and are sure that their low level idle code knows about the deep idle preventing return value of clockevents_notify(). Once that works you can register the hrtimer based broadcast device and a real hardware broadcast device with a higher rating. It just works. I now completely see your point. This will surely break on archs which are not using the return value of the BROADCAST_ENTER notification. I am not even giving them a choice about using the hrtimer mode of broadcast framework and am expecting them to take action for the failed return of BROADCAST_ENTER. I missed that critical point. I went through the below patch and am able to see how you are solving this problem. Find an incomplete and nonfunctional concept patch below. It should be simple to make it work for real. Thank you very much for the valuable review. 
The below patch makes your points very clear. Let me try this out. Regards Preeti U Murthy Thanks, tglx Index: linux-2.6/include/linux/clockchips.h === --- linux-2.6.orig/include/linux/clockchips.h +++ linux-2.6/include/linux/clockchips.h @@ -62,6 +62,11 @@ enum clock_event_mode { #define CLOCK_EVT_FEAT_DYNIRQ0x20 #define CLOCK_EVT_FEAT_PERCPU0x40 +/* + * Clockevent device is based on a hrtimer for broadcast + */ +#define CLOCK_EVT_FEAT_HRTIMER 0x80 + /** * struct clock_event_device - clock event device descriptor * @event_handler: Assigned by the framework to be called by the low @@ -83,6 +88,7 @@ enum clock_event_mode { * @name:ptr to clock event name * @rating: variable to rate clock event devices * @irq: IRQ number (only for non CPU local devices) + * @bound_on:
[PATCH 1/2] clocksource: Remove outdated comments
clocksource_register() and __clocksource_register_scale() always return 0, so the comments documenting an -EBUSY return value are outdated. Remove them.

Signed-off-by: Yijing Wang wangyij...@huawei.com
---
 kernel/time/clocksource.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index ba3e502..9951575 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -779,8 +779,6 @@ EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);
  * @scale: Scale factor multiplied against freq to get clocksource hz
  * @freq: clocksource frequency (cycles per second) divided by scale
  *
- * Returns -EBUSY if registration fails, zero otherwise.
- *
  * This *SHOULD NOT* be called directly! Please use the
  * clocksource_register_hz() or clocksource_register_khz helper functions.
  */
@@ -805,7 +803,6 @@ EXPORT_SYMBOL_GPL(__clocksource_register_scale);
  * clocksource_register - Used to install new clocksources
  * @cs:	clocksource to be registered
  *
- * Returns -EBUSY if registration fails, zero otherwise.
  */
 int clocksource_register(struct clocksource *cs)
 {
--
1.7.1
[PATCH 2/2] clocksource: Make clocksource register functions void
Currently, clocksource_register() and __clocksource_register_scale() functions always return 0, it's pointless, make functions void. And remove the dead code that check the clocksource_register_hz() return value. Signed-off-by: Yijing Wang wangyij...@huawei.com --- arch/arm/mach-davinci/time.c|5 ++--- arch/arm/mach-msm/timer.c |4 +--- arch/arm/mach-omap2/timer.c |8 +++- arch/avr32/kernel/time.c|4 +--- arch/blackfin/kernel/time-ts.c |6 ++ arch/microblaze/kernel/timer.c |3 +-- arch/mips/jz4740/time.c |6 +- arch/mips/loongson/common/cs5536/cs5536_mfgpt.c |3 ++- arch/openrisc/kernel/time.c |3 +-- arch/powerpc/kernel/time.c |6 +- arch/um/kernel/time.c |6 +- arch/x86/platform/uv/uv_time.c | 14 ++ drivers/clocksource/acpi_pm.c |3 ++- drivers/clocksource/cadence_ttc_timer.c |6 +- drivers/clocksource/exynos_mct.c|4 +--- drivers/clocksource/i8253.c |3 ++- drivers/clocksource/mmio.c |3 ++- drivers/clocksource/samsung_pwm_timer.c |5 + drivers/clocksource/scx200_hrt.c|3 ++- drivers/clocksource/tcb_clksrc.c|8 +--- drivers/clocksource/timer-marco.c |2 +- drivers/clocksource/timer-prima2.c |2 +- drivers/clocksource/vt8500_timer.c |4 +--- include/linux/clocksource.h |8 kernel/time/clocksource.c |6 ++ kernel/time/jiffies.c |3 ++- 26 files changed, 45 insertions(+), 83 deletions(-) diff --git a/arch/arm/mach-davinci/time.c b/arch/arm/mach-davinci/time.c index 56c6eb5..9536f85 100644 --- a/arch/arm/mach-davinci/time.c +++ b/arch/arm/mach-davinci/time.c @@ -387,9 +387,8 @@ void __init davinci_timer_init(void) /* setup clocksource */ clocksource_davinci.name = id_to_name[clocksource_id]; - if (clocksource_register_hz(clocksource_davinci, - davinci_clock_tick_rate)) - printk(err, clocksource_davinci.name); + clocksource_register_hz(clocksource_davinci, + davinci_clock_tick_rate); setup_sched_clock(davinci_read_sched_clock, 32, davinci_clock_tick_rate); diff --git a/arch/arm/mach-msm/timer.c b/arch/arm/mach-msm/timer.c index 1e9c338..c96e034 100644 --- a/arch/arm/mach-msm/timer.c +++ 
b/arch/arm/mach-msm/timer.c @@ -226,9 +226,7 @@ static void __init msm_timer_init(u32 dgt_hz, int sched_bits, int irq, err: writel_relaxed(TIMER_ENABLE_EN, source_base + TIMER_ENABLE); - res = clocksource_register_hz(cs, dgt_hz); - if (res) - pr_err(clocksource_register failed\n); + clocksource_register_hz(cs, dgt_hz); setup_sched_clock(msm_sched_clock_read, sched_bits, dgt_hz); } diff --git a/arch/arm/mach-omap2/timer.c b/arch/arm/mach-omap2/timer.c index 3ca81e0..beaf7c7 100644 --- a/arch/arm/mach-omap2/timer.c +++ b/arch/arm/mach-omap2/timer.c @@ -473,11 +473,9 @@ static void __init omap2_gptimer_clocksource_init(int gptimer_id, OMAP_TIMER_NONPOSTED); setup_sched_clock(dmtimer_read_sched_clock, 32, clksrc.rate); - if (clocksource_register_hz(clocksource_gpt, clksrc.rate)) - pr_err(Could not register clocksource %s\n, - clocksource_gpt.name); - else - pr_info(OMAP clocksource: %s at %lu Hz\n, + clocksource_register_hz(clocksource_gpt, clksrc.rate); + + pr_info(OMAP clocksource: %s at %lu Hz\n, clocksource_gpt.name, clksrc.rate); } diff --git a/arch/avr32/kernel/time.c b/arch/avr32/kernel/time.c index d0f771b..51b4a66 100644 --- a/arch/avr32/kernel/time.c +++ b/arch/avr32/kernel/time.c @@ -134,9 +134,7 @@ void __init time_init(void) /* figure rate for counter */ counter_hz = clk_get_rate(boot_cpu_data.clk); - ret = clocksource_register_hz(counter, counter_hz); - if (ret) - pr_debug(timer: could not register clocksource: %d\n, ret); + clocksource_register_hz(counter, counter_hz); /* setup COMPARE clockevent */ comparator.mult = div_sc(counter_hz, NSEC_PER_SEC, comparator.shift); diff --git a/arch/blackfin/kernel/time-ts.c b/arch/blackfin/kernel/time-ts.c index cb0a484..df3bb08 100644 --- a/arch/blackfin/kernel/time-ts.c +++ b/arch/blackfin/kernel/time-ts.c @@ -51,8 +51,7 @@ static inline unsigned long long bfin_cs_cycles_sched_clock(void) static int __init bfin_cs_cycles_init(void) { - if
Re: [PATCH 2/2] clocksource: Make clocksource register functions void
On Thu, Jan 23, 2014 at 8:45 AM, Tony Prisk li...@prisktech.co.nz wrote:

-static inline int clocksource_register_hz(struct clocksource *cs, u32 hz)
+static inline void clocksource_register_hz(struct clocksource *cs, u32 hz)
 {
 	return __clocksource_register_scale(cs, 1, hz);
 }

This doesn't make sense - you are still returning a value on a function declared void, and the return is now from a function that doesn't return anything either ?!?! Doesn't this throw a compile-time warning??

No, passing on void in functions returning void doesn't cause compiler warnings.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds
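As a hedged illustration of the point above: GCC and Clang accept `return f();` inside a void function at their default warning levels when `f()` itself returns void, although ISO C formally forbids a `return` with an expression in a void function, so `-Wpedantic` does produce a diagnostic. The names below are made-up stand-ins, not the kernel's clocksource functions.

```c
#include <assert.h>

/* Counts calls so the effect of the wrappers is observable. */
static int registrations;

/* Hypothetical stand-in for __clocksource_register_scale() after the
 * patch: it cannot fail, so it returns nothing. */
void register_scale_sketch(unsigned int scale, unsigned int freq)
{
	(void)scale;
	(void)freq;
	registrations++;
}

/* Mirrors the quoted wrapper: forwards a void expression with return.
 * Compiles quietly by default; -Wpedantic flags it in C. */
static inline void register_hz_sketch(unsigned int hz)
{
	return register_scale_sketch(1, hz);
}

/* Strictly conforming spelling: simply drop the return keyword. */
static inline void register_hz_strict_sketch(unsigned int hz)
{
	register_scale_sketch(1, hz);
}

/* Both wrappers reach the underlying function exactly once each. */
int demo_registrations_sketch(void)
{
	registrations = 0;
	register_hz_sketch(1000);
	register_hz_strict_sketch(1000);
	assert(registrations == 2);
	return registrations;
}
```

Both spellings behave identically at run time; the second one simply avoids relying on a compiler extension.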