Re: [RFC PATCH 1/5] x86: Add cpu capability flag X86_FEATURE_TSC_S3_NOTSTOP
On Mon, Jan 21, 2013 at 02:27:29AM -0500, Chen Gong wrote:
> On Mon, Jan 21, 2013 at 02:38:41PM +0800, Feng Tang wrote:
> > Date: Mon, 21 Jan 2013 14:38:41 +0800
> > From: Feng Tang
> > To: Thomas Gleixner, John Stultz, Ingo Molnar, "H. Peter Anvin",
> >  x...@kernel.org, Len Brown, "Rafael J. Wysocki",
> >  linux-kernel@vger.kernel.org
> > Cc: Feng Tang
> > Subject: [RFC PATCH 1/5] x86: Add cpu capability flag
> >  X86_FEATURE_TSC_S3_NOTSTOP
> > X-Mailer: git-send-email 1.7.9.5
> >
> > On some new Intel Atom processors (Penwell and Cloverview), there is
> > a feature that the TSC won't stop in S3, that is, the TSC value won't
> > be reset to 0 after resume. This feature makes TSC a more reliable
> > clocksource and could benefit the timekeeping code during system
> > suspend/resume cycle, so add a flag for it.
> >
> > Signed-off-by: Feng Tang
> > ---
> >  arch/x86/include/asm/cpufeature.h |    1 +
> >  arch/x86/kernel/cpu/intel.c       |   12
> >  2 files changed, 13 insertions(+)
> >
> > diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> > index 2d9075e..f7e1eac 100644
> > --- a/arch/x86/include/asm/cpufeature.h
> > +++ b/arch/x86/include/asm/cpufeature.h
> > @@ -100,6 +100,7 @@
> >  #define X86_FEATURE_AMD_DCM (3*32+27) /* multi-node processor */
> >  #define X86_FEATURE_APERFMPERF (3*32+28) /* APERFMPERF */
> >  #define X86_FEATURE_EAGER_FPU (3*32+29) /* "eagerfpu" Non lazy FPU restore */
> > +#define X86_FEATURE_TSC_S3_NOTSTOP (3*32+30) /* TSC doesn't stop in S3 state */
> >
> We have an existing "TSC always running in C3+" feature named
> X86_FEATURE_NONSTOP_TSC, so how about naming it in the same style,
> like X86_FEATURE_NONSTOP_TSC_S3?

Yeah, I actually used a name of the form X86_FEATURE_xxx_TSC at first, then
I did a grep and found there is no unified naming convention for the TSC
flags, so I chose this name.
--
#grep _TSC arch/x86/include/asm/cpufeature.h
#define X86_FEATURE_TSC                 (0*32+ 4) /* Time Stamp Counter */
#define X86_FEATURE_CONSTANT_TSC        (3*32+ 8) /* TSC ticks at a constant rate */
#define X86_FEATURE_TSC_RELIABLE        (3*32+23) /* TSC is known to be reliable */
#define X86_FEATURE_NONSTOP_TSC         (3*32+24) /* TSC does not stop in C states */
#define X86_FEATURE_TSC_S3_NOTSTOP      (3*32+30) /* TSC doesn't stop in S3 state */
#define X86_FEATURE_TSC_DEADLINE_TIMER  (4*32+24) /* Tsc deadline timer */
#define X86_FEATURE_TSCRATEMSR          (8*32+ 9) /* "tsc_scale" AMD TSC scaling support */
#define X86_FEATURE_TSC_ADJUST          (9*32+ 1) /* TSC adjustment MSR 0x3b */

Thanks,
Feng
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
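For readers unfamiliar with the (word*32+bit) notation in the grep output above, the following is a minimal userspace sketch of how such a flag number splits into a 32-bit word index and a bit position. It is not the kernel's implementation: the helper names (feature_word, feature_bit, caps_set, caps_has) and the NCAP_WORDS size are assumptions made up for this illustration; only the two flag values are copied from the listing.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the grep output above. */
#define X86_FEATURE_NONSTOP_TSC     (3*32+24)
#define X86_FEATURE_TSC_S3_NOTSTOP  (3*32+30)

/* Hypothetical array size for this sketch; the kernel has its own constant. */
#define NCAP_WORDS 10

/* A flag number encodes "which 32-bit word" and "which bit in that word". */
static unsigned feature_word(unsigned feature) { return feature / 32; }
static unsigned feature_bit(unsigned feature)  { return feature % 32; }

/* Set one capability bit in a caller-provided array of NCAP_WORDS words. */
static void caps_set(uint32_t *caps, unsigned feature)
{
    assert(feature_word(feature) < NCAP_WORDS);
    caps[feature_word(feature)] |= UINT32_C(1) << feature_bit(feature);
}

/* Return 1 if the capability bit is set, 0 otherwise. */
static int caps_has(const uint32_t *caps, unsigned feature)
{
    assert(feature_word(feature) < NCAP_WORDS);
    return (caps[feature_word(feature)] >> feature_bit(feature)) & 1u;
}
```

With these helpers, X86_FEATURE_TSC_S3_NOTSTOP lands in word 3, bit 30, right next to the other synthetic TSC flags in the same word, whichever name is finally chosen.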
Re: [PATCH 0/3] pwm-backlight: add subdrivers & Tegra support
On Sat, Jan 19, 2013 at 07:30:17PM +0900, Alexandre Courbot wrote:
> This series introduces a way to use pwm-backlight hooks with platforms
> that use the device tree through a subdriver system. It also adds support
> for the Tegra-based Ventana board, adding the last missing block to enable
> its panel. Support for other Tegra boards can thus be easily added.
>
> I have something else in mind to properly support this (power
> sequences), but this work relies on the GPIO subsystem redesign which will
> take some time. The pwm-backlight subdrivers can do the job in the meantime.
>
> There are a few design points that might need to be discussed:
> 1) Link order is important: subdrivers register themselves in their
> module_init function, which must be called before pwm-backlight's probe.
> This forbids linking subdrivers as separate modules from pwm-backlight.
> 2) The subdriver's data is temporarily passed through the backlight
> device's driver data. This should not hurt, but maybe there is a better way
> to do this.
> 3) Subdrivers must add themselves into pwm-backlight's own of_device_id
> table. It would be cleaner to not have to list subdrivers into
> pwm-backlight's main file, but I cannot think of a way to do otherwise.
>
> Suggestions for the 3 points listed above are very welcome - in any case,
> I hope to make this converge into something mergeable quickly.
>
> Note that these patches are the last missing block to get a functional
> panel on Tegra boards. Using 3.8rc4 and these patches, the internal panel
> on Ventana is usable out-of-the-box. Yay.

Hi Alexandre,

It's great to see you pick this up. I've been meaning to do this myself
but I just can't find the time right now.

Generally I think the approach you've chosen looks good, but I don't
think doing it in pwm-backlight is the right way. Eventually this should
all be covered by the CDF, but since that's not ready yet we want
something ad-hoc to get the hardware supported.

As such I would like to see this go into some sort of minimalistic,
Tegra-specific display/panel framework. I'd prefer to keep the
pwm-backlight driver as simple and generic as possible, that is, a driver
for a PWM-controlled backlight.

Another advantage of moving this into a sort of display framework is
that it may help in defining the requirements for a CDF and that moving
the code to the CDF should be easier once it is done.

Last but not least, abstracting away the panel allows other things such
as physical dimensions and display modes to be properly encapsulated. I
think that power-on/off timing requirements for panels also belong to
this set since they are usually specific to a given panel.

Maybe adding these drivers to tegra-drm for now would be a good option.
That way the corresponding glue can be added without a need for
inter-tree dependencies.

Thierry
[v2][PATCH 5/6] powerpc/book3e: support kgdb for kernel space
To support KGDB in kernel space, we currently need to skip the check that
sends debug exceptions taken in kernel mode to kernel_dbg_exc.

Signed-off-by: Tiejun Chen
---
 arch/powerpc/kernel/exceptions-64e.S |    5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 423a936..6204681 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -589,11 +589,14 @@ kernel_dbg_exc:
         rfdi

         /* Normal debug exception */
+1:
+#ifndef CONFIG_KGDB
         /* XXX We only handle coming from userspace for now since we can't
          * quite save properly an interrupted kernel state yet
          */
-1:      andi.   r14,r11,MSR_PR;         /* check for userspace again */
+        andi.   r14,r11,MSR_PR;         /* check for userspace again */
         beq     kernel_dbg_exc;         /* if from kernel mode */
+#endif

         /* Now we mash up things to make it look like we are coming on a
          * normal exception
--
1.7.9.5
[v2][PATCH 0/6] powerpc/book3e: make kgdb to work well
This patchset is used to support kgdb/gdb on book3e.

v2:
* Make sure we cover CONFIG_PPC_BOOK3E_64 safely
* Use LOAD_REG_IMMEDIATE() to properly load the value of the constant
  expression when loading the debug exception stack
* Copy thread info from the kernel stack when coming from user mode
* Rebase on the latest powerpc git tree

v1:
* Copy thread info only when we come from !user mode, since we get the
  kernel stack directly when coming from user mode.
* Remove save/restore of EX_R14/EX_R15 since DBG_EXCEPTION_PROLOG already
  covers this.
* Use CURRENT_THREAD_INFO() conveniently to get the thread.
* Fix some typos.
* Add a patch to make sure gdb can generate a single step properly to
  invoke a kgdb state.
* Add a patch so that, if we need to replay an interrupt, we don't restore
  the previous backup thread info, to make sure we can replay the
  interrupt later with a proper thread info.
* Rebase on the latest powerpc git tree

v0:
This patchset is used to support kgdb for book3e.

Tiejun Chen (6):
  powerpc/book3e: load critical/machine/debug exception stack
  powerpc/book3e: store critical/machine/debug exception thread info
  book3e/kgdb: update thread's dbcr0
  book3e/kgdb: Fix a single stgep case of lazy IRQ
  powerpc/book3e: support kgdb for kernel space
  kgdb/kgdbts: support ppc64

 arch/powerpc/kernel/exceptions-64e.S | 60 +++---
 arch/powerpc/kernel/irq.c            | 10 ++
 arch/powerpc/kernel/kgdb.c           | 16 ++---
 drivers/misc/kgdbts.c                |  2 ++
 4 files changed, 80 insertions(+), 8 deletions(-)

Tiejun
[v2][PATCH 4/6] book3e/kgdb: Fix a single stgep case of lazy IRQ
When we're in kgdb_singlestep(), we have to work around getting the
thread_info by copying it from the kernel stack before calling
kgdb_handle_exception(), then copying it back afterwards.

But for PPC64 we have a lazy interrupt implementation. So after copying
the thread info from the kernel stack, if we need to replay an interrupt,
we shouldn't restore the previous backup thread info, to make sure we can
replay the interrupt later with a proper thread info.

This patch uses __check_irq_replay() to guarantee this.

Signed-off-by: Tiejun Chen
---
 arch/powerpc/kernel/irq.c  | 10 ++
 arch/powerpc/kernel/kgdb.c |  3 ++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 4f97fe3..bb8d27a 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -339,7 +339,17 @@ bool prep_irq_for_idle(void)
         return true;
 }

+notrace unsigned int check_irq_replay(void)
+{
+        return __check_irq_replay();
+}
+#else
+notrace unsigned int check_irq_replay(void)
+{
+        return 0;
+}
 #endif /* CONFIG_PPC64 */
+EXPORT_SYMBOL(check_irq_replay);

 int arch_show_interrupts(struct seq_file *p, int prec)
 {
diff --git a/arch/powerpc/kernel/kgdb.c b/arch/powerpc/kernel/kgdb.c
index eb30a40..2f22807 100644
--- a/arch/powerpc/kernel/kgdb.c
+++ b/arch/powerpc/kernel/kgdb.c
@@ -151,6 +151,7 @@ static int kgdb_handle_breakpoint(struct pt_regs *regs)
         return 1;
 }

+extern notrace unsigned int check_irq_replay(void);
 static int kgdb_singlestep(struct pt_regs *regs)
 {
         struct thread_info *thread_info, *exception_thread_info;
@@ -181,7 +182,7 @@ static int kgdb_singlestep(struct pt_regs *regs)

         kgdb_handle_exception(0, SIGTRAP, 0, regs);

-        if (thread_info != exception_thread_info)
+        if ((thread_info != exception_thread_info) && (!check_irq_replay()))
                 /* Restore current_thread_info lastly. */
                 memcpy(exception_thread_info, backup_current_thread_info, sizeof *thread_info);

--
1.7.9.5
Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()
On 01/21/2013 03:09 PM, Mike Galbraith wrote:
> On Mon, 2013-01-21 at 07:42 +0100, Mike Galbraith wrote:
>> On Mon, 2013-01-21 at 13:07 +0800, Michael Wang wrote:
>
>>> May be we could try change this back to the old way later, after the aim
>>> 7 test on my server.
>>
>> Yeah, something funny is going on.
>
> Never entering balance path kills the collapse. Asking wake_affine()
> wrt the pull as before, but allowing us to continue should no idle cpu
> be found, still collapsed. So the source of funny behavior is indeed in
> balance_path.

The patch below, based on the patch set, should avoid entering the balance
path if an affine_sd is found, just like the old logic. Would you like to
try it and see whether it helps fix the collapse?

Regards,
Michael Wang

---
 kernel/sched/fair.c | 14 --
 1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d600708..4e95bb0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3297,6 +3297,8 @@ next:
                         sg = sg->next;
                 } while (sg != sd->groups);
         }
+
+        return -1;
 done:
         return target;
 }
@@ -3349,7 +3351,7 @@ select_task_rq_fair(struct task_struct *p, int sd_flag, int wake_flags)
                  * some cases.
                  */
                 new_cpu = select_idle_sibling(p, prev_cpu);
-                if (idle_cpu(new_cpu))
+                if (new_cpu != -1)
                         goto unlock;

                 /*
@@ -3363,15 +3365,15 @@ select_task_rq_fair(struct task_struct *p, int sd_flag, int wake_flags)
                         goto balance_path;

                 new_cpu = select_idle_sibling(p, cpu);
-                if (!idle_cpu(new_cpu))
-                        goto balance_path;
-
                 /*
                  * Invoke wake_affine() finally since it is no doubt a
                  * performance killer.
                  */
-                if (wake_affine(sbm->affine_map[prev_cpu], p, sync))
-                        goto unlock;
+                if (new_cpu == -1 ||
+                    !wake_affine(sbm->affine_map[prev_cpu], p, sync))
+                        new_cpu = prev_cpu;
+
+                goto unlock;
         }

 balance_path:
--
1.7.4.1

>
> -Mike
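As a reading aid for the select_idle_sibling() change in the patch above: the idle search now returns -1 when nothing idle is found, so the caller can test the return value instead of calling idle_cpu() again. The following is a simplified userspace sketch of that sentinel pattern only. The names pick_idle_cpu and select_cpu and the busy-flag array are assumptions invented for this illustration, and the wake_affine() decision from the real patch is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the idle search: scan 'ncpus' busy flags
 * and return the first idle CPU, or -1 when every CPU is busy.
 */
static int pick_idle_cpu(const int *busy, size_t ncpus)
{
    assert(busy != NULL);
    for (size_t cpu = 0; cpu < ncpus; cpu++) {
        if (!busy[cpu])
            return (int)cpu;
    }
    return -1; /* sentinel: nothing idle, the caller decides the fallback */
}

/* Caller-side fallback, loosely mirroring "new_cpu = prev_cpu" in the patch. */
static int select_cpu(const int *busy, size_t ncpus, int prev_cpu)
{
    int new_cpu = pick_idle_cpu(busy, ncpus);

    return new_cpu != -1 ? new_cpu : prev_cpu;
}
```

The design point is that "no idle CPU" becomes an explicit result rather than something inferred from re-checking the returned CPU, which is what lets the wakeup path fall back to prev_cpu without going through balance_path.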
[v2][PATCH 3/6] book3e/kgdb: update thread's dbcr0
gdb always needs to generate a single step properly to invoke a kgdb
state. But with lazy interrupts, book3e can't always trigger a debug
exception with a single step, since the current task is blocked handling
the pending exceptions, and we then miss the expected dbcr configuration
needed to generate a debug exception.

So here we also update the thread's dbcr0 to make sure the current task
can go back with that missed dbcr0 configuration.

Signed-off-by: Tiejun Chen
---
 arch/powerpc/kernel/kgdb.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/kgdb.c b/arch/powerpc/kernel/kgdb.c
index 8747447..eb30a40 100644
--- a/arch/powerpc/kernel/kgdb.c
+++ b/arch/powerpc/kernel/kgdb.c
@@ -409,7 +409,7 @@ int kgdb_arch_handle_exception(int vector, int signo, int err_code,
                                struct pt_regs *linux_regs)
 {
         char *ptr = &remcom_in_buffer[1];
-        unsigned long addr;
+        unsigned long addr, dbcr0;

         switch (remcom_in_buffer[0]) {
                 /*
@@ -426,8 +426,15 @@ int kgdb_arch_handle_exception(int vector, int signo, int err_code,
                 /* set the trace bit if we're stepping */
                 if (remcom_in_buffer[0] == 's') {
 #ifdef CONFIG_PPC_ADV_DEBUG_REGS
-                        mtspr(SPRN_DBCR0,
-                              mfspr(SPRN_DBCR0) | DBCR0_IC | DBCR0_IDM);
+                        dbcr0 = mfspr(SPRN_DBCR0) | DBCR0_IC | DBCR0_IDM;
+                        mtspr(SPRN_DBCR0, dbcr0);
+#ifdef CONFIG_PPC_BOOK3E_64
+                        /* With lazy interrut we have to update thread dbcr0 here
+                         * to make sure we can set debug properly at last to invoke
+                         * kgdb again to work well.
+                         */
+                        current->thread.dbcr0 = dbcr0;
+#endif
                         linux_regs->msr |= MSR_DE;
 #else
                         linux_regs->msr |= MSR_SE;
--
1.7.9.5
[v2][PATCH 6/6] kgdb/kgdbts: support ppc64
We can't look up the address of the entry point of a function simply via
the function symbol on all architectures.

In the PPC64 ABI there is a function descriptor structure. A function
descriptor is a three-doubleword data structure that contains the
following values:
* The first doubleword contains the address of the entry point of the
  function.
* The second doubleword contains the TOC base address for the function.
* The third doubleword contains the environment pointer for languages
  such as Pascal and PL/1.

So we should call the dereference_function_descriptor() wrapper to get
the address of the entry point of the function.

Note this is also safe for other architectures; according to
"include/asm-generic/sections.h", dereference_function_descriptor(p) is
always (p) when there is no arch-specific definition.

Signed-off-by: Tiejun Chen
---
 drivers/misc/kgdbts.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/misc/kgdbts.c b/drivers/misc/kgdbts.c
index 3aa9a96..4799e1f 100644
--- a/drivers/misc/kgdbts.c
+++ b/drivers/misc/kgdbts.c
@@ -103,6 +103,7 @@
 #include
 #include
 #include
+#include

 #define v1printk(a...) do { \
         if (verbose) \
@@ -222,6 +223,7 @@ static unsigned long lookup_addr(char *arg)
                 addr = (unsigned long)do_fork;
         else if (!strcmp(arg, "hw_break_val"))
                 addr = (unsigned long)&hw_break_val;
+        addr = (unsigned long )dereference_function_descriptor((void *)addr);
         return addr;
 }

--
1.7.9.5
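To make the descriptor layout described in the commit message above concrete, here is a small portable sketch. It only simulates the idea: struct func_desc and deref_func_desc are hypothetical names for this illustration, not the kernel's types or its dereference_function_descriptor() implementation, and a real PPC64 descriptor is produced by the toolchain rather than filled in by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout mirroring the three doublewords described above. */
struct func_desc {
    uint64_t entry; /* address of the function's entry point */
    uint64_t toc;   /* TOC base address for the function */
    uint64_t env;   /* environment pointer (e.g. Pascal, PL/1) */
};

/*
 * On an ABI with descriptors, a function symbol points at the descriptor,
 * not at the code, so the entry point is read from its first doubleword.
 */
static uint64_t deref_func_desc(const void *sym)
{
    const struct func_desc *desc = sym;

    assert(desc != NULL);
    return desc->entry;
}
```

This is why taking the symbol address directly is not enough for a breakpoint address on such an ABI: the symbol's own address is where the descriptor lives, while the first doubleword inside it is where execution starts.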
[v2][PATCH 2/6] powerpc/book3e: store critical/machine/debug exception thread info
We need to copy the thread info into these exception-level thread info
structures, as we already do for PPC32.

Signed-off-by: Tiejun Chen
---
 arch/powerpc/kernel/exceptions-64e.S | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 767f856..423a936 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -58,6 +58,18 @@
         std     r10,PACA_##level##_STACK(r13);
 #endif

+/* Store something to exception thread info */
+#define BOOK3E_STORE_EXC_LEVEL_THEAD_INFO(type) \
+        ld      r1,PACAKSAVE(r13); \
+        CURRENT_THREAD_INFO(r14, r14); \
+        CURRENT_THREAD_INFO(r15, r1); \
+        ld      r10,TI_FLAGS(r14); \
+        std     r10,TI_FLAGS(r15); \
+        ld      r10,TI_PREEMPT(r14); \
+        std     r10,TI_PREEMPT(r15); \
+        ld      r10,TI_TASK(r14); \
+        std     r10,TI_TASK(r15);
+
 /* Exception prolog code for all exceptions */
 #define EXCEPTION_PROLOG(n, intnum, type, addition) \
         mtspr   SPRN_SPRG_##type##_SCRATCH,r13; /* get spare registers */ \
@@ -95,6 +107,7 @@
         BOOK3E_LOAD_EXC_LEVEL_STACK(CRIT); \
         ld      r1,PACA_CRIT_STACK(r13); \
         subi    r1,r1,SPECIAL_EXC_FRAME_SIZE; \
+        BOOK3E_STORE_EXC_LEVEL_THEAD_INFO(CRIT); \
 1:
 #define SPRN_CRIT_SRR0 SPRN_CSRR0
 #define SPRN_CRIT_SRR1 SPRN_CSRR1
@@ -105,6 +118,7 @@
         BOOK3E_LOAD_EXC_LEVEL_STACK(DBG); \
         ld      r1,PACA_DBG_STACK(r13); \
         subi    r1,r1,SPECIAL_EXC_FRAME_SIZE; \
+        BOOK3E_STORE_EXC_LEVEL_THEAD_INFO(DBG); \
 1:
 #define SPRN_DBG_SRR0 SPRN_DSRR0
 #define SPRN_DBG_SRR1 SPRN_DSRR1
@@ -115,6 +129,7 @@
         BOOK3E_LOAD_EXC_LEVEL_STACK(MC); \
         ld      r1,PACA_MC_STACK(r13); \
         subi    r1,r1,SPECIAL_EXC_FRAME_SIZE; \
+        BOOK3E_STORE_EXC_LEVEL_THEAD_INFO(MC); \
 1:
 #define SPRN_MC_SRR0 SPRN_MCSRR0
 #define SPRN_MC_SRR1 SPRN_MCSRR1
--
1.7.9.5
[v2][PATCH 1/6] powerpc/book3e: load critical/machine/debug exception stack
We always allocate stacks for the critical, machine check and debug
exceptions, which differs from the normal exceptions. So we should load
these exception stacks properly, like we did for booke.

Signed-off-by: Tiejun Chen
---
 arch/powerpc/kernel/exceptions-64e.S | 40 +++---
 1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index ae54553..767f856 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -36,6 +36,28 @@
  */
 #define SPECIAL_EXC_FRAME_SIZE INT_FRAME_SIZE

+/* only on book3e */
+#define DBG_STACK_BASE dbgirq_ctx
+#define MC_STACK_BASE mcheckirq_ctx
+#define CRIT_STACK_BASE critirq_ctx
+
+#ifdef CONFIG_SMP
+#define BOOK3E_LOAD_EXC_LEVEL_STACK(level) \
+        mfspr   r14,SPRN_PIR; \
+        slwi    r14,r14,3; \
+        LOAD_REG_IMMEDIATE(r10, level##_STACK_BASE); \
+        add     r10,r10,r14; \
+        ld      r10,0(r10); \
+        addi    r10,r10,THREAD_SIZE; \
+        std     r10,PACA_##level##_STACK(r13);
+#else
+#define BOOK3E_LOAD_EXC_LEVEL_STACK(level) \
+        LOAD_REG_IMMEDIATE(r10, level##_STACK_BASE); \
+        ld      r10,0(r10); \
+        addi    r10,r10,THREAD_SIZE; \
+        std     r10,PACA_##level##_STACK(r13);
+#endif
+
 /* Exception prolog code for all exceptions */
 #define EXCEPTION_PROLOG(n, intnum, type, addition) \
         mtspr   SPRN_SPRG_##type##_SCRATCH,r13; /* get spare registers */ \
@@ -68,20 +90,32 @@
 #define SPRN_GDBELL_SRR1 SPRN_GSRR1

 #define CRIT_SET_KSTACK \
+        andi.   r10,r11,MSR_PR; \
+        bne     1f; \
+        BOOK3E_LOAD_EXC_LEVEL_STACK(CRIT); \
         ld      r1,PACA_CRIT_STACK(r13); \
-        subi    r1,r1,SPECIAL_EXC_FRAME_SIZE;
+        subi    r1,r1,SPECIAL_EXC_FRAME_SIZE; \
+1:
 #define SPRN_CRIT_SRR0 SPRN_CSRR0
 #define SPRN_CRIT_SRR1 SPRN_CSRR1

 #define DBG_SET_KSTACK \
+        andi.   r10,r11,MSR_PR; \
+        bne     1f; \
+        BOOK3E_LOAD_EXC_LEVEL_STACK(DBG); \
         ld      r1,PACA_DBG_STACK(r13); \
-        subi    r1,r1,SPECIAL_EXC_FRAME_SIZE;
+        subi    r1,r1,SPECIAL_EXC_FRAME_SIZE; \
+1:
 #define SPRN_DBG_SRR0 SPRN_DSRR0
 #define SPRN_DBG_SRR1 SPRN_DSRR1

 #define MC_SET_KSTACK \
+        andi.   r10,r11,MSR_PR; \
+        bne     1f; \
+        BOOK3E_LOAD_EXC_LEVEL_STACK(MC); \
         ld      r1,PACA_MC_STACK(r13); \
-        subi    r1,r1,SPECIAL_EXC_FRAME_SIZE;
+        subi    r1,r1,SPECIAL_EXC_FRAME_SIZE; \
+1:
 #define SPRN_MC_SRR0 SPRN_MCSRR0
 #define SPRN_MC_SRR1 SPRN_MCSRR1
--
1.7.9.5
Re: [PATCH] cifs: fix srcip_matches() for ipv6
merged into cifs-2.6.git

On Wed, Jan 16, 2013 at 10:04 PM, Nickolai Zeldovich wrote:
> On Wed, Jan 16, 2013 at 10:51 PM, Steve French wrote:
>> How did you discover this - did you have an ipv6 test case or by
>> inspection or ...?
>
> By mostly-automated inspection (i.e., with the help of a static
> program analysis tool).
>
> Nickolai.

--
Thanks,

Steve
Re: [RFC PATCH 1/5] x86: Add cpu capability flag X86_FEATURE_TSC_S3_NOTSTOP
On Mon, Jan 21, 2013 at 02:38:41PM +0800, Feng Tang wrote:
> Date: Mon, 21 Jan 2013 14:38:41 +0800
> From: Feng Tang
> To: Thomas Gleixner, John Stultz, Ingo Molnar, "H. Peter Anvin",
>  x...@kernel.org, Len Brown, "Rafael J. Wysocki",
>  linux-kernel@vger.kernel.org
> Cc: Feng Tang
> Subject: [RFC PATCH 1/5] x86: Add cpu capability flag
>  X86_FEATURE_TSC_S3_NOTSTOP
> X-Mailer: git-send-email 1.7.9.5
>
> On some new Intel Atom processors (Penwell and Cloverview), there is
> a feature that the TSC won't stop in S3, that is, the TSC value won't
> be reset to 0 after resume. This feature makes TSC a more reliable
> clocksource and could benefit the timekeeping code during system
> suspend/resume cycle, so add a flag for it.
>
> Signed-off-by: Feng Tang
> ---
>  arch/x86/include/asm/cpufeature.h |    1 +
>  arch/x86/kernel/cpu/intel.c       |   12
>  2 files changed, 13 insertions(+)
>
> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> index 2d9075e..f7e1eac 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -100,6 +100,7 @@
>  #define X86_FEATURE_AMD_DCM (3*32+27) /* multi-node processor */
>  #define X86_FEATURE_APERFMPERF (3*32+28) /* APERFMPERF */
>  #define X86_FEATURE_EAGER_FPU (3*32+29) /* "eagerfpu" Non lazy FPU restore */
> +#define X86_FEATURE_TSC_S3_NOTSTOP (3*32+30) /* TSC doesn't stop in S3 state */
>

We have an existing "TSC always running in C3+" feature named
X86_FEATURE_NONSTOP_TSC, so how about naming it in the same style,
like X86_FEATURE_NONSTOP_TSC_S3?
Re: [PATCH 2/3] tegra: pwm-backlight: add tegra pwm-bl driver
On 01/19/2013 06:30 PM, Alexandre Courbot wrote:
> Add a PWM-backlight subdriver for Tegra boards, with support for
> Ventana.
>
> Signed-off-by: Alexandre Courbot
> ---
[...]
>
> +        backlight {
> +                compatible = "pwm-backlight-ventana";
> +                brightness-levels = <0 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 255>;
> +                default-brightness-level = <12>;
> +
> +                pwms = < 2 500>;

After reading the code of the tegra pwm driver and the pwm framework, I
understood the meaning of this property. So I think we need to add a doc
(e.g. Documentation/devicetree/bindings/video/backlight/nvidia,tegra20-bl.txt)
to explain it. "Documentation/devicetree/bindings/pwm/pwm.txt" doesn't
explain it, because the meaning may differ between pwm drivers.

> +                pwm-names = "backlight";
> +
> +                power-supply = <_bl_reg>;
> +                panel-supply = <_pnl_reg>;
> +                bl-gpio = < 28 0>;
> +                bl-panel = < 10 0>;
> +        };
> +
[...]
> diff --git a/drivers/video/backlight/pwm_bl_tegra.c b/drivers/video/backlight/pwm_bl_tegra.c
> new file mode 100644
> index 000..8f2195b
> --- /dev/null
> +++ b/drivers/video/backlight/pwm_bl_tegra.c

So according to the filename, I think we can put the code for all tegra
boards here, right? Just like what you do for Ventana, if I want to add
support for cardhu, I can define similar functions, let's say
"init_cardhu", "exit_cardhu", "notify_cardhu" and "notify_after_cardhu",
right?

But I think if we do it this way, the file will become very long soon,
and there will be a lot of redundant code in it. So do you have any
suggestions?

Mark

> @@ -0,0 +1,159 @@
> +/*
> + * pwm-backlight subdriver for Tegra.
> + *
> + * Copyright (c) 2013 NVIDIA CORPORATION. All rights reserved.
> + *
> + * This software is licensed under the terms of the GNU General Public
> + * License version 2, as published by the Free Software Foundation, and
> + * may be copied, distributed, and modified under those terms.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + */
[...]
> +MODULE_DESCRIPTION("Backlight Driver for Tegra boards");
> +MODULE_LICENSE("GPL");
> +MODULE_ALIAS("platform:pwm-tegra-backlight");
> +
> +
>
Re: [RFC PATCH 4/5] clocksource: Enlarge the maxim time interval when configuring the scale and shift
On Mon, Jan 21, 2013 at 02:38:44PM +0800, Feng Tang wrote:
> Date: Mon, 21 Jan 2013 14:38:44 +0800
> From: Feng Tang
> To: Thomas Gleixner, John Stultz, Ingo Molnar, "H. Peter Anvin",
>  x...@kernel.org, Len Brown, "Rafael J. Wysocki",
>  linux-kernel@vger.kernel.org
> Cc: Feng Tang
> Subject: [RFC PATCH 4/5] clocksource: Enlarge the maxim time interval when
>  configuring the scale and shift
> X-Mailer: git-send-email 1.7.9.5
>
> On our x86 platform, we see a failure case of calling clocksource_cyc2ns(),
> which returns a negative value. The reason is the time interval was large
> (more than 1000 seconds), while its TSC frequency is 2GHz, so the following
> formula overflowed:
>         ((u64) cycles * mult) >> shift
>
> So enlarge the time interval from 10 mins to 40 mins to fix the bug.
>
> Another solution may be adding a "max_interval" in struct clocksource, and
> use a default value (like current 10 minutes) when clocksource driver
> doesn't set it.
>

As you said, it looks a little bit arbitrary to go from 10m -> 40m. I
think max_interval is a better choice, if the timer guys don't mind too
many control knobs :-).

> Signed-off-by: Feng Tang
> ---
>  kernel/time/clocksource.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index c958338..48fbfcb 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -663,7 +663,7 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
>           * Calc the maximum number of seconds which we can run before
>           * wrapping around. For clocksources which have a mask > 32bit
>           * we need to limit the max sleep time to have a good
> -         * conversion precision. 10 minutes is still a reasonable
> +         * conversion precision. 40 minutes is still a reasonable
>           * amount. That results in a shift value of 24 for a
>           * clocksource with mask >= 40bit and f >= 4GHz. That maps to
>           * ~ 0.06ppm granularity for NTP. We apply the same 12.5%
> @@ -674,8 +674,8 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
>          do_div(sec, scale);
>          if (!sec)
>                  sec = 1;
> -        else if (sec > 600 && cs->mask > UINT_MAX)
> -                sec = 600;
> +        else if (sec > 2400 && cs->mask > UINT_MAX)
> +                sec = 2400;
>
>          clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
>                                 NSEC_PER_SEC / scale, sec * scale);
> --
> 1.7.9.5
>
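The overflow discussed in this thread can be sketched in userspace as follows. This is only an illustration of the quoted formula, ((u64) cycles * mult) >> shift, under assumed numbers: the helper names cyc2ns and max_safe_cycles are made up here, and the mult/shift pairs used in the note below (1 << 22 with shift 23, and 1 << 20 with shift 21, both meaning 0.5 ns per cycle for a 2GHz counter) are illustrative choices, not the values the kernel actually computes for any given TSC.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as the formula quoted in the patch description. */
static uint64_t cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
    /* The 64-bit product silently wraps once cycles * mult >= 2^64. */
    return (cycles * mult) >> shift;
}

/* Largest cycle count whose product with mult still fits in 64 bits. */
static uint64_t max_safe_cycles(uint32_t mult)
{
    assert(mult != 0);
    return UINT64_MAX / mult;
}
```

Under these assumptions, a 600 second interval at 2GHz converts exactly with mult = 1 << 22 and shift = 23, but max_safe_cycles(1 << 22) corresponds to only about 2199 seconds, so a 2400 second interval wraps and yields a wrong result. Dropping to mult = 1 << 20 and shift = 21 extends the safe range to roughly 8796 seconds at the cost of two bits of conversion precision, which is the trade-off behind capping the interval (or making it a per-clocksource max_interval).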
Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()
On 01/21/2013 02:42 PM, Mike Galbraith wrote: > On Mon, 2013-01-21 at 13:07 +0800, Michael Wang wrote: > >> That seems like the default one, could you please show me the numbers in >> your datapoint file? > > Yup, I do not touch the workfile. Datapoints is what you see in the > tabulated result... > > 1 > 1 > 1 > 5 > 5 > 5 > 10 > 10 > 10 > ... > > so it does three consecutive runs at each load level. I quiesce the > box, set governor to performance, echo 250 32000 32 4096 >> /proc/sys/kernel/sem, then ./multitask -nl -f, and point it > at ./datapoints. I have changed the "/proc/sys/kernel/sem" to: 20002048000 256 1024 and run few rounds, seems like I can't reproduce this issue on my 12 cpu X86 server: prevpost Tasksjobs/min jobs/min 1 508.39 506.69 5 2792.63 2792.63 10 5454.55 5449.64 2010262.49 10271.19 4018089.55 18184.55 8028995.22 28960.57 16041365.19 41613.73 32053099.67 52767.35 64061308.88 61483.83 128066707.95 66484.96 256069736.58 69350.02 Almost nothing changed...I would like to find another machine and do the test again later. > >> I'm not familiar with this benchmark, but I'd like to have a try on my >> server, to make sure whether it is a generic issue. > > One thing I didn't like about your changes is that you don't ask > wake_affine() if it's ok to pull cross node or not, which I though might > induce imbalance, but twiddling that didn't fix up the collapse, pretty > much leaving only the balance path. wake_affine() will be asked before trying to use the idle sibling selected from current cpu's domain, doesn't it? It's just been delayed since it's cost is too high. But you notified me that I missed the case when prev == current, not sure whether it's the killer, but will correct it. > And I'm confusing about how those new parameter value was figured out and how could them help solve the possible issue? >>> >>> Oh, that's easy. 
I set sched_min_granularity_ns such that last_buddy >>> kicks in when a third task arrives on a runqueue, and set >>> sched_wakeup_granularity_ns near minimum that still allows wakeup >>> preemption to occur. Combined effect is reduced over-scheduling. >> >> That sounds very hard, to catch the timing, whatever, it could be an >> important clue for analysis. > > (Play with the knobs with a bunch of different loads, I think you'll > find that those settings work well) > Do you have any idea about which part in this patch set may cause the issue? >>> >>> Nope, I'm as puzzled by that as you are. When the box had 40 cores, >>> both virgin and patched showed over-scheduling effects, but not like >>> this. With 20 cores, symptoms changed in a most puzzling way, and I >>> don't see how you'd be directly responsible. >> >> Hmm... >> >>> One change by designed is that, for old logical, if it's a wake up and we found affine sd, the select func will never go into the balance path, but the new logical will, in some cases, do you think this could be a problem? >>> >>> Since it's the high load end, where looking for an idle core is most >>> likely to be a waste of time, it makes sense that entering the balance >>> path would hurt _some_, it isn't free.. except for twiddling preemption >>> knobs making the collapse just go away. We're still going to enter that >>> path if all cores are busy, no matter how I twiddle those knobs. >> >> May be we could try change this back to the old way later, after the aim >> 7 test on my server. > > Yeah, something funny is going on. I'd like select_idle_sibling() to > just go away, that task be integrated into one and only one short and > sweet balance path. I don't see why fine_idlest* needs to continue > traversal after seeing a zero. It should be just fine to say gee, we're > done. 
Yes, that's true :)

> Hohum, so much for pure test and report, twiddle twiddle tweak,
> bend spindle mutilate ;-)

The scheduler is sometimes impossible to analyse; the only way to prove
anything is painful, endless testing... and usually we still miss something
in the end...

Regards,
Michael Wang

>
> -Mike
[PATCH 1/1] scripts/package/Makefile: remove useless KBUILD_OUTPUT test
The test of KBUILD_OUTPUT in "rpm-pkg rpm" target is useless.
KBUILD_OUTPUT is always empty here.

Signed-off-by: Bin Wang
---
 scripts/package/Makefile | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/scripts/package/Makefile b/scripts/package/Makefile
index 87bf080..ba073a6 100644
--- a/scripts/package/Makefile
+++ b/scripts/package/Makefile
@@ -36,12 +36,6 @@ $(objtree)/kernel.spec: $(MKSPEC) $(srctree)/Makefile
 	$(CONFIG_SHELL) $(MKSPEC) > $@
 
 rpm-pkg rpm: $(objtree)/kernel.spec FORCE
-	@if test -n "$(KBUILD_OUTPUT)"; then \
-		echo "Building source + binary RPM is not possible outside the"; \
-		echo "kernel source tree. Don't set KBUILD_OUTPUT, or use the"; \
-		echo "binrpm-pkg target instead."; \
-		false; \
-	fi
 	$(MAKE) clean
 	$(PREV) ln -sf $(srctree) $(KERNELPATH)
 	$(CONFIG_SHELL) $(srctree)/scripts/setlocalversion --save-scmversion
-- 
1.7.11.7
Re: thoughts on requiring multi-arch support for arm drm drivers?
On Sun, Jan 20, 2013 at 04:42:55PM +0100, Daniel Vetter wrote: > On Sun, Jan 20, 2013 at 4:08 PM, Rob Clark wrote: > > One thing I've run into in the past when trying to make changes in drm > > core, and Daniel Vetter has mentioned the same, is that it is a bit of > > a pain to compile test things for the arm drivers that do not support > > CONFIG_ARCH_MULTIPLATFORM. I went through a while back and fixed up > > the low hanging fruit (basically the drivers that just needed a > > Kconfig change). But, IIRC some of the backlight related code in > > shmob had some non-trivial plat dependencies. And I think when tegra > > came in, it introduced some non-trivial plat dependencies. > > > > What do others think about requiring multiarch or no arch dependencies > > for new drivers, and cleaning up existing drivers. Even if it is at > > reduced functionality (like maybe #ifdef CONFIG_ARCH_SHMOBILE for some > > of the backlight code in shmob) or doesn't even work but is just for > > the purpose of being able to compile test the rest of the code? > > > > Thoughts? > > Definitely in favour of this. Also, I think the arm world _really_ > needs something like Wu Fenggungs 0-day kernel testing/building > machines, which checks every commit pushed to around a 150 git kernel > maintainer repos with randconfigs, sparse (and iirc other static > checkers like cocinelle), and test-boots them on kvm. It's not just > that every driver seems to need it's own special defconfig/platform to > even be selectable in Kconfig, they also seem to randomly (and often) > break compilation if you're on the wrong tree or don't have the > exactly required golden config ... That's true. Unfortunately due to the many repositories involved there seem to be quite a few dependencies involved to get all the pieces to build properly. linux-next is usually in pretty good shape, however. 
I've been running an automated build over at least all ARM defconfigs in linux-next for a few days and sent out patches for build failures. But I'm not sure if I can keep that up, or at least not on a daily basis. Obviously it doesn't help the DRM problem all that much. But I agree with Rob that the only thing that will really help is multi-platform support.

Thierry
[PATCH v2 1/1] page_alloc: Bootmem limit with movablecore_map
This patch make sure bootmem will not allocate memory from areas that may be ZONE_MOVABLE. The map info is from movablecore_map boot option. Signed-off-by: Tang Chen Reviewed-by: Wen Congyang Reviewed-by: Lai Jiangshan Tested-by: Lin Feng --- include/linux/memblock.h |2 + mm/memblock.c| 50 ++ 2 files changed, 52 insertions(+), 0 deletions(-) diff --git a/include/linux/memblock.h b/include/linux/memblock.h index d452ee1..ac52bbc 100644 --- a/include/linux/memblock.h +++ b/include/linux/memblock.h @@ -60,6 +60,8 @@ int memblock_reserve(phys_addr_t base, phys_addr_t size); void memblock_trim_memory(phys_addr_t align); #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP +extern struct movablecore_map movablecore_map; + void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn, unsigned long *out_end_pfn, int *out_nid); diff --git a/mm/memblock.c b/mm/memblock.c index 88adc8a..0218231 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -92,9 +92,58 @@ static long __init_memblock memblock_overlaps_region(struct memblock_type *type, * * Find @size free area aligned to @align in the specified range and node. * + * If we have CONFIG_HAVE_MEMBLOCK_NODE_MAP defined, we need to check if the + * memory we found if not in hotpluggable ranges. + * * RETURNS: * Found address on success, %0 on failure. 
*/ +#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP +phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start, + phys_addr_t end, phys_addr_t size, + phys_addr_t align, int nid) +{ + phys_addr_t this_start, this_end, cand; + u64 i; + int curr = movablecore_map.nr_map - 1; + + /* pump up @end */ + if (end == MEMBLOCK_ALLOC_ACCESSIBLE) + end = memblock.current_limit; + + /* avoid allocating the first page */ + start = max_t(phys_addr_t, start, PAGE_SIZE); + end = max(start, end); + + for_each_free_mem_range_reverse(i, nid, _start, _end, NULL) { + this_start = clamp(this_start, start, end); + this_end = clamp(this_end, start, end); + +restart: + if (this_end <= this_start || this_end < size) + continue; + + for (; curr >= 0; curr--) { + if ((movablecore_map.map[curr].start_pfn << PAGE_SHIFT) + < this_end) + break; + } + + cand = round_down(this_end - size, align); + if (curr >= 0 && + cand < movablecore_map.map[curr].end_pfn << PAGE_SHIFT) { + this_end = movablecore_map.map[curr].start_pfn + << PAGE_SHIFT; + goto restart; + } + + if (cand >= this_start) + return cand; + } + + return 0; +} +#else /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start, phys_addr_t end, phys_addr_t size, phys_addr_t align, int nid) @@ -123,6 +172,7 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start, } return 0; } +#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */ /** * memblock_find_in_range - find free area in given range -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] Creating an eeprom class
On Sun, Jan 20, 2013 at 11:39 PM, Greg KH wrote: > On Sun, Jan 20, 2013 at 07:08:28PM +0100, Thomas De Schampheleire wrote: >> [plaintext and fixed address of David Brownell] > > David passed away a year or so ago, so that's really not going to help :( So sorry to hear that, I was not aware... > >> Hi, >> >> Several of the eeprom drivers that live in drivers/misc/eeprom export >> a binary sysfs file 'eeprom'. If a userspace program or script wants >> to access this file, it needs to know the full path, for example: >> >> /sys/bus/spi/devices/spi32766.0/eeprom >> >> The problem with this approach is that it requires knowledge about the >> hardware configuration: is the eeprom on the SPI bus, the I2C bus, or >> maybe memory mapped? >> >> It would therefore be more interesting to have a bus-agnostic way to >> access this eeprom file, for example: >> /sys/class/eeprom/eeprom0/eeprom >> >> Maybe it'd be even better to use a more generic class name than >> 'eeprom', since there are several types of eeprom-like devices that >> you could export this way. > > Does all of the existing "eeprom" devices use the same userspace > interface? If so, yes, having a "class" would make sense. All but one do. That one (eeprom_93cx6.c) exports its read/write functions to other kernel code, and is used in several wireless/ethernet drivers. > >> Or should we rather hook the eeprom code into the mtd subsystem? > > Why mtd? Because an eeprom is a piece of memory. Maybe mtd is overkill in term of the operations supported, but from a high-level perspective an eeprom is a memory technology device, right? Thanks, Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tty: Only wakeup the line discipline idle queue when queue is active
Hi,

2013/1/18 Oleg Nesterov :
>
> I can't understand why do you dislike Ivo's simple patch. There are
> a lot of "if (waitqueue_active) wake_up" examples. Even if we add the
> new helpers (personally I don't think this makes sense) , we can do
> this later. Why should we delay this fix?
>

FYI: Greg has added my patch to his tty-next branch, so my fix has been
approved. Thank you all for reviewing.

Regards,
Ivo Sieben
Re: linux-next: build failure after merge of the final tree (akpm tree related)
Hi Stephen,

On 01/21/2013 02:08 PM, Stephen Rothwell wrote:
> Hi all,
>
> After merging the final tree, today's linux-next build (arm defconfig)
> failed like this:
>
> mm/memblock.c: In function 'memblock_find_in_range_node':
> mm/memblock.c:104:2: error: invalid use of undefined type 'struct movablecore_map'
> mm/memblock.c:123:4: error: invalid use of undefined type 'struct movablecore_map'
> mm/memblock.c:130:7: error: invalid use of undefined type 'struct movablecore_map'
> mm/memblock.c:131:4: error: invalid use of undefined type 'struct movablecore_map'
>
> Caused by commit "page_alloc: bootmem limit with movablecore_map" from
> the akpm tree.  The definition of struct movablecore_map is protected by
> CONFIG_HAVE_MEMBLOCK_NODE_MAP but its use is not.
>
> I have reverted that commit for today.

Thank you very much for reporting this. It was my mistake to miss this
definition.

I will post a new version of "page_alloc: bootmem limit with
movablecore_map" since you have reverted it.

CONFIG_HAVE_MEMBLOCK_NODE_MAP is selected by x86=y, but I don't have any
non-x86 box. So I didn't test it. Please tell me if you have any problem
with it on other platforms.

Thanks. :)
Re: [RFC PATCH 1/3] staging, zsmalloc: introduce zs_mem_[read/write]
Hello, Minchan. On Thu, Jan 17, 2013 at 08:59:22AM +0900, Minchan Kim wrote: > Hi Joonsoo, > > On Wed, Jan 16, 2013 at 05:08:55PM +0900, Joonsoo Kim wrote: > > If object is on boundary of page, zs_map_object() copy content of object > > to pre-allocated page and return virtual address of > > IMHO, for reviewer reading easily, it would be better to specify explict > word instead of abstract. > > pre-allocated pages : vm_buf which is reserved pages for zsmalloc > > > this pre-allocated page. If user inform zsmalloc of memcpy region, > > we can avoid this copy overhead. > > That's a good idea! > > > This patch implement two API and these get information of memcpy region. > > Using this information, we can do memcpy without overhead. > > For the clarification, > > we can reduce copy overhead with this patch > in !USE_PGTABLE_MAPPING case. > > > > > For USE_PGTABLE_MAPPING case, we can avoid flush cache and tlb overhead > > via these API. > > Yeb! > > > > > Signed-off-by: Joonsoo Kim > > --- > > These are [RFC] patches, because I don't test and > > I don't have test environment, yet. Just compile test done. > > If there is positive comment, I will setup test env and check correctness. > > These are based on v3.8-rc3. > > If rebase is needed, please notify me what tree I should rebase. > > Whenever you send zsmalloc/zram/zcache, you have to based on recent > linux-next. > But I hope we send the patches to akpm by promoting soon. :( > > > > > diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c > > b/drivers/staging/zsmalloc/zsmalloc-main.c > > index 09a9d35..e3ef5a5 100644 > > --- a/drivers/staging/zsmalloc/zsmalloc-main.c > > +++ b/drivers/staging/zsmalloc/zsmalloc-main.c > > @@ -1045,6 +1045,118 @@ void zs_unmap_object(struct zs_pool *pool, unsigned > > long handle) > > } > > EXPORT_SYMBOL_GPL(zs_unmap_object); > > > > It's exported function. Please write description. 
> > > +void zs_mem_read(struct zs_pool *pool, unsigned long handle, > > + void *dest, unsigned long src_off, size_t n) > > n is meaningless, please use meaningful word. > How about this? > void *buf, unsigned long offset, size_t count > > > +{ > > + struct page *page; > > + unsigned long obj_idx, off; > > + > > + unsigned int class_idx; > > + enum fullness_group fg; > > + struct size_class *class; > > + struct page *pages[2]; > > + int sizes[2]; > > + void *addr; > > + > > + BUG_ON(!handle); > > + > > + /* > > +* Because we use per-cpu mapping areas shared among the > > +* pools/users, we can't allow mapping in interrupt context > > +* because it can corrupt another users mappings. > > +*/ > > + BUG_ON(in_interrupt()); > > + > > + obj_handle_to_location(handle, , _idx); > > + get_zspage_mapping(get_first_page(page), _idx, ); > > + class = >size_class[class_idx]; > > + off = obj_idx_to_offset(page, obj_idx, class->size); > > + off += src_off; > > + > > + BUG_ON(class->size < n); > > + > > + if (off + n <= PAGE_SIZE) { > > + /* this object is contained entirely within a page */ > > + addr = kmap_atomic(page); > > + memcpy(dest, addr + off, n); > > + kunmap_atomic(addr); > > + return; > > + } > > + > > + /* this object spans two pages */ > > + pages[0] = page; > > + pages[1] = get_next_page(page); > > + BUG_ON(!pages[1]); > > + > > + sizes[0] = PAGE_SIZE - off; > > + sizes[1] = n - sizes[0]; > > + > > + addr = kmap_atomic(pages[0]); > > + memcpy(dest, addr + off, sizes[0]); > > + kunmap_atomic(addr); > > + > > + addr = kmap_atomic(pages[1]); > > + memcpy(dest + sizes[0], addr, sizes[1]); > > + kunmap_atomic(addr); > > +} > > +EXPORT_SYMBOL_GPL(zs_mem_read); > > + > > Ditto. Write descriptoin. 
> > > +void zs_mem_write(struct zs_pool *pool, unsigned long handle, > > + const void *src, unsigned long dest_off, size_t n) > > +{ > > + struct page *page; > > + unsigned long obj_idx, off; > > + > > + unsigned int class_idx; > > + enum fullness_group fg; > > + struct size_class *class; > > + struct page *pages[2]; > > + int sizes[2]; > > + void *addr; > > + > > + BUG_ON(!handle); > > + > > + /* > > +* Because we use per-cpu mapping areas shared among the > > +* pools/users, we can't allow mapping in interrupt context > > +* because it can corrupt another users mappings. > > +*/ > > + BUG_ON(in_interrupt()); > > + > > + obj_handle_to_location(handle, , _idx); > > + get_zspage_mapping(get_first_page(page), _idx, ); > > + class = >size_class[class_idx]; > > + off = obj_idx_to_offset(page, obj_idx, class->size); > > + off += dest_off; > > + > > + BUG_ON(class->size < n); > > + > > + if (off + n <= PAGE_SIZE) { > > + /* this object is contained entirely within a page */ > > + addr = kmap_atomic(page); > > +
Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()
On Mon, 2013-01-21 at 07:42 +0100, Mike Galbraith wrote:
> On Mon, 2013-01-21 at 13:07 +0800, Michael Wang wrote:
> > May be we could try change this back to the old way later, after the aim
> > 7 test on my server.
>
> Yeah, something funny is going on.

Never entering balance path kills the collapse.  Asking wake_affine()
wrt the pull as before, but allowing us to continue should no idle cpu
be found, still collapsed.  So the source of funny behavior is indeed in
balance_path.

-Mike
Re: [PATCH v2 1/2] ARM: shmobile: sh73a0: Use generic irqchip_init()
On Mon, Jan 21, 2013 at 09:54:39AM +0900, Simon Horman wrote:
> On Fri, Jan 18, 2013 at 08:16:12AM +0100, Thierry Reding wrote:
> > The asm/hardware/gic.h header does no longer exist and the corresponding
> > functionality was moved to linux/irqchip.h and linux/irqchip/arm-gic.h
> > respectively. gic_handle_irq() and of_irq_init() are no longer available
> > either and have been replaced by irqchip_init().
>
> asm/hardware/gic.h Seems to still exist in Linus's tree.
> Could you let me know which tree of which branch I should depend on
> in order to apply this change?

I found this when doing an automated build over all ARM defconfigs on
linux-next. Commit 520f7bd73354f003a9a59937b28e4903d985c420 "irqchip:
Move ARM gic.h to include/linux/irqchip/arm-gic.h" moved the file and
was merged through Olof Johansson's next/cleanup and for-next branches.

Adding Olof on Cc since I'm not quite sure myself about how this is
handled.

Thierry
Re: [RFC PATCH v1 0/3] kdump, vmcore: Map vmcore memory in direct mapping region
From: Vivek Goyal
Subject: Re: [RFC PATCH v1 0/3] kdump, vmcore: Map vmcore memory in direct mapping region
Date: Fri, 18 Jan 2013 15:54:13 -0500

> On Fri, Jan 18, 2013 at 11:06:59PM +0900, HATAYAMA Daisuke wrote:
>
> [..]
>> > These are impressive improvements. I missed the discussion on mmap().
>> > So why couldn't we provide mmap() interface for /proc/vmcore. If that
>> > works then application can select to mmap/unmap bigger chunks of file
>> > (instead ioremap mapping/remapping a page at a time).
>> >
>> > And if application controls the size of mapping, then it can vary the
>> > size of mapping based on available amount of free memory. That way if
>> > somebody reserves less amount of memory, we could still dump but with
>> > some time penalty.
>> >
>>
>> mmap() needs user-space page table in addition to kernel-space's,
>
> [ CC Rik van Riel]
>
> I was chatting with Rik and it does not look like that there is any
> fundamental requirement that range of pfn being mapped in user tables
> has to be mapped in kernel tables too. Did you run into specific issue.
>

No, I was simply confused about this.

>> and
>> it looks that remap_pfn_range() that creates the user-space page
>> table, doesn't support large pages, only 4KB pages.
>
> This indeed looks like the case. May be we can enhance remap_pfn_range()
> to take an argument and create larger size mappings.
>

Adding a new argument to remap_pfn_range() would not easily be accepted
because it changes the signature of a function that is exported to
modules. As init_memory_mapping() does, it should internally and
automatically divide a given range of kernel address space into properly
aligned ranges and then remap them.

Also, if we extend this in the future, we need some way for userland to
know whether a given kernel can use 2MB/1GB pages for remapping.
makedumpfile needs to estimate how much memory is required for the
remapping.
>> If mmaping small
>> chunks only for small memory programming, then we would again face the
>> same issue as with ioremap.
>
> Even if it is 4KB pages, I think it will still be faster than current
> interface. Because we will not be issuing these many tlb flushes.
> (Assuming makedumpfile has been modified to map/unmap large areas of
> /proc/vmcore).
>

OK, I'll go in this direction first.

From my local investigation, I'm beginning to think that my idea of
mapping whole DIMM ranges in the direct mapping region is difficult due to
some memory hot-plug issues, and that an mmap interface is more useful
than keeping page table handling in /proc/vmcore when we process
/proc/vmcore in parallel, with each process reading a different range.

Assuming we can use 4KB pages only, if we use a 1MB buffer for the page
table, we can cover about a 500MB memory region. Remapping is then done
about 2000 times. In the ioremap case, remapping is done 268435456 times.
Performance should improve considerably. We should benchmark this first.

Thanks.
HATAYAMA, Daisuke
Re: [PATCH v2 16/76] ARC: Syscall support (no-legacy-syscall ABI)
On Saturday 19 January 2013 08:39 AM, Al Viro wrote:
> Please, collapse your #36--#40 into that one (and I'd probably fold #17
> here as well, to simplify that reordering).  Sure, it's not a bisection
> hazard, but...
>

I kept #16 and #17 distinct and
* squashed switch-to-generic-kernel-thread #36 into process creation patch #17
* split generic kernel_execve and sys_execve #37 into two
* squashed sys_execve bits into syscall patch #16
* squashed kernel_execve patch into #17
* squashed switch-to-saner-execve patches #38 and #39 into #17
* squashed generic clone patch #40 into #16
Re: [PATCH v9 11/11] PCI: Put pci dev to device tree as early as possible
On Sun, Jan 20, 2013 at 3:23 PM, Rafael J. Wysocki wrote: > On Thursday, January 17, 2013 11:53:22 PM Yinghai Lu wrote: >> We want to put created pci device in the device tree as soon as possible. >> - just after we find it and create pci_dev struct for it. >> so for_pci_dev iteration will not miss them. >> >> But at that time, we can not load driver for them yet. Need to be after >> pci_assign_unsigned_resources() etc to make sure all pci devices get >> resource allocated at first. >> >> Move out device registering out of pci_bus_add_devices, and >> new pci_bus_add_devices() will do the device_attach work to load pci drivers >> >> Signed-off-by: Yinghai Lu >> --- >> drivers/pci/bus.c | 47 +++ >> drivers/pci/iov.c |7 --- >> drivers/pci/probe.c | 34 +++--- >> 3 files changed, 30 insertions(+), 58 deletions(-) >> >> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c >> index 18c1c6d..0a55845 100644 >> --- a/drivers/pci/bus.c >> +++ b/drivers/pci/bus.c >> @@ -178,22 +178,9 @@ static void pci_bus_attach_device(struct pci_dev *dev) >> */ .. >> @@ -205,21 +192,9 @@ int pci_bus_add_device(struct pci_dev *dev) >> */ >> int pci_bus_add_child(struct pci_bus *bus) >> { >> - int retval; >> - >> - if (bus->bridge) >> - bus->dev.parent = bus->bridge; >> - >> - retval = device_register(>dev); >> - if (retval) >> - return retval; >> - >> bus->is_added = 1; >> >> - /* Create legacy_io and legacy_mem files for this bus */ >> - pci_create_legacy_files(bus); >> - >> - return retval; >> + return 0; >> } > > Well, what sense does this make to keep that function as is after removing > almost all of the code from it? ok, will remove that function. ... >> list_for_each_entry(dev, >devices, bus_list) { >> BUG_ON(!dev->is_added); >> >> child = dev->subordinate; >> - /* >> - * If there is an unattached subordinate bus, attach >> - * it and then scan for unattached PCI devices. 
>> - */ >> + >> if (!child) >> continue; >> - if (list_empty(>node)) { >> - down_write(_bus_sem); >> - list_add_tail(>node, >bus->children); >> - up_write(_bus_sem); >> - } > > This doesn't seem to have a replacement. Why isn't it necessary any more? > add that in changelog, so related changlog will be: --- Also remove unattached child bus handling in pci_bus_add_devices(). Because that is not needed, child bus via pci_add_new_bus() is already in parent bus children list. --- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 0/2] Adding USB 3.0 DRD-phy support for exynos5250
Hi Felipe, On Mon, Jan 14, 2013 at 6:29 PM, Vivek Gautam wrote: > Changes from v2: > - Renaming 'samsung-usbphy.c' driver to 'samsung-usb2.c' indicating >usb 2.0 phy controller's driver for Samsung's SoCs. > - Moving the register definitions and strcuture definitions to >common header file 'samsung-usbphy.h' to be used across >usb 2.0 and usb 3.0 phy. > - Keeping common exported function definitions in samsung-usbphy.c >which can be used across usb 2.0 and usb 3.0 phy. > - Writting separate driver file for Samsung's USB 3.0 phy controller. >and making it dependent on USB_DWC3. > Is the re-organization being done here fine as per requirements for separate drivers for usb 2.0 type PHY and usb 3.0 type PHY ? > Rebased on top of usb-next followed by following patches/patch-threads: > -- [PATCH v9 1/2] usb: phy: samsung: Introducing usb phy driver for > hsotg > -- [PATCH] usb: phy: samsung: Add support to set pmu isolation > (version 6) > -- [PATCH v6 0/4] Adding usb2.0 host-phy support for exynos5250 > > Changes form v1: > - Moved architecture related patch out of this patch-set. > - Replaced unnecessary multi-line macro definitions by >single line definitions. > - Creating new data structure for USB 3.0 phy type and embedding >it in 'samsung_usbphy' structure. > - Adding a flag in 'samsung_usbphy' structure to check if device >has usb 3.0 type phy or not. > - Restructuring probe sequence for USB 3.0 phy, such that we are >initializing only when device has usb3.0 type phy. 
> > Vivek Gautam (2): > usb: phy: samsung: Common out the generic stuff > usb: phy: samsung: Add PHY support for USB 3.0 controller > > drivers/usb/phy/Kconfig |8 + > drivers/usb/phy/Makefile |3 +- > drivers/usb/phy/samsung-usb2.c | 511 +++ > drivers/usb/phy/samsung-usb3.c | 349 +++ > drivers/usb/phy/samsung-usbphy.c | 713 > +- > drivers/usb/phy/samsung-usbphy.h | 328 + > 6 files changed, 1205 insertions(+), 707 deletions(-) > create mode 100644 drivers/usb/phy/samsung-usb2.c > create mode 100644 drivers/usb/phy/samsung-usb3.c > create mode 100644 drivers/usb/phy/samsung-usbphy.h > -- Thanks & Regards Vivek -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()
On Mon, 2013-01-21 at 13:07 +0800, Michael Wang wrote: > That seems like the default one, could you please show me the numbers in > your datapoint file? Yup, I do not touch the workfile. Datapoints is what you see in the tabulated result... 1 1 1 5 5 5 10 10 10 ... so it does three consecutive runs at each load level. I quiesce the box, set governor to performance, echo 250 32000 32 4096 > /proc/sys/kernel/sem, then ./multitask -nl -f, and point it at ./datapoints. > I'm not familiar with this benchmark, but I'd like to have a try on my > server, to make sure whether it is a generic issue. One thing I didn't like about your changes is that you don't ask wake_affine() if it's ok to pull cross node or not, which I though might induce imbalance, but twiddling that didn't fix up the collapse, pretty much leaving only the balance path. > >> And I'm confusing about how those new parameter value was figured out > >> and how could them help solve the possible issue? > > > > Oh, that's easy. I set sched_min_granularity_ns such that last_buddy > > kicks in when a third task arrives on a runqueue, and set > > sched_wakeup_granularity_ns near minimum that still allows wakeup > > preemption to occur. Combined effect is reduced over-scheduling. > > That sounds very hard, to catch the timing, whatever, it could be an > important clue for analysis. (Play with the knobs with a bunch of different loads, I think you'll find that those settings work well) > >> Do you have any idea about which part in this patch set may cause the > >> issue? > > > > Nope, I'm as puzzled by that as you are. When the box had 40 cores, > > both virgin and patched showed over-scheduling effects, but not like > > this. With 20 cores, symptoms changed in a most puzzling way, and I > > don't see how you'd be directly responsible. > > Hmm... 
> > > > >> One change by designed is that, for old logical, if it's a wake up and > >> we found affine sd, the select func will never go into the balance path, > >> but the new logical will, in some cases, do you think this could be a > >> problem? > > > > Since it's the high load end, where looking for an idle core is most > > likely to be a waste of time, it makes sense that entering the balance > > path would hurt _some_, it isn't free.. except for twiddling preemption > > knobs making the collapse just go away. We're still going to enter that > > path if all cores are busy, no matter how I twiddle those knobs. > > May be we could try change this back to the old way later, after the aim > 7 test on my server. Yeah, something funny is going on. I'd like select_idle_sibling() to just go away, that task be integrated into one and only one short and sweet balance path. I don't see why fine_idlest* needs to continue traversal after seeing a zero. It should be just fine to say gee, we're done. Hohum, so much for pure test and report, twiddle twiddle tweak, bend spindle mutilate ;-) -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] fat: eliminate iterations in fat_search_long in case of EOD
2013/1/21, OGAWA Hirofumi :
> Namjae Jeon writes:
>
>> 2013/1/20, OGAWA Hirofumi :
>>> Namjae Jeon writes:
>>>
>>>> From: Namjae Jeon
>>>>
>>>> When searching a directory for names, we can stop checking for further
>>>> entries if we detect End of Directory, i.e. if (de->name[0] == 0x00).
>>>> The current code traverses the cluster chain of a directory until a hit
>>>> is found or till the last cluster for that directory, ignoring the EOD
>>>> mark. Fix this.
>>>
>>> f_pos still works fine after this change?
>> Hi OGAWA.
>> I can not find f_pos usage in fat_search_long function.
>> Maybe, Have you seen other function such as __fat_readdir ?
>> Let me know your opinion.
>
> Ah, I see. Only ->lookup. So, this makes behavior more strange.
> I.e. readdir() returns beyond 0, but lookup() can't find it?
Yes, Good point. I will check other places included readdir.
Thanks for review!
>
> Thanks.
> --
> OGAWA Hirofumi
>
[RFC PATCH 5/5] timekeeping: Add support for clocksource which doesn't stop during suspend
There are some new processors whose TSC clocksource won't stop during
suspend. Currently, after the system resumes from a sleep state, the kernel
uses the persistent clock or the RTC to compensate for the sleep time. For
these new clocksources we can skip the compensation from external sources
and use the current clocksource itself to count the time spent asleep.
This can fix some time drift bugs caused by inaccurate RTC devices.

Signed-off-by: Feng Tang
---
 kernel/time/timekeeping.c |   23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index cbc6acb..628c9ba 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -749,22 +749,36 @@ void timekeeping_inject_sleeptime(struct timespec *delta)
 static void timekeeping_resume(void)
 {
 	struct timekeeper *tk = &timekeeper;
+	struct clocksource *clock = tk->clock;
 	unsigned long flags;
 	struct timespec ts;
+	cycle_t cycle_now, cycle_delta;
+	s64 nsec;
 
 	read_persistent_clock(&ts);
-
 	clockevents_resume();
 	clocksource_resume();
 
 	write_seqlock_irqsave(&tk->lock, flags);
 
-	if (timespec_compare(&ts, &timekeeping_suspend_time) > 0) {
+	if (clock->flags & CLOCK_SOURCE_SUSPEND_NOTSTOP) {
+		cycle_now = clock->read(clock);
+		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
+		clock->cycle_last = cycle_now;
+
+		nsec = clocksource_cyc2ns(cycle_delta, clock->mult, clock->shift);
+		ts = ns_to_timespec(nsec);
+	} else if (timespec_compare(&ts, &timekeeping_suspend_time) > 0)
 		ts = timespec_sub(ts, timekeeping_suspend_time);
-		__timekeeping_inject_sleeptime(tk, &ts);
+	else {
+		ts.tv_sec = 0;
+		ts.tv_nsec = 0;
 	}
+
+	__timekeeping_inject_sleeptime(tk, &ts);
+
 	/* re-base the last cycle value */
-	tk->clock->cycle_last = tk->clock->read(tk->clock);
+	clock->cycle_last = clock->read(clock);
 	tk->ntp_error = 0;
 	timekeeping_suspended = 0;
 	timekeeping_update(tk, false);
@@ -1134,7 +1148,6 @@ static inline void old_vsyscall_fixup(struct timekeeper *tk)
 #endif
 
 
-
 /**
  * update_wall_time - Uses the current clocksource to increment the wall time
  *
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
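For readers following the arithmetic in the new CLOCK_SOURCE_SUSPEND_NOTSTOP
branch, here is a rough user-space sketch of the same calculation: take the
masked difference between the counter now and the value saved before suspend,
then scale it to nanoseconds with mult and shift. The struct and the demo_*
names are hypothetical stand-ins, not kernel API, and the sketch ignores
locking entirely.

```c
#include <stdint.h>

/* Hypothetical user-space stand-in for the kernel's struct clocksource. */
struct demo_clocksource {
	uint64_t mask;        /* bitmask of valid counter bits */
	uint64_t cycle_last;  /* counter value saved before suspend */
	uint32_t mult;        /* cycles-to-ns multiplier */
	uint32_t shift;       /* cycles-to-ns shift */
};

/* Same arithmetic as the formula quoted in patch 4/5: (cycles * mult) >> shift. */
static int64_t demo_cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (int64_t)((cycles * mult) >> shift);
}

/*
 * Nanoseconds elapsed between cycle_last and cycle_now. The mask makes the
 * subtraction come out right when a narrow counter has wrapped once.
 */
static int64_t demo_sleep_ns(struct demo_clocksource *cs, uint64_t cycle_now)
{
	uint64_t delta = (cycle_now - cs->cycle_last) & cs->mask;

	cs->cycle_last = cycle_now;
	return demo_cyc2ns(delta, cs->mult, cs->shift);
}
```

With mult = 4194304 and shift = 23 (half a nanosecond per cycle, i.e. a 2GHz
counter), a delta of 20000000000 cycles converts to 10 seconds. The conversion
is only valid while cycles * mult stays below 2^63, which is the limit that
patch 4/5 in this series is about.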
[RFC PATCH 3/5] x86: tsc: Add support for new S3_NOTSTOP feature
Signed-off-by: Feng Tang
---
 arch/x86/kernel/tsc.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 06ccb50..4cc33ca 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -767,7 +767,8 @@ static cycle_t read_tsc(struct clocksource *cs)
 
 static void resume_tsc(struct clocksource *cs)
 {
-	clocksource_tsc.cycle_last = 0;
+	if (!boot_cpu_has(X86_FEATURE_TSC_S3_NOTSTOP))
+		clocksource_tsc.cycle_last = 0;
 }
 
 static struct clocksource clocksource_tsc = {
@@ -938,6 +939,9 @@ static int __init init_tsc_clocksource(void)
 		clocksource_tsc.flags &= ~CLOCK_SOURCE_IS_CONTINUOUS;
 	}
 
+	if (boot_cpu_has(X86_FEATURE_TSC_S3_NOTSTOP))
+		clocksource_tsc.flags |= CLOCK_SOURCE_SUSPEND_NOTSTOP;
+
 	/*
 	 * Trust the results of the earlier calibration on systems
 	 * exporting a reliable TSC.
-- 
1.7.9.5
[RFC PATCH 4/5] clocksource: Enlarge the maxim time interval when configuring the scale and shift
On our x86 platform, we see a failure case where clocksource_cyc2ns()
returns a negative value. The reason is that the time interval was large
(more than 1000 seconds) while the TSC frequency is 2GHz, so the following
formula overflowed:

	((u64) cycles * mult) >> shift

So enlarge the time interval from 10 minutes to 40 minutes to fix the bug.

Another solution may be to add a "max_interval" field to struct
clocksource, and use a default value (like the current 10 minutes) when
the clocksource driver doesn't set it.

Signed-off-by: Feng Tang
---
 kernel/time/clocksource.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index c958338..48fbfcb 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -663,7 +663,7 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 	 * Calc the maximum number of seconds which we can run before
 	 * wrapping around. For clocksources which have a mask > 32bit
 	 * we need to limit the max sleep time to have a good
-	 * conversion precision. 10 minutes is still a reasonable
+	 * conversion precision. 40 minutes is still a reasonable
 	 * amount. That results in a shift value of 24 for a
 	 * clocksource with mask >= 40bit and f >= 4GHz. That maps to
 	 * ~ 0.06ppm granularity for NTP. We apply the same 12.5%
@@ -674,8 +674,8 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 	do_div(sec, scale);
 	if (!sec)
 		sec = 1;
-	else if (sec > 600 && cs->mask > UINT_MAX)
-		sec = 600;
+	else if (sec > 2400 && cs->mask > UINT_MAX)
+		sec = 2400;
 
 	clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
 			       NSEC_PER_SEC / scale, sec * scale);
-- 
1.7.9.5
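To make the overflow above concrete, the following user-space sketch
recomputes mult and shift the way I believe clocks_calc_mult_shift() does
(the loop is reproduced from memory, so treat it as an approximation of the
kernel function, not a copy), and then works out after how many seconds
cycles * mult reaches 2^63 and the s64 result turns negative. The demo_*
names are hypothetical, and the inputs assume a TSC registered in kHz
(freq = 2000000, scale = 1000); the real values on the affected machine are
not given in the mail.

```c
#include <stdint.h>

/* User-space approximation of the loop in clocks_calc_mult_shift(). */
static void demo_calc_mult_shift(uint32_t *mult, uint32_t *shift,
				 uint32_t from, uint32_t to, uint32_t maxsec)
{
	uint64_t tmp;
	uint32_t sft, sftacc = 32;

	/* Each bit needed above 32 for maxsec worth of cycles costs one bit of mult. */
	tmp = ((uint64_t)maxsec * from) >> 32;
	while (tmp) {
		tmp >>= 1;
		sftacc--;
	}

	/* Pick the largest shift whose multiplier still fits in sftacc bits. */
	for (sft = 32; sft > 0; sft--) {
		tmp = (uint64_t)to << sft;
		tmp += from / 2;
		tmp /= from;
		if ((tmp >> sftacc) == 0)
			break;
	}
	*mult = (uint32_t)tmp;
	*shift = sft;
}

/*
 * Whole seconds of counting after which cycles * mult reaches 2^63, i.e.
 * the point where the signed 64-bit nanosecond result goes negative.
 */
static uint64_t demo_seconds_until_negative(uint32_t mult, uint64_t freq_hz)
{
	return ((UINT64_C(1) << 63) / mult) / freq_hz;
}
```

Under these assumptions, a 600 second limit gives mult = 4194304 and
shift = 23, and the product crosses 2^63 after roughly 1100 seconds, which
is in line with the "more than 1000 seconds" in the report. Raising the
limit to 2400 seconds gives mult = 1048576 and shift = 21, which moves the
crossover to roughly 4400 seconds at the cost of two bits of conversion
precision.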
[RFC PATCH 0/5] Add support for S3 non-stop TSC support.
Hi All,

On some new Intel Atom processors (Penwell and Cloverview), the TSC does
not stop in S3; that is, the TSC value is not reset to 0 after resume.
This makes the TSC a more reliable clocksource and could benefit the
timekeeping code during system suspend/resume cycles.

The enabling work includes adding new flags for this feature and modifying
clocksource.c and timekeeping.c to support and use it.

One remaining question: inside timekeeping_resume() we don't know whether
it is called when resuming from suspend (s2ram) or from hibernate (s2disk),
as there is currently no easy way to check. This doesn't hurt for now, as
these Penwell/Cloverview platforms only have the S3 state and no S4.

Please help to review them, thanks!

- Feng

Feng Tang (5):
  x86: Add cpu capability flag X86_FEATURE_TSC_S3_NOTSTOP
  clocksource: Add new feature flag CLOCK_SOURCE_SUSPEND_NOTSTOP
  x86: tsc: Add support for new S3_NOTSTOP feature
  clocksource: Enlarge the maxim time interval when configuring the
    scale and shift
  timekeeping: Add support for clocksource which doesn't stop during
    suspend

 arch/x86/include/asm/cpufeature.h |    1 +
 arch/x86/kernel/cpu/intel.c       |   12 ++++++++++++
 arch/x86/kernel/tsc.c             |    6 +++++-
 include/linux/clocksource.h       |    1 +
 kernel/time/clocksource.c         |    6 +++---
 kernel/time/timekeeping.c         |   23 ++++++++++++++++++-----
 6 files changed, 40 insertions(+), 9 deletions(-)

-- 
1.7.9.5
[RFC PATCH 1/5] x86: Add cpu capability flag X86_FEATURE_TSC_S3_NOTSTOP
On some new Intel Atom processors (Penwell and Cloverview), the TSC does
not stop in S3; that is, the TSC value is not reset to 0 after resume.
This makes the TSC a more reliable clocksource and could benefit the
timekeeping code during the system suspend/resume cycle, so add a flag
for it.

Signed-off-by: Feng Tang
---
 arch/x86/include/asm/cpufeature.h |    1 +
 arch/x86/kernel/cpu/intel.c       |   12 ++++++++++++
 2 files changed, 13 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 2d9075e..f7e1eac 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -100,6 +100,7 @@
 #define X86_FEATURE_AMD_DCM	(3*32+27) /* multi-node processor */
 #define X86_FEATURE_APERFMPERF	(3*32+28) /* APERFMPERF */
 #define X86_FEATURE_EAGER_FPU	(3*32+29) /* "eagerfpu" Non lazy FPU restore */
+#define X86_FEATURE_TSC_S3_NOTSTOP (3*32+30) /* TSC doesn't stop in S3 state */
 
 /* Intel-defined CPU features, CPUID level 0x0001 (ecx), word 4 */
 #define X86_FEATURE_XMM3	(4*32+ 0) /* "pni" SSE-3 */
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index fcaabd0..532f873 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -97,6 +97,18 @@ static void __cpuinit early_init_intel(struct cpuinfo_x86 *c)
 		sched_clock_stable = 1;
 	}
 
+	/* Penwell and Cloverview have the TSC which doesn't sleep on S3 */
+	if (c->x86 == 6) {
+		switch (c->x86_model) {
+		case 0x27:	/* Penwell */
+		case 0x35:	/* Cloverview */
+			set_cpu_cap(c, X86_FEATURE_TSC_S3_NOTSTOP);
+			break;
+		default:
+			;
+		}
+	}
+
 	/*
 	 * There is a known erratum on Pentium III and Core Solo
 	 * and Core Duo CPUs.
-- 
1.7.9.5
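As an illustration of what the family/model test in the hunk above is
matching, here is a small user-space sketch that decodes the family and
model from a CPUID leaf 1 EAX signature and applies the same two-model
check. The demo_* helpers are hypothetical and only mimic what the kernel
has already stored in c->x86 and c->x86_model by the time
early_init_intel() runs; the signature values used with it are made-up
examples with stepping 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Display family from a CPUID leaf 1 EAX signature. */
static unsigned int demo_x86_family(uint32_t sig)
{
	unsigned int family = (sig >> 8) & 0xf;

	/* The extended family field is only added for base family 0xf. */
	if (family == 0xf)
		family += (sig >> 20) & 0xff;
	return family;
}

/* Display model: the extended model supplies the high nibble for family 6 and 15. */
static unsigned int demo_x86_model(uint32_t sig)
{
	unsigned int family = (sig >> 8) & 0xf;
	unsigned int model = (sig >> 4) & 0xf;

	if (family == 0x6 || family == 0xf)
		model |= ((sig >> 16) & 0xf) << 4;
	return model;
}

/* Same decision as the patch: family 6, model 0x27 or 0x35. */
static bool demo_tsc_s3_nonstop(uint32_t sig)
{
	if (demo_x86_family(sig) != 6)
		return false;

	switch (demo_x86_model(sig)) {
	case 0x27:	/* Penwell */
	case 0x35:	/* Cloverview */
		return true;
	default:
		return false;
	}
}
```

For example, a signature of 0x00020670 decodes to family 6, model 0x27 and
is accepted, while 0x000306A9 decodes to model 0x3A and is not.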
[RFC PATCH 2/5] clocksource: Add new feature flag CLOCK_SOURCE_SUSPEND_NOTSTOP
Some x86 processors have a TSC clocksource which continues to work while
the system is suspended. Add a feature flag so that it can be utilized.

Signed-off-by: Feng Tang
---
 include/linux/clocksource.h |    1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 4dceaf8..2d53a8a 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -206,6 +206,7 @@ struct clocksource {
 #define CLOCK_SOURCE_WATCHDOG			0x10
 #define CLOCK_SOURCE_VALID_FOR_HRES		0x20
 #define CLOCK_SOURCE_UNSTABLE			0x40
+#define CLOCK_SOURCE_SUSPEND_NOTSTOP		0x80
 
 /* simplify initialization of mask field */
 #define CLOCKSOURCE_MASK(bits) (cycle_t)((bits) < 64 ? ((1ULL<<(bits))-1) : -1)
-- 
1.7.9.5
scripts/package/Makefile: KBUILD_OUTPUT is useless in rpm build
I found that the KBUILD_OUTPUT variable is useless in the rpm-pkg and rpm
targets. Yes, there is a comment that says:

# Note that the rpm-pkg target cannot be used with KBUILD_OUTPUT,
# but the binrpm-pkg target can; for some reason O= gets ignored.

It does not say what the reason is. Also, the code under rpm-pkg checks
whether KBUILD_OUTPUT is defined:

> @if test -n "$(KBUILD_OUTPUT)"; then \
> echo "Building source + binary RPM is not possible outside the"; \
> echo "kernel source tree. Don't set KBUILD_OUTPUT, or use the"; \
> echo "binrpm-pkg target instead."; \
> false; \
> fi

But the fact is that KBUILD_OUTPUT is always empty here, whether or not
the user passes the "O=" option. I tried to figure out why, but the big
Makefile drives me crazy. If the "O=" option really does not affect
KBUILD_OUTPUT here, I think we should at least remove this code.

--
Bin Wang
[PATCH 2/2] ARM: davinci: da850: add OF_DEV_AUXDATA entry for eth0.
From: Lad, Prabhakar

Add OF_DEV_AUXDATA for eth0 driver in da850 board dt file to use emac
clock.

Signed-off-by: Lad, Prabhakar
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: davinci-linux-open-sou...@linux.davincidsp.com
Cc: net...@vger.kernel.org
Cc: devicetree-disc...@lists.ozlabs.org
Cc: Sekhar Nori
Cc: Heiko Schocher
---
 arch/arm/mach-davinci/da8xx-dt.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/arch/arm/mach-davinci/da8xx-dt.c b/arch/arm/mach-davinci/da8xx-dt.c
index 37c27af..d548a38 100644
--- a/arch/arm/mach-davinci/da8xx-dt.c
+++ b/arch/arm/mach-davinci/da8xx-dt.c
@@ -37,11 +37,18 @@ static void __init da8xx_init_irq(void)
 	of_irq_init(da8xx_irq_match);
 }
 
+struct of_dev_auxdata da850_evm_auxdata_lookup[] __initdata = {
+	OF_DEV_AUXDATA("ti,davinci-dm6467-emac", 0x01e2, "davinci_emac.1",
+		       NULL),
+	{}
+};
+
 #ifdef CONFIG_ARCH_DAVINCI_DA850
 
 static void __init da850_init_machine(void)
 {
-	of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
+	of_platform_populate(NULL, of_default_bus_match_table,
+			     da850_evm_auxdata_lookup, NULL);
 
 	da8xx_uart_clk_enable();
 }
-- 
1.7.4.1
[PATCH 1/2] ARM: davinci: da850: add DT node for eth0.
From: Lad, Prabhakar

Add eth0 device tree node information to da850 by providing interrupt
details and local mac address of eth0.

Signed-off-by: Lad, Prabhakar
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: davinci-linux-open-sou...@linux.davincidsp.com
Cc: net...@vger.kernel.org
Cc: devicetree-disc...@lists.ozlabs.org
Cc: Sekhar Nori
Cc: Heiko Schocher
---
 arch/arm/boot/dts/da850-evm.dts |    3 +++
 arch/arm/boot/dts/da850.dtsi    |   15 +++++++++++++++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/arch/arm/boot/dts/da850-evm.dts b/arch/arm/boot/dts/da850-evm.dts
index 37dc5a3..a1d6e3e 100644
--- a/arch/arm/boot/dts/da850-evm.dts
+++ b/arch/arm/boot/dts/da850-evm.dts
@@ -24,5 +24,8 @@
 		serial2: serial@1d0d000 {
 			status = "okay";
 		};
+		eth0: emac@1e2 {
+			status = "okay";
+		};
 	};
 };
diff --git a/arch/arm/boot/dts/da850.dtsi b/arch/arm/boot/dts/da850.dtsi
index 640ab75..309cc99 100644
--- a/arch/arm/boot/dts/da850.dtsi
+++ b/arch/arm/boot/dts/da850.dtsi
@@ -56,5 +56,20 @@
 			interrupt-parent = <>;
 			status = "disabled";
 		};
+		eth0: emac@1e2 {
+			compatible = "ti,davinci-dm6467-emac";
+			reg = <0x22 0x4000>;
+			ti,davinci-ctrl-reg-offset = <0x3000>;
+			ti,davinci-ctrl-mod-reg-offset = <0x2000>;
+			ti,davinci-ctrl-ram-offset = <0>;
+			ti,davinci-ctrl-ram-size = <0x2000>;
+			local-mac-address = [ 00 00 00 00 00 00 ];
+			interrupts = <33
+					34
+					35
+					36
+					>;
+			interrupt-parent = <>;
+		};
 	};
 };
-- 
1.7.4.1
[PATCH 0/2] ARM: davinci: da850: add ethernet driver DT support
From: Lad, Prabhakar

This patch set enables Ethernet support through device tree model.
Patches are available on [1] for testing.

[1] http://git.linuxtv.org/mhadli/v4l-dvb-davinci_devices.git/shortlog/refs/heads/da850_dt

Lad, Prabhakar (2):
  ARM: davinci: da850: add DT node for eth0.
  ARM: davinci: da850: add OF_DEV_AUXDATA entry for eth0.

 arch/arm/boot/dts/da850-evm.dts  |    3 +++
 arch/arm/boot/dts/da850.dtsi     |   15 +++++++++++++++
 arch/arm/mach-davinci/da8xx-dt.c |    9 ++++++++-
 3 files changed, 26 insertions(+), 1 deletions(-)

-- 
1.7.4.1
linux-next: Tree for Jan 21
Hi all, Changes since 20130118: The powerpc tree still had a build failure. The security tree gained a conflict against Linus' tree. The driver-core tree lost its build failure. The tty tree gained a conflict against Linus' tree. The usb tree gained a conflict against Linus' tree and a build failure so I used the version from next-20130118. The gpio-lw tree lost its build failure. The samsung tree gained a conflict against the gpio-lw tree. The akpm tree gained a conflict against the drm tree and a build failure for which I reverted a commit. I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" as mentioned in the FAQ on the wiki (see below). You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc and an allmodconfig for x86_64. After the final fixups (if any), it is also built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and allyesconfig (minus CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc, sparc64 and arm defconfig. These builds also have CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and CONFIG_DEBUG_INFO disabled when necessary. Below is a summary of the state of the merge. We are up to 211 trees (counting Linus' and 28 trees of patches pending for Linus' tree), more are welcome (even if they are currently empty). Thanks to those who have contributed, and to those who haven't, please do. Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . 
If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. There is a wiki covering stuff to do with linux-next at http://linux.f-seidel.de/linux-next/pmwiki/ . Thanks to Frank Seidel. -- Cheers, Stephen Rothwells...@canb.auug.org.au $ git checkout master $ git reset --hard stable Merging origin/master (3a142ed Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal) Merging fixes/master (d287b87 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs) Merging kbuild-current/rc-fixes (02f3e53 Merge branch 'yem-kconfig-rc-fixes' of git://gitorious.org/linux-kconfig/linux-kconfig into kbuild/rc-fixes) Merging arm-current/fixes (210b184 Merge branch 'for-rmk/virt/hyp-boot/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into fixes) Merging m68k-current/for-linus (e7e29b4 m68k: Wire up finit_module) Merging powerpc-merge/merge (e6449c9 powerpc: Add missing NULL terminator to avoid boot panic on PPC40x) Merging sparc/master (b7c13f7 sparc: remove __devinit, __devexit annotations) Merging net/master (b74aa93 tcp: fix incorrect LOCKDROPPEDICMPS counter) Merging sound-current/for-linus (e043403 ALSA: hda - Fix mute led for another HP machine) Merging pci-current/for-linus (444ee9b PCI: remove depends on CONFIG_EXPERIMENTAL) Merging wireless/master (4668cce ath9k: disable the tasklet before taking the PCU lock) Merging driver-core.current/driver-core-linus (7d1f9ae Linux 3.8-rc4) Merging tty.current/tty-linus (ebebd49 8250/16?50: Add support for Broadcom TruManage redirected serial port) Merging usb.current/usb-linus (1ee0a22 USB: io_ti: Fix NULL dereference in chase_port()) Merging staging.current/staging-linus (7dfc833 staging/sb105x: PARPORT config is not good enough must use PARPORT_PC) Merging char-misc.current/char-misc-linus (33080c1 
Drivers: hv: balloon: Fix a memory leak) Merging input-current/for-linus (b666263 Input: document that unregistering managed devices is not necessary) Merging md-current/for-linus (a9add5d md/raid5: add blktrace calls) Merging audit-current/for-linus (c158a35 audit: no leading space in audit_log_d_path prefix) Merging crypto-current/master (a2c0911 crypto: caam - Updated SEC-4.0 device tree binding for ERA information.) Merging ide/master (9974e43 ide: fix generic_ide_suspend/resume Oops) Merging dwmw2/master (084a0ec x86: add CONFIG_X86_MOVBE option) CONFLICT (content): Merge conflict in arch/x86/Kconfig Merging sh-current/sh-fixes-for-linus (4403310 SH: Convert out[bwl] macros to inline functions) Merging irqdomain-current/irqdomain/merge (a0d271c Linux 3.6) Merging devicetree-current/devicetree/merge (ab28698 of: define struct device in of_platform.h if !OF_DEVICE and !OF_ADDRESS) Merging
[BUG] Bug in netprio_cgroup and netcls_cgroup ?
I'm not a network developer, so correct me if I'm wrong.

Since commit 406a3c638ce8b17d9704052c07955490f732c2b8 ("net:
netprio_cgroup: rework update socket logic"), sock->sk->sk_cgrp_prioidx
is set when the socket is created, and won't be updated unless the task
is moved to another cgroup.

Now the problem is that a socket can be _shared_ by multiple processes
(fork, SCM_RIGHTS). If we place those processes in different cgroups, and
each cgroup has a different config, all of the processes will still send
data via this socket with the same network priority.

The same applies to the cls cgroup.
linux-next: build failure after merge of the final tree (akpm tree related)
Hi all,

After merging the final tree, today's linux-next build (arm defconfig)
failed like this:

mm/memblock.c: In function 'memblock_find_in_range_node':
mm/memblock.c:104:2: error: invalid use of undefined type 'struct movablecore_map'
mm/memblock.c:123:4: error: invalid use of undefined type 'struct movablecore_map'
mm/memblock.c:130:7: error: invalid use of undefined type 'struct movablecore_map'
mm/memblock.c:131:4: error: invalid use of undefined type 'struct movablecore_map'

Caused by commit "page_alloc: bootmem limit with movablecore_map" from
the akpm tree. The definition of struct movablecore_map is protected by
CONFIG_HAVE_MEMBLOCK_NODE_MAP but its use is not.

I have reverted that commit for today.
-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
Re: Issues with "x86, um: switch to generic fork/vfork/clone" commit
On Sun, Jan 20, 2013 at 06:39:09PM -0800, Linus Torvalds wrote:

> And right now, that HAVE_SYSCALL_WRAPPERS does make it much harder to
> think about the header file changes.

Agreed.

> > FWIW, there's another bit of ugliness around that area - all these
> > #define __SC_BLAH3, etc., all of the same form. This stuff begs for
> > something like
> > #define __MAP1(m,t,a) m(t,a)
> > #define __MAP2(m,t,a,...) m(t,a) __MAP1(m,__VA_ARGS__)
> > #define __MAP3(m,t,a,...) m(t,a) __MAP2(m,__VA_ARGS__)
> > #define __MAP4(m,t,a,...) m(t,a) __MAP3(m,__VA_ARGS__)
> > #define __MAP5(m,t,a,...) m(t,a) __MAP4(m,__VA_ARGS__)
> > #define __MAP6(m,t,a,...) m(t,a) __MAP5(m,__VA_ARGS__)
> > #define __MAP(n,...) __MAP##n(__VA_ARGS__)
> > with __MAP(x,__SC_DECL,__VA_ARGS__) instead of __SC_DECL##x(__VA_ARGS__)
> > etc. in users...

... with missing commas added, of course.

> Well, I can see both sides. The above is the nice and dense
> declaration model with less duplication, but christ, it's hard for
> people to wrap their minds around unless they've seen it a million
> times. It really does take some getting used to, and the long-form can
> be easier to understand.

Umm... Even with
/*
 * __MAP - apply a given macro to all syscall arguments.
 * __MAP(n, m, t1, a1, ..., tn, an) will expand to
 *	m(t1,a1), m(t2,a2), ..., m(tn, an)
 * Note that the first argument of __MAP must be equal to the number of
 * type, name pairs in the list. The list itself (all arguments of __MAP
 * starting with the 3rd one) is in the form we pass to SYSCALL_DEFINE.
 */
slapped on top of it?

> That said, we have so many of those things now when it comes to the
> syscall stuff that the dense form seems to be called for just to be
> consistent.
>
> So go wild if you have the energy for it. I'm not going to pull that
> for 3.8, though.

No, that's obviously next cycle fodder, along with the sick tricks for
generating compat wrappers on s390 if Martin can live with those.

BTW, grep for asmlinkage; it's amazing how much cargo-culting is going on
with it ;-/ Some of the instances are syscalls yet to be converted to
SYSCALL_DEFINE; even more of COMPAT_SYSCALL_DEFINE-to-be. We also have
a bunch of declarations in syscalls.h and compat.h - those are fine.
_Some_ of the rest might be legitimate - ia64 and i386 have non-trivial
asmlinkage expansion and some (but not all) of arch/{x86,ia64} instances
do make sense. Not all of those - e.g. things like FPU_divide_by_zero()
have no business being regparm(0); they are only called from C code and
forcing their arguments on stack is a pure pessimization for no reason
whatsoever. Everything else in arch/* is magic green marker, AFAICS...

There are some borderline cases - e.g. I'm not sure if having sys_recv
done *not* via SYSCALL_DEFINE() is deliberate; it might cut down on some
overhead (the sucker's calling sys_recvfrom(), which does normalizations,
which make normalizing in sys_recv() pointless). OTOH, sys_send *is*
done as SYSCALL_DEFINE, even though it ends up calling sys_sendto()...
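A small user-space sketch of the __MAP idea discussed above, with the
commas added, may help readers who have not seen the pattern before. The
macros are renamed to DEMO_MAP*, DEMO_DECL and DEMO_ARGS to stay out of
the reserved identifier space, only three levels are defined, and
DEMO_DEFINE3 with its add3 function is an invented stand-in for a
SYSCALL_DEFINE-style wrapper, not the kernel's macro.

```c
/* Apply macro m to each (type, name) pair, separated by commas. */
#define DEMO_MAP1(m, t, a)      m(t, a)
#define DEMO_MAP2(m, t, a, ...) m(t, a), DEMO_MAP1(m, __VA_ARGS__)
#define DEMO_MAP3(m, t, a, ...) m(t, a), DEMO_MAP2(m, __VA_ARGS__)
/* n must equal the number of (type, name) pairs that follow m. */
#define DEMO_MAP(n, ...)        DEMO_MAP##n(__VA_ARGS__)

#define DEMO_DECL(t, a) t a	/* pair -> parameter declaration */
#define DEMO_ARGS(t, a) a	/* pair -> bare argument name */

/*
 * Declare an inner function, emit a wrapper that forwards to it, and
 * leave the inner function's header open for the body that follows.
 */
#define DEMO_DEFINE3(name, ...) \
	static long demo_do_##name(DEMO_MAP(3, DEMO_DECL, __VA_ARGS__)); \
	static long demo_##name(DEMO_MAP(3, DEMO_DECL, __VA_ARGS__)) \
	{ \
		return demo_do_##name(DEMO_MAP(3, DEMO_ARGS, __VA_ARGS__)); \
	} \
	static long demo_do_##name(DEMO_MAP(3, DEMO_DECL, __VA_ARGS__))

DEMO_DEFINE3(add3, int, x, long, y, short, z)
{
	return x + y + z;
}
```

Here DEMO_MAP(3, DEMO_DECL, int, x, long, y, short, z) expands to
"int x, long y, short z" and the DEMO_ARGS variant expands to "x, y, z",
so one list of pairs produces both the prototype and the forwarding call.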
Re: Compilation problem with drivers/staging/zsmalloc when !SMP on ARM
On Fri, Jan 18, 2013 at 11:46:02PM -0500, Konrad Rzeszutek Wilk wrote: > On Fri, Jan 18, 2013 at 07:11:32PM -0600, Matt Sealey wrote: > > On Fri, Jan 18, 2013 at 3:08 PM, Russell King - ARM Linux > > wrote: > > > On Fri, Jan 18, 2013 at 02:24:15PM -0600, Matt Sealey wrote: > > >> Hello all, > > >> > > >> I wonder if anyone can shed some light on this linking problem I have > > >> right now. If I configure my kernel without SMP support (it is a very > > >> lean config for i.MX51 with device tree support only) I hit this error > > >> on linking: > > > > > > Yes, I looked at this, and I've decided that I will _not_ fix this export, > > > neither will I accept a patch to add an export. > > > > Understood.. > > > > > As far as I can see, this code is buggy in a SMP environment. There's > > > apparantly no guarantee that: > > > > > > 1. the mapping will be created on a particular CPU. > > > 2. the mapping will then be used only on this specific CPU. > > > 3. no guarantee that another CPU won't speculatively prefetch from this > > >region. > > > 4. when the mapping is torn down, no guarantee that it's the same CPU that > > >used the happing. > > > > > > So, the use of the local TLB flush leaves all the other CPUs potentially > > > containing TLB entries for this mapping. > > > > I'm gonna put this out to the maintainers (Konrad, and Seth since he > > committed it) that if this code is buggy it gets taken back out, even > > if it makes zsmalloc "slow" on ARM, for the following reasons: > > Just to make sure I understand, you mean don't use page table > mapping but instead use copying? 
> > > > > * It's buggy on SMP as Russell describes above > > * It might not be buggy on UP (opposite to Russell's description above > > as the restrictions he states do not exist), but that would imply an > > export for a really core internal MM function nobody should be using > > anyway > > * By that assessment, using that core internal MM function on SMP is > > also bad voodoo that zsmalloc should not be doing > > 'local_tlb_flush' is bad voodoo? > > > > > It also either smacks of a lack of comprehensive testing or defiance > > of logic that nobody ever built the code without CONFIG_SMP, which > > means it was only tested on a bunch of SMP ARM systems (I'm guessing.. > > Pandaboard? :) or UP systems with SMP/SMP_ON_UP enabled (to expand on > > that guess, maybe Beagleboard in some multiplatform Beagle/Panda > > hybrid kernel). I am sure I was reading the mailing lists when that > > patch was discussed, coded and committed and my guess is correct. In > > this case, what we have here anyway is code which when PROPERLY > > configured as so.. > > The initial patch were done on x86. Then Seth did the work to make sure > it worked on PPC. Munchin looked on ARM and that is it. s/Munchin/Minchan > > If you have an ARM server that you would be willing to part with I would > be thrilled to look at it. > > > > > diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c > > b/drivers/staging/zsmalloc/zsmalloc-main.c > > index 09a9d35..ecf75fb 100644 > > --- a/drivers/staging/zsmalloc/zsmalloc-main.c > > +++ b/drivers/staging/zsmalloc/zsmalloc-main.c > > @@ -228,7 +228,7 @@ struct zs_pool { > > * mapping rather than copying > > * for object mapping. > > */ > > -#if defined(CONFIG_ARM) > > +#if defined(CONFIG_ARM) && defined(CONFIG_SMP) > > #define USE_PGTABLE_MAPPING I don't get it. How to prevent the problem Russel described? The problem is that other CPU can prefetch _speculatively_ under us. > > #endif > > > > .. 
such that it even compiles in both "guess" configurations, the > > slower Cortex-A8 600MHz single core system gets to use the slow copy > > path and the dual-core 1GHz+ Cortex-A9 (with twice the RAM..) gets to > > use the fast mapping path. Essentially all the patch does is "improve > > performance" on the fastest, best-configured, large-amounts-of-RAM, > > lots-of-CPU-performance ARM systems (OMAP4+, Snapdragon, Exynos4+, > > marvell armada, i.MX6..) while introducing the problems Russell > > describes, and leave performance exactly the same and potentially far > > more stable on the slower, memory-limited ARM machines. > > Any ideas on how to detect that? > > > > Given the purpose of zsmalloc, zram, zcache etc. this somewhat defies > > logic. If it's not making the memory-limited, slow ARM systems run > > better, what's the point? > > > > So in summary I suggest "we" (Greg? or is it Seth's responsibility?) > > should just back out that whole USE_PGTABLE_MAPPING chunk of code > > introduced with f553646. Then Russell can carry on randconfiging and I > > can build for SMP and UP and get the same code.. with less bugs. > > I get that you want to have this fixed right now. I think having it > fixed the right way is a better choice. Lets discuss that first > before we start tossing patches to disable parts of it. If I don't miss something, we could have 2 choice. 1) use
Re: linux-next: build failure after merge of the gpio-lw tree
Hi Shawn,

On Mon, 21 Jan 2013 14:20:13 +0800 Shawn Guo wrote:
>
> On Sat, Jan 19, 2013 at 10:40:45AM +1100, Stephen Rothwell wrote:
> >
> > On Fri, 18 Jan 2013 16:02:13 +0800 Shawn Guo wrote:
> > >
> > > My bad, sorry for that. I just sent a v2 in reply to this message
> > > for fixing the error. I spent some time trying to install a ppc64
> > > toolchain for testing, but unfortunately with on luck. So Stephen,
> > > I have to rely on linux-next to give it a test again. Thanks.
> >
> > Cross compilers suitable for building kernels are available at
> > http://www.kernel.org/pub/tools/crosstool/ .
>
> Thanks for the link, Stephen. I installed the compiler and verified
> that the v2 fixes the error.

Thanks.
-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
Re: linux-next: build failure after merge of the gpio-lw tree
On Sat, Jan 19, 2013 at 10:40:45AM +1100, Stephen Rothwell wrote:
> Hi Shawn,
>
> On Fri, 18 Jan 2013 16:02:13 +0800 Shawn Guo wrote:
> >
> > My bad, sorry for that. I just sent a v2 in reply to this message
> > for fixing the error. I spent some time trying to install a ppc64
> > toolchain for testing, but unfortunately with on luck. So Stephen,
> > I have to rely on linux-next to give it a test again. Thanks.
>
> Cross compilers suitable for building kernels are available at
> http://www.kernel.org/pub/tools/crosstool/ .

Thanks for the link, Stephen. I installed the compiler and verified
that the v2 fixes the error.

Shawn
Re: [PATCH v2 00/76] Synopsys ARC Linux kernel Port
On Sunday 20 January 2013 11:45 AM, H. Peter Anvin wrote:
> On 01/18/2013 04:24 AM, Vineet Gupta wrote:
>> This patchset based off-of 3.8-rc4, adds the Linux kernel port to ARC700
>> processor family (750D and 770D) from Synopsys. I would be greatful for
>> further review and feedback.
>
> One thing: ARC, as I understand it, is a whole family of architectures, which
> mostly have in common their origin at Synopsys.

ARC has had a long history, starting as a startup in the 90s. There used
to be the ARCtangent instruction set (and the A4, A5... cores based on
it), which was deprecated 10 years ago in favour of the current ARCompact
ISA (600 / 700 cores). So yes, it is a family of architectures, but all
we care about at Synopsys is ARCompact and the 600/700 cores.

However, I don't think there were sub-architectures or forks floating
around. The MIPS "ARC" seems to be some sort of firmware standard and is
not related to ARC: Ralf, would you care to shed some light? I don't know
of any ARC arch other than these.

> Can we make this arch/arc700
> since that is what it is?
>
> -hpa
>

As of now yes, but in the near future we may have a new instruction set
and cores based on it, so it will be better if we keep a non-specific
name.

Thx,
-Vineet
Re: [PATCH 0/4] OPP usage fixes for RCU locking
On Sat, Jan 19, 2013 at 7:28 AM, Rafael J. Wysocki wrote:
> On Friday, January 18, 2013 01:52:31 PM Nishanth Menon wrote:
>> Hi,
>> Despite being documented in function documentation and in
>> Documentation/power/opp.txt, many of the users of OPP APIs
>> dont honor RCU lock usage appropriately.
>>
>> This recently appeared in IRC discussion earlier today [1].
>> I did an audit of current usage and the following series
>> is a result of this.
>>
>> NOTE:
>> 1. The patch "PM / devfreq: exynos4_bus: honor RCU lock usage" has only
>>    been build tested as I dont have an exynos platform to try it on. I have
>>    tried to make it as least intrusive as possible and at least reviewed
>>    to ensure I haven't screwed anything up.
>>
>> Other than this, I have added appropriate tested by information in requisite
>> patches.
>
> Thanks for the fixes.
>
> MyungJoo, do you want me to take the devfreq ones too?
>
> Rafael

Yes, please take the RCU-OPP patches. Having those patches split doesn't
seem beneficial. I'll base the other devfreq patches on this after you
get them applied.

Cheers,
MyungJoo

>
>
>> Series is based off: v3.8-rc4 tag
>> Also available in the following location[2]:
>> https://github.com/nmenon/linux-2.6-playground branch: post/pm/opp-fixes-v1
>>
>> Nishanth Menon (4):
>>   cpufreq: OMAP: use RCU locks around usage of OPP
>>   cpufreq: cpufreq-cpu0: use RCU locks around usage of OPP
>>   PM / devfreq: add locking documentation for recommend_opp
>>   PM / devfreq: exynos4_bus: honor RCU lock usage
>>
>>  drivers/cpufreq/cpufreq-cpu0.c |5 +++
>>  drivers/cpufreq/omap-cpufreq.c |3 ++
>>  drivers/devfreq/devfreq.c |5 +++
>>  drivers/devfreq/exynos4_bus.c | 94
>>  4 files changed, 80 insertions(+), 27 deletions(-)
>>
>> [1] http://www.beagleboard.org/irclogs/index.php?date=2013-01-18#T14:14:07
>> [2] https://github.com/nmenon/linux-2.6-playground/commits/post/pm/opp-fixes-v1
>>
>> Regards,
>> Nishanth Menon
>>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.

--
MyungJoo Ham, Ph.D.
Mobile Software Platform Lab, DMC Business, Samsung Electronics
linux-next: manual merge of the akpm tree with the drm tree
Hi Andrew,

Today's linux-next merge of the akpm tree got a conflict in
drivers/gpu/drm/drm_fb_helper.c between commit 848499032504 ("drm: add
drm_modeset_lock|unlock_all") from the drm tree and commit
"drivers/gpu/drm/drm_fb_helper.c: avoid sleeping in unblank_screen() if
oops in progress" from the akpm tree.

I can't see an easy way to resolve these, so I just dropped the akpm tree
patch.
--
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
Re: [PATCH] fat: eliminate iterations in fat_search_long in case of EOD
Namjae Jeon writes:

> 2013/1/20, OGAWA Hirofumi :
>> Namjae Jeon writes:
>>
>>> From: Namjae Jeon
>>>
>>> When searching a directory for names, we can stop checking for further
>>> entries if we detect End of Directory, i.e. if (de->name[0] == 0x00). The
>>> current code traverses the cluster chain of a directory until a hit is
>>> found or till the last cluster for that directory, ignoring the EOD mark.
>>> Fix this.
>>
>> f_pos still works fine after this change?
> Hi OGAWA.
> I can not find f_pos usage in fat_search_long function.
> Maybe, Have you seen other function such as __fat_readdir ?
> Let me know your opinion.

Ah, I see. Only ->lookup. So, this makes behavior more strange.
I.e. readdir() returns beyond 0, but lookup() can't find it?

Thanks.
--
OGAWA Hirofumi
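The early-exit idea in the patch description above can be sketched in user-space C. This is only an illustration under stated assumptions: the directory is modelled as a flat array of 11-byte short names, and `struct dirent83`, `DELETED_FLAG` and `search_dir()` are hypothetical names, not the kernel's fat_search_long(). It does not address the readdir()/lookup() consistency question raised in the reply.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified 8.3 directory slot (not the kernel structure). */
struct dirent83 {
	unsigned char name[11];
};

#define DELETED_FLAG 0xe5	/* first name byte of a deleted entry */

/*
 * Return the index of the slot whose name matches, or -1 if none does.
 * *visited reports how many slots were examined, to show the early exit.
 */
static int search_dir(const struct dirent83 *dir, size_t n,
		      const unsigned char name[11], size_t *visited)
{
	size_t i;

	*visited = 0;
	for (i = 0; i < n; i++) {
		(*visited)++;
		if (dir[i].name[0] == 0x00)	/* EOD mark: stop scanning */
			break;
		if (dir[i].name[0] == DELETED_FLAG)
			continue;		/* skip deleted slots */
		if (memcmp(dir[i].name, name, 11) == 0)
			return (int)i;
	}
	return -1;
}
```

With an eight-slot directory holding two live entries and one deleted one, a failed lookup in this sketch examines four slots (three used plus the EOD slot) instead of all eight.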
RE: [LSF/MM TOPIC] Re: [dm-devel] Announcement: STEC EnhanceIO SSD caching software for Linux kernel
> -----Original Message-----
> From: Mike Snitzer [mailto:snit...@redhat.com]
> Sent: Saturday, January 19, 2013 3:08 AM
> To: Darrick J. Wong
> Cc: device-mapper development; Amit Kale; linux-bca...@vger.kernel.org;
> kent.overstr...@gmail.com; LKML; lsf...@lists.linux-foundation.org; Joe
> Thornber
> Subject: Re: [LSF/MM TOPIC] Re: [dm-devel] Announcement: STEC EnhanceIO
> SSD caching software for Linux kernel
>
> On Fri, Jan 18 2013 at 4:25pm -0500,
> Darrick J. Wong wrote:
>
> > Since Joe is putting together a testing tree to compare the three
> > caching things, what do you all think of having a(nother) session
> > about ssd caching at this year's LSFMM Summit?
> >
> > [Apologies for hijacking the thread.]
> > [Adding lsf-pc to the cc list.]
>
> Hopefully we'll have some findings on the comparisons well before LSF
> (since we currently have some momentum). But yes it may be worthwhile
> to discuss things further and/or report findings.

We should have performance comparisons presented well before the summit.
It'll be good to have ssd caching session in any case. The likelihood that
one of them will be included in Linux kernel before April is very low.

-Amit
Re: [PATCH v9 10/11] PCI: Add match_driver in struct pci_dev
On Sun, Jan 20, 2013 at 3:15 PM, Rafael J. Wysocki wrote:
> On Thursday, January 17, 2013 11:53:21 PM Yinghai Lu wrote:
>> with that we could move out attaching driver for pci device,
>> out of device_add for pci hot add path.
>>
>> pci_bus_attach_device() will attach driver to pci device.
>
> Acked-by: Rafael J. Wysocki
>
> for the code, but you still aren't saying in the changelog why the change
> is needed.

Thanks. Please check if the changelog is good to you.

---
Subject: [PATCH] PCI: Skip attaching driver in device_add()

We want to add pci device to device tree as early as possible but
delay attach driver in next following path.

To make that patch smaller, in this patch:
We add match_driver field in pci_dev and default value is false,
it will make pci_bus_match fail, so device_add will skip attaching
driver, then pci_bus_attach_device() will set match_driver to true so
pci_bus_match will return true and device_attach will attach driver
to pci device.
---
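The gating described in the proposed changelog can be modelled in a few lines of user-space C. This is a toy sketch, not the PCI core: `struct toy_dev` and every `toy_*` function are hypothetical stand-ins for pci_dev, pci_bus_match(), device_add(), device_attach() and pci_bus_attach_device(), and the real driver-core paths do much more.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_dev; only what the sketch needs. */
struct toy_dev {
	bool match_driver;	/* false until the bus code allows binding */
	bool registered;	/* set once the device is in the device tree */
	bool bound;		/* set once a driver has been attached */
};

/* Models pci_bus_match(): refuse every driver while match_driver is false. */
static bool toy_bus_match(const struct toy_dev *dev)
{
	return dev->match_driver;
}

/* Models the bind attempt shared by device_add() and device_attach(). */
static void toy_device_attach(struct toy_dev *dev)
{
	if (toy_bus_match(dev))
		dev->bound = true;
}

/* Models device_add(): register the device, then try to bind a driver. */
static void toy_device_add(struct toy_dev *dev)
{
	dev->registered = true;
	toy_device_attach(dev);	/* no effect while match_driver is false */
}

/* Models pci_bus_attach_device(): open the gate, then bind. */
static void toy_bus_attach_device(struct toy_dev *dev)
{
	dev->match_driver = true;
	toy_device_attach(dev);
}
```

In this model a device is registered but unbound after toy_device_add(), and becomes bound only after toy_bus_attach_device(), which is the ordering the changelog asks for.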
[PATCH v3 3/4] zram: get rid of lockdep warning
Lockdep complains about recursive deadlock of zram->init_lock.
[1] made it false positive because we can't request IO to zram
before setting disksize. Anyway, we should shut lockdep up to
avoid many reporting from user.

Cc: Jerome Marchand
Cc: Nitin Gupta
Signed-off-by: Minchan Kim
---
 drivers/staging/zram/zram_drv.c   | 115 +++--
 drivers/staging/zram/zram_drv.h   |  12 +++-
 drivers/staging/zram/zram_sysfs.c |  10 +++-
 3 files changed, 79 insertions(+), 58 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index e95e37c..1f6938a 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -462,19 +462,12 @@ error:
 void __zram_reset_device(struct zram *zram)
 {
 	size_t index;
+	struct zram_meta meta;
 
 	if (!zram->init_done)
 		goto out;
 
 	zram->init_done = 0;
-
-	/* Free various per-device buffers */
-	kfree(zram->compress_workmem);
-	free_pages((unsigned long)zram->compress_buffer, 1);
-
-	zram->compress_workmem = NULL;
-	zram->compress_buffer = NULL;
-
 	/* Free all pages that are still in this zram device */
 	for (index = 0; index < zram->disksize >> PAGE_SHIFT; index++) {
 		unsigned long handle = zram->table[index].handle;
@@ -484,11 +477,11 @@ void __zram_reset_device(struct zram *zram)
 		zs_free(zram->mem_pool, handle);
 	}
 
-	vfree(zram->table);
-	zram->table = NULL;
-
-	zs_destroy_pool(zram->mem_pool);
-	zram->mem_pool = NULL;
+	meta.compress_workmem = zram->compress_workmem;
+	meta.compress_buffer = zram->compress_buffer;
+	meta.table = zram->table;
+	meta.mem_pool = zram->mem_pool;
+	zram_meta_free(&meta);
 
 	/* Reset stats */
 	memset(&zram->stats, 0, sizeof(zram->stats));
@@ -505,12 +498,59 @@ void zram_reset_device(struct zram *zram)
 	up_write(&zram->init_lock);
 }
 
-/* zram->init_lock should be held */
-int zram_init_device(struct zram *zram)
+void zram_meta_free(struct zram_meta *meta)
+{
+	zs_destroy_pool(meta->mem_pool);
+	kfree(meta->compress_workmem);
+	free_pages((unsigned long)meta->compress_buffer, 1);
+	vfree(meta->table);
+	kfree(meta);
+}
+
+int zram_meta_alloc(struct zram_meta *meta, u64 disksize)
 {
-	int ret;
 	size_t num_pages;
 
+	meta->compress_workmem = kzalloc(LZO1X_MEM_COMPRESS, GFP_KERNEL);
+	if (!meta->compress_workmem) {
+		pr_err("Error allocating compressor working memory!\n");
+		goto out;
+	}
+
+	meta->compress_buffer =
+		(void *)__get_free_pages(GFP_KERNEL|__GFP_ZERO, 1);
+	if (!meta->compress_buffer) {
+		pr_err("Error allocating compressor buffer space\n");
+		goto free_workmem;
+	}
+
+	num_pages = disksize >> PAGE_SHIFT;
+	meta->table = vzalloc(num_pages * sizeof(*meta->table));
+	if (!meta->table) {
+		pr_err("Error allocating zram address table\n");
+		goto free_buffer;
+	}
+
+	meta->mem_pool = zs_create_pool("zram", GFP_NOIO | __GFP_HIGHMEM);
+	if (!meta->mem_pool) {
+		pr_err("Error creating memory pool\n");
+		goto free_table;
+	}
+
+	return 0;
+
+free_table:
+	vfree(meta->table);
+free_buffer:
+	free_pages((unsigned long)meta->compress_buffer, 1);
+free_workmem:
+	kfree(meta->compress_workmem);
+out:
+	return -ENOMEM;
+}
+
+void zram_init_device(struct zram *zram, struct zram_meta *meta)
+{
 	if (zram->disksize > 2 * (totalram_pages << PAGE_SHIFT)) {
 		pr_info(
 		"There is little point creating a zram of greater than "
@@ -525,51 +565,16 @@ int zram_init_device(struct zram *zram)
 		);
 	}
 
-	zram->compress_workmem = kzalloc(LZO1X_MEM_COMPRESS, GFP_KERNEL);
-	if (!zram->compress_workmem) {
-		pr_err("Error allocating compressor working memory!\n");
-		ret = -ENOMEM;
-		goto fail_no_table;
-	}
-
-	zram->compress_buffer =
-		(void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
-	if (!zram->compress_buffer) {
-		pr_err("Error allocating compressor buffer space\n");
-		ret = -ENOMEM;
-		goto fail_no_table;
-	}
-
-	num_pages = zram->disksize >> PAGE_SHIFT;
-	zram->table = vzalloc(num_pages * sizeof(*zram->table));
-	if (!zram->table) {
-		pr_err("Error allocating zram address table\n");
-		ret = -ENOMEM;
-		goto fail_no_table;
-	}
-
 	/* zram devices sort of resembles non-rotational disks */
 	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, zram->disk->queue);
 
-	zram->mem_pool = zs_create_pool("zram", GFP_NOIO | __GFP_HIGHMEM);
-	if
[PATCH v3 1/4] zram: force disksize setting before using zram
Now zram document says "set disksize is optional" but partly it's wrong.
When you try to use zram firstly after booting, you must set disksize,
otherwise zram can't work because zram gendisk's size is 0. But once you
do it, you can use zram freely after reset because reset doesn't reset to
zero paradoxically. So in this time, disksize setting is optional.:(
It's inconsistent for user behavior and not straightforward.

This patch forces always setting disksize firstly before using zram.
Yes. It changes current behavior so someone could complain when
he upgrades zram. Apparently it could be a problem if zram is mainline
but it still lives in staging so behavior could be changed for right
way to go. Let them excuse.

Cc: Nitin Gupta
Acked-by: Dan Magenheimer
Signed-off-by: Minchan Kim
---
 drivers/staging/zram/zram.txt | 27 +--
 drivers/staging/zram/zram_drv.c | 52 ++---
 drivers/staging/zram/zram_drv.h |5 +---
 drivers/staging/zram/zram_sysfs.c |6 +
 4 files changed, 35 insertions(+), 55 deletions(-)

diff --git a/drivers/staging/zram/zram.txt b/drivers/staging/zram/zram.txt
index 5f75d29..765d790 100644
--- a/drivers/staging/zram/zram.txt
+++ b/drivers/staging/zram/zram.txt
@@ -23,17 +23,17 @@ Following shows a typical sequence of steps for using zram.
 	This creates 4 devices: /dev/zram{0,1,2,3}
 	(num_devices parameter is optional. Default: 1)
 
-2) Set Disksize (Optional):
-	Set disk size by writing the value to sysfs node 'disksize'
-	(in bytes). If disksize is not given, default value of 25%
-	of RAM is used.
-
-	# Initialize /dev/zram0 with 50MB disksize
-	echo $((50*1024*1024)) > /sys/block/zram0/disksize
-
-	NOTE: disksize cannot be changed if the disk contains any
-	data. So, for such a disk, you need to issue 'reset' (see below)
-	before you can change its disksize.
+2) Set Disksize
+        Set disk size by writing the value to sysfs node 'disksize'.
+        The value can be either in bytes or you can use mem suffixes.
+        Examples:
+            # Initialize /dev/zram0 with 50MB disksize
+            echo $((50*1024*1024)) > /sys/block/zram0/disksize
+
+            # Using mem suffixes
+            echo 256K > /sys/block/zram0/disksize
+            echo 512M > /sys/block/zram0/disksize
+            echo 1G > /sys/block/zram0/disksize
 
 3) Activate:
 	mkswap /dev/zram0
@@ -65,8 +65,9 @@ Following shows a typical sequence of steps for using zram.
 	echo 1 > /sys/block/zram0/reset
 	echo 1 > /sys/block/zram1/reset
 
-	(This frees all the memory allocated for the given device).
-
+	This frees all the memory allocated for the given device and
+	resets the disksize to zero. You must set the disksize again
+	before reusing the device.
 
 Please report any problems at:
 	- Mailing list: linux-mm-cc at laptop dot org
diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index 61fb8f1..1d45401 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -94,34 +94,6 @@ static int page_zero_filled(void *ptr)
 	return 1;
 }
 
-static void zram_set_disksize(struct zram *zram, size_t totalram_bytes)
-{
-	if (!zram->disksize) {
-		pr_info(
-		"disk size not provided. You can use disksize_kb module "
-		"param to specify size.\nUsing default: (%u%% of RAM).\n",
-		default_disksize_perc_ram
-		);
-		zram->disksize = default_disksize_perc_ram *
-					(totalram_bytes / 100);
-	}
-
-	if (zram->disksize > 2 * (totalram_bytes)) {
-		pr_info(
-		"There is little point creating a zram of greater than "
-		"twice the size of memory since we expect a 2:1 compression "
-		"ratio. Note that zram uses about 0.1%% of the size of "
-		"the disk when not in use so a huge zram is "
-		"wasteful.\n"
-		"\tMemory Size: %zu kB\n"
-		"\tSize you selected: %llu kB\n"
-		"Continuing anyway ...\n",
-		totalram_bytes >> 10, zram->disksize >> 10);
-	}
-
-	zram->disksize &= PAGE_MASK;
-}
-
 static void zram_free_page(struct zram *zram, size_t index)
 {
 	unsigned long handle = zram->table[index].handle;
@@ -495,6 +467,9 @@ void __zram_reset_device(struct zram *zram)
 {
 	size_t index;
 
+	if (!zram->init_done)
+		goto out;
+
 	zram->init_done = 0;
 
 	/* Free various per-device buffers */
@@ -522,7 +497,9 @@ void __zram_reset_device(struct zram *zram)
 	/* Reset stats */
 	memset(&zram->stats, 0, sizeof(zram->stats));
 
+out:
 	zram->disksize = 0;
+	set_capacity(zram->disk, 0);
 }
 
 void zram_reset_device(struct zram *zram)
@@ -544,7 +521,19 @@ int zram_init_device(struct
[PATCH v3 4/4] zram: Fix deadlock bug in partial write
Now zram allocates new page with GFP_KERNEL in zram I/O path
if IO is partial. Unfortunately, it may cause deadlock with
reclaim path so this patch solves the problem.

Cc: Nitin Gupta
Cc: Jerome Marchand
Signed-off-by: Minchan Kim
---

We could use GFP_IO instead of GFP_ATOMIC in zram_bvec_read with
some modification related to buffer allocation in case of partial IO.
But it needs more churn and prevent merge this patch into stable
if we should send this to stable so I'd like to keep it as simple
as possible. GFP_IO usage could be separate patch after we merge it.
Thanks.

 drivers/staging/zram/zram_drv.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index 1f6938a..e00397f 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -192,7 +192,7 @@ static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
 	user_mem = kmap_atomic(page);
 	if (is_partial_io(bvec))
 		/* Use a temporary buffer to decompress the page */
-		uncmem = kmalloc(PAGE_SIZE, GFP_KERNEL);
+		uncmem = kmalloc(PAGE_SIZE, GFP_ATOMIC);
 	else
 		uncmem = user_mem;
 
@@ -240,7 +240,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 		 * This is a partial IO. We need to read the full page
 		 * before to write the changes.
 		 */
-		uncmem = kmalloc(PAGE_SIZE, GFP_KERNEL);
+		uncmem = kmalloc(PAGE_SIZE, GFP_NOIO);
 		if (!uncmem) {
 			pr_info("Error allocating temp memory!\n");
 			ret = -ENOMEM;
--
1.7.9.5
[PATCH v3 2/4] zram: give up lazy initialization of zram metadata
1) User of zram normally do mkfs.xxx or mkswap before using
   the zram block device(ex, normally, do it at booting time)
   It ends up allocating such metadata of zram before real usage so
   benefit of lazy initialization would be mitigated.

2) Some user want to use zram when memory pressure is high.(ie, load zram
   dynamically, NOT booting time). It does make sense because people don't
   want to waste memory until memory pressure is high(ie, where zram is really
   helpful time). In this case, lazy initialization could be failed easily
   because we will use GFP_NOIO instead of GFP_KERNEL for avoiding deadlock.
   So the benefit of lazy initialization would be mitigated, too.

3) Metadata overhead is not critical and Nitin has a plan to diet it.
   4K : 12 byte(64bit machine) -> 64G : 192M so 0.3% isn't big overhead
   If insane user use such big zram device up to 20, it could consume 6% of ram
   but efficiency of zram will cover the waste.

So this patch gives up lazy initialization and instead we initialize metadata
at disksize setting time.

Cc: Nitin Gupta
Signed-off-by: Minchan Kim
---
 drivers/staging/zram/zram_drv.c | 20
 drivers/staging/zram/zram_sysfs.c |1 +
 2 files changed, 5 insertions(+), 16 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index 1d45401..e95e37c 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -440,16 +440,13 @@ static void zram_make_request(struct request_queue *queue, struct bio *bio)
 {
 	struct zram *zram = queue->queuedata;
 
-	if (unlikely(!zram->init_done) && zram_init_device(zram))
-		goto error;
-
 	down_read(&zram->init_lock);
 	if (unlikely(!zram->init_done))
-		goto error_unlock;
+		goto error;
 
 	if (!valid_io_request(zram, bio)) {
 		zram_stat64_inc(zram, &zram->stats.invalid_io);
-		goto error_unlock;
+		goto error;
 	}
 
 	__zram_make_request(zram, bio, bio_data_dir(bio));
@@ -457,9 +454,8 @@ static void zram_make_request(struct request_queue *queue, struct bio *bio)
 
 	return;
 
-error_unlock:
-	up_read(&zram->init_lock);
 error:
+	up_read(&zram->init_lock);
 	bio_io_error(bio);
 }
 
@@ -509,18 +505,12 @@ void zram_reset_device(struct zram *zram)
 	up_write(&zram->init_lock);
 }
 
+/* zram->init_lock should be held */
 int zram_init_device(struct zram *zram)
 {
 	int ret;
 	size_t num_pages;
 
-	down_write(&zram->init_lock);
-
-	if (zram->init_done) {
-		up_write(&zram->init_lock);
-		return 0;
-	}
-
 	if (zram->disksize > 2 * (totalram_pages << PAGE_SHIFT)) {
 		pr_info(
 		"There is little point creating a zram of greater than "
@@ -569,7 +559,6 @@ int zram_init_device(struct zram *zram)
 	}
 
 	zram->init_done = 1;
-	up_write(&zram->init_lock);
 
 	pr_debug("Initialization done!\n");
 	return 0;
@@ -579,7 +568,6 @@ fail_no_table:
 	zram->disksize = 0;
 fail:
 	__zram_reset_device(zram);
-	up_write(&zram->init_lock);
 	pr_err("Initialization failed: err=%d\n", ret);
 	return ret;
 }
diff --git a/drivers/staging/zram/zram_sysfs.c b/drivers/staging/zram/zram_sysfs.c
index 4143af9..369db12 100644
--- a/drivers/staging/zram/zram_sysfs.c
+++ b/drivers/staging/zram/zram_sysfs.c
@@ -71,6 +71,7 @@ static ssize_t disksize_store(struct device *dev,
 
 	zram->disksize = PAGE_ALIGN(disksize);
 	set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT);
+	zram_init_device(zram);
 	up_write(&zram->init_lock);
 
 	return len;
--
1.7.9.5
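The overhead estimate in point 3 of the commit message above can be checked with a little arithmetic. The sketch below assumes 4 KiB pages and the 12-byte table entry quoted in the message; `TOY_PAGE_SHIFT`, `TOY_ENTRY_SIZE` and both functions are hypothetical names for illustration, and only the address table is counted, not the compression buffers.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define TOY_ENTRY_SIZE 12	/* bytes per table entry, as quoted in the message */

/* Bytes of table metadata needed for a device of 'disksize' bytes. */
static uint64_t toy_table_bytes(uint64_t disksize)
{
	return (disksize >> TOY_PAGE_SHIFT) * TOY_ENTRY_SIZE;
}

/* Table overhead in hundredths of a percent of disksize (29 means 0.29%). */
static uint64_t toy_overhead_bp(uint64_t disksize)
{
	return toy_table_bytes(disksize) * 10000 / disksize;
}
```

For a 64 GiB device this gives 16M entries of 12 bytes each, i.e. 192 MiB, or about 0.29% of the device size, which matches the "64G : 192M so 0.3%" figure in the message.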
Re: [PATCH 3/3] zram: get rid of lockdep warning
On Fri, Jan 18, 2013 at 01:34:18PM -0800, Nitin Gupta wrote: > On Wed, Jan 16, 2013 at 6:12 PM, Minchan Kim wrote: > > Lockdep complains about recursive deadlock of zram->init_lock. > > [1] made it false positive because we can't request IO to zram > > before setting disksize. Anyway, we should shut lockdep up to > > avoid many reporting from user. > > > > This patch allocates zram's metadata out of lock so we can fix it. > > In addition, this patch replace GFP_KERNEL with GFP_NOIO/GFP_ATOMIC > > in request handle path for partial I/O. > > > > [1] zram: give up lazy initialization of zram metadata > > > > Signed-off-by: Minchan Kim > > --- > > drivers/staging/zram/zram_drv.c | 194 > > +++-- > > drivers/staging/zram/zram_drv.h | 12 ++- > > drivers/staging/zram/zram_sysfs.c | 13 ++- > > 3 files changed, 118 insertions(+), 101 deletions(-) > > > > diff --git a/drivers/staging/zram/zram_drv.c > > b/drivers/staging/zram/zram_drv.c > > index 3693780..eb1bc37 100644 > > --- a/drivers/staging/zram/zram_drv.c > > +++ b/drivers/staging/zram/zram_drv.c > > @@ -71,22 +71,22 @@ static void zram_stat64_inc(struct zram *zram, u64 *v) > > zram_stat64_add(zram, v, 1); > > } > > > > -static int zram_test_flag(struct zram *zram, u32 index, > > +static int zram_test_flag(struct zram_meta *meta, u32 index, > > enum zram_pageflags flag) > > { > > - return zram->table[index].flags & BIT(flag); > > + return meta->table[index].flags & BIT(flag); > > } > > > > -static void zram_set_flag(struct zram *zram, u32 index, > > +static void zram_set_flag(struct zram_meta *meta, u32 index, > > enum zram_pageflags flag) > > { > > - zram->table[index].flags |= BIT(flag); > > + meta->table[index].flags |= BIT(flag); > > } > > > > -static void zram_clear_flag(struct zram *zram, u32 index, > > +static void zram_clear_flag(struct zram_meta *meta, u32 index, > > enum zram_pageflags flag) > > { > > - zram->table[index].flags &= ~BIT(flag); > > + meta->table[index].flags &= ~BIT(flag); > > } > > > > static 
int page_zero_filled(void *ptr) > > @@ -106,16 +106,17 @@ static int page_zero_filled(void *ptr) > > > > static void zram_free_page(struct zram *zram, size_t index) > > { > > - unsigned long handle = zram->table[index].handle; > > - u16 size = zram->table[index].size; > > + struct zram_meta *meta = zram->meta; > > + unsigned long handle = meta->table[index].handle; > > + u16 size = meta->table[index].size; > > > > if (unlikely(!handle)) { > > /* > > * No memory is allocated for zero filled pages. > > * Simply clear zero page flag. > > */ > > - if (zram_test_flag(zram, index, ZRAM_ZERO)) { > > - zram_clear_flag(zram, index, ZRAM_ZERO); > > + if (zram_test_flag(meta, index, ZRAM_ZERO)) { > > + zram_clear_flag(meta, index, ZRAM_ZERO); > > zram_stat_dec(>stats.pages_zero); > > } > > return; > > @@ -124,17 +125,17 @@ static void zram_free_page(struct zram *zram, size_t > > index) > > if (unlikely(size > max_zpage_size)) > > zram_stat_dec(>stats.bad_compress); > > > > - zs_free(zram->mem_pool, handle); > > + zs_free(meta->mem_pool, handle); > > > > if (size <= PAGE_SIZE / 2) > > zram_stat_dec(>stats.good_compress); > > > > zram_stat64_sub(zram, >stats.compr_size, > > - zram->table[index].size); > > + meta->table[index].size); > > zram_stat_dec(>stats.pages_stored); > > > > - zram->table[index].handle = 0; > > - zram->table[index].size = 0; > > + meta->table[index].handle = 0; > > + meta->table[index].size = 0; > > } > > > > static void handle_zero_page(struct bio_vec *bvec) > > @@ -159,20 +160,21 @@ static int zram_decompress_page(struct zram *zram, > > char *mem, u32 index) > > int ret = LZO_E_OK; > > size_t clen = PAGE_SIZE; > > unsigned char *cmem; > > - unsigned long handle = zram->table[index].handle; > > + struct zram_meta *meta = zram->meta; > > + unsigned long handle = meta->table[index].handle; > > > > - if (!handle || zram_test_flag(zram, index, ZRAM_ZERO)) { > > + if (!handle || zram_test_flag(meta, index, ZRAM_ZERO)) { > > memset(mem, 0, PAGE_SIZE); > > 
return 0; > > } > > > > - cmem = zs_map_object(zram->mem_pool, handle, ZS_MM_RO); > > - if (zram->table[index].size == PAGE_SIZE) > > + cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_RO); > > + if (meta->table[index].size == PAGE_SIZE) > > memcpy(mem, cmem, PAGE_SIZE); > > else > > - ret =
Re: [PATCH] dw_dmac: move soft LLP code from tasklet to dwc_scan_descriptors
On Fri, Jan 18, 2013 at 02:14:15PM +0200, Andy Shevchenko wrote:
> The proper place for the main logic of the soft LLP mode is
> dwc_scan_descriptors. It prevents to get the transfer unexpectedly aborted in
> case the user calls dwc_tx_status.
>
> Signed-off-by: Andy Shevchenko

Applied, Thanks

--
~Vinod
Re: [PATCH v4] dw_dmac: don't exceed AHB master number in dwc_get_data_width
On Thu, Jan 17, 2013 at 01:35:47PM +0530, Viresh Kumar wrote:
> On Thu, Jan 17, 2013 at 1:33 PM, Andy Shevchenko wrote:
> > The driver assumes that hardware has two AHB masters which might not be always
> > true. In such cases we must not exceed number of the AHB masters present in the
> > hardware. In the proposed scheme in this patch, we would choose the master with
> > highest possible number whenever we exceed max AHB masters.
> >
> > Signed-off-by: Andy Shevchenko
>
> Acked-by: Viresh Kumar

Applied, Thanks

--
~Vinod
Re: [PATCH v2] dw_dmac: allocate dma descriptors from DMA_COHERENT memory
On Wed, Jan 16, 2013 at 03:48:50PM +0200, Andy Shevchenko wrote:
> Currently descriptors are allocated from normal cacheable memory and that slows
> down filling the descriptors, as we need to call cache_coherency routines
> afterwards. It would be better to allocate memory for these descriptors from
> DMA_COHERENT memory. This would make code much cleaner too.
>
> Signed-off-by: Andy Shevchenko
> Tested-by: Mika Westerberg
> Acked-by: Viresh Kumar

Applied

Thanks

--
~Vinod
Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()
On 01/21/2013 12:38 PM, Mike Galbraith wrote:
> On Mon, 2013-01-21 at 10:50 +0800, Michael Wang wrote:
>> On 01/20/2013 12:09 PM, Mike Galbraith wrote:
>>> On Thu, 2013-01-17 at 13:55 +0800, Michael Wang wrote:
>>>> Hi, Mike
>>>>
>>>> I've send out the v2, which I suppose it will fix the below BUG and
>>>> perform better, please do let me know if it still cause issues on your
>>>> arm7 machine.
>>>
>>> s/arm7/aim7
>>>
>>> Someone swiped half of CPUs/ram, so the box is now 2 10 core nodes vs 4.
>>>
>>> stock scheduler knobs
>>>
>>>           3.8-wang-v2                          avg     3.8-virgin                           avg  vs wang
>>> Tasks     jobs/min
>>>     1     436.29    435.66    435.97    435.97    437.86    441.69    440.09    439.88    1.008
>>>     5    2361.65   2356.14   2350.66   2356.15   2416.27   2563.45   2374.61   2451.44    1.040
>>>    10    4767.90   4764.15   4779.18   4770.41   4946.94   4832.54   4828.69   4869.39    1.020
>>>    20    9672.79   9703.76   9380.80   9585.78   9634.34   9672.79   9727.13   9678.08    1.009
>>>    40   19162.06  19207.61  19299.36  19223.01  19268.68  19192.40  19056.60  19172.56     .997
>>>    80   37610.55  37465.22  37465.22  37513.66  37263.64  37120.98  37465.22  37283.28     .993
>>>   160   69306.65  69655.17  69257.14  69406.32  69257.14  69306.65  69257.14  69273.64     .998
>>>   320  111512.36 109066.37 111256.45 110611.72 108395.75 107913.19 108335.20 108214.71     .978
>>>   640  142850.83 148483.92 150851.81 147395.52 151974.92 151263.65 151322.67 151520.41    1.027
>>>  1280   52788.89  52706.39  67280.77  57592.01 189931.44 189745.60 189792.02 189823.02    3.295
>>>  2560   75403.91  52905.91  45196.21  57835.34 217368.64 217582.05 217551.54 217500.74    3.760
>>>
>>> sched_latency_ns = 24ms
>>> sched_min_granularity_ns = 8ms
>>> sched_wakeup_granularity_ns = 10ms
>>>
>>>           3.8-wang-v2                          avg     3.8-virgin                           avg  vs wang
>>> Tasks     jobs/min
>>>     1     436.29    436.60    434.72    435.87    434.41    439.77    438.81    437.66    1.004
>>>     5    2382.08   2393.36   2451.46   2408.96   2451.46   2453.44   2425.94   2443.61    1.014
>>>    10    5029.05   4887.10   5045.80   4987.31   4844.12   4828.69   4844.12   4838.97     .970
>>>    20    9869.71   9734.94   9758.45   9787.70   9513.34   9611.42   9565.90   9563.55     .977
>>>    40   19146.92  19146.92  19192.40  19162.08  18617.51  18603.22  18517.95  18579.56     .969
>>>    80   37177.91  37378.57  37292.31  37282.93  36451.13  36179.10  36233.18  36287.80     .973
>>>   160   70260.87  69109.05  69207.71  69525.87  68281.69  68522.97  68912.58  68572.41     .986
>>>   320  114745.56 113869.64 114474.62 114363.27 114137.73 114137.73 114137.73 114137.73     .998
>>>   640  164338.98 164338.98 164618.00 164431.98 164130.34 164130.34 164130.34 164130.34     .998
>>>  1280  209473.40 209134.54 209473.40 209360.44 210040.62 210040.62 210097.51 210059.58    1.003
>>>  2560  242703.38 242627.46 242779.34 242703.39 244001.26 243847.85 243732.91 243860.67    1.004
>>>
>>> As you can see, the load collapsed at the high load end with stock
>>> scheduler knobs (desktop latency). With knobs set to scale, the delta
>>> disappeared.
>>
>> Thanks for the testing, Mike, please allow me to ask few questions.
>>
>> What are those tasks actually doing? what's the workload?
>
> It's the canned aim7 compute load, mixed bag load weighted toward
> compute. Below is the workfile, should give you an idea.
>
> # @(#) workfile.compute:1.3 1/22/96 00:00:00
> # Compute Server Mix
> FILESIZE: 100K
> POOLSIZE: 250M
> 50 add_double
> 30 add_int
> 30 add_long
> 10 array_rtns
> 10 disk_cp
> 30 disk_rd
> 10 disk_src
> 20 disk_wrt
> 40 div_double
> 30 div_int
> 50 matrix_rtns
> 40 mem_rtns_1
> 40 mem_rtns_2
> 50 mul_double
> 30 mul_int
> 30 mul_long
> 40 new_raph
> 40 num_rtns_1
> 50 page_test
> 40 series_1
> 10 shared_memory
> 30 sieve
> 20 stream_pipe
> 30 string_rtns
> 40 trig_rtns
> 20 udp_test
>

That seems like the default one, could you please show me the numbers in
your datapoint file?

I'm not familiar with this benchmark, but I'd like to have a try on my
server, to make sure whether it is a generic issue.

>> And I'm confusing about how those new parameter value was figured out
>> and how could them help solve the possible issue?
>
> Oh, that's easy. I set sched_min_granularity_ns such that last_buddy
> kicks in when a third task arrives on a runqueue, and set
> sched_wakeup_granularity_ns near minimum that still allows wakeup
> preemption to occur. Combined effect is reduced over-scheduling.

That sounds very hard, to catch the timing,
Re: [PATCH] perf evsel: fix NULL pointer deference when evsel->counts is NULL
Hi Colin,

On Sat, 19 Jan 2013 16:36:54 +0000, Colin King wrote:
> From: Colin Ian King
>
> __perf_evsel__read_on_cpu() only bails out with -ENOMEM if
> evsel->counts is NULL and perf_evsel__alloc_counts() has returned
> an error. If perf_evsel__alloc_counts() does not return an error
> we get an NULL pointer deference on evsel->counts->cpu[cpu]
> if evsel->counts is NULL.

perf_evsel__alloc_counts() should allocate evsel->counts when it sees
evsel->counts is NULL and return negative error code if the allocation
fails. So I don't see any problem in current code.

With your code, it won't try to allocate if ->counts is NULL but overwrite
existing ->counts?

Thanks,
Namhyung

>
> Signed-off-by: Colin Ian King
> ---
>  tools/perf/util/evsel.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 1b16dd1..93acd06 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -640,7 +640,7 @@ int __perf_evsel__read_on_cpu(struct perf_evsel *evsel,
>  	if (FD(evsel, cpu, thread) < 0)
>  		return -EINVAL;
>
> -	if (evsel->counts == NULL && perf_evsel__alloc_counts(evsel, cpu + 1) < 0)
> +	if (evsel->counts == NULL || perf_evsel__alloc_counts(evsel, cpu + 1) < 0)
>  		return -ENOMEM;
>
>  	if (readn(FD(evsel, cpu, thread), &count, nv * sizeof(u64)) < 0)
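The difference between the two conditions discussed above comes down to short-circuit evaluation, which a small user-space model can make concrete. Everything here is hypothetical: `struct toy_evsel` and the `toy_*` functions are stand-ins, and the toy allocator is assumed to overwrite `counts` unconditionally, which may not match perf_evsel__alloc_counts().

```c
#include <stdlib.h>

struct toy_evsel {
	int *counts;		/* NULL until allocated */
	int alloc_calls;	/* how many times the allocator ran */
};

/* Toy allocator: (re)allocates counts; 0 on success, -1 on failure. */
static int toy_alloc_counts(struct toy_evsel *e, int ncpus)
{
	e->alloc_calls++;
	e->counts = calloc((size_t)ncpus, sizeof(*e->counts));
	return e->counts ? 0 : -1;
}

/* Existing check: '&&' runs the allocator only when counts is NULL. */
static int toy_read_and(struct toy_evsel *e, int cpu)
{
	if (e->counts == NULL && toy_alloc_counts(e, cpu + 1) < 0)
		return -1;
	return 0;
}

/*
 * Proposed check: '||' fails at once when counts is NULL (the allocator
 * is never tried) and re-runs the allocator when counts already exists.
 */
static int toy_read_or(struct toy_evsel *e, int cpu)
{
	if (e->counts == NULL || toy_alloc_counts(e, cpu + 1) < 0)
		return -1;
	return 0;
}
```

In this model the `&&` form allocates exactly once and then leaves `counts` alone, while the `||` form never allocates for a fresh evsel and replaces an existing buffer on every call, which is the behaviour the reply questions.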
linux-next: manual merge of the samsung tree with the gpio-lw tree
Hi Kukjin,

Today's linux-next merge of the samsung tree got a conflict in
drivers/gpio/gpio-samsung.c between commit 6948ce588bd7 ("gpio: samsung:
skip gpio lib registration for EXYNOS5440") from the gpio-lw tree and
commit bda7f6d4e198 ("gpio: samsung: skip gpiolib registration if pinctrl
support is enabled for exynos5250") from the samsung tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

--
Cheers,
Stephen Rothwell s...@canb.auug.org.au

diff --cc drivers/gpio/gpio-samsung.c
index 76be7ee,0d46db6..000
--- a/drivers/gpio/gpio-samsung.c
+++ b/drivers/gpio/gpio-samsung.c
@@@ -3025,7 -3025,7 +3024,8 @@@ static __init int samsung_gpiolib_init(
      static const struct of_device_id exynos_pinctrl_ids[] = {
          { .compatible = "samsung,pinctrl-exynos4210", },
          { .compatible = "samsung,pinctrl-exynos4x12", },
+         { .compatible = "samsung,pinctrl-exynos5250", },
+         { .compatible = "samsung,pinctrl-exynos5440", },
      };
      for_each_matching_node(pctrl_np, exynos_pinctrl_ids)
          if (pctrl_np && of_device_is_available(pctrl_np))
Re: [PATCH 5/6] OF: Introduce Device Tree resolve support.
On Fri, Jan 04, 2013 at 09:31:09PM +0200, Pantelis Antoniou wrote:
> Introduce support for dynamic device tree resolution.
> Using it, it is possible to prepare a device tree that's
> been loaded on runtime to be modified and inserted at the kernel
> live tree.
>
> Signed-off-by: Pantelis Antoniou
> ---
>  .../devicetree/dynamic-resolution-notes.txt |  25 ++
>  drivers/of/Kconfig                          |   9 +
>  drivers/of/Makefile                         |   1 +
>  drivers/of/resolver.c                       | 394 +
>  include/linux/of.h                          |  17 +
>  5 files changed, 446 insertions(+)
>  create mode 100644 Documentation/devicetree/dynamic-resolution-notes.txt
>  create mode 100644 drivers/of/resolver.c
>
> diff --git a/Documentation/devicetree/dynamic-resolution-notes.txt b/Documentation/devicetree/dynamic-resolution-notes.txt
> new file mode 100644
> index 000..0b396c4
> --- /dev/null
> +++ b/Documentation/devicetree/dynamic-resolution-notes.txt
> @@ -0,0 +1,25 @@
> +Device Tree Dynamic Resolver Notes
> +--
> +
> +This document describes the implementation of the in-kernel
> +Device Tree resolver, residing in drivers/of/resolver.c and is a
> +companion document to Documentation/devicetree/dt-object-internal.txt[1]
> +
> +How the resolver works
> +--
> +
> +The resolver is given as an input an arbitrary tree compiled with the
> +proper dtc option and having a /plugin/ tag. This generates the
> +appropriate __fixups__ & __local_fixups__ nodes as described in [1].
> +
> +In sequence the resolver works by the following steps:
> +
> +1. Get the maximum device tree phandle value from the live tree + 1.
> +2. Adjust all the local phandles of the tree to resolve by that amount.
> +3. Using the __local__fixups__ node information adjust all local references
> +   by the same amount.
> +4. For each property in the __fixups__ node locate the node it references
> +   in the live tree. This is the label used to tag the node.
> +5. Retrieve the phandle of the target of the fixup.
> +5. For each fixup in the property locate the node:property:offset location
> +   and replace it with the phandle value.

Hrm. So, I'm really still not convinced by this approach.

First, I think it's unwise to allow overlays to change essentially
anything in the base tree, rather than having the base tree define
sockets of some sort where things can be attached.

Second, even allowing overlays to change anything, I don't see a lot
of reason to do this kind of resolution within the kernel and with
data stored in the dtb itself, rather than doing the resolution in
userspace from an annotated overlay dts or dtb, then inserting the
fully resolved product into the kernel. In either case, the overlay
needs to be constructed with pretty intimate knowledge of the base
tree.

That said, I have some implementation comments below.

[snip]
> +/**
> + * Find a subtree's maximum phandle value.
> + */
> +static phandle __of_get_tree_max_phandle(struct device_node *node,
> +        phandle max_phandle)
> +{
> +    struct device_node *child;
> +
> +    if (node->phandle != 0 && node->phandle != OF_PHANDLE_ILLEGAL &&
> +            node->phandle > max_phandle)
> +        max_phandle = node->phandle;
> +
> +    __for_each_child_of_node(node, child)
> +        max_phandle = __of_get_tree_max_phandle(child, max_phandle);

Recursion is best avoided given the kernel's limited stack space.
This is also trivial to implement non-recursively, using the allnext
pointer.

> +
> +    return max_phandle;
> +}
> +
> +/**
> + * Find live tree's maximum phandle value.
> + */
> +static phandle of_get_tree_max_phandle(void)
> +{
> +    struct device_node *node;
> +    phandle phandle;
> +
> +    /* get root node */
> +    node = of_find_node_by_path("/");
> +    if (node == NULL)
> +        return OF_PHANDLE_ILLEGAL;
> +
> +    /* now search recursively */
> +    read_lock(_lock);
> +    phandle = __of_get_tree_max_phandle(node, 0);
> +    read_unlock(_lock);
> +
> +    of_node_put(node);
> +
> +    return phandle;
> +}
> +
> +/**
> + * Adjust a subtree's phandle values by a given delta.
> + * Makes sure not to just adjust the device node's phandle value,
> + * but modify the phandle properties values as well.
> + */
> +static void __of_adjust_tree_phandles(struct device_node *node,
> +        int phandle_delta)
> +{
> +    struct device_node *child;
> +    struct property *prop;
> +    phandle phandle;
> +
> +    /* first adjust the node's phandle direct value */
> +    if (node->phandle != 0 && node->phandle != OF_PHANDLE_ILLEGAL)
> +        node->phandle += phandle_delta;

You need to have some kind of check for overflow here, or the adjusted
phandle could be one of the illegal values (0 or -1) - or wrap around
and collide with existing phandle values
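The two review points above (walk the nodes without recursion, and guard the phandle adjustment against overflow) could look roughly like the following userspace sketch. Everything here is illustrative: `struct node_sketch`, `PHANDLE_MINUS_ONE` and both helpers are hypothetical, the delta is taken as unsigned for simplicity (the patch uses an int), and the sketch only assumes that nodes are reachable through a flat `allnext`-style link as the review suggests.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t phandle_t;

/* The review names 0 and -1 as illegal phandle values; this is the -1 case. */
#define PHANDLE_MINUS_ONE 0xffffffffu

/* Hypothetical node: only a phandle and a flat "all nodes" link. */
struct node_sketch {
	phandle_t phandle;
	struct node_sketch *allnext;
};

/* Iterative maximum over the flat list, so stack use stays constant. */
static phandle_t max_phandle(const struct node_sketch *head)
{
	phandle_t max = 0;

	for (const struct node_sketch *n = head; n != NULL; n = n->allnext)
		if (n->phandle != 0 && n->phandle != PHANDLE_MINUS_ONE &&
		    n->phandle > max)
			max = n->phandle;
	return max;
}

/*
 * Stores old + delta in *out and returns 0, or returns -1 when the
 * result would wrap around or land on an illegal value.
 */
static int adjust_phandle(phandle_t old, phandle_t delta, phandle_t *out)
{
	if (old == 0 || old == PHANDLE_MINUS_ONE)
		return -1;			/* nothing valid to adjust */
	if (delta >= PHANDLE_MINUS_ONE - old)
		return -1;			/* would reach 0xffffffff or wrap */
	*out = old + delta;
	return 0;
}
```

A caller would compute the delta once from `max_phandle()` and refuse the whole overlay if any single adjustment fails, rather than leaving the tree half-adjusted.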
Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()
On Mon, 2013-01-21 at 10:50 +0800, Michael Wang wrote:
> On 01/20/2013 12:09 PM, Mike Galbraith wrote:
> > On Thu, 2013-01-17 at 13:55 +0800, Michael Wang wrote:
> >> Hi, Mike
> >>
> >> I've send out the v2, which I suppose it will fix the below BUG and
> >> perform better, please do let me know if it still cause issues on your
> >> arm7 machine.
> >
> > s/arm7/aim7
> >
> > Someone swiped half of CPUs/ram, so the box is now 2 10 core nodes vs 4.
> >
> > stock scheduler knobs
> >
> >                            3.8-wang-v2        avg                       3.8-virgin        avg    vs wang
> > Tasks   jobs/min
> >     1     436.29     435.66     435.97     435.97     437.86     441.69     440.09     439.88      1.008
> >     5    2361.65    2356.14    2350.66    2356.15    2416.27    2563.45    2374.61    2451.44      1.040
> >    10    4767.90    4764.15    4779.18    4770.41    4946.94    4832.54    4828.69    4869.39      1.020
> >    20    9672.79    9703.76    9380.80    9585.78    9634.34    9672.79    9727.13    9678.08      1.009
> >    40   19162.06   19207.61   19299.36   19223.01   19268.68   19192.40   19056.60   19172.56       .997
> >    80   37610.55   37465.22   37465.22   37513.66   37263.64   37120.98   37465.22   37283.28       .993
> >   160   69306.65   69655.17   69257.14   69406.32   69257.14   69306.65   69257.14   69273.64       .998
> >   320  111512.36  109066.37  111256.45  110611.72  108395.75  107913.19  108335.20  108214.71       .978
> >   640  142850.83  148483.92  150851.81  147395.52  151974.92  151263.65  151322.67  151520.41      1.027
> >  1280   52788.89   52706.39   67280.77   57592.01  189931.44  189745.60  189792.02  189823.02      3.295
> >  2560   75403.91   52905.91   45196.21   57835.34  217368.64  217582.05  217551.54  217500.74      3.760
> >
> > sched_latency_ns = 24ms
> > sched_min_granularity_ns = 8ms
> > sched_wakeup_granularity_ns = 10ms
> >
> >                            3.8-wang-v2        avg                       3.8-virgin        avg    vs wang
> > Tasks   jobs/min
> >     1     436.29     436.60     434.72     435.87     434.41     439.77     438.81     437.66      1.004
> >     5    2382.08    2393.36    2451.46    2408.96    2451.46    2453.44    2425.94    2443.61      1.014
> >    10    5029.05    4887.10    5045.80    4987.31    4844.12    4828.69    4844.12    4838.97       .970
> >    20    9869.71    9734.94    9758.45    9787.70    9513.34    9611.42    9565.90    9563.55       .977
> >    40   19146.92   19146.92   19192.40   19162.08   18617.51   18603.22   18517.95   18579.56       .969
> >    80   37177.91   37378.57   37292.31   37282.93   36451.13   36179.10   36233.18   36287.80       .973
> >   160   70260.87   69109.05   69207.71   69525.87   68281.69   68522.97   68912.58   68572.41       .986
> >   320  114745.56  113869.64  114474.62  114363.27  114137.73  114137.73  114137.73  114137.73       .998
> >   640  164338.98  164338.98  164618.00  164431.98  164130.34  164130.34  164130.34  164130.34       .998
> >  1280  209473.40  209134.54  209473.40  209360.44  210040.62  210040.62  210097.51  210059.58      1.003
> >  2560  242703.38  242627.46  242779.34  242703.39  244001.26  243847.85  243732.91  243860.67      1.004
> >
> > As you can see, the load collapsed at the high load end with stock
> > scheduler knobs (desktop latency). With knobs set to scale, the delta
> > disappeared.
>
> Thanks for the testing, Mike, please allow me to ask few questions.
>
> What are those tasks actually doing? what's the workload?

It's the canned aim7 compute load, mixed bag load weighted toward
compute. Below is the workfile, should give you an idea.

# @(#) workfile.compute:1.3 1/22/96 00:00:00
# Compute Server Mix
FILESIZE: 100K
POOLSIZE: 250M
50 add_double
30 add_int
30 add_long
10 array_rtns
10 disk_cp
30 disk_rd
10 disk_src
20 disk_wrt
40 div_double
30 div_int
50 matrix_rtns
40 mem_rtns_1
40 mem_rtns_2
50 mul_double
30 mul_int
30 mul_long
40 new_raph
40 num_rtns_1
50 page_test
40 series_1
10 shared_memory
30 sieve
20 stream_pipe
30 string_rtns
40 trig_rtns
20 udp_test

> And I'm confusing about how those new parameter value was figured out
> and how could them help solve the possible issue?

Oh, that's easy. I set sched_min_granularity_ns such that last_buddy
kicks in when a third task arrives on a runqueue, and set
sched_wakeup_granularity_ns near minimum that still allows wakeup
preemption to occur. Combined effect is reduced over-scheduling.

> Do you have any idea about which part in this patch set may cause the issue?

Nope, I'm as puzzled by that as you are. When the box had 40 cores,
both virgin and patched showed over-scheduling effects, but not like
this. With 20 cores, symptoms changed in a most puzzling way, and I
don't see how you'd be directly responsible.

> One change by designed is that, for old logical, if it's
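The "third task" remark in the reply can be checked with a little arithmetic. The sketch below rests on an assumption that is not stated in the thread: that CFS derives a task-count threshold as the latency target divided by the minimum granularity (rounded up), and only considers buddy hints once a runqueue holds at least that many tasks. The helper names are hypothetical and this is userspace code, not the scheduler.

```c
/* Ceiling division, the same shape as the kernel's DIV_ROUND_UP() macro. */
static unsigned int div_round_up(unsigned int n, unsigned int d)
{
	return (n + d - 1) / d;
}

/* Assumed model: threshold = latency / min_granularity, rounded up. */
static unsigned int nr_latency(unsigned int latency_ns, unsigned int min_gran_ns)
{
	return div_round_up(latency_ns, min_gran_ns);
}

/* Assumed model: buddy hinting applies once nr_running reaches the threshold. */
static int buddy_hint_active(unsigned int nr_running,
			     unsigned int latency_ns, unsigned int min_gran_ns)
{
	return nr_running >= nr_latency(latency_ns, min_gran_ns);
}
```

Under that model, the quoted settings of 24ms latency and 8ms minimum granularity give a threshold of 3, which matches the "third task" description; a smaller granularity (3ms, a purely hypothetical value) would push the threshold to 8.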
Re: [patch] module: potential deadlock in error path
On Sun, Jan 20, 2013 at 7:52 PM, Rusty Russell wrote:
>
> You've now conflated two completely different lock paths into a single
> unlock.

We have that elsewhere too. And it's what we used to have before too.

So the simple fact is that commit 1fb9341ac348 just introduced this
bug, and moving the goto target around is the obvious fix for it, and
makes it match the old code that was simply incorrectly modified.

The suggested patch instead has *some* cleanup inside the
if-statement, and some at the goto target. That makes no sense to
humans, and just makes it harder for the compiler to generate better
code.

> mutex_bug_cleanup() should really lock internally, but doesn't
> so we wrap it. And that mutex_unlock of yours has nothing to do with
> cleaning up ddebug, so the labels misnamed, at best.

Bah, humbug. It's called "ddebug_cleanup" because it's called after
the debug setup, so it needs to clean up the state set up by that. The
fact that it needs to unlock is secondary, and is simply because the
lock is taken at that point, so needs to be released.

The naming is not wonderful, but it's not hugely illogical, and again,
that's what it used to be (except "ddebug" has been renamed to
"ddebug_cleanup"). You could rename it if you want to (we used to have
a target called "unlock" at that point), but that's *still* no excuse
for just creating code that does cleanup in two totally unrelated
places.

> Not that it matters much: this is going to change for next merge window.

Now, agreed, that looks better, although I suspect you could have
taken the "split that ugly function up" further still.

Linus
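The shape being argued for, a single unlock at the goto target that serves both the "error while the lock is held" path and the "retake the lock for cleanup" path, can be sketched in userspace as follows. This is an illustration of the pattern only: `struct toy_lock`, `load_sketch()` and its failure flags are hypothetical and stand in for the mutex and the module loader.

```c
/* Toy lock that only counts depth, so a caller can check it stays balanced. */
struct toy_lock {
	int depth;
};

static void toy_lock_acquire(struct toy_lock *l) { l->depth++; }
static void toy_lock_release(struct toy_lock *l) { l->depth--; }

/*
 * fail_early: an error occurs while the lock is already held (the goto case).
 * fail_late:  an error occurs after the lock was dropped, so the unwind
 *             path has to retake it for cleanup that needs the lock.
 */
static int load_sketch(struct toy_lock *l, int fail_early, int fail_late)
{
	int err = 0;

	toy_lock_acquire(l);
	if (fail_early) {
		err = -1;
		goto cleanup_locked;	/* lock is still held here */
	}
	toy_lock_release(l);

	if (!fail_late)
		return 0;

	err = -2;
	toy_lock_acquire(l);		/* retake for cleanup that needs the lock */
	/* ... cleanup that must run under the lock ... */
cleanup_locked:
	toy_lock_release(l);		/* one unlock serves both paths */
	/* ... cleanup that does not need the lock ... */
	return err;
}
```

Placing the unlock before the label instead would release the lock only on the late-failure path and leave it held on the early one, which is the kind of imbalance the thread is about.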
Re: linux-next: manual merge of the security tree with Linus' tree
Hi Mimi,

On Sun, 20 Jan 2013 22:10:23 -0500 Mimi Zohar wrote:
>
> Sorry Stephen, the merged result should look like what's contained in
> linux-integrity/next-upstreamed-patches:
>
> int ima_module_check(struct file *file)
> {
>     if (!file) {
>         if ((ima_appraise & IMA_APPRAISE_MODULES) &&
>             (ima_appraise & IMA_APPRAISE_ENFORCE)) {
> #ifndef CONFIG_MODULE_SIG_FORCE
>             return -EACCES; /* INTEGRITY_UNKNOWN */
> #endif
>         }
>         return 0;
>     }
>     return process_measurement(file, file->f_dentry->d_name.name,
>                                MAY_EXEC, MODULE_CHECK);
> }

OK, I will use that version tomorrow.

--
Cheers,
Stephen Rothwell s...@canb.auug.org.au
linux-next: build failure after merge of the usb tree
Hi Greg,

After merging the usb tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

drivers/usb/core/port.c: In function 'usb_port_device_release':
drivers/usb/core/port.c:25:2: error: implicit declaration of function 'kfree' [-Werror=implicit-function-declaration]
drivers/usb/core/port.c: In function 'usb_hub_create_port_device':
drivers/usb/core/port.c:38:2: error: implicit declaration of function 'kzalloc' [-Werror=implicit-function-declaration]
drivers/usb/core/port.c:38:11: warning: assignment makes pointer from integer without a cast [enabled by default]

Caused by commit 6e30d7cba992 ("usb: Add driver/usb/core/(port.c,hub.h)
files"). See Rule 1 in Documentation/SubmitChecklist.

I have used the usb tree from next-20130118 for today.

--
Cheers,
Stephen Rothwell s...@canb.auug.org.au
Re: splice() giving unexpected EOF in 3.7.3 and 3.8-rc4+
From: Eric Dumazet
Date: Fri, 18 Jan 2013 22:13:16 -0800

> On Fri, 2013-01-18 at 21:54 -0800, Eric Dumazet wrote:
>
>>
>> Hmm, this might be already fixed in net-next tree, could you try it ?
>>
>
> Yes, running your program on net-next seems OK.
>
> David, we need the two following commits.

Tossed into 'net' and queued up for -stable, thanks.
Re: [PATCH] firewire net: Use LL_RESERVED_SPACE(), HH_DATA_OFF().
From: YOSHIFUJI Hideaki
Date: Sun, 20 Jan 2013 17:03:07 +0900

> Signed-off-by: YOSHIFUJI Hideaki

Applied.
Re: [PATCH] firewire net: Ensure checksumming in upper layer.
From: YOSHIFUJI Hideaki
Date: Sun, 20 Jan 2013 16:43:40 +0900

> It is wrong to set skb->ip_summed to CHECKSUM_UNNECESSARY unless
> the device has already checked it.
>
> Signed-off-by: YOSHIFUJI Hideaki

Applied.
Re: [RFC] kernel config template for running inside virtual machine
On Mon, Jan 21, 2013 at 11:03 AM, Mulyadi Santosa wrote:
> Hello everybody
>
> With the significant usage of virtualization in recent years, I
> personally think there might be a need to easily generate somewhat
> more optimal kernel for running as VM guest.

To make it clearer: the template would be saved under arch/x86/configs
with a name like vm_defconfig or similar.

PS: I am not subscribed to linux-kernel@vger right now, so kindly cc:
me in your reply.

--
regards,
Mulyadi Santosa
Freelance Linux trainer and consultant
blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com
Re: [PATCH] fat: eliminate iterations in fat_search_long in case of EOD
2013/1/20, OGAWA Hirofumi :
> Namjae Jeon writes:
>
>> From: Namjae Jeon
>>
>> When searching a directory for names, we can stop checking for further
>> entries if we detect End of Directory, i.e. if (de->name[0] == 0x00).The
>> current code traverses the cluster chain of a directory until a hit is
>> found or till the last cluster for that directory, ignoring the EOD mark.
>> Fix this.
>
> f_pos still works fine after this change?

Hi OGAWA.

I cannot find any use of f_pos in fat_search_long(). Were you perhaps
looking at another function, such as __fat_readdir()? Please let me
know what you think.

Thanks.

>
>> Signed-off-by: Namjae Jeon
>> Signed-off-by: Ravishankar N
>> ---
>>  fs/fat/dir.c |    4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/fat/dir.c b/fs/fat/dir.c
>> index 58bf744..cde0e69 100644
>> --- a/fs/fat/dir.c
>> +++ b/fs/fat/dir.c
>> @@ -484,10 +484,10 @@ parse_record:
>>      nr_slots = 0;
>>      if (de->name[0] == DELETED_FLAG)
>>          continue;
>> +    if (!de->name[0])
>> +        goto end_of_dir;
>>      if (de->attr != ATTR_EXT && (de->attr & ATTR_VOLUME))
>>          continue;
>> -    if (de->attr != ATTR_EXT && IS_FREE(de->name))
>> -        continue;
>>      if (de->attr == ATTR_EXT) {
>>          int status = fat_parse_long(inode, , , ,
>>              , _slots);
>
> --
> OGAWA Hirofumi
>
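The effect of stopping at the End of Directory mark, as the quoted patch description proposes, can be illustrated with a small userspace sketch. The entry layout is cut down to the short name only, and `struct dirent_sketch`, `EOD_FLAG` and `entries_inspected()` are hypothetical names for this illustration; only the 0x00 and 0xe5 first-byte conventions come from the thread and the FAT on-disk format.

```c
#include <stddef.h>

#define DELETED_FLAG 0xe5	/* first name byte of a deleted entry */
#define EOD_FLAG     0x00	/* first name byte marking end of directory */

/* Hypothetical, trimmed-down directory entry: only the 11-byte short name. */
struct dirent_sketch {
	unsigned char name[11];
};

/*
 * Counts how many entries a search has to inspect before it can stop.
 * stop_at_eod != 0 models the proposed behaviour; 0 models scanning on.
 */
static size_t entries_inspected(const struct dirent_sketch *tbl, size_t n,
				int stop_at_eod)
{
	size_t seen = 0;

	for (size_t i = 0; i < n; i++) {
		seen++;
		if (tbl[i].name[0] == DELETED_FLAG)
			continue;	/* deleted slot: skip, keep scanning */
		if (tbl[i].name[0] == EOD_FLAG) {
			if (stop_at_eod)
				break;	/* nothing valid can follow */
			continue;	/* treat as free and carry on */
		}
		/* a live entry would be compared with the wanted name here */
	}
	return seen;
}
```

For a directory with three used slots followed by unused ones, the early exit inspects four entries instead of the whole table; in a real directory spanning many clusters the saving is the rest of the cluster chain.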
linux-next: manual merge of the usb tree with Linus' tree
Hi Greg,

Today's linux-next merge of the usb tree got a conflict in
drivers/usb/serial/io_ti.c between commit 1ee0a224bc9a ("USB: io_ti: Fix
NULL dereference in chase_port()") from Linus' tree and commit
f40d781554ef ("USB: io_ti: kill custom closing_wait implementation") from
the usb tree.

I fixed it up (the latter removed the code fixed by the former, so I
just used that) and can carry the fix as necessary (no action is
required).

--
Cheers,
Stephen Rothwell s...@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
Re: [PATCH 5/5] drivers: atm: checkpatch.pl fixed coding style issues in eni.c
From: Patrik Karlin
Date: Mon, 21 Jan 2013 00:12:55 +0100

> This patch fixes statement placement around if/else/for statments
> as suggested by checkpatch.pl
>
> Signed-off-by: Patrik Kårlin

This patch set is a good example of why nobody should fix up coding
style in such a robotic way in response to checkpatch.pl complaints.

> -      ATM_MAX_AAL5_PDU) eff = (length+3) >> 2;
> +      ATM_MAX_AAL5_PDU)

I bet you didn't even notice that in this change you are adding
trailing whitespace, the exact problem you fixed up for this file in a
previous patch of the series.

I really would encourage you to work on something else entirely.

Thanks.
Re: [patch] module: potential deadlock in error path
Linus Torvalds writes:
> On Sun, Jan 20, 2013 at 5:20 PM, Rusty Russell wrote:
>> Dan Carpenter writes:
>>> We take the lock twice if we hit this goto.
>>>
>>> Signed-off-by: Dan Carpenter
>>
>> Damn, just pushed that to Linus: should have read mail first.
>>
>> I've added this, thanks.
>
> I'm not pulling this. It seems stupid.
>
> Why isn't the fix just this (whitespace-damaged, cut-and-pasted)
> one-liner instead? I may be blind, but as far as I cal tell, there's
> exactly one single place we do that "giti ddebug_cleanup", and it
> wants to unlock the mutex, so we should just move the unlock down one
> line instead.
>
> Hmm? Is there some hidden magic going on that I can't see?

TBH, I find your change marginally less clear. You've now conflated
two completely different lock paths into a single unlock.
mutex_bug_cleanup() should really lock internally, but doesn't so we
wrap it. And that mutex_unlock of yours has nothing to do with
cleaning up ddebug, so the labels misnamed, at best.

> diff --git a/kernel/module.c b/kernel/module.c
> index d25e359279ae..eab08274ec9b 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3274,8 +3274,8 @@ again:
>      /* module_bug_cleanup needs module_mutex protection */
>      mutex_lock(_mutex);
>      module_bug_cleanup(mod);
> -    mutex_unlock(_mutex);
>  ddebug_cleanup:
> +    mutex_unlock(_mutex);
>      dynamic_debug_remove(info->debug);
>      synchronize_sched();
>      kfree(mod->args);

Not that it matters much: this is going to change for next merge
window. See below for freshly-minted patch (compiled, untested).

Nice to make module_bug_cleanup() lock internally but it's in bug.c,
and I've avoided making the module mutex non-static due to a history
of abuse...

Thanks,
Rusty.

module: clean up load_module a little more.

1fb9341ac34825aa40354e74d9a2c69df7d2c304 made our locking in
load_module more complicated: we grab the mutex once to insert the
module in the list, then again to upgrade it once it's formed.

Since the locking is self-contained, it's neater to do this in
separate functions.

diff --git a/kernel/module.c b/kernel/module.c
index 2b1d517..c0bc9b9 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3145,12 +3145,72 @@ static int may_init_module(void)
     return 0;
 }

+/*
+ * We try to place it in the list now to make sure it's unique before
+ * we dedicate too many resources. In particular, temporary percpu
+ * memory exhaustion.
+ */
+static int add_unformed_module(struct module *mod)
+{
+    int err;
+    struct module *old;
+
+    mod->state = MODULE_STATE_UNFORMED;
+
+again:
+    mutex_lock(_mutex);
+    if ((old = find_module_all(mod->name, true)) != NULL) {
+        if (old->state == MODULE_STATE_COMING
+            || old->state == MODULE_STATE_UNFORMED) {
+            /* Wait in case it fails to load. */
+            mutex_unlock(_mutex);
+            err = wait_event_interruptible(module_wq,
+                           finished_loading(mod->name));
+            if (err)
+                goto out_unlocked;
+            goto again;
+        }
+        err = -EEXIST;
+        goto out;
+    }
+    list_add_rcu(>list, );
+    err = 0;
+
+out:
+    mutex_unlock(_mutex);
+out_unlocked:
+    return err;
+}
+
+static int complete_formation(struct module *mod, struct load_info *info)
+{
+    int err;
+
+    mutex_lock(_mutex);
+
+    /* Find duplicate symbols (must be called under lock). */
+    err = verify_export_symbols(mod);
+    if (err < 0)
+        goto out;
+
+    /* This relies on module_mutex for list integrity. */
+    module_bug_finalize(info->hdr, info->sechdrs, mod);
+
+    /* Mark state as coming so strong_try_module_get() ignores us,
+     * but kallsyms etc. can see us. */
+    mod->state = MODULE_STATE_COMING;
+
+out:
+    mutex_unlock(_mutex);
+    return err;
+}
+
 /* Allocate and load the module: note that size of section 0 is always
    zero, and we rely on this for optional sections. */
 static int load_module(struct load_info *info, const char __user *uargs,
                int flags)
 {
-    struct module *mod, *old;
+    struct module *mod;
     long err;

     err = module_sig_check(info);
@@ -3168,31 +3228,10 @@ static int load_module(struct load_info *info, const char __user *uargs,
         goto free_copy;
     }

-    /*
-     * We try to place it in the list now to make sure it's unique
-     * before we dedicate too many resources. In particular,
-     * temporary percpu memory exhaustion.
-     */
-    mod->state = MODULE_STATE_UNFORMED;
-again:
-    mutex_lock(_mutex);
-    if ((old = find_module_all(mod->name, true)) != NULL) {
-        if (old->state == MODULE_STATE_COMING
-            ||
[git pull] drm fixes
Hi Linus,

A bunch of intel and radeon fixes, along with two fixes to TTM code.
The correct fix for the Intel ironlake failure is in this, and should
make things more stable, along with some misc radeon fixes.

Dave.

The following changes since commit 7b4cf994e4c6ba48872bb25253cc393b7fb74c82:

  udldrmfb: udl_get_edid: drop unneeded i-- (2013-01-14 08:45:27 +1000)

are available in the git repository at:

  git://people.freedesktop.org/~airlied/linux.git drm-fixes

for you to fetch changes up to 014b34409fb2015f63663b6cafdf557fdf289628:

  ttm: on move memory failure don't leave a node dangling (2013-01-21 13:45:23 +1000)

Alex Deucher (2):
      drm/radeon: clear reset flags if engines are idle
      Revert "drm/radeon: do not move bo to different placement at each cs"

Chris Wilson (2):
      drm/i915: Record DERRMR, FORCEWAKE and RING_CTL in error-state
      drm/i915: Invalidate the relocation presumed_offsets along the slow path

Dave Airlie (4):
      Merge branch 'drm-fixes-3.8' of git://people.freedesktop.org/~agd5f/linux into drm-next
      Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
      ttm: don't destroy old mm_node on memcpy failure
      ttm: on move memory failure don't leave a node dangling

Jani Nikula (2):
      drm/i915/eDP: do not write power sequence registers for ghost eDP
      drm/i915: fix FORCEWAKE posting reads

Jerome Glisse (1):
      drm/radeon: improve semaphore debugging on lockup

Marek Olšák (1):
      drm/radeon: allow FP16 color clear registers on r500

 drivers/gpu/drm/i915/i915_debugfs.c        |  3 ++
 drivers/gpu/drm/i915/i915_drv.h            |  3 ++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 21 +
 drivers/gpu/drm/i915/i915_irq.c            | 11 +++
 drivers/gpu/drm/i915/i915_reg.h            |  2 ++
 drivers/gpu/drm/i915/intel_dp.c            | 47 --
 drivers/gpu/drm/i915/intel_pm.c            | 17 +++
 drivers/gpu/drm/radeon/evergreen.c         |  6
 drivers/gpu/drm/radeon/ni.c                |  6
 drivers/gpu/drm/radeon/r600.c              |  6
 drivers/gpu/drm/radeon/radeon.h            |  3 +-
 drivers/gpu/drm/radeon/radeon_drv.c        |  3 +-
 drivers/gpu/drm/radeon/radeon_object.c     | 18 +++-
 drivers/gpu/drm/radeon/radeon_ring.c       |  2 ++
 drivers/gpu/drm/radeon/radeon_semaphore.c  |  4 +++
 drivers/gpu/drm/radeon/reg_srcs/rv515      |  2 ++
 drivers/gpu/drm/radeon/si.c                |  6
 drivers/gpu/drm/ttm/ttm_bo.c               |  1 +
 drivers/gpu/drm/ttm/ttm_bo_util.c          | 11 +--
 19 files changed, 140 insertions(+), 32 deletions(-)
linux-next: manual merge of the tty tree with Linus' tree
Hi Greg,

Today's linux-next merge of the tty tree got a conflict in
drivers/tty/serial/vt8500_serial.c between commit a6dd114e16cb ("tty:
serial: vt8500: fix return value check in vt8500_serial_probe()") from
Linus' tree and commit 12faa35ae5cb ("serial: vt8500: UART uses gated
clock rather than 24Mhz reference") from the tty tree.

I fixed it up (I just used the tty tree version - which included the
former fix) and can carry the fix as necessary (no action is required).

--
Cheers,
Stephen Rothwell s...@canb.auug.org.au
RE: USB: storage: optimize the matching rules and support new switch command for Huawei USB storage devices
Dear Greg:

> -Original Message-
> From: Greg KH [mailto:gre...@linuxfoundation.org]
> Sent: Saturday, January 19, 2013 7:42 AM
> To: Fangxiaozhi (Franko)
> Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org; Xueguiying (Zihan);
> Linlei (Lei Lin); Yili (Neil); Wangyuhua (Roger, Credit); Huqiao (C); ba...@ti.com;
> mdharm-...@one-eyed-alien.net; sebast...@breakpoint.cc
> Subject: Re: USB: storage: optimize the matching rules and support new switch
> command for Huawei USB storage devices
>
> On Mon, Jan 14, 2013 at 10:55:48AM +0800, fangxiaozhi 00110321 wrote:
> >
> > From: fangxiaozhi
> >
> > 1. Optimize the matching rules with new macro for Huawei USB storage
> >    devices, to avoid to load USB storage driver for the modem interface
> >    with Huawei devices.
> > 2. Add to support new switch command for new Huawei USB dongles.
> >
> > Signed-off-by: fangxiaozhi
>
> Next time, please always use the scripts/checkpatch.pl tool to find any
> problems you might have made in your patch (you had trailing whitespace in
> this one, which I have fixed.)
>

Yes, I did check my patch with the scripts/checkpatch.pl tool before
submitting it.
As for the trailing whitespace error, I thought that leaving the
whitespace in made our patch code more readable. Is that not so?

> Also, you might want to use git, it makes creating the patches easier, that way
> you don't end up with lines in the patch like this one:
>
> > Binary files linux-3.8-rc3_orig/drivers/usb/storage/initializers.o and
> > linux-3.8-rc3/drivers/usb/storage/initializers.o differ
>
> thanks,
>
> greg k-h

Best Regards,
Franko Fang
Re: [PATCH v2 2/3] dma: edma: add device_channel_caps() support
On Sun, Jan 20, 2013 at 11:51:08AM -0500, Matt Porter wrote:
> The explanation in the cover letter mentions that dmaengine_slave_config() is
> required to be called prior to dmaengine_get_channel_caps(). If we
> switch to the alternative API, then that would go away including the
> dependency on direction.

Nope you got that wrong!

--
~Vinod
Re: [PATCH v2 0/3] dmaengine: add per channel capabilities api
On Sun, Jan 20, 2013 at 11:37:35AM -0500, Matt Porter wrote:
> On Sun, Jan 20, 2013 at 12:37:34PM +, Vinod Koul wrote:
> > On Thu, Jan 10, 2013 at 02:07:03PM -0500, Matt Porter wrote:
> > > The call is implemented as follows:
> > >
> > >   struct dmaengine_chan_caps
> > >   *dma_get_channel_caps(struct dma_chan *chan,
> > >                         enum dma_transfer_direction dir);
> > >
> > > The dma transfer direction parameter may appear a bit out of place
> > > but it is necessary since the direction field in struct
> > > dma_slave_config was deprecated. In some cases, EDMA for one, it
> > > is necessary for the dmaengine driver to have the burst and address
> > > width slave configuration parameters available in order to compute
> > > the maximum segment size that can be handle. Due to this requirement,
> > > the calling order of this api is as follows:
> > Well you are passing direction as argument so even in EDMA it doesn't seem to
> > help you as you seem to need burst and width!. So why do you even need the
> > direction to compute the capablities
>
> Yes, I need burst and width, but they are dependent on direction (dst vs
> src, as stored in the slave channel config). Ok, so I think I know where
> this is leading...the problem is probably that I made an implicit
> dependency on burst and width here. The expectation in this

And also due to wrong documentation. This is the flow you have put up:

Due to this requirement, the calling order of this api is as follows:

1. Allocate a DMA slave channel
1a. [Optionally] Get channel capabilities
2. Set slave and controller specific parameters
3. Get a descriptor for transaction
4. Submit the transaction
5. Issue pending requests and wait for callback notification

Now when we query capabilities, slave parameters _are_not_set_. So it
seems like you have thought something and written something else!

Which brings me to the point of what we are trying to query:

a) API capability: we don't need slave parameters for that.
b) Sg segment length and numbers: well, these are capabilities, so they
   tell you the maximum I can do. IMO it doesn't make sense to tie this
   down to burst, width etc. For that configuration you are checking
   the maximum. What this needs to return is the maximum length it
   supports and the maximum number of sg list entries the h/w can use.

Also, if you return your burst and width capability, then any client
can easily find out the length byte value it can hold. If you feel
this computation is client specific, though looking at it doesn't make
me think so, you can add a callback for this computation given the
parameters.

--
~Vinod
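One way to read the suggestion above is that the driver advertises configuration-independent ceilings and the client derives anything that depends on its own burst and width. The sketch below is only a guess at what that split could look like: `struct chan_caps_sketch`, its fields, `seg_bytes_for_config()` and the "maximum bursts per segment" limit are all hypothetical and are not part of the dmaengine API or of the patch under review.

```c
#include <stddef.h>

/* Hypothetical capability record; field names are illustrative only. */
struct chan_caps_sketch {
	size_t max_seg_len;		/* largest single sg segment, in bytes */
	unsigned int max_seg_nr;	/* most sg entries per descriptor */
	unsigned int max_burst;		/* largest burst, in units of width */
	unsigned int max_width;		/* widest bus access, in bytes */
};

/*
 * Hypothetical client-side helper: bytes that fit in one segment for the
 * width and burst the client intends to configure, if the hardware can
 * count at most max_bursts bursts per segment.
 */
static size_t seg_bytes_for_config(unsigned int width, unsigned int burst,
				   unsigned int max_bursts)
{
	return (size_t)width * burst * max_bursts;
}

/* Largest transfer a client may place in one descriptor under the caps. */
static size_t max_transfer_bytes(const struct chan_caps_sketch *caps,
				 size_t seg_len)
{
	if (seg_len > caps->max_seg_len)
		seg_len = caps->max_seg_len;	/* never exceed the ceiling */
	return seg_len * caps->max_seg_nr;
}
```

With this split the capability query needs no slave configuration at all, which removes the ordering problem raised in the reply; whether a per-client callback is still needed depends on how hardware-specific the segment computation really is.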
RE: USB: storage: optimize the matching rules and support new switch command for Huawei USB storage devices
Dear Greg:

> -Original Message-
> From: Greg KH [mailto:g...@kroah.com]
> Sent: Saturday, January 19, 2013 7:44 AM
> To: Fangxiaozhi (Franko)
> Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org; Xueguiying (Zihan);
> Linlei (Lei Lin); Yili (Neil); Wangyuhua (Roger, Credit); Huqiao (C); ba...@ti.com;
> mdharm-...@one-eyed-alien.net; sebast...@breakpoint.cc
> Subject: Re: USB: storage: optimize the matching rules and support new switch
> command for Huawei USB storage devices
>
> On Mon, Jan 14, 2013 at 10:55:48AM +0800, fangxiaozhi 00110321 wrote:
> >
> > From: fangxiaozhi
> >
> > 1. Optimize the matching rules with new macro for Huawei USB storage
> >    devices, to avoid to load USB storage driver for the modem interface
> >    with Huawei devices.
> > 2. Add to support new switch command for new Huawei USB dongles.
> >
> > Signed-off-by: fangxiaozhi
>
> This patch breaks the build, did you test it out?
>
> I get the following errors:
>
> drivers/usb/storage/unusual_devs.h:1530:1: error: implicit declaration of function ‘UNUSUAL_VENDOR_INTF’ [-Werror=implicit-function-declaration]
> drivers/usb/storage/unusual_devs.h:1534:3: warning: missing braces around initializer [-Wmissing-braces]
> drivers/usb/storage/unusual_devs.h:1534:3: warning: (near initialization for ‘us_unusual_dev_list[186]’) [-Wmissing-braces]
> drivers/usb/storage/unusual_devs.h:1534:3: error: initializer element is not constant
> drivers/usb/storage/unusual_devs.h:1534:3: error: (near initialization for ‘us_unusual_dev_list[186].vendorName’)
> drivers/usb/storage/unusual_devs.h:1537:1: warning: braces around scalar initializer [enabled by default]
>
> And it goes on and on...

The macro is defined in a separate patch; please see:
[PATCH 1/1]linux-usb:Define a new macro for USB storage match rules
http://www.spinics.net/lists/linux-usb/msg76629.html

> Care to fix this up and resend it?
>
> thanks,
>
> greg k-h

Best Regards,
Franko Fang
Re: sched: Consequences of integrating the Per Entity Load Tracking Metric into the Load Balancer
On 01/21/2013 10:40 AM, Preeti U Murthy wrote: > Hi Alex, > Thank you very much for running the below benchmark on > blocked_load+runnable_load:) Just a few queries. > > How did you do the wake up balancing? Did you iterate over the L3 > package looking for an idle cpu? Or did you just query the L2 package > for an idle cpu? > Just used the current select_idle_sibling function, so it search in L3 package. > I think when you are using blocked_load+runnable_load it would be better > if we just query the L2 package as Vincent had pointed out because the > fundamental behind using blocked_load+runnable_load is to keep a steady > state across cpus unless we could reap the advantage of moving the > blocked load to a sibling core when it wakes up. > > And the drop of performance is relative to what? it is 2 VS 3.8-rc3 > 1.Your v3 patchset with runnable_load_avg in weighted_cpu_load(). > 2.Your v3 patchset with runnable_load_avg+blocked_load_avg in > weighted_cpu_load(). > > Are the above two what you are comparing? And in the above two versions > have you included your [PATCH] sched: use instant load weight in burst > regular load balance? no this patch. > > On 01/20/2013 09:22 PM, Alex Shi wrote: > The blocked load of a cluster will be high if the blocked tasks have > run recently. The contribution of a blocked task will be divided by 2 > each 32ms, so it means that a high blocked load will be made of recent > running tasks and the long sleeping tasks will not influence the load > balancing. 
> The load balance period is between 1 tick (10ms for idle load balance > on ARM) and up to 256 ms (for busy load balance) so a high blocked > load should imply some tasks that have run recently otherwise your > blocked load will be small and will not have a large influence on your > load balance >>> >>> Just tried using cfs's runnable_load_avg + blocked_load_avg in >>> weighted_cpuload() with my v3 patchset, aim9 shared workfile testing >>> show the performance dropped 70% more on the NHM EP machine. :( >>> >> >> Ops, the performance is still worse than just count runnable_load_avg. >> But dropping is not so big, it dropped 30%, not 70%. >> > > Thank you > > Regards > Preeti U Murthy > -- Thanks Alex -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
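The halving rule quoted in this thread (a blocked task's contribution is divided by 2 every 32ms) can be put into numbers with a small sketch. This is only a coarse userland model, not the scheduler's code: the function name and the constant name are made up for illustration, and partial 32ms periods are ignored, whereas the real tracking decays more finely.

```c
#include <stdint.h>

/* From the thread: a blocked task's contribution halves every 32ms. */
#define HALF_LIFE_MS 32u

/*
 * Hypothetical helper: return what is left of a blocked load
 * contribution after blocked_ms milliseconds of sleeping.
 * Only whole 32ms periods are counted in this coarse model.
 */
static uint64_t decay_blocked_load(uint64_t load, unsigned int blocked_ms)
{
	unsigned int periods = blocked_ms / HALF_LIFE_MS;

	/* Shifting a 64-bit value by 64 or more is undefined in C. */
	if (periods >= 64)
		return 0;

	return load >> periods;
}
```

With a starting contribution of 1024, this leaves 512 after 32ms and only 4 after 256ms, the upper bound of the busy load balance period mentioned above. That is the arithmetic behind the claim that long sleeping tasks barely influence the balance.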
Re: [PULL] Module fixes, and a virtio block fix.
On Sun, Jan 20, 2013 at 6:57 PM, Rusty Russell wrote: > > I'm confused. The default argument is HEAD: what does it know about tag > names? Ugh. I actually thought that if you give it the tag name directly (as the "end") it will use that. But no. It figures it out with "git describe --exact" internally. Regardless, if your HEAD is actually tagged, it *will* have the tag-name in git-request-pull. And it will have it based on your *local* repo, so the fact that it hasn't been mirrored out yet doesn't really matter. git request-pull knows that tag name regardless of mirroring issues. > The bug is that if it can't find that commit at the remote end, it > still generates a valid-looking request (with a warning at the end), > where it guesses you're talking about the master branch. It really shouldn't do that any more, but you seem to have the older version with the bug. At least one of the annoying problems was fixed in the 1.7.11 series, you have 1.7.10. The nice thing about git is that it is *really* easy to upgrade. Just fetch the sources, do "make; make install" all as a normal user, and you do not need to worry about package management or distro issues or any crap like that. It installs into your $(HOME)/bin, and as long as your PATH has that first, you'll get it. I've long suggested that as the workaround for distros having old versions (some more so than others). > Since I use a wrapper script now for your pull requests I can use sed to > unscrew it: > > [alias] > for-linus = !check-commits && TAGNAME=`git symbolic-ref HEAD | cut > -d/ -f3`-for-linus && git tag -f -u D1ADB8F1 $TAGNAME HEAD && git push korg > tag $TAGNAME && git request-pull master korg | sed > s,gitol...@ra.kernel.org:/pub,git://git.kernel.org/pub, && git log --stat > --reverse master..$TAGNAME | emails-from-log | grep -v 'rusty@rustcorp' | > grep -v 'sta...@kernel.org' | sed 's/^/Cc: /' Heh. Ok. That will at least hide the breakage. But I suspect you could fix it by just updating git. 
Linus
[PATCH] Subtract min_free_kbytes from dirtyable memory
When calculating amount of dirtyable memory, min_free_kbytes should be
subtracted because it is not intended for dirty pages.

Using an "extern int" because that is the only interface to some such
sysctl values.

(This patch does not solve the PAE OOM issue.)

Paul Szabo   p...@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney    Australia

Reported-by: Paul Szabo
Reference: http://bugs.debian.org/695182
Signed-off-by: Paul Szabo

--- mm/page-writeback.c.old	2012-12-06 22:20:40.0 +1100
+++ mm/page-writeback.c	2013-01-21 13:57:05.0 +1100
@@ -343,12 +343,16 @@
 unsigned long determine_dirtyable_memory(void)
 {
 	unsigned long x;
+	extern int min_free_kbytes;
 
 	x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
 
 	if (!vm_highmem_is_dirtyable)
 		x -= highmem_dirtyable_memory(x);
 
+	/* Subtract min_free_kbytes */
+	x -= min(x, min_free_kbytes >> (PAGE_SHIFT - 10));
+
 	return x + 1;	/* Ensure that we never return 0 */
 }
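The arithmetic in the patch above can be checked in isolation. The following is a userland sketch, not the kernel function: it assumes 4 KiB pages (PAGE_SHIFT of 12), and the helper names are hypothetical. It converts explicitly to unsigned long before comparing, since the kernel's min() macro is strict about mixing operand types.

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumption for this sketch: 4 KiB pages */

/* One page is 2^(PAGE_SHIFT - 10) KiB, so shift right to get pages. */
static unsigned long kbytes_to_pages(unsigned long kbytes)
{
	return kbytes >> (PAGE_SHIFT - 10);
}

/*
 * Hypothetical model of the patched tail of determine_dirtyable_memory():
 * subtract the reserve, never going below zero, then add 1 so that the
 * result is never 0.
 */
static unsigned long dirtyable_after_reserve(unsigned long pages,
					     int min_free_kbytes)
{
	unsigned long reserve;

	assert(min_free_kbytes >= 0);
	reserve = kbytes_to_pages((unsigned long)min_free_kbytes);
	if (reserve > pages)
		reserve = pages;	/* same effect as min(x, reserve) */

	return pages - reserve + 1;
}
```

For example, with 1000 candidate pages and min_free_kbytes set to 2048, the reserve is 512 pages and the result is 489. When the reserve exceeds the candidate pages, the result bottoms out at 1.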
Re: linux-next: manual merge of the security tree with Linus' tree
On Mon, 2013-01-21 at 13:12 +1100, Stephen Rothwell wrote:
> Hi James,
>
> Today's linux-next merge of the security tree got a conflict in
> security/integrity/ima/ima_main.c between commit a7f2a366f623 ("ima:
> fallback to MODULE_SIG_ENFORCE for existing kernel module syscall") from
> Linus' tree and commit 750943a30714 ("ima: remove enforce checking
> duplication") from the security tree.
>
> I think I fixed it up (see below).

Sorry Stephen, the merged result should look like what's contained in
linux-integrity/next-upstreamed-patches:

int ima_module_check(struct file *file)
{
	if (!file) {
		if ((ima_appraise & IMA_APPRAISE_MODULES) &&
		    (ima_appraise & IMA_APPRAISE_ENFORCE)) {
#ifndef CONFIG_MODULE_SIG_FORCE
			return -EACCES;	/* INTEGRITY_UNKNOWN */
#endif
		}
		return 0;
	}
	return process_measurement(file, file->f_dentry->d_name.name,
				   MAY_EXEC, MODULE_CHECK);
}

thanks,

Mimi
[PATCH] MAX_PAUSE to be at least 4
Ensure MAX_PAUSE is 4 or larger, so limits in
	return clamp_val(t, 4, MAX_PAUSE);
(the only use of it) are not back-to-front.

(This patch does not solve the PAE OOM issue.)

Paul Szabo   p...@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney    Australia

Reported-by: Paul Szabo
Reference: http://bugs.debian.org/695182
Signed-off-by: Paul Szabo

--- mm/page-writeback.c.old	2012-12-06 22:20:40.0 +1100
+++ mm/page-writeback.c	2013-01-21 13:57:05.0 +1100
@@ -39,7 +39,7 @@
 /*
  * Sleep at most 200ms at a time in balance_dirty_pages().
  */
-#define MAX_PAUSE		max(HZ/5, 1)
+#define MAX_PAUSE		max(HZ/5, 4)
 
 /*
  * Estimate write bandwidth at 200ms intervals.
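To see what "back-to-front" limits do, here is a small userland sketch. It assumes the clamp is evaluated as min(max(val, lo), hi), which is my understanding of the kernel helper; clamp_long(), max_pause_old() and max_pause_new() are hypothetical names, and HZ is passed in as a parameter so that a low tick rate can be tried.

```c
/* Assumed evaluation order: raise to the lower bound, then cap at the upper. */
static long clamp_long(long val, long lo, long hi)
{
	long v = val > lo ? val : lo;

	return v < hi ? v : hi;
}

/* MAX_PAUSE before the patch: max(HZ/5, 1) */
static long max_pause_old(long hz)
{
	long v = hz / 5;

	return v > 1 ? v : 1;
}

/* MAX_PAUSE after the patch: max(HZ/5, 4) */
static long max_pause_new(long hz)
{
	long v = hz / 5;

	return v > 4 ? v : 4;
}
```

With a hypothetical HZ of 10, the old definition gives an upper limit of 2, below the lower limit of 4, so clamp_long(t, 4, 2) returns 2 for every t. With the new definition both limits are 4 and the result is 4. For common HZ values of 100 and above the two definitions agree.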
Re: [PATCH 0/3] pwm-backlight: add subdrivers & Tegra support
Patch is applied OK on 3.8-rc4. Hmmm.. But I think it's better to make the patch can be applied on linux-next. Mark On 01/21/2013 10:09 AM, Mark Zhang wrote: > Hi Alex, > > This patch set applies failed on tot linux-next(0118). Here is the log: > > markz@markz-hp6200:~/tegradrm/official-upstream-kernel$ git am > ~/Desktop/*.eml > Applying: pwm-backlight: add subdriver mechanism > error: patch failed: drivers/video/backlight/pwm_bl.c:35 > error: drivers/video/backlight/pwm_bl.c: patch does not apply > Patch failed at 0001 pwm-backlight: add subdriver mechanism > When you have resolved this problem run "git am --resolved". > If you would prefer to skip this patch, instead run "git am --skip". > To restore the original branch and stop patching run "git am --abort". > > Anyway, I'll try to apply this on 3.8-rc4. > > Mark > On 01/19/2013 06:30 PM, Alexandre Courbot wrote: >> This series introduces a way to use pwm-backlight hooks with platforms >> that use the device tree through a subdriver system. It also adds support >> for the Tegra-based Ventana board, adding the last missing block to enable >> its panel. Support for other Tegra board can thus be easily added. >> >> I have something else in mind to properly support this (power >> sequences), but this work relies on the GPIO subsystem redesign which will >> take some time. The pwm-backlight subdrivers can do the job by the meantime. >> >> There are a few design points that might need to be discussed: >> 1) Link order is important: subdrivers register themselves in their >> module_init function, which must be called before pwm-backlight's probe. >> This forbids linking subdrivers as separate modules from pwm-backlight. >> 2) The subdriver's data is temporarily passed through the backlight >> device's driver data. This should not hurt, but maybe there is a better way >> to do this. >> 3) Subdrivers must add themselves into pwm-backlight's own of_device_id >> table. 
It would be cleaner to not have to list subdrivers into >> pwm-backlight's main file, but I cannot think of a way to do otherwise. >> >> Suggestions for the 3 points listed above are very welcome - in any case, >> I hope to make this converge into something mergeable quickly. >> >> Note that these patches are the last missing block to get a functional >> panel on Tegra boards. Using 3.8rc4 and these patches, the internal panel >> on Ventana is usable out-of-the-box. Yay. >> >> Alexandre Courbot (3): >> pwm-backlight: add subdriver mechanism >> tegra: pwm-backlight: add tegra pwm-bl driver >> tegra: ventana: of: add host1x device to DT >> >> arch/arm/boot/dts/tegra20-ventana.dts | 29 +- >> arch/arm/configs/tegra_defconfig | 1 + >> drivers/video/backlight/Kconfig| 7 ++ >> drivers/video/backlight/Makefile | 4 + >> drivers/video/backlight/pwm_bl.c | 70 ++- >> drivers/video/backlight/pwm_bl_tegra.c | 159 >> + >> include/linux/pwm_backlight.h | 15 >> 7 files changed, 281 insertions(+), 4 deletions(-) >> create mode 100644 drivers/video/backlight/pwm_bl_tegra.c >> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PULL] Module fixes, and a virtio block fix.
Linus Torvalds writes: > On Sun, Jan 20, 2013 at 5:32 PM, Rusty Russell wrote: >> >> Due to the delay on git.kernel.org, git request-pull fails. It *looks* >> like it succeeds, except the warning, but (as we learned last time I >> screwed up), it doesn't put the branchname because it can't know. > > I think this should be fixed in modern git versions. > > And it sure as hell knows the proper tag name, since you *gave* it the > name and it used it for generating the actual contents. The fact that > some versions then screw that up and re-write the tag-name to > something randomly matching that isn't a tag was just a bug. I'm confused. The default argument is HEAD: what does it know about tag names? git request-pull master korg The bug is that if it can't find that commit at the remote end, it still generates a valid-looking request (with a warning at the end), where it guesses you're talking about the master branch. >> For want of a better solution, I'll now resort to sending pull requests >> with the anti-social gitolite URL in it, like so: > > That's even worse, fwiw. It means that the pull request address makes > no sense to anybody who doesn't have a kernel.org address, and then > I'm forced to just edit things by hand instead to not pollute the > kernel changelog history with crap. Since I use a wrapper script now for your pull requests I can use sed to unscrew it: [alias] for-linus = !check-commits && TAGNAME=`git symbolic-ref HEAD | cut -d/ -f3`-for-linus && git tag -f -u D1ADB8F1 $TAGNAME HEAD && git push korg tag $TAGNAME && git request-pull master korg | sed s,gitol...@ra.kernel.org:/pub,git://git.kernel.org/pub, && git log --stat --reverse master..$TAGNAME | emails-from-log | grep -v 'rusty@rustcorp' | grep -v 'sta...@kernel.org' | sed 's/^/Cc: /' > Junio, didn't "git request-pull" get fixed so that it *warns* about > missing tagnames/branches, but never actually corrupts the pull > request? 
Or did it just get "fixed" to be a hard error instead of > corrupting things? Because this is annoying. Here: git version 1.7.10.4 Cheers, Rusty.
Re: [PATCH] tty: Only wakeup the line discipline idle queue when queue is active
On 01/18/2013 09:15 PM, Oleg Nesterov wrote:
> On 01/17, Preeti U Murthy wrote:
>>
>> On 01/16/2013 05:32 PM, Ivo Sieben wrote:
>>>
>>> I don't have a problem that there is a context switch to the high
>>> priority process: it has a higher priority, so it probably is more
>>> important.
>>> My problem is that even when the waitqueue is empty, the high priority
>>> thread has a risk to block on the spinlock needlessly (causing context
>>> switches to low priority task and back to the high priority task)
>>>
>> Fair enough Ivo. I think you should go ahead with merging the
>> waitqueue_active() + wake_up() logic into the wake_up() variants.
>
> This is not easy. We can't simply change wake_up*() helpers or modify
> __wake_up().

Hmm. I need to confess that I don't really know what goes into a change
such as this. Since there are a lot of waitqueue_active()+wake_up() calls,
I was wondering why we have waitqueue_active() as separate logic at all,
if we could do what it does in wake_up*(). But you guys can decide this
best.

>
> I can't understand why do you dislike Ivo's simple patch. There are
> a lot of "if (waitqueue_active) wake_up" examples. Even if we add the
> new helpers (personally I don't think this makes sense) , we can do
> this later. Why should we delay this fix?

Personally I was concerned about how this could cause scheduler
overhead. There does not seem to be much of a problem here. Ivo's patch
adding a waitqueue_active() check for his specific problem would also do
well, unless there is a dire requirement for a cleanup, which I am unable
to evaluate.

>
> Oleg.
>

Thank you

Regards
Preeti U Murthy
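The pattern under discussion, "if (waitqueue_active) wake_up", can be illustrated with a toy model. This is a single-threaded userland sketch with made-up names (toy_waitqueue and friends), not kernel code. It only counts how often the queue lock would be taken, which is the cost Ivo wants to avoid when nobody is waiting. As far as I understand, the real lockless check also needs care with memory ordering, which this toy ignores.

```c
#include <assert.h>
#include <stddef.h>

struct toy_waitqueue {
	int nr_waiters;		/* sleepers currently queued */
	int lock_acquisitions;	/* times the queue lock was taken */
};

/* Unconditional wake-up: takes the lock even when nobody is waiting. */
static void toy_wake_up(struct toy_waitqueue *q)
{
	assert(q != NULL);
	q->lock_acquisitions++;	/* stands in for taking the queue spinlock */
	q->nr_waiters = 0;	/* wake everybody */
}

/* Cheap check that does not take the lock. */
static int toy_waitqueue_active(const struct toy_waitqueue *q)
{
	return q->nr_waiters > 0;
}

/* The pattern from the thread: skip the lock when the queue is empty. */
static void toy_wake_up_if_active(struct toy_waitqueue *q)
{
	if (toy_waitqueue_active(q))
		toy_wake_up(q);
}
```

Calling toy_wake_up_if_active() on an empty queue leaves lock_acquisitions untouched, while toy_wake_up() always increments it. That difference is the whole point of the guarded form.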
Re: [PATCH v4 2/2] perf stat: add interval printing
Hi Stephane, On Sat, 19 Jan 2013 00:13:59 +0100, Stephane Eranian wrote: > This patch adds a new printing mode for perf stat. > It allows internval printing. That means perf stat > can now print event deltas at regular time interval. > This is useful to detect phases in programs. > > The -I option enables interval printing. It expects > an interval duration in milliseconds. Minimum is > 100ms. Once, activated perf stat prints events deltas > since last printout. All modes are supported. > > $ perf stat -I 1000 -e cycles noploop 10 > noploop for 10 seconds Is this line an output from perf stat? In addition, how about adding a head line like: # timecount event # > 1.86918 2385155642 cycles#0.000 GHz > 2.000267937 2392279774 cycles#0.000 GHz > 3.000385400 2390971450 cycles#0.000 GHz > 4.000504408 2390996752 cycles#0.000 GHz > 5.000626878 2390853097 cycles#0.000 GHz > > The output format makes it easy to feed into a plotting program > such as gnuplot when the -I option is used in combination with the -x > option: > > $ perf stat -x, -I 1000 -e cycles noploop 10 > noploop for 10 seconds > 1.84113,2378775498,cycles > 2.000245798,2391056897,cycles > 3.000354445,2392089414,cycles > 4.000459115,2390936603,cycles > 5.000565341,2392108173,cycles > > Signed-off-by: Stephane Eranian > --- [snip] > @@ -877,6 +977,8 @@ static void print_counter(struct perf_evsel *counter) > static void print_stat(int argc, const char **argv) > { > struct perf_evsel *counter; > + struct timespec ts, rs; > + char prefix[64] = { 0, }; > int i; > > fflush(stdout); > @@ -899,12 +1001,18 @@ static void print_stat(int argc, const char **argv) > fprintf(output, ":\n\n"); > } > > + if (interval) { > + clock_gettime(CLOCK_MONOTONIC, ); > + diff_timespec(, , _time); > + sprintf(prefix, "%lu.%09lu%s", rs.tv_sec, rs.tv_nsec, csv_sep); > + } AFAICS the only caller of print_stat() is cmd_stat() and it'll call this only if interval is 0. So why not just setting prefix to NULL then? 
> + > if (no_aggr) { > list_for_each_entry(counter, _list->entries, node) > - print_counter(counter); > + print_counter(counter, prefix); > } else { > list_for_each_entry(counter, _list->entries, node) > - print_counter_aggr(counter); > + print_counter_aggr(counter, prefix); > } > > if (!csv_output) { > @@ -925,7 +1033,7 @@ static volatile int signr = -1; > > static void skip_signal(int signo) > { > - if(child_pid == -1) > + if((child_pid == -1) || interval) Looks like it needs a whitespace :) > done = 1; > > signr = signo; > @@ -1145,6 +1253,8 @@ int cmd_stat(int argc, const char **argv, const char > *prefix __maybe_unused) > "command to run prior to the measured command"), > OPT_STRING(0, "post", _cmd, "command", > "command to run after to the measured command"), > + OPT_INTEGER('I', "interval-print", , > + "print counts at regular interval in ms (>= 100)"), > OPT_END() > }; > const char * const stat_usage[] = { > @@ -1245,12 +1355,23 @@ int cmd_stat(int argc, const char **argv, const char > *prefix __maybe_unused) > usage_with_options(stat_usage, options); > return -1; > } > + if (interval < 0 || (interval > 0 && interval < 100)) { > + pr_err("print interval must be >= 100ms\n"); > + usage_with_options(stat_usage, options); > + return -1; > + } How about making 'interval' unsigned and simplify the condition a bit: if (interval && interval < 100) { ... } Thanks, Namhyung > > list_for_each_entry(pos, _list->entries, node) { > if (perf_evsel__alloc_stat_priv(pos) < 0 || > perf_evsel__alloc_counts(pos, perf_evsel__nr_cpus(pos)) < 0) > goto out_free_fd; > } > + if (interval) { > + list_for_each_entry(pos, _list->entries, node) { > + if (perf_evsel__alloc_prev_raw_counts(pos) < 0) > + goto out_free_fd; > + } > + } It's not about your patch, but I can't find where it frees evsel->counts - a counter part of perf_evsel__alloc_counts(). Seems we leak that? 
> > /* >* We dont want to block the signals - that would cause > @@ -1260,6 +1381,7 @@ int cmd_stat(int argc, const char **argv, const char > *prefix __maybe_unused) >*/ > atexit(sig_atexit); > signal(SIGINT, skip_signal); > + signal(SIGCHLD, skip_signal); > signal(SIGALRM,
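Two points from the review above can be tried outside of perf: the simplified interval check that becomes possible once the interval is unsigned, and the elapsed-time prefix printed in front of each line. The sketch below is a standalone illustration. interval_is_valid(), timespec_sub() and format_prefix() are hypothetical names, not functions from the patch, and the argument order of timespec_sub() is my own choice.

```c
#include <stdio.h>
#include <time.h>

/* 0 disables interval printing; otherwise at least 100ms is required. */
static int interval_is_valid(unsigned int interval_ms)
{
	return !(interval_ms && interval_ms < 100);
}

/* res = end - start, assuming end is not earlier than start. */
static void timespec_sub(const struct timespec *end,
			 const struct timespec *start,
			 struct timespec *res)
{
	res->tv_sec = end->tv_sec - start->tv_sec;
	res->tv_nsec = end->tv_nsec - start->tv_nsec;
	if (res->tv_nsec < 0) {
		/* borrow one second */
		res->tv_sec--;
		res->tv_nsec += 1000000000L;
	}
}

/* Build a "seconds.nanoseconds<sep>" prefix, same format string as the patch. */
static int format_prefix(char *buf, size_t len,
			 const struct timespec *elapsed, const char *sep)
{
	return snprintf(buf, len, "%lu.%09lu%s",
			(unsigned long)elapsed->tv_sec,
			(unsigned long)elapsed->tv_nsec, sep);
}
```

In use, the start time would be taken once with clock_gettime(CLOCK_MONOTONIC, ...) and the difference recomputed at every interval. Using snprintf() instead of sprintf() keeps the write within the 64-byte prefix buffer.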
Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()
On 01/20/2013 12:09 PM, Mike Galbraith wrote: > On Thu, 2013-01-17 at 13:55 +0800, Michael Wang wrote: >> Hi, Mike >> >> I've send out the v2, which I suppose it will fix the below BUG and >> perform better, please do let me know if it still cause issues on your >> arm7 machine. > > s/arm7/aim7 > > Someone swiped half of CPUs/ram, so the box is now 2 10 core nodes vs 4. > > stock scheduler knobs > > 3.8-wang-v2 avg 3.8-virgin > avgvs wang > Tasksjobs/min > 1 436.29435.66435.97435.97437.86441.69 > 440.09439.88 1.008 > 5 2361.65 2356.14 2350.66 2356.15 2416.27 2563.45 > 2374.61 2451.44 1.040 >10 4767.90 4764.15 4779.18 4770.41 4946.94 4832.54 > 4828.69 4869.39 1.020 >20 9672.79 9703.76 9380.80 9585.78 9634.34 9672.79 > 9727.13 9678.08 1.009 >4019162.06 19207.61 19299.36 19223.01 19268.68 19192.40 > 19056.60 19172.56 .997 >8037610.55 37465.22 37465.22 37513.66 37263.64 37120.98 > 37465.22 37283.28 .993 > 16069306.65 69655.17 69257.14 69406.32 69257.14 69306.65 > 69257.14 69273.64 .998 > 320 111512.36 109066.37 111256.45 110611.72 108395.75 107913.19 > 108335.20 108214.71 .978 > 640 142850.83 148483.92 150851.81 147395.52 151974.92 151263.65 > 151322.67 151520.41 1.027 > 128052788.89 52706.39 67280.77 57592.01 189931.44 189745.60 > 189792.02 189823.02 3.295 > 256075403.91 52905.91 45196.21 57835.34 217368.64 217582.05 > 217551.54 217500.74 3.760 > > sched_latency_ns = 24ms > sched_min_granularity_ns = 8ms > sched_wakeup_granularity_ns = 10ms > > 3.8-wang-v2 avg 3.8-virgin > avgvs wang > Tasksjobs/min > 1 436.29436.60434.72435.87434.41439.77 > 438.81437.66 1.004 > 5 2382.08 2393.36 2451.46 2408.96 2451.46 2453.44 > 2425.94 2443.61 1.014 >10 5029.05 4887.10 5045.80 4987.31 4844.12 4828.69 > 4844.12 4838.97 .970 >20 9869.71 9734.94 9758.45 9787.70 9513.34 9611.42 > 9565.90 9563.55 .977 >4019146.92 19146.92 19192.40 19162.08 18617.51 18603.22 > 18517.95 18579.56 .969 >8037177.91 37378.57 37292.31 37282.93 36451.13 36179.10 > 36233.18 36287.80 .973 > 16070260.87 69109.05 
69207.71 69525.87 68281.69 68522.97 > 68912.58 68572.41 .986 > 320 114745.56 113869.64 114474.62 114363.27 114137.73 114137.73 > 114137.73 114137.73 .998 > 640 164338.98 164338.98 164618.00 164431.98 164130.34 164130.34 > 164130.34 164130.34 .998 > 1280 209473.40 209134.54 209473.40 209360.44 210040.62 210040.62 > 210097.51 210059.58 1.003 > 2560 242703.38 242627.46 242779.34 242703.39 244001.26 243847.85 > 243732.91 243860.67 1.004 > > As you can see, the load collapsed at the high load end with stock > scheduler knobs (desktop latency). With knobs set to scale, the delta > disappeared. Thanks for the testing, Mike, please allow me to ask few questions. What are those tasks actually doing? what's the workload? And I'm confusing about how those new parameter value was figured out and how could them help solve the possible issue? Do you have any idea about which part in this patch set may cause the issue? One change by designed is that, for old logical, if it's a wake up and we found affine sd, the select func will never go into the balance path, but the new logical will, in some cases, do you think this could be a problem? > > I thought perhaps the bogus (shouldn't exist) CPU domain in mainline > somehow contributes to the strange behavioral delta, but killing it made > zero difference. All of these numbers for both trees were logged with > the below applies, but as noted, it changed nothing. The patch set was supposed to do accelerate by reduce the cost of select_task_rq(), so it should be harmless for all the conditions. Regards, Michael Wang > > From: Alex Shi > Date: Mon, 17 Dec 2012 09:42:57 +0800 > Subject: [PATCH 01/18] sched: remove SD_PERFER_SIBLING flag > > The flag was introduced in commit b5d978e0c7e79a. Its purpose seems > trying to fullfill one node first in NUMA machine via pulling tasks > from other nodes when the node has capacity. 
> > Its advantage is when few tasks share memories among them, pulling > together is helpful on locality, so has performance gain. The shortage > is it will keep unnecessary task migrations thrashing among different > nodes, that reduces the performance gain, and just hurt performance if > tasks has no memory cross. > > Thinking about the sched numa balancing patch is coming. The small >
Re: [PATCH v9 09/11] PCI, acpiphp: Don't bailout even no slots found yet.
> > If that's the case: > > Acked-by: Rafael J. Wysocki > > but please say something like this in the changelog: > > "The result returned by acpiphp_get_num_slots() is meaningless, because > the bridge the slots are under may be added after this function has been > called, so drop acpiphp_get_num_slots() and the code using it." Yes, I will add your input to the changelog. Thanks a lot
Re: sched: Consequences of integrating the Per Entity Load Tracking Metric into the Load Balancer
Hi Alex, Thank you very much for running the below benchmark on blocked_load+runnable_load:) Just a few queries. How did you do the wake up balancing? Did you iterate over the L3 package looking for an idle cpu? Or did you just query the L2 package for an idle cpu? I think when you are using blocked_load+runnable_load it would be better if we just query the L2 package as Vincent had pointed out because the fundamental behind using blocked_load+runnable_load is to keep a steady state across cpus unless we could reap the advantage of moving the blocked load to a sibling core when it wakes up. And the drop of performance is relative to what? 1.Your v3 patchset with runnable_load_avg in weighted_cpu_load(). 2.Your v3 patchset with runnable_load_avg+blocked_load_avg in weighted_cpu_load(). Are the above two what you are comparing? And in the above two versions have you included your [PATCH] sched: use instant load weight in burst regular load balance? On 01/20/2013 09:22 PM, Alex Shi wrote: The blocked load of a cluster will be high if the blocked tasks have run recently. The contribution of a blocked task will be divided by 2 each 32ms, so it means that a high blocked load will be made of recent running tasks and the long sleeping tasks will not influence the load balancing. The load balance period is between 1 tick (10ms for idle load balance on ARM) and up to 256 ms (for busy load balance) so a high blocked load should imply some tasks that have run recently otherwise your blocked load will be small and will not have a large influence on your load balance >> >> Just tried using cfs's runnable_load_avg + blocked_load_avg in >> weighted_cpuload() with my v3 patchset, aim9 shared workfile testing >> show the performance dropped 70% more on the NHM EP machine. :( >> > > Ops, the performance is still worse than just count runnable_load_avg. > But dropping is not so big, it dropped 30%, not 70%. 
Thank you Regards Preeti U Murthy
Re: Issues with "x86, um: switch to generic fork/vfork/clone" commit
On Sun, Jan 20, 2013 at 6:30 PM, Al Viro wrote: > > Neither do I, to be honest. It might be saving us a few cycles on > some architectures, but I'd like to see examples of that. amd64 > doesn't seem to be one, at least... I think that the inlining of the body should make it basically be pretty much free even on architectures that would want to do something about the casts. .. and thinking about it, the architectures that do actually generate code for casting to a narrower type should already have selected that HAVE_SYSCALL_WRAPPERS option anyway, so the only reason *not* to select it is for a n architecture that doesn't generate any extra code. And right now, that HAVE_SYSCALL_WRAPPERS does make it much harder to think about the header file changes. > FWIW, there's another bit of ugliness around that area - all these > #define __SC_BLAH3, etc., all of the same form. This stuff begs for > something like > #define __MAP1(m,t,a) m(t,a) > #define __MAP2(m,t,a,...) m(t,a) __MAP1(m,__VA_ARGS__) > #define __MAP3(m,t,a,...) m(t,a) __MAP2(m,__VA_ARGS__) > #define __MAP4(m,t,a,...) m(t,a) __MAP3(m,__VA_ARGS__) > #define __MAP5(m,t,a,...) m(t,a) __MAP4(m,__VA_ARGS__) > #define __MAP6(m,t,a,...) m(t,a) __MAP5(m,__VA_ARGS__) > #define __MAP(n,...) __MAP##n(__VA_ARGS__) > with __MAP(x,__SC_DECL,__VA_ARGS__) instead of __SC_DECL##x(__VA_ARGS__) > etc. in users... Well, I can see both sides. The above is the nice and dense declaration model with less duplication, but christ, it's hard for people to wrap their minds around unless they've seen it a million times. It really does take some getting used to, and the long-form can be easier to understand. That said, we have so many of those things now when it comes to the syscall stuff that the dense form seems to be called for just to be consistent. So go wild if you have the energy for it. I'm not going to pull that for 3.8, though. 
Linus
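For readers who have not seen the dense form a million times, here is a small standalone sketch of the mapping idea discussed above. It is a variation, not the proposed kernel macros: the names drop the leading underscores, only three levels are defined, and a comma is added between the expanded pairs so that the result is a valid parameter list. DECL and ARG are hypothetical per-pair macros.

```c
/* Apply macro m to each (type, name) pair, separated by commas. */
#define MAP1(m, t, a)      m(t, a)
#define MAP2(m, t, a, ...) m(t, a), MAP1(m, __VA_ARGS__)
#define MAP3(m, t, a, ...) m(t, a), MAP2(m, __VA_ARGS__)
#define MAP(n, ...)        MAP##n(__VA_ARGS__)

#define DECL(t, a) t a	/* (int, x) becomes "int x" */
#define ARG(t, a)  a	/* (int, x) becomes "x"     */

/* Expands to: static long sum3_impl(int x, long y, short z) */
static long sum3_impl(MAP(3, DECL, int, x, long, y, short, z))
{
	return x + y + z;
}

/* Same pair list, used once for the declaration and once for the call. */
static long sum3(MAP(3, DECL, int, x, long, y, short, z))
{
	return sum3_impl(MAP(3, ARG, int, x, long, y, short, z));
}
```

The same list of (type, name) pairs drives both the declaration and the forwarding call. That is the duplication the long-form per-count macros spell out by hand.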
Re: [PATCH v9 08/11] PCI, ACPI: debug print for installation of acpi root bridge's notifier
On Sun, Jan 20, 2013 at 3:00 PM, Rafael J. Wysocki wrote: > On Thursday, January 17, 2013 11:53:19 PM Yinghai Lu wrote: >> From: Tang Chen >> >> acpi_install_notify_handler() could fail. So check the exit status >> and give a better debug info. >> >> Signed-off-by: Tang Chen >> Signed-off-by: Yinghai Lu >> --- >> drivers/acpi/pci_root.c | 12 +--- >> 1 file changed, 9 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c >> index 3ce5d80..f3ceb61 100644 >> --- a/drivers/acpi/pci_root.c >> +++ b/drivers/acpi/pci_root.c >> @@ -762,6 +762,7 @@ static void handle_hotplug_event_root(acpi_handle >> handle, u32 type, >> static acpi_status __init >> find_root_bridges(acpi_handle handle, u32 lvl, void *context, void **rv) >> { >> + acpi_status status; >> char objname[64]; >> struct acpi_buffer buffer = { .length = sizeof(objname), >> .pointer = objname }; >> @@ -774,9 +775,14 @@ find_root_bridges(acpi_handle handle, u32 lvl, void >> *context, void **rv) >> >> acpi_get_name(handle, ACPI_FULL_PATHNAME, ); >> >> - acpi_install_notify_handler(handle, ACPI_SYSTEM_NOTIFY, >> - handle_hotplug_event_root, NULL); >> - printk(KERN_DEBUG "acpi root: %s notify handler installed\n", objname); >> + status = acpi_install_notify_handler(handle, ACPI_SYSTEM_NOTIFY, >> + handle_hotplug_event_root, NULL); >> + if (ACPI_FAILURE(status)) >> + printk(KERN_DEBUG "acpi root: %s notify handler is not >> installed, exit status: %u\n", > > Can you break that line, please? And use pr_debug()? Long line should be ok, and checkpatch.pl is not complaining about that. Also keep the complete print out in one line, could make git grep find that code exactly. Actually I really hate pr_debug(), that will make the generated code different with DEBUG defined or not. And need to end user to recompile kernel to get debug output if needed. 
Thanks Yinghai
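The objection to pr_debug() above is that its output depends on how the code was built. A rough userland model of that difference is sketched below. log_always() and log_debug() are hypothetical stand-ins for printk(KERN_DEBUG ...) and pr_debug(), the message counter exists only for the sake of the demonstration, and the kernel's dynamic debug facility is left out of the picture.

```c
#include <assert.h>
#include <stdio.h>

static int messages_emitted;	/* messages that actually reached the log */

/* Always compiled in, like the printk(KERN_DEBUG ...) call in the patch. */
#define log_always(...) (messages_emitted++, fprintf(stderr, __VA_ARGS__))

/* Present only when built with -DDEBUG, loosely modelled on pr_debug(). */
#ifdef DEBUG
#define log_debug(...) (messages_emitted++, fprintf(stderr, __VA_ARGS__))
#else
#define log_debug(...) ((void)0)
#endif

/* Returns how many messages this call emitted. */
static int report_install_failure(const char *objname, unsigned int status)
{
	int before = messages_emitted;

	assert(objname != NULL);
	log_always("acpi root: %s notify handler is not installed, exit status: %u\n",
		   objname, status);
	log_debug("acpi root: %s (debug-only detail)\n", objname);

	return messages_emitted - before;
}
```

Built without -DDEBUG, report_install_failure() emits a single message. Built with it, the same call emits two, so getting the extra output requires a rebuild.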
Re: [PATCH v9 07/11] PCI, acpiphp: Move and enhance hotplug support of pci host bridge
On Sun, Jan 20, 2013 at 2:55 PM, Rafael J. Wysocki wrote: > On Thursday, January 17, 2013 11:53:18 PM Yinghai Lu wrote: >> We have partial hot-add support in acpiphp driver, and it is confusing. >> >> Move host bridge hot-add support to pci_root.c, and keep acpiphp simple, >> also add hot-remove support in pci_root.c. >> >> How to test it: if sci_emu patch is applied, >> >> Find out root bus number to acpi root name mapping from dmesg or /sys >> >> echo "\_SB.PCIB 3" > /sys/kernel/debug/acpi/sci_notify >> to remove root bus >> >> echo "\_SB.PCIB 1" > /sys/kernel/debug/acpi/sci_notify >> to add back root bus >> >> -v2: put back pci_root_hp change in one patch >> -v3: add pcibios_resource_survey_bus() calling >> -v4: remove not needed code with remove_bridge >> -v5: put back support for acpiphp support for slots just on root bus. >> -v6: change some functions to *_p2p_* to make it more clean. >> -v7: split hot_added change out. >> -v8: Move to pci_root.c instead of adding another file requested by Bjorn. >> -v9: Fold three following patches into this one for easy review: >> a: Add missing hot_remove support for root device. >> b: Tang Chen noticed that hotplug through container will not update >> acpi_root_bridge list. After closely checking, we don't need >> that for struct for tracking and could use acpi_pci_root directly. >> c: Tang Chen found handle_root_bridge_removal is very similiar to >> acpi_bus_hot_remove_device(). Change to handle_root_bridge_removal >> to use acpi_bus_hot_remove_device. 
>>
>> Signed-off-by: Yinghai Lu
>> ---
>> drivers/acpi/pci_root.c| 139
>>
>> drivers/pci/hotplug/acpiphp_glue.c | 59 ---
>> 2 files changed, 154 insertions(+), 44 deletions(-)
>>
>> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
>> index bf5108a..3ce5d80 100644
>> --- a/drivers/acpi/pci_root.c
>> +++ b/drivers/acpi/pci_root.c
>> @@ -655,3 +655,142 @@ int __init acpi_pci_root_init(void)
>>
>> return 0;
>> }
>> +
>> +/* Support root bridge hotplug */
>> +
>> +static void handle_root_bridge_insertion(acpi_handle handle)
>> +{
>> + struct acpi_device *device, *pdevice;
>> + acpi_handle phandle;
>> + int ret_val;
>> +
>> + acpi_get_parent(handle, );
>> + if (acpi_bus_get_device(phandle, )) {
>> + printk(KERN_DEBUG "no parent device, assuming NULL\n");
>> + pdevice = NULL;
>> + }
>> + if (!acpi_bus_get_device(handle, )) {
>> + /* check if pci root_bus is removed */
>> + struct acpi_pci_root *root = acpi_driver_data(device);
>> + if (pci_find_bus(root->segment, root->secondary.start))
>> + return;
>> +
>> + printk(KERN_DEBUG "bus exists... trim\n");
>> + /* this shouldn't be in here, so remove
>> + * the bus then re-add it...
>> + */
>> + ret_val = acpi_bus_trim(device);
>
> You said that this followed acpiphp, but the purpose of the trimming in there
> seems to be to handle surprise removal and re-insertion, which I'm not sure is
> OK with something like a host bridge.

ok, will just bail out if it is there.

>
> The drawback is that if we have a spurious ACPI_NOTIFY_BUS_CHECK or
> ACPI_NOTIFY_DEVICE_CHECK, we'll be trying to remove the whole bus here in
> response. That doesn't sound quite right.
>
>> + printk(KERN_DEBUG "acpi_bus_trim return %x\n", ret_val);
>> + }
>> + if (acpi_bus_add(handle))
>> + printk(KERN_ERR "cannot add bridge to acpi list\n");
>> +}
>> +
>> +static void handle_root_bridge_removal(struct acpi_device *device)
>> +{
>> + struct acpi_eject_event *ej_event;
>> +
>> + ej_event = kmalloc(sizeof(*ej_event), GFP_KERNEL);
>> + if (!ej_event)
>
> Shouldn't we do acpi_evaluate_hotplug_ost() here?

ok. will add

	/* Inform firmware the hot-remove operation has error */
	(void) acpi_evaluate_hotplug_ost(device->handle,
			ACPI_NOTIFY_EJECT_REQUEST,
			ACPI_OST_SC_NON_SPECIFIC_FAILURE,
			NULL);

before return.

>
>> + return;
>> +
>> + ej_event->device = device;
>> + ej_event->event = ACPI_NOTIFY_EJECT_REQUEST;
>> +
>> + acpi_bus_hot_remove_device(ej_event);
>> +}
>> +
>> +static void _handle_hotplug_event_root(struct work_struct *work)
>> +{
>> + struct acpi_pci_root *root;
>> + char objname[64];
>> + struct acpi_buffer buffer = { .length = sizeof(objname),
>> + .pointer = objname };
>> + struct acpi_hp_work *hp_work;
>> + acpi_handle handle;
>> + u32 type;
>> +
>> + hp_work = container_of(work, struct acpi_hp_work, work);
>> + handle = hp_work->handle;
>> + type = hp_work->type;
>> +
>> + root =
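
The change agreed on for handle_root_bridge_removal() (report the failure to firmware before returning when the eject event cannot be allocated) can be sketched in userspace C. Everything prefixed `sketch_` or `SKETCH_` is a hypothetical stand-in: the report function merely records what a call to acpi_evaluate_hotplug_ost() would have reported, the constants are arbitrary placeholders, and the allocator is passed in only so that the failure path can be exercised.

```c
#include <stdlib.h>

/* Placeholder values; not the real ACPI constants. */
enum { SKETCH_EJECT_REQUEST = 3, SKETCH_OST_FAILURE = 1 };

struct sketch_device {
	int handle;
	int ost_reports;      /* how many status reports were sent */
	int last_ost_status;  /* status code of the last report */
};

struct sketch_eject_event {
	struct sketch_device *device;
	int event;
};

/* Stand-in for acpi_evaluate_hotplug_ost(): records the report. */
static void sketch_report_ost(struct sketch_device *dev, int event, int status)
{
	(void)event;
	dev->ost_reports++;
	dev->last_ost_status = status;
}

typedef void *(*sketch_alloc_fn)(size_t);

/* Allocator that always fails, to drive the error path. */
static void *sketch_failing_alloc(size_t n)
{
	(void)n;
	return NULL;
}

/* Returns 0 when the eject event was built, -1 on allocation failure. */
static int sketch_handle_removal(struct sketch_device *dev, sketch_alloc_fn alloc)
{
	struct sketch_eject_event *ev = alloc(sizeof(*ev));

	if (!ev) {
		/* Tell "firmware" the hot-remove failed before bailing out. */
		sketch_report_ost(dev, SKETCH_EJECT_REQUEST, SKETCH_OST_FAILURE);
		return -1;
	}
	ev->device = dev;
	ev->event = SKETCH_EJECT_REQUEST;
	/* The patch hands the event to acpi_bus_hot_remove_device();
	 * this sketch just releases it. */
	free(ev);
	return 0;
}
```

Calling `sketch_handle_removal(&dev, sketch_failing_alloc)` leaves exactly one failure report on the device, while passing `malloc` takes the normal path and reports nothing.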
Re: Issues with "x86, um: switch to generic fork/vfork/clone" commit
On Sun, Jan 20, 2013 at 05:40:28PM -0800, Linus Torvalds wrote:
> On Sun, Jan 20, 2013 at 5:22 PM, Al Viro wrote:
> >
> > Anyway, that's a separate story - semctl(2) is going to be ugly, no matter
> > what we do, but the rest of those guys doesn't have to. How about the
> > following (completely untested):
>
> Hmm. Looks like the RightThing(tm) to me.
>
> The thing that stands out that I question the value of that
> HAVE_SYSCALL_WRAPPERS thing. Is there any reason we don't just make
> all architectures use it? What's the downside? I'm not sure I see the
> point of the non-wrapper version.

Neither do I, to be honest. It might be saving us a few cycles on some
architectures, but I'd like to see examples of that. amd64 doesn't seem
to be one, at least...

FWIW, there's another bit of ugliness around that area - all these
#define __SC_BLAH3, etc., all of the same form. This stuff begs for
something like

#define __MAP1(m,t,a) m(t,a)
#define __MAP2(m,t,a,...) m(t,a) __MAP1(m,__VA_ARGS__)
#define __MAP3(m,t,a,...) m(t,a) __MAP2(m,__VA_ARGS__)
#define __MAP4(m,t,a,...) m(t,a) __MAP3(m,__VA_ARGS__)
#define __MAP5(m,t,a,...) m(t,a) __MAP4(m,__VA_ARGS__)
#define __MAP6(m,t,a,...) m(t,a) __MAP5(m,__VA_ARGS__)
#define __MAP(n,...) __MAP##n(__VA_ARGS__)

with __MAP(x,__SC_DECL,__VA_ARGS__) instead of __SC_DECL##x(__VA_ARGS__)
etc. in users...
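
A minimal userspace sketch of the __MAP idea follows. Two things are assumptions of the sketch rather than part of the mail above: the macros are renamed with a `SKETCH_` prefix to stay out of the reserved identifier namespace, and a comma is added between the mapped items so that the expansion forms a valid parameter or argument list. `SKETCH_DECL`, `SKETCH_CAST` and the two functions are hypothetical illustrations.

```c
/* Apply macro m to each (type, name) pair of a flat argument list. */
#define SKETCH_MAP1(m, t, a)      m(t, a)
#define SKETCH_MAP2(m, t, a, ...) m(t, a), SKETCH_MAP1(m, __VA_ARGS__)
#define SKETCH_MAP3(m, t, a, ...) m(t, a), SKETCH_MAP2(m, __VA_ARGS__)
#define SKETCH_MAP4(m, t, a, ...) m(t, a), SKETCH_MAP3(m, __VA_ARGS__)
#define SKETCH_MAP5(m, t, a, ...) m(t, a), SKETCH_MAP4(m, __VA_ARGS__)
#define SKETCH_MAP6(m, t, a, ...) m(t, a), SKETCH_MAP5(m, __VA_ARGS__)
#define SKETCH_MAP(n, ...)        SKETCH_MAP##n(__VA_ARGS__)

/* Per-pair macros: one builds a declaration, one a cast argument. */
#define SKETCH_DECL(t, a) t a
#define SKETCH_CAST(t, a) (long)a

/* Inner function that takes every argument widened to long. */
static long sketch_add3_impl(long a, long b, long c)
{
	return a + b + c;
}

/* Parameter list expands to: int x, short y, char z */
static long sketch_add3(SKETCH_MAP(3, SKETCH_DECL, int, x, short, y, char, z))
{
	/* Argument list expands to: (long)x, (long)y, (long)z */
	return sketch_add3_impl(SKETCH_MAP(3, SKETCH_CAST, int, x, short, y, char, z));
}
```

One family of numbered macros then serves every per-argument transformation, instead of a separate numbered set for each one (declarations, casts, and so on).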
Re: [PATCH] udf: add extent cache support in case of file reading
2013/1/19, Cong Ding :
> On Sat, Jan 19, 2013 at 11:17:14AM +0900, Namjae Jeon wrote:
>> From: Namjae Jeon
>>
>> This patch implements extent caching in case of file reading.
>> While reading a file, currently, UDF reads metadata serially
>> which takes a lot of time depending on the number of extents present
>> in the file. Caching last accessed extent improves metadata read time.
>> Instead of reading file metadata from start, now we read from
>> the cached extent.
>>
>> This patch considerably improves the time spent by CPU in kernel mode.
>> For example, while reading a 10.9 GB file using dd:
>> Time before applying patch:
>> 11677022208 bytes (10.9GB) copied, 1529.748921 seconds, 7.3MB/s
>> real 25m 29.85s
>> user 0m 12.41s
>> sys 15m 34.75s
>>
>> Time after applying patch:
>> 11677022208 bytes (10.9GB) copied, 1469.338231 seconds, 7.6MB/s
>> real 24m 29.44s
>> user 0m 15.73s
>> sys 3m 27.61s
> did you have any test on lots of small files?

Hi, Cong.

I created 2048 files of 512KB each to check whether the extent cache
feature causes a performance drop, and used this script to read every file:

index=0
while [ $index != 2048 ]
do
	dd if=file.$index of=/dev/zero 1> /dev/null 2>/dev/null
	index=$(($index + 1))
done

Performance without patch:
VDLinux#> echo 3 > /proc/sys/vm/drop_caches
VDLinux#> time ./script2.sh
real 0m 55.13s
user 0m 1.40s
sys 0m 25.17s

Performance with patch:
VDLinux#> time ./script2.sh
real 0m 53.70s
user 0m 1.60s
sys 0m 25.11s

I cannot find any performance drop with the extent cache patch.

Thanks.
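
The idea behind the patch (resume the extent walk from the last accessed extent instead of from the start of the file) can be sketched in plain C. This is a simplified model, not the UDF code: the `sketch_*` types and function are hypothetical, locking is omitted, and the only detail taken from the patch is that `lstart == -1` marks the cache as invalid.

```c
#include <stddef.h>

/* Simplified model: a file is an array of extents, each len blocks long. */
struct sketch_extent {
	unsigned long len;
};

struct sketch_cache {
	long lstart;   /* first logical block of the cached extent, -1 = invalid */
	size_t index;  /* position of the cached extent in the array */
};

/* Returns the index of the extent that holds `block`, or -1 if the block
 * lies past the last extent. *steps receives the number of extents visited. */
static long sketch_lookup(const struct sketch_extent *ext, size_t n,
			  struct sketch_cache *c, unsigned long block,
			  unsigned long *steps)
{
	unsigned long start = 0;
	size_t i = 0;

	/* Resume from the cached extent when the target is not before it. */
	if (c->lstart >= 0 && block >= (unsigned long)c->lstart) {
		start = (unsigned long)c->lstart;
		i = c->index;
	}

	for (*steps = 0; i < n; start += ext[i].len, i++) {
		(*steps)++;
		if (block < start + ext[i].len) {
			/* Remember this extent for the next lookup. */
			c->lstart = (long)start;
			c->index = i;
			return (long)i;
		}
	}
	return -1;
}
```

A sequential read keeps hitting the cached extent or the one after it, so each lookup visits one or two extents rather than every extent before the target. Any operation that changes the extent layout has to reset `lstart` to -1, which is the role udf_clear_extent_cache() plays in the quoted diff.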
>
> - cong
>>
>> Signed-off-by: Namjae Jeon
>> Signed-off-by: Ashish Sangwan
>> Signed-off-by: Bonggil Bak
>> ---
>> fs/udf/ialloc.c |4 +++
>> fs/udf/inode.c | 79
>> +-
>> fs/udf/udf_i.h | 16 +++
>> fs/udf/udfdecl.h | 10 +++
>> 4 files changed, 98 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/udf/ialloc.c b/fs/udf/ialloc.c
>> index 7e5aae4..0cb208e 100644
>> --- a/fs/udf/ialloc.c
>> +++ b/fs/udf/ialloc.c
>> @@ -117,6 +117,10 @@ struct inode *udf_new_inode(struct inode *dir, umode_t mode, int *err)
>> iinfo->i_lenAlloc = 0;
>> iinfo->i_use = 0;
>> iinfo->i_checkpoint = 1;
>> +memset(>cached_extent, 0, sizeof(struct udf_ext_cache));
>> +spin_lock_init(&(iinfo->i_extent_cache_lock));
>> +/* Mark extent cache as invalid for now */
>> +iinfo->cached_extent.lstart = -1;
>> if (UDF_QUERY_FLAG(inode->i_sb, UDF_FLAG_USE_AD_IN_ICB))
>> iinfo->i_alloc_type = ICBTAG_FLAG_AD_IN_ICB;
>> else if (UDF_QUERY_FLAG(inode->i_sb, UDF_FLAG_USE_SHORT_AD))
>> diff --git a/fs/udf/inode.c b/fs/udf/inode.c
>> index e78ef48..86e0469 100644
>> --- a/fs/udf/inode.c
>> +++ b/fs/udf/inode.c
>> @@ -91,6 +91,7 @@ void udf_evict_inode(struct inode *inode)
>> }
>> kfree(iinfo->i_ext.i_data);
>> iinfo->i_ext.i_data = NULL;
>> +udf_clear_extent_cache(iinfo);
>> if (want_delete) {
>> udf_free_inode(inode);
>> }
>> @@ -106,6 +107,7 @@ static void udf_write_failed(struct address_space *mapping, loff_t to)
>> truncate_pagecache(inode, to, isize);
>> if (iinfo->i_alloc_type != ICBTAG_FLAG_AD_IN_ICB) {
>> down_write(>i_data_sem);
>> +udf_clear_extent_cache(iinfo);
>> udf_truncate_extents(inode);
>> up_write(>i_data_sem);
>> }
>> @@ -373,7 +375,7 @@ static int udf_get_block(struct inode *inode, sector_t block,
>> iinfo->i_next_alloc_goal++;
>> }
>>
>> -
>> +udf_clear_extent_cache(iinfo);
>> phys = inode_getblk(inode, block, , );
>> if (!phys)
>> goto abort;
>> @@ -1172,6 +1174,7 @@ set_size:
>> } else {
>> if (iinfo->i_alloc_type == ICBTAG_FLAG_AD_IN_ICB) {
>> down_write(>i_data_sem);
>> +udf_clear_extent_cache(iinfo);
>> memset(iinfo->i_ext.i_data + iinfo->i_lenEAttr + newsize,
>> 0x00, bsize - newsize -
>> udf_file_entry_alloc_offset(inode));
>> @@ -1185,6 +1188,7 @@ set_size:
>> if (err)
>> return err;
>> down_write(>i_data_sem);
>> +udf_clear_extent_cache(iinfo);
>> truncate_setsize(inode, newsize);
>> udf_truncate_extents(inode);
>> up_write(>i_data_sem);
>> @@ -1302,6 +1306,9 @@ static void udf_fill_inode(struct inode *inode, struct buffer_head *bh)
>> iinfo->i_lenAlloc = 0;
>> iinfo->i_next_alloc_block = 0;
>> iinfo->i_next_alloc_goal = 0;
>> +memset(>cached_extent, 0, sizeof(struct udf_ext_cache));
>> +spin_lock_init(&(iinfo->i_extent_cache_lock));
>> +iinfo->cached_extent.lstart = -1;
>> if (fe->descTag.tagIdent == cpu_to_le16(TAG_IDENT_EFE)) {
>>