Re: [PATCH] x86/mce: Rework cmci_rediscover() to play well with CPU hotplug
Looks lots cleaner. Applied on top of 3.9-rc3 and took it for a spin, offlining and onlining cpus at random intervals. The first time round I saw a few splats like the one below, but after a reboot I can no longer reproduce them.

-Tony

INFO: task devkit-power-da:19861 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
devkit-power-da D 8810114cd048 0 19861 1 0x0080
 88101063b828 0082 8810114cca80 88101063bfd8 88101063bfd8 88101063bfd8 88081fa18040 8810114cca80 88101063b808 88080c0cf800 88101063b8b0
Call Trace:
 [] schedule+0x29/0x70
 [] usb_kill_urb+0x85/0xc0
 [] ? wake_up_bit+0x40/0x40
 [] usb_start_wait_urb+0xd8/0x160
 [] usb_control_msg+0xcc/0x110
 [] ? usb_get_status+0x43/0xc0
 [] usb_get_status+0x83/0xc0
 [] ? usb_set_device_state+0x127/0x160
 [] usb_port_resume+0x2c6/0x630
 [] ? __switch_to+0x181/0x4a0
 [] generic_resume+0x15/0x30
 [] usb_resume_both+0x105/0x150
 [] usb_runtime_resume+0x1a/0x20
 [] __rpm_callback+0x31/0x90
 [] rpm_callback+0x2f/0x90
 [] rpm_resume+0x40c/0x670
 [] ? wake_up_bit+0x40/0x40
 [] __pm_runtime_resume+0x5c/0x90
 [] usb_autoresume_device+0x29/0x60
 [] usbdev_open+0x110/0x210
 [] chrdev_open+0x9c/0x180
 [] do_dentry_open+0x20f/0x2c0
 [] ? cdev_put+0x30/0x30
 [] finish_open+0x35/0x50
 [] do_last+0x6de/0xde0
 [] ? inode_permission+0x18/0x50
 [] ? link_path_walk+0x78/0x880
 [] path_openat+0xb7/0x4a0
 [] do_filp_open+0x41/0xa0
 [] ? __alloc_fd+0x42/0x110
 [] do_sys_open+0xf4/0x1e0
 [] ? do_notify_resume+0x59/0x80
 [] sys_open+0x21/0x30
 [] system_call_fastpath+0x16/0x1b
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V3] ia64/mm: fix a bad_page bug when crash kernel booting
Building linux-next today (tag next-20130212) I get the following errors when building arch/ia64/configs/{tiger_defconfig, zx1_defconfig, bigsur_defconfig, sim_defconfig}:

arch/ia64/mm/init.c: In function 'free_initrd_mem':
arch/ia64/mm/init.c:215: error: 'max_addr' undeclared (first use in this function)
arch/ia64/mm/init.c:215: error: (Each undeclared identifier is reported only once
arch/ia64/mm/init.c:215: error: for each function it appears in.)
arch/ia64/mm/init.c:216: error: implicit declaration of function 'GRANULEROUNDDOWN'

with "git blame" saying that these lines in init.c were added/changed by:

commit 5a54b4fb8f554b15c6113e30ca8412b7fe11c62e
Author: Xishi Qiu
Date: Thu Feb 7 12:25:59 2013 +1100

    ia64/mm: fix a bad_page bug when crash kernel booting

-Tony
Re: [PATCH V3] ia64/mm: fix a bad_page bug when crash kernel booting
On Tue, Feb 12, 2013 at 4:19 PM, Andrew Morton wrote:
> But, umm, why am I sitting here trying to maintain an ia64 bugfix and
> handling bug reports from the ia64 maintainer? Wanna swap?

That sounds like a plan. I'll look out for a new version with the missing #include and less silly global variable names and try to take it before you pull it into -mm.

-Tony
Re: [PATCH 4/9] ia64: cpufreq: move cpufreq driver to drivers/cpufreq
On Mon, Apr 1, 2013 at 5:49 PM, Viresh Kumar wrote:
> For now, your Ack will work :)

Acked-by: Tony Luck
Re: [PATCH v3 17/22] x86, ACPI, numa, ia64: split SLIT handling out
On Thu, Apr 4, 2013 at 4:46 PM, Yinghai Lu wrote:
> It should not break ia64 by replacing acpi_numa_init with
> acpi_numa_init_srat/acpi_numa_init_slit/acpi_num_arch_fixup.

You are right - it doesn't break ia64. All my test configs still build. Machines both with and without NUMA still boot and nothing strange happens.

Tested-by: Tony Luck
Re: [PATCH V3] ia64/mm: fix a bad_page bug when crash kernel booting
Foolishly sent an earlier reply from Outlook which appears to have mangled/lost it. Trying again ...

> In efi_init() memory aligns in IA64_GRANULE_SIZE(16M). If set
> "crashkernel=1024M-:600M"

Is this where the real problem begins? Should we insist that users provide crashkernel parameters rounded to GRANULE boundaries?

-Tony
Re: [PATCH 1/2] sched: move RR_TIMESLICE from sysctl.h to rt.h
On Wed, Feb 20, 2013 at 7:19 AM, Clark Williams wrote:
> Signed-off-by: Clark Williams
> ---

This happens to unbreak the ia64 build which is currently grumbling about:

arch/ia64/kernel/init_task.c:38: error: 'RR_TIMESLICE' undeclared here (not in a function)

So I'd be happy if it got applied directly to Linus tree before I get too big of a bisection gap.

Acked-by: Tony Luck

>  include/linux/sched/rt.h | 6 ++
>  include/linux/sched/sysctl.h | 6 --
>  2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
> index 94e19ea..440434d 100644
> --- a/include/linux/sched/rt.h
> +++ b/include/linux/sched/rt.h
> @@ -55,4 +55,10 @@ static inline bool tsk_is_pi_blocked(struct task_struct *tsk)
>  extern void normalize_rt_tasks(void);
>
>
> +/*
> + * default timeslice is 100 msecs (used only for SCHED_RR tasks).
> + * Timeslices get refilled after they expire.
> + */
> +#define RR_TIMESLICE (100 * HZ / 1000)
> +
>  #endif /* _SCHED_RT_H */
> diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
> index d2bb0ae..bf8086b 100644
> --- a/include/linux/sched/sysctl.h
> +++ b/include/linux/sched/sysctl.h
> @@ -91,12 +91,6 @@ extern unsigned int sysctl_sched_cfs_bandwidth_slice;
>  extern unsigned int sysctl_sched_autogroup_enabled;
>  #endif
>
> -/*
> - * default timeslice is 100 msecs (used only for SCHED_RR tasks).
> - * Timeslices get refilled after they expire.
> - */
> -#define RR_TIMESLICE (100 * HZ / 1000)
> -
>  extern int sched_rr_timeslice;
>
>  extern int sched_rr_handler(struct ctl_table *table, int write,
> --
> 1.8.1.2
Re: [PATCH 1/2] sched: move RR_TIMESLICE from sysctl.h to rt.h
On Wed, Feb 20, 2013 at 9:50 AM, Ingo Molnar wrote:
> Hm, didn't it get fixed via the commit below?

Together with moving RR_TIMESLICE to rt.h ... ia64 is good. But I see commit 77852fea6e24 in the tree I built and still see the RR_TIMESLICE errors. I don't see the MAX_PRIO half of the problem - so it did help a bit.

-Tony
Re: [PATCH V3] ia64/mm: fix a bad_page bug when crash kernel booting
On Tue, Feb 19, 2013 at 5:38 PM, Xishi Qiu wrote:
> Seems like a good idea, should we modify
> "\linux\Documentation\kernel-parameters.txt"?

Perhaps in Documentation/kdump/kdump.txt (which the crashkernel entry in kernel-parameters.txt points at). The ia64 section of kdump.txt notes that the start address will be rounded up to a GRANULE boundary, but doesn't talk about restrictions on the size.

I wonder if any other architectures have alignment restrictions on the addresses in "crashkernel" parameters? Does x86 like them to be 2MB aligned?

Second question is whether we should check and warn in parse_crashkernel_mem()? I think the answer is "yes" (since the consequences of getting this wrong don't show up till much later, and the errors aren't all that obviously connected back to the original mistake). Perhaps each architecture that cares could provide defines:

#define ARCH_CRASH_KERNEL_START_ALIGN (... arch value here ...)
#define ARCH_CRASH_KERNEL_SIZE_ALIGN (... arch value here ...)

[Suggestion provided mostly to provoke somebody to provide a more elegant solution]

-Tony
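
[Illustration] The check-and-warn idea above could look roughly like the user-space sketch below. The two alignment values (16M, mirroring the ia64 granule size mentioned earlier in this thread) and the helper name crash_region_aligned() are assumptions made up for the example; nothing here is existing kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-arch values; 16M mirrors the ia64 granule size. */
#define ARCH_CRASH_KERNEL_START_ALIGN (16ULL << 20)
#define ARCH_CRASH_KERNEL_SIZE_ALIGN  (16ULL << 20)

/* Hypothetical helper: true when both start and size are suitably aligned. */
static bool crash_region_aligned(uint64_t start, uint64_t size)
{
	return (start % ARCH_CRASH_KERNEL_START_ALIGN) == 0 &&
	       (size % ARCH_CRASH_KERNEL_SIZE_ALIGN) == 0;
}
```

With these values a 600M size fails the check (600 is not a multiple of 16), while 608M passes, so a parser using such a helper could warn at the point where the parameter is read.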
Re: [PATCH v2] x86/mce: Honour bios-set CMCI threshold
On Mon, Sep 10, 2012 at 10:31 PM, Naveen N. Rao wrote:
> +	if (mce_bios_cmci_threshold)
> +		printk_once(KERN_INFO
> +			"bios_cmci_threshold: Using bios-set threshold values for CMCI");

Do we really need this message? The user knows whether they gave the command line option or not (and can check in /proc/cmdline if they forgot whether they did). If it is needed, then you should add a "\n" to it.

> +	if (mce_bios_cmci_threshold && bios_wrong_thresh) {
> +		printk_once(KERN_INFO
> +			"bios_cmci_threshold: Some banks do not have valid thresholds set");
> +		printk_once(KERN_INFO
> +			"bios_cmci_threshold: Make sure your BIOS supports this boot option");

Also need "\n"

-Tony
Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y
> It is legal to access per-cpu data as early as you like,
> it just evaluates to the static copy in the per-cpu section
> of the kernel image until the per-cpu areas are setup.

On ia64 per-cpu variables are mapped into the top 64K of the address space. Accessing them before the resources to handle the access to that virtual address have been set up would cause problems.

-Tony
Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y
> That's right. I thought you guys had something that would handle that
> early on, but looking at how the trick works in the vmlinux.lds.S ia64
> uses that isn't the case.

We try to get things set up pretty early ... but I agree this is fragile. Adding code to printk() to not provide a timestamp before some safe point in boot is a workaround to the current problem. But it may come back to haunt us if other per-cpu data is added that needs to be accessed early during boot.

There are some changes going on at the moment on how we allocate the space for the per-cpu area. It is likely that for a non-boot cpu we might be able to get everything that we need for per-cpu access to work done in head.S before we can get to any C code. Boot cpu may be harder unless we statically allocate space for its per-cpu area in vmlinux.lds.S.

I'll take a closer look at what is needed tomorrow.

-Tony
Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y
On Wed, Feb 13, 2008 at 7:47 PM, Roland Dreier <[EMAIL PROTECTED]> wrote:
> The strange thing is that Ingo's patch to make cpu_clock() a NOP until
> after sched_init() didn't fix things for me...

Very strange. I threw an output line counter into the printk() code ... if I disable the timestamps for the first 30 lines, then everything is good (so the basic timestamping code does still work on ia64). But I would have thought that Ingo's delay until sched_init() ought to be long enough too.

Clearly I need to figure out exactly what needs to be initialized to prevent the hang/crash.

-Tony
Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y
> We *ought* to be safe after cpu_init() ... which is called from setup_arch(),
> which is several calls before sched_init().

Perhaps what is happening is that cpu0 comes online ... safely skips over the early printk calls. Calls cpu_init() which sets up the resources *it* needs (ar.k3 points to per-cpu space), and then executes sched_init() which marks it safe for all printk's. Then cpu1 comes up and does a printk before it gets to cpu_init().

Try with Ingo's patch and CONFIG_SMP=n to see if you can come up on a uni-processor.

-Tony
Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts
On Thu, Nov 1, 2012 at 4:47 AM, Mauro Carvalho Chehab wrote:
> Take a look at arch/x86/kernel/cpu/mcheck/mce-apei.c:
>
> void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err *mem_err)
> {
>	struct mce m;
>
>	/* Only corrected MC is reported */
>	if (!corrected || !(mem_err->validation_bits &
>				CPER_MEM_VALID_PHYSICAL_ADDRESS))
>		return;
>
>	mce_setup(&m);
>	m.bank = 1;
>	/* Fake a memory read corrected error with unknown channel */
>	m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | 0x9f;
>	m.addr = mem_err->physical_addr;
>	mce_log(&m);
>	mce_notify_irq();
> }
>
> Bank information there is fake; status is fake. Only addr is really filled
> there; it works only for corrected errors.

This went in like this to help out the Westmere-EX processors that didn't fill out MCi_ADDR for corrected errors. APEI could get the address from some platform CSRs ... reporting via /dev/mcelog so that predictive analysis in mcelog(8) would work on these machines.

I don't think we can rip it out yet ... not until those machines are shuffled off to recycle heaven.

But perhaps we should get smarter about which machines we enable APEI on? If we get everything we need from the machine check banks, then the detour via the BIOS to report the same thing again isn't helpful.

-Tony
Re: [next:akpm 129/309] net/core/sock.c:274:36: error: initializer element is not constant
On Tue, Jul 24, 2012 at 10:10 PM, James Bottomley wrote:
>> Here is the line in sock.i:
>>
>> struct static_key memalloc_socks = ((struct static_key) { .enabled =
>> ((atomic_t) { (0) }) });
>
> The above line contains two compound literals. It also uses a designated
> initializer to initialize the field enabled. A compound literal is not a
> constant expression.

Seeing the same thing on ia64 building next-20120726. Same fix works for me ... so I'll steal this whole changelog and attributes.

-Tony
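
[Illustration] The fix itself is not quoted in this message, so the sketch below only shows the general shape of an initializer that avoids the problem: plain braces instead of compound literals. The type definitions are simplified stand-ins and the variable name is made up for the example; they are assumptions, not the kernel's definitions.

```c
/* Simplified stand-ins for the kernel types; layouts are assumptions. */
typedef struct { int counter; } atomic_t;
struct static_key { atomic_t enabled; };

/*
 * A brace-enclosed initializer with constant members is a valid static
 * initializer, unlike
 *   ((struct static_key) { .enabled = ((atomic_t) { (0) }) })
 * which wraps the same values in two compound literals.
 */
struct static_key memalloc_socks_example = { .enabled = { 0 } };
```

Whether a compiler accepts the compound-literal form as a static initializer depends on the compiler and version, which fits the error appearing on some builds and not others.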
Re: [RFC PATCH v5 05/19] memory-hotplug: check whether memory is present or not
On Fri, Jul 27, 2012 at 3:28 AM, Wen Congyang wrote:
> +static inline int pfns_present(unsigned long pfn, unsigned long nr_pages)
> +{
> +	int i;
> +	for (i = 0; i < nr_pages; i++) {
> +		if (pfn_present(pfn + 1))

Typo? I think you meant "pfn + i"

> +			continue;
> +		else
> +			return -EINVAL;
> +	}
> +	return 0;
> +}

-Tony
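
[Illustration] A user-space sketch of the loop with the suggested "pfn + i" in place. pfn_present_stub() is a made-up stand-in for the kernel's pfn_present() (it pretends pfns 0..99 exist), and the loop index is widened to unsigned long to match nr_pages; both are choices made for this example rather than part of the posted patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's pfn_present(): pfns 0..99 exist. */
static bool pfn_present_stub(unsigned long pfn)
{
	return pfn < 100;
}

/* Returns 0 if every pfn in [pfn, pfn + nr_pages) is present, else -EINVAL. */
static int pfns_present(unsigned long pfn, unsigned long nr_pages)
{
	unsigned long i;

	for (i = 0; i < nr_pages; i++) {
		if (!pfn_present_stub(pfn + i))
			return -EINVAL;
	}
	return 0;
}
```

With the original "pfn + 1", a call such as pfns_present(98, 3) would test pfn 99 three times and report success even though pfn 100 is absent; with "pfn + i" it returns -EINVAL.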
Re: [PATCH v3 3/3] mce, acpi/apei: Soft-offline a page on firmware GHES notification
On Wed, Jul 3, 2013 at 8:40 AM, Naveen N. Rao wrote:
>>> +#ifdef CONFIG_ACPI_APEI_MEMORY_FAILURE
>>> +	int sec_sev = ghes_severity(gdata->error_severity);
>>> +	struct cper_sec_mem_err *mem_err;
>>> +	mem_err = (struct cper_sec_mem_err *)(gdata+1);
>>
>> A newline here please. Also, spaces around '+'.

I was off on vacation last week - looks like you got lots done without me :-)

I have parts 1 & 2 applied to an internal tree. Looks like parts 3 & 4 need a few final polishes to get an Ack from Boris.

-Tony
Re: [GIT PULL] EFI changes for v3.11
On Mon, Jul 8, 2013 at 11:36 AM, H. Peter Anvin wrote:
>> I had hoped to have this patch follow in the same path that the
>> one that changed the types and introduced the warnings took ...
>> but since that didn't work perhaps I should just ask Linus to pull
>> it from my ia64 tree.
>>
>
> I can push it, although it seems a bit odd to me to push an ia64-only
> patch through the x86 tree.
>
> Let me know what you prefer.

I've sent Linus a "please pull" for this from my ia64 tree.

-Tony
Re: ia64: dmi.h: Make dmi_alloc use kzalloc
On Tue, Jul 9, 2013 at 10:13 AM, Joe Perches wrote:
> x86/ia64 have a slight mismatch in dmi_alloc as
> x86 does a memset(0), and ia64 just does kmalloc.
>
> Make the ia64 dmi_alloc match the x86 style.
>
> Signed-off-by: Joe Perches

Applied. Thanks Joe.

-Tony
Re: [PATCH v3 3/3] mce, acpi/apei: Soft-offline a page on firmware GHES notification
>> Signed-off-by: Naveen N. Rao
>
> Acked-by: Borislav Petkov

Applied-by: Tony Luck :-)

Naveen: Thanks for having this idea, implementing it, and sticking with it through the review process.

Once 3.11-rc1 is out I'll ask Ingo to pull this series to the tip tree ... and then on to 3.12

-Tony
Re: [PATCH] lockref: remove cpu_relax() again
No new Itanium numbers yet ... but I did wonder how this works on multi-socket x86 ... so I tweaked "t.c" to increase threads to 64 to max out my 4-socket Xeon E5-4650 (8 cores/socket, 2 threads/core) and also print out the individual scores from each thread.

$ ./t /tmp 64
389827 717666 1540293 130764 681839 33357 606966 33716
33183 33230 69685 76422 352851 34940 257132 34192
34200 34098 34053 34459 234399 33678 241571 545912
620857 65818 32853 739440 33697 683655 741366 36208
385775 446198 45974 33056 403944 717415 254782 166754
702745 43661 1042180 437367 43751 503342 154223 706917
878167 43802 51667 660875 33261 522425 33627 33637
33446 33604 52963 33688 406088 551690 446474 33289
Threads = 64 Total loops: 19109114

Individual thread performance varies from 32853 to 1540293. A factor of 46.9.

Sometimes it is good to sacrifice fairness for throughput. But wow!

Running for longer [ s/sleep(10)/sleep(300) ] gave things a chance to even out - but I still see a factor of 3.5 between the fastest and the slowest.

-Tony
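
[Illustration] The fairness factor quoted above is simply the largest per-thread count divided by the smallest. A small sketch of that arithmetic follows; the function name spread_ratio() is made up for the example and is not part of the "t.c" test program.

```c
#include <stddef.h>

/* Ratio of the largest to the smallest value in counts[0..n-1] (n >= 1). */
static double spread_ratio(const unsigned long *counts, size_t n)
{
	unsigned long lo = counts[0], hi = counts[0];
	size_t i;

	for (i = 1; i < n; i++) {
		if (counts[i] < lo)
			lo = counts[i];
		if (counts[i] > hi)
			hi = counts[i];
	}
	return (double)hi / (double)lo;
}
```

Fed the extremes from the run above (32853 and 1540293), this gives a ratio a little under 46.9.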
Re: [GIT PULL] Device tree updates for v3.12
On Tue, Sep 10, 2013 at 1:50 PM, Linus Torvalds wrote:
> Of course, maybe even the stupid add_device_randomness() is fast
> enough. I just wanted to point out that it definitely isn't some
> optimized thing.

When I posted the patch that mixes in the whole SMBIOS table:

commit d114a33387472555188f142ed8e98acdb8181c6d
Author: Tony Luck
Date: Fri Jul 20 13:15:20 2012 -0700

    dmi: Feed DMI table to /dev/random driver

I asked whether there was any size issue - as it tends to be a few kilobytes on laptops and desktops, and tens of kilobytes on servers. The answer I got back then was not to worry - digesting a few kilobytes wouldn't be a problem.

I just threw in a debug message to check and saw:

dmi_walk_early: added 10342 bytes in 339968 cycles

So a couple of hundred microseconds for me. There are plenty of machine specific values buried in there (e.g. serial numbers for all the DIMMs) ... so this looks like a good use of this much boot time.

-Tony
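
[Illustration] Turning the cycle count into time needs the clock rate, which the message does not state. The sketch below does the conversion for a caller-supplied frequency; the helper name and the frequencies used in the note are assumptions for the example only.

```c
#include <stdint.h>

/* Convert a cycle count to whole microseconds at a given clock rate in Hz. */
static uint64_t cycles_to_usec(uint64_t cycles, uint64_t hz)
{
	return cycles * 1000000ULL / hz;
}
```

For 339968 cycles this gives 169 microseconds at an assumed 2.0 GHz and 199 at an assumed 1.7 GHz, which is in line with the "couple of hundred microseconds" estimate.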
Re: [PATCH 2/3] x86/mce: Pack boolean MCE flags into a structure
On Wed, Sep 5, 2012 at 3:22 AM, Naveen N. Rao wrote:
> Many MCE flags are boolean in nature, but are declared as integers
> currently. We can pack these into a bitfield to save some space.

Before this patch:

size arch/x86/kernel/cpu/mcheck/mce.o
   text    data     bss     dec     hex filename
  18946    4930     776   24652    604c arch/x86/kernel/cpu/mcheck/mce.o

After:

size arch/x86/kernel/cpu/mcheck/mce.o
   text    data     bss     dec     hex filename
  19335    4890     776   25001    61a9 arch/x86/kernel/cpu/mcheck/mce.o

So we do indeed see "data" reduced by 40 bytes. But "text" is up by 389. This seems to be because you have another change, not described in the commit log, buried in part 2 to add get_dont_log_ce(), set_dont_log_ce() etc.

Compiler version: gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC)

I know I'm contradicting the feedback you got from Borislav here, but is this code churn really worth it to save 40 bytes? I don't think so.

-Tony
Re: [PATCH] [mcelog] Start using the new sysfs tunables location
On Wed, Sep 5, 2012 at 11:47 AM, Andi Kleen wrote:
> On Wed, Sep 05, 2012 at 04:02:37PM +0530, Naveen N. Rao wrote:
>> All the current mce tunables are now available under
>> /sys/devices/system/machinecheck. Start using this new location, but fall back
>> to the older per-cpu location so that we continue working with older kernels.
>
> Who did that change in the kernel?
>
> That breaks Linus rule that the kernel should not break userland.
> Kernel needs to fix that.

The change is still under discussion. Stage one is to add the new global pathnames in addition to keeping the old per-cpu ones. Also fix all utilities (just mcelog(8) as far as we know) to prefer the new paths.

After some time[1] ... delete the old paths.

This is allowable under Linus' modified edict that you can change ABI "if nobody complains". If we wait long enough that the new mcelog is widely deployed, then nobody should complain.

-Tony

[1] several years - not just a kernel release or two.
Re: [PATCH RESEND]mm/ia64: fix a node distance bug
On Fri, Sep 7, 2012 at 3:58 PM, David Rientjes wrote:
> On Thu, 6 Sep 2012, wujianguo wrote:
>> Signed-off-by: Jianguo Wu
>> Signed-off-by: Jiang Liu
>
> Acked-by: David Rientjes

Applied (should show up in linux-next in the next day or two).

-Tony
Re: [PATCH 1/6] x86, RAS: Add a barebones RAS subtree
On Mon, Oct 8, 2012 at 10:11 AM, Borislav Petkov wrote:
> +config X86_RAS
> +	def_bool y
> +	prompt "X86 RAS features"
> +	---help---
> +	  A collection of Reliability, Availability and Serviceability software
> +	  features which enable hardware error logging and reporting. Leave it
> +	  at 'y' unless you really know what you're doing.
> +

The intent of "X86_RAS" is just to show/hide all the menu options for the individual features - which will all use

	depends on X86_RAS

right? Having this set to "y" doesn't actually enable any of the features - they all have their own CONFIG_* variables. Perhaps we could make that clearer in the help text? And ditch the "Leave it at 'y' ... ", I don't think it helps anyone.

-Tony
Re: [PATCH 0/6] AMD MCE injection improvs
On Mon, Oct 8, 2012 at 10:11 AM, Borislav Petkov wrote:
> create mode 100644 arch/x86/ras/ras.c

Overall it looks good - but I'm a bit puzzled by this ras.c file that gets created as an empty file in part 1, and is still empty at the end of the series. What is going to go into it?

-Tony
Re: Linux 3.7-rc8
On Mon, Dec 3, 2012 at 2:29 PM, Tony Luck wrote:
> On Mon, Dec 3, 2012 at 2:20 PM, Romain Francoise wrote:
>>
>> Hi Linus,
>>
>> Linus Torvalds writes:
>>
>> > Linus Torvalds (5):
>> > fs/buffer.c: make block-size be per-page and protected by the page lock
>> > blockdev: remove bd_block_size_semaphore again
>> > direct-io: don't read inode->i_blkbits multiple times
>> > blkdev_max_block: make private to fs/buffer.c
>>
>> Could these changes be the reason for the following suddenly appearing in
>> one of my VMs with rc8 (no such messages with rc7)? Pretty standard virtio
>> setup in KVM.
>>
>> [ 11.832295] attempt to access beyond end of device
>> [ 11.832298] vda1: rw=0, want=4192904, limit=4192902
>> [ 11.832299] Buffer I/O error on device vda1, logical block 524112
>> [ 11.832394] attempt to access beyond end of device
>> [ 11.832395] vda1: rw=0, want=4192904, limit=4192902
>> [ 11.832396] Buffer I/O error on device vda1, logical block 524112
>>
> I'm seeing similar stuff in -rc8 too (on ia64, native no VMM):
>
> attempt to access beyond end of device
> sda3: rw=0, want=268317424, limit=268317421
> Buffer I/O error on device sda3, logical block 33539677
>
> attempt to access beyond end of device
> sda3: rw=0, want=268317424, limit=268317421
> Buffer I/O error on device sda3, logical block 33539677
>
> -rc7 didn't do this.
>
> -Tony

Resend ... to go to the list (Oh gmail, why did you decide to reply in HTML???)
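
[Illustration] The numbers in both reports are consistent with a read of one 4096-byte block (eight 512-byte sectors) starting at the reported logical block and running past the end of a partition whose sector count is not a multiple of eight. The block and sector sizes are inferred from the arithmetic, not stated in the logs, and the helper names below are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* First sector past the given block; assumes 512-byte sectors. */
static uint64_t block_end_sector(uint64_t block, uint32_t block_size)
{
	uint64_t sectors_per_block = block_size / 512;

	return (block + 1) * sectors_per_block;
}

/* True when the block extends past a device that is 'limit' sectors long. */
static bool beyond_end(uint64_t block, uint32_t block_size, uint64_t limit)
{
	return block_end_sector(block, block_size) > limit;
}
```

Block 524112 ends at sector 4192904 against a limit of 4192902, and block 33539677 ends at sector 268317424 against a limit of 268317421, matching the "want" and "limit" values in the two logs.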
Re: Linux 3.7-rc8
> Just for info, can you add a "WARN_ON_ONCE()" to handle_bad_sector()
> just so that I see which particular path your kvm load triggers.

On native ia64 (with SLES11 userspace) I see:

WARNING: at block/blk-core.c:1557 generic_make_request_checks+0x680/0xa40()
Hardware name: I8QBH
Modules linked in: usb_storage sg container button usbhid uhci_hcd ehci_hcd usbcore usb_common fan processor thermal thermal_sys

Call Trace:
 [] show_stack+0x80/0xa0 sp=e003153cf670 bsp=e003153c1638
 [] dump_stack+0x30/0x50 sp=e003153cf840 bsp=e003153c1620
 [] warn_slowpath_common+0xc0/0x100 sp=e003153cf840 bsp=e003153c15d8
 [] warn_slowpath_null+0x40/0x60 sp=e003153cf840 bsp=e003153c15b0
 [] generic_make_request_checks+0x680/0xa40 sp=e003153cf840 bsp=e003153c1570
 [] generic_make_request+0x30/0x280 sp=e003153cf880 bsp=e003153c1550
 [] submit_bio+0x170/0x3c0 sp=e003153cf890 bsp=e003153c1500
 [] submit_bh+0x310/0x4e0 sp=e003153cf8b0 bsp=e003153c14d0
 [] block_read_full_page+0x720/0x820 sp=e003153cf8b0 bsp=e003153c1430
 [] blkdev_readpage+0x30/0x60 sp=e003153cfcb0 bsp=e003153c1408
 [] read_pages+0x220/0x260 sp=e003153cfcb0 bsp=e003153c13a0
 [] __do_page_cache_readahead+0x130/0x320 sp=e003153cfce0 bsp=e003153c1310
 [] ra_submit+0x40/0x60 sp=e003153cfcf0 bsp=e003153c12e0
 [] ondemand_readahead+0x210/0x580 sp=e003153cfcf0 bsp=e003153c1278
 [] page_cache_sync_readahead+0x90/0x100 sp=e003153cfcf0 bsp=e003153c1238
 [] do_generic_file_read+0x770/0xce0 sp=e003153cfcf0 bsp=e003153c1140
 [] generic_file_aio_read+0x260/0x5c0 sp=e003153cfcf0 bsp=e003153c10d0
 [] do_sync_read+0x130/0x240 sp=e003153cfd30 bsp=e003153c1078
 [] vfs_read+0x1b0/0x340 sp=e003153cfe20 bsp=e003153c1030
 [] sys_read+0x90/0xe0 sp=e003153cfe20 bsp=e003153c0fb0
 [] ia64_ret_from_syscall+0x0/0x20 sp=e003153cfe30 bsp=e003153c0fb0
 [] __kernel_syscall_via_break+0x0/0x20 sp=e003153d bsp=e003153c0fb0
Re: Linux 3.7-rc8
> Linus Torvalds writes:
>
>> Does that fix the printk's for you too?
>
> Yep, works for me, thanks!

Belated "works for me too" (just in case you were worrying that ia64 was still broken).

-Tony
Re: new execve/kernel_thread design
On Fri, Oct 19, 2012 at 10:30 AM, Al Viro wrote:
> IIRC, the lack of comments on function with unusual calling conventions was
> the last remaining issue...

Stylistically other asm functions have huge block header comments detailing register usage. But typically those are way more complex. I think your inline comments work fine here.

-Tony
Re: [GIT PULL] Fix a cmci discovery problem
Ingo,

Is there a problem with this pull request ... or did it just get lost in the LKML noise?

-Tony

On Tue, Oct 30, 2012 at 3:01 PM, Luck, Tony wrote:
> The following changes since commit 8f0d8163b50e01f398b14bcd4dc039ac5ab18d64:
>
>   Linux 3.7-rc3 (2012-10-28 12:24:48 -0700)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-tangchen
>
> for you to fetch changes up to 85b97637bb40a9f486459dd254598759af9c3d50:
>
>   x86/mce: Do not change worker's running cpu in cmci_rediscover(). (2012-10-30 14:38:12 -0700)
>
> Fix problem in CMCI rediscovery code that was illegally
> migrating worker threads to other cpus.
>
> Tang Chen (1):
>       x86/mce: Do not change worker's running cpu in cmci_rediscover().
>
>  arch/x86/kernel/cpu/mcheck/mce_intel.c | 31 ++-
>  1 file changed, 18 insertions(+), 13 deletions(-)
Re: [PATCH v8 3/3] aerdrv: Cleanup log output for AER
On Wed, Jan 2, 2013 at 3:27 PM, Joe Perches wrote: > Just use dev_err( instead of dev_printk(KERN_ERR, > It's a function and it makes the object code smaller. Looks like we are almost converged on a solution (Lance: thanks for your patience and diligence in making changes). Anyone on the "To:" list want to claim this for their tree to commit? The series touches pci, acpi, RAS, and tracing ... so there are several possible owners. If someone else wants it, then add an: Acked-by: Tony Luck to all three parts. If there isn't a strong claim, I'll add v9[*] to the RAS tree and see if the TIP tree folks will pull it from me. -Tony [*] When Lance makes the change suggested by Joe.
Re: [GIT PULL] pstore/ram for 3.11
On Wed, Jun 12, 2013 at 8:44 PM, Rob Herring wrote: > Not sure who takes this, but please pull these 2 changes for pstore for > 3.11. These are necessary to get pstore to work with on-chip RAM on > Calxeda highbank platform. Were these posted for discussion and review? Is there anyone who should be providing {Acked,Reviewed,Tested}-by: tags for them? I haven't ever had a sub-maintained tree to pull from - so I'm being double-extra cautious before doing something with this as it all feels new and strange. -Tony
Re: [PATCH 0/5] ACPI / scan: Make it possible to use the container hotplug with other scan handlers
On Fri, Jun 14, 2013 at 3:23 PM, Rafael J. Wysocki wrote: > Can you please just test patch [5/5] alone without patches [1-4/5]? We > believe > that this should work too and if that's the case, we'll only need that patch > and a reworked [1/5]. Your belief is sound - I popped all five patches and then applied just 5/5 ... and the system still works. -Tony
Re: [GIT PULL] pstore/ram for 3.11
On Fri, Jun 14, 2013 at 3:47 PM, Anton Vorontsov wrote: > > Acked-by: Anton Vorontsov > > (Or I can pick this via linux-pstore.git tree, I'll let Tony decide.) Added that Acked-by: and applied to my tree. Thanks -Tony
Re: [PATCH] Re: [Patch] MCE, APEI: Don't enable CMCI when Firmware First mode is set in
On Mon, Jun 17, 2013 at 11:43 PM, Naveen N. Rao wrote: > + if (bank >= mca_cfg.banks) { > + pr_info("mce_disable_bank: Invalid MCA bank %d ignored.\n", > bank); Let's have a FW_BUG in that message to point a finger at the source of the problem. + apei_hest_parse(hest_parse_cmc, NULL); I think we want a boot command line option to opt out of this. "nohestcmc"?? -Tony -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/5] ACPI / scan: Make it possible to use the container hotplug with other scan handlers
> Can you please apply the appended patch on top of it and see if the system > still works then? Still works with this patch. -Tony > --- > drivers/acpi/scan.c |3 +++ > drivers/acpi/video.c |3 --- > 2 files changed, 3 insertions(+), 3 deletions(-) > > Index: linux-pm/drivers/acpi/scan.c > === > --- linux-pm.orig/drivers/acpi/scan.c > +++ linux-pm/drivers/acpi/scan.c > @@ -939,6 +939,9 @@ static int acpi_device_probe(struct devi > struct acpi_driver *acpi_drv = to_acpi_driver(dev->driver); > int ret; > > + if (acpi_dev->handler) > + return -EINVAL; > + > if (!acpi_drv->ops.add) > return -ENOSYS; > > Index: linux-pm/drivers/acpi/video.c > === > --- linux-pm.orig/drivers/acpi/video.c > +++ linux-pm/drivers/acpi/video.c > @@ -1722,9 +1722,6 @@ static int acpi_video_bus_add(struct acp > int error; > acpi_status status; > > - if (device->handler) > - return -EINVAL; > - > status = acpi_walk_namespace(ACPI_TYPE_DEVICE, > device->parent->handle, 1, > acpi_video_bus_match, NULL, > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/5] ACPI / scan: Make it possible to use the container hotplug with other scan handlers
> If you don't mind, I'll queue up https://patchwork.kernel.org/patch/2712741/ > and > this for 3.11. Mark them Tested-by: Tony Luck if you like. -Tony
Re: [PATCH] [IA64] sim: Add casts to avoid assignment warnings
Oops - pasted in old e-mail address for Boris On Thu, Jun 20, 2013 at 11:15 AM, Luck, Tony wrote: > Pointers in the efi_runtime_services_t structure now have type > "void *" (formerly they were "unsigned long"). So we now see a > bunch of warnings like this: > > arch/ia64/hp/sim/boot/fw-emu.c:293: warning: assignment makes pointer from > integer without a cast > > Add (void *) casts to the 10 affected lines to make the build quiet again. > > Signed-off-by: Tony Luck > > --- > > Boris, Matt - Can you add this patch to the same tree that > >commit 43ab0476a648053e5998bf081f47f215375a4502 [linux-next id] >efi: Convert runtime services function ptrs > > is in so that it will follow along behind it. Thanks. > > arch/ia64/hp/sim/boot/fw-emu.c | 20 ++-- > 1 file changed, 10 insertions(+), 10 deletions(-) > > diff --git a/arch/ia64/hp/sim/boot/fw-emu.c b/arch/ia64/hp/sim/boot/fw-emu.c > index 271f412..87bf9ad 100644 > --- a/arch/ia64/hp/sim/boot/fw-emu.c > +++ b/arch/ia64/hp/sim/boot/fw-emu.c > @@ -290,16 +290,16 @@ sys_fw_init (const char *args, int arglen) > efi_runtime->hdr.signature = EFI_RUNTIME_SERVICES_SIGNATURE; > efi_runtime->hdr.revision = EFI_RUNTIME_SERVICES_REVISION; > efi_runtime->hdr.headersize = sizeof(efi_runtime->hdr); > - efi_runtime->get_time = __pa(&fw_efi_get_time); > - efi_runtime->set_time = __pa(&efi_unimplemented); > - efi_runtime->get_wakeup_time = __pa(&efi_unimplemented); > - efi_runtime->set_wakeup_time = __pa(&efi_unimplemented); > - efi_runtime->set_virtual_address_map = __pa(&efi_unimplemented); > - efi_runtime->get_variable = __pa(&efi_unimplemented); > - efi_runtime->get_next_variable = __pa(&efi_unimplemented); > - efi_runtime->set_variable = __pa(&efi_unimplemented); > - efi_runtime->get_next_high_mono_count = __pa(&efi_unimplemented); > - efi_runtime->reset_system = __pa(&efi_reset_system); > + efi_runtime->get_time = (void *)__pa(&fw_efi_get_time); > + efi_runtime->set_time = (void *)__pa(&efi_unimplemented); > + 
efi_runtime->get_wakeup_time = (void *)__pa(&efi_unimplemented); > + efi_runtime->set_wakeup_time = (void *)__pa(&efi_unimplemented); > + efi_runtime->set_virtual_address_map = (void > *)__pa(&efi_unimplemented); > + efi_runtime->get_variable = (void *)__pa(&efi_unimplemented); > + efi_runtime->get_next_variable = (void *)__pa(&efi_unimplemented); > + efi_runtime->set_variable = (void *)__pa(&efi_unimplemented); > + efi_runtime->get_next_high_mono_count = (void > *)__pa(&efi_unimplemented); > + efi_runtime->reset_system = (void *)__pa(&efi_reset_system); > > efi_tables->guid = SAL_SYSTEM_TABLE_GUID; > efi_tables->table = __pa(sal_systab); > -- > 1.8.1.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] aerdrv: Move cper_print_aer() call out of interrupt context
Ok - grabbed this version. Will see if I can tempt Linus with a "please pull" tomorrow (when the commit is suitably aged). By the way ... this meta-commit description: > v2 - Re-worded header text. Removed prefix arg from cper_print_aer(). > Added TODO message in cper_print_aer(). > v3 - Changed type of u8* to struct aer_capability_regs* in the code > to avoid too much casting based on comment from Bjorn Helgaas. > v4 - Removed TODO message. Does not have to do with what this patch > is trying to fix. belongs *after* the "---" past the sign-off & Acks ... then "git am" will drop it from the commit message automatically -Tony
Re: [PATCH v2 1/2] mce: acpi/apei: Honour Firmware First for MCA banks listed in APEI HEST CMC
On Fri, Jun 21, 2013 at 1:36 AM, Borislav Petkov wrote: > So ok, I'm persuaded, yet another bitfield it is ... :-\ Let's add some more comments on what each of these bitfields mean. Otherwise we will be going back over this next time we have a patch that touches one of them and we've all forgotten the subtle details explained in this e-mail thread. -Tony
Re: [PATCH v2] pstore: Fail to unlink if a driver has not defined pstore_erase
On Tue, Jun 25, 2013 at 9:41 AM, Kees Cook wrote: > On Tue, Jun 25, 2013 at 2:03 AM, Aruna Balakrishnaiah > wrote: >> pstore_erase is used to erase the record from the persistent store. >> So if a driver has not defined pstore_erase callback return How do people manage devices like this? With no erase function they just keep getting more and more pstore entries. Eventually they fill up. >> Signed-off-by: Aruna Balakrishnaiah > > Acked-by: Kees Cook Applied - thanks. -Tony -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] Power management and ACPI fixes for v3.10-rc5
On Fri, Jun 7, 2013 at 5:51 AM, Rafael J. Wysocki wrote: > Aaron Lu (1): > ACPI / scan: do not match drivers against objects having scan handlers This patch showed up in linux-next tag next-20130605 and appears to be the cause of a boot failure on my ia64 HP rx2600 system. It panics with the message: Kernel panic - not syncing: Unable to find SBA IOMMU: Try a generic or DIG kernel Reverting this from next-20130605 fixes my problem and I can boot again. So please don't pull. -Tony
Re: [GIT PULL] Power management and ACPI fixes for v3.10-rc5
On Fri, Jun 7, 2013 at 3:23 PM, Tony Luck wrote: > So please don't pull. Bother. I see I was a few hours late finding this, and commit 9f29ab11ddb is already in Linus' tree. That's what happens when I get busy and skip a couple of days testing linux-next :-( So my problem comes from arch/ia64/hp/common/sba_iommu.c where the code in sba_init() says: acpi_bus_register_driver(&acpi_sba_ioc_driver); if (!ioc_list) { but because of this change we never managed to call ioc_init() so ioc_list doesn't get set up, and we die. Before this commit, the call chain looked like this: [] ioc_init+0x40/0xd00 [] acpi_sba_ioc_add+0x190/0x1c0 [] acpi_device_probe+0xa0/0x280 [] really_probe+0xe0/0x520 [] driver_probe_device+0x30/0x60 [] __driver_attach+0x110/0x160 [] bus_for_each_dev+0x110/0x180 [] driver_attach+0x40/0x60 [] bus_add_driver+0x230/0x580 [] driver_register+0xf0/0x400 [] acpi_bus_register_driver+0x50/0x80 [] sba_init+0x30/0x2d0 Is my problem that this driver has (or attaches) a "scan handler" where it shouldn't ... and I just need to stop it doing that? -Tony -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Suggestion] arch/*/include/asm/bitops.h: about __set_bit() API.
On Sat, Jun 8, 2013 at 3:08 AM, Chen Gang wrote: > using 'unsigned int *', implicitly: > ./ia64/include/asm/bitops.h:63:__set_bit (int nr, volatile void *addr) There is some downside on ia64 to your suggestion. If "addr" is properly aligned for an "int", but misaligned for a "long" ... i.e. addr%8 == 4, then I'll take an unaligned reference trap if I work with long* where the current code working with int* does not. Now perhaps all the callers do guarantee long* alignment? But I don't know. Apart from uniformity, there doesn't seem to be any upside to changing this. -Tony Luck
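The alignment concern above can be illustrated with a small userspace sketch. This is not the ia64 bitops code: the helper names `bit_word_offset()` and `access_is_aligned()` are hypothetical, and the sketch only models which word a given bit lands in and whether that access would be naturally aligned.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: byte offset (from the start of the bitmap) of the
 * word that holds bit `nr` when the bitmap is accessed in units of
 * `word_bytes` bytes (4 for int-sized, 8 for long-sized accesses).
 */
static size_t bit_word_offset(unsigned int nr, size_t word_bytes)
{
	return (nr / (word_bytes * 8)) * word_bytes;
}

/*
 * Hypothetical helper: nonzero if a `word_bytes`-wide access at
 * base + offset is naturally aligned.
 */
static int access_is_aligned(uintptr_t base, size_t offset, size_t word_bytes)
{
	return ((base + offset) % word_bytes) == 0;
}
```

With a base address where addr%8 == 4 (say 0x1004), every int-sized access stays aligned, while every long-sized access is misaligned, which is the case that would trap on ia64.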
Re: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors
> +	if (sec_sev == GHES_SEV_CORRECTED &&
> +	    (gdata->flags & CPER_SEC_ERROR_THRESHOLD_EXCEEDED) &&
> +	    (mem_err->validation_bits & CPER_MEM_VALID_PHYSICAL_ADDRESS)) {
> +		unsigned long pfn;
> +		pfn = mem_err->physical_addr >> PAGE_SHIFT;

As Reagan said "Trust ... but verify" ... we should make sure BIOS gave us a good pfn

	if (pfn_valid(pfn))
		soft_memory_failure_queue(pfn, 0, 0);
	else
		printk("...something about BIOS giving us bad pfn = %lu\n", pfn);

> +	}
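A minimal userspace sketch of the "trust but verify" check suggested above. Everything here is an assumption made for illustration: `SKETCH_PAGE_SHIFT`, `sketch_pfn_valid()` and `handle_corrected_mem_error()` are hypothetical stand-ins, and the validity test is a simple upper bound rather than the kernel's real pfn_valid() logic.

```c
#include <stdint.h>
#include <stdio.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for pfn_valid(): accept pfns below a given limit. */
static int sketch_pfn_valid(uint64_t pfn, uint64_t max_pfn)
{
	return pfn < max_pfn;
}

/*
 * Convert a firmware-reported physical address to a pfn and only hand it
 * on if it passes validation. Returns 1 if the pfn was accepted (stored
 * in *queued_pfn), 0 if the firmware value was rejected.
 */
static int handle_corrected_mem_error(uint64_t phys_addr, uint64_t max_pfn,
				      uint64_t *queued_pfn)
{
	uint64_t pfn = phys_addr >> SKETCH_PAGE_SHIFT;

	if (!sketch_pfn_valid(pfn, max_pfn)) {
		fprintf(stderr, "firmware reported invalid pfn %llu\n",
			(unsigned long long)pfn);
		return 0;
	}
	*queued_pfn = pfn;
	return 1;
}
```

The point of the sketch is only the ordering: validate first, queue second, and complain loudly when the firmware-supplied address does not map to a usable page.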
Re: [PATCH 2/3] acpi: Eliminate console msg if pstore.backend excludes ERST
On Fri, Jun 28, 2013 at 1:14 PM, Lenny Szubowicz wrote: > - if (pstore_register(&erst_info)) { > - pr_info(ERST_PFX "Could not register with persistent > store\n"); > + rc = pstore_register(&erst_info); > + if (rc) { > + if (rc != -EPERM) > + pr_info(ERST_PFX > + "Could not register with persistent store\n"); > + erst_info.buf = NULL; > + erst_info.bufsize = 0; Mismatch between part 1 and part 2 here ... we return -EINVAL if our name doesn't match the desired backend ... but you only suppress the "Could not register" message for -EPERM Or am I confused while just looking at patch fragments? -Tony -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
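The mismatch described above can be sketched as a small predicate. This is an illustration only: `should_report_register_failure()` is a hypothetical helper, not a function in either patch, and its second argument stands for whichever errno the caller chooses to treat as "backend name did not match".

```c
#include <errno.h>

/*
 * Hypothetical helper: decide whether the "Could not register" message
 * should be printed for return code `rc`, given the errno value the
 * caller treats as a quiet backend mismatch.
 */
static int should_report_register_failure(int rc, int quiet_errno)
{
	if (rc == 0)
		return 0;		/* registration succeeded */
	return rc != -quiet_errno;	/* stay silent only for the agreed code */
}
```

If the registration side returns -EINVAL for a mismatch while the caller passes EPERM as the quiet code, the message is still printed, which is the inconsistency being questioned. Both sides need to agree on one value.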
Re: [PATCH 0/3] acpi: Eliminate misleading erst pstore console message
On Fri, Jun 28, 2013 at 1:14 PM, Lenny Szubowicz wrote: > On systems that have a valid ACPI ERST table, if the pstore.backend kernel > parameter selects a specific facility other than erst, then during boot the > following console message is displayed: > > ERST: Could not register with persistent store Applied (using revised version of part 1). Thanks -Tony
Re: [Bug] Reproducible data corruption on i5-3340M: Please continue your great work! :-)
On Thu, Aug 15, 2013 at 5:33 PM, Linus Torvalds wrote: > I'll probably delay committing it until tomorrow, in the hope that > somebody using one of the other architectures will at least ack that > it compiles. I'm re-attaching the patch (with the two "logn" -> "long" > fixes) just to encourage that. Hint hint, everybody.. I see I'm too late to supply an Ack for the commit, because it is already in. But just for completeness sake - all my ia64 configs build OK, and the couple that get boot tested still appear to be working too. -Tony
Re: [RFC PATCH v2 00/11] Add (de)compression support to pstore
On Sat, Aug 17, 2013 at 11:32 AM, Kees Cook wrote: > Yeah, this is great. While I haven't tested it myself yet, the code > seems to be in good shape. I acked the ram piece separately, but > consider the entire series: > > Reviewed-by: Kees Cook Applied. This should show up in linux-next tomorrow. Anyone using efivars as the pstore backend? Testing reports (positive or negative) appreciated. -Tony
Re: [PATCH 1/4] efi: provide a generic efi_config_init()
On Tue, Jul 30, 2013 at 9:47 AM, Leif Lindholm wrote:
> +	/*
> +	 * Let's see what config tables the firmware passed to us.
> +	 */
> +	config_tables = early_memremap(efi.systab->tables,
> +				       efi.systab->nr_tables * sz);

Breaks bisection on ia64 ... you use early_memremap() here, but don't define it on ia64 until patch 3/4. So I get:

drivers/firmware/efi/efi.c: In function 'efi_config_init':
drivers/firmware/efi/efi.c:200: error: implicit declaration of function 'early_memremap'
drivers/firmware/efi/efi.c:201: warning: assignment makes pointer from integer without a cast

-Tony
Re: [PATCH 1/4] efi: provide a generic efi_config_init()
On Tue, Jul 30, 2013 at 11:02 AM, Leif Lindholm wrote:
> So I guess the clean way to deal with that would be to make the
> memremap definition a separate patch?

Or just pull:

+#define early_memremap(phys_addr, size)	early_ioremap(phys_addr, size)

out of part 3 and put it into part 1 (along with some of the commit commentary).

-Tony
Re: [PATCH 0/4] Make commonly useful UEFI functions common
On Tue, Jul 30, 2013 at 9:47 AM, Leif Lindholm wrote: > IA64 code compile tested only. Compiled on a bunch of ia64 configurations. Boot tested, but not on a machine that does the PROCESSOR_ABSTRACTION_LAYER_OVERWRITE_GUID thingy. Code to do the arch specific thing looks ok though. -Tony
Re: [PATCH 00/11] Add compression support to pstore
On Thu, Aug 1, 2013 at 4:42 PM, Luck, Tony wrote:
> when I rebuilt a plain 3.11-rc3 it didn't log anything via pstore either :-(

Well this turned out to be operator error on my part. 3.11-rc3 does in fact log errors to pstore and allows them to be retrieved and cleared.

So then I started testing with your 11 patches in place. First boot was fine - ERST had no records, and pstore mounted OK (and showed no files). Then I panic'd the machine and rebooted. The boot hung when some rc script printed "Mounting other filesystems:"

I guess something went wrong when pstore found a non-empty ERST. I added some debug traces and booted again. This time the boot succeeded but I saw a GP fault reported from pstore_mkfile(). Possibly in this code:

	spin_lock_irqsave(&allpstore_lock, flags);
	list_for_each_entry(pos, &allpstore, list) {
		if (pos->type == type &&
		    pos->id == id &&
		    pos->psi == psi) {
			rc = -EEXIST;
			break;
		}
	}
	spin_unlock_irqrestore(&allpstore_lock, flags);

My other tracing showed that we'd already found two compressed entries in ERST and were working on a third when this error happened (implying that my hang had been a panic that failed to print anything to console).

I've attached one of the compressed files that v3.11-rc3 shows in pstore now. The "openssl zlib -d" trick you mentioned back in June mostly works to decode ... but it seems to dump some trailing garbage at the end of the file.

-Tony

unknown-erst-5907623178007478273 Description: Binary data
Re: [PATCH 00/11] Add compression support to pstore
A quick experiment to use your patchset - but with compression disabled by tweaking this line in pstore_dump(): zipped_len = -1; //zip_data(dst, hsize + len); turned out well. This kernel dumps uncompressed dmesg blobs into pstore and gets them back out again. So it seems likely that the problems are someplace in the compression/decompression code. -Tony
Re: [PATCH v2 1/5] ia64: add early_memremap() alias for early_ioremap()
On Mon, Aug 5, 2013 at 3:56 AM, Matt Fleming wrote: >> @@ -424,6 +424,7 @@ extern void __iomem * ioremap(unsigned long offset, >> unsigned long size); >> extern void __iomem * ioremap_nocache (unsigned long offset, unsigned long >> size); >> extern void iounmap (volatile void __iomem *addr); >> extern void __iomem * early_ioremap (unsigned long phys_addr, unsigned long >> size); >> +#define early_memremap(phys_addr, size)early_ioremap(phys_addr, >> size) >> extern void early_iounmap (volatile void __iomem *addr, unsigned long size); >> static inline void __iomem * ioremap_cache (unsigned long phys_addr, >> unsigned long size) > > Tony, can I get your Acked-by for this? Acked-by: Tony Luck [Cut & paste this ack to other parts of the series that touch ia64] -Tony -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 00/11] Add compression support to pstore
One more experiment - removed previous hack that disabled compression. Added a new hack to skip decompression. System died cleanly when I forced a panic. On reboot I found 3 files in pstore: -r--r--r-- 1 root root 3972 Aug 5 09:24 dmesg-erst-5908671953186586625 -r--r--r-- 1 root root 2565 Aug 5 09:24 dmesg-erst-5908671953186586626 -r--r--r-- 1 root root 4067 Aug 5 09:24 dmesg-erst-5908671953186586627 Using "openssl zlib -d" to decompress then ends up with some garbage at the end of the decompressed file - some text that should be there is missing. E.g. the tail of decompressed version of *625 ends with: <4>Call Trace: <4> [] dump_stack+0x45/0x56 <4> [] panic+0xc2/0x1cb <4> [] ? printk+0x54/0x56 <4> [] aegl+0x25/0x30 <4> [] proc_reg_write+0x3d/0x80 <4> [] vfs_write+0xc5/0x1e0 <4> [] SyS_write+0x52/0xa0 <4> [] system_call_fastpath+0x16/0x1b )c10^@^@^@^@^@^@^@^@^@ But my serial console logged this: Call Trace: [] dump_stack+0x45/0x56 [] panic+0xc2/0x1cb [] ? printk+0x54/0x56 [] aegl+0x25/0x30 [] proc_reg_write+0x3d/0x80 [] vfs_write+0xc5/0x1e0 [] SyS_write+0x52/0xa0 [] system_call_fastpath+0x16/0x1b [ cut here ] WARNING: CPU: 18 PID: 381 at arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x5b/0x60() Modules linked in: CPU: 18 PID: 381 Comm: kworker/18:1 Not tainted 3.11.0-rc3-11-ge41db9e #6 -Tony -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 00/11] Add compression support to pstore
See attachment for what I actually applied - I think I got what you suggested (I added a declaration for "total_len"). Forcing a panic worked some things were logged to pstore. But on reboot with your patches applied I'm still seeing a GP fault when pstore is mounted and we find compressed records and inflate them and install them into the pstore filesystem. Here's the oops: general protection fault: [#1] SMP Modules linked in: CPU: 29 PID: 10252 Comm: mount Not tainted 3.11.0-rc3-12-g73bec18 #2 Hardware name: Intel Corporation LH Pass ../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.091020121352 09/10/2012 task: 88082e934040 ti: 88082e2ec000 task.ti: 88082e2ec000 RIP: 0010:[] [] pstore_mkfile+0x84/0x410 RSP: 0018:88082e2edc70 EFLAGS: 00010007 RAX: 0246 RBX: 81ca7b20 RCX: 625f6963703e373c RDX: 00040004 RSI: 0004 RDI: 820aa7e8 RBP: 88082e2edd10 R08: 881026a48000 R09: R10: 88102d21efb8 R11: R12: 881026a48000 R13: 51ffe3560003 R14: R15: 4450 FS: 7fbd37a2d7e0() GS:88103fca() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7fbd37a47000 CR3: 00103dc78000 CR4: 000407e0 Stack: 881026a4c450 5227 81a3703d 881026a48000 2e2edd70 88103db34140 0001abaf 36383039 003a0fb8 881026a48000 88102d21e000 448a Call Trace: [] pstore_get_records+0xed/0x2c0 [] ? pstore_get_inode+0x50/0x50 [] pstore_fill_super+0xa2/0xc0 [] mount_single+0xa2/0xd0 [] pstore_mount+0x18/0x20 [] mount_fs+0x43/0x1b0 [] ? __alloc_percpu+0x10/0x20 [] vfs_kern_mount+0x6f/0x100 [] do_mount+0x259/0xa10 [] ? 
strndup_user+0x5b/0x80 [] SyS_mount+0x8e/0xe0 [] system_call_fastpath+0x16/0x1b Code: 88 e8 f1 0f 39 00 48 8b 0d 0a 3a a2 00 48 81 f9 00 0d c9 81 75 15 eb 67 0f 1f 80 00 00 00 00 48 8b 09 48 81 f9 00 0d c9 81 74 54 <44> 39 71 18 75 ee 4c 39 69 20 75 e8 48 39 59 10 75 e2 48 89 c6 RIP [] pstore_mkfile+0x84/0x410 RSP ---[ end trace 0e1dd8e3ccfa3dcc ]--- /etc/init.d/functions: line 530: 10252 Segmentation fault "$@" Here's the start of my pstore_mkfile() code where the GP fault occurred: 8126d290 : 8126d290: e8 2b 91 39 00 callq 816063c0 <__fentry__> 8126d295: 55 push %rbp 8126d296: 48 89 e5mov%rsp,%rbp 8126d299: 41 57 push %r15 8126d29b: 41 56 push %r14 8126d29d: 41 89 femov%edi,%r14d 8126d2a0: 48 c7 c7 e8 a7 0a 82mov$0x820aa7e8,%rdi 8126d2a7: 41 55 push %r13 8126d2a9: 49 89 d5mov%rdx,%r13 8126d2ac: 41 54 push %r12 8126d2ae: 53 push %rbx 8126d2af: 48 83 ec 78 sub$0x78,%rsp 8126d2b3: 89 4d 84mov%ecx,-0x7c(%rbp) 8126d2b6: 48 89 b5 70 ff ff ffmov%rsi,-0x90(%rbp) 8126d2bd: 65 48 8b 04 25 28 00mov%gs:0x28,%rax 8126d2c4: 00 00 8126d2c6: 48 89 45 d0 mov%rax,-0x30(%rbp) 8126d2ca: 31 c0 xor%eax,%eax 8126d2cc: 48 8b 05 0d d5 e3 00mov 0xe3d50d(%rip),%rax# 820aa7e0 8126d2d3: 4c 89 85 78 ff ff ffmov%r8,-0x88(%rbp) 8126d2da: 44 89 4d 80 mov%r9d,-0x80(%rbp) 8126d2de: 48 8b 5d 28 mov0x28(%rbp),%rbx 8126d2e2: 48 8b 40 60 mov0x60(%rax),%rax 8126d2e6: 48 89 45 88 mov%rax,-0x78(%rbp) 8126d2ea: e8 f1 0f 39 00 callq 815fe2e0 <_raw_spin_lock_irqsave> 8126d2ef: 48 8b 0d 0a 3a a2 00mov 0xa23a0a(%rip),%rcx# 81c90d00 8126d2f6: 48 81 f9 00 0d c9 81cmp$0x81c90d00,%rcx 8126d2fd: 75 15 jne 8126d314 8126d2ff: eb 67 jmp 8126d368 8126d301: 0f 1f 80 00 00 00 00nopl 0x0(%rax) 8126d308: 48 8b 09mov(%rcx),%rcx 8126d30b: 48 81 f9 00 0d c9 81cmp$0x81c90d00,%rcx 8126d312: 74 54 je 8126d368 8126d314: 44 39 71 18 cmp %r14d,0x18(%rcx) << GP fault here 8126d318: 75 ee jne 8126d308 8126d31a: 4c 39 69 20 cmp%r13,0x20(%rcx) 8126d31e: 75 e8 jne 8126d308 8126d320: 48 39 59 10 cmp
Re: Proposed stable release changes
On Wed, Aug 21, 2013 at 1:00 PM, Borislav Petkov wrote: > We don't want to run daily snapshots of your tree though, right? Only > -rcs because the daily states are kinda arbitrary and they can be broken > in various ways. Or are we at a point in time where we can amend that > rule? If *nobody* runs daily snapshots - then problems just sit latent all week until the -rc is released and people start testing. Doesn't sound optimal. Running daily git snapshots can be "exciting" during the merge window. But I rarely see problems running a random build after -rc1. If you are still running that ancient 3.11-rc6 released on Sunday - then you are missing out on 28 commits worth of goodness since then :-) -Tony
Re: [PATCH 00/11] Add compression support to pstore
This patch seems to fix the garbage at the end problem. Booting an old kernel and using openssl decodes them OK. Still have problems booting if there are any compressed images in ERST to be inflated. -Tony
Re: [PATCH 00/11] Add compression support to pstore
On Mon, Aug 5, 2013 at 2:20 PM, Tony Luck wrote:
> Still have problems booting if there are any compressed images in ERST
> to be inflated.

So I took another look at this part of the code ... and saw a couple of issues:

	while ((size = psi->read(&id, &type, &count, &time, &buf,
				 &compressed, psi)) > 0) {
		if (compressed && (type == PSTORE_TYPE_DMESG)) {
			big_buf_sz = (psinfo->bufsize * 100) / 45;
			big_buf = allocate_buf_for_decompression(big_buf_sz);

			if (big_buf || stream.workspace)
>>> Did you mean "&&" here rather than "||"?
				unzipped_len = pstore_decompress(buf, big_buf,
								 size, big_buf_sz);
>>> Need an "else" here to set unzipped_len to -1 (or set it to -1 down
>>> at the bottom of the loop ready for next time around).

			if (unzipped_len > 0) {
				buf = big_buf;
>>> This sets us up for problems. First, you just overwrote the address
>>> of the buffer that psi->read allocated - so we have a memory leak. But
>>> worse than that we now double free the same buffer below when we
>>> kfree(buf) and then kfree(big_buf).
				size = unzipped_len;
				compressed = false;
			} else {
				pr_err("pstore: decompression failed;"
				       "returned %d\n", unzipped_len);
				compressed = true;
			}
		}
		rc = pstore_mkfile(type, psi->name, id, count, buf,
				   compressed, (size_t)size, time, psi);
		kfree(buf);
		kfree(stream.workspace);
		kfree(big_buf);
		buf = NULL;
		stream.workspace = NULL;
		big_buf = NULL;
		if (rc && (rc != -EEXIST || !quiet))
			failed++;
	}

See attached patch that fixes these - but the code still looks like it could be cleaned up a bit more.

-Tony

pstore.patch Description: Binary data
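The buffer ownership problem in that loop can be sketched in userspace C. This is not the attached pstore.patch, whose contents are not shown here. `sketch_decompress()` and `process_record()` are hypothetical, and the "decompressor" just doubles its input so the example stays self-contained. The sketch keeps the original pointer intact, resets the length for every record, and frees each buffer exactly once.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a decompressor: "inflates" by writing the
 * input twice. Returns the output length, or -1 if it does not fit.
 */
static int sketch_decompress(const char *in, size_t in_len,
			     char *out, size_t out_sz)
{
	if (!out || in_len * 2 > out_sz)
		return -1;
	memcpy(out, in, in_len);
	memcpy(out + in_len, in, in_len);
	return (int)(in_len * 2);
}

/*
 * Process one record. `buf` stays owned by the caller and is never
 * overwritten, so it can neither leak nor be freed twice. Returns the
 * number of bytes that would be handed on to the filesystem layer.
 */
static int process_record(const char *buf, size_t size, size_t big_buf_sz)
{
	char *big_buf = malloc(big_buf_sz);
	const char *data = buf;		/* separate cursor, not the owner */
	int unzipped_len = -1;		/* reset for every record */
	int out_len;

	if (big_buf)
		unzipped_len = sketch_decompress(buf, size, big_buf, big_buf_sz);

	if (unzipped_len > 0) {
		data = big_buf;
		out_len = unzipped_len;
	} else {
		out_len = (int)size;	/* fall back to the raw record */
	}

	(void)data;	/* a real caller would pass data/out_len onward here */
	free(big_buf);	/* freed once; buf is released by its owner */
	return out_len;
}
```

Using a separate `data` pointer for "whichever buffer we ended up with" is one way to avoid both the leak and the double free, since each allocation keeps a single owner.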
Re: [PATCH 00/11] Add compression support to pstore
On Tue, Aug 6, 2013 at 6:58 PM, Aruna Balakrishnaiah wrote:
> The patch looks right. I will clean it up. Does the issue still persist
> after this?

Things seem to be working - but testing has hardly been extensive (just a couple of forced panics).

I do have one other question. In this code:

>> if (compressed && (type == PSTORE_TYPE_DMESG)) {
>> 	big_buf_sz = (psinfo->bufsize * 100) / 45;

Where does the magic multiply by 100/45 (roughly 2.2) come from? Is that always enough for the decompression of "dmesg" type data to succeed?

-Tony
Re: [PATCH 00/11] Add compression support to pstore
On Tue, Aug 6, 2013 at 10:13 PM, Aruna Balakrishnaiah wrote:
> How is it with erst and efivars?

ERST is at the whim of the BIOS writer (the ACPI standard doesn't provide any
suggestions on record sizes). My systems support ~6K record size.

efivars has, IIRC, a 1k limit coded in the Linux back end.

-Tony
Re: [PATCH 00/11] Add compression support to pstore
Oh - one more thing - and my apologies for not spotting this before:

	dst = allocate_buf_for_compression(big_buf_sz);

No - you may not call kmalloc() in oops/panic context. Please pre-allocate
everything you need in some initialization code to make sure that we don't
fail in the panic path because we can't get the memory we need.

-Tony
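A minimal userspace sketch of the pre-allocation pattern being asked for: allocate once from normal context, and have the panic-time path only use what already exists. All names here (panic_buf_init, panic_buf_store) are hypothetical, and malloc() merely stands in for whatever allocator the real init code would use.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical state: one buffer per system, set up at init time. */
static char *panic_buf;
static size_t panic_buf_sz;

/*
 * Called once from ordinary (non-panic) context, where failing an
 * allocation can still be handled gracefully. Returns 0 on success.
 */
static int panic_buf_init(size_t size)
{
	panic_buf = malloc(size);	/* stand-in for an init-time allocation */
	if (!panic_buf)
		return -1;
	panic_buf_sz = size;
	return 0;
}

/*
 * Panic path: never allocates. Returns the number of bytes stored, or 0
 * if init never ran or the data does not fit in the pre-allocated buffer.
 */
static size_t panic_buf_store(const void *data, size_t len)
{
	if (!panic_buf || len > panic_buf_sz)
		return 0;
	memcpy(panic_buf, data, len);
	return len;
}
```

The point of the split is that the only failure mode left in the panic path is "data too big", which can be detected without touching the allocator.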
Re: [PATCH 00/11] Add compression support to pstore
On Tue, Aug 6, 2013 at 10:35 PM, Tony Luck wrote:
> ERST is at the whim of the BIOS writer (the ACPI standard doesn't provide any
> suggestions on record sizes). My systems support ~6K record size.

Off by a little - 7896 bytes on my current machine.

> efivars has, IIRC, a 1k limit coded in the Linux back end.

My memory was correct for this one.

Adding a little tracing to pstore_getrecords() I see this:

pstore: inflated 3880 bytes compressed to 17459 bytes
pstore: inflated 2567 bytes compressed to 17531 bytes
pstore: inflated 4018 bytes compressed to 17488 bytes

Which isn't at all what I expected. The ERST backend advertised a bufsize
of 7896, and I have the default kmsg_bytes of 10240. So on my forced panic
the code decided to create a three part pstore dump. The sum of the pieces
is close to, but a little over the target of 10K. But I don't understand why
the compressed sizes are so much smaller than the ERST backend block size.

The uncompressed sizes appear to be close to constant. The compression ratios
vary from 14% to 23%.

Why do we get three small parts instead of two bigger ones close to the
7896 ERST bufsize?

-Tony
Re: [PATCH 00/11] Add compression support to pstore
On Wed, Aug 7, 2013 at 9:29 PM, Aruna Balakrishnaiah wrote:
> When we preallocate, we can use the same big_buf for compression as well as
> decompression.
> Also workspace will be one for both. By allocating max of inflate workspace
> size and deflate
> workspace size. We can save memory here.

Well decompression isn't a problem. We are doing that in the non-panicking
context of the freshly booted kernel so we can allocate memory without any
worries for this. It's only the compression during panic where we must
pre-allocate. But if the sizes are close to the same, then we might as well
use the same buffers for both (and simplify the code because we don't have
to worry about the kmalloc/kfree bits).

> If pre-allocating close to 50k of buffer is not a issue. We can go ahead
> with this approach.

I never care about allocations measured in *kilo*bytes[1] - the smallest
systems I use have 32GB - so 50K is so far down in the noise of other
allocations. But other types of systems might be more concerned. ERST is
generally only implemented on servers ... so the better question might be:
What are the sizes for the EFI backend (where the buffer size is 1024)?
It sounds like it should scale linearly ... so below 8K??? That should not
scare many people. Even phones measure memory in hundreds of MBytes.

-Tony

[1] unless they are per-cpu or per something else that there are a lot of
on a big server - but this is a one-per-system allocation.
Re: [PATCH] panic: call panic handlers before kmsg_dump
On Thu, Jul 18, 2013 at 4:03 PM, Kees Cook wrote:
> Since the panic handlers may produce additional information (via printk)
> for the kernel log, it should be reported as part of the panic output
> saved by kmsg_dump(). Without this re-ordering, nothing that adds
> information to a panic will show up in pstore's view when kmsg_dump
> runs, and is therefore not visible to crash reporting tools that examine
> pstore output.

Good point.

Acked-by: Tony Luck
Re: [PATCH] x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors.
Gah ... there is another bug in that unaffected thread entry. The check
for MCG_STATUS should be for RIPV=1 *and* EIPV=0

gmail will mess this patch up ... but should still be readable.

-Tony

---

diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity
index 7f6ab4e..48f0fd2 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -112,7 +112,7 @@ static struct severity {
 	MCESEV(
 		KEEP, "Action required but unaffected thread is continuable",
 		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR, MCI_UC_SAR|MCI_ADDR),
-		MCGMASK(MCG_STATUS_RIPV, MCG_STATUS_RIPV)
+		MCGMASK(MCG_STATUS_RIPV|MCG_STATUS_EIPV, MCG_STATUS_RIPV)
 		),
 	MCESEV(
 		AR, "Action required: data load error in a user process",
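To spell out why the mask change matters, here is a small standalone sketch. It assumes a severity rule matches when (mcgstatus & mask) == result, which is how the MCGMASK(mask, result) pair reads; the helper names mcg_match(), old_rule() and new_rule() are invented for this illustration.

```c
#include <stdint.h>

#define MCG_STATUS_RIPV	(1ULL << 0)	/* restart IP valid */
#define MCG_STATUS_EIPV	(1ULL << 1)	/* error IP valid */

/* Hypothetical helper: a rule matches when the masked bits equal result. */
static int mcg_match(uint64_t mcgstatus, uint64_t mask, uint64_t result)
{
	return (mcgstatus & mask) == result;
}

/* Rule before the fix: only RIPV is examined, so EIPV can be anything. */
static int old_rule(uint64_t mcgstatus)
{
	return mcg_match(mcgstatus, MCG_STATUS_RIPV, MCG_STATUS_RIPV);
}

/* Rule after the fix: RIPV must be 1 *and* EIPV must be 0. */
static int new_rule(uint64_t mcgstatus)
{
	return mcg_match(mcgstatus, MCG_STATUS_RIPV | MCG_STATUS_EIPV,
			 MCG_STATUS_RIPV);
}
```

With RIPV=1 and EIPV=1 the old rule still matches, while the new rule rejects it; putting EIPV in the mask but not in the result is what expresses "EIPV must be clear".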
Re: lockdep warning in edac_create_sysfs_mci_device
On Sun, Jul 21, 2013 at 12:02 PM, Borislav Petkov wrote:
> A fix is on the way:
>
> http://marc.info/?l=linux-edac&m=137422971614927&w=2

Fix was pulled into Linus' tree yesterday evening (Portland, OR timezone):

commit 88d84ac97378c2f1d5fec9af1e8b7d9a662d6b00
Author: Borislav Petkov
Date:   Fri Jul 19 12:28:25 2013 +0200

    EDAC: Fix lockdep splat

Alexandra: Give it a test and let me and Boris know if you still see
any problems.

-Tony
Re: scsi scan: INQUIRY result too short (5), using 36
Oops ... forgot final step. That commit does revert cleanly (at least git did not grumble when I asked it to revert). The resulting kernel builds cleanly and boots without seeing this problem.

-Tony
Re: scsi scan: INQUIRY result too short (5), using 36
On Wed, Jul 24, 2013 at 12:43 PM, James Bottomley wrote:
> Oops, apparently no-one I cc'd at intel actually bothered to check the
> patch for the isci driver. The looks to be that sci_swab32_cpy needs
> multiples of four, so for commands that aren't that, it's rounding the
> wrong way. Does this fix it?

Yes. That fixes it. Wrap whichever of:

Reported-by: Tony Luck
and/or
Tested-by: Tony Luck

around that patch and ship it!

Thanks for the fast fix.

-Tony
[PATCH 0/2] machine check decode fixes
V2 of this:
 * Broken into two patches - by suggestion of Chen Gong
 * Just change MCACOD #define value - by suggestion of Naveen

Tony Luck (2):
  x86/mce: Fix mce regression from recent cleanup
  x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors

 arch/x86/include/asm/mce.h                | 13 +++++++++++--
 arch/x86/kernel/cpu/mcheck/mce-severity.c |  4 ++--
 2 files changed, 13 insertions(+), 4 deletions(-)

-- 
1.8.1.4
[PATCH 2/2] x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors
The 0x1000 bit of the MCACOD field of machine check MCi_STATUS registers
is only defined for corrected errors (where it means that hardware may
be filtering errors; see SDM section 15.9.2.1). For uncorrected errors
it may, or may not, be set - so we should mask it out when checking for
the architecturally defined recoverable error signatures (see SDM 15.9.3.1
and 15.9.3.2).

Signed-off-by: Tony Luck
---
 arch/x86/include/asm/mce.h | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 29e3093..aa97342 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -32,11 +32,20 @@
 #define MCI_STATUS_PCC	(1ULL<<57)  /* processor context corrupt */
 #define MCI_STATUS_S	(1ULL<<56)  /* Signaled machine check */
 #define MCI_STATUS_AR	(1ULL<<55)  /* Action required */
-#define MCACOD		0x	/* MCA Error Code */
+
+/*
+ * Note that the full MCACOD field of IA32_MCi_STATUS MSR is
+ * bits 15:0. But bit 12 is the 'F' bit, defined for corrected
+ * errors to indicate that errors are being filtered by hardware.
+ * We should mask out bit 12 when looking for specific signatures
+ * of uncorrected errors - so the F bit is deliberately skipped
+ * in this #define.
+ */
+#define MCACOD		0xefff	/* MCA Error Code */
 
 /* Architecturally defined codes from SDM Vol. 3B Chapter 15 */
 #define MCACOD_SCRUB	0x00C0	/* 0xC0-0xCF Memory Scrubbing */
-#define MCACOD_SCRUBMSK	0xfff0
+#define MCACOD_SCRUBMSK	0xeff0	/* Skip bit 12 ('F' bit) */
 #define MCACOD_L3WB	0x017A	/* L3 Explicit Writeback */
 #define MCACOD_DATA	0x0134	/* Data Load */
 #define MCACOD_INSTR	0x0150	/* Instruction Fetch */
-- 
1.8.1.4
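A short standalone sketch of the effect of the new mask values. The #define values are the ones from the patch; the helpers is_data_load() and is_scrub() are hypothetical and only exist to show the comparison.

```c
#include <stdint.h>

#define MCACOD		0xefff	/* MCA error code, bit 12 ('F') skipped */
#define MCACOD_SCRUB	0x00C0	/* 0xC0-0xCF Memory Scrubbing */
#define MCACOD_SCRUBMSK	0xeff0	/* also skips bit 12 */
#define MCACOD_DATA	0x0134	/* Data Load */

/* Hypothetical helper: does the low part of status carry a data load code? */
static int is_data_load(uint64_t status)
{
	return (status & MCACOD) == MCACOD_DATA;
}

/* Hypothetical helper: is this one of the 0xC0-0xCF scrubbing codes? */
static int is_scrub(uint64_t status)
{
	return (status & MCACOD_SCRUBMSK) == MCACOD_SCRUB;
}
```

A status whose low 16 bits are 0x1134 (data load signature with the F bit set) now matches MCACOD_DATA, whereas comparing all 16 bits would see 0x1134 and miss it. The same holds for a scrub code such as 0x10C5.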
[PATCH 1/2] x86/mce: Fix mce regression from recent cleanup
commit 33d7885b594e169256daef652e8d3527b2298e75
    x86/mce: Update MCE severity condition check

Simplified the rules to recognise each classification of recoverable
machine check combining the instruction and data fetch rules into a
single entry based on clarifications in the June 2013 SDM that all
recoverable events would be reported on the unaffected processor with
MCG_STATUS.EIPV=0 and MCG_STATUS.RIPV=1. Unfortunately the simplified
rule has a couple of bugs. Fix them here.

Signed-off-by: Tony Luck
---
 arch/x86/kernel/cpu/mcheck/mce-severity.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index e2703520..c370e1c 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -111,8 +111,8 @@ static struct severity {
 #ifdef CONFIG_MEMORY_FAILURE
 	MCESEV(
 		KEEP, "Action required but unaffected thread is continuable",
-		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR),
-		MCGMASK(MCG_STATUS_RIPV, MCG_STATUS_RIPV)
+		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR, MCI_UC_SAR|MCI_ADDR),
+		MCGMASK(MCG_STATUS_RIPV|MCG_STATUS_EIPV, MCG_STATUS_RIPV)
 		),
 	MCESEV(
 		AR, "Action required: data load error in a user process",
-- 
1.8.1.4
Re: x86_mce: mce_start uses number of phsical cores instead of logical cores
> What I understand from above in intel 64 Arch software Developer's manual are:
> 1) this manual is written for software developer;
> 2) It says that MCE handler only requires to synchronize among the logical
> cores in the same package/core(what I assume here is same CPU socket).
>
> I have two CPU sockets on motherboard and total 24 logical cores(12 cores
> each CPU). Each CPU has its own integrated memory controller. Each memory
> controller controls three channels of DIMMs. I can understand that if one
> dimm has error, the memory controller can trigger the MCE exception to it's
> own CPU, but why should this memory controller sends the MCE exception to the
> other CPU or the rest CPUs on the motherboard? Is there any hardware standard
> or specification for it?

The Software Developer Manual is the specification of the architecture - there
are data sheets for each processor which describe implementation details (e.g.
perhaps which types of errors are reported in which banks, and MCi_STATUS.MSCOD
field values providing more information about an error).

Your "1&2" understanding is correct.

Your question on "why should this memory controller send the MCE exception ..."
is a good one. The answer is because the architecture requires it; even though
you and I can imagine that it is possible for OS to do its work if the error is
just sent to the processors on the socket where the error was found in some
cases. There may be some cases where this is less easy (e.g. a logical processor
on one socket issues a NUMA read to a location that is on the memory controller
on the other socket).

-Tony
Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+
On Sat, May 11, 2013 at 12:52 AM, Dmitry Monakhov wrote:
> What was page_size and fsblock size?

CONFIG_IA64_PAGE_SIZE_64KB=y

fsblock size is whatever is the default for SLES11SP2 on ia64 - which
tool will tell me?

My git bisect finally completed and points the finger at:

bisect> git bisect good
ae4647fb7654676fc44a97e86eb35f9f06b99f66 is first bad commit
commit ae4647fb7654676fc44a97e86eb35f9f06b99f66
Author: Jan Kara
Date:   Fri Apr 12 00:03:42 2013 -0400

    jbd2: reduce journal_head size

    Remove unused t_cow_tid field (ext4 copy-on-write support doesn't seem
    to be happening) and change b_modified and b_jlist to bitfields thus
    saving 8 bytes in the structure.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Zheng Liu

:04 04 c39ece4341894b3daf84764ba425a87ffb90fe50 d4e8d9185c2a1b740c235ca8ed05d496a442fce3 M include

-Tony
Re: Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+
On Sun, May 12, 2013 at 7:21 PM, EUNBONG SONG wrote:
> Hi, my git bisect result is same yours. And i reported that to community
> yesterday.

Ah. Good to have some confirmation (I was never sure how long to keep
running before deciding that a test was "good"). My slowest "bad" test
took about 2.5 hours. I mostly let the tests run for >6 hours before
deciding.

I just confirmed that 3.10-rc1 still fails (30 minutes). Now running a test
on 3.10-rc1 with just this commit reverted. Only been going for about
15 minutes, so no useful information yet.

My best guess as to why this commit causes problems is that there are places
where updates to individual fields in this structure used to be independent
because they were to whole words. Now we have bitfields there are races
between access to different fields in the same word.

-Tony
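To make that guess concrete, here is a deterministic replay of one bad interleaving, written with explicit masks instead of real bitfields so it runs the same way everywhere. The field layout (one "modified" bit plus a 4-bit "jlist" value in the same word) and the function name lost_update() are invented for illustration and are not the actual journal_head layout.

```c
#include <stdint.h>

/* Hypothetical packing: two logically independent fields in one word. */
#define MODIFIED_BIT	(1u << 0)
#define JLIST_SHIFT	1
#define JLIST_MASK	(0xfu << JLIST_SHIFT)

/*
 * Replays, step by step, what two CPUs doing unlocked read-modify-write
 * updates to *different* fields of the same word can end up doing.
 * Returns the final contents of the word.
 */
static uint32_t lost_update(uint32_t word)
{
	uint32_t a = word;	/* CPU A loads the word */
	uint32_t b = word;	/* CPU B loads the same word */

	a |= MODIFIED_BIT;				/* A sets "modified" */
	b = (b & ~JLIST_MASK) | (3u << JLIST_SHIFT);	/* B sets jlist = 3 */

	word = a;		/* A stores its copy */
	word = b;		/* B stores its stale copy: A's bit is gone */
	return word;
}
```

Starting from 0, the final word has jlist == 3 but the "modified" bit is clear, even though CPU A set it. When each field was a whole word of its own, the two stores could not clobber each other like this.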
Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+
The 3.10-rc1 with ae4647fb765467 reverted is still running OK. At 3 hours now (only marginally longer than the 2.5 hours that one of the "bad" runs during the bisect managed). So I'm about 30% sure that we have a winner at the moment.

I'll leave it running and check again in the morning. This penguin is heading to bed now.

-Tony
Re: [PATCH] NOHZ, check to see if tick device is initialized in IRQ handling path
> void tick_nohz_irq_exit(void)
> {
> 	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
> +	struct clock_event_device *dev =
> +		__get_cpu_var(tick_cpu_device).evtdev;
> +
> +	/* Has the tick been initialized yet? */
> +	if (unlikely(!dev || dev->mode == CLOCK_EVT_MODE_UNUSED))
> +		return;

Could we have something in the "struct tick_sched" to tell us whether it
has been set up? Rather than this somewhat convoluted digging around in the
clock_event_device innards?

> +	if (unlikely(!dev || dev->mode == CLOCK_EVT_MODE_UNUSED))
> +		return;

Ditto here.

-Tony
VFAT complains that my file system may be corrupted
Built Linus' tree this morning (HEAD = d7ab7302f970a254997687a1cdede421a5635c68)
and got this message:

FAT-fs (sda1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.

when booting my ia64 machine. The message may well be legitimate because
I did crash the machine, so the filesystem was not unmounted cleanly.

BUT ...

If I unmount and run fsck as it suggests, then I see:

# fsck /boot/efi
fsck from util-linux-ng 2.16
dosfsck 2.11, 12 Mar 2005, FAT32, LFN
There are differences between boot sector and its backup.
Differences: (offset:original/backup)
  65:01/00
1) Copy original to backup
2) Copy backup to original
3) No action

I tried option 3 - fsck made no other changes, but I still see the message.
I tried option 1 - and I still see the message.
So I went for option 2 ... and guess what, I still see the message when
I mount this filesystem.

Note that with either option 1 or 2 "fsck" says:

Leaving file system unchanged.
/dev/sda1: 20 files, 19865/255496 clusters

This is the first time I've ever seen this message ... but I haven't had
this system crash for some time, so not really sure when this may have
started.

-Tony
Re: VFAT complains that my file system may be corrupted
On Mon, May 6, 2013 at 11:44 AM, Oleksij Rempel wrote:
> i provided patches for dosfstools for some time now, you need at least
> v3.0.14. If your system does not provide it you will need to grab it here:
> http://daniel-baumann.ch/gitweb/?p=software/dosfstools.git

I may have too old a toolchain to build those :-(

src/boot.c:560: warning: implicit declaration of function ‘cpu_to_le16’
src/boot.c:562: warning: implicit declaration of function ‘cpu_to_le32’

and then at link time:

/home/aegl/dosfstools/src/boot.c:560: undefined reference to `cpu_to_le16'
/home/aegl/dosfstools/src/boot.c:561: undefined reference to `cpu_to_le16'
/home/aegl/dosfstools/src/boot.c:562: undefined reference to `cpu_to_le32'

I guess I can fake them easily (ia64 runs little endian on Linux).

-Tony
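One way such stand-ins could look, as a sketch only: this is not what dosfstools ships. On a little-endian host like ia64 Linux an identity macro would be enough; the versions below instead build the little-endian byte layout explicitly, so they give the right answer on either kind of host.

```c
#include <stdint.h>
#include <string.h>

/*
 * Stand-in for the missing cpu_to_le16(): returns a value whose in-memory
 * bytes are the little-endian encoding of v, whatever the host byte order.
 */
static uint16_t cpu_to_le16(uint16_t v)
{
	uint8_t bytes[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
	uint16_t out;

	memcpy(&out, bytes, sizeof(out));
	return out;
}

/* Same idea for 32-bit values. */
static uint32_t cpu_to_le32(uint32_t v)
{
	uint8_t bytes[4] = {
		(uint8_t)(v & 0xff),
		(uint8_t)((v >> 8) & 0xff),
		(uint8_t)((v >> 16) & 0xff),
		(uint8_t)(v >> 24),
	};
	uint32_t out;

	memcpy(&out, bytes, sizeof(out));
	return out;
}
```

On a little-endian machine both functions return their argument unchanged, which is why faking them there is easy.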
Re: VFAT complains that my file system may be corrupted
On Mon, May 6, 2013 at 11:51 AM, Tony Luck wrote:
> I guess I can fake them easily (ia64 runs little endian on Linux).

Duh. Especially as the only use is line 560-562 in src/boot.c:

	de.starthi = CT_LE_W(0);
	de.start = CT_LE_W(0);
	de.size = CT_LE_L(0);

Gotta make sure to use a little endian 0 rather than risk a big-endian one. WTF?

Anyhow ... thanks for the pointer. That fixed my filesystem for me.

-Tony
Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+
I think I have the same (or highly similar) thing happening on ia64.

Similarities: seeing assertions fail for b_transaction
Differences: I only have ext3 filesystems mounted, no ext4

See attached trace. I'm pretty certain that the highly unhelpful

	bugcheck! 0 [1]

comes from the

	J_ASSERT_JH(jh, jh->b_transaction == NULL);

from disassembling __journal_remove_journal_head(). The instruction pointer
points to the 2nd "break" instruction in the function.

The problem shows up after 30 minutes to a couple of hours of stress (kernel
builds with "make -j32").

I'm pretty sure this problem didn't occur in plain v3.9 (it can run for a
full 24 hours). Trying to bisect - but it takes a while to be convinced that
a good kernel is actually good (since I don't have a clear picture of how long
to run before deciding that the bug isn't going to show)

-Tony

bug
Description: Binary data
Re: [tip:smp/hotplug] idle: Implement generic idle function
Built next-20130415 and got this on ia64 early in boot:

WARNING: at kernel/cpu/idle.c:94 cpu_idle_loop+0x360/0x380()
Hardware name: server rx2620
Modules linked in:

Call Trace:
 [] show_stack+0x80/0xa0 sp=a00101287c50 bsp=a00101280e48
 [] dump_stack+0x30/0x50 sp=a00101287e20 bsp=a00101280e30
 [] warn_slowpath_common+0xc0/0x100 sp=a00101287e20 bsp=a00101280de8
 [] warn_slowpath_null+0x40/0x60 sp=a00101287e20 bsp=a00101280dc0
 [] cpu_idle_loop+0x360/0x380 sp=a00101287e20 bsp=a00101280d80
 [] cpu_startup_entry+0x40/0x60 sp=a00101287e20 bsp=a00101280d68
 [] rest_init+0x100/0x120 sp=a00101287e20 bsp=a00101280d50
 [] start_kernel+0x770/0x890 sp=a00101287e20 bsp=a00101280cd0
 [] start_ap+0x760/0x780 sp=a00101287e30 bsp=a00101280bc0
Re: [tip:smp/hotplug] idle: Implement generic idle function
On Tue, Apr 16, 2013 at 6:28 AM, Thomas Gleixner wrote:
> Hmm, is safe_halt() returning with interrupts disabled? If yes, it
> lacks a local_irq_enable().

Quite probably. Adding arch_local_irq_enable() to arch_safe_halt() makes all
the problems go away. I'll send you the one-line patch from a system that
won't mung it like gmail will.

-Tony
Re: [PATCH 4/9] ia64: cpufreq: move cpufreq driver to drivers/cpufreq
[Repost in plain text so the lists don't bounce it - curse you Gmail for switching to HTML]

> Any comments on this patch?

This part looks OK ... But is there a big finish later in the patch series
where you unify some/all of the cpufreq code across architectures? By itself
just moving bits from arch/ia64/kernel/cpufreq to drivers/cpufreq/ doesn't
look to add much value.

-Tony
[PATCH 5/6] x86/mce: Provide an option to keep cmci_reenable() quiet
cmci_reenable() calls cmci_discover() to look at which machine check
banks are shared between processors. It ensures that only one cpu
takes ownership of each shared bank. At boot time cmci_discover() is
muted, but during hot add events it provides some output which may be
helpful to ensure that all banks have an owner.

We want to use cmci_reenable() when a CMCI storm subsides. In this case
the topology has not changed, so we do not need any commentary as it
goes about its business. Add a "quiet" argument to cmci_reenable() that
it passes to cmci_discover().

Signed-off-by: Tony Luck
---

[Patches 1-4 remain as previously posted. This is a new patch to help
tidy console messages. Old patch 5 becomes patch 6 (and has a few cleanups)]

 arch/x86/include/asm/mce.h             |  4 ++--
 arch/x86/kernel/cpu/mcheck/mce.c       |  4 ++--
 arch/x86/kernel/cpu/mcheck/mce_intel.c | 10 +++++-----
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 441520e..bf79a0f 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -165,13 +165,13 @@ extern int mce_cmci_disabled;
 extern int mce_ignore_ce;
 void mce_intel_feature_init(struct cpuinfo_x86 *c);
 void cmci_clear(void);
-void cmci_reenable(void);
+void cmci_reenable(int quiet);
 void cmci_rediscover(int dying);
 void cmci_recheck(void);
 #else
 static inline void mce_intel_feature_init(struct cpuinfo_x86 *c) { }
 static inline void cmci_clear(void) {}
-static inline void cmci_reenable(void) {}
+static inline void cmci_reenable(int quiet) {}
 static inline void cmci_rediscover(int dying) {}
 static inline void cmci_recheck(void) {}
 #endif
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index b4dde15..826dd21 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1994,7 +1994,7 @@ static void mce_enable_ce(void *all)
 {
 	if (!mce_available(__this_cpu_ptr(&cpu_info)))
 		return;
-	cmci_reenable();
+	cmci_reenable(0);
 	cmci_recheck();
 	if (all)
 		__mcheck_cpu_init_timer();
@@ -2246,7 +2246,7 @@ static void __cpuinit mce_reenable_cpu(void *h)
 		return;
 
 	if (!(action & CPU_TASKS_FROZEN))
-		cmci_reenable();
+		cmci_reenable(0);
 	for (i = 0; i < banks; i++) {
 		struct mce_bank *b = &mce_banks[i];
 
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 38e49bc..e652cde 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -78,7 +78,7 @@ static void print_update(char *type, int *hdr, int num)
  * on this CPU. Use the algorithm recommended in the SDM to discover shared
  * banks.
  */
-static void cmci_discover(int banks, int boot)
+static void cmci_discover(int banks, int quiet)
 {
 	unsigned long *owned = (void *)&__get_cpu_var(mce_banks_owned);
 	unsigned long flags;
@@ -96,7 +96,7 @@ static void cmci_discover(int banks, int boot)
 
 		/* Already owned by someone else? */
 		if (val & MCI_CTL2_CMCI_EN) {
-			if (test_and_clear_bit(i, owned) && !boot)
+			if (test_and_clear_bit(i, owned) && !quiet)
 				print_update("SHD", &hdr, i);
 			__clear_bit(i, __get_cpu_var(mce_poll_banks));
 			continue;
@@ -109,7 +109,7 @@ static void cmci_discover(int banks, int boot)
 
 		/* Did the enable bit stick? -- the bank supports CMCI */
 		if (val & MCI_CTL2_CMCI_EN) {
-			if (!test_and_set_bit(i, owned) && !boot)
+			if (!test_and_set_bit(i, owned) && !quiet)
 				print_update("CMCI", &hdr, i);
 			__clear_bit(i, __get_cpu_var(mce_poll_banks));
 		} else {
@@ -196,11 +196,11 @@ void cmci_rediscover(int dying)
 /*
  * Reenable CMCI on this CPU in case a CPU down failed.
  */
-void cmci_reenable(void)
+void cmci_reenable(int quiet)
 {
 	int banks;
 	if (cmci_supported(&banks))
-		cmci_discover(banks, 0);
+		cmci_discover(banks, quiet);
 }
 
 static void intel_init_cmci(void)
-- 
1.7.10.2.552.gaa3bb87
[PATCH 6/6] x86/mce: Add CMCI poll mode
From: Chen Gong

On Intel systems corrected machine check interrupts (CMCI) may be sent to
multiple logical processors; possibly to all processors on the affected
socket (SDM Volume 3B "15.5.1 CMCI Local APIC Interface"). This means
that a persistent error (such as a stuck bit in ECC memory) may cause
a storm of interrupts that greatly hinders or prevents forward progress
(probably on many processors).

To solve this we keep track of the rate at which each processor sees
CMCI. If we exceed a threshold, we disable CMCI delivery and switch to
polling the machine check banks. If the storm subsides (none of the
affected processors see any more errors for a complete poll interval) we
re-enable CMCI.

Signed-off-by: Chen Gong
Signed-off-by: Thomas Gleixner
Tested-by: Chen Gong
Signed-off-by: Tony Luck
---

Changes (w.r.t. old patch 5/5):
+ New commit message
+ Print messages when storm starts/ends
+ Suppress messages from cmci_discover()
+ Some spelling fixes
+ Increased storm threshold from 5 to 15 (so we have a few more
  samples for pattern detection to identify the source of the storm).

arch/x86/kernel/cpu/mcheck/mce-internal.h | 12 arch/x86/kernel/cpu/mcheck/mce.c | 47 +++-- arch/x86/kernel/cpu/mcheck/mce_intel.c| 108 +- 3 files changed, 160 insertions(+), 7 deletions(-) diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h index ed44c8a..6a05c1d 100644 --- a/arch/x86/kernel/cpu/mcheck/mce-internal.h +++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h @@ -28,6 +28,18 @@ extern int mce_ser; extern struct mce_bank *mce_banks; +#ifdef CONFIG_X86_MCE_INTEL +unsigned long mce_intel_adjust_timer(unsigned long interval); +void mce_intel_cmci_poll(void); +void mce_intel_hcpu_update(unsigned long cpu); +#else +# define mce_intel_adjust_timer mce_adjust_timer_default +static inline void mce_intel_cmci_poll(void) { } +static inline void mce_intel_hcpu_update(unsigned long cpu) { } +#endif + +void mce_timer_kick(unsigned long interval); + #ifdef CONFIG_ACPI_APEI int apei_write_mce(struct mce *m); ssize_t apei_read_mce(struct mce *m, u64 *record_id); diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index 826dd21..ee57a8f 100644 --- a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c @@ -1260,6 +1260,14 @@ static unsigned long check_interval = 5 * 60; /* 5 minutes */ static DEFINE_PER_CPU(unsigned long, mce_next_interval); /* in jiffies */ static DEFINE_PER_CPU(struct timer_list, mce_timer); +static unsigned long mce_adjust_timer_default(unsigned long interval) +{ + return interval; +} + +static unsigned long (*mce_adjust_timer)(unsigned long interval) = + mce_adjust_timer_default; + static void mce_timer_fn(unsigned long data) { struct timer_list *t = &__get_cpu_var(mce_timer); @@ -1270,6 +1278,7 @@ static void mce_timer_fn(unsigned long data) if (mce_available(__this_cpu_ptr(&cpu_info))) { machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_poll_banks)); + mce_intel_cmci_poll(); } /* @@ -1277,14 +1286,38 @@ static void mce_timer_fn(unsigned long data) * polling 
interval, otherwise increase the polling interval. */ iv = __this_cpu_read(mce_next_interval); - if (mce_notify_irq()) + if (mce_notify_irq()) { iv = max(iv / 2, (unsigned long) HZ/100); - else + } else { iv = min(iv * 2, round_jiffies_relative(check_interval * HZ)); + iv = mce_adjust_timer(iv); + } __this_cpu_write(mce_next_interval, iv); + /* Might have become 0 after CMCI storm subsided */ + if (iv) { + t->expires = jiffies + iv; + add_timer_on(t, smp_processor_id()); + } +} - t->expires = jiffies + iv; - add_timer_on(t, smp_processor_id()); +/* + * Ensure that the timer is firing in @interval from now. + */ +void mce_timer_kick(unsigned long interval) +{ + struct timer_list *t = &__get_cpu_var(mce_timer); + unsigned long when = jiffies + interval; + unsigned long iv = __this_cpu_read(mce_next_interval); + + if (timer_pending(t)) { + if (time_before(when, t->expires)) + mod_timer_pinned(t, when); + } else { + t->expires = round_jiffies(when); + add_timer_on(t, smp_processor_id()); + } + if (interval < iv) + __this_cpu_write(mce_next_interval, interval); } /* Must not be called in IRQ context where del_timer_sync() can deadlock */ @@ -1548,6 +1581,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c) switch (c->x86_vendor) { case X86_VENDOR_INTEL: mce_intel_feature_init(c); +
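For readers following the changelog, the storm handling it describes (count interrupts, trip at a threshold, fall back to polling, re-arm after a quiet poll) can be sketched in user space roughly as below. This is only an illustration, not the code from the patch: the names storm_tracker, storm_note_interrupt and storm_note_poll are hypothetical, the window length is an assumption, and the sketch tracks a single CPU whereas the changelog requires that none of the affected processors see further errors. Only the threshold of 15 comes from the posting.

```c
#include <stdbool.h>

/* Threshold of 15 is taken from the changelog; the window length is an
 * assumption made for this sketch (think "one second worth of ticks"). */
#define STORM_THRESHOLD    15
#define STORM_WINDOW_TICKS 100

enum storm_state { STORM_NONE, STORM_ACTIVE };

/* Hypothetical per-CPU bookkeeping. */
struct storm_tracker {
	enum storm_state state;
	unsigned long window_start; /* tick at which the current window began */
	unsigned int count;         /* interrupts seen in the current window */
};

/*
 * Called for every simulated CMCI. Returns true when the caller should
 * disable interrupt delivery and switch to polling.
 */
static bool storm_note_interrupt(struct storm_tracker *t, unsigned long now)
{
	if (t->state == STORM_ACTIVE)
		return false; /* already polling, nothing to decide */

	if (now - t->window_start >= STORM_WINDOW_TICKS) {
		/* The old window expired: start counting afresh. */
		t->window_start = now;
		t->count = 0;
	}

	t->count++;
	if (t->count <= STORM_THRESHOLD)
		return false;

	t->state = STORM_ACTIVE;
	return true;
}

/*
 * Called once per poll interval while in storm mode. Returns true when
 * the poll found no errors, so interrupts may be re-enabled.
 */
static bool storm_note_poll(struct storm_tracker *t, bool errors_seen)
{
	if (t->state != STORM_ACTIVE || errors_seen)
		return false;

	t->state = STORM_NONE;
	t->count = 0;
	return true;
}
```

With these assumptions, the sixteenth interrupt inside one window trips the storm, and the first error-free poll ends it.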
[PATCH 5/6] x86/mce: Make cmci_discover() quiet
cmci_discover() works out which machine check banks support CMCI, and which of those are shared by multiple logical processors. It uses this information to ensure that exactly one cpu is designated the owner of each bank so that when interrupts are broadcast to multiple cpus, only one of them will look in a shared bank to log the error and clear the bank. At boot time cmci_discover() performs this task silently. But during certain cpu hotplug operations it prints out a set of summary lines like this: CPU 35 MCA banks CMCI:0 CMCI:1 CMCI:3 CMCI:5 CMCI:6 CMCI:7 CMCI:8 CMCI:9 CMCI:10 CMCI:11 CPU 1 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 39 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 38 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 32 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 37 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 36 MCA banks CMCI:0 CMCI:1 CMCI:3 CPU 34 MCA banks CMCI:0 CMCI:1 CMCI:3 The value of these messages seems very low. A user might painstakingly cross-check against the data sheet for a processor to ensure that all CMCI supported banks are correctly reported, but this seems improbable. If users really wanted to do this, we should print the information at boot time too. Remove the messages. Signed-off-by: Tony Luck --- Gong pointed out to me offline that my previous "patch 5/6" would not do what I said it did in the case where a processor is taken offline during a CMCI storm. We'd have a topology change, but would suppress the bank attribution messages when the storm ended. I took a longer look at the messages, and decided that we can live without them. 
arch/x86/kernel/cpu/mcheck/mce_intel.c | 25 ++--- 1 file changed, 6 insertions(+), 19 deletions(-) diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c index 38e49bc..59648e4 100644 --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c @@ -65,24 +65,15 @@ static void intel_threshold_interrupt(void) mce_notify_irq(); } -static void print_update(char *type, int *hdr, int num) -{ - if (*hdr == 0) - printk(KERN_INFO "CPU %d MCA banks", smp_processor_id()); - *hdr = 1; - printk(KERN_CONT " %s:%d", type, num); -} - /* * Enable CMCI (Corrected Machine Check Interrupt) for available MCE banks * on this CPU. Use the algorithm recommended in the SDM to discover shared * banks. */ -static void cmci_discover(int banks, int boot) +static void cmci_discover(int banks) { unsigned long *owned = (void *)&__get_cpu_var(mce_banks_owned); unsigned long flags; - int hdr = 0; int i; raw_spin_lock_irqsave(&cmci_discover_lock, flags); @@ -96,8 +87,7 @@ static void cmci_discover(int banks, int boot) /* Already owned by someone else? */ if (val & MCI_CTL2_CMCI_EN) { - if (test_and_clear_bit(i, owned) && !boot) - print_update("SHD", &hdr, i); + clear_bit(i, owned); __clear_bit(i, __get_cpu_var(mce_poll_banks)); continue; } @@ -109,16 +99,13 @@ static void cmci_discover(int banks, int boot) /* Did the enable bit stick? 
-- the bank supports CMCI */ if (val & MCI_CTL2_CMCI_EN) { - if (!test_and_set_bit(i, owned) && !boot) - print_update("CMCI", &hdr, i); + set_bit(i, owned); __clear_bit(i, __get_cpu_var(mce_poll_banks)); } else { WARN_ON(!test_bit(i, __get_cpu_var(mce_poll_banks))); } } raw_spin_unlock_irqrestore(&cmci_discover_lock, flags); - if (hdr) - printk(KERN_CONT "\n"); } /* @@ -186,7 +173,7 @@ void cmci_rediscover(int dying) continue; /* Recheck banks in case CPUs don't all have the same */ if (cmci_supported(&banks)) - cmci_discover(banks, 0); + cmci_discover(banks); } set_cpus_allowed_ptr(current, old); @@ -200,7 +187,7 @@ void cmci_reenable(void) { int banks; if (cmci_supported(&banks)) - cmci_discover(banks, 0); + cmci_discover(banks); } static void intel_init_cmci(void) @@ -211,7 +198,7 @@ static void intel_init_cmci(void) return; mce_threshold_vector = intel_threshold_interrupt; - cmci_discover(banks, 1); + cmci_discover(banks); /* * For CPU #0 this runs with still disabled APIC, but that's * ok because only the vector is set up. We still do another -- 1.7.10.2.552.gaa3bb87
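The ownership rule described in this posting (the first CPU to set the enable bit in a shared bank becomes its owner, later CPUs see the bit already set and back off) can be modelled in user space along these lines. It is a simplified sketch: struct bank, its supports_cmci flag and discover() are hypothetical stand-ins for the MSR accesses, and the real code runs under cmci_discover_lock, which is omitted here. The position of the enable bit (bit 30 of MCi_CTL2) follows the SDM.

```c
#include <stdint.h>

#define CTL2_CMCI_EN (1ULL << 30) /* CMCI enable bit in MCi_CTL2 */

/* Hypothetical model of one machine check bank shared by several CPUs. */
struct bank {
	uint64_t ctl2;     /* simulated shared MCi_CTL2 register */
	int supports_cmci; /* does the enable bit stick when written? */
};

/*
 * Run the discovery pass for one simulated CPU and return a bitmask of
 * the banks that this CPU now owns.
 */
static unsigned long discover(struct bank *banks, int nbanks)
{
	unsigned long owned = 0;
	int i;

	for (i = 0; i < nbanks; i++) {
		/* Already owned by someone else? */
		if (banks[i].ctl2 & CTL2_CMCI_EN)
			continue;

		/* Try to enable CMCI; hardware without support ignores it. */
		if (banks[i].supports_cmci)
			banks[i].ctl2 |= CTL2_CMCI_EN;

		/* Did the enable bit stick? Then the bank is ours. */
		if (banks[i].ctl2 & CTL2_CMCI_EN)
			owned |= 1UL << i;
	}
	return owned;
}
```

Running discover() twice over the same array mimics two logical processors sharing the banks: the first call claims every CMCI-capable bank and the second claims none.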
Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y
> Perhaps what is happening is that cpu0 comes online ... safely skips > over the early printk calls. Calls cpu_init() which sets up the resources > *it* needs (ar.k3 points to per-cpu space), and then executes > sched_init() which marks it safe for all printk's. Then cpu1 comes > up and does a printk before it gets to cpu_init(). I just tried Ingo's patch[1] on a 2.6.25-rc2 kernel with printk timestamps turned on ... and it booted just fine on my tiger4. The default path for non-boot cpus is from head.S to start_secondary(), and that calls cpu_init() pretty quickly. There shouldn't normally[2] be any printk() calls on the non-boot cpu before it is safe to do so. -Tony [1] Attached [2] If you set #define SMP_DEBUG in arch/ia64/kernel/smpboot.c that enables at least one printk() that will cause problems if you have also configured timestamps. kernel/sched.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) Index: linux-x86.q/kernel/sched.c === --- linux-x86.q.orig/kernel/sched.c +++ linux-x86.q/kernel/sched.c @@ -666,6 +666,8 @@ const_debug unsigned int sysctl_sched_rt */ const_debug unsigned int sysctl_sched_rt_ratio = 62259; +static __read_mostly int scheduler_running; + /* * For kernel-internal use: high-speed (but slightly incorrect) per-cpu * clock constructed from sched_clock(): @@ -676,14 +678,16 @@ unsigned long long cpu_clock(int cpu) unsigned long flags; struct rq *rq; - local_irq_save(flags); - rq = cpu_rq(cpu); /* * Only call sched_clock() if the scheduler has already been * initialized (some code might call cpu_clock() very early): */ - if (rq->idle) - update_rq_clock(rq); + if (unlikely(!scheduler_running)) + return 0; + + local_irq_save(flags); + rq = cpu_rq(cpu); + update_rq_clock(rq); now = rq->clock; local_irq_restore(flags); @@ -7255,6 +7259,8 @@ void __init sched_init(void) * During early bootup we pretend to be a normal task: */ current->sched_class = &fair_sched_class; + + scheduler_running = 1; } #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP --
Re: [PATCH 2/2] x86/mce: Add quirk for instruction recovery on Sandy Bridge processors
> Maybe define a default empty quirk_no_way_out() on the remaining > families/vendors so that the compiler can optimize it away and we save > ourselves the if-test? Perhaps I misunderstood your suggestion. I don't see how the compiler will manage to optimize it all away. I just tried defining static void quirk_no_way_out_nop(int bank, struct mce *m, struct pt_regs *regs) { } and providing that as an initial value for the quirk_no_way_out function pointer. Then I deleted the "if (quirk_no_way_out)". Looking at the assembly code produced, I now just have an unconditional call: callq *0x9fe992(%rip) # 81a18668 I'd think that a call through a function pointer to an empty function is more expensive than testing whether that function pointer was NULL. -Tony
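The two alternatives discussed in this reply can be put side by side in a small stand-alone sketch. All names here (struct event, quirk_hook, quirk_nop and so on) are hypothetical and only loosely mirror the kernel code. Both variants behave the same; the difference is that the second always pays for an indirect call, because the compiler generally cannot see through a pointer that may be reassigned at run time.

```c
#include <stddef.h>

/* Hypothetical stand-in for the data a quirk would adjust. */
struct event {
	int bank;
	unsigned int flags;
};

typedef void (*quirk_fn)(struct event *e);

/* Empty default and a sample quirk that sets a marker flag. */
static void quirk_nop(struct event *e) { (void)e; }
static void quirk_mark(struct event *e) { e->flags |= 1u; }

/* Variant 1: pointer is NULL unless a quirk is installed. */
static quirk_fn quirk_hook;

static void handle_checked(struct event *e)
{
	if (quirk_hook) /* cheap test, no call in the common case */
		quirk_hook(e);
}

/* Variant 2: pointer always valid, defaults to the empty function. */
static quirk_fn quirk_hook_default = quirk_nop;

static void handle_unconditional(struct event *e)
{
	quirk_hook_default(e); /* indirect call even when it does nothing */
}
```

Installing quirk_mark in either pointer produces the same result for the caller, which is why the choice comes down to the cost of the common, no-quirk path.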
[PATCH] dmi: Feed DMI table to /dev/random driver
Send the entire DMI (SMBIOS) table to the /dev/random driver to help seed its pools. Signed-off-by: Tony Luck --- This looks like a useful addition to your /dev/random series. There are lots of platform-specific goodies in this table (BIOS version, system serial number and UUID, count and version number of processors, DIMM slot population and serial numbers, etc.) On the system I tested the patch on, the table is 9866 bytes. Is it OK to dump that much into add_device_randomness() in one shot? The alternative is to select the 'useful' bits deeper in the routines that parse the entries in the table. drivers/firmware/dmi_scan.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c index 153980b..b298158 100644 --- a/drivers/firmware/dmi_scan.c +++ b/drivers/firmware/dmi_scan.c @@ -6,6 +6,7 @@ #include #include #include +#include #include /* @@ -111,6 +112,8 @@ static int __init dmi_walk_early(void (*decode)(const struct dmi_header *, dmi_table(buf, dmi_len, dmi_num, decode, NULL); + add_device_randomness(buf, dmi_len); + dmi_iounmap(buf, dmi_len); return 0; } -- 1.7.10.2.552.gaa3bb87
Re: [PATCH] dmi: Feed DMI table to /dev/random driver
On Fri, Jul 20, 2012 at 5:56 PM, Theodore Ts'o wrote: > The other approach was to add some new interface that random.c would > call which would grab the dmi data from rand_initialize(). But that's > going to be a lot more complicated, so I guess we should go with the > simple/stupid approach. It wouldn't be all that hard ... we'd just need to add a new entry point to the dmi code for random to call (and a stub somewhere so that CONFIG_DMI=n kernels still build). But getting some per-platform data into the random pools earlier is a good thing ... it means that users of random data will see the benefit earlier than they do now. So add the big fat comment so that people know not to break this useful (if not entirely intentional) functionality. -Tony
[PATCH] random: Add comment to rand_initialize()
Many platforms have per-machine instance data (serial numbers, asset tags, etc.) squirreled away in areas that are accessed during early system bringup. Mixing this data into the random pools has a very high value in providing better random data, so we should allow (and even encourage) architecture code to call add_device_randomness() from the setup_arch() paths. However, this limits our options for internal structure of the random driver since rand_initialize() is not called until long after setup_arch(). Add a big fat comment to rand_initialize() spelling out this requirement. Suggested-by: Theodore Ts'o Signed-off-by: Tony Luck --- Theodore Ts'o wrote: > I agree. Want to send a revised patch with the comment, and I'll drop > it into the random.git tree? Additional patch rather than revised (since I'm touching different subsystems: dmi and random). drivers/char/random.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/char/random.c b/drivers/char/random.c index 9793b40..1a2dfa8 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -1087,6 +1087,16 @@ static void init_std_data(struct entropy_store *r) mix_pool_bytes(r, utsname(), sizeof(*(utsname())), NULL); } +/* + * Note that setup_arch() may call add_device_randomness() + * long before we get here. This allows seeding of the pools + * with some platform dependent data very early in the boot + * process. But it limits our options here. We must use + * statically allocated structures that already have all + * initializations complete at compile time. We should also + * take care not to overwrite the precious per platform data + * we were given. + */ static int rand_initialize(void) { init_std_data(&input_pool); -- 1.7.10.2.552.gaa3bb87
checkpatch should not complain about 'Suggested-by:'
checkpatch just gave me: WARNING: Non-standard signature: Suggested-by: There are over 500 instances of 'Suggested-by:', and it seems to have some value in tracking history and awarding credit where it is due. "Reported-and-tested-by:" is also in regular use, but not in the list of "standard" signatures. -Tony
[PATCH 0/2] Fix machine check recovery for instruction fault on Sandy Bridge
[Unchanged since last posted - except to add Boris' Acked-by since after further discussion his nitpick didn't warrant a change. Ready for x86/mce branch ... and if possible to move to Linus in this merge window] This patch series adds a workaround for some strange asymmetry between how machine checks are reported for data and instruction fetches. For an instruction fetch error the processor does not set the EIPV bit in the MCG_STATUS register on the affected processor, leading us to believe that the cs/ip values saved on the stack are not associated with the machine check ... which in turn makes us unable to determine whether the machine check was taken in kernel or user mode. The workaround is to fake the presence of the EIPV bit for this error on this processor model. Not pretty, but avoids having to make special cases later in the code. Tony Luck (2): x86/mce: Move MCACOD defines from mce-severity.c to <asm/mce.h> x86/mce: Add quirk for instruction recovery on Sandy Bridge processors arch/x86/include/asm/mce.h | 8 ++ arch/x86/kernel/cpu/mcheck/mce-severity.c | 7 - arch/x86/kernel/cpu/mcheck/mce.c | 43 --- 3 files changed, 48 insertions(+), 10 deletions(-) -- 1.7.10.2.552.gaa3bb87
[PATCH 1/2] x86/mce: Move MCACOD defines from mce-severity.c to <asm/mce.h>
We will need some of these values in mce.c. Move them to the appropriate header file so they are available. Acked-by: Borislav Petkov Signed-off-by: Tony Luck --- arch/x86/include/asm/mce.h | 8 ++++++++ arch/x86/kernel/cpu/mcheck/mce-severity.c | 7 ------- 2 files changed, 8 insertions(+), 7 deletions(-) diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index 441520e..a3ac52b 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -33,6 +33,14 @@ #define MCI_STATUS_PCC (1ULL<<57) /* processor context corrupt */ #define MCI_STATUS_S (1ULL<<56) /* Signaled machine check */ #define MCI_STATUS_AR (1ULL<<55) /* Action required */ +#define MCACOD 0xffff /* MCA Error Code */ + +/* Architecturally defined codes from SDM Vol. 3B Chapter 15 */ +#define MCACOD_SCRUB 0x00C0 /* 0xC0-0xCF Memory Scrubbing */ +#define MCACOD_SCRUBMSK 0xfff0 +#define MCACOD_L3WB 0x017A /* L3 Explicit Writeback */ +#define MCACOD_DATA 0x0134 /* Data Load */ +#define MCACOD_INSTR 0x0150 /* Instruction Fetch */ /* MCi_MISC register defines */ #define MCI_MISC_ADDR_LSB(m) ((m) & 0x3f) diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c index 413c2ce..1301762 100644 --- a/arch/x86/kernel/cpu/mcheck/mce-severity.c +++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c @@ -55,13 +55,6 @@ static struct severity { #define MCI_UC_S (MCI_STATUS_UC|MCI_STATUS_S) #define MCI_UC_SAR (MCI_STATUS_UC|MCI_STATUS_S|MCI_STATUS_AR) #define MCI_ADDR (MCI_STATUS_ADDRV|MCI_STATUS_MISCV) -#define MCACOD 0xffff -/* Architecturally defined codes from SDM Vol. 3B Chapter 15 */ -#define MCACOD_SCRUB 0x00C0 /* 0xC0-0xCF Memory Scrubbing */ -#define MCACOD_SCRUBMSK 0xfff0 -#define MCACOD_L3WB 0x017A /* L3 Explicit Writeback */ -#define MCACOD_DATA 0x0134 /* Data Load */ -#define MCACOD_INSTR 0x0150 /* Instruction Fetch */ MCESEV( NO, "Invalid", -- 1.7.10.2.552.gaa3bb87
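As an illustration of how the defines being moved are meant to be used, the sketch below matches the low 16 bits of a bank status value against them. The helper names is_scrub() and is_instr_fetch() are hypothetical, and the full 16-bit value 0xffff for MCACOD is an assumption here, since the MCA error code occupies the low 16 bits of MCi_STATUS. The scrub mask ignores the low nibble so that the whole 0xC0-0xCF range matches.

```c
#include <stdbool.h>
#include <stdint.h>

#define MCACOD          0xffff /* MCA Error Code: low 16 bits (assumed) */
#define MCACOD_SCRUB    0x00C0 /* 0xC0-0xCF Memory Scrubbing */
#define MCACOD_SCRUBMSK 0xfff0
#define MCACOD_L3WB     0x017A /* L3 Explicit Writeback */
#define MCACOD_DATA     0x0134 /* Data Load */
#define MCACOD_INSTR    0x0150 /* Instruction Fetch */

/* True for any memory scrubbing code in the 0xC0-0xCF range. */
static bool is_scrub(uint64_t status)
{
	return (status & MCACOD_SCRUBMSK) == MCACOD_SCRUB;
}

/* True only for the exact instruction fetch code. */
static bool is_instr_fetch(uint64_t status)
{
	return (status & MCACOD) == MCACOD_INSTR;
}
```

The upper status bits (valid, uncorrected, and so on) do not disturb either test because both masks cover only the error code field.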
[PATCH 2/2] x86/mce: Add quirk for instruction recovery on Sandy Bridge processors
Sandy Bridge processors follow the SDM (Vol 3B, Table 15-20) and set both the RIPV and EIPV bits in the MCG_STATUS register to zero for machine checks during instruction fetch. This is more than a little counter-intuitive and means that Linux cannot recover from these errors. Rather than insert special case code at several places in mce.c and mce-severity.c, we pretend the EIPV bit was set for just this case early in processing the machine check. Acked-by: Borislav Petkov Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mcheck/mce.c | 43 +--- 1 file changed, 40 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index da27c5d..e65e738 100644 --- a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c @@ -102,6 +102,8 @@ DEFINE_PER_CPU(mce_banks_t, mce_poll_banks) = { static DEFINE_PER_CPU(struct work_struct, mce_work); +static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs *regs); + /* * CPU/chipset specific EDAC code can register a notifier call here to print * MCE errors in a human-readable form. @@ -649,14 +651,18 @@ EXPORT_SYMBOL_GPL(machine_check_poll); * Do a quick check if any of the events requires a panic. * This decides if we keep the events around or clear them. 
*/ -static int mce_no_way_out(struct mce *m, char **msg, unsigned long *validp) +static int mce_no_way_out(struct mce *m, char **msg, unsigned long *validp, + struct pt_regs *regs) { int i, ret = 0; for (i = 0; i < banks; i++) { m->status = mce_rdmsrl(MSR_IA32_MCx_STATUS(i)); - if (m->status & MCI_STATUS_VAL) + if (m->status & MCI_STATUS_VAL) { __set_bit(i, validp); + if (quirk_no_way_out) + quirk_no_way_out(i, m, regs); + } if (mce_severity(m, tolerant, msg) >= MCE_PANIC_SEVERITY) ret = 1; } @@ -1039,7 +1045,7 @@ void do_machine_check(struct pt_regs *regs, long error_code) *final = m; memset(valid_banks, 0, sizeof(valid_banks)); - no_way_out = mce_no_way_out(&m, &msg, valid_banks); + no_way_out = mce_no_way_out(&m, &msg, valid_banks, regs); barrier(); @@ -1415,6 +1421,34 @@ static void __mcheck_cpu_init_generic(void) } } +/* + * During IFU recovery Sandy Bridge -EP4S processors set the RIPV and + * EIPV bits in MCG_STATUS to zero on the affected logical processor (SDM + * Vol 3B Table 15-20). But this confuses both the code that determines + * whether the machine check occurred in kernel or user mode, and also + * the severity assessment code. Pretend that EIPV was set, and take the + * ip/cs values from the pt_regs that mce_gather_info() ignored earlier. 
+ */ +static void quirk_sandybridge_ifu(int bank, struct mce *m, struct pt_regs *regs) +{ + if (bank != 0) + return; + if ((m->mcgstatus & (MCG_STATUS_EIPV|MCG_STATUS_RIPV)) != 0) + return; + if ((m->status & (MCI_STATUS_OVER|MCI_STATUS_UC| + MCI_STATUS_EN|MCI_STATUS_MISCV|MCI_STATUS_ADDRV| + MCI_STATUS_PCC|MCI_STATUS_S|MCI_STATUS_AR| + MCACOD)) != + (MCI_STATUS_UC|MCI_STATUS_EN| + MCI_STATUS_MISCV|MCI_STATUS_ADDRV|MCI_STATUS_S| + MCI_STATUS_AR|MCACOD_INSTR)) + return; + + m->mcgstatus |= MCG_STATUS_EIPV; + m->ip = regs->ip; + m->cs = regs->cs; +} + /* Add per CPU specific workarounds here */ static int __cpuinit __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c) { @@ -1512,6 +1546,9 @@ static int __cpuinit __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c) */ if (c->x86 == 6 && c->x86_model <= 13 && mce_bootlog < 0) mce_bootlog = 0; + + if (c->x86 == 6 && c->x86_model == 45) + quirk_no_way_out = quirk_sandybridge_ifu; } if (monarch_timeout < 0) monarch_timeout = 0; -- 1.7.10.2.552.gaa3bb87
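The matching logic of the quirk can be exercised outside the kernel with a sketch like the one below. It is not the kernel function: struct fake_mce, struct fake_regs, apply_ifu_quirk() and the IFU_MASK/IFU_WANT macros are hypothetical names, and the return value is added purely so the outcome can be checked. The bit positions follow the SDM layout of MCG_STATUS and MCi_STATUS, and MCACOD is assumed to be the full 16-bit mask.

```c
#include <stdint.h>

#define MCG_STATUS_RIPV  (1ULL << 0)
#define MCG_STATUS_EIPV  (1ULL << 1)

#define MCI_STATUS_OVER  (1ULL << 62)
#define MCI_STATUS_UC    (1ULL << 61)
#define MCI_STATUS_EN    (1ULL << 60)
#define MCI_STATUS_MISCV (1ULL << 59)
#define MCI_STATUS_ADDRV (1ULL << 58)
#define MCI_STATUS_PCC   (1ULL << 57)
#define MCI_STATUS_S     (1ULL << 56)
#define MCI_STATUS_AR    (1ULL << 55)
#define MCACOD           0xffffULL /* assumed: low 16 bits */
#define MCACOD_INSTR     0x0150ULL

/* Bits examined, and the exact pattern an IFU error must show. */
#define IFU_MASK (MCI_STATUS_OVER | MCI_STATUS_UC | MCI_STATUS_EN | \
		  MCI_STATUS_MISCV | MCI_STATUS_ADDRV | MCI_STATUS_PCC | \
		  MCI_STATUS_S | MCI_STATUS_AR | MCACOD)
#define IFU_WANT (MCI_STATUS_UC | MCI_STATUS_EN | MCI_STATUS_MISCV | \
		  MCI_STATUS_ADDRV | MCI_STATUS_S | MCI_STATUS_AR | \
		  MCACOD_INSTR)

/* Hypothetical cut-down versions of struct mce and struct pt_regs. */
struct fake_mce {
	uint64_t mcgstatus;
	uint64_t status;
	uint64_t ip;
	uint16_t cs;
};

struct fake_regs {
	uint64_t ip;
	uint16_t cs;
};

/* Returns 1 if the quirk fired and EIPV was faked, 0 otherwise. */
static int apply_ifu_quirk(int bank, struct fake_mce *m,
			   const struct fake_regs *regs)
{
	if (bank != 0)
		return 0;
	if (m->mcgstatus & (MCG_STATUS_EIPV | MCG_STATUS_RIPV))
		return 0;
	if ((m->status & IFU_MASK) != IFU_WANT)
		return 0;

	/* Pretend EIPV was set and take ip/cs from the saved registers. */
	m->mcgstatus |= MCG_STATUS_EIPV;
	m->ip = regs->ip;
	m->cs = regs->cs;
	return 1;
}
```

Because OVER and PCC are part of the mask but not of the expected pattern, an overflowed or context-corrupting error does not match and is left for the normal severity checks.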