Re: [PATCH] x86/mce: Rework cmci_rediscover() to play well with CPU hotplug

2013-03-20 Thread Tony Luck
Looks lots cleaner. Applied on top of 3.9.rc3 and took it for a spin offlining and onlining cpus at random intervals. First time round I saw a few splats like the one below. But after a reboot I can no longer reproduce. -Tony INFO: task devkit-power-da:19861 blocked for more than 120 seconds.

Re: [PATCH V3] ia64/mm: fix a bad_page bug when crash kernel booting

2013-02-12 Thread Tony Luck
Building linux-next today (tag next-20130212) I get the following errors when building arch/ia64/configs/{tiger_defconfig, zx1_defconfig, bigsur_defconfig, sim_defconfig} arch/ia64/mm/init.c: In function 'free_initrd_mem': arch/ia64/mm/init.c:215: error: 'max_addr' undeclared (first use in this fu

Re: [PATCH V3] ia64/mm: fix a bad_page bug when crash kernel booting

2013-02-12 Thread Tony Luck
On Tue, Feb 12, 2013 at 4:19 PM, Andrew Morton wrote: > But, umm, why am I sitting here trying to maintain an ia64 bugfix and > handling bug reports from the ia64 maintainer? Wanna swap? That sounds like a plan. I'll look out for a new version with the missing #include and less silly global var

Re: [PATCH 4/9] ia64: cpufreq: move cpufreq driver to drivers/cpufreq

2013-04-03 Thread Tony Luck
On Mon, Apr 1, 2013 at 5:49 PM, Viresh Kumar wrote: > For now, your Ack will work :) Acked-by: Tony Luck -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.

Re: [PATCH v3 17/22] x86, ACPI, numa, ia64: split SLIT handling out

2013-04-05 Thread Tony Luck
still boot and nothing strange happens. Tested-by: Tony Luck

Re: [PATCH V3] ia64/mm: fix a bad_page bug when crash kernel booting

2013-02-19 Thread Tony Luck
Foolishly sent an earlier reply from Outlook which appears to have mangled/lost it. Trying again ... > In efi_init() memory aligns in IA64_GRANULE_SIZE(16M). If set > "crashkernel=1024M-:600M" Is this where the real problem begins? Should we insist that users provide crashkernel parameters roun

Re: [PATCH 1/2] sched: move RR_TIMESLICE from sysctl.h to rt.h

2013-02-20 Thread Tony Luck
ppy if it got applied directly to Linus tree before I get too big of a bisection gap. Acked-by: Tony Luck > include/linux/sched/rt.h | 6 ++ > include/linux/sched/sysctl.h | 6 -- > 2 files changed, 6 insertions(+), 6 deletions(-) > > diff --git a/include/linux/s

Re: [PATCH 1/2] sched: move RR_TIMESLICE from sysctl.h to rt.h

2013-02-20 Thread Tony Luck
On Wed, Feb 20, 2013 at 9:50 AM, Ingo Molnar wrote: > Hm, didn't it get fixed via the commit below? Together with moving RR_TIMESLICE to rt.h ... ia64 is good. But I see commit 77852fea6e24 in the tree I built and still see the RR_TIMESLICE errors. I don't see the MAX_PRIO half of the problem -

Re: [PATCH V3] ia64/mm: fix a bad_page bug when crash kernel booting

2013-02-21 Thread Tony Luck
On Tue, Feb 19, 2013 at 5:38 PM, Xishi Qiu wrote: > Seems like a good idea, should we modify > "\linux\Documentation\kernel-parameters.txt"? Perhaps in Documentation/kdump/kdump.txt (which the crashkernel entry in kernel-parameters.txt points at). The ia64 section of kdump.txt notes that the st

Re: [PATCH v2] x86/mce: Honour bios-set CMCI threshold

2012-09-11 Thread Tony Luck
On Mon, Sep 10, 2012 at 10:31 PM, Naveen N. Rao wrote: > + if (mce_bios_cmci_threshold) > + printk_once(KERN_INFO > + "bios_cmci_threshold: Using bios-set threshold values > for CMCI"); Do we really need this message? The user knows whether they gave the

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Tony Luck
> It is legal to access per-cpu data as early as you like, > it just evaluates to the static copy in the per-cpu section > of the kernel image until the per-cpu areas are setup. On ia64 per-cpu variables are mapped into the top 64K of the address space. Accessing them before the resources to hand

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Tony Luck
> That's right. I thought you guys had something that would handle that > early on, but looking at how the trick works in the vmlinux.lds.S ia64 > uses that isn't the case. We try to get things set up pretty early ... but I agree this is fragile. Adding code to printk() to not provide a timestam

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Tony Luck
On Wed, Feb 13, 2008 at 7:47 PM, Roland Dreier <[EMAIL PROTECTED]> wrote: > The strange thing is that Ingo's patch to make cpu_clock() a NOP until > after sched_init() didn't fix things for me... Very strange. I threw in an output line counter into the printk code() ... if I disable the timesta

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Tony Luck
> We *ought* to be safe after cpu_init() ... which is called from setup_arch(), > which is several calls before sched_init(). Perhaps what is happening is that cpu0 comes online ... safely skips over the early printk calls. Calls cpu_init() which sets up the resources *it* needs (ar.k3 points to

Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

2012-11-01 Thread Tony Luck
On Thu, Nov 1, 2012 at 4:47 AM, Mauro Carvalho Chehab wrote: > Take a look at arch/x86/kernel/cpu/mcheck/mce-apei.c: > > void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err > *mem_err) > { > struct mce m; > > /* Only corrected MC i

Re: [next:akpm 129/309] net/core/sock.c:274:36: error: initializer element is not constant

2012-07-26 Thread Tony Luck
On Tue, Jul 24, 2012 at 10:10 PM, James Bottomley wrote: >> Here is the line in sock.i: >> >> struct static_key memalloc_socks = ((struct static_key) { .enabled = >> ((atomic_t) { (0) }) }); > > The above line contains two compound literals. It also uses a designated > initializer to initialize t
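For readers unfamiliar with the error in the subject line, here is a minimal sketch of the portable alternative: initialize the file-scope object with a plain brace-enclosed initializer instead of a compound literal. The type names below are simplified stand-ins and are assumptions for illustration, not the kernel's real definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's atomic_t and struct static_key;
 * the real definitions differ. */
typedef struct { int counter; } fake_atomic_t;
struct fake_static_key { fake_atomic_t enabled; };

/*
 * A compound literal such as
 *     ((struct fake_static_key) { .enabled = ((fake_atomic_t) { (0) }) })
 * is not a constant expression in ISO C, so some compilers reject it as
 * the initializer of an object with static storage duration. A plain
 * brace-enclosed initializer with a designator avoids the problem.
 */
static struct fake_static_key fake_memalloc_socks = { .enabled = { 0 } };

/* Read back the current value of the key. */
static int fake_key_enabled(const struct fake_static_key *key)
{
    assert(key != NULL);
    return key->enabled.counter;
}
```

GCC accepts the compound-literal form as an extension in some modes, which may be why the error shows up only with certain compilers or flags.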

Re: [RFC PATCH v5 05/19] memory-hotplug: check whether memory is present or not

2012-07-27 Thread Tony Luck
On Fri, Jul 27, 2012 at 3:28 AM, Wen Congyang wrote: > +static inline int pfns_present(unsigned long pfn, unsigned long nr_pages) > +{ > + int i; > + for (i = 0; i < nr_pages; i++) { > + if (pfn_present(pfn + 1)) Typo? I think you meant "pfn + i" > +
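The effect of the suspected typo can be sketched in user space. This is only an illustration under stated assumptions: fake_pfn_present() and the presence map are hypothetical stand-ins for the kernel's pfn_present(), and the return convention (true only when every frame in the range is present) is a guess, since the quoted patch is cut off.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's pfn_present(): a tiny presence map. */
#define NR_FAKE_PFNS 16UL
static bool fake_present[NR_FAKE_PFNS];

/* Mark one page frame as present or absent in the fake map. */
static void fake_set_present(unsigned long pfn, bool present)
{
    assert(pfn < NR_FAKE_PFNS);
    fake_present[pfn] = present;
}

static bool fake_pfn_present(unsigned long pfn)
{
    return pfn < NR_FAKE_PFNS && fake_present[pfn];
}

/* As quoted: every iteration tests the same frame, pfn + 1. */
static bool range_present_buggy(unsigned long pfn, unsigned long nr_pages)
{
    for (unsigned long i = 0; i < nr_pages; i++)
        if (!fake_pfn_present(pfn + 1))
            return false;
    return true;
}

/* With the suggested fix: each iteration tests frame pfn + i. */
static bool range_present_fixed(unsigned long pfn, unsigned long nr_pages)
{
    for (unsigned long i = 0; i < nr_pages; i++)
        if (!fake_pfn_present(pfn + i))
            return false;
    return true;
}
```

With a hole in the middle of the range, the buggy loop never looks at the missing frame, while the fixed loop does. The sketch also uses an unsigned long loop counter so that it matches the type of nr_pages.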

Re: [PATCH v3 3/3] mce, acpi/apei: Soft-offline a page on firmware GHES notification

2013-07-08 Thread Tony Luck
On Wed, Jul 3, 2013 at 8:40 AM, Naveen N. Rao wrote: >>> +#ifdef CONFIG_ACPI_APEI_MEMORY_FAILURE >>> + int sec_sev = ghes_severity(gdata->error_severity); >>> + struct cper_sec_mem_err *mem_err; >>> + mem_err = (struct cper_sec_mem_err *)(gdata+1); >> >> >> A newline here please

Re: [GIT PULL] EFI changes for v3.11

2013-07-08 Thread Tony Luck
On Mon, Jul 8, 2013 at 11:36 AM, H. Peter Anvin wrote: >> I had hoped to have this patch follow in the same path that the >> one that changed the types and introduced the warnings took ... >> but since that didn't work perhaps I should just ask Linus to pull >> it from my ia64 tree. >> > > I can p

Re: ia64: dmi.h: Make dmi_alloc use kzalloc

2013-07-09 Thread Tony Luck
On Tue, Jul 9, 2013 at 10:13 AM, Joe Perches wrote: > x86/ia64 have a slight mismatch in dmi_alloc as > x86 does a memset(0), and ia64 just does kmalloc. > > Make the ia64 dmi_alloc match the x86 style. > > Signed-off-by: Joe Perches Applied. Thanks Joe. -Tony

Re: [PATCH v3 3/3] mce, acpi/apei: Soft-offline a page on firmware GHES notification

2013-07-10 Thread Tony Luck
>> Signed-off-by: Naveen N. Rao > > Acked-by: Borislav Petkov Applied-by: Tony Luck :-) Naveen: Thanks for having this idea, implementing it, and sticking with it through the review process. Once 3.11-rc1 is out I'll ask Ingo to pull this series to the tip tree ... and then

Re: [PATCH] lockref: remove cpu_relax() again

2013-09-06 Thread Tony Luck
No new Itanium numbers yet ... but I did wonder how this works on multi-socket x86 ... so I tweaked "t.c" to increase threads to 64 to max out my 4-socket Xeon E5-4650 (8 cores/socket 2 threads/core) and also print out the individual scores from each thread. $ ./t /tmp 64 389827 717666 1540293 130

Re: [GIT PULL] Device tree updates for v3.12

2013-09-11 Thread Tony Luck
table: commit d114a33387472555188f142ed8e98acdb8181c6d Author: Tony Luck Date: Fri Jul 20 13:15:20 2012 -0700 dmi: Feed DMI table to /dev/random driver I asked whether there was any size issue - as it tends to be a few kilobytes on laptops and desktops, and tens of kilobytes on servers. The answer I got bac

Re: [PATCH 2/3] x86/mce: Pack boolean MCE flags into a structure

2012-09-05 Thread Tony Luck
On Wed, Sep 5, 2012 at 3:22 AM, Naveen N. Rao wrote: > Many MCE flags are boolean in nature, but are declared as integers > currently. We can pack these into a bitfield to save some space. Before this patch: size arch/x86/kernel/cpu/mcheck/mce.o text data bss dec hex filename

Re: [PATCH] [mcelog] Start using the new sysfs tunables location

2012-09-05 Thread Tony Luck
On Wed, Sep 5, 2012 at 11:47 AM, Andi Kleen wrote: > On Wed, Sep 05, 2012 at 04:02:37PM +0530, Naveen N. Rao wrote: >> All the current mce tunables are now available under >> /sys/devices/system/machinecheck. Start using this new location, but fall >> back >> to the older per-cpu location so that

Re: [PATCH RESEND]mm/ia64: fix a node distance bug

2012-09-10 Thread Tony Luck
On Fri, Sep 7, 2012 at 3:58 PM, David Rientjes wrote: > On Thu, 6 Sep 2012, wujianguo wrote: >> Signed-off-by: Jianguo Wu >> Signed-off-by: Jiang Liu > > Acked-by: David Rientjes Applied (should show up in linux-next in the next day or two). -Tony

Re: [PATCH 1/6] x86, RAS: Add a barebones RAS subtree

2012-10-08 Thread Tony Luck
On Mon, Oct 8, 2012 at 10:11 AM, Borislav Petkov wrote: > +config X86_RAS > + def_bool y > + prompt "X86 RAS features" > + ---help--- > + A collection of Reliability, Availability and Serviceability software > + features which enable hardware error logging and reporti

Re: [PATCH 0/6] AMD MCE injection improvs

2012-10-08 Thread Tony Luck
On Mon, Oct 8, 2012 at 10:11 AM, Borislav Petkov wrote: > create mode 100644 arch/x86/ras/ras.c Overall it looks good - but I'm a bit puzzled by this ras.c file that gets created as an empty file in part1, and is still empty at the end of the series. What is going to go into it? -Tony

Re: Linux 3.7-rc8

2012-12-03 Thread Tony Luck
On Mon, Dec 3, 2012 at 2:29 PM, Tony Luck wrote: > > > > On Mon, Dec 3, 2012 at 2:20 PM, Romain Francoise > wrote: >> >> Hi Linus, >> >> Linus Torvalds writes: >> >> > Linus Torvalds (5): >> > fs/buffer.c: make block-size be p

Re: Linux 3.7-rc8

2012-12-03 Thread Tony Luck
> Just for info, can you add a "WARN_ON_ONCE()" to handle_bad_sector() > just so that I see which particular path your kvm load triggers. On native ia64 (with SLES11 userspace) I see: WARNING: at block/blk-core.c:1557 generic_make_request_checks+0x680/0xa40() Hardware name: I8QBH Modules linked i

Re: Linux 3.7-rc8

2012-12-04 Thread Tony Luck
> Linus Torvalds writes: > >> Does that fix the printk's for you too? > > Yep, works for me, thanks! Belated "works for me too" (just in case you were worrying that ia64 was still broken). -Tony

Re: new execve/kernel_thread design

2012-10-19 Thread Tony Luck
On Fri, Oct 19, 2012 at 10:30 AM, Al Viro wrote: > IIRC, the lack of comments on function with unusual calling conventions was > the last remaining issue... Stylistically other asm functions have huge block header comments detailing register usage. But typically those are way more complex. I thin

Re: [GIT PULL] Fix a cmci discovery problem

2012-11-07 Thread Tony Luck
Ingo, Is there a problem with this pull request ... or did it just get lost in the LKML noise? -Tony On Tue, Oct 30, 2012 at 3:01 PM, Luck, Tony wrote: > The following changes since commit 8f0d8163b50e01f398b14bcd4dc039ac5ab18d64: > > Linux 3.7-rc3 (2012-10-28 12:24:48 -0700) > > are availabl

Re: [PATCH v8 3/3] aerdrv: Cleanup log output for AER

2013-01-02 Thread Tony Luck
on the "To:" list want to claim this for their tree to commit? The series touches pci, acpi, RAS, and tracing ... so there are several possible owners. If someone else wants it, then add an: Acked-by: Tony Luck to all three parts. If there isn't a strong claim, I'll add v9[*]

Re: [GIT PULL] pstore/ram for 3.11

2013-06-14 Thread Tony Luck
On Wed, Jun 12, 2013 at 8:44 PM, Rob Herring wrote: > Not sure who takes this, but please pull these 2 changes for pstore for > 3.11. These are necessary to get pstore to work with on-chip RAM on > Calxeda highbank platform. Were these posted for discussion and review? Is there anyone who should

Re: [PATCH 0/5] ACPI / scan: Make it possible to use the container hotplug with other scan handlers

2013-06-14 Thread Tony Luck
On Fri, Jun 14, 2013 at 3:23 PM, Rafael J. Wysocki wrote: > Can you please just test patch [5/5] alone without patches [1-4/5]? We > believe > that this should work too and if that's the case, we'll only need that patch > and a reworked [1/5]. Your belief is sound - I popped all five patches an

Re: [GIT PULL] pstore/ram for 3.11

2013-06-14 Thread Tony Luck
On Fri, Jun 14, 2013 at 3:47 PM, Anton Vorontsov wrote: > > Acked-by: Anton Vorontsov > > (Or I can pick this via linux-pstore.git tree, I'll let Tony decide.) Added that Acked-by: and applied to my tree. Thanks -Tony

Re: [PATCH] Re: [Patch] MCE, APEI: Don't enable CMCI when Firmware First mode is set in

2013-06-18 Thread Tony Luck
On Mon, Jun 17, 2013 at 11:43 PM, Naveen N. Rao wrote: > + if (bank >= mca_cfg.banks) { > + pr_info("mce_disable_bank: Invalid MCA bank %d ignored.\n", > bank); Let's have a FW_BUG in that message to point a finger at the source of the problem. + apei_hest_parse(hest_

Re: [PATCH 0/5] ACPI / scan: Make it possible to use the container hotplug with other scan handlers

2013-06-19 Thread Tony Luck
> Can you please apply the appended patch on top of it and see if the system > still works then? Still works with this patch. -Tony > --- > drivers/acpi/scan.c |3 +++ > drivers/acpi/video.c |3 --- > 2 files changed, 3 insertions(+), 3 deletions(-) > > Index: linux-pm/drivers/acpi/sca

Re: [PATCH 0/5] ACPI / scan: Make it possible to use the container hotplug with other scan handlers

2013-06-19 Thread Tony Luck
> If you don't mind, I'll queue up https://patchwork.kernel.org/patch/2712741/ > and > this for 3.11. Mark them Tested-by: Tony Luck if you like. -Tony

Re: [PATCH] [IA64] sim: Add casts to avoid assignment warnings

2013-06-20 Thread Tony Luck
> arch/ia64/hp/sim/boot/fw-emu.c:293: warning: assignment makes pointer from > integer without a cast > > Add (void *) casts to the 10 affected lines to make the build quiet again. > > Signed-off-by: Tony Luck > > --- > > Boris, Mat

Re: [PATCH v4] aerdrv: Move cper_print_aer() call out of interrupt context

2013-05-30 Thread Tony Luck
Ok - grabbed this version. Will see if I can tempt Linus with a "please pull" tomorrow (when the commit is suitably aged). By the way ... this meta-commit description: > v2 - Re-worded header text. Removed prefix arg from cper_print_aer(). > Added TODO message in cper_print_aer(). > v3 - Ch

Re: [PATCH v2 1/2] mce: acpi/apei: Honour Firmware First for MCA banks listed in APEI HEST CMC

2013-06-21 Thread Tony Luck
On Fri, Jun 21, 2013 at 1:36 AM, Borislav Petkov wrote: > So ok, I'm persuaded, yet another bitfield it is ... :-\ Let's add some more comments on what each of these bitfields mean. Otherwise we will be going back over this next time we have a patch that touches one of them and we've all forgotte

Re: [PATCH v2] pstore: Fail to unlink if a driver has not defined pstore_erase

2013-06-25 Thread Tony Luck
On Tue, Jun 25, 2013 at 9:41 AM, Kees Cook wrote: > On Tue, Jun 25, 2013 at 2:03 AM, Aruna Balakrishnaiah > wrote: >> pstore_erase is used to erase the record from the persistent store. >> So if a driver has not defined pstore_erase callback return How do people manage devices like this? With n

Re: [GIT PULL] Power management and ACPI fixes for v3.10-rc5

2013-06-07 Thread Tony Luck
On Fri, Jun 7, 2013 at 5:51 AM, Rafael J. Wysocki wrote: > Aaron Lu (1): > ACPI / scan: do not match drivers against objects having scan handlers This patch showed up in linux-next tag next-20130605 and appears to be the cause of a boot failure on my ia64 HP rx2600 system. It panics with t

Re: [GIT PULL] Power management and ACPI fixes for v3.10-rc5

2013-06-07 Thread Tony Luck
On Fri, Jun 7, 2013 at 3:23 PM, Tony Luck wrote: > So please don't pull. Bother. I see I was a few hours late finding this, and commit 9f29ab11ddb is already in Linus' tree. That's what happens when I get busy and skip a couple of days testing linux-next :-( So my problem co

Re: [Suggestion] arch/*/include/asm/bitops.h: about __set_bit() API.

2013-06-10 Thread Tony Luck
;, but misaligned for a long ... i.e. addr%8 == 4, then I'll take an unaligned reference trap if I work with long* where the current code working with int* does not. Now perhaps all the callers do guarantee long* alignment? But I don't know. Apart from uniformity, there doesn't see t
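The alignment concern can be sketched with a small check. This assumes an LP64 target (4-byte int, 8-byte long), as on ia64 Linux; the helper name is hypothetical and the fixed sizes stand in for sizeof(int) and sizeof(long) on such a target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if addr is a multiple of size, i.e. naturally aligned for an
 * object of that size. */
static bool is_aligned(uintptr_t addr, size_t size)
{
    assert(size != 0);
    return addr % size == 0;
}
```

An address with addr % 8 == 4, such as 0x1004, passes the 4-byte check but fails the 8-byte one. That is the case where an int * access is fine but a long * access would take the unaligned reference trap described above.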

Re: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors

2013-06-28 Thread Tony Luck
> + if (sec_sev == GHES_SEV_CORRECTED && > + (gdata->flags & > CPER_SEC_ERROR_THRESHOLD_EXCEEDED) && > + (mem_err->validation_bits & > CPER_MEM_VALID_PHYSICAL_ADDRESS)) { > + unsigned long pfn;

Re: [PATCH 2/3] acpi: Eliminate console msg if pstore.backend excludes ERST

2013-06-28 Thread Tony Luck
On Fri, Jun 28, 2013 at 1:14 PM, Lenny Szubowicz wrote: > - if (pstore_register(&erst_info)) { > - pr_info(ERST_PFX "Could not register with persistent > store\n"); > + rc = pstore_register(&erst_info); > + if (rc) { > +

Re: [PATCH 0/3] acpi: Eliminate misleading erst pstore console message

2013-06-28 Thread Tony Luck
On Fri, Jun 28, 2013 at 1:14 PM, Lenny Szubowicz wrote: > On systems that have a valid ACPI ERST table, if the pstore.backend kernel > parameter selects a specific facility other than erst, then during boot the > following console message is displayed: > > ERST: Could not register with persist

Re: [Bug] Reproducible data corruption on i5-3340M: Please continue your great work! :-)

2013-08-16 Thread Tony Luck
On Thu, Aug 15, 2013 at 5:33 PM, Linus Torvalds wrote: > I'll probably delay committing it until tomorrow, in the hope that > somebody using one of the other architectures will at least ack that > it compiles. I'm re-attaching the patch (with the two "logn" -> "long" > fixes) just to encourage tha

Re: [RFC PATCH v2 00/11] Add (de)compression support to pstore

2013-08-19 Thread Tony Luck
On Sat, Aug 17, 2013 at 11:32 AM, Kees Cook wrote: > Yeah, this is great. While I haven't tested it myself yet, the code > seems to be in good shape. I acked the ram piece separately, but > consider the entire series: > > Reviewed-by: Kees Cook Applied. This should show up in linux-next tomorro

Re: [PATCH 1/4] efi: provide a generic efi_config_init()

2013-07-30 Thread Tony Luck
On Tue, Jul 30, 2013 at 9:47 AM, Leif Lindholm wrote: > + /* > +* Let's see what config tables the firmware passed to us. > +*/ > + config_tables = early_mememap(efi.systab->tables, > + efi.systab->nr_tables * sz); Breaks bisection

Re: [PATCH 1/4] efi: provide a generic efi_config_init()

2013-07-30 Thread Tony Luck
On Tue, Jul 30, 2013 at 11:02 AM, Leif Lindholm wrote: > So I guess the clean way to deal with that would be to make the > memremap definition a separate patch? Or just pull: +#define early_memremap(phys_addr, size) early_ioremap(phys_addr, size) out of part 3 and put it into part1 (along

Re: [PATCH 0/4] Make commonly useful UEFI functions common

2013-07-30 Thread Tony Luck
On Tue, Jul 30, 2013 at 9:47 AM, Leif Lindholm wrote: > IA64 code compile tested only. Compiled on a bunch of ia64 configurations, Boot tested. But not on machine that does the PROCESSOR_ABSTRACTION_LAYER_OVERWRITE_GUID thingy. Code to do the arch specific thing looks ok though. -Tony

Re: [PATCH 00/11] Add compression support to pstore

2013-08-02 Thread Tony Luck
On Thu, Aug 1, 2013 at 4:42 PM, Luck, Tony wrote: > when I rebuilt a plain 3.11-rc3 it didn't log anything via pstore either :-( Well this turned out to be operator error on my part. 3.11-rc3 does in fact log errors to pstore and allows them to be retrieved and cleared. So then I start testing w

Re: [PATCH 00/11] Add compression support to pstore

2013-08-02 Thread Tony Luck
A quick experiment to use your patchset - but with compression disabled by tweaking this line in pstore_dump(): zipped_len = -1; //zip_data(dst, hsize + len); turned out well. This kernel dumps uncompressed dmesg blobs into pstore and gets them back out again. So it seems likely that the pro

Re: [PATCH v2 1/5] ia64: add early_memremap() alias for early_ioremap()

2013-08-05 Thread Tony Luck
m *addr, unsigned long size); >> static inline void __iomem * ioremap_cache (unsigned long phys_addr, >> unsigned long size) > > Tony, can I get your Acked-by for this? Acked-by: Tony Luck [Cut & paste this ack to other parts of the series that touch ia64] -Tony

Re: [PATCH 00/11] Add compression support to pstore

2013-08-05 Thread Tony Luck
One more experiment - removed previous hack that disabled compression. Added a new hack to skip decompression. System died cleanly when I forced a panic. On reboot I found 3 files in pstore: -r--r--r-- 1 root root 3972 Aug 5 09:24 dmesg-erst-5908671953186586625 -r--r--r-- 1 root root 2565 Aug

Re: [PATCH 00/11] Add compression support to pstore

2013-08-05 Thread Tony Luck
See attachment for what I actually applied - I think I got what you suggested (I added a declaration for "total_len"). Forcing a panic worked - some things were logged to pstore. But on reboot with your patches applied I'm still seeing a GP fault when pstore is mounted and we find compressed record

Re: Proposed stable release changes

2013-08-21 Thread Tony Luck
On Wed, Aug 21, 2013 at 1:00 PM, Borislav Petkov wrote: > We don't want to run daily snapshots of your tree though, right? Only > -rcs because the daily states are kinda arbitrary and they can be broken > in various ways. Or are we at a point in time where we can amend that > rule? If *nobody* ru

Re: [PATCH 00/11] Add compression support to pstore

2013-08-05 Thread Tony Luck
This patch seems to fix the garbage at the end problem. Booting an old kernel and using openssl decodes them OK. Still have problems booting if there are any compressed images in ERST to be inflated. -Tony

Re: [PATCH 00/11] Add compression support to pstore

2013-08-06 Thread Tony Luck
On Mon, Aug 5, 2013 at 2:20 PM, Tony Luck wrote: > Still have problems booting if there are any compressed images in ERST > to be inflated. So I took another look at this part of the code ... and saw a couple of issues: while ((size = psi->read(&id, &type, &a

Re: [PATCH 00/11] Add compression support to pstore

2013-08-06 Thread Tony Luck
On Tue, Aug 6, 2013 at 6:58 PM, Aruna Balakrishnaiah wrote: > The patch looks right. I will clean it up. Does the issue still persist > after this? Things seem to be working - but testing has hardly been extensive (just a couple of forced panics). I do have one other question. In this code: >>

Re: [PATCH 00/11] Add compression support to pstore

2013-08-06 Thread Tony Luck
On Tue, Aug 6, 2013 at 10:13 PM, Aruna Balakrishnaiah wrote: > How is it with erst and efivars? ERST is at the whim of the BIOS writer (the ACPI standard doesn't provide any suggestions on record sizes). My systems support ~6K record size. efivars has, IIRC, a 1k limit coded in the Linux back e

Re: [PATCH 00/11] Add compression support to pstore

2013-08-07 Thread Tony Luck
Oh - one more thing - and my apologies for not spotting this before: dst = allocate_buf_for_compression(big_buf_sz); No - you may not call kmalloc() in oops/panic context. Please pre-allocate everything you need in some initialization code to make sure that we don't fail in the p
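The pre-allocation pattern asked for here can be sketched in user space. This is not the pstore code: dump_buf_init() and dump_record() are hypothetical names, and malloc() stands in for kmalloc(). The point is only that the allocation happens once at initialization, so the failure path copies into an existing buffer and never allocates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Buffer set up once, in normal context, before anything goes wrong. */
static char *dump_buf;
static size_t dump_buf_sz;

/* Initialization path: allocation is allowed here and may fail cleanly. */
static int dump_buf_init(size_t size)
{
    dump_buf = malloc(size);
    if (!dump_buf)
        return -1;
    dump_buf_sz = size;
    return 0;
}

/*
 * Simulated oops/panic path: no allocation, only a bounded copy into the
 * pre-allocated buffer. Returns the number of bytes stored, or 0 if the
 * buffer was never set up.
 */
static size_t dump_record(const char *msg, size_t len)
{
    assert(msg != NULL);
    if (!dump_buf)
        return 0;
    if (len > dump_buf_sz)
        len = dump_buf_sz;
    memcpy(dump_buf, msg, len);
    return len;
}
```

If initialization fails or never ran, the dump path degrades to storing nothing instead of trying to allocate at the worst possible moment.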

Re: [PATCH 00/11] Add compression support to pstore

2013-08-07 Thread Tony Luck
On Tue, Aug 6, 2013 at 10:35 PM, Tony Luck wrote: > ERST is at the whim of the BIOS writer (the ACPI standard doesn't provide any > suggestions on record sizes). My systems support ~6K record size. Off by a little - 7896 bytes on my current machine. > efivars has, IIRC, a 1k limi

Re: [PATCH 00/11] Add compression support to pstore

2013-08-07 Thread Tony Luck
On Wed, Aug 7, 2013 at 9:29 PM, Aruna Balakrishnaiah wrote: > When we preallocate, we can use the same big_buf for compression as well as > decompression. > Also workspace will be one for both. By allocating max of inflate workspace > size and deflate > workspace size. We can save memory here. We

Re: [PATCH] panic: call panic handlers before kmsg_dump

2013-07-18 Thread Tony Luck
a panic will show up in pstore's view when kmsg_dump > runs, and is therefore not visible to crash reporting tools that examine > pstore output. Good point. Acked-by: Tony Luck

Re: [PATCH] x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors.

2013-07-23 Thread Tony Luck
Gah ... there is another bug in that unaffected thread entry. The check for MCG_STATUS should be for RIPV=1 *and* EIPV=0 gmail will mess this patch up ... but should still be readable. -Tony --- diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity in
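The corrected condition (RIPV=1 and EIPV=0) can be sketched as a mask-and-compare. This is a standalone illustration, not the mce-severity.c table entry: the helper name is hypothetical, and the bit positions follow the IA32_MCG_STATUS layout in the Intel SDM (RIPV is bit 0, EIPV is bit 1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IA32_MCG_STATUS flag bits, per the Intel SDM. */
#define MCG_STATUS_RIPV (1ULL << 0) /* restart IP valid */
#define MCG_STATUS_EIPV (1ULL << 1) /* error IP valid */
#define MCG_STATUS_MCIP (1ULL << 2) /* machine check in progress */

/*
 * Hypothetical helper: true when the status matches the "unaffected
 * thread" signature, RIPV set and EIPV clear. Masking both bits and
 * comparing against RIPV checks both conditions in one step.
 */
static bool mcg_unaffected_thread(uint64_t mcgstatus)
{
    const uint64_t mask = MCG_STATUS_RIPV | MCG_STATUS_EIPV;

    return (mcgstatus & mask) == MCG_STATUS_RIPV;
}
```

Testing only one of the two bits would also match states such as RIPV=1 with EIPV=1, which is the kind of bug the message describes.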

Re: lockdep warning in edac_create_sysfs_mci_device

2013-07-24 Thread Tony Luck
On Sun, Jul 21, 2013 at 12:02 PM, Borislav Petkov wrote: > A fix is on the way: > > http://marc.info/?l=linux-edac&m=137422971614927&w=2 Fix was pulled into Linus' tree yesterday evening (Portland, OR timezone): commit 88d84ac97378c2f1d5fec9af1e8b7d9a662d6b00 Author: Borislav Petkov Date: Fri

Re: scsi scan: INQUIRY result too short (5), using 36

2013-07-24 Thread Tony Luck
Oops ... forgot final step. That commit does revert cleanly (at least git did not grumble when I asked it to revert). The resulting kernel builds cleanly and boots without seeing this problem. -Tony

Re: scsi scan: INQUIRY result too short (5), using 36

2013-07-24 Thread Tony Luck
g the > wrong way. Does this fix it? Yes. That fixes it. Wrap whichever of: Reported-by: Tony Luck and/or Tested-by: Tony Luck around that patch and ship it! Thanks for the fast fix. -Tony

[PATCH 0/2] machine check decode fixes

2013-07-24 Thread Tony Luck
V2 of this: * Broken into two patches - by suggestion of Chen Gong * Just change MCACOD #define value - by suggestion of Naveen Tony Luck (2): x86/mce: Fix mce regression from recent cleanup x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors

[PATCH 2/2] x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors

2013-07-24 Thread Tony Luck
architecturally defined recoverable error signatures (see SDM 15.9.3.1 and 15.9.3.2) Signed-off-by: Tony Luck --- arch/x86/include/asm/mce.h | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index 29e3093..aa97342

[PATCH 1/2] x86/mce: Fix mce regression from recent cleanup

2013-07-24 Thread Tony Luck
that all recoverable events would be reported on the unaffected processor with MCG_STATUS.EIPV=0 and MCG_STATUS.RIPV=1. Unfortunately the simplified rule has a couple of bugs. Fix them here. Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mcheck/mce-severity.c | 4 ++-- 1 file changed, 2

Re: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-11 Thread Tony Luck
> What I understand from above in intel 64 Arch software Developer's manual are: > 1) this manual is written for software developer; > 2) It says that MCE handler only requires to synchronize among the logical > cores in the same package/core(what I assume here is same CPU socket). > > I have two

Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread Tony Luck
On Sat, May 11, 2013 at 12:52 AM, Dmitry Monakhov wrote: > What was page_size and fsblock size? CONFIG_IA64_PAGE_SIZE_64KB=y fsblock size is whatever is the default for SLES11SP2 on ia64 - which tool will tell me? My git bisect finally completed and points a finger at: bisect> git bisect g

Re: Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread Tony Luck
On Sun, May 12, 2013 at 7:21 PM, EUNBONG SONG wrote: > Hi, my git bisect result is the same as yours. And I reported that to the community > yesterday. Ah. Good to have some confirmation (I was never sure how long to keep running before deciding that a test was "good". My slowest "bad" test took about 2.5

Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread Tony Luck
The 3.10-rc1 with ae4647fb765467 reverted is still running OK. At 3 hours now (only marginally longer than the 2.5 hours that one of the "bad" runs during the bisect managed). So I'm about 30% sure that we have a winner at the moment. I'll leave it running and check again in the morning. This peng

Re: [PATCH] NOHZ, check to see if tick device is initialized in IRQ handling path

2013-05-02 Thread Tony Luck
> void tick_nohz_irq_exit(void) > { > struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched); > + struct clock_event_device *dev = > +__get_cpu_var(tick_cpu_device).evtdev; > + > + /* Has the tick been initialized yet? */ > + if (unlik

VFAT complains that my file system may be corrupted

2013-05-06 Thread Tony Luck
Built Linus' tree this morning (HEAD = d7ab7302f970a254997687a1cdede421a5635c68) and got this message: FAT-fs (sda1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck. when booting my ia64 machine. The message may well be legitimate because I did crash the machine, so

Re: VFAT complains that my file system may be corrupted

2013-05-06 Thread Tony Luck
On Mon, May 6, 2013 at 11:44 AM, Oleksij Rempel wrote: > I provided patches for dosfstools for some time now, you need at least > v3.0.14. If your system does not provide it you will need to grab it here: > http://daniel-baumann.ch/gitweb/?p=software/dosfstools.git I may have too old a toolchain to

Re: VFAT complains that my file system may be corrupted

2013-05-06 Thread Tony Luck
On Mon, May 6, 2013 at 11:51 AM, Tony Luck wrote: > I guess I can fake them easily (ia64 runs little endian on Linux). Duh. Especially as the only use is line 560-562 in src/boot.c: de.starthi = CT_LE_W(0); de.start = CT_LE_W(0); de.size = CT_LE_L(0); Gotta make sure to us

Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-10 Thread Tony Luck
I think I have the same (or highly similar) thing happening on ia64. Similarities: seeing assertions fail for b_transaction Differences: I only have ext3 filesystems mounted, no ext4 See attached trace. I'm pretty certain that the highly unhelpful bugcheck! 0 [1] comes from the J_

Re: [tip:smp/hotplug] idle: Implement generic idle function

2013-04-15 Thread Tony Luck
Built next-20130415 and got this on ia64 early in boot: WARNING: at kernel/cpu/idle.c:94 cpu_idle_loop+0x360/0x380() Hardware name: server rx2620 Modules linked in: Call Trace: [] show_stack+0x80/0xa0 sp=a00101287c50 bsp=a00101280e48 [] dump_stack+0x30/0x

Re: [tip:smp/hotplug] idle: Implement generic idle function

2013-04-16 Thread Tony Luck
On Tue, Apr 16, 2013 at 6:28 AM, Thomas Gleixner wrote: > Hmm, is safe_halt() returning with interrupts disabled? If yes, it > lacks a local_irq_enable(). Quite probably. Adding arch_local_irq_enable() to arch_safe_halt() makes all the problems go away. I'll send you the one-line patch from a sy

Re: [PATCH 4/9] ia64: cpufreq: move cpufreq driver to drivers/cpufreq

2013-04-01 Thread Tony Luck
[Repost in plain text so the lists don't bounce it - curse you Gmail for switching to HTML] > Any comments on this patch? This part looks OK ... But is there a big finish later in the patch series where you unify some/all of the cpufreq code across architectures? By itself just moving bits from

[PATCH 5/6] x86/mce: Provide an option to keep cmci_reenable() quiet

2012-08-07 Thread Tony Luck
all banks have an owner. We want to use cmci_reenable() when a CMCI storm subsides. In this case the topology has not changed, so we do not need any commentary as it goes about its business. Add a "quiet" argument to cmci_reenable() that it passes to cmci_discover(). Signed-off-by:

[PATCH 6/6] x86/mce: Add CMCI poll mode

2012-08-07 Thread Tony Luck
the storm subsides (none of the affected processors see any more errors for a complete poll interval) we re-enable CMCI. Signed-off-by: Chen Gong Signed-off-by: Thomas Gleixner Tested-by: Chen Gong Signed-off-by: Tony Luck --- Changes (w.r.t. old patch 5/5): + New commit message + Print messages

[PATCH 5/6] x86/mce: Make cmci_discover() quiet

2012-08-09 Thread Tony Luck
: Tony Luck --- Gong pointed out to me offline that my previous "patch 5/6" would not do what I said it did in the case where a processor is taken offline during a CMCI storm. We'd have a topology change, but would suppress the bank attribution messages when the storm ended. I too

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-19 Thread Tony Luck
> Perhaps what is happening is that cpu0 comes online ... safely skips > over the early printk calls. Calls cpu_init() which sets up the resources > *it* needs (ar.k3 points to per-cpu space), and then executes > sched_init() which marks it safe for all printk's. Then cpu1 comes > up and does a pr

Re: [PATCH 2/2] x86/mce: Add quirk for instruction recovery on Sandy Bridge processors

2012-07-20 Thread Tony Luck
> Maybe define a default empty quirk_no_way_out() on the remaining > families/vendors so that the compiler can optimize it away and we save > ourselves the if-test? Perhaps I misunderstood your suggestion. I don't see how the compiler will manage to optimize it all away. I just tried defining st
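The two alternatives under discussion can be sketched in user-space C as follows. This is a simplified illustration, not the kernel code: only the name quirk_no_way_out is taken from the thread, while `struct mce_sketch`, the one-argument signature, the `_a`/`_b` suffixes and the bit-0 quirk are hypothetical. The point it tries to show is that when the pointer is chosen at run time (for example, by CPU model), the compiler generally cannot drop the indirect call, so a default empty quirk trades the if-test for a call that is always made.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified record; not the kernel's struct mce. */
struct mce_sketch {
    unsigned long status;
};

typedef void (*quirk_fn)(struct mce_sketch *m);

/* Option A: pointer stays NULL unless a quirk is installed; caller tests it. */
static quirk_fn quirk_no_way_out_a = NULL;

static void run_quirk_with_test(struct mce_sketch *m)
{
    if (quirk_no_way_out_a)
        quirk_no_way_out_a(m);
}

/* Option B: default empty quirk; caller always makes the indirect call. */
static void quirk_default_empty(struct mce_sketch *m)
{
    (void)m; /* intentionally does nothing */
}

static quirk_fn quirk_no_way_out_b = quirk_default_empty;

static void run_quirk_always(struct mce_sketch *m)
{
    quirk_no_way_out_b(m);
}

/* Illustrative quirk: sets an arbitrary bit so the effect is observable. */
static void quirk_set_bit0(struct mce_sketch *m)
{
    m->status |= 1UL;
}

/* Stands in for run-time, model-specific setup. */
static void install_quirk(quirk_fn fn)
{
    assert(fn != NULL);
    quirk_no_way_out_a = fn;
    quirk_no_way_out_b = fn;
}
```

Both variants behave the same from the caller's point of view; they differ only in whether the uninstalled case costs a compare-and-branch or an indirect call to an empty function.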

[PATCH] dmi: Feed DMI table to /dev/random driver

2012-07-20 Thread Tony Luck
Send the entire DMI (SMBIOS) table to the /dev/random driver to help seed its pools. Signed-off-by: Tony Luck --- This looks a useful addition to your /dev/random series. There are lots of platform specific goodies in this table (BIOS version, system serial number and UUID, count and version

Re: [PATCH] dmi: Feed DMI table to /dev/random driver

2012-07-20 Thread Tony Luck
On Fri, Jul 20, 2012 at 5:56 PM, Theodore Ts'o wrote: > The other approach was to add some new interface that random.c would > call which would grab the dmi data from rand_initialize(). But that's > going to be a lot more complicated, so I guess we should go with the > simple/stupid approach. It

[PATCH] random: Add comment to random_initialize()

2012-07-23 Thread Tony Luck
. Suggested-by: Theodore Ts'o Signed-off-by: Tony Luck --- Theodore Ts'o wrote: > I agree. Want to send a revised patch with the comment, and I'll drop > it into the random.git tree? Additional patch rather than revised (since I'm touching different subsystems: dmi and random

checkpatch should not complain about 'Suggested-by:'

2012-07-23 Thread Tony Luck
checkpatch just gave me: WARNING: Non-standard signature: Suggested-by: There are over 500 instances of 'Suggested-by:', and it seems to have some value in tracking history and awarding credit where it is due. "Reported-and-tested-by:" is also in regular use, but not in the list of "standard"

[PATCH 0/2] Fix machine check recovery for instruction fault on Sandy Bridge

2012-07-23 Thread Tony Luck
ch in turn makes us unable to determine whether the machine check was taken in kernel or user mode. The workaround is to fake the presence of the EIPV bit for this error on this processor model. Not pretty, but avoids having to make special cases later in the code. Tony Luck (2): x86/mce: M

[PATCH 1/2] x86/mce: Move MCACOD defines from mce-severity.c to

2012-07-23 Thread Tony Luck
We will need some of these values in mce.c. Move them to the appropriate header file so they are available. Acked-by: Borislav Petkov Signed-off-by: Tony Luck --- arch/x86/include/asm/mce.h | 8 arch/x86/kernel/cpu/mcheck/mce-severity.c | 7 --- 2 files changed, 8

[PATCH 2/2] x86/mce: Add quirk for instruction recovery on Sandy Bridge processors

2012-07-23 Thread Tony Luck
special case code at several places in mce.c and mce-severity.c, we pretend the EIPV bit was set for just this case early in processing the machine check. Acked-by: Borislav Petkov Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mcheck/mce.c | 43 +--- 1 file
