from:"Tony Luck"

Re: [PATCH] x86/mce: Rework cmci_rediscover() to play well with CPU hotplug

2013-03-20 Thread Tony Luck

Looks lots cleaner.  Applied on top of 3.9.rc3 and took it for a spin
offlining and onlining cpus at random intervals.  First time round I
saw a few splats like the one below.  But after a reboot I can no
longer reproduce.

-Tony

INFO: task devkit-power-da:19861 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
devkit-power-da D 8810114cd048 0 19861  1 0x0080
 88101063b828 0082 8810114cca80 88101063bfd8
 88101063bfd8 88101063bfd8 88081fa18040 8810114cca80
 88101063b808 88080c0cf800  88101063b8b0
Call Trace:
 [] schedule+0x29/0x70
 [] usb_kill_urb+0x85/0xc0
 [] ? wake_up_bit+0x40/0x40
 [] usb_start_wait_urb+0xd8/0x160
 [] usb_control_msg+0xcc/0x110
 [] ? usb_get_status+0x43/0xc0
 [] usb_get_status+0x83/0xc0
 [] ? usb_set_device_state+0x127/0x160
 [] usb_port_resume+0x2c6/0x630
 [] ? __switch_to+0x181/0x4a0
 [] generic_resume+0x15/0x30
 [] usb_resume_both+0x105/0x150
 [] usb_runtime_resume+0x1a/0x20
 [] __rpm_callback+0x31/0x90
 [] rpm_callback+0x2f/0x90
 [] rpm_resume+0x40c/0x670
 [] ? wake_up_bit+0x40/0x40
 [] __pm_runtime_resume+0x5c/0x90
 [] usb_autoresume_device+0x29/0x60
 [] usbdev_open+0x110/0x210
 [] chrdev_open+0x9c/0x180
 [] do_dentry_open+0x20f/0x2c0
 [] ? cdev_put+0x30/0x30
 [] finish_open+0x35/0x50
 [] do_last+0x6de/0xde0
 [] ? inode_permission+0x18/0x50
 [] ? link_path_walk+0x78/0x880
 [] path_openat+0xb7/0x4a0
 [] do_filp_open+0x41/0xa0
 [] ? __alloc_fd+0x42/0x110
 [] do_sys_open+0xf4/0x1e0
 [] ? do_notify_resume+0x59/0x80
 [] sys_open+0x21/0x30
 [] system_call_fastpath+0x16/0x1b
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH V3] ia64/mm: fix a bad_page bug when crash kernel booting

2013-02-12 Thread Tony Luck

Building linux-next today (tag next-20130212) I get the following errors when
building arch/ia64/configs/{tiger_defconfig, zx1_defconfig, bigsur_defconfig,
sim_defconfig}

arch/ia64/mm/init.c: In function 'free_initrd_mem':
arch/ia64/mm/init.c:215: error: 'max_addr' undeclared (first use in
this function)
arch/ia64/mm/init.c:215: error: (Each undeclared identifier is
reported only once
arch/ia64/mm/init.c:215: error: for each function it appears in.)
arch/ia64/mm/init.c:216: error: implicit declaration of function
'GRANULEROUNDDOWN'

with "git blame" saying that these lines in init.c were added/changed by

commit 5a54b4fb8f554b15c6113e30ca8412b7fe11c62e
Author: Xishi Qiu 
Date:   Thu Feb 7 12:25:59 2013 +1100

ia64/mm: fix a bad_page bug when crash kernel booting

-Tony

Re: [PATCH V3] ia64/mm: fix a bad_page bug when crash kernel booting

2013-02-12 Thread Tony Luck

On Tue, Feb 12, 2013 at 4:19 PM, Andrew Morton
 wrote:
> But, umm, why am I sitting here trying to maintain an ia64 bugfix and
> handling bug reports from the ia64 maintainer?  Wanna swap?

That sounds like a plan.  I'll look out for a new version with the
missing #include and less silly global variable names and try to take
it before you pull it into -mm.

-Tony

Re: [PATCH 4/9] ia64: cpufreq: move cpufreq driver to drivers/cpufreq

2013-04-03 Thread Tony Luck

On Mon, Apr 1, 2013 at 5:49 PM, Viresh Kumar  wrote:
> For now, your Ack will work :)

Acked-by: Tony Luck 

Re: [PATCH v3 17/22] x86, ACPI, numa, ia64: split SLIT handling out

2013-04-05 Thread Tony Luck

On Thu, Apr 4, 2013 at 4:46 PM, Yinghai Lu  wrote:
> It should not break ia64 by replacing acpi_numa_init with
> acpi_numa_init_srat/acpi_numa_init_slit/acpi_num_arch_fixup.

You are right - it doesn't break ia64.  All my test configs still
build.  Machines both with and without NUMA still boot and
nothing strange happens.

Tested-by: Tony Luck 

Re: [PATCH V3] ia64/mm: fix a bad_page bug when crash kernel booting

2013-02-19 Thread Tony Luck

Foolishly sent an earlier reply from Outlook which appears
to have mangled/lost it. Trying again ...

> In efi_init() memory aligns in IA64_GRANULE_SIZE(16M). If set 
> "crashkernel=1024M-:600M"

Is this where the real problem begins?  Should we insist that users
provide crashkernel parameters rounded to GRANULE boundaries?

-Tony
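
To make the rounding question concrete, here is a minimal sketch of
granule arithmetic. It assumes the 16M granule size quoted above; the
helper names are local to this example and are not copied from the
kernel headers.

```c
#include <stdint.h>

/* Assumption: 16 MB granules, as quoted for IA64_GRANULE_SIZE above. */
#define GRANULE_SIZE (16ULL << 20)

/* Illustrative helpers only; names are not taken from the kernel. */
static uint64_t granule_round_down(uint64_t n)
{
	return n & ~(GRANULE_SIZE - 1);
}

static uint64_t granule_round_up(uint64_t n)
{
	return (n + GRANULE_SIZE - 1) & ~(GRANULE_SIZE - 1);
}

static int granule_aligned(uint64_t n)
{
	return (n & (GRANULE_SIZE - 1)) == 0;
}
```

With these definitions a 600M size is not granule aligned: it rounds
down to 592M and up to 608M, so a user following the proposed rule
would pick one of those two values.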

Re: [PATCH 1/2] sched: move RR_TIMESLICE from sysctl.h to rt.h

2013-02-20 Thread Tony Luck

On Wed, Feb 20, 2013 at 7:19 AM, Clark Williams
 wrote:
> Signed-off-by: Clark Williams 
> ---

This happens to unbreak the ia64 build which is currently grumbling about:

arch/ia64/kernel/init_task.c:38: error: 'RR_TIMESLICE' undeclared here
(not in a function)

So I'd be happy if it got applied directly to Linus' tree before I get
too big of a bisection gap.

Acked-by: Tony Luck 

>  include/linux/sched/rt.h | 6 ++
>  include/linux/sched/sysctl.h | 6 --
>  2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
> index 94e19ea..440434d 100644
> --- a/include/linux/sched/rt.h
> +++ b/include/linux/sched/rt.h
> @@ -55,4 +55,10 @@ static inline bool tsk_is_pi_blocked(struct task_struct 
> *tsk)
>  extern void normalize_rt_tasks(void);
>
>
> +/*
> + * default timeslice is 100 msecs (used only for SCHED_RR tasks).
> + * Timeslices get refilled after they expire.
> + */
> +#define RR_TIMESLICE   (100 * HZ / 1000)
> +
>  #endif /* _SCHED_RT_H */
> diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
> index d2bb0ae..bf8086b 100644
> --- a/include/linux/sched/sysctl.h
> +++ b/include/linux/sched/sysctl.h
> @@ -91,12 +91,6 @@ extern unsigned int sysctl_sched_cfs_bandwidth_slice;
>  extern unsigned int sysctl_sched_autogroup_enabled;
>  #endif
>
> -/*
> - * default timeslice is 100 msecs (used only for SCHED_RR tasks).
> - * Timeslices get refilled after they expire.
> - */
> -#define RR_TIMESLICE   (100 * HZ / 1000)
> -
>  extern int sched_rr_timeslice;
>
>  extern int sched_rr_handler(struct ctl_table *table, int write,
> --
> 1.8.1.2
>
> --

Re: [PATCH 1/2] sched: move RR_TIMESLICE from sysctl.h to rt.h

2013-02-20 Thread Tony Luck

On Wed, Feb 20, 2013 at 9:50 AM, Ingo Molnar  wrote:
> Hm, didn't it get fixed via the commit below?

Together with moving RR_TIMESLICE to rt.h ... ia64 is good. But I
see commit 77852fea6e24 in the tree I built and still see the
RR_TIMESLICE errors.

I don't see the MAX_PRIO half of the problem - so it did help a bit.

-Tony

Re: [PATCH V3] ia64/mm: fix a bad_page bug when crash kernel booting

2013-02-21 Thread Tony Luck

On Tue, Feb 19, 2013 at 5:38 PM, Xishi Qiu  wrote:
> Seems like a good idea, should we modify 
> "\linux\Documentation\kernel-parameters.txt"?

Perhaps in Documentation/kdump/kdump.txt (which the crashkernel entry
in kernel-parameters.txt points at).  The ia64 section of kdump.txt
notes that the start address will be rounded up to a GRANULE boundary,
but doesn't talk about restrictions on the size.

I wonder if any other architectures have alignment restrictions on the
addresses in "crashkernel" parameters? Does x86 like them to be 2MB
aligned?

Second question is whether we should check and warn in
parse_crashkernel_mem()?  I think the answer is "yes" (since the
consequences of getting this wrong don't show up till much later, and
the errors aren't all that obviously connected back to the original
mistake).  Perhaps each architecture that cares could provide defines:

#define ARCH_CRASH_KERNEL_START_ALIGN (... arch value here ...)
#define ARCH_CRASH_KERNEL_SIZE_ALIGN (... arch value here ...)

[Suggestion provided mostly to provoke somebody to provide a more
elegant solution]

-Tony
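
A rough userspace sketch of the "check and warn" idea follows. The two
ARCH_CRASH_KERNEL_* names come from the suggestion above; the 16 MB
values, the function name crashkernel_check_align() and the message
text are assumptions made for this example, not existing kernel code.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-architecture values; 16 MB mirrors the ia64 granule. */
#define ARCH_CRASH_KERNEL_START_ALIGN (16ULL << 20)
#define ARCH_CRASH_KERNEL_SIZE_ALIGN  (16ULL << 20)

/* Returns the number of alignment problems found (0, 1 or 2). */
static int crashkernel_check_align(uint64_t start, uint64_t size)
{
	int problems = 0;

	if (start % ARCH_CRASH_KERNEL_START_ALIGN) {
		fprintf(stderr, "crashkernel: start 0x%llx is not aligned\n",
			(unsigned long long)start);
		problems++;
	}
	if (size % ARCH_CRASH_KERNEL_SIZE_ALIGN) {
		fprintf(stderr, "crashkernel: size 0x%llx is not aligned\n",
			(unsigned long long)size);
		problems++;
	}
	return problems;
}
```

Warning at parse time keeps the complaint next to the mistake, which is
the point made above about errors that otherwise surface much later.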

Re: [PATCH v2] x86/mce: Honour bios-set CMCI threshold

2012-09-11 Thread Tony Luck

On Mon, Sep 10, 2012 at 10:31 PM, Naveen N. Rao
 wrote:
> +   if (mce_bios_cmci_threshold)
> +   printk_once(KERN_INFO
> +   "bios_cmci_threshold: Using bios-set threshold values 
> for CMCI");

Do we really need this message? The user knows whether they gave
the command line option or not (and can check in /proc/cmdline if
they forgot whether they did). If it is needed, then you should add
a "\n" to it.

> +   if (mce_bios_cmci_threshold && bios_wrong_thresh) {
> +   printk_once(KERN_INFO
> +   "bios_cmci_threshold: Some banks do not have valid 
> thresholds set");
> +   printk_once(KERN_INFO
> +   "bios_cmci_threshold: Make sure your BIOS supports 
> this boot option");

Also need "\n"

-Tony

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Tony Luck

> It is legal to access per-cpu data as early as you like,
> it just evaluates to the static copy in the per-cpu section
> of the kernel image until the per-cpu areas are setup.

On ia64 per-cpu variables are mapped into the top 64K
of the address space.  Accessing them before the
resources to handle the access to that virtual address
have been set up would cause problems.

-Tony

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-13 Thread Tony Luck

> That's right.  I thought you guys had something that would handle that
> early on, but looking at how the trick works in the vmlinux.lds.S ia64
> uses that isn't the case.

We try to get things set up pretty early ... but I agree this is
fragile.  Adding code to printk() to not provide a timestamp
before some safe point in boot is a workaround to the
current problem.  But it may come back to haunt us if
other per-cpu data is added that needs to be accessed
early during boot.

There are some changes going on at the moment on how
we allocate the space for the per-cpu area.  It is likely that
for a non-boot cpu we might be able to get everything that
we need for per-cpu access to work done in head.S before
we can get to any C code.  Boot cpu may be harder unless
we statically allocate space for its per-cpu area in
vmlinux.lds.S

I'll take a closer look at what is needed tomorrow.

-Tony

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Tony Luck

On Wed, Feb 13, 2008 at 7:47 PM, Roland Dreier <[EMAIL PROTECTED]> wrote:
>  The strange thing is that Ingo's patch to make cpu_clock() a NOP until
>  after sched_init() didn't fix things for me...

Very strange.  I threw an output line counter into the printk() code ... if I
disable the timestamps for the first 30 lines, then everything is good (so the
basic timestamping code does still work on ia64). But I would have thought
that Ingo's delay until sched_init() ought to be long enough too. Clearly I
need to figure out exactly what needs to be initialized to prevent the
hang/crash.

-Tony

Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Tony Luck

> We *ought* to be safe after cpu_init() ... which is called from setup_arch(),
> which is several calls before sched_init().

Perhaps what is happening is that cpu0 comes online ... safely skips
over the early printk calls.  Calls cpu_init() which sets up the resources
*it* needs (ar.k3 points to per-cpu space), and then executes
sched_init() which marks it safe for all printk's. Then cpu1 comes
up and does a printk before it gets to cpu_init().

Try with Ingo's patch and CONFIG_SMP=n to see if you can come
up on a uni-processor.

-Tony
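
The hypothesis above can be modelled in a few lines of plain C. This is
a toy model only: the flag and function names are invented for the
example, and nothing here is taken from the real printk() or
cpu_clock() code.

```c
#include <stdbool.h>

#define NR_CPUS 2

static bool sched_init_done;         /* single global "safe" flag */
static bool cpu_init_done[NR_CPUS];  /* per-cpu resources are set up */

/* Global gate: safe once the boot cpu has been through sched_init(). */
static bool timestamp_ok_global(int cpu)
{
	(void)cpu;
	return sched_init_done;
}

/* Per-cpu gate: safe only once this cpu has run its own cpu_init(). */
static bool timestamp_ok_percpu(int cpu)
{
	return cpu_init_done[cpu];
}
```

Once cpu0 has set both its own entry and the global flag, the global
gate reports "safe" for cpu1 even though cpu1 has not yet run
cpu_init(), which is the window described above. A per-cpu gate would
not open until cpu1 is ready.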

Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

2012-11-01 Thread Tony Luck

On Thu, Nov 1, 2012 at 4:47 AM, Mauro Carvalho Chehab
 wrote:
> Take a look at arch/x86/kernel/cpu/mcheck/mce-apei.c:
>
> void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err 
> *mem_err)
> {
> struct mce m;
>
> /* Only corrected MC is reported */
> if (!corrected || !(mem_err->validation_bits &
> CPER_MEM_VALID_PHYSICAL_ADDRESS))
> return;
>
> mce_setup(&m);
> m.bank = 1;
> /* Fake a memory read corrected error with unknown channel */
> m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV 
> | 0x9f;
> m.addr = mem_err->physical_addr;
> mce_log(&m);
> mce_notify_irq();
> }
>
> Bank information there is fake; status is fake. Only addr is really filled
> there; it works only for corrected errors.

This went in like this to help out the Westmere-EX processors that
didn't fill out MCi_ADDR for corrected errors. APEI could get the
address from some platform CSRs ... reporting via /dev/mcelog
so that predictive analysis in mcelog(8) would work on these machines.

I don't think we can rip it out yet ... not until those machines are
shuffled off to recycle heaven.

But perhaps we should get smarter about which machines we enable
APEI on?  If we get everything we need from the machine check banks,
then the detour via the BIOS to report the same thing again isn't helpful.

-Tony

Re: [next:akpm 129/309] net/core/sock.c:274:36: error: initializer element is not constant

2012-07-26 Thread Tony Luck

On Tue, Jul 24, 2012 at 10:10 PM, James Bottomley
 wrote:
>> Here is the line in sock.i:
>>
>> struct static_key memalloc_socks = ((struct static_key) { .enabled =
>> ((atomic_t) { (0) }) });
>
> The above line contains two compound literals.  It also uses a designated
> initializer to initialize the field enabled.  A compound literal is not a
> constant expression.

Seeing the same thing on ia64 building next-20120726.  Same fix works
for me ... so I'll steal this whole changelog and attributes.

-Tony
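
For readers who have not hit this error before, here is a small sketch
of the difference. The types are simplified stand-ins for the kernel's
atomic_t and struct static_key, and the *_SKETCH macros are invented
for this example; the fix actually applied in the thread is not shown
in this message, so this is one possible shape and not necessarily the
one used.

```c
/* Simplified stand-ins for the kernel types named in the quote. */
typedef struct { int counter; } atomic_t;
struct static_key { atomic_t enabled; };

/*
 * The form quoted above wraps the initializer in compound literals:
 *
 *   struct static_key k = ((struct static_key) { .enabled =
 *                          ((atomic_t) { (0) }) });
 *
 * A compound literal is not a constant expression, so some compilers
 * reject it as the initializer of an object with static storage.
 *
 * Plain brace-enclosed initializers avoid the problem.
 */
#define ATOMIC_INIT_SKETCH(i)     { (i) }
#define STATIC_KEY_INIT_SKETCH(i) { .enabled = ATOMIC_INIT_SKETCH(i) }

struct static_key memalloc_socks = STATIC_KEY_INIT_SKETCH(0);
struct static_key example_true_key = STATIC_KEY_INIT_SKETCH(1);
```

The designated initializer survives; only the cast-like type prefix
that turns the braces into a compound literal is dropped.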

Re: [RFC PATCH v5 05/19] memory-hotplug: check whether memory is present or not

2012-07-27 Thread Tony Luck

On Fri, Jul 27, 2012 at 3:28 AM, Wen Congyang  wrote:
> +static inline int pfns_present(unsigned long pfn, unsigned long nr_pages)
> +{
> +   int i;
> +   for (i = 0; i < nr_pages; i++) {
> +   if (pfn_present(pfn + 1))

Typo? I think you meant "pfn + i"

> +   continue;
> +   else
> +   return -EINVAL;
> +   }
> +   return 0;
> +}

-Tony
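
A self-contained sketch of the loop with the index fixed is below.
pfn_present() here is a hypothetical stand-in that treats the first
1024 page frames as present, purely so the example can be compiled and
exercised outside the kernel.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in: pretend only pfns 0..1023 are present. */
static bool pfn_present(unsigned long pfn)
{
	return pfn < 1024;
}

/* Returns 0 if every pfn in [pfn, pfn + nr_pages) is present. */
static int pfns_present(unsigned long pfn, unsigned long nr_pages)
{
	unsigned long i;

	for (i = 0; i < nr_pages; i++) {
		if (!pfn_present(pfn + i))	/* "+ i", not "+ 1" */
			return -EINVAL;
	}
	return 0;
}
```

With "pfn + 1" the range 1022..1024 would pass, because only pfn 1023
is ever tested; with "pfn + i" the missing pfn 1024 is caught.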

Re: [PATCH v3 3/3] mce, acpi/apei: Soft-offline a page on firmware GHES notification

2013-07-08 Thread Tony Luck

On Wed, Jul 3, 2013 at 8:40 AM, Naveen N. Rao
 wrote:

>>> +#ifdef CONFIG_ACPI_APEI_MEMORY_FAILURE
>>> +   int sec_sev = ghes_severity(gdata->error_severity);
>>> +   struct cper_sec_mem_err *mem_err;
>>> +   mem_err = (struct cper_sec_mem_err *)(gdata+1);
>>
>>
>> A newline here please. Also, spaces around '+'.

I was off on vacation last week - looks like you got lots done without me :-)

I have parts 1 & 2 applied to an internal tree. Looks like parts 3 & 4 need a
few final polishes to get an Ack from Boris.

-Tony

Re: [GIT PULL] EFI changes for v3.11

2013-07-08 Thread Tony Luck

On Mon, Jul 8, 2013 at 11:36 AM, H. Peter Anvin  wrote:
>> I had hoped to have this patch follow in the same path that the
>> one that changed the types and introduced the warnings took ...
>> but since that didn't work perhaps I should just ask Linus to pull
>> it from my ia64 tree.
>>
>
> I can push it, although it seems a bit odd to me to push an ia64-only
> patch through the x86 tree.
>
> Let me know what you prefer.

I've sent Linus a "please pull" for this from my ia64 tree.

-Tony

Re: ia64: dmi.h: Make dmi_alloc use kzalloc

2013-07-09 Thread Tony Luck

On Tue, Jul 9, 2013 at 10:13 AM, Joe Perches  wrote:
> x86/ia64 have a slight mismatch in dmi_alloc as
> x86 does a memset(0), and ia64 just does kmalloc.
>
> Make the ia64 dmi_alloc match the x86 style.
>
> Signed-off-by: Joe Perches 

Applied. Thanks Joe.

-Tony

Re: [PATCH v3 3/3] mce, acpi/apei: Soft-offline a page on firmware GHES notification

2013-07-10 Thread Tony Luck

>> Signed-off-by: Naveen N. Rao 
>
> Acked-by: Borislav Petkov 

Applied-by: Tony Luck :-)

Naveen: Thanks for having this idea, implementing it, and sticking
with it through the review process.

Once 3.11-rc1 is out I'll ask Ingo to pull this series to the tip tree
... and then on to 3.12

-Tony

Re: [PATCH] lockref: remove cpu_relax() again

2013-09-06 Thread Tony Luck

No new Itanium numbers yet ... but I did wonder how this works
on multi-socket x86 ... so I tweaked "t.c" to increase threads to
64 to max out my 4-socket Xeon E5-4650 (8 cores/socket 2 threads/core)
and also print out the individual scores from each thread.

$ ./t /tmp 64
389827
717666
1540293
130764
681839
33357
606966
33716
33183
33230
69685
76422
352851
34940
257132
34192
34200
34098
34053
34459
234399
33678
241571
545912
620857
65818
32853
739440
33697
683655
741366
36208
385775
446198
45974
33056
403944
717415
254782
166754
702745
43661
1042180
437367
43751
503342
154223
706917
878167
43802
51667
660875
33261
522425
33627
33637
33446
33604
52963
33688
406088
551690
446474
33289
Threads = 64 Total loops: 19109114

Individual thread performance varies from 32853 to 1540293, a factor of 46.9.
Sometimes it is good to sacrifice fairness for throughput. But wow!

Running for longer [ s/sleep(10)/sleep(300) ] gave things a chance to
even out - but I still see a factor of 3.5 between the fastest and the
slowest.

-Tony
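
The fairness figure above is just the largest per-thread count divided
by the smallest. A minimal sketch of that calculation follows; the
function name spread_ratio() is made up for this example and is not
part of the "t.c" test program.

```c
#include <stddef.h>

/*
 * Ratio between the busiest and the most starved thread.
 * Returns 0.0 when the ratio is undefined (no samples, or a zero count).
 */
static double spread_ratio(const unsigned long *loops, size_t n)
{
	unsigned long lo, hi;
	size_t i;

	if (n == 0)
		return 0.0;

	lo = hi = loops[0];
	for (i = 1; i < n; i++) {
		if (loops[i] < lo)
			lo = loops[i];
		if (loops[i] > hi)
			hi = loops[i];
	}
	if (lo == 0)
		return 0.0;

	return (double)hi / (double)lo;
}
```

Fed the extremes from the run above (32853 and 1540293) it gives a
value just under 46.9.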

Re: [GIT PULL] Device tree updates for v3.12

2013-09-11 Thread Tony Luck

On Tue, Sep 10, 2013 at 1:50 PM, Linus Torvalds
 wrote:
> Of course, maybe even the stupid add_device_randomness() is fast
> enough. I just wanted to point out that it definitely isn't some
> optimized thing.

When I posted the patch that mixes in the whole SMBIOS table:

commit d114a33387472555188f142ed8e98acdb8181c6d
Author: Tony Luck 
Date:   Fri Jul 20 13:15:20 2012 -0700

dmi: Feed DMI table to /dev/random driver

I asked whether there was any size issue - as it tends to be a few
kilobytes on laptops and desktops, and tens of kilobytes on servers.
The answer I got back then was not to worry - digesting a few kilobytes
wouldn't be a problem.  I just threw in a debug message to check and saw:

dmi_walk_early: added 10342 bytes in 339968 cycles

So a couple of hundred microseconds for me.

There are plenty of machine specific values buried in there (e.g. serial
numbers for all the DIMMs) ... so this looks like a good use of this
much boot time.

-Tony
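
Converting the cycle count above into time needs the clock rate of the
test machine, which the message does not give. The sketch below takes
the rate as a parameter; the 2000 MHz used in the note after it is only
an assumed round number.

```c
/*
 * Convert a cycle count to microseconds for a given clock rate in MHz.
 * One MHz is one cycle per microsecond, so this is a plain division.
 * Returns 0.0 for a non-positive rate.
 */
static double cycles_to_usec(unsigned long long cycles, double mhz)
{
	if (mhz <= 0.0)
		return 0.0;
	return (double)cycles / mhz;
}
```

At an assumed 2000 MHz, 339968 cycles comes to roughly 170
microseconds, in line with the "couple of hundred microseconds" quoted
above.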

Re: [PATCH 2/3] x86/mce: Pack boolean MCE flags into a structure

2012-09-05 Thread Tony Luck

On Wed, Sep 5, 2012 at 3:22 AM, Naveen N. Rao
 wrote:
> Many MCE flags are boolean in nature, but are declared as integers
> currently. We can pack these into a bitfield to save some space.

Before this patch:
size arch/x86/kernel/cpu/mcheck/mce.o
   text    data     bss     dec     hex filename
  18946    4930     776   24652    604c arch/x86/kernel/cpu/mcheck/mce.o

After:
size arch/x86/kernel/cpu/mcheck/mce.o
   text    data     bss     dec     hex filename
  19335    4890     776   25001    61a9 arch/x86/kernel/cpu/mcheck/mce.o

So we do indeed see "data" reduced by 40 bytes. But
"text" is up by 389.  This seems to be because you have
another change, not described in the commit log, buried
in part 2 to add get_dont_log_ce(), set_dont_log_ce() etc.

Compiler version: gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC)

I know I'm contradicting the feedback you got from Borislav here, but
is this code churn really worth it to save 40 bytes? I don't think so.

-Tony

Re: [PATCH] [mcelog] Start using the new sysfs tunables location

2012-09-05 Thread Tony Luck

On Wed, Sep 5, 2012 at 11:47 AM, Andi Kleen  wrote:
> On Wed, Sep 05, 2012 at 04:02:37PM +0530, Naveen N. Rao wrote:
>> All the current mce tunables are now available under
>> /sys/devices/system/machinecheck. Start using this new location, but fall 
>> back
>> to the older per-cpu location so that we continue working with older kernels.
>
> Who did that change in the kernel?
>
> That breaks Linus rule that the kernel should not break userland.
> Kernel needs to fix that.

The change is still under discussion. Stage one is to add the new global
pathnames in addition to keeping the old per-cpu ones. Also fix all utilities
(just mcelog(8) as far as we know) to prefer the new paths.

After some time[1] ... delete the old paths. This is allowable under Linus'
modified edict that you can change ABI "if nobody complains". If we wait
long enough that the new mcelog is widely deployed, then nobody should
complain.

-Tony

[1] several years - not just a kernel release or two.
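
The "prefer the new path, fall back to the old one" stage can be
sketched as a generic helper. This is not code from mcelog; the
function name open_first() is invented here, and the caller supplies
whatever path names apply, newest first.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Open the first path in the list that can be opened for reading.
 * On success returns the stream and, if which is non-NULL, stores the
 * index of the path that worked. Returns NULL if none could be opened.
 */
static FILE *open_first(const char *const *paths, size_t n, size_t *which)
{
	size_t i;

	for (i = 0; i < n; i++) {
		FILE *f = fopen(paths[i], "r");

		if (f) {
			if (which)
				*which = i;
			return f;
		}
	}
	return NULL;
}
```

A tool built this way keeps working on older kernels that only have
the per-cpu files, and silently picks up the global ones where they
exist.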

Re: [PATCH RESEND]mm/ia64: fix a node distance bug

2012-09-10 Thread Tony Luck

On Fri, Sep 7, 2012 at 3:58 PM, David Rientjes  wrote:
> On Thu, 6 Sep 2012, wujianguo wrote:
>> Signed-off-by: Jianguo Wu 
>> Signed-off-by: Jiang Liu 
>
> Acked-by: David Rientjes 

Applied (should show up in linux-next in the next day or two).

-Tony

Re: [PATCH 1/6] x86, RAS: Add a barebones RAS subtree

2012-10-08 Thread Tony Luck

On Mon, Oct 8, 2012 at 10:11 AM, Borislav Petkov  wrote:
> +config X86_RAS
> +   def_bool y
> +   prompt "X86 RAS features"
> +   ---help---
> +   A collection of Reliability, Availability and Serviceability software
> +   features which enable hardware error logging and reporting. Leave it
> +   at 'y' unless you really know what you're doing.
> +

The intent of "X86_RAS" is just to show/hide all the menu
options for the individual features - which will all use
  depends on X86_RAS
right? Having this set to "y" doesn't actually enable any of
the features - they all have their own CONFIG_* variables.

Perhaps we could make that clearer in the help text? And
ditch the "Leave it at 'y' ... ", I don't think it helps anyone.

-Tony

Re: [PATCH 0/6] AMD MCE injection improvs

2012-10-08 Thread Tony Luck

On Mon, Oct 8, 2012 at 10:11 AM, Borislav Petkov  wrote:
>  create mode 100644 arch/x86/ras/ras.c

Overall it looks good - but I'm a bit puzzled by this ras.c file that gets
created as an empty file in part1, and is still empty at the end of the
series.

What is going to go into it?

-Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Linux 3.7-rc8

2012-12-03 Thread Tony Luck

On Mon, Dec 3, 2012 at 2:29 PM, Tony Luck  wrote:
>
>
>
> On Mon, Dec 3, 2012 at 2:20 PM, Romain Francoise 
> wrote:
>>
>> Hi Linus,
>>
>> Linus Torvalds  writes:
>>
>> > Linus Torvalds (5):
>> >   fs/buffer.c: make block-size be per-page and protected by the
>> > page lock
>> >   blockdev: remove bd_block_size_semaphore again
>> >   direct-io: don't read inode->i_blkbits multiple times
>> >   blkdev_max_block: make private to fs/buffer.c
>>
>> Could these changes be the reason for the following suddenly appearing in
>> one of my VMs with rc8 (no such messages with rc7)? Pretty standard
>> virtio
>> setup in KVM.
>>
>> [   11.832295] attempt to access beyond end of device
>> [   11.832298] vda1: rw=0, want=4192904, limit=4192902
>> [   11.832299] Buffer I/O error on device vda1, logical block 524112
>> [   11.832394] attempt to access beyond end of device
>> [   11.832395] vda1: rw=0, want=4192904, limit=4192902
>> [   11.832396] Buffer I/O error on device vda1, logical block 524112
>>
> I'm seeing similar stuff in -rc8 too (on ia64, native no VMM):
>
>
> attempt to access beyond end of device
> sda3: rw=0, want=268317424, limit=268317421
> Buffer I/O error on device sda3, logical block 33539677
>
> attempt to access beyond end of device
> sda3: rw=0, want=268317424, limit=268317421
> Buffer I/O error on device sda3, logical block 33539677
>
> -rc7 didn't do this.
>
> -Tony

Resend ... to go to the list (Oh gmail, why did you decide to reply in HTML???)
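
The numbers in both reports are consistent with one buffer read that
runs past the end of a device whose sector count is not a multiple of
eight. The sketch below reproduces the arithmetic, assuming 512-byte
sectors and a 4096-byte block size; both values are inferred from the
log lines, not stated in them, and the function names are made up for
the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512u	/* assumed sector size */

/* First sector past the end of a read of one whole logical block. */
static uint64_t want_sector(uint64_t block, unsigned int block_size)
{
	uint64_t sectors_per_block = block_size / SECTOR_SIZE;

	return (block + 1) * sectors_per_block;
}

/* True if reading the block would go beyond the device limit. */
static bool beyond_end(uint64_t block, unsigned int block_size,
		       uint64_t limit)
{
	return want_sector(block, block_size) > limit;
}
```

Logical block 524112 with 4096-byte blocks ends at sector 4192904,
past the limit of 4192902; block 33539677 ends at 268317424, past
268317421. Both match the "want" values logged above.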

Re: Linux 3.7-rc8

2012-12-03 Thread Tony Luck

> Just for info, can you add a "WARN_ON_ONCE()" to handle_bad_sector()
> just so that I see which particular path your kvm load triggers.

On native ia64 (with SLES11 userspace) I see:

WARNING: at block/blk-core.c:1557 generic_make_request_checks+0x680/0xa40()
Hardware name: I8QBH
Modules linked in: usb_storage sg container button usbhid uhci_hcd
ehci_hcd usbcore usb_common fan processor thermal thermal_sys

Call Trace:
 [] show_stack+0x80/0xa0
sp=e003153cf670 bsp=e003153c1638
 [] dump_stack+0x30/0x50
sp=e003153cf840 bsp=e003153c1620
 [] warn_slowpath_common+0xc0/0x100
sp=e003153cf840 bsp=e003153c15d8
 [] warn_slowpath_null+0x40/0x60
sp=e003153cf840 bsp=e003153c15b0
 [] generic_make_request_checks+0x680/0xa40
sp=e003153cf840 bsp=e003153c1570
 [] generic_make_request+0x30/0x280
sp=e003153cf880 bsp=e003153c1550
 [] submit_bio+0x170/0x3c0
sp=e003153cf890 bsp=e003153c1500
 [] submit_bh+0x310/0x4e0
sp=e003153cf8b0 bsp=e003153c14d0
 [] block_read_full_page+0x720/0x820
sp=e003153cf8b0 bsp=e003153c1430
 [] blkdev_readpage+0x30/0x60
sp=e003153cfcb0 bsp=e003153c1408
 [] read_pages+0x220/0x260
sp=e003153cfcb0 bsp=e003153c13a0
 [] __do_page_cache_readahead+0x130/0x320
sp=e003153cfce0 bsp=e003153c1310
 [] ra_submit+0x40/0x60
sp=e003153cfcf0 bsp=e003153c12e0
 [] ondemand_readahead+0x210/0x580
sp=e003153cfcf0 bsp=e003153c1278
 [] page_cache_sync_readahead+0x90/0x100
sp=e003153cfcf0 bsp=e003153c1238
 [] do_generic_file_read+0x770/0xce0
sp=e003153cfcf0 bsp=e003153c1140
 [] generic_file_aio_read+0x260/0x5c0
sp=e003153cfcf0 bsp=e003153c10d0
 [] do_sync_read+0x130/0x240
sp=e003153cfd30 bsp=e003153c1078
 [] vfs_read+0x1b0/0x340
sp=e003153cfe20 bsp=e003153c1030
 [] sys_read+0x90/0xe0
sp=e003153cfe20 bsp=e003153c0fb0
 [] ia64_ret_from_syscall+0x0/0x20
sp=e003153cfe30 bsp=e003153c0fb0
 [] __kernel_syscall_via_break+0x0/0x20
sp=e003153d bsp=e003153c0fb0
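
For readers unfamiliar with the once-only warning requested above,
here is a userspace imitation of the idea. It is a sketch only: the
macro WARN_ONCE_SKETCH, the counter and the wrapper function are
invented for this example, and unlike the kernel's WARN_ON_ONCE() it
prints a single line and no stack trace.

```c
#include <stdbool.h>
#include <stdio.h>

static int warnings_printed;

/* Report the condition the first time it is true at this call site. */
#define WARN_ONCE_SKETCH(cond)						\
	do {								\
		static bool warned;					\
		if ((cond) && !warned) {				\
			warned = true;					\
			warnings_printed++;				\
			fprintf(stderr, "WARNING: at %s:%d\n",		\
				__FILE__, __LINE__);			\
		}							\
	} while (0)

/* Stand-in for a function that is hit repeatedly on a bad request. */
static void handle_bad_sector_sketch(void)
{
	WARN_ONCE_SKETCH(true);
}
```

However many times the bad path is taken, the call site reports only
once, which is enough to identify the path without flooding the log.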

Re: Linux 3.7-rc8

2012-12-04 Thread Tony Luck

> Linus Torvalds  writes:
>
>> Does that fix the printk's for you too?
>
> Yep, works for me, thanks!

Belated "works for me too" (just in case you were worrying that ia64
was still broken).

-Tony

Re: new execve/kernel_thread design

2012-10-19 Thread Tony Luck

On Fri, Oct 19, 2012 at 10:30 AM, Al Viro  wrote:
> IIRC, the lack of comments on function with unusual calling conventions was
> the last remaining issue...

Stylistically other asm functions have huge block header
comments detailing register usage. But typically those
are way more complex. I think your inline comments
work fine here.

-Tony

Re: [GIT PULL] Fix a cmci discovery problem

2012-11-07 Thread Tony Luck

Ingo,

Is there a problem with this pull request ... or did it just get lost
in the LKML noise?

-Tony

On Tue, Oct 30, 2012 at 3:01 PM, Luck, Tony  wrote:
> The following changes since commit 8f0d8163b50e01f398b14bcd4dc039ac5ab18d64:
>
>   Linux 3.7-rc3 (2012-10-28 12:24:48 -0700)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-tangchen
>
> for you to fetch changes up to 85b97637bb40a9f486459dd254598759af9c3d50:
>
>   x86/mce: Do not change worker's running cpu in cmci_rediscover(). (2012-10-30 14:38:12 -0700)
>
> 
> Fix problem in CMCI rediscovery code that was illegally
> migrating worker threads to other cpus.
>
> 
> Tang Chen (1):
>   x86/mce: Do not change worker's running cpu in cmci_rediscover().
>
>  arch/x86/kernel/cpu/mcheck/mce_intel.c |   31 ++-
>  1 file changed, 18 insertions(+), 13 deletions(-)

Re: [PATCH v8 3/3] aerdrv: Cleanup log output for AER

2013-01-02 Thread Tony Luck

On Wed, Jan 2, 2013 at 3:27 PM, Joe Perches  wrote:
> Just use dev_err( instead of dev_printk(KERN_ERR,
> It's a function and it makes the object code smaller.

Looks like we are almost converged on a solution (Lance: thanks for
your patience and diligence in making changes).

Anyone on the "To:" list want to claim this for their tree to commit?
The series touches pci, acpi, RAS, and tracing ... so there are
several possible owners.

If someone else wants it, then add an:
Acked-by: Tony Luck 
to all three parts.

If there isn't a strong claim, I'll add v9[*] to the RAS tree
and see if the TIP tree folks will pull it from me.

-Tony

[*] When Lance makes the change suggested by Joe.

Re: [GIT PULL] pstore/ram for 3.11

2013-06-14 Thread Tony Luck

On Wed, Jun 12, 2013 at 8:44 PM, Rob Herring  wrote:
> Not sure who takes this, but please pull these 2 changes for pstore for
> 3.11. These are necessary to get pstore to work with on-chip RAM on
> Calxeda highbank platform.

Were these posted for discussion and review?  Is there anyone who should
be providing {Acked,Reviewed,Tested}-by: tags for them?  I haven't ever had
a sub-maintained tree to pull from - so I'm being double-extra cautious before
doing something with this as it all feels new and strange.

-Tony

Re: [PATCH 0/5] ACPI / scan: Make it possible to use the container hotplug with other scan handlers

2013-06-14 Thread Tony Luck

On Fri, Jun 14, 2013 at 3:23 PM, Rafael J. Wysocki  wrote:
> Can you please just test patch [5/5] alone without patches [1-4/5]?  We
> believe that this should work too and if that's the case, we'll only need
> that patch and a reworked [1/5].

Your belief is sound - I popped all five patches and then applied just
5/5 ... and the system still works.

-Tony

Re: [GIT PULL] pstore/ram for 3.11

2013-06-14 Thread Tony Luck

On Fri, Jun 14, 2013 at 3:47 PM, Anton Vorontsov  wrote:
>
> Acked-by: Anton Vorontsov 
>
> (Or I can pick this via linux-pstore.git tree, I'll let Tony decide.)

Added that Acked-by: and applied to my tree.

Thanks

-Tony

Re: [PATCH] Re: [Patch] MCE, APEI: Don't enable CMCI when Firmware First mode is set in

2013-06-18 Thread Tony Luck

On Mon, Jun 17, 2013 at 11:43 PM, Naveen N. Rao
 wrote:
> +   if (bank >= mca_cfg.banks) {
> +           pr_info("mce_disable_bank: Invalid MCA bank %d ignored.\n", bank);

Let's have a FW_BUG in that message to point a finger at the source of
the problem.
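
For illustration, the bounds check with the suggested prefix might look like the userspace sketch below. In the kernel, FW_BUG is a log-prefix macro from <linux/printk.h>; it is redefined here only so the sketch builds standalone. The helper name check_bank() and its msg buffer are made up for this example and are not part of the patch.

```c
#include <stdio.h>

/* Redefined for userspace only; the kernel provides this in <linux/printk.h>. */
#define FW_BUG "[Firmware Bug]: "

/*
 * Hypothetical helper: reject a bank number that firmware reported but
 * that is outside the range of banks the CPU actually has.
 * Returns 1 (and fills msg) when the bank is rejected, 0 otherwise.
 * msg_len must be at least 1.
 */
static int check_bank(int bank, int nr_banks, char *msg, size_t msg_len)
{
	if (bank >= nr_banks) {
		snprintf(msg, msg_len,
			 FW_BUG "Invalid MCA bank %d ignored.", bank);
		return 1;
	}
	msg[0] = '\0';
	return 0;
}
```

The only point is that the prefix is glued onto the format string at compile time, so the log line itself names firmware as the culprit.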


+   apei_hest_parse(hest_parse_cmc, NULL);

I think we want a boot command line option to opt out of this. "nohestcmc"??

-Tony

Re: [PATCH 0/5] ACPI / scan: Make it possible to use the container hotplug with other scan handlers

2013-06-19 Thread Tony Luck

> Can you please apply the appended patch on top of it and see if the system
> still works then?

Still works with this patch.

-Tony

> ---
>  drivers/acpi/scan.c  |3 +++
>  drivers/acpi/video.c |3 ---
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> Index: linux-pm/drivers/acpi/scan.c
> ===
> --- linux-pm.orig/drivers/acpi/scan.c
> +++ linux-pm/drivers/acpi/scan.c
> @@ -939,6 +939,9 @@ static int acpi_device_probe(struct devi
> struct acpi_driver *acpi_drv = to_acpi_driver(dev->driver);
> int ret;
>
> +   if (acpi_dev->handler)
> +   return -EINVAL;
> +
> if (!acpi_drv->ops.add)
> return -ENOSYS;
>
> Index: linux-pm/drivers/acpi/video.c
> ===
> --- linux-pm.orig/drivers/acpi/video.c
> +++ linux-pm/drivers/acpi/video.c
> @@ -1722,9 +1722,6 @@ static int acpi_video_bus_add(struct acp
> int error;
> acpi_status status;
>
> -   if (device->handler)
> -   return -EINVAL;
> -
> status = acpi_walk_namespace(ACPI_TYPE_DEVICE,
> device->parent->handle, 1,
> acpi_video_bus_match, NULL,
>

Re: [PATCH 0/5] ACPI / scan: Make it possible to use the container hotplug with other scan handlers

2013-06-19 Thread Tony Luck

> If you don't mind, I'll queue up https://patchwork.kernel.org/patch/2712741/
> and this for 3.11.

Mark them

Tested-by: Tony Luck 

if you like.

-Tony

Re: [PATCH] [IA64] sim: Add casts to avoid assignment warnings

2013-06-20 Thread Tony Luck

Oops - pasted in old e-mail address for Boris

On Thu, Jun 20, 2013 at 11:15 AM, Luck, Tony  wrote:
> Pointers in the efi_runtime_services_t structure now have type
> "void *" (formerly they were "unsigned long"). So we now see a
> bunch of warnings like this:
>
> arch/ia64/hp/sim/boot/fw-emu.c:293: warning: assignment makes pointer from integer without a cast
>
> Add (void *) casts to the 10 affected lines to make the build quiet again.
>
> Signed-off-by: Tony Luck 
>
> ---
>
> Boris, Matt - Can you add this patch to the same tree that
>
>commit 43ab0476a648053e5998bf081f47f215375a4502 [linux-next id]
>efi: Convert runtime services function ptrs
>
> is in so that it will follow along behind it.  Thanks.
>
>  arch/ia64/hp/sim/boot/fw-emu.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/arch/ia64/hp/sim/boot/fw-emu.c b/arch/ia64/hp/sim/boot/fw-emu.c
> index 271f412..87bf9ad 100644
> --- a/arch/ia64/hp/sim/boot/fw-emu.c
> +++ b/arch/ia64/hp/sim/boot/fw-emu.c
> @@ -290,16 +290,16 @@ sys_fw_init (const char *args, int arglen)
> efi_runtime->hdr.signature = EFI_RUNTIME_SERVICES_SIGNATURE;
> efi_runtime->hdr.revision = EFI_RUNTIME_SERVICES_REVISION;
> efi_runtime->hdr.headersize = sizeof(efi_runtime->hdr);
> -   efi_runtime->get_time = __pa(&fw_efi_get_time);
> -   efi_runtime->set_time = __pa(&efi_unimplemented);
> -   efi_runtime->get_wakeup_time = __pa(&efi_unimplemented);
> -   efi_runtime->set_wakeup_time = __pa(&efi_unimplemented);
> -   efi_runtime->set_virtual_address_map = __pa(&efi_unimplemented);
> -   efi_runtime->get_variable = __pa(&efi_unimplemented);
> -   efi_runtime->get_next_variable = __pa(&efi_unimplemented);
> -   efi_runtime->set_variable = __pa(&efi_unimplemented);
> -   efi_runtime->get_next_high_mono_count = __pa(&efi_unimplemented);
> -   efi_runtime->reset_system = __pa(&efi_reset_system);
> +   efi_runtime->get_time = (void *)__pa(&fw_efi_get_time);
> +   efi_runtime->set_time = (void *)__pa(&efi_unimplemented);
> +   efi_runtime->get_wakeup_time = (void *)__pa(&efi_unimplemented);
> +   efi_runtime->set_wakeup_time = (void *)__pa(&efi_unimplemented);
> +   efi_runtime->set_virtual_address_map = (void *)__pa(&efi_unimplemented);
> +   efi_runtime->get_variable = (void *)__pa(&efi_unimplemented);
> +   efi_runtime->get_next_variable = (void *)__pa(&efi_unimplemented);
> +   efi_runtime->set_variable = (void *)__pa(&efi_unimplemented);
> +   efi_runtime->get_next_high_mono_count = (void *)__pa(&efi_unimplemented);
> +   efi_runtime->reset_system = (void *)__pa(&efi_reset_system);
>
> efi_tables->guid = SAL_SYSTEM_TABLE_GUID;
> efi_tables->table = __pa(sal_systab);
> --
> 1.8.1.4
>

Re: [PATCH v4] aerdrv: Move cper_print_aer() call out of interrupt context

2013-05-30 Thread Tony Luck

Ok - grabbed this version. Will see if I can tempt Linus with a "please pull"
tomorrow (when the commit is suitably aged).

By the way ... this meta-commit description:

> v2 - Re-worded header text.  Removed prefix arg from cper_print_aer().
>  Added TODO message in cper_print_aer().
> v3 - Changed type of u8* to struct aer_capability_regs* in the code
>  to avoid too much casting based on comment from Bjorn Helgaas.
> v4 - Removed TODO message.  Does not have to do with what this patch
>  is trying to fix.

belongs *after* the "---" past the sign-off & Acks ... then "git am" will
drop it from the commit message automatically
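
As a rough sketch of why that placement matters: the commit message ends at the first line consisting of just "---", and whatever follows (up to the diff) is discarded. The function below is a deliberately simplified illustration with an invented name, not git's actual parser, which recognises more patch-start markers than this.

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified illustration (not git's real logic): return the length of
 * the part of a patch mail that would be kept as the commit message,
 * i.e. everything before the first line that is exactly "---".
 */
static size_t message_length(const char *mail)
{
	const char *p = mail;

	while (*p) {
		const char *eol = strchr(p, '\n');
		size_t n = eol ? (size_t)(eol - p) : strlen(p);

		if (n == 3 && memcmp(p, "---", 3) == 0)
			return (size_t)(p - mail);
		if (!eol)
			break;
		p = eol + 1;
	}
	return strlen(mail);
}
```

Version notes such as "v2 - ..." placed after that separator never reach the returned prefix, which is the behaviour being relied on here.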

-Tony

Re: [PATCH v2 1/2] mce: acpi/apei: Honour Firmware First for MCA banks listed in APEI HEST CMC

2013-06-21 Thread Tony Luck

On Fri, Jun 21, 2013 at 1:36 AM, Borislav Petkov  wrote:
> So ok, I'm persuaded, yet another bitfield it is ... :-\

Let's add some more comments on what each of these bitfields mean. Otherwise
we will be going back over this next time we have a patch that touches one
of them and we've all forgotten the subtle details explained in this
e-mail thread.

-Tony

Re: [PATCH v2] pstore: Fail to unlink if a driver has not defined pstore_erase

2013-06-25 Thread Tony Luck

On Tue, Jun 25, 2013 at 9:41 AM, Kees Cook  wrote:
> On Tue, Jun 25, 2013 at 2:03 AM, Aruna Balakrishnaiah
>  wrote:
>> pstore_erase is used to erase the record from the persistent store.
>> So if a driver has not defined pstore_erase callback return

How do people manage devices like this?  With no erase function
they just keep getting more and more pstore entries. Eventually
they fill up.


>> Signed-off-by: Aruna Balakrishnaiah 
>
> Acked-by: Kees Cook 

Applied - thanks.

-Tony

Re: [GIT PULL] Power management and ACPI fixes for v3.10-rc5

2013-06-07 Thread Tony Luck

On Fri, Jun 7, 2013 at 5:51 AM, Rafael J. Wysocki  wrote:
> Aaron Lu (1):
>   ACPI / scan: do not match drivers against objects having scan handlers

This patch showed up in linux-next tag next-20130605 and appears to be the
cause of a boot failure on my ia64 HP rx2600 system.  It panics with
the message:

Kernel panic - not syncing: Unable to find SBA IOMMU: Try a generic or
DIG kernel

Reverting this from next-20130605 fixes my problem and I can boot again.

So please don't pull.

-Tony

Re: [GIT PULL] Power management and ACPI fixes for v3.10-rc5

2013-06-07 Thread Tony Luck

On Fri, Jun 7, 2013 at 3:23 PM, Tony Luck  wrote:
> So please don't pull.

Bother. I see I was a few hours late finding this, and commit 9f29ab11ddb
is already in Linus' tree.

That's what happens when I get busy and skip a couple of days testing
linux-next :-(

So my problem comes from
arch/ia64/hp/common/sba_iommu.c

where the code in sba_init() says:

acpi_bus_register_driver(&acpi_sba_ioc_driver);
if (!ioc_list) {

but because of this change we never managed to call ioc_init()
so ioc_list doesn't get set up, and we die.
Before this commit, the call chain looked like this:

 [] ioc_init+0x40/0xd00
 [] acpi_sba_ioc_add+0x190/0x1c0
 [] acpi_device_probe+0xa0/0x280
 [] really_probe+0xe0/0x520
 [] driver_probe_device+0x30/0x60
 [] __driver_attach+0x110/0x160
 [] bus_for_each_dev+0x110/0x180
 [] driver_attach+0x40/0x60
 [] bus_add_driver+0x230/0x580
 [] driver_register+0xf0/0x400
 [] acpi_bus_register_driver+0x50/0x80
 [] sba_init+0x30/0x2d0

Is my problem that this driver has (or attaches) a "scan handler"
where it shouldn't ... and I just need to stop it doing that?

-Tony

Re: [Suggestion] arch/*/include/asm/bitops.h: about __set_bit() API.

2013-06-10 Thread Tony Luck

On Sat, Jun 8, 2013 at 3:08 AM, Chen Gang  wrote:
> using 'unsigned int *', implicitly:
>   ./ia64/include/asm/bitops.h:63:__set_bit (int nr, volatile void *addr)

There is some downside on ia64 to your suggestion.  If "addr" is properly
aligned for an "int", but misaligned for a long ... i.e. addr%8 == 4, then I'll
take an unaligned reference trap if I work with long* where the current code
working with int* does not.
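
To make the alignment concern concrete, here is a minimal userspace sketch (not the ia64 bitops code). The helper names are invented for illustration; they compute which word a bit lands in and whether that access would be naturally aligned when the bitmap is walked as 4-byte or 8-byte words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only (not kernel API). */

/* Byte offset of the word that holds bit nr, for a given word size. */
static size_t word_offset(unsigned int nr, size_t word_bytes)
{
	return (nr / (8 * word_bytes)) * word_bytes;
}

/* Would a word_bytes-sized access for bit nr be naturally aligned? */
static int access_is_aligned(uintptr_t base, unsigned int nr,
			     size_t word_bytes)
{
	return ((base + word_offset(nr, word_bytes)) % word_bytes) == 0;
}
```

With a base address where addr%8 == 4, every 4-byte access stays aligned while every 8-byte access is off by 4, which is the case that would trap.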

Now perhaps all the callers do guarantee long* alignment?  But I don't know.

Apart from uniformity, there doesn't seem to be any upside to changing this.

-Tony Luck

Re: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors

2013-06-28 Thread Tony Luck

> +   if (sec_sev == GHES_SEV_CORRECTED &&
> +   (gdata->flags & CPER_SEC_ERROR_THRESHOLD_EXCEEDED) &&
> +   (mem_err->validation_bits & CPER_MEM_VALID_PHYSICAL_ADDRESS)) {
> +   unsigned long pfn;
> +   pfn = mem_err->physical_addr >> PAGE_SHIFT;

As Reagan said "Trust ... but verify" ... we should make sure BIOS
gave us a good pfn:

        if (pfn_valid(pfn))
                soft_memory_failure_queue(pfn, 0, 0);
        else
                printk( ...something about BIOS giving us bad pfn = %lu\n", pfn);
> +   }

Re: [PATCH 2/3] acpi: Eliminate console msg if pstore.backend excludes ERST

2013-06-28 Thread Tony Luck

On Fri, Jun 28, 2013 at 1:14 PM, Lenny Szubowicz  wrote:

> -   if (pstore_register(&erst_info)) {
> -   pr_info(ERST_PFX "Could not register with persistent store\n");
> +   rc = pstore_register(&erst_info);
> +   if (rc) {
> +   if (rc != -EPERM)
> +   pr_info(ERST_PFX
> +   "Could not register with persistent store\n");
> +   erst_info.buf = NULL;
> +   erst_info.bufsize = 0;

Mismatch between part 1 and part 2 here ... we return -EINVAL if
our name doesn't match the desired backend ... but you only suppress
the "Could not register" message for -EPERM

Or am I confused while just looking at patch fragments?
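
A tiny sketch of the concern, assuming part 1 really does return -EINVAL for a backend-name mismatch as described above. The function name is invented; it only mirrors the condition in the quoted hunk.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the check in the quoted hunk: nonzero means
 * the "Could not register" message would still be printed.
 */
static int should_log_register_failure(int rc)
{
	return rc != 0 && rc != -EPERM;
}
```

Under that assumption a mismatch (-EINVAL) still logs, so the message the series wants to silence would keep appearing unless both parts agree on one error code.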

-Tony

Re: [PATCH 0/3] acpi: Eliminate misleading erst pstore console message

2013-06-28 Thread Tony Luck

On Fri, Jun 28, 2013 at 1:14 PM, Lenny Szubowicz  wrote:
> On systems that have a valid ACPI ERST table, if the pstore.backend kernel
> parameter selects a specific facility other than erst, then during boot the
> following console message is displayed:
>
> ERST: Could not register with persistent store

Applied (using revised version of part 1).

Thanks

-Tony

Re: [Bug] Reproducible data corruption on i5-3340M: Please continue your great work! :-)

2013-08-16 Thread Tony Luck

On Thu, Aug 15, 2013 at 5:33 PM, Linus Torvalds
 wrote:
> I'll probably delay committing it until tomorrow, in the hope that
> somebody using one of the other architectures will at least ack that
> it compiles. I'm re-attaching the patch (with the two "logn" -> "long"
> fixes) just to encourage that. Hint hint, everybody..

I see I'm too late to supply an Ack for the commit, because it is already in.
But just for completeness' sake - all my ia64 configs build OK, and the couple
that get boot tested still appear to be working too.

-Tony

Re: [RFC PATCH v2 00/11] Add (de)compression support to pstore

2013-08-19 Thread Tony Luck

On Sat, Aug 17, 2013 at 11:32 AM, Kees Cook  wrote:
> Yeah, this is great. While I haven't tested it myself yet, the code
> seems to be in good shape. I acked the ram piece separately, but
> consider the entire series:
>
> Reviewed-by: Kees Cook 

Applied.  This should show up in linux-next tomorrow.

Anyone using efivars as the pstore backend?  Testing reports (positive
or negative) appreciated.

-Tony

Re: [PATCH 1/4] efi: provide a generic efi_config_init()

2013-07-30 Thread Tony Luck

On Tue, Jul 30, 2013 at 9:47 AM, Leif Lindholm  wrote:
> +   /*
> +* Let's see what config tables the firmware passed to us.
> +*/
> +   config_tables = early_memremap(efi.systab->tables,
> +  efi.systab->nr_tables * sz);

Breaks bisection on ia64 ... you use early_memremap() here, but don't
define it on ia64 until patch 3/4.  So I get:

drivers/firmware/efi/efi.c: In function 'efi_config_init':
drivers/firmware/efi/efi.c:200: error: implicit declaration of function 'early_memremap'
drivers/firmware/efi/efi.c:201: warning: assignment makes pointer from integer without a cast

-Tony

Re: [PATCH 1/4] efi: provide a generic efi_config_init()

2013-07-30 Thread Tony Luck

On Tue, Jul 30, 2013 at 11:02 AM, Leif Lindholm
 wrote:
> So I guess the clean way to deal with that would be to make the
> memremap definition a separate patch?

Or just pull:
+#define early_memremap(phys_addr, size)        early_ioremap(phys_addr, size)
out of part 3 and put it into part 1 (along with some of the commit commentary).

-Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 0/4] Make commonly useful UEFI functions common

2013-07-30 Thread Tony Luck

On Tue, Jul 30, 2013 at 9:47 AM, Leif Lindholm  wrote:
> IA64 code compile tested only.

Compiled on a bunch of ia64 configurations. Boot tested, but not on a machine
that does the PROCESSOR_ABSTRACTION_LAYER_OVERWRITE_GUID thingy.
Code to do the arch specific thing looks ok though.

-Tony

Re: [PATCH 00/11] Add compression support to pstore

2013-08-02 Thread Tony Luck

On Thu, Aug 1, 2013 at 4:42 PM, Luck, Tony  wrote:
> when I rebuilt a plain 3.11-rc3 it didn't log anything via pstore either :-(

Well this turned out to be operator error on my part. 3.11-rc3 does in fact
log errors to pstore and allows them to be retrieved and cleared.

So then I start testing with your 11 patches in place.

First boot was fine - ERST had no records, and pstore mounted OK
(and showed no files).

Then I panic'd the machine and rebooted.  The boot hung when some
rc script printed:

Mounting other filesystems:

I guess something went wrong when pstore found a non-empty ERST.

I added some debug traces and booted again.  This time the boot succeeded
but I saw a GP fault reported from pstore_mkfile(). Possibly in this code:

        spin_lock_irqsave(&allpstore_lock, flags);
        list_for_each_entry(pos, &allpstore, list) {
                if (pos->type == type &&
                    pos->id == id &&
                    pos->psi == psi) {
                        rc = -EEXIST;
                        break;
                }
        }
        spin_unlock_irqrestore(&allpstore_lock, flags);



My other tracing showed that we'd already found two compressed entries in
ERST and were working on a third when this error happened (implying that
my hang had been a panic that failed to print anything to console)

I've attached one of the compressed files that v3.11-rc3 shows in pstore
now.  The "openssl zlib -d" trick you mentioned back in June mostly works
to decode ... but it seems to dump some trailing garbage at the end of
the file.

-Tony


unknown-erst-5907623178007478273
Description: Binary data

Re: [PATCH 00/11] Add compression support to pstore

2013-08-02 Thread Tony Luck

A quick experiment to use your patchset - but with compression
disabled by tweaking this line in pstore_dump():

zipped_len = -1; //zip_data(dst, hsize + len);

turned out well. This kernel dumps uncompressed dmesg blobs into pstore
and gets them back out again.  So it seems likely that the problems are
someplace in the compression/decompression code.
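
The experiment relies on a "try to compress, otherwise store raw" fallback. A minimal userspace sketch of that pattern follows; the names store_record(), zip_fn and zip_disabled() are invented here, and the real pstore_dump() differs in its details.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical compressor hook: returns compressed length, or -1 on failure. */
typedef int (*zip_fn)(const char *src, size_t len, char *dst, size_t dst_len);

/* Always-fail compressor, mirroring the "zipped_len = -1" hack above. */
static int zip_disabled(const char *src, size_t len, char *dst, size_t dst_len)
{
	(void)src; (void)len; (void)dst; (void)dst_len;
	return -1;
}

/*
 * Store one record: use the compressed form when the compressor
 * succeeds, otherwise copy the raw bytes (truncated to fit).
 * Returns the number of bytes placed in out and sets *compressed.
 */
static size_t store_record(zip_fn zip, const char *src, size_t len,
			   char *out, size_t out_len, int *compressed)
{
	int zipped_len = zip(src, len, out, out_len);

	if (zipped_len >= 0) {
		*compressed = 1;
		return (size_t)zipped_len;
	}
	*compressed = 0;
	if (len > out_len)
		len = out_len;
	memcpy(out, src, len);
	return len;
}
```

Forcing the failure path this way exercises everything except the compressor itself, which is why a clean result points the finger at the compression/decompression code.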

-Tony

Re: [PATCH v2 1/5] ia64: add early_memremap() alias for early_ioremap()

2013-08-05 Thread Tony Luck

On Mon, Aug 5, 2013 at 3:56 AM, Matt Fleming  wrote:
>> @@ -424,6 +424,7 @@ extern void __iomem * ioremap(unsigned long offset, unsigned long size);
>>  extern void __iomem * ioremap_nocache (unsigned long offset, unsigned long size);
>>  extern void iounmap (volatile void __iomem *addr);
>>  extern void __iomem * early_ioremap (unsigned long phys_addr, unsigned long size);
>> +#define early_memremap(phys_addr, size)        early_ioremap(phys_addr, size)
>>  extern void early_iounmap (volatile void __iomem *addr, unsigned long size);
>>  static inline void __iomem * ioremap_cache (unsigned long phys_addr, unsigned long size)
>
> Tony, can I get your Acked-by for this?

Acked-by: Tony Luck 

[Cut & paste this ack to other parts of the series that touch ia64]

-Tony

Re: [PATCH 00/11] Add compression support to pstore

2013-08-05 Thread Tony Luck

One more experiment - removed previous hack that disabled compression.
Added a new hack to skip decompression.

System died cleanly when I forced a panic.
On reboot I found 3 files in pstore:
-r--r--r--   1 root root 3972 Aug  5 09:24 dmesg-erst-5908671953186586625
-r--r--r--   1 root root 2565 Aug  5 09:24 dmesg-erst-5908671953186586626
-r--r--r--   1 root root 4067 Aug  5 09:24 dmesg-erst-5908671953186586627

Using  "openssl zlib -d" to decompress then ends up with some garbage
at the end of the decompressed file - some text that should be there is
missing.  E.g. the tail of decompressed version of *625 ends with:

<4>Call Trace:
<4> [] dump_stack+0x45/0x56
<4> [] panic+0xc2/0x1cb
<4> [] ? printk+0x54/0x56
<4> [] aegl+0x25/0x30
<4> [] proc_reg_write+0x3d/0x80
<4> [] vfs_write+0xc5/0x1e0
<4> [] SyS_write+0x52/0xa0
<4> [] system_call_fastpath+0x16/0x1b
 )c10^@^@^@^@^@^@^@^@^@

But my serial console logged this:

Call Trace:
 [] dump_stack+0x45/0x56
 [] panic+0xc2/0x1cb
 [] ? printk+0x54/0x56
 [] aegl+0x25/0x30
 [] proc_reg_write+0x3d/0x80
 [] vfs_write+0xc5/0x1e0
 [] SyS_write+0x52/0xa0
 [] system_call_fastpath+0x16/0x1b
[ cut here ]
WARNING: CPU: 18 PID: 381 at arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x5b/0x60()
Modules linked in:
CPU: 18 PID: 381 Comm: kworker/18:1 Not tainted 3.11.0-rc3-11-ge41db9e #6

-Tony

Re: [PATCH 00/11] Add compression support to pstore

2013-08-05 Thread Tony Luck

See attachment for what I actually applied - I think I got what you
suggested (I added a declaration for "total_len").

Forcing a panic worked: some things were logged to pstore.

But on reboot with your patches applied I'm still seeing a GP fault
when pstore is mounted and we find compressed records and inflate them
and install them into the pstore filesystem.  Here's the oops:

general protection fault:  [#1] SMP
Modules linked in:
CPU: 29 PID: 10252 Comm: mount Not tainted 3.11.0-rc3-12-g73bec18 #2
Hardware name: Intel Corporation LH Pass ../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.091020121352 09/10/2012
task: 88082e934040 ti: 88082e2ec000 task.ti: 88082e2ec000
RIP: 0010:[]  [] pstore_mkfile+0x84/0x410
RSP: 0018:88082e2edc70  EFLAGS: 00010007
RAX: 0246 RBX: 81ca7b20 RCX: 625f6963703e373c
RDX: 00040004 RSI: 0004 RDI: 820aa7e8
RBP: 88082e2edd10 R08: 881026a48000 R09: 
R10: 88102d21efb8 R11:  R12: 881026a48000
R13: 51ffe3560003 R14:  R15: 4450
FS:  7fbd37a2d7e0() GS:88103fca() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7fbd37a47000 CR3: 00103dc78000 CR4: 000407e0
Stack:
 881026a4c450 5227 81a3703d 881026a48000
 2e2edd70 88103db34140 0001abaf 36383039
 003a0fb8 881026a48000 88102d21e000 448a
Call Trace:
 [] pstore_get_records+0xed/0x2c0
 [] ? pstore_get_inode+0x50/0x50
 [] pstore_fill_super+0xa2/0xc0
 [] mount_single+0xa2/0xd0
 [] pstore_mount+0x18/0x20
 [] mount_fs+0x43/0x1b0
 [] ? __alloc_percpu+0x10/0x20
 [] vfs_kern_mount+0x6f/0x100
 [] do_mount+0x259/0xa10
 [] ? strndup_user+0x5b/0x80
 [] SyS_mount+0x8e/0xe0
 [] system_call_fastpath+0x16/0x1b
Code: 88 e8 f1 0f 39 00 48 8b 0d 0a 3a a2 00 48 81 f9 00 0d c9 81 75 15 eb 67 0f 1f 80 00 00 00 00 48 8b 09 48 81 f9 00 0d c9 81 74 54 <44> 39 71 18 75 ee 4c 39 69 20 75 e8 48 39 59 10 75 e2 48 89 c6
RIP  [] pstore_mkfile+0x84/0x410
 RSP 
---[ end trace 0e1dd8e3ccfa3dcc ]---
/etc/init.d/functions: line 530: 10252 Segmentation fault  "$@"

Here's the start of my pstore_mkfile() code where the GP fault occurred:

8126d290 :
8126d290:   e8 2b 91 39 00          callq  816063c0 <__fentry__>
8126d295:   55                      push   %rbp
8126d296:   48 89 e5                mov    %rsp,%rbp
8126d299:   41 57                   push   %r15
8126d29b:   41 56                   push   %r14
8126d29d:   41 89 fe                mov    %edi,%r14d
8126d2a0:   48 c7 c7 e8 a7 0a 82    mov    $0x820aa7e8,%rdi
8126d2a7:   41 55                   push   %r13
8126d2a9:   49 89 d5                mov    %rdx,%r13
8126d2ac:   41 54                   push   %r12
8126d2ae:   53                      push   %rbx
8126d2af:   48 83 ec 78             sub    $0x78,%rsp
8126d2b3:   89 4d 84                mov    %ecx,-0x7c(%rbp)
8126d2b6:   48 89 b5 70 ff ff ff    mov    %rsi,-0x90(%rbp)
8126d2bd:   65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
8126d2c4:   00 00
8126d2c6:   48 89 45 d0             mov    %rax,-0x30(%rbp)
8126d2ca:   31 c0                   xor    %eax,%eax
8126d2cc:   48 8b 05 0d d5 e3 00    mov    0xe3d50d(%rip),%rax        # 820aa7e0
8126d2d3:   4c 89 85 78 ff ff ff    mov    %r8,-0x88(%rbp)
8126d2da:   44 89 4d 80             mov    %r9d,-0x80(%rbp)
8126d2de:   48 8b 5d 28             mov    0x28(%rbp),%rbx
8126d2e2:   48 8b 40 60             mov    0x60(%rax),%rax
8126d2e6:   48 89 45 88             mov    %rax,-0x78(%rbp)
8126d2ea:   e8 f1 0f 39 00          callq  815fe2e0 <_raw_spin_lock_irqsave>
8126d2ef:   48 8b 0d 0a 3a a2 00    mov    0xa23a0a(%rip),%rcx        # 81c90d00
8126d2f6:   48 81 f9 00 0d c9 81    cmp    $0x81c90d00,%rcx
8126d2fd:   75 15                   jne    8126d314
8126d2ff:   eb 67                   jmp    8126d368
8126d301:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
8126d308:   48 8b 09                mov    (%rcx),%rcx
8126d30b:   48 81 f9 00 0d c9 81    cmp    $0x81c90d00,%rcx
8126d312:   74 54                   je     8126d368
8126d314:   44 39 71 18             cmp    %r14d,0x18(%rcx)   << GP fault here
8126d318:   75 ee                   jne    8126d308
8126d31a:   4c 39 69 20             cmp    %r13,0x20(%rcx)
8126d31e:   75 e8                   jne    8126d308
8126d320:   48 39 59 10             cmp
Re: Proposed stable release changes

2013-08-21 Thread Tony Luck

On Wed, Aug 21, 2013 at 1:00 PM, Borislav Petkov  wrote:
> We don't want to run daily snapshots of your tree though, right? Only
> -rcs because the daily states are kinda arbitrary and they can be broken
> in various ways. Or are we at a point in time where we can amend that
> rule?

If *nobody* runs daily snapshots - then problems just sit latent all week until
the -rc is released and people start testing. Doesn't sound optimal.

Running daily git snapshots can be "exciting" during the merge window. But
I rarely see problems running a random build after -rc1.  If you are still
running that ancient 3.11-rc6 released on Sunday - then you are missing out
on 28 commits worth of goodness since then :-)

-Tony

Re: [PATCH 00/11] Add compression support to pstore

2013-08-05 Thread Tony Luck

This patch seems to fix the garbage at the end problem.  Booting an
old kernel and using openssl decodes them OK.

Still have problems booting if there are any compressed images in ERST
to be inflated.

-Tony

Re: [PATCH 00/11] Add compression support to pstore

2013-08-06 Thread Tony Luck

On Mon, Aug 5, 2013 at 2:20 PM, Tony Luck  wrote:
> Still have problems booting if there are any compressed images in ERST
> to be inflated.

So I took another look at this part of the code ... and saw a couple of issues:

while ((size = psi->read(&id, &type, &count, &time, &buf, &compressed,
psi)) > 0) {
if (compressed && (type == PSTORE_TYPE_DMESG)) {
big_buf_sz = (psinfo->bufsize * 100) / 45;
big_buf = allocate_buf_for_decompression(big_buf_sz);

if (big_buf || stream.workspace)
>>> Did you mean "&&" here rather than "||"?
unzipped_len = pstore_decompress(buf, big_buf,
size, big_buf_sz);
>>> Need an "else" here to set unzipped_len to -1 (or set it to -1 down
>>> at the bottom of the loop ready for next time around).

if (unzipped_len > 0) {
buf = big_buf;
>>> This sets us up for problems.  First, you just overwrote the address
>>> of the buffer that psi->read allocated - so we have a memory leak. But
>>> worse than that we now double free the same buffer below when we
>>> kfree(buf) and then kfree(big_buf)
size = unzipped_len;
compressed = false;
} else {
pr_err("pstore: decompression failed;"
"returned %d\n", unzipped_len);
compressed = true;
}
}
rc = pstore_mkfile(type, psi->name, id, count, buf,
  compressed, (size_t)size, time, psi);
kfree(buf);
kfree(stream.workspace);
kfree(big_buf);
buf = NULL;
stream.workspace = NULL;
big_buf = NULL;
if (rc && (rc != -EEXIST || !quiet))
failed++;
}


See attached patch that fixes these - but the code still looks like it
could be cleaned up a bit more.
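
For anyone following along without the attachment, here is a minimal
user-space sketch of the ownership pattern the comments above ask for.
It is not the attached patch. The names record_bufs, pick_payload and
release_bufs are made up for illustration, and malloc/free stand in for
kmalloc/kfree. The point is to keep the pointer from the read and the
decompression buffer as separate owners, and to select what gets passed
on through a non-owning pointer, so each allocation is freed exactly once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the buffers handled in the loop. */
struct record_bufs {
    char *buf;        /* owned: allocated by the backend read */
    char *big_buf;    /* owned: decompression target, may be NULL */
    char *workspace;  /* owned: inflate workspace, may be NULL */
};

/*
 * Return the buffer to pass on, without overwriting either owning
 * pointer. The caller resets unzipped_len to -1 before each record so
 * that a skipped decompression is treated as a failure.
 */
static const char *pick_payload(const struct record_bufs *b,
                                int unzipped_len, size_t *size)
{
    assert(b != NULL && size != NULL);
    /* Both allocations are needed, hence "&&" rather than "||". */
    if (b->big_buf && b->workspace && unzipped_len > 0) {
        *size = (size_t)unzipped_len;
        return b->big_buf;
    }
    return b->buf;  /* fall back to the raw (still compressed) data */
}

/* Free every owned buffer exactly once and clear the pointers. */
static void release_bufs(struct record_bufs *b)
{
    free(b->buf);
    free(b->big_buf);
    free(b->workspace);
    b->buf = b->big_buf = b->workspace = NULL;
}
```

Because buf is never reassigned, the leak and the double free both go
away without any extra bookkeeping.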

-Tony


pstore.patch
Description: Binary data

Re: [PATCH 00/11] Add compression support to pstore

2013-08-06 Thread Tony Luck

On Tue, Aug 6, 2013 at 6:58 PM, Aruna Balakrishnaiah
 wrote:
> The patch looks right. I will clean it up. Does the issue still persist
> after this?

Things seem to be working - but testing has hardly been extensive (just
a couple of forced panics).

I do have one other question. In this code:

>>  if (compressed && (type == PSTORE_TYPE_DMESG)) {
>>  big_buf_sz = (psinfo->bufsize * 100) / 45;

Where does the magic multiply by 100/45 (roughly 2.2) come from?  Is that
always enough for the decompression of "dmesg" type data to succeed?

-Tony

Re: [PATCH 00/11] Add compression support to pstore

2013-08-06 Thread Tony Luck

On Tue, Aug 6, 2013 at 10:13 PM, Aruna Balakrishnaiah
 wrote:
> How is it with erst and efivars?

ERST is at the whim of the BIOS writer (the ACPI standard doesn't provide any
suggestions on record sizes).  My systems support ~6K record size.

efivars has, IIRC, a 1k limit coded in the Linux back end.

-Tony

Re: [PATCH 00/11] Add compression support to pstore

2013-08-07 Thread Tony Luck

Oh - one more thing - and my apologies for not spotting this before:

dst = allocate_buf_for_compression(big_buf_sz);

No - you may not call kmalloc() in oops/panic context.  Please pre-allocate
everything you need in some initialization code to make sure that we don't
fail in the panic path because we can't get the memory we need.
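
A rough user-space sketch of what I mean, under the assumption that one
buffer per system is enough. dump_init and dump_in_panic are hypothetical
names, and malloc stands in for kmalloc. All allocation happens in the
init path; the panic path only copies into memory that already exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One per-system buffer, set up long before anything can go wrong. */
static char *dump_buf;
static size_t dump_buf_sz;

/* Initialization path: allocation is allowed to fail here. */
static int dump_init(size_t size)
{
    assert(size > 0);
    dump_buf = malloc(size);
    if (!dump_buf)
        return -1;
    dump_buf_sz = size;
    return 0;
}

/*
 * Panic path: no allocation at all. Copy at most dump_buf_sz bytes
 * into the preallocated buffer and report how many were kept.
 */
static size_t dump_in_panic(const char *msg, size_t len)
{
    if (!dump_buf)
        return 0;   /* init never succeeded: nothing safe to do */
    if (len > dump_buf_sz)
        len = dump_buf_sz;
    memcpy(dump_buf, msg, len);
    return len;
}
```

If the init-time allocation fails we find out at boot, not in the middle
of a panic.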

-Tony

Re: [PATCH 00/11] Add compression support to pstore

2013-08-07 Thread Tony Luck

On Tue, Aug 6, 2013 at 10:35 PM, Tony Luck  wrote:
> ERST is at the whim of the BIOS writer (the ACPI standard doesn't provide any
> suggestions on record sizes).  My systems support ~6K record size.

Off by a little - 7896 bytes on my current machine.

> efivars has, IIRC, a 1k limit coded in the Linux back end.
My memory was correct for this one.

Adding a little tracing to pstore_get_records() I see this:

pstore: inflated 3880 bytes compressed to 17459 bytes
pstore: inflated 2567 bytes compressed to 17531 bytes
pstore: inflated 4018 bytes compressed to 17488 bytes

Which isn't at all what I expected.  The ERST backend
advertised a bufsize of 7896, and I have the default
kmsg_bytes of 10240.  So on my forced panic the code
decided to create a three part pstore dump.  The sum of
the pieces is close to, but a little over the target of 10K.
But I don't understand why the compressed sizes are so
much smaller than the ERST backend block size.

The uncompressed sizes appear to be close to constant.
The compression ratios vary from 14% to 23%

Why do we get three small parts instead of two bigger
ones close to the 7896 ERST bufsize?
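
The arithmetic behind those numbers, as a small sketch. The helper names
are made up for illustration. big_buf_size() just repeats the
(bufsize * 100) / 45 formula quoted earlier in the thread, on the
assumption that psinfo->bufsize is the 7896 that ERST advertises.

```c
#include <assert.h>
#include <stddef.h>

/* Compressed size as a percentage of the uncompressed size. */
static double ratio_percent(size_t compressed, size_t uncompressed)
{
    assert(uncompressed > 0);
    return 100.0 * (double)compressed / (double)uncompressed;
}

/* Sum of n sizes, e.g. the compressed parts of one dump. */
static size_t total(const size_t *sizes, size_t n)
{
    size_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += sizes[i];
    return sum;
}

/* Decompression buffer size from the formula quoted in the thread. */
static size_t big_buf_size(size_t bufsize)
{
    return (bufsize * 100) / 45;
}
```

The three compressed parts (3880, 2567, 4018) add up to 10465 bytes, a
little over the 10240 kmsg_bytes target, and the ratios come out at about
22.2%, 14.6% and 23.0%. With a bufsize of 7896 the formula gives 17546,
which is just above all three uncompressed sizes.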

-Tony

Re: [PATCH 00/11] Add compression support to pstore

2013-08-07 Thread Tony Luck

On Wed, Aug 7, 2013 at 9:29 PM, Aruna Balakrishnaiah
 wrote:
> When we preallocate, we can use the same big_buf for compression as well as
> decompression.
> Also workspace will be one for both. By allocating max of inflate workspace
> size and deflate
> workspace size. We can save memory here.

Well decompression isn't a problem. We are doing that in the non-panicking
context of the freshly booted kernel so we can allocate memory without any
worries for this.  It's only the compression during panic where we must
pre-allocate.  But if the sizes are close to the same, then we might as well
use the same buffers for both (and simplify the code because we don't have
to worry about the kmalloc/kfree bits).

> If pre-allocating close to 50k of buffer is not a issue. We can go ahead
> with this approach.

I never care about allocations measured in *kilo*bytes[1] - the smallest systems
I use have 32GB - so 50K is so far down in the noise of other allocations.
But other types of systems might be more concerned.  ERST is generally
only implemented on servers ... so the better question might be:
What are the sizes for the EFI backend (where the buffer size is 1024). It
sounds like it should scale linearly ... so below 8K???  That should not
scare many people. Even phones measure memory in hundreds of MBytes.

-Tony

[1] unless they are per-cpu or per something else that there are a lot of
on a big server - but this is a one-per-system allocation.

Re: [PATCH] panic: call panic handlers before kmsg_dump

2013-07-18 Thread Tony Luck

On Thu, Jul 18, 2013 at 4:03 PM, Kees Cook  wrote:
> Since the panic handlers may produce additional information (via printk)
> for the kernel log, it should be reported as part of the panic output
> saved by kmsg_dump(). Without this re-ordering, nothing that adds
> information to a panic will show up in pstore's view when kmsg_dump
> runs, and is therefore not visible to crash reporting tools that examine
> pstore output.

Good point.

Acked-by: Tony Luck 

Re: [PATCH] x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors.

2013-07-23 Thread Tony Luck

Gah ... there is another bug in that unaffected thread entry.  The check for
MCG_STATUS should be for RIPV=1 *and* EIPV=0
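
To make the RIPV=1 *and* EIPV=0 condition concrete, here is a small
user-space sketch of a mask/result match in the style of MCGMASK(). The
helper name mcg_matches is made up for illustration; the bit positions
(RIPV is bit 0 and EIPV is bit 1 of MCG_STATUS) follow the SDM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MCG_STATUS_RIPV (1ULL << 0)  /* restart IP valid */
#define MCG_STATUS_EIPV (1ULL << 1)  /* error IP valid */

/*
 * A rule matches when the bits selected by mask are exactly result.
 * Bits outside the mask are "don't care".
 */
static bool mcg_matches(uint64_t mcgstatus, uint64_t mask, uint64_t result)
{
    assert((result & ~mask) == 0);  /* result may only use masked bits */
    return (mcgstatus & mask) == result;
}
```

With mask RIPV and result RIPV, a status with both RIPV and EIPV set
still matches, which is the bug. Putting EIPV in the mask but not in the
result is what forces EIPV=0.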

gmail will mess this patch up ... but should still be readable.

-Tony

---

diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c
b/arch/x86/kernel/cpu/mcheck/mce-severity
index 7f6ab4e..48f0fd2 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -112,7 +112,7 @@ static struct severity {
MCESEV(
KEEP, "Action required but unaffected thread is continuable",
SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR,
MCI_UC_SAR|MCI_ADDR),
-   MCGMASK(MCG_STATUS_RIPV, MCG_STATUS_RIPV)
+   MCGMASK(MCG_STATUS_RIPV|MCG_STATUS_EIPV, MCG_STATUS_RIPV)
),
MCESEV(
AR, "Action required: data load error in a user process",

Re: lockdep warning in edac_create_sysfs_mci_device

2013-07-24 Thread Tony Luck

On Sun, Jul 21, 2013 at 12:02 PM, Borislav Petkov  wrote:
> A fix is on the way:
>
> http://marc.info/?l=linux-edac&m=137422971614927&w=2

Fix was pulled into Linus' tree yesterday evening (Portland, OR timezone):

commit 88d84ac97378c2f1d5fec9af1e8b7d9a662d6b00
Author: Borislav Petkov 
Date:   Fri Jul 19 12:28:25 2013 +0200

EDAC: Fix lockdep splat

Alexandra: Give it a test and let me and Boris know if you still see
any problems.

-Tony

Re: scsi scan: INQUIRY result too short (5), using 36

2013-07-24 Thread Tony Luck

Oops ... forgot final step.  That commit does revert cleanly (at least
git did not grumble when I asked it to revert).  The resulting kernel
builds cleanly and boots without seeing this problem.

-Tony

Re: scsi scan: INQUIRY result too short (5), using 36

2013-07-24 Thread Tony Luck

On Wed, Jul 24, 2013 at 12:43 PM, James Bottomley
 wrote:
> Oops, apparently no-one I cc'd at intel actually bothered to check the
> patch for the isci driver.  The looks to be that sci_swab32_cpy needs
> multiples of four, so for commands that aren't that, it's rounding the
> wrong way.  Does this fix it?

Yes. That fixes it.

Wrap whichever of:

Reported-by: Tony Luck 

and/or

Tested-by: Tony Luck 

around that patch and ship it!

Thanks for the fast fix.

-Tony

[PATCH 0/2] machine check decode fixes

2013-07-24 Thread Tony Luck

V2 of this:
* Broken into two patches - by suggestion of Chen Gong
* Just change MCACOD #define value - by suggestion of Naveen

Tony Luck (2):
  x86/mce: Fix mce regression from recent cleanup
  x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC'
errors

 arch/x86/include/asm/mce.h| 13 +++--
 arch/x86/kernel/cpu/mcheck/mce-severity.c |  4 ++--
 2 files changed, 13 insertions(+), 4 deletions(-)

-- 
1.8.1.4


[PATCH 2/2] x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors

2013-07-24 Thread Tony Luck

The 0x1000 bit of the MCACOD field of machine check MCi_STATUS
registers is only defined for corrected errors (where it means
that hardware may be filtering errors; see SDM section 15.9.2.1).

For uncorrected errors it may, or may not be set - so we should mask
it out when checking for the architecturally defined recoverable
error signatures (see SDM 15.9.3.1 and 15.9.3.2).

Signed-off-by: Tony Luck 
---
 arch/x86/include/asm/mce.h | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 29e3093..aa97342 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -32,11 +32,20 @@
 #define MCI_STATUS_PCC   (1ULL<<57)  /* processor context corrupt */
 #define MCI_STATUS_S    (1ULL<<56)  /* Signaled machine check */
 #define MCI_STATUS_AR   (1ULL<<55)  /* Action required */
-#define MCACOD   0x /* MCA Error Code */
+
+/*
+ * Note that the full MCACOD field of IA32_MCi_STATUS MSR is
+ * bits 15:0.  But bit 12 is the 'F' bit, defined for corrected
+ * errors to indicate that errors are being filtered by hardware.
+ * We should mask out bit 12 when looking for specific signatures
+ * of uncorrected errors - so the F bit is deliberately skipped
+ * in this #define.
+ */
+#define MCACOD   0xefff /* MCA Error Code */
 
 /* Architecturally defined codes from SDM Vol. 3B Chapter 15 */
 #define MCACOD_SCRUB   0x00C0  /* 0xC0-0xCF Memory Scrubbing */
-#define MCACOD_SCRUBMSK0xfff0
+#define MCACOD_SCRUBMSK0xeff0  /* Skip bit 12 ('F' bit) */
 #define MCACOD_L3WB0x017A  /* L3 Explicit Writeback */
 #define MCACOD_DATA0x0134  /* Data Load */
 #define MCACOD_INSTR   0x0150  /* Instruction Fetch */
-- 
1.8.1.4
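
A quick user-space illustration of why the mask matters, not part of the
patch. MCACOD and MCACOD_DATA use the values from the diff above;
MCACOD_ALL_BITS and the two helper names are made up for the comparison
and simply cover all of bits 15:0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MCACOD_ALL_BITS 0xffff  /* hypothetical: every bit of 15:0 */
#define MCACOD          0xefff  /* bits 15:0 with bit 12 ('F') skipped */
#define MCACOD_DATA     0x0134  /* Data Load */

/* Signature check that ignores the 'F' bit. */
static bool is_data_load(uint64_t status)
{
    return (status & MCACOD) == MCACOD_DATA;
}

/* The same check with bit 12 left in the mask, for comparison. */
static bool is_data_load_unmasked(uint64_t status)
{
    assert(MCACOD_ALL_BITS == (MCACOD | 0x1000));
    return (status & MCACOD_ALL_BITS) == MCACOD_DATA;
}
```

A data load error reported with bit 12 set (0x1134 in the low 16 bits) is
recognised by the first helper and missed by the second.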


[PATCH 1/2] x86/mce: Fix mce regression from recent cleanup

2013-07-24 Thread Tony Luck

commit 33d7885b594e169256daef652e8d3527b2298e75
x86/mce: Update MCE severity condition check

Simplified the rules to recognise each classification of recoverable
machine check combining the instruction and data fetch rules into a
single entry based on clarifications in the June 2013 SDM that all
recoverable events would be reported on the unaffected processor with
MCG_STATUS.EIPV=0 and MCG_STATUS.RIPV=1.  Unfortunately the simplified
rule has a couple of bugs.  Fix them here.

Signed-off-by: Tony Luck 
---
 arch/x86/kernel/cpu/mcheck/mce-severity.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c 
b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index e2703520..c370e1c 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -111,8 +111,8 @@ static struct severity {
 #ifdef CONFIG_MEMORY_FAILURE
MCESEV(
KEEP, "Action required but unaffected thread is continuable",
-   SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, 
MCI_UC_SAR|MCI_ADDR),
-   MCGMASK(MCG_STATUS_RIPV, MCG_STATUS_RIPV)
+   SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR, 
MCI_UC_SAR|MCI_ADDR),
+   MCGMASK(MCG_STATUS_RIPV|MCG_STATUS_EIPV, MCG_STATUS_RIPV)
),
MCESEV(
AR, "Action required: data load error in a user process",
-- 
1.8.1.4


Re: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-11 Thread Tony Luck

> What I understand from above in intel 64 Arch software Developer's manual are:
> 1) this manual is written for software developer;
> 2) It says that MCE handler only requires to synchronize among the logical 
> cores in the same package/core(what I assume here is same CPU socket).
>
> I have two CPU sockets on motherboard and total 24 logical cores(12 cores 
> each CPU). Each CPU has its own integrated memory controller. Each memory 
> controller controls three channels of DIMMs. I can understand that if one 
> dimm has error, the memory controller can trigger the MCE exception to it's 
> own CPU, but why should this memory controller sends the MCE exception to the 
> other CPU or the rest CPUs on the motherboard? Is there any hardware standard 
> or specification for it?

The Software Developer Manual is the specification of the architecture
- there are data sheets for each processor which describe
implementation details (e.g. perhaps which types of errors are
reported in which banks, and MCi_STATUS.MSCOD field values providing
more information about an error).

Your "1&2" understanding is correct. Your question on "why should this
memory controller send the MCE exception ..." is a good one. The
answer is because the architecture requires it, even though you and I
can imagine that in some cases the OS could do its work if the error
were just sent to the processors on the socket where the error was
found. There may be some cases where this is less easy (e.g. a
logical processor on one socket issues a NUMA read to a location that
is on the memory controller on the other socket).

-Tony

Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread Tony Luck

On Sat, May 11, 2013 at 12:52 AM, Dmitry Monakhov  wrote:
> What was page_size and fsblock size?

CONFIG_IA64_PAGE_SIZE_64KB=y

fsblock size is whatever is the default for SLES11SP2 on ia64 - which
tool will tell me?

My git bisect finally completed and points the finger at:

bisect> git bisect good
ae4647fb7654676fc44a97e86eb35f9f06b99f66 is first bad commit
commit ae4647fb7654676fc44a97e86eb35f9f06b99f66
Author: Jan Kara 
Date:   Fri Apr 12 00:03:42 2013 -0400

jbd2: reduce journal_head size

Remove unused t_cow_tid field (ext4 copy-on-write support doesn't seem
to be happening) and change b_modified and b_jlist to bitfields thus
saving 8 bytes in the structure.

Signed-off-by: Jan Kara 
Signed-off-by: "Theodore Ts'o" 
Reviewed-by: Zheng Liu 

:04 04 c39ece4341894b3daf84764ba425a87ffb90fe50
d4e8d9185c2a1b740c235ca8ed05d496a442fce3 M  include

-Tony

Re: Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread Tony Luck

On Sun, May 12, 2013 at 7:21 PM, EUNBONG SONG  wrote:
> Hi, my git bisect result is same yours. And i reported that to community 
> yesterday.

Ah. Good to have some confirmation (I was never sure how long to keep
running before deciding that a test was "good").  My slowest "bad" test took
about 2.5 hours.  I mostly let the tests run for >6 hours before deciding.

I just confirmed that 3.10-rc1 still fails (30 minutes).  Now running a test
on 3.10-rc1 with just this commit reverted. Only been going for about
15 minutes, so no useful information yet.

My best guess as to why this commit causes problems is that there are places
where updates to individual fields in this structure used to be independent
because they were to whole words.  Now we have bitfields there are races
between access to different fields in the same word.
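
To show the kind of race I'm guessing at (this is only an illustration
of the guess, not a confirmed diagnosis), here is a user-space sketch
that replays one bad interleaving by hand. struct jh_sketch is a made-up
miniature, not the real journal_head layout.

```c
#include <assert.h>

/* Hypothetical miniature: two bitfields that share one storage unit. */
struct jh_sketch {
    unsigned b_jlist : 4;
    unsigned b_modified : 1;
};

/*
 * Replay the interleaving sequentially: both writers load the shared
 * word, each changes only its own field in a private copy, then both
 * store the whole word back. The second store wipes out the first.
 */
static struct jh_sketch lost_update(struct jh_sketch shared)
{
    assert(shared.b_modified == 0 && shared.b_jlist == 0);

    struct jh_sketch cpu_a = shared;  /* CPU A loads the word */
    struct jh_sketch cpu_b = shared;  /* CPU B loads the same word */

    cpu_a.b_modified = 1;             /* A updates only its field */
    cpu_b.b_jlist = 5;                /* B updates only its field */

    shared = cpu_a;                   /* A stores the whole word */
    shared = cpu_b;                   /* B stores, undoing A's change */
    return shared;
}
```

When the two fields were separate words, neither store could touch the
other field, so locking that protected only one of them was enough.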

-Tony

Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread Tony Luck

The 3.10-rc1 with ae4647fb765467 reverted is still running OK. At
3 hours now (only marginally longer than the 2.5 hours that one of
the "bad" runs during the bisect managed).  So I'm about 30% sure
that we have a winner at the moment. I'll leave it running and check
again in the morning. This penguin is heading to bed now.

-Tony

Re: [PATCH] NOHZ, check to see if tick device is initialized in IRQ handling path

2013-05-02 Thread Tony Luck

>  void tick_nohz_irq_exit(void)
>  {
> struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
> +   struct clock_event_device *dev =
> +__get_cpu_var(tick_cpu_device).evtdev;
> +
> +   /* Has the tick been initialized yet? */
> +   if (unlikely(!dev || dev->mode == CLOCK_EVT_MODE_UNUSED))
> +   return;

Could we have something in the "struct tick_sched" to tell us whether
it has been set up? Rather than this somewhat convoluted digging
around in the clock_event_device innards?

> +   if (unlikely(!dev || dev->mode == CLOCK_EVT_MODE_UNUSED))
> +   return;

Ditto here.

-Tony

VFAT complains that my file system may be corrupted

2013-05-06 Thread Tony Luck

Built Linus' tree this morning (HEAD =
d7ab7302f970a254997687a1cdede421a5635c68) and got this message:

FAT-fs (sda1): Volume was not properly unmounted. Some data may be
corrupt. Please run fsck.

when booting my ia64 machine.  The message may well be legitimate
because I did crash the machine, so the filesystem was not unmounted
cleanly.

BUT ... If I unmount and run fsck as it suggests, then I see:

# fsck /boot/efi
fsck from util-linux-ng 2.16
dosfsck 2.11, 12 Mar 2005, FAT32, LFN
There are differences between boot sector and its backup.
Differences: (offset:original/backup)
  65:01/00
1) Copy original to backup
2) Copy backup to original
3) No action

I tried option 3 - fsck made no other changes, but I still see the
message. I tried option 1 - and I still see the message. So I went
for option 2 ... and guess what, I still see the message when I mount
this filesystem.

Note that with either option 1 or 2 "fsck" says:

Leaving file system unchanged.
/dev/sda1: 20 files, 19865/255496 clusters

This is the first time I've ever seen this message ... but I haven't
had this system crash for some time, so not really sure when this may
have started.

-Tony

Re: VFAT complains that my file system may be corrupted

2013-05-06 Thread Tony Luck

On Mon, May 6, 2013 at 11:44 AM, Oleksij Rempel
 wrote:
> i provided patches for dosfstools for some time now, you need at least
> v3.0.14. If your system does not provide it you will need to grab it here:
> http://daniel-baumann.ch/gitweb/?p=software/dosfstools.git

I may have too old a toolchain to build those :-(

src/boot.c:560: warning: implicit declaration of function ‘cpu_to_le16’
src/boot.c:562: warning: implicit declaration of function ‘cpu_to_le32’

and then at link time:

/home/aegl/dosfstools/src/boot.c:560: undefined reference to `cpu_to_le16'
/home/aegl/dosfstools/src/boot.c:561: undefined reference to `cpu_to_le16'
/home/aegl/dosfstools/src/boot.c:562: undefined reference to `cpu_to_le32'

I guess I can fake them easily (ia64 runs little endian on Linux).

-Tony

Re: VFAT complains that my file system may be corrupted

2013-05-06 Thread Tony Luck

On Mon, May 6, 2013 at 11:51 AM, Tony Luck  wrote:
> I guess I can fake them easily (ia64 runs little endian on Linux).

Duh. Especially as the only use is lines 560-562 in src/boot.c:

  de.starthi = CT_LE_W(0);
  de.start = CT_LE_W(0);
  de.size = CT_LE_L(0);

Gotta make sure to use a little endian 0 rather than risk a big-endian one. WTF?
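
For the record, a sketch of how the missing helpers can be faked without
caring what the host byte order is. This is my own stand-in, not what
dosfstools or the kernel headers actually define. It writes the bytes out
least significant first, so on a little endian host it ends up returning
its argument unchanged.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store v with its least significant byte first in memory. */
static uint16_t cpu_to_le16(uint16_t v)
{
    uint8_t b[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
    uint16_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/* Same idea for 32 bits. */
static uint32_t cpu_to_le32(uint32_t v)
{
    uint8_t b[4] = {
        (uint8_t)(v & 0xff), (uint8_t)((v >> 8) & 0xff),
        (uint8_t)((v >> 16) & 0xff), (uint8_t)(v >> 24)
    };
    uint32_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/* Zero has the same byte pattern in either byte order. */
static void check_zero_is_endian_neutral(void)
{
    assert(cpu_to_le16(0) == 0);
    assert(cpu_to_le32(0) == 0);
}
```

Since those three lines only ever convert 0, the conversion cannot change
anything on any host.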

Anyhow ... thanks for the pointer. That fixed my filesystem for me.

-Tony

Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-10 Thread Tony Luck

I think I have the same (or highly similar) thing happening on ia64.

Similarities: seeing assertions fail for b_transaction
Differences: I only have ext3 filesystems mounted, no ext4

See attached trace.  I'm pretty certain that the highly unhelpful

bugcheck! 0 [1]

comes from the

J_ASSERT_JH(jh, jh->b_transaction == NULL);

from disassembling __journal_remove_journal_head(). The instruction
pointer  points to the 2nd "break" instruction
in the function.

The problem shows up after 30 minutes to a couple of hours of stress (kernel
builds with "make -j32").

I'm pretty sure this problem didn't occur in plain v3.9 (it can run for
a full 24 hours).

Trying to bisect - but it takes a while to be convinced that a good kernel
is actually good (since I don't have a clear picture of how long to run
before deciding that the bug isn't going to show)

-Tony


bug
Description: Binary data

Re: [tip:smp/hotplug] idle: Implement generic idle function

2013-04-15 Thread Tony Luck

Built next-20130415 and got this on ia64 early in boot:

WARNING: at kernel/cpu/idle.c:94 cpu_idle_loop+0x360/0x380()
Hardware name: server rx2620
Modules linked in:

Call Trace:
 [] show_stack+0x80/0xa0
sp=a00101287c50 bsp=a00101280e48
 [] dump_stack+0x30/0x50
sp=a00101287e20 bsp=a00101280e30
 [] warn_slowpath_common+0xc0/0x100
sp=a00101287e20 bsp=a00101280de8
 [] warn_slowpath_null+0x40/0x60
sp=a00101287e20 bsp=a00101280dc0
 [] cpu_idle_loop+0x360/0x380
sp=a00101287e20 bsp=a00101280d80
 [] cpu_startup_entry+0x40/0x60
sp=a00101287e20 bsp=a00101280d68
 [] rest_init+0x100/0x120
sp=a00101287e20 bsp=a00101280d50
 [] start_kernel+0x770/0x890
sp=a00101287e20 bsp=a00101280cd0
 [] start_ap+0x760/0x780
sp=a00101287e30 bsp=a00101280bc0
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [tip:smp/hotplug] idle: Implement generic idle function

2013-04-16 Thread Tony Luck

On Tue, Apr 16, 2013 at 6:28 AM, Thomas Gleixner  wrote:
> Hmm, is safe_halt() returning with interrupts disabled? If yes, it
> lacks a local_irq_enable().

Quite probably. Adding arch_local_irq_enable() to arch_safe_halt()
makes all the problems go away.  I'll send you the one-line patch
from a system that won't mung it like gmail will.

-Tony

Re: [PATCH 4/9] ia64: cpufreq: move cpufreq driver to drivers/cpufreq

2013-04-01 Thread Tony Luck

[Repost in plain text so the lists don't bounce it - curse you Gmail
for switching to HTML]

> Any comments on this patch?

This part looks OK ... But is there a big finish later in the patch
series where you unify some/all of the cpufreq code across
architectures?  By itself just moving bits from
arch/ia64/kernel/cpufreq to drivers/cpufreq/ doesn't look to add
much value.

-Tony

[PATCH 5/6] x86/mce: Provide an option to keep cmci_reenable() quiet

2012-08-07 Thread Tony Luck

cmci_reenable() calls cmci_discover() to look at which machine check
banks are shared between processors. It ensures that only one cpu takes
ownership of each shared bank. At boot time cmci_discover() is muted,
but during hot add events it provides some output which may be helpful
to ensure that all banks have an owner.

We want to use cmci_reenable() when a CMCI storm subsides. In this case
the topology has not changed, so we do not need any commentary as it
goes about its business.

Add a "quiet" argument to cmci_reenable() that it passes to cmci_discover().

Signed-off-by: Tony Luck 
---

[Patches 1-4 remain as previously posted. This is a new patch to
 help tidy console messages. Old patch 5 becomes patch 6 (and has
 a few cleanups)]

 arch/x86/include/asm/mce.h |  4 ++--
 arch/x86/kernel/cpu/mcheck/mce.c   |  4 ++--
 arch/x86/kernel/cpu/mcheck/mce_intel.c | 10 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 441520e..bf79a0f 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -165,13 +165,13 @@ extern int mce_cmci_disabled;
 extern int mce_ignore_ce;
 void mce_intel_feature_init(struct cpuinfo_x86 *c);
 void cmci_clear(void);
-void cmci_reenable(void);
+void cmci_reenable(int quiet);
 void cmci_rediscover(int dying);
 void cmci_recheck(void);
 #else
 static inline void mce_intel_feature_init(struct cpuinfo_x86 *c) { }
 static inline void cmci_clear(void) {}
-static inline void cmci_reenable(void) {}
+static inline void cmci_reenable(int quiet) {}
 static inline void cmci_rediscover(int dying) {}
 static inline void cmci_recheck(void) {}
 #endif
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index b4dde15..826dd21 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1994,7 +1994,7 @@ static void mce_enable_ce(void *all)
 {
if (!mce_available(__this_cpu_ptr(&cpu_info)))
return;
-   cmci_reenable();
+   cmci_reenable(0);
cmci_recheck();
if (all)
__mcheck_cpu_init_timer();
@@ -2246,7 +2246,7 @@ static void __cpuinit mce_reenable_cpu(void *h)
return;
 
if (!(action & CPU_TASKS_FROZEN))
-   cmci_reenable();
+   cmci_reenable(0);
for (i = 0; i < banks; i++) {
struct mce_bank *b = &mce_banks[i];
 
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 38e49bc..e652cde 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -78,7 +78,7 @@ static void print_update(char *type, int *hdr, int num)
  * on this CPU. Use the algorithm recommended in the SDM to discover shared
  * banks.
  */
-static void cmci_discover(int banks, int boot)
+static void cmci_discover(int banks, int quiet)
 {
unsigned long *owned = (void *)&__get_cpu_var(mce_banks_owned);
unsigned long flags;
@@ -96,7 +96,7 @@ static void cmci_discover(int banks, int boot)
 
/* Already owned by someone else? */
if (val & MCI_CTL2_CMCI_EN) {
-   if (test_and_clear_bit(i, owned) && !boot)
+   if (test_and_clear_bit(i, owned) && !quiet)
print_update("SHD", &hdr, i);
__clear_bit(i, __get_cpu_var(mce_poll_banks));
continue;
@@ -109,7 +109,7 @@ static void cmci_discover(int banks, int boot)
 
/* Did the enable bit stick? -- the bank supports CMCI */
if (val & MCI_CTL2_CMCI_EN) {
-   if (!test_and_set_bit(i, owned) && !boot)
+   if (!test_and_set_bit(i, owned) && !quiet)
print_update("CMCI", &hdr, i);
__clear_bit(i, __get_cpu_var(mce_poll_banks));
} else {
@@ -196,11 +196,11 @@ void cmci_rediscover(int dying)
 /*
  * Reenable CMCI on this CPU in case a CPU down failed.
  */
-void cmci_reenable(void)
+void cmci_reenable(int quiet)
 {
int banks;
if (cmci_supported(&banks))
-   cmci_discover(banks, 0);
+   cmci_discover(banks, quiet);
 }
 
 static void intel_init_cmci(void)
-- 
1.7.10.2.552.gaa3bb87


[PATCH 6/6] x86/mce: Add CMCI poll mode

2012-08-07 Thread Tony Luck

From: Chen Gong 

On Intel systems corrected machine check interrupts (CMCI) may be sent to
multiple logical processors; possibly to all processors on the affected
socket (SDM Volume 3B "15.5.1 CMCI Local APIC Interface").  This means
that a persistent error (such as a stuck bit in ECC memory) may cause
a storm of interrupts that greatly hinders or prevents forward progress
(probably on many processors).

To solve this we keep track of the rate at which each processor sees
CMCI. If we exceed a threshold, we disable CMCI delivery and switch to
polling the machine check banks. If the storm subsides (none of the
affected processors see any more errors for a complete poll interval) we
re-enable CMCI.
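The policy described above can be sketched in plain user-space C, as a
rough illustration only. This is not the code in the patch: every name
below is hypothetical, the window length is an assumption, and the model
covers a single CPU, whereas the patch has to consider all of the affected
processors. Only the threshold of 15 is taken from the changelog further
down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; only the threshold of 15 comes from the changelog. */
#define STORM_THRESHOLD 15   /* interrupts tolerated per window */
#define STORM_WINDOW    100  /* window length in ticks (assumed value) */

enum cmci_mode { MODE_INTERRUPT, MODE_POLL };

struct storm_state {
    enum cmci_mode mode;
    unsigned long window_start; /* tick at which the current window began */
    unsigned int count;         /* interrupts seen in the current window */
};

/* Call for every corrected-error interrupt. Returns true on storm entry. */
bool storm_note_interrupt(struct storm_state *s, unsigned long now)
{
    assert(s != NULL);
    if (s->mode == MODE_POLL)
        return false;               /* interrupt delivery is already off */

    if (now - s->window_start >= STORM_WINDOW) {
        s->window_start = now;      /* start a fresh counting window */
        s->count = 0;
    }
    if (++s->count <= STORM_THRESHOLD)
        return false;

    s->mode = MODE_POLL;            /* rate exceeded: switch to polling */
    return true;
}

/* Call once per poll interval. Returns true when the storm has ended. */
bool storm_note_poll(struct storm_state *s, bool errors_seen)
{
    assert(s != NULL);
    if (s->mode != MODE_POLL || errors_seen)
        return false;               /* still noisy, keep polling */

    s->mode = MODE_INTERRUPT;       /* a quiet interval: re-enable CMCI */
    s->count = 0;
    return true;
}
```

The two entry points mirror the two places where the decision gets made:
the interrupt handler can only decide to enter poll mode, and the poll
timer can only decide to leave it.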

Signed-off-by: Chen Gong 
Signed-off-by: Thomas Gleixner 
Tested-by: Chen Gong 
Signed-off-by: Tony Luck 
---

Changes (w.r.t. old patch 5/5):
+ New commit message
+ Print messages when storm starts/ends
+ Suppress messages from cmci_discover()
+ Some spelling fixes
+ Increased storm threshold from 5 to 15 (so we
  have a few more samples for pattern detection to
  identify the source of the storm).

 arch/x86/kernel/cpu/mcheck/mce-internal.h |  12 
 arch/x86/kernel/cpu/mcheck/mce.c  |  47 +++--
 arch/x86/kernel/cpu/mcheck/mce_intel.c| 108 +-
 3 files changed, 160 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index ed44c8a..6a05c1d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -28,6 +28,18 @@ extern int mce_ser;
 
 extern struct mce_bank *mce_banks;
 
+#ifdef CONFIG_X86_MCE_INTEL
+unsigned long mce_intel_adjust_timer(unsigned long interval);
+void mce_intel_cmci_poll(void);
+void mce_intel_hcpu_update(unsigned long cpu);
+#else
+# define mce_intel_adjust_timer mce_adjust_timer_default
+static inline void mce_intel_cmci_poll(void) { }
+static inline void mce_intel_hcpu_update(unsigned long cpu) { }
+#endif
+
+void mce_timer_kick(unsigned long interval);
+
 #ifdef CONFIG_ACPI_APEI
 int apei_write_mce(struct mce *m);
 ssize_t apei_read_mce(struct mce *m, u64 *record_id);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 826dd21..ee57a8f 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1260,6 +1260,14 @@ static unsigned long check_interval = 5 * 60; /* 5 minutes */
 static DEFINE_PER_CPU(unsigned long, mce_next_interval); /* in jiffies */
 static DEFINE_PER_CPU(struct timer_list, mce_timer);
 
+static unsigned long mce_adjust_timer_default(unsigned long interval)
+{
+   return interval;
+}
+
+static unsigned long (*mce_adjust_timer)(unsigned long interval) =
+   mce_adjust_timer_default;
+
 static void mce_timer_fn(unsigned long data)
 {
struct timer_list *t = &__get_cpu_var(mce_timer);
@@ -1270,6 +1278,7 @@ static void mce_timer_fn(unsigned long data)
if (mce_available(__this_cpu_ptr(&cpu_info))) {
machine_check_poll(MCP_TIMESTAMP,
&__get_cpu_var(mce_poll_banks));
+   mce_intel_cmci_poll();
}
 
/*
@@ -1277,14 +1286,38 @@ static void mce_timer_fn(unsigned long data)
 * polling interval, otherwise increase the polling interval.
 */
iv = __this_cpu_read(mce_next_interval);
-   if (mce_notify_irq())
+   if (mce_notify_irq()) {
iv = max(iv / 2, (unsigned long) HZ/100);
-   else
+   } else {
iv = min(iv * 2, round_jiffies_relative(check_interval * HZ));
+   iv = mce_adjust_timer(iv);
+   }
__this_cpu_write(mce_next_interval, iv);
+   /* Might have become 0 after CMCI storm subsided */
+   if (iv) {
+   t->expires = jiffies + iv;
+   add_timer_on(t, smp_processor_id());
+   }
+}
 
-   t->expires = jiffies + iv;
-   add_timer_on(t, smp_processor_id());
+/*
+ * Ensure that the timer is firing in @interval from now.
+ */
+void mce_timer_kick(unsigned long interval)
+{
+   struct timer_list *t = &__get_cpu_var(mce_timer);
+   unsigned long when = jiffies + interval;
+   unsigned long iv = __this_cpu_read(mce_next_interval);
+
+   if (timer_pending(t)) {
+   if (time_before(when, t->expires))
+   mod_timer_pinned(t, when);
+   } else {
+   t->expires = round_jiffies(when);
+   add_timer_on(t, smp_processor_id());
+   }
+   if (interval < iv)
+   __this_cpu_write(mce_next_interval, interval);
 }
 
 /* Must not be called in IRQ context where del_timer_sync() can deadlock */
@@ -1548,6 +1581,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
switch (c->x86_vendor) {
case X86_VENDOR_INTEL:
mce_intel_feature_init(c);
+

[PATCH 5/6] x86/mce: Make cmci_discover() quiet

2012-08-09 Thread Tony Luck

cmci_discover() works out which machine check banks support CMCI, and
which of those are shared by multiple logical processors. It uses this
information to ensure that exactly one cpu is designated the owner of
each bank so that when interrupts are broadcast to multiple cpus, only one
of them will look in a shared bank to log the error and clear the bank.
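As a rough user-space simulation of that ownership rule (a sketch under
stated assumptions, not the kernel code): the shared enable bit is modelled
as a plain struct field rather than an MSR, every bank in the array is
assumed to be visible to both CPUs, and all names are hypothetical. The
first CPU to make the enable bit stick becomes the owner; a CPU that finds
the bit already set leaves the bank alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBANKS 4  /* hypothetical bank count for the simulation */

/* Stand-in for one shared bank control register. */
struct bank_reg {
    bool supports_cmci; /* hardware lets the enable bit stick */
    bool cmci_en;       /* current state of the enable bit */
};

struct cpu_state {
    unsigned int owned; /* bitmask: banks whose CMCI this CPU handles */
    unsigned int poll;  /* bitmask: banks this CPU must poll instead */
};

/* Claim every CMCI-capable bank that no other CPU has claimed yet. */
void discover(struct cpu_state *cpu, struct bank_reg *banks, size_t n)
{
    assert(n <= NBANKS);
    for (size_t i = 0; i < n; i++) {
        unsigned int bit = 1u << i;

        if (cpu->owned & bit)
            continue;                   /* already ours, nothing to do */

        if (banks[i].cmci_en) {         /* already owned by someone else */
            cpu->poll &= ~bit;          /* the owner will log this bank */
            continue;
        }

        /* Try to set the enable bit, then read it back. */
        banks[i].cmci_en = banks[i].supports_cmci;
        if (banks[i].cmci_en) {         /* it stuck: we are the owner */
            cpu->owned |= bit;
            cpu->poll &= ~bit;
        } else {
            cpu->poll |= bit;           /* no CMCI here: keep polling */
        }
    }
}
```

Running discover() for two CPUs over the same array leaves their owned
masks disjoint, which is the property the paragraph above relies on.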

At boot time cmci_discover() performs this task silently. But during
certain cpu hotplug operations it prints out a set of summary lines
like this:

CPU 35 MCA banks CMCI:0 CMCI:1 CMCI:3 CMCI:5 CMCI:6 CMCI:7 CMCI:8 CMCI:9 CMCI:10 CMCI:11
CPU 1 MCA banks CMCI:0 CMCI:1 CMCI:3
CPU 39 MCA banks CMCI:0 CMCI:1 CMCI:3
CPU 38 MCA banks CMCI:0 CMCI:1 CMCI:3
CPU 32 MCA banks CMCI:0 CMCI:1 CMCI:3
CPU 37 MCA banks CMCI:0 CMCI:1 CMCI:3
CPU 36 MCA banks CMCI:0 CMCI:1 CMCI:3
CPU 34 MCA banks CMCI:0 CMCI:1 CMCI:3

The value of these messages seems very low. A user might painstakingly
cross-check against the data sheet for a processor to ensure that all
CMCI supported banks are correctly reported, but this seems improbable.
If users really wanted to do this, we should print the information at
boot time too.

Remove the messages.

Signed-off-by: Tony Luck 
---

Gong pointed out to me offline that my previous "patch 5/6" would not
do what I said it did in the case where a processor is taken offline
during a CMCI storm. We'd have a topology change, but would suppress
the bank attribution messages when the storm ended.  I took a longer
look at the messages, and decided that we can live without them.

 arch/x86/kernel/cpu/mcheck/mce_intel.c | 25 ++---
 1 file changed, 6 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 38e49bc..59648e4 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -65,24 +65,15 @@ static void intel_threshold_interrupt(void)
mce_notify_irq();
 }
 
-static void print_update(char *type, int *hdr, int num)
-{
-   if (*hdr == 0)
-   printk(KERN_INFO "CPU %d MCA banks", smp_processor_id());
-   *hdr = 1;
-   printk(KERN_CONT " %s:%d", type, num);
-}
-
 /*
  * Enable CMCI (Corrected Machine Check Interrupt) for available MCE banks
  * on this CPU. Use the algorithm recommended in the SDM to discover shared
  * banks.
  */
-static void cmci_discover(int banks, int boot)
+static void cmci_discover(int banks)
 {
unsigned long *owned = (void *)&__get_cpu_var(mce_banks_owned);
unsigned long flags;
-   int hdr = 0;
int i;
 
raw_spin_lock_irqsave(&cmci_discover_lock, flags);
@@ -96,8 +87,7 @@ static void cmci_discover(int banks, int boot)
 
/* Already owned by someone else? */
if (val & MCI_CTL2_CMCI_EN) {
-   if (test_and_clear_bit(i, owned) && !boot)
-   print_update("SHD", &hdr, i);
+   clear_bit(i, owned);
__clear_bit(i, __get_cpu_var(mce_poll_banks));
continue;
}
@@ -109,16 +99,13 @@ static void cmci_discover(int banks, int boot)
 
/* Did the enable bit stick? -- the bank supports CMCI */
if (val & MCI_CTL2_CMCI_EN) {
-   if (!test_and_set_bit(i, owned) && !boot)
-   print_update("CMCI", &hdr, i);
+   set_bit(i, owned);
__clear_bit(i, __get_cpu_var(mce_poll_banks));
} else {
WARN_ON(!test_bit(i, __get_cpu_var(mce_poll_banks)));
}
}
raw_spin_unlock_irqrestore(&cmci_discover_lock, flags);
-   if (hdr)
-   printk(KERN_CONT "\n");
 }
 
 /*
@@ -186,7 +173,7 @@ void cmci_rediscover(int dying)
continue;
/* Recheck banks in case CPUs don't all have the same */
if (cmci_supported(&banks))
-   cmci_discover(banks, 0);
+   cmci_discover(banks);
}
 
set_cpus_allowed_ptr(current, old);
@@ -200,7 +187,7 @@ void cmci_reenable(void)
 {
int banks;
if (cmci_supported(&banks))
-   cmci_discover(banks, 0);
+   cmci_discover(banks);
 }
 
 static void intel_init_cmci(void)
@@ -211,7 +198,7 @@ static void intel_init_cmci(void)
return;
 
mce_threshold_vector = intel_threshold_interrupt;
-   cmci_discover(banks, 1);
+   cmci_discover(banks);
/*
 * For CPU #0 this runs with still disabled APIC, but that's
 * ok because only the vector is set up. We still do another
-- 
1.7.10.2.552.gaa3bb87


Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-19 Thread Tony Luck

> Perhaps what is happening is that cpu0 comes online ... safely skips
> over the early printk calls.  Calls cpu_init() which sets up the resources
> *it* needs (ar.k3 points to per-cpu space), and then executes
> sched_init() which marks it safe for all printk's. Then cpu1 comes
> up and does a printk before it gets to cpu_init().

I just tried Ingo's patch[1] on a 2.6.25-rc2 kernel with printk timestamps
turned on ... and it booted just fine on my tiger4.  The default path
for non-boot cpus is from head.S to start_secondary(), and that
calls cpu_init() pretty quickly.  There shouldn't normally[2] be any
printk() calls on the non-boot cpu before it is safe to do so.
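The guard that the attached patch relies on boils down to an early return
until an init-time flag is set. A minimal user-space sketch of that
pattern, with hypothetical names and a fake counter standing in for the
per-cpu runqueue clock:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel objects involved. */
static bool scheduler_running;          /* set once, at the end of init */
static unsigned long long fake_clock;   /* stands in for the rq clock */

/* Safe to call at any time: returns 0 until init has completed. */
unsigned long long cpu_clock_sketch(void)
{
    if (!scheduler_running)
        return 0;       /* too early: per-cpu state is not set up yet */
    return ++fake_clock; /* toy clock: advances by one on each read */
}

void sched_init_sketch(void)
{
    assert(!scheduler_running); /* init is expected to run only once */
    scheduler_running = true;
}
```

Early callers see a timestamp of 0 rather than touching state that does
not exist yet.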

-Tony

[1] Attached
[2] If you set #define SMP_DEBUG in arch/ia64/kernel/smpboot.c
that enables at least one printk() that will cause problems if you have
also configured timestamps.
 kernel/sched.c |   14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

Index: linux-x86.q/kernel/sched.c
===
--- linux-x86.q.orig/kernel/sched.c
+++ linux-x86.q/kernel/sched.c
@@ -666,6 +666,8 @@ const_debug unsigned int sysctl_sched_rt
  */
 const_debug unsigned int sysctl_sched_rt_ratio = 62259;
 
+static __read_mostly int scheduler_running;
+
 /*
  * For kernel-internal use: high-speed (but slightly incorrect) per-cpu
  * clock constructed from sched_clock():
@@ -676,14 +678,16 @@ unsigned long long cpu_clock(int cpu)
 	unsigned long flags;
 	struct rq *rq;
 
-	local_irq_save(flags);
-	rq = cpu_rq(cpu);
 	/*
 	 * Only call sched_clock() if the scheduler has already been
 	 * initialized (some code might call cpu_clock() very early):
 	 */
-	if (rq->idle)
-		update_rq_clock(rq);
+	if (unlikely(!scheduler_running))
+		return 0;
+
+	local_irq_save(flags);
+	rq = cpu_rq(cpu);
+	update_rq_clock(rq);
 	now = rq->clock;
 	local_irq_restore(flags);
 
@@ -7255,6 +7259,8 @@ void __init sched_init(void)
 	 * During early bootup we pretend to be a normal task:
 	 */
 	current->sched_class = &fair_sched_class;
+
+	scheduler_running = 1;
 }
 
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
--

Re: [PATCH 2/2] x86/mce: Add quirk for instruction recovery on Sandy Bridge processors

2012-07-20 Thread Tony Luck

> Maybe define a default empty quirk_no_way_out() on the remaining
> families/vendors so that the compiler can optimize it away and we save
> ourselves the if-test?

Perhaps I misunderstood your suggestion. I don't see how the compiler will
manage to optimize it all away.  I just tried defining

static void quirk_no_way_out_nop(int bank, struct mce *m, struct pt_regs *regs)
{
}

and providing that as an initial value for the quirk_no_way_out
function pointer.

Then I deleted the "if (quirk_no_way_out)".

Looking at the assembly code produced, I now just have an unconditional call:

 callq  *0x9fe992(%rip)# 81a18668 


I'd think that a call through a function pointer to an empty function is
more expensive than testing whether that function pointer was NULL.
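For reference, the two variants being compared look roughly like this in
stand-alone C. It is a sketch with hypothetical names (struct event,
quirk_hook and friends are not from the patch), meant only to show the
shape of each call site; what the compiler emits will depend on the
compiler and its options.

```c
#include <assert.h>
#include <stddef.h>

struct event { int bank; int fixed; };  /* hypothetical payload */

typedef void (*quirk_fn)(struct event *e);

/* Variant A: the pointer stays NULL unless a quirk is registered. */
quirk_fn quirk_hook;

void handle_with_test(struct event *e)
{
    if (quirk_hook)         /* load the pointer, test it, usually skip */
        quirk_hook(e);
}

/* Variant B: default to an empty function and call unconditionally. */
void quirk_nop(struct event *e)
{
    (void)e;                /* intentionally does nothing */
}

quirk_fn quirk_hook_nop = quirk_nop;

void handle_unconditional(struct event *e)
{
    quirk_hook_nop(e);      /* always an indirect call, even for the nop */
}

/* An example quirk that either hook could point at. */
void quirk_fix(struct event *e)
{
    assert(e != NULL);
    e->fixed = 1;
}
```

Because the target of quirk_hook_nop is only known at run time, the
compiler generally has to keep the indirect call in variant B, which
matches the callq seen above.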

-Tony

[PATCH] dmi: Feed DMI table to /dev/random driver

2012-07-20 Thread Tony Luck

Send the entire DMI (SMBIOS) table to the /dev/random driver to
help seed its pools.

Signed-off-by: Tony Luck 
---

This looks like a useful addition to your /dev/random series. There are
lots of platform specific goodies in this table (BIOS version, system
serial number and UUID, count and version number of processors, DIMM
slot population and serial numbers, etc.)

On the system I tested the patch on the table is 9866 bytes. Is it
OK to dump that much into add_device_randomness() in one shot? The
alternative is to select the 'useful' bits deeper into the routines
that parse the entries in the table.

 drivers/firmware/dmi_scan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
index 153980b..b298158 100644
--- a/drivers/firmware/dmi_scan.c
+++ b/drivers/firmware/dmi_scan.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -111,6 +112,8 @@ static int __init dmi_walk_early(void (*decode)(const struct dmi_header *,
 
dmi_table(buf, dmi_len, dmi_num, decode, NULL);
 
+   add_device_randomness(buf, dmi_len);
+
dmi_iounmap(buf, dmi_len);
return 0;
 }
-- 
1.7.10.2.552.gaa3bb87


Re: [PATCH] dmi: Feed DMI table to /dev/random driver

2012-07-20 Thread Tony Luck

On Fri, Jul 20, 2012 at 5:56 PM, Theodore Ts'o  wrote:
> The other approach was to add some new interface that random.c would
> call which would grab the dmi data from rand_initialize().  But that's
> going to be a lot more complicated, so I guess we should go with the
> simple/stupid approach.

It wouldn't be all that hard ... we'd just need to add a new entry point
to the dmi code for random to call (and a stub somewhere so that
CONFIG_DMI=n kernels still build). But getting some per-platform
data into the random pools earlier is a good thing ... it means that
users of random data will see the benefit earlier than they do now.

So add the big fat comment so that people know not to break this
useful (if not entirely intentional) functionality.

-Tony

[PATCH] random: Add comment to random_initialize()

2012-07-23 Thread Tony Luck

Many platforms have per-machine instance data (serial numbers,
asset tags, etc.) squirreled away in areas that are accessed
during early system bringup. Mixing this data into the random
pools has a very high value in providing better random data,
so we should allow (and even encourage) architecture code to
call add_device_randomness() from the setup_arch() paths.

However, this limits our options for internal structure of
the random driver since rand_initialize() is not called
until long after setup_arch().

Add a big fat comment to rand_initialize() spelling out
this requirement.
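To make the constraint concrete, here is a toy model in user-space C. It
is an illustration only: the mixer is deliberately trivial and is NOT the
algorithm used by the random driver, and every name is hypothetical. The
point is the structure: the pool is statically allocated and fully
initialized at compile time, so mixing works before any init call, and the
late init adds data without resetting what is already there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_WORDS 4  /* hypothetical pool size */

/*
 * Statically allocated and fully initialized at compile time, so it is
 * usable before any init function has run.
 */
static struct {
    uint32_t words[POOL_WORDS];
    size_t pos;
} pool = { { 0 }, 0 };

/* Toy mixer (not the kernel's): rotate the word left by 7, XOR a byte in. */
void mix_bytes(const void *buf, size_t len)
{
    const unsigned char *p = buf;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        uint32_t w = pool.words[pool.pos];

        w = (w << 7) | (w >> 25);
        pool.words[pool.pos] = w ^ p[i];
        pool.pos = (pool.pos + 1) % POOL_WORDS;
    }
}

/* Late init: mixes in more data, and must not zero or rebuild the pool. */
void pool_init_late(void)
{
    static const char tag[] = "late-init";

    mix_bytes(tag, sizeof(tag));
}

/* Read-only accessor so callers can inspect the pool state. */
uint32_t pool_word(size_t i)
{
    return pool.words[i % POOL_WORDS];
}
```

A memset() of the pool inside pool_init_late() would silently throw away
whatever the early callers contributed, which is exactly what the comment
is meant to warn against.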

Suggested-by: Theodore Ts'o 
Signed-off-by: Tony Luck 
---

Theodore Ts'o wrote:
> I agree.  Want to send a revised patch with the comment, and I'll drop
> it into the random.git tree?

Additional patch rather than revised (since I'm touching different
subsystems: dmi and random).

 drivers/char/random.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 9793b40..1a2dfa8 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1087,6 +1087,16 @@ static void init_std_data(struct entropy_store *r)
mix_pool_bytes(r, utsname(), sizeof(*(utsname())), NULL);
 }

+/*
+ * Note that setup_arch() may call add_device_randomness()
+ * long before we get here. This allows seeding of the pools
+ * with some platform dependent data very early in the boot
+ * process. But it limits our options here. We must use
+ * statically allocated structures that already have all
+ * initializations complete at compile time. We should also
+ * take care not to overwrite the precious per platform data
+ * we were given.
+ */
 static int rand_initialize(void)
 {
init_std_data(&input_pool);
-- 
1.7.10.2.552.gaa3bb87


checkpatch should not complain about 'Suggested-by:'

2012-07-23 Thread Tony Luck

checkpatch just gave me:

   WARNING: Non-standard signature: Suggested-by:

There are over 500 instances of 'Suggested-by:', and it seems
to have some value in tracking history and awarding credit
where it is due.

"Reported-and-tested-by:" is also in regular use, but not
in the list of "standard" signatures.

-Tony

[PATCH 0/2] Fix machine check recovery for instruction fault on Sandy Bridge

2012-07-23 Thread Tony Luck

[Unchanged since last posted - except to add Boris' Acked-by
 since after further discussion his nitpick didn't warrant a
 change. Ready for x86/mce branch ... and if possible to
 move to Linus in this merge window]

This patch series adds a workaround for some strange
asymmetry between how machine checks are reported for
data and instruction fetches. For instruction fetch
error the processor does not set the EIPV bit in the
MCG_STATUS register on the affected processor, leading
us to believe that the cs/ip values saved on the stack
are not associated with the machine check ... which in
turn makes us unable to determine whether the machine
check was taken in kernel or user mode. The workaround
is to fake the presence of the EIPV bit for this error
on this processor model. Not pretty, but avoids having
to make special cases later in the code.
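In outline, the workaround looks like the following stand-alone sketch.
The struct layouts and the is_ifetch_error flag are hypothetical
simplifications (the flag stands in for the check that identifies this
particular error on this processor model), and the bit positions for RIPV
and EIPV are the ones I believe the SDM gives for MCG_STATUS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MCG_STATUS bits (positions assumed from the SDM). */
#define MCG_STATUS_RIPV (1ULL << 0)  /* restart IP valid */
#define MCG_STATUS_EIPV (1ULL << 1)  /* error IP valid */

struct mce_sketch {                  /* hypothetical cut-down record */
    uint64_t mcgstatus;
    uint64_t ip;
    uint64_t cs;
};

struct regs_sketch {                 /* hypothetical saved registers */
    uint64_t ip;
    uint64_t cs;
};

/* Returns true if the record was patched up to look like EIPV was set. */
bool fake_eipv(struct mce_sketch *m, const struct regs_sketch *regs,
               bool is_ifetch_error)
{
    assert(m && regs);
    if (!is_ifetch_error)
        return false;                /* not the error we work around */
    if (m->mcgstatus & (MCG_STATUS_EIPV | MCG_STATUS_RIPV))
        return false;                /* hardware already told us enough */

    m->mcgstatus |= MCG_STATUS_EIPV; /* pretend EIPV was reported */
    m->ip = regs->ip;                /* trust the values saved on the stack */
    m->cs = regs->cs;
    return true;
}

/* With cs trusted, its low two bits give the privilege level. */
bool in_user_mode(const struct mce_sketch *m)
{
    return (m->cs & 3) == 3;
}
```

Doing the fix-up once, early, means later code can keep testing EIPV and
cs in the usual way.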

Tony Luck (2):
  x86/mce: Move MCACOD defines from mce-severity.c to 
  x86/mce: Add quirk for instruction recovery on Sandy Bridge
processors

 arch/x86/include/asm/mce.h|  8 ++
 arch/x86/kernel/cpu/mcheck/mce-severity.c |  7 -
 arch/x86/kernel/cpu/mcheck/mce.c  | 43 ---
 3 files changed, 48 insertions(+), 10 deletions(-)

-- 
1.7.10.2.552.gaa3bb87


[PATCH 1/2] x86/mce: Move MCACOD defines from mce-severity.c to

2012-07-23 Thread Tony Luck

We will need some of these values in mce.c. Move them to the
appropriate header file so they are available.

Acked-by: Borislav Petkov 
Signed-off-by: Tony Luck 
---
 arch/x86/include/asm/mce.h| 8 
 arch/x86/kernel/cpu/mcheck/mce-severity.c | 7 ---
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 441520e..a3ac52b 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -33,6 +33,14 @@
 #define MCI_STATUS_PCC   (1ULL<<57)  /* processor context corrupt */
 #define MCI_STATUS_S     (1ULL<<56)  /* Signaled machine check */
 #define MCI_STATUS_AR    (1ULL<<55)  /* Action required */
+#define MCACOD           0xffff      /* MCA Error Code */
+
+/* Architecturally defined codes from SDM Vol. 3B Chapter 15 */
+#define MCACOD_SCRUB     0x00C0      /* 0xC0-0xCF Memory Scrubbing */
+#define MCACOD_SCRUBMSK  0xfff0
+#define MCACOD_L3WB      0x017A      /* L3 Explicit Writeback */
+#define MCACOD_DATA      0x0134      /* Data Load */
+#define MCACOD_INSTR     0x0150      /* Instruction Fetch */
 
 /* MCi_MISC register defines */
 #define MCI_MISC_ADDR_LSB(m)   ((m) & 0x3f)
diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index 413c2ce..1301762 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -55,13 +55,6 @@ static struct severity {
 #define MCI_UC_S (MCI_STATUS_UC|MCI_STATUS_S)
 #define MCI_UC_SAR (MCI_STATUS_UC|MCI_STATUS_S|MCI_STATUS_AR)
 #define MCI_ADDR (MCI_STATUS_ADDRV|MCI_STATUS_MISCV)
-#define MCACOD           0xffff
-/* Architecturally defined codes from SDM Vol. 3B Chapter 15 */
-#define MCACOD_SCRUB     0x00C0      /* 0xC0-0xCF Memory Scrubbing */
-#define MCACOD_SCRUBMSK  0xfff0
-#define MCACOD_L3WB      0x017A      /* L3 Explicit Writeback */
-#define MCACOD_DATA      0x0134      /* Data Load */
-#define MCACOD_INSTR     0x0150      /* Instruction Fetch */
 
MCESEV(
NO, "Invalid",
-- 
1.7.10.2.552.gaa3bb87


[PATCH 2/2] x86/mce: Add quirk for instruction recovery on Sandy Bridge processors

2012-07-23 Thread Tony Luck

Sandy Bridge processors follow the SDM (Vol 3B, Table 15-20) and set
both the RIPV and EIPV bits in the MCG_STATUS register to zero for
machine checks during instruction fetch. This is more than a little
counter-intuitive and means that Linux cannot recover from these
errors. Rather than insert special case code at several places in mce.c
and mce-severity.c, we pretend the EIPV bit was set for just this case
early in processing the machine check.

Acked-by: Borislav Petkov 
Signed-off-by: Tony Luck 
---
 arch/x86/kernel/cpu/mcheck/mce.c | 43 +---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index da27c5d..e65e738 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -102,6 +102,8 @@ DEFINE_PER_CPU(mce_banks_t, mce_poll_banks) = {
 
 static DEFINE_PER_CPU(struct work_struct, mce_work);
 
+static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs *regs);
+
 /*
  * CPU/chipset specific EDAC code can register a notifier call here to print
  * MCE errors in a human-readable form.
@@ -649,14 +651,18 @@ EXPORT_SYMBOL_GPL(machine_check_poll);
  * Do a quick check if any of the events requires a panic.
  * This decides if we keep the events around or clear them.
  */
-static int mce_no_way_out(struct mce *m, char **msg, unsigned long *validp)
+static int mce_no_way_out(struct mce *m, char **msg, unsigned long *validp,
+ struct pt_regs *regs)
 {
int i, ret = 0;
 
for (i = 0; i < banks; i++) {
m->status = mce_rdmsrl(MSR_IA32_MCx_STATUS(i));
-   if (m->status & MCI_STATUS_VAL)
+   if (m->status & MCI_STATUS_VAL) {
__set_bit(i, validp);
+   if (quirk_no_way_out)
+   quirk_no_way_out(i, m, regs);
+   }
if (mce_severity(m, tolerant, msg) >= MCE_PANIC_SEVERITY)
ret = 1;
}
@@ -1039,7 +1045,7 @@ void do_machine_check(struct pt_regs *regs, long error_code)
*final = m;
 
memset(valid_banks, 0, sizeof(valid_banks));
-   no_way_out = mce_no_way_out(&m, &msg, valid_banks);
+   no_way_out = mce_no_way_out(&m, &msg, valid_banks, regs);
 
barrier();
 
@@ -1415,6 +1421,34 @@ static void __mcheck_cpu_init_generic(void)
}
 }
 
+/*
+ * During IFU recovery Sandy Bridge -EP4S processors set the RIPV and
+ * EIPV bits in MCG_STATUS to zero on the affected logical processor (SDM
+ * Vol 3B Table 15-20). But this confuses both the code that determines
+ * whether the machine check occurred in kernel or user mode, and also
+ * the severity assessment code. Pretend that EIPV was set, and take the
+ * ip/cs values from the pt_regs that mce_gather_info() ignored earlier.
+ */
+static void quirk_sandybridge_ifu(int bank, struct mce *m, struct pt_regs *regs)
+{
+	if (bank != 0)
+		return;
+	if ((m->mcgstatus & (MCG_STATUS_EIPV|MCG_STATUS_RIPV)) != 0)
+		return;
+	if ((m->status & (MCI_STATUS_OVER|MCI_STATUS_UC|
+			  MCI_STATUS_EN|MCI_STATUS_MISCV|MCI_STATUS_ADDRV|
+			  MCI_STATUS_PCC|MCI_STATUS_S|MCI_STATUS_AR|
+			  MCACOD)) !=
+			 (MCI_STATUS_UC|MCI_STATUS_EN|
+			  MCI_STATUS_MISCV|MCI_STATUS_ADDRV|MCI_STATUS_S|
+			  MCI_STATUS_AR|MCACOD_INSTR))
+		return;
+
+	m->mcgstatus |= MCG_STATUS_EIPV;
+	m->ip = regs->ip;
+	m->cs = regs->cs;
+}
+
 /* Add per CPU specific workarounds here */
 static int __cpuinit __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
 {
@@ -1512,6 +1546,9 @@ static int __cpuinit __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
 */
if (c->x86 == 6 && c->x86_model <= 13 && mce_bootlog < 0)
mce_bootlog = 0;
+
+   if (c->x86 == 6 && c->x86_model == 45)
+   quirk_no_way_out = quirk_sandybridge_ifu;
}
if (monarch_timeout < 0)
monarch_timeout = 0;
-- 
1.7.10.2.552.gaa3bb87

