On Fri, Sep 25, 2015 at 10:29:01AM +0200, Borislav Petkov wrote:
> > >
> >
> > The last patch of that series had 2 changes.
> >
> > 1. Allow offline cpu's to participate in the rendezvous. Since in the odd
> > chance the offline cpus have any errors collected we can still report them.
> > (we ch
+ x...@kernel.org
On Thu, Sep 24, 2015 at 02:25:41PM -0700, Raj, Ashok wrote:
> Hi Boris
>
> I should have expanded on it..
>
> On Thu, Sep 24, 2015 at 11:07:33PM +0200, Borislav Petkov wrote:
> >
> > How are you ever going to call into those from an offlined CPU?!
> >
> > And that's easy:
> >
> > if (!cpu_online(cpu))
> > 	return;
> >
>
> The last patch of that series had 2 changes.
>
> 1. Allow offline cpu's to participate in the rendezvous. Since in the odd
> chance the offline cpus have any errors collected we can still report them.
On Thu, Sep 24, 2015 at 01:22:12PM -0700, Raj, Ashok wrote:
> Hi Boris
>
> On Thu, Sep 24, 2015 at 09:22:24PM +0200, Borislav Petkov wrote:
> >
> > Ah, we return. But we shouldn't return - we should overwrite. I believe
> > we've talked about the policy of overwriting old errors with new ones.
> >
>
> Another reason i had a separate buffer in my earlier patch was to avoid
> calling rcu() functions from the offline CPU. I had an offline discussion
> with Paul McKenney and he said don't do that...
>
> mce_gen_pool_add()->gen_pool_alloc
On Thu, Sep 24, 2015 at 07:00:46PM +0000, Luck, Tony wrote:
> If we get new ones logged in the meantime and userspace hasn't managed
> to consume and delete the present ones yet, we overwrite the oldest ones
> and set MCE_OVERFLOW like mce_log does now for mcelog. And that's no
> difference in functionality than what we have now.

U. No.
On Thu, Sep 24, 2015 at 06:44:25PM +0000, Luck, Tony wrote:
> Now that we have this shiny 2-pages sized lockless gen_pool, why are we
> still dealing with struct mce_log mcelog? Why can't we rip it out and
> kill it finally? And switch to the gen_pool?
>
> All code that reads from mcelog - /dev/mcelog chrdev - should switch to
> the lockless buffer and will
On Thu, Sep 24, 2015 at 01:48:38AM -0400, Ashok Raj wrote:
> MCE_LOG_LEN appears to be short for high core count parts. Especially when
> handling fatal errors, we don't clear MCE banks. Socket level MC banks
> are visible to all CPUs that share banks.
>
> Assuming an 18 core part, 2 threads per core, 2 banks per thread and a couple
> uncore MSRs. Rounding to 128 with some