On Thu, Jan 04, 2018 at 01:28:59PM +0100, Thomas Gleixner wrote:
> On Wed, 3 Jan 2018, Andy Lutomirski wrote:
> > Our memory map code is utter shite. This kind of bug should not be
> > possible without a giant warning at boot that something is screwed up.
>
> You're right it's utter shite and the
On Thu, 4 Jan 2018, Andy Lutomirski wrote:
> On Thu, Jan 4, 2018 at 4:28 AM, Thomas Gleixner wrote:
> > --- a/arch/x86/include/asm/pgtable_64_types.h
> > +++ b/arch/x86/include/asm/pgtable_64_types.h
> > @@ -88,7 +88,7 @@ typedef struct { pteval_t pte; } pte_t;
> > # define VMALLOC_SIZE_TB
On Thu, Jan 4, 2018 at 4:28 AM, Thomas Gleixner wrote:
> On Wed, 3 Jan 2018, Andy Lutomirski wrote:
>> On Wed, Jan 3, 2018 at 8:35 PM, Benjamin Gilbert
>> wrote:
>> > On Wed, Jan 03, 2018 at 04:37:53PM -0800, Andy Lutomirski wrote:
>> >> Maybe try rebuilding a bad kernel with free_ldt_pgtables()
On Wed, 3 Jan 2018, Andy Lutomirski wrote:
> On Wed, Jan 3, 2018 at 8:35 PM, Benjamin Gilbert
> wrote:
> > On Wed, Jan 03, 2018 at 04:37:53PM -0800, Andy Lutomirski wrote:
> >> Maybe try rebuilding a bad kernel with free_ldt_pgtables() modified
> >> to do nothing, and then read /sys/kernel/debug/pa
On Thu, Jan 04, 2018 at 08:20:31AM +0100, Ingo Molnar wrote:
>
> * Greg Kroah-Hartman wrote:
>
> > On Thu, Jan 04, 2018 at 08:14:21AM +0100, Ingo Molnar wrote:
> > > - (or it's something I missed to consider)
> >
> > It was an operator error, the issue is also on 4.15-rc6, see another
> > email
* Ingo Molnar wrote:
> These will cherry-pick cleanly, so it would be nice to test them on top of the
> -stable kernel that fails:
>
> for N in 450cbdd0125c 4d2dc2cc766c 1e0f25dbf246 be62a3204406 0c3292ca8025
> 9d0b62328d34; do git cherry-pick $N; done
>
> if this brute-force approac
* Greg Kroah-Hartman wrote:
> On Thu, Jan 04, 2018 at 08:14:21AM +0100, Ingo Molnar wrote:
> > - (or it's something I missed to consider)
>
> It was an operator error, the issue is also on 4.15-rc6, see another
> email in this thread :)
ah, ok :-)
Nevertheless it made sense to go through all
On Thu, Jan 04, 2018 at 08:14:21AM +0100, Ingo Molnar wrote:
> - (or it's something I missed to consider)
It was an operator error, the issue is also on 4.15-rc6, see another
email in this thread :)
* Thomas Gleixner wrote:
> On Wed, 3 Jan 2018, Benjamin Gilbert wrote:
>
> > On Wed, Jan 03, 2018 at 10:20:16AM +0100, Greg Kroah-Hartman wrote:
> > > Ick, not good, any chance you can test 4.15-rc6 to verify that the issue
> > > is also there (or not)?
> >
> > I haven't been able to reproduce
On Wed, Jan 3, 2018 at 8:35 PM, Benjamin Gilbert
wrote:
> On Wed, Jan 03, 2018 at 04:37:53PM -0800, Andy Lutomirski wrote:
>> Maybe try rebuilding a bad kernel with free_ldt_pgtables() modified
>> to do nothing, and then read /sys/kernel/debug/page_tables/current (or
>> current_kernel, or whatever
On Wed, Jan 03, 2018 at 05:37:42PM -0800, Benjamin Gilbert wrote:
> I was caught by the fact that 4.14.11 has PAGE_TABLE_ISOLATION default y
> but 4.15-rc6 doesn't. Retesting.
It turns out that 4.15-rc6 has the same problem as 4.14.11.
--Benjamin Gilbert
On Wed, Jan 03, 2018 at 04:37:53PM -0800, Andy Lutomirski wrote:
> Maybe try rebuilding a bad kernel with free_ldt_pgtables() modified
> to do nothing, and then read /sys/kernel/debug/page_tables/current (or
> current_kernel, or whatever it's called). The problem may be obvious.
current_kernel att
On Wed, Jan 03, 2018 at 04:33:03PM -0800, Benjamin Gilbert wrote:
> I haven't been able to reproduce this on 4.15-rc6.
This is bad data. I was caught by the fact that 4.14.11 has
PAGE_TABLE_ISOLATION default y but 4.15-rc6 doesn't. Retesting.
--Benjamin Gilbert
On Wed, Jan 03, 2018 at 04:27:04PM -0800, Andy Lutomirski wrote:
> How much memory does the affected system have? It sounds like something
> is mapped in the LDT region and is getting corrupted because the LDT code
> expects to own that region.
We've seen this on systems from 1 to 7 GB.
--Benjam
On Wed, 3 Jan 2018, Benjamin Gilbert wrote:
> On Wed, Jan 03, 2018 at 10:20:16AM +0100, Greg Kroah-Hartman wrote:
> > Ick, not good, any chance you can test 4.15-rc6 to verify that the issue
> > is also there (or not)?
>
> I haven't been able to reproduce this on 4.15-rc6.
Hmm. So we need to scr
> On Jan 3, 2018, at 4:33 PM, Benjamin Gilbert
> wrote:
>
>> On Wed, Jan 03, 2018 at 10:20:16AM +0100, Greg Kroah-Hartman wrote:
>> Ick, not good, any chance you can test 4.15-rc6 to verify that the issue
>> is also there (or not)?
>
> I haven't been able to reproduce this on 4.15-rc6.
Ah.
On Wed, Jan 03, 2018 at 10:20:16AM +0100, Greg Kroah-Hartman wrote:
> Ick, not good, any chance you can test 4.15-rc6 to verify that the issue
> is also there (or not)?
I haven't been able to reproduce this on 4.15-rc6.
--Benjamin Gilbert
> On Jan 3, 2018, at 2:58 PM, Thomas Gleixner wrote:
>
>
>
>> On Wed, 3 Jan 2018, Thomas Gleixner wrote:
>>
>>> On Wed, 3 Jan 2018, Benjamin Gilbert wrote:
On Wed, Jan 03, 2018 at 11:34:46PM +0100, Thomas Gleixner wrote:
Can you please send me your .config and a full dmesg ?
>>>
>
On Wed, 3 Jan 2018, Andy Lutomirski wrote:
> > On Jan 3, 2018, at 2:58 PM, Thomas Gleixner wrote:
> >> On Wed, 3 Jan 2018, Thomas Gleixner wrote:
> >>
> >>> On Wed, 3 Jan 2018, Benjamin Gilbert wrote:
> On Wed, Jan 03, 2018 at 11:34:46PM +0100, Thomas Gleixner wrote:
> Can you please se
> On Jan 3, 2018, at 2:58 PM, Thomas Gleixner wrote:
>
>
>
>> On Wed, 3 Jan 2018, Thomas Gleixner wrote:
>>
>>> On Wed, 3 Jan 2018, Benjamin Gilbert wrote:
On Wed, Jan 03, 2018 at 11:34:46PM +0100, Thomas Gleixner wrote:
Can you please send me your .config and a full dmesg ?
>>>
>
On Wed, 3 Jan 2018, Thomas Gleixner wrote:
> On Wed, 3 Jan 2018, Benjamin Gilbert wrote:
> > On Wed, Jan 03, 2018 at 11:34:46PM +0100, Thomas Gleixner wrote:
> > > Can you please send me your .config and a full dmesg ?
> >
> > I've attached a serial log from a local QEMU. I can rerun with a hi
On Wed, 3 Jan 2018, Benjamin Gilbert wrote:
> On Wed, Jan 03, 2018 at 11:34:46PM +0100, Thomas Gleixner wrote:
> > Can you please send me your .config and a full dmesg ?
>
> I've attached a serial log from a local QEMU. I can rerun with a higher
> loglevel if need be.
Thanks!
Cc'ing Andy who mi
On Wed, Jan 03, 2018 at 11:34:46PM +0100, Thomas Gleixner wrote:
> Can you please send me your .config and a full dmesg ?
I've attached a serial log from a local QEMU. I can rerun with a higher
loglevel if need be.
--Benjamin Gilbert
config-4.14.11.gz
Description: application/gzip
console.tx
On Wed, 3 Jan 2018, Benjamin Gilbert wrote:
> On Wed, Jan 03, 2018 at 04:48:33PM +0100, Ingo Molnar wrote:
> > first please test the latest WIP.x86/pti branch which has a couple of fixes.
>
> I'm still seeing the problem with that branch (3ffdeb1a02be, plus a couple
> of local patches which shoul
On Wed, Jan 03, 2018 at 04:48:33PM +0100, Ingo Molnar wrote:
> first please test the latest WIP.x86/pti branch which has a couple of fixes.
I'm still seeing the problem with that branch (3ffdeb1a02be, plus a couple
of local patches which shouldn't affect the resulting binary).
--Benjamin Gilbert
* Greg Kroah-Hartman wrote:
> On Wed, Jan 03, 2018 at 12:46:00AM -0800, Benjamin Gilbert wrote:
> > [resending with less web]
>
> (adding lkml and x86 developers)
>
> > Hi all,
> >
> > In our regression tests on kernel 4.14.11, we're occasionally seeing a run
> > of "bad pmd" messages during
On Wed, Jan 03, 2018 at 12:46:00AM -0800, Benjamin Gilbert wrote:
> [resending with less web]
(adding lkml and x86 developers)
> Hi all,
>
> In our regression tests on kernel 4.14.11, we're occasionally seeing a run
> of "bad pmd" messages during boot, followed by a "BUG: unable to handle
> kern