On 19.06.2013 10:06, Peter Zijlstra wrote:
>> On 19.06.2013 07:20, Linus Torvalds wrote:
>>> There's the fast_tlb race that Peter fixed in commit 29eb77825cc7
>>> ("arch, mm: Remove tlb_fast_mode()"). I'm not seeing how it would
>>> cause infinite TLB faults, but it definitely causes potentially [...]
On Wed, Jun 19, 2013 at 09:36:39AM +0200, Stanislav Meduna wrote:
> On 19.06.2013 07:20, Linus Torvalds wrote:
>
> >> No crash in 2 days running with preempt none...
> >
> > Is this UP?
>
> Yes it is.
>
> > There's the fast_tlb race that Peter fixed in commit 29eb77825cc7
> > ("arch, mm: Remove
On 19.06.2013 07:20, Linus Torvalds wrote:
>> No crash in 2 days running with preempt none...
>
> Is this UP?
Yes it is.
> There's the fast_tlb race that Peter fixed in commit 29eb77825cc7
> ("arch, mm: Remove tlb_fast_mode()"). I'm not seeing how it would
> cause infinite TLB faults, but it definitely causes potentially [...]
On Tue, Jun 18, 2013 at 9:13 AM, Stanislav Meduna wrote:
>
> No crash in 2 days running with preempt none...
Is this UP?
There's the fast_tlb race that Peter fixed in commit 29eb77825cc7
("arch, mm: Remove tlb_fast_mode()"). I'm not seeing how it would
cause infinite TLB faults, but it definitely causes potentially [...]
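
For reference, the helper that commit 29eb77825cc7 removed looked roughly like this (sketched from memory of include/asm-generic/tlb.h, so details may differ from the exact tree under test). On UP it was unconditionally "fast", which is the window being discussed:

static inline int tlb_fast_mode(struct mmu_gather *tlb)
{
#ifdef CONFIG_SMP
    return tlb->fast_mode;
#else
    /*
     * On UP this always returned 1, so unmapped pages were freed
     * immediately instead of being batched until after the TLB
     * flush; with preemption, another task can run in that window
     * and hit a stale TLB entry for an already-freed page.
     */
    return 1;
#endif
}
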
On 16.06.2013 23:34, Stanislav Meduna wrote:
> Right now a test with the same kernel with preempt none
> is running to see whether the problem also happens with this
> application there (due to the timing sensitivity only a positive
> result is significant).
No crash in 2 days running with preempt none...
Hi all,
I was able to reproduce the page fault problem with
a relatively simple application, for now on the
Geode platform. It can be downloaded at
http://www.meduna.org/tmp/PageFault.tar.gz
Basically the test application does:
- 4 threads that do nothing but periodically sleep
- 1 thread loo[...]
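
The description above is cut off; the real code is in PageFault.tar.gz. Purely as a guess at the general shape of such a reproducer (every name and parameter below is made up; build with -lpthread):

/*
 * Guessed shape of the test application: 4 threads that only sleep,
 * plus 1 thread that loops touching memory.  Illustrative only.
 */
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static void *sleeper(void *arg)
{
    for (;;)
        usleep(10 * 1000);          /* do nothing but periodically sleep */
    return NULL;
}

static void *worker(void *arg)
{
    char *buf = malloc(1 << 20);

    for (;;)
        memset(buf, 1, 1 << 20);    /* loop touching memory */
    return NULL;
}

int main(void)
{
    pthread_t t[5];
    int i;

    for (i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, sleeper, NULL);
    pthread_create(&t[4], NULL, worker, NULL);
    pthread_join(t[4], NULL);       /* runs until killed */
    return 0;
}
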
On 24.05.2013 15:55, Stanislav Meduna wrote:
>> Just to rule something out, are you using
>> transparent huge pages on those systems?
>
> On my present test system they are configured in, but I am
> not using them.
Ah, _transparent_ huge pages. No, that is not enabled.
--
On 24.05.2013 15:06, Rik van Riel wrote:
> Just to rule something out, are you using
> transparent huge pages on those systems?
On my present test system they are configured in, but I am
not using them.
# cat /proc/meminfo | grep Huge
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
[...]
On 05/24/2013 04:29 AM, Stanislav Meduna wrote:
> On 23.05.2013 14:19, Rik van Riel wrote:
>>>> static inline void __native_flush_tlb_single(unsigned long addr)
>>>> {
>>>>     __flush_tlb();
>>>> }
>>> I will give it some more testing time.
>> That is a good idea.
> Still no crash, so this one indeed seems to change things.
On 24.05.2013 10:29, Stanislav Meduna wrote:
>>>> static inline void __native_flush_tlb_single(unsigned long addr)
>>>> {
>>>>     __flush_tlb();
>>>> }
>>
>>> I will give it some more testing time.
>>
>> That is a good idea.
>
> Still no crash, so this one indeed seems to change things.
Ta[...]
On 23.05.2013 14:19, Rik van Riel wrote:
>>> static inline void __native_flush_tlb_single(unsigned long addr)
>>> {
>>> __flush_tlb();
>>> }
>
>> I will give it some more testing time.
>
> That is a good idea.
Still no crash, so this one indeed seems to change things.
If I understand [...]
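
For contrast with the test patch quoted above: the stock x86 code of this era flushes a single entry with invlpg, while __flush_tlb() reloads CR3 to drop all non-global entries (paraphrased from arch/x86/include/asm/tlbflush.h, so treat the details as approximate):

static inline void __native_flush_tlb_single(unsigned long addr)
{
    asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
}

static inline void __native_flush_tlb(void)
{
    /* reloading CR3 invalidates all non-global TLB entries */
    native_write_cr3(native_read_cr3());
}

So the test patch replaces the targeted invlpg with a full-TLB flush, which is why a disappearing crash points at single-entry invalidation on this CPU.
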
On 05/23/2013 10:36 AM, Steven Rostedt wrote:
> On Thu, 2013-05-23 at 10:24 -0700, H. Peter Anvin wrote:
>> On 05/23/2013 08:27 AM, Steven Rostedt wrote:
>>> On Thu, 2013-05-23 at 08:06 -0700, H. Peter Anvin wrote:
>>>
>>>> We don't even need the jump_label infrastructure -- we have
>>>> static_cpu_has*() which actually predates jump_label although it uses
>>>> the same underlying ideas.
On Thu, 2013-05-23 at 10:24 -0700, H. Peter Anvin wrote:
> On 05/23/2013 08:27 AM, Steven Rostedt wrote:
> > On Thu, 2013-05-23 at 08:06 -0700, H. Peter Anvin wrote:
> >
> >> We don't even need the jump_label infrastructure -- we have
> >> static_cpu_has*() which actually predates jump_label although it uses
> >> the same underlying ideas.
On 05/23/2013 08:27 AM, Steven Rostedt wrote:
> On Thu, 2013-05-23 at 08:06 -0700, H. Peter Anvin wrote:
>
>> We don't even need the jump_label infrastructure -- we have
>> static_cpu_has*() which actually predates jump_label although it uses
>> the same underlying ideas.
>
> Ah right. I wonder if it would be worth consolidating a lot of these
> "modifying [...]
On Thu, 2013-05-23 at 08:06 -0700, H. Peter Anvin wrote:
> We don't even need the jump_label infrastructure -- we have
> static_cpu_has*() which actually predates jump_label although it uses
> the same underlying ideas.
Ah right. I wonder if it would be worth consolidating a lot of these
"modifyi
On 05/23/2013 06:29 AM, Steven Rostedt wrote:
> On Thu, 2013-05-23 at 08:19 -0400, Rik van Riel wrote:
>
>> We can add a bit in the architecture bits that
>> we use to check against other CPU and system
>> errata, and conditionally flush the whole TLB
>> from __native_flush_tlb_single().
>
> If we find that some CPUs have issues and others do not, and w[...]
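
A sketch of what Rik and Steven are describing; X86_FEATURE_INVLPG_BUG is a made-up flag standing in for whatever erratum bit would be set from the CPU identification tables:

static inline void __native_flush_tlb_single(unsigned long addr)
{
    /* hypothetical erratum bit, set while identifying the CPU */
    if (static_cpu_has(X86_FEATURE_INVLPG_BUG))
        __flush_tlb();    /* full flush via CR3 reload */
    else
        asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
}

static_cpu_has() compiles to a jump that boot-time alternatives patching rewrites, so unaffected CPUs pay essentially nothing for the check.
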
On 23.05.2013 16:50, Linus Torvalds wrote:
> Another question: I'm assuming this is all 32-bit, is it with PAE
> enabled? That changes some of the TLB flushing, and we had one bug
> related to that, maybe there are others..
32 bit, no PAE.
--
Stano
On Thu, May 23, 2013 at 7:45 AM, Linus Torvalds wrote:
>
> Page faults that don't cause us to map a page (ie a spurious one, or
> one that just updates dirty/accessed bits) don't show up as even minor
> faults. Think of the major/minor as "mapping activity", not a page
> fault count.
Actually, I t[...]
On Thu, May 23, 2013 at 1:07 AM, Stanislav Meduna wrote:
>
> It did not crash overnight, but it also does not show any
> minor fault counted for the threads
Page faults that don't cause us to map a page (ie a spurious one, or
one that just updates dirty/accessed bits) don't show up as even minor
faults. Think of the major/minor as "mapping activity", not a page
fault count.
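
The counters being discussed are the per-task min_flt/maj_flt values (fields 10 and 12 of /proc/<pid>/stat, or per-thread under /proc/<pid>/task/<tid>/stat). From inside a test program they can also be read with getrusage(), e.g.:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rusage ru;

    /* ru_minflt/ru_majflt mirror the task's min_flt/maj_flt counters */
    getrusage(RUSAGE_SELF, &ru);
    printf("minor faults: %ld, major faults: %ld\n",
           ru.ru_minflt, ru.ru_majflt);
    return 0;
}

As Linus notes, a spurious fault that maps nothing bumps neither counter, so a quiet min_flt does not prove the fault path was idle.
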
On Thu, 2013-05-23 at 08:19 -0400, Rik van Riel wrote:
> We can add a bit in the architecture bits that
> we use to check against other CPU and system
> errata, and conditionally flush the whole TLB
> from __native_flush_tlb_single().
If we find that some CPUs have issues and others do not, and w[...]
On 05/23/2013 04:07 AM, Stanislav Meduna wrote:
> On 22.05.2013 20:43, Rik van Riel wrote:
>>> Some CPUs have had errata when it comes to flushing large pages that
>>> have been split into small pages by hardware, e.g. due to MTRR
>>> conflicts. In that case, fragments of the large page may have been left
>>> in the TLB.
On 22.05.2013 20:43, Rik van Riel wrote:
>> Some CPUs have had errata when it comes to flushing large pages that
>> have been split into small pages by hardware, e.g. due to MTRR
>> conflicts. In that case, fragments of the large page may have been left
>> in the TLB.
Can I somehow find if this [...]
On 22.05.2013 20:35, Rik van Riel wrote:
> I'm stumped.
>
> If the Geode knows how to flush single TLB entries, it
> should do that when flush_tlb_page is called.
>
> If it does not know, it should throw an invalid instruction
> exception, and not quietly complete the instruction without
> doing [...]
On 05/22/2013 02:42 PM, H. Peter Anvin wrote:
> On 05/22/2013 11:35 AM, Rik van Riel wrote:
>> On 05/22/2013 02:21 PM, Stanislav Meduna wrote:
>>> On 22.05.2013 20:11, Steven Rostedt wrote:
>>>> Did you apply both patches? Without the first one, this one is
>>>> meaningless.
>>> Sure.
>>> BTW, back when I tried to pinpoint it I also tried adding [...]
On 05/22/2013 11:35 AM, Rik van Riel wrote:
> On 05/22/2013 02:21 PM, Stanislav Meduna wrote:
>> On 22.05.2013 20:11, Steven Rostedt wrote:
>>
>>> Did you apply both patches? Without the first one, this one is
>>> meaningless.
>>
>> Sure.
>>
>> BTW, back when I tried to pinpoint it I also tried adding
>> flush_tlb_page(vma, address)
>> at the beginning of handle_pte_fault, which as I read should
>> be basically the [...]
On 05/22/2013 02:21 PM, Stanislav Meduna wrote:
> On 22.05.2013 20:11, Steven Rostedt wrote:
>> Did you apply both patches? Without the first one, this one is
>> meaningless.
> Sure.
> BTW, back when I tried to pinpoint it I also tried adding
> flush_tlb_page(vma, address)
> at the beginning of handle_pte_fault, which as I read should
> be basically the [...]
On 22.05.2013 20:11, Steven Rostedt wrote:
> Did you apply both patches? Without the first one, this one is
> meaningless.
Sure.
BTW, back when I tried to pinpoint it I also tried adding
flush_tlb_page(vma, address)
at the beginning of handle_pte_fault, which as I read should
be basically the [...]
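
That hack would look roughly like this against 3.x mm/memory.c (a debugging sketch, not a fix: it just forces the faulting address out of the TLB before the fault is processed; the exact signature varies by kernel version):

static int handle_pte_fault(struct mm_struct *mm, struct vm_area_struct *vma,
                            unsigned long address, pte_t *pte, pmd_t *pmd,
                            unsigned int flags)
{
    flush_tlb_page(vma, address);    /* added: pre-flush this address */

    /* ... unchanged handle_pte_fault body ... */
}
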
On Wed, 2013-05-22 at 20:04 +0200, Stanislav Meduna wrote:
> On 22.05.2013 19:41, Rik van Riel wrote:
>
> >> I think you should also remove the
> >>
> >> if (flags & FAULT_FLAG_WRITE)
>
> Done
>
> >>> Can you test the attached patch?
>
> Nope. Fails with the same symptoms, min_flt skyrockets,
> the throttler activates and after 2 seconds all is well
> again.
On 22.05.2013 19:41, Rik van Riel wrote:
>> I think you should also remove the
>>
>> if (flags & FAULT_FLAG_WRITE)
Done
>>> Can you test the attached patch?
Nope. Fails with the same symptoms, min_flt skyrockets,
the throttler activates and after 2 seconds all is well
again.
This is on [...]
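
The condition being removed sits in the spurious-fault path of handle_pte_fault(); roughly, in 3.x mm/memory.c (quoted from memory, so treat as approximate):

} else {
    /*
     * This is needed only for protection faults but the arch code
     * is not yet telling us if this is a protection fault or not.
     */
    if (flags & FAULT_FLAG_WRITE)    /* dropped for the test above */
        flush_tlb_fix_spurious_fault(vma, address);
}

The idea of the test was to flush on every spurious fault, not just write faults; since the symptoms persisted, the stale entry evidently was not being cured by that flush either.
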