Re: x86 ptep_get_and_clear question

2001-02-16 Thread Ben LaHaise
On Fri, 16 Feb 2001, Linus Torvalds wrote: > This is, actually, a problem that I suspect ends up being _very_ similar > to the zap_page_range() case. zap_page_range() needs to make sure that > everything has been updated by the time the page is actually free'd. While > filemap_sync() needs to

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Linus Torvalds
On Fri, 16 Feb 2001, Ben LaHaise wrote: > > Actually, in the filemap_sync case, the flush_tlb_page is redundant -- > there's already a call to flush_tlb_range in filemap_sync after the dirty > bits are cleared. This is not enough. If another CPU has started write-out of one of the dirty

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Ben LaHaise
On Fri, 16 Feb 2001, Manfred Spraul wrote: > That leaves msync() - it currently does a flush_tlb_page() for every > single dirty page. > Is it possible to integrate that into the mmu gather code? > > tlb_transfer_dirty() in addition to tlb_clear_page()? Actually, in the filemap_sync case, the

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Linus Torvalds
On Fri, 16 Feb 2001, Manfred Spraul wrote: > > That leaves msync() - it currently does a flush_tlb_page() for every > single dirty page. > Is it possible to integrate that into the mmu gather code? Not even necessary. The D bit does not have to be coherent. We need to make sure that we flush

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Manfred Spraul
Linus wrote: > > > > > That second pass is what I had in mind. > > > > > * munmap(file): No. Second pass required for correct msync behaviour. > > > > It is? > > Not now it isn't. We just do a msync() + fsync() for msync(MS_SYNC). Which > is admittedly not optimal, but it works. > Ok, munmap()

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Hugh Dickins
On Fri, 16 Feb 2001, Jamie Lokier wrote: > > > And check the Pentium III errata. There is one with the tlb > > that's only triggered if 4 instructions lie in a certain window and all > > access memory in the same way of the tlb (EFLAGS incorrect if 'andl > > mask,memory_addr' causes page fault)). > >

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Jamie Lokier
Manfred Spraul wrote: > A very simple test might be > > cpu 1: > cpu 2: Ben's test uses only one CPU. > Now start with variants: > change to read only instead of not present > a and b in the same way of the tlb, in a different way. > change pte with write, change with lock; > . > . > . > >

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Manfred Spraul
Jamie Lokier wrote: > > > > Ben, fancy writing a boot-time test? > > > > > I'd never rely on such a test - what if the cpu checks in 99% of the > > cases, but doesn't handle some cases ('rep movd, everything unaligned, > > ...'. > > A good point. The test results are inconclusive. > > > And

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Linus Torvalds
On Fri, 16 Feb 2001, Manfred Spraul wrote: > Jamie Lokier wrote: > > > > Linus Torvalds wrote: > > > So the only case that ends up being fairly heavy may be a case that is > > > very uncommon in practice (only for unmapping shared mappings in > > > threaded programs or the lazy TLB case). > >

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Ben LaHaise
On Fri, 16 Feb 2001, Linus Torvalds wrote: > How do you expect to ever see this in practice? Sounds basically > impossible to test for this hardware race. The obvious "try to dirty as > fast as possible on one CPU while doing an atomic get-and-clear on the > other" thing is not valid - it's in

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Linus Torvalds
On Fri, 16 Feb 2001, Ben LaHaise wrote: > On Fri, 16 Feb 2001, Jamie Lokier wrote: > > > It should be fast on known CPUs, correct on unknown ones, and much > > simpler than "gather" code which may be completely unnecessary and > > rather difficult to test. > > > > If anyone reports the

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Jamie Lokier
> > Ben, fancy writing a boot-time test? > > > I'd never rely on such a test - what if the cpu checks in 99% of the > cases, but doesn't handle some cases ('rep movd, everything unaligned, > ...'. A good point. The test results are inconclusive. > And check the Pentium III erratas. There is

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Linus Torvalds
On Fri, 16 Feb 2001, Jamie Lokier wrote: > Manfred Spraul wrote: > > Ok, Is there one case where your pragmatic solution is vastly faster? > > > * mprotect: No. The difference is at most one additional locked > > instruction for each pte. > > Oh, what instruction is that? The "set_pte()"

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Ben LaHaise
On Fri, 16 Feb 2001, Jamie Lokier wrote: > It should be fast on known CPUs, correct on unknown ones, and much > simpler than "gather" code which may be completely unnecessary and > rather difficult to test. > > If anyone reports the message, _then_ we think about the problem some more. > > Ben,

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Jamie Lokier
Manfred Spraul wrote: > Ok, Is there one case where your pragmatic solution is vastly faster? > * mprotect: No. The difference is at most one additional locked > instruction for each pte. Oh, what instruction is that? > * munmap(anon): No. We must handle delayed accesses anyway (don't call >

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Manfred Spraul
Jamie Lokier wrote: > > Manfred Spraul wrote: > > The other cpu writes the dirty bit - we just overwrite it ;-) > > After the ptep_get_and_clear(), before the set_pte(). > > Ah, I see. The other CPU does an atomic *pte |= _PAGE_DIRTY, without > checking the present bit. ('scuse me for

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Jamie Lokier
Manfred Spraul wrote: > The other cpu writes the dirty bit - we just overwrite it ;-) > After the ptep_get_and_clear(), before the set_pte(). Ah, I see. The other CPU does an atomic *pte |= _PAGE_DIRTY, without checking the present bit. ('scuse me for temporary brain failure). How about a
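The window described above can be replayed in user space. This is only a minimal sketch, assuming the other CPU sets the dirty bit with a blind locked OR, which is the behaviour the thread is worried about and not a confirmed property of any processor. Every sim_* name is hypothetical; the bit values mirror the i386 _PAGE_* flags, and the two CPUs are interleaved by hand in a single thread so the outcome is deterministic.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Bit values as on i386; everything named sim_* / SIM_* is hypothetical. */
#define SIM_PAGE_PRESENT 0x001u
#define SIM_PAGE_RW      0x002u
#define SIM_PAGE_DIRTY   0x040u

typedef _Atomic uint32_t sim_pte_t;

/* Stand-in for ptep_get_and_clear(): atomically fetch the entry and zero it. */
static uint32_t sim_get_and_clear(sim_pte_t *pte)
{
    return atomic_exchange(pte, 0u);
}

/* Stand-in for set_pte(): a plain store that overwrites the whole slot. */
static void sim_set_pte(sim_pte_t *pte, uint32_t val)
{
    atomic_store(pte, val);
}

/* ASSUMED hardware behaviour: OR in the dirty bit without rechecking present. */
static void sim_other_cpu_marks_dirty(sim_pte_t *pte)
{
    atomic_fetch_or(pte, SIM_PAGE_DIRTY);
}

/*
 * Replays the mprotect-style sequence with the other CPU's update landing
 * between the get-and-clear and the set_pte. Returns the final pte value.
 */
static uint32_t sim_mprotect_window(void)
{
    sim_pte_t pte = SIM_PAGE_PRESENT | SIM_PAGE_RW;  /* clean, writable */

    uint32_t entry = sim_get_and_clear(&pte);        /* CPU 1 */
    sim_other_cpu_marks_dirty(&pte);                 /* CPU 2, inside the window */
    sim_set_pte(&pte, entry & ~SIM_PAGE_RW);         /* CPU 1: new protection */

    return atomic_load(&pte);
}
```

Under that assumption sim_mprotect_window() returns an entry with the dirty bit clear: the store in sim_set_pte() overwrites whatever CPU 2 wrote into the cleared slot. If the hardware instead rechecks the present bit before setting dirty, CPU 2 would fault on the cleared entry and nothing would be lost.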

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Manfred Spraul
Jamie Lokier wrote: > > And how does that lose a dirty bit? > > For the other processor to not write a dirty bit, it must have a dirty ^^^ > TLB entry already which, along with the locked cycle in > ptep_get_and_clear, means that `entry' will have _PAGE_DIRTY

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Jamie Lokier
Manfred Spraul wrote: > > entry = ptep_get_and_clear(pte); > > set_pte(pte, pte_modify(entry, newprot)); > > > > I.e. the only code with the race condition is code which explicitly > > clears the dirty bit, in vmscan.c. > > > > Do you see any possibility of losing a dirty bit

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Manfred Spraul
Jamie Lokier wrote: > > /* mprotect.c */ > entry = ptep_get_and_clear(pte); > set_pte(pte, pte_modify(entry, newprot)); > > I.e. the only code with the race condition is code which explicitly > clears the dirty bit, in vmscan.c. > > Do you see any possibility of losing a dirty

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Jamie Lokier
Manfred Spraul wrote: > > I can think of one case where performance is considered quite important: > > mprotect() is used by several garbage collectors, including threaded > > ones. Maybe mprotect() isn't the best primitive for those anyway, but > > it's what they have to work with atm. > >

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Manfred Spraul
Jamie Lokier wrote: > > Linus Torvalds wrote: > > So the only case that ends up being fairly heavy may be a case that is > > very uncommon in practice (only for unmapping shared mappings in > > threaded programs or the lazy TLB case). > The lazy tlb case is quite fast: lazy tlb thread never

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Jamie Lokier
Linus Torvalds wrote: > So the only case that ends up being fairly heavy may be a case that is > very uncommon in practice (only for unmapping shared mappings in > threaded programs or the lazy TLB case). I can think of one case where performance is considered quite important: mprotect() is used

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Jamie Lokier
Linus Torvalds wrote: So the only case that ends up being fairly heavy may be a case that is very uncommon in practice (only for unmapping shared mappings in threaded programs or the lazy TLB case). I can think of one case where performance is considered quite important: mprotect() is used by

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Manfred Spraul
Jamie Lokier wrote: Linus Torvalds wrote: So the only case that ends up being fairly heavy may be a case that is very uncommon in practice (only for unmapping shared mappings in threaded programs or the lazy TLB case). The lazy tlb case is quite fast: lazy tlb thread never write to user

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Jamie Lokier
Manfred Spraul wrote: I can think of one case where performance is considered quite important: mprotect() is used by several garbage collectors, including threaded ones. Maybe mprotect() isn't the best primitive for those anyway, but it's what they have to work with atm. Does

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Manfred Spraul
Jamie Lokier wrote: /* mprotect.c */ entry = ptep_get_and_clear(pte); set_pte(pte, pte_modify(entry, newprot)); I.e. the only code with the race condition is code which explicitly clears the dirty bit, in vmscan.c. Do you see any possibility of losing a dirty bit here?

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Jamie Lokier
Manfred Spraul wrote: entry = ptep_get_and_clear(pte); set_pte(pte, pte_modify(entry, newprot)); I.e. the only code with the race condition is code which explicitly clears the dirty bit, in vmscan.c. Do you see any possibility of losing a dirty bit here? Of

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Jamie Lokier
Manfred Spraul wrote: The other cpu writes the dirty bit - we just overwrite it ;-) After the ptep_get_and_clear(), before the set_pte(). Ah, I see. The other CPU does an atomic *pte |= _PAGE_DIRTY, without checking the present bit. ('scuse me for temporary brain failure). How about a

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Manfred Spraul
Jamie Lokier wrote: Manfred Spraul wrote: The other cpu writes the dirty bit - we just overwrite it ;-) After the ptep_get_and_clear(), before the set_pte(). Ah, I see. The other CPU does an atomic *pte |= _PAGE_DIRTY, without checking the present bit. ('scuse me for temporary brain

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Jamie Lokier
Manfred Spraul wrote: Ok, Is there one case where your pragmatic solution is vastly faster? * mprotect: No. The difference is at most one additional locked instruction for each pte. Oh, what instruction is that? * munmap(anon): No. We must handle delayed accesses anyway (don't call

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Ben LaHaise
On Fri, 16 Feb 2001, Jamie Lokier wrote: It should be fast on known CPUs, correct on unknown ones, and much simpler than "gather" code which may be completely unnecessary and rather difficult to test. If anyone reports the message, _then_ we think about the problem some more. Ben, fancy

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Linus Torvalds
On Fri, 16 Feb 2001, Jamie Lokier wrote: Manfred Spraul wrote: Ok, Is there one case where your pragmatic solution is vastly faster? * mprotect: No. The difference is at most one additional locked instruction for each pte. Oh, what instruction is that? The "set_pte()" thing could

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Jamie Lokier
Ben, fancy writing a boot-time test? I'd never rely on such a test - what if the cpu checks in 99% of the cases, but doesn't handle some cases ('rep movd, everything unaligned, ...'). A good point. The test results are inconclusive. And check the Pentium III errata. There is one with

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Linus Torvalds
On Fri, 16 Feb 2001, Ben LaHaise wrote: On Fri, 16 Feb 2001, Jamie Lokier wrote: It should be fast on known CPUs, correct on unknown ones, and much simpler than "gather" code which may be completely unnecessary and rather difficult to test. If anyone reports the message, _then_ we

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Ben LaHaise
On Fri, 16 Feb 2001, Linus Torvalds wrote: How do you expect to ever see this in practice? Sounds basically impossible to test for this hardware race. The obvious "try to dirty as fast as possible on one CPU while doing an atomic get-and-clear on the other" thing is not valid - it's in fact

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Linus Torvalds
On Fri, 16 Feb 2001, Manfred Spraul wrote: Jamie Lokier wrote: Linus Torvalds wrote: So the only case that ends up being fairly heavy may be a case that is very uncommon in practice (only for unmapping shared mappings in threaded programs or the lazy TLB case). The lazy tlb

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Manfred Spraul
Jamie Lokier wrote: Ben, fancy writing a boot-time test? I'd never rely on such a test - what if the cpu checks in 99% of the cases, but doesn't handle some cases ('rep movd, everything unaligned, ...'. A good point. The test results are inconclusive. And check the Pentium

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Jamie Lokier
Manfred Spraul wrote: A very simple test might be cpu 1: cpu 2: Ben's test uses only one CPU. Now start with variants: change to read only instead of not present a and b in the same way of the tlb, in a different way. change pte with write, change with lock; . . . But you'll

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Hugh Dickins
On Fri, 16 Feb 2001, Jamie Lokier wrote: And check the Pentium III errata. There is one with the tlb that's only triggered if 4 instructions lie in a certain window and all access memory in the same way of the tlb (EFLAGS incorrect if 'andl mask,memory_addr' causes page fault)).

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Manfred Spraul
Linus wrote: That second pass is what I had in mind. * munmap(file): No. Second pass required for correct msync behaviour. It is? Not now it isn't. We just do a msync() + fsync() for msync(MS_SYNC). Which is admittedly not optimal, but it works. Ok, munmap() will be fixed by

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Ben LaHaise
On Fri, 16 Feb 2001, Manfred Spraul wrote: That leaves msync() - it currently does a flush_tlb_page() for every single dirty page. Is it possible to integrate that into the mmu gather code? tlb_transfer_dirty() in addition to tlb_clear_page()? Actually, in the filemap_sync case, the

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Linus Torvalds
On Fri, 16 Feb 2001, Manfred Spraul wrote: That leaves msync() - it currently does a flush_tlb_page() for every single dirty page. Is it possible to integrate that into the mmu gather code? Not even necessary. The D bit does not have to be coherent. We need to make sure that we flush the

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Linus Torvalds
On Fri, 16 Feb 2001, Ben LaHaise wrote: Actually, in the filemap_sync case, the flush_tlb_page is redundant -- there's already a call to flush_tlb_range in filemap_sync after the dirty bits are cleared. This is not enough. If another CPU has started write-out of one of the dirty pages

Re: x86 ptep_get_and_clear question

2001-02-16 Thread Ben LaHaise
On Fri, 16 Feb 2001, Linus Torvalds wrote: This is, actually, a problem that I suspect ends up being _very_ similar to the zap_page_range() case. zap_page_range() needs to make sure that everything has been updated by the time the page is actually free'd. While filemap_sync() needs to make

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Linus Torvalds
On Thu, 15 Feb 2001, Manfred Spraul wrote: > > > Now, I will agree that I suspect most x86 _implementations_ will not do > > this. TLB's are too timing-critical, and nobody tends to want to make > > them bigger than necessary - so saving off the source address is > > unlikely. Also, setting

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Linus Torvalds
On Fri, 16 Feb 2001, Jamie Lokier wrote: > > If you want to take it really far, it _could_ be that the TLB data > contains both the pointer and the original pte contents. Then "mark > dirty" becomes > >val |= D >write *ptr No. This is forbidden by the intel documentation.

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Jamie Lokier
Linus Torvalds wrote: > It _could_ be that the TLB data actually also contains the pointer to > the place where it was fetched, and a "mark dirty" becomes > > read *ptr locked > val |= D > write *ptr unlock If you want to take it really far, it _could_ be that the TLB data

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Manfred Spraul
Manfred Spraul wrote: > > I just benchmarked a single flush_tlb_page(). > > Pentium II 350: ~ 2000 cpu ticks. > Pentium III 850: ~ 3000 cpu ticks. > I forgot the important part: SMP, including a smp_call_function() IPI. IIRC Ingo wrote that a local 'invlpg' is around 100 ticks. --

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Manfred Spraul
Linus Torvalds wrote: > > In article <[EMAIL PROTECTED]>, > Jamie Lokier <[EMAIL PROTECTED]> wrote: > >> > << lock; > >> > read pte > >> > if (!present(pte)) > >> >do_page_fault(); > >> > pte |= dirty > >> > write pte. > >> > >> end lock; > >> > >> No, it is a little more complicated. You

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Linus Torvalds
In article <[EMAIL PROTECTED]>, Jamie Lokier <[EMAIL PROTECTED]> wrote: >> > << lock; >> > read pte >> > if (!present(pte)) >> >do_page_fault(); >> > pte |= dirty >> > write pte. >> > >> end lock; >> >> No, it is a little more complicated. You also have to include in the >> tlb state into

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Linus Torvalds
In article <[EMAIL PROTECTED]>, Kanoj Sarcar <[EMAIL PROTECTED]> wrote: >> >> Will you please go off and prove that this "problem" exists on some x86 >> processor before continuing this rant? None of the PII, PIII, Athlon, > >And will you please stop behaving like this is not an issue? This

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Kanoj Sarcar
> > On Thu, 15 Feb 2001, Kanoj Sarcar wrote: > > > No. All architectures do not have this problem. For example, if the > > Linux "dirty" (not the pte dirty) bit is managed by software, a fault > > will actually be taken when processor 2 tries to do the write. The fault > > is solely to make

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Jamie Lokier
Kanoj Sarcar wrote: > > Is the sequence > > << lock; > > read pte > > pte |= dirty > > write pte > > >> end lock; > > or > > << lock; > > read pte > > if (!present(pte)) > > do_page_fault(); > > pte |= dirty > > write pte. > > >> end lock; > > No, it is a little more complicated. You also

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Kanoj Sarcar
> > Kanoj Sarcar wrote: > > > > Okay, I will quote from Intel Architecture Software Developer's Manual > > Volume 3: System Programming Guide (1997 print), section 3.7, page 3-27: > > > > "Bus cycles to the page directory and page tables in memory are performed > > only when the TLBs do not

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Jamie Lokier
Manfred Spraul wrote: > Is the sequence > << lock; > read pte > pte |= dirty > write pte > >> end lock; > or > << lock; > read pte > if (!present(pte)) > do_page_fault(); > pte |= dirty > write pte. > >> end lock; or more generally << lock; read pte if (!present(pte) || !writable(pte))
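The candidate sequences quoted above can be written out as atomic operations on a user-space stand-in for a pte. This is a sketch for comparison only: it models what the hardware might do, it is not taken from any Intel documentation, and every model_* / MODEL_* name is hypothetical. Bit values mirror the i386 _PAGE_* flags.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PRESENT 0x001u
#define MODEL_RW      0x002u
#define MODEL_DIRTY   0x040u

typedef _Atomic uint32_t model_pte_t;

/* First sequence: locked read, pte |= dirty, write. No checks at all. */
static bool model_mark_dirty_unchecked(model_pte_t *pte)
{
    atomic_fetch_or(pte, MODEL_DIRTY);
    return true;                        /* never faults */
}

/*
 * More general sequence: locked read, recheck present and writable,
 * then set dirty. Returns false where the model would take a page fault.
 */
static bool model_mark_dirty_checked(model_pte_t *pte)
{
    uint32_t old = atomic_load(pte);

    do {
        if (!(old & MODEL_PRESENT) || !(old & MODEL_RW))
            return false;               /* do_page_fault() in the pseudocode */
        /* On failure the compare-exchange reloads 'old' and we recheck. */
    } while (!atomic_compare_exchange_weak(pte, &old, old | MODEL_DIRTY));

    return true;
}
```

Applied to an entry that another CPU has just cleared, the unchecked variant leaves a stray dirty bit in a non-present entry, while the checked variant reports a fault and leaves the entry untouched. Which of these real processors implement is exactly what the thread is trying to establish.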

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Ben LaHaise
On Thu, 15 Feb 2001, Kanoj Sarcar wrote: > No. All architectures do not have this problem. For example, if the > Linux "dirty" (not the pte dirty) bit is managed by software, a fault > will actually be taken when processor 2 tries to do the write. The fault > is solely to make sure that the

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Kanoj Sarcar
> > Kanoj Sarcar wrote: > > > Here's the important part: when processor 2 wants to set the pte's dirty > > > bit, it *rereads* the pte and *rechecks* the permission bits again. > > > Even though it has a non-dirty TLB entry for that pte. > > > > > > That is how I read Ben LaHaise's description,

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Manfred Spraul
Kanoj Sarcar wrote: > > Okay, I will quote from Intel Architecture Software Developer's Manual > Volume 3: System Programming Guide (1997 print), section 3.7, page 3-27: > > "Bus cycles to the page directory and page tables in memory are performed > only when the TLBs do not contain the

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Jamie Lokier
Kanoj Sarcar wrote: > > Here's the important part: when processor 2 wants to set the pte's dirty > > bit, it *rereads* the pte and *rechecks* the permission bits again. > > Even though it has a non-dirty TLB entry for that pte. > > > > That is how I read Ben LaHaise's description, and his test

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Kanoj Sarcar
> > [Added Linus and linux-kernel as I think it's of general interest] > > Kanoj Sarcar wrote: > > Whether Jamie was trying to illustrate a different problem, I am not > > sure. > > Yes, I was talking about pte_test_and_clear_dirty in the earlier post. > > > Look in mm/mprotect.c. Look at the

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Kanoj Sarcar
> > [Added Linus and linux-kernel as I think it's of general interest] > > Kanoj Sarcar wrote: > > Whether Jamie was trying to illustrate a different problem, I am not > > sure. > > Yes, I was talking about pte_test_and_clear_dirty in the earlier post. > > > Look in mm/mprotect.c. Look at the

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Jamie Lokier
[Added Linus and linux-kernel as I think it's of general interest] Kanoj Sarcar wrote: > Whether Jamie was trying to illustrate a different problem, I am not > sure. Yes, I was talking about pte_test_and_clear_dirty in the earlier post. > Look in mm/mprotect.c. Look at the call sequence

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Jamie Lokier
[Added Linus and linux-kernel as I think it's of general interest] Kanoj Sarcar wrote: Whether Jamie was trying to illustrate a different problem, I am not sure. Yes, I was talking about pte_test_and_clear_dirty in the earlier post. Look in mm/mprotect.c. Look at the call sequence

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Kanoj Sarcar
[Added Linus and linux-kernel as I think it's of general interest] Kanoj Sarcar wrote: Whether Jamie was trying to illustrate a different problem, I am not sure. Yes, I was talking about pte_test_and_clear_dirty in the earlier post. Look in mm/mprotect.c. Look at the call

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Kanoj Sarcar
[Added Linus and linux-kernel as I think it's of general interest] Kanoj Sarcar wrote: Whether Jamie was trying to illustrate a different problem, I am not sure. Yes, I was talking about pte_test_and_clear_dirty in the earlier post. Look in mm/mprotect.c. Look at the call

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Jamie Lokier
Kanoj Sarcar wrote: Here's the important part: when processor 2 wants to set the pte's dirty bit, it *rereads* the pte and *rechecks* the permission bits again. Even though it has a non-dirty TLB entry for that pte. That is how I read Ben LaHaise's description, and his test program

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Manfred Spraul
Kanoj Sarcar wrote: Okay, I will quote from Intel Architecture Software Developer's Manual Volume 3: System Programming Guide (1997 print), section 3.7, page 3-27: "Bus cycles to the page directory and page tables in memory are performed only when the TLBs do not contain the translation

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Kanoj Sarcar
Kanoj Sarcar wrote: Here's the important part: when processor 2 wants to set the pte's dirty bit, it *rereads* the pte and *rechecks* the permission bits again. Even though it has a non-dirty TLB entry for that pte. That is how I read Ben LaHaise's description, and his test

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Jamie Lokier
Manfred Spraul wrote: Is the sequence << lock; read pte pte |= dirty write pte >> end lock; or << lock; read pte if (!present(pte)) do_page_fault(); pte |= dirty write pte. >> end lock; or more generally << lock; read pte if (!present(pte) || !writable(pte)) do_page_fault();

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Ben LaHaise
On Thu, 15 Feb 2001, Kanoj Sarcar wrote: No. All architectures do not have this problem. For example, if the Linux "dirty" (not the pte dirty) bit is managed by software, a fault will actually be taken when processor 2 tries to do the write. The fault is solely to make sure that the Linux

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Kanoj Sarcar
Kanoj Sarcar wrote: Okay, I will quote from Intel Architecture Software Developer's Manual Volume 3: System Programming Guide (1997 print), section 3.7, page 3-27: "Bus cycles to the page directory and page tables in memory are performed only when the TLBs do not contain the

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Kanoj Sarcar
On Thu, 15 Feb 2001, Kanoj Sarcar wrote: No. All architectures do not have this problem. For example, if the Linux "dirty" (not the pte dirty) bit is managed by software, a fault will actually be taken when processor 2 tries to do the write. The fault is solely to make sure that the

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Jamie Lokier
Kanoj Sarcar wrote: Is the sequence << lock; read pte pte |= dirty write pte >> end lock; or << lock; read pte if (!present(pte)) do_page_fault(); pte |= dirty write pte. >> end lock; No, it is a little more complicated. You also have to include in the tlb state into

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Linus Torvalds
In article <[EMAIL PROTECTED]>, Kanoj Sarcar <[EMAIL PROTECTED]> wrote: Will you please go off and prove that this "problem" exists on some x86 processor before continuing this rant? None of the PII, PIII, Athlon, And will you please stop behaving like this is not an issue? This is

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Linus Torvalds
In article <[EMAIL PROTECTED]>, Jamie Lokier <[EMAIL PROTECTED]> wrote: << lock; read pte if (!present(pte)) do_page_fault(); pte |= dirty write pte. >> end lock; No, it is a little more complicated. You also have to include in the tlb state into this algorithm. Since that is what

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Manfred Spraul
Linus Torvalds wrote: In article <[EMAIL PROTECTED]>, Jamie Lokier <[EMAIL PROTECTED]> wrote: << lock; read pte if (!present(pte)) do_page_fault(); pte |= dirty write pte. >> end lock; No, it is a little more complicated. You also have to include in the tlb state into

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Manfred Spraul
Manfred Spraul wrote: I just benchmarked a single flush_tlb_page(). Pentium II 350: ~ 2000 cpu ticks. Pentium III 850: ~ 3000 cpu ticks. I forgot the important part: SMP, including a smp_call_function() IPI. IIRC Ingo wrote that a local 'invlpg' is around 100 ticks. -- Manfred -

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Jamie Lokier
Linus Torvalds wrote: It _could_ be that the TLB data actually also contains the pointer to the place where it was fetched, and a "mark dirty" becomes read *ptr locked val |= D write *ptr unlock If you want to take it really far, it _could_ be that the TLB data contains

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Linus Torvalds
On Fri, 16 Feb 2001, Jamie Lokier wrote: If you want to take it really far, it _could_ be that the TLB data contains both the pointer and the original pte contents. Then "mark dirty" becomes val |= D write *ptr No. This is forbidden by the intel documentation. First

Re: x86 ptep_get_and_clear question

2001-02-15 Thread Linus Torvalds
On Thu, 15 Feb 2001, Manfred Spraul wrote: Now, I will agree that I suspect most x86 _implementations_ will not do this. TLB's are too timing-critical, and nobody tends to want to make them bigger than necessary - so saving off the source address is unlikely. Also, setting the D bit