Re: [PATCH] mm: PageLRU can be non-atomic bit operation
> This seems to imply that we need the lock prefix even when updating > different bits within the same long (not just the same byte). Is this > interpretation correct? Yes, that is what I said. The consistency applies to loads/stores (so e.g. for 128bit SSE2 load/stores too), not bytes. The current IA32 manuals have a more detailed description of the rules in volume 3 -Andi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
On Tue, 2007-04-24 at 16:22 +0200, Andi Kleen wrote: > > Why would you need any kind of lock when just changing a single bit, > > if it didn't affect other bits of the same word? Just as you don't > > need a lock when simply assigning a word, setting a bit to 0 or 1 > > is simple in itself (it doesn't matter if it was 0 or 1 before). > > > > > But, I think that concurrent bit operation on different bits > > > is just like OR operation , so lock prefix is not needed. > > > > I firmly believe that it is; but I'm not a processor expert. > > Think of the CPU cache like the page cache. The VFS cannot change > anything directly on disk - it always has to read a page (or block); > change it there even if it's only a single bit and back. > > Now imagine multiple independent kernels to that disk doing this in parallel > without any locking. You will lose data on simple changes because > of data races. > > The CPU is similar. The memory is the disk; the protocol talking to > the memory only knows cache lines. There are multiple CPUs talking > to that memory. The CPUs cannot change anything > without reading a full cache line first and then later writing it back. > There is "just do OR operation" when talking to memory, just like > disks don't have such a operation; just read and write. > > [there are some special cases with uncached mappings, but they are not > relevant here] > > With lock the CPU ensures the read-modify-write cycle is atomic, > without it doesn't. > > The CPU also guarantees you that multiple writes don't get lost > even when they hit the same cache line (otherwise you would need lock > for anything in the same cache line), but it doesn't guarantee that > for a word. The details for that vary by architecture; on x86 it is > any memory access, on others it can be only guaranteed for long stores. > > The exact rules for all this can be quite complicated (and also vary > by architecture), this is a simplification but should be enough as a rough > guide. Intel 80386 Reference Programmer's Manual reads: " When accessing a bit in memory, the processor may access four bytes starting from the memory address given by: Effective Address + (4 * (BitOffset DIV 32)) for a 32-bit operand size, or two bytes starting from the memory address given by: Effective Address + (2 * (BitOffset DIV 16)) for a 16-bit operand size. It may do this even when only a single byte needs to be accessed in order to get at the given bit. " This seems to imply that we need the lock prefix even when updating different bits within the same long (not just the same byte). Is this interpretation correct? - Fernando - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
On Tue, 2007-04-24 at 16:22 +0200, Andi Kleen wrote: Why would you need any kind of lock when just changing a single bit, if it didn't affect other bits of the same word? Just as you don't need a lock when simply assigning a word, setting a bit to 0 or 1 is simple in itself (it doesn't matter if it was 0 or 1 before). But, I think that concurrent bit operation on different bits is just like OR operation , so lock prefix is not needed. I firmly believe that it is; but I'm not a processor expert. Think of the CPU cache like the page cache. The VFS cannot change anything directly on disk - it always has to read a page (or block); change it there even if it's only a single bit and back. Now imagine multiple independent kernels to that disk doing this in parallel without any locking. You will lose data on simple changes because of data races. The CPU is similar. The memory is the disk; the protocol talking to the memory only knows cache lines. There are multiple CPUs talking to that memory. The CPUs cannot change anything without reading a full cache line first and then later writing it back. There is just do OR operation when talking to memory, just like disks don't have such a operation; just read and write. [there are some special cases with uncached mappings, but they are not relevant here] With lock the CPU ensures the read-modify-write cycle is atomic, without it doesn't. The CPU also guarantees you that multiple writes don't get lost even when they hit the same cache line (otherwise you would need lock for anything in the same cache line), but it doesn't guarantee that for a word. The details for that vary by architecture; on x86 it is any memory access, on others it can be only guaranteed for long stores. The exact rules for all this can be quite complicated (and also vary by architecture), this is a simplification but should be enough as a rough guide. Intel 80386 Reference Programmer's Manual reads: When accessing a bit in memory, the processor may access four bytes starting from the memory address given by: Effective Address + (4 * (BitOffset DIV 32)) for a 32-bit operand size, or two bytes starting from the memory address given by: Effective Address + (2 * (BitOffset DIV 16)) for a 16-bit operand size. It may do this even when only a single byte needs to be accessed in order to get at the given bit. This seems to imply that we need the lock prefix even when updating different bits within the same long (not just the same byte). Is this interpretation correct? - Fernando - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
This seems to imply that we need the lock prefix even when updating different bits within the same long (not just the same byte). Is this interpretation correct? Yes, that is what I said. The consistency applies to loads/stores (so e.g. for 128bit SSE2 load/stores too), not bytes. The current IA32 manuals have a more detailed description of the rules in volume 3 -Andi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
> Why would you need any kind of lock when just changing a single bit, > if it didn't affect other bits of the same word? Just as you don't > need a lock when simply assigning a word, setting a bit to 0 or 1 > is simple in itself (it doesn't matter if it was 0 or 1 before). > > > But, I think that concurrent bit operation on different bits > > is just like OR operation , so lock prefix is not needed. > > I firmly believe that it is; but I'm not a processor expert. Think of the CPU cache like the page cache. The VFS cannot change anything directly on disk - it always has to read a page (or block); change it there even if it's only a single bit and back. Now imagine multiple independent kernels to that disk doing this in parallel without any locking. You will lose data on simple changes because of data races. The CPU is similar. The memory is the disk; the protocol talking to the memory only knows cache lines. There are multiple CPUs talking to that memory. The CPUs cannot change anything without reading a full cache line first and then later writing it back. There is "just do OR operation" when talking to memory, just like disks don't have such a operation; just read and write. [there are some special cases with uncached mappings, but they are not relevant here] With lock the CPU ensures the read-modify-write cycle is atomic, without it doesn't. The CPU also guarantees you that multiple writes don't get lost even when they hit the same cache line (otherwise you would need lock for anything in the same cache line), but it doesn't guarantee that for a word. The details for that vary by architecture; on x86 it is any memory access, on others it can be only guaranteed for long stores. The exact rules for all this can be quite complicated (and also vary by architecture), this is a simplification but should be enough as a rough guide. Hope this helps, -Andi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
On Tue, 24 Apr 2007, Hisashi Hifumi wrote: > At 22:42 07/04/23, Hugh Dickins wrote: > >On Mon, 23 Apr 2007, Hisashi Hifumi wrote: > > > >No. The PG_lru flag bit is just one bit amongst many others: > > > >what of concurrent operations changing other bits in that same > > > >unsigned long e.g. trying to lock the page by setting PG_locked? > > > >There are some places where such micro-optimizations can be made > > > >(typically while first allocating the page); but in general, no. > > > > > > In i386 and x86_64, btsl is used to change page flag. In this case, > > > if btsl without lock prefix > > > set PG_locked and PG_lru flag concurrently, does only one operation > > > succeed ? > > > >That's right: on an SMP machine, without the lock prefix, the operation > >is no longer atomic: what's stored back may be missing the result of > >one or the other of the racing operations. > > In the case that changing the same bit concurrently, lock prefix or other > spinlock is needed. Why would you need any kind of lock when just changing a single bit, if it didn't affect other bits of the same word? Just as you don't need a lock when simply assigning a word, setting a bit to 0 or 1 is simple in itself (it doesn't matter if it was 0 or 1 before). > But, I think that concurrent bit operation on different bits > is just like OR operation , so lock prefix is not needed. I firmly believe that it is; but I'm not a processor expert. > AMD instruction manual says about bts that , > > "Copies a bit, specified by bit index in a register or 8-bit immediate value > (second operand), from a bit > string (first operand), also called the bit base, to the carry flag (CF) of > the rFLAGS register, and then > sets the bit in the bit string to 1." > > BTS instruction is read-modify-write instruction on bit unit. So concurrent > bit operation on different bits may be possible. read-modify-write indeed. That's exactly what the "lock" prefix is needed for, isn't it? To lock together the read-modify-write cycles to make the entire operation atomic on SMP. Without which you may lose the changes made concurrently to the same word by another cpu, or it lose yours. (If another cpu is merely changing the same bit as you are, no problem: either you're both changing it to the same value, or you're racing to set it to opposite values, and one of you wins - fair enough, however much locking you add, you're still left with the situation that two are racing and one wins in the end.) You now seem to be arguing, never mind your original PageLRU case, that include/asm-i386/bitops.h and include/asm-x86_64/bitops.h don't need the LOCK_PREFIX before the "btsl" in set_bit(). I disagree with you; but let someone more authoritative speak up. Mabye Andi can explain it a lot better than I can? Hugh - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
Hisashi Hifumi wrote: At 11:47 07/04/24, Nick Piggin wrote: >As Hugh points out, we must have atomic ops here, so changing the generic >code to use the __ version is wrong. However if there is a faster way that >i386 can perform the atomic variant, then doing so will speed up the generic >code without breaking other architectures. > Do you mean writing page-flags.h specific for i386 so improving generic code and without breaking other architectures ? I meant improving the i386 bitops specific code. However if there is some variant of operation that is not captured with the current bitop API, but could provide a useful speedup of common page flag manipulations, then you might consider extending the bitop API and making page-flags.h use that new operation. -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
At 11:47 07/04/24, Nick Piggin wrote: >As Hugh points out, we must have atomic ops here, so changing the generic >code to use the __ version is wrong. However if there is a faster way that >i386 can perform the atomic variant, then doing so will speed up the generic >code without breaking other architectures. > Do you mean writing page-flags.h specific for i386 so improving generic code and without breaking other architectures ? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
At 11:47 07/04/24, Nick Piggin wrote: As Hugh points out, we must have atomic ops here, so changing the generic code to use the __ version is wrong. However if there is a faster way that i386 can perform the atomic variant, then doing so will speed up the generic code without breaking other architectures. Do you mean writing page-flags.h specific for i386 so improving generic code and without breaking other architectures ? - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
Hisashi Hifumi wrote: At 11:47 07/04/24, Nick Piggin wrote: As Hugh points out, we must have atomic ops here, so changing the generic code to use the __ version is wrong. However if there is a faster way that i386 can perform the atomic variant, then doing so will speed up the generic code without breaking other architectures. Do you mean writing page-flags.h specific for i386 so improving generic code and without breaking other architectures ? I meant improving the i386 bitops specific code. However if there is some variant of operation that is not captured with the current bitop API, but could provide a useful speedup of common page flag manipulations, then you might consider extending the bitop API and making page-flags.h use that new operation. -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
On Tue, 24 Apr 2007, Hisashi Hifumi wrote: At 22:42 07/04/23, Hugh Dickins wrote: On Mon, 23 Apr 2007, Hisashi Hifumi wrote: No. The PG_lru flag bit is just one bit amongst many others: what of concurrent operations changing other bits in that same unsigned long e.g. trying to lock the page by setting PG_locked? There are some places where such micro-optimizations can be made (typically while first allocating the page); but in general, no. In i386 and x86_64, btsl is used to change page flag. In this case, if btsl without lock prefix set PG_locked and PG_lru flag concurrently, does only one operation succeed ? That's right: on an SMP machine, without the lock prefix, the operation is no longer atomic: what's stored back may be missing the result of one or the other of the racing operations. In the case that changing the same bit concurrently, lock prefix or other spinlock is needed. Why would you need any kind of lock when just changing a single bit, if it didn't affect other bits of the same word? Just as you don't need a lock when simply assigning a word, setting a bit to 0 or 1 is simple in itself (it doesn't matter if it was 0 or 1 before). But, I think that concurrent bit operation on different bits is just like OR operation , so lock prefix is not needed. I firmly believe that it is; but I'm not a processor expert. AMD instruction manual says about bts that , Copies a bit, specified by bit index in a register or 8-bit immediate value (second operand), from a bit string (first operand), also called the bit base, to the carry flag (CF) of the rFLAGS register, and then sets the bit in the bit string to 1. BTS instruction is read-modify-write instruction on bit unit. So concurrent bit operation on different bits may be possible. read-modify-write indeed. That's exactly what the lock prefix is needed for, isn't it? To lock together the read-modify-write cycles to make the entire operation atomic on SMP. Without which you may lose the changes made concurrently to the same word by another cpu, or it lose yours. (If another cpu is merely changing the same bit as you are, no problem: either you're both changing it to the same value, or you're racing to set it to opposite values, and one of you wins - fair enough, however much locking you add, you're still left with the situation that two are racing and one wins in the end.) You now seem to be arguing, never mind your original PageLRU case, that include/asm-i386/bitops.h and include/asm-x86_64/bitops.h don't need the LOCK_PREFIX before the btsl in set_bit(). I disagree with you; but let someone more authoritative speak up. Mabye Andi can explain it a lot better than I can? Hugh - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
Why would you need any kind of lock when just changing a single bit, if it didn't affect other bits of the same word? Just as you don't need a lock when simply assigning a word, setting a bit to 0 or 1 is simple in itself (it doesn't matter if it was 0 or 1 before). But, I think that concurrent bit operation on different bits is just like OR operation , so lock prefix is not needed. I firmly believe that it is; but I'm not a processor expert. Think of the CPU cache like the page cache. The VFS cannot change anything directly on disk - it always has to read a page (or block); change it there even if it's only a single bit and back. Now imagine multiple independent kernels to that disk doing this in parallel without any locking. You will lose data on simple changes because of data races. The CPU is similar. The memory is the disk; the protocol talking to the memory only knows cache lines. There are multiple CPUs talking to that memory. The CPUs cannot change anything without reading a full cache line first and then later writing it back. There is just do OR operation when talking to memory, just like disks don't have such a operation; just read and write. [there are some special cases with uncached mappings, but they are not relevant here] With lock the CPU ensures the read-modify-write cycle is atomic, without it doesn't. The CPU also guarantees you that multiple writes don't get lost even when they hit the same cache line (otherwise you would need lock for anything in the same cache line), but it doesn't guarantee that for a word. The details for that vary by architecture; on x86 it is any memory access, on others it can be only guaranteed for long stores. The exact rules for all this can be quite complicated (and also vary by architecture), this is a simplification but should be enough as a rough guide. Hope this helps, -Andi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
Hisashi Hifumi wrote: At 22:42 07/04/23, Hugh Dickins wrote: >On Mon, 23 Apr 2007, Hisashi Hifumi wrote: >> >No. The PG_lru flag bit is just one bit amongst many others: >> >what of concurrent operations changing other bits in that same >> >unsigned long e.g. trying to lock the page by setting PG_locked? >> >There are some places where such micro-optimizations can be made >> >(typically while first allocating the page); but in general, no. >> >> In i386 and x86_64, btsl is used to change page flag. In this case, if btsl >> without lock prefix >> set PG_locked and PG_lru flag concurrently, does only one operation >> succeed ? > >That's right: on an SMP machine, without the lock prefix, the operation >is no longer atomic: what's stored back may be missing the result of >one or the other of the racing operations. > In the case that changing the same bit concurrently, lock prefix or other spinlock is needed. But, I think that concurrent bit operation on different bits is just like OR operation , so lock prefix is not needed. AMD instruction manual says about bts that , "Copies a bit, specified by bit index in a register or 8-bit immediate value (second operand), from a bit string (first operand), also called the bit base, to the carry flag (CF) of the rFLAGS register, and then sets the bit in the bit string to 1." BTS instruction is read-modify-write instruction on bit unit. So concurrent bit operation on different bits may be possible. No matter what actual instruction is used, the SetPageLRU operation (ie. without the double underscore prefix) must be atomic, and the __SetPageLRU operation *can* be non-atomic if that would be faster. As Hugh points out, we must have atomic ops here, so changing the generic code to use the __ version is wrong. However if there is a faster way that i386 can perform the atomic variant, then doing so will speed up the generic code without breaking other architectures. -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
On Tue, 24 Apr 2007 10:54:27 +0900 Hisashi Hifumi <[EMAIL PROTECTED]> wrote: > In the case that changing the same bit concurrently, lock prefix or other > spinlock is needed. But, I think that concurrent bit operation on different > bits > is just like OR operation , so lock prefix is not needed. > > AMD instruction manual says about bts that , > > "Copies a bit, specified by bit index in a register or 8-bit immediate > value (second operand), from a bit > string (first operand), also called the bit base, to the carry flag (CF) of > the rFLAGS register, and then > sets the bit in the bit string to 1." > > BTS instruction is read-modify-write instruction on bit unit. So concurrent > bit operation on different > bits may be possible. > This is ia64's __set_bit() hehe.. == static __inline__ void __set_bit (int nr, volatile void *addr) { *((__u32 *) addr + (nr >> 5)) |= (1 << (nr & 31)); } == Bye. -Kame - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
At 22:42 07/04/23, Hugh Dickins wrote: >On Mon, 23 Apr 2007, Hisashi Hifumi wrote: >> >No. The PG_lru flag bit is just one bit amongst many others: >> >what of concurrent operations changing other bits in that same >> >unsigned long e.g. trying to lock the page by setting PG_locked? >> >There are some places where such micro-optimizations can be made >> >(typically while first allocating the page); but in general, no. >> >> In i386 and x86_64, btsl is used to change page flag. In this case, if btsl >> without lock prefix >> set PG_locked and PG_lru flag concurrently, does only one operation >> succeed ? > >That's right: on an SMP machine, without the lock prefix, the operation >is no longer atomic: what's stored back may be missing the result of >one or the other of the racing operations. > In the case that changing the same bit concurrently, lock prefix or other spinlock is needed. But, I think that concurrent bit operation on different bits is just like OR operation , so lock prefix is not needed. AMD instruction manual says about bts that , "Copies a bit, specified by bit index in a register or 8-bit immediate value (second operand), from a bit string (first operand), also called the bit base, to the carry flag (CF) of the rFLAGS register, and then sets the bit in the bit string to 1." BTS instruction is read-modify-write instruction on bit unit. So concurrent bit operation on different bits may be possible. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
On Mon, 23 Apr 2007, Hisashi Hifumi wrote: > >No. The PG_lru flag bit is just one bit amongst many others: > >what of concurrent operations changing other bits in that same > >unsigned long e.g. trying to lock the page by setting PG_locked? > >There are some places where such micro-optimizations can be made > >(typically while first allocating the page); but in general, no. > > In i386 and x86_64, btsl is used to change page flag. In this case, if btsl > without lock prefix > set PG_locked and PG_lru flag concurrently, does only one operation > succeed ? That's right: on an SMP machine, without the lock prefix, the operation is no longer atomic: what's stored back may be missing the result of one or the other of the racing operations. Hugh - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
>No. The PG_lru flag bit is just one bit amongst many others: >what of concurrent operations changing other bits in that same >unsigned long e.g. trying to lock the page by setting PG_locked? >There are some places where such micro-optimizations can be made >(typically while first allocating the page); but in general, no. In i386 and x86_64, btsl is used to change page flag. In this case, if btsl without lock prefix set PG_locked and PG_lru flag concurrently, does only one operation succeed ? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
On Mon, 23 Apr 2007, Hisashi Hifumi wrote: > > PageLRU flag operation is protected by zone->lru_lock, so > SetPageLRU/ClearPageLRU > can be replaced to __SetPageLRU/__ClearPageLRU non-atomic bit operation. No. The PG_lru flag bit is just one bit amongst many others: what of concurrent operations changing other bits in that same unsigned long e.g. trying to lock the page by setting PG_locked? There are some places where such micro-optimizations can be made (typically while first allocating the page); but in general, no. Hugh - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mm: PageLRU can be non-atomic bit operation
Hi PageLRU flag operation is protected by zone->lru_lock, so SetPageLRU/ClearPageLRU can be replaced to __SetPageLRU/__ClearPageLRU non-atomic bit operation. Thanks. Signed-off-by :Hisashi Hifumi <[EMAIL PROTECTED]> diff -Nrup linux-2.6.21-rc7.org/include/linux/page-flags.h linux-2.6.21-rc7/include/linux/page-flags.h --- linux-2.6.21-rc7.org/include/linux/page-flags.h 2007-04-17 16:36:05.0 +0900 +++ linux-2.6.21-rc7/include/linux/page-flags.h 2007-04-23 18:28:19.0 +0900 @@ -148,8 +148,7 @@ static inline void SetPageUptodate(struc #define TestClearPageDirty(page) test_and_clear_bit(PG_dirty, &(page)->flags) #define PageLRU(page) test_bit(PG_lru, &(page)->flags) -#define SetPageLRU(page) set_bit(PG_lru, &(page)->flags) -#define ClearPageLRU(page) clear_bit(PG_lru, &(page)->flags) +#define __SetPageLRU(page) __set_bit(PG_lru, &(page)->flags) #define __ClearPageLRU(page) __clear_bit(PG_lru, &(page)->flags) #define PageActive(page) test_bit(PG_active, &(page)->flags) diff -Nrup linux-2.6.21-rc7.org/mm/migrate.c linux-2.6.21-rc7/mm/migrate.c --- linux-2.6.21-rc7.org/mm/migrate.c 2007-04-17 16:36:05.0 +0900 +++ linux-2.6.21-rc7/mm/migrate.c 2007-04-23 18:30:43.0 +0900 @@ -52,7 +52,7 @@ int isolate_lru_page(struct page *page, if (PageLRU(page)) { ret = 0; get_page(page); - ClearPageLRU(page); + __ClearPageLRU(page); if (PageActive(page)) del_page_from_active_list(zone, page); else diff -Nrup linux-2.6.21-rc7.org/mm/swap.c linux-2.6.21-rc7/mm/swap.c --- linux-2.6.21-rc7.org/mm/swap.c 2007-02-05 03:44:54.0 +0900 +++ linux-2.6.21-rc7/mm/swap.c 2007-04-23 18:29:51.0 +0900 @@ -364,7 +364,7 @@ void __pagevec_lru_add(struct pagevec *p spin_lock_irq(>lru_lock); } VM_BUG_ON(PageLRU(page)); - SetPageLRU(page); + __SetPageLRU(page); add_page_to_inactive_list(zone, page); } if (zone) @@ -391,7 +391,7 @@ void __pagevec_lru_add_active(struct pag spin_lock_irq(>lru_lock); } VM_BUG_ON(PageLRU(page)); - SetPageLRU(page); + __SetPageLRU(page); VM_BUG_ON(PageActive(page)); SetPageActive(page); add_page_to_active_list(zone, page); diff -Nrup linux-2.6.21-rc7.org/mm/vmscan.c linux-2.6.21-rc7/mm/vmscan.c --- linux-2.6.21-rc7.org/mm/vmscan.c2007-04-17 16:36:05.0 +0900 +++ linux-2.6.21-rc7/mm/vmscan.c2007-04-23 18:30:30.0 +0900 @@ -642,7 +642,7 @@ static unsigned long isolate_lru_pages(u * sure the page is not being freed elsewhere -- the * page release code relies on it. */ - ClearPageLRU(page); + __ClearPageLRU(page); target = dst; nr_taken++; } /* else it is being freed elsewhere */ @@ -704,7 +704,7 @@ static unsigned long shrink_inactive_lis while (!list_empty(_list)) { page = lru_to_page(_list); VM_BUG_ON(PageLRU(page)); - SetPageLRU(page); + __SetPageLRU(page); list_del(>lru); if (PageActive(page)) add_page_to_active_list(zone, page); @@ -851,7 +851,7 @@ force_reclaim_mapped: page = lru_to_page(_inactive); prefetchw_prev_lru_page(page, _inactive, flags); VM_BUG_ON(PageLRU(page)); - SetPageLRU(page); + __SetPageLRU(page); VM_BUG_ON(!PageActive(page)); ClearPageActive(page); @@ -881,7 +881,7 @@ force_reclaim_mapped: page = lru_to_page(_active); prefetchw_prev_lru_page(page, _active, flags); VM_BUG_ON(PageLRU(page)); - SetPageLRU(page); + __SetPageLRU(page); VM_BUG_ON(!PageActive(page)); list_move(>lru, >active_list); pgmoved++; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mm: PageLRU can be non-atomic bit operation
Hi PageLRU flag operation is protected by zone-lru_lock, so SetPageLRU/ClearPageLRU can be replaced to __SetPageLRU/__ClearPageLRU non-atomic bit operation. Thanks. Signed-off-by :Hisashi Hifumi [EMAIL PROTECTED] diff -Nrup linux-2.6.21-rc7.org/include/linux/page-flags.h linux-2.6.21-rc7/include/linux/page-flags.h --- linux-2.6.21-rc7.org/include/linux/page-flags.h 2007-04-17 16:36:05.0 +0900 +++ linux-2.6.21-rc7/include/linux/page-flags.h 2007-04-23 18:28:19.0 +0900 @@ -148,8 +148,7 @@ static inline void SetPageUptodate(struc #define TestClearPageDirty(page) test_and_clear_bit(PG_dirty, (page)-flags) #define PageLRU(page) test_bit(PG_lru, (page)-flags) -#define SetPageLRU(page) set_bit(PG_lru, (page)-flags) -#define ClearPageLRU(page) clear_bit(PG_lru, (page)-flags) +#define __SetPageLRU(page) __set_bit(PG_lru, (page)-flags) #define __ClearPageLRU(page) __clear_bit(PG_lru, (page)-flags) #define PageActive(page) test_bit(PG_active, (page)-flags) diff -Nrup linux-2.6.21-rc7.org/mm/migrate.c linux-2.6.21-rc7/mm/migrate.c --- linux-2.6.21-rc7.org/mm/migrate.c 2007-04-17 16:36:05.0 +0900 +++ linux-2.6.21-rc7/mm/migrate.c 2007-04-23 18:30:43.0 +0900 @@ -52,7 +52,7 @@ int isolate_lru_page(struct page *page, if (PageLRU(page)) { ret = 0; get_page(page); - ClearPageLRU(page); + __ClearPageLRU(page); if (PageActive(page)) del_page_from_active_list(zone, page); else diff -Nrup linux-2.6.21-rc7.org/mm/swap.c linux-2.6.21-rc7/mm/swap.c --- linux-2.6.21-rc7.org/mm/swap.c 2007-02-05 03:44:54.0 +0900 +++ linux-2.6.21-rc7/mm/swap.c 2007-04-23 18:29:51.0 +0900 @@ -364,7 +364,7 @@ void __pagevec_lru_add(struct pagevec *p spin_lock_irq(zone-lru_lock); } VM_BUG_ON(PageLRU(page)); - SetPageLRU(page); + __SetPageLRU(page); add_page_to_inactive_list(zone, page); } if (zone) @@ -391,7 +391,7 @@ void __pagevec_lru_add_active(struct pag spin_lock_irq(zone-lru_lock); } VM_BUG_ON(PageLRU(page)); - SetPageLRU(page); + __SetPageLRU(page); VM_BUG_ON(PageActive(page)); SetPageActive(page); add_page_to_active_list(zone, page); diff -Nrup linux-2.6.21-rc7.org/mm/vmscan.c linux-2.6.21-rc7/mm/vmscan.c --- linux-2.6.21-rc7.org/mm/vmscan.c2007-04-17 16:36:05.0 +0900 +++ linux-2.6.21-rc7/mm/vmscan.c2007-04-23 18:30:30.0 +0900 @@ -642,7 +642,7 @@ static unsigned long isolate_lru_pages(u * sure the page is not being freed elsewhere -- the * page release code relies on it. */ - ClearPageLRU(page); + __ClearPageLRU(page); target = dst; nr_taken++; } /* else it is being freed elsewhere */ @@ -704,7 +704,7 @@ static unsigned long shrink_inactive_lis while (!list_empty(page_list)) { page = lru_to_page(page_list); VM_BUG_ON(PageLRU(page)); - SetPageLRU(page); + __SetPageLRU(page); list_del(page-lru); if (PageActive(page)) add_page_to_active_list(zone, page); @@ -851,7 +851,7 @@ force_reclaim_mapped: page = lru_to_page(l_inactive); prefetchw_prev_lru_page(page, l_inactive, flags); VM_BUG_ON(PageLRU(page)); - SetPageLRU(page); + __SetPageLRU(page); VM_BUG_ON(!PageActive(page)); ClearPageActive(page); @@ -881,7 +881,7 @@ force_reclaim_mapped: page = lru_to_page(l_active); prefetchw_prev_lru_page(page, l_active, flags); VM_BUG_ON(PageLRU(page)); - SetPageLRU(page); + __SetPageLRU(page); VM_BUG_ON(!PageActive(page)); list_move(page-lru, zone-active_list); pgmoved++; - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
On Mon, 23 Apr 2007, Hisashi Hifumi wrote: PageLRU flag operation is protected by zone-lru_lock, so SetPageLRU/ClearPageLRU can be replaced to __SetPageLRU/__ClearPageLRU non-atomic bit operation. No. The PG_lru flag bit is just one bit amongst many others: what of concurrent operations changing other bits in that same unsigned long e.g. trying to lock the page by setting PG_locked? There are some places where such micro-optimizations can be made (typically while first allocating the page); but in general, no. Hugh - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
No. The PG_lru flag bit is just one bit amongst many others: what of concurrent operations changing other bits in that same unsigned long e.g. trying to lock the page by setting PG_locked? There are some places where such micro-optimizations can be made (typically while first allocating the page); but in general, no. In i386 and x86_64, btsl is used to change page flag. In this case, if btsl without lock prefix set PG_locked and PG_lru flag concurrently, does only one operation succeed ? - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
On Mon, 23 Apr 2007, Hisashi Hifumi wrote: No. The PG_lru flag bit is just one bit amongst many others: what of concurrent operations changing other bits in that same unsigned long e.g. trying to lock the page by setting PG_locked? There are some places where such micro-optimizations can be made (typically while first allocating the page); but in general, no. In i386 and x86_64, btsl is used to change page flag. In this case, if btsl without lock prefix set PG_locked and PG_lru flag concurrently, does only one operation succeed ? That's right: on an SMP machine, without the lock prefix, the operation is no longer atomic: what's stored back may be missing the result of one or the other of the racing operations. Hugh - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
At 22:42 07/04/23, Hugh Dickins wrote: On Mon, 23 Apr 2007, Hisashi Hifumi wrote: No. The PG_lru flag bit is just one bit amongst many others: what of concurrent operations changing other bits in that same unsigned long e.g. trying to lock the page by setting PG_locked? There are some places where such micro-optimizations can be made (typically while first allocating the page); but in general, no. In i386 and x86_64, btsl is used to change page flag. In this case, if btsl without lock prefix set PG_locked and PG_lru flag concurrently, does only one operation succeed ? That's right: on an SMP machine, without the lock prefix, the operation is no longer atomic: what's stored back may be missing the result of one or the other of the racing operations. In the case that changing the same bit concurrently, lock prefix or other spinlock is needed. But, I think that concurrent bit operation on different bits is just like OR operation , so lock prefix is not needed. AMD instruction manual says about bts that , Copies a bit, specified by bit index in a register or 8-bit immediate value (second operand), from a bit string (first operand), also called the bit base, to the carry flag (CF) of the rFLAGS register, and then sets the bit in the bit string to 1. BTS instruction is read-modify-write instruction on bit unit. So concurrent bit operation on different bits may be possible. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
On Tue, 24 Apr 2007 10:54:27 +0900 Hisashi Hifumi [EMAIL PROTECTED] wrote: In the case that changing the same bit concurrently, lock prefix or other spinlock is needed. But, I think that concurrent bit operation on different bits is just like OR operation , so lock prefix is not needed. AMD instruction manual says about bts that , Copies a bit, specified by bit index in a register or 8-bit immediate value (second operand), from a bit string (first operand), also called the bit base, to the carry flag (CF) of the rFLAGS register, and then sets the bit in the bit string to 1. BTS instruction is read-modify-write instruction on bit unit. So concurrent bit operation on different bits may be possible. This is ia64's __set_bit() hehe.. == static __inline__ void __set_bit (int nr, volatile void *addr) { *((__u32 *) addr + (nr 5)) |= (1 (nr 31)); } == Bye. -Kame - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: PageLRU can be non-atomic bit operation
Hisashi Hifumi wrote: At 22:42 07/04/23, Hugh Dickins wrote: On Mon, 23 Apr 2007, Hisashi Hifumi wrote: No. The PG_lru flag bit is just one bit amongst many others: what of concurrent operations changing other bits in that same unsigned long e.g. trying to lock the page by setting PG_locked? There are some places where such micro-optimizations can be made (typically while first allocating the page); but in general, no. In i386 and x86_64, btsl is used to change page flag. In this case, if btsl without lock prefix set PG_locked and PG_lru flag concurrently, does only one operation succeed ? That's right: on an SMP machine, without the lock prefix, the operation is no longer atomic: what's stored back may be missing the result of one or the other of the racing operations. In the case that changing the same bit concurrently, lock prefix or other spinlock is needed. But, I think that concurrent bit operation on different bits is just like OR operation , so lock prefix is not needed. AMD instruction manual says about bts that , Copies a bit, specified by bit index in a register or 8-bit immediate value (second operand), from a bit string (first operand), also called the bit base, to the carry flag (CF) of the rFLAGS register, and then sets the bit in the bit string to 1. BTS instruction is read-modify-write instruction on bit unit. So concurrent bit operation on different bits may be possible. No matter what actual instruction is used, the SetPageLRU operation (ie. without the double underscore prefix) must be atomic, and the __SetPageLRU operation *can* be non-atomic if that would be faster. As Hugh points out, we must have atomic ops here, so changing the generic code to use the __ version is wrong. However if there is a faster way that i386 can perform the atomic variant, then doing so will speed up the generic code without breaking other architectures. -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/