Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Sun, 31 Dec 2006, Arjan van de Ven wrote: > So... yes I fully agree with you that it's worth looking at the > memset( , PAGE_SIZE) users. If they are page aligned, yes absolutely > make it a clear_page(), I think that's a very good idea. However > also please check if they've been very recently allocated in that > code, and if maybe the zeroing allocators are better suited there.. > (or maybe there's even double zeroing going on.. that's be a nice > gain) there's certainly some cleanup/speedup that could be done regarding these numerous "memset(...,0,PAGE_SIZE) calls. first, there the obvious 1:1 replacement with a call to "clear_page()" ***if that's appropriate***. second, there's some possible simplification, given snippets like this one from arch/sparc/mm/sun4c.c pte = (pte_t *)__get_free_page(GFP_KERNEL|__GFP_REPEAT); if (pte) memset(pte, 0, PAGE_SIZE); which seems to be an obvious candidate for replacement with: pte = get_zeroed_page(GFP_KERNEL|__GFP_REPEAT) no? finally, there is certainly some "double zeroing" going on, as with this snippet from drivers/atm/eni.c: ... eni_dev->rx_map = (struct atm_vcc **) get_zeroed_page(GFP_KERNEL); ^^^ if (!eni_dev->rx_map) { printk(KERN_ERR DEV_LABEL "(itf %d): couldn't get free page\n", dev->number); free_page((unsigned long) eni_dev->free_list); return -ENOMEM; } memset(eni_dev->rx_map,0,PAGE_SIZE);// redundant, no? ... so, yes, there does appear to be room for cleanup/speedup. rday - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Sun, 31 Dec 2006, Arjan van de Ven wrote: So... yes I fully agree with you that it's worth looking at the memset( , PAGE_SIZE) users. If they are page aligned, yes absolutely make it a clear_page(), I think that's a very good idea. However also please check if they've been very recently allocated in that code, and if maybe the zeroing allocators are better suited there.. (or maybe there's even double zeroing going on.. that's be a nice gain) there's certainly some cleanup/speedup that could be done regarding these numerous memset(...,0,PAGE_SIZE) calls. first, there the obvious 1:1 replacement with a call to clear_page() ***if that's appropriate***. second, there's some possible simplification, given snippets like this one from arch/sparc/mm/sun4c.c pte = (pte_t *)__get_free_page(GFP_KERNEL|__GFP_REPEAT); if (pte) memset(pte, 0, PAGE_SIZE); which seems to be an obvious candidate for replacement with: pte = get_zeroed_page(GFP_KERNEL|__GFP_REPEAT) no? finally, there is certainly some double zeroing going on, as with this snippet from drivers/atm/eni.c: ... eni_dev-rx_map = (struct atm_vcc **) get_zeroed_page(GFP_KERNEL); ^^^ if (!eni_dev-rx_map) { printk(KERN_ERR DEV_LABEL (itf %d): couldn't get free page\n, dev-number); free_page((unsigned long) eni_dev-free_list); return -ENOMEM; } memset(eni_dev-rx_map,0,PAGE_SIZE);// redundant, no? ... so, yes, there does appear to be room for cleanup/speedup. rday - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Sat, 30 Dec 2006, Denis Vlasenko wrote: > I was experimenting with SSE[2] clear_page() which uses > non-temporal stores. That one requires 16 byte alignment. > > BTW, it worked ~300% faster than memset. But Andi Kleen > insists that cache eviction caused by NT stores will make it > slower in macrobenchmark. > > Apart from fairly extensive set of microbechmarks > I tested kernel compiles (i.e. "real world load") > and they are FASTER too, not slower, but Andi > is fairly entrenched in his opinion ;) > I gave up. you know, with the kernel zeroing pages through the 1:1 phys mapping, and userland accessing pages through a different mapping... it seems that frequently virtual address bits 12..14 will differ between user and kernel. on K8 this results in a virtual alias conflict which costs *70 cycles* per cache line. (K8 L1 DC uses virtual bits 12..14 as part of the index.) this is larger than the cost for L1 miss L2 hit... this wouldn't happen with movnt... but then we get into the handwaving arguments about timing of accesses to the freshly zeroed page. too bad there's no "evict from L1 to L2" operation -- that would avoid the virtual alias problem. there's an event (75h unit mask 02h) to measure virtual alias conflicts... i've always wondered if there are workloads which trigger this behaviour. it can happy on copy to/from user as well. -dean - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Sat, 30 Dec 2006, Denis Vlasenko wrote: I was experimenting with SSE[2] clear_page() which uses non-temporal stores. That one requires 16 byte alignment. BTW, it worked ~300% faster than memset. But Andi Kleen insists that cache eviction caused by NT stores will make it slower in macrobenchmark. Apart from fairly extensive set of microbechmarks I tested kernel compiles (i.e. real world load) and they are FASTER too, not slower, but Andi is fairly entrenched in his opinion ;) I gave up. you know, with the kernel zeroing pages through the 1:1 phys mapping, and userland accessing pages through a different mapping... it seems that frequently virtual address bits 12..14 will differ between user and kernel. on K8 this results in a virtual alias conflict which costs *70 cycles* per cache line. (K8 L1 DC uses virtual bits 12..14 as part of the index.) this is larger than the cost for L1 miss L2 hit... this wouldn't happen with movnt... but then we get into the handwaving arguments about timing of accesses to the freshly zeroed page. too bad there's no evict from L1 to L2 operation -- that would avoid the virtual alias problem. there's an event (75h unit mask 02h) to measure virtual alias conflicts... i've always wondered if there are workloads which trigger this behaviour. it can happy on copy to/from user as well. -dean - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Mon, Jan 01, 2007 at 05:27:10AM -0500, Robert P. J. Day wrote: > > both look good... I'd be in favor of this. Maybe also add a part > > about using GFP_KERNEL whenever possible, GFP_NOFS from filesystem > > writeout code and GFP_NOIO from block writeout code (and never doing > > in_interrupt()?GFP_ATOMIC:GFP_KERNEL !) > > it strikes me that that latter part is starting to go beyond the scope > of simple coding style aesthetics and getting into actual coding > distinctions. would that really be appropriate for the CodingStyle > doc? i'm just asking. Adding a separate 'good practices' doc wouldn't be a bad idea. Dave -- http://www.codemonkey.org.uk - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Mon, 1 Jan 2007 17:42:31 +0900 Paul Mundt wrote: > On Mon, Jan 01, 2007 at 02:59:32AM +0100, Folkert van Heusden wrote: > > > > regarding alignment that don't allow clear_page() to be used > > > > copy_page() in the memcpy() case), but it's going to need a lot of > > > > Maybe these optimalisations should be in the coding style docs? > > > For what purpose? CodingStyle is not about documenting usage constraints > for every minor part of the kernel. If someone intends to use an API, > it's up to them to figure out the semantics for doing so. Let's not > confuse common sense with style. > - I agree, these aren't CodingStyle material. They could make sense in some other doc, either MM/VM-related or more general. I've often wanted a somewhat introductory doc about Linux environmental assumptions (memory model, pointers/longs, data cleared to 0, many other items that I don't have on my fingertips). --- ~Randy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Mon, 1 Jan 2007, Arjan van de Ven wrote: > > Given the above, some basic suggestions for page-based memory management: > > > > (a) If you need to allocate or free a single page, use the single page > > version of the routine/macro, rather than calling the multi-page > > version with an order value of zero, such as: > > > > alloc_pages(gfp_mask, 0); /* no */ > > alloc_page(gfp_mask); /* better */ > > > > (b) If you need to allocate a single zeroed page by logical address, > > use get_zeroed_page(), rather than __get_free_page() followed > > by a call to memset() to clear that page. > > both look good... I'd be in favor of this. Maybe also add a part > about using GFP_KERNEL whenever possible, GFP_NOFS from filesystem > writeout code and GFP_NOIO from block writeout code (and never doing > in_interrupt()?GFP_ATOMIC:GFP_KERNEL !) it strikes me that that latter part is starting to go beyond the scope of simple coding style aesthetics and getting into actual coding distinctions. would that really be appropriate for the CodingStyle doc? i'm just asking. > > (c) If you need to specifically allocate some DMA pages, use the > > __get_dma_pages() macro, as in: > > > > __get_free_pages(GFP_KERNEL|GFP_DMA, order) /* no */ > > __get_dma_pages(GFP_KERNEL, order) /* better */ > > this.. does not really. GFP_DMA is an ancient artifact from the ISA > days. Better to describe the dma mapping interface (well give a > pointer to the doc that already exists about that), that one is > REALLY for allocating dma pages in this century. ok, i was just trying to make the calls consistent based on what i could see in the current source code. i'm still reviewing the material on DMA -- feel free to suggest better wording. rday p.s. what DMA doc are you referring to above? DMA-mapping.txt? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
> Given the above, some basic suggestions for page-based memory management: > > (a) If you need to allocate or free a single page, use the single page > version of the routine/macro, rather than calling the multi-page > version with an order value of zero, such as: > > alloc_pages(gfp_mask, 0); /* no */ > alloc_page(gfp_mask); /* better */ > > (b) If you need to allocate a single zeroed page by logical address, > use get_zeroed_page(), rather than __get_free_page() followed > by a call to memset() to clear that page. both look good... I'd be in favor of this. Maybe also add a part about using GFP_KERNEL whenever possible, GFP_NOFS from filesystem writeout code and GFP_NOIO from block writeout code (and never doing in_interrupt()?GFP_ATOMIC:GFP_KERNEL !) > > (c) If you need to specifically allocate some DMA pages, use the > __get_dma_pages() macro, as in: > > __get_free_pages(GFP_KERNEL|GFP_DMA, order) /* no */ > __get_dma_pages(GFP_KERNEL, order) /* better */ this.. does not really. GFP_DMA is an ancient artifact from the ISA days. Better to describe the dma mapping interface (well give a pointer to the doc that already exists about that), that one is REALLY for allocating dma pages in this century. -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Mon, Jan 01, 2007 at 02:59:32AM +0100, Folkert van Heusden wrote: > > > regarding alignment that don't allow clear_page() to be used > > > copy_page() in the memcpy() case), but it's going to need a lot of > > Maybe these optimalisations should be in the coding style docs? > For what purpose? CodingStyle is not about documenting usage constraints for every minor part of the kernel. If someone intends to use an API, it's up to them to figure out the semantics for doing so. Let's not confuse common sense with style. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Mon, 1 Jan 2007, Folkert van Heusden wrote: > > > regarding alignment that don't allow clear_page() to be used > > > copy_page() in the memcpy() case), but it's going to need a lot of > > Maybe these optimalisations should be in the coding style docs? i was thinking of submitting the following as a new "chapter" for the doc -- it would address *some* of these issues: Chapter XX: Page-related memory management The following functions and macro definitions are available via include/linux/gfp.h for page-based memory management: struct page *alloc_pages(gfp_mask, order); unsigned long __get_free_pages(gfp_mask, order); unsigned long get_zeroed_page(gfp_mask); void __free_pages(struct page *page, order); void free_pages(unsigned long addr, order); #define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0) #define __get_free_page(gfp_mask) __get_free_pages((gfp_mask),0) #define __free_page(page) __free_pages((page), 0) #define free_page(addr) free_pages((addr),0) #define __get_dma_pages(gfp_mask, order) \ __get_free_pages((gfp_mask) | GFP_DMA,(order)) Given the above, some basic suggestions for page-based memory management: (a) If you need to allocate or free a single page, use the single page version of the routine/macro, rather than calling the multi-page version with an order value of zero, such as: alloc_pages(gfp_mask, 0); /* no */ alloc_page(gfp_mask); /* better */ (b) If you need to allocate a single zeroed page by logical address, use get_zeroed_page(), rather than __get_free_page() followed by a call to memset() to clear that page. (c) If you need to specifically allocate some DMA pages, use the __get_dma_pages() macro, as in: __get_free_pages(GFP_KERNEL|GFP_DMA, order) /* no */ __get_dma_pages(GFP_KERNEL, order) /* better */ (d) If you need to clear (zero) a page, be aware that every architecture defines a clear_page() routine, either as a macro in include//page.h or as an assembler routine. You should check if it's appropriate to use that routine/macro, rather than making an explicit call to memset(...,0, PAGE_SIZE). - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Mon, 1 Jan 2007, Folkert van Heusden wrote: regarding alignment that don't allow clear_page() to be used copy_page() in the memcpy() case), but it's going to need a lot of Maybe these optimalisations should be in the coding style docs? i was thinking of submitting the following as a new chapter for the doc -- it would address *some* of these issues: Chapter XX: Page-related memory management The following functions and macro definitions are available via include/linux/gfp.h for page-based memory management: struct page *alloc_pages(gfp_mask, order); unsigned long __get_free_pages(gfp_mask, order); unsigned long get_zeroed_page(gfp_mask); void __free_pages(struct page *page, order); void free_pages(unsigned long addr, order); #define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0) #define __get_free_page(gfp_mask) __get_free_pages((gfp_mask),0) #define __free_page(page) __free_pages((page), 0) #define free_page(addr) free_pages((addr),0) #define __get_dma_pages(gfp_mask, order) \ __get_free_pages((gfp_mask) | GFP_DMA,(order)) Given the above, some basic suggestions for page-based memory management: (a) If you need to allocate or free a single page, use the single page version of the routine/macro, rather than calling the multi-page version with an order value of zero, such as: alloc_pages(gfp_mask, 0); /* no */ alloc_page(gfp_mask); /* better */ (b) If you need to allocate a single zeroed page by logical address, use get_zeroed_page(), rather than __get_free_page() followed by a call to memset() to clear that page. (c) If you need to specifically allocate some DMA pages, use the __get_dma_pages() macro, as in: __get_free_pages(GFP_KERNEL|GFP_DMA, order) /* no */ __get_dma_pages(GFP_KERNEL, order) /* better */ (d) If you need to clear (zero) a page, be aware that every architecture defines a clear_page() routine, either as a macro in include/arch/page.h or as an assembler routine. You should check if it's appropriate to use that routine/macro, rather than making an explicit call to memset(...,0, PAGE_SIZE). - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Mon, Jan 01, 2007 at 02:59:32AM +0100, Folkert van Heusden wrote: regarding alignment that don't allow clear_page() to be used copy_page() in the memcpy() case), but it's going to need a lot of Maybe these optimalisations should be in the coding style docs? For what purpose? CodingStyle is not about documenting usage constraints for every minor part of the kernel. If someone intends to use an API, it's up to them to figure out the semantics for doing so. Let's not confuse common sense with style. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
Given the above, some basic suggestions for page-based memory management: (a) If you need to allocate or free a single page, use the single page version of the routine/macro, rather than calling the multi-page version with an order value of zero, such as: alloc_pages(gfp_mask, 0); /* no */ alloc_page(gfp_mask); /* better */ (b) If you need to allocate a single zeroed page by logical address, use get_zeroed_page(), rather than __get_free_page() followed by a call to memset() to clear that page. both look good... I'd be in favor of this. Maybe also add a part about using GFP_KERNEL whenever possible, GFP_NOFS from filesystem writeout code and GFP_NOIO from block writeout code (and never doing in_interrupt()?GFP_ATOMIC:GFP_KERNEL !) (c) If you need to specifically allocate some DMA pages, use the __get_dma_pages() macro, as in: __get_free_pages(GFP_KERNEL|GFP_DMA, order) /* no */ __get_dma_pages(GFP_KERNEL, order) /* better */ this.. does not really. GFP_DMA is an ancient artifact from the ISA days. Better to describe the dma mapping interface (well give a pointer to the doc that already exists about that), that one is REALLY for allocating dma pages in this century. -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Mon, 1 Jan 2007, Arjan van de Ven wrote: Given the above, some basic suggestions for page-based memory management: (a) If you need to allocate or free a single page, use the single page version of the routine/macro, rather than calling the multi-page version with an order value of zero, such as: alloc_pages(gfp_mask, 0); /* no */ alloc_page(gfp_mask); /* better */ (b) If you need to allocate a single zeroed page by logical address, use get_zeroed_page(), rather than __get_free_page() followed by a call to memset() to clear that page. both look good... I'd be in favor of this. Maybe also add a part about using GFP_KERNEL whenever possible, GFP_NOFS from filesystem writeout code and GFP_NOIO from block writeout code (and never doing in_interrupt()?GFP_ATOMIC:GFP_KERNEL !) it strikes me that that latter part is starting to go beyond the scope of simple coding style aesthetics and getting into actual coding distinctions. would that really be appropriate for the CodingStyle doc? i'm just asking. (c) If you need to specifically allocate some DMA pages, use the __get_dma_pages() macro, as in: __get_free_pages(GFP_KERNEL|GFP_DMA, order) /* no */ __get_dma_pages(GFP_KERNEL, order) /* better */ this.. does not really. GFP_DMA is an ancient artifact from the ISA days. Better to describe the dma mapping interface (well give a pointer to the doc that already exists about that), that one is REALLY for allocating dma pages in this century. ok, i was just trying to make the calls consistent based on what i could see in the current source code. i'm still reviewing the material on DMA -- feel free to suggest better wording. rday p.s. what DMA doc are you referring to above? DMA-mapping.txt? - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Mon, 1 Jan 2007 17:42:31 +0900 Paul Mundt wrote: On Mon, Jan 01, 2007 at 02:59:32AM +0100, Folkert van Heusden wrote: regarding alignment that don't allow clear_page() to be used copy_page() in the memcpy() case), but it's going to need a lot of Maybe these optimalisations should be in the coding style docs? For what purpose? CodingStyle is not about documenting usage constraints for every minor part of the kernel. If someone intends to use an API, it's up to them to figure out the semantics for doing so. Let's not confuse common sense with style. - I agree, these aren't CodingStyle material. They could make sense in some other doc, either MM/VM-related or more general. I've often wanted a somewhat introductory doc about Linux environmental assumptions (memory model, pointers/longs, data cleared to 0, many other items that I don't have on my fingertips). --- ~Randy - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Mon, Jan 01, 2007 at 05:27:10AM -0500, Robert P. J. Day wrote: both look good... I'd be in favor of this. Maybe also add a part about using GFP_KERNEL whenever possible, GFP_NOFS from filesystem writeout code and GFP_NOIO from block writeout code (and never doing in_interrupt()?GFP_ATOMIC:GFP_KERNEL !) it strikes me that that latter part is starting to go beyond the scope of simple coding style aesthetics and getting into actual coding distinctions. would that really be appropriate for the CodingStyle doc? i'm just asking. Adding a separate 'good practices' doc wouldn't be a bad idea. Dave -- http://www.codemonkey.org.uk - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
> > regarding alignment that don't allow clear_page() to be used > > copy_page() in the memcpy() case), but it's going to need a lot of Maybe these optimalisations should be in the coding style docs? Folkert van Heusden -- Ever wonder what is out there? Any alien races? Then please support the [EMAIL PROTECTED] project: setiathome.ssl.berkeley.edu -- Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Mon, 1 Jan 2007, Paul Mundt wrote: > On Sat, Dec 30, 2006 at 06:04:14PM -0500, Robert P. J. Day wrote: > > fair enough. *technically*, not every call of the form > > "memset(ptr,0,PAGE_SIZE)" necessarily represents an address that's on > > a page boundary. but, *realistically*, i'm guessing most of them do. > > just grabbing a random example from some grep output: > > > > arch/sh/mm/init.c: > > ... > > /* clear the zero-page */ > > memset(empty_zero_page, 0, PAGE_SIZE); > > ... > > > The problem with random grepping is that it doesn't give you any > context. clear_page() isn't available in this case since we have a > couple of different ways of implementing it, and the optimal > approach is selected later on. There are also additional assumptions > regarding alignment that don't allow clear_page() to be used > directly as replacement for the memset() callsites (as has already > been pointed out for some of the other architectures). While the > empty_zero_page in this case sits on a full page boundary, others do > not. > > You might find some places in drivers that do this where you might > be able to optimize things slightly with a clear_page() (or > copy_page() in the memcpy() case), but it's going to need a lot of > manual auditing rather than a find and replace. Any sort of wins you > get out of this would be marginal at best, anyways. > > The more interesting case would be page clustering/bulk page > clearing with offload engines, and there's certainly room to build > on the SGI patches for this. your point is well taken -- i wasn't trying to suggest that a blind cut-and-replace would be appropriate, only that there were an awful lot of places where it wasn't clear that that kind of replacement *wasn't* appropriate. or perhaps even recommended. (doing that kind of search in the drivers/ directory would perhaps be more meaningful than in the arch/ directory. just my luck i picked a bad example.) clearly, that kind of replacement might require manual intervention in a lot of cases, no question. as with other examples i've brought up here, i'm just looking at this from a relatively newbie perspective, where i'm perusing the code and, in this case, got to thinking, "gee, given that every architecture defines a clear_page() macro, i wonder why all these people keep calling memset()." that's all. kind of like how, given that include/linux/gfp.h defines the macro __get_dma_pages(), so many people persist in calling __get_free_pages() with a GFP_DMA setting. that sort of thing. :-) rday - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Sat, Dec 30, 2006 at 06:04:14PM -0500, Robert P. J. Day wrote: > fair enough. *technically*, not every call of the form > "memset(ptr,0,PAGE_SIZE)" necessarily represents an address that's on > a page boundary. but, *realistically*, i'm guessing most of them do. > just grabbing a random example from some grep output: > > arch/sh/mm/init.c: > ... > /* clear the zero-page */ > memset(empty_zero_page, 0, PAGE_SIZE); > ... > The problem with random grepping is that it doesn't give you any context. clear_page() isn't available in this case since we have a couple of different ways of implementing it, and the optimal approach is selected later on. There are also additional assumptions regarding alignment that don't allow clear_page() to be used directly as replacement for the memset() callsites (as has already been pointed out for some of the other architectures). While the empty_zero_page in this case sits on a full page boundary, others do not. You might find some places in drivers that do this where you might be able to optimize things slightly with a clear_page() (or copy_page() in the memcpy() case), but it's going to need a lot of manual auditing rather than a find and replace. Any sort of wins you get out of this would be marginal at best, anyways. The more interesting case would be page clustering/bulk page clearing with offload engines, and there's certainly room to build on the SGI patches for this. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
> arjan, you and i actually agree on this. i fully accept that the idea > of a "clear_page()" call might or should have extra semantics, > compared to the more simple and direct "memset(...,0,PAGE_SIZE)" call > (such as alignment requirements, for example). my observation is > simply that this is not what is currently happening. that's fair > > consider, for example, how many calls there are to clear_page() in the > drivers directory: > > $ grep -rw clear_page drivers > > not that many. the biggest user of clear_page and such is the pagefault code path in practice. > i can't believe that at least *some* of those memset() calls couldn't > be re-written as clear_page() calls. and that's just for the > drivers/ directory. yes I can believe that > > sure, clear_page() might have extra semantics. but if that's the > case, and those semantics happen to be in play, i'm suggesting that > not only *can* one use clear_page() at that point, one *should* use > it. > > put another way, if a given situation is appropriate for a call to > clear_page(), then that's what should be used. however there is potentially a bigger thing possible. These places that zero a whole full page may have just allocated it (that's an assumption on my side), and if that's the case, maybe those places instead should use the zeroing version of the allocator instead (which internally uses clear_page() ). So... yes I fully agree with you that it's worth looking at the memset( , PAGE_SIZE) users. If they are page aligned, yes absolutely make it a clear_page(), I think that's a very good idea. However also please check if they've been very recently allocated in that code, and if maybe the zeroing allocators are better suited there.. (or maybe there's even double zeroing going on.. that's be a nice gain) -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Sun, 31 Dec 2006, Arjan van de Ven wrote: > On Sun, 2006-12-31 at 14:39 +0100, Folkert van Heusden wrote: > > > > i don't see how that can be true, given that most of the definitions > > > > of the clear_page() macro are simply invocations of memset(). see for > > > > yourself: > > > *MOST*. Not all. > > > For example an SSE version will at least assume 16 byte alignment, etc > > > etc. > > > > What about an if (adress & 15) { memset } else { sse stuff } > > or is that too obvious? :-) > > it's only one example. clear_page() working only on a full page is a > nice restriction that allows the implementation to be optimized > (again the x86 hardware that had a hardware page zeroer comes to > mind, the hw is only 4 years old or so... and future hw may have it > again) > > clear_page() is more restricted than memset(). And that's good, it > allows for a more focused implementation. Otherwise there'd be no > reason to HAVE a clear_page(), if it just was a memset wrapper > entirely. (just one more note about this, then i'll stop dragging it out. i didn't mean to get this long-winded about it.) arjan, you and i actually agree on this. i fully accept that the idea of a "clear_page()" call might or should have extra semantics, compared to the more simple and direct "memset(...,0,PAGE_SIZE)" call (such as alignment requirements, for example). my observation is simply that this is not what is currently happening. consider, for example, how many calls there are to clear_page() in the drivers directory: $ grep -rw clear_page drivers not that many. now how many calls are there of the memset variety? $ grep -Er "memset(.*0, ?PAGE_SIZE)" drivers i can't believe that at least *some* of those memset() calls couldn't be re-written as clear_page() calls. and that's just for the drivers/ directory. sure, clear_page() might have extra semantics. but if that's the case, and those semantics happen to be in play, i'm suggesting that not only *can* one use clear_page() at that point, one *should* use it. put another way, if a given situation is appropriate for a call to clear_page(), then that's what should be used. because if one sees instead a call to an equivalent memset(), that might suggest that there's something *preventing* the use of clear_page() -- that it's not appropriate. and, really, there's no need to be unnecessarily confusing. this is just another example of the basic kernel infrastructure defining lots of useful features (ARRAY_SIZE, etc.) and lots of programmers not using them for one reason or another. in this situation with clear_page(), the semantics of that call should be defined clearly and, when the situation arises, i think that call should be used unless there's a clear reason *not* to. it just makes the code easier to read. and on that note, i'll let this one go. others are free to follow up. rday - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Sun, 2006-12-31 at 14:39 +0100, Folkert van Heusden wrote: > > > i don't see how that can be true, given that most of the definitions > > > of the clear_page() macro are simply invocations of memset(). see for > > > yourself: > > *MOST*. Not all. > > For example an SSE version will at least assume 16 byte alignment, etc > > etc. > > What about an if (adress & 15) { memset } else { sse stuff } > or is that too obvious? :-) it's only one example. clear_page() working only on a full page is a nice restriction that allows the implementation to be optimized (again the x86 hardware that had a hardware page zeroer comes to mind, the hw is only 4 years old or so... and future hw may have it again) clear_page() is more restricted than memset(). And that's good, it allows for a more focused implementation. Otherwise there'd be no reason to HAVE a clear_page(), if it just was a memset wrapper entirely. -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
> > i don't see how that can be true, given that most of the definitions > > of the clear_page() macro are simply invocations of memset(). see for > > yourself: > *MOST*. Not all. > For example an SSE version will at least assume 16 byte alignment, etc > etc. What about an if (adress & 15) { memset } else { sse stuff } or is that too obvious? :-) > clear_page() is supposed to be for full real pages only... for example > it allows the architecture to optimize for alignment, cache aliasing etc > etc. (and if there are cpus that get a "clear an entire page" > instruction there has been hardware like that in the past, even on > x86, just it's no longer sold afaik) Folkert van Heusden -- www.biglumber.com <- site where one can exchange PGP key signatures -- Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
i don't see how that can be true, given that most of the definitions of the clear_page() macro are simply invocations of memset(). see for yourself: *MOST*. Not all. For example an SSE version will at least assume 16 byte alignment, etc etc. What about an if (adress 15) { memset } else { sse stuff } or is that too obvious? :-) clear_page() is supposed to be for full real pages only... for example it allows the architecture to optimize for alignment, cache aliasing etc etc. (and if there are cpus that get a clear an entire page instruction there has been hardware like that in the past, even on x86, just it's no longer sold afaik) Folkert van Heusden -- www.biglumber.com - site where one can exchange PGP key signatures -- Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Sun, 2006-12-31 at 14:39 +0100, Folkert van Heusden wrote: i don't see how that can be true, given that most of the definitions of the clear_page() macro are simply invocations of memset(). see for yourself: *MOST*. Not all. For example an SSE version will at least assume 16 byte alignment, etc etc. What about an if (adress 15) { memset } else { sse stuff } or is that too obvious? :-) it's only one example. clear_page() working only on a full page is a nice restriction that allows the implementation to be optimized (again the x86 hardware that had a hardware page zeroer comes to mind, the hw is only 4 years old or so... and future hw may have it again) clear_page() is more restricted than memset(). And that's good, it allows for a more focused implementation. Otherwise there'd be no reason to HAVE a clear_page(), if it just was a memset wrapper entirely. -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Sun, 31 Dec 2006, Arjan van de Ven wrote: On Sun, 2006-12-31 at 14:39 +0100, Folkert van Heusden wrote: i don't see how that can be true, given that most of the definitions of the clear_page() macro are simply invocations of memset(). see for yourself: *MOST*. Not all. For example an SSE version will at least assume 16 byte alignment, etc etc. What about an if (adress 15) { memset } else { sse stuff } or is that too obvious? :-) it's only one example. clear_page() working only on a full page is a nice restriction that allows the implementation to be optimized (again the x86 hardware that had a hardware page zeroer comes to mind, the hw is only 4 years old or so... and future hw may have it again) clear_page() is more restricted than memset(). And that's good, it allows for a more focused implementation. Otherwise there'd be no reason to HAVE a clear_page(), if it just was a memset wrapper entirely. (just one more note about this, then i'll stop dragging it out. i didn't mean to get this long-winded about it.) arjan, you and i actually agree on this. i fully accept that the idea of a clear_page() call might or should have extra semantics, compared to the more simple and direct memset(...,0,PAGE_SIZE) call (such as alignment requirements, for example). my observation is simply that this is not what is currently happening. consider, for example, how many calls there are to clear_page() in the drivers directory: $ grep -rw clear_page drivers not that many. now how many calls are there of the memset variety? $ grep -Er memset(.*0, ?PAGE_SIZE) drivers i can't believe that at least *some* of those memset() calls couldn't be re-written as clear_page() calls. and that's just for the drivers/ directory. sure, clear_page() might have extra semantics. but if that's the case, and those semantics happen to be in play, i'm suggesting that not only *can* one use clear_page() at that point, one *should* use it. put another way, if a given situation is appropriate for a call to clear_page(), then that's what should be used. because if one sees instead a call to an equivalent memset(), that might suggest that there's something *preventing* the use of clear_page() -- that it's not appropriate. and, really, there's no need to be unnecessarily confusing. this is just another example of the basic kernel infrastructure defining lots of useful features (ARRAY_SIZE, etc.) and lots of programmers not using them for one reason or another. in this situation with clear_page(), the semantics of that call should be defined clearly and, when the situation arises, i think that call should be used unless there's a clear reason *not* to. it just makes the code easier to read. and on that note, i'll let this one go. others are free to follow up. rday - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
arjan, you and i actually agree on this. i fully accept that the idea of a clear_page() call might or should have extra semantics, compared to the more simple and direct memset(...,0,PAGE_SIZE) call (such as alignment requirements, for example). my observation is simply that this is not what is currently happening. that's fair consider, for example, how many calls there are to clear_page() in the drivers directory: $ grep -rw clear_page drivers not that many. the biggest user of clear_page and such is the pagefault code path in practice. i can't believe that at least *some* of those memset() calls couldn't be re-written as clear_page() calls. and that's just for the drivers/ directory. yes I can believe that sure, clear_page() might have extra semantics. but if that's the case, and those semantics happen to be in play, i'm suggesting that not only *can* one use clear_page() at that point, one *should* use it. put another way, if a given situation is appropriate for a call to clear_page(), then that's what should be used. however there is potentially a bigger thing possible. These places that zero a whole full page may have just allocated it (that's an assumption on my side), and if that's the case, maybe those places instead should use the zeroing version of the allocator instead (which internally uses clear_page() ). So... yes I fully agree with you that it's worth looking at the memset( , PAGE_SIZE) users. If they are page aligned, yes absolutely make it a clear_page(), I think that's a very good idea. However also please check if they've been very recently allocated in that code, and if maybe the zeroing allocators are better suited there.. (or maybe there's even double zeroing going on.. that's be a nice gain) -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Sat, Dec 30, 2006 at 06:04:14PM -0500, Robert P. J. Day wrote: fair enough. *technically*, not every call of the form memset(ptr,0,PAGE_SIZE) necessarily represents an address that's on a page boundary. but, *realistically*, i'm guessing most of them do. just grabbing a random example from some grep output: arch/sh/mm/init.c: ... /* clear the zero-page */ memset(empty_zero_page, 0, PAGE_SIZE); ... The problem with random grepping is that it doesn't give you any context. clear_page() isn't available in this case since we have a couple of different ways of implementing it, and the optimal approach is selected later on. There are also additional assumptions regarding alignment that don't allow clear_page() to be used directly as replacement for the memset() callsites (as has already been pointed out for some of the other architectures). While the empty_zero_page in this case sits on a full page boundary, others do not. You might find some places in drivers that do this where you might be able to optimize things slightly with a clear_page() (or copy_page() in the memcpy() case), but it's going to need a lot of manual auditing rather than a find and replace. Any sort of wins you get out of this would be marginal at best, anyways. The more interesting case would be page clustering/bulk page clearing with offload engines, and there's certainly room to build on the SGI patches for this. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Mon, 1 Jan 2007, Paul Mundt wrote: On Sat, Dec 30, 2006 at 06:04:14PM -0500, Robert P. J. Day wrote: fair enough. *technically*, not every call of the form memset(ptr,0,PAGE_SIZE) necessarily represents an address that's on a page boundary. but, *realistically*, i'm guessing most of them do. just grabbing a random example from some grep output: arch/sh/mm/init.c: ... /* clear the zero-page */ memset(empty_zero_page, 0, PAGE_SIZE); ... The problem with random grepping is that it doesn't give you any context. clear_page() isn't available in this case since we have a couple of different ways of implementing it, and the optimal approach is selected later on. There are also additional assumptions regarding alignment that don't allow clear_page() to be used directly as replacement for the memset() callsites (as has already been pointed out for some of the other architectures). While the empty_zero_page in this case sits on a full page boundary, others do not. You might find some places in drivers that do this where you might be able to optimize things slightly with a clear_page() (or copy_page() in the memcpy() case), but it's going to need a lot of manual auditing rather than a find and replace. Any sort of wins you get out of this would be marginal at best, anyways. The more interesting case would be page clustering/bulk page clearing with offload engines, and there's certainly room to build on the SGI patches for this. your point is well taken -- i wasn't trying to suggest that a blind cut-and-replace would be appropriate, only that there were an awful lot of places where it wasn't clear that that kind of replacement *wasn't* appropriate. or perhaps even recommended. (doing that kind of search in the drivers/ directory would perhaps be more meaningful than in the arch/ directory. just my luck i picked a bad example.) clearly, that kind of replacement might require manual intervention in a lot of cases, no question. as with other examples i've brought up here, i'm just looking at this from a relatively newbie perspective, where i'm perusing the code and, in this case, got to thinking, gee, given that every architecture defines a clear_page() macro, i wonder why all these people keep calling memset(). that's all. kind of like how, given that include/linux/gfp.h defines the macro __get_dma_pages(), so many people persist in calling __get_free_pages() with a GFP_DMA setting. that sort of thing. :-) rday - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
regarding alignment that don't allow clear_page() to be used copy_page() in the memcpy() case), but it's going to need a lot of Maybe these optimalisations should be in the coding style docs? Folkert van Heusden -- Ever wonder what is out there? Any alien races? Then please support the [EMAIL PROTECTED] project: setiathome.ssl.berkeley.edu -- Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Sat, 30 Dec 2006, Arjan van de Ven wrote: > rday wrote: > > ... most of the definitions of the clear_page() macro are simply > > invocations of memset(). see for yourself: > *MOST*. Not all. i did notice that. while the majority of the architectures simply define clear_page() as a macro calling memset(ptr, 0, PAGE_SIZE), the rest will implement it in assembler code for whatever reason. (i'm assuming that *every* architecture *must* define clear_page() one way or the other, is that correct? that would seem fairly obvious, but i just want to make sure i'm not missing anything obvious.) > clear_page() is supposed to be for full real pages only... for > example it allows the architecture to optimize for alignment, cache > aliasing etc etc. fair enough. *technically*, not every call of the form "memset(ptr,0,PAGE_SIZE)" necessarily represents an address that's on a page boundary. but, *realistically*, i'm guessing most of them do. just grabbing a random example from some grep output: arch/sh/mm/init.c: ... /* clear the zero-page */ memset(empty_zero_page, 0, PAGE_SIZE); ... my only point here is that, given that every architecture needs to supply some kind of definition of a "clear_page()" routine, one would think that *lots* of those memset() calls could reasonably be rewritten as a clear_page() call for improved readibility, no? and there are a *lot* of memset() calls like that. rday - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
> i don't see how that can be true, given that most of the definitions > of the clear_page() macro are simply invocations of memset(). see for > yourself: *MOST*. Not all. For example an SSE version will at least assume 16 byte alignment, etc etc. clear_page() is supposed to be for full real pages only... for example it allows the architecture to optimize for alignment, cache aliasing etc etc. (and if there are cpus that get a "clear an entire page" instruction there has been hardware like that in the past, even on x86, just it's no longer sold afaik) -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Saturday 30 December 2006 23:08, Robert P. J. Day wrote: > > > > clear_page assumes that given address is page aligned, I think. It > > may fail if you feed it with misaligned region's address. > > i don't see how that can be true, given that most of the definitions > of the clear_page() macro are simply invocations of memset(). see for > yourself: > > $ grep -r "#define clear_page" include > > my only point here was that lots of code seems to be calling memset() > when it would be clearer to invoke clear_page(). but there's still > something a bit curious happening here. i'll poke around a bit more > before i ask, though. There are MMX implementations of clear_page(). I was experimenting with SSE[2] clear_page() which uses non-temporal stores. That one requires 16 byte alignment. BTW, it worked ~300% faster than memset. But Andi Kleen insists that cache eviction caused by NT stores will make it slower in macrobenchmark. Apart from fairly extensive set of microbechmarks I tested kernel compiles (i.e. "real world load") and they are FASTER too, not slower, but Andi is fairly entrenched in his opinion ;) I gave up. -- vda - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Sat, 30 Dec 2006, Denis Vlasenko wrote: > On Friday 29 December 2006 07:16, Robert P. J. Day wrote: > > > > is there some reason there are so many calls of the form > > > > memset(addr, 0, PAGE_SIZE) > > > > rather than the apparently equivalent invocation of > > > > clear_page(addr) > > > > the majority of architectures appear to define the clear_page() > > macro in their include//page.h header file, but not entirely > > identically, and in some cases that definition is conditional, as > > with i386: > > > > = > > #ifdef CONFIG_X86_USE_3DNOW > > ... > > #define clear_page(page)mmx_clear_page((void *)(page)) > > ... > > #else > > ... > > #define clear_page(page)memset((void *)(page), 0, PAGE_SIZE) > > ... > > #endif > > > > > > should it perhaps be part of the CodingStyle doc to use the > > clear_page() macro rather than an explicit call to memset()? > > (and should all architectures be required to define that macro?) > > clear_page assumes that given address is page aligned, I think. It > may fail if you feed it with misaligned region's address. i don't see how that can be true, given that most of the definitions of the clear_page() macro are simply invocations of memset(). see for yourself: $ grep -r "#define clear_page" include my only point here was that lots of code seems to be calling memset() when it would be clearer to invoke clear_page(). but there's still something a bit curious happening here. i'll poke around a bit more before i ask, though. rday - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
On Friday 29 December 2006 07:16, Robert P. J. Day wrote: > > is there some reason there are so many calls of the form > > memset(addr, 0, PAGE_SIZE) > > rather than the apparently equivalent invocation of > > clear_page(addr) > > the majority of architectures appear to define the clear_page() macro > in their include//page.h header file, but not entirely > identically, and in some cases that definition is conditional, as with > i386: > > = > #ifdef CONFIG_X86_USE_3DNOW > ... > #define clear_page(page)mmx_clear_page((void *)(page)) > ... > #else > ... > #define clear_page(page)memset((void *)(page), 0, PAGE_SIZE) > ... > #endif > > > should it perhaps be part of the CodingStyle doc to use the > clear_page() macro rather than an explicit call to memset()? (and > should all architectures be required to define that macro?) clear_page assumes that given address is page aligned, I think. It may fail if you feed it with misaligned region's address. -- vda - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Friday 29 December 2006 07:16, Robert P. J. Day wrote: is there some reason there are so many calls of the form memset(addr, 0, PAGE_SIZE) rather than the apparently equivalent invocation of clear_page(addr) the majority of architectures appear to define the clear_page() macro in their include/arch/page.h header file, but not entirely identically, and in some cases that definition is conditional, as with i386: = #ifdef CONFIG_X86_USE_3DNOW ... #define clear_page(page)mmx_clear_page((void *)(page)) ... #else ... #define clear_page(page)memset((void *)(page), 0, PAGE_SIZE) ... #endif should it perhaps be part of the CodingStyle doc to use the clear_page() macro rather than an explicit call to memset()? (and should all architectures be required to define that macro?) clear_page assumes that given address is page aligned, I think. It may fail if you feed it with misaligned region's address. -- vda - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Sat, 30 Dec 2006, Denis Vlasenko wrote: On Friday 29 December 2006 07:16, Robert P. J. Day wrote: is there some reason there are so many calls of the form memset(addr, 0, PAGE_SIZE) rather than the apparently equivalent invocation of clear_page(addr) the majority of architectures appear to define the clear_page() macro in their include/arch/page.h header file, but not entirely identically, and in some cases that definition is conditional, as with i386: = #ifdef CONFIG_X86_USE_3DNOW ... #define clear_page(page)mmx_clear_page((void *)(page)) ... #else ... #define clear_page(page)memset((void *)(page), 0, PAGE_SIZE) ... #endif should it perhaps be part of the CodingStyle doc to use the clear_page() macro rather than an explicit call to memset()? (and should all architectures be required to define that macro?) clear_page assumes that given address is page aligned, I think. It may fail if you feed it with misaligned region's address. i don't see how that can be true, given that most of the definitions of the clear_page() macro are simply invocations of memset(). see for yourself: $ grep -r #define clear_page include my only point here was that lots of code seems to be calling memset() when it would be clearer to invoke clear_page(). but there's still something a bit curious happening here. i'll poke around a bit more before i ask, though. rday - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Saturday 30 December 2006 23:08, Robert P. J. Day wrote: clear_page assumes that given address is page aligned, I think. It may fail if you feed it with misaligned region's address. i don't see how that can be true, given that most of the definitions of the clear_page() macro are simply invocations of memset(). see for yourself: $ grep -r #define clear_page include my only point here was that lots of code seems to be calling memset() when it would be clearer to invoke clear_page(). but there's still something a bit curious happening here. i'll poke around a bit more before i ask, though. There are MMX implementations of clear_page(). I was experimenting with SSE[2] clear_page() which uses non-temporal stores. That one requires 16 byte alignment. BTW, it worked ~300% faster than memset. But Andi Kleen insists that cache eviction caused by NT stores will make it slower in macrobenchmark. Apart from fairly extensive set of microbechmarks I tested kernel compiles (i.e. real world load) and they are FASTER too, not slower, but Andi is fairly entrenched in his opinion ;) I gave up. -- vda - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
i don't see how that can be true, given that most of the definitions of the clear_page() macro are simply invocations of memset(). see for yourself: *MOST*. Not all. For example an SSE version will at least assume 16 byte alignment, etc etc. clear_page() is supposed to be for full real pages only... for example it allows the architecture to optimize for alignment, cache aliasing etc etc. (and if there are cpus that get a clear an entire page instruction there has been hardware like that in the past, even on x86, just it's no longer sold afaik) -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: replace memset(...,0,PAGE_SIZE) calls with clear_page()?
On Sat, 30 Dec 2006, Arjan van de Ven wrote: rday wrote: ... most of the definitions of the clear_page() macro are simply invocations of memset(). see for yourself: *MOST*. Not all. i did notice that. while the majority of the architectures simply define clear_page() as a macro calling memset(ptr, 0, PAGE_SIZE), the rest will implement it in assembler code for whatever reason. (i'm assuming that *every* architecture *must* define clear_page() one way or the other, is that correct? that would seem fairly obvious, but i just want to make sure i'm not missing anything obvious.) clear_page() is supposed to be for full real pages only... for example it allows the architecture to optimize for alignment, cache aliasing etc etc. fair enough. *technically*, not every call of the form memset(ptr,0,PAGE_SIZE) necessarily represents an address that's on a page boundary. but, *realistically*, i'm guessing most of them do. just grabbing a random example from some grep output: arch/sh/mm/init.c: ... /* clear the zero-page */ memset(empty_zero_page, 0, PAGE_SIZE); ... my only point here is that, given that every architecture needs to supply some kind of definition of a clear_page() routine, one would think that *lots* of those memset() calls could reasonably be rewritten as a clear_page() call for improved readibility, no? and there are a *lot* of memset() calls like that. rday - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?
is there some reason there are so many calls of the form memset(addr, 0, PAGE_SIZE) rather than the apparently equivalent invocation of clear_page(addr) the majority of architectures appear to define the clear_page() macro in their include//page.h header file, but not entirely identically, and in some cases that definition is conditional, as with i386: = #ifdef CONFIG_X86_USE_3DNOW ... #define clear_page(page)mmx_clear_page((void *)(page)) ... #else ... #define clear_page(page)memset((void *)(page), 0, PAGE_SIZE) ... #endif should it perhaps be part of the CodingStyle doc to use the clear_page() macro rather than an explicit call to memset()? (and should all architectures be required to define that macro?) rday - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
replace memset(...,0,PAGE_SIZE) calls with clear_page()?
is there some reason there are so many calls of the form memset(addr, 0, PAGE_SIZE) rather than the apparently equivalent invocation of clear_page(addr) the majority of architectures appear to define the clear_page() macro in their include/arch/page.h header file, but not entirely identically, and in some cases that definition is conditional, as with i386: = #ifdef CONFIG_X86_USE_3DNOW ... #define clear_page(page)mmx_clear_page((void *)(page)) ... #else ... #define clear_page(page)memset((void *)(page), 0, PAGE_SIZE) ... #endif should it perhaps be part of the CodingStyle doc to use the clear_page() macro rather than an explicit call to memset()? (and should all architectures be required to define that macro?) rday - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/