On 6/23/07, Oleg Verych <[EMAIL PROTECTED]> wrote:
Why not just show actual objdump output on code (maybe with different oxygen atoms used in gcc), rather than *talking* about optimization and standards, hm?
Here is the objdump output of the two object files. As you can see, the older one uses 0x38 bytes of stack space while the new one uses 0x28 bytes, and the object code is two instructions (11 bytes) shorter. I think all of these benefits come from gcc's __builtin_memset optimization rather than the explicit call to memset.
$ objdump -d /tmp/init.orig.o|grep -A23 -nw '<paging_init>'
525:0000000000000395 <paging_init>:
526- 395:   48 83 ec 38             sub    $0x38,%rsp
527- 399:   48 8d 54 24 10          lea    0x10(%rsp),%rdx
528- 39e:   fc                      cld
529- 39f:   31 c0                   xor    %eax,%eax
530- 3a1:   48 89 d7                mov    %rdx,%rdi
531- 3a4:   ab                      stos   %eax,%es:(%rdi)
532- 3a5:   ab                      stos   %eax,%es:(%rdi)
533- 3a6:   ab                      stos   %eax,%es:(%rdi)
534- 3a7:   ab                      stos   %eax,%es:(%rdi)
535- 3a8:   ab                      stos   %eax,%es:(%rdi)
536- 3a9:   48 89 7c 24 08          mov    %rdi,0x8(%rsp)
537- 3ae:   ab                      stos   %eax,%es:(%rdi)
538- 3af:   48 c7 44 24 10 00 10    movq   $0x1000,0x10(%rsp)
539- 3b6:   00 00
540- 3b8:   48 c7 44 24 18 00 00    movq   $0x100000,0x18(%rsp)
541- 3bf:   10 00
542- 3c1:   48 8b 05 00 00 00 00    mov    0(%rip),%rax        # 3c8 <paging_init+0x33>
543- 3c8:   48 89 44 24 20          mov    %rax,0x20(%rsp)
544- 3cd:   48 89 d7                mov    %rdx,%rdi
545- 3d0:   e8 00 00 00 00          callq  3d5 <paging_init+0x40>
546- 3d5:   48 83 c4 38             add    $0x38,%rsp
547- 3d9:   c3                      retq
548-

$ objdump -d /tmp/init.new.o|grep -A23 -nw '<paging_init>'
525:0000000000000395 <paging_init>:
526- 395:   48 83 ec 28             sub    $0x28,%rsp
527- 399:   48 89 e7                mov    %rsp,%rdi
528- 39c:   fc                      cld
529- 39d:   31 c0                   xor    %eax,%eax
530- 39f:   ab                      stos   %eax,%es:(%rdi)
531- 3a0:   ab                      stos   %eax,%es:(%rdi)
532- 3a1:   ab                      stos   %eax,%es:(%rdi)
533- 3a2:   ab                      stos   %eax,%es:(%rdi)
534- 3a3:   ab                      stos   %eax,%es:(%rdi)
535- 3a4:   ab                      stos   %eax,%es:(%rdi)
536- 3a5:   48 c7 04 24 00 10 00    movq   $0x1000,(%rsp)
537- 3ac:   00
538- 3ad:   48 c7 44 24 08 00 00    movq   $0x100000,0x8(%rsp)
539- 3b4:   10 00
540- 3b6:   48 8b 05 00 00 00 00    mov    0(%rip),%rax        # 3bd <paging_init+0x28>
541- 3bd:   48 89 44 24 10          mov    %rax,0x10(%rsp)
542- 3c2:   48 89 e7                mov    %rsp,%rdi
543- 3c5:   e8 00 00 00 00          callq  3ca <paging_init+0x35>
544- 3ca:   48 83 c4 28             add    $0x28,%rsp
545- 3ce:   c3                      retq
546-
547-00000000000003cf <alloc_low_page>:
548- 3cf:   41 56                   push   %r14
I bet that will be a key for success. And if you are interested in such optimizations, why not grep the whole source tree for this kind of thing? I'm not sure that one function in arch/x86_64 is the only ``unoptimized'' one. And after doing that, maybe you will see that the "{}" initializer can be applied not only to integer values (you did init a *long int* with an *int*, btw), but to structs and others.
With the '{}' initializer, gcc fills the object's memory with zeros. As for other places that could be optimized, I picked this trivial one as a first step, and I would like to see how people comment on it. If this optimization is tested and shown to be correct, it can serve as an example, and I will try others.
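To make the comparison concrete, here is a minimal sketch of the two zeroing styles being discussed. It is not the actual paging_init() code: NR_ZONES, struct pfn_range and all function names are hypothetical, and the two stored values are simply the constants visible in the disassembly. Note that the empty '{}' form is a GNU extension before C23, so the sketch uses '{ 0 }', the standard spelling with the same effect.

```c
#include <string.h>

#define NR_ZONES 3	/* hypothetical size, for illustration only */

/* Hypothetical struct, to show the initializer also works on structs. */
struct pfn_range {
	unsigned long start;
	unsigned long end;
};

/* Old style: declare the array, then zero it with an explicit memset(). */
void zones_with_memset(unsigned long out[NR_ZONES])
{
	unsigned long zones[NR_ZONES];

	memset(zones, 0, sizeof(zones));
	zones[0] = 0x1000;
	zones[1] = 0x100000;
	memcpy(out, zones, sizeof(zones));
}

/* New style: let the compiler zero the array through its initializer. */
void zones_with_initializer(unsigned long out[NR_ZONES])
{
	unsigned long zones[NR_ZONES] = { 0 };

	zones[0] = 0x1000;
	zones[1] = 0x100000;
	memcpy(out, zones, sizeof(zones));
}

/* The same initializer zeroes every member of a struct. */
struct pfn_range empty_range(void)
{
	struct pfn_range r = { 0 };

	return r;
}

/* Returns 1 when both styles produce identical array contents. */
int zones_match(void)
{
	unsigned long a[NR_ZONES], b[NR_ZONES];

	zones_with_memset(a);
	zones_with_initializer(b);
	return memcmp(a, b, sizeof(a)) == 0;
}
```

Both functions produce the same data, so any difference would only show up in the generated code. Compiling this file with something like gcc -O2 -c and running objdump -d on the result is one way to check whether a given gcc version treats the two forms differently.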
Ahh, one more thing about _optimizing_ your time, i.e. not wasting it. Add to the CC list the people who have already replied to your patch. Otherwise you are showing your disrespect for them and hiding from further discussion.
Thank you, I know that, and I've already subscribed to the linux kernel mailing list (linux-kernel@vger.kernel.org) so that I won't miss any further discussion about it.
I think you do not, but Linux development does not have an automatic system for patch tracking, so you are on your own with your text editor and e-mail client on this. Please take care of your time.
What about that? Do you mean something such as git by "an automatic system"?
-- frenzy -o--=O`C #oo'L O <___=E M
--
Denis Cheng
Linux Application Developer