Re: [E-devel] Re: libast from Eterm CVS fails to build with gcc4 on x86_64

2005-06-10 Thread Tres Melton
On Wed, 2005-06-08 at 19:20 -0700, Stephen Horner wrote:
> On 18:00, Mon 06 Jun 05, Tres Melton wrote:
> > Well, the pages I was reading used the nop trick but it looks like a
> > better solution has been presented.  I almost forgot that integers are
> > still 32 bit on x86-64, which explains the movl instead of movq.  For
> > those interested, the "gcc -S" output is (the #APP/#NO_APP is gcc's way of
> > marking inline asm):
> > 
> > ----- 8< ------
> > #APP
> > startit:
> > #NO_APP
> >         movl    $10, -20(%rbp)
> > #APP
> > stopit:
> > #NO_APP
> > ----- >8 ------
> 
> Forgive me if this seems obvious to most, but I'm unclear on exactly what you
> are saying here about #APP and #NO_APP in regards to labelling assembly code.
> I'm very new to assembly, and this sounded too interesting for me to just say
> to myself "meh i'll prolly learn it later . . ." ^_^ Also I was curious why,
> if you simply intend to label a code block before assembly, you don't just
> asm(" pants on "); if asm() allows for such a thing ( couldn't find man
> 2 asm lol ). At any rate, thanks in advance.
> 

First, any blocks of asm("[statements]" : [outputs] : [inputs] :
[clobbers] ); that gcc encounters will be placed between #APP and
#NO_APP in the generated assembly.  Second, Mike's "startit:" and
"stopit:" are labels, or jump destinations, and as such they must be
one word (no spaces, dashes, etc.), so something like " pants on "
would not work as a label.  Labels consume no space in the final code
(only within the intermediate files).
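
As a minimal sketch of that operand layout (not taken from the thread: the
function name add_longs is hypothetical, and the asm path assumes gcc or
clang targeting x86 with AT&T syntax, with a plain C fallback elsewhere):

```c
/* Hypothetical helper illustrating
 *   asm("statements" : outputs : inputs : clobbers);
 * It adds y to x and returns the sum. */
long add_longs(long x, long y)
{
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ ("add %1, %0"   /* statements: x += y (AT&T order: source, destination) */
             : "+r" (x)     /* outputs: x is both read and written */
             : "r" (y)      /* inputs: y, in any general-purpose register */
             : "cc");       /* clobbers: the add changes the flags */
#else
    x += y;                 /* portable fallback for other compilers/targets */
#endif
    return x;
}
```

Compiling this with "gcc -S" on x86-64 should show the add instruction
sitting between an #APP and a #NO_APP line.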

HTH,
-- 
Tres



---
This SF.Net email is sponsored by: NEC IT Guy Games.  How far can you shotput
a projector? How fast can you ride your desk chair down the office luge track?
If you want to score the big prize, get to know the little guy.  
Play to win an NEC 61" plasma display: http://www.necitguy.com/?r=20
___
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel


Re: [E-devel] Re: libast from Eterm CVS fails to build with gcc4 on x86_64

2005-06-06 Thread Tres Melton
On Mon, 2005-06-06 at 19:26 -0400, Mike Frysinger wrote:
> i use asm label tricks:
> $ cat test.c
> int main(int argc, char *argv[])
> {
>     int a;
>     asm("startit:");
>     a = 10;
>     asm("stopit:");
>     return 0;
> }
> 
> $ gcc -c test.c && objdump -d test.o
> test.o:     file format elf64-x86-64
> 
> Disassembly of section .text:
> 
> 0000000000000000 <main>:
>    0:   55                      push   %rbp
>    1:   48 89 e5                mov    %rsp,%rbp
>    4:   89 7d fc                mov    %edi,0xfffffffffffffffc(%rbp)
>    7:   48 89 75 f0             mov    %rsi,0xfffffffffffffff0(%rbp)
> 
> 000000000000000b <startit>:
>    b:   c7 45 ec 0a 00 00 00    movl   $0xa,0xffffffffffffffec(%rbp)
> 
> 0000000000000012 <stopit>:
>   12:   b8 00 00 00 00          mov    $0x0,%eax
>   17:   c9                      leaveq
>   18:   c3                      retq
> -mike

Well, the pages I was reading used the nop trick but it looks like a
better solution has been presented.  I almost forgot that integers are
still 32 bit on x86-64, which explains the movl instead of movq.  For
those interested, the "gcc -S" output is (the #APP/#NO_APP is gcc's way of
marking inline asm):

----- 8< ------
#APP
startit:
#NO_APP
        movl    $10, -20(%rbp)
#APP
stopit:
#NO_APP
----- >8 ------
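
To make the movl/movq point concrete, here is a small sketch (not from the
thread; att_suffix is a hypothetical helper) that maps an operand size in
bytes to the AT&T mnemonic suffix.  On x86-64 Linux sizeof(int) is 4, which
gives the "l" in movl, while sizeof(long) is 8, which would give movq:

```c
#include <stddef.h>

/* Hypothetical helper: AT&T operand-size suffix for an integer operand
 * that is `size` bytes wide.  Returns '?' for sizes without a suffix. */
char att_suffix(size_t size)
{
    switch (size) {
    case 1:  return 'b';   /* byte */
    case 2:  return 'w';   /* word, 16 bits */
    case 4:  return 'l';   /* long, 32 bits: an int on x86-64 Linux */
    case 8:  return 'q';   /* quad, 64 bits: a long or a pointer there */
    default: return '?';
    }
}
```

Calling att_suffix(sizeof(int)) and att_suffix(sizeof(long)) shows which
suffix to expect for each type on the platform at hand.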


Thanks Vapier,
-- 
Tres





Re: [E-devel] Re: libast from Eterm CVS fails to build with gcc4 on x86_64

2005-06-06 Thread Mike Frysinger
On Monday 06 June 2005 07:09 pm, Tres Melton wrote:
> The "nop" is just an easy way to place markers in the code so you
> can find the interesting parts quickly.  In reality it costs one clock
> cycle for each "nop".

i use asm label tricks:
$ cat test.c
int main(int argc, char *argv[])
{
    int a;
    asm("startit:");
    a = 10;
    asm("stopit:");
    return 0;
}

$ gcc -c test.c && objdump -d test.o
test.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <main>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   89 7d fc                mov    %edi,0xfffffffffffffffc(%rbp)
   7:   48 89 75 f0             mov    %rsi,0xfffffffffffffff0(%rbp)

000000000000000b <startit>:
   b:   c7 45 ec 0a 00 00 00    movl   $0xa,0xffffffffffffffec(%rbp)

0000000000000012 <stopit>:
  12:   b8 00 00 00 00          mov    $0x0,%eax
  17:   c9                      leaveq
  18:   c3                      retq
-mike




Re: [E-devel] Re: libast from Eterm CVS fails to build with gcc4 on x86_64

2005-06-06 Thread Tres Melton
On Mon, 2005-06-06 at 13:26 -0400, Michael Jennings wrote:

> > #define BINSWAP(a, b) \
> >    (((long) (a)) ^= ((long) (b)) ^= ((long) (a)) ^= ((long) (b)))
> > 
> > int main( void )
> > {
> >   long a = 3;
> >   long b = 8;
> > 
> >   asm( "noop;noop;noop" );
> >   BINSWAP(a,b);
> >   asm( "noop;noop;noop" );
> > 
> > }

I've been using the "gcc -S foo.c" trick to examine what gcc really does
ever since I made a large and incorrect assumption about the way that
gcc handles function parameters in my attempt to port Eterm's MMX stuff
to SSE2.  I was going too fast here: no-operation is actually spelled
"nop", not "noop".  The generated code shown is the same either way, but
with "noop" it won't assemble or run.  The "nop" is just an easy way to
place markers in the code so you can find the interesting parts quickly.
In reality it costs one clock cycle for each "nop".
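
A minimal sketch of the marker trick with the mnemonic spelled correctly
(marked_assignment is a hypothetical name; this assumes gcc or clang, and the
volatile and "memory" bits are only there so the store stays between the
markers when optimization is enabled):

```c
/* Hypothetical example: bracket one assignment with nop markers so it is
 * easy to spot in the output of "gcc -S". */
int marked_assignment(void)
{
    volatile int a;   /* volatile so the store is kept at -O[123] */

    __asm__ volatile ("nop\n\tnop\n\tnop" ::: "memory");   /* start marker */
    a = 10;
    __asm__ volatile ("nop\n\tnop\n\tnop" ::: "memory");   /* end marker */

    return a;
}
```

Searching the generated .s file for the two groups of three nops should lead
straight to the instructions emitted for the assignment.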
 
> > yields:
> > 
> > noop;noop;noop
> >         movq    -16(%rbp), %rdx
> >         leaq    -8(%rbp), %rax
> >         xorq    %rdx, (%rax)
> >         movq    -8(%rbp), %rdx
> >         leaq    -16(%rbp), %rax
> >         xorq    %rdx, (%rax)
> >         movq    -16(%rbp), %rdx
> >         leaq    -8(%rbp), %rax
> >         xorq    %rdx, (%rax)
> > noop;noop;noop
> > 
> > If you enable -O[123] then you will need to use the values a & b before
> > and after the BINSWAP call or they will be optimized away.  And simply
> > using immediate values like I did will cause the compiler to simply set
> > the different registers that are used to access them in reverse order.
> > In other words the swap gets optimized out.  The above code is without
> > -O and is clearly more complicated (by more than double) than it needs
> > to be.
> 
> Interesting.  You think I should just get rid of it then?
--->  snip  <---
> It's actually slower normally but faster optimized:
> 
> -O0
> Profiling SWAP() macro...3000000 iterations in 0.052468 seconds, 1.7489e-08 seconds per iteration
> Profiling BINSWAP() macro...3000000 iterations in 0.067905 seconds, 2.2635e-08 seconds per iteration
> 
> -O2
> Profiling SWAP() macro...3000000 iterations in 0.014328 seconds, 4.776e-09 seconds per iteration
> Profiling BINSWAP() macro...3000000 iterations in 0.014183 seconds, 4.7277e-09 seconds per iteration
> 
> (Done with libast's "make perf")

The performance difference at -O2 is about 1% according to your profiling
(0.014328 vs. 0.014183 seconds), so the question is:  Do you, the author
and maintainer, think that a 1% performance difference is worth the
maintenance problems that it might cause?  Another thought is that you
are running Linux and there is a possibility of the process getting
preempted and messing with the timings.  Caching issues might have
arisen, and if the 3 million values aren't actually used they might not
really be calculated; I still can't predict, with any certainty, how
gcc's optimizer works.  That's why I look at the output, and why my test
program didn't actually perform a swap (I didn't use the values) with
-O[123].  In practice, how many times is it called?  To really test it
you might want to consider loading a large pixmap, creating a black
pixmap of the same size, and swapping the pixels between the two.  At
least then you can be sure that the values are used.  Maybe even average
100 runs to smooth out the effect of L1/L2/L3 cache hits and misses.
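
A rough sketch of that kind of test, assuming two plain arrays of long stand
in for the pixmaps (swap_buffers and time_swaps are hypothetical names, not
libast functions, and clock() measures CPU time only coarsely):

```c
#include <stddef.h>
#include <time.h>

/* Swap two equally sized buffers element by element through a temporary. */
void swap_buffers(long *x, long *y, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        long tmp = x[i];
        x[i] = y[i];
        y[i] = tmp;
    }
}

/* Hypothetical harness: run the swap `runs` times (runs > 0, n > 0) and
 * return the average CPU seconds per run.  A value from each run is folded
 * into *checksum so the swapped data is visibly used. */
double time_swaps(long *x, long *y, size_t n, int runs, long *checksum)
{
    clock_t start = clock();
    long sum = 0;

    for (int r = 0; r < runs; r++) {
        swap_buffers(x, y, n);
        sum += x[0] + y[n - 1];   /* use the results of every run */
    }
    *checksum = sum;
    return (double)(clock() - start) / CLOCKS_PER_SEC / runs;
}
```

Printing the checksum after the timing loop, and comparing it between the
SWAP() and BINSWAP() builds, is a cheap way to confirm that both variants
really did the same work.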

Did you try any of the asm stuff?  That is probably more of a
maintenance problem than the other stuff though.  If you are interested
I did find a way to reduce the instructions by another 25%.
Unfortunately the "xchg" op can't take two memory locations.

#define BINSWAP(a, b) \
  asm volatile( "movq  (%%rsi), %%rax \n\t" \
                "xchg  %%rax, (%%rdi) \n\t" \
                "movq  %%rax, (%%rsi) \n\t" \
                :: "S" (&(a)), "D" (&(b)) : "%rax", "memory" \
              )
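
For comparison, here is a sketch of the same xchg idea written as a function
that lets gcc pick the registers (swap_longs is a hypothetical name; the asm
path assumes gcc or clang on x86-64, with a plain C fallback elsewhere).
Declaring *b as a read-write memory operand tells the compiler exactly what
the asm touches.  One caveat: on x86 an xchg with a memory operand is
implicitly locked, so it may well be slower than ordinary moves.

```c
/* Hypothetical function variant of the xchg-based swap. */
void swap_longs(long *a, long *b)
{
#if defined(__GNUC__) && defined(__x86_64__)
    long tmp = *a;                    /* load *a into a register */
    __asm__ volatile ("xchg %0, %1"   /* exchange the register with *b */
                      : "+r" (tmp), "+m" (*b));
    *a = tmp;                         /* store the old *b into *a */
#else
    long tmp = *a;                    /* portable fallback */
    *a = *b;
    *b = tmp;
#endif
}
```

With the values from the test program above, calling swap_longs(&a, &b) on
a = 3 and b = 8 leaves a = 8 and b = 3.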

I don't want to make any presumptions about your code, but if you are
asking me, I would just let gcc handle the optimizations.  Take what I
say with a grain of salt as I have been away from programming for years
and have only recently gotten back into it as the errors in my patches
to Eterm will attest.  By the way there is still an outstanding patch of
mine that fixes one of my synapse misfires.  pixmap.c, line 1588 should
use 0x7c00 instead of 0xfc00.  My apologies for a cut + paste screw-up.
If you look at the patch (in an earlier email) you can see I added
things like (a>>0) for readability but am counting on gcc to optimize it
away for performance.  Raster often does this too (I've been reading his
code recently and working with kwo on e-16.8).

Best Regards,
-- 
Tres




Re: [E-devel] Re: libast from Eterm CVS fails to build with gcc4 on x86_64

2005-06-06 Thread Michael Jennings
On Thursday, 02 June 2005, at 15:56:40 (-0400),
John Ellson wrote:

> RCS file: /cvsroot/enlightenment/eterm/libast/configure.in,v
> retrieving revision 1.24

Got it, thanks.



On Thursday, 02 June 2005, at 16:02:00 (-0400),
Mike Frysinger wrote:

> would an ifdef like:
> #if STRICT_ISO_C99 || __GNUC__ > 4
> be unacceptable ?  that way the installed libast wouldn't need to be
> reconfigured/built/installed just to change STRICT_ISO_C99 ...

How about this:

#ifdef __GNUC__
#  if __GNUC__ >= 4
#    define STRICT_ISO_C99 1
#  endif
...
#endif
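
One way the version check could feed into a swap that is safe for pointers,
as a sketch only: SWAP_T and swap_int_ptrs are hypothetical names, and
libast's real SWAP()/BINSWAP() definitions are not reproduced here.

```c
/* Assumption: enable strict mode automatically for gcc 4 and later, and give
 * the macro a value so that "#if STRICT_ISO_C99" evaluates cleanly. */
#if defined(__GNUC__) && __GNUC__ >= 4 && !defined(STRICT_ISO_C99)
#  define STRICT_ISO_C99 1
#endif

/* Hypothetical ISO C swap through a typed temporary.  It never uses a cast
 * as an lvalue, so it also works for pointers. */
#define SWAP_T(type, a, b) \
    do { type swap_tmp_ = (a); (a) = (b); (b) = swap_tmp_; } while (0)

/* Example use: exchange two int pointers. */
void swap_int_ptrs(int **p, int **q)
{
    SWAP_T(int *, *p, *q);
}
```

The do { ... } while (0) wrapper keeps the macro usable as a single statement,
for example in an if/else without braces.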




On Thursday, 02 June 2005, at 18:39:01 (-0600),
Tres Melton wrote:

> #define BINSWAP(a, b) \
>    (((long) (a)) ^= ((long) (b)) ^= ((long) (a)) ^= ((long) (b)))
> 
> int main( void )
> {
>   long a = 3;
>   long b = 8;
> 
>   asm( "noop;noop;noop" );
>   BINSWAP(a,b);
>   asm( "noop;noop;noop" );
> 
> }
> 
> yields:
> 
> noop;noop;noop
>         movq    -16(%rbp), %rdx
>         leaq    -8(%rbp), %rax
>         xorq    %rdx, (%rax)
>         movq    -8(%rbp), %rdx
>         leaq    -16(%rbp), %rax
>         xorq    %rdx, (%rax)
>         movq    -16(%rbp), %rdx
>         leaq    -8(%rbp), %rax
>         xorq    %rdx, (%rax)
> noop;noop;noop
> 
> If you enable -O[123] then you will need to use the values a & b before
> and after the BINSWAP call or they will be optimized away.  And simply
> using immediate values like I did will cause the compiler to simply set
> the different registers that are used to access them in reverse order.
> In other words the swap gets optimized out.  The above code is without
> -O and is clearly more complicated (by more than double) than it needs
> to be.

Interesting.  You think I should just get rid of it then?



On Thursday, 02 June 2005, at 21:04:59 (-0400),
John Ellson wrote:

> I understand that this is a ISO C99 restriction and that gcc4 is
> just a bit more pedantic than gcc3.

gcc4 does C99 by default now, does it not?

> In fact my first attempt at a fix was to just not use the xor
> BINSWAP macro at all, but this is really a question for Michael.  I
> was just trying to get his code to compile.

It's actually slower normally but faster optimized:

-O0
Profiling SWAP() macro...3000000 iterations in 0.052468 seconds, 1.7489e-08 seconds per iteration
Profiling BINSWAP() macro...3000000 iterations in 0.067905 seconds, 2.2635e-08 seconds per iteration

-O2
Profiling SWAP() macro...3000000 iterations in 0.014328 seconds, 4.776e-09 seconds per iteration
Profiling BINSWAP() macro...3000000 iterations in 0.014183 seconds, 4.7277e-09 seconds per iteration

(Done with libast's "make perf")

Michael

-- 
Michael Jennings (a.k.a. KainX)  http://www.kainx.org/  <[EMAIL PROTECTED]>
n + 1, Inc., http://www.nplus1.net/   Author, Eterm (www.eterm.org)
---
 "There is a greater darkness than the one we fight.  It is the 
  darkness of a soul that has lost its way."   -- G'Kar, Babylon 5




Re: [E-devel] Re: libast from Eterm CVS fails to build with gcc4 on x86_64

2005-06-02 Thread Mike Frysinger
On Thursday 02 June 2005 01:01 pm, Michael Jennings wrote:
> On Thursday, 02 June 2005, at 10:28:15 (-0400),
>
> John Ellson wrote:
> > The problem is that the xor trick isn't working for swapping pointers.
> > One possible workaround is to use the more straightforward SWAP code:
> >
> > RCS file: /cvsroot/enlightenment/eterm/libast/include/libast.h,v
> > retrieving revision 1.58
> > diff -u -r1.58 libast.h
> > --- include/libast.h    15 Dec 2004 00:00:19 -0000      1.58
> > +++ include/libast.h    2 Jun 2005 14:25:33 -0000
> > @@ -281,11 +281,11 @@
> >  * @param a The first variable.
> >  * @param b The second variable.
> >  */
> > -#if STRICT_ISO_C99
> > +// #if STRICT_ISO_C99
>
> Why not just define STRICT_ISO_C99?  I created it specifically for
> that purpose.

would an ifdef like:
#if STRICT_ISO_C99 || __GNUC__ > 4
be unacceptable ?  that way the installed libast wouldn't need to be
reconfigured/built/installed just to change STRICT_ISO_C99 ...
-mike


---
This SF.Net email is sponsored by Yahoo.
Introducing Yahoo! Search Developer Network - Create apps using Yahoo!
Search APIs Find out how you can build Yahoo! directly into your own
Applications - visit http://developer.yahoo.net/?fr=offad-ysdn-ostg-q22005
___
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel


Re: [E-devel] Re: libast from Eterm CVS fails to build with gcc4 on x86_64

2005-06-02 Thread John Ellson

Michael Jennings wrote:

> On Thursday, 02 June 2005, at 13:56:28 (-0400),
> John Ellson wrote:
>
> > Sure, but then there should be a configure test for the cases that need it.
>
> Patches welcome. :)
>
> Michael

RCS file: /cvsroot/enlightenment/eterm/libast/configure.in,v
retrieving revision 1.24
diff -u -r1.24 configure.in
--- configure.in        7 Mar 2005 20:07:10 -0000       1.24
+++ configure.in        2 Jun 2005 19:53:08 -0000
@@ -77,6 +77,24 @@
 ]
 )
 
+AC_MSG_CHECKING(if STRICT_ISO_C99 required)
+AC_TRY_RUN([
+   int main () {
+   int a = 0, b = 0;
+   (long)a = (long)b;
+   return 0;
+   }]
+   ,
+   AC_MSG_RESULT(no)
+   ,
+   AC_MSG_RESULT(yes)
+   AC_DEFINE_UNQUOTED(STRICT_ISO_C99,1,[Define if compiler needs it])
+   ,
+   AC_MSG_RESULT(no - assumed because cross-compiling)
+   )
+
+
+
 AST_X11_SUPPORT()
 AST_IMLIB2_SUPPORT()
 AST_MMX_SUPPORT()





Re: [E-devel] Re: libast from Eterm CVS fails to build with gcc4 on x86_64

2005-06-02 Thread Michael Jennings
On Thursday, 02 June 2005, at 13:56:28 (-0400),
John Ellson wrote:

> Sure, but then there should be a configure test for the cases that need it.

Patches welcome. :)

Michael

-- 
Michael Jennings (a.k.a. KainX)  http://www.kainx.org/  <[EMAIL PROTECTED]>
n + 1, Inc., http://www.nplus1.net/   Author, Eterm (www.eterm.org)
---
 "If you can scrounge up another brain cell, you might captivate us
  further...but I doubt it.  You couldn't get a clue during
  clue-mating season in a field full of horny clues if you smeared
  your body with clue musk and did the clue mating dance."   -- OS2Bot




Re: [E-devel] Re: libast from Eterm CVS fails to build with gcc4 on x86_64

2005-06-02 Thread John Ellson

Michael Jennings wrote:

> Why not just define STRICT_ISO_C99?  I created it specifically for
> that purpose.

Sure, but then there should be a configure test for the cases that need it.

john

