[E-devel] Eterm SSE2 patch for x86_64

2005-06-06 Thread John Ellson

Michael,

Tres asked me to help with the configure.in and Makefile.am bits for his 
SSE2 code for x86_64.  I've attached patches against Eterm CVS
with changes to: configure.in, src/Makefile.am, src/pixmap.c, and the 
new file: src/sse2_cmod.c  (changed since Tres's earlier version)


The configure.in changes add --enable-sse2, which will be on by default 
for x86_64 when /proc/cpuinfo lists sse2.  The incomplete tests for 
MMX_64 have been removed.
There was an odd dps_snprintf_oflow() line in configure.in which was 
causing problems on my i686 system, and which I think was a typo, so I
removed it.

The src/Makefile.am changes just add the new source file sse2_cmod.c and 
conditionally compile it when HAVE_SSE2 is defined.


The src/pixmap.c changes add externs for the new SSE2 functions, and 
conditionals to use them if HAVE_SSE2 is defined.  Tres also added 8- or 
16-byte alignment conditionally to colormod_trans() - this should 
perhaps be done only for gcc?
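
For readers following along, the pattern described above (an extern plus a
HAVE_SSE2 conditional, and an alignment attribute limited to gcc) can be
sketched roughly as follows.  This is an illustration only, not the code in
the patch: the names CMOD_ALIGNED, shade_channel_c, shade_channel_sse2,
shade_channel and cmod_scratch are all hypothetical, and the scaling formula
is an assumption chosen to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro: request alignment only from gcc-compatible compilers,
 * and expand to nothing elsewhere so other compilers still build. */
#if defined(__GNUC__)
#  define CMOD_ALIGNED(n) __attribute__((aligned(n)))
#else
#  define CMOD_ALIGNED(n)
#endif

/* Hypothetical scratch buffer; 16-byte aligned under gcc-compatible compilers. */
static uint8_t cmod_scratch[16] CMOD_ALIGNED(16);

/* Portable fallback: scale each byte by m/256, saturating at 255. */
static void shade_channel_c(uint8_t *data, size_t len, unsigned m)
{
    size_t i;
    for (i = 0; i < len; i++) {
        unsigned v = (data[i] * m) >> 8;
        data[i] = (uint8_t)(v > 255 ? 255 : v);
    }
}

#ifdef HAVE_SSE2
/* Hypothetical extern, provided by a separately compiled SSE2 source file. */
extern void shade_channel_sse2(uint8_t *data, size_t len, unsigned m);
#endif

/* Pick the implementation at compile time, based on what configure defined. */
static void shade_channel(uint8_t *data, size_t len, unsigned m)
{
#ifdef HAVE_SSE2
    shade_channel_sse2(data, len, m);
#else
    shade_channel_c(data, len, m);
#endif
}
```

With HAVE_SSE2 undefined the extern disappears entirely, so a build without
the SSE2 object still links.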


Tres indicated that adding -mpreferred-stack-boundary=16 might still 
be beneficial on x86_64, but that it might consume extra space at run 
time.   I'm not in a position to make that call, but if you agree it's a 
good idea I can probably make the configure.in change.


I've tested the changes on i686 and x86_64 and verified that the 
correct routines get compiled in.   I verified that the modified Eterms 
run and that the Brightness control does something reasonable on both 
systems.  I have not done any performance tests.


All credit for the sse2_cmod.c code goes to Tres.   I just did the easy 
bits.


John




Index: configure.in
===
RCS file: /cvsroot/enlightenment/eterm/Eterm/configure.in,v
retrieving revision 1.92
diff -u -r1.92 configure.in
--- configure.in	1 May 2005 07:16:51 -	1.92
+++ configure.in	6 Jun 2005 14:44:01 -
@@ -220,7 +220,7 @@
 seteuid memmove putenv strsep setresuid setresgid \
 memmem usleep snprintf strcasestr strcasechr \
 strcasepbrk strrev nl_langinfo)
-dps_snprintf_oflow()
+
 AC_CHECK_LIB(m, pow)
 
 dnl# Portability checks for various functions
@@ -500,9 +500,11 @@
 AC_DEFINE(PIXMAP_OFFSET, , [Define for pseudo-transparency support.])
 ])
 
+dnl#
+dnl# MMX support
+dnl#
 AC_MSG_CHECKING(for MMX support)
 HAVE_MMX=
-HAVE_MMX_64=
 AC_ARG_ENABLE(mmx, [  --enable-mmx            enable MMX assembly routines], [
   test x$enableval = xyes && HAVE_MMX=yes
   ], [
@@ -510,25 +512,39 @@
   i*86)
   grep mmx /proc/cpuinfo >/dev/null 2>&1 && HAVE_MMX=yes
   ;;
-  x86_64)
-  grep mmx /proc/cpuinfo >/dev/null 2>&1 && HAVE_MMX_64=yes
-  ;;
   esac
   ])
 if test x$HAVE_MMX = xyes; then
 AC_MSG_RESULT([yes (32-bit)])
 AC_DEFINE(HAVE_MMX, , [Define for 32-bit MMX support.])
-elif test x$HAVE_MMX_64 = xyes; then
-dnl# AC_MSG_RESULT([yes (64-bit)])
-dnl# AC_DEFINE(HAVE_MMX_64, , [Define for 64-bit MMX support.])
-AC_MSG_RESULT([no (64-bit MMX not yet supported)])
 else
 AC_MSG_RESULT([no (no MMX detected)])
 fi
-dnl# AM_CONDITIONAL(HAVE_MMX, test x$HAVE_MMX = xyes -o x$HAVE_MMX_64 = xyes)
 AM_CONDITIONAL(HAVE_MMX, test x$HAVE_MMX = xyes)
 
 dnl#
+dnl# SSE2 support
+dnl#
+AC_MSG_CHECKING(for SSE2 support)
+HAVE_SSE2=
+AC_ARG_ENABLE(sse2, [  --enable-sse2           enable SSE2 assembly routines], [
+  test x$enableval = xyes && HAVE_SSE2=yes
+  ], [
+  case $host_cpu in
+  x86_64)
+  grep sse2 /proc/cpuinfo >/dev/null 2>&1 && HAVE_SSE2=yes
+  ;;
+  esac
+  ])
+if test x$HAVE_SSE2 = xyes; then
+AC_MSG_RESULT([yes])
+AC_DEFINE(HAVE_SSE2, , [Define for 64-bit SSE2 support.])
+else
+AC_MSG_RESULT([no (no SSE2 detected)])
+fi
+AM_CONDITIONAL(HAVE_SSE2, test x$HAVE_SSE2 = xyes)
+
+dnl#
 dnl# LibAST
 dnl#
 LIBAST_MIN=5
Index: src/Makefile.am
===
RCS file: /cvsroot/enlightenment/eterm/Eterm/src/Makefile.am,v
retrieving revision 1.29
diff -u -r1.29 Makefile.am
--- src/Makefile.am	15 Mar 2005 21:48:01 -	1.29
+++ src/Makefile.am	6 Jun 2005 14:44:01 -
@@ -6,6 +6,9 @@
 MMX_SRCS = mmx_cmod.S
 MMX_OBJS = mmx_cmod.lo
 
+SSE2_SRCS = sse2_cmod.c
+SSE2_OBJS = sse2_cmod.lo
+
 libEterm_la_SOURCES = actions.c actions.h buttons.c buttons.h command.c  \
   command.h draw.c draw.h e.c e.h eterm_debug.h eterm_utmp.h \
   events.c events.h feature.h font.c font.h grkelot.c\
@@ -16,22 +19,27 @@
   timer.c timer.h utmp.c windows.c windows.h defaultfont.c   \
   defaultfont.h libscream.c scream.h screamcfg.h
 
-EXTRA_libEterm_la_SOURCES = $(MMX_SRCS)
+EXTRA_libEterm_la_SOURCES = $(MMX_SRCS) 

Re: [E-devel] Eterm SSE2 patch for x86_64

2005-06-06 Thread Tres Melton
On Mon, 2005-06-06 at 11:28 -0400, John Ellson wrote:

> All credit for the sse2_cmod.c code goes to Tres.   I just did the easy 
> bits.

Thanks, but the real credit goes to Willem Monsuwe [EMAIL PROTECTED] for
writing the original MMX code.  All I did was expand it to use all 128
bits of the xmm registers via SSE2 and make it inline so that gcc can
apply whatever optimizations it is asked for.  In principle it should be
up to twice as fast as the original MMX code, since it processes twice
as many pixels at once.
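
To illustrate the "twice as many pixels at once" point, here is a minimal
sketch of shading a buffer 16 bytes (four 32-bit pixels) per step with the
128-bit xmm registers.  It is not the code in sse2_cmod.c: it uses compiler
intrinsics from <emmintrin.h> rather than inline assembly, the names
shade_scalar and shade_bytes are hypothetical, and it assumes a modifier m in
the range 0..256 (darkening only) so that each 16-bit product cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Scalar reference: scale each byte by m/256 (assumes 0 <= m <= 256). */
static void shade_scalar(uint8_t *p, size_t n, unsigned m)
{
    size_t i;
    for (i = 0; i < n; i++)
        p[i] = (uint8_t)((p[i] * m) >> 8);
}

/* Same operation, 16 bytes per iteration when SSE2 is available. */
static void shade_bytes(uint8_t *p, size_t n, unsigned m)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i zero = _mm_setzero_si128();
    const __m128i mul  = _mm_set1_epi16((short)m);
    for (; i + 16 <= n; i += 16) {
        /* Unaligned load, so the caller's buffer needs no special alignment. */
        __m128i v  = _mm_loadu_si128((const __m128i *)(p + i));
        /* Widen the 16 bytes into two vectors of eight 16-bit lanes. */
        __m128i lo = _mm_unpacklo_epi8(v, zero);
        __m128i hi = _mm_unpackhi_epi8(v, zero);
        /* Multiply by m and divide by 256 in every lane. */
        lo = _mm_srli_epi16(_mm_mullo_epi16(lo, mul), 8);
        hi = _mm_srli_epi16(_mm_mullo_epi16(hi, mul), 8);
        /* Narrow back to bytes and store. */
        _mm_storeu_si128((__m128i *)(p + i), _mm_packus_epi16(lo, hi));
    }
#endif
    /* Tail bytes, or the whole buffer when SSE2 is not compiled in. */
    shade_scalar(p + i, n - i, m);
}
```

An MMX version of the same loop would handle 8 bytes per iteration in a
64-bit mm register, which is where the factor of two comes from.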

Thanks for the work John,
-- 
Tres


