Hi,

the following might be interesting for some people around here. As
time permits I will offer an optional spkg.

Cheers,

Michael

-------- Original Message --------
Subject: [atlas-devel] ATLAS 3.9.0 & LAPACK
Date: Fri, 18 Jul 2008 06:19:10 -0500
From: Clint Whaley <[EMAIL PROTECTED]>
Reply-To: List for developer discussion, NOT SUPPORT. <math-atlas-[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]

Guys,

It's been a long time coming, but I have finally heaved out 3.9.0.  The
main reason for this long delay is that I did a major rewrite of
ATLAS for additional rank-K performance, which timings showed was a
big win **until I fixed the performance bug that mandated 3.8.2**.
After that, I found I had written thousands of lines of code for
nothing, so I had to yank the code back out :(

However, 3.9.0 is finally out, and it has several key features that I
hope will make it worth the wait.  There are much improved DGEMM kernels
for the Core2Duo64 and K10h64 architectures.  These kernels (particularly
K10h) can still be improved, and I haven't yet ported them to single
precision or 32 bits.  However, this should provide some relief on the
Core2Duo, where ATLAS was taking a savage beat-down from the Goto and MKL
BLAS.  ATLAS still trails Goto, but it is not quite the same excoriating
humiliation now (at least for double).  The key to the Core2Duo64 kernel
was doing 2D blocking, which I had tested but apparently messed up
before.  Thanks to Yevgen Voronenko of CMU/SPIRAL, who gave me a code
fragment to work from (see ATLAS/doc/AtlasCredits.txt for details).

The main focus of 3.9.0 has been on improving ATLAS's LAPACK
support.  The first improvement is that you no longer have to install
LAPACK separately from ATLAS.  If you have LAPACK 3.1.1 untarred
somewhere, you can use the flag '-Ss lasrc /path/to/lapack3.1.1/SRC', and
ATLAS will automatically build it during the ATLAS build, with no need
for the flag/make.inc headaches that we have in the 3.8 series.  You can
also provide '--with-netlib-lapack-tarfile=/path/to/tarfile', and ATLAS
will extract the tarfile for you in the ATLAS directory and build it from
there.  If you have more than one install, the -Ss flag saves space,
because all ATLAS installs then share one copy of the LAPACK source; for
that reason I recommend the first method.

The second big LAPACK push for this release is that I've started to
support a new C API for LAPACK, which I hope to eventually expand to
all of LAPACK.  For most of the routines, it calls the F77 LAPACK, but
for ATLAS native routines (like LU/Cholesky) it calls ATLAS's faster
routines instead.  The name is the F77 name, in lower case, with a "C_"
prepended.  Thus DGETRF is C_dgetrf.  Character arguments (Uplo, Trans,
etc.) are replaced by CBLAS enum types, and all (non-complex) scalars are
passed by value.  This API supports only column-major arrays (it mostly
calls the F77/netlib LAPACK, which is column-major only).  Routines that
take workspace in F77 don't in the C_ equivalents, as the wrapper
auto-queries LAPACK and allocates.  However, if you want to allocate the
work yourself, the routine taking workspace usually exists with the name
C_rout_wrk, and you can test whether it exists (it may not if ATLAS
supports the routine natively) with:
   #ifdef ATL_C2F<rout>_wrk__   (e.g., ATL_C2Fdgels_wrk__)
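
The auto-query behaviour described above matches the usual LAPACK
workspace convention: calling a routine with lwork = -1 makes it report
its preferred workspace size in work[0] instead of computing anything.
The following is a minimal sketch of a wrapper built on that convention.
It does not use ATLAS at all: mock_f77_rout is a hypothetical stand-in
for an F77 LAPACK routine, and wrapped_rout and the 64*n size formula are
likewise invented for illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an F77 LAPACK routine that takes workspace.
 * It follows the LAPACK convention: if *lwork == -1 it only reports the
 * workspace size it wants in work[0]; otherwise it does the "real" work
 * (here: doubling the first n entries of a). The size formula 64*n is
 * invented for illustration. */
static void mock_f77_rout(const int *n, double *a, double *work,
                          const int *lwork, int *info)
{
    int need = 64 * (*n > 1 ? *n : 1);

    *info = 0;
    if (*lwork == -1) {              /* workspace query only */
        work[0] = (double)need;
        return;
    }
    if (*lwork < need) {             /* argument 4 (lwork) is too small */
        *info = -4;
        return;
    }
    for (int i = 0; i < *n; i++)
        a[i] *= 2.0;
}

/* Wrapper with no workspace argument: query, allocate, call, free.
 * Returns the routine's info value, or 1 if malloc fails (a convention
 * chosen for this sketch only). */
int wrapped_rout(int n, double *a)
{
    double query;
    int lwork = -1, info;

    mock_f77_rout(&n, a, &query, &lwork, &info);   /* step 1: query */
    if (info != 0)
        return info;

    lwork = (int)query;
    double *work = malloc((size_t)lwork * sizeof *work);   /* step 2 */
    if (!work)
        return 1;

    mock_f77_rout(&n, a, work, &lwork, &info);     /* step 3: compute */
    free(work);
    return info;
}
```

A caller who prefers to manage memory would skip the wrapper and pass its
own work array, which is the role the message assigns to the _wrk variants.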

This API is currently supported for the following LAPACK routines:
   ATLAS native routines:
      xPOSV xGESV xPOTRF xGETRF xPOTRS xPOTRI xLAUMM
   C2F wrappers:
      xGELS xGELQF xGERQF xGEQLF xGEQRF
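
Under the naming rule given earlier (F77 name, lower-cased, with "C_"
prepended), and assuming the leading x in the list stands for the usual
LAPACK precision prefixes s, d, c and z, the C names can be generated
mechanically. A small sketch follows; the helpers c_api_name and
c_api_name_for are hypothetical and not part of ATLAS.

```c
#include <ctype.h>
#include <string.h>

/* Build the C_ API name for an F77 LAPACK routine name such as "DGETRF":
 * "C_" followed by the lower-cased name is written into out (capacity
 * outlen). Returns 0 on success, -1 if the buffer is too small. */
int c_api_name(const char *f77name, char *out, size_t outlen)
{
    size_t len = strlen(f77name);

    if (len + 3 > outlen)            /* "C_" + name + '\0' */
        return -1;
    out[0] = 'C';
    out[1] = '_';
    for (size_t i = 0; i < len; i++)
        out[i + 2] = (char)tolower((unsigned char)f77name[i]);
    out[len + 2] = '\0';
    return 0;
}

/* Expand a pattern such as "xGETRF" for one precision prefix
 * (s, d, c or z), then apply the naming rule above. */
int c_api_name_for(char prec, const char *pattern, char *out, size_t outlen)
{
    char tmp[32];
    size_t len = strlen(pattern);

    if (len == 0 || len >= sizeof tmp)
        return -1;
    memcpy(tmp, pattern, len + 1);   /* copy including the terminator */
    tmp[0] = prec;                   /* replace the leading 'x' */
    return c_api_name(tmp, out, outlen);
}
```

Calling c_api_name("DGETRF", buf, sizeof buf) fills buf with "C_dgetrf",
the example given in the message.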

Obviously, you need to build the full lapack library (and thus need a
functional F77 compiler) to use these routines.
You can find more info in the following files found in ATLAS/include:
    C_lapack.h          # main header file you must include to use the C_lapack API
    clapack.h           # header for ATLAS's native lapack
    atlas_C2Flapack.h   # header for C to F77 wrapper functions
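
Because the C_ API accepts only column-major arrays, C callers who store
matrices row-major need to convert before calling it. A minimal sketch of
column-major indexing with a leading dimension follows; the helper names
are illustrative only and do not come from the ATLAS headers.

```c
#include <stddef.h>

/* Zero-based offset of element (i, j) in a column-major array whose
 * consecutive columns are lda elements apart (lda >= number of rows). */
static size_t colmajor_index(size_t i, size_t j, size_t lda)
{
    return i + j * lda;
}

/* Copy a densely packed row-major m x n matrix into column-major storage
 * with leading dimension lda, the layout a column-major-only API expects. */
void rowmajor_to_colmajor(size_t m, size_t n,
                          const double *rowmaj, double *colmaj, size_t lda)
{
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < m; i++)
            colmaj[colmajor_index(i, j, lda)] = rowmaj[i * n + j];
}
```

For a 2 x 3 matrix with rows (1 2 3) and (4 5 6) and lda = 2, the
column-major buffer holds the columns back to back: 1 4 2 5 3 6.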

I would like to get some feedback on this new API.  I use macros to
select between native & C2F files to save some calling overhead.  Is this
real bad news for people?  Will it make your life easier to have a full C
API supported out-of-the-box for ATLAS?  If there is a demand for this
API, I can fill it out fairly rapidly (with some help from you guys for
testing); if there's not, I will populate it only as needed for internal
ATLAS stuff.  So, speak up if you are interested!

Finally, the last LAPACK deal is that ATLAS can now tune some
of the LAPACK routines that it doesn't natively support by empirically
tuning LAPACK's blocking factor to both the platform and problem size.
Right now, ATLAS autotunes only the QR factorization routines mentioned
above.  Initial timings show improvements ranging from 5-25% (as much as
75% for small problems on Itanium!).  Core2Duo64SSE3 has arch defaults
with QR pretuned.  By default, ATLAS does not tune LAPACK.  To enable it,
you pass '-Si latune 1' to configure.  You will only want to do this if
QR (or one of the many LAPACK routines that call it) is important to you:
my present BFI LAPACK tuning framework adds roughly 3 hours to a *fast*
machine's install!
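
The idea of empirically tuning a blocking factor can be sketched as a
timing loop over candidate block sizes. The following is a toy
illustration only and not ATLAS's tuning framework: the kernel (a blocked
matrix transpose rather than QR), the candidate list, and all function
names are assumptions made for this example.

```c
#include <stdlib.h>
#include <time.h>

/* Toy blocked kernel: transpose an n x n column-major matrix a into b,
 * working on nb x nb tiles. */
static void blocked_transpose(int n, int nb, const double *a, double *b)
{
    for (int jj = 0; jj < n; jj += nb)
        for (int ii = 0; ii < n; ii += nb)
            for (int j = jj; j < jj + nb && j < n; j++)
                for (int i = ii; i < ii + nb && i < n; i++)
                    b[j + (size_t)i * n] = a[i + (size_t)j * n];
}

/* Time the kernel for each candidate block size at problem size n and
 * return the fastest candidate. Non-positive candidates are skipped.
 * Returns -1 on bad arguments, allocation failure, or if no candidate
 * is usable. */
int tune_block_size(int n, const int *cand, int ncand, int reps)
{
    if (n <= 0 || ncand <= 0)
        return -1;

    size_t len = (size_t)n * (size_t)n;
    double *a = malloc(len * sizeof *a);
    double *b = malloc(len * sizeof *b);
    if (!a || !b) {
        free(a);
        free(b);
        return -1;
    }
    for (size_t k = 0; k < len; k++)
        a[k] = (double)k;

    int best = -1;
    double best_t = 0.0;
    for (int c = 0; c < ncand; c++) {
        if (cand[c] <= 0)
            continue;
        clock_t t0 = clock();
        for (int r = 0; r < reps; r++)
            blocked_transpose(n, cand[c], a, b);
        double t = (double)(clock() - t0) / CLOCKS_PER_SEC;
        if (best < 0 || t < best_t) {    /* keep the fastest so far */
            best_t = t;
            best = cand[c];
        }
    }
    free(a);
    free(b);
    return best;
}
```

Running tune_block_size for several values of n would give one preferred
block size per problem size, which is the "platform and problem size"
dependence the message describes. The winner varies from machine to
machine and run to run, so no particular result should be expected.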

Cheers,
Clint

**************************************************************************
** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley **
**************************************************************************
