Greetings! "James A. Treacy" <[EMAIL PROTECTED]> writes:

> I should start by mentioning that Sue is orphaning the lapack packages.
> One of these days I'll convince her to actually send mail to the wnpp
> stating such.
>
> Once she has officially orphaned them, I will send a note offering to
> take them over.
>

Wonderful! Do you think this will be done before November?

> Are you aware that the current lapack packages contain both static and shared
> libs(*)? The static libs are not compiled with -fPIC though. As I'm sure
> you are aware, using -fPIC uses a register which can have a substantial(**)
> hit on performance on register starved architectures like intel.
>

I'm sorry to have overlooked the lapack situation. Yes, I see that
now. I've been spending more time on scalapack, and just assumed
(incorrectly) that they were packaged the same.

As for the performance with -fPIC, yes, I was thinking the same thing
after I sent the original post. So, if we really want the highest
performing options, and the functionality of a user-invoked per-system
blas tuning feature provided by atlas, we have the following
possibilities:

1) Have the separate blas package ship static no -fPIC, static -fPIC,
and shared -fPIC versions. Then atlas can unpack the static -fPIC
with ar, copy in its own modules, and rebuild the static -fPIC and
shared -fPIC versions, preferably with separate filenames and using
the alternatives system to choose.
Disadvantage: no static no -fPIC optimized version can be made this
way.

2) Have the atlas package duplicate the source code from the blas
package. Then we can forget about static -fPIC versions entirely.
Disadvantage: two packages with same source component. Maybe not a
problem? This seems better than 1, IMHO.
Advantage: This package is already ready (minus the alternatives)!

3) Maybe only have one blas/atlas package, supplying the reference
implementation, a pre-optimized generic implementation, and the script
to optimize to the user's system, with switching done with
alternatives.
In other words, do we really ever want to ship just the reference
implementation by itself? Wouldn't the *alternative* of an optimized
version always be preferable?
Advantage: I guess this package is ready too.

> Perhaps a bit more discussion before a decision is reached. Do you have
> any information on performance of
> atlas modified static blas
> vs.
> atlas modified static blas compiled using -fPIC ?
>
> How portable are the tuned libraries to other machines? For example, if the
> atlas modified blas libraries are generated on a P133, how well would those
> libs perform on a PII (with a much larger cache) compared to a set of
> libraries tuned specifically for that configuration.
>

I'm doing some benchmarks now, and hope to report shortly.

> If they aren't very portable and -fPIC isn't a big hit, then we could
> install atlas and use the postinst to automatically tune the users
> machine. I believe this is what you suggested in your opening paragraph.
>

Actually, I think you were right to try to avoid this. I've tried to
describe 1-3) in a way that the user will always have a no -fPIC
option. Question: How does this reserved register thing work? It's
not a reserved register *per* shared lib, is it? Just one for any
shared libraries at all? If so, what about libc and libm? Do people
running real code statically compile them in too?

> Can atlas fix an atlas modified version of blas? I'm thinking here of shipping
> the blas package with an atlas modified library which could then be tuned
> as per the previous paragraph if a user wishes. This would give an
> (possibly) improved generic blas library while still allowing power users
> to tune it for fastest performance on their machine.
>

This is a good idea. I think this is equivalent to 3) above. If we
agree on 3), then I'd be happy to send you the package I have, or, if
you'd prefer, I could maintain it and you maintain the rest of lapack.
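
For illustration only, the rebuild steps described under 1) above
(unpack the static -fPIC archive with ar, copy in the atlas modules,
rebuild static and shared libraries) might be sketched roughly as
below. This is a sketch under assumptions, not what any existing
package does: the helper name plan_rebuild, every file name, and the
soname are hypothetical, and the commands are only assembled, not run.

```python
import os
from typing import List


def plan_rebuild(pic_archive: str,
                 reference_members: List[str],
                 atlas_objects: List[str],
                 out_static: str,
                 out_shared: str,
                 soname: str) -> List[List[str]]:
    """Return the commands for option 1), to be run in an empty scratch dir.

    pic_archive        absolute path of the reference static -fPIC archive
    reference_members  object names in that archive (as listed by `ar t`)
    atlas_objects      paths of atlas-built -fPIC objects that replace
                       reference objects of the same base name
    All names here are hypothetical placeholders.
    """
    atlas_names = {os.path.basename(p) for p in atlas_objects}
    # After the copy step, same-named reference objects have been
    # overwritten, so the merged list is simply the union of names.
    merged = sorted(set(reference_members) | atlas_names)
    return [
        # 1. unpack the reference objects into the current directory
        ["ar", "x", pic_archive],
        # 2. copy the atlas objects over the reference ones
        ["cp", *atlas_objects, "."],
        # 3. rebuild the static -fPIC archive under a separate file name
        ["ar", "rcs", out_static, *merged],
        # 4. rebuild the shared library; any Fortran runtime libraries
        #    the objects need would still have to be added here
        ["gcc", "-shared", "-Wl,-soname," + soname, "-o", out_shared, *merged],
    ]


if __name__ == "__main__":
    for argv in plan_rebuild("/usr/lib/libblas_pic.a",
                             ["daxpy.o", "dgemm.o"],
                             ["/tmp/atlas/dgemm.o"],
                             "libblas_atlas_pic.a",
                             "libblas_atlas.so.2.0",
                             "libblas.so.2"):
        print(" ".join(argv))
```

Keeping the outputs under separate file names, as suggested in 1),
is what would let the alternatives system switch between the
reference and the tuned builds.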

> If we have good answers to the above, the best course of action
> should be clear.
>
> > Thanks again! I hope you don't mind me cc'ing this note to the
> > debian-beowulf mailing list, to solicit some additional feedback.
> >
> No problem. Since the blas routines are critical to many programs we need
> to ship the fastest ones we can. More feedback is good.
>
> Jay Treacy

Thanks so much for your valuable input!

-- 
Camm Maguire [EMAIL PROTECTED]
==========================================================================
"The earth is but one country, and mankind its citizens." -- Baha'u'llah

