On Thu, 22 Oct 2009, Douglas Bates wrote:

On Thu, Oct 22, 2009 at 10:26 AM, Ravi Varadhan <rvarad...@jhmi.edu> wrote:
Ted,

LAPACK is newer and is supposed to contain better algorithms than LINPACK.  It 
is not true that LAPACK cannot handle rank-deficient problems.  It can.

It's not just a question of handling rank-deficiency.  It's the
particular form of pivoting that is used so that columns associated
with the same term stay adjacent.

The code that is actually used in glm.fit and lm.fit, called through
the Fortran subroutine dqrls, is a modified version of the Linpack
dqrdc subroutine.
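
To illustrate the pivoting behaviour described above, here is a small sketch using base R's qr(), which defaults to the modified LINPACK routine and switches to LAPACK when called with LAPACK = TRUE. The matrix X, its column names, and the sizes are made up for illustration; the column "dep" is deliberately an exact linear combination of the first two.

```r
set.seed(42)
n  <- 50
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)

# Hypothetical design matrix: "dep" is exactly x1 + x2, so X has rank 3
X <- cbind(x1 = x1, x2 = x2, dep = x1 + x2, x3 = x3)

qr_linpack <- qr(X)                 # default: modified LINPACK routine
qr_lapack  <- qr(X, LAPACK = TRUE)  # LAPACK routine with column pivoting

qr_linpack$rank    # rank detected by the LINPACK-based code
qr_linpack$pivot   # only the dependent column is moved to the end
qr_lapack$pivot    # LAPACK may reorder any of the columns
```

In the LINPACK case the remaining columns keep their original order, which is what keeps the columns of one model term adjacent. The documentation for qr() also notes that the reported rank is always full in the LAPACK case, so rank detection would have to be handled separately there.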

However, I do not know if using LAPACK in glm.fit instead of LINPACK would be 
faster and/or more memory efficient.

The big thing that could be gained is the use of level-3 BLAS.  The
current code uses only level-1 BLAS.  An accelerated BLAS speeds up
level-3 calls far more than it speeds up level-1 calls.


How would I switch to level-3 BLAS? Would I need to rebuild R with some flags? I imagine some comparative benchmarks would help.
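
One rough way to get such a comparison without touching glm.fit is to time the bare QR step through both interfaces of base R's qr(). This is only a sketch: the matrix sizes are arbitrary assumptions, the timings depend heavily on which BLAS the R installation is linked against, and it measures the decomposition alone rather than a full glm.fit call.

```r
set.seed(1)
n <- 2000; p <- 200                 # made-up problem size
X <- matrix(rnorm(n * p), n, p)     # full-rank test matrix
y <- rnorm(n)

# Time the decomposition through each interface
t_linpack <- system.time(q1 <- qr(X))                 # modified LINPACK code
t_lapack  <- system.time(q2 <- qr(X, LAPACK = TRUE))  # LAPACK code

# Elapsed seconds for each variant
elapsed <- c(linpack = t_linpack[["elapsed"]],
             lapack  = t_lapack[["elapsed"]])
print(elapsed)

# For a full-rank X both should give the same least-squares coefficients
same_coef <- isTRUE(all.equal(qr.coef(q1, y), qr.coef(q2, y)))
print(same_coef)
```

If the two timings are close on your machine, replacing the QR code inside glm.fit is unlikely to pay off by itself.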


Even so, I doubt that the QR decomposition is the big time sink in
glm.fit.  Why not profile a representative fit and check?  I profiled
the glm.fit code a couple of years ago and discovered that a lot of
time was being spent evaluating the various family functions, such as
the inverse link and the variance function, because of their calls to
pmin and pmax.

What kind of profiling software should I use? Is Rprof in R able to report which part of glm.fit is the bottleneck?
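
A minimal sketch of profiling glm.fit with R's built-in sampling profiler, Rprof(), and summarising the result with summaryRprof(). The simulated logistic-regression data, the sizes, and the number of repeated fits are assumptions chosen only so that the profiler collects enough samples; a representative fit from the real data should be substituted.

```r
set.seed(1)
n <- 50000; p <- 20                       # made-up problem size
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))
beta <- rnorm(p, sd = 0.2)
y <- rbinom(n, 1, plogis(drop(X %*% beta)))

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)         # start sampling every 10 ms
for (i in 1:20) {
  fit <- glm.fit(X, y, family = binomial())
}
Rprof(NULL)                               # stop profiling

prof <- summaryRprof(prof_file)
head(prof$by.self, 10)    # functions ranked by time spent in their own code
head(prof$by.total, 10)   # functions ranked by time including their callees
```

Rprof reports time per R-level function, so the compiled QR code is expected to show up under the R call that invokes it rather than as a separate Fortran entry. Comparing that entry with the time attributed to the family functions indicates whether the decomposition is worth optimising.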


Before trying to change very tricky Fortran code you owe it to
yourself to check what the potential gains would be.

Thanks for the suggestions.


----- Original Message -----
From: Ted <tchi...@sickkids.ca>
Date: Thursday, October 22, 2009 10:53 am
Subject: Re: [R] glm.fit to use LAPACK instead of LINPACK
To: "r-help@R-project.org" <r-help@r-project.org>


Hi,

I understand that glm.fit calls LINPACK Fortran routines instead of
LAPACK because they can handle the 'rank deficiency problem'.  If my data
matrix is not rank deficient, would a glm.fit function which runs on
LAPACK be faster?  Would it be worthwhile to convert glm.fit to use
LAPACK?  Has anyone done this already?  What is the best way to do this?

I'm looking at very large datasets (thousands of glm calls), and would
like to know if it's worth the effort for performance issues.

Thanks,

Ted

-------------------------------------
Ted Chiang
  Bioinformatics Analyst
  Centre for Computational Biology
  Hospital for Sick Children, Toronto
  416.813.7028
  tchi...@sickkids.ca

______________________________________________
R-help@r-project.org mailing list

PLEASE do read the posting guide
and provide commented, minimal, self-contained, reproducible code.
