Re: [R] integrate

2007-08-22 Thread Ravi Varadhan
You can divide your domain of integration into smaller intervals and then
add up the individual contributions.  This could improve the speed of
adaptive Gauss-Kronrod quadrature used in integrate().
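Ravi's suggestion can be sketched in a few lines of R (my code, not part of the original message; the helper name `piecewise_integrate` and the default of 20 pieces are made up):

```r
# Sketch: split [a, b] into k subintervals, run integrate() on each piece,
# and sum the results.  (Helper name and defaults are my own invention.)
piecewise_integrate <- function(f, a, b, k = 20) {
  breaks <- seq(a, b, length.out = k + 1)
  sum(sapply(seq_len(k), function(i)
    integrate(f, breaks[i], breaks[i + 1])$value))
}

piecewise_integrate(function(x) exp(-x^2), -10, 10)  # close to sqrt(pi)
```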

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Santanu Pramanik
Sent: Wednesday, August 22, 2007 6:41 PM
To: [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch; [EMAIL PROTECTED]
Subject: Re: [R] integrate

Hi Andy,

Thank you very much for your input. I also tried something like that,
which gives a value close to 20, basically using the same trapezoidal
rule.

> sum(apply(as.matrix(seq(-10, 10, by = 0.1)), 1, my.fcn)) * 0.1
[1] 20.17385

Actually my function is much more complicated than the posted example
and it is taking a long time...

Anyway, thanks again,
Santanu


JPSM, 1218J Lefrak Hall
University of Maryland, College Park
Phone: 301-314-9916



 [EMAIL PROTECTED] 8/22/2007 5:20:05 PM 
As Duncan Murdoch mentioned in his reply, the problem is that your
function is not a properly defined function in the sense of assigning a
unique y to each x.  The integrate function uses an adaptive quadrature
routine which makes multiple calls to the function being integrated and
expects to get the same y's for the same x's every time.

If you want to get a number close to 20 (for your example) you need an
integration routine which uses a single evaluation of your function at
each value of x.  A simple method like rectangular approximation on a
grid, or the so-called trapezoidal rule, will do just that.

Here is a very crude prototype of such an integrator:

integrate1 <- function(f, lower, upper){
   f <- match.fun(f)
   xx <- seq(lower, upper, length = 100)
   del <- xx[2] - xx[1]
   yy <- f(xx[-100])
   return(del * sum(yy))
}

Now when you run integrate1(my.fcn, -10, 10) you will get a number close
to 20 but, of course, every time you do it you will get a different
value.

Hope this helps,

Andy

__
Andy Jaworski
518-1-01
Process Laboratory
3M Corporate Research Laboratory
-
E-mail: [EMAIL PROTECTED] 
Tel:  (651) 733-6092
Fax:  (651) 736-3122


   
   
From: Santanu Pramanik [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject: [R] integrate
Date: 08/22/2007 02:56 PM




Hi,
I am trying to integrate a function which is approximately constant
over the range of the integration. The function is as follows:

> my.fcn = function(mu){
+ m = 1000
+ z = 0
+ z.mse = 0
+ for(i in 1:m){
+ z[i] = rnorm(1, mu, 1)
+ z.mse = z.mse + (z[i] - mu)^2
+ }
+ return(z.mse/m)
+ }

> my.fcn(-10)
[1] 1.021711
> my.fcn(10)
[1] 0.9995235
> my.fcn(-5)
[1] 1.012727
> my.fcn(5)
[1] 1.033595
> my.fcn(0)
[1] 1.106282

The function takes the value (approx) 1 over the range of mu. So the
integration result should be close to 20 if we integrate over (-10,
10).
But R gives:

> integrate(my.fcn, -10, 10)
685.4941 with absolute error < 7.6e-12

> integrate(Vectorize(my.fcn), -10, 10)  # this call never terminates

I have seen in the ?integrate page it is clearly written:

If the function is approximately constant (in particular, zero) over
nearly all its range it is possible that the result and error estimate
may be seriously wrong.

But this doesn't help in solving the problem.
Thanks,
Santanu



JPSM, 1218J Lefrak Hall
University of Maryland, College Park
Phone: 301-314-9916



 [[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help

Re: [R] Linear models over large datasets

2007-08-17 Thread Ravi Varadhan
The simplest trick is to use the QR decomposition:

The OLS solution (X'X)^{-1}X'y can be easily computed as:
qr.solve(X, y)

Here is an illustration:

> set.seed(123)
> X <- matrix(round(rnorm(100), 1), 20, 5)
> b <- c(1, 1, 2, 2, 3)
> y <- X %*% b + rnorm(20)

> ans1 <- solve(t(X) %*% X, t(X) %*% y)
> ans2 <- qr.solve(X, y)
> all.equal(ans1, ans2)
[1] TRUE

Ravi.

---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of dave fournier
Sent: Friday, August 17, 2007 12:43 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Linear models over large datasets

 It's actually only a few lines of code to do this from first principles.
 The coefficients depend only on the cross products X'X and X'y, and you
 can build them up easily by extending this example to read files or
 a database holding x and y instead of getting them from the args.
 Here we process incr rows of the builtin matrix state.x77 at a time,
 building up the two cross products, xtx and xty, regressing
 Income (variable 2) on the other variables:

 mylm <- function(x, y, incr = 25) {
   start <- xtx <- xty <- 0
   while (start < nrow(x)) {
      idx <- seq(start + 1, min(start + incr, nrow(x)))
      x1 <- cbind(1, x[idx, ])
      xtx <- xtx + crossprod(x1)
      xty <- xty + crossprod(x1, y[idx])
      start <- start + incr
   }
   solve(xtx, xty)
 }

 mylm(state.x77[,-2], state.x77[,2])


 On 8/16/07, Alp ATICI alpatici at gmail.com wrote:
  I'd like to fit linear models on very large datasets. My data frames
  are about 200 rows x 200 columns of doubles and I am using a 64-bit
  build of R. I've googled about this extensively and went over the
  R Data Import/Export guide. My primary issue is that although my data
  represented in ASCII form is 4 GB in size (it would be much smaller in
  binary), R consumes about 12 GB of virtual memory.
 
  What exactly are my options to improve this? I looked into the biglm
  package, but the problem with it is that it uses the update() function
  and is therefore not transparent (I am using a sophisticated script
  which is hard to modify). I really liked the concept behind the LM
  package here: http://www.econ.uiuc.edu/~roger/research/rq/RMySQL.html
  But it is no longer available. How could one fit linear models to very
  large datasets without loading the entire set into memory but from a
  file/database (possibly through a connection) using a relatively
  simple modification of standard lm()? Alternatively how could one
  improve the memory usage of R given a large dataset (by changing some
  default parameters of R or even using on-the-fly compression)? I don't
  mind much higher levels of CPU time required.
 
  Thank you in advance for your help.
 
  __
  R-help at stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
If your design matrix X is very well behaved this approach may work for
you. Often just doing solve(X'X, X'y) will fail for numerical reasons. The
right way to do it is to factor the matrix X as

   X = A * B

where B is 200 x 200 in your case and A is n x 200 (n being the number of
rows of X) with
   A'*A = I  (that is, A is orthogonal)

so X'*X = B'*B, and the normal equations X'X b = X'y reduce to
B b = A'y, i.e. in R

   solve(B, crossprod(A, y))

To find A and B you can use modified Gram-Schmidt, which is very easy to
program and works well when you wish to store the columns of X on a hard
disk and just read in a bit at a time. Some people claim that modified
Gram-Schmidt is unstable, but it has always worked well for me.
In any event you can check to ensure that A'*A = I and
   X = A*B
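For reference, a minimal in-memory modified Gram-Schmidt might look like this in R (my sketch, not Dave's code; it assumes X has full column rank):

```r
# Modified Gram-Schmidt: factor X = A %*% B with A'A = I, B upper triangular.
# (My sketch, assuming X has full column rank.)
mgs <- function(X) {
  p <- ncol(X)
  A <- X
  B <- matrix(0, p, p)
  for (j in seq_len(p)) {
    B[j, j] <- sqrt(sum(A[, j]^2))   # normalize column j
    A[, j] <- A[, j] / B[j, j]
    if (j < p) for (k in (j + 1):p) {
      B[j, k] <- sum(A[, j] * A[, k])        # project out column j
      A[, k] <- A[, k] - A[, j] * B[j, k]
    }
  }
  list(A = A, B = B)
}
```

Each pass touches only one column of X at a time, which is what makes the out-of-core variant Dave describes feasible.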

   Cheers,

Dave

-- 
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Help wit matrices

2007-08-10 Thread Ravi Varadhan
An even simpler solution is:

mat2 <- 1 * (mat1 > 0.25)

Ravi.
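The two suggested approaches (and a dim-preserving variant of the original attempt) agree; a quick check on a small matrix (my example, not from the thread):

```r
set.seed(1)
mat1 <- matrix(runif(16), 4, 4)
a <- 1 * (mat1 > 0.25)                            # Ravi's one-liner
b <- ifelse(mat1 > 0.25, 1, 0)                    # Jim's ifelse() version
d <- matrix(as.numeric(mat1 > 0.25), nrow(mat1))  # dim-preserving fix of the
                                                  # original as.matrix() attempt
stopifnot(all(a == b), all(a == d))
```

Comparison operators preserve the dim attribute, which is why `1 * (mat1 > 0.25)` keeps the matrix shape while `as.numeric()` alone drops it.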


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Lanre Okusanya
Sent: Friday, August 10, 2007 2:20 PM
To: jim holtman
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Help wit matrices

that was ridiculously simple. duh.

THanks

Lanre

On 8/10/07, jim holtman [EMAIL PROTECTED] wrote:
 Is this what you want:

 > x <- matrix(runif(100), 10)
 > round(x, 3)
        [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
  [1,] 0.268 0.961 0.262 0.347 0.306 0.762 0.524 0.062 0.028 0.226
  [2,] 0.219 0.100 0.165 0.131 0.578 0.933 0.317 0.109 0.527 0.131
  [3,] 0.517 0.763 0.322 0.374 0.910 0.471 0.278 0.382 0.880 0.982
  [4,] 0.269 0.948 0.510 0.631 0.143 0.604 0.788 0.169 0.373 0.327
  [5,] 0.181 0.819 0.924 0.390 0.415 0.485 0.702 0.299 0.048 0.507
  [6,] 0.519 0.308 0.511 0.690 0.211 0.109 0.165 0.192 0.139 0.681
  [7,] 0.563 0.650 0.258 0.689 0.429 0.248 0.064 0.257 0.321 0.099
  [8,] 0.129 0.953 0.046 0.555 0.133 0.499 0.755 0.181 0.155 0.119
  [9,] 0.256 0.954 0.418 0.430 0.460 0.373 0.620 0.477 0.132 0.050
 [10,] 0.718 0.340 0.854 0.453 0.943 0.935 0.170 0.771 0.221 0.929
 > ifelse(x > .5, 1, 0)
       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
  [1,]    0    1    0    0    0    1    1    0    0     0
  [2,]    0    0    0    0    1    1    0    0    1     0
  [3,]    1    1    0    0    1    0    0    0    1     1
  [4,]    0    1    1    1    0    1    1    0    0     0
  [5,]    0    1    1    0    0    0    1    0    0     1
  [6,]    1    0    1    1    0    0    0    0    0     1
  [7,]    1    1    0    1    0    0    0    0    0     0
  [8,]    0    1    0    1    0    0    1    0    0     0
  [9,]    0    1    0    0    0    0    1    0    0     0
 [10,]    1    0    1    0    1    1    0    1    0     1


 On 8/10/07, Lanre Okusanya [EMAIL PROTECTED] wrote:
  Hello all,
 
  I am working with a 1000x1000 matrix, and I would like to return a
  1000x1000 matrix that tells me which value in the matrix is greater
  than a threshold value (1 or 0 indicator).
  i have tried
   mat2 <- as.matrix(as.numeric(mat1 > 0.25))
  but that returns a 1:10 matrix.
  I have also tried for loops, but they are grossly inefficient.
 
  THanks for all your help in advance.
 
  Lanre
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 


 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem you are trying to solve?


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Mixture of Normals with Large Data

2007-08-05 Thread RAVI VARADHAN
Another possibility is to use data squashing methods.  Relevant papers are: 
(1) DuMouchel et al. (1999), (2) Madigan et al. (2002), and (3) Owen (1999).  

Ravi.


Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: [EMAIL PROTECTED]


- Original Message -
From: Charles C. Berry [EMAIL PROTECTED]
Date: Saturday, August 4, 2007 8:01 pm
Subject: Re: [R] Mixture of Normals with Large Data
To: [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch


 On Sat, 4 Aug 2007, Tim Victor wrote:
  
   All:
  
   I am trying to fit a mixture of 2 normals with > 110 million
   observations. I am running R 2.5.1 on a box with 1 GB RAM running
   32-bit Windows, and I continue to run out of memory. Does anyone
   have any suggestions?
  
  
  If the first few million observations can be regarded as an SRS of the
  rest, then just use them. Or read in blocks of a convenient size and
  sample some observations from each block. You can repeat this process
  a few times to see if the results are sufficiently accurate.
  
  Otherwise, read in blocks of a convenient size (perhaps 1 million
  observations at a time), quantize the data to a manageable number of
  intervals - maybe a few thousand - and tabulate it. Add the counts
  over all the blocks.
  
  Then use mle() to fit a multinomial likelihood whose probabilities are
  the masses associated with each bin under a mixture of normals law.
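Chuck's bin-and-tabulate recipe might look like this in outline (my sketch, not from the thread; the data are a simulated stand-in, and the use of stats4::mle, the 500 bins, and the starting values are all my own assumptions):

```r
library(stats4)
set.seed(42)
x <- c(rnorm(5e4, 0, 1), rnorm(5e4, 3, 1))   # stand-in for the huge data set

# Quantize to a few hundred bins and tabulate: this is the memory saver,
# since only the counts (not the raw data) enter the likelihood.
breaks <- seq(min(x), max(x), length.out = 501)
counts <- as.numeric(table(cut(x, breaks, include.lowest = TRUE)))
mids   <- (breaks[-1] + breaks[-length(breaks)]) / 2

# Multinomial negative log-likelihood of the bin counts under a 2-normal
# mixture (equal bin widths, so midpoint densities suffice up to a constant).
nll <- function(p, mu1, mu2, ls1, ls2) {
  w <- plogis(p)                             # mixing weight in (0, 1)
  dens <- w * dnorm(mids, mu1, exp(ls1)) +
    (1 - w) * dnorm(mids, mu2, exp(ls2))
  -sum(counts * log(dens + 1e-300))
}

fit <- mle(nll, start = list(p = 0, mu1 = -1, mu2 = 4, ls1 = 0, ls2 = 0))
coef(fit)
```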
  
  Chuck
  
  
   Thanks so much,
  
   Tim
  
  [[alternative HTML version deleted]]
  
   __
   R-help@stat.math.ethz.ch mailing list
   
   PLEASE do read the posting guide 
   and provide commented, minimal, self-contained, reproducible code.
  
  
  Charles C. Berry(858) 534-2098
  Dept of Family/Preventive Medicine
  UC San Diego
  La Jolla, San Diego 92093-0901
  
  __
  R-help@stat.math.ethz.ch mailing list
  
  PLEASE do read the posting guide 
  and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Computing the rank of a matrix.

2007-07-24 Thread Ravi Varadhan
Hi Martin,

I just realized (courtesy: ?qr) that LAPACK=TRUE always gives full rank, no
matter what the matrix and tolerance are. So, clearly my results using
LAPACK=TRUE should be ignored.  So, the real comparison is only between
rankMat and qr(., LAPACK=FALSE)$rank.  I can't help but feel that there can
be no correct solution to an ill-posed problem.  Furthermore, I haven't
come across a real application where the numerical estimate of a rank is
directly useful.

Best,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: Martin Maechler [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, July 24, 2007 1:43 PM
To: RAVI VARADHAN
Cc: Martin Maechler; r-help@stat.math.ethz.ch
Subject: Re: [R] Computing the rank of a matrix.

 RV == RAVI VARADHAN [EMAIL PROTECTED]
 on Sat, 07 Apr 2007 18:39:36 -0400 writes:
 

this is a bit a late reply...  better late than never

RV Hi Martin,

Hi Ravi, thanks for your research.


RV I played a bit with rankMat.  I will first state the
RV conclusions of my numerical experiments and then present
RV the results to support them:

RV 1.  I don't believe that rankMat, or equivalently
RV Matlab's rank computation, is necessarily better than
RV qr(.)$rank or (qr., LAPACK=TRUE)$rank.  In fact, for the
RV notorious Hilbert matrix, rankMat can give poor
RV estimates of rank.

RV 2.  There exists no universally optimal (i.e. optimal
RV for all A) tol in qr(A, tol)$rank that would be
RV optimally close to rankMat.  The discrepancy in the
RV ranks computed by qr(A)$rank and rankMat(A) obviously
RV depends on the matrix A. This is evident from the tol
RV used in rankMat:
RV  tol <- max(d) * .Machine$double.eps * abs(singValA[1])
RV So, this value of tol in qr will also minimize the discrepancy. 

RV 3.  The tol parameter is irrelevant in qr(A, tol,
RV LAPACK=TRUE)$rank, i.e. LAPACK doesn't seem to utilize
RV the tol parameter when computing the rank.  However,
RV qr(A, tol, LAPACK=FALSE)$rank does depend on tol.

Yes, indeed!  The help page of qr() actually says so 
{at least now, don't know about 3 months ago}.

RV Now, here are the results:
RV 1.
 hilbert.rank <- matrix(NA, 20, 3)
 hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, "+") }
 for (i in 1:20) {
RV +   himat <- hilbert(i)
RV +   hilbert.rank[i,1] <- rankMat(himat)
RV +   hilbert.rank[i,2] <- qr(himat, tol = 1.0e-14)$rank
RV +   hilbert.rank[i,3] <- qr(himat, tol = .Machine$double.eps, LAPACK = TRUE)$rank
RV + }
 hilbert.rank
RV       [,1] [,2] [,3]
RV  [1,]    1    1    1
RV  [2,]    2    2    2
RV  [3,]    3    3    3
RV  [4,]    4    4    4
RV  [5,]    5    5    5
RV  [6,]    6    6    6
RV  [7,]    7    7    7
RV  [8,]    8    8    8
RV  [9,]    9    9    9
RV [10,]   10   10   10
RV [11,]   10   11   11
RV [12,]   11   12   12
RV [13,]   11   12   13
RV [14,]   11   13   14
RV [15,]   12   13   15
RV [16,]   12   15   16
RV [17,]   12   16   17
RV [18,]   12   16   18
RV [19,]   13   17   19
RV [20,]   13   17   20

RV We see that rankMat underestimates the rank for n  10, but
qr(himat, tol = .Machine$double.eps, LAPACK=TRUE)$rank gets it right.

Yes, indeed; and that seems a bit against the ``common
knowledge'' that svd() is more reliable than qr()

Hmm, I'm still baffled a bit..

Note that with the Hilbert matrices, 
one might argue that  hilbert(20) might really not have a
correct estimated rank of 20,
but at least for hilbert(13) or so, the rank should be == n.
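The Matlab-style rule that rankMat implements (count singular values above max(dim) * eps * sigma_max) is easy to state directly; here is my own sketch of that rule, not the original rankMat code:

```r
# SVD-based rank with the Matlab-style tolerance (my sketch of the rule
# rankMat implements, not the rankMat code from the thread).
rank_svd <- function(x) {
  d <- svd(x, nu = 0, nv = 0)$d
  sum(d > max(dim(x)) * .Machine$double.eps * d[1])
}

hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, "+") }
c(svd = rank_svd(hilbert(12)), qr = qr(hilbert(12))$rank)
```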

BTW, there's a nice plot, slightly related to this problem,
using rcond() from the Matrix package :

  library(Matrix)

  hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, "+") }
  rcHilb <- sapply(1:20, function(n) rcond(Matrix(hilbert(n))))

  plot(rcHilb, xlab = "n", log = "y", col = 2, type = "b",
       main = "reciprocal condition numbers of Hilbert(n)")

where one sees that the LAPACK code that underlies
Matrix::rcond() also gets into problems when estimating the
condition number of hilbert(n) when n ~ 16 .

I think if we wanted to make real progress here, we'd have to
consult with numerical analysis specialists.
But for me the topic is too remote to be followed up further at
the moment.

One conclusion for me is that to estimate the rank of a matrix
in current versions of R, one should use

  rankMat <- function(x)  qr(x, LAPACK = TRUE)$rank

(as was suggested as one possibility in the original

Re: [R] Linear programming question

2007-07-18 Thread Ravi Varadhan
Tobias,

Adding the first constraints yields:
S1 + S2 = -2A

Similarly adding the second set of constraints:
S3 + S4 = 2B

If A and B are positive (which you didn't specify), then the minimum of
S1+S2 is -2A, and the maximum of S3+S4 is 2B.

Thus, the minimum of S1+S2-S3-S4 is -2(A+B).  

Ravi. 


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Tobias Schlottmann
Sent: Wednesday, July 18, 2007 1:24 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Linear programming question

 Hi everybody,
  consider please an optimization problem:  
   
  minimize   sum S1+S2
   
  Subject to :  y - x = A + S1
 x - y = A + S2
   
  and we want to add two more constraints:
   
   y - x = B - S3
   x - y = B - S4
   
  where A is a small constant value and B is a large constant value, S1 and
S2 are surplus and S3 and S4 are slack variables.
   
  S3 and S4 have to be maximized in the objective function. As the
objective function, is this correct?  :
   
  minimize sum S1+ S2 - S3 -S4
   
  where actually we want to minimize S1 and S2; and maximize S3 and S4.
   
  If it is not correct, what to do ?
   
  Thank you for any guide.
   
  Tobias

 
-


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Linear programming question

2007-07-18 Thread Ravi Varadhan
Tobias,

Just a clarification/correction to my solution: it makes no difference
whether A and B are positive or negative.  The minimum of S1+S2-S3-S4 is
always -2(A+B).

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Ravi Varadhan
Sent: Wednesday, July 18, 2007 1:52 PM
To: 'Tobias Schlottmann'; r-help@stat.math.ethz.ch
Subject: Re: [R] Linear programming question

Tobias,

Adding the first constraints yields:
S1 + S2 = -2A

Similarly adding the second set of constraints:
S3 + S4 = 2B

If A and B are positive (which you didn't specify), then the minimum of
S1+S2 is -2A, and the maximum of S3+S4 is 2B.

Thus, the minimum of S1+S2-S3-S4 is -2(A+B).  

Ravi. 


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Tobias Schlottmann
Sent: Wednesday, July 18, 2007 1:24 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Linear programming question

 Hi everybody,
  consider please an optimization problem:  
   
  minimize   sum S1+S2
   
  Subject to :  y - x = A + S1
 x - y = A + S2
   
  and we want to add two more constraints:
   
   y - x = B - S3
   x - y = B - S4
   
  where A is a small constant value and B is a large constant value, S1 and
S2 are surplus and S3 and S4 are slack variables.
   
  S3 and S4 have to be maximized in the objective function. As the
objective function, is this correct?  :
   
  minimize sum S1+ S2 - S3 -S4
   
  where actually we want to minimize S1 and S2; and maximize S3 and S4.
   
  If it is not correct, what to do ?
   
  Thank you for any guide.
   
  Tobias

 
-


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Optimization

2007-07-17 Thread Ravi Varadhan
Hi,

Your problem can be solved analytically.  Eliminate one of the variables,
say x3, from the problem by using the equality x1 + x2 + x3 = 1. Then solve
for the intersection of the circle (in x1 and x2) defined by the radical
constraint, with the straight line defined by the objective function.  There
will be, at most, two intersection points. The extremum has to be one of
these two points, provided they also satisfy the other inequalities (To me,
this sounds an awful lot like a homework problem).  
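A brute-force numeric version of this recipe (my sketch, not from the thread; it treats the radical constraint as an equality, as in the original post, and the coarse 0.001 grid is an arbitrary choice):

```r
obj <- function(x1, x2) x1*0.021986 + x2*0.000964 + (1 - x1 - x2)*0.02913
g   <- function(x1, x2)
  sqrt((x1*0.114434)^2 + (x2*0.043966)^2 + ((1 - x1 - x2)*0.100031)^2) - 0.04

best <- -Inf; best_x <- NULL
for (x1 in seq(0, 0.999, by = 0.001)) {
  f2 <- function(z) g(x1, z)
  m <- optimize(f2, c(0, 1 - x1))        # most negative value of g in x2
  if (m$objective < 0) {                 # constraint curve crosses g = 0 here
    for (iv in list(c(0, m$minimum), c(m$minimum, 1 - x1))) {
      if (f2(iv[1]) * f2(iv[2]) < 0) {
        x2 <- uniroot(f2, iv, tol = 1e-10)$root
        v  <- obj(x1, x2)
        if (v > best) { best <- v; best_x <- c(x1, x2, 1 - x1 - x2) }
      }
    }
  }
}
best; best_x
```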


Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of massimiliano.talarico
Sent: Monday, July 16, 2007 4:50 PM
To: r-help
Subject: [R] Optimization

Dear all,
I need a suggestion to obtain the max of this function:

Max x1*0.021986+x2*0.000964+x3*0.02913

with these conditions:

x1+x2+x3 = 1;
sqrt((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2) = 0.04;
x1 >= 0;
x1 <= 1;
x2 >= 0;
x2 <= 1;
x3 >= 0;
x3 <= 1;

Any suggestions?

Thanks in advance,
Massimiliano

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] nearest correlation to polychoric

2007-07-13 Thread Ravi Varadhan
Martin,

I sent you the Matlab code for this, which I had obtained from Nick Higham.

Cheng, Sheung Hun and Higham, Nick (1998)
  A Modified Cholesky Algorithm Based on a Symmetric Indefinite
Factorization;
  \emph{SIAM J. Matrix Anal.\ Appl.}, \bold{19}, 1097--1110.

Do you remember?

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Martin Maechler
Sent: Friday, July 13, 2007 10:29 AM
To: Dimitris Rizopoulos
Cc: Jens Oehlschlägel; r-help@stat.math.ethz.ch
Subject: Re: [R] nearest correlation to polychoric

 DR == Dimitris Rizopoulos [EMAIL PROTECTED]
 on Fri, 13 Jul 2007 14:43:08 +0200 writes:

DR you could also have a look at function posdefify() from
DR package `sfsmisc'.

DR I hope it helps.

Yes, thanks, Dimitris; note that my posdefify() function uses a
pretty arbitrary fudge value for posdefification, namely
  eps.ev = 1e-07
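The eigenvalue-clipping idea can be sketched as follows (my code illustrating the principle, not the actual sfsmisc::posdefify() implementation; the function name is made up):

```r
# Clip small/negative eigenvalues of a symmetric "correlation" matrix and
# rescale back to unit diagonal.  (My sketch of the posdefify idea only.)
posdef_clip <- function(S, eps.ev = 1e-7) {
  e <- eigen(S, symmetric = TRUE)
  d <- pmax(e$values, eps.ev * max(abs(e$values)))  # the fudge cutoff
  M <- e$vectors %*% (d * t(e$vectors))             # V diag(d) V'
  Dinv <- 1 / sqrt(diag(M))                         # restore unit diagonal
  M * outer(Dinv, Dinv)
}
```

The diagonal rescaling is a congruence transform, so the clipped matrix stays positive definite.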

As a matter of fact, earlier this week (at the first
international R/Rmetrics workshop),
I talked to people from finance who also need that (or
something better?).

Jordi Molins Coronado (Madrid) drew my attention to an idea
he found in the book (English re-edition of French as of 1996)
Jean-Philippe Bouchaud (2000)
Theory of Financial Risk and Derivative Pricing:
From Statistical Physics to Risk Management

which supposedly uses theory of random matrices and
the entailing distribution of random eigenvalues in order to
find a more sensible cutoff than my 'eps.ev' default of 1e-7.
Unfortunately that book is checked out from our library and I
can't have a look.   Googling and Wikipedia seem to indicate to
me that most of the random matrix theory does not directly apply
here, since I'm really interested in the spectrum of X'X where X
is a de-meaned n x p random matrix.


OTOH, help(posdefify) already mentions more sophisticated
approaches to the problem, the one I (as I vaguely remember)
should be made available being 

  Cheng, Sheung Hun and Higham, Nick (1998)
  A Modified Cholesky Algorithm Based on a Symmetric Indefinite
Factorization;
  \emph{SIAM J. Matrix Anal.\ Appl.}, \bold{19}, 1097--1110.


Jens, could you make your code (mentioned below) available to
the community, or even donate it to be included as a new method of
posdefify()?

Regards,
Martin Maechler, ETH Zurich


DR Best, Dimitris

DR Dimitris Rizopoulos
DR Ph.D. Student, Biostatistical Centre
DR School of Public Health
DR Catholic University of Leuven

DR Address: Kapucijnenvoer 35, Leuven, Belgium Tel:
DR +32/(0)16/336899 Fax: +32/(0)16/337015 Web:
DR http://med.kuleuven.be/biostat/
DR http://www.student.kuleuven.be/~m0390867/dimitris.htm


DR - Original Message - From: Jens Oehlschlägel
DR [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent:
DR Friday, July 13, 2007 2:25 PM Subject: [R] nearest
DR correlation to polychoric


DR Dear all,

DR Has someone implemented in R (or any other language)

DR Knol DL, ten Berge JMF.  Least-squares approximation of
DR an improper correlation matrix by a proper one.
DR Psychometrika, 1989, 54, 53-61.

DR or any other similar algorithm?

DR Best regards


DR Jens Oehlschlägel


DR Background:

DR I want to factanal() matrices of polychoric correlations
DR which have negative eigenvalues. I coded

DR Higham (2002) Computing the nearest correlation matrix -
DR a problem from finance, IMA Journal of Numerical
DR Analysis, 22, 329-343.

DR which basically works but still leaves very small
DR negative eigenvalues which causes factanal() to fail
DR with

 factanal(cov = ncor$cor, factors = 2)
DR Error in optim(start, FAfn, FAgr, method = "L-BFGS-B",
DR lower = lower, : L-BFGS-B needs finite values of 'fn'
DR In addition: Warning message: NaNs produced
DR in: log(x)
 version
DR                _
DR platform       i386-pc-mingw32
DR arch           i386
DR os             mingw32
DR system         i386, mingw32
DR status
DR major          2
DR minor          4.1
DR year           2006
DR month          12
DR day            18
DR svn rev        40228
DR language       R
DR version.string R version 2.4.1 (2006-12-18)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Stepwise GLM selection by LRT?

2007-07-11 Thread Ravi Varadhan
Check out the stepAIC function in the MASS package.  This is a nice tool
where you can actually implement any penalty, even though the function's
name has AIC in it (that is just the default).  Although this doesn't do
LRT-based variable selection directly, you can approximate it by using a
penalty of k = qchisq(1-p, df=1), where p is the p-value threshold for
variable selection.  This penalty means that a variable enters/exits an
existing model when the change in deviance (twice the absolute change in
log-likelihood) is greater than qchisq(1-p, df=1).  For p = 0.1, k = 2.71,
and for p = 0.05, k = 3.84.  Is this what you'd like to do?
 
Ravi.
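Concretely, the approximation might look like this (my example, not from the thread; MASS must be installed, and the data here are simulated noise):

```r
library(MASS)
set.seed(1)
d <- data.frame(y  = rbinom(200, 1, 0.5),
                x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
full <- glm(y ~ x1 + x2 + x3, family = binomial, data = d)

# k = qchisq(1 - p, df = 1) mimics selection at a p-value threshold p = 0.05
sel <- stepAIC(full, k = qchisq(0.95, df = 1), trace = FALSE)
formula(sel)
```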


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Lutz Ph. Breitling
Sent: Wednesday, July 11, 2007 3:06 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Stepwise GLM selection by LRT?

Dear List,

having searched the help and archives, I have the impression that
there is no automatic model selection procedure implemented in R that
includes/excludes predictors in logistic regression models based on
LRT P-values. Is that true, or is someone aware of an appropriate
function somewhere in a custom package?

Even if automatic model selection and LRT might not be the most
appropriate methods, I actually would like to use these in order to
simulate someone else's modeling approach...

Many thanks for all comments-
Lutz
-
Lutz Ph. Breitling
German Cancer Research Center
Heidelberg/Germany

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] integration over a simplex

2007-07-10 Thread RAVI VARADHAN
Hi Robin,

A Monte-Carlo approach could be attempted, if one could generate samples that 
are uniformly distributed over the simplex.  There is a small section in 
Luc Devroye's book (Non-Uniform Random Variate Generation) on random 
uniform sampling from a simplex, if I remember correctly.
Another approach is importance sampling, where the sampling points have a 
well-characterized distribution.  I have seen a technique called polyEDA, based on 
Gibbs sampling and the truncated multivariate normal distribution.  I had 
previously emailed the authors of this approach for the code, but haven't 
received a reply yet.  You can google polyEDA for more info.  

I am interested in various computational problems related to polyhedra (e.g. 
enumeration of vertices, locating extrema, random sampling).  I would 
appreciate if you'd keep me posted on how you solved this problem.  
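For the uniform-sampling idea above, one standard construction (normalizing i.i.d. exponentials, equivalent to drawing from Dirichlet(1,...,1)) can be sketched as follows; the function name and the test integrand are my own choices, not from the thread:

```r
## Uniform sampling on the unit simplex {p : p_i > 0, sum(p) = 1}:
## normalize i.i.d. Exponential(1) draws (a Dirichlet(1,...,1) sample).
rsimplex <- function(n, d) {
  e <- matrix(rexp(n * d), nrow = n, ncol = d)
  e / rowSums(e)
}

## Crude Monte-Carlo integration over the 2-simplex: the integrand's mean
## under the uniform law, times the simplex volume 1/(d-1)!.
## For f(p) = p1*p2*p3 the exact value is 1/120.
set.seed(123)
p   <- rsimplex(1e5, 3)
est <- mean(p[, 1] * p[, 2] * p[, 3]) / factorial(3 - 1)
```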

Best,
Ravi.

- Original Message -
From: Robin Hankin [EMAIL PROTECTED]
Date: Tuesday, July 10, 2007 6:58 am
Subject: [R] integration over a simplex
To: RHelp help r-help@stat.math.ethz.ch


 Hello
  
  The excellent adapt package integrates over multi-dimensional
  hypercubes.
  
  I want to integrate over a multidimensional simplex.  Has anyone
  implemented such a thing in R?
  
  I can transform an n-simplex to a hyperrectangle
  but the Jacobian is a rapidly-varying (and very lopsided)
  function and this is making adapt() slow.
  
  [
  A \dfn{simplex} is an n-dimensional analogue of a triangle or
  tetrahedron.
  It is the convex hull of (n+1) points in an n-dimensional Euclidean
  space.
  
  My application is a variant of the Dirichlet distribution:
  With p ~ D(a), if length(p) = n+1 then the requirements that
  all(p > 0) and sum(p) == 1 mean that the support of the
  Dirichlet distribution is an n-simplex.
  ]
  
  
  --
  Robin Hankin
  Uncertainty Analyst
  National Oceanography Centre, Southampton
  European Way, Southampton SO14 3ZH, UK
tel  023-8059-7743
  

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] No convergence using ADAPT

2007-07-09 Thread Ravi Varadhan
Since the covariance is zero (i.e. you have independent normals), you can
simplify your problem so that you just need to perform one-dimensional
integration.  Here is how you can do it:


trial2 <- function(input) {
# pmvnorm(lower = c(0, 0), upper = c(10, 10), mean = input,
#         sigma = matrix(c(.01, 0, 0, .01), nrow = 2, ncol = 2))
pnorm(q = 10, mean = input, sd = sqrt(.01)) -
  pnorm(q = 0, mean = input, sd = sqrt(.01))
}

bottomB <- -5*sqrt(.01)
topB <- 10 + 5*sqrt(.01)
areaB <- (topB - bottomB)^2

new.ans <- (1/areaB) * (integrate(f = trial2, lower = bottomB, upper = topB)$value)^2
> new.ans
[1] 0.8264463

Hope this is helpful,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Philip Turk
Sent: Saturday, July 07, 2007 3:20 PM
To: r-help@stat.math.ethz.ch
Subject: [R] No convergence using ADAPT

I am trying to calculate a probability using numerical integration.  The first 
program I ran spit out an answer in a very short time.  The program is below:

## START PROGRAM

trial <- function(input)
{
pmvnorm(lower = c(0, 0), upper = c(2, 2), mean = input,
        sigma = matrix(c(.1, 0, 0, .1), nrow = 2, ncol = 2, byrow = FALSE))
}

require(mvtnorm)
require(adapt)

bottomB <- -5*sqrt(.1)
topB <- 2 + 5*sqrt(.1)
areaB <- (topB - bottomB)^2

unscaled.Po.in.a <- adapt(2, lo = c(bottomB, bottomB), up = c(topB, topB),
                          minpts = 1000, eps = 1e-4, functn = trial)

(1/areaB)*unscaled.Po.in.a$value

## FINISH PROGRAM

I tried to run the program again changing a.) sigma in the trial function,
b.) 
upper in the trial function, and c.) the bounds of integration; that is, 
bottomB and topB.  The new program is below:

## START PROGRAM

trial <- function(input)
{
pmvnorm(lower = c(0, 0), upper = c(10, 10), mean = input,
        sigma = matrix(c(.01, 0, 0, .01), nrow = 2, ncol = 2, byrow = FALSE))
}

require(mvtnorm)
require(adapt)

bottomB <- -5*sqrt(.01)
topB <- 10 + 5*sqrt(.01)
areaB <- (topB - bottomB)^2

unscaled.Po.in.a <- adapt(2, lo = c(bottomB, bottomB), up = c(topB, topB),
                          minpts = 1000, eps = 1e-4, functn = trial)

(1/areaB)*unscaled.Po.in.a$value

## FINISH PROGRAM

Now, the program just runs and runs (48 hours at last count!).  By playing 
around with the program, I have deduced that the program is highly sensitive to 
changing the upper option in the trial function.  For example, using a vector 
like (4, 4) causes no problems and the program quickly yields an answer.  I 
have a couple of other programs where I can easily obtain a simulation-based 
answer, but I would ultimately like to know what's up with this program before 
I give up on it, so I can learn a thing or two.  Does anyone have any clues or 
tricks to get around this problem?  My guess is that it will simply be very 
difficult (impossible?) to obtain this type of relative error (eps = 1e-4) and 
I will have no choice but to pursue the simulation approach.

Thanks for any responses ([EMAIL PROTECTED])!

-- Phil

Philip Turk
Assistant Professor of Statistics
Northern Arizona University
Department of Mathematics and Statistics
PO Box 5717
Flagstaff, AZ 86011
Phone: 928-523-6884
Fax: 928-523-5847
E-mail: [EMAIL PROTECTED]
Web Site: http://jan.ucc.nau.edu/~stapjt-p/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Incidence estimated from Kaplan-Meier

2007-07-05 Thread Ravi Varadhan
The 1 - Pr(disease-free survival) estimate from KM is not appropriate if a
competing risk of mortality (from causes other than the disease of interest)
is present.  In that case, 1 - Pr(disease-free survival) over-estimates the
cumulative incidence of disease.  The larger the hazard of mortality, the
larger the over-estimation.  This is a well-known phenomenon in the
competing risks literature.  See, for example, Gooley et al. (Stats in Med
1999).
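To make the contrast concrete, here is a sketch with simulated data (event codes, sample size, and rates are all made up); cuminc() from the cmprsk package computes the competing-risks cumulative incidence, while the naive estimate censors deaths:

```r
library(cmprsk)    # cuminc(): cumulative incidence under competing risks
library(survival)  # survfit(): Kaplan-Meier

## Simulated follow-up: event code 0 = censored, 1 = disease, 2 = death.
set.seed(1)
n       <- 500
ftime   <- rexp(n, rate = 0.3)
fstatus <- sample(0:2, n, replace = TRUE, prob = c(0.2, 0.3, 0.5))

## Competing-risks cumulative incidence of disease (cause 1):
ci <- cuminc(ftime, fstatus)

## Naive 1 - KM that treats deaths as censoring -- over-estimates incidence:
km    <- survfit(Surv(ftime, fstatus == 1) ~ 1)
naive <- 1 - min(km$surv)
```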

Ravi. 


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Frank E Harrell Jr
Sent: Thursday, July 05, 2007 8:48 AM
To: Nguyen Dinh Nguyen
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Incidence estimated from Kaplan-Meier

Nguyen Dinh Nguyen wrote:
 Dear all,
 
 I have a stat question that may not be related to R, but I would like to
 have your advice.
 
  
 
 I have just read a medical paper in which the authors report the 1-p
(where
 p is the cumulative survival probability from the Kaplan Meier curve) as
 incidence of disease.  
 
  
 
 Specifically, the study followed ~12000 women on drug A and ~2 women on
 drug B for 12 months.  During that period 29 women on drug A and 80 on drug
 B had the disease.  The incidence of disease for A and B was 0.24% and 0.30%
 respectively.  However, instead of reporting these numbers, they report the
 1-p figure which was 0.3% for A and 0.6% for B. 
 
  
 
 So, the incidence from 1-p was substantially higher than the actual
 incidence.  My question is: is it appropriate to use 1-p estimated from
 Kaplan-Meier as the incidence of disease?  If not, why not? 
 
  
 
 Regards,
 
 Nguyen

Yes it's appropriate, and it makes you state the cumulative incidence by 
time t rather than leaving time unspecified.  In your example it is 
likely that all women weren't followed completely, so simple incidences 
are not appropriate to compute because the denominator is not constant.

Frank

 
  
 
  
 Nguyen Dinh Nguyen, 
 
 Bone and Mineral Research Program 
 Garvan Institute of Medical Research 
 St Vincent's Hospital 
 384 Victoria Street, Darlinghurst 
 Sydney, NSW 2010 
 Australia 
 Tel; 61-2-9295 8274 
 Fax: 61-2-9295 8241 
 E-mail: [EMAIL PROTECTED] 
 
  
 
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] unexpected result in function valuation

2007-07-05 Thread Ravi Varadhan
The problem is that you are dividing two numbers that are both very small.
Any small imprecision in the denominator creates a big error. 

Here you can re-write your function using a trig identity to get accurate
results:

sinca <- function(N, th) {
# return(sin((N + 0.5) * th) / sin(0.5 * th))
return((sin(N * th) / tan(th / 2)) + cos(N * th))
}

This function works well, as you had expected.
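The failure mode is easy to see numerically; a small sketch (the grid index 701 is the point that should be 2*pi in the original post):

```r
th2 <- seq(-5 * pi, 5 * pi, by = pi / 100)[701]  # grid point meant to be 2*pi

sin(0.5 * th2)  # ~1e-16: the denominator sits at machine-precision level
bad  <- sin((4 + 0.5) * th2) / sin(0.5 * th2)       # spikes (10.07404 in the post)
good <- (sin(4 * th2) / tan(th2 / 2)) + cos(4 * th2)
good            # stays very close to the true limiting value, 2*4 + 1 = 9
```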

Ravi. 


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of James Foadi
Sent: Thursday, July 05, 2007 1:39 PM
To: r-help@stat.math.ethz.ch
Subject: [R] unexpected result in function valuation

Dear all,
I have a very small script to plot a function. Here it is:

##
sinca <- function(N, th)
{
  return(sin((N + 0.5) * th) / sin(0.5 * th))
}

plot_sinca <- function(N)
{
  x <- seq(-5 * pi, 5 * pi, by = pi / 100)
  y <- rep(0, length = length(x))
  for (i in 1:length(x)) y[i] <- sinca(N, x[i])
  plot(x, y, type = "l", ylim = c(0, 2 * N + 4))
  return(c(x, y))
}
##

When I load the script and run the function like this:

###
> data <- plot_sinca(4)
> y <- data[1002:2002]
###

I notice a spike on the plot which should not be there.
In fact I have checked and:
###
> y[701]
[1] 10.07404
> sinca(4, 2*pi)
[1] 9
###

The second result is the correct one. Why, then do
I get the y[701]=10.07404? This function is not supposed
to be higher than 9...

Any help is greatly appreciated.

Regards,

J

Dr James Foadi
Membrane Protein Laboratory
Diamond Light Source Ltd
Chilton, Didcot
Oxfordshire OX11 0DE
---


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] how to solve a min problem

2007-07-04 Thread RAVI VARADHAN
Whether you can use optim or not depends on the nature of the constraints on 
S.  If you have simple box constraints, you can use the L-BFGS-B method in 
optim.  If not, optim may not be directly applicable, unless you can somehow 
transform your problem into an unconstrained minimization problem. 
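For the box-constraint case, a minimal sketch (the objective below is made up purely for illustration; lower/upper define the box):

```r
## Box-constrained minimization via optim's L-BFGS-B method.
obj <- function(S) sum((S - 2)^2)          # hypothetical smooth objective
fit <- optim(par = rep(0.5, 10), fn = obj, method = "L-BFGS-B",
             lower = rep(0, 10), upper = rep(1, 10))
fit$par          # here every component is pushed to the upper bound, 1
fit$convergence  # 0 signals successful convergence
```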

Ravi.

- Original Message -
From: domenico pestalozzi [EMAIL PROTECTED]
Date: Wednesday, July 4, 2007 11:26 am
Subject: Re: [R] how to solve a min problem
To: R-help r-help@stat.math.ethz.ch


 S is a 1-dimensional array, for example 1 x 10, and mean(S) is the 
 mean of
  these 10 elements.
  
  So, I want to do:
  
  minimize mean(S) with 0 < b_func(S) < 800.
  That is, there are some boundaries on S according to b_func.
  
  The function applies an iterative convergent criterion:
  
  f_1 = g(S), f_2 = g(f_1), f_3 = g(f_2), etc.
  The function stops when
  f_n - f_(n-1) <= 0.1e-09
  and g(S) is a non-linear function of S and the convergence is mathematically
  assured.
  
  Is it possible to use 'optimize'?
  
  thanks
  
  domenico
  
  
  2007/7/3, Spencer Graves [EMAIL PROTECTED]:
  
Do you mean
  
 minimize mu with 0 < b_func(S+mu) < 800?
  
For this kind of problem, I'd first want to know the nature of
   b_func.  Without knowing more, I might try to plot b_func(S+mu) vs.
   mu, then maybe use 'optimize'.
  
If this is not what you mean, please be more specific:  I'm
   confused.
  
Hope this helps.
Spencer Graves
  
   domenico pestalozzi wrote:
 I know it's possible to solve max and min problems by using these
   functions:
   
nlm, optimize, optim
   
but I don't know how to use them (...if possible...) to solve this
   problem.
   
I have a personal function called  b_func(S) where S is an input 
 array
   (1 X
n)  and I'd like:
   
 minimize mean(S) with 0 < b_func < 800.
   
 I know that the solution exists, but is it possible to calculate 
  it in R?
The b_func is non linear and it calculates a particular value 
 using S as
input and applying a convergent iterative algorithm.
   
thanks
   
   
domenico
   
   
   
  
  
  

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to solve a min problem

2007-07-04 Thread RAVI VARADHAN
If the constraints on S are linear inequalities, then linear programming 
methods would work.  See function solveLP in package linprog or simplex in 
boot or package lpSolve.

Ravi.

- Original Message -
From: domenico pestalozzi [EMAIL PROTECTED]
Date: Wednesday, July 4, 2007 11:26 am
Subject: Re: [R] how to solve a min problem
To: R-help r-help@stat.math.ethz.ch


 S is a 1-dimensional array, for example 1 x 10, and mean(S) is the 
 mean of
  these 10 elements.
  
  So, I want to do:
  
  minimize mean(S) with 0 < b_func(S) < 800.
  That is, there are some boundaries on S according to b_func.
  
  The function applies an iterative convergent criterion:
  
  f_1 = g(S), f_2 = g(f_1), f_3 = g(f_2), etc.
  The function stops when
  f_n - f_(n-1) <= 0.1e-09
  and g(S) is a non-linear function of S and the convergence is mathematically
  assured.
  
  Is it possible to use 'optimize'?
  
  thanks
  
  domenico
  
  
  2007/7/3, Spencer Graves [EMAIL PROTECTED]:
  
Do you mean
  
 minimize mu with 0 < b_func(S+mu) < 800?
  
For this kind of problem, I'd first want to know the nature of
   b_func.  Without knowing more, I might try to plot b_func(S+mu) vs.
   mu, then maybe use 'optimize'.
  
If this is not what you mean, please be more specific:  I'm
   confused.
  
Hope this helps.
Spencer Graves
  
   domenico pestalozzi wrote:
 I know it's possible to solve max and min problems by using these
   functions:
   
nlm, optimize, optim
   
but I don't know how to use them (...if possible...) to solve this
   problem.
   
I have a personal function called  b_func(S) where S is an input 
 array
   (1 X
n)  and I'd like:
   
 minimize mean(S) with 0 < b_func < 800.
   
 I know that the solution exists, but is it possible to calculate 
  it in R?
The b_func is non linear and it calculates a particular value 
 using S as
input and applying a convergent iterative algorithm.
   
thanks
   
   
domenico
   
   
   
  
  
  

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fine tunning rgenoud

2007-07-03 Thread Ravi Varadhan
Paul,

You had indicated in your previous email that you are having trouble finding
a feasible starting value for constrOptim().  So, you basically need to
solve a system of linear inequalities to obtain a starting point.  Have you
considered using linear programming? Either simplex() in the boot package
or solveLP() in linprog would work.  It seems to me that you could use any
linear objective function in solveLP to obtain a feasible starting point.
This is not the most efficient solution, but it might be worth a try. 

I am aware of other methods for generating n-tuples that satisfy linear
inequality constraints, but AFAIK those are not available in R.
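As a sketch of the LP idea with a made-up two-variable system (note that boot::simplex() also imposes x >= 0 on all variables, so only constraints consistent with that fit this form):

```r
library(boot)  # simplex(): simplex method for linear programs

## Find a feasible point for A1 %*% x <= b1 (x >= 0 implicit in simplex()):
## any linear objective will do, since we only want a feasible vertex.
A1  <- rbind(c(1, 1),    # x + y <= 4
             c(1, -1))   # x - y <= 2
b1  <- c(4, 2)
ans <- simplex(a = c(1, 1), A1 = A1, b1 = b1, maxi = FALSE)
ans$soln   # a vertex of the feasible region, usable as a starting value
```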

Best,
Ravi.



---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Paul Smith
Sent: Tuesday, July 03, 2007 4:10 PM
To: R-help
Subject: [R] Fine tunning rgenoud

Dear All,

I am trying to solve the following maximization problem, but I cannot
have rgenoud giving me a reliable solution.

Any ideas?

Thanks in advance,

Paul


library(rgenoud)

v <- 0.90
O1 <- 10
O2 <- 20
O0 <- v*O1 + (1-v)*O2

myfunc <- function(x) {
  U0 <- x[1]
  U1 <- x[2]
  U2 <- x[3]
  q0 <- x[4]
  q1 <- x[5]
  q2 <- x[6]
  p <- x[7]

  if (U0 < 0)
    return(-1e+200)
  else if (U1 < 0)
    return(-1e+200)
  else if (U2 < 0)
    return(-1e+200)
  else if ((U0-(U1+(O1-O0)*q1)) < 0)
    return(-1e+200)
  else if ((U0-(U2+(O2-O0)*q2)) < 0)
    return(-1e+200)
  else if ((U1-(U0+(O0-O1)*q0)) < 0)
    return(-1e+200)
  else if ((U1-(U2+(O2-O1)*q2)) < 0)
    return(-1e+200)
  else if ((U2-(U0+(O0-O2)*q0)) < 0)
    return(-1e+200)
  else if ((U2-(U1+(O1-O2)*q1)) < 0)
    return(-1e+200)
  else if (p < 0)
    return(-1e+200)
  else if (p > 1)
    return(-1e+200)
  else if (q0 < 0)
    return(-1e+200)
  else if (q1 < 0)
    return(-1e+200)
  else if (q2 < 0)
    return(-1e+200)
  else
    return(p*(sqrt(q0)-(O0*q0+U0)) + (1-p)*(v*(sqrt(q1)-(O1*q1+U1)) + (1-v)*(sqrt(q2)-(O2*q2+U2))))
}

genoud(myfunc, nvars = 7, max = TRUE, pop.size = 6000, starting.values = runif(7),
       wait.generations = 150, max.generations = 300, boundary.enforcement = 2)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Fine tunning rgenoud

2007-07-03 Thread Ravi Varadhan
Paul,

It should be easy enough to check that your solution is valid (i.e. a local
minimum):  first, check to see if the solution satisfies all the
constraints; secondly, check to see if it is an interior point (i.e. none of
the constraints become equality); and finally, if the solution is an
interior point, check to see whether the gradient there is close to zero.
Note that if the solution is one of the vertices of the polyhedron, then the
gradient may not be zero.
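Those three checks can be sketched as a small helper (the function name and example are my own; ui/ci follow constrOptim's convention ui %*% theta - ci >= 0, and grad() is from the numDeriv package):

```r
library(numDeriv)  # grad(): accurate numerical gradients

check_solution <- function(f, theta, ui, ci, tol = 1e-6) {
  slack <- drop(ui %*% theta - ci)
  list(feasible  = all(slack >= 0),           # all constraints satisfied?
       interior  = all(slack > tol),          # strictly inside the region?
       grad_norm = max(abs(grad(f, theta))))  # ~0 at an interior optimum
}

## Made-up example: minimum of a quadratic, constrained to theta >= 0.
f   <- function(t) sum((t - 0.5)^2)
chk <- check_solution(f, theta = c(0.5, 0.5), ui = diag(2), ci = c(0, 0))
```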

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Paul Smith
Sent: Tuesday, July 03, 2007 5:10 PM
To: R-help
Subject: Re: [R] Fine tunning rgenoud

On 7/3/07, Ravi Varadhan [EMAIL PROTECTED] wrote:
 You had indicated in your previous email that you are having trouble
finding
 a feasible starting value for constrOptim().  So, you basically need to
 solve a system of linear inequalities to obtain a starting point.  Have
you
 considered using linear programming? Either simplex() in the boot
package
 or solveLP() in linprog would work.  It seems to me that you could use
any
 linear objective function in solveLP to obtain a feasible starting point.
 This is not the most efficient solution, but it might be worth a try.

 I am aware of other methods for generating n-tuples that satisfy linear
 inequality constraints, but AFAIK those are not available in R.

Thanks, Ravi. I had already conceived the solution that you suggest,
actually using lpSolve. I am able to get a solution for my problem
with constrOptim, but I am not confident enough that the solution is
right. That is why I am trying to get a solution with rgenoud, but
unsuccessfully until now.

Paul



 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Paul Smith
 Sent: Tuesday, July 03, 2007 4:10 PM
 To: R-help
 Subject: [R] Fine tunning rgenoud

 Dear All,

 I am trying to solve the following maximization problem, but I cannot
 have rgenoud giving me a reliable solution.

 Any ideas?

 Thanks in advance,

 Paul

 
 library(rgenoud)

 v <- 0.90
 O1 <- 10
 O2 <- 20
 O0 <- v*O1 + (1-v)*O2

 myfunc <- function(x) {
   U0 <- x[1]
   U1 <- x[2]
   U2 <- x[3]
   q0 <- x[4]
   q1 <- x[5]
   q2 <- x[6]
   p <- x[7]

   if (U0 < 0)
     return(-1e+200)
   else if (U1 < 0)
     return(-1e+200)
   else if (U2 < 0)
     return(-1e+200)
   else if ((U0-(U1+(O1-O0)*q1)) < 0)
     return(-1e+200)
   else if ((U0-(U2+(O2-O0)*q2)) < 0)
     return(-1e+200)
   else if ((U1-(U0+(O0-O1)*q0)) < 0)
     return(-1e+200)
   else if ((U1-(U2+(O2-O1)*q2)) < 0)
     return(-1e+200)
   else if ((U2-(U0+(O0-O2)*q0)) < 0)
     return(-1e+200)
   else if ((U2-(U1+(O1-O2)*q1)) < 0)
     return(-1e+200)
   else if (p < 0)
     return(-1e+200)
   else if (p > 1)
     return(-1e+200)
   else if (q0 < 0)
     return(-1e+200)
   else if (q1 < 0)
     return(-1e+200)
   else if (q2 < 0)
     return(-1e+200)
   else
     return(p*(sqrt(q0)-(O0*q0+U0)) + (1-p)*(v*(sqrt(q1)-(O1*q1+U1)) + (1-v)*(sqrt(q2)-(O2*q2+U2))))
 }

 genoud(myfunc, nvars = 7, max = TRUE, pop.size = 6000, starting.values = runif(7),
        wait.generations = 150, max.generations = 300, boundary.enforcement = 2)



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Fine tunning rgenoud

2007-07-03 Thread RAVI VARADHAN
Paul,

Here is another approach: I wrote an R function that would generate interior 
points as starting values for constrOptim.  This might work better than the LP 
approach, since the LP approach gives you a starting value that is on the 
boundary of the feasible region, i.e. a vertex of the polyhedron, whereas this 
new approach gives you points in the interior.  You can generate as many points 
as you wish, but the approach is brute force and very inefficient - it takes 
on the order of 1000 tries to find one feasible point.
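The brute-force idea can be sketched as rejection sampling in a bounding box (the function name and the example polyhedron are mine, not the code from the thread; ui/ci use constrOptim's convention, with strict inequality so the points are interior):

```r
interior_points <- function(n, ui, ci, lo, hi, maxtries = 1e5) {
  d     <- ncol(ui)
  out   <- matrix(numeric(0), nrow = 0, ncol = d)
  tries <- 0
  while (nrow(out) < n && tries < maxtries) {
    x     <- runif(d, lo, hi)                     # candidate from bounding box
    tries <- tries + 1
    if (all(ui %*% x > ci)) out <- rbind(out, x)  # keep strictly feasible
  }
  out
}

## Example polyhedron: the open triangle {x > 0, y > 0, x + y < 1}.
set.seed(1)
ui  <- rbind(c(1, 0), c(0, 1), c(-1, -1))
ci  <- c(0, 0, -1)
pts <- interior_points(5, ui, ci, lo = 0, hi = 1)
```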

Hope this helps,
Ravi. 

- Original Message -
From: Paul Smith [EMAIL PROTECTED]
Date: Tuesday, July 3, 2007 7:32 pm
Subject: Re: [R] Fine tunning rgenoud
To: R-help r-help@stat.math.ethz.ch


 On 7/4/07, Ravi Varadhan [EMAIL PROTECTED] wrote:
   It should be easy enough to check that your solution is valid (i.e. 
 a local
   minimum):  first, check to see if the solution satisfies all the
   constraints; secondly, check to see if it is an interior point 
 (i.e. none of
   the constraints become equality); and finally, if the solution is an
   interior point, check to see whether the gradient there is close to 
 zero.
   Note that if the solution is one of the vertices of the polyhedron, 
 then the
   gradient may not be zero.
  
  I am having bad luck: all constraints are satisfied, but the solution
  given by constrOptim is not interior; the gradient is not equal to
  zero.
  
  Paul
  
  
   -Original Message-
   From: [EMAIL PROTECTED]
   [ On Behalf Of Paul Smith
   Sent: Tuesday, July 03, 2007 5:10 PM
   To: R-help
   Subject: Re: [R] Fine tunning rgenoud
  
   On 7/3/07, Ravi Varadhan [EMAIL PROTECTED] wrote:
You had indicated in your previous email that you are having trouble
   finding
a feasible starting value for constrOptim().  So, you basically 
 need to
solve a system of linear inequalities to obtain a starting point. 
  Have
   you
considered using linear programming? Either simplex() in the boot
   package
or solveLP() in linprog would work.  It seems to me that you 
 could use
   any
linear objective function in solveLP to obtain a feasible 
 starting point.
This is not the most efficient solution, but it might be worth a 
 try.
   
I am aware of other methods for generating n-tuples that satisfy 
 linear
inequality constraints, but AFAIK those are not available in R.
  
   Thanks, Ravi. I had already conceived the solution that you suggest,
   actually using lpSolve. I am able to get a solution for my problem
   with constrOptim, but I am not enough confident that the solution is
   right. That is why I am trying to get a solution with rgenoud, but
   unsuccessfully until now.
  
   Paul
  
  
  
-Original Message-
From: [EMAIL PROTECTED]
[ On Behalf Of Paul Smith
Sent: Tuesday, July 03, 2007 4:10 PM
To: R-help
Subject: [R] Fine tunning rgenoud
   
Dear All,
   
I am trying to solve the following maximization problem, but I cannot
have rgenoud giving me a reliable solution.
   
Any ideas?
   
Thanks in advance,
   
Paul
   

library(rgenoud)

v <- 0.90
O1 <- 10
O2 <- 20
O0 <- v*O1 + (1-v)*O2

myfunc <- function(x) {
  U0 <- x[1]
  U1 <- x[2]
  U2 <- x[3]
  q0 <- x[4]
  q1 <- x[5]
  q2 <- x[6]
  p <- x[7]

  if (U0 < 0)
    return(-1e+200)
  else if (U1 < 0)
    return(-1e+200)
  else if (U2 < 0)
    return(-1e+200)
  else if ((U0-(U1+(O1-O0)*q1)) < 0)
    return(-1e+200)
  else if ((U0-(U2+(O2-O0)*q2)) < 0)
    return(-1e+200)
  else if ((U1-(U0+(O0-O1)*q0)) < 0)
    return(-1e+200)
  else if ((U1-(U2+(O2-O1)*q2)) < 0)
    return(-1e+200)
  else if ((U2-(U0+(O0-O2)*q0)) < 0)
    return(-1e+200)
  else if ((U2-(U1+(O1-O2)*q1)) < 0)
    return(-1e+200)
  else if (p < 0)
    return(-1e+200)
  else if (p > 1)
    return(-1e+200)
  else if (q0 < 0)
    return(-1e+200)
  else if (q1 < 0)
    return(-1e+200)
  else if (q2 < 0)
    return(-1e+200)
  else
    return(p*(sqrt(q0)-(O0*q0+U0)) + (1-p)*(v*(sqrt(q1)-(O1*q1+U1)) + (1-v)*(sqrt(q2)-(O2*q2+U2))))
}

genoud(myfunc, nvars = 7, max = TRUE, pop.size = 6000, starting.values = runif(7),
       wait.generations = 150, max.generations = 300, boundary.enforcement = 2)
   

Re: [R] Question about PCA with prcomp

2007-07-02 Thread Ravi Varadhan
Mark,

What you are referring to deals with the selection of covariates, since PCA
doesn't do dimensionality reduction in the sense of covariate selection.
But what James is asking for is to identify how much each data point
contributes to individual PCs.  I don't think that query makes much
sense, unless he meant to ask: which individuals have high/low scores on
PC1/PC2.  Here are some comments that may be tangentially related to the
question:

1.  If one is worried about a few data points contributing heavily to the
estimation of PCs, then one can use robust PCA, for example, using robust
covariance matrices.  MASS has some tools for this.
2.  The biplot for the first 2 PCs can give some insights
3. PCs, especially the last few, can be used to identify outliers.
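On the original follow-up question (which measurements drive PC1 and PC2), the loadings matrix answers it directly; a sketch with made-up data of the poster's shape (20 entities x ~500 measurements):

```r
set.seed(1)
X  <- matrix(rnorm(20 * 500), nrow = 20)  # hypothetical 20 x 500 data matrix
pr <- prcomp(X, scale. = TRUE)

## pr$rotation holds the loadings: column j gives each original variable's
## weight in PC j, so large |loading| means a strong contribution.
head(sort(abs(pr$rotation[, 1]), decreasing = TRUE))   # top PC1 contributors
summary(pr)$importance["Proportion of Variance", 1:2]  # variance in PC1, PC2
```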
  
Hope this is helpful,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mark Difford
Sent: Monday, July 02, 2007 1:55 PM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] Question about PCA with prcomp


Hi James,

Have a look at Cadima et al.'s subselect package [Cadima worked with/was a
student of Prof Jolliffe, one of _the_ experts on PCA; Jolliffe devotes part
of a Chapter to this question in his text (Principal Component Analysis,
pub. Springer)].  Then you should look at psychometric stuff: a good place
to start would be Professor Revelle's psych package.

BestR,
Mark.


James R. Graham wrote:
 
 Hello All,
 
 The basic premise of what I want to do is the following:
 
 I have 20 entities for which I have ~500 measurements each. So, I  
 have a matrix of 20 rows by ~500 columns.
 
 The 20 entities fall into two classes: good and bad.
 
 I eventually would like to derive a model that would then be able to  
 classify new entities as being in good territory or bad territory  
 based upon my existing data set.
 
 I know that not all ~500 measurements are meaningful, so I thought  
 the best place to begin would be to do a PCA in order to reduce the  
 amount of data with which I have to work.
 
 I did this using the prcomp function and found that nearly 90% of the  
 variance in the data is explained by PC1 and 2.
 
 So far, so good.
 
 I would now like to find out which of the original ~500 measurements  
 contribute to PC1 and 2 and by how much.
 
 Any tips would be greatly appreciated! And apologies in advance if  
 this turns out to be an idiotic question.
 
 
 james
 
 
 

-- 
View this message in context:
http://www.nabble.com/Question-about-PCA-with-prcomp-tf4012919.html#a11398608
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Question about PCA with prcomp

2007-07-02 Thread Ravi Varadhan
The PCs that are associated with the smaller eigenvalues. 


-Original Message-
From: Patrick Connolly [mailto:[EMAIL PROTECTED] 
Sent: Monday, July 02, 2007 4:23 PM
To: Ravi Varadhan
Cc: 'Mark Difford'; r-help@stat.math.ethz.ch
Subject: Re: [R] Question about PCA with prcomp

On Mon, 02-Jul-2007 at 03:16PM -0400, Ravi Varadhan wrote:

| Mark,
| 
| What you are referring to deals with the selection of covariates, since
| PC doesn't do dimensionality reduction in the sense of covariate selection.
| But what Mark is asking for is to identify how much each data point
| contributes to individual PCs.  I don't think that Mark's query makes
| much sense, unless he meant to ask: which individuals have high/low scores
| on PC1/PC2.  Here are some comments that may be tangentially related to
| Mark's question:
| 
| 1.  If one is worried about a few data points contributing heavily to the
| estimation of PCs, then one can use robust PCA, for example, using robust
| covariance matrices.  MASS has some tools for this.
| 2.  The biplot for the first 2 PCs can give some insights
| 3. PCs, especially, the last few PCs, can be used to identify outliers.

What is meant by "last few PCs"?
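
A hedged sketch with simulated data (not from the thread): the "last few PCs" are the components with the smallest eigenvalues. Observations with unusually large scores on them lie far from the subspace that explains most of the variance, which is why those components are useful for spotting multivariate outliers.

```r
# Simulated example, chosen for this sketch only.
set.seed(2)
X <- matrix(rnorm(100 * 5), ncol = 5)
pc <- prcomp(X, scale. = TRUE)
p <- ncol(pc$x)
d2 <- rowSums(pc$x[, (p - 1):p]^2)   # squared score on the last two PCs
order(d2, decreasing = TRUE)[1:3]    # candidate outliers
```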

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~} Great minds discuss ideas
 _( Y )_Middle minds discuss events 
(:_~*~_:)Small minds discuss people  
 (_)-(_)   . Anon
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.



Re: [R] Dominant eigenvector displayed as third (Marco Visser)

2007-06-30 Thread RAVI VARADHAN
Yes, Spencer, your observation is correct, because the characteristic equation 
det(A - \lambda*I) is a sixth degree polynomial: \lambda^6 - 5 = 0.  So the 
eigenvalues are the complex numbers (generally) that are located at equal 
angles on the circle of radius 5^(1/6), at angles 2*pi*k/6, where k runs from 0 
to 5.  Thus, the roots are:

z_k = 5^(1/6) * exp(i * 2*pi*k/6), k= 0, 1, ..., 5.

where i = sqrt(-1).

Ravi.
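
Ravi's account can be checked numerically: all six eigenvalues of Marco's matrix lie on the circle of radius 5^(1/6), so none dominates in modulus and the ordering reported by eigen() is essentially arbitrary.

```r
# Marco's matrix, rebuilt from his own definition.
a <- c(0,0,0,0,0,5, 1,0,0,0,0,0, 0,1,0,0,0,0,
       0,0,1,0,0,0, 0,0,0,1,0,0, 0,0,0,0,1,0)
mat <- matrix(a, ncol = 6, byrow = TRUE)
ev <- eigen(mat)$values
Mod(ev)    # all equal to 5^(1/6), about 1.307660
ev^6       # every sixth power is 5 (up to rounding)
```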

- Original Message -
From: Spencer Graves [EMAIL PROTECTED]
Date: Friday, June 29, 2007 6:51 pm
Subject: Re: [R] Dominant eigenvector displayed as third (Marco Visser)
To: Marco Visser [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch


   There is no dominant eigenvalue:  The eigenvalues of that matrix
  are the 6 different roots of 5.  All have modulus (or absolute value)
  = 1.307660.  When I raised them all to the 6th power, all 6 were 5+0i.

  Someone else can tell us why this is, but this should suffice as
  an initial answer to your question.

  Hope this helps.
  Spencer Graves
  
  Marco Visser wrote:
   Dear R users & Experts,

   This is just a curiosity; I was wondering why the dominant
  eigenvector and eigenvalue of the following matrix are given as the
  third. I guess this could complicate automatic selection procedures.
  
   0 0 0 0 0 5
   1 0 0 0 0 0
   0 1 0 0 0 0
   0 0 1 0 0 0
   0 0 0 1 0 0
   0 0 0 0 1 0
  
   Please copy & paste the following into R:
  
   
 a=c(0,0,0,0,0,5,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0)
   mat=matrix(a, ncol=6,byrow=T)
   eigen(mat)
  
   The matrix is a population matrix for a plant pathogen (Powell et al. 2005).
  
   Basically I would really like to know why this happens, so I will
  know if it can occur again.
  
   Thanks for any comments,
  
   Marco Visser
  
  
   Comment: In Matlab the dominant eigenvector and eigenvalue
   of the described matrix are given as the sixth. Again, no idea why.
  
   reference
  
   J. A. Powell, I. Slapnicar and W. van der Werf. Epidemic spread of a
  lesion-forming plant pathogen - analysis of a mechanistic model with
  infinite age structure. (2005) Linear Algebra and its Applications 298,
  p. 117-140.
  
  
  
  
  
   
  



[R] Solving a system of nonlinear equations

2007-06-27 Thread Ravi Varadhan
Hi All,

 

Here is a modification of nlsolve() that I had submitted before.  The new
version of nlsolve() has the option to generate multiple random starting
values in order to improve the chances of locating the zeros of the
nonlinear system.  The last test example in the attached file, the
Freudenstein-Roth function, shows the usefulness of the multiple random
starts.  

 

Hope this is useful,

Ravi.



##
nlsolve <- function(par, fn, method="BFGS", nstarts = 1, ...) {
###
#  A solver for finding a zero of a system of non-linear equations
#  Actually minimizes the squared-norm of the set of functions by calling optim()
#  Uses the BFGS algorithm within optim()
#  All the control parameters can be passed as in the call to optim()
#  Uses a random multiple start approach to locate the zeros of the system
#
#  Author:  Ravi Varadhan, Center on Aging and Health, Johns Hopkins University, [EMAIL PROTECTED]
#
#  June 21, 2007
#
func <- function(x, ...) sum(fn(x, ...)^2)
ans <- optim(par, fn=func, method=method, ...)
err <- sqrt(ans$val/length(par))
istart <- 1

while (err > 0.01 & istart < nstarts) {
par <- par * ( 1 + runif(length(par), -1, 1) )  # generating random starting values
ans <- try(optim(par, fn=func, method=method, ...))
if (class(ans) != "try-error") err <- sqrt(ans$val/length(par))
istart <- istart + 1
}

if (err > 0.01) cat("\n Caution:  Zero of the function has not been located!!! \n Increase nstarts \n \n")

ans
}

#
##
#
# Some examples illustrating the use of nlsolve()
#
expo1 <- function(p) {
#  From La Cruz and Raydan, Optim Methods and Software 2003, 18 (583-599)
n <- length(p)
f <- rep(NA, n)
f[1] <- exp(p[1] - 1) - 1
f[2:n] <- (2:n) * (exp(p[2:n] - 1) - p[2:n])
f
}

n <- 50
p0 <- runif(n)
nlsolve(par=p0, fn=expo1)


expo3 <- function(p) {
#  From La Cruz and Raydan, Optim Methods and Software 2003, 18 (583-599)
n <- length(p)
f <- rep(NA, n)
onm1 <- 1:(n-1)
f[onm1] <- onm1/10 * (1 - p[onm1]^2 - exp(-p[onm1]^2))
f[n] <- n/10 * (1 - exp(-p[n]^2))
f
}

n <- 15
p0 <- (1:n)/(4*n^2)
nlsolve(par=p0, fn=expo3)


func1 <- function(x) {
f <- rep(0,3)
f[1] <- 1/3*cos(x[2]*x[3]) - x[1] + 1/6
f[2] <- 1/9 * sqrt(x[1]^2 + sin(x[3]) + 1.06) - x[2] - 0.1
f[3] <- -1/20*exp(-x[1]*x[2]) - x[3] - (10*pi - 3)/60
f
}

p0 <- runif(3)
nlsolve(par=p0, fn=func1)


# This example illustrates the use of multiple random starts to obtain the
# global minimum of zero (i.e. the roots of the system)
#
froth <- function(p){
# Freudenstein and Roth function (Broyden, Mathematics of Computation 1965, p. 577-593)
f <- rep(NA, length(p))
f[1] <- -13 + p[1] + (p[2]*(5 - p[2]) - 2) * p[2]
f[2] <- -29 + p[1] + (p[2]*(1 + p[2]) - 14) * p[2]
f
}

p0 <- c(3,2)  # this gives the zero of the system
nlsolve(par=p0, fn=froth)

p0 <- c(1,1)  # this gives a local minimum that is not a zero of the system
nlsolve(par=p0, fn=froth)
nlsolve(par=p0, fn=froth, nstarts=100)

p0 <- rpois(2,10)
nlsolve(par=p0, fn=froth)
nlsolve(par=p0, fn=froth, nstarts=10)



Re: [R] fisher information matrix

2007-06-26 Thread Ravi Varadhan
Hi,

You can use the function hessian() in the package numDeriv.  This will
yield a very accurate estimate of the observed Fisher information matrix.


library(numDeriv)
?hessian

Ravi.
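
A minimal sketch of Ravi's suggestion, assuming the numDeriv package is installed: the observed Fisher information is the Hessian of the negative log-likelihood evaluated at the parameter estimates. The toy normal sample below is an assumption for illustration, not the poster's model.

```r
library(numDeriv)
set.seed(3)
y <- rnorm(100, mean = 2, sd = 1.5)   # hypothetical data
negll <- function(theta) -sum(dnorm(y, theta[1], theta[2], log = TRUE))
fit <- optim(c(0, 1), negll)
info <- hessian(negll, fit$par)       # observed Fisher information matrix
sqrt(diag(solve(info)))               # approximate standard errors
```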


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Federico Calboli
Sent: Tuesday, June 26, 2007 7:58 AM
To: r-help
Cc: Gambhir, Manoj
Subject: [R] fisher information matrix

Hi All,

a colleague wants to calculate the Fisher information matrix for a model he 
wrote (not in R). He can easily get the neg-log-likelihood and the best fit 
parameters at the minimum. He can also get negLLs for other parameter values
too.

Given these data, is there a way in R to calculate the Fisher information
matrix?

Best,

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com



Re: [R] Creating an R package for solving nonlinear system of equations was: RE: finding roots of multivariate equation

2007-06-26 Thread Ravi Varadhan
Local minima, other than the actual roots, will be present only when the
Jacobian of the system is singular.  If the Jacobian is well-behaved then
there should be no problem, although this is hard to verify in practice.
Furthermore, as I had pointed out in one of my previous emails, if
convergence to a local optimum takes place, you simply restart the procedure
with another initial value.

Ravi.



-Original Message-
From: Rob Creecy [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 26, 2007 1:01 PM
To: Ravi Varadhan
Cc: r-help@stat.math.ethz.ch; 'Bill Shipley'
Subject: Re: [R] Creating an R package for solving nonlinear system
of equations was: RE: finding roots of multivariate equation

This seems useful, but it is important to note that the approach may not
work well unless the system of nonlinear equations is very well behaved and
a good starting point is chosen. A good explanation of the problems with
this exact approach, that is, adding up the sums of squares of the
individual functions, is described in Numerical Recipes in C, second
edition, p. 382 (see http://www.nrbook.com/a/bookcpdf.php).
Briefly, there will often be a great number of local minima even when there
is only a single root of the original equations.

Rob





Ravi Varadhan wrote:
 Hi,

 I have written a simple function to solve a system of nonlinear equations.
 I have called it nlsolve().  It actually minimizes the squared-norm of the
 set of functions by calling optim().  It uses the BFGS algorithm within
 optim().  Apart from this restriction, the user can pass all the arguments
 available in optim().  All the control parameters can be passed as in the
 call to optim().  I have attached a text file containing the source for
 nlsolve() and also a number of test problems illustrating the use of
 nlsolve().  Any feedback and suggestions to improve it are welcome.

 Hope this is useful.

 Best,
 Ravi.



 

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Ravi Varadhan
 Sent: Wednesday, June 20, 2007 5:23 PM
 To: r-help@stat.math.ethz.ch
Subject: [R] Creating an R package for solving nonlinear system of
 equations was: RE: finding roots of multivariate equation

 Hi All,

 Replying to this and numerous other requests in the past has made me
 realize that a nonlinear solver is very much needed for R users.  I have
 successfully used a nonlinear solver based on the spectral gradient
 method, in FORTRAN.  I can readily translate that to R and make it
 available as an R function, but what I would really like to do is to make
 that into a package.  I can provide the R function and several test
 examples.  But I am not good at creating a good/reliable package.  So, it
 would be ideal if one of the R gurus is interested in collaborating with
 me on this project.  Anyone interested?

 Ravi.


 

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Bill Shipley
 Sent: Wednesday, June 20, 2007 1:37 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] finding roots of multivariate equation

 Hello,
 I want to find the roots of an equation in two variables.  I am aware of
 the uniroot function, which can do this for a function with a single
 variable (as I understand it...) but cannot find a function that does this
 for an equation with more than one variable.  I am looking for something
 implementing something similar to a Newton-Raphson algorithm.
 Thanks

Re: [R] fft and the derivative

2007-06-25 Thread Ravi Varadhan
Todd, 

Your idea is correct for continuous Fourier transform, but I am not sure
how one could apply that to fft, which corresponds to the discrete Fourier
transform.  For instance, what values of omega would you use for the term
i*omega to get the discrete Fourier transform of the derivative of f(t)?

Ravi.
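
A sketch of one common answer to Ravi's question (an added illustration, not from the original thread): for n equally spaced samples over a period of length L, the angular frequency paired with bin k of fft() is omega_k = 2*pi*k/L, with k taken in the wrapped order 0, 1, ..., n/2, -n/2+1, ..., -1.

```r
n <- 256
L <- 2 * pi
t <- seq(0, L, length.out = n + 1)[-(n + 1)]
f <- sin(3 * t)                          # a smooth periodic test signal
k <- c(0:(n/2), (-n/2 + 1):-1)           # fft() bin frequencies, wrapped order
omega <- 2 * pi * k / L
df <- Re(fft(1i * omega * fft(f), inverse = TRUE)) / n   # spectral derivative
max(abs(df - 3 * cos(3 * t)))            # small for this band-limited f
```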



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Todd Remund
Sent: Monday, June 25, 2007 5:16 PM
To: r-help@stat.math.ethz.ch
Subject: [R] fft and the derivative

Can one take f(t) and transform to F(omega) in the frequency domain using 
fft(), and use the properties of the fft and find the derivative of f(t)?  
For example,

f(t) -> F(omega)  =>  f^(n)(t) -> (i*omega)^n * F(omega)

Use this and get

f^(n)(t) = F^(-1)[ (i*omega)^n * F(omega) ]

to get the nth derivative of f(t)?
Todd Remund



Re: [R] Creating an R package for solving nonlinear system of equations was: RE: finding roots of multivariate equation

2007-06-21 Thread Ravi Varadhan
Hi,

I have written a simple function to solve a system of nonlinear equations. I
have called it nlsolve().  It actually minimizes the squared-norm of the set
of functions by calling optim().  It uses the BFGS algorithm within optim().
Apart from this restriction, the user can pass all the arguments available
in optim().  All the control parameters can be passed as in the call to
optim().  I have attached a text file containing the source for nlsolve()
and also a number of test problems illustrating the use of nlsolve().  Any
feedback and suggestions to improve it are welcome.

Hope this is useful.

Best,
Ravi.



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Ravi Varadhan
Sent: Wednesday, June 20, 2007 5:23 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Creating an R package for solving nonlinear system of
equations was: RE: finding roots of multivariate equation

Hi All,

Replying to this and numerous other requests in the past has made me realize
that a nonlinear solver is very much needed for R users.  I have
successfully used a nonlinear solver based on the spectral gradient method,
in FORTRAN.  I can readily translate that to R and make it available as an R
function, but what I would really like to do is to make that into a package.
I can provide the R function and several test examples.  But I am not good
at creating a good/reliable package.  So, it would be ideal if one of the R
gurus is interested in collaborating with me on this project.  Anyone
interested?

Ravi.


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Bill Shipley
Sent: Wednesday, June 20, 2007 1:37 PM
To: r-help@stat.math.ethz.ch
Subject: [R] finding roots of multivariate equation

Hello,
I want to find the roots of an equation in two variables.  I am aware of the
uniroot function, which can do this for a function with a single variable
(as I understand it...) but cannot find a function that does this for an
equation with more than one variable.  I am looking for something
implementing something similar to a Newton-Raphson algorithm.
Thanks.

-- 
Bill Shipley
North American Editor for Annals of Botany
Subject Editor for Ecology
Département de biologie
Université de Sherbrooke
Sherbrooke (Québec) J1K 2R9
Canada



Re: [R] finding roots of multivariate equation

2007-06-20 Thread Ravi Varadhan
R does not really have a dedicated solver for nonlinear systems of
equations, but instead you can use optim(), which is a minimizer.  Suppose
your system is F(x) = 0, where x \in R^p and F is a mapping from R^p to R^p,
then you minimize the norm of F.  The problem with this approach is that it
can sometimes yield local minima which are not the zeros of the original
system.  However, this can be easily remedied by using different starting
values.

Ravi.
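
A tiny illustration of the approach Ravi describes, with an example system chosen for this sketch: find a root of the 2 x 2 system x^2 + y^2 - 1 = 0, x - y = 0 by minimizing the squared norm of F with optim().

```r
fun <- function(p) c(p[1]^2 + p[2]^2 - 1, p[1] - p[2])
ans <- optim(c(1, 0), function(p) sum(fun(p)^2), method = "BFGS")
ans$par          # near one of the two roots, +/- (1/sqrt(2), 1/sqrt(2))
sqrt(ans$value)  # residual norm close to zero indicates a genuine root
```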



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Bill Shipley
Sent: Wednesday, June 20, 2007 1:37 PM
To: r-help@stat.math.ethz.ch
Subject: [R] finding roots of multivariate equation

Hello,
I want to find the roots of an equation in two variables.  I am aware of the
uniroot function, which can do this for a function with a single variable
(as I understand it...) but cannot find a function that does this for an
equation with more than one variable.  I am looking for something
implementing something similar to a Newton-Raphson algorithm.
Thanks.

-- 
Bill Shipley
North American Editor for Annals of Botany
Subject Editor for Ecology
Département de biologie
Université de Sherbrooke
Sherbrooke (Québec) J1K 2R9
Canada



[R] Creating an R package for solving nonlinear system of equations was: RE: finding roots of multivariate equation

2007-06-20 Thread Ravi Varadhan
Hi All,

Replying to this and numerous other requests in the past has made me realize
that a nonlinear solver is very much needed for R users.  I have
successfully used a nonlinear solver based on the spectral gradient method,
in FORTRAN.  I can readily translate that to R and make it available as an R
function, but what I would really like to do is to make that into a package.
I can provide the R function and several test examples.  But I am not good
at creating a good/reliable package.  So, it would be ideal if one of the R
gurus is interested in collaborating with me on this project.  Anyone
interested?

Ravi.


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Bill Shipley
Sent: Wednesday, June 20, 2007 1:37 PM
To: r-help@stat.math.ethz.ch
Subject: [R] finding roots of multivariate equation

Hello,
I want to find the roots of an equation in two variables.  I am aware of the
uniroot function, which can do this for a function with a single variable
(as I understand it...) but cannot find a function that does this for an
equation with more than one variable.  I am looking for something
implementing something similar to a Newton-Raphson algorithm.
Thanks.

-- 
Bill Shipley
North American Editor for Annals of Botany
Subject Editor for Ecology
Département de biologie
Université de Sherbrooke
Sherbrooke (Québec) J1K 2R9
Canada



Re: [R] Normal and Poisson tail area expectations in R

2007-06-14 Thread Ravi Varadhan
Perfect, Chuck.  I got a closed-form solution after some algebraic labor,
but your solution is simple and elegant.

Ravi.



-Original Message-
From: Charles C. Berry [mailto:[EMAIL PROTECTED] 
Sent: Thursday, June 14, 2007 1:36 PM
To: Ravi Varadhan
Cc: 'kavindra malik'; r-help@stat.math.ethz.ch
Subject: RE: [R] Normal and Poisson tail area expectations in R



Ravi,

This looks simple to me.

km_G <- function(lambda, k)
    lambda * ppois(k-1, lambda, lower=FALSE) -
    k * ppois(k, lambda, lower=FALSE)

Am I confused here?

Chuck
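
A quick numerical check (an added sketch, restating Chuck's closed form so the snippet is self-contained) against a truncated direct sum of (x - k) * dpois(x, lambda):

```r
km_G <- function(lambda, k)
    lambda * ppois(k - 1, lambda, lower = FALSE) -
    k * ppois(k, lambda, lower = FALSE)
lambda <- 4; k <- 6                     # arbitrary test values
direct <- sum((k:200 - k) * dpois(k:200, lambda))
abs(km_G(lambda, k) - direct)           # agrees to rounding error
```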



On Wed, 13 Jun 2007, Ravi Varadhan wrote:


 More interesting is the Poisson convolution. I don't know if there is an
 analytic solution to this.  I looked at Jolley's Summation of Series and
 Abramowitz and Stegun, but no help there.  It seems that discrete FFT
 technique should work. Does anyone know the answer?

 Ravi.


 

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of kavindra malik
 Sent: Wednesday, June 13, 2007 5:45 PM
 To: Charles C. Berry
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] Normal and Poisson tail area expectations in R

 Thank you very much. This solves the problem I was trying to solve. I am new
 to R and am learning. A great lesson in the power of R...

 Charles C. Berry [EMAIL PROTECTED] wrote: On Wed, 13 Jun 2007,
 kavindra malik wrote:

 I am interested in R functions for the following integrals / sums
 (expressed best I can in text)  -

 Normal: G_u(k) = Integration_{Lower limit=k}^{Upper limit=infinity}
 [(u-k) f(u) du], where u is N(0,1) and f(u) is the density function.

 Poisson: G(lambda,k) = Sum_{Lower limit=k}^{Upper limit=infinity} [(x-k)
 p(x, lambda)], where p(x, lambda) is the Poisson probability function with
 parameter lambda.

 The Normal expression is very commonly used in inventory management to
 determine safety stocks (and its tabular values can be found in some
 texts) - and I am also looking for Poisson and/or Gamma as that'd fit
 the situation better.

 I am wondering if there are standard functions in R that might allow me to
 get these values, instead of needing to do the numerical integration, etc.
 myself.

 Not that I know of, but it is not difficult to do the integration:

 k <- 1.1 # for example
 integrate(function(x) (x-k)*dnorm(x), lower=k, upper=Inf)
 0.06861951 with absolute error < 5.5e-07


 see

  ?integrate
  ?qnorm
  ?qpois
  ?qgamma

Thank you very much.






 Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
 E mailto:[EMAIL PROTECTED]                  UC San Diego
 http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901










Re: [R] Normal and Poisson tail area expectations in R

2007-06-14 Thread Ravi Varadhan
Inspired by Chuck's elegant solution to the Poisson tail area problem, here
is a simple solution to the normal tail area expectation that does not use
integrate().

Gu.k <- function(k) {1/sqrt(2*pi) * exp(-k*k/2) - k * pnorm(k, lower=FALSE)}

 k <- 1:10
 Gu.k(k)
 [1] 8.331547e-02 8.490703e-03 3.821543e-04 7.145258e-06 5.346166e-08
1.563570e-10 1.760326e-13 7.550262e-17 1.224779e-20 7.474560e-25

 sapply(k, function(k) integrate(function(x) (x-k)*dnorm(x), lower=k, upper=Inf)$val)
 [1] 8.331547e-02 8.490702e-03 3.821543e-04 7.145259e-06 5.346163e-08
1.563570e-10 1.760326e-13 7.550262e-17 1.224779e-20 7.474560e-25


Ravi. 



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Ravi Varadhan
Sent: Thursday, June 14, 2007 2:08 PM
To: 'Charles C. Berry'
Cc: 'kavindra malik'; r-help@stat.math.ethz.ch
Subject: Re: [R] Normal and Poisson tail area expectations in R

Perfect, Chuck.  I got a closed-form solution after some algebraic labor,
but your solution is simple and elegant.

Ravi.



-Original Message-
From: Charles C. Berry [mailto:[EMAIL PROTECTED] 
Sent: Thursday, June 14, 2007 1:36 PM
To: Ravi Varadhan
Cc: 'kavindra malik'; r-help@stat.math.ethz.ch
Subject: RE: [R] Normal and Poisson tail area expectations in R



Ravi,

This looks simple to me.

km_G <- function(lambda, k)
    lambda * ppois(k-1, lambda, lower=FALSE) -
    k * ppois(k, lambda, lower=FALSE)

Am I confused here?

Chuck



On Wed, 13 Jun 2007, Ravi Varadhan wrote:


 More interesting is the Poisson convolution. I don't know if there is an
 analytic solution to this.  I looked at Jolley's Summation of Series and
 Abramowitz and Stegun, but no help there.  It seems that discrete FFT
 technique should work. Does anyone know the answer?

 Ravi.


 ---

 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: [EMAIL PROTECTED]

 Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html





 

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of kavindra malik
 Sent: Wednesday, June 13, 2007 5:45 PM
 To: Charles C. Berry
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] Normal and Poisson tail area expectations in R

 Thank you very much. This solves the problem I was trying to solve. I am
new
 to R and am learning. A great lesson in the power of R...

 Charles C. Berry [EMAIL PROTECTED] wrote: On Wed, 13 Jun 2007,
 kavindra malik wrote:

 I am interested in R functions for the following integrals / sums
 (expressed best I can in text)  -

 Normal: G_u(k) =  Integration_{Lower limit=k}^{Upper limit=infinity} [(u
 -k) f(u) d(u)], where u is N(0,1), and f(u) is the density function.

 Poisson: G(lambda,k) = Sum_{Lower limit=k}^{Upper limit=infinity} [(x-k)
 p(x, lambda)] where P(x,lambda) is the Poisson prob function with
parameter
 lambda.

 The Normal expression is very commonly used in inventory management to
 determine safety stocks (and its tabular values can be found in some
 texts) - and I am also looking for Poisson and/or Gamma as that'd fit
 the situation better.

 I am wondering if there are standard functions in R that might allow me
to
 get these values, instead of needing to do the numerical integration, etc.
 myself.

 Not that I know of, but it is not difficult to do the integration:

 k <- 1.1 # for example
 integrate(function(x) (x-k)*dnorm(x), lower=k, upper=Inf)
 0.06861951 with absolute error < 5.5e-07


 see

  ?integrate
  ?qnorm
  ?qpois
  ?qgamma

Thank you very much.




  [[alternative HTML version deleted]]


Re: [R] Normal and Poisson tail area expectations in R

2007-06-13 Thread Ravi Varadhan

More interesting is the Poisson convolution. I don't know if there is an
analytic solution to this.  I looked at Jolley's Summation of Series and
Abramowitz and Stegun, but no help there.  It seems that discrete FFT
technique should work. Does anyone know the answer?

Ravi.

---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of kavindra malik
Sent: Wednesday, June 13, 2007 5:45 PM
To: Charles C. Berry
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Normal and Poisson tail area expectations in R

Thank you very much. This solves the problem I was trying to solve. I am new
to R and am learning. A great lesson in the power of R...

Charles C. Berry [EMAIL PROTECTED] wrote: On Wed, 13 Jun 2007,
kavindra malik wrote:

 I am interested in R functions for the following integrals / sums
(expressed best I can in text)  -

 Normal: G_u(k) =  Integration_{Lower limit=k}^{Upper limit=infinity} [(u
-k) f(u) d(u)], where u is N(0,1), and f(u) is the density function.

 Poisson: G(lambda,k) = Sum_{Lower limit=k}^{Upper limit=infinity} [(x-k)
p(x, lambda)] where P(x,lambda) is the Poisson prob function with parameter
lambda.

 The Normal expression is very commonly used in inventory management to 
 determine safety stocks (and its tabular values can be found in some 
 texts) - and I am also looking for Poisson and/or Gamma as that'd fit 
 the situation better.

 I am wondering if there are standard functions in R that might allow me to
get these values, instead of needing to do the numerical integration, etc.
myself.

Not that I know of, but it is not difficult to do the integration:

 k <- 1.1 # for example
 integrate(function(x) (x-k)*dnorm(x), lower=k, upper=Inf)
0.06861951 with absolute error < 5.5e-07


see

  ?integrate
  ?qnorm
  ?qpois
  ?qgamma

Thank you very much.




  [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry(858) 534-2098
 Dept of Family/Preventive
Medicine
E mailto:[EMAIL PROTECTED] UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901




   


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] how to find how many modes in 2 dimensions case

2007-06-12 Thread Ravi Varadhan
Hi Patrick,

Here is a simple R code for locating ALL the local maximum of a bivariate
function, which is known on a rectangular grid.  I have illustrated it with
a function called the Branin function, which is commonly used as a test
function in the global optimization literature.  It has 6 local maxima, two
of which are global.  

branin <- function(x1, x2, p) {
p[1]*x1^2 + p[2]*x1^4 + p[3]*x1^6 - x1*x2 + p[4]*x2^2 + p[5]*x2^4
}

x <- seq(-2, 2, length=100)
y <- seq(-1, 1, length=100)
p <- c(-4, 2.1, -1/3, 4, -4)
z <- outer(x, y, branin, p=p)
persp(x, y, z, theta=30, phi=30, col="lightblue")

#  here is a brute-force algorithm to locate ALL the local maxima
for (i in 2:(nrow(z)-1)) {
for (j in 2:(ncol(z)-1)) {
lmax <- (z[i,j] > z[i-1,j]) & (z[i,j] > z[i+1,j]) & (z[i,j] > z[i,j-1]) &
(z[i,j] > z[i,j+1])
if (lmax) cat("x: ", x[i], "y: ", y[j], "function: ", z[i,j], "\n")
}
}

x:  -1.72 y:  0.798 function:  0.214 
x:  -1.60 y:  -0.576 function:  -2.10 
x:  -0.101 y:  0.717 function:  1.03 
x:  0.101 y:  -0.717 function:  1.03 
x:  1.60 y:  0.576 function:  -2.10 
x:  1.72 y:  -0.798 function:  0.214

Of course, this brute-force grid search is highly inefficient for dimensions
greater than 2.
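[Editor's note: for moderate grids the same search can also be vectorized. The sketch below is an editor's addition, not from the original post; `local.maxima` is a hypothetical helper name.]

```r
# Vectorized variant of the double loop above: compare every interior grid
# cell with its four axis-aligned neighbours in one shot.
local.maxima <- function(x, y, z) {
  i <- 2:(nrow(z) - 1)
  j <- 2:(ncol(z) - 1)
  zc <- z[i, j]                        # interior cells
  lmax <- zc > z[i - 1, j] & zc > z[i + 1, j] &
          zc > z[i, j - 1] & zc > z[i, j + 1]
  idx <- which(lmax, arr.ind = TRUE)   # column-major, same order as zc[lmax]
  data.frame(x = x[i][idx[, 1]], y = y[j][idx[, 2]], f = zc[lmax])
}
```

Applied to the x, y, and z of the Branin example above, this returns the same six local maxima as the double loop.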

Hope this is helpful,
Ravi.



---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Patrick Wang
Sent: Friday, June 08, 2007 3:35 PM
To: Bert Gunter
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] how to find how many modes in 2 dimensions case

Thanks for the reply,

maybe I shall say bumps, I can use persp to show a density on a X Y
dimensions.
one peak is one mode I think. I try to find an automatic way to detect how
many peaks of the densities.

Pat
 Note that the number of modes (local maxima??)  is a function of the
 bandwidth, so I'm not sure your question is even meaningful.

 Bert Gunter
 Genentech Nonclinical Statistics
 South San Francisco, CA 94404
 650-467-7374

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Patrick Wang
 Sent: Friday, June 08, 2007 11:54 AM
 To: R-help@stat.math.ethz.ch
 Subject: [R] how to find how many modes in 2 dimensions case

 Hi,

 Does anyone know how to count the number of modes in 2 dimensions using
 kde2d function?

 Thanks
 Pat

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] how to find how many modes in 2 dimensions case

2007-06-12 Thread Ravi Varadhan
Patrick,

Here is an example closer to what you are interested in - locating bumps in
kernel density estimator.  I am using the example from package KernSmooth,
using the function bkde2D().

# Another example for locating maxima in kernel density estimation
library(KernSmooth)   # for bkde2D()
data(geyser, package="MASS")
x <- cbind(geyser$duration, geyser$waiting)
est <- bkde2D(x, bandwidth=c(0.7,7))
persp(est$fhat)

x <- est$x1
y <- est$x2
z <- est$fhat

#  here is a brute-force algorithm to locate ALL the local maxima

for (i in 2:(nrow(z)-1)) {
for (j in 2:(ncol(z)-1)) {
lmax <- (z[i,j] > z[i-1,j]) & (z[i,j] > z[i+1,j]) & (z[i,j] > z[i,j-1]) &
(z[i,j] > z[i,j+1])
if (lmax) cat("x: ", x[i], "y: ", y[j], "function: ", z[i,j], "\n")
}
}

x:  0.724 y:  41.1 function:  2.71e-20 
x:  0.858 y:  39.4 function:  1.08e-19 
x:  0.992 y:  35.9 function:  2.17e-19 
x:  2.07 y:  82.4 function:  0.00795 
x:  4.08 y:  77.2 function:  0.00722 
x:  4.35 y:  54.9 function:  0.00778

Of these, you can ignore the first 3, which have zero density.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Patrick Wang
Sent: Friday, June 08, 2007 3:35 PM
To: Bert Gunter
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] how to find how many modes in 2 dimensions case

Thanks for the reply,

maybe I shall say bumps, I can use persp to show a density on a X Y
dimensions.
one peak is one mode I think. I try to find an automatic way to detect how
many peaks of the densities.

Pat
 Note that the number of modes (local maxima??)  is a function of the
 bandwidth, so I'm not sure your question is even meaningful.

 Bert Gunter
 Genentech Nonclinical Statistics
 South San Francisco, CA 94404
 650-467-7374

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Patrick Wang
 Sent: Friday, June 08, 2007 11:54 AM
 To: R-help@stat.math.ethz.ch
 Subject: [R] how to find how many modes in 2 dimensions case

 Hi,

 Does anyone know how to count the number of modes in 2 dimensions using
 kde2d function?

 Thanks
 Pat

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Fwd: Using odesolve to produce non-negative solutions

2007-06-11 Thread Ravi Varadhan
Spencer,

Lsoda does not estimate any parameters (nlmeODE does parameter
estimation).  It just computes the solution trajectory, at discrete times,
of a dynamical system (i.e., a set of differential equations).  It only works
with real numbers, as far as I know.


Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Spencer Graves
Sent: Monday, June 11, 2007 12:53 PM
To: Martin Henry H. Stevens
Cc: Jeremy Goldhaber-Fiebert; r-help@stat.math.ethz.ch
Subject: Re: [R] Fwd: Using odesolve to produce non-negative solutions

in line

Martin Henry H. Stevens wrote:
 Hi Jeremy,
 First, setting hmax to a small number could prevent a large step, if 
 you think that is a problem. Second, however, I don't see how you can 
 get a negative population size when using the log trick. 
SG:  Can lsoda estimate complex or imaginary parameters? 

 I would think that that would prevent completely any negative values 
 of N (i.e. e^-10 > 0). Can you explain? Or do you want to avoid 
 that trick? The only other solver I know of is rk4 and it is not 
 recommended.
 Hank
 On Jun 11, 2007, at 11:46 AM, Jeremy Goldhaber-Fiebert wrote:

 Hi Spencer,

 Thank you for your response. I also did not see anything on the lsoda 
 help page which is the reason that I wrote to the list.

 From your response, I am not sure if I asked my question clearly.

 I am modeling a group of people (in a variety of health states) 
 moving through time (and getting infected with an infectious 
 disease). This means that the count of the number of people in each 
 state should be positive at all times.

 What appears to happen is that lsoda asks for a derivative at a given 
 point in time t and then adjusts the state of the population. 
 However, perhaps due to numerical instability, it occasionally lowers 
 the population count below 0 for one of the health states (perhaps 
 because its step size is too big or something).

 I have tried both the logarithm trick
snip

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Fwd: Using odesolve to produce non-negative solutions

2007-06-11 Thread Ravi Varadhan
Hi Jeremy,

A smaller step size may or may not help.  If the issue is simply truncation
error, that is the error involved in discretizing the differential
equations, then a smaller step size would help.  If, however, the true
solution to the differential equation is negative, for some t, then the
numerical solution should also be negative.  If the negative solution does
not make sense, then the system of equations needs to be examined to see when
and why negative solutions arise.  Perhaps, I am just making this up - there
needs to be a dampening function that slows down the trajectory as it
approaches zero from its initial value. It is also possible that only
certain regions of the parameter space (note that initial conditions are
also parameters) are allowed in the sense that only there the solution is
feasible for all t.  So, in your example, the parameters might not be
realistic.  In short, if you are sure that the numerical solution is
accurate, then you need to go back to your system of equations and analyze
them carefully.  
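[Editor's note: a crude base-R illustration of where such negativity comes from, added for this archive and not part of the thread. With too large a step, explicit Euler steps on dN/dt = -r*N overshoot below zero, while the log-transform trick discussed elsewhere in this thread keeps the back-transformed solution positive by construction.]

```r
# Naive Euler integration of dN/dt = -r*N with an oversized step goes
# negative; integrating u = log(N) instead (du/dt = -r) cannot.
euler <- function(f, y0, times) {
  y <- numeric(length(times))
  y[1] <- y0
  for (i in 2:length(times))
    y[i] <- y[i - 1] + (times[i] - times[i - 1]) * f(y[i - 1])
  y
}
r <- 0.8
times <- seq(0, 10, by = 2)                 # step h = 2, so 1 - h*r = -0.6
N.raw <- euler(function(N) -r * N, 100, times)
N.log <- exp(euler(function(u) -r, log(100), times))
any(N.raw < 0)   # TRUE: the raw scheme oscillates through zero
all(N.log > 0)   # TRUE: positivity is guaranteed by the transform
```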


Ravi.



---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jeremy
Goldhaber-Fiebert
Sent: Monday, June 11, 2007 11:47 AM
To: Spencer Graves
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Fwd: Using odesolve to produce non-negative solutions

Hi Spencer,

Thank you for your response. I also did not see anything on the lsoda help
page which is the reason that I wrote to the list.

From your response, I am not sure if I asked my question clearly.

I am modeling a group of people (in a variety of health states) moving
through time (and getting infected with an infectious disease). This means
that the count of the number of people in each state should be positive at
all times. 

What appears to happen is that lsoda asks for a derivative at a given point
in time t and then adjusts the state of the population. However, perhaps due
to numerical instability, it occasionally lowers the population count below 0
for one of the health states (perhaps because its step size is too big or
something). 

I have tried both the logarithm trick and also changing the relative and
absolute tolerance inputs but I still get the problem for certain
combinations of parameters and initial conditions. 

It occurs both under MS Windows XP Service Pack 2 and on a Linux cluster so
I am pretty sure it is not platform specific.

My real question to the group is if there is not a work around in lsoda are
there other ode solvers in R that will allow the constraint of solutions to
the ODEs remain non-negative?

Best regards,
Jeremy
  

 Spencer Graves [EMAIL PROTECTED] 6/8/2007 9:51 AM 
On the 'lsoda' help page, I did not see any option to force some 
or all parameters to be nonnegative. 

  Have you considered replacing the parameters that must be 
nonnegative with their logarithms?  This effectively moves the 0 lower 
limit to (-Inf) and seems to have worked well for me in the past.  
Often, it can even make the log likelihood or sum of squares surface 
more elliptical, which means that the standard normal approximation for 
the sampling distribution of parameter estimates will likely be more 
accurate. 

  Hope this helps. 
  Spencer Graves
p.s.  Your example seems not to be self contained.  If I could have 
easily copied it from your email and run it myself, I might have been 
able to offer more useful suggestions. 

Jeremy Goldhaber-Fiebert wrote:
 Hello,

 I am using odesolve to simulate a group of people moving through time and
transmitting infections to one another. 

 In Matlab, there is a NonNegative option which tells the Matlab solver to
keep the vector elements of the ODE solution non-negative at all times. What
is the right way to do this in R?

 Thanks,
 Jeremy

 P.S., Below is a simplified version of the code I use to try to do this,
but I am not sure that it is theoretically right 

 dynmodel <- function(t, y, p) 
 { 
 ## Initialize parameter values

   birth <- p$mybirth(t)
   death <- p$mydeath(t)
   recover <- p$myrecover
   beta <- p$mybeta
   vaxeff <- p$myvaxeff
   vaccinated <- p$myvax(t)

   vax <- vaxeff*vaccinated/100

 ## If the state currently has negative quantities (shouldn't have), then
 ## reset to reasonable values for computing meaningful derivatives

   for (i in 1:length(y)) {
   if (y[i] < 0) {
   y[i] <- 0
   }
   }

   S <- y[1]
   I <- y[2]
   R <- y[3]
   N <- y[4]

   shat <- (birth

Re: [R] Fwd: Using odesolve to produce non-negative solutions

2007-06-11 Thread Ravi Varadhan
Hi Jeremy,

A smaller step size may or may not help.  If the issue is simply truncation
error, that is the error involved in discretizing the differential
equations, then a smaller step size would help.  If, however, the true
solution to the differential equation is negative, for some t, then the
numerical solution should also be negative.  If the negative solution does
not make sense, then the system of equations needs to be examined to see when
and why negative solutions arise.  Perhaps, I am just making this up - there
needs to be a barrier function that slows down the trajectory as it
approaches zero from its initial value. It is also possible that only
certain regions of the parameter space are allowed in the sense that only
there the solution is feasible for all t.  So, in your example, the
parameters might not be realistic.  In short, if you are sure that the
numerical solution is accurate, then you need to go back to your system of
equations and analyze them carefully.


Ravi.



---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Martin Henry H.
Stevens
Sent: Monday, June 11, 2007 1:03 PM
To: Spencer Graves
Cc: Jeremy Goldhaber-Fiebert; R-Help; [EMAIL PROTECTED]
Subject: Re: [R] Fwd: Using odesolve to produce non-negative solutions

Hi Spencer,
I have copied Woody Setzer. I have no idea whether lsoda can estimate  
parameters that could take imaginary values.
Hank
On Jun 11, 2007, at 12:52 PM, Spencer Graves wrote:

 in line

 Martin Henry H. Stevens wrote:
 Hi Jeremy,
 First, setting hmax to a small number could prevent a large step, if
 you think that is a problem. Second, however, I don't see how you can
 get a negative population size when using the log trick.
 SG:  Can lsoda estimate complex or imaginary parameters?
Hmm. I have no idea.

 I would think that that would prevent completely any negative values
 of N (i.e. e^-10 > 0). Can you explain? Or do you want to avoid
 that trick? The only other solver I know of is rk4 and it is not
 recommended.
 Hank
 On Jun 11, 2007, at 11:46 AM, Jeremy Goldhaber-Fiebert wrote:

 Hi Spencer,

 Thank you for your response. I also did not see anything on the  
 lsoda
 help page which is the reason that I wrote to the list.

 From your response, I am not sure if I asked my question clearly.

 I am modeling a group of people (in a variety of health states)
 moving through time (and getting infected with an infectious
 disease). This means that the count of the number of people in each
 state should be positive at all times.

 What appears to happen is that lsoda asks for a derivative at a  
 given
 point in time t and then adjusts the state of the population.
 However, perhaps due to numerical instability, it occasionally lowers
 the population count below 0 for one of the health states (perhaps
 because its step size is too big or something).

 I have tried both the logarithm trick
 snip

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.



Dr. Hank Stevens, Assistant Professor
338 Pearson Hall
Botany Department
Miami University
Oxford, OH 45056

Office: (513) 529-4206
Lab: (513) 529-4262
FAX: (513) 529-4243
http://www.cas.muohio.edu/~stevenmh/
http://www.muohio.edu/ecology/
http://www.muohio.edu/botany/

E Pluribus Unum

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Fwd: Using odesolve to produce non-negative solutions

2007-06-11 Thread Ravi Varadhan
Jeremy,

You should examine the steady-state solution to your system of equations, by
setting the time-derivatives to zero and then solving/analyzing the
resulting algebraic equations.  This should give you some insights.  

Let us say you have 3 groups, A,B, and C, with initial conditions:
N_A(t=0) = N_{A0}, N_B(t=0) = N_{B0}, and N_C(t=0) = N_{C0}, and that people
transition in and out of these 3 states (one of the states could even be
absorbing, e.g. death), but it is true for any time t that N_A(t) + N_B(t) +
N_C(t) = N_{A0} + N_{B0} + N_{C0}.  Furthermore, you have 3 diff-equations
that describe the rate of change of N_A, N_B, and N_C, for t  0.  If it
happens that one of the N's, say N_A, becomes negative, you could set it
equal to zero.  But then you have to figure out how to re-adjust N_B and N_C
so that they add up to the initial total count.  After re-adjustment, you
also have to think about whether the resulting system of equations are
valid, when there are no A people.
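[Editor's note: the conservation property described above is easy to verify directly from the derivative function. The sketch below is an editor's addition with made-up rates; `sir_deriv` is a hypothetical name. If the time-derivatives sum to zero, the exact solution conserves the total count, so a negative compartment must be offset elsewhere.]

```r
# For a closed S-I-R system the derivatives sum to zero, so
# N_S + N_I + N_R stays at its initial total for all t.
sir_deriv <- function(y, beta = 0.3, gam = 0.1) {
  S <- y[1]; I <- y[2]; R <- y[3]; N <- sum(y)
  c(dS = -beta * S * I / N,
    dI =  beta * S * I / N - gam * I,
    dR =  gam * I)
}
sum(sir_deriv(c(990, 10, 0)))  # 0: the total population is conserved
```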

Ravi. 



---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jeremy
Goldhaber-Fiebert
Sent: Monday, June 11, 2007 11:47 AM
To: Spencer Graves
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Fwd: Using odesolve to produce non-negative solutions

Hi Spencer,

Thank you for your response. I also did not see anything on the lsoda help
page which is the reason that I wrote to the list.

From your response, I am not sure if I asked my question clearly.

I am modeling a group of people (in a variety of health states) moving
through time (and getting infected with an infectious disease). This means
that the count of the number of people in each state should be positive at
all times. 

What appears to happen is that lsoda asks for a derivative at a given point
in time t and then adjusts the state of the population. However, perhaps due
to numerical instability, it occasionally lowers the population count below 0
for one of the health states (perhaps because its step size is too big or
something). 

I have tried both the logarithm trick and also changing the relative and
absolute tolerance inputs but I still get the problem for certain
combinations of parameters and initial conditions. 

It occurs both under MS Windows XP Service Pack 2 and on a Linux cluster so
I am pretty sure it is not platform specific.

My real question to the group is if there is not a work around in lsoda are
there other ode solvers in R that will allow the constraint of solutions to
the ODEs remain non-negative?

Best regards,
Jeremy
  

 Spencer Graves [EMAIL PROTECTED] 6/8/2007 9:51 AM 
On the 'lsoda' help page, I did not see any option to force some 
or all parameters to be nonnegative. 

  Have you considered replacing the parameters that must be 
nonnegative with their logarithms?  This effectively moves the 0 lower 
limit to (-Inf) and seems to have worked well for me in the past.  
Often, it can even make the log likelihood or sum of squares surface 
more elliptical, which means that the standard normal approximation for 
the sampling distribution of parameter estimates will likely be more 
accurate. 

  Hope this helps. 
  Spencer Graves
p.s.  Your example seems not to be self contained.  If I could have 
easily copied it from your email and run it myself, I might have been 
able to offer more useful suggestions. 

Jeremy Goldhaber-Fiebert wrote:
 Hello,

 I am using odesolve to simulate a group of people moving through time and
transmitting infections to one another. 

 In Matlab, there is a NonNegative option which tells the Matlab solver to
keep the vector elements of the ODE solution non-negative at all times. What
is the right way to do this in R?

 Thanks,
 Jeremy

 P.S., Below is a simplified version of the code I use to try to do this,
but I am not sure that it is theoretically right 

 dynmodel <- function(t, y, p) 
 { 
 ## Initialize parameter values

   birth <- p$mybirth(t)
   death <- p$mydeath(t)
   recover <- p$myrecover
   beta <- p$mybeta
   vaxeff <- p$myvaxeff
   vaccinated <- p$myvax(t)

   vax <- vaxeff*vaccinated/100

 ## If the state currently has negative quantities (shouldn't have), then
 ## reset to reasonable values for computing meaningful derivatives

   for (i in 1:length(y)) {
   if (y[i] < 0) {
   y[i] <- 0
   }
   }

   S <- y[1]
   I <- y[2]
   R <- y[3]
   N <- y[4]

   shat <- (birth*(1-vax)) - (death*S) - (beta*S*I/N)
   ihat

[R] An extension of Gabriel's biplot

2007-06-01 Thread Ravi Varadhan
Hi All,

 

I would like to extend the standard biplot of Gabriel (1971), which is done
by the R function biplot().  Rather than just showing the variables as
vectors (with arrows at end point), I would like to show them as line
segments, where the segment runs from minimum value of the variable to its
maximum.  In other words, I would like to produce a plot similar to that
shown in Figure 1 of Gabriel and Odoroff (1990), rather than the classical
biplot shown in Figure 2 of the same paper.   Has anyone generated biplots
similar to Figure 1 of Gabriel and Odoroff?

 

Any help is greatly appreciated.

 

Best,

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




 


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Comparing multiple distributions

2007-05-31 Thread Ravi Varadhan
Your data are compositional. The R package "compositions" might be
useful. You might also want to consult the book by J. Aitchison, The
Statistical Analysis of Compositional Data.
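[Editor's note: since Hotelling's T-squared comes up later in this thread as a candidate test, here is a base-R sketch added for this archive; `hotelling_T2` is a hypothetical helper and the data are simulated. Because five proportions sum to 1, their covariance matrix is singular, so drop one component (or, better, use Aitchison log-ratios) before testing.]

```r
# Two-sample Hotelling T^2 test, written out from the standard formula.
hotelling_T2 <- function(X, Y) {
  n1 <- nrow(X); n2 <- nrow(Y); p <- ncol(X)
  d  <- colMeans(X) - colMeans(Y)
  Sp <- ((n1 - 1) * cov(X) + (n2 - 1) * cov(Y)) / (n1 + n2 - 2)  # pooled cov
  T2 <- (n1 * n2 / (n1 + n2)) * drop(t(d) %*% solve(Sp, d))
  Fstat <- (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * T2
  c(T2 = T2, F = Fstat,
    p.value = pf(Fstat, p, n1 + n2 - p - 1, lower.tail = FALSE))
}

set.seed(1)
A <- matrix(runif(20 * 4), 20, 4)  # e.g. first 4 of 5 proportions, treatment A
B <- matrix(runif(20 * 4), 20, 4)  # treatment B
hotelling_T2(A, B)
```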

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of jiho
Sent: Thursday, May 31, 2007 11:37 AM
To: R-help
Subject: Re: [R] Comparing multiple distributions

Nobody answered my first request. I am sorry if I did not explain my  
problem clearly. English is not my native language and statistical  
english is even more difficult. I'll try to summarize my issue in  
more appropriate statistical terms:

Each of my observations is not a single number but a vector of 5  
proportions (which add up to 1 for each observation). I want to  
compare the shape of those vectors between two treatments (i.e. how  
the quantities are distributed between the 5 values in treatment A  
with respect to treatment B).

I was pointed to Hotelling T-squared. Does it seem appropriate? Are  
there other possibilities (I read many discussions about hotelling  
vs. manova but I could not see how any of those related to my  
particular case)?

Thank you very much in advance for your insights. See below for my  
earlier, more detailed, e-mail.

On 2007-May-21  , at 19:26 , jiho wrote:
 I am studying the vertical distribution of plankton and want to  
 study its variations relatively to several factors (time of day,  
 species, water column structure etc.). So my data is special in  
 that, at each sampling site (each observation), I don't have *one*  
 number, I have *several* numbers (abundance of organisms in each  
 depth bin, I sample 5 depth bins) which describe a vertical  
 distribution.

 Then let say I want to compare speciesA with speciesB, I would end  
 up trying to compare a group of several distributions with another  
 group of several distributions (where a distribution is a vector  
 of 5 numbers: an abundance for each depth bin). Does anyone know  
 how I could do this (with R obviously ;) )?

 Currently I kind of get around the problem and:
 - compute mean abundance per depth bin within each group and  
 compare the two mean distributions with a ks.test but this  
 obviously diminishes the power of the test (I only compare 5*2  
 observations)
 - restrict the information at each sampling site to the mean depth  
 weighted by the abundance of the species of interest. This way I  
 have one observation per station but I reduce the information to  
 the mean depths while the actual repartition is important also.

 I know this is probably not directly R related but I have already  
 searched around for solutions and solicited my local statistics  
 expert... to no avail. So I hope that the stats' experts on this  
 list will help me.

 Thank you very much in advance.

JiHO
---
http://jo.irisson.free.fr/



-- 
This message has been checked by MailScanner
for viruses or spam and nothing
suspect was found.
CRI UPVD http://www.univ-perp.fr

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ratio distribution - missing attachment

2007-05-29 Thread Ravi Varadhan
Dear Martin and Vitto,

Please find attached the R function to compute the density of the ratio of 2
dependent normal variates.

Best,
Ravi.






-Original Message-
From: Martin Maechler [mailto:[EMAIL PROTECTED] 
Sent: Monday, May 28, 2007 5:37 AM
To: Ravi Varadhan
Cc: R-help@stat.math.ethz.ch
Subject: Re: [R] ratio distribution - missing attachment

Thank you, Ravi.
You probably will have noticed, that the attachment didn't make
it to the mailing list.

The reason is that we let the mailing list software strip 
binary attachments, which can easily be misused to spread
viruses; see http://www.r-project.org/mail.html (search for 'attachment')
or the posting guide.

OTOH, the software allows attachments with MIME type text/plain.
If you use an e-mail client for sophisticated users, the
software allows you to specify the MIME type of your
attachments;
otherwise (as with most user-friendly, modern e-mail
software), attach a *.txt file (Doug Bates uses foo_R.txt) 
and it should make it to the lists;
as a third alternative, just cut & paste the corresponding
text into your e-mail.

I think your R function should make it to R-help (and its
archives), so I'd be thankful for a repost.

Martin


>>>>> "Ravi" == Ravi Varadhan [EMAIL PROTECTED]
>>>>>     on Fri, 25 May 2007 14:24:20 -0400 writes:

    Ravi> Mike, Attached is an R function to do this, along with
    Ravi> an example that will reproduce the MathCad plot shown
    Ravi> in your attached paper. I haven't checked it
    Ravi> thoroughly, but it seems to reproduce the MathCad
    Ravi> example well.

    Ravi> Ravi.



    Ravi> -Original Message- From:
    Ravi> [EMAIL PROTECTED]
    Ravi> [mailto:[EMAIL PROTECTED] On Behalf Of
    Ravi> Mike Lawrence Sent: Friday, May 25, 2007 1:55 PM To:
    Ravi> Lucke, Joseph F Cc: Rhelp Subject: Re: [R] Calculation
    Ravi> of ratio distribution properties

    Ravi> According to the paper I cited, there is controversy
    Ravi> over the sufficiency of Hinkley's solution, hence
    Ravi> their proposed more complete solution.

    Ravi> On 25-May-07, at 2:45 PM, Lucke, Joseph F wrote:

    >> The exact ratio is given in
    >> 
    >> On the Ratio of Two Correlated Normal Random Variables,
    >> D. V. Hinkley, Biometrika, Vol. 56, No. 3. (Dec., 1969),
    >> pp. 635-639.
    >> 

    Ravi> -- Mike Lawrence Graduate Student, Department of
    Ravi> Psychology, Dalhousie University

    Ravi> Website: http://myweb.dal.ca/mc973993 Public calendar:
    Ravi> http://icalx.com/public/informavore/Public

    Ravi> The road to wisdom? Well, it's plain and simple to
    Ravi> express: Err and err and err again, but less and less
    Ravi> and less.  - Piet Hein
ratio2normals <- function(x, mean1,mean2,sd1,sd2,rho){
# A function to compute ratio of 2 normals
# R code written by Ravi Varadhan 
# May 25, 2007
# Based on the paper by Pham Gia et al., Comm in Stats (2006)
A <- 1 / (2*pi*sd1*sd2*sqrt(1-rho^2))
exponent.num <- -sd2^2*mean1^2 - sd1^2*mean2^2 + 2*rho*sd1*sd2*mean1*mean2
exponent.denom <- 2*(1-rho^2)*sd1^2*sd2^2
K <- A * exp(exponent.num/exponent.denom)
t2x.num <- -sd2^2*mean1*x - sd1^2*mean2 + rho*sd1*sd2*(mean2*x + mean1)
t2x.denom <- sd1*sd2*sqrt(2*(1-rho^2)*(sd2^2*x^2 - 2*rho*x*sd1*sd2 + sd1^2))
t2x <- t2x.num / t2x.denom
erf.term <- 2 * pnorm(sqrt(2) * t2x) - 1
Ft2x <- sqrt(pi) * t2x * exp(t2x^2) * erf.term + 1
fx <- K * Ft2x * 2 * (1 - rho^2) * sd1^2 * sd2^2 / (sd2^2 * x^2 + sd1^2 - 
2*x*rho*sd1*sd2)
return(fx)
}


mean1 <- 75.25
mean2 <- 71.58 
sd1 <- 6.25
sd2 <- 5.45
rho <- 0.76

x <- seq(0.5,1.5, length=100)
y <- ratio2normals(x,mean1,mean2, sd1,sd2,rho)
plot(x, y, type="l")


# compute the mean and variance via quadrature
m1 <- function(x, mean1, mean2, sd1, sd2, rho){
x * ratio2normals(x, mean1, mean2, sd1, sd2, rho)
}

m2 <- function(x, mean1, mean2, sd1, sd2, rho){
x^2 * ratio2normals(x, mean1, mean2, sd1, sd2, rho)
}

m.1 <- integrate(m1, lower=-Inf, upper=Inf, mean1=mean1, mean2=mean2, sd1=sd1, 
sd2=sd2, rho=rho, rel.tol=1.e

Re: [R] Problem with numerical integration and optimization with BFGS

2007-05-25 Thread Ravi Varadhan
Deepankar,

If the problem seems to be in the evaluation of numerical quadrature part,
you might want to try quadrature methods that are better suited to
integrands with strong peaks.  The traditional Gaussian quadrature methods,
even their adaptive versions such as Gauss-Kronrod, are not best suited for
integrating because they do not explicitly account for the peakedness of
the integrand, and hence can be inefficient and inaccurate. See the article
below:
http://citeseer.ist.psu.edu/cache/papers/cs/18996/http:zSzzSzwww.sci.wsu.edu
zSzmathzSzfacultyzSzgenzzSzpaperszSzmvn.pdf/genz92numerical.pdf
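As a toy base-R illustration of the peakedness problem (my own example, unrelated to Deepankar's model): blind infinite-range quadrature can miss a narrow peak entirely, while splitting the domain at the known peak location puts quadrature points where the mass is.

```r
# A density bump far from the origin; the true integral is 1.
f <- function(x) dnorm(x, mean = 400)
naive <- integrate(f, -Inf, Inf)$value   # quadrature may miss the peak entirely
# Splitting the range at the peak recovers the integral:
split <- integrate(f, -Inf, 400)$value + integrate(f, 400, Inf)$value
c(naive = naive, split = split)          # split is ~1
```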

Alan Genz has worked on this problem a lot and has a number of computational
tools available. I used some of them when I was working on computing Bayes
factors for binomial regression models with different link functions.  If
you are interested, check the following:

http://www.math.wsu.edu/faculty/genz/software/software.html.

For your immediate needs, there is an R package called mnormt that has a
function for computing integrals under a multivariate normal (or
multivariate t) density, which is actually based on Genz's Fortran
routines.  You could try that.

Ravi.



---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Deepankar Basu
Sent: Friday, May 25, 2007 12:02 AM
To: Prof Brian Ripley
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Problem with numerical integration and optimization with
BFGS

Prof. Ripley,

The code that I provided with my question of course does not contain
code for the derivatives; but I am supplying analytical derivatives in
my full program. I did not include that code with my question because
that would have added about 200 more lines of code without adding any
new information relevant for my question. The problem that I had pointed
to occurs whether I provide analytical derivatives or not to the
optimization routine. And the problem was that when I use the BFGS
method in optim, I get an error message saying that the integrals are
probably divergent; I know, on the other hand, that the integrals are
convergent. The same problem does not arise when I instead use the
Nelder-Mead method in optim.

Your suggestion that the expression can be analytically integrated
(which will involve pnorm) might be correct though I do not see how to
do that. The integrands are the bivariate normal density functions with
one variable replaced by known quantities while I integrate over the
second. 

For instance, the first integral is as follows: the integrand is the
bivariate normal density function (with general covariance matrix) where
the second variable has been replaced by 
y[i] - rho1*y[i-1] + delta 
and I integrate over the first variable; the range of integration is
lower=-y[i]+rho1*y[i-1]
upper=y[i]-rho1*y[i-1]

The other two integrals are very similar. It would be of great help if
you could point out how to integrate the expressions analytically using
pnorm.
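For what it is worth, the analytic reduction Prof. Ripley alludes to follows from factoring the joint density as f(x1, x2) = f(x2) * f(x1 | x2), with x1 | x2 univariate normal. A hedged sketch in my own notation (not Deepankar's actual model), checked against adaptive quadrature:

```r
# Integral of a bivariate normal density over x1 in (lower, upper), x2 fixed:
# equals dnorm(x2) times a difference of pnorm's of the conditional normal.
bvn.slice <- function(lower, upper, x2, m1, m2, s1, s2, rho) {
  mc <- m1 + rho * s1 / s2 * (x2 - m2)   # conditional mean of x1 given x2
  sc <- s1 * sqrt(1 - rho^2)             # conditional sd
  dnorm(x2, m2, s2) * (pnorm(upper, mc, sc) - pnorm(lower, mc, sc))
}
# Numerical check against quadrature of the joint density:
f.joint <- function(x1, x2, m1, m2, s1, s2, rho) {
  q <- ((x1 - m1)/s1)^2 - 2*rho*((x1 - m1)/s1)*((x2 - m2)/s2) + ((x2 - m2)/s2)^2
  exp(-q / (2 * (1 - rho^2))) / (2 * pi * s1 * s2 * sqrt(1 - rho^2))
}
exact <- bvn.slice(-1, 1, 0.3, 0, 0, 1, 1, 0.5)
numer <- integrate(f.joint, -1, 1, x2 = 0.3, m1 = 0, m2 = 0,
                   s1 = 1, s2 = 1, rho = 0.5)$value
all.equal(exact, numer)   # TRUE
```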

Thanks.
Deepankar


On Fri, 2007-05-25 at 04:22 +0100, Prof Brian Ripley wrote:
 You are trying to use a derivative-based optimization method without 
 supplying derivatives.  This will use numerical approximations to the 
 derivatives, and your objective function will not be suitable as it is 
 internally using adaptive numerical quadrature and hence is probably not 
 close enough to a differentiable function (it may well have steps).
 
 I believe you can integrate analytically (the answer will involve pnorm), 
 and that you can also find analytical derivatives.
 
 Using (each of) numerical optimization and integration is a craft, and it 
 seems you need to know more about it.  The references on ?optim are too 
 advanced I guess, so you could start with Chapter 16 of MASS and its 
 references.
 
 On Thu, 24 May 2007, Deepankar Basu wrote:
 
  Hi R users,
 
  I have a couple of questions about some problems that I am facing with
  regard to numerical integration and optimization of likelihood
  functions. Let me provide a little background information: I am trying
  to do maximum likelihood estimation of an econometric model that I have
  developed recently. I estimate the parameters of the model using the
  monthly US unemployment rate series obtained from the Federal Reserve
  Bank of St. Louis. (The data is freely available from their web-based
  database called FRED-II).
 
  For my model, the likelihood function for each observation is the sum of
  three integrals. The integrand in each of these integrals is of the
  following form

Re: [R] Calculation of ratio distribution properties

2007-05-25 Thread Ravi Varadhan
Mike,

Attached is an R function to do this, along with an example that will
reproduce the MathCad plot shown in your attached paper. I haven't checked
it thoroughly, but it seems to reproduce the MathCad example well.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mike Lawrence
Sent: Friday, May 25, 2007 1:55 PM
To: Lucke, Joseph F
Cc: Rhelp
Subject: Re: [R] Calculation of ratio distribution properties

According to the paper I cited, there is controversy over the  
sufficiency of Hinkley's solution, hence their proposed more complete  
solution.

On 25-May-07, at 2:45 PM, Lucke, Joseph F wrote:

 The exact ratio is given in

 On the Ratio of Two Correlated Normal Random Variables, D. V.  
 Hinkley, Biometrika, Vol. 56, No. 3. (Dec., 1969), pp. 635-639.


--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://myweb.dal.ca/mc973993
Public calendar: http://icalx.com/public/informavore/Public

The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less.
- Piet Hein



Re: [R] Time series\optimization question not R question

2007-05-22 Thread Ravi Varadhan
Your approach obviously won't give you the same result as when the
likelihood is optimized jointly with A and \beta.  However, you can maximize
the likelihood over \beta for different values of A, which would give you a
profiled likelihood.  Then you pick the \beta and A corresponding to
maximum of the profiled likelihood.  However, this set of A and \beta need
not necessarily satisfy your constraints.  If this does happen, you could
make a simple parameter transformation from (A, beta) to (P1, P2) that might
resolve the problem:

P1 <- beta
P2 <- atanh(A + beta)

Ravi.






-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Leeds, Mark (IED)
Sent: Tuesday, May 22, 2007 10:29 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Time series\optimization question not R question

This is a time series\optimization rather than an R question: Suppose I
have an ARMA(1,1) with
restrictions such that the coefficient on the lagged epsilon term is
related to the coefficient on
the lagged z term as below.

z_t = [A + beta]*z_t-1 + epsilon_t - A*epsilon_t-1

So, if I don't have a facility for optimizing with this restriction, is
it legal to set A to something and then
optimize just for the beta given the A? Would this give me the same
answer, likelihood-wise, as optimizing both
jointly with the restriction? This methodology doesn't sound right to
me. Thanks.

P.S.: abs(A + beta) also has to be less than 1, but I was just going to
hope for that and not worry about it right now.


This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}



Re: [R] Time series\optimization question not R question

2007-05-22 Thread Ravi Varadhan
In my previous email, I meant to say:

P1 <- A
P2 <- atanh(A + beta)

So that the model becomes:

z_t = tanh(P2)*z_t-1 + epsilon_t - P1*epsilon_t-1

Although I am not sure, how readily the likelihood of the above model can be
maximized.
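A minimal sketch of why the tanh reparameterization helps (my own illustration, not a full ARMA(1,1) fit): tanh maps the whole real line into (-1, 1), so any unconstrained optimizer output satisfies |A + beta| < 1 automatically.

```r
# Map unconstrained (P1, P2) back to the constrained (A, beta):
to.constrained <- function(p) {
  A <- p[1]            # P1 = A, unconstrained
  s <- tanh(p[2])      # A + beta, forced into (-1, 1)
  c(A = A, beta = s - A)
}
par <- to.constrained(c(0.4, 5))     # even an extreme P2 stays feasible
abs(par["A"] + par["beta"]) < 1      # TRUE
```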

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 







Re: [R] Nonlinear constrains with optim

2007-05-10 Thread Ravi Varadhan
Paul,

The general idea here is to transform a constrained problem into a sequence
of unconstrained optimization problems.  When only equality constraints are
involved, a popular way to do this is to augment the objective function with
a Lagrangian term and with a quadratic penalty term.  When general
inequalities are present, it is a lot harder, but various kinds of barrier
functions are used.  
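A toy base-R sketch of the quadratic-penalty idea for an equality constraint (my own example; the mu schedule and the toy problem are arbitrary choices):

```r
# Minimize f(x) = x1^2 + x2^2 subject to x1 + x2 = 1 (solution: (0.5, 0.5))
# by minimizing f(x) + mu * h(x)^2 for an increasing penalty weight mu.
f <- function(x) sum(x^2)
h <- function(x) sum(x) - 1
penalized <- function(x, mu) f(x) + mu * h(x)^2
x <- c(0, 0)
for (mu in c(1, 10, 100, 1000)) {
  x <- optim(x, penalized, mu = mu)$par   # warm-start each subproblem
}
round(x, 2)   # approximately c(0.5, 0.5)
```

Each unconstrained subproblem pulls the iterate closer to the constraint surface; in the limit mu -> Inf the penalized minimizer approaches the constrained one.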

A canonical reference (quite theoretical) for constrained optimization is
the book:
Fiacco and McCormick (1968, but reprinted as classic by SIAM in 1990):
Nonlinear programming: sequential unconstrained minimization techniques.

Another good book is that by Roger Fletcher (1987): Practical methods of
optimization.

Ravi.






-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Paul Smith
Sent: Thursday, May 10, 2007 11:08 AM
To: R-help
Subject: [R] Nonlinear constrains with optim

Dear All

I am dealing at the moment with optimization problems with nonlinear
constraints. Regenoud is quite apt to solve that kind of problems, but
the precision of the optimal values for the parameters is sometimes
far from what I need. Optim seems to be more precise, but it can only
accept box-constrained optimization problems. I read in the list
archives that optim can also be used with nonlinear constrains through
penalizations. However, I am not familiar with the technique of
penalizations. Could someone please indicate to me a site or a book to
learn about that penalization technique?

Thanks in advance,

Paul



Re: [R] Optim

2007-05-10 Thread Ravi Varadhan

Let us first assume that you have enumerated all the local maxima, which is
by no means a trivial thing to assure.  How different are the likelihood
values?  If they are significantly different, then take the parameter
estimates corresponding to the largest likelihood.  If they are not
significantly different but the corresponding parameter estimates differ
widely, then you may have identifiability issues.  
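A hedged sketch of that enumeration step (illustrative only): run optim() from several starting values, keep the fit with the largest likelihood, and then inspect how the other local maxima compare.

```r
# Multi-start wrapper: maximize fn from each start, keep the best fit.
multi.start <- function(fn, starts, ...) {
  fits <- lapply(starts, function(p)
    optim(p, fn, method = "BFGS", control = list(fnscale = -1), ...))
  fits[[which.max(sapply(fits, `[[`, "value"))]]
}
# Toy bimodal objective: local maximum near -2, global maximum near 2.
f <- function(x) dnorm(x, mean = -2) + 2 * dnorm(x, mean = 2)
best <- multi.start(f, starts = list(-3, 0, 3))
best$par   # close to 2, the global maximum
```

This reduces, but does not eliminate, the risk of reporting a local maximum; there is still no guarantee the global one is among the starts' basins.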

Ravi.







-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Wassim Kamoum
Sent: Thursday, May 10, 2007 3:46 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Optim

Hello, 
  I'm maximizing a likelihood function with the function optim, but for
different intial parameters (in the input of the optim funtion) , I found
different value for the likelihood function and the parameters estimates,
the causes is that the algorithm has not found the global maximum for the
function but only a local maximum. What must I do to obtain the global
maximum for the likelihood function?
  Thanks


-

[[alternative HTML version deleted]]



Re: [R] A function for raising a matrix to a power?

2007-05-09 Thread Ravi Varadhan
Atte,

Your matrix A is not symmetric, that is why exponentiation using spectral
decomposition, expM.sd, does not give you the correct answer.  

Convert A to a symmetric matrix: 
A <- (A + t(A))/2
then the results will all match.
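A quick check of this claim (a sketch; expM.sd written out here so the snippet is self-contained):

```r
# After symmetrizing, spectral-decomposition exponentiation agrees with
# direct repeated multiplication.
expM.sd <- function(X, e) { Xsd <- eigen(X); Xsd$vec %*% diag(Xsd$val^e) %*% t(Xsd$vec) }
A <- rbind(c(-3,-4,-2), c(3,5,1), c(2,-1,4))
A <- (A + t(A))/2                          # symmetrize
all.equal(expM.sd(A, 3), A %*% A %*% A)    # TRUE
```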

Ravi.






-Original Message-
From: Atte Tenkanen [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, May 09, 2007 10:07 AM
To: Paul Gilbert
Cc: Ravi Varadhan; r-help@stat.math.ethz.ch
Subject: Re: [R] A function for raising a matrix to a power?


Hello,

Thanks for the many replies. I tested all the functions presented.
I'm a beginner in linear algebra, but now I have a serious question ;-)
Here is a matrix A whose determinant is 3, so it is nonsingular.
Then there are similar computer runs done with each of the proposed functions.
I have calculated powers of A between A^274 and A^277.
In the first version (see the outputs below), when n=277,
a zero appears in the upper right corner.
Can you say whether the first function is more reliable than the others?
What can you say about the accuracy of the calculations?

-Atte

#-#

# Matrix A:
A=rbind(c(-3,-4,-2),c(3,5,1),c(2,-1,4))

A
det(A)

# 1st version:
"%^%" <- function(A,n){
  if(n==1) A else {B <- A; for(i in (2:n)){A <- A%*%B}}; A
  }

for(i in 274:277){print(i);print(A%^%i)}


# 2nd version:
"%^%" <- function(A, n) if(n == 1) A else A %*% (A %^% (n-1))

for(i in 274:277){print(i);print(A%^%i)}


# 3rd version:

mp <- function(mat,pow){
ans <- mat
for ( i in 1:(pow-1)){
ans <- mat%*%ans
}
return(ans)
}

for(i in 274:277){print(i);print(mp(A,i))}


# 4th version:

library(Malmig)

mtx.exp
for(i in 274:277){print(i);print(mtx.exp(A,i))}

# 5th version:

matrix.power <- function(mat, n)
{
  # test if mat is a square matrix
  # treat n < 0 and n == 0 -- this is left as an exercise
  # trap non-integer n and return an error
  if (n == 1) return(mat)
  result <- diag(1, ncol(mat))
  while (n > 0) {
    if (n %% 2 != 0) {
      result <- result %*% mat
      n <- n - 1
    }
    mat <- mat %*% mat
    n <- n / 2
  }
  return(result)
}

for(i in 274:277){print(i);print(matrix.power(A,i))}


# 6th version:

expM.sd <- function(X,e){Xsd <- eigen(X); Xsd$vec %*% diag(Xsd$val^e)
%*% t(Xsd$vec)}

for(i in 274:277){print(i);print(expM.sd(A,i))}


#-OUTPUTS---#


> A=rbind(c(-3,-4,-2),c(3,5,1),c(2,-1,4))
> 
> A
     [,1] [,2] [,3]
[1,]   -3   -4   -2
[2,]    3    5    1
[3,]    2   -1    4
> det(A)
[1] 3
> 
> # 1st version:
> "%^%" <- function(A,n){
+   if(n==1) A else {B <- A; for(i in (2:n)){A <- A%*%B}}; A
+   }
> 
> for(i in 274:277){print(i);print(A%^%i)}
[1] 274
               [,1]           [,2]           [,3]
[1,]  1.615642e+131  3.231283e+131 -3.940201e+115
[2,] -5.385472e+130 -1.077094e+131  4.925251e+114
[3,] -3.769831e+131 -7.539661e+131  7.880401e+115
[1] 275
               [,1]           [,2]           [,3]
[1,]  4.846925e+131  9.693850e+131 -1.182060e+116
[2,] -1.615642e+131 -3.231283e+131   0.00e+00
[3,] -1.130949e+132 -2.261898e+132  4.728241e+116
[1] 276
               [,1]           [,2]           [,3]
[1,]  1.454078e+132  2.908155e+132 -1.576080e+116
[2,] -4.846925e+131 -9.693850e+131   0.00e+00
[3,] -3.392848e+132 -6.785695e+132  1.576080e+117
[1] 277
               [,1]           [,2]           [,3]
[1,]  4.362233e+132  8.724465e+132   0.00e+00
[2,] -1.454078e+132 -2.908155e+132 -1.576080e+116
[3,] -1.017854e+133 -2.035709e+133  3.782593e+117
> 
> 
> # 2nd version:
> "%^%" <- function(A, n) if(n == 1) A else A %*% (A %^% (n-1))
> 
> for(i in 274:277){print(i);print(A%^%i)}
[1] 274
               [,1]           [,2]           [,3]
[1,]  1.615642e+131  3.231283e+131 -3.101125e+114
[2,] -5.385472e+130 -1.077094e+131  1.533306e+114
[3,] -3.769831e+131 -7.539661e+131  5.664274e+114
[1] 275
               [,1]           [,2]           [,3]
[1,]  4.846925e+131  9.693850e+131 -8.158398e+114
[2,] -1.615642e+131 -3.231283e+131  4.027430e+114
[3,] -1.130949e+132 -2.261898e+132  1.492154e+115
[1] 276
               [,1]           [,2]           [,3]
[1,]  1.454078e+132  2.908155e+132 -2.147761e+115
[2,] -4.846925e+131 -9.693850e+131  1.058350e+115
[3,] -3.392848e+132 -6.785695e+132  3.934193e+115
[1] 277
               [,1]           [,2]           [,3]
[1,]  4.362233e+132  8.724465e+132 -5.658504e+115
[2,] -1.454078e+132 -2.908155e+132  2.782660e+115
[3,] -1.017854e+133 -2.035709e+133  1.038290e+116
> 
> 
> # 3rd version:
> 
> mp <- function(mat,pow){
+ ans <- mat
+ for ( i in 1:(pow-1)){
+ ans <- mat%*%ans
+ }
+ return(ans)
+ }
> 
> for(i in 274:277

Re: [R] Bad optimization solution

2007-05-08 Thread RAVI VARADHAN
Paul,

The problem lies neither with R nor with numerical methods.  The onus is always 
on the user to understand what the numerical schemes can do and what they can't 
do.  One should never blindly take the results given by a numerical scheme and 
run with them.  In your example, the optimization method is doing what it was 
designed to do: find a critical point of a function where the gradient is 
zero.  It is your responsibility to ensure that the result makes sense, and if 
it doesn't, to understand why it doesn't make sense.  In your problem, the maxima 
((1,0) and (0,1)) lie on the boundary of the parameter space, and the gradient 
at the maxima (defined as the limit from within the domain) is clearly not 
zero.  Another problem with your example is that the hessian of your function 
is singular: it has eigenvalues of 0 and 2.  In short, there is no substitute 
for using your analytic powers!

Ravi.

- Original Message -
From: Paul Smith [EMAIL PROTECTED]
Date: Tuesday, May 8, 2007 4:33 am
Subject: Re: [R] Bad optimization solution
To: R-help r-help@stat.math.ethz.ch


 It seems that there is here a problem of reliability, as one never
  knows whether the solution provided by R is correct or not. In the
  case that I reported, it is fairly simple to see that the solution
  provided by R (without any warning!) is incorrect, but, in general,
  that is not so simple and one may take a wrong solution as a correct
  one.
  
  Paul
  
  
  On 5/8/07, Ravi Varadhan [EMAIL PROTECTED] wrote:
   Your function, (x1-x2)^2, has zero gradient at all the starting
   values such that x1 = x2, which means that the gradient-based
   search methods will terminate there because they have found a
   critical point, i.e. a point at which the gradient is zero (which
   can be a maximum or a minimum or a saddle point).
  
   However, I do not know why optim converges to the boundary
   maximum when the analytic gradient is supplied (as shown by
   Sundar).
  
   Ravi.
  
   
 
   -Original Message-
   From: [EMAIL PROTECTED]
   [ On Behalf Of Paul Smith
   Sent: Monday, May 07, 2007 6:26 PM
   To: R-help
   Subject: Re: [R] Bad optimization solution
  
   On 5/7/07, Paul Smith [EMAIL PROTECTED] wrote:
 I think the problem is the starting point.  I do not remember the
   details
 of the BFGS method, but I am almost sure the (.5, .5) starting 
 point is
 suspect, since the abs function is not differentiable at 0.  If 
 you
   perturb
 the starting point even slightly you will have no problem.

  Paul Smith [EMAIL PROTECTED] (sent by R-help r-help@stat.math.ethz.ch)
  wrote on 05/07/2007 04:30 PM, subject: [R] Bad optimization solution:








 Dear All

 I am trying to perform the below optimization problem, but getting
 (0.5,0.5) as optimal solution, which is wrong; the correct solution
 should be (1,0) or (0,1).

 Am I doing something wrong? I am using R 2.5.0 on Fedora Core 6 
 (Linux).

 Thanks in advance,

 Paul

 --
 myfunc <- function(x) {
   x1 <- x[1]
   x2 <- x[2]
   abs(x1-x2)
 }


   
 optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method="L-BFGS-B",control=
   list(fnscale=-1))
   
Yes, with (0.2,0.9), a correct solution comes out. However, how can
one be sure in general that the solution obtained by optim is correct?
In ?optim says:
   
 Method 'L-BFGS-B' is that of Byrd _et. al._ (1995) which allows
 _box constraints_, that is each variable can be given a lower
 and/or upper bound. The initial value must satisfy the
 constraints. This uses a limited-memory modification of the 
 BFGS
 quasi-Newton method. If non-trivial bounds are supplied, this
 method will be selected, with a warning.
   
which only demands that the initial value must satisfy the constraints.
  
   Furthermore, X^2 is everywhere differentiable and notwithstanding the
   reported problem occurs with
  
   myfunc <- function(x) {
     x1 <- x[1]
     x2 <- x[2]
     (x1-x2)^2
   }
  
   
 optim(c(0.2,0.2),myfunc,lower=c(0,0),upper=c(1,1),method="L-BFGS-B",control=
   list(fnscale=-1))
  
   Paul

Re: [R] A function for raising a matrix to a power?

2007-05-07 Thread Ravi Varadhan
Paul,

Your solution based on SVD does not work even for the matrix in your example
(the reason it worked for e=3 is that 3 is an odd power and P is a
permutation matrix; it will also work for all odd powers, but not for even
powers).
  
However, a spectral decomposition will work for all symmetric matrices
(SVD-based exponentiation does not work for an arbitrary matrix).

Here is the function to do exponentiation based on spectral decomposition:

expM.sd <- function(X,e){Xsd <- eigen(X); Xsd$vec %*% diag(Xsd$val^e) %*%
t(Xsd$vec)}

> P <- matrix(c(.3,.7, .7, .3), ncol=2)
> P
     [,1] [,2]
[1,]  0.3  0.7
[2,]  0.7  0.3
> P %*% P
     [,1] [,2]
[1,] 0.58 0.42
[2,] 0.42 0.58
> expM(P,2)
     [,1] [,2]
[1,] 0.42 0.58
[2,] 0.58 0.42
> expM.sd(P,2)
     [,1] [,2]
[1,] 0.58 0.42
[2,] 0.42 0.58


Exponentiation based on spectral decomposition should be generally more
accurate (not sure about the speed).  A caveat is that it will only work for
real symmetric or complex Hermitian matrices.  It will not work for an
asymmetric matrix unless its eigenvectors are made orthonormal.
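
For integer powers there is also a decomposition-free route (a sketch of my
own, not from this thread; the "%^%" operator name is made up): repeated
squaring needs only matrix multiplication, so the symmetry caveat above does
not apply.

```r
# Matrix power by repeated squaring (hypothetical "%^%" operator).
# Works for any square matrix, symmetric or not, but only for
# non-negative integer exponents.
"%^%" <- function(A, n) {
  stopifnot(is.matrix(A), nrow(A) == ncol(A), n >= 0, n == round(n))
  out <- diag(nrow(A))             # identity matrix, i.e. A %^% 0
  while (n > 0) {
    if (n %% 2 == 1) out <- out %*% A
    A <- A %*% A                   # square the base, halve the exponent
    n <- n %/% 2
  }
  out
}

P <- matrix(c(.3, .7, .7, .3), ncol = 2)
P %^% 2    # agrees with P %*% P, even powers included
```

This takes O(log n) multiplications, which also helps for large exponents
such as expM(P, 1000) above.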

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Paul Gilbert
Sent: Monday, May 07, 2007 5:16 PM
To: Atte Tenkanen
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] A function for raising a matrix to a power?

You might try this, from 9/22/2006 with subject line "Exponentiate a matrix":

I am getting a bit rusty on some of these things, but I seem to recall 
that there is a numerical advantage (speed and/or accuracy?) to 
diagonalizing:

  expM <- function(X,e) { v <- La.svd(X); v$u %*% diag(v$d^e) %*% v$vt }

  > P <- matrix(c(.3,.7, .7, .3), ncol=2)
  > P %*% P %*% P
        [,1]  [,2]
  [1,] 0.468 0.532
  [2,] 0.532 0.468
  > expM(P,3)
        [,1]  [,2]
  [1,] 0.468 0.532
  [2,] 0.532 0.468

I think this also works for non-integer, negative, large, and complex 
exponents:

  > expM(P, 1.5)
            [,1]      [,2]
  [1,] 0.3735089 0.6264911
  [2,] 0.6264911 0.3735089
  > expM(P, 1000)
       [,1] [,2]
  [1,]  0.5  0.5
  [2,]  0.5  0.5
  > expM(P, -3)
          [,1]    [,2]
  [1,] -7.3125  8.3125
  [2,]  8.3125 -7.3125
  > expM(P, 3+.5i)
                    [,1]              [,2]
  [1,] 0.4713+0.0141531i 0.5287-0.0141531i
  [2,] 0.5287-0.0141531i 0.4713+0.0141531i
 

Paul Gilbert

Doran, Harold wrote:

  Suppose I have a square matrix P
 
  P <- matrix(c(.3,.7, .7, .3), ncol=2)
 
  I know that
 
 
  P * P
 
  Returns the element by element product, whereas
 
 
 
  P%*%P
 
 
  Returns the matrix product.
 
  Now, P^2 also returns the element by element product. But, is there a
  slick way to write
 
  P %*% P %*% P
 
  Obviously, P^3 does not return the result I expect.
 
  Thanks,
  Harold
 


Atte Tenkanen wrote:
 Hi,
 
 Is there a function for raising a matrix to a power? For example if you
like to compute A%*%A%*%A, is there an abbreviation similar to A^3?
 
 Atte Tenkanen
 
 A=rbind(c(1,1),c(-1,-2))
 A
      [,1] [,2]
 [1,]    1    1
 [2,]   -1   -2
 A^3
      [,1] [,2]
 [1,]    1    1
 [2,]   -1   -8
 
 But:
 
 A%*%A%*%A
      [,1] [,2]
 [1,]    1    2
 [2,]   -2   -5
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.






Re: [R] Bad optimization solution

2007-05-07 Thread Ravi Varadhan
Your function, (x1-x2)^2, has zero gradient at all the starting values such
that x1 = x2, which means that the gradient-based search methods will
terminate there because they have found a critical point, i.e. a point at
which the gradient is zero (which can be a maximum or a minimum or a saddle
point).  

However, I do not know why optim converges to the boundary maximum when the
analytic gradient is supplied (as shown by Sundar).
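
One practical safeguard (my own sketch, not from the thread): restart optim()
from several random starting values and keep the best result. Off-diagonal
starts avoid the zero-gradient line x1 == x2, and agreement across restarts
gives some confidence in the answer.

```r
# Multi-start maximization of (x1 - x2)^2 on [0,1]^2 with L-BFGS-B.
# Random restarts guard against stalling at a critical point.
myfunc <- function(x) (x[1] - x[2])^2

set.seed(1)
starts <- matrix(runif(20), ncol = 2)   # 10 random points in [0,1]^2
fits <- lapply(seq_len(nrow(starts)), function(i)
  optim(starts[i, ], myfunc, lower = c(0, 0), upper = c(1, 1),
        method = "L-BFGS-B", control = list(fnscale = -1)))

values <- sapply(fits, function(f) f$value)
best <- fits[[which.max(values)]]
best$par    # close to (0,1) or (1,0), the true maximizers
best$value  # close to 1, the true maximum
```

The cost is a constant factor in run time; for harder problems the restarts
can be drawn from any distribution over the feasible region.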

Ravi.







-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Paul Smith
Sent: Monday, May 07, 2007 6:26 PM
To: R-help
Subject: Re: [R] Bad optimization solution

On 5/7/07, Paul Smith [EMAIL PROTECTED] wrote:
  I think the problem is the starting point.  I do not remember the
details
  of the BFGS method, but I am almost sure the (.5, .5) starting point is
  suspect, since the abs function is not differentiable at 0.  If you
perturb
  the starting point even slightly you will have no problem.
 
   From:    Paul Smith [EMAIL PROTECTED]
   Sent by: [EMAIL PROTECTED]
   To:      R-help r-help@stat.math.ethz.ch
   Subject: [R] Bad optimization solution
   Date:    05/07/2007 04:30 PM

  Dear All
 
  I am trying to perform the below optimization problem, but getting
  (0.5,0.5) as optimal solution, which is wrong; the correct solution
  should be (1,0) or (0,1).
 
  Am I doing something wrong? I am using R 2.5.0 on Fedora Core 6 (Linux).
 
  Thanks in advance,
 
  Paul
 
  --
  myfunc <- function(x) {
    x1 <- x[1]
    x2 <- x[2]
    abs(x1-x2)
  }
 
 
optim(c(0.5,0.5), myfunc, lower=c(0,0), upper=c(1,1), method="L-BFGS-B",
      control=list(fnscale=-1))

 Yes, with (0.2,0.9), a correct solution comes out. However, how can
 one be sure in general that the solution obtained by optim is correct?
 In ?optim it says:

  Method 'L-BFGS-B' is that of Byrd _et. al._ (1995) which allows
  _box constraints_, that is each variable can be given a lower
  and/or upper bound. The initial value must satisfy the
  constraints. This uses a limited-memory modification of the BFGS
  quasi-Newton method. If non-trivial bounds are supplied, this
  method will be selected, with a warning.

 which only demands that the initial value must satisfy the constraints.

Furthermore, X^2 is everywhere differentiable and notwithstanding the
reported problem occurs with

myfunc <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  (x1-x2)^2
}

optim(c(0.2,0.2), myfunc, lower=c(0,0), upper=c(1,1), method="L-BFGS-B",
      control=list(fnscale=-1))

Paul



Re: [R] [SPAM] - Re: R package development in windows - BayesianFilter detected spam

2007-05-03 Thread Ravi Varadhan
Harold,

I totally echo your sentiments on the difficulty of creating an R package in
Windows.  I really wish that this process could be made a bit less painful.

Ravi.  






-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Doran, Harold
Sent: Thursday, May 03, 2007 3:33 PM
To: Duncan Murdoch
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] [SPAM] - Re: R package development in windows -
BayesianFilter detected spam

Thanks, Duncan. I'll look into that. Is there an authoritative document
that codifies the new package development procedures for 2.5.0
(windows-specific), or is that "Writing R Extensions"? In this thread
alone I've received multiple emails pointing to multiple web sites with
instructions for windows. Inasmuch as it's appreciated, I'm a bit
confused as to which I should consider authoritative.

I do hope I can resolve this and appreciate the help I've received.
However, I feel a bit compelled to note how very difficult this process
is. 

Harold


 -Original Message-
 From: Duncan Murdoch [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, May 03, 2007 3:24 PM
 To: Doran, Harold
 Cc: Gabor Grothendieck; r-help@stat.math.ethz.ch
 Subject: [SPAM] - Re: [R] R package development in windows - 
 Bayesian Filter detected spam
 
 On 5/3/2007 3:04 PM, Doran, Harold wrote:
  Thanks Gabor, Sundar, and Tony. Indeed, Rtools was missing from the 
  path. With that resolved, and another 10 minute windows 
 restart, I get 
  the following below. The log suggests that hhc is not installed. It 
  is, and, according to the directions I am following, I have 
 placed it 
  in the c:\cygwin directory.
 
 I think the problem is that you are following a real mix of 
 instructions, and they don't make sense.
 
 It would be nice if folks would submit patches to the R Admin 
 manual (or to the Rtools web site) rather than putting 
 together web sites with advice that is bad from day one, and 
 quickly gets worse when it is not updated.
 
  BTW, package.skeleton() doesn't seem to create the correct 
 DESCRIPTION 
  template. I had to add the DEPENDS line. Without this, I 
 get another 
  error.
  
  
  C:\Program Files\R\R-2.4.1\binRcmd build --force --binary g:\foo
 
 R 2.4.1 is no longer current; the package building 
 instructions in R 2.5.0 have been simplified a bit.  You 
 might want to try those.
 
 Duncan Murdoch
 
  * checking for file 'g:\foo/DESCRIPTION' ... OK
  * preparing 'g:\foo':
  * checking DESCRIPTION meta-information ... OK
  * removing junk files
  * checking for LF line-endings in source files
  * checking for empty or unneeded directories
  * building binary distribution
   WARNING
  * some HTML links may not be found
  installing R.css in c:/TEMP/Rinst40061099
  
  Using auto-selected zip options ''
  latex: not found
  latex: not found
  latex: not found
  
  -- Making package foo 
  latex: not found
adding build stamp to DESCRIPTION
  latex: not found
  latex: not found
  latex: not found
installing R files
  latex: not found
installing data files
  latex: not found
installing man source files
installing indices
  latex: not found
not zipping data
installing help
  Warning: \alias{foo} already in foo-package.Rd -- skipping 
 the one in 
  foo.Rd   Building/Updating help pages for package 'foo'
   Formats: text html latex example chm
foo-package   texthtmllatex   
 example chm
foo   texthtmllatex   
 example chm
mydatatexthtmllatex   
 example chm
  hhc: not found
  cp: cannot stat `c:/TEMP/Rbuild40048815/foo/chm/foo.chm': 
 No such file 
  or direct ory
  make[1]: *** [chm-foo] Error 1
  make: *** [pkg-foo] Error 2
  *** Installation of foo failed ***
  
  Removing 'c:/TEMP/Rinst40061099/foo'
   ERROR
  * installation failed
  
  
  C:\Program Files\R\R-2.4.1\bin
  
  -Original Message-
  From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
  Sent: Thursday, May 03, 2007 2:50 PM
  To: Doran, Harold
  Cc: r-help@stat.math.ethz.ch
  Subject: Re: [R] R package development in windows
  
  It can't find sh.exe, so you haven't installed Rtools.
  
  There are several HowTo's listed in the links section here that 
  include pointers to R manuals and other step by step
  instructions:
  
  http://code.google.com/p/batchfiles/
  
  On 5/3/07, Doran, Harold [EMAIL PROTECTED] wrote:
   I'm attempting to build an R package for distribution and
  am

Re: [R] Simple plot question

2007-04-26 Thread Ravi Varadhan
Check out the function mvrnorm in package MASS.

library(MASS)
?mvrnorm
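
A minimal usage sketch (the mean, covariance, and sample size below are
made-up values, not from David's post):

```r
# Draw 1000 samples from a bivariate normal with a known covariance
# matrix using MASS::mvrnorm.
library(MASS)

mu    <- c(0, 0)
Sigma <- matrix(c(1.0, 0.8,
                  0.8, 1.0), nrow = 2)   # known covariance matrix

set.seed(42)
X <- mvrnorm(n = 1000, mu = mu, Sigma = Sigma)

dim(X)    # 1000 x 2: one row per draw
cov(X)    # sample covariance, close to Sigma for large n
```

Setting empirical = TRUE in mvrnorm makes the sample mean and covariance
match mu and Sigma exactly rather than only in expectation.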

Ravi.






-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of David Kaplan
Sent: Thursday, April 26, 2007 2:49 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Simple plot question

Hi,

I have been searching and can't seem to find a simple command that will 
allow me to sample from a multivariate normal distribution with known 
covariance matrix.  I am likely missing something.

Thanks in advance.

David


-- 
===
David Kaplan, Ph.D.
Professor
Department of Educational Psychology
University of Wisconsin - Madison
Educational Sciences, Room, 1061
1025 W. Johnson Street
Madison, WI 53706

email: [EMAIL PROTECTED]
homepage: http://www.education.wisc.edu/edpsych/facstaff/kaplan/kaplan.htm
Phone: 608-262-0836



Re: [R] Estimates at each iteration of optim()?

2007-04-23 Thread Ravi Varadhan
Deepankar,

Here is an example using BFGS:

> fr <- function(x) {   ## Rosenbrock Banana function
+ x1 <- x[1]
+ x2 <- x[2]
+ 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
+ }
> grr <- function(x) { ## Gradient of 'fr'
+ x1 <- x[1]
+ x2 <- x[2]
+ c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
+   200 * (x2 - x1 * x1))
+ }
> optim(c(-1.2,1), fr, grr, method = "BFGS", control=list(trace=TRUE))
initial  value 24.200000 
iter  10 value 1.367383
iter  20 value 0.134560
iter  30 value 0.001978
iter  40 value 0.000000
final  value 0.000000 
converged
$par
[1] 1 1

$value
[1] 9.594955e-18

$counts
function gradient 
 110   43 

$convergence
[1] 0

$message
NULL

 

This example shows that the parameter estimates are printed out every 10
iterations.  However, trying different integer values for trace from 2 to 10
(trace = 1 behaves the same as trace=TRUE) did not change anything.  If you
want to get estimates at every iteration, look at the source code for BFGS
(which I assume is in FORTRAN). You may have to modify the source code and
recompile it yourself to get more detailed trace for BFGS. 

However, you can get parameter iterates at every step for L-BFGS-B using
trace=6, although this gives a lot more information than just the parameter
estimates.  Alternatively, you can use the CG method with trace=TRUE or
trace=1, which is generally a lot slower than BFGS or L-BFGS-B.

Why do you want to look at parameter estimates for each step, anyway?
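
If modifying the compiled source is not an option, one workaround (my own
sketch, not Ravi's) is to wrap the objective function in a closure that
records every parameter vector it is called with. This logs per function
evaluation rather than per iteration, but it needs no recompilation and
works with any optim method:

```r
# Record the parameters at every objective-function evaluation by
# wrapping the function; trace.env collects the history.
fr <- function(x) {                    # Rosenbrock Banana function
  100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
}

trace.env <- new.env()
trace.env$history <- list()
fr.traced <- function(x) {
  trace.env$history[[length(trace.env$history) + 1]] <- x
  fr(x)
}

out <- optim(c(-1.2, 1), fr.traced, method = "BFGS")
length(trace.env$history)   # one entry per function evaluation
trace.env$history[[1]]      # the starting point, c(-1.2, 1)
```

Note that with "BFGS" and no analytic gradient, many of the recorded points
are finite-difference evaluations rather than accepted iterates.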


Ravi.






-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of DEEPANKAR BASU
Sent: Monday, April 23, 2007 11:34 AM
To: Peter Dalgaard
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Estimates at each iteration of optim()?

I read the description of the trace control parameter in ?optim and then
also looked at the examples given at the end. In one of the examples I found
that they had used trace=TRUE with the method "SANN". I am using the
method "BFGS" and I tried using trace=TRUE too, but I did not get the
parameter estimates at each iteration. As you say, it might be method
dependent. I tried reading the source code for optim but could not find
what I was looking for. Hence, I was wondering if anyone could tell me
what option to use with the method "BFGS" to get the parameter estimates at
each iteration of the optimization.

Deepankar


- Original Message -
From: Peter Dalgaard [EMAIL PROTECTED]
Date: Monday, April 23, 2007 2:46 am
Subject: Re: [R] Estimates at each iteration of optim()?

 DEEPANKAR BASU wrote:
  I am trying to maximise a complicated loglikelihood function with 
 the optim command. Is there some way to get to know the estiamtes 
 at each iteration? When I put control=list(trace=TRUE) as an 
 option in optim, I just got the initial and final values of the 
 loglikelihood, number of iterations and whether the routine has 
 converged or not. I need to know the estimate values at each 
 iteration.

 It might help if you actually _read_ the description of the trace 
 control parameter (hint: it is not an on/off switch) in ?optim... 
 And, 
 as it says, this is method dependent, so you may have to study the 
 source code.
 
  Deepankar
 

 




Re: [R] Estimates at each iteration of optim()?

2007-04-23 Thread Ravi Varadhan
Without knowing much about your problem, it is hard to suggest good
strategies.  However, if you are having trouble with the estimates of
covariance matrix not being positive-definite, you can force them to be
positive-definite after each iteration, before moving on to the next
iteration.  Look at the make.positive.definite function from corpcor
package.  This is just one approach. There may be better approaches -
perhaps, an EM-like approach might be applicable that would automatically
satisfy all parameter constraints.
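
One common transformation (a sketch of my own, not from the corpcor package):
optimize over the unconstrained vector theta = (log sd1, log sd2, atanh(rho)).
Mapping back always produces a positive-definite 2 x 2 covariance matrix, so
no iterate can leave the valid region.

```r
# Unconstrained parametrization of a 2 x 2 covariance matrix:
# exp() keeps the standard deviations positive and tanh() keeps the
# correlation strictly inside (-1, 1), so Sigma is always PD.
theta.to.Sigma <- function(theta) {
  s1  <- exp(theta[1])          # sd of component 1, always > 0
  s2  <- exp(theta[2])          # sd of component 2, always > 0
  rho <- tanh(theta[3])         # correlation, always in (-1, 1)
  matrix(c(s1^2,          rho * s1 * s2,
           rho * s1 * s2, s2^2), nrow = 2)
}

Sigma <- theta.to.Sigma(c(0.5, -0.2, 3))
all(eigen(Sigma, only.values = TRUE)$values > 0)   # TRUE for any real theta
```

For higher dimensions the analogous trick is to optimize over the entries of
a Cholesky factor with positive (e.g. exponentiated) diagonal.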

Ravi.






-Original Message-
From: DEEPANKAR BASU [mailto:[EMAIL PROTECTED] 
Sent: Monday, April 23, 2007 1:09 PM
To: Ravi Varadhan
Cc: 'Peter Dalgaard'; r-help@stat.math.ethz.ch
Subject: Re: RE: [R] Estimates at each iteration of optim()?

Ravi,

Thanks a lot for your detailed reply. It clarifies many of the confusions in
my mind. 

I want to look at the parameter estimates at each iteration because the full
model that I am trying to estimate is not converging; a smaller version of
the model converges but the results are quite meaningless. The problem in
the estimation of the full model is the following: my likelihood function
contains the elements of a (bivariate normal) covariance matrix as
parameters. To compute the likelihood, I have to draw random samples from
the bivariate normal distribution. But no matter what starting values I
give, I cannot ensure that the covariance matrix remains positive definite
at each iteration of the optimization exercise. Moreover, as soon as the
covariance matrix fails to be positive definite, I get an error message
(because I can no longer draw from the bivariate normal distribution) and
the program stops. Faced with this problem, I wanted to see exactly at which
parameter estimates the covariance matrix fails to remain positive definite.
From that I would think of devising a method to get around the problem, at
least I would try to.

Probably there is some other way to solve this problem. I would like your
opinion on the following question: is there some way I can transform the
three parameters of my (2 by 2) covariance matrix (the two standard
deviations and the correlation coefficient) to ensure that the covariance
matrix remains positive definite at each iteration of the optimization? Is
there any method other than transforming the parameters to ensure this?

Deepankar



- Original Message -
From: Ravi Varadhan [EMAIL PROTECTED]
Date: Monday, April 23, 2007 12:21 pm
Subject: RE: [R] Estimates at each iteration of optim()?

 Deepankar,
 
 Here is an example using BFGS:
 
  > fr <- function(x) {   ## Rosenbrock Banana function
 + x1 <- x[1]
 + x2 <- x[2]
 + 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
 + }
  > grr <- function(x) { ## Gradient of 'fr'
 + x1 <- x[1]
 + x2 <- x[2]
 + c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
 +   200 * (x2 - x1 * x1))
 + }
  > optim(c(-1.2,1), fr, grr, method = "BFGS", control=list(trace=TRUE))
 initial  value 24.200000 
 iter  10 value 1.367383
 iter  20 value 0.134560
 iter  30 value 0.001978
 iter  40 value 0.000000
 final  value 0.000000 
 converged
 $par
 [1] 1 1
 
 $value
 [1] 9.594955e-18
 
 $counts
 function gradient 
 110   43 
 
 $convergence
 [1] 0
 
 $message
 NULL
 
  
 
 This example shows that the parameter estimates are printed out 
 every 10
 iterations.  However, trying different integer values for trace 
 from 2 to 10
 (trace = 1 behaves the same as trace=TRUE) did not change anything. 
 If you
 want to get estimates at every iteration, look at the source code 
 for BFGS
 (which I assume is in FORTRAN). You may have to modify the source 
 code and
 recompile it yourself to get more detailed trace for BFGS. 
 
 However, you can get parameter iterates at every step for L-BFGS-B using
 trace=6, although this gives a lot more information than just the
 parameter estimates.  Alternatively, you can use the CG method with
 trace=TRUE or trace=1, which is generally a lot slower than BFGS or
 L-BFGS-B.
 
 Why do you want to look at parameter estimates for each step, anyway?
 
 
 Ravi.
 
 
 

Re: [R] Suggestions for statistical computing course

2007-04-20 Thread Ravi Varadhan
Hi Giovanni,

You may want to consider:
Numerical Analysis for Statisticians (Springer) by Ken Lange.  We used it
when I was taking a graduate-level (MS and PhD students) course in
statistical computing.  I really like it and still use it frequently.

Ravi.


 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Giovanni Petris
Sent: Friday, April 20, 2007 9:34 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Suggestions for statistical computing course


Dear R-helpers,

I am planning a course on Statistical Computing and Computational
Statistics for the Fall semester, aimed at first year Masters students
in Statistics. Among the topics that I would like to cover are linear
algebra related to least squares calculations, optimization and
root-finding, numerical integration, Monte Carlo methods (possibly
including MCMC), bootstrap, smoothing and nonparametric density
estimation. Needless to say, the software I will be using is R.

1. Does anybody have a suggestion about a book to follow that covers
   (most of) the topics above at a reasonable level for my audience? 
   Are there any on-line publicly-available manuals, lecture notes,
   instructional documents that may be useful?

2. I do most of my work in R using Emacs and ESS. That means that I
   keep a file in an emacs window and I submit it to R one line at a
   time or one region at a time, making corrections and iterating as
   needed. When I am done, I just save the file with the last,
   working, correct (hopefully!) version of my code. Is there a way of
   doing something like that, or in the same spirit, without using
   Emacs/ESS? What approach would you use to polish and save your code
   in this case? For my course I will be working in a Windows
   environment. 
   
   While I am looking for simple and effective solutions that do not
   require installing emacs in our computer lab, the answer "you
   should teach your students emacs/ess on top of R" is perfectly
   acceptable.
   

Thank you for your consideration, and thank you in advance for the
useful replies.

Have a good day,
Giovanni

-- 

Giovanni Petris  [EMAIL PROTECTED]
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/



Re: [R] convergence

2007-04-19 Thread Ravi Varadhan
No.


 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Alberto Monteiro
Sent: Thursday, April 19, 2007 9:24 AM
To: [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
Subject: Re: [R] convergence


Ted Harding wrote:
 
 There are various ways round this, but a 'for' loop with
 a fixed number of iterations is not usully one of them!
 
 The simplest is to use while(). A possibly strategy is
 
   Y.old <- initial.Y
   while(TRUE){
 Y <- compute.Y(Y.old, ...)
 if(abs(Y - Y.old) < small.number) break
 Y.old <- Y
   }
 
 This will loop indefinitely until the convergence criterion
 
   abs(Y - Y.old) < small.number
 
 is met, and then stop.
 
I guess some precaution must be taken to prevent the loop from
running forever.
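
Ted's loop with the safeguard Alberto is asking about can be sketched like
this (compute.Y below is a stand-in contraction of my own, halving the
distance to 1 at each step):

```r
# Iterate to convergence, but with a hard cap on the number of
# iterations instead of while(TRUE); report whether we converged.
compute.Y <- function(y) (y + 1) / 2   # stand-in update: contracts to 1

Y.old <- 0
converged <- FALSE
for (i in 1:1000) {                    # hard cap replaces while(TRUE)
  Y <- compute.Y(Y.old)
  if (abs(Y - Y.old) < 1e-8) { converged <- TRUE; break }
  Y.old <- Y
}
converged   # TRUE here: the gap halves each step, so ~27 iterations suffice
```

If the cap is hit with converged still FALSE, the caller can warn or retry
with different settings rather than hanging.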

Those algorithms that must optimize something, but run the risk
of running forever, sound like a chess-playing engine: we
know that a deterministic solution exists (there is a finite number
of chess positions), but it's not practical to check all of them.

I read somewhere that computer loop problems are treated as if
the computer were playing chess against Murphy: it tries hard to
solve the problem, but sometimes it must give up a path and backtrack
to a less optimal but faster solution.

Do I make any sense?

Alberto Monteiro



Re: [R] Problems in programming a simple likelihood

2007-04-19 Thread Ravi Varadhan
Hi Deepankar,

Dimitris' code works just fine.  Your problem is that the output of optim
does not have a corresponding summary method.  Instead you should simply
type the name of the object returned by optim to look at the results.

> out <- optim(mu.start, mlogl, method = "CG", y = women$J, X = cbind(1,
women$M, women$S))
> out
$par
[1] -3.0612277 -1.4567141  0.3659251

$value
[1] 13.32251

$counts
function gradient 
 357  101 

$convergence
[1] 1

$message
NULL

Hope this helps,
Ravi.






-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Deepankar Basu
Sent: Thursday, April 19, 2007 12:42 PM
To: Dimitris Rizopoulos
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Problems in programming a simple likelihood

Dimitris,

Thanks a lot for your suggestion and also for suggestions that others
have provided. I am learning fast and with the help of the R community
will be able to get this going pretty soon. Of course, right now I am
just trying to learn the language; so I am trying to program a standard
probit model for which I know the answers. I could easily get the
estimates with glm. But I want to program the probit model to make
sure I understand the subtleties of R. 

I tried running the alternate script that you provided, but it is still
not working. Am I making some mistake?

Here is what I get when I run your script (which shows that the maximum
number of iterations was reached without convergence):

> source("probit1.R")
> summary(out)
Length Class  Mode
par 3  -none- numeric
value   1  -none- numeric
counts  2  -none- numeric
convergence 1  -none- numeric
message 0  -none- NULL

Here is the script (exactly what you had suggested):

mlogl <- function (mu, y, X) {
zeta <- as.vector(X %*% mu)
y.logic <- as.logical(y)
lgLik <- numeric(length(y))
lgLik[y.logic] <- pnorm(zeta[y.logic], log.p = TRUE)
lgLik[!y.logic] <- pnorm(zeta[!y.logic], lower.tail = FALSE, log.p =
TRUE)
-sum(lgLik)
}

women <-
read.table("http://wps.aw.com/wps/media/objects/2228/2281678/Data_Sets/ASCII/Women13.txt",
header=TRUE)

mu.start <- c(-3, -1.5, 0.5)
out <- optim(mu.start, mlogl, method = "BFGS", y = women$J, X = cbind(1,
women$M, women$S))
out

glm.fit(x = cbind(1, women$M, women$S), y = women$J, family =
binomial(link = "probit"))$coefficients


Thanks.
Deepankar


On Thu, 2007-04-19 at 09:26 +0200, Dimitris Rizopoulos wrote:
 try the following:
 
 mlogl <- function (mu, y, X) {
 zeta <- as.vector(X %*% mu)
 y.logic <- as.logical(y)
 lgLik <- numeric(length(y))
 lgLik[y.logic] <- pnorm(zeta[y.logic], log.p = TRUE)
 lgLik[!y.logic] <- pnorm(zeta[!y.logic], lower.tail = FALSE, log.p 
 = TRUE)
 -sum(lgLik)
 }
 
 women <-
 read.table("http://wps.aw.com/wps/media/objects/2228/2281678/Data_Sets/ASCII/Women13.txt",
 header=TRUE)
 
 mu.start <- c(0, -1.5, 0.01)
 out <- optim(mu.start, mlogl, method = "BFGS", y = women$J, X = 
 cbind(1, women$M, women$S))
 out
 
 glm.fit(x = cbind(1, women$M, women$S), y = women$J, family = 
 binomial(link = "probit"))$coefficients
 
 
 I hope it helps.
 
 Best,
 Dimitris
 
 
 Dimitris Rizopoulos
 Ph.D. Student
 Biostatistical Centre
 School of Public Health
 Catholic University of Leuven
 
 Address: Kapucijnenvoer 35, Leuven, Belgium
 Tel: +32/(0)16/336899
 Fax: +32/(0)16/337015
 Web: http://med.kuleuven.be/biostat/
  http://www.student.kuleuven.be/~m0390867/dimitris.htm
 
 
 - Original Message - 
 From: Deepankar Basu [EMAIL PROTECTED]
 To: r-help@stat.math.ethz.ch
 Sent: Thursday, April 19, 2007 12:38 AM
 Subject: [R] Problems in programming a simple likelihood
 
 
  As part of carrying out a complicated maximum likelihood estimation, 
  I
  am trying to learn to program likelihoods in R. I started with a 
  simple
  probit model but am unable to get the code to work. Any help or
  suggestions are most welcome. I give my code below:
 
  
  mlogl <- function(mu, y, X) {
  n <- nrow(X)
  zeta <- X %*% mu
  llik <- 0
  for (i in 1:n) {
   if (y[i]==1)
 llik <- llik + log(pnorm(zeta[i,], mean=0, sd=1))
   else
 llik <- llik + log(1-pnorm(zeta[i,], mean=0, sd=1))
 }
  return(-llik)
  }
 
  women <- read.table("~/R/Examples/Women13.txt", header=TRUE)  # DATA
 
  # THE DATA SET CAN BE ACCESSED HERE
  # women -
 
read.table(http://wps.aw.com/wps/media/objects/2228/2281678/Data_Sets/ASCII
/Women13.txt, 
  header=TRUE)
  # I HAVE CHANGED THE NAMES OF THE VARIABLES
  # J is changed to work

Re: [R] Computing the rank of a matrix.

2007-04-07 Thread RAVI VARADHAN
Hi Martin,

I played a bit with rankMat.  I will first state the conclusions of my 
numerical experiments and then present the results to support them:

1.  I don't believe that rankMat, or equivalently Matlab's rank computation, is 
necessarily better than qr(.)$rank or qr(., LAPACK=TRUE)$rank.  In fact, for 
the notorious Hilbert matrix, rankMat can give poor estimates of rank.
2.  There exists no universally optimal (i.e. optimal for all A) tol in qr(A, 
tol)$rank that would be optimally close to rankMat.  The discrepancy in the 
ranks computed by qr(A)$rank and rankMat(A) obviously depends on the matrix A. 
This is evident from the tol used in rankMat:  
tol <- max(d) * .Machine$double.eps * abs(singValA[1])
So, this value of tol in qr will also minimize the discrepancy. 
3.  The tol parameter is irrelevant in qr(A, tol, LAPACK=TRUE)$rank, i.e. 
LAPACK doesn't seem to utilize the tol parameter when computing the rank.  
However, qr(A, tol, LAPACK=FALSE)$rank does depend on tol.   

Now, here are the results:
1.
 hilbert.rank <- matrix(NA,20,3)
 hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, "+") }
 for (i in 1:20) {
+ himat <- hilbert(i)
+ hilbert.rank[i,1] <- rankMat(himat)
+ hilbert.rank[i,2] <- qr(himat,tol=1.0e-14)$rank
+ hilbert.rank[i,3] <- qr(himat, tol = .Machine$double.eps, LAPACK = TRUE)$rank
+ }
 hilbert.rank
      [,1] [,2] [,3]
 [1,]    1    1    1
 [2,]    2    2    2
 [3,]    3    3    3
 [4,]    4    4    4
 [5,]    5    5    5
 [6,]    6    6    6
 [7,]    7    7    7
 [8,]    8    8    8
 [9,]    9    9    9
[10,]   10   10   10
[11,]   10   11   11
[12,]   11   12   12
[13,]   11   12   13
[14,]   11   13   14
[15,]   12   13   15
[16,]   12   15   16
[17,]   12   16   17
[18,]   12   16   18
[19,]   13   17   19
[20,]   13   17   20

We see that rankMat underestimates the rank for n > 10, but qr(himat, tol = 
.Machine$double.eps, LAPACK=TRUE)$rank gets it right.

2.  
Here I first, created a random rectangular matrix, and then added a number of 
rows to it, where these new rows are the same as some of the original rows 
except for a tiny amount of noise, which I call eps.  So, the degree of rank 
deficiency is controlled by eps.  I let eps take on 3 values: 1.e-07, 1.e-10, 
and 1.e-14, and show that the optimal tol (in terms of being close to rankMat) 
depends on the level of precision (eps) in the matrix.
 set.seed(123)
 nrow <- 15
 ncol <- 20
 nsim <- 5000
 ndefic <- 4  # number of nearly dependent rows
 eps <- 1.e-07
 rnk <- matrix(NA, nsim, 5)
 for (j in 1:nsim) {
+ A <- matrix(rnorm(nrow*ncol), nrow, ncol)
+ newrows <- matrix(NA, ndefic, ncol)
+ for (i in 1:ndefic) {
+ newrows[i,] <- A[nrow-i,] + rnorm(ncol, sd=min(abs(A[nrow-i+1,]))* eps)
+ }
+ 
+ B <- rbind(A,newrows)
+ rnk[j,1] <- rankMat(B)
+ rnk[j,2] <- qr(B, tol = 1.e-07)$rank
+ rnk[j,3] <- qr(B, tol = 1.e-11)$rank
+ rnk[j,4] <- qr(B, tol = 1.0e-14)$rank
+ rnk[j,5] <- qr(B, tol = .Machine$double.eps, LAPACK = TRUE)$rank
+ }
 apply(abs(rnk[,1] - rnk[,2:5]),2,sum)
[1] 1935  153    0    0

We observe that both qr(B, tol=1.e-14)$rank and qr(B, tol = 
.Machine$double.eps, LAPACK = TRUE)$rank give exactly the same rank as rankMat.

Now, we change eps - 1.e-10 and the results are:
 apply(abs(rnk[,1] - rnk[,2:5]),2,sum)
[1] 19778 14263   166   220
This means that a tolerance of 1.e-14 works best.

Now we lower eps further: eps <- 1.e-14
 apply(abs(rnk[,1] - rnk[,2:5]),2,sum)
[1] 0 3   665 2
Clearly, the smaller tolerances yield rank estimates that are higher than those 
given by rankMat.  That is, rankMat underestimates the rank, as in the case of 
the Hilbert matrix.

3.
Now we show that qr(., tol, LAPACK=TRUE)$rank is independent of tol:
 exp <- -(7:16)
 tol <- 10^exp
 sapply(tol, A=hilbert(20), function(x,A)qr(A, tol=x, LAPACK=FALSE)$rank)
 [1] 10 12 14 14 15 16 16 17 18 19
 sapply(tol, A=hilbert(20), function(x,A)qr(A, tol=x, LAPACK=TRUE)$rank)
 [1] 20 20 20 20 20 20 20 20 20 20

Looking forward to comments.

Best,
Ravi. 

- Original Message -
From: Martin Maechler [EMAIL PROTECTED]
Date: Saturday, April 7, 2007 10:57 am
Subject: Re: [R] Computing the rank of a matrix.
To: Ravi Varadhan [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], 'José Luis Aznarte M.' [EMAIL PROTECTED], 
r-help@stat.math.ethz.ch


  Ravi == Ravi Varadhan [EMAIL PROTECTED]
   on Fri, 6 Apr 2007 12:44:33 -0400 writes:
  
  Ravi Hi, qr(A)$rank will work, but just be wary of the
  Ravi tolerance parameter (default is 1.e-07), since the
  Ravi rank computation could be sensitive to the tolerance
  Ravi chosen.
  
  Yes, indeed.
  
  The point is that   rank(Matrix) 
  is well defined in pure math (linear algebra), as is a 
  "singular matrix".
  
  The same typically no longer makes sense as soon as you enter
  the real world: A matrix close to singular may have to be
  treated as if singular, depending on its closeness to
  singularity {{ learn about the condition number of a matrix }},
  and the same issues arise with rank(matrix).
  
  Of course

Re: [R] Computing the rank of a matrix.

2007-04-06 Thread Ravi Varadhan
Hi,

qr(A)$rank will work, but just be wary of the tolerance parameter (default
is 1.e-07), since the rank computation could be sensitive to the tolerance
chosen.  

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Friday, April 06, 2007 11:07 AM
To: José Luis Aznarte M.
Cc: r-help@stat.math.ethz.ch; [EMAIL PROTECTED]
Subject: Re: [R] Computing the rank of a matrix.

How about

  qr(A)$rank

or perhaps

  qr(A, LAPACK=TRUE)$rank

Cheers,

Andy

__
Andy Jaworski
518-1-01
Process Laboratory
3M Corporate Research Laboratory
-
E-mail: [EMAIL PROTECTED]
Tel:  (651) 733-6092
Fax:  (651) 736-3122


   
From: José Luis Aznarte M. <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject: [R] Computing the rank of a matrix.
Date: 04/06/2007 06:39 AM




Hi! Maybe this is a silly question, but I need the column rank
(http://en.wikipedia.org/wiki/Rank_matrix) of a matrix and R function
'rank()' only gives me the ordering of the elements of my matrix.
How can I compute the column rank of a matrix? Is there not an R
equivalent to Matlab's 'rank()'?
I've been browsing for a time now and I can't find anything, so any
help will be greatly appreciated. Best regards!


--  --
Jose Luis Aznarte M.   http://decsai.ugr.es/~jlaznarte
Department of Computer Science and Artificial Intelligence
Universidad de Granada   Tel. +34 - 958 - 24 04 67
GRANADA (Spain)  Fax: +34 - 958 - 24 00 79

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Likelihood returning inf values to optim(L-BFGS-B) other options?

2007-04-05 Thread Ravi Varadhan
Hi,

In your code, the variables x (which I assume is the observed data), Tvec,
and flag are not passed to the function as arguments.  This could be a
potential problem.  Another problem could be that you have to use negative
log-likelihood function as input to optim, since by default it minimizes
the function, whereas you are interested in finding the argmax of
log-likelihood.  So, in your function you should return (-ll) instead of ll.

If the above strategies don't work, I would try different initial values (it
would be best if you have a data-driven strategy for picking a starting
value) and different optimization methods (e.g. conjugate gradient with
Polak-Ribiere steplength option, Nelder-Mead, etc.).  

Ravi.



---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Thursday, April 05, 2007 6:12 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Likelihood returning inf values to optim(L-BFGS-B) other
options?

Dear R-help list,

I am working on an optimization with R by evaluating a likelihood  
function that contains lots of Gamma calculations (BGNBD: Hardie Fader  
Lee 2005 Management Science). Since I am forced to implement lower  
bounds for the four parameters included in the model, I chose the  
optim() function mith L-BFGS-B as method. But the likelihood often  
returns inf-values which L-BFGS-B can't deal with.

Are there any other options to implement an optimization algorithm  
with R accounting for lower bounds and a four parameter-space?

Here is the error message I receive (translated from German):
--

out=optim(c(.1,.1,.1,.1),fn,method="L-BFGS-B",lower=c(.0001,.0001,.0001,.0001,.0001))
Error in optim(c(0.1, 0.1, 0.1, 0.1), fn, method = "L-BFGS-B", lower  
= c(1e-04,  :
L-BFGS-B needs finite values of 'fn'
In addition: There were 50 or more warnings (use warnings() to see the  
first 50)
--
And this is the likelihood function:
--
fn <- function(p) {
 A1=(gamma(p[1]+x)*p[2]^p[1])/(gamma(p[1]))
 A2=(gamma(p[3]+p[4])*gamma(p[4]+x))/(gamma(p[4])*gamma(p[3]+p[4]+x))
 A3=(1/(p[2]+Tvec))^(p[1]+x)
 A4=(p[3]/(p[4]+x-1))*((1/(p[2]+t_x))^(p[1]+x))
 ll=sum(log(A1*A2*(A3+flag*A4)))
 return(ll)
}

Thank you very much for your help in advance!

Best regards,

Michael

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Large matrix into a vector

2007-03-28 Thread Ravi Varadhan
c(HR) will do it.
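A quick base-R check, with a small stand-in matrix in place of the 9 x 27 HR,
that c() flattens a matrix column by column -- exactly the order
t(HR[,1]), t(HR[,2]), ... requested:

```r
HR <- matrix(1:6, nrow = 3)        # small stand-in for the 9 x 27 matrix
v1 <- c(HR)                        # column-major flattening
v2 <- c(t(HR[, 1]), t(HR[, 2]))    # the manual construction from the question
identical(v1, as.vector(HR))       # TRUE: c(HR) is the same as as.vector(HR)
all(v1 == v2)                      # TRUE: same order as pasting the columns
```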

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of A Ezhil
Sent: Wednesday, March 28, 2007 1:28 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Large matrix into a vector

Hi,

I have a matrix HR(9x27). I would like to make a
single vector with elements: t(HR[,1]) followed by
t(HR[,2]) and then t(HR[,3] ... etc. Is there any neat
way of converting this matrix into a vector rather
doing something like c(t(HR[,1]), t(HR[,2]), t(HR[,3])
..)?

Thanks in Advance.
Kind regards,
Ezhil


 



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Gaussian Adaptive Quadrature

2007-03-21 Thread Ravi Varadhan
Hi,

The function integrate is based on QUADPACK Fortran package by Piessens.
It does indeed perform adaptive Gauss-Kronrod quadrature, where Kronrod's
method allows the re-use of abscissa from the previous iteration, thus
enabling the estimation of the quadrature error and its control.  In
contrast, in the standard Gaussian quadrature methods this is not feasible. 

The terminology "Gaussian quadrature" is not restricted to Gauss-Hermite
quadrature (where exp(-x^2) is the weight function), but applies more
broadly to Gauss-Legendre, Gauss-Laguerre, etc., where the abscissae are
chosen from the roots of Legendre, Laguerre, etc. polynomials.
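For illustration (a sketch assuming the statmod package, whose gauss.quad()
returns nodes and weights for these rules), a 20-node Gauss-Hermite rule
handles the weight function exp(-x^2) exactly as described:

```r
library(statmod)  # assumed available; provides gauss.quad()

# Gauss-Hermite: approximates the integral of exp(-x^2) * g(x) over the real line.
gh <- gauss.quad(20, kind = "hermite")
approx <- sum(gh$weights * cos(gh$nodes))
exact  <- sqrt(pi) * exp(-1/4)     # closed form of int exp(-x^2) cos(x) dx
abs(approx - exact)                # tiny at 20 nodes

# Gauss-Legendre: weight function 1 on [-1, 1].
gl <- gauss.quad(20, kind = "legendre")
sum(gl$weights * gl$nodes^2)       # int_{-1}^{1} x^2 dx = 2/3
```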

Ravi. 


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Douglas Bates
Sent: Wednesday, March 21, 2007 9:25 AM
To: Doran, Harold
Cc: Help mailing list - R
Subject: Re: [R] Gaussian Adaptive Quadrature

On 3/21/07, Doran, Harold [EMAIL PROTECTED] wrote:
 The function integrate() uses AGQ. There are other functions for
 gaussian quadrature in the statmod() package that I really like.

I think that integrate does adaptive quadrature but not adaptive
Gaussian quadrature (which probably should have been called adaptive
Gauss-Hermite quadrature to be more specific).  In the first case the
adaptive refers to a choice of mesh size.  In the second case one is
integrating a function that is close to a multivariate Gaussian
density by first finding the conditional optimum of the integrand and
using a quadratic approximation to the log-integrand to establish the
location of the Gauss-Hermite quadrature points.

The Laplace approximation to the log-likelihood for a generalized
linear mixed model is a 1-point adaptive Gauss-Hermite quadrature
evaluation.
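A one-dimensional sketch of the second idea, assuming statmod for the
Gauss-Hermite nodes: recentre and rescale the rule at the mode of the
integrand using the curvature h = -(log f)''(mode).  The helper name agh
is illustrative, not from any package:

```r
library(statmod)  # assumed available; provides gauss.quad()

# Adaptive Gauss-Hermite quadrature for int f(x) dx (1-D sketch):
# place the nodes at mode + sqrt(2/h) * z_i and undo the exp(-z^2) weight.
agh <- function(f, mode, h, n = 10) {
  gh <- gauss.quad(n, kind = "hermite")
  s  <- sqrt(2 / h)                      # scale from the quadratic approximation
  x  <- mode + s * gh$nodes              # recentred, rescaled abscissae
  s * sum(gh$weights * exp(gh$nodes^2) * f(x))
}

# Sanity check on an integrand that is exactly Gaussian (integral = 1):
f <- function(x) dnorm(x, mean = 1, sd = 2)
agh(f, mode = 1, h = 1/4)                # recovers 1 exactly in this case
```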


  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of Caio
  Lucidius Naberezny Azevedo
  Sent: Wednesday, March 21, 2007 5:55 AM
  To: Help mailing list - R
  Subject: [R] Gaussian Adaptive Quadrature
 
  Hi all,
 
Does anybody know any function that performs gaussian
  adapative quadrature integration of univariate functions?
 
Thanks in advance,
 
Regards,
 
  Caio
 
   __
 
 
[[alternative HTML version deleted]]
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] MCMC logit

2007-03-09 Thread Ravi Varadhan
As the error message clearly indicates, the function MCMClogit is unable to
find the variable x1 (possibly x2,x3, and x4 also) in the data frame c.df.
Check the names of the variables in that data frame and make sure that the
names correspond to the formula specification.
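For instance (assuming the intended covariate names are x1 through x4; the
names here are illustrative), a quick check and fix might look like:

```r
names(c.df)                                    # inspect the current names
names(c.df) <- c("y", "x1", "x2", "x3", "x4")  # make them match the formula
```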

Hope this helps,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Anamika Chaudhuri
Sent: Friday, March 09, 2007 3:27 PM
To: r-help@stat.math.ethz.ch
Subject: [R] MCMC logit

Hi, 
I have a dataset with the binary outcome Y(0,1) and 4 covariates
(X1, X2, X3, X4). I am trying to use MCMClogit to model logistic regression
using MCMC. I am getting an error where it doesnt identify the covariates
,although its reading in correctly. The dataset is a sample of actual
dataset. Below is my code:
 ###
 
 
 #retreive data
 # considering four covariates

d.df=as.data.frame(read.table("c:/tina/phd/thesis/data/modified_data1.1.txt"
,header=T,sep=","))
 y=d.df[,ncol(d.df)]
 x=d.df[,1:4]
 c.df=cbind(y,x)
 #x=cbind(1,x)
 p - ncol(c.df)
 
 # marginal log-prior of beta[]
 logpriorfun <- function(beta, mu, gshape, grate)
+ {
+ logprior = -p*log(2) + log(gamma(p+gshape)) - log(gamma(gshape))
+ + gshape*log(grate) - (p+gshape)* log(grate+sum(abs(beta)))
+ return(logprior)
+ }
 require(MCMCpack)
Loading required package: MCMCpack
Loading required package: coda
Loading required package: lattice
Loading required package: MASS
##
## Markov Chain Monte Carlo Package (MCMCpack)
## Copyright (C) 2003-2007 Andrew D. Martin and Kevin M. Quinn
##
## Support provided by the U.S. National Science Foundation
## (Grants SES-0350646 and SES-0350613)
##
[1] TRUE
Warning message:
package 'MASS' was built under R version 2.4.1 
 a0 = 0.5
 b0 = 1
 mu0 = 0
 beta.init=list(c(0, rep(0.1,4)), c(0, rep(-0.1,4)), c(0, rep(0, 4)))
 burnin.cycles = 1000
 mcmc.cycles = 25000
 # three chains
 post.list <- lapply(beta.init, function(vec)
+ {
+ posterior - MCMClogit(y~x1+x2+x3+x4, data=c.df, burnin=burnin.cycles,
mcmc=mcmc.cycles,
+ thin=5, tune=0.5, beta.start=vec, user.prior.density=logpriorfun,
logfun=TRUE,
+ mu=mu0, gshape=a0, grate=b0)
+ return(posterior)
+ })
Error in eval(expr, envir, enclos) : object "x1" not found
 
  Any suggestions will be greatly appreciated.
  Thanks,
Anamika

 

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Help with faster optimization for large parameter problem

2007-03-02 Thread Ravi Varadhan
Hi James,

There are a few things that immediately come to my mind that you could try:

1.  Profile your code using Rprof to detect which steps are the most
time-consuming.
2.  In your likelihood function you can make the code a bit more faster by
using outer instead of %*%. 
3.  Using conjugate-gradient type methods in optim might be a better
option since you are dealing with thousands of parameters and BFGS could
use up a lot of memory working with a large Hessian matrix.  CG methods use
much less storage and memory than quasi-Newton methods.  The main caveat
here is that the CG methods generally have slower convergence than QN type
methods, unless you can precondition the problem.

Hope this is helpful,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of James Fowler
Sent: Friday, March 02, 2007 4:34 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Help with faster optimization for large parameter problem

Hello all,

I have a large parameter problem with the following very simple likelihood 
function:

fn <- function(param) {
  x1 <- param[1:n]
  g1 <- param[(n+1):(2*n)]
  beta <- param[(2*n+1):(2*n+k)]
  sigma2 <- param[2*n+k+1]^2
  meang1sp <- mean(g1[sp])
  mu <- beta%*%matrix(x1,1,n)-(g1[sp]-meang1sp)%*%matrix(g1,1,n)
  return(sum((ydc-mu)^2)/(2*sigma2) + n*k*log(sqrt(sigma2)) +
   mean(x1)^2 + mean(g1)^2 + 1000*(x1[1] > x1[n]))
}

There's nothing special here -- it's just plain old OLS, except all the 
variables on the right hand side are parameters (only the variable ydc is 
observed).  I have no problems getting this to recover parameter estimates 
from data I myself have generated for smaller problems (e.g. where ydc is 
a k=500 by n=50 matrix and there are 601 parameters).  But the target 
problem with real data will be k=6000 by n=400 with 6801 parameters.

I am currently using optim(method="BFGS") but it is slow.  I can get good 
starting values for x1 and g1 with some svd techniques, and these help me 
generate the starting values for the betas via lm().

I then use optim() on a modified likelihood function to find g1,x1,sigma2 
while holding beta fixed and then use optim() again to find beta while 
holding the other variables fixed.  But eventually, I have to run optim on 
the unmodified likelihood function above and it is very slow, taking 
several days for large problems.

I have also tried metrop() in mcmc, but I find this needs to be very 
close to the mode of the likelihood to be efficient (in fact, MCMCpack's 
metropolis function calls optim first and even requires it to invert the 
hessian before even starting the metropolis algorithm, unless we can 
provide our own covariance matrix).  I will probably use metrop() to 
generate standard errors once I find a mode

In the mean time, I can't help thinking that there is some easy way to 
make this much faster than I am currently doing, especially since the 
likelihood is normal.  I am sure I have missed something obvious so I'd 
very much appreciate any advice you could give on packages in R or code 
that might help.

Thanks!
james

--
James H. Fowler, Associate Professor   web:   http://jhfowler.ucsd.edu
Department of Political Scienceemail: [EMAIL PROTECTED]
University of California, San Diegophone: (858) 534-6807
Social Sciences Building 383, #0521fax:   (858) 534-7130
9500 Gilman Drive, La Jolla, CA 92093

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[R] Questions regarding biplot

2007-03-02 Thread Ravi Varadhan
Hi,

 

I am using biplot (in package stats) to visualize the results of some
PCA analyses.  When I use scale=0 in the biplot function, it draws X (PC1)
and Y (PC2) axes, with the scales multiplied by sqrt(n), where n is the
sample size.  Is there a way to change these scales so that the actual
values of PC1 and PC2 are shown, instead of the multiplication by sqrt(n)?

 

Also, how can I plot the points in biplot with different colors or symbols
corresponding to a grouping variable, instead of sample row number?

 

Thanks for any suggestions.

 

Best,

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




 


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] returns from dnorm and dmvnorm

2007-02-26 Thread Ravi Varadhan
I guarantee that it would also happen on all future versions of R.

Why would you expect a density to be smaller than 1?  The only constraints on
a density are that (a) it is non-negative and (b) it integrates to one.  The
smaller the variance, the greater the density is around its center.  The
density can be made arbitrarily large by letting the variance get close
to zero, and in the limit you will obtain Dirac's delta function.
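A quick illustration: the normal density at its mode is 1/(sd * sqrt(2*pi)),
so it exceeds 1 as soon as sd < 1/sqrt(2*pi), roughly 0.3989:

```r
dnorm(0, mean = 0, sd = 1)      # 0.3989423
dnorm(0, mean = 0, sd = 0.1)    # 3.989423
dnorm(0, mean = 0, sd = 0.001)  # 398.9423
```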


Ravi. 


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of A Hailu
Sent: Monday, February 26, 2007 3:04 PM
To: r-help@stat.math.ethz.ch
Subject: [R] returns from dnorm and dmvnorm

Hi All,
Why would calls to dnorm and dmvnorm return values that are above 1? For
example,
 dnorm(0.3,mean=0, sd=0.1)
[1] 3.989423

This is happening on two different installations of R that I have.

Thank you.

Hailu

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] using integrate in a function definition

2007-02-23 Thread Ravi Varadhan
Your function jjj is not vectorized.

Try this:

jjj <- function(www) sapply(www, function(x) 2*integrate(dnorm,0,x)$value)
plot(jjj, 0, 5)

It should work.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of theo borm
Sent: Friday, February 23, 2007 1:43 PM
To: r-help@stat.math.ethz.ch
Subject: [R] using integrate in a function definition

Dear list members,

I'm quite new to R, and though I tried to find the answer to my probably 
very basic question through the available resources (website, mailing 
list archives, docs, google), I've not found it.


If I try to use the integrate function from within my own functions, 
my functions seem to misbehave in some contexts. The following example 
is a bit silly, but illustrates what's happening:


The following behaves as expected:

  jjj <- function(www) {2*integrate(dnorm,0,www)$value}
  kkk <- function(www) {2*(pnorm(www)-0.5)}
  jjj(2)
[1] 0.9544997
  kkk(2)
[1] 0.9544997



However, if I do:
  plot(jjj,0,5)
Error in xy.coords(x, y, xlabel, ylabel, log) :
 'x' and 'y' lengths differ

whereas
  plot(kkk,0,5)
produces a nice plot.




  xxx <- seq(0:5)
  yyy <- jjj(xxx)
  zzz <- kkk(xxx)

produces no errors, but:
  yyy
[1] 0.6826895
  zzz
[1] 0.6826895 0.9544997 0.9973002 0.9999367 0.9999994 1.0000000




Why is this? Is this some R language feature that I've completely missed?




Ultimately my problem is that I have a mathematical function describing 
a distribution, and I want to use this in a Kolmogorov-Smirnov test 
where I need a cumulative distribution function. If I use the following 
(synthetic) dataset with ks.test with either the jjj or kkk 
function, I get:

  ppp-c(1.74865955,1.12220426,0.24760427,0.24351439,0.10870853,
0.72023272,0.40245392,0.16002948,0.24203950,0.44029851,
0.34830446,1.66967152,1.71609574,1.17540830,0.22306379,
0.64382628,0.37382795,0.84361547,1.92138362,0.02844235,
0.11238473,0.21237557,1.01058073,2.33108243,1.36034473,
1.88951679,0.18230647,0.96571916,0.91239760,2.05574766,
0.33681130,2.76006257,0.83952491,1.24038831,1.46199671,
0.24694163,0.01565159,0.94816108,1.04642673,0.36556444,
2.20760578,1.59518590,0.83951030,0.07113652,0.97422886,
0.59835793,0.08383890,1.09180853,0.43508943,0.35368637,
0.75805978,0.12790161,1.56798563,1.68669770,0.56168021)
  ks.test(ppp,kkk)

 One-sample Kolmogorov-Smirnov test

data:  ppp
D = 0.1013, p-value = 0.5895
alternative hypothesis: two.sided

[ which seems correct as the samples come from abs(rnorm()) ]

  ks.test(ppp,jjj)

 One-sample Kolmogorov-Smirnov test

data:  ppp
D = 0.9875, p-value < 2.2e-16
alternative hypothesis: two.sided

[ which is *incorrect* ]


Note: This example is artificial as I have used a function for which I 
know the integral; my real problem involves a distribution that I can 
only integrate numerically.

How would I go about to solve this problem?



With kind regards,

Theo.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] simulating from Langevin distributions

2007-02-13 Thread Ravi Varadhan
Hi Ranjan,

I think that the following would work:

library(MASS)

rlangevin <- function(n, mu, K) {
q <- length(mu)
norm.sim <- mvrnorm(n, mu=mu, Sigma=diag(1/K, q))
cp <- apply(norm.sim, 1, function(x) sqrt(crossprod(x)))
sweep(norm.sim, 1, cp, FUN="/")
}

 mu <- runif(7)
 mu <- mu / sqrt(crossprod(mu))
 K <- 1.2
 ylang <- rlangevin(n=10, mu=mu, K=K)
 apply(ylang,1,crossprod)
 [1] 1 1 1 1 1 1 1 1 1 1


I hope that this helps.

Ravi.

---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Ranjan Maitra
Sent: Tuesday, February 13, 2007 1:04 PM
To: r-help@stat.math.ethz.ch
Subject: [R] simulating from Langevin distributions

Dear all,

I have been looking for a while for ways to simulate from Langevin
distributions and I thought I would ask here. I am ok with finding an
algorithmic reference, though of course, a R package would be stupendous!

Btw, just to clarify, the Langevin distribution with (mu, K), where mu is a
vector and K>0 the concentration parameter, is defined to be:

f(x) = exp(K*mu'x) / const where both mu and x are p-variate vectors with
norm 1.

For p=2, this corresponds to von-Mises (for which algorithms exist,
including in R/Splus) while for p=3, I believe it is called the Fisher
distribution. I am looking for general p.

Can anyone please help in this?

Many thanks and best wishes,
Ranjan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Multidimensional Integration over arbitrary sets

2007-02-13 Thread Ravi Varadhan
Hi,

By defining your function appropriately (e.g. using indicator functions),
you can make adapt work:

myfunc <- function(x) {
x[1]*x[2] * (x[1] <= x[2])
}
# Exact answer is 1/8

> library(adapt)
> adapt(2, lo=c(0,0), up=c(1,1), functn=myfunc)
  value  relerr  minpts  lenwrk   ifail 
  0.1250612 0.00999505459071123   0 
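
The exact value can be cross-checked without adapt (an editor's sketch) by iterating the one-dimensional integrate() over the triangle {0 <= x1 <= x2 <= 1}: for each fixed x2 the inner integral runs from 0 to x2.

```r
# Inner integral: for fixed x2 = u, integrate x1*u over x1 in [0, u],
# which is u^3/2.  integrate() passes a vector of nodes, hence sapply().
inner <- function(x2) {
  sapply(x2, function(u) integrate(function(x1) x1 * u, lower = 0, upper = u)$value)
}
# Outer integral over x2 in [0, 1]; the exact answer is 1/8 = 0.125.
res <- integrate(inner, lower = 0, upper = 1)$value
res  # ~0.125
```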


Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Saptarshi Guha
Sent: Tuesday, February 13, 2007 3:34 PM
To: R-Help
Subject: [R] Multidimensional Integration over arbitrary sets

Hi,
I need to integrate a 2D function over a region where the limits of one
variable depend on the other, e.g. integrate f(x,y) = x*y over {x,0,1} and
{y,x,1}, i.e. \int_0^1 \int_x^1 xy \, dy \, dx.

I checked adapt but it doesn't seem to help here. Are there any
packages for this sort of thing?
I tried RSiteSearch but couldn't find the answer to this.
Many thanks for your help.
Regards
Saptarshi

Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] nls: missing value or an infinity (Error in numericDeriv) andsingular gradient matrixError in nlsModel

2007-02-13 Thread Ravi Varadhan
Hi Daniela,

Please read the error message from nls.  The problem is with the start
values for the parameters a and b.  You haven't specified any, so it uses
default values of a=1 and b=1, which may not be very good.  So, you should
specify good start values, if you have a reasonable idea of what they should
be.  If you don't, then you could try selfStart, as the error message tells
you to do.
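
As an illustration (an editor's sketch on made-up data, not the poster's polcurve; the model and starting guesses are assumptions), supplying explicit start values to nls() looks like this:

```r
# Simulated decay data standing in for the poster's dispersal curve.
set.seed(1)
d <- data.frame(dist = 1:21)
d$fath <- 0.14 * exp(-d$dist / 10) + rnorm(21, sd = 0.005)

# Rough starting guesses (c0 near the first observed value, a near the
# distance scale over which the curve decays) are usually enough to avoid
# the singular-gradient / numericDeriv failures seen below.
fit <- nls(fath ~ c0 * exp(-dist / a), data = d,
           start = list(c0 = 0.15, a = 5))
coef(fit)
```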

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Daniela Salvini
Sent: Tuesday, February 13, 2007 4:09 PM
To: r-help@stat.math.ethz.ch
Subject: [R] nls: missing value or an infinity (Error in numericDeriv)
andsingular gradient matrixError in nlsModel

Hi,

I am a non-expert user of R. I am trying to fit two different
functions to my data, but I receive two different error messages. I suppose
I have two different problems here... but of what nature? In the first
instance I did try some different starting values for the parameters,
but without success. 
If anyone could suggest a sensible way to proceed, I would be
really thankful! 

Daniela

***

Details on data and formulas:

> polcurve=read.table("polcurve.txt", header=TRUE)
> polcurve
     dist        fath
1 1 0.138908965
2 2 0.113768871
3 3 0.114361753
4 4 0.051065765
5 5 0.095787946
6 6 0.123601131
7 7 0.117340746
8 8 0.054049434
9 9 0.112000962
10   10 0.074843028
11   11 0.064735196
12   12 0.098210497
13   13 0.093786895
14   14 0.033752379
15   15 0.038961039
16   16 0.023048216
17   17 0.010018243
18   18 0.007177034
19   19 0.003787879
20   20 0.0
21   21 0.0

#Exponential power function
> s=nls(fath~((b^2)/(a^2))*exp((-dist/a)^b),polcurve)


Error in numericDeriv(form[[3]], names(ind), env) : 
Missing value or an infinity produced when evaluating the model
In addition: Warning message:
No starting values specified for some parameters.
Initializing 'b', 'a' to '1.'.
Consider specifying 'start' or using a selfStart model in: nls(fath ~
((b^2)/(a^2)) * exp((-dist/a)^b), polcurve) 

#Geometric function
s=nls(fath~((b-1)*(b-2)/a)*((1+dist/a)^-b),polcurve)


Error in nlsModel(formula, mf, start, wts) : 
singular gradient matrix at initial parameter estimates
In addition: Warning message:
No starting values specified for some parameters.
Initializing 'b', 'a' to '1.'.
Consider specifying 'start' or using a selfStart model in: nls(fath ~ ((b -
1) * (b - 2)/a) * ((1 + dist/a)^-b),  

Daniela Salvini 
Ph.D. student
University of Copenhagen, 
Faculty of Life Sciences, 
Centre for Forest and Landscape
Hørsholm Kongevej 11
2970 Hørsholm, Denmark
 
e-mail: [EMAIL PROTECTED] 
Tel.: (+45)35281639 / 35281645
Fax.: (+45)35281517

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Timings of function execution in R [was Re: R in Industry]

2007-02-09 Thread Ravi Varadhan
Hi Profs. Ripley and Bates,

I also recollect from a Tom Lumley email that when he profiled an MCMC
computation, he found that pmin/pmax was the bottleneck.  That is why he
suggested the function that I called fast.pmax.  I think that it would be
nice to have restricted alternative functions dealing exclusively with
numeric mode.

Best,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Douglas Bates
Sent: Friday, February 09, 2007 10:05 AM
To: Prof Brian Ripley
Cc: R-Help
Subject: Re: [R] Timings of function execution in R [was Re: R in Industry]

On 2/9/07, Prof Brian Ripley [EMAIL PROTECTED] wrote:
 The other reason why pmin/pmax are preferable to your functions is that
 they are fully generic.  It is not easy to write C code which takes into
 account that <, [, [<- and is.na are all generic.  That is not to say that
 it is not worth having faster restricted alternatives, as indeed we do
 with rep.int and seq.int.

 Anything that uses arithmetic is making strong assumptions about the
 inputs.  It ought to be possible to write a fast C version that worked for
 atomic vectors (logical, integer, real and character), but is there
 any evidence of profiled real problems where speed is an issue?

Yes.  I don't have the profiled timings available now and one would
need to go back to earlier versions of R to reproduce them but I did
encounter a situation where the bottleneck in a practical computation
was pmin/pmax.  The binomial and poisson families for generalized
linear models used pmin and pmax to avoid boundary conditions when
evaluating the inverse link and other functions.  When I profiled the
execution of some generalized linear model and, more importantly for
me, generalized linear mixed model fits, these calls to pmin and pmax
were the bottleneck.  That is why I moved some of the calculations for
the binomial and poisson families in the stats package to compiled
code.

In that case I didn't rewrite the general form of pmin and pmax, I
replaced specific calls in the compiled code.


 On Fri, 9 Feb 2007, Martin Maechler wrote:

  Ravi == Ravi Varadhan [EMAIL PROTECTED]
  on Thu, 8 Feb 2007 18:41:38 -0500 writes:
 
 Ravi> Hi,
 Ravi> greaterOf is indeed an interesting function.  It is much faster than the
 Ravi> equivalent R function, pmax, because pmax does a lot of checking for
 Ravi> missing data and for recycling.  Tom Lumley suggested a simple function to
 Ravi> replace pmax, without these checks, that is analogous to greaterOf, which I
 Ravi> call fast.pmax.
 
 Ravi> fast.pmax <- function(x,y) {i <- x<y; x[i] <- y[i]; x}
 
 Ravi> Interestingly, greaterOf is even faster than fast.pmax, although you have to
 Ravi> be dealing with very large vectors (O(10^6)) to see any real difference.
 
  Yes. Indeed, I have a file, first version dated from 1992
  where I explore the slowness of pmin() and pmax() (in S-plus
  3.2 then). I had since added quite a few experiments and versions to that
  file in the past.
 
  As consequence, in the robustbase CRAN package (which is only a bit
  more than a year old though), there's a file, available as
   https://svn.r-project.org/R-packages/robustbase/R/Auxiliaries.R
  with the very simple content {note line 3 !}:
 
 
-
  ### Fast versions of pmin() and pmax() for 2 arguments only:
 
  ### FIXME: should rather add these to R
  pmin2 <- function(k,x) (x+k - abs(x-k))/2
  pmax2 <- function(k,x) (x+k + abs(x-k))/2
 
-
 
  {the funny argument name 'k' comes from the use of these to
  compute Huber's psi() fast :
 
   psiHuber <- function(x,k)  pmin2(k, pmax2(- k, x))
   curve(psiHuber(x, 1.35), -3,3, asp = 1)
  }
 
  One point *is* that I think proper function names would be pmin2() and
  pmax2() since they work with exactly 2 arguments,
  whereas IIRC the feature to work with '...' is exactly the
  reason that pmax() and pmin() are so much slower.
 
  I've haven't checked if Gabor's
  pmax2.G <- function(x,y) {z <- x > y; z * (x-y) + y}
  is even faster than the abs() using one.
  It may have the advantage of giving *identical* results (to the
  last bit!)  to pmax()  which my version does not --- IIRC the
  only reason I did not follow my own 'FIXME' above.
 
  I had then planned to implement pmin2() and pmax2() in C code, trivially,
  and hence get identical (to the last bit!) behavior as
  pmin
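
The abs() identities quoted above can be checked directly (editor's sketch): max(a,b) = (a + b + |a - b|)/2 and min(a,b) = (a + b - |a - b|)/2, so for plain numeric input the results agree with pmax()/pmin() up to floating-point rounding:

```r
# Branch-free pmin/pmax for two numeric arguments, via the abs() identities.
pmin2 <- function(k, x) (x + k - abs(x - k)) / 2
pmax2 <- function(k, x) (x + k + abs(x - k)) / 2

set.seed(42)
k <- rnorm(1000); x <- rnorm(1000)
ok <- isTRUE(all.equal(pmax2(k, x), pmax(k, x))) &&
      isTRUE(all.equal(pmin2(k, x), pmin(k, x)))
ok  # TRUE (agreement up to all.equal's tolerance, not bit-for-bit)
```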

Re: [R] R in Industry

2007-02-08 Thread Ravi Varadhan
Here is a function to create a Toeplitz matrix of any size, and an example
of a 220 x 220 toeplitz matrix, which was created in almost no time:


# Given a vector x, forms a Toeplitz matrix
#
toeplitz <- function (x, sym=TRUE) {
   if (!is.vector(x)) 
   stop("x is not a vector")
   n <- length(x)
   if (!sym) { 
   if (!n%%2) stop("length of vector must be odd")
   n2 <- (n+1)/2
   A <- matrix(NA, n2, n2)
   mat <- matrix(x[col(A) - row(A) + n2], n2, n2)
   }
   else {
   A <- matrix(NA, n, n)
   mat <- matrix(x[abs(col(A) - row(A)) + 1], n, n)
   }
   mat
}
###

> system.time(top.mat <- toeplitz(runif(220)))
[1] 0.00 0.01 0.02   NA   NA
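
Worth noting (editor's aside): base R already ships a toeplitz() in the stats package for the symmetric case, and the definition above masks it; the built-in can still be reached with an explicit namespace prefix:

```r
# The built-in covers the symmetric case directly.
x <- runif(220)
tm <- stats::toeplitz(x)
dim(tm)  # 220 220
```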

Hope this is fast enough!

Best,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Stefan Grosse
Sent: Thursday, February 08, 2007 12:09 PM
To: Albrecht, Dr. Stefan (AZ Private Equity Partner)
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] R in Industry

I just ran on my Windows PC the benchmark from
http://www.sciviews.org/benchmark/index.html, which is pretty old now.
That's probably the reason for the errors, which I did not correct. As
you see, R has some advantages but Matlab also has some advantages.
However, the differences are not too big. There is only one big
difference, which indeed involves loops (creation of a 220x220 Toeplitz
matrix), where Matlab is much faster. But maybe a simple change in the
program can change that...

Does anyone on the list have an updated script?

Stefan Grosse

The benchmarks:

R 2.4.1
   R Benchmark 2.3
   ===

   I. Matrix calculation

   -
Creation, transp., deformation of a 1500x1500 matrix (sec): 0.865
800x800 normal distributed random matrix ^1000__ (sec): 0.136
Sorting of 2,000,000 random values__ (sec): 0.615
700x700 cross-product matrix (b = a' * a)___ (sec): 0.557
Linear regression over a 600x600 matrix (c = a \ b') (sec):  0 #ERROR


   II. Matrix functions
   
FFT over 800,000 random values__ (sec): 0.557
Eigenvalues of a 320x320 random matrix__ (sec): 0.495
Determinant of a 650x650 random matrix__ (sec): 0.276
Cholesky decomposition of a 900x900 matrix__ (sec):  0 #ERROR
Inverse of a 400x400 random matrix__ (sec):  0 #ERROR

   III. Programmation
   --
750,000 Fibonacci numbers calculation (vector calc)_ (sec): 0.469
Creation of a 2250x2250 Hilbert matrix (matrix calc) (sec): 1.016667
Grand common divisors of 70,000 pairs (recursion)___ (sec): 0.3966671
Creation of a 220x220 Toeplitz matrix (loops)___ (sec): 0.552
Escoufier's method on a 37x37 matrix (mixed) (sec): 2.66

  --- End of test ---

Matlab 7.0.4

  Matlab Benchmark 2
   ==
Number of times each test is run__: 3
 
   I. Matrix calculation
   -
Creation, transp., deformation of a 1500x1500 matrix (sec): 0.29047
800x800 normal distributed random matrix ^1000__ (sec): 0.42967
Sorting of 2,000,000 random values__ (sec): 0.71432
700x700 cross-product matrix (b = a' * a)___ (sec): 0.14748
Linear regression over a 600x600 matrix (c = a \ b') (sec): 0.12831
  --
Trimmed geom. mean (2 extremes eliminated): 0.26403
 
   II. Matrix functions
   
FFT over 800,000 random values__ (sec): 0.24591
Eigenvalues of a 320x320 random matrix__ (sec): 0.38507
Determinant of a 650x650 random matrix__ (sec): 0.091612
Cholesky decomposition of a 900x900 matrix__ (sec): 0.11059
Inverse of a 400x400 random matrix__ (sec): 0.069414
  --
Trimmed geom. mean (2 extremes eliminated): 0.13556
 
   III. Programmation
   --
750,000 Fibonacci numbers calculation (vector calc)_ (sec): 1.2386
Creation of a 2250x2250 Hilbert matrix (matrix calc) (sec): 3.0541
Grand common divisors of 70,000 pairs (recursion)___ (sec): 1.7637
Creation of a 220x220 Toeplitz matrix (loops)___ (sec): 0.0045972
Escoufier's method on a 37x37 matrix

Re: [R] Timings of function execution in R [was Re: R in Industry]

2007-02-08 Thread Ravi Varadhan
Hi,

greaterOf is indeed an interesting function.  It is much faster than the
equivalent R function, pmax, because pmax does a lot of checking for
missing data and for recycling.  Tom Lumley suggested a simple function to
replace pmax, without these checks, that is analogous to greaterOf, which I
call fast.pmax.  

fast.pmax <- function(x,y) {i <- x<y; x[i] <- y[i]; x}

Interestingly, greaterOf is even faster than fast.pmax, although you have to
be dealing with very large vectors (O(10^6)) to see any real difference.

> n <- 200
> 
> x1 <- runif(n)
> x2 <- rnorm(n)
> system.time( ans1 <- greaterOf(x1,x2) )
[1] 0.17 0.06 0.23   NA   NA
> system.time( ans2 <- pmax(x1,x2) )
[1] 0.72 0.19 0.94   NA   NA
> system.time( ans3 <- fast.pmax(x1,x2) )
[1] 0.29 0.05 0.35   NA   NA
> 
> all.equal(ans1,ans2,ans3)
[1] TRUE
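
One caveat (editor's note): all.equal() compares only two objects; a third positional argument is interpreted as the tolerance, so the three-argument call above does not actually test ans3. Chaining two pairwise comparisons does:

```r
# Both shortcuts reduce to pmax for plain numeric input.
greaterOf <- function(x, y) { z <- x > y; z * x + (!z) * y }
fast.pmax <- function(x, y) { i <- x < y; x[i] <- y[i]; x }

x <- runif(1000); y <- rnorm(1000)
ans1 <- greaterOf(x, y)
ans2 <- pmax(x, y)
ans3 <- fast.pmax(x, y)
# Chained pairwise checks cover all three results.
isTRUE(all.equal(ans1, ans2)) && isTRUE(all.equal(ans2, ans3))  # TRUE
```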


Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Douglas Bates
Sent: Thursday, February 08, 2007 6:00 PM
To: R-Help
Subject: [R] Timings of function execution in R [was Re: R in Industry]

On 2/8/07, Albrecht, Dr. Stefan (AZ Private Equity Partner)
[EMAIL PROTECTED] wrote:
 Dear all,

 Thanks a lot for your comments.

 I very well agree with you that writing efficient code is about
optimisation. The most important rules I know would be:
 - vectorization
 - pre-definition of vectors, etc.
 - use matrix instead of data.frame
 - do not use named objects
 - use pure matrix instead of involved S4 (perhaps also S3) objects (can
have enormous effects)
 - use function instead of expression
 - use compiled code
 - I guess indexing with numbers (better variables) is also much faster
than with text (names) (see also above)
 - I even made, for example, my own min, max, since they are slow, e.g.,

 greaterOf <- function(x, y){
 # Returns for each element the larger of x and y (numeric);
 # x or y may be a multiple of the other's length
   z <- x > y
   z*x + (!z)*y
 }
That's an interesting function.  I initially was tempted to respond
that you have managed to reinvent a specialized form of the ifelse
function but then I decided to do the timings just to check (always a
good idea).  The enclosed timings show that your function is indeed
faster than a call to ifelse.  A couple of comments:

- I needed to make the number of components in the vectors x and y
quite large before I could  get reliable timings on the system I am
using.

- The recommended way of doing timings is with system.time function,
which makes an effort to minimize the effects of garbage collection on
the timings.

- Even when using system.time there is often a big difference in
timing between the first execution of a function call that generates a
large object and subsequent executions of the same function call.

[additional parts of the original message not relevant to this
discussion have been removed]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors)

2007-02-02 Thread Ravi Varadhan
Thomas, you are absolutely correct.  

Here are some results comparing Jeff's original implementation, my
suggestion using outer and pmax, and your clever trick using fast.pmax.
Your fast.pmax function is more than 3 times faster than pmax.  Thanks for
this wonderful insight. 

Best,
Ravi.

> x <- rnorm(2000)
> y <- runif(500)
> nx <- length(x)
> ny <- length(y)
> sum1 <- 0
> sum3 <- 0
> 
> # Here is the straightforward way
> system.time(
+   for(i in 1:nx) {
+ sum1 <- sum1 + sum(ifelse(x[i]>x,x[i],x))
+ sum3 <- sum3 + sum(ifelse(x[i]>y,x[i],y))
+   }
+ )
[1] 3.83 0.00 3.83   NA   NA
> 
> # Here is a faster way using outer and pmax
> system.time({
+ sum11 <- sum(outer(x,x,FUN=pmax))
+ sum33 <- sum(outer(x,y,FUN=pmax))
+ })
[1] 2.55 0.48 3.04   NA   NA
> 
> # Here is an even faster method using Tom Lumley's suggestion:
> fast.pmax <- function(x,y) {i <- x<y; x[i] <- y[i]; x}
> 
> system.time({
+ sum111 <- sum(outer(x,x,FUN=fast.pmax))
+ sum333 <- sum(outer(x,y,FUN=fast.pmax))
+ })
[1] 0.78 0.08 0.86   NA   NA
> 
> all.equal(sum1,sum11,sum111)
[1] TRUE
> all.equal(sum3,sum33,sum333)
[1] TRUE
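
For the single-vector double sum, the outer() matrix can be avoided entirely (editor's sketch): after sorting, max(x_(i), x_(j)) = x_(max(i,j)), and the k-th order statistic is the larger element in exactly 2k - 1 of the n^2 pairs, so the whole sum collapses to a weighted sum computed in O(n log n):

```r
set.seed(7)
x <- rnorm(2000)
# O(n^2) memory and time: materializes the full n x n matrix of maxima.
sum.outer <- sum(outer(x, x, FUN = pmax))
# O(n log n), no matrix: the k-th smallest value is the max in 2k - 1 pairs.
sum.sorted <- sum(sort(x) * (2 * seq_along(x) - 1))
isTRUE(all.equal(sum.outer, sum.sorted))  # TRUE
```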
 



---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: Thomas Lumley [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 02, 2007 9:06 AM
To: Ravi Varadhan
Cc: [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
Subject: Re: [R] Help with efficient double sum of max (X_i, Y_i) (X  Y
vectors)

On Thu, 1 Feb 2007, Ravi Varadhan wrote:

 Jeff,

 Here is something which is a little faster:

 sum1 <- sum(outer(x, x, FUN=pmax))
 sum3 <- sum(outer(x, y, FUN=pmax))

This is the sort of problem where profiling can be useful.  My experience 
with pmax() is that it is surprisingly slow, presumably because it handles 
recycling and NAs

In the example I profiled (an MCMC calculation) it was measurably faster 
to use
function(x,y) {i <- x<y; x[i] <- y[i]; x}

-thomas


 Best,
 Ravi.



 ---

 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: [EMAIL PROTECTED]

 Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html





 

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey Racine
 Sent: Thursday, February 01, 2007 1:18 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Help with efficient double sum of max (X_i, Y_i) (X  Y
 vectors)

 Greetings.

 For R gurus this may be a no brainer, but I could not find pointers to
 efficient computation of this beast in past help files.

 Background - I wish to implement a Cramer-von Mises type test statistic
 which involves double sums of max(X_i,Y_j) where X and Y are vectors of
 differing length.

 I am currently using ifelse pointwise in a vector, but have a nagging
 suspicion that there is a more efficient way to do this. Basically, I
 require three sums:

 sum1: \sum_i\sum_j max(X_i,X_j)
 sum2: \sum_i\sum_j max(Y_i,Y_j)
 sum3: \sum_i\sum_j max(X_i,Y_j)

 Here is my current implementation - any pointers to more efficient
 computation greatly appreciated.

  nx <- length(x)
  ny <- length(y)

  sum1 <- 0
  sum3 <- 0

  for(i in 1:nx) {
sum1 <- sum1 + sum(ifelse(x[i]>x,x[i],x))
sum3 <- sum3 + sum(ifelse(x[i]>y,x[i],y))
  }

  sum2 <- 0
  sum4 <- sum3 # symmetric and identical

  for(i in 1:ny) {
sum2 <- sum2 + sum(ifelse(y[i]>y,y[i],y))
  }

 Thanks in advance for your help.

 -- Jeff

 -- 
 Professor J. S. Racine Phone:  (905) 525 9140 x 23825
 Department of EconomicsFAX:(905) 521-8232
 McMaster Universitye-mail: [EMAIL PROTECTED]
 1280 Main St. W.,Hamilton, URL:
 http://www.economics.mcmaster.ca/racine/
 Ontario, Canada. L8S 4M4

 `The generation of random numbers is too important to be left to chance'

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Thomas

[R] Need help writing a faster code

2007-02-01 Thread Ravi Varadhan
Hi,

 

I apologize for this repeat posting, which I first posted yesterday. I would
appreciate any hints on solving this problem:

 

I have two matrices A (m x 2) and B (n x 2), where m and n are large
integers (on the order of 10^4).  I am looking for an efficient way to
create another matrix, W (m x n), which can be defined as follows:

 

for (i in 1:m){

for (j in 1:n) {

W[i,j] <- g(A[i,], B[j,])

} }

where g(x,y) is a function that takes two vectors and returns a scalar.

 

The following works okay, but is not fast enough for my purpose.  I am sure
that I can do better:

 

for (i in 1:m) {

W[i,] <- apply(B, 1, y=A[i,], function(x,y) g(y,x)) 

}

 

How can I do this in a faster manner? I attempted outer, kronecker,
expand.grid, etc, but with no success.

 

Here is an example:

 

m <- 2000

n <- 5000

A <- matrix(rnorm(2*m),ncol=2)

B <- matrix(rnorm(2*n),ncol=2)

W <- matrix(NA, m, n)

 

for (i in 1:m) {

W[i,] <- apply(B, 1, y=A[i,], function(x,y) g(y,x)) 

}

 

g <- function(x,y){

theta <- atan((y[2]-x[2]) / (y[1] - x[1]))

theta + 2*pi*(theta < 0)

}
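
A hedged aside (editor's): atan() only returns angles in (-pi/2, pi/2), so g() as written assigns the same angle to a point and to its reflection through A[i,]. If the intent is the full bearing of the second point as seen from the first, atan2() resolves the quadrant directly and also copes with y[1] == x[1]:

```r
# Full-circle angle in [0, 2*pi).  NOT identical to g() when y[1] < x[1];
# this variant is a suggested alternative, not the poster's function.
g.atan2 <- function(x, y) {
  theta <- atan2(y[2] - x[2], y[1] - x[1])
  theta + 2 * pi * (theta < 0)
}
g.atan2(c(0, 0), c(-1, 0))  # pi (due west); plain atan() would give 0 here
```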

 

Thanks for any suggestions.

 

Best,

Ravi.

 

 

 


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




 


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need help writing a faster code

2007-02-01 Thread Ravi Varadhan
Thank you, Dimitris and Robin.  

Dimitris - your solution(s) works very well.  Although my g function is a
lot more complicated than that in the simple example that I gave, I think
that I can use your idea of taking the whole matrix inside the function and
working directly with it.

Robin - using two applys doesn't make the code any faster, it just produces
a compact one-liner.

Best,
Ravi.

---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: Dimitris Rizopoulos [mailto:[EMAIL PROTECTED] 
Sent: Thursday, February 01, 2007 10:33 AM
To: Ravi Varadhan
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Need help writing a faster code

the following seems to be a first improvement:

m <- 2000
n <- 5000
A <- matrix(rnorm(2*m), ncol=2)
B <- matrix(rnorm(2*n), ncol=2)
W1 <- W2 <- matrix(0, m, n)

##
##

g1 <- function(x, y){
theta <- atan((y[2] - x[2]) / (y[1] - x[1]))
theta + 2*pi*(theta < 0)
}

invisible({gc(); gc()})
system.time(for (i in 1:m) {
W1[i, ] <- apply(B, 1, y = A[i,], function(x, y) g1(y, x))
})

##

g2 <- function(x){
out <- tB - x
theta <- atan(out[2, ] / out[1, ])
theta + 2*pi*(theta < 0)
}

tB <- t(B)
invisible({gc(); gc()})
system.time(for (i in 1:m) {
W2[i, ] <- g2(A[i, ])
})

## or

invisible({gc(); gc()})
system.time(W3 <- t(apply(A, 1, g2)))

all.equal(W1, W2)
all.equal(W1, W3)


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: Ravi Varadhan [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Thursday, February 01, 2007 4:10 PM
Subject: [R] Need help writing a faster code


 Hi,



 I apologize for this repeat posting, which I first posted yesterday. 
 I would
 appreciate any hints on solving this problem:



 I have two matrices A (m x 2) and B (n x 2), where m and n are large
 integers (on the order of 10^4).  I am looking for an efficient way 
 to
 create another matrix, W (m x n), which can be defined as follows:



 for (i in 1:m){

 for (j in 1:n) {

 W[i,j] <- g(A[i,], B[j,])

 } }

 where g(x,y) is a function that takes two vectors and returns a 
 scalar.



 The following works okay, but is not fast enough for my purpose.  I 
 am sure
 that I can do better:



 for (i in 1:m) {

 W[i,] <- apply(B, 1, y=A[i,], function(x,y) g(y,x))

 }



 How can I do this in a faster manner? I attempted outer, 
 kronecker,
 expand.grid, etc, but with no success.



 Here is an example:



 m <- 2000

 n <- 5000

 A <- matrix(rnorm(2*m),ncol=2)

 B <- matrix(rnorm(2*n),ncol=2)

 W <- matrix(NA, m, n)



 for (i in 1:m) {

 W[i,] <- apply(B, 1, y=A[i,], function(x,y) g(y,x))

 }



 g <- function(x,y){

 theta <- atan((y[2]-x[2]) / (y[1] - x[1]))

 theta + 2*pi*(theta < 0)

 }



 Thanks for any suggestions.



 Best,

 Ravi.









 ---

 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: [EMAIL PROTECTED]

 Webpage: 
 http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html





 




 [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need help writing a faster code

2007-02-01 Thread Ravi Varadhan
Dear Dimitris,

I implemented your solution on my actual problem.  I was able to generate my
large transition matrix in 56 seconds, compared to the previous time of
around 27 minutes.  Wow!!!

I thank you very much for the help.  R and the R-user group are truly
amazing!

Best regards,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: Dimitris Rizopoulos [mailto:[EMAIL PROTECTED] 
Sent: Thursday, February 01, 2007 10:33 AM
To: Ravi Varadhan
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Need help writing a faster code

the following seems to be a first improvement:

m <- 2000
n <- 5000
A <- matrix(rnorm(2*m), ncol=2)
B <- matrix(rnorm(2*n), ncol=2)
W1 <- W2 <- matrix(0, m, n)

##
##

g1 <- function(x, y){
theta <- atan((y[2] - x[2]) / (y[1] - x[1]))
theta + 2*pi*(theta < 0)
}

invisible({gc(); gc()})
system.time(for (i in 1:m) {
W1[i, ] <- apply(B, 1, y = A[i,], function(x, y) g1(y, x))
})

##

g2 <- function(x){
out <- tB - x
theta <- atan(out[2, ] / out[1, ])
theta + 2*pi*(theta < 0)
}

tB <- t(B)
invisible({gc(); gc()})
system.time(for (i in 1:m) {
W2[i, ] <- g2(A[i, ])
})

## or

invisible({gc(); gc()})
system.time(W3 <- t(apply(A, 1, g2)))

all.equal(W1, W2)
all.equal(W1, W3)


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: Ravi Varadhan [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Thursday, February 01, 2007 4:10 PM
Subject: [R] Need help writing a faster code


 Hi,



 I apologize for this repeat posting, which I first posted yesterday. 
 I would
 appreciate any hints on solving this problem:



 I have two matrices A (m x 2) and B (n x 2), where m and n are large
 integers (on the order of 10^4).  I am looking for an efficient way 
 to
 create another matrix, W (m x n), which can be defined as follows:



 for (i in 1:m){

 for (j in 1:n) {

 W[i,j] <- g(A[i,], B[j,])

 } }

 where g(x,y) is a function that takes two vectors and returns a 
 scalar.



 The following works okay, but is not fast enough for my purpose.  I 
 am sure
 that I can do better:



 for (i in 1:m) {

 W[i,] <- apply(B, 1, y=A[i,], function(x,y) g(y,x))

 }



 How can I do this in a faster manner? I attempted outer, 
 kronecker,
 expand.grid, etc, but with no success.



 Here is an example:



 m <- 2000

 n <- 5000

 A <- matrix(rnorm(2*m),ncol=2)

 B <- matrix(rnorm(2*n),ncol=2)

 W <- matrix(NA, m, n)



 for (i in 1:m) {

 W[i,] <- apply(B, 1, y=A[i,], function(x,y) g(y,x))

 }



 g <- function(x,y){

 theta <- atan((y[2]-x[2]) / (y[1] - x[1]))

 theta + 2*pi*(theta < 0)

 }



 Thanks for any suggestions.



 Best,

 Ravi.









 ---

 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: [EMAIL PROTECTED]

 Webpage: 
 http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html





 





 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm



Re: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors)

2007-02-01 Thread Ravi Varadhan
Jeff,

Here is something which is a little faster:

sum1 <- sum(outer(x, x, FUN = pmax))
sum3 <- sum(outer(x, y, FUN = pmax))

Best,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey Racine
Sent: Thursday, February 01, 2007 1:18 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Help with efficient double sum of max (X_i, Y_i) (X  Y
vectors)

Greetings.

For R gurus this may be a no brainer, but I could not find pointers to
efficient computation of this beast in past help files.

Background - I wish to implement a Cramer-von Mises type test statistic
which involves double sums of max(X_i,Y_j) where X and Y are vectors of
differing length.

I am currently using ifelse pointwise in a vector, but have a nagging
suspicion that there is a more efficient way to do this. Basically, I
require three sums:

sum1: \sum_i\sum_j max(X_i,X_j)
sum2: \sum_i\sum_j max(Y_i,Y_j)
sum3: \sum_i\sum_j max(X_i,Y_j)

Here is my current implementation - any pointers to more efficient
computation greatly appreciated.

  nx <- length(x)
  ny <- length(y)

  sum1 <- 0
  sum3 <- 0

  for(i in 1:nx) {
    sum1 <- sum1 + sum(ifelse(x[i] > x, x[i], x))
    sum3 <- sum3 + sum(ifelse(x[i] > y, x[i], y))
  }

  sum2 <- 0
  sum4 <- sum3 # symmetric and identical

  for(i in 1:ny) {
    sum2 <- sum2 + sum(ifelse(y[i] > y, y[i], y))
  }
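Ravi's `outer(..., FUN = pmax)` suggestion above can be checked directly against this `ifelse()` loop. A small sketch (toy vectors; the equivalence holds because `ifelse(a > b, a, b)` is exactly `pmax(a, b)`):

```r
# Double sums of pairwise maxima via outer()/pmax, verified against the loop.
set.seed(1)
x <- runif(50)
y <- runif(80)

sum1 <- sum(outer(x, x, FUN = pmax))  # \sum_i \sum_j max(x_i, x_j)
sum3 <- sum(outer(x, y, FUN = pmax))  # \sum_i \sum_j max(x_i, y_j)

# loop versions for comparison
s1 <- 0; s3 <- 0
for (i in seq_along(x)) {
  s1 <- s1 + sum(ifelse(x[i] > x, x[i], x))
  s3 <- s3 + sum(ifelse(x[i] > y, x[i], y))
}
all.equal(c(sum1, sum3), c(s1, s3))  # TRUE
```

Note that `outer` allocates the full length(x) x length(y) matrix, so for very long vectors the loop may actually use less memory even though it is slower.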

Thanks in advance for your help.

-- Jeff

-- 
Professor J. S. Racine Phone:  (905) 525 9140 x 23825
Department of EconomicsFAX:(905) 521-8232
McMaster Universitye-mail: [EMAIL PROTECTED]
1280 Main St. W.,Hamilton, URL:
http://www.economics.mcmaster.ca/racine/
Ontario, Canada. L8S 4M4

`The generation of random numbers is too important to be left to chance'




[R] Looking for a faster code

2007-01-31 Thread RAVI VARADHAN
Hi,

I have two matrices A (m x 2) and B (n x 2), where m and n are large integers 
(on the order of 10^4).  I am looking for an efficient way to create another 
matrix, W (m x n), which can be defined as follows:

for (i in 1:m){
for (j in 1:n) {
W[i,j] <- g(A[i,], B[j,])
} }

I have tried the following and it works okay, but I am sure that I can do even 
better:

for (i in 1:m) {
W[i,] <- apply(B, 1, y=A[i,], function(x,y) g(y,x))
}

How can I do this in a faster manner (for example, I feel that I should be able
to use "outer")?

Thanks for any suggestions.

Best,
Ravi.



Re: [R] maximum likelihood estimation of 5 parameters

2007-01-05 Thread Ravi Varadhan
Franco,
Is it possible that you have failed to provide the negative of the log-likelihood
to optim, since optim, by default, minimizes a function?  If you want to
do this without redefining the log-likelihood, you should set fnscale = -1
(as hinted by Prof. Ripley).  This would turn the problem into a
maximization problem.

If this doesn't work, you should provide more details (a reproducible code
with actual error message).
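The `fnscale` advice above can be illustrated with a minimal sketch (the toy objective below is illustrative, not Franco's actual likelihood): with `control = list(fnscale = -1)`, `optim` maximizes the function it is given.

```r
# Maximizing a toy concave "log-likelihood" with optim() via fnscale = -1.
loglik <- function(p) -sum((c(1, 2) - p)^2)  # maximum at p = c(1, 2)

fit <- optim(par = c(0, 0), fn = loglik, control = list(fnscale = -1))
fit$par  # close to c(1, 2); fit$value is close to 0
```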

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of francogrex
Sent: Friday, January 05, 2007 10:42 AM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] maximum likelihood estimation of 5 parameters



Franco,
You can provide lower and upper bounds on the parameters if you use optim
with method = "L-BFGS-B".
Hth, Ingmar

Thanks, but when I use L-BFGS-B it tells me that there is an error in
optim(start, f, method = method, hessian = TRUE, ...): "L-BFGS-B needs
finite values of 'fn'".
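One common workaround for that error (a sketch; `dat` and `nll` below are illustrative, not the poster's actual model) is to wrap the objective so L-BFGS-B never receives `Inf` or `NaN`, returning a large finite penalty instead:

```r
# Guarding an objective against non-finite values for L-BFGS-B.
dat <- c(1.2, 0.8, 1.5)                      # toy data
nll <- function(p) -sum(dnorm(dat, mean = p[1], sd = p[2], log = TRUE))

safe_nll <- function(p) {
  v <- nll(p)
  if (is.finite(v)) v else 1e10              # penalty instead of Inf/NaN
}

fit <- optim(c(0, 1), safe_nll, method = "L-BFGS-B",
             lower = c(-Inf, 1e-6))          # keep sd strictly positive
fit$par  # roughly the sample mean and (ML) sd of dat
```

Bounding the offending parameter away from the invalid region (here `lower = 1e-6` for the standard deviation) is usually the cleaner fix; the penalty wrapper is a belt-and-braces backstop.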

-- 
View this message in context:
http://www.nabble.com/maximum-likelihood-estimation-of-5-parameters-tf292536
4.html#a8180120
Sent from the R help mailing list archive at Nabble.com.




Re: [R] na.action and simultaneous regressions

2007-01-03 Thread Ravi Varadhan
No, Bert, lm doesn't produce a list each of whose components is a separate
fit using all the nonmissing data in the column.  It is true that the
regressions are independently performed, but when the response matrix is
passed from lm on to lm.fit, only the "complete" rows are passed, i.e.
rows with no missing values.  I looked at the lm function, but it was not
obvious to me how to fix it.

In the following toy example, the degrees of freedom for y1 regression
should be 18 and that for y2 should be 15, but both degrees of freedom are
only 15.

 > y1 <- runif(20)
 > y2 <- c(runif(17), rep(NA,3))
 > x <- rnorm(20)
 > summary(lm(cbind(y1,y2) ~ x))
Response y1 :

Call:
lm(formula = y1 ~ x)

Residuals:
 Min   1Q   Median   3Q  Max 
-0.52592 -0.22632 -0.00964  0.25117  0.31227 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.56989    0.06902   8.257 5.82e-07 ***
x           -0.12325    0.06516  -1.891    0.078 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 0.2798 on 15 degrees of freedom
Multiple R-Squared: 0.1926, Adjusted R-squared: 0.1387 
F-statistic: 3.577 on 1 and 15 DF,  p-value: 0.07804 


Response y2 :

Call:
lm(formula = y2 ~ x)

Residuals:
 Min   1Q   Median   3Q  Max 
-0.48880 -0.28552 -0.06022  0.23167  0.54425 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.43712    0.07686   5.687 4.31e-05 ***
x            0.10278    0.07257   1.416    0.177
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 0.3115 on 15 degrees of freedom
Multiple R-Squared: 0.118,  Adjusted R-squared: 0.05915 
F-statistic: 2.006 on 1 and 15 DF,  p-value: 0.1771 
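The workaround discussed in this thread (fit each column separately so every regression keeps all of its own non-missing cases) can be sketched with the same toy data:

```r
# Separate per-response fits; each lm() drops only its own NAs (na.omit),
# so the residual degrees of freedom differ, unlike lm(cbind(y1, y2) ~ x).
set.seed(1)
y1 <- runif(20)
y2 <- c(runif(17), rep(NA, 3))
x <- rnorm(20)

fits <- lapply(list(y1 = y1, y2 = y2), function(y) lm(y ~ x))
sapply(fits, df.residual)  # y1: 18, y2: 15
```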


Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Bert Gunter
Sent: Wednesday, January 03, 2007 4:46 PM
To: 'Talbot Katz'; r-help@stat.math.ethz.ch
Subject: Re: [R] na.action and simultaneous regressions

As the Help page says:

If response is a matrix a linear model is fitted separately by least-squares
to each column of the matrix

So there's nothing hidden going on behind the scenes, and
apply(cbind(y1,y2),2,function(z)lm(z~x)) (or an explicit loop, of course)
will produce a list each of whose components is a separate fit using all the
nonmissing data in the column. 

Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Talbot Katz
Sent: Wednesday, January 03, 2007 11:56 AM
To: r-help@stat.math.ethz.ch
Subject: [R] na.action and simultaneous regressions

Hi.

I am running regressions of several dependent variables using the same set 
of independent variables.  The independent variable values are complete, but

each dependent variable has some missing values for some observations; by 
default, lm(y1~x) will carry out the regressions using only the observations

without missing values of y1.  If I do lm(cbind(y1,y2)~x), the default will 
be to use only the observations for which neither y1 nor y2 is missing.  I'd

like to have the regression for each separate dependent variable use all the

non-missing cases for that variable.  I would think that there should be a 
way to do that using the na.action option, but I haven't seen this in the 
documentation or figured out how to do it on my own.  Can it be done this 
way, or do I have to code the regressions in a loop?  (By the way, since it 
restricts to non-missing values in all the variables simultaneously, is this

because it's doing some sort of SUR or other simultaneous equation 
estimation behind the scenes?)

Thanks!

--  TMK  --
212-460-5430home
917-656-5351cell




Re: [R] na.action and simultaneous regressions

2007-01-03 Thread Ravi Varadhan
Sorry, Bert.  I didn't notice your use of apply, which will indeed give
you separate regression results using all available data.  But I was
wondering, if there was a way to modify lm to be able to accomplish this,
since it is doing separate regressions anyway.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Bert Gunter
Sent: Wednesday, January 03, 2007 4:46 PM
To: 'Talbot Katz'; r-help@stat.math.ethz.ch
Subject: Re: [R] na.action and simultaneous regressions

As the Help page says:

If response is a matrix a linear model is fitted separately by least-squares
to each column of the matrix

So there's nothing hidden going on behind the scenes, and
apply(cbind(y1,y2),2,function(z)lm(z~x)) (or an explicit loop, of course)
will produce a list each of whose components is a separate fit using all the
nonmissing data in the column. 

Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Talbot Katz
Sent: Wednesday, January 03, 2007 11:56 AM
To: r-help@stat.math.ethz.ch
Subject: [R] na.action and simultaneous regressions

Hi.

I am running regressions of several dependent variables using the same set 
of independent variables.  The independent variable values are complete, but

each dependent variable has some missing values for some observations; by 
default, lm(y1~x) will carry out the regressions using only the observations

without missing values of y1.  If I do lm(cbind(y1,y2)~x), the default will 
be to use only the observations for which neither y1 nor y2 is missing.  I'd

like to have the regression for each separate dependent variable use all the

non-missing cases for that variable.  I would think that there should be a 
way to do that using the na.action option, but I haven't seen this in the 
documentation or figured out how to do it on my own.  Can it be done this 
way, or do I have to code the regressions in a loop?  (By the way, since it 
restricts to non-missing values in all the variables simultaneously, is this

because it's doing some sort of SUR or other simultaneous equation 
estimation behind the scenes?)

Thanks!

--  TMK  --
212-460-5430home
917-656-5351cell





Re: [R] RuleFit quantreg: partial dependence plots; showing an effect

2006-12-20 Thread Ravi Varadhan
Dear Roger,

Is it possible to combine the two ideas that you mentioned: (1) algorithmic
approaches of Breiman, Friedman, and others that achieve flexibility in the
predictor space, and (2) robust and flexible regression like QR that achieve
flexibility in the response space, so as to achieve complete flexibility?
If it is possible, are you or anyone else in the R community working on
this?

Thanks,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of roger koenker
Sent: Wednesday, December 20, 2006 8:57 AM
To: Mark Difford
Cc: R-help list
Subject: Re: [R] RuleFit  quantreg: partial dependence plots; showing an
effect

They are entirely different:  Rulefit is a fiendishly clever  
combination of decision tree  formulation
of models and L1-regularization intended to select parsimonious fits  
to very complicated
responses yielding e.g. piecewise constant functions.  Rulefit   
estimates the  conditional
mean of the response over the covariate space, but permits a very  
flexible, but linear in
parameters specifications of the covariate effects on the conditional  
mean.  The quantile
regression plotting you refer to adopts a fixed, linear specification  
for conditional quantile
functions and given that specification depicts how the covariates  
influence the various
conditional quantiles of the response.   Thus, roughly speaking,  
Rulefit is focused on
flexibility in the x-space, maintaining the classical conditional  
mean objective; while
QR is trying to be more flexible in the y-direction, and maintaining  
a fixed, linear
in parameters specification for the covariate effects at each quantile.


url:www.econ.uiuc.edu/~rogerRoger Koenker
email[EMAIL PROTECTED]Department of Economics
vox: 217-333-4558University of Illinois
fax:   217-244-6678Champaign, IL 61820


On Dec 20, 2006, at 4:17 AM, Mark Difford wrote:

 Dear List,

 I would greatly appreciate help on the following matter:

 The RuleFit program of Professor Friedman uses partial dependence  
 plots
 to explore the effect of an explanatory variable on the response
 variable, after accounting for the average effects of the other
 variables.  The plot method [plot(summary(rq(y ~ x1 + x2,
 t=seq(.1,.9,.05] of Professor Koenker's quantreg program  
 appears to
 do the same thing.


 Question:
 Is there a difference between these two types of plot in the manner  
 in which they depict the relationship between explanatory variables  
 and the response variable ?

 Thank you in advance for your help.

 Regards,
 Mark Difford.

 -
 Mark Difford
 Ph.D. candidate, Botany Department,
 Nelson Mandela Metropolitan University,
 Port Elizabeth, SA.





Re: [R] RuleFit quantreg: partial dependence plots; showing an effect

2006-12-20 Thread Ravi Varadhan
Thanks, Roger.  These should be very useful tools.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: roger koenker [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 20, 2006 10:59 AM
To: Ravi Varadhan
Cc: 'Mark Difford'; 'R-help list'
Subject: Re: [R] RuleFit  quantreg: partial dependence plots; showing an
effect



On Dec 20, 2006, at 8:43 AM, Ravi Varadhan wrote:

 Dear Roger,

 Is it possible to combine the two ideas that you mentioned: (1)  
 algorithmic
 approaches of Breiman, Friedman, and others that achieve  
 flexibility in the
 predictor space, and (2) robust and flexible regression like QR  
 that achieve
 flexibility in the response space, so as to achieve complete  
 flexibility?
 If it is possible, are you or anyone else in the R community  
 working on
 this?


There are some tentative steps in this direction.  One is the rqss()  
fitting
in my quantreg package which does QR fitting with additive models
using total variation as a roughness penalty for nonlinear terms.
Another, along more tree structured lines, is Nicolai Meinshausen's
quantregforest package.

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of roger koenker
 Sent: Wednesday, December 20, 2006 8:57 AM
 To: Mark Difford
 Cc: R-help list
 Subject: Re: [R] RuleFit  quantreg: partial dependence plots;  
 showing an
 effect

 They are entirely different:  Rulefit is a fiendishly clever
 combination of decision tree  formulation
 of models and L1-regularization intended to select parsimonious fits
 to very complicated
 responses yielding e.g. piecewise constant functions.  Rulefit
 estimates the  conditional
 mean of the response over the covariate space, but permits a very
 flexible, but linear in
 parameters specifications of the covariate effects on the conditional
 mean.  The quantile
 regression plotting you refer to adopts a fixed, linear specification
 for conditional quantile
 functions and given that specification depicts how the covariates
 influence the various
 conditional quantiles of the response.   Thus, roughly speaking,
 Rulefit is focused on
 flexibility in the x-space, maintaining the classical conditional
 mean objective; while
 QR is trying to be more flexible in the y-direction, and maintaining
 a fixed, linear
 in parameters specification for the covariate effects at each  
 quantile.


 url:www.econ.uiuc.edu/~rogerRoger Koenker
 email[EMAIL PROTECTED]Department of Economics
 vox: 217-333-4558University of Illinois
 fax:   217-244-6678Champaign, IL 61820


 On Dec 20, 2006, at 4:17 AM, Mark Difford wrote:

 Dear List,

 I would greatly appreciate help on the following matter:

 The RuleFit program of Professor Friedman uses partial dependence
 plots
 to explore the effect of an explanatory variable on the response
 variable, after accounting for the average effects of the other
 variables.  The plot method [plot(summary(rq(y ~ x1 + x2,
 t=seq(.1,.9,.05] of Professor Koenker's quantreg program
 appears to
 do the same thing.


 Question:
 Is there a difference between these two types of plot in the manner
 in which they depict the relationship between explanatory variables
 and the response variable ?

 Thank you in advance for your help.

 Regards,
 Mark Difford.

 -
 Mark Difford
 Ph.D. candidate, Botany Department,
 Nelson Mandela Metropolitan University,
 Port Elizabeth, SA.





Re: [R] any way to make the code more efficient ?

2006-12-08 Thread Ravi Varadhan

Using rbind almost always slows things down significantly.  You should
define the objects aggfxdata and fullaggfxdata before the loop and then
assign appropriate values to the corresponding rows and/or columns.  

Ravi.
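Ravi's advice can be sketched as follows (toy data in place of the poster's file reading; with zoo objects the same pattern works, since `do.call(rbind, ...)` dispatches to `rbind.zoo`):

```r
# Collect per-iteration pieces in a preallocated list, then bind once.
# Growing an object with rbind() inside the loop copies it every iteration
# (quadratic cost); a single rbind at the end is linear.
n_files <- 100
pieces <- vector("list", n_files)        # preallocate the container
for (i in seq_len(n_files)) {
  pieces[[i]] <- data.frame(day = i, value = rnorm(5))  # stand-in for one file
}
full <- do.call(rbind, pieces)           # one rbind, not one per iteration
nrow(full)  # 500
```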


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Leeds, Mark (IED)
Sent: Friday, December 08, 2006 4:17 PM
To: r-help@stat.math.ethz.ch
Subject: [R] any way to make the code more efficient ?

The code below works, so this is why I didn't include the data to
reproduce it.  The loop runs about 500
times and each time, a zoo object with 1400 rows and 4 columns gets
created (the rows represent minutes, so each file is one day's
worth of data).  Inside the loop, I keep rbinding the newly created zoo
object to the current zoo object so that it gets bigger and
bigger over time.

Eventually, the new zoo object, fullaggfxdata,  containing all the days
of data is created.

I was just wondering if there is a more efficient way of doing this.  I
do know the number of times the loop will be done at the beginning, so
maybe creating a matrix or data frame at the beginning and putting
the daily ones in something like that would
make it faster.  But the problem with this is that I eventually do need a
zoo object.  I ask this question because at around the 250
mark of the loop, things start to slow down significantly, and I think I
remember reading somewhere that doing an rbind of something to itself is
not a good idea.  Thanks.

#===
===

start <- 1

for (filecounter in (1:length(datafilenames))) {

print(paste("File Counter = ", filecounter))
datafile <- paste(datadir, "/", datafilenames[filecounter], sep = "")
aggfxdata <- clnaggcompcurrencyfile(fxfile = datafile, aggminutes = aggminutes,
fillholes = 1)
logbidask <- log(aggfxdata[, "bidask"])
aggfxdata <- cbind(aggfxdata, logbidask)

if ( start == 1 ) {
fullaggfxdata <- aggfxdata
start <- 0
} else {
fullaggfxdata <- rbind(fullaggfxdata, aggfxdata)
}

}

#===
==






[R] How to solve differential equations with a delay (time lag)?

2006-11-29 Thread Ravi Varadhan
Hi,

 

I would like to solve a system of coupled ordinary differential equations,
where there is a delay (time lag) term.  I would like to use the lsoda
function in the odesolve package.  However, I am not sure how to specify the
delay term using the syntax allowed by odesolve.  

Here is an example of the kind of problem that I am trying to solve:

 

 library(odesolve) 

 

yprime <- function(t, y, p) {  # this function

yd1 <- p["k1"] * (t <= p["T"]) - p["k2"] * y[2]

yd2 <- p["k3"] * y[1](t - p["delay"]) - p["k4"] * y[2]  # this is not
syntactically valid, but it is what I would like to do

list(c(yd1, yd2))

}

 

times <- seq(0, 30, by = 0.1)

y0 <- c(0, 0)

parms <- c(k1 = 0.7, k2 = 0.5, k3 = 0.2, k4 = 0.8, T = 10, delay = 5)

 

Is there a way to incorporate delay in odesolve?

 

Any hints would be much appreciated. 

 

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




 




Re: [R] How to solve differential equations with a delay (time lag)?

2006-11-29 Thread Ravi Varadhan
Hi,

I would like to add an additional piece of information to my previous
posting:

y[1](t) = 0, for all t in the interval (-delay, 0)

This completes the specification of the problem.

Thanks,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Ravi Varadhan
Sent: Wednesday, November 29, 2006 4:45 PM
To: r-help@stat.math.ethz.ch
Subject: [R] How to solve differential equations with a delay (time lag)?

Hi,

 

I would like to solve a system of coupled ordinary differential equations,
where there is a delay (time lag) term.  I would like to use the lsoda
function in the odesolve package.  However, I am not sure how to specify the
delay term using the syntax allowed by odesolve.  

Here is an example of the kind of problem that I am trying to solve:

 

 library(odesolve) 

 

yprime <- function(t, y, p) {  # this function

yd1 <- p["k1"] * (t <= p["T"]) - p["k2"] * y[2]

yd2 <- p["k3"] * y[1](t - p["delay"]) - p["k4"] * y[2]  # this is not
syntactically valid, but it is what I would like to do

list(c(yd1, yd2))

}

 

times <- seq(0, 30, by = 0.1)

y0 <- c(0, 0)

parms <- c(k1 = 0.7, k2 = 0.5, k3 = 0.2, k4 = 0.8, T = 10, delay = 5)

 

Is there a way to incorporate delay in odesolve?

 

Any hints would be much appreciated. 

 

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




 





Re: [R] How to solve differential equations with a delay (time lag)?

2006-11-29 Thread Ravi Varadhan
Thanks, Woodrow.  I also found a DDE solver called dde23 in Matlab written
by L.F. Shampine.  I will see if I can use it in Scilab, since I don't have
Matlab.  

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]

Sent: Wednesday, November 29, 2006 5:26 PM
To: Ravi Varadhan
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] How to solve differential equations with a delay (time
lag)?

lsoda does not solve delay differential equations, which is what you
need to be able to do.  A quick search on netlib (netlib.org) turned up
one match for delay differential equations: ode/ddverk.f, which might
help (though not in R).
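For readers finding this thread later: the deSolve package (the successor to odesolve) does solve delay differential equations via `dede()`, with `lagvalue()` retrieving the lagged state. A sketch of Ravi's system under the stated zero history for y1 (parameter names as in his post):

```r
# Delay differential equation via deSolve::dede(); lagvalue(t, 1) returns
# the first state variable at the lagged time.
library(deSolve)

yprime <- function(t, y, p) {
  y1lag <- if (t > p["delay"]) lagvalue(t - p["delay"], 1) else 0  # y1 = 0 on (-delay, 0)
  yd1 <- p["k1"] * (t <= p["T"]) - p["k2"] * y[2]
  yd2 <- p["k3"] * y1lag - p["k4"] * y[2]
  list(c(yd1, yd2))
}

times <- seq(0, 30, by = 0.1)
parms <- c(k1 = 0.7, k2 = 0.5, k3 = 0.2, k4 = 0.8, T = 10, delay = 5)
out <- dede(y = c(0, 0), times = times, func = yprime, parms = parms)
```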

R. Woodrow Setzer, Ph. D.
National Center for Computational Toxicology
US Environmental Protection Agency
Mail Drop B205-01/US EPA/RTP, NC 27711
Ph: (919) 541-0128Fax: (919) 541-1194



From: Ravi Varadhan [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Date: 11/29/2006 04:44 PM
Subject: [R] How to solve differential equations with a delay (time lag)?

Hi,



I would like to solve a system of coupled ordinary differential
equations,
where there is a delay (time lag) term.  I would like to use the lsoda
function in the odesolve package.  However, I am not sure how to specify the
delay term using the syntax allowed by odesolve.

Here is an example of the kind of problem that I am trying to solve:



 library(odesolve)



yprime <- function(t, y, p) {  # this function

yd1 <- p["k1"] * (t <= p["T"]) - p["k2"] * y[2]

yd2 <- p["k3"] * y[1](t - p["delay"]) - p["k4"] * y[2]  # this is not
syntactically valid, but it is what I would like to do

list(c(yd1, yd2))

}



times <- seq(0, 30, by = 0.1)

y0 <- c(0, 0)

parms <- c(k1 = 0.7, k2 = 0.5, k3 = 0.2, k4 = 0.8, T = 10, delay = 5)



Is there a way to incorporate delay in odesolve?



Any hints would be much appreciated.



Ravi.




__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[R] Matrix-vector multiplication without loops

2006-11-14 Thread Ravi Varadhan
Hi,

 

I am trying to do the following computation:

 

  p <- rep(0, n)

  coef <- runif(K + 1)

  U <- matrix(runif(n * (2 * K + 1)), n, 2 * K + 1)

  for (i in 0:K) {

      for (j in 0:K) {

          p <- p + coef[i + 1] * coef[j + 1] * U[, i + j + 1]

  } }

 

I would appreciate any suggestions on how to perform this computation
efficiently, without the for loops.

 

Thank you,

Ravi.




Re: [R] Matrix-vector multiplication without loops

2006-11-14 Thread Ravi Varadhan
In my case, I have n >> K, so the loop solution is faster than the solutions
proposed by Christos and Dimitris.

In any case, I would like to thank Christos and Dimitris for their
solutions.

Best,
Ravi.



-Original Message-
From: Christos Hatzis [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, November 14, 2006 2:49 PM
To: 'Dimitris Rizopoulos'; 'Ravi Varadhan'
Cc: r-help@stat.math.ethz.ch
Subject: RE: [R] Matrix-vector multiplication without loops

Ravi,

Here is another version, somewhat similar to what Dimitris proposed but
without using 'rep'.  For large n and K, 'rep' tries to allocate two vectors,
each of length n*(K+1)^2, which can be a problem.  In this version 'outer'
buys you some efficiency and compactness, but your looping version is still
faster.

n <- 1000
K <- 1000
p <- rep(0, n)
cf <- runif(K + 1)
U <- matrix(runif(n * (2 * K + 1)), n, 2 * K + 1)

# original code
system.time(
{
for (i in 0:K)
    for (j in 0:K)
        p <- p + cf[i + 1] * cf[j + 1] * U[, i + j + 1]
p.1 <- p
} )

# 'vectorized'
system.time(
{
ind <- sapply(1:(K + 1), seq, length = K + 1)
cc <- outer(cf, cf)
p.2 <- apply(U, 1, FUN = function(u) sum(cc * u[ind]))
} )


all.equal(p.1, p.2)
rm(n,K,p,U,cf,cc,ind,p.1,p.2) 
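
For completeness, a fully vectorized route that nobody posted in the thread: the weight on column i+j+1 depends only on i+j, so the double sum collapses to one matrix-vector product against the self-convolution of coef (the usual convolve(x, rev(y), type = "open") idiom):

```r
set.seed(1)
n <- 50; K <- 10
coef <- runif(K + 1)
U <- matrix(runif(n * (2 * K + 1)), n, 2 * K + 1)

# looped original, for reference
p <- rep(0, n)
for (i in 0:K)
    for (j in 0:K)
        p <- p + coef[i + 1] * coef[j + 1] * U[, i + j + 1]

# the weight for column m+1 is the sum over i + j == m of coef[i+1] * coef[j+1],
# i.e. the self-convolution (polynomial square) of coef
w <- convolve(coef, rev(coef), type = "open")   # length 2K + 1
p2 <- drop(U %*% w)

all.equal(p, p2)   # TRUE, up to floating-point error
```

This costs O(nK + K log K) instead of O(nK^2), so it should win for large K regardless of n.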

-Christos

Christos Hatzis, Ph.D.
Nuvera Biosciences, Inc.
400 West Cummings Park
Suite 5350
Woburn, MA 01801
Tel: 781-938-3830
www.nuverabio.com
 


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Dimitris Rizopoulos
Sent: Tuesday, November 14, 2006 11:20 AM
To: Ravi Varadhan
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Matrix-vector multiplication without loops

I think you need something along these lines:

ind <- c(sapply(1:(K + 1), seq, length = K + 1))
cf1 <- rep(rep(coef, each = n), K + 1)
cf2 <- rep(rep(coef, each = n), each = K + 1)
rowSums(cf1 * cf2 * U[, ind])


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message -
From: Ravi Varadhan [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Tuesday, November 14, 2006 4:45 PM
Subject: [R] Matrix-vector multiplication without loops




Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm



Re: [R] R and Fortran 9x -- advice

2006-11-13 Thread Ravi Varadhan
Metcalf and Reid - FORTRAN 90/95 Explained

Ravi.



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Christos Hatzis
Sent: Monday, November 13, 2006 2:05 PM
To: 'Mike Prager'; r-help@stat.math.ethz.ch
Subject: Re: [R] R and Fortran 9x -- advice

Mike,

Can you recommend any good books on Fortran 90/95?  I had been an old user
of Fortran 77 but haven't followed the developments in the last 15 years or
so...

Thanks.

Christos Hatzis, Ph.D.
Nuvera Biosciences, Inc.
400 West Cummings Park
Suite 5350
Woburn, MA 01801
Tel: 781-938-3830
www.nuverabio.com
 


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mike Prager
Sent: Monday, November 13, 2006 1:00 PM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] R and Fortran 9x -- advice

Tamas K Papp [EMAIL PROTECTED] wrote:

 I found some bottlenecks in my R code with Rprof.  First I wanted to 
 rewrite them in C, but a colleague keeps suggesting that I learn 
 Fortran, so maybe this is the time to do it...
 
 1) I hear bad things about Fortran.  Sure, F77 looks archaic, but
 F90/95 seems nicer.  Is it worth learning, especially when I plan to 
 use it mainly from R?  Dusting off my C knowledge would take a bit of 
 time too, and I never liked C that much.

I'll answer this from the perspective of someone who uses Fortran 95
regularly.  It is a modern language, far more reliable and flexible than
Fortran 77 and quite well suited to most scientific problems.  I do think
it's worth learning, particularly if C is not to your taste.

Two free compilers for Fortran 95 are available.  It seems that
g95 is complete, while gfortran is nearing completion.  There are also
several high-quality commercial compilers, some of which are free under
certain operating systems and/or conditions and others of which (Lahey) are
quite inexpensive if one is willing to work from the command line or one's
own editor.

I can't address questions of R interoperability  -- not something I've done.

--
Mike Prager, NOAA, Beaufort, NC
* Opinions expressed are personal and not represented otherwise.
* Any use of tradenames does not constitute a NOAA endorsement.



Re: [R] Numerical Integration

2006-11-09 Thread Ravi Varadhan
Here is a simple example using the bkde (binned kernel density estimator)
from the KernSmooth package that shows how to use the trapezoidal
integration:

> library(KernSmooth)
> data(geyser, package = "MASS")
> x <- geyser$duration
> est <- bkde(x, grid = 101, bandwidth = 0.25)
> trap.rule(est$x, est$y)
[1] 0.61

The answer from the simple trapezoidal rule is pretty good, with an error of
less than 1.e-04 (the number of grid points used is 101, with equal spacing).
You could use higher-order rules (e.g., Simpson's), but you can get
reasonable accuracy with the trapezoidal rule by just increasing the number
of grid points.
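
As a sketch of the higher-order alternative mentioned above, here is a composite Simpson's rule for equally spaced points, written by analogy with trap.rule (the name simp.rule is made up here; it requires an odd number of grid points):

```r
simp.rule <- function(x, y) {
  n <- length(x)
  stopifnot(n >= 3, n %% 2 == 1)     # composite Simpson needs an odd n >= 3
  h <- x[2] - x[1]                   # assumes equal spacing
  s4 <- sum(y[seq(2, n - 1, by = 2)])                     # interior odd panels
  s2 <- if (n > 3) sum(y[seq(3, n - 2, by = 2)]) else 0   # interior even panels
  (h / 3) * (y[1] + y[n] + 4 * s4 + 2 * s2)
}

# Simpson's rule is exact for quadratics: the integral of x^2 on [0, 1] is 1/3
x <- seq(0, 1, length = 101)
simp.rule(x, x^2)   # 1/3, up to rounding
```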

Ravi.


-Original Message-
From: Xiaofan Cao [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, November 08, 2006 4:21 PM
To: Ravi Varadhan
Cc: 'Doran, Harold'; r-help@stat.math.ethz.ch
Subject: RE: [R] Numerical Integration

Hi Ravi and Harold,

Thanks for the input. I'm using trapezoidal rule and like to know if 
there's other alternatives. This f(x) is the kernel density estimator and 
thus we can get an estimate of f(x) at any given x in theory.

Thanks again,
Martha

On Wed, 8 Nov 2006, Ravi Varadhan wrote:

 Can you get an estimate of f(x) at any given x?  If so, the Gaussian
 quadrature methods will work, but not otherwise, since f(x) must be known
 at all the nodes.  A rough approximation to the integral can be obtained
 using the trapezoidal rule.  Here is a simple function to do that:

 trap.rule <- function(x, y) sum(diff(x) * (y[-1] + y[-length(y)])) / 2

 However, the use of the word "knots" seems to indicate that some sort of
 spline is being fit to the data.  Martha - can you provide more
 information about your function f(x)?

 Ravi.


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Doran, Harold
 Sent: Wednesday, November 08, 2006 1:04 PM
 To: Xiaofan Cao; r-help@stat.math.ethz.ch
 Subject: Re: [R] Numerical Integration

 You might try the statmod package which provides nodes and weights for
 gaussian quadrature.

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Xiaofan Cao
 Sent: Wednesday, November 08, 2006 12:43 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Numerical Integration

 Hi everyone,

 I'm trying to integrate f(x) over x where f(x) does not have
 a close form but only numerical values at certurn knots of x.
 Is there a way that I can use any generical R function (such
 as integrate) or any package to do so?

 Thanks! I appreciate your time.

 Best Regards,
 Martha Cao



Re: [R] solve computationally singular

2006-11-07 Thread Ravi Varadhan
For heaven's sake, please stop sending repeat emails and send your R code
that can reproduce the error.

Ravi. 


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of xchen
Sent: Tuesday, November 07, 2006 2:46 PM
To: r-help@stat.math.ethz.ch
Subject: [R] solve computationally singular

Hi uRsers,

When inverting a 2 by 2 matrix using solve, I encountered an error message:
solve.default(sigma, tol = 1e-07) :
system is computationally singular: reciprocal condition number 
= 1.7671e-017

and then I test the determinant of this matrix: 6.341393e-06.

In my program, I have a condition block that checks whether a matrix is
invertible, like this:

if (det(sigma) < 1e-7) return(NULL)

but this does not seem to prevent the singularity when inverting a
matrix.  I am somewhat confused about the relationship between the
reciprocal condition number and the determinant.  Can anybody give me some
idea of how to prevent this situation?

Thanks a lot!

Xiaohui
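
On the determinant-versus-condition-number confusion: the determinant is scale-dependent and says almost nothing about numerical invertibility, while rcond() in base R measures exactly the quantity that solve() checks.  A small illustration (the 2 x 2 matrices and the helper inv.or.null are invented for this sketch):

```r
A <- diag(1e-3, 2)        # a scaled-down identity matrix
det(A)                    # 1e-06: tiny determinant...
rcond(A)                  # 1: ...yet perfectly conditioned

B <- matrix(c(1, 1, 1, 1 + 1e-12), 2, 2)
det(B)                    # ~1e-12
rcond(B)                  # ~2.5e-13: genuinely near-singular

# a safer guard than det(sigma) < 1e-7: test the reciprocal condition number
inv.or.null <- function(sigma, tol = 1e-7)
  if (rcond(sigma) < tol) NULL else solve(sigma)
```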



Re: [R] Ridge for logistic regression

2006-11-06 Thread Ravi Varadhan
Also check out the package glmpath which can incorporate both ridge (L2)
and lasso (L1) type penalties.

Ravi.



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Frank E Harrell Jr
Sent: Monday, November 06, 2006 7:41 AM
To: Prof Brian Ripley
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Ridge for logistic regression

Prof Brian Ripley wrote:
 On Fri, 3 Nov 2006, Zheng Yuan wrote:
 
 Dear all experts,

 Does anyone know if there is an R function which can perform ridge
 regression for logistic models?
 
 multinom and nnet in the package of that name.
 

You can also use the lrm function in the Design package, which makes it 
relatively easy to specify penalization patterns by type of model term 
(e.g., to obtain more penalization for interaction or nonlinear terms).

-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



[R] Markov blanket

2006-11-02 Thread Ravi Varadhan
Hi,

 

Is there an R implementation of the HITON Markov blanket discovery algorithm
that is referenced for example in the following paper?

 

http://citeseer.ist.psu.edu/cache/papers/cs/27027/http:zSzzSzdiscover1.mc.va
nderbilt.eduzSzdiscoverzSzpubliczSzPublicationszSzHITON.pdf/aliferis03hiton.
pdf

 

I would also appreciate tips to any related algorithms/methods that are
implemented in R.

 

Thanks,

Ravi.

 

 





Re: [R] Hessian matrix

2006-11-01 Thread Ravi Varadhan
The function hessian in the numDeriv package can be useful.  It is based
on Richardson extrapolation, and although it involves more computational
effort (depending on the order of the extrapolation), it will provide highly
accurate values.  
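
For comparison with the packaged tools, a bare-bones central-difference Hessian can be written in base R (the name num.hess is invented here; numDeriv's hessian() is more accurate thanks to Richardson extrapolation):

```r
num.hess <- function(f, x, h = 1e-4) {
  p <- length(x)
  H <- matrix(NA_real_, p, p)
  for (i in 1:p) for (j in 1:p) {
    ei <- numeric(p); ej <- numeric(p)
    ei[i] <- h; ej[j] <- h
    # central second difference for d2f / (dx_i dx_j)
    H[i, j] <- (f(x + ei + ej) - f(x + ei - ej) -
                f(x - ei + ej) + f(x - ei - ej)) / (4 * h^2)
  }
  (H + t(H)) / 2            # symmetrize away rounding noise
}

# f(x, y) = x^2 * y has Hessian [[2y, 2x], [2x, 0]]; at (1, 2) this is [[4, 2], [2, 0]]
num.hess(function(x) x[1]^2 * x[2], c(1, 2))
```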

Ravi.


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Prof Brian Ripley
Sent: Wednesday, November 01, 2006 3:13 AM
To: Arun Kumar Saha
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Hessian matrix

?deriv  can do so algebraically for some functions.

There are various packages which can do so via finite differences, e.g.
fdHess() in nlme and hessian() in numDeriv.

On Wed, 1 Nov 2006, Arun Kumar Saha wrote:

 Dear all R users,

 Is there any way to calculate hessian matrix of a given function at any
 given point?

 Regards


Please do: no HTML mail as requested.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Multivariate regression

2006-10-31 Thread Ravi Varadhan
Thank you Robin (also thanks to Andris and Bernhard). I did get answers to
my problem using lm as you had suggested. But my main concern is about
getting the appropriate standard errors for \hat{beta}.  

My response is actually preference data, where each individual ranks a list
of k items, assigning them unique ranks from 1 to k.  Since the ranks for
each item are negatively-correlated within an individual, I would like to
take this into consideration.  Although lm gives me correct parameter
estimates, I think that the standard errors are overestimated.

Best,
Ravi.



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Robin Hankin
Sent: Monday, October 30, 2006 4:17 AM
To: Andris Jankevics
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Multivariate regression

Hi

I discovered the other day that lm() does some of the work for
you:

library(mvtnorm)

X <- matrix(rnorm(60), ncol = 3)
beta <- matrix(1:6, ncol = 2)
sig <- matrix(c(1, 0.7, 0.7, 1), 2, 2)

Y <- X %*% beta + rmvnorm(n = 20, sigma = sig)


> lm(Y ~ X - 1)

Call:
lm(formula = Y ~ X - 1)

Coefficients:
 [,1]   [,2]
X1  1.015  4.065
X2  2.483  5.366
X3  2.762  5.727


This gives an estimate for beta.

But I don't know of a ready-made R solution for estimating
the covariance of  the elements of beta, or the sig matrix
for the covariance matrix of the observation errors.

Anyone?
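
One route, assuming the rows of the error matrix are i.i.d. with covariance Sigma: vec(Bhat) then has covariance Sigma %x% solve(t(X) %*% X), and Sigma can be estimated from the residual matrix.  A sketch under that assumption (simulated here with independent errors so that base R suffices; all object names are illustrative):

```r
set.seed(123)
X <- matrix(rnorm(60), ncol = 3)
B <- matrix(1:6, ncol = 2)
Y <- X %*% B + matrix(rnorm(40), ncol = 2)   # independent errors for simplicity

fit <- lm(Y ~ X - 1)
E <- residuals(fit)                               # 20 x 2 residual matrix
Sigma.hat <- crossprod(E) / (nrow(X) - ncol(X))   # df-adjusted estimate of Sigma
V.beta <- Sigma.hat %x% solve(crossprod(X))       # 6 x 6 covariance of vec(Bhat)
se.beta <- matrix(sqrt(diag(V.beta)), ncol = 2)   # standard errors, shaped like B
```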





On 30 Oct 2006, at 09:01, Andris Jankevics wrote:

 Also you can take a look on Partial Least Squares (PLS) regression.
 http://www.statsoft.com/textbook/stpls.html
 R-package: http://mevik.net/work/software/pls.html

 Andris Jankevics

 On Sestdiena, 28. Oktobris 2006 06:04, Ritwik Sinha wrote:
 You can use gee (
 http://finzi.psych.upenn.edu/R/library/geepack/html/00Index.html)  
 or maybe
 the function gls in nlme.

 Ritwik.

 On 10/27/06, Ravi Varadhan [EMAIL PROTECTED] wrote:

--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743


[R] Multivariate regression

2006-10-27 Thread Ravi Varadhan
Hi,

 

Suppose I have a multivariate response Y (n x k) obtained at a set of
predictors X (n x p).  I would like to perform a linear regression taking
into consideration the covariance structure of Y within each unit - this
would be represented by a specified matrix V (k x k), assumed to be the same
across units.  How do I use lm to do this?  

 

One approach that I was thinking of is as follows:

 

Flatten Y to a vector, say, Yvec (n*k x 1).  Create Xvec (n*k x p*k) such
that it is made up of block matrices Bij (k x k), where Bij is a diagonal
matrix with X_ij as the diagonal (i = 1,...,n, and j = 1,...,p).  Now I can use
lm in a univariate mode to regress Yvec against Xvec, with covariance
matrix Vvec (n*k x n*k).  Vvec is a block-diagonal matrix with blocks of V
along the diagonal.  This seems like a valid approach, but I still don't
know how to specify the covariance structure to do weighted least squares.

 

Any help is appreciated.

 

Best,

Ravi.

 





Re: [R] R & gams

2006-10-27 Thread Ravi Varadhan
Can you be more specific about what you mean by "gams"?  Do you mean
generalized additive models (GAM)?  If so, R is a good environment for
forecasting models and GAM.  However, the link that you provided is NOT for
generalized additive modeling; it is for the General Algebraic Modeling
System (GAMS), a high-level modeling system for mathematical programming
and optimization.

Ravi.



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of vittorio
Sent: Friday, October 27, 2006 3:14 PM
To: r-help@stat.math.ethz.ch
Subject: [R] R & gams

At the office I have been introduced by another company to new, complex energy
forecasting models using gams as the basic software.
I have been told by the company offering the models that gams is specialised
in dealing with huge, heavy-weight linear and non-linear modelling (see an
example in http://www.gams.com/modtype/index.htm), and they say it is almost
the only option for doing it.

I would like to know your opinion on the subject and, above all, whether R
can be an effective alternative, and to what extent, if any.

Thanks
Vittorio



  1   2   >