Re: [R] Modifying Internal C Files
Thanks for your thoughts, Prof. Ripley. When I set out to do this I had no idea it would be so difficult. Nevertheless, perhaps the best thing would be for me to say why I want to change internal parts of the function, and why the best solution might be for the change to be made in the next release of R.

At the beginning of the pnbeta.c file we see:

    /* change errmax and itrmax if desired */
    const static double errmax = 1.0e-9;
    const int itrmax = 100;

which obviously sets the limits for the error and for the number of iterations. The comment says to change them if desired, but there is no way for a user to modify these values. If you recall, I posted on problems with the pf() function for large noncentrality parameters (on Oct. 24, and filed as bug #8251). Since pf() uses pnbeta.c, the limitations in pnbeta.c filter through to the pf() function. My suspicion about the iteration limit was confirmed by Chattamvelli and Shanmugam (1997, _Applied Statistics_), whose table illustrates that the method R uses (Lenth, 1987, with the Frick, 1990, update) can indeed require well over 100 iterations. Dr. T. Lumley emailed the list saying that increasing the iteration limit does in fact seem to solve the problems I described (presumably by modifying the source code directly).

So, could itrmax be changed inside pnbeta.c to some large value (e.g., 5000; most problems will still converge after only a few iterations)? Or could pbeta() (and possibly pf(), since it calls pbeta()) have parameters added that give the user control over these values? I would very much appreciate it if this could be done, and it would obviously make R a bit more stable for large noncentrality values (which are growing more important in some areas). That way everyone has a working function, not just me (if I managed to modify this internal function, anyway!).

As an aside, the Chattamvelli and Shanmugam (1997) paper presents an update (Algorithm AS 310) of the algorithm that pnbeta.c is based on. It would be nice to incorporate that into R at some point, but the iteration issue could be taken care of right now.

Thanks,
Ken

Prof Brian Ripley wrote:

In your first post you said you called .C; here you say you call .Internal. Users cannot add .Internal functions, nor can they be used in packages. In any case, the syntax is not as you show, but

    .Internal(pnbeta(q, shape1, shape2, ncp, lower.tail, log.p))

(And to be definite, that calls the C function do_math4, not pnbeta.)

Final comment: Duncan mentioned

    double pnbeta2(double x, double a, double b, double lambda,
                   int lower_tail, int log_p)

but a .C call has to be to a function with all arguments as pointers, so a wrapper function would be needed.

The reason you are finding this difficult is that you are not dealing with the C function behind .Internal, but one deeper into the system. The simplest thing to do would be to compile R from the sources, then make modifications and rebuild. But in this case I believe Thomas Lumley has already modified the R-devel sources and Duncan Murdoch has provided Windows binaries of R-devel, so perhaps there is a much simpler route?

On Mon, 7 Nov 2005, Ken Kelley wrote:

Hi Duncan and others. Thanks for your insight. Actually, I did change the function name in the pnbeta2.c code (I should have said so). When I check the package I get no errors. I build the package and all seems well. When I install and then load the package in R, I get:

    > pnbeta2
    function (x, a, b, lambda, low.tail, log.p)
    {
        res <- .Internal(pnbeta2, as.double(x), as.double(a), as.double(b),
            as.double(lambda), as.integer(low.tail), as.integer(log.p))
        return(list(res))
    }
    <environment: namespace:try>

which seems good, but the function does not work:

    > pnbeta2(.1, 10, 25, 2, TRUE, FALSE)
    Error in pnbeta2(0.1, 10, 25, 2, TRUE, FALSE) :
            7 arguments passed to '.Internal' which requires 1

Trying to call the .Internal function directly is perhaps where my problems begin:

    > .Internal(pnbeta2(.1, 10, 25, 2, TRUE, FALSE))
    Error in .Internal(pnbeta2(0.1, 10, 25, 2, TRUE, FALSE)) :
            no internal function pnbeta2

But when I do the same thing with the pnbeta internal function (which pnbeta2 is identical to at this point), I get a result:

    > .Internal(pnbeta(.1, 10, 25, 2, TRUE, FALSE))
    [1] 0.0006837318

So I'm at a loss for what is going on here; my pnbeta2 internal function doesn't seem to be there. What I want to do is simple (modify an internal C file), but it is proving quite difficult for me to implement. Thanks for any thoughts, Ken

Duncan Murdoch wrote:

On 11/7/2005 5:17 PM, Ken Kelley wrote:

Hi All. I want to tweak a few minor things inside of internal C code. I have my Win. XP machine set up to build packages (including C code), but I'm having problems getting the package to run correctly.
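The series that pnbeta.c evaluates is easy to restate in pure R with the two limits exposed as arguments. The sketch below is a simplified transliteration of the Poisson-mixture recursion (lower tail only, scalar x in (0,1)); the function name is made up, and it is meant to illustrate the role of itrmax and errmax rather than to reproduce the C code exactly:

    pnbeta.sketch <- function(x, a, b, lambda, itrmax = 5000, errmax = 1e-9)
    {
        ## P(X <= x) for X ~ noncentral beta(a, b, lambda), computed as a
        ## Poisson(lambda/2) mixture of central incomplete beta functions
        c0   <- lambda / 2
        q    <- exp(-c0)          # Poisson weight at j = 0
        sumq <- q                 # accumulated Poisson mass
        ib   <- pbeta(x, a, b)    # I_x(a, b)
        ## g links successive incomplete beta values:
        ## I_x(a + j, b) = I_x(a + j - 1, b) - g_{j-1}
        g   <- exp(a * log(x) + b * log1p(-x) - log(a) - lbeta(a, b))
        res <- q * ib
        for (j in 1:itrmax) {
            ib   <- ib - g
            g    <- g * x * (a + b + j - 1) / (a + j)
            q    <- q * c0 / j
            sumq <- sumq + q
            res  <- res + q * ib
            if ((1 - sumq) * ib <= errmax) break  # bound on the neglected tail
        }
        res
    }

    pnbeta.sketch(.1, 10, 25, 2)  # ~0.0006837, matching pnbeta above

With itrmax = 100, large values of lambda exhaust the loop before the tail bound is met, which is exactly the plateau reported for pf() below.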
[R] Modifying Internal C Files
Hi All. I want to tweak a few minor things inside of internal C code. I have my Win. XP machine set up to build packages (including C code), but I'm having problems getting the package to run correctly. In particular, I want to modify a few things inside of pnbeta.c (namely errmax and itrmax), which is what the pbeta() function calls upon when a noncentrality parameter is given. I copied the pnbeta.c C code, changed its name [to pnbeta2.c], and included the nmath.h and dpq.h header files, along with lgamma.c and pbeta.c, in my src folder (since the .h files are included and those .c files are needed). I then created an R function that I thought would call upon the new C code:

    pnbeta2 <- function(x, a, b, lambda, low.tail, log.p)
    {
        res <- .C("pnbeta2", as.double(x), as.double(a), as.double(b),
            as.double(lambda), as.integer(low.tail), as.integer(log.p))
        return(list(res))
    }

But after I built the package and loaded it, the function doesn't work (it isn't even recognized). I have no idea why this is failing. Any information to help me figure out what I need to do to modify internal C code generally, or as it applies to this scenario specifically, would be most helpful.

Thanks,
Ken
Re: [R] Modifying Internal C Files
Hi Duncan and others. Thanks for your insight. Actually, I did change the function name in the pnbeta2.c code (I should have said so). When I check the package I get no errors. I build the package and all seems well. When I install and then load the package in R, I get:

    > pnbeta2
    function (x, a, b, lambda, low.tail, log.p)
    {
        res <- .Internal(pnbeta2, as.double(x), as.double(a), as.double(b),
            as.double(lambda), as.integer(low.tail), as.integer(log.p))
        return(list(res))
    }
    <environment: namespace:try>

which seems good, but the function does not work:

    > pnbeta2(.1, 10, 25, 2, TRUE, FALSE)
    Error in pnbeta2(0.1, 10, 25, 2, TRUE, FALSE) :
            7 arguments passed to '.Internal' which requires 1

Trying to call the .Internal function directly is perhaps where my problems begin:

    > .Internal(pnbeta2(.1, 10, 25, 2, TRUE, FALSE))
    Error in .Internal(pnbeta2(0.1, 10, 25, 2, TRUE, FALSE)) :
            no internal function pnbeta2

But when I do the same thing with the pnbeta internal function (which pnbeta2 is identical to at this point), I get a result:

    > .Internal(pnbeta(.1, 10, 25, 2, TRUE, FALSE))
    [1] 0.0006837318

So I'm at a loss for what is going on here; my pnbeta2 internal function doesn't seem to be there. What I want to do is simple (modify an internal C file), but it is proving quite difficult for me to implement. Thanks for any thoughts, Ken

Duncan Murdoch wrote:

On 11/7/2005 5:17 PM, Ken Kelley wrote:

Hi All. I want to tweak a few minor things inside of internal C code. I have my Win. XP machine set up to build packages (including C code), but I'm having problems getting the package to run correctly. In particular, I want to modify a few things inside of pnbeta.c (namely errmax and itrmax), which is what the pbeta() function calls upon when a noncentrality parameter is given. I copied the pnbeta.c C code, changed its name [to pnbeta2.c], and included the nmath.h and dpq.h header files, along with lgamma.c and pbeta.c, in my src folder. I then created an R function that I thought would call upon the new C code, but after I built the package and loaded it, the function doesn't work (it isn't even recognized). I have no idea why this is failing. Any information to help me figure out what I need to do to modify internal C code generally, or specifically as it applies to this scenario, would be most helpful.

You didn't say that you changed the name of the function, only the file that contained it. If this wasn't an oversight, then you should put

    double pnbeta2(double x, double a, double b, double lambda,
                   int lower_tail, int log_p)

in the appropriate place in your pnbeta2.c file.

The other thing you should do is to add a PACKAGE argument to your .C call, just in case there is already a pnbeta2 function somewhere else (or will be some day). In fact, if you do this, there should be no need to change the name of the function: R will look in the package DLL rather than R.dll to find it. Just make sure that the name in the .C call matches the declared name in the source.

Duncan Murdoch
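Putting Duncan's two suggestions together, the R side of the call might look like the sketch below. It assumes the C routine in pnbeta2.c has been given a .C-compatible wrapper that takes all of its arguments as pointers and writes the probability into the last one (the PACKAGE name "try" is the one shown in the namespace printout above):

    pnbeta2 <- function(x, a, b, lambda, low.tail = TRUE, log.p = FALSE)
    {
        out <- .C("pnbeta2",
                  as.double(x), as.double(a), as.double(b), as.double(lambda),
                  as.integer(low.tail), as.integer(log.p),
                  value = double(length(x)),  # output buffer filled by the C code
                  PACKAGE = "try")            # search only this package's DLL
        out$value
    }

The extra value argument is what distinguishes this from the double-returning pnbeta2() that Duncan quoted: .C cannot use a C return value, so the result has to come back through a pointer argument.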
Re: [R] Doubly Non-Central F-Distribution
Hi Wolfgang. I would be surprised if a doubly noncentral F-distribution function is available (and I encourage you to check its accuracy if it is). Although R allows pf() to have a noncentrality value specified, the results are not always accurate. I posted about this about a week ago (10/24/05) and didn't hear anything back regarding the accuracy issue. The bug I reported is here (#8251):

http://viggo.kubism.ku.dk/cgi-bin/R/Accuracy?id=8251;user=guest

I am very interested in (singly) noncentral F-distribution functions, so I too would like to learn of a doubly noncentral F-distribution function (the singly noncentral F is a special case of the doubly noncentral F), or else of an alternative or fix to pf() as it now stands.

Have a good day,
Ken

Viechtbauer Wolfgang (STAT) wrote:

Hello All, Has anyone written a function for the distribution function of a *doubly* non-central F-distribution? I looked through the archives, but didn't find anything. Thanks! Wolfgang
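For what it's worth, the doubly noncentral F CDF can be written as a Poisson mixture over the denominator noncentrality, with each term a rescaled singly noncentral F probability. A sketch (the function name is made up, and it inherits whatever accuracy problems pf() has with a large numerator ncp, per the bug report above):

    pdnf <- function(f, df1, df2, ncp1, ncp2, kmax = 1000)
    {
        ## condition on the Poisson index of the denominator chi-square:
        ## given k, the denominator is a central chi^2 with df2 + 2k df,
        ## so the ratio is a rescaled singly noncentral F
        k <- 0:kmax
        w <- dpois(k, ncp2 / 2)
        sum(w * pf(f * (df2 + 2 * k) / df2, df1, df2 + 2 * k, ncp = ncp1))
    }

    pdnf(2, 5, 10, 1, 0)  # with ncp2 = 0 this should equal pf(2, 5, 10, ncp = 1)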
[R] Problems with pf() with certain noncentral values/degrees of freedom combinations
Hello all. It seems that the pf() function can behave badly when used with noncentrality parameters. I've included some examples below; what happens is that for some combinations of df and ncp parameters, the same probability value is returned regardless of how large the quantile gets. At first glance noncentrality values greater than 200 may seem large, but in some contexts they are not large at all. The problems with pf() can thus have serious implications (for example, in the context of sample size planning). I noticed that in 1999 and 2000 issues with large degrees of freedom came up (PR#138), but I couldn't find the present issue reported anywhere. Might there be a way to make the algorithm more stable? I'm not sure how difficult this might be to fix, but hopefully it won't be too bad. Any thoughts on a workaround until then?

Thanks,
Ken Kelley

    # Begin example code
    X <- seq(10, 600, 10)

    # Gets stuck at .99135
    round(pf(X, 10, 1000, 225), 5)
    round(pf(X, 10, 200, 225), 5)
    round(pf(X, 5, 1000, 225), 5)
    round(pf(X, 5, 200, 225), 5)
    round(pf(X, 1, 1000, 225), 5)
    round(pf(X, 1, 200, 225), 5)

    # Gets stuck at .97035
    round(pf(X, 10, 1000, 250), 5)
    round(pf(X, 10, 200, 250), 5)
    round(pf(X, 5, 1000, 250), 5)
    round(pf(X, 5, 200, 250), 5)
    round(pf(X, 1, 1000, 250), 5)
    round(pf(X, 1, 200, 250), 5)

    # Gets stuck at .93539
    round(pf(X, 10, 1000, 275), 5)
    round(pf(X, 10, 200, 275), 5)
    round(pf(X, 5, 1000, 275), 5)
    round(pf(X, 5, 200, 275), 5)
    round(pf(X, 1, 1000, 275), 5)
    round(pf(X, 1, 200, 275), 5)
    # End example code

    > version
             _
    platform i386-pc-mingw32
    arch     i386
    os       mingw32
    system   i386, mingw32
    status
    major    2
    minor    2.0
    year     2005
    month    10
    day      06
    svn rev  35749
    language R
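Until the C-level limit changes, one workaround is to compute the noncentral F probability directly in R as a Poisson mixture of central F probabilities, taking as many terms as needed; this is slower than pf() but does not plateau. A sketch for a scalar quantile (use sapply() to vectorize; the function name is made up):

    pf.mix <- function(q, df1, df2, ncp, kmax = 5000)
    {
        ## noncentral F as a Poisson(ncp/2) mixture over the numerator:
        ## given k, the numerator is a central chi^2 with df1 + 2k df
        k <- 0:kmax
        w <- dpois(k, ncp / 2)
        sum(w * pf(q * df1 / (df1 + 2 * k), df1 + 2 * k, df2))
    }

    pf.mix(600, 10, 1000, 225)        # should continue climbing toward 1
    round(pf(600, 10, 1000, 225), 5)  # stalls at .99135 as shown above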
[R] Sometimes having problems finding a minimum using optim(), optimize(), and nlm() (while searching for noncentral F parameters)
    ## End Code

Am I going about this the best, or even a reasonable, way? Are there other functions I'm missing that would be more appropriate given what I'm trying to do? Any help would most certainly be appreciated.

Thanks,
Ken
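The example code above did not survive in the archive, but for the specific problem in the subject line a general optimizer may not be needed at all: pf() is monotone (decreasing) in the noncentrality parameter, so the ncp giving a target probability can be found with uniroot(). A sketch under that reading of the problem (the function name and search interval are made up, and the target probability must actually be attainable on the interval for the root to exist):

    ncp.for.prob <- function(Fobs, df1, df2, p)
    {
        ## find the ncp at which pf(Fobs, df1, df2, ncp) equals p;
        ## monotonicity in ncp makes this a bracketed root-finding problem
        uniroot(function(ncp) pf(Fobs, df1, df2, ncp = ncp) - p,
                interval = c(0, 1e5))$root
    }

    ## e.g., limits of a 95% confidence interval for the noncentrality:
    ## ncp.for.prob(Fobs, df1, df2, .975); ncp.for.prob(Fobs, df1, df2, .025)

This is, of course, subject to the pf() accuracy problems for large ncp described in the posts above.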
[R] Returning singular nlme objects.
Hi everyone. I'm working with nlme and I have a question regarding nlme fits that fail because of singularity issues. Specifically, is there a way to return an nlme object when the estimation process runs into a singular matrix? For example, can the results up to the point of an error such as

    Error in solve.default(pdMatrix(a, fact = TRUE)) :
            system is computationally singular

or

    Error in MEestimate(nlmeSt, grpShrunk) :
            Singularity in backsolve at level 0, block 1

be returned, rather than only an error message?

Setting the returnObject control option of nlme to TRUE seems to return an nlme object only when the maximum number of iterations is reached without meeting the convergence criterion. However, when a matrix becomes singular during estimation, the returnObject option does not return the nlme object up to that point (since the maximum number of iterations wasn't reached). If one reduces the number of iterations to a value less than the iteration at which the singularity occurs, the results are output (when returnObject=TRUE). What I want is for nlme simply to output the current estimates both when the maximum number of iterations is reached and when a singularity issue arises. I know it isn't generally good to make use of nonconverging results, but I'm interested in the estimates at the point where the singularity is realized. Specifically, I'll use the information in a simulation study and compare the results from the converging sets with the nonconverging sets, etc.

Also, suppose one has a set of data where the values are generally small. If one multiplies the data by some relatively large positive value, thus rescaling the data, will singular matrices in the estimation stage be less of a problem with the rescaled data than with the original data?

Thanks very much for any help and/or thoughts,
Ken
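That does not recover the partial estimates, but for the bookkeeping side of the simulation, one way to keep singular fits from killing the run while recording which iterations failed is to wrap each fit in tryCatch(). A sketch with made-up data and model (the nlme call stands in for whatever model is being fit):

    library(nlme)
    n.sim <- 1000
    fits  <- vector("list", n.sim)
    msgs  <- character(n.sim)
    for (i in 1:n.sim) {
        d <- sim.data[[i]]  # hypothetical list of simulated data sets
        fits[[i]] <- tryCatch(
            nlme(y ~ SSlogis(x, Asym, xmid, scal),
                 fixed = Asym + xmid + scal ~ 1,
                 random = Asym ~ 1 | group, data = d),
            error = function(e) { msgs[i] <<- conditionMessage(e); NULL })
    }
    which(msgs != "")  # iterations stopped by a singularity (or other) error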
[R] Individual log likelihoods of nlsList objects.
Hello all. I was wondering if the logLik.nls() and logLik.nlme() functions are still being used. Neither function seems to be available in the most recent release of R (1.9.1). The following is contained in the help file for logLik():

    classes which already have methods for this function include: 'glm',
    'lm', 'nls' and 'gls', 'lme' and others in package 'nlme'.

Thus, I was expecting that logLik.nls() and logLik.nlme() could be used for objects of class nls and nlme, respectively. Are these functions no longer needed because logLik() subsumes logLik.nls() and logLik.nlme() as special cases? Did/do logLik.nls() and logLik.nlme() have any advantages over logLik() when applied to nls and nlme objects?

On a related note, is there a way to get the log-likelihood of each individual from an nlsList object? On p. 349 of Pinheiro and Bates (2000) the logLik() function is said to give the sum of the individual nls log-likelihoods. However, logLik(some.nlsList.object) does not work for me (even when there are no NAs). Any ideas?

Thanks for any thoughts or feedback. Have a good day,
Ken
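On the nlsList question: an nlsList object is essentially a list of nls fits (with NULL entries for groups whose fit failed), so the individual log-likelihoods can be pulled out element by element. A sketch, assuming fm is an nlsList fit:

    ll <- sapply(fm, function(f) if (is.null(f)) NA else as.numeric(logLik(f)))
    ll                     # one log-likelihood per group
    sum(ll, na.rm = TRUE)  # their sum, cf. Pinheiro & Bates (2000, p. 349)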
[R] Ignoring Errors in Simulations
Hello all. I'm doing a simulation study and every so often I get an error that stops the simulation. I would like to ignore the errors *and* identify the particular iterations where they occurred. I have tried

    options(error = expression(NULL))

which I thought would ignore the error, but the simulation still stops when an error occurs. I don't think try() is a good idea because of the significant computational time (I think) it would add. Specifically, I am using factanal() from the mva library, and the error is:

    Error in factanal(Data, factors = 1, rotation = "none") :
            Unable to optimize from these starting value(s)

I am using R 1.7.1 on a Windows XP machine. Although increasing the number of starting values attempted would reduce the number of errors, I'm looking for a way to have the errors ignored and the (likely) untrustworthy results identified.

Thanks for any thoughts,
Ken
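For what it's worth, the overhead of try() is usually trivial next to the cost of factanal() itself, so a wrapper along the following lines may be the simplest route. A sketch (the data-generating step is hypothetical):

    library(mva)  # home of factanal() in R 1.7.x
    n.sim   <- 10000
    results <- vector("list", n.sim)
    bad     <- integer(0)
    for (i in 1:n.sim) {
        Data <- make.data(i)  # hypothetical data-generating function
        fit  <- try(factanal(Data, factors = 1, rotation = "none"))
        if (inherits(fit, "try-error"))
            bad <- c(bad, i)  # remember which iteration failed
        else results[[i]] <- fit
    }
    bad  # the iterations whose results should be treated as missing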
[R] Generating Data Sets -- Each with a Different Name
Hello all. I was wondering if anyone had insight into how I might generate a large number of data sets/vectors, where each has a different name? For example, suppose I wanted to generate N data sets/vectors, each with a different name, such as:

    Random.Data.1 <- rnorm(10, 0, 1)
    Random.Data.2 <- rnorm(10, 0, 1)
    Random.Data.3 <- rnorm(10, 0, 1)
    ...
    Random.Data.N <- rnorm(10, 0, 1)

Because I don't want to name each data set/vector myself, I want some sort of looping/automatic procedure to do it for me, but I'm not sure how. What I want is something conceptually analogous to the following:

    for(i in 1:N)
    {
        Random.Data.i <- rnorm(10, 0, 1)
    }

Note the i in the Random.Data.i vector. This is the value I want to change on each iteration of the loop, so that N data sets/vectors are automatically generated for me with different names. Does anyone know of a method to accomplish such a procedure?

Thanks for your thoughts,
Ken

P.S. For those of you wondering why I would ever want to do such a thing: I need a large number of data sets to use in a Monte Carlo study with specialized software. If I could do the simulation all in R, then obviously I could loop the whole procedure. The program I am using has a limited Monte Carlo facility where each of the (say 10,000) data sets must be generated elsewhere.
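The usual answer is assign() with a constructed name, with get() to retrieve the objects later; and since these data sets are destined for an external program, writing each one straight to its own file may serve even better. A sketch:

    N <- 10000
    for (i in 1:N) {
        Random.Data <- rnorm(10, 0, 1)
        ## create an object named Random.Data.i in the workspace ...
        assign(paste("Random.Data.", i, sep = ""), Random.Data)
        ## ... and/or write one file per data set for the external program
        write(Random.Data, file = paste("Random.Data.", i, ".txt", sep = ""),
              ncolumns = 1)
    }
    get("Random.Data.3")  # retrieve a generated vector by its constructed name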
[R] Denominator Degrees of Freedom in lme() -- Adjusting and Understanding Them
Hello all. I was wondering if there is any way to adjust the denominator degrees of freedom in lme(). It seems to me that only one method can be used. As has been pointed out previously on the list, the denominator degrees of freedom given by lme() do not match those given by SAS PROC MIXED or HLM5. PROC MIXED, for example, offers five different options for computing the denominator degrees of freedom. Is there any way to make such specifications in lme(), so that the degrees of freedom will correspond with the output given by PROC MIXED?

I've looked at Pinheiro and Bates' Mixed-Effects Models book (especially p. 91), but I still don't quite understand the method used for determining the degrees of freedom in lme(). When analyzing longitudinal data with the straight-line growth model (intercept and slope both having fixed and random effects), the degrees of freedom seem to be N*T - N - 1, where N is the total sample size and T is the number of timepoints (at least when the data are balanced). In the Pinheiro and Bates book (p. 91), the degrees of freedom are given as m_i - (m_{i-1} + p_i), where m_i is the number of groups at the ith level (with m_0 = 1 if an intercept is included) and p_i is the sum of the degrees of freedom corresponding to the terms estimated at level i. I'm not sure how N*T - N - 1 matches up with the formula given on page 91. It seems the number of groups (i.e., m_i) would equal N, the number of individuals (note that this is what is given as the number of groups in the summary of the lme() object). However, as more occasions of measurement are added, the degrees of freedom get larger, making it seem as though m_i represents the total number of observations, not the number of groups. For example, if N=2 and T=3, you end up with 3 degrees of freedom using lme(). Increasing T to 10 does not change the number of groups (N still equals 2), but the degrees of freedom increase to 17. In that situation SAS PROC MIXED would still have 1 degree of freedom (N-1) regardless of T, since the number of groups has not changed (just the number of observations per group).

Any insight into understanding the denominator degrees of freedom for the fixed effects would be appreciated. Since the degrees of freedom given by lme() can be made arbitrarily larger than those given by PROC MIXED (by having an arbitrarily large number of measurement occasions per individual), and since the degrees of freedom affect the standard errors, then the hypothesis tests, then the p-values, the difference between the methods is striking. It seems one method must be better than the other, given that they can differ so much.

Thanks and have a good one,
Ken

P.S. I have posted this to both the R and Multilevel Modeling lists.
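One can check lme()'s bookkeeping directly on balanced made-up data; the fixDF component of the fit holds the denominator df used for the t-tables. A sketch (the data and model here are hypothetical):

    library(nlme)
    N <- 10; T <- 4  # subjects and occasions
    d <- data.frame(id   = factor(rep(1:N, each = T)),
                    time = rep(0:(T - 1), N))
    d$y <- 1 + 0.5 * d$time + rep(rnorm(N), each = T) + rnorm(N * T)
    fit <- lme(y ~ time, random = ~ 1 | id, data = d)
    fit$fixDF$X  # time varies within subject, so it gets N*T - N - 1 = 29 df

Adding a random slope as well gives the straight-line growth model discussed above, and the within-subject terms still follow the N*T - N - 1 pattern described there.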