[R] The math underlying the `betareg' package?

2007-01-18 Thread Ajay Narottam Shah
Folks, The betareg package appears to be polished and works well. But I would like to look at the exact formulas for the underlying model being estimated, the likelihood function, etc. E.g. if one has to compute \frac{\partial E(y)}{\partial x_i}, this requires careful calculations through these f

[R] Pretty-printing multiple regression models

2006-08-31 Thread Ajay Narottam Shah
A few days ago, I had asked this question. Consider this situation: > x1 <- runif(100); x2 <- runif(100); y <- 2 + 3*x1 - 4*x2 + rnorm(100) > m1 <- summary(lm(y ~ x1)) > m2 <- summary(lm(y ~ x2)) > m3 <- summary(lm(y ~ x1 + x2)) You have estimated 3 different "competing" models, and suppose you wa

[R] Ooops, small mistake fixed (pretty printing multiple models)

2006-08-31 Thread Ajay Narottam Shah
The R code I just mailed out had a small error in it. This one works. Now what one needs is a way to get decimal alignment in LaTeX tabular objects. x1 <- runif(100); x2 <- runif(100); y <- 2 + 3*x1 - 4*x2 + rnorm(100) m1 <- summary(lm(y ~ x1)) m2 <- summary(lm(y ~ x2)) m3 <- summary(lm(y ~ x1 + x

[R] Presentation of multiple models in one table using xtable

2006-08-15 Thread Ajay Narottam Shah
Consider this situation: > x1 <- runif(100); x2 <- runif(100); y <- 2 + 3*x1 - 4*x2 + rnorm(100) > m1 <- summary(lm(y ~ x1)) > m2 <- summary(lm(y ~ x2)) > m3 <- summary(lm(y ~ x1 + x2)) Now you have estimated 3 different "competing" models, and suppose you want to present the set of models in one

[R] SUMMARY: making contour plots using (x,y,z) data

2006-07-01 Thread Ajay Narottam Shah
Folks, A few days ago, I had asked a question on this mailing list about making a contour plot where a function z(x,y) is evaluated on a grid of (x,y) points, and the data structure at hand is a simple table of (x,y,z) points. As usual, R has wonderful resources (and subtle complexity) in doing th
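For readers landing on this thread: one possible way to go from a plain (x,y,z) table to something contour() accepts is sketched below. It is not necessarily the approach the summary settled on. The data frame `d` and the surface in it are made up for illustration, and the sketch assumes every (x,y) combination appears exactly once.

```r
# Hypothetical (x, y, z) table: every combination of x and y appears once.
d <- expand.grid(x = c(0.00, 0.05, 0.10), y = c(20, 30, 40, 50))
d$z <- with(d, exp(-x * y / 10))   # made-up surface, for illustration only
d <- d[sample(nrow(d)), ]          # shuffle rows: row order should not matter

xs <- sort(unique(d$x))            # contour() wants increasing x and y
ys <- sort(unique(d$y))

# Fill a length(xs) by length(ys) matrix by looking up each row's grid position.
zmat <- matrix(NA_real_, nrow = length(xs), ncol = length(ys))
zmat[cbind(match(d$x, xs), match(d$y, ys))] <- d$z

contour(xs, ys, zmat, xlab = "x", ylab = "y")
```

If some grid cells are missing from the table, the corresponding entries of `zmat` stay NA and contour() leaves gaps there.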

[R] Puzzled with contour()

2006-06-26 Thread Ajay Narottam Shah
Folks, The contour() function wants x and y to be in increasing order. I have a situation where I have a grid in x and y, and associated z values, which looks like this: x y z [1,] 0.00 20 1.000 [2,] 0.00 30 1.000 [3,] 0.00 40 1.000 [4,] 0.00 50 1.0

[R] Solved ordered probit question

2006-03-30 Thread Ajay Narottam Shah
A few minutes ago I had asked why this didn't seem to work: # Simulate from probit model -- x1 <- 2*runif(5000) x2 <- 5*runif(5000) ystar <- 7 + 3*x1 - 4*x2 + rnorm(5000) y <- cut(ystar, breaks=c(-100, -5, 0, 5, 100)) table(y) library(MASS) summary(polr(y ~ x1 + x2, method="probit")) A little thi

[R] Ordered probit (Puzzled at MASS:polr())

2006-03-30 Thread Ajay Narottam Shah
This part, a vanilla probit, works perfectly -- # Simulate from probit model -- x1 <- 2*runif(5000) x2 <- 5*runif(5000) ystar <- 7 + 3*x1 - 4*x2 + rnorm(5000) y <- as.numeric(ystar>0) table(y) # Estimation using micEcon::probit() library(micEcon) summary(probit(y ~ x1 + x2)) #

[R] Interleaving elements of two vectors?

2006-03-05 Thread Ajay Narottam Shah
Suppose one has x <- c(1, 2, 7, 9, 14) y <- c(71, 72, 77) How would one write an R function which alternates between elements of one vector and the next? In other words, one wants z <- c(x[1], y[1], x[2], y[2], x[3], y[3], x[4], y[4], x[5], y[5]) I couldn't think of
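A minimal sketch of two ways this could be done in base R. The function names `interleave_equal` and `interleave` are made up here. The second one assumes that, when the vectors differ in length, the leftover elements of the longer vector should simply follow in order (rather than being padded with NA).

```r
x <- c(1, 2, 7, 9, 14)
y <- c(71, 72, 77)

# Equal-length case: rbind() stacks the vectors as rows, c() reads column-wise.
interleave_equal <- function(a, b) c(rbind(a, b))

# Unequal lengths: tag each element with its position, then sort on the tag.
# order() leaves ties in their original order, so a[i] comes before b[i].
interleave <- function(a, b) {
  c(a, b)[order(c(seq_along(a), seq_along(b)))]
}

z <- interleave(x, y)
z
# 1 71 2 72 7 77 9 14
```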

[R] Prediction when using orthogonal polynomials in regression

2006-01-27 Thread Ajay Narottam Shah
Folks, I'm doing fine with using orthogonal polynomials in a regression context: # We will deal with noisy data from the d.g.p. y = sin(x) + e x <- seq(0, 3.141592654, length.out=20) y <- sin(x) + 0.1*rnorm(10) d <- lm(y ~ poly(x, 4)) plot(x, y, type="l"); lines(x, d$fitted.values, col=
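A sketch of how prediction at new x values can look in current versions of R, assuming the model is fitted with the poly() term inside the formula. The names `dat`, `fit` and `newx` are made up here. The raw-polynomial fit is only a cross-check that both parametrisations give the same curve.

```r
set.seed(1)
x <- seq(0, pi, length.out = 20)
y <- sin(x) + 0.1 * rnorm(20)
dat <- data.frame(x = x, y = y)

fit <- lm(y ~ poly(x, 4), data = dat)

# predict() re-uses the polynomial basis stored at fit time, so new x values
# are evaluated on the same orthogonal polynomials as the original data.
newx <- data.frame(x = c(0.5, 1.0, 2.5))
p_orth <- predict(fit, newdata = newx)

# Cross-check: a raw polynomial of the same degree spans the same space.
fit_raw <- lm(y ~ poly(x, 4, raw = TRUE), data = dat)
p_raw <- predict(fit_raw, newdata = newx)

cbind(p_orth, p_raw)
```

Calling poly() by hand on the new x values would build a different basis, which is why the newdata route through predict() is used here.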

[R] Making a markov transition matrix - more progress

2006-01-23 Thread Ajay Narottam Shah
I solved the problem in one more (and more elegant) way. So here's the program again. Where does R stand on the Anderson-Goodman test of 1957? I hunted around and nobody seems to be doing this in R. Is it that there has been much progress after 1957 and nobody uses it anymore? # Problem statement

Re: [R] Making a markov transition matrix

2006-01-21 Thread Ajay Narottam Shah
On Sun, Jan 22, 2006 at 01:47:00PM +1100, [EMAIL PROTECTED] wrote: > If this is a real problem, here is a slightly tidier version of the > function I gave on R-help: > > transitionM <- function(name, year, state) { > raw <- data.frame(name = name, state = state)[order(name, year), ] > raw01 <-

[R] Making a markov transition matrix

2006-01-21 Thread Ajay Narottam Shah
Folks, I am holding a dataset where firms are observed for a fixed (and small) set of years. The data is in "long" format - one record for one firm for one point in time. A state variable is observed (a factor). I wish to make a markov transition matrix about the time-series evolution of that sta
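A minimal base-R sketch of one way to build such a matrix. It is not the function posted in the replies. The data frame `d` and the function name `transition_matrix` are hypothetical, and the sketch assumes each firm is observed in consecutive years with no gaps.

```r
# Hypothetical long-format data: one row per firm per year.
d <- data.frame(
  name  = rep(c("A", "B", "C"), each = 4),
  year  = rep(2001:2004, times = 3),
  state = c("lo", "lo", "hi", "hi",
            "hi", "lo", "lo", "lo",
            "lo", "hi", "hi", "lo")
)

transition_matrix <- function(name, year, state) {
  o <- order(name, year)                 # sort by firm, then by time
  name  <- name[o]
  state <- factor(state[o])
  n <- length(name)
  same_firm <- name[-1] == name[-n]      # keep only within-firm transitions
  counts <- table(from = state[-n][same_firm], to = state[-1][same_firm])
  prop.table(counts, margin = 1)         # each row sums to 1
}

P <- transition_matrix(d$name, d$year, d$state)
P
```

A state that never appears as a starting state gives a row of NaN, since there is nothing to normalise by.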

[R] Tobit estimation?

2006-01-19 Thread Ajay Narottam Shah
Folks, Based on http://www.biostat.wustl.edu/archives/html/s-news/1999-06/msg00125.html I thought I should experiment with using survreg() to estimate tobit models. I start by simulating a data frame with 100 observations from a tobit model > x1 <- runif(100) > x2 <- runif(100)*3 > ystar <- 2

[R] Rpart -- using predict() when missing data is present?

2005-10-08 Thread Ajay Narottam Shah
I am doing > library(rpart) > m <- rpart("y ~ x", D[insample,]) > D[outsample,] y x 8 0.78391922 0.579025591 9 0.06629211 NA 10 NA 0.001593063 > p <- predict(m, newdata=D[9,]) Error in model.frame(formula, rownames, variables, varnames, extras, extraname

[R] Placing axes label strings closer to the graph?

2005-10-01 Thread Ajay Narottam Shah
Folks, I have placed an example of a self-contained R program later in this mail. It generates a file inflation.pdf. When I stare at the picture, I see the "X label string" and "Y label string" sitting lonely and far away from the axes. How can these distances be adjusted? I read ?par and didn't f

Re: [R] Question on lm(): When does R-squared come out as NA?

2005-09-28 Thread Ajay Narottam Shah
On Wed, Sep 28, 2005 at 08:23:59AM +0100, Prof Brian Ripley wrote: > I've not seen a reply to this, nor ever seen it. > Please make a reproducible example available (do see the posting guide). It was a mistake on my part. Just in case others are able to recognise the situation, what was going on w

[R] Question on lm(): When does R-squared come out as NA?

2005-09-25 Thread Ajay Narottam Shah
I have a situation with a large dataset (3000+ observations), where I'm doing lags as regressors, where I get: Call: lm(formula = rj ~ rM + rM.1 + rM.2 + rM.3 + rM.4) Residuals: 1990-06-04 1994-11-14 1998-08-21 2002-03-13 2005-09-15 -5.64672 -0.59596 -0.04143 0.55412 8.18229 Coeffi

[R] update.packages() is broken?

2005-08-25 Thread Ajay Narottam Shah
Folks, I am using R 2.1.1 on Apple OS X 10.3. Earlier, I used to say $ sudo R > update.packages() and all the packages used to get installed. For several weeks, I noticed that nothing has been coming through. I used the R-for-Mac graphics console and I find that there are many packages where

[R] Problem with get.hist.quote() in tseries

2005-08-18 Thread Ajay Narottam Shah
When using get.hist.quote(), I find the dates are broken. This is with R 2.1.1 on Mac OS X `panther'. > library(tseries) Loading required package: quadprog 'tseries' version: 0.9-27 'tseries' is a package for time series analysis and computational finance. See 'library(help="tse

[R] Extracting some rows from a data frame - lapses into a vector

2005-08-15 Thread Ajay Narottam Shah
I have a data frame with one column "x": > str(data) `data.frame': 20 obs. of 1 variable: $ x: num 0.0495 0.0986 0.9662 0.7501 0.8621 ... Normally, I know that the notation dataframe[indexes,] gives you a new data frame which is the specified set of rows. But I find: > str(data[1:10,]) num
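For reference, a short sketch of the behaviour described and the usual way around it: row-indexing a one-column data frame drops to a vector unless `drop = FALSE` is given. The data here is simulated.

```r
set.seed(1)
data <- data.frame(x = runif(20))

v  <- data[1:10, ]                  # one column left, so the result is a vector
df <- data[1:10, , drop = FALSE]    # stays a data frame with 10 rows, 1 column

str(v)
str(df)
```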

[R] Panel data handling (lags, growth rates)

2005-08-14 Thread Ajay Narottam Shah
I have written two functions which do useful things with panel data a.k.a. longitudinal data, where one unit of observation (a firm or a person or an animal) is observed on a uniform time grid: - The first function makes lagged values of variables of your choice. - The second function makes g
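The functions themselves are not shown in this snippet. As a rough illustration of the idea only, here is a sketch using ave() to compute a within-unit lag and a percentage growth rate. The data frame `d` and the helper `lag_within` are made up, and the sketch assumes a uniform time grid with no missing periods.

```r
# Hypothetical panel: two units observed over four periods.
d <- data.frame(
  id = rep(c("a", "b"), each = 4),
  t  = rep(1:4, times = 2),
  x  = c(10, 11, 12, 14, 5, 6, 8, 9)
)
d <- d[order(d$id, d$t), ]              # sort by unit, then time

# Shift a vector down by k positions, padding the start with NA.
lag_within <- function(v, k = 1) c(rep(NA, k), head(v, -k))

# ave() applies the lag separately inside each unit, so lags never cross units.
d$x.lag1   <- ave(d$x, d$id, FUN = lag_within)
d$x.growth <- 100 * (d$x / d$x.lag1 - 1)   # percent change on previous period
d
```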

[R] Puzzled at rpart prediction

2005-08-03 Thread Ajay Narottam Shah
I'm in a situation where I say: > predict(m.rpart, newdata=D[N1+t,]) 0 1 173 0.8 0.2 which I interpret as meaning: an 80% chance of "0" and a 20% chance of "1". Okay. This is consistent with: > predict(m.rpart, newdata=D[N1+t,], type="class") [1] 0 Levels: 0 1 But I'm puzzled at the fol

Re: [R] Misbehaviour of DSE

2005-07-13 Thread Ajay Narottam Shah
On Mon, Jul 11, 2005 at 08:27:40AM -0700, Rob J Goedman wrote: > Ajay, > > After installing both setRNG (2004.4-1, source or binary) and dse > (2005.6-1, source only), it works fine. Thanks! :-) Now dse1 works, but I get: > library(dse2) Warning message: replacing previous import: acf in: name

[R] Puzzled at ifelse()

2005-07-12 Thread Ajay Narottam Shah
I have a situation where this is fine: > if (length(x)>15) { clever <- rr.ATM(x, maxtrim=7) } else { clever <- rr.ATM(x) } > clever $ATM [1] 1848.929 $sigma [1] 1.613415 $trim [1] 0 $lo [1] 1845.714 $hi [1] 1852.143 But this variant, using ifelse(),
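The rest of the message is cut off here, so what follows is a guess at the puzzle. ifelse() is vectorised and returns something shaped like its test, so with a length-1 test it keeps only the first element of a list-valued result. The function `f` below is a made-up stand-in for rr.ATM().

```r
# Hypothetical stand-in for rr.ATM(): returns a list with two components.
f <- function(v, maxtrim = 0) list(ATM = mean(v), trim = maxtrim)

x <- c(1845, 1850, 1852)

# if/else returns whichever whole object the chosen branch produces.
a <- if (length(x) > 15) f(x, maxtrim = 7) else f(x)

# ifelse() returns a result of the same length as the test (here 1),
# so only the first component of the list survives.
b <- ifelse(length(x) > 15, f(x, maxtrim = 7), f(x))

length(a)
length(b)
```

For choosing between two whole objects, plain if/else is the safer construct.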

Re: [R] Puzzled in utilising summary.lm() to obtain Var(x)

2005-06-14 Thread Ajay Narottam Shah
> > I have a program which is doing a few thousand runs of lm(). Suppose > > it is a simple model > > y = a + bx1 + cx2 + e > > > > I have the R object "d" where > > d <- summary(lm(y ~ x1 + x2)) > > > > I would like to obtain Var(x2) out of "d". How might I do it? > > > > I can, of course,

[R] Puzzled in utilising summary.lm() to obtain Var(x)

2005-06-14 Thread Ajay Narottam Shah
I have a program which is doing a few thousand runs of lm(). Suppose it is a simple model y = a + bx1 + cx2 + e I have the R object "d" where d <- summary(lm(y ~ x1 + x2)) I would like to obtain Var(x2) out of "d". How might I do it? I can, of course, always do sd(x2). But it would be much

[R] optim() does SANN, why not genetic algorithm (genoud)

2005-06-09 Thread Ajay Narottam Shah
It is a very nice touch that optim() offers SANN (simulated annealing) as a random search algorithm. The R community already has genoud - an implementation of a genetic algorithm for search. Wouldn't it be neat if optim() would additionally offer method="GA" where it internally uses code from gen

[R] R and MLE

2005-06-07 Thread Ajay Narottam Shah
I learned R & MLE in the last few days. It is great! I wrote up my explorations as http://www.mayin.org/ajayshah/KB/R/mle/mle.html I will be most happy if R gurus will look at this and comment on how it can be improved. I have a few specific questions: * Should one use optim() or should one

[R] The economist's term "fixed effects model" - plain lm() should work

2005-06-06 Thread Ajay Narottam Shah
> CAN YOU TELL ME HOW TO FIT FIXED-EFFECTS MODEL WITH R? THANK YOU! Ordinary lm() might suffice. In the code below, I try to simulate a dataset from a standard earnings regression, where log earnings is quadratic in experience, but the intercept floats by education category - you have 4 intercep
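The original code is cut off in this snippet. A minimal sketch of the idea, with made-up parameter values and variable names, could look like the following: the education factor enters lm() as a set of dummies, giving one intercept per category.

```r
set.seed(42)
n <- 2000
education  <- factor(sample(c("none", "school", "college", "beyond"),
                            n, replace = TRUE))
experience <- runif(n, 0, 30)

# Assumed "true" intercepts per education category (illustration only).
intercepts <- c(none = 1.0, school = 1.5, college = 2.0, beyond = 2.5)
log_earnings <- intercepts[as.character(education)] +
  0.08 * experience - 0.002 * experience^2 + rnorm(n, sd = 0.1)

# "- 1" drops the common intercept so each education level gets its own.
m <- lm(log_earnings ~ education - 1 + experience + I(experience^2))
coef(m)
```

With many units (thousands of firms, say) the dummy-variable approach gets slow and a dedicated panel estimator may be a better fit.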

[R] A performance anomaly

2005-06-06 Thread Ajay Narottam Shah
I wrote a simple log likelihood (for the ordinary least squares (OLS) model), in two ways. The first works out the likelihood. The second merely calls the first, but after transforming the variance parameter, so as to allow an unconstrained maximisation. So the second suffers a slight cost for one

[R] Solved: linear regression example using MLE using optim()

2005-05-31 Thread Ajay Narottam Shah
Thanks to Gabor for setting me right. My code is as follows. I found it useful for learning optim(), and you might find it similarly useful. I will be most grateful if you can guide me on how to do this better. Should one be using optim() or stats4::mle? set.seed(101) # F
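The posted code is truncated in this snippet, so the following is an independent sketch rather than a reconstruction. It fits a one-regressor linear model by maximum likelihood with optim(), estimating log(sigma) so the search is unconstrained. The data-generating values and the name `negloglik` are assumptions made for the example.

```r
set.seed(101)
n <- 500
x <- runif(n)
y <- 2 + 3 * x + rnorm(n, sd = 1.5)     # assumed d.g.p., illustration only
X <- cbind(1, x)

# Negative log-likelihood; theta = (beta0, beta1, log(sigma)).
# Working with log(sigma) keeps sigma positive without explicit constraints.
negloglik <- function(theta, y, X) {
  beta  <- theta[1:2]
  sigma <- exp(theta[3])
  -sum(dnorm(y, mean = X %*% beta, sd = sigma, log = TRUE))
}

fit <- optim(c(0, 0, 0), negloglik, y = y, X = X,
             method = "BFGS", hessian = TRUE)

beta_hat  <- fit$par[1:2]
sigma_hat <- exp(fit$par[3])

# The ML estimates of the coefficients should agree with OLS.
rbind(mle = beta_hat, ols = unname(coef(lm(y ~ x))))
```

The inverse of `fit$hessian` gives an approximate covariance matrix for (beta0, beta1, log(sigma)), which is one reason to ask optim() for it.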

Re: [R] R commandline editor question

2005-05-27 Thread Ajay Narottam Shah
> well ESS has such a facility. > > However, I think Mathematica has a super scheme: unbalanced brackets > show up > in red, making them obvious. > > This is particularly good for spotting wrongly interleaved brackets, as > in > > ([ blah di blah )] > > > > in which case both opening brac

[R] R commandline editor question

2005-05-27 Thread Ajay Narottam Shah
I am using R 2.1 on Apple OS X. When I get the ">" prompt, I find it works well with emacs commandline editing. Keys like M-f C-k etc. work fine. The one thing that I really yearn for, which is missing, is bracket matching. When I am doing something which ends in it is really useful to have e

[R] Catching an error with lm()

2005-05-24 Thread Ajay Narottam Shah
Folks, I'm in a situation where I do a few thousand regressions, and some of them are bad data. How do I get back an error value (return code such as NULL) from lm(), instead of an error _message_? Here's an example: > x <- c(NA, 3, 4) > y <- c(2, NA, NA) > d <- lm(y ~ x) Error in lm.fit(x, y, o
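One common way to get a return value instead of a stopped script is to wrap the call in tryCatch(). A minimal sketch follows, where `safe_lm` is a made-up helper name and returning NULL on failure is just one possible convention.

```r
x <- c(NA, 3, 4)
y <- c(2, NA, NA)

# Return the fitted model, or NULL if lm() signals an error.
safe_lm <- function(formula, ...) {
  tryCatch(lm(formula, ...), error = function(e) NULL)
}

d <- safe_lm(y ~ x)
is.null(d)    # no complete cases here, so the fit fails and d is NULL
```

In a loop over a few thousand regressions, the NULL results can then be filtered out afterwards, for example with `Filter(Negate(is.null), fits)`.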

[R] Summary: My question about factor levels versus factor labels.

2005-05-09 Thread Ajay Narottam Shah
Yesterday, I had asked for help on the list. Brian Ripley and Bruno Falissard had most kindly responded to me. Here is the solution. > factorlabels <- c("School", "College", "Beyond") > # 1 2 3 > education.man <- c(1,2,1,2,1,2,1,2) # PROBLEM: Level "

[R] Need a factor level even though there are no observations

2005-05-08 Thread Ajay Narottam Shah
I'm in this situation: factorlabels <- c("School", "College", "Beyond") with data for 8 families: education.man <- c(1,2,1,2,1,2,1,2) # Note : no "3" values education.wife <- c(1,2,3,1,2,3,1,2) # 1,2,3 are all present. My goal is to create this table:
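The target table is cut off in this snippet, so the layout below is a guess. The general point is that giving factor() the full set of levels keeps "Beyond" in the table even when nobody has that value.

```r
factorlabels   <- c("School", "College", "Beyond")
education.man  <- c(1, 2, 1, 2, 1, 2, 1, 2)   # no "3" values
education.wife <- c(1, 2, 3, 1, 2, 3, 1, 2)

# Spell out levels = 1:3 so the unused level is not silently dropped.
man  <- factor(education.man,  levels = 1:3, labels = factorlabels)
wife <- factor(education.wife, levels = 1:3, labels = factorlabels)

# Assumed layout: one row per education level, one column per spouse.
tab <- cbind(man = table(man), wife = table(wife))
tab
```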

[R] R on Mac OS X: odd errors when doing install.packages()

2005-05-03 Thread Ajay Narottam Shah
Should I be worried? The installation seems to go through fine and apparently nothing is broken. The errors I repeatedly get are like this: g++ -no-cpp-precomp -I/Library/Frameworks/R.framework/Resources/include -I/usr/ local/include -DUNIX -DOPTIM -DNONR -fno-common -g -O2 -c unif.cpp -o unif.

[R] Memory consumption, integer versus factor

2005-04-30 Thread Ajay Narottam Shah
R is so smart! I found that when you switch a column from integer to factor, the memory consumption goes down rather impressively. Now I'd like to learn more. How does R do this? What does R do? How do I learn more? I got to thinking: If I was really smart, I'd see that a factor with 2 levels req