[R] Problem with the aggregate command
Dear friends,

I have a data set with 23 columns and 38000 rows. It is a panel running from 1991 through 2005. I want to aggregate the data and get the median of each of the 23 columns for each year. In other words, my output should look like this:

Year  Median
1991  123
1992  145
1993  132
etc.

The sample lines of code for this operation are:

  set1 <- subset(as.data.frame(dataset), rep1 == 1)
  set2 <- subset(as.data.frame(dataset), rep1 == 0)
  lst <- list(unique(yeara))
  y1 <- aggregate(set1, lst, median)
  y2 <- aggregate(set2, lst, median)

However, I'm getting the following error:

  Error in FUN(X[[1]], ...) : arguments must have same length

Can somebody please help me with what I'm doing wrong here?

Thanks in advance
Regards
Anup

[[alternative HTML version deleted]]

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Problem with the Aggregate command (PS)
Dear Friends,

I forgot to add: the idea is to aggregate the entire dataset by year and get the median value for each of the columns. Hence the output should look like this:

Year  X1   X2   X3   ...
1991  10   20   30   ...
1992  30   20   10   ...
1993   4    5    6   ...
....
2005  100  200  300  ...

Thanks and Regards
Anup
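A minimal sketch of one way to get per-year medians, assuming the year is stored as a column of the data frame. The data frame `panel` and its columns `year`, `x1`, `x2` are made up for illustration and stand in for the 23 real columns. The "arguments must have same length" error is consistent with passing `unique(yeara)` as the grouping list: the grouping vector has to have one entry per row.

```r
# Toy panel; 'panel', 'year', 'x1', 'x2' are hypothetical names.
panel <- data.frame(
  year = rep(1991:1993, each = 4),
  x1   = c(1, 2, 3, 4,  10, 20, 30, 40,  5, 5, 5, 5),
  x2   = c(2, 4, 6, 8,   1,  1,  1,  9,  7, 8, 9, 10)
)

# Group by the full year column (one value per row), not by unique(year).
med <- aggregate(panel[, c("x1", "x2")],
                 by  = list(Year = panel$year),
                 FUN = median)
print(med)
```

The same call can be applied separately to each subset (for example the rows with `rep1 == 1`), as long as the grouping vector is taken from that same subset.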
Re: [R] Incomplete Gamma function
Hi Kris,

You just need to understand the mathematics of the incomplete gamma function and the various relationships it has. The answers from both Mathematica and R are correct; they are simply giving you different quantities. It depends on the way the incomplete gamma function is defined. For instance, to get the equivalent of the Mathematica result in R, you should do the following:

  answer <- gamma(9) - Igamma(9, 11.1)

This will give you the incomplete gamma for (9, 11.1) as given by Mathematica. You can read more about the function, and I am sure you will figure it out.

Regards
Anup
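For what it is worth, the two conventions can be sketched with base R alone. Mathematica's `Gamma[a, z]` is the upper incomplete gamma function, while `pgamma()` gives the regularized lower one, so the two pieces should add up to `gamma(a)`. `Igamma` comes from an add-on package and its convention is not checked here.

```r
a <- 9
x <- 11.1

# Lower incomplete gamma: integral of t^(a-1) * exp(-t) from 0 to x
lower <- gamma(a) * pgamma(x, shape = a)

# Upper incomplete gamma: the same integrand from x to Inf
upper <- gamma(a) * pgamma(x, shape = a, lower.tail = FALSE)

# The two pieces add up to the complete gamma function
print(c(lower = lower, upper = upper, total = lower + upper, gamma = gamma(a)))
```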
[R] Solving equations involving banded matrices
Dear friends,

I'm looking for a function which solves the system of equations Ax = B, where A is a positive definite banded matrix. I know that the command solve can be used to arrive at a solution, but does it work as well with banded matrices? In GAUSS the command bandsolpd achieves this. So, effectively, my question is whether there is an equivalent command in R.

Thanks in advance
Regards
Anup
[R] Random Sampling from a Matrix
Dear Friends,

I have a matrix of size 5000 x 20. The first two columns are indicator variables taking the value 0 or 1. Let us call the first two columns Y1 and Y2. I need to randomly sample 1000 rows with all the associated columns; in other words, my new matrix should be of size 1000 x 20. I realize that this command achieves it:

  newmat <- mainmat[sample(1000, replace=F), ]

However, I would like to make sure that both Y1 and Y2 have a more or less equal number of 0's and 1's. At present when I sample, I sometimes get cases where all my Y2's are 0. Is there any way to deal with this problem?

Thanks in advance.
Regards
Anup
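One possible approach is stratified sampling: draw the same number of rows from each of the four (Y1, Y2) combinations. The sketch below uses simulated data, so `mainmat` and its contents are made up for illustration, and it assumes every combination has at least 250 rows. Note also that `sample(1000)` only permutes the indices 1 to 1000; to sample from all rows one would use `sample(nrow(mainmat), 1000)`.

```r
set.seed(42)
n <- 5000
# Simulated stand-in for the real 5000 x 20 matrix
mainmat <- cbind(Y1 = rbinom(n, 1, 0.5),
                 Y2 = rbinom(n, 1, 0.2),
                 matrix(rnorm(n * 18), n, 18))

# One stratum per (Y1, Y2) combination
cell <- interaction(mainmat[, "Y1"], mainmat[, "Y2"])

# Draw 250 row indices without replacement from each of the 4 strata
idx <- unlist(lapply(split(seq_len(n), cell),
                     function(rows) sample(rows, 250)))

newmat <- mainmat[idx, ]
dim(newmat)                      # 1000 rows, 20 columns
colMeans(newmat[, c("Y1", "Y2")])  # both exactly balanced by construction
```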
[R] Creating a data set within a function
Dear Friends,

I'm trying to find out whether there is a way to automate the creation of the design matrix. Suppose we are interested in running an autoregressive model. The user inputs the following:

  myfunAR <- function(y, order) { ... }

Here y is the data series and order represents the order of the process. In other words, if order=2 then we have an AR(2) process. It is easy to create the y vector within the function, but I'm not clear on how to create the design matrix. For instance, if order=2 then

  y <- as.matrix(rnorm(100))
  ynew <- as.matrix(y[3:nrow(y), 1])
  x <- as.matrix(cbind(rep(1, nrow(y)-2), y[2:(nrow(y)-1), 1], y[1:(nrow(y)-2), 1]))

ynew and x give me the response vector and design matrix respectively. However, I'm trying to write a general function which will accommodate any order. Hence, given that the user inputs y and the order, is there a way to program the creation of the x matrix automatically? The long way would be

  if (order == 1) {...}
  if (order == 2) {...}

but this will force me to stop at some point. Is there an alternative way to program this?

Thanks in advance
Regards
Anup
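A sketch of one way to do this for any order uses `embed()` from the stats package, which lays out a series next to its own lags. The function name `make_ar_design` is made up for illustration.

```r
# Build the response vector and design matrix for an AR(order) regression.
make_ar_design <- function(y, order) {
  y <- as.numeric(y)
  # embed() returns columns y_t, y_{t-1}, ..., y_{t-order}
  lagged <- embed(y, order + 1)
  list(ynew = lagged[, 1],
       x    = cbind(1, lagged[, -1, drop = FALSE]))  # intercept + lags
}

set.seed(1)
y   <- rnorm(100)
ar2 <- make_ar_design(y, 2)
dim(ar2$x)   # 98 rows; intercept, lag 1, lag 2
```

For order=2 this should reproduce the ynew and x built by hand in the post, and the same call works unchanged for order=5 or any other value smaller than the series length.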
[R] Help with optimization using GENOUD
Dear Friends,

I have been trying to learn how to use the derivative-free optimization algorithms implemented in the package rgenoud by Mebane and Sekhon. However, I cannot get it to work, most likely because of my own ignorance. If anybody has experience using this package, it would be really helpful if you could point out where I'm making a mistake.

Thanks in advance
Anup

Sample code attached:

  library(rgenoud)
  nobs <- 5000
  t.beta <- c(0, 1, -1)
  X <- as.matrix(cbind(rep(1, nobs), runif(nobs), runif(nobs)))  # Creating the design matrix
  prodterm <- (X %*% t.beta) + rnorm(nrow(X))
  Y <- as.matrix(ifelse(prodterm < 0, 0, 1))

  # Defining the likelihood function
  log.like <- function(beta, Y, X) {
    term1 <- pnorm(X %*% beta)
    term2 <- 1 - term1
    loglik <- (sum(Y * log(term1)) + sum((1 - Y) * log(term2)))  # Likelihood function to be maximized
  }

  stval <- c(0, 0, 0)
  opt.output <- optim(stval, log.like, Y=Y[,1], X=X[,1:3], hessian=T, method="BFGS", control=c(fnscale=-1, trace=1))
  opt.output

  ### Now using GENOUD gives me errors
  genoud.output <- genoud(log.like, beta=stval, X=X[,1:3], Y=Y[,1], nvars=3, pop.size=3000, max=TRUE)
[R] Passing equations as arguments
Friends,

I'm trying to pass an equation as an argument to a function. The idea is as follows. Let us say I write an independent function.

Ideal situation:

  ifunc <- function(x) { return((x*x)-2) }

  mainfunc <- function(a, b) {
    evala <- ifunc(a)
    evalb <- ifunc(b)
    if (evalaevalb){return(evala)} else return(evalb)
  }

Now I want to write this entire program as a single function, with the user specifying the equation as an argument to the function.

  myfunc <- function(a, b, eqn) {
    func1 <- function (x) ??
    { return(eqn in terms of x) ?? }
    Further arguments to check

The ?? marks the parts that do not seem to be correct. The question is how to pass the equation expression from the main function into the inner function. Is there any way to do that within this setup?

Thanks in advance
Regards
Anup
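Functions are ordinary objects in R, so one option is simply to pass the function itself. A rough sketch follows; since the comparison in the post did not survive the archive, returning the smaller of the two values is an assumption made here for illustration. The second variant, `myfunc_expr`, is a hypothetical name showing how an unevaluated expression in x could be passed instead.

```r
# Variant 1: the user passes a function of x.
myfunc <- function(a, b, eqn) {
  evala <- eqn(a)
  evalb <- eqn(b)
  # Assumed rule: return the smaller evaluation
  if (evala < evalb) evala else evalb
}
myfunc(1, 3, function(x) (x * x) - 2)

# Variant 2: the user passes an unevaluated expression in terms of x.
myfunc_expr <- function(a, b, eqn) {
  func1 <- function(x) eval(eqn, list(x = x))  # evaluate eqn with this x
  evala <- func1(a)
  evalb <- func1(b)
  if (evala < evalb) evala else evalb
}
myfunc_expr(1, 3, quote((x * x) - 2))
```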
[R] row index
Dear Friends,

Suppose I have a vector as follows:

RI  Value
1   10
2   11
3   8
4   4
6   12

I would like a function which returns the row index number of the minimum value of Value. Can somebody please give me some direction on how I can do this? In effect, I'm trying to find a comparable function for the GAUSS command minindc.

From the GAUSS manual:
minindc: Returns the row number of the smallest element in each column of a matrix

Thanks in advance
Anup
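In base R, `which.min()` returns the position of the first minimum of a vector, and applying it over columns should give something close to what minindc is described as doing. A small sketch, where the second column `b` is made up for illustration:

```r
value <- c(10, 11, 8, 4, 12)
which.min(value)        # position of the smallest element

# Column-wise version for a matrix
m <- cbind(a = value, b = c(3, 1, 2, 5, 0))
row_of_min <- apply(m, 2, which.min)
row_of_min
```

Note that this is the position within the vector. If the RI labels are not simply 1, 2, 3, ... (as in the sample, where the last label is 6), the label can be looked up with the position, for example `RI[which.min(value)]`.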
[R] Random numbers from skewed distributions
Dear Friends,

I was wondering if there is any package to get random numbers from the Burr 10 distribution. I checked the rmutil and actuar packages. Both seem to implement the Burr 12 distribution.

Thanks in advance
Regards
Anup
[R] Source code for rlogis
Dear friends,

I was trying to read the source code for rlogis but ran into a roadblock. It shows

  [[1]]
  function (n, location = 0, scale = 1)
  .Internal(rlogis(n, location, scale))
  <environment: namespace:stats>

Is it possible to access the source code for this?

Sincerely
Anup
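The `.Internal` call means the work is done in compiled C code, which lives in the R source distribution rather than in the installed package. As a rough illustration of the idea only, logistic variates can be generated by inverting the CDF of a uniform draw. The function `my_rlogis` below is a made-up name and is not a copy of R's implementation.

```r
# Inverse-CDF sketch: if U ~ Uniform(0, 1), then
# location + scale * log(U / (1 - U)) follows a logistic distribution.
my_rlogis <- function(n, location = 0, scale = 1) {
  u <- runif(n)
  location + scale * log(u / (1 - u))
}

set.seed(123)
draws <- my_rlogis(10000, location = 2, scale = 0.5)
c(mean = mean(draws), median = median(draws))
```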
[R] Data consistency checks in functions
Dear friends,

I'm writing a function with three arguments:

  myfun <- function(theta, X, values) { }

In this function I'm trying to write consistency checks. To compute the statistic of interest I only need theta and values. The idea of having X in there is that, if values is not provided by the user, then values is computed from X.

For instance, if I say output <- myfun(beta, val1), how do I ensure that R reads this as passing arguments to theta and values? In other words, is it possible to bypass X completely if values is provided? Also, how can R recognize the second argument as being values and not X? This is important because X is a matrix and values is a vector, so any check using the dimensions of either one will run into trouble if the arguments are not matched correctly.

Thanks in advance
Sincerely
Anup
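Unnamed arguments are matched by position, so `myfun(beta, val1)` would bind `val1` to X. One common pattern is to give both optional arguments a `NULL` default and have the caller name the one supplied. A sketch, in which the way values is computed from X and the statistic itself are placeholders invented for illustration:

```r
myfun <- function(theta, X = NULL, values = NULL) {
  if (is.null(values)) {
    if (is.null(X)) stop("supply either 'X' or 'values'")
    # Consistency checks on X, only needed when values is derived from it
    stopifnot(is.matrix(X), ncol(X) == length(theta))
    values <- as.vector(X %*% theta)   # placeholder computation
  }
  # Consistency checks on values: a plain numeric vector
  stopifnot(is.numeric(values), is.null(dim(values)))
  sum(theta) + sum(values)             # placeholder statistic
}

beta <- c(1, 2)
val1 <- c(1, 1)
myfun(beta, values = val1)   # X is bypassed completely
myfun(beta, X = diag(2))     # values is computed from X
```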
[R] Help with conditional lagging of data
Dear Friends,

I have some data with three columns named ID, Year and Measure X. I need to create a column which gives me a lag for each ID (note: not a continuous lag), that is, a lag conditional on the ID and the given year. Please find below a sample of the data.

Input file sample:

ID    Year  X
AB12  2000  100
AB12  2001  120
AB12  2002  140
AB12  2003  80
BL14  2000  180
BL14  2001  150
CR93  2000  45
CR93  2001  49
CR93  2002  56
CR93  2003  67

Expected output from this data:

ID    Year  Xlag
AB12  2000  .
AB12  2001  20
AB12  2002  20
AB12  2003  -60
BL14  2000  .
BL14  2001  -30
CR93  2000  .
CR93  2001  5
CR93  2002  7
CR93  2003  9

Can somebody please help me with how to implement this in R?

Thanks.
Sincerely
Anup
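A sketch of one base R approach: sort by ID and Year, then take first differences within each ID using `ave()`. The data frame name `dat` is made up; the values are the sample from the post. Computed from those X values, the CR93 differences come out as 4, 7 and 11.

```r
dat <- data.frame(
  ID   = c("AB12", "AB12", "AB12", "AB12", "BL14", "BL14",
           "CR93", "CR93", "CR93", "CR93"),
  Year = c(2000, 2001, 2002, 2003, 2000, 2001, 2000, 2001, 2002, 2003),
  X    = c(100, 120, 140, 80, 180, 150, 45, 49, 56, 67)
)

# Make sure rows are in time order within each ID
dat <- dat[order(dat$ID, dat$Year), ]

# Within each ID: NA for the first year, then X minus the previous X
dat$Xlag <- ave(dat$X, dat$ID, FUN = function(v) c(NA, diff(v)))
print(dat)
```

This assumes consecutive years within each ID; if years can be missing, the difference is taken against the previous available row rather than the previous calendar year.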
[R] Scale mixture of normals
Dear Friends,

Is there an R package which implements regression models with error distributions following a scale mixture of normals?

Thanks
Anup
Re: [R] Sum per hour
Hi Jessica,

This should possibly do it...

  year <- as.data.frame(c("2000-10-03","2000-10-03","2000-10-03","2000-10-03","2000-10-03","2000-10-03","2000-10-03","2000-10-04","2000-10-04","2000-10-04"))
  colnames(year) <- c("year")
  time <- as.data.frame(c("14:00:00","14:10:00","14:20:00","15:30:00","16:40:00","16:50:00","17:00:00","17:10:00","17:20:00","18:30:00"))
  colnames(time) <- c("time")
  precipitation <- c(0,0.1,0,0,0,0,0.2,0.3,0.5,6)
  DATA <- (cbind(year, time, precipitation))
  tapply(precipitation, year, sum)
  tapply(precipitation, time, sum)
Re: [R] Sum per hour
Oops, error on my part... I forgot that we need to get the day and hour out of the string...

[EMAIL PROTECTED] wrote:

Dear all,

I have a list of precipitation records and a list of times. I would like to sum them up per hour, or per day. Does such a function exist?

Example:

  time <- c("2000-10-03 14:00:00","2000-10-03 14:10:00","2000-10-03 14:20:00","2000-10-03 15:30:00","2000-10-03 16:40:00","2000-10-03 16:50:00","2000-10-03 17:00:00","2000-10-03 17:10:00","2000-10-03 17:20:00","2000-10-03 18:30:00","2000-10-04 14:00:00","2000-10-04 14:10:00","2000-10-04 14:20:00","2000-10-04 15:30:00","2000-10-04 16:40:00","2000-10-04 16:50:00","2000-10-04 17:00:00","2000-10-04 17:10:00","2000-10-04 17:20:00","2000-10-04 18:30:00")
  precipitation <- c(0,0.1,0,0,0,0,0.2,0.3,0.5,6,7,8,9,1,0,0,0,0,1,0)
  DATA <- cbind(time, precipitation)

... ? How do I sum up per hour?

Thanks in advance
Jessica
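One way to get the day and hour out of the string is to parse the timestamps and group on a formatted version of them. A sketch using Jessica's data follows; the time zone "UTC" is an assumption made so that the parsing does not depend on the local settings.

```r
clock <- c("14:00:00", "14:10:00", "14:20:00", "15:30:00", "16:40:00",
           "16:50:00", "17:00:00", "17:10:00", "17:20:00", "18:30:00")
time <- c(paste("2000-10-03", clock), paste("2000-10-04", clock))
precipitation <- c(0, 0.1, 0, 0, 0, 0, 0.2, 0.3, 0.5, 6,
                   7, 8, 9, 1, 0, 0, 0, 0, 1, 0)

stamp <- as.POSIXct(time, tz = "UTC")

# Group label = date plus hour, e.g. "2000-10-03 14"
per_hour <- tapply(precipitation, format(stamp, "%Y-%m-%d %H"), sum)

# Group label = date only
per_day <- tapply(precipitation, format(stamp, "%Y-%m-%d"), sum)

per_hour
per_day
```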
[R] A programming question
Dear Friends,

My problem is how to compute probabilities from a probit model by changing one independent variable while keeping the others constant. A simple toy example is as follows. The ranges of my variables are:

  y = 0 or 1, x1 = -10 to 10, x2 = -40 to 100, x3 = -5 to 5

Model:

  output <- glm(y ~ x1+x2+x3 -1, family=binomial(link="probit"))
  outcoef <- output$coef
  xbeta <- as.matrix(cbind(x1, x2, x3))
  predprob <- pnorm(xbeta %*% outcoef)

Now I have the predicted probabilities for y=1 as defined above. My problem is this: keep x2 at 20 and x3 at 2, then compute the predicted probability (predprob) for the entire range of x1, i.e. from -10 to 10 with an increment of 1. So I need the predicted probabilities when x1=-10, x1=-9, ..., x1=9, x1=10, keeping the others constant. Can somebody give me some direction on how this can be programmed?

Thanks in advance for your help
Sincerely
Anup
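A sketch of one way to do this: build a small data frame holding the grid of x1 values with x2 and x3 fixed, and pass it to `predict()`. The data below are simulated, with made-up coefficients, purely so the example runs on its own.

```r
set.seed(7)
n  <- 500
x1 <- runif(n, -10, 10)
x2 <- runif(n, -40, 100)
x3 <- runif(n, -5, 5)
# Simulated response; the coefficients 0.1, 0.01, -0.2 are arbitrary
y  <- rbinom(n, 1, pnorm(0.1 * x1 + 0.01 * x2 - 0.2 * x3))

output <- glm(y ~ x1 + x2 + x3 - 1, family = binomial(link = "probit"))

# x1 varies from -10 to 10 in steps of 1; x2 and x3 are held fixed
grid <- data.frame(x1 = seq(-10, 10, by = 1), x2 = 20, x3 = 2)
grid$predprob <- predict(output, newdata = grid, type = "response")

# The same numbers by hand, as in the post
byhand <- pnorm(as.matrix(grid[, c("x1", "x2", "x3")]) %*% coef(output))
head(grid)
```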
[R] Bootstrapped standard errors
Dear Friends,

I'm trying to learn how to get bootstrapped standard errors for estimated coefficients from a regression. For instance, suppose I have the following model:

  logitmodel <- glm(y ~ X1+X2+X3, family=binomial(link="logit"))
  beta <- logitmodel$coef

Can somebody please guide me on how to use the package boot to obtain bootstrapped SEs for the associated betas?

Thanks in advance
Anup
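A minimal sketch of a case-resampling bootstrap with `boot()`: the statistic function refits the model on the resampled rows and returns the coefficients. The data are simulated and the names `dat` and `coef_fun` are made up for illustration; R = 200 replicates is an arbitrary small number to keep the example quick.

```r
library(boot)

set.seed(11)
n   <- 300
dat <- data.frame(X1 = rnorm(n), X2 = rnorm(n), X3 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.5 + dat$X1 - 0.5 * dat$X2))

# statistic(data, indices): refit on the resampled rows, return the betas
coef_fun <- function(data, indices) {
  fit <- glm(y ~ X1 + X2 + X3, family = binomial(link = "logit"),
             data = data[indices, ])
  coef(fit)
}

bs <- boot(data = dat, statistic = coef_fun, R = 200)

# Bootstrapped standard errors: SD of each coefficient across replicates
boot_se <- apply(bs$t, 2, sd)
names(boot_se) <- names(bs$t0)
boot_se
```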
[R] Data Manipulation using R
Dear Friends,

I have a data set with around 220,000 rows and 17 columns. One of the columns is an id variable which is grouped from 1000 through 9000. I need to perform the following operations.

1) Remove all the observations with ids between 6000 and 6999. I tried this method:

  remdat1 <- subset(data, ID < 6000)
  remdat2 <- subset(data, ID >= 7000)
  donedat <- rbind(remdat1, remdat2)

I checked the first and last entries and found no ID values in the 6000 range. Therefore I think that this might be correct, but is it the most efficient way of doing this?

2) I need to remove observations when the values in columns 3, 4, 6 and 8 are negative. For instance, if the number in column 3 is -4, then I need to delete the entire observation. Can somebody help me with this too?

Thanks and Regards
Anup
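Both steps can be done with logical indexing in a single pass, without the `rbind()`. A sketch on simulated data, where `dat` is a made-up stand-in for the real data set:

```r
set.seed(3)
dat <- data.frame(ID = sample(1000:9000, 200, replace = TRUE),
                  matrix(round(rnorm(200 * 8), 1), 200, 8))

# 1) Keep rows whose ID is outside 6000..6999
keep_id <- dat$ID < 6000 | dat$ID >= 7000

# 2) Keep rows with no negative value in columns 3, 4, 6 and 8
keep_pos <- rowSums(dat[, c(3, 4, 6, 8)] < 0) == 0

donedat <- dat[keep_id & keep_pos, ]
nrow(donedat)
```

If the real columns contain missing values, the comparison yields NA and the row is dropped from the index in an unhelpful way, so `na.rm = TRUE` in `rowSums()` may be needed.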
[R] Marginal Effects from GLM
Dear Friends,

Is there a direct way to extract the marginal effects when running discrete choice models such as Probit or Logit using glm?

Thanks and Regards
Anup
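Base `glm()` does not report marginal effects directly, but for continuous regressors they can be computed by hand from the fitted index. A sketch for a probit on simulated data follows; for a logit, `dlogis()` would replace `dnorm()`. The names `ame` and `mem` are made up for illustration.

```r
set.seed(5)
n  <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, pnorm(0.3 + 0.8 * x1 - 0.5 * x2))

fit <- glm(y ~ x1 + x2, family = binomial(link = "probit"))
b   <- coef(fit)
xb  <- predict(fit, type = "link")          # fitted index x'beta

# Average marginal effect: mean over observations of phi(x'beta) * beta_k
ame <- mean(dnorm(xb)) * b[-1]

# Marginal effect at the means: phi(xbar'beta) * beta_k
mem <- dnorm(sum(colMeans(model.matrix(fit)) * b)) * b[-1]

rbind(ame, mem)
```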
[R] Random Sequence
Dear Friends,

I'm trying to generate a sequence of 100 observations, each either 1 or -1. In other words, the sequence should look something like this:

  y = 1 1 -1 1 -1 -1 -1 1 1 ..

Can somebody please give me some direction on how I can do this in R?

Thanks
Anup
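Two quick ways to do this, assuming each value should be 1 or -1 with equal probability (the post does not say which probabilities are wanted):

```r
set.seed(1)

# Draw 100 values from {-1, 1} with replacement
y <- sample(c(-1, 1), size = 100, replace = TRUE)

# Equivalent idea: map Bernoulli 0/1 draws to -1/1
y2 <- 2 * rbinom(100, size = 1, prob = 0.5) - 1

table(y)
```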
Re: [R] Query about substituting characters in a df
Hi Lalitha,

You can try it this way. I think it should solve the problem. The data look like this:

Y  X1
1  1
2  1
3  0
4  1
5  1
6  1

  anal.data <- read.table("file.txt", header=T)
  attach(anal.data)
  ## objective is to change the value 1 in column X1 to Gamma
  anal.data$X1 <- replace(anal.data$X1, anal.data$X1==1, "Gamma")
  anal.data

This should replace all the 1's with Gamma. Hope this helps.

Sincerely

"However bad life may seem, there is always something you can do and succeed at. While there is life, there is hope." Stephen Hawking

Anup Menon Nandialath
http://www.soundclick.com/bands/7/tailgunner_music.htm
[R] matrix manipulations
Dear friends,

I have a basic question about R. I'm generating a set of random variables and then combining them using the cbind statement. The code is given below.

  for (i in 1:100) {
    y <- rpois(i, lambda=10)
    X0 <- seq(1, 1, length=i)
    X1 <- rnorm(i, mean=5, sd=10)
    X2 <- rnorm(i, mean=17, sd=12)
    X3 <- rnorm(i, mean=3, sd=24)
    ind <- rep(1:5, 20)
  }
  data100 <- cbind(y, X0, X1, X2, X3, ind)

But when I look at the data100 table, the y values seem to take the observation count, i.e. the data under y are not the Poisson random variates but the observation numbers. Hence the last vector (ind) does not have a header. Is there any way I can stop the observation counts from being added to the matrix?

Thanks in advance for your help.
Sincerely
Anup
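A sketch of the same data generated without the loop, since each generator can produce all 100 values in one call. The leading numbers shown when the object is printed are row labels rather than a data column, which is probably why the headers look shifted by one.

```r
set.seed(100)
n <- 100

data100 <- data.frame(
  y   = rpois(n, lambda = 10),
  X0  = rep(1, n),
  X1  = rnorm(n, mean = 5,  sd = 10),
  X2  = rnorm(n, mean = 17, sd = 12),
  X3  = rnorm(n, mean = 3,  sd = 24),
  ind = rep(1:5, 20)
)

head(data100)                              # row labels 1..6 are not a column
print(head(data100), row.names = FALSE)    # same rows without the labels
```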
Re: [R] Random Integers
Thanks Andy, Alberto, Charles, Paul and Pierre.

I needed to simulate a set of counts to test a Poisson regression model. Would the best option therefore be, as pointed out, to use rpois(.,.)?

Sincerely
Anup
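For testing a Poisson regression, one common setup is to let the mean depend on a covariate through a log link and then check that the fit recovers the coefficients. A sketch with made-up true values 0.5 and 1.2:

```r
set.seed(2007)
n <- 2000
x <- runif(n)

# Counts whose mean follows log(lambda) = 0.5 + 1.2 * x
y <- rpois(n, lambda = exp(0.5 + 1.2 * x))

fit <- glm(y ~ x, family = poisson)
coef(fit)   # should be close to the true values used above
```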