[R] Compiling R 2.4.0 in ubuntu/linux
I'm not sure if this is the place to post this question, but I am having trouble compiling the source code. I do have a suitable C compiler and f2c, but I get this error when I run ./configure:

configure: error: --with-readline=yes (default) and headers/libs are not available

Any ideas? Thanks.

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] whole object approach for nested loops
I have the following code that I am trying to execute using the whole object approach, getting rid of the for loop. I have looked at the manual and searched the database for examples or similar questions, with no luck. The following example works without any problems:

for (j in 1:186) { entropy.cogp[1:3, j]<-alpha3[1:3]*c[j,2] }

But when I try to remove the for loop and use

entropy.cogp[1:3, 1:186]<-alpha3[1:3]*c[1:186,2]

R tries to multiply the first member of alpha3 with the first member of c[,2] and once c is exhausted, it multiplies the 187th member of alpha3 with the first member of c[,2] and so on, resulting in an error where it requires the size of alpha3 to be an exact multiple of the size of c. This is clearly not what is intended by the for loop given above.

Is there a way to do this using a whole object approach to make things run faster? Or is the for loop the only way of doing this? Thanks...
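One possible whole-object version, offered as a sketch rather than a definitive answer: the loop fills column j with alpha3[1:3] scaled by c[j,2], which is an outer product, so base R's outer() should reproduce it. The name cmat and the made-up values below are stand-ins for the poster's c and real data; they do not come from the post.

```r
# Hypothetical stand-ins for the poster's objects
alpha3 <- c(0.2, 0.3, 0.5)
cmat <- matrix(seq_len(186 * 2), nrow = 186, ncol = 2)

# Loop version, as in the post
loop_result <- matrix(NA_real_, nrow = 3, ncol = 186)
for (j in 1:186) {
  loop_result[1:3, j] <- alpha3[1:3] * cmat[j, 2]
}

# Whole-object version: element [i, j] is alpha3[i] * cmat[j, 2]
outer_result <- outer(alpha3[1:3], cmat[1:186, 2])

# Both approaches should agree
print(isTRUE(all.equal(loop_result, outer_result)))
```

The result could then be assigned in one step with entropy.cogp[1:3, 1:186] <- outer_result, assuming entropy.cogp has at least 3 rows and 186 columns.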
Re: [R] matrix logic
Uwe,

FYI: I tried:

data3 <- ifelse(is.na(data1), data2, data1)

It seems to me that data3 is an array of length 100. I do NOT end up with a dataset of 5 columns and 20 rows.

Uwe Ligges <[EMAIL PROTECTED]> wrote:

Tom wrote:
> On Tue, 10 Jan 2006 20:25:23 -0500, r user wrote:
>
>> I have 2 dataframes, each with 5 columns and 20 rows.
>> They are called data1 and data2. I wish to create a
>> third dataframe called data3, also with 5 columns and
>> 20 rows.
>>
>> I want data3 to contain the values in data1 when the
>> value in data1 is not NA. Otherwise it should contain
>> the values in data2.
>>
>> I have tried a few methods, but they do not seem to
>> work as intended:
>>
>> data3<-ifelse(is.na(data1)=F,data1,data2)
>>
>> and
>>
>> data3[,]<-ifelse(is.na(data1[,])=F,data1[,],data2[,])
>>
>> Please suggest the "best" way.

"Better" way is to have the syntax correct:

data3 <- ifelse(is.na(data1), data2, data1)

Please check the archives for almost millions of posts asking more or less this question...!

> Not sure about the best but...
>
> a<-c(1,2,3,NA,5)
> b<-c(4,4,4,4,4)
>
> c<-a
> c[which(is.na(a))]<-b[which(is.na(a))]

Why do you want to know which()?

na <- is.na(a)
c[na] <- b[na]

Uwe Ligges
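A sketch of how the logical-index idiom from the end of this thread might be carried over to whole data frames, which sidesteps ifelse() altogether. It assumes both data frames have the same shape and hold only numeric columns; the values are invented for illustration.

```r
# Made-up 20 x 5 data frames standing in for data1 and data2
m1 <- matrix(1:100, nrow = 20, ncol = 5)
m1[c(1, 25, 100)] <- NA                 # punch a few holes into data1
data1 <- as.data.frame(m1)
data2 <- as.data.frame(matrix(101:200, nrow = 20, ncol = 5))

# Same idea as 'na <- is.na(a); c[na] <- b[na]', but on data frames:
# is.na() on a data frame returns a logical matrix of the same shape
na <- is.na(data1)
data3 <- data1
data3[na] <- data2[na]

print(dim(data3))   # still 20 rows and 5 columns
```

Because data3 starts as a copy of data1, it keeps the data frame structure instead of collapsing to a length-100 object.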
[R] transpose a matrix?
I have a data set in the following format:

x<-data.frame(id=c("a","b","c"), "2005-01-15"=c(100,225,425), "2005-02-23"=c(1100,2325,4525))

> x
  id X2005.01.15 X2005.02.23
1  a         100        1100
2  b         225        2325
3  c         425        4525

I want:

id             a    b    c
X2005.01.15  100  225  425
X2005.02.23 1100 2325 4525

Any suggestions?
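One way this could be done, as a sketch: drop the id column, transpose the remaining numeric columns with t(), and reuse the ids as column names. Note that the result is a matrix rather than a data frame; whether that is acceptable is an assumption about what the poster needs.

```r
# Data as printed in the post (column names already in their X... form)
x <- data.frame(id = c("a", "b", "c"),
                X2005.01.15 = c(100, 225, 425),
                X2005.02.23 = c(1100, 2325, 4525))

# Transpose everything except the id column
tx <- t(x[, -1])

# Use the ids as column names (as.character guards against factors)
colnames(tx) <- as.character(x$id)

print(tx)
```

If a data frame is needed afterwards, as.data.frame(tx) should convert it while keeping the row and column names.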
[R] given a mid-month date, get the month-end date
I have a vector of dates. I wish to find the month-end date for each. Any suggestions?

e.g. for 12/15/05, I want 12/31/05; for 10/15/1995, I want 10/31/1995; etc.
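A minimal sketch of one approach, assuming the dates have already been converted to class Date: move to the first day of the month, step 31 days forward (which always lands in the following month), snap back to the first of that month, and subtract one day. The helper name month_end is made up for this example.

```r
# Hypothetical helper: last day of the month for each element of a Date vector
month_end <- function(d) {
  first_of_month <- as.Date(format(d, "%Y-%m-01"))
  # first_of_month + 31 is always somewhere in the next month
  first_of_next <- as.Date(format(first_of_month + 31, "%Y-%m-01"))
  first_of_next - 1
}

dates <- as.Date(c("2005-12-15", "1995-10-15", "2004-02-10"))
print(month_end(dates))
```

Strings such as 12/15/05 would first need parsing, for example with as.Date("12/15/05", format = "%m/%d/%y").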
[R] creating a subset of a dataset using ifelse statement?
I am using R in a Microsoft Windows environment. I have a dataset called mp1b. I have a variable called h. h can take a value from -1 to 5.

If h < 1, I want to create a new dataset called mp2 that is the same as mp1b:

mp2<-mp1b

If h > 0, I want to create a dataset mp2, where I limit the original dataset to those rows where mp1b$group == h, similar to:

mp2<-subset(mp1b, group == h)

I have tried this ifelse statement, but it does not seem to work as expected:

mp2<-ifelse(h<1,mp1b,subset(mp1b,cluster_q==h))

Assistance is appreciated.
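A sketch of what may be going wrong and one possible fix: ifelse() works element by element and returns something shaped like its test, so with a single TRUE/FALSE it cannot hand back a whole data frame. An ordinary if/else is the usual tool for a scalar condition. The post uses both group and cluster_q as the column name; group is assumed here, and the helper pick_subset and the toy data are made up.

```r
# Toy stand-in for the poster's mp1b
mp1b <- data.frame(group = c(1, 2, 2, 3), value = c(10, 20, 30, 40))

# Hypothetical helper: whole data set if h < 1, otherwise rows with group == h
pick_subset <- function(data, h) {
  if (h < 1) {
    data
  } else {
    data[data$group == h, , drop = FALSE]
  }
}

mp2 <- pick_subset(mp1b, 2)
print(mp2)
```

Written inline, the same logic would read mp2 <- if (h < 1) mp1b else subset(mp1b, group == h).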
[R] "vector memory exhausted (limit reached?)" error message while loading saved workspace
I am running R 2.2.0 in a Windows XP environment. During my previous R session, I saved my workspace as 20051123b.Rdata. (On disk, it is ~1.25 GB.)

I have launched R. (I have the following in my application shortcut: <--max-mem-size=4000M>, and although it may not be necessary, I have also run the following line in R: .) I have nothing loaded in the workspace.

I am trying to load the saved workspace, which is ~1.25 GB in size. I get the following message: "vector memory exhausted (limit reached?)".

I encounter this problem under both the official R 2.2.0 and the patched R 2.2.0. Is there anything I can do to load this workspace? It was working fine the last time it was loaded in my R session.
[R] running out of memory while running a VERY LARGE regression
I am running a VERY LARGE regression (many factors, many rows of data) using lm. I think I have my memory set as high as possible. (I ran "memory.limit(size = 4000)".) Is there anything I can do? (FYI, I "think" I have removed all data I am not using, and I "think" I have only the data needed for the regression loaded.) Thanks.
Re: [R] percent rank by an index key?
A couple of follow-up questions:

1. Is there any way to modify this so that non-numeric values are ignored? (As it is, length seems to "count" the NA values.)

2. In order for the cbind call "x <- cbind(x, do.call("rbind", r))" to work as intended, does the data need to be ordered by State and Year? e.g. "x <- x[order(x$State,x$Year), ]"

Here is some sample data, with non-numeric values included:

Year,State,Subject,Income
2000,TX,1,30776
2000,AL,1,81240
2000,TX,2,28035
2000,AL,2,35947
2000,TX,3,42010
2000,AL,3,48830
2000,TX,4,18040
2000,AL,4,77758
2000,TX,5,20771
2000,AL,5,59132
2000,TX,6,46370
2000,AL,6,45573
2000,TX,7,57256
2000,AL,7,83402
2000,TX,8,3780
2000,AL,8,90695
2000,TX,9,51745
2000,AL,9,4105
2000,TX,10,1154
2000,AL,10,96598
2001,TX,1,25767
2001,AL,1,37032
2001,TX,2,39848
2001,AL,2,69029
2001,TX,3,17142
2001,AL,3,92850
2001,TX,4,62939
2001,AL,4,82730
2001,TX,5,30708
2001,AL,5,25339
2001,TX,6,64710
2001,AL,6,44541
2001,TX,7,96699
2001,AL,7,9151
2001,TX,8,57793
2001,AL,8,20981
2001,TX,9,12523
2001,AL,9,36139
2001,TX,10,53553
2001,AL,10,3767
2002,TX,1,55232
2002,AL,1,54655
2002,TX,2,76255
2002,AL,2,53581
2002,TX,3,77030
2002,AL,3,34869
2002,TX,4,98956
2002,AL,4,60332
2002,TX,5,33052
2002,AL,5,12348
2002,TX,6,96057
2002,AL,6,24509
2002,TX,7,66177
2002,AL,7,45952
2002,TX,8,73331
2002,AL,8,35813
2002,TX,9,3014
2002,AL,9,57097
2002,TX,10,83657
2002,AL,10,91640
2003,TX,1,5638
2003,AL,1,17026
2003,TX,2,66902
2003,AL,2,71080
2003,TX,3,88195
2003,AL,3,95415
2003,TX,4,13028
2003,AL,4,49123
2003,TX,5,19867
2003,AL,5,22990
2003,TX,6,67639
2003,AL,6,69435
2003,TX,7,62469
2003,AL,7,59939
2003,TX,8,24874
2003,AL,8,44829
2003,TX,9,77180
2003,AL,9,68488
2003,TX,10,80686
2003,AL,10,72622
2004,TX,1,46854
2004,AL,1,62499
2004,TX,2,20461
2004,AL,2,53834
2004,TX,3,54909
2004,AL,3,69527
2004,TX,4,33066
2004,AL,4,78035
2004,TX,5,23569
2004,AL,5,59757
2004,TX,6,44514
2004,AL,6,41223
2004,TX,7,85665
2004,AL,7,91972
2004,TX,8,30073
2004,AL,8,90642
2004,TX,9,32741
2004,AL,9,97111
2004,TX,10,8093
2004,AL,10,20077
2005,TX,1,48377
2005,AL,1,88216
2005,TX,2,35752
2005,AL,2,74897
2005,TX,3,27772
2005,AL,3,88945
2005,TX,4,86512
2005,AL,4,88422
2005,TX,5,27488
2005,AL,5,21140
2005,TX,6,35777
2005,AL,6,32772
2005,TX,7,77477
2005,AL,7,98282
2005,TX,8,73346
2005,AL,8,38943
2005,TX,9,38947
2005,AL,9,70195
2005,TX,10,23890
2005,AL,10,84020
2000,TX,11,na
2005,AL,11,null

Sundar Dorai-Raj <[EMAIL PROTECTED]> wrote:

t c wrote:
> What is the easiest way to calculate a percent rank by an index key?
>
> For example, I have a dataset with 3 fields:
>
> Year, State, Income
>
> I wish to calculate the rank, by year, by state.
>
> I also wish to calculate the percent rank, where I define percent rank as
> rank/n.
>
> (n is the number of numeric data points within each date-state grouping.)
>
> This is what I am currently doing:
>
> 1. I create a group by field by using the paste function to combine date
> and state into a field called date_state. I then use the rank function to
> calculate the rank by date, by state.
>
> 2. I then add a field called one that I set to 1 if the value in income is
> numeric and to 0 if it is not.
>
> 3. I then take an aggregate sum of one. This gives me a count (n) for each
> date-state grouping.
>
> 4. I next use merge to add this count to the table.
>
> 5. Finally, I calculate the percent rank.
>
> Pr<-rank/n
>
> The merge takes quite a bit of time to process.
>
> Is there an easier/more efficient way to calculate the percent rank?

How about using ?by:

set.seed(100)
# fake data set, replace with your own
# "Subject" is just a dummy to produce replicates
x <- expand.grid(Year = 2000:2005, State = c("TX", "AL"), Subject = 1:10)
x$Income <- floor(runif(NROW(x)) * 10)
r <- by(x$Income, x[c("Year", "State")], function(x) {
  r <- rank(x)
  n <- length(x)
  cbind(Rank = r, PRank = r/n)
})
x <- cbind(x, do.call("rbind", r))

HTH,

--sundar
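A sketch that might address both follow-up questions at once, using base R's ave() instead of by() plus cbind(). ave() returns its results in the original row order, so no sorting by State and Year should be needed, and NA values can be excluded from both the rank and the count. The small data frame is a made-up excerpt in the spirit of the sample data, with the non-numeric entry already read in as NA.

```r
# Small illustrative data set (one NA standing in for a non-numeric value)
x <- data.frame(
  Year   = c(2000, 2000, 2000, 2000, 2001, 2001, 2001),
  State  = c("TX", "TX", "AL", "TX", "TX", "AL", "AL"),
  Income = c(30776, 28035, 81240, NA, 25767, 37032, 69029)
)

# Rank within each Year/State group; NA incomes keep an NA rank
x$Rank <- ave(x$Income, x$Year, x$State,
              FUN = function(v) rank(v, na.last = "keep"))

# n counts only the non-missing incomes in each group
x$n <- ave(x$Income, x$Year, x$State,
           FUN = function(v) sum(!is.na(v)))

# Percent rank as defined in the original post: rank / n
x$PRank <- x$Rank / x$n

print(x)
```

Entries such as "na" or "null" in a raw file would need to be mapped to NA on import, for example through the na.strings argument of read.csv().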
[R] percent rank by an index key?
What is the easiest way to calculate a percent rank by an index key?

For example, I have a dataset with 3 fields: Year, State, Income

I wish to calculate the rank, by year, by state. I also wish to calculate the percent rank, where I define percent rank as rank/n. (n is the number of numeric data points within each date-state grouping.)

This is what I am currently doing:

1. I create a group by field by using the paste function to combine date and state into a field called date_state. I then use the rank function to calculate the rank by date, by state.

2. I then add a field called one that I set to 1 if the value in income is numeric and to 0 if it is not.

3. I then take an aggregate sum of one. This gives me a count (n) for each date-state grouping.

4. I next use merge to add this count to the table.

5. Finally, I calculate the percent rank.

Pr<-rank/n

The merge takes quite a bit of time to process. Is there an easier/more efficient way to calculate the percent rank?
[R] getting last 2 characters of a string, other "text" functions?
I wish to obtain the right-most n characters of a character string. What is the appropriate function?
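A minimal sketch of one common approach: combine substr() with nchar() to count back from the end of the string. The helper name right_chars is made up for this example.

```r
# Hypothetical helper: the right-most n characters of each string in s
right_chars <- function(s, n) {
  len <- nchar(s)
  substr(s, len - n + 1, len)
}

print(right_chars("abcdef", 2))
print(right_chars(c("2005-01-15", "TX"), 2))
```

Both substr() and nchar() are vectorized, so the helper should work on a whole character vector at once.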
[R] getting an aggregate count, and adding it to a dataset as a new column
I have the data below in a dataset called test3. What is the easiest way to get a count, by date, of all numerical values of var1, and to then add this count to the test3 dataset as a new column?

What I have:

date,var1,cat
1/1/2005,5,a
1/1/2005,12,a
1/1/2005,44,b
2/1/2005,1,b
2/1/2005,8,a
2/1/2005,32,z
2/1/2005,44,a
3/1/2005,5,z
3/1/2005,7,e
3/1/2005,95,s
3/1/2005,95,s
1/1/2005,NA,NA
1/1/2005,4,NA
1/1/2005,5,NA
3/1/2005,NA,NA
4/1/2005,NA,a

What I want:

date,var1,cat,n
1/1/2005,5,a,5
1/1/2005,12,a,5
1/1/2005,44,b,5
2/1/2005,1,b,4
2/1/2005,8,a,4
2/1/2005,32,z,4
2/1/2005,44,a,4
3/1/2005,5,z,4
3/1/2005,7,e,4
3/1/2005,95,s,4
3/1/2005,95,s,4
1/1/2005,NA,NA,5
1/1/2005,4,NA,5
1/1/2005,5,NA,5
3/1/2005,NA,NA,4
4/1/2005,NA,a,0
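One possible way to get there in a single step, sketched with base R's ave(): it applies a function within each date group and returns a vector aligned with the original rows, so no separate aggregate-then-merge is needed. The data is the sample from the post, read from inline text.

```r
# Sample data from the post
test3 <- read.csv(text = "date,var1,cat
1/1/2005,5,a
1/1/2005,12,a
1/1/2005,44,b
2/1/2005,1,b
2/1/2005,8,a
2/1/2005,32,z
2/1/2005,44,a
3/1/2005,5,z
3/1/2005,7,e
3/1/2005,95,s
3/1/2005,95,s
1/1/2005,NA,NA
1/1/2005,4,NA
1/1/2005,5,NA
3/1/2005,NA,NA
4/1/2005,NA,a", stringsAsFactors = FALSE)

# Count the non-missing var1 values per date, repeated on every row of that date
test3$n <- ave(test3$var1, test3$date, FUN = function(v) sum(!is.na(v)))

print(test3)
```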
Re: [R] Correlation, by date, of two variables?
I have a dataset with three variables: date, var1, var2

How can I calculate the correlation, by date, between var1 and var2?

e.g.

date      var1  var2
1/1/2001     5     4
1/1/2001     8     5
1/1/2001     9     7
2/1/2001     7     2
2/1/2001     2     1
2/1/2001     4     6
3/1/2001     3     5
3/1/2001     4     3
3/1/2001     6     9
3/1/2001     7    -1

The results I want:

1/1/2001   0.891042111
2/1/2001   0.075093926
3/1/2001  -0.263117406
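A sketch of one way to compute this with base R: split the data frame by date and apply cor() to each piece. The data frame name dat is made up; the values are the ones from the example table.

```r
# Example data from the post
dat <- data.frame(
  date = rep(c("1/1/2001", "2/1/2001", "3/1/2001"), times = c(3, 3, 4)),
  var1 = c(5, 8, 9, 7, 2, 4, 3, 4, 6, 7),
  var2 = c(4, 5, 7, 2, 1, 6, 5, 3, 9, -1)
)

# One correlation per date: split into per-date data frames, then cor() on each
cors <- sapply(split(dat, dat$date), function(d) cor(d$var1, d$var2))

print(cors)
```

The result is a named numeric vector with one entry per date; the values should match the three results listed in the post.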
[R] R training/"tutor"
Can anyone recommend a place to post/look for someone to help me with R? I'm looking for a consultant/trainer/tutor that knows R well, and can help me over the phone or via email. Thanks.
[R] functions available for use with aggregate?
What are the functions available for use with aggregate? Where can a reference to them be found?
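As a hedged illustration rather than a full answer: aggregate() takes its summary function through the FUN argument, so a built-in such as mean or a user-written function can both be passed, as long as it works on each subset of the data. The data frame sales and its columns are invented for this sketch.

```r
# Invented example data
sales <- data.frame(region = c("N", "N", "S", "S", "S"),
                    amount = c(10, 20, 5, 15, 40))

# A built-in summary function
agg_mean <- aggregate(sales$amount, by = list(region = sales$region), FUN = mean)

# A user-defined function works the same way
agg_range <- aggregate(sales$amount, by = list(region = sales$region),
                       FUN = function(v) max(v) - min(v))

print(agg_mean)
print(agg_range)
```

The help page ?aggregate describes FUN and the grouping arguments in more detail.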
Re: [R] adding 1 month to a date
Thanks. How do I use this to calculate a new variable (e.g. "data$next_month") from an existing variable (e.g. "Data$date_")?

I tried : , but get the following error message:

"Error in seq.Date(as.Date(data$date), len = 2, by = "1 month") : 'from' must be of length 1"

Thanks.

Gabor Grothendieck <[EMAIL PROTECTED]> wrote:

Try this:

seq(as.Date("2005-01-15"), len = 2, by = "month")[2]

or here is another approach:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/61570.html

On 10/11/05, t c wrote:
> Within an R dataset, I have a date field called "date_". (The dates are in
> the format "YYYY-MM-DD", e.g. "1995-12-01".)
>
> How can I add or subtract "1 month" from this date, to get "1996-01-01" or
> "1995-11-01"?
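A sketch of how the seq() suggestion might be applied to a whole column: since seq() insists on a single starting date, the call is repeated for each element. The helper name add_months and the toy data frame are made up, and the sketch assumes the dates contain no NA values.

```r
# Hypothetical helper: shift each date by k months (k may be negative)
add_months <- function(d, k = 1) {
  out <- d
  for (i in seq_along(d)) {
    # seq() needs a length-1 'from', so handle one date at a time
    out[i] <- seq(d[i], by = paste(k, "month"), length.out = 2)[2]
  }
  out
}

# Toy data frame with a Date column named date_
data <- data.frame(date_ = as.Date(c("1995-12-01", "2005-01-15")))
data$next_month <- add_months(data$date_, 1)
data$prev_month <- add_months(data$date_, -1)

print(data)
```

For days near the end of a month (for example the 31st), stepping by month can roll over into the following month, so results for such dates are worth checking.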
[R] adding 1 month to a date
Within an R dataset, I have a date field called date_. (The dates are in the format YYYY-MM-DD, e.g. 1995-12-01.) How can I add or subtract 1 month from this date, to get 1996-01-01 or 1995-11-01?