[R] array vs matrix vs dataframe?
What is the difference among an array, a dataframe and a matrix? Why is the size of a dataframe so much larger? (See the example below.)

    a <- c(rep(1:100, 1))
    b <- c(rep(1:100, 1))
    c1 <- cbind(a, b)
    cdf <- as.data.frame(cbind(a, b))
    cm <- as.matrix(cbind(a, b))
    object.size(a)/100
    object.size(b)/100
    object.size(c1)/100
    object.size(cdf)/100
    object.size(cm)/100

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
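A minimal sketch of how the three structures differ, for anyone comparing them. The names m, arr and df are illustrative and not from the post. A matrix is an atomic vector with a two-element dim attribute, an array allows any number of dimensions, and a data frame is a list of equal-length columns carrying names, row names and a class. That extra list structure is a plausible reason for the larger reported size, although the exact byte counts depend on the R version and platform.

```r
a <- 1:100
b <- 1:100

m   <- cbind(a, b)                      # matrix: one atomic vector plus dim/dimnames
arr <- array(1:24, dim = c(2, 3, 4))    # array: same idea, three dimensions here
df  <- as.data.frame(m)                 # data frame: a list with one vector per column

is.matrix(m)     # a matrix is also a 2-d array
is.array(m)
is.matrix(arr)   # a 3-d array is not a matrix
is.list(df)      # a data frame is a list underneath

names(attributes(m))    # attributes stored on the matrix
names(attributes(df))   # attributes stored on the data frame

# Sizes in bytes of the same 100 values in each container
sapply(list(vector = a, matrix = m, data.frame = df),
       function(obj) as.numeric(object.size(obj)))
```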
[R] "summary.lm" and NA values
Gentlemen,

(I am using R 2.2.1 in a Windows environment.)

I apologize, but I did not fully comprehend all of your answer.

I have a dataframe called data1. I run several linear regressions using the lm function, similar to:

    reg <- ( lm(lm(data1[,2] ~., data1[,2:4])) )

I see from the generous answers below how I can use "coef(reg)" to extract the coefficient estimates. (If the coefficient for a variable is for some reason NA, "coef(reg)" returns NA for that coefficient, which is what I want.)

My question: What is the best way to get the standard errors, including NA values, that go with each of these coefficient estimates? (i.e. if the coefficient estimate is NA, I similarly want the standard error to come back as NA, so that the length of coef(reg) is the same as the length of the vector that contains the standard errors.)

Thanks very much for all your help, and I apologize for my need of additional assistance.

--- Berton Gunter <[EMAIL PROTECTED]> wrote:

> "Is there a way to..." always has the answer "yes" in R (or C or any
> language for that matter). The question is: "Is there a GOOD way...?" where
> "good" depends on the specifics of the situation. So after that polemic,
> below is an effort to answer (adding to what Petr Pikal already said):
>
> -- Bert Gunter
> Genentech Non-Clinical Statistics
> South San Francisco, CA
>
> "The business of the statistician is to catalyze the scientific learning
> process." - George E. P. Box
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED]] On Behalf Of r user
> > Sent: Tuesday, August 15, 2006 7:01 AM
> > To: rhelp
> > Subject: [R] question re: "summary.lm" and NA values
> >
> > Is there a way to get the following code to include
> > NA values where the coefficients are "NA"?
> >
> > ((summary(reg))$coefficients)
>
> BAAAD! Don't do this. Use the extractor on the object: coef(reg)
> This suggests that you haven't read the documentation carefully, which tends
> to arouse the ire of would-be helpers.
>
> > explanation:
> >
> > Using a loop, I am running regressions on several
> > "subsets" of "data1".
> >
> > "reg <- ( lm(lm(data1[,1] ~., data1[,2:l])) )"
>
> ??? There's an error here I think. Do you mean update()? Do you have your
> subscripting correct?
>
> > My regression has 10 independent variables, and I
> > therefore expect 11 coefficients.
> > After each regression, I wish to save the coefficients
> > and standard errors of the coefficients in a table
> > with 22 columns.
> >
> > I successfully extract the coefficients using the
> > following code:
> > "reg$coefficients"
>
> Use the extractor, coef()
>
> > I attempt to extract the standard errors using:
> >
> > aperm((summary(reg))$coefficients)[2,]
>
> BAAAD! Use the extractor vcov():
> sqrt(diag(vcov(reg)))
>
> > ((summary(reg))$coefficients)
> >
> > My problem:
> > For some of my subsets, I am missing data for one or
> > more of the independent variables. This of course
> > causes the coefficients and standard errors for this
> > variable to be "NA".
>
> No, it doesn't, as Petr said.
>
> One possible approach: Assuming that a variable is actually missing (all
> NA's), note that coef(reg) is a named vector, so that the character string
> names of the regressors actually used are available. You can thus check for
> what's missing and add them as NA's at each return. Though I confess that I
> see no reason to put things in a matrix rather than just using a list. But
> that's a matter of personal taste I suppose.
>
> > Is there a way to include the NA standard errors, so
> > that I have the same number of standard errors and
> > coefficients for each regression, and can then store
> > the coefficients and standard errors in my table of 22
> > columns?
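For the question of keeping the standard errors the same length as coef(reg), here is a minimal sketch of the "match by name" idea from the reply. The data frame d, the deliberately collinear column x3 and the variable names are made up for illustration. The sketch assumes that coef() reports NA for a coefficient that could not be estimated, while the coefficient matrix of the summary only has rows for the estimable ones. On recent R versions sqrt(diag(vcov(reg))) may already carry the NA entries, but that behaviour is version dependent, so the sketch fills the values in by name instead.

```r
set.seed(1)
d <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20))
d$x3 <- 2 * d$x1                      # perfectly collinear, so its coefficient is NA

reg <- lm(y ~ ., data = d)
cf  <- coef(reg)                      # named vector, NA for x3

# Start from an all-NA vector with the same names as the coefficients,
# then fill in the standard errors that the summary actually reports.
sm <- summary(reg)$coefficients
se <- setNames(rep(NA_real_, length(cf)), names(cf))
se[rownames(sm)] <- sm[, "Std. Error"]

# One row of the "table with 22 columns" would then be c(cf, se)
result_row <- c(cf, se)
length(result_row)
```

Because both vectors are built from names(coef(reg)), they stay aligned no matter which regressors drop out in a given subset.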
[R] getting sapply to skip columns with non-numeric data?
I have a dataframe x of w columns. Some columns are numeric, some are not.

I wish to create a function to calculate the mean and standard deviation of each numeric column, and then bind the column mean and standard deviation to the bottom of the dataframe.

e.g.

    tempmean <- apply(data.frame(x), 2, mean, na.rm = T)
    xnew <- rbind(x, tempmean)

I am running into one small problem: what is the best way to have sapply skip the non-numeric data and return NAs?
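One way to approach this, sketched with a made-up data frame x and a hypothetical helper col_stat: test each column with is.numeric() inside sapply() and return NA for anything else.

```r
x <- data.frame(id = c("a", "b", "c"),
                v1 = c(1, 2, NA),
                v2 = c(10, 20, 30),
                stringsAsFactors = FALSE)

# Apply f to numeric columns only; non-numeric columns yield NA
col_stat <- function(df, f) {
  sapply(df, function(col) if (is.numeric(col)) f(col, na.rm = TRUE) else NA_real_)
}

means <- col_stat(x, mean)
sds   <- col_stat(x, sd)

# Append the two summary rows; the list elements are matched to columns by name
xnew <- rbind(x, as.list(means), as.list(sds))
xnew
```

Keeping the summaries in separate vectors (means, sds) is usually easier to work with than rows mixed into the data, but the rbind() line shows the layout asked for.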
[R] question re: "summary.lm" and NA values
Is there a way to get the following code to include NA values where the coefficients are NA?

    ((summary(reg))$coefficients)

Explanation:

Using a loop, I am running regressions on several subsets of data1.

    reg <- ( lm(lm(data1[,1] ~., data1[,2:l])) )

My regression has 10 independent variables, and I therefore expect 11 coefficients. After each regression, I wish to save the coefficients and standard errors of the coefficients in a table with 22 columns.

I successfully extract the coefficients using the following code:

    reg$coefficients

I attempt to extract the standard errors using:

    aperm((summary(reg))$coefficients)[2,]

    ((summary(reg))$coefficients)

My problem: For some of my subsets, I am missing data for one or more of the independent variables. This of course causes the coefficients and standard errors for this variable to be NA.

Is there a way to include the NA standard errors, so that I have the same number of standard errors and coefficients for each regression, and can then store the coefficients and standard errors in my table of 22 columns?
[R] Getting summary.lm to include data for coefficients that are NAs?
Is there a way to get the following code to include lines where the coefficients are NA?

    ((summary(reg))$coefficients)

Explanation:

Using a loop, I am running regressions on several subsets of data1.

    reg <- ( lm(lm(data1[,1] ~., data1[,2:l])) )

My regression has 10 independent variables, and I therefore expect 11 coefficients. After each regression, I wish to save the coefficients and standard errors of the coefficients in a table with 22 columns.

I successfully extract the coefficients using the following code:

    reg$coefficients

I attempt to extract the standard errors using:

    aperm((summary(reg))$coefficients)[2,]

    ((summary(reg))$coefficients)

My problem: For some of my subsets, I am missing data for one or more of the independent variables. This of course causes the coefficients and standard errors for this variable to be NA.

Is there a way to include the NA standard errors, so that I have the same number of standard errors and coefficients for each regression, and can then store the coefficients and standard errors in my table of 22 columns?
[R] basic question re lm()
I am using R in a Windows environment.

I have a basic question regarding lm().

I have a dataframe data1 with ncol=w. I know that my dependent variable is in column 1.

Is there a way to write the regression formula so that I can use columns 2 through w as my independent variables?

e.g. something like:

    lm(data1[,1] ~ data1[,2:w])

Thanks
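A minimal sketch of one common way to do this: build the formula from the name of the first column and use "." to stand for every other column of the data frame. The example data and the names resp and fml are made up, and the sketch assumes the column names are syntactically valid R names.

```r
# Hypothetical data: response in column 1, predictors in the remaining columns
data1 <- data.frame(
  y  = c(1.2, 2.3, 2.9, 4.1, 5.2, 5.8),
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(2, 1, 4, 3, 6, 5)
)

resp <- names(data1)[1]                  # name of the dependent variable
fml  <- as.formula(paste(resp, "~ ."))   # y ~ .  (all other columns of data)
fit  <- lm(fml, data = data1)
coef(fit)
```

Passing data = data1 matters here: the "." is expanded against the columns of that data frame, excluding the response.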
[R] average by group...
I have a dataframe with 700,000 rows and 2 vectors (columns): group and score.

I wish to calculate a third vector of length 70: the average score by group. Even though the average value will repeat, I wish to return the average for that particular group for each row.

(I know I can do this by calculating each group's average and then using the merge command, but as my calculations get more complex and my data set gets larger, the merge command seems to be fairly slow.)
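A small sketch of the per-row group average without merge, using ave() from base R. The data frame df and the column name group_mean are illustrative.

```r
df <- data.frame(group = c("a", "b", "a", "c", "b"),
                 score = c(1, 4, 3, 10, 6))

# ave() returns a vector as long as its input, holding each row's group mean
df$group_mean <- ave(df$score, df$group, FUN = mean)
df
```

For other statistics, pass a different function as FUN, for example FUN = median.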
[R] help converting code to a function
I want to write a function that loads a data frame from my hard drive, then creates a new dataframe that calculates the difference between column n and column n+4, then saves this new dataframe to my hard drive, and finally removes both the new and old data frame from memory.

Here is the code I am using. How do I convert this into a function that can be used to perform the same process on any dataframe?

    load('c:/r_pit/sampledf.r')
    w <- ncol(sampledf)
    l <- nrow(sampledf)
    sampledf_yychg <- data.frame(matrix(data=NA, nrow=l, ncol=w-4))
    for (j in 1:(w-4)) {
      sampledf_yychg[, j] <- sampledf[, j] - sampledf[, j+4]
    }
    save(sampledf_yychg, file='c:/r_pit/sampledf_yychg.r')
    rm(sampledf, sampledf_sq, sampledf_yychg)
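A possible shape for such a function, as a sketch only. The function name yy_change, its arguments and the temporary files are assumptions made for the example. It loads the file into a private environment, so nothing needs to be removed from the workspace afterwards. One difference from the posted code: the result is always saved under the name chg rather than under a name derived from the input.

```r
yy_change <- function(in_file, out_file, lag = 4) {
  env <- new.env()
  obj_name <- load(in_file, envir = env)[1]   # name of the first object in the file
  df <- get(obj_name, envir = env)
  w <- ncol(df)
  # Column j minus column j + lag, for all valid j at once
  chg <- df[, 1:(w - lag), drop = FALSE] - df[, (1 + lag):w, drop = FALSE]
  save(chg, file = out_file)
  invisible(out_file)
}

# Usage with temporary files standing in for the files on the hard drive
sampledf <- as.data.frame(matrix(1:12, nrow = 2))   # 2 rows, 6 columns
in_file  <- tempfile(fileext = ".RData")
out_file <- tempfile(fileext = ".RData")
save(sampledf, file = in_file)
rm(sampledf)

yy_change(in_file, out_file)
```

Everything created inside the function disappears when it returns, which takes the place of the explicit rm() call.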
[R] function to check if an object is present, and if not, load it from my hard drive
I want to check if an "object" (dataset, vector, etc.) is present. If it is present, I will do nothing. If it is not present, I will load it from my hard drive.

Is there a function to determine if an object is present?
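A short sketch using exists() from base R. The helper name load_if_missing, the object name mydata and the temporary file are made up for illustration.

```r
load_if_missing <- function(name, file, envir = globalenv()) {
  # exists() takes the object name as a character string
  if (!exists(name, envir = envir, inherits = FALSE)) {
    load(file, envir = envir)
  }
  invisible(exists(name, envir = envir, inherits = FALSE))
}

# Usage with a temporary file standing in for a file on the hard drive
f <- tempfile(fileext = ".RData")
mydata <- data.frame(a = 1:3)
save(mydata, file = f)
rm(mydata)

e <- new.env()
load_if_missing("mydata", f, envir = e)   # not present in e, so it is loaded
load_if_missing("mydata", f, envir = e)   # already present, nothing happens
exists("mydata", envir = e, inherits = FALSE)
```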
[R] gc(), memory.size()
Can someone please explain to me what the "used" columns for Ncells and Vcells mean when I run gc()?

    > gc()
               used  (Mb) gc trigger  (Mb)  max used   (Mb)
    Ncells   882296  23.6   13812157 368.9  19400892  518.1
    Vcells 14811586 113.1  114763459 875.6 317464335 2422.1
    >

(I read the help file, but still do not fully understand.)

Also, how do I determine the total memory being used? Do I simply run memory.size()?

Finally, when I run memory.size(max=T), I get a negative value. What does this mean?
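A worked check of the numbers in that output, as a sketch. It assumes the sizes given in the R help pages for a 32-bit build, 28 bytes per Ncell (cons cell) and 8 bytes per Vcell, and that the Mb figures are rounded up to one decimal. The helper name cells_to_mb is made up.

```r
# Convert a cell count to megabytes, rounded up to one decimal place
cells_to_mb <- function(cells, bytes_per_cell) {
  ceiling(10 * cells * bytes_per_cell / 2^20) / 10
}

ncells_used <- 882296
vcells_used <- 14811586

ncells_mb <- cells_to_mb(ncells_used, 28)   # Ncells: fixed-size cells for language objects
vcells_mb <- cells_to_mb(vcells_used, 8)    # Vcells: 8-byte units holding vector contents
c(Ncells = ncells_mb, Vcells = vcells_mb)

# Memory currently in use, as reported by gc(), is the sum of the two Mb values
ncells_mb + vcells_mb
```

Under these assumptions the two results reproduce the 23.6 and 113.1 shown in the "used (Mb)" column, so "used" is a count of cells and the adjacent column is the same quantity in megabytes.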
[R] converting code into a function - separating a data frame with n columns into n individual vectors
I have many very large dataframes with 20 columns each.

In order to conserve memory, I wish to separate the data frame into 20 vectors, each named the name of the dataframe followed by .1, .2, .3, ..., .20. (For example purposes, one data frame is named testa.)

e.g. testa.1, testa.2, testa.3

I have written the code to do this (see below). I am trying to convert this into a function that I can reuse. Suggestions are appreciated.

(I am not sure if this is the best way to approach the problem, but I do think it will work. FYI, I really do need all the data, so selecting a subset of the data is not a good option.)

Here is the code I've been using:

    load('c:/testa.r')
    testa.1 <- testa[ , 1]
    testa.2 <- testa[ , 2]
    testa.3 <- testa[ , 3]
    testa.4 <- testa[ , 4]
    testa.5 <- testa[ , 5]
    testa.6 <- testa[ , 6]
    testa.7 <- testa[ , 7]
    testa.8 <- testa[ , 8]
    testa.9 <- testa[ , 9]
    testa.10 <- testa[ , 10]
    testa.11 <- testa[ , 11]
    testa.12 <- testa[ , 12]
    testa.13 <- testa[ , 13]
    testa.14 <- testa[ , 14]
    testa.15 <- testa[ , 15]
    testa.16 <- testa[ , 16]
    testa.17 <- testa[ , 17]
    testa.18 <- testa[ , 18]
    testa.19 <- testa[ , 19]
    testa.20 <- testa[ , 20]
    rm(testa)
    gc()
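The repeated assignments can be folded into a loop over the columns with assign(). This is a sketch with a hypothetical function name split_columns and a three-column stand-in for testa. Whether the split actually saves memory is a separate question, since the columns of a data frame are already stored as individual vectors.

```r
split_columns <- function(df, prefix, envir = parent.frame()) {
  new_names <- paste(prefix, seq_along(df), sep = ".")
  for (j in seq_along(df)) {
    assign(new_names[j], df[[j]], envir = envir)   # e.g. testa.1 <- testa[[1]]
  }
  invisible(new_names)
}

# Small stand-in for one of the 20-column data frames
testa <- data.frame(u = 1:3, v = c(2.5, 3.5, 4.5), w = c("x", "y", "z"))

e <- new.env()                       # target environment for the example
split_columns(testa, "testa", envir = e)
rm(testa)
ls(e)
```

Leaving out the envir argument creates the vectors in the environment the function is called from, which at the prompt is the workspace.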
[R] Determining the "memory" used by a dataset or vector?
Is there a function that reports the amount of memory used by a dataset and/or vector?

If I have a dataset with only 1 column, does it use more memory than the same data arranged as a vector?
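A quick way to check this directly is object.size(), sketched here with made-up objects v and d. The exact byte counts depend on the R version and platform, so none are quoted.

```r
v <- rnorm(1000)              # plain numeric vector
d <- data.frame(v = v)        # the same data as a one-column data frame

object.size(v)
object.size(d)

# Overhead of the data frame wrapper, in bytes
as.numeric(object.size(d)) - as.numeric(object.size(v))
```

The data frame holds the same vector plus its list structure and attributes, so the difference is a fixed overhead rather than something that grows with the number of rows.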
[R] function to replace missing values with median value?
I have a data set with ~10 variables (i.e. columns).

I wrote this little function to replace missing values with zero.

    sz <- function(x) {
      ifelse(is.na(x)==F, x, 0)
    }

Can anyone help with a function that replaces missing values with the median of the non-missing values?
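A sketch in the same spirit as sz(), with a hypothetical name sm and a made-up data frame d. It replaces the NA entries of a numeric vector by the median of the remaining values, and lapply() applies it to every column.

```r
sm <- function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)   # median of the non-missing values
  x
}

sm(c(1, NA, 3, 10))

# Column by column on a data frame (all columns assumed numeric)
d <- data.frame(p = c(1, NA, 3), q = c(NA, 4, 8))
d[] <- lapply(d, sm)
d
```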
[R] pros and cons of "robust regression"? (i.e. rlm vs lm)
Can anyone comment or point me to a discussion of the pros and cons of robust regressions, vs. a more "manual" approach to trimming outliers and/or "normalizing" data used in regression analysis?
[R] rowVars
I am using R 2.2.1 in a Windows XP environment.

I have a dataframe with 12 columns and 1,000 rows. (Some of the rows have 1 or fewer values.)

I am trying to use rowVars to calculate the variance of each row.

I am getting the following message:

    Error in na.remove.default(x) : length of 'dimnames' [1] not equal to array extent

Is there a good work-around?
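One possible work-around that avoids rowVars altogether, sketched with base R only. The helper row_var and the example data are made up. Rows with fewer than two non-missing values get NA, since a variance is not defined for them.

```r
row_var <- function(r) {
  r <- r[!is.na(r)]                       # drop missing values in this row
  if (length(r) < 2) NA_real_ else var(r)
}

df <- data.frame(a = c(1, 4, 2),
                 b = c(2, NA, 4),
                 c = c(3, NA, NA))

rv <- apply(as.matrix(df), 1, row_var)    # one variance per row
rv
```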
[R] calculating a trailing 12 column mean in a dataframe?
I have a dataframe of 25 columns and 100,000 rows called testdf.

I wish to build a new dataframe, with 14 columns and 100,000 rows. I wish the new dataframe to have the trailing 12 column mean.

That is, I want column 1 of the new dataframe to have something like:

    mean(testdf[,1:12], na.rm=T)

What is the best way to accomplish this?
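A sketch of one approach: for each starting column j, take rowMeans() over columns j to j + 11. The function name trailing_col_mean and the small test data are illustrative, and the sketch assumes all columns are numeric and the data frame has more than one row.

```r
trailing_col_mean <- function(df, width = 12) {
  n_out <- ncol(df) - width + 1           # 25 columns and width 12 give 14 columns
  out <- sapply(seq_len(n_out), function(j) {
    rowMeans(df[, j:(j + width - 1), drop = FALSE], na.rm = TRUE)
  })
  as.data.frame(out)
}

# Three rows, each holding the values 1..25 across the columns
testdf <- as.data.frame(matrix(rep(1:25, each = 3), nrow = 3))
newdf <- trailing_col_mean(testdf)
dim(newdf)
```

rowMeans() works on whole blocks of columns at once, so the only explicit iteration is over the 14 output columns rather than over the rows.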
[R] "renaming" dataframe1 using "column" names from dataframe2?
I have a dataframe named temp, and another dataframe named descriptions.

I wish to rename temp, and to call it the names of a certain column in the dataframe descriptions. Is there a good way to do this?

A similar question: I am using a for loop to create several new dataframes.

e.g. for(j in 1:9){ ..

I'd like each dataframe to be named d1, d2, d3, with the number being tied to j (the iteration). Is this possible?
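For the loop part of the question, two options are sketched here with made-up contents for each data frame: keep the data frames in a named list, or create separate objects with assign(). The list frames and the environment e are illustrative.

```r
# Option 1: a named list, with names d1 ... d9 built from the loop index
frames <- list()
for (j in 1:9) {
  frames[[paste("d", j, sep = "")]] <- data.frame(iter = j, value = j^2)
}
names(frames)
frames$d3

# Option 2: separate objects named d1 ... d9, created with assign()
e <- new.env()                      # use globalenv() to put them in the workspace
for (nm in names(frames)) {
  assign(nm, frames[[nm]], envir = e)
}
ls(e)
```

The same assign() call handles the first question: a character string taken from a column of descriptions can serve as the new name for temp.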
[R] using a value in a column to "lookup" data in a certain column of a dataset?
I have a dataset with 20 columns and ~600,000 rows.

Column 1 has a number from 2-19. This number tells me, for each row, which column has the applicable data (i.e. the data that I wish to use for each individual row).

I want to create a vector that contains the data from the column named by the value in column 1.

e.g. If column 1, row 1, has a value of 6, I want to obtain the value in column 6, row 1. If column 1, row 2, has a value of 2, I want to obtain the value in column 2, row 2. etc.

I have created a for loop to do this, but am looking for a more efficient manner.

Thanks.
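A loop-free sketch using matrix indexing: a two-column matrix of (row, column) pairs picks one element per row. The small data frame dat is made up, and the sketch assumes all columns are numeric so that as.matrix() gives a numeric matrix.

```r
dat <- data.frame(col = c(3, 2, 4),          # which column to read in each row
                  a   = c(10, 20, 30),
                  b   = c(11, 21, 31),
                  c   = c(12, 22, 32))

m <- as.matrix(dat)
# Each row of the index matrix is one (row number, column number) pair
picked <- m[cbind(seq_len(nrow(m)), m[, 1])]
picked
```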
[R] vector math: calculating a rolling 12 row product?
I have a dataframe of numeric values with 30 rows and 7 columns.

For each column, beginning at row 12 and down to row 30, I wish to calculate the rolling 12 row product. I.e., within each column, I wish to multiply all the values in rows 1:12, 2:13, ..., 19:30.

I wish to save the results as a new dataframe, which will have 19 rows and 7 columns.
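A straightforward sketch of the rolling product, with a hypothetical function name roll_prod and a test data frame filled with 2s. For 30 rows a simple loop over the 19 windows is fast enough.

```r
roll_prod <- function(df, width = 12) {
  m <- as.matrix(df)
  n_out <- nrow(m) - width + 1               # 30 rows and width 12 give 19 windows
  out <- matrix(NA_real_, nrow = n_out, ncol = ncol(m),
                dimnames = list(NULL, colnames(m)))
  for (i in seq_len(n_out)) {
    # Product of rows i .. i + width - 1, taken separately within each column
    out[i, ] <- apply(m[i:(i + width - 1), , drop = FALSE], 2, prod)
  }
  as.data.frame(out)
}

df <- as.data.frame(matrix(2, nrow = 30, ncol = 7))
res <- roll_prod(df)
dim(res)
```

If all values are positive, exp() of a rolling sum of log() values is a common alternative for long series.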
[R] memory management under Windows XP
I am using R 2.2.1 in a Windows XP environment.

I work with very large datasets, and occasionally run out of memory.

I have modified my boot.ini file to use the "/3gb switch".

I also run the following line after I launch R (I am unsure if it is helpful):

    memory.limit(size = 4095)

Please point me to useful references on how to better manage memory, or suggest other actions.
[R] "Conditional" match?
I have two datasets, big and small.

    s_date <- c("2005-12-02", "2005-12-01", "2004-11-02", "2002-10-05", "2000-12-15")
    s_id <- c("a", "a", "b", "c", "d")

    b_date <- c("2005-12-31", "2005-12-31", "2004-12-31", "2002-10-05", "2001-10-31", "1999-12-31")
    b_id <- c("a", "b", "c", "d", "e", "c")

    small <- data.frame(date_=as.Date(s_date), id=s_id)
    big <- data.frame(date_=as.Date(b_date), id=b_id)

For each row in big, I want to look for a match in small where two conditions are met:

a. big$id = small$id
b. big$date_ >= small$date_

If a match is found, I wish to return the value of the date. If no match is found, I want NA. If more than 1 match is found, I wish to return the match where small$date_ is greatest.

I'm thinking I might be able to do this using the match function, and by sorting the small dataset by date_ in descending order. However, I do not know how to make the match conditional on big$date_ >= small$date_.

Any help is appreciated.
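A direct sketch of the two conditions, written for clarity rather than speed. The column name match_date is made up, and stringsAsFactors = FALSE is set so that the ids are compared as plain character strings.

```r
small <- data.frame(
  date_ = as.Date(c("2005-12-02", "2005-12-01", "2004-11-02", "2002-10-05", "2000-12-15")),
  id    = c("a", "a", "b", "c", "d"),
  stringsAsFactors = FALSE
)
big <- data.frame(
  date_ = as.Date(c("2005-12-31", "2005-12-31", "2004-12-31", "2002-10-05",
                    "2001-10-31", "1999-12-31")),
  id    = c("a", "b", "c", "d", "e", "c"),
  stringsAsFactors = FALSE
)

# For each row of big: latest date in small with the same id and not after big's date
latest <- vapply(seq_len(nrow(big)), function(i) {
  cand <- small$date_[small$id == big$id[i] & small$date_ <= big$date_[i]]
  if (length(cand) == 0) NA_real_ else as.numeric(max(cand))
}, numeric(1))

big$match_date <- as.Date(latest, origin = "1970-01-01")
big
```

Each row of big scans all of small here, so for large inputs it would be worth splitting small by id first.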
Re: [R] paste - eliminate spaces?
I found the answer: add sep="" to the paste command.

    paste('test', 1, sep="")

--- r user <[EMAIL PROTECTED]> wrote:

> I am trying to combine the value of a variable and
> text.
>
> e.g.
> I want test1, with no spaces.
>
> I try:
>
> h=1
> paste("test",1)
>
> But get:
> [1] "test 1"
>
> (i.e. there is a space between test and 1)
>
> Is there a way to eliminate the space?
[R] paste - eliminate spaces?
I am trying to combine the value of a variable and text.

e.g. I want test1, with no spaces.

I try:

    h=1
    paste("test",1)

But get:

    [1] "test 1"

(i.e. there is a space between test and 1)

Is there a way to eliminate the space?
[R] importing a VERY LARGE database from Microsoft SQL into R
I am using R 2.1.1 in a Windows XP environment.

I need to import a large database from Microsoft SQL into R.

I am currently using the sqlQuery function/command. This works, but I sometimes run out of memory if my database is too big, or it takes quite a long time for the data to import into R.

Is there a better way to bring a large SQL database into R?

Is there an efficient way to convert the data into R format prior to bringing it into R? (E.g. directly from Microsoft SQL?)
[R] exporting dates into Microsoft SQL Server
I am running R 2.1.1 in a Windows XP environment.

I wish to use the sqlSave command to export a dataframe into Microsoft SQL. My dataframe is called temp and has 2 columns, monthenddate and value.

Monthenddate is in 'POSIXct' format (i.e. 'POSIXct', format: chr "1984-01-31" "1984-01-31" "1984-01-31" "1984-01-31" ...).

How can I export this dataframe into SQL and have the column stored in one of the standard SQL date formats?

I am using the following R code:

    db <- odbcConnect("testserver")
    sqlSave(db, temp)
[R] Converting from a dataset to a single "column"
I have a dataset of 3 columns and 5 rows.

    temp <- data.frame(col1=c(5,10,14,56,7), col2=c(4,2,8,3,34), col3=c(28,4,52,34,67))

I wish to convert this to a single column, with column 1 on top and column 3 on bottom.

i.e.

    5 10 14 56 7 4 2 8 3 34 28 4 52 34 67

Are there any functions that do this, and that will work well on much larger datasets (e.g. 1000 rows, 6000 columns)?
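A minimal sketch using unlist(), which concatenates the columns of a data frame in order. The names v and single are illustrative.

```r
temp <- data.frame(col1 = c(5, 10, 14, 56, 7),
                   col2 = c(4, 2, 8, 3, 34),
                   col3 = c(28, 4, 52, 34, 67))

# Columns are stacked one after another: col1 first, col3 last
v <- unlist(temp, use.names = FALSE)
v

# As a one-column data frame, if that form is needed
single <- data.frame(value = v)
```

If the originating column needs to be kept alongside each value, stack(temp) returns the values together with a column of the source names.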
[R] matrix logic
I have 2 dataframes, each with 5 columns and 20 rows. They are called data1 and data2. I wish to create a third dataframe called data3, also with 5 columns and 20 rows.

I want data3 to contain the values in data1 when the value in data1 is not NA. Otherwise it should contain the values in data2.

I have tried a few methods, but they do not seem to work as intended:

    data3<-ifelse(is.na(data1)=F,data1,data2)

and

    data3[,]<-ifelse(is.na(data1[,])=F,data1[,],data2[,])

Please suggest the best way.
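One way to get this result is to copy data1 and overwrite only its NA cells, using the logical matrix from is.na() as the index. The tiny data frames below are made up for the sketch, which assumes both data frames have the same shape and column types. Note also that the comparison operator is ==, not =, which is one reason the attempts above fail.

```r
data1 <- data.frame(a = c(1, NA, 3), b = c(NA, 5, 6))
data2 <- data.frame(a = c(10, 20, 30), b = c(40, 50, 60))

missing <- is.na(data1)          # logical matrix, TRUE where data1 has no value
data3 <- data1
data3[missing] <- data2[missing] # fill only those cells from data2
data3
```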
[R] matrix math
I am using R 2.1.1 in a Windows XP environment.

I have 2 dataframes, temp1 and temp2.

Each dataframe has 20 variables ("columns") and 525 observations (rows). All variables are numeric.

I want to create a new dataframe that also has 20 columns and 525 rows. The values in this dataframe should be the sum of the 2 other dataframes.

(i.e. temp1$column1+temp2$column1, temp1$column2+temp2$column2, etc.)

What is the best/easiest way to accomplish this?

If I wish to "multiply" (instead of sum) the columns, how do I?

I tried:

    temp3<-as.matrix(temp1)+as.matrix(temp2)

I get the following error message:

    Error in as.matrix(temp1) + as.matrix(temp2) :
    non-numeric argument to binary operator
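A sketch of what may be going on, using small made-up data frames. Arithmetic operators work element by element on data frames of the same shape when every column is numeric. If even one column is character or factor, as.matrix() returns a character matrix, which would produce the "non-numeric argument" error, so checking the column types is a reasonable first step.

```r
temp1 <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))
temp2 <- data.frame(a = 4:6, b = c(1, 1, 1))

# Confirm that every column really is numeric
all(sapply(temp1, is.numeric), sapply(temp2, is.numeric))

temp_sum  <- temp1 + temp2     # element-wise sum
temp_prod <- temp1 * temp2     # element-wise product
temp_sum

# A single non-numeric column turns the whole matrix into character
bad <- data.frame(a = 1:2, b = c("1", "2"))
is.character(as.matrix(bad))
```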
[R] For loop gets exponentially slower as dataset gets larger...
I am running R 2.1.1 in a Microsoft Windows XP environment.

I have a matrix with three vectors (columns) and ~2 million rows. The three vectors are date_, id, and price. The data is ordered (sorted) by code and date_.

(The matrix contains daily prices for several thousand stocks, and has ~2 million rows. If a stock did not trade on a particular date, its price is set to NA.)

I wish to add a fourth vector that is next_price. (Next price is the current price as long as the current price is not NA. If the current price is NA, the next_price is the next price at which the security with this same ID trades. If the stock does not trade again, next_price is set to NA.)

I wrote the following loop to calculate next_price. It works as intended, but I have one problem. When I have only 10,000 rows of data, the calculations are very fast. However, when I run the loop on the full 2 million rows, it seems to take ~1 second per row.

Why is this happening? What can I do to speed the calculations when running the loop on the full 2 million rows?

(I am not running low on memory, but I am maxing out my CPU at 100%.)

Here is my code and some sample data:

    data <- data[order(data$code, data$date_), ]
    l <- dim(data)[1]
    w <- 3
    data[l, w+1] <- NA

    for (i in (l-1):(1)) {
      data[i, w+1] <- ifelse(is.na(data[i,w])==F, data[i,w],
                             ifelse(data[i,2]==data[i+1,2], data[i+1,w+1], NA))
    }

    date        id    price     next_price
    6/24/2005   1635  444.7838  444.7838
    6/27/2005   1635  448.4756  448.4756
    6/28/2005   1635  455.4161  455.4161
    6/29/2005   1635  454.6658  454.6658
    6/30/2005   1635  453.9155  453.9155
    7/1/2005    1635  453.3153  453.3153
    7/4/2005    1635  NA        453.9155
    7/5/2005    1635  453.9155  453.9155
    7/6/2005    1635  453.0152  453.0152
    7/7/2005    1635  452.8651  452.8651
    7/8/2005    1635  456.0163  456.0163
    12/19/2005  1635  442.6982  442.6982
    12/20/2005  1635  446.5159  446.5159
    12/21/2005  1635  452.4714  452.4714
    12/22/2005  1635  451.074   451.074
    12/23/2005  1635  454.6453  454.6453
    12/27/2005  1635  NA        NA
    12/28/2005  1635  NA        NA
    12/1/2003   1881  66.1562   66.1562
    12/2/2003   1881  64.9192   64.9192
    12/3/2003   1881  66.0078   66.0078
    12/4/2003   1881  65.8098   65.8098
    12/5/2003   1881  64.1275   64.1275
    12/8/2003   1881  64.8697   64.8697
    12/9/2003   1881  63.5337   63.5337
    12/10/2003  1881  62.9399   62.9399
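A likely cause of the slowdown, offered as an assumption rather than a diagnosis, is that each assignment of the form data[i, w+1] <- ... on a data frame is expensive and its cost grows with the size of the object. The sketch below computes the same column without a row loop. The function name next_price_by_id is made up, and it assumes the rows are already sorted by id and date and that id contains no NA.

```r
next_price_by_id <- function(price, id) {
  n <- length(price)
  pos <- seq_len(n)
  pos[is.na(price)] <- n + 1L            # rows without a price point past the end
  # For every row, the index of the nearest priced row at or below it
  nxt <- rev(cummin(rev(pos)))
  # Keep it only if such a row exists and belongs to the same id
  same_id <- nxt <= n & id[pmin(nxt, n)] == id
  out <- rep(NA_real_, n)
  out[same_id] <- price[nxt[same_id]]
  out
}

id    <- c(1, 1, 1, 1, 2, 2)
price <- c(10, NA, 12, NA, NA, 5)
next_price_by_id(price, id)
```

With the posted data this would be called as data$next_price <- next_price_by_id(data$price, data$id), so the data frame is modified once rather than once per row.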
Re: [R] Compare rows of two matrices
> y <- matrix( c(20, NA, NA, 45, 50, 19, 32, 101, 10, 22, NA, NA,
>                80, 49, 61, 190), ncol=4 )
> x <- matrix( c(20, NA, NA, NA, 50, 19, 32, 101, 10, 22, NA, NA,
>                80, 49, 61, 190), ncol=4 )
>
> #Whereas x contains all NA´s from y plus some additional NA´s.
> #I want to find the index of these additional NA´s. I think, there must be a
> very easy way to do this.

How about this:

    is.na(x) & !is.na(y)

Jonne.
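To turn that logical matrix into actual positions, which() with arr.ind = TRUE can be applied to it. A short sketch on the two matrices from the question; the name extra is illustrative.

```r
y <- matrix(c(20, NA, NA, 45, 50, 19, 32, 101, 10, 22, NA, NA,
              80, 49, 61, 190), ncol = 4)
x <- matrix(c(20, NA, NA, NA, 50, 19, 32, 101, 10, 22, NA, NA,
              80, 49, 61, 190), ncol = 4)

# Row and column numbers of cells that are NA in x but not in y
extra <- which(is.na(x) & !is.na(y), arr.ind = TRUE)
extra
```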
Re: [R] Extracting a numeric prefix from a string
You could use something like

    y <- gsub('^([0-9]+(\\.[0-9]+)?)?.*$', '\\1', x)
    as.numeric(y)

(The dot is escaped so that it only matches a literal decimal point.) But maybe there's a much nicer way.

Jonne.

On Mon, 2005-01-31 at 08:51 +0000, Mike White wrote:
> Hi
> Does anyone know if there is a function similar to as.numeric that will
> extract a numeric prefix from a string as in the following examples?
>
> x<-c(3, "abc", 5.67, "2.4a", "6a", "6b", "2.4.a", 3, "4.2a")
> df.x<-data.frame(Code=x)
> x.str<-levels(df.x[,1])
> # required function result
> 2.40 3.00 4.20 5.67 6.00 NA
>
> Thanks
> Mike White

--
R user <[EMAIL PROTECTED]>
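A runnable sketch of the regular-expression idea on strings like those in the question. It uses sub() with an anchored pattern, an escaped decimal point and perl = TRUE; these are choices made for this example, not the only way to write it. Strings without a numeric prefix end up as NA.

```r
x <- c("3", "abc", "5.67", "2.4a", "6a", "6b", "2.4.a", "4.2a")

# Keep the leading digits, optionally followed by a decimal point and more digits;
# everything after that prefix is dropped.
prefix <- sub("^([0-9]+(\\.[0-9]+)?)?.*$", "\\1", x, perl = TRUE)
num <- as.numeric(prefix)      # "" becomes NA
num
```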
Re: [R] parameter couldn't be set in high-level plot() function
I think the problem I had with the bandplot (gplots) function is solved by changing expand.dots = FALSE to expand.dots = TRUE. I don't actually understand why it says FALSE here, because that means it does *not* pass extra arguments to plot. If I change it to TRUE, my main/xlab/ylab arguments are passed just like I wanted.

Fragment of bandplot [gplots]:

    if (!add) {
        m <- match.call(expand.dots = FALSE)
        m$width <- m$add <- m$sd <- m$sd.col <- NULL
        m$method <- m$n <- NULL
        m[[1]] <- as.name("plot")
        mf <- eval(m, parent.frame())
    }

Jonne.

On Mon, 2005-01-24 at 15:50 +0100, R user wrote:
> Dear R users,
>
> I am using function bandplot from the gplots package.
> To my understanding (viewing the source of bandplot) it calls
> function plot (add = FALSE) with the same parameters (except for a few
> removed).
>
> I would like to give extra parameters 'xlab' and 'ylab' to function
> bandplot, but, as can be seen below, that raises warnings (and the
> labels do not show up at the end).
>
> It does work to call title(... xlab="blah", ylab="foo") after bandplot
> (), but then I have two labels on top of each other, which is even more
> ugly.
>
> Can anyone explain to me why this goes wrong?
>
> Thanks in advance,
> Jonne.

--
R user <[EMAIL PROTECTED]>
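The effect of expand.dots can be seen in isolation with two toy functions, f and g, which are made up for this sketch and are not part of gplots. With expand.dots = FALSE the extra arguments are bundled into a single element named "..." of the matched call, whereas with TRUE each one appears under its own name, which is the form a rebuilt call to plot needs.

```r
f <- function(x, ...) match.call(expand.dots = FALSE)
g <- function(x, ...) match.call(expand.dots = TRUE)

cf <- f(1, xlab = "blah", ylab = "foo")
cg <- g(1, xlab = "blah", ylab = "foo")

names(cf)   # the extra arguments are collected under "..."
names(cg)   # xlab and ylab appear as separate named arguments
```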
[R] parameter couldn't be set in high-level plot() function
Dear R users,

I am using function bandplot from the gplots package. To my understanding (viewing the source of bandplot) it calls function plot (add = FALSE) with the same parameters (except for a few removed).

I would like to give extra parameters 'xlab' and 'ylab' to function bandplot, but, as can be seen below, that raises warnings (and the labels do not show up at the end).

It does work to call title(... xlab="blah", ylab="foo") after bandplot(), but then I have two labels on top of each other, which is even more ugly.

Can anyone explain to me why this goes wrong?

Thanks in advance,
Jonne.

    > x11() ; bandplot(x=xdata, y=zdata)
    [works fine]
    > x11() ; bandplot(x=xdata, y=zdata, xlab="blah", ylab="foo")
    There were 22 warnings (use warnings() to see them)
    > warnings()
    Warning messages:
    1: parameter "xlab" couldn't be set in high-level plot() function
    2: parameter "ylab" couldn't be set in high-level plot() function
    [warnings 3 to 22 repeat these two messages alternately]
[R] 3d bar plot
This graph -> http://www.math.hope.edu/~tanis/dallas/images/disth36.gif is an example I found at http://www.math.hope.edu/~tanis/dallas/disth1.html, created by Maple.

Does anybody know how to create something similar in R?

I have a feeling it could be possible using scatterplot3d (perhaps with type="h", the fourth example in help('scatterplot3d')?), but I cannot figure it out.

Thanks in advance,
Jonne.
[R] evaluate expression on several dataframe columns
Hi R-users,

I have a collection of dataframes and know how to build a string that refers to one of them; in this example, name_infra_alg_inc. Then, I have a character string yval, which the user can select from a drop-down list. It contains the column names of the dataframes.

    assign(paste(name_infra_alg_inc, "ci", sep="."),
           ci(get(name_infra_alg_inc)[[yval]], confidence=0.95))

My problem is that I sometimes want to combine columns. For example, suppose there are columns A, B and C. Would it be possible for yval to have the value "A+B*C" and then call some sort of evaluate function?

Maybe I could attach the dataframe and then call some function. I don't know how to figure this out, so hopefully someone can help me.

Thanks in advance
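A minimal sketch of evaluating such a string against a data frame, using parse() and eval() from base R. The data frame df and its values are made up. Evaluating the expression with the data frame as its environment makes the column names visible without attach().

```r
df <- data.frame(A = c(1, 2), B = c(3, 4), C = c(5, 6))
yval <- "A+B*C"

# Turn the string into an expression and evaluate it using the columns of df
vals <- eval(parse(text = yval), envir = df)
vals
```

The same call also works when yval is just a single column name such as "A". Since this evaluates arbitrary code, it is only appropriate when the strings come from a trusted source such as a fixed drop-down list.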