Re: [R] linear fit function with NA values
Hi,

I couldn't reproduce any error with the data you provided.

return <- read.table(text="
ATI AMU
-1 0.734 9.003
0 0.999 2.001
1 3.097 -1.003
2 NA NA
3 NA 3.541
", sep="", header=TRUE)

median <- read.table(text="
ATI AMU
-1 3.224 -2.003
0 2.999 -1.301
1 1.3 -1.003
2 4.000 2.442
3 -10 4.511
", sep="", header=TRUE)

lapply(seq_len(ncol(return)), function(i) {lm(return[,i] ~ median[,i])})
[[1]]

Call:
lm(formula = return[, i] ~ median[, i])

Coefficients:
(Intercept)  median[, i]
      4.696       -1.231

[[2]]

Call:
lm(formula = return[, i] ~ median[, i])

Coefficients:
(Intercept)  median[, i]
     3.3937      -0.1607

lapply(seq_len(ncol(return)), function(i) {lm(return[,i] ~ median[,i], na.action=na.omit)})
# same as above

sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] stringr_0.6.2  reshape2_1.2.2

loaded via a namespace (and not attached):
[1] plyr_1.8    tools_3.0.1

BTW, it is better to dput() the example dataset (see ?dput).

A.K.
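For readers following the thread, the reply above can be restated as a small self-contained sketch. The names ret, med and fits are assumptions introduced here (the posted code assigns to `return` and `median`, which hides the base R functions of the same names), and passing na.action explicitly is a choice made for this sketch so that it does not depend on the session's options. The numbers are the ones posted in the thread.

```r
# Hypothetical names: ret and med stand in for the poster's `return` and
# `median`, which mask base R functions of the same name.
ret <- data.frame(ATI = c(0.734, 0.999, 3.097, NA, NA),
                  AMU = c(9.003, 2.001, -1.003, NA, 3.541),
                  row.names = c("-1", "0", "1", "2", "3"))
med <- data.frame(ATI = c(3.224, 2.999, 1.3, 4.000, -10),
                  AMU = c(-2.003, -1.301, -1.003, 2.442, 4.511),
                  row.names = c("-1", "0", "1", "2", "3"))

# One regression per column pair. na.action = na.omit is spelled out so
# that rows with a missing value on either side are dropped regardless
# of options("na.action").
fits <- lapply(setNames(names(ret), names(ret)), function(nm) {
  lm(y ~ x,
     data = data.frame(y = ret[[nm]], x = med[[nm]]),
     na.action = na.omit)
})

# Coefficients as a matrix, one column per stock
sapply(fits, coef)

# Number of rows each fit actually used
sapply(fits, nobs)

# dput() prints a copy-and-paste representation suitable for a posting
dput(ret)
```

Naming the list elements after the columns makes it possible to write fits$ATI instead of fits[[1]], which may be easier to follow with many stocks.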
Re: [R] linear fit function with NA values
From: iza.ch1
To: arun
Cc: R help
Sent: Saturday, July 27, 2013 4:46 PM

Hi,

Thanks for your hints. I would like to describe my problem better and give an example of the data that I use.

I am conducting an event study and need to compute abnormal returns from daily stock prices. For each stock I have returns covering a period of 8 years. For some days the data are missing, for various reasons. In the Excel file these are just empty cells, but when I convert my data to 'zoo' they become NA. I get something like this:

return
     ATI     AMU
-1   0.734   9.003
0    0.999   2.001
1    3.097  -1.003
2    NA      NA
3    NA      3.541

median
     ATI     AMU
-1   3.224  -2.003
0    2.999  -1.301
1    1.3    -1.003
2    4.000   2.442
3   -10      4.511

I want to regress the first column of return on the first column of median, and the second column of return on the second column of median. When I run

OLS <- lapply(seq_len(ncol(return)), function(i) {lm(return[,i] ~ median[,i])})

I get an error message. I would like my function to omit the NAs. For example, for the ATI returns it should use only the values for -1, 0, 1 and regress them against the corresponding ATI values in median, which means it would also use only (3.224, 2.999, 1.3).

Is it possible to do this?

Thanks a lot
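One way to see which rows a column-wise regression can actually use is to count, per column, the rows where both the response and the predictor are observed. The sketch below is only an illustration: count_pairs, ret, med and the EMPTY column are names made up here, and it assumes plain data frames rather than 'zoo' objects. The ATI and AMU values are the ones posted above.

```r
# Posted values, stored under the hypothetical names ret and med.
ret <- data.frame(ATI = c(0.734, 0.999, 3.097, NA, NA),
                  AMU = c(9.003, 2.001, -1.003, NA, 3.541))
med <- data.frame(ATI = c(3.224, 2.999, 1.3, 4.000, -10),
                  AMU = c(-2.003, -1.301, -1.003, 2.442, 4.511))

# Made-up column with no observations at all, added only to show what a
# problematic column looks like in the counts.
ret$EMPTY <- NA_real_
med$EMPTY <- c(1, 2, 3, 4, 5)

# Hypothetical helper: for each column i, count the rows where both
# y[, i] and x[, i] are non-missing.
count_pairs <- function(y, x) {
  stopifnot(identical(dim(y), dim(x)))
  n <- vapply(seq_len(ncol(y)),
              function(i) sum(complete.cases(y[, i], x[, i])),
              integer(1))
  setNames(n, colnames(y))
}

pairs_per_column <- count_pairs(ret, med)
pairs_per_column

# Columns with at least two complete pairs, the minimum needed to
# estimate both an intercept and a slope.
usable <- names(pairs_per_column)[pairs_per_column >= 2]
usable
```

For ATI the count is 3, matching the rows -1, 0 and 1 described in the message; a column whose count is 0 is the kind that makes lm() stop with an error.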
Re: [R] linear fit function with NA values
From: arun
Sent: 2013-07-27 17:33:30

Hi,

set.seed(28)
dat1 <- as.data.frame(matrix(sample(c(NA,1:20),100,replace=TRUE),ncol=10))

set.seed(49)
dat2 <- as.data.frame(matrix(sample(c(NA,40:80),100,replace=TRUE),ncol=10))

lapply(seq_len(ncol(dat1)), function(i) {lm(dat2[,i] ~ dat1[,i])})
# works because the default setting removes NAs

Regarding the options, from ?lm:

na.action: a function which indicates what should happen when the data
          contain ‘NA’s. The default is set by the ‘na.action’ setting
          of ‘options’, and is ‘na.fail’ if that is unset. The
          ‘factory-fresh’ default is ‘na.omit’. Another possible value
          is ‘NULL’, no action. Value ‘na.exclude’ can be useful.

lapply(seq_len(ncol(dat1)), function(i) {lm(dat2[,i] ~ dat1[,i], na.action=na.exclude)})
# or
lapply(seq_len(ncol(dat1)), function(i) {lm(dat2[,i] ~ dat1[,i], na.action=na.omit)})

lapply(seq_len(ncol(dat1)), function(i) {lm(dat2[,i] ~ dat1[,i], na.action=na.fail)})
# Error in na.fail.default(list(`dat2[, i]` = c(54L, 59L, 50L, 64L, 40L, :
#   missing values in object

In your case, the error is different. It could be something similar to the case below:

dat1[,1] <- NA

lapply(seq_len(ncol(dat1)), function(i) {lm(dat2[,i] ~ dat1[,i], na.action=na.omit)})
# Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
#   0 (non-NA) cases    # here it is different

lapply(seq_len(ncol(dat1)), function(i) {try(lm(dat2[,i] ~ dat1[,i]))})
# works in the above case. It may not work in your case.

You need to provide a reproducible example so that the situation can be understood better.

A.K.
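The practical difference between na.omit and na.exclude, and the idea of catching the failure for an all-NA column, can be sketched as follows. The vectors reuse the ATI values posted elsewhere in this thread; safe_lm is a hypothetical wrapper written for this illustration, using tryCatch() in place of the try() call shown above.

```r
y <- c(0.734, 0.999, 3.097, NA, NA)
x <- c(3.224, 2.999, 1.3, 4.000, -10)

fit_omit    <- lm(y ~ x, na.action = na.omit)
fit_exclude <- lm(y ~ x, na.action = na.exclude)

# Both fits use the same three complete rows, so the coefficients agree.
all.equal(coef(fit_omit), coef(fit_exclude))

# The difference shows up in residuals and fitted values:
length(residuals(fit_omit))     # 3: dropped rows are gone
length(residuals(fit_exclude))  # 5: NA is put back at the dropped rows

# Hypothetical wrapper: return NULL instead of stopping when a column
# pair cannot be fitted (for example, no complete cases at all).
safe_lm <- function(y, x) {
  tryCatch(lm(y ~ x, na.action = na.omit), error = function(e) NULL)
}

is.null(safe_lm(rep(NA_real_, 5), x))  # TRUE
```

Padding with na.exclude keeps residuals aligned with the original dates, which may matter when they are later matched back to a time index.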
[R] linear fit function with NA values
From: iza.ch1
To: r-help@r-project.org
Sent: Saturday, July 27, 2013 8:47 AM

Hi,

Quick question. I am running a regression for each column of two data sets, so as a result I get several sets of coefficients. My problem is that the data I use for the regression contain NAs. How can I ignore the NAs in the lm function? I use the following code for the regression:

OLS <- lapply(seq_len(ncol(es.w)), function(i) {lm(es.w[,i] ~ es.median[,i])})

In response I get:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  all values NA

Thanks for help :)

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.