Replace cov(z) with cov(z, use = "pair") is there are missing values as there are here.
On Thu, Jun 5, 2008 at 11:54 AM, Gabor Grothendieck <[EMAIL PROTECTED]> wrote: > Check out the three vignettes (i.e. pdf documents in the zoo package). e.g. > > > Lines <- "SEC_ID DAY EARNING > IT0000001 20070101 5.467 > IT0000001 20070102 5.456 > IT0000001 20070103 4.954 > IT0000001 20070104 3.456 > IT0000002 20070101 1.456 > IT0000002 20070102 1.345 > IT0000002 20070103 1.233 > IT0000003 20070101 0.345 > IT0000003 20070102 0.367 > IT0000003 20070103 0.319 > " > DF <- read.table(textConnection(Lines), header = TRUE) > DFs <- split(DF, DF$SEC_ID) > > library(zoo) > f <- function(DF.) zoo(DF.$EARNING, as.Date(format(DF.$DAY), "%Y%m%d")) > z <- do.call(merge, lapply(DFs, f)) > cov(z) # uses n-1 > > > On Thu, Jun 5, 2008 at 11:41 AM, <[EMAIL PROTECTED]> wrote: >> Good morning, >> >> I am a new R user and I am trying to learn how to use it. >> I am trying to solve this problem. >> I have a dataframe df of daily securities (for a year) earnings as >> follows: >> >> SEC_ID DAY EARNING >> IT0000001 20070101 5.467 >> IT0000001 20070102 5.456 >> IT0000001 20070103 4.954 >> IT0000001 20070104 3.456 >> .......................... >> IT0000002 20070101 1.456 >> IT0000002 20070102 1.345 >> IT0000002 20070103 1.233 >> .......................... >> IT0000003 20070101 0.345 >> IT0000003 20070102 0.367 >> IT0000003 20070103 0.319 >> .......................... >> >> And so on: about 800 different SEC_ID and about 180000 rows. >> I have to calculate the "covariance" for each couple of securities x and >> y according to the formula: >> >> Cov(x,y) = (sum[(x-x')*(y-y')]/N)/(sx*sy) >> >> being x' and y' the mean of securities earning in the year, N the number >> of observations, sx and sy the standard deviation of x and y. >> To do this I could build a df2 data frame like this: >> >> DAY SEC_ID.x SEC_ID.y EARNING.x >> EARNING.y x' y' sx sy >> 20070101 IT0000001 IT0000002 5.467 1.456 >> a b aa bb >> 20070101 IT0000001 IT0000003 5.467 0.345 >> a c aa cc >> 20070101 IT0000002 IT0000003 1.456 0.345 >> b c bb cc >> 20070102 IT0000001 IT0000002 5.456 1.345 >> a b aa bb >> 20070102 IT0000001 IT0000003 5.456 0.367 >> a c aa cc >> 20070102 IT0000002 IT0000003 1.345 0.367 >> b c bb cc >> ........................................................................ >> ....................................................... >> >> (merging df with itself with a condition SEC_ID.x < SEC_ID.y) and then >> easily calculate the formula; but the dimensions are too big (the >> process stops whit an out-of-memory message). >> Besides partitioning the input and using a loop, are there any smarter >> solutions (eventually using split and other ways of "subgroup merging" >> to solve the problem ? >> Are there any "shortcuts" using statistical built-in functions (e.g. >> cov, vcov) ? >> Thank you in advance >> >> Angelo Linardi >> >> >> >> ** Le e-mail provenienti dalla Banca d'Italia sono trasmesse in buona fede e >> non >> comportano alcun vincolo ne' creano obblighi per la Banca stessa, salvo che >> cio' non >> sia espressamente previsto da un accordo scritto. >> Questa e-mail e' confidenziale. Qualora l'avesse ricevuta per errore, La >> preghiamo di >> comunicarne via e-mail la ricezione al mittente e di distruggerne il >> contenuto. La >> informiamo inoltre che l'utilizzo non autorizzato del messaggio o dei suoi >> allegati >> potrebbe costituire reato. Grazie per la collaborazione. >> -- E-mails from the Bank of Italy are sent in good faith but they are >> neither binding on >> the Bank nor to be understood as creating any obligation on its part except >> where >> provided for in a written agreement. This e-mail is confidential. If you >> have received it >> by mistake, please inform the sender by reply e-mail and delete it from your >> system. >> Please also note that the unauthorized disclosure or use of the message or >> any >> attachments could be an offence. Thank you for your cooperation. ** >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.