On Mon, 21 Jun 2010, Eric Zivot wrote:
I am trying to align an xts object containing irregularly spaced intra-day
price data to a regularly spaced time clock so that I can do realized
variance/covariance calculations. For example, I have two xts objects
msftTrades and geTrades created by the RTAQ package (the time index variable
is a timeDate object):
I suspect that xts has some high-level tools that could be leveraged for
this. But as a primitive zoo developer, I only know how to do it by hand
(which is hopefully still useful):
1. construct regular time index
2. expand series to regular grid (via merge)
3. use na.locf() for "last observation carried forward"
I used only a subset of your data:
## packages
library("xts")
library("timeDate")
## data
x <- xts(c(30.62, 30.64, 30.66, 30.71, 30.705, 30.695, 30.725, 30.74),
timeDate(c("2010-01-04 09:30:00", "2010-01-04 09:30:01",
"2010-01-04 09:30:02", "2010-01-04 09:30:03", "2010-01-04 09:30:04",
"2010-01-04 09:30:05", "2010-01-04 09:30:06", "2010-01-04 09:30:07")))
y <- xts(c(15.21, 15.22, 15.23, 15.24), timeDate(c("2010-01-04 09:30:01",
"2010-01-04 09:30:03", "2010-01-04 09:30:06", "2010-01-04 09:30:07")))
## 1. regular index
ix <- seq(min(c(start(x), start(y))), max(c(end(x), end(y))), by = "1 s")
## 2. merged series on all time stamps
xy <- merge(xts(,ix), x = x, y = y)
## 3. last observation carried forward
xy <- na.locf(xy)
## collect data on regular index
window(xy, ix)
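With both series on the common grid, the realized variance/covariance calculation mentioned in the question might look roughly like the following. This is only a sketch: it uses the textbook definition (sums of squares and cross-products of log returns), hard-codes the aligned toy prices that the three steps above produce, and uses a "zoo" object with a POSIXct index in GMT rather than xts/timeDate. The names px, r, and rc are made up for illustration.

```r
library(zoo)

## aligned one-second prices for the toy data (px is a hypothetical name);
## y has no trade at 09:30:00, hence the leading NA
ix <- as.POSIXct("2010-01-04 09:30:00", tz = "GMT") + 0:7
px <- zoo(cbind(
  x = c(30.62, 30.64, 30.66, 30.71, 30.705, 30.695, 30.725, 30.74),
  y = c(NA, 15.21, 15.21, 15.22, 15.22, 15.22, 15.23, 15.24)), ix)

## drop leading time stamps where not both series have traded yet,
## then compute one-second log returns
r <- diff(log(na.trim(px, sides = "left")))

## 2 x 2 matrix: realized variances on the diagonal,
## realized covariance off the diagonal
rc <- crossprod(coredata(r))
rc
```

On a full trading day one would typically sample on a coarser grid (e.g., every few minutes) before summing, but that choice is outside what this thread covers.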
Remarks:
- Note that this is fully symmetric in x and y.
- It would work analogously for "zoo" series.
- For other index classes, only the regular index construction would
need to be changed (which, I think, is the main reason that this is
not readily available in "zoo").
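To illustrate the remark about "zoo" series, here is a sketch of the same three steps with plain "zoo" objects. The POSIXct index in GMT is an assumption made for the example (the original data use timeDate), and na.rm = FALSE is set explicitly so that the leading NA in y is kept rather than trimmed.

```r
library(zoo)

## same toy data, but as "zoo" series with a POSIXct index (assumed)
t0 <- as.POSIXct("2010-01-04 09:30:00", tz = "GMT")
x <- zoo(c(30.62, 30.64, 30.66, 30.71, 30.705, 30.695, 30.725, 30.74),
  t0 + 0:7)
y <- zoo(c(15.21, 15.22, 15.23, 15.24), t0 + c(1, 3, 6, 7))

## 1. regular index
ix <- seq(min(start(x), start(y)), max(end(x), end(y)), by = "1 sec")

## 2. merged series on all time stamps (zero-width zoo carries the grid)
xy <- merge(x, y, zoo(, ix))

## 3. last observation carried forward, keeping leading NAs
xy <- na.locf(xy, na.rm = FALSE)

## collect data on regular index
window(xy, ix)
```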
hth,
Z
start(msftTrades)
GMT
[1] [2010-01-04 09:30:00]
end(msftTrades)
GMT
[1] [2010-01-04 16:00:00]
start(geTrades)
GMT
[1] [2010-01-04 09:30:01]
end(geTrades)
GMT
[1] [2010-01-04 15:59:59]
msftTrades$PRICE[1:10,]
PRICE
2010-01-04 09:30:00 "30.62"
2010-01-04 09:30:01 "30.64"
2010-01-04 09:30:02 "30.66"
2010-01-04 09:30:03 "30.71"
2010-01-04 09:30:04 "30.705"
2010-01-04 09:30:05 "30.695"
2010-01-04 09:30:06 "30.725"
2010-01-04 09:30:07 "30.74"
2010-01-04 09:30:08 "30.74"
2010-01-04 09:30:09 "30.74"
geTrades$PRICE[1:10,]
PRICE
2010-01-04 09:30:01 "15.21"
2010-01-04 09:30:03 "15.22"
2010-01-04 09:30:06 "15.23"
2010-01-04 09:30:07 "15.24"
2010-01-04 09:30:08 "15.23"
2010-01-04 09:30:09 "15.23"
2010-01-04 09:30:10 "15.22"
2010-01-04 09:30:11 "15.22"
2010-01-04 09:30:13 "15.22"
2010-01-04 09:30:14 "15.21"
Here, the trade prices are recorded to the nearest second but are
irregularly spaced, and the time clocks for MSFT and GE are different. For
example, there is no transaction for GE at 09:30:02. I want to do the
following:
1. Create a regularly spaced one second time clock between 9:30 and
16:00
2. Align the two price series to this regularly spaced time clock
using the "previous tick" method
For (1), I can use the timeDate align() function to get the regularly
spaced time clock. For example,
td1sec = align(index(msftTrades), by="1s")
start(td1sec)
GMT
[1] [2010-01-04 09:30:00]
end(td1sec)
GMT
[1] [2010-01-04 16:00:00]
td1sec[1:5]
GMT
[1] [2010-01-04 09:30:00] [2010-01-04 09:30:01] [2010-01-04 09:30:02]
[2010-01-04 09:30:03] [2010-01-04 09:30:04]
length(td1sec)
[1] 23401
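For comparison, a clock of the same length can be sketched in base R. This assumes a POSIXct index in GMT instead of timeDate, and the names t_open, t_close, and clock are hypothetical:

```r
## regular one-second clock from 09:30:00 to 16:00:00 (POSIXct, GMT assumed)
t_open  <- as.POSIXct("2010-01-04 09:30:00", tz = "GMT")
t_close <- as.POSIXct("2010-01-04 16:00:00", tz = "GMT")
clock <- seq(t_open, t_close, by = "1 sec")

## 6.5 hours * 3600 seconds + 1 end point = 23401 time stamps
length(clock)
```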
I can't figure out how to do (2). I want to align the data in msftTrades and
geTrades to the new time index td1sec. For the observations that do not
occur on a given time stamp, I want to use the previous tick for that
observation. (In S-PLUS I can easily do this using the align() function).
The RTAQ package has a function called aggregatets() that almost does what I
want. It does the previous tick aggregation to a one-second clock, but it
omits any interval that contains no observations.
For example, what I want for the geTrades data aligned to the one-second
time clock is the following:
2010-01-04 09:30:00 NA
2010-01-04 09:30:01 "15.21"
2010-01-04 09:30:02 "15.21"
2010-01-04 09:30:03 "15.22"
2010-01-04 09:30:04 "15.22"
2010-01-04 09:30:05 "15.22"
2010-01-04 09:30:06 "15.23"
2010-01-04 09:30:07 "15.24"
2010-01-04 09:30:08 "15.23"
2010-01-04 09:30:09 "15.23"
2010-01-04 09:30:10 "15.22"
2010-01-04 09:30:11 "15.22"
2010-01-04 09:30:12 "15.22"
2010-01-04 09:30:13 "15.22"
2010-01-04 09:30:14 "15.21"
.
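The table above can be reproduced for the first 15 seconds with a small sketch. It assumes "zoo" with a POSIXct index in GMT rather than the RTAQ/timeDate objects, and it converts the prices with as.numeric() first, since the quoted values in the printout suggest they are stored as character strings. The names ge, clock, and ge_1s are made up for illustration.

```r
library(zoo)

## first ten GE trades from the printout, converted from character to numeric
t0 <- as.POSIXct("2010-01-04 09:30:00", tz = "GMT")
price <- as.numeric(c("15.21", "15.22", "15.23", "15.24", "15.23",
  "15.23", "15.22", "15.22", "15.22", "15.21"))
ge <- zoo(price, t0 + c(1, 3, 6, 7, 8, 9, 10, 11, 13, 14))

## one-second clock starting at 09:30:00
clock <- seq(t0, t0 + 14, by = "1 sec")

## expand to the clock, then fill gaps with the previous tick;
## na.rm = FALSE keeps the NA at 09:30:00, before the first GE trade
ge_1s <- na.locf(merge(ge, zoo(, clock)), na.rm = FALSE)
ge_1s
```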
Eric Zivot
Professor and Gary Waterman Distinguished Scholar
Department of Economics
Adjunct Professor of Finance
Adjunct Professor of Statistics
Box 353330 email: [email protected]
University of Washington phone: 206-543-6715
Seattle, WA 98195-3330
www: http://faculty.washington.edu/ezivot
_______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should
go.