Try this using the development version of read.zoo in zoo (which we
source from R-Forge on the fly).

We use "NULL" in colClasses to skip the columns we don't need, although
col.names still has to include dummy names for them.  Of the columns
that remain, the first three (index = 1:3) are the hour, minute and
second.  FUN pastes them together and converts them to chron "times",
and FUN2 then truncates each time down to a multiple of 5 seconds.
Finally, aggregate = mean averages the rows that fall into the same
5 second interval.
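
The truncation step can be tried on its own before reading any data.  This is only a rough sketch: the four time values below are made up for illustration, and it assumes that trunc on chron times accepts any time string as the unit, in the same way as the "00:00:05" used for FUN2.  Since the original question mentions 5 minute bins, the second call shows what that unit would presumably look like.

```r
library(chron)

# made-up clock times, for illustration only
tt <- times(c("09:30:51", "09:34:59", "09:35:00", "09:40:53"))

# round each time down to the start of its 5 second bin (as in FUN2)
sec5 <- trunc(tt, "00:00:05")

# same idea with a 5 minute unit (assumed variation, not part of the code below)
min5 <- trunc(tt, "00:05:00")

data.frame(time = format(tt), sec5 = format(sec5), min5 = format(min5))
```

If 5 minute bins are what is wanted, only the unit string passed to trunc in FUN2 should need to change.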

Lines <- "CVX 20070201 9 30 51 73.25 81400 0
CVX 20070201 9 30 51 73.25 100 0
CVX 20070201 9 30 51 73.25 100 0
CVX 20070201 9 30 51 73.25 300 0
CVX 20070201 9 30 51 73.25 81400 0
CVX 20070201 9 40 51 73.25 100 0
CVX 20070201 9 40 52 73.25 100 0
CVX 20070201 9 40 53 73.25 300 0"


library(zoo)
source("http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/*checkout*/pkg/zoo/R/read.zoo.R?rev=611&root=zoo")
library(chron)

z <- read.zoo(textConnection(Lines),
        colClasses = c("NULL", "NULL", "numeric", "numeric", "numeric",
                "numeric", "numeric", "NULL"),
        col.names = c("V1", "V2", "V3", "V4", "V5", "Price", "Volume", "V8"),
        index = 1:3,
        FUN = function(tt) times(paste(tt[,1], tt[,2], tt[,3], sep = ":")),
        FUN2 = function(tt) trunc(tt, "00:00:05"),
        aggregate = mean)

The result of running the above is:

> z
         Price     Volume
09:30:50 73.25 32660.0000
09:40:50 73.25   166.6667
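
As a cross-check, the same two averages can be reproduced with base R alone.  This is a sketch, not a replacement for read.zoo: the column names are invented for illustration, and it assumes an R version in which read.table has a text argument.

```r
Lines <- "CVX 20070201 9 30 51 73.25 81400 0
CVX 20070201 9 30 51 73.25 100 0
CVX 20070201 9 30 51 73.25 100 0
CVX 20070201 9 30 51 73.25 300 0
CVX 20070201 9 30 51 73.25 81400 0
CVX 20070201 9 40 51 73.25 100 0
CVX 20070201 9 40 52 73.25 100 0
CVX 20070201 9 40 53 73.25 300 0"

# column names are hypothetical; only H, M, S, Price and Volume are used
DF <- read.table(text = Lines,
        col.names = c("Sym", "Date", "H", "M", "S", "Price", "Volume", "Flag"))

# seconds since midnight, rounded down to the start of the 5 second bin
secs <- DF$H * 3600L + DF$M * 60L + DF$S
bin <- 5L * (secs %/% 5L)

# average Price and Volume within each bin
res <- aggregate(DF[c("Price", "Volume")], by = list(bin = bin), FUN = mean)

# turn the bin start back into an hh:mm:ss label
b <- as.integer(res$bin)
res$bin <- sprintf("%02d:%02d:%02d", b %/% 3600L, (b %% 3600L) %/% 60L, b %% 60L)
res
```

The two rows should agree with the read.zoo output above: the five 09:30:51 rows average to a Volume of 32660 and the three 09:40:5x rows to 166.6667.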

On Tue, Jan 26, 2010 at 10:48 AM, Manta <mantin...@libero.it> wrote:
>
> Dear All,
> I have a large data set that looks like this:
>
> CVX 20070201 9 30 51 73.25 81400 0
> CVX 20070201 9 30 51 73.25 100 0
> CVX 20070201 9 30 51 73.25 100 0
> CVX 20070201 9 30 51 73.25 300 0
>
> First, I would like to import it, merging columns 3, 4 and 5, since together
> they form the timestamp. Then I would like to aggregate the data by splitting
> them into 5 minute bins, e.g. from 93000 up to 93459, giving as output the
> average price and volume in each 5 minute bin.
>
> Hope this helps,
> Best,
>
> Marco
> --
> View this message in context: 
> http://n4.nabble.com/Large-dataset-importing-columns-merging-and-splitting-tp1294668p1294668.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
