On Fri, Mar 18, 2011 at 2:19 PM, Sam Steingold <s...@gnu.org> wrote: > Hi, > I have data with multiple sub-second entries: > > 2011/03/15 09:32:15.035619,-0.403103,1.09664,48.6,126.92,117.32 > 2011/03/15 09:32:15.069331,-0.39851,1.09874,48.6,126.92,117.32 > 2011/03/15 09:32:15.289135,-0.402463,1.10084,48.59,126.92,117.32 > 2011/03/15 09:32:15.296110,-0.450244,1.10063,48.59,126.92,117.32 > 2011/03/15 09:32:15.451358,-0.438813,1.10273,48.59,126.93,117.32 > 2011/03/15 09:32:15.476948,-0.428701,1.10482,48.6,126.93,117.32 > 2011/03/15 09:32:15.490478,-0.407603,1.10692,48.6,126.93,117.32 > 2011/03/15 09:32:15.592054,-0.396562,1.10901,48.6,126.93,117.32 > 2011/03/15 09:32:15.592054,-0.388935,1.11111,48.6,126.93,117.32 > 2011/03/15 09:32:15.910516,-0.372265,1.11321,48.6,126.93,117.32 > 2011/03/15 09:32:15.910681,-0.336162,1.1153,48.6,126.93,117.32 > 2011/03/15 09:32:16.684729,-0.335838,1.1174,48.6,126.93,117.32 > 2011/03/15 09:32:17.196003,-0.293883,1.1195,48.6,126.93,117.32 > 2011/03/15 09:32:17.271841,-0.322763,1.11925,48.6,126.93,117.32 > 2011/03/15 09:32:17.271841,-0.329827,1.12134,48.6,126.93,117.32 > 2011/03/15 09:32:17.271842,-0.35095,1.12343,48.6,126.93,117.32 > 2011/03/15 09:32:17.331044,-0.33095,1.12552,48.6,126.93,117.32 > 2011/03/15 09:32:18.793662,-0.34787,1.12762,48.605,126.96,117.33 > 2011/03/15 09:32:18.806576,-0.10088,1.12735,48.62,126.96,117.33 > 2011/03/15 09:32:18.915319,-0.0577731,1.12708,48.62,126.96,117.33 > 2011/03/15 09:32:18.915321,-0.0417141,1.12917,48.62,126.96,117.33 > 2011/03/15 09:32:19.766950,-0.0420244,1.13125,48.62,126.95,117.33 > 2011/03/15 09:32:20.143307,-0.0184469,1.13333,48.62,126.95,117.33 > > I can load it into a data.frame and convert the first column to POSIXt. > now I want to convert it to a multi-time series by replacing multiple > entries for each second with a single entry with mean value. > (plus one more column with the count). > e.g., the snippet above would turn into > > 2011/03/15 09:32:15,11,-0.4021,1.10545,48.6,126.926,117.32 > 2011/03/15 09:32:16,1,-0.335838,1.1174,48.6,126.93,117.32 > 2011/03/15 09:32:17,5,-0.3257,... > 2011/03/15 09:32:18,4,... > 2011/03/15 09:32:19,1,... > 2011/03/15 09:32:20,1,... > > how do I do that? >
1. Try this. You can ignore the warning message: > Lines <- "2011/03/15 09:32:15.035619,-0.403103,1.09664,48.6,126.92,117.32 + 2011/03/15 09:32:15.069331,-0.39851,1.09874,48.6,126.92,117.32 + 2011/03/15 09:32:15.289135,-0.402463,1.10084,48.59,126.92,117.32 + 2011/03/15 09:32:15.296110,-0.450244,1.10063,48.59,126.92,117.32 + 2011/03/15 09:32:15.451358,-0.438813,1.10273,48.59,126.93,117.32 + 2011/03/15 09:32:15.476948,-0.428701,1.10482,48.6,126.93,117.32 + 2011/03/15 09:32:15.490478,-0.407603,1.10692,48.6,126.93,117.32 + 2011/03/15 09:32:15.592054,-0.396562,1.10901,48.6,126.93,117.32 + 2011/03/15 09:32:15.592054,-0.388935,1.11111,48.6,126.93,117.32 + 2011/03/15 09:32:15.910516,-0.372265,1.11321,48.6,126.93,117.32 + 2011/03/15 09:32:15.910681,-0.336162,1.1153,48.6,126.93,117.32 + 2011/03/15 09:32:16.684729,-0.335838,1.1174,48.6,126.93,117.32 + 2011/03/15 09:32:17.196003,-0.293883,1.1195,48.6,126.93,117.32 + 2011/03/15 09:32:17.271841,-0.322763,1.11925,48.6,126.93,117.32 + 2011/03/15 09:32:17.271841,-0.329827,1.12134,48.6,126.93,117.32 + 2011/03/15 09:32:17.271842,-0.35095,1.12343,48.6,126.93,117.32 + 2011/03/15 09:32:17.331044,-0.33095,1.12552,48.6,126.93,117.32 + 2011/03/15 09:32:18.793662,-0.34787,1.12762,48.605,126.96,117.33 + 2011/03/15 09:32:18.806576,-0.10088,1.12735,48.62,126.96,117.33 + 2011/03/15 09:32:18.915319,-0.0577731,1.12708,48.62,126.96,117.33 + 2011/03/15 09:32:18.915321,-0.0417141,1.12917,48.62,126.96,117.33 + 2011/03/15 09:32:19.766950,-0.0420244,1.13125,48.62,126.95,117.33 + 2011/03/15 09:32:20.143307,-0.0184469,1.13333,48.62,126.95,117.33" > > library(zoo) > z <- read.zoo(textConnection(Lines), sep = ",", tz = "") Warning message: In zoo(rval3, ix) : some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique > secs <- as.POSIXct(trunc(time(z), "sec")) > cbind(aggregate(z, secs, mean), count = aggregate(z, secs, length)[,1]) V2 V3 V4 V5 V6 count 2011-03-15 09:32:15 -0.4021237 1.105450 48.59727 126.9264 117.32 11 2011-03-15 09:32:16 -0.3358380 1.117400 48.60000 126.9300 117.32 1 2011-03-15 09:32:17 -0.3256746 1.121808 48.60000 126.9300 117.32 5 2011-03-15 09:32:18 -0.1370593 1.127805 48.61625 126.9600 117.33 4 2011-03-15 09:32:19 -0.0420244 1.131250 48.62000 126.9500 117.33 1 2011-03-15 09:32:20 -0.0184469 1.133330 48.62000 126.9500 117.33 1 2. An alternative that won't generate a warning message but involves a double read is: library(zoo) toPOSIXct <- function(x) as.POSIXct(trunc(as.POSIXct(x), "sec")) z <- read.zoo(textConnection(Lines), FUN = toPOSIXct, sep = ",", aggregate = mean) zl <- read.zoo(textConnection(Lines), FUN = toPOSIXct, sep = ",", aggregate = length) cbind(z, count = zl[,1]) In all cases replace textConnection(Lines) with the "myfile.csv" or whatever. -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.