Re: [R] Replace split with regex for speed ?
Thanks for your suggestions.

Cheers,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Replace-split-with-regex-for-speed-tp3386098p3388958.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Replace split with regex for speed ?
Try this:

    sub("\\.(\\d+)$", "\\1", ts)

On Thu, Mar 17, 2011 at 11:01 PM, rivercode aqua...@gmail.com wrote:
> Have timestamp in format HH:MM:SS.MMM.UUU and need to remove the last .
> so it is in format HH:MM:SS.MMMUUU. What is the fastest way to do this,
> since it has to be repeated on millions of rows. Should I use regex ?

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
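For readers following along, here is a minimal sketch of what that pattern does on a few of the sample values from the question. The `$` anchor ties the match to the end of the string, so only the final dot and the digits after it are affected. The vector name `ts_sample` is made up for this illustration and is not from the thread.

```r
# Sample values copied from the original question; ts_sample is an illustrative name.
ts_sample <- c("09:30:00.000.245", "09:30:00.000.256", "09:30:00.026.370")

# "\\." matches a literal dot, "(\\d+)$" captures the run of digits that ends
# the string, and "\\1" writes the captured digits back without the dot.
fixed <- sub("\\.(\\d+)$", "\\1", ts_sample)
print(fixed)
```

Because `sub` is vectorised, the whole column can be passed in one call; no `lapply` over rows is needed.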
Re: [R] Replace split with regex for speed ?
That's a good solution, but if you're really, really sure that the timestamps are in the format you gave, it's quite a bit faster to use substr and paste, because you don't have to do any searching in the string.

HTH
Rex

    x = rep("09:30:00.000.633",100)

    system.time(y<-paste(substr(x,1,12),substr(x,14,16),sep=""))
       user  system elapsed
       0.87    0.00    0.88
    system.time(y<-sub("\\.(\\d+)$", "\\1", x))
       user  system elapsed
       1.65    0.00    1.65
    system.time(y<-sub("\\.(\\d+)$", "\\1", x))
       user  system elapsed
       1.65    0.00    1.66
    system.time(y<-paste(substr(x,1,12),substr(x,14,16),sep=""))
       user  system elapsed
       0.88    0.00    0.89

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Henrique Dallazuanna
Sent: Friday, March 18, 2011 8:32 AM
To: rivercode
Cc: r-help@r-project.org
Subject: Re: [R] Replace split with regex for speed ?

Try this:

    sub("\\.(\\d+)$", "\\1", ts)
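The fixed-position approach only works if every value really has the HH:MM:SS.MMM.UUU layout, so a rough sketch of a guarded version may be useful. The helper name `drop_last_dot` and the vector size are assumptions made for this illustration, not something from the thread, and no timings are claimed because they depend on the machine and the R version.

```r
# Hypothetical helper: drop the 13th character (the last dot) by position,
# after checking that every value has the HH:MM:SS.MMM.UUU layout.
drop_last_dot <- function(x) {
  ok <- grepl("^\\d{2}:\\d{2}:\\d{2}\\.\\d{3}\\.\\d{3}$", x)
  if (!all(ok)) stop("unexpected timestamp format in ", sum(!ok), " value(s)")
  paste(substr(x, 1, 12), substr(x, 14, 16), sep = "")
}

x <- rep("09:30:00.000.633", 1e5)  # size chosen arbitrarily for the sketch

by_position <- drop_last_dot(x)
by_regex <- sub("\\.(\\d+)$", "\\1", x)

# Both approaches should agree on well-formed input.
stopifnot(identical(by_position, by_regex))

# Print the timings rather than asserting them.
print(system.time(drop_last_dot(x)))
print(system.time(sub("\\.(\\d+)$", "\\1", x)))
```

Note that the `grepl` check is itself a regex pass over the data, so it gives back part of the speed advantage. If the format is already guaranteed upstream, the bare `paste(substr(...), substr(...))` call from the message above is the cheaper option.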
[R] Replace split with regex for speed ?
I have timestamps in the format HH:MM:SS.MMM.UUU and need to remove the last "." so that they are in the format HH:MM:SS.MMMUUU.

What is the fastest way to do this, since it has to be repeated on millions of rows? Should I use regex?

Currently doing it with a string split, which is slow:

    head(ts)
    [1] "09:30:00.000.245" "09:30:00.000.256" "09:30:00.000.633" "09:30:00.001.309"
    [5] "09:30:00.003.635" "09:30:00.026.370"

    # Remove last . from timestamp, from HH:MM:SS.MMM.UUU to HH:MM:SS.MMMUUU
    ts = strsplit(ts, ".", fixed = TRUE)
    ts = lapply(ts, function(x) { paste(x[1], ".", x[2], x[3], sep="") } )
    ts = unlist(ts)

Thanks,
Chris