Re: [R] Replace split with regex for speed ?

2011-03-19 Thread rivercode
Thanks for your suggestions.

Cheers,
Chris

--
View this message in context: 
http://r.789695.n4.nabble.com/Replace-split-with-regex-for-speed-tp3386098p3388958.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Replace split with regex for speed ?

2011-03-18 Thread Henrique Dallazuanna
Try this:

 sub("\\.(\\d+)$", "\\1", ts)
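
As a quick check of the pattern above, here is a minimal sketch applied to a few of the sample values from the original post. The variable names ts and out are only illustrative, and the pattern is written with its string quotes spelled out.

```r
# Sample timestamps taken from the original post (HH:MM:SS.MMM.UUU).
ts <- c("09:30:00.000.245", "09:30:00.000.256", "09:30:00.026.370")

# "\\.(\\d+)$" matches the final "." plus the digits that run to the end
# of the string; "\\1" puts back only the captured digits, so that last
# "." is dropped and everything before it is left untouched.
out <- sub("\\.(\\d+)$", "\\1", ts)
print(out)
# [1] "09:30:00.000245" "09:30:00.000256" "09:30:00.026370"
```

Because the pattern is anchored with $, only the last "." can match, so sub (first match only) is enough and gsub is not needed.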


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] Replace split with regex for speed ?

2011-03-18 Thread rex.dwyer
That's a good solution, but if you're really, really sure that the timestamps 
are in the format you gave, it's quite a bit faster to use substr and paste, 
because you don't have to do any searching in the string.
HTH
Rex

> x = rep("09:30:00.000.633",100)
> system.time(y<-paste(substr(x,1,12),substr(x,14,16),sep=""))
   user  system elapsed
   0.87    0.00    0.88
> system.time(y<-sub("\\.(\\d+)$", "\\1", x))
   user  system elapsed
   1.65    0.00    1.65
> system.time(y<-sub("\\.(\\d+)$", "\\1", x))
   user  system elapsed
   1.65    0.00    1.66
> system.time(y<-paste(substr(x,1,12),substr(x,14,16),sep=""))
   user  system elapsed
   0.88    0.00    0.89
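
The fixed-position approach returns wrong results without any warning if a value is not exactly in HH:MM:SS.MMM.UUU form, so one option is to run a cheap width check before calling substr. The following is only a sketch under that assumption: the helper name strip_last_dot and the vector size are made up for illustration, the check covers length and the position of the last "." but not the digits, and timings will differ by machine and R version.

```r
# Hypothetical helper: drop the last "." from HH:MM:SS.MMM.UUU timestamps
# using fixed character positions instead of a regex search.
strip_last_dot <- function(x) {
  # Cheap sanity check: 16 characters with "." at position 13.
  # isTRUE() makes NA values fail the check instead of raising an
  # obscure error inside if().
  ok <- isTRUE(all(nchar(x) == 16 & substr(x, 13, 13) == "."))
  if (!ok) {
    stop("timestamps are not in HH:MM:SS.MMM.UUU format")
  }
  # Characters 1-12 are "HH:MM:SS.MMM"; 14-16 are "UUU"; 13 is the ".".
  paste(substr(x, 1, 12), substr(x, 14, 16), sep = "")
}

x <- rep("09:30:00.000.633", 100000)  # illustrative size only

t_substr <- system.time(y1 <- strip_last_dot(x))
t_regex  <- system.time(y2 <- sub("\\.(\\d+)$", "\\1", x))

# Both approaches should agree on well-formed input.
stopifnot(identical(y1, y2))

# Elapsed seconds for each approach (machine dependent).
print(c(substr = t_substr[["elapsed"]], regex = t_regex[["elapsed"]]))
```

If the input cannot be trusted at all, the regex version is the safer default, since it leaves strings without a trailing ".digits" unchanged rather than cutting them at fixed positions.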



[R] Replace split with regex for speed ?

2011-03-17 Thread rivercode

I have timestamps in the format HH:MM:SS.MMM.UUU and need to remove the last "."
so that they are in the format HH:MM:SS.MMMUUU.

What is the fastest way to do this, given that it has to be repeated on millions
of rows? Should I use a regex?

I am currently doing it with a string split, which is slow:

> head(ts)
[1] "09:30:00.000.245" "09:30:00.000.256" "09:30:00.000.633" "09:30:00.001.309"
"09:30:00.003.635" "09:30:00.026.370"


  # Remove the last "." from each timestamp: HH:MM:SS.MMM.UUU -> HH:MM:SS.MMMUUU
  ts = strsplit(ts, ".", fixed = TRUE)
  ts = lapply(ts, function(x) { paste(x[1], ".", x[2], x[3], sep="") } )
  ts = unlist(ts)

Thanks,
Chris

--
View this message in context: 
http://r.789695.n4.nabble.com/Replace-split-with-regex-for-speed-tp3386098p3386098.html
Sent from the R help mailing list archive at Nabble.com.
