Dear all,

I have a vector of filenames which begins like this:

X <- c("OrthoP1_DNA_str.aln", "OrthoP10_DNA_str.aln", "OrthoP100_DNA_str.aln",
       "OrthoP101_DNA_str.aln", "OrthoP102_DNA_str.aln", "OrthoP103_DNA_str.aln",
       "OrthoP104_DNA_str.aln", "OrthoP105_DNA_str.aln", "OrthoP106_DNA_str.aln",
       "OrthoP107_DNA_str.aln")

Using grep("(\\d+)", X, perl=TRUE, value=TRUE) I get the complete strings back, because grep returns the whole matching elements rather than the matched part. What I want instead is the vector:

c(1, 10, 100, 101, 102, 103, 104, 105, 106, 107)

In Perl, putting brackets around part of the pattern lets you extract only the numbers (using a construct with $1, for those who know Perl). I want to do the same in R, but can't find a way of doing it without extensive string manipulation. The problem is that the numbers differ in length, so I can't use substr.

I tried

> strsplit(X, "\\d+")
[[1]]
[1] "OrthoP"       "_DNA_str.aln"

which gives me exactly what I want to throw away. So:

> strsplit(X, "\\D+")
[[1]]
[1] ""  "1"

[[2]]
[1] ""   "10"

gives something I can use, but it still requires a lot of list manipulation afterwards to get the right vector. Is there an option or a function I'm missing somewhere?

Cheers
Joris

--
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
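
For reference, a minimal sketch of two ways the capture-group idea from the question might be expressed in base R. It is one possible approach, not necessarily the best one. It assumes every filename contains exactly one run of digits, as in the example vector; the names num_sub and num_reg are made up for illustration.

```r
# Example filenames, copied from the question so the sketch is self-contained
X <- c("OrthoP1_DNA_str.aln", "OrthoP10_DNA_str.aln", "OrthoP100_DNA_str.aln",
       "OrthoP101_DNA_str.aln", "OrthoP102_DNA_str.aln", "OrthoP103_DNA_str.aln",
       "OrthoP104_DNA_str.aln", "OrthoP105_DNA_str.aln", "OrthoP106_DNA_str.aln",
       "OrthoP107_DNA_str.aln")

# Option 1: sub() with a backreference. "\\1" plays the role of Perl's $1.
# The pattern is anchored so it matches the whole string; everything outside
# the bracketed group is therefore replaced away.
num_sub <- as.numeric(sub("^\\D*(\\d+).*$", "\\1", X, perl = TRUE))

# Option 2: regexpr() locates the first run of digits in each string and
# regmatches() pulls out just the matched text.
# Note: regmatches() silently drops elements that have no match at all.
num_reg <- as.numeric(regmatches(X, regexpr("[0-9]+", X)))

print(num_sub)
print(num_reg)
```

Option 1 stays closest to the Perl idiom; Option 2 avoids having to write a pattern that covers the whole string, at the cost of losing alignment with X if some element contains no digits.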