I suspect there are cleverer ways to do it, especially using packages like stringr and gsubfn, but using base tools, you can hack it without too much effort:
?gregexpr is the key. To get started (x is your example vector of character strings):

> gregexpr("[[:alpha:]]+[[:digit:]]+",x)
[[1]]
[1] 1 3
attr(,"match.length")
[1] 2 2
attr(,"useBytes")
[1] TRUE

[[2]]
[1] 1 3
attr(,"match.length")
[1] 2 2
attr(,"useBytes")
[1] TRUE

[[3]]
[1] 1
attr(,"match.length")
[1] 2
attr(,"useBytes")
[1] TRUE

[[4]]
[1] 1 3 5
attr(,"match.length")
[1] 2 2 2
attr(,"useBytes")
[1] TRUE

Each component of the result gives the start positions of the matches in the corresponding string, and its "match.length" attribute gives their lengths, so each "entry" in your final matrix/data frame runs from start to start + match.length - 1. You can thus lapply() over this list to extract the column name-value substrings and decode them.

Alternatively, if all your names are really 6 characters and all your values are really two digits, ?nchar and ?substring will get you the name-value substrings directly.

I leave the niggling details to you (or to other helpeRs -- especially those who can suggest a more elegant approach).

-- Bert

On Wed, Feb 8, 2012 at 12:34 PM, Sam Steingold <s...@gnu.org> wrote:
> Suppose I have a vector of strings:
> c("A1B2","A3C4","B5","C6A7B8")
> [1] "A1B2" "A3C4" "B5" "C6A7B8"
> where each string is a sequence of <column><value> pairs
> (fixed width, in this example both value and name are 1 character, in
> reality the column name is 6 chars and value is 2 digits).
> I need to convert it to a data frame:
> data.frame(A=c(1,3,0,7),B=c(2,0,5,8),C=c(0,4,0,6))
>   A B C
> 1 1 2 0
> 2 3 0 4
> 3 0 5 0
> 4 7 8 6
>
> how do I do that?
> thanks.
>
> --
> Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000
> http://mideasttruth.com http://jihadwatch.org http://pmw.org.il
> http://openvotingconsortium.org http://iris.org.il http://memri.org
> What's the difference between Apathy & Ignorance? -I don't know and don't care!
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
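
For readers who want the "niggling details" filled in, here is a minimal sketch of the two base-R routes mentioned in the reply: matching with gregexpr() and slicing fixed-width fields with substring(). It is one possible way to do it, not the poster's own code. The function names build_df, parse_regex and parse_fixed are made up for illustration, and the sketch assumes that missing columns should be filled with 0 (as in the question's desired output) and that a name repeated within one string keeps its last value. regmatches() requires R 2.14.0 or later.

```r
# Example input from the original question.
x <- c("A1B2", "A3C4", "B5", "C6A7B8")

# Hypothetical helper: turn long-form (row, name, value) triples into a
# wide data frame, filling absent name/row combinations with 0.
build_df <- function(n, row, name, value) {
  cols <- sort(unique(name))
  out <- matrix(0, nrow = n, ncol = length(cols), dimnames = list(NULL, cols))
  out[cbind(row, match(name, cols))] <- value  # index by (row, column) pairs
  as.data.frame(out)
}

# Route 1: regex. Assumes names are purely alphabetic and values purely digits.
parse_regex <- function(x) {
  pairs <- regmatches(x, gregexpr("[[:alpha:]]+[[:digit:]]+", x))
  row <- rep(seq_along(x), vapply(pairs, length, integer(1)))
  p <- unlist(pairs)
  build_df(length(x), row,
           name  = sub("[[:digit:]]+$", "", p),               # strip trailing digits
           value = as.numeric(sub("^[[:alpha:]]+", "", p)))   # strip leading letters
}

# Route 2: fixed width. Assumes every pair is exactly name_width + value_width
# characters long; any trailing partial pair is ignored.
parse_fixed <- function(x, name_width = 6L, value_width = 2L) {
  w <- name_width + value_width
  n_pairs <- nchar(x) %/% w
  row <- rep(seq_along(x), n_pairs)
  start <- unlist(lapply(n_pairs, function(k) (seq_len(k) - 1L) * w + 1L))
  s <- x[row]
  build_df(length(x), row,
           name  = substring(s, start, start + name_width - 1L),
           value = as.numeric(substring(s, start + name_width, start + w - 1L)))
}

res_regex <- parse_regex(x)
res_fixed <- parse_fixed(x, name_width = 1L, value_width = 1L)
```

For the example vector, both res_regex and res_fixed should match the data frame requested in the question. With the real data, the defaults of parse_fixed (6-character names, 2-digit values) are the relevant case; the regex route only works there if the 6-character names contain no digits.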