There might be something simpler, but this is what I came up with:

form = "C5H11BrO"

# Start position of each uppercase letter, plus one position past the end
ups = c(gregexpr("[[:upper:]]", form)[[1]], nchar(form) + 1)

# Cut the formula into one chunk per element, e.g. "C5", "H11", "Br", "O"
separated = sapply(1:(length(ups)-1), function(x) substr(form, ups[x], ups[x+1] - 1))

# Strip digits to get the symbols, strip letters to get the counts
elements = gsub("[[:digit:]]", "", separated)
nums = gsub("[[:alpha:]]", "", separated)

# A missing count means one atom of that element
ans = data.frame(element = as.character(elements), num = as.numeric(ifelse(nums == "", 1, nums)), stringsAsFactors = FALSE)
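As a possible simplification, here is a sketch of a different approach: rather than locating the uppercase letters and cutting with substr(), it pulls out each element token directly with regmatches(). The function name parse_formula is my own invention for illustration, and I am assuming the formula is a plain run of element symbols with optional counts (no parentheses, charges, or hydrates).

```r
# Hypothetical helper, not from the code above: parse a simple formula
# such as "C5H11BrO" into element symbols and atom counts.
# Assumes no parentheses, charges, or hydrate notation.
parse_formula <- function(form) {
  # One token = an uppercase letter, optional lowercase letters, optional digits
  tokens <- regmatches(form, gregexpr("[A-Z][a-z]*[0-9]*", form))[[1]]
  elements <- gsub("[0-9]", "", tokens)   # keep only the symbol
  nums <- gsub("[^0-9]", "", tokens)      # keep only the count
  data.frame(element = elements,
             num = as.numeric(ifelse(nums == "", "1", nums)),  # no count means 1
             stringsAsFactors = FALSE)
}

res <- parse_formula("C5H11BrO")
print(res)
```

For "C5H11BrO" this should give the same four rows as the code above (C 5, H 11, Br 1, O 1). Anything that does not fit the token pattern is silently skipped, so it is worth checking the input first if the formulas come from an untrusted source.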
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.