Re: [R] Counting the occurences of a charater within a string
On Thu, Dec 1, 2011 at 10:32 AM, Douglas Esneault douglas.esnea...@mecglobal.com wrote:
> I am new to R but am an experienced SAS user, and I was hoping to get some help on counting the occurrences of a character within a string at a row level.
>
> My dataframe, x, is structured as below:
>
> Col1
> abc/def
> ghi/jkl/mno
>
> I found this code on the board, but it counts all occurrences of "/" in the dataframe.
>
> chr.pos <- which(unlist(strsplit(x, NULL)) == '/')
> chr.count <- length(chr.pos)
> chr.count
> [1] 3
>
> I'd like to append a column, say cnt, that has the count of "/" for each row.

Here's an easy way from stringr:

library(stringr)
str_count(c("abc/def", "ghi/jkl/mno"), "/")
# [1] 1 2

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
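To get the per-row column the question asks for, the same call can be pointed at a data frame column. This is only a minimal sketch: it assumes the stringr package is installed and that the data is a data frame `x` with a character column `Col1`. The column name `cnt` comes from the question; wrapping the pattern in `fixed()` is an addition here so that "/" is matched literally rather than as a regular expression.

```r
# Hypothetical data frame mirroring the example in the question.
x <- data.frame(Col1 = c("abc/def", "ghi/jkl/mno"), stringsAsFactors = FALSE)

# Assumes stringr is installed; the block does nothing further if it is not.
if (requireNamespace("stringr", quietly = TRUE)) {
  # str_count() is vectorised, so it returns one count per row.
  x$cnt <- stringr::str_count(x$Col1, stringr::fixed("/"))
  print(x)
}
```

Because `str_count()` works element by element, no explicit loop or `apply` call is needed.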
[R] Counting the occurences of a charater within a string
I am new to R but am an experienced SAS user, and I was hoping to get some help on counting the occurrences of a character within a string at a row level.

My dataframe, x, is structured as below:

Col1
abc/def
ghi/jkl/mno

I found this code on the board, but it counts all occurrences of "/" in the dataframe.

chr.pos <- which(unlist(strsplit(x, NULL)) == '/')
chr.count <- length(chr.pos)
chr.count
[1] 3

I'd like to append a column, say cnt, that has the count of "/" for each row.

Can anyone point me in the right direction or offer some code to do this?

Thanks in advance for the help.

Doug Esneault

Privileged/Confidential Information may be contained in this message. If you are not the addressee indicated in this message (or responsible for delivery of the message to such person), you may not copy or deliver this message to anyone. In such case, you should destroy this message and kindly notify the sender by reply email. Please advise immediately if you or your employer does not consent to email for messages of this kind. Opinions, conclusions and other information in this message that do not relate to the official business of the GroupM companies shall be understood as neither given nor endorsed by it. GroupM companies are a member of WPP plc. For more information on our business ethical standards and Corporate Responsibility policies please refer to our website at http://www.wpp.com/WPP/About/
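The snippet in the question returns a single total because `strsplit()` yields a list with one element per string, and `unlist()` then flattens that list, discarding the boundaries between strings. A minimal sketch of that behaviour, assuming `x` is a character vector (if it is really a data frame, the column `x$Col1` would be passed instead); the names `chars`, `total` and `per_string` are introduced here for illustration:

```r
x <- c("abc/def", "ghi/jkl/mno")

chars <- strsplit(x, NULL)       # list: one vector of single characters per string
length(chars)                    # 2

# Flattening pools every character, so the count is for the whole vector.
total <- sum(unlist(chars) == "/")

# Counting inside each list element keeps one result per string.
per_string <- vapply(chars, function(ch) sum(ch == "/"), integer(1))
per_string
```

The per-string result has the same length as `x`, so it can be assigned directly as a new column.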
Re: [R] Counting the occurences of a charater within a string
## It's not a data frame -- it's just a vector.

x
[1] "abc/def"     "ghi/jkl/mno"
gsub("[^/]", "", x)
[1] "/"  "//"
nchar(gsub("[^/]", "", x))
[1] 1 2

?gsub
?nchar

-- Bert

On Thu, Dec 1, 2011 at 8:32 AM, Douglas Esneault douglas.esnea...@mecglobal.com wrote:
> [...]
> I'd like to append a column, say cnt, that has the count of "/" for each row. Can anyone point me in the right direction or offer some code to do this?

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
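The same idea can be turned around: delete the target character and compare string lengths before and after. The sketch below is an illustration rather than anything posted in the thread. `count_char` is a hypothetical helper, and `fixed = TRUE` is an added assumption so that characters such as "." or "|" are matched literally instead of being read as regex syntax.

```r
# Hypothetical helper: count occurrences of a single literal character `ch`
# in each element of the character vector `s`.
count_char <- function(s, ch) {
  nchar(s) - nchar(gsub(ch, "", s, fixed = TRUE))
}

# Hypothetical data frame mirroring the example in the question.
x <- data.frame(Col1 = c("abc/def", "ghi/jkl/mno"), stringsAsFactors = FALSE)
x$cnt <- count_char(x$Col1, "/")
x
```

Both `nchar()` and `gsub()` are vectorised, so the helper returns one count per row without an explicit loop. It is only meant for a single character; with a longer `ch` the difference in length would no longer equal the number of occurrences.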
Re: [R] Counting the occurences of a charater within a string
I used within and vapply:

x <- data.frame(Col1 = c("abc/def", "ghi/jkl/mno"), stringsAsFactors = FALSE)
count.slashes <- function(string)sum(unlist(strsplit(string, NULL)) == "/")within(x, Col2 <- vapply(Col1, count.slashes, 1))
Col1 Col21 abc/def 12 ghi/jkl/mno 2

On Thu, Dec 1, 2011 at 1:05 PM, Bert Gunter gunter.ber...@gene.com wrote:
> ## It's not a data frame -- it's just a vector.
> [...]
> nchar(gsub("[^/]", "", x))
> [1] 1 2
Re: [R] Counting the occurences of a charater within a string
Resending my code, not sure why the linebreaks got eaten:

x <- data.frame(Col1 = c("abc/def", "ghi/jkl/mno"), stringsAsFactors = FALSE)
count.slashes <- function(string) sum(unlist(strsplit(string, NULL)) == "/")
within(x, Col2 <- vapply(Col1, count.slashes, 1))
         Col1 Col2
1     abc/def    1
2 ghi/jkl/mno    2

On Thu, Dec 1, 2011 at 10:32 PM, Florent D. flo...@gmail.com wrote:
> I used within and vapply:
> [...]
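In the `vapply()` call above, the third argument `1` is a template for each result (one number per string), which lets `vapply()` check the shape of what the function returns. A slightly more explicit variant, offered only as a sketch: `integer(1)`, `USE.NAMES = FALSE` and the direct `x$Col2 <-` assignment are choices made here, not part of the original post.

```r
x <- data.frame(Col1 = c("abc/def", "ghi/jkl/mno"), stringsAsFactors = FALSE)

count.slashes <- function(string) sum(unlist(strsplit(string, NULL)) == "/")

# integer(1): each call must return exactly one integer, otherwise vapply() errors.
# USE.NAMES = FALSE: do not attach the input strings as names to the counts.
x$Col2 <- vapply(x$Col1, count.slashes, integer(1), USE.NAMES = FALSE)
x
```

The stricter template makes a mistake in the counting function surface as an error instead of a silently misshapen column.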
Re: [R] Counting the occurences of a charater within a string
strsplit is certainly an alternative, but your approach is unnecessarily complicated and inefficient. Do this, instead:

sapply(strsplit(x, "/"), length) - 1

Cheers,
Bert

On Thu, Dec 1, 2011 at 7:44 PM, Florent D. flo...@gmail.com wrote:
> Resending my code, not sure why the linebreaks got eaten:
>
> x <- data.frame(Col1 = c("abc/def", "ghi/jkl/mno"), stringsAsFactors = FALSE)
> count.slashes <- function(string) sum(unlist(strsplit(string, NULL)) == "/")
> within(x, Col2 <- vapply(Col1, count.slashes, 1))
> [...]
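The one-liner works because splitting a string such as "abc/def" on "/" produces one more piece than there are separators. A sketch applying it to a data frame column follows. `fixed = TRUE` and `lengths()` (available in R 3.2.0 and later) are substitutions made here, and `x$Col1` is assumed to be a character column, since `strsplit()` does not accept a data frame or a factor.

```r
# Hypothetical data frame mirroring the example in the question.
x <- data.frame(Col1 = c("abc/def", "ghi/jkl/mno"), stringsAsFactors = FALSE)

pieces <- strsplit(x$Col1, "/", fixed = TRUE)  # list of character vectors
pieces[[2]]                                    # "ghi" "jkl" "mno"

# Number of pieces minus one equals the number of separators between them.
x$cnt <- lengths(pieces) - 1L
x
```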
Re: [R] Counting the occurences of a charater within a string
Inefficient, maybe, but what you suggest does not work if a string starts or ends with a slash.

On Thu, Dec 1, 2011 at 11:11 PM, Bert Gunter gunter.ber...@gene.com wrote:
> strsplit is certainly an alternative, but your approach is unnecessarily complicated and inefficient. Do this, instead:
>
> sapply(strsplit(x, "/"), length) - 1
>
> Cheers,
> Bert
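The edge case can be checked directly. In base R, `strsplit()` keeps an empty first piece when a string starts with the separator but drops an empty last piece, so in this sketch it is the trailing slash that gets undercounted. The test strings and the `gregexpr()`/`regmatches()` comparison are additions for illustration, not code from the thread.

```r
s <- c("abc/def", "/abc", "abc/")

# Split-based count: number of pieces minus one.
by_split <- sapply(strsplit(s, "/", fixed = TRUE), length) - 1

# Match-based count: extract every "/" and count the matches per string.
by_match <- lengths(regmatches(s, gregexpr("/", s, fixed = TRUE)))

data.frame(s, by_split, by_match)
```

The two columns agree for the first two strings and differ for "abc/", where the split-based count comes out one too low.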
Re: [R] Counting the occurences of a charater within a string
On Dec 1, 2011, at 11:11 PM, Bert Gunter wrote:
> strsplit is certainly an alternative, but your approach is unnecessarily complicated and inefficient. Do this, instead:
>
> sapply(strsplit(x, "/"), length) - 1

Definitely more compact than the regex alternatives I came up with, but one of these still might appeal in situations where it was desirable to have the source strings as labels:

sapply(sapply(x$Col1, gregexpr, patt = "/"), length)
    abc/def ghi/jkl/mno
          1           2

nchar(sapply(x$Col1, gsub, patt = "[^/]", rep = ""))
    abc/def ghi/jkl/mno
          1           2

-- David

> Cheers,
> Bert

David Winsemius, MD
West Hartford, CT
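One caveat with the `gregexpr()` version: when a string contains no match, `gregexpr()` returns -1 rather than an empty result, so `length()` reports 1. Below is a sketch of a variant that keeps the source strings as labels but counts only real matches. The third test string and the names `m`, `naive` and `counts` are additions for illustration.

```r
s <- c("abc/def", "ghi/jkl/mno", "nomatch")

m <- gregexpr("/", s, fixed = TRUE)   # list of match positions, -1 if none

# length() counts the -1 placeholder as if it were a match ...
naive <- sapply(m, length)

# ... whereas counting only positive positions does not.
counts <- vapply(m, function(pos) sum(pos > 0), integer(1))
names(counts) <- s
counts
```

For strings that are known to contain at least one "/", as in the thread's example, the two versions give the same answer.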