Re: [R] Calculating sum of letter values
on 11/24/2008 08:57 AM [EMAIL PROTECTED] wrote:
> Hi all
>
> If I have a string, say "ABCDA", and I want to convert this to the sum of the letter values, e.g.
>
> A - 1
> B - 2
>
> etc, so "ABCDA" = 1+2+3+4+1 = 11
>
> Is there an elegant way to do this? Trying something like
>
>     which(LETTERS %in% unlist(strsplit("ABCDA", "")))
>
> is not quite correct, as it does not count repeated characters. I guess what I need is some kind of lookup table?
>
> Cheers
> Rory

    sum(as.numeric(factor(unlist(strsplit("ABCDA", "")))))
    [1] 11

Convert the letters to factors, after splitting the vector, which then enables the use of the underlying numeric codes:

    as.numeric(factor(unlist(strsplit("ABCDA", ""))))
    [1] 1 2 3 4 1

HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
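A small sketch of why the which(LETTERS %in% ...) attempt in the question undercounts. The variable names chars, idx and codes are only illustrative. %in% asks, for each of the 26 letters, whether it occurs anywhere in the string, so a repeated letter is reported once.

```r
# Split the example string into single characters.
chars <- unlist(strsplit("ABCDA", ""))

# One TRUE/FALSE per element of LETTERS, so the second "A" is not counted.
idx <- which(LETTERS %in% chars)
idx          # positions of the distinct letters present: 1 2 3 4
sum(idx)     # 10, not the desired 11

# The factor approach keeps one code per character, repeats included.
codes <- as.numeric(factor(chars))
sum(codes)   # 11
```

The factor codes happen to be right here only because the string uses the first letters of the alphabet without gaps.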
Re: [R] Calculating sum of letter values
Hi Marc

Thanks, that's almost exactly what I need... there's just a slight difference with my requirement, in that I am looking for the actual index value in the alphabetical sequence, so that instead of:

    as.numeric(factor(unlist(strsplit("XYZ", ""))))
    [1] 1 2 3

I would expect to see

    [1] 24 25 26

I have got it to work in a fairly non-elegant manner, using the following code:

    sum(unlist(lapply(strsplit("TESTING", ""), function(x) match(x, LETTERS))))

And over a list of names, this becomes:

    lapply(namelist, function(Z) {
      sum(unlist(lapply(strsplit(Z, ""), function(x) match(x, LETTERS))))
    })

But this is kind of ugly.

Rory Winston
RBS Global Banking Markets
Office: +44 20 7085 4476
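A short sketch of where the 1 2 3 comes from, using the same "XYZ" example. The names chars and f are illustrative. By default factor() builds its levels from the values that actually occur, so the codes are positions within those levels and not within the alphabet.

```r
chars <- unlist(strsplit("XYZ", ""))

f <- factor(chars)
levels(f)               # "X" "Y" "Z": only the letters that occur
as.numeric(f)           # 1 2 3: positions within those three levels

# match() looks each character up in the full alphabet instead.
match(chars, LETTERS)   # 24 25 26
```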
Re: [R] Calculating sum of letter values
Here are a couple of solutions. The first matches each character against LETTERS, returning the position in LETTERS of each match. strsplit returns a list, of which we want the first element, and then we sum the positions. The second applies function(x) match(x, LETTERS), specified in formula notation, to each letter and simplifies the result using sum.

    sum(match(strsplit(s, "")[[1]], LETTERS))

    library(gsubfn)
    strapply(s, ".", ~ match(x, LETTERS), simplify = sum)
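The first (base R) solution can be unpacked step by step as below. This is only a sketch: the intermediate names and the helper letter_sum() are hypothetical and not from the thread, and the helper assumes upper-case input.

```r
s <- "ABCDA"

parts <- strsplit(s, "")       # a list with one element per input string
chars <- parts[[1]]            # the characters of the first (only) string
pos   <- match(chars, LETTERS) # 1 2 3 4 1
sum(pos)                       # 11

# Hypothetical helper for a whole vector of strings: one sum per string.
letter_sum <- function(x) {
  vapply(strsplit(x, ""), function(ch) sum(match(ch, LETTERS)), integer(1))
}

letter_sum(c("ABCDA", "XYZ"))  # 11 75
```

Because strsplit() is already vectorised over its input, the helper needs only one pass over the resulting list.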
Re: [R] Calculating sum of letter values
You can use Marc's code by giving levels to the factor, e.g.

    as.numeric(factor(unlist(strsplit("ABCDAXYZ", "")), levels = LETTERS))
Re: [R] Calculating sum of letter values
> Thanks, that's almost exactly what I need... there's just a slight difference with my requirement, in that I am looking for the actual index value in the alphabetical sequence, so that instead of:
>
>     as.numeric(factor(unlist(strsplit("XYZ", ""))))
>     [1] 1 2 3
>
> I would expect to see
>
>     [1] 24 25 26

A minor modification of Marc's solution works in this case:

    as.numeric(factor(unlist(strsplit("XYZ", "")), levels = LETTERS))
    # [1] 24 25 26

Regards,
Richie.

Mathematical Sciences Unit
HSL
Re: [R] Calculating sum of letter values
Yep, my error... it should be:

    as.numeric(factor(unlist(strsplit("ABCDA", "")), levels = LETTERS))
    [1] 1 2 3 4 1

    as.numeric(factor(unlist(strsplit("XYZ", "")), levels = LETTERS))
    [1] 24 25 26

The step that I missed was setting the factor levels to the full set of LETTERS.

HTH,

Marc
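A sketch of the corrected factor approach, plus one caveat worth knowing. The mixed-case input "Abc" is an assumption added for illustration and does not appear in the thread. Values that are not among the supplied levels become NA, so lower-case letters propagate NA into the sum.

```r
chars <- unlist(strsplit("XYZ", ""))
as.numeric(factor(chars, levels = LETTERS))   # 24 25 26

# Illustrative mixed-case input: "b" and "c" are not in LETTERS.
mixed <- unlist(strsplit("Abc", ""))
codes <- as.numeric(factor(mixed, levels = LETTERS))
codes        # 1 NA NA
sum(codes)   # NA

# One option is to convert to upper case before looking the letters up.
upper_sum <- sum(as.numeric(factor(toupper(mixed), levels = LETTERS)))
upper_sum    # 6
```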
Re: [R] Calculating sum of letter values
> Thanks, that's almost exactly what I need... there's just a slight difference with my requirement, in that I am looking for the actual index value in the alphabetical sequence, so that instead of:
>
>     as.numeric(factor(unlist(strsplit("XYZ", ""))))
>     [1] 1 2 3
>
> I would expect to see
>
>     [1] 24 25 26

How about this?

    as.numeric(factor(unlist(strsplit("ECX", "")), levels = LETTERS))

Best regards,
Stefan Evert

[ [EMAIL PROTECTED] | http://purl.org/stefan.evert ]
Re: [R] Calculating sum of letter values
Thanks a lot for the solutions everyone... really appreciated.

Cheers
Rory

Rory Winston
RBS Global Banking Markets
Office: +44 20 7085 4476
Re: [R] Calculating sum of letter values
G'day Rory,

On Mon, 24 Nov 2008 14:57:57 + [EMAIL PROTECTED] wrote:

> If I have a string, say "ABCDA", and I want to convert this to the sum of the letter values, e.g.
>
> A - 1
> B - 2
>
> etc, so "ABCDA" = 1+2+3+4+1 = 11
>
> Is there an elegant way to do this? [...]

    R> sum(as.numeric(factor(unlist(strsplit("ABCDA", "")), levels = LETTERS)))
    [1] 11
    R> sum(as.numeric(factor(unlist(strsplit("ABCEA", "")), levels = LETTERS)))
    [1] 12

HTH.

Best wishes,

Berwin

Berwin A Turlach
Dept of Statistics and Applied Probability
Faculty of Science
National University of Singapore
6 Science Drive 2, Blk S16, Level 7
Singapore 117546
Tel.: +65 6515 4416 (secr), +65 6515 6650 (self)
FAX: +65 6872 3919
e-mail: [EMAIL PROTECTED]
http://www.stat.nus.edu.sg/~statba
Re: [R] Calculating sum of letter values
Rory Winston wrote:
> I have got it to work in a fairly non-elegant manner, using the following code:
>
>     sum(unlist(lapply(strsplit("TESTING", ""), function(x) match(x, LETTERS))))
>
> And over a list of names, this becomes:
>
>     lapply(namelist, function(Z) {
>       sum(unlist(lapply(strsplit(Z, ""), function(x) match(x, LETTERS))))
>     })
>
> But this is kind of ugly

Do you mean that the nested lapply's are kind of ugly? You don't need them. I think the following does the same as what you wrote

    f1 <- function(namelist) lapply(strsplit(namelist, ""), function(x) sum(match(x, LETTERS)))

where your code as a function would be

    f0 <- function(namelist) lapply(namelist, function(Z) {
      sum(unlist(lapply(strsplit(Z, ""), function(x) match(x, LETTERS))))
    })

(Since f0() and f1() return lists of scalar integers, it might make more sense to call unlist() on their outputs before returning them.)

Another approach is to use a named vector of values, with the characters as names, to map characters to values, such as in

    f2 <- function(namelist) {
      values <- c(seq_along(LETTERS), seq_along(letters), 0L, 0L, 0L)
      names(values) <- c(LETTERS, letters, " ", "-", ".")
      lapply(strsplit(namelist, ""), function(characters, values) sum(values[characters]), values)
    }

E.g.,

    f2(c("Mary Jean", "Maryjean", "Mary-Jean", "MARYJEAN"))
    [[1]]
    [1] 87

    [[2]]
    [1] 87

    [[3]]
    [1] 87

    [[4]]
    [1] 87

That approach lets you map several characters to the same value, and the values are not restricted to the small positive integers 1:length(possibleCharacters).

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
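Following the remark about unlist(), here is a sketch of the lookup-table idea that returns a plain integer vector instead of a list. The name f2_vec and passing the table as an argument are assumptions made for illustration; the table itself is the one built in f2.

```r
# Lookup table as in f2: upper and lower case share values,
# while space, hyphen and period count as zero.
values <- c(seq_along(LETTERS), seq_along(letters), 0L, 0L, 0L)
names(values) <- c(LETTERS, letters, " ", "-", ".")

# Hypothetical variant of f2 that returns one integer per name.
f2_vec <- function(namelist, values) {
  vapply(strsplit(namelist, ""),
         function(characters) sum(values[characters]),
         integer(1))
}

f2_vec(c("Mary Jean", "MARYJEAN"), values)   # 87 87
```

A character that is missing from the table indexes to NA, so the sum for that name is NA; that can be a useful signal that the table needs another entry.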