Re: [R] Calculating sum of letter values

2008-11-24 Thread Marc Schwartz
on 11/24/2008 08:57 AM [EMAIL PROTECTED] wrote:
> Hi all
>
> If I have a string, say "ABCDA", and I want to convert this to the sum of the
> letter values, e.g.
>
> A - 1
> B - 2
>
> etc, so "ABCDA" = 1+2+3+4+1 = 11
>
> Is there an elegant way to do this? Trying something like
>
> which(LETTERS %in% unlist(strsplit("ABCDA", "")))
> is not quite correct, as it does not count repeated characters. I guess what
> I need is some kind of lookup table?
>
> Cheers
> Rory


> sum(as.numeric(factor(unlist(strsplit("ABCDA", "")))))
[1] 11


Convert the letters to factors, after splitting the vector, which then
enables the use of the underlying numeric codes:

> as.numeric(factor(unlist(strsplit("ABCDA", ""))))
[1] 1 2 3 4 1

HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculating sum of letter values

2008-11-24 Thread Rory.WINSTON
Hi Marc

Thanks, that's almost exactly what I need... there's just a slight difference
with my requirement, in that I am looking for the actual index value in the
alphabetical sequence, so that instead of:

> as.numeric(factor(unlist(strsplit("XYZ", ""))))
[1] 1 2 3

I would expect to see

[1] 24 25 26

I have got it to work in a fairly inelegant manner, using the following code:

sum(unlist(lapply(strsplit("TESTING", ""), function(x) match(x, LETTERS))))

And over a list of names, this becomes:

lapply(namelist, function(Z) {
  sum(unlist(lapply(strsplit(Z, ""), function(x) match(x, LETTERS))))
})

But this is kind of ugly.

Rory Winston
RBS Global Banking & Markets
Office: +44 20 7085 4476



Re: [R] Calculating sum of letter values

2008-11-24 Thread Gabor Grothendieck
Here are a couple of solutions.

The first matches each character against LETTERS, returning the
position number in LETTERS of the match.  strsplit returns a list, of
which we want the first element, and then we sum the matched positions.

The second applies function(x) match(x, LETTERS), which is specified
in formula notation, to each letter and simplifies the result using sum.

In both, s is the input string, e.g. s <- "ABCDA".

sum(match(strsplit(s, "")[[1]], LETTERS))

library(gsubfn)
strapply(s, ".", ~ match(x, LETTERS), simplify = sum)
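
As an illustration only, the base R one-liner can be unrolled into its
intermediate steps. The variable names (pieces, chars, positions, total)
are made up for this sketch and are not part of the original post.

```r
s <- "ABCDA"

# strsplit() returns a list with one element per input string.
pieces <- strsplit(s, "")

# Take the first (and only) element: a character vector of single letters.
chars <- pieces[[1]]

# match() gives the position of each letter in LETTERS, repeats included.
positions <- match(chars, LETTERS)

# Summing the positions gives the letter value of the whole string.
total <- sum(positions)
print(positions)
print(total)
```

Note that match() returns NA for anything not in LETTERS (lower-case
letters, spaces, punctuation), so the sum would be NA for such input.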



Re: [R] Calculating sum of letter values

2008-11-24 Thread Jagat.K.Sheth
You can use Marc's code by giving levels to the factor, e.g.

as.numeric(factor(unlist(strsplit("ABCDAXYZ", "")), levels = LETTERS))



Re: [R] Calculating sum of letter values

2008-11-24 Thread Richard . Cotton
> Thanks, that's almost exactly what I need... there's just a slight
> difference with my requirement, in that I am looking for the actual
> index value in the alphabetical sequence, so that instead of:
>
> as.numeric(factor(unlist(strsplit("XYZ", ""))))
> [1] 1 2 3
>
> I would expect to see
>
> [1] 24 25 26

A minor modification of Marc's solution works in this case:

as.numeric(factor(unlist(strsplit("XYZ", "")), levels = LETTERS))
# [1] 24 25 26

Regards,
Richie.

Mathematical Sciences Unit
HSL


Re: [R] Calculating sum of letter values

2008-11-24 Thread Marc Schwartz
Yep, my error...it should be:

> as.numeric(factor(unlist(strsplit("ABCDA", "")), levels = LETTERS))
[1] 1 2 3 4 1

> as.numeric(factor(unlist(strsplit("XYZ", "")), levels = LETTERS))
[1] 24 25 26

The step that I missed was setting the factor levels to the full set of
LETTERS.
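
A small sketch of why the levels matter, using made-up variable names
(chars, codes_default, codes_alphabet): without an explicit levels
argument, factor() takes its levels from the sorted unique values that
are present, so the integer codes only rank the letters within the string.

```r
chars <- strsplit("ECX", "")[[1]]

# Default levels are the sorted unique values: "C", "E", "X".
# The codes are therefore ranks within this string, not alphabet positions.
codes_default <- as.integer(factor(chars))

# With levels = LETTERS, each code is the letter's position in the alphabet.
codes_alphabet <- as.integer(factor(chars, levels = LETTERS))

print(codes_default)
print(codes_alphabet)
```

For "ABCDA" the two versions happen to agree, because the letters present
are exactly the first four letters of the alphabet.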

HTH,

Marc



Re: [R] Calculating sum of letter values

2008-11-24 Thread Stefan Evert




> Thanks, that's almost exactly what I need... there's just a slight
> difference with my requirement, in that I am looking for the actual
> index value in the alphabetical sequence, so that instead of:
>
> as.numeric(factor(unlist(strsplit("XYZ", ""))))
> [1] 1 2 3
>
> I would expect to see
>
> [1] 24 25 26

How about this?

as.numeric(factor(unlist(strsplit("ECX", "")), levels = LETTERS))




Best regards,
Stefan Evert

[ [EMAIL PROTECTED] | http://purl.org/stefan.evert ]



Re: [R] Calculating sum of letter values

2008-11-24 Thread Rory.WINSTON
Thanks a lot for the solutions everyone...really appreciated.

Cheers
Rory

Rory Winston
RBS Global Banking & Markets
Office: +44 20 7085 4476



Re: [R] Calculating sum of letter values

2008-11-24 Thread Berwin A Turlach
G'day Rory,

On Mon, 24 Nov 2008 14:57:57 +
[EMAIL PROTECTED] wrote:

> If I have a string, say "ABCDA", and I want to convert this to the
> sum of the letter values, e.g.
>
> A - 1
> B - 2
>
> etc, so "ABCDA" = 1+2+3+4+1 = 11
>
> Is there an elegant way to do this? [...]

R> sum(as.numeric(factor(unlist(strsplit("ABCDA", "")), levels = LETTERS)))
[1] 11
R> sum(as.numeric(factor(unlist(strsplit("ABCEA", "")), levels = LETTERS)))
[1] 12

HTH.

Best wishes,

Berwin

=========================== Full address =============================
Berwin A Turlach                            Tel.: +65 6515 4416 (secr)
Dept of Statistics and Applied Probability        +65 6515 6650 (self)
Faculty of Science                          FAX : +65 6872 3919
National University of Singapore
6 Science Drive 2, Blk S16, Level 7          e-mail: [EMAIL PROTECTED]
Singapore 117546                    http://www.stat.nus.edu.sg/~statba



Re: [R] Calculating sum of letter values

2008-11-24 Thread William Dunlap
Rory Winston wrote:
> I have got it to work in a fairly non-elegant manner, using the
> following code:
>
> sum(unlist(lapply(strsplit("TESTING", ""), function(x) match(x, LETTERS))))
>
> And over a list of names, this becomes:
>
> lapply(namelist, function(Z) { sum(unlist(lapply(strsplit(Z, ""),
>   function(x) match(x, LETTERS)))) })
>
> But this is kind of ugly
>
> Rory Winston
> RBS Global Banking & Markets
> Office: +44 20 7085 4476

Do you mean that the nested lapply's are kind of ugly?  You don't
need them.  I think the following does the same as what you wrote:

   f1 <- function(namelist) lapply(strsplit(namelist, ""),
       function(x) sum(match(x, LETTERS)))

where your code as a function would be

   f0 <- function(namelist) lapply(namelist, function(Z) {
       sum(unlist(lapply(strsplit(Z, ""), function(x) match(x, LETTERS)))) })

(Since f0() and f1() return lists of scalar integers, it might make more
sense to call unlist() on their outputs before returning them.)
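
One possible way to get a plain vector instead of a list, sketched here
with vapply() rather than unlist(); the name f1_vec is hypothetical and
does not appear in the thread.

```r
# Like f1(), but returns an integer vector with one sum per input string.
# vapply() also checks that every result is a single integer.
f1_vec <- function(namelist) {
  vapply(strsplit(namelist, ""),
         function(x) sum(match(x, LETTERS)),
         integer(1))
}

print(f1_vec(c("TESTING", "ABCDA")))
```

Calling unlist(f1(namelist)) would give the same values; vapply() just
makes the expected result type explicit.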

Another approach is to use a named vector of character values to map
characters to values, such as in

   f2 <- function(namelist) {
       values <- c(seq_along(LETTERS), seq_along(letters), 0L, 0L, 0L)
       names(values) <- c(LETTERS, letters, " ", "-", ".")
       lapply(strsplit(namelist, ""),
              function(characters, values) sum(values[characters]), values)
   }

E.g.,
   > f2(c("Mary Jean", "Maryjean", "Mary-Jean", "MARYJEAN"))
  [[1]]
  [1] 87

  [[2]]
  [1] 87

  [[3]]
  [1] 87

  [[4]]
  [1] 87

That approach lets you map several characters to the same value, and the
values are not restricted to the small positive integers
1:length(possibleCharacters).
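
To illustrate that last point, here is a sketch with an invented value
table: vowels are worth 0.5, every other letter is worth 2 (in either
case), and a space is worth 0. The table and the function name score are
assumptions made up for this example only.

```r
# Hypothetical value table: several characters share a value, and the
# values are not small positive integers.
vowels <- c("A", "E", "I", "O", "U")
letter_values <- ifelse(LETTERS %in% vowels, 0.5, 2)
values <- c(letter_values, letter_values, 0)
names(values) <- c(LETTERS, letters, " ")

# Look each character up by name and sum the values for each string.
score <- function(namelist) {
  vapply(strsplit(namelist, ""),
         function(characters) sum(values[characters]),
         numeric(1))
}

print(score(c("Mary Jean", "MARYJEAN")))
```

A character that is missing from names(values) is looked up as NA, so
the sum for a string containing it is NA rather than silently wrong.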
  
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 
