[R] character to numeric conversion

2007-03-19 Thread Robin Hankin
Hi.

Is there a straightforward way to convert a character string containing
comma-delimited numbers to a numeric vector?

In my application, I use

system(executable.string, intern=TRUE)

which returns a string like

[0.E-38, 2.096751179214927596171268230,  
3.678944959657480671183123052, 4.976528845643001020345216157,  
6.072390165503099343887569007, 7.007958550337542210168866070,  
7.807464185827177139302778736, 8.486139455817034846608029724,  
9.053706780665060873259065771, 9.516172308326877463284426111,  
9.876856047379733199590985269, 10.13695826383869052536062804,  
10.29580989588667234885515374, 10.35092785255025551187463209,  
10.29795676261278695909972578, 10.13052574735986793562227138,  
9.839990935943625006580521345, 9.414977153151389385186358494,  
8.840562526759586215404890348, 8.096830792651667245232639586,  
7.156244887881612948153311800, 5.978569259122249264778017262,  
4.499809670330265066808481929, 2.602689685444383764768503589, 0.E-38]


(the output is a single line).   In a big run, the string may contain  
10^5 or possibly 10^6 numbers.

What's the recommended way to convert this to a numeric vector?






--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] character to numeric conversion

2007-03-19 Thread Dimitris Rizopoulos
You could give strsplit() a try, e.g.,

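strg <- "0.E-38, 2.096751179214927596171268230, 3.678944959657480671183123052"
strg <- paste(rep(strg, 5000), collapse = ", ")
##
# split on the delimiter; going through a factor means only the
# distinct strings are converted to numeric
f.out <- factor(strsplit(strg, ", ")[[1]])
n.out <- as.numeric(levels(f.out))[as.integer(f.out)]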
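
For the bracketed format in the original post, the same idea can be sketched without the factor step. This is only a minimal illustration: the short string `x` is a made-up sample in the poster's format, and the brackets and blanks are stripped with gsub() before splitting.

```r
# Hypothetical sample input in the bracketed format from the original post.
x <- "[0.E-38, 2.096751179214927596171268230, 3.678944959657480671183123052, 0.E-38]"

# Drop the brackets and blanks, then split on the commas.
parts <- strsplit(gsub("[][ ]", "", x), ",", fixed = TRUE)[[1]]

# as.numeric() is vectorised over the resulting character vector.
num <- as.numeric(parts)
print(num)
```

Whether the factor detour pays off presumably depends on how many repeated values the string contains.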


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm




Re: [R] character to numeric conversion

2007-03-19 Thread Peter Dalgaard
scan() on a text connection:

> x <- "[0.E-38, 2.096751179214927596171268230,
+ 3.678944959657480671183123052, 4.976528845643001020345216157,
+ 6.072390165503099343887569007, 7.007958550337542210168866070,
+ 7.807464185827177139302778736, 8.486139455817034846608029724,
+ 9.053706780665060873259065771, 9.516172308326877463284426111,
+ 9.876856047379733199590985269, 10.13695826383869052536062804,
+ 10.29580989588667234885515374, 10.35092785255025551187463209,
+ 10.29795676261278695909972578, 10.13052574735986793562227138,
+ 9.839990935943625006580521345, 9.414977153151389385186358494,
+ 8.840562526759586215404890348, 8.096830792651667245232639586,
+ 7.156244887881612948153311800, 5.978569259122249264778017262,
+ 4.499809670330265066808481929, 2.602689685444383764768503589, 0.E-38]"
> tc <- textConnection(gsub("[][ \n]","",x))
> xx <- scan(tc,sep=",")
Read 25 items
> summary(xx)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.000   4.977   8.097   7.049   9.840  10.350
> close(tc)

(By far, the hardest bit was getting the gsub regexp right...)

Alternatively, just get rid of the brackets and replace commas with
whitespace. A problem with sep="," is that it gets confused by line
endings following a comma.

> tc <- textConnection(gsub(",", " ", gsub("[][]", "", x)))
> xx <- scan(tc)
Read 25 items
> summary(xx)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.000   4.977   8.097   7.049   9.840  10.350
> close(tc)
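
In current versions of R, scan() also has a `text` argument, which avoids opening and closing the text connection by hand. A minimal sketch, assuming a single-line input; the short string `x` is a made-up sample in the same format:

```r
# Hypothetical single-line sample in the format from the original post.
x <- "[0.E-38, 2.096751179214927596171268230, 9.839990935943625006580521345, 0.E-38]"

# Strip the brackets, then let scan() split on commas.
# Numeric fields have surrounding blanks stripped by scan() itself.
xx <- scan(text = gsub("[][]", "", x), sep = ",", quiet = TRUE)
print(xx)
```

The caveat about sep="," and line endings after a comma still applies if the input spans several lines.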






-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] character to numeric conversion

2007-03-19 Thread Robin Hankin
Hello everybody

thanks for the tips.

I *think* this should be the same thread


The manpage for system() says that lines of over 8095
characters will be split.  This is causing me problems.
How do I get round the 8095 character limit?


Simple toy example follows:



jj <- system("echo 4 | awk '{for(i=1;i<100;i++){printf(\"%s,\",
$1)}}'| sed -e \"s/,$//\"", intern=TRUE)


This is fine.  But ...


jj <- system("echo 4 | awk '{for(i=1;i<10000;i++){printf(\"%s,\",
$1)}}'| sed -e \"s/,$//\"", intern=TRUE)



has jj split into three bits, which is upsetting my call.  In my
application the split occurs in the middle of a multi-digit number,
which messes up my conversion to numeric.









Re: [R] character to numeric conversion

2007-03-19 Thread Robin Hankin

On 19 Mar 2007, at 11:20, Robin Hankin wrote:

> The manpage for system() says that lines of over 8095
> characters will be split.  This is causing me problems.
> How do I get round the 8095 character limit?

Er, just paste the output together using paste(..., collapse = "")

The split is clean, so concatenating the lines will not lose
any characters.
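
A minimal sketch of that, with made-up data: the vector `chunks` stands in for what system(..., intern=TRUE) might return when a long line is split, here with the break falling in the middle of the number 10.125.

```r
# Hypothetical result of system(..., intern = TRUE): one long line that
# was split into two pieces in the middle of "10.125".
chunks <- c("1.5,2.25,10.1", "25,3.0")

# Glue the pieces back together with no separator ...
whole <- paste(chunks, collapse = "")

# ... and only then convert to numeric.
num <- scan(text = whole, sep = ",", quiet = TRUE)
print(num)
```

Converting each chunk separately would instead give 10.1 and 25 as two different numbers.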


HTH








Re: [R] character to numeric conversion

2007-03-19 Thread Roger Bivand
On Mon, 19 Mar 2007, Robin Hankin wrote:

> The manpage for system() says that lines of over 8095
> characters will be split.  This is causing me problems.
> How do I get round the 8095 character limit?

Can you use sed or awk in a pipe externally to change ", " into "\n"
while still out in the shell, inside the system() call, for example
using the record separator RS in awk?
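
A rough sketch of that idea, assuming a Unix-like shell. It uses tr rather than sed or awk purely for brevity, and the printf command is a made-up stand-in for the real executable:

```r
# Hypothetical command: printf stands in for the real program, and tr
# turns every comma into a newline before R sees the output.
cmd <- "printf '1.5,2.25,10.125\\n' | tr ',' '\\n'"

# With one number per line, each element returned by system() is short,
# so the long-line split should no longer come into play.
out <- system(cmd, intern = TRUE)
num <- as.numeric(out)
print(num)
```

If the real output also contains brackets or blanks, those would need to be removed in the pipeline or in R as well.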

 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]



Re: [R] character to numeric conversion

2007-03-19 Thread Gabor Grothendieck
Here is one way.  This matches strings which contain those characters
found in a number, converting each such string to numeric.

library(gsubfn)
strapply(x, "[-0-9+.E]+", as.numeric)
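
For comparison, the same matching idea can be sketched in base R with gregexpr() and regmatches(), without the gsubfn package. The string `x` is a made-up sample. The character class is the one used above, so as written it would not match a lowercase "e" exponent.

```r
# Hypothetical sample input in the format from the original post.
x <- "[0.E-38, 2.5, 10.25]"

# Extract every run of characters that can occur in a number ...
tokens <- regmatches(x, gregexpr("[-0-9+.E]+", x))[[1]]

# ... and convert each run to numeric.
num <- as.numeric(tokens)
print(num)
```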




Re: [R] character to numeric conversion

2007-03-19 Thread Greg Snow
Could you replace 'system' with 'pipe' and read directly from the pipe
connection rather than the intermediate step of having a text string?
If the external function just returns the numbers with commas and spaces
(but no line feeds), then you should be able to use 'scan' directly on
the connection.
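
Something along these lines, as a sketch only. It assumes a Unix-like shell, and the printf command is a made-up stand-in for the real executable:

```r
# Hypothetical command standing in for the real executable; it prints
# comma-separated numbers on a single line.
cmd <- "printf '1.5, 2.25, 10.125\\n'"

# Open a pipe connection to the command and let scan() read from it
# directly, so no intermediate character string is created.
con <- pipe(cmd, open = "r")
xx <- scan(con, sep = ",", quiet = TRUE)
close(con)
print(xx)
```

Since no character string is passed through system() here, the long-line split described in its help page should not arise, though that would be worth checking on real output.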

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 
