[R] data file import - numbers and letters in a matrix(!)

2007-04-12 Thread Felix Wave
Hello,
I have a problem importing a data file. It seems very tricky.
I have a text file (see the end of this mail). Every file has a different
number of measurements, which start with START OF HEIGHT DATA and end with
END OF HEIGHT DATA.

I imported the file into a matrix, but the letters before the numbers are my
problem (S= ,S=,x=,y=).
Because of the letters and the space after S=, I get a different number of
columns per row in my matrix, and with letters in the matrix I can't calculate.


My question: is it possible to import the file so that I get only 3 columns
containing numbers, without letters like x=, y=?

Thanks a lot
Felix




My R Code:
--

# na.strings = "S="

Measure1 <- matrix(scan("data.dat", n = 5063 * 4, skip =   20,
                        what = character()), 5063, 3, byrow = TRUE)
Measure2 <- matrix(scan("data.dat", n = 5063 * 4, skip = 5220,
                        what = character()), 5063, 3, byrow = TRUE)



My data file:
---

FILEDATE:02.02.2007
...

START OF HEIGHT DATA
S= 0 y=0.0 x=0.
S= 0 y=0.1 x=0.00055643
...
S= 9 y=4.9 x=1.67278117
S= 9 y=5.0 x=1.74873257
S=10 y=0.0 x=0.
S=10 y=0.1 x=0.00075557
...
S=99 y=5.3 x=1.94719490
END OF HEIGHT DATA
...

START OF HEIGHT DATA
S= 0 y=0.0 x=0.
S= 0 y=0.1 x=0.00055643



The imported matrix: 

      [,1]           [,2]           [,3]    [,4]
 [6,] "S="           "9"            "y=4.9" "x=1.67278117"
 [7,] "S="           "9"            "y=5.0" "x=1.74873257"
 [8,] "S=10"         "y=0.0"        "x=0."  "S=10"
 [9,] "y=0.1"        "x=0.00075557" "S=10"  "y=0.2"
[10,] "x=0.00277444" "S=10"         "y=0.3" "x=0.00605958"
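
The shifted rows above follow from how scan() splits its input on whitespace. A minimal sketch, using two lines from the data file and assuming an R version whose scan() accepts the text argument; the variable names are only for illustration:

```r
# Two sample lines taken from the data file shown above
one_digit <- "S= 9 y=4.9 x=1.67278117"
two_digit <- "S=10 y=0.0 x=0."

# scan() splits on whitespace, so the padding after "S=" adds a token
tok1 <- scan(text = one_digit, what = character(), quiet = TRUE)
tok2 <- scan(text = two_digit, what = character(), quiet = TRUE)

length(tok1)  # 4 tokens: "S=" "9" "y=4.9" "x=1.67278117"
length(tok2)  # 3 tokens: "S=10" "y=0.0" "x=0."
```

Lines with a one-digit S value yield four tokens and lines with a two-digit S value yield three, so filling a fixed-width matrix by row lets the values drift across columns.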

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] data file import - numbers and letters in a matrix(!)

2007-04-12 Thread Gabor Grothendieck
Try pasting this into an R session:


Lines.raw <- "FILEDATE:02.02.2007
...

START OF HEIGHT DATA
S= 0 y=0.0 x=0.
S= 0 y=0.1 x=0.00055643
...
S= 9 y=4.9 x=1.67278117
S= 9 y=5.0 x=1.74873257
S=10 y=0.0 x=0.
S=10 y=0.1 x=0.00075557
...
S=99 y=5.3 x=1.94719490
END OF HEIGHT DATA
...

START OF HEIGHT DATA
S= 0 y=0.0 x=0.
S= 0 y=0.1 x=0.00055643
"

# next line would be replaced by
#  something like: Lines <- readLines("myfile.dat")
Lines <- readLines(textConnection(Lines.raw))

# extract those lines that contain an =
Lines <- grep("=", Lines, value = TRUE)

# get col names by removing all but letters & spaces from line 1
cn <- gsub("[^a-zA-Z ]", "", Lines[1])
cn <- scan(textConnection(cn), what = "")

# remove anything that is not a number, dot or space and read in
Lines <- gsub("[^ .0-9]", "", Lines)
DF <- read.table(textConnection(Lines), col.names = cn)
closeAllConnections()
DF
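
The code above reads all measurements into a single data frame. Since the original question mentions several START/END blocks per file, here is a minimal sketch of one way to keep them apart. The helper name read_height_blocks is hypothetical (not from this thread), and the sketch assumes that every block holds at least one data line and that only the last block may lack its END marker:

```r
# Hypothetical helper: returns one data frame per START/END block
read_height_blocks <- function(lines) {
  start <- grep("^START OF HEIGHT DATA", lines)
  end   <- grep("^END OF HEIGHT DATA", lines)
  # assumption: an unterminated final block runs to the end of the input
  if (length(end) < length(start)) end <- c(end, length(lines) + 1)
  lapply(seq_along(start), function(i) {
    # lines strictly between the START marker and the END marker
    block <- lines[seq(start[i] + 1, length.out = end[i] - start[i] - 1)]
    block <- grep("=", block, value = TRUE)   # drop "..." and blank lines
    block <- gsub("[^ .0-9]", "", block)      # keep digits, dots and spaces
    read.table(text = block, col.names = c("S", "y", "x"))
  })
}

# Usage on a shortened copy of the sample data
sample_lines <- c("FILEDATE:02.02.2007",
                  "START OF HEIGHT DATA",
                  "S= 0 y=0.0 x=0.",
                  "S=10 y=0.1 x=0.00075557",
                  "END OF HEIGHT DATA",
                  "START OF HEIGHT DATA",
                  "S= 0 y=0.1 x=0.00055643")
blocks <- read_height_blocks(sample_lines)
length(blocks)  # 2
```

For a real file, sample_lines would be replaced by the result of readLines() on that file.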






Re: [R] data file import - numbers and letters in a matrix(!)

2007-04-12 Thread Adaikalavan Ramasamy
Here are the contents of my testdata.txt:

-
START OF HEIGHT DATA
S= 0y=0.0 x=0.
S= 0 y=0.1 x=0.00055643
  S= 9 y=4.9 x=1.67278117
   S= 9 y=5.0 x=1.74873257
S=10   y=0.0   x=0.
 S=10y=0.1 x=0.00075557
S=99 y=5.3x=1.94719490
END OF HEIGHT DATA
-

If you have access to a shell, you can try preprocessing the input
file for read.delim using

cat testdata.txt | grep -v ^START | grep -v ^END | sed 's/ //g' | 
sed 's/S=//' | sed 's/y=/\t/' | sed 's/x=/\t/'

or here is my ugly fix in R

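  my.read.file <- function(file){

   v1 <- readLines( con=file, n=-1)            # read all lines
   v2 <- v1[ - grep( "^START|^END", v1 ) ]     # drop the marker lines
   v3 <- gsub( " ", "", v2)                    # remove every space
   v4 <- gsub( "S=|y=|x=", " ", v3 )           # turn labels into separators
   v5 <- gsub("^ ", "", v4)                    # strip the leading space

   m  <- t( sapply( strsplit(v5, split=" "), as.numeric ) )
   colnames(m) <- c("S", "y", "x" )
   return(m)
  }

  my.read.file( "testdata.txt" )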

Regards, Adai
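
As a variation on the same idea, the numbers can be pulled out directly instead of deleting the labels. This is a minimal sketch, not part of the reply above; it assumes every data line holds exactly three non-negative numbers without exponent notation, and the sample vector v is made up from the testdata.txt lines shown earlier:

```r
# Sample lines with irregular spacing, as in testdata.txt above
v <- c("S= 0y=0.0 x=0.",
       "  S= 9 y=4.9 x=1.67278117",
       "S=10   y=0.0   x=0.")

# extract every run of digits and dots, then convert each run to a number
nums <- regmatches(v, gregexpr("[0-9.]+", v))
m <- do.call(rbind, lapply(nums, as.numeric))
colnames(m) <- c("S", "y", "x")
m
```

Because the pattern matches only digits and dots, the amount of whitespace around the labels does not matter.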



