Dear all,
I am working with a csv file.
Some of the data in the file are not valid, and they are marked with a star '*'.
For example: *789.
I have attached to this email an example file (test.txt) that looks like
the data I have to work with.
I see 2 possibilities, neither of which I can get to work in R:
1-first & easiest solution:
Read the data with read.csv in R, and define as na strings all cells
containing a star (*).
Something that would look like this ...
> DATA<-read.csv("test.txt",na.strings=list(length(grep("\\*",DATA,value=T))==0))
> DATA
X1 X.789 LNM. X78 X56 X89 X56.1 X100
1 2 700 AUW 78 56 89 56 100
2 3 400 TOC 78 56 89 56 10
3 4 389 RMN 78 56 89 56 *89
4 5 400 LNM 78 56 *452 56 100
5 6 200 UTC 78 *40 89 56 100
6 7 100 GAT 78 56 8 56 *100
7 8 79 *LNM 78 56 9 56 100
8 9 89 TCG 78 56 800 56 *100
9 10 78* LNM 78 56 89 56 100
...but which would actually work (the stars are still there)! Does anyone know
how to do that?
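A note on this first approach: na.strings is compared against whole field values, so it cannot express "any cell containing a star". One possible workaround, given here only as a minimal sketch, is to read every column as character, blank out the starred cells, and then let R re-guess the column types. The inline csv_text is a stand-in for test.txt, and header = FALSE is an assumption based on the attached file having no header row.

```r
# Inline copy of a few rows of test.txt so the sketch runs on its own
csv_text <- "1,*789,LNM*,78,56,89,56,100
2,700,AUW,78,56,89,56,100
4,389,RMN,78,56,89,56,*89"

# Read everything as character; header = FALSE because the first row is data
raw <- read.csv(text = csv_text, header = FALSE, colClasses = "character")

# Column by column: set cells containing a literal star to NA,
# then convert the column back to its natural type (integer, character, ...)
DATA <- raw
DATA[] <- lapply(raw, function(col) {
  col[grepl("*", col, fixed = TRUE)] <- NA
  type.convert(col, as.is = TRUE)
})
```

With a real file, text = csv_text would be replaced by the file name, e.g. read.csv("test.txt", header = FALSE, colClasses = "character").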
2-Second solution:
- first read the file with DATA<-read.csv("test.txt")
- then replace all fields containing a * with NA by applying the following
function to the object DATA:
DATA_cleaned<-apply(DATA,c(1,2),function(x){if(length(grep("\\*",x,value=TRUE))==1){x<-NA}})
DATA_cleaned
X1 X.789 LNM. X78 X56 X89 X56.1 X100
[1,] NULL NULL NULL NULL NULL NULL NULL NULL
[2,] NULL NULL NULL NULL NULL NULL NULL NULL
[3,] NULL NULL NULL NULL NULL NULL NULL NA
[4,] NULL NULL NULL NULL NULL NA NULL NULL
[5,] NULL NULL NULL NULL NA NULL NULL NULL
[6,] NULL NULL NULL NULL NULL NULL NULL NA
[7,] NULL NULL NA NULL NULL NULL NULL NULL
[8,] NULL NULL NULL NULL NULL NULL NULL NA
[9,] NULL NA NULL NULL NULL NULL NULL NULL
The stars have disappeared, but so has everything else!
The problem comes from the fact that if a field does not contain any *, the
command
if(length(grep("\\*",x,value=T))==1) returns NULL instead of FALSE!
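The NULLs are consistent with how if behaves in R: an if without an else branch evaluates to NULL when its condition is FALSE, so the function returns NULL for every cell without a star. A minimal sketch of two possible fixes follows; the small DATA frame and the helper name star_to_na are hypothetical stand-ins, not the real file.

```r
# Small stand-in for DATA as read from test.txt (hypothetical values)
DATA <- data.frame(V1 = c(1, 2, 3),
                   V2 = c("*789", "700", "78*"),
                   V3 = c("LNM*", "AUW", "LNM"),
                   stringsAsFactors = FALSE)

# Fix 1: keep the cell-wise apply(), but add an else branch that returns the cell.
# Note that apply() works on as.matrix(DATA), so the result is a character matrix.
star_to_na <- function(x) if (grepl("*", x, fixed = TRUE)) NA else x
cleaned_matrix <- apply(DATA, c(1, 2), star_to_na)

# Fix 2: vectorised, keeps the data frame. Build a logical mask with one
# column per variable, then assign NA wherever the mask is TRUE.
has_star <- sapply(DATA, function(col) grepl("*", col, fixed = TRUE))
DATA_cleaned <- DATA
DATA_cleaned[has_star] <- NA
```

The second variant leaves columns that never contained a star untouched, whereas the first turns every column into character.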
If you have any ideas, please let me know!
Many thanks,
Jessica
____________________________________
Jessica Gervais
Mail: [EMAIL PROTECTED]
Resource Centre for Environmental Technologies,
Public Research Centre Henri Tudor,
Technoport Schlassgoart,
66 rue de Luxembourg,
P.O. BOX 144,
L-4002 Esch-sur-Alzette, Luxembourg
(See attached file: test.txt)
1,*789,LNM*,78,56,89,56,100
2,700,AUW,78,56,89,56,100
3,400,TOC,78,56,89,56,10
4,389,RMN,78,56,89,56,*89
5,400,LNM,78,56,*452,56,100
6,200,UTC,78,*40,89,56,100
7,100,GAT,78,56,8,56,*100
8,79,*LNM,78,56,9,56,100
9,89,TCG,78,56,800,56,*100
10,78*,LNM,78,56,89,56,100
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.