Thank you very much, Jim,

As usual, your reply was very helpful (although I did not apply it 
directly). My dataset contains many columns, including numerous columns 
that I want to keep as factors. Instead of adding as.is = TRUE at the 
read.table step, I am using the same "philosophy" on a limited range: I 
change the type of my column of interest with as.character, do the 
replacement, and finally change the column back to a factor. It does the 
trick and, this way, I do not have to convert all my other columns back 
to factors.
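
A minimal sketch of these three steps, assuming a small made-up data frame 
in place of the real dataset (the column names and values below are 
hypothetical, and only the first column is touched):

```r
# Hypothetical stand-in for the imported data; names and values are made up.
mydata <- data.frame(
  A = factor(c("a", "b", NA, "c")),   # column of interest, contains an NA
  B = factor(c("x", "y", "y", "x")),  # another factor column, left untouched
  C = c(3L, 4L, NA, 5L)
)

# 1. Convert only the column of interest to character.
mydata[[1]] <- as.character(mydata[[1]])

# 2. Replace the missing values with the new label.
mydata[[1]][is.na(mydata[[1]])] <- "Missing value"

# 3. Convert the column back to a factor; the new label becomes a level.
mydata[[1]] <- factor(mydata[[1]])

str(mydata)
```

Because only one column is converted, the other factor columns keep their 
type and levels throughout.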

Thanks again for your help.

Sebastien

jim holtman wrote:
> The problem is that the first column is probably a factor and you are
> trying to assign a value that is not already a 'level' in the factor.
> One way is to read the data with as.is=TRUE to keep it as character,
> replace the NAs and then convert back to factors if you want to:
>
> > x <- read.csv(textConnection("A,B
> + a,3
> + b,4
> + .,.
> + c,5"), na.strings='.', as.is=TRUE)  # keep as character
> > # replace NAs
> > x[is.na(x[,1]), 1] <- "Missing Value"
> > # convert back to factors if you want to
> > x[[1]] <- factor(x[[1]])
> > str(x)
> 'data.frame':   4 obs. of  2 variables:
>  $ A: Factor w/ 4 levels "a","b","c","Missing Value": 1 2 4 3
>  $ B: int  3 4 NA 5
>
>
> On 8/11/07, Sébastien <[EMAIL PROTECTED]> wrote:
>   
>> Dear R-users,
>>
>> My script imports a dataset from a csv file, in which missing values are
>> represented by ".". The data are imported into a data frame using the
>> read.table function with na.strings = ".". Then I want to replace the
>> NAs in the first column of the data frame with "Missing value". I am using
>> the following code to do so:
>>
>> mydata<-data.frame(read.table(myFile,sep=",",header=TRUE,na.strings="."))
>>   # myFile is the full path of the source file
>>
>> mydata[,1][is.na(mydata[,1])]<-"Missing value"
>>
>> This code works perfectly fine if the first column contains only
>> missing values, i.e. ".". As soon as it contains multiple levels and
>> missing values, things start to go wrong. I get the following warning
>> message and the replacement is not done.
>>
>> Warning message:
>> invalid factor level, NAs generated in: `[<-.factor`(`*tmp*`,
>> is.na(mydata[, 1]), value = "Missing value")
>>
>> Is there an error in my code, or is this a bug (which I doubt)?
>>
>> Thanks in advance.
>>
>> ______________________________________________
>> R-help@stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
