On 25/01/12 09:45, Sam Steingold wrote:
I get this error from read.table():
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
   line 234 did not have 8 elements
The error is genuine (an extra field separator between the 1st and 2nd elements).

1. Is there a way to see this bad line 234 from R without diving into the file?

2. Is there a way to ignore the bad lines and get the data from the good
lines only? (I do want to see the bad lines, but I don't want to stop all
work until some issue that affects 1% of the data is resolved.)

thanks.

Oh, yeah, a reproducible example:

read.csv from
=====
a,b
1,2
3,4
5,,6
7,8
=====
I want to be able to extract the data frame
   a b
1 1 2
2 3 4
3 7 8

and a list of strings of length 1 containing "5,,6".

Try:

xxx <- readLines("<filename>")
# Reading the header line alone gives a zero-row data frame with the right names.
yyy <- read.csv(text=xxx[1])
bad <- character(0)
for(i in seq_along(xxx)[-1]) {
    # Parse one line at a time; text= avoids leaving text connections open.
    tmp <- read.csv(text=xxx[i],header=FALSE)
    if(ncol(tmp)==ncol(yyy)) {
        names(tmp) <- names(yyy)   # rbind() needs matching column names
        yyy <- rbind(yyy,tmp)
    } else bad <- c(bad,xxx[i])    # keep the offending line as a string
}
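
If the file is large, growing a data frame one line at a time in a loop can be slow. A possible alternative, offered as a sketch rather than something tested on your real file, is to let count.fields() (from utils) report the number of fields on each line and then read all the good lines in one go. Here xxx is a stand-in holding the example data from the question; in practice it would come from readLines("<filename>"). The quote, comment.char and blank.lines.skip settings are my assumptions, chosen to mimic the read.csv() defaults and to keep the counts aligned with the elements of xxx.

```r
# Stand-in for readLines("<filename>"): the example data from the question.
xxx <- c("a,b", "1,2", "3,4", "5,,6", "7,8")

# Count the comma-separated fields on every line.
con <- textConnection(xxx)
nf <- count.fields(con, sep = ",", quote = "\"",
                   comment.char = "", blank.lines.skip = FALSE)
close(con)

# A line is "good" if it has as many fields as the header line.
ok <- nf == nf[1]
ok[is.na(ok)] <- FALSE      # count.fields() can return NA for some lines

bad <- xxx[!ok]             # the offending lines, as strings
badlines <- which(!ok)      # their positions in the file
yyy <- read.csv(text = xxx[ok])
```

With this, bad holds the rejected lines for later inspection, badlines says where they sit in the file, and yyy is built from the remaining lines by a single read.csv() call. Fields containing embedded newlines inside quotes would need more care than this sketch gives them.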

HTH

    cheers,

        Rolf Turner

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.