On Wed, 12 Dec 2007, Peter Dalgaard wrote: > Philippe Grosjean wrote: > The problem is often a misspecification of the comment.char argument. > For read.table(), it defaults to '#'. This means that everywhere you > have a '#' char in your Excel sheet, the rest of the line is ignored. > This results in a different number of items per line. > > You should better use read.csv() which provides better default arguments > for your particular problem. > Best, > > Or read.delim/read.delim2, which should be even better at TAB-separated files.
In general, be very suspicious of read.table() with such files, not only because of the '#' but also because it expects columns separated by _arbitrary_ amounts of whitespace. I.e., n TABs counts as one, so empty fields are skipped over. ******* End of other contributions (not sure why my mailer didn't mark them) I would also say be very suspicious of Excel writing .csv files. I found by looking at the .csv file in an editor that for some reason when there were empty fields in the original .xls file that for some records, Excel didn't add in enough commas to make up the correct number of fields. It did for some records but not for others. Excel truly works in misterious ways. read.csv has an argument fill which should fix this problem. In my case I was actually reading the .csv file into mySQL and the solution was to select the whole of the .xls file and format it as text before writing the .csv file. David SCott _________________________________________________________________ David Scott Department of Statistics, Tamaki Campus The University of Auckland, PB 92019 Auckland 1142, NEW ZEALAND Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000 Email: [EMAIL PROTECTED] Graduate Officer, Department of Statistics Director of Consulting, Department of Statistics ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.