[R] strange behavior when reading csv - line wraps

Martin Tomko Fri, 29 May 2009 12:18:44 -0700

Dear All,

I am observing some strange behavior, and searching the archives and help pages didn't help much.

I have a csv with a variable number of fields in each line.


I use
dataPoints <- read.csv(inputFile, header=FALSE, sep=";", fill=TRUE)

to read it in, and it works. However, some lines are long and 'wrap', or split and continue on the next line. So when I check the dim of the frame, the dimensions are not correct, and I can see in a printout that the line is split into two rows in the frame. I checked the input file and all is good.


An example of the input is:
37;2175168475;13;8.522729;47.19537;16366...@n00;30;sculpture;bird;tourism;animal;statue;canon;eos;rebel;schweiz;switzerland;eagle;swiss;adler;skulptur;zug;1750;28;tamron;f28;canton;tourismus;vogel;baar;kanton;xti;tamron1750;1750mm;tamron1750mm;400d;rabbitriotnet;

where the last value ends up on the next line in the data frame.

It does not have to be the last value. In the following example, the word "kempten" starts the next line:

39;167757703;12;10.309295;47.724545;21903...@n00;36;white;building;tower;clock;clouds;germany;bayern;deutschland;bavaria;europa;europe;eagle;adler;eu;wolke;dome;townhall;rathaus;turm;weiss;allemagne;europeanunion;bundesrepublik;gebaeude;glocke;brd;allgau;kuppel;europ;kempten;niemcy;europo;federalrepublic;europaischeunion;europaeischeunion;germanio;

What could be the reason?
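
One possible explanation, going by the read.table documentation: the number of columns is determined from the first five lines of input unless col.names is given, and with fill=TRUE any later line with more fields than that spills onto an extra row. A minimal sketch of checking for this, and of sizing the frame for the widest line, follows. The sample file is made up for illustration and stands in for the real input.

```r
# Hypothetical sample standing in for inputFile:
# five short rows first, then one longer row.
inputFile <- tempfile(fileext = ".csv")
writeLines(c("1;a;b", "2;c;d", "3;e;f", "4;g;h", "5;i;j", "6;k;l;m;n;o"),
           inputFile)

# Count the fields on every line, using the same quote and comment
# settings that read.csv uses by default.
nFields <- count.fields(inputFile, sep = ";", quote = "\"", comment.char = "")
maxFields <- max(nFields, na.rm = TRUE)

# Supplying col.names for the widest line tells the reader how many
# columns to allocate, so long lines are not wrapped.
dataPoints <- read.csv(inputFile, header = FALSE, sep = ";", fill = TRUE,
                       col.names = paste0("V", seq_len(maxFields)))
dim(dataPoints)
```

If table(nFields) shows more than one value and the widest line is not among the first five, this is probably what is happening.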

I was thinking about solving the issue by using a different separator for the first 7 fields and concatenating all of the remaining values into a single string value, but could not figure out how to do such a substitution in R. Unfortunately, on my system I cannot specify a range for sed...
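
A rough sketch of that idea done in R rather than sed, assuming the first 7 fields are always present: read the raw lines, split on ";", keep the first 7 fields as columns, and paste the rest into one string. The sample lines, the "owner" placeholder, the column name "tags" and the helper splitLine are invented for illustration.

```r
# Hypothetical sample lines; in practice: lines <- readLines(inputFile)
lines <- c("37;2175168475;13;8.52;47.19;owner;30;sculpture;bird;tourism;",
           "39;167757703;12;10.30;47.72;owner;36;white;building;")

nFixed <- 7  # number of leading fields to keep as separate columns

# strsplit drops the empty string after a trailing ";".
parts <- strsplit(lines, ";", fixed = TRUE)

# Keep the first nFixed fields and join the remainder with commas.
splitLine <- function(p) {
  fixed <- p[seq_len(nFixed)]
  rest <- if (length(p) > nFixed) p[(nFixed + 1):length(p)] else character(0)
  c(fixed, paste(rest, collapse = ","))
}

dataPoints <- as.data.frame(do.call(rbind, lapply(parts, splitLine)),
                            stringsAsFactors = FALSE)
names(dataPoints) <- c(paste0("V", seq_len(nFixed)), "tags")
```

All columns come back as character here, so the numeric fields would still need as.numeric() afterwards.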


Thanks for any help/pointers
Martin

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
