Dear All,
I am observing a strange behavior and searching the archives and help pages didn't help much.
I have a CSV file with a variable number of fields in each line.

I use
dataPoints <- read.csv(inputFile, header = FALSE, sep = ";", fill = TRUE)

to read it in, and it works. However, some long lines 'wrap': they are split and continue on the next line. When I check the dimensions of the data frame, they are not correct, and a printout shows that such a line is split into two in the frame. I checked the input file, and it looks fine.

An example of the input is:
37;2175168475;13;8.522729;47.19537;16366...@n00;30;sculpture;bird;tourism;animal;statue;canon;eos;rebel;schweiz;switzerland;eagle;swiss;adler;skulptur;zug;1750;28;tamron;f28;canton;tourismus;vogel;baar;kanton;xti;tamron1750;1750mm;tamron1750mm;400d;rabbitriotnet;

where the last value appears on the next line in the data frame.

It does not have to be the last value. In the following example, the word "kempten" starts the next line:
39;167757703;12;10.309295;47.724545;21903...@n00;36;white;building;tower;clock;clouds;germany;bayern;deutschland;bavaria;europa;europe;eagle;adler;eu;wolke;dome;townhall;rathaus;turm;weiss;allemagne;europeanunion;bundesrepublik;gebaeude;glocke;brd;allgau;kuppel;europ;kempten;niemcy;europo;federalrepublic;europaischeunion;europaeischeunion;germanio;

What could be the reason?
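
One thing that may be worth checking: as far as I can tell from the read.table help page, the number of columns is determined from the first five lines of input unless col.names is supplied, so a longer line further down could be wrapped onto an extra row. Below is a minimal sketch of that check. The sample lines and the temporary file are made up for illustration and only stand in for the real input file.

```r
# Hypothetical sample data: the first five lines have at most 3 fields,
# the sixth line has more.
sampleLines <- c("1;a;b", "2;a", "3;a;b;c", "4;a", "5;a;b", "6;a;b;c;d;e")
inputFile <- tempfile(fileext = ".csv")
writeLines(sampleLines, inputFile)

# Count the fields on every line, using the same quoting and comment
# settings that read.csv uses by default.
nFields <- count.fields(inputFile, sep = ";", quote = "\"", comment.char = "")
maxFields <- max(nFields)

# Supplying that many column names up front lets read.csv allocate
# enough columns for the longest line.
dataPoints <- read.csv(inputFile, header = FALSE, sep = ";", fill = TRUE,
                       col.names = paste0("V", seq_len(maxFields)))

dim(dataPoints)
```

If the number of rows then matches the number of lines in the file, the column count guessed from the first lines was probably the cause.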

I was thinking about solving the issue by keeping the separator only for the first 7 fields and concatenating all of the remaining values into a single string value, but could not figure out how to do such a substitution in R. Unfortunately, on my system I cannot specify a range for sed...
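
A rough sketch of what I mean by that substitution, done with base R only. The two sample lines, the names nFixed and tags, and the space used to join the remaining values are my own assumptions for illustration. With the real file, rawLines would come from readLines(inputFile).

```r
# Hypothetical sample lines in the same layout as the real file
# (the sixth field is a made-up placeholder).
rawLines <- c(
  "37;2175168475;13;8.522729;47.19537;user1;30;sculpture;bird;tourism;",
  "39;167757703;12;10.309295;47.724545;user2;36;white;building;"
)

nFixed <- 7  # number of leading fields to keep as separate columns

# Split every line on the literal separator.
parts <- strsplit(rawLines, ";", fixed = TRUE)

# First nFixed fields of each line, one row per line (padded with NA
# if a line is shorter than nFixed).
fixed <- t(vapply(parts, function(p) p[seq_len(nFixed)], character(nFixed)))

# Everything after the first nFixed fields, joined into one string.
tags <- vapply(parts,
               function(p) paste(p[-seq_len(nFixed)], collapse = " "),
               character(1))

dataPoints <- data.frame(fixed, tags = tags, stringsAsFactors = FALSE)
```

All columns come out as character here, so the numeric ones would still need converting, for example with as.numeric().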

Thanks for any help/pointers
Martin
