Dear All,
I am observing some strange behavior, and searching the archives and
help pages didn't help much.
I have a CSV file with a variable number of fields in each line.
I use
dataPoints <- read.csv(inputFile, header = FALSE, sep = ";", fill = TRUE)
to read it in, and it works. However, some long lines 'wrap': they are
split and continue on the next row of the data frame. As a result, the
dimensions of the frame are not correct, and a printout shows that such
a line has been split into two rows. I checked the input file and it
looks fine there.
An example of the input is:
37;2175168475;13;8.522729;47.19537;16366...@n00;30;sculpture;bird;tourism;animal;statue;canon;eos;rebel;schweiz;switzerland;eagle;swiss;adler;skulptur;zug;1750;28;tamron;f28;canton;tourismus;vogel;baar;kanton;xti;tamron1750;1750mm;tamron1750mm;400d;rabbitriotnet;
where the last value ends up on the next row of the data frame.
It does not have to be the last value: in the following example, the
word "kempten" starts the next row:
39;167757703;12;10.309295;47.724545;21903...@n00;36;white;building;tower;clock;clouds;germany;bayern;deutschland;bavaria;europa;europe;eagle;adler;eu;wolke;dome;townhall;rathaus;turm;weiss;allemagne;europeanunion;bundesrepublik;gebaeude;glocke;brd;allgau;kuppel;europ;kempten;niemcy;europo;federalrepublic;europaischeunion;europaeischeunion;germanio;
What could be the reason?
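One possible explanation, offered as an assumption and not a confirmed
diagnosis: according to the read.table help page, the number of columns
is determined from the first five lines of input unless col.names is
supplied, so with fill = TRUE a later line with more fields can spill
over into an extra row. A minimal sketch for checking this on a small
made-up file (the sample data and the names tmp, wrapped, nFields and
fixed are hypothetical, not taken from my real input):

```r
# Hypothetical sample: five short lines followed by one longer line.
tmp <- tempfile(fileext = ".csv")
writeLines(c(rep("1;2;3", 5), "1;2;3;4;5"), tmp)

# Default call: the column count is guessed from the first five lines only.
wrapped <- read.csv(tmp, header = FALSE, sep = ";", fill = TRUE)
print(dim(wrapped))

# Count the fields on every line, then supply enough column names so that
# read.csv allocates a column for the widest line.
nFields <- count.fields(tmp, sep = ";")
fixed <- read.csv(tmp, header = FALSE, sep = ";", fill = TRUE,
                  col.names = paste0("V", seq_len(max(nFields))))
print(dim(fixed))
```

If the two dim() results differ, the column guess is the likely culprit.
Note that count.fields treats quote characters specially by default, so
its quote argument may need adjusting for data containing apostrophes.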
I was thinking about solving the issue by using a different separator
for the first 7 fields only and concatenating all of the remaining
values into a single string value, but I could not figure out how to do
such a substitution in R. Unfortunately, on my system I cannot specify
a range for sed...
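The idea above could perhaps be done entirely in R, without sed. This is
only a rough sketch under my own assumptions: the helper splitRecord,
the column name "tags" and the sample lines are all hypothetical, and
the remaining values are joined with a space.

```r
# Hypothetical helper: keep the first nFixed fields and paste the rest
# into one string.
splitRecord <- function(line, nFixed = 7, sep = ";") {
  fields <- strsplit(line, sep, fixed = TRUE)[[1]]
  fixedPart <- fields[seq_len(min(nFixed, length(fields)))]
  length(fixedPart) <- nFixed  # pad records shorter than nFixed with NA
  rest <- if (length(fields) > nFixed) fields[-seq_len(nFixed)] else character(0)
  c(fixedPart, paste(rest, collapse = " "))
}

# Made-up sample lines; for a real file, readLines(inputFile) would be used.
lines <- c("1;a;b;c;d;e;f;tag1;tag2;tag3;",
           "2;g;h;i;j;k;l;tag4;")

# One row per input line: 7 fixed columns plus one combined string column.
parsed <- do.call(rbind, lapply(lines, splitRecord))
dataPoints <- as.data.frame(parsed, stringsAsFactors = FALSE)
names(dataPoints) <- c(paste0("V", 1:7), "tags")
print(dataPoints)
```

All columns come back as character here, so the numeric ones would still
need converting, for example with as.numeric().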
Thanks for any help/pointers
Martin
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help