Hi Ingo,

Sorry for being so slow to get back to you. I've had a bit of a problem with my internet connection.

Just how large is the data set? You might want to have a look at this thread re size of R data files:
http://r.789695.n4.nabble.com/Boundaries-of-R-td3312593.html

In any case, from a bit more poking around in the file it looks like the last column, Column LN in Calc, is the problem. Its only value is a NaN in row 57. If I remove it I can read in the rest of the file. In fact, simply changing it to 0 (zero) makes the file readable.

I've had a look at it in Calc and in jEdit but cannot see anything suspicious there. I suspect there must be something funny in there, since Row 32 also ends with NaN and seems to be reading in properly.

BTW what are the NaNs doing there?

--- On Fri, 2/18/11, Ingo Reinhold <in...@kth.se> wrote:

> From: Ingo Reinhold <in...@kth.se>
> Subject: RE: [R] Variable length datafile import problem
> To: "John Kane" <jrkrid...@yahoo.ca>, "r-help@r-project.org" <r-help@r-project.org>
> Received: Friday, February 18, 2011, 3:16 AM
>
> Hi John,
>
> seems there is no easy way. I'll just precondition it with
> AWK as described here
> http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg53401.html
>
> There are some remarks in the thread that R is not supposed
> to read too large files for "political" reasons. Maybe
> that's it.
>
> Many thanks again for the effort.
>
> Ingo
> ________________________________________
> From: John Kane [jrkrid...@yahoo.ca]
> Sent: Thursday, February 17, 2011 11:54 AM
> To: Ingo Reinhold
> Subject: RE: [R] Variable length datafile import problem
>
> Generally most of the gurus are in this list.
> Hopefully someone will take an interest in the problem.
>
> I suspect that there may be some kind of weird value in the
> file that is upsetting the import. Given the results I
> got when I removed the data past BD and then at AL it seems
> that the problem might be within this range.
>
> You could try removing half the data between those columns
> and see what happens, then repeat if something turns up.
> It's tedious but unless someone with a better grasp of
> variable length data import can help it's the best I can
> suggest.
>
> BTW you only replied to me. You should make sure to
> cc the list otherwise readers won't realise that I am being
> of no help.
>
> If you still have the problem by Saturday e-mail me or post
> to the list and I'll try to spend some more time messing
> about with the problem.
>
> Sorry to be of so little help.
>
> --- On Thu, 2/17/11, Ingo Reinhold <in...@kth.se> wrote:
>
> > From: Ingo Reinhold <in...@kth.se>
> > Subject: RE: [R] Variable length datafile import problem
> > To: "John Kane" <jrkrid...@yahoo.ca>
> > Received: Thursday, February 17, 2011, 5:36 AM
> >
> > Hi John,
> >
> > as it seems we're hitting the wall here, can you maybe
> > recommend another mailing list with "gurus" (as you put it)
> > that may be able to help?
> >
> > Regards,
> >
> > Ingo
> > ________________________________________
> > From: John Kane [jrkrid...@yahoo.ca]
> > Sent: Thursday, February 17, 2011 11:25 AM
> > To: Ingo Reinhold
> > Subject: RE: [R] Variable length datafile import problem
> >
> > Hi Ingo,
> >
> > I've had a bit of time to examine the file and I must say
> > that, at the moment, I have no idea what is going on.
> > I tried the old cut-the-file-into-pieces trick but just came
> > up with even more anomalous results.
> >
> > My first attempt removed all the data past column AL in an
> > OOo Calc spreadsheet. This created a rectangular
> > dataset. It imported into R with no problem with 38 columns
> > as expected.
> >
> > Then I deleted data from the original data file
> > (test.dat), removing all the data past column BD in an OOo
> > Calc spreadsheet.
> >
> > This imported a file with only 38 columns.
> >
> > Something very funny is happening but at the moment I have
> > no
> >
> > --- On Wed, 2/16/11, Ingo Reinhold <in...@kth.se> wrote:
> >
> > > From: Ingo Reinhold <in...@kth.se>
> > > Subject: RE: [R] Variable length datafile import problem
> > > To: "John Kane" <jrkrid...@yahoo.ca>
> > > Received: Wednesday, February 16, 2011, 1:59 AM
> > >
> > > Hi John,
> > >
> > > V1 should be just a character. However I figured something
> > > out myself. The import looks OK in terms of columns when
> > > adding the flush=TRUE option.
> > >
> > > I am still very confused about the dimensions that the
> > > imported data shows. Loading my data file into something
> > > like OOspreadsheet shows me a maximum of about 245, which
> > > does not correspond to the 146 generated by R. Any idea
> > > where this saturation comes from?
> > >
> > > Thanks,
> > >
> > > Ingo
> > > ________________________________________
> > > From: John Kane [jrkrid...@yahoo.ca]
> > > Sent: Wednesday, February 16, 2011 1:57 AM
> > > To: Ingo Reinhold
> > > Subject: RE: [R] Variable length datafile import problem
> > >
> > > Is rawData$V1 intended to be factor or character?
> > >
> > > str(rawData) gives
> > > $ V1 : Factor w/ 54 levels "-232.0","-234.0",..: 41 41 41 41 41 41 41 41 41 41 ...
> > >
> > > If you were not expecting a factor you might try
> > > options(stringsAsFactors = FALSE) before importing the
> > > data.
> > >
> > > --- On Tue, 2/15/11, Ingo Reinhold <in...@kth.se> wrote:
> > >
> > > > From: Ingo Reinhold <in...@kth.se>
> > > > Subject: RE: [R] Variable length datafile import problem
> > > > To: "John Kane" <jrkrid...@yahoo.ca>
> > > > Received: Tuesday, February 15, 2011, 3:35 PM
> > > >
> > > > Dear all,
> > > >
> > > > I have changed the file-ending with no change in the
> > > > result. I don't think that this should matter.
> > > >
> > > > http://dl.dropbox.com/u/2414056/Test.dat
> > > > is a test file which represents the structure I am trying
> > > > to read. So far I have used
> > > >
> > > > rawData=read.table("Test.txt", fill=TRUE, sep="\t", header=FALSE);
> > > >
> > > > When then looking at rawData$V1 this gives me a distorted
> > > > view of my original first column.
> > > >
> > > > Thanks,
> > > >
> > > > Ingo

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
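
A possible explanation for the column-count mismatch discussed in this thread, offered as a sketch rather than a confirmed diagnosis of Test.dat: unless col.names is supplied, read.table() decides how many columns to create from the first five lines of input, so with fill=TRUE a longer row further down the file may not get the columns it needs. Counting the fields per line with count.fields() and passing explicit col.names is the workaround described on the read.table help page. The small file below is an invented stand-in for the real data, and n_fields and max_cols are only illustrative names.

```r
# Invented stand-in for Test.dat: tab-separated rows of unequal length,
# with the longest row appearing after the first five lines.
path <- tempfile(fileext = ".dat")
rows <- c("a\t1\t2", "b\t1", "c\t1\t2", "d\t1", "e\t1\t2",
          "f\t1\t2\t3\t4\tNaN")
writeLines(rows, path)

# Count the fields on every line to find the true maximum width.
n_fields <- count.fields(path, sep = "\t")
max_cols <- max(n_fields)

# Supplying col.names makes read.table allocate max_cols columns
# instead of guessing the width from the first five lines.
rawData <- read.table(path, sep = "\t", header = FALSE, fill = TRUE,
                      col.names = paste0("V", seq_len(max_cols)),
                      stringsAsFactors = FALSE)

# One row per input line, one column per field of the longest line.
dim(rawData)
```

If something like this applies to the real file, table(n_fields) would also show which row lengths occur and how often, which may be quicker than cutting the spreadsheet in half by hand.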