Re: [R] Dropping a digit with scan() on a connection
Thank you, Dr. Ripley and Christoph Buser, for your explanations and help.

Using sep = " " within scan worked within lines of my file, but then I gained an NA record when wrapping from one line to the next (because the linebreak character is no longer recognized as a sep?). So, I'll continue by ensuring each group I read ends at the end of a line (as scan was designed), and by using scan without the sep option.

FYI, here's how the NA showed up; each line is 800 numbers long:

> test4 <- scan(cn.test, n=1600, sep = " ")
> test5 <- scan(cn.test, n=1600)
> test4[797:803]
[1] 81.0 81.08746 81.89484 82.0 NA 580.09030 576.90300
> test5[797:803]
[1] 81.01944 81.62060 81.96495 82.0 82.0 567.91840 563.10470

Thanks again.
Tim

>>> Prof Brian Ripley <[EMAIL PROTECTED]> 01/19/05 03:42AM >>>
This is because scan() has a private pushback. Either:

1) Read the file a whole line at a time: I cannot see why you need to do so here nor in your sketched application.

or

2) Use an explicit separator, e.g. " " in your example.

scan() is not designed to read parts of lines of a file,

On Tue, 18 Jan 2005, Tim Howard wrote:

> R gurus,
>
> My use of scan() seems to be dropping the first digit of sequential
> scans on a connection. It looks like it happens only within a line:
>
> > cat("TITLE extra line", "235 335 535 735", "115 135 175",
> file="ex.data", sep="\n")
> > cn.x <- file("ex.data", open="r")
> > a <- scan(cn.x, skip=1, n=2)
> Read 2 items
> > a
> [1] 235 335
> > b <- scan(cn.x, n=2)
> Read 2 items
> > b
> [1] 35 735
> > c <- scan(cn.x, n=2)
> Read 2 items
> > c
> [1] 115 135
> > d <- scan(cn.x, n=1)
> Read 1 items
> > d
> [1] 75
> >
>
> Note in b, I should get 535, not 35 as the first value. In d, I should
> get 175. Does anyone know how to get these digits?
>
> The reason I'm not scanning the entire file at once is that my real
> dataset is much larger than a Gig and I'll need to pull only portions of
> the file in at once. I got readLines to work, but then I have to figure
> out how to convert each entire line into a data.frame. Scan seems a lot
> cleaner, with the exception of the funny character dropping issue.
>
> Thanks so much!
> Tim Howard

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
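
For readers following the thread: the approach Tim settles on above (no sep, and every read ending at a line boundary) might be sketched as follows. This is only an illustration, not code from the thread. It assumes scan()'s nlines argument is used to stop at the end of a line, and the temporary file and the names path, cn, first and second are made up for the example.

```r
# Hypothetical demo file mirroring the example data used in the thread.
path <- tempfile(fileext = ".data")
cat("TITLE extra line", "235 335 535 735", "115 135 175",
    file = path, sep = "\n")

cn <- file(path, open = "r")

# Skip the title line, then read exactly one complete line of numbers.
first <- scan(cn, skip = 1, nlines = 1, quiet = TRUE)

# The next call resumes at the start of the following line, so no read
# ever stops in the middle of a line.
second <- scan(cn, nlines = 1, quiet = TRUE)

close(cn)

print(first)
print(second)
```

Because each call consumes whole lines, a larger nlines value would pull in a bigger block per call while keeping every read aligned to line boundaries.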
Re: [R] Dropping a digit with scan() on a connection
This is because scan() has a private pushback. Either:

1) Read the file a whole line at a time: I cannot see why you need to do so here nor in your sketched application.

or

2) Use an explicit separator, e.g. " " in your example.

scan() is not designed to read parts of lines of a file,

On Tue, 18 Jan 2005, Tim Howard wrote:

> R gurus,
>
> My use of scan() seems to be dropping the first digit of sequential
> scans on a connection. It looks like it happens only within a line:
>
> > cat("TITLE extra line", "235 335 535 735", "115 135 175",
> file="ex.data", sep="\n")
> > cn.x <- file("ex.data", open="r")
> > a <- scan(cn.x, skip=1, n=2)
> Read 2 items
> > a
> [1] 235 335
> > b <- scan(cn.x, n=2)
> Read 2 items
> > b
> [1] 35 735
> > c <- scan(cn.x, n=2)
> Read 2 items
> > c
> [1] 115 135
> > d <- scan(cn.x, n=1)
> Read 1 items
> > d
> [1] 75
>
> Note in b, I should get 535, not 35 as the first value. In d, I should
> get 175. Does anyone know how to get these digits?
>
> The reason I'm not scanning the entire file at once is that my real
> dataset is much larger than a Gig and I'll need to pull only portions of
> the file in at once. I got readLines to work, but then I have to figure
> out how to convert each entire line into a data.frame. Scan seems a lot
> cleaner, with the exception of the funny character dropping issue.
>
> Thanks so much!
> Tim Howard

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
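
As an aside on option 1 and on the readLines difficulty mentioned in the quoted question, one possible way to turn a chunk of whole lines into a data.frame is sketched below. It is an assumption-laden illustration rather than anything proposed in the thread: it relies on read.table()'s text argument, which requires a reasonably recent R, and it assumes every data line has the same number of fields. The file, the fourth value 195 on the last line, and the names path, cn, chunk and df are hypothetical.

```r
# Hypothetical demo file: a title line followed by equal-width numeric rows.
# (The value 195 is invented so that both rows have four fields.)
path <- tempfile(fileext = ".data")
cat("TITLE extra line", "235 335 535 735", "115 135 175 195",
    file = path, sep = "\n")

cn <- file(path, open = "r")
invisible(readLines(cn, n = 1))   # discard the title line
chunk <- readLines(cn, n = 2)     # read the next two complete lines as text
close(cn)

# Parse the text chunk: each line becomes a row, each field a column
# (columns are named V1, V2, ... by default).
df <- read.table(text = chunk, header = FALSE)
print(df)
```

Calling readLines() repeatedly on the open connection with a larger n would step through a big file one block of lines at a time, with each block parsed the same way.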
Re: [R] Dropping a digit with scan() on a connection
Dear Tim

You can use

cat("TITLE extra line", "235 335 535 735", "115 135 175", file="ex.data", sep="\n")
cn.x <- file("ex.data", open="r")
a <- scan(cn.x, skip=1, n=2, sep = " ")
> Read 2 items
a
> [1] 235 335
b <- scan(cn.x, n=2, sep = " ")
> Read 2 items
b
> [1] 535 735
c <- scan(cn.x, n=2, sep = " ")
> Read 2 items
c
> [1] 115 135
d <- scan(cn.x, n=1, sep = " ")
> Read 1 items
d
> [1] 175

Regards,

Christoph Buser

--
Christoph Buser <[EMAIL PROTECTED]>
Seminar fuer Statistik, LEO C11
ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND
phone: x-41-1-632-5414 fax: 632-1228
http://stat.ethz.ch/~buser/

Tim Howard writes:
> R gurus,
>
> My use of scan() seems to be dropping the first digit of sequential
> scans on a connection. It looks like it happens only within a line:
>
> > cat("TITLE extra line", "235 335 535 735", "115 135 175",
> file="ex.data", sep="\n")
> > cn.x <- file("ex.data", open="r")
> > a <- scan(cn.x, skip=1, n=2)
> Read 2 items
> > a
> [1] 235 335
> > b <- scan(cn.x, n=2)
> Read 2 items
> > b
> [1] 35 735
> > c <- scan(cn.x, n=2)
> Read 2 items
> > c
> [1] 115 135
> > d <- scan(cn.x, n=1)
> Read 1 items
> > d
> [1] 75
>
> Note in b, I should get 535, not 35 as the first value. In d, I should
> get 175. Does anyone know how to get these digits?
>
> The reason I'm not scanning the entire file at once is that my real
> dataset is much larger than a Gig and I'll need to pull only portions of
> the file in at once. I got readLines to work, but then I have to figure
> out how to convert each entire line into a data.frame. Scan seems a lot
> cleaner, with the exception of the funny character dropping issue.
>
> Thanks so much!
> Tim Howard