RE: [R] Using files as connections
I nearly forgot to thank Andy Liaw and Tony Plate for their help with
this problem. BTW Andy's method does run faster than the natural fix-up
of my original code.

Murray Jorgensen

> You are using the connection the wrong way. You need to do something
> like:
>
> fcon <- file("c:/data/perry/data.csv", open="r")
> for (iline in 1:slines) {
>   isel <- isel + 1
>   cline <- readLines(fcon, n=1)
>   ...
> }
> close(fcon)
>
> BTW, here's how I'd do it (not tested!):
>
> strvec <- rep("",slines)
> selected <- sort(sample(flines, slines))
> ## number of lines to discard before each selected line
> skip <- c(selected[1] - 1, diff(selected) - 1)
> fcon <- file("c:/data/perry/data.csv", open="r")
> for (i in 1:length(skip)) {
>   ## skip to the selected line
>   readLines(fcon, n=skip[i])
>   strvec[i] <- readLines(fcon, n=1)
> }
> close(fcon)
>
> HTH,
> Andy
>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
>> Sent: Wednesday, August 27, 2003 7:19 PM
>> To: [EMAIL PROTECTED]
>> Subject: [R] Using files as connections
>>
>> I have been trying to read a random sample of lines from a
>> file into a data frame using readLines(). The help indicates
>> that readLines() will start from the current line if the
>> connection is open, but presented with a closed connection it
>> will open it, start from the beginning, and close it when finished.
>>
>> In the code that follows I tried to open the file before
>> reading but apparently without success, because the result
>> was repeated copies of the first line:
>>
>> flines <- 107165
>> slines <- 100
>> selected <- sort(sample(flines,slines))
>> strvec <- rep("",slines)
>> file("c:/data/perry/data.csv",open="r")
>> isel <- 0
>> for (iline in 1:slines) {
>>   isel <- isel + 1
>>   cline <- readLines("c:/data/perry/data.csv",n=1)
>>   if (iline == selected[isel]) strvec[isel] <- cline else
>>     isel <- isel - 1
>> }
>> close("c:/data/perry/data.csv")
>> sel.flows <- read.table(textConnection(strvec), header=FALSE, sep=",")
>>
>> There was also an error "no applicable method" for close.
>>
>> Comments gratefully received.
>>
>> Murray Jorgensen

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] Using files as connections
You are using the connection the wrong way. You need to do something
like:

fcon <- file("c:/data/perry/data.csv", open="r")
for (iline in 1:slines) {
  isel <- isel + 1
  cline <- readLines(fcon, n=1)
  ...
}
close(fcon)

BTW, here's how I'd do it (not tested!):

strvec <- rep("",slines)
selected <- sort(sample(flines, slines))
## number of lines to discard before each selected line
skip <- c(selected[1] - 1, diff(selected) - 1)
fcon <- file("c:/data/perry/data.csv", open="r")
for (i in 1:length(skip)) {
  ## skip to the selected line
  readLines(fcon, n=skip[i])
  strvec[i] <- readLines(fcon, n=1)
}
close(fcon)

HTH,
Andy

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, August 27, 2003 7:19 PM
> To: [EMAIL PROTECTED]
> Subject: [R] Using files as connections
>
> I have been trying to read a random sample of lines from a
> file into a data frame using readLines(). The help indicates
> that readLines() will start from the current line if the
> connection is open, but presented with a closed connection it
> will open it, start from the beginning, and close it when finished.
>
> In the code that follows I tried to open the file before
> reading but apparently without success, because the result
> was repeated copies of the first line:
>
> flines <- 107165
> slines <- 100
> selected <- sort(sample(flines,slines))
> strvec <- rep("",slines)
> file("c:/data/perry/data.csv",open="r")
> isel <- 0
> for (iline in 1:slines) {
>   isel <- isel + 1
>   cline <- readLines("c:/data/perry/data.csv",n=1)
>   if (iline == selected[isel]) strvec[isel] <- cline else
>     isel <- isel - 1
> }
> close("c:/data/perry/data.csv")
> sel.flows <- read.table(textConnection(strvec), header=FALSE, sep=",")
>
> There was also an error "no applicable method" for close.
>
> Comments gratefully received.
>
> Murray Jorgensen
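For anyone who wants to try the skip-based approach on a small file before pointing it at real data, here is a minimal self-contained sketch. The temporary file, its 20 numbered lines, the sample size and the helper name `sample_lines` are all made up for illustration and do not come from the posts above. The detail worth checking is the first skip count: the number of lines to discard before the first selected line is `selected[1] - 1`.

```r
## Hypothetical helper (name and arguments are illustrative only):
## read the lines numbered `selected` (sorted, 1-based) from `path`
## through a single open connection.
sample_lines <- function(path, selected) {
  ## number of lines to discard before each selected line
  skip <- c(selected[1] - 1, diff(selected) - 1)
  out <- character(length(selected))
  fcon <- file(path, open = "r")
  on.exit(close(fcon))
  for (i in seq_along(skip)) {
    if (skip[i] > 0) readLines(fcon, n = skip[i])  # discard unwanted lines
    out[i] <- readLines(fcon, n = 1)               # keep the selected line
  }
  out
}

## Small made-up test file: line k contains "k,<k squared>"
tmp <- tempfile(fileext = ".csv")
writeLines(paste(1:20, (1:20)^2, sep = ","), tmp)

flines <- 20
slines <- 5
selected <- sort(sample(flines, slines))
strvec <- sample_lines(tmp, selected)

## Parse the sampled lines as in the original post
tc <- textConnection(strvec)
sel.flows <- read.table(tc, header = FALSE, sep = ",")
close(tc)
unlink(tmp)
```

Because each test line carries its own line number in the first column, comparing `sel.flows$V1` with `selected` shows at once whether the right lines were picked up.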
Re: [R] Using files as connections
You need to save the connection object returned by file() and then use
that object in other functions. You need to change the appropriate
lines to the following (at least):

con <- file("c:/data/perry/data.csv",open="r")
cline <- readLines(con,n=1)
close(con)

(I don't know if more changes are needed to get it working.)

Note that using the connection object in other functions can have side
effects on the connection object (which is how a connection "remembers"
its point in the file.) (Perhaps more accurately, the side effect is on
the internal system data referred to by the R connection object.)

> con <- textConnection(letters)
> con
     description            class             mode             text
       "letters" "textConnection"              "r"           "text"
          opened         can read        can write
        "opened"            "yes"             "no"
> readLines(con, 1)
[1] "a"
> readLines(con, 1)
[1] "b"
> con.saved <- con
> readLines(con, 1)
[1] "c"
> readLines(con.saved, 1)
[1] "d"
> readLines(con, 1)
[1] "e"
> identical(con, con.saved)
[1] TRUE
> showConnections()
  description class            mode text   isopen   can read can write
3 "letters"   "textConnection" "r"  "text" "opened" "yes"    "no"
>

hope this helps,

Tony Plate

At Thursday 11:19 AM 8/28/2003 +1200, you wrote:

I have been trying to read a random sample of lines from a file into a
data frame using readLines(). The help indicates that readLines() will
start from the current line if the connection is open, but presented
with a closed connection it will open it, start from the beginning, and
close it when finished.

In the code that follows I tried to open the file before reading but
apparently without success, because the result was repeated copies of
the first line:

flines <- 107165
slines <- 100
selected <- sort(sample(flines,slines))
strvec <- rep("",slines)
file("c:/data/perry/data.csv",open="r")
isel <- 0
for (iline in 1:slines) {
  isel <- isel + 1
  cline <- readLines("c:/data/perry/data.csv",n=1)
  if (iline == selected[isel]) strvec[isel] <- cline else
    isel <- isel - 1
}
close("c:/data/perry/data.csv")
sel.flows <- read.table(textConnection(strvec), header=FALSE, sep=",")

There was also an error "no applicable method" for close.

Comments gratefully received.

Murray Jorgensen
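The difference between handing readLines() a file name and handing it a saved connection object can be checked on a small temporary file. This is only a minimal sketch; the file and its three lines are made up for illustration. Given a character string, readLines() opens the file, reads from the start and closes it again on every call; given an open connection, each call continues where the previous one stopped.

```r
## Made-up three-line file for illustration
tmp <- tempfile()
writeLines(c("first", "second", "third"), tmp)

## File name: each call opens the file afresh, so the first line repeats
a1 <- readLines(tmp, n = 1)
a2 <- readLines(tmp, n = 1)

## Saved connection object: the read position is kept between calls
con <- file(tmp, open = "r")
b1 <- readLines(con, n = 1)
b2 <- readLines(con, n = 1)
close(con)   # close() wants the connection object, not the file name

unlink(tmp)
```

Here `a1` and `a2` both hold the first line, while `b1` and `b2` hold the first and second lines, which matches the "repeated copies of the first line" symptom in the question.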