[R] Using files as connections

2003-08-27 Thread maj
I have been trying to read a random sample of lines from a file into a
data frame using readLines(). The help indicates that readLines() will
start from the current line if the connection is open, but presented with
a closed connection it will open it, start from the beginning, and close
it when finished.

In the code that follows I tried to open the file before reading but
apparently without success, because the result was repeated copies of the
first line:

flines <- 107165
slines <- 100
selected <- sort(sample(flines,slines))
strvec <- rep("",slines)
file("c:/data/perry/data.csv",open="r")
isel <- 0
for (iline in 1:slines) {
  isel <- isel + 1
  cline <- readLines("c:/data/perry/data.csv",n=1)
  if (iline == selected[isel]) strvec[isel] <- cline else
isel <- isel - 1
}
close("c:/data/perry/data.csv")
sel.flows <- read.table(textConnection(strvec), header=FALSE, sep=",")


There was also an error "no applicable method"  for close.

Comments gratefully received.

Murray Jorgensen

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] Using files as connections

2003-08-28 Thread Tony Plate
You need to save the connection object returned by file() and then use that 
object in other functions.

You need to change the appropriate lines to the following (at least):

con <- file("c:/data/perry/data.csv",open="r")
  cline <- readLines(con,n=1)
close(con)
(I don't know if more changes are needed to get it working.)
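
To see why the original loop kept returning the first line, here is a small sketch. It is only an illustration: the temporary file and the names path, by_name, by_con and close_err are made up for the example and are not part of the original code.

```r
# Hypothetical stand-in for the real data file: a temp file with five lines.
path <- tempfile(fileext = ".csv")
writeLines(paste0("row", 1:5), path)

# Passing the file name: each call opens the file, reads from the top,
# and closes it again, so the position is never remembered.
by_name <- c(readLines(path, n = 1), readLines(path, n = 1))

# Passing an open connection: each call continues where the last one stopped.
con <- file(path, open = "r")
by_con <- c(readLines(con, n = 1), readLines(con, n = 1))
close(con)

# close() is a generic function with no method for a character string,
# which is where the "no applicable method" error comes from.
close_err <- tryCatch(close(path), error = function(e) e)

print(by_name)   # the first line twice
print(by_con)    # the first line, then the second
unlink(path)
```

The bare file(..., open="r") call in the original code did open a connection, but since its result was not assigned, nothing later in the script could refer to it.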

Note that using the connection object in other functions can have side
effects on the connection object (which is how a connection "remembers" its
position in the file). Perhaps more accurately, the side effect is on the
internal system data that the R connection object refers to.

> con <- textConnection(letters)
> con
     description            class             mode             text
       "letters" "textConnection"              "r"           "text"
          opened         can read        can write
        "opened"            "yes"             "no"
> readLines(con, 1)
[1] "a"
> readLines(con, 1)
[1] "b"
> con.saved <- con
> readLines(con, 1)
[1] "c"
> readLines(con.saved, 1)
[1] "d"
> readLines(con, 1)
[1] "e"
> identical(con, con.saved)
[1] TRUE
> showConnections()
  description class            mode text   isopen   can read can write
3 "letters"   "textConnection" "r"  "text" "opened" "yes"    "no"

hope this helps,

Tony Plate


RE: [R] Using files as connections

2003-08-28 Thread Liaw, Andy
You are using the connection the wrong way.  You need to do something like:

fcon <- file("c:/data/perry/data.csv", open="r")
for (iline in 1:slines) {
  isel <- isel + 1
  cline <- readLines(fcon, n=1)
  ...
}
close(fcon)

BTW, here's how I'd do it (not tested!):

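strvec <- rep("",slines)
selected <- sort(sample(flines, slines))
## number of lines to discard before each selected line, counted from
## the start of the file so that the first read lands on selected[1]
skip <- diff(c(0, selected)) - 1
fcon <- file("c:/data/perry/data.csv", open="r")
for (i in seq_along(skip)) {
  ## skip to the selected line (n=0 reads nothing)
  readLines(fcon, n=skip[i])
  strvec[i] <- readLines(fcon, n=1)
}
close(fcon)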
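
For anyone who wants to try the skipping idea without the real file, the following is a self-contained sketch under stated assumptions: the temporary file, its contents, the small values of flines and slines, and the use of read.table(text = ...) in place of textConnection() are all choices made for the illustration. Here the skip counts are measured from the start of the file, so the first draw lands on selected[1].

```r
set.seed(1)                          # only to make the sketch reproducible
flines <- 1000                       # hypothetical file length
slines <- 10                         # sample size
path <- tempfile(fileext = ".csv")   # hypothetical stand-in for the data file
writeLines(paste(1:flines, "x", sep = ","), path)

selected <- sort(sample(flines, slines))
# Lines to discard before each selected line, counted from the last line read.
skip <- diff(c(0, selected)) - 1

strvec <- character(slines)
fcon <- file(path, open = "r")
for (i in seq_along(skip)) {
  readLines(fcon, n = skip[i])       # n = 0 reads nothing
  strvec[i] <- readLines(fcon, n = 1)
}
close(fcon)

# Parse the sampled lines; the first column holds the original line numbers.
sel.flows <- read.table(text = strvec, header = FALSE, sep = ",")
unlink(path)
```

Because the test file stores each line's own number in its first column, comparing sel.flows$V1 with selected is a quick check that the right lines were picked.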

HTH,
Andy


RE: [R] Using files as connections

2003-08-29 Thread maj
I nearly forgot to thank Andy Liaw and Tony Plate for their help with this
problem. BTW Andy's method does run faster than the natural fix-up of my
original code.
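
A rough way to compare the two approaches is sketched below. The thread does not show the "natural fix-up", so one_by_one() is only a guess at it (read one line per call, keep the selected ones); by_skipping() follows the skipping idea. The function names, the temporary file, and the sizes are assumptions for the illustration, and timings will differ between machines.

```r
set.seed(2)
flines <- 20000                      # hypothetical file length
slines <- 50                         # sample size
path <- tempfile(fileext = ".csv")   # hypothetical stand-in for the data file
writeLines(paste(1:flines, "x", sep = ","), path)
selected <- sort(sample(flines, slines))

# Assumed "natural fix-up": one readLines() call per line of the file.
one_by_one <- function(path, selected) {
  out <- character(length(selected))
  con <- file(path, open = "r")
  on.exit(close(con))
  isel <- 1
  for (iline in seq_len(max(selected))) {
    cline <- readLines(con, n = 1)
    if (iline == selected[isel]) {
      out[isel] <- cline
      isel <- isel + 1
    }
  }
  out
}

# Skipping approach: two readLines() calls per selected line.
by_skipping <- function(path, selected) {
  out <- character(length(selected))
  skip <- diff(c(0, selected)) - 1
  con <- file(path, open = "r")
  on.exit(close(con))
  for (i in seq_along(skip)) {
    readLines(con, n = skip[i])      # discard the unwanted block
    out[i] <- readLines(con, n = 1)
  }
  out
}

t_one  <- system.time(a <- one_by_one(path, selected))
t_skip <- system.time(b <- by_skipping(path, selected))
same <- identical(a, b)              # both should return the same lines
print(c(one_by_one = t_one[["elapsed"]], by_skipping = t_skip[["elapsed"]]))
unlink(path)
```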

Murray Jorgensen

