Apologies, Laurent, for this two-part answer. I misunderstood your
post, where you stated you wanted to "filter(ing) some
selected lines according to the line name...". I thought that meant
you had a separate index (like a series of primes) that you wanted to
use to read in only selected line numbers from a file (test file below
with the numbers 1:1000, each on a separate line):
library(gmp)
library(iterators)
iprime <- iter(1:100, checkFunc = function(n) isprime(n))
scan(file="one_thou_lines.txt", skip=nextElem(iprime)-1, nlines=1)
Read 1 item
[1] 2
scan(file="one_thou_lines.txt", skip=nextElem(iprime)-1, nlines=1)
Read 1 item
[1] 3
scan(file="one_thou_lines.txt", skip=nextElem(iprime)-1, nlines=1)
Read 1 item
[1] 5
scan(file="one_thou_lines.txt", skip=nextElem(iprime)-1, nlines=1)
Read 1 item
[1] 7
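If the goal really were to pull a known set of line numbers, the repeated scan() calls above could be wrapped in a loop. What follows is only a minimal sketch under stated assumptions: a temporary file stands in for one_thou_lines.txt, and is_prime() is a hypothetical base-R helper used in place of gmp::isprime() so that nothing beyond base R is needed.

```r
# Assumption: a temporary file stands in for "one_thou_lines.txt"
tmp <- tempfile(fileext = ".txt")
writeLines(as.character(1:1000), tmp)

# Hypothetical helper: primality by trial division (stand-in for gmp::isprime)
is_prime <- function(n) n > 1 && all(n %% seq_len(floor(sqrt(n)))[-1] != 0)

# Line numbers to keep: the primes up to 20
wanted <- Filter(is_prime, 1:20)

# Read exactly one line per wanted line number
picked <- vapply(
  wanted,
  function(i) scan(tmp, skip = i - 1, nlines = 1, quiet = TRUE),
  numeric(1)
)
picked
```

Each scan() call reopens the file and skips from the top, so this is only reasonable for a modest number of selected lines.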
However, what it seems you really want to do is read each line of
a (possibly enormous) file, test each line "string-wise" to keep or
discard it, and, if you're keeping it, append the line to a list. I can
certainly see the advantage of this strategy for reading in very, very
large files, but it's not clear to me how the "ireadLines" function
(in the "iterators" package) will help you, since it simply steps
through the file sequentially and, as you note, has no checkFunc argument.
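For a file too large to load at once, one possible base-R approach to the keep-or-discard strategy is to read it in chunks and filter each chunk on its first field. This is a sketch only, with several assumptions: the temporary file and its two-value rows are abbreviated from your data, fields are whitespace-separated, and chunk_size is an arbitrary name and value chosen for the demonstration.

```r
# Assumption: abbreviated, whitespace-separated copy of the sensor data
tmp <- tempfile(fileext = ".txt")
writeLines(c("Time 0.000000 0.000999",
             "N023 -0.031323 -0.035026",
             "N053 -0.014083 -0.004741",
             "N123 -0.019008 -0.013494",
             "N163 -0.054023 -0.049345",
             "N193 -0.022171 -0.022384"), tmp)

sensors <- c("N053", "N163")
chunk_size <- 2   # tiny for the demo; a real run would use far more lines

kept <- character(0)
con <- file(tmp, open = "r")
repeat {
  lines <- readLines(con, n = chunk_size)
  if (length(lines) == 0) break                 # end of file reached
  first <- sub("[[:space:]].*$", "", lines)     # first field of each line
  kept <- c(kept, lines[first %in% sensors])    # keep matching lines only
}
close(con)

# Parse only the lines that survived the filter
selected <- read.table(text = kept, stringsAsFactors = FALSE)
selected
```

Only the retained lines are ever parsed into numbers, which is where most of the memory saving would come from.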
Anyway, below is an absolutely standard read-in of your data using
read.table(). Hopefully some of the code I've posted has been useful
to you.
sensors <- c("N053", "N163")
read.table("test2.txt")
    V1        V2        V3        V4        V5        V6        V7        V8        V9       V10
1 Time  0.000000  0.000999  0.001999  0.002998  0.003998  0.004997  0.005997  0.006996  0.007996
2 N023 -0.031323 -0.035026 -0.029759 -0.024886 -0.024464 -0.026816 -0.033690 -0.041067 -0.038747
3 N053 -0.014083 -0.004741  0.001443 -0.010152 -0.012996 -0.005337 -0.008738 -0.015094 -0.012104
4 N123 -0.019008 -0.013494 -0.013180 -0.029208 -0.032748 -0.020243 -0.015089 -0.014439 -0.011681
5 N163 -0.054023 -0.049345 -0.037158 -0.041120 -0.044612 -0.036953 -0.036061 -0.044516 -0.046436
6 N193 -0.022171 -0.022384 -0.022338 -0.023304 -0.022569 -0.021827 -0.021996 -0.021755 -0.021846
Laurent_data <- read.table("test2.txt")
Laurent_data[Laurent_data$V1 %in% sensors, ]
    V1        V2        V3        V4        V5        V6        V7        V8        V9       V10
3 N053 -0.014083 -0.004741  0.001443 -0.010152 -0.012996 -0.005337 -0.008738 -0.015094 -0.012104
5 N163 -0.054023 -0.049345 -0.037158 -0.041120 -0.044612 -0.036953 -0.036061 -0.044516 -0.046436
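If you then want one column per sensor, as the cbind() call in your own code produces, the selected rows can be transposed. A small sketch, assuming rows abbreviated to two values each and passed through the text= argument of read.table() so that it runs without a file; the names txt, sel and wide are mine:

```r
# Assumption: rows abbreviated to two values each, taken from the data above
txt <- c("Time 0.000000 0.000999",
         "N023 -0.031323 -0.035026",
         "N053 -0.014083 -0.004741",
         "N163 -0.054023 -0.049345")
sensors <- c("N053", "N163")

dat <- read.table(text = txt, stringsAsFactors = FALSE)
sel <- dat[dat$V1 %in% sensors, ]

# Drop the name column, transpose, and label the columns by sensor
wide <- as.data.frame(t(sel[, -1]))
names(wide) <- as.character(sel$V1)
rownames(wide) <- NULL
wide
```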
Best, Bill.
W. Michels, Ph.D.
On Sun, May 17, 2020 at 5:43 PM Laurent Rhelp <laurentrh...@free.fr> wrote:
Dear R-Help List,
I would like to use an iterator to read a file, filtering some
selected lines according to the line name, in order to use a
foreach loop afterwards. I wanted to use the checkFunc argument, as in
the following example found on the internet, to select only prime numbers:
iprime <- iter(1:100, checkFunc = function(n) isprime(n))
(https://datawookie.netlify.app/blog/2013/11/iterators-in-r/)
but the checkFunc argument does not seem to be available in the function
ireadLines (package iterators). So I wrote the code below to solve my
problem, but I am sure I am missing something about using iterators with files.
Since I found nothing on the web about ireadLines and the checkFunc
argument, could somebody help me understand how to use an
iterator (and a foreach loop) on a file while keeping only selected lines?
Thank you very much
Laurent
Presently here is my code:
## mock file to read: test.txt
##
# Time 0 0.000999 0.001999 0.002998 0.003998 0.004997 0.005997 0.006996 0.007996
# N023 -0.031323 -0.035026 -0.029759 -0.024886 -0.024464 -0.026816 -0.03369 -0.041067 -0.038747
# N053 -0.014083 -0.004741 0.001443 -0.010152 -0.012996 -0.005337 -0.008738 -0.015094 -0.012104
# N123 -0.019008 -0.013494 -0.01318 -0.029208 -0.032748 -0.020243 -0.015089 -0.014439 -0.011681
# N163 -0.054023 -0.049345 -0.037158 -0.04112 -0.044612 -0.036953 -0.036061 -0.044516 -0.046436
# N193 -0.022171 -0.022384 -0.022338 -0.023304 -0.022569 -0.021827 -0.021996 -0.021755 -0.021846
# sensors to keep
sensors <- c("N053", "N163")
library(iterators)
library(rlist)
file_name <- "test.txt"
con_obj <- file( file_name , "r")
ifile <- ireadLines( con_obj , n = 1 )
## I do not do a loop for the example
res <- list()
r <- get_Lines_iter( ifile , sensors)
res <- list.append( res , r )
res
r <- get_Lines_iter( ifile , sensors)
res <- list.append( res , r )
res
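## with the mock file, this third call stops with "The iterator is empty":
## only N193 remains after N163, and it is not selected
r <- get_Lines_iter( ifile , sensors)
do.call("cbind",res)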
## the function get_Lines_iter to select and process the line
## (it must be defined before the calls above are run)
get_Lines_iter <- function( iter , sensors, sep = '\t', quiet = FALSE){
  repeat {
    ## read the next record in the iterator
    r <- try( nextElem(iter) )
    if( inherits(r, "try-error") ){
      stop("The iterator is empty")
    }
    ## split the read line according to the separator
    fields <- scan(text = r, what = "character", sep = sep, quiet = quiet)
    ## test if we have to keep the line
    if( fields[1] %in% sensors ){
      ## data processing for the selected line
      ## (for the example: transformation into a data frame)
      x <- data.frame( as.numeric(fields[-1]) )
      names(x) <- fields[1]
      ## We return the values
      print(paste0("sensor ", fields[1], " ok"))
      return( x )
    }
    print(paste0("Sensor ", fields[1], " not selected"))
  }# end repeat loop
}
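The missing checkFunc behaviour can also be approximated without the iterators package by wrapping the connection in a closure that skips unwanted lines. This is a sketch only, not the iterators API: make_filtered_reader() and its keep argument are hypothetical names, the temporary tab-separated file abbreviates test.txt, and end of file is signalled by returning NULL rather than by an error.

```r
# Hypothetical helper: returns a function that yields the fields of the next
# line whose first field passes `keep`, or NULL once the file is exhausted.
make_filtered_reader <- function(con, keep, sep = "\t") {
  function() {
    repeat {
      line <- readLines(con, n = 1)
      if (length(line) == 0) return(NULL)
      fields <- strsplit(line, sep, fixed = TRUE)[[1]]
      if (keep(fields[1])) return(fields)
    }
  }
}

# Assumption: abbreviated, tab-separated copy of test.txt
tmp <- tempfile(fileext = ".txt")
writeLines(c("Time\t0\t0.000999",
             "N023\t-0.031323\t-0.035026",
             "N053\t-0.014083\t-0.004741",
             "N123\t-0.019008\t-0.013494",
             "N163\t-0.054023\t-0.049345",
             "N193\t-0.022171\t-0.022384"), tmp)

sensors <- c("N053", "N163")
con <- file(tmp, open = "r")
next_selected <- make_filtered_reader(con, function(name) name %in% sensors)

# Collect every selected sensor until the reader reports exhaustion
res <- list()
while (!is.null(fields <- next_selected())) {
  res[[fields[1]]] <- as.numeric(fields[-1])
}
close(con)

m <- do.call("cbind", res)
m
```

Returning NULL at end of file keeps the collecting loop free of try() handling; a version built on nextElem() would have to catch the StopIteration error instead.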