Hi Jim,

Thanks for your insight. I used the Linux split utility to break my large file into smaller partitions. On the server I work on, multipath I/O is enabled and we use RAID for storage, so I don't think I can put each partition on its own spindle. From the command line I can process several of the files at once by backgrounding each pipeline:
> cat file1.txt | wc -l &
> cat file2.txt | wc -l &
> cat file3.txt | wc -l &
> cat file4.txt | wc -l &

But I'm still not sure how to read each partition in parallel from R. When I run the code below, it doesn't run in parallel; a single CPU ends up doing all the reading.

R> library(doMC)
R> files <- Sys.glob("x*")  # grab all the file partitions created by split
R> file.list <- lapply(files, function(x) file(x, "r"))  # open a connection to each partition
R> master <- foreach(i = icount(length(file.list))) %dopar% {  # attempt at parallel readLines
+   readLines(file.list[[i]], 1000000)
+ }
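For reference, here is a minimal sketch of what I think a working version might look like. I haven't verified it, and the registerDoMC() call, the core count, and opening each connection inside the worker (rather than reusing file.list from the master process) are my assumptions:

library(doMC)

registerDoMC(cores = 4)   # assumption: without registering a backend, %dopar% runs sequentially

files <- Sys.glob("x*")   # all partitions created by split

# Each worker opens and closes its own connection; connections created
# in the master process may not be usable after the fork.
master <- foreach(f = files) %dopar% {
  con <- file(f, "r")
  lines <- readLines(con, n = 1000000)
  close(con)
  lines
}

The idea is to hand each worker a file name instead of an already-open connection, and to let registerDoMC() actually enable the parallel backend. Does that look like the right direction, or is there a better approach?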