Hello *,

I'm running into two major bottlenecks in an R script.

1. Going through a 40 MB file and reading it in via readLines() one line at a time is almost an order of magnitude slower than the equivalent in Python. I'm wondering if there are alternatives to readLines(). Reading more lines at a time helps a bit.
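For reference, the "more lines at a time" variant looks roughly like the sketch below. The temporary file stands in for the real 40 MB input, and the chunk size of 10000 and the name count_lines_chunked are arbitrary choices for illustration, not values taken from the actual script.

```r
# Stand-in input: a small temporary file (the real file is about 40 MB).
path <- tempfile(fileext = ".txt")
writeLines(sprintf("line %d", 1:25000), path)

# Read the file in chunks of `chunk_size` lines instead of one line per call.
# `count_lines_chunked` is a hypothetical helper used only for this example.
count_lines_chunked <- function(path, chunk_size = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))                 # always release the connection
  total <- 0L
  repeat {
    chunk <- readLines(con, n = chunk_size)
    if (length(chunk) == 0L) break    # character(0) signals end of file
    total <- total + length(chunk)    # placeholder for the real per-line work
  }
  total
}

n_lines <- count_lines_chunked(path)
```

Keeping the connection open across calls is what lets each readLines() call continue where the previous one stopped.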
2. Generating date sequences takes a long time. I'm basically doing something like

    seq.Date(Sys.Date(), length.out = 300, by = 'day')

a lot. While digging into it, I strace'd the running process, and it seems the bulk of the time is spent checking /etc/localtime:

    stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0

    strace -cp 2964
    Process 2964 attached - interrupt to quit
    ^CProcess 2964 detached
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     94.61    0.006387           0     55872           stat
      2.58    0.000174           0       568           read
      1.42    0.000096           0       285           write
      1.39    0.000094           1       137           brk
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.006751                 56862           total

Has anybody run into similar problems?
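To make the pattern concrete, here is a reduced sketch of the date-sequence call next to a plain date-arithmetic variant that yields the same days. Whether the second form reduces the /etc/localtime checks is an assumption that has not been measured here. The names start, via_seq, and via_add are made up for this example.

```r
# Query the current date once and reuse it, rather than calling Sys.Date()
# inside every sequence construction.
start <- Sys.Date()

# Pattern from the script: build 300 consecutive days with seq.Date().
via_seq <- seq.Date(start, length.out = 300, by = "day")

# Alternative sketch: a Date is stored as a count of days, so adding the
# offsets 0..299 produces the same 300 consecutive days.
via_add <- start + 0:299

# Check that both approaches agree element by element.
same_dates <- all(via_seq == via_add)
```

Timing both forms with system.time() in a loop, and re-running strace -c on each, would show whether the stat calls actually differ.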