Hi, I need to resample characters from a dataset consisting of one extremely long string written across hundreds of thousands of lines, each 50 characters long. I am currently doing this by first inserting a space after each character in the dataset and then using the following commands:

y <- as.matrix(read.table("data.txt", stringsAsFactors=FALSE))
bstrap <- sample(length(y), 100000, TRUE)
write(y[bstrap], file="Rep1.txt", ncolumns=50, append=FALSE)
bstrap <- sample(length(y), 100000, TRUE)
write(y[bstrap], file="Rep2.txt", ncolumns=50, append=FALSE)
bstrap <- sample(length(y), 100000, TRUE)
. . .

and so on for 500 reps. I think there should be a better way of doing this. My specific questions:

1. Is there a way to avoid inserting spaces between the characters before calling the "sample" command (because I don't want spaces between the resampled characters in the output either; see number 2 below)?

2. If I have no choice but to insert the spaces in my data before resampling, is there a way to output the resampled data without spaces, simply as 50-character strings one below the other? I tried adding strip.white=TRUE to the write command, but it gave me an error because write did not recognize that argument.

3. Finally, since I have to get 500 such resampled reps from each dataset (and there are over 20 such huge datasets), is there a way around having to write a separate write command for each rep?

Any suggestions will be greatly appreciated.

Thanks,
S.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
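
One possible way to address the three questions above is sketched here, using only base R. The function names (read_chars, resample_lines, write_reps) are hypothetical and chosen for illustration. The sketch assumes the input file contains only the characters to be resampled, with no spaces, and that each replicate should hold 100000 characters drawn with replacement, as in the commands quoted in the post.

```r
# Hypothetical helpers; names are illustrative, not from any package.

# Question 1: read the file as plain lines and split every line into
# single characters, so no spaces have to be inserted beforehand.
read_chars <- function(path) {
  unlist(strsplit(readLines(path), ""))
}

# Question 2: draw n characters with replacement, glue them into one
# string, and cut that string into pieces of `width` characters.
resample_lines <- function(chars, n = 100000, width = 50) {
  s <- paste(sample(chars, n, replace = TRUE), collapse = "")
  starts <- seq(1, n, by = width)
  substring(s, starts, pmin(starts + width - 1, n))
}

# Question 3: loop over the replicates and build each file name from
# the loop index instead of writing one command per replicate.
write_reps <- function(chars, reps = 500, prefix = "Rep",
                       n = 100000, width = 50) {
  for (i in seq_len(reps)) {
    writeLines(resample_lines(chars, n, width),
               paste0(prefix, i, ".txt"))
  }
}

# Example use (assumes "data.txt" exists in the working directory):
# y <- read_chars("data.txt")
# write_reps(y)
```

For the 20 or so datasets, the same two calls could be wrapped in an outer loop over the input file names, passing a different prefix for each dataset so that the replicate files do not overwrite one another.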