Sometimes I have hundreds of CSV files scattered across a directory tree, each produced by one run of an experiment. To give an example from my field, I may want to collect the performance of a processor for several design parameters such as "cache size" (possible values: 2, 4, 8 and 16) and "cache associativity" (possible values: direct-mapped, 4-way, fully-associative). The results of all these experiments will be stored in a directory tree like:

    results
    |-- direct-mapped
    |   |-- 2 -- data.csv
    |   |-- 4 -- data.csv
    |   |-- 8 -- data.csv
    |   |-- 16 -- data.csv
    |-- 4-way
    |   |-- 2 -- data.csv
    |   |-- 4 -- data.csv
    ...
    |-- fully-associative
    |   |-- 2 -- data.csv
    |   |-- 4 -- data.csv
    ...

I am developing a package that allows me to gather all those CSV files into a single data frame. Currently, I just need to execute the following statement:

    dframe <- gather("results/@ASSOC@/@SIZE@/data.csv")

This command returns a data frame containing the columns ASSOC and SIZE, taken from the directory names, plus all the columns found inside the CSV files (in my case, the processor performance). In effect, it loads all the CSV files into a single data frame. So, I would get something like:

    ASSOC, SIZE, PERF
    direct-mapped, 2, 1.4
    direct-mapped, 4, 1.6
    direct-mapped, 8, 1.7
    direct-mapped, 16, 1.7
    4-way, 2, 1.4
    4-way, 4, 1.5
    ...

I would like to ask whether any similar functionality is already implemented in R. If so, there is no need to reinvent the wheel :)

If it is not implemented and the R community believes that this feature would be useful, I would be glad to contribute my code.

Thank you,
Victor

P.S.: I was not sure whether to submit this question to R-devel or R-help, but since it may lead to some programming discussion I decided to post it to R-devel. Please let me know if it would be better to move it to the other list.
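
To make the behaviour described above concrete, here is a minimal base R sketch of how such a call could work. It is not the package's actual code: the name gather_sketch and all of its internals are assumptions made for illustration. It also assumes that the pattern uses forward slashes, starts with a literal directory (such as "results"), and that each @NAME@ placeholder stands for exactly one path component.

```r
# Hypothetical sketch only; not the implementation of gather() from the post.
gather_sketch <- function(pattern) {
  # Placeholder names in order of appearance, e.g. "ASSOC" and "SIZE".
  keys <- regmatches(pattern, gregexpr("@[A-Za-z0-9_]+@", pattern))[[1]]
  keys <- gsub("@", "", keys, fixed = TRUE)

  # Escape regex metacharacters in the literal parts of the pattern, then
  # let every placeholder capture one path component.
  regex <- gsub("([.|()\\^{}+$*?]|\\[|\\])", "\\\\\\1", pattern)
  regex <- paste0("^", gsub("@[A-Za-z0-9_]+@", "([^/]+)", regex), "$")

  # Literal directory that precedes the first placeholder (assumed to exist).
  root <- sub("/[^/]*@.*$", "", pattern)
  files <- file.path(root, list.files(root, recursive = TRUE))

  # For each file: character(0) if it does not match the pattern, otherwise
  # the full path followed by the captured placeholder values.
  matches <- regmatches(files, regexec(regex, files))
  matches <- matches[lengths(matches) > 0]
  if (length(matches) == 0) return(data.frame())

  pieces <- lapply(matches, function(m) {
    data <- read.csv(m[1], stringsAsFactors = FALSE)
    # Repeat each captured value once per row of this CSV file.
    meta <- lapply(setNames(m[-1], keys), rep, times = nrow(data))
    cbind(as.data.frame(meta, stringsAsFactors = FALSE), data)
  })

  result <- do.call(rbind, pieces)
  rownames(result) <- NULL
  # Turn values such as "2" or "16" into numbers where possible.
  result[keys] <- lapply(result[keys], type.convert, as.is = TRUE)
  result
}
```

With the directory tree shown earlier, gather_sketch("results/@ASSOC@/@SIZE@/data.csv") would be called the same way as the gather() statement above. Note that do.call(rbind, ...) requires every CSV file to have the same columns; files with differing columns would need extra handling.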