On Aug 28, 2013, at 2:24 PM, Hadley Wickham wrote:

>> Yup - parsing is the most expensive part. That's why for high-throughput 
>> data you don't want to use ASCII representation. It's amazing that the disk 
>> speeds are now so high that CPUs are the bottlenecks now, not vice versa.
> 
> Do you have any recommendations for binary formats? For R, is there anything 
> obviously better than Rdata?
> 

Native formats are the fastest (and versatile), so readBin/writeBin or mmap.
I tend to avoid strings (I store dates as POSIXct, which are doubles, and 
anything else as factors, which are integers), so the above works for me just 
fine. I am working on a way to do direct mmap serialization of SEXPs, but it's 
not ready yet (basic vectors are supported, but complex objects are not).
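
A minimal sketch of the readBin/writeBin approach described above, using only 
base R. The data, the variable names, and the file layout (a count followed by 
the two columns) are assumptions made for illustration, not a format anyone in 
this thread uses. The factor levels are kept in memory here; a real file would 
have to store them somewhere as well.

```r
# Hypothetical example data: a POSIXct column and a factor column.
times  <- as.POSIXct("2013-08-28 14:24:00", tz = "UTC") + 0:4
groups <- factor(c("a", "b", "a", "c", "b"))
lv     <- levels(groups)            # levels must be kept separately

path <- tempfile(fileext = ".bin")

# Write the element count, then the doubles behind the POSIXct values,
# then the integer codes behind the factor. No text parsing is involved.
con <- file(path, "wb")
writeBin(length(times), con)        # one 4-byte integer
writeBin(as.numeric(times), con)    # seconds since the epoch, as doubles
writeBin(as.integer(groups), con)   # factor codes, as integers
close(con)

# Read everything back in the same order and restore the classes.
con <- file(path, "rb")
n      <- readBin(con, "integer", n = 1L)
secs   <- readBin(con, "double",  n = n)
codes  <- readBin(con, "integer", n = n)
close(con)

times2  <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
groups2 <- factor(codes, levels = seq_along(lv), labels = lv)
```

The file is written in the machine's native byte order and sizes, so this 
layout is only safe to read back on a compatible platform unless `endian` and 
`size` are given explicitly to readBin/writeBin.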

Cheers,
Simon

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
