Hi,
I need to serialize a 20K x 20K matrix and save it as a binary file. This process
is significantly slower in R than in Python (4X slower).
I'm not sure how best to optimize the code below. Is it possible
to parallelize the serialization function to improve performance?
m <- 20000
n <- m^2
cat("Generating matrices ... ")
INI.TIME <- proc.time()
A <- matrix(runif(n), ncol = m)
END_GEN.TIME <- proc.time()
arg_ser <- serialize(object = A, connection = NULL)
END_SER.TIME <- proc.time()
con <- file(description = "matrix_file", open = "wb")
writeBin(object = arg_ser, con = con)
close(con)
END_WRITE.TIME <- proc.time()
con <- file(description = "matrix_file", open = "rb")
par_raw <- readBin(con, what = raw(), n = file.info("matrix_file")$size)
END_READ.TIME <- proc.time()
B <- unserialize(connection = par_raw)
close(con)
END_DES.TIME <- proc.time()
TIME <- END_GEN.TIME - INI.TIME
cat("Generation time", TIME[3], "seconds.\n")
TIME <- END_SER.TIME - END_GEN.TIME
cat("Serialization time", TIME[3], "seconds.\n")
TIME <- END_WRITE.TIME - END_SER.TIME
cat("Writing time", TIME[3], "seconds.\n")
TIME <- END_READ.TIME - END_WRITE.TIME
cat("Read time", TIME[3], "seconds.\n")
TIME <- END_DES.TIME - END_READ.TIME
cat("Deserialize time", TIME[3], "seconds.\n")
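One variant that may be worth comparing against the timings above is to serialize straight to the file connection, so no intermediate raw vector is built, and to pass xdr = FALSE so the data stays in native byte order. This is only a sketch: the small dimension and the temporary file path are illustrative assumptions, not the 20K x 20K case, and whether it closes the gap with Python is untested here.

```r
# Illustrative size only; the real case would use m <- 20000.
m <- 200
A <- matrix(runif(m^2), ncol = m)

# Hypothetical output location for this sketch.
path <- tempfile("matrix_file")

# Serialize directly to the binary connection (no intermediate raw vector).
# xdr = FALSE writes native byte order instead of big-endian XDR.
con <- file(description = path, open = "wb")
serialize(object = A, connection = con, xdr = FALSE)
close(con)

# Deserialize directly from the connection.
con <- file(description = path, open = "rb")
B <- unserialize(connection = con)
close(con)

# Round-trip check: the matrix read back should match the original exactly.
stopifnot(identical(A, B))
```

The same proc.time() checkpoints can be placed around the two steps to compare with the serialize-then-writeBin version.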
Best,
--Sameh
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel