I’ve got an Excel workbook with about 30 worksheets, each holding
roughly 10,000 rows of data across 30 columns.
I’d like to read the data from each worksheet into a data frame or
matrix in R for processing. Normally I use read.csv when interacting
with Excel, but I’d rather work with the multi-sheet workbook directly
than split the original workbook and save each sheet out as a CSV.
So far, I’ve tried read.xlsx from the xlsx package. This works fine
for small test files. For example, suppose I’m reading from the
test_file workbook on my desktop: the following code extracts rows 1
and 2 from the worksheet "johnny".
library(xlsx)
setwd("C:\\Documents and Settings\\dmenezes\\Desktop")
info <- read.xlsx("test_file.xlsx", sheetName = "johnny",
                  rowIndex = 1:2, header = FALSE)
info
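For context, the full job would just loop that same call over every sheet. A sketch of what I have in mind (the workbook file name here is a placeholder, not my real file):

```r
library(xlsx)

wb_file <- "real_workbook.xlsx"        # placeholder for the actual workbook
wb <- loadWorkbook(wb_file)            # open the workbook once
sheet_names <- names(getSheets(wb))    # names of all ~30 worksheets

# Read each sheet into a data frame; the result is a named list
all_sheets <- lapply(sheet_names, function(s)
  read.xlsx(wb_file, sheetName = s, header = FALSE))
names(all_sheets) <- sheet_names
```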
However, when I apply this to my real, large workbook, it fails with
the error below. Any ideas/workarounds?
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
  java.lang.OutOfMemoryError: Java heap space
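Since xlsx runs on rJava, one workaround I have seen suggested is raising the JVM heap limit before the package loads. A sketch of what I understand that to look like (the 4 GB figure is my guess, not a tested value):

```r
# Must run in a fresh R session, BEFORE library(xlsx) starts the JVM
options(java.parameters = "-Xmx4g")  # allow the Java heap to grow to 4 GB

library(xlsx)
info <- read.xlsx("real_workbook.xlsx", sheetName = "johnny", header = FALSE)
```

I haven’t been able to confirm whether this is enough for a workbook of this size, so other suggestions are still welcome.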
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.