All, I have been doing some stress testing on the XLSX2CSV program that
I sent along yesterday. I have a new version if anyone would like it.
Today I am working with an XLSX file that is 19 MB on disk. First I opened
it with WinZip, which reports the total content is about 150 MB. No problem,
I thought: I'm using XSSF and SAX event-based processing, so I should never
have much in memory.
In my previous version I used Package.open(FileInputStream), and that blew up.
It is documented to read the entire package, which is exactly what I do not want.
So I switched to Package.open(String path), and that got me through the open
step; the open now completes instantly, just like WinZip.
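
For anyone who wants to check the WinZip numbers from Java, here is a minimal
sketch. It uses only java.util.zip, not the POI Package API, and the class and
method names (ZipSizes, totalUncompressedBytes) are made up for illustration.
ZipFile reads the sizes from the zip directory, so nothing gets inflated.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of XLSX2CSV or POI.
class ZipSizes {

    /** Sums the uncompressed sizes recorded for each entry in the zip file. */
    static long totalUncompressedBytes(String path) throws IOException {
        try (ZipFile zip = new ZipFile(path)) {
            long total = 0;
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                long size = entries.nextElement().getSize(); // -1 if not known
                if (size > 0) {
                    total += size;
                }
            }
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more .xlsx (or any zip) paths on the command line.
        for (String path : args) {
            System.out.println(path + ": " + totalUncompressedBytes(path)
                    + " bytes uncompressed");
        }
    }
}
```

Opening by path like this is cheap for the same reason the path-based package
open is: only the directory is read up front, not the entry contents.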
Anyhow, immediately after getting the package opened, my program requests the
shared strings table. But the getSharedStringsTable() method blew up:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.apache.xmlbeans.impl.store.Saver$TextSaver.resize(Saver.java:1592)
    at org.apache.xmlbeans.impl.store.Saver$TextSaver.preEmit(Saver.java:1223)
    at org.apache.xmlbeans.impl.store.Saver$TextSaver.emit(Saver.java:1144)
    at org.apache.xmlbeans.impl.store.Saver$TextSaver.emitElement(Saver.java:926)
    at org.apache.xmlbeans.impl.store.Saver.processElement(Saver.java:456)
    at org.apache.xmlbeans.impl.store.Saver.process(Saver.java:307)
    at org.apache.xmlbeans.impl.store.Saver$TextSaver.saveToString(Saver.java:1727)
    at org.apache.xmlbeans.impl.store.Cursor._xmlText(Cursor.java:546)
    at org.apache.xmlbeans.impl.store.Cursor.xmlText(Cursor.java:2436)
    at org.apache.xmlbeans.impl.values.XmlObjectBase.xmlText(XmlObjectBase.java:1455)
    at org.apache.xmlbeans.impl.values.XmlObjectBase.toString(XmlObjectBase.java:1440)
    at org.apache.poi.xssf.model.SharedStringsTable.readFrom(SharedStringsTable.java:109)
    at org.apache.poi.xssf.model.SharedStringsTable.<init>(SharedStringsTable.java:93)
    at org.apache.poi.xssf.eventusermodel.XSSFReader.getSharedStringsTable(XSSFReader.java:72)
    at something.or.other.XLSX2CSV.process(XLSX2CSV.java:315)
    at something.or.other.XLSX2CSV.main(XLSX2CSV.java:360)
WinZip tells me the shared-strings table file is about 6 MB. That's not exactly
small, but I sure would not describe it as large. I launch the program from
Eclipse with no special arguments, so Java defaults to a 64 MB maximum heap. I
watched the javaw process (on a Windows box) climb to about 82 MB. Finally I
threw in the towel and used these JVM arguments to boost the limit
(-Xmx256M -Xms256M -Xss128k), and then it worked like a charm.
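
In case it helps anyone reproduce this, here is a small sketch for confirming
what heap limit the JVM actually picked up, since launcher settings are easy
to get wrong. The class name HeapReport is made up; Runtime.maxMemory(),
totalMemory() and freeMemory() are standard JDK calls. The reported maximum
can come out a little under the -Xmx value, depending on the JVM.

```java
// Hypothetical helper for checking the effective heap settings.
class HeapReport {

    private static final long MB = 1024L * 1024L;

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    /** Heap currently in use (allocated minus free), in megabytes. */
    static long usedHeapMegabytes() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    public static void main(String[] args) {
        System.out.println("max heap:  " + maxHeapMegabytes() + " MB");
        System.out.println("used heap: " + usedHeapMegabytes() + " MB");
    }
}
```

Calling the two methods just before and just after the shared-strings step
would show how much of the limit that one call consumes.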
So my next question is: is there something especially memory-hungry in the
shared strings table builder? Is there anything I can do here? Would you like
me to file a bug report?
Please advise, thanks in advance.
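
To make the question concrete, here is a rough sketch of the kind of
event-based pass I would expect to be cheap: it walks a shared-strings
document with plain JDK SAX and only counts items and characters. This is not
how SharedStringsTable works and not a replacement for it; the class name
SharedStringStats is made up, and it assumes the usual layout of <si> items
containing <t> text elements.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: counts shared strings without building a tree.
class SharedStringStats extends DefaultHandler {
    int items;        // number of <si> elements seen
    long textChars;   // total characters found inside <t> elements
    private boolean inText;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if ("si".equals(localName)) {
            items++;
        } else if ("t".equals(localName)) {
            inText = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("t".equals(localName)) {
            inText = false;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inText) {
            textChars += length;
        }
    }

    /** Streams the XML once and returns the collected counts. */
    static SharedStringStats scan(InputStream in) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // so localName is populated
        SharedStringStats stats = new SharedStringStats();
        factory.newSAXParser().parse(in, stats);
        return stats;
    }

    public static void main(String[] args) throws Exception {
        // Tiny stand-in document; a real run would pass the stream of the
        // shared-strings part taken from the package instead.
        String xml = "<sst xmlns=\"http://schemas.openxmlformats.org/"
                + "spreadsheetml/2006/main\">"
                + "<si><t>alpha</t></si><si><t>beta</t></si></sst>";
        SharedStringStats s = scan(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(s.items + " strings, " + s.textChars + " characters");
    }
}
```

Memory use here stays flat no matter how many strings there are, because
nothing is retained beyond two counters. A real table obviously has to keep
the strings, but this gives a baseline to compare against.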
chris...