All, I have been doing some stress testing on the XLSX2CSV program that I sent along yesterday. I have a new version if anyone would like it.

Today I am working with an XLSX file that is 19 MB on disk. First I opened it with WinZip, which reports that the total uncompressed content is about 150 MB. No problem, I thought: I'm using XSSF and SAX event-based processing, so I should never have much in memory.
In my previous version I used Package.open(FileInputStream), and that blew up. That overload is documented to read the entire package, which is exactly what I do not want. So I switched to Package.open(String path), and then I could get through the open step; now it opens instantly, just like WinZip.
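For anyone who wants to check part sizes without WinZip, here is a minimal sketch that uses only java.util.zip from the JDK, not the POI API. The class name XlsxPartSizes is made up for illustration. ZipFile reads the archive's directory instead of inflating every entry, which I assume is the same reason the path-based open is fast; I have not checked how Package.open(String) is implemented internally.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of POI: lists the parts of an XLSX (zip) file.
public class XlsxPartSizes {

    /** Prints each entry and returns the sum of the known uncompressed sizes. */
    static long totalUncompressedSize(String path) throws IOException {
        long total = 0;
        // ZipFile gives random access via the zip directory; no entry is inflated here.
        try (ZipFile zip = new ZipFile(path)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                long size = entry.getSize(); // -1 when the size is not recorded
                System.out.println(entry.getName() + "  " + size + " bytes");
                if (size > 0) {
                    total += size;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java XlsxPartSizes <file.xlsx>");
            return;
        }
        System.out.println("total uncompressed: " + totalUncompressedSize(args[0]) + " bytes");
    }
}
```

Run it with the path to the workbook as the only argument; the xl/sharedStrings.xml line shows how big the shared strings part really is.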

Anyhow, immediately after opening the package, my program requests the shared strings table. But the getSharedStringsTable() method ran out of heap:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.apache.xmlbeans.impl.store.Saver$TextSaver.resize(Saver.java:1592)
        at org.apache.xmlbeans.impl.store.Saver$TextSaver.preEmit(Saver.java:1223)
        at org.apache.xmlbeans.impl.store.Saver$TextSaver.emit(Saver.java:1144)
        at org.apache.xmlbeans.impl.store.Saver$TextSaver.emitElement(Saver.java:926)
        at org.apache.xmlbeans.impl.store.Saver.processElement(Saver.java:456)
        at org.apache.xmlbeans.impl.store.Saver.process(Saver.java:307)
        at org.apache.xmlbeans.impl.store.Saver$TextSaver.saveToString(Saver.java:1727)
        at org.apache.xmlbeans.impl.store.Cursor._xmlText(Cursor.java:546)
        at org.apache.xmlbeans.impl.store.Cursor.xmlText(Cursor.java:2436)
        at org.apache.xmlbeans.impl.values.XmlObjectBase.xmlText(XmlObjectBase.java:1455)
        at org.apache.xmlbeans.impl.values.XmlObjectBase.toString(XmlObjectBase.java:1440)
        at org.apache.poi.xssf.model.SharedStringsTable.readFrom(SharedStringsTable.java:109)
        at org.apache.poi.xssf.model.SharedStringsTable.<init>(SharedStringsTable.java:93)
        at org.apache.poi.xssf.eventusermodel.XSSFReader.getSharedStringsTable(XSSFReader.java:72)
        at something.or.other.XLSX2CSV.process(XLSX2CSV.java:315)
        at something.or.other.XLSX2CSV.main(XLSX2CSV.java:360)

WinZip tells me the shared strings table file is about 6 MB. That's not exactly small, but I sure would not describe it as large. I launch the program from Eclipse with no special arguments, so Java defaults to a 64 MB heap. I watched the javaw process (on a Windows box) climb to about 82 MB. Finally I threw in the towel and added these JVM arguments to boost the limit (-Xmx256M -Xms256M -Xss128k), and then it worked like a charm.
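To confirm which heap limit a given launch configuration actually gets, a tiny check like the following can be run with the same JVM arguments. This is only a sketch; the class name HeapLimit is made up, and the numbers it prints depend on the JVM and its flags.

```java
// Hypothetical helper: reports the heap limits the running JVM is using.
public class HeapLimit {

    private static final long MB = 1024L * 1024L;

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory reflects -Xmx (or the JVM default when -Xmx is absent).
        System.out.println("max heap:   " + maxHeapMegabytes() + " MB");
        // totalMemory is the heap currently reserved; it starts near -Xms.
        System.out.println("total heap: " + rt.totalMemory() / MB + " MB");
        // Used heap is what is reserved minus what is still free inside it.
        System.out.println("used heap:  " + (rt.totalMemory() - rt.freeMemory()) / MB + " MB");
    }
}
```

Running it once with no arguments and once with -Xmx256M -Xms256M shows whether Eclipse really passes the settings through.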

So my next question is: is there something especially memory-hungry in the shared strings table builder? Is there anything I can do here? Would you like me to file a bug report?
Please advise, thanks in advance.
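To make the question concrete: the trace above shows the error while an XMLBeans object is being serialized to a string inside SharedStringsTable.readFrom. One workaround I could imagine is to skip that class and read the shared strings part with plain SAX. The sketch below uses only the JDK and is not POI code. It assumes the part is stored as xl/sharedStrings.xml (strictly speaking the name should be looked up through the package relationships), it joins the text of rich-text runs, and it skips phonetic runs. The class name SaxSharedStrings is made up.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical replacement for the shared strings lookup; not part of POI.
public class SaxSharedStrings {

    /** Streams a shared strings part and returns one string per <si> element. */
    static List<String> read(InputStream in)
            throws IOException, SAXException, ParserConfigurationException {
        final List<String> strings = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // so localName is populated
        factory.newSAXParser().parse(in, new DefaultHandler() {
            private final StringBuilder current = new StringBuilder();
            private boolean inText;
            private boolean inPhonetic;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("si".equals(local)) {
                    current.setLength(0);          // start of a new string item
                } else if ("rPh".equals(local)) {
                    inPhonetic = true;             // phonetic hint, not cell text
                } else if ("t".equals(local)) {
                    inText = !inPhonetic;
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("t".equals(local)) {
                    inText = false;
                } else if ("rPh".equals(local)) {
                    inPhonetic = false;
                } else if ("si".equals(local)) {
                    strings.add(current.toString());
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inText) {
                    current.append(ch, start, length);
                }
            }
        });
        return strings;
    }

    /** Opens the workbook by path and reads the assumed xl/sharedStrings.xml part. */
    static List<String> readFromXlsx(String path)
            throws IOException, SAXException, ParserConfigurationException {
        try (ZipFile zip = new ZipFile(path)) {
            ZipEntry entry = zip.getEntry("xl/sharedStrings.xml");
            if (entry == null) {
                return new ArrayList<>();          // workbook without shared strings
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return read(in);
            }
        }
    }
}
```

The resulting list can be indexed by the number found in cells of type "s". It still holds every string in memory, so it only avoids the intermediate XML object tree and its string copy, not the table itself.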

chris...
