I found a way to work with very big Excel files, using the low-level API.
I would recommend first exploring other formats: CSV could be a very good
alternative and is a lot easier to manipulate in large volumes.
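As a quick illustration of the CSV route, rows can be streamed out one at a
time with just the standard library, so memory stays flat no matter how many
rows you write (the class and method names below are my own, not from POI):

```java
import java.io.IOException;
import java.io.Writer;

// Illustrative sketch: stream rows straight to a Writer, one at a time.
// Nothing is buffered beyond the current row.
public class CsvOut {
    // Quote a field if it contains a comma, quote or newline,
    // doubling embedded quotes (RFC 4180 style).
    public static String escape(String field) {
        if (field.contains(",") || field.contains("\"") || field.contains("\n")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    public static void writeRow(Writer w, String... cells) throws IOException {
        for (int i = 0; i < cells.length; i++) {
            if (i > 0) w.write(',');
            w.write(escape(cells[i]));
        }
        w.write('\n');
    }
}
```

Wrap a FileWriter in a BufferedWriter, call writeRow once per database row,
and the footprint never grows with the row count.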
If you don't have any other choice, you need to work with the BIFF stream,
reading record by record: open a RecordInputStream from the POIFSFileSystem,
read record by record to find the place where you want to add cells, and
write the stream back to the filesystem. At that point the file will be
invalid, because the "header" contains the BoundSheetRecords with pointers
to the positions in the stream of the BOF records corresponding to the
worksheets, and some of those BOF records have now shifted. While you read
the stream, you need to capture the BoundSheetRecords in a list, and as you
progress through the stream adding cells (or whatever) you need to calculate
how far things have shifted and update the pointers of the BoundSheetRecords
in the list. After the first pass, adding all the records you need, open the
file again, update the BoundSheetRecords, and save it again. The resulting
XLS should work.
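To make the pointer bookkeeping concrete, here is a toy sketch of the idea.
It is NOT POI's actual API: it models records with BIFF-like framing
([sid:2][len:2][payload], little-endian), inserts a record at a boundary,
and shifts any stored offset that points at or past the insertion point,
which is essentially what has to happen to the BoundSheetRecords' BOF
pointers:

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Toy model of the two-pass fix-up: insert a record into a BIFF-like stream
// and patch stored offsets by the number of bytes added.
public class ShiftDemo {
    public static int readU16(byte[] b, int off) {
        return (b[off] & 0xFF) | ((b[off + 1] & 0xFF) << 8);
    }

    public static void writeU16(byte[] b, int off, int v) {
        b[off] = (byte) (v & 0xFF);
        b[off + 1] = (byte) ((v >> 8) & 0xFF);
    }

    // Walk the stream record by record: each record is 4 header bytes
    // (sid, length) followed by `length` payload bytes.
    public static List<Integer> recordOffsets(byte[] stream) {
        List<Integer> offs = new ArrayList<>();
        int pos = 0;
        while (pos + 4 <= stream.length) {
            offs.add(pos);
            pos += 4 + readU16(stream, pos + 2);
        }
        return offs;
    }

    // Insert newRecord at a record boundary, then patch each stored pointer
    // (pointerOffsets are byte positions of u16 offsets held in "header"
    // records that sit BEFORE the insertion point, as the BoundSheetRecords
    // do) so it still points at the same record after the shift.
    public static byte[] insertAndFixPointers(byte[] stream, int insertAt,
            byte[] newRecord, List<Integer> pointerOffsets) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(stream, 0, insertAt);
        out.write(newRecord, 0, newRecord.length);
        out.write(stream, insertAt, stream.length - insertAt);
        byte[] result = out.toByteArray();
        int shift = newRecord.length;
        for (int ptrOff : pointerOffsets) {
            int target = readU16(result, ptrOff);
            if (target >= insertAt) {
                writeU16(result, ptrOff, target + shift);
            }
        }
        return result;
    }
}
```

The real stream uses 4-byte offsets and you accumulate the shift across many
insertions, but the arithmetic is the same: pointer += total bytes inserted
before the pointed-at record.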
It actually looks more complex than it really is (I'm omitting some details
anyway). But I've successfully added a new column with values to an Excel
file with 5 worksheets of 50,000 rows each, with a small memory footprint
and pretty quickly. Using the HSSF model was impossible (= impractical): I
gave up when loading the XLS with just 1 worksheet took more than 1 GB and
a couple of minutes.

If I find a way to generalize it, maybe I should contribute the code to the
project...

Regards
Gabriel Claramunt

Nick Burch <[EMAIL PROTECTED]>
06/19/2008 02:28 PM CET
Please respond to "POI Users List"

 To       "Singh, Saurabh (IT)" <[EMAIL PROTECTED]>
 cc       POI Users List <[email protected]>
 Subject  Re: large data: outOfMemory Exception

On Thu, 19 Jun 2008, Singh, Saurabh (IT) wrote:
> We have large data fetched from database. Around 10000 rows and 33
> columns per row. Also, we do not want to increase the heap size of jvm.

Your two options are:
* increase the JVM heap size
* use a less memory-hungry format, e.g. CSV

> Can you suggest something on lines of writing the rows incrementally into
> Excel sheet. I mean, say every 1000 records I write I close the output
> stream object.

Alas, the Excel file format doesn't work like that. You need to keep going
back and adding / changing earlier records when you write the later ones.
The Excel file format was also never really designed to hold the sorts of
data volumes you're talking about, which is why you're hitting issues -
it's just not optimised for it.


I'd suggest you just whack up the JVM heap size - the default is really
rather small.
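If you want to check what heap you're actually getting, the JVM reports it
at runtime (the -Xmx flag in the comment is the standard way to raise the
ceiling; the class name here is just for illustration):

```java
// Run with e.g.  java -Xmx1024m YourApp  to raise the heap ceiling.
// Runtime.maxMemory() reports the current limit in bytes.
public class HeapCheck {
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMb() + " MB");
    }
}
```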

Nick




