On Tue, 29 Jan 2008, Daniel Noll wrote:
Is your formula-related eventusermodel code in a format suitable for contributing back? It'd be handy to be able to put something in svn that would make dealing with the formula stuff much simpler. I'd be happy to spend a bit of time tidying it up and writing tests for it, if you could contribute it.

If I ever figure out how to handle it, I probably would contribute it back because it would mean changes to how shared formulas work. At the moment, as you say, it does require a Workbook, and I don't have a Workbook to work with. Maybe I can store off the first however many records and then create the Workbook from those -- I haven't tried, so I don't know what happens if you feed in a list of records without the ones which make up the rest of the file.

I think you might be able to get away with that. If not, shout and we can tweak things.

If it gets you close, then we should probably come up with something like a WorkbookRecordSource interface, which model.Workbook implements. We'd tweak the formula code to use that instead, and then it's easier for you to pass in the records that matter. Let us know if that looks like being worth doing.
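
To make the idea concrete, here is a rough sketch of what such an interface could look like. Nothing below exists in POI: the single method on WorkbookRecordSource, the BufferedRecordSource stand-in and the FormulaRenderer helper are all assumptions made up for illustration, on the guess that formula code mostly needs lookups such as sheet names rather than a full Workbook.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: the minimal lookups formula code is assumed to need.
interface WorkbookRecordSource {
    String getSheetName(int sheetIndex);
}

// Hypothetical stand-in that an event-model user could fill from the
// records seen at the start of the stream, without building a Workbook.
final class BufferedRecordSource implements WorkbookRecordSource {
    private final List<String> sheetNames = new ArrayList<>();

    void addSheetName(String name) {
        sheetNames.add(name);
    }

    @Override
    public String getSheetName(int sheetIndex) {
        return sheetNames.get(sheetIndex);
    }
}

// Hypothetical consumer: depends only on the narrow interface.
final class FormulaRenderer {
    static String renderRef(WorkbookRecordSource src, int sheetIndex, String cell) {
        return src.getSheetName(sheetIndex) + "!" + cell;
    }
}
```

With something along these lines, model.Workbook would implement the same interface, so the formula code would not care which of the two it was handed.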


Memory is indeed cheap, but unless you have the luxury of a 64-bit JVM, there is an upper limit of somewhere around 1.4GB, sometimes less. This would normally be nearly 2GB but Windows allocates some DLLs in weird positions on some systems, and Sun insist on allocating a contiguous block of memory for the heap which sometimes causes a huge unusable memory hole above that.

Have you tried tweaking your Windows box to use a 1GB/3GB split, instead of the usual 2GB/2GB one? It might help in the absence of a 64-bit JVM or a licence for a non-hobbled 32-bit version of Windows.
http://www.microsoft.com/whdc/system/platform/server/PAE/PAEmem.mspx


In actual fact for us, something closer to RecordInputStream would be even better, where we can just say nextRecord() and have it return a properly constructed Record. Then we have control over the loop, which is ideal when you need to return a Reader.

Does the newly added org.apache.poi.hssf.eventusermodel.HSSFRecordStream look roughly like what you need? I've converted the existing eventusermodel code to use it under the hood, so it ought to behave pretty much the same, except with pull instead of push.
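
For anyone following along, the pull style can be sketched roughly as below. This is a standalone illustration, not the HSSFRecordStream implementation: PullRecordReader and RawRecord are made-up names, and the only thing taken from the file format is the record framing (a 2-byte little-endian sid, a 2-byte little-endian length, then the data). It does no validation, so a truncated record would throw a BufferUnderflowException.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical holder for one unparsed record.
final class RawRecord {
    final int sid;
    final byte[] data;

    RawRecord(int sid, byte[] data) {
        this.sid = sid;
        this.data = data;
    }
}

// Hypothetical pull-style reader: the caller owns the loop.
final class PullRecordReader {
    private final ByteBuffer buf;

    PullRecordReader(byte[] bytes) {
        this.buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
    }

    // Returns the next record, or null once fewer than 4 header bytes remain.
    RawRecord nextRecord() {
        if (buf.remaining() < 4) {
            return null;
        }
        int sid = buf.getShort() & 0xFFFF;
        int length = buf.getShort() & 0xFFFF;
        byte[] data = new byte[length];
        buf.get(data);
        return new RawRecord(sid, data);
    }
}
```

The caller then writes its own loop, for example `for (RawRecord r = reader.nextRecord(); r != null; r = reader.nextRecord()) { ... }`, and can stop early or hand control back to whoever is reading from it.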


As far as the records keeping a copy, could they not instead keep an offset and a reference to the original buffer? Then if someone calls a setter, it would need to create a new buffer, set the offset to 0 and copy the data before doing the actual set.
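
A minimal sketch of that copy-on-write idea, assuming a record that exposes raw byte access. LazyRecord and its methods are hypothetical names, not POI classes, and bounds checks are left out to keep it short.

```java
import java.util.Arrays;

// Hypothetical record that shares the source buffer until first modified.
final class LazyRecord {
    private byte[] buf;       // the original buffer at first, a private copy after a set
    private int offset;       // where this record's data starts within buf
    private final int length;
    private boolean shared = true;

    LazyRecord(byte[] source, int offset, int length) {
        this.buf = source;
        this.offset = offset;
        this.length = length;
    }

    int getByte(int index) {
        return buf[offset + index] & 0xFF;
    }

    void setByte(int index, int value) {
        if (shared) {
            // First write: copy just this record's bytes and reset the offset to 0.
            buf = Arrays.copyOfRange(buf, offset, offset + length);
            offset = 0;
            shared = false;
        }
        buf[offset + index] = (byte) value;
    }
}
```

Records that are only ever read never pay for a copy, and the source buffer is left untouched by setters.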

In many cases, they only keep the parsed data in memory, and not the source bytes. That's certainly one of the advantages of the (not so) new RecordInputStream method.

And as far as POIFS keeping a copy, yes... POIFS is full of issues like that. For instance, even if all you need to read is the CLSID, you still have to read the entire file. If POIFSFileSystem could construct from a ByteBuffer and not take unnecessary copies, it could speed things up dramatically for that situation... but ultimately that would need to propagate to the whole framework for it to really show benefits.

Do feel free to submit patches for that sort of thing :)

I haven't played with ByteBuffer before, so do feel free to suggest how it might help, and point at code examples or patches that show it.
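
As a starting point, the relevant property of java.nio.ByteBuffer is that duplicate() and slice() create views onto the same underlying bytes rather than copies. The sketch below shows only that; BufferViews is a made-up helper, and how such views would be threaded through POIFS is left open.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: hands out windows onto a buffer without copying bytes.
final class BufferViews {
    // Returns a view of `length` bytes starting at `offset` in `source`.
    // The source's own position and limit are left unchanged.
    static ByteBuffer view(ByteBuffer source, int offset, int length) {
        ByteBuffer dup = source.duplicate(); // shared content, independent position/limit
        dup.limit(offset + length);
        dup.position(offset);
        return dup.slice();                  // index 0 of the slice is `offset` in source
    }
}
```

A reader that needs only a small region of a large file could take a view of just that region. Combined with a memory-mapped buffer from FileChannel.map, the rest of the file would not need to be read into the heap at all, though whether that pays off in practice would need measuring.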

Nick
