Hi,
Someone on my team wrote an Excel reader layer that uses the POI DOM model
to read xls and xlsx files. This was quick and easy, but his code is tightly
coupled to the POI DOM model methods (checking cell types, converting to
date formats, etc.).

The problem is that we started getting large xlsx files that did not fit in
memory using the DOM model, so I was asked to add the ability to load large
xlsx files (and later on large csv files too).

After studying the table at
https://poi.apache.org/spreadsheet/index.html
I realized I would need to:
A) Create a common intermediate ExcelReader interface for our code and
decouple the POI DOM model parts into implementors of that interface
(XlsDOMReader / XlsxDOMReader).

B) Build a new SAX reader that implements the same interface (XlsxSAXReader)
and re-implement in it many of the POI DOM methods that are missing (e.g.,
cell type methods). In addition, it will expose a read iterator that
buffers a window of rows in memory, iterates through it, and reloads the
next rows into the buffer when done with the current window. (Our program
only uses forward-only reads.)
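
To make the plan in A) and B) concrete, here is a minimal sketch of the
shape I have in mind. Everything in it is hypothetical and none of it is a
POI API: ExcelReader is the common interface, WindowedReader is the
forward-only reader that buffers a window of rows, and a plain Iterator
stands in for the SAX parser that would produce rows on demand. Rows are
simplified to lists of strings, so cell types and dates are left out.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical common interface (not a POI type): forward-only row access.
interface ExcelReader extends Iterator<List<String>>, AutoCloseable {
    @Override
    void close();
}

// Hypothetical streaming implementation. The source iterator is an
// assumption standing in for a SAX-style parser that yields rows lazily.
class WindowedReader implements ExcelReader {
    private final Iterator<List<String>> source;
    private final int windowSize;
    private final Deque<List<String>> window = new ArrayDeque<>();

    WindowedReader(Iterator<List<String>> source, int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be >= 1");
        }
        this.source = source;
        this.windowSize = windowSize;
    }

    // Load up to windowSize rows from the source into the buffer.
    private void refill() {
        while (window.size() < windowSize && source.hasNext()) {
            window.addLast(source.next());
        }
    }

    @Override
    public boolean hasNext() {
        if (window.isEmpty()) {
            refill(); // current window is used up, load the next one
        }
        return !window.isEmpty();
    }

    @Override
    public List<String> next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return window.removeFirst();
    }

    // Number of rows currently held in memory (never more than windowSize).
    int bufferedRows() {
        return window.size();
    }

    @Override
    public void close() {
        window.clear();
    }
}

class ReaderDemo {
    // Caller code depends only on the interface, not on DOM or SAX details.
    static int countRows(ExcelReader reader) {
        int rows = 0;
        while (reader.hasNext()) {
            reader.next();
            rows++;
        }
        return rows;
    }

    public static void main(String[] args) {
        Iterator<List<String>> rows = Arrays.asList(
                Arrays.asList("id", "name"),
                Arrays.asList("1", "foo"),
                Arrays.asList("2", "bar")).iterator();
        try (ExcelReader reader = new WindowedReader(rows, 2)) {
            System.out.println(countRows(reader));
        }
    }
}
```

The DOM-based readers would implement the same ExcelReader interface by
walking an already loaded workbook, so the calling code would not need to
know which implementation it was given.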

I feel I might be redoing work that someone has already done, as my
situation looks like a common one in the world of Excel reading. Am I doing
the wrong thing here? Is there an existing common POI interface for DOM and
SAX?


