Hi,

Someone on my team wrote an Excel reader layer that uses the POI DOM model to read xls and xlsx files. This was quick and easy, but his code is tightly coupled to the POI DOM model methods (checking cell types, converting to date formats, etc.).
The problem is that we started getting large xlsx files that do not fit in memory with the DOM model, and I was asked to add the ability to load large xlsx files (and, later on, large csv files too).

After studying the table at https://poi.apache.org/spreadsheet/index.html I realized I would need to:

A) Create a common intermediate ExcelReader interface for our code, and move the POI DOM model parts out into implementors of that interface (XlsDOMReader / XlsxDOMReader).

B) Build a new SAX reader that implements the same interface (XlsxSAXReader) and re-implement in it many of the POI DOM methods that the SAX side is missing (e.g., the cell type methods). In addition, it will expose a read iterator that buffers a window of rows in memory, iterates through it, and loads the next rows into the buffer when it is done with the current window. (Our program only needs forward-only reads.)

I feel I might be doing work that someone has already done, since my situation looks like a common one in the world of Excel reading. Am I doing the wrong thing here? Is there an existing common POI interface for DOM and SAX?
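
To make the plan in A) and B) concrete, here is a minimal sketch of the shape I have in mind. It is only an illustration under assumptions: the ExcelReader interface is reduced to forward-only iteration over rows of strings, and RowSource, WindowedReader and buffered() are hypothetical names of mine, not POI classes. No POI calls appear here; a real XlsxSAXReader would have to feed RowSource from the POI event API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical common interface: forward-only row access that hides
// whether a DOM-style or a streaming reader sits behind it.
interface ExcelReader extends Iterator<List<String>>, AutoCloseable {
    @Override
    void close();
}

// Hypothetical pull-style source of rows (stand-in for the streaming parser).
interface RowSource {
    // Returns the next row, or null once the sheet is exhausted.
    List<String> nextRow();
}

// Keeps at most windowSize rows in memory and refills the buffer
// only after the current window has been fully consumed.
class WindowedReader implements ExcelReader {
    private final RowSource source;
    private final int windowSize;
    private final Deque<List<String>> window = new ArrayDeque<>();
    private boolean exhausted = false;

    WindowedReader(RowSource source, int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be >= 1");
        }
        this.source = source;
        this.windowSize = windowSize;
    }

    // Pulls rows from the source until the window is full or the source ends.
    private void refill() {
        while (window.size() < windowSize && !exhausted) {
            List<String> row = source.nextRow();
            if (row == null) {
                exhausted = true;
            } else {
                window.addLast(row);
            }
        }
    }

    @Override
    public boolean hasNext() {
        if (window.isEmpty() && !exhausted) {
            refill();
        }
        return !window.isEmpty();
    }

    @Override
    public List<String> next() {
        if (!hasNext()) {
            throw new NoSuchElementException("no more rows");
        }
        return window.removeFirst();
    }

    // Number of rows currently held in memory (exposed for inspection only).
    int buffered() {
        return window.size();
    }

    @Override
    public void close() {
        window.clear();
        exhausted = true;
    }
}
```

Calling code would depend only on ExcelReader, e.g. `while (reader.hasNext()) { process(reader.next()); }`, so the DOM-backed and SAX-backed implementations could be swapped freely. One open point in this sketch: SAX pushes events to a handler, so exposing it as a pull-style RowSource would presumably need a parser thread with a bounded queue, or a pull parser instead.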
