On Fri, 5 Nov 2010 21:24:08 -0400, Scott Deerwester <scott.deerwes...@gmail.com> wrote:

> The following code (in Python):
>
>     for r in range(dataRange.StartRow, dataRange.EndRow):
>         for c in range(dataRange.StartColumn, dataRange.EndColumn):
>             cell = sheet.getCellByPosition(c,r)
>
> takes nearly two hours to run on a reasonably fast workstation for a
> spreadsheet with 32 columns and ~27,000 rows (about .2 seconds per row). By
> comparison, opening the file takes about 8 seconds and saving the file to a
> CSV (which is functionally equivalent to the above) takes a few seconds. The
> original file was written in Excel as an XLS (not XLSX). This seems
> impossibly slow. Am I misusing the API somehow? What can I do to speed it
> up?
First, I usually determine which cells are actually used. Depending on the operation, this may be a coarse test, such as simply finding the largest used block.

Next, if I must access the data in the cells, I get the entire range as a single object and then call getDataArray() (or getData(), depending on my need) on it, rather than fetching the cells one at a time.

I have a section on timing in AndrewMacro.odt where I search a Calc document using different methods.
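The two steps above can be sketched in Python roughly as follows. This is a minimal sketch, not a tested drop-in: it assumes `sheet` is a Calc sheet object obtained through the UNO bridge, and `read_used_data` is a hypothetical helper name chosen for illustration. The likely gain is that the whole block comes back in one call instead of one call per cell.

```python
def read_used_data(sheet):
    """Return the contents of the sheet's used area as rows of values.

    Assumes `sheet` is a Calc sheet reached through UNO; the helper name
    is hypothetical and not part of the API.
    """
    # Step 1: find the largest used block with a sheet cursor.
    cursor = sheet.createCursor()
    cursor.gotoStartOfUsedArea(False)  # jump to the first used cell
    cursor.gotoEndOfUsedArea(True)     # expand the range to the last used cell

    # Step 2: fetch every value in that block with a single call.
    # Use getData() instead if only numeric values are needed.
    return cursor.getDataArray()
```

A caller would then loop over the returned rows in plain Python, for example `for row in read_used_data(sheet): for value in row: ...`, without going back to the sheet for each cell.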