Daniel Carrera wrote:

Randomthots wrote:

The number of characters has no effect on speed. There is no reason why <w:r> is faster to parse than <text:span text:style-name="T1">.


I'm sorry, Daniel, but I find that hard to believe.

I have a file that is strictly text, numbers, and dates. Seven columns by 63,260 rows -- no formulas, no formatting. Importing as csv takes a few seconds. Converted to ods it takes *much* longer to load -- around 30 seconds or so.


What makes you think that the reason for the slowdown is that OpenDocument uses verbose tags instead of hard-to-understand tags? The size of the tag has essentially *zero* effect on speed.

For one particular tag, or for a normally sized spreadsheet, I'm sure you're right. But even a little bit has to add up. In that particular file the tag sequence I posted is essentially repeated 63,260 x 7 times. That's 442,820 times.
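One way to put the "it has to add up" claim to a test is to parse the same cell repeated many times, once with the verbose tags and once with short ones, and compare the timings. The sketch below is only a rough illustration: the short form `<c t="s"><p>` is a made-up equivalent, the cell count is kept smaller than the real 442,820 so it runs quickly, and it times Python's ElementTree parser rather than whatever OOo does internally.

```python
import time
import xml.etree.ElementTree as ET

# Namespace declarations needed so the prefixed tags parse.
NS = (
    'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" '
    'xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0" '
    'xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"'
)
VERBOSE_CELL = ('<table:table-cell office:value-type="string">'
                '<text:p>arin</text:p></table:table-cell>')
SHORT_CELL = '<c t="s"><p>arin</p></c>'  # hypothetical terse equivalent


def build(cell, n, root_attrs=""):
    """Wrap n copies of one cell in a single root element."""
    return "<root " + root_attrs + ">" + cell * n + "</root>"


def count_cells(doc):
    """Parse the document and count the root's direct children."""
    return len(ET.fromstring(doc))


def timed_parse(doc):
    """Return (cell count, seconds spent parsing)."""
    start = time.perf_counter()
    n = count_cells(doc)
    return n, time.perf_counter() - start


if __name__ == "__main__":
    n_cells = 100_000  # assumption: raise to 63_260 * 7 to match the real file
    for label, doc in (("verbose", build(VERBOSE_CELL, n_cells, NS)),
                       ("short", build(SHORT_CELL, n_cells))):
        n, seconds = timed_parse(doc)
        print(f"{label}: {n} cells, {len(doc)} chars, {seconds:.3f} s")
```

Whatever the numbers come out to, they only cover the parsing step, not decompression or building the spreadsheet in memory.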

The slowdown is because of the additional steps in compression,

That's arguable. Comparing the time it takes to zip the archive with 7-zip vs. the time it takes OOo to save the file, I would estimate that the compression step takes up maybe 20% of the total time at most.

XML parsing, and the fact that OpenDocument files contain more information than CSV files.

Tell me please, Daniel, what extra information is contained in the xml snippet: <table:table-cell office:value-type="string"><text:p>arin</text:p></table:table-cell>

that isn't contained in: ,"arin",
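To put a rough number on the difference between the two forms, the markup around the value can simply be counted. This is a back-of-the-envelope sketch: it assumes every one of the 63,260 x 7 cells is a four-character string like "arin" (the real file also has numbers and dates), it counts uncompressed ASCII characters, and it ignores row and table wrappers.

```python
ROWS, COLS = 63_260, 7
VALUE = "arin"  # assumption: every cell looks like this one

xml_cell = ('<table:table-cell office:value-type="string"><text:p>'
            + VALUE + '</text:p></table:table-cell>')
csv_cell = '"' + VALUE + '",'

cells = ROWS * COLS

# Characters spent on markup rather than on the value itself.
xml_markup = len(xml_cell) - len(VALUE)
csv_markup = len(csv_cell) - len(VALUE)


def total_markup_bytes(per_cell, n_cells=cells):
    """Markup characters summed over every cell in the sheet."""
    return per_cell * n_cells


print("cells:", cells)
print("XML markup per cell:", xml_markup, "total:", total_markup_bytes(xml_markup))
print("CSV markup per cell:", csv_markup, "total:", total_markup_bytes(csv_markup))
```

This says nothing about load time by itself; it only shows how much text the parser has to get through before compression is taken into account.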



You can't just compare CSV vs OpenDocument and conclude that the problem is the size of the XML tags. That's plain silly.

In this particular case, it's not silly at all. If I do some simple substitutions and some liberal deleting, I can fairly easily reproduce the csv from the ods. And I won't lose a scrap of information in the process.
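The "simple substitutions and liberal deleting" can be sketched with a regular expression. This is a minimal illustration, not the exact steps used on the real file: it assumes every cell is a string cell written exactly like the snippet above, that each row sits in one `<table:table-row>` element, and it would need extending for numeric and date cells. The function name `row_to_csv` is made up for the example.

```python
import re
from xml.sax.saxutils import unescape

# Matches one string cell in the exact form quoted in this thread.
CELL_RE = re.compile(
    r'<table:table-cell office:value-type="string">'
    r'<text:p>(.*?)</text:p></table:table-cell>'
)


def row_to_csv(row_xml):
    """Turn one row of string cells into a quoted CSV line."""
    values = [unescape(v) for v in CELL_RE.findall(row_xml)]
    # Double any embedded quotes, as CSV quoting requires.
    return ",".join('"' + v.replace('"', '""') + '"' for v in values)


def cell(value):
    """Helper for the demo: wrap a value in the verbose cell markup."""
    return ('<table:table-cell office:value-type="string"><text:p>'
            + value + '</text:p></table:table-cell>')


row = "<table:table-row>" + cell("arin") + cell("R&amp;D") + "</table:table-row>"
print(row_to_csv(row))
```

Everything the regular expression throws away is markup; the values themselves come through unchanged.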

I realize this is not a normal case. It's more like a controlled experiment where you remove as many variables as you can in order to study the particular phenomenon of interest.

The only conclusion I can make is that XML makes a terrible format for databases that look like spreadsheets (or spreadsheets that look like databases). Maybe this will spur people to learn how to use Base.

--

Rod


