Randomthots wrote:
> So if I have two files... same format... but one is twice as big as
> the other... the bigger file isn't going to take longer to load?
Irrelevant example. The fact that a bigger file loads slower doesn't
mean the fault lies with the size of the tags. There are several
things that increase with the size of the file: the number of
elements, the complexity of the tree, the amount of content, and so
on. All of those can cause a slowdown, and they are unrelated to the
size of the tags.
> What's silly is thinking that the two situations are at all analogous.
The size of an XML tag *is* analogous to the size of a variable name.
They really, truly are analogous.
> Once you've compiled the program,
I was not assuming a compiled language. Assume we're talking about
Python or Ruby. The size of the variable name is inconsequential to the
performance of the program.
Note, however, that if I /had/ been talking about a compiled language,
the comparison would still be apt. More on this below.
> Thanks for assuming that I'm a real idiot.
I'm making no assumption about your intelligence. I'm forming an
educated opinion about your knowledge based on what you say.
> It POTENTIALLY has more and more complex data.
Please understand the difference between data and data structures. If
you open a CSV file and immediately save it as OpenDocument, you are
saving it into a more complex data structure. Just like an n-ary tree
is a more complex data structure than a two-dimensional array,
regardless of what data you store in them.
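A toy sketch of the difference (the structures below are invented for
illustration, not OOo's actual internals):

    # The same two rows of data, stored two ways.
    # As a two-dimensional array (list of lists) -- what CSV maps to:
    table = [["Alice", 30],
             ["Bob",   25]]

    # As an n-ary tree -- what an XML document maps to.  Every node
    # carries a tag and a child list on top of the actual data.
    tree = ("table", [
        ("row", [("cell", "Alice"), ("cell", 30)]),
        ("row", [("cell", "Bob"),   ("cell", 25)]),
    ])

    print(table[1][0])          # direct indexing: "Bob"
    print(tree[1][1][1][0][1])  # walking the tree for the same value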
> Where you lose the argument is that an xls file, like an ods, also
> has more and more complex data, but a csv import into Excel is still
> slower than loading the equivalent xls. So why is that?
This point is irrelevant to the case you're trying to argue, but I'll
try to answer anyway. I'm not an Excel developer, but I understand
that .doc and .xls files are just a memory dump. This means that
"reading" a .xls file involves nothing more than sticking it into
RAM, with no other conversion necessary.
> And are you telling me that the cell, sheet, chart, etc. objects in
> working memory... the stuff you are actually manipulating when you
> work with the spreadsheet... aren't the same regardless of the
> format of the original data file?
I fail to see what this has to do with your argument.
> Statistically, it would be unlikely if we were talking about a
> difference of a couple MB. But 45 MB is a substantial fraction of
> 256.
But here's where you're making silly claims. The fact that unzipping
the file produces 45 MB of XML doesn't mean that it will actually
take up 45 MB when loaded into memory. It won't. When you load an XML
file into memory, the XML tags are replaced by a pointer structure.
This goes back to the example of compiled software. It's just like
compiling a program: variable names are replaced by pointers, and the
size of the binary is not affected by the length of the variable
names. In a similar way, when you read an XML file, the tags are
replaced by pointers, and the size of the XML tags does not affect
the size of the data stored in RAM.
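You can even see this from Python (a rough sketch; it relies on the
expat-based parser keeping one copy of each distinct tag name, which
is what CPython's ElementTree does):

    import xml.etree.ElementTree as ET

    doc = "<root>" + "<averyveryverylongtagname/>" * 1000 + "</root>"
    root = ET.fromstring(doc)

    # Every element's .tag should be the very same string object, not
    # a thousand copies of the long tag name:
    print(all(child.tag is root[0].tag for child in root))  # True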
>> Look, when you parse an XML tree and put it into RAM, you don't
>> put the XML tags in plain text and re-copy them every time the tag
>> appears.
> I wouldn't assume the developers would be that stupid or profligate
> with resources.
But that's what you are assuming by saying that the size of the XML
tags matters. It *is* like saying that using longer variable names
will make a program run slower.
> But the ods version of the same document is no more complex than
> the csv.
Please read up on data structures. Find out what an n-ary tree is and
what an array is. Then you'll see that even if you store the same
data in the two structures, the n-ary tree remains the more complex
data structure and is still slower to navigate. Especially when all
the nodes of the n-ary tree, by design, always add extra data that
isn't present in the array.
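To put a number on that extra data, a toy count using the same
made-up structures as above:

    table = [["Alice", 30], ["Bob", 25]]

    tree = ("table", [
        ("row", [("cell", "Alice"), ("cell", 30)]),
        ("row", [("cell", "Bob"),   ("cell", 25)]),
    ])

    def count_nodes(node):
        tag, payload = node
        if isinstance(payload, list):  # interior node
            return 1 + sum(count_nodes(c) for c in payload)
        return 1                       # leaf node wrapping one value

    print(sum(len(row) for row in table))  # 4 values, no overhead
    print(count_nodes(tree))               # 7 nodes to hold 4 values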
> Another question: Is the XML processed in a serial fashion? Is it
> necessary to hold the entire file in memory to parse it?
In theory it's not necessary, but in practice most content is in the
same place (content.xml), which puts a bit of a limit on how you can
optimize the parsing. For example, if all you wanted was to extract
the author of the document, I could write a program that gets that
information lightning fast, regardless of the size of your document.
But most of the time that's not what you want; you want to actually
load the document contents into the application.
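For instance, here's a rough sketch of the author case. An
OpenDocument file is a zip archive, and the author lives in the small
meta.xml member, so you never have to touch the big content.xml at
all (the file name below is made up):

    import zipfile
    import xml.etree.ElementTree as ET

    DC = "{http://purl.org/dc/elements/1.1/}"

    def get_author(path):
        with zipfile.ZipFile(path) as z:
            meta = ET.parse(z.open("meta.xml"))
        creator = meta.getroot().find(".//" + DC + "creator")
        return creator.text if creator is not None else None

    print(get_author("spreadsheet.ods"))  # made-up file name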
> If I had the time I would. Unfortunately, I have to study for
> certification exams and wade through some mostly useless labs for
> Advanced Switching and Network Security classes. You see, I'm not
> technically illiterate;
What year are you in?
> telling me how silly and stupid I am.
I never said you're stupid. I said you said some very silly things.
> I'm not sure I like you very much anymore.
My goal in life is not that you like me or dislike me.
Cheers,
Daniel.
--
/\/`) http://oooauthors.org
/\/_/ http://opendocumentfellowship.org
/\/_/
\/_/ I am not over-weight, I am under-tall.
/