2013-01-09 2:55, Leif Halvard Silli wrote:

> The benefit of doing such a comparison is that we then get to
> count both the HTML page *plus* all the extra fonts that is included in
> the "romanized Singhala file". Thus, we get a more *real* basis for
> comparing the relative size of the two pages.

Not really. I don’t want to comment on “romanized Singhala” any more, but I can’t leave a different fallacy without comment.

When comparing sizes of web pages, it is clearly not sufficient to compare the HTML documents only. It is not uncommon to have just a few kilobytes of HTML but loads of JavaScript and images, totalling a megabyte or more. This makes it relatively irrelevant whether some characters occupy one byte or two bytes. (Besides, HTML often gets automatically compressed for transmission.)
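To put rough numbers on this, here is a minimal Python sketch. It assumes a hypothetical page with 5,000 Sinhala code points of text and one megabyte of other resources (scripts, images); both figures are made up for illustration and not measured from any real site. The function name `text_share` is likewise just an illustrative helper. Note that in UTF-8 each Sinhala code point takes three bytes.

```python
def text_share(text: str, other_bytes: int) -> dict:
    """Compare text size in a one-byte-per-character encoding and in UTF-8,
    relative to the total weight of a page (illustrative helper)."""
    single_byte = len(text)               # hypothetical 8-bit ad-hoc encoding
    utf8 = len(text.encode("utf-8"))      # actual UTF-8 size of the same text
    return {
        "single_byte": single_byte,
        "utf8": utf8,
        # Growth of the whole page caused by switching encodings
        "growth": (utf8 - single_byte) / (single_byte + other_bytes),
    }

# Five Sinhala code points (U+0DC3 U+0DD2 U+0D82 U+0DC4 U+0DBD), repeated
# 1000 times as stand-in content; the repetition is an assumption.
sample = "\u0dc3\u0dd2\u0d82\u0dc4\u0dbd" * 1000

# Assumed: 1,000,000 bytes of JavaScript, images, and so on.
report = text_share(sample, other_bytes=1_000_000)
print(report)
```

Under these assumptions the text grows from 5,000 to 15,000 bytes, yet the page as a whole grows by less than one percent, and that is before any transfer compression.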

But if we count font files as well, we should count them in all alternatives being compared. Although you can, in principle, write e.g. a web page in Sinhala by simply providing the text content, sitting back, and expecting browsers to render it in whatever fonts they prefer, that’s a very unrealistic approach in practice. It would work for English (though few web content providers do that; they mostly want to set fonts), but for Sinhala, it would mean that a very large share of users (possibly the majority) would not see the Sinhala letters. The reason is that their computers lack any font that contains them. (Well, that is not the only reason, but it is the most common one.)

So in order to make (almost) all visitors see the content OK, the author of a Sinhala page should probably provide a downloadable font, via @font-face, that contains Sinhala letters (as a Unicode encoded font). Another option is to link to a font that the visitor can download and install, and this is what e.g. the site of the Parliament of Sri Lanka http://www.parliament.lk/ does, but the more modern way of using @font-face is much smoother and does not disturb the visitor with technicalities (and, besides, not all users can install fonts).

And, to be fair, Unicode-encoded fonts that contain Sinhala letters tend to be considerably larger than 8-bit ad-hoc encoded fonts. Then again, these days size does not matter that much, and a downloadable font gets cached. Moreover, a Unicode-encoded font typically contains a much richer repertoire of characters, so that characters from different scripts (like Sinhala, Latin, and Common-script characters) have been designed to fit together.

Yucca






