Marco Cimarosti scripsit:

> As a side note, the idea that a language may use "foreign" words seems
> terribly naive to me. It is true that, in Italian, we use loanwords such as
> "hardware", "punk", or "footing", but it would be silly to consider or tag
> them as "English words". They are genuinely Italian words, [...]

In English, however, the distinction between borrowings and truly
foreign words does make sense.  Such a word as Weltanschauung, for example,
is written in its native orthography complete with capital letter, is
almost invariably typeset in italics, and is most often (by the educated;
the uneducated will not know or use it at all) given some approximation
of its original pronunciation.

Even in Italian, what about Latin terms embedded in classic poetry?
Are you going to say that those too are Italian, just with a slightly
peculiar morphology?

Hindi-Urdu is another good example.  There is a core of common words
with a common phonology.  Then there is a long list of Sanskrit-based
terms, mostly used in the Hindi varieties of the language, which use
a reduced form of Sanskrit phonology.  Similarly, there is another long list
of Persian- and Arabic-based terms, mostly used in the Urdu varieties
of the language (there are lots of Persian and Arabic borrowings in the
core, however), which use a reduced form of Persian or Arabic phonology.

> As I see it, the problem is not merely that the two fashions of tags may
> specify different languages. That would not be a real conflict. It is
> perfectly legitimate to embed language tags into each other: the rule is
> that the inner language tag wins. This general rule can be extended to
> accommodate plain text tags, they will always take the precedence as they
> clearly are the innermost specification.

Plain-text tags don't nest, however: you need to give a tag explicitly
naming the outer language when you return to it.
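
To make the point concrete, here is a rough sketch of a scanner for
plain-text language tags.  It assumes the tag characters as I understand
them (U+E0001 LANGUAGE TAG, tag characters U+E0020..U+E007E mirroring
ASCII, U+E007F CANCEL TAG); the helper names tag() and runs() are made
up for illustration and come from no library.

```python
LANGUAGE_TAG = 0xE0001                   # U+E0001 LANGUAGE TAG
CANCEL_TAG = 0xE007F                     # U+E007F CANCEL TAG
TAG_FIRST, TAG_LAST = 0xE0020, 0xE007E   # tag characters mirroring ASCII


def tag(lang):
    """Encode an ASCII language identifier such as 'en' as a plain-text tag."""
    return chr(LANGUAGE_TAG) + "".join(chr(0xE0000 + ord(c)) for c in lang)


def runs(text):
    """Split tagged text into (language, visible_text) runs.

    The scanner keeps a single current language, not a stack, so a new
    tag simply replaces the previous one.
    """
    out = []
    current = None   # language in effect; None means untagged
    pending = None   # identifier being collected after LANGUAGE TAG
    buf = []

    def flush():
        if buf:
            out.append((current, "".join(buf)))
            buf.clear()

    for ch in text:
        cp = ord(ch)
        if cp == LANGUAGE_TAG:
            flush()
            pending = ""
        elif cp == CANCEL_TAG:
            flush()
            current, pending = None, None
        elif TAG_FIRST <= cp <= TAG_LAST:
            if pending is not None:      # stray tag characters are ignored
                pending += chr(cp - 0xE0000)
        else:
            if pending is not None:      # first visible character ends the tag
                current, pending = (pending or None), None
            buf.append(ch)
    flush()
    return out


# Without an explicit closing tag, the German run swallows the rest:
unclosed = tag("en") + "the " + tag("de") + "Weltanschauung" + " of life"
# The outer language has to be named again to get back to it:
closed = tag("en") + "the " + tag("de") + "Weltanschauung" + tag("en") + " of life"
```

Here runs(unclosed) yields two runs, the second of which is tagged "de"
and includes " of life", while runs(closed) yields three.  Nothing in the
scanner remembers that "en" was in effect before "de" began.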

> If they are rendered as invisible glyphs, they make the text more difficult
> to edit and to move the cursor within, because the user will have no way of
> understanding why the cursor stops twice in apparently random positions.
> This also exposes the information contained in language tags to be
> unwillingly corrupted by subsequent editing.

This argument proves too much: it applies with equal force to the
invisible bidi controls and the other Unicode controls.  In practice
these things are not available for plaintext-style editing except in a
"reveal controls" mode, which could equally well reveal the tags using
some stylized glyphs.
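
A "reveal controls" pass of this kind might look something like the
following sketch.  The bracketed stand-ins such as [LRM] and [LANG en]
are purely my own invention for the example, not any editor's actual
convention, and the function name reveal() is hypothetical.

```python
# Bidi controls, shown by their usual abbreviations.
BIDI_NAMES = {
    0x200E: "LRM", 0x200F: "RLM",
    0x202A: "LRE", 0x202B: "RLE", 0x202C: "PDF",
    0x202D: "LRO", 0x202E: "RLO",
}


def reveal(text):
    """Replace invisible bidi controls and language tags with visible stand-ins."""
    out = []
    in_tag = False                       # inside a LANGUAGE TAG sequence
    for ch in text:
        cp = ord(ch)
        is_tag_char = 0xE0020 <= cp <= 0xE007E
        if in_tag and not is_tag_char:
            out.append("]")              # the tag sequence has ended
            in_tag = False
        if cp == 0xE0001:                # U+E0001 LANGUAGE TAG
            out.append("[LANG ")
            in_tag = True
        elif is_tag_char:
            out.append(chr(cp - 0xE0000))   # show the mirrored ASCII character
        elif cp == 0xE007F:              # U+E007F CANCEL TAG
            out.append("[CANCEL]")
        elif cp in BIDI_NAMES:
            out.append("[" + BIDI_NAMES[cp] + "]")
        else:
            out.append(ch)
    if in_tag:
        out.append("]")
    return "".join(out)


sample = "\U000E0001\U000E0065\U000E006Ea\u200Eb"
print(reveal(sample))   # prints "[LANG en]a[LRM]b"
```

The tags get exactly the same treatment as the bidi controls, which is
the point: whatever editing difficulty one poses, the other poses too.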

-- 
One art / There is                      John Cowan <[EMAIL PROTECTED]>
No less / No more                       http://www.reutershealth.com
All things / To do                      http://www.ccil.org/~cowan
With sparks / Galore                     -- Douglas Hofstadter
