On Sunday, 29 May 2016 at 12:41:50 UTC, Chris wrote:
Ok, you have a point there. To be precise, <sh> is a multigraph (a digraph) (cf. [1]). In French you can have multigraphs consisting of three or more characters, e.g. <eau> => /o/, as in Irish <aoi> => /i:/. However, a phoneme is not necessarily a spoken "character", as <sh> represents one phoneme but consists of two "characters" or graphemes. <th> can represent two different phonemes (voiced and unvoiced "th" as in `this` vs. `thorough`).

What I meant was, a phoneme is the "character" (smallest unit) in a spoken language, not that it corresponds to a character (whatever that means).

My point was that we have to be _very_ careful not to conflate our cultural experience of written text with machine representations. There's bound to be confusion. That's why we should always make clear what we refer to when we use the words grapheme, character, code point etc.

I used 'character' in quotes because it's not a well-defined term. Code point, grapheme and phoneme are well defined.
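
To make the machine side of this concrete, here is a small Python sketch (an illustration only; the helper name `describe` is made up for this example). It takes what a reader would call one "character", an "e" with an acute accent, and counts it in three different units. The same grapheme comes out as one or two code points depending on normalization, and as a different number of code units depending on the encoding.

```python
import unicodedata

# One user-perceived character (grapheme): "e" followed by a combining acute accent.
decomposed = "e\u0301"
# NFC normalization composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

def describe(s):
    """Return (code points, UTF-8 code units, UTF-16 code units) for s."""
    return (len(s), len(s.encode("utf-8")), len(s.encode("utf-16-le")) // 2)

print(describe(decomposed))  # (2, 3, 2)
print(describe(composed))    # (1, 2, 1)
```

Both strings render as the same grapheme, yet none of the counts agree, which is why "character" on its own tells you nothing about which unit is meant.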
