Someone on Stack Overflow had the same question a while ago, and Robin Leroy managed to find some old documents that provide a likely explanation: https://stackoverflow.com/questions/79104685/in-unicode-why-%e0%a5%98-is-excluded-from-composition-whereas-%c3%85-is-not/79115293#79115293
The short version is that the decomposed forms of U+0958 and similar letters were the representations experts preferred at the time, presumably because ISCII also encoded them using a combining nukta.

On Thu, May 1, 2025 at 06:37, Neha Gupta via Unicode <[email protected]> wrote:

> Dear All,
>
> I have a question regarding Unicode normalization, specifically in the
> context of Indian languages and the Devanagari script. In the Unicode
> Devanagari block, characters such as U+0958 (क़), U+0915 (क), and U+093C (़)
> were introduced in Unicode version 1. U+0958 (क़), known as DEVANAGARI
> LETTER QA, visually and functionally represents the combination of U+0915
> (क) and the NUKTA sign U+093C (़).
>
> According to Unicode Normalization Form C (NFC), normalization involves
> first fully decomposing a string, and then recomposing it, except in cases
> where canonical composition is blocked or the composition is explicitly
> excluded in the Unicode Character Database. In this context, U+0958 is
> listed as a script-specific composition exclusion, meaning the sequence
> <U+0915, U+093C> is not normalized (i.e., recomposed) into U+0958.
>
> I understand that some characters are deliberately excluded from canonical
> composition to preserve distinctions important in specific scripts or
> historical encoding practices. However, in this case, my confusion arises
> from the fact that U+0958 was introduced in Unicode version 1, along with
> its decomposable components. Given that it predates the formalization of
> many normalization rules, and that the canonical equivalence between
> <U+0915, U+093C> and U+0958 appears linguistically justified, I am curious
> about the rationale behind its exclusion from composition.
>
> Could you please help clarify why U+0958 is treated as a composition
> exclusion despite its early inclusion?
>
> Regards,
>
> Neha
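
For anyone who wants to see the difference between U+0958 and U+00C5 in practice, here is a minimal sketch using Python's standard `unicodedata` module. The helper names (`nfc`, `nfd`, `codepoints`) are my own and only there for illustration. It shows that U+0958 decomposes under both NFD and NFC and is never recomposed, whereas <U+0041, U+030A> is recomposed to U+00C5.

```python
import unicodedata

# Hypothetical helper names, chosen only for this illustration.
def nfc(s: str) -> str:
    return unicodedata.normalize("NFC", s)

def nfd(s: str) -> str:
    return unicodedata.normalize("NFD", s)

def codepoints(s: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

QA = "\u0958"               # DEVANAGARI LETTER QA (precomposed)
KA_NUKTA = "\u0915\u093C"   # DEVANAGARI LETTER KA + DEVANAGARI SIGN NUKTA
A_RING = "\u00C5"           # LATIN CAPITAL LETTER A WITH RING ABOVE
A_PLUS_RING = "A\u030A"     # A + COMBINING RING ABOVE

# U+0958 has a canonical decomposition, so NFD splits it.
print("NFD(QA)       =", codepoints(nfd(QA)))
# It is a composition exclusion, so NFC leaves it decomposed as well.
print("NFC(QA)       =", codepoints(nfc(QA)))
print("NFC(KA+NUKTA) =", codepoints(nfc(KA_NUKTA)))

# U+00C5 is not excluded, so NFC recomposes the sequence.
print("NFC(A+RING)   =", codepoints(nfc(A_PLUS_RING)))
print("NFC(A_RING)   =", codepoints(nfc(A_RING)))
```

The two precomposed characters are treated the same way by NFD; the difference only appears in the recomposition step of NFC, which is where the composition exclusion list comes into play.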
