Hi! I have finally had the opportunity to review the new 3GPP 23.038 "code pages", mostly for Indic scripts. (I started five years ago but have been busy with other things; sorry for the delay.) Note that these "code pages" are for SMS/CBS use only. They are suitable ONLY for that realm of use and inappropriate everywhere else.
Unfortunately, the "code pages" in the current 23.038 are not well constructed, nor does it seem that they have even been independently reviewed. So... I made new ones to replace them (technically under other reference numbers, since changing an existing "code page" while keeping the same reference number would be inappropriate). I also added "code pages" for several scripts not currently covered by 7-bit code pages. Those scripts currently have to fall back to "UCS2" (actually UTF-16BE currently), likely incurring a “size penalty”: the SMS protocol has strict size restrictions (a single message carries at most 140 octets of user data, i.e. up to 160 7-bit characters but only 70 UTF-16 code units). It is not called SHORT message service for nothing.
I have no "new" "code pages" for Spanish, Portuguese or Turkish (which have separate "code pages" in 23.038), since these languages are covered better by the new(!!) "default" "code page" (actually not a default, but a Latin-script code page). The intent is to deprecate the special code pages for Spanish, Portuguese and Turkish. (Though I call it the "new default", it actually has to be set explicitly.)
SMS and CBS are still “a thing” for 5G, 6G and very likely beyond, despite the numerous chat apps and other apps.
You can find (draft!) mapping tables (.TXT) and charts (.docx) at https://github.com/kent-karlsson/3gpp-propositions. Each text file has in its file name the language code of the principal language for which it is intended (except the "default" code page). Each chart has the (SMS/CBS) protocol code page number (in hexadecimal) in its file name and section name. Note that this is work in progress, not yet put forward for standardisation. If you want to comment on these draft proposals, you can do so via GitHub.
/Kent Karlsson
