Re: Text encoding Babel. Was Re: George Keremedjiev

Grant Taylor via cctalk Fri, 30 Nov 2018 15:12:02 -0800

On 11/30/2018 03:57 PM, Sean Conner via cctalk wrote:

There are several problems with this. One, how many bits do you set aside per character? 8? 16? There is potentially an open-ended set of stylings that one might use.

I acknowledge that the idea I shared was incomplete and likely has shortcomings. But I do think that it demonstrates a concept, which is what I was after.

Second problem---where do you store such bits? Not to imply this is a bad idea, just that there are issues that need to be resolved with how things are done today (how does this interact with UTF-8 for instance? Or UCS-4?).

Ideally, I'd like to see UTF-8 / UTF-16 code points (?) for the different styles of a letter. Not every letter (character ~> byte / double) needs the styling. So I suspect that it would be better to judiciously place code points in the UTF-8 / UTF-16 space.

Sadly, when I try to search for "this", the letters aren't found in "𝑡ℎ𝑖𝑠 𝑖𝑠 𝑎 𝑠𝑡𝑟𝑖𝑛𝑔" or "𝘁𝗵𝗶𝘀 𝗶𝘀 𝗮 𝗰𝗼𝗺𝗺𝗲𝗻𝘁". Something that I think should work.
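
For what it's worth, here is a minimal sketch of how a search tool could make that match work, assuming it is acceptable to fold text with Unicode compatibility normalization (NFKC) before comparing. The styled letters above come from the Mathematical Alphanumeric Symbols range, and they carry compatibility mappings back to the plain letters. The function names fold_styles and contains_plain are made up for illustration; only unicodedata.normalize is a real standard-library call.

```python
import unicodedata


def fold_styles(text: str) -> str:
    """Map styled letters to their plain compatibility equivalents (NFKC)."""
    return unicodedata.normalize("NFKC", text)


def contains_plain(haystack: str, needle: str) -> bool:
    """Substring search that ignores the styling on both sides."""
    return fold_styles(needle) in fold_styles(haystack)


if __name__ == "__main__":
    styled = "𝑡ℎ𝑖𝑠 𝑖𝑠 𝑎 𝑠𝑡𝑟𝑖𝑛𝑔"
    # A naive search compares code points, so the plain word is not found.
    print("this" in styled)
    # After folding, the styled text compares equal to plain letters.
    print(contains_plain(styled, "this"))
```

One caveat: NFKC folds more than styling (ligatures, full-width forms, superscripts and so on), so it suits search keys better than stored text.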


Also, storage of these letters can work just like it is in this email.  ;-)
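
As a rough illustration of what that storage costs, the sketch below counts the bytes one character takes in the common Unicode encodings. The helper name encoded_sizes is hypothetical; it only relies on Python's built-in str.encode. The styled letters live outside the Basic Multilingual Plane, so they take more room than plain ASCII in UTF-8 and need a surrogate pair in UTF-16.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes `ch` occupies in each encoding."""
    # The "-le" variants are used so no byte order mark is counted.
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {enc: len(ch.encode(enc)) for enc in encodings}


if __name__ == "__main__":
    for ch in ("t", "\U0001D461"):  # plain t, MATHEMATICAL ITALIC SMALL T
        print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```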



--
Grant. . . .
unix || die
