[ 
https://issues.apache.org/jira/browse/DAFFODIL-551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18045832#comment-18045832
 ] 

Mike Beckerle commented on DAFFODIL-551:
----------------------------------------

W.r.t. the priority of this feature, here's advice from chatgpt5 about which 
languages use surrogate-pair characters.

 
||Category||Requires surrogate pairs?||Prevalence||
|Modern alphabetic languages|❌ No|Universal|
|Modern East Asian languages (core use)|❌ No|Universal|
|Chinese (rare names / classical texts)|⚠️ Sometimes|Real but niche|
|Historic scripts|✅ Yes|Academic|
|Emoji in modern text|✅ Yes|Extremely common|

 

This makes it clear that, for our most common use case (cybersecurity), the 
case that matters most is emoji becoming an important part of the text in 
data formats we deal with. :) (y)(*)(*g)(flag)(/) Even then, they would be 
preserved by utf16Width="fixed"; each emoji would just count as two characters 
rather than one for length purposes. 
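
To make the length point concrete, here is a minimal Python sketch (not Daffodil code; the helper names are made up for illustration). It assumes that a fixed-width treatment of UTF-16 counts 16-bit code units while a variable-width treatment counts code points, which is my reading of the DFDL utf16Width property. An emoji survives the round trip through UTF-16 either way; only the count differs.

```python
# Illustrative helpers only; none of these names come from Daffodil or DFDL.

def utf16_code_units(text: str) -> int:
    """Length when every 16-bit code unit counts as one character."""
    return len(text.encode("utf-16-be")) // 2


def code_points(text: str) -> int:
    """Length when a surrogate pair counts as a single character."""
    return len(text)


def needs_surrogate_pair(ch: str) -> bool:
    """True for characters outside the Basic Multilingual Plane."""
    return ord(ch) > 0xFFFF


def utf16_round_trip(text: str) -> str:
    """Encode to UTF-16 (big-endian, no BOM) and decode again."""
    return text.encode("utf-16-be").decode("utf-16-be")


if __name__ == "__main__":
    # Latin letter, common CJK ideograph, emoji (U+1F600).
    for sample in ["A", "\u4e2d", "\U0001F600"]:
        print(
            f"U+{ord(sample):04X}",
            "surrogate pair:", needs_surrogate_pair(sample),
            "code units:", utf16_code_units(sample),
            "code points:", code_points(sample),
            "preserved:", utf16_round_trip(sample) == sample,
        )
```

Only the emoji reports two code units for one code point, which is the only place where the two ways of counting disagree.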

 

Based on this I'm going to resolve this issue as do not fix. I think this DFDL 
feature is likely just not needed and was overkill to put into the language. 

> Variable width UTF-16 encoding not implemented
> ----------------------------------------------
>
>                 Key: DAFFODIL-551
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-551
>             Project: Daffodil
>          Issue Type: New Feature
>          Components: Back End
>            Reporter: Jessie Chab
>            Priority: Minor
>
> I found DFDL-219 for the implementation of variable width UTF-8 encoding. Is 
> this going to be implemented for UTF-16 as well? Currently, when I try to set 
> utf16Width to variable in my schema, I get the following error:
> Schema Definition Error: Property utf16Width='variable' not supported.
> I can't find a ticket for this so I thought I would at least create one to 
> bring attention to the issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
