Dear all,

It will take some time until the emojis from Unicode 14 are generally available, but when they arrive, everything should already be working in NaviServer and the DB interfaces. I've added a small demo page that one can try when the new clients come out:

https://openacs.org/emojis.tcl

One interesting part is the grapheme cluster (e.g. 👩‍👩‍👧‍👦), which is made up of the following Unicode code points:

   WOMAN👩 ZWJ WOMAN👩 ZWJ GIRL👧 ZWJ BOY👦

where ZWJ is the zero-width joiner (U+200D). One can enter this sequence e.g. via

  set x \ud83d\udc69\u200d\ud83d\udc69\u200d\ud83d\udc67\u200d\ud83d\udc66

into current Tcl. When just passing such a string through Tcl, everything seems fine. I would not be surprised if "string length", "string range" etc. led to unexpected results, but it is quite fun to decompose this emoji with Tcl:

   % set x \ud83d\udc69\u200d\ud83d\udc69\u200d\ud83d\udc67\u200d\ud83d\udc66
   👩‍👩‍👧‍👦
   % string range $x 0 1
   👩
   % string range $x 6 7
   👧
   % string range $x 9 10
   👦
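The indices used above (0 1, 6 7, 9 10) are consistent with the string being indexed in 16-bit UTF-16 code units, where each of these emojis occupies a surrogate pair and each ZWJ a single unit. The following is a minimal sketch in Python (not Tcl, since Tcl's string indexing depends on version and build) that reproduces the same slicing; the helper names `utf16_units` and `units_to_str` are made up for this illustration.

```python
# The family emoji as code points: WOMAN ZWJ WOMAN ZWJ GIRL ZWJ BOY
family = "\U0001F469\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"


def utf16_units(s):
    """Return the string as a list of 16-bit UTF-16 code units (hypothetical helper)."""
    raw = s.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def units_to_str(units):
    """Decode a list of UTF-16 code units back into a string (hypothetical helper)."""
    raw = b"".join(u.to_bytes(2, "little") for u in units)
    return raw.decode("utf-16-le")


units = utf16_units(family)

print(len(family))  # number of code points: 4 emojis + 3 joiners
print(len(units))   # number of UTF-16 code units: 4 surrogate pairs + 3 joiners

# Tcl's "string range $x 0 1" is inclusive, so it corresponds to units[0:2] here.
woman = units_to_str(units[0:2])
girl = units_to_str(units[6:8])
boy = units_to_str(units[9:11])
print(woman, girl, boy)
```

Slicing at other offsets (for example a single unit out of a surrogate pair) would yield an unpaired surrogate, which is the kind of unexpected result one should be prepared for.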

AFAICT, everything is fine with NaviServer in this respect.

Concerning Unicode 14:

Android 12L contains support for emojis from Unicode 14 [1]. Google announced Android 12L in October 2021, less than one month after the stable release of Android 12. 12L is expected in early 2022 [2].

According to [3], iOS 15.0 will not include Unicode 14 emojis. Support for Emoji 14.0 on Apple platforms is expected in the first half of 2022 (probably in iOS 16).

all the best

-g

[1] https://9to5google.com/2021/10/27/android-12l-unicode-14/
[2] https://developer.android.com/about/versions/12/12L/summary
[3] https://emojipedia.org/apple/


On 26.11.21 10:40, Wolfgang Winkler via naviserver-devel wrote:

Hi!

We've now tested the encoding extensively. All emojis up to Unicode 13 <https://emojipedia.org/unicode-13.0/> are handled correctly, including database storage and retrieval, tDOM and form handling.

Version 14 emojis <https://emojipedia.org/unicode-14.0/> are not supported by any of the browsers we've tested, but they don't throw errors. It seems we are safe for future updates.

Wolfgang

On 18.11.21 at 18:24, Gustaf Neumann wrote:

Dear all

An update is now available on Bitbucket (see the change log message below) that introduces support for UTF-8 characters using up to 4 bytes (with Tcl 8.6). It should also work with 6-byte UTF-8 when Tcl 8.7 is compiled accordingly (by setting TCL_UTF_MAX).
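To make the "up to 4 bytes" concrete, here is a small Python sketch (an illustration only, not part of the NaviServer change) that prints the UTF-8 encoding of a few sample characters. The sample set is an arbitrary choice; the last entry is the 😈 emoji (U+1F608) used in the examples in this message.

```python
# Sample characters needing 1, 2, 3 and 4 bytes in UTF-8 (arbitrary picks).
samples = ["A", "\u00FC", "\u20AC", "\U0001F608"]

# Map each character to its UTF-8 byte string.
utf8_bytes = {ch: ch.encode("utf-8") for ch in samples}

for ch, encoded in utf8_bytes.items():
    # Show the code point, the byte count and the bytes in hex.
    print(f"U+{ord(ch):04X}: {len(encoded)} byte(s): {encoded.hex(' ')}")
```

Characters in the Basic Multilingual Plane need at most 3 bytes; code points above U+FFFF, which includes most emojis, need 4, and their encoding starts with a byte in the range 0xf0 to 0xf4.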

One can now use e.g. emoticons in SQL queries

     db_0or1row ... {select 1 from cr_items where name = '😈'}

or as values of bind variables

     set x 😈
     db_0or1row ... {select 1 from cr_items where name = :x}

... but not as names of bind variables (these have the same restricted syntax as before; in essence, no funny characters).
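For readers without an OpenACS installation at hand, the same two usage patterns (emoji as an SQL literal and as the value of a bind variable) can be sketched with Python's built-in `sqlite3` module. This is only an analogy under stated assumptions: SQLite stands in for the real database, the `db_*` API is not involved, and the one-column `cr_items` table here is a toy stand-in for the real one.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table cr_items (name text)")  # toy table, one column only

devil = "\U0001F608"  # the 😈 emoji

# Emoji as the value of a bind variable; the variable name itself stays plain ASCII.
conn.execute("insert into cr_items (name) values (:x)", {"x": devil})
bound = conn.execute(
    "select 1 from cr_items where name = :x", {"x": devil}
).fetchone()

# Emoji written directly into the SQL text as a literal.
literal = conn.execute(
    "select 1 from cr_items where name = '" + devil + "'"
).fetchone()

print(bound, literal)
conn.close()
```

Both queries find the row, which is the behavior the update aims for on the NaviServer side as well.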

The code is already running at openacs.org.

all the best

-gn


Added support for UTF-8 characters up to 4 bytes

These changes add proper export of UTF-8 for Unicode symbols taking up
to 4 bytes. For the western world the biggest interest is probably in
emoticons. The change is implemented with performance in mind. The
properly encoded byte strings are cached in Tcl_Objs, such that only the
values for bind-vars (which probably have different values per call)
have to be recoded at call time. This should keep the performance
penalty small (on some of our servers we see a daily average of 1500 SQL
operations per second, with peaks above 10K).

The names of bind variables still follow the same rules as before (no
emoticons as variable names).

On 16.11.21 16:39, Wolfgang Winkler via naviserver-devel wrote:

The fix worked, thank you Gustaf! But we still have a problem with emojis when writing them to the database. The error we get is:

Database operation "dml" failed (exception ERROR, "ERROR:  invalid byte sequence for encoding "UTF8": 0xf0 0x9f




_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel
--

*Wolfgang Winkler*
Geschäftsführung
wolfgang.wink...@digital-concepts.com
mobil +43.699.19971172

dc:*büro*
digital concepts Novak Winkler OG
Software & Design
Landstraße 68, 5. Stock, 4020 Linz
www.digital-concepts.com <http://www.digital-concepts.com>
tel +43.732.997117.72
tel +43.699.1997117.2

Firmenbuchnummer: 192003h
Firmenbuchgericht: Landesgericht Linz




--
Univ.Prof. Dr. Gustaf Neumann
Head of the Institute of Information Systems and New Media
of Vienna University of Economics and Business
Program Director of MSc "Information Systems"