It's critical, with ManifoldCF, that the database instance be capable of
handling any characters it's likely to encounter.  For PostgreSQL we tell
people to install it with the UTF-8 encoding, for instance, and when we
create database instances ourselves we try to specify that as well.  For
MariaDB, have a look at the database implementation we've got, and let me
know if this is something we're missing anywhere.
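
If it helps to narrow down which strings are affected in the meantime,
here is a minimal sketch of how one might detect or replace characters
above U+FFFF (the ones that need four bytes in UTF-8) in Java.  The class
and method names are hypothetical and nothing like this exists in the
ManifoldCF code as far as this thread shows; substituting U+FFFD is just
one possible policy, not a recommendation.

```java
// Hypothetical helper for illustration only; not part of ManifoldCF.
final class SupplementaryChars {
    private SupplementaryChars() {}

    /** Returns true if s contains any code point above U+FFFF. */
    static boolean containsSupplementary(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    /** Returns a copy of s with every code point above U+FFFF replaced by U+FFFD. */
    static String replaceSupplementary(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        // Iterate by code point so surrogate pairs are handled as one unit.
        s.codePoints().forEach(cp ->
            sb.appendCodePoint(Character.isSupplementaryCodePoint(cp) ? 0xFFFD : cp));
        return sb.toString();
    }
}
```

Something like containsSupplementary() could be used just to log the
offending value before the insert, which would at least make the cause
visible while the database charset question is sorted out.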

Thanks,
Karl


On Wed, Jan 23, 2019 at 3:00 AM Markus Schuch <markus_sch...@web.de> wrote:

> Hi,
>
> while using MySQL/MariaDB for MCF I encountered a "deadlock" kind of
> situation caused by a character outside the Basic Multilingual Plane
> (e.g. U+1F3AE) in a string inserted into one of the varchar columns.
>
> In my case a connector wrote the title of a parent document into the
> version string of the process document, which contained the character
> U+1F3AE - a gamepad :)
>
> This led to SQL Error 22001 "Incorrect string value: '\xF0\x9F\x8E\xAE'
> for column 'lastversion' at row 1" in MySQL, because the utf8 character
> set stores at most 3 bytes per character and cannot hold 4-byte
> characters like this one. (utf8mb4 can)
>
> The cause was hard to find, because it somehow led to a transaction
> abort loop in the incremental ingester and the error was not logged
> properly.
>
> My questions:
> - should we create the MySQL database with utf8mb4 by default?
> - or should 4-byte characters be stripped from inserted strings?
> - or should 22001 be handled better?
>
> Thanks in advance
> Markus
>
