Hi Christian,

Thank you for a detailed and thorough reply.

Since it's a fresh installation, I've opted for mysql-server 5.7.30-1debian9
and utf8.

Would you expect any issues with it?

Thanks,
Adam


On Mon, 18 May 2020 at 04:39, Christian Hammond <christ...@beanbaginc.com>
wrote:

> Hi Adam,
>
> Yeah... Here's the situation with MySQL/MariaDB and "utf8".
>
> When MySQL introduced its utf8 charset, they went with a restricted version
> of UTF-8 that stores at most 3 bytes per character (I am super simplifying
> this). Emojis and other characters outside the Basic Multilingual Plane
> need 4 bytes, so they cannot be represented by MySQL's "utf8".
>
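To make the byte-width point concrete, here is a minimal Python sketch. It only uses the standard UTF-8 codec; the helper names utf8_len and fits_in_mysql_utf8 are made up for illustration, and the 3-byte ceiling is the one described above for MySQL's "utf8".

```python
# Illustrative only: compare real UTF-8 byte widths against the
# 3-bytes-per-character ceiling of MySQL's legacy "utf8" charset.

MYSQL_UTF8_MAX_BYTES = 3  # per-character ceiling of MySQL's "utf8"


def utf8_len(ch: str) -> int:
    """Return how many bytes one character occupies in real UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_mysql_utf8(text: str) -> bool:
    """True if every character needs at most 3 bytes in UTF-8."""
    return all(utf8_len(ch) <= MYSQL_UTF8_MAX_BYTES for ch in text)


if __name__ == "__main__":
    # ASCII letter, e-acute, euro sign, grinning-face emoji
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
        print(repr(ch), utf8_len(ch), fits_in_mysql_utf8(ch))
```

Anything for which fits_in_mysql_utf8 returns False would need utf8mb4 to be stored intact.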
> utf8mb4 is the "real" UTF-8 charset type. However, it's not a drop-in
> replacement. It affects key lengths (every indexed character reserves up
> to 4 bytes instead of 3), amongst other things, and is incompatible with,
> well, many things.
>
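The key-length effect can be worked out with simple arithmetic. The sketch below is only an illustration: the 767-byte limit is the one reported in the error further down this thread, 3072 bytes is the documented InnoDB limit once large prefixes are available, and the 255-character column is a hypothetical example, not a specific Review Board field.

```python
# Illustrative arithmetic for InnoDB index key limits under each charset.

DEFAULT_KEY_LIMIT_BYTES = 767         # limit quoted in the error in this thread
LARGE_PREFIX_KEY_LIMIT_BYTES = 3072   # limit with large index prefixes enabled

# Worst-case bytes reserved per character.
BYTES_PER_CHAR = {"utf8": 3, "utf8mb4": 4}


def index_bytes(charset: str, length: int) -> int:
    """Worst-case bytes needed to index a column of `length` characters."""
    return BYTES_PER_CHAR[charset] * length


def max_indexable_chars(charset: str, limit: int = DEFAULT_KEY_LIMIT_BYTES) -> int:
    """Longest fully indexable column, in characters, under `limit` bytes."""
    return limit // BYTES_PER_CHAR[charset]


def fits(charset: str, length: int, limit: int = DEFAULT_KEY_LIMIT_BYTES) -> bool:
    """True if a column of `length` characters can be indexed in full."""
    return index_bytes(charset, length) <= limit


if __name__ == "__main__":
    for charset in ("utf8", "utf8mb4"):
        print(charset, max_indexable_chars(charset), fits(charset, 255))
```

Under these assumptions a 255-character indexed column fits with utf8 (765 bytes) but not with utf8mb4 (1020 bytes), which is consistent with the "max key length is 767 bytes" failure.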
> There *is* a way to get true UTF-8 support. It requires utf8mb4, and a
> handful of global settings applied to the server to enable large keys and a
> different InnoDB file format. It then requires a special command to be set
> at the beginning of each MySQL/MariaDB session to opt into some better
> support.
>
> Basically, it's invasive and not something that we can currently tell
> people to enable, or it'll cause new problems. It also requires full table
> rebuilds. The instructions also depend on the version of MySQL/MariaDB.
>
> We plan to bake in some level of support for it in Review Board in the
> future, but Django doesn't natively support it, and it'll require a bunch
> of special logic to rebuild data.
>
> I can't currently provide the settings you may need, because many of them
> are dependent on the version of MySQL/MariaDB you're using, and I haven't
> verified them lately (just working off internal notes). It boils down to:
>
> 1) Using utf8mb4 charsets for all databases, tables, and
> connections/sessions
> 2) Using utf8mb4_bin collation for all the above
> 3) Enabling innodb_large_prefix and innodb_file_per_table (might depend on
> the versions of MySQL/MariaDB)
> 4) Enabling innodb_file_format=barracuda (not needed on modern versions)
>
> This is not an exhaustive step-by-step.
>
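As a rough illustration of steps 1 and 2 only, the Python sketch below builds the SQL statements one might review before running them by hand. It is unverified against any particular MySQL/MariaDB version, it does not cover the server settings in steps 3 and 4, and converting tables triggers the full rebuilds mentioned above. The database name "reviewboard" and the helper names are assumptions; scmtools_repository is the table named in the install output in this thread.

```python
# Sketch: generate (not execute) SQL for switching to utf8mb4/utf8mb4_bin.
from typing import List

CHARSET = "utf8mb4"
COLLATION = "utf8mb4_bin"


def quote_ident(name: str) -> str:
    """Backtick-quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def conversion_statements(database: str, tables: List[str]) -> List[str]:
    """Return statements for the session, the database, and each table."""
    stmts = [
        # Session/connection charset and collation.
        f"SET NAMES {CHARSET} COLLATE {COLLATION};",
        # Default charset and collation for new tables in the database.
        f"ALTER DATABASE {quote_ident(database)} "
        f"CHARACTER SET {CHARSET} COLLATE {COLLATION};",
    ]
    for table in tables:
        # Converts existing text columns; this rebuilds the table.
        stmts.append(
            f"ALTER TABLE {quote_ident(table)} "
            f"CONVERT TO CHARACTER SET {CHARSET} COLLATE {COLLATION};"
        )
    return stmts


if __name__ == "__main__":
    # "reviewboard" is a hypothetical database name.
    for stmt in conversion_statements("reviewboard", ["scmtools_repository"]):
        print(stmt)
```

Printing the statements rather than executing them keeps the sketch safe to run and leaves the decision about whether and when to apply them to whoever administers the server.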
> PostgreSQL will do UTF-8 by default, fwiw.
>
> Hoping to revisit this support in MySQL/MariaDB after RB4 wraps up. Should
> be easier now that MySQL/MariaDB have made progress in this area, and I
> need to update my knowledge of what that progress looks like.
>
> Christian
>
>
> On Fri, May 15, 2020 at 5:31 AM Adam Weremczuk <veremch...@gmail.com>
> wrote:
>
>> I don't think utf8mb4 was a good idea and I believe it's now leading to:
>>
>> sudo rb-site install /var/www/mysite
>> (...)
>> * Installing the site...
>> (...)
>> Creating table scmtools_repository
>>
>> [!] There was an error synchronizing the database. Make sure the
>>     database is created and has the appropriate permissions, and then
>>     continue.
>> [!] Details: (1071, 'Specified key was too long; max key length is 767
>>     bytes')
>>
>> Press Enter to continue
>>
>>
>>
>> On Thursday, 14 May 2020 16:01:35 UTC+1, Adam Weremczuk wrote:
>>>
>>> Hi all,
>>>
>>> Following the installation guide for MySQL, I've added this to
>>> /etc/mysql/my.cnf:
>>>
>>> [client]
>>> default-character-set=utf8
>>>
>>> [mysqld]
>>> character-set-server=utf8
>>>
>>> MariaDB fails to start:
>>>
>>> May 14 14:01:41 gittest systemd[1]: Starting MariaDB 10.1.44 database
>>> server...
>>> May 14 14:01:41 gittest mysqld[10318]: 2020-05-14 14:01:41
>>> 139687784537472 [Note] /usr/sbin/mysqld (mysqld 10.1.44-MariaDB-0+deb9u1)
>>> starting as process 10318 ...
>>> May 14 14:01:41 gittest mysqld[10318]: 2020-05-14 14:01:41
>>> 139687784537472 [ERROR] COLLATION 'utf8mb4_general_ci' is not valid for
>>> CHARACTER SET 'utf8'
>>> May 14 14:01:41 gittest mysqld[10318]: 2020-05-14 14:01:41
>>> 139687784537472 [ERROR] Aborting
>>> May 14 14:01:41 gittest systemd[1]: mariadb.service: Main process
>>> exited, code=exited, status=1/FAILURE
>>> May 14 14:01:41 gittest systemd[1]: Failed to start MariaDB 10.1.44
>>> database server.
>>>
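The abort in the log above is a charset/collation mismatch: the server was asked for character set utf8 while the collation utf8mb4_general_ci was in effect (the thread does not show where that collation was configured). A minimal Python sketch of the pairing rule, assuming the naming convention visible in SHOW COLLATION output, where each collation name starts with the name of its charset; collation_matches is a made-up helper name.

```python
# Sketch: check whether a collation name belongs to a given charset,
# based on the "<charset>_<suffix>" naming convention.


def collation_matches(charset: str, collation: str) -> bool:
    """True if `collation` is named after `charset`."""
    # The trailing underscore stops "utf8" from matching "utf8mb4_...".
    return collation == charset or collation.startswith(charset + "_")


if __name__ == "__main__":
    pairs = [
        ("utf8", "utf8_general_ci"),
        ("utf8", "utf8mb4_general_ci"),  # the pair rejected in the log
        ("utf8mb4", "utf8mb4_general_ci"),
    ]
    for charset, collation in pairs:
        print(charset, collation, collation_matches(charset, collation))
```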
>>> When I comment out these 2 additions it starts fine and I can retrieve
>>> the following:
>>>
>>> MariaDB [(none)]> SHOW COLLATION LIKE 'utf8%';
>>>
>>> +------------------------------+---------+-----+---------+----------+---------+
>>> | Collation                    | Charset | Id  | Default | Compiled | Sortlen |
>>> +------------------------------+---------+-----+---------+----------+---------+
>>> | utf8_general_ci              | utf8    |  33 | Yes     | Yes      |       1 |
>>> | utf8_bin                     | utf8    |  83 |         | Yes      |       1 |
>>> | utf8_unicode_ci              | utf8    | 192 |         | Yes      |       8 |
>>> | utf8_icelandic_ci            | utf8    | 193 |         | Yes      |       8 |
>>> | utf8_latvian_ci              | utf8    | 194 |         | Yes      |       8 |
>>> | utf8_romanian_ci             | utf8    | 195 |         | Yes      |       8 |
>>> | utf8_slovenian_ci            | utf8    | 196 |         | Yes      |       8 |
>>> | utf8_polish_ci               | utf8    | 197 |         | Yes      |       8 |
>>> | utf8_estonian_ci             | utf8    | 198 |         | Yes      |       8 |
>>> | utf8_spanish_ci              | utf8    | 199 |         | Yes      |       8 |
>>> | utf8_swedish_ci              | utf8    | 200 |         | Yes      |       8 |
>>> | utf8_turkish_ci              | utf8    | 201 |         | Yes      |       8 |
>>> | utf8_czech_ci                | utf8    | 202 |         | Yes      |       8 |
>>> | utf8_danish_ci               | utf8    | 203 |         | Yes      |       8 |
>>> | utf8_lithuanian_ci           | utf8    | 204 |         | Yes      |       8 |
>>> | utf8_slovak_ci               | utf8    | 205 |         | Yes      |       8 |
>>> | utf8_spanish2_ci             | utf8    | 206 |         | Yes      |       8 |
>>> | utf8_roman_ci                | utf8    | 207 |         | Yes      |       8 |
>>> | utf8_persian_ci              | utf8    | 208 |         | Yes      |       8 |
>>> | utf8_esperanto_ci            | utf8    | 209 |         | Yes      |       8 |
>>> | utf8_hungarian_ci            | utf8    | 210 |         | Yes      |       8 |
>>> | utf8_sinhala_ci              | utf8    | 211 |         | Yes      |       8 |
>>> | utf8_german2_ci              | utf8    | 212 |         | Yes      |       8 |
>>> | utf8_croatian_mysql561_ci    | utf8    | 213 |         | Yes      |       8 |
>>> | utf8_unicode_520_ci          | utf8    | 214 |         | Yes      |       8 |
>>> | utf8_vietnamese_ci           | utf8    | 215 |         | Yes      |       8 |
>>> | utf8_general_mysql500_ci     | utf8    | 223 |         | Yes      |       1 |
>>> | utf8_croatian_ci             | utf8    | 576 |         | Yes      |       8 |
>>> | utf8_myanmar_ci              | utf8    | 577 |         | Yes      |       8 |
>>> | utf8_thai_520_w2             | utf8    | 578 |         | Yes      |       4 |
>>> | utf8mb4_general_ci           | utf8mb4 |  45 | Yes     | Yes      |       1 |
>>> | utf8mb4_bin                  | utf8mb4 |  46 |         | Yes      |       1 |
>>> | utf8mb4_unicode_ci           | utf8mb4 | 224 |         | Yes      |       8 |
>>> | utf8mb4_icelandic_ci         | utf8mb4 | 225 |         | Yes      |       8 |
>>> | utf8mb4_latvian_ci           | utf8mb4 | 226 |         | Yes      |       8 |
>>> | utf8mb4_romanian_ci          | utf8mb4 | 227 |         | Yes      |       8 |
>>> | utf8mb4_slovenian_ci         | utf8mb4 | 228 |         | Yes      |       8 |
>>> | utf8mb4_polish_ci            | utf8mb4 | 229 |         | Yes      |       8 |
>>> | utf8mb4_estonian_ci          | utf8mb4 | 230 |         | Yes      |       8 |
>>> | utf8mb4_spanish_ci           | utf8mb4 | 231 |         | Yes      |       8 |
>>> | utf8mb4_swedish_ci           | utf8mb4 | 232 |         | Yes      |       8 |
>>> | utf8mb4_turkish_ci           | utf8mb4 | 233 |         | Yes      |       8 |
>>> | utf8mb4_czech_ci             | utf8mb4 | 234 |         | Yes      |       8 |
>>> | utf8mb4_danish_ci            | utf8mb4 | 235 |         | Yes      |       8 |
>>> | utf8mb4_lithuanian_ci        | utf8mb4 | 236 |         | Yes      |       8 |
>>> | utf8mb4_slovak_ci            | utf8mb4 | 237 |         | Yes      |       8 |
>>> | utf8mb4_spanish2_ci          | utf8mb4 | 238 |         | Yes      |       8 |
>>> | utf8mb4_roman_ci             | utf8mb4 | 239 |         | Yes      |       8 |
>>> | utf8mb4_persian_ci           | utf8mb4 | 240 |         | Yes      |       8 |
>>> | utf8mb4_esperanto_ci         | utf8mb4 | 241 |         | Yes      |       8 |
>>> | utf8mb4_hungarian_ci         | utf8mb4 | 242 |         | Yes      |       8 |
>>> | utf8mb4_sinhala_ci           | utf8mb4 | 243 |         | Yes      |       8 |
>>> | utf8mb4_german2_ci           | utf8mb4 | 244 |         | Yes      |       8 |
>>> | utf8mb4_croatian_mysql561_ci | utf8mb4 | 245 |         | Yes      |       8 |
>>> | utf8mb4_unicode_520_ci       | utf8mb4 | 246 |         | Yes      |       8 |
>>> | utf8mb4_vietnamese_ci        | utf8mb4 | 247 |         | Yes      |       8 |
>>> | utf8mb4_croatian_ci          | utf8mb4 | 608 |         | Yes      |       8 |
>>> | utf8mb4_myanmar_ci           | utf8mb4 | 609 |         | Yes      |       8 |
>>> | utf8mb4_thai_520_w2          | utf8mb4 | 610 |         | Yes      |       4 |
>>> +------------------------------+---------+-----+---------+----------+---------+
>>> 59 rows in set (0.00 sec)
>>>
>>> I've replaced utf8 with utf8mb4 in my.cnf and MariaDB is now starting
>>> fine.
>>>
>>> Have I done the right thing?
>>>
>>> Shall the installation documentation be updated?
>>>
>>> Thanks,
>>> Adam
>>>
>>> --
>> Supercharge your Review Board with Power Pack:
>> https://www.reviewboard.org/powerpack/
>> Want us to host Review Board for you? Check out RBCommons:
>> https://rbcommons.com/
>> Happy user? Let us know! https://www.reviewboard.org/users/
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "Review Board Community" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to reviewboard+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/reviewboard/a02fb57b-6547-4d43-a028-4e8706a42860%40googlegroups.com
>> <https://groups.google.com/d/msgid/reviewboard/a02fb57b-6547-4d43-a028-4e8706a42860%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>
>
> --
> Christian Hammond
> President/CEO of Beanbag <https://www.beanbaginc.com/>
> Makers of Review Board <https://www.reviewboard.org/>
