On 2019-04-11 at 15:41, Christoph M. Becker wrote:
On 02.04.2019 at 11:42, Nicolai Scheer wrote:

I'm currently in the process of migrating an old application from php 5.6
to 7.2.
In the process, I fiddled with the default_charset ini setting.

The documentation states (cf.
https://www.php.net/manual/en/ini.core.php#ini.default-charset):

"In PHP 5.6 onwards, "UTF-8" is the default value and [...] The value of
default_charset
will also be used to set the default character set for [...] and for
mbstring functions
if the mbstring.http_input mbstring.http_output mbstring.internal_encoding
configuration option is unset."

As such, I'd expect to be able to set default_charset to iso-8859-1 and
mbstring to pick that same setting for its internal encoding (if the
mentioned directives are unset, that is).

This seems not to be the case:

<?php
ini_set( 'default_charset', 'iso-8859-1' );
var_dump( ini_get("mbstring.internal_encoding") );
var_dump( ini_get("mbstring.http_input") );
var_dump( ini_get("mbstring.http_output") );
echo mb_internal_encoding() . "\n";
echo mb_strlen( "\xc3\xb6" ) . "\n";
echo mb_strlen( "\xc3\xb6", '8bit' ) . "\n";

This outputs (7.2.15 on a CentOS box):
string(0) ""
string(0) ""
string(0) ""
UTF-8
1
2

The default_charset is set but mbstring settings are not, so I'd expect to
get 2 as the character/byte count in both cases.

If I throw a mb_internal_encoding("iso-8859-1") in the mix, both string
lengths are equal.
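
For illustration, a minimal sketch of that workaround, assuming the mbstring extension is loaded. The variable names are hypothetical and only serve the example:

```php
<?php
// Sketch only: set mbstring's internal encoding explicitly instead of
// relying on default_charset being picked up at runtime.
ini_set('default_charset', 'iso-8859-1');
mb_internal_encoding('ISO-8859-1');

$bytes = "\xc3\xb6"; // two bytes; a single "ö" if interpreted as UTF-8

$chars = mb_strlen($bytes);         // uses the internal encoding set above
$raw   = mb_strlen($bytes, '8bit'); // always counts bytes

echo mb_internal_encoding() . "\n";
echo $chars . "\n";
echo $raw . "\n";
```

With the internal encoding set this way, both counts should agree, since ISO-8859-1 is a single-byte encoding.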

Since the mentioned mbstring directives are deprecated as of 5.6.0 - do I
really need to use mb_internal_encoding() instead?
Is the documentation wrong or am I just misinterpreting it? I thought that
default_charset should act as some kind of "master setting" in order not to
have to set all specific settings as well (e.g. iconv, mbstring).

Usually we use UTF-8, so I did not come across this before...

Any insight?

<https://3v4l.org/ZvQ67> confirms the reported behavior, and so does a
quick look at the code.  I suggest you file a ticket on <https://bugs.php.net/>.

Thanks,
Christoph M. Becker

Hi,

Did this lead to a bug report?

It led to a bug in Smarty 3.1.33 for me. I got a warning about
"mbregex compile err: invalid code point value" in mb_split().
My content is in ISO-8859-1, and Smarty's normal procedure of
setting its encoding and the php.ini setting to ISO-8859-1 failed.

However mb_regex_encoding('ISO-8859-1') did the trick!
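
A minimal sketch of that fix, assuming mbstring is built with mbregex support. The sample string and variable names are made up for the example and are not taken from Smarty:

```php
<?php
// Sketch only: set the mbregex encoding explicitly before calling mb_split(),
// rather than expecting it to follow default_charset.
mb_regex_encoding('ISO-8859-1');

$text  = "sm\xf6rg\xe5s bord"; // ISO-8859-1 bytes for "smörgås bord"
$parts = mb_split('\s+', $text); // split on runs of whitespace

echo count($parts) . "\n";
```

The call has to happen before the first mb_split() or mb_ereg*() call that handles ISO-8859-1 data.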

r//Björn L


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
