On Tue, 15 Jul 2025 at 16:05, youkidearitai <youkideari...@gmail.com> wrote:
>
> On Mon, 14 Jul 2025 at 19:22, Derick Rethans <der...@php.net> wrote:
> >
> > On Wed, 9 Jul 2025, youkidearitai wrote:
> >
> > > Hi, Internals
> > >
> > > I have updated the RFC below.
> > > - https://wiki.php.net/rfc/grapheme_add_locale_for_case_insensitive
> > > Pull request is below:
> > > - https://github.com/php/php-src/pull/18792
> > >
> > > The changes are as follows:
> > > - Add a strength parameter to the grapheme_* functions
> > >   - This affects characters from scripts worldwide, e.g. Ideographic
> > >     Variation Sequences (IVS)
> > >   - It takes the Collator class constants as values.
> >
> > These settings are indeed important for these functions, but I can't get
> > around the fact that it makes these APIs really cluttered and
> > complicated — something that many functions in the grapheme_ / intl
> > extension already suffer from.
> >
> > Is this API really the best way?
> >
> > > The $locale parameter is unchanged, because I could not find any
> > > alternative.
> >
> > It seems that I came to a similar conclusion, but locales are much more
> > complicated than just languageCode_regionCode (for example, see
> > https://github.com/derickr/php-text/blob/main/tests/text-contains.phpt#L25)
> >
> > You also don't really need a strength argument, as you can 'encode' that
> > in the locale name, like 'nb_NO-u-ks-primary'. (I know, it's rather ugly,
> > and the list of options is vast:
> > https://www.unicode.org/reports/tr35/tr35-collation.html#Common_Settings )
> >
> > cheers,
> > Derick
>
> Hi, Derick
>
> Thank you very much for your response.
>
> > Is this API really the best way?
>
> I reconsidered the function signature based on what you said.
>
> > It seems that I came to a similar conclusion, but locales are much more
> > complicated than just languageCode_regionCode (for example, see
> > https://github.com/derickr/php-text/blob/main/tests/text-contains.phpt#L25)
> >
> > You also don't really need a strength argument, as you can 'encode' that
> > in the locale name, like 'nb_NO-u-ks-primary'. (I know, it's rather ugly,
> > and the list of options is vast:
> > https://www.unicode.org/reports/tr35/tr35-collation.html#Common_Settings )
>
> Indeed, since the strength can be encoded in the locale, I think it is
> better to specify it there than to add a separate strength parameter.
>
> For example, the grapheme_* functions can then detect differences in IVS:
> ```
> $ sapi/cli/php -r 'var_dump(grapheme_levenshtein("\u{908A}", "\u{908A}\u{E0101}", locale: "ja_JP-u-ks-identic"));'
> int(1)
> $ sapi/cli/php -r 'var_dump(grapheme_levenshtein("\u{908A}", "\u{908A}\u{E0101}"));'
> int(0)
> $ sapi/cli/php -r 'var_dump(grapheme_strpos("\u{908A}", "\u{908A}\u{E0101}"));'
> int(0)
> $ sapi/cli/php -r 'var_dump(grapheme_strpos("\u{908A}", "\u{908A}\u{E0101}", locale: "ja_JP-u-ks-identic"));'
> bool(false)
> ```
>
> Ideographic variants also carry identity (e.g., in personal names), so I
> would like the grapheme_* functions to be able to handle IVS.
> However, the API should stay simple, so we need to compromise somewhere.
>
> Regards
> Yuya
>
>
> --
> ---------------------------
> Yuya Hamada (tekimen)
> - https://tekitoh-memdhoi.info
> - https://github.com/youkidearitai
> -----------------------------

Hi, Internals

I have revised this RFC.
https://wiki.php.net/rfc/grapheme_add_locale_for_case_insensitive
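
For readers less familiar with collation strength, here is a minimal sketch of
the idea using the existing Collator class from the intl extension. It does not
use the proposed $locale parameter: the explicit setStrength() calls only stand
in for the "-u-ks-*" locale keywords discussed earlier in this thread, and the
variable names are purely illustrative.

```php
<?php
// Sketch only: requires ext/intl. Uses the existing Collator API, not the
// grapheme_* $locale parameter proposed in the RFC.

$base = "\u{908A}";           // a CJK ideograph
$ivs  = "\u{908A}\u{E0101}";  // the same ideograph plus a variation selector

$collator = new Collator('ja_JP');

// Primary strength compares base letters only, so case is ignored.
$collator->setStrength(Collator::PRIMARY);
$caseAtPrimary = $collator->compare('a', 'A');

// Tertiary strength (the ICU default) takes case into account.
$collator->setStrength(Collator::TERTIARY);
$caseAtTertiary = $collator->compare('a', 'A');
// Variation selectors are ignorable at this level, so the pair should
// compare as equal, in line with the default results quoted in this thread.
$ivsAtTertiary = $collator->compare($base, $ivs);

// Identical strength also compares the code points themselves,
// so the variation selector makes the strings differ.
$collator->setStrength(Collator::IDENTICAL);
$ivsAtIdentical = $collator->compare($base, $ivs);

// compare() returns 0 for "equal" and -1 or 1 otherwise.
var_dump($caseAtPrimary, $caseAtTertiary, $ivsAtTertiary, $ivsAtIdentical);
```

Encoding the strength in the locale string keeps the function signatures
shorter, at the cost of a less discoverable option syntax.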

I believe I have done my best to address the complexity of Unicode.
I would like to move it to the "Voting" phase.

If there are no objections, I will open the vote this week.

Regards
Yuya


-- 
---------------------------
Yuya Hamada (tekimen)
- https://tekitoh-memdhoi.info
- https://github.com/youkidearitai
-----------------------------
