On Thu, Apr 7, 2016 at 9:36 AM, Bishop Bettini <bis...@php.net> wrote:
> The problem is, developers are going to write code to guess character sets.
>
True.  But they're going to put more faith in something in the
standard distribution, assuming it's passed muster.

> Ironically, PHPUnit attempts to detect UTF-8
>
Awwwwwwwwkward....

> I'd rather we include the patch for a few reasons:
>
> 1. so that there's a modern "standard" method of doing so, and that
> "standard" method has plenty of documentation that points people to the
> limitations.
>
In that spirit, how about we put in some stub documentation under the
intl extension with a paragraph or two on why UCharsetDetector *isn't*
wrapped, and why it's such a bad idea to try to solve the problem from
this end.

> 2. to completely expose the underlying ICU, rather than arbitrarily
> deciding one part isn't good for developers to use.
>
Is it arbitrary, though?  The fact that coming up with test cases which
produce reasonable/expected results is half a crapshoot makes this an
evidence-based decision, not a capricious one.

> 3. to provide an alternative to mb_detect_encoding.
>
And again in that spirit, I think this is a good argument for going
E_DEPRECATED on mb_detect_encoding().  The entire conversation which
led to prototyping an IntlCharsetDetector extension came from the fact
that mb_detect_encoding() wasn't doing its job well.  Rather than have
two supported, bad solutions, I think it'd be better to have one
deprecated (and thus unsupported) bad solution (which is only kept for
BC).

> While I can't say if this will or won't cause more user confusion, I do
> believe this adds value: ICU provides a confidence metric, which no other
> in-built or buildable solution (to my knowledge) provides.
>
The confidence metric is useful, but my spidey sense tells me that
it'll simply be ignored.

How about a compromise?  I'll reorder this patch to be a standalone
extension and we PECLize it.  If someone REALLY wants to throw caution
to the wind, they can, but they're on their own when it gives them
fugly results.

-Sara

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
