php-i18n Digest 28 Jun 2006 10:40:07 -0000 Issue 332
Topics (messages 1013 through 1022):
    Re: [PHP-DEV] RFC: Error handling in HTTP input decoding
        1013 by: Jared Williams
        1014 by: Andrei Zmievski
    TextIterator changes
        1015 by: Andrei Zmievski
        1016 by: Michael Wallner
    Re: [PHP-DEV] Re: TextIterator changes
        1017 by: Andrei Zmievski
        1018 by: Michael Wallner
        1020 by: Marcus Boerger
        1021 by: Andrei Zmievski
    Re: [PHP-DEV] TextIterator changes
        1019 by: Andrei Zmievski
    Renaming unicode_semantics
        1022 by: Andrei Zmievski
Administrivia:
To subscribe to the digest, e-mail:
[EMAIL PROTECTED]
To unsubscribe from the digest, e-mail:
[EMAIL PROTECTED]
To post to the list, e-mail:
[email protected]
----------------------------------------------------------------------
--- Begin Message ---
> -----Original Message-----
> From: Andrei Zmievski [mailto:[EMAIL PROTECTED]
> Sent: 22 June 2006 22:46
> To: PHP Internals
> Cc: PHP I18N
> Subject: [PHP-DEV] RFC: Error handling in HTTP input decoding
>
> I'd like to solicit opinions on how we should treat conversion failures
> during HTTP input decoding. There are two issues at hand: the fallback
> mechanism and application-driven decoding in case of failure. Let's
> look at the proposal for the latter one first.
>
> If the decoding of HTTP input fails (and the failure state would be
> reached as soon as even one variable fails), PHP should set an error
> flag somewhere that is accessible to the user, via either a global
> variable or a function. It should also keep the original request data
> around (query string, POST body, and cookie data). The application
> should be able to access this data, since the encoding can be passed in
> the query string [1]. The application can then check this error flag
> and call a function -- request_decode() perhaps -- to ask PHP to
> re-decode the request data based on this specific encoding. For
> example:
>
> if (request_decoding_failed()) {
>     request_decode(request_get_raw('ei'));
> }
>
> We might be able to tie this in with the input filter, but that means
> that the input filter will have to be required by PHP. I am open to
> other suggestions in this area.
>
> As for the first issue, PHP attempts to decode the input using the
> value of the unicode.output_encoding setting, because that is the most
> logical choice if we assume that the clients send the data back in the
> encoding that the page with the form was in. We could implement a
> fallback mechanism where PHP looks at the Accept-Charset header sent by
> the client [2]. This header is supposed to indicate what character sets
https://bugzilla.mozilla.org/show_bug.cgi?id=18643
Maybe of interest, it's the kludge for determining form charsets, after the
charset in the Content-Type header broke too much.
> are acceptable for the response. While this is not the same as
> specifying the character set of the request, it might be a good enough
> indicator of it. Or we could simply set the error state and let the
> application figure out what charset it wants to use for decoding.
>
> Thanks for your attention.
>
> -Andrei
>
> [1] http://search.yahoo.com/search?ei=UTF-8&p=php
> [2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>
--- End Message ---
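The flow proposed in the RFC above can be approximated in user-land code. This is only a sketch under stated assumptions: request_decoding_failed(), request_decode() and request_get_raw() are names floated in the proposal, not existing PHP functions, so the helpers try_decode_query() and decode_request() below are hypothetical stand-ins built on the mbstring extension. The whitelist of allowed encodings is also an assumption, added so that a client-supplied "ei" value is never passed to mbstring unchecked.

```php
<?php
// User-land sketch only: these helpers are hypothetical, not PHP built-ins.
// Requires the mbstring extension.

/**
 * Decode every value in a raw query string from $encoding to UTF-8.
 * Returns null as soon as one value is not valid in $encoding.
 */
function try_decode_query(string $rawQuery, string $encoding): ?array
{
    $decoded = [];
    foreach (explode('&', $rawQuery) as $pair) {
        if ($pair === '') {
            continue;
        }
        [$name, $value] = array_pad(explode('=', $pair, 2), 2, '');
        $value = urldecode($value);
        if (!mb_check_encoding($value, $encoding)) {
            return null; // a single bad variable fails the whole request
        }
        $decoded[urldecode($name)] = mb_convert_encoding($value, 'UTF-8', $encoding);
    }
    return $decoded;
}

/**
 * First try the page's output encoding; on failure, re-decode using the
 * encoding named in the raw "ei" parameter, if the application allows it.
 */
function decode_request(string $rawQuery, string $outputEncoding, array $allowed): ?array
{
    $result = try_decode_query($rawQuery, $outputEncoding);
    if ($result !== null) {
        return $result;
    }
    parse_str($rawQuery, $raw);           // raw, undecoded request data
    $hint = $raw['ei'] ?? null;
    if (is_string($hint) && in_array($hint, $allowed, true)) {
        return try_decode_query($rawQuery, $hint);
    }
    return null;                          // leave the decision to the caller
}
```

A caller would pass the raw query string, the page's output encoding, and the encodings it is willing to accept, for example decode_request($_SERVER['QUERY_STRING'], 'UTF-8', ['UTF-8', 'ISO-8859-1']). A null result corresponds to the "set the error state and let the application decide" option in the RFC.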
--- Begin Message ---
Thanks for the link!
-Andrei
On Jun 22, 2006, at 5:11 PM, Jared Williams wrote:
> https://bugzilla.mozilla.org/show_bug.cgi?id=18643
> Maybe of interest, it's the kludge for determining form charsets, after the
> charset in the Content-Type header broke too much.
--- End Message ---
--- Begin Message ---
I am working on implementing BreakIterator API [1]. I considered two
approaches: making a separate class or merging the API into the
existing TextIterator. Having a separate class would be a bit cleaner,
but I can see people wanting to use it in foreach(), and since
TextIterator already provides a lot of BreakIterator's functionality, I
decided that merging would be the best option. However, there is an
overlap between the BreakIterator API and the current TextIterator one,
so there will have to be some changes.
1. TextIterator::current() signature will change from:
       mixed current()
   to:
       mixed current(integer &$offset)
   in order to support BreakIterator's functionality of returning offset
   in current().
2. TextIterator::next() will return the offset of the next boundary
   instead of returning nothing.
3. TextIterator::rewind() will be renamed to TextIterator::first() to
   conform to BreakIterator's first()/last() API.
So this is a heads-up. Let me know if you have a problem with this.
-A
[1] http://icu.sourceforge.net/apiref/icu4c/ubrk_8h.html
--- End Message ---
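For readers unfamiliar with the BreakIterator model referenced above, here is a minimal sketch of its first()/next() offset walk. It does not use the TextIterator class under discussion. As an assumption, it uses IntlBreakIterator from PHP's intl extension, which was added to PHP well after this thread and wraps the same ICU API. The helper name word_boundaries() is made up for illustration.

```php
<?php
// Requires the intl extension (IntlBreakIterator); not the TextIterator
// class discussed in this thread.

/** Collect ICU word-boundary offsets for $text by walking first()/next(). */
function word_boundaries(string $text, string $locale = 'en_US'): array
{
    $it = IntlBreakIterator::createWordInstance($locale);
    $it->setText($text);

    $offsets = [];
    // first() moves to the first boundary and returns its offset;
    // next() advances and returns the new offset, or DONE at the end.
    for ($pos = $it->first(); $pos !== IntlBreakIterator::DONE; $pos = $it->next()) {
        $offsets[] = $pos;
    }
    return $offsets;
}
```

The point of the sketch is that in the ICU model the iteration methods themselves return boundary offsets, which is what items 1 and 2 of the proposal tried to carry over to TextIterator.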
--- Begin Message ---
Andrei Zmievski wrote:
> I am working on implementing BreakIterator API [1]. I considered two
> approaches: making a separate class or merging the API into the existing
> TextIterator. Having a separate class would be a bit cleaner, but I can
> see people wanting to use it in foreach(), and since TextIterator
> already provides a lot of BreakIterator's functionality, I decided that
> merging would be the best option. However, there is an overlap between
> the BreakIterator API and the current TextIterator one, so there will
> have to be some changes.
> 1. TextIterator::current() signature will change from:
> mixed current()
> to:
> mixed current(integer &$offset)
This will raise the same issue as we have/had with the reflection API:
[EMAIL PROTECTED]:~/build/php-unicode-debug$ cli -r 'interface i{function f();}
class c implements i{function f($a){}}'
Fatal error: Declaration of c::f() must be compatible with that of i::f() in
Command line code on line 1
Nah, don't look at me--I don't like that either.
> 3. TextIterator::rewind() will be renamed to TextIterator::first() to
> conform to BreakIterator's first()/last() API.
> So this is heads up. Let me know if you have a problem with this.
Huh? Rename or alias? It can't implement Iterator if there's no rewind() method.
Regards,
--
Michael
--- End Message ---
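The compatibility rule behind the fatal error quoted above can be shown in a few lines. This is a sketch with made-up names (I, C, D): an implementing method may add parameters only if they are optional, because every call that is valid against the interface must remain valid against the class.

```php
<?php
interface I
{
    public function f();
}

// Compatible: the extra parameter is optional, so any call that is valid
// against I::f() is still valid against C::f().
class C implements I
{
    public function f($a = null)
    {
        return $a === null ? 'no argument' : "argument: $a";
    }
}

// Not compatible: a required extra parameter triggers the compile-time
// fatal error shown in the CLI session above.
// class D implements I { public function f($a) {} }
```

Uncommenting class D reproduces the "must be compatible" error, while class C loads and can be called with or without an argument.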
--- Begin Message ---
TextIterator does not implement Iterator interface, only Traversable.
It just happens to have functions of the same name as Iterator.
-Andrei
On Jun 23, 2006, at 2:16 PM, Michael Wallner wrote:
> Andrei Zmievski wrote:
>> I am working on implementing BreakIterator API [1]. I considered two
>> approaches: making a separate class or merging the API into the
>> existing TextIterator. Having a separate class would be a bit
>> cleaner, but I can see people wanting to use it in foreach(), and
>> since TextIterator already provides a lot of BreakIterator's
>> functionality, I decided that merging would be the best option.
>> However, there is an overlap between the BreakIterator API and the
>> current TextIterator one, so there will have to be some changes.
>> 1. TextIterator::current() signature will change from:
>> mixed current()
>> to:
>> mixed current(integer &$offset)
> This will raise the same issue as we have/had with the reflection API:
> [EMAIL PROTECTED]:~/build/php-unicode-debug$ cli -r 'interface
> i{function f();} class c implements i{function f($a){}}'
> Fatal error: Declaration of c::f() must be compatible with that of
> i::f() in Command line code on line 1
> Nah, don't look at me--I don't like that either.
>> 3. TextIterator::rewind() will be renamed to TextIterator::first() to
>> conform to BreakIterator's first()/last() API.
>> So this is heads up. Let me know if you have a problem with this.
> Huh? Rename or alias? It can't implement Iterator if there's no
> rewind() method.
> Regards,
> --
> Michael
--- End Message ---
--- Begin Message ---
Andrei Zmievski wrote:
> TextIterator does not implement Iterator interface, only Traversable. It
> just happens to have functions of the same name as Iterator.
Ah, okay.
That leaves the picky OO strictness I personally don't like.
I think we should give PHP some freedom back in this area.
Regards,
--
Michael
--- End Message ---
--- Begin Message ---
Hello Andrei,
You should reconsider giving it the full Iterator interface. Adding more
functionality is of course quite easy and no problem at all. Sharing an
interface also makes the whole thing feel much better, as in no conflicts
and as in consistent. Actually that's what interfaces are for.
Friday, June 23, 2006, 11:20:28 PM, you wrote:
> TextIterator does not implement Iterator interface, only Traversable.
> It just happens to have functions of the same name as Iterator.
> -Andrei
> On Jun 23, 2006, at 2:16 PM, Michael Wallner wrote:
>> Andrei Zmievski wrote:
>>> I am working on implementing BreakIterator API [1]. I considered two
>>> approaches: making a separate class or merging the API into the
>>> existing TextIterator. Having a separate class would be a bit
>>> cleaner, but I can see people wanting to use it in foreach(), and
>>> since TextIterator already provides a lot of BreakIterator's
>>> functionality, I decided that merging would be the best option.
>>> However, there is an overlap between the BreakIterator API and the
>>> current TextIterator one, so there will have to be some changes.
>>> 1. TextIterator::current() signature will change from:
>>> mixed current()
>>> to:
>>> mixed current(integer &$offset)
>>
>> This will raise the same issue as we have/had with the reflection API:
>> [EMAIL PROTECTED]:~/build/php-unicode-debug$ cli -r 'interface
>> i{function f();} class c implements i{function f($a){}}'
>>
>> Fatal error: Declaration of c::f() must be compatible with that of
>> i::f() in Command line code on line 1
>>
>>
>> Nah, don't look at me--I don't like that either.
>>
>>
>>> 3. TextIterator::rewind() will be renamed to TextIterator::first() to
>>> conform to BreakIterator's first()/last() API.
>>> So this is heads up. Let me know if you have a problem with this.
>>
>> Huh? Rename or alias? It can't implement Iterator if there's no
>> rewind() method.
Best regards,
Marcus
--- End Message ---
--- Begin Message ---
Thanks, I know what they are for. I remember discussing Iterator vs.
Traversable with you on IRC, and for some reason we settled on
Traversable. Anyway, I'm changing it now.
-Andrei
On Jun 24, 2006, at 2:35 AM, Marcus Boerger wrote:
> Hello Andrei,
> you should reconsider giving it full iterator interface. Adding more
> functionality is of course quite easy and no problem at all. Sharing an
> interface also makes the whole thing feel much better as in no conflicts
> as in consistent. Actually that's what interfaces are for.
> Friday, June 23, 2006, 11:20:28 PM, you wrote:
>> TextIterator does not implement Iterator interface, only Traversable.
>> It just happens to have functions of the same name as Iterator.
>> -Andrei
>> On Jun 23, 2006, at 2:16 PM, Michael Wallner wrote:
>>> Andrei Zmievski wrote:
>>>> I am working on implementing BreakIterator API [1]. I considered two
>>>> approaches: making a separate class or merging the API into the
>>>> existing TextIterator. Having a separate class would be a bit
>>>> cleaner, but I can see people wanting to use it in foreach(), and
>>>> since TextIterator already provides a lot of BreakIterator's
>>>> functionality, I decided that merging would be the best option.
>>>> However, there is an overlap between the BreakIterator API and the
>>>> current TextIterator one, so there will have to be some changes.
>>>> 1. TextIterator::current() signature will change from:
>>>> mixed current()
>>>> to:
>>>> mixed current(integer &$offset)
>>> This will raise the same issue as we have/had with the reflection API:
>>> [EMAIL PROTECTED]:~/build/php-unicode-debug$ cli -r 'interface
>>> i{function f();} class c implements i{function f($a){}}'
>>> Fatal error: Declaration of c::f() must be compatible with that of
>>> i::f() in Command line code on line 1
>>> Nah, don't look at me--I don't like that either.
>>>> 3. TextIterator::rewind() will be renamed to TextIterator::first() to
>>>> conform to BreakIterator's first()/last() API.
>>>> So this is heads up. Let me know if you have a problem with this.
>>> Huh? Rename or alias? It can't implement Iterator if there's no
>>> rewind() method.
> Best regards,
> Marcus
--- End Message ---
--- Begin Message ---
Sean (on IRC) convinced me that something called *Iterator had better
implement the Iterator interface (which TextIterator currently does not).
So changing method signatures is out of the question. To that end, the
current functions will stay as they are, but I'll have to add
current_offset() (for getting the offset of the current element as
opposed to the element itself), and alias rewind() to first().
-Andrei
On Jun 23, 2006, at 2:00 PM, Andrei Zmievski wrote:
> I am working on implementing BreakIterator API [1]. I considered two
> approaches: making a separate class or merging the API into the
> existing TextIterator. Having a separate class would be a bit cleaner,
> but I can see people wanting to use it in foreach(), and since
> TextIterator already provides a lot of BreakIterator's functionality,
> I decided that merging would be the best option. However, there is an
> overlap between the BreakIterator API and the current TextIterator
> one, so there will have to be some changes.
> 1. TextIterator::current() signature will change from:
> mixed current()
> to:
> mixed current(integer &$offset)
> in order to support BreakIterator's functionality of returning offset
> in current().
> 2. TextIterator::next() will return the offset of the next boundary
> instead of returning nothing.
> 3. TextIterator::rewind() will be renamed to TextIterator::first() to
> conform to BreakIterator's first()/last() API.
> So this is heads up. Let me know if you have a problem with this.
> -A
> [1] http://icu.sourceforge.net/apiref/icu4c/ubrk_8h.html
--- End Message ---
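The revised design in the message above (keep the Iterator signatures, add current_offset(), alias rewind() as first()) can be sketched with a toy class. WordIterator is a hypothetical stand-in, not the real TextIterator: as a simplifying assumption it splits on whitespace and reports byte offsets rather than ICU boundaries. The mixed return type assumes PHP 8.0 or later.

```php
<?php
// Toy stand-in for the revised design; not the real TextIterator.
final class WordIterator implements Iterator
{
    /** @var array<int, array{0: string, 1: int}> [word, byte offset] pairs */
    private array $words;
    private int $index = 0;

    public function __construct(string $text)
    {
        // PREG_OFFSET_CAPTURE yields [match, byte offset] pairs.
        preg_match_all('/\S+/', $text, $m, PREG_OFFSET_CAPTURE);
        $this->words = $m[0];
    }

    // The Iterator methods keep their standard signatures.
    public function current(): mixed { return $this->words[$this->index][0] ?? null; }
    public function key(): mixed     { return $this->index; }
    public function next(): void     { $this->index++; }
    public function rewind(): void   { $this->index = 0; }
    public function valid(): bool    { return isset($this->words[$this->index]); }

    // Offset of the current element, exposed as a separate method instead
    // of a by-reference argument to current(). Returns -1 when invalid.
    public function current_offset(): int
    {
        return $this->words[$this->index][1] ?? -1;
    }

    // BreakIterator-style name that simply forwards to rewind().
    public function first(): void
    {
        $this->rewind();
    }
}
```

Because the class implements Iterator, it works in foreach() unchanged, and code that wants the offset calls current_offset() inside the loop body.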
--- Begin Message ---
I am considering renaming unicode_semantics to unicode.semantics to
conform to the rest of the Unicode-related INI settings. If there are
no major objections, I am going to patch it tomorrow.
-Andrei
--- End Message ---