drysler,

So you are asking about what PHP6 calls "script_encoding". PHP5 does not
have this concept.

PHP5 does not know the character set that your script is encoded in. The
script can be encoded in any character set that is "ASCII compatible".
ASCII-compatible encodings have the property that any byte value which has
a meaning as an ASCII character has the same meaning in that character
set. iso-8859-1 and utf-8 both have this property, whereas utf-16, for
example, does not. There are rules about which characters can be used in
identifiers (see
http://uk.php.net/manual/en/language.functions.php#functions.user-defined),
but you'll see that these are described in terms of byte values. Thus when
you define an identifier or string literal in a script, PHP simply treats
it as a series of byte values, without any understanding of what those
bytes mean beyond the ones that happen to represent ASCII characters.
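
To make that concrete, here is a minimal sketch rather than anything
authoritative. It assumes the identifier rule in the form the manual gives
it (some editions of the manual start the high byte range at \x7f instead
of \x80). The helper names is_valid_identifier() and is_pure_ascii() are
made up for this illustration and are not part of PHP. The non-ASCII
strings are spelled out as byte escapes so that the example itself is pure
ASCII.

```php
<?php
// Identifier rule in the form the PHP manual gives it, written in byte
// values. Assumption: some manual editions start the high range at \x7f.
// The D modifier makes $ match only at the very end of the subject.
const IDENTIFIER_PATTERN = '/^[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*$/D';

// Hypothetical helper: true if $bytes would be accepted as an identifier.
function is_valid_identifier($bytes)
{
    return preg_match(IDENTIFIER_PATTERN, $bytes) === 1;
}

// Hypothetical helper: true if every byte is in the ASCII range 0x00-0x7f.
function is_pure_ascii($bytes)
{
    return preg_match('/^[\x00-\x7f]*$/D', $bytes) === 1;
}

// "café" spelled as raw bytes in three encodings.
$latin1 = "caf\xE9";                  // ISO-8859-1: 4 bytes
$utf8   = "caf\xC3\xA9";              // UTF-8: 5 bytes
$utf16  = "c\x00a\x00f\x00\xE9\x00";  // UTF-16LE: 8 bytes, NUL after each

// The method from the question contains only ASCII characters.
$snippet = 'public function foo($bar = true) { return self::SOME_CONSTANT; }';

var_dump(is_valid_identifier($latin1)); // bool(true)
var_dump(is_valid_identifier($utf8));   // bool(true)
var_dump(is_valid_identifier($utf16));  // bool(false): NUL bytes are not allowed
var_dump($latin1 === $utf8);            // bool(false): different byte sequences
var_dump(is_pure_ascii($snippet));      // bool(true): same bytes in either encoding
```

So the foo() method from your mail is byte-for-byte identical whether the
file is saved as iso-8859-1 or utf-8. A name like "café", on the other
hand, is a valid identifier in both encodings but produces two different
byte sequences, so PHP would treat them as two different names.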

Rob.

drysler <[EMAIL PROTECTED]> wrote on 27/09/2007 20:32:09:

> > On 9/27/07, drysler <[EMAIL PROTECTED]> wrote:
> >> Hello,
> >>
> >> i am practising with charsets at the moment and so i thought:
> >>
> >> -> How does PHP know the charset i use in my source-code?
> >> -> Are php-sources limited to specific charsets?
> >> -> In which areas you have to be aware of the source-code-charset?
> >>
> >>
> >> Perhaps somebody here on the list can tell something about these
> >> issues?
> >> Thanks!
> >>
> > Unless I'm mistaken, PHP expects the source files to be in the
> > internal charset, which is ISO-8859-1. If you use the mbstring
> > extension, you can use different internal encodings. See:
> > http://www.php.net/mbstring
> > 
> > Another good read on charset vs. PHP is:
> > http://www.phpwact.org/php/i18n/charsets?s=utf
> > 
> > --
> > troels
> 
> 
> I think, the problem may be divided into 2 areas:
> 
> 1) handling charsets of data (e.g. regex or string functions)
> ------------------------------------------------------------------------
> No unsolvable problem. You have to know (and/or validate) the charset of
> the data you process, no matter if typed in in the source code or loaded
> from other data sources. There are "tools and workarounds" available, to
> do the things right.
> 
> 2) paying attention to the charset of the source code
> ------------------------------------------------------------------------
> This is the main issue, i wanted to address with my posting.
> I asked myself, if there can be characters i use as source code, which
> php perhaps can not recognize because of the charset i used in the 
> source-code-document.
> Or perhaps in php are only characters "allowed", which are represented 
> all the same in all supported charsets, so there might be a list of 
> charsets, you can safely use when scripting php.
> 
> I mean, is there a difference (bytes?) writing the following in 
> iso-8859-1 or utf-8?
> 
> public function foo($bar = true) {
>    return self::SOME_CONSTANT;
> }
> 
> And if there is a difference, how php knows what i typed?
> 
> So many questions .... :)
> 
> 
> --
> Greetings,
> drysler
> 
> -- 
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
> 






Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU




