ID:               39244
 Updated by:       [EMAIL PROTECTED]
 Reported By:      j dot hakvoort at publiceren dot net
-Status:           Open
+Status:           Wont fix
 Bug Type:         PCRE related
 Operating System: Linux
 PHP Version:      4.4.4
 New Comment:

Unicode support is on its way and will appear in PHP6.
You have to wait until then.


Previous Comments:
------------------------------------------------------------------------

[2006-10-25 12:26:27] j dot hakvoort at publiceren dot net

Hi!

I don't know PCRElib, but I don't expect it to have anything to do
with this issue, since you mention "functions" reading Unicode.

It's not about that. The problem is that PHP can't run a script when
the script file itself is saved in UTF-8 format.

The BOM causes 3 characters to be printed, so sending headers isn't
possible anymore.

Also, PHP doesn't recognize UTF-8 characters in functions, but that is
not the main issue. The issue I am referring to is the BOM of UTF-8
encoded documents.

Best Regards,
Jan Jaap Hakvoort
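
The behaviour described above can be checked directly: the UTF-8 BOM is
the byte sequence EF BB BF, and when it sits in front of the opening
<?php tag PHP treats it as ordinary output, which is why headers can no
longer be sent. The following is a minimal sketch, not anything shipped
with PHP; the function name has_utf8_bom() is hypothetical and it only
assumes fopen() and fread(), which PHP 4 provides.

```php
<?php
// Hypothetical helper (not part of PHP): reports whether a file begins
// with the three-byte UTF-8 byte order mark, EF BB BF.
function has_utf8_bom($path)
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false; // unreadable file: treat as "no BOM found"
    }
    $head = fread($handle, 3); // only the first three bytes matter
    fclose($handle);
    return $head === "\xEF\xBB\xBF";
}
```

Calling has_utf8_bom() on each script of a site should show which files
an editor saved with a BOM.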

------------------------------------------------------------------------

[2006-10-25 12:11:25] [EMAIL PROTECTED]

PCRE functions in PHP are just wrappers for PCRElib.
If PCRElib is unable to read Unicode texts with a BOM, then it's
PCRElib's fault.
But I guess you shouldn't be using Notepad in the first place.

------------------------------------------------------------------------

[2006-10-25 12:08:07] j dot hakvoort at publiceren dot net

Ok, I found out that it's due to the following bug in PHP:

UTF-8 encoded documents have 3 characters at the top of the document
which mark it as UTF-8; this is called the BOM.

These characters might be needed, but to get PHP working it would be
required to remove these characters.

The only way I've found to remove these characters is by using
special editors. This will take a huge amount of time!

Is there no other solution for this?? Why doesn't PHP read UTF-8
encoded files???

Best Regards,
Jan Jaap Hakvoort
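
Removing the BOM does not strictly require a special editor; it can
also be done from a script. The following is a minimal sketch under
stated assumptions: the function names strip_utf8_bom() and
strip_bom_from_file() are hypothetical, the file is rewritten in place
(so a backup beforehand is advisable), and only functions available in
PHP 4.3 and later are used.

```php
<?php
// Hypothetical helper: returns $text without a leading UTF-8 BOM.
function strip_utf8_bom($text)
{
    if (strncmp($text, "\xEF\xBB\xBF", 3) === 0) {
        // The cast covers old PHP versions, where substr() returns
        // false when nothing is left after the BOM.
        return (string) substr($text, 3);
    }
    return $text;
}

// Hypothetical helper: rewrites one file in place if it starts with a
// BOM. Returns true only when the file was actually changed.
function strip_bom_from_file($path)
{
    $original = file_get_contents($path);
    if ($original === false) {
        return false;
    }
    $cleaned = strip_utf8_bom($original);
    if ($cleaned === $original) {
        return false; // no BOM, nothing to do
    }
    $handle = fopen($path, 'wb');
    if ($handle === false) {
        return false;
    }
    fwrite($handle, $cleaned);
    fclose($handle);
    return true;
}
```

A whole directory could be processed by looping over the result of
glob('*.php') and passing each path to strip_bom_from_file().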

------------------------------------------------------------------------

[2006-10-24 09:55:58] [EMAIL PROTECTED]

Sorry, but your problem does not imply a bug in PHP itself.  For a
list of more appropriate places to ask for help using PHP, please
visit http://www.php.net/support.php as this bug system is not the
appropriate forum for asking support questions.  Due to the volume
of reports we can not explain in detail here why your report is not
a bug.  The support channels will be able to provide an explanation
for you.

Thank you for your interest in PHP.



------------------------------------------------------------------------

[2006-10-24 08:55:58] j dot hakvoort at publiceren dot net

Description:
------------
Hi!

I've been working on an encoding issue to make a site compatible with
any language, but the problem is that special characters printed from
PHP come out malformed.

The advice I received was to save the PHP code as UTF-8, but when I do
this the script fails, because PHP doesn't read UTF-8 encoded PHP
code!

Is this a bug? I am using PHP 4.4.4.

Best Regards,
Jan Jaap Hakvoort

Reproduce code:
---------------
<?php
$str = 'éúïäç Ä';
$match = preg_match('¡[ ]+¡',$str);
?>



Expected result:
----------------
$match will be set to true.

Actual result:
--------------
PHP error, unexpected character [block]...


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=39244&edit=1
