ID:               31064
 User updated by:  km at control-b dot de
 Reported By:      km at control-b dot de
 Status:           Bogus
 Bug Type:         Strings related
 Operating System: windows
 PHP Version:      5.0.2
 New Comment:

well, i think that has nothing to do with utf-8 encoding. (it is
decoded incorrectly in the email!)
i type the text as it is, in german (no encodings or entities are
used).

what bothers me is the fact that it works with all german umlauts
except "ö" and the "ß".
when it works for ä and ü, but not for ö, it must be a bug!
so the question is whether there are other people, like me, who have
discovered this!


Previous Comments:
------------------------------------------------------------------------

[2004-12-12 14:54:46] [EMAIL PROTECTED]

This is not a bug, PHP doesn't know anything about UTF8 encodings and
will split up a word if it's not [A-z].
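
For illustration only, a minimal sketch of counting words when the input
is known to be UTF-8. The helper name utf8_word_count is hypothetical and
not part of PHP; it assumes a PCRE build with Unicode property support
and that the source file and input strings are valid UTF-8.

```php
<?php
// Hypothetical helper (not a PHP built-in): counts runs of Unicode letters.
// Assumption: $text is valid UTF-8 and PCRE was built with Unicode support.
function utf8_word_count($text)
{
    // \p{L} matches any Unicode letter; the /u modifier makes PCRE treat
    // both pattern and subject as UTF-8. Apostrophes and hyphens are
    // allowed inside a word but not at its start.
    $count = preg_match_all("/\\p{L}[\\p{L}'-]*/u", $text, $matches);

    // preg_match_all() returns false on error (e.g. invalid UTF-8 input).
    return $count === false ? 0 : $count;
}

echo utf8_word_count('wörk'), "\n";
echo utf8_word_count('straßenbahnölbehälter'), "\n";
```

If the input is in a single-byte encoding such as ISO-8859-1 instead,
this sketch does not apply as written; the string would need to be
converted to UTF-8 first.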

------------------------------------------------------------------------

[2004-12-12 01:57:22] km at control-b dot de

Description:
------------
str_word_count returns the wrong number if the german umlaut "ö" is
contained in the word.
it is okay if the umlaut is the first or the last character.

the same goes for the ligature "ß".



Reproduce code:
---------------
echo str_word_count('wäre');         # 1 - okay
echo str_word_count('würde');        # 1 - okay
echo str_word_count('wérk');         # 1 - okay
echo str_word_count('wörk');         # 2 - wrong!!!
echo str_word_count('örk');          # 1 - okay
echo str_word_count('werök');        # 2 - wrong!!!
echo str_word_count('weräk');        # 1 - okay


echo str_word_count('straßenbahnölbehälter'); # 3 words???

Expected result:
----------------
the above code should always return 1
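
One possible way to get a count of 1 for these words, sketched under
assumptions: the third argument of str_word_count (a list of additional
word characters) exists only in PHP 5.1.0 and later, so it is not
available in the 5.0.2 release this report was filed against. The sketch
also assumes the file is saved as UTF-8; the variable names are
illustrative.

```php
<?php
// Assumption: PHP 5.1.0 or later, and this file is saved as UTF-8.
// The third argument lists extra bytes that str_word_count() should
// treat as part of a word, in addition to the letters it already knows.
$extra = 'äöüÄÖÜß';

$words = array('wäre', 'wörk', 'werök', 'straßenbahnölbehälter');
foreach ($words as $word) {
    echo $word, ': ', str_word_count($word, 0, $extra), "\n";
}
```

The list works byte by byte, so it has to contain every non-ASCII
character that may appear inside a word.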



------------------------------------------------------------------------

