Edit report at https://bugs.php.net/bug.php?id=48507&edit=1

 ID:                 48507
 Comment by:         me at monicag dot it
 Reported by:        krynble at yahoo dot com dot br
 Summary:            fgetcsv() ignoring special characters
 Status:             Bogus
 Type:               Bug
 Package:            Filesystem function related
 Operating System:   Unix
 PHP Version:        5.*
 Block user comment: N
 Private report:     N

 New Comment:

Quoting my fellows above: how come this is not a bug?

Previous Comments:
------------------------------------------------------------------------
[2011-10-10 10:03:58] ghosh at q-one dot com

Sorry. I don't understand why this isn't a bug either. Could someone please elaborate? I tried setting all different kinds of locale to no avail. The first letter of a string starting with a UTF-8 character is always missing.

IMHO, fgetcsv should work as a simple string operation (or - whatever weird things it does right now - at least have a parameter to do so - count this as a feature request if you wish).

I think the current behavior is totally confusing. For instance, I don't understand why only the first character is missing, but the problem doesn't appear if the character is in the middle of a string.

------------------------------------------------------------------------
[2011-07-17 16:19:28] max dot wildgrube at web dot de

The problem also appears if the special char is preceded by a blank. This blank also disappears.

I use this ugly workaround:

1. first reading the complete csv file into a variable: $import
2. $import = preg_replace ("{(^|\t)([€-ÿ ])}m", "$1~~$2", $import);
3. after fgetcsv; for each $field of the row array: $field = str_replace ('~~', '', $field);

This means: before using fgetcsv, insert a magic sequence (e.g. ~~) at the beginning of each field that begins with a blank or a special char; after parsing with fgetcsv, remove it from each field.

Max.

------------------------------------------------------------------------
[2011-07-08 08:39:50] php-bug-48507 at bsrealm dot net

This IS a bug.
Whatever the locale is, I expect this function to read everything between delimiter characters without stripping the contents. Besides, the docs say that files in a one-byte encoding would be read wrongly, and this is a different case.

This bug causes a serious portability issue. In my case, this function was used to read a custom database that stored descriptions entered by users. Some descriptions were in UTF-8 encoding. The function just had to read whatever was between delimiter characters; it worked like that on Windows hosting and stopped working after moving to Unix hosting.

Note, the file itself is not UTF-8 encoded and it should not be. It is not related to locale. The function must read the data between delimiters, even if it's binary.

------------------------------------------------------------------------
[2011-02-26 02:46:32] gjorgjioski at gmail dot com

This is a short example:
kategorija širina platišč število

read:
kategorija irina platišč tevilo

expected:
kategorija širina platišč število

------------------------------------------------------------------------
[2011-02-26 02:36:32] gjorgjioski at gmail dot com

This bug also occurs when the file is in UTF-8 (tab-delimited file using š, č characters). I can provide an example.

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=48507