Edit report at https://bugs.php.net/bug.php?id=48507&edit=1

 ID:                 48507
 Comment by:         tero dot tasanen at gmail dot com
 Reported by:        krynble at yahoo dot com dot br
 Summary:            fgetcsv() ignoring special characters
 Status:             Bogus
 Type:               Bug
 Package:            Filesystem function related
 Operating System:   Unix
 PHP Version:        5.*
 Block user comment: N
 Private report:     N

 New Comment:

I can also confirm that this is an actual bug. The file encoding is UTF-8, the 
locale settings are set correctly, and characters like äöå are dropped from the 
beginning of the CSV column.

Tested with PHP versions 5.2.6, 5.2.10 and 5.3.6.


Previous Comments:
------------------------------------------------------------------------
[2011-10-28 08:33:25] peter dot e dot lind at gmail dot com

This is definitely still a bug - my locale is set to da_DK.utf8 and the file I'm 
trying to read is in UTF-8 (confirmed with a hex editor, though in fact it does 
not matter - the behaviour is the same for UTF-8 or ISO-8859-1), yet special 
characters are still thrown away when they are first in a field.

------------------------------------------------------------------------
[2011-10-18 13:59:30] me at monicag dot it

Quoting my fellows above: how come this is not a bug?

------------------------------------------------------------------------
[2011-10-10 10:03:58] ghosh at q-one dot com

Sorry, I don't understand why this isn't a bug either. Could someone please 
elaborate? I tried setting all different kinds of locale, to no avail. The first 
letter of a string starting with a UTF-8 character is always missing. IMHO, 
fgetcsv should work as a simple string operation (or, whatever weird things it 
does right now, at least have a parameter to do so - count this as a feature 
request if you wish). I think the current behavior is totally confusing. For 
instance, I don't understand why only the first character is missing, while the 
problem doesn't appear if the character is in the middle of a string.

------------------------------------------------------------------------
[2011-07-17 16:19:28] max dot wildgrube at web dot de

The problem also appears if the special char is preceded by a blank. This 
blank also disappears.

I use this ugly workaround:
1. first reading the complete csv file into a variable: $import
2. $import = preg_replace ("{(^|\t)([€-ÿ ])}m", "$1~~$2", $import); 
3. after fgetcsv; for each $field of the row array: $field = str_replace ('~~', 
'', $field);

This means: before using fgetcsv, insert a magic sequence (e.g. ~~) at the 
beginning of every field that begins with a blank or a special char; after 
parsing with fgetcsv, remove it from each field.

Max.
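
The workaround described above could be sketched roughly as follows. This is 
only an illustration, not code from the report: the function name 
read_tsv_with_marker() is hypothetical, the character class [€-ÿ ] is assumed 
to mean the bytes 0x80 to 0xFF plus a blank, and a php://temp stream is 
assumed as the way to hand the modified text to fgetcsv(). Unlike the 
str_replace() call in step 3, this sketch strips the marker only from the 
start of a field, so a ~~ occurring later in a field is left alone.

```php
<?php
// Hypothetical helper sketching the marker workaround; the function name,
// the php://temp stream and the byte range \x80-\xFF are assumptions.
function read_tsv_with_marker($path, $marker = '~~')
{
    // 1. Read the complete file into a variable.
    $import = file_get_contents($path);
    if ($import === false) {
        throw new RuntimeException("Cannot read $path");
    }

    // 2. Put the marker in front of every field that starts with a blank
    //    or a non-ASCII byte (at the start of a line or right after a tab).
    $import = preg_replace(
        '/(^|\t)([\x80-\xFF ])/m',
        '${1}' . $marker . '${2}',
        $import
    );

    // fgetcsv() needs a stream, so the modified text goes into a temp stream.
    $handle = fopen('php://temp', 'r+');
    fwrite($handle, $import);
    rewind($handle);

    // 3. Parse row by row and strip the marker from the start of each field.
    $rows = array();
    while (($row = fgetcsv($handle, 0, "\t", '"', '\\')) !== false) {
        if ($row === array(null)) {
            continue; // fgetcsv() reports a blank line as array(null)
        }
        foreach ($row as $i => $field) {
            if (strncmp($field, $marker, strlen($marker)) === 0) {
                $row[$i] = (string) substr($field, strlen($marker));
            }
        }
        $rows[] = $row;
    }
    fclose($handle);

    return $rows;
}
```

A field that genuinely begins with the marker in the source data would lose 
it, so the marker should be a sequence that cannot occur in the file.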

------------------------------------------------------------------------
[2011-07-08 08:39:50] php-bug-48507 at bsrealm dot net

This IS a bug. Whatever the locale is, I expect this function to read everything 
between delimiter characters without stripping the contents. Besides, the docs 
say that files in a one-byte encoding would be read wrongly, and this is a 
different case. This bug causes a serious portability issue. In my case, this 
function was used to read a custom database that stored descriptions entered by 
users. Some descriptions were in UTF-8 encoding. The function just had to read 
whatever was between delimiter characters; it worked like that on Windows 
hosting and stopped working after moving to Unix hosting. Note that the file 
itself is not UTF-8 encoded and should not be. This is not related to locale. 
The function must read the data between delimiters, even if it's binary.

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=48507

