Edit report at https://bugs.php.net/bug.php?id=45356&edit=1
ID: 45356
Comment by: gtisza at gmail dot com
Reported by: al at txtlocal dot com
Summary: fgetcsv() £ symbol stripped if first char in cell
Status: No Feedback
Type: Bug
Package: Filesystem function related
Operating System: Linux
PHP Version: 5.2.6
Block user comment: N
Private report: N
New Comment:
fgetcsv() seems to throw the first character away if it is invalid in the
current locale, but leaves invalid characters untouched when they are not at
the beginning of a cell. This code reproduces the problem in PHP 5.3.6:
<?php
setlocale(LC_ALL,'C');
$utfchar = chr(0xC3).chr(0x89); // U+00C9 (É) in UTF-8
$csv = $utfchar."x".$utfchar."x\n";
file_put_contents('test.csv', $csv);
$file = fopen('test.csv', 'r');
$data = fgetcsv($file);
for ($i = 0; $i < strlen($data[0]); $i++) {
echo dechex(ord($data[0][$i])).' ';
}
echo "\n";
unlink('test.csv');
// expected: c3 89 78 c3 89 78 - "ÉxÉx"
// actual: 78 c3 89 78 - "xÉx"
?>
I agree with the commenter in bug 12127 that a CSV function should not mess
with encodings in the first place, just copy the content byte-by-byte.
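As a rough illustration of what "byte-by-byte" could mean, here is a minimal sketch of a locale-independent splitter for a single CSV line. csv_split_bytes() is a hypothetical helper, not part of PHP, and it assumes one record per line (no line breaks inside quoted fields) and doubled quotes as the only escape:

```php
<?php
// Hypothetical helper, not a PHP built-in: splits one CSV line into fields
// by comparing raw bytes only, so the current locale is never consulted.
function csv_split_bytes($line, $delim = ',', $quote = '"')
{
    $line = rtrim($line, "\r\n");
    $fields = array();
    $field = '';
    $inQuotes = false;
    $len = strlen($line);
    for ($i = 0; $i < $len; $i++) {
        $c = $line[$i];
        if ($inQuotes) {
            if ($c === $quote) {
                if ($i + 1 < $len && $line[$i + 1] === $quote) {
                    $field .= $quote; // doubled quote is a literal quote
                    $i++;
                } else {
                    $inQuotes = false; // closing quote
                }
            } else {
                $field .= $c; // any other byte is copied as-is
            }
        } elseif ($c === $quote) {
            $inQuotes = true; // opening quote
        } elseif ($c === $delim) {
            $fields[] = $field; // end of the current field
            $field = '';
        } else {
            $field .= $c;
        }
    }
    $fields[] = $field; // last field on the line
    return $fields;
}

// "£150" encoded as UTF-8 (C2 A3 31 35 30), as in the original report.
$row = csv_split_bytes("James,\xC2\xA3150\n");
echo bin2hex($row[1]), "\n"; // prints c2a3313530
```

Because only single bytes are compared against the delimiter and the quote, a leading multibyte character is kept regardless of setlocale().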
Previous Comments:
------------------------------------------------------------------------
[2008-09-08 22:06:42] sfschiller at gmail dot com
Based on the comment from [mk at kurznet dot com]:
changing the locale helps.
setlocale(LC_ALL,'de_DE.8859-1');
Setting the locale to a Unicode or UTF locale name loses the
first letters.
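A sketch of how such a locale switch could be scoped to the CSV read only, so the rest of the script keeps its locale. with_ctype_locale() is a hypothetical helper, the locale names are assumptions that must be installed on the system, and the sample bytes assume the file is ISO-8859-1 encoded:

```php
<?php
// Hypothetical helper: run $callback with LC_CTYPE temporarily switched to
// the first available candidate, then restore the previous setting.
// Note: the locale is not restored if $callback throws.
function with_ctype_locale(array $candidates, $callback)
{
    $previous = setlocale(LC_CTYPE, '0');        // "0" only queries the current value
    $applied = setlocale(LC_CTYPE, $candidates); // false if no candidate is installed
    $result = call_user_func($callback, $applied);
    setlocale(LC_CTYPE, $previous);
    return $result;
}

// Sample line from the report, written as ISO-8859-1 bytes (assumed encoding).
$line = "\xE4\xFC\xF6123\xE4\xFC\xF6;auo123\xE4\xFC\xF6\n";

$row = with_ctype_locale(array('de_DE.ISO-8859-1', 'de_DE'), function ($applied) use ($line) {
    $handle = fopen('php://memory', 'w+');
    fwrite($handle, $line);
    rewind($handle);
    $data = fgetcsv($handle, 4096, ';', '"', '\\');
    fclose($handle);
    return $data;
});

// Dump the first field as hex to see whether the leading bytes survived.
echo bin2hex($row[0]), "\n";
```

Whether the leading bytes survive depends on the PHP version and on which locales are available, so the hex dump should be checked rather than assumed.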
------------------------------------------------------------------------
[2008-09-08 19:04:43] mk at kurznet dot com
I have the same problem with PHP 5.2.6.
The CSV file looks like this: äüö123äüö;auo123äüö
$handle = fopen($path."Mappe3.csv","r");
while ($data = fgetcsv ($handle, 4096, ";")) {
print_r($data);
}
fclose ($handle);
Array
(
[0] => 123äüö
[1] => auo123äüö
)
With PHP 5.2.5 and 4.4.8 everything is OK.
Is this a bug or a feature?
------------------------------------------------------------------------
[2008-07-27 01:00:01] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
------------------------------------------------------------------------
[2008-07-19 17:50:18] [email protected]
Thank you for this bug report. To properly diagnose the problem, we
need a short but complete example script to be able to reproduce
this bug ourselves.
A proper reproducing script starts with <?php and ends with ?>,
is max. 10-20 lines long and does not require any external
resources such as databases, etc. If the script requires a
database to demonstrate the issue, please make sure it creates
all necessary tables, stored procedures etc.
Please avoid embedding huge scripts into the report.
I'm unable to reproduce it with a simple script, with either 5.2.6 or
5.3.0-dev.
------------------------------------------------------------------------
[2008-06-25 18:08:31] al at txtlocal dot com
If you have a CSV file:
name,price
James,£150
fgetcsv() will remove the £.
All other chars seem to be fine.
I have searched forums for an answer to this and there are a few people
reporting the same - but no definitive answer.
In addition, this happens only if the £ character is the first char in a
"cell". This would work fine:
name,price
James,1£50
------------------------------------------------------------------------
The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
https://bugs.php.net/bug.php?id=45356