ID:               45017
 User updated by:  nospam at nihonbunka dot com
 Reported By:      nospam at nihonbunka dot com
 Status:           Bogus
 Bug Type:         *Unicode Issues
 Operating System: BSD
 PHP Version:      5.2.6
 New Comment:

Given that there is no single-byte tilde in shift_jis, shouldn't
"//TRANSLIT" output the ASCII code (0x7E) all the same? 

//TRANSLIT seems to break at the tilde and not display either the ascii
code or the rest of the string. 

For some unknown reason, shift-jis encoded pages seem to display the
chr(126) (0x7E) character as a tilde and not an overbar, so the fact that
the tilde does not exist would not be a problem if it were TRANSLITted.
http://md2.cc.yamaguchi-u.ac.jp/~eigo/temp/tilde.php


Previous Comments:
------------------------------------------------------------------------

[2008-05-16 07:16:52] nospam at nihonbunka dot com

Hmm tilde is often displayed and used in Japan. How can this be? 
I have web pages such as that below, which I can type into and display
on a shift_jis encoded page

http://md2.cc.yamaguchi-u.ac.jp/~eigo/temp/tilde.php

The contents of this file are
<?php
// This is what we start off with
$string = 'https://md2.cc.yamaguchi-u.ac.jp/~eigo/temp/tilde.php';
echo ('this is what we start with = '.$string.'<BR />'); // print string at start

// Just to show that this is not working
$conv_str = iconv('utf-8','shift-jis'.'//TRANSLIT',$string);
echo ('this is not working = '.$conv_str.'<BR />');

// Modify before conversion: swap each tilde for an ASCII placeholder
$rstring = preg_replace ('/~/','1bytetilde',$string);
echo ('this is modified string here = '.$rstring.'<BR />'); // the modified string

// Convert
$conv_str2 = iconv('utf-8','shift-jis'.'//TRANSLIT',$rstring);
// $rereplace is a one byte tilde in shift_jis
$rereplace = chr(126);
// Re-replace the placeholders with tildes
$rerstring = preg_replace ('/1bytetilde/',$rereplace,$conv_str2);
echo ('this is the correct result = '.$rerstring.'<BR />'); // the correct result
?>
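
The placeholder workaround above could also be written without a
placeholder, which avoids a clash if the text already contains the word
'1bytetilde'. A minimal sketch, assuming it is acceptable to write byte
0x7E for every tilde; the function name utf8_to_sjis_keep_tilde is
hypothetical and not part of PHP:

```php
<?php
// Hypothetical helper, not a PHP built-in: converts UTF-8 text to shift-jis
// while passing every '~' through as the single byte 0x7E.
function utf8_to_sjis_keep_tilde($utf8)
{
    // Split on the tilde so iconv never sees it.
    $parts = explode('~', $utf8);
    foreach ($parts as $i => $part) {
        if ($part === '') {
            continue; // nothing to convert next to a tilde
        }
        $converted = iconv('utf-8', 'shift-jis//TRANSLIT', $part);
        if ($converted === false) {
            return false; // some other character could not be converted
        }
        $parts[$i] = $converted;
    }
    // Rejoin the converted pieces with the one-byte tilde, chr(126).
    return implode(chr(126), $parts);
}

echo utf8_to_sjis_keep_tilde('where are the (~) (~) tildes?');
```

Splitting and rejoining keeps the tilde handling out of iconv entirely,
so the result should not depend on how the local iconv library maps 0x7E.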

------------------------------------------------------------------------

[2008-05-16 06:12:33] [EMAIL PROTECTED]

That's because shift-jis doesn't support the (ASCII) tilde. (Unicode
char 0x7E):
http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P130-1999&ShowSet&s=ALL#ShowSet
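
One way to confirm which characters a given iconv build rejects for a
target charset is to convert them one at a time. A minimal sketch; the
helper name unconvertible_chars is hypothetical, and the result depends
on the iconv library PHP was built against:

```php
<?php
// Hypothetical helper, not a PHP built-in: returns the distinct characters of
// a UTF-8 string that iconv fails to convert to $to_charset.
function unconvertible_chars($utf8, $to_charset)
{
    $bad = array();
    // The /u modifier makes preg_split cut on UTF-8 characters, not bytes.
    $chars = preg_split('//u', $utf8, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return $bad; // input was not valid UTF-8
    }
    foreach ($chars as $ch) {
        // No //IGNORE or //TRANSLIT suffix, so a failure is reported as false.
        // @ suppresses the notice iconv raises for an unconvertible character.
        if (@iconv('utf-8', $to_charset, $ch) === false
            && !in_array($ch, $bad, true)) {
            $bad[] = $ch;
        }
    }
    return $bad;
}

print_r(unconvertible_chars('where are the (~) (~) tildes?', 'shift-jis'));
```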

------------------------------------------------------------------------

[2008-05-16 02:11:51] nospam at nihonbunka dot com

Description:
------------
iconv does not seem to convert single-byte tildes from utf8 to
shift_jis

Please bear in mind that shift_jis tildes are not where one would
expect them to be. 
http://en.wikipedia.org/wiki/Shift-JIS
"The single-byte characters 0x00 to 0x7F match the ASCII encoding,
except for a yen sign at 0x5C and an overline at 0x7E in place of the
ASCII character set's backslash and tilde respectively."
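
To see how a particular iconv build treats those two bytes, one could
decode them from shift_jis and print the resulting Unicode code points.
A minimal sketch; the helper name sjis_byte_to_codepoint is
hypothetical, and different iconv libraries may report either the ASCII
reading or the yen sign / overline reading:

```php
<?php
// Hypothetical helper, not a PHP built-in: decodes one single-byte shift_jis
// value with iconv and returns its code point as 'U+XXXX', or false on failure.
function sjis_byte_to_codepoint($byte)
{
    // UCS-4BE stores every character as a 4-byte big-endian code point.
    $ucs4 = @iconv('shift-jis', 'UCS-4BE', chr($byte));
    if ($ucs4 === false || strlen($ucs4) !== 4) {
        return false;
    }
    $unpacked = unpack('N', $ucs4); // 'N' = unsigned 32-bit big-endian
    return sprintf('U+%04X', $unpacked[1]);
}

foreach (array(0x5C, 0x7E) as $byte) {
    printf("0x%02X -> %s\n", $byte, sjis_byte_to_codepoint($byte));
}
```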

If I use //IGNORE then the tildes just disappear. If I use //TRANSLIT
the bug is even worse: everything in the string from the first ~
onward disappears.

There was also a double byte tilde problem in the past, but this is
different. 



Reproduce code:
---------------
<?PHP 
$conv_str = iconv('utf-8','shift-jis'.'//IGNORE','where are the (~) (~) tildes?'); 
echo ($conv_str);
?>



Expected result:
----------------
where are the (~) (~) tildes?

the above in shift_jis, using //IGNORE



Actual result:
--------------
where are the () () tildes?

using //IGNORE and

where are the (

using //TRANSLIT


------------------------------------------------------------------------

