ID:               18169
 Comment by:       [EMAIL PROTECTED]
 Reported By:      [EMAIL PROTECTED]
 Status:           Analyzed
 Bug Type:         MSSQL related
 Operating System: Windows 2000 Server
 PHP Version:      4.1.2
 New Comment:

If you're using PHP on a Windows platform you can use the PHP COM
extension to communicate with SQL Server via ADO.  The PHP COM
extension is capable of translating UTF-8 to UCS-2 and back if you
specify so as the third parameter:

  $oDb = new COM('ADODB.Connection', NULL, CP_UTF8);

This way you can use Unicode UTF-8 within PHP and Unicode UCS-2 within
SQL Server with all the translations done for you automatically.
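
A minimal sketch of how such a connection might be used for an insert,
assuming the COM extension is loaded on Windows. The helpers sql_nquote()
and insert_text() and the $connStr parameter are hypothetical names made up
for illustration; the table testtbl and column tekst come from the original
report further down this thread.

```php
<?php
// Hypothetical helper: wrap a UTF-8 PHP string as a T-SQL Unicode literal.
function sql_nquote($utf8)
{
    // T-SQL escapes a single quote inside a literal by doubling it.
    return "N'" . str_replace("'", "''", $utf8) . "'";
}

// Hypothetical helper: insert one UTF-8 string through ADO.
// $connStr is an ADO connection string for your own server (not shown).
function insert_text($connStr, $utf8)
{
    // CP_UTF8 asks the COM extension to translate UTF-8 <-> UCS-2.
    $oDb = new COM('ADODB.Connection', NULL, CP_UTF8);
    $oDb->Open($connStr);
    $oDb->Execute('insert into testtbl (tekst) values ('
        . sql_nquote($utf8) . ')');
    $oDb->Close();
}

// The quoting step can be checked without a database:
echo sql_nquote("O'Brien"), "\n";
```

Only the quoting helper runs without a server; insert_text() is untested
here and would need a real connection string and the COM extension.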

HTH, Freddy Vulto


Previous Comments:
------------------------------------------------------------------------

[2002-07-06 07:08:48] [EMAIL PROTECTED]

Thanks Marko

-I guess this means that if you are to use binary (ie. unicode) data,
then COM/ADO is your only option, if SQL Server is the database of your
choice.

From yohgaki's answer, I guess also the multibyte encoding
functionality lacks proper Unicode support -am I correct in assuming
that we will have to move to PHP4.2.x and do our own encoding/decoding
through the Win32 API then?

------------------------------------------------------------------------

[2002-07-05 05:34:22] [EMAIL PROTECTED]

PHP's mssql extension uses the Microsoft SQL Server's C 
API, the "DB-Library for C". Specifically, SQL queries are 
sent to the server using the dbcmd() function. This 
function is not binary safe, so inserting UCS2 text or 
images or any binary data is likely to fail.

The DB-Library for C has separate, binary-safe APIs for 
entering text and images, but they are complicated and 
difficult to seamlessly integrate to the current mssql 
extension. Look up the documentation for dbwritetext() if 
you feel like implementing this change.

UTF-8 and UTF-7 are, IIRC, the only Unicode encodings that 
are guaranteed not to include null bytes. They are, 
therefore, the only encodings that can be reliably used 
with PHP's mssql extension at this time.
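
A small sketch of the null-byte problem described above, assuming the
mbstring extension is loaded. c_string_view() is a hypothetical stand-in
for any API that treats its input as a NUL-terminated C string; it is not
part of DB-Library or of the mssql extension.

```php
<?php
// Hypothetical stand-in for a non-binary-safe C API:
// keep only the bytes before the first NUL.
function c_string_view($bytes)
{
    $pos = strpos($bytes, "\0");
    return $pos === false ? $bytes : substr($bytes, 0, $pos);
}

$utf8 = "abc";
// UCS-2LE stores each character in two bytes: 61 00 62 00 63 00
$ucs2 = mb_convert_encoding($utf8, 'UCS-2LE', 'UTF-8');

echo strlen($ucs2), "\n";                // 6 bytes were produced
echo strlen(c_string_view($ucs2)), "\n"; // 1: cut at the first NUL
echo strlen(c_string_view($utf8)), "\n"; // 3: no NUL bytes in UTF-8 text
```

The UCS-2 string is cut after its first byte, while the UTF-8 string
passes through unchanged.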

------------------------------------------------------------------------

[2002-07-05 04:21:52] [EMAIL PROTECTED]

You are probably right. However, Unicode is central to making
world-wide web applications, and all major database vendors have this
possibility.
I find it to be a hindrance to wider deployment of large-scale,
worldwide php applications.

Does anyone know if it is only the MSSQL module? -are there any plans
to look into this issue?

What are the future directions for PHP and Unicode support?

------------------------------------------------------------------------

[2002-07-05 04:14:38] [EMAIL PROTECTED]

Wide char encodings (UCS2/UCS4/UTF16/UTF32) don't work well with current
PHP. I guess the SQL Server module is using strlen() or the like, which
cannot be used with wide chars...

Fixing this is not simple at all. 


------------------------------------------------------------------------

[2002-07-04 18:10:24] [EMAIL PROTECTED]

I have a problem converting UTF-8 (web character encoding) to UCS2
(Microsoft Windows character encoding) using PHP, and storing this in
the Microsoft SQL Server 2000 database.

My setup is:
Windows 2000 Server, with Apache 1.3.24/PHP 4.1.1 and Microsoft SQL
Server 2000

Now, as a result of Microsoft's Q232580, I will have to do conversion
between UTF-8 and UCS-2. For this, I thought I would use the Multibyte
String functions.
However, this does not seem to work.

I am absolutely sure, that I input UTF-8 encoded data into my string,
and then I do:
$ucs2string=mb_convert_encoding($string,"UCS2","UTF-8");
$sqlStmt="insert into testtbl (tekst) values(N'".($ucs2string)."')";
$rs=$DBCon->Execute($sqlStmt);

When I query the database, what is stored does not resemble the input
at all (most of the time I see Japanese/Chinese characters?!).
Furthermore, the insert sometimes fails with an error and consequently
stores nothing.
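
One possible contributor to CJK-looking output is a byte-order mismatch;
this is an assumption, not a confirmed diagnosis of this bug. The sketch
below, assuming mbstring is loaded, shows that if the converter emits
big-endian UCS-2 and the bytes are later read as little-endian, a plain
Latin letter lands in the CJK range.

```php
<?php
$be = mb_convert_encoding('a', 'UCS-2BE', 'UTF-8'); // bytes 00 61
$le = mb_convert_encoding('a', 'UCS-2LE', 'UTF-8'); // bytes 61 00
echo bin2hex($be), ' ', bin2hex($le), "\n";         // 0061 6100

// Read the big-endian bytes as if they were little-endian:
// 00 61 becomes code point U+6100, a CJK ideograph.
$misread = mb_convert_encoding($be, 'UTF-8', 'UCS-2LE');
echo bin2hex($misread), "\n";                       // e68480 (U+6100)
```

Naming the byte order explicitly (UCS-2LE or UCS-2BE) rather than plain
UCS-2 removes that ambiguity, although it does not address the null-byte
issue.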

To me, it seems like either one of these (or both) are flawed:
1. the Multibyte String encoding function does not work properly (ie.
encoding from UTF-8 to UCS-2 does not happen correctly).
2. The PHP MSSQL driver does not handle unicode data properly, even
though the target column in the database is specified as Unicode and N
is prepended to the string before insert.

This leads me to use ADO (as in the example above), storing UTF-8
encoded data into SQL Server -this is a very short term solution, as
data are not sortable in the database (some of it looks like garbage
because of the missing encoding).


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=18169&edit=1
