RE: Java, SQL, Unicode and Databases

2000-06-23 Thread Michael Kaplan (Trigeminal Inc.)

The datatype *does* matter in that sense: you would use UTF-16 data
fields (NTEXT, NCHAR, and NVARCHAR) and access them with your favorite data
access method, which will convert as needed to whatever format IS uses. You
will never know or care what the underlying engine stores.

The web site stuff will not work for you since you would have to do the
extra conversions to do the data mining, so you would probably go with plan
"A".

My general point is that OLE DB to an Oracle UTF-8 field and to a SQL Server
UTF-16 field both return the same type of data: UTF-16. So COM in this
case is hiding the differences.

Michael

> --
> From: [EMAIL PROTECTED][SMTP:[EMAIL PROTECTED]]
> Sent: Friday, June 23, 2000 2:27 PM
> To:   Michael Kaplan (Trigeminal Inc.)
> Cc:   Unicode List; [EMAIL PROTECTED]
> Subject:      RE: Java, SQL, Unicode and Databases
> 
> 
> 
> Michael, are you saying that the data type (char or nchar) doesn't matter?
> Are you saying that if we just use UTF-16 or wchar_t interfaces to access
> the data all will be fine and we will be able to store multilingual data
> even in fields defined as char? Maybe things aren't as bad as I feared.
> 
> With respect to the web applications you describe, do they store the UTF-8
> as binary data? This wouldn't work for us, since we want other data mining
> applications to be able to access the same data.
> 
> Thanks,
> Joe
> 
> "Michael Kaplan (Trigeminal Inc.)" <[EMAIL PROTECTED]> on 06/23/2000
> 10:41:39 AM
> 
> To:   Unicode List <[EMAIL PROTECTED]>, Joe Ross/Tivoli Systems@Tivoli
> Systems
> cc:   Hossein Kushki@IBMCA
> Subject:  RE: Java, SQL, Unicode and Databases
> 
> 
> 
> 
> Microsoft is very COM-based for its actual data access methods and COM
> uses BSTRs that are BOM-less UTF-16. Because of that, the actual storage
> format of any database ends up irrelevant since it will be converted to
> UTF-16 anyway.
> 
> Given that this is what the data layers do, performance is certainly
> better if there does not have to be an extra call to the Windows
> MultiByteToWideChar to convert UTF-8 to UTF-16. So from a Windows
> perspective, not only is it no trouble, but it is also the best possible
> solution!
> 
> In any case, I know plenty of web people who *do* encode their strings in
> SQL Server databases as UTF-8 for web applications, since UTF-8 is their
> preference. They are willing to take the hit of "converting themselves"
> because when data is being read it is faster to go through no conversions
> at all.
> 
> Michael
> 
> > --
> > From:   [EMAIL PROTECTED][SMTP:[EMAIL PROTECTED]]
> > Sent:   Friday, June 23, 2000 7:55 AM
> > To: Unicode List
> > Cc: Unicode List; [EMAIL PROTECTED]
> > Subject: Re: Java, SQL, Unicode and Databases
> >
> >
> >
> > I think that this is also true for DB2 using UTF-8 as the database
> > encoding. From an application perspective, MS SQL Server is the one that
> > gives us the most trouble, because it doesn't support UTF-8 as a database
> > encoding for char, etc.
> > Joe
> >
> > Kenneth Whistler <[EMAIL PROTECTED]> on 06/22/2000 06:42:20 PM
> >
> > To:   "Unicode List" <[EMAIL PROTECTED]>
> > cc:   [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED] (bcc: Joe
> > Ross/Tivoli
> >   Systems)
> > Subject:  Re: Java, SQL, Unicode and Databases
> >
> >
> >
> >
> > Jianping responded:
> >
> > >
> > > Tex,
> > >
> > > Oracle doesn't have special requirement for datatype in JDBC driver if
> > > you use UTF8 as database character set. In this case, all the text
> > > datatype in JDBC will support Unicode data.
> > >
> >
> > The same thing is, of course, true for Sybase databases using UTF-8
> > as the database character set, accessing them through a JDBC driver.
> >
> > But I think Tex's question is aimed at the much murkier area
> > of what the various database vendors' strategies are for dealing
> > with UTF-16 Unicode as a datatype. In that area, the answers for
> > what a cross-platform application vendor needs to do and for how
> > JDBC drivers might abstract differences in database implementations
> > are still unclear.
> >
> > --Ken
> >
> >
> >
> 
> 
> 



RE: Java, SQL, Unicode and Databases

2000-06-23 Thread Joe_Ross



Michael, are you saying that the data type (char or nchar) doesn't matter? Are
you saying that if we just use UTF-16 or wchar_t interfaces to access the data
all will be fine and we will be able to store multilingual data even in fields
defined as char? Maybe things aren't as bad as I feared.

With respect to the web applications you describe, do they store the UTF-8 as
binary data? This wouldn't work for us, since we want other data mining
applications to be able to access the same data.

Thanks,
Joe

"Michael Kaplan (Trigeminal Inc.)" <[EMAIL PROTECTED]> on 06/23/2000
10:41:39 AM

To:   Unicode List <[EMAIL PROTECTED]>, Joe Ross/Tivoli Systems@Tivoli Systems
cc:   Hossein Kushki@IBMCA
Subject:  RE: Java, SQL, Unicode and Databases




Microsoft is very COM-based for its actual data access methods and COM
uses BSTRs that are BOM-less UTF-16. Because of that, the actual storage
format of any database ends up irrelevant since it will be converted to
UTF-16 anyway.

Given that this is what the data layers do, performance is certainly better
if there does not have to be an extra call to the Windows
MultiByteToWideChar to convert UTF-8 to UTF-16. So from a Windows
perspective, not only is it no trouble, but it is also the best possible
solution!

In any case, I know plenty of web people who *do* encode their strings in
SQL Server databases as UTF-8 for web applications, since UTF-8 is their
preference. They are willing to take the hit of "converting themselves"
because when data is being read it is faster to go through no conversions at
all.

Michael

> --
> From:   [EMAIL PROTECTED][SMTP:[EMAIL PROTECTED]]
> Sent:   Friday, June 23, 2000 7:55 AM
> To: Unicode List
> Cc:     Unicode List; [EMAIL PROTECTED]
> Subject: Re: Java, SQL, Unicode and Databases
>
>
>
> I think that this is also true for DB2 using UTF-8 as the database
> encoding. From an application perspective, MS SQL Server is the one that
> gives us the most trouble, because it doesn't support UTF-8 as a database
> encoding for char, etc.
> Joe
>
> Kenneth Whistler <[EMAIL PROTECTED]> on 06/22/2000 06:42:20 PM
>
> To:   "Unicode List" <[EMAIL PROTECTED]>
> cc:   [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED] (bcc: Joe
> Ross/Tivoli
>   Systems)
> Subject:  Re: Java, SQL, Unicode and Databases
>
>
>
>
> Jianping responded:
>
> >
> > Tex,
> >
> > Oracle doesn't have special requirement for datatype in JDBC driver if
> > you use UTF8 as database character set. In this case, all the text
> > datatype in JDBC will support Unicode data.
> >
>
> The same thing is, of course, true for Sybase databases using UTF-8
> as the database character set, accessing them through a JDBC driver.
>
> But I think Tex's question is aimed at the much murkier area
> of what the various database vendors' strategies are for dealing
> with UTF-16 Unicode as a datatype. In that area, the answers for
> what a cross-platform application vendor needs to do and for how
> JDBC drivers might abstract differences in database implementations
> are still unclear.
>
> --Ken
>
>
>






Re: Java, SQL, Unicode and Databases

2000-06-23 Thread Joe_Ross



Yes, version 7. It requires us to use a different data type (nchar) if we want
to store multilingual text as UTF-16. We want our applications to be database
vendor independent so that customers can use any database under the covers. If
all databases supported UTF-8 as an encoding for char, we could support
multilingual data in the same way for all vendors. As it is, we have to use a
different schema for MS SQL Server than we do for the others.
Joe
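
The schema split described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class and method names are hypothetical, the vendor strings are placeholders for whatever product name the application detects, and the rule (NVARCHAR for MS SQL Server, VARCHAR for databases configured with a UTF-8 character set) simply restates the thread's description of SQL Server 7 rather than any vendor documentation.

```java
import java.util.Locale;

// Hypothetical helper: picks the column type for Unicode text per vendor.
class UnicodeColumnType {

    // Assumption from the thread: SQL Server needs the n-prefixed types for
    // Unicode text, while the others accept VARCHAR when the database
    // character set is UTF-8.
    static String textColumn(String vendor, int length) {
        String v = vendor.toLowerCase(Locale.ROOT);
        if (v.contains("sql server")) {
            return "NVARCHAR(" + length + ")";
        }
        return "VARCHAR(" + length + ")";
    }

    // Builds the DDL for an illustrative table named "notes".
    static String createTable(String vendor) {
        return "CREATE TABLE notes (id INTEGER PRIMARY KEY, body "
                + textColumn(vendor, 200) + ")";
    }

    public static void main(String[] args) {
        System.out.println(createTable("Microsoft SQL Server"));
        System.out.println(createTable("Oracle"));
    }
}
```

Keeping the vendor decision in one place like this confines the difference to table creation; the statements that read and write the column can stay the same.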


"Tex Texin" <[EMAIL PROTECTED]> on 06/23/2000 11:50:06 AM

To:   Joe Ross/Tivoli Systems@Tivoli Systems
cc:   Unicode List <[EMAIL PROTECTED]>, Hossein Kushki@IBMCA, Vladimir Dvorkin
  <[EMAIL PROTECTED]>, Steven Watt <[EMAIL PROTECTED]>
Subject:  Re: Java, SQL, Unicode and Databases




Joe,

Can you expand on this a bit more? Privately if you prefer.
Do you mean version 7 of MS SQL Server?

I assume if it doesn't have UTF-8, it uses UTF-16. How does this,
being the storage encoding, become problematic?
tex


[EMAIL PROTECTED] wrote:
>
> I think that this is also true for DB2 using UTF-8 as the database encoding.
> From an application perspective, MS SQL Server is the one that gives us the
> most trouble, because it doesn't support UTF-8 as a database encoding for
> char, etc.
> Joe
>
> Kenneth Whistler <[EMAIL PROTECTED]> on 06/22/2000 06:42:20 PM
>
> To:   "Unicode List" <[EMAIL PROTECTED]>
> cc:   [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED] (bcc: Joe
>   Ross/Tivoli Systems)
> Subject:  Re: Java, SQL, Unicode and Databases
>
> Jianping responded:
>
> >
> > Tex,
> >
> > Oracle doesn't have special requirement for datatype in JDBC driver if you
> > use UTF8 as database character set. In this case, all the text datatype in
> > JDBC will support Unicode data.
> >
>
> The same thing is, of course, true for Sybase databases using UTF-8
> as the database character set, accessing them through a JDBC driver.
>
> But I think Tex's question is aimed at the much murkier area
> of what the various database vendors' strategies are for dealing
> with UTF-16 Unicode as a datatype. In that area, the answers for
> what a cross-platform application vendor needs to do and for how
> JDBC drivers might abstract differences in database implementations
> are still unclear.
>
> --Ken

--


Tex Texin                     Director, International Products

Progress Software Corp.       +1-781-280-4271
14 Oak Park                   +1-781-280-4655 (Fax)
Bedford, MA 01730  USA        [EMAIL PROTECTED]

http://www.progress.com       The #1 Embedded Database
http://www.SonicMQ.com        JMS Compliant Messaging - Best Middleware Award
http://www.aspconnections.com Leading provider in the ASP marketplace

Progress Globalization Program (New URL)
http://www.progress.com/partners/globalization.htm


Come to the Panel on Open Source Approaches to Unicode Libraries at
the Sept. Unicode Conference
http://www.unicode.org/iuc/iuc17






Re: Java, SQL, Unicode and Databases

2000-06-23 Thread Tex Texin

Michael,
Thanks for this.
Are there any programming adjustments needed when
accessing MS SQL Server to store/retrieve UTF-16 using SQL, Java, and
JDBC?

(If the MS JDBC driver goes thru COM to the database, I didn't
know that.)

Tex

"Michael Kaplan (Trigeminal Inc.)" wrote:
> 
> Microsoft is very COM-based for its actual data access methods and COM
> uses BSTRs that are BOM-less UTF-16. Because of that, the actual storage
> format of any database ends up irrelevant since it will be converted to
> UTF-16 anyway.
> 
> Given that this is what the data layers do, performance is certainly better
> if there does not have to be an extra call to the Windows
> MultiByteToWideChar to convert UTF-8 to UTF-16. So from a Windows
> perspective, not only is it no trouble, but it is also the best possible
> solution!
> 
> In any case, I know plenty of web people who *do* encode their strings in
> SQL Server databases as UTF-8 for web applications, since UTF-8 is their
> preference. They are willing to take the hit of "converting themselves"
> because when data is being read it is faster to go through no conversions at
> all.
> 
> Michael
> 
> > --
> > From: [EMAIL PROTECTED][SMTP:[EMAIL PROTECTED]]
> > Sent: Friday, June 23, 2000 7:55 AM
> > To:   Unicode List
> > Cc:   Unicode List; [EMAIL PROTECTED]
> > Subject:  Re: Java, SQL, Unicode and Databases
> >
> >
> >
> > I think that this is also true for DB2 using UTF-8 as the database
> > encoding. From an application perspective, MS SQL Server is the one that
> > gives us the most trouble, because it doesn't support UTF-8 as a database
> > encoding for char, etc.
> > Joe
> >
> > Kenneth Whistler <[EMAIL PROTECTED]> on 06/22/2000 06:42:20 PM
> >
> > To:   "Unicode List" <[EMAIL PROTECTED]>
> > cc:   [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED] (bcc: Joe
> > Ross/Tivoli
> >   Systems)
> > Subject:  Re: Java, SQL, Unicode and Databases
> >
> >
> >
> >
> > Jianping responded:
> >
> > >
> > > Tex,
> > >
> > > Oracle doesn't have special requirement for datatype in JDBC driver if
> > > you use UTF8 as database character set. In this case, all the text
> > > datatype in JDBC will support Unicode data.
> > >
> >
> > The same thing is, of course, true for Sybase databases using UTF-8
> > as the database character set, accessing them through a JDBC driver.
> >
> > But I think Tex's question is aimed at the much murkier area
> > of what the various database vendors' strategies are for dealing
> > with UTF-16 Unicode as a datatype. In that area, the answers for
> > what a cross-platform application vendor needs to do and for how
> > JDBC drivers might abstract differences in database implementations
> > are still unclear.
> >
> > --Ken
> >
> >
> >




Re: Java, SQL, Unicode and Databases

2000-06-23 Thread Tex Texin

Joe, 

Can you expand on this a bit more? Privately if you prefer.
Do you mean version 7 of MS SQL Server?

I assume if it doesn't have UTF-8, it uses UTF-16. How does this,
being the storage encoding, become problematic?
tex


[EMAIL PROTECTED] wrote:
> 
> I think that this is also true for DB2 using UTF-8 as the database encoding.
> From an application perspective, MS SQL Server is the one that gives us the most
> trouble, because it doesn't support UTF-8 as a database encoding for char, etc.
> Joe
> 
> Kenneth Whistler <[EMAIL PROTECTED]> on 06/22/2000 06:42:20 PM
> 
> To:   "Unicode List" <[EMAIL PROTECTED]>
> cc:   [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED] (bcc: Joe Ross/Tivoli
>   Systems)
> Subject:  Re: Java, SQL, Unicode and Databases
> 
> Jianping responded:
> 
> >
> > Tex,
> >
> > Oracle doesn't have special requirement for datatype in JDBC driver if you
> > use UTF8 as database character set. In this case, all the text datatype in
> > JDBC will support Unicode data.
> >
> 
> The same thing is, of course, true for Sybase databases using UTF-8
> as the database character set, accessing them through a JDBC driver.
> 
> But I think Tex's question is aimed at the much murkier area
> of what the various database vendors' strategies are for dealing
> with UTF-16 Unicode as a datatype. In that area, the answers for
> what a cross-platform application vendor needs to do and for how
> JDBC drivers might abstract differences in database implementations
> are still unclear.
> 
> --Ken




RE: Java, SQL, Unicode and Databases

2000-06-23 Thread Michael Kaplan (Trigeminal Inc.)

Microsoft is very COM-based for its actual data access methods and COM
uses BSTRs that are BOM-less UTF-16. Because of that, the actual storage
format of any database ends up irrelevant since it will be converted to
UTF-16 anyway.

Given that this is what the data layers do, performance is certainly better
if there does not have to be an extra call to the Windows
MultiByteToWideChar to convert UTF-8 to UTF-16. So from a Windows
perspective, not only is it no trouble, but it is also the best possible
solution!

In any case, I know plenty of web people who *do* encode their strings in
SQL Server databases as UTF-8 for web applications, since UTF-8 is their
preference. They are willing to take the hit of "converting themselves"
because when data is being read it is faster to go through no conversions at
all.

Michael
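
The conversion step discussed above has a direct Java analogue, since a Java String is itself a sequence of UTF-16 code units. The sketch below is an illustration only, not what any COM layer or JDBC driver does internally: the class and method names are hypothetical, and it uses the standard java.nio.charset API from current Java to show the decode/encode hop that an application pays for when it stores UTF-8 bytes itself.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the UTF-8 <-> UTF-16 conversion step.
class EncodingRoundTrip {

    // Decode UTF-8 bytes (as an application might have stored them) into a
    // Java String, which holds UTF-16 code units.
    static String fromUtf8(byte[] stored) {
        return new String(stored, StandardCharsets.UTF_8);
    }

    // Encode a Java String back into UTF-8 bytes for storage.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample text with Latin-1 and CJK characters, written as escapes.
        String original = "Gr\u00FC\u00DFe \u65E5\u672C\u8A9E";
        byte[] stored = toUtf8(original);
        String decoded = fromUtf8(stored);
        System.out.println("round trip ok: " + decoded.equals(original));
        System.out.println(stored.length + " UTF-8 bytes, "
                + decoded.length() + " UTF-16 code units");
    }
}
```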

> --
> From: [EMAIL PROTECTED][SMTP:[EMAIL PROTECTED]]
> Sent: Friday, June 23, 2000 7:55 AM
> To:   Unicode List
> Cc:   Unicode List; [EMAIL PROTECTED]
> Subject:      Re: Java, SQL, Unicode and Databases
> 
> 
> 
> I think that this is also true for DB2 using UTF-8 as the database
> encoding. From an application perspective, MS SQL Server is the one that
> gives us the most trouble, because it doesn't support UTF-8 as a database
> encoding for char, etc.
> Joe
> 
> Kenneth Whistler <[EMAIL PROTECTED]> on 06/22/2000 06:42:20 PM
> 
> To:   "Unicode List" <[EMAIL PROTECTED]>
> cc:   [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED] (bcc: Joe
> Ross/Tivoli
>   Systems)
> Subject:  Re: Java, SQL, Unicode and Databases
> 
> 
> 
> 
> Jianping responded:
> 
> >
> > Tex,
> >
> > Oracle doesn't have special requirement for datatype in JDBC driver if
> > you use UTF8 as database character set. In this case, all the text
> > datatype in JDBC will support Unicode data.
> >
> 
> The same thing is, of course, true for Sybase databases using UTF-8
> as the database character set, accessing them through a JDBC driver.
> 
> But I think Tex's question is aimed at the much murkier area
> of what the various database vendors' strategies are for dealing
> with UTF-16 Unicode as a datatype. In that area, the answers for
> what a cross-platform application vendor needs to do and for how
> JDBC drivers might abstract differences in database implementations
> are still unclear.
> 
> --Ken
> 
> 
> 



Re: Java, SQL, Unicode and Databases

2000-06-23 Thread Joe_Ross



I think that this is also true for DB2 using UTF-8 as the database encoding.
From an application perspective, MS SQL Server is the one that gives us the most
trouble, because it doesn't support UTF-8 as a database encoding for char, etc.
Joe

Kenneth Whistler <[EMAIL PROTECTED]> on 06/22/2000 06:42:20 PM

To:   "Unicode List" <[EMAIL PROTECTED]>
cc:   [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED] (bcc: Joe Ross/Tivoli
      Systems)
Subject:  Re: Java, SQL, Unicode and Databases




Jianping responded:

>
> Tex,
>
> Oracle doesn't have special requirement for datatype in JDBC driver if you use
> UTF8 as database character set. In this case, all the text datatype in JDBC
> will support Unicode data.
>

The same thing is, of course, true for Sybase databases using UTF-8
as the database character set, accessing them through a JDBC driver.

But I think Tex's question is aimed at the much murkier area
of what the various database vendors' strategies are for dealing
with UTF-16 Unicode as a datatype. In that area, the answers for
what a cross-platform application vendor needs to do and for how
JDBC drivers might abstract differences in database implementations
are still unclear.

--Ken






Re: Java, SQL, Unicode and Databases

2000-06-23 Thread Tex Texin

Addison, thanks for this. Good points.
I am sure if we bear down on it, there can be many more than 2
problems. JDBC driver differences will be a third.
We went thru similar issues programming for
double-byte databases a few years back. At least with
Unicode, we are doing this for the last time. ;-)

What prompted the question was some allegations about
constraints on datatypes and resulting reprogramming that
would be required. I am not sure I believe them so
I don't want to repeat them. I was glad to see the comments
on Oracle and Sybase, I hope we will hear from some others.

Ken is right about UTF-16 being murkier, but not too many
databases are there yet.

Tex


Addison Phillips wrote:
> 
> I dunno, Tex, sounds like two problems to me.
> 
> 1. How do I configure all these different databases to support Unicode
> (transparently to my app)?
> 2. How do I write my program to store/retrieve Unicode (independent of
> database)?
> 
> When generating SQL statements, there are relatively few differences in the
> actual statements. It's the database schema that has to be modified to take
> advantage of Unicode in most instances that I've dealt with (since your SQL
> statement will be generated from class String and won't specify a datatype
> explicitly-- the JDBC driver "knows" if the column is fixed width and is
> supposed to handle trimming or blank padding for you). Otherwise, a SQL
> statement is basically a big string, and doesn't specify the explicit datatype.
> 
> In fact, the JDBC 1.2 spec says "There is no need for Java programmers to
> distinguish among the three different flavors of SQL strings CHAR, VARCHAR, and
> LONGVARCHAR. These can all be expressed identically in Java. It is possible to
> read and write SQL correctly without needing to know the exact data type that
> was expected..."
> 
> Of course, if your program is going to generate tables, you will have to know
> the specifics of the schema, and this varies by manufacturer.
> 
> There are other little configuration tweaks you may have to master, also
> (again, not in your Java code directly). With Oracle, you do have to set the
> NLS_LANG parameter appropriately to get the JDBC driver to generate UTF-8 SQL
> statements (and receive Unicode back from the database). In addition, you have
> to modify (better, create) your Oracle instance to use UTF-8 as the database
> character set.
> 
> The use of nchar types is database dependent. Oracle doesn't require/use nchar
> types to store Unicode. Some other databases do. The Transact-SQL 7.x
> documentation maintains that MS SQL Server still requires nchar/nvarchar
> datatypes to store Unicode data. I haven't fooled much with MS-SQL in awhile,
> so it could be true (but it sounds dubious, doesn't it?). In any case, it
> shouldn't make a difference in your Java code. It'll be at database
> configuration time that you have to decide.
> 
> As usual, the "real" problem with storing Unicode may not be with the character
> set anyway. It's with things like collation sequence (Oracle, for example,
> allows only a single locale collation sequence at one time), normalization (most
> databases don't), and data expansion (does varchar 50 mean 50 characters or 50
> bytes? if it's 50 bytes, how many are enough for *your* data at worst case
> expansion? is worst case expansion still 3 bytes per character given the Outer
> Planes? will your customer accept that?). Your Java code will have to make up
> for the idiosyncrasies of your database with regard to these (locale-related)
> factors.
> 
> Best Regards,
> 
> Addison
> 
> Addison P. Phillips
> Principal Consultant
> Inter-Locale, LLC
> Globalization Engineering & Consulting Services
> 
> +1 408.210.3569 (mobile)  +1 408.904.4762 (fax)
> mailto: [EMAIL PROTECTED]   http://www.inter-locale.com
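
Addison's data-expansion question (does varchar 50 mean 50 characters or 50 bytes?) can be checked from the Java side before a value is sent to the database. The following is a minimal sketch, assuming the column limit is counted in bytes of standard UTF-8; the class and method names are hypothetical. In standard UTF-8 a character outside the Basic Multilingual Plane takes 4 bytes, while in a Java String it occupies two UTF-16 code units.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for sizing text against a byte-limited column.
class ColumnBudget {

    // Number of bytes the text occupies when encoded as standard UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of Unicode code points, counting a surrogate pair as one.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // True if the text fits a column whose limit is expressed in bytes.
    static boolean fitsInBytes(String s, int byteLimit) {
        return utf8Bytes(s) <= byteLimit;
    }

    public static void main(String[] args) {
        String[] samples = {
            "abc",            // ASCII
            "\u65E5\u672C",   // two CJK characters
            "\uD800\uDF48"    // one supplementary character, U+10348
        };
        for (String s : samples) {
            System.out.println(codePoints(s) + " code points, "
                    + s.length() + " UTF-16 units, "
                    + utf8Bytes(s) + " UTF-8 bytes");
        }
    }
}
```

A check like fitsInBytes only helps when the database really counts bytes; where the limit is in characters, the code point count is the closer measure, and which one applies is a per-vendor detail.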




Re: Java, SQL, Unicode and Databases

2000-06-22 Thread Kenneth Whistler

Jianping responded:

> 
> Tex,
> 
> Oracle doesn't have special requirement for datatype in JDBC driver if you use
> UTF8 as database character set. In this case, all the text datatype in JDBC
> will support Unicode data.
> 

The same thing is, of course, true for Sybase databases using UTF-8
as the database character set, accessing them through a JDBC driver.

But I think Tex's question is aimed at the much murkier area
of what the various database vendors' strategies are for dealing
with UTF-16 Unicode as a datatype. In that area, the answers for
what a cross-platform application vendor needs to do and for how
JDBC drivers might abstract differences in database implementations
are still unclear.

--Ken



Re: Java, SQL, Unicode and Databases

2000-06-22 Thread Jianping Yang

Tex,

Oracle doesn't have any special datatype requirement in the JDBC driver if you use
UTF8 as the database character set. In this case, all the text datatypes in JDBC will
support Unicode data.

Regards,
Jianping.

Tex Texin wrote:

> I want to write an application in Java that will store information
> in a database using Unicode. Ideally the application will run
> with any database that supports Unicode. One would presume that the
> JDBC driver would take care of any differences between databases
> so my application could be independent of database.
> (OK, I know it is a naive view.)
>
> However, I am hearing that databases from different vendors require
> use of different datatypes or limit you to using certain datatypes
> if you want to store Unicode. Changing datatypes would I presume make
> a significant difference in my programming of the application...
>
> So, I want to make a list of the changes I need to make to
> my Java, SQL application in the event I want to
> support each of the major databases (Oracle 8I, MS SQL Server 7,
> etc.) with respect to Unicode data storage.
>
> (I am sure there are other differences programming to different
> databases, independent of Unicode data, but those issues are
> understood.)
>
> So, if you can help me by identifying specific changes you would make
> to query or update a major vendor's database with respect to Unicode
> support, I would be very appreciative. If I get a good list, I'll
> post it back here. I am most interested in Oracle and MS SQL Server,
> but will collect info on any database.
>
> As an example, I am hearing that some databases would require varchar,
> others nchar, for Unicode data.
>
> tex
>

