[Phpgroupware-developers] Re: utf-8 vs iso-8859-1

Sigurd Nes Fri, 24 Feb 2006 12:40:11 -0800

> From: Dave Hall [EMAIL PROTECTED]
> Sent: 2006-02-23 10:03:12 CET
> To: Sigurd Nes [EMAIL PROTECTED]
> Cc: phpGroupWare Developers [EMAIL PROTECTED]
> Subject: Re: utf-8 vs iso-8859-1
> 
> On Thu, 2006-02-23 at 09:08 +0100, Sigurd Nes wrote:
> > I'm reposting this off the list...
> > 
> 
> No problem, I have CC'd my response there.  The lag is frustrating.
> 
> > Sigurd
> > 
> > > From: Sigurd Nes [EMAIL PROTECTED]
> > > Sent: 2006-02-22 18:09:51 CET
> > > To: phpgroupware-developers@gnu.org
> > > Subject: utf-8 vs iso-8859-1
> > > 
> > > The conversion to utf-8 is giving me problems.
> > > I have a database with more than 5000 dwellings, 35000
> > > workorders ...
> > > The language is Norwegian - and I would really like to keep the
> > > character set (at least for Norwegian) - this way I can use whatever
> > > query tool I like (such as M$ Access) for analysis without the need
> > > for postprocessing.
> > > Please enlighten me if I am missing something.
> > > 
> 
> There are several reasons for the switch to utf-8.  The main one is that
> from db to the user interface we can know that we are always dealing
> with utf-8.  We can then remove things like lang('charset').
> 
> Unicode also means we can have multilingual installs.  For example if a
> company has operations across Europe they cannot use a single phpgw
> install, as we currently use at least 3 different charsets for
> translations.  I would also like to hardcode utf-8 into stuff instead of
> having to keep track of charsets, which potentially causes problems.  It
> is also easier if everyone knows to use utf-8 compliant tools.
> 
> I haven't used M$ Access since O2k days, but I know that OO.o2 Base
> allows you to specify the charset for the database connection.  Maybe M$
> Access has the same option tucked away somewhere.
> 
> What are the problems you have?  I am happy to see if we can find a way
> of fixing the problems instead of switching back to encoding soup :)
> 
> Cheers
> 
> Dave
>


I'm not sure I grasp all the consequences - this is from some testing:

It seems that postgres has a unicode odbc driver, so that "should" be ok - but 
it doesn't seem to work (if there are any converted characters, I get 'ODBC -- 
called failed').

I will need to convert all the characters in the database to unicode - I figure 
I can dump the database, convert the characters (is there a tool?) and reload 
the data into an empty database. At this point I will almost certainly run into 
problems, because the fields will be too short in many cases.
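To make the field-length worry concrete, here is a minimal Python sketch (not phpgw code; the function name and the sample word are made up for illustration). It re-encodes iso-8859-1 bytes as utf-8 and compares the sizes. The number of characters stays the same, but each of æ, ø and å takes two bytes in utf-8, so whether a column overflows depends on whether the database counts its length limit in characters or in bytes.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 (hypothetical helper)."""
    # Every byte value is valid ISO-8859-1, so the decode step cannot fail.
    return raw.decode("iso-8859-1").encode("utf-8")


# A Norwegian sample word containing three non-ASCII letters.
sample = "Blåbærsyltetøy".encode("iso-8859-1")
converted = latin1_to_utf8(sample)

print(len(sample))                     # 14 bytes in iso-8859-1
print(len(converted))                  # 17 bytes in utf-8
print(len(converted.decode("utf-8")))  # still 14 characters
```

The same idea could be applied to a whole dump file by reading it in binary mode and writing the converted bytes back out, assuming the dump really is pure iso-8859-1.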

Writing lang files will be somewhat more difficult? 
When I save a file with gedit as unicode, it is ok when reopened in gedit and 
TextPad (my favorite) but not in emacs.

When inserting new values into the database - do I need to filter the values 
through a converter? 
I certainly cannot edit records with webmin.
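As a rough idea of what such a filter might look like, here is a Python sketch under my own assumptions (the function name is hypothetical and this is not an existing phpgw api function). It leaves input that is already valid utf-8 untouched and treats everything else as iso-8859-1. This is only a heuristic: a few iso-8859-1 byte sequences also happen to be valid utf-8 and would pass through unconverted.

```python
def ensure_utf8(raw: bytes) -> bytes:
    """Return raw as UTF-8, assuming non-UTF-8 input is ISO-8859-1.

    Hypothetical helper for illustration only.
    """
    try:
        # Already valid UTF-8 (this includes plain ASCII): pass through.
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Fall back to the legacy charset and re-encode.
        return raw.decode("iso-8859-1").encode("utf-8")


legacy = "Gjøvik".encode("iso-8859-1")
modern = "Gjøvik".encode("utf-8")

print(ensure_utf8(legacy) == modern)  # True
print(ensure_utf8(modern) == modern)  # True
```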

I thought that the lang table combined with the users' preferences took care of 
the multilanguage issue.

If there are special functions in the api that require unicode - I'm more than 
willing to convert the input to those functions to unicode on demand.

All in all - as I see it - there are a number of limitations compared to 
allowing iso-8859-1 for the xsl:stylesheet.

Regards

Sigurd

_______________________________________________
Phpgroupware-developers mailing list
Phpgroupware-developers@gnu.org
http://lists.gnu.org/mailman/listinfo/phpgroupware-developers
