The META tag you include looks correct to me.

Does perl get the chars right after CGI decodes them?
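
One way to check is to look at the raw bytes rather than at what a terminal or editor renders. Below is a minimal sketch in Python (not Perl, purely to illustrate the bytes) that takes the values from the foo.csv dump quoted further down (\351 \347 \361 \356) and tries a few charsets. The helper name try_decode and the list of charsets tried are my own assumptions, not anything taken from your setup.

```python
# Bytes copied from the quoted foo.csv dump: octal \351 \347 \361 \356,
# separated by spaces. Nothing here touches SQLite or CGI.
raw = b"\xe9 \xe7 \xf1 \xee"


def try_decode(data, charset):
    """Return the decoded text, or None if the bytes are invalid in charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


# Charsets chosen for illustration only (an assumption).
results = {name: try_decode(raw, name) for name in ("utf-8", "cp1252", "mac_roman")}

for name, text in results.items():
    # !a prints non-ASCII characters as escapes, so this is safe on any terminal.
    print(f"{name:10} -> {text!a}")

# For comparison: the bytes the same text would have if it were really UTF-8.
expected_utf8 = "\u00e9 \u00e7 \u00f1 \u00ee".encode("utf-8")
print("utf-8 bytes would be:", expected_utf8.hex(" "))
```

If the stored bytes fail to decode as UTF-8 but read correctly as a single-byte charset, as they do here, the data probably never reached the database as UTF-8 in the first place. The mac_roman result matches what the Cocoa editor showed.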

Ultimately, the browser will percent-encode the accented characters in the
submitted form data, based on the utf-8 charset you specify in the HTML META
tag.  Then Perl (via CGI) is going to decode those back into characters,
probably using the host's default charset.  There's a chance for Perl to
mangle the accented chars during this step.
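
To make that concrete, here is a rough sketch of the round trip, again in Python rather than Perl and using only the standard urllib.parse module. It does not model what CGI.pm actually does; the latin-1 decode is just my assumption of what a "wrong default charset" might look like.

```python
from urllib.parse import quote, unquote

typed = "\u00e9"  # the character e-acute, as typed into the form

# What a browser would send for a page declared as charset=utf-8.
sent = quote(typed, encoding="utf-8")

# Decoding with the same charset recovers the original character.
good = unquote(sent, encoding="utf-8")

# Decoding with a different charset (latin-1 is an assumption made for
# illustration) turns each UTF-8 byte into its own character.
bad = unquote(sent, encoding="latin-1")

print(sent)                   # the percent-encoded form
print(f"{good!a} {bad!a}")    # escapes keep the output terminal-safe
```

The second decode produces two characters where one was typed, which is the kind of gibberish described in the original report. If something like this is happening, telling the CGI layer explicitly that the input is UTF-8 would be the place to look.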

 -Clark

----- Original Message ----
From: P Kishor <[EMAIL PROTECTED]>
To: Nuno Lucas <[EMAIL PROTECTED]>
Cc: sqlite-users@sqlite.org
Sent: Thursday, September 20, 2007 6:43:37 AM
Subject: Re: [sqlite] SQLite and html character entities

Thanks Nuno. Since I am raw in this matter, could I ask you for a
little more hand-holding as specified below --

On 9/20/07, Nuno Lucas <[EMAIL PROTECTED]> wrote:
> You have to know the encoding of the user input. To do that, all your
> html forms _MUST_ have proper <META> tags, and as you will be using
> SQLite, the obvious encoding choice will be UTF-8 (because that way
> you don't need to do any conversions when feeding/retrieving data
> to/from SQLite).

So, what is the proper meta tag? Is the following sufficient?

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html lang="">
  <head>
    <meta http-equiv="content-type" content="text/html; charset=utf-8">
    <title></title>
  </head>
  <body>

And, other than the above, I don't have to do anything else? Just a
straight ahead INSERT with bind vars is enough?

Many thanks in advance,

Puneet.


>
> Then there is the problem of non-compliant browsers, but that is
> another story...
>
>
> Best regards,
> ~Nuno Lucas
>
>
> On 9/20/07, P Kishor <[EMAIL PROTECTED]> wrote:
> > Folks,
> >
> > I come to ask you a question that may be basic for many of you but is
> > leaving me completely bewildered. My work environment is a Mac OS X
> > (Tiger) computer, and I use a Cocoa-based text editor, and am writing
> > a Perl-based web app. Data are in several different languages,
> > predominantly English, but with Portuguese, Spanish, and other
> > languages mixed in... hence, have accent marks (diacritics).
> >
> > Goal: To reliably and consistently show the retrieved data in a web
> > page or a web form with the correct diacritics, and when the user
> > edits and updates that data, reliably and consistently update the
> > database.
> >
> > Summary of problem: Data with diacritics show up fine in web forms,
> > but on updating, they get clobbered with gibberish and subsequently
> > show up incorrectly.
> >
> > So, I decided to do a little test. I created a small table, wrote a
> > script, and inserted a few records from the web. See the output of my
> > investigation below. I ask you, what is it that I have to do to
> > achieve my goal above? (output of test follows; I have separated
> > logical sections with a "-------" line, and my comments start with #)
> >
> > Lucknow:~/Data/ecoservices punkish$ sqlite3 entities.sqlite
> > SQLite version 3.3.8
> > Enter ".help" for instructions
> > sqlite> .s
> > CREATE TABLE tbl (a text);
> > sqlite> select * from tbl;
> > the first record
> > é ç ñ î
> > more from 3rd row
> > row four
> > these "volunteered" activities
> > <á ø ã ü î & others>
> > -----------------------------
> > sqlite> .mode csv
> > sqlite> .output foo.csv
> > sqlite> select * from tbl;
> > sqlite> .q
> > Lucknow:~/Data/ecoservices punkish$ less foo.csv
> > "the first record"
> > "\351 \347 \361 \356"
> > "more from 3rd row"
> > "row four"
> > "these \223volunteered\224 activities"
> > "<\341 \370 \343 \374 \356 & others>"
> > foo.csv (END)
> > -----------------------------
> > sqlite> .mode html
> > sqlite> .output foo.html
> > sqlite> select * from tbl;
> > sqlite> .q
> > Lucknow:~/Data/ecoservices punkish$ less foo.html
> > "foo.html" may be a binary file.  See it anyway?
> > <TR><TD>the first record</TD>
> > </TR>
> > <TR><TD><E9> <E7> <F1> <EE></TD>
> > </TR>
> > <TR><TD>more from 3rd row</TD>
> > </TR>
> > <TR><TD>row four</TD>
> > </TR>
> > <TR><TD>these <93>volunteered<94> activities</TD>
> > </TR>
> > <TR><TD>&lt;<E1> <F8> <E3> <FC> <EE> &amp; others></TD>
> > </TR>
> > foo.html (END)
> > -----------------------------
> > # below foo.html in my Cocoa-based text editor
> > <TR><TD>the first record</TD>
> > </TR>
> > <TR><TD>È Á Ò Ó</TD>
> > </TR>
> > <TR><TD>more from 3rd row</TD>
> > </TR>
> > <TR><TD>row four</TD>
> > </TR>
> > <TR><TD>these ìvolunteeredî activities</TD>
> > </TR>
> > <TR><TD>&lt;· ¯ „ ¸ Ó &amp; others></TD>
> > </TR>
> > -----------------------------
> > # below foo.html in Safari; I added <TABLE> tags to format correctly
> > the first record
> > é ç ñ î
> > more from 3rd row
> > row four
> > these "volunteered" activities
> > <á ø ã ü î & others>
> >
> > -----------------------------------------------------------------------------
> > To unsubscribe, send email to [EMAIL PROTECTED]
> > -----------------------------------------------------------------------------
> >
> >
>


-- 
Puneet Kishor
http://punkish.eidesis.org/
Nelson Institute for Environmental Studies
http://www.nelson.wisc.edu/
Open Source Geospatial Foundation (OSGeo)
http://www.osgeo.org/
Summer 2007 S&T Policy Fellow, The National Academies
http://www.nas.edu/
