I'm stuck on this.  I'm writing a data layer that may need to handle
characters with diacritics, such as the French accented é or German
umlauted characters.  I expect such input to be rare, but the data layer
should handle it nevertheless.  For example, it would certainly be expected
to handle something as simple as the word résumé or the name Réggé.

I've tried quite a few things now, and I just can't get to a solid solution.
The data gets stored to SQLite, but when I try to select it, the fetch
fails.  Here's a sample of the error I get from the Python shell when I try
to select data with accented characters:

>>> import sqlite3
>>> con = sqlite3.connect('test.db')
>>> cur = con.cursor()
>>> cur.execute("select * from test order by name")
<sqlite3.Cursor object at 0x009E8D40>
>>> l = cur.fetchall()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
sqlite3.OperationalError: Could not decode to UTF-8 column 'name' with text 'RΘsumΘ'
>>>
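
To make the symptom concrete, here is a minimal sketch of one reading that is consistent with the traceback. It assumes (this is a guess, not something confirmed above) that the form value reached the database as Latin-1 bytes and that the shell was running in a cp437 console. It is written for Python 3, and the variable names are made up for illustration.

```python
# Assumption: the browser sent the name as Latin-1 bytes, so each
# accented e is the single byte 0xE9.
raw = "R\u00e9sum\u00e9".encode("latin-1")

# Assumption: the console uses code page 437, where byte 0xE9 is
# displayed as a Greek capital theta.
shown = raw.decode("cp437")

# The same bytes are not valid UTF-8, which is what the fetch expects.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(repr(raw), shown, valid_utf8)
```

If that reading is right, the text in the column is not UTF-8 at all, and the fetch fails when it tries to decode it.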

Now, I'll post the code that created that row.  It tried to store résumé as
the name (even though that's not really a name).  The data will typically be
collected from an HTML page, so I am posting the HTML first and then the
Python code that handles it.  I have tried using the Python unicode()
function and the Python decode() function before the data goes into the
database.  In any event, storing the data does not seem to be the problem:
the data ALWAYS seems to get stored to SQLite as unicode.  But even stored
as unicode, pysqlite has problems fetching the data.  In fact, my problem
may lie with pysqlite.  In any case, here's the code.  Any insight would be
most welcome.  Please reach me by e-mail at
[EMAIL PROTECTED]  Thanks...

Oh, the structure of the table (called test) is simply (name, birthdate).
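
For rows that are already in the table, one possible way to read them back is to ask the sqlite3 module for raw bytes and decode them by hand. The sketch below is Python 3 and makes several assumptions: an in-memory database stands in for test.db, the bad row is simulated with CAST(? AS TEXT), the birthdate is a made-up sample value, and the stored bytes are taken to be Latin-1.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # stand-in for test.db
con.execute("create table test (name, birthdate)")

# Simulate the problem row: Latin-1 bytes stored with TEXT type.
con.execute("insert into test values (cast(? as text), ?)",
            (b"R\xe9sum\xe9", "1970-01-01"))

# With the default text handling, the fetch is expected to fail.
try:
    con.execute("select name from test").fetchall()
    default_fetch_failed = False
except sqlite3.OperationalError:
    default_fetch_failed = True

# Ask for raw bytes instead, then decode with the assumed encoding.
con.text_factory = bytes
raw_names = [row[0] for row in con.execute("select name from test")]
names = [raw.decode("latin-1") for raw in raw_names]
con.close()

print(default_fetch_failed, raw_names, names)
```

This only reads the data; it does not repair the stored bytes, and the Latin-1 guess would need to be checked against the real rows.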

Here's the HTML:

<html>
<head>
</head>
<body>
<form name="frmMain" method="post" action="cptest2.py">
Name: <input type="text" name="txtName"><br>
B-Dt: <input type="text" name="txtBirthDate"><br>
<input type="submit">
</form>
</body>
</html>

Here's cptest2.py (the program that the HTML posts to):

import cgi
import sqlite3
f = cgi.FieldStorage()

def StripNonAlpha(pstrValue):
  lstrRetVal = ''
  for s in pstrValue:
    if 'a' <= s.lower() <= 'z':
      lstrRetVal += s
  return lstrRetVal

TypeName = StripNonAlpha(str(type(f['txtName'].value)))
TypeBirthDate = StripNonAlpha(str(type(f['txtBirthDate'].value)))

#uName = unicode(f['txtName'].value,"Latin-1")
#uBirthDate = unicode(f['txtBirthDate'].value,"Latin-1")
#uTypeName = StripNonAlpha(str(type(unicode(f['txtName'].value,"Latin-1"))))
#uTypeBirthDate = StripNonAlpha(str(type(unicode(f['txtBirthDate'].value,"Latin-1"))))

print """Content-type: text/html

<html>
<head>
</head>
<body>
name: %s<br>
name type: %s<br>
b-dt: %s<br>
b-dt type: %s<br>
</body>
</html>
""" % (f['txtName'].value,TypeName,f['txtBirthDate'].value,TypeBirthDate)

con = sqlite3.connect('test.db')
cur = con.cursor()
cur.execute("insert into test values (?,?)",
(f['txtName'].value,f['txtBirthDate'].value))
con.commit()
cur.close()
con.close()
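
For comparison, here is a minimal sketch of decoding the form value to text before the insert, so that the fetch can decode it again. It is Python 3 rather than the Python 2 used above. The helper name to_text, the Latin-1 default, the in-memory database and the sample birthdate are all assumptions for illustration; the real encoding depends on how the browser submitted the form.

```python
import sqlite3

def to_text(raw, encoding="latin-1"):
    """Hypothetical helper: decode bytes to text, pass text through."""
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw

con = sqlite3.connect(":memory:")  # stand-in for test.db
con.execute("create table test (name, birthdate)")

# Decode first, then bind the text values as parameters.
con.execute("insert into test values (?, ?)",
            (to_text(b"R\xe9sum\xe9"), to_text(b"1970-01-01")))
con.commit()

rows = con.execute("select * from test order by name").fetchall()
con.close()
print(rows)
```

In the Python 2 script above, the rough equivalent would be the commented-out unicode(f['txtName'].value, "Latin-1") call, with its result passed to the insert instead of the raw value.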
-- 
View this message in context: 
http://www.nabble.com/Unicode-tf4167305.html#a11856263
Sent from the SQLite mailing list archive at Nabble.com.

