Christian,
Christian Smith <[EMAIL PROTECTED]>
03/11/2004 02:33 AM
Please respond to sqlite-users
To: [EMAIL PROTECTED]
cc:
Subject: Re: [sqlite] Question about UTF8 encoding in SQLite version 2.8.13
> On Tue, 2 Nov 2004, Liz Steel wrote:
> >To clarify: I have a database name with Swedish characters in, which are
> >converted to multibyte characters, however, the filename that is created
> >treats each of the characters separately, which then causes problems later.
> >As an example, the string "Ändrad" is converted to "Ã„ndrad".
> The code to parse filenames is not UTF8 aware, and so will cause problems
> when splitting a filename into directory and filename components if the
> string is a UTF8 string. The offending function appears to be
> sqlitepager_open in pager.c, which steps backwards through the path name a
> character at a time looking for the directory separator character, which will
> obviously be tripped up by a multi-byte character.
I wonder if you could add some explanation for your comments above. UTF-8
is a Unicode encoding that produces no null bytes (other than for U+0000
itself), preserves the ASCII range verbatim, and never uses a byte in the
ASCII range as part of a multi-byte character. That is to say, each byte is
either an ASCII character (0-127) or is in the high range (128-255) and
therefore can't be confused with an ASCII character. I would have thought
that any special path characters (e.g. '/', '\') are in the ASCII range and
therefore require no special Unicode-aware handling: a byte-by-byte scan
for the separator should find it correctly. The function that SQLite calls
to actually create the file, on the other hand, would have to be
Unicode-aware for such filenames to work.
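
To make the argument concrete, here is a minimal sketch in Python (not
SQLite's actual code). The helpers split_path and non_ascii_bytes_are_high
and the path "data/ändrad.db" are made up for illustration. It scans the
UTF-8 bytes backwards one at a time for the separator, much as the quoted
description says sqlitepager_open does, and checks that the multi-byte
characters contribute only bytes in the 128-255 range.

```python
# Hypothetical helpers for illustration only; this is not SQLite code.

def split_path(path, sep=b"/"):
    """Split UTF-8 bytes at the last separator, scanning backwards byte by byte."""
    i = len(path) - 1
    while i >= 0 and path[i:i + 1] != sep:
        i -= 1
    # With no separator, i ends at -1 and the directory part is empty.
    return path[:i + 1], path[i + 1:]


def non_ascii_bytes_are_high(text):
    """True if every UTF-8 byte of every non-ASCII character is >= 0x80."""
    return all(
        b >= 0x80
        for ch in text if ord(ch) > 0x7F
        for b in ch.encode("utf-8")
    )


name = "data/ändrad.db"          # assumed example path
encoded = name.encode("utf-8")   # 'ä' becomes the two bytes 0xC3 0xA4
directory, filename = split_path(encoded)
print(directory.decode("utf-8"), filename.decode("utf-8"))
print(non_ascii_bytes_are_high(name))
```

If that reasoning holds, the byte-wise split itself is not where the name
gets mangled, which is why I suspect the call that creates the file.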
Benjamin