Hello list, I'm trying to persist some strings with Cyrillic characters in them into a Postgres 9.1 database. Here's my program:
table entry : {Id : int, Title : string}
  PRIMARY KEY Id
sequence entryS
fun new_handle r =
    id <- nextval entryS;
    dml (INSERT INTO entry (Id, Title) VALUES ({[id]}, {[r.Title]}));
    return <xml><body><p>OK</p></body></xml>
fun main () : transaction page =
    return <xml><body>
      <form>
        Title: <textbox {#Title}/>
        <submit action={new_handle}/>
      </form>
    </body></xml>
When I submit "текст" to Ur/Web, I get an error along these lines:
Fatal error: /home/user/proj/simple.ur:7:2-10:2: DML failed:
INSERT INTO uw_Simple_entry (uw_Id, uw_Title) VALUES (20::int8,
E'\377\377\377\377\377\377\377\377'::text)
ERROR: invalid byte sequence for encoding "UTF8": 0xff
I've prepared a patch (attached, made against the tip revision).
The behaviour of sprintf/printf for characters with the high bit set
is unexpected on my system. For instance, the following program:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv) {
    char c = (char)255;
    printf("%03o\n", c);
    return 0;
}
prints "37777777777". If [c] is cast to [unsigned char], then the
program prints "377" (as expected). I'm wondering if this has to do
with locale? FYI, on my system, LANG is set to en_US.UTF-8.
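
For what it's worth, here is a minimal sketch of the cast-based workaround, assuming plain char is signed on the platform (as it typically is with GCC on x86 Linux). In that case (char)255 is promoted to the int value -1 when passed to printf, which would explain the long octal number regardless of LANG. The helper name escape_octal and its buffer-size convention are hypothetical, chosen only for illustration; this is not the code from the attached patch or from the Ur/Web runtime.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes every byte of `in` to `out` as a backslash
 * followed by exactly three octal digits. Stops early if `out` is too small;
 * the result is always NUL-terminated when out_size > 0. */
static void escape_octal(const char *in, char *out, size_t out_size) {
    size_t used = 0;

    /* Each escape needs 4 characters plus room for the terminating NUL. */
    for (; *in != '\0' && used + 5 <= out_size; ++in) {
        /* Going through unsigned char keeps bytes >= 0x80 in the range
         * 0..255 instead of sign-extending them to a negative int. */
        unsigned int byte = (unsigned int)(unsigned char)*in;
        used += (size_t)snprintf(out + used, out_size - used, "\\%03o", byte);
    }

    if (out_size > 0)
        out[used] = '\0';
}
```

Called on the two UTF-8 bytes of Cyrillic "т" ("\xd1\x82") with a 64-byte buffer, this produces \321\202, i.e. one three-digit escape per byte.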
--
Cheers,
Artyom Shalkhakov
tip.patch
Description: Binary data
