I'm guessing that most installations, maybe the vast majority, would have encoding set up for UTF-8. (You will have to query SQLite experts on how many people really use an encoding other than UTF-8.)
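
For what it's worth, the encoding of any particular database can be checked rather than guessed. Here is a minimal sketch in Python (not LiveCode), assuming the standard sqlite3 module; the helper names database_encoding and is_utf8 are made up for illustration. It relies on SQLite's "PRAGMA encoding", which reports UTF-8, UTF-16le or UTF-16be.

```python
import sqlite3


def database_encoding(conn):
    """Return the text encoding SQLite reports for the main database."""
    # PRAGMA encoding yields a single row with a single text column.
    return conn.execute("PRAGMA encoding").fetchone()[0]


def is_utf8(conn):
    """True when the database stores its text as UTF-8."""
    return database_encoding(conn).upper() == "UTF-8"


if __name__ == "__main__":
    # A throwaway in-memory database stands in for a real file here.
    conn = sqlite3.connect(":memory:")
    print(database_encoding(conn))
    print(is_utf8(conn))
    conn.close()
```

A program that only handles UTF-8 could run a check like this when it opens a database and refuse, or warn, on anything else.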
When (and if) you decide to support UTF-16 (native endian), UTF-16BE and UTF-16LE, it should be straightforward. Maybe by then uniEncode and uniDecode will have endian options and it will be even easier. This will save the need for a dialog box.

Dar

On Jun 8, 2013, at 4:41 PM, Peter Haworth wrote:

> Thanks Dar, I think I get the picture now. I'll stick with UTF8 for now.
>
> Pete
> lcSQL Software <http://www.lcsql.com>
>
> On Sat, Jun 8, 2013 at 11:11 AM, Dar Scott <[email protected]> wrote:
>
>> I encourage you to go to full Unicode.
>>
>> That means (for now) using the unicodeText to get text into and out of
>> a field.
>>
>> Then convert that to UTF8 and back for a database with UTF-8 encoding.
>>
>> (And at this point you can say that your program only works with UTF-8
>> encoding set for SQLite.)
>>
>> It is my understanding that SQLite does not have any lossy encodings.
>> That is, you don't lose anything by saving to the db. That is, all
>> encodings are Unicode. My comments are based on that.
>>
>> You probably can't reliably move a db with UTF-16 encoding from one
>> machine to another.
>>
>> Since your program is general, you probably want to accommodate db with
>> UTF-16, UTF-16LE, and UTF-16BE.
>>
>> I'm guessing you can store the Unicode you get from the unicodeText
>> directly for UTF-16.
>>
>> For the others you might have to byte swap.
>>
>> Long ago I made an enhancement suggestion to include UTF-16LE and
>> UTF-16BE in uniEncode and uniDecode. I don't think it is there, so you
>> will have to do it yourself.
>>
>> Essentially, you see if what is native for your machine matches your
>> target encoding. If not, swap. To see if the chars are stored little
>> endian or not ...
>>
>> Gotta run.
>>
>> Dar
>>
>> On Jun 8, 2013, at 11:20 AM, Peter Haworth wrote:
>>
>>> I apologize up front for being particularly clueless on this whole
>>> character encoding concept. I'm still trying to adjust to speaking
>>> American English as opposed to the Queen's English so not too
>>> surprising I'm not grasping unicode too well!
>>>
>>> I understand the concepts and the use of uniencode and unidecode but I
>>> don't understand when I need to care.
>>>
>>> I'll use my SQLiteAdmin program as an example. It provides schema
>>> maintenance and data browsing/update features for SQLite databases and
>>> uses most of the standard LC controls, including datagrids. Users can
>>> enter data into it and have it used to INSERT, UPDATE, or DELETE rows.
>>> They can also type in SELECT criteria and have the qualifying data
>>> displayed in field and datagrid controls. Currently, there is no
>>> attempt to do any encoding or decoding of data.
>>>
>>> On my computers here in the USA, I've never had any issues using it on
>>> any of my databases, but I've never tried to access one whose contents
>>> weren't in American English.
>>>
>>> Now let's say someone in a country whose language requires the use of
>>> unicode encoding purchases the program. Will it work OK for that
>>> person in terms of entering data into the controls and displaying data
>>> in the controls from their database, assuming that the database
>>> contains UTF8 encoded data? Or do I have to uniencode/decode to ensure
>>> things work right?
>>>
>>> Now let's say the database is using UTF16 encoding, or anything other
>>> than UTF8. I can detect that situation in the database and I think I
>>> would need to use uniencode/decode to deal with it?
>>>
>>> Now the user takes his UTF8 database and puts it on a colleague's
>>> computer here in the USA with the computer's language settings set to
>>> American English. I would then need to decode/encode.... I think.
>>>
>>> From the original thread, it seems clear that when I import data into
>>> the database via SQLiteAdmin, I do need to be aware of the encoding in
>>> the imported file and that there may be a way to detect that within
>>> the file depending on how it was produced. Conversely, when I export
>>> data, I should try to create the same marker in the file.
>>>
>>> And finally, is the simplest way to take care of this to simply
>>> uniencode/decode everything using the database's encoding without
>>> regard as to whether that's necessary or not?
>>>
>>> Pete
>>> lcSQL Software <http://www.lcsql.com>
>>> _______________________________________________
>>> use-livecode mailing list
>>> [email protected]
>>> Please visit this url to subscribe, unsubscribe and manage your
>>> subscription preferences:
>>> http://lists.runrev.com/mailman/listinfo/use-livecode
