I am writing an FTS3 tokenizer that works with iPhoneOS using Apple's CoreFoundation library.
What encoding is used for inbound insert statements into an FTS3 virtual table? For example, I have Japanese text encoded as UTF-8, and the insert statement that carries it is also encoded as UTF-8. I am not using the ICU library (SQLITE_ENABLE_ICU is not defined).

In the above tokenizer I want to eliminate words (stemming). Do I just not return those words and move on to the next one? Does this actually influence the text that is stored, or is it only used for indexing?

I am building a static library, and I want the tokenizer to be available to anything I link it with. Is there an easy way to make that happen? I currently configure it by adding code to the core FTS3 code. I don't want to do this each time, and I would like to keep the source separate so I can easily change versions of SQLite.

Finally, which distro should I use: the regular amalgamation or the Unix amalgamation? If the latter, what options should be passed to configure for iPhoneOS?

Thanks,
Garry
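
To make the word-elimination question concrete, here is a rough sketch of what I mean by "not returning" a word and moving to the next. It is a standalone stand-in, not the real interface: DemoCursor, demo_stop_words and demo_next_token are hypothetical names, and the actual xNext callback in sqlite3_tokenizer_module has a different signature. It also assumes space-delimited input, which would not hold for Japanese text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cursor over a byte buffer (assumed UTF-8).
   This is NOT the real sqlite3_tokenizer_cursor. */
typedef struct {
    const char *text;  /* input bytes */
    size_t len;        /* number of bytes in text */
    size_t pos;        /* current byte offset */
} DemoCursor;

/* Hypothetical stop list, for illustration only. */
static const char *const demo_stop_words[] = { "the", "a", "of" };

/* Returns 1 if the n bytes at tok exactly match an entry in the stop list. */
static int demo_is_stop_word(const char *tok, size_t n) {
    size_t i;
    size_t count = sizeof demo_stop_words / sizeof demo_stop_words[0];
    for (i = 0; i < count; i++) {
        if (strlen(demo_stop_words[i]) == n &&
            memcmp(demo_stop_words[i], tok, n) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Finds the next token that is not a stop word.
   Returns 1 and sets *start (byte offset) and *n (byte length),
   or returns 0 when the input is exhausted. */
static int demo_next_token(DemoCursor *cur, size_t *start, size_t *n) {
    assert(cur != NULL && start != NULL && n != NULL);
    for (;;) {
        /* Skip separators (only ASCII space in this sketch). */
        while (cur->pos < cur->len && cur->text[cur->pos] == ' ') {
            cur->pos++;
        }
        if (cur->pos >= cur->len) {
            return 0;
        }
        /* Scan to the end of the current token. */
        *start = cur->pos;
        while (cur->pos < cur->len && cur->text[cur->pos] != ' ') {
            cur->pos++;
        }
        *n = cur->pos - *start;
        if (!demo_is_stop_word(cur->text + *start, *n)) {
            return 1;
        }
        /* Stop word: do not hand it back, just loop to the next token. */
    }
}
```

With the input "the cat of doom", successive calls to demo_next_token would hand back only "cat" and "doom"; the input buffer itself is never modified by the loop.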