I'll take a look right now. My first thought, though: if you change U8_NEXT to U16_NEXT, wouldn't you have to change it everywhere else as well? I recompiled ICU with U_CHARSET_IS_UTF8 earlier, and that did not help.
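To make the U8_NEXT vs. U16_NEXT concern concrete, here is a rough sketch of the difference as I understand it. These are hand-written stand-ins, not ICU's macros and not the code in the SQLite ICU tokenizer; the names sketch_next_utf8 and sketch_next_utf16 are made up for illustration. The point is only that one steps over 8-bit units and the other over 16-bit units, so the same text needs a different buffer type and different offsets depending on which one is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a UTF-8 "next code point" step (NOT ICU's U8_NEXT).
 * Reads 8-bit units from s[*i..n), advances *i past the sequence, and returns
 * the code point, or -1 on malformed input.
 * Simplified: no overlong-form or surrogate checks. */
int32_t sketch_next_utf8(const uint8_t *s, size_t *i, size_t n) {
    assert(*i < n);
    uint8_t lead = s[(*i)++];
    int32_t c;
    int trail;
    if (lead < 0x80) return lead;                       /* ASCII: 1 byte */
    if ((lead & 0xE0) == 0xC0) { c = lead & 0x1F; trail = 1; }
    else if ((lead & 0xF0) == 0xE0) { c = lead & 0x0F; trail = 2; }
    else if ((lead & 0xF8) == 0xF0) { c = lead & 0x07; trail = 3; }
    else return -1;                                     /* stray trail byte */
    while (trail-- > 0) {
        if (*i >= n || (s[*i] & 0xC0) != 0x80) return -1;
        c = (c << 6) | (s[(*i)++] & 0x3F);
    }
    return c;
}

/* Hypothetical stand-in for a UTF-16 "next code point" step (NOT ICU's U16_NEXT).
 * Reads 16-bit units from s[*i..n) and advances *i by one unit, or by two
 * for a surrogate pair. An unpaired surrogate is returned as-is. */
int32_t sketch_next_utf16(const uint16_t *s, size_t *i, size_t n) {
    assert(*i < n);
    uint16_t u = s[(*i)++];
    if (u >= 0xD800 && u <= 0xDBFF && *i < n &&
        s[*i] >= 0xDC00 && s[*i] <= 0xDFFF) {
        uint16_t lo = s[(*i)++];
        return 0x10000 + (((int32_t)u - 0xD800) << 10) + (lo - 0xDC00);
    }
    return u;
}
```

For the first character of the test string, U+4E00, the UTF-8 version consumes three bytes (E4 B8 80) while the UTF-16 version consumes one 16-bit unit (0x4E00). That is why swapping only the macro in one place, without also converting the buffer and adjusting every offset computed from it, looked wrong to me.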

On Mon, Jun 18, 2012 at 2:06 PM, Dan Kennedy <danielk1...@gmail.com> wrote:

> On 06/19/2012 03:39 AM, E. Timothy Uy wrote:
> > If anyone can unravel this mystery, it would be much appreciated. For now,
> > I inserted a comma - 一日、耶羅波安出 and it works. I suspect it must be somehow
> > that the sequence of bytes encodes another character, which throws the
> > tokenizer out of whack or maybe the fts4aux table.
>
> Can you try with this:
>
>   http://www.sqlite.org/src/info/892b74116a
>
> Thanks.
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users