Dear Dan,

With the change from U8_NEXT to U16_NEXT, I am able to insert 一日耶羅波安出. I
was also able to insert the rest of the data set (about 31000 more rows
containing both traditional and simplified Chinese). Is this an ICU error? It
seems like everything in the tokenizer should be using the U8_ macros.

Thank you very much.

Respectfully,
Tim


On Mon, Jun 18, 2012 at 2:20 PM, E. Timothy Uy <t...@loqu8.com> wrote:

> I'll take a look right now. Though my first thought was: if you change
> U8_NEXT to U16_NEXT, wouldn't you have to change it everywhere else? I
> recompiled ICU with U_CHARSET_IS_UTF8 earlier and this did not help.
>
>
> On Mon, Jun 18, 2012 at 2:06 PM, Dan Kennedy <danielk1...@gmail.com>wrote:
>
>> On 06/19/2012 03:39 AM, E. Timothy Uy wrote:
>> > If anyone can unravel this mystery, it would be much appreciated. For
>> now,
>> > I inserted a comma - 一日、耶羅波安出 - and it works. I suspect the
>> > sequence of bytes must somehow encode another character, which throws
>> > the tokenizer out of whack, or maybe the fts4aux table.
>>
>> Can you try with this:
>>
>>  http://www.sqlite.org/src/info/892b74116a
>>
>> Thanks.
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
