http://www.sqlite.org/cvstrac/tktview?tn=2166
I'm probably not going to be back on this until Monday or Tuesday, unfortunately.

-scott

On 1/12/07, Scott Hess <[EMAIL PROTECTED]> wrote:
[Find attached the file I'm using to debug this.]

I think I've found the difference causing this, but I don't understand why it matters. It should all apply to fts2 as well, since the code in question didn't change in a way likely to affect this.

When an insert is done against an fts1 table, index_insert() is called with the list of sqlite3_values passed in from sqlite code. It in turn calls content_insert() with those values, which runs an insert statement binding each value to the appropriate parameter. Then index_insert() calls insertTerms() to tokenize the data and insert the terms into the fulltext index.

When an update is done, index_update() is called with the list of sqlite3_values. Here, it first calls insertTerms() to insert the terms into the fulltext index, and then content_update() to write the data into the content table.

The important point is that insertTerms() calls sqlite3_value_text() on the values. This call appears to destructively convert a UTF16LE-encoded value to a UTF8-encoded value. So, on insert the values bound are UTF16 values, while on update the values bound are UTF8 values.

At this time I don't understand why this would be the case; I would expect sqlite to convert things as needed (the enc variable in the sqlite3_value _appears_ to be correct). But, indeed, if I swap the order of the calls to insertTerms() and content_update() in fts1.c index_update(), things work as expected.

-scott
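To make the ordering difference concrete, here is a minimal toy model in C. It is only a sketch: ToyValue, toy_value_text(), toy_bind(), toy_insert_order() and toy_update_order() are hypothetical stand-ins invented for illustration, not SQLite's real types or functions. It assumes, as observed above, that asking for UTF-8 text converts the value in place, and it shows which encoding the content-table bind would then see in each call order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT SQLite's real API. */
typedef enum { ENC_UTF8, ENC_UTF16LE } Encoding;

typedef struct {
    Encoding enc;  /* encoding the value currently holds */
} ToyValue;

/* Models the observed behaviour of sqlite3_value_text():
 * requesting UTF-8 text converts the value in place. */
static void toy_value_text(ToyValue *v) {
    assert(v != NULL);
    v->enc = ENC_UTF8;
}

/* Models binding the value to a statement parameter:
 * returns the encoding the content table would see. */
static Encoding toy_bind(const ToyValue *v) {
    assert(v != NULL);
    return v->enc;
}

/* Call order described for index_insert(): content first, then terms. */
Encoding toy_insert_order(ToyValue *v) {
    Encoding bound = toy_bind(v);  /* stands in for content_insert() */
    toy_value_text(v);             /* stands in for insertTerms() */
    return bound;
}

/* Call order described for index_update(): terms first, then content. */
Encoding toy_update_order(ToyValue *v) {
    toy_value_text(v);             /* stands in for insertTerms() */
    return toy_bind(v);            /* stands in for content_update() */
}
```

Starting from a UTF16LE value, toy_insert_order() reports a UTF16LE bind while toy_update_order() reports a UTF8 bind, which mirrors the asymmetry described above. The model says nothing about why the encoding at bind time should matter; that part is still unexplained.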