Marvin Humphrey wrote on 01/25/2010 11:48 AM:
>
> It would be interesting to see a hexdump of "lextemp" starting at byte 12464.
> That's where the PostingPool run starts. The combining sequence that triggers
> the exception starts two bytes later, at 12466.
$ hexdump -C -s 12464 -n 16 sources.index.ks/seg_1/lextemp
000030b0  00 00 1f 00 00 00 c1 5c  3c 20 62 20 3e 20 57 69  |.......\< b > Wi|

The sequence c1 5c 3c 20 looks odd to me. It's definitely not UTF-8.

[... /me debugs ... hours pass ...]

The problem is in libswish3, not in KinoSearch, Search::Tools, or the original docs.

Thanks for the tips on how UTF-8 works in KS, though. It was helpful.

-- 
Peter Karman  .  http://peknet.com/  .  [email protected]
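
For anyone wanting to double-check the "not UTF-8" reading of that dump, here is a minimal sketch in Python. It is not part of KinoSearch or libswish3, and the helper name first_invalid_utf8 is made up for illustration. It runs the 16 dumped bytes through a strict UTF-8 decode and reports where decoding first fails. The byte 0xC1 can never occur in well-formed UTF-8, since it could only begin an overlong two-byte encoding.

```python
def first_invalid_utf8(data: bytes):
    """Return (offset, reason) for the first byte that breaks strict
    UTF-8 decoding, or None if the whole buffer is well-formed."""
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError as err:
        return err.start, err.reason
    return None


# The 16 bytes shown by hexdump at file offset 12464 (0x30b0).
chunk = bytes.fromhex("00 00 1f 00 00 00 c1 5c 3c 20 62 20 3e 20 57 69")

result = first_invalid_utf8(chunk)
if result is None:
    print("chunk is well-formed UTF-8")
else:
    offset, reason = result
    # Add the dump's starting offset to get the position in the file.
    print(f"bad byte 0x{chunk[offset]:02x} at file offset {12464 + offset}: {reason}")
```

The same helper could be pointed at a larger slice of the file to see whether c1 is the only offending byte in the run.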
