Marvin Humphrey wrote on 01/25/2010 11:48 AM:

> 
> It would be interesting to see a hexdump of "lextemp" starting at byte 12464.
> That's where the PostingPool run starts.  The combining sequence that triggers
> the exception starts two bytes later, at 12466.

$ hexdump -C -s 12464 -n 16 sources.index.ks/seg_1/lextemp
000030b0  00 00 1f 00 00 00 c1 5c  3c 20 62 20 3e 20 57 69  |.......\< b > Wi|

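The sequence c1 5c 3c 20 looks odd to me. It's definitely not UTF-8: 0xc1 is
never a valid byte in UTF-8, and the 0x5c that follows it is plain ASCII
rather than a continuation byte.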
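
A quick way to cross-check this is to hand the 16 dumped bytes to a strict
UTF-8 decoder and see where it gives up. The following is only a minimal
sketch in Python. It is not part of the original debugging session, and the
helper name first_invalid_utf8 is made up for illustration. The byte values
are copied from the hexdump output above.

```python
# Bytes copied from the hexdump output, which starts at file offset 12464.
dump = bytes.fromhex("00 00 1f 00 00 00 c1 5c 3c 20 62 20 3e 20 57 69")


def first_invalid_utf8(data: bytes):
    """Return (offset, byte) for the first UTF-8 decoding error, or None.

    Hypothetical helper: it relies only on Python's built-in strict
    UTF-8 decoder, not on anything from KinoSearch or libswish3.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte the decoder rejected.
        return err.start, data[err.start]
    return None


result = first_invalid_utf8(dump)
if result is None:
    print("valid UTF-8")
else:
    offset, byte = result
    print(f"invalid byte 0x{byte:02x} at offset {offset} within the dump")
```

The offset reported is relative to the start of the dump, not to the start of
the lextemp file.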

[... /me debugs ... hours pass ...]

The problem is in libswish3, not in KinoSearch, Search::Tools, or the
original docs.

Thanks for the tips on how UTF-8 works in KS, though. It was helpful.

-- 
Peter Karman  .  http://peknet.com/  .  [email protected]
