On Wed, Oct 12, 2011 at 4:37 PM, Peter Karman <[email protected]> wrote:
> If you really don't need to preserve your UTF-8 text, look at
> Search::Tools::Transliterate. Search::Tools::UTF8 is also helpful for
> debugging these kinds of issues.

Thanks for the suggestions - you've given me food for thought.

> It sounds like, without seeing a reproduce-able test case, that Lucy is
> choking appropriately on malformed UTF-8.

Absolutely. What's interesting is that the same Lucy code does not choke on the other machines with the older Perl. Of course, the Perl version may not be the only factor that differs, just the most obvious one (e.g., the Perl modules, libraries, etc. will also differ).

Anyway, I like the idea of rolling my own Perl to be absolutely sure of consistency across my machines. This is something I've avoided up until now.
