On 09/02/2012 02:49, Marvin Humphrey wrote:
On Wed, Feb 08, 2012 at 05:04:56PM +0100, Nick Wellnhofer wrote:
On 23/12/2011 04:18, Marvin Humphrey wrote:
    * Implement CaseFolder as a subclass of Normalizer.

This has yet to be done. We could also mark the CaseFolder as deprecated
and remove it completely later.

The cost for keeping CaseFolder around in its current form is high, because it
is tied into a perlapi function and thus needs a per-host implementation. (The
perlapi function's name broke in late Perl 5.15 releases, which was a PITA to
troubleshoot.)  In contrast, the cost for keeping CaseFolder around is small
if it becomes a subclass of Normalizer.

However, CaseFolder and Normalizer presumably have slightly different case
mappings, thus the subclassing change is a back compat break.  It shouldn't be
a horrible break (depending on how close the mappings are) because it will
only affect search-time, screwing up the results only for terms which contain
code points whose mapping has changed.
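
To make the scope of such a break concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the old analyzer behaves like simple lowercasing and the new one like full Unicode case folding; `old_fold`, `new_fold` and `changed_terms` are hypothetical names, not part of any real API, and the actual mappings in the two classes may differ in other ways.

```python
# Hypothetical stand-ins: lower() plays the old mapping, casefold() the new one.
def old_fold(text: str) -> str:
    return text.lower()


def new_fold(text: str) -> str:
    return text.casefold()


def changed_terms(terms):
    """Return the terms whose index-time and search-time forms would differ."""
    return [t for t in terms if old_fold(t) != new_fold(t)]


# Only the term containing U+00DF is affected: lower() keeps it as-is,
# while casefold() expands it to "ss".
print(changed_terms(["Lucy", "Straße", "CASE"]))  # ['Straße']
```

Terms indexed under one mapping and queried under the other would miss only in cases like the one flagged here; everything else keeps matching.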

I don't think we should outright remove CaseFolder without a really good
reason, because that will force almost all of our users to change their code
and then reindex from scratch.  But a subtle compat break might be OK,
especially since you can update all the docs in place after upgrading and only
suffer during a window of time from slightly degraded search results.

The original plan was to implement CaseFolder as a subclass of Normalizer, but
I think that doesn't play well with the Dump/Load functions. Composition might
be a better approach.
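
As a rough illustration of what composition could look like, here is a sketch in Python rather than in the project's own code. Everything in it is an assumption made for the example: the class layout, the `dump`/`load` signatures, the `_class` key and the choice of NFC are hypothetical and do not reflect the real implementation.

```python
import unicodedata


class Normalizer:
    """Hypothetical analyzer: Unicode normalization plus optional case folding."""

    def __init__(self, form="NFKC", case_fold=True):
        self.form = form
        self.case_fold = case_fold

    def transform(self, text):
        text = unicodedata.normalize(self.form, text)
        return text.casefold() if self.case_fold else text

    def dump(self):
        return {"_class": "Normalizer", "form": self.form,
                "case_fold": self.case_fold}

    @classmethod
    def load(cls, data):
        return cls(form=data["form"], case_fold=data["case_fold"])


class CaseFolder:
    """Hypothetical wrapper that owns a Normalizer instead of inheriting from it."""

    def __init__(self):
        # The inner analyzer is an implementation detail with fixed settings.
        self._normalizer = Normalizer(form="NFC", case_fold=True)

    def transform(self, text):
        return self._normalizer.transform(text)

    def dump(self):
        # Only the wrapper's own identity is serialized, so the dumped form
        # carries none of the Normalizer's parameters.
        return {"_class": "CaseFolder"}

    @classmethod
    def load(cls, data):
        return cls()


folder = CaseFolder.load(CaseFolder().dump())
print(folder.transform("HeLLo"))  # hello
```

With this shape the wrapper delegates the actual work but keeps control of its own serialized form, so a dump never has to describe the inner Normalizer.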

Nick
