[ 
https://issues.apache.org/jira/browse/CODEC-187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030799#comment-14030799
 ] 

michael tobias edited comment on CODEC-187 at 6/13/14 4:21 PM:
---------------------------------------------------------------

When we get to that point I will happily do so.

Gary, should I start a new issue for the continuing bug(s)?

I was also wondering... (dangerous, I know).

Existing indexes built with the current tokens may no longer match once the 
revised code is released (though it looks like anybody using EXACT is probably 
OK). Would it be possible, or even better, for us to leave the current BMPM 
code from 1.9 untouched and release the bug-fixed version as ADDITIONAL 
BMPM3.02 functionality? That way existing users could continue to use the 
current (buggy) version if it works fine for them, while those wanting or 
needing the full, correct implementation could use the 3.02 version. It would 
also make it 100% clear which version of the algorithm is coded and in use.

I realise this means we are 'bloating' the Codec by having 2 versions of the 
code, but it does actually keep things quite clean and allows users to ignore 
the bug-fixes or move over to the fixed 3.02 version in their own time.

It could also be made clear that eventually the original buggy BMPM will be 
dropped and users would be encouraged to adopt the 3.02 version.

What do you think?

Michael


> Beider Morse Phonetic Matching producing incorrect tokens
> ---------------------------------------------------------
>
>                 Key: CODEC-187
>                 URL: https://issues.apache.org/jira/browse/CODEC-187
>             Project: Commons Codec
>          Issue Type: Bug
>    Affects Versions: 1.9
>            Reporter: michael tobias
>            Priority: Minor
>             Fix For: 1.10
>
>         Attachments: CODEC-187.patch
>
>
> I believe the Beider Morse Phonetic Matching algorithm was added in Commons 
> Codec 1.6.
> The BMPM algorithm is an EVOLVING algorithm that is currently on version 3.02, 
> though it had been static since version 3.01 dated 19 Dec 2011 (it was first 
> available as open source as version 1.00 on 6 May 2009).
> I can see nothing in the Commons Codec Docs to say which version of BMPM was 
> implemented, so I am not sure whether the problem with the algorithm as coded 
> in the Codec is simply that it is an old version or whether there are more 
> basic problems with the implementation.
> How do I determine the version of the algorithm that was implemented in the 
> Commons Codec?
> How do we ensure that the algorithm is updated if/when the BMPM algorithm 
> changes?
> How do we ensure that the algorithm as coded in the Commons Codec is accurate 
> and working as expected?



--
This message was sent by Atlassian JIRA
(v6.2#6252)
