[jira] Commented: (CODEC-57) Metaphone.metaphone(String) returns an empty string when passed the word "why".

2008-04-26 Thread Henri Yandell (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592611#action_12592611
 ] 

Henri Yandell commented on CODEC-57:


Note - the Perl variant can run in original-mode.

I've added a note to the javadoc about this issue.

> Metaphone.metaphone(String) returns an empty string when passed the word 
> "why".
> ---
>
> Key: CODEC-57
> URL: https://issues.apache.org/jira/browse/CODEC-57
> Project: Commons Codec
> Issue Type: Bug
> Affects Versions: 1.3
> Environment: Commons-codec built from source using jdk 1.4.2.
> OS: Windows XP
> Java Build: 1.4.2
> Reporter: Adam Wilmore
> Fix For: 1.4
>
>
> An empty string is returned from the Metaphone.metaphone(String) method when 
> passed the value "why". Variations on the value, such as "wwwhy" and "wwhhhy" 
> also return empty strings.
> This appears to be a bug, since other implementations of the metaphone 
> algorithm, such as the PHP version, return "H" when passed the value "why".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CODEC-57) Metaphone.metaphone(String) returns an empty string when passed the word "why".

2008-03-09 Thread Henri Yandell (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576842#action_12576842
 ] 

Henri Yandell commented on CODEC-57:


It's a bug in the algorithm.

The problem with both fixes is in the other words they affect.

PHP makes WHY->H, which also affects words like WHAT, giving HT and not WT.
My suggestion makes WHY->Y, and changes WHYTE from T to WT. 

PHP changes every 'WH' from W to H. Mine only affects WH?Y, and the result is 
to make the W more sticky. I think mine will affect far fewer words (so it is 
better for backwards compatibility) and is also more correct, as it creates 
better-looking tokens, but I'm hardly an expert at any of this. It's entirely 
possible that an empty String is intended to be an acceptable Metaphone code. 

Or the best solution might be this: if the metaphone is an empty String, simply 
return the first character of the input. 
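
The last suggestion can be sketched in Java roughly as follows. This is only an 
illustration, not the Commons Codec code: the class FallbackSketch and the 
method encodeWithFallback are hypothetical names, and the encoder is passed in 
as a plain function so that the sketch does not depend on any particular 
Metaphone implementation. Upper-casing the returned character is also an 
assumption, made because Metaphone codes are upper case.

```java
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of Commons Codec.
class FallbackSketch {

    // Applies the given encoder; if it yields an empty code for a
    // non-empty input, falls back to the first character of the input.
    static String encodeWithFallback(UnaryOperator<String> encoder, String input) {
        if (input == null || input.isEmpty()) {
            return "";
        }
        String code = encoder.apply(input);
        if (code.isEmpty()) {
            // Assumption: upper-case the fallback, since codes are upper case.
            return input.substring(0, 1).toUpperCase(Locale.ENGLISH);
        }
        return code;
    }

    public static void main(String[] args) {
        // Stand-in encoder that mimics the reported behaviour for "why".
        UnaryOperator<String> stub = s -> s.equalsIgnoreCase("why") ? "" : "WT";
        System.out.println(encodeWithFallback(stub, "why"));
        System.out.println(encodeWithFallback(stub, "what"));
    }
}
```

With such a wrapper the encoder itself stays untouched, so codes that are 
already non-empty would not change.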




[jira] Commented: (CODEC-57) Metaphone.metaphone(String) returns an empty string when passed the word "why".

2008-03-09 Thread Gary Gregory (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576839#action_12576839
 ] 

Gary Gregory commented on CODEC-57:
---

Is there any consensus on which is "correct"? WH->W or WY->Y? Matching Python 
and Ruby is beside the point IMO. If WH->W is a bug then there is no sense in 
replicating it ;)




[jira] Commented: (CODEC-57) Metaphone.metaphone(String) returns an empty string when passed the word "why".

2008-03-08 Thread Henri Yandell (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576512#action_12576512
 ] 

Henri Yandell commented on CODEC-57:


I think that unless we hear otherwise, we should not change the CC/GG bit. 
People's existing data should keep working, regardless of which algorithm is 
better.

With this specific bug though, having an empty string feels bad. Both the Ruby 
and Python implementations do WH->W. I don't feel a huge urge to make it WH->H 
just to match PHP. I'm more tempted to do the WY->Y. 

Any thoughts?





[jira] Commented: (CODEC-57) Metaphone.metaphone(String) returns an empty string when passed the word "why".

2008-02-04 Thread Henri Yandell (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565653#action_12565653
 ] 

Henri Yandell commented on CODEC-57:


Also... can't help but think that there's sod all testing in MetaphoneTest. We 
need to add some test cases.




[jira] Commented: (CODEC-57) Metaphone.metaphone(String) returns an empty string when passed the word "why".

2008-02-04 Thread Henri Yandell (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565418#action_12565418
 ] 

Henri Yandell commented on CODEC-57:


Looks like we do the MB->M [CODEC-17]. 




[jira] Commented: (CODEC-57) Metaphone.metaphone(String) returns an empty string when passed the word "why".

2008-01-20 Thread Henri Yandell (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560782#action_12560782
 ] 

Henri Yandell commented on CODEC-57:


Incidentally, those other bugs are that 'CC' should not be treated as an 
allowed duplication and that MB->M should only happen at the end of a word. An 
unmentioned change is that they've also made 'GG' an allowed duplication. 




[jira] Commented: (CODEC-57) Metaphone.metaphone(String) returns an empty string when passed the word "why".

2008-01-20 Thread Henri Yandell (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560781#action_12560781
 ] 

Henri Yandell commented on CODEC-57:


Looking at the code, the first step it does is turn the WH into a W. 

Then, later on, both W and Y are silent if they are not followed by a vowel.

Playing with the link above, it looks like WH is turned into H there. A quick 
look at the source code to PHP shows that it is indeed converted to H. Another 
quick look, this time at text.rubyforge shows that the Ruby version converts to 
W as we do [though it claims to compare with the PHP version for differences]. 

Looking at DoubleMetaphone, it handles WH differently. If ^WH, then it'll 
append an A.

Looking at the original BASIC code [as posted by aspell.sf.net]:

IF TWO = "WH" THEN ENAME = "W":ENAME[3,]

So it looks like PHP is the one with the bigger bug - a surname of WHYE should 
be YE and not HE. Then it seems that Metaphone itself is weak in that (my 
opinion) it should consider 'Y' a vowel when looking for a vowel after 'W'. 

I'm not sure what we should do though. The documentation at 
http://text.rubyforge.org/svn/lib/text/metaphone.rb also indicates that there 
are other bugs in the original BASIC compared to the original discussion 
(anyone got that magazine article? :) ). So this might just be a bug in the 
BASIC implementation rather than the original algorithm.
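
The two steps described at the top of this comment can be modelled in a few 
lines of Java. This is a toy model, not the Commons Codec source: the class 
WhySketch and the method encode are made-up names, only the rules mentioned 
here are modelled (a leading WH becomes W; W and Y are silent unless followed 
by a vowel; vowels are kept only in first position), and every other letter is 
simply passed through unchanged, which the real algorithm does not do.

```java
import java.util.Locale;

// Toy model of the WH/W/Y handling only; not the real Metaphone.
class WhySketch {

    private static boolean isVowel(char c) {
        return "AEIOU".indexOf(c) >= 0;   // note: Y is not counted as a vowel
    }

    static String encode(String word) {
        String s = word.toUpperCase(Locale.ENGLISH);
        // Step 1: a leading WH is turned into W.
        if (s.startsWith("WH")) {
            s = "W" + s.substring(2);
        }
        StringBuilder code = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean nextIsVowel = i + 1 < s.length() && isVowel(s.charAt(i + 1));
            if (isVowel(c)) {
                // Vowels are kept only at the start of the word.
                if (i == 0) {
                    code.append(c);
                }
            } else if (c == 'W' || c == 'Y') {
                // Step 2: W and Y are silent unless followed by a vowel.
                if (nextIsVowel) {
                    code.append(c);
                }
            } else {
                // Simplification: all other letters pass through as-is.
                code.append(c);
            }
        }
        return code.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + encode("why") + "]");
        System.out.println("[" + encode("what") + "]");
        System.out.println("[" + encode("whyte") + "]");
    }
}
```

Under these assumptions "why" becomes "WY" after the first step and then loses 
both letters, giving an empty code, while "what" gives WT and "whyte" gives T.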




[jira] Commented: (CODEC-57) Metaphone.metaphone(String) returns an empty string when passed the word "why".

2007-10-24 Thread Gary Gregory (JIRA)

[ 
https://issues.apache.org/jira/browse/CODEC-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537448
 ] 

Gary Gregory commented on CODEC-57:
---

I have a unit test that confirms this (committed). I can see the "H" in "WHY" 
also here: http://www.searchforancestors.com/utility/metaphone.php (picked from 
Google search results)
