[ 
https://issues.apache.org/jira/browse/OPENNLP-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788112#comment-17788112
 ] 

ASF GitHub Bot commented on OPENNLP-1519:
-----------------------------------------

Sujishark commented on PR #556:
URL: https://github.com/apache/opennlp/pull/556#issuecomment-1819541012

   Greetings, 
   
   I've added the environment details. 
   
   A similar issue was already addressed in this PR:  
https://github.com/apache/opennlp/pull/387 
   
   I made the change because the `equals` method used to compare the elements 
of the Set also depends on the order in which those elements are iterated:
   
   
https://github.com/apache/opennlp/blob/dab19af803e9139b57b972eb4e1af4c978d2ff3f/opennlp-tools/src/main/java/opennlp/tools/dictionary/Dictionary.java#L356
   
   The `entrySet` here is a HashSet, which does not guarantee a constant 
iteration order, as noted in the 
[Java 17 documentation](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/HashSet.html).
 
   
   
https://github.com/apache/opennlp/blob/dab19af803e9139b57b972eb4e1af4c978d2ff3f/opennlp-tools/src/main/java/opennlp/tools/dictionary/Dictionary.java#L95
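
   To illustrate the difference, here is a minimal standalone sketch. It is 
not OpenNLP code, and the class name `IterationOrderSketch` and its helper 
are made up for this example. It shows that LinkedHashSet iterates in 
insertion order, that HashSet gives no such guarantee, and that the standard 
`Set.equals` ignores order altogether.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of OpenNLP.
public class IterationOrderSketch {

  // Returns the elements in the order the given set iterates over them.
  static List<String> iterationOrder(Set<String> set) {
    return new ArrayList<>(set);
  }

  public static void main(String[] args) {
    Set<String> linked = new LinkedHashSet<>(List.of("1a", "1b"));
    Set<String> hashed = new HashSet<>(List.of("1b", "1a"));

    // LinkedHashSet iterates in insertion order: "1a" first, then "1b".
    System.out.println(iterationOrder(linked));

    // HashSet makes no promise about iteration order, so a comparison
    // that walks two sets element by element can fail for equal content.
    System.out.println(iterationOrder(hashed));

    // Set.equals compares content only, regardless of iteration order.
    System.out.println(linked.equals(hashed));
  }
}
```

   A tool such as NonDex shuffles the unspecified HashSet order on purpose, 
which is why an order-dependent comparison fails under it.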
   
   I've now changed only the test files, without altering the production 
code.
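
   One way a test can avoid depending on iteration order is to compare 
contents only. The following is a minimal sketch under that assumption. The 
helper `sameElementsIgnoreCase` is hypothetical, is not part of OpenNLP, and 
is not necessarily what this PR does.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical test helper, not part of OpenNLP.
public class OrderInsensitiveCompareSketch {

  // Compares two sets by content, ignoring iteration order and letter case.
  static boolean sameElementsIgnoreCase(Set<String> a, Set<String> b) {
    return lowerCased(a).equals(lowerCased(b));
  }

  // Copies the elements into a new set with every string lower-cased.
  private static Set<String> lowerCased(Set<String> in) {
    Set<String> out = new HashSet<>();
    for (String s : in) {
      out.add(s.toLowerCase(Locale.ROOT));
    }
    return out;
  }
}
```

   With such a helper, a case-insensitive test could assert 
`sameElementsIgnoreCase(expected, actual)` instead of comparing the two sets 
element by element.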




> Use LinkedHashSet for deterministic iteration order
> ---------------------------------------------------
>
>                 Key: OPENNLP-1519
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1519
>             Project: OpenNLP
>          Issue Type: Improvement
>            Reporter: Sujithra Rajan
>            Priority: Minor
>
> Two tests
>  * `opennlp.tools.dictionary.DictionaryAsSetCaseInsensitiveTest.testEquals`
>  * `opennlp.tools.dictionary.DictionaryAsSetCaseInsensitiveTest.testEqualsDifferentCase`
> rely on a 'Dictionary' whose entrySet is a HashSet, so the iteration order 
> is not guaranteed to be the same on every run.
> This was found using the 
> [NonDex|https://github.com/TestingResearchIllinois/NonDex] tool.
> The following errors were encountered:
> {quote}org.opentest4j.AssertionFailedError: expected: <[1a, 1b]> but was: 
> <[1b, 1a]>
> at 
> opennlp.tools.dictionary.DictionaryAsSetCaseInsensitiveTest.testEquals(DictionaryAsSetCaseInsensitiveTest.java:121)
> {quote}
> {quote}org.opentest4j.AssertionFailedError: expected: <[1a, 1b]> but was: 
> <[1B, 1A]>
> at 
> opennlp.tools.dictionary.DictionaryAsSetCaseInsensitiveTest.testEqualsDifferentCase(DictionaryAsSetCaseInsensitiveTest.java:142)
> {quote}
> {quote}org.opentest4j.AssertionFailedError: expected: <[[Berlin], 
> [Stockholm], [New,York], [London], [Copenhagen], [Paris]]> but was: 
> <[[Copenhagen], [London], [New,York], [Stockholm], [Paris], [Berlin]]>
> at 
> opennlp.uima.dictionary.DictionaryResourceTest.testDictionaryWasLoaded(DictionaryResourceTest.java:76)
> {quote}
>  
> The fix is to change HashSet to LinkedHashSet so that the iteration order 
> always remains stable. 
> The assertion in 'testDictionaryWasLoaded' was modified to match the 
> exact ordering of dictionary.dic.
>  
> {*}REPRODUCE{*}:
> ```
> mvn edu.illinois:nondex-maven-plugin:2.1.7-SNAPSHOT:nondex 
> -Dtest=opennlp.tools.dictionary.DictionaryAsSetCaseInsensitiveTest#testEquals
> ```
>  Can I proceed and create a PR?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
