Richard Zowalla created OPENNLP-1738:
----------------------------------------
Summary: OpenNLP Sandbox WSD has random test failures on Ubuntu
24.04 LTS on GH actions
Key: OPENNLP-1738
URL: https://issues.apache.org/jira/browse/OPENNLP-1738
Project: OpenNLP
Issue Type: Bug
Components: sandbox, wsd
Affects Versions: 2.5.4
Reporter: Richard Zowalla
We have failing sandbox tests in the WSD component on GH Actions for the Ubuntu
(24.04) latest runners with JDK 21 (temurin) and JDK 24-ea.
Error:
opennlp.tools.disambiguator.WSDisambiguatorMETest.testDisambiguateOneWord --
Time elapsed: 0.007 s <<< FAILURE!
org.opentest4j.AssertionFailedError: Check 'please' sense ID ==> expected:
<WORDNET please%4:02:00::> but was: <WORDNET please%2:37:00::>
The input into the test is the same on Windows/OSX and Linux:
Ubuntu:
[We, need, to, discuss, an, important, topic, ,, please, write, to, me, soon, .]
[PRP, VB, RP, VB, DT, JJ, NN, ., UH, VB, IN, PRP, RB, .]
[we, need, to, discuss, a, important, topic, ,, please, write, to, i, soon, .]
Mac / Windows:
[We, need, to, discuss, an, important, topic, ,, please, write, to, me, soon, .]
[PRP, VB, RP, VB, DT, JJ, NN, ., UH, VB, IN, PRP, RB, .]
[we, need, to, discuss, a, important, topic, ,, please, write, to, i, soon, .]
Older version of Sandbox for Mac:
[We, need, to, discuss, an, important, topic, ,, please, write, to, me, soon, .]
[PRP, VB, RP, VB, DT, JJ, NN, ., UH, VB, IN, PRP, RB, .]
[we, need, to, discuss, a, important, topic, ,, please, write, to, i, soon, .]
----
In addition, older commits also fail with the same error, which points to an
environment issue rather than a code change.
I ran the tests on a GitLab runner (Ubuntu 24.04 latest Docker image with JDK 21)
with a German locale, and they do not fail there. I also ran them on an Ubuntu
24.04 Desktop system hosted on VDI infrastructure with a German locale and could
not reproduce the test failures.
It must be an environment and / or locale issue on the GH action runners.
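To narrow down the environment / locale suspicion, it may help to dump the
relevant JVM settings on each runner and compare them. The following is only a
minimal sketch: the class name EnvironmentReport and the choice of properties
are assumptions for illustration and are not part of the OpenNLP sandbox.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not part of OpenNLP): collects JVM settings that
// could plausibly differ between a GH Actions runner and a local machine.
final class EnvironmentReport {

    // Returns the settings in insertion order so runs can be diffed line by line.
    static Map<String, String> describe() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("locale", Locale.getDefault().toLanguageTag());
        report.put("charset", Charset.defaultCharset().name());
        report.put("file.encoding", System.getProperty("file.encoding", "unset"));
        report.put("java.version", System.getProperty("java.version", "unset"));
        report.put("java.vendor", System.getProperty("java.vendor", "unset"));
        report.put("os.name", System.getProperty("os.name", "unset"));
        report.put("os.version", System.getProperty("os.version", "unset"));
        return report;
    }

    public static void main(String[] args) {
        // Print one "key = value" line per setting.
        describe().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Running this as an extra step on the failing GH Actions job and on a passing
machine would show whether locale or default charset actually differ. If they
do not, the cause is likely elsewhere in the environment.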
--
This message was sent by Atlassian Jira
(v8.20.10#820010)