I tried to build Tika from source, and the build failed:
Failures:
TabularFormatsTest.testSAS7BDAT:229->assertContents:216 en_US Wrong
text in row 9 and column 7 - 03(MAR|Mar)(63|1963)[:\s]09:46:40(.00)? vs
03Mär1963:09:46:40.00
TabularFormatsTest.testXLS:236->assertContents:216 en_US Wrong text
in row 9 and column 7 - 03(MAR|Mar)(63|1963)[:\s]09:46:40(.00)? vs
03Mär63 09:46:40
TabularFormatsTest.testXLSB:250->assertContents:216 en_US Wrong text
in row 9 and column 7 - 03(MAR|Mar)(63|1963)[:\s]09:46:40(.00)? vs
03Mär63 09:46:40
TabularFormatsTest.testXLSX:243->assertContents:216 en_US Wrong text
in row 9 and column 7 - 03(MAR|Mar)(63|1963)[:\s]09:46:40(.00)? vs
03Mär63 09:46:40
The tests fail because the expected "Mar" does not match the actual
"Mär". I tried setting the default Locale to US:
    @Before
    public void setUp()
    {
        Locale.setDefault(Locale.US);
    }
but this works only when the test is run alone, not when the whole build
runs, even though the Locale is set. The "en_US" in the output above
comes from the assert, which I changed to:
    assertTrue(Locale.getDefault() + " " + error,
            ((Pattern)table[cn][rn]).matcher(val).matches());
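One possible explanation, which I have not verified against the Tika
build, is that some formatter is created before setUp() runs and keeps
the locale that was the default at that moment. The sketch below only
demonstrates that general JDK behaviour with SimpleDateFormat; the class
name LocaleCaptureSketch and its helper methods are made up for
illustration and are not part of Tika.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical stand-alone demo, not Tika code.
public class LocaleCaptureSketch {
    // Same regular expression as in the failure output above.
    static final Pattern EXPECTED =
            Pattern.compile("03(MAR|Mar)(63|1963)[:\\s]09:46:40(.00)?");

    static final String DATE_PATTERN = "ddMMMyy HH:mm:ss";

    static boolean matchesExpected(String val) {
        return EXPECTED.matcher(val).matches();
    }

    // 3 March 1963, 09:46:40 in the JVM's default time zone.
    static Date sampleDate() {
        return new GregorianCalendar(1963, Calendar.MARCH, 3, 9, 46, 40).getTime();
    }

    // Formats the sample date with an explicitly chosen locale,
    // independent of Locale.getDefault().
    static String format(Locale locale) {
        return new SimpleDateFormat(DATE_PATTERN, locale).format(sampleDate());
    }

    public static void main(String[] args) {
        Locale saved = Locale.getDefault();
        try {
            Locale.setDefault(Locale.GERMANY);
            // Created while the default is German: the locale is fixed here.
            SimpleDateFormat early = new SimpleDateFormat(DATE_PATTERN);

            Locale.setDefault(Locale.US);
            // Created after the switch: uses US month names.
            SimpleDateFormat late = new SimpleDateFormat(DATE_PATTERN);

            String a = early.format(sampleDate());
            String b = late.format(sampleDate());
            System.out.println("early: " + a + " matches=" + matchesExpected(a));
            System.out.println("late:  " + b + " matches=" + matchesExpected(b));
        } finally {
            // Restore the default so nothing else in the JVM is affected.
            Locale.setDefault(saved);
        }
    }
}
```

The exact German month text printed for the "early" formatter depends on
the JDK's locale data, so I give no expected output for it. If something
like this is what happens in the full build, passing an explicit Locale
to the formatter (as format(Locale) does here) would sidestep the default
locale entirely, but that is an assumption about code I have not traced.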
A possible solution would be to change the test file to use June
instead of March, but we could still run into trouble in other locales,
e.g. in Russia, China, Korea, Thailand, Japan, ...
Tilman