I tried to build Tika from source, and the build failed:

Failures:
  TabularFormatsTest.testSAS7BDAT:229->assertContents:216 en_US Wrong text in row 9 and column 7 - 03(MAR|Mar)(63|1963)[:\s]09:46:40(.00)? vs 03Mär1963:09:46:40.00
  TabularFormatsTest.testXLS:236->assertContents:216 en_US Wrong text in row 9 and column 7 - 03(MAR|Mar)(63|1963)[:\s]09:46:40(.00)? vs 03Mär63 09:46:40
  TabularFormatsTest.testXLSB:250->assertContents:216 en_US Wrong text in row 9 and column 7 - 03(MAR|Mar)(63|1963)[:\s]09:46:40(.00)? vs 03Mär63 09:46:40
  TabularFormatsTest.testXLSX:243->assertContents:216 en_US Wrong text in row 9 and column 7 - 03(MAR|Mar)(63|1963)[:\s]09:46:40(.00)? vs 03Mär63 09:46:40

The tests fail because the expected "Mar" does not match the actual "Mär" (the German abbreviation). I tried setting the default locale to US:

    @Before
    public void setUp()
    {
        Locale.setDefault(Locale.US);
    }

but this only works when the test is run on its own, not when the whole build is running, even though the locale is set. See the output above: to make it print the default locale (en_US), I changed the assert to

    assertTrue(Locale.getDefault() + " " + error, ((Pattern)table[cn][rn]).matcher(val).matches());

A possible solution would be to change the test file to have June instead of March, but we could still run into trouble in, e.g., Russia, China, Korea, Thailand, Japan, ...
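
A possible explanation for the "works alone, fails in the full build" behaviour (an assumption, not verified against the Tika code) is that some date formatter is created before setUp() runs and keeps the locale that was the default at that moment. The sketch below only demonstrates that general JDK behaviour with SimpleDateFormat; the class and method names are hypothetical and are not part of Tika.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical demo class, not part of Tika.
class LocaleCaptureSketch {

    // The date from the failing rows: 3 March 1963.
    static Date sampleDate() {
        return new GregorianCalendar(1963, Calendar.MARCH, 3).getTime();
    }

    // Formats the month with a formatter pinned to an explicit locale,
    // so the JVM default locale does not matter.
    static String monthWith(Locale locale) {
        return new SimpleDateFormat("MMM", locale).format(sampleDate());
    }

    // Creates a formatter while `atCreation` is the default locale, then
    // switches the default to `atUse` before formatting. The formatter keeps
    // the locale it captured in its constructor.
    static String monthWithDefaultSwitch(Locale atCreation, Locale atUse) {
        Locale saved = Locale.getDefault();
        try {
            Locale.setDefault(atCreation);
            SimpleDateFormat fmt = new SimpleDateFormat("MMM"); // default captured here
            Locale.setDefault(atUse);                           // too late for fmt
            return fmt.format(sampleDate());
        } finally {
            Locale.setDefault(saved); // do not leak the change to other code
        }
    }

    public static void main(String[] args) {
        System.out.println(monthWith(Locale.US));
        System.out.println(monthWith(Locale.GERMANY));
        // Same text as the GERMANY line above, although the default is US at format time:
        System.out.println(monthWithDefaultSwitch(Locale.GERMANY, Locale.US));
    }
}
```

If something like this is what happens, setting the default locale in a @Before method cannot help for formatters that already exist; the locale would have to be passed explicitly where the formatter is created, or the default set before that code is first loaded.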

Tilman
