janhoy commented on PR #3670: URL: https://github.com/apache/solr/pull/3670#issuecomment-3325360384
@epugh I pushed another baby step of code:

* Subclassed the test, so all the `assertQ` tests run for both the `local` and `tikaserver` backends
* Made some more tikaserver tests pass

<img width="348" height="459" alt="Skjermbilde 2025-09-23 kl 19 29 10" src="https://github.com/user-attachments/assets/2f71100b-b1a4-4807-acbc-422aad69b11d" />

The failing tests typically fail because the SAX-style capture feature of the Tika API is not supported. Others fail because the metadata differs between Tika1 and TikaServer3: where Tika1 would output metadata keys like `title` and `author`, these are now normalized to `dc:title` and `dc:author`.

We can either provide a (feature-toggled?) mapper in TikaServer that tries to fill in those old metadata keys and make the tests pass that way, or modify the tests to look for the "modern" normalized equivalents, which are output by both Tika1 and TikaServer. I think we can start with the latter.

Unfortunately it looks like Crave cannot run TestContainers tests (no Docker available?).
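For illustration only, the feature-toggled mapper option mentioned above could look roughly like the sketch below. The class name `LegacyMetadataKeyMapper`, the method `addLegacyKeys`, and the `enabled` flag are all hypothetical and do not exist in Solr or Tika. The two key pairs are taken from the comment as written, and a real mapping would likely need more entries.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Solr or Tika: back-fills legacy metadata keys. */
final class LegacyMetadataKeyMapper {

  // Normalized key -> legacy Tika1-style key.
  // Assumption: only the two pairs named in the comment are listed here.
  private static final Map<String, String> NORMALIZED_TO_LEGACY =
      Map.of("dc:title", "title", "dc:author", "author");

  private LegacyMetadataKeyMapper() {}

  /**
   * Returns a copy of the metadata. When the toggle is enabled, each legacy key is
   * added with the value of its normalized counterpart, unless the legacy key is
   * already present. The input map is never modified.
   */
  static Map<String, String> addLegacyKeys(Map<String, String> metadata, boolean enabled) {
    Map<String, String> result = new LinkedHashMap<>(metadata);
    if (!enabled) {
      return result; // toggle off: pass the metadata through unchanged
    }
    for (Map.Entry<String, String> pair : NORMALIZED_TO_LEGACY.entrySet()) {
      String value = metadata.get(pair.getKey());
      if (value != null) {
        // Keep an existing legacy value rather than overwriting it.
        result.putIfAbsent(pair.getValue(), value);
      }
    }
    return result;
  }
}
```

With the toggle on, metadata containing `dc:title` would also expose the same value under `title`, so the old assertions could pass unchanged. With the toggle off, the metadata passes through untouched, which matches the alternative of updating the tests to the normalized keys.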
