Right, the affected class is the PDFParser. langid can't possibly use that. This sounds like a reasonable solution.
If scanners are their only concern, there's not much I can recommend aside from that. For others, we found that one way to mitigate that CVE is to add Woodstox to your classpath [0]. If you already have it, you _should_ be OK as is. I should shortly be able to share a triggering file if you want to test that this actually fixes the problem.

I concur that migrating from Tika 1.x to Tika 3.x, or, worse, having them both available, would be a nightmare.

[0] https://lists.apache.org/thread/ymw9kkh94kvw0s6plwvjrp577sl1wbp8

On Thu, Sep 18, 2025 at 1:24 PM David Eric Pugh <[email protected]> wrote:
>
> Hey all, I've got a client who would like a CVE for Tika fixed in Solr:
> https://nvd.nist.gov/vuln/detail/CVE-2025-54988
> They use the langid module but don't use the extraction module. So if the
> CVE is fixed in langid and they delete the extraction module from their
> deployments, they don't violate the CVE scanners.
> I took a stab at upgrading to Tika 3, and was able to do it for the langid
> module, but ran into very hard times on updating the extraction module. It's
> on Tika 1, and so upgrading it is the same as redoing it from scratch, and
> I'd rather focus on getting an extraction module that delegates Tika
> processing to a separate Tika Server.
> Before I try to press ahead with having Tika 1 and Tika 3 in Solr 9 and 10, I
> thought I would look for feedback on how crazy this would be? It appears
> that even if we migrate extraction to Tika Server, we'll still have a tika
> jar to support how we are using it in the langid module....
> Eric
