Right, the affected class is the PDFParser. langid can't possibly use
that, so fixing langid and dropping the extraction module sounds like a
reasonable solution.

If scanners are their only concern, I don't have much to recommend
beyond that. For everyone else, we found that one way to mitigate the
CVE is to add Woodstox to your classpath [0]. If it is already there,
you _should_ be OK as is. I should be able to share a triggering file
shortly if you want to test that this actually fixes the problem.
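
For anyone who wants to check whether Woodstox is really the one being
picked up, here is a minimal sketch. It assumes the mitigation depends
on Woodstox being the StAX implementation that the standard JDK lookup
resolves, which is my reading rather than something spelled out above.
The class name StaxImplCheck is made up for illustration; the only
Woodstox-specific detail it relies on is that its classes live under
the com.ctc.wstx package.

```java
import javax.xml.stream.XMLInputFactory;

// Hypothetical helper, not part of Solr or Tika.
class StaxImplCheck {

    // Woodstox's StAX classes live under the com.ctc.wstx package.
    static boolean isWoodstox(String factoryClassName) {
        return factoryClassName.startsWith("com.ctc.wstx.");
    }

    // Ask the standard StAX lookup which implementation it resolves
    // on the current classpath.
    static String activeFactoryClassName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        String name = activeFactoryClassName();
        System.out.println("StAX input factory: " + name);
        System.out.println(isWoodstox(name)
                ? "Woodstox is active"
                : "Woodstox is NOT active");
    }
}
```

This only tells you something if it runs with the same classpath (and
class loader) that the Tika code sees, so treat the result as a hint
rather than proof that the deployment is covered.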

I concur that migrating from Tika 1.x to Tika 3.x, or, worse, having
them both available, would be a nightmare.

[0] https://lists.apache.org/thread/ymw9kkh94kvw0s6plwvjrp577sl1wbp8

On Thu, Sep 18, 2025 at 1:24 PM David Eric Pugh <[email protected]> wrote:
>
> Hey all, I've got a client who would like a CVE for Tika fixed in Solr: 
> https://nvd.nist.gov/vuln/detail/CVE-2025-54988
> They use the langid module but don't use the extraction module.   So if the 
> CVE is fixed in langid and they delete the extraction module from their 
> deployments, they don't violate the CVE scanners.
> I took a stab at upgrading to Tika 3, and was able to do it for langid 
> module, but ran into very hard times on updating the extraction module.  It's 
> on Tika 1, and so upgrading it is the same as redoing it from scratch, and 
> I'd rather focus on getting an extraction module that delegates Tika 
> processing to a separate Tika Server.
> Before I try to press ahead with having Tika 1 and Tika 3 in Solr 9 and 10, I 
> thought I would look for feedback on how crazy this would be?   It appears 
> that even if we migrate extraction to Tika Server, we'll still have a tika 
> jar to support how we are using it in the langid module....
> Eric
