Hi Team,

We are using "tika-app-2.6.0.jar" in our product.
We have received a CVE alert for "NEKOHTML".

We also found that "tika-app-2.6.0.jar" uses "NEKOHTML", but we are unable to 
find out which version of "NEKOHTML" Tika is using.

After extracting the jar, we see that the following files from nekohtml are 
present in tika-app:
>>>
org/cyberneko/html/HTMLElements$Element.class
org/cyberneko/html/HTMLElements$ElementList.class
org/cyberneko/html/HTMLElements.class
org/cyberneko/html/HTMLTagBalancer$ElementEntry.class
org/cyberneko/html/HTMLTagBalancer$Info.class
org/cyberneko/html/HTMLTagBalancer$InfoStack.class
org/cyberneko/html/HTMLTagBalancer.class
<<<
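
In case it helps to reproduce this, a listing like the one above can be produced 
with a short script. This is only a minimal sketch, assuming Python 3; the 
function name list_entries and the default prefix "org/cyberneko/" are our own 
illustrative choices and are not part of Tika:

```python
import zipfile


def list_entries(jar_path, prefix="org/cyberneko/"):
    """Return the sorted names of all jar entries that start with prefix."""
    # A jar file is a zip archive, so the standard zipfile module can read it.
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(name for name in jar.namelist() if name.startswith(prefix))
```

For example, print("\n".join(list_entries("tika-app-2.6.0.jar"))) would print 
one matching entry per line, assuming the jar is in the current directory.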

There are also a few entries regarding nekohtml upgrades in CHANGES.txt:
>>>
22. TIKA-144 - Upgrade nekohtml dependency (Jukka Zitting)
40. TIKA-164 - Upgrade of the nekohtml dependency to 1.9.9 (Jukka Zitting)
<<<

Summarizing my questions:

  1.  Which version of "NEKOHTML" is Tika currently using?
  2.  How is "NEKOHTML" included in tika-app, since there is no entry for it in 
pom.xml?
  3.  Is there a way to list all the dependencies of Tika (including 
4th-party libraries)?
Running mvn dependency:tree or mvn dependency:list on the source does not help 
in finding the version of "NEKOHTML".

Any help will be greatly appreciated.

Thanks,
Hanumesh
