Tim Allison created TIKA-4439:
---------------------------------
Summary: Improve text extraction from EMF, round 2
Key: TIKA-4439
URL: https://issues.apache.org/jira/browse/TIKA-4439
Project: Tika
Issue Type: Task
Reporter: Tim Allison
In our recent regression testing for the 3.2.1 release, we found that the
changes made in TIKA-4432 increased the number of common tokens extracted from
EMF files by about 3%.
We also noticed some regressions. I'm opening this issue to track improvements
to EMF parsing (hopefully after the 3.2.1 release :D).