[ 
https://issues.apache.org/jira/browse/TIKA-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16707280#comment-16707280
 ] 

Tim Allison commented on TIKA-2791:
-----------------------------------

I'd focus on a handful of common tags: p, div, ul, ol, li, table, tr, td, u, 
i, b, a. Any others?
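
As a rough illustration of what counting those tags might look like, here is a minimal sketch in Python using only the standard library. It is not tika-eval code: the names STRUCTURE_TAGS, count_structure_tags and diff_counts are hypothetical, and it assumes the extract is available as an (X)HTML string.

```python
from collections import Counter
from html.parser import HTMLParser

# Tag list taken from the comment above; extend as needed.
STRUCTURE_TAGS = {"p", "div", "ul", "ol", "li", "table", "tr", "td",
                  "u", "i", "b", "a"}


class TagCounter(HTMLParser):
    """Counts opening tags that belong to STRUCTURE_TAGS."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in STRUCTURE_TAGS:
            self.counts[tag] += 1


def count_structure_tags(xhtml):
    """Return a Counter of structure tags found in one extract."""
    parser = TagCounter()
    parser.feed(xhtml)
    parser.close()
    return parser.counts


def diff_counts(a, b):
    """Return {tag: count_a - count_b} for tags whose counts differ."""
    return {t: a[t] - b[t] for t in sorted(STRUCTURE_TAGS) if a[t] != b[t]}


if __name__ == "__main__":
    old = count_structure_tags("<div><p>one</p><p>two <b>x</b></p></div>")
    new = count_structure_tags("<div><p>one two x</p></div>")
    print(diff_counts(old, new))
```

Comparing two extracts of the same file then reduces to a per-tag difference, which is the kind of number that could sit next to the existing tika-eval columns.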

> Add structure tags to tika-eval
> -------------------------------
>
>                 Key: TIKA-2791
>                 URL: https://issues.apache.org/jira/browse/TIKA-2791
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Major
>
> It would be useful to be able to compare counts of common structure tags in 
> tika-eval. We could also detect and flag bad structure tags, such as 
> improperly nested ones, e.g.: <i><u></i></u>
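
One possible way to flag the nesting problem in the example above is a simple stack check. The sketch below is in Python with the standard library only; NestingChecker and find_misnested are hypothetical names, not part of Tika. It assumes XHTML-style input where void elements are self-closed (e.g. <br/>), and it reports only mismatched closing tags, not tags left unclosed at the end of the input.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Records closing tags that do not match the innermost open tag."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open tags, innermost last
        self.problems = []   # (closing tag, tag that was expected) pairs

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
            return
        expected = self.stack[-1] if self.stack else None
        self.problems.append((tag, expected))
        # Recover by dropping the most recent matching open tag, if any,
        # so that one mistake is not reported again for later closing tags.
        for idx in range(len(self.stack) - 1, -1, -1):
            if self.stack[idx] == tag:
                del self.stack[idx]
                break


def find_misnested(xhtml):
    """Return a list of (closing_tag, expected_tag) mismatches."""
    checker = NestingChecker()
    checker.feed(xhtml)
    checker.close()
    return checker.problems


if __name__ == "__main__":
    print(find_misnested("<i><u></i></u>"))
```

For the example input, the closing i arrives while u is still the innermost open tag, so a single mismatch is recorded. A count of such mismatches per file is one candidate for a "bad structure" flag.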



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
