[
https://issues.apache.org/jira/browse/TIKA-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arpit updated TIKA-4694:
------------------------
Description:
We are currently using <tika-core.version>3.2.3</tika-core.version>, and the
Line Count and Paragraph Count attributes are not returned for files with the
doc extension.
Note: When we open the doc file in any tool, we can see the line count and
paragraph count.
was:
We are currently using <tika-core.version>3.2.3</tika-core.version>, and for
the subject attribute both subject and keywords are returned, instead of only
the subject, for doc and docx files.
This is the metadata attribute (dc:subject) that we use to fetch the subject,
and it returns both subject + keywords.
There is a related issue for PDF files that is already in the resolved state:
https://issues.apache.org/jira/browse/TIKA-4444
> Apache Tika Parser : Line Count and Paragraph count are not coming for doc
> extensions type using Tika File metadata
> --------------------------------------------------------------------------------------------------------------------
>
> Key: TIKA-4694
> URL: https://issues.apache.org/jira/browse/TIKA-4694
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 3.2.3
> Reporter: Arpit
> Priority: Major
>
> We are currently using <tika-core.version>3.2.3</tika-core.version>, and the
> Line Count and Paragraph Count attributes are not returned for files with the
> doc extension.
> Note: When we open the doc file in any tool, we can see the line count and
> paragraph count.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)