[ 
https://issues.apache.org/jira/browse/TIKA-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17551812#comment-17551812
 ] 

Hudson commented on TIKA-3788:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #634 (See 
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk8/634/])
TIKA-3788 -- Record embedded file exceptions in the container file's metadata. 
(tallison: 
[https://github.com/apache/tika/commit/6f2ef64a582328fb13198c97d51205b4d469424e])
* (edit) 
tika-core/src/main/java/org/apache/tika/extractor/ParsingEmbeddedDocumentExtractor.java
* (edit) CHANGES.txt
* (edit) 
tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/java/org/apache/tika/parser/AutoDetectParserTest.java
* (add) 
tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/test-documents/mock/null_pointer.xml.gz
* (edit) 
tika-core/src/main/java/org/apache/tika/metadata/TikaCoreProperties.java
* (edit) tika-core/src/main/java/org/apache/tika/parser/ParseRecord.java


> Allow embedded exceptions and warnings to percolate to the parent's metadata
> ----------------------------------------------------------------------------
>
>                 Key: TIKA-3788
>                 URL: https://issues.apache.org/jira/browse/TIKA-3788
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Major
>             Fix For: 2.4.1
>
>
> As part of work on TIKA-3787, I'll add a ParseRecord to the ParseContext.  
> This can be used by parsers that parse embedded files to record caught 
> exceptions and warning messages.  The CompositeParser keeps track of the 
> depth of its parse and, when the depth returns to 0, it will write these 
> exceptions and warnings to the Metadata object.
> I would still highly recommend /rmeta, -J, or the RecursiveParserWrapper, 
> but this new capability adds some functionality to the standard /tika 
> endpoint (with JSON output) and, programmatically, to the AutoDetectParser.
> Because this information is added to the metadata object _after_ the parse, 
> it will not come through in streaming contexts where the metadata object is 
> written to the xhtml before the content of the file is parsed.  So, this will 
> not add any benefit to /tika (text/html).
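
The depth-tracking idea in the description above can be illustrated with a
small, self-contained sketch. This is not Tika's implementation: the Record
and Doc classes, the parse method, and the "embedded-exceptions" key are all
hypothetical stand-ins, assumed only for illustration. The sketch shows
embedded failures being caught and remembered, then copied into the
container's metadata once the parse depth returns to 0.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types are Tika classes.
class ParseRecordSketch {

    /** Stand-in for the record carried through the parse (made-up API). */
    static class Record {
        int depth = 0;
        final List<String> exceptions = new ArrayList<>();
    }

    /** Stand-in for a document with optional embedded children. */
    static class Doc {
        final String name;
        final boolean broken;
        final List<Doc> children = new ArrayList<>();

        Doc(String name, boolean broken) {
            this.name = name;
            this.broken = broken;
        }

        Doc with(Doc child) {
            children.add(child);
            return this;
        }
    }

    // Made-up metadata key, assumed for this example only.
    static final String EMBEDDED_EXCEPTIONS = "embedded-exceptions";

    static void parse(Doc doc, Record record, Map<String, List<String>> metadata) {
        record.depth++;
        try {
            if (doc.broken) {
                throw new IllegalStateException("cannot parse " + doc.name);
            }
            for (Doc child : doc.children) {
                try {
                    parse(child, record, metadata);
                } catch (RuntimeException e) {
                    // Swallow the embedded failure, but remember it.
                    record.exceptions.add(e.getMessage());
                }
            }
        } finally {
            record.depth--;
            // Only the outermost parse writes the collected exceptions.
            if (record.depth == 0 && !record.exceptions.isEmpty()) {
                metadata.put(EMBEDDED_EXCEPTIONS, new ArrayList<>(record.exceptions));
            }
        }
    }

    static Map<String, List<String>> demo() {
        Doc container = new Doc("container.zip", false)
                .with(new Doc("ok.txt", false))
                .with(new Doc("inner.zip", false).with(new Doc("bad.xml", true)));
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        parse(container, new Record(), metadata);
        return metadata;
    }

    public static void main(String[] args) {
        // prints {embedded-exceptions=[cannot parse bad.xml]}
        System.out.println(demo());
    }
}
```

Because the metadata entry is written only in the outermost finally block,
anything that serializes the metadata before the parse finishes would not see
it, which matches the streaming caveat in the description.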



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
