[ 
https://issues.apache.org/jira/browse/TIKA-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462568#comment-17462568
 ] 

ASF GitHub Bot commented on TIKA-3627:
--------------------------------------

tballison merged pull request #467:
URL: https://github.com/apache/tika/pull/467

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@tika.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> OOXML parsing is not working as intended using multiple threads
> ---------------------------------------------------------------
>
>                 Key: TIKA-3627
>                 URL: https://issues.apache.org/jira/browse/TIKA-3627
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Bernhard Geisberger
>            Priority: Major
>
> In version 2.2.0, parsing of OOXML files is broken when multiple 
> threads are used. I compared the call stacks of 2.1.0 and 2.2.0 and 
> concluded that the cause is [this 
> commit|https://github.com/apache/tika/commit/10d925439cd862f74679ec5fa9a9b5863f50ce2c],
>  at line 86 of OOXMLExtractorFactory.
> In version 2.1.0, `ExtractorFactory.setThreadPrefersEventExtractors(true)` 
> is called in every `parse` call, so the thread-local property is set on 
> every thread that parses. In version 2.2.0, the call was moved into the 
> static block, which runs only once, on the thread that first initializes 
> the class. As a result, the property keeps its default value (false) on 
> every thread other than the first one. Effectively, this breaks the 
> parsing of macros in OOXML files.
> An easy workaround in version 2.2.0 is to call 
> `ExtractorFactory.setAllThreadsPreferEventExtractors(true)` at some point 
> before Tika is first used.
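
The behaviour described above can be reproduced with the JDK alone. The following is a minimal sketch, not the Tika or POI code: `PREFERS_EVENT`, `allThreadsPreferEvent`, `StaticInitParser`, and `PerCallParser` are hypothetical stand-ins for the thread-local property, the all-threads setting, and the 2.2.0 and 2.1.0 call placements. It assumes that the all-threads setting, when present, takes precedence over the per-thread value.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalStaticInitDemo {
    // Hypothetical stand-in for the per-thread preference; defaults to false.
    public static final ThreadLocal<Boolean> PREFERS_EVENT =
            ThreadLocal.withInitial(() -> Boolean.FALSE);

    // Hypothetical stand-in for the all-threads setting; null means "not set".
    public static volatile Boolean allThreadsPreferEvent = null;

    // Mirrors the 2.2.0 placement: the thread-local is set in a static block.
    public static class StaticInitParser {
        static {
            // Runs once, on whichever thread initializes this class first.
            PREFERS_EVENT.set(Boolean.TRUE);
        }

        public static Boolean parse() {
            return effectivePreference();
        }
    }

    // Mirrors the 2.1.0 placement: the thread-local is set on every parse call.
    public static class PerCallParser {
        public static Boolean parse() {
            PREFERS_EVENT.set(Boolean.TRUE);
            return effectivePreference();
        }
    }

    // Assumption: the all-threads value wins whenever it has been set.
    static Boolean effectivePreference() {
        Boolean global = allThreadsPreferEvent;
        return global != null ? global : PREFERS_EVENT.get();
    }

    // Runs the task on a fresh thread so no earlier thread-local value leaks in.
    public static boolean runOnNewThread(Callable<Boolean> task) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return pool.submit(task).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // First use on the main thread triggers the static block here.
        System.out.println("static init, first thread: " + StaticInitParser.parse());
        // A different thread never ran the static block and sees the default.
        System.out.println("static init, other thread: "
                + runOnNewThread(StaticInitParser::parse));
        // Setting the value inside parse() works on any thread.
        System.out.println("per-call set, other thread: "
                + runOnNewThread(PerCallParser::parse));
        // Workaround: set the all-threads value once before any parsing.
        allThreadsPreferEvent = Boolean.TRUE;
        System.out.println("workaround, other thread: "
                + runOnNewThread(StaticInitParser::parse));
    }
}
```

In this sketch only the second case yields false, which corresponds to the reported failure on every thread other than the first one.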



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
