Hi Tim,

> On Nov 20, 2020, at 11:21 AM, Tim Allison <[email protected]> wrote:
> 
> Y.  That should do it.  I don't think we're currently documenting this.  It
> looks like POI and PDFBox also require jce unlimited to build.
> 
> Hmmm... Should we assumeTrue that jce is installed and then skip that unit
> test if not or do we want to require it to build Tika?

I think we should document that building Tika requires installing the JCE
Unlimited Strength Jurisdiction Policy Files.

For Java 8 on my Mac, this worked:

1. Go to https://www.oracle.com/java/technologies/javase-jce8-downloads.html, 
and click the download link.
2. Sign in with your Oracle account, accept the license, and wait for the 
(small) file to download.
3. Expand the downloaded zip file.

From a terminal:

    sudo cp ~/Downloads/UnlimitedJCEPolicyJDK8/US_export_policy.jar $JAVA_HOME/jre/lib/security/
    sudo cp ~/Downloads/UnlimitedJCEPolicyJDK8/local_policy.jar $JAVA_HOME/jre/lib/security/
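
To confirm the policy files took effect, something like the following could be run against the same JDK. This is only a minimal sketch: the class name JcePolicyCheck and its two helper methods are made up for illustration and are not part of Tika or POI. It relies on javax.crypto.Cipher.getMaxAllowedKeyLength, which reports 128 for AES under the restricted policy and a much larger value once the unlimited policy is in place.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

// Hypothetical helper (not part of Tika or POI): reports whether the running
// JVM allows AES keys longer than 128 bits, i.e. whether the unlimited
// strength policy is in effect.
public class JcePolicyCheck {

    // Largest AES key size in bits permitted by the installed policy,
    // or 0 if the AES algorithm is not available at all.
    public static int maxAesKeyLength() {
        try {
            return Cipher.getMaxAllowedKeyLength("AES");
        } catch (NoSuchAlgorithmException e) {
            return 0;
        }
    }

    // The restricted policy caps AES at 128 bits; anything larger means
    // the unlimited policy files are active.
    public static boolean isUnlimited() {
        return maxAesKeyLength() > 128;
    }

    public static void main(String[] args) {
        System.out.println("Max AES key length: " + maxAesKeyLength());
        System.out.println("Unlimited policy installed: " + isUnlimited());
    }
}
```

If we went the assumeTrue route instead, the same check could presumably guard the encrypted-document test, e.g. Assume.assumeTrue(JcePolicyCheck.isUnlimited()) in JUnit 4, though I haven't tried that against the Tika test suite.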

— Ken


> 
> On Fri, Nov 20, 2020 at 1:43 PM Ken Krugler <[email protected]>
> wrote:
> 
>> Hi all,
>> 
>> I was trying to build the 1.25-rc1 branch, and ran into this same issue
>> while building the Tika parsers:
>> 
>>> Tests run: 87, Failures: 0, Errors: 1, Skipped: 3, Time elapsed: 6.816 s <<< FAILURE! - in org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest
>>> org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest.testEncrypted Time elapsed: 0.286 s  <<< ERROR!
>>> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@c0de6c9
>>>      at org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest.testEncrypted(OOXMLParserTest.java:1120)
>>> Caused by: org.apache.poi.EncryptedDocumentException: Export Restrictions in place - please install JCE Unlimited Strength Jurisdiction Policy files
>>>      at org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest.testEncrypted(OOXMLParserTest.java:1120)
>> 
>> I assume I need to follow instructions at say
>> https://dzone.com/articles/install-java-cryptography-extension-jce-unlimited
>> to get the appropriate files installed, yes?
>> 
>> And is this documented for Tika somewhere?
>> 
>> Thanks,
>> 
>> — Ken
>> 
>> 
>>> On Jul 31, 2019, at 9:45 AM, Tim Allison <[email protected]> wrote:
>>> 
>>> Dave,
>>> So that I can fix stuff in the future...can you share with me how to
>>> fix this issue on Hudson?
>>> 
>>> org.apache.tika.parser.microsoft.OfficeParser@6f1fd7c1
>>> at org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest.testEncrypted(OOXMLParserTest.java:1234)
>>> Caused by: org.apache.poi.EncryptedDocumentException: Export
>>> Restrictions in place - please install JCE Unlimited Strength
>>> Jurisdiction Policy files
>>> at org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest.testEncrypted(OOXMLParserTest.java:1234)
>>> 
>>> Many thanks!
>>> 
>>>       Cheers,
>>> 
>>>             Tim
>>> 
>>> On Wed, Jul 31, 2019 at 12:43 PM Hudson (JIRA) <[email protected]> wrote:
>>>> 
>>>> 
>>>>   [ https://issues.apache.org/jira/browse/TIKA-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897315#comment-16897315 ]
>>>> 
>>>> Hudson commented on TIKA-2917:
>>>> ------------------------------
>>>> 
>>>> UNSTABLE: Integrated in Jenkins build tika-2.x-windows #446 (See [https://builds.apache.org/job/tika-2.x-windows/446/])
>>>> TIKA-2917 -- extract metadata that accompanies inline images (tallison: rev 86325105ab206dca88d076dc865fcb17404c4531)
>>>> * (edit) tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
>>>> * (edit) tika-parsers/src/main/java/org/apache/tika/parser/pdf/AbstractPDF2XHTML.java
>>>> * (edit) tika-parsers/src/main/java/org/apache/tika/parser/image/xmp/JempboxExtractor.java
>>>> * (edit) tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java
>>>> * (add) tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDMetadataExtractor.java
>>>> 
>>>> 
>>>>> Extract metadata from inline images in PDFs
>>>>> -------------------------------------------
>>>>> 
>>>>>               Key: TIKA-2917
>>>>>               URL: https://issues.apache.org/jira/browse/TIKA-2917
>>>>>           Project: Tika
>>>>>        Issue Type: Improvement
>>>>>          Reporter: Tim Allison
>>>>>          Assignee: Tim Allison
>>>>>          Priority: Minor
>>>>> 
>>>>> Inline images may have XMP associated with them.  We are not currently
>>>>> extracting this metadata.
>>>> 
>>>> 
>>>> 
>>>> --
>>>> This message was sent by Atlassian JIRA
>>>> (v7.6.14#76016)
>> 
>> --------------------------
>> Ken Krugler
>> http://www.scaleunlimited.com
>> custom big data solutions & training
>> Hadoop, Cascading, Cassandra & Solr
>> 
>> 

--------------------------
Ken Krugler
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr
