[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor

2016-04-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259290#comment-15259290
 ] 

Hudson commented on TIKA-1894:
--

SUCCESS: Integrated in tika-trunk-jdk1.7 #971 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/971/])
TIKA-1894 -- fix potential NPE in XMPMM extraction (tallison: rev 92a4835d02d94fddbc7d70c0507b8a32345662d9)
* tika-parsers/src/main/java/org/apache/tika/parser/image/xmp/JempboxExtractor.java
TIKA-1894 -- fix potential NPE in XMPMM extraction (tallison: rev ee60bc6e1b10e7abdb1d36464fb564b195f37dcc)
* tika-parsers/pom.xml
* CHANGES.txt


> Add XMPMM metadata extraction to JempboxExtractor
> -
>
> Key: TIKA-1894
> URL: https://issues.apache.org/jira/browse/TIKA-1894
> Project: Tika
> Issue Type: New Feature
> Reporter: Tim Allison
> Priority: Minor
> Fix For: 1.13
>
>
> The XMP Media Management (XMPMM) section of xmp carries some useful 
> information.  We currently have keys for many of the important attributes in 
> tika-core's o.a.t.metadata.XMPMM, and JempBox extracts the XMPMM schema, but 
> the wiring between the two has not yet been installed. 
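
As a rough illustration of the "wiring" described above, together with the kind of null guard the NPE fix suggests, here is a minimal self-contained sketch. It does not use the real Tika or Jempbox classes: `MediaManagementSketch`, `XmpmmWiringSketch` and `copyMediaManagement` are hypothetical names, and a plain `Map` stands in for Tika's `Metadata`. The keys `xmpMM:DocumentID` and `xmpMM:InstanceID` are XMP Media Management property names used only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the media-management schema of a parsed XMP packet.
final class MediaManagementSketch {
    final String documentId;  // may be null when the packet omits it
    final String instanceId;  // may be null when the packet omits it

    MediaManagementSketch(String documentId, String instanceId) {
        this.documentId = documentId;
        this.instanceId = instanceId;
    }
}

// Hypothetical helper that copies schema fields into a metadata map.
final class XmpmmWiringSketch {
    private XmpmmWiringSketch() {}

    // Copies only the fields that are present; a missing schema adds nothing.
    static void copyMediaManagement(MediaManagementSketch schema,
                                    Map<String, String> metadata) {
        if (schema == null) {
            return;  // guard: the packet may carry no media-management section
        }
        putIfPresent(metadata, "xmpMM:DocumentID", schema.documentId);
        putIfPresent(metadata, "xmpMM:InstanceID", schema.instanceId);
    }

    private static void putIfPresent(Map<String, String> metadata,
                                     String key, String value) {
        if (value != null) {
            metadata.put(key, value);
        }
    }
}
```

The point of the sketch is only that both the schema and each individual field are checked before anything is written, so an XMP packet without a media-management section leaves the metadata untouched.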



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor

2016-03-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202035#comment-15202035
 ] 

Hudson commented on TIKA-1894:
--

SUCCESS: Integrated in tika-2.x #52 (See [https://builds.apache.org/job/tika-2.x/52/])
TIKA-1894 -- clean up following recommendations from Ray Gauss and Bob (tallison: rev c58af959b6cc3f3a3d8f555d53b147388e36b01d)
* tika-parser-modules/tika-parser-xmp-module/src/main/java/org/apache/tika/module/xmp/internal/Activator.java
* tika-parser-modules/tika-parser-xmp-module/pom.xml
* tika-parser-modules/tika-parser-multimedia-module/pom.xml
* tika-parser-bundles/tika-parser-multimedia-bundle/pom.xml
* tika-parser-modules/tika-parser-xmp-commons/src/main/java/org/apache/tika/parser/xmp/XMPPacketScanner.java
* tika-parser-modules/tika-parser-xmp-module/src/main/java/org/apache/tika/parser/xmp/XMPPacketScanner.java
* tika-parser-modules/tika-parser-pdf-module/pom.xml
* tika-parser-modules/pom.xml
* tika-parser-modules/tika-parser-xmp-module/src/main/java/org/apache/tika/parser/xmp/JempboxExtractor.java
* tika-parser-modules/tika-parser-xmp-module/src/test/java/org/apache/tika/parser/xmp/JempboxExtractorTest.java
* tika-parser-modules/tika-parser-xmp-commons/src/test/java/org/apache/tika/parser/xmp/JempboxExtractorTest.java
* tika-parser-modules/tika-parser-xmp-commons/src/main/java/org/apache/tika/parser/xmp/JempboxExtractor.java
* tika-parser-bundles/tika-parser-pdf-bundle/pom.xml
* tika-parser-modules/tika-parser-xmp-commons/pom.xml




[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor

2016-03-19 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201733#comment-15201733
 ] 

Tim Allison commented on TIKA-1894:
---

Let's see if Hudson likes it... I just pushed this clean-up in 2.x.  Thank you!



[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor

2016-03-15 Thread Bob Paulin (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195282#comment-15195282
 ] 

Bob Paulin commented on TIKA-1894:
--

I think that sounds like a good idea.



[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor

2016-03-15 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195210#comment-15195210
 ] 

Tim Allison commented on TIKA-1894:
---

Thank you, [~rgauss].  [~bobpaulin], if you're ok with this, I'll rename the 
module today.



[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor

2016-03-14 Thread Ray Gauss II (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193622#comment-15193622
 ] 

Ray Gauss II commented on TIKA-1894:


The {{tika-xmp}} project deals with converting a populated Tika {{Metadata}} 
object into XMP.

Perhaps that project should be renamed to something more specific at some 
point, but regardless, I don't think it's the right spot for this sort of 
shared parser code.

I'd vote for the simpler shared util jar, but I think it can still live next to 
the modules, something like {{/tika-parsers-modules/tika-parser-xmp-commons}}?



[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor

2016-03-11 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190840#comment-15190840
 ] 

Tim Allison commented on TIKA-1894:
---

Makes sense.  There isn't a parser in there now, but at some point, I think I'd 
like to add a parser that combines the PacketScanner and the XMP 
extractor.  I won't have time for a while, though.

By "shared jar", would that be a tika-utils package at the main level?

Does this belong in the tika-xmp module...or would we run into circular 
references eventually?  [~rgauss], any recommendations?



[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor

2016-03-10 Thread Bob Paulin (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190439#comment-15190439
 ] 

Bob Paulin commented on TIKA-1894:
--

[~talli...@mitre.org] After looking at this, I think a new module might 
be overkill here.  There are no parsers in it, so there is no need for an 
Activator class.  I also see a number of the image classes instantiating 
objects that do not need to be instantiated.
{code}
new JempboxExtractor(metadata).parse(tis);
{code}

could be
{code}
JempboxExtractor.parse(metadata, tis);
{code}

I feel the pain of having shared code between pdf and multimedia now.  
Maybe just a simple shared util jar?
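
A minimal, self-contained sketch of the static-helper shape suggested above. `ExtractorSketch` and the `sketch:bytesRead` key are hypothetical stand-ins, not the real `JempboxExtractor` or Tika's `Metadata`; the body only counts bytes so that the example runs without Tika or Jempbox on the classpath. The assumption is that the helper keeps no state between calls, which is what makes a static entry point reasonable.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;

// Hypothetical stand-in for an extractor that holds no per-call state.
final class ExtractorSketch {
    private ExtractorSketch() {}  // not meant to be instantiated

    // Static entry point: the caller passes both the metadata holder and the stream.
    static void parse(Map<String, String> metadata, InputStream stream)
            throws IOException {
        long total = 0;
        byte[] buffer = new byte[4096];
        int n;
        // Placeholder work: drain the stream and record how many bytes were seen.
        while ((n = stream.read(buffer)) != -1) {
            total += n;
        }
        metadata.put("sketch:bytesRead", Long.toString(total));
    }
}
```

A caller would then write `ExtractorSketch.parse(metadata, tis);` rather than constructing a throwaway object for each parse.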



[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor

2016-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183455#comment-15183455
 ] 

Hudson commented on TIKA-1894:
--

SUCCESS: Integrated in tika-2.x #47 (See [https://builds.apache.org/job/tika-2.x/47/])
TIKA-1894 - Add XMPMM support to PDFParser and JpegParser via Jempbox (tallison: rev dc4ca999c2855814158868af97e877cbcc74079a)
* CHANGES.txt
* tika-core/src/main/java/org/apache/tika/metadata/XMPMM.java
* tika-parser-modules/tika-parser-multimedia-module/pom.xml
* tika-parser-modules/tika-parser-multimedia-module/src/main/java/org/apache/tika/parser/image/xmp/XMPPacketScanner.java
* tika-parser-modules/pom.xml
* tika-parser-modules/tika-parser-multimedia-module/src/main/java/org/apache/tika/parser/jpeg/JpegParser.java
* tika-parser-modules/tika-parser-xmp-module/src/main/java/org/apache/tika/module/xmp/internal/Activator.java
* tika-parser-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
* tika-parser-modules/tika-parser-pdf-module/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java
* tika-parser-modules/tika-parser-multimedia-module/src/test/java/org/apache/tika/parser/image/xmp/JempboxExtractorTest.java
* tika-parser-modules/tika-parser-xmp-module/src/main/java/org/apache/tika/parser/xmp/JempboxExtractor.java
* tika-parser-modules/tika-parser-xmp-module/src/main/java/org/apache/tika/parser/xmp/XMPPacketScanner.java
* tika-parser-modules/tika-parser-pdf-module/pom.xml
* tika-parser-modules/tika-parser-xmp-module/pom.xml
* tika-parser-modules/tika-parser-multimedia-module/src/main/java/org/apache/tika/parser/image/TiffParser.java
* tika-parser-modules/tika-parser-xmp-module/src/test/java/org/apache/tika/parser/xmp/JempboxExtractorTest.java
* tika-parser-bundles/tika-parser-multimedia-bundle/pom.xml
* tika-parser-modules/tika-parser-multimedia-module/src/main/java/org/apache/tika/parser/image/xmp/JempboxExtractor.java
* tika-parser-bundles/tika-parser-pdf-bundle/pom.xml
* tika-parser-modules/tika-parser-multimedia-module/src/test/java/org/apache/tika/parser/jpeg/JpegParserTest.java




[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor

2016-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183166#comment-15183166
 ] 

Hudson commented on TIKA-1894:
--

SUCCESS: Integrated in tika-trunk-jdk1.7 #924 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/924/])
TIKA-1894: Add XMPMM support to PDFParser and JpegParser via Jempbox (tallison: rev c5d4ec6c50824a9a40fdd2b492bf7557d8d693f3)
* tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
* tika-parsers/src/test/java/org/apache/tika/parser/jpeg/JpegParserTest.java
* CHANGES.txt
* tika-parsers/src/main/java/org/apache/tika/parser/image/xmp/JempboxExtractor.java
* tika-parsers/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java
* tika-core/src/main/java/org/apache/tika/metadata/XMPMM.java




[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor

2016-03-07 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183137#comment-15183137
 ] 

Tim Allison commented on TIKA-1894:
---

Update made to trunk with commit c5d4ec6c50824a9a40fdd2b492bf7557d8d693f3.

In 2.0, I'm not sure how to share JempboxExtractor between the multi-media-module 
and the pdf-module.  As expected, we get a cyclic dependency error if I add the 
multi-media-module as a dependency to the pdf-module, and, even if it did work, 
that wouldn't be a good option.

Some options:

#. Create a tika-parser-xmp-module that would include helper functionality for 
extracting xmp packets & metadata.  Is this enough to warrant a separate module?
#. Duplicate code (no!!!).
#. Other options?

