[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor
[ https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259290#comment-15259290 ]

Hudson commented on TIKA-1894:
------------------------------

SUCCESS: Integrated in tika-trunk-jdk1.7 #971 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/971/])
TIKA-1894 -- fix potential NPE in XMPMM extraction (tallison: rev 92a4835d02d94fddbc7d70c0507b8a32345662d9)
* tika-parsers/src/main/java/org/apache/tika/parser/image/xmp/JempboxExtractor.java
TIKA-1894 -- fix potential NPE in XMPMM extraction (tallison: rev ee60bc6e1b10e7abdb1d36464fb564b195f37dcc)
* tika-parsers/pom.xml
* CHANGES.txt

> Add XMPMM metadata extraction to JempboxExtractor
> -------------------------------------------------
>
>         Key: TIKA-1894
>         URL: https://issues.apache.org/jira/browse/TIKA-1894
>     Project: Tika
>  Issue Type: New Feature
>    Reporter: Tim Allison
>    Priority: Minor
>     Fix For: 1.13
>
> The XMP Media Management (XMPMM) section of xmp carries some useful
> information. We currently have keys for many of the important attributes in
> tika-core's o.a.t.metadata.XMPMM, and JempBox extracts the XMPMM schema, but
> the wiring between the two has not yet been installed.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
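The notification above names an NPE fix in the XMPMM extraction but does not show it. As a rough illustration only, the sketch below shows the kind of null guard that wiring an optional schema into metadata keys tends to need. Everything in it is an assumption made for illustration: `MediaManagementSketch` and `XmpmmWiringSketch` are hypothetical stand-ins, not the Jempbox or Tika classes, and a plain `Map` stands in for Tika's `Metadata`. The key strings follow the XMP property names but are not taken from the commits.

```java
import java.util.Map;

// Hypothetical stand-in for the media-management schema object that an XMP
// library hands back; either field may be null when the packet omits it.
final class MediaManagementSketch {
    final String documentId;
    final String instanceId;

    MediaManagementSketch(String documentId, String instanceId) {
        this.documentId = documentId;
        this.instanceId = instanceId;
    }
}

// Hypothetical wiring helper: copies schema values into a metadata map.
final class XmpmmWiringSketch {
    // Illustrative key names (assumption, not copied from tika-core).
    static final String DOCUMENT_ID = "xmpMM:DocumentID";
    static final String INSTANCE_ID = "xmpMM:InstanceID";

    private XmpmmWiringSketch() {
    }

    static void copy(MediaManagementSketch schema, Map<String, String> metadata) {
        // Guard 1: the whole XMPMM section may be missing from the packet.
        if (schema == null) {
            return;
        }
        putIfPresent(metadata, DOCUMENT_ID, schema.documentId);
        putIfPresent(metadata, INSTANCE_ID, schema.instanceId);
    }

    // Guard 2: an individual property may be missing even when the section exists.
    private static void putIfPresent(Map<String, String> metadata, String key, String value) {
        if (value != null) {
            metadata.put(key, value);
        }
    }
}
```

With this shape, a file that has no XMPMM section simply leaves the metadata untouched instead of failing the parse.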
[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor
[ https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202035#comment-15202035 ]

Hudson commented on TIKA-1894:
------------------------------

SUCCESS: Integrated in tika-2.x #52 (See [https://builds.apache.org/job/tika-2.x/52/])
TIKA-1894 -- clean up following recommendations from Ray Gauss and Bob (tallison: rev c58af959b6cc3f3a3d8f555d53b147388e36b01d)
* tika-parser-modules/tika-parser-xmp-module/src/main/java/org/apache/tika/module/xmp/internal/Activator.java
* tika-parser-modules/tika-parser-xmp-module/pom.xml
* tika-parser-modules/tika-parser-multimedia-module/pom.xml
* tika-parser-bundles/tika-parser-multimedia-bundle/pom.xml
* tika-parser-modules/tika-parser-xmp-commons/src/main/java/org/apache/tika/parser/xmp/XMPPacketScanner.java
* tika-parser-modules/tika-parser-xmp-module/src/main/java/org/apache/tika/parser/xmp/XMPPacketScanner.java
* tika-parser-modules/tika-parser-pdf-module/pom.xml
* tika-parser-modules/pom.xml
* tika-parser-modules/tika-parser-xmp-module/src/main/java/org/apache/tika/parser/xmp/JempboxExtractor.java
* tika-parser-modules/tika-parser-xmp-module/src/test/java/org/apache/tika/parser/xmp/JempboxExtractorTest.java
* tika-parser-modules/tika-parser-xmp-commons/src/test/java/org/apache/tika/parser/xmp/JempboxExtractorTest.java
* tika-parser-modules/tika-parser-xmp-commons/src/main/java/org/apache/tika/parser/xmp/JempboxExtractor.java
* tika-parser-bundles/tika-parser-pdf-bundle/pom.xml
* tika-parser-modules/tika-parser-xmp-commons/pom.xml
[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor
[ https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201733#comment-15201733 ]

Tim Allison commented on TIKA-1894:
-----------------------------------

Let's see if Hudson likes it... I just pushed this clean-up in 2.x. Thank you!
[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor
[ https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195282#comment-15195282 ]

Bob Paulin commented on TIKA-1894:
----------------------------------

I think that sounds like a good idea.
[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor
[ https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195210#comment-15195210 ]

Tim Allison commented on TIKA-1894:
-----------------------------------

Thank you, [~rgauss].

[~bobpaulin], if you're ok with this, I'll rename the module today.
[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor
[ https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193622#comment-15193622 ]

Ray Gauss II commented on TIKA-1894:
------------------------------------

The {{tika-xmp}} project deals with converting a populated Tika {{Metadata}} object into XMP. Perhaps that project should be renamed to something more specific at some point, but regardless, I don't think it's the right spot for this sort of shared parser code.

I'd vote for the simpler shared util jar, but I think it can still live next to the modules, something like {{/tika-parsers-modules/tika-parser-xmp-commons}}?
[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor
[ https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190840#comment-15190840 ]

Tim Allison commented on TIKA-1894:
-----------------------------------

Makes sense. There isn't a parser in there now, but at some point, I think I'd like to add a parser that combines the PacketScanner and the XMP extractor...won't have time for a while...though.

By "shared jar", would that be a tika-utils package at the main level? Does this belong in the tika-xmp module...or would we run into circular references eventually?

[~rgauss], any recommendations?
[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor
[ https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190439#comment-15190439 ]

Bob Paulin commented on TIKA-1894:
----------------------------------

[~talli...@mitre.org] So after looking at this, I'm thinking a new module might be overkill here. There are no parsers in it, so there's no need for an Activator class. Also, I see a number of the image classes instantiating objects that do not need to be instantiated.

{code}
new JempboxExtractor(metadata).parse(tis);
{code}

could be

{code}
JempboxExtractor.parse(metadata, tis);
{code}

I feel the pain that there is shared code between pdf and multimedia now. Maybe just a simple shared util jar?
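The static form suggested in the comment above could look roughly like the following. This is a minimal sketch under stated assumptions, not the actual `JempboxExtractor`: the class name `XmpExtractorSketch`, the `HAS_PACKET_KEY` key, and the use of a plain `Map` in place of Tika's `Metadata` are all hypothetical, and the body only checks for the `<?xpacket begin` marker rather than parsing XMP.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical stand-in for a stateless extractor; not the real Tika class.
final class XmpExtractorSketch {
    // Illustrative metadata key (assumption).
    static final String HAS_PACKET_KEY = "sketch:hasXmpPacket";
    // Start of an XMP packet header.
    private static final String PACKET_MARKER = "<?xpacket begin";

    // No instance state, so construction is disallowed.
    private XmpExtractorSketch() {
    }

    // Stateless entry point: the caller supplies both the metadata sink and
    // the stream, so nothing has to be stored in fields between calls.
    static void parse(Map<String, String> metadata, InputStream stream) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int read;
        while ((read = stream.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        // ISO-8859-1 maps every byte to one char, so binary data survives the decode.
        String text = new String(buffer.toByteArray(), StandardCharsets.ISO_8859_1);
        metadata.put(HAS_PACKET_KEY, Boolean.toString(text.contains(PACKET_MARKER)));
    }
}
```

A caller would then write `XmpExtractorSketch.parse(metadata, stream);` with no `new`. Since the extractor holds no state between calls, a static method avoids allocating a throwaway object per parse.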
[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor
[ https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183455#comment-15183455 ]

Hudson commented on TIKA-1894:
------------------------------

SUCCESS: Integrated in tika-2.x #47 (See [https://builds.apache.org/job/tika-2.x/47/])
TIKA-1894 - Add XMPMM support to PDFParser and JpegParser via Jempbox (tallison: rev dc4ca999c2855814158868af97e877cbcc74079a)
* CHANGES.txt
* tika-core/src/main/java/org/apache/tika/metadata/XMPMM.java
* tika-parser-modules/tika-parser-multimedia-module/pom.xml
* tika-parser-modules/tika-parser-multimedia-module/src/main/java/org/apache/tika/parser/image/xmp/XMPPacketScanner.java
* tika-parser-modules/pom.xml
* tika-parser-modules/tika-parser-multimedia-module/src/main/java/org/apache/tika/parser/jpeg/JpegParser.java
* tika-parser-modules/tika-parser-xmp-module/src/main/java/org/apache/tika/module/xmp/internal/Activator.java
* tika-parser-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
* tika-parser-modules/tika-parser-pdf-module/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java
* tika-parser-modules/tika-parser-multimedia-module/src/test/java/org/apache/tika/parser/image/xmp/JempboxExtractorTest.java
* tika-parser-modules/tika-parser-xmp-module/src/main/java/org/apache/tika/parser/xmp/JempboxExtractor.java
* tika-parser-modules/tika-parser-xmp-module/src/main/java/org/apache/tika/parser/xmp/XMPPacketScanner.java
* tika-parser-modules/tika-parser-pdf-module/pom.xml
* tika-parser-modules/tika-parser-xmp-module/pom.xml
* tika-parser-modules/tika-parser-multimedia-module/src/main/java/org/apache/tika/parser/image/TiffParser.java
* tika-parser-modules/tika-parser-xmp-module/src/test/java/org/apache/tika/parser/xmp/JempboxExtractorTest.java
* tika-parser-bundles/tika-parser-multimedia-bundle/pom.xml
* tika-parser-modules/tika-parser-multimedia-module/src/main/java/org/apache/tika/parser/image/xmp/JempboxExtractor.java
* tika-parser-bundles/tika-parser-pdf-bundle/pom.xml
* tika-parser-modules/tika-parser-multimedia-module/src/test/java/org/apache/tika/parser/jpeg/JpegParserTest.java
[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor
[ https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183166#comment-15183166 ]

Hudson commented on TIKA-1894:
------------------------------

SUCCESS: Integrated in tika-trunk-jdk1.7 #924 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/924/])
TIKA-1894: Add XMPMM support to PDFParser and JpegParser via Jempbox (tallison: rev c5d4ec6c50824a9a40fdd2b492bf7557d8d693f3)
* tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
* tika-parsers/src/test/java/org/apache/tika/parser/jpeg/JpegParserTest.java
* CHANGES.txt
* tika-parsers/src/main/java/org/apache/tika/parser/image/xmp/JempboxExtractor.java
* tika-parsers/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java
* tika-core/src/main/java/org/apache/tika/metadata/XMPMM.java
[jira] [Commented] (TIKA-1894) Add XMPMM metadata extraction to JempboxExtractor
[ https://issues.apache.org/jira/browse/TIKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183137#comment-15183137 ]

Tim Allison commented on TIKA-1894:
-----------------------------------

Update made to trunk with commit c5d4ec6c50824a9a40fdd2b492bf7557d8d693f3.

In 2.0, I'm not sure how to share JempboxExtractor with the multi-media-module and the pdf-module. As expected, we get a cyclic dependency error if I add the multi-media-module as a dependency to the pdf-module, and, even if it did work, that wasn't a good option.

Some options:
#. Create a tika-parser-xmp-module that would include helper functionality for extracting xmp packets & metadata. Is this enough to warrant a separate module?
#. Duplicate code (no!!!).
#. Other options?