[ https://issues.apache.org/jira/browse/OODT-630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated OODT-630:
--------------------------------------
    Labels: memex  (was: )

> Upgrade OODT components from using Tika 0.8 to Tika 1.6
> -------------------------------------------------------
>
>                 Key: OODT-630
>                 URL: https://issues.apache.org/jira/browse/OODT-630
>             Project: OODT
>          Issue Type: Improvement
>          Components: file manager, metadata container, product server
>    Affects Versions: 0.6
>            Reporter: Rishi Verma
>            Assignee: Tyler Palsulich
>              Labels: memex
>             Fix For: 0.8
>
>         Attachments: OODT-630.Palsulich.101014.patch,
> OODT-630.Palsulich.101014.v3.patch, OODT-630.Palsulich.101014.v4.patch
>
>
> Currently, OODT makes use of Tika v0.8 (tika-core) for mime-detection
> purposes. This version is quite out-of-date, and is incompatible with the use
> of a tika-core or tika-app v1.3 JAR.
> Tika v1.3 contains numerous upgrades since 0.8 (see [1]), some of which
> include improved metadata generation for common files. These improved
> features are extremely useful for metadata gathering.
> If a project using OODT needs features provided with the v1.3 tika-core or
> tika-app JAR (e.g. a custom met extractor), it currently cannot use this
> version when interacting with OODT server-side components like filemgr,
> crawler etc., since it is incompatible with OODT's use of v0.8.
> One of the incompatibilities is the deprecation of the 'getMimeType' method,
> org.apache.tika.mime.MimeTypes.getMimeType(URL). This has been
> superseded by Tika.detect(URL.getPath()) &
> MimeTypes.getRegisteredMimeType(String).
> See the example exception below, thrown
> when crawler 0.6-SNAPSHOT was invoked
> while a 'tika-app-1.3.jar' was placed in the crawler's lib directory:
> ---
> Jun 18, 2013 3:40:07 PM org.apache.oodt.cas.crawl.ProductCrawler ingest
> INFO: ProductCrawler: Ready to ingest product: [/data/staging/IMG_2590.jpg]: ProductType: [GenericFile]
> Jun 18, 2013 3:40:07 PM org.apache.oodt.cas.filemgr.ingest.StdIngester setFileManager
> INFO: StdIngester: connected to file manager: [http://localhost:9000]
> Jun 18, 2013 3:40:07 PM org.apache.oodt.cas.filemgr.datatransfer.InPlaceDataTransferer setFileManagerUrl
> INFO: In Place Data Transfer to: [http://localhost:9000] enabled
> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.tika.mime.MimeTypes.getMimeType(Ljava/net/URL;)Lorg/apache/tika/mime/MimeType;
>     at org.apache.oodt.cas.filemgr.structs.Reference.<init>(Reference.java:115)
>     at org.apache.oodt.cas.filemgr.versioning.VersioningUtils.addRefsFromUris(VersioningUtils.java:251)
>     at org.apache.oodt.cas.filemgr.ingest.StdIngester.ingest(StdIngester.java:189)
>     at org.apache.oodt.cas.crawl.ProductCrawler.ingest(ProductCrawler.java:304)
>     at org.apache.oodt.cas.crawl.ProductCrawler.handleFile(ProductCrawler.java:188)
>     at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:108)
>     at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:75)
>     at org.apache.oodt.cas.crawl.daemon.CrawlDaemon.startCrawling(CrawlDaemon.java:82)
>     at org.apache.oodt.cas.crawl.cli.action.CrawlerLauncherCliAction.execute(CrawlerLauncherCliAction.java:55)
>     at org.apache.oodt.cas.cli.CmdLineUtility.execute(CmdLineUtility.java:331)
>     at org.apache.oodt.cas.cli.CmdLineUtility.run(CmdLineUtility.java:187)
>     at org.apache.oodt.cas.crawl.CrawlerLauncher.main(CrawlerLauncher.java:36)
> ---
> This JIRA issue seeks to document efforts to upgrade OODT's use of Tika
> from 0.8 to 1.3.
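The NoSuchMethodError in the trace above is a link-time mismatch: the filemgr code was compiled against a Tika version that has MimeTypes.getMimeType(URL), while the jar found at run time does not. As a rough illustration only, the sketch below uses plain Java reflection to check whether a given method signature is present on the classpath. The class name MethodProbe and the helper hasMethod are hypothetical and are not part of OODT or Tika; the Tika class and method names are the ones shown in the trace.

```java
import java.net.URL;

// Hypothetical helper, not part of OODT or Tika.
public class MethodProbe {

    /**
     * Returns true if the named class can be loaded and exposes a public
     * method with exactly this name and parameter list.
     */
    public static boolean hasMethod(String className, String methodName,
                                    Class<?>... paramTypes) {
        try {
            Class<?> cls = Class.forName(className);
            cls.getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Class missing, method missing, or class failed to link.
            return false;
        }
    }

    public static void main(String[] args) {
        // Signature taken from the stack trace; the result depends on which
        // Tika jar (if any) is on the classpath.
        boolean legacy = hasMethod("org.apache.tika.mime.MimeTypes",
                                   "getMimeType", URL.class);
        System.out.println("MimeTypes.getMimeType(URL) available: " + legacy);
    }
}
```

Run with the crawler's lib directory on the classpath, a probe like this would report which side of the incompatibility a deployment is on before an ingest is attempted. It only diagnoses the mismatch; it does not replace the upgrade itself.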
> ---
> [1] http://www.apache.org/dist/tika/CHANGES-1.3.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)