Good point, Uwe. I went over to Tika's home page and tried to figure out which JARs I need (so I don't have to use the Tika JARs that come with Solr). I looked around and couldn't find a "dist" of the JARs. From my reading under Getting Started, it looks like I have to build Tika from source to get the JARs. Is this true, or am I missing something? I would like to skip the build process and simply grab all the required JARs.

Steve

On Wed, Feb 3, 2016 at 12:28 PM, Uwe Schindler <u...@thetaphi.de> wrote:

> Hi,
>
> Morphlines stuff is not needed at all. This is a Mapreduce/Hadoop
> integration of Solr (see documentation) – mostly command line tools around
> Solr and Hadoop.
>
> FYI: In Solr we don’t show the warnings, because otherwise the user would
> get a lot of useless warnings. We may fix this when TIKA 2.0 comes with the
> tika-parsers module split into multiple parts. In Solr we currently removed
> all TIKA dependencies that conflict with the rest of Solr/Lucene or are not
> useful for fulltext indexing. We only left those that are useful for
> “fulltext” extraction (e.g. office document formats). But we have no
> parsers for CLASS files (breaks, because of ASM conflict) or Netcdf files
> (License issues in older versions of TIKA).
>
> I see no reason to show these warnings, because if you use Solr as
> documented, it should work correctly. We no longer support running Solr
> inside foreign Application Servers. So everything should work out of box.
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> *From:* Steven White [mailto:swhite4...@gmail.com]
> *Sent:* Wednesday, February 03, 2016 5:44 PM
> *To:* user@tika.apache.org
> *Subject:* Re: Using Tika that comes with Solr 5.2
>
> Nick, that would be a good think to do: changing Ignore to Warn otherwise
> newcomers will have no clue why this isn't working.
>
> Another question to the team regarding this topic.
>
> I see JARs under solr\contrib\morphlines-cell\lib\ and
> solr\contrib\morphlines-core\lib\ The ones under "morphlines-cell" there
> are 2 files with "tika" as their name. My question is, do I need those for
> general Tika usage? The README.txt clearly states "*Experimental*" but
> doesn't say if I need them to use Tika.
>
> Steve
>
> On Wed, Feb 3, 2016 at 9:16 AM, Nick Burch <apa...@gagravarr.org> wrote:
>
> On Wed, 3 Feb 2016, Uwe Schindler wrote:
>
> The reason for this behaviour is part of TIKA: If a parser cannot load
> because of classes it refers to are missing, it is automatically disabled.
> Because you missed the actual PDF/Powerpoint/… classes, this is what
> happens for all those parsers.
>
> I wonder if it might be worth SOLR changing their default Tika config from
> Ignore to Warn, so that SOLR users (who probably aren't as clued up on how
> it all works as the average Tika user) will get to find out more quickly
> that they've missed something?
>
> Nick
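
The "disabled when classes are missing" behaviour quoted above, and the difference between Ignore and Warn, can be illustrated with a small classpath probe. This is only a sketch using the plain JDK; it is not Tika's or Solr's actual loading code. `ClasspathCheck` and `loadable` are hypothetical names, and the two class names in `main` are assumed examples (a Tika 1.x PDF parser and the PDFBox class it relies on).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Tika or Solr.
class ClasspathCheck {

    /**
     * Returns the names from classNames that can be loaded from the current
     * classpath. With warn=false, missing classes are skipped silently
     * (the "Ignore" style); with warn=true, each miss is reported on stderr.
     */
    static List<String> loadable(List<String> classNames, boolean warn) {
        List<String> found = new ArrayList<>();
        for (String name : classNames) {
            try {
                // initialize=false: locate the class without running its static initializers
                Class.forName(name, false, ClasspathCheck.class.getClassLoader());
                found.add(name);
            } catch (ClassNotFoundException | LinkageError e) {
                if (warn) {
                    System.err.println("WARN: cannot load " + name + " (" + e + ")");
                }
                // otherwise the missing class is dropped without any message
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumed example class names; adjust to the parsers you care about.
        List<String> wanted = List.of(
            "org.apache.tika.parser.pdf.PDFParser",
            "org.apache.pdfbox.pdmodel.PDDocument");
        System.out.println("Loadable: " + loadable(wanted, true));
    }
}
```

Running `main` with the same classpath as the application shows which of the listed classes are actually visible, which is a quick way to spot a JAR that was left out.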