Sorry, my error! In that case you *do* have to do some fiddling to get it all to work.
Good Luck!
Erick

On Fri, Feb 17, 2012 at 3:27 PM, alessio crisantemi
<alessio.crisant...@gmail.com> wrote:
> I'll try... but I work with Solr 1.4.1....
>
> On 17 February 2012 15:59, Erick Erickson
> <erickerick...@gmail.com> wrote:
>
>> You should not have to do anything with Maven, the instructions
>> you followed were from 1.4.1 days......
>>
>> Assuming you're working with a 3.x build, here's a data-config
>> that worked for me, just a straight distro. But note a couple of things:
>> 1> for simplicity, I changed the schema.xml to NOT require
>>    the id field. You'll have to change this back probably and
>>    select a good <uniqueKey>
>> 2> I had to add this line to solrconfig.xml to find the path:
>>    <lib dir="../../dist/"
>>         regex="apache-solr-dataimporthandler-extras-\d.*\.jar"/>
>> 3> If this all works without errors in the Solr log and you still
>>    can't find anything, be sure you issue a commit.
>>
>> Best
>> Erick
>>
>> <dataConfig>
>>   <dataSource name="bin" type="BinFileDataSource"/>
>>   <document>
>>     <entity baseDir="/Users/Erick/testdocs" fileName=".*pdf" name="sd"
>>             processor="FileListEntityProcessor" recursive="true"
>>             rootEntity="false">
>>       <entity dataSource="bin" format="text" name="tika-test"
>>               processor="TikaEntityProcessor" url="${sd.fileAbsolutePath}">
>>         <field column="Author" meta="true" name="author"/>
>>         <field column="Content-Type" meta="true" name="title"/>
>>         <!-- field column="title" name="title" meta="true"/ -->
>>         <field column="text" name="text"/>
>>       </entity>
>>       <!-- field column="fileLastModified" name="date"
>>            dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss" / -->
>>       <field column="fileSize" meta="true" name="size"/>
>>     </entity>
>>   </document>
>> </dataConfig>
>>
>> On Fri, Feb 17, 2012 at 9:35 AM, alessio crisantemi
>> <alessio.crisant...@gmail.com> wrote:
>> > Thanks Gora for your help.
>> > I installed Maven and downloaded Tika following the guide, but I get an
>> > error during the Tika build about 'tika compiler', and the Maven
>> > installation of Tika stops.
>> >
>> > Is there another way?
>> > Thank you
>> > a.
>> >
>> > 2012/2/16 Gora Mohanty <g...@mimirtech.com>
>> >
>> >> On 16 February 2012 21:37, alessio crisantemi
>> >> <alessio.crisant...@gmail.com> wrote:
>> >> > Here is the log:
>> >> >
>> >> > org.apache.solr.handler.dataimport.DataImporter doFullImport
>> >> > Grave: Full Import failed
>> >> > org.apache.solr.handler.dataimport.DataImportHandlerException:
>> >> > 'baseDir' is a required attribute Processing Document # 1
>> >> [...]
>> >>
>> >> The exception message above is pretty clear. You need to define a
>> >> baseDir attribute for the second entity.
>> >>
>> >> However, even if you fix this, the setup will *not* work for indexing
>> >> PDFs. Did you read the URLs that I sent earlier?
>> >>
>> >> Regards,
>> >> Gora