I'll try, but I am working with Solr 1.4.1.

On 17 February 2012 at 15:59, Erick Erickson <erickerick...@gmail.com> wrote:
> You should not have to do anything with Maven; the instructions
> you followed were from the 1.4.1 days.
>
> Assuming you're working with a 3.x build, here's a data-config
> that worked for me, just a straight distro. But note a couple of things:
> 1> For simplicity, I changed the schema.xml to NOT require
>    the id field. You'll probably have to change this back and
>    select a good <uniqueKey>.
> 2> I had to add this line to solrconfig.xml to find the path:
>    <lib dir="../../dist/"
>         regex="apache-solr-dataimporthandler-extras-\d.*\.jar"/>
> 3> If this all works without errors in the Solr log and you still
>    can't find anything, be sure you issue a commit.
>
> Best
> Erick
>
> <dataConfig>
>   <dataSource name="bin" type="BinFileDataSource"/>
>   <document>
>     <entity baseDir="/Users/Erick/testdocs" fileName=".*pdf" name="sd"
>             processor="FileListEntityProcessor" recursive="true"
>             rootEntity="false">
>       <entity dataSource="bin" format="text" name="tika-test"
>               processor="TikaEntityProcessor" url="${sd.fileAbsolutePath}">
>         <field column="Author" meta="true" name="author"/>
>         <field column="Content-Type" meta="true" name="title"/>
>         <!-- field column="title" name="title" meta="true"/ -->
>         <field column="text" name="text"/>
>       </entity>
>       <!-- field column="fileLastModified" name="date"
>            dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss" / -->
>       <field column="fileSize" meta="true" name="size"/>
>     </entity>
>   </document>
> </dataConfig>
>
> On Fri, Feb 17, 2012 at 9:35 AM, alessio crisantemi
> <alessio.crisant...@gmail.com> wrote:
> > Thanks, Gora, for your help.
> > I installed Maven and downloaded Tika following the guide, but I got an
> > error about the 'tika compiler' during the Tika build, and the Maven
> > installation of Tika stopped.
> >
> > Is there another way?
> > Thank you,
> > a.
> >
> > 2012/2/16 Gora Mohanty <g...@mimirtech.com>
> >
> >> On 16 February 2012 21:37, alessio crisantemi
> >> <alessio.crisant...@gmail.com> wrote:
> >> > here the log:
> >> >
> >> > org.apache.solr.handler.dataimport.DataImporter doFullImport
> >> > Grave: Full Import failed
> >> > org.apache.solr.handler.dataimport.DataImportHandlerException:
> >> > 'baseDir' is a required attribute Processing Document # 1
> >> [...]
> >>
> >> The exception message above is pretty clear. You need to define a
> >> baseDir attribute for the second entity.
> >>
> >> However, even if you fix this, the setup will *not* work for indexing
> >> PDFs. Did you read the URLs that I sent earlier?
> >>
> >> Regards,
> >> Gora
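
For anyone following the thread, here is a rough sketch of how the solrconfig.xml side of Erick's setup might look. Only the <lib> line comes from his message. The handler path "/dataimport" and the file name "data-config.xml" are assumptions chosen for illustration, and the relative dir presumably has to be adjusted to wherever the dist/ folder sits relative to your Solr home.

```xml
<!-- solrconfig.xml (fragment): a sketch, not a complete configuration -->
<config>
  <!-- From Erick's note 2: put the DataImportHandler extras jar on the
       classpath. The relative path depends on your directory layout. -->
  <lib dir="../../dist/"
       regex="apache-solr-dataimporthandler-extras-\d.*\.jar"/>

  <!-- Assumption: the handler path "/dataimport" and the config file name
       "data-config.xml" are illustrative; neither is given in the thread. -->
  <requestHandler name="/dataimport"
                  class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <!-- Points at the <dataConfig> file quoted in Erick's message -->
      <str name="config">data-config.xml</str>
    </lst>
  </requestHandler>
</config>
```

With a handler registered this way, an import would normally be started by requesting that handler with command=full-import. If nothing shows up in searches afterwards, Erick's point 3 about issuing a commit is the first thing to check.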