Hi

I'm trying to index PDF files in Solr 4.3.0 using the DataImportHandler.

*My request handler:*

<requestHandler name="/dataimport1"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config1.xml</str>
  </lst>
</requestHandler>

*My data-config1.xml:*

<dataConfig>
  <dataSource type="BinFileDataSource" />
  <document>
    <entity name="f" dataSource="null" rootEntity="false"
            processor="FileListEntityProcessor"
            baseDir="C:\Users\aroraarc\Desktop\Impdo" fileName=".*pdf"
            recursive="true">
      <entity name="tika-test" processor="TikaEntityProcessor"
              url="${f.fileAbsolutePath}" format="text">
        <field column="Author" name="author" meta="true"/>
        <field column="title" name="title1" meta="true"/>
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>


When I try to index the files, I get the following error:

org.apache.solr.common.SolrException: Document is missing mandatory uniqueKey field: id
        at org.apache.solr.update.AddUpdateCommand.getIndexedId(AddUpdateCommand.java:88)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:517)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:396)
        at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
        at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:70)
        at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:235)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:500)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:491)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:404)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:319)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:227)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:422)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:487)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:468)


With database indexing this is easy to solve, because a table column can
serve as the key, but I don't know how to supply the unique key for a
file-based document. How do I define the id field (the unique key) for a
PDF file so that this error goes away?
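
One direction that might work, as a sketch only and not verified against 4.3.0: FileListEntityProcessor exposes an implicit fileAbsolutePath column for every file it lists, and since that path is unique per file it could be mapped onto the uniqueKey field in the outer entity. The added field line and its comment are the only change; it assumes schema.xml declares "id" as a string-typed uniqueKey, and everything else is copied from the config above.

```xml
<dataConfig>
  <dataSource type="BinFileDataSource" />
  <document>
    <entity name="f" dataSource="null" rootEntity="false"
            processor="FileListEntityProcessor"
            baseDir="C:\Users\aroraarc\Desktop\Impdo" fileName=".*pdf"
            recursive="true">
      <!-- Assumption: the schema's uniqueKey is a string field named "id".
           fileAbsolutePath is an implicit column of FileListEntityProcessor;
           mapping it here should give each PDF its own key. -->
      <field column="fileAbsolutePath" name="id"/>
      <entity name="tika-test" processor="TikaEntityProcessor"
              url="${f.fileAbsolutePath}" format="text">
        <field column="Author" name="author" meta="true"/>
        <field column="title" name="title1" meta="true"/>
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

If that assumption holds, rerunning the import through /dataimport1 with command=full-import should produce one document per PDF, keyed by its absolute path.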

Thanks in advance




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unique-key-error-while-indexing-pdf-files-tp4074314.html
Sent from the Solr - User mailing list archive at Nabble.com.
