I am trying to index Word, PDF and other documents with Solr. I installed
the latest nightly build of Solr on March 17. I followed the
instructions in the Wiki for ExtractingRequestHandler at
http://wiki.apache.org/solr/ExtractingRequestHandler#head-c95841f9eda007b6b4e4594ead12a04223cf7b6e.

I have produced text output from PDF files with Tika in the nightly
build directories.

When I try the suggested test curl commands in the "Getting Started with
the Solr Example" section of the Wiki page, I get the error below. Any
idea what I've done wrong? Thanks in advance for your help.

$ curl http://localhost:8983/solr/update/extract?ext.idx.attr=true\&ext.def.fl=text -F "myfi...@tutorial.pdf"
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
<title>Error 500 </title>
</head>
<body><h2>HTTP ERROR: 500</h2><pre>org.apache.solr.common.SolrException: Document [null] missing required field: id

org.apache.solr.common.SolrException: org.apache.solr.common.SolrException: Document [null] missing required field: id
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:169)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1333)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
        at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
        at org.mortbay.jetty.Server.handle(Server.java:285)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
        at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: org.apache.solr.common.SolrException: Document [null] missing required field: id
        at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:292)
        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:59)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:90)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:95)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:157)
        ... 22 more
</pre>
<p>RequestURI=/solr/update/extract</p><p><i><small><a href="http://jetty.mortbay.org/">Powered by Jetty://</a></small></i></p><br/>
 

</body>
</html>
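
In case it helps anyone reproduce this from a script rather than the shell, here is a minimal sketch of how the request URL in the curl command above is put together. The helper name build_extract_url is my own invention, not part of Solr, and the sketch only builds the URL; it does not upload the file or send any request.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, params):
    """Hypothetical helper (not a Solr API): append the extract handler
    path and an encoded query string to the Solr base URL."""
    return f"{base_url}/update/extract?{urlencode(params)}"


# Same host, handler path, and parameters as the curl command above.
url = build_extract_url(
    "http://localhost:8983/solr",
    {"ext.idx.attr": "true", "ext.def.fl": "text"},
)
print(url)
```

Building the query string this way avoids having to escape the ampersand by hand, which is what the backslash in the shell command is for.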


Larry Reid
Principal Consultant, Jade Systems Inc
Mobile: +1 604.376.8884
Pragmatic IT Blog | El Blog Technologia Pragmatica | www.jadesystems.ca
