Getting indexed content of files using ExtractingRequestHandler

xan Sun, 14 Jul 2013 01:06:58 -0700

Hi,

I'm using the PHP Solr client (ver: 1.0.2).


I'm indexing content from my database.
Suppose $data is a stdClass object holding id, name, title, etc. from a
database entry.

Next, I create a Solr document and assign fields to it:

$doc = new SolrInputDocument();
$doc->addField ('id' , $data->id);
$doc->addField ('name' , $data->name);
....
....

I want to know how I can store the contents of a PDF file (whose path is
stored in $data->filepath) in the same Solr document, say in a field
called 'filecontent'.

Referring to the wiki, I was unable to figure out the proper cURL request
for this. I was able to create a completely new Solr document, but
how do I get the contents of the PDF file into the same Solr document so
that I can store them in a field?


$doc = new SolrInputDocument();
$doc->addField ('id' , $data->id);
$doc->addField ('name' , $data->name);
....
....
//fire the curl request here referring to the file at $data->filepath
$doc->addField ('filecontent' , /* content of the pdf file */);
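
For illustration, here is a minimal sketch of one way the placeholder above could be filled: ask the ExtractingRequestHandler for the text only (extractOnly=true), then add that text to the existing document. The helper names (buildExtractOnlyUrl, pickExtractedText, fetchPdfText), the Solr base URL, and the multipart field name 'file' are assumptions, not taken from the post. The exact key under which Solr returns the extracted text can vary, so the sketch takes the first string entry that is neither the response header nor a metadata entry. CURLFile needs PHP 5.5 or later.

```php
<?php
// Sketch only: helper names and the Solr base URL are hypothetical.

// Build the URL for an extract-only request (nothing is indexed).
function buildExtractOnlyUrl($solrBase)
{
    $params = array(
        'extractOnly'   => 'true', // return the extracted text, index nothing
        'extractFormat' => 'text', // plain text instead of XHTML
        'wt'            => 'json',
    );
    return rtrim($solrBase, '/') . '/update/extract?'
        . http_build_query($params, '', '&');
}

// Pick the extracted text out of the decoded JSON response.
// Assumption: it is the first string value whose key is not
// "responseHeader" and does not end in "_metadata".
function pickExtractedText(array $response)
{
    foreach ($response as $key => $value) {
        $key = (string) $key;
        if ($key === 'responseHeader' || substr($key, -9) === '_metadata') {
            continue;
        }
        if (is_string($value)) {
            return $value;
        }
    }
    return null;
}

// Upload the PDF and return its text, or null on failure.
function fetchPdfText($solrBase, $path)
{
    $ch = curl_init(buildExtractOnlyUrl($solrBase));
    curl_setopt_array($ch, array(
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => array(
            'file' => new CURLFile($path, 'application/pdf'),
        ),
        CURLOPT_RETURNTRANSFER => true,
    ));
    $body = curl_exec($ch);
    if ($body === false) {
        return null;
    }
    $decoded = json_decode($body, true);
    return is_array($decoded) ? pickExtractedText($decoded) : null;
}
```

With something like this in place, the last line of the snippet above could become $doc->addField('filecontent', $text), where $text is the non-null result of fetchPdfText('http://localhost:8983/solr', $data->filepath); the URL here is only an example.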

Also, instead of firing a raw cURL request, is there a better way? I don't
know whether the current PECL Solr client (1.0.2) supports indexing PDF
files.
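
As a possible alternative to a separate extraction step, the ExtractingRequestHandler accepts literal.<field> parameters, so the database fields and the PDF can go to Solr in one request, producing a single document. The sketch below only shows how such a request could be assembled; the helper names, the base URL, and the assumption that the schema has a 'filecontent' field are hypothetical. Note that a document sent this way replaces any existing document with the same id, so it would be used instead of, not in addition to, a separate add via SolrInputDocument.

```php
<?php
// Sketch only: helper names, URL, and field names are hypothetical.

// Build an /update/extract URL that carries the database fields as literals.
function buildExtractAndIndexUrl($solrBase, array $fields, $contentField)
{
    $params = array();
    foreach ($fields as $name => $value) {
        // literal.<name> becomes a regular field on the indexed document
        $params['literal.' . $name] = $value;
    }
    // Route the text Tika extracts ("content") into the chosen field
    $params['fmap.content'] = $contentField;
    $params['commit'] = 'true';
    return rtrim($solrBase, '/') . '/update/extract?'
        . http_build_query($params, '', '&');
}

// Upload the file to the URL built above; returns the raw response or false.
function postFileForIndexing($url, $path)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => array(
            'file' => new CURLFile($path, 'application/pdf'), // PHP 5.5+
        ),
        CURLOPT_RETURNTRANSFER => true,
    ));
    return curl_exec($ch);
}
```

A call might look like postFileForIndexing(buildExtractAndIndexUrl('http://localhost:8983/solr', array('id' => $data->id, 'name' => $data->name), 'filecontent'), $data->filepath).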




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Getting-indexed-content-of-files-using-ExtractingRequestHandler-tp4077856.html
Sent from the Solr - User mailing list archive at Nabble.com.
