Hello,

I am trying to create a new index.

I have over 130,000 full-text articles in a MySQL database, each ranging from
300 to 1,000 words.

I am trying to figure out the best practice for creating the index, because I
am running into "memory exhausted" errors once I get to around 15,000
articles.

I read http://framework.zend.com/manual/en/zend.search.index-creation.html

So I am basically doing this:

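<?php
// Create index
$index = Zend_Search_Lucene::create('/data/my-index');
// Loop through all the articles ($result is the MySQL result set for the
// articles query; sanitize() is my own cleanup helper)
while ($row = mysql_fetch_assoc($result)) {
  $title = trim($row["title"]);
  $content = trim($row["content"]);
  $pname = trim($row["PenName"]);
  // Build a fresh document for each article
  $doc = new Zend_Search_Lucene_Document();
  $doc->addField(Zend_Search_Lucene_Field::Text('title', sanitize($title)));
  $doc->addField(Zend_Search_Lucene_Field::Text('author', sanitize($pname)));
  // UnStored: the text is indexed for searching but not stored in the index
  $doc->addField(Zend_Search_Lucene_Field::UnStored('contents', sanitize($content)));
  $index->addDocument($doc);
}
// Commit once, after all articles have been added
$index->commit();
?>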

The script crashes with a "memory exhausted" error (PHP's memory_limit is set
to 40MB).

I am trying to figure out what is being loaded into memory, and what the best
way would be to run a script for a few hours that indexes my whole database
from start to end.
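
To see where the memory goes, I am considering instrumenting the loop along
the lines below. This is only a sketch under my own assumptions:
indexInBatches(), the batch size, and the two callbacks are hypothetical names
of mine, not part of Zend_Search_Lucene. In the real script, $addRow would
build the document and call $index->addDocument(), and $flush would call
$index->commit(). I do not know yet whether committing in batches actually
lowers the peak memory; the log is meant to show that. The iterable type hint
also needs a newer PHP version than some servers run.

```php
<?php
// Hypothetical helper: feed rows to $addRow, call $flush after every
// $batchSize rows, and record memory usage at each flush.
function indexInBatches(iterable $rows, callable $addRow, callable $flush, int $batchSize = 500): array
{
    $log = [];
    $count = 0;
    foreach ($rows as $row) {
        $addRow($row);
        $count++;
        if ($count % $batchSize === 0) {
            $flush();
            $log[] = ['rows' => $count, 'bytes' => memory_get_usage()];
        }
    }
    // Flush whatever is left over from the last partial batch.
    if ($count % $batchSize !== 0) {
        $flush();
        $log[] = ['rows' => $count, 'bytes' => memory_get_usage()];
    }
    return $log;
}
```

If the 'bytes' values keep climbing from one batch to the next, something is
being held across batches; if they stay flat, the growth happens between
commits.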

Any help would be appreciated.

