I'm working on some database search ranking code. It currently accounts for 95-98% of the time spent when doing fuzzy searches. I have tried my best to optimize my code (algorithmic shortcuts, eliminating session variables, unsetting irrelevant results, etc.) and have benchmarked to find the best techniques. That has given me more than a 10x improvement. Unfortunately, because of the number of results it must process (up to 20,000), it is still somewhat slow. I think it could use some code structure/formatting tweaks to eke out that last bit of performance, but I don't know much about optimizing PHP code in that way. Does anybody have suggestions?

Also, might there be some tweaks for PHP itself to improve performance? I already know about the Zend Optimizer and have it installed; it does help.

Thanks!

Here's my code:

if ($search_results[0]["relevancy"] == "")
{
    function cmp($a, $b)
    {
        if($a["relevancy"] < $b["relevancy"])
        {
            return 1;
        }
        elseif($a["relevancy"] > $b["relevancy"])
        {
            return -1;
        }
        else
        {
            return 0;
        }
    }

    $search_statements = $_SESSION["search"]["statements"];

    foreach($search_results as $key1 => $value1)
    {
        $num_fields_matched = 0;
        $result_score = 0;
        $metaphone_ratio = 0;
        foreach($search_statements as $key => $value)
        {
            if ($value != "" AND $value1[$key] != $value)
            {
                $value = strtolower(trim($value));
                $value1[$key] = strtolower(trim(($value1[$key])));
                $num_fields_matched++;
                $value_metaphone = metaphone($value1[$key]);
                $search_metaphone = metaphone($value);
                $search_position = strpos($value1[$key], $value);
                $string_count = substr_count($value1[$key], $value);
                $levenshtein = levenshtein($value, $value1[$key], "0.5", 1, 1);

                if ($search_metaphone == $value_metaphone AND $value_metaphone != "")
                {
                    $metaphone_ratio = 1;
                }
                elseif ($search_metaphone != 0)
                {
                    $metaphone_ratio = 0.6 * (1 / levenshtein($search_metaphone, $value_metaphone));
                }

                $result_score = $result_score + ($levenshtein + (8 * $search_position)) - (2 * ($string_count - 1)) - (1.1 * $metaphone_ratio * $levenshtein);
            }
            elseif ($value1[$key] == $value)
            {
                $result_score = $result_score - 5;
            }
        }
        if ($num_fields_matched == 0)
        {
            $num_fields_matched = 1;
        }
        $search_results[$key1]["relevancy"] = ($result_score * -1) / $num_fields_matched;

        if ($fuzzy_search == true AND $search_results[$key1]["relevancy"] < -5)
        {
            unset($search_results[$key1]);
        }
    }

    usort($search_results, "cmp");
}
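
As one possible direction: the work done on the search value in the inner loop (strtolower, trim and metaphone) does not depend on the result row, so it could in principle be computed once before the outer loop rather than once per result. The following is only a minimal sketch of that idea, not a benchmarked change. The helper name prepare_terms() and the sample data are hypothetical and are not part of the code above.

```php
<?php
// Hypothetical helper (not in the original code): normalize each non-empty
// search term once, before looping over the results.
function prepare_terms(array $search_statements)
{
    $prepared = array();
    foreach ($search_statements as $field => $term) {
        if ($term == "") {
            continue; // the original inner loop also ignores empty terms
        }
        $normalized = strtolower(trim($term));
        $prepared[$field] = array(
            "raw"       => $term,                   // kept for the exact-match comparison
            "term"      => $normalized,             // lowercased, trimmed search value
            "metaphone" => metaphone($normalized),  // computed once per term
        );
    }
    return $prepared;
}

// Sample data, invented for illustration only.
$search_statements = array("name" => " Smith ", "city" => "");
$prepared = prepare_terms($search_statements);
```

The inner loop could then iterate over $prepared and read $prepared[$key]["term"] and $prepared[$key]["metaphone"] instead of recomputing them for every row. Whether this is a measurable gain would need to be benchmarked against the real data.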


--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


