>Benny Rasmussen wrote:
>> Hi,
>>
>> In my application I would like to offer a search interface like Google
>> and other popular search engines. The complication for me is to explode
>> the search string into proper array elements, like this:
>>
>> $search_str = "\"search for this sentence\" -NotForThisWord
>> ButDefinitelyForThisWord";
>>
>> $array[0]: "search for this sentence"
>> $array[1]: "-NotForThisWord"
>> $array[2]: "ButDefinitelyForThisWord"
>>
>> I have tried to use regular expressions but my case seems to be a bit
>> more complicated for this (?).
>>
>> Does anybody have a code snippet, a class or something, that can help
>> me with this?
--------------------[snip]-------------------- 

If I understand you correctly, you want to isolate either quoted strings
(with or without whitespace) or tokens separated by whitespace, each as
its own array element?

For this you would first have to isolate the first quoted sentence, then
tokenize the part before it, and loop on the remainder as long as it still
contains a quoted sentence. Whatever is left at the end is tokenized too.

It should work something like this:

------------<code>------------

function tokenize_search($input) {
    $re = '/\s*(.*?)\s*"\s*([^"]*?)\s*"\s*(.*)/s';
    /* look for 3 groups:
       a - prematch - anything up to the first quote
       b - match - anything until the next quote
       c - postmatch - rest of the string
    */
    $tokens = array();
    while (preg_match($re, $input, $aresult)) {
        // aresult contains: [0]-total [1]-a [2]-b [3]-c
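        // tokenize the prematch on any whitespace, skipping empty strings;
        // array_merge appends the words (array_push would nest the array)
        $words = preg_split('/\s+/', $aresult[1], -1, PREG_SPLIT_NO_EMPTY);
        $tokens = array_merge($tokens, $words);
        // the quoted sentence becomes a single token
        array_push($tokens, $aresult[2]);
        $input = $aresult[3];
    }
    // $input has the rest of the line
    $words = preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);
    return array_merge($tokens, $words);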
}
------------</code>------------

Disclaimer: untested as usual. _Should_ behave like this:

$string = '"search for this sentence" -NotForThisWord '
        . 'ButDefinitelyForThisWord';
$tokens = tokenize_search($string);
print_r($tokens);
Array
(
    [0] => search for this sentence
    [1] => -NotForThisWord
    [2] => ButDefinitelyForThisWord
)
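
If the loop feels heavy, the same split can probably be done in a single
pass with preg_match_all. The following is only a sketch under a few
assumptions of mine: the function name tokenize_search_single_pass is made
up, a quote only opens a phrase at the start of a token (optionally after
a leading "-"), and an unbalanced quote is simply treated as part of a
word.

```php
<?php
// Hypothetical single-pass variant; the name and the handling of
// -"quoted phrases" are assumptions, not part of the function above.
function tokenize_search_single_pass($input) {
    // Alternative 1: optional '-' (group 1), then a quoted phrase (group 2).
    // Alternative 2: a run of non-whitespace characters (group 3).
    $re = '/(-?)"([^"]*)"|(\S+)/';
    $tokens = array();
    preg_match_all($re, $input, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        if (isset($m[3]) && $m[3] !== '') {
            // plain word, e.g. -NotForThisWord
            $tokens[] = $m[3];
        } elseif (trim($m[2]) !== '') {
            // quoted phrase without its quotes, keeping a leading '-'
            $tokens[] = $m[1] . trim($m[2]);
        }
    }
    return $tokens;
}
```

Called with the $string from above, this should give the same three
tokens as tokenize_search(). An input like -"not this phrase" comes back
as the single token "-not this phrase", which may or may not be what you
want for exclusions.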


-- 
   >O     Ernest E. Vogelsinger
   (\)    ICQ #13394035
    ^     http://www.vogelsinger.at/



-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
