On Feb 4, 2008 3:40 PM, Shawn McKenzie <[EMAIL PROTECTED]> wrote:
> strip_tags() perhaps?

Perhaps; I've never been thrilled with strip_tags(), but it should
work well enough here. But combined with grep? I guess for most
searches grep would narrow things down reasonably well before you have
to start processing files in PHP. It would definitely only be useful
for a small site (as you suggested).
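
A minimal sketch of the strip_tags() filtering step, assuming grep (or
anything else) has already handed back the raw HTML of a candidate
file. The function name term_in_content() is made up for illustration.
One thing to watch for: strip_tags() removes the tags but keeps the
text between <style> and <script> tags, so those blocks are dropped
first.

```php
<?php
// Hypothetical helper (not from the thread): returns true when $term
// appears as a whole word in the visible text of $html.
function term_in_content(string $html, string $term): bool
{
    // strip_tags() leaves the contents of <script>/<style> in place,
    // so remove those blocks entirely before stripping tags.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html) ?? $html;

    // Put a space before every tag so text from adjacent cells or
    // paragraphs is not glued together once the tags are gone.
    $html = str_replace('<', ' <', $html);

    // Remove the markup and decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

    // Case-insensitive whole-word match against the remaining text.
    return preg_match('/\b' . preg_quote($term, '/') . '\b/i', $text) === 1;
}
```

With a filter like this, a page that merely uses a <table> for layout
should no longer match a search for "table", while a page that talks
about tables in its text still would.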

Identifying keywords wouldn't be all that difficult using the OP's
method either. The script could easily count the number of occurrences
of each word and create an index with the word, the URL, and the
number of occurrences (even excluding a list of noise words if
desired) without someone having to manually define a list of keywords.
It could be run as often as needed to keep the index up-to-date.
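
The counting step could look roughly like the sketch below. It assumes
the page text has already been extracted (for example with
strip_tags()) and is passed in as a URL => text array; build_index()
and the array layout are illustrative choices, not anything the OP
posted. Note that str_word_count() only recognizes letters, "'" and
"-", so numbers are ignored.

```php
<?php
// Hypothetical indexer: returns [word => [url => occurrences]],
// skipping any word listed in $noiseWords.
function build_index(array $pages, array $noiseWords = []): array
{
    // Flip the noise list so membership checks are simple isset() calls.
    $noise = array_flip(array_map('strtolower', $noiseWords));
    $index = [];

    foreach ($pages as $url => $text) {
        // Split the lower-cased text into words, then count each word.
        $words = str_word_count(strtolower($text), 1);
        foreach (array_count_values($words) as $word => $count) {
            if (isset($noise[$word])) {
                continue; // noise word, leave it out of the index
            }
            $index[$word][$url] = $count;
        }
    }

    return $index;
}
```

The resulting array could be written to a file or a database table and
rebuilt whenever the static pages change.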

However, the thing I like most about using FULLTEXT or something like
htdig is that they already provide a good combination of indexing and
advanced search operators.

Andrew

>
> Andrew Ballard wrote:
> > On Feb 4, 2008 3:13 PM, Shawn McKenzie <[EMAIL PROTECTED]> wrote:
> >> If there aren't many files and you don't intend to grow this site much
> >> larger and intend to always have static HTML, an easy implementation
> >> would be to read each file and search for the terms either in the
> >> keywords tag or in the entire file.
> >>
> >> Optionally, if you're on a *nix host you could exec() a grep for the
> >> terms which returns the matching lines in an array and display as needed.
> >>
> >> -Shawn
> >>
> >
> > I'm dreading any searches that contain terms like "table", "body",
> > "style", "background", etc. These could be perfectly legitimate search
> > terms, but without the right filter they would match every document in
> > the site, not just those that contain these terms in the actual
> > content as opposed to the markup.
> >
> > Andrew

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
