Ashley Sheridan wrote:

> Here's the rub though. As the content is in HTML form, I can't just
> grab the first 100 characters and display them as that could leave an
> open tag without a closing one, potentially breaking the page. I
> could use strip_tags on the 100-character excerpt, but what if the
> excerpt itself broke a tag in half (i.e. <acronym title="something">
> could become <acron )
> 
> The only solutions I can see are:
> 
> 
>       * retrieve the entire article, perform a strip_tags and then
>       take the excerpt
>       * use a regex inside of mysql to pull out only the text
> 

- parse the HTML and extract the text elements.

If the HTML is well-formed, this is relatively easy to do with XSLT. If
it is not, you might need a more forgiving parser such as Beautiful Soup
or similar.
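
For what it's worth, here is a minimal sketch of the "parse and extract
the text" approach in PHP itself. It assumes the dom and mbstring
extensions are available and that the stored article is UTF-8. The
function name html_excerpt() and its parameters are made up for
illustration. DOMDocument::loadHTML() uses libxml's HTML parser, which
usually copes with tag soup, so this may also cover the
not-quite-well-formed case.

```php
<?php
// Hypothetical helper: the name and signature are illustrative only.
function html_excerpt(string $html, int $length = 100): string
{
    $doc = new DOMDocument();

    // Collect parser warnings about sloppy markup instead of printing them.
    $previous = libxml_use_internal_errors(true);

    // The leading XML declaration is a widely used workaround that makes
    // loadHTML() read the input as UTF-8 (an assumption about the data).
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // textContent joins every descendant text node, so no markup is left.
    // Note that it would also include the contents of <script> or <style>.
    $root = $doc->documentElement;
    $text = $root !== null ? $root->textContent : '';

    // Collapse runs of whitespace left behind by the removed tags.
    $text = trim(preg_replace('/\s+/u', ' ', $text));

    // Cut on character boundaries rather than bytes.
    return mb_substr($text, 0, $length, 'UTF-8');
}
```

Calling html_excerpt($article, 100) should give a plain-text excerpt
that cannot contain half a tag, because the cut happens after the markup
is gone. The result is plain text with entities decoded, so run it
through htmlspecialchars() before echoing it into a page.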



-- 
Per Jessen, Zürich (16.1°C)


--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
