If the HTML files were valid XHTML, extracting the styles with XSLT would be a relatively easy job.

On 6/1/05, Patrick H. Lauke < [EMAIL PROTECTED]> wrote:
Scott Swabey (Lafinboy Productions) wrote:

> I am not aware of any package that would do this for you, but it should be
> quite easy to set up a regular expression routine to strip all style='foo'
> content from a page.
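
For illustration, a regular expression routine of the kind Scott describes might look
roughly like the sketch below. The function name strip_inline_styles is hypothetical, and
the pattern assumes quoted attribute values inside ordinary tags; it is not a full HTML
parser and will miss unquoted values or unusual markup.

```python
import re

# Matches a style attribute (single- or double-quoted) that sits inside a tag.
# Group 1 captures everything in the tag before the attribute so it can be kept.
STYLE_ATTR = re.compile(
    r"""(<[^<>]*?)\s+style\s*=\s*(?:"[^"]*"|'[^']*')""",
    re.IGNORECASE,
)


def strip_inline_styles(html: str) -> str:
    """Return html with inline style attributes removed (hypothetical helper)."""
    return STYLE_ATTR.sub(r"\1", html)


if __name__ == "__main__":
    print(strip_inline_styles("<p style='color: red' class=\"note\">Hello</p>"))
```

Note that this only strips the styles; it throws the declarations away rather than moving
them anywhere.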

The harder part would be to have it not just strip out the styles, but
externalise them while keeping them working in the final output. The
brute force approach would be to just assign a unique, random ID to each
element where a style attribute is found, then create a matching,
specific entry in the new external CSS. The - hypothetical - right way
would be for a parser to analyse the entire document structure, work out
how styles can be applied generically (all paragraphs inside this div
have a certain style, so create a rule for div#blah p rather than
individual style rules), and still find the odd few "special" cases and
assign a class.
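
The brute force approach could be sketched roughly as follows. This is only an
illustration under some assumptions: the input is well-formed XHTML (so a plain XML parser
can read it), the function name externalise_styles and the "ext-style-N" ID scheme are
made up for the example, and sequential IDs stand in for random ones. It does nothing
towards the smarter, generic rules.

```python
import itertools
import xml.etree.ElementTree as ET

# Serialise XHTML elements without an "ns0:" prefix if the input declares the namespace.
ET.register_namespace("", "http://www.w3.org/1999/xhtml")


def externalise_styles(xhtml: str) -> tuple[str, str]:
    """Move every style attribute into an ID rule; return (markup, css)."""
    root = ET.fromstring(xhtml)
    counter = itertools.count(1)
    rules = []
    for element in root.iter():
        style = element.attrib.pop("style", None)
        if style is None:
            continue
        element_id = element.get("id")
        if element_id is None:
            # Hypothetical naming scheme; assumes no existing ID uses this prefix.
            element_id = f"ext-style-{next(counter)}"
            element.set("id", element_id)
        declarations = style.strip().rstrip(";")
        rules.append(f"#{element_id} {{ {declarations}; }}")
    return ET.tostring(root, encoding="unicode"), "\n".join(rules)


if __name__ == "__main__":
    markup, css = externalise_styles(
        '<div><p style="color: red;">a</p><p id="x" style="margin:0">b</p></div>'
    )
    print(markup)
    print(css)
```

One caveat with even this simple version: an ID rule does not carry the same weight in the
cascade as an inline style, so other rules in the existing stylesheets could still change
the final rendering.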

Sounds like AI to me...even if you can find a halfway automated system,
I doubt that the final result would be any more satisfactory than just
leaving the style attributes in the markup, I'm afraid...

--
Patrick H. Lauke
_____________________________________________________
re·dux (adj.): brought back; returned. used postpositively
[latin : re-, re- + dux, leader; see duke.]
www.splintered.co.uk | www.photographia.co.uk
http://redux.deviantart.com

******************************************************
The discussion list for  http://webstandardsgroup.org/

See http://webstandardsgroup.org/mail/guidelines.cfm
for some hints on posting to the list & getting help
******************************************************