On Dec 29, 2010, at 3:10 PM, Stef Mientki wrote:
>
> Now I encounter 2 problems:
> 1. the raw html contains tags, like:
>
> <p>toch iets over stromen, een plaatje</p><p><br /></p><br /><p>
>
> I "remove" them with the helper function TAG.
> Although it seems to work, I'm not convinced that this is the right method.
> Is this the correct way ?
> Are there better ways ?
>
> The second problem is the image locations:
>
> <p><img width="307" height="244" alt="" src="veiligheid_website_img1.jpg" /></p>
>
> I change the image location with, not a neat way, but it works.
> Any better ways ?
>
> Result = DIV ()
> Result.append ( H2 ( Book + ', ' + Chapter ) )
> Result.append ( H3 ( Paragraph ) )
> Text = Text.replace ( 'src="', 'src="/E_Veiligheid/static/images/' )
> Result.append ( TAG ( Text ) )

I'd try something like this; it might need a little tweaking. Call URL so that you'll get the benefit of any rewriting that you might choose to do someday.

An alternative to Text.replace would be to do a global regex substitution, where the replacement is a function that calls URL('static', 'images/%s' % matchobj.group(0)). See the re.sub section of http://docs.python.org/library/re.html for details.

But for your purposes, the code below, more or less, ought to do it.

    prefix = URL('static', 'images')
    Result = DIV(
        H2(Book + ', ' + Chapter),
        H3(Paragraph),
        XML(Text.replace('src="', 'src="%s/' % prefix))
    )
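
For the regex route, here is a rough sketch of what I have in mind. It is untested against your real data. Outside web2py there is no URL helper, so image_url below is a hypothetical stand-in that just builds the same path you hard-coded; inside a controller you would call URL('static', 'images/' + name) instead. The names SRC_RE and rewrite_image_sources are also made up for the example. I use a capture group and group(1) so that only the file name is handed to the URL builder, and I assume every src value is double-quoted and relative.

```python
import re

# Hypothetical stand-in for web2py's URL('static', 'images/' + filename).
# Inside a web2py controller, call URL itself rather than this function.
def image_url(filename, app="E_Veiligheid"):
    return "/%s/static/images/%s" % (app, filename)

# Matches src="..." and captures the file name between the quotes.
# Assumption: src values are double-quoted and relative.
SRC_RE = re.compile(r'src="([^"]+)"')

def rewrite_image_sources(html, make_url=image_url):
    """Rewrite each src="name" so that it points at make_url(name)."""
    def replace(matchobj):
        # group(1) is the captured file name, without the src=" wrapper
        return 'src="%s"' % make_url(matchobj.group(1))
    return SRC_RE.sub(replace, html)

text = '<p><img width="307" height="244" alt="" src="veiligheid_website_img1.jpg" /></p>'
result = rewrite_image_sources(text)
print(result)
```

In the view you would then wrap the rewritten string in XML(...) exactly as in the DIV example, so the markup is not escaped.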