I changed the syntax. Now it is more flexible:

>>> import re
>>> a = TAG('<h1>Header</h1><p>this is a     test</p>')
>>> def markdown(text, tag=None, attributes={}):
...     # tag is None for plain text; otherwise text is the tag's content
...     if tag is None: return re.sub(r'\s+', ' ', text)
...     elif tag == 'h1': return '#' + text + '\n\n'
...     elif tag == 'p': return text + '\n'
...     return text
...
>>> a.flatten(markdown)
'#Header\n\nthis is a test\n'
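
For anyone who wants to see the idea without web2py at hand, here is a minimal standalone sketch of a callback-driven flatten. It is not the web2py implementation: the names _Node, _TreeBuilder, parse and flatten are hypothetical, it is written for Python 3 on top of the standard library's html.parser, and it assumes every tag in the input is explicitly closed (no bare <br>).

```python
import re
from html.parser import HTMLParser


class _Node:
    """Hypothetical tree node: a tag name, its attributes, and its children."""
    def __init__(self, tag=None, attributes=None):
        self.tag = tag
        self.attributes = attributes or {}
        self.children = []  # each child is either a str or a _Node


class _TreeBuilder(HTMLParser):
    """Builds a _Node tree; assumes every opened tag is closed."""
    def __init__(self):
        super().__init__()
        self.root = _Node()
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = _Node(tag, dict(attrs))
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].children.append(data)


def parse(html):
    builder = _TreeBuilder()
    builder.feed(html)
    builder.close()  # flush any buffered text
    return builder.root


def flatten(node, render):
    """Call render(text) on plain text and render(text, tag, attributes) on tags."""
    parts = []
    for child in node.children:
        if isinstance(child, str):
            parts.append(render(child))
        else:
            parts.append(flatten(child, render))
    text = ''.join(parts)
    if node.tag is None:  # the anonymous root has no tag to render
        return text
    return render(text, node.tag, node.attributes)


def markdown(text, tag=None, attributes=None):
    if tag is None:
        return re.sub(r'\s+', ' ', text)
    elif tag == 'h1':
        return '#' + text + '\n\n'
    elif tag == 'p':
        return text + '\n'
    return text


result = flatten(parse('<h1>Header</h1><p>this is a     test</p>'), markdown)
print(repr(result))
```

The single callback covers both jobs: it is called with tag=None for text nodes and with the tag name for elements, innermost first.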

Moreover, elements() accepts jQuery-style selectors:

>>> a=TAG('<div><span><a id="1">hello</a></span><p class="this is a test">world</p></div>')
>>> for e in a.elements('div a#1, p.is'): print e.flatten()
hello
world

>>> a.elements('a[id=1]')[0].xml()
'<a id="1">hello</a>'
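
To make clear what these selectors pick out, here is a rough equivalent using only the standard library's xml.etree.ElementTree (Python 3). This is just an illustration of the matching rules, not how elements() is implemented, and it assumes the input is well-formed XML.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<div><span><a id="1">hello</a></span>'
    '<p class="this is a test">world</p></div>'
)

# Rough equivalent of a[id=1] (and of a#1): any <a> descendant whose id is "1".
links = root.findall(".//a[@id='1']")

# Rough equivalent of p.is: any <p> whose class list contains the token "is".
paragraphs = [p for p in root.iter('p') if 'is' in p.get('class', '').split()]

print(ET.tostring(links[0], encoding='unicode'))
print([p.text for p in paragraphs])
```

Note that a class selector matches one whitespace-separated token of the class attribute, which is why p.is finds the paragraph with class="this is a test".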

Please check it.

On May 25, 12:24 pm, Iceberg <iceb...@21cn.com> wrote:
> On May26, 12:35am, mdipierro <mdipie...@cs.depaul.edu> wrote:
>
>
>
> > I cannot push it until tonight but I have this:
>
> > >>> a=TAG('<h1>Header</h1><p>this is a     test</p>')
> > >>> print a
> > <h1>Header</h1><p>this is a test</p>
> > >>> a.flatten()
> > 'Headerthis is a     test'
> > >>> a.flatten(filter=lambda x: re.sub('\s+',' ',x))
> > 'Headerthis is a test'
> > >>> a.flatten(filter=lambda x: re.sub('\s+','-',x))
> > 'Headerthis-is-a-test'
> > >>> a.flatten(render=dict(h1=lambda x: '#'+x+'\n\n'),filter=lambda x: x.replace(' ','-'))
> > '#Header\n\nthis-is-a-test'
>
> > filter is applied to text and render is applied to tags.
> > so your
>
> >    result = web2pyHTMLParser(form.vars.input).tree
>
> > could be written as
>
> >    result = TAG(form.vars.input).flatten(filter=lambda x: re.sub('\s+',' ',x), render=dict(br=lambda x:'\n',p=lambda x: x+'\n'))
>
> > Can somebody propose better names for "filter" and "render"? I could
> > not come up with anything better.
>
> > Massimo
>
> Since render={...} does render HTML tags into another form, I think
> "render" is a good name.
>
> filter=lambda... is not as good, because Python already has a built-in
> filter() function that works differently. We should avoid the conflict
> and the confusion. How about we just use "replace"? I mean
> .flatten(replace=lambda x:x, render={...})
>
> Regards,
> Iceberg
