The entire code is 40 lines and uses Python's built-in HTML parser,
so maintaining it will not be a problem. Actually, we could even use
this to simplify both XML(...,sanitize) and gluon.contrib.markdown.WIKI.
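To illustrate how a small tree parser can sit on top of the built-in parser, here is a minimal sketch in Python 3. It is not the web2py code under discussion: Node, TreeBuilder, and parse are hypothetical names, and it assumes properly nested input with no void tags such as <br>.

```python
from html.parser import HTMLParser


class Node:
    """Hypothetical stand-in for a helper object: tag name, attributes, children."""

    def __init__(self, name, attrs=None):
        self.name = name
        self.attrs = dict(attrs or [])
        self.children = []  # child Nodes or plain strings

    def flatten(self):
        # Concatenate the text content, dropping all tags.
        return "".join(
            c.flatten() if isinstance(c, Node) else c for c in self.children
        )

    def element(self, name):
        # Depth-first search for the first descendant with the given tag name.
        for c in self.children:
            if isinstance(c, Node):
                if c.name == name:
                    return c
                found = c.element(name)
                if found is not None:
                    return found
        return None


class TreeBuilder(HTMLParser):
    """Builds a Node tree; assumes well-formed, properly nested input."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Only pop when the closing tag matches the innermost open tag.
        if len(self.stack) > 1 and self.stack[-1].name == tag:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].children.append(data)


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


tree = parse("<div>Hello<span>world</span></div>")
print(tree.flatten())
print(tree.element("span").flatten())
```

The stack of open nodes is the only state the builder needs. Anything beyond this (attribute-based search, serialization back to HTML, recovery from unbalanced tags) would have to be added on top.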
On May 25, 12:50 am, Thadeus Burgess thade...@thadeusb.com wrote:
So why our own?
Yet a better syntax and more API:
1) no more web2pyHTMLParser; use TAG(...) instead, and flatten() to
remove tags
a=TAG('<div>Hello<span>world</span></div>')
print a
<div>Hello<span>world</span></div>
print a.element('span')
<span>world</span>
print a.flatten()
Helloworld
2) search by multiple conditions,
Was going to say web2pyHTMLParser is too cumbersome - glad you
changed to TAG
I do some scraping with lxml so am also wary about including this, but
the examples look very convenient.
On May 26, 1:11 am, mdipierro mdipie...@cs.depaul.edu wrote:
Here is a one liner to remove all tags from a
It makes assumptions. It fails if Python HTMLParser fails. For
example:
from gluon.html import TAG
print TAG('c/bddd/aeee')
c/c/bddd/aeee
print TAG('c/bdddeee')
c/c/bdddeee/a
print TAG('b x=bbbc/bdddeee')
/a
print TAG('b bbbc
On Tue, May 25, 2010 at 12:11, mdipierro mdipie...@cs.depaul.edu wrote:
Here is a one liner to remove all tags from some html text:
html = '<div>hello<span>world</span></div>'
print TAG(html).flatten()
helloworld
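For readers without web2py at hand, tag stripping of this kind can be approximated with the standard library alone. The sketch below is written for Python 3 and is not web2py's flatten(); TextOnly and strip_tags are hypothetical names. It simply collects the text events the parser reports and ignores every tag.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects text nodes and ignores all start and end tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for each run of text between tags.
        self.parts.append(data)


def strip_tags(html):
    parser = TextOnly()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


print(strip_tags("<div>hello<span>world</span></div>"))  # helloworld
```

Note that this keeps the contents of script and style elements as text, which may or may not be what a given application wants.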
Very good!
--
Álvaro Justen - Turicas
http://blog.justen.eng.br/
21 9898-0141
There are docstrings. I will write something more ASAP.
On May 25, 10:28 pm, weheh richard_gor...@verizon.net wrote:
This is very nice. I think Thadeus' point is well made. I agree it's
useful. It is fringe, but I absolutely need this and will be using it
on my current project. Where's the
Hmm, I wonder if this is worth the possible maintenance cost? It also
transcends the role of a web framework and now you are getting into
network programming.
I have a currently deployed screen scraping app and found PyQuery to
be more than adequate. There is also lxml directly, or Beautiful
So why our own?
Because it converts it into web2py helpers.
And you don't have to deal with installing anything other than web2py.
--
Thadeus
On Tue, May 25, 2010 at 12:14 AM, Kevin Bowling kevin.bowl...@gmail.com wrote:
Hmm, I wonder if this is worth the possible maintenance cost? It