wow, this lib is great. thanks.
heh, love that feeling that i've been wasting my life coding scrapers
with regexes up until now... :)
patrick k. wrote:
> it's easy to write a customized sanitizer using beautifulsoup.
> http://www.crummy.com/software/BeautifulSoup/
>
> 1) place
well, but sometimes you want them to be able to enter HTML: style
items, simple links, etc...
[EMAIL PROTECTED] wrote:
> Yes it is much safer to reject rather than sanitize. If bad tags are
> detected then reject the input out of hand. If you don't your
> sanitizer could be turned against
Brett Parker wrote:
> On Fri, Jul 13, 2007 at 11:48:50AM +0100, Nic James Ferrier wrote:
>> Brett Parker <[EMAIL PROTECTED]> writes:
>>
>>> On Fri, Jul 13, 2007 at 11:18:18AM +0100, Nic James Ferrier wrote:
Derek Anderson <[EMAIL PROTECTED]> writes:
> hey all,
>
> could
On Fri, Jul 13, 2007 at 11:48:50AM +0100, Nic James Ferrier wrote:
>
> Brett Parker <[EMAIL PROTECTED]> writes:
>
> > On Fri, Jul 13, 2007 at 11:18:18AM +0100, Nic James Ferrier wrote:
> >>
> >> Derek Anderson <[EMAIL PROTECTED]> writes:
> >>
> >> > hey all,
> >> >
> >> > could anyone point
Brett Parker <[EMAIL PROTECTED]> writes:
> On Fri, Jul 13, 2007 at 11:18:18AM +0100, Nic James Ferrier wrote:
>>
>> Derek Anderson <[EMAIL PROTECTED]> writes:
>>
>> > hey all,
>> >
>> > could anyone point me to a python html sanitizer implementation (or
>> > example)? i don't mean to strip
On Fri, Jul 13, 2007 at 11:18:18AM +0100, Nic James Ferrier wrote:
>
> Derek Anderson <[EMAIL PROTECTED]> writes:
>
> > hey all,
> >
> > could anyone point me to a python html sanitizer implementation (or
> > example)? i don't mean to strip all html, just tags and attributes not
> > on a
Derek Anderson <[EMAIL PROTECTED]> writes:
> hey all,
>
> could anyone point me to a python html sanitizer implementation (or
> example)? i don't mean to strip all html, just tags and attributes not
> on a whitelist, such as I/B/A href/U/etc.
I use libxml2/libxslt, something like:
doc =
it's easy to write a customized sanitizer using beautifulsoup.
http://www.crummy.com/software/BeautifulSoup/
1) place beautifulsoup.py somewhere in your pythonpath
2) build your sanitizer and save it somewhere on your pythonpath
in my case it's called eatMe and looks like this:
hey all,
could anyone point me to a python html sanitizer implementation (or
example)? i don't mean to strip all html, just tags and attributes not
on a whitelist, such as I/B/A href/U/etc.
thanks,
derek
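
for illustration, here is a rough sketch of the kind of whitelist sanitizer being asked about. it is not code from anyone in this thread: it uses only the python 3 standard library (html.parser) rather than beautifulsoup or libxml2, and the names ALLOWED, DROP_CONTENT, SAFE_SCHEMES, WhitelistSanitizer and sanitize are made up for the example. it drops tags and attributes that are not on the whitelist, keeps their text, throws away script/style contents, and only keeps href values with an absolute http/https/mailto scheme (so relative links are dropped too). it does not rebalance unclosed tags, so treat it as a starting point, not a vetted security filter.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> set of attributes allowed on that tag.
ALLOWED = {"i": set(), "b": set(), "u": set(), "a": {"href"}}
# Tags whose text content is discarded along with the tag (assumption).
DROP_CONTENT = {"script", "style"}
# Only href values starting with one of these schemes are kept (assumption).
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class WhitelistSanitizer(HTMLParser):
    """Re-emit only whitelisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip = 0  # > 0 while inside a script/style element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self._skip += 1
            return
        if tag not in ALLOWED:
            return  # drop the tag itself, keep the text inside it
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # rejects javascript: and other unexpected schemes
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            if self._skip:
                self._skip -= 1
        elif tag in ALLOWED:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self._skip:
            self.out.append(escape(data, quote=False))


def sanitize(html):
    """Return html with everything outside the whitelist removed."""
    parser = WhitelistSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize('<b onclick="x()">hi</b>'))  # prints <b>hi</b>
```

comments are dropped because the parser's default comment handler does nothing. whether to strip bad markup like this or reject the whole input is a design choice; this sketch only shows the stripping side.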