Re: html sanitizers

2007-07-13 Thread Derek Anderson
wow, this lib is great. danke. heh, love that feeling that i've been wasting my life coding scrapers with regexes up until now... :)

patrick k. wrote:
> it's easy to write a customized sanitizer using beautifulsoup.
> http://www.crummy.com/software/BeautifulSoup/
>
> 1) place

Re: html sanitizers

2007-07-13 Thread Derek Anderson
well, but sometimes you want them to be able to enter HTML. style items, simple links, etc...

[EMAIL PROTECTED] wrote:
> Yes it is much safer to reject rather than sanitize. If bad tags are
> detected then reject the input out of hand. If you don't your
> sanitizer could be turned against
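To make the "reject rather than sanitize" position concrete, here is a minimal sketch that only detects markup outside a whitelist and leaves the input unmodified. It assumes Python's standard-library html.parser; the names ALLOWED, DisallowedFinder, find_disallowed and accept are hypothetical and do not come from anyone in this thread. It does not inspect href values, so a real check would also need to vet URL schemes.

```python
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> attribute names permitted on that tag.
ALLOWED = {"i": set(), "b": set(), "u": set(), "a": {"href"}}


class DisallowedFinder(HTMLParser):
    """Records every tag or attribute that is not on the whitelist."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            self.problems.append("<%s>" % tag)
            return
        for name, _value in attrs:
            if name not in ALLOWED[tag]:
                self.problems.append("<%s %s>" % (tag, name))


def find_disallowed(html_text):
    """Return a list describing each disallowed tag or attribute found."""
    finder = DisallowedFinder()
    finder.feed(html_text)
    finder.close()
    return finder.problems


def accept(html_text):
    """True only when the input contains nothing outside the whitelist."""
    return not find_disallowed(html_text)
```

With this shape, a form handler could refuse the submission and show the user the returned list, instead of silently rewriting what they typed.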

Re: html sanitizers

2007-07-13 Thread Horst Gutmann
Brett Parker wrote:
> On Fri, Jul 13, 2007 at 11:48:50AM +0100, Nic James Ferrier wrote:
>> Brett Parker <[EMAIL PROTECTED]> writes:
>>
>>> On Fri, Jul 13, 2007 at 11:18:18AM +0100, Nic James Ferrier wrote:
>>>> Derek Anderson <[EMAIL PROTECTED]> writes:
>>>>> hey all,
>>>>>
>>>>> could

Re: html sanitizers

2007-07-13 Thread Brett Parker
On Fri, Jul 13, 2007 at 11:48:50AM +0100, Nic James Ferrier wrote:
>
> Brett Parker <[EMAIL PROTECTED]> writes:
>
> > On Fri, Jul 13, 2007 at 11:18:18AM +0100, Nic James Ferrier wrote:
> >>
> >> Derek Anderson <[EMAIL PROTECTED]> writes:
> >>
> >> > hey all,
> >> >
> >> > could anyone point

Re: html sanitizers

2007-07-13 Thread Nic James Ferrier
Brett Parker <[EMAIL PROTECTED]> writes:

> On Fri, Jul 13, 2007 at 11:18:18AM +0100, Nic James Ferrier wrote:
>>
>> Derek Anderson <[EMAIL PROTECTED]> writes:
>>
>> > hey all,
>> >
>> > could anyone point me to a python html sanitizer implementation (or
>> > example)? i don't mean to strip

Re: html sanitizers

2007-07-13 Thread Brett Parker
On Fri, Jul 13, 2007 at 11:18:18AM +0100, Nic James Ferrier wrote:
>
> Derek Anderson <[EMAIL PROTECTED]> writes:
>
> > hey all,
> >
> > could anyone point me to a python html sanitizer implementation (or
> > example)? i don't mean to strip all html, just tags and attributes not
> > on a

Re: html sanitizers

2007-07-13 Thread Nic James Ferrier
Derek Anderson <[EMAIL PROTECTED]> writes:

> hey all,
>
> could anyone point me to a python html sanitizer implementation (or
> example)? i don't mean to strip all html, just tags and attributes not
> on a whitelist, such as I/B/A href/U/etc.

I use libxml2/libxslt, something like:

doc =

Re: html sanitizers

2007-07-13 Thread patrick k.
it's easy to write a customized sanitizer using beautifulsoup.
http://www.crummy.com/software/BeautifulSoup/

1) place beautifulsoup.py somewhere in your pythonpath
2) build your sanitizer and save it somewhere on your pythonpath

in my case it's called eatMe and looks like this:

html sanitizers

2007-07-13 Thread Derek Anderson
hey all,

could anyone point me to a python html sanitizer implementation (or example)? i don't mean to strip all html, just tags and attributes not on a whitelist, such as I/B/A href/U/etc.

danke,
derek
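For later readers of this thread: a minimal sketch of the whitelist approach asked about above, using only Python's standard-library html.parser instead of the BeautifulSoup or libxml2 code the replies mention (those snippets are cut off in this archive and are not reproduced here). The names ALLOWED, SAFE_SCHEMES, SKIP_CONTENT, WhitelistSanitizer and sanitize are hypothetical, and the whitelist is only a guess at what "I/B/A href/U" might look like in practice. Treat it as an illustration, not as vetted security code.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> attribute names permitted on that tag.
ALLOWED = {"i": set(), "b": set(), "u": set(), "a": {"href"}}
# Assumption: only absolute links with these prefixes are kept.
SAFE_SCHEMES = ("http://", "https://", "mailto:")
# Tags whose text content is dropped along with the tag itself.
SKIP_CONTENT = {"script", "style"}


class WhitelistSanitizer(HTMLParser):
    """Re-emits whitelisted tags/attributes and escaped text, drops the rest."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip = 0  # non-zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self._skip += 1
            return
        if tag not in ALLOWED:
            return  # drop the tag but keep its text content
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # drops javascript: URLs (and, in this sketch, relative links)
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            if self._skip:
                self._skip -= 1
        elif tag in ALLOWED:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self._skip:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    """Return html_text with everything outside the whitelist removed."""
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Note that this sketch does not balance unclosed tags; a tree-building parser such as BeautifulSoup or libxml2 would be a more natural fit if well-formed output matters.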