okay, sounds good. I'll commit the other cleanups then.

On Wed, Jul 8, 2009 at 12:28 PM, Louis Ryan <[email protected]> wrote:
> The switch to use htmlparser is something I've been planning to do for quite
> a while. We're currently waiting for Mike et al to fix some issues in their
> CSS DOM before I go ahead and make the switch, which has significant
> benefits for our sanitization and cajoling pipelines. I believe there is a
> CL out for review to fix this on Caja.
>
> On Wed, Jul 8, 2009 at 3:46 AM, Paul Lindner <[email protected]> wrote:
>
> > I filed https://issues.apache.org/jira/browse/SHINDIG-1107
> > Does anyone have any opinion about cleaning up those dependencies? We were
> > pulling in json-lib which seems unnecessary since we have a native json
> > serializer in place now.
> >
> > Another simplification is deprecating nekohtml for htmlparser, which is used
> > by caja. I asked the caja folks about using neko and this was their
> > response:
> >
> >   htmlparser was recommended by Ian Hickson, author of large chunks of
> >   the HTML5 spec as conforming closely to the spec. Nekohtml is indeed
> >   quite fast but htmlparser does a better job of more accurately
> >   producing the kind of DOM that you would get in an actual browser
> >   (which is what we're trying to codify) when parsing tag soup.
> >
> >   Mike Samuel looked at nekohtml more recently (primarily to see if we
> >   could benefit from faster parsing by neko) and improved our own
> >   parsing speed to a point where it is comparable to neko. I am not
> >   sure I fully follow the benefit of removing dependency on icu4j.

