Re: XNI performance: resetting the pipeline

2003-10-27 Thread Andy Clark
Arnaud Le Hors wrote: Andy, this last recommendation is at odds with your first statement. I agree we should look for performance gains in other places. But we should do that too, not instead. 20% is such a large number that we ought to address this first. I think I stated it poorly. What I meant

Re: XNI performance: resetting the pipeline

2003-10-27 Thread Arnaud Le Hors
Andy Clark wrote: > Interesting. I'm quite surprised that 20% of parsing > time (for small documents) is taken up by this. I guess > it's directly related to the number of components in > the parser configuration. ... > > Perhaps we need to look for performance gains in > other places... > Andy, th

Re: parsing 'html' documents using DOMParser

2003-10-27 Thread Andy Clark
Mushfiqur Rahman wrote: I want to parse an HTML document (may not be an XHTML document) using org.apache.xerces.parsers.DOMParser and get an org.w3c.dom.Document after parsing. Can anyone tell me how I can do it? If you just need a DOM document, there are a few options. Check out JTidy[1] and NekoHT

Re: XNI performance: resetting the pipeline

2003-10-27 Thread Andy Clark
Sandy Gao wrote: There seem to be 2 ways to pass features/properties to components: via reset() method and via setFeature/Property() methods. Currently most Xerces components rely on reset() to get feature/property values, which is the reason why reset is so slow. The reset() and setFeature/Propert
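
To illustrate the two styles described above, here is a minimal sketch, not the actual Xerces code. All class names (SketchManager, PullComponent, PushComponent, ResetSketch) and the "validation" feature id are hypothetical stand-ins, not the real XNI XMLComponent or XMLComponentManager interfaces. It only counts how often a component queries its manager when it pulls settings in reset() versus when settings are pushed once through setFeature().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a component manager; NOT the real XNI interface.
class SketchManager {
    private final Map<String, Boolean> features = new HashMap<>();
    int lookups = 0; // how many times components asked the manager for a value

    void setFeature(String id, boolean value) {
        features.put(id, value);
    }

    boolean getFeature(String id) {
        lookups++;
        return features.getOrDefault(id, false);
    }
}

// Style 1: the component pulls its settings from the manager on every reset().
class PullComponent {
    boolean validation;

    void reset(SketchManager manager) {
        validation = manager.getFeature("validation");
    }
}

// Style 2: settings are pushed through setFeature(); reset() only clears
// per-document state and never queries the manager.
class PushComponent {
    boolean validation;
    int depth;

    void setFeature(String id, boolean value) {
        if ("validation".equals(id)) {
            validation = value;
        }
    }

    void reset() {
        depth = 0;
    }
}

class ResetSketch {
    // Resets a pull-style component once per document; returns manager lookups.
    static int pullLookups(int documents) {
        SketchManager manager = new SketchManager();
        manager.setFeature("validation", true);
        PullComponent component = new PullComponent();
        for (int i = 0; i < documents; i++) {
            component.reset(manager);
        }
        return manager.lookups;
    }

    // Pushes the setting once, then resets per document; returns manager lookups.
    static int pushLookups(int documents) {
        SketchManager manager = new SketchManager();
        manager.setFeature("validation", true);
        PushComponent component = new PushComponent();
        component.setFeature("validation", true);
        for (int i = 0; i < documents; i++) {
            component.reset();
        }
        return manager.lookups;
    }

    public static void main(String[] args) {
        System.out.println("pull lookups: " + pullLookups(1000));
        System.out.println("push lookups: " + pushLookups(1000));
    }
}
```

In this toy model the pull style performs one lookup per feature per document, while the push style performs none after configuration. Whether that accounts for the cost measured in the thread is not something the sketch can show.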

Re: XNI performance: resetting the pipeline

2003-10-27 Thread Sandy Gao
> > Just in case you are wondering what the other solutions are: > > 1) each component could implement setFeature, setProperty and get all > > the properties and features. However, during > > XMLComponent.reset() no features and properties will be queried. > > I am not sure if XNI components we

DO NOT REPLY [Bug 24113] - Hardcoded "Created-By" in manifest

2003-10-27 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT . ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://nagoya.apache.org/bugzilla/show_bu

parsing 'html' documents using DOMParser

2003-10-27 Thread Mushfiqur Rahman
Hello sir, I want to parse an HTML document (may not be an XHTML document) using org.apache.xerces.parsers.DOMParser and get an org.w3c.dom.Document after parsing. Can anyone tell me how I can do it? Note: The HTML page may have tags like: "", with no ending tag (that means it may be nonconforming XML
