Created : https://issues.apache.org/bugzilla/show_bug.cgi?id=55632
On Thu, Sep 26, 2013 at 11:05 PM, Philippe Mouawad <[email protected]> wrote:

>
> On Thu, Sep 26, 2013 at 10:58 PM, sebb <[email protected]> wrote:
>
>> On 26 September 2013 21:48, Philippe Mouawad <[email protected]> wrote:
>> > Hello,
>> > I really think this setting should be changed as
>> > HtmlParserHTMLParser is really catastrophic in terms of performance and
>> > memory use.
>> >
>> > Or at least a note should be added, but my preference goes to switching to
>> > REGEXP which seems to be doing the job.
>>
>> I don't think we should change the default; it may well break test
>> plans as commenting out sections is a common practise.
>>
>
> Why not change the default and document that users can set the old parser
> to what it was ?
> Take a new comer, he won't read all documentation once, in my opinion,
> defaults should be the best options for performances.
>
> If users have issues with Regexp, we will have bugzillas and will fix
> them, they can provide the page for which parsing failed , as we already
> had a report on this, it's easy.
>
> While if we keep it like this, you will have users face OOM on high load
> tests because of this, and I am not sure they will report or if they do it
> could be much harder to find out it was due to this.
> And we will always have this "urban legend" about JMeter having OOM, which
> frankly is starting to upset me :-)
>
>
>> However by all means add a note to jmeter.properties and
>> component_reference
>>
>> > Regards
>> > Philippe
>> >
>> >
>> > On Sun, Mar 3, 2013 at 9:06 PM, Philippe Mouawad <[email protected]> wrote:
>> >
>> >> Hello,
>> >>
>> >> I made recently a Real world test which downloaded resources.
>> >> As the site started to slow down, I ended up having an OOM.
>> >>
>> >> Analyzing Heap Dump, I noticed one JMeterThread held around 3 mo which
>> >> majority was taken by DOM build by htmlparser.
>> >>
>> >> So I think Regexp is far more efficient on memory usage. But if you say it
>> >> is a quick and dirty alternative then it's another point.
>> >>
>> >> I wonder if it would not be interesting to explore using JSOUP in a new
>> >> implementation.
>> >>
>> >> Regards
>> >> Philippe
>> >>
>> >>
>> >> On Sun, Mar 3, 2013 at 3:42 PM, sebb <[email protected]> wrote:
>> >>
>> >>> On 2 March 2013 19:42, Philippe Mouawad <[email protected]> wrote:
>> >>> > Hello,
>> >>> > I was wondering if there is any reason for htmlParser.className default
>> >>> > value being org.apache.jmeter.protocol.http.parser.HtmlParserHTMLParser and
>> >>> > not org.apache.jmeter.protocol.http.parser.RegexpHTMLParser
>> >>> >
>> >>> > It seems to me the latter is much more efficient than the current default
>> >>> > value.
>> >>>
>> >>> I think one would need to benchmark that to see how much faster it is.
>> >>>
>> >>> > Any objection on changing to
>> >>> > org.apache.jmeter.protocol.http.parser.RegexpHTMLParser
>> >>>
>> >>> The Regex version does not take account of context, so will find
>> >>> references in comment sections.
>> >>>
>> >>> It was intended as a quick and dirty alternative.
>> >>>
>> >>> > --
>> >>> > Regards.
>> >>> > Philippe
>> >>>
>> >>
>> >>
>> >> --
>> >> Cordialement.
>> >> Philippe Mouawad.
>> >>
>> >
>> >
>> > --
>> > Cordialement.
>> > Philippe Mouawad.
>> >
>
>
> --
> Cordialement.
> Philippe Mouawad.
>
>

--
Cordialement.
Philippe Mouawad.
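
For anyone reading the thread later, sebb's objection (a regex-based extractor "does not take account of context, so will find references in comment sections") can be illustrated with a small standalone sketch. This is not JMeter's RegexpHTMLParser: the class name, method name and pattern below are hypothetical and simplified, and only show the general behaviour of a context-free regex scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; NOT the pattern or code used by JMeter.
public class CommentBlindExtractor {

    // Simplified pattern: any <img ... src="..."> occurrence, wherever it appears.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img[^>]*\\ssrc\\s*=\\s*\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    // Returns every src value the pattern matches, in document order.
    // The scan has no notion of HTML comments, so commented-out tags match too.
    static List<String> extractImageUrls(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        String page = "<html><body>"
                + "<img src=\"live.png\">"
                + "<!-- <img src=\"disabled.png\"> -->"
                + "</body></html>";
        // Both URLs are reported, including the one inside the comment.
        System.out.println(extractImageUrls(page));
    }
}
```

A parser that builds a DOM would normally skip the commented-out tag, which is why switching the default could change which embedded resources an existing test plan downloads.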
