Hello, I really think this setting should be changed as HtmlParserHTMLParser is really catastrophic in terms of performance and memory use.
Or at least a note should be added, but my preference goes to switching to REGEXP, which seems to be doing the job.

Regards
Philippe

On Sun, Mar 3, 2013 at 9:06 PM, Philippe Mouawad <[email protected]> wrote:

> Hello,
>
> I recently ran a real-world test which downloaded resources.
> As the site started to slow down, I ended up with an OOM.
>
> Analyzing the heap dump, I noticed one JMeterThread held around 3 MB, the
> majority of which was taken by the DOM built by htmlparser.
>
> So I think Regexp is far more efficient in memory usage. But if you say it
> is a quick and dirty alternative, then that's another point.
>
> I wonder if it would not be interesting to explore using JSOUP in a new
> implementation.
>
> Regards
> Philippe
>
>
> On Sun, Mar 3, 2013 at 3:42 PM, sebb <[email protected]> wrote:
>
>> On 2 March 2013 19:42, Philippe Mouawad <[email protected]>
>> wrote:
>> > Hello,
>> > I was wondering if there is any reason for the htmlParser.className
>> > default value being
>> > org.apache.jmeter.protocol.http.parser.HtmlParserHTMLParser and not
>> > org.apache.jmeter.protocol.http.parser.RegexpHTMLParser
>> >
>> > It seems to me the latter is much more efficient than the current
>> > default value.
>>
>> I think one would need to benchmark that to see how much faster it is.
>>
>> > Any objection to changing to
>> > org.apache.jmeter.protocol.http.parser.RegexpHTMLParser
>>
>> The Regex version does not take account of context, so will find
>> references in comment sections.
>>
>> It was intended as a quick and dirty alternative.
>>
>> > --
>> > Regards.
>> > Philippe
>>
>
>
> --
> Regards.
> Philippe Mouawad.
>

--
Regards.
Philippe Mouawad.
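
P.S. To make sebb's point about context concrete, here is a minimal sketch of how a regex-only extractor also picks up references inside HTML comments. It is only an illustration: the class name RegexCommentDemo, the method extractImageUrls and the simplified pattern are hypothetical and are not taken from RegexpHTMLParser, whose actual expressions may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexCommentDemo {
    // Hypothetical, simplified pattern: captures the src value of <img> tags.
    // It looks at the raw text only and has no notion of HTML comments.
    private static final Pattern IMG_SRC =
        Pattern.compile("<img[^>]*\\ssrc=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns every src value the pattern matches, in document order.
    public static List<String> extractImageUrls(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<html><body>"
            + "<img src=\"live.png\">"
            + "<!-- <img src=\"commented-out.png\"> -->"
            + "</body></html>";
        // The commented-out image is reported too, because the regex
        // never checks whether a match sits inside <!-- ... -->.
        System.out.println(extractImageUrls(html));
    }
}
```

A DOM-based parser would skip the second image because it sits in a comment node, which is the trade-off against the lower memory use of the regex approach.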
