Re: htmlParser.className default value

Philippe Mouawad Thu, 26 Sep 2013 13:50:10 -0700

Hello,
I really think this setting should be changed as
HtmlParserHTMLParser is really catastrophic in terms of performance and
memory use.


Or at least a note should be added, but my preference is to switch to
RegexpHTMLParser, which seems to do the job.
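
To make the trade-off concrete, here is a minimal sketch of the caveat raised
in the quoted thread: a regex-based extractor has no notion of context, so it
also reports references that sit inside HTML comments. This is not the
RegexpHTMLParser code; the class name, method name and pattern are
hypothetical and only illustrate the behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexLinkSketch {
    // Hypothetical, simplified pattern: matches src="..." anywhere in the text,
    // with no awareness of tags, comments or scripts.
    private static final Pattern SRC =
        Pattern.compile("src\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns every src value found, in document order.
    public static List<String> extractSrc(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = SRC.matcher(html);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<html><body>\n"
            + "<img src=\"logo.png\">\n"
            + "<!-- <img src=\"old-banner.png\"> -->\n"
            + "</body></html>";
        // The commented-out image is reported as well.
        System.out.println(extractSrc(html));
    }
}
```

Run as is, this prints [logo.png, old-banner.png]: the second entry comes
from the comment, which a DOM-based parser would skip. In exchange, nothing
but the list of matches is kept in memory.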

Regards
Philippe


On Sun, Mar 3, 2013 at 9:06 PM, Philippe Mouawad <[email protected]> wrote:

> Hello,
>
> I recently ran a real-world test that downloaded resources.
> As the site started to slow down, I ended up with an OOM.
>
> Analyzing the heap dump, I noticed that one JMeterThread held around 3 MB,
> most of which was taken by the DOM built by htmlparser.
>
> So I think Regexp is far more efficient in memory usage. But if you say it
> is a quick and dirty alternative, then that is another matter.
>
> I wonder if it would not be interesting to explore using JSOUP in a new
> implementation.
>
> Regards
> Philippe
>
>
> On Sun, Mar 3, 2013 at 3:42 PM, sebb <[email protected]> wrote:
>
>> On 2 March 2013 19:42, Philippe Mouawad <[email protected]>
>> wrote:
>> > Hello,
>> > I was wondering if there is any reason for htmlParser.className default
>> > value being org.apache.jmeter.protocol.http.parser.HtmlParserHTMLParser
>> > and not org.apache.jmeter.protocol.http.parser.RegexpHTMLParser
>> >
>> > It seems to me the latter is much more efficient than the current
>> > default value.
>>
>> I think one would need to benchmark that to see how much faster it is.
>>
>> > Any objection on changing to
>> > org.apache.jmeter.protocol.http.parser.RegexpHTMLParser
>>
>> The Regex version does not take account of context, so will find
>> references in comment sections.
>>
>> It was intended as a quick and dirty alternative.
>>
>> > --
>> > Regards.
>> > Philippe
>>
>
>
>
> --
> Cordialement.
> Philippe Mouawad.
>
>
>


-- 
Cordialement.
Philippe Mouawad.
