Hi Olivier,
The HTML parser built into MCF is quite resilient against badly formed
HTML, but there are limits. Characters like "<" and ">" are used to denote
tags and thus they confuse the parser when they are present in unescaped
form. It may be possible, with a fair bit of work, to handle
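
To illustrate the kind of confusion described above, here is a minimal Python sketch. It is not MCF's parser: `naive_links` and the two sample pages are hypothetical, written only to show how a tokenizer that treats every "<" as the start of a tag can lose a link once a stray "<" appears inside inline JavaScript. The assumption is a tokenizer that reads from "<" to the next ">" while skipping over quoted attribute values.

```python
import re


def naive_links(html_text):
    """Collect href values using a deliberately naive tokenizer.

    Every '<' is assumed to open a tag, and the tag runs until the next '>'
    that is not inside a quoted attribute value. This is an illustration
    only, not the algorithm used by any real crawler.
    """
    links = []
    i, n = 0, len(html_text)
    while i < n:
        if html_text[i] != "<":
            i += 1
            continue
        # Scan forward to the closing '>' while honouring quotes.
        j = i + 1
        quote = None
        while j < n:
            ch = html_text[j]
            if quote:
                if ch == quote:
                    quote = None
            elif ch in "\"'":
                quote = ch
            elif ch == ">":
                break
            j += 1
        tag = html_text[i + 1:j]
        # Only tags that start with "a " and carry an href count as links.
        m = re.match(r"a\s[^>]*?href\s*=\s*[\"']([^\"']*)[\"']", tag, re.I)
        if m:
            links.append(m.group(1))
        i = j + 1
    return links


# Hypothetical page whose inline script contains no '<'.
GOOD_PAGE = (
    '<a href="/one">one</a>'
    "<script>var n = 1;</script>"
    '<a href="/two">two</a>'
)

# Same page, but the script has an unescaped '<' followed by an apostrophe.
BAD_PAGE = (
    '<a href="/one">one</a>'
    "<script>if (n < 1) { /* don't run */ }</script>"
    '<a href="/two">two</a>'
)

print(naive_links(GOOD_PAGE))  # ['/one', '/two']
print(naive_links(BAD_PAGE))   # ['/one']
```

In the second page the "<" in `n < 1` is taken as a tag start, the apostrophe then opens a "quoted value" that never closes, and the rest of the document, including the second link, is swallowed. The script is never executed in either case; the damage comes purely from tokenizing its text.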
Hi Karl,
Thanks for your answer.
Could you elaborate, please? Just so I understand correctly: do you mean that
there is no way for the MCF code to escape the special characters in this
case, i.e. the website itself needs to escape the special characters,
otherwise the extraction will
Hi Olivier,
You can create a ticket but I don't have a good solution for you in any
case.
Karl
On Thu, Nov 15, 2018 at 6:53 AM Olivier Tavard <
olivier.tav...@francelabs.com> wrote:
> Hi Karl,
>
> Do you think that I need to create a Jira issue for this bug, i.e.
> that the links
Hi Karl,
Thanks for your answer.
I kept looking into this and found the problem. The JavaScript code
inside the tags contained the character '<'. When it does, link
extraction does not work with the web connector.
To reproduce it, I created this page hosted in local Apache
Hi Olivier,
JavaScript inclusion in the Web Connector is not evaluated. In fact, no
JavaScript is executed at all. Therefore it should not matter what is
included via JavaScript.
Thanks,
Karl
On Mon, Oct 29, 2018 at 1:39 PM Olivier Tavard <
olivier.tav...@francelabs.com> wrote:
> Hi,
>
>
Hi,
Regarding the web connector, I noticed that on certain websites, some
JavaScript code can prevent the web connector from correctly fetching all the
links present on the page. Specifically, for websites that contain a deprecated
version of the New Relic web agent as