[ http://jira.codehaus.org/browse/MPLINKCHECK-23?page=comments#action_45504 ]
Carlos Sanchez commented on MPLINKCHECK-23:
-------------------------------------------

Looks good. I'll take a closer look when I have more time.

> Improve linkcheck performance (2x+) getting rid of jtidy dependency via regexps
> -------------------------------------------------------------------------------
>
>          Key: MPLINKCHECK-23
>          URL: http://jira.codehaus.org/browse/MPLINKCHECK-23
>      Project: maven-linkcheck-plugin
>         Type: Improvement
>     Versions: 1.3.4
>     Reporter: Ignacio G. Mac Dowell
>     Assignee: Carlos Sanchez
>  Attachments: linkcheck.patch
>
>
> At the moment, the linkcheck plugin uses jtidy and xpath for retrieving all
> links. IMHO regexps would work much faster/better than the jtidy-xpath
> combination.
> The following regexp would be a replacement for the xpath expressions:
> <(?>link|a|img|script)[^>]*?(?>href|src)\s*?=\s*?[\"'](.*?)[\"'][^>]*?
> All tests pass with this regexp, and in project ws-jaxme I am getting these
> results for maven-linkcheck-plugin:clearcache
> maven-linkcheck-plugin:report-real:
> with jtidy/xpath: Total time: 2 minutes 43 seconds
> with regexps: Total time: 1 minutes 10 seconds
> I am sure some regexp guru can improve the performance of this.
> I have a question, though. Are mailto links supposed to count as checkable?
> IMO no.
> PS: Also, IMO the createDocument method from LinkCheck should be in a
> try/finally block.
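
For readers who want to see the regexp from the issue in action, here is a minimal Java sketch that applies it to an HTML string and collects the captured href/src values. It is not taken from linkcheck.patch: the class name LinkExtractor, the method extractLinks, the skipMailto flag and the CASE_INSENSITIVE option are all assumptions made for illustration. Skipping mailto links only mirrors the reporter's opinion; the issue leaves that question open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of maven-linkcheck-plugin.
final class LinkExtractor {

    // The regexp quoted in the issue, written as a Java string literal.
    // CASE_INSENSITIVE is an assumption added here so that <A HREF=...> also matches.
    private static final Pattern LINK = Pattern.compile(
            "<(?>link|a|img|script)[^>]*?(?>href|src)\\s*?=\\s*?[\"'](.*?)[\"'][^>]*?",
            Pattern.CASE_INSENSITIVE);

    // Returns every href/src value found in html, in document order.
    // When skipMailto is true, values starting with "mailto:" are dropped.
    static List<String> extractLinks(String html, boolean skipMailto) {
        List<String> links = new ArrayList<String>();
        Matcher m = LINK.matcher(html);
        while (m.find()) {
            String link = m.group(1); // group 1 is the quoted attribute value
            boolean isMailto = link.regionMatches(true, 0, "mailto:", 0, 7);
            if (!(skipMailto && isMailto)) {
                links.add(link);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"index.html\">Home</a> <img src='logo.png'> "
                + "<p>no link</p> <a href=\"mailto:dev@example.org\">mail</a>";
        System.out.println(extractLinks(html, true)); // prints [index.html, logo.png]
    }
}
```

Note that the pattern accepts either quote character on both sides of the value, so an attribute value that itself contains the other kind of quote would be cut short. That is one of the cases an HTML parser such as jtidy handles and a regexp of this shape does not.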