[ https://issues.apache.org/jira/browse/NUTCH-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Markus Jelsma updated NUTCH-2478:
---------------------------------
    Description:
This test fails:
{code}
  @Test
  public void testBadResolver() throws Exception {
    URL base = new URL("//www.example.org/");
    String target = "index/produkt/kanaly/";

    URL abs = URLUtil.resolveURL(base, target);
    Assert.assertEquals("http://www.example.org/index/produkt/kanaly/", abs.toString());
  }
{code}

It has to fail, because //www.example.org/ is rejected as an invalid base URL, so the current URL is used as the base instead. If the path of the current URL is not /, that path is prepended to the relative target, and the crawler ends up fetching a URL that returns 404.

This ticket must allow // as base, and resolve the protocol.

> // is not a valid base URL
> --------------------------
>
>                 Key: NUTCH-2478
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2478
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.13
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>             Fix For: 1.14
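
As a rough illustration of the behaviour the issue asks for, the sketch below resolves a protocol-relative base against the current page URL first, so that the base inherits the page's protocol, and only then resolves the relative target. It uses plain java.net.URI rather than Nutch's URLUtil.resolveURL, and the class name, the helper `resolve`, and the page URL `http://www.example.org/some/page.html` are assumptions made for the example, not part of the ticket or of any actual fix.

```java
import java.net.URI;

// Illustrative sketch only: uses java.net.URI, not Nutch's URLUtil.
final class ProtocolRelativeBaseSketch {

  // Hypothetical helper: resolves `target` against `baseHref`, where
  // `baseHref` may be protocol-relative and borrows the scheme of `pageUrl`.
  static String resolve(String pageUrl, String baseHref, String target) {
    URI page = URI.create(pageUrl);
    // Resolving "//host/path" against the page URL keeps the page's scheme
    // and takes authority and path from the base href.
    URI base = page.resolve(baseHref);
    return base.resolve(target).toString();
  }

  public static void main(String[] args) {
    String page = "http://www.example.org/some/page.html";
    String target = "index/produkt/kanaly/";

    // Base honoured: http://www.example.org/index/produkt/kanaly/
    System.out.println(resolve(page, "//www.example.org/", target));

    // Base ignored, page URL used instead:
    // http://www.example.org/some/index/produkt/kanaly/
    System.out.println(URI.create(page).resolve(target));
  }
}
```

The second call in `main` shows the failure mode described in the issue: when the base is dropped and the page URL is used directly, the page's path (`/some/`) ends up in front of the target.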