[
https://issues.apache.org/jira/browse/NUTCH-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288403#comment-16288403
]
Sebastian Nagel commented on NUTCH-2478:
??resolve the missing protocol using the current page's
[
https://issues.apache.org/jira/browse/NUTCH-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288300#comment-16288300
]
Markus Jelsma commented on NUTCH-2478:
--
To clarify a bad sentence, I resolve the missing protocol
[
https://issues.apache.org/jira/browse/NUTCH-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288289#comment-16288289
]
Markus Jelsma commented on NUTCH-2478:
--
Yes, this needs a change in the parser plugins. I sought to
[
https://issues.apache.org/jira/browse/NUTCH-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288280#comment-16288280
]
Sebastian Nagel commented on NUTCH-2478:
Confirmed:
{noformat}
$ cat
[
https://issues.apache.org/jira/browse/NUTCH-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287823#comment-16287823
]
Jurian Broertjes commented on NUTCH-2477:
-
Feedback is welcome
> Refactor *Checker classes to use
[
https://issues.apache.org/jira/browse/NUTCH-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jurian Broertjes updated NUTCH-2477:
External issue URL: https://github.com/apache/nutch/pull/256
> Refactor *Checker classes to
[
https://issues.apache.org/jira/browse/NUTCH-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2478:
-
Description:
This test fails:
{code}
@Test
public void testBadResolver() throws Exception {
Jurian Broertjes created NUTCH-2477:
---
Summary: Refactor *Checker classes to use base class for common
code
Key: NUTCH-2477
URL: https://issues.apache.org/jira/browse/NUTCH-2477
Project: Nutch
Markus Jelsma created NUTCH-2478:
Summary: // is not a valid base URL
Key: NUTCH-2478
URL: https://issues.apache.org/jira/browse/NUTCH-2478
Project: Nutch
Issue Type: Bug
Affects
Happy to hear. There are major improvements in Tika 1.17, it deals much better
with some of the more extravagant web pages you find on the web.
-Original message-
> From: Sebastian Nagel
> Sent: Tuesday 12th December 2017 13:36
> To: dev@nutch.apache.org
>
Hi Julien,
yes, I know there's an open issue by Markus which depends on Tika 1.17.
If the Tika release happens this week, I'll make sure that it's included.
Thanks,
Sebastian
On 12/11/2017 10:22 AM, Julien Nioche wrote:
> Tika 1.17 will be released shortly, maybe it would be worth waiting a bit
[
https://issues.apache.org/jira/browse/NUTCH-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287338#comment-16287338
]
ASF GitHub Bot commented on NUTCH-2473:
---
mfeltscher commented on issue #253: fix for NUTCH-2473