[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2936:
---
Priority: Blocker (was: Major)
> Early registration of URL stream handlers provided by
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476615#comment-17476615
]
Sebastian Nagel commented on NUTCH-2936:
Using protocol-okhttp causes parsechecker to raise the
[
https://issues.apache.org/jira/browse/NUTCH-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476606#comment-17476606
]
ASF GitHub Bot commented on NUTCH-2573:
---
sebastian-nagel opened a new pull request #724:
URL: https://github.com/apache/nutch/pull/724
- add properties
- `http.robots.503.defer.visits` :
enable/disable the feature (default: enabled)
- `http.robots.503.defer.visits.delay` :
delay to wait before the next
Sebastian Nagel created NUTCH-2937:
--
Summary: parse-tika: review dependency exclusions and avoid
dependency conflicts in distributed mode
Key: NUTCH-2937
URL: https://issues.apache.org/jira/browse/NUTCH-2937
[
https://issues.apache.org/jira/browse/NUTCH-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476614#comment-17476614
]
ASF GitHub Bot commented on NUTCH-2919:
---
sebastian-nagel commented on pull request #717:
URL: https://github.com/apache/nutch/pull/717#issuecomment-1013689493
Ok, the dependency conflict with commons-io is tracked in NUTCH-2937.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log
[
https://issues.apache.org/jira/browse/NUTCH-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2919:
Summary: NUTCH-2919 Upgrade to Tika 2.2.1 and Any23 2.6 (was: NUTCH-2919
Upgrade
[
https://issues.apache.org/jira/browse/NUTCH-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476708#comment-17476708
]
ASF GitHub Bot commented on NUTCH-2919:
---
lewismc merged pull request #717:
URL:
lewismc commented on pull request #717:
URL: https://github.com/apache/nutch/pull/717#issuecomment-1013770346
Thanks @sebastian-nagel
lewismc merged pull request #717:
URL: https://github.com/apache/nutch/pull/717
[
https://issues.apache.org/jira/browse/NUTCH-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476709#comment-17476709
]
ASF GitHub Bot commented on NUTCH-2919:
---
lewismc commented on pull request #717:
URL:
[
https://issues.apache.org/jira/browse/NUTCH-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2919.
-
Resolution: Fixed
> NUTCH-2919 Upgrade to Tika 2.2.1 and Any23 2.6
>
[
https://issues.apache.org/jira/browse/NUTCH-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476712#comment-17476712
]
ASF GitHub Bot commented on NUTCH-2573:
---
lewismc commented on a change in pull request #724:
URL: https://github.com/apache/nutch/pull/724#discussion_r785373191
##
File path: src/java/org/apache/nutch/fetcher/FetchItemQueues.java
##
@@ -195,11 +195,15 @@ public synchronized FetchItem getFetchItem() {
return
Lewis John McGibbney created NUTCH-2938:
---
Summary: Use Any23's RepositoryWriter to write structured data to
Rdf4j repository
Key: NUTCH-2938
URL: https://issues.apache.org/jira/browse/NUTCH-2938
[
https://issues.apache.org/jira/browse/NUTCH-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476720#comment-17476720
]
Hudson commented on NUTCH-2919:
---
SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #69 (See
lewismc opened a new pull request #725:
URL: https://github.com/apache/nutch/pull/725
PR addresses https://issues.apache.org/jira/browse/NUTCH-2938
We could improve the performance of this plugin if we could reuse the
repository connection; however, I am not entirely sure how to do that
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2936:
---
Assignee: Lewis John McGibbney
> Early registration of URL stream handlers
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-2936 started by Lewis John McGibbney.
---
> Early registration of URL stream handlers provided by plugins
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476719#comment-17476719
]
Lewis John McGibbney commented on NUTCH-2936:
-
I'll try to reproduce. Thanks
> Early
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476730#comment-17476730
]
Lewis John McGibbney commented on NUTCH-2936:
-
I can reproduce this. Although I was planning
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476734#comment-17476734
]
ASF GitHub Bot commented on NUTCH-2936:
---
lewismc opened a new pull request #726:
URL:
[
https://issues.apache.org/jira/browse/NUTCH-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476724#comment-17476724
]
ASF GitHub Bot commented on NUTCH-2938:
---
lewismc opened a new pull request #725:
URL:
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476728#comment-17476728
]
Lewis John McGibbney commented on NUTCH-2936:
-
[~snagel] which JDK are you using?
> Early
lewismc opened a new pull request #726:
URL: https://github.com/apache/nutch/pull/726
I ended up producing this PR as a result of investigating NUTCH-2936. This
PR does not fix NUTCH-2936.
The problem is that the