[
https://issues.apache.org/jira/browse/NUTCH-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554866#comment-17554866
]
Lewis John McGibbney commented on NUTCH-2940:
-
WIP PR available at
[
https://issues.apache.org/jira/browse/NUTCH-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554865#comment-17554865
]
ASF GitHub Bot commented on NUTCH-2490:
---
lewismc commented on PR #735:
URL: https://github.com/apache/nutch/pull/735#issuecomment-1157159803
I'll squash and merge the commits into one when we are ready to merge into
the `master` branch.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log
lewismc opened a new pull request, #735:
URL: https://github.com/apache/nutch/pull/735
This is a WIP for https://issues.apache.org/jira/browse/NUTCH-2940. The work
was conducted by @AzureTriple @imanzanganeh @jbsimmon @LilyPerr and
@Lirongxuan1 from the 2022 USC Senior CS Capstone Program.
[
https://issues.apache.org/jira/browse/NUTCH-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2940:
---
Assignee: Lewis John McGibbney
> Develop Gradle Core Build for Apache Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554859#comment-17554859
]
ASF GitHub Bot commented on NUTCH-2936:
---
lewismc commented on PR #733:
URL: https://github.com/apache/nutch/pull/733#issuecomment-1157149313
This is exciting!!! Excellent debugging ... you got further than me.
I can't get around to testing it until next week at the earliest.
Thinking back, I did observe revisits (recursive
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554702#comment-17554702
]
Sebastian Nagel commented on NUTCH-2936:
Update: the issue is also reproducible in local mode
[
https://issues.apache.org/jira/browse/NUTCH-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554701#comment-17554701
]
ASF GitHub Bot commented on NUTCH-2952:
---
sebastian-nagel commented on PR #734:
URL: https://github.com/apache/nutch/pull/734#issuecomment-1156706013
Update: the failing unit test (TestCrawlDbDeduplication) on my development
system stems from a modified nutch-site.xml requesting protocol-okhttp -
obviously, it's the combination of
[
https://issues.apache.org/jira/browse/NUTCH-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554663#comment-17554663
]
ASF GitHub Bot commented on NUTCH-2952:
---
sebastian-nagel opened a new pull request, #734:
URL: https://github.com/apache/nutch/pull/734
Upgrade of core dependencies
- Hadoop 3.1.3 -> 3.3.3
- log4j 2.17.0 -> 2.17.2
- and some more
Note: I've observed that some unit tests are failing with the same or similar
errors as
[
https://issues.apache.org/jira/browse/NUTCH-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2952:
--
Assignee: Sebastian Nagel
> Upgrade core dependencies (Hadoop 3.3.3, log4j 2.17.2)
>
Sebastian Nagel created NUTCH-2952:
--
Summary: Upgrade core dependencies (Hadoop 3.3.3, log4j 2.17.2)
Key: NUTCH-2952
URL: https://issues.apache.org/jira/browse/NUTCH-2952
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554564#comment-17554564
]
Sebastian Nagel commented on NUTCH-2949:
This is addressed in
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554563#comment-17554563
]
ASF GitHub Bot commented on NUTCH-2936:
---
sebastian-nagel opened a new pull request, #733:
URL: https://github.com/apache/nutch/pull/733
Fixes to address the failure of Nutch jobs in (pseudo-)distributed mode.
Implements:
- caching of URLStreamHandlers per protocol so that handlers are not
created anew
- enforce
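For readers unfamiliar with the idea in the first bullet, caching one URLStreamHandler per protocol might look roughly like the sketch below. This is not the code from PR #733: the class name `CachingHandlerFactory` and the `loader` function are hypothetical and stand in for whatever actually resolves a handler for a protocol.

```java
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: hands out one cached URLStreamHandler per protocol. */
class CachingHandlerFactory implements URLStreamHandlerFactory {

  // protocol name (e.g. "http") -> handler created for it on first use
  private final Map<String, URLStreamHandler> cache = new ConcurrentHashMap<>();

  // Assumption: some function that knows how to build a handler for a
  // protocol and returns null for protocols it does not support.
  private final Function<String, URLStreamHandler> loader;

  CachingHandlerFactory(Function<String, URLStreamHandler> loader) {
    this.loader = loader;
  }

  @Override
  public URLStreamHandler createURLStreamHandler(String protocol) {
    // computeIfAbsent invokes the loader only when no handler is cached yet.
    // A null result is not stored and is returned as-is, which lets the JVM
    // fall back to its built-in handlers for that protocol.
    return cache.computeIfAbsent(protocol, loader);
  }
}
```

Such a factory would typically be registered once per JVM via `URL.setURLStreamHandlerFactory(...)`; repeated lookups for the same protocol then return the same handler instance instead of creating a new one.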
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554531#comment-17554531
]
Sebastian Nagel commented on NUTCH-2936:
After debugging this: the call by the Hadoop MR Job to