[
https://issues.apache.org/jira/browse/NUTCH-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-3015 stopped by Lewis John McGibbney.
---
> Add more CI steps to GitHub master-build.
[
https://issues.apache.org/jira/browse/NUTCH-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney closed NUTCH-3015.
---
> Add more CI steps to GitHub master-build.
[
https://issues.apache.org/jira/browse/NUTCH-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-3015.
-
Resolution: Fixed
> Add more CI steps to GitHub master-build.
[
https://issues.apache.org/jira/browse/NUTCH-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-2887 started by Lewis John McGibbney.
---
> Migrate to JUnit 5 Jupi
Lewis John McGibbney created NUTCH-3016:
---
Summary: Upgrade Apache Ivy to 2.5.2
Key: NUTCH-3016
URL: https://issues.apache.org/jira/browse/NUTCH-3016
Project: Nutch
Issue Type: Task
[
https://issues.apache.org/jira/browse/NUTCH-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2887:
---
Assignee: Lewis John McGibbney
> Migrate to JUnit 5 Jupi
[
https://issues.apache.org/jira/browse/NUTCH-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-3015 started by Lewis John McGibbney.
---
> Add more CI steps to GitHub master-build.
[
https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-3014 started by Lewis John McGibbney.
---
> Standardize Job na
Hi dev@,
For the longest time the Nutch codebase has shipped with an
eclipse-codeformat.xml [0] file.
Whilst this has been largely successful in keeping the codebase uniform, it
cannot/has not been integrated into continuous integration (CI) and
subsequently not really enforced!
Whilst I’m a big
Lewis John McGibbney created NUTCH-3015:
---
Summary: Add more CI steps to GitHub master-build.yml
Key: NUTCH-3015
URL: https://issues.apache.org/jira/browse/NUTCH-3015
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-3014:
Description:
There is a large degree of variability when we set the job name
[
https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-3014:
Description:
There is a large degree of variability when we set the job name
[
https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-3014:
Summary: Standardize Job names (was: Standardize NutchJob job names
[
https://issues.apache.org/jira/browse/NUTCH-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-3013.
-
Resolution: Fixed
Thanks for the review [~snagel]
> Employ commons-lang
[
https://issues.apache.org/jira/browse/NUTCH-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney closed NUTCH-3013.
---
> Employ commons-lang3's StopWatch to simplify timing lo
Lewis John McGibbney created NUTCH-3014:
---
Summary: Standardize NutchJob job names
Key: NUTCH-3014
URL: https://issues.apache.org/jira/browse/NUTCH-3014
Project: Nutch
Issue Type
[
https://issues.apache.org/jira/browse/NUTCH-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-3013 started by Lewis John McGibbney.
---
> Employ commons-lang3's StopWatch to simplify timing lo
Lewis John McGibbney created NUTCH-3013:
---
Summary: Employ commons-lang3's StopWatch to simplify timing logic
Key: NUTCH-3013
URL: https://issues.apache.org/jira/browse/NUTCH-3013
Project: Nutch
Hi dev@,
I've been at arm's length for a while as $dayjob changed and then
changed again over the last number of years.
With that being said, I wanted to start a thread on $title with the
goal of establishing some "big items" we could put on the roadmap and
maybe even publish...
Here are some of
[
https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2856:
---
Assignee: (was: Lewis John McGibbney)
> Implement a protocol-smb plu
[
https://issues.apache.org/jira/browse/NUTCH-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694741#comment-17694741
]
Lewis John McGibbney commented on NUTCH-2988:
-
Actually, digging deeper it looks like
[
https://issues.apache.org/jira/browse/NUTCH-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694736#comment-17694736
]
Lewis John McGibbney commented on NUTCH-2988:
-
It looks like the [elasticsearch-java
client
[
https://issues.apache.org/jira/browse/NUTCH-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554866#comment-17554866
]
Lewis John McGibbney commented on NUTCH-2940:
-
WIP PR available at https://github.com/apache
[
https://issues.apache.org/jira/browse/NUTCH-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2940:
---
Assignee: Lewis John McGibbney
> Develop Gradle Core Build for Apache Nu
Lewis John McGibbney created NUTCH-2944:
---
Summary: Create Gradle Javadoc task
Key: NUTCH-2944
URL: https://issues.apache.org/jira/browse/NUTCH-2944
Project: Nutch
Issue Type: Sub-task
[
https://issues.apache.org/jira/browse/NUTCH-2944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-2944 started by Lewis John McGibbney.
---
> Create Gradle Javadoc t
[
https://issues.apache.org/jira/browse/NUTCH-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2943.
-
Resolution: Fixed
> Implement core dependencies in build.gradle.
[
https://issues.apache.org/jira/browse/NUTCH-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526695#comment-17526695
]
Lewis John McGibbney commented on NUTCH-2943:
-
Implemented in https://github.com/csci401
[
https://issues.apache.org/jira/browse/NUTCH-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2943:
Component/s: build
> Implement core dependencies in build.gradle.
[
https://issues.apache.org/jira/browse/NUTCH-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2943:
Summary: Implement core dependencies in build.gradle.kts (was: Management
[
https://issues.apache.org/jira/browse/NUTCH-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2943:
---
Assignee: Lewis John McGibbney
> Implement core dependenc
[
https://issues.apache.org/jira/browse/NUTCH-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-2943 started by Lewis John McGibbney.
---
> Implement core dependencies in build.gradle.
[
https://issues.apache.org/jira/browse/NUTCH-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526527#comment-17526527
]
Lewis John McGibbney commented on NUTCH-2939:
-
Hi [~Lirongxuan01] did you ever complete
Description:
An XML external entity (XXE) injection vulnerability was discovered in
the Any23 RDFa XSLTStylesheet extractor and is known to affect Any23
versions < 2.7. XML external entity injection (also known as XXE) is a
web security vulnerability that allows an attacker to interfere with
an
The Apache Any23 Project Management Committee is pleased to announce
the release of Apache Any23 2.7.
Apache Anything To Triples (Any23) is a library, a web service and a
command line tool that extracts structured data in RDF format from a
variety of Web documents.
Any23 2.7 requires JDK11 to
[
https://issues.apache.org/jira/browse/NUTCH-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2939:
---
Assignee: Ryan Li
> Create Jenkinsfile for Nutch Gradle Bu
Lewis John McGibbney created NUTCH-2939:
---
Summary: Create Jenkinsfile for Nutch Gradle Build
Key: NUTCH-2939
URL: https://issues.apache.org/jira/browse/NUTCH-2939
Project: Nutch
Issue
[
https://issues.apache.org/jira/browse/NUTCH-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2934:
Issue Type: Task (was: Improvement)
> Replace Apache Ant build system with Gra
[
https://issues.apache.org/jira/browse/NUTCH-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17481528#comment-17481528
]
Lewis John McGibbney commented on NUTCH-2925:
-
Non-functioning branch available at
https
[
https://issues.apache.org/jira/browse/NUTCH-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-2925 started by Lewis John McGibbney.
---
> Secure the Nutch REST API using Apache Sh
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-2936 started by Lewis John McGibbney.
---
> Early registration of URL stream handlers provided by plug
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2936:
---
Assignee: Lewis John McGibbney
> Early registration of URL stream handl
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476730#comment-17476730
]
Lewis John McGibbney commented on NUTCH-2936:
-
I can reproduce this. Although I was planning
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476728#comment-17476728
]
Lewis John McGibbney commented on NUTCH-2936:
-
[~snagel] which JDK are you using?
> Ea
Lewis John McGibbney created NUTCH-2938:
---
Summary: Use Any23's RepositoryWriter to write structured data to
Rdf4j repository
Key: NUTCH-2938
URL: https://issues.apache.org/jira/browse/NUTCH-2938
[
https://issues.apache.org/jira/browse/NUTCH-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476719#comment-17476719
]
Lewis John McGibbney commented on NUTCH-2936:
-
I'll try to reproduce. Thanks
> Ea
[
https://issues.apache.org/jira/browse/NUTCH-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2919:
Summary: NUTCH-2919 Upgrade to Tika 2.2.1 and Any23 2.6 (was: NUTCH-2919
Upgrade
[
https://issues.apache.org/jira/browse/NUTCH-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2919.
-
Resolution: Fixed
> NUTCH-2919 Upgrade to Tika 2.2.1 and Any23
Hi dev@,
I'm about to start a new project with USC's Senior Capstone program which
will replace our legacy Ant build with Gradle.
I opened https://issues.apache.org/jira/browse/NUTCH-2934 to track the work.
I wasn't very sure about how well Fireant would serve us moving forward so
although it was
[
https://issues.apache.org/jira/browse/NUTCH-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475625#comment-17475625
]
Lewis John McGibbney commented on NUTCH-2934:
-
I did some house cleaning by closing off all
[
https://issues.apache.org/jira/browse/NUTCH-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2293.
-
Fix Version/s: 1.19
Resolution: Abandoned
> Make the unit tests wh
[
https://issues.apache.org/jira/browse/NUTCH-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2901.
-
Resolution: Abandoned
> migrate to maven or gra
[
https://issues.apache.org/jira/browse/NUTCH-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2244.
-
Fix Version/s: 1.19
Resolution: Abandoned
> Publish Proto
[
https://issues.apache.org/jira/browse/NUTCH-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2638.
-
Fix Version/s: 1.19
Resolution: Abandoned
> Publish plugins in Ma
[
https://issues.apache.org/jira/browse/NUTCH-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2292.
-
Resolution: Abandoned
> Mavenize the build for nutch-core and nutch-plug
Lewis John McGibbney created NUTCH-2934:
---
Summary: Replace Apache Ant build system with Gradle
Key: NUTCH-2934
URL: https://issues.apache.org/jira/browse/NUTCH-2934
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2926:
Parent: NUTCH-2931
Issue Type: Sub-task (was: Improvement)
> Implem
Lewis John McGibbney created NUTCH-2933:
---
Summary: GET /seed doesn't return previously generated seed lists
Key: NUTCH-2933
URL: https://issues.apache.org/jira/browse/NUTCH-2933
Project: Nutch
Lewis John McGibbney created NUTCH-2932:
---
Summary: Create OpenAPI specification for Nutch 1.x REST API
Key: NUTCH-2932
URL: https://issues.apache.org/jira/browse/NUTCH-2932
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2925:
Parent: NUTCH-2931
Issue Type: Sub-task (was: Improvement)
> Sec
Lewis John McGibbney created NUTCH-2931:
---
Summary: Improvements to 1.x REST API
Key: NUTCH-2931
URL: https://issues.apache.org/jira/browse/NUTCH-2931
Project: Nutch
Issue Type
[
https://issues.apache.org/jira/browse/NUTCH-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2919:
Summary: NUTCH-2919 Upgrade to Tika 2.2.0 and Any23 2.6 (was: Upgrade to
Tika
The Apache Any23 Team is pleased to announce the release of Apache Any23
2.6.
Apache Anything To Triples (Any23) is a library, a web service and a
command line tool that extracts structured data in RDF format from a
variety of Web documents.
Any23 2.6 requires JDK11 to build and run.
Release
[
https://issues.apache.org/jira/browse/NUTCH-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-2839 stopped by Lewis John McGibbney.
---
> Implement Tez counters in Injector
[
https://issues.apache.org/jira/browse/NUTCH-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17471239#comment-17471239
]
Lewis John McGibbney commented on NUTCH-2839:
-
Really interesting [~abstractdog]. Your short
[
https://issues.apache.org/jira/browse/NUTCH-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17471011#comment-17471011
]
Lewis John McGibbney commented on NUTCH-2839:
-
Hi [~abstractdog] I documented everything I
[
https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17470992#comment-17470992
]
Lewis John McGibbney commented on NUTCH-2856:
-
I'm focusing on this now.
> Implem
[
https://issues.apache.org/jira/browse/NUTCH-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2429:
---
Assignee: Lewis John McGibbney
> Fix Plugin System to allow proto
[
https://issues.apache.org/jira/browse/NUTCH-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2429.
-
Resolution: Fixed
Finally merged into master branch
[~hiranchaudhuri] thank you
[
https://issues.apache.org/jira/browse/NUTCH-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2926:
Description:
The Nutch webserver caches resources (seed lists, configuration, jobs
Lewis John McGibbney created NUTCH-2926:
---
Summary: Implement persistent storage for Nutch Webserver resources
Key: NUTCH-2926
URL: https://issues.apache.org/jira/browse/NUTCH-2926
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17468743#comment-17468743
]
Lewis John McGibbney commented on NUTCH-2925:
-
[~markus17] didn't really like the idea
Lewis John McGibbney created NUTCH-2925:
---
Summary: Secure the Nutch REST API using Apache Shiro
Key: NUTCH-2925
URL: https://issues.apache.org/jira/browse/NUTCH-2925
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17468361#comment-17468361
]
Lewis John McGibbney commented on NUTCH-2923:
-
Yes it absolutely would. I didn't see
[
https://issues.apache.org/jira/browse/NUTCH-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17468361#comment-17468361
]
Lewis John McGibbney edited comment on NUTCH-2923 at 1/4/22, 5:11 AM
[
https://issues.apache.org/jira/browse/NUTCH-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17467745#comment-17467745
]
Lewis John McGibbney commented on NUTCH-2923:
-
We can easily obtain it via
{{job.getStatus
Hi Gavin,
Thanks for the email below. It was my understanding that the Nutch
project no longer relied on the legacy CMS framework.
I wrote a new website and published it at
https://github.com/apache/nutch-site with the static content being
served on the asf-site branch.
The old CMS website
[
https://issues.apache.org/jira/browse/NUTCH-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17467688#comment-17467688
]
Lewis John McGibbney commented on NUTCH-2278:
-
No problems Fengtan… a test case would
[
https://issues.apache.org/jira/browse/NUTCH-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17467687#comment-17467687
]
Lewis John McGibbney commented on NUTCH-2923:
-
Hi Prakhar, I agree with you. Are you able
[
https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2856:
Summary: Implement a protocol-smb plugin based on hierynomus/smbj (was:
Implement
[
https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17467111#comment-17467111
]
Lewis John McGibbney commented on NUTCH-2856:
-
Adding some notes from my research
[
https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2856:
Summary: Implement a protocol-smb plugin based on (was: Implement
[
https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-2856 started by Lewis John McGibbney.
---
> Implement an appropriately licensed protocol-smb plu
[
https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2856:
Issue Type: New Feature (was: Bug)
> Implement an appropriately licensed proto
[
https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2856:
Summary: Implement an appropriately licensed protocol-smb plugin (was:
protocol
[
https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466704#comment-17466704
]
Lewis John McGibbney commented on NUTCH-2856:
-
I'll take this one on. I intend to use https
[
https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2856:
---
Assignee: Lewis John McGibbney
> protocol-smb plugin is outda
[
https://issues.apache.org/jira/browse/NUTCH-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466703#comment-17466703
]
Lewis John McGibbney commented on NUTCH-427:
An old thread but I found an alternative SMB
Hi dev@,
*What?*
I've been chipping away at some documentation which would provide a
one-stop-shop for understanding Nutch metrics. My first pass is available at
https://cwiki.apache.org/confluence/display/NUTCH/Metrics
This relates to the recent JIRA issue I filed about establishing a Nutch
Hi user@, dev@,
I took the liberty of setting up a #nutch channel for our community to
communicate in a lower latency manner.
First join the-asf.slack.com Slack workspace
https://infra.apache.org/slack.html
Then simply join the #nutch channel.
See you there :)
Thanks
lewismc
--
I also should note that the -deleteGone setting cannot be overridden via
nutch-site.xml whereas similar settings do have equivalent configuration
properties in nutch-default.xml
https://github.com/apache/nutch/blob/master/conf/nutch-default.xml#L1361-L1373
On 2021/12/29 17:08:20 lewis john
Hi dev@,
Reading the code for the IndexerJob -deleteGone flag [0] you can clearly
see that we bundle deletion requests for 404s, redirects and duplicates
into one option.
This of course has pros and cons.
Does anyone wish to share their opinion on how this is implemented?
My opinion is that
1. The
Lewis John McGibbney created NUTCH-2920:
---
Summary: Implement a indexer-opensearch plugin
Key: NUTCH-2920
URL: https://issues.apache.org/jira/browse/NUTCH-2920
Project: Nutch
Issue Type
[
https://issues.apache.org/jira/browse/NUTCH-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2449.
-
Resolution: Fixed
> Usage of Tika LanguageIdentifier in language-identif
[
https://issues.apache.org/jira/browse/NUTCH-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461637#comment-17461637
]
Lewis John McGibbney commented on NUTCH-2278:
-
Out of curiosity [~Fengtan] are you still
[
https://issues.apache.org/jira/browse/NUTCH-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461635#comment-17461635
]
Lewis John McGibbney edited comment on NUTCH-2278 at 12/17/21, 7:48 PM
[
https://issues.apache.org/jira/browse/NUTCH-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461635#comment-17461635
]
Lewis John McGibbney commented on NUTCH-2278:
-
[~snagel] wdyt about this?
> Handle alph
[
https://issues.apache.org/jira/browse/NUTCH-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461620#comment-17461620
]
Lewis John McGibbney commented on NUTCH-2919:
-
The artifacts have not yet made maven central
Lewis John McGibbney created NUTCH-2919:
---
Summary: Upgrade to Tika 2.2.0
Key: NUTCH-2919
URL: https://issues.apache.org/jira/browse/NUTCH-2919
Project: Nutch
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/NUTCH-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-2919 started by Lewis John McGibbney.
---
> Upgrade to Tika 2.