[jira] [Updated] (NUTCH-2854) Address ALL security vulnerabilities indicated by report-vulnerabilities ant target

2021-03-12 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2854: Description: NUTCH-2840 uncovered lots of security issues for us to work

[jira] [Updated] (NUTCH-2855) Update org.elasticsearch.client

2021-03-05 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2855: Fix Version/s: 1.19 > Update org.elasticsearch.cli

[jira] [Updated] (NUTCH-2855) Update org.elasticsearch.client

2021-03-05 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2855: Component/s: build > Update org.elasticsearch.cli

[jira] [Assigned] (NUTCH-2855) Update org.elasticsearch.client

2021-03-05 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned NUTCH-2855: --- Assignee: Randall Williams > Update org.elasticsearch.cli

[jira] [Updated] (NUTCH-2855) Update org.elasticsearch.client

2021-03-05 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2855: Affects Version/s: 1.18 > Update org.elasticsearch.cli

[jira] [Updated] (NUTCH-2854) Address ALL security vulnerabilities indicated by report-vulnerabilities ant target

2021-03-05 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2854: Description: NUTCH-2840 uncovered lots of issues for us to work on. This is simply

[jira] [Created] (NUTCH-2854) Address ALL security vulnerabilities indicated by report-vulnerabilities ant target

2021-03-05 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2854: --- Summary: Address ALL security vulnerabilities indicated by report-vulnerabilities ant target Key: NUTCH-2854 URL: https://issues.apache.org/jira/browse/NUTCH-2854

[jira] [Created] (NUTCH-2852) Method invokes System.exit(...) 9 bugs

2021-02-18 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2852: --- Summary: Method invokes System.exit(...) 9 bugs Key: NUTCH-2852 URL: https://issues.apache.org/jira/browse/NUTCH-2852 Project: Nutch Issue

[jira] [Work started] (NUTCH-2852) Method invokes System.exit(...) 9 bugs

2021-02-18 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2852 started by Lewis John McGibbney. --- > Method invokes System.exit(...) 9 b

[jira] [Resolved] (NUTCH-2851) Random object created and used only once

2021-02-18 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2851. - Resolution: Fixed > Random object created and used only o

[jira] [Resolved] (NUTCH-2850) Method ignores exceptional return value

2021-02-18 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2850. - Resolution: Fixed > Method ignores exceptional return va

[jira] [Created] (NUTCH-2851) Random object created and used only once

2021-02-17 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2851: --- Summary: Random object created and used only once Key: NUTCH-2851 URL: https://issues.apache.org/jira/browse/NUTCH-2851 Project: Nutch Issue

[jira] [Work started] (NUTCH-2851) Random object created and used only once

2021-02-17 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2851 started by Lewis John McGibbney. --- > Random object created and used only o

[jira] [Created] (NUTCH-2850) Method ignores exceptional return value

2021-02-17 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2850: --- Summary: Method ignores exceptional return value Key: NUTCH-2850 URL: https://issues.apache.org/jira/browse/NUTCH-2850 Project: Nutch Issue

[jira] [Work started] (NUTCH-2850) Method ignores exceptional return value

2021-02-17 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2850 started by Lewis John McGibbney. --- > Method ignores exceptional return va

[jira] [Work stopped] (NUTCH-1860) Protocol IMAPS Support

2021-02-17 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-1860 stopped by Lewis John McGibbney. --- > Protocol IMAPS Supp

[jira] [Updated] (NUTCH-1860) Protocol IMAPS Support

2021-02-16 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1860: Description: Implementing the Internet Messaging Access Protocol within Nutch

[jira] [Commented] (NUTCH-1860) Protocol IMAPS Support

2021-02-16 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17285418#comment-17285418 ] Lewis John McGibbney commented on NUTCH-1860: - I'm back to work on this issue folks

[jira] [Work stopped] (NUTCH-2849) Replace remaining package.html files with package-info.java

2021-02-16 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2849 stopped by Lewis John McGibbney. --- > Replace remaining package.html files with package-info.j

[jira] [Resolved] (NUTCH-2849) Replace remaining package.html files with package-info.java

2021-02-16 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2849. - Resolution: Fixed Thanks for review [~snagel] > Replace remaining package.h

[jira] [Work started] (NUTCH-2849) Replace remaining package.html files with package-info.java

2021-02-11 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2849 started by Lewis John McGibbney. --- > Replace remaining package.html files with package-info.j

[jira] [Resolved] (NUTCH-2842) Fix Javadoc warnings, errors and add Javadoc check to Github Action and Jenkins

2021-02-11 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2842. - Resolution: Fixed > Fix Javadoc warnings, errors and add Javadoc check to Git

[jira] [Work stopped] (NUTCH-2842) Fix Javadoc warnings, errors and add Javadoc check to Github Action and Jenkins

2021-02-11 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2842 stopped by Lewis John McGibbney. --- > Fix Javadoc warnings, errors and add Javadoc check to Git

[jira] [Created] (NUTCH-2849) Replace remaining package.html files with package-info.java

2021-02-11 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2849: --- Summary: Replace remaining package.html files with package-info.java Key: NUTCH-2849 URL: https://issues.apache.org/jira/browse/NUTCH-2849 Project

[jira] [Updated] (NUTCH-2842) Fix Javadoc warnings, errors and add Javadoc check to Github Action and Jenkins

2021-02-09 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2842: Summary: Fix Javadoc warnings, errors and add Javadoc check to Github Action

[jira] [Commented] (NUTCH-2842) Fix Javadoc warnings and add Javadoc check to Github Action and Jenkins

2021-02-08 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281521#comment-17281521 ] Lewis John McGibbney commented on NUTCH-2842: - Yeah this task is pretty heavy lifting. So far

[jira] [Created] (NUTCH-2848) Consider usefulness of StringUtil#isEmpty

2021-02-06 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2848: --- Summary: Consider usefulness of StringUtil#isEmpty Key: NUTCH-2848 URL: https://issues.apache.org/jira/browse/NUTCH-2848 Project: Nutch Issue

[jira] [Updated] (NUTCH-2848) Consider use of StringUtil#isEmpty

2021-02-06 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2848: Summary: Consider use of StringUtil#isEmpty (was: Consider usefulness

[jira] [Commented] (NUTCH-2842) Fix Javadoc warnings and add Javadoc check to Github Action and Jenkins

2021-02-02 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17277668#comment-17277668 ] Lewis John McGibbney commented on NUTCH-2842: - Hi folks, I am nearly finished this behemoth

[jira] [Commented] (NUTCH-2842) Fix Javadoc warnings and add Javadoc check to Github Action and Jenkins

2021-02-02 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17277669#comment-17277669 ] Lewis John McGibbney commented on NUTCH-2842: - I should be finished some time this week

[jira] [Assigned] (NUTCH-2842) Fix Javadoc warnings and add Javadoc check to Github Action and Jenkins

2021-01-31 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned NUTCH-2842: --- Assignee: Lewis John McGibbney > Fix Javadoc warnings and add Javadoc ch

[jira] [Work started] (NUTCH-2843) Duplicate declaration of dependencies in ivy.xml

2021-01-31 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2843 started by Lewis John McGibbney. --- > Duplicate declaration of dependencies in ivy.

[jira] [Work started] (NUTCH-2842) Fix Javadoc warnings and add Javadoc check to Github Action and Jenkins

2021-01-31 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2842 started by Lewis John McGibbney. --- > Fix Javadoc warnings and add Javadoc check to Github Act

[jira] [Work stopped] (NUTCH-2840) Fix 'report-vulnerabilities' ant target in build.xml

2021-01-31 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2840 stopped by Lewis John McGibbney. --- > Fix 'report-vulnerabilities' ant target in build.

[jira] [Resolved] (NUTCH-2840) Fix 'report-vulnerabilities' ant target in build.xml

2021-01-31 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2840. - Resolution: Fixed > Fix 'report-vulnerabilities' ant target in build.

[jira] [Resolved] (NUTCH-2819) Move spotbugs "installation" directory to avoid that spotbugs is shipped in Nutch runtime

2021-01-31 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2819. - Resolution: Fixed > Move spotbugs "installation" direc

Security vulnerability reduction for Nutch

2021-01-27 Thread Lewis John McGibbney
Hi dev@, This is a heads up that I have created a project titled "Security vulnerability reduction for the Apache Nutch Web crawler project" which will be taken on within USC's CSCI 401 senior computer science capstone program. A very brief description is below for anyone interested. This

CVE-2021-23901: An XML external entity (XXE) injection vulnerability exists in the Nutch DmozParser

2021-01-24 Thread lewis john mcgibbney
Description: An XML external entity (XXE) injection vulnerability was discovered in the Nutch DmozParser and is known to affect Nutch versions < 1.18. XML external entity injection (also known as XXE) is a web security vulnerability that allows an attacker to interfere with an application's
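For readers unfamiliar with the class of bug described above, here is a minimal Java sketch of one common mitigation: configuring the JAXP parser to reject any DOCTYPE declaration, so external entities can never be declared. This is an illustration only and is not taken from the Nutch fix; the class name SafeXml and its methods are hypothetical, and the DmozParser itself may use a different parser API.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Nutch: parses XML with DOCTYPEs disallowed.
public class SafeXml {

  /** Builds a factory that refuses documents containing a DOCTYPE declaration. */
  public static DocumentBuilderFactory hardenedFactory()
      throws ParserConfigurationException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    // Enable the JAXP secure-processing limits.
    factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
    // Xerces feature: any <!DOCTYPE ...> becomes a fatal parse error,
    // which rules out external entity declarations altogether.
    factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    // Defence in depth: do not expand entity references into the DOM.
    factory.setExpandEntityReferences(false);
    return factory;
  }

  /** Parses the given XML string with the hardened factory. */
  public static Document parse(String xml) throws Exception {
    DocumentBuilder builder = hardenedFactory().newDocumentBuilder();
    return builder.parse(new InputSource(new StringReader(xml)));
  }

  public static void main(String[] args) throws Exception {
    // A plain document parses normally.
    System.out.println(parse("<a>ok</a>").getDocumentElement().getTextContent());
    // A document declaring an external entity is rejected with a SAXException.
    try {
      parse("<!DOCTYPE a [<!ENTITY x SYSTEM \"file:///etc/passwd\">]><a>&x;</a>");
    } catch (org.xml.sax.SAXException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Rejecting DOCTYPEs outright is the bluntest option; a parser that must accept DTDs would instead need external general and parameter entities disabled individually.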

[ANNOUNCE] Apache Nutch 1.18 Release

2021-01-24 Thread lewis john mcgibbney
*What?* The Apache Nutch team is pleased to announce the release of Apache Nutch v1.18. Nutch is a well-matured, production-ready Web crawler. Nutch 1.x enables fine-grained configuration, relying on Apache Hadoop™ data structures. *Where?* Source and binary distributions are available for

[RESULT] WAS Re: [VOTE] Release Apache Nutch 1.18 RC1

2021-01-24 Thread lewis john mcgibbney
user@, dev@, The 72hr VOTE'ing period has elapsed. The RESULTs are as follows: [5] +1 Release this package as Apache Nutch 1.18: Lewis John McGibbney*, Ralf Kotowski*, Jorge Luis Betancourt Gonzalez*, Sebastian Nagel*, Shashanka Balakuntala Srinivasa*. [0] -1 Do not release this package because

Re: [VOTE] Release Apache Nutch 1.18 RC1

2021-01-24 Thread Lewis John McGibbney
-and-sums >"SHOULD NOT supply a MD5 or SHA-1 checksum file because these are > deprecated" > > > Best, > Sebastian > > On 1/21/21 2:22 AM, lewis john mcgibbney wrote: > > Hi Folks, > > ssh://g...@gitlab.padim.fim.uni-passau.de:13003/os

Re: [VOTE] Release Apache Nutch 1.18 RC1

2021-01-24 Thread Lewis John McGibbney
r. > at > org.apache.ivy.core.retrieve.RetrieveEngine.determineArtifactsToCopy(RetrieveEngine.java:413) > at > org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:122) > ... 43 more > > On Thu, Jan 21, 2021 at 2:22 AM lewis jo

[jira] [Closed] (NUTCH-2844) Link Alternatif Joker123

2021-01-24 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-2844. --- > Link Alternatif Joker123 > > >

[jira] [Resolved] (NUTCH-2844) Link Alternatif Joker123

2021-01-24 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2844. - Resolution: Not A Problem > Link Alternatif Joker

[jira] [Commented] (NUTCH-2826) Migrate Nutch Site from Apache CMS to Hugo

2021-01-20 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17269026#comment-17269026 ] Lewis John McGibbney commented on NUTCH-2826: - This is pretty cool I never heard of Hugo

[VOTE] Release Apache Nutch 1.18 RC1

2021-01-20 Thread lewis john mcgibbney
Hi Folks, A first candidate for the Nutch 1.18 release is available at [0] where accompanying SHA512, ASC and MD5 signatures can also be found. Information on verifying releases can be found at [1]. The release candidate is a .zip and tar.gz archive of the sources in [2]. In addition, a staged
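As a rough illustration of the checksum part of release verification mentioned above, the following Java sketch computes a SHA-512 digest and compares it with a published hex value. The class and method names are hypothetical, and the official verification instructions referenced at [1] remain the authoritative procedure (they also cover the ASC signature, which this sketch does not).

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for comparing an artifact against its .sha512 value.
public class Sha512Check {

  /** Streams the input through SHA-512 and returns the lowercase hex digest. */
  public static String sha512Hex(InputStream in)
      throws IOException, NoSuchAlgorithmException {
    MessageDigest digest = MessageDigest.getInstance("SHA-512");
    byte[] buffer = new byte[8192];
    int read;
    while ((read = in.read(buffer)) != -1) {
      digest.update(buffer, 0, read);
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : digest.digest()) {
      hex.append(String.format("%02x", b));
    }
    return hex.toString();
  }

  /** Returns true when the computed digest equals the expected hex string. */
  public static boolean matches(InputStream in, String expectedHex)
      throws IOException, NoSuchAlgorithmException {
    return sha512Hex(in).equalsIgnoreCase(expectedHex.trim());
  }
}
```

In practice the stream would be a FileInputStream over the downloaded archive, and expectedHex the hex portion of the accompanying SHA512 file.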

[jira] [Commented] (NUTCH-2842) Fix Javadoc warnings and add Javadoc check to Github Action and Jenkins

2021-01-19 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17268213#comment-17268213 ] Lewis John McGibbney commented on NUTCH-2842: - I discovered that we can add a _failonerror_

[jira] [Created] (NUTCH-2843) Duplicate declaration of dependencies in ivy.xml

2021-01-19 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2843: --- Summary: Duplicate declaration of dependencies in ivy.xml Key: NUTCH-2843 URL: https://issues.apache.org/jira/browse/NUTCH-2843 Project: Nutch

[jira] [Created] (NUTCH-2842) Fix Javadoc warnings and add Javadoc check to Github Action and Jenkins

2021-01-14 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2842: --- Summary: Fix Javadoc warnings and add Javadoc check to Github Action and Jenkins Key: NUTCH-2842 URL: https://issues.apache.org/jira/browse/NUTCH-2842

[jira] [Updated] (NUTCH-2839) Implement Tez counters in Injector job

2021-01-13 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2839: Fix Version/s: (was: 1.18) 1.19 > Implement Tez count

[DISCUSS] Nutch 1.18 Release

2021-01-13 Thread Lewis John McGibbney
Hi dev@, Here were the stats for 1.18: 122 issues in total (Done 29, In Progress 1, To Do 90). The 1 IN-PROGRESS item is https://issues.apache.org/jira/browse/NUTCH-2840 Fix 'report-vulnerabilities' ant target in build.xml, however I have no immediate desire to merge that as it is not ready

[jira] [Updated] (NUTCH-2832) Create tutorial on sending Nutch logs to Elasticsearch

2021-01-13 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2832: Fix Version/s: (was: 1.18) 1.19 > Create tutor

[jira] [Updated] (NUTCH-2840) Fix 'report-vulnerabilities' ant target in build.xml

2021-01-13 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2840: Fix Version/s: (was: 1.18) 1.19 > Fix 'rep

[jira] [Closed] (NUTCH-2841) Upgrade xercesImpl dependency

2021-01-13 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-2841. --- > Upgrade xercesImpl dependency > - > >

[jira] [Resolved] (NUTCH-2841) Upgrade xercesImpl dependency

2021-01-13 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2841. - Resolution: Fixed > Upgrade xercesImpl depende

[jira] [Work started] (NUTCH-2841) Upgrade xercesImpl dependency

2021-01-13 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2841 started by Lewis John McGibbney. --- > Upgrade xercesImpl depende

[jira] [Created] (NUTCH-2841) Upgrade xercesImpl dependency

2021-01-13 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2841: --- Summary: Upgrade xercesImpl dependency Key: NUTCH-2841 URL: https://issues.apache.org/jira/browse/NUTCH-2841 Project: Nutch Issue Type

[jira] [Closed] (NUTCH-2837) Update multiple dependencies

2021-01-08 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-2837. --- > Update multiple dependencies > > >

[jira] [Resolved] (NUTCH-2837) Update multiple dependencies

2021-01-08 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2837. - Resolution: Fixed > Update multiple dependenc

[jira] [Work started] (NUTCH-2837) Update multiple dependencies

2021-01-08 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2837 started by Lewis John McGibbney. --- > Update multiple dependenc

[jira] [Updated] (NUTCH-2837) Update multiple dependencies

2021-01-08 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2837: Description: Began with a trivial upgrade of slf4j-api and slf4j-log4j12

[jira] [Work started] (NUTCH-2840) Fix 'report-vulnerabilities' ant target in build.xml

2021-01-07 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2840 started by Lewis John McGibbney. --- > Fix 'report-vulnerabilities' ant target in build.

[jira] [Created] (NUTCH-2840) Fix 'report-vulnerabilities' ant target in build.xml

2021-01-07 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2840: --- Summary: Fix 'report-vulnerabilities' ant target in build.xml Key: NUTCH-2840 URL: https://issues.apache.org/jira/browse/NUTCH-2840 Project: Nutch

[jira] [Resolved] (NUTCH-2836) Upgrade various commons dependencies

2021-01-07 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2836. - Resolution: Fixed > Upgrade various commons dependenc

[jira] [Closed] (NUTCH-2836) Upgrade various commons dependencies

2021-01-07 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-2836. --- > Upgrade various commons dependenc

[jira] [Work started] (NUTCH-2836) Upgrade various commons dependencies

2021-01-07 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2836 started by Lewis John McGibbney. --- > Upgrade various commons dependenc

[jira] [Updated] (NUTCH-2837) Update multiple dependencies

2020-12-22 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2837: Summary: Update multiple dependencies (was: Upgrade Slf4j dependencies) > Upd

[jira] [Created] (NUTCH-2839) Implement Tez counters in Injector job

2020-12-22 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2839: --- Summary: Implement Tez counters in Injector job Key: NUTCH-2839 URL: https://issues.apache.org/jira/browse/NUTCH-2839 Project: Nutch Issue

[jira] [Work started] (NUTCH-2839) Implement Tez counters in Injector job

2020-12-22 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2839 started by Lewis John McGibbney. --- > Implement Tez counters in Injector

[jira] [Created] (NUTCH-2838) Apache Tez integration

2020-12-22 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2838: --- Summary: Apache Tez integration Key: NUTCH-2838 URL: https://issues.apache.org/jira/browse/NUTCH-2838 Project: Nutch Issue Type: New Feature

Re: [DISCUSS] Replacing MapReduce with Tez

2020-12-21 Thread Lewis John McGibbney
you On 2020/12/10 07:46:30, lewis john mcgibbney wrote: > Hi dev@, > A while ago I had thought about bringing this topic up... I then got > busy... for ages. I'll therefore get straight to the point. > Has anyone on the dev@ team had an experience using Apache Tez - > tez.ap

Re: RE: [DISCUSS] Replacing MapReduce with Tez

2020-12-21 Thread Lewis John McGibbney
Hi Markus, Thanks for chiming in :) My responses below On 2020/12/21 21:32:08, Markus Jelsma wrote: > Hello Lewis, > > 1. counters, for me they are a requirement to have as they are key to regular > inspections of ongoing crawls, finding errors and debugging. I hope you can > find a work

Re: [DISCUSS] Replacing MapReduce with Tez

2020-12-21 Thread Lewis John McGibbney
On 2020/12/10 07:46:30, lewis john mcgibbney wrote: > Hi dev@, > A while ago I had thought about bringing this topic up... I then got > busy... for ages. I'll therefore get straight to the point. > Has anyone on the dev@ team had an experience using Apache Tez - > tez.apache.org?

[jira] [Created] (NUTCH-2837) Upgrade Slf4j dependencies

2020-12-20 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2837: --- Summary: Upgrade Slf4j dependencies Key: NUTCH-2837 URL: https://issues.apache.org/jira/browse/NUTCH-2837 Project: Nutch Issue Type

[jira] [Created] (NUTCH-2836) Upgrade various commons dependencies

2020-12-19 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2836: --- Summary: Upgrade various commons dependencies Key: NUTCH-2836 URL: https://issues.apache.org/jira/browse/NUTCH-2836 Project: Nutch Issue Type

[jira] [Work started] (NUTCH-2835) Upgrade commons-jexl from 2 --> 3

2020-12-17 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2835 started by Lewis John McGibbney. --- > Upgrade commons-jexl from 2 --

[jira] [Resolved] (NUTCH-2835) Upgrade commons-jexl from 2 --> 3

2020-12-17 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2835. - Resolution: Fixed > Upgrade commons-jexl from 2 --

[jira] [Created] (NUTCH-2835) Upgrade commons-jexl from 2 --> 3

2020-12-16 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2835: --- Summary: Upgrade commons-jexl from 2 --> 3 Key: NUTCH-2835 URL: https://issues.apache.org/jira/browse/NUTCH-2835 Project: Nutch Issue T

[DISCUSS] Replacing MapReduce with Tez

2020-12-09 Thread lewis john mcgibbney
Hi dev@, A while ago I had thought about bringing this topic up... I then got busy... for ages. I'll therefore get straight to the point. Has anyone on the dev@ team had an experience using Apache Tez - tez.apache.org? Tez promises multiple improvements over MapReduce. Naturally I wondered whether

[jira] [Commented] (NUTCH-2832) Create tutorial on sending Nutch logs to Elasticsearch

2020-11-20 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17236551#comment-17236551 ] Lewis John McGibbney commented on NUTCH-2832: - The framework above requires Log4j2 which

[jira] [Work started] (NUTCH-2832) Create tutorial on sending Nutch logs to Elasticsearch

2020-11-20 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2832 started by Lewis John McGibbney. --- > Create tutorial on sending Nutch logs to Elasticsea

[jira] [Updated] (NUTCH-2832) Create tutorial on sending Nutch logs to Elasticsearch

2020-11-20 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2832: Summary: Create tutorial on sending Nutch logs to Elasticsearch (was: Implement

[jira] [Created] (NUTCH-2832) Implement capability to send Nutch log's directly to Elasticsearch

2020-11-20 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2832: --- Summary: Implement capability to send Nutch log's directly to Elasticsearch Key: NUTCH-2832 URL: https://issues.apache.org/jira/browse/NUTCH-2832

[jira] [Resolved] (NUTCH-2809) Upgrade any23 plugin dependency to 2.4

2020-11-17 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2809. - Resolution: Fixed > Upgrade any23 plugin dependency to

[jira] [Assigned] (NUTCH-2809) Upgrade any23 plugin dependency to 2.4

2020-10-11 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned NUTCH-2809: --- Assignee: Lewis John McGibbney (was: Shashanka Balakuntala Srinivasa

[jira] [Work started] (NUTCH-2809) Upgrade any23 plugin dependency to 2.4

2020-10-11 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2809 started by Lewis John McGibbney. --- > Upgrade any23 plugin dependency to

[jira] [Updated] (NUTCH-2809) Upgrade any23 plugin dependency to 2.4

2020-10-11 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2809: Description: Any23 2.4 has been released, the plugin any23 should be upgraded

[jira] [Updated] (NUTCH-2809) Upgrade any23 plugin dependency to 2.4

2020-10-11 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2809: Summary: Upgrade any23 plugin dependency to 2.4 (was: Upgrade any23 plugin

[jira] [Work started] (NUTCH-1860) Protocol IMAPS Support

2020-10-10 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-1860 started by Lewis John McGibbney. --- > Protocol IMAPS Supp

[jira] [Assigned] (NUTCH-1860) Protocol IMAPS Support

2020-10-10 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned NUTCH-1860: --- Assignee: Lewis John McGibbney > Protocol IMAPS Supp

[jira] [Closed] (NUTCH-1861) Implement POP3 Protocol

2020-10-10 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-1861. --- Fix Version/s: 1.16 Resolution: Won't Do Closing in favor of IMAP(S) support

[ANNOUNCEMENT] Apache Any23 2.4 Release

2020-10-06 Thread lewis john mcgibbney
The Apache Any23 Team is pleased to announce the release of Apache Any23 2.4. Apache Anything To Triples (Any23) is a library, a web service and a command line tool that extracts structured data in RDF format from a variety of Web documents. Release Notes:

[jira] [Created] (NUTCH-2812) Methods returning array may expose internal representation

2020-07-31 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-2812: --- Summary: Methods returning array may expose internal representation Key: NUTCH-2812 URL: https://issues.apache.org/jira/browse/NUTCH-2812 Project: Nutch

Re: Setting up automatic tests and check in GIT

2020-07-30 Thread lewis john mcgibbney
Hi Folks, Moving private@ to Bcc I think this is an excellent idea. As you know Shashanka, the examples in Gora can be found at https://github.com/apache/gora/tree/master/.github/workflows These would run on an incoming build and a push to master respectively. On another note, I don't know if

[jira] [Comment Edited] (NUTCH-2669) Reliable solution for javax.ws packaging.type

2020-07-28 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166875#comment-17166875 ] Lewis John McGibbney edited comment on NUTCH-2669 at 7/29/20, 5:21 AM

[jira] [Comment Edited] (NUTCH-2669) Reliable solution for javax.ws packaging.type

2020-07-28 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166875#comment-17166875 ] Lewis John McGibbney edited comment on NUTCH-2669 at 7/29/20, 5:22 AM

[jira] [Comment Edited] (NUTCH-2669) Reliable solution for javax.ws packaging.type

2020-07-28 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166875#comment-17166875 ] Lewis John McGibbney edited comment on NUTCH-2669 at 7/29/20, 5:21 AM

[jira] [Commented] (NUTCH-2669) Reliable solution for javax.ws packaging.type

2020-07-28 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166875#comment-17166875 ] Lewis John McGibbney commented on NUTCH-2669: - Hi [~snagel] I did the following and was able

[jira] [Commented] (NUTCH-2805) Rename plugin urlfilter-domainblacklist

2020-07-10 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155728#comment-17155728 ] Lewis John McGibbney commented on NUTCH-2805: - Nice > Rename plugin urlfil

[jira] [Assigned] (NUTCH-2803) Rename property http.robot.rules.whitelist

2020-07-10 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned NUTCH-2803: --- Assignee: Lewis John McGibbney > Rename property http.robot.rules.whitel
