dev
Messages by Thread
[jira] [Created] (NUTCH-3060) Javadoc link broken on website
Sebastian Nagel (Jira)
[jira] [Updated] (NUTCH-3060) Javadoc link broken on website
Sebastian Nagel (Jira)
[jira] [Commented] (NUTCH-3056) Injector to support resolving seed URLs
Markus Jelsma (Jira)
[jira] [Created] (NUTCH-3059) Generator: selector job does not count reduce output records
Sebastian Nagel (Jira)
[PR] NUTCH-3058 Fetcher: counter for hung threads [nutch]
via GitHub
[jira] [Commented] (NUTCH-3058) Fetcher: counter for hung threads
ASF GitHub Bot (Jira)
[jira] [Created] (NUTCH-3058) Fetcher: counter for hung threads
Sebastian Nagel (Jira)
[jira] [Resolved] (NUTCH-3055) README: fix Github "hub" commands
Sebastian Nagel (Jira)
Re: [PR] NUTCH-3055 README: fix Github "hub" commands [nutch]
via GitHub
[jira] [Resolved] (NUTCH-3044) Generator: NPE when extracting the host part of a URL fails
Sebastian Nagel (Jira)
[PR] NUTCH-3057 - Fix for index-arbitrary plugin improper retention and us… [nutch]
via GitHub
Re: [PR] NUTCH-3057 - Fix for index-arbitrary plugin improper retention and us… [nutch]
via GitHub
[jira] [Commented] (NUTCH-3057) Arbitrary indexer "leaks" previous value into a field processed after an exception
Joe Gilvary (Jira)
[jira] [Commented] (NUTCH-3057) Arbitrary indexer "leaks" previous value into a field processed after an exception
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3057) Arbitrary indexer "leaks" previous value into a field processed after an exception
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3057) Arbitrary indexer "leaks" previous value into a field processed after an exception
Joe Gilvary (Jira)
[jira] [Created] (NUTCH-3057) Arbitrary indexer "leaks" previous value into a field processed after an exception
Joe Gilvary (Jira)
[jira] [Updated] (NUTCH-3056) Injector to support resolving seed URLs
Markus Jelsma (Jira)
[jira] [Updated] (NUTCH-3056) Injector to support resolving seed URLs
Markus Jelsma (Jira)
[jira] [Updated] (NUTCH-3056) Injector to support resolving seed URLs
Markus Jelsma (Jira)
[jira] [Created] (NUTCH-3056) Injector to support resolving seed URLs
Markus Jelsma (Jira)
[jira] [Closed] (NUTCH-3041) Address confusing logging in o.a.n.net.URLExemptionFilters
Lewis John McGibbney (Jira)
[jira] [Work stopped] (NUTCH-3041) Address confusing logging in o.a.n.net.URLExemptionFilters
Lewis John McGibbney (Jira)
[jira] [Resolved] (NUTCH-3041) Address confusing logging in o.a.n.net.URLExemptionFilters
Lewis John McGibbney (Jira)
[jira] [Resolved] (NUTCH-3043) Generator: count URLs rejected by URL filters
Sebastian Nagel (Jira)
[jira] [Resolved] (NUTCH-3039) Failure to handle ftp:// URLs
Sebastian Nagel (Jira)
Community over Code EU 2024: The countdown has started!
Ryan Skraba
[PR] Revert incorrect change [nutch-site]
via GitHub
Re: [PR] Revert incorrect change [nutch-site]
via GitHub
Re: [PR] Revert incorrect change [nutch-site]
via GitHub
Re: [PR] Revert incorrect change [nutch-site]
via GitHub
[jira] [Commented] (NUTCH-585) [PARSE-HTML plugin] Block certain parts of HTML code from being indexed
Joe Gilvary (Jira)
[jira] [Closed] (NUTCH-3054) Address deprecation of Node16 for all GitHub Actions
Lewis John McGibbney (Jira)
[jira] [Resolved] (NUTCH-3054) Address deprecation of Node16 for all GitHub Actions
Lewis John McGibbney (Jira)
Re: [PR] NUTCH-3054 Address deprecation of Node16 for all GitHub Actions [nutch]
via GitHub
[jira] [Commented] (NUTCH-3055) README: fix Github "hub" commands
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3055) README: fix Github "hub" commands
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3055) README: fix Github "hub" commands
Hudson (Jira)
[jira] [Created] (NUTCH-3055) README: fix Github "hub" commands
Sebastian Nagel (Jira)
[jira] [Commented] (NUTCH-3045) Upgrade from Java 11 to 17
Sebastian Nagel (Jira)
[jira] [Commented] (NUTCH-3054) Address deprecation of Node16 for all GitHub Actions
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3054) Address deprecation of Node16 for all GitHub Actions
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3054) Address deprecation of Node16 for all GitHub Actions
Hudson (Jira)
[jira] [Updated] (NUTCH-3054) Address deprecation of Node16 for all GitHub Actions
Lewis John McGibbney (Jira)
[jira] [Work started] (NUTCH-3054) Address deprecation of Node16 for all GitHub Actions
Lewis John McGibbney (Jira)
[jira] [Created] (NUTCH-3054) Address deprecation of Node16 for all GitHub Actions
Lewis John McGibbney (Jira)
[jira] [Commented] (NUTCH-3049) Investigate using Records
Lewis John McGibbney (Jira)
[jira] [Created] (NUTCH-3053) Upgrade build and CI to JDK17
Lewis John McGibbney (Jira)
[jira] [Created] (NUTCH-3052) Investigate using sealed classes
Lewis John McGibbney (Jira)
[jira] [Created] (NUTCH-3051) Investigate using new pattern matching syntax in switch expressions
Lewis John McGibbney (Jira)
[jira] [Created] (NUTCH-3050) Investigate use of the enhanced instanceof operator
Lewis John McGibbney (Jira)
[jira] [Created] (NUTCH-3049) Investigate using Records
Lewis John McGibbney (Jira)
[jira] [Created] (NUTCH-3048) Investigate where/if new string utility methods could be used
Lewis John McGibbney (Jira)
[jira] [Created] (NUTCH-3047) Use multi-line text blocks
Lewis John McGibbney (Jira)
[jira] [Updated] (NUTCH-3046) Use compact strings
Lewis John McGibbney (Jira)
[jira] [Commented] (NUTCH-1806) Delegate processing of URL domains to crawler commons
ASF GitHub Bot (Jira)
[PR] NUTCH-1806 Delegate processing of URL domains to crawler-common [nutch]
via GitHub
[jira] [Created] (NUTCH-3046) Use compact strings
Lewis John McGibbney (Jira)
[jira] [Created] (NUTCH-3045) Upgrade from Java 11 to 17
Lewis John McGibbney (Jira)
[ANNOUNCE] Apache Nutch 1.20 Release
lewis john mcgibbney
Re: [PR] NUTCH-3044 Generator: NPE when extracting the host part of a URL fails [nutch]
via GitHub
Re: [PR] NUTCH-3044 Generator: NPE when extracting the host part of a URL fails [nutch]
via GitHub
Re: [PR] NUTCH-3044 Generator: NPE when extracting the host part of a URL fails [nutch]
via GitHub
Re: [PR] NUTCH-3044 Generator: NPE when extracting the host part of a URL fails [nutch]
via GitHub
Re: [PR] NUTCH-3043 Generator: count URLs rejected by URL filters [nutch]
via GitHub
Re: [PR] NUTCH-3043 Generator: count URLs rejected by URL filters [nutch]
via GitHub
Re: [PR] NUTCH-3043 Generator: count URLs rejected by URL filters [nutch]
via GitHub
Re: [PR] NUTCH-3043 Generator: count URLs rejected by URL filters [nutch]
via GitHub
Re: [PR] NUTCH-3043 Generator: count URLs rejected by URL filters [nutch]
via GitHub
[jira] [Commented] (NUTCH-3044) Generator: NPE when extracting the host part of a URL fails
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3044) Generator: NPE when extracting the host part of a URL fails
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3044) Generator: NPE when extracting the host part of a URL fails
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3044) Generator: NPE when extracting the host part of a URL fails
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3044) Generator: NPE when extracting the host part of a URL fails
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3044) Generator: NPE when extracting the host part of a URL fails
Hudson (Jira)
[jira] [Created] (NUTCH-3044) Generator: NPE when extracting the host part of a URL fails
Sebastian Nagel (Jira)
[jira] [Commented] (NUTCH-3043) Generator: count URLs rejected by URL filters
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3043) Generator: count URLs rejected by URL filters
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3043) Generator: count URLs rejected by URL filters
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3043) Generator: count URLs rejected by URL filters
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3043) Generator: count URLs rejected by URL filters
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3043) Generator: count URLs rejected by URL filters
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3043) Generator: count URLs rejected by URL filters
Hudson (Jira)
[jira] [Created] (NUTCH-3043) Generator: count URLs rejected by URL filters
Sebastian Nagel (Jira)
[DISCUSS] Consolidating Nutch Continuous Integration
lewis john mcgibbney
Re: [DISCUSS] Consolidating Nutch Continuous Integration
Lewis John McGibbney
Re: [DISCUSS] Consolidating Nutch Continuous Integration
Sebastian Nagel
Re: [DISCUSS] Consolidating Nutch Continuous Integration
Lewis John McGibbney
Re: [PR] NUTCH-3041 Address confusing logging in o.a.n.net.URLExemptionFilters [nutch]
via GitHub
Re: [PR] NUTCH-3041 Address confusing logging in o.a.n.net.URLExemptionFilters [nutch]
via GitHub
[jira] [Updated] (NUTCH-3042) Use GitHub cache action to improve CI execution time
Lewis John McGibbney (Jira)
[jira] [Created] (NUTCH-3042) Use GitHub cache action to improve CI execution time
Lewis John McGibbney (Jira)
[jira] [Work started] (NUTCH-3041) Address confusing logging in o.a.n.net.URLExemptionFilters
Lewis John McGibbney (Jira)
[jira] [Commented] (NUTCH-3041) Address confusing logging in o.a.n.net.URLExemptionFilters
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3041) Address confusing logging in o.a.n.net.URLExemptionFilters
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3041) Address confusing logging in o.a.n.net.URLExemptionFilters
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3041) Address confusing logging in o.a.n.net.URLExemptionFilters
Hudson (Jira)
[jira] [Updated] (NUTCH-3041) Address confusing logging in o.a.n.net.URLExemptionFilters
Lewis John McGibbney (Jira)
[jira] [Updated] (NUTCH-3041) Address confusing logging in o.a.n.net.URLExemptionFilters
Lewis John McGibbney (Jira)
[jira] [Created] (NUTCH-3041) Address confusing logging in o.a.n.net.URLExemptionFilters
Lewis John McGibbney (Jira)
[jira] [Commented] (NUTCH-3040) Upgrade to Hadoop 3.4.0
Tim Allison (Jira)
[jira] [Created] (NUTCH-3040) Upgrade to Hadoop 3.4.0
Sebastian Nagel (Jira)
[jira] [Commented] (NUTCH-3039) Failure to handle ftp:// URLs
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3039) Failure to handle ftp:// URLs
Markus Jelsma (Jira)
[jira] [Commented] (NUTCH-3039) Failure to handle ftp:// URLs
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3039) Failure to handle ftp:// URLs
Hudson (Jira)
[jira] [Assigned] (NUTCH-3039) Failure to handle ftp:// URLs
Sebastian Nagel (Jira)
[PR] NUTCH-3039 Failure to handle ftp:// URLs [nutch]
via GitHub
Re: [PR] NUTCH-3039 Failure to handle ftp:// URLs [nutch]
via GitHub
[jira] [Created] (NUTCH-3039) Failure to handle ftp:// URLs
Sebastian Nagel (Jira)
[VOTE] Apache Nutch 1.20 Release
lewis john mcgibbney
Re: [VOTE] Apache Nutch 1.20 Release
Sebastian Nagel
Re: [VOTE] Apache Nutch 1.20 Release
Lewis John McGibbney
Re: [VOTE] Apache Nutch 1.20 Release
lewis john mcgibbney
Re: [VOTE] Apache Nutch 1.20 Release
Joe Gilvary
Re: [VOTE] Apache Nutch 1.20 Release
Shashanka Balakuntala
Re: [VOTE] Apache Nutch 1.20 Release
Joe Gilvary
Re: [VOTE] Apache Nutch 1.20 Release
BlackIce
[RESULT] WAS Re: [VOTE] Apache Nutch 1.20 Release
lewis john mcgibbney
[jira] [Resolved] (NUTCH-3038) Address issues discovered during 1.20 release management dryrun
Lewis John McGibbney (Jira)
[jira] [Closed] (NUTCH-3038) Address issues discovered during 1.20 release management dryrun
Lewis John McGibbney (Jira)
[jira] [Work stopped] (NUTCH-3038) Address issues discovered during 1.20 release management dryrun
Lewis John McGibbney (Jira)
Mentor request for lewismc
lewis john mcgibbney
Re: Mentor request for lewismc
Furkan KAMACI
Re: Mentor request for lewismc
Sanyam Goel
Re: Mentor request for lewismc
Lewis John McGibbney
Re: Mentor request for lewismc
Sanyam Goel
[jira] [Resolved] (NUTCH-2937) parse-tika: review dependency exclusions and avoid dependency conflicts in distributed mode
Sebastian Nagel (Jira)
[jira] [Assigned] (NUTCH-2937) parse-tika: review dependency exclusions and avoid dependency conflicts in distributed mode
Sebastian Nagel (Jira)
[jira] [Resolved] (NUTCH-3005) Upgrade selenium as needed
Sebastian Nagel (Jira)
[jira] [Resolved] (NUTCH-3016) Upgrade Apache Ivy to 2.5.2
Sebastian Nagel (Jira)
[jira] [Updated] (NUTCH-3016) Upgrade Apache Ivy to 2.5.2
Sebastian Nagel (Jira)
[jira] [Updated] (NUTCH-3005) Upgrade selenium as needed
Sebastian Nagel (Jira)
[jira] [Updated] (NUTCH-3005) Upgrade selenium as needed
Sebastian Nagel (Jira)
[jira] [Commented] (NUTCH-3038) Address issues discovered during 1.20 release management dryrun
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3038) Address issues discovered during 1.20 release management dryrun
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3038) Address issues discovered during 1.20 release management dryrun
Hudson (Jira)
[jira] [Work started] (NUTCH-3038) Address issues discovered during 1.20 release management dryrun
Lewis John McGibbney (Jira)
[jira] [Updated] (NUTCH-3038) Address issues discovered during 1.20 release management dryrun
Lewis John McGibbney (Jira)
[PR] NUTCH-3038 Address issues discovered during 1.20 release management dryrun [nutch]
via GitHub
Re: [PR] NUTCH-3038 Address issues discovered during 1.20 release management dryrun [nutch]
via GitHub
[jira] [Created] (NUTCH-3038) Address issues discovered during 1.20 release management dryrun
Lewis John McGibbney (Jira)
[jira] [Closed] (NUTCH-3032) Indexing plugin as an adapter for end user's own POJO instances
Lewis John McGibbney (Jira)
Community over Code EU 2024: Start planning your trip!
Ryan Skraba
Participate in the ASF 25th Anniversary Campaign
Brian Proffitt
[jira] [Resolved] (NUTCH-3032) Indexing plugin as an adapter for end user's own POJO instances
Joe Gilvary (Jira)
[jira] [Work started] (NUTCH-3032) Indexing plugin as an adapter for end user's own POJO instances
Joe Gilvary (Jira)
[jira] [Assigned] (NUTCH-3032) Indexing plugin as an adapter for end user's own POJO instances
Lewis John McGibbney (Jira)
[jira] [Work stopped] (NUTCH-2856) Implement a protocol-smb plugin based on hierynomus/smbj
Lewis John McGibbney (Jira)
[jira] [Work stopped] (NUTCH-2887) Migrate to JUnit 5 Jupiter
Lewis John McGibbney (Jira)
[jira] [Closed] (NUTCH-2832) Create tutorial on sending Nutch logs to Elasticsearch
Lewis John McGibbney (Jira)
[jira] [Resolved] (NUTCH-2832) Create tutorial on sending Nutch logs to Elasticsearch
Lewis John McGibbney (Jira)
[jira] [Resolved] (NUTCH-3036) Upgrade org.seleniumhq.selenium:selenium-java dependency in lib-selenium
Lewis John McGibbney (Jira)
[jira] [Closed] (NUTCH-3036) Upgrade org.seleniumhq.selenium:selenium-java dependency in lib-selenium
Lewis John McGibbney (Jira)
[jira] [Closed] (NUTCH-3035) Update license and notice file for release of 1.20
Lewis John McGibbney (Jira)
[jira] [Resolved] (NUTCH-3035) Update license and notice file for release of 1.20
Lewis John McGibbney (Jira)
[jira] [Resolved] (NUTCH-3037) Upgrade org.apache.kafka:kafka_2.12: to v3.7.0
Lewis John McGibbney (Jira)
[jira] [Closed] (NUTCH-3037) Upgrade org.apache.kafka:kafka_2.12: to v3.7.0
Lewis John McGibbney (Jira)
Re: [PR] NUTCH-3037 Upgrade org.apache.kafka:kafka_2.12: to v3.7.0 [nutch]
via GitHub
Community Over Code NA 2024 Travel Assistance Applications now open!
Gavin McDonald
[PR] NUTCH-3032 Code for an ArbitraryIndexingFilter to index values resolved by user POJO code at index time [nutch]
via GitHub
Re: [PR] NUTCH-3032 Code for an ArbitraryIndexingFilter to index values resolved by user POJO code at index time [nutch]
via GitHub
Re: [PR] NUTCH-3032 Code for an ArbitraryIndexingFilter to index values resolved by user POJO code at index time [nutch]
via GitHub
Re: [PR] NUTCH-3032 Code for an ArbitraryIndexingFilter to index values resolved by user POJO code at index time [nutch]
via GitHub
Re: [PR] NUTCH-3032 Code for an ArbitraryIndexingFilter to index values resolved by user POJO code at index time [nutch]
via GitHub
Re: [PR] NUTCH-3032 Code for an ArbitraryIndexingFilter to index values resolved by user POJO code at index time [nutch]
via GitHub
Re: [PR] NUTCH-3032 Code for an ArbitraryIndexingFilter to index values resolved by user POJO code at index time [nutch]
via GitHub
Re: [PR] NUTCH-3032 Code for an ArbitraryIndexingFilter to index values resolved by user POJO code at index time [nutch]
via GitHub
Re: [PR] NUTCH-3032 Code for an ArbitraryIndexingFilter to index values resolved by user POJO code at index time [nutch]
via GitHub
Re: [PR] NUTCH-3032 Code for an ArbitraryIndexingFilter to index values resolved by user POJO code at index time [nutch]
via GitHub
[jira] [Commented] (NUTCH-3037) Upgrade org.apache.kafka:kafka_2.12: to v3.7.0
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3037) Upgrade org.apache.kafka:kafka_2.12: to v3.7.0
ASF GitHub Bot (Jira)
[jira] [Work stopped] (NUTCH-3037) Upgrade org.apache.kafka:kafka_2.12: to v3.7.0
Lewis John McGibbney (Jira)
[jira] [Updated] (NUTCH-3037) Upgrade org.apache.kafka:kafka_2.12: to v3.7.0
Lewis John McGibbney (Jira)
[jira] [Work started] (NUTCH-3037) Upgrade org.apache.kafka:kafka_2.12: to v3.7.0
Lewis John McGibbney (Jira)
[jira] [Created] (NUTCH-3037) Upgrade org.apache.kafka:kafka_2.12: to v3.7.0
Lewis John McGibbney (Jira)
[jira] [Comment Edited] (NUTCH-3026) Allow statusOnly option for IndexingJob
Tim Allison (Jira)
[jira] [Resolved] (NUTCH-3026) Allow statusOnly option for IndexingJob
Tim Allison (Jira)
Re: [PR] WIP StatsD metrics example [nutch]
via GitHub
Re: [PR] WIP StatsD metrics example [nutch]
via GitHub
[jira] [Work stopped] (NUTCH-3036) Upgrade org.seleniumhq.selenium:selenium-java dependency in lib-selenium
Lewis John McGibbney (Jira)
Re: [PR] NUTCH-3036 Upgrade org.seleniumhq.selenium:selenium-java dependency i… [nutch]
via GitHub
Re: [PR] NUTCH-3036 Upgrade org.seleniumhq.selenium:selenium-java dependency i… [nutch]
via GitHub
Re: [PR] NUTCH-3036 Upgrade org.seleniumhq.selenium:selenium-java dependency i… [nutch]
via GitHub
Re: [PR] NUTCH-3036 Upgrade org.seleniumhq.selenium:selenium-java dependency i… [nutch]
via GitHub
Re: [PR] NUTCH-3036 Upgrade org.seleniumhq.selenium:selenium-java dependency i… [nutch]
via GitHub
[jira] [Commented] (NUTCH-3035) Update license and notice file for release of 1.20
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3035) Update license and notice file for release of 1.20
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3035) Update license and notice file for release of 1.20
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3035) Update license and notice file for release of 1.20
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3035) Update license and notice file for release of 1.20
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3035) Update license and notice file for release of 1.20
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3035) Update license and notice file for release of 1.20
Hudson (Jira)
[PR] NUTCH-3035 Update license and notice file for release of 1.20 [nutch]
via GitHub
Re: [PR] NUTCH-3035 Update license and notice file for release of 1.20 [nutch]
via GitHub
Re: [PR] NUTCH-3035 Update license and notice file for release of 1.20 [nutch]
via GitHub
Re: [PR] NUTCH-3035 Update license and notice file for release of 1.20 [nutch]
via GitHub
Re: [PR] NUTCH-3035 Update license and notice file for release of 1.20 [nutch]
via GitHub
[jira] [Commented] (NUTCH-3036) Upgrade org.seleniumhq.selenium:selenium-java dependency in lib-selenium
ASF GitHub Bot (Jira)
[jira] [Commented] (NUTCH-3036) Upgrade org.seleniumhq.selenium:selenium-java dependency in lib-selenium
ASF GitHub Bot (Jira)