[jira] [Updated] (NUTCH-3034) Overhaul the legacy Nutch plugin framework and replace it with PF4J

2024-03-12 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-3034: Description: h1. Motivation Plugins provide a large part of the functionality

[jira] [Created] (NUTCH-3034) Overhaul the legacy Nutch plugin framework and replace it with PF4J

2024-03-12 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-3034: --- Summary: Overhaul the legacy Nutch plugin framework and replace it with PF4J Key: NUTCH-3034 URL: https://issues.apache.org/jira/browse/NUTCH-3034

Differences in retrieve pattern between Ivy 2.5.0/2.5.1 & 2.5.2?

2024-03-11 Thread lewis john mcgibbney
Hi ivy-user@, I am working on upgrading Ivy to latest over in the Apache Nutch project. The build works just fine with 2.5.0 and 2.5.1 but with 2.5.2 the CI fails with the following complaint /home/runner/work/nutch/nutch/src/plugin/build-plugin.xml:234: impossible to ivy retrieve:

[jira] [Created] (NUTCH-3033) Upgrade Ivy to v2.5.2

2024-03-11 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-3033: --- Summary: Upgrade Ivy to v2.5.2 Key: NUTCH-3033 URL: https://issues.apache.org/jira/browse/NUTCH-3033 Project: Nutch Issue Type: Task

[jira] [Work started] (NUTCH-3033) Upgrade Ivy to v2.5.2

2024-03-11 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-3033 started by Lewis John McGibbney. --- > Upgrade Ivy to v2.

Re: [DISCUSS] Release Nutch 1.20

2024-03-10 Thread Lewis John McGibbney
> > - address the ES licensing issue, > >the easiest way is to downgrade, see NUTCH-3008 > >If done update the license-related files. > > > > - there are three short PRs open > > > > I'll try to have a look at these points the next days

Re: Indexing arbitrary fields

2024-03-08 Thread Lewis John McGibbney
Hi Joe, Thanks for describing your work in detail. It provides a great utility which I think could be of immense value. Please feel free to create a JIRA ticket which can be used as the basis for linking to the prior similar examples you referenced. A WIP pull request would be ideal. Thanks

[DISCUSS] Release Nutch 1.20

2024-03-07 Thread lewis john mcgibbney
Hi dev@, As of today, 51 issues have been addressed in the 1.20 development drive. https://issues.apache.org/jira/projects/NUTCH/versions/12352190 I would like to push a release soon and ship it to the user community. Any objections? Thank you lewismc

Re: [DISCUSS] Graduate Apache SDAP (Incubating) as a Top Level Project

2024-03-07 Thread Lewis John McGibbney
Julien has very succinctly described the community growth challenges and podling direction. For a number of years I acted as mentor for SDAP and was puzzled by the community's inability to push releases. This still concerns me... That being said, there is definitely potential (the

Re: [DISCUSS] Incubating Proposal for StormCrawler

2024-03-07 Thread Lewis John McGibbney
I think StormCrawler would be an excellent candidate for the Incubator. If the podling is looking for an additional mentor, I would be happy to chip in. lewismc On 2024/03/03 23:24:38 PJ Fanning wrote: > Hi everyone, > > I would like to propose StormCrawler [1] as a new Apache Incubator

Re: [DISCUSS] Graduate Apache Celeborn (Incubating) as a Top Level Project

2024-03-07 Thread Lewis John McGibbney
+1 Excellent work on the Incubating releases and community building, lewismc On 2024/03/05 06:00:49 Yu Li wrote: > Hi All, > > Apache Celeborn joined Incubator in October 2022 [1]. Since then, > we've made significant progress towards maturing our community and > adopting the Apache Way. > >

[jira] [Closed] (NUTCH-3024) Remove flaky 'dependency check' target

2023-11-24 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-3024. --- > Remove flaky 'dependency check' tar

[jira] [Resolved] (NUTCH-3024) Remove flaky 'dependency check' target

2023-11-24 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-3024. - Resolution: Fixed > Remove flaky 'dependency check' tar

[jira] [Updated] (TIKA-4169) Create a parser for Functional Mockup Unit (FMU) media type with .fmu extension

2023-11-13 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-4169: --- Description: A Functional Mockup Unit (FMU) is a software component used

[jira] [Created] (TIKA-4169) Create a parser for Functional Mockup Unit (FMU) media type with .fmu extension

2023-11-13 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created TIKA-4169: -- Summary: Create a parser for Functional Mockup Unit (FMU) media type with .fmu extension Key: TIKA-4169 URL: https://issues.apache.org/jira/browse/TIKA-4169

[jira] [Closed] (NUTCH-3007) Fix impossible casts

2023-11-10 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-3007. --- > Fix impossible casts > > > Key

[jira] [Closed] (NUTCH-2846) Fix various bugs spotted by NUTCH-2815

2023-11-10 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-2846. --- > Fix various bugs spotted by NUTCH-2

[jira] [Closed] (NUTCH-2852) Method invokes System.exit(...) 9 bugs

2023-11-10 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-2852. --- > Method invokes System.exit(...) 9 b

[jira] [Closed] (NUTCH-2819) Move spotbugs "installation" directory to avoid that spotbugs is shipped in Nutch runtime

2023-11-10 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-2819. --- > Move spotbugs "installation" directory to avoid that spotbugs is shippe

[jira] [Closed] (NUTCH-2851) Random object created and used only once

2023-11-10 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-2851. --- > Random object created and used only o

[jira] [Closed] (NUTCH-2850) Method ignores exceptional return value

2023-11-10 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-2850. --- > Method ignores exceptional return va

[jira] [Created] (NUTCH-3024) Remove flaky 'dependency check' target

2023-11-03 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-3024: --- Summary: Remove flaky 'dependency check' target Key: NUTCH-3024 URL: https://issues.apache.org/jira/browse/NUTCH-3024 Project: Nutch Issue

Removing “dependency-check” target from build.xml

2023-11-03 Thread lewis john mcgibbney
Hi dev@, Recently I was doing a bit of work on CI and made an attempt to activate the “dependency-check” target (previously named “report-vulnerabilities”). It appears that the underlying “dependency-check” tooling is flaky at best. It appears to take an awfully long time to execute and seems to

[jira] [Created] (NUTCH-3023) Use mikepenz/action-junit-report to improve interpretation of failed tests during CI

2023-11-02 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-3023: --- Summary: Use mikepenz/action-junit-report to improve interpretation of failed tests during CI Key: NUTCH-3023 URL: https://issues.apache.org/jira/browse/NUTCH-3023

[jira] [Closed] (NUTCH-3014) Standardize Job names

2023-11-02 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-3014. --- Thanks [~snagel] for the review > Standardize Job na

[jira] [Resolved] (NUTCH-3014) Standardize Job names

2023-11-02 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-3014. - Resolution: Fixed > Standardize Job na

[jira] [Created] (NUTCH-3022) Experiment formatting codebase per google-java-format

2023-11-02 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-3022: --- Summary: Experiment formatting codebase per google-java-format Key: NUTCH-3022 URL: https://issues.apache.org/jira/browse/NUTCH-3022 Project: Nutch

[jira] [Work stopped] (NUTCH-3014) Standardize Job names

2023-11-02 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-3014 stopped by Lewis John McGibbney. --- > Standardize Job na

Re: Nutch codebase formatting

2023-11-02 Thread Lewis John McGibbney
y emerging as the industry OSS default, offers a > >> GitHub action and could also be configured to lint dockerfile, and other > >> artifacts. It can also be configured to use the google Java style as well… > > +1 (with Google Java style) > > > > I’ll submit a PR for su

Re: Nutch codebase formatting

2023-10-28 Thread Lewis John McGibbney
Any thoughts on this, folks? I’ll submit a PR for superlinter so everyone can see what it would look like. lewismc On 2023/10/23 19:28:45 lewis john mcgibbney wrote: > Hi dev@, > > For the longest time the Nutch codebase has shipped with an > eclipse-codeformat.xml [0] file. >

[jira] [Work stopped] (NUTCH-3015) Add more CI steps to GitHub master-build.yml

2023-10-27 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-3015 stopped by Lewis John McGibbney. --- > Add more CI steps to GitHub master-build.

[jira] [Closed] (NUTCH-3015) Add more CI steps to GitHub master-build.yml

2023-10-27 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-3015. --- > Add more CI steps to GitHub master-build.

[jira] [Resolved] (NUTCH-3015) Add more CI steps to GitHub master-build.yml

2023-10-27 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-3015. - Resolution: Fixed > Add more CI steps to GitHub master-build.

[jira] [Work started] (NUTCH-2887) Migrate to JUnit 5 Jupiter

2023-10-24 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2887 started by Lewis John McGibbney. --- > Migrate to JUnit 5 Jupi

[jira] [Created] (NUTCH-3016) Upgrade Apache Ivy to 2.5.2

2023-10-24 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-3016: --- Summary: Upgrade Apache Ivy to 2.5.2 Key: NUTCH-3016 URL: https://issues.apache.org/jira/browse/NUTCH-3016 Project: Nutch Issue Type: Task

[jira] [Assigned] (NUTCH-2887) Migrate to JUnit 5 Jupiter

2023-10-23 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned NUTCH-2887: --- Assignee: Lewis John McGibbney > Migrate to JUnit 5 Jupi

[jira] [Work started] (NUTCH-3015) Add more CI steps to GitHub master-build.yml

2023-10-23 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-3015 started by Lewis John McGibbney. --- > Add more CI steps to GitHub master-build.

[jira] [Work started] (NUTCH-3014) Standardize Job names

2023-10-23 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-3014 started by Lewis John McGibbney. --- > Standardize Job na

Nutch codebase formatting

2023-10-23 Thread lewis john mcgibbney
Hi dev@, For the longest time the Nutch codebase has shipped with an eclipse-codeformat.xml [0] file. Whilst this has been largely successful in keeping the codebase uniform, it cannot/has not been integrated into continuous integration (CI) and subsequently not really enforced! Whilst I’m a big

[jira] [Created] (NUTCH-3015) Add more CI steps to GitHub master-build.yml

2023-10-22 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-3015: --- Summary: Add more CI steps to GitHub master-build.yml Key: NUTCH-3015 URL: https://issues.apache.org/jira/browse/NUTCH-3015 Project: Nutch

[jira] [Updated] (NUTCH-3014) Standardize Job names

2023-10-22 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-3014: Description: There is a large degree of variability when we set the job name

[jira] [Updated] (NUTCH-3014) Standardize Job names

2023-10-22 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-3014: Summary: Standardize Job names (was: Standardize NutchJob job names

[jira] [Resolved] (NUTCH-3013) Employ commons-lang3's StopWatch to simplify timing logic

2023-10-21 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-3013. - Resolution: Fixed Thanks for the review [~snagel]  > Employ commons-lang

[jira] [Closed] (NUTCH-3013) Employ commons-lang3's StopWatch to simplify timing logic

2023-10-21 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-3013. --- > Employ commons-lang3's StopWatch to simplify timing lo

Re: Roll-Call for Apache Flagon

2023-10-21 Thread lewis john mcgibbney
I’m here. lewismc On Sat, Oct 21, 2023 at 08:28 Christofer Dutz wrote: > Hi all, > > > > I was tasked at the last board report to pursue a roll call for Apache > Flagon after we saw that a VOTE thread has currently been open for over 2 > weeks with only one vote (which was “-0”). > > Also

[jira] [Created] (NUTCH-3014) Standardize NutchJob job names

2023-10-21 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-3014: --- Summary: Standardize NutchJob job names Key: NUTCH-3014 URL: https://issues.apache.org/jira/browse/NUTCH-3014 Project: Nutch Issue Type

[jira] [Work started] (NUTCH-3013) Employ commons-lang3's StopWatch to simplify timing logic

2023-10-20 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-3013 started by Lewis John McGibbney. --- > Employ commons-lang3's StopWatch to simplify timing lo

[jira] [Created] (NUTCH-3013) Employ commons-lang3's StopWatch to simplify timing logic

2023-10-20 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-3013: --- Summary: Employ commons-lang3's StopWatch to simplify timing logic Key: NUTCH-3013 URL: https://issues.apache.org/jira/browse/NUTCH-3013 Project: Nutch

No appenders could be found for logger (org.apache.celeborn.mapreduce.v2.app.MRAppMasterWithCeleborn)

2023-10-18 Thread lewis john mcgibbney
Hi user@, I am making progress in my experiments integrating Nutch 1.20-SNAPSHOT, Hadoop 3.3.4 and Celeborn 0.4.0-SNAPSHOT-incubating! In both the Hadoop word count example and with all of the Nutch MapReduce jobs I run, I see the following output present in the YARN container stderr log output

Re: java.lang.NumberFormatException: null when running Hadoop Mapreduce Wordcount example

2023-10-18 Thread Lewis John McGibbney
Hi Ethan, Thanks for the advice! As I am in an experimental phase, I decided to try again in pseudo-distributed mode... I tried downgrading to Hadoop 3.2.1 (OpenJDK8) but apparently that Hadoop distribution doesn't run on Apple M1 chip! I therefore tried again on Hadoop 3.3.4 and was

Re: [NEW FEATURE AVAILABLE] Celeborn support MapReduce engine.

2023-10-18 Thread Lewis John McGibbney
Excellent. Thanks for the heads up :) lewismc On 2023/10/18 03:44:54 Ethan Feng wrote: > Hi Lewis, > > Thanks for reaching out. > > I can confirm that future Celeborn releases will include the "mr" > client jars since Celeborn 0.4.0 and it will start the release process > in a short period. >

java.lang.NumberFormatException: null when running Hadoop Mapreduce Wordcount example

2023-10-17 Thread lewis john mcgibbney
Hi user@, I cloned Celeborn (0.4.0-Incubating) 69defcad7f9423c9c24d2d22ead856b4225671c6 today and built it with the -Pmr profile. openjdk version "11.0.20.1" 2023-08-24 OpenJDK Runtime Environment Homebrew (build 11.0.20.1+0) OpenJDK 64-Bit Server VM Homebrew (build 11.0.20.1+0, mixed mode)

Re: [NEW FEATURE AVAILABLE] Celeborn support MapReduce engine.

2023-10-17 Thread Lewis John McGibbney
Hi Ethan, I'm just picking up Celeborn now and plan on running some experiments with the Apache Nutch (https://nutch.apache.org) project. I downloaded Celeborn 0.3.1-incubating (2023-10-13) from the downloads page and noticed that no Celeborn client jars for MapReduce exist at

Establishing a Nutch development roadmap

2023-09-26 Thread lewis john mcgibbney
Hi dev@, I've been at arm's length for a while as $dayjob changed and then changed again over the last number of years. With that being said, I wanted to start a thread on $title with the goal of establishing some "big items" we could put on the roadmap and maybe even publish... Here are some of

Re: [DISCUSS] Removing Any23 from Nutch?

2023-09-14 Thread lewis john mcgibbney
+1 Tim. On Wed, Sep 13, 2023 at 16:50 > > > > -- Forwarded message -- > From: Tim Allison > To: user@nutch.apache.org, d...@nutch.apache.org > Cc: > Bcc: > Date: Wed, 13 Sep 2023 10:50:08 -0400 > Subject: [DISCUSS] Removing Any23 from Nutch? > All, > I opened

Yahoo's Burst

2023-05-18 Thread lewis john mcgibbney
Hi user@, I stumbled across Burst today... It looks like it is under active development and the documentation is lacking for loading data via a client. https://github.com/yahoo/burst lewismc -- http://home.apache.org/~lewismc/ http://people.apache.org/keys/committer/lewismc

Re: [VOTE] Move OODT to Attic?

2023-04-05 Thread Lewis John McGibbney
+1 move to the attic. I share Sean's sentiment entirely. A real success story. Thanks Imesha for representing the project to the Board. lewismc On 2023/04/03 01:02:01 Imesha Sudasingha wrote: > Hello everyone: > > Due to inactivity, Apache OODT is considering moving to the Attic [1]. This >

Re: FLAGON IS A TOP LEVEL PROJECT

2023-03-23 Thread lewis john mcgibbney
Congrats community. lewismc On Wed, Mar 22, 2023 at 19:55 Joshua Poore wrote: > All, > > I’m so excited to tell you that the ASF Board unanimously approved the > resolution to establish Apache Flagon as an ASF Top Level Project. > > HUGE thanks to our community—PMC, committers, contributors,

Re: Tika server crashes

2023-03-20 Thread Lewis John McGibbney
Bit of a plug for tika-helm here folks... Horizontal pod autoscaling [0] is available (off by default) and can be configured via values.yaml or overridden on the CLI. This would mean that a tika-server would still be available in the event that one particular pod went down

[jira] [Updated] (TIKA-3989) Upgrade tika-helm Horizontal Pod Autoscaling from to autoscaling/v2beta1 to autoscaling/v2

2023-03-20 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3989: --- Description: The _*autoscaling/v2beta1*_ API is superseded with {_}*autoscaling/v2

[jira] [Created] (TIKA-3989) Upgrade tika-helm Horizontal Pod Autoscaling from to autoscaling/v2beta1 to autoscaling/v2

2023-03-20 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created TIKA-3989: -- Summary: Upgrade tika-helm Horizontal Pod Autoscaling from to autoscaling/v2beta1 to autoscaling/v2 Key: TIKA-3989 URL: https://issues.apache.org/jira/browse/TIKA

[jira] [Updated] (TIKA-3989) Upgrade tika-helm Horizontal Pod Autoscaling from to autoscaling/v2beta1 to autoscaling/v2

2023-03-20 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3989: --- Description: The _*autoscaling/v2beta1*_ API is superseded with autoscaling/v2

[jira] [Closed] (TIKA-3988) Add Github Action to Lint and Test Charts

2023-03-20 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed TIKA-3988. -- > Add Github Action to Lint and Test Cha

[jira] [Resolved] (TIKA-3988) Add Github Action to Lint and Test Charts

2023-03-20 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-3988. Resolution: Fixed > Add Github Action to Lint and Test Cha

[ANNOUNCEMENT] Apache Tika Helm Chart v2.7.0 and v2.7.0-full released

2023-03-19 Thread lewis john mcgibbney
The Tika PMC is happy to announce that tika-helm v2.7.0 and v2.7.0-full Charts are now available. Documentation can be found at https://github.com/apache/tika-helm#readme Please register support and feedback at https://github.com/apache/tika-helm/pulls Thanks to everyone who contributed to

[jira] [Commented] (TIKA-3988) Add Github Action to Lint and Test Charts

2023-03-19 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17702421#comment-17702421 ] Lewis John McGibbney commented on TIKA-3988: It looks like there are some permissions issues

[jira] [Created] (TIKA-3988) Add Github Action to Lint and Test Charts

2023-03-19 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created TIKA-3988: -- Summary: Add Github Action to Lint and Test Charts Key: TIKA-3988 URL: https://issues.apache.org/jira/browse/TIKA-3988 Project: Tika Issue Type

[jira] [Commented] (TIKA-3985) Automate tika-helm Chart releases with helm/chart-releaser-action

2023-03-19 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17702402#comment-17702402 ] Lewis John McGibbney commented on TIKA-3985: https://github.com/marketplace/actions/jfrog-cli

Re: Userale Schema

2023-03-15 Thread lewis john mcgibbney
Big +1 on this. Would be useful as we are thinking about potentially pushing data into OpenSearch in the future. A schema and data types would be very useful. Lewis On Wed, Mar 15, 2023 at 1:48 PM Gedd Johnson wrote: > > Hi all, > > As discussed in this PR, we'd like to ideate on the topic of

[jira] [Created] (TIKA-3985) Automate tika-helm Chart releases with helm/chart-releaser-action

2023-03-10 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created TIKA-3985: -- Summary: Automate tika-helm Chart releases with helm/chart-releaser-action Key: TIKA-3985 URL: https://issues.apache.org/jira/browse/TIKA-3985 Project

[jira] [Updated] (TIKA-3452) java.nio.file.FileSystemException Read-only file system

2023-03-03 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3452: --- Fix Version/s: 2.7.0 (was: 2.0.0-BETA

[jira] [Resolved] (TIKA-3452) java.nio.file.FileSystemException Read-only file system

2023-03-03 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-3452. Resolution: Fixed > java.nio.file.FileSystemException Read-only file sys

[jira] [Assigned] (TEZ-4371) Implement ClientServiceDelegate.getJobCounters

2023-02-28 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TEZ-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned TEZ-4371: - Assignee: Lewis John McGibbney > Implement ClientServiceDelegate.getJobCount

[jira] (TEZ-4371) Implement ClientServiceDelegate.getJobCounters

2023-02-28 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TEZ-4371 ] Lewis John McGibbney deleted comment on TEZ-4371: --- was (Author: lewismc): [~abstractdog] I have to finish off NUTCH-2856 then I could make an effort to investigate and implement

[jira] [Assigned] (NUTCH-2856) Implement a protocol-smb plugin based on hierynomus/smbj

2023-02-28 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned NUTCH-2856: --- Assignee: (was: Lewis John McGibbney) > Implement a protocol-smb plu

[jira] [Commented] (NUTCH-2988) Elasticsearch 7.13.2 compatible with ASL 2.0?

2023-02-28 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694741#comment-17694741 ] Lewis John McGibbney commented on NUTCH-2988: - Actually, digging deeper it looks like

[jira] [Commented] (NUTCH-2988) Elasticsearch 7.13.2 compatible with ASL 2.0?

2023-02-28 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694736#comment-17694736 ] Lewis John McGibbney commented on NUTCH-2988: - It looks like the [elasticsearch-java client

[jira] [Updated] (TIKA-3452) java.nio.file.FileSystemException Read-only file system

2023-02-15 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3452: --- Summary: java.nio.file.FileSystemException Read-only file system

Re: [ROLL CALL] Project status of Any23

2023-02-13 Thread Lewis John McGibbney
Hi, I am here but have been busy doing other tasks right now. The project has been very quiet for quite a while and this has been reported to the Board in many previous reports. I neglected to submit reports for two or three months but will attempt to rectify that. I also try to push releases

Re: user Digest 8 Nov 2022 10:16:05 -0000 Issue 3169

2022-11-08 Thread lewis john mcgibbney
Hi Mike, Yes, it is possible to extend the TLD list. In fact, when the TLD list was compiled the author left a note explicitly stating that it may not be complete. https://github.com/apache/nutch/blob/master/conf/domain-suffixes.xml.template Please submit a PR if you wish to make any changes or

Re: [ VOTE ] Graduation of Flagon Project

2022-11-05 Thread lewis john mcgibbney
> >> >> On Tue, Nov 1, 2022 at 7:06 AM Gedd Johnson >> wrote: >> >> >> >> > +1 >> >> > >> >> > Best, >> >> > Gedd Johnson >> >> > >> >> > On Mon, Oct 31, 2022 at 23:22 Joshua Poor

[jira] [Commented] (TIKA-2536) Move to later edu.ucar version to avoid EOL dependencies

2022-11-02 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17628032#comment-17628032 ] Lewis John McGibbney commented on TIKA-2536: They may appreciate a contribution which allows

[jira] [Comment Edited] (TIKA-2536) Move to later edu.ucar version to avoid EOL dependencies

2022-11-02 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17628029#comment-17628029 ] Lewis John McGibbney edited comment on TIKA-2536 at 11/3/22 12:36 AM

[jira] [Commented] (TIKA-2536) Move to later edu.ucar version to avoid EOL dependencies

2022-11-02 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17628029#comment-17628029 ] Lewis John McGibbney commented on TIKA-2536: As of version 5.0, netCDF-Java is released under

[jira] [Commented] (TIKA-2536) Move to later edu.ucar version to avoid EOL dependencies

2022-11-02 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17628027#comment-17628027 ] Lewis John McGibbney commented on TIKA-2536: As [~nick] mentioned referencing 3rd-party

Re: Community VOTE started for Flagon graduation

2022-11-02 Thread lewis john mcgibbney
Hi Austin, +1 from me. lewismc On Wed, Nov 2, 2022 at 3:46 PM wrote: > -- Forwarded message -- > From: Austin Bennett > To: Incubator General > Cc: > Bcc: > Date: Mon, 31 Oct 2022 09:39:09 -0700 > Subject: Community VOTE started for Flagon graduation > Hi Incubator, > > Per >

Re: [ VOTE ] Graduation of Flagon Project

2022-10-31 Thread lewis john mcgibbney
+1 On Mon, Oct 31, 2022 at 09:31 Austin Bennett wrote: > Hi Flagon Community, > +1 > Given recent discussions around the graduation status of the project, it > is time to work through the process. We have had a recent discussion > on-list, and consensus seems to be in favor of graduation. The

Re: [DISCUSS] Graduate Apache Flagon

2022-10-22 Thread lewis john mcgibbney
Hi Josh, In short… yes. I firmly believe the project should graduate. Please carry my +1 through to a VOTE thread. Thanks lewismc On Fri, Oct 21, 2022 at 20:42 Joshua Poore wrote: > Hi All, > > I’m initiating this DISCUSS thread to talk through the community’s > perspective (at this point in

[jira] [Commented] (TIKA-3826) Helm: use appVersion from Charts.yaml intsead of images.tag

2022-10-13 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617298#comment-17617298 ] Lewis John McGibbney commented on TIKA-3826: [~hairmare] good suggestion. Please file a PR
