[jira] [Updated] (TIKA-4232) Create and execute unit tests for tika-helm
[ https://issues.apache.org/jira/browse/TIKA-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated TIKA-4232:
---
Fix Version/s: 2.9.3
> Create and execute unit tests for tika-helm
> ---
>
> Key: TIKA-4232
> URL: https://issues.apache.org/jira/browse/TIKA-4232
> Project: Tika
> Issue Type: Improvement
> Components: tika-helm
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Major
> Fix For: 2.9.3
>
> The goal is to execute chart unit tests against each tika-helm pull request.
> I found the [Helm Unit Tests|https://github.com/marketplace/actions/helm-unit-tests] GitHub Action,
> which uses [https://github.com/helm-unittest/helm-unittest] as a Helm plugin.
> The PR will consist of one or more unit tests automated via the GitHub Action.
--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (TIKA-4232) Create and execute unit tests for tika-helm
[ https://issues.apache.org/jira/browse/TIKA-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney resolved TIKA-4232.
Resolution: Fixed
[jira] [Closed] (TIKA-4232) Create and execute unit tests for tika-helm
[ https://issues.apache.org/jira/browse/TIKA-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney closed TIKA-4232.
[jira] [Closed] (TIKA-4233) Check tika-helm for deprecated k8s APIs
[ https://issues.apache.org/jira/browse/TIKA-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney closed TIKA-4233.
> Check tika-helm for deprecated k8s APIs
> ---
>
> Key: TIKA-4233
> URL: https://issues.apache.org/jira/browse/TIKA-4233
> Project: Tika
> Issue Type: New Feature
> Components: tika-helm
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Major
> Fix For: 2.9.3
>
> It is useful to know when a Helm Chart uses deprecated k8s APIs. A check for this would be ideal.
> The “Check deprecated k8s APIs” GitHub Action accomplishes this.
> [https://github.com/marketplace/actions/check-deprecated-k8s-apis]
[jira] [Resolved] (TIKA-4233) Check tika-helm for deprecated k8s APIs
[ https://issues.apache.org/jira/browse/TIKA-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney resolved TIKA-4233.
Resolution: Fixed
This PR broke one of the GitHub Action workflows. I have written to INFRA about it: https://issues.apache.org/jira/browse/INFRA-25775
[jira] [Updated] (TIKA-4233) Check tika-helm for deprecated k8s APIs
[ https://issues.apache.org/jira/browse/TIKA-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated TIKA-4233:
---
Fix Version/s: 2.9.3
[jira] [Commented] (TIKA-4232) Create and execute unit tests for tika-helm
[ https://issues.apache.org/jira/browse/TIKA-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835077#comment-17835077 ]
Lewis John McGibbney commented on TIKA-4232:
---
It turns out that the GitHub Action I originally wanted to use will not be approved for use. I am therefore investigating using [https://github.com/marketplace/actions/docker-run-action] to run the {{helmunittest/helm-unittest}} Docker image and generate the JUnit report, and then using [https://github.com/marketplace/actions/junit-report-action] to report the test results to the PR. I will investigate further and follow up here.
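The reporting step described in the comment above boils down to reading a JUnit XML file and surfacing the pass/fail counts. The following is only a sketch of that idea, not what junit-report-action does internally: it assumes the report follows the conventional `<testsuites>`/`<testsuite>`/`<testcase>` layout, and the sample report (suite and test names included) is invented for illustration.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Count total, failed and errored test cases in a JUnit XML report."""
    root = ET.fromstring(xml_text)
    total = failed = errored = 0
    # iter() walks the whole tree, so this works whether the root element
    # is <testsuites> or a single <testsuite>.
    for case in root.iter("testcase"):
        total += 1
        if case.find("failure") is not None:
            failed += 1
        elif case.find("error") is not None:
            errored += 1
    return {"total": total, "failed": failed, "errored": errored}


# Hypothetical report; real suite and test names will differ.
SAMPLE_REPORT = """<testsuites>
  <testsuite name="deployment_test.yaml">
    <testcase name="should render one replica"/>
    <testcase name="should set the image tag">
      <failure message="unexpected image tag"/>
    </testcase>
  </testsuite>
</testsuites>"""

summary = summarize_junit(SAMPLE_REPORT)
print(summary)
```

A CI job could fail the build whenever `failed` or `errored` is non-zero.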
[jira] [Created] (TIKA-4233) Check tika-helm for deprecated k8s APIs
Lewis John McGibbney created TIKA-4233:
---
Summary: Check tika-helm for deprecated k8s APIs
Key: TIKA-4233
URL: https://issues.apache.org/jira/browse/TIKA-4233
Project: Tika
Issue Type: New Feature
Components: tika-helm
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Fix For: 2.9.2

It is useful to know when a Helm Chart uses deprecated k8s APIs. A check for this would be ideal. The “Check deprecated k8s APIs” GitHub Action accomplishes this.
[https://github.com/marketplace/actions/check-deprecated-k8s-apis]
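To make the idea of such a check concrete, here is a minimal sketch that scans rendered chart manifests for a few API versions removed in Kubernetes 1.25. It is not how the referenced GitHub Action works; the lookup table is a deliberately small subset, the manifest text is invented, and a real check would parse YAML properly and follow the full Kubernetes deprecation guide.

```python
import re

# Illustrative subset only: (apiVersion, kind) -> replacement apiVersion.
DEPRECATED = {
    ("autoscaling/v2beta1", "HorizontalPodAutoscaler"): "autoscaling/v2",
    ("policy/v1beta1", "PodDisruptionBudget"): "policy/v1",
    ("batch/v1beta1", "CronJob"): "batch/v1",
}


def find_deprecated(manifest_text):
    """Return (apiVersion, kind, replacement) for each deprecated object found."""
    findings = []
    # Rendered charts separate YAML documents with a line containing only '---'.
    for doc in re.split(r"^---[ \t]*$", manifest_text, flags=re.MULTILINE):
        api = re.search(r"^apiVersion:\s*(\S+)", doc, flags=re.MULTILINE)
        kind = re.search(r"^kind:\s*(\S+)", doc, flags=re.MULTILINE)
        if not api or not kind:
            continue
        key = (api.group(1), kind.group(1))
        if key in DEPRECATED:
            findings.append((key[0], key[1], DEPRECATED[key]))
    return findings


# Hypothetical rendered output, as might come from `helm template`.
RENDERED = """apiVersion: apps/v1
kind: Deployment
metadata:
  name: tika
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: tika
"""

findings = find_deprecated(RENDERED)
for api_version, kind, replacement in findings:
    print(f"{kind} uses {api_version}; use {replacement} instead")
```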
[jira] [Created] (TIKA-4232) Create and execute unit tests for tika-helm
Lewis John McGibbney created TIKA-4232:
---
Summary: Create and execute unit tests for tika-helm
Key: TIKA-4232
URL: https://issues.apache.org/jira/browse/TIKA-4232
Project: Tika
Issue Type: Improvement
Components: tika-helm
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Fix For: 2.9.2

The goal is to execute chart unit tests against each tika-helm pull request.
I found the [Helm Unit Tests|https://github.com/marketplace/actions/helm-unit-tests] GitHub Action, which uses [https://github.com/helm-unittest/helm-unittest] as a Helm plugin.
The PR will consist of one or more unit tests automated via the GitHub Action.
[jira] [Commented] (TIKA-4227) Register tika-helm Chart in artifacthub.io
[ https://issues.apache.org/jira/browse/TIKA-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832505#comment-17832505 ]
Lewis John McGibbney commented on TIKA-4227:
---
Available at [https://artifacthub.io/packages/helm/apache-tika/tika]
> Register tika-helm Chart in artifacthub.io
> ---
>
> Key: TIKA-4227
> URL: https://issues.apache.org/jira/browse/TIKA-4227
> Project: Tika
> Issue Type: Task
> Components: tika-helm
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Minor
> Fix For: 2.9.2
>
> [https://artifacthub.io/] is the most popular search interface for Helm Charts (amongst many other artifacts).
> This task will register the tika-helm Chart with [https://artifacthub.io/].
[jira] [Resolved] (TIKA-4227) Register tika-helm Chart in artifacthub.io
[ https://issues.apache.org/jira/browse/TIKA-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney resolved TIKA-4227.
Resolution: Fixed
[jira] [Closed] (TIKA-4227) Register tika-helm Chart in artifacthub.io
[ https://issues.apache.org/jira/browse/TIKA-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney closed TIKA-4227.
[jira] [Created] (TIKA-4227) Register tika-helm Chart in artifacthub.io
Lewis John McGibbney created TIKA-4227:
---
Summary: Register tika-helm Chart in artifacthub.io
Key: TIKA-4227
URL: https://issues.apache.org/jira/browse/TIKA-4227
Project: Tika
Issue Type: Task
Components: tika-helm
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Fix For: 2.9.2

[https://artifacthub.io/] is the most popular search interface for Helm Charts (amongst many other artifacts).
This task will register the tika-helm Chart with [https://artifacthub.io/].
[jira] [Updated] (TIKA-4169) Create a parser for Functional Mockup Unit (FMU) media type with .fmu extension
[ https://issues.apache.org/jira/browse/TIKA-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated TIKA-4169:
---
Description:
A Functional Mockup Unit (FMU) is a software component used for exchanging and simulating dynamic system models. It is designed to enable simulations of system models regardless of the simulation tool, programming language, or hardware platform. This is made possible through a standard interface that allows FMUs to be exported and imported across different simulation environments.
The FMU media type ships with the .fmu file suffix.
I think the MIT-licensed [NTNU-IHB/FMI4j|https://github.com/NTNU-IHB/FMI4j] can be used as the underlying parser implementation.
I will go on the hunt for some sample files we can use in unit tests. I think we can make some available via [https://github.com/Open-MBEE/perseverance-modelica]
was:
A Functional Mockup Unit (FMU) is a software component used for exchanging and simulating dynamic system models. It is designed to enable simulations of system models regardless of the simulation tool, programming language, or hardware platform. This is made possible through a standard interface that allows FMUs to be exported and imported across different simulation environments.
The FMU media type ships with the .fmu file suffix.
I think the MIT-licensed [NTNU-IHB/FMI4j|https://github.com/NTNU-IHB/FMI4j] can be used as the underlying parser implementation.
> Create a parser for Functional Mockup Unit (FMU) media type with .fmu extension
> ---
>
> Key: TIKA-4169
> URL: https://issues.apache.org/jira/browse/TIKA-4169
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Minor
>
> A Functional Mockup Unit (FMU) is a software component used for exchanging and simulating dynamic system models.
> It is designed to enable simulations of system models regardless of the simulation tool, programming language, or hardware platform.
> This is made possible through a standard interface that allows FMUs to be exported and imported across different simulation environments.
> The FMU media type ships with the .fmu file suffix.
> I think the MIT-licensed [NTNU-IHB/FMI4j|https://github.com/NTNU-IHB/FMI4j] can be used as the underlying parser implementation.
> I will go on the hunt for some sample files we can use in unit tests. I think we can make some available via [https://github.com/Open-MBEE/perseverance-modelica]
[jira] [Updated] (TIKA-4169) Create a parser for Functional Mockup Unit (FMU) media type with .fmu extension
[ https://issues.apache.org/jira/browse/TIKA-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated TIKA-4169:
---
Description:
A Functional Mockup Unit (FMU) is a software component used for exchanging and simulating dynamic system models. It is designed to enable simulations of system models regardless of the simulation tool, programming language, or hardware platform. This is made possible through a standard interface that allows FMUs to be exported and imported across different simulation environments.
The FMU media type ships with the .fmu file suffix.
I think the MIT-licensed [NTNU-IHB/FMI4j|https://github.com/NTNU-IHB/FMI4j] can be used as the underlying parser implementation.
was:
A Functional Mockup Unit (FMU) is a software component used for exchanging and simulating dynamic system models. It is designed to enable simulations of system models regardless of the simulation tool, programming language, or hardware platform. This is made possible through a standard interface that allows FMUs to be exported and imported across different simulation environments.
The FMU media type ships with the .fmu file suffix.
I think the MIT-licensed [NTNU-IHB/FMI4j|[https://github.com/NTNU-IHB/FMI4j]] can be used as the underlying parser implementation.
[jira] [Created] (TIKA-4169) Create a parser for Functional Mockup Unit (FMU) media type with .fmu extension
Lewis John McGibbney created TIKA-4169:
---
Summary: Create a parser for Functional Mockup Unit (FMU) media type with .fmu extension
Key: TIKA-4169
URL: https://issues.apache.org/jira/browse/TIKA-4169
Project: Tika
Issue Type: New Feature
Components: parser
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney

A Functional Mockup Unit (FMU) is a software component used for exchanging and simulating dynamic system models. It is designed to enable simulations of system models regardless of the simulation tool, programming language, or hardware platform. This is made possible through a standard interface that allows FMUs to be exported and imported across different simulation environments.
The FMU media type ships with the .fmu file suffix.
I think the MIT-licensed [NTNU-IHB/FMI4j|[https://github.com/NTNU-IHB/FMI4j]] can be used as the underlying parser implementation.
[jira] [Updated] (TIKA-3989) Upgrade tika-helm Horizontal Pod Autoscaling from autoscaling/v2beta1 to autoscaling/v2
[ https://issues.apache.org/jira/browse/TIKA-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated TIKA-3989:
---
Description: The _*autoscaling/v2beta1*_ API is superseded by _*autoscaling/v2*_. This is documented thoroughly at [https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/]
(was: The _*autoscaling/v2beta1*_ API is superseded by autoscaling/v2. This is documented thoroughly at https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/)
> Upgrade tika-helm Horizontal Pod Autoscaling from autoscaling/v2beta1 to autoscaling/v2
> ---
>
> Key: TIKA-3989
> URL: https://issues.apache.org/jira/browse/TIKA-3989
> Project: Tika
> Issue Type: Task
> Components: helm
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Minor
>
> The _*autoscaling/v2beta1*_ API is superseded by _*autoscaling/v2*_.
> This is documented thoroughly at [https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/]
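For readers unfamiliar with what the move from autoscaling/v2beta1 to autoscaling/v2 involves, the main structural change for resource metrics is that the flat `targetAverageUtilization` / `targetAverageValue` fields become a nested `target` object. The sketch below illustrates that mapping on a plain dictionary; it is not the change made in the tika-helm chart, it handles only `Resource` metrics, and the sample HPA (names and values) is invented.

```python
import copy


def upgrade_resource_metric(metric):
    """Convert one autoscaling/v2beta1 Resource metric to the autoscaling/v2 shape."""
    if metric.get("type") != "Resource":
        raise ValueError("only Resource metrics are handled in this sketch")
    old = metric["resource"]
    if "targetAverageUtilization" in old:
        target = {"type": "Utilization",
                  "averageUtilization": old["targetAverageUtilization"]}
    elif "targetAverageValue" in old:
        target = {"type": "AverageValue",
                  "averageValue": old["targetAverageValue"]}
    else:
        raise ValueError("metric has no recognised target field")
    return {"type": "Resource", "resource": {"name": old["name"], "target": target}}


def upgrade_hpa(hpa):
    """Return a copy of a v2beta1 HPA manifest rewritten for autoscaling/v2."""
    upgraded = copy.deepcopy(hpa)  # leave the caller's manifest untouched
    upgraded["apiVersion"] = "autoscaling/v2"
    if "metrics" in upgraded["spec"]:
        upgraded["spec"]["metrics"] = [
            upgrade_resource_metric(m) for m in hpa["spec"]["metrics"]
        ]
    return upgraded


# Hypothetical v2beta1 manifest, already loaded into a dict.
old_hpa = {
    "apiVersion": "autoscaling/v2beta1",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "tika"},
    "spec": {
        "minReplicas": 1,
        "maxReplicas": 5,
        "metrics": [
            {"type": "Resource",
             "resource": {"name": "cpu", "targetAverageUtilization": 50}},
        ],
    },
}

new_hpa = upgrade_hpa(old_hpa)
print(new_hpa["spec"]["metrics"][0])
```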
[jira] [Created] (TIKA-3989) Upgrade tika-helm Horizontal Pod Autoscaling from autoscaling/v2beta1 to autoscaling/v2
Lewis John McGibbney created TIKA-3989:
---
Summary: Upgrade tika-helm Horizontal Pod Autoscaling from autoscaling/v2beta1 to autoscaling/v2
Key: TIKA-3989
URL: https://issues.apache.org/jira/browse/TIKA-3989
Project: Tika
Issue Type: Task
Components: helm
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney

The _*autoscaling/v2beta1*_ API is superseded by autoscaling/v2. This is documented thoroughly in [https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/|https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
[jira] [Updated] (TIKA-3989) Upgrade tika-helm Horizontal Pod Autoscaling from autoscaling/v2beta1 to autoscaling/v2
[ https://issues.apache.org/jira/browse/TIKA-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated TIKA-3989:
---
Description: The _*autoscaling/v2beta1*_ API is superseded by autoscaling/v2. This is documented thoroughly at https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
(was: The _*autoscaling/v2beta1*_ API is superseded by autoscaling/v2. This is documented thoroughly in [https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/|https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/)
[jira] [Closed] (TIKA-3988) Add Github Action to Lint and Test Charts
[ https://issues.apache.org/jira/browse/TIKA-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney closed TIKA-3988.
> Add Github Action to Lint and Test Charts
> ---
>
> Key: TIKA-3988
> URL: https://issues.apache.org/jira/browse/TIKA-3988
> Project: Tika
> Issue Type: Improvement
> Components: helm
> Affects Versions: 2.7.0
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Major
> Fix For: 2.7.0
>
> The [chart-testing-action|https://github.com/helm/chart-testing-action] will improve CI for tika-helm. PR coming up.
[jira] [Resolved] (TIKA-3988) Add Github Action to Lint and Test Charts
[ https://issues.apache.org/jira/browse/TIKA-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney resolved TIKA-3988.
Resolution: Fixed
[jira] [Commented] (TIKA-3988) Add Github Action to Lint and Test Charts
[ https://issues.apache.org/jira/browse/TIKA-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17702421#comment-17702421 ]
Lewis John McGibbney commented on TIKA-3988:
---
It looks like some permissions need to be configured before the GitHub Action can be run. I got in touch with INFRA about this. The GitHub Action output is as follows:
{quote}
Error: .github#L1
helm/chart-testing-action@v2.3.1 and helm/kind-action@v1.4.0 are not allowed to be used in apache/tika-helm. Actions in this workflow must be: within a repository owned by apache, created by GitHub, verified in the GitHub Marketplace, or matching the following: */*@[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]+, AdoptOpenJDK/install-jdk@*, JamesIves/github-pages-deploy-action@5dc1d5a192aeb5ab5b7d5a77b7d36aea4a7f5c92, TobKed/label-when-approved-action@*, actions-cool/issues-helper@*, actions-rs/*, al-cheb/configure-pagefile-action@*, amannn/action-semantic-pull-request@*, apache/*, burrunan/gradle-cache-action@*, bytedeco/javacpp-presets/.github/actions/*, chromaui/action@*, codecov/codecov-action@*, conda-incubator/setup-miniconda@*, container-tools/kind-action@*, container-tools/microshift-action@*, dawidd6/action-download-artifact@*, delaguardo/setup-graalvm@*, docker://jekyll/jekyll:*, docker://pandoc/core:2.9, eps1lon/actions-label-merge-conflict@*, gaurav-nelson/gith...
{quote}
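The first entry in the allow-list quoted above admits any `owner/repo@ref` whose ref consists of at least seven lowercase hexadecimal characters, which in practice means an action pinned to a commit SHA rather than a tag such as `v2.3.1`. The sketch below shows how a workflow's `uses:` references could be checked against that rule before pushing. Reading the pattern as a regular expression is an assumption, and the other allow-list entries (owner wildcards, GitHub-created and Marketplace-verified actions) are deliberately ignored.

```python
import re

# Assumed reading of the allow-list entry: owner/path@ref, where ref is
# seven or more lowercase hex characters (a full or abbreviated commit SHA).
SHA_PINNED = re.compile(r"[^/@\s]+/[^@\s]+@[a-f0-9]{7,}")


def is_sha_pinned(uses):
    """Return True if a workflow `uses:` reference is pinned to a commit SHA."""
    return SHA_PINNED.fullmatch(uses) is not None


# The three references below all appear in the error message above.
references = [
    "helm/chart-testing-action@v2.3.1",
    "helm/kind-action@v1.4.0",
    "JamesIves/github-pages-deploy-action@5dc1d5a192aeb5ab5b7d5a77b7d36aea4a7f5c92",
]
results = {ref: is_sha_pinned(ref) for ref in references}
for ref, pinned in results.items():
    print(f"{ref}: {'pinned to a SHA' if pinned else 'not SHA-pinned'}")
```

Note that a tag which happens to be seven or more hex characters would also pass this check; the pattern cannot tell such a tag from a commit SHA.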
[jira] [Created] (TIKA-3988) Add Github Action to Lint and Test Charts
Lewis John McGibbney created TIKA-3988:
---
Summary: Add Github Action to Lint and Test Charts
Key: TIKA-3988
URL: https://issues.apache.org/jira/browse/TIKA-3988
Project: Tika
Issue Type: Improvement
Components: helm
Affects Versions: 2.7.0
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Fix For: 2.7.0

The [chart-testing-action|https://github.com/helm/chart-testing-action] will improve CI for tika-helm. PR coming up.
[jira] [Commented] (TIKA-3985) Automate tika-helm Chart releases with helm/chart-releaser-action
[ https://issues.apache.org/jira/browse/TIKA-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17702402#comment-17702402 ]
Lewis John McGibbney commented on TIKA-3985:
---
https://github.com/marketplace/actions/jfrog-cli-for-github-actions
https://github.com/helm/chart-releaser-action
> Automate tika-helm Chart releases with helm/chart-releaser-action
> ---
>
> Key: TIKA-3985
> URL: https://issues.apache.org/jira/browse/TIKA-3985
> Project: Tika
> Issue Type: Improvement
> Components: helm
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Major
> Fix For: 2.7.0
>
> I've received several requests for [tika-helm|https://github.com/apache/tika-helm] releases to shadow [tika-docker|https://github.com/apache/tika-docker].
> I found a GitHub Action which will enable that. PR coming up.
[jira] [Created] (TIKA-3985) Automate tika-helm Chart releases with helm/chart-releaser-action
Lewis John McGibbney created TIKA-3985:
---
Summary: Automate tika-helm Chart releases with helm/chart-releaser-action
Key: TIKA-3985
URL: https://issues.apache.org/jira/browse/TIKA-3985
Project: Tika
Issue Type: Improvement
Components: helm
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Fix For: 2.7.0

I've received several requests for [tika-helm|https://github.com/apache/tika-helm] releases to shadow [tika-docker|https://github.com/apache/tika-docker].
I found a GitHub Action which will enable that. PR coming up.
[jira] [Updated] (TIKA-3452) java.nio.file.FileSystemException Read-only file system
[ https://issues.apache.org/jira/browse/TIKA-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated TIKA-3452:
---
Fix Version/s: 2.7.0
(was: 2.0.0-BETA)
> java.nio.file.FileSystemException Read-only file system
> ---
>
> Key: TIKA-3452
> URL: https://issues.apache.org/jira/browse/TIKA-3452
> Project: Tika
> Issue Type: Bug
> Components: docker, helm
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Major
> Fix For: 2.7.0
>
> The following ExecutionException is thrown when I attempt to run [tika-docker 2.0.0-BETA|https://hub.docker.com/layers/apache/tika/2.0.0-BETA-full/images/sha256-2d735f7bdf86e618a5390d92614a310697f9134d11a2b2e4c1c0cfcde1f68b1d?context=explore]
> {code:bash}
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
> java.util.concurrent.ExecutionException: java.nio.file.FileSystemException: /tmp/apache-tika-server-forked-tmp-8374629799942405236: Read-only file system
>     at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
>     at org.apache.tika.server.core.TikaServerCli.mainLoop(TikaServerCli.java:116)
>     at org.apache.tika.server.core.TikaServerCli.execute(TikaServerCli.java:88)
>     at org.apache.tika.server.core.TikaServerCli.main(TikaServerCli.java:66)
> Caused by: java.nio.file.FileSystemException: /tmp/apache-tika-server-forked-tmp-8374629799942405236: Read-only file system
>     at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
>     at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
>     at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
>     at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219)
>     at java.base/java.nio.file.Files.newByteChannel(Files.java:375)
>     at java.base/java.nio.file.Files.createFile(Files.java:652)
>     at java.base/java.nio.file.TempFileHelper.create(TempFileHelper.java:137)
>     at java.base/java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:160)
>     at java.base/java.nio.file.Files.createTempFile(Files.java:917)
>     at org.apache.tika.server.core.TikaServerWatchDog$ForkedProcess.<init>(TikaServerWatchDog.java:220)
>     at org.apache.tika.server.core.TikaServerWatchDog$ForkedProcess.<init>(TikaServerWatchDog.java:210)
>     at org.apache.tika.server.core.TikaServerWatchDog.call(TikaServerWatchDog.java:117)
>     at org.apache.tika.server.core.TikaServerWatchDog.call(TikaServerWatchDog.java:50)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
>     at java.base/java.lang.Thread.run(Thread.java:832)
> {code}
> There are differences/improvements in the way the [tika-server child process is spawned|https://cwiki.apache.org/confluence/display/TIKA/TikaServer#TikaServer-MakingTikaServerRobusttoOOMs,InfiniteLoopsandMemoryLeaks] in the 2.0.0-BETA docker image. I am investigating a fix.
[jira] [Resolved] (TIKA-3452) java.nio.file.FileSystemException Read-only file system
[ https://issues.apache.org/jira/browse/TIKA-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-3452. Resolution: Fixed > java.nio.file.FileSystemException Read-only file system > --- > > Key: TIKA-3452 > URL: https://issues.apache.org/jira/browse/TIKA-3452 > Project: Tika > Issue Type: Bug > Components: docker, helm >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.7.0 > > > The following ExecutionException is thrown when I attempt to run [tika-docker > 2.0.0-BETA|https://hub.docker.com/layers/apache/tika/2.0.0-BETA-full/images/sha256-2d735f7bdf86e618a5390d92614a310697f9134d11a2b2e4c1c0cfcde1f68b1d?context=explore] > {code:bash} > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". > SLF4J: Defaulting to no-operation (NOP) logger implementation > SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further > details. > java.util.concurrent.ExecutionException: java.nio.file.FileSystemException: > /tmp/apache-tika-server-forked-tmp-8374629799942405236: Read-only file system > at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191) > at > org.apache.tika.server.core.TikaServerCli.mainLoop(TikaServerCli.java:116) > at > org.apache.tika.server.core.TikaServerCli.execute(TikaServerCli.java:88) > at org.apache.tika.server.core.TikaServerCli.main(TikaServerCli.java:66) > Caused by: java.nio.file.FileSystemException: > /tmp/apache-tika-server-forked-tmp-8374629799942405236: Read-only file system > at > java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100) > at > java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) > at > java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116) > at > java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219) > at 
java.base/java.nio.file.Files.newByteChannel(Files.java:375) > at java.base/java.nio.file.Files.createFile(Files.java:652) > at > java.base/java.nio.file.TempFileHelper.create(TempFileHelper.java:137) > at > java.base/java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:160) > at java.base/java.nio.file.Files.createTempFile(Files.java:917) > at > org.apache.tika.server.core.TikaServerWatchDog$ForkedProcess.<init>(TikaServerWatchDog.java:220) > at > org.apache.tika.server.core.TikaServerWatchDog$ForkedProcess.<init>(TikaServerWatchDog.java:210) > at > org.apache.tika.server.core.TikaServerWatchDog.call(TikaServerWatchDog.java:117) > at > org.apache.tika.server.core.TikaServerWatchDog.call(TikaServerWatchDog.java:50) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) > at java.base/java.lang.Thread.run(Thread.java:832) > {code} > There are differences/improvements in the way the [tika-server child process > is > spawned|https://cwiki.apache.org/confluence/display/TIKA/TikaServer#TikaServer-MakingTikaServerRobusttoOOMs,InfiniteLoopsandMemoryLeaks] > in the 2.0.0-BETA docker image. I am investigating a fix. -- This message was sent by Atlassian Jira (v8.20.10#820010)
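Note on the TIKA-3452 trace above: Files.createTempFile fails because /tmp sits on a read-only file system. One common way to handle this on Kubernetes, sketched here as an assumption and not as the change actually made for this issue, is to keep the root file system read-only and mount a writable emptyDir volume at /tmp. The container name and volume name below are hypothetical; the image tag is the one referenced in the issue description.

{code:yaml}
# Hypothetical pod spec fragment; names are illustrative only.
spec:
  containers:
    - name: tika
      image: apache/tika:2.0.0-BETA-full
      securityContext:
        readOnlyRootFilesystem: true   # root file system stays read-only
      volumeMounts:
        - name: tmp-volume
          mountPath: /tmp              # writable scratch space for temp files
  volumes:
    - name: tmp-volume
      emptyDir: {}
{code}

With a mount like this, the forked tika-server process should be able to create its temp file under /tmp while the rest of the container remains read-only.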
[jira] [Updated] (TIKA-3452) java.nio.file.FileSystemException Read-only file system
[ https://issues.apache.org/jira/browse/TIKA-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3452: --- Summary: java.nio.file.FileSystemException Read-only file system (was: java.nio.file.FileSystemException Read-only file system in 2.0.0-BETA tika-docker) > java.nio.file.FileSystemException Read-only file system > --- > > Key: TIKA-3452 > URL: https://issues.apache.org/jira/browse/TIKA-3452 > Project: Tika > Issue Type: Bug > Components: docker, helm >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.0.0-BETA > > > The following ExecutionException is thrown when I attempt to run [tika-docker > 2.0.0-BETA|https://hub.docker.com/layers/apache/tika/2.0.0-BETA-full/images/sha256-2d735f7bdf86e618a5390d92614a310697f9134d11a2b2e4c1c0cfcde1f68b1d?context=explore] > {code:bash} > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". > SLF4J: Defaulting to no-operation (NOP) logger implementation > SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further > details. 
> java.util.concurrent.ExecutionException: java.nio.file.FileSystemException: > /tmp/apache-tika-server-forked-tmp-8374629799942405236: Read-only file system > at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191) > at > org.apache.tika.server.core.TikaServerCli.mainLoop(TikaServerCli.java:116) > at > org.apache.tika.server.core.TikaServerCli.execute(TikaServerCli.java:88) > at org.apache.tika.server.core.TikaServerCli.main(TikaServerCli.java:66) > Caused by: java.nio.file.FileSystemException: > /tmp/apache-tika-server-forked-tmp-8374629799942405236: Read-only file system > at > java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100) > at > java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) > at > java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116) > at > java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219) > at java.base/java.nio.file.Files.newByteChannel(Files.java:375) > at java.base/java.nio.file.Files.createFile(Files.java:652) > at > java.base/java.nio.file.TempFileHelper.create(TempFileHelper.java:137) > at > java.base/java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:160) > at java.base/java.nio.file.Files.createTempFile(Files.java:917) > at > org.apache.tika.server.core.TikaServerWatchDog$ForkedProcess.<init>(TikaServerWatchDog.java:220) > at > org.apache.tika.server.core.TikaServerWatchDog$ForkedProcess.<init>(TikaServerWatchDog.java:210) > at > org.apache.tika.server.core.TikaServerWatchDog.call(TikaServerWatchDog.java:117) > at > org.apache.tika.server.core.TikaServerWatchDog.call(TikaServerWatchDog.java:50) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) > at java.base/java.lang.Thread.run(Thread.java:832) > {code} > There are differences/improvements in the way the [tika-server child process > is > spawned|https://cwiki.apache.org/confluence/display/TIKA/TikaServer#TikaServer-MakingTikaServerRobusttoOOMs,InfiniteLoopsandMemoryLeaks] > in the 2.0.0-BETA docker image. I am investigating a fix. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (TIKA-2536) Move to later edu.ucar version to avoid EOL dependencies
[ https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17628032#comment-17628032 ] Lewis John McGibbney commented on TIKA-2536: They may appreciate a contribution which allows them to [accommodate dual publication|https://docs.unidata.ucar.edu/netcdf-java/current/userguide/building_from_source.html#publishing]. If you can look at the question above [~nick], then I'll go ahead and ask. I'm trying to anticipate them asking why we can't just reference their repository... > Move to later edu.ucar version to avoid EOL dependencies > > > Key: TIKA-2536 > URL: https://issues.apache.org/jira/browse/TIKA-2536 > Project: Tika > Issue Type: Improvement > Components: parser >Affects Versions: 1.16, 1.17 > Environment: All >Reporter: Richard Jones >Priority: Major > Attachments: screenshot-1.png, screenshot-2.png > > > The currently referenced 4.5.5 versions of edu.ucar:grib and edu.ucar:cdm > (released in Mar 2015), as well as being branch EOL themselves, depend on > many other project/branch/version EOL artifacts for which much later and > active versions are often available. The list is as follows: > - edu.ucar:grib depends on the project EOL bzip2. Much more recent versions > of edu.ucar:grib exist that no longer depend on bzip2 (note: Jbzip2 is hosted > on the Google Code site, which was shut down for active development in 2015. > The project was never migrated to another site, e.g. Github). > - edu.ucar:grib depends on the 2.0.4 EOL version of org.jdom:jdom2 > - edu.ucar:cdm depends on the 2.6.2 branch EOL version of > net.sf.ehcache:ehcache-core > - edu.ucar:cdm depends on the 2.2.0 EOL version of > org.quartz-scheduler:quartz for which active versions are available. In turn > org.quartz-scheduler:quartz depends on the 0.9.1.1 branch EOL version of > c3p0:c3p0. 
Later versions of quartz have moved to the active com.mchange:c3p0 > - edu.ucar:grib depends on the 2.5.0 branch EOL version of > com.google.protobuf:protobuf-java for which active versions are available. > Request moving to a much later version of edu.ucar, or alternative artifacts > to address all the above EOL issues (lack of active support for > vulnerabilities and bugs). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (TIKA-2536) Move to later edu.ucar version to avoid EOL dependencies
[ https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17628029#comment-17628029 ] Lewis John McGibbney edited comment on TIKA-2536 at 11/3/22 12:36 AM: -- As of version 5.0, netCDF-Java is released under the [BSD-3 licence|https://github.com/Unidata/netcdf-java/blob/master/LICENSE] * tika main branch relies on v4.5.5 * current netCDF-java release appears to be 5.5.2 [~nick] I _think_ I used to know the answer to this question but it escapes me now. What conditions/restrictions result in the following statement "...We can only depend on versions in maven central, we can't depend on versions hosted elsewhere"? Please remind me. Thanks was (Author: lewismc): As of version 5.0, netCDF-Java is released under the [BSD-3 licence|https://github.com/Unidata/netcdf-java/blob/master/LICENSE] * tika main branch relies on v4.5.5 * current netCDF-java release appears to be 5.5.2 [~nick] I think I used to know the answer to this question but what conditions/restrictions result in the following statement "...We can only depend on versions in maven central, we can't depend on versions hosted elsewhere"? Please remind me. Thanks > Move to later edu.ucar version to avoid EOL dependencies > > > Key: TIKA-2536 > URL: https://issues.apache.org/jira/browse/TIKA-2536 > Project: Tika > Issue Type: Improvement > Components: parser >Affects Versions: 1.16, 1.17 > Environment: All >Reporter: Richard Jones >Priority: Major > Attachments: screenshot-1.png, screenshot-2.png > > > The currently referenced 4.5.5 versions of edu.ucar:grib and edu.ucar:cdm > (released in Mar 2015), as well as being branch EOL themselves, depend on > many other project/branch/version EOL artifacts for which much later and > active versions are often available. The list is as follows: > - edu.ucar:grib depends on the project EOL bzip2. 
Much more recent versions > of edu.ucar:grib exist that no longer depend on bzip2 (note: Jbzip2 is hosted > on the Google Code site, which was shut down for active development in 2015. > The project was never migrated to another site, e.g. Github). > - edu.ucar:grib depends on the 2.0.4 EOL version of org.jdom:jdom2 > - edu.ucar:cdm depends on the 2.6.2 branch EOL version of > net.sf.ehcache:ehcache-core > - edu.ucar:cdm depends on the 2.2.0 EOL version of > org.quartz-scheduler:quartz for which active versions are available. In turn > org.quartz-scheduler:quartz depends on the 0.9.1.1 branch EOL version of > c3p0:c3p0. Later versions of quartz have moved to the active com.mchange:c3p0 > - edu.ucar:grib depends on the 2.5.0 branch EOL version of > com.google.protobuf:protobuf-java for which active versions are available. > Request moving to a much later version of edu.ucar, or alternative artifacts > to address all the above EOL issues (lack of active support for > vulnerabilities and bugs). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (TIKA-2536) Move to later edu.ucar version to avoid EOL dependencies
[ https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17628029#comment-17628029 ] Lewis John McGibbney commented on TIKA-2536: As of version 5.0, netCDF-Java is released under the [BSD-3 licence|https://github.com/Unidata/netcdf-java/blob/master/LICENSE] * tika main branch relies on v4.5.5 * current netCDF-java release appears to be 5.5.2 [~nick] I think I used to know the answer to this question but what conditions/restrictions result in the following statement "...We can only depend on versions in maven central, we can't depend on versions hosted elsewhere"? Please remind me. Thanks > Move to later edu.ucar version to avoid EOL dependencies > > > Key: TIKA-2536 > URL: https://issues.apache.org/jira/browse/TIKA-2536 > Project: Tika > Issue Type: Improvement > Components: parser >Affects Versions: 1.16, 1.17 > Environment: All >Reporter: Richard Jones >Priority: Major > Attachments: screenshot-1.png, screenshot-2.png > > > The currently referenced 4.5.5 versions of edu.ucar:grib and edu.ucar:cdm > (released in Mar 2015), as well as being branch EOL themselves, depend on > many other project/branch/version EOL artifacts for which much later and > active versions are often available. The list is as follows: > - edu.ucar:grib depends on the project EOL bzip2. Much more recent versions > of edu.ucar:grib exist that no longer depend on bzip2 (note: Jbzip2 is hosted > on the Google Code site, which was shut down for active development in 2015. > The project was never migrated to another site, e.g. Github). > - edu.ucar:grib depends on the 2.0.4 EOL version of org.jdom:jdom2 > - edu.ucar:cdm depends on the 2.6.2 branch EOL version of > net.sf.ehcache:ehcache-core > - edu.ucar:cdm depends on the 2.2.0 EOL version of > org.quartz-scheduler:quartz for which active versions are available. In turn > org.quartz-scheduler:quartz depends on the 0.9.1.1 branch EOL version of > c3p0:c3p0. 
Later versions of quartz have moved to the active com.mchange:c3p0 > - edu.ucar:grib depends on the 2.5.0 branch EOL version of > com.google.protobuf:protobuf-java for which active versions are available. > Request moving to a much later version of edu.ucar, or alternative artifacts > to address all the above EOL issues (lack of active support for > vulnerabilities and bugs). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (TIKA-2536) Move to later edu.ucar version to avoid EOL dependencies
[ https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17628027#comment-17628027 ] Lewis John McGibbney commented on TIKA-2536: As [~nick] mentioned referencing 3rd-party artifact repos is a no-go. [UCAR provides documentation on the repos and how to do exactly that|https://docs.unidata.ucar.edu/netcdf-java/current/userguide/using_netcdf_java_artifacts.html] but that doesn't help us as we would need to reference their repos... I will attempt to contact the UCAR team and see where I get... I'll write back here. > Move to later edu.ucar version to avoid EOL dependencies > > > Key: TIKA-2536 > URL: https://issues.apache.org/jira/browse/TIKA-2536 > Project: Tika > Issue Type: Improvement > Components: parser >Affects Versions: 1.16, 1.17 > Environment: All >Reporter: Richard Jones >Priority: Major > Attachments: screenshot-1.png, screenshot-2.png > > > The currently referenced 4.5.5 versions of edu.ucar:grib and edu.ucar:cdm > (released in Mar 2015), as well as being branch EOL themselves, depend on > many other project/branch/version EOL artifacts for which much later and > active versions are often available. The list is as follows: > - edu.ucar:grib depends on the project EOL bzip2. Much more recent versions > of edu.ucar:grib exist that no longer depend on bzip2 (note: Jbzip2 is hosted > on the Google Code site, which was shut down for active development in 2015. > The project was never migrated to another site, e.g. Github). > - edu.ucar:grib depends on the 2.0.4 EOL version of org.jdom:jdom2 > - edu.ucar:cdm depends on the 2.6.2 branch EOL version of > net.sf.ehcache:ehcache-core > - edu.ucar:cdm depends on the 2.2.0 EOL version of > org.quartz-scheduler:quartz for which active versions are available. In turn > org.quartz-scheduler:quartz depends on the 0.9.1.1 branch EOL version of > c3p0:c3p0. 
Later versions of quartz have moved to the active com.mchange:c3p0 > - edu.ucar:grib depends on the 2.5.0 branch EOL version of > com.google.protobuf:protobuf-java for which active versions are available. > Request moving to a much later version of edu.ucar, or alternative artifacts > to address all the above EOL issues (lack of active support for > vulnerabilities and bugs). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (TIKA-3826) Helm: use appVersion from Charts.yaml instead of images.tag
[ https://issues.apache.org/jira/browse/TIKA-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17617298#comment-17617298 ] Lewis John McGibbney commented on TIKA-3826: [~hairmare] good suggestion. Please file a PR and tag me. I will be happy to review. Thanks > Helm: use appVersion from Charts.yaml instead of images.tag > --- > > Key: TIKA-3826 > URL: https://issues.apache.org/jira/browse/TIKA-3826 > Project: Tika > Issue Type: Bug > Components: helm >Affects Versions: 2.2.1 >Reporter: Lucas Bickel >Priority: Major > > This is about the [tika Helm chart|https://github.com/apache/tika-helm]. > In `values.yaml` we currently have > [this|https://github.com/apache/tika-helm/blob/492386471616713bddbc5851912acdd78bd87609/values.yaml#L25-L26]: > {code:yaml} > # Overrides the image tag whose default is the chart appVersion. > tag: "1.26" > {code} > This leads to {{ .Values.image.tag | default .Chart.AppVersion }} [in > deployment.yaml|https://github.com/apache/tika-helm/blob/492386471616713bddbc5851912acdd78bd87609/templates/deployment.yaml#L52] > being dead code. > Currently the docs indicate that we should set {{image.tag}} during the > deployment; skipping this step results in deploying a very outdated tika 1.26. > My proposal for fixing this is to set the appVersion in {{Chart.yaml}} to the > latest 2.4.1-full version and set the image.tag to an empty value so it > defaults to the version from Chart.yaml. -- This message was sent by Atlassian Jira (v8.20.10#820010)
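A minimal sketch of the TIKA-3826 proposal above, assuming the chart keeps its existing image.tag key: leaving the tag empty in values.yaml lets the template expression {{ .Values.image.tag | default .Chart.AppVersion }} fall back to appVersion from Chart.yaml, because the Helm default function treats an empty string as unset. The chart name and chart version shown are hypothetical placeholders; 2.4.1-full is the version named in the proposal.

{code:yaml}
# Chart.yaml (fragment; name and version are placeholders)
apiVersion: v2
name: tika
version: 0.0.0
appVersion: "2.4.1-full"
---
# values.yaml (fragment)
image:
  # Overrides the image tag whose default is the chart appVersion.
  tag: ""
{code}

Setting image.tag at install time would still override the default, so existing deployments that pin a tag should keep working.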
[jira] [Resolved] (TIKA-3648) Fail build if ossindex-maven-plugin violation is detected
[ https://issues.apache.org/jira/browse/TIKA-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-3648. Resolution: Fixed With dependabot now activated our goal should be to keep things up-to-date. > Fail build if ossindex-maven-plugin violation is detected > - > > Key: TIKA-3648 > URL: https://issues.apache.org/jira/browse/TIKA-3648 > Project: Tika > Issue Type: Improvement > Components: build, security >Affects Versions: 2.2.1 >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Critical > > The ossindex-maven-plugin can really assist us in detecting and preventing > security vulnerabilities and also mitigating associated risk and exposure. > I propose to fail the build if ossindex-maven-plugin violation is detected > https://github.com/apache/tika/blob/main/tika-parent/pom.xml#L639 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (TIKA-3648) Fail build if ossindex-maven-plugin violation is detected
[ https://issues.apache.org/jira/browse/TIKA-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3648: --- Fix Version/s: 2.3.1 > Fail build if ossindex-maven-plugin violation is detected > - > > Key: TIKA-3648 > URL: https://issues.apache.org/jira/browse/TIKA-3648 > Project: Tika > Issue Type: Improvement > Components: build, security >Affects Versions: 2.2.1 >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Critical > Fix For: 2.3.1 > > > The ossindex-maven-plugin can really assist us in detecting and preventing > security vulnerabilities and also mitigating associated risk and exposure. > I propose to fail the build if ossindex-maven-plugin violation is detected > https://github.com/apache/tika/blob/main/tika-parent/pom.xml#L639 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (TIKA-3566) Upgrade tika-helm to 2.2.1
[ https://issues.apache.org/jira/browse/TIKA-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-3566. Fix Version/s: 2.2.1 (was: 2.1.0) Resolution: Fixed Done. I made the announcements to the user@ and dev@ mailing lists. I'm planning on writing a Jenkinsfile which will automate all of this in the future. > Upgrade tika-helm to 2.2.1 > -- > > Key: TIKA-3566 > URL: https://issues.apache.org/jira/browse/TIKA-3566 > Project: Tika > Issue Type: Improvement > Components: helm >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.2.1 > > > Simple upgrade to tika-docker 2.2.1. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (TIKA-3566) Upgrade tika-helm to 2.2.1
[ https://issues.apache.org/jira/browse/TIKA-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3566: --- Description: Simple upgrade to tika-docker 2.2.1. (was: Simple upgrade to [tika-docker 2.1.0|https://hub.docker.com/layers/apache/tika/2.1.0/images/sha256-5bb52afa9726cf2ca022441cc75ef357de9f8deb41a88a9b2964780e934d11e7?context=explore].) > Upgrade tika-helm to 2.2.1 > -- > > Key: TIKA-3566 > URL: https://issues.apache.org/jira/browse/TIKA-3566 > Project: Tika > Issue Type: Improvement > Components: helm >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.1.0 > > > Simple upgrade to tika-docker 2.2.1. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (TIKA-3566) Upgrade tika-helm to 2.2.1
[ https://issues.apache.org/jira/browse/TIKA-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3566: --- Summary: Upgrade tika-helm to 2.2.1 (was: Upgrade tika-helm to 2.1.0) > Upgrade tika-helm to 2.2.1 > -- > > Key: TIKA-3566 > URL: https://issues.apache.org/jira/browse/TIKA-3566 > Project: Tika > Issue Type: Improvement > Components: helm >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.1.0 > > > Simple upgrade to [tika-docker > 2.1.0|https://hub.docker.com/layers/apache/tika/2.1.0/images/sha256-5bb52afa9726cf2ca022441cc75ef357de9f8deb41a88a9b2964780e934d11e7?context=explore]. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (TIKA-3649) Perform findbugs static analysis on the project and address the issues
[ https://issues.apache.org/jira/browse/TIKA-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-3649. Resolution: Fixed Thanks [~dkryukov] > Perform findbugs static analysis on the project and address the issues > -- > > Key: TIKA-3649 > URL: https://issues.apache.org/jira/browse/TIKA-3649 > Project: Tika > Issue Type: Improvement >Reporter: Dmitrii Kriukov >Assignee: Dmitrii Kriukov >Priority: Major > Fix For: 2.2.2 > > > I'm going to create one PR per module of the project. > The first one is https://github.com/apache/tika/pull/478 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (TIKA-3649) Perform findbugs static analysis on the project and address the issues
[ https://issues.apache.org/jira/browse/TIKA-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned TIKA-3649: -- Assignee: Dmitrii Kriukov > Perform findbugs static analysis on the project and address the issues > -- > > Key: TIKA-3649 > URL: https://issues.apache.org/jira/browse/TIKA-3649 > Project: Tika > Issue Type: Improvement >Reporter: Dmitrii Kriukov >Assignee: Dmitrii Kriukov >Priority: Major > Fix For: 2.2.2 > > > I'm going to create one PR per module of the project. > The first one is https://github.com/apache/tika/pull/478 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (TIKA-3649) Perform findbugs static analysis on the project and address the issues
[ https://issues.apache.org/jira/browse/TIKA-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3649: --- Fix Version/s: 2.2.2 > Perform findbugs static analysis on the project and address the issues > -- > > Key: TIKA-3649 > URL: https://issues.apache.org/jira/browse/TIKA-3649 > Project: Tika > Issue Type: Improvement >Reporter: Dmitrii Kriukov >Priority: Major > Fix For: 2.2.2 > > > I'm going to create one PR per module of the project. > The first one is https://github.com/apache/tika/pull/478 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (TIKA-3649) Perform findbugs static analysis on the project and address the issues
[ https://issues.apache.org/jira/browse/TIKA-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3649: --- Summary: Perform findbugs static analysis on the project and address the issues (was: Perform static analysis on the project and address the issues) > Perform findbugs static analysis on the project and address the issues > -- > > Key: TIKA-3649 > URL: https://issues.apache.org/jira/browse/TIKA-3649 > Project: Tika > Issue Type: Improvement >Reporter: Dmitrii Kriukov >Priority: Major > > I'm going to create one PR per module of the project. > The first one is https://github.com/apache/tika/pull/478 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (TIKA-3651) Activate Dependabot on Tika main branch
Lewis John McGibbney created TIKA-3651: -- Summary: Activate Dependabot on Tika main branch Key: TIKA-3651 URL: https://issues.apache.org/jira/browse/TIKA-3651 Project: Tika Issue Type: Improvement Components: depedency, build, security Affects Versions: 2.2.1 Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney Fix For: 2.2.2 [Dependabot|https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/about-dependabot-version-updates] lets a project keep the packages it uses updated to the latest versions. It is a piece of cake to configure. PR coming up. -- This message was sent by Atlassian Jira (v8.20.1#820001)
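To illustrate how little configuration TIKA-3651 implies, here is a minimal sketch of a Dependabot version-updates file for a Maven project. It is an assumption about what such a file could look like, not the file from the forthcoming PR; the daily interval and the ignored dependency name are hypothetical.

{code:yaml}
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "maven"   # scan pom.xml files
    directory: "/"               # location of the root pom.xml
    schedule:
      interval: "daily"          # hypothetical cadence
    ignore:
      # Hypothetical entry: suppress update PRs for one dependency.
      - dependency-name: "org.example:some-artifact"
{code}

The ignore list is one way to silence update PRs for dependencies the project has decided not to upgrade.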
[jira] [Commented] (TIKA-3648) Fail build if ossindex-maven-plugin violation is detected
[ https://issues.apache.org/jira/browse/TIKA-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17478180#comment-17478180 ] Lewis John McGibbney commented on TIKA-3648: {quote}dependabot only sends one ping/PR per dependency and we can ignore it for those dependencies, right?{quote} Correct. I will send that in a different PR right now. > Fail build if ossindex-maven-plugin violation is detected > - > > Key: TIKA-3648 > URL: https://issues.apache.org/jira/browse/TIKA-3648 > Project: Tika > Issue Type: Improvement > Components: build, security >Affects Versions: 2.2.1 >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Critical > Fix For: 2.2.2 > > > The ossindex-maven-plugin can really assist us in detecting and preventing > security vulnerabilities and also mitigating associated risk and exposure. > I propose to fail the build if ossindex-maven-plugin violation is detected > https://github.com/apache/tika/blob/main/tika-parent/pom.xml#L639 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (TIKA-3648) Fail build if ossindex-maven-plugin violation is detected
[ https://issues.apache.org/jira/browse/TIKA-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17478172#comment-17478172 ] Lewis John McGibbney commented on TIKA-3648: Your points are well made and I hear you. What do you think about turning on Dependabot? This way we keep up with issues as we go along. I personally feel Tika has more to gain by turning on Dependabot than by not. The dependency sprawl is pretty significant across the codebase so anything we can do to keep on top of things would be great. > Fail build if ossindex-maven-plugin violation is detected > - > > Key: TIKA-3648 > URL: https://issues.apache.org/jira/browse/TIKA-3648 > Project: Tika > Issue Type: Improvement > Components: build, security >Affects Versions: 2.2.1 >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Critical > Fix For: 2.2.2 > > > The ossindex-maven-plugin can really assist us in detecting and preventing > security vulnerabilities and also mitigating associated risk and exposure. > I propose to fail the build if ossindex-maven-plugin violation is detected > https://github.com/apache/tika/blob/main/tika-parent/pom.xml#L639 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (TIKA-3648) Fail build if ossindex-maven-plugin violation is detected
Lewis John McGibbney created TIKA-3648: -- Summary: Fail build if ossindex-maven-plugin violation is detected Key: TIKA-3648 URL: https://issues.apache.org/jira/browse/TIKA-3648 Project: Tika Issue Type: Improvement Components: security, build Affects Versions: 2.2.1 Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney Fix For: 2.2.2 The ossindex-maven-plugin can really assist us in detecting and preventing security vulnerabilities and also mitigating associated risk and exposure. I propose to fail the build if ossindex-maven-plugin violation is detected https://github.com/apache/tika/blob/main/tika-parent/pom.xml#L639 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (TIKA-3539) jdom 2.0.6 dependency in tika-parser-news-module has unfixed CVE
[ https://issues.apache.org/jira/browse/TIKA-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-3539. Resolution: Fixed > jdom 2.0.6 dependency in tika-parser-news-module has unfixed CVE > > > Key: TIKA-3539 > URL: https://issues.apache.org/jira/browse/TIKA-3539 > Project: Tika > Issue Type: Task > Components: parser >Affects Versions: 2.1.0 >Reporter: Julian Reschke >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.2.1 > > > Might be good to avoid the use of JDOM altogether. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (TIKA-3539) jdom 2.0.6 dependency in tika-parser-news-module has unfixed CVE
[ https://issues.apache.org/jira/browse/TIKA-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned TIKA-3539: -- Assignee: Lewis John McGibbney > jdom 2.0.6 dependency in tika-parser-news-module has unfixed CVE > > > Key: TIKA-3539 > URL: https://issues.apache.org/jira/browse/TIKA-3539 > Project: Tika > Issue Type: Task > Components: parser >Affects Versions: 2.1.0 >Reporter: Julian Reschke >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.2.1 > > > Might be good to avoid the use of JDOM altogether. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (TIKA-3539) jdom 2.0.6 dependency in tika-parser-news-module has unfixed CVE
[ https://issues.apache.org/jira/browse/TIKA-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466682#comment-17466682 ] Lewis John McGibbney commented on TIKA-3539: This issue was fixed for 2.X in https://github.com/apache/tika/pull/469 > jdom 2.0.6 dependency in tika-parser-news-module has unfixed CVE > > > Key: TIKA-3539 > URL: https://issues.apache.org/jira/browse/TIKA-3539 > Project: Tika > Issue Type: Task > Components: parser >Affects Versions: 2.1.0 >Reporter: Julian Reschke >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.2.1 > > > Might be good to avoid the use of JDOM altogether. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (TIKA-3539) jdom 2.0.6 dependency in tika-parser-news-module has unfixed CVE
[ https://issues.apache.org/jira/browse/TIKA-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3539: --- Fix Version/s: 2.2.1 > jdom 2.0.6 dependency in tika-parser-news-module has unfixed CVE > > > Key: TIKA-3539 > URL: https://issues.apache.org/jira/browse/TIKA-3539 > Project: Tika > Issue Type: Task > Components: parser >Affects Versions: 2.1.0 >Reporter: Julian Reschke >Priority: Major > Fix For: 2.2.1 > > > Might be good to avoid the use of JDOM altogether. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (TIKA-3635) Upgrade to rome 1.18.0
[ https://issues.apache.org/jira/browse/TIKA-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-3635. Resolution: Fixed > Upgrade to rome 1.18.0 > -- > > Key: TIKA-3635 > URL: https://issues.apache.org/jira/browse/TIKA-3635 > Project: Tika > Issue Type: Improvement > Components: parser >Affects Versions: 2.2.0 >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.2.1 > > > It looks like my activity over on the rome project had a positive impact. > Although the project is basically dormant, the primary author took on the > task of pulling in my work and improving on it which is excellent. He's made > two releases in the last week or so. > (Hopefully) a trivial dependency upgrade PR coming up... -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (TIKA-3488) Security issue XXE in TIKA due to JDOM
[ https://issues.apache.org/jira/browse/TIKA-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-3488. Resolution: Fixed > Security issue XXE in TIKA due to JDOM > -- > > Key: TIKA-3488 > URL: https://issues.apache.org/jira/browse/TIKA-3488 > Project: Tika > Issue Type: Task > Components: tika-server >Affects Versions: 1.25 >Reporter: Arvind Jagtap >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.2.1 > > > Apache TIKA 1.35 is vulnerable due to its dependency on JDOM 2.0.6. Black Duck > Hub has reported this vulnerability as CVE-2021-33813, with more detail on the > following page. > [https://nvd.nist.gov/vuln/detail/CVE-2021-33813#range-6782705] > Although the following issue has been filed, it is not yet fixed and there is no > timeline given. > https://github.com/hunterhacker/jdom/issues/189 > There are some workarounds discussed on this issue. Can this be fixed in TIKA > in the meantime? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (TIKA-3488) Security issue XXE in TIKA due to JDOM
[ https://issues.apache.org/jira/browse/TIKA-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3488: --- Fix Version/s: 2.2.1 > Security issue XXE in TIKA due to JDOM > -- > > Key: TIKA-3488 > URL: https://issues.apache.org/jira/browse/TIKA-3488 > Project: Tika > Issue Type: Task > Components: tika-server >Affects Versions: 1.25 >Reporter: Arvind Jagtap >Priority: Major > Fix For: 2.2.1 > > > Apache TIKA 1.35 is vulnerable due to its dependency on JDOM 2.0.6. Black Duck > Hub has reported this vulnerability as CVE-2021-33813, with more detail on the > following page. > [https://nvd.nist.gov/vuln/detail/CVE-2021-33813#range-6782705] > Although the following issue has been filed, it is not yet fixed and there is no > timeline given. > https://github.com/hunterhacker/jdom/issues/189 > There are some workarounds discussed on this issue. Can this be fixed in TIKA > in the meantime? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (TIKA-3488) Security issue XXE in TIKA due to JDOM
[ https://issues.apache.org/jira/browse/TIKA-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned TIKA-3488: -- Assignee: Lewis John McGibbney > Security issue XXE in TIKA due to JDOM > -- > > Key: TIKA-3488 > URL: https://issues.apache.org/jira/browse/TIKA-3488 > Project: Tika > Issue Type: Task > Components: tika-server >Affects Versions: 1.25 >Reporter: Arvind Jagtap >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.2.1 > > > Apache TIKA 1.35 is vulnerable due to its dependency on JDOM 2.0.6. Black Duck > Hub has reported this vulnerability as CVE-2021-33813, with more detail on the > following page. > [https://nvd.nist.gov/vuln/detail/CVE-2021-33813#range-6782705] > Although the following issue has been filed, it is not yet fixed and there is no > timeline given. > https://github.com/hunterhacker/jdom/issues/189 > There are some workarounds discussed on this issue. Can this be fixed in TIKA > in the meantime? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (TIKA-3488) Security issue XXE in TIKA due to JDOM
[ https://issues.apache.org/jira/browse/TIKA-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466681#comment-17466681 ] Lewis John McGibbney commented on TIKA-3488: This was fixed in https://github.com/apache/tika/pull/469 > Security issue XXE in TIKA due to JDOM > -- > > Key: TIKA-3488 > URL: https://issues.apache.org/jira/browse/TIKA-3488 > Project: Tika > Issue Type: Task > Components: tika-server >Affects Versions: 1.25 >Reporter: Arvind Jagtap >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.2.1 > > > Apache TIKA 1.35 is vulnerable due to its dependency on JDOM 2.0.6. Black Duck > Hub has reported this vulnerability as CVE-2021-33813, with more detail on the > following page. > [https://nvd.nist.gov/vuln/detail/CVE-2021-33813#range-6782705] > Although the following issue has been filed, it is not yet fixed and there is no > timeline given. > https://github.com/hunterhacker/jdom/issues/189 > There are some workarounds discussed on this issue. Can this be fixed in TIKA > in the meantime? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (TIKA-3635) Upgrade to rome 1.18.0
[ https://issues.apache.org/jira/browse/TIKA-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3635: --- Fix Version/s: 2.2.1 > Upgrade to rome 1.18.0 > -- > > Key: TIKA-3635 > URL: https://issues.apache.org/jira/browse/TIKA-3635 > Project: Tika > Issue Type: Improvement > Components: parser >Affects Versions: 2.2.0 >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.2.1 > > > It looks like my activity over on the rome project had a positive impact. > Although the project is basically dormant, the primary author took on the > task of pulling in my work and improving on it which is excellent. He's made > two releases in the last week or so. > (Hopefully) a trivial dependency upgrade PR coming up... -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (TIKA-3635) Upgrade to rome 1.18.0
Lewis John McGibbney created TIKA-3635: -- Summary: Upgrade to rome 1.18.0 Key: TIKA-3635 URL: https://issues.apache.org/jira/browse/TIKA-3635 Project: Tika Issue Type: Improvement Components: parser Affects Versions: 2.2.0 Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney It looks like my activity over on the rome project had a positive impact. Although the project is basically dormant, the primary author took on the task of pulling in my work and improving on it which is excellent. He's made two releases in the last week or so. (Hopefully) a trivial dependency upgrade PR coming up... -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (TIKA-3620) Language detection documentation needs attention
[ https://issues.apache.org/jira/browse/TIKA-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3620: --- Fix Version/s: 2.2.0 > Language detection documentation needs attention > > > Key: TIKA-3620 > URL: https://issues.apache.org/jira/browse/TIKA-3620 > Project: Tika > Issue Type: Improvement > Components: languageidentifier >Affects Versions: 2.1.0 >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.2.0 > > > This language identifier/detection suffers from a few problems > # Clarity is needed on identifier/identification Vs detector/detection. Which > is it? The source code says identifier whereas the [documentation is nested > under > detection|https://tika.apache.org/2.1.0/detection.html#Language_Detection]. > # The > [org.apache.tika.language.LanguageIdentifier|https://tika.apache.org/2.1.0/api/org/apache/tika/language/LanguageIdentifier.html] > returns 404. What is this meant to resolve to? > # Generally speaking the [documentation is literally > non-existent|https://tika.apache.org/2.1.0/detection.html#Language_Detection]. > I checked the wiki and failed to find anything. I did find some [minor > documentation|https://tika.apache.org/2.1.0/examples.html#Language_Identification] > but this is also severely lacking. Also note the broken hyperlink. > Some suggestions for improvement > # Fix the broken hyperlinks. 
> # Hyperlink to the existing examples, namely > [LanguageDetectorExample.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/LanguageDetectorExample.java], > > [LanguageDetectingParser.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/LanguageDetectingParser.java] > and > [Language.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/Language.java] > # Hyperlink to the [LanguageDetector > Javadoc|https://tika.apache.org/2.1.0/api/index.html?org/apache/tika/language/detect/LanguageDetector.html] > and at least mention some of the other implementations. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (TIKA-3620) Language detection documentation needs attention
[ https://issues.apache.org/jira/browse/TIKA-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-3620. Resolution: Fixed https://tika.apache.org/2.2.0/detection.html#Language_Detection > Language detection documentation needs attention > > > Key: TIKA-3620 > URL: https://issues.apache.org/jira/browse/TIKA-3620 > Project: Tika > Issue Type: Improvement > Components: languageidentifier >Affects Versions: 2.1.0 >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.2.0 > > > This language identifier/detection suffers from a few problems > # Clarity is needed on identifier/identification Vs detector/detection. Which > is it? The source code says identifier whereas the [documentation is nested > under > detection|https://tika.apache.org/2.1.0/detection.html#Language_Detection]. > # The > [org.apache.tika.language.LanguageIdentifier|https://tika.apache.org/2.1.0/api/org/apache/tika/language/LanguageIdentifier.html] > returns 404. What is this meant to resolve to? > # Generally speaking the [documentation is literally > non-existent|https://tika.apache.org/2.1.0/detection.html#Language_Detection]. > I checked the wiki and failed to find anything. I did find some [minor > documentation|https://tika.apache.org/2.1.0/examples.html#Language_Identification] > but this is also severely lacking. Also note the broken hyperlink. > Some suggestions for improvement > # Fix the broken hyperlinks. 
> # Hyperlink to the existing examples, namely > [LanguageDetectorExample.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/LanguageDetectorExample.java], > > [LanguageDetectingParser.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/LanguageDetectingParser.java] > and > [Language.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/Language.java] > # Hyperlink to the [LanguageDetector > Javadoc|https://tika.apache.org/2.1.0/api/index.html?org/apache/tika/language/detect/LanguageDetector.html] > and at least mention some of the other implementations. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (TIKA-3620) Language detection documentation needs attention
[ https://issues.apache.org/jira/browse/TIKA-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461618#comment-17461618 ] Lewis John McGibbney commented on TIKA-3620: % svn ci -m "TIKA-3620 Language detection documentation needs attention" Sending publish/2.2.0/detection.html Sending src/site/apt/2.2.0/detection.apt Transmitting file data ..done Committing transaction... Committed revision 1896103. > Language detection documentation needs attention > > > Key: TIKA-3620 > URL: https://issues.apache.org/jira/browse/TIKA-3620 > Project: Tika > Issue Type: Improvement > Components: languageidentifier >Affects Versions: 2.1.0 >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > > This language identifier/detection suffers from a few problems > # Clarity is needed on identifier/identification Vs detector/detection. Which > is it? The source code says identifier whereas the [documentation is nested > under > detection|https://tika.apache.org/2.1.0/detection.html#Language_Detection]. > # The > [org.apache.tika.language.LanguageIdentifier|https://tika.apache.org/2.1.0/api/org/apache/tika/language/LanguageIdentifier.html] > returns 404. What is this meant to resolve to? > # Generally speaking the [documentation is literally > non-existent|https://tika.apache.org/2.1.0/detection.html#Language_Detection]. > I checked the wiki and failed to find anything. I did find some [minor > documentation|https://tika.apache.org/2.1.0/examples.html#Language_Identification] > but this is also severely lacking. Also note the broken hyperlink. > Some suggestions for improvement > # Fix the broken hyperlinks. 
> # Hyperlink to the existing examples, namely > [LanguageDetectorExample.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/LanguageDetectorExample.java], > > [LanguageDetectingParser.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/LanguageDetectingParser.java] > and > [Language.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/Language.java] > # Hyperlink to the [LanguageDetector > Javadoc|https://tika.apache.org/2.1.0/api/index.html?org/apache/tika/language/detect/LanguageDetector.html] > and at least mention some of the other implementations. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (TIKA-3616) Upgrade log4j2
[ https://issues.apache.org/jira/browse/TIKA-3616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460813#comment-17460813 ] Lewis John McGibbney commented on TIKA-3616: Hi Tim, I’m not familiar with the downstream release process for the various Docker artifacts. Publishing the Helm chart is a walk in the park but entirely conditional upon the Docker artifact. Here’s the release process https://wiki.apache.org/confluence/display/TIKA/Release+Process+for+tika-helm Can I make the helm release? Absolutely. -- http://home.apache.org/~lewismc/ http://people.apache.org/keys/committer/lewismc > Upgrade log4j2 > -- > > Key: TIKA-3616 > URL: https://issues.apache.org/jira/browse/TIKA-3616 > Project: Tika > Issue Type: Task >Reporter: Tim Allison >Priority: Major > Fix For: 2.1.1 > > > RCE...might be difficult to trigger in Tika, but why ask for a PoC... > This only affects 2.x. We were still using the old log4j in 1.x -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (TIKA-3620) Language detection documentation needs attention
Lewis John McGibbney created TIKA-3620: -- Summary: Language detection documentation needs attention Key: TIKA-3620 URL: https://issues.apache.org/jira/browse/TIKA-3620 Project: Tika Issue Type: Improvement Components: languageidentifier Affects Versions: 2.1.0 Reporter: Lewis John McGibbney This language identifier/detection suffers from a few problems # Clarity is needed on identifier/identification Vs detector/detection. Which is it? The source code says identifier whereas the [documentation is nested under detection|https://tika.apache.org/2.1.0/detection.html#Language_Detection]. # The [org.apache.tika.language.LanguageIdentifier|https://tika.apache.org/2.1.0/api/org/apache/tika/language/LanguageIdentifier.html] returns 404. What is this meant to resolve to? # Generally speaking the [documentation is literally non-existent|https://tika.apache.org/2.1.0/detection.html#Language_Detection]. I checked the wiki and failed to find anything. I did find some [minor documentation|https://tika.apache.org/2.1.0/examples.html#Language_Identification] but this is also severely lacking. Also note the broken hyperlink. Some suggestions for improvement # Fix the broken hyperlinks. # Hyperlink to the existing examples, namely [LanguageDetectorExample.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/LanguageDetectorExample.java], [LanguageDetectingParser.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/LanguageDetectingParser.java] and [Language.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/Language.java] # Hyperlink to the [LanguageDetector Javadoc|https://tika.apache.org/2.1.0/api/index.html?org/apache/tika/language/detect/LanguageDetector.html] and at least mention some of the other implementations. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (TIKA-3620) Language detection documentation needs attention
[ https://issues.apache.org/jira/browse/TIKA-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned TIKA-3620: -- Assignee: Lewis John McGibbney > Language detection documentation needs attention > > > Key: TIKA-3620 > URL: https://issues.apache.org/jira/browse/TIKA-3620 > Project: Tika > Issue Type: Improvement > Components: languageidentifier >Affects Versions: 2.1.0 >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > > This language identifier/detection suffers from a few problems > # Clarity is needed on identifier/identification Vs detector/detection. Which > is it? The source code says identifier whereas the [documentation is nested > under > detection|https://tika.apache.org/2.1.0/detection.html#Language_Detection]. > # The > [org.apache.tika.language.LanguageIdentifier|https://tika.apache.org/2.1.0/api/org/apache/tika/language/LanguageIdentifier.html] > returns 404. What is this meant to resolve to? > # Generally speaking the [documentation is literally > non-existent|https://tika.apache.org/2.1.0/detection.html#Language_Detection]. > I checked the wiki and failed to find anything. I did find some [minor > documentation|https://tika.apache.org/2.1.0/examples.html#Language_Identification] > but this is also severely lacking. Also note the broken hyperlink. > Some suggestions for improvement > # Fix the broken hyperlinks. 
> # Hyperlink to the existing examples, namely > [LanguageDetectorExample.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/LanguageDetectorExample.java], > > [LanguageDetectingParser.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/LanguageDetectingParser.java] > and > [Language.java|https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/Language.java] > # Hyperlink to the [LanguageDetector > Javadoc|https://tika.apache.org/2.1.0/api/index.html?org/apache/tika/language/detect/LanguageDetector.html] > and at least mention some of the other implementations. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (TIKA-3241) Clarify parser module structure in 2.0.0
[ https://issues.apache.org/jira/browse/TIKA-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460066#comment-17460066 ] Lewis John McGibbney commented on TIKA-3241: Hi [~tallison] can this ticket be closed? > Clarify parser module structure in 2.0.0 > > > Key: TIKA-3241 > URL: https://issues.apache.org/jira/browse/TIKA-3241 > Project: Tika > Issue Type: Task >Affects Versions: 2.0.0 >Reporter: Tim Allison >Assignee: Tim Allison >Priority: Major > > In 2.0.0, we currently have: > tika-parser-modules/ > tika-parsers/ > tika-parsers-advanced/ > tika-parsers-extended > where {{tika-parsers}} is a module that includes all parsers in > {{tika-parser-modules}}. > I think we can make the structure a bit clearer by: > tika-parsers/ >tika-parsers-classic/ (renamed from tika-parser-modules) >tika-parsers-advanced/ >tika-parsers-extended > As before in 2.0.0, tika-app and tika-server would pull from > tika-parsers-classic. If users wanted the heavier parsers in > tika-parsers-advanced/tika-parsers-extended, they could pull those in on > their own. > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (TIKA-3229) mvn clean install failure - tika-1.24 on windows
[ https://issues.apache.org/jira/browse/TIKA-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460065#comment-17460065 ] Lewis John McGibbney commented on TIKA-3229: [~Simmo] are you able to reproduce this? Otherwise I think we should close this ticket. > mvn clean install failure - tika-1.24 on windows > - > > Key: TIKA-3229 > URL: https://issues.apache.org/jira/browse/TIKA-3229 > Project: Tika > Issue Type: Bug >Affects Versions: 1.24 > Environment: windows 10 >Reporter: Simon Opper >Priority: Major > > Getting a build failure on mvn clean install > > [ERROR] Failed to execute goal > org.apache.felix:maven-bundle-plugin:4.1.0:bundle (default-bundle) on project > tika-core: Execution default-bundle of goal > org.apache.felix:maven-bundle-plugin:4.1.0:bundle failed.: > ConcurrentModificationException -> [Help 1] > > the complete verbose error text is below > > --- maven-surefire-plugin:3.0.0-M4:test (default-test) @ tika-core --- > [INFO] > > [INFO] Reactor Summary for Apache Tika 2.0.0-SNAPSHOT: > [INFO] > [INFO] Apache Tika parent . SUCCESS [ 1.813 > s] > [INFO] Apache Tika core ... FAILURE [ 7.528 > s] > [INFO] Apache Tika parser modules . SKIPPED > [INFO] tika-parser-jdbc-commons ... SKIPPED > [INFO] tika-parser-digest-commons . SKIPPED > [INFO] tika-parser-mail-commons ... SKIPPED > [INFO] tika-parser-xmp-commons SKIPPED > [INFO] tika-parser-zip-commons SKIPPED > [INFO] tika-parser-image-module ... SKIPPED > [INFO] tika-parser-ocr-module . SKIPPED > [INFO] tika-parser-audiovideo-module .. SKIPPED > [INFO] tika-parser-text-module SKIPPED > [INFO] tika-parser-code-module SKIPPED > [INFO] tika-parser-html-module SKIPPED > [INFO] tika-parser-font-module SKIPPED > [INFO] tika-parser-xml-module . SKIPPED > [INFO] tika-parser-microsoft-module ... SKIPPED > [INFO] tika-parser-pkg-module . SKIPPED > [INFO] tika-parser-pdf-module . SKIPPED > [INFO] tika-parser-apple-module ... SKIPPED > [INFO] tika-parser-cad-module . 
SKIPPED > [INFO] tika-parser-mail-module SKIPPED > [INFO] tika-parser-miscoffice-module .. SKIPPED > [INFO] tika-parser-news-module SKIPPED > [INFO] tika-parser-crypto-module .. SKIPPED > [INFO] tika-parser-integration-tests .. SKIPPED > [INFO] tika-parsers ... SKIPPED > [INFO] tika-parsers-extended .. SKIPPED > [INFO] tika-parser-sqlite3-module . SKIPPED > [INFO] tika-parser-scientific-module .. SKIPPED > [INFO] tika-parsers-extended-integration-tests SKIPPED > [INFO] Apache Tika XMP SKIPPED > [INFO] Apache Tika serialization .. SKIPPED > [INFO] Apache Tika batch .. SKIPPED > [INFO] Apache Tika language detection . SKIPPED > [INFO] tika-langdetect-commons SKIPPED > [INFO] tika-langdetect-lingo24 SKIPPED > [INFO] tika-langdetect-optimaize .. SKIPPED > [INFO] tika-langdetect-mitll-text . SKIPPED > [INFO] tika-langdetect-opennlp SKIPPED > [INFO] Apache Tika application SKIPPED > [INFO] Apache Tika translate .. SKIPPED > [INFO] Apache Tika server . SKIPPED > [INFO] Apache Tika fuzzing SKIPPED > [INFO] Apache Tika eval ... SKIPPED > [INFO] Apache Tika examples ... SKIPPED > [INFO] Apache Tika Java-7 Components .. SKIPPED > [INFO] tika-parsers-advanced .. SKIPPED > [INFO] tika-parser-nlp-module . SKIPPED > [INFO] Apache Tika Natural Language Processing SKIPPED > [INFO] tika-parser-advancedmedia-module ... SKIPPED > [INFO] Apache Tika Deep Learning (powered by DL4J) SKIPPED >
[jira] [Resolved] (TIKA-491) Add language identification support for Norwegian Bokmål and Norwegian Nynorsk
[ https://issues.apache.org/jira/browse/TIKA-491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-491. --- Resolution: Won't Fix Cleanup [~kkrugler] [~pandermusubi] > Add language identification support for Norwegian Bokmål and Norwegian Nynorsk > -- > > Key: TIKA-491 > URL: https://issues.apache.org/jira/browse/TIKA-491 > Project: Tika > Issue Type: New Feature > Components: languageidentifier >Affects Versions: 0.7 >Reporter: Jan Høydahl >Assignee: Kenneth William Krugler >Priority: Major > > Currently there is one Norwegian language profile in Tika - "no". We need to > distinguish between the two official Norwegian languages defined by ISO 639-1 > codes "nb" and "nn". Those codes are recommended for use instead of the common > "no" tag. > The proposed solution is to remove the current language profile no.ngp and replace > it with two new ones for nb and nn. > We must also add tests for Norwegian. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (TIKA-369) Improve accuracy of language detection
[ https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-369. --- Resolution: Fixed Cleaning this one up [~kkrugler] > Improve accuracy of language detection > -- > > Key: TIKA-369 > URL: https://issues.apache.org/jira/browse/TIKA-369 > Project: Tika > Issue Type: Improvement > Components: languageidentifier >Affects Versions: 0.6 >Reporter: Kenneth William Krugler >Assignee: Kenneth William Krugler >Priority: Major > Attachments: Surprise and Coincidence.pdf, lingdet-mccs.pdf, > textcat.pdf > > > Currently the LanguageProfile code uses 3-grams to find the best language > profile using Pearson's chi-square test. This has three issues: > 1. The results aren't very good for short runs of text. Ted Dunning's paper > (attached) indicates that a log-likelihood ratio (LLR) test works much > better, which would then make language detection faster due to less text > needing to be processed. > 2. The current LanguageIdentifier.isReasonablyCertain() method uses an exact > value as a threshold for certainty. This is very sensitive to the amount of > text being processed, and thus gives false negative results for short runs of > text. > 3. Certainty should also be based on how much better the result is for > language X, compared to the next best language. If two languages both had > identical sum-of-squares values, and this value was below the threshold, then > the result is still not very certain. -- This message was sent by Atlassian Jira (v8.20.1#820001)
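Point 3 of the TIKA-369 description (certainty should also depend on how much better the best language scores than the runner-up) can be sketched in a few lines. This is only an illustrative sketch, not Tika's LanguageIdentifier code: the class name MarginCertainty, the method signature, and the threshold values used in main are all hypothetical, and scores are assumed to be distances where lower is better.

```java
/** Hypothetical helper, not part of Tika: illustrates a margin-based certainty check. */
final class MarginCertainty {

    /**
     * Returns true only if the best (smallest) distance is at or below maxDistance
     * AND the second-best distance is at least minMargin larger than the best one.
     * All parameter names and any concrete values are assumptions for illustration.
     */
    static boolean isReasonablyCertain(double[] distances, double maxDistance, double minMargin) {
        double best = Double.POSITIVE_INFINITY;
        double second = Double.POSITIVE_INFINITY;
        for (double d : distances) {
            if (d < best) {
                second = best;   // previous best becomes the runner-up
                best = d;
            } else if (d < second) {
                second = d;
            }
        }
        return best <= maxDistance && (second - best) >= minMargin;
    }

    public static void main(String[] args) {
        // Two languages with identical distances below the threshold: not certain.
        System.out.println(isReasonablyCertain(new double[] {0.01, 0.01}, 0.022, 0.005));
        // One clear winner below the threshold: certain.
        System.out.println(isReasonablyCertain(new double[] {0.01, 0.05}, 0.022, 0.005));
    }
}
```

An absolute threshold alone would accept the tie in the first call. Requiring a margin as well rejects it, which is the behaviour the issue asks for.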
[jira] [Created] (TIKA-3619) Augment README with build prerequisites
Lewis John McGibbney created TIKA-3619: -- Summary: Augment README with build prerequisites Key: TIKA-3619 URL: https://issues.apache.org/jira/browse/TIKA-3619 Project: Tika Issue Type: Improvement Components: documentation Affects Versions: 2.2.0 Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney When [reviewing the 2.2.0 RC|https://lists.apache.org/thread/pfwm8sn7w3lsrsckd8b9v3b32byj4zms] I became aware that although Docker IS required to build tika-pipes modules, there is no guidance to reflect that. I think we could clean up the README to reflect the installation prerequisites. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (TIKA-3606) Introduce a text summarization capability
[ https://issues.apache.org/jira/browse/TIKA-3606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452086#comment-17452086 ] Lewis John McGibbney commented on TIKA-3606: I'll most likely compile any needs/wants/suggestions into a wiki page or some sort of formal documentation. I'll let this one stew for a bit. > Introduce a text summarization capability > - > > Key: TIKA-3606 > URL: https://issues.apache.org/jira/browse/TIKA-3606 > Project: Tika > Issue Type: New Feature > Components: summarization >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > > I've been looking out for a nice text summarization capability for a while > and have been using [SummerTime|https://github.com/Yale-LILY/SummerTime] > recently which is really quite nice. > From the research I've been doing, the best of class in the summarization > field appears to reside within the Python ecosystem. This statement > accommodates both summarization results AND usability of the tools. > Before I go and start writing a tika.summarize API, I wanted to poll the > developer community to see what kind of things people are aware of. So here > it is, > *When thinking about a Tika text summarization capability, what would you > like to see?* > Thanks folks. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (TIKA-3606) Introduce a text summarization capability
Lewis John McGibbney created TIKA-3606: -- Summary: Introduce a text summarization capability Key: TIKA-3606 URL: https://issues.apache.org/jira/browse/TIKA-3606 Project: Tika Issue Type: New Feature Components: summarization Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney I've been looking out for a nice text summarization capability for a while and have been using [SummerTime|https://github.com/Yale-LILY/SummerTime] recently which is really quite nice. From the research I've been doing, the best of class in the summarization field appears to reside within the Python ecosystem. This statement accommodates both summarization results AND usability of the tools. Before I go and start writing a tika.summarize API, I wanted to poll the developer community to see what kind of things people are aware of. So here it is, *When thinking about a Tika text summarization capability, what would you like to see?* Thanks folks. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (TIKA-2796) Update GoogleTranslator to use google-cloud-translate Java API
[ https://issues.apache.org/jira/browse/TIKA-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-2796. Resolution: Duplicate https://issues.apache.org/jira/browse/TIKA-3404 > Update GoogleTranslator to use google-cloud-translate Java API > -- > > Key: TIKA-2796 > URL: https://issues.apache.org/jira/browse/TIKA-2796 > Project: Tika > Issue Type: Improvement > Components: translation >Affects Versions: 1.19.1 >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.0.0-BETA > > > The GoogleTranslator logic has been neglected and is no longer functional. > We can upgrade to use the official Google Java API at > https://search.maven.org/artifact/com.google.cloud/google-cloud-translate/1.54.0/jar > Additionally, documentation for this upgrade can be found at > https://cloud.google.com/translate/docs/quickstart-client-libraries#client-libraries-install-java -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (TIKA-3453) SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA and 2.1.0
[ https://issues.apache.org/jira/browse/TIKA-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3453: --- Summary: SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA and 2.1.0 (was: SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA) > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to > no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA and 2.1.0 > --- > > Key: TIKA-3453 > URL: https://issues.apache.org/jira/browse/TIKA-3453 > Project: Tika > Issue Type: Bug > Components: docker, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.1.1 > > > It looks like logging libraries are not being interpreted correctly from Java > classpath. > We need logging turned on so we can intercept anomalies. > Investigating... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (TIKA-3453) SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA
[ https://issues.apache.org/jira/browse/TIKA-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426383#comment-17426383 ] Lewis John McGibbney commented on TIKA-3453: The problem lies in the *_<scope>_* of the [logging dependencies|https://github.com/apache/tika/blob/main/tika-server/tika-server-core/pom.xml#L129-L139].
{code:xml}
<dependency>
  <groupId>org.apache.logging.log4j</groupId>
  <artifactId>log4j-core</artifactId>
  <version>${log4j2.version}</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.apache.logging.log4j</groupId>
  <artifactId>log4j-slf4j-impl</artifactId>
  <version>${log4j2.version}</version>
  <scope>test</scope>
</dependency>
{code}
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to > no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA > - > > Key: TIKA-3453 > URL: https://issues.apache.org/jira/browse/TIKA-3453 > Project: Tika > Issue Type: Bug > Components: docker, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.1.1 > > > It looks like logging libraries are not being interpreted correctly from Java > classpath. > We need logging turned on so we can intercept anomalies. > Investigating... -- This message was sent by Atlassian Jira (v8.3.4#803005)
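The comment above traces the warning to how the logging dependencies are declared in the tika-server-core pom.xml. One way to confirm at runtime whether the class named in the SLF4J warning actually reached the classpath is a small probe like the one below. It is a sketch under assumptions: the class name Slf4jBindingCheck and its method are hypothetical and not part of Tika or SLF4J, and only the JDK's Class.forName is used.

```java
/** Hypothetical diagnostic, not part of Tika or SLF4J. */
final class Slf4jBindingCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, Slf4jBindingCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the SLF4J warning quoted in this issue.
        String binder = "org.slf4j.impl.StaticLoggerBinder";
        System.out.println(binder + " present: " + isOnClasspath(binder));
    }
}
```

Run with the same classpath as the server jar, a result of false would be consistent with the binding being available only at test time.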
[jira] [Commented] (TIKA-3453) SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA
[ https://issues.apache.org/jira/browse/TIKA-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426378#comment-17426378 ] Lewis John McGibbney commented on TIKA-3453: OK so the problem lies in the tika-server source NOT with tika-docker or tika-helm. Using main branch I can replicate as well % java -jar tika-server-core-2.1.1-SNAPSHOT.jar SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. Oct 08, 2021 2:05:20 PM org.apache.cxf.endpoint.ServerImpl initDestination INFO: Setting the server's publish address to be http://localhost:9998/ > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to > no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA > - > > Key: TIKA-3453 > URL: https://issues.apache.org/jira/browse/TIKA-3453 > Project: Tika > Issue Type: Bug > Components: docker, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.1.1 > > > It looks like logging libraries are not being interpreted correctly from Java > classpath. > We need logging turned on so we can intercept anomalies. > Investigating... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (TIKA-3453) SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA
[ https://issues.apache.org/jira/browse/TIKA-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426372#comment-17426372 ] Lewis John McGibbney commented on TIKA-3453: [~davemeikle] can you also reproduce? > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to > no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA > - > > Key: TIKA-3453 > URL: https://issues.apache.org/jira/browse/TIKA-3453 > Project: Tika > Issue Type: Bug > Components: docker, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.1.1 > > > It looks like logging libraries are not being interpreted correctly from Java > classpath. > We need logging turned on so we can intercept anomalies. > Investigating... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (TIKA-3453) SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA
[ https://issues.apache.org/jira/browse/TIKA-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426369#comment-17426369 ] Lewis John McGibbney edited comment on TIKA-3453 at 10/8/21, 8:47 PM: -- Yep I can reproduce this [~scottbessler]. Using tika-docker 2.1.0-full via tika-helm {code:bash} % kubectl logs tika-66796c96-psl2b -n tika -f SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. {code} [~tallison] was this not supposed to be sorted out? What digging did we do before on this? Oct 08, 2021 8:40:28 PM org.apache.cxf.endpoint.ServerImpl initDestination INFO: Setting the server's publish address to be [http://0.0.0.0:9998/] was (Author: lewismc): Yep I can reproduce this [~scottbessler]. Using tika-docker 2.1.0-full via tika-helm {{% kubectl logs tika-66796c96-psl2b -n tika -f SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.}} [~tallison] was this not supposed to be sorted out? What digging did we do before on this? 
Oct 08, 2021 8:40:28 PM org.apache.cxf.endpoint.ServerImpl initDestination INFO: Setting the server's publish address to be http://0.0.0.0:9998/ > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to > no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA > - > > Key: TIKA-3453 > URL: https://issues.apache.org/jira/browse/TIKA-3453 > Project: Tika > Issue Type: Bug > Components: docker, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.1.1 > > > It looks like logging libraries are not being interpreted correctly from Java > classpath. > We need logging turned on so we can intercept anomalies. > Investigating... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (TIKA-3453) SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA
[ https://issues.apache.org/jira/browse/TIKA-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426369#comment-17426369 ] Lewis John McGibbney edited comment on TIKA-3453 at 10/8/21, 8:46 PM: -- Yep I can reproduce this [~scottbessler]. Using tika-docker 2.1.0-full via tika-helm {{% kubectl logs tika-66796c96-psl2b -n tika -f SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.}} [~tallison] was this not supposed to be sorted out? What digging did we do before on this? Oct 08, 2021 8:40:28 PM org.apache.cxf.endpoint.ServerImpl initDestination INFO: Setting the server's publish address to be http://0.0.0.0:9998/ was (Author: lewismc): Yep I can reproduce this [~scottbessler]. Using tika-docker 2.1.0-full via tika-helm {{% kubectl logs tika-66796c96-psl2b -n tika -f SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.}} [~tallison] was this not supposed to be sorted out? What digging did we do before on this? 
Oct 08, 2021 8:40:28 PM org.apache.cxf.endpoint.ServerImpl initDestination INFO: Setting the server's publish address to be http://0.0.0.0:9998/ > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to > no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA > - > > Key: TIKA-3453 > URL: https://issues.apache.org/jira/browse/TIKA-3453 > Project: Tika > Issue Type: Bug > Components: docker, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.1.1 > > > It looks like logging libraries are not being interpreted correctly from Java > classpath. > We need logging turned on so we can intercept anomalies. > Investigating... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (TIKA-3453) SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA
[ https://issues.apache.org/jira/browse/TIKA-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426369#comment-17426369 ] Lewis John McGibbney commented on TIKA-3453: Yep I can reproduce this [~scottbessler]. Using tika-docker 2.1.0-full via tika-helm {{% kubectl logs tika-66796c96-psl2b -n tika -f SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.}} [~tallison] was this not supposed to be sorted out? What digging did we do before on this? Oct 08, 2021 8:40:28 PM org.apache.cxf.endpoint.ServerImpl initDestination INFO: Setting the server's publish address to be http://0.0.0.0:9998/ > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to > no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA > - > > Key: TIKA-3453 > URL: https://issues.apache.org/jira/browse/TIKA-3453 > Project: Tika > Issue Type: Bug > Components: docker, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.1.1 > > > It looks like logging libraries are not being interpreted correctly from Java > classpath. > We need logging turned on so we can intercept anomalies. > Investigating... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (TIKA-3453) SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA
[ https://issues.apache.org/jira/browse/TIKA-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3453: --- Fix Version/s: (was: 2.0.0) 2.1.1 > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to > no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA > - > > Key: TIKA-3453 > URL: https://issues.apache.org/jira/browse/TIKA-3453 > Project: Tika > Issue Type: Bug > Components: docker, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.1.1 > > > It looks like logging libraries are not being interpreted correctly from Java > classpath. > We need logging turned on so we can intercept anomalies. > Investigating... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (TIKA-3566) Upgrade tika-helm to 2.1.0
Lewis John McGibbney created TIKA-3566: -- Summary: Upgrade tika-helm to 2.1.0 Key: TIKA-3566 URL: https://issues.apache.org/jira/browse/TIKA-3566 Project: Tika Issue Type: Improvement Components: helm Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney Fix For: 2.1.0 Simple upgrade to [tika-docker 2.1.0|https://hub.docker.com/layers/apache/tika/2.1.0/images/sha256-5bb52afa9726cf2ca022441cc75ef357de9f8deb41a88a9b2964780e934d11e7?context=explore]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (TIKA-3453) SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA
[ https://issues.apache.org/jira/browse/TIKA-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425772#comment-17425772 ] Lewis John McGibbney commented on TIKA-3453: I can investigate. Thanks [~scottbessler], we didn't upgrade internally yet, but I will force that now and report back. > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to > no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA > - > > Key: TIKA-3453 > URL: https://issues.apache.org/jira/browse/TIKA-3453 > Project: Tika > Issue Type: Bug > Components: docker, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.0.0 > > > It looks like logging libraries are not being interpreted correctly from Java > classpath. > We need logging turned on so we can intercept anomalies. > Investigating... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (TIKA-3483) Implement a network policy for Helm Chart
[ https://issues.apache.org/jira/browse/TIKA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-3483. Resolution: Fixed > Implement a network policy for Helm Chart > - > > Key: TIKA-3483 > URL: https://issues.apache.org/jira/browse/TIKA-3483 > Project: Tika > Issue Type: Improvement > Components: helm >Reporter: Lewis John McGibbney >Priority: Major > Fix For: 2.0.1 > > > See https://github.com/apache/tika-helm/pull/5 for context -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (TIKA-3483) Implement a network policy for Helm Chart
[ https://issues.apache.org/jira/browse/TIKA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3483: --- Fix Version/s: (was: 2.0.0-BETA) 2.0.1 > Implement a network policy for Helm Chart > - > > Key: TIKA-3483 > URL: https://issues.apache.org/jira/browse/TIKA-3483 > Project: Tika > Issue Type: Improvement > Components: helm >Reporter: Lewis John McGibbney >Priority: Major > Fix For: 2.0.1 > > > See https://github.com/apache/tika-helm/pull/5 for context -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (TIKA-3483) Implement a network policy for Helm Chart
[ https://issues.apache.org/jira/browse/TIKA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3483: --- Summary: Implement a network policy for Helm Chart (was: Implement a network policy) > Implement a network policy for Helm Chart > - > > Key: TIKA-3483 > URL: https://issues.apache.org/jira/browse/TIKA-3483 > Project: Tika > Issue Type: Improvement > Components: helm >Reporter: Lewis John McGibbney >Priority: Major > Fix For: 2.0.0 > > > See https://github.com/apache/tika-helm/pull/5 for context -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (TIKA-3483) Implement a network policy
Lewis John McGibbney created TIKA-3483: -- Summary: Implement a network policy Key: TIKA-3483 URL: https://issues.apache.org/jira/browse/TIKA-3483 Project: Tika Issue Type: Improvement Components: helm Reporter: Lewis John McGibbney Fix For: 2.0.0 See https://github.com/apache/tika-helm/pull/5 for context -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (TIKA-3454) Facilitate configuration of translation and transcription impls in tika-server/tika-docker/tika-helm
[ https://issues.apache.org/jira/browse/TIKA-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17369234#comment-17369234 ] Lewis John McGibbney edited comment on TIKA-3454 at 6/25/21, 4:46 AM: -- There are a couple of confusing statements {quote} Configuring Language Identifiers At this time, there is no unified way to configure language identifiers. While the work on that is ongoing, for now you will need to review the Tika Javadocs to see how individual identifiers are configured. Configuring Translators At this time, there is no unified way to configure Translators. While the work on that is ongoing, for now you will need to review the Tika Javadocs to see how individual Translators are configured. {quote} The hyperlinks point to https://tika.apache.org/1.26/api/ which is not particularly useful. I think this is going to take some collective input to arrive at a decent solution. Is anyone else interested in this? was (Author: lewismc): There are a couple of confusing statements {quote} Configuring Language Identifiers At this time, there is no unified way to configure language identifiers. While the work on that is ongoing, for now you will need to review the Tika Javadocs to see how individual identifiers are configured. Configuring Translators At this time, there is no unified way to configure Translators. While the work on that is ongoing, for now you will need to review the Tika Javadocs to see how individual Translators are configured. {quote} I think this is going to take some collective input. Is anyone else interested in this? 
> Facilitate configuration of translation and transcription impls in > tika-server/tika-docker/tika-helm > > > Key: TIKA-3454 > URL: https://issues.apache.org/jira/browse/TIKA-3454 > Project: Tika > Issue Type: Bug > Components: docker, helm, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.0.0 > > > I need an easy way to configure, for example, the > [AmazonTranscribe|https://github.com/apache/tika/blob/main/tika-parsers/tika-parsers-ml/tika-transcribe-aws/src/main/java/org/apache/tika/parser/transcribe/aws/AmazonTranscribe.java] > implementation when I deploy tika-server (tika-docker) via the Helm chart > into Kubernetes. The same goes for TIka translation implementations. > We have [documentation for configuring tika-server to run via > Docker|https://github.com/apache/tika-docker#custom-config] however > currently, there is [no way to configure translators or > transcribers|https://tika.apache.org/1.26/configuring.html#Configuring_Translators] > > This task will determine a sensible means by which we can configure > translators and transcribers for tika-server such that it can be used further > downstream via Docker and Helm on K8s. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (TIKA-3454) Facilitate configuration of translation and transcription impls in tika-server/tika-docker/tika-helm
[ https://issues.apache.org/jira/browse/TIKA-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17369234#comment-17369234 ] Lewis John McGibbney commented on TIKA-3454: There are a couple of confusing statements {quote} Configuring Language Identifiers At this time, there is no unified way to configure language identifiers. While the work on that is ongoing, for now you will need to review the Tika Javadocs to see how individual identifiers are configured. Configuring Translators At this time, there is no unified way to configure Translators. While the work on that is ongoing, for now you will need to review the Tika Javadocs to see how individual Translators are configured. {quote} I think this is going to take some collective input. Is anyone else interested in this? > Facilitate configuration of translation and transcription impls in > tika-server/tika-docker/tika-helm > > > Key: TIKA-3454 > URL: https://issues.apache.org/jira/browse/TIKA-3454 > Project: Tika > Issue Type: Bug > Components: docker, helm, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.0.0 > > > I need an easy way to configure, for example, the > [AmazonTranscribe|https://github.com/apache/tika/blob/main/tika-parsers/tika-parsers-ml/tika-transcribe-aws/src/main/java/org/apache/tika/parser/transcribe/aws/AmazonTranscribe.java] > implementation when I deploy tika-server (tika-docker) via the Helm chart > into Kubernetes. The same goes for TIka translation implementations. 
> We have [documentation for configuring tika-server to run via > Docker|https://github.com/apache/tika-docker#custom-config] however > currently, there is [no way to configure translators or > transcribers|https://tika.apache.org/1.26/configuring.html#Configuring_Translators] > > This task will determine a sensible means by which we can configure > translators and transcribers for tika-server such that it can be used further > downstream via Docker and Helm on K8s. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (TIKA-3454) Facilitate configuration of translation and transcription impls in tika-server/tika-docker/tika-helm
[ https://issues.apache.org/jira/browse/TIKA-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3454: --- Description: I need an easy way to configure, for example, the [AmazonTranscribe|https://github.com/apache/tika/blob/main/tika-parsers/tika-parsers-ml/tika-transcribe-aws/src/main/java/org/apache/tika/parser/transcribe/aws/AmazonTranscribe.java] implementation when I deploy tika-server (tika-docker) via the Helm chart into Kubernetes. The same goes for TIka translation implementations. We have [documentation for configuring tika-server to run via Docker|https://github.com/apache/tika-docker#custom-config] however currently, there is [no way to configure translators or transcribers|https://tika.apache.org/1.26/configuring.html#Configuring_Translators] This task will determine a sensible means by which we can configure translators and transcribers for tika-server such that it can be used further downstream via Docker and Helm on K8s. was: I need an easy way to configure, for example, the [AmazonTranscribe|https://github.com/apache/tika/blob/main/tika-parsers/tika-parsers-ml/tika-transcribe-aws/src/main/java/org/apache/tika/parser/transcribe/aws/AmazonTranscribe.java] implementation when I deploy Tika via the Helm chart into Kubernetes. The same goes for TIka translation implementations. We have [documentation for configuring tika-server to run via Docker|https://github.com/apache/tika-docker#custom-config] however currently, there is [no way to configure translators or transcribers|https://tika.apache.org/1.26/configuring.html#Configuring_Translators] This task will determine a sensible means by which we can configure translators and transcribers for tika-server such that it can be used further downstream via Docker and Helm on K8s. 
> Facilitate configuration of translation and transcription impls in > tika-server/tika-docker/tika-helm > > > Key: TIKA-3454 > URL: https://issues.apache.org/jira/browse/TIKA-3454 > Project: Tika > Issue Type: Bug > Components: docker, helm, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.0.0 > > > I need an easy way to configure, for example, the > [AmazonTranscribe|https://github.com/apache/tika/blob/main/tika-parsers/tika-parsers-ml/tika-transcribe-aws/src/main/java/org/apache/tika/parser/transcribe/aws/AmazonTranscribe.java] > implementation when I deploy tika-server (tika-docker) via the Helm chart > into Kubernetes. The same goes for TIka translation implementations. > We have [documentation for configuring tika-server to run via > Docker|https://github.com/apache/tika-docker#custom-config] however > currently, there is [no way to configure translators or > transcribers|https://tika.apache.org/1.26/configuring.html#Configuring_Translators] > > This task will determine a sensible means by which we can configure > translators and transcribers for tika-server such that it can be used further > downstream via Docker and Helm on K8s. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (TIKA-3454) Facilitate configuration of translation and transcription impls in tika-server/tika-docker/tika-helm
Lewis John McGibbney created TIKA-3454: -- Summary: Facilitate configuration of translation and transcription impls in tika-server/tika-docker/tika-helm Key: TIKA-3454 URL: https://issues.apache.org/jira/browse/TIKA-3454 Project: Tika Issue Type: Bug Components: docker, helm, server Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney Fix For: 2.0.0 I need an easy way to configure, for example, the [AmazonTranscribe|https://github.com/apache/tika/blob/main/tika-parsers/tika-parsers-ml/tika-transcribe-aws/src/main/java/org/apache/tika/parser/transcribe/aws/AmazonTranscribe.java] implementation when I deply Tika via the Helm chart into Kubernetes. The same goes for TIka translation implementations. We have [documentation for configuring tika-server to run via Docker|https://github.com/apache/tika-docker#custom-config] however currently, there is [no way to configure translators or transcribers|https://tika.apache.org/1.26/configuring.html#Configuring_Translators] This task will determine a sensible means by which we can configure translators and transcribers for tika-server such that it can be used further downstream via Docker and Helm on K8s. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (TIKA-3454) Facilitate configuration of translation and transcription impls in tika-server/tika-docker/tika-helm
[ https://issues.apache.org/jira/browse/TIKA-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3454: --- Description: I need an easy way to configure, for example, the [AmazonTranscribe|https://github.com/apache/tika/blob/main/tika-parsers/tika-parsers-ml/tika-transcribe-aws/src/main/java/org/apache/tika/parser/transcribe/aws/AmazonTranscribe.java] implementation when I deploy Tika via the Helm chart into Kubernetes. The same goes for TIka translation implementations. We have [documentation for configuring tika-server to run via Docker|https://github.com/apache/tika-docker#custom-config] however currently, there is [no way to configure translators or transcribers|https://tika.apache.org/1.26/configuring.html#Configuring_Translators] This task will determine a sensible means by which we can configure translators and transcribers for tika-server such that it can be used further downstream via Docker and Helm on K8s. was: I need an easy way to configure, for example, the [AmazonTranscribe|https://github.com/apache/tika/blob/main/tika-parsers/tika-parsers-ml/tika-transcribe-aws/src/main/java/org/apache/tika/parser/transcribe/aws/AmazonTranscribe.java] implementation when I deply Tika via the Helm chart into Kubernetes. The same goes for TIka translation implementations. We have [documentation for configuring tika-server to run via Docker|https://github.com/apache/tika-docker#custom-config] however currently, there is [no way to configure translators or transcribers|https://tika.apache.org/1.26/configuring.html#Configuring_Translators] This task will determine a sensible means by which we can configure translators and transcribers for tika-server such that it can be used further downstream via Docker and Helm on K8s. 
> Facilitate configuration of translation and transcription impls in > tika-server/tika-docker/tika-helm > > > Key: TIKA-3454 > URL: https://issues.apache.org/jira/browse/TIKA-3454 > Project: Tika > Issue Type: Bug > Components: docker, helm, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.0.0 > > > I need an easy way to configure, for example, the > [AmazonTranscribe|https://github.com/apache/tika/blob/main/tika-parsers/tika-parsers-ml/tika-transcribe-aws/src/main/java/org/apache/tika/parser/transcribe/aws/AmazonTranscribe.java] > implementation when I deploy Tika via the Helm chart into Kubernetes. The > same goes for TIka translation implementations. > We have [documentation for configuring tika-server to run via > Docker|https://github.com/apache/tika-docker#custom-config] however > currently, there is [no way to configure translators or > transcribers|https://tika.apache.org/1.26/configuring.html#Configuring_Translators] > > This task will determine a sensible means by which we can configure > translators and transcribers for tika-server such that it can be used further > downstream via Docker and Helm on K8s. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (TIKA-3403) Create example for Transcription
[ https://issues.apache.org/jira/browse/TIKA-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-3403. Resolution: Fixed > Create example for Transcription > > > Key: TIKA-3403 > URL: https://issues.apache.org/jira/browse/TIKA-3403 > Project: Tika > Issue Type: Improvement > Components: transcription >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.0.0 > > > Post-TIKA-94, we lack a transcription tutorial. > I have implemented a tutorial and several improvements for the > [AmazonTranscribe|https://github.com/apache/tika/blob/main/tika-transcribe/src/main/java/org/apache/tika/transcribe/AmazonTranscribe.java]. > PR coming up!!! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (TIKA-3453) SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA
[ https://issues.apache.org/jira/browse/TIKA-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-3453. Resolution: Fixed > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to > no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA > - > > Key: TIKA-3453 > URL: https://issues.apache.org/jira/browse/TIKA-3453 > Project: Tika > Issue Type: Bug > Components: docker, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.0.0 > > > It looks like logging libraries are not being interpreted correctly from Java > classpath. > We need logging turned on so we can intercept anomalies. > Investigating... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (TIKA-3453) SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA
[ https://issues.apache.org/jira/browse/TIKA-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1736#comment-1736 ] Lewis John McGibbney commented on TIKA-3453: Excellent [~tallison] I'll close this off :) > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to > no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA > - > > Key: TIKA-3453 > URL: https://issues.apache.org/jira/browse/TIKA-3453 > Project: Tika > Issue Type: Bug > Components: docker, server >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.0.0 > > > It looks like logging libraries are not being interpreted correctly from Java > classpath. > We need logging turned on so we can intercept anomalies. > Investigating... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (TIKA-3452) java.nio.file.FileSystemException Read-only file system in 2.0.0-BETA tika-docker
[ https://issues.apache.org/jira/browse/TIKA-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368531#comment-17368531 ] Lewis John McGibbney commented on TIKA-3452: I think the correct way to do this is to mount an emptyDir volume with its medium set to Memory. For example https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod > java.nio.file.FileSystemException Read-only file system in 2.0.0-BETA > tika-docker > - > > Key: TIKA-3452 > URL: https://issues.apache.org/jira/browse/TIKA-3452 > Project: Tika > Issue Type: Bug > Components: docker, helm >Reporter: Lewis John McGibbney >Assignee: Lewis John McGibbney >Priority: Major > Fix For: 2.0.0 > > > The following ExecutionException is thrown when I attempt to run [tika-docker > 2.0.0-BETA|https://hub.docker.com/layers/apache/tika/2.0.0-BETA-full/images/sha256-2d735f7bdf86e618a5390d92614a310697f9134d11a2b2e4c1c0cfcde1f68b1d?context=explore] > {code:bash} > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". > SLF4J: Defaulting to no-operation (NOP) logger implementation > SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further > details.
> java.util.concurrent.ExecutionException: java.nio.file.FileSystemException: > /tmp/apache-tika-server-forked-tmp-8374629799942405236: Read-only file system > at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191) > at > org.apache.tika.server.core.TikaServerCli.mainLoop(TikaServerCli.java:116) > at > org.apache.tika.server.core.TikaServerCli.execute(TikaServerCli.java:88) > at org.apache.tika.server.core.TikaServerCli.main(TikaServerCli.java:66) > Caused by: java.nio.file.FileSystemException: > /tmp/apache-tika-server-forked-tmp-8374629799942405236: Read-only file system > at > java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100) > at > java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) > at > java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116) > at > java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219) > at java.base/java.nio.file.Files.newByteChannel(Files.java:375) > at java.base/java.nio.file.Files.createFile(Files.java:652) > at > java.base/java.nio.file.TempFileHelper.create(TempFileHelper.java:137) > at > java.base/java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:160) > at java.base/java.nio.file.Files.createTempFile(Files.java:917) > at > org.apache.tika.server.core.TikaServerWatchDog$ForkedProcess.(TikaServerWatchDog.java:220) > at > org.apache.tika.server.core.TikaServerWatchDog$ForkedProcess.(TikaServerWatchDog.java:210) > at > org.apache.tika.server.core.TikaServerWatchDog.call(TikaServerWatchDog.java:117) > at > org.apache.tika.server.core.TikaServerWatchDog.call(TikaServerWatchDog.java:50) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) > at java.base/java.lang.Thread.run(Thread.java:832) > {code} > There are differences/improvements in the way the [tika-server child process > is > spawned|https://cwiki.apache.org/confluence/display/TIKA/TikaServer#TikaServer-MakingTikaServerRobusttoOOMs,InfiniteLoopsandMemoryLeaks] > in the 2.0.0-BETA docker image. I am investigating a fix. -- This message was sent by Atlassian Jira (v8.3.4#803005)
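The emptyDir suggestion above could look roughly like the following pod spec fragment. This is only a sketch under stated assumptions: the pod name, container name, and volume name are hypothetical, the securityContext values are copied from the tika-helm setting quoted in this thread, and it is not necessarily the change that was applied to the chart. The idea is to keep the root filesystem read-only while giving the forked tika-server process a writable /tmp.

{code:yaml}
# Hypothetical pod spec; metadata.name, the container name and the volume name are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: tika-example
spec:
  containers:
    - name: tika
      image: apache/tika:2.0.0-BETA-full
      ports:
        - containerPort: 9998
      securityContext:
        capabilities:
          drop:
            - ALL
        readOnlyRootFilesystem: true   # root filesystem stays read-only
        runAsNonRoot: true
        runAsUser: 35002
        runAsGroup: 35002
      volumeMounts:
        - name: tmp                    # writable scratch space for temp files under /tmp
          mountPath: /tmp
  volumes:
    - name: tmp
      emptyDir:
        medium: Memory                 # tmpfs-backed; usage counts against the container's memory limit
{code}

Because a Memory-backed emptyDir is a tmpfs, large temporary files would consume container memory, so omitting the medium field (node-disk-backed emptyDir) may be the safer choice for big documents.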
[jira] [Commented] (TIKA-3452) java.nio.file.FileSystemException Read-only file system in 2.0.0-BETA tika-docker
[ https://issues.apache.org/jira/browse/TIKA-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368528#comment-17368528 ] Lewis John McGibbney commented on TIKA-3452: OK, the issue stems from TIKA-3381 (cf. https://github.com/apache/tika-helm/pull/2). Specifically, the following [security context setting|https://github.com/apache/tika-helm/blob/main/values.yaml#L50]
{code:yaml}
securityContext:
  capabilities:
    drop:
    - ALL
  readOnlyRootFilesystem: true  # <<< if toggled to false then the issue is fixed.
  runAsNonRoot: true
  runAsUser: 35002
  runAsGroup: 35002
{code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (TIKA-3453) SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA
Lewis John McGibbney created TIKA-3453:
Summary: SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA
Key: TIKA-3453
URL: https://issues.apache.org/jira/browse/TIKA-3453
Project: Tika
Issue Type: Bug
Components: docker, server
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Fix For: 2.0.0

It looks like logging libraries are not being interpreted correctly from the Java classpath. We need logging turned on so we can intercept anomalies. Investigating...

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (TIKA-3452) java.nio.file.FileSystemException Read-only file system in 2.0.0-BETA tika-docker
[ https://issues.apache.org/jira/browse/TIKA-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368524#comment-17368524 ] Lewis John McGibbney edited comment on TIKA-3452 at 6/23/21, 11:06 PM: OK, so building and running tika:2.0.0-BETA-full, everything goes just fine
{code:bash}
% docker pull apache/tika:2.0.0-BETA-full
% docker run -d -p 9998:9998 apache/tika:2.0.0-BETA-full
...
4f88baf2e274
% docker exec -it 4f88baf2e274 /bin/bash
root@4f88baf2e274:/# ls /tmp/
apache-tika-server-forked-tmp-14387144943566291399  cxf-tmp-877558076030522104/  hsperfdata_root/
{code}
This is after uploading a sample PPT document. So I suspect the problem is now in the Helm deployment. Investigating further.
[jira] [Commented] (TIKA-3452) java.nio.file.FileSystemException Read-only file system in 2.0.0-BETA tika-docker
[ https://issues.apache.org/jira/browse/TIKA-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368524#comment-17368524 ] Lewis John McGibbney commented on TIKA-3452: OK, so building and running tika:2.0.0-BETA-full, everything goes just fine
{code:bash}
% docker exec -it 4f88baf2e274 /bin/bash
root@4f88baf2e274:/# ls /tmp/
apache-tika-server-forked-tmp-14387144943566291399  cxf-tmp-877558076030522104/  hsperfdata_root/
{code}
This is after uploading a sample PPT document. So I suspect the problem is now in the Helm deployment. Investigating further. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (TIKA-3452) java.nio.file.FileSystemException Read-only file system in 2.0.0-BETA tika-docker
[ https://issues.apache.org/jira/browse/TIKA-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-3452: Component/s: helm -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (TIKA-3452) java.nio.file.FileSystemException Read-only file system in 2.0.0-BETA tika-docker
Lewis John McGibbney created TIKA-3452:
Summary: java.nio.file.FileSystemException Read-only file system in 2.0.0-BETA tika-docker
Key: TIKA-3452
URL: https://issues.apache.org/jira/browse/TIKA-3452
Project: Tika
Issue Type: Bug
Components: docker
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Fix For: 2.0.0

The following ExecutionException is thrown when I attempt to run [tika-docker 2.0.0-BETA|https://hub.docker.com/layers/apache/tika/2.0.0-BETA-full/images/sha256-2d735f7bdf86e618a5390d92614a310697f9134d11a2b2e4c1c0cfcde1f68b1d?context=explore]
{code:bash}
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
java.util.concurrent.ExecutionException: java.nio.file.FileSystemException: /tmp/apache-tika-server-forked-tmp-8374629799942405236: Read-only file system
    at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
    at org.apache.tika.server.core.TikaServerCli.mainLoop(TikaServerCli.java:116)
    at org.apache.tika.server.core.TikaServerCli.execute(TikaServerCli.java:88)
    at org.apache.tika.server.core.TikaServerCli.main(TikaServerCli.java:66)
Caused by: java.nio.file.FileSystemException: /tmp/apache-tika-server-forked-tmp-8374629799942405236: Read-only file system
    at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
    at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219)
    at java.base/java.nio.file.Files.newByteChannel(Files.java:375)
    at java.base/java.nio.file.Files.createFile(Files.java:652)
    at java.base/java.nio.file.TempFileHelper.create(TempFileHelper.java:137)
    at java.base/java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:160)
    at java.base/java.nio.file.Files.createTempFile(Files.java:917)
    at org.apache.tika.server.core.TikaServerWatchDog$ForkedProcess.<init>(TikaServerWatchDog.java:220)
    at org.apache.tika.server.core.TikaServerWatchDog$ForkedProcess.<init>(TikaServerWatchDog.java:210)
    at org.apache.tika.server.core.TikaServerWatchDog.call(TikaServerWatchDog.java:117)
    at org.apache.tika.server.core.TikaServerWatchDog.call(TikaServerWatchDog.java:50)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
    at java.base/java.lang.Thread.run(Thread.java:832)
{code}
There are differences/improvements in the way the [tika-server child process is spawned|https://cwiki.apache.org/confluence/display/TIKA/TikaServer#TikaServer-MakingTikaServerRobusttoOOMs,InfiniteLoopsandMemoryLeaks] in the 2.0.0-BETA docker image. I am investigating a fix.

-- This message was sent by Atlassian Jira (v8.3.4#803005)