[
https://issues.apache.org/jira/browse/TIKA-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-4285.
---
Resolution: Fixed
Thank you [~tom_1st] and [~tilman]! Should be fixed now.
> Invalid L
[
https://issues.apache.org/jira/browse/TIKA-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison reassigned TIKA-4285:
-
Assignee: Tim Allison
> Invalid Link for changelog CHANGES.txt fi
[
https://issues.apache.org/jira/browse/TIKA-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866843#comment-17866843
]
Tim Allison commented on TIKA-4281:
---
For some reason, now, it looks like {{javadocs}} works fine
[
https://issues.apache.org/jira/browse/TIKA-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866742#comment-17866742
]
Tim Allison commented on TIKA-4281:
---
Well, that didn't work: {{javadoc: error - No source files
0.0.
-- Tim Allison, on behalf of the Apache Tika community
Tim Allison created TIKA-4281:
-
Summary: Fix javadoc plugin configuration
Key: TIKA-4281
URL: https://issues.apache.org/jira/browse/TIKA-4281
Project: Tika
Issue Type: Task
Reporter
[
https://issues.apache.org/jira/browse/TIKA-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-4280:
--
Description:
I'm too lazy to open separate tickets. Please do so if desired.
Some items:
* Before
I released the artifacts and built the docker images. I'll work on the
site and announcement tomorrow.
On Mon, Jul 15, 2024 at 1:50 PM Tim Allison wrote:
>
> The vote has passed with 3 PMC +1s, 2 non-binding +1s and no -1s.
>
> +1s (binding)
> Tim Allison
> Nicholas DiPiazza
Tim Allison created TIKA-4280:
-
Summary: Tasks for the 3.0.0 release
Key: TIKA-4280
URL: https://issues.apache.org/jira/browse/TIKA-4280
Project: Tika
Issue Type: Task
Reporter: Tim
The vote has passed with 3 PMC +1s, 2 non-binding +1s and no -1s.
+1s (binding)
Tim Allison
Nicholas DiPiazza
Tilman Hausherr
+1s (non-binding)
Kiran Bachu
Gary Gregory
I'll release the artifacts shortly and update the website.
Thank you, all!
Best,
Tim
On Fri, Jul 12, 2024 at 12:08
[
https://issues.apache.org/jira/browse/TIKA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866075#comment-17866075
]
Tim Allison commented on TIKA-4278:
---
Thank you, [~tilman], y, that's probably an oversight on my part
> dependencies
> (I've added these so we support these other projects by testing them),
> and decide about the ffmpeg issue and the hdf5 issue.
>
> Tilman
>
> On 12.07.2024 18:08, Tim Allison wrote:
> > A candidate for the Tika 3.0.0-BETA2 release is available at:
> >
A candidate for the Tika 3.0.0-BETA2 release is available at:
https://dist.apache.org/repos/dist/dev/tika/3.0.0-BETA2
The release candidate is a zip archive of the sources in:
https://github.com/apache/tika/tree/3.0.0-BETA2-rc1/
The SHA-512 checksum of the archive is
ve the same stuff INSTALLed as well, see line 32!
> Looking more...
>
>
> On Wed, Jul 10, 2024 at 9:44 PM Tim Allison wrote:
>
> > Apache Maven 3.9.7 (8b094c9513efc1b9ce2d952b3b9c8eaedaf8cbf0)
> > Maven home: /apache/apache-maven-3.9.7
> > Java version: 11.0.23, vendor
Sorry, should have been dev@tika earlier, not private@tika.
I rolled back the deploy-plugin to 3.1.1, which was successful for our last
deployment of 2.9.2. That worked then. It does not work now with this new
tika-grpc module.
On Wed, Jul 10, 2024 at 3:42 PM Tim Allison wrote:
> Apache Ma
Let's aim for tomorrow after review of TIKA-4275?
Any other fellow devs want to join?
On Tue, Jul 9, 2024 at 4:46 PM Tim Allison wrote:
> Doh. Sorry. Starting now...
>
> On Tue, Jul 9, 2024 at 12:47 PM Nicholas DiPiazza <
> nicholas.dipia...@gmail.com> wrote:
>
>>
Tim Allison created TIKA-4275:
-
Summary: Make tika-grpc a top-level module
Key: TIKA-4275
URL: https://issues.apache.org/jira/browse/TIKA-4275
Project: Tika
Issue Type: Task
Reporter
Doh. Sorry. Starting now...
On Tue, Jul 9, 2024 at 12:47 PM Nicholas DiPiazza <
nicholas.dipia...@gmail.com> wrote:
> Hi all,
>
> Just seeing if we were planning to build Beta2 today? I'd like to tag along
> and see how it's done if y'all don't mind!
>
> -Nicholas
>
All,
I think it is time to go for a 3.0.0-BETA2. What do you think about
cutting that release this Friday or maybe next week?
Best,
Tim
cripts. how do i go
> > about getting that created any idea?
> >
> > On Wed, Jun 26, 2024 at 2:41 PM Tim Allison
> > wrote:
> >
> >> Like a 3.0.0-BETA2 release?
> >>
> >> On Wed, Jun 26, 2024 at 12:06 PM Nicholas DiPiazza
[
https://issues.apache.org/jira/browse/TIKA-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17860241#comment-17860241
]
Tim Allison commented on TIKA-4272:
---
Y, I concur, we should have a completely separate image.
> cre
Like a 3.0.0-BETA2 release?
On Wed, Jun 26, 2024 at 12:06 PM Nicholas DiPiazza <
nicholas.dipia...@gmail.com> wrote:
> At some point I would like to build a 3.0.0 beta version.
>
> How can I go about this?
>
> -Nicholas
>
[
https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17860035#comment-17860035
]
Tim Allison commented on TIKA-4251:
---
W00t!
> [DISCUSS] move to cosium's git-code-format-maven-plu
[
https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17860020#comment-17860020
]
Tim Allison commented on TIKA-4251:
---
Sounds great. My personal preference would be to move away from our
[
https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17860007#comment-17860007
]
Tim Allison commented on TIKA-4251:
---
> we eat the 1-time-format cost
That's where the vulnerabil
[
https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1785#comment-1785
]
Tim Allison commented on TIKA-4251:
---
Makes sense. Tilman's observation is legit, and I don't see a way
[
https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859739#comment-17859739
]
Tim Allison edited comment on TIKA-4251 at 6/25/24 6:19 PM:
Y. I agree. When I
[
https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859739#comment-17859739
]
Tim Allison commented on TIKA-4251:
---
Y. I agree. When I started with checkstyle, it modified nearly
pia...@gmail.com> wrote:
> I just started using it for a big project and it is awesome
>
> On Sat, Jun 22, 2024, 6:11 AM Tim Allison wrote:
>
> > https://issues.apache.org/jira/browse/TIKA-4251
> >
> > Anything that works and doesn't allow wildcard imports I'
https://issues.apache.org/jira/browse/TIKA-4251
Anything that works and doesn't allow wildcard imports I'm good with. Have
you had luck with OpenRewrite?
On Wed, Jun 19, 2024 at 12:55 PM Nicholas DiPiazza <
nicholas.dipia...@gmail.com> wrote:
> Hey Tim and Team:
>
> I remember someone stating
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853241#comment-17853241
]
Tim Allison commented on TIKA-4243:
---
This is what the json currently looks like.
{code:json
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853240#comment-17853240
]
Tim Allison commented on TIKA-4243:
---
I opened a PR with some cleanup, fixes and a new unit test
[
https://issues.apache.org/jira/browse/TIKA-4268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-4268.
---
Fix Version/s: 3.0.0
Resolution: Fixed
> Use title for embedded resource path in embedded
[
https://issues.apache.org/jira/browse/TIKA-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853157#comment-17853157
]
Tim Allison commented on TIKA-4251:
---
Unless there are any objections, I'll likely move forward
Tim Allison created TIKA-4268:
-
Summary: Use title for embedded resource path in embedded msg files
Key: TIKA-4268
URL: https://issues.apache.org/jira/browse/TIKA-4268
Project: Tika
Issue Type
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852876#comment-17852876
]
Tim Allison edited comment on TIKA-4243 at 6/6/24 5:39 PM:
---
I think our joint
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852876#comment-17852876
]
Tim Allison commented on TIKA-4243:
---
I think our joint recent PR on TIKA-4252 accomplishes the goals
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852874#comment-17852874
]
Tim Allison commented on TIKA-4252:
---
K. I think we're at "good enough" here. [~ndipiazza],
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-4252.
---
Resolution: Fixed
> PipesClient#process - seems to lose the Fetch input metad
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852808#comment-17852808
]
Tim Allison commented on TIKA-4243:
---
Oh, and documentation, lots of documentation. :LOL:
> t
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852804#comment-17852804
]
Tim Allison edited comment on TIKA-4243 at 6/6/24 2:11 PM:
---
Current status
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852804#comment-17852804
]
Tim Allison commented on TIKA-4243:
---
Current status on TIKA-4243 -- works up through and including tika
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852098#comment-17852098
]
Tim Allison commented on TIKA-4243:
---
Let me know if there are any objections to heading
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852097#comment-17852097
]
Tim Allison commented on TIKA-4243:
---
K, I chatted briefly with [~ndipiazza] this morning. Unless
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851727#comment-17851727
]
Tim Allison edited comment on TIKA-4243 at 6/3/24 5:10 PM:
---
I spent a bit
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851727#comment-17851727
]
Tim Allison edited comment on TIKA-4243 at 6/3/24 5:02 PM:
---
I spent a bit
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851727#comment-17851727
]
Tim Allison edited comment on TIKA-4243 at 6/3/24 4:45 PM:
---
I spent a bit
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851727#comment-17851727
]
Tim Allison commented on TIKA-4243:
---
I spent a bit of time trying to serialize ParseContext, and I now
[
https://issues.apache.org/jira/browse/TIKA-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-4260.
---
Resolution: Duplicate
Turns out this is a duplicate. Onwards to TIKA-4243!
> Add parse cont
Tim Allison created TIKA-4266:
-
Summary: Improve multithreading and the xml parser pools in
XMLUtils
Key: TIKA-4266
URL: https://issues.apache.org/jira/browse/TIKA-4266
Project: Tika
Issue Type
[
https://issues.apache.org/jira/browse/TIKA-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-4221.
---
Fix Version/s: 3.0.0
2.9.3
Resolution: Fixed
Many thanks to [~ggregory
[
https://issues.apache.org/jira/browse/TIKA-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-4220.
---
Fix Version/s: 3.0.0
2.9.3
Resolution: Fixed
Many thanks to [~ggregory
[
https://issues.apache.org/jira/browse/TIKA-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17850776#comment-17850776
]
Tim Allison commented on TIKA-4265:
---
It doesn't help at all if there's a modification in tika-core, even
[
https://issues.apache.org/jira/browse/TIKA-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17850773#comment-17850773
]
Tim Allison commented on TIKA-4265:
---
I just pushed a demo to {{build-cache}}. This includes
Tim Allison created TIKA-4265:
-
Summary: Consider adding maven build cache extension
Key: TIKA-4265
URL: https://issues.apache.org/jira/browse/TIKA-4265
Project: Tika
Issue Type: Task
Tim Allison created TIKA-4261:
-
Summary: Add attachment type metadata filter
Key: TIKA-4261
URL: https://issues.apache.org/jira/browse/TIKA-4261
Project: Tika
Issue Type: Task
[
https://issues.apache.org/jira/browse/TIKA-4259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-4259.
---
Fix Version/s: 3.0.0
Resolution: Fixed
> Decouple xml parser stuff from ParseCont
[
https://issues.apache.org/jira/browse/TIKA-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849298#comment-17849298
]
Tim Allison commented on TIKA-4260:
---
That PR currently only works on tika-core. More needs to be done
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849288#comment-17849288
]
Tim Allison commented on TIKA-4243:
---
[~ndipiazza], I added parseContext to fetchers and emitters
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849103#comment-17849103
]
Tim Allison edited comment on TIKA-4243 at 5/24/24 1:00 PM:
Proposed basic
Tim Allison created TIKA-4260:
-
Summary: Add parse context to the fetcher interface in 3.x
Key: TIKA-4260
URL: https://issues.apache.org/jira/browse/TIKA-4260
Project: Tika
Issue Type: Task
Tim Allison created TIKA-4259:
-
Summary: Decouple xml parser stuff from ParseContext
Key: TIKA-4259
URL: https://issues.apache.org/jira/browse/TIKA-4259
Project: Tika
Issue Type: Task
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849114#comment-17849114
]
Tim Allison commented on TIKA-4243:
---
I'm going to start working on PRs that will be generally helpful
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849108#comment-17849108
]
Tim Allison commented on TIKA-4243:
---
The downsides we see:
a) if there's agreement to add jackson
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849103#comment-17849103
]
Tim Allison commented on TIKA-4243:
---
Proposed basic roadmap:
Serialize ParseContext as is...
Allow
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849101#comment-17849101
]
Tim Allison commented on TIKA-4243:
---
Fellow devs, in chatting with Nicholas, we're thinking
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-4258.
---
Resolution: Fixed
Just pushed 2.9.2.1/*-latest
Thank you, all!
> Multi-arch support for doc
All,
Many thanks to the many community members who helped figure this out and
get it out the door! As of tika-docker 2.9.2.1, we now have multi-arch
support (and on noble!).
Let us know if there are any surprises. Thank you, again!
Cheers,
Tim
Ref:
[
https://issues.apache.org/jira/browse/TIKA-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847980#comment-17847980
]
Tim Allison commented on TIKA-4255:
---
Thank you for opening this PR. Are you able to add a small unit
[
https://issues.apache.org/jira/browse/TIKA-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-4256.
---
Fix Version/s: 3.0.0
Resolution: Fixed
> Allow inlining of ocr'd text in container docum
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847950#comment-17847950
]
Tim Allison commented on TIKA-4258:
---
I'm sure I'll need to modify the PR when I actually go to run
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847949#comment-17847949
]
Tim Allison commented on TIKA-4258:
---
Let's give it a day for fellow devs to weigh
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847943#comment-17847943
]
Tim Allison commented on TIKA-4258:
---
And here's the full version:
https://hub.docker.com/layers/apache
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847931#comment-17847931
]
Tim Allison commented on TIKA-4243:
---
Separately, but related to this and also to TIKA-4252 -- should we
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847883#comment-17847883
]
Tim Allison commented on TIKA-4258:
---
Helpful links from #infra:
https://infra.apache.org/docker-hub
[
https://issues.apache.org/jira/browse/TIKA-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847882#comment-17847882
]
Tim Allison commented on TIKA-4258:
---
If fellow devs with better knowledge of github actions and docker
Tim Allison created TIKA-4258:
-
Summary: Multi-arch support for docker images
Key: TIKA-4258
URL: https://issues.apache.org/jira/browse/TIKA-4258
Project: Tika
Issue Type: Task
[
https://issues.apache.org/jira/browse/TIKA-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-4256:
--
Description:
For legacy tika, we're inlining all content from embedded files including ocr
content
Tim Allison created TIKA-4256:
-
Summary: Allow inlining of ocr'd text in container document
Key: TIKA-4256
URL: https://issues.apache.org/jira/browse/TIKA-4256
Project: Tika
Issue Type: Task
[
https://issues.apache.org/jira/browse/TIKA-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846697#comment-17846697
]
Tim Allison commented on TIKA-4137:
---
Y, done just now.
> Building current Tika main branch fails un
[
https://issues.apache.org/jira/browse/TIKA-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-4137:
--
Fix Version/s: 2.9.3
> Building current Tika main branch fails under Java 20
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845081#comment-17845081
]
Tim Allison commented on TIKA-4252:
---
fetchRequestMetadata, fetchResponseMetadata?
> PipesClient#proc
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845072#comment-17845072
]
Tim Allison edited comment on TIKA-4252 at 5/9/24 5:14 PM:
---
fetcher.fetch(String
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845072#comment-17845072
]
Tim Allison commented on TIKA-4252:
---
fetcher.fetch(String key, Metadata writeMetadata, Metadata
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845068#comment-17845068
]
Tim Allison commented on TIKA-4252:
---
Should we add an optional Metadata object to the FetchKey? We could
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845062#comment-17845062
]
Tim Allison commented on TIKA-4252:
---
K, but you don't want that coming back and being populated
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845051#comment-17845051
]
Tim Allison commented on TIKA-4252:
---
Or, if you mean that metadata gathered from the fetcher isn't
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845048#comment-17845048
]
Tim Allison commented on TIKA-4252:
---
My initial thought for injecting user metadata was to pass through
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845047#comment-17845047
]
Tim Allison commented on TIKA-4252:
---
I opened this branch: https://github.com/apache/tika/tree/TIKA-4252
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison reopened TIKA-4252:
---
I pointed you to the wrong part of the code ... sorry. The design goal was to
overwrite the extracted
[
https://issues.apache.org/jira/browse/TIKA-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845022#comment-17845022
]
Tim Allison commented on TIKA-4253:
---
This is happening in the unit tests because there are multiple
Tim Allison created TIKA-4253:
-
Summary: Duplicate parsers loaded in AutoDetectParser in 3.x at
least in some unit tests
Key: TIKA-4253
URL: https://issues.apache.org/jira/browse/TIKA-4253
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844998#comment-17844998
]
Tim Allison commented on TIKA-4252:
---
Good catch:
https://github.com/apache/tika/blob/main/tika-core/src
[
https://issues.apache.org/jira/browse/TIKA-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844976#comment-17844976
]
Tim Allison edited comment on TIKA-4250 at 5/9/24 12:59 PM:
libpst issue
[
https://issues.apache.org/jira/browse/TIKA-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844976#comment-17844976
]
Tim Allison commented on TIKA-4250:
---
libpff issue opened: https://github.com/libyal/libpff/issues/128
All,
I'd like to go for another 3.x beta release and then move fairly quickly
to a 3.0.0 release. I was hoping that
https://issues.apache.org/jira/browse/TIKA-4221 would be wrapped up soon.
It hasn't been, but I can add the workaround we did in 2.x.
What do you think?
Any blockers?