entation:
https://cwiki.apache.org/confluence/display/TIKA/Configuring+Parsers+At+Parse+Time+in+tika-server
Please let me know if you have any questions or want write access to improve
the documentation!
On Wed, Feb 15, 2023 at 11:07 AM Julien Massiera
wrote:
>
> Hi Tim,
>
> bounc
Hi Tim,
bouncing back on our mail thread, could you share more documentation on how to
use the header to configure the PDFParser on the fly?
Thanks,
Julien
-Original Message-
From: Julien Massiera
Sent: Friday, February 3, 2023 13:08
To: dev@tika.apache.org
Subject: RE: A
Hi Tim,
The NER Parse config via headers like the PDFParserConfig sounds like an
interesting approach, but I only discovered that feature thanks to your reply.
I tried to find documentation about it; unfortunately, the only thing I found
was a TBD note on that page
https://cwiki.apache.or
Julien Massiera created TIKA-3958:
-
Summary: Add tika-parser-nlp-package to release artifacts
Key: TIKA-3958
URL: https://issues.apache.org/jira/browse/TIKA-3958
Project: Tika
Issue Type
Hi Tim,
First, I would like to wish you all the best for 2023!
I am writing because I am using NER parsers with Tika Server, but to do so,
I had to build the NER package myself from the Tika repository. Indeed, for
Tika Server 2.x, I did not find any NER pre-made package to add to the
cla
Hi Tim,
+1 for new Tika releases on my side
Regards,
Julien
On 2022/08/29 18:24:40 Tim Allison wrote:
> Are we in decent shape to start the release processes for 1.x and 2.x?
>
> Maybe start 1.x this week and 2.x next week?
>
> Any blockers?
>
> Best,
>
> Tim
>
> On Wed, Aug 1
Hi Tim,
on our side we have already dropped Java 8 and only support Java 11, so it
would not be a problem for us
Cheers,
Julien
-Original Message-
From: Tim Allison
Sent: Friday, March 25, 2022 15:47
To:
Subject: [DISCUSS] support for Java 8?
All,
I'm somewhat interested in moving
[
https://issues.apache.org/jira/browse/TIKA-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511180#comment-17511180
]
Julien Massiera commented on TIKA-3695:
---
It is a good idea to set a hard limi
[
https://issues.apache.org/jira/browse/TIKA-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510715#comment-17510715
]
Julien Massiera commented on TIKA-3695:
---
Indeed, I get the following result fil
[
https://issues.apache.org/jira/browse/TIKA-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510706#comment-17510706
]
Julien Massiera commented on TIKA-3695:
---
Yes, I was about to test and it is
[
https://issues.apache.org/jira/browse/TIKA-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509316#comment-17509316
]
Julien Massiera commented on TIKA-3695:
---
Concerning the X-TIKA:EXCEPTION
[
https://issues.apache.org/jira/browse/TIKA-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509315#comment-17509315
]
Julien Massiera commented on TIKA-3695:
---
I think that the minimum is to count
[
https://issues.apache.org/jira/browse/TIKA-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509249#comment-17509249
]
Julien Massiera commented on TIKA-3695:
---
Thanks Tim, with the conf you provid
1.x into config.xml in 2.x, I didn't imagine users would want more complexity.
I can add them back if you want them.
On Fri, Mar 18, 2022 at 2:17 PM Julien Massiera
wrote:
>
> Hi,
>
>
>
> I am currently testing the current trunk Tika server-standard 2.3.1.
> Everyt
[
https://issues.apache.org/jira/browse/TIKA-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509041#comment-17509041
]
Julien Massiera commented on TIKA-3695:
---
I am not sure I understood how it work
Hi,
I am currently testing the current trunk Tika server-standard 2.3.1.
Everything works fine with the config tika-config.xml except for three
parameters: pingTimeoutMillis, pingPulseMillis and javaHome.
Indeed, when I try to set them in the config file as specified in the
documentation
[
https://issues.apache.org/jira/browse/TIKA-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17508129#comment-17508129
]
Julien Massiera commented on TIKA-3695:
---
I took a better look at the code
[
https://issues.apache.org/jira/browse/TIKA-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17507662#comment-17507662
]
Julien Massiera commented on TIKA-3695:
---
[~tallison] concerning that point &
Julien Massiera created TIKA-3695:
-
Summary: LimitingMetadataFilter
Key: TIKA-3695
URL: https://issues.apache.org/jira/browse/TIKA-3695
Project: Tika
Issue Type: New Feature
Hi Tim,
We identified cases where PDF files may contain abnormally big metadata
(several MB, be it for the metadata values, the metadata names, but also for
the total amount of metadata). Some time ago, I proposed the creation of a
"writeLimit" header in Tika Server (and you accepted to implemen
[
https://issues.apache.org/jira/browse/TIKA-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338281#comment-17338281
]
Julien Massiera commented on TIKA-3372:
---
[~tallison] the fix works!
So for
[
https://issues.apache.org/jira/browse/TIKA-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17337445#comment-17337445
]
Julien Massiera edited comment on TIKA-3372 at 4/30/21, 3:1
[
https://issues.apache.org/jira/browse/TIKA-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17337445#comment-17337445
]
Julien Massiera commented on TIKA-3372:
---
[~tallison] so I tested on a 1.27 b
[
https://issues.apache.org/jira/browse/TIKA-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333168#comment-17333168
]
Julien Massiera commented on TIKA-3372:
---
Concerning the behavior you desc
[
https://issues.apache.org/jira/browse/TIKA-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333027#comment-17333027
]
Julien Massiera edited comment on TIKA-3372 at 4/27/21, 8:0
[
https://issues.apache.org/jira/browse/TIKA-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333027#comment-17333027
]
Julien Massiera commented on TIKA-3372:
---
[~tallison] here is my use case:
I
[
https://issues.apache.org/jira/browse/TIKA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304306#comment-17304306
]
Julien Massiera commented on TIKA-3325:
---
Indeed, I see no reason one would wan
[
https://issues.apache.org/jira/browse/TIKA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302577#comment-17302577
]
Julien Massiera commented on TIKA-3325:
---
[~tallison] yes please, it would be re
Julien Massiera created TIKA-3325:
-
Summary: Add header to limit extracted content
Key: TIKA-3325
URL: https://issues.apache.org/jira/browse/TIKA-3325
Project: Tika
Issue Type: Improvement
Julien Massiera created TIKA-2881:
-
Summary: Obsolete MircrosoftTranslator implementation
Key: TIKA-2881
URL: https://issues.apache.org/jira/browse/TIKA-2881
Project: Tika
Issue Type: Bug
Julien Massiera created TIKA-2753:
-
Summary: ChildProcess does not use the JAVA_HOME
Key: TIKA-2753
URL: https://issues.apache.org/jira/browse/TIKA-2753
Project: Tika
Issue Type: Bug
Julien Massiera created TIKA-2371:
-
Summary: Check properties presence - PDFParser
Key: TIKA-2371
URL: https://issues.apache.org/jira/browse/TIKA-2371
Project: Tika
Issue Type: Improvement
32 matches