[
https://issues.apache.org/jira/browse/TIKA-888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242408#comment-13242408
]
Jukka Zitting commented on TIKA-888:
bq. The question is: The parser is still listed in
[
https://issues.apache.org/jira/browse/TIKA-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232975#comment-13232975
]
Jukka Zitting commented on TIKA-878:
Do you have a benchmark that shows this to be a not
[
https://issues.apache.org/jira/browse/TIKA-866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210429#comment-13210429
]
Jukka Zitting commented on TIKA-866:
Actually, scrap the above rationale. The DefaultPar
[
https://issues.apache.org/jira/browse/TIKA-864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210240#comment-13210240
]
Jukka Zitting commented on TIKA-864:
Like in TIKA-865, is this a real measurable perform
[
https://issues.apache.org/jira/browse/TIKA-865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210237#comment-13210237
]
Jukka Zitting commented on TIKA-865:
I'd keep the synchronization on "this", as it also
[
https://issues.apache.org/jira/browse/TIKA-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205453#comment-13205453
]
Jukka Zitting commented on TIKA-860:
bq. Couldn't SecureContentHandler simply get the
[
https://issues.apache.org/jira/browse/TIKA-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205440#comment-13205440
]
Jukka Zitting commented on TIKA-860:
The mentioned configurability is currently only at
[
https://issues.apache.org/jira/browse/TIKA-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198767#comment-13198767
]
Jukka Zitting commented on TIKA-853:
Do you have a virus scanner running? I've seen quit
[
https://issues.apache.org/jira/browse/TIKA-843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189897#comment-13189897
]
Jukka Zitting commented on TIKA-843:
FWIW, I've found it most reliable to convert a date
[
https://issues.apache.org/jira/browse/TIKA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177248#comment-13177248
]
Jukka Zitting commented on TIKA-833:
Thanks, Jeremy!
> POI Daily beta6
[
https://issues.apache.org/jira/browse/TIKA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177183#comment-13177183
]
Jukka Zitting commented on TIKA-833:
It's good that we monitor changes in POI and make s
[
https://issues.apache.org/jira/browse/TIKA-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177088#comment-13177088
]
Jukka Zitting commented on TIKA-830:
Excellent, thanks Nick!
> Tika.par
[
https://issues.apache.org/jira/browse/TIKA-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176702#comment-13176702
]
Jukka Zitting commented on TIKA-830:
The problem here is the basic assumption that the T
[
https://issues.apache.org/jira/browse/TIKA-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175622#comment-13175622
]
Jukka Zitting commented on TIKA-830:
I'm not sure if we should try to support passing a
[
https://issues.apache.org/jira/browse/TIKA-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175591#comment-13175591
]
Jukka Zitting commented on TIKA-832:
bq. I can write it if you want.
That would be grea
[
https://issues.apache.org/jira/browse/TIKA-810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172335#comment-13172335
]
Jukka Zitting commented on TIKA-810:
In revision 1220781 I updated the parser code in PD
[
https://issues.apache.org/jira/browse/TIKA-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166148#comment-13166148
]
Jukka Zitting commented on TIKA-801:
bq. patch attached
Looks good, +1.
[
https://issues.apache.org/jira/browse/TIKA-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165291#comment-13165291
]
Jukka Zitting commented on TIKA-801:
See the org.apache.tika.sax.EmbeddedContentHandler
[
https://issues.apache.org/jira/browse/TIKA-800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163701#comment-13163701
]
Jukka Zitting commented on TIKA-800:
Note that calling TikaInputStream.get(InputStream)
[
https://issues.apache.org/jira/browse/TIKA-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162795#comment-13162795
]
Jukka Zitting commented on TIKA-801:
bq. EndDocumentShieldingContentHandler
IMHO we sho
[
https://issues.apache.org/jira/browse/TIKA-800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162734#comment-13162734
]
Jukka Zitting commented on TIKA-800:
bq. If the POIFS detector (now run by default if
[
https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160919#comment-13160919
]
Jukka Zitting commented on TIKA-623:
bq. Is there some way to proceed here without requi
[
https://issues.apache.org/jira/browse/TIKA-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154154#comment-13154154
]
Jukka Zitting commented on TIKA-786:
Cool, looks good. I was simultaneously approaching
[
https://issues.apache.org/jira/browse/TIKA-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154135#comment-13154135
]
Jukka Zitting commented on TIKA-786:
bq. Do we have any control over the ordering though
[
https://issues.apache.org/jira/browse/TIKA-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154127#comment-13154127
]
Jukka Zitting commented on TIKA-786:
Hmm, I didn't think of such a case when doing the D
[
https://issues.apache.org/jira/browse/TIKA-784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152985#comment-13152985
]
Jukka Zitting commented on TIKA-784:
bq. For now, I've invented mimetypes for the subtyp
[
https://issues.apache.org/jira/browse/TIKA-734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152405#comment-13152405
]
Jukka Zitting commented on TIKA-734:
Did you see the parse() method [1] that returns a j
[
https://issues.apache.org/jira/browse/TIKA-778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150395#comment-13150395
]
Jukka Zitting commented on TIKA-778:
Looks like the problem is coming from the HTML seri
[
https://issues.apache.org/jira/browse/TIKA-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150392#comment-13150392
]
Jukka Zitting commented on TIKA-773:
There's now an ikvm profile in the tika-app POM tha
[
https://issues.apache.org/jira/browse/TIKA-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147342#comment-13147342
]
Jukka Zitting commented on TIKA-774:
Some notes:
* We already have existing places for
[
https://issues.apache.org/jira/browse/TIKA-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147339#comment-13147339
]
Jukka Zitting commented on TIKA-775:
I'd like to have a concrete use case for introducin
[
https://issues.apache.org/jira/browse/TIKA-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144862#comment-13144862
]
Jukka Zitting commented on TIKA-772:
The metacharacters you mention do sound suspicious.
[
https://issues.apache.org/jira/browse/TIKA-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144854#comment-13144854
]
Jukka Zitting commented on TIKA-772:
The test case you added prints out "text/html" for
[
https://issues.apache.org/jira/browse/TIKA-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144849#comment-13144849
]
Jukka Zitting commented on TIKA-772:
The latter method also makes the .html suffix avail
[
https://issues.apache.org/jira/browse/TIKA-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144836#comment-13144836
]
Jukka Zitting commented on TIKA-772:
I piped the files to tika-app to prevent it from se
[
https://issues.apache.org/jira/browse/TIKA-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144763#comment-13144763
]
Jukka Zitting commented on TIKA-772:
Can you attach an example document that illustrates
[
https://issues.apache.org/jira/browse/TIKA-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138545#comment-13138545
]
Jukka Zitting commented on TIKA-763:
I updated the embedded LICENSE files in revisions 1
[
https://issues.apache.org/jira/browse/TIKA-761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135257#comment-13135257
]
Jukka Zitting commented on TIKA-761:
See revision 1188803 for a slightly modified versio
[
https://issues.apache.org/jira/browse/TIKA-761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135167#comment-13135167
]
Jukka Zitting commented on TIKA-761:
bq. the dots will still be replaced with slashes
O
[
https://issues.apache.org/jira/browse/TIKA-761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135129#comment-13135129
]
Jukka Zitting commented on TIKA-761:
I'd simply hardcode the properties file path as
{{
[
https://issues.apache.org/jira/browse/TIKA-761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134151#comment-13134151
]
Jukka Zitting commented on TIKA-761:
bq. It seems - from what I read - these are only av
[
https://issues.apache.org/jira/browse/TIKA-761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134113#comment-13134113
]
Jukka Zitting commented on TIKA-761:
+1 Looks good.
As a possible improvement, as Nick
[
https://issues.apache.org/jira/browse/TIKA-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129971#comment-13129971
]
Jukka Zitting commented on TIKA-755:
Hmm, I looked at the interaction between Tika and T
[
https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129957#comment-13129957
]
Jukka Zitting commented on TIKA-756:
Rough first version committed in revision 1185805.
[
https://issues.apache.org/jira/browse/TIKA-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129637#comment-13129637
]
Jukka Zitting commented on TIKA-754:
I don't think it's necessarily a good idea to make
[
https://issues.apache.org/jira/browse/TIKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126938#comment-13126938
]
Jukka Zitting commented on TIKA-657:
In revision 1183109 I increased the default line an
[
https://issues.apache.org/jira/browse/TIKA-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122639#comment-13122639
]
Jukka Zitting commented on TIKA-513:
Is there a DjVu parser we could use?
[
https://issues.apache.org/jira/browse/TIKA-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122625#comment-13122625
]
Jukka Zitting commented on TIKA-272:
See PDFBOX-577 for some related work in PDFBox.
[
https://issues.apache.org/jira/browse/TIKA-381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122617#comment-13122617
]
Jukka Zitting commented on TIKA-381:
The relevant TagSoup scanner state transitions are
[
https://issues.apache.org/jira/browse/TIKA-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121157#comment-13121157
]
Jukka Zitting commented on TIKA-636:
Do you still see this problem with Tika 0.10? If ye
[
https://issues.apache.org/jira/browse/TIKA-734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121032#comment-13121032
]
Jukka Zitting commented on TIKA-734:
Tika 0.10 is now available. If the problem still oc
[
https://issues.apache.org/jira/browse/TIKA-735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118861#comment-13118861
]
Jukka Zitting commented on TIKA-735:
A parser should always produce valid XHTML output.
52 matches