Tim Allison created TIKA-1731:
-
Summary: Try to integrate java-hwp into Tika
Key: TIKA-1731
URL: https://issues.apache.org/jira/browse/TIKA-1731
Project: Tika
Issue Type: New Feature
[
https://issues.apache.org/jira/browse/TIKA-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734808#comment-14734808
]
Tim Allison commented on TIKA-1728:
---
Opened separate ticket for potential integration: TI
[
https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734812#comment-14734812
]
Tim Allison edited comment on TIKA-1731 at 9/8/15 1:46 PM:
---
One o
[
https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734812#comment-14734812
]
Tim Allison commented on TIKA-1731:
---
One other library [h2tlib|https://sites.google.com/s
[
https://issues.apache.org/jira/browse/TIKA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734821#comment-14734821
]
Tim Allison commented on TIKA-1726:
---
My preference would be for {{getPath()}} and {{creat
[
https://issues.apache.org/jira/browse/TIKA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734823#comment-14734823
]
Tim Allison commented on TIKA-1726:
---
I'll take a look at tika-batch and see what we can m
[
https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734855#comment-14734855
]
Tim Allison commented on TIKA-1731:
---
Opened https://github.com/ddoleye/java-hwp/issues/2
[
https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-1731:
--
Description:
Now that we have detection working for hwp files, it would be great to add a
parser.
[java
[
https://issues.apache.org/jira/browse/TIKA-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734927#comment-14734927
]
Tim Allison commented on TIKA-1513:
---
Hi [~iryndin], I wanted to check in to see if you've
[
https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736663#comment-14736663
]
Tim Allison commented on TIKA-1731:
---
Thank you for the feedback! Are there other options
[
https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736663#comment-14736663
]
Tim Allison edited comment on TIKA-1731 at 9/9/15 11:08 AM:
Tha
[
https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736676#comment-14736676
]
Tim Allison commented on TIKA-1731:
---
[~mungeol], on another note...did hwp ever go the oo
[
https://issues.apache.org/jira/browse/TIKA-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736959#comment-14736959
]
Tim Allison commented on TIKA-1732:
---
Odd...What happens if you call TikaInputStream.get()
[
https://issues.apache.org/jira/browse/TIKA-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736959#comment-14736959
]
Tim Allison edited comment on TIKA-1732 at 9/9/15 2:56 PM:
---
Odd..
[
https://issues.apache.org/jira/browse/TIKA-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737046#comment-14737046
]
Tim Allison commented on TIKA-1732:
---
Any chance there's an old version of POI on your cla
[
https://issues.apache.org/jira/browse/TIKA-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737440#comment-14737440
]
Tim Allison commented on TIKA-1732:
---
NP. Thank you for closing the loop!
Your test doc
[
https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738638#comment-14738638
]
Tim Allison commented on TIKA-1731:
---
Thank you for looking into this.
bq. can Tika+POI
[
https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738660#comment-14738660
]
Tim Allison commented on TIKA-1731:
---
Great. Thank you so much! It would be helpful to k
[
https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738660#comment-14738660
]
Tim Allison edited comment on TIKA-1731 at 9/10/15 12:26 PM:
-
G
[
https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738663#comment-14738663
]
Tim Allison commented on TIKA-1731:
---
[~mungeol], out of curiosity, what is your gut feeli
[
https://issues.apache.org/jira/browse/TIKA-1733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738809#comment-14738809
]
Tim Allison commented on TIKA-1733:
---
Thank you for submitting a document that triggers th
[
https://issues.apache.org/jira/browse/TIKA-1733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739005#comment-14739005
]
Tim Allison commented on TIKA-1733:
---
Can't figure out what's going wrong, I've opened:
h
[
https://issues.apache.org/jira/browse/TIKA-1733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739022#comment-14739022
]
Tim Allison commented on TIKA-1733:
---
And, y, in Tika 1.4 we grabbed footer text with this
[
https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14740625#comment-14740625
]
Tim Allison commented on TIKA-1731:
---
Based on only a very cursory look at the examples+sp
[
https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738660#comment-14738660
]
Tim Allison edited comment on TIKA-1731 at 9/16/15 10:52 AM:
-
G
[
https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747292#comment-14747292
]
Tim Allison commented on TIKA-1731:
---
Please don't stop watching. We can use your help!
[
https://issues.apache.org/jira/browse/TIKA-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747338#comment-14747338
]
Tim Allison commented on TIKA-1607:
---
Thank you, [~rgauss], for your thoughtful responses
[
https://issues.apache.org/jira/browse/TIKA-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747338#comment-14747338
]
Tim Allison edited comment on TIKA-1607 at 9/16/15 11:31 AM:
-
T
[
https://issues.apache.org/jira/browse/TIKA-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-1736:
--
Description:
One file in our Common Crawl stash demonstrates a Bouncy Castle version
conflict...incompat
Tim Allison created TIKA-1736:
-
Summary: Bouncy Castle version binary incompatibility
Key: TIKA-1736
URL: https://issues.apache.org/jira/browse/TIKA-1736
Project: Tika
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TIKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900508#comment-14900508
]
Tim Allison commented on TIKA-1737:
---
Thank you for raising this issue. I don't think we'
[
https://issues.apache.org/jira/browse/TIKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902522#comment-14902522
]
Tim Allison commented on TIKA-1737:
---
Thank you, [~tilman]!
> PDFBox 1.8.10 is still a ba
[
https://issues.apache.org/jira/browse/TIKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902528#comment-14902528
]
Tim Allison commented on TIKA-1737:
---
bq. there were many more that just had a single lin
[
https://issues.apache.org/jira/browse/TIKA-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902567#comment-14902567
]
Tim Allison commented on TIKA-1734:
---
About to commit, unless you'd like to. :)
> Use jav
[
https://issues.apache.org/jira/browse/TIKA-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902591#comment-14902591
]
Tim Allison commented on TIKA-1740:
---
How about we store a list of pairs instead of Metad
[
https://issues.apache.org/jira/browse/TIKA-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902592#comment-14902592
]
Tim Allison commented on TIKA-1740:
---
Oops. Nick beat me to it. That was plan B.
[~gagr
[
https://issues.apache.org/jira/browse/TIKA-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1734.
---
Resolution: Fixed
r1704620.
Thank you, [~kunda]!
> Use java.nio.file.Path in TemporaryResources
> ---
[
https://issues.apache.org/jira/browse/TIKA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902613#comment-14902613
]
Tim Allison commented on TIKA-1726:
---
Thank you, [~kkrugler]. [~kunda], is there enough c
[
https://issues.apache.org/jira/browse/TIKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902622#comment-14902622
]
Tim Allison commented on TIKA-1737:
---
Could we have done something at the Tika level to ca
[
https://issues.apache.org/jira/browse/TIKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902659#comment-14902659
]
Tim Allison commented on TIKA-1737:
---
bq. dating back as far as 1992
Y, I just confirmed
[
https://issues.apache.org/jira/browse/TIKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902835#comment-14902835
]
Tim Allison commented on TIKA-1737:
---
See PDFBOX-2986 for a resource leak discovered throu
[
https://issues.apache.org/jira/browse/TIKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902528#comment-14902528
]
Tim Allison edited comment on TIKA-1737 at 9/22/15 4:16 PM:
bq.
[
https://issues.apache.org/jira/browse/TIKA-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904360#comment-14904360
]
Tim Allison commented on TIKA-1742:
---
The HORROR! If it were a second rate conference, it
[
https://issues.apache.org/jira/browse/TIKA-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904371#comment-14904371
]
Tim Allison commented on TIKA-1743:
---
Oh, I wish I had time to finish off TIKA-1657 and TI
[
https://issues.apache.org/jira/browse/TIKA-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904388#comment-14904388
]
Tim Allison commented on TIKA-1744:
---
Thank you, [~kunda]!
I think this was part of the
[
https://issues.apache.org/jira/browse/TIKA-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904388#comment-14904388
]
Tim Allison edited comment on TIKA-1744 at 9/23/15 12:06 PM:
-
T
Tim Allison created TIKA-1747:
-
Summary: Change file->path in tika-batch throughout
Key: TIKA-1747
URL: https://issues.apache.org/jira/browse/TIKA-1747
Project: Tika
Issue Type: Sub-task
Tim Allison created TIKA-1748:
-
Summary: Upgrade to POI 3.13-final when available
Key: TIKA-1748
URL: https://issues.apache.org/jira/browse/TIKA-1748
Project: Tika
Issue Type: Task
Re
[
https://issues.apache.org/jira/browse/TIKA-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906238#comment-14906238
]
Tim Allison commented on TIKA-1742:
---
[~tilman] fixed this over in PDFBox 1.8.x (already n
[
https://issues.apache.org/jira/browse/TIKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907914#comment-14907914
]
Tim Allison commented on TIKA-1667:
---
Which issue?
> Upgrade to POI 3.13-beta1 when avail
[
https://issues.apache.org/jira/browse/TIKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907914#comment-14907914
]
Tim Allison edited comment on TIKA-1667 at 9/25/15 11:02 AM:
-
W
[
https://issues.apache.org/jira/browse/TIKA-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907917#comment-14907917
]
Tim Allison commented on TIKA-1753:
---
Y. I defer to [~lehmi] on PDFBOX-2991 for whether t
[
https://issues.apache.org/jira/browse/TIKA-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907925#comment-14907925
]
Tim Allison commented on TIKA-1657:
---
Thank you, [~gagravarr], for moving this forward...a
[
https://issues.apache.org/jira/browse/TIKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907988#comment-14907988
]
Tim Allison commented on TIKA-1667:
---
Thank you for raising this. I agree, this is probab
[
https://issues.apache.org/jira/browse/TIKA-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933175#comment-14933175
]
Tim Allison commented on TIKA-1736:
---
Should be fixed when
[2.1.1|https://sourceforge.net
[
https://issues.apache.org/jira/browse/TIKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933186#comment-14933186
]
Tim Allison commented on TIKA-1748:
---
As [~kunda] pointed out, you're using a future versi
[
https://issues.apache.org/jira/browse/TIKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933186#comment-14933186
]
Tim Allison edited comment on TIKA-1748 at 9/28/15 11:40 AM:
-
A
[
https://issues.apache.org/jira/browse/TIKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-1748:
--
Attachment: TIKA-1748.patch
Y, not too much work. All tests pass, what could possibly go wrong?
I added
[
https://issues.apache.org/jira/browse/TIKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936698#comment-14936698
]
Tim Allison commented on TIKA-1748:
---
Thank you! Will commit today.
> Upgrade to POI 3.1
[
https://issues.apache.org/jira/browse/TIKA-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison reassigned TIKA-1747:
-
Assignee: Tim Allison
> Change file->path in tika-batch throughout
> -
[
https://issues.apache.org/jira/browse/TIKA-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison reassigned TIKA-1744:
-
Assignee: Tim Allison
> Use java.nio.file.Path in TikaInputStream
> --
[
https://issues.apache.org/jira/browse/TIKA-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1744.
---
Resolution: Fixed
r1706056
Thank you, [~kunda]!
> Use java.nio.file.Path in TikaInputStream
> ---
Tim Allison created TIKA-1754:
-
Summary: tika-batch's FileListCrawler truncates the first
character of the fileList if the root is e.g. X:
Key: TIKA-1754
URL: https://issues.apache.org/jira/browse/TIKA-1754
[
https://issues.apache.org/jira/browse/TIKA-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1747.
---
Resolution: Fixed
r1706060
> Change file->path in tika-batch throughout
>
[
https://issues.apache.org/jira/browse/TIKA-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1754.
---
Resolution: Fixed
Fixed with TIKA-1747.
> tika-batch's FileListCrawler truncates the first character o
[
https://issues.apache.org/jira/browse/TIKA-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14937440#comment-14937440
]
Tim Allison commented on TIKA-1754:
---
Y, probably. This particular issue is fixed for now.
[
https://issues.apache.org/jira/browse/TIKA-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison reassigned TIKA-1707:
-
Assignee: Tim Allison
> Upgrade to Apache POI 3.13 Beta 2
> -
>
>
[
https://issues.apache.org/jira/browse/TIKA-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1707.
---
Resolution: Fixed
r1706079.
Thank you, [~kiwiwings]!
Apologies to you and [~gagravarr] for not rememb
[
https://issues.apache.org/jira/browse/TIKA-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1742.
---
Resolution: Fixed
r1706086
> StackOverflowError parsing a PDF with ExtractInlineImages=true
>
[
https://issues.apache.org/jira/browse/TIKA-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14937720#comment-14937720
]
Tim Allison commented on TIKA-1742:
---
Thank you, [~nated], for raising this, and thank you
[
https://issues.apache.org/jira/browse/TIKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1748.
---
Resolution: Fixed
> Upgrade to POI 3.13-final when available
>
Tim Allison created TIKA-1755:
-
Summary: Make ppt and pptx paragraph/div breaks more consistent
Key: TIKA-1755
URL: https://issues.apache.org/jira/browse/TIKA-1755
Project: Tika
Issue Type: Impro
[
https://issues.apache.org/jira/browse/TIKA-1755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14938273#comment-14938273
]
Tim Allison commented on TIKA-1755:
---
Current patch gets us this with PPTX:
{noformat}
[
https://issues.apache.org/jira/browse/TIKA-1755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-1755:
--
Attachment: TIKA-1755.patch
Initial patch
> Make ppt and pptx paragraph/div breaks more consistent
> ---
[
https://issues.apache.org/jira/browse/TIKA-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14938788#comment-14938788
]
Tim Allison commented on TIKA-1744:
---
Doh! Thank you.
> Use java.nio.file.Path in TikaIn
[
https://issues.apache.org/jira/browse/TIKA-1755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14938880#comment-14938880
]
Tim Allison commented on TIKA-1755:
---
Y, you've got plenty of bigger fish to fry, and the
[
https://issues.apache.org/jira/browse/TIKA-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison reassigned TIKA-1757:
-
Assignee: Tim Allison
> tika-batch tests fail on systems with whitespace or special chars in folde
[
https://issues.apache.org/jira/browse/TIKA-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14938893#comment-14938893
]
Tim Allison commented on TIKA-1757:
---
Sorry about that. Will fix shortly. Thank you!
>
[
https://issues.apache.org/jira/browse/TIKA-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14938913#comment-14938913
]
Tim Allison commented on TIKA-1757:
---
Y, won't be able to fix for a few hours, but I can r
[
https://issues.apache.org/jira/browse/TIKA-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1757.
---
Resolution: Fixed
Mea culpa. Tests pass for me on Windows with space in path and Linux.
Let me know i
[
https://issues.apache.org/jira/browse/TIKA-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1758.
---
Resolution: Fixed
r1706178. Thank you, [~kunda].
> BatchCommandLineBuilder fails on systems with whit
[
https://issues.apache.org/jira/browse/TIKA-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939102#comment-14939102
]
Tim Allison edited comment on TIKA-1758 at 10/1/15 12:27 AM:
-
r
[
https://issues.apache.org/jira/browse/TIKA-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1756.
---
Resolution: Fixed
r1706242.
Thank you, [~thetaphi]!
> Update forbiddenapis to v2.0
>
[
https://issues.apache.org/jira/browse/TIKA-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939822#comment-14939822
]
Tim Allison commented on TIKA-1744:
---
committed r1706249. Thank you!
> Use java.nio.file
Tim Allison created TIKA-1759:
-
Summary: Extract contributor metadata from supporting file formats
Key: TIKA-1759
URL: https://issues.apache.org/jira/browse/TIKA-1759
Project: Tika
Issue Type: Im
[
https://issues.apache.org/jira/browse/TIKA-1759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-1759:
--
Description:
Many common file formats store information about contributors (broadly
speaking) to a docum
[
https://issues.apache.org/jira/browse/TIKA-1759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939860#comment-14939860
]
Tim Allison commented on TIKA-1759:
---
Question #1: are there any other types of embedded c
[
https://issues.apache.org/jira/browse/TIKA-1759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-1759:
--
Attachment: contributors.zip
I've created test files for MSOffice docs. If anyone would be willing to ad
[
https://issues.apache.org/jira/browse/TIKA-1759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939881#comment-14939881
]
Tim Allison commented on TIKA-1759:
---
If we don't want to call all of the above {{dc:contr
[
https://issues.apache.org/jira/browse/TIKA-1759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940009#comment-14940009
]
Tim Allison commented on TIKA-1759:
---
[~tilman], I think we're good with PDAnnotationMarku
[
https://issues.apache.org/jira/browse/TIKA-1759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940091#comment-14940091
]
Tim Allison commented on TIKA-1759:
---
Y, it is. I think it would be useful to try to get
[
https://issues.apache.org/jira/browse/TIKA-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940981#comment-14940981
]
Tim Allison commented on TIKA-1760:
---
Thank you for raising this issue. I'm not sure ther
[
https://issues.apache.org/jira/browse/TIKA-1759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940983#comment-14940983
]
Tim Allison commented on TIKA-1759:
---
Will do. Thank you!
> Extract contributor metadata
[
https://issues.apache.org/jira/browse/TIKA-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison reassigned TIKA-1761:
-
Assignee: Tim Allison
> Error Parsing PPT (97-2003) files with password protection against
> modi
[
https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14942248#comment-14942248
]
Tim Allison commented on TIKA-1285:
---
Completely agree. If I update the PDFBox 2.0 branch
[
https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943339#comment-14943339
]
Tim Allison commented on TIKA-1285:
---
Thank you, [~b...@benmccann.com]! The more eyes we
[
https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943339#comment-14943339
]
Tim Allison edited comment on TIKA-1285 at 10/5/15 1:14 PM:
Tha
[
https://issues.apache.org/jira/browse/TIKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944295#comment-14944295
]
Tim Allison commented on TIKA-1737:
---
[~alanbur], over on TIKA-1285, I posted a link for m
[
https://issues.apache.org/jira/browse/TIKA-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946775#comment-14946775
]
Tim Allison commented on TIKA-1764:
---
Ha, I've been wanting to do this for a while.
I'm n
Tim Allison created TIKA-1765:
-
Summary: Some doc and docx store multiple authors as semi-colon
delimited list
Key: TIKA-1765
URL: https://issues.apache.org/jira/browse/TIKA-1765
Project: Tika
I