Tamara created TIKA-1533:
Summary: PDF parse failing to capture right order of text (2
columns)
Key: TIKA-1533
URL: https://issues.apache.org/jira/browse/TIKA-1533
Project: Tika
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TIKA-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295159#comment-14295159
]
Tim Allison commented on TIKA-1533:
---
In the first document, printed page 303/pdf page 152
[
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295165#comment-14295165
]
Tim Allison commented on TIKA-1511:
---
Ok, great. We just added the RecursiveParserWrapper
[
https://issues.apache.org/jira/browse/TIKA-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tamara updated TIKA-1533:
-
Description:
When I am converting a document with two columns, the order of the columns is
inverted in the text fi
[
https://issues.apache.org/jira/browse/TIKA-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295290#comment-14295290
]
Tamara commented on TIKA-1533:
--
No, not yet. Only tika 1.6, 1.7 and the PDFXStream.
I have a
[
https://issues.apache.org/jira/browse/TIKA-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295340#comment-14295340
]
Tim Allison commented on TIKA-1533:
---
I'm getting the same "mis"-ordering with PDFBox 1.8.
[
https://issues.apache.org/jira/browse/TIKA-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295376#comment-14295376
]
Tyler Palsulich commented on TIKA-1521:
---
Ah, I missed that comment. The test also pas
[
https://issues.apache.org/jira/browse/TIKA-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295376#comment-14295376
]
Tyler Palsulich edited comment on TIKA-1521 at 1/28/15 4:46 PM:
-
[
https://issues.apache.org/jira/browse/TIKA-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295380#comment-14295380
]
Tim Allison commented on TIKA-1521:
---
That's why I opened COMPRESS-299. :) Not sure, yet
[
https://issues.apache.org/jira/browse/TIKA-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295460#comment-14295460
]
Tamara commented on TIKA-1533:
--
Thank you for the help Tim, next time I will post directly to
[
https://issues.apache.org/jira/browse/TIKA-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295459#comment-14295459
]
Tamara commented on TIKA-1533:
--
Thank you for the help Tim, next time I will post directly to
[
https://issues.apache.org/jira/browse/TIKA-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tamara updated TIKA-1533:
-
Comment: was deleted
(was: Thank you for the help Tim, next time I will post directly to them.
Here is the issue o
[
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tyler Palsulich reopened TIKA-1518:
---
Reopening as suggested above.
1. I'm thinking we can place the Dockerfile in trunk/tika-server? Th
[
https://issues.apache.org/jira/browse/TIKA-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295489#comment-14295489
]
Tim Allison commented on TIKA-1533:
---
Always happy to pass the buck. ;) But seriously, th
[
https://issues.apache.org/jira/browse/TIKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295498#comment-14295498
]
Nick Burch commented on TIKA-1532:
--
For the mimetype part, do you have a small sample file
[
https://issues.apache.org/jira/browse/TIKA-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295499#comment-14295499
]
Tim Allison commented on TIKA-1521:
---
Take a look at [comment
14295473|https://issues.apa
[
https://issues.apache.org/jira/browse/TIKA-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295499#comment-14295499
]
Tim Allison edited comment on TIKA-1521 at 1/28/15 6:02 PM:
Tak
[
https://issues.apache.org/jira/browse/TIKA-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295619#comment-14295619
]
Tim Allison commented on TIKA-1521:
---
Added conditional testing in r1655431. Build worked
Tim Allison created TIKA-1534:
-
Summary: Upgrade to Commons Compress 1.9
Key: TIKA-1534
URL: https://issues.apache.org/jira/browse/TIKA-1534
Project: Tika
Issue Type: Improvement
Repo
[
https://issues.apache.org/jira/browse/TIKA-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1534.
---
Resolution: Fixed
r1655433.
> Upgrade to Commons Compress 1.9
> ---
>
>
[
https://issues.apache.org/jira/browse/TIKA-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295629#comment-14295629
]
Tim Allison commented on TIKA-1534:
---
While waiting for 1.10, may as well upgrade to 1.9
[
https://issues.apache.org/jira/browse/TIKA-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295750#comment-14295750
]
Hudson commented on TIKA-1521:
--
SUCCESS: Integrated in tika-trunk-jdk1.7 #457 (See
[https://b
[
https://issues.apache.org/jira/browse/TIKA-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295749#comment-14295749
]
Hudson commented on TIKA-1534:
--
SUCCESS: Integrated in tika-trunk-jdk1.7 #457 (See
[https://b
[
https://issues.apache.org/jira/browse/TIKA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1329.
---
Resolution: Fixed
r1655449.
Added a few examples.
> Add RecursiveParserWrapper aka Jukka's (and Nick'
[
https://issues.apache.org/jira/browse/TIKA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison reopened TIKA-1329:
---
Wait, do I need to update the webpage, too? Or is that done automatically from
tika-examples?
> Add Recu
[
https://issues.apache.org/jira/browse/TIKA-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295791#comment-14295791
]
Hudson commented on TIKA-1521:
--
SUCCESS: Integrated in tika-trunk-jdk1.6 #442 (See
[https://b
[
https://issues.apache.org/jira/browse/TIKA-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295789#comment-14295789
]
Hudson commented on TIKA-1534:
--
SUCCESS: Integrated in tika-trunk-jdk1.6 #442 (See
[https://b
[
https://issues.apache.org/jira/browse/TIKA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295790#comment-14295790
]
Hudson commented on TIKA-1329:
--
SUCCESS: Integrated in tika-trunk-jdk1.6 #442 (See
[https://b
[
https://issues.apache.org/jira/browse/TIKA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295800#comment-14295800
]
Nick Burch commented on TIKA-1329:
--
Website still needs updating - just use the snippet to
[
https://issues.apache.org/jira/browse/TIKA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295819#comment-14295819
]
Hudson commented on TIKA-1329:
--
SUCCESS: Integrated in tika-trunk-jdk1.7 #458 (See
[https://b
Luke sh created TIKA-1535:
-
Summary: Inheritance modification for the class MIMETypes
Key: TIKA-1535
URL: https://issues.apache.org/jira/browse/TIKA-1535
Project: Tika
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luke sh updated TIKA-1517:
--
Summary: MIME type selection with probability (was: MIME type detection
with probability)
> MIME type selection
[
https://issues.apache.org/jira/browse/TIKA-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295922#comment-14295922
]
Luke sh commented on TIKA-1535:
---
TIKA-1517, the mime type selection mechanism with probabilit
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295928#comment-14295928
]
Luke sh commented on TIKA-1517:
---
the probability selection will inherit the class MIMETypes,
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295928#comment-14295928
]
Luke sh edited comment on TIKA-1517 at 1/28/15 11:06 PM:
-
the proba
[
https://issues.apache.org/jira/browse/TIKA-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296084#comment-14296084
]
Tyler Palsulich commented on TIKA-1535:
---
Maybe someone else can comment on this too.
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296103#comment-14296103
]
Tyler Palsulich commented on TIKA-1517:
---
Hi [~Lukeliush]. Thanks for raising this ide
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296103#comment-14296103
]
Tyler Palsulich edited comment on TIKA-1517 at 1/29/15 12:04 AM:
[
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296129#comment-14296129
]
Lewis John McGibbney commented on TIKA-1423:
I am working on this and think I h
Hi Professor and all,
A Bayesian or machine learning Detector is different from the Bayesian Selection
mechanism reported in TIKA-1517.
It would make sense to implement a machine learning algorithm in a separate
Detector class. I have not gone too far with this design thought, as I am still
on th
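To make the distinction in the message above concrete, here is a minimal sketch of what "selection with probability" could look like: existing detection methods each supply evidence, and Bayes' rule picks the most probable candidate type. This is not the TIKA-1517 implementation. The class name MimeSelectionSketch, the methods posterior and select, the independence (naive Bayes) assumption, and all of the numbers are hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Tika code: Bayesian selection among candidate MIME types.
public class MimeSelectionSketch {

    /**
     * Returns P(type | evidence) for each candidate type.
     * prior:       P(type) for each candidate.
     * likelihoods: for each candidate, P(evidence_i | type) for every evidence
     *              source (for example magic bytes, file extension).
     * Evidence sources are assumed independent, which is a simplification.
     */
    public static Map<String, Double> posterior(Map<String, Double> prior,
                                                Map<String, double[]> likelihoods) {
        Map<String, Double> unnormalized = new LinkedHashMap<>();
        double total = 0.0;
        for (Map.Entry<String, Double> entry : prior.entrySet()) {
            double score = entry.getValue();
            for (double likelihood : likelihoods.get(entry.getKey())) {
                score *= likelihood;
            }
            unnormalized.put(entry.getKey(), score);
            total += score;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("No candidate has non-zero probability");
        }
        // Normalize so the posteriors sum to 1.
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : unnormalized.entrySet()) {
            result.put(entry.getKey(), entry.getValue() / total);
        }
        return result;
    }

    /** Selects the candidate with the highest posterior probability. */
    public static String select(Map<String, Double> posterior) {
        String best = null;
        double bestProbability = -1.0;
        for (Map.Entry<String, Double> entry : posterior.entrySet()) {
            if (entry.getValue() > bestProbability) {
                bestProbability = entry.getValue();
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up numbers: equal priors, two evidence sources per candidate.
        Map<String, Double> prior = new LinkedHashMap<>();
        prior.put("application/pdf", 0.5);
        prior.put("text/plain", 0.5);

        Map<String, double[]> likelihoods = new LinkedHashMap<>();
        likelihoods.put("application/pdf", new double[] {0.9, 0.2});
        likelihoods.put("text/plain", new double[] {0.1, 0.8});

        Map<String, Double> result = posterior(prior, likelihoods);
        System.out.println(result);
        System.out.println("Selected: " + select(result));
    }
}
```

A machine learning Detector, by contrast, would learn a mapping from file content to a type, whereas the sketch above only arbitrates between answers that other detection methods have already produced.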
Hi Luke,
-Original Message-
From: Luke
Date: Wednesday, January 28, 2015 at 7:15 PM
To: Chris Mattmann, Chris Mattmann, "dev@tika.apache.org"
Cc: NSF Polar CyberInfrastructure DR Students
Subject: RE: [jira] [Commented] (TIKA-1535) Inheritance modification for
the class MIMETypes
>H
Thanks professor for the prompt and kind response, will keep you updated on the
progress and findings.
-Original Message-
From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov]
Sent: Wednesday, January 28, 2015 8:17 PM
To: Luke; 'Christian Alan Mattmann'; dev@tika.apache.o
[
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296439#comment-14296439
]
Chris A. Mattmann commented on TIKA-1518:
-
Thanks Tyler. Can you raise #2 on infras
[
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296439#comment-14296439
]
Chris A. Mattmann edited comment on TIKA-1518 at 1/29/15 6:15 AM:
---
Dear Gabriele,
Thanks for your question. It should be sent to dev@tika.apache.org
(moving dev-ow...@tika.apache.org to BCC).
I’ll take a look tomorrow if someone else hasn’t answered yet.
Cheers,
Chris
++
Chris Mattmann, Ph.D.
Chi
[
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated TIKA-1423:
---
Attachment: TIKA-1423v2.patch
Patch for trunk which passes all tests including issues e
[
https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296541#comment-14296541
]
Lewis John McGibbney edited comment on TIKA-1423 at 1/29/15 7:54 AM: