[jira] [Commented] (TIKA-1445) Figure out how to add Image metadata extraction to Tesseract parser

2015-01-10 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14272570#comment-14272570
 ] 

Chris A. Mattmann commented on TIKA-1445:
-----------------------------------------

yeesh, caught up on all this great work. Awesome job guys.

> Figure out how to add Image metadata extraction to Tesseract parser
> -------------------------------------------------------------------
>
> Key: TIKA-1445
> URL: https://issues.apache.org/jira/browse/TIKA-1445
> Project: Tika
> Issue Type: Bug
> Components: parser
> Reporter: Chris A. Mattmann
> Assignee: Chris A. Mattmann
> Priority: Blocker
> Fix For: 1.7
>
> Attachments: 03.doc, TIKA-1445.Mattmann.101214.patch.txt, 
> TIKA-1445.Palsulich.102614.patch, TIKA-1445_20150106_tallison.patch, 
> TIKA-1445_tallison_20141027.patch.txt, TIKA-1445_tallison_v2_20141027.patch, 
> TIKA-1445_tallison_v3_20141027.patch
>
>
> Now that Tesseract is the default image parser in Tika for many image types, 
> consider how to add back in the metadata extraction capabilities by the other 
> Image parsers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TIKA-623) Add support for Outlook PST

2015-01-10 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14272497#comment-14272497
 ] 

Tim Allison edited comment on TIKA-623 at 1/10/15 1:53 PM:
----------------------------------------------------------

Gah! Of course. Sorry and thank you. Should we modify the PSTParser so that it 
can take an EmbeddedParserDecorator? Separate parser that would grab the mail 
object from ParseContext instead of handling the inputstream?


was (Author: talli...@mitre.org):
Gah! Of course. Sorry and thank you. Should we modify the PSTParser so that it 
can take an EmbeddedParserDecorator? Inner class parser that would grab the 
mail object from ParseContext instead of handling the inputstream?

> Add support for Outlook PST
> ---------------------------
>
> Key: TIKA-623
> URL: https://issues.apache.org/jira/browse/TIKA-623
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Reporter: Tran Nam Quang
> Fix For: 1.6
>
> Attachments: OutlookPSTParser.java
>
>
> Hello everyone,
> As you might know, Outlook stores its mails and other stuff in a single PST 
> file. There's a relatively new Java library called java-libpst for reading 
> Outlook PST files. It is licensed under the LGPL and available over here: 
> http://code.google.com/p/java-libpst/
> I have tested the library on Outlook 2000 and Outlook 2003, with good 
> results. It would be great if the library could be integrated into Tika.
> Best regards
> Tran Nam Quang





[jira] [Commented] (TIKA-623) Add support for Outlook PST

2015-01-10 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14272497#comment-14272497
 ] 

Tim Allison commented on TIKA-623:
----------------------------------

Gah! Of course. Sorry and thank you. Should we modify the PSTParser so that it 
can take an EmbeddedParserDecorator? Inner class parser that would grab the 
mail object from ParseContext instead of handling the inputstream?



