[ 
https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920692#comment-13920692
 ] 

Hong-Thai Nguyen edited comment on TIKA-623 at 3/5/14 9:30 AM:
---------------------------------------------------------------

java-libpst-0.7 has been uploaded to the OSS Sonatype Nexus repository:
https://issues.sonatype.org/browse/OSSRH-8965
If there's no objection, I'll refactor the attached parser so that it produces output like this:
{code}
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="Content-Length" content="271360" />
<meta name="isValid" content="true" />
<meta name="Content-Type" content="application/vnd.ms-outlook" />
<title></title>
</head>
<body>
        <div class="email-folder">
                <h1>Début du fichier de données Outlook</h1>
                <div class="email-entry">
                        <h1>&lt;530d9cac.5080...@gmail.com&gt;</h1>
                        <meta subject="Re: Feature Generators" />
                        <meta internetMessageId="&lt;530d9cac.5080...@gmail.com&gt;" />
                        <meta descriptorNodeId="2097188" />
                        <meta lastModificationTime="1393418263291" />
                        <meta senderName="Jörn Kottmann" />
                        <meta senderEmailAddress="kottm...@gmail.com" />
                        <meta recipients="No recipients table!" />
                        <p>mail content</p>
                </div>
                <div class="email-folder">
                        <h1>Éléments supprimés</h1>
                </div>
        </div>
        <div class="email-folder">
                <h1>Racine (pour la recherche)</h1>
        </div>
        <div class="email-folder">
                <h1>SPAM Search Folder 2</h1>
        </div>
</body>
</html>
{code}
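
The nesting shown above (a div.email-folder containing div.email-entry elements and further div.email-folder elements) could be produced by a simple recursive walk. The following is only a minimal sketch under stated assumptions, not the attached parser: the Folder and Message classes are a hypothetical in-memory model standing in for whatever java-libpst returns, only two of the per-message fields are written, and the JDK's XMLStreamWriter is used in place of Tika's own handler so that the example is self-contained.

```java
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class PstOutputSketch {

    // Hypothetical model of one mail; a real parser would fill this from the PST file.
    static final class Message {
        final String internetMessageId;
        final String subject;
        final String body;

        Message(String internetMessageId, String subject, String body) {
            this.internetMessageId = internetMessageId;
            this.subject = subject;
            this.body = body;
        }
    }

    // Hypothetical model of one folder, holding mails and nested folders.
    static final class Folder {
        final String displayName;
        final List<Message> messages = new ArrayList<>();
        final List<Folder> subFolders = new ArrayList<>();

        Folder(String displayName) {
            this.displayName = displayName;
        }
    }

    // Writes one folder as <div class="email-folder">, then its mails, then its sub-folders.
    static void writeFolder(XMLStreamWriter out, Folder folder) throws XMLStreamException {
        out.writeStartElement("div");
        out.writeAttribute("class", "email-folder");
        out.writeStartElement("h1");
        out.writeCharacters(folder.displayName);
        out.writeEndElement(); // h1
        for (Message message : folder.messages) {
            writeMessage(out, message);
        }
        for (Folder sub : folder.subFolders) {
            writeFolder(out, sub); // recursion yields the nested folder divs
        }
        out.writeEndElement(); // div
    }

    // Writes one mail as <div class="email-entry"> with a heading, one meta element and the body.
    static void writeMessage(XMLStreamWriter out, Message message) throws XMLStreamException {
        out.writeStartElement("div");
        out.writeAttribute("class", "email-entry");
        out.writeStartElement("h1");
        out.writeCharacters(message.internetMessageId); // the writer escapes '<' for us
        out.writeEndElement(); // h1
        out.writeEmptyElement("meta");
        out.writeAttribute("subject", message.subject);
        out.writeStartElement("p");
        out.writeCharacters(message.body);
        out.writeEndElement(); // p
        out.writeEndElement(); // div
    }

    // Renders the top-level folders inside a <body> element and returns the markup.
    static String render(List<Folder> topLevelFolders) {
        StringWriter buffer = new StringWriter();
        try {
            XMLStreamWriter out = XMLOutputFactory.newInstance().createXMLStreamWriter(buffer);
            out.writeStartElement("body");
            for (Folder folder : topLevelFolders) {
                writeFolder(out, folder);
            }
            out.writeEndElement(); // body
            out.flush();
            out.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write folder markup", e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        Folder top = new Folder("Top of data file");
        top.messages.add(new Message("<id-1@example.invalid>", "Hello", "mail content"));
        top.subFolders.add(new Folder("Deleted items"));
        System.out.println(render(List.of(top)));
    }
}
```

Writing mails before sub-folders matches the order in the sample output, and leaving the escaping to the writer keeps message IDs such as the one in the h1 element well-formed.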


was (Author: thaichat04):
java-libpst-0.7 has been uploaded to the OSS Sonatype Nexus repository. If there's no
objection, I'll refactor the attached parser so that it produces output like this:
{code}
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="Content-Length" content="271360" />
<meta name="isValid" content="true" />
<meta name="Content-Type" content="application/vnd.ms-outlook" />
<title></title>
</head>
<body>
        <div class="email-folder">
                <h1>Début du fichier de données Outlook</h1>
                <div class="email-entry">
                        <h1>&lt;530d9cac.5080...@gmail.com&gt;</h1>
                        <meta subject="Re: Feature Generators" />
                        <meta internetMessageId="&lt;530d9cac.5080...@gmail.com&gt;" />
                        <meta descriptorNodeId="2097188" />
                        <meta lastModificationTime="1393418263291" />
                        <meta senderName="Jörn Kottmann" />
                        <meta senderEmailAddress="kottm...@gmail.com" />
                        <meta recipients="No recipients table!" />
                        <p>mail content</p>
                </div>
                <div class="email-folder">
                        <h1>Éléments supprimés</h1>
                </div>
        </div>
        <div class="email-folder">
                <h1>Racine (pour la recherche)</h1>
        </div>
        <div class="email-folder">
                <h1>SPAM Search Folder 2</h1>
        </div>
</body>
</html>
{code}

> Add support for Outlook PST
> ---------------------------
>
>                 Key: TIKA-623
>                 URL: https://issues.apache.org/jira/browse/TIKA-623
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Tran Nam Quang
>            Assignee: Hong-Thai Nguyen
>             Fix For: 1.6
>
>         Attachments: OutlookPSTParser.java
>
>
> Hello everyone,
> As you might know, Outlook stores its mails and other stuff in a single PST 
> file. There's a relatively new Java library called java-libpst for reading 
> Outlook PST files. It is licensed under the LGPL and available over here: 
> http://code.google.com/p/java-libpst/
> I have tested the library on Outlook 2000 and Outlook 2003, with good 
> results. It would be great if the library could be integrated into Tika.
> Best regards
> Tran Nam Quang



--
This message was sent by Atlassian JIRA
(v6.2#6252)
