Re: [jira] [Issue Comment Edited] (TIKA-623) Add support for Outlook PST

2011-04-04 Thread Jukka Zitting

Hi,

On 04/03/2011 06:07 PM, Oleg Tikhonov wrote:

One of the possible solutions is to create a temporary file, pass it to
the constructor, and delete it after use.


As noted by Nick on the issue, a better alternative is to use
TikaInputStream as it automatically takes care of creating and removing 
the temporary file if needed.


The relevant code within a parse() method should look something like this:

    TemporaryFiles tmp = new TemporaryFiles();
    try {
        TikaInputStream tis = TikaInputStream.get(stream, tmp);
        File file = tis.getFile();
        ...; // process the file
    } finally {
        tmp.dispose();
    }

See the TikaInputStream and TemporaryFiles javadocs for the details.
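
To illustrate the idea behind that pattern without depending on Tika, here is a minimal JDK-only sketch. TempFileTracker and its methods are hypothetical names made up for this example; it is not Tika's TemporaryFiles class and makes no claim about how Tika implements it. It only shows the general technique: one object owns every temporary file it creates, and a single call in a finally block removes them all.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical JDK-only tracker; NOT Tika's TemporaryFiles class. */
class TempFileTracker implements AutoCloseable {
    private final List<Path> files = new ArrayList<>();

    /** Spools the stream into a new temporary file owned by this tracker. */
    Path spool(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("tracked-", ".tmp");
        files.add(tmp); // register first so a failed copy is still cleaned up
        // createTempFile already created the file, so overwrite it
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    /** Deletes every file created through this tracker. */
    @Override
    public void close() throws IOException {
        for (Path p : files) {
            Files.deleteIfExists(p);
        }
        files.clear();
    }

    public static void main(String[] args) throws IOException {
        TempFileTracker tmp = new TempFileTracker();
        Path seen;
        try {
            seen = tmp.spool(new ByteArrayInputStream(new byte[] {1, 2, 3}));
            System.out.println("exists inside try: " + Files.exists(seen));
        } finally {
            tmp.close(); // same role as the dispose() call in the snippet above
        }
        System.out.println("exists after finally: " + Files.exists(seen));
    }
}
```

The tracker registers the file before copying into it, so a copy that fails halfway still gets cleaned up by the finally block.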

--
Jukka Zitting


Re: [jira] [Issue Comment Edited] (TIKA-623) Add support for Outlook PST

2011-04-03 Thread Oleg Tikhonov
One of the possible solutions is to create a temporary file, pass it to the
constructor, and delete it after use.


File tempFile = new File(tempPSTFileName);
OutputStream out = null;
// let any IOException propagate to the caller instead of swallowing it
try {
    out = new FileOutputStream(tempFile);
    byte[] buf = new byte[1 << 20];
    int len;
    // read() returns -1 at end of stream
    while ((len = inputStream.read(buf)) != -1) {
        out.write(buf, 0, len);
    }
} finally {
    // close() also flushes; out is null if the FileOutputStream constructor failed
    if (out != null) {
        out.close();
    }
    if (inputStream != null) {
        inputStream.close();
    }
}

// ... pass tempFile to the PSTFile constructor ...

// delete after use
tempFile.delete();
-or-
tempFile.deleteOnExit();
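
The same copy-then-delete approach can be sketched more compactly with java.nio.file. This is only a sketch under stated assumptions: the class name SpoolToTempFile and both method names are invented for illustration, and the RandomAccessFile step merely stands in for a File-only API such as the PSTFile constructor, which is not called here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class SpoolToTempFile {

    /** Copies the whole stream into a fresh temporary file and returns its path. */
    static Path spool(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("spool-", ".pst");
        try {
            // createTempFile already created the file, so overwrite it
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp;
        } catch (IOException e) {
            Files.deleteIfExists(tmp); // do not leave a partial file behind
            throw e;
        }
    }

    /** Hypothetical stand-in for a File-only API: reads the length via RandomAccessFile. */
    static long lengthViaRandomAccess(InputStream in) throws IOException {
        Path tmp = spool(in);
        try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "r")) {
            return raf.length();
        } finally {
            // runs after raf is closed, so the delete also works on Windows
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "not a real PST file".getBytes(StandardCharsets.UTF_8);
        System.out.println(lengthViaRandomAccess(new ByteArrayInputStream(data)));
    }
}
```

Files.createTempFile picks a unique name, which avoids collisions when two parsers run at the same time with a fixed tempPSTFileName.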


BR,
Oleg


On Sun, Apr 3, 2011 at 5:26 PM, Tran Nam Quang (JIRA) wrote:

>
>[
> https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015152#comment-13015152]
>
> Tran Nam Quang edited comment on TIKA-623 at 4/3/11 2:25 PM:
> -
>
> I started work on the Tika parser, but got stuck with the following
> problem: In order to access the Outlook PST file, I need to create a PSTFile
> instance. Now, the PSTFile constructor requires either a File or a String
> argument that points at the PST file. The constructor then takes either of
> these arguments to create a RandomAccessFile internally. However, Tika's
> Parser interface gives me an InputStream. What do I do?
>
>  was (Author: qforce):
> I started work on the Tika parser, but got stuck with the following
> problem: In order to access the Outlook PST file, I need to create a PSTFile
> instance. Now, the PSTFile constructor requires either a File or a String
> argument that points at the PST file. The constructor then takes either of
> these to create a RandomAccessFile internally. However, Tika's Parser
> interface gives me an InputStream. What do I do?
>
> > Add support for Outlook PST
> > ---
> >
> > Key: TIKA-623
> > URL: https://issues.apache.org/jira/browse/TIKA-623
> > Project: Tika
> > Issue Type: New Feature
> > Components: parser
> > Reporter: Tran Nam Quang
> >
> > Hello everyone,
> > As you might know, Outlook stores its mails and other stuff in a single
> PST file. There's a relatively new Java library called java-libpst for
> reading Outlook PST files. It is licensed under the LGPL and available over
> here: http://code.google.com/p/java-libpst/
> > I have tested the library on Outlook 2000 and Outlook 2003, with good
> results. It would be great if the library could be integrated into Tika.
> > Best regards
> > Tran Nam Quang
>