Re: pdf in Chinese

Ben Litchfield Wed, 08 Sep 2004 04:44:44 -0700

This appears to be more of a PDFBox issue than a lucene issue, please post
an issue to the PDFBox site.


Also note, that because of certain encodings that a PDF writer can use, it
is impossible to extract text from all PDF documents.

Ben

On Wed, 8 Sep 2004, [EMAIL PROTECTED] wrote:

> it is not about analyzer ,i  need to read text from pdf file first.
>
> ----- Original Message -----
> From: "Chandan Tamrakar" <[EMAIL PROTECTED]>
> To: "Lucene Users List" <[EMAIL PROTECTED]>
> Sent: Wednesday, September 08, 2004 4:15 PM
> Subject: Re: pdf in Chinese
>
>
> > which analyzer you are using to index chinese pdf documents ?
> > I think you should use cjkanalyzer
> > ----- Original Message -----
> > From: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> > To: <[EMAIL PROTECTED]>
> > Sent: Wednesday, September 08, 2004 11:27 AM
> > Subject: pdf in Chinese
> >
> >
> > > Hi all,
> > >     i use pdfbox to parse pdf file to lucene document.when i parse
> > Chinese
> > > pdf file,pdfbox is not always success.
> > >     Is anyone have some advice?
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> > >
> > >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: pdf in Chinese

Reply via email to