Re: Problem while Indexing Pdf files

2004-03-25 Thread Ben Litchfield
The latest release of PDFBox changed the way it dealt with fonts and introduced this bug. Please try the version in CVS and let me know if you are still having a problem. Ben On Thu, 25 Mar 2004, Ankur Goel wrote: > > Hi, > > I have to index PDF files. For that I am using pdfbox. But when I tr

Problem while Indexing Pdf files

2004-03-25 Thread Ankur Goel
Hi, I have to index PDF files. For that I am using pdfbox. But when I try to extract text from pdf file using pdfbox I get the following error: java.io.IOException: Error: No 'ToUnicode' and no 'Encoding' for Font at org.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:347) at or

Re: indexing PDF files

2002-05-05 Thread CNew
To: Lucene Users List <[EMAIL PROTECTED]> Sent: Saturday, May 04, 2002 1:28 AM Subject: Re: indexing PDF files > You might want to take a look at WebSearch http://www.i2a.com/websearch/. It > has an _ok_ system going with respect to PDFs. PDFGo supports viewing of PDF > but a

Re: indexing PDF files

2002-05-04 Thread Kelvin Tan
(i.e. text extraction), maybe it's not so bad. I've spent a bit of time in this area before so feel free to email me offline about this. Not sure how much help I can be though. - Original Message - From: "petite_abeille" <[EMAIL PROTECTED]> To: "Lucene Users List"

Re: indexing PDF files

2002-05-03 Thread petite_abeille
On Friday, May 3, 2002, at 03:16 PM, Moturu,Praveen wrote: > Can I assume none of the people on the lucene user group had > implemented indexing a pdf document using lucene. Who knows...?!? In any case, it's not public knowledge... > If some one has.. Please help me by providing the solution.

Re: indexing PDF files

2002-05-03 Thread W. Eliot Kimber
"Moturu,Praveen" wrote: > > Good Morning to you all. Can I assume none of the poeple on the lucene user > group had implemented indexing a pdf document using lucene. If some one > has.. Please help me by providing the solution. You can try using Eytemon's PJ library (www.eytemon.com). But be awa

Re: indexing PDF files

2002-05-03 Thread Moturu,Praveen
Good Morning to you all. Can I assume none of the people on the lucene user group had implemented indexing a pdf document using lucene. If some one has.. Please help me by providing the solution. Thanks, Praveen Moturu

Re: indexing PDF files

2002-05-03 Thread petite_abeille
On Wednesday, May 1, 2002, at 05:41 PM, Otis Gospodnetic wrote: > Wouldn't you want to convert to XML instead and use XSLT to transform > the XML representation to any desired format by just applying a style > sheet? > Sounds like less work with bigger document type coverage. Sounds good... But

Re: indexing PDF files

2002-05-01 Thread Otis Gospodnetic
> > Hm, this should be a FAQ. > > Maybe it should... ;-) It is now. > > Check Lucene contributions page, there are some starting points > there, > > Well, this seems to be a very popular request... In fact I need > something like that also. Unfortunately, there seems to be no > authoritative

Re: indexing PDF files

2002-05-01 Thread Peter Carlson
I don't know what they have to offer, but I think Adobe has something. Here is something I just found on the topic from Adobe's site. How can I license Acrobat Viewer to distribute with my own products or to use in my custom Java development? How much will it cost to license? Adobe Acrobat Viewe

Re: indexing PDF files

2002-04-30 Thread petite_abeille
On Tuesday, April 30, 2002, at 10:46 PM, Otis Gospodnetic wrote: > Hm, this should be a FAQ. Maybe it should... ;-) > Check Lucene contributions page, there are some starting points there, Well, this seems to be a very popular request... In fact I need something like that also. Unfortunately,