Andreas Hartmann wrote:
> Hi Lenya devs,
> in 2.0.1-dev, PDF documents are now indexed. The text inside the PDF
> is extracted using PDFBox (http://pdfbox.org), which fortunately uses a
> BSD license. The sitemap determines whether a document contains a PDF
> based on the source extension. I hope this is sufficient, but it may
> make sense to use the MIME type instead.
I think the MIME type is better, because you cannot always trust a
suffix. Also, you might want to consider using the Tika project.
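To illustrate why the suffix alone can mislead, here is a minimal sketch that compares an extension check with a check of the leading bytes. It is not Lenya's sitemap logic and not Tika's API; the class and method names (PdfSniffer, hasPdfExtension, looksLikePdf) are made up for this example, and it assumes the usual case where a PDF begins with the ASCII marker "%PDF-".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper, for illustration only.
public class PdfSniffer {

    // Assumption: a well-formed PDF normally starts with this ASCII marker.
    private static final byte[] PDF_MAGIC =
            "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Suffix-based check: trusts the file name, not the content. */
    public static boolean hasPdfExtension(String name) {
        return name != null
                && name.toLowerCase(Locale.ROOT).endsWith(".pdf");
    }

    /** Content-based check: compares the first bytes with the PDF marker. */
    public static boolean looksLikePdf(byte[] content) {
        if (content == null || content.length < PDF_MAGIC.length) {
            return false;
        }
        byte[] head = Arrays.copyOf(content, PDF_MAGIC.length);
        return Arrays.equals(head, PDF_MAGIC);
    }

    public static void main(String[] args) {
        // A text file that was merely renamed to .pdf.
        byte[] renamedText = "just text".getBytes(StandardCharsets.US_ASCII);
        System.out.println("suffix says PDF:  " + hasPdfExtension("notes.pdf"));
        System.out.println("content says PDF: " + looksLikePdf(renamedText));
    }
}
```

A real MIME type lookup (for example through the repository metadata or a detection library) would cover more formats; this only shows that name and content can disagree.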
Cheers
Michael
> Testing is of course greatly appreciated!
> -- Andreas
--
Michael Wechner
Wyona - Open Source Content Management - Yanel, Yulup
http://www.wyona.com
[EMAIL PROTECTED], [EMAIL PROTECTED]
+41 44 272 91 61