No, no, these are "plain" PDF files, mostly converted from WinWord, so there is a lot of text. When I use pdf2text or pdftohtml and look at the result, I get all the words/text from the PDF file. So something different is happening here...
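The manual check described above (convert the PDF and see whether real words come out) could be scripted roughly as follows. This is only a sketch: it assumes the `pdftotext` command-line tool is installed and on the PATH, and the function names and the 20-word threshold are made up for illustration, not part of ASPseek.

```python
import shutil
import subprocess


def extract_text(pdf_path: str) -> str:
    """Run pdftotext on pdf_path and return what it extracts.

    Assumes the pdftotext CLI is available; "-" as the output
    file name makes it write the text to stdout.
    """
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def has_indexable_text(text: str, min_words: int = 20) -> bool:
    """Heuristic: the PDF is text-based if it yields enough words.

    A token counts as a word only if it contains at least one letter,
    so page numbers and form feeds from scanned pages are ignored.
    The threshold is an arbitrary assumption.
    """
    words = [w for w in text.split() if any(c.isalpha() for c in w)]
    return len(words) >= min_words
```

Calling `has_indexable_text(extract_text("some.pdf"))` (the file name is a placeholder) and getting `True` would suggest the file is not a scanned image, so the cause of the missing search results would have to be looked for elsewhere.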
Best regards

Markus Rietzler * <rietzler_software/> * RZF NRW * Tel: 0211.4572-130

-----Original Message-----
From: Gregory Kozlovsky [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 11 September 2002 10:07
To: '[EMAIL PROTECTED]'
Subject: RE: [aseek-users] external converters, pdf files

Sometimes, what appears to be text in .pdf files is actually scanned images that cannot be indexed. Check for it.

Gregory Kozlovsky

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 11 September 2002 09:59
To: [EMAIL PROTECTED]
Subject: [aseek-users] external converters, pdf files

Hi,

I am trying to set up ASPseek with external converter support. I installed pdftohtml, indexing works fine, and the PDF files seem to be processed: I can find the URLs of the PDF files in the urlword table, even with status code 200. But when I search for words from the PDF files I get no results; the PDF files are not listed in the results... Any idea?

Thanks

Best regards

Markus Rietzler * <rietzler_software/> * RZF NRW * Tel: 0211.4572-130

-----Original Message-----
From: Charlie Farinella [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, 10 September 2002 23:35
To: [EMAIL PROTECTED]
Subject: [aseek-users] selective removal of urls

Is there a way to selectively remove a URL from our database after it has been indexed? We would like to remove porn sites from a family-friendly database.

--
------------------------------------------------------------------------
Charlie Farinella, Appropriate Solutions, Inc.
[EMAIL PROTECTED]
603-924-6079
