Howdy Everyone,

I am having trouble setting up htdig to search PDF
files.

All right, I have set up htdig and configured it to
read 4 files on my website. The first file is an HTML
page with links to 3 PDF files. From the verbose
output of rundig I can tell that htdig sees the files
and then eliminates them from the database due to no
excerpt, NOT because of any size limits! I have run
pdftotext and pdfinfo on both of these files
successfully. I have also run the doc2html.pl script
from the command line on the PDF files with no
problem.

In my attempts to figure out what is going on, I have
added debug code to the doc2html.pl Perl script to see
if it is even running. The debug code simply appends
the parameters it receives to a file. Here is the code
from doc2html.pl:

# Append the parameters received from htdig to a debug file.
$outputfile = "/var/www/cgi-bin/doc2html.debug";
open(DAT, ">>$outputfile") || die("Cannot open output file.");
print DAT "\n\nInput:$Input;MIME_type:$MIME_type;URL:$URL;Name:$Name;\n\n";
close(DAT);

This code is right after $Input, $MIME_type, and $Name
are assigned values, so it should give an idea of what
htdig is sending the Perl script. This works wonders
when I run the Perl script by hand, and yet when I run
htdig the output file doesn't get touched. To me this
implies that htdig is not even calling the Perl script.
From here one might assume that my web server is
misconfigured and not passing the MIME type as
"application/pdf". I have checked the configuration of
my Apache server, which by the way is version 2.0.40,
and it looks fine. Not to mention the fact that the
verbose output of rundig reported the MIME type of the
PDFs as application/pdf! My version of htdig is
3.2.0b4, and my version of Perl is 5.8.0.
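
For context, htdig only hands a document to an external
converter when its MIME type is listed under the
external_parsers attribute in the configuration file.
An entry for doc2html.pl would look roughly like the
sketch below. The script path is an assumption based on
the debug file location above, not a confirmed setting.

    # Assumed path; adjust to where doc2html.pl is installed.
    external_parsers: application/pdf->text/html /var/www/cgi-bin/doc2html.pl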

So if anyone has any ideas, I would love to hear them,
as I am out of things to try.

I would also love to hear if anyone currently has htdig
searching PDFs on a Red Hat machine, as then I would at
least know that it is my setup rather than something
more fishy.

Thanks,
Dustin York. 




