I must be one of many who run htdig (both 3.1.6 and 3.2.0b4) and doc2html
under RedHat Linux.

Since doc2html.pl works OK on the command line and your server is returning
the correct MIME type, the probable source of your fault is the
configuration file.

Check your external_parsers: statement most carefully.  If you have a '\' at
the end of the line make sure that there is no following whitespace.
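
One quick way to look for that problem is to let grep do the searching. This
is only a sketch: the function name find_broken_continuations is made up for
illustration, and the configuration path in the usage note is an assumption
about your installation.

```shell
#!/bin/sh
# Print "line-number:line" for every line where a backslash is followed
# only by spaces or tabs before the end of the line.
# Exit status is 0 if such lines were found, 1 if none were found.
find_broken_continuations() {
    grep -nE '\\[[:space:]]+$' "$1"
}
```

Called as `find_broken_continuations /etc/htdig/htdig.conf` (substitute the
path of your own configuration file), it prints nothing when every trailing
backslash really is the last character on its line.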

Check your entire configuration file for errors.  If htdig 3.2.0b4 finds an
error anywhere then it should give you a message and it may ignore all
following statements.
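
If htdig's own message does not point you at the offending line, a rough
check like the one below may help. It is a heuristic sketch and not htdig's
real parser: it assumes that every statement starts with "attribute_name:",
that comments start with '#', and that a backslash as the very last
character continues a statement onto the next line. The function name
lint_htdig_conf is made up for illustration.

```shell
#!/bin/sh
# Heuristic lint for an htdig-style configuration file (not htdig's parser).
# Prints "line-number: text" for each line that is neither a comment, a
# blank line, a <...> block marker, a continuation, nor "attribute_name:".
lint_htdig_conf() {
    awk '
        # Inside a continued value: just note whether it continues further.
        cont { cont = ($0 ~ /\\$/); next }
        # Skip blank lines, comments and <...> block markers.
        /^[ \t]*$/ || /^[ \t]*#/ || /^[ \t]*</ { next }
        # Anything else should begin with an attribute name and a colon.
        $0 !~ /^[ \t]*[A-Za-z_][A-Za-z0-9_]*[ \t]*:/ {
            printf "%d: %s\n", NR, $0
        }
        # A backslash as the very last character continues the statement.
        { cont = ($0 ~ /\\$/) }
    ' "$1"
}
```

A continuation line that gets reported here usually means the backslash on
the line before it is followed by whitespace, so the statement ended early.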

David Adams
Corporate Information Services
Information Systems Services
University of Southampton

----- Original Message ----- 
From: "Dustin York" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, July 25, 2003 8:46 PM
Subject: [htdig] HELP on Searching PDF files on RedHat Linux v9


> Howdy Everyone,
>
> I am having trouble setting up HTdig to search pdf
> files.
>
> All right, I have set up Htdig and have configured it
> to read 4 files on my website. The first file is an
> HTML page with links to 3 pdf files. From the verbose
> output of rundig I can tell that Htdig sees the files
> and then eliminates them from the database due to no
> excerpt, NOT because of any size limits! I have run
> pdftotext and pdfinfo on these files successfully. I
> have also run the doc2html.pl script from the command
> line on the pdf files with no problem.
>
> In my attempts to figure out what is going on I have
> added debug code to the Perl script doc2html.pl to see
> if it is even running. The debug code simply writes the
> parameters it receives to a file. Here is the code
> from doc2html.pl:
>
> # Append the parameters received from htdig to a debug log.
> $outputfile = '/var/www/cgi-bin/doc2html.debug';
> open(DAT, ">>$outputfile") || die("Cannot Open Outputfile.");
> print DAT
> "\n\nInput:'$Input';MIME_type:'$MIME_type';URL:'$URL';Name:'$Name';\n\n";
> close(DAT);
>
> This code comes right after $Input, $MIME_type, and $Name
> are assigned values, so it should give an idea of what
> Htdig is sending to the Perl script. It works well when
> I run the Perl script myself, yet when I run Htdig the
> output file doesn't get touched. To me this implies that
> Htdig is not even calling the Perl script. From here one
> might assume that my web server is misconfigured and not
> passing the MIME type as "application/pdf". I have
> checked the configuration of my Apache server (version
> 2.0.40) and it looks fine. On top of that, the verbose
> output of rundig reported the MIME type of the pdfs as
> application/pdf! My version of htdig is 3.2.0b4, and my
> version of perl is 5.8.0.
>
> So if any one has any ideas I would love to hear them
> as I am out of stuff to try.
>
> I would love to hear if anyone has htdig searching
> pdfs on a RedHat machine presently as then I would at
> least know that it is my set up instead of something
> more fishy.
>
> Thanks,
> Dustin York.
>
>
> -------------------------------------------------------
> This SF.Net email sponsored by: Free pre-built ASP.NET sites including
> Data Reports, E-commerce, Portals, and Forums are available now.
> Download today and enter to win an XBOX or Visual Studio .NET.
>
http://aspnet.click-url.com/go/psa00100003ave/direct;at.aspnet_072303_01/01
> _______________________________________________
> ht://Dig general mailing list: <[EMAIL PROTECTED]>
> ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
> List information (subscribe/unsubscribe, etc.)
> https://lists.sourceforge.net/lists/listinfo/htdig-general
>


