Ainhoa,
My first instinct now would be to check the parser output. Try adding
another -v to your htdig run (and possibly restricting your indexing to
just this one file), then check the log output. It may be that htdig does
not like the output from your Perl script. www.htdig.org explains what
the output means. I seem to recall you saying that you had already
tested that the script ran on its own, but possibly there is something
not right there, or a typo in the config that neither of us can see.
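
If it helps, here is a rough way to exercise the converter outside of
htdig. This is only a sketch in Python, not anything that ships with
htdig. It assumes the converter is called with four arguments (the file
to convert, the MIME type, the URL, and the config file path) and writes
HTML to standard output; please check that against the external_parsers
documentation for your version. The names run_converter and text_words
are made up for this example.

```python
import subprocess
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the text chunks found between tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def run_converter(command, doc_path, mime_type, url, config_path):
    """Run a converter with the four arguments htdig is ASSUMED to pass.

    `command` is a list such as ["/opt/vin/mht2html.pl"].
    Returns (exit code, stdout, stderr).
    """
    result = subprocess.run(
        list(command) + [doc_path, mime_type, url, config_path],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr


def text_words(html_text):
    """Return the whitespace-separated words found in the HTML text nodes."""
    collector = TextCollector()
    collector.feed(html_text)
    collector.close()
    return " ".join(collector.chunks).split()
```

Calling run_converter(["/opt/vin/mht2html.pl"], <local copy of the mht
file>, "application/vnd.wap.xhtml+xml", <its URL>, <your config file>)
and passing the returned stdout to text_words should give a non-empty
word list. If the list is empty, or the exit code is not 0, the script
output is the first thing I would look at.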
 
Regards,
Mike
 
 
________________________________

From: Ainhoa L [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 11, 2008 9:33 AM
To: Brockington,MJ,Michael,JPGA4X R
Cc: [email protected]
Subject: Re: [htdig] Htdig and MHT files



        Hi Mike,
        Yes, you were right. I was missing that part and didn't even
        notice!
        I changed the config file and wrote this:
        application/pdf->text/html /usr/local/apache/htdocs/htdig-3.1.6/contrib/parsepdf.pl \

        application/vnd.wap.xhtml+xml->text/html /opt/vin/mht2html.pl

        vnd.wap.xhtml+xml was the MIME type for my mht documents.
        So I ran the dig and everything seemed to go fine, with this at the end:


        0/http://172.26.0.169/testdig/
        1/http://172.26.0.169/testdig/About_comments_eex3.mht
        2/http://172.26.0.169/testdig/aster.pdf
        3/http://172.26.0.169/testdig/beepmacro.mht
        4/http://172.26.0.169/testdig/index.txt
        5/http://172.26.0.169/testdig/test.html
         
        (I am doing this in a test folder)
         
        But when I go to the search page, it doesn't find words from
        inside the mht files. Searching works for the pdf, txt and html
        files, but not for the mht ones.
         
        I suppose I am missing something here... do I need to configure
        anything else for the search engine?
         
        Thanks a lot for all your help,
         
        Ainhoa

_______________________________________________
ht://Dig general mailing list: <[email protected]>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-general
