In this case I can be fairly sure they were not called! Note the line that says 'not changed'? That suggests htdig treated the document as unmodified since the last run and skipped it. I am not sure how extensive your indexes are, or whether you are in production, but you may want to add the -i flag to rebuild the index from scratch. From memory, the -s flag turns on a set of summary statistics, which may include useful info. During a normal run at the right verbosity level, you should see a line like ++++---++- for each file that you index. www.htdig.org explains what these symbols mean; I can't remember offhand, but they help to indicate what is actually found inside a document. Check also that htmerge is running at a similar verbosity setting.
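As a rough illustration of the 'not changed' check, here is a small Python sketch that pulls the skipped URLs out of a saved log. The line shape it matches is an assumption based only on the single log excerpt quoted in this thread (rejoined where the mail client wrapped it); other htdig versions or verbosity levels may print something different. The function name unchanged_urls is made up for this example.

```python
import re

# Assumed line shape, taken from the one log excerpt quoted in this thread:
#   <number>:<number>:<number>:<url>: not changed
NOT_CHANGED = re.compile(r"^\d+:\d+:\d+:(?P<url>\S+):\s*not changed\s*$")


def unchanged_urls(log_text):
    """Return the URLs that the log text reports as 'not changed'."""
    urls = []
    for line in log_text.splitlines():
        match = NOT_CHANGED.match(line.strip())
        if match:
            urls.append(match.group("url"))
    return urls


# Sample rebuilt from the excerpt in the quoted message below.
sample = (
    "pick: devserverxxx.com, # servers = 1\n"
    "234:31:2:https://devserverxxx.com/library/ADJA/docs/"
    "portlet-1_0-fr-spec.pdf: not changed\n"
)

for url in unchanged_urls(sample):
    print("skipped as unchanged:", url)
```

If every .pdf and .doc URL shows up in that list, the converter never had a chance to run on them, which is when a from-scratch run with -i is worth trying.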
On my system, doc2html etc. is called from an intermediate DOS batch file, which is an easy place to put in an extra bit of logging. Alternatively, you may be brave enough to put a debug line into doc2html itself; it is just a bit of Perl, if I remember correctly.

Mike

NB: I have copied this back to the list. Not sure if you meant to send this direct; I get that wrong all the time!

> -----Original Message-----
> From: CHUN KI SHIN [mailto:[EMAIL PROTECTED]
> Sent: Thursday, May 10, 2007 4:12 PM
> To: Brockington,MJ,Michael,JPGA4X R
> Subject: Re: [htdig] doc2html - indexed but no hits
>
> Thanks for the quick response, Mike.
>
> Ok, I ran the script with -vv, and I don't know what I'm looking for from
> the index log. Only thing I can see is the following:
>
> pick: devserverxxx.com, # servers = 1
> 234:31:2:https://devserverxxx.com/library/ADJA/docs/portlet-1_0-fr-spec.pdf: not changed
>
> and the same for .doc.
>
> Could you tell me how to make sure doc2html is being called?
>
> Also, what do you mean by 'statistics' in htdig?
>
> Thanks for your time and help!
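For the question of how to make sure doc2html is being called, the intermediate-script idea from Mike's reply can be sketched in Python roughly as follows. This is only a sketch under assumptions: the converter path, the log file location and the name log_call are placeholders invented for the example, and the wrapper makes no assumption about which arguments htdig passes. It simply records whatever it receives and hands the same arguments on.

```python
import os
import subprocess
import sys
import tempfile
import time

# Hypothetical locations: adjust both for your own installation.
REAL_CONVERTER = ["perl", "/path/to/doc2html.pl"]
LOG_FILE = os.path.join(tempfile.gettempdir(), "doc2html_calls.log")


def log_call(args, log_path=LOG_FILE):
    """Append one timestamped line recording the arguments we were given."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{stamp} called with: {args!r}\n")


def main(argv):
    """Log the call, then run the real converter with the same arguments."""
    log_call(argv[1:])
    # stdout and stderr are inherited, so the converter's output still
    # goes back to whatever program started this wrapper.
    return subprocess.call(REAL_CONVERTER + argv[1:])


# Do nothing when started without arguments, so the file can also be
# imported or run by hand without launching the converter.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

Point the converter entry in your htdig configuration at this wrapper instead of at doc2html directly, run the dig, and then look at the log file. If it is empty or missing, the converter was never invoked.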
>
> >From: <[EMAIL PROTECTED]>
> >To: <[email protected]>
> >Subject: Re: [htdig] doc2html - indexed but no hits
> >Date: Thu, 10 May 2007 14:14:59 +0100
> >
> >Can you tell if doc2html is actually being called by htdig? Just
> >because htdig is downloading the document, it does not guarantee that it
> >is being passed over for conversion to an indexable format.
> >It might be worth decreasing the number of v's you are using by one or
> >two so that you can see what is being found in each document. Not sure
> >if you have the 'statistics' turned on?
> >
> >Regards,
> >Mike
> >
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED]
> > > [mailto:[EMAIL PROTECTED] On
> > > Behalf Of CHUN KI SHIN
> > > Sent: Thursday, May 10, 2007 1:43 PM
> > > To: [email protected]
> > > Subject: [htdig] doc2html - indexed but no hits
> > >
> > > I've been trying to index .pdf and .doc documents in v. 3.2.0b with
> > > doc2html/catdoc/pdf2html.
> > > I can see both types indexed fine (though I'm not sure why log doesn't tell
> > > which words and tags have been indexed). See below:
> > >
> >-------------------------------------------------------------------------
> >This SF.net email is sponsored by DB2 Express
> >Download DB2 Express C - the FREE version of DB2 express and take
> >control of your XML. No limits. Just data.
> >Click to get it now.
> >http://sourceforge.net/powerbar/db2/
> >_______________________________________________
> >ht://Dig general mailing list: <[email protected]>
> >ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
> >List information (subscribe/unsubscribe, etc.)
> >https://lists.sourceforge.net/lists/listinfo/htdig-general

