I am having a strange problem with htmerge from 3.2.0b6 running on a RH
Linux box.

I run the following script as root to build my htdig database.

------------------------------ BEGIN CODE -------------------------------
echo QUICK Rundig Script for testing
TIME=`date`
echo htdig: Start Time:    $TIME
echo htdig: Start Time:    $TIME > /var/log/rundig.log
/usr/htdig/bin/htdig -vv -s -i -c /usr/htdig/conf/htdig_reindex.conf >> /var/log/rundig.log
TIME=`date`
echo htdig: Finish Time:   $TIME
echo htdig: Finish Time:   $TIME >> /var/log/rundig.log
echo htmerge: Start Time:  $TIME >> /var/log/rundig.log
/usr/htdig/bin/htmerge -v -s  -c /usr/htdig/conf/htdig_reindex.conf >> /var/log/rundig.log
TIME=`date`
echo htmerge: Finish Time: $TIME
echo htmerge: Finish Time: $TIME >> /var/log/rundig.log
echo htfuzzy: Start Time:  $TIME >> /var/log/rundig.log
/usr/htdig/bin/htfuzzy -v -c /usr/htdig/conf/htdig_reindex.conf endings synonyms >>  /var/log/rundig.log
TIME=`date`
echo htfuzzy: Finish Time: $TIME >> /var/log/rundig.log
echo htfuzzy: Finish Time: $TIME
------------------------------ END CODE -------------------------------


The portion of the rundig.log that is relevant to our discussion is:
------------------------------ BEGIN LOG -------------------------------
<...SNIP...>
 HTTP Average request time : 0.0258429 secs
 HTTP Average speed        : 1411.23 KBytes/secs

ht://dig End Time: Thu Nov  3 13:25:14 2005
htdig: Finish Time: Thu Nov 3 13:25:15 CST 2005
htmerge: Start Time: Thu Nov 3 13:25:15 CST 2005
htmerge: Finish Time: Thu Nov 3 13:25:17 CST 2005
htfuzzy: Start Time: Thu Nov 3 13:25:17 CST 2005
htfuzzy: Selected algorithm: endings
htfuzzy/endings: Reading rules
<...SNIP...>
------------------------------ END LOG -------------------------------

This shows that htdig runs nicely.  It dug its way through 2477
documents.  Htfuzzy works as expected, buzzing along and spitting out
content.  But there is not a byte from htmerge, which took 2 seconds to
do whatever it did.  (I know that it runs, because when I had the words
"do", "it", and "we" in my bad words list, htmerge generated warning
messages that the words were too small.)
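
One way to narrow this down might be to log stderr and the exit status
of each step as well, since the script above only redirects stdout.
The following is a minimal sketch, not a tested fix: run_step is a
hypothetical helper name (not part of ht://Dig), and whether htmerge
writes anything to stderr here is an assumption that would need
checking.

```shell
#!/bin/sh
# Hypothetical helper (not part of ht://Dig): run one indexing step and
# log its start/finish times, stdout, stderr, and exit status.
# LOG defaults to the log file used in the script above.
run_step() {
    name=$1
    shift
    log=${LOG:-/var/log/rundig.log}
    echo "$name: Start Time:  $(date)" >> "$log"
    # 2>&1 sends stderr to the log too, in case the tool reports its
    # messages there instead of on stdout (an assumption, not verified).
    "$@" >> "$log" 2>&1
    status=$?
    echo "$name: Finish Time: $(date)" >> "$log"
    echo "$name: Exit Status: $status" >> "$log"
    return $status
}

# Example invocation, reusing the command and options from the script above:
# run_step htmerge /usr/htdig/bin/htmerge -v -s -c /usr/htdig/conf/htdig_reindex.conf
```

A non-zero exit status or any stderr text in the log would at least
show whether htmerge is stopping early or simply has nothing to report.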

For the sake of completeness, htdig_reindex.conf is included below:

------------------------------ BEGIN CONF -------------------------------
maintainer:        [EMAIL PROTECTED]
database_dir:      /var/lib/htdig/reindex
start_url:         http://ronald.cori.missouri.edu/cori_kbase/2004-12/
limit_urls_to:     ${start_url} http://ronald.cori.missouri.edu/cori_kbase_test
exclude_urls:     /cgi-bin/ .cgi file:// ftp:// gopher:// email: mailto: javascript: \
                   /cori_sykutam/
bad_extensions:    .AI .AIF .AMI .AU .AVI \
                   .ai .aif .ami .au .avi \
                   .BAK .BACK .BIN .BMP .BSP .BZ2 \
                   .bak .back .bin .bmp .bsp .bz2 \
                   .CAP .CLASS .CMF .COM .CSS \
                   .cap .class .cmf .com .css \
                   .DBF .DCR .DEB .DIR .DLL .DTA .DWT \
                   .dbf .dcr .deb .dir .dll .dta .dwt \
                   .ENF .EOT .EXE .FLA .GUESS .GIF .GZ .HQX .ICO .INDD .INS \
                   .enf .eot .exe .fla .guess .gif .gz .hqx .ico .indd .ins \
                   .JAVA .JPEG .JPG .JS .LBI .LCK .LGI .LNK \
                   .java .jpeg .jpg .js .lbi .lck .lgi .lnk \
                   .M3U .M4 .MAX .MAP .MDB .MID .MNO .MOV .MP3 .MPE .MPG .MSO \
                   .m3u .m4 .max .map .mdb .mid .mno .mov .mp3 .mpe .mpg .mso \
                   .NFO .NSF .OLD .ORA .ORACLE .OUT \
                   .nfo .nsf .old .ora .oracle .out \
                   .PCZ .PDB .PNG .PPZ .PRE .PRZ .PPT .PUB .RA .RAM .RPM \
                   .pcz .pdb .png .ppz .pre .prz .ppt .pub .ra .ram .rpm \
                   .SAV .SBK .SFL .SIT .SLP .SPF .SUB .SWI \
                   .sav .sbk .sfl .sit .slp .spf .sub .swi \
                   .TAR .TGZ .THM .TIFF .TIF .TMP .TTF .URL .VRML \
                   .tar .tgz .thm .tiff .tif .tmp .ttf .url .vrml \
                   .WAV .WMF .WMZ .WRL .Z .ZIP \
                   .wav .wmf .wmz .wrl .z .zip WS_FTP.LOG .xls .XLS
bad_word_list:     /usr/htdig/share/htdig/bad_words3
minimum_word_length:    3
max_head_length:   512
max_doc_size:      500000
no_excerpt_show_top:    false
search_algorithm:  exact:1 synonyms:0.5 endings:0.1
next_page_text:    <img src="/htdig/buttonr.gif" border="0" align="middle" width="30" height="30" alt="next">
no_next_page_text:
prev_page_text:    <img src="/htdig/buttonl.gif" border="0" align="middle" width="30" height="30" alt="prev">
no_prev_page_text:
page_number_text:
no_page_number_text:
------------------------------ END CONF -------------------------------


Any thoughts or suggestions as to what might be preventing htmerge from
doing anything useful?
  

James H. Cutts III

Computer Project Manager
Contracting and Organizations Research Institute
University of Missouri - Columbia
143C Mumford Hall
Columbia, MO
65211
 
Phone: (573) 882-6181
E-mail: [EMAIL PROTECTED]
 
Programming is the eternal competition between programmers who try to
make apps more and more idiot proof and the universe that makes dumber
idiots. So far, the universe is winning...

