At 16:53 08/02/2003, Paul Kleeberg wrote:
I am attempting to get htdig to work on a RedHat 8.0 system with Apache 2.0, htdig 3.2.0 and Mailman 2.1

I installed the 4 patches (668685, 661138, 444879 & 444884) to Mailman 2.1 to create the searchable archives for Mailman with htdig, and then reinstalled mailman. Created the link:

ln -s /var/mailman/archives/htdig /etc/htdig-mailman

As long as htdig was configured with /etc as the default directory for its configuration files, this should be OK.


and in mm_cfg.py, to make this compatible with RH8 I added:

 HTDIG_RUNDIG_PATH  = '/usr/bin/rundig'

If that's where the Red Hat RPM installed rundig, that's OK.
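
One way to confirm that the path is right is a quick check from a Python prompt. This is only a minimal sketch, assuming a current Python 3 interpreter rather than the Python that Mailman itself runs under; is_runnable is a helper name made up for illustration, and the path is the one from your mm_cfg.py.

```python
import os

# Path copied from the mm_cfg.py setting above; adjust it if your RPM
# put rundig somewhere else.
HTDIG_RUNDIG_PATH = '/usr/bin/rundig'


def is_runnable(path):
    """Return True if path is a regular file the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == '__main__':
    if is_runnable(HTDIG_RUNDIG_PATH):
        print('%s looks runnable' % HTDIG_RUNDIG_PATH)
    else:
        print('%s is missing or not executable' % HTDIG_RUNDIG_PATH)
```

Run it as the mailman user, since that is the user nightly_htdig runs rundig under.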


Then added:

 USE_HTDIG = 1

to mm_cfg.py

and then ran the indexing engine:

  /var/mailman/cron/nightly_htdig -v

and I get:

  /usr/bin/rundig: line 48:  1104 Aborted       $BINDIR/htnotify $opts
  htfuzzy: Unable to open word database /var/lib/htdig/db.words.db

but I would think htfuzzy should look in:

  /var/mailman/archives/private/<listname>/htdig/db.words.db

Have you checked the section headed "htdig Permissions Considerations" in the file INSTALL.htdig-mm, which patch 444884 installs in $build? Some of the htdig 'databases' generated by the components that rundig calls can safely be shared between lists, while others need to be list-specific to avoid information leaking from one list's indexes into another's.

<quote from=INSTALL.htdig-mm>
htdig Permissions Considerations
------------------------------------

Python scripts added by this patch (nightly_htdig and its relatives) run
the htdig rundig script identified by HTDIG_RUNDIG_PATH to build search
indices for Mailman archives. Code added by this patch generates per
list htdig configuration files which are passed as a parameter to the
rundig script. These configuration files identify a list specific
directory ($prefix/archives/private/<listname>/htdig) in which list
specific data files generated by and used by htdig are to be placed.

However, the rundig script identified by HTDIG_RUNDIG_PATH may attempt
to generate some files in htdig's COMMON_DIR when it is first run by
nightly_htdig; the files concerned are likely to be root2word.db,
word2root.db, synonyms.db and possibly some others generated by htdig's
htfuzzy program. The standard rundig script generates these files
selectively if they do not already exist. Depending on how you have
installed htdig and how the rundig script is first run, there may be a
permissions problem when nightly_htdig executes rundig under the mailman
UID if it tries to generate these files.
</quote>
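
To see where the list-specific files should end up, the directory layout described in the quote can be spelled out as below. This is a sketch only: the function names are hypothetical, /var/mailman is the prefix from your message, and the file name db.words.db is taken from the htfuzzy error you posted, so whether your htdig version uses exactly that name in the per-list directory is an assumption.

```python
import os


def list_htdig_dir(prefix, listname):
    """Build $prefix/archives/private/<listname>/htdig for one list."""
    return os.path.join(prefix, 'archives', 'private', listname, 'htdig')


def word_db_path(prefix, listname):
    """Build the path of the per-list word database (file name assumed)."""
    return os.path.join(list_htdig_dir(prefix, listname), 'db.words.db')


if __name__ == '__main__':
    # 'mylist' is a placeholder list name.
    path = word_db_path('/var/mailman', 'mylist')
    state = 'exists' if os.path.exists(path) else 'does not exist'
    print('%s %s' % (path, state))
```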

Basically, you may have to change permissions on the htdig common directory. For instance, on my internal test system I have the following setup:

mailman@mailman2:/opt/www/htdig> ls -l
total 16
drwxr-xr-x 2 root root 4096 Jan 13 16:28 bin
drwxrwxr-x 2 root mailman 4096 Jan 14 17:19 common
drwxr-xr-x 2 root root 4096 Jan 14 17:22 conf
drwxrwxr-x 2 root mailman 4096 Jan 14 17:19 db
mailman@mailman2:/opt/www/htdig> ls -l db
total 0
mailman@mailman2:/opt/www/htdig> ls -l common/
total 6248
-rw-rw-r-- 1 root mailman 84 Jan 13 16:28 bad_words
-rw-rw-r-- 1 root mailman 923308 Jan 13 16:28 english.0
-rw-rw-r-- 1 root mailman 5756 Jan 13 16:28 english.aff
-rw-rw-r-- 1 root mailman 190 Jan 13 16:28 footer.html
-rw-rw-r-- 1 root mailman 877 Jan 13 16:28 header.html
-rw-rw-r-- 1 root mailman 194 Jan 13 16:28 long.html
-rw-rw-r-- 1 root mailman 1390 Jan 13 16:28 nomatch.html
-rw-rw-r-- 1 mailman mailman 2285568 Jan 14 17:19 root2word.db
-rw-rw-r-- 1 root mailman 67 Jan 13 16:28 short.html
-rw-rw-r-- 1 root mailman 14481 Jan 13 16:28 synonyms
-rw-rw-r-- 1 mailman mailman 90112 Jan 14 17:19 synonyms.db
-rw-rw-r-- 1 root mailman 1261 Jan 13 16:28 syntax.html
-rw-rw-r-- 1 mailman mailman 3022848 Jan 14 17:19 word2root.db
-rw-rw-r-- 1 root mailman 1087 Jan 13 16:28 wrapper.html
mailman@mailman2:/opt/www/htdig>

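As you can see, three of the files in common (root2word.db, synonyms.db and word2root.db) were written by the mailman user ID when nightly_htdig first ran rundig. That worked because the common and db directories are group-writable by the mailman group. You will have to tweak things to suit your htdig installation setup.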
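
If you want to check the same thing on your system without reading ls output by eye, something like the following would do. It is a sketch assuming Python 3.3 or later (for stat.filemode); the helper names are invented for illustration and the two directories are the ones from my test system, so substitute whatever your htdig build uses.

```python
import os
import stat


def group_write_enabled(path):
    """Return True if the group-write permission bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IWGRP)


def mode_string(path):
    """Return the ls-style mode string for path, e.g. 'drwxrwxr-x'."""
    return stat.filemode(os.stat(path).st_mode)


if __name__ == '__main__':
    # Directories from the listing above; yours will probably differ.
    for directory in ('/opt/www/htdig/common', '/opt/www/htdig/db'):
        if not os.path.isdir(directory):
            print('%s: not found' % directory)
        elif group_write_enabled(directory):
            print('%s: %s (group may write)' % (directory, mode_string(directory)))
        else:
            print('%s: %s (group may NOT write)' % (directory, mode_string(directory)))
```

Note that this only looks at the permission bits. The directory's group still has to be the one the mailman user belongs to, as in the listing.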


In addition, when I look at the source for the search form on an archive page I see <form method="post" action="/cgi-bin/htsearch">. But on my system, htsearch exists in /usr/bin.

The htsearch program has to be available to the web server in a directory from which the server is prepared to run CGI programs.

Remember that execution of htdig's components is in two parts. The indexing of the material is typically done by a cron script running htdig components under some user ID from whatever was set up as htdig's bin directory, for example /opt/www/htdig/bin.

The 'search' operation, i.e. looking things up in the search indexes using htsearch, is run as a CGI program under the User/Group your web server is configured to run as.

I think it is usual for the htdig installation process to copy htsearch into the cgi-bin directory under whatever directory the ServerRoot directive in your web server's httpd.conf file names.

Personally, I build htdig from source and in any event run SuSE Linux, so I do not know how the Red Hat RPMs have been configured.

If all else fails, as root, copy htsearch into the web server's cgi-bin directory and make sure that it is readable and executable, but not writable, by owner, group and other.
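
In code terms, that last resort amounts to a copy followed by a chmod to mode 0555. The sketch below assumes Python 3; install_cgi is a made-up helper name, /usr/bin/htsearch is where you said the RPM put the program, and /var/www/cgi-bin in the usage comment is only my guess at your cgi-bin directory, so check your httpd.conf first.

```python
import os
import shutil


def install_cgi(src, cgi_dir):
    """Copy src into cgi_dir and set its mode to r-xr-xr-x (0555)."""
    dest = os.path.join(cgi_dir, os.path.basename(src))
    shutil.copyfile(src, dest)
    # Readable and executable, but not writable, by owner, group and other.
    os.chmod(dest, 0o555)
    return dest


# Example use, run as root (the cgi-bin path is an assumption):
#   install_cgi('/usr/bin/htsearch', '/var/www/cgi-bin')
```

A plain cp followed by chmod 555 from a root shell does exactly the same job.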

What am I overlooking?

Paul
--
Paul Kleeberg
[EMAIL PROTECTED]
Let me know if you continue to have problems.


------------------------------------------------------
Mailman-Users mailing list
[EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
