--- Begin Message ---
Package: htdig
Version: 3.2.0b5-4
Severity: important

The current version in sarge takes obscenely long to index my docs. 

With version 3.1.x it took roughly half an hour, which I consider the
upper limit of what is tolerable. Now it takes several hours. I don't
know how long it will take to complete; as I write this, it has been
running for about 4 hours.

To initially create the database I had to run htdig overnight. 

This renders the package virtually unusable, given that htdig eats LOTS
of memory and CPU time while indexing.

I've found the following statement in the htdig FAQ
(http://rudi.3linden/doc/htdig-doc/html/FAQ.html#q5.20):

[snip]
5.20. Why are the betas of 3.2 so slow at indexing?

As the release notes for these versions suggest, they are somewhat
unoptimized and are made available for testing. Since the 3.2 code
indexes all locations of words to support phrase searching and other
advanced methods, this additional data slows down the indexer. To
compensate, the code has a cache configured by the wordlist_cache_size
attribute. As of this writing, the word database code will slow down
considerably when the cache fills up. Setting the cache as large as
possible provides considerable performance improvement. Development is
in progress to improve cache performance.
[snap]

(I followed the instructions, with limited success. It still takes
hours.)
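
The FAQ's advice corresponds to a single attribute in the htdig
configuration file. A minimal sketch, assuming the file lives at
/etc/htdig/htdig.conf; the value below is purely illustrative and is
neither a recommendation nor the value tested for this report:

```
# /etc/htdig/htdig.conf (path assumed)
# Size of the word list cache in bytes. The FAQ suggests making this
# as large as available memory allows; 50000000 is only an example.
wordlist_cache_size: 50000000
```

A larger cache trades memory for indexing speed, so it should be sized
against the RAM actually free on the indexing machine.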

To sum up the above: The current version is _unsuitable_ for sarge. It
is a beta that should not have been uploaded. Unless a stable and
optimized 3.2 is released in time, I'd strongly suggest downgrading
htdig to version 3.1.x before the release of sarge.

Thanks,

Johannes


-- System Information:
Debian Release: testing/unstable
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)
Kernel: Linux 2.6.5-1-k7
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8

Versions of packages htdig depends on:
ii  debconf                     1.4.16       Debian configuration management sy
ii  libc6                       2.3.2.ds1-11 GNU C Library: Shared libraries an
ii  libgcc1                     1:3.3.3-5    GCC support library
ii  libnewt0.51                 0.51.4-23    Not Erik's Windowing Toolkit - tex
ii  libstdc++5                  1:3.3.3-5    The GNU Standard C++ Library v3
ii  lockfile-progs              0.1.10       Programs for locking and unlocking
ii  perl                        5.8.3-2      Larry Wall's Practical Extraction 
ii  zlib1g                      1:1.2.1-5    compression library - runtime

-- debconf information:
* htdig/generate-databases: false
  htdig/dblocation-changed: 
  htdig/remove-databases: false
* htdig/run-htnotify: false



--- End Message ---
