I am very happy to announce the release of htdig version 3.2.0b2. This is the second beta for the 3.2.0 release and it is the result of quite a bit of hard work from a number of people. Again, we're looking for as much feedback as possible, including suggestions, bug reports, fixes, features, etc.

To download, see
<http://www.htdig.org/files/htdig-3.2.0b2.tar.gz>

For documentation and Release Notes, see
<http://dev.htdig.org/htdig-3.2/>

For the ChangeLog, see
<http://dev.htdig.org/htdig-3.2/ChangeLog>

Feedback on the release should be primarily directed to [EMAIL PROTECTED]

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

Release notes for htdig-3.2.0b2 11 Apr 2000

This version is still marked beta because it has received only limited testing. However, it adds more functionality and should fix all known bugs in the previous 3.2.0b1 release, including the security hole fixed in version 3.1.5 in production versions. As with 3.2.0b1, if you are upgrading from a previous version, you should read the upgrade guide first.

* Fixed several bugs in the new HTTP/1.1 implementation that would cause problems with so-called "Chunked" data.
* Fixed a bug in the new regex-based configuration options that would ignore the case_sensitive attribute.
* Fixed the robots.txt parsing to more rigorously stick to the standard.
* Fixed a bug where upper-case META robots directives would be ignored.
* Fixed a bug that could leave a connection open when it failed.
* Fixed the timeout in the connection code to ensure that hung connections are killed properly.
* Fixed a bug where duplicates of modified documents could pile up over time.
* Fixed a bug in the SGML entity handling where numeric entities would be ignored. (e.g. &#162; -> ¢)
* Fixed a bug in the new configuration parser that wouldn't accept lists including numbers.
* Fixed a potential infinite loop in the phrase searching parser that came up when fuzzy algorithms were used.
* The HTML parser now ignores anything between <script> tags, much like it does for <style> tags.
* Fixed some performance problems in the new word database code.
* Removed the attributes translate_quot, translate_lt, translate_gt and translate_amp since all SGML entities are now encoded and decoded when displayed.
* Removed the attribute uncoded_db_compatible since the 3.2 databases are no longer compatible with previous versions anyway.
* Removed the attribute word_list because the db.wordlist file is no longer generated. To get an ASCII version of the database, use the word_dump attribute.
* Removed the pdf_parser attribute. It is now preferred to use the external parser or external converter support with xpdf.
* The wordlist_compress attribute is now turned on by default.
* The output from htsearch and the default and included templates should now be more HTML-4.0 compliant.
* Added support for searching collections of multiple databases. To use this, supply multiple config fields or config names separated by "|" characters. Also see the collection_names attribute.
* Added a new accents fuzzy algorithm, which treats accented and unaccented words the same. You must create an accents_db with htfuzzy after indexing.
* Added new attributes tcp_max_retries and tcp_wait_time to control how many times a low-level connection is retried and how long to wait on a hung connection.
* Added the any_keywords attribute to OR the keywords field in a search form instead of AND-ing them together.
* Added the attributes search_results_order and url_seed_score to control result ranking and scoring based on URL patterns.
* Moved the htnotify program into the new httools directory.
* Added the programs htdump, htload, htstat and htpurge.
* There is the usual variety of other fixes and changes. See the ChangeLog for more details.
* Once again, a huge thank you to everyone who contributed bug reports, fixes and patches!
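
As a rough illustration of the collection search item above, here is a minimal Python sketch that builds a query string naming several configs separated by "|". The config names "docs" and "lists" are hypothetical, and it assumes the search form passes the query in a field called words alongside the config field; check your own search form before relying on either.

```python
from urllib.parse import urlencode


def build_collection_query(config_names, words):
    """Return a query string that names several configs joined by '|'."""
    if not config_names:
        raise ValueError("at least one config name is required")
    # urlencode percent-encodes '|' as %7C; the receiving CGI program
    # sees the original "docs|lists" value after decoding.
    return urlencode({"config": "|".join(config_names), "words": words})


# Hypothetical config names, used only for illustration.
query = build_collection_query(["docs", "lists"], "phrase search")
print(query)
```

The resulting string would be appended to the URL of the search program after a "?". Supplying several separate config fields, the other option the notes mention, is not shown here.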
