This was extracted with a simple Perl script.

It may be useful to some of you.

-------------------------------
htlib/Configuration.h:
        This class provides an object lookup table.  Each object 
        in the Configuration is indexed with a string.  The objects 
        can be returned by mentioning their string index.

htlib/Connection.h:
        This class forms an easy-to-use interface to the Berkeley
        TCP socket library. All the calls are basically the same,
        but the parameters do not have any stray _addr or _in
        mixed in...

htlib/DB2_db.h:
        implements the btree database instance of a Database object

htlib/DB2_hash.h:
        implements the hash database instance of a Database object

htlib/Database.h:
        Class which defines the interface to a generic, 
        simple database.

htlib/Dictionary.h:
        This class provides an object lookup table.  
        Each object in the dictionary is indexed with a string.  
        The objects can be returned by mentioning their
        string index.

htlib/HtCodec.h:
        Provide a generic means to take a String, code
        it, and return the encoded string.  And vice versa.

htlib/HtDateTime.h:
        Parse, split, compare and format dates and times.

htlib/HtHeap.h:
        A Heap class which holds objects of type Object.
        (A heap is a semi-ordered tree-like structure.
         It ensures that the first item is *always* the largest.
         NOTE: To use a heap, you must implement the Compare() function for 
                your Object classes. The assumption used here is -1 means 
                less-than, 0 means equal, and +1 means greater-than. Thus 
                this is a "min heap" for that definition.)
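
        As an illustration only (this is not the HtHeap code), here is
        a minimal sketch of a binary heap driven by a Compare() method
        with the -1/0/+1 convention described above. The description
        mentions both "largest" and "min heap"; the sketch follows the
        min-heap reading, so the smallest item comes out first. The
        names IntItem and SketchHeap are hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an Object subclass; not a real ht://Dig type.
struct IntItem {
    int value;
    // Convention from the description: -1 less-than, 0 equal, +1 greater-than.
    int Compare(const IntItem &other) const {
        if (value < other.value) return -1;
        if (value > other.value) return 1;
        return 0;
    }
};

// Minimal binary min-heap that orders items only through Compare().
class SketchHeap {
public:
    void Add(const IntItem &item) {
        data_.push_back(item);
        std::size_t i = data_.size() - 1;
        // Sift up while the new item is smaller than its parent.
        while (i > 0) {
            std::size_t parent = (i - 1) / 2;
            if (data_[i].Compare(data_[parent]) >= 0) break;
            std::swap(data_[i], data_[parent]);
            i = parent;
        }
    }

    // Removes and returns the smallest item; the heap must not be empty.
    IntItem Remove() {
        IntItem top = data_.front();
        data_.front() = data_.back();
        data_.pop_back();
        std::size_t i = 0;
        // Sift down: swap with the smaller child until order is restored.
        for (;;) {
            std::size_t left = 2 * i + 1, right = left + 1, smallest = i;
            if (left < data_.size() && data_[left].Compare(data_[smallest]) < 0)
                smallest = left;
            if (right < data_.size() && data_[right].Compare(data_[smallest]) < 0)
                smallest = right;
            if (smallest == i) break;
            std::swap(data_[i], data_[smallest]);
            i = smallest;
        }
        return top;
    }

    std::size_t Count() const { return data_.size(); }

private:
    std::vector<IntItem> data_;
};
```

        Flipping the sign test in both loops would give the max-heap
        behaviour instead.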

htlib/HtPack.h:
        Compress and uncompress data in e.g. simple structures.

htlib/HtRegex.h:
        A simple C++ wrapper class for the system regex routines.

htlib/HtSGMLCodec.h:
        A Specialized HtWordCodec class to convert between SGML 
        ISO 8859-1 entities and high-bit characters.

htlib/HtURLCodec.h:
        Specialized HtWordCodec which just caters to the
        needs of "url_part_aliases" and "common_url_parts".
        Used for coding URLs when they are on disk; the key and the
        href field in db.docdb.

htlib/HtVector.h:
        A Vector class which holds objects of type Object.
        (A vector is an array that can expand as necessary)
        This class is very similar in interface to the List class

htlib/HtWordCodec.h:
        Given two lists of "words", 'from' and 'to', that pair up as
        simple one-to-one translations, use those lists to translate.
        The only restrictions are that no null (0) characters may be
        used in "words", and that there is a "joiner" character that
        does not appear in any word.  One-to-one consistency may be
        checked at construction.

htlib/HtWordType.h:
        Wrap some attributes to make is...() type
        functions and other common functions without having to manage
        the attributes or the exact attribute combination semantics.

htlib/HtZlibCodec.h:
        Provide a generic access to the zlib compression routines.
        If zlib is not present, encode and decode are simply 
        assignment functions.

htlib/IntObject.h:
        int variable encapsulated in Object derived class

htlib/List.h:
        A List class which holds objects of type Object.

htlib/Object.h:
        This baseclass defines how an object should behave.
        This includes the ability to be put into a list

htlib/ParsedString.h:
        Contains a string. The string may contain $var, ${var}, $(var)
        or `filename`. The get method expands those using the
        dictionary given as argument.
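
        To make the expansion rule concrete, here is a rough sketch
        (not the ParsedString code) that handles the three variable
        forms against a std::map. The function name expand_vars is
        made up, `filename` inclusion is left out, and treating
        unknown variables as empty is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Expands $name, ${name} and $(name) using vars.
// Assumption: names are [A-Za-z0-9_]+ and unknown names expand to "".
std::string expand_vars(const std::string &in,
                        const std::map<std::string, std::string> &vars) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        // Copy ordinary characters, and a trailing '$', unchanged.
        if (in[i] != '$' || i + 1 >= in.size()) {
            out += in[i++];
            continue;
        }
        ++i;  // skip '$'
        char close = 0;  // expected closing bracket, 0 for the bare form
        if (in[i] == '{') { close = '}'; ++i; }
        else if (in[i] == '(') { close = ')'; ++i; }
        std::size_t start = i;
        while (i < in.size() &&
               (std::isalnum(static_cast<unsigned char>(in[i])) || in[i] == '_'))
            ++i;
        std::string name = in.substr(start, i - start);
        if (name.empty() && !close) {
            out += '$';  // a '$' that does not start a variable stays literal
            continue;
        }
        if (close && i < in.size() && in[i] == close) ++i;  // consume closer
        auto it = vars.find(name);
        if (it != vars.end()) out += it->second;
    }
    return out;
}
```

        With vars = {dir: "/opt", name: "htdig"}, the input
        "$dir/${name}/$(name).conf" expands to "/opt/htdig/htdig.conf".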

htlib/Queue.h:
        This class implements a linked list of objects.  It itself is also an
        object

htlib/QuotedStringList.h:
        Fed with a string it will extract separator delimited
        words and store them in a list. The words may be 
        delimited by " or ', hence the name.
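
        A small sketch of that kind of splitting, for illustration
        only (this is not the QuotedStringList code). The function
        name split_quoted and the default separators are assumptions.

```cpp
#include <string>
#include <vector>

// Splits in on any character of seps; text inside "..." or '...' is kept
// together as part of one word, and the quote characters are dropped.
std::vector<std::string> split_quoted(const std::string &in,
                                      const std::string &seps = " \t") {
    std::vector<std::string> words;
    std::string word;
    char quote = 0;          // active quote character, 0 outside quotes
    bool have_word = false;  // true once a word has started (allows "")
    for (char c : in) {
        if (quote) {
            if (c == quote) quote = 0;  // closing quote ends the quoted part
            else word += c;
        } else if (c == '"' || c == '\'') {
            quote = c;
            have_word = true;
        } else if (seps.find(c) != std::string::npos) {
            if (have_word) {
                words.push_back(word);
                word.clear();
                have_word = false;
            }
        } else {
            word += c;
            have_word = true;
        }
    }
    if (have_word) words.push_back(word);
    return words;
}
```

        Here an unterminated quote simply runs to the end of the
        input; a stricter version could report it as an error.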

htlib/Stack.h:
        This class implements a linked list of objects.  It itself is also an
        object

htlib/StringList.h:
        Specialized List containing String objects. 

htlib/StringMatch.h:
        This class provides an interface to a fairly specialized string
        lookup facility.  It is intended to be used as a replacement for any
        regular expression matching when the pattern string is in the form:

htlib/URL.h:
        A URL parsing class, implementing as closely as possible the standard
        laid out in RFC2396 (e.g. http://www.faqs.org/rfcs/rfc2396.html)
        including support for multiple schemes.
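
        For reference, RFC 2396 (Appendix B) gives a regular
        expression that splits a URI reference into its components.
        The sketch below applies it with std::regex; it only
        illustrates the RFC and is not how URL.h is implemented. The
        names UrlParts and split_url are made up.

```cpp
#include <regex>
#include <string>

// Components of a URI reference as named in RFC 2396.
struct UrlParts {
    std::string scheme, authority, path, query, fragment;
};

// Splits url with the regular expression from RFC 2396, Appendix B.
// Groups 2, 4, 5, 7 and 9 hold scheme, authority, path, query, fragment.
UrlParts split_url(const std::string &url) {
    static const std::regex re(
        R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");
    UrlParts parts;
    std::smatch m;
    if (std::regex_match(url, m, re)) {
        parts.scheme = m[2].str();
        parts.authority = m[4].str();
        parts.path = m[5].str();
        parts.query = m[7].str();
        parts.fragment = m[9].str();
    }
    return parts;
}
```

        Missing components come back as empty strings, so a relative
        reference such as "attrs.html" yields only a path.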

htlib/cgi.h:
        Parse cgi arguments and put them in a dictionary.
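
        Roughly what that involves, as a sketch under assumptions
        (this is not the cgi.h code): split the query string on '&',
        split each pair on the first '=', and URL-decode both halves.
        The names url_decode and parse_query are made up, and a
        std::map stands in for the Dictionary class.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Decodes '+' as a space and %XX hex escapes; anything else is copied as-is.
std::string url_decode(const std::string &s) {
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == '+') {
            out += ' ';
        } else if (s[i] == '%' && i + 2 < s.size() &&
                   std::isxdigit(static_cast<unsigned char>(s[i + 1])) &&
                   std::isxdigit(static_cast<unsigned char>(s[i + 2]))) {
            out += static_cast<char>(std::stoi(s.substr(i + 1, 2), nullptr, 16));
            i += 2;
        } else {
            out += s[i];
        }
    }
    return out;
}

// Parses "a=1&b=two" into a map. Assumption: a repeated key keeps the last
// value, and a key with no '=' maps to the empty string.
std::map<std::string, std::string> parse_query(const std::string &query) {
    std::map<std::string, std::string> args;
    std::size_t pos = 0;
    while (pos <= query.size()) {
        std::size_t amp = query.find('&', pos);
        if (amp == std::string::npos) amp = query.size();
        std::string pair = query.substr(pos, amp - pos);
        if (!pair.empty()) {
            std::size_t eq = pair.find('=');
            std::string key = url_decode(pair.substr(0, eq));
            std::string value =
                eq == std::string::npos ? "" : url_decode(pair.substr(eq + 1));
            args[key] = value;
        }
        pos = amp + 1;
    }
    return args;
}
```

        For example, "words=search+engine&restrict=%2Fdocs" gives
        words = "search engine" and restrict = "/docs".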

htlib/good_strtok.h:
        The good_strtok() function is very similar to the 
        standard strtok() library function, except that good_strtok() 

htlib/htString.h:
        (implementation in String.cc) Just Another String class.

htlib/io.h:
        Perform low level I/O. The Connection class is derived from io.

htlib/langinfo.h:
        compatibility for strptime implementation on architectures
        that do not contain this header.

htlib/lib.h:
        Contains typical declarations and header inclusions used by
        most sources in this directory.

htlib/regex.h:
        Replacement for the regex functions on architectures that do
        not have them.

htcommon/DocumentDB.h:
        This class is the interface to the database of document
        references. This database is only used while digging.  
        An extract of this database is used for searching.  
        This is because digging requires a different index
        than searching.

htcommon/DocumentRef.h:
        Reference to an indexed document. Keeps track of all
        information stored on the document, either by the dig 
        or temporary search information.

htcommon/WordList.h:
        Interface to the word database. Previously, this wrote to 
        a temporary text file. Now it writes directly to the 
        word database. 
        NOTE: Some code previously attempted to directly read from 
        the word db. This will no longer work, so it's preferred to 
        use the access methods here.

htcommon/WordRecord.h:
        Record for storing word information in the word database
        Each word is stored as a separate key/record pair.

htcommon/WordReference.h:
        Reference to a word. Store everything we need for internal use
        Defined as a class to allow the comparison 
        method (for sorting).

htcommon/defaults.h:
        Default configuration values for the ht programs

htdig/Document.h:
        This class holds everything there is to know about a document.
        The actual contents of the document may or may not be present at
        all times for memory conservation reasons.
        The document can be told to retrieve its contents.  This is done
        with the Retrieve call.  In case the retrieval causes a 
        redirect, the link is followed, but this process is done 
        only once (to prevent loops.) If the redirect didn't 
        work, Document_not_found is returned.

htdig/ExternalParser.h:
        Allows external programs to parse unknown document formats.
        The parser is expected to return the document in a 
        specific format. The format is documented 
        in http://www.htdig.org/attrs.html#external_parser

htdig/HTML.h:
        Class to parse HTML documents and return useful information 
        to the Retriever

htdig/HtHTTP.h:
        Class for HTTP messaging (derived from Transport)

htdig/Images.h:
        Issue an HTTP request to retrieve the size of an image from
        the content-length field.

htdig/PDF.h:
        This class parses PDF (acrobat) files.
        Parsing is done on a PostScript translation of the PDF file
        by Acrobat Reader (acroread). It is freely available for
        most platforms at www.adobe.com

htdig/Parsable.h:
        Base class for file parsers (HTML, PDF, ExternalParser ...)

htdig/Plaintext.h:
        Parses plaintext files. Not much to do, really.

htdig/Retriever.h:
        Crawls from a list of URLs and calls the appropriate parsers. The
        parser notifies the Retriever object that it got something
        (got_* functions) and the Retriever object feeds the databases
        and statistics accordingly.

htdig/Server.h:
        A class to keep track of server specific information.

htdig/Transport.h:
        A virtual transport interface class for accessing
        remote documents. Used to grab URLs based on the 
        scheme (e.g. http://, ftp://...)

htdig/URLRef.h:
        A definition of a URL/Referer pair with associated hopcount

htdig/htdig.h:
        Indexes the web sites specified in the config file
        generating several databases to be used by htmerge

htmerge/htmerge.h:
        The interface to the htmerge program
        Defines the calling conventions for
          mergeDB -> db.cc (merging two databases)
          mergeWords -> words.cc (updating the word db)
          convertDocs -> docs.cc (updating the doc db)
          reportError -> htmerge.cc (reporting errors)

htsearch/Display.h:
        Implementation of Display
        Takes results of search and fills in the HTML templates

htsearch/DocMatch.h:
        Data object only. Contains information related to a given
        document that was matched by a search. For instance, the
        score of the document for this search.

htsearch/ResultList.h:
        A Dictionary indexed on the document id that holds
        documents found for a search.

htsearch/ResultMatch.h:
        Contains information related to a given
        document that was matched by a search. For instance, the
        score of the document for this search. Similar to the
        DocMatch class but designed for result display purposes.

htsearch/Template.h:
        Gives access to template files used to format the output
        of htsearch.

htsearch/TemplateList.h:
        Holds the templates available to format a list of
        results. These can be compiled in or read from
        files.

htsearch/WeightWord.h:
        ?

htsearch/htsearch.h:
        Command-line and CGI interface to search the databases.
        Expects the databases to have been generated using htdig, htmerge,
        and htfuzzy. Outputs HTML-ized results of the search based
        on the templates specified.

htsearch/parser.h:
        Parse the string containing a search request and find the
        document that matches.

---------------------
Script used to generate it. It was run as:

perl script htlib/*.h htcommon/*.h htdig/*.h htmerge/*.h htsearch/*.h

and the script itself is:

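use strict;
use warnings;

# For each header, print the comment block that starts with "// <Name>:".
foreach my $file (@ARGV) {
    my $comment = '';
    my $head;
    open(my $fh, '<', $file) or die "cannot open $file: $!";
    # The tag is the file name without its directory and .h extension.
    (my $tag = $file) =~ s|.*/(.*)\.h$|$1|;
    while (<$fh>) {
        if (s|^(//\s+\Q$tag\E:\s+)||) {
            # Keep the prefix with everything but the slashes blanked out,
            # so that continuation lines can be recognized.
            $head = $1;
            $head =~ s|[^/]| |g;
            $comment .= "\t$_";
        } elsif ($head) {
            if (s/^\Q$head\E//) {
                $comment .= "\t$_";
            } else {
                last;    # first line that does not continue the comment
            }
        }
    }
    close($fh);

    print "$file:\n$comment\n";
}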

------------------------------------
To unsubscribe from the htdig3-dev mailing list, send a message to
[EMAIL PROTECTED] containing the single word "unsubscribe" in
the SUBJECT of the message.
