Bruce Perens wrote:

> From: "Scott K. Ellis" <[EMAIL PROTECTED]>
> > I have no problem when HTML is the provided upstream documentation source,
> > and don't want to cripple my ability to read that.  However, when the
> > upstream source is something else, such as info/texinfo, I don't want HTML
> > as well.
> 
> Well, you're going to need a script to implement that policy. Probably
> the best way to handle this is to provide a way to tell the package system
> that you have deliberately removed a file, and that this file should not
> be replaced. I wouldn't expect this in version 1.0 .

Bruce,

Your answer sounds as if you think this is a particular problem and not
a general one. I think this issue affects a considerable proportion of
Debian users and therefore a more general solution should be provided.

In my opinion, the current policy on documentation is inadequate because
of the following very important points:

        *) It forces users to waste bandwidth and disk space by not allowing
       an easy way to select which documents to keep (and move).

        *) It is inconsistent: it treats some source formats (man) as
       acceptable for binary packages and others (texinfo) as not
       acceptable. I have no use for groff other than for viewing man
       pages. Groff and texinfo have equivalent functionality but they are
       treated differently.

        *) It is not flexible: the packaging system should prevent users from
       installing mutually conflicting packages but it should allow them
       to install any document at all.

        *) It is incomplete. The policy says that users must be able to
       view any document with an HTML browser, but it does not specify a
       default method (i.e. a default web server that should be installed
       as part of the base system and a default browser, also part of the
       base system).


If we look at what is necessary to view a man page (and nobody ever
complains about it), we can see that something similar might be necessary
for other types of documentation.

A man page is in source format. To read it we need a compiler (groff), a
caching manpage server (man) and a viewer (less). They are usually on every
system by default, and nobody complains that, for example, an alternative,
less bloated manpage compiler should be used.

I suggest the following changes to the policy:

[Note: I use base system to mean the basic default system, not just the
packages in "base"]

        *) The default format should be HTML, but everything necessary to
       view the documents should be provided as part of the base system.
       This includes a default HTML viewer (lynx), which users can override
       by means of the update-alternatives method. It would also include
       a default HTML server, which should also be installed as part of the
       base system and changeable by the update-alternatives method (or
       maybe conflicting with others?).
       Suggestions for HTML server: boa, small and fast
                                    cern, if you want caching

        *) The package dwww should be marked important. It should provide
       on-the-fly converters (as CGI programs) for as many formats as
       possible. No converter should depend on a non-required package.
       They should be self-contained or dependent on required packages only.

        *) A default searching/indexing engine should be chosen. It would be
       marked standard, but not important. (I don't know which one is good,
       maybe Bruce's idea of shell+zgrep can be made into a package)

        *) Documents should be provided in the least processed (closest to
       the original source) format for which an on-the-fly converter exists.
       Given the choice of several formats, the most versatile one (which
       can more easily be converted to other formats) should be chosen.

        *) Until there is a better option available, dpkg should include a
       script
       to automate the process of unpacking the /usr/doc/package part of a
       package without installing it. This is to allow users to install
       documentation of packages which conflict with an installed one.
       Users might need to manually remove the directory /usr/doc/package
       when they no longer need it.
       Scripts to register and unregister documents with dwww should be
       provided in order to handle this case properly.

        *) The project should stick to the policy and not include alternative
       formats or viewers by default. If we are convinced that HTML is the
       format, then we must show it.

        *) Man pages should be installed in raw format and converted to HTML
       on-the-fly. Since a man->html converter which does not depend on groff
       is possible, neither "groff" nor "man" should be installed by
       default. Use of the "man" program should be discouraged. (Should we
       still insist on having man pages for any program? For Unix compatibility
       maybe. But man pages are one thing and the man program is another.)

        *) Texinfo sources should be installed in raw format and not in info
       format. "Info" should be an optional package, which on installation
       scans /usr/doc looking for texinfo files, compiles them and places
       the output in /usr/info. On deinstallation it should erase /usr/info.
       This means that "info" should depend on "texinfo" and would be
       needed for organizing info files even for emacs users. Emacs should not
       provide info. (Emacs could be used as info browser, though)
       Besides, dwww should have a hook for calling the info installer (if
       present) when some texinfo document is registered.
       If info is not installed, "info package" would invoke the regular
       texinfo->html on-the-fly converter.
       (Maybe some Perl hacking is needed to convert texinfo2html
       from a static compiler into an on-the-fly cgi translator, but it
       should not be difficult)

        *) Tex, sgml, groff, html and plaintext sources should be installed in
       raw format since they either can easily be converted to html or 
       are somehow viewable with an HTML browser. It would be nice to have
       good-looking conversions, but functionality and format consistency
       should be the main concern. A file explaining what packages are
       needed to generate a specific output format from a given input format
       should be included in the main Debian documentation. Scripts should
       be provided within those packages to minimize user work if they need
       to generate alternative formats.
       Note: don't forget the ability of browsers to produce plaintext (lynx),
       html or PostScript (Netscape) copies of cgi-generated pages.

        *) Documents originally in binary format (PS, DVI, PDF, MS-WORD) for
       which no conversion is possible should be provided separately. A file
       explaining how to get the documentation (including which programs
       the user will need: ghostscript, xdvi, MS Word) and a brief summary
       of the document (developers _do_ read the documents, don't they?)
       should be included in the binary package (in a convertible format, of
       course).

        *) Similarly, documentation in any format which exceeds a maximum size
       should be included as a separate package. However, an overview of
       what a package does and its basic functionality should be included
       with the package, along with a reference to the rest of the
       documentation.

        *) The README.debian file should be replaced by the index.html of the
       /usr/doc/package directory. The file README.debian should contain
       a text explaining how to use "doc package" for viewing an index
       of the documents about a package. Similarly, we could consider
       displaying that text when the user types "man package" or
       "info package" and man or info are not installed. (or just fall
       through to the html browser?)
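
To make the "least processed format" rule above a bit more concrete, here
is a rough sketch in Python. The ranking table, the set of convertible
formats and the function name pick_format are all made up for illustration;
nothing like this exists in dpkg or dwww as far as I know.

```python
# Hypothetical ranking: a lower number means closer to the original source.
# These values are assumptions for illustration, not part of any policy.
PROCESSING_LEVEL = {
    "man": 0, "sgml": 0, "tex": 0, "texinfo": 0,
    "html": 1, "txt": 1,
    "info": 2, "dvi": 3, "ps": 4,
}

# Formats for which an on-the-fly converter to HTML is assumed to exist.
CONVERTIBLE = {"man", "sgml", "tex", "texinfo", "html", "txt"}


def pick_format(available):
    """Return the least processed available format that can be converted
    on the fly, or None if no available format is convertible."""
    candidates = [fmt for fmt in available if fmt in CONVERTIBLE]
    if not candidates:
        return None
    # Ties between equally raw formats are broken alphabetically.
    return min(candidates, key=lambda fmt: (PROCESSING_LEVEL[fmt], fmt))
```

A package that has texinfo, info and html available would ship only the
texinfo source under this rule; a package with only PS or DVI gets None,
which corresponds to the "provide it separately" case.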

I think all this is necessary if we really want to have HTML as the default
documentation format. If we chicken out of requiring a default browser and
a default server plus a set of cgi converters to be base packages, then
we should forget about having HTML as default and go with Chris's idea
of having separate packages in different formats and let users choose.
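
As an indication of how small a groff-free man->html converter could start
out, here is a minimal sketch in Python. It only understands a handful of
macros (.TH, .SH, .PP, .B, .I) and the \- escape, and silently drops
everything else; the function name man_to_html is my own invention, and a
real converter would need far more coverage than this.

```python
import html


def man_to_html(source):
    """Convert a very small subset of man macros to HTML without groff."""
    out = []
    for line in source.splitlines():
        if line.startswith('.\\"'):
            # Comment line in the man source: skip it.
            continue
        if line.startswith("."):
            macro, _, rest = line[1:].partition(" ")
            args = rest.strip()
            if macro == "TH":
                # The first argument of .TH is the page name.
                name = args.split()[0].strip('"') if args else "manual page"
                out.append("<h1>%s</h1>" % html.escape(name))
            elif macro == "SH":
                out.append("<h2>%s</h2>" % html.escape(args.strip('"')))
            elif macro in ("PP", "P", "LP"):
                out.append("<p>")
            elif macro == "B":
                out.append("<b>%s</b>" % html.escape(args.strip('"')))
            elif macro == "I":
                out.append("<i>%s</i>" % html.escape(args.strip('"')))
            # Any other macro is dropped in this sketch.
            continue
        # Plain text line: undo the \- escape and make it HTML-safe.
        out.append(html.escape(line.replace("\\-", "-")))
    return "\n".join(out)
```

A cgi wrapper would only have to read the requested page, uncompress it if
necessary, and print the result of this function after a Content-Type
header.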

