[2010-06-04 11:02] Jari Aalto <jari.aa...@cante.net>
> markus schnalke <mei...@marmaro.de> writes:
> 
> >     .TH curl 1 "22 Oct 2003" "Curl 7.10.8" "Curl Manual"
> >     .TH curl 1 22\ Oct\ 2003 Curl\ 7.10.8 Curl\ Manual
> 
> From unscientific test:
> 
>     $ ls /usr/share/man/man1/*.gz | xargs zgrep '^\.TH.*\\' | egrep -v '\"'
> 
>     /usr/share/man/man1/ci.1.gz:.TH CI 1 \*(Dt GNU
>     /usr/share/man/man1/co.1.gz:.TH CO 1 \*(Dt GNU
>     /usr/share/man/man1/evince-thumbnailer.1.gz:.TH evince\-thumbnailer 1 2007\-01\-15
>     /usr/share/man/man1/formail.1.gz:.TH FORMAIL 1 \*(Dt BuGless
>     /usr/share/man/man1/gnome-panel.1.gz:.TH gnome-panel 1 2006\-03\-07
>     /usr/share/man/man1/html2text.1.gz:.TH html2text 1 2008\-09\-20
>     /usr/share/man/man1/ident.1.gz:.TH IDENT 1 \*(Dt GNU
>     /usr/share/man/man1/join-dctrl.1.gz:.TH join\-dctrl 1
>     /usr/share/man/man1/lockfile.1.gz:.TH LOCKFILE 1 \*(Dt BuGless
>     /usr/share/man/man1/merge.1.gz:.TH MERGE 1 \*(Dt GNU
>     /usr/share/man/man1/patch.1.gz:.TH PATCH 1 \*(Dt GNU
>     /usr/share/man/man1/procmail.1.gz:.TH PROCMAIL 1 \*(Dt BuGless
>     /usr/share/man/man1/rcs.1.gz:.TH RCS 1 \*(Dt GNU
>     /usr/share/man/man1/rcsclean.1.gz:.TH RCSCLEAN 1 \*(Dt GNU
>     /usr/share/man/man1/rcsdiff.1.gz:.TH RCSDIFF 1 \*(Dt GNU
>     /usr/share/man/man1/rcsfreeze.1.gz:.TH RCSFREEZE 1 \*(Dt GNU
>     /usr/share/man/man1/rcsintro.1.gz:.TH RCSINTRO 1 \*(Dt GNU
>     /usr/share/man/man1/rcsmerge.1.gz:.TH RCSMERGE 1 \*(Dt GNU
>     /usr/share/man/man1/rlog.1.gz:.TH RLOG 1 \*(Dt GNU
>     /usr/share/man/man1/rpcgen.1.gz:.TH \*(x}
>     /usr/share/man/man1/saidar.1.gz:.TH saidar 1 $Date:\ 2006/11/30\ 23:42:42\ $ i\-scream
> 
> There don't seem to be any cases where "\ " is used.

The last line is such a case.

> I'm inclined to conclude that bug reports should be sent to packages
> that have pages using backslashes in .TH line. These pages should be
> converted to use the double quote notation.

I could agree for TH lines with escaped spaces, but not for TH lines
that use any backslash at all.

In particular, \- must remain possible, because it means something
different from a plain -.

> The main problem is with
> those pages:
> 
>     - No information can be parsed reliably; there are no delimiters
>       (start, stop) to specify which text is within which.

If you parse it character by character, you can parse it reliably.
Nroff does exactly that. But I don't think we want that overhead here.
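
To give an idea of what character-by-character parsing would involve,
here is a minimal sketch. It is Python for illustration only, not
roffit code; the name split_roff_args is hypothetical, and the rules
are a simplification (spaces separate arguments, a leading double quote
starts a quoted argument, a backslash keeps the next character inside
the argument), not a full roff parser.

```python
def split_roff_args(line):
    """Split a roff macro line into arguments, one character at a time.

    Simplified rules (an assumption, not a complete roff parser):
    - unquoted arguments are separated by spaces
    - an argument starting with '"' runs to the closing '"'
    - '""' inside a quoted argument is a literal quote
    - a backslash keeps the following character in the argument,
      so an escaped space does not end the argument
    Escape sequences are kept verbatim, not interpreted.
    """
    args = []
    i, n = 0, len(line)
    while i < n:
        if line[i] == " ":          # skip separating spaces
            i += 1
            continue
        quoted = line[i] == '"'
        if quoted:
            i += 1                  # step over the opening quote
        buf = []
        while i < n:
            ch = line[i]
            if ch == "\\" and i + 1 < n:
                buf.append(line[i:i + 2])   # keep the escape as written
                i += 2
            elif quoted and ch == '"':
                if line[i + 1:i + 2] == '"':
                    buf.append('"')         # doubled quote
                    i += 2
                else:
                    i += 1                  # closing quote
                    break
            elif not quoted and ch == " ":
                break
            else:
                buf.append(ch)
                i += 1
        args.append("".join(buf))
    return args
```

With this, both curl variants quoted at the top split into six fields
(the macro name plus five arguments). The escaped-space form keeps its
"\ " sequences inside each argument.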


The most important thing is detecting the first two parameters (name
and section). These will almost always be detectable without problems.
If we can detect them, we should display them in the page title. The
``secret man page'' should then almost never appear.

For all the other parameters we should try to detect them as well as
we can. If we can detect values then we should use them, otherwise
we should just ignore them. IMO we can ignore escaped spaces here.
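
A rough sketch of that idea, in Python for illustration only (this is
not the patch under discussion, and parse_th is a hypothetical name):
name and section are matched strictly, the remaining arguments are
collected on a best-effort basis, and escaped spaces are deliberately
not handled.

```python
import re

# One argument: a double-quoted string or a run of non-space characters.
_ARG = r'(?:"([^"]*)"|(\S+))'
_TH = re.compile(r'^\.TH\s+' + _ARG + r'\s+' + _ARG + r'(.*)$')


def parse_th(line):
    """Return (name, section, extras) for a .TH line, or None.

    extras is best effort: quoted strings are taken whole, anything
    else is split on whitespace, so "\\ " still splits an argument.
    """
    m = _TH.match(line.rstrip())
    if m is None:
        return None
    name = m.group(1) if m.group(1) is not None else m.group(2)
    section = m.group(3) if m.group(3) is not None else m.group(4)
    rest = re.findall(r'"([^"]*)"|(\S+)', m.group(5))
    extras = [q if q else w for q, w in rest]
    return name, section, extras
```

A line such as ".TH \*(x}" from rpcgen has no detectable section, so
the function returns None and the caller can fall back to whatever it
does today.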


> In any case, here is patch to improve the TH detection in cases like the
> above.
> 
> Daniel, would you apply this to CVS.

I think we can still improve that one. Let's do it in two steps:

First detect the first two arguments, which will almost always
succeed. Then, as a separate step, we can try to detect the rest.

In general: Did you notice that nothing but the first argument of TH
is ever used by roffit? So we should think about how much code is
worth putting into roffit to detect the other arguments.

It might be enough to detect the first two arguments, which will
succeed in most cases, and not mess around with the rest.


\*(Dt is still unsolved. Your patch deletes it, which might be the
best solution for now.
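
For reference, deleting such string references could look roughly like
the sketch below. This is Python for illustration only and not the
patch itself; drop_string_refs is a hypothetical name. It assumes the
three usual spellings of a roff string reference: \*x, \*(xx, and the
groff form \*[name].

```python
import re

# roff string references: \*x, \*(xx, and the groff long form \*[name].
_STRING_REF = re.compile(r'\\\*(?:\[[^\]]*\]|\(..|.)')


def drop_string_refs(text):
    """Delete unresolved string references and collapse the whitespace.

    Note: collapsing also squeezes runs of spaces inside quoted
    arguments, which is acceptable for a title line but is lossy.
    """
    return " ".join(_STRING_REF.sub("", text).split())
```

Applied to ".TH CI 1 \*(Dt GNU" this leaves ".TH CI 1 GNU", so name and
section stay intact and only the unresolved date string is lost.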


meillo


