[2010-06-04 11:02] Jari Aalto <jari.aa...@cante.net>
> markus schnalke <mei...@marmaro.de> writes:
>
> > .TH curl 1 "22 Oct 2003" "Curl 7.10.8" "Curl Manual"
> > .TH curl 1 22\ Oct\ 2003 Curl\ 7.10.8 Curl\ Manual
>
> From unscientific test:
>
>   $ ls /usr/share/man/man1/*.gz | xargs zgrep '^\.TH.*\\' | egrep -v '\"'
>
>   /usr/share/man/man1/ci.1.gz:.TH CI 1 \*(Dt GNU
>   /usr/share/man/man1/co.1.gz:.TH CO 1 \*(Dt GNU
>   /usr/share/man/man1/evince-thumbnailer.1.gz:.TH evince\-thumbnailer 1 2007\-01\-15
>   /usr/share/man/man1/formail.1.gz:.TH FORMAIL 1 \*(Dt BuGless
>   /usr/share/man/man1/gnome-panel.1.gz:.TH gnome-panel 1 2006\-03\-07
>   /usr/share/man/man1/html2text.1.gz:.TH html2text 1 2008\-09\-20
>   /usr/share/man/man1/ident.1.gz:.TH IDENT 1 \*(Dt GNU
>   /usr/share/man/man1/join-dctrl.1.gz:.TH join\-dctrl 1
>   /usr/share/man/man1/lockfile.1.gz:.TH LOCKFILE 1 \*(Dt BuGless
>   /usr/share/man/man1/merge.1.gz:.TH MERGE 1 \*(Dt GNU
>   /usr/share/man/man1/patch.1.gz:.TH PATCH 1 \*(Dt GNU
>   /usr/share/man/man1/procmail.1.gz:.TH PROCMAIL 1 \*(Dt BuGless
>   /usr/share/man/man1/rcs.1.gz:.TH RCS 1 \*(Dt GNU
>   /usr/share/man/man1/rcsclean.1.gz:.TH RCSCLEAN 1 \*(Dt GNU
>   /usr/share/man/man1/rcsdiff.1.gz:.TH RCSDIFF 1 \*(Dt GNU
>   /usr/share/man/man1/rcsfreeze.1.gz:.TH RCSFREEZE 1 \*(Dt GNU
>   /usr/share/man/man1/rcsintro.1.gz:.TH RCSINTRO 1 \*(Dt GNU
>   /usr/share/man/man1/rcsmerge.1.gz:.TH RCSMERGE 1 \*(Dt GNU
>   /usr/share/man/man1/rlog.1.gz:.TH RLOG 1 \*(Dt GNU
>   /usr/share/man/man1/rpcgen.1.gz:.TH \*(x}
>   /usr/share/man/man1/saidar.1.gz:.TH saidar 1 $Date:\ 2006/11/30\ 23:42:42\ $ i\-scream
>
> There doesn't seem to be cases where "\ " is used.

The last line is such a case.

> I'm inclined to conclude that bug reports should be sent to packages
> that have pages using backslashes in .TH line. These pages should be
> converted to use the double quote notation.

I could agree for TH lines with escaped spaces, but not for every TH
line that contains a backslash. In particular, \- must remain possible
because it means something different from a plain -.

> The main problem is with
> those pages:
>
> - No information can be parsed reliably; there is no delimiters
>   (start, stop) to specify which text is within which.

If you parse it character by character, then you can parse it reliably.
Nroff can do it. But I don't think we want this overhead here.

The most important thing is detecting the first two parameters (name
and section). These will almost always be detectable without problems.
If we can detect them, we should display them in the page title. The
``secret man page'' should then almost never appear.

For all the other parameters we should try to detect them as well as
possible. If we can detect values, then we should use them; otherwise
we should just ignore them. IMO we can ignore escaped spaces here.

> In any case, here is patch to improve the TH detection in cases like the
> above.
>
> Daniel, would you apply this to CVS.

I think we can still improve that one. Let's do it in two steps: First
detect the first two arguments, which will almost always succeed. Then,
as a separate step, we could try to detect the rest.

In general: Did you notice that nothing but the first argument of TH is
ever used by roffit? Thus we should think about how much code we put
into roffit to detect the other arguments. It might be enough to detect
the first two arguments, which will be successful in most cases, so
that we don't have to mess around with the rest.

Still unsolved is \*(Dt. Your patch deletes it. This might be the best
solution for now.


meillo
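
A rough sketch of the first step proposed in this message (detect only
the name and the section of a TH line) could look like the following.
It is written in Python purely for illustration and is not roffit's
code; the function names split_macro_args and th_name_and_section are
made up for this example. It assumes the line starts with ".TH", treats
double quotes as argument delimiters, keeps backslash escapes such as
\- and \*(Dt verbatim, and does not let an escaped space end an
argument. Doubled quotes inside a quoted argument and other roff
subtleties are deliberately not handled.

```python
def split_macro_args(line):
    """Split a roff macro line into the macro name and its arguments.

    Simplified on purpose: quotes group words, a backslash protects the
    character that follows it, and escapes are kept as written.
    """
    args = []
    current = []
    in_quotes = False
    started = False  # True once the current argument has begun
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            # Copy the backslash and the next character unchanged, so
            # that "\ " does not split and "\-" stays distinct from "-".
            current.append(line[i:i + 2])
            started = True
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
            started = True
        elif ch in " \t" and not in_quotes:
            if started:
                args.append("".join(current))
                current, started = [], False
        else:
            current.append(ch)
            started = True
        i += 1
    if started:
        args.append("".join(current))
    return args


def th_name_and_section(line):
    """Return (name, section) from a .TH line, or None if not found."""
    args = split_macro_args(line.strip())
    if len(args) >= 3 and args[0] == ".TH":
        return args[1], args[2]
    return None


if __name__ == "__main__":
    for sample in (
        r'.TH curl 1 "22 Oct 2003" "Curl 7.10.8" "Curl Manual"',
        r'.TH curl 1 22\ Oct\ 2003 Curl\ 7.10.8 Curl\ Manual',
        r'.TH evince\-thumbnailer 1 2007\-01\-15',
        r'.TH \*(x}',
    ):
        print(sample, "->", th_name_and_section(sample))
```

With this approach a line such as ".TH \*(x}" simply yields no result,
so the caller can fall back to a default title, while everything after
the second argument is left for a separate, optional step.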