Package: roffit
Version: 0.6+cvs20090507-1
Severity: wishlist
Tags: upstream patch

Roffit is very strict in detecting man page headers (TH lines). It should
recognize the fields correctly even if unnecessary double quotes are omitted.
Also, arbitrary white space between fields should not matter. The attached
patch probably fixes these problems.

One issue is still unsolved: escaped spaces. For nroff the following lines
are equivalent:

  .TH curl 1 "22 Oct 2003" "Curl 7.10.8" "Curl Manual"
  .TH curl 1 22\ Oct\ 2003 Curl\ 7.10.8 Curl\ Manual

This corner case will probably seldom show up, however. Fixing it might
require more than the single regexp that is currently used.

-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)

Kernel: Linux 2.6.30-2-686-bigmem (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages roffit depends on:
ii  perl                          5.10.1-12  Larry Wall's Practical Extraction

roffit recommends no packages.

roffit suggests no packages.

-- no debconf information

--- roffit.orig	2010-06-03 16:30:34.000000000 +0200
+++ roffit	2010-06-03 16:56:57.000000000 +0200
@@ -212,13 +212,19 @@
             # man page header:
             # curl 1 "22 Oct 2003" "Curl 7.10.8" "Curl Manual"
             # NAME SECTION DATE VERSION MANUAL
-            if($rest =~ /([^ ]*) (\d+) \"([^\"]*)\" \"([^\"]*)\"(\"([^\"]*)\")?/) {
+            if($rest =~ /
+                ([^ ]+)[ \t]+
+                (\d+)[ \t]+
+                (\"[^\"]+\"|[^ \t]+)[ \t]+
+                (\"[^\"]+\"|[^ \t]+)[ \t]+
+                (\"[^\"]+\"|[^ \t]+)?
+                /x) {
                 # strict matching only so far
                 $manpage{'name'} = $1;
                 $manpage{'section'} = $2;
                 $manpage{'date'} = $3;
                 $manpage{'version'} = $4;
-                $manpage{'manual'} = $6;
+                $manpage{'manual'} = $5;
             }
         }
         elsif($keyword =~ /^S[HS]$/) {
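
For the escaped-space case left open above, one possible direction is to
tokenize the TH arguments instead of matching them with a single regexp. The
following is only a rough sketch, not tested against roffit itself:
split_th_fields is a hypothetical helper that does not exist in roffit, and
it assumes that $rest holds everything after ".TH". It also strips the
surrounding double quotes, which the alternation in the patch above appears
to keep inside $3 to $5.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of roffit): split the arguments of a
# TH line into fields. A field is either a double-quoted string or a
# run of non-blank characters in which "\ " stands for a literal space.
sub split_th_fields {
    my ($rest) = @_;
    my @fields;
    # \G with /gc continues scanning where the previous match ended.
    while ($rest =~ /\G\s*(?:"([^"]*)"|((?:\\.|[^\s\\])+))/gc) {
        my ($quoted, $bare) = ($1, $2);
        if (defined $quoted) {
            push @fields, $quoted;      # quotes are already stripped
        }
        else {
            $bare =~ s/\\ / /g;         # unescape "\ " in bare fields
            push @fields, $bare;
        }
    }
    return @fields;
}

# Both spellings from the report yield the same five fields.
my @quoted  = split_th_fields('curl 1 "22 Oct 2003" "Curl 7.10.8" "Curl Manual"');
my @escaped = split_th_fields('curl 1 22\ Oct\ 2003 Curl\ 7.10.8 Curl\ Manual');
print join('|', @quoted), "\n";   # curl|1|22 Oct 2003|Curl 7.10.8|Curl Manual
print join('|', @escaped), "\n";  # curl|1|22 Oct 2003|Curl 7.10.8|Curl Manual
```

With such a helper, the header fields could be assigned through a hash slice,
for example @manpage{qw(name section date version manual)} =
split_th_fields($rest); a missing trailing field would then simply stay
undefined. Other nroff escapes are not interpreted by this sketch.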