markus schnalke <mei...@marmaro.de> writes:

>> /usr/share/man/man1/rcsintro.1.gz:.TH RCSINTRO 1 \*(Dt GNU
>> /usr/share/man/man1/saidar.1.gz:.TH saidar 1 $Date:\ 2006/11/30\
>> 23:42:42\ $ i\-scream
>>
> The last line is such a case.

Handled in the patch.

> If you parse it char for char, then you can parse it

I meant that you can't reliably read fields from space-delimited text
when the same position can mean different things. It needs a quote to
say BEGIN and a quote to say END for:

    NAME SECTION DATE VERSION MANUAL

> The most important thing is detecting the first two parameters
> ... First detect the first two arguments, which will succeed almost
> always.

Added a final ELSIF case for that. Daniel, use this.

Jari

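To illustrate the BEGIN/END point above, here is a minimal sketch. It is
not part of the patch: th_fields() is a hypothetical helper, and it uses
the core Text::ParseWords module instead of the regexes in roffit. It
only shows that the five TH fields can be recovered when they are quoted,
and that the same text without quotes gives no way to tell where DATE
ends and VERSION begins.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Hypothetical helper (not in roffit): split the arguments of a .TH line
# on whitespace while keeping double-quoted fields together.
sub th_fields {
    my ($rest) = @_;
    $rest =~ s/^\s+//;    # strip leading whitespace first, as the patch does
    return quotewords('\s+', 0, $rest);
}

# Quoted fields: NAME SECTION DATE VERSION MANUAL are unambiguous.
my @quoted = th_fields('curl 1 "22 Oct 2003" "Curl 7.10.8" "Curl Manual"');

# Same text without quotes: only NAME and SECTION can be trusted.
my @unquoted = th_fields('curl 1 22 Oct 2003 Curl 7.10.8 Curl Manual');

print scalar(@quoted),   " fields: ", join(' | ', @quoted),   "\n";
print scalar(@unquoted), " fields: ", join(' | ', @unquoted), "\n";
```

The quoted line splits into five fields, the unquoted one into nine
words, which is why the fallback cases in the patch only rely on the
first two arguments.
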
From 5675160c2b879b9d4b9b29e16224a8090ce32b0a Mon Sep 17 00:00:00 2001
From: Jari Aalto <jari.aa...@cante.net>
Date: Fri, 4 Jun 2010 10:12:23 +0300
Subject: [PATCH] roffit: improve TH handling
Organization: Private
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

Signed-off-by: Jari Aalto <jari.aa...@cante.net>
---
 roffit |   52 +++++++++++++++++++++++++++++++++++++++++++++-------
 1 files changed, 45 insertions(+), 7 deletions(-)

diff --git a/roffit b/roffit
index 3149f37..ae55406 100755
--- a/roffit
+++ b/roffit
@@ -203,23 +203,61 @@ sub parsefile {
         $out = "";
 
         # cut off initial spaces
-        $rest =~ s/^ +//g;
+        $rest =~ s/^\s+//;
 
-        if($keyword eq "\\\"") {
+        if ( $keyword eq q(\\") ) {
             # this is a comment, skip this line
         }
-        elsif($keyword =~ /^TH$/) {
+        elsif ( $keyword eq "TH" ) {
             # man page header:
             # curl 1 "22 Oct 2003" "Curl 7.10.8" "Curl Manual"
+
+            # Treat pages that have "*(Dt":
+            # .TH IDENT 1 \*(Dt GNU
+
+            $rest =~ s,\Q\\*(Dt,,g;
+
+            # Delete backslashes
+
+            $rest =~ s,\\,,g;
+
+            # Delete old RCS tags
+            # .TH saidar 1 $Date:\ 2006/11/30\ 23:42:42\ $ i\-scream
+
+            $rest =~ s,\$Date:\s+(.*?)\s+\$,$1,g;
+
             # NAME SECTION DATE VERSION MANUAL
-            if($rest =~ /([^ ]*) (\d+) \"([^\"]*)\" \"([^\"]*)\"(\"([^\"]*)\")?/) {
+            # section can be: 1 or 3C
+
+            if ( $rest =~ /(\S+)\s+\"?(\d\S?+)\"?\s+\"([^\"]*)\" \"([^\"]*)\"(\"([^\"]*)\")?/ ) {
                 # strict matching only so far
-                $manpage{'name'} = $1;
+                $manpage{'name'}    = $1;
                 $manpage{'section'} = $2;
-                $manpage{'date'} = $3;
+                $manpage{'date'}    = $3;
                 $manpage{'version'} = $4;
-                $manpage{'manual'} = $6;
+                $manpage{'manual'}  = $6;
             }
+            # .TH html2text 1 2008-09-20 HH:MM:SS
+            elsif ( $rest =~ m, (\S+) \s+ \"?(\d\S?+)\"? \s+ \"?([ \d:/-]+)\"? \s* (.*) ,x )
+            {
+                $manpage{'name'}    = $1;
+                $manpage{'section'} = $2;
+                $manpage{'date'}    = $3;
+                $manpage{'manual'}  = $4;
+            }
+            # .TH program 1 description
+            elsif ( $rest =~ /(\S+) \s+ \"?(\d\S?+)\"? \s+ (.+)/x )
+            {
+                $manpage{'name'}    = $1;
+                $manpage{'section'} = $2;
+                $manpage{'manual'}  = $3;
+            }
+            # .TH program 1
+            elsif ( $rest =~ /(\S+) \s+ \"?(\d\S?+)\"? /x )
+            {
+                $manpage{'name'}    = $1;
+                $manpage{'section'} = $2;
+            }
         }
         elsif($keyword =~ /^S[HS]$/) {
             # SS is treated the same as SH
-- 
1.7.1