> to be able to retrieve information from man pages, especially
> existing options and their help strings.

I'd actually try parsing the man page sources. They are already in
a layout language so finding the options should be easy - there is
a man macro for that, I believe. But I've never tried it!

If that turns out to be harder than I think, then an XML option
would be next best. But I don't know of a man2xml command.

Finally the man2html route is definitely possible and BeautifulSoup
should be able to find the option tags for you - but it may find
many others besides so you might need some clever selection
criteria. But the parsing at least should be done for you.
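
As a rough sketch of that selection step: the code below collects every
bold element from an HTML page and keeps only those whose text starts
with a dash. It uses the standard library's html.parser as a stand-in
for BeautifulSoup so that it runs without extra installs. The sample
markup is made up; I am assuming man2html puts option names in <b> tags,
which you would need to check against real output.

```python
from html.parser import HTMLParser


class BoldCollector(HTMLParser):
    """Collect the text of every <b> element in an HTML page."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many <b> tags we are currently inside
        self.found = []   # text of each completed <b> element
        self._buf = []    # pieces of text for the element in progress

    def handle_starttag(self, tag, attrs):
        if tag == "b":    # tag names arrive lower-cased
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "b" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.found.append("".join(self._buf).strip())
                self._buf = []

    def handle_data(self, data):
        if self.depth:
            self._buf.append(data)


def option_candidates(html):
    """Return bold strings that look like options (they start with '-')."""
    parser = BoldCollector()
    parser.feed(html)
    parser.close()
    return [text for text in parser.found if text.startswith("-")]


# Hypothetical fragment, not real man2html output.
sample = (
    "<H2>OPTIONS</H2><DL><DT><B>-a</B>, <B>--all</B></DT>"
    "<DD>do not ignore entries</DD><DT><B>ls</B></DT></DL>"
)
print(option_candidates(sample))
```

The "starts with a dash" test is the simplest selection criterion I can
think of; it will still pick up bold options mentioned in running text.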

> could parse directly the troff sources (is there already a parser 
> for that?),

I don't know if there is a parser, but ISTR there is a specific option
tag in the man macros for command options so it should be easy to
find and extract the data by looking for

.OP

or whatever the tag is. The good news is that troff macros are nearly
always located at the margin and start with a dot so they are easy
to locate using a simple regex.
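
Something like this would do for the "dot at the margin" regex. It is
only a sketch: the function name and the sample page are made up, and
it only looks at macros whose names are plain letters.

```python
import re

# A macro line has a dot in column one, then the macro name, then
# optional arguments on the same line.
MACRO_LINE = re.compile(r"^\.[ \t]*([A-Za-z]+)[ \t]*(.*)$", re.MULTILINE)


def find_macros(source):
    """Return (name, arguments) for every macro line in a man page source."""
    return [(m.group(1), m.group(2).strip())
            for m in MACRO_LINE.finditer(source)]


# Hypothetical man page source, for illustration only.
sample = r""".TH DEMO 1
.SH NAME
demo \- a made-up command
.SH OPTIONS
.TP
.B \-v
Be verbose.
"""

for name, args in find_macros(sample):
    print(name, args)
```

Ordinary text lines are skipped because they do not start with a dot.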

Actually I just had a quick look and it's not so good after all; the
format seems to be

.SH Options
......text in here
.B  <option>

But worse, not all pages follow the official format, as exemplified in
the "yes" man page; some don't even have an Options subheading...
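
A minimal sketch of pulling the .B lines out of the Options section,
returning an empty list when a page has no such section. The function
name and the sample page are hypothetical, and real pages also use
other macros (.BR, .IP, inline \fB...\fR) that this ignores.

```python
def options_from_source(source):
    """Return the arguments of .B lines inside the OPTIONS section.

    The result is an empty list when the page has no such section.
    """
    options = []
    in_options = False
    for line in source.splitlines():
        if line.startswith(".SH"):
            # A new section heading: are we entering or leaving OPTIONS?
            heading = line[3:].strip().strip('"')
            in_options = heading.upper() == "OPTIONS"
        elif in_options and line.startswith(".B "):
            # Undo the troff escape for a literal hyphen.
            options.append(line[3:].strip().replace("\\-", "-"))
    return options


# Hypothetical man page source, for illustration only.
page = r"""
.SH NAME
demo \- a made-up command
.SH Options
.TP
.B \-v
Be verbose.
.TP
.B \-\-quiet
Say nothing.
.SH BUGS
.B None
known.
"""

print(options_from_source(page))
```

The heading is compared case-insensitively because pages differ on
"OPTIONS" versus "Options".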

Another option (sic) to consider is parsing the .cat files that are
produced by man on first use (a bit like Python produces .pyc files).
They are plain text so might be easier to search.
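
If you go the plain text route, a sketch like the one below might be a
starting point. It assumes the formatted page marks bold text by
backspace overstriking, which is stripped first, and that each option
starts an indented line. Both are assumptions to verify on your own
system; the sample text and names are made up.

```python
import re

# A character followed by a backspace: the first half of an overstrike.
OVERSTRIKE = re.compile(r".\x08")

# An indented line whose first word looks like -x or --long-name.
OPTION_LINE = re.compile(r"^[ \t]+(-{1,2}[A-Za-z0-9][\w-]*)", re.MULTILINE)


def options_from_text(text):
    """Return option names found at the start of indented lines."""
    plain = OVERSTRIKE.sub("", text)
    return OPTION_LINE.findall(plain)


# Hypothetical formatted page; "-\x08-v\x08v" is an overstruck "-v".
sample = (
    "OPTIONS\n"
    "       -\x08-v\x08v     Be verbose.\n"
    "       --quiet\n"
    "              Say nothing - at all.\n"
)

print(options_from_text(sample))
```

A dash in the middle of a description line is not picked up because the
pattern only looks at the first word after the indentation.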

Finally you could look at the info files (generated from Texinfo
sources), if they exist for all your commands.

But I think an xml/html solution is looking better!

HTH,

Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld 

