TJ,

Providing automated cleanup tools is one of the things I enjoy doing for the Documentation Project. The main tool I use is Writer: people tend to forget that our flagship word processor is a fine and powerful text editor, too. Add a little bit -- or a lot -- of Basic code, and one can accomplish almost anything with text documents. And, by definition, it runs on any supported platform.

You may be surprised, but we actually do use Writer to produce the
help content. We use special import and export filters to load and
save the XHP files, along with a set of Basic scripts that control
the output; see
http://documentation.openoffice.org/source/browse/documentation/www/online_help/helpers/helpauthoring/

(I am currently updating the files)

Writer handles XML and HTML 3.2 Help files just fine as ordinary text. It's not as pretty, for humans, as specialized editors would be, but Basic doesn't care.

Specifically, getting rid of the strings of six blanks in the CWS should be easy, assuming that there are no legitimate instances of such a thing.

Any such validation must be based on rules that we need to define, so
that we do not accidentally "fix" things that are not bugs. Some that
come to mind:

- No trailing spaces in a <paragraph>
- No empty elements except for <br/>
- Adjacent elements need to be merged:
  <emph>One</emph><emph> Two</emph> -> <emph>One Two</emph>
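
To make the three rules concrete, here is a rough sketch in Python. It is only an illustration under some assumptions: it treats the XHP as plain text and uses regular expressions rather than a real XML parser, it takes "empty" to mean "no content" (self-closing or open/close with nothing in between), and the function names are made up for this example. Empty elements are only reported, not removed, so nothing legitimate gets "fixed" by accident.

```python
import re

# Assumption: an element name looks like a simple XML name.
_NAME = r"[A-Za-z_][\w.-]*"


def strip_trailing_spaces(xml):
    """Rule 1: drop spaces and tabs directly before </paragraph>."""
    return re.sub(r"[ \t]+(</paragraph>)", r"\1", xml)


def find_empty_elements(xml):
    """Rule 2: list names of empty elements other than <br/>.

    Only reports them; deciding whether each one is a real bug
    is left to a human or to a stricter rule.
    """
    # Self-closing form: <name ... />
    names = re.findall(r"<(%s)\b[^<>]*?/>" % _NAME, xml)
    # Open/close pair with only whitespace in between: <name ...> </name>
    names += re.findall(r"<(%s)\b[^<>]*>\s*</\1>" % _NAME, xml)
    return [n for n in names if n != "br"]


def merge_adjacent(xml, inline=("emph",)):
    """Rule 3: merge directly adjacent, attribute-free inline elements.

    `inline` is an assumed list of element names that are safe to merge.
    """
    for name in inline:
        xml = xml.replace("</%s><%s>" % (name, name), "")
    return xml
```

For example, merge_adjacent("<emph>One</emph><emph> Two</emph>") gives "<emph>One Two</emph>", matching the case above.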

If the files are (or can be) laid out in a hierarchical set of directories, the way the Help files are when a source tar-ball is unpacked, then I can scan all the files in one operation, processing each one, and save the output in place, or to a parallel set of directories.
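
As an illustration of that kind of batch run (not the Basic macro itself), here is a minimal sketch in Python. The function name, the "*.xhp" pattern, the UTF-8 encoding, and the transform callback are all assumptions made for the example. Passing the same directory as source and destination saves in place; passing a different one writes to a parallel tree.

```python
from pathlib import Path


def process_tree(src_root, dst_root, transform, pattern="*.xhp"):
    """Apply `transform` to every matching file under src_root.

    Results are written under dst_root using the same relative paths.
    Returns the number of files processed.
    """
    src_root = Path(src_root)
    dst_root = Path(dst_root)
    count = 0
    # sorted() collects the file list up front, so files written
    # during the run are not picked up again.
    for src in sorted(src_root.rglob(pattern)):
        if not src.is_file():
            continue
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        text = src.read_text(encoding="utf-8")
        dst.write_text(transform(text), encoding="utf-8")
        count += 1
    return count
```

A call such as process_tree("helpcontent", "helpcontent_clean", some_cleanup) would leave the originals untouched, which makes it easy to diff the two trees before committing anything; the directory names and some_cleanup are placeholders.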

[...]

If anyone is interested in this approach, please let me know (on dev-doc). --/tj/

Ultimately, I guess we need to make this part of the help authoring
cycle, adding that validation/normalization either to the
helpauthoring extension for OOo or to the module build.

Frank

--
Frank Peters x66757
Learning and Publications Manager
SLS - CCLS Application Services
Office Productivity & Communication Suite
Sun Microsystems, Hamburg

