I am the editor of a document [the IEEE 754-2008 standard] that was created around 15 years ago (using OpenOffice), and has had nearly 200 drafts, a number of editors, and countless edits. It was last changed in 2008, but is now about to go through a new revision cycle.
I was delighted to find that LibreOffice handled the 2008 .odt file almost perfectly, with only 7 errors (all were weird spurious empty reference tags, of unknown provenance, that OpenOffice quietly ignored). While identifying and removing those from content.xml, I noticed that there are hundreds (possibly thousands) of redundant tags. These typically appear in the context:

<span whatever>text1</span><span whatever>text2</span>

where 'whatever' is identical, and either or both of 'text1' and 'text2' may be empty.

Is there a tool to clean these up? I could write one myself (I recently wrote an XML parser) but if one already exists ...

Many thanks -- Mike Cowlishaw

[Apologies if this is a duplicate .. I tried it on askLibo some time ago but it is still "awaiting moderation".]
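
For anyone wanting to see the kind of cleanup meant here, the following is a minimal sketch in Python using the standard xml.etree.ElementTree module. It is not an existing tool. The function name merge_adjacent and the plain tag name "span" are assumptions for illustration only; in a real content.xml the element would be the namespace-qualified text:span, and the namespaces would need to be registered before writing the file back out. The sketch merges two sibling spans only when their attributes are identical and there is no text at all between them.

```python
import xml.etree.ElementTree as ET


def merge_adjacent(parent, tag="span"):
    """Merge adjacent <tag> children of `parent` that have identical attributes.

    Hypothetical helper for illustration; `tag` would need to be the
    namespace-qualified name when run against a real ODF content.xml.
    """
    i = 0
    while i < len(parent) - 1:
        a, b = parent[i], parent[i + 1]
        same = a.tag == b.tag == tag and a.attrib == b.attrib
        # a.tail holds any text between </a> and <b>; merge only if there is none.
        if same and not a.tail:
            if len(a):
                # a already has child elements: b's leading text follows the last one.
                a[-1].tail = (a[-1].tail or "") + (b.text or "")
            else:
                a.text = (a.text or "") + (b.text or "")
            a.extend(list(b))   # move b's child elements into a
            a.tail = b.tail     # text after </b> now follows </a>
            parent.remove(b)
            # Do not advance i: the merged span may also match the next sibling.
        else:
            i += 1
    # Recurse so nested runs of identical spans are merged as well.
    for child in parent:
        merge_adjacent(child, tag)


if __name__ == "__main__":
    sample = (
        '<p><span s="T1">IEEE </span><span s="T1"></span>'
        '<span s="T1">754</span> and <span s="T2">x</span></p>'
    )
    root = ET.fromstring(sample)
    merge_adjacent(root)
    print(ET.tostring(root, encoding="unicode"))
```

The check on a.tail is deliberate: whitespace between two spans can be significant in the rendered text, so the sketch leaves such pairs alone rather than risk changing the document.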