Re: [W2X] How to map a custom paragraph style to a DITA <p> containing <b> and/or <i>

Hussein Shafie Sat, 21 Sep 2019 01:54:28 -0700

Damian C. wrote:
> Thanks for your recent support. It's all been very helpful and we've got our docx import working almost exactly the way that we want it. However, as you know, we sometimes receive client documents with strange style choices and the one that I've attached is a good example. For some reason there are paragraph titles which are styled bold and/or italic in Word and yet when converted to dita these are imported as plain paragraphs with no styling. In other words we're not getting the semantic bold/italic tags that we'd expect within the paragraph. Now I can see that the styles being applied (such as p-BodyTextBold) are based on a style called s123456basebodytext which seems a bit strange.

Not a problem.

> However I can't tell if this is the problem.

No.

> So I was wondering if you can shed some light on this?

<b> and <i> elements are generated only for MS-Word *character* styles, not for MS-Word *paragraph* styles like those found in "Project foo.docx" (see them listed below).

When an MS-Word paragraph style specifies that all the text contained in the paragraph is by default bold and/or italic, this conveys no semantic meaning. If needed, you are supposed to map this MS-Word paragraph style to a DITA semantic element generally rendered as bold and/or italic (e.g. <title>).

In a nutshell, out of the box, w2x has no way to generate a <p> having all its text wrapped in a single <b> and/or <i>.

--> What follows is a relatively easy way to implement what you want (as far as I understand it).

w2x -f foo.options "Project foo.docx" out.dita

where attached "foo.options" is:
---
-o topic
-p edit.remove-styles.preserved-classes "/^p-BodyTextBold/"
-t foo.xslt
---
(If you want a map, simply replace "-o topic" by "-o map")

-p edit.remove-styles.preserved-classes "/^p-BodyTextBold/" means: keep all classes whose name starts with "p-BodyTextBold" in the intermediate semantic XHTML, which will then be converted to DITA by means of W2X_install_dir/xslt/topic.xslt.
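
To make the effect of the pattern concrete, here is a minimal sketch that checks a few class names against the same anchored expression using grep. This only illustrates what "starts with p-BodyTextBold" selects; it does not call w2x, the helper name matches_preserved is hypothetical, and p-BodyText and p-Heading1 are made-up class names added for contrast. It also assumes that, for a pattern this simple, grep -E behaves like the regular expression engine used by w2x.

```shell
# Hypothetical helper (not part of w2x): print the class names that the
# pattern ^p-BodyTextBold would select, one per line.
matches_preserved() {
  printf '%s\n' "$@" | grep -E '^p-BodyTextBold'
}

# p-BodyText and p-Heading1 are invented names used only for contrast.
matches_preserved p-BodyTextBold p-BodyTextBoldItalics p-BodyText p-Heading1
# prints:
#   p-BodyTextBold
#   p-BodyTextBoldItalics
```

Both paragraph styles used in "Project foo.docx" are selected by the single pattern, which is why one -p option is enough for the two templates in foo.xslt.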

Attached "foo.xslt" is:
---
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:h="http://www.w3.org/1999/xhtml"
                exclude-result-prefixes="h">

  <xsl:import href="w2x:xslt/topic.xslt"/>

  <!-- Bold paragraph style: wrap all the paragraph content in a single <b>. -->
  <xsl:template match="h:p[@class = 'p-BodyTextBold']">
    <p>
      <xsl:call-template name="processCommonAttributes"/>
      <b><xsl:apply-templates/></b>
    </p>
  </xsl:template>

  <!-- Bold italic paragraph style: wrap the content in <b> and <i>. -->
  <xsl:template match="h:p[@class = 'p-BodyTextBoldItalics']">
    <p>
      <xsl:call-template name="processCommonAttributes"/>
      <b><i><xsl:apply-templates/></i></b>
    </p>
  </xsl:template>

</xsl:stylesheet>
---

> PS Do you know if there's a way to output the intermediate xhtml/css files which are used when converting from Word to Dita? I'd quite like to look at these for debugging purposes but I can't find a parameter which lets me keep these files.

--> If you want to look at MS-Word styles converted to CSS styles, please convert your file to the "xhtml_css" format (the default output format) and then look for <style> inside the generated ".html" file.

Example:

w2x "Project foo.docx" out.html

out.html contains:
---
...
.p-BodyTextBold {
    font-family: Arial;
    font-weight: bold;
    ...
}
.p-BodyTextBoldItalics {
    font-family: Arial;
    font-style: italic;
    font-weight: bold;
    ...
}
...
---

--> If you want to look at the intermediate semantic XHTML which is then converted to DITA by means of W2X_install_dir/xslt/topic.xslt, please convert your file to the "xhtml_loose" format:
Example: w2x -o xhtml_loose "Project foo.docx" out.xhtml
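
Once out.xhtml exists, a quick way to see which class names are present in it is to list the class attributes with standard tools. The following is only a sketch: the helper name list_classes is hypothetical, it relies on plain text matching rather than XML parsing, and it assumes class attributes are written with double quotes.

```shell
# Hypothetical helper (not part of w2x): count each distinct class="..."
# attribute found in the given file.
list_classes() {
  grep -o 'class="[^"]*"' "$1" | sort | uniq -c
}

# Only runs if out.xhtml was produced by the w2x command above.
if [ -f out.xhtml ]; then
  list_classes out.xhtml
fi
```

If a class you need in a custom XSLT template is missing from this list, a template matching on that class will have nothing to match.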

-- XMLmind Word To XML Support List w2x-support@xmlmind.com https://www.xmlmind.com/mailman/listinfo/w2x-support
