Yasser S. wrote:
Thanks Hussein. That worked and I bought a license.
I have a question about Unicode characters: the documents I'm working on have
characters in the Unicode private use range, such as 0xF072, which becomes u
when it goes through w2x. Is there a way to preserve these characters all the
way to the output XHTML?
Yasser S. wrote:
I hacked a solution and would like your opinion on it.
I added a xed script in main.xed in the before.after-translate step.
I thought you wanted to generate styled XHTML and not semantic XHTML.
main.xed is for semantic XHTML. main-styled is for styled XHTML. Step
after-translate is found only in main.xed.
Here's the script:
(:
: Transform arabic honorifics (ligatures)
:
:)
namespace "http://www.w3.org/1999/xhtml";
namespace html = "http://www.w3.org/1999/xhtml";
warning("In cordoba ");
for-each /html/body//span[contains(@style, 'AGA Arabesque')]/text() {
set-variable("honorific", string(.));
warning("honorific set to ", $honorific);
if ($honorific = 'u') {
set-variable("honorific", "");
} elseif ($honorific = 'r') {
set-variable("honorific", "");
} elseif ($honorific = 't') {
set-variable("honorific", "");
} else {
warning("Unknown character ", .);
}
warning("Honorific is: ", $honorific);
replace(<span class="honorific">{$honorific}</span>, ./..);
}
This seems to work and is producing the desired output.
Your script is well thought out, but could be made slightly more general.
Is there a better way to do this?
The following script (I called it "ar1.xed") works whether you generate
styled XHTML or semantic XHTML:
---
(:
: Transform Arabic honorifics (ligatures)
:
:)
namespace "http://www.w3.org/1999/xhtml";
namespace html = "http://www.w3.org/1999/xhtml";
(: PITFALL: lookup-style('font-family') returns a QUOTED STRING like
'AGA Arabesque' or "AGA Arabesque".
Hence test using "contains()" and not "=". :)
for-each /html/body//span[contains(lookup-style('font-family'),
'AGA Arabesque')] {
set-variable("honorific", string(.));
message("Testing honorific ", concat('"', $honorific, '"'));
if ($honorific = 'u') {
set-variable("honorific", "");
} elseif ($honorific = 'r') {
set-variable("honorific", "");
} elseif ($honorific = 't') {
set-variable("honorific", "");
} else {
message("Unknown honorific character ",
concat('"', $honorific, '"'));
continue();
}
message("Setting honorific to ", concat('"', $honorific, '"'));
replace(<span class="honorific">{$honorific}</span>);
}
---
It is invoked as follows:
-pu edit.after.init-styles ar1.xed
(-pu because ar1.xed is a relative URL, not a plain string.)
The main difference from yours is that it is invoked after step
init-styles, which INTERNS THEN SUPPRESSES the style and class
attributes. That is why I use lookup-style('font-family') and not @style.
Using @style is less efficient. Moreover, @style only contains direct
styles, whereas lookup-style() performs a full style search, including named
and inherited styles. See "string lookup-style(string, node?)" in
http://www.xmlmind.com/w2x/_distrib/doc/xedscript/w2xfuncs.html#lookup-style
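The quoting pitfall noted in the script comment can be illustrated outside of
XED. The following is only a minimal Python sketch; mentions_font() and
unquote() are hypothetical helper names, not w2x or XED functions, and the
sample value simply mimics the quoted string described above.

```python
def unquote(value: str) -> str:
    """Strip one pair of matching single or double quotes, if present."""
    value = value.strip()
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        return value[1:-1]
    return value


def mentions_font(font_family: str, name: str) -> bool:
    """Substring test, analogous to contains() in the script above."""
    return name in font_family


# A font-family value that still carries its quotes.
quoted = "'AGA Arabesque'"

# An equality test fails because the quotes are part of the string.
print(quoted == "AGA Arabesque")                 # False
# A substring test succeeds whether the quotes are single or double.
print(mentions_font(quoted, "AGA Arabesque"))    # True
# Alternative: remove the quotes first, then compare for equality.
print(unquote(quoted) == "AGA Arabesque")        # True
```

The substring test is the simpler option, which is why the script uses
contains() rather than trying to strip the quotes.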
Another difference from your script is the use of "continue();" in the
"for-each" loop: a span containing an unknown character is skipped and left
untouched instead of being replaced.
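The same skip-on-unknown logic can be sketched as a table lookup. This is an
illustration in Python, not part of w2x. The mapping rule used here (private
use code point = 0xF000 + ASCII code) is an ASSUMPTION made only to have
concrete values; the code points really used by the font must be substituted.
HONORIFICS and convert_honorifics() are hypothetical names.

```python
# ASSUMPTION (not confirmed in this thread): each legacy letter maps to the
# private use code point 0xF000 + its ASCII code. Replace this table with
# the code points actually used by the font.
HONORIFICS = {ch: chr(0xF000 + ord(ch)) for ch in ("u", "r", "t")}


def convert_honorifics(span_texts):
    """Return the texts with known honorific letters replaced."""
    converted = []
    for text in span_texts:
        replacement = HONORIFICS.get(text)
        if replacement is None:
            # Unknown character: report it and keep the text unchanged,
            # which is the effect of continue() in the for-each loop.
            print(f'Unknown honorific character "{text}"')
            converted.append(text)
            continue
        converted.append(replacement)
    return converted


result = convert_honorifics(["u", "x", "r"])
print([hex(ord(ch)) for ch in result])
```

A table keeps the mapping in one place, so adding a new honorific means
adding one entry rather than another elseif branch.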
--
XMLmind Word To XML Support List
http://www.xmlmind.com/mailman/listinfo/w2x-support