Hello everyone,

The other day I had an idea that might help solve the problem in HTML
translation that is caused by the reordering of words and phrases in
transfer.
I do not know much about the transfer engine's internals, but
conceptually this might work:

A deformatter would separate HTML tags from text and attach an origin
descriptor pseudo-tag to the lemma of each word, indicating e.g. the
parent HTML tag.
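
To make the deformatter step concrete, here is a minimal sketch in
Python. It is only an illustration of the idea, not how any existing
Apertium deformatter works: the class name Deformatter, the function
deformat and the choice of the parent tag name as the origin descriptor
are all assumptions of mine. It also ignores void elements such as <br>.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class Deformatter(HTMLParser):
    """Hypothetical deformatter: pairs each word with its parent tag name."""

    def __init__(self) -> None:
        super().__init__()
        self.stack: List[str] = []  # currently open tags, innermost last
        self.words: List[Tuple[str, Optional[str]]] = []

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Only pop when the closing tag matches the innermost open tag.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        # The origin descriptor here is simply the innermost open tag.
        parent = self.stack[-1] if self.stack else None
        for word in data.split():
            self.words.append((word, parent))


def deformat(html: str) -> List[Tuple[str, Optional[str]]]:
    parser = Deformatter()
    parser.feed(html)
    parser.close()
    return parser.words


print(deformat("<p>a <b>red</b> house</p>"))
```

A real origin descriptor would probably have to reference the concrete
DOM element rather than just the tag name, so that two separate <b>
elements can be told apart.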

Now consider the following semantic changes:

1.
Superblanks encode actual blanks only (whitespace, tab, newline etc.,
and any combination thereof).

2.
The lem-element of a lexical unit is a pair consisting of the lemma
string and an origin descriptor (e.g. a reference to the parent DOM
element in an HTML document).
The origin descriptor is invisible in transfer and therefore cannot be
tampered with.

3.
A variable is a pair consisting of a string and an origin descriptor.

4.
Assignment operations (e.g. <let>) involving a lem transparently copy
the origin descriptor, unless this behavior is altered by an optional
argument along the lines of <let origin="copy">...</let>, <let
origin="keep-old">...</let> or <let origin="reset">...</let>.
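
Points 2 to 4 could be modelled roughly as follows. This is a sketch
under my own assumptions, not a description of the transfer engine: the
names Tagged and let are hypothetical, and the exact meaning of the
three modes is my guess ("copy" takes the origin of the assigned value,
"keep-old" keeps the origin the target had before, "reset" clears it).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tagged:
    """A string paired with an origin descriptor (hypothetical type)."""
    text: str
    origin: Optional[str] = None  # e.g. an identifier of the parent element


def let(old: Tagged, new: Tagged, origin: str = "copy") -> Tagged:
    """Return the value the target holds after assigning `new` to it.

    The string is always taken from `new`; `origin` selects which
    origin descriptor the result carries (assumed semantics).
    """
    if origin == "copy":
        return Tagged(new.text, new.origin)
    if origin == "keep-old":
        return Tagged(new.text, old.origin)
    if origin == "reset":
        return Tagged(new.text, None)
    raise ValueError(f"unknown origin mode: {origin!r}")


lemma = Tagged("house", "b#1")   # lemma that came from inside a <b> element
variable = Tagged("", "p#0")     # variable that currently belongs to a <p>

print(let(variable, lemma))                     # origin follows the lemma
print(let(variable, lemma, origin="keep-old"))  # origin stays with the variable
print(let(variable, lemma, origin="reset"))     # origin is dropped
```

Since rules only ever see the text part, the default "copy" mode is what
would let existing rules carry the origin along without being changed.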


The reformatter could then make sure that all words end up within the
right HTML tags and might even ensure validity using the format's DTD.
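
As a rough illustration of the reformatter side, the sketch below
takes the words in their order after transfer, each with its origin,
and wraps runs of adjacent words that share an origin in that tag. The
function name reformat and the use of a plain inline tag name (or None
for "no tag") as the origin are assumptions for this example only.
Blanks are simplified to single spaces and no DTD validation is done.

```python
from itertools import groupby
from typing import List, Optional, Tuple


def reformat(words: List[Tuple[str, Optional[str]]]) -> str:
    """Rebuild markup from (word, origin) pairs in target-language order."""
    parts = []
    # Adjacent words with the same origin are wrapped in one shared tag.
    for origin, group in groupby(words, key=lambda pair: pair[1]):
        text = " ".join(word for word, _ in group)
        parts.append(f"<{origin}>{text}</{origin}>" if origin else text)
    return " ".join(parts)


# Source "a <b>red</b> house"; transfer moved the adjective to the end.
print(reformat([("una", None), ("casa", None), ("roja", "b")]))
```

The bold tag follows the adjective to its new position, which is the
behavior the origin descriptor is meant to make possible.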

This way HTML translation could effectively be implemented without
changing the source code of existing language pairs.


Let me know what you think.

Benedikt

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff