On Fri, Aug 10, 2012 at 10:16 PM, Mikel Artetxe <artet...@gmail.com> wrote:

> 1) Invoke it as an external program.
>
Probably the easiest to get working, but it does add a silly text
generation and parsing step.
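
A rough sketch of what option 1 could look like from the Java side, using only the standard ProcessBuilder API. The class and method names (ExternalCg, runFilter) are made up for illustration, and the command line in the comment is only an assumption; check the installed tool's own help for the real arguments. It needs Java 9 or newer because of readAllBytes().

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

class ExternalCg {
    /**
     * Pipes {@code input} through an external command and returns the
     * command's standard output decoded as UTF-8.
     * Hypothetical call: runFilter(Arrays.asList("cg-proc", "grammar.bin"), text)
     */
    static String runFilter(List<String> command, String input) {
        try {
            Process p = new ProcessBuilder(command)
                    .redirectError(ProcessBuilder.Redirect.INHERIT)
                    .start();

            // Write stdin from a second thread so that a full pipe buffer
            // on either side cannot deadlock the caller.
            Thread writer = new Thread(() -> {
                try (OutputStream stdin = p.getOutputStream()) {
                    stdin.write(input.getBytes(StandardCharsets.UTF_8));
                } catch (IOException ignored) {
                    // The child may exit before reading all of its input.
                }
            });
            writer.start();

            byte[] out = p.getInputStream().readAllBytes();
            writer.join();
            int status = p.waitFor();
            if (status != 0) {
                throw new IOException("command exited with status " + status);
            }
            return new String(out, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The text generation and parsing step mentioned above is exactly the String in, String out round trip here: the caller has to serialise its analyses to the stream format and parse the result back.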


> 2) Create a Java interface for CG using JNI. ... For instance, just
> looking at the installation instructions I see that it depends on some
> external libraries, so things start getting more complex...
>
Boost is header-only, so doesn't add any files to the distribution.
libtcmalloc is optional.
ICU is the heavy one. I've looked at removing ICU and making a UTF-8-only
version of CG-3, since everyone uses just UTF-8 these days. The key problem
with that is regular expressions: I pass regexes off to ICU's very nice
regex engine, which supports Unicode character classes (e.g. \p{Katakana}).
From what I could find, the only C++ engines capable of UTF-8 and Unicode
character classes are ICU and PCRE, so that would be trading one library
for another, less capable one.

I'm also open to making the library version easier to use, or just adding
an easier-to-use API. The current API is for those who want almost total
control.
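
To make the JNI option a bit more concrete, here is a sketch of the Java half of such a binding. Everything in it is hypothetical: the class name CgNative, the three native methods and the library name are invented for illustration and do not correspond to any existing CG-3 API. A small C++ shim would still have to implement the native methods on top of the real library.

```java
class CgNative {
    /**
     * Tries to load a native wrapper library by name.
     * Returns false instead of throwing when the library is not installed,
     * so the caller can fall back to another strategy.
     */
    static boolean tryLoad(String libName) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false;
        }
    }

    // Hypothetical native entry points (not part of CG-3 itself).
    // They only resolve after tryLoad() has succeeded.
    static native long loadGrammar(String path);
    static native String applyGrammar(long grammarHandle, String cohorts);
    static native void freeGrammar(long grammarHandle);
}
```

A plug-in could call tryLoad() once at start-up and fall back to invoking CG as an external program when it returns false.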


> 3) Develop a Java port of it. Probably the best solution but, obviously,
> the hardest one to implement...
>
Haven't really looked into that, as I consider JNI a better solution.
But it's all hash maps and hash sets, so maybe not that hard to convert.
Again, regex is a significant feature, and apparently only Java 7 and newer
get that right.
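
For reference, a small sketch of the Java 7+ side of this. java.util.regex spells the script class \p{IsKatakana} rather than ICU's \p{Katakana}, so a port would presumably have to translate such patterns. The class and method names below are made up for illustration, and the test strings are written as Unicode escapes to avoid source-encoding issues.

```java
import java.util.regex.Pattern;

class UnicodeRegexDemo {
    // \p{IsKatakana} selects the Unicode script Katakana (Java 7 and newer).
    static final Pattern KATAKANA = Pattern.compile("\\p{IsKatakana}+");

    /** Returns true if the whole string consists of Katakana-script characters. */
    static boolean isKatakana(String s) {
        return KATAKANA.matcher(s).matches();
    }
}
```

With this, isKatakana("\u30AB\u30BF\u30AB\u30CA") holds for the katakana word, while a hiragana string such as "\u3072\u3089\u304C\u306A" or plain ASCII does not match.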


> If people think that it would be useful, I could implement solution 1)
> quite easily. That would serve to make the OmegaT plug-in work with
> language pairs that depend on CG (although the user would have to install
> CG manually).
>
It's definitely possible to distribute CG-3 alongside the pairs that need
it, or even one shared CG-3 package. ICU is what adds weight to that, but
ideally the ICU extra files are only needed on Windows and Mac.

-- Tino Didriksen