Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
And, finally, the new versions of Apertium Caffeine and the OmegaT plugin are here!!! You can download Apertium Caffeine here: https://apertium.svn.sourceforge.net/svnroot/apertium/builds/apertium-caffeine/apertium-caffeine.jar and the OmegaT plugin here: https://apertium.svn.sourceforge.net/svnroot/apertium/builds/apertium-omegat/apertium-omegat.jar . You should remove your previous installation (at least the directory for the packages) before trying them. At the same time, a new version of apertium-viewer has been released, which can be launched by following this link: https://apertium.svn.sourceforge.net/svnroot/apertium/builds/apertium-viewer/launch.jnlp . If that doesn't work, run the following command in the terminal:
javaws https://apertium.svn.sourceforge.net/svnroot/apertium/builds/apertium-viewer/launch.jnlp
The binaries, as well as the language pair packages, are now kept in the new builds/ directory in SVN. A total of 20 language pairs (out of the 31 released pairs at Sourceforge) have proven to be compatible and, thus, have been released there. All these pairs are naturally supported by Apertium Caffeine, the OmegaT plugin and apertium-viewer (yes, the new apertium-viewer can work with online packages too!). When Arink releases the next version, the Android app will support them as well. At the same time, all 20 language pair packages can be launched through Java Web Start. For instance, you can follow this link: https://apertium.svn.sourceforge.net/svnroot/apertium/builds/apertium-af-nl/apertium-af-nl.jnlp for apertium-af-nl, or this one: https://apertium.svn.sourceforge.net/svnroot/apertium/builds/apertium-ca-it/apertium-ca-it.jnlp for apertium-ca-it. For the rest of the pairs, you can look at the directory structure at https://apertium.svn.sourceforge.net/svnroot/apertium/builds/ . The jnlp files in each directory are the links for Java Web Start. Their corresponding JARs can be downloaded and run as standalone programs as well.
As said before, we have 20 working language pairs out of the 31 released pairs. 7 released pairs depend on external programs that aren't part of lttoolbox-java (one depends on apertium-pn-recogniser, and the other six on the Constraint Grammar package) and, thus, are incompatible with it. This means that there are 4 pairs that should be compatible but, for some reason, are giving some kind of problem. Let's see if somebody can help me with them. These are the pairs along with the errors they are giving:
- apertium-es-ro: Document apertium-es-ro.trules-ro-es.xml does not validate against /usr/local/share/apertium/transfer.dtd
- apertium-oc-ca: Document oc-ca.t1x does not validate against /usr/local/share/apertium/transfer.dtd
- apertium-oc-es: Document oc-es.t1x does not validate against /usr/local/share/apertium/transfer.dtd
- apertium-pt-gl: java.lang.NumberFormatException: For input string: s
The validation problems happen during compilation, and compilation fails because of them. The NumberFormatException happens while trying to generate the transfer bytecode. It seems that the transfer file contains an "s" where a number is expected... So, any idea about how to solve these problems? -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Apertium-stuff mailing list Apertium-stuff@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/apertium-stuff
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
On 6 August 2012 10:24, Mikel Artetxe artet...@gmail.com wrote: apertium-es-ro: Document apertium-es-ro.trules-ro-es.xml does not validate against /usr/local/share/apertium/transfer.dtd I can't find any instance of 'trules' anywhere in that package. Are you using the current SVN version? apertium-oc-ca: Document oc-ca.t1x does not validate against /usr/local/share/apertium/transfer.dtd apertium-oc-es: Document oc-es.t1x does not validate against /usr/local/share/apertium/transfer.dtd These two involve running an xsl script (alt.xsl) on the transfer files first. apertium-pt-gl: java.lang.NumberFormatException: For input string: s The validation problems happen during compilation, and compilation fails because of it. The NumberFormatException happens while trying to generate the transfer bytecode. It seems that the transfer file contains a s where a number is expected... So, any idea about how to solve these problems? Fixed. -- Sefam Are any of the mentors around? jimregan yes, they're the ones trolling you
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
Jimmy O'Regan jore...@gmail.com writes: On 6 August 2012 10:24, Mikel Artetxe artet...@gmail.com wrote: apertium-es-ro: Document apertium-es-ro.trules-ro-es.xml does not validate against /usr/local/share/apertium/transfer.dtd I can't find any instance of 'trules' anywhere in that package. Are you using the current SVN version? It's in the release tarball (it needed https://gist.github.com/3273244 to compile here). apertium-oc-ca: Document oc-ca.t1x does not validate against /usr/local/share/apertium/transfer.dtd apertium-oc-es: Document oc-es.t1x does not validate against /usr/local/share/apertium/transfer.dtd These two involve running an xsl script (alt.xsl) on the transfer files first. … but they could all do with a bugfix release (when I packaged the releases for Arch Linux, I had to do https://gist.github.com/3273264 and https://gist.github.com/3273266 to make them compile). Who maintains the packages? -- Kevin Brubeck Unhammer GPG: 0x766AC60C
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
On Mon, Aug 6, 2012 at 11:47 AM, Jimmy O'Regan jore...@gmail.com wrote: On 6 August 2012 10:24, Mikel Artetxe artet...@gmail.com wrote: apertium-es-ro: Document apertium-es-ro.trules-ro-es.xml does not validate against /usr/local/share/apertium/transfer.dtd I can't find any instance of 'trules' anywhere in that package. Are you using the current SVN version? No, I'm using the last released version at Sourceforge. apertium-oc-ca: Document oc-ca.t1x does not validate against /usr/local/share/apertium/transfer.dtd apertium-oc-es: Document oc-es.t1x does not validate against /usr/local/share/apertium/transfer.dtd These two involve running an xsl script (alt.xsl) on the transfer files first. Oh, you told me that before but I didn't try it... Sorry! The README doesn't mention anything about it, so I assumed that it was done automatically by the build script! Anyway, I haven't figured out how to do it, so I will try with the patches pointed by Kevin. apertium-pt-gl: java.lang.NumberFormatException: For input string: s The validation problems happen during compilation, and compilation fails because of it. The NumberFormatException happens while trying to generate the transfer bytecode. It seems that the transfer file contains a s where a number is expected... So, any idea about how to solve these problems? Fixed. Thank you! I've already uploaded apertium-pt-gl to the builds/ area. Anyway, shouldn't we fix the version released at Sourceforge as well? I mean, all the pairs except pt-gl are using the last released version, so doing the same here would be the most coherent thing, I guess.
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
On 6 August 2012 12:07, Mikel Artetxe artet...@gmail.com wrote: On Mon, Aug 6, 2012 at 11:47 AM, Jimmy O'Regan jore...@gmail.com wrote: On 6 August 2012 10:24, Mikel Artetxe artet...@gmail.com wrote: apertium-es-ro: Document apertium-es-ro.trules-ro-es.xml does not validate against /usr/local/share/apertium/transfer.dtd I can't find any instance of 'trules' anywhere in that package. Are you using the current SVN version? No, I'm using the last released version at Sourceforge. apertium-oc-ca: Document oc-ca.t1x does not validate against /usr/local/share/apertium/transfer.dtd apertium-oc-es: Document oc-es.t1x does not validate against /usr/local/share/apertium/transfer.dtd These two involve running an xsl script (alt.xsl) on the transfer files first. Oh, you told me that before but I didn't try it... Sorry! The README doesn't mention anything about it, so I assumed that it was done automatically by the build script! Anyway, I haven't figured out how to do it, so I will try with the patches pointed by Kevin. apertium-pt-gl: java.lang.NumberFormatException: For input string: s The validation problems happen during compilation, and compilation fails because of it. The NumberFormatException happens while trying to generate the transfer bytecode. It seems that the transfer file contains a s where a number is expected... So, any idea about how to solve these problems? Fixed. Thank you! I've already uploaded apertium-pt-gl to the builds/ area. Anyway, shouldn't we fix the version released at Sourceforge as well? I mean, all the pairs except pt-gl are using the last released version, so doing the same here would be the most coherent thing, I guess. I'm doing that, starting with es-ro (running the tarball build as I type, in fact). I want to also fix the warnings first. -- Sefam Are any of the mentors around? 
jimregan yes, they're the ones trolling you
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
thank you for the OmegaT plugin! It works like a charm. Nice to hear that you like it! PS You wrote: 7 released pairs depend on external programs that aren't part of lttoolbox-java (one depends on apertium-pn-recogniser, and the other six on the Constraint Grammar package) and, thus, are incompatible with it. What language pairs are not compatible? The following ones: - apertium-br-fr - apertium-cy-en - apertium-es-ast - apertium-is-en - apertium-mk-bg - apertium-mk-en - apertium-nn-nb apertium-es-ast depends on apertium-pn-recogniser. The rest depend on the Constraint Grammar package.
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
thank you for the OmegaT plugin! It works like a charm. Nice to hear that you like it! PS You wrote: 7 released pairs depend on external programs that aren't part of lttoolbox-java (one depends on apertium-pn-recogniser, and the other six on the Constraint Grammar package) and, thus, are incompatible with it. What language pairs are not compatible? The following ones: - apertium-br-fr - apertium-cy-en - apertium-es-ast - apertium-is-en - apertium-mk-bg - apertium-mk-en - apertium-nn-nb apertium-es-ast depends on apertium-pn-recogniser. As far as I can tell, apertium-pn-recogniser can just be removed, or a version of the package built without it. I don't think it affects the quality of translation that much -- the coverage is quite high anyway. Fran
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
Hi, I have been considering starting work on a new language pair. I have looked at the following in turn: sv-en, sv-fr and sv-nb (Swedish - Norwegian bokmål). I reckon only the pair sv-nb (like the da-sv pair) could do without a constraint grammar. All the same, it would profit from using one. Thus, I'll have to wait for some rich professional translator sponsoring the development of an OmegaT plug-in that understands constraint grammars. :-) Yours, Per Tunedal PS The apertium-pn-recogniser would be useful if someone ever worked on a pair involving German, where nouns begin with an upper-case letter. On Mon, Aug 6, 2012, at 14:40, Mikel Artetxe wrote: thank you for the OmegaT plugin! It works like a charm. Nice to hear that you like it! PS You wrote: 7 released pairs depend on external programs that aren't part of lttoolbox-java (one depends on apertium-pn-recogniser, and the other six on the Constraint Grammar package) and, thus, are incompatible with it. What language pairs are not compatible? The following ones: * apertium-br-fr * apertium-cy-en * apertium-es-ast * apertium-is-en * apertium-mk-bg * apertium-mk-en * apertium-nn-nb apertium-es-ast depends on apertium-pn-recogniser. The rest depend on the Constraint Grammar package.
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
On 6 August 2012 14:05, Francis Tyers fty...@prompsit.com wrote: apertium-es-ast depends on apertium-pn-recogniser. As far as I can tell, apertium-pn-recogniser can just be removed, or a version of the package built without it. I don't think it effects the quality of translation that much -- the coverage is quite high anyway. I've added a 'NO_PN' version for download: apertium-es-ast_NO-PN-1.1.0.tar.gz - it includes some small fixes (which I've added to SVN), but nothing that would otherwise merit a new release. -- Sefam Are any of the mentors around? jimregan yes, they're the ones trolling you
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
Hi again! I would like to try the OmegaT plugin. Where can I find it? Is there anything I have to know to be able to use it? Which Apertium installation is used? A local one or the one at the Apertium web site? Yours, Per Tunedal PS What happened to the online language pair packages? I tried Esperanto ⇆ English today and it didn't work. Maybe you have a more mature version and more language pairs to try? On Mon, Jul 23, 2012, at 20:11, Mikel Artetxe wrote: I've updated both apertium-caffeine and the OmegaT plugin for the following (you can find them in the usual place): ---snip--
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
On Sat, Aug 4, 2012 at 7:54 PM, Mikel Artetxe artet...@gmail.com wrote: On Sat, Aug 4, 2012 at 7:25 PM, Per Tunedal per.tune...@operamail.com wrote: Hi again! I would like to try the OmegaT plugin. Where can I find it? https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-caffeine.jar Sorry. That's for apertium-caffeine. This is the one for the OmegaT plugin: https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-omegat.jar
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
I've updated both apertium-caffeine and the OmegaT plugin for the following (you can find them in the usual place): Nevertheless I think the 'display ambiguity' option should be removed from Apertium-caffeine, as end users will never use this option. Done! WRT formatters and deformatters I think it's fine to make a simple (de)formatter like the one needed for OmegaT or for HTML, if you anticipate they are needed for plugins. I've written a formatter for the OmegaT plugin that simply makes everything inside tags a superblank. I think that this is the only thing that is needed, but I'm not an OmegaT user and I'm not really sure. So, in case you find that it doesn't work as it should, please let me know. As for the HTML formatter, I haven't looked at how the C++ version works yet, but I guess that it would require more work than the OmegaT one. And, in any case, it wouldn't have any application right now, and I can't think of any possible application for the future either...
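For readers wondering what "making everything inside tags a superblank" could look like, here is a minimal sketch, not the plug-in's actual formatter. It assumes that a tag is anything between angle brackets and that a superblank is written in square brackets, as in the Apertium stream format; the class and method names are hypothetical, and escaping of reserved characters is deliberately left out.

```java
/** Hypothetical helper, not the real OmegaT plug-in formatter. */
final class TagSuperblanks {
    // Assumption: a tag is any run of characters enclosed in < and >.
    private static final String TAG = "<[^<>]*>";

    /** Wraps each tag in [ ] so that the pipeline treats it as a superblank. */
    static String deformat(String segment) {
        // $0 is the whole match; [ and ] are literal in a replacement string.
        return segment.replaceAll(TAG, "[$0]");
    }

    /** Removes the brackets added by deformat, restoring the original tags. */
    static String reformat(String translated) {
        return translated.replaceAll("\\[(" + TAG + ")\\]", "$1");
    }

    public static void main(String[] args) {
        String wrapped = deformat("Ez dakit <zer arraio> idatzi");
        System.out.println(wrapped);
        System.out.println(reformat(wrapped));
    }
}
```

A real formatter would also have to escape characters that are reserved in the stream format, which this sketch does not attempt.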
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
Hi there, Sorry for not being able to react before. Below you find 3 separate subjects. == I've looked at Jimmy's changes and I decided to deploy a slightly different solution: I simply check if sf has length 0. The input ^=$^.sent$[] can now be handled and seems to work:
$ echo I will see | java -jar lttoolbox-java/dist/lttoolbox.jar apertium -a -d /usr/local/share/apertium/ en-es
=Yo = =ver * *
$ echo will | java -jar lttoolbox-java/dist/lttoolbox.jar apertium -a -d /usr/local/share/apertium/ en-es
=
I've committed in Mikel's branch (as this is the branch which will have future development and it already differs quite a lot from trunk) http://apertium.svn.sourceforge.net/viewvc/apertium?view=revision&revision=39580 To Mikel, Jimmy and others interested in lttoolbox-java: When I develop a new feature I usually make a small 'Playground' test program, and run it like
$ java -cp lttoolbox-java/dist/lttoolbox.jar Playground
This is a much faster way to see if you've nailed a bug, because you can comfortably debug in your favorite tool (for me it's Netbeans), inspect variables and see what's happening. I usually set my breakpoints, right-click on Playground and debug it. Nevertheless I think the 'display ambiguity' option should be removed from Apertium-caffeine, as end users will never use this option. And developers hardly use it. === WRT formatters and deformatters I think it's fine to make a simple (de)formatter like the one needed for OmegaT or for HTML, if you anticipate they are needed for plugins. More advanced (de)formatters are for the C++ version, which has a sophisticated (some would say complicated :-) way of (de)formatting which I *don't* recommend you to look into. But you could play with the C++ version to get a feel for it. For example:
$ echo "I am <b>fine</b> and all. :-)" | apertium-deshtml
I am[ <b>]fine[<\/b> ]and all. :-).[][ ]
$ echo "I am <b>fine</b> and all. :-)" | apertium-deshtml | apertium-rehtml
I am <b>fine</b> and all. :-)
Stephen Tigener worked with the text formatter. Probably, if he has time, he could quickly put something together. Jacob -- Jacob Nordfalk http://profiles.google.com/jacob.nordfalk javabog.dk Android developer and instructor at IHK http://cv.ihk.dk/diplomuddannelser/itd/vf/MAU and LundBendsen https://www.lundogbendsen.dk/undervisning/beskrivelse/LB1809/
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
Mikel Forcada m...@dlsi.ua.es writes: [...] There is one thing that could be easily solved. Víctor Sánchez (cc-ed) maybe can help you. When one uses the Apertium webservice from inside OmegaT, we avoid translating the tags (u0, etc.). Some minor changes were made to the code that calls Apertium as a webservice (you'll easily find them, but if not, I can help) and some changes were made in the webservice itself (Víctor can help here). I think it is a matter of using some regular expressions to hide these in some way... I guess that you are talking about this. I might be blind, but I haven't been able to identify the relevant piece of code there... You're right. Most of the work is done at the Apertium server when it receives format=omegat. Perhaps you can just use the "translate me <apertium-notrans>don't translate me</apertium-notrans>" method; this works in e.g. the html and html-noent formats (grep tells me it should also be supported in odt, pptx, xlsx, wxml). [...] Yes. We should probably create a new directory in SVN and start creating and uploading packages for every language pair. The question is how to maintain it in the long term: we could integrate my script in the makefiles of each language pair to make things easier (although the dependency on Android-SDK and lttoolbox-java can still be a problem for some people), but we would still need the involvement of every language pair developer in Apertium (or someone responsible for taking care of the whole maintenance). This deserves deeper thought. Any ideas? I liked the idea of just adding a make goal, though perhaps the script could be installed by lttoolbox-java (since that's a dependency of the script anyway), so that copies wouldn't be required by every language pair? -- Kevin Brubeck Unhammer GPG: 0x766AC60C
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
I guess that you are talking about this. I might be blind, but I haven't been able to identify the relevant piece of code there... You're right. Most of the work is done at the Apertium server when it receives format=omegat. Perhaps you can just use the "translate me <apertium-notrans>don't translate me</apertium-notrans>" method; this works in e.g. the html and html-noent formats (grep tells me it should also be supported in odt, pptx, xlsx, wxml). I guess that would work in C++ Apertium, but lttoolbox-java can only format/deformat plain text... We will probably have to write a new formatter for lttoolbox-java if we want to avoid translating OmegaT tags. Yes. We should probably create a new directory in SVN and start creating and uploading packages for every language pair. The question is how to maintain it in the long term: we could integrate my script in the makefiles of each language pair to make things easier (although the dependency on Android-SDK and lttoolbox-java can still be a problem for some people), but we would still need the involvement of every language pair developer in Apertium (or someone responsible for taking care of the whole maintenance). This deserves deeper thought. Any ideas? I liked the idea of just adding a make goal, though perhaps the script could be installed by lttoolbox-java (since that's a dependency of the script anyway), so that copies wouldn't be required by every language pair? I like that idea. I think that we should also consider installing dx.jar together with it. It takes about 800 KB, and it is part of the Android-SDK. This way, we would solve the dependency on the Android-SDK as well, but I'm not sure if its license allows doing it (I guess so, but we will probably have to keep a copyright notice or so)...
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
Cool. Works like a charm. A minor quibble: I said I wanted it installed in /tmp and it decided that every file in there was an already installed language pair. Perhaps a regex identifying language pairs by name would be very helpful. I now look at the file extension to filter JAR files. I think that it should be enough. It would also be possible to look at the whole filename, but this would make manually installed pairs unusable unless they are properly named... The standard style is apertium-[a-z][a-z][a-z]?-[a-z][a-z][a-z]?.jar I think it would be no problem for your program to require that. Yes, but the JARs online follow a different naming convention so that we can know the exact modes they contain without downloading them (that is, simply by looking at their name). For instance, en-eo,eo-en.jar is used instead of apertium-eo-en.jar, expressing that the contained modes are en-eo.mode and eo-en.mode (so, in this case, the program could conclude that the language pair is bidirectional even if it hasn't downloaded it yet). In any case, I've updated both programs so that they look for the following pattern, accepting both naming conventions: return name.matches("([a-z_]+-[a-z_]+(,[a-z_]+-[a-z_]+)*).jar|(apertium-[a-z_]+-[a-z_]+).jar"); The code that carries out the translation in the plug-in can be found here: http://apertium.svn.sourceforge.net/viewvc/apertium/branches/gsoc2012/artetxem/apertium-omegat/src/org/omegat/plugin/machinetranslators/ApertiumTranslate.java?revision=39499&view=markup from line 56 to 66. The function is really simple, so it should be easy to apply the necessary changes there (if we know what these necessary changes are!). I'm sure Víctor can tell you. It's a matter of catching and escaping the tags generated by OmegaT so that their name does not get translated. So, basically, what we need is to not translate anything between tags, right? For instance, if we have something like Ez dakit zer arraio idatzi, we should avoid translating zer arraio... is that all we need? If so, it shouldn't be too hard to achieve. It is important to note that these settings as well as the installed language pairs are shared with Apertium Caffeine, which means that, if you install, uninstall or update a language pair, the changes will be reflected in both programs. This is a design decision that I took, but it would be simple to make them independent if you prefer it. Oops I hadn't read that. I think it would be nice to have separate options, yes. Now they are completely independent. But this could be problematic in some special cases: if the same directory is chosen for both programs, they would conflict and we would probably get strange behaviour... You could, perhaps, add some kind of file that says which (OmegaT or Caffeine) is using the directory, and announce the conflict when the other one tries to use it too. Done! The (already uploaded) new versions do that check.
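The directory check mentioned at the end of this message could be done with a small marker file. The following is a rough sketch under my own assumptions, not the code that was actually uploaded: the marker file name, the class name and the application identifiers are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a "who owns this package directory" check. */
final class PackageDirOwner {
    // Assumption: the marker name is made up for this example.
    private static final String MARKER = ".apertium-owner";

    /**
     * Claims dir for the given application. Returns true if the directory
     * was unclaimed or already belongs to app, false if another application
     * has claimed it (the caller would then announce the conflict).
     */
    static boolean claim(Path dir, String app) {
        Path marker = dir.resolve(MARKER);
        try {
            if (Files.exists(marker)) {
                String owner = new String(Files.readAllBytes(marker), StandardCharsets.UTF_8).trim();
                return owner.equals(app);
            }
            Files.write(marker, app.getBytes(StandardCharsets.UTF_8));
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this approach the first program to use a directory records its name there, and the other one can detect the clash before touching any package.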
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
Mikel [et al]:

Yes, but the JARs online follow a different naming convention so that we can know the exact modes they contain without downloading them (that is, simply looking at their name). For instance, en-eo,eo-en.jar is used instead of apertium-eo-en.jar, expressing that the contained modes are en-eo.mode and eo-en.mode (so, in this case, the program could conclude that the language pair is bidirectional even if it hasn't downloaded it yet).

Cool.

In any case, I've updated both programs so that they look for the following pattern, accepting both naming conventions:

return name.matches("([a-z_]+-[a-z_]+(,[a-z_]+-[a-z_]+)*)\\.jar|(apertium-[a-z_]+-[a-z_]+)\\.jar");

Great stuff! After all, it's up to you to decide the naming scheme for Caffeine, so if this is the naming style, go ahead with it.

So, basically, what we need is to not translate anything between tags, right? For instance, if we have something like Ez dakit zer arraio idatzi, we should avoid translating zer arraio... is that all we need? If so, it shouldn't be too hard to achieve.

Yeah, turning all of that into a superblank and letting Apertium deal with it. Would that be feasible?

You could, perhaps, add some kind of file that says which (OmegaT or Caffeine) is using the directory, and announce the conflict when the other one tries to use it too.

Done! The (already uploaded) new versions do that check.

Thanks a lot for such a quick turnaround! I look forward to the Apertium filtering. You'll have about 100 users for the OmegaT plugin in our Translation Technologies course at Universitat d'Alacant. That will ensure a lot of feedback!

Is there a webpage or a wiki page that explains how to export language pairs for use with Apertium Caffeine and the OmegaT plugin?

Cheers
Mikel
--
Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
Phone: +34 96 590 9776
Fax: +34 96 590 9326
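For reference, here is a small self-contained sketch of the file name check discussed in this message, with the pattern precompiled and the dots before "jar" escaped. The class and method names are made up for illustration; only the pattern itself comes from the thread.

```java
import java.util.regex.Pattern;

public class PackageNames {
    // Accepts either a comma-separated mode list ("en-eo,eo-en.jar")
    // or the classic package name ("apertium-eo-en.jar").
    private static final Pattern PACKAGE = Pattern.compile(
        "([a-z_]+-[a-z_]+(,[a-z_]+-[a-z_]+)*)\\.jar|(apertium-[a-z_]+-[a-z_]+)\\.jar");

    public static boolean isLanguagePairPackage(String name) {
        return PACKAGE.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] names = {"en-eo,eo-en.jar", "apertium-eo-en.jar", "notes.txt", "apertium-caffeine.jar"};
        for (String n : names) {
            System.out.println(n + " -> " + isLanguagePairPackage(n));
        }
    }
}
```

Note that the first alternative accepts any name of the form word-word.jar, so a file such as apertium-caffeine.jar in the same directory would also pass. Restricting the language codes to two or three letters would be one way to tighten it, at the cost of rejecting codes that contain an underscore.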
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
So, basically, what we need is to not translate anything between tags, right? For instance, if we have something like Ez dakit zer arraio idatzi, we should avoid translating zer arraio... is that all we need? If so, it shouldn't be too hard to achieve.

Yeah, turning all of that into a superblank and letting Apertium deal with it. Would that be feasible?

If I'm not wrong, it would require writing a new formatter (same as TextFormatter http://apertium.svn.sourceforge.net/viewvc/apertium/branches/gsoc2012/artetxem/lttoolbox-java/src/org/apertium/formatter/TextFormatter.java?revision=38326&view=markup, but turning those tags into superblanks as you say) and integrating it with the rest of lttoolbox-java, including the Translator API class, so that we can call the proper function from the plug-in and let it do all the work. So it wouldn't be trivial, but it is definitely feasible.

Thanks a lot for such a quick turnaround! I look forward to the Apertium filtering. You'll have about 100 users for the OmegaT plugin in our Translation Technologies course at Universitat d'Alacant. That will ensure a lot of feedback!

That sounds really good!

Is there a webpage or a wiki page that explains how to export language pairs for use with Apertium Caffeine and the OmegaT plugin?

Not really. But, just in case, I would like to remark that the JARs used by Apertium Caffeine, the OmegaT plug-in, the new apertium-viewer, the Android app that Arink is developing, as well as the ones launched through Java Web Start, are actually exactly the same files. They aim to be something like universal packages that work with many different programs on any OS. They can even be opened with any Zip program, so that the proper files can be extracted and run with a local installation of C++ Apertium. This is why I wanted to discuss the best way of creating and, in particular, maintaining them in SVN.
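On the tag question at the top of this message, the idea of turning tags into superblanks can be sketched as a simple wrap and unwrap step. This is not the formatter described above, only an illustration under assumptions: that the tags look like u0-style inline tags written as `<u0>`, `</u0>` or `<br1/>`, and that the wrapped text reaches the pipeline without the standard text deformatter escaping the square brackets again. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class TagProtector {
    // Assumed tag shape: letters followed by digits, optionally opening, closing or self-closing.
    private static final Pattern TAG = Pattern.compile("</?[a-zA-Z]+[0-9]+/?>");
    private static final Pattern WRAPPED = Pattern.compile("\\[(</?[a-zA-Z]+[0-9]+/?>)\\]");

    // Wraps every tag in square brackets so that it travels as a superblank.
    public static String protect(String text) {
        return TAG.matcher(text).replaceAll("[$0]");
    }

    // Removes the brackets again after translation.
    public static String restore(String text) {
        return WRAPPED.matcher(text).replaceAll("$1");
    }

    public static void main(String[] args) {
        String source = "Ez dakit <u0>zer arraio</u0> idatzi";
        String wrapped = protect(source);
        System.out.println(wrapped);
        System.out.println(restore(wrapped));
    }
}
```

A real formatter would also have to escape any literal square brackets in the source text, which this sketch ignores.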
As for your question, the same instructions that I explained and were under discussion in that other thread I started would apply in this case as well. Quoting myself: The solution that I have been (and I'm still) working on comes in form of two bash scripts, each one to carry out one of the tasks (you can find them herehttps://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/in my branch): 1) apertium-pack-j offers an easy way to generate the packages. It requires to have lttoolbox-java (the one in my branch, not the one in trunk) and android-sdk installed, and their location must be specified by setting the LTTOOLBOX_JAVA_PATH and ANDROID_SDK_PATH environment variables. After that, you can simply run it passing the path to the mode files for which you want to generate the package as argument, and a ready-to-use package would be created by the script. For instance, the following command would create a ready-to-use package for the Esperanto-English language pair named apertium-eo-en.jar in my machine: LTTOOLBOX_JAVA_PATH=/usr/local/share/apertium/lttoolbox.jar ANDROID_SDK_PATH=/home/mikel/developer/android-sdk-linux ./apertium-pack-j /usr/local/share/apertium/modes/eo-en.mode /usr/local/share/apertium/modes/en-eo.mode As you can see, I simply specify the correct location of lttoolbox-java and android-sdk in my machine, and pass the location of eo-en.mode and en-eo.mode (the main modes that correspond to the Esperanto-English language pair) as argument to apertium-pack-j. As I said, the dependency of the Android-SDK can be eliminated if we install dx.jar together with lttoolbox-java. It still crashes if I click on the ambiguity option... $ java -jar /tmp/apertium-caffeine.jar java.lang.StringIndexOutOfBoundsException: String index out of range: 0 at java.lang. 
AbstractStringBuilder.charAt(AbstractStringBuilder.java:191) at java.lang.StringBuilder.charAt(StringBuilder.java:72) at org.apertium.lttoolbox.process.FSTProcessor.generation(FSTProcessor.java:1372) at org.apertium.lttoolbox.LTProc.doMain(LTProc.java:240) at org.apertium.pipeline.Dispatcher.doLTProc(Dispatcher.java:259) at org.apertium.pipeline.Dispatcher.dispatch(Dispatcher.java:333) at org.apertium.Translator.translate(Translator.java:305) at org.apertium.caffeine.ApertiumCaffeine$12.run(ApertiumCaffeine.java:275) at java.lang.Thread.run(Thread.java:679) Perhaps it would be a good idea to leave this option out for stability... Jimmy identified the problem yesterday but it hasn't been solved yet: Sorry, this is a different error: char ch0 = sf.charAt(0); This also needs a null check, but javac doesn't like char ch0 = (sf != null) ? sf.charAt(0) : ''; and char ch0 = (sf != null) ? sf.charAt(0) : (char) Character.UNASSIGNED; isn't the same thing, though it doesn't look like it would matter in this case. One for Jacob, I think. I've looked at the code and I agree with Jimmy that char ch0 = (sf != null) ? sf.charAt(0) : (char) Character.UNASSIGNED;
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
On 19 July 2012 18:12, Mikel Artetxe artet...@gmail.com wrote: char ch0 = (sf != null) ? sf.charAt(0) : (char) Character.UNASSIGNED; or even something like char ch0 = (sf != null) ? sf.charAt(0) : '\0'; Character.UNASSIGNED is 0, so this is the same thing. I'd lean towards 'Character.UNASSIGNED' because, though it's longer, it's self-documenting. -- Sefam Are any of the mentors around? jimregan yes, they're the ones trolling you
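A minimal sketch of the guard under discussion follows. The stack traces in this thread show the exception coming from StringBuilder.charAt with index 0, which suggests that sf is empty rather than null, so the sketch checks the length as well. The helper name and class are made up for illustration and are not part of lttoolbox-java.

```java
public class FirstChar {
    // Returns the first character of sf, or '\0' when sf is null or empty.
    public static char firstCharOrNul(CharSequence sf) {
        return (sf != null && sf.length() > 0) ? sf.charAt(0) : '\0';
    }

    public static void main(String[] args) {
        // Empty builder: a plain charAt(0) would throw StringIndexOutOfBoundsException here.
        System.out.println((int) firstCharOrNul(new StringBuilder()));
        System.out.println(firstCharOrNul(new StringBuilder("will")));
        // Character.UNASSIGNED is a byte constant with value 0, so the two spellings agree.
        System.out.println((char) Character.UNASSIGNED == '\0');
    }
}
```

Whether '\0' is a sensible fallback depends on how ch0 is used afterwards in FSTProcessor.generation, which this sketch does not attempt to cover.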
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
Hi there, in three letters: wow! A more detailed assessment follows. _* Apertium Caffeine*_ Apertium Caffeine is a small, user-oriented Apertium client, similar in concept to apertium-tolk, but which has some great advantages over it: * It doesn't depend on anything external and is written in Java. This means that it is completely platform-independent (it can work on Linux, OS X as well as Windows) and its only requirement is a Java VM (i.e. you don't need a separate installation of Apertium or its language pairs). Since Java uses UTF-16 internally, we shouldn't be having any encoding problem neither. * It manages language pairs within the app. In other words, you can install, uninstall and even update language pairs from the app itself in a simple, user-friendly way. Cool. Works like a charm. A minor quibble: I said I wanted it installed in /tmp and it decided that every file in there was an already installed language pair. Perhaps a regex identifying language pairs by name would be very helpful. * Some other features for a better user experience that you will find as you use the program: highlighting of unknown and ambiguous words, full language names... Haven't checked these yet. The source code can be found here https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-caffeine/, but you can also download and test the ready-to-use JAR here https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-caffeine.jar. _*Apertium plug-in for OmegaT*_ This is something that some of you suggested in the other thread and I have implemented as a proof of concept of how easy can lttoolbox-java be integrated in bigger Java projects. It shares most of its code with Apertium Caffeine, including the ability to manage language pairs within the app and, of course, it works offline. 
The source code can be found here https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-omegat/, and the ready-to-use JAR here https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-omegat.jar. If you want to try it, simply copy the JAR to the plugins directory of your OmegaT installation. The next time that you launch OmegaT, you will see a new option at Options - Machine Translate called Apertium (offline), which has to be checked to activate the plug-in. If you want to configure the plug-in or manage language pairs, go to Options - Apertium settings. The menu appeared but this one crashed on me: I think the problem is that I had a previous installation somewhere else (they use the same .java/.userPrefs/org/apertium/). I remove these. Then on launching OmegaT I get a dialog that may be confusing for some, as it does not identify itself as an Apertium warning. It should... It kindly asks me to decide where to install itself, and where to install language pairs. Then I open a project, mark Apertium offline in the Machine Translation options, and it crashes. 
java.lang.IndexOutOfBoundsException: Index: 13, Size: 7 at java.util.ArrayList.rangeCheck(ArrayList.java:571) at java.util.ArrayList.remove(ArrayList.java:412) at org.apertium.lttoolbox.process.State.nodeStatePool_get(State.java:101) at org.apertium.lttoolbox.process.State.copy(State.java:145) at org.apertium.lttoolbox.process.State.copy(State.java:158) at org.apertium.lttoolbox.process.FSTProcessor.analysis(FSTProcessor.java:857) at org.apertium.lttoolbox.LTProc.doMain(LTProc.java:287) at org.apertium.pipeline.Dispatcher.doLTProc(Dispatcher.java:259) at org.apertium.pipeline.Dispatcher.dispatch(Dispatcher.java:333) at org.apertium.Translator.translate(Translator.java:305) at org.omegat.plugin.machinetranslators.ApertiumTranslate.translate(ApertiumTranslate.java:62) at org.omegat.core.machinetranslators.BaseTranslate.getTranslation(BaseTranslate.java:64) at org.omegat.gui.exttrans.MachineTranslateTextArea$FindThread.search(MachineTranslateTextArea.java:128) at org.omegat.gui.exttrans.MachineTranslateTextArea$FindThread.search(MachineTranslateTextArea.java:103) at org.omegat.gui.common.EntryInfoSearchThread.run(EntryInfoSearchThread.java:95) I try again and it works. Very nice. There is one thing that could be easily solved. Víctor Sánchez (cc-ed) maybe can help you. When one uses the Apertium webservice from inside OmegaT, we avoid translating the tags (u0, etc.). Some minor changes were made to the code that calls Apertium as a webservice (you'll easily find them, but if not, I can help) and some changes were made in the webservice itself (Víctor can help here). I think it is a matter of using some regular expressions to hide these in some way... You should consider sending a message to the OmegaT list. I can help here, as I am subscribed. I think this is very, very important! And
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
Hi Mikel

On Wednesday 18 July 2012 Mikel Artetxe said

Apertium Caffeine is a small, user-oriented Apertium client, similar in concept to apertium-tolk, but which has some great advantages over it:

Looks good, but ticking Mark ambiguity causes it to crash. :-(

Ubuntu 10.04
Java(TM) SE Runtime Environment (build 1.6.0_22-b04)

java.lang.StringIndexOutOfBoundsException: String index out of range: 0
at java.lang.AbstractStringBuilder.charAt(AbstractStringBuilder.java:174)
at java.lang.StringBuilder.charAt(StringBuilder.java:55)
at org.apertium.lttoolbox.process.FSTProcessor.generation(FSTProcessor.java:1372)
at org.apertium.lttoolbox.LTProc.doMain(LTProc.java:240)
at org.apertium.pipeline.Dispatcher.doLTProc(Dispatcher.java:259)
at org.apertium.pipeline.Dispatcher.dispatch(Dispatcher.java:333)
at org.apertium.Translator.translate(Translator.java:305)
at org.apertium.caffeine.ApertiumCaffeine$11.run(ApertiumCaffeine.java:242)
at java.lang.Thread.run(Thread.java:662)

Using the following test phrase: I want to head off to the beach now. When will we go next?

--
Pob hwyl / Best wishes
Kevin Donnelly
kevindonnelly.org.uk
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
Looks good, but ticking Mark ambiguity causes it to crash. :-( Using the following test phrase: I want to head off to the beach now. When will we go next? Thank you for reporting it. I've discovered that it is lttoolbox-java who crashes and not Apertium Caffeine itself. The problematic word is will in the es-en language pair when passing the -a flag. In other words, the following causes exactly the same crash: echo will | java -jar lttoolbox.jar apertium -a -d /usr/local/share/apertium/ en-es Unfortunately, org.apertium.lttoolbox.process.FSTProcessor, which causes the crash, is super complex (more than 2000 lines of code!), and my work so far hasn't been related to it... Perhaps somebody that worked on it can help us? Also, note that, although it doesn't crash, C++ Apertium gives a strange output for the same input: [mikel@fedora ~]$ echo will | apertium -a en-es =# so perhaps the problem is in the en-es language pair... Any idea?
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
On 18 July 2012 13:46, Mikel Artetxe artet...@gmail.com wrote: Looks good, but ticking Mark ambiguity causes it to crash. :-( Using the following test phrase: I want to head off to the beach now. When will we go next? Thank you for reporting it. I've discovered that it is lttoolbox-java who crashes and not Apertium Caffeine itself. The problematic word is will in the es-en language pair when passing the -a flag. In other words, the following causes exactly the same crash: echo will | java -jar lttoolbox.jar apertium -a -d /usr/local/share/apertium/ en-es Unfortunately, org.apertium.lttoolbox.process.FSTProcessor, which causes the crash, is super complex (more than 2000 lines of code!), and my work so far hasn't been related to it... Perhaps somebody that worked on it can help us? :D I think this goes to show how little the '-a' option is used, because neither implementation of generation in lttoolbox handles it. The word 'will' disappears because the tagger is picking the auxiliary form, which has a null translation in the bidix, but the generation error appears with other words: $ echo can |apertium -a en-es =#poder $ echo can |apertium en-es Puede The solution for you is simply to remove the option - it does nothing that's useful for an end user. -- Sefam Are any of the mentors around? jimregan yes, they're the ones trolling you
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
On 18 July 2012 15:22, Jimmy O'Regan jore...@gmail.com wrote: The word 'will' disappears because the tagger is picking the auxiliary form, which has a null translation in the bidix, but the generation error appears with other words: $ echo can |apertium -a en-es =#poder $ echo can |apertium en-es Puede I was only half right - this is a different issue. I committed 39497 to lttoolbox-java in trunk[1] to check for an empty translation, but this is only a partial fix, as it leads to ghost = signs in the text where this has happened. (To do otherwise would require a bit more effort, and I'm not entirely convinced it's worth it). [1] 39496 for lttoolbox (C++) -- Sefam Are any of the mentors around? jimregan yes, they're the ones trolling you
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
On 18 July 2012 16:13, Jimmy O'Regan jore...@gmail.com wrote: On 18 July 2012 15:22, Jimmy O'Regan jore...@gmail.com wrote: The word 'will' disappears because the tagger is picking the auxiliary form, which has a null translation in the bidix, but the generation error appears with other words: $ echo can |apertium -a en-es =#poder $ echo can |apertium en-es Puede I was only half right - this is a different issue. The different issue, for the record, is that an extra optional transduction needs to be inserted when preprocessing the transfer files (to make '^=' at the start of a token equivalent to '^'). -- Sefam Are any of the mentors around? jimregan yes, they're the ones trolling you
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
Thank you for your feedback Mikel! I've fixed some of the issues that you have found (the new version is already uploaded): *Apertium Caffeine* Apertium Caffeine is a small, user-oriented Apertium client, similar in concept to apertium-tolk, but which has some great advantages over it: - It doesn't depend on anything external and is written in Java. This means that it is completely platform-independent (it can work on Linux, OS X as well as Windows) and its only requirement is a Java VM (i.e. you don't need a separate installation of Apertium or its language pairs). Since Java uses UTF-16 internally, we shouldn't be having any encoding problem neither. - It manages language pairs within the app. In other words, you can install, uninstall and even update language pairs from the app itself in a simple, user-friendly way. Cool. Works like a charm. A minor quibble: I said I wanted it installed in /tmp and it decided that every file in there was an already installed language pair. Perhaps a regex identifying language pairs by name would be very helpful. I now look at the file extension to filter JAR files. I think that it should be enough. It would also be possible to look at the whole filename, but this would make manually installed pairs unusable unless they are properly named... *Apertium plug-in for OmegaT* This is something that some of you suggested in the other thread and I have implemented as a proof of concept of how easy can lttoolbox-java be integrated in bigger Java projects. It shares most of its code with Apertium Caffeine, including the ability to manage language pairs within the app and, of course, it works offline. The source code can be found herehttps://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-omegat/, and the ready-to-use JAR herehttps://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-omegat.jar. 
If you want to try it, simply copy the JAR to the plugins directory of your OmegaT installation. The next time that you launch OmegaT, you will see a new option at Options - Machine Translate called Apertium (offline), which has to be checked to activate the plug-in. If you want to configure the plug-in or manage language pairs, go to Options - Apertium settings. The menu appearde but this one crashed on me: I think the problem is that I had a previous installation somewhere else (they use the same .java/.userPrefs/org/apertium/ . I remove these. They now use independent preferences, so we shouldn't have that problem anymore. Then on launching OmegaT I get a dialog that may be confusing for some, as it does not identify itself as an Apertium warning. It should... Now it does! It kindly asks me to decide where to install itself, and where to install language pairs. Then I open a project, mark Apertium offline in the Machine Translation options, and it crashes. java.lang.IndexOutOfBoundsException: Index: 13, Size: 7 at java.util.ArrayList.rangeCheck(ArrayList.java:571) at java.util.ArrayList.remove(ArrayList.java:412) at org.apertium.lttoolbox.process.State.nodeStatePool_get(State.java:101) at org.apertium.lttoolbox.process.State.copy(State.java:145) at org.apertium.lttoolbox.process.State.copy(State.java:158) at org.apertium.lttoolbox.process.FSTProcessor.analysis(FSTProcessor.java:857) at org.apertium.lttoolbox.LTProc.doMain(LTProc.java:287) at org.apertium.pipeline.Dispatcher.doLTProc(Dispatcher.java:259) at org.apertium.pipeline.Dispatcher.dispatch(Dispatcher.java:333) at org.apertium.Translator.translate(Translator.java:305) at org.omegat.plugin.machinetranslators.ApertiumTranslate.translate(ApertiumTranslate.java:62) at org.omegat.core.machinetranslators.BaseTranslate.getTranslation(BaseTranslate.java:64) at org.omegat.gui.exttrans.MachineTranslateTextArea$FindThread.search(MachineTranslateTextArea.java:128) at 
org.omegat.gui.exttrans.MachineTranslateTextArea$FindThread.search(MachineTranslateTextArea.java:103) at org.omegat.gui.common.EntryInfoSearchThread.run(EntryInfoSearchThread.java:95) I'm not sure if I have correctly identified the problem, but it is probably fixed now. There is one thing that could be easily solved. Víctor Sánchez (cc-ed) maybe can help you. When one uses the Apertium webservice from inside OmegaT, we avoid translating the tags (u0, etc.). Some minor changes were made to the code that calls Apertium as a webservice (you'll easily find them, but if not, I can help) and some changes were made in the webservice itself (Víctor can help here). I think it is a matter of using some regular expressions to hide these in some way... I guess that you are talking about thishttp://omegat.svn.sourceforge.net/viewvc/omegat/trunk/src/org/omegat/core/machinetranslators/ApertiumTranslate.java?revision=4434view=markup. I might be blind, but
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
On 18 July 2012 18:28, Mikel Artetxe artet...@gmail.com wrote: I was only half right - this is a different issue. I committed 39497 to lttoolbox-java in trunk[1] to check for an empty translation, but this is only a partial fix, as it leads to ghost = signs in the text where this has happened. (To do otherwise would require a bit more effort, and I'm not entirely convinced it's worth it). I haven't really understood the technical details, but lttoolbox-java still crashes... Look: [mikel@fedora dist]$ echo will | java -jar lttoolbox.jar apertium -a -d /usr/local/share/apertium/ en-es java.lang.StringIndexOutOfBoundsException: String index out of range: 0 at java.lang.AbstractStringBuilder.charAt(AbstractStringBuilder.java:174) at java.lang.StringBuilder.charAt(StringBuilder.java:55) at org.apertium.lttoolbox.process.FSTProcessor.generation(FSTProcessor.java:1323) Sorry, this is a different error: char ch0 = sf.charAt(0); This also needs a null check, but javac doesn't like char ch0 = (sf != null) ? sf.charAt(0) : ''; and char ch0 = (sf != null) ? sf.charAt(0) : (char) Character.UNASSIGNED; isn't the same thing, though it doesn't look like it would matter in this case. One for Jacob, I think. -- Sefam Are any of the mentors around? jimregan yes, they're the ones trolling you
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
On 18 July 2012 19:25, Jimmy O'Regan jore...@gmail.com wrote: On 18 July 2012 18:28, Mikel Artetxe artet...@gmail.com wrote: I was only half right - this is a different issue. I committed 39497 to lttoolbox-java in trunk[1] to check for an empty translation, but this is only a partial fix, as it leads to ghost = signs in the text where this has happened. (To do otherwise would require a bit more effort, and I'm not entirely convinced it's worth it). I haven't really understood the technical details, but lttoolbox-java still crashes... Look: [mikel@fedora dist]$ echo will | java -jar lttoolbox.jar apertium -a -d /usr/local/share/apertium/ en-es java.lang.StringIndexOutOfBoundsException: String index out of range: 0 at java.lang.AbstractStringBuilder.charAt(AbstractStringBuilder.java:174) at java.lang.StringBuilder.charAt(StringBuilder.java:55) at org.apertium.lttoolbox.process.FSTProcessor.generation(FSTProcessor.java:1323) Sorry, this is a different error: char ch0 = sf.charAt(0); This also needs a null check, but javac doesn't like char ch0 = (sf != null) ? sf.charAt(0) : ''; and char ch0 = (sf != null) ? sf.charAt(0) : (char) Character.UNASSIGNED; isn't the same thing, though it doesn't look like it would matter in this case. One for Jacob, I think. I've attached the diff, it might make it more clear. -- Sefam Are any of the mentors around? jimregan yes, they're the ones trolling you
Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT
Hi Mikel: Thank you for your feedback Mikel! I've fixed some of the issues that you have found (the new version is already uploaded): Cool! Cool. Works like a charm. A minor quibble: I said I wanted it installed in /tmp and it decided that every file in there was an already installed language pair. Perhaps a regex identifying language pairs by name would be very helpful. I now look at the file extension to filter JAR files. I think that it should be enough. It would also be possible to look at the whole filename, but this would make manually installed pairs unusable unless they are properly named... The standard style is apertium-[a-z][a-z][a-z]?-[a-z][a-z][a-z]?.jar I think it would be no problem for your program to require that. _*Apertium plug-in for OmegaT*_ The menu appearde but this one crashed on me: I think the problem is that I had a previous installation somewhere else (they use the same .java/.userPrefs/org/apertium/ . I remove these. They now use independent preferences, so we shouldn't have that problem anymore. Then on launching OmegaT I get a dialog that may be confusing for some, as it does not identify itself as an Apertium warning. It should... Now it does! I'll take a look. Thanks a lot! It kindly asks me to decide where to install itself, and where to install language pairs. Then I open a project, mark Apertium offline in the Machine Translation options, and it crashes. 
java.lang.IndexOutOfBoundsException: Index: 13, Size: 7 at java.util.ArrayList.rangeCheck(ArrayList.java:571) at java.util.ArrayList.remove(ArrayList.java:412) at org.apertium.lttoolbox.process.State.nodeStatePool_get(State.java:101) at org.apertium.lttoolbox.process.State.copy(State.java:145) at org.apertium.lttoolbox.process.State.copy(State.java:158) at org.apertium.lttoolbox.process.FSTProcessor.analysis(FSTProcessor.java:857) at org.apertium.lttoolbox.LTProc.doMain(LTProc.java:287) at org.apertium.pipeline.Dispatcher.doLTProc(Dispatcher.java:259) at org.apertium.pipeline.Dispatcher.dispatch(Dispatcher.java:333) at org.apertium.Translator.translate(Translator.java:305) at org.omegat.plugin.machinetranslators.ApertiumTranslate.translate(ApertiumTranslate.java:62) at org.omegat.core.machinetranslators.BaseTranslate.getTranslation(BaseTranslate.java:64) at org.omegat.gui.exttrans.MachineTranslateTextArea$FindThread.search(MachineTranslateTextArea.java:128) at org.omegat.gui.exttrans.MachineTranslateTextArea$FindThread.search(MachineTranslateTextArea.java:103) at org.omegat.gui.common.EntryInfoSearchThread.run(EntryInfoSearchThread.java:95) I'm not sure if I have correctly identified the problem, but it is probably fixed now. I will test it again, to see what happens. There is one thing that could be easily solved. Víctor Sánchez (cc-ed) maybe can help you. When one uses the Apertium webservice from inside OmegaT, we avoid translating the tags (u0, etc.). Some minor changes were made to the code that calls Apertium as a webservice (you'll easily find them, but if not, I can help) and some changes were made in the webservice itself (Víctor can help here). I think it is a matter of using some regular expressions to hide these in some way... I guess that you are talking about this http://omegat.svn.sourceforge.net/viewvc/omegat/trunk/src/org/omegat/core/machinetranslators/ApertiumTranslate.java?revision=4434view=markup. 
I might be blind, but I haven't been able to identify the relevant piece of code there... You're right. Most of the work is done at the Apertium server when it receives format=omegat. The code that carries out the translation in the plug-in can be found here http://apertium.svn.sourceforge.net/viewvc/apertium/branches/gsoc2012/artetxem/apertium-omegat/src/org/omegat/plugin/machinetranslators/ApertiumTranslate.java?revision=39499view=markup from line 56 to 66. The function is really simple, so it should be easy to apply the necessary changes there (if we know what this necessary changes are!). I'm sure Víctor can tell you. It's a matter of catching and escaping the tags generated by OmegaT so that their name does not get translated. You should consider sending a message to the OmegaT list. I can help here, as I am subscribed. I think this is very, very important! And perhaps you can get OmegaT people to help with some of the issues. I will do it then! Excellent. It is important to note that these settings as well as the installed language pairs are shared with Apertium Caffeine, which means that, if you install, uninstall or update a language pair, the changes will be reflected in both programs. This is a design decision that I took, but it would be simple to make them