Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-08-06 Thread Mikel Artetxe
And, finally, the new versions of Apertium Caffeine and the OmegaT plugin
are here! You can download Apertium Caffeine here:
https://apertium.svn.sourceforge.net/svnroot/apertium/builds/apertium-caffeine/apertium-caffeine.jar
and the OmegaT plugin here:
https://apertium.svn.sourceforge.net/svnroot/apertium/builds/apertium-omegat/apertium-omegat.jar
You should remove your previous installation (at least the directory for
the packages) before trying them. At the same time, a new version of
apertium-viewer has been released, which can be launched by following
this link:
https://apertium.svn.sourceforge.net/svnroot/apertium/builds/apertium-viewer/launch.jnlp
If that doesn't work, run the following command in the terminal:

javaws https://apertium.svn.sourceforge.net/svnroot/apertium/builds/apertium-viewer/launch.jnlp

The binaries, as well as the language pair packages, are now kept in the
new builds/ directory in SVN. A total of 20 language pairs (out of the 31
pairs released at Sourceforge) have proven compatible and have therefore
been released there. All these pairs are naturally supported by Apertium
Caffeine, the OmegaT plugin and apertium-viewer (yes, the new
apertium-viewer can work with online packages too!). When Arink releases
the next version, the Android app will support them as well.

At the same time, all 20 language pair packages can be launched through
Java Web Start. For instance, you can follow this link for apertium-af-nl:
https://apertium.svn.sourceforge.net/svnroot/apertium/builds/apertium-af-nl/apertium-af-nl.jnlp
or this one for apertium-ca-it:
https://apertium.svn.sourceforge.net/svnroot/apertium/builds/apertium-ca-it/apertium-ca-it.jnlp
For the rest of the pairs, you can look at the directory structure at
https://apertium.svn.sourceforge.net/svnroot/apertium/builds/ . The jnlp
files in each directory are the links for Java Web Start. Their
corresponding JARs can be downloaded and run as standalone programs as well.
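
As a rough illustration of how the links for the remaining pairs could be derived, here is a minimal Java sketch. It assumes that every package follows the same builds/NAME/NAME.jnlp layout seen in the two example links; that layout is inferred, not confirmed for all pairs, and the class and method names are hypothetical.

```java
// Hypothetical helper; the URL layout is an assumption inferred from the
// apertium-af-nl and apertium-ca-it links, not a documented rule.
final class BuildLinks {
    static final String BUILDS =
        "https://apertium.svn.sourceforge.net/svnroot/apertium/builds/";

    // Builds the Java Web Start link for a package such as "apertium-af-nl".
    static String jnlpUrl(String packageName) {
        return BUILDS + packageName + "/" + packageName + ".jnlp";
    }

    public static void main(String[] args) {
        System.out.println(jnlpUrl("apertium-af-nl"));
        System.out.println(jnlpUrl("apertium-ca-it"));
    }
}
```

The directory listing remains the authoritative source, so a link built this way should be checked against it.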

As said before, we have 20 working language pairs out of the 31 released
pairs. 7 released pairs depend on external programs that aren't part of
lttoolbox-java (one depends on apertium-pn-recogniser, and the other six on
the Constraint Grammar package) and are therefore incompatible with it. This
leaves 4 pairs (31 - 20 - 7) that should be compatible but, for some
reason, are failing. Let's see if somebody can help me with them. These are
the pairs along with the errors they give:

   - apertium-es-ro: Document apertium-es-ro.trules-ro-es.xml does not
   validate against /usr/local/share/apertium/transfer.dtd
   - apertium-oc-ca: Document oc-ca.t1x does not validate against
   /usr/local/share/apertium/transfer.dtd
   - apertium-oc-es: Document oc-es.t1x does not validate against
   /usr/local/share/apertium/transfer.dtd
   - apertium-pt-gl: java.lang.NumberFormatException: For input string: "s"

The validation problems happen during compilation, which fails because of
them. The NumberFormatException happens while trying to generate the
transfer bytecode. It seems that the transfer file contains an "s" where
a number is expected... So, any ideas about how to solve these problems?
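
For reference, the pt-gl message is the standard text Java produces when a non-numeric string reaches a numeric parser. The sketch below reproduces it under the assumption that the bytecode generator ends up calling something like Integer.parseInt on a value read from the transfer file; the class and the tryParse helper are hypothetical and not taken from lttoolbox-java.

```java
import java.util.OptionalInt;

// Hypothetical illustration; not code from lttoolbox-java.
final class ParseDemo {
    // Returns the parsed number, or an empty value if the text is not an integer.
    static OptionalInt tryParse(String text) {
        try {
            return OptionalInt.of(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("3")); // a well-formed numeric value
        System.out.println(tryParse("s")); // the kind of value reported for pt-gl
        try {
            Integer.parseInt("s");
        } catch (NumberFormatException e) {
            // Prints the same kind of message as the one in the pt-gl report.
            System.out.println(e.getMessage());
        }
    }
}
```

Searching the transfer file for an attribute that should hold a number but holds "s" would be the obvious next step.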
--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/___
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff


Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-08-06 Thread Jimmy O'Regan
On 6 August 2012 10:24, Mikel Artetxe artet...@gmail.com wrote:
> apertium-es-ro: Document apertium-es-ro.trules-ro-es.xml does not validate
> against /usr/local/share/apertium/transfer.dtd

I can't find any instance of 'trules' anywhere in that package. Are
you using the current SVN version?

> apertium-oc-ca: Document oc-ca.t1x does not validate against
> /usr/local/share/apertium/transfer.dtd
> apertium-oc-es: Document oc-es.t1x does not validate against
> /usr/local/share/apertium/transfer.dtd

These two involve running an xsl script (alt.xsl) on the transfer files first.

> apertium-pt-gl: java.lang.NumberFormatException: For input string: s
>
> The validation problems happen during compilation, and compilation fails
> because of it. The NumberFormatException happens while trying to generate
> the transfer bytecode. It seems that the transfer file contains a s where
> a number is expected... So, any idea about how to solve these problems?

Fixed.

-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you



Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-08-06 Thread Kevin Brubeck Unhammer
Jimmy O'Regan jore...@gmail.com writes:

> On 6 August 2012 10:24, Mikel Artetxe artet...@gmail.com wrote:
>> apertium-es-ro: Document apertium-es-ro.trules-ro-es.xml does not validate
>> against /usr/local/share/apertium/transfer.dtd
>
> I can't find any instance of 'trules' anywhere in that package. Are
> you using the current SVN version?

It's in the release tarball (it needed https://gist.github.com/3273244
to compile here).

>> apertium-oc-ca: Document oc-ca.t1x does not validate against
>> /usr/local/share/apertium/transfer.dtd
>> apertium-oc-es: Document oc-es.t1x does not validate against
>> /usr/local/share/apertium/transfer.dtd
>
> These two involve running an xsl script (alt.xsl) on the transfer files first.


… but they could all do with a bugfix release (when I packaged the
releases for Arch Linux, I had to do https://gist.github.com/3273264 and
https://gist.github.com/3273266 to make them compile).

Who maintains the packages?


-- 
Kevin Brubeck Unhammer

GPG: 0x766AC60C




Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-08-06 Thread Mikel Artetxe
On Mon, Aug 6, 2012 at 11:47 AM, Jimmy O'Regan jore...@gmail.com wrote:

> On 6 August 2012 10:24, Mikel Artetxe artet...@gmail.com wrote:
>> apertium-es-ro: Document apertium-es-ro.trules-ro-es.xml does not validate
>> against /usr/local/share/apertium/transfer.dtd
>
> I can't find any instance of 'trules' anywhere in that package. Are
> you using the current SVN version?


No, I'm using the last released version at Sourceforge.



>> apertium-oc-ca: Document oc-ca.t1x does not validate against
>> /usr/local/share/apertium/transfer.dtd
>> apertium-oc-es: Document oc-es.t1x does not validate against
>> /usr/local/share/apertium/transfer.dtd
>
> These two involve running an xsl script (alt.xsl) on the transfer files
> first.


Oh, you told me that before but I didn't try it... Sorry! The README
doesn't mention anything about it, so I assumed that it was done
automatically by the build script!

Anyway, I haven't figured out how to do it, so I will try the patches
pointed out by Kevin.



>> apertium-pt-gl: java.lang.NumberFormatException: For input string: s
>>
>> The validation problems happen during compilation, and compilation fails
>> because of it. The NumberFormatException happens while trying to generate
>> the transfer bytecode. It seems that the transfer file contains a s where
>> a number is expected... So, any idea about how to solve these problems?
>
> Fixed.


Thank you! I've already uploaded apertium-pt-gl to the builds/ area.
Anyway, shouldn't we fix the version released at Sourceforge as well? I
mean, all the pairs except pt-gl are using the last released version, so
doing the same here would be the most consistent thing, I guess.


Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-08-06 Thread Jimmy O'Regan
On 6 August 2012 12:07, Mikel Artetxe artet...@gmail.com wrote:
> Thank you! I've already uploaded apertium-pt-gl to the builds/ area. Anyway,
> shouldn't we fix the version released at Sourceforge as well? I mean, all
> the pairs except pt-gl are using the last released version, so doing the
> same here would be the most coherent thing, I guess.

I'm doing that, starting with es-ro (running the tarball build as I
type, in fact). I want to also fix the warnings first.


-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you



Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-08-06 Thread Mikel Artetxe

> thank you for the OmegaT plugin! It works like a charm.


Nice to hear that you like it!



> PS You wrote: "7 released pairs depend on external programs that aren't
> part of lttoolbox-java (one depends on apertium-pn-recogniser, and the
> other six on the Constraint Grammar package) and, thus, are incompatible
> with it." What language pairs are not compatible?


The following ones:

   - apertium-br-fr
   - apertium-cy-en
   - apertium-es-ast
   - apertium-is-en
   - apertium-mk-bg
   - apertium-mk-en
   - apertium-nn-nb

apertium-es-ast depends on apertium-pn-recogniser. The rest depend on the
Constraint Grammar package.


Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-08-06 Thread Francis Tyers

> apertium-es-ast depends on apertium-pn-recogniser.

As far as I can tell, apertium-pn-recogniser can just be removed, or a
version of the package built without it. I don't think it affects the
quality of translation that much -- the coverage is quite high anyway.

Fran




Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-08-06 Thread Per Tunedal
Hi,
I have considered starting work on a new language pair. In turn, I
have considered the following: sv-en, sv-fr and sv-nb (Swedish -
Norwegian bokmål). I reckon only the pair sv-nb (like the da-sv pair)
could do without a constraint grammar. All the same, it would profit from
using one.

Thus, I'll have to wait for some rich professional translator to sponsor
the development of an OmegaT plug-in that understands constraint
grammars. :-)

Yours,
Per Tunedal

PS The apertium-pn-recogniser would be useful if someone ever worked
on a pair involving German, where nouns begin with an upper-case letter.




Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-08-06 Thread Jimmy O'Regan
On 6 August 2012 14:05, Francis Tyers fty...@prompsit.com wrote:
>> apertium-es-ast depends on apertium-pn-recogniser.
>
> As far as I can tell, apertium-pn-recogniser can just be removed, or a
> version of the package built without it. I don't think it effects the
> quality of translation that much -- the coverage is quite high anyway.

I've added a 'NO_PN' version for download:
apertium-es-ast_NO-PN-1.1.0.tar.gz - it includes some small fixes
(which I've added to SVN), but nothing that would otherwise merit a
new release.

-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you



Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-08-04 Thread Per Tunedal
Hi again!
I would like to try the OmegaT plugin. Where can I find it? Anything I
have to know to be able to use it? What Apertium installation is used?
A local one or the one at the Apertium web site?
Yours,
Per Tunedal

PS What happened to the online language pair packages? I tried
Esperanto ⇆ English today and it didn't work. Maybe you have a more
mature version and more language pairs to try?

On Mon, Jul 23, 2012, at 20:11, Mikel Artetxe wrote:

> I've updated both apertium-caffeine and the OmegaT plugin for the
> following (you can find them in the usual place):


---snip--



Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-08-04 Thread Mikel Artetxe
On Sat, Aug 4, 2012 at 7:54 PM, Mikel Artetxe artet...@gmail.com wrote:

> On Sat, Aug 4, 2012 at 7:25 PM, Per Tunedal per.tune...@operamail.com wrote:
>
>> Hi again!
>> I would like to try the OmegaT plugin. Where can I find it?
>
> https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-caffeine.jar


Sorry. That's for apertium-caffeine. This is the one for the OmegaT plugin:

https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-omegat.jar


Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-23 Thread Mikel Artetxe
 I've updated both apertium-caffeine and the OmegaT plugin for the
following (you can find them in the usual place):


> Nevertheless I think the 'display ambiguity' option should be expelled
> from Apertium-caffeine as end user will never use this option.


Done!



> WRT formatters and deformatters I think its fine to make a simple
> (de)formatter like the one needed for omegaT or for HTML, if you anticipate
> they are needed for plugins.


I've written a formatter for the OmegaT plugin that simply makes everything
inside <> tags a superblank. I think that this is the only thing that is
needed, but I'm not an OmegaT user and I'm not really sure. So, in case you
find that it doesn't work as it should, please let me know.
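
A minimal Java sketch of that idea, assuming that superblanks are written as [...] in the Apertium stream format and that OmegaT tags never contain nested angle brackets. It is a hypothetical illustration, not the formatter shipped with the plugin, and it leaves out the escaping of reserved stream characters that a real deformatter would also need.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; not the formatter shipped with the OmegaT plugin.
final class TagSuperblanks {
    // Anything between '<' and the next '>' is treated as a tag.
    private static final Pattern TAG = Pattern.compile("<[^<>]*>");
    // A tag that was previously wrapped in square brackets.
    private static final Pattern WRAPPED = Pattern.compile("\\[(<[^<>]*>)\\]");

    // Deformat: wrap every tag in [ ] so the engine passes it through untouched.
    static String deformat(String segment) {
        return TAG.matcher(segment).replaceAll("[$0]");
    }

    // Reformat: remove the [ ] wrappers again after translation.
    static String reformat(String translated) {
        return WRAPPED.matcher(translated).replaceAll("$1");
    }

    public static void main(String[] args) {
        String wrapped = deformat("Click <u0>here</u0> now");
        System.out.println(wrapped);
        System.out.println(reformat(wrapped));
    }
}
```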

As for the HTML formatter, I haven't looked at how the C++ version works
yet, but I guess that it would require more work than the OmegaT one. And,
in any case, it wouldn't have any application right now, and I can't think
of any possible application for the future either...
--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/___
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff


Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-22 Thread Jacob Nordfalk
Hi there,

Sorry for not being able to react before. Below you find 3 separate subjects.

==

I've looked at Jimmy's changes and I decided to deploy a slightly different
solution: I simply check if sf has length 0.
The input ^=$^.<sent>$[] can now be handled and seems to work:


$ echo I will see   | java -jar lttoolbox-java/dist/lttoolbox.jar apertium -a -d /usr/local/share/apertium/ en-es
=Yo = =ver * *

$ echo will | java -jar lttoolbox-java/dist/lttoolbox.jar apertium -a -d /usr/local/share/apertium/ en-es
=

I've committed in Mikel's branch (as this is the branch which will have
future development and it already differs quite a bit from trunk):
http://apertium.svn.sourceforge.net/viewvc/apertium?view=revision&revision=39580

To Mikel, Jimmy and others interested in lttoolbox-java: When I develop a
new feature I usually make a small 'Playground' test program, and run it
like

$ java -cp lttoolbox-java/dist/lttoolbox.jar Playground

This is a much faster way to see if you've nailed a bug, because you can
comfortably debug in your favorite tool (for me it's NetBeans), inspect
variables and see what's happening. I usually set my breakpoints,
right-click on Playground and debug it.



Nevertheless I think the 'display ambiguity' option should be removed from
Apertium-caffeine, as end users will never use this option. And developers
hardly use it.


===


WRT formatters and deformatters, I think it's fine to make a simple
(de)formatter like the one needed for OmegaT or for HTML, if you anticipate
they are needed for plugins.  More advanced (de)formatters are for the C++
version, which has a sophisticated (some would say complicated :-) way of
(de)formatting which I *don't* recommend you look into.

But you could play with the C++ version to get a feel of it. For example:

$ echo "I am <b>fine</b> and all. :-)" | apertium-deshtml

I am[ <b>]fine[<\/b> ]and all. :-).[][
]


$ echo "I am <b>fine</b> and all. :-)" | apertium-deshtml | apertium-rehtml
I am <b>fine</b> and all. :-)

Stephen Tigener worked with the text formatter. Probably, if he has time,
he could quickly put something together.


Jacob

-- 
Jacob Nordfalk http://profiles.google.com/jacob.nordfalk
javabog.dk
Android developer and instructor at
IHK http://cv.ihk.dk/diplomuddannelser/itd/vf/MAU and
LundBendsen https://www.lundogbendsen.dk/undervisning/beskrivelse/LB1809/


Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-19 Thread Kevin Brubeck Unhammer
Mikel Forcada m...@dlsi.ua.es writes:

[...]

>>> There is one thing that could be easily solved. Víctor Sánchez
>>> (cc-ed) maybe can help you. When one uses the Apertium
>>> webservice from inside OmegaT, we avoid translating the tags
>>> (<u0>, etc.). Some minor changes were made to the code that
>>> calls Apertium as a webservice (you'll easily find them, but
>>> if not, I can help) and some changes were made in the
>>> webservice itself (Víctor can help here). I think it is a
>>> matter of using some regular expressions to hide these in some
>>> way...
>>
>> I guess that you are talking about this. I might be blind, but I
>> haven't been able to identify the relevant piece of code there...
>
> You're right. Most of the work is done at the Apertium server when it
> receives format=omegat.

Perhaps you can just use the

translate me<apertium-notrans>don't translate me</apertium-notrans>

method, this works in e.g. html and html-noent formats (grep tells me it
should also be supported in odt, pptx, xlsx, wxml).

[...]

>> Yes. We should probably create a new directory in SVN and start
>> creating and uploading packages for every language pair. The
>> question is how to maintain it in long-term: we could integrate my
>> script in the makefiles of each language pair to make things
>> easier (although the dependency of Android-SDK and lttoolbox-java
>> can still be a problem for some people), but we would still need
>> the implication of every language pair developer in Apertium (or
>> some responsible to take care of the whole maintenance).
>
> This deserves a deeper thought. Any ideas?

I liked the idea of just adding a make goal, though perhaps the script
could be installed by lttoolbox-java (since that's a dependency of the
script anyway), so that copies wouldn't be required by every language
pair?


-- 
Kevin Brubeck Unhammer

GPG: 0x766AC60C




Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-19 Thread Mikel Artetxe

 
>>> I guess that you are talking about this. I might be blind, but I
>>> haven't been able to identify the relevant piece of code there...
>>
>> You're right. Most of the work is done at the Apertium server when it
>> receives format=omegat.
>
> Perhaps you can just use the
>
> translate me<apertium-notrans>don't translate me</apertium-notrans>
>
> method, this works in e.g. html and html-noent formats (grep tells me it
> should also be supported in odt, pptx, xlsx, wxml).


I guess that would work in C++ Apertium, but lttoolbox-java can only
format/deformat plain text... We will probably have to write a new
formatter for lttoolbox-java if we want to avoid translating OmegaT tags.


>>> Yes. We should probably create a new directory in SVN and start
>>> creating and uploading packages for every language pair. The
>>> question is how to maintain it in long-term: we could integrate my
>>> script in the makefiles of each language pair to make things
>>> easier (although the dependency of Android-SDK and lttoolbox-java
>>> can still be a problem for some people), but we would still need
>>> the implication of every language pair developer in Apertium (or
>>> some responsible to take care of the whole maintenance).
>>
>> This deserves a deeper thought. Any ideas?
>
> I liked the idea of just adding a make goal, though perhaps the script
> could be installed by lttoolbox-java (since that's a dependency of the
> script anyway), so that copies wouldn't be required by every language
> pair?


I like that idea. I think that we should also consider installing dx.jar
together with it. It takes about 800 KB, and it is part of the Android-SDK.
This way, we would solve the dependency of the Android-SDK as well, but I'm
not sure if its license allows doing it (I guess so, but we will probably
have to keep a copyright notice or so)...


Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-19 Thread Mikel Artetxe

>>> Cool. Works like a charm. A minor quibble: I said I wanted it installed
>>> in /tmp and it decided that every file in there was an already installed
>>> language pair. Perhaps a regex identifying language pairs by name would be
>>> very helpful.
>>
>> I now look at the file extension to filter JAR files. I think that it
>> should be enough. It would also be possible to look at the whole filename,
>> but this would make manually installed pairs unusable unless they are
>> properly named...
>
> The standard style is apertium-[a-z][a-z][a-z]?-[a-z][a-z][a-z]?.jar I
> think it would be no problem for your program to require that.


Yes, but the JARs online follow a different naming convention so that we
can know the exact modes they contain without downloading them (that is,
simply looking at their name). For instance, en-eo,eo-en.jar is used
instead of apertium-eo-en.jar, expressing that the contained modes are
en-eo.mode and eo-en.mode (so, in this case, the program could conclude
that the language pair is bidirectional even if it hasn't downloaded it
yet).
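
To make the convention concrete, here is a small Java sketch of how the modes could be read from such a name without downloading the package. It assumes only the comma-separated layout described above; the class and method names are hypothetical and not taken from Apertium Caffeine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not code from Apertium Caffeine or the OmegaT plugin.
final class OnlineJarName {
    // Splits a name such as "en-eo,eo-en.jar" into the modes it lists.
    static List<String> modes(String fileName) {
        String base = fileName.endsWith(".jar")
            ? fileName.substring(0, fileName.length() - ".jar".length())
            : fileName;
        return new ArrayList<>(Arrays.asList(base.split(",")));
    }

    // True when some mode a-b is listed together with its reverse b-a.
    static boolean isBidirectional(String fileName) {
        List<String> listed = modes(fileName);
        for (String mode : listed) {
            String[] langs = mode.split("-", 2);
            if (langs.length == 2 && listed.contains(langs[1] + "-" + langs[0])) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(modes("en-eo,eo-en.jar"));
        System.out.println(isBidirectional("en-eo,eo-en.jar"));
    }
}
```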

In any case, I've updated both programs so that they look for the following
pattern, accepting both naming conventions:

return name.matches("([a-z_]+-[a-z_]+(,[a-z_]+-[a-z_]+)*).jar|(apertium-[a-z_]+-[a-z_]+).jar");
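
A self-contained Java sketch of the same check, for anyone who wants to try names against it. The two alternatives are copied from the pattern above; the only change is that the dot before "jar" is escaped so that it matches a literal ".", which is an assumption about the intent. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; not the actual filter used by the two programs.
final class PackageNameFilter {
    // Alternative 1: comma-separated modes (en-eo,eo-en.jar).
    // Alternative 2: the apertium-xx-yy.jar style.
    private static final Pattern PACKAGE_JAR = Pattern.compile(
        "([a-z_]+-[a-z_]+(,[a-z_]+-[a-z_]+)*)\\.jar|(apertium-[a-z_]+-[a-z_]+)\\.jar");

    // True when the whole file name fits one of the two naming conventions.
    static boolean isLanguagePairJar(String name) {
        return PACKAGE_JAR.matcher(name).matches();
    }

    public static void main(String[] args) {
        for (String name : new String[] {
                "en-eo,eo-en.jar", "apertium-eo-en.jar", "notes.txt"}) {
            System.out.println(name + " -> " + isLanguagePairJar(name));
        }
    }
}
```

One thing to keep in mind is that the first alternative accepts any two lower-case words joined by a hyphen, so a name like apertium-caffeine.jar would also pass this check.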


>> The code that carries out the translation in the plug-in can be found here:
>> http://apertium.svn.sourceforge.net/viewvc/apertium/branches/gsoc2012/artetxem/apertium-omegat/src/org/omegat/plugin/machinetranslators/ApertiumTranslate.java?revision=39499&view=markup
>> from line 56 to 66. The function is really simple, so it should be easy to
>> apply the necessary changes there (if we know what this necessary changes
>> are!).
>
> I'm sure Víctor can tell you. It's a matter of catching and escaping the
> tags generated by OmegaT so that their name does not get translated.


So, basically, what we need is to not translate anything between <> tags,
right? For instance, if we have something like "Ez dakit <zer arraio>
idatzi", we should avoid translating <zer arraio>... is that all we need?
If so, it shouldn't be too hard to achieve.


>>>> It is important to note that these settings as well as the installed
>>>> language pairs are shared with Apertium Caffeine, which means that, if you
>>>> install, uninstall or update a language pair, the changes will be reflected
>>>> in both programs. This is a design decision that I took, but it would be
>>>> simple to make them independent if you prefer it.
>>>
>>> Oops I hadn't read that. I think it would be nice to have separate
>>> options, yes.
>>
>> Now they are completely independent. But this could be problematic in some
>> special cases: if the same directory is chosen for both programs, they
>> would conflict and we would probably get a strange behaviour...
>
> You could, perhaps, add some kind of file that says which (OmegaT or
> Caffeine) is using the directory, and announce the conflict when the other
> one tries to use it too.


Done! The (already uploaded) new versions do that check.
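For anyone curious about what such a check could look like, here is a minimal sketch. The marker file name (".apertium-owner"), the class name and the program identifiers are assumptions made for this example; the released versions may well do it differently.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of the "who is using this directory" check.
final class DirectoryOwner {
    // Assumed marker file name; not taken from the real programs.
    static final String MARKER = ".apertium-owner";

    /**
     * Claims the directory for the given program.
     * Returns true if the directory was free or already belongs to it,
     * false if the marker says another program is using it.
     */
    static boolean claim(Path dir, String program) throws IOException {
        Path marker = dir.resolve(MARKER);
        if (Files.exists(marker)) {
            String owner = new String(Files.readAllBytes(marker), StandardCharsets.UTF_8).trim();
            return owner.equals(program);
        }
        Files.write(marker, program.getBytes(StandardCharsets.UTF_8));
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("apertium-packages");
        System.out.println("caffeine claims: " + claim(dir, "caffeine"));
        System.out.println("omegat claims:   " + claim(dir, "omegat"));
    }
}
```

When claim returns false, the caller would show the conflict to the user instead of touching the packages in that directory. Note that the sketch does not guard against both programs claiming the directory at exactly the same moment.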
--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/___
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff


Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-19 Thread Mikel Forcada

Mikel [et al]:
Yes, but the JARs online follow a different naming convention so that 
we can know the exact modes they contain without downloading them 
(that is, simply looking at their name). For instance, en-eo,eo-en.jar 
is used instead of apertium-eo-en.jar, expressing that the contained 
modes are en-eo.mode and eo-en.mode (so, in this case, the program 
could conclude that the language pair is bidirectional even if it 
hasn't downloaded it yet).



Cool.
In any case, I've updated both programs so that they look for the 
following pattern, accepting both naming conventions:


return
name.matches("([a-z_]+-[a-z_]+(,[a-z_]+-[a-z_]+)*)\\.jar|(apertium-[a-z_]+-[a-z_]+)\\.jar");
Great stuff! After all, it's up to you to decide the naming scheme for 
Caffeine, so if this is the naming style, go ahead with it.
So, basically, what we need is to not translate anything between  
tags, right? For instance, if we have something like Ez dakit zer 
arraio idatzi, we should avoid translating zer arraio... is that 
all we need? If so, it shouldn't be too hard to achieve.



Yeah, turning all of that into a superblank and letting Apertium deal 
with it. Would that be feasible?





You could, perhaps, add some kind of file that says which (OmegaT
or Caffeine) is using the directory, and announce the conflict
when the other one tries to use it too.


Done! The (already uploaded) new versions do that check.
Thanks a lot for such a quick turnaround! I look forward to the Apertium 
filtering. You'll have about 100 users for the OmegaT plugin in our 
Translation Technologies course at Universitat d'Alacant. That will 
ensure a lot of feedback!


Is there a webpage or a wiki page that explains how to export language 
pairs for use with Apertium Caffeine and the OmegaT plugin?


Cheers

Mikel

--
Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
Phone: +34 96 590 9776
Fax: +34 96 590 9326



Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-19 Thread Mikel Artetxe

  So, basically, what we need is to not translate anything between 
 tags, right? For instance, if we have something like Ez dakit zer arraio
 idatzi, we should avoid translating zer arraio... is that all we need?
 If so, it shouldn't be too hard to achieve.


   Yeah, turning all of that into a superblank and letting Apertium deal
 with it. Would that be feasible?


If I'm not wrong, it would require writing a new formatter (the same as
TextFormatter,
http://apertium.svn.sourceforge.net/viewvc/apertium/branches/gsoc2012/artetxem/lttoolbox-java/src/org/apertium/formatter/TextFormatter.java?revision=38326&view=markup,
but turning those tags into superblanks as you say) and integrating it with
the rest of lttoolbox-java, including the Translator API class, so that we
can call the proper function from the plug-in and let it do all the work.
So it wouldn't be trivial, but it is definitely feasible.
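Just to illustrate the idea outside lttoolbox-java, here is a rough sketch of the tag handling on its own. It assumes that OmegaT tags look like <f0>, </f0> or <br1/>, and it relies on superblanks being delimited by square brackets in the Apertium stream format. The class name is made up, only the tags themselves are wrapped (the text between them is left for translation), and the escaping of reserved characters that a real formatter has to do is left out.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; a real formatter would live inside lttoolbox-java.
final class TagProtector {
    // Assumed shape of OmegaT tags: <f0>, </f0>, <br1/>.
    private static final Pattern TAG = Pattern.compile("</?[a-z]+[0-9]+/?>");
    // The same tags once wrapped in superblank brackets.
    private static final Pattern PROTECTED = Pattern.compile("\\[(</?[a-z]+[0-9]+/?>)\\]");

    /** Wraps every tag in [ ] so that it travels through the pipeline as a superblank. */
    static String protect(String segment) {
        return TAG.matcher(segment).replaceAll("[$0]");
    }

    /** Removes the brackets again after translation. */
    static String restore(String translated) {
        return PROTECTED.matcher(translated).replaceAll("$1");
    }

    public static void main(String[] args) {
        String source = "Ez dakit <f0>zer arraio</f0> idatzi";
        String wrapped = protect(source);
        System.out.println(wrapped);
        System.out.println(restore(wrapped));
    }
}
```

The plug-in would call protect before handing the segment to the translator and restore on the result. Whether this is enough, or whether the whole tagged span should become a single superblank, is exactly the open question above.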



 Thanks a lot for such a quick turnaround! I look forward to the Apertium
 filtering. You'll have about 100 users for the OmegaT plugin in our
 Translation Technologies course at Universitat d'Alacant. That will ensure
 a lot of feedback!


That sounds really good!



 Is there a webpage or a wiki page that explains how to export language
 pairs for use with Apertium Caffeine and the OmegaT plugin?


Not really. But, just in case, I would like to remark that the JARs used by
Apertium Caffeine, the OmegaT plug-in, the new apertium-viewer, the Android
app that Arink is developing as well as the ones launched through Java Web
Start are actually exactly the same files. They aim to be something like
universal packages that would work with many different programs in any
OS. They can even be opened with any Zip program to extract the proper files
and run them with a local installation of C++ Apertium. This is why I wanted
to discuss the best way of creating and, in particular, maintaining them in
SVN.

As for your question, the same instructions that I explained and were under
discussion in that other thread I started would apply in this case as well.
Quoting myself:

The solution that I have been (and I'm still) working on comes in the form of
 two bash scripts, each one carrying out one of the tasks (you can find them
 here https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/ in
 my branch):

 1) apertium-pack-j offers an easy way to generate the packages. It
 requires lttoolbox-java (the one in my branch, not the one in
 trunk) and android-sdk to be installed, and their location must be specified by
 setting the LTTOOLBOX_JAVA_PATH and ANDROID_SDK_PATH environment
 variables. After that, you can simply run it, passing the paths to the mode
 files for which you want to generate the package as arguments, and the
 script will create a ready-to-use package. For instance, the
 following command would create a ready-to-use package for the
 Esperanto-English language pair named apertium-eo-en.jar on my machine:

 LTTOOLBOX_JAVA_PATH=/usr/local/share/apertium/lttoolbox.jar \
 ANDROID_SDK_PATH=/home/mikel/developer/android-sdk-linux \
 ./apertium-pack-j /usr/local/share/apertium/modes/eo-en.mode \
 /usr/local/share/apertium/modes/en-eo.mode

 As you can see, I simply specify the correct location of lttoolbox-java
 and android-sdk in my machine, and pass the location of eo-en.mode and
 en-eo.mode (the main modes that correspond to the Esperanto-English
 language pair) as argument to apertium-pack-j.


As I said, the dependency of the Android-SDK can be eliminated if we
install dx.jar together with lttoolbox-java.


It still crashes if I click on the ambiguity option...

 $ java -jar /tmp/apertium-caffeine.jar
 java.lang.StringIndexOutOfBoundsException: String index out of range: 0
 at java.lang.AbstractStringBuilder.charAt(AbstractStringBuilder.java:191)
 at java.lang.StringBuilder.charAt(StringBuilder.java:72)
 at org.apertium.lttoolbox.process.FSTProcessor.generation(FSTProcessor.java:1372)
 at org.apertium.lttoolbox.LTProc.doMain(LTProc.java:240)
 at org.apertium.pipeline.Dispatcher.doLTProc(Dispatcher.java:259)
 at org.apertium.pipeline.Dispatcher.dispatch(Dispatcher.java:333)
 at org.apertium.Translator.translate(Translator.java:305)
 at org.apertium.caffeine.ApertiumCaffeine$12.run(ApertiumCaffeine.java:275)
 at java.lang.Thread.run(Thread.java:679)

 Perhaps it would be a good idea to leave this option out for stability...


Jimmy identified the problem yesterday but it hasn't been solved yet:

Sorry, this is a different error:

 char ch0 = sf.charAt(0);

 This also needs a null check, but javac doesn't like
 char ch0 = (sf != null) ? sf.charAt(0) : '';
 and
 char ch0 = (sf != null) ? sf.charAt(0) : (char) Character.UNASSIGNED;
 isn't the same thing, though it doesn't look like it would matter in this
 case.

 One for Jacob, I think.


I've looked at the code and I agree with Jimmy that
char ch0 = (sf != null) ? sf.charAt(0) : (char) Character.UNASSIGNED;

Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-19 Thread Jimmy O'Regan
On 19 July 2012 18:12, Mikel Artetxe artet...@gmail.com wrote:
 char ch0 = (sf != null) ? sf.charAt(0) : (char) Character.UNASSIGNED;
 or even something like
 char ch0 = (sf != null) ? sf.charAt(0) : '\0';

Character.UNASSIGNED is 0, so this is the same thing. I'd lean towards
'Character.UNASSIGNED' because, though it's longer, it's
self-documenting.
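One thing worth noting about the trace: "String index out of range: 0" is what charAt(0) throws on an empty buffer, whereas a null reference would give a NullPointerException, so a null check alone would probably not be enough here. Below is a small stand-alone sketch that guards against both cases. The helper name firstCharOrNul is made up, and this is not the actual FSTProcessor code.

```java
// Hypothetical helper illustrating the guard; not the FSTProcessor code.
final class FirstChar {
    /** Returns the first character, or '\0' if the sequence is null or empty. */
    static char firstCharOrNul(CharSequence sf) {
        return (sf != null && sf.length() > 0) ? sf.charAt(0) : '\0';
    }

    public static void main(String[] args) {
        System.out.println(firstCharOrNul("will"));                     // first character of a normal string
        System.out.println((int) firstCharOrNul(new StringBuilder()));  // empty buffer, no exception
        System.out.println((int) firstCharOrNul(null));                 // null, no exception
        // Character.UNASSIGNED is a byte constant with value 0,
        // so the cast yields the same character as '\0'.
        System.out.println((char) Character.UNASSIGNED == '\0');
    }
}
```

Either spelling of the fallback value works; what matters is that the caller treats it as "no first character" rather than as real input.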

-- 
Sefam Are any of the mentors around?
jimregan yes, they're the ones trolling you



Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-18 Thread Mikel Forcada

Hi there,
in three letters: wow! A more detailed assessment follows.

_*Apertium Caffeine*_

Apertium Caffeine is a small, user-oriented Apertium client, similar 
in concept to apertium-tolk, but which has some great advantages over it:


  * It doesn't depend on anything external and is written in Java.
This means that it is completely platform-independent (it can work
on Linux, OS X as well as Windows) and its only requirement is a
Java VM (i.e. you don't need a separate installation of Apertium
or its language pairs). Since Java uses UTF-16 internally, we
shouldn't have any encoding problems either.
  * It manages language pairs within the app. In other words, you can
install, uninstall and even update language pairs from the app
itself in a simple, user-friendly way.

Cool. Works like a charm. A minor quibble: I said I wanted it installed 
in /tmp and it decided that every file in there was an already installed 
language pair. Perhaps a regex identifying language pairs by name would 
be very helpful.


  * Some other features for a better user experience that you will
find as you use the program: highlighting of unknown and ambiguous
words, full language names...


Haven't checked these yet.


The source code can be found here 
https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-caffeine/, 
but you can also download and test the ready-to-use JAR here 
https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-caffeine.jar.



_*Apertium plug-in for OmegaT*_

This is something that some of you suggested in the other thread and I 
have implemented as a proof of concept of how easily lttoolbox-java can 
be integrated in bigger Java projects. It shares most of its code with 
Apertium Caffeine, including the ability to manage language pairs 
within the app and, of course, it works offline.


The source code can be found here 
https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-omegat/, 
and the ready-to-use JAR here 
https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-omegat.jar. 
If you want to try it, simply copy the JAR to the plugins directory of 
your OmegaT installation. The next time that you launch OmegaT, you 
will see a new option at Options -> Machine Translate called Apertium 
(offline), which has to be checked to activate the plug-in. If you 
want to configure the plug-in or manage language pairs, go to Options 
-> Apertium settings.


The menu appeared but this one crashed on me: I think the problem is 
that I had a previous installation somewhere else (they use the same 
.java/.userPrefs/org/apertium/). I removed these.


Then on launching OmegaT I get a dialog that may be confusing for some, 
as it does not identify itself as an Apertium warning. It should...


It kindly asks me to decide where to install itself, and where to 
install language pairs. Then I open a project, mark Apertium offline 
in the Machine Translation options, and it crashes.


java.lang.IndexOutOfBoundsException: Index: 13, Size: 7
at java.util.ArrayList.rangeCheck(ArrayList.java:571)
at java.util.ArrayList.remove(ArrayList.java:412)
at org.apertium.lttoolbox.process.State.nodeStatePool_get(State.java:101)
at org.apertium.lttoolbox.process.State.copy(State.java:145)
at org.apertium.lttoolbox.process.State.copy(State.java:158)
at org.apertium.lttoolbox.process.FSTProcessor.analysis(FSTProcessor.java:857)
at org.apertium.lttoolbox.LTProc.doMain(LTProc.java:287)
at org.apertium.pipeline.Dispatcher.doLTProc(Dispatcher.java:259)
at org.apertium.pipeline.Dispatcher.dispatch(Dispatcher.java:333)
at org.apertium.Translator.translate(Translator.java:305)
at org.omegat.plugin.machinetranslators.ApertiumTranslate.translate(ApertiumTranslate.java:62)
at org.omegat.core.machinetranslators.BaseTranslate.getTranslation(BaseTranslate.java:64)
at org.omegat.gui.exttrans.MachineTranslateTextArea$FindThread.search(MachineTranslateTextArea.java:128)
at org.omegat.gui.exttrans.MachineTranslateTextArea$FindThread.search(MachineTranslateTextArea.java:103)
at org.omegat.gui.common.EntryInfoSearchThread.run(EntryInfoSearchThread.java:95)



I try again and it works. Very nice.

There is one thing that could be easily solved. Víctor Sánchez (cc-ed) 
maybe can help you. When one uses the Apertium webservice from inside 
OmegaT, we avoid translating the tags (u0, etc.). Some minor changes 
were made to the code that calls Apertium as a webservice (you'll easily 
find them, but if not, I can help) and some changes were made in the 
webservice itself (Víctor can help here). I think it is a matter of 
using some regular expressions to hide these in some way...


You should consider sending a message to the OmegaT list. I can help 
here, as I am subscribed. I think this is very, very important! And 

Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-18 Thread Kevin Donnelly
Hi Mikel

On Wednesday 18 July 2012 Mikel Artetxe said
 Apertium Caffeine is a small, user-oriented Apertium client, similar in
 concept to apertium-tolk, but which has some great advantages over it:

Looks good, but ticking Mark ambiguity causes it to crash. :-(

Ubuntu 10.04 
Java(TM) SE Runtime Environment (build 1.6.0_22-b04)

java.lang.StringIndexOutOfBoundsException: String index out of range: 0
at java.lang.AbstractStringBuilder.charAt(AbstractStringBuilder.java:174)
at java.lang.StringBuilder.charAt(StringBuilder.java:55)
at org.apertium.lttoolbox.process.FSTProcessor.generation(FSTProcessor.java:1372)
at org.apertium.lttoolbox.LTProc.doMain(LTProc.java:240)
at org.apertium.pipeline.Dispatcher.doLTProc(Dispatcher.java:259)
at org.apertium.pipeline.Dispatcher.dispatch(Dispatcher.java:333)
at org.apertium.Translator.translate(Translator.java:305)
at org.apertium.caffeine.ApertiumCaffeine$11.run(ApertiumCaffeine.java:242)
at java.lang.Thread.run(Thread.java:662)

Using the following test phrase:
I want to head off to the beach now.  When will we go next?

-- 
Pob hwyl / Best wishes

Kevin Donnelly
kevindonnelly.org.uk



Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-18 Thread Mikel Artetxe

 Looks good, but ticking Mark ambiguity causes it to crash. :-(

 Using the following test phrase:
 I want to head off to the beach now.  When will we go next?


Thank you for reporting it.

I've discovered that it is lttoolbox-java that crashes and not Apertium
Caffeine itself. The problematic word is "will" in the en-es language pair
when passing the -a flag. In other words, the following causes exactly
the same crash:

echo will | java -jar lttoolbox.jar apertium -a -d /usr/local/share/apertium/ en-es

Unfortunately, org.apertium.lttoolbox.process.FSTProcessor, which causes
the crash, is super complex (more than 2000 lines of code!), and my work so
far hasn't been related to it... Perhaps somebody that worked on it can
help us?

Also, note that, although it doesn't crash, C++ Apertium gives a strange
output for the same input:

[mikel@fedora ~]$ echo will | apertium -a en-es
=#

so perhaps the problem is in the en-es language pair... Any idea?


Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-18 Thread Jimmy O'Regan
On 18 July 2012 13:46, Mikel Artetxe artet...@gmail.com wrote:
 Looks good, but ticking Mark ambiguity causes it to crash. :-(

 Using the following test phrase:
 I want to head off to the beach now.  When will we go next?


 Thank you for reporting it.

 I've discovered that it is lttoolbox-java that crashes and not Apertium
 Caffeine itself. The problematic word is "will" in the en-es language pair
 when passing the -a flag. In other words, the following causes exactly the
 same crash:

 echo will | java -jar lttoolbox.jar apertium -a -d /usr/local/share/apertium/ en-es

 Unfortunately, org.apertium.lttoolbox.process.FSTProcessor, which causes the
 crash, is super complex (more than 2000 lines of code!), and my work so far
 hasn't been related to it... Perhaps somebody that worked on it can help us?


:D

I think this goes to show how little the '-a' option is used, because
neither implementation of generation in lttoolbox handles it.

The word 'will' disappears because the tagger is picking the auxiliary
form, which has a null translation in the bidix, but the generation
error appears with other words:

$ echo can |apertium -a en-es
=#poder

$ echo can |apertium en-es
Puede

The solution for you is simply to remove the option - it does nothing
that's useful for an end user.

-- 
Sefam Are any of the mentors around?
jimregan yes, they're the ones trolling you



Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-18 Thread Jimmy O'Regan
On 18 July 2012 15:22, Jimmy O'Regan jore...@gmail.com wrote:
 The word 'will' disappears because the tagger is picking the auxiliary
 form, which has a null translation in the bidix, but the generation
 error appears with other words:

 $ echo can |apertium -a en-es
 =#poder

 $ echo can |apertium en-es
 Puede

I was only half right - this is a different issue. I committed 39497
to lttoolbox-java in trunk[1] to check for an empty translation, but
this is only a partial fix, as it leads to ghost = signs in the text
where this has happened. (To do otherwise would require a bit more
effort, and I'm not entirely convinced it's worth it).

[1] 39496 for lttoolbox (C++)

-- 
Sefam Are any of the mentors around?
jimregan yes, they're the ones trolling you



Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-18 Thread Jimmy O'Regan
On 18 July 2012 16:13, Jimmy O'Regan jore...@gmail.com wrote:
 On 18 July 2012 15:22, Jimmy O'Regan jore...@gmail.com wrote:
 The word 'will' disappears because the tagger is picking the auxiliary
 form, which has a null translation in the bidix, but the generation
 error appears with other words:

 $ echo can |apertium -a en-es
 =#poder

 $ echo can |apertium en-es
 Puede

 I was only half right - this is a different issue.

The different issue, for the record, is that an extra optional
transduction needs to be inserted when preprocessing the transfer
files (to make '^=' at the start of a token equivalent to '^').

-- 
Sefam Are any of the mentors around?
jimregan yes, they're the ones trolling you



Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-18 Thread Mikel Artetxe
Thank you for your feedback Mikel! I've fixed some of the issues that you
have found (the new version is already uploaded):

*Apertium Caffeine*

 Apertium Caffeine is a small, user-oriented Apertium client, similar in
 concept to apertium-tolk, but which has some great advantages over it:

- It doesn't depend on anything external and is written in Java. This
means that it is completely platform-independent (it can work on Linux, OS
X as well as Windows) and its only requirement is a Java VM (i.e. you don't
need a separate installation of Apertium or its language pairs). Since Java
uses UTF-16 internally, we shouldn't have any encoding problems either.
- It manages language pairs within the app. In other words, you can
install, uninstall and even update language pairs from the app itself in a
simple, user-friendly way.

  Cool. Works like a charm. A minor quibble: I said I wanted it installed
 in /tmp and it decided that every file in there was an already installed
 language pair. Perhaps a regex identifying language pairs by name would be
 very helpful.


I now look at the file extension to filter JAR files. I think that it
should be enough. It would also be possible to look at the whole filename,
but this would make manually installed pairs unusable unless they are
properly named...

*Apertium plug-in for OmegaT*

 This is something that some of you suggested in the other thread and I
 have implemented as a proof of concept of how easily lttoolbox-java can be
 integrated in bigger Java projects. It shares most of its code with
 Apertium Caffeine, including the ability to manage language pairs within
 the app and, of course, it works offline.

 The source code can be found here
 https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-omegat/,
 and the ready-to-use JAR here
 https://apertium.svn.sourceforge.net/svnroot/apertium/branches/gsoc2012/artetxem/apertium-omegat.jar.
 If you want to try it, simply copy the JAR to the plugins directory of your
 OmegaT installation. The next time that you launch OmegaT, you will see a
 new option at Options -> Machine Translate called Apertium (offline),
 which has to be checked to activate the plug-in. If you want to configure
 the plug-in or manage language pairs, go to Options -> Apertium settings.


 The menu appeared but this one crashed on me: I think the problem is that
 I had a previous installation somewhere else (they use the same
 .java/.userPrefs/org/apertium/). I removed these.


They now use independent preferences, so we shouldn't have that problem
anymore.


Then on launching OmegaT I get a dialog that may be confusing for some, as
 it does not identify itself as an Apertium warning. It should...


Now it does!


It kindly asks me to decide where to install itself, and where to install
 language pairs. Then I open a project, mark Apertium offline in the
 Machine Translation options, and it crashes.

 java.lang.IndexOutOfBoundsException: Index: 13, Size: 7
 at java.util.ArrayList.rangeCheck(ArrayList.java:571)
 at java.util.ArrayList.remove(ArrayList.java:412)
 at org.apertium.lttoolbox.process.State.nodeStatePool_get(State.java:101)
 at org.apertium.lttoolbox.process.State.copy(State.java:145)
 at org.apertium.lttoolbox.process.State.copy(State.java:158)
 at org.apertium.lttoolbox.process.FSTProcessor.analysis(FSTProcessor.java:857)
 at org.apertium.lttoolbox.LTProc.doMain(LTProc.java:287)
 at org.apertium.pipeline.Dispatcher.doLTProc(Dispatcher.java:259)
 at org.apertium.pipeline.Dispatcher.dispatch(Dispatcher.java:333)
 at org.apertium.Translator.translate(Translator.java:305)
 at org.omegat.plugin.machinetranslators.ApertiumTranslate.translate(ApertiumTranslate.java:62)
 at org.omegat.core.machinetranslators.BaseTranslate.getTranslation(BaseTranslate.java:64)
 at org.omegat.gui.exttrans.MachineTranslateTextArea$FindThread.search(MachineTranslateTextArea.java:128)
 at org.omegat.gui.exttrans.MachineTranslateTextArea$FindThread.search(MachineTranslateTextArea.java:103)
 at org.omegat.gui.common.EntryInfoSearchThread.run(EntryInfoSearchThread.java:95)


I'm not sure if I have correctly identified the problem, but it is probably
fixed now.


There is one thing that could be easily solved. Víctor Sánchez (cc-ed)
 maybe can help you. When one uses the Apertium webservice from inside
 OmegaT, we avoid translating the tags (u0, etc.). Some minor changes were
 made to the code that calls Apertium as a webservice (you'll easily find
 them, but if not, I can help) and some changes were made in the webservice
 itself (Víctor can help here). I think it is a matter of using some regular
 expressions to hide these in some way...


I guess that you are talking about this:
http://omegat.svn.sourceforge.net/viewvc/omegat/trunk/src/org/omegat/core/machinetranslators/ApertiumTranslate.java?revision=4434&view=markup
I might be blind, but 

Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-18 Thread Jimmy O'Regan
On 18 July 2012 18:28, Mikel Artetxe artet...@gmail.com wrote:

 I was only half right - this is a different issue. I committed 39497
 to lttoolbox-java in trunk[1] to check for an empty translation, but
 this is only a partial fix, as it leads to ghost = signs in the text
 where this has happened. (To do otherwise would require a bit more
 effort, and I'm not entirely convinced it's worth it).


 I haven't really understood the technical details, but lttoolbox-java still
 crashes... Look:

 [mikel@fedora dist]$ echo will | java -jar lttoolbox.jar apertium -a -d /usr/local/share/apertium/ en-es

 java.lang.StringIndexOutOfBoundsException: String index out of range: 0
 at java.lang.AbstractStringBuilder.charAt(AbstractStringBuilder.java:174)
 at java.lang.StringBuilder.charAt(StringBuilder.java:55)
 at org.apertium.lttoolbox.process.FSTProcessor.generation(FSTProcessor.java:1323)

Sorry, this is a different error:

char ch0 = sf.charAt(0);

This also needs a null check, but javac doesn't like
char ch0 = (sf != null) ? sf.charAt(0) : '';
and
char ch0 = (sf != null) ? sf.charAt(0) : (char) Character.UNASSIGNED;
isn't the same thing, though it doesn't look like it would matter in this case.

One for Jacob, I think.

-- 
Sefam Are any of the mentors around?
jimregan yes, they're the ones trolling you



Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-18 Thread Jimmy O'Regan
On 18 July 2012 19:25, Jimmy O'Regan jore...@gmail.com wrote:
 On 18 July 2012 18:28, Mikel Artetxe artet...@gmail.com wrote:

 I was only half right - this is a different issue. I committed 39497
 to lttoolbox-java in trunk[1] to check for an empty translation, but
 this is only a partial fix, as it leads to ghost = signs in the text
 where this has happened. (To do otherwise would require a bit more
 effort, and I'm not entirely convinced it's worth it).


 I haven't really understood the technical details, but lttoolbox-java still
 crashes... Look:

 [mikel@fedora dist]$ echo will | java -jar lttoolbox.jar apertium -a -d /usr/local/share/apertium/ en-es

 java.lang.StringIndexOutOfBoundsException: String index out of range: 0
 at java.lang.AbstractStringBuilder.charAt(AbstractStringBuilder.java:174)
 at java.lang.StringBuilder.charAt(StringBuilder.java:55)
 at org.apertium.lttoolbox.process.FSTProcessor.generation(FSTProcessor.java:1323)

 Sorry, this is a different error:

 char ch0 = sf.charAt(0);

 This also needs a null check, but javac doesn't like
 char ch0 = (sf != null) ? sf.charAt(0) : '';
 and
 char ch0 = (sf != null) ? sf.charAt(0) : (char) Character.UNASSIGNED;
 isn't the same thing, though it doesn't look like it would matter in this 
 case.

 One for Jacob, I think.

I've attached the diff, it might make it more clear.

-- 
Sefam Are any of the mentors around?
jimregan yes, they're the ones trolling you


patch
Description: Binary data


Re: [Apertium-stuff] New applications: Apertium Caffeine and Apertium plug-in for OmegaT

2012-07-18 Thread Mikel Forcada

Hi Mikel:

Thank you for your feedback Mikel! I've fixed some of the issues that 
you have found (the new version is already uploaded):

Cool!


Cool. Works like a charm. A minor quibble: I said I wanted it
installed in /tmp and it decided that every file in there was an
already installed language pair. Perhaps a regex identifying
language pairs by name would be very helpful.


I now look at the file extension to filter JAR files. I think that it 
should be enough. It would also be possible to look at the whole 
filename, but this would make manually installed pairs unusable unless 
they are properly named...
The standard style is apertium-[a-z][a-z][a-z]?-[a-z][a-z][a-z]?.jar; I 
think it would be no problem for your program to require that.
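
A filename filter along those lines could look like the sketch below. It assumes two- or three-letter lowercase language codes, as in the pattern quoted above, and escapes the dot before "jar" so that it matches only a literal dot. The class and method names are hypothetical and not taken from Apertium Caffeine:

```java
import java.util.regex.Pattern;

public class PairJarFilter {
    // apertium-xx-yy.jar, with 2- or 3-letter lowercase codes on each side.
    private static final Pattern PAIR_JAR =
            Pattern.compile("apertium-[a-z]{2,3}-[a-z]{2,3}\\.jar");

    /** True when the bare file name (no directory part) looks like a language pair package. */
    public static boolean isLanguagePairJar(String fileName) {
        // matches() requires the whole name to fit the pattern.
        return PAIR_JAR.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        System.out.println(isLanguagePairJar("apertium-af-nl.jar"));
        System.out.println(isLanguagePairJar("apertium-caffeine.jar"));
    }
}
```

The trade-off raised in the quoted reply still applies: a manually installed pair with a non-standard file name would be ignored by a filter this strict.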



_*Apertium plug-in for OmegaT*_




The menu appeared, but this one crashed on me: I think the problem
is that I had a previous installation somewhere else (they use the
same .java/.userPrefs/org/apertium/). I removed these.


They now use independent preferences, so we shouldn't have that 
problem anymore.



Then on launching OmegaT I get a dialog that may be confusing for
some, as it does not identify itself as an Apertium warning. It
should...


Now it does!

I'll take a look. Thanks a lot!



It kindly asks me to decide where to install itself, and where to
install language pairs. Then I open a project, mark Apertium
offline in the Machine Translation options, and it crashes.

java.lang.IndexOutOfBoundsException: Index: 13, Size: 7
at java.util.ArrayList.rangeCheck(ArrayList.java:571)
at java.util.ArrayList.remove(ArrayList.java:412)
at org.apertium.lttoolbox.process.State.nodeStatePool_get(State.java:101)
at org.apertium.lttoolbox.process.State.copy(State.java:145)
at org.apertium.lttoolbox.process.State.copy(State.java:158)
at org.apertium.lttoolbox.process.FSTProcessor.analysis(FSTProcessor.java:857)
at org.apertium.lttoolbox.LTProc.doMain(LTProc.java:287)
at org.apertium.pipeline.Dispatcher.doLTProc(Dispatcher.java:259)
at org.apertium.pipeline.Dispatcher.dispatch(Dispatcher.java:333)
at org.apertium.Translator.translate(Translator.java:305)
at org.omegat.plugin.machinetranslators.ApertiumTranslate.translate(ApertiumTranslate.java:62)
at org.omegat.core.machinetranslators.BaseTranslate.getTranslation(BaseTranslate.java:64)
at org.omegat.gui.exttrans.MachineTranslateTextArea$FindThread.search(MachineTranslateTextArea.java:128)
at org.omegat.gui.exttrans.MachineTranslateTextArea$FindThread.search(MachineTranslateTextArea.java:103)
at org.omegat.gui.common.EntryInfoSearchThread.run(EntryInfoSearchThread.java:95)


I'm not sure if I have correctly identified the problem, but it is 
probably fixed now.

I will test it again, to see what happens.



There is one thing that could be easily solved. Víctor Sánchez
(cc-ed) maybe can help you. When one uses the Apertium webservice
from inside OmegaT, we avoid translating the tags (u0, etc.).
Some minor changes were made to the code that calls Apertium as a
webservice (you'll easily find them, but if not, I can help) and
some changes were made in the webservice itself (Víctor can help
here). I think it is a matter of using some regular expressions to
hide these in some way...


I guess that you are talking about this 
http://omegat.svn.sourceforge.net/viewvc/omegat/trunk/src/org/omegat/core/machinetranslators/ApertiumTranslate.java?revision=4434&view=markup. 
I might be blind, but I haven't been able to identify the relevant 
piece of code there...
You're right. Most of the work is done at the Apertium server when it 
receives format=omegat.


The code that carries out the translation in the plug-in can be found 
here 
http://apertium.svn.sourceforge.net/viewvc/apertium/branches/gsoc2012/artetxem/apertium-omegat/src/org/omegat/plugin/machinetranslators/ApertiumTranslate.java?revision=39499&view=markup 
from line 56 to 66. The function is really simple, so it should be 
easy to apply the necessary changes there (if we know what these 
necessary changes are!).
I'm sure Víctor can tell you. It's a matter of catching and escaping the 
 tags generated by OmegaT so that their name does not get translated.
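
One way to read "catching and escaping" is to swap each tag for a numbered placeholder before translation and put the original tags back afterwards. The sketch below illustrates that idea only. The tag shape (a letter run followed by a number, such as <u0>, </u0> or <br1/>), the [n] placeholder syntax, and the class and method names are all assumptions, and this is not how the Apertium server handles format=omegat:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagShield {
    // Assumed tag shape: <u0>, </u0>, <br1/>.
    private static final Pattern TAG = Pattern.compile("</?[a-zA-Z]+[0-9]+/?>");
    // Placeholder inserted in place of each tag: [0], [1], ...
    private static final Pattern SLOT = Pattern.compile("\\[(\\d{1,6})\\]");

    /** Hides tags from the translator, then restores them in the result. */
    public static String translateKeepingTags(String source, UnaryOperator<String> translator) {
        List<String> tags = new ArrayList<>();
        Matcher m = TAG.matcher(source);
        StringBuffer shielded = new StringBuffer();
        while (m.find()) {
            tags.add(m.group());
            m.appendReplacement(shielded,
                    Matcher.quoteReplacement("[" + (tags.size() - 1) + "]"));
        }
        m.appendTail(shielded);

        // The translator stands in for whatever call performs the translation.
        String translated = translator.apply(shielded.toString());

        Matcher s = SLOT.matcher(translated);
        StringBuffer out = new StringBuffer();
        while (s.find()) {
            int i = Integer.parseInt(s.group(1));
            // Unknown indices are left untouched rather than guessed at.
            String replacement = i < tags.size() ? tags.get(i) : s.group();
            s.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        s.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // toUpperCase() is a dummy translator used only to show the round trip.
        System.out.println(translateKeepingTags("<u0>hello</u0> world", String::toUpperCase));
    }
}
```

A known weakness of this sketch is that source text which already contains something like [0] would be confused with a placeholder; a real implementation would need an escaping scheme the translation engine is guaranteed to pass through unchanged.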


You should consider sending a message to the OmegaT list. I can
help here, as I am subscribed. I think this is very, very
important! And perhaps you can get OmegaT people to help with some
of the issues.


I will do it then!

Excellent.



It is important to note that these settings as well as the
installed language pairs are shared with Apertium Caffeine, which
means that, if you install, uninstall or update a language pair,
the changes will be reflected in both programs. This is a design
decision that I took, but it would be simple to make them