[Apertium-stuff] The Story was: Re: Old fashoned SMT IBM model 1 outperforms Apertium

2013-08-30 Thread Per Tunedal
Hi again, I just found my corrections on a Linux box I haven't used for a long time. I concluded that the original was the English version. See below: 1 VAR ÄR JAMES? (Engelskt original: Proycon) 1 HVOR ER JAMES? (Engelsk original: Procyon) 2 James och Mary är i trädgården. Väd

[Apertium-stuff] sv-da: Generate mode files

2013-08-30 Thread Per Tunedal
Hi, I happened to create a new repository and can't find any mode files. I've tried to create them, but sh apertium-generate-modes modes.xml doesn't work: sh can't open apertium-generate-modes. It's a long time since I experimented with modes. Besides: I cannot compile the dictionaries with "make

Re: [Apertium-stuff] Old fashoned SMT IBM model 1 outperforms Apertium

2013-08-30 Thread Francis Tyers
On Fri, 30 Aug 2013 at 14:39 +0200, Per Tunedal wrote: > Hi, > amazing. Maybe I should have started from scratch instead! > > Anyhow the Swedish is kind of weird in the story. What language is the > original? The original is English, written by a Dutch native speaker. Yes, the la

Re: [Apertium-stuff] sv-da: Generate mode files

2013-08-30 Thread Francis Tyers
On Fri, 30 Aug 2013 at 16:48 +0200, Per Tunedal wrote: > Hi, > I happened to create a new repository and don't find any mode-files. > I've tried to create them but sh apertium-generate-modes modes.xml > doesn't work: > sh can't open apertium-generate-modes > > It's a long time

Re: [Apertium-stuff] Old fashoned SMT IBM model 1 outperforms Apertium

2013-08-30 Thread Kevin Brubeck Unhammer
Per Tunedal writes: [...] > Right, you seldom translate a Block World Text, but most words and > grammatical intrinsics are useful. The verbs take and put are in fact > very frequent. Of course they are, they're the only two verbs in your corpus. Or did you mean they're frequent outside the bl

Re: [Apertium-stuff] Old fashoned SMT IBM model 1 outperforms Apertium

2013-08-30 Thread Per Tunedal
Hi again, Long ago I sent some examples of strange things in the story and asked for the original. I intended to correct the translations, but didn't get any answer. There's no point in discussing examples if you don't know which text is the original. Maybe what appears strange is closer to the or

Re: [Apertium-stuff] Old fashoned SMT IBM model 1 outperforms Apertium

2013-08-30 Thread Kevin Brubeck Unhammer
Per Tunedal writes: > Yes, > it might be a good idea to test on a more meaningful text. In fact > that's what I've planned for the next step. > > I don't like that story, though. I found some of the translations not > accurate when I looked at it long ago. And that made me suspect that > some oth

Re: [Apertium-stuff] Old fashoned SMT IBM model 1 outperforms Apertium

2013-08-30 Thread Per Tunedal
Hi, amazing. Maybe I should have started from scratch instead! Anyhow the Swedish is kind of weird in the story. What language is the original? Besides: there's a fundamental problem with the story. It's used both for development and for evaluation. That's violating the rules for development! Simple c
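The objection above, that the same text is used for both development and evaluation, is usually addressed by holding out part of the parallel corpus. The following is only a minimal sketch of that idea, not anything from the thread: the function name `split_corpus`, the 20% test fraction and the fixed seed are all assumptions chosen for illustration.

```python
import random


def split_corpus(pairs, test_fraction=0.2, seed=0):
    """Split sentence pairs into a development set and a held-out test set.

    `pairs` is any sequence of (source, target) items. The names and the
    default fraction are hypothetical, chosen only for this sketch.
    """
    pairs = list(pairs)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(pairs)
    n_test = max(1, int(len(pairs) * test_fraction))
    # The first n_test shuffled pairs are held out; the rest are for development.
    return pairs[n_test:], pairs[:n_test]


if __name__ == "__main__":
    corpus = [(f"source sentence {i}", f"target sentence {i}") for i in range(10)]
    dev, test = split_corpus(corpus)
    print(len(dev), "development pairs,", len(test), "held-out pairs")
```

Rules or models would then be developed against the first set only, and scores reported on the second.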

Re: [Apertium-stuff] Old fashoned SMT IBM model 1 outperforms Apertium

2013-08-30 Thread Per Tunedal
Yes, it might be a good idea to test on a more meaningful text. In fact that's what I've planned for the next step. I don't like that story, though. I found some of the translations inaccurate when I looked at it long ago. And that made me suspect that some other translations in languages I didn

Re: [Apertium-stuff] Old fashoned SMT IBM model 1 outperforms Apertium

2013-08-30 Thread Francis Tyers
On Fri, 30 Aug 2013 at 09:12 +, Francis Tyers wrote: > On Fri, 30 Aug 2013 at 11:04 +0200, Per Tunedal wrote: > > Hi again, > > Thank you. I will dig into this. > > > > You didn't answer my question about what's wrong with the English > > version of the Bloc

Re: [Apertium-stuff] Old fashoned SMT IBM model 1 outperforms Apertium

2013-08-30 Thread Francis Tyers
On Fri, 30 Aug 2013 at 11:04 +0200, Per Tunedal wrote: > Hi again, > Thank you. I will dig into this. > > You didn't answer my question about what's wrong with the English > version of the Block World Corpus? It might be a good idea to improve > the language: It's not worth impr

Re: [Apertium-stuff] Old fashoned SMT IBM model 1 outperforms Apertium

2013-08-30 Thread Per Tunedal
Hi again, Thank you. I will dig into this. You didn't answer my question about what's wrong with the English version of the Block World Corpus? It might be a good idea to improve the language: > The English data for the corpus is kind of weird (borderline > ungrammatical) in some places. Feel f

[Apertium-stuff] Challenge: Make your language pair beat SMT with IBM model 1

2013-08-30 Thread Per Tunedal
Hi all language maintainers, I've got a challenge for you. It should not take more than a week. Let your Apertium language pair meet my simple IBM1 decoder in the ring. The fight is about doing the best translation of the Block World Corpus found at my site tunedal.nu: http://www.tunedal.nu/down

Re: [Apertium-stuff] Old fashoned SMT IBM model 1 outperforms Apertium

2013-08-30 Thread Francis Tyers
On Fri, 30 Aug 2013 at 10:39 +0200, Per Tunedal wrote: > Hi, > > On Thu, Aug 29, 2013, at 11:20, Francis Tyers wrote: > > On Thu, 29 Aug 2013 at 10:13 +0200, Per Tunedal wrote: > > > Hi, > > > the design of Apertium has some resemblance with the outdated > > > w

Re: [Apertium-stuff] Starting a pair with separated mondixies was: Re: new top-level SVN module for monolingual language packs

2013-08-30 Thread Francis Tyers
On Fri, 30 Aug 2013 at 10:18 +0200, Per Tunedal wrote: > Hi, > I would do with a full description of the extra steps involved to be > able to make an appreciation of the needed effort, compared to a > standard approach. > Yours, > Per Tunedal > > On Thu, Aug 29, 2013, at 11:24, F

Re: [Apertium-stuff] Old fashoned SMT IBM model 1 outperforms Apertium

2013-08-30 Thread Per Tunedal
Hi, On Thu, Aug 29, 2013, at 11:20, Francis Tyers wrote: > On Thu, 29 Aug 2013 at 10:13 +0200, Per Tunedal wrote: > > Hi, > > the design of Apertium has some resemblance with the outdated > > word-to-word statistical translations models, especially the simplest: > > IBM model 1:
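For readers unfamiliar with the model discussed in this thread, IBM Model 1 estimates word-to-word translation probabilities from sentence pairs with expectation-maximisation. The sketch below is a textbook-style illustration under stated assumptions, not the decoder mentioned in the thread: the function name `train_ibm1`, the toy German-English corpus and the omission of the NULL word are all simplifications made here.

```python
from collections import defaultdict


def train_ibm1(pairs, iterations=10):
    """Estimate t[(f, e)] = P(f | e) with EM, as in IBM Model 1.

    `pairs` is a list of (source_tokens, target_tokens). This sketch
    leaves out the NULL word that the full model adds to each target
    sentence.
    """
    src_vocab = {f for src, _ in pairs for f in src}
    # Start from a uniform distribution over the source vocabulary.
    t = defaultdict(lambda: 1.0 / len(src_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of f aligned to e
        total = defaultdict(float)  # expected count of e aligned to anything
        for src, tgt in pairs:
            for f in src:
                # E-step: spread one unit of f over the words of this target sentence.
                norm = sum(t[(f, e)] for e in tgt)
                for e in tgt:
                    share = t[(f, e)] / norm
                    count[(f, e)] += share
                    total[e] += share
        # M-step: renormalise the expected counts per target word.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


if __name__ == "__main__":
    # Toy corpus, invented for this illustration.
    toy = [
        ("das haus".split(), "the house".split()),
        ("das buch".split(), "the book".split()),
        ("ein buch".split(), "a book".split()),
    ]
    table = train_ibm1(toy)
    for (f, e), p in sorted(table.items()):
        print(f"P({f} | {e}) = {p:.3f}")
```

The model has no notion of word order or morphology, which is why it is considered the simplest of the word-based models.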

Re: [Apertium-stuff] Starting a pair with separated mondixies was: Re: new top-level SVN module for monolingual language packs

2013-08-30 Thread Per Tunedal
Hi, I could do with a full description of the extra steps involved, so that I can estimate the effort needed compared to a standard approach. Yours, Per Tunedal On Thu, Aug 29, 2013, at 11:24, Francis Tyers wrote: > El dt 27 de 08 de 2013 a les 08:07 +0200, en/na Per Tunedal va escr

[Apertium-stuff] New version of AddToDix

2013-08-30 Thread Per Tunedal
Hi, I've just uploaded a new version of my Java tools for adding to Apertium dictionaries. Now it works OK from Danish to Swedish too! I suppose the tools might be usable for some other language pairs as well, maybe after some tweaking. You'll find AddToDix on the download page of my site tuned