Re: [Apertium-stuff] Definite determiner in apertium-es-ca

Kevin Brubeck Unhammer Wed, 11 Jul 2012 11:06:02 -0700

"Bernard Chardonneau" <bechapert...@free.fr>
writes:

> Hey everybody.
>
> After 10 days mostly out in nature without a computer, and just before
> another 8 weeks without a permanent internet connection (largely by
> choice), I want to give my opinion, as a new pair developer, on the
> discussion about what dictionaries should contain.
>
> 1) For monodices, I fully agree with Fran and some others that all
> useful information should be there, even if several pairs do not use
> it.
>
> Since doing that generally means writing a complete paradigm once and
> then reusing it hundreds or thousands of times for the main ones, it is
> not a big problem.
>
> 2) For bidixes, the most natural way to build them is to write
> something like:
>
> <e><p><l>my_word<s n="kind1"/></l><r>my_translation<s n="kind2"/></r></p></e>
>
> where kind1 and kind2 are often the same and can be built from the
> name of the paradigm used in the monodix.
>
> I say this because I quickly realised that adding a new line by hand,
> typing the right XML syntax, in a file with more than 40 000 other
> lines quickly becomes painful.
> So I wrote a 4-parameter shell script to generate new lines, and
> another to put these lines in the right place.
>
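
A generator of that kind could look roughly like the sketch below. It is
not the shell script described above, only a minimal illustration in
Python. The function names (bidix_entry, symbols) and the convention of
passing tags as a dot-separated string such as "n.f" are assumptions made
for this example, and multiword lemmas are not handled.

```python
from xml.sax.saxutils import escape, quoteattr


def symbols(tags):
    """Turn a dot-separated tag string such as 'n.f' into <s n="..."/> elements."""
    return "".join(f"<s n={quoteattr(t)}/>" for t in tags.split(".") if t)


def bidix_entry(left, left_tags, right, right_tags):
    """Build one bidix line from four parameters (hypothetical helper)."""
    return (
        f"<e><p><l>{escape(left)}{symbols(left_tags)}</l>"
        f"<r>{escape(right)}{symbols(right_tags)}</r></p></e>"
    )


if __name__ == "__main__":
    print(bidix_entry("virgule", "n.f", "coma", "n.f"))
```

Called with the four values from the fr-es example, it reproduces the
virgule/coma entry on a single line.
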
> I think a lot of pair developers have their own shell scripts to do
> the same or something similar to build a bidix when the monodices are
> available.
>
> So, to keep bidix lines as simple as the one above, it would be better
> if no other <s n="something"/> were needed.
>
> Of course, there are exceptions where the extra symbols give nice
> results, as in the fr-es pair:
>
> <e><p><l>coma<s n="n"/><s n="m"/></l><r>coma<s n="n"/><s n="m"/></r></p></e>
> <e><p><l>virgule<s n="n"/><s n="f"/></l><r>coma<s n="n"/><s n="f"/></r></p></e>
>
> or
>
> <e><p><l>composant<s n="n"/><s n="m"/></l><r>componente<s n="n"/><s n="m"/></r></p></e>
> <e><p><l>composante<s n="n"/><s n="f"/></l><r>componente<s n="n"/><s n="f"/></r></p></e>
>
> But having to write (in the eo-fr pair)
> <e><p><l>ABC<s n="np"/><s n="al"/></l><r>ABC<s n="np"/><s n="al"/><s n="mf"/></r></p></e>
> without forgetting any <s n="al"/> or the <s n="mf"/>, to avoid
> getting a # in the translation, is not a very nice way to work.
>
> There is of course the problem of the beginner not doing that and
> asking on the list why it does not work. But that can be learned
> quickly.
>
> But the bigger problem is having to do that almost always, and ending
> up with longer and slightly less readable lines in the bidix.
>
> I think that even in this case (gender change):
> <e><p><l>ajout<s n="n"/><s n="m"/></l><r>adición<s n="n"/><s n="f"/></r></p></e>
> there should be no need to give the gender if there is no word
> ambiguity in either language of the kind seen with coma and componente
> in Spanish.
>
> And of course something like:
> <e r="LR"><p><l>binaire<s n="adj"/><s n="mf"/></l><r>binario<s n="adj"/><s n="GD"/></r></p></e>
> <e r="RL"><p><l>binaire<s n="adj"/><s n="mf"/></l><r>binario<s n="adj"/><s n="f"/></r></p></e>
> <e r="RL"><p><l>binaire<s n="adj"/><s n="mf"/></l><r>binario<s n="adj"/><s n="m"/></r></p></e>
>
> would be simpler as a single line.
>
> So the question is how to do that without breaking things.
>
>
> Solution 1 : paradigm
>
> Several people mentioned it, but without details.
> I note that the <s n="kind"/> information inside bidixes can generally
> be generated from the name of the paradigm used in the monodix,
> which looks like "something__kind" (or "foo__bar" if you prefer).
>
> But of course, there is less information in "kind" than in
> "something__kind".
>
> So a nice approach would be, for each paradigm of every monodix, to
> build a paradigm with the same name in the bidix, containing just
> an invariant list of symbols like:
>
> <s n="thing1"/><s n="thing2"/>
>
> That way, even gender ambiguities such as that of the Spanish word
> coma could be resolved elegantly:
>
> <e><p><l>coma<s n="livre__n"/></l><r>coma<s n="abismo__n"/></r></p></e>
> <e><p><l>virgule<s n="abeille__n"/></l><r>coma<s n="abeja__n"/></r></p></e>


Didn't Jacob Nordfalk and Michael Kristensen make a script to do that
kind of thing with sv-da? I.e., automatically create bidix pardefs based
on monodix pardefs.
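
To make the idea concrete, here is a minimal sketch of how the invariant
symbols of each monodix pardef could be collected. It is not the sv-da
script, which I have not reproduced here. The function name
invariant_tags is hypothetical, and the sketch assumes that the
"invariant list" is the run of leading <s n="..."/> symbols shared by
every <r> in the pardef. Nested <par/> references are ignored.

```python
import xml.etree.ElementTree as ET


def invariant_tags(monodix_xml):
    """Map each pardef name to the leading symbols shared by all its entries."""
    root = ET.fromstring(monodix_xml)
    result = {}
    for pardef in root.iter("pardef"):
        common = None
        for r in pardef.iter("r"):
            tags = [s.get("n") for s in r.iter("s")]
            if common is None:
                common = tags
            else:
                # Keep only the longest prefix shared with this entry.
                n = 0
                while n < min(len(common), len(tags)) and common[n] == tags[n]:
                    n += 1
                common = common[:n]
        result[pardef.get("n")] = common or []
    return result


SAMPLE = """
<dictionary><pardefs>
  <pardef n="abeille__n">
    <e><p><l></l><r><s n="n"/><s n="f"/><s n="sg"/></r></p></e>
    <e><p><l>s</l><r><s n="n"/><s n="f"/><s n="pl"/></r></p></e>
  </pardef>
</pardefs></dictionary>
"""

if __name__ == "__main__":
    print(invariant_tags(SAMPLE))
```

For the made-up sample pardef, the shared symbols are n and f, which is
what a bidix-side "abeille__n" would have to expand to.
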

> Solution 2 : during compilation
>
> That's another approach. When compiling bidix files, there are two
> cases:
> - the information is given in an <s n="thing"/>, so just use it
> - the information is not given, so it is taken from the
>   monodix.
>
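
The two cases can be sketched as below. This is only an illustration of
the rule as stated, not of how the existing dictionary compiler works. The
name resolve_tags and the representation of the monodix as a plain dict
from lemma to a list of tag lists are assumptions for this example. The
sketch refuses to guess when the monodix has more than one analysis for
the lemma, which is the ambiguity case raised earlier for coma.

```python
def resolve_tags(lemma, explicit_tags, monodix):
    """Return the tags for one side of a bidix entry.

    explicit_tags: tags written in the bidix entry (may be empty).
    monodix: dict mapping lemma -> list of tag lists (assumed layout).
    """
    # Case 1: the bidix entry states the information, so just use it.
    if explicit_tags:
        return list(explicit_tags)
    # Case 2: nothing stated, so take it from the monodix if unambiguous.
    candidates = monodix.get(lemma, [])
    if len(candidates) == 1:
        return list(candidates[0])
    raise ValueError(
        f"cannot infer tags for {lemma!r}: {len(candidates)} monodix analyses"
    )


if __name__ == "__main__":
    es = {"adición": [["n", "f"]], "coma": [["n", "m"], ["n", "f"]]}
    print(resolve_tags("adición", [], es))
    print(resolve_tags("coma", ["n", "f"], es))
```
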
>
> Have a good summer.

You too :-)

--
Kevin Brubeck Unhammer

GPG: 0x766AC60C

