On 23 March 2011 05:40, Mikel Forcada <[email protected]> wrote:
> Hi Sagie,
>
> I would love to adopt a language pair, namely an English-Hebrew one.
> I noticed there's a resource link posted for Hebrew in the
> incubator: http://www.mila.cs.technion.ac.il/english/resources/lexicons/
> But this page only seems to list examples of some very basic lexicon
> collections.
> Since I'm not familiar with the amount of data required for a language pair,
> my question is: how difficult would it be to create a new pair out of this
> kind of data? Is the data even good enough?
>
> As a rule of thumb, you would need a couple of thousand dictionary
> entries to cover about 90% of everyday language. I don't know the data in
> http://www.mila.cs.technion.ac.il/english/resources/lexicons/ : the server
> wasn't working this morning.
>

They have a GPL'd WordNet for Hebrew, which is aligned with the synset
IDs of the English one: http://mila.cs.technion.ac.il/wordnet/ver2.0/

I haven't seen anything more than the examples for their morphological
database, but they're relatively well known as being pro-GPL,
right-thinking people, so it may be an oversight.

-- 
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
