+1! I can more or less test the Catalan models myself, and two colleagues at work are from the Basque Country, so I think I could check the Basque ones with them.
Thanks a lot!

On Sun, 17 Nov 2024 at 11:01, Richard Zowalla <[email protected]> wrote:

> +1 (do it)
>
> On 17 November 2024 at 10:48:18 CET, Martin Wiesner <[email protected]> wrote:
> >Hi,
> >
> >Any objections against (or support for) doing a release (1.2) for the pre-trained models? Details below:
> >
> >I’ve added 9 new languages (via UD treebanks) to our model list:
> >
> >- "Armenian|hy|BSUT"
> >- "Basque|eu|BDT"
> >- "Catalan|ca|AnCora"
> >- "Georgian|ka|GLC"
> >- "Greek|el|GDT"
> >- "Kazakh|kk|KTB"
> >- "Korean|ko|Kaist"
> >- "Icelandic|is|IcePaHC"
> >- "Turkish|tr|BOUN"
> >
> >In total, this results in 32 supported languages, each with sentence detection, tokenization and POS tagging.
> >
> >Moreover, the updated ud-train script now produces ME models for the Lemmatizer component, 32 of them.
> >
> >The training is conducted with OpenNLP 2.5.0 and with the treebanks from the latest UD release, dated Nov 15, 2024.
> >
> >I can act as RM.
> >
> >Best
> >Martin
