Re: [Apertium-stuff] scraper.py

2012-12-09 Thread Ryan Johnson
If you're looking for tools to build corpora from miscellaneous websites, I'd recommend taking a look at pattern.web[1]. It makes it very easy to construct very specific spiders and to convert pages to plaintext. I've used it precisely for this task several times, so feel free to ask if there's anythi

[Apertium-stuff] Finnish election parallel texts

2012-10-20 Thread Ryan Johnson
This could be of interest to several people here, if anyone needs small-ish but mostly parallel texts. The simplified Finnish version of this text appears not to be all that parallel, however. http://www.vaalit.fi/58721.htm Languages are: Finnish, Swedish, Inari Sami, Skolt Sami, North Sami, ka

Re: [Apertium-stuff] GSoC: Embedding Apertium in iPhone is a no-go.

2012-03-28 Thread Ryan Johnson
On Mar 28, 2012 6:20 AM, "Jacob Nordfalk" wrote: > > > > 2012/3/28 Jacob Nordfalk >> >> What!? >> >> >> 2012/3/28 Jimmy O'Regan >>> >>> On 28 March 2012 11:33, Jacob Nordfalk wrote: >>> > Overall, make sure that your port can be published in Apples store. >>> > >>> >>> The short answer is, it c

Re: [Apertium-stuff] Open source English CG?

2012-02-16 Thread Ryan Johnson
On Wed, Feb 15, 2012 at 11:45 PM, Kevin Donnelly wrote: > Hi Ryan > > On Thursday 16 February 2012 Ryan Johnson said > > It looks like the only English constraint grammar I can find, linked from > > the official CG site, is the one that lingsoft.fi has-- does any

Re: [Apertium-stuff] Open source English CG?

2012-02-16 Thread Ryan Johnson
On Thu, Feb 16, 2012 at 12:42 AM, Tino Didriksen wrote: > On Thu, Feb 16, 2012 at 06:03, Ryan Johnson wrote: > >> It looks like the only English constraint grammar I can find, linked from >> the official CG site, is the one that lingsoft.fi has-- does anyone know >> of any

[Apertium-stuff] Open source English CG?

2012-02-15 Thread Ryan Johnson
Hey all, It looks like the only English constraint grammar I can find, linked from the official CG site, is the one that lingsoft.fi has-- does anyone know of any open source English CGs, or are they all under tight wraps? Ryan ---

Re: [Apertium-stuff] aligned morphological dictionaries

2011-09-18 Thread Ryan Johnson
SFST, HFST, XFST and similar tools can do this efficiently; it is in fact part of their intended use. If you have an exact example, I might be able to show you what you'd do. Otherwise, the example you've provided should be easy. I was a little curious a

Re: [Apertium-stuff] Catalan WordNet

2011-04-10 Thread Ryan Johnson
It is so delightful to see "hot!" next to the words 'Catalan WordNet' on that page. ;) R On Sun, Apr 10, 2011 at 5:33 PM, Jimmy O'Regan wrote: > The Catalan WordNet > (http://nlp.lsi.upc.edu/web/index.php?option=com_docman&Itemid=135) > has been GPL since last July. > > -- > Are any of the me

Re: [Apertium-stuff] Language codes (sme-nob, nb-se what does that stand for?)

2010-07-23 Thread Ryan Johnson
They're separate ISO standards. The standard that uses two-letter codes doesn't have enough combinations to cover all the world's languages, so there are other standards. One covers language families and another covers individual language names. It seems like on the whole, most big lang

Re: [Apertium-stuff] apertium-sme-fin coverage statistics

2010-07-19 Thread Ryan Johnson
h needs to be rendered in Sámi with: váldooasi oasis: Fin: pääosa osa Sme: váldooassi oassi Unsure at the moment as to whether this is a stock phrase, or also the result of something more productive. Fixing these issues will be good to do, but for now it seems at least like someone can make s

[Apertium-stuff] apertium-sme-fin coverage statistics

2010-07-18 Thread Ryan Johnson
Hi all, Here are some of the coverage statistics for sme-fin. I tried a couple of texts: first some small ones to get my coverage script working, and then a big one to give it a full test. I'm only reporting on the small texts because there are some interesting differences as they are different types o

[Apertium-stuff] Example sentence database

2010-07-15 Thread Ryan Johnson
Hey all, Someone passed this on to me. I figure it will be useful to someone here :> It's a database of example sentences matched up in various languages. Judging from the number of online visitors (53) they have at the time of writing, it seems like the site is gaining popularity... So, perh

[Apertium-stuff] HTML/XML shortcut syntax

2010-05-02 Thread Ryan Johnson
Somehow I get the feeling you all write a lot of XML from time to time. Since GSoC hasn't really gotten underway I haven't become thoroughly acquainted with how much or how little XML I'll be writing (since I expect a good amount of it will be scripted)... But, this seems like it'd be totally usefu

[Apertium-stuff] apertium-fin-sme update: Karlsson's Finnish CG has been GPL'd

2010-04-14 Thread Ryan Johnson
Hey all, Quick update, as you can see in the subject, Fred Karlsson has just released his Finnish Constraint Grammar. It is currently being (semi-automatically) converted, with the goal of transferring it from CG1 to VISL-CG3. It's available here: https://victorio.uit.no/langtech/trunk/kt/fin/src.
