Per Tunedal wrote on 9 Aug 2012 at 20:21:
> Tihomir has told before that he plans to start developing a constraint
> grammar for Swedish.

Good. Again: 
- Are there open resources?
- Could something be ported from Norwegian? (perhaps only indirectly).

>> Yes, a production system (say, I want to translate a sv article to nn on
>> Wikipedia) (…)

> Yes, that was the scenario I first had in mind. But it would break if
> there is a need for a constraint grammar, wouldn't it? And then there
> won't be any use left for the Apertium-translation.

Well, a handful of rules will remove most ambiguities, so the text will at 
least be partly disambiguated. How bad the remaining ambiguity is for MT 
remains to be seen. So it will not break. It will only be more problematic, 
and the result will be poorer.

>> The good news is that the making of such an
>> enlarged transfer lexicon in part can be done automatically, and then
>> manually post edited.
> What do you have in mind? Please tell me more about how to generate the
> bidic automatically!

a. via a parallel corpus (of course)
b. by
--- 1 taking a sv list of words
--- 2 running it through a sv2no orthographical + lexical transfer
--- 3 analyzing the output and picking the recognized matches (input N Sg -> 
collect all N Sg output)
--- 4 going through the result manually
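
The steps under b. could be sketched roughly as follows. This is only an illustration: `propose_pairs`, `toy_convert` and `toy_no_lexicon` are names made up for the sketch, and the small dictionary stands in for a real Norwegian morphological analyser.

```python
def propose_pairs(sv_entries, convert, no_lexicon):
    """Split sv entries into accepted and rejected bidix candidates.

    sv_entries: iterable of (lemma, tag) pairs, e.g. ("station", "n")
    convert:    function mapping a sv string to a no-looking string
    no_lexicon: dict from no lemma to the set of tags it is known with
                (a stand-in for running a real analyser)
    """
    accepted, rejected = [], []
    for lemma, tag in sv_entries:
        candidate = convert(lemma)
        # Keep the pair only if the converted form is a known no word
        # with the same tag as the sv input (N Sg in, N Sg out).
        if tag in no_lexicon.get(candidate, set()):
            accepted.append((lemma, candidate, tag))
        else:
            rejected.append((lemma, candidate, tag))
    return accepted, rejected


def toy_convert(word):
    # Hypothetical, deliberately tiny transfer: one suffix rule, one letter rule.
    return word.replace("tion", "sjon").replace("ö", "ø")


toy_no_lexicon = {"stasjon": {"n"}, "nasjon": {"n"}}
sv_entries = [("station", "n"), ("nation", "n"), ("fönster", "n")]

accepted, rejected = propose_pairs(sv_entries, toy_convert, toy_no_lexicon)
```

Here "fönster" ends up in the rejected list, since the converted string "fønster" is not in the toy lexicon. Both lists still go to the manual step: the accepted pairs to be checked, the rejected ones to be translated by hand.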

About the transducer:
Lexical changes: samhälle > samfunn, the prefix o- > u-, stad > by (when these 
occur in compounds)
Suffixes: -tion > -sjon
The obvious things: ö > ø, ä > æ, x > ks

See e.g.
associationsrikedom, variationsrikedom, situationsrikedom, infektionssjukdom, 
kombinationsslalom, informationsergonom, nationalekonom, 
sundströmnationalekonom, konsumtionsboom, kommunikationsform, 
organisationsform, notationsform, injektionsform, portionsform, 
distributionsform, nationalsocialism, ationalism, nationalism, smygnationalism, 
multinationalism, hypernationalism, internationalism, vänsternationalism, 
naturnationalism, hägnainossnationalism, statsnationalism, rationalism, 
sensationalism, traditionalism, funktionalism, exceptionalism, 
koncentrationskapitalism, mutationsmekanism, isolationism, exhibitionism, 
perfektionism, protektionism, interventionism

This is a list of -tion- words. They should all have -sjon- in nb and nn. In 
addition: c > s, rikedom > rikdom, sjuk > syk (nb only), ekonom > økonom, 
social > sosial, -ism > -isme, xc > ks.
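
A minimal sketch of the orthographic part of such a transducer, written as ordered string rewrites rather than as a real finite-state transducer. The function name `sv_to_no`, the rule order, and the restriction of c > s to positions before e, i and y are assumptions made for the sketch. It uses the Norwegian spellings "sosial" and "syk", and it leaves out the lexical changes (samhälle, o-, stad).

```python
import re

# Ordered rewrite rules; order matters (xc must be handled before c and x).
COMMON_RULES = [
    (r"xc", "ks"),           # exceptionalism -> eksep...
    (r"c(?=[eiy])", "s"),    # assumption: c > s only before e, i, y
    (r"tion", "sjon"),
    (r"rikedom", "rikdom"),
    (r"ekonom", "økonom"),
    (r"ism$", "isme"),       # only word-finally
    (r"x", "ks"),
    (r"ö", "ø"),
    (r"ä", "æ"),
]

# Rules that apply to Bokmål only; Nynorsk keeps sjuk.
NB_ONLY_RULES = [(r"sjuk", "syk")]


def sv_to_no(word, variety="nb"):
    """Apply the rewrite rules in order; variety is "nb" or "nn"."""
    rules = COMMON_RULES + (NB_ONLY_RULES if variety == "nb" else [])
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word


for sv_word in ["nationalsocialism", "exceptionalism", "infektionssjukdom"]:
    print(sv_word, "->", sv_to_no(sv_word), "/", sv_to_no(sv_word, "nn"))
```

The output is only a candidate string. A word like vänsternationalism still comes out wrong, which is why the result has to be checked against an analyser and then by hand.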

Thus a long row of small changes is needed to turn such loanword strings 
into Norwegian. In a recent frequency corpus from Svenska språkbanken I found 
365000 unique word forms; of these, 7700 contained -tion and thus need the 
ruleset above.
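
Such a count could be reproduced along these lines. The input format (one word form per line, followed by a tab and a frequency) and the lowercasing are assumptions for the sketch, not a description of the actual corpus file, and `count_tion_forms` is a made-up name.

```python
def count_tion_forms(lines):
    """Return (number of unique forms, number of those containing 'tion')."""
    forms = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumed format: "form<TAB>frequency"; only the form is used.
        forms.add(line.split("\t")[0].lower())
    tion_forms = {form for form in forms if "tion" in form}
    return len(forms), len(tion_forms)


# Toy input instead of the real frequency list.
sample = ["nation\t120", "Nation\t4", "stad\t300", "station\t80", ""]
total, with_tion = count_tion_forms(sample)
print(total, with_tion, with_tion / total)
```

With the figures above, 7700 out of 365000 forms is about 2.1 % of the unique word forms.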

> And for the manual part:
> Keld once told me there is a list of "false friends" for da/sv/nb.
> Where do I find that list of problematic words?

In paper dictionaries and textbooks used in the universities for learning your 
neighboring language.

>> 
>> 1 in the analysis/generation of Swedish
>> 2 … and in the bidix.
>> 
>> As for 1, we should look around in the Swedish language technology
>> landscape and look for open resources, e.g. in Gothenburg (Aarne Ranta,
>> also Språkbanken).
> 
> What kind of resources do I need?

For 1: swetwol :-) But it seems there are resources in Gothenburg:

http://www.cse.chalmers.se/alumni/markus/FM/
http://www.cse.chalmers.se/alumni/markus/FM/download/swedish.lexicon

This might even work.

>> As for 2, Lexin might be one resource. I am on Euralex in Oslo right now,
>> and will ask around.
> Fine! Besides, what's Lexin?

Lexikon för invandrare ("dictionary for immigrants"), http://lexin.nada.kth.se/lexin/

Trond.


_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
