Hi,

On Mon, Aug 13, 2012, at 11:42, Kevin Brubeck Unhammer wrote:
> Per Tunedal <per.tune...@operamail.com>
> writes:
> 
> > Hi,
> >
> > On Sat, Aug 11, 2012, at 00:53, Jacob Nordfalk wrote:
> >> 2012/8/10 Per Tunedal <per.tune...@operamail.com>
> 
> [...]
> 
> > What about the translation in the other direction, nb/nn to sv? Is there
> > the same need for a Constraint Grammar? Or can I do without it? My
> > original plan was to start developing translation in that direction.
> > As a native Swede I would find it much easier to translate from
> > Norwegian to Swedish: I wouldn't have to check that much in
> > dictionaries. Besides, professional translators always translate into
> > their mother tongue.
> 
> Likewise, people who work on MT are recommended to begin with
> translating into their mother tongue :) (that's also the reason for the
> state of da→sv: no Swedes have worked on it).

I see. There is no end to the work to be done by a Swede! What exactly
has to be done before releasing the pair da - sv? I recall what Keld
wrote in a previous post:

"As Danish is a kind of old Norwegian bokmaal, maybe we could include
that language too.
Then all three languages could benefit from the combined work."

> 
> You could simply copy the CGs over unchanged from apertium-nb-nn.
> 
> > Another conclusion from the discussion is that I need to create very
> > large dictionaries, because far fewer words are exactly the same in
> > Norwegian and Swedish than in Norwegian bokmål (nb) and Norwegian
> > nynorsk (nn). On the other hand, someone wrote that for
> > comprehension only a short list of difficult words is needed. My
> > own conclusion is that a "pair" Norwegian (nb/nn) to Swedish (sv)
> > containing the most frequent words plus words that are known to cause
> > difficulties (including "false friends") might turn out to be very
> > useful. Does anyone know how to work out which words to include in
> > the latter list? Collect personal experiences from experts like you?
> >
> > BTW I ran a few words from the nb frequency word list through
> > Apertium-caffeine using the da-sv pair. I expected a very low
> > percentage of unknown words, given my experience with the hand
> > cream translation. Unfortunately I got as many as 40% unknown words.
> >
> > I planned to translate, say, the first 1500 or so words on the frequency
> > list, to get the most important unknown words to work with for a start.
> > I expected to get only a few hundred of them, but now I'm not so sure.
> 
> On deciding what words to work on first:
> 
> 1. get all "closed category" words done (pronouns, determiners, question
> words)
> 
> 2. then make a frequency list of open category words (nouns, verbs,
> adjectives, adverbs) and start adding from the top
> 
> Of course, if you have a list of false friends, give them priority, but
> most such lists are short, so that part will hardly take up much time. The
> main work with creating a related-languages pair is adding open category
> translations.

Thank you for your advice.
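
For what it's worth, step 2 above and the unknown-word check can be combined in a few lines of Python. This is only a rough sketch under assumptions: it assumes the translator output marks each unknown word with a leading "*" and otherwise passes the source word through unchanged, and the function names (frequency_list, unknown_report, todo_list) are made up for illustration and are not part of Apertium.

```python
import re
from collections import Counter


def frequency_list(text):
    """Return (word, count) pairs for a source text, most frequent first."""
    # [^\W\d_]+ matches runs of letters, including non-ASCII ones such as "å".
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(words).most_common()


def unknown_report(translated):
    """Return (unknown rate, Counter of unknown words) for translator output.

    Assumption: unknown words appear as "*word" in the output.
    """
    tokens = translated.split()
    unknown = [
        t.lstrip("*").strip(".,;:!?").lower()
        for t in tokens
        if t.startswith("*")
    ]
    rate = len(unknown) / len(tokens) if tokens else 0.0
    return rate, Counter(unknown)


def todo_list(source_text, translated):
    """List unknown words, ordered by their frequency in the source text."""
    _, unknown = unknown_report(translated)
    return [word for word, _ in frequency_list(source_text) if word in unknown]
```

Feeding todo_list a source corpus and its translation gives the unknown words with the most frequent first, which is the order in which adding them to the dictionaries pays off soonest. Closed-category words would still have to be handled by hand first, as in step 1.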

> 
> > What about word order? I found a translation on my sun cream that has
> > different word order for Danish (da) and Norwegian bokmål (nb). Just a
> > coincidence or a fact to take into account?
> 
> What matters is whether it's different in nb and sv …

Well, as I've said before, I plan to reuse as much as possible from the
pair sv - da.

> 
> I doubt the grammatical constructions found in the Sun Cream Corpus are
> very representative of normal language use; typically, when you've
> worked a while on step 1. and 2. above, you run some text (e.g. news
> articles, wikipedia) through the translator and find the most commonly
> "odd" or plain wrong grammatical constructions; these you can fix with
> transfer rules.
> 
> > Another concern of mine: will the solution (3) with separate
> > monolingual dictionaries and a common bilingual dictionary work "out of
> > the box" with Apertium, Apertium-caffeine and the OmegaT-plugin? Or does
> > this solution imply some changes to the code? Apertium would have to
> > find out somehow what monolingual dictionary to look into, wouldn't it?
> > I intend to start to play around with Swedish (sv), Danish (da) and
> > Norwegian bokmål (nb): can I test drive my dictionaries and rules?
> 
> It will; it is the solution used by e.g. en-ca to translate from ca to
> either en_GB or en_US.
> 

Excellent!

> 
> 
> -- 
> Kevin Brubeck Unhammer
> 
--snip--

_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
