Re: Semi-mechanizing the DTTP translations

2013-01-16 Thread Redmar
Hannie Dumoleyn wrote on Tue 15-01-2013 at 12:22 [+0100]:
> Hi,
> You can add your own Translation Memory (TM) to the Translator Kit, but 
> the number of Kb's is limited.
> And yes, the translation is pretty poor. It is only usable for common 
> phrases and words.
> This means that, like you said, it is better to use Redmar's script to 
> sort on popularity and then translate offline.
> Hannie

When you merge the translations of an older version of Ubuntu into the
current version (msgmerge quantal_ddtp.po raring_ddtp.po -o
merged_ddtp.po, for example), msgmerge marks a lot of strings as 'fuzzy':
strings that are similar to an already translated one (for example, meta
packages for different programs, debugging symbols etc.). Reviewing these
can also save considerable time, and it does not have the problems your
language might have with Google Translate.
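
To get a feeling for how much review work such a merge leaves, you can
count the fuzzy entries in merged_ddtp.po. The following is only a rough
sketch using the Python standard library, not part of any existing script:
count_fuzzy and the SAMPLE text are made up for illustration, and it
assumes entries are separated by blank lines and that fuzzy entries carry
a "#, fuzzy" flag line, as msgmerge writes them.

```python
import re

# Hypothetical miniature PO file: a header, one fuzzy entry, one untranslated entry.
SAMPLE = '''msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\\n"

#, fuzzy
msgid "debugging symbols for foo"
msgstr "placeholder translation taken from a similar string"

msgid "metapackage for baz"
msgstr ""
'''


def count_fuzzy(po_text):
    """Return (fuzzy_entries, total_entries) for the text of a PO file.

    Assumes entries are separated by blank lines; the header counts as an entry.
    """
    blocks = [b for b in re.split(r"\n\s*\n", po_text.strip()) if b.strip()]
    fuzzy = 0
    for block in blocks:
        for line in block.splitlines():
            if line.startswith("#,"):
                # A flag line may list several flags, e.g. "#, fuzzy, c-format".
                flags = [flag.strip() for flag in line[2:].split(",")]
                if "fuzzy" in flags:
                    fuzzy += 1
                    break
    return fuzzy, len(blocks)


if __name__ == "__main__":
    print(count_fuzzy(SAMPLE))
```

For a real file, pass the contents of merged_ddtp.po instead of SAMPLE.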

> 
> 
> On 14-01-13 23:24, Hendrik Knackstedt wrote:
> > Hey everybody!
> > Tried Google Translator Kit today for German translations and sadly it
> > doesn't produce a lot of useful translations. The main problem is
> > sentence structure and grammar are totally messed up.
> >
> > In addition, is there a way to tell Google how it should translate a
> > certain word? It messes up words like "engine" that better should not be
> > translated.
> >
> > I'll try a bit more, maybe there is something useful coming out of it
> > somehow. Otherwise offline translation probably will be the best option
> > we have right now. Redmar, your script is very useful for this!

Thanks! 
> >
> > Hendrik
> >
> 
> 



-- 
ubuntu-translators mailing list
ubuntu-translators@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-translators


Re: Semi-mechanizing the DTTP translations

2013-01-15 Thread Hannie Dumoleyn

Hi,
You can add your own Translation Memory (TM) to the Translator Kit, but 
its size (in KB) is limited.
And yes, the translation is pretty poor. It is only usable for common 
phrases and words.
This means that, as you said, it is better to use Redmar's script to 
sort by popularity and then translate offline.

Hannie




Re: Semi-mechanizing the DTTP translations

2013-01-14 Thread Pierre Slamich
Hey,
We had the same problem initially for French. As noted on the pad, we went
through mass refactoring using regular expressions and such, and we're a
thousand strings away now :-P
Seriously: a lot of us were skeptical, but with some cleanup (you'll
quickly find patterns when it comes to mistakes), it will prove useful.
For the rest, you'll have snippets in German that you'll be able to reuse.


Out of curiosity, could you share part of the result so far?
If you can read French, there's a list of common mistakes we corrected (
http://lite.framapad.org/p/ddtpUbuntu). As we spotted more common mistakes,
we re-uploaded the file to the bogus project after a quick find and replace
all.
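
The find-and-replace step can be scripted once the recurring mistakes are
known. What follows is a minimal sketch only, not the process documented
on the pad: the two rules are hypothetical examples (one of them undoing
an unwanted translation of "engine"), fix_translations is a made-up name,
and the code assumes every msgstr string sits on lines that start with
msgstr or with a double quote.

```python
import re

# Hypothetical rules: (pattern, replacement) pairs collected while reviewing.
RULES = [
    (re.compile(r"\bMotor\b"), "Engine"),  # undo an unwanted translation of "engine"
    (re.compile(r" {2,}"), " "),           # collapse repeated spaces
]


def fix_translations(po_text, rules=RULES):
    """Apply the rules to msgstr strings only, leaving msgid lines untouched."""
    fixed = []
    in_msgstr = False
    for line in po_text.splitlines():
        if line.startswith("msgstr"):
            in_msgstr = True
        elif not line.startswith('"'):
            # Any other keyword, comment or blank line ends the msgstr block.
            in_msgstr = False
        if in_msgstr and '"' in line:
            start = line.index('"')
            head, body = line[:start], line[start:]
            for pattern, replacement in rules:
                body = pattern.sub(replacement, body)
            line = head + body
        fixed.append(line)
    return "\n".join(fixed)
```

Restricting the rules to msgstr keeps the English originals intact, so the
corrected file still matches the template when it is uploaded again.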



Pierre Slamich
pierre.slam...@gmail.com



Re: Semi-mechanizing the DTTP translations

2013-01-14 Thread Hendrik Knackstedt
Hey everybody!
I tried Google Translator Kit today for German translations, and sadly it
doesn't produce a lot of useful translations. The main problem is that
sentence structure and grammar are totally messed up.

In addition, is there a way to tell Google how it should translate a
certain word? It messes up words like "engine" that should rather not be
translated.

I'll try a bit more; maybe something useful will come out of it
somehow. Otherwise offline translation will probably be the best option
we have right now. Redmar, your script is very useful for this!

Hendrik



Re: Semi-mechanizing the DTTP translations

2013-01-06 Thread Pierre Slamich
Hi Tom,
The approach works best for large files, where the scale effect pays off
most compared with manual translation. We have tested it on documentation
and related material so far. It works on virtually any po file, but you
need to check whether it outputs translations good enough to actually
reduce the translation load.
Feel free to forward the original mail.

Pierre

On Sat, Jan 5, 2013 at 2:12 PM, Tom Davies  wrote:

> Hi :)
> Would this semi-mechanising tool be good for other projects to use?  Is it
> good for translating websites, wiki's or printed documentation or all 3?
>
> If it's good for other projects is anyone here on the main LibreOffice
> LoCos mailing list?  Could one of you approach them to suggest it?  If not
> please let me know.
> Regards from
> Tom :)

Re: Semi-mechanizing the DTTP translations

2013-01-04 Thread Pierre Slamich
We keep making incredible progress thanks to the process: we validated on
average 400 strings a day, going from 49289 untranslated strings on Dec 16th
to 42746 today.

I've updated the structure and the instructions on the Pad to be more
detailed and more linear. I've added a link to Redmar's script and
instructions on validating the files and mass-correcting translation
errors before upload.
Feel free to ask if you're stuck at any point.

http://lite.framapad.org/p/ddtpUbuntu

 Pierre

Re: Semi-mechanizing the DTTP translations

2012-12-27 Thread Pierre Slamich
Viva Low-Tech ;-)
When you get to the point of importing them back, let me know so that I
can grant you upload rights to the mock project.

Sincerely,

Pierre
pierre.slam...@gmail.com



Re: Semi-mechanizing the DTTP translations

2012-12-27 Thread Hannie Dumoleyn

Hello Hendrik, Redmar, Pierre,
Redmar, thanks for writing the script.
The way I did the splitting so far is: open the sorted ddtp file in 
gedit, select lines 1 - 30,000 (which is about 940 KB), copy these into a 
new document and save it. It only takes a few minutes. Then you can 
select the next 30,000 lines, and the next. Done!
Of course, using a script to split the whole file in one go is also very 
useful.
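
Splitting by a fixed number of lines or bytes can cut an entry in the
middle, so a script needs to split at entry boundaries. Below is a rough
sketch of that idea using only the Python standard library. It is not
Redmar's script: split_po is a made-up name, the 800 KB default simply
follows the safety margin mentioned in this thread, and the code assumes
that entries are separated by blank lines and that the first entry is the
PO header, which is repeated in every part so that each part stays a
valid po file.

```python
import re


def split_po(po_text, max_bytes=800 * 1024):
    """Split PO text into parts of at most max_bytes, cutting only between entries.

    A single entry larger than max_bytes still gets a part of its own.
    """
    entries = [e for e in re.split(r"\n\s*\n", po_text.strip()) if e.strip()]
    header, body = entries[0], entries[1:]
    # +1 accounts for the newline that ends each part.
    header_size = len(header.encode("utf-8")) + 1

    parts = []
    current, size = [header], header_size
    for entry in body:
        # +2 accounts for the blank line separating this entry from the previous one.
        entry_size = len(entry.encode("utf-8")) + 2
        if len(current) > 1 and size + entry_size > max_bytes:
            parts.append("\n\n".join(current) + "\n")
            current, size = [header], header_size
        current.append(entry)
        size += entry_size
    parts.append("\n\n".join(current) + "\n")
    return parts
```

Each returned string can then be written to its own file, for example
part_01.po, part_02.po and so on (file names chosen here only as an
example).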

Hannie
Ubuntu Dutch Translators


Re: Semi-mechanizing the DTTP translations

2012-12-23 Thread Hendrik Knackstedt
On 23.12.2012 10:33, Redmar wrote:
> Hendrik Knackstedt wrote on Thu 20-12-2012 at 17:39 [+0100]:
>> On 20.12.2012 13:43, Pierre Slamich wrote:
>>
>>> I don't have a clean way to split them right now. I split them by
>>> size to keep below 900 KB (I took 800 for safety), but I then had to
>>> adjust manually because the strings were split right in the middle.
>> Ok, I'll take a look at it and see if I can come up with something
>> useful.
> I've been working with python-polib for a bit, so I think I'd be able to
> create a script to split up a po file into multiple parts pretty
> quickly. I haven't started yet, since I don't want to do duplicate work,
> but please let me know if you want me to make a script or if you need
> help with python-polib.

If you can do this, that's great. Thanks!

Hendrik
>
> Regards,
>
> Redmar
> --
> Ubuntu Dutch Translators
>>> If you don't mind, it would be great to take advantage of the German
>>> process to automate the process as much as possible.
>>> Would you be willing to expand the pad
>>> (http://lite.framapad.org/p/ddtpUbuntu) with us (yet another proof
>>> of French-German partnership ;-P)?
>> Sure. What do you mean by "the German process"? I'm a bit short on
>> time right now but just let me know what has to be done and I'll try
>> to get it done asap.
>>
>> Regards,
>> Hendrik
>>>
>>> Pierre
>>>

Re: Semi-mechanizing the DTTP translations

2012-12-23 Thread Redmar
Hendrik Knackstedt wrote on Thu 20-12-2012 at 17:39 [+0100]:
> On 20.12.2012 13:43, Pierre Slamich wrote:
> 
> > I don't have a clean way to split them right now. I split them by
> > size to keep below 900 KB (I took 800 for safety), but I then had to
> > adjust manually because the strings were split right in the middle.
> 
> Ok, I'll take a look at it and see if I can come up with something
> useful.

I've been working with python-polib for a bit, so I think I'd be able to
create a script to split up a po file into multiple parts pretty
quickly. I haven't started yet, since I don't want to do duplicate work,
but please let me know if you want me to make a script or if you need
help with python-polib.

Regards,

Redmar
--
Ubuntu Dutch Translators
> > 
> > If you don't mind, it would be great to take advantage of the German
> > process to automate the process as much as possible.
> > Would you be willing to expand the pad
> > (http://lite.framapad.org/p/ddtpUbuntu) with us (yet another proof
> > of French-German partnership ;-P)?
> 
> Sure. What do you mean by "the German process"? I'm a bit short on
> time right now but just let me know what has to be done and I'll try
> to get it done asap.
> 
> Regards,
> Hendrik
> > 
> > 
> > Pierre
> > 
> > On Thu, Dec 20, 2012 at 1:35 PM, Hendrik Knackstedt wrote:
> > Hey Pierre!
> > 
> > 
> > I'd like to test your approach for the German language also.
> > How exactly did you split the files? Did you use an existing
> > program/script or can you provide a script for doing this?
> > Thanks!
> > 
> > Hendrik
> > 
> > On 19.12.2012 15:58, Pierre Slamich wrote:
> > 
> > > Yes, although we might be finished by then ;-) 
> > > Thanks to the method we're reviewing and correcting around
> > > 1000 strings per day at the moment.
> > > 
> > > 
> > > sincerely,
> > > Pierre
> > > 
> > > 
> > > On Tue, Dec 18, 2012 at 4:06 PM, Hannie Dumoleyn wrote:
> > > Hi Pierre, Redmar, and all who are interested,
> > > Would it be an idea to brainstorm on this in
> > > #ubuntu-translators? Perhaps in January 2013?
> > > I agree with Redmar that the msgmerge is a good
> > > method, especially for huge documents. The only
> > > snag is that you still have to approve the fuzzies
> > > offline before uploading the file back to
> > > Launchpad. We use this method for the Ubuntu
> > > Manual "Getting started with Ubuntu" (Lucid >
> > > Maverick > > Raring) and with success.
> > > Redmar, sorry for not yet having tested your
> > > popsort :( 
> > > Regards,
> > > Hannie
> > > 
> > > On 18-12-12 00:51, Pierre Slamich wrote:
> > > 
> > > > Hi Hannie, Hi Redmar, 
> > > > Thanks a lot for the tips: we're interested in
> > > > using your approach, and more generally it might
> > > > be interesting to extend the msgmerge approach to
> > > > all teams that are already underway with the
> > > > DDTP, and the Google one to the teams that need
> > > > to get started.
> > > > 
> > > > 
> > > > - For the Google Translator Kit approach, I
> > > > guess we could extend the mock project we did
> > > > for fr_FR to other languages (and streamlining
> > > > our process by using Bazaar) by creating a
> > > > global team responsible for the DDTP Mock
> > > > project and including in this team one member
> > > > from each language team responsible for
> > > > uploading the machine translated po for his or
> > > > her language.
> > > > 
> > > > 
> > > > - For the msgmerge approach, do you already have
> > > > a project to handle this? Is there any
> > > > advantage in msgmerging raring against releases
> > > > older than quantal to get more modified
> > > > strings? How many strings have you been able to
> > > > recover using that approach? It might be neat
> > > > to generate the msgmerged po for all languages.
> > > > Importing them as actual translations (not
> > > > fuzzy) into a mock project like the Google
> > > > Translate one would show them as suggestions for
> > > > the actual DDTP as well.
> > > > The translator would thus be able to p

Re: Semi-mechanizing the DTTP translations

2012-12-20 Thread Hendrik Knackstedt
On 20.12.2012 13:43, Pierre Slamich wrote:
> I don't have a clean way to split them right now. I split them by size
> to keep below 900 KB (I took 800 for safety), but I then had to adjust
> manually because the strings were split right in the middle.

Ok, I'll take a look at it and see if I can come up with something useful.
>
> If you don't mind, it would be great to take advantage of the German
> process to automate the process as much as possible.
> Would you be willing to expand the pad
> (http://lite.framapad.org/p/ddtpUbuntu) with us (yet another proof of
> French-German partnership ;-P)?

Sure. What do you mean by "the German process"? I'm a bit short on time
right now but just let me know what has to be done and I'll try to get
it done asap.

Regards,
Hendrik
>
> Pierre
>
> On Thu, Dec 20, 2012 at 1:35 PM, Hendrik Knackstedt wrote:
>
> Hey Pierre!
>
>
> I'd like to test your approach for the German language also. How
> exactly did you split the files? Did you use an existing
> program/script or can you provide a script for doing this? Thanks!
>
> Hendrik
>
> Am 19.12.2012 15:58, schrieb Pierre Slamich:
>> Yes, although we might be finished by then ;-)
>> Thanks to the method we're reviewing and correcting around 1000
>> strings per day at the moment.
>>
>> sincerely,
>> Pierre
>>
>> On Tue, Dec 18, 2012 at 4:06 PM, Hannie Dumoleyn wrote:
>>
>> Hi Pierre, Redmar, and all who are interested,
>> Would it be an idea to brainstorm on this in
>> #ubuntu-translators? Perhaps in January 2013?
>> I agree with Redmar that the msgmerge is a good method,
>> especially for huge documents. The only snag is that you
>> still have to approve the fuzzies offline before uploading
>> the file back to Launchpad. We use this method for the Ubuntu
>> Manual "Getting started with Ubuntu" (Lucid > Maverick >
>> > Raring) and with success.
>> Redmar, sorry for not yet having tested your popsort :(
>> Regards,
>> Hannie
>>
>> Op 18-12-12 00:51, Pierre Slamich schreef:
>>> Hi Hannie, Hi Redmar,
>>> Thanks a lot for the tips: we're interested in using your
>>> approach, and more generally it might be interesting
>>> expending the msmerge approach to all teams that are already
>>> underway for the DDTP, and the Google one to the teams that
>>> need to get started.
>>>
>>> - For the Google Translator Kit approach, I guess we
>>> could extend the mock project we did for fr_FR to other
>>> languages (and streamlining our process by using Bazaar) by
>>> creating a global team responsible for the DDTP Mock project
>>> and including in this team one member from each language
>>> team responsible for uploading the machine translated po for
>>> his or her language.
>>>
>>> - For the msmerge approach, do you already have a project to
>>> handle this ? Is there any advantage in msmerging raring
>>> against releases older than quantal to get more modified
>>> strings ? How many strings have you been able to recover
>>> using that approach ?  It might be neat to generate the
>>> msmerged po for all languages ? Importing them as actual
>>> translations (not fuzzy) into a mock project like the Google
>>> Translate one would show them as suggestions for the actual
>>> DDTP as well.
>>> The translator would thus be able to pick the human
>>> translated one when available or to build on the machine
>>> translated one otherwise.
>>>
>>> Can we try to schedule some time to coordinate on this so
>>> that we can use both approaches and try to onboard all the
>>> other languages teams once we have a rock-solid process ?
>>>
>>> Pierre
>>>
>>> Pierre Slamich
>>> pierre.slam...@gmail.com 
>>>
>>>
>>> On Mon, Dec 17, 2012 at 10:30 PM, Redmar
>>> <red...@ubuntu-nl.org> wrote:
>>>
>>> Hi Pierre,
>>>
>>> I've actually tried a similar approach for Dutch using msgmerge, which
>>> might also be worth checking out. When you merge the translations of an
>>> older version of ubuntu into the current version (msgmerge
>>> quantal_ddtp.po raring_ddtp.po -o merged_ddtp.po, for example), there
>>> will be a lot of 'fuzzy' translations for strings that are similar (for
>>> example, meta packages for different programs, debugging symbols etc).
>>> These fuzzy often only need a few small changes (eg program name) to be
>>>   

Re: Semi-mechanizing the DTTP translations

2012-12-20 Thread Pierre Slamich
I don't have a clean way to split them right now. I split them by size to
keep each part below 900 kB (I used 800 kB for safety), but I then had to
adjust manually because the cuts landed right in the middle of strings.
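
A size-based split that only cuts between entries could look roughly like the sketch below. This is not the script we used, just an illustration under a few assumptions: the function name `split_po` is made up, entries are assumed to be separated by blank lines, the first block is assumed to be the PO header, and the 800 kB default simply mirrors the safety margin mentioned above.

```python
def split_po(text: str, max_bytes: int = 800_000) -> list[str]:
    """Split PO text into chunks of at most max_bytes, cutting only between entries.

    Assumes entries are separated by blank lines and that the first block is
    the PO header, which is repeated at the top of every chunk.
    """
    blocks = [b for b in text.strip().split("\n\n") if b.strip()]
    header, entries = blocks[0], blocks[1:]
    header_size = len(header.encode("utf-8")) + 1  # +1 for the final newline

    chunks: list[str] = []
    current, size = [header], header_size
    for entry in entries:
        entry_size = len(entry.encode("utf-8")) + 2  # +2 for the blank-line separator
        # Start a new chunk when this entry would push the current one over the limit.
        if len(current) > 1 and size + entry_size > max_bytes:
            chunks.append("\n\n".join(current) + "\n")
            current, size = [header], header_size
        current.append(entry)
        size += entry_size
    chunks.append("\n\n".join(current) + "\n")
    return chunks
```

A single entry larger than the limit would still end up in a chunk of its own that exceeds it; that case is not handled here.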

If you don't mind, it would be great to take advantage of the German
process to automate the process as much as possible.
Would you be willing to expand the pad (
http://lite.framapad.org/p/ddtpUbuntu) with us (yet another proof of
French-German partnership ;-P)?
 
Pierre

On Thu, Dec 20, 2012 at 1:35 PM, Hendrik Knackstedt <
hendrik.knackst...@t-online.de> wrote:

>  Hey Pierre!
>
>
> I'd like to test your approach for the German language also. How exactly
> did you split the files? Did you use an existing program/script or can you
> provide a script for doing this? Thanks!
>
> Hendrik
>
> Am 19.12.2012 15:58, schrieb Pierre Slamich:
>
> Yes, although we might be finished by then ;-)
> Thanks to the method we're reviewing and correcting around 1000 strings
> per day at the moment.
>
>  sincerely,
> Pierre
>
> On Tue, Dec 18, 2012 at 4:06 PM, Hannie Dumoleyn <
> lafeber-dumole...@zonnet.nl> wrote:
>
>>  Hi Pierre, Redmar, and all who are interested,
>> Would it be an idea to brainstorm on this in #ubuntu-translators? Perhaps
>> in January 2013?
>> I agree with Redmar that the msgmerge is a good method, especially for
>> huge documents. The only snag is that you still have to approve the fuzzies
>> offline before uploading the file back to Launchpad. We use this method for
>> the Ubuntu Manual "Getting started with Ubuntu" (Lucid > Maverick > >
>> Raring) and with success.
>> Redmar, sorry for not yet having tested your popsort :(
>> Regards,
>> Hannie
>>
>> Op 18-12-12 00:51, Pierre Slamich schreef:
>>
>> Hi Hannie, Hi Redmar,
>> Thanks a lot for the tips: we're interested in using your approach, and
>> more generally it might be interesting expending the msmerge approach to
>> all teams that are already underway for the DDTP, and the Google one to the
>> teams that need to get started.
>>
>>  - For the Google Translator Kit approach, I guess we could extend the
>> mock project we did for fr_FR to other languages (and streamlining our
>> process by using Bazaar) by creating a global team responsible for the DDTP
>> Mock project and including in this team one member from each language team
>> responsible for uploading the machine translated po for his or her language.
>>
>>  - For the msmerge approach, do you already have a project to handle
>> this ? Is there any advantage in msmerging raring against releases older
>> than quantal to get more modified strings ? How many strings have you been
>> able to recover using that approach ?  It might be neat to generate the
>> msmerged po for all languages ? Importing them as actual translations (not
>> fuzzy) into a mock project like the Google Translate one would show them as
>> suggestions for the actual DDTP as well.
>> The translator would thus be able to pick the human translated one when
>> available or to build on the machine translated one otherwise.
>>
>>  Can we try to schedule some time to coordinate on this so that we can
>> use both approaches and try to onboard all the other languages teams once
>> we have a rock-solid process ?
>>
>>  Pierre
>>
>> Pierre Slamich
>> pierre.slam...@gmail.com
>>
>>
>> On Mon, Dec 17, 2012 at 10:30 PM, Redmar wrote:
>>
>>> Hi Pierre,
>>>
>>> I've actually tried a similar approach for Dutch using msgmerge, which
>>> might also be worth checking out. When you merge the translations of an
>>> older version of ubuntu into the current version (msgmerge
>>> quantal_ddtp.po raring_ddtp.po -o merged_ddtp.po, for example), there
>>> will be a lot of 'fuzzy' translations for strings that are similar (for
>>> example, meta packages for different programs, debugging symbols etc).
>>> These fuzzy often only need a few small changes (eg program name) to be
>>> accepted, which can really speed up translations. And you don't have to
>>> worry about google putting in a weird translation, since it is all based
>>> on earlier translations done by a human.
>>>
>>> On a related note, if any of you work on ddtp-translations offline, I
>>> have written a python program that can sort entries in ddtp po-files
>>> based on the popularity of the package. This way, the most popular
>>> packages will be at the top of the po file, and you are always sure you
>>> are working on the most important packages first.
>>>
>>> You can get the code here:
>>> bzr branch lp:~redmar/+junk/ddtp_popsort
>>>
>>> It has a small readme file, please let me know if something is unclear
>>> or not working for you.
>>>
>>> Regards,
>>> Redmar
>>> --
>>> Ubuntu Dutch Translators
>>>
>>>
>>> Hannie Dumoleyn schreef op ma 17-12-2012 om 17:58 [+0100]:
>>>  > Hello Pierre,
>>> > This is a very good idea! I have just uploaded the first part of the
>>> > incomplete Dutch translation (900kb) to GTT.
>>> > Thanks,
>>> > Hannie
>>> >
>>> > Op 17-12-12 12:55, Pierre Slamich sc

Re: Semi-mechanizing the DTTP translations

2012-12-20 Thread Hendrik Knackstedt
Hey Pierre!


I'd like to test your approach for the German language also. How exactly
did you split the files? Did you use an existing program/script or can
you provide a script for doing this? Thanks!

Hendrik

Am 19.12.2012 15:58, schrieb Pierre Slamich:
> Yes, although we might be finished by then ;-)
> Thanks to the method we're reviewing and correcting around 1000
> strings per day at the moment.
>
> sincerely,
> Pierre
>
> On Tue, Dec 18, 2012 at 4:06 PM, Hannie Dumoleyn
> <lafeber-dumole...@zonnet.nl> wrote:
>
> Hi Pierre, Redmar, and all who are interested,
> Would it be an idea to brainstorm on this in #ubuntu-translators?
> Perhaps in January 2013?
> I agree with Redmar that the msgmerge is a good method, especially
> for huge documents. The only snag is that you still have to
> approve the fuzzies offline before uploading the file back to
> Launchpad. We use this method for the Ubuntu Manual "Getting
> started with Ubuntu" (Lucid > Maverick > > Raring) and with
> success.
> Redmar, sorry for not yet having tested your popsort :(
> Regards,
> Hannie
>
> Op 18-12-12 00:51, Pierre Slamich schreef:
>> Hi Hannie, Hi Redmar,
>> Thanks a lot for the tips: we're interested in using your
>> approach, and more generally it might be interesting expending
>> the msmerge approach to all teams that are already underway for
>> the DDTP, and the Google one to the teams that need to get started.
>>
>> - For the Google Translator Kit approach, I guess we
>> could extend the mock project we did for fr_FR to other languages
>> (and streamlining our process by using Bazaar) by creating a
>> global team responsible for the DDTP Mock project and including
>> in this team one member from each language team responsible for
>> uploading the machine translated po for his or her language.
>>
>> - For the msmerge approach, do you already have a project to
>> handle this ? Is there any advantage in msmerging raring against
>> releases older than quantal to get more modified strings ? How
>> many strings have you been able to recover using that approach
>> ?  It might be neat to generate the msmerged po for all languages
>> ? Importing them as actual translations (not fuzzy) into a mock
>> project like the Google Translate one would show them as
>> suggestions for the actual DDTP as well.
>> The translator would thus be able to pick the human translated
>> one when available or to build on the machine translated one
>> otherwise.
>>
>> Can we try to schedule some time to coordinate on this so that we
>> can use both approaches and try to onboard all the other
>> languages teams once we have a rock-solid process ?
>>
>> Pierre
>>
>> Pierre Slamich
>> pierre.slam...@gmail.com 
>>
>>
>> On Mon, Dec 17, 2012 at 10:30 PM, Redmar wrote:
>>
>> Hi Pierre,
>>
>> I've actually tried a similar approach for Dutch using msgmerge, which
>> might also be worth checking out. When you merge the translations of an
>> older version of ubuntu into the current version (msgmerge
>> quantal_ddtp.po raring_ddtp.po -o merged_ddtp.po, for example), there
>> will be a lot of 'fuzzy' translations for strings that are similar (for
>> example, meta packages for different programs, debugging symbols etc).
>> These fuzzy often only need a few small changes (eg program name) to be
>> accepted, which can really speed up translations. And you don't have to
>> worry about google putting in a weird translation, since it is all based
>> on earlier translations done by a human.
>>
>> On a related note, if any of you work on ddtp-translations offline, I
>> have written a python program that can sort entries in ddtp po-files
>> based on the popularity of the package. This way, the most popular
>> packages will be at the top of the po file, and you are always sure you
>> are working on the most important packages first.
>>
>> You can get the code here:
>> bzr branch lp:~redmar/+junk/ddtp_popsort
>>
>> It has a small readme file, please let me know if something is unclear
>> or not working for you.
>>
>> Regards,
>> Redmar
>> --
>> Ubuntu Dutch Translators
>>
>>
>> Hannie Dumoleyn schreef op ma 17-12-2012 om 17:58 [+0100]:
>> > Hello Pierre,
>> > This is a very good idea! I have just uploaded the first part of the
>> > incomplete Dutch translation (900kb) to GTT.
>> > Thanks,
>> > Hannie
>> >
>> > Op 17-12-12 12:55, Pier

Re: Semi-mechanizing the DTTP translations

2012-12-19 Thread Pierre Slamich
Yes, although we might be finished by then ;-)
Thanks to this method, we're reviewing and correcting around 1000 strings
per day at the moment.
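
For a rough sense of scale, here is a back-of-the-envelope sketch using only the figures quoted in this thread (about 50 000 strings per language, 500 strings translated in a very good week, about 1000 strings reviewed per day). The function names are made up for the illustration, and translating from scratch and reviewing suggestions are of course not the same kind of work.

```python
import math

TOTAL_STRINGS = 50_000     # approximate DDTP size per language (figure from this thread)
TRANSLATE_PER_WEEK = 500   # a very good week for a translation team
REVIEW_PER_DAY = 1_000     # current French review-and-correct rate


def weeks_from_scratch(total: int = TOTAL_STRINGS, per_week: int = TRANSLATE_PER_WEEK) -> int:
    """Weeks needed to translate everything by hand at the weekly rate."""
    return math.ceil(total / per_week)


def days_of_review(total: int = TOTAL_STRINGS, per_day: int = REVIEW_PER_DAY) -> int:
    """Days needed to review everything at the daily review rate."""
    return math.ceil(total / per_day)


print(weeks_from_scratch())  # 100 weeks, i.e. almost two years
print(days_of_review())      # 50 days
```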

sincerely,
Pierre

On Tue, Dec 18, 2012 at 4:06 PM, Hannie Dumoleyn <
lafeber-dumole...@zonnet.nl> wrote:

>  Hi Pierre, Redmar, and all who are interested,
> Would it be an idea to brainstorm on this in #ubuntu-translators? Perhaps
> in January 2013?
> I agree with Redmar that the msgmerge is a good method, especially for
> huge documents. The only snag is that you still have to approve the fuzzies
> offline before uploading the file back to Launchpad. We use this method for
> the Ubuntu Manual "Getting started with Ubuntu" (Lucid > Maverick > >
> Raring) and with success.
> Redmar, sorry for not yet having tested your popsort :(
> Regards,
> Hannie
>
> Op 18-12-12 00:51, Pierre Slamich schreef:
>
> Hi Hannie, Hi Redmar,
> Thanks a lot for the tips: we're interested in using your approach, and
> more generally it might be interesting expending the msmerge approach to
> all teams that are already underway for the DDTP, and the Google one to the
> teams that need to get started.
>
>  - For the Google Translator Kit approach, I guess we could extend the
> mock project we did for fr_FR to other languages (and streamlining our
> process by using Bazaar) by creating a global team responsible for the DDTP
> Mock project and including in this team one member from each language team
> responsible for uploading the machine translated po for his or her language.
>
>  - For the msmerge approach, do you already have a project to handle this
> ? Is there any advantage in msmerging raring against releases older than
> quantal to get more modified strings ? How many strings have you been able
> to recover using that approach ?  It might be neat to generate the msmerged
> po for all languages ? Importing them as actual translations (not fuzzy)
> into a mock project like the Google Translate one would show them as
> suggestions for the actual DDTP as well.
> The translator would thus be able to pick the human translated one when
> available or to build on the machine translated one otherwise.
>
>  Can we try to schedule some time to coordinate on this so that we can
> use both approaches and try to onboard all the other languages teams once
> we have a rock-solid process ?
>
>  Pierre
>
> Pierre Slamich
> pierre.slam...@gmail.com
>
>
> On Mon, Dec 17, 2012 at 10:30 PM, Redmar wrote:
>
>> Hi Pierre,
>>
>> I've actually tried a similar approach for Dutch using msgmerge, which
>> might also be worth checking out. When you merge the translations of an
>> older version of ubuntu into the current version (msgmerge
>> quantal_ddtp.po raring_ddtp.po -o merged_ddtp.po, for example), there
>> will be a lot of 'fuzzy' translations for strings that are similar (for
>> example, meta packages for different programs, debugging symbols etc).
>> These fuzzy often only need a few small changes (eg program name) to be
>> accepted, which can really speed up translations. And you don't have to
>> worry about google putting in a weird translation, since it is all based
>> on earlier translations done by a human.
>>
>> On a related note, if any of you work on ddtp-translations offline, I
>> have written a python program that can sort entries in ddtp po-files
>> based on the popularity of the package. This way, the most popular
>> packages will be at the top of the po file, and you are always sure you
>> are working on the most important packages first.
>>
>> You can get the code here:
>> bzr branch lp:~redmar/+junk/ddtp_popsort
>>
>> It has a small readme file, please let me know if something is unclear
>> or not working for you.
>>
>> Regards,
>> Redmar
>> --
>> Ubuntu Dutch Translators
>>
>>
>> Hannie Dumoleyn schreef op ma 17-12-2012 om 17:58 [+0100]:
>>  > Hello Pierre,
>> > This is a very good idea! I have just uploaded the first part of the
>> > incomplete Dutch translation (900kb) to GTT.
>> > Thanks,
>> > Hannie
>> >
>> > Op 17-12-12 12:55, Pierre Slamich schreef:
>> >
>> > > The DDTP represent around 50 000 strings to translate * 140
>> > > languages. On very good weeks, a typical translation team translates
>> > > 500 strings (see UWN for examples weekly figures).
>> > >
>> > >
>> > > Would take a lot of weeks (years?) with highly motivated volunteers
>> > > of a large translation team, working non-stop, at their best to get
>> > > done with it.
>> > > Thus we had the idea to delegate initial translation suggestions to
>> > > Google Translator Kit and review translations with humans to speed
>> > > the process.
>> > >
>> > > We successfully did an import for circa 40 000 French strings  (yup
>> > > you read that right) this week-end in a mock project called DDTP
>> > > Automation (https://translations.launchpad.net/ddtpautomation).
>> > > To keep it short, the translations from this project appear as
>> > > suggestions in the French DDTP, and can be reviewed by actual
>> > > translators.
>>

Re: Semi-mechanizing the DTTP translations

2012-12-18 Thread Hannie Dumoleyn

Hi Pierre, Redmar, and all who are interested,
Would it be an idea to brainstorm on this in #ubuntu-translators? 
Perhaps in January 2013?
I agree with Redmar that msgmerge is a good method, especially for
huge documents. The only snag is that you still have to approve the
fuzzies offline before uploading the file back to Launchpad. We use this 
method for the Ubuntu Manual "Getting started with Ubuntu" (Lucid > 
Maverick > > Raring) and with success.

Redmar, sorry for not yet having tested your popsort :(
Regards,
Hannie

Op 18-12-12 00:51, Pierre Slamich schreef:

Hi Hannie, Hi Redmar,
Thanks a lot for the tips: we're interested in using your approach, 
and more generally it might be interesting expending the msmerge 
approach to all teams that are already underway for the DDTP, and the 
Google one to the teams that need to get started.


- For the Google Translator Kit approach, I guess we could extend the 
mock project we did for fr_FR to other languages (and streamlining our 
process by using Bazaar) by creating a global team responsible for the 
DDTP Mock project and including in this team one member from each 
language team responsible for uploading the machine translated po for 
his or her language.


- For the msmerge approach, do you already have a project to handle 
this ? Is there any advantage in msmerging raring against releases 
older than quantal to get more modified strings ? How many strings 
have you been able to recover using that approach ?  It might be neat 
to generate the msmerged po for all languages ? Importing them as 
actual translations (not fuzzy) into a mock project like the Google 
Translate one would show them as suggestions for the actual DDTP as well.
The translator would thus be able to pick the human translated one 
when available or to build on the machine translated one otherwise.


Can we try to schedule some time to coordinate on this so that we can 
use both approaches and try to onboard all the other languages teams 
once we have a rock-solid process ?


Pierre

Pierre Slamich
pierre.slam...@gmail.com 


On Mon, Dec 17, 2012 at 10:30 PM, Redmar wrote:


Hi Pierre,

I've actually tried a similar approach for Dutch using msgmerge, which
might also be worth checking out. When you merge the translations of an
older version of ubuntu into the current version (msgmerge
quantal_ddtp.po raring_ddtp.po -o merged_ddtp.po, for example), there
will be a lot of 'fuzzy' translations for strings that are similar (for
example, meta packages for different programs, debugging symbols etc).
These fuzzy often only need a few small changes (eg program name) to be
accepted, which can really speed up translations. And you don't have to
worry about google putting in a weird translation, since it is all based
on earlier translations done by a human.

On a related note, if any of you work on ddtp-translations offline, I
have written a python program that can sort entries in ddtp po-files
based on the popularity of the package. This way, the most popular
packages will be at the top of the po file, and you are always sure you
are working on the most important packages first.

You can get the code here:
bzr branch lp:~redmar/+junk/ddtp_popsort

It has a small readme file, please let me know if something is unclear
or not working for you.

Regards,
Redmar
--
Ubuntu Dutch Translators


Hannie Dumoleyn schreef op ma 17-12-2012 om 17:58 [+0100]:
> Hello Pierre,
> This is a very good idea! I have just uploaded the first part of the
> incomplete Dutch translation (900kb) to GTT.
> Thanks,
> Hannie
>
> Op 17-12-12 12:55, Pierre Slamich schreef:
>
> > The DDTP represent around 50 000 strings to translate * 140
> > languages. On very good weeks, a typical translation team translates
> > 500 strings (see UWN for examples weekly figures).
> >
> >
> > Would take a lot of weeks (years?) with highly motivated volunteers
> > of a large translation team, working non-stop, at their best to get
> > done with it.
> > Thus we had the idea to delegate initial translation suggestions to
> > Google Translator Kit and review translations with humans to speed
> > the process.
> >
> > We successfully did an import for circa 40 000 French strings (yup
> > you read that right) this week-end in a mock project called DDTP
> > Automation (https://translations.launchpad.net/ddtpautomation).
> > To keep it short, the translations from this project appear as
> > suggestions in the French DDTP, and can be reviewed by actual
> > translators.
> > We've started using them, and it turns out that a lot of them are
> > actually useful and are speeding up the translation process a lot.
> >
> > We de

Re: Semi-mechanizing the DTTP translations

2012-12-17 Thread Pierre Slamich
Hi Hannie, Hi Redmar,
Thanks a lot for the tips: we're interested in using your approach, and
more generally it might be interesting to extend the msgmerge approach to
all teams that are already underway with the DDTP, and the Google one to the
teams that need to get started.

- For the Google Translator Toolkit approach, I guess we could extend the
mock project we did for fr_FR to other languages (and streamline our process
by using Bazaar) by creating a global team responsible for the DDTP Mock
project and including in this team one member from each language team,
responsible for uploading the machine-translated po for his or her language.

- For the msgmerge approach, do you already have a project to handle this?
Is there any advantage in msgmerging raring against releases older than
quantal to recover more modified strings? How many strings have you been
able to recover using that approach? It might be neat to generate the
msgmerged po for all languages. Importing them as actual translations (not
fuzzy) into a mock project like the Google Translate one would show them as
suggestions for the actual DDTP as well.
The translator would thus be able to pick the human-translated one when
available or to build on the machine-translated one otherwise.
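
To make the "not fuzzy" import idea concrete, here is a minimal sketch of stripping the fuzzy markers from a msgmerged po before uploading it to a mock project. The function name `clear_fuzzy` is made up, and it works on plain text line by line rather than through a real PO parser, so treat it as an illustration only.

```python
def clear_fuzzy(po_text: str) -> str:
    """Return PO text with 'fuzzy' flags and '#|' previous-msgid comments removed."""
    out = []
    for line in po_text.splitlines():
        if line.startswith("#,"):
            # Flag lines look like "#, fuzzy, c-format"; keep every flag except fuzzy.
            flags = [f.strip() for f in line[2:].split(",") if f.strip()]
            flags = [f for f in flags if f != "fuzzy"]
            if not flags:
                continue  # the line only carried the fuzzy flag
            line = "#, " + ", ".join(flags)
        elif line.startswith("#|"):
            continue  # "previous msgid" comment that msgmerge --previous can leave
        out.append(line)
    return "\n".join(out) + "\n"
```

The gettext tools should be able to do the same thing with `msgattrib --clear-fuzzy`, which is probably the safer route for real files.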

Can we try to schedule some time to coordinate on this so that we can use
both approaches and try to onboard all the other language teams once we
have a rock-solid process?

Pierre

Pierre Slamich
pierre.slam...@gmail.com


On Mon, Dec 17, 2012 at 10:30 PM, Redmar wrote:

> Hi Pierre,
>
> I've actually tried a similar approach for Dutch using msgmerge, which
> might also be worth checking out. When you merge the translations of an
> older version of ubuntu into the current version (msgmerge
> quantal_ddtp.po raring_ddtp.po -o merged_ddtp.po, for example), there
> will be a lot of 'fuzzy' translations for strings that are similar (for
> example, meta packages for different programs, debugging symbols etc).
> These fuzzy often only need a few small changes (eg program name) to be
> accepted, which can really speed up translations. And you don't have to
> worry about google putting in a weird translation, since it is all based
> on earlier translations done by a human.
>
> On a related note, if any of you work on ddtp-translations offline, I
> have written a python program that can sort entries in ddtp po-files
> based on the popularity of the package. This way, the most popular
> packages will be at the top of the po file, and you are always sure you
> are working on the most important packages first.
>
> You can get the code here:
> bzr branch lp:~redmar/+junk/ddtp_popsort
>
> It has a small readme file, please let me know if something is unclear
> or not working for you.
>
> Regards,
> Redmar
> --
> Ubuntu Dutch Translators
>
>
> Hannie Dumoleyn schreef op ma 17-12-2012 om 17:58 [+0100]:
> > Hello Pierre,
> > This is a very good idea! I have just uploaded the first part of the
> > incomplete Dutch translation (900kb) to GTT.
> > Thanks,
> > Hannie
> >
> > Op 17-12-12 12:55, Pierre Slamich schreef:
> >
> > > The DDTP represent around 50 000 strings to translate * 140
> > > languages. On very good weeks, a typical translation team translates
> > > 500 strings (see UWN for examples weekly figures).
> > >
> > >
> > > Would take a lot of weeks (years?) with highly motivated volunteers
> > > of a large translation team, working non-stop, at their best to get
> > > done with it.
> > > Thus we had the idea to delegate initial translation suggestions to
> > > Google Translator Kit and review translations with humans to speed
> > > the process.
> > >
> > > We successfully did an import for circa 40 000 French strings  (yup
> > > you read that right) this week-end in a mock project called DDTP
> > > Automation (https://translations.launchpad.net/ddtpautomation).
> > > To keep it short, the translations from this project appear as
> > > suggestions in the French DDTP, and can be reviewed by actual
> > > translators.
> > > We've started using them, and it turns out that a lot of them are
> > > actually useful and are speeding up the translation process a lot.
> > >
> > > We detailed the (somewhat) tedious process in English at
> > > http://lite.framapad.org/p/ddtpUbuntu
> > > Questions and inquiries welcome.
> > >
> > > Pierre
> > >
> > >
> > > ---
> > > pierre.slam...@gmail.com
> > >
> > >
> >
>
>


Re: Semi-mechanizing the DTTP translations

2012-12-17 Thread Redmar
Hi Pierre,

I've actually tried a similar approach for Dutch using msgmerge, which
might also be worth checking out. When you merge the translations of an
older version of Ubuntu into the current version (msgmerge
quantal_ddtp.po raring_ddtp.po -o merged_ddtp.po, for example), there
will be a lot of 'fuzzy' translations for strings that are similar (for
example, meta packages for different programs, debugging symbols, etc.).
These fuzzy entries often need only a few small changes (e.g. the program
name) to be accepted, which can really speed up translation. And you don't
have to worry about Google putting in a weird translation, since it is all
based on earlier translations done by a human.
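
To see how much a merge actually recovered, something like the following sketch could count the translated, fuzzy and untranslated entries in merged_ddtp.po. It is only an illustration: the function name `po_stats` is made up, and it assumes entries are separated by blank lines with the PO header as the first block.

```python
import re


def po_stats(po_text: str) -> dict[str, int]:
    """Count translated, fuzzy and untranslated entries in PO text (header excluded)."""
    stats = {"translated": 0, "fuzzy": 0, "untranslated": 0}
    blocks = po_text.strip().split("\n\n")
    for block in blocks[1:]:  # blocks[0] is assumed to be the PO header entry
        lines = block.splitlines()
        if not any(line.startswith("msgid ") for line in lines):
            continue  # obsolete (#~) entries or comment-only blocks
        # Everything from the first msgstr line onward belongs to the translation.
        start = next((i for i, line in enumerate(lines) if line.startswith("msgstr")), len(lines))
        # A quote followed by any other character means the string is not empty.
        has_text = any(re.search(r'"[^"]', line) for line in lines[start:])
        is_fuzzy = any(line.startswith("#,") and "fuzzy" in line for line in lines)
        if not has_text:
            stats["untranslated"] += 1
        elif is_fuzzy:
            stats["fuzzy"] += 1
        else:
            stats["translated"] += 1
    return stats
```

Called on the text of the merged file, the "fuzzy" count gives an idea of how many near-matches are waiting for review.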

On a related note, if any of you work on ddtp-translations offline, I
have written a Python program that can sort the entries in DDTP po files
by the popularity of the package. This way, the most popular
packages are at the top of the po file, and you can be sure you
are always working on the most important packages first.
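
The idea can be sketched as follows. This is not the code from the branch below, only a minimal illustration under stated assumptions: each entry is assumed to carry a "#: Package: name" comment, the popularity scores are assumed to come from some external source as a plain dict, and the function name `sort_by_popularity` is made up.

```python
import re

# Assumed comment format identifying the package of an entry.
PACKAGE_RE = re.compile(r"^#: Package: (\S+)", re.MULTILINE)


def sort_by_popularity(po_text: str, popularity: dict[str, int]) -> str:
    """Reorder PO entries so that entries of the most popular packages come first."""
    blocks = po_text.strip().split("\n\n")
    header, entries = blocks[0], blocks[1:]  # first block assumed to be the header

    def score(entry: str) -> int:
        match = PACKAGE_RE.search(entry)
        # Entries without a known package sort last.
        return popularity.get(match.group(1), 0) if match else 0

    # sorted() is stable, so entries with equal scores keep their original order.
    entries = sorted(entries, key=score, reverse=True)
    return "\n\n".join([header, *entries]) + "\n"
```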

You can get the code here:
bzr branch lp:~redmar/+junk/ddtp_popsort 

It has a small readme file; please let me know if something is unclear
or not working for you.

Regards,
Redmar
--
Ubuntu Dutch Translators


Hannie Dumoleyn schreef op ma 17-12-2012 om 17:58 [+0100]:
> Hello Pierre,
> This is a very good idea! I have just uploaded the first part of the
> incomplete Dutch translation (900kb) to GTT.
> Thanks,
> Hannie
> 
> Op 17-12-12 12:55, Pierre Slamich schreef:
> 
> > The DDTP represent around 50 000 strings to translate * 140
> > languages. On very good weeks, a typical translation team translates
> > 500 strings (see UWN for examples weekly figures).  
> > 
> > 
> > Would take a lot of weeks (years?) with highly motivated volunteers
> > of a large translation team, working non-stop, at their best to get
> > done with it.
> > Thus we had the idea to delegate initial translation suggestions to
> > Google Translator Kit and review translations with humans to speed
> > the process.
> > 
> > We successfully did an import for circa 40 000 French strings  (yup
> > you read that right) this week-end in a mock project called DDTP
> > Automation (https://translations.launchpad.net/ddtpautomation). 
> > To keep it short, the translations from this project appear as
> > suggestions in the French DDTP, and can be reviewed by actual
> > translators.
> > We've started using them, and it turns out that a lot of them are
> > actually useful and are speeding up the translation process a lot.
> > 
> > We detailed the (somewhat) tedious process in English at
> > http://lite.framapad.org/p/ddtpUbuntu
> > Questions and inquiries welcome.
> > 
> > Pierre
> > 
> > 
> > ---
> > pierre.slam...@gmail.com 
> > 
> > 
> 





Re: Semi-mechanizing the DTTP translations

2012-12-17 Thread Hannie Dumoleyn

Hello Pierre,
This is a very good idea! I have just uploaded the first part of the 
incomplete Dutch translation (900kb) to GTT.

Thanks,
Hannie

Op 17-12-12 12:55, Pierre Slamich schreef:
The DDTP represents around 50 000 strings to translate * 140 languages.
On very good weeks, a typical translation team translates 500 strings
(see UWN for example weekly figures).


It would take a lot of weeks (years?) for the highly motivated volunteers
of a large translation team, working non-stop and at their best, to get
done with it.
Thus we had the idea to delegate initial translation suggestions to
Google Translator Toolkit and have humans review the translations to speed
up the process.


We successfully did an import for circa 40 000 French strings  (yup 
you read that right) this week-end in a mock project called DDTP 
Automation (https://translations.launchpad.net/ddtpautomation).
To keep it short, the translations from this project appear as 
suggestions in the French DDTP, and can be reviewed by actual translators.
We've started using them, and it turns out that a lot of them are 
actually useful and are speeding up the translation process a lot.


We detailed the (somewhat) tedious process in English at 
http://lite.framapad.org/p/ddtpUbuntu

Questions and inquiries welcome.

Pierre


---
pierre.slam...@gmail.com 






Re: Semi-mechanizing the DTTP translations

2012-12-17 Thread Jeroen Vermeulen
On 12/17/2012 06:55 PM, Pierre Slamich wrote:

> To keep it short, the translations from this project appear as
> suggestions in the French DDTP, and can be reviewed by actual translators.
> We've started using them, and it turns out that a lot of them are
> actually useful and are speeding up the translation process a lot.

Well done!

For what it's worth, many in the translation industry find this approach
useful.  Besides saving time, it also means that the work can be done
with less expert knowledge of English.


Jeroen

