Re: [libreoffice-l10n] Bavarian and Nipmuck - report

2015-12-12 Thread Michael Bauer



toki wrote the following on 12/12/2015 at 22:51:
> That doesn't mean that work on spell checking and grammar checking
> should not be done.

I didn't say that either.

> It does mean that whoever creates the spell checker and grammar
> checker have a hardcopy (paper) copy of the printed dictionary and
> grammar checker of the language in question, and understands both what
> in the book is contentious, and why it is contentious.

Yes, but these are skills different to those of a localizer. Not every
localizer is a grammarian and vice versa, and the same applies to the
writing or even analysis of dictionaries.

> How many people should be on a team is as dependent on cultural
> factors, as it is on practical factors.

Agreed

> The vice of a one-person team, is that there is nobody to hand off
> responsibility for the project off to, if the sole team member is no
> longer able to work on the project.

Agreed, or at least to the extent that a single localizer fails to
plan for the future once the project(s) have reached a degree of maturity.

> The vice of multi-member teams, is death by paralysis. The inability
> to come to mutually acceptable solutions, when questions/problems
> arise. In the corporate world in Europe and North America, researchers
> have found that eight people on a team, is the most effective size,
> for a project to be successful.

And my rough and ready analysis has told me you can expect about 1
active localizer for each half million speakers or so. Just as you seem to
get an almost invariable proportion of lurkers to participants on
forums, some ratio like that also seems to apply to l10n. Which means 8
is an almost unachievable number for many languages. All the more reason
to plan, but also an argument for not just turning down folk because they
haven't got a team.

> a) The primary issue with translating the help file after the rest of
> the UI, is that it does not get done. (Take a look at the number of
> localizations of LibO, where the Help file is not translated into the
> target language.)

That is one way of looking at it. The other is that of cost vs benefit.
It would be nice if someone actually did some research, but using the
in-built help has, in my experience, almost become a joke, like the tools
Windows includes for automatically fixing issues. In most cases, it does
not have the answer you're looking for and/or is out of date. Even
commercial projects with large development teams often produce shoddy
Help, even for key features/issues, like the Help on using Hunspell in
Trados.


To be honest I'm surprised LO has not tried to determine if the in-built 
help is actually worth the effort from the end-user POV in contrast to 
online help and/or active user forums etc.


For a small team, it is certainly the smallest bang for your buck in my
view.

> b) By starting with the Help File, one can incorporate it into the
> Documentation Work Flow, ensuring the documentation is consistent
> across the various mediums. Otherwise you end up with situations like
> the US English, where the written documentation and the help file
> contradict each other. Even worse, is when both the built-in help
> file, and written documentation are wrong.

Which makes the Help stuff an even worse place to start.

> The advanced/expert users of the software.

No. They hit Google and find the answer on some forum, wiki or some such
platform. I would bet the farm on it.

> I realize that Apple pioneered "Prohibit documentation wherever and
> whenever possible", but all that really results in, is to ensure that
> the user is unable to use the product as designed.

I'm not an Apple disciple, if that's what you're hinting at. If the
software is well designed, you shouldn't have to resort to Help much.

> It is no more heartbreaking to translate the Help file, as it is to
> translate the rest of the UI, or the documentation that is in other
> languages, into the target language.

Fundamentally disagree. The UI on the whole is not that bad and in some
places, actually a lot of fun (the Calc functions are probably the
worst bit of the UI) - on the whole, they're manageable chunks and you
can use TM to great effect. Help is full of big chunks with eye-watering
tags that confuse not only translators but also TMs. And like
translating EULAs, all the while you're thinking "who's actually going
to read this".


It's a bit of a cliché but "more research would be welcome" :)

Michael

--
To unsubscribe e-mail to: l10n+unsubscr...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/l10n/
All messages sent to this list will be publicly archived and cannot be deleted


Re: [libreoffice-l10n] Bavarian and Nipmuck - report

2015-12-12 Thread toki

On 09/12/2015 14:31, Michael Bauer wrote:

> 1) Orthography
> Terrible reason to turn down a project. Most l10n projects LO has
> involve languages where spelling is a potentially contentious issue.

Nonetheless, every one of the 3,000 languages that have been reduced to
writing has a published dictionary, and grammar. I'll grant that
regardless of which language one is talking about, there will be
contentious issues regarding the accuracy of the book in question.  That
doesn't mean that work on spell checking and grammar checking should not
be done.  It does mean that whoever creates the spell checker and
grammar checker have a hardcopy (paper) copy of the printed dictionary
and grammar checker of the language in question, and understands both
what in the book is contentious, and why it is contentious.

It also means that when automated tools are used to pare out foreign
words, one must ensure that only foreign words are omitted.  (How many
iterations of the Afrikaans spell checker were needed, before "die" was
included in it? The same thing happens in other languages, but the
omissions usually aren't as blatantly obvious to even the most casual user.)
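
To illustrate the point about automated pruning, here is a minimal Python
sketch. It assumes a hypothetical pipeline in which candidate words are
dropped whenever they appear in a foreign-language word list, and it adds a
protected list of native words that must never be dropped. The function name
prune_foreign and the tiny word lists are invented for illustration; they are
not taken from any actual spell-checker build.

```python
def prune_foreign(candidates, foreign_words, protected):
    """Split candidates into (kept, dropped).

    A word is dropped only if it is in foreign_words AND not in protected.
    All three arguments are hypothetical inputs for this sketch.
    """
    kept, dropped = [], []
    for word in candidates:
        key = word.lower()
        if key in foreign_words and key not in protected:
            dropped.append(word)
        else:
            kept.append(word)
    return kept, dropped


# Toy data: "die" is the Afrikaans definite article but also an English word.
candidates = ["die", "huis", "the", "water", "computer"]
english = {"die", "the", "water", "computer"}
afrikaans_core = {"die", "water"}  # native words that collide with English

kept, dropped = prune_foreign(candidates, english, afrikaans_core)
print(kept)
print(dropped)
```

Without the protected list, "die" and "water" would be removed along with
"the". Reviewing the dropped list by hand against the printed dictionary is
the step that catches such collisions.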

>Team Size

> Errr no. 1 dedicated localizer is more than enough. 

How many people should be on a team is as dependent on cultural factors,
as it is on practical factors.

The vice of a one-person team, is that there is nobody to hand off
responsibility for the project off to, if the sole team member is no
longer able to work on the project.

The vice of multi-member teams, is death by paralysis. The inability to
come to mutually acceptable solutions, when questions/problems arise.

In the corporate world in Europe and North America, researchers have
found that eight people on a team, is the most effective size, for a
project to be successful.

> 5) Start with documentation/help
> No. It would raise the wrong expectations, if you give the average user
> a screen that says Fàilte, unless highly cynical, they would expect the
> rest in the same lingo too.

a)  The primary issue with translating the help file after the rest of
the UI, is that it does not get done. (Take a look at the number of
localizations of LibO, where the Help file is not translated into the
target language.)

b) By starting with the Help File, one can incorporate it into the
Documentation Work Flow, ensuring the documentation is consistent across
the various mediums.  Otherwise you end up with situations like the US
English, where the written documentation and the help file contradict
each other. Even worse, is when both the built-in help file, and written
documentation are wrong.

> As to the Help, who reads the Help? 

The advanced/expert users of the software.

I realize that Apple pioneered "Prohibit documentation wherever and
whenever possible", but all that really results in, is to ensure that
the user is unable to use the product as designed.

> it's the worst starting point and a soul-destroying task.

It is no more heartbreaking to translate the Help file, as it is to
translate the rest of the UI, or the documentation that is in other
languages, into the target language.

jonathon



[libreoffice-l10n] Bavarian and Nipmuck - report

2015-12-09 Thread Michael Bauer


Somehow the mail client ate most of my email, reposting, sorry...


---


Sorry for the delay in responding, I'm travelling.


I think I disagree with most things that have been said in this discussion so 
far.


Let me try and go through them one by one...


1) Orthography


Terrible reason to turn down a project. Most l10n projects LO has involve 
languages where spelling is a potentially contentious issue. Perhaps the 
really big locales have very settled spelling systems but even they are not 
immune. For example, I doubt that anyone is enforcing either pre- or 
post-spelling-reform spellings in the German project. Some locales actually 
deliberately use l10n to help standardize spelling.


2) Team size


Errr no. 1 dedicated localizer is more than enough. I have a day job and I also 
do virtually all the l10n work on Mozilla, LO, WordPress (both), VLC, and 
several other projects. In fact, a single localizer can be more effective in 
some instances provided they put in sufficient time and effort. Indeed, having 
a team for Scottish Gaelic initially would have been a hindrance, not a help, 
because there would have been ENDLESS debates around terminology and spelling. 
In a non-standardized language, a single translator can produce translations 
which are superior to those of a team, provided they are fluent and generally 
good with technology.


3) It's extinct or critically endangered


Well, so is Scottish Gaelic, less than 60k speakers is hardly a stadium full of 
people... l10n is a key part of any revitalization effort in a society which is 
not cut off from technology. It is perhaps the one way in which a marginalized 
language can gain a foothold on the screens of the next generation, small as it 
may be. A program with a UI in a marginalized language has a big wow factor if 
done well. If you localize Diablo III into German, people just expect that, it's 
not news. Translate it into Nipmuck and it'll be all over the airwaves.


Wikipedia or even Ethnologue are not the pinnacle of information when it comes 
to smaller languages. On several occasions I have come across languages marked 
as extinct in one but not the other, or vice versa, or even where both were 
simply wrong. For example, they had a Basque Creole lumped in with a Romani 
language code in one instance.


4) Better to translate literature


Yes and no. I'm a very good localizer but I'm totally useless at translating 
literature or poetry or songs. It's called a specialism, no translator worth 
their money translates EVERYTHING. I'd be equally useless at writing 
non-technical content.


5) Start with documentation/help

No. It would raise the wrong expectations, if you give the average user a screen 
that says Fàilte, unless highly cynical, they would expect the rest in the same 
lingo too.


As to the Help, who reads the Help? Ever? Unless they don't have web access. 
Even if some folk use it, it's the worst starting point and a soul-destroying 
task.


6) Professors say to prioritise proofing


Maybe, but that depends on the locale. To create a spellchecker you first need 
either a really good dictionary or a body of well-spelled texts, plus someone who 
can code to some extent, because doing a Hunspell package is not entirely 
straightforward. Grammar checkers are equally nice but not a priority to begin 
with, I would say. Small languages often have not codified their grammar fully 
and thus if you just write some rules, you'll just annoy everybody.
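

As a rough illustration of the "body of well-spelled texts" route, here is a 
minimal Python sketch that turns a text corpus into a bare Hunspell-style word 
list (a .dic file starts with a word count, followed by one word per line). The 
function name build_dic, the min_count threshold and the sample text are 
assumptions made for this sketch, not part of any real packaging workflow, and 
a usable Hunspell package would also need proper affix rules in its .aff file.

```python
import re
from collections import Counter

# Letters only (Unicode-aware), allowing internal apostrophes and hyphens.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’-][^\W\d_]+)*")


def build_dic(corpus_text, min_count=2):
    """Return the text of a minimal .dic file built from corpus_text.

    Words seen fewer than min_count times are left out, on the (hypothetical)
    assumption that one-off forms are more likely to be typos.
    """
    counts = Counter(WORD_RE.findall(corpus_text.lower()))
    entries = sorted(w for w, n in counts.items() if n >= min_count)
    return "\n".join([str(len(entries))] + entries) + "\n"


# Smallest possible companion .aff file: only declares the encoding.
AFF_MINIMAL = "SET UTF-8\n"

# Placeholder sample text; the final "Failte" stands in for a one-off typo.
sample = "Fàilte gu Alba. Fàilte! Alba gu bràth, gu bràth. Failte"
dic_text = build_dic(sample)
print(dic_text)
```

The frequency threshold is a crude stand-in for human review; with a small 
corpus, checking the rejected words against a printed dictionary is still 
needed.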


In the end, these are just opinions. They are neither uniform (I disagree, for 
one) nor are they based on research.


7) Firefox


That is actually the best alternative suggestion I've heard in this debate. It 
might make sense to look into that. But either way, LO and Firefox are both 
must-haves really, so it doesn't make that much of a difference which one you 
start with. Firefox, since it has Android and iOS versions now, would get you 
more bang for your buck faster to begin with, though.


8) Machine Translation


Worst idea ever. MT relies on massive bilingual corpora - and that's just the 
start of the headaches. The last thing a language like Nipmuck needs is an MT 
system that cost them huge resources to produce and which outputs 
semi-gibberish at best. Irish is in a much better position regarding 
English/Irish data and yet Google Translate produces Irish which either makes 
you laugh yourself silly or makes you cry.


Long story short, my view is, welcome to both, just take a moment to consider 
the implications regarding time/effort/other challenges and if you still think 
it's a good idea, good on you.


Michael

