The recent removal of the extension prereg mechanism revealed a problem
with how we select which dictionaries (which come in the form of bundled
extensions) are included in a given installation.
At least with the "official" (<http://download.libreoffice.org>) Linux
and Mac OS X installation sets, the base installation set contains en-US
localization and only contains dictionaries "related" to that locale
(dict-en, dict-es, dict-fr; see below for details of what "related"
means). The additional per-language langpacks contain dictionaries
"related" to the given langpack (e.g., langpack_de contains dict-de).
However, on Windows, the base installation set contains all available
localizations and all available dictionaries. During msi installation,
some code apparently determines a default selection of only a subset of
the "Additional user interface languages" entries (presumably based on
the current system locale settings), but all of the available "Optional
Components - Dictionaries" entries are selected by default. Since the
removal of prereg, this causes data about all those bundled dictionary
extensions to be generated per user at each user's first start of LO,
leading to noticeable time and space requirements (see
<https://bugs.freedesktop.org/show_bug.cgi?id=53009> "Large
UserInstallation's user/extensions/bundled/ tree").
Hence, one suggestion to address that problem would be to reduce the
number of "Optional Components - Dictionaries" entries selected by
default during Windows msi installation, similar to how a certain
combination of base installation set plus langpack(s) on the other
platforms also only installs a subset of all the available dictionaries.
(That is, the code that apparently now determines a default selection
of "Additional user interface languages" entries would need to be
extended to also determine a default selection of "related" "Optional
Components - Dictionaries" entries.)
Initial reactions on IRC (see below) were (a) that the status quo on
Windows exists to avoid "political issues" (though that is inconsistent
with the status quo on the other platforms), and (b) that we should
rethink having dictionaries as bundled extensions (though I would prefer
to keep things simple, solving the problem by harmonizing behavior
across platforms now and leaving anything more ambitious for the future).
Any further thoughts?
Stephan
PS1: The way dictionaries "related" to a given locale are determined
appears to be the list at
setup_native/source/packinfo/spellchecker_selection.txt. That's why the
en-US base installation set for Linux and Mac OS X contains dict-en,
dict-es, and dict-fr, for example. However, an apparent inconsistency
is that langpack_de only contains dict-de, and not also dict-fr and
dict-it, as that list would suggest.
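To make the suggestion concrete, here is a minimal Python sketch of how a default dictionary selection could be derived from the default-selected UI languages. It is an illustration only, not the actual msi or packaging code: the function name, the RELATED_DICTS table, and its in-memory shape are assumptions, and the table contains only the two relations mentioned in this mail (en-US and de), not the real contents of spellchecker_selection.txt.

```python
# Hypothetical excerpt of the locale -> "related" dictionaries relation.
# Only the two entries mentioned in this mail are included; the data
# structure is an assumption, not the real file format.
RELATED_DICTS = {
    "en-US": ["en", "es", "fr"],
    "de": ["de", "fr", "it"],
}


def default_dictionaries(ui_languages, related=RELATED_DICTS):
    """Return the dict-* entries to select by default for the given UI languages.

    Order follows the UI languages; duplicates are dropped. Languages
    without an entry in the table contribute no dictionaries.
    """
    selected = []
    for lang in ui_languages:
        for code in related.get(lang, []):
            name = "dict-" + code
            if name not in selected:
                selected.append(name)
    return selected


if __name__ == "__main__":
    # A Windows install that default-selects the en-US and de UI languages.
    print(default_dictionaries(["en-US", "de"]))
```

With such a helper, the installer would default-select the union of the dictionaries "related" to each default-selected UI language, while all other dictionaries would remain available as optional components.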
PS2: At least the Mac OS X LO 3.6.1 en-US base installation set contains
share/extension/dict-* directories for all available dictionaries, not
just dict-en, dict-es, dict-fr, but the additional ones are effectively
empty and their existence is a bug.
PS3: For the record, the relevant log of yesterday's #libreoffice-dev:
Aug 29 12:50:57 <sberg> timar, do you know anything about our msi by default installing all
"Optional Components - Dictionaries" entries, but only selected (at installation time, I
presume?) "Additional user interface languages"?
Aug 29 12:51:59 <timar> sberg: yes, we always install all dictionaries on Windows in
order to avoid "political issues"
Aug 29 12:52:26 <tml_> is this the old "omg, I waste SEVERAL MEGABYTES on
dictionaries for languages I don't even like" discussion?
Aug 29 12:53:41 <sberg> timar, but that causes one part of the problems of
fdo#53009, so I had hoped we could fix that
Aug 29 12:53:44 <IZBot> LibreOffice-Libreoffice normal/medium ASSIGNED Large
UserInstallation's user/extensions/bundled/ tree
https://bugs.freedesktop.org/show_bug.cgi?id=53009
Aug 29 12:54:41 <tml_> wouldn't the best solution then be to stop treating these as
"extensions"?
Aug 29 12:55:12 <tml_> don't we have too much optionality in the installer
anyway?
Aug 29 12:55:40 <tml_> hmm, those are orthogonal issues, sorry
Aug 29 12:58:36 <timar> sberg: what is your suggestion?
Aug 29 13:02:55 <sberg> timar, assuming that there is code in our msi to default-enable some subset X of
"Additional user interface languages" entries: extend that code to also default-enable only a
"matching" subset of "Optional Components - Dictionaries" entries
Aug 29 13:03:44 <tml_> that assumes people would prefer to use software
(including the OS) in the same language as they write/edit documents it. not true
Aug 29 13:03:46 <sberg> ...for some suitable definition of "matching"
Aug 29 13:05:01 <timar> sberg: tml_ there is
http://opengrok.libreoffice.org/xref/core/setup_native/source/packinfo/spellchecker_selection.txt
that we still use for creating Linux langpacks IMHO (not sure)
Aug 29 13:05:11 <sberg> tml_, no, but it might be a better approximation to typical
users' needs than the current "install everything" approach (after all, users /can/
install additional dics -- its only about the defaults)
Aug 29 13:06:45 <sberg> timar, yes, that list I had on my mind
Aug 29 13:06:56 <tml_> sberg: one person's good approximation is another
person's grave insult to the XXX people ;)
Aug 29 13:07:26 <sberg> tml_, we already use that approximation on other
platforms
Aug 29 13:07:45 <tml_> so that is broken, then? ;)
Aug 29 13:09:16 <sberg> tml_, do you have a better suggestion?
Aug 29 13:10:01 <tml_> sberg: is that there are lots of *extensions* that is
causing problems, or lots of *dictionaries* ?
Aug 29 13:11:03 <tml_> or, wait, am I smoking crack with this talk about
extensions?
Aug 29 13:11:25 <tml_> (I somehow had the impression that many dictionaires are
technically packaged as "extensions", are they?)
Aug 29 13:11:51 <timar> tml_: dictionaries are extensions
Aug 29 13:12:15 <sberg> tml_, dictionaries come as bundled extensions, and
every bundled extension increases the per-user space reqs and per-user--first-start
time reqs (though some do more than others)
Aug 29 13:12:20 <tml_> ok, so then the question above to sberg still holds
Aug 29 13:12:52 <tml_> sberg: ok, so wouldn't the solution then be to stop
packaging dictionaries as extensions? or do they *have* to be such for some obscure
technical reason?
Aug 29 13:13:05 <tml_> I mean, they could still be optional in the installer
even if they weren't extensions
Aug 29 13:13:29 <tml_> just like lots of other things are optional but aren't
extensions
Aug 29 13:16:28 <sberg> tml_, I think the origin of having dicts as exts is so
that (a) people can install additional ones (OOo traditionally did not come with such
a large number of bundled dicts as LO does at least on Windows, IIUC), and (b) people
can update dicts independently from updating the app itself (as the dicts were
traditionally provided by 3rd parties, IIUC)
Aug 29 13:17:38 <tml_> but having the bundled ones not be extensions wouldn't
stop (a), and (b) is made unnecessary by our time-based frequent releases
Aug 29 13:22:54 <sberg> tml_, I'm not arguing that having dicts as exts is
necessarily good; what I'm not sure about is whether turning a given dict from ext to
non-ext could cause technical problems, if a user installed an ext variant of that
dict into a LO that contains that dict as non-ext
Aug 29 13:24:24 <tml_> that is something to check (and fix) then, if the
bundled dictionaries would not be extensions any more
Aug 29 13:24:31 <sberg> maybe makes sense to put this on the ESC agenda
Aug 29 13:27:11 <caolan> some of the code for the old pre-extension mechanism
for dictionaries still exists in lingucomponent/source/lingutil/lingutil.cxx now used
for the system dictionary case
Aug 29 13:27:30 <caolan> its *supposed* to prefer extensions IIRC over system
dicts
Aug 29 13:27:41 <caolan> *shrug*
Aug 29 13:28:43 <caolan> the removed pre-extension code had a dictionary.lst in
some dir or other that listed the dicts and languages they were for
Aug 29 13:29:47 <caolan> but that was back in pre language tool days, not sure
if that makes some of our bundled dicts no longer just simple hunspell/hyphen/mythes
containers
Aug 29 13:30:10 <tml_> sberg: but anyway, I am not opposed to making the
installer by default select only a (somewhat arbitrary) subset of dictionaries to
install, if that fixes a problem for most people
Aug 29 13:30:37 <tml_> and even if I was opposed, that could be ignored;)
Aug 29 13:32:23 <caolan> throw the net wide enough, dict for langpack + top X
languages always installed + langs also in use in territory + Y neighbouring langs :-)
Aug 29 13:36:46 <tml_> caolan: but isn't it so that exactly selecting "neighbouring
langs" (but not langs from some country a few borders away) can cause immense irritation. "why
would we proud Freedonians want to write in the language of those dogs of Elbonia. what we need is the
language of our beloved friends from Bulvania"
Aug 29 13:37:36 <tml_> but whatever
Aug 29 13:40:20 <caolan> including Russian in a shortlist of dicts for the
Latvian langpack is a potential contender for that problem
Aug 29 13:41:58 <tml_> which is why when including *all* one can always say "we
don't make any judgements"
Aug 29 13:42:29 <caolan> Bosnian/Serbian/Croatian, *shudder*
Aug 29 13:45:12 <tml_> caolan: Serbian/Albanian/Russian was the real-world example I had
in mind. even if Albanian seems to be a "recognized minority language" in Serbia, so
at least officially they couldn't oppose it that heavily
Aug 29 13:46:33 <tml_> caolan: and what do I know, maybe I am too pessimistic,
and only a very small minority of people would take stuff like this so seriously
Aug 29 13:46:43 <tml_> caolan: after all, it isn't *maps* ;)
Aug 29 13:47:34 <caolan> tml_: RH has a utility to search for possible maps in
software packages :-)