[ https://issues.apache.org/jira/browse/XERCESC-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134558#comment-17134558 ]
Roger Leigh commented on XERCESC-2204:
--
Since this is all abstracted via the msgloader interface, it won't in and of
itself cause an ABI or API break; there will always be inmemory to fall back on.
If any users of the ICU msgloader have done their own custom build with their
own custom messages added, then this would certainly break for them. As far as
I can see, the messages aren't loaded dynamically at runtime; they all get
generated and linked in at compile time. But maybe it's more flexible than
that and I haven't looked closely enough.
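
To illustrate why dropping a backend hidden behind such an interface need not
break callers, here is a minimal sketch. Every name in it (MsgLoaderSketch,
InMemoryLoaderSketch, makeLoader) is hypothetical and is not the actual
Xerces-C++ API; the message number is only illustrative.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a message loader interface; NOT the Xerces-C++ API.
class MsgLoaderSketch {
public:
    virtual ~MsgLoaderSketch() {}
    // Returns true and fills 'out' if a message exists for 'id'.
    virtual bool loadMsg(unsigned int id, std::string& out) const = 0;
};

// Messages compiled into the binary, so this backend is always available.
class InMemoryLoaderSketch : public MsgLoaderSketch {
public:
    bool loadMsg(unsigned int id, std::string& out) const override {
        static const std::map<unsigned int, std::string> table = {
            {37, "unknown complexType '{0}'"},  // illustrative entry
        };
        const std::map<unsigned int, std::string>::const_iterator it = table.find(id);
        if (it == table.end()) {
            return false;
        }
        out = it->second;
        return true;
    }
};

// Hypothetical factory. Callers only ever see the interface, so a request for
// a backend that no longer exists can fall back to the in-memory loader.
std::unique_ptr<MsgLoaderSketch> makeLoader(const std::string& backend) {
    (void)backend;  // only the in-memory backend exists in this sketch
    return std::unique_ptr<MsgLoaderSketch>(new InMemoryLoaderSketch());
}
```

Code that calls makeLoader("icu") here still compiles and still gets the en_US
text; only custom messages built into a removed backend would be lost.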
I have nothing against internationalisation; most of the stuff I work on is
fully internationalised with multiple languages. But here, in Xerces-C we
don't just have one translation system. We have *three*. All apparently
unused.
Just to pick a string at random:
{noformat}
$ git grep "unknown complexType '{0}'"
src/xercesc/NLS/EN_US/XMLErrList_EN_US.Xml:
src/xercesc/util/MsgLoaders/ICU/resources/root.txt: "unknown complexType '{0}'" ,
src/xercesc/util/MsgLoaders/MsgCatalog/XercesMessages_en_US.Msg:37 unknown complexType '{0}'
{noformat}
So we don't just have one set of translations. We have three. Just for en_US!
If we were to keep the translation machinery, I think it would make sense to
pick one and require it. The iconv (catgets) one is UNIX-specific. The
inmemory one is limited to a single language. ICU is the most flexible, but
requires ICU. But then, ICU is useful for proper Unicode support. If we were
to require ICU unconditionally, then we could use the ICU message loader
unconditionally.
It's all horribly complicated compared with gettext.
Regarding overall goals, I would like to modernise some parts of Xerces-C++,
primarily to make it more usable with current compilers and more easily
interoperable with current libraries, and also to improve correctness and
performance. I'm certainly not envisaging any world-breaking rewrite. Rather,
some very selective rationalisation of features, requiring a selected subset of
C++11, and being very careful about avoiding introducing compatibility problems
or introducing bugs. With those changes made, I would be interested in running
clang-tidy over the codebase and fixing any latent bugs it uncovers. There are
a lot of casting issues (over 10k). Many of them can be removed by introducing
mutable, which was previously avoided due to broken compilers but would
eliminate a lot of casting away of constness. I did run a static analyser over
the whole codebase. There are too many issues to fix in one go; they will need
to be broken down into categories and tackled bit by bit. We're not at that
point yet; additional issues will need to be created once we're able to run
the analyser properly.
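
As a rough illustration of the mutable point, the sketch below contrasts a
const accessor that caches a value by casting away constness with the same
accessor using mutable members. Both classes are hypothetical and are not
taken from the Xerces-C++ sources.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class: without 'mutable', a const accessor that caches a value
// has to cast away constness. This is undefined behaviour if the object
// itself was declared const.
class NameWithCast {
public:
    explicit NameWithCast(const std::string& s)
        : fText(s), fLength(0), fCached(false) {}

    std::size_t length() const {
        if (!fCached) {
            NameWithCast* self = const_cast<NameWithCast*>(this);
            self->fLength = fText.size();
            self->fCached = true;
        }
        return fLength;
    }

private:
    std::string fText;
    std::size_t fLength;
    bool fCached;
};

// Same behaviour with 'mutable': the cache members may be written from a
// const member function, so no cast is needed.
class NameWithMutable {
public:
    explicit NameWithMutable(const std::string& s)
        : fText(s), fLength(0), fCached(false) {}

    std::size_t length() const {
        if (!fCached) {
            fLength = fText.size();
            fCached = true;
        }
        return fLength;
    }

private:
    std::string fText;
    mutable std::size_t fLength;
    mutable bool fCached;
};
```

The second form is also safe to call on an object declared const, which the
cast-based form is not.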
> Remove message loader
> -
>
> Key: XERCESC-2204
> URL: https://issues.apache.org/jira/browse/XERCESC-2204
> Project: Xerces-C++
> Issue Type: Bug
> Components: Miscellaneous
> Affects Versions: 3.3.0
> Reporter: Roger Leigh
> Assignee: Roger Leigh
> Priority: Major
> Fix For: 3.3.0
>
>
> We support several different message loaders (inmemory, icu, iconv).
> However... we don't have any translations to actually warrant all this
> complexity, and likely never will. We have the basic en_US and that's it.
> So this code is essentially unused, and I don't see much prospect of it being
> used in the future given that there have been zero translations written in
> the last two decades.
> In order to reduce the size of the test matrix and reduce the maintenance
> burden, I'd like to ask if this is something we could safely drop?
> See also XALANC-808 which is the same issue for Xalan-C.