[ 
https://issues.apache.org/jira/browse/XERCESC-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134558#comment-17134558
 ] 

Roger Leigh commented on XERCESC-2204:
--------------------------------------

Since this is all abstracted via the msgloader interface, it won't in and of 
itself cause an ABI or API break; there will always be inmemory to fall back on.
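
As a rough illustration of why the abstraction keeps callers insulated, here is 
a minimal sketch, assuming a much-simplified loader interface.  The names 
MsgLoader, InMemoryLoader, UnavailableLoader and loadMessage are hypothetical 
and are not the real Xerces-C++ classes or signatures; the message id 37 is 
purely illustrative.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified loader interface (NOT the real Xerces-C++ API).
class MsgLoader {
public:
    virtual ~MsgLoader() {}
    // Writes the message text for `id` into `out`; returns false if unknown.
    virtual bool load(int id, std::string& out) const = 0;
};

// Messages compiled into the binary: always available, single language.
class InMemoryLoader : public MsgLoader {
public:
    bool load(int id, std::string& out) const override {
        static const std::map<int, std::string> table = {
            {37, "unknown complexType '{0}'"}  // illustrative id only
        };
        std::map<int, std::string>::const_iterator it = table.find(id);
        if (it == table.end()) return false;
        out = it->second;
        return true;
    }
};

// Stand-in for an optional backend that has been removed or has no catalogue.
class UnavailableLoader : public MsgLoader {
public:
    bool load(int, std::string&) const override { return false; }
};

// Callers only see this function; which backend answered is invisible to them.
std::string loadMessage(const MsgLoader* preferred,
                        const MsgLoader* fallback,
                        int id) {
    assert(fallback != nullptr);  // the in-memory fallback must always exist
    std::string text;
    if (preferred != nullptr && preferred->load(id, text)) return text;
    if (fallback->load(id, text)) return text;
    return "<unknown message id>";
}
```

In this sketch, dropping the optional backend changes nothing for code that 
calls loadMessage(); it simply gets the in-memory text every time.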

If any users of the ICU msgloader have done their own custom build with their 
own custom messages added in, then removing it would certainly break them.  
As far as I can see, the messages aren't loaded dynamically at runtime; they 
all get generated and linked in at build time.  But maybe it's more flexible 
than that and I haven't looked closely enough.

I have nothing against internationalisation; most of the stuff I work on is 
fully internationalised with multiple languages.  But here, in Xerces-C, we 
don't just have one translation system.  We have *three*.  All apparently 
unused.

Just to pick a string at random:


{noformat}
$ git grep "unknown complexType '{0}'"
src/xercesc/NLS/EN_US/XMLErrList_EN_US.Xml:            <Message Id="UnknownComplexType" Text="unknown complexType '{0}'"/>
src/xercesc/util/MsgLoaders/ICU/resources/root.txt:             "unknown complexType '{0}'" ,
src/xercesc/util/MsgLoaders/MsgCatalog/XercesMessages_en_US.Msg:37  unknown complexType '{0}'
{noformat}

So we don't just have one set of translations.  We have three.  Just for en_US!

If we were to keep the translation machinery, I think it would make sense to 
pick one and require it.  The iconv (catgets) one is UNIX-specific.  The 
inmemory one is limited to a single language.  ICU is the most flexible, but 
requires ICU.  But then, ICU is useful for proper Unicode support.  If we were 
to require ICU unconditionally, then we could use the ICU message loader 
unconditionally.

It's all horribly complicated compared with gettext.


Regarding overall goals, I would like to modernise some parts of Xerces-C++, 
primarily to make it more usable with current compilers and more easily 
interoperable with current libraries, and also to improve correctness and 
performance.  I'm certainly not envisaging any world-breaking rewrite.  Rather, 
some very selective rationalisation of features, requiring a selected subset of 
C++11, and taking great care not to introduce compatibility problems or bugs.  
With those changes made, I would be interested in running clang-tidy over the 
codebase and fixing any latent bugs it uncovers.  There are a lot of casting 
issues (over 10k).  Many of them could be removed by introducing mutable, which 
was previously avoided due to broken compilers but would eliminate a lot of 
casting away of constness.  I did run a static analyser over the whole 
codebase.  There are too many issues to fix in one go; they will need to be 
broken down into categories and tackled bit by bit.  We're not at that point 
yet; additional issues will need to be created once we're able to run the 
analyser properly.
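
To make the mutable point concrete, here is a minimal sketch of the kind of 
change I have in mind.  Both classes are hypothetical and are not taken from 
the Xerces-C++ sources; they only assume the common pattern of a const accessor 
that lazily fills a cache.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical "before": the const accessor casts away constness to cache.
class NameCastingAway {
public:
    explicit NameCastingAway(const std::string& raw)
        : raw_(raw), hash_(0), hashed_(false) {}

    std::size_t hash() const {
        if (!hashed_) {
            // Undefined behaviour if *this was itself declared const.
            NameCastingAway* self = const_cast<NameCastingAway*>(this);
            self->hash_ = std::hash<std::string>()(raw_);
            self->hashed_ = true;
        }
        return hash_;
    }

private:
    std::string raw_;
    std::size_t hash_;
    bool hashed_;
};

// Hypothetical "after": the cache members are mutable, so no cast is needed.
class NameWithMutable {
public:
    explicit NameWithMutable(const std::string& raw)
        : raw_(raw), hash_(0), hashed_(false) {}

    std::size_t hash() const {
        if (!hashed_) {
            hash_ = std::hash<std::string>()(raw_);  // allowed: hash_ is mutable
            hashed_ = true;
        }
        assert(hashed_);  // the cache is always filled by this point
        return hash_;
    }

private:
    std::string raw_;
    mutable std::size_t hash_;  // cache, not part of the logical value
    mutable bool hashed_;
};
```

The observable behaviour is the same in both versions; the second one simply 
states in the type that the cache is allowed to change behind a const 
interface, so the cast (and the warning about it) goes away.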

> Remove message loader
> ---------------------
>
>                 Key: XERCESC-2204
>                 URL: https://issues.apache.org/jira/browse/XERCESC-2204
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: Miscellaneous
>    Affects Versions: 3.3.0
>            Reporter: Roger Leigh
>            Assignee: Roger Leigh
>            Priority: Major
>             Fix For: 3.3.0
>
>
> We support several different message loaders (inmemory, icu, iconv).  
> However... we don't have any translations to actually warrant all this 
> complexity, and likely never will.  We have the basic en_US and that's it.  
> So this code is essentially unused, and I don't see much prospect of it being 
> used in the future given that there have been zero translations written in 
> the last two decades.
> In order to reduce the size of the test matrix and reduce the maintenance 
> burden, I'd like to ask if this is something we could safely drop?
> See also XALANC-808 which is the same issue for Xalan-C.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

