To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=1601





------- Additional comments from e...@openoffice.org Fri Mar 20 20:27:28 +0000 
2009 -------
Hi, tremendous amount of work, I've quite a few nitpicks though ;-)

The implementation name must match its usage, e.g. in the sentencecase
ctor we have

    implementationName = "com.sun.star.i18n.Transliteration.sentencecase";

which should be 

    implementationName = "com.sun.star.i18n.Transliteration.SENTENCE_CASE";

instead if "SENTENCE_CASE" is to be used as the module name to be loaded
with loadModuleByImplName(). Errm.. did you actually try to run your
implementation?


sentencecase::transliterate() attempts to throw
a MultipleCharsOutputException in case upper/lower was a one-to-many
mapping. That doesn't work, transliterate() is declared to throw
RuntimeException only. Instead, a one-to-many mapping would had to be
fully processed, while adapting the offset sequence if useOffset==true,
which it is meant for. For examples see other transliterate()
implementations.


For better readability I suggest to rename class sentencecase to
Transliteration_sentencecase, and in
i18npool/source/registerservices/registerservices.cxx
change
IMPL_CREATEINSTANCE( sentencecase )
to
IMPL_CREATEINSTANCE( Transliteration_sentencecase )
and thus

{   TRLT_SERVICELNAME_L10N,
    TRLT_IMPLNAME_PREFIX  "SENTENCE_CASE",
    &sentencecase_CreateInstance },

to

{   TRLT_SERVICELNAME_L10N,
    TRLT_IMPLNAME_PREFIX  "SENTENCE_CASE",
    &Transliteration_sentencecase_CreateInstance },


There's no need for
#define TRLT_SERVICELNAME_SENTENCECASE  TRLT_SERVICELNAME_PREFIX "sentencecase"
in i18npool/inc/servicename.hxx


TRANSLITERATION_ONETOONE( sentencecase )
in i18npool/inc/transliteration_OneToOne.hxx
Why this definition and therefor have class sentencecase being derived
from class transliteration_OneToOne? Sentencecase does not provide
a oneToOneMapping table, nor does it provide a TransFunc function.


<Transliteration unoid="SENTENCE_CASE"/>
in i18npool/source/localedata/data/en_US.xml may have side effects in
such a way that the LC_TRANSLITERATION element in other locales often
refers to the en_US element and such would inherit this transliteration
as well. I'm not sure at the moment which source code actually uses this
list of available transliterations for what purpose, other than the
i18npool internals. Would had to be investigated.


The new transliteration is probably only suitable for some Western
languages. One might need to answer "what is sentence case?" for other
languages and scripts. A limited set of sentence endings and whitespace
is hard coded and the algorithm would already fail for Spanish language
with its inverted marks starting a question or exclamation.

Instead of hard coded whitespace, the unicode::isWhiteSpace() method
from i18nutil/unicode.hxx should be used.

The bPoint variable should not be reset unless an ALPHA or DIGIT
character was encountered. Use unicode::getCharType() and check the
com.sun.star.i18n.KCharacterType bits returned. In fact, everything
between the last sentence ending and the first ALPHA or DIGIT is in
a weak state and should not be folded at all.

The sentence case transliteration is not applicable to all languages,
for example German where nouns always start with an upper case letter.
Situation with CTL languages might be even more complicated. It's not
clear to me what should happen in such cases.


Performance:

- casefolding::getValue() should only be called if the character is
  LOWER (for LowerToUpper) respectively UPPER (for UpperToLower).
- Creating a temporary string for each character to be replaced and then
  use dst=dst.replaceAt() is total overkill. Use an OUStringBuffer
  instead for buffer manipulation and return buffer.makeStringAndClear()
  at the end.


SwEditShell::TransliterateText()
aTrans.loadModuleByImplName( moduleName, LANGUAGE_SYSTEM );
LANGUAGE_SYSTEM is questionable. It should be the language attribution
of the selected paragraph or sentence instead, else conversion
upper<->lower may be wrong.


All changes in module binfilter are moot, that's only used to convert
between XML and old binary file format. Basically, what the binary
format didn't know doesn't need to be handled. Especially adding UI bits
like string definitions for menus are superfluous, there's no UI for
binfilter.


Regarding the change in TransliterationWrapper:
That may change usage flow. In case code relied on the old behavior that
calling transliterate() with language attribute would load
a corresponding module if needed even if previously another module was
loaded with loadModuleByImplName(), that code would fail now if the
wrapper was reused and such a ImplName module was loaded before. The
change would need extensive checking of all callers whether such
a situation may happen, and provide means at the wrapper to reset to the
initial state as that currently isn't possible.


Wow, that was a lot, let's hope it didn't discourage you ;-)


---------------------------------------------------------------------
Please do not reply to this automatically generated notification from
Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@sw.openoffice.org
For additional commands, e-mail: issues-h...@sw.openoffice.org


---------------------------------------------------------------------
To unsubscribe, e-mail: allbugs-unsubscr...@openoffice.org
For additional commands, e-mail: allbugs-h...@openoffice.org

Reply via email to