Hello all.

Salikh Zakirov wrote:
Far below are the results of my experiments with Log4cxx's ResourceBundle.
(I managed to find it in the Log4cxx documentation after carefully
rereading your original post.)

The good news is that it does localization (though severely limited).
The prototype has the following good properties:
* The unlocalized message is used as the message key

The message key should be a "message pattern" (not a message), because
the message may contain parameters. E.g.:

"Message pattern with integer parameter: %d"
or
"Message pattern with one parameter: {0}"

* No extra entities were introduced (like non-printable message keys)

What about a very long "message pattern" (e.g. the help message from
the VM)? For such cases a "messageId key" should be used.

* The localizable messages are marked by _() notation and can
be extracted from the source code automatically

To my mind, this is a graceful solution for extracting localizable messages.
But should we care about this? These messages only need to be gathered
once and put into a properties file.

I propose the following solution:

Modify the VM's LoggerString class. The first parameter of a composite
message should be the message key. If it is an empty string, the
message should not be localized. E.g.:
WARN("" << "Not localizable message with two parameters: " << 1 << " and " << 10)
WARN("localizable message with two parameters: %d and %d" << 1 << 10)
or
WARN("localizable message with two parameters: {0} and {1}" << 1 << 10)
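
To make the {0}/{1} variant concrete, here is a minimal sketch of how the
parameters could be collected and substituted. PatternMessage is a
hypothetical stand-in, not the existing LoggerString; the localized pattern
is passed in by the caller because the bundle lookup is out of scope here.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the proposed LoggerString change: the first
// operand is the message key (pattern), every later operand is a parameter.
class PatternMessage {
public:
    explicit PatternMessage(const std::string& key) : key_(key) {}

    // Convert each streamed operand to text and remember it by position.
    template <typename T>
    PatternMessage& operator<<(const T& arg) {
        std::ostringstream os;
        os << arg;
        args_.push_back(os.str());
        return *this;
    }

    const std::string& key() const { return key_; }

    // An empty key means "not localizable": the operands are concatenated.
    // Otherwise every {N} in the (already localized) pattern is replaced by
    // the N-th operand; anything that is not a valid {N} is copied verbatim.
    std::string format(const std::string& pattern) const {
        std::string out;
        if (key_.empty()) {
            for (std::size_t i = 0; i < args_.size(); ++i) out += args_[i];
            return out;
        }
        for (std::size_t i = 0; i < pattern.size(); ++i) {
            std::size_t close = std::string::npos;
            if (pattern[i] == '{') close = pattern.find('}', i);
            if (close != std::string::npos && close > i + 1) {
                std::size_t index = 0;
                bool digits = true;
                for (std::size_t j = i + 1; j < close; ++j) {
                    if (pattern[j] < '0' || pattern[j] > '9') { digits = false; break; }
                    index = index * 10 + static_cast<std::size_t>(pattern[j] - '0');
                }
                if (digits && index < args_.size()) {
                    out += args_[index];
                    i = close;  // skip past the closing brace
                    continue;
                }
            }
            out += pattern[i];
        }
        return out;
    }

private:
    std::string key_;
    std::vector<std::string> args_;
};
```

One point in favour of {N} over printf specifications: a translation may
reorder the parameters, which positional placeholders allow and plain %d
does not.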


The things that I have not implemented yet (to save time and make at least
something available):
* loading the system locale value
* reading the locale-specific localization file
* converting the localized messages to locale-specific encoding

What do you mean here?

* converting the unlocalized messages from source encoding (US-ASCII) to
UTF-16 (wchar_t[])

This is a big question. We could use char[] strings here; Log4cxx
automatically converts char* to wchar_t*.
Alternatively, we could use UTF-8 encoding for wide characters.


The issues that I have encountered but haven't yet worked out a solution:

* PropertyResourceBundle.getString().c_str() returns a pointer to a stack
location. To make it work, I had to use wcsdup(), thus introducing
an unacceptable memory leak.
I think there must be some way to get a pointer to the original bundle
contents, but I haven't figured out how to achieve it.


Maybe this is the way:

LOG4CXX_DECODE_WCHAR(chstr, wchrstr);
LOG4CXX_ENCODE_CHAR(charstr, chstr);
charstr.c_str()
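
Another possible way around both the dangling pointer and the leak is to
copy each localized string once into a long-lived cache and hand out
pointers into that cache. The sketch below is only an illustration:
LookupFn and demo_lookup are hypothetical stand-ins for the bundle's
getString(), and the cache is not thread-safe.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for ResourceBundle::getString(): the localized text
// is returned by value, so at the call site it is only a temporary.
typedef std::wstring (*LookupFn)(const std::wstring& key);

// Demo lookup for illustration only: knows a single key and otherwise
// returns the key unchanged.
inline std::wstring demo_lookup(const std::wstring& key) {
    if (key == L"message") return L"SOOBSCHENIE";
    return key;
}

// Returns a pointer that stays valid until program exit. Each distinct key
// is looked up and copied once; std::map does not relocate its elements on
// insertion, so the c_str() of a stored value is not invalidated by later
// calls. Memory grows with the number of distinct messages, not with the
// number of calls. NOTE: not thread-safe as written.
inline const wchar_t* cached_localize(const std::wstring& key, LookupFn lookup) {
    static std::map<std::wstring, std::wstring> cache;
    std::map<std::wstring, std::wstring>::iterator it = cache.find(key);
    if (it == cache.end()) {
        it = cache.insert(std::make_pair(key, lookup(key))).first;
    }
    return it->second.c_str();
}
```

With something like this, _() could keep returning const wchar_t* without
calling wcsdup() on every invocation.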

* PropertyResourceBundle expects well-formed property syntax, so the
unlocalized messages need to be mangled into a property-compatible form
(in the patch below, the only transformation replaces spaces ' ' with
underscores '_', but it needs to be generalized).

I agree with you.
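
One way the mangling could be generalized is to backslash-escape the
characters that are special in a Java-style .properties key instead of
replacing them. This is only a sketch: escape_property_key is a
hypothetical helper, and whether the Log4cxx Properties parser honours
these escapes in keys is an assumption I have not verified.

```cpp
#include <string>

// Escape a message pattern so it can serve as a key in a Java-style
// .properties file. ASSUMPTION: the parser un-escapes backslash sequences
// in keys the way java.util.Properties does.
inline std::wstring escape_property_key(const std::wstring& message) {
    std::wstring out;
    out.reserve(message.size());
    for (std::wstring::size_type i = 0; i < message.size(); ++i) {
        const wchar_t c = message[i];
        switch (c) {
            // Separators, comment markers and the escape character itself.
            case L' ': case L'=': case L':':
            case L'#': case L'!': case L'\\':
                out += L'\\';
                out += c;
                break;
            // Line structure must not leak into the key.
            case L'\n': out += L"\\n"; break;
            case L'\t': out += L"\\t"; break;
            default:    out += c;      break;
        }
    }
    return out;
}
```

Unlike the space-to-underscore replacement, this mapping is reversible, so
"message 1" and "message_1" remain two different keys.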


Given the number of issues PropertyResourceBundle introduces, and the number
of services it provides (parsing property-format and constructing in-memory
hashmap), I think that it would be easier to reimplement the functionality
without using PropertyResourceBundle, and change the storage on-disk file
format to allow unmangled messages be the keys.


In conclusion, here are my suggestions for the VM's internationalization:

1. Extend the log4cxx::helpers::PropertyResourceBundle class to
allow lazy (on-demand) loading of properties.
2. Extend the log4cxx::helpers::Properties class to allow a string with
spaces as a key.
3. Choose a model:
   a. _("<message key>") - localizable; "<message>" - not localizable
   b. "<message key>" - localizable; "" - not localizable.
4. Decide which of two variants to use for parameters inside a message
pattern: printf format specifications or {<number>}.

Thanks, Dmitry.

===============================================
From: Salikh Zakirov <[EMAIL PROTECTED] >
Date: Thu, 13 Jul 2006 12:06:05 +0400
Subject: [PATCH] Dummy l10n implementation based on Log4cxx
---
vm/include/l10n.h              |   31 +++++++++++++++++++
vm/port/include/loggerstring.h |    9 +++++
vm/vmcore/src/init/l10n.cpp    |   66 ++++++++++++++++++++++++++++++++++++++++
vm/vmcore/src/init/vm_main.cpp |    2 +
4 files changed, 108 insertions(+), 0 deletions(-)

diff --git a/vm/include/l10n.h b/vm/include/l10n.h
new file mode 100755
index 0000000..bb3edfe
--- /dev/null
+++ b/vm/include/l10n.h
@@ -0,0 +1,31 @@
+#ifndef _L10N_H
+#define _L10N_H
+
+#include <string>
+#include <log4cxx/helpers/propertyresourcebundle.h>
+#include <log4cxx/helpers/exception.h>
+#include <wchar.h>
+#include "cxxlog.h"
+
+extern log4cxx::helpers::ResourceBundlePtr l10n_resource_bundle;
+
+inline const wchar_t* _(const wchar_t* message)
+{
+    if (!l10n_resource_bundle) return message;
+    try {
+        wchar_t* mangled = wcsdup(message);
+        wchar_t* c = mangled;
+        while (*c) {
+            if (*c == L' ') *c = L'_';
+            c++;
+        }
+        std::wstring & localized = l10n_resource_bundle->getString(mangled);
+        free(mangled);
+        return wcsdup(localized.c_str()); // FIXME: leak
+    } catch (log4cxx::helpers::MissingResourceException &) {}
+    return message;
+}
+
+void init_l10n();
+
+#endif // _L10N_H
diff --git a/vm/port/include/loggerstring.h b/vm/port/include/loggerstring.h
old mode 100644
new mode 100755
index 1efe5d2..1eae5c1
--- a/vm/port/include/loggerstring.h
+++ b/vm/port/include/loggerstring.h
@@ -41,6 +41,15 @@ public:
        return (const char*)logger_string.c_str();
    }

+    LoggerString& operator<<(const wchar_t* message) {
+        const wchar_t* c = message;
+        while (*c) {
+            logger_string += (char)*c;
+            c++;
+        }
+        return *this;
+    }
+
    LoggerString& operator<<(const char* message) {
        logger_string += message;
        return *this;
diff --git a/vm/vmcore/src/init/l10n.cpp b/vm/vmcore/src/init/l10n.cpp
new file mode 100755
index 0000000..c8fd746
--- /dev/null
+++ b/vm/vmcore/src/init/l10n.cpp
@@ -0,0 +1,66 @@
+#include <apr_env.h>
+#include <assert.h>
+#include <fstream>
+#include <string.h>
+
+#include "cxxlog.h"
+#include "l10n.h"
+#include "platform_lowlevel.h"
+
+#include <log4cxx/helpers/locale.h>
+
+using namespace log4cxx;
+using namespace log4cxx::helpers;
+
+ResourceBundlePtr l10n_resource_bundle;
+
+void init_l10n()
+{
+    INFO2("info", "starting l10n initialization");
+
+    /*
+    apr_pool_t *pool;
+    apr_pool_create(&pool, 0); assert(pool);
+    char *lang = NULL;
+
+    apr_env_get(&lang, "LANG", pool);
+    if (!lang) lang = "C";
+
+    char *encoding = strchr(lang,'.');
+    if (encoding != NULL) {
+        *encoding = '\0';
+        encoding += 1;
+    }
+    char *region = strchr(lang,'_');
+    if (region != NULL) {
+        *region = '\0';
+        region += 1;
+    }
+    INFO2("info", "lang = " << lang << ", " << "region = " << region
+            << ", encoding = " << encoding);
+    string filename = "drlvm_";
+    assert(lang);
+    filename += lang;
+    if (region) {
+        filename += "_";
+        filename += region;
+    }
+
+    INFO2("info", "filename = " << filename.c_str());
+    //FIXME: read the localization file
+    */
+
+    std::wstring properties = L"message_1=SOOBSCHENIE 1\nmessage=SOOBSCHENIE";
+    PropertyResourceBundle* bundle =
+        new PropertyResourceBundle(properties);
+    INFO2("info", "bundle loaded (" << bundle << ")");
+    assert(bundle);
+
+    // _() can only be used after this initialization is done
+    l10n_resource_bundle = bundle;
+
+    INFO2("info", _(L"message"));
+    INFO2("info", _(L"message 1"));
+    INFO2("info", _(L"message 2"));
+    //apr_pool_destroy(pool);
+}
diff --git a/vm/vmcore/src/init/vm_main.cpp b/vm/vmcore/src/init/vm_main.cpp
old mode 100644
new mode 100755
index e03e674..7378403
--- a/vm/vmcore/src/init/vm_main.cpp
+++ b/vm/vmcore/src/init/vm_main.cpp
@@ -42,6 +42,7 @@ #include "dll_jit_intf.h"
#include "dll_gc.h"
#include "em_intf.h"
#include "port_filepath.h"
+#include "l10n.h"

union Scalar_Arg {
    int i;
@@ -559,6 +560,7 @@ static void destroy_vm(Global_Env *p_env
VMEXPORT int vm_main(int argc, char *argv[])
{
    init_log_system();
+    init_l10n();

    char** java_args;
    int java_args_num;
--
1.4.1.g4b86





---------------------------------------------------------------------
Terms of use :
http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail:
[EMAIL PROTECTED]
For additional commands, e-mail:
[EMAIL PROTECTED]


