[i18n] Eclipse NLS approach to localized strings

Salikh Zakirov Sat, 29 Jul 2006 09:14:26 -0700

(* I deliberately changed the subject to reflect the topic change *)

Alex Blewitt wrote:
> However, the argument is equivalent to:
> ...


Alex, thanks a lot for the extensive explanation of why
the Eclipse NLS approach is better. If I were in a position
to decide how to localize the class library code,
you would have convinced me. However, I am mostly interested
in DRLVM and GC.

From a general maintainability point of view, the NLS
approach requires complex tools to check the consistency of
message catalogs and to help update them.
Are these tools available in Eclipse?

The tasks I am thinking about are

1) Assume I need to print a localizable message. I would like Eclipse
to automatically convert code like

   println("localizable string")

with a <magic key press> to

   println(Messages.localizable_string);

and to have the field added to Messages.java automatically.
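
To make the target of that conversion concrete, here is a minimal sketch of what such a Messages class could look like. It does not use the real Eclipse API (Eclipse ships org.eclipse.osgi.util.NLS for this purpose); the reflection-based initialize() method, the inlined catalog text and the "!key!" placeholder are all assumptions made so that the example is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Properties;

// Hypothetical stand-in for an NLS-style message class: one public static
// String field per message key, filled in from a catalog by reflection.
class Messages {
    public static String localizable_string;
    public static String file_not_found;

    // Assumption: the catalog arrives as text to keep the sketch
    // self-contained; a real setup would read a .properties resource.
    static void initialize(String catalogText) {
        Properties catalog = new Properties();
        try {
            catalog.load(new StringReader(catalogText));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        for (Field field : Messages.class.getDeclaredFields()) {
            int mods = field.getModifiers();
            boolean isMessageField = field.getType() == String.class
                    && Modifier.isPublic(mods) && Modifier.isStatic(mods);
            if (!isMessageField) {
                continue;
            }
            // Keys missing from the catalog get a visible placeholder.
            String value = catalog.getProperty(field.getName(),
                    "!" + field.getName() + "!");
            try {
                field.set(null, value);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }
}

class MessagesDemo {
    public static void main(String[] args) {
        Messages.initialize("localizable_string=localizable string\n");
        System.out.println(Messages.localizable_string);
        System.out.println(Messages.file_not_found); // key absent from the catalog
    }
}
```

Because every message is a field, a misspelled key becomes a compile error at the call site rather than a lookup failure at run time.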


2) Assume I have been translating messages and have a half-filled
translations file. How can I update it to pick up the strings that are
not yet translated and to clean out the strings that are no longer
referenced?
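
The check I have in mind for task 2 boils down to two set differences over the catalog keys. A rough sketch follows, assuming both catalogs are plain java.util.Properties text; the CatalogDiff class and its method names are hypothetical, not part of any existing tool.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper comparing a reference catalog with a translation.
class CatalogDiff {
    // Parses properties-format text and returns its keys in sorted order.
    static Set<String> keysOf(String catalogText) {
        Properties catalog = new Properties();
        try {
            catalog.load(new StringReader(catalogText));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return new TreeSet<>(catalog.stringPropertyNames());
    }

    // Keys in the reference catalog that the translation does not have yet.
    static Set<String> untranslated(String referenceText, String translationText) {
        Set<String> missing = keysOf(referenceText);
        missing.removeAll(keysOf(translationText));
        return missing;
    }

    // Keys in the translation that the reference catalog no longer contains.
    static Set<String> stale(String referenceText, String translationText) {
        Set<String> extra = keysOf(translationText);
        extra.removeAll(keysOf(referenceText));
        return extra;
    }
}
```

This only compares catalogs with each other. Finding keys that the code itself no longer references would additionally need the field names of the Messages class, which could be collected by reflection.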


----8<---- Below is Alex's excellent explanation of the advantages of the
----8<---- Eclipse NLS approach, for those who haven't read the thread
----8<---- "[drlvm] string interning in java".

Alex Blewitt wrote:

Just to be clear: an intern()'d string (assuming the intern() pool is
garbage collected, which I doubt) will stay around as long as there's a
reference, just like any other GC object in Java.

However, the argument is equivalent to:

public class Argument {
 public void method() {
   String local = reader.readLine();
 }
}

versus

public class Argument {
 private static List local = new ArrayList();
 public void method() {
   local.add(reader.readLine());
 }
}

In the former, the object is eligible for GC at the end of the method.
In the latter, it's eligible for GC when the class is unloaded. It's
not a memory leak in the traditional C sense, but it *is* a leak of
resources that is programmer error. A constant string pool of:

public class Argument {
 public void method() {
   getMessage("Some Key That Is Going To Be In The Intern Table");
 }
}

will remain in the constant pool/intern pool until Argument.class is
unloaded, whereas

public class Argument {
 public void method() {
   getMessage(Thing.message);
 }
}

will *only* get loaded when the Thing.class is loaded (which will be
delayed until the method is invoked). Furthermore, if it is defined
as:

public class Thing {
 // must be visible to Argument, which reads Thing.message
 public static String message;
 static {
   message = reader.readLine();
 }
}

then there's *no* resource taken up by this until Thing.class is
loaded, which in turn only happens when the first class needs to use
it. Additionally, it doesn't pollute the intern() pool, and as such,
you get GC for free when Thing.class is unloaded (and thus doesn't
care whether the intern() pool is GC or not).
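
The deferred cost described above can be observed with a small experiment. This is only a sketch: LoadLog and LazyLoadingDemo are hypothetical helper names, and the message is a hard-coded literal instead of something read from a file. Strictly speaking it observes class initialization (the static block running), which Java guarantees to be lazy; when a particular VM actually loads the class bytes is up to the implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical log used to observe when static initializers run.
class LoadLog {
    static final List<String> events = new ArrayList<>();
}

class Thing {
    static String message;
    static {
        // Runs once, on the first access to Thing.message.
        LoadLog.events.add("Thing initialized");
        message = "hello from Thing";
    }
}

class Argument {
    String method() {
        return Thing.message; // first use triggers Thing's static block
    }
}

class LazyLoadingDemo {
    public static void main(String[] args) {
        Argument argument = new Argument();
        System.out.println(LoadLog.events);    // nothing logged yet
        System.out.println(argument.method());
        System.out.println(LoadLog.events);    // now contains the Thing entry
    }
}
```

Creating an Argument does not touch Thing at all; the static block only runs once method() reads the field.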

If we have a perfect intern() GC mechanism, there's no
significant difference. If we have an imperfect intern() GC
mechanism (or no GC of the intern() pool at all), then the
field-based approach is demonstrably better.

Regardless of which approach is used, it's also worth noting that if
there are several Thing.classes (e.g. one for exceptions, one for
debug messages, one for info messages) then you only load the class
(sic) of messages that you use. If you never do a
log.debug(Debug.message), then you'll never load the Debug.class, so
no resources would ever be taken up. (Same applies whether they're
dynamically read fields or hard-coded string literals.) OTOH our
current approach to messages (stick all messages in one place for one
module) somewhat defeats this benefit, because you trigger the loading
of all messages as soon as you need the first one. A slightly smarter
solution might be to have a single messages file (for ease of
management/translation) but many message classes (e.g. Info.class,
Debug.class, Exception.class) per module; that way, when the first
message is printed, you only load a sub-set of the messages for that
module.
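
As a rough illustration of the "one file, many message classes" idea, the sketch below keeps all messages in one catalog and lets each class pull in only the keys carrying its own prefix. The SharedCatalog helper, the "ClassName.key" naming convention and the inlined catalog text are assumptions for the example, not how Eclipse NLS or Harmony actually does it.

```java
import java.io.IOException;
import java.io.StringReader;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Properties;

// Hypothetical shared catalog: one file for translators, several classes.
class SharedCatalog {
    // Assumption: inlined text stands in for a single messages file.
    static final String TEXT =
            "Info.started=Server started\n"
          + "Debug.cache_miss=Cache miss\n";

    // Fills the public static String fields of 'holder' from the keys
    // prefixed with the holder's simple class name.
    static void fill(Class<?> holder) {
        Properties catalog = new Properties();
        try {
            catalog.load(new StringReader(TEXT));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        String prefix = holder.getSimpleName() + ".";
        for (Field field : holder.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (field.getType() != String.class
                    || !Modifier.isPublic(mods) || !Modifier.isStatic(mods)) {
                continue;
            }
            try {
                field.set(null, catalog.getProperty(prefix + field.getName()));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }
}

class Info {
    public static String started;
    static { SharedCatalog.fill(Info.class); }
}

class Debug {
    public static String cache_miss;
    static { SharedCatalog.fill(Debug.class); }
}
```

A program that never reads Debug.cache_miss never runs Debug's static block, so those strings are never assigned. The sketch still parses the whole catalog text for each class, which a real implementation would want to avoid.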

Of course, one could put the argument forward that this is all
premature optimisation and thus we shouldn't do anything :-) But we
have a tool (Eclipse) that can do all of this for us at the moment,
and we've already got the experience of others investigating the
memory footprint of Eclipse and determining that the cause of memory
bloat was the String intern() calls; and the tools/mechanism to fix
this problem. We could, of course, reinvent the wheel (c.f. HashCode
object when there's a tool to generate decent hashCode()/equals()
implementations) -- but I'd be interested in knowing why our proposed
solution is better or what it gives us that the Eclipse NLS
infrastructure doesn't.

Oh, and for the record -- I think a GC intern() would be a cool idea.
Let's hope it happens, because there's bound to be programs that do
silly intern()'ing that need help. Hopefully Harmony doesn't need to
be one of them :-)

Alex.


