Hi all -- briefly, java.text is underway.  For those who haven't messed
around with it (and really, I don't blame you), it's a set of classes to
do text formatting and parsing into useful things like Date objects.  A
lot of it has to do with i18n in terms of collation elements (e.g. in
Spanish, 'ch' and 'll' are considered single collation units even though
they take two characters to represent) and date and number formatting
conventions.
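
To make the 'ch' / 'll' point concrete, here is a minimal sketch using java.text.RuleBasedCollator with a hand-written rule string. The rule string and the class name TraditionalSpanish are made up for illustration; the rules cover lowercase letters only, and real locale data would also need case and accent handling.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

class TraditionalSpanish {
    // Illustrative rules only: lowercase letters, with "ch" and "ll"
    // listed as single collation units right after "c" and "l".
    static final String RULES =
        "< a < b < c < ch < d < e < f < g < h < i < j < k < l < ll"
        + " < m < n < \u00F1 < o < p < q < r < s < t < u < v < w < x < y < z";

    // Builds the collator, turning the checked exception into an unchecked one.
    static RuleBasedCollator collator() {
        try {
            return new RuleBasedCollator(RULES);
        } catch (ParseException e) {
            throw new IllegalStateException("bad collation rules", e);
        }
    }

    public static void main(String[] args) {
        RuleBasedCollator c = collator();
        // Collator order: "ch" is one unit that sorts after plain "c".
        System.out.println(c.compare("chico", "cosa") > 0);
        // Plain String order compares char by char, so 'h' < 'o' decides.
        System.out.println("chico".compareTo("cosa") < 0);
    }
}
```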

Anyway, I've been working at getting the date stuff done first (for one,
it's what gets used most), and that's going well.  The factory methods
and secondary classes are pretty much written, as are most of the
format() methods.  When that's done (soon) I'll work on
SimpleDateFormat.parse(), which converts a String into a Date, as if by
magic.  The other format types (Number, Choice) don't look too tough, so
they'll probably come next.
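
For anyone who hasn't used these classes, the format()/parse() pair looks roughly like this from the caller's side. This is only a usage sketch against the standard java.text API; the class name DateRoundTrip, the pattern, and the fixed locale and time zone are my own choices for the example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateRoundTrip {
    // Fixed locale and time zone so the result does not depend on
    // the defaults of the machine running the example.
    static SimpleDateFormat newFormat() {
        SimpleDateFormat fmt = new SimpleDateFormat("d MMMM yyyy", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt;
    }

    // parse() turns the text into a Date; format() turns it back into text.
    static String roundTrip(String text) {
        SimpleDateFormat fmt = newFormat();
        try {
            Date parsed = fmt.parse(text);
            return fmt.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("not a date: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("4 July 1998"));
    }
}
```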

One thing that's come out of building the date stuff is that we need to
decide how we want to deal with locale-specific resources.  Right now
I've got a DateFormatSymbolsResource.properties that lives in
gnu/java/text and provides the names of the months, weekdays, default
formats, and so on.  It is referenced as a ResourceBundle, so a similar
properties file could be provided for locale X.  I wrote some JNI stuff
earlier to call strftime() to get the month names and so on, but there are
obviously no C functions to get Java SimpleDateFormat format strings
(although we could get really clunky and try reading the raw
locale-specific files and converting the C formats to Java formats,
ouch).  So the JNI stuff is scrapped for now, though I might build it as
a utility for POSIX systems to extract data when creating a properties
file.
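
Roughly, the properties-to-symbols path could look like the sketch below. To keep it self-contained it reads the bundle from an in-memory string instead of a real file under gnu/java/text; the class name SymbolsFromBundle, the sample Spanish abbreviations, and the plain split on commas are assumptions for illustration, not what my current code does line for line.

```java
import java.io.IOException;
import java.io.StringReader;
import java.text.DateFormatSymbols;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;
import java.util.TimeZone;

class SymbolsFromBundle {
    // Stand-in for a locale-specific properties file (illustrative data).
    static final String SAMPLE =
        "shortMonths=ene,feb,mar,abr,may,jun,jul,ago,sep,oct,nov,dic\n";

    // Copies the month abbreviations from the bundle into a symbols object.
    static DateFormatSymbols load(ResourceBundle bundle) {
        DateFormatSymbols symbols = new DateFormatSymbols(Locale.ROOT);
        symbols.setShortMonths(bundle.getString("shortMonths").split(","));
        return symbols;
    }

    // Formats the epoch (1 January 1970, UTC) using the bundle's month names.
    static String formatEpoch() {
        try {
            ResourceBundle bundle =
                new PropertyResourceBundle(new StringReader(SAMPLE));
            SimpleDateFormat fmt = new SimpleDateFormat("d MMM yyyy", Locale.ROOT);
            fmt.setDateFormatSymbols(load(bundle));
            fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
            return fmt.format(new Date(0L));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formatEpoch());
    }
}
```

With a real file on the classpath, ResourceBundle.getBundle(baseName, locale) would replace the in-memory reader.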

There is additional locale-specific data needed for NumberFormat, and
for java.util.TimeZone (is anyone working on that?).  I think we need
the following:

(1) a standard directory to put resource bundles in, maybe
gnu.resource.  It needs to be in the classpath.  Resources might follow
a naming convention like "PackageClass[_locale...]", where Class is not
necessarily the only class that uses the data but the primary one
associated with the resource, e.g. "JavaTextDateFormat_en.properties",
"JavaUtilTimeZone_es_ES.properties".

(2) A standard convention for property file formats.  I've been doing
things like this:
shortMonths=Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
And then parsing the value with a StringTokenizer to split it into an
array, which the API calls for.  Suggestions are welcome.  This
obviously breaks for individual values that contain a comma, so maybe
those would be escaped with a backslash.

(3) an installation process (eventually!) that allows the user to choose
which locales they want to install.  I doubt everybody wants to lug
around all the possible locale data whenever they install classpath.
This gets a little complicated, as these resources are part of the JAR
file, so we would have to run "jar uf" (and I think the [u]pdate command
is an addition to 1.2).  Space isn't really at a premium on, say, my Linux
box, but on an embedded system or something it might be a major issue.
(Unfortunately the Unicode database is still too big to pursue my dream
of running Java on the Commodore 64... maybe with some smart disk
caching in the JVM implementation.)
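
Points (1) and (2) above could be sketched as two small helpers. Everything here is hypothetical: the class name ResourceConventions, the method names, and the exact capitalization rule are just one way to read the "PackageClass[_locale...]" idea and the backslash-escaped comma idea.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class ResourceConventions {
    // Point (1): ("java.text", "DateFormat", es_ES) -> "JavaTextDateFormat_es_ES".
    // Each package segment is capitalized and the locale is appended as a suffix.
    static String bundleName(String pkg, String cls, Locale locale) {
        StringBuilder sb = new StringBuilder();
        for (String part : pkg.split("\\.")) {
            if (part.isEmpty()) continue;
            sb.append(Character.toUpperCase(part.charAt(0)));
            sb.append(part.substring(1));
        }
        sb.append(cls);
        String suffix = locale.toString();   // e.g. "en" or "es_ES"
        if (!suffix.isEmpty()) sb.append('_').append(suffix);
        return sb.toString();
    }

    // Point (2): split on commas, where a backslash makes the next
    // character literal (so "\," is a comma inside a value).
    static String[] splitList(String value) {
        List<String> items = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (escaped) {
                current.append(c);
                escaped = false;
            } else if (c == '\\') {
                escaped = true;
            } else if (c == ',') {
                items.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        items.add(current.toString());
        return items.toArray(new String[0]);
    }
}
```

Two caveats. ResourceBundle.getBundle(baseName, locale) already appends the locale suffix and handles fallback, so only the base name would normally be built by hand. And the properties loader treats backslash as its own escape character, so the file would have to contain "\\," for the value to reach splitList() with the backslash intact.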

We might want to think about localizing exception messages and other
static strings at some point, too.  ResourceBundles are very easy to
use..
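
As a rough idea of what localized messages might look like, here is a sketch with an in-code ListResourceBundle and MessageFormat. The class name Messages, the key "parse.unparseable", and the message text are invented for the example; real bundles would more likely live in properties files next to the other locale data.

```java
import java.text.MessageFormat;
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

class Messages {
    // Default (English) messages; a Spanish bundle would supply the
    // same keys with translated patterns.
    static final ResourceBundle DEFAULTS = new ListResourceBundle() {
        protected Object[][] getContents() {
            return new Object[][] {
                { "parse.unparseable", "Unparseable date: \"{0}\"" },
            };
        }
    };

    // Looks up the pattern for a key and fills in the arguments.
    static String message(ResourceBundle bundle, String key, Object... args) {
        return MessageFormat.format(bundle.getString(key), args);
    }

    public static void main(String[] args) {
        System.out.println(message(DEFAULTS, "parse.unparseable", "32 Foo 1998"));
    }
}
```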

One thing that's bugging me is how to implement getAvailableLocales().
Are there methods for accessing the (virtual) directory structure of the
user's classpath, or do we need to provide a single Locales.properties
file or something similar that just lists the names of installed
locales?
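
If we went the Locales.properties route, reading it could be as simple as the sketch below. The key name "locales", the comma-separated layout, and the class name InstalledLocales are assumptions; the example parses an in-memory string, where real code would read the file from the classpath (for instance via getResourceAsStream).

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Properties;

class InstalledLocales {
    // Parses text such as "locales=en,es_ES,fr" into Locale objects.
    static Locale[] parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        List<Locale> result = new ArrayList<Locale>();
        for (String name : props.getProperty("locales", "").split(",")) {
            name = name.trim();
            if (name.isEmpty()) continue;   // tolerate a missing or empty list
            // "es_ES" -> language tag "es-ES" -> Locale for Spanish (Spain)
            result.add(Locale.forLanguageTag(name.replace('_', '-')));
        }
        return result.toArray(new Locale[0]);
    }

    public static void main(String[] args) {
        for (Locale l : parse("locales=en,es_ES,fr\n")) {
            System.out.println(l);
        }
    }
}
```

The installer from point (3) would then only need to rewrite this one file when locales are added or removed.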

I'm kind of blazing my own trail for this right now, so any ideas on how
to better organize are definitely welcome.  When I finish the DateFormat
stuff I'll start bugging Paul for CVS access and let the rest of you
play with it.

Wes

