Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]

Curt Arnold Mon, 20 Dec 2004 15:01:19 -0800

On Dec 20, 2004, at 3:26 PM, Ceki Gülcü wrote:

As mentioned earlier, I think CachedDateFormat may fail for certain
patterns at certain dates. If we can recognize the limited number of
formats for which it fails (if it does) and sidestep those, then
fine. Before going any further, do you agree that patterns causing
CachedDateFormat to fail exist and that it's just not me making things
up?

For those tuning in late: The basic idea of the cached date format is that if the time is within the same integral second as a previous request, then only the milliseconds field needs to be rewritten. To find the milliseconds field, on the first request (or any request where the total length of the formatted field has changed), two times only differing in the number of milliseconds are output and the results are analyzed. If the milliseconds format is unrecognized, then the CachedDateFormat will simply delegate to the underlying DateFormat.
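A minimal sketch of the detection step described above, not the actual CachedDateFormat code from the patch. The class name, method names, and the probe values 654 and 987 are illustrative assumptions; the sketch also assumes ASCII digits and a non-negative time value.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of log4j.
class MillisFieldProbe {

    /**
     * Returns the index of a three-digit milliseconds field in the output
     * of fmt, or -1 if no such field is recognized (the caller would then
     * delegate every call to fmt). Assumes time >= 0 and ASCII digits.
     */
    static int findMillisecondStart(DateFormat fmt, long time) {
        long second = (time / 1000L) * 1000L;
        // Format two instants in the same second that differ only in milliseconds.
        String a = fmt.format(new Date(second + 654));
        String b = fmt.format(new Date(second + 987));
        if (a.length() != b.length()) {
            return -1;
        }
        int i = 0;
        while (i < a.length() && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        // The first difference must be exactly the two probe values,
        // and everything after the field must match again.
        boolean recognized = i + 3 <= a.length()
                && a.startsWith("654", i)
                && b.startsWith("987", i)
                && a.substring(i + 3).equals(b.substring(i + 3));
        return recognized ? i : -1;
    }

    /** Rewrites only the milliseconds field of a previously formatted string. */
    static String withMillis(String cached, int pos, int millis) {
        StringBuilder sb = new StringBuilder(cached);
        sb.setCharAt(pos, (char) ('0' + millis / 100));
        sb.setCharAt(pos + 1, (char) ('0' + (millis / 10) % 10));
        sb.setCharAt(pos + 2, (char) ('0' + millis % 10));
        return sb.toString();
    }

    public static void main(String[] args) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        long now = System.currentTimeMillis();
        int pos = findMillisecondStart(fmt, now);
        String cached = fmt.format(new Date(now));
        System.out.println(pos + " " + withMillis(cached, pos, 42));
    }
}
```

A pattern without a milliseconds field (for example "HH:mm:ss") yields -1 here, which corresponds to the fall-back-to-delegate case.
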

CachedDateFormat would not be able to detect the milliseconds field on RelativeTimeDateFormat unless the starting time was an integral second, and would not be able to detect millisecond fields if non-Arabic digits were in use. In either of these cases, you would have an extra call per format evaluation. I believe the original patch avoided caching RelativeTimeDateFormat.

The worst-case scenario is a date-time format where the location of the millisecond field changes but the total length of the field does not. I don't think you could create one with SimpleDateFormat; however, you could obviously write a custom DateFormat that did.

There is an observable difference when running the performance tests to a null appender with CachedDateFormat. However, it may not be significant in more realistic deployments. It is a significant improvement over the flawed (and currently unused) caching code in the original DateFormats. However, the original motivation for the caching may no longer be relevant and so a new CachedDateFormat may not have a performance benefit that justifies the added complexity.

CachedDateFormat attempted to support multiple digit sets. However, I couldn't find any stock Java locales that used a digit set other than 0-9 in their date formats. I had expected that the Thai locale would use Thai digits, but I was wrong.
If I am not mistaken, the existing code in CachedDateFormat only
localized the digit 0. That may be enough if the
SimpleDateFormat instance and the CachedDateFormat instance use the same
locale, but if not, the output will be inconsistent.

The second pass used localized values of both 0 and 9 to identify the millisecond field. If the default locale changed, CachedDateFormat would not switch locales until the next integral second. There may be other issues that come to pass with any locale rework, so maybe the best approach is to leave CachedDateFormat out for now. It will be available in Bugzilla in case someone ever wants to add it later.

Date formatting was affected by the current locale and timezone of the thread and there was no mechanism to configure a timezone or locale to be used. The existing patches added configurable timezones and locales to the pattern layout which would modify the behavior of the date formats. Based on some of the previous discussions on the Jakarta Commons Dev list, I'd like to evaluate whether Appender is a better place for the locale to be specified.
What I'd like to do is:
Commit simplifications to the DateFormats and add CachedDateFormat, but simplified to recognize only Arabic digits (0-9).
That would be good.
Review configurable locales and timezones and come back to the list with a specific recommendation. My current take is that appender is probably a more appropriate place to specify locale. However, that should be considered in a bigger scope where locale affects both the layout and rendering non-string messages. TimeZone is likely still appropriate to configure on the layout.
This raises a much wider question. Should a given customization be
allowed at the logger repository level, logger level, appender level,
at the layout level or at the pattern converter level? Getting the
answer right provides tremendous added value. For example, the named
logger hierarchy propagates 'level' values according to the level
inheritance rule. This in turn provides a very fast, yet meaningful
filtering mechanism for categorizing logging statements. The fact that
we got this question right is one of the main reasons behind log4j's
success. Appender additivity is another example showing that getting
the collaboration rules between components right makes a big
difference.
I happen to think that the logger repository should/can be viewed as
the central point influencing all the components attached to it. For
example,
1) properties of the logger repository should/can be visible at all
components levels.
2) new pattern conversion rules defined at the logger repository level
should/can be shared by all the instances of PatternLayout attached to
that logger repository.
3) a resource bundle attached to a logger repository should/can be
shared by all *loggers* (hint hint), appenders and layouts.
4) The mapping URL (defined below) attached to a logger repository
should/can be shared by all instances of the %logger2 pattern converter.
In "should/can", the "should" part signifies my current inclination to
think of the above as good design. The "can" part means that design is
still open for debate.
What is the mapping URL?
------------------------
We routinely write o.a.l.r.RollingFileAppender instead of org.apache.log4j.rolling.RollingFileAppender. The first form is almost as precise and much shorter. Whenever I get the chance, I'd like to implement a pattern converter named %logger2 which instead of printing org.apache.log4j.rolling.RollingFileAppender will print o.a.l.r.RollingFileAppender. The shortened forms will be defined in a properties file defined by the user. (We will provide a default mapping.) The location of this mapping will be specified with a URL hence the term "mapping URL".
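A rough sketch of how such a lookup might work. The %logger2 converter does not exist yet and the format of the mapping file is not specified above, so everything here is an assumption: the class and method names are hypothetical, and the mapping is assumed to hold one "package prefix = short form" entry per line, with the longest matching prefix winning.

```java
import java.util.Properties;

// Hypothetical helper, not part of log4j.
class LoggerNameAbbreviator {

    /**
     * Replaces the longest package prefix of loggerName that has an entry
     * in mapping with its short form. Names without a matching prefix are
     * returned unchanged.
     */
    static String abbreviate(String loggerName, Properties mapping) {
        String prefix = loggerName;
        int dot;
        // Strip one trailing name segment at a time, longest prefix first.
        while ((dot = prefix.lastIndexOf('.')) > 0) {
            prefix = prefix.substring(0, dot);
            String shortForm = mapping.getProperty(prefix);
            if (shortForm != null) {
                return shortForm + loggerName.substring(dot);
            }
        }
        return loggerName;
    }

    public static void main(String[] args) {
        // In practice the entries would be loaded from the mapping URL,
        // e.g. with Properties.load(url.openStream()); assumed here in memory.
        Properties mapping = new Properties();
        mapping.setProperty("org.apache.log4j.rolling", "o.a.l.r");
        System.out.println(
            abbreviate("org.apache.log4j.rolling.RollingFileAppender", mapping));
    }
}
```
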
Coming back to the TimeZone question, we could imagine that a TimeZone
could be set at the LoggerRepository level. This TimeZone would
percolate down to all levels below. However, if needed it could be
overridden at a lower level, e.g. the pattern converter level. Can the
TimeZone influence multiple pattern converters of a PatternLayout? If
that's not a plausible scenario, then it does not make sense to define
a TimeZone at the Appender level nor at the PatternLayout level.
Providing too many or meaningless extension/customization points will
confuse the user, make things harder for her to manage, and make the
code harder for us to maintain. Getting the collaboration rules right
makes all the difference in the world.

I'm going to have to do some research before I can make a reasonable proposal.

Here is a use case that I think suggests that Layout or Appender is the right level: Send logging events to Ceki in fr-CH localized email messages with time in Central European Timezone and to Curt in en-US email messages with time in US Central time zone.

However, if you were using a SocketAppender instead and receiving in Chainsaw, there would not be a layout involved, however you would want to be able to control the locale used in the Object.toString() call used to render non-string messages. Timezone would not come into play until a layout was involved.

You could either specify TimeZone as a property of the Layout, in which case all time formats (likely one, but possibly more) within a message would be in a single time zone, or you could extend the pattern syntax for dates to include a timezone specifier. The second would allow you to represent the time, for example, in both GMT and a local time within the same formatted message. I chose the first since I'm a wimp and it was easier.
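To illustrate what either option has to do underneath, here is a small sketch using only the standard java.text API. It is not the code from the patches; the class and method names are hypothetical. It formats one instant for different audiences by giving each formatter its own locale and time zone.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of log4j.
class PerAudienceDateFormat {

    /** Formats the same instant with an explicit locale and time zone. */
    static String format(Date when, String pattern, Locale locale, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, locale);
        // Without this call the formatter uses the JVM default time zone.
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(when);
    }

    public static void main(String[] args) {
        Date now = new Date();
        String pattern = "EEEE d MMMM yyyy HH:mm:ss z";
        // One event, rendered for two recipients.
        System.out.println(format(now, pattern, Locale.forLanguageTag("fr-CH"), "Europe/Zurich"));
        System.out.println(format(now, pattern, Locale.US, "America/Chicago"));
        // The same instant in GMT, as a per-pattern timezone specifier would allow.
        System.out.println(format(now, pattern, Locale.US, "GMT"));
    }
}
```
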

Locale and timezone, like layout, are accommodations of the preferences of a particular audience being reached through the appender. Can you think of reasons that you would want to specify them at a higher level?

Implementing appender-level locale rendering would likely involve creating threads to do rendering in non-default locales in some instances and would likely have some performance hit, but shouldn't significantly affect performance when no locale is specified. However, it is going to take some experimentation to see where it can be effectively performed.
