On Jan 23, 2009, at 3:24 PM, Hervé BOUTEMY wrote:

the problem with such an auto-detection in a tool like Doxia used by
maven-site-plugin is that if the guessed encoding is not right, you can't do
anything

I was thinking that manually specifying a particular encoding would override the autodetection feature.

(or you have to configure it, which is what you wanted to avoid)

If autodetection guesses wrong (and I maintain that it would seldom guess wrong), having to configure it those few times would be better than having to configure it all the time, which is what UTF-8 users have to do now.

Another issue is that without autodetection, supporting more than one
type of character encoding for the APT files in a Maven project is
impossible.
same remark as before: what if the encoding guessed from a file is wrong?

The error rate would go from all the time to some of the time, which is still a win. Again, I'm assuming that autodetection is optional and enabled by default; if it causes problems it could be disabled, reverting to the same behavior as before.
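
A minimal sketch of the behavior described above, in Java. It is not Doxia code: the class name AptEncodingGuesser and the methods chooseCharset and isValidUtf8 are hypothetical, and the detection rule (treat the file as UTF-8 if it decodes strictly as UTF-8, otherwise fall back to ISO-8859-1) is an assumption about what "autodetection" could mean here. An explicitly configured encoding always overrides the guess, and disabling autodetection restores the old ISO-8859-1 default.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Doxia.
public class AptEncodingGuesser {

    /**
     * Picks the charset used to decode an APT file.
     *
     * @param content    raw bytes of the file
     * @param configured explicitly configured encoding name, or null if none
     * @param autodetect whether to guess when nothing is configured
     */
    public static Charset chooseCharset(byte[] content, String configured, boolean autodetect) {
        // A manually specified encoding overrides autodetection.
        if (configured != null) {
            return Charset.forName(configured);
        }
        // Autodetection disabled: same behavior as before.
        if (!autodetect) {
            return StandardCharsets.ISO_8859_1;
        }
        // Assumed rule: bytes that decode strictly as UTF-8 are taken to be UTF-8.
        return isValidUtf8(content) ? StandardCharsets.UTF_8 : StandardCharsets.ISO_8859_1;
    }

    /** Returns true if the bytes form a well-formed UTF-8 sequence. */
    static boolean isValidUtf8(byte[] content) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(content));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

Pure ASCII content passes the UTF-8 check, which is harmless because ASCII decodes identically under both encodings. The guess can still be wrong for ISO-8859-1 text whose bytes happen to form valid UTF-8 sequences, which is the case the manual override is meant to cover.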

There are a lot of Maven plugins today that complain if you don't configure default encoding: it is a simple property to add in your POM. Doesn't it meet
your needs?

The problem is that I have many dozens of POMs, and I have to declare the encoding in all of them. Is there some way of configuring the encoding globally, perhaps in settings.xml?

In light of this, I suggest changing Doxia's APT handling so that it
defaults to UTF-8 rather than ISO-8859-1. Not only will this help
UTF-8 users (who may be a majority),
do you have figures, or is it a guess?

It's a guess, though there's circumstantial evidence pointing to the rise of UTF-8. It's definitely growing on the web [1], and text editors I've used, such as Eclipse on Linux and TextMate on Mac OS X, default to UTF-8. I'm actually surprised UTF-8 hasn't been adopted more quickly because it solves so many issues. But I worry that we're never going to get there if modern applications continue to require native file encodings by default.

Trevor

[1] http://www.w3.org/QA/2008/05/utf8-web-growth.html
