Re: [gentoo-dev] UTF-8 locale by default

2012-12-31 Thread Zac Medico
On 12/31/2012 09:14 AM, Maxim Kammerer wrote:
> Hi,
> 
> stage3 now includes non-ASCII paths, via app-misc/ca-certificates -- e.g.:
> /usr/share/ca-certificates/mozilla/TÜBİTAK_UEKAE_Kök_Sertifika_Hizmet_Sağlayıcısı_-_Sürüm_3.crt
> 
> Working with those (e.g., backup) probably requires a UTF-8 locale. Is
> this considered acceptable? Did anyone notice?

It's been that way for a very long time (over a year). Since bug #382199
[1], portage uses a constant UTF-8 encoding for all installed files
regardless of the locale, so at least you can count on portage handling
those UTF-8 names even if you don't have a UTF-8 locale configured.

[1] https://bugs.gentoo.org/show_bug.cgi?id=382199
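
To see which installed paths are affected, here is a minimal sketch (not from the thread) that lists paths containing non-ASCII bytes. The function name `list_non_ascii_paths` is made up for illustration. Running grep under LC_ALL=C makes it compare raw bytes, so the check should work even when no UTF-8 locale is configured. It assumes a grep that treats high bytes as ordinary characters in the C locale, and it will mis-split names that contain newlines.

```sh
# Hypothetical helper: print every path under a directory whose name
# contains a byte outside printable ASCII (space through tilde).
# LC_ALL=C makes grep work on raw bytes rather than decoded characters.
list_non_ascii_paths() {
    find "${1:-.}" -print | LC_ALL=C grep '[^ -~]'
}

# Example: list_non_ascii_paths /usr/share/ca-certificates
```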
-- 
Thanks,
Zac



Re: [gentoo-dev] UTF-8 locale by default

2012-12-31 Thread Maxim Kammerer
Hi,

stage3 now includes non-ASCII paths, via app-misc/ca-certificates -- e.g.:
/usr/share/ca-certificates/mozilla/TÜBİTAK_UEKAE_Kök_Sertifika_Hizmet_Sağlayıcısı_-_Sürüm_3.crt

Working with those (e.g., backup) probably requires a UTF-8 locale. Is
this considered acceptable? Did anyone notice?

-- 
Maxim Kammerer
Liberté Linux: http://dee.su/liberte



Re: [gentoo-dev] UTF-8 locale by default

2012-08-07 Thread Dan Douglas
On Friday, August 03, 2012 07:16:45 AM Luca Barbato wrote:
> On 07/27/2012 07:24 PM, Mike Frysinger wrote:
> > yes, and i'm waiting on the POSIX group to formalize C.UTF-8.  that's the
> > only real option in my mind for making unicode the default.  any other
> > amalgamations of various locales is ugly as sin.
> 
> When they meet? I'd be fine with a pre-release =P
> 
> lu
> 

2008 TC1 is just finishing up balloting as we speak. If this isn't already in
there you may be in for a long wait. Feel free to subscribe to the
austin-group lists -- it's open to anyone. A calendar with the
teleconference schedule is available.
--
Dan Douglas



Re: [gentoo-dev] UTF-8 locale by default

2012-08-03 Thread Diego Elio Pettenò
On 03/08/2012 09:54, Alexis Ballier wrote:
> I don't think anyone will object to enforcing a given locale to be
> present, even en_US.UTF-8; people will object if they have to use that
> locale.
> 
> Maybe locale-gen can even generate it on-the-fly in $T, I don't know.

Agreed. And there _is_ a way to tell which locales are available:
`locale -a`.
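
As a sketch of how that list could be used, the function below picks the first UTF-8 entry from `locale -a` style input. The name `first_utf8_locale` is an assumption for illustration. The pattern accepts both the `utf8` and `UTF-8` spellings and an optional `@modifier` suffix.

```sh
# Hypothetical helper: read a `locale -a` style list on stdin and print
# the first entry whose codeset is UTF-8 (spelled "utf8" or "UTF-8").
# Prints nothing when the list has no UTF-8 locale.
first_utf8_locale() {
    grep -i -E '\.utf-?8(@.*)?$' | head -n 1
}

# Example: locale -a | first_utf8_locale
```

An empty result means no UTF-8 locale has been generated, which is exactly the case a build cannot paper over.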

-- 
Diego Elio Pettenò — Flameeyes
flamee...@flameeyes.eu — http://blog.flameeyes.eu/



Re: [gentoo-dev] UTF-8 locale by default

2012-08-03 Thread Alexis Ballier
On Fri, 3 Aug 2012 17:47:24 +0200
Michał Górny  wrote:
> > Python upstream is doing what they think is best in using unicode.
> > 
> > That said, what if we just temporarily set a locale in the ebuild
> > for running tests and elsewhere? Is this unreasonable or
> > impossible? It might not be a great solution, this method, since
> > users' stuff will still break.
> 
> It is impossible because you can't know which locale a particular
> system has available. AFAIK there's no 'it-will-always-work' choice;
> unless we're going to enforce generating some common locale, or do
> very ugly things.

I don't think anyone will object to enforcing a given locale to be
present, even en_US.UTF-8; people will object if they have to use that
locale.

Maybe locale-gen can even generate it on-the-fly in $T, I don't know.

A.



Re: [gentoo-dev] UTF-8 locale by default

2012-08-03 Thread Michał Górny
On Fri, 3 Aug 2012 09:59:42 -0500
Matthew Summers  wrote:

> On Thu, Aug 2, 2012 at 1:32 PM, Mike Gilbert 
> wrote:
> > On Thu, Aug 2, 2012 at 2:21 PM, Diego Elio Pettenò
> >  wrote:
> >> On 01/08/2012 23:42, Fabian Groffen wrote:
> >>> Honestly, if some asian person has whatever charset that I often
> >>> find in spam messages, but is not UTF-8, are you then going to
> >>> tell that person to switch to UTF-8 to get those python packages
> >>> emerged?  I hope not.
> >>
> >> Tell that to the Python team I guess. My tinderbox _has_ utf8
> >> locales available, but doesn't set it by default -> Python stuff
> >> fails to build or test -> not going to be fixed with "change your
> >> locale" reasoning.
> >>
> >> Is it mental? Yes.
> >> Would I like that to change? Yes.
> >> Do I care whether that's through the use of cluebyfour on the
> >> Python team or by setting an utf-8 locale by default? Not in the
> >> least.
> >>
> >
> > Please apply the cluebyfour to the upstream developers of python and
> > python modules. :-)
> >
> > I do try to fix unicode problems if I run into them. However,
> > sometimes it just isn't worth the effort.
> >
> 
> Python upstream is doing what they think is best in using unicode.
> 
> That said, what if we just temporarily set a locale in the ebuild for
> running tests and elsewhere? Is this unreasonable or impossible? It
> might not be a great solution, this method, since users' stuff will
> still break.

It is impossible because you can't know which locale a particular
system has available. AFAIK there's no 'it-will-always-work' choice;
unless we're going to enforce generating some common locale, or do very
ugly things.

> 
> Further, I support the use of C.UTF-8 when it is ready. It seems like
> the lowest common denominator to me.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] UTF-8 locale by default

2012-08-03 Thread Matthew Summers
On Thu, Aug 2, 2012 at 1:32 PM, Mike Gilbert  wrote:
> On Thu, Aug 2, 2012 at 2:21 PM, Diego Elio Pettenò
>  wrote:
>> On 01/08/2012 23:42, Fabian Groffen wrote:
>>> Honestly, if some asian person has whatever charset that I often find in
>>> spam messages, but is not UTF-8, are you then going to tell that person
>>> to switch to UTF-8 to get those python packages emerged?  I hope not.
>>
>> Tell that to the Python team I guess. My tinderbox _has_ utf8 locales
>> available, but doesn't set it by default -> Python stuff fails to build
>> or test -> not going to be fixed with "change your locale" reasoning.
>>
>> Is it mental? Yes.
>> Would I like that to change? Yes.
>> Do I care whether that's through the use of cluebyfour on the Python
>> team or by setting an utf-8 locale by default? Not in the least.
>>
>
> Please apply the cluebyfour to the upstream developers of python and
> python modules. :-)
>
> I do try to fix unicode problems if I run into them. However,
> sometimes it just isn't worth the effort.
>

Python upstream is doing what they think is best in using unicode.

That said, what if we just temporarily set a locale in the ebuild for
running tests and elsewhere? Is this unreasonable or impossible? It
might not be a great solution, this method, since users' stuff will
still break.

Further, I support the use of C.UTF-8 when it is ready. It seems like
the lowest common denominator to me.


-- 
Matthew W. Summers
Gentoo Foundation Inc.



Re: [gentoo-dev] UTF-8 locale by default

2012-08-02 Thread Luca Barbato
On 07/27/2012 07:24 PM, Mike Frysinger wrote:
> yes, and i'm waiting on the POSIX group to formalize C.UTF-8.  that's the
> only real option in my mind for making unicode the default.  any other
> amalgamations of various locales is ugly as sin.

When they meet? I'd be fine with a pre-release =P

lu




Re: [gentoo-dev] UTF-8 locale by default

2012-08-02 Thread Kent Fredric
On 31 July 2012 05:33, Michael Orlitzky  wrote:
> On 07/30/12 12:28, Michał Górny wrote:
>>
>> My point here is that you want the thing to change. So you first try to
>> convince people here to change. We practically did a small survey here
>> and in the result we didn't agree on doing the change.
>>
>> So you're saying we should do another survey on another group, hoping
>> that this time the result will be on your side.
>
> We didn't do a survey, we asked,
>
>   "Is there a reason for not using at least en_US.UTF-8 as a "sane"
>    default value?"
>
> Unsurprisingly, the responses contained reasons for not using
> en_US.UTF-8 as the default.
>

I think it's a shame that:

1. the current handbook way to change timezone is manually editing a file.
2. the handbook doesn't mention `eselect locale`
3. `eselect locale list` is useless if you have *all* locales available to you.
4. `eselect locale` can only set the LANG variable.
5. that eselect doesn't have an interactive mode yet.

Why? because this problem could be made simpler by providing a way to
use a recommended locale for your timezone, which is likely to yield a
more sane default for that timezone.

It would also make it easier to validate the value the user chooses
for their Timezone value.
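
A minimal sketch of that validation step, assuming the usual zoneinfo layout: a timezone name is accepted only if it names a regular file under the zoneinfo directory. The function name `valid_timezone` and its optional second argument are made up for illustration and are not part of eselect.

```sh
# Hypothetical check: accept a timezone name only if it names a regular
# file below the zoneinfo directory (default /usr/share/zoneinfo).
valid_timezone() {
    tz=$1
    zoneinfo=${2:-/usr/share/zoneinfo}
    case "$tz" in
        ''|/*|*..*) return 1 ;;   # reject empty, absolute, or ".." names
    esac
    [ -f "$zoneinfo/$tz" ]
}

# Example: valid_timezone America/Chicago && echo "ok"
```

A group name such as `America` is a directory, so it is rejected.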

Consider:

eselect timezone list
# all level 1 timezones + groups, i.e. like ls /usr/share/zoneinfo
eselect timezone list America/
# contents of /usr/share/zoneinfo/America
eselect timezone set America/Chicago
# /etc/timezone is updated to 'America/Chicago'
# /etc/localtime is replaced with /usr/share/zoneinfo/America/Chicago
eselect locale set --all auto
# LANG and LC_* are set using the values defined as "default" for
# America/Chicago
eselect locale set --ctype auto
# Only LC_CTYPE is autopopulated.
eselect locale list
# 600 items because you have a vanilla locale.defs
eselect locale list --timezone
# shows a list of LOCALE values for the current TZ, with the one that
# would be used as default first/marked up differently
eselect locale list en
# shows english locale options
eselect locale set --ctype en_US.utf8


The benefits of setting these locales this way are obvious to me at
least: you can set locales to a sensible value automatically.
You can also validate a user's choice of locale and provide feedback.
For example, you can list non-installed locales, and then tell the user,
if they try to use a locale that isn't installed yet, that they need to
update locales.def
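
The validation idea can be sketched as below. It assumes the glibc convention that `locale -a` prints codesets in lowercase without hyphens (`en_US.utf8`), so a name the user types as `en_US.UTF-8` is normalized before comparison. Both function names are hypothetical.

```sh
# Normalize a locale name the way `locale -a` prints it on glibc systems:
# lowercase the codeset and drop its hyphens (en_US.UTF-8 -> en_US.utf8).
# Any @modifier is kept unchanged.
normalize_locale() {
    name=$1
    modifier=
    case "$name" in
        *@*) modifier="@${name#*@}"; name=${name%%@*} ;;
    esac
    case "$name" in
        *.*)
            codeset=$(printf '%s' "${name#*.}" | tr '[:upper:]' '[:lower:]' | tr -d '-')
            name="${name%%.*}.$codeset"
            ;;
    esac
    printf '%s\n' "$name$modifier"
}

# Succeeds if the locale named in $1 appears in the list read from stdin.
locale_listed() {
    want=$(normalize_locale "$1")
    while IFS= read -r have; do
        [ "$(normalize_locale "$have")" = "$want" ] && return 0
    done
    return 1
}

# Example: locale -a | locale_listed en_US.UTF-8 || echo "not generated yet"
```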

The only way I can suggest something better, would be an interactive
locale setter, something like 'tzselect' , except sets timezone *and*
locale information, with the ability to automatically update
locales.def and add new locale definitions and regenerate the locale
database.

This way, you could have a selection process more like this:

https://gist.github.com/3240866

#? 1

The following information has been given:

United States
Eastern Time

Therefore TZ='America/New_York' will be used.
Local time is now: Thu Aug 2 17:33:17 EDT 2012.
Universal Time is now: Thu Aug 2 21:33:17 UTC 2012.
Is the above information OK?
1) Yes
2) No
#? 1
Your Current locale settings are:

LANG="POSIX"

The recommended settings for your locale are :
LANG="en_US.utf8"
LC_CTYPE="en_US.utf8"

Do you wish to change your locale settings at this time?
1) No
2) Yes - Use recommended settings
3) Yes - Configure locale interactively.

At least this way, the effort required to configure your system into a
very good logical UTF8 default is trivial.

-- 
Kent

perl -e  "print substr( \"edrgmaM  SPA NOcomil.ic\\@tfrken\", \$_ * 3,
3 ) for ( 9,8,0,7,1,6,5,4,3,2 );"

http://kent-fredric.fox.geek.nz



Re: [gentoo-dev] UTF-8 locale by default

2012-08-02 Thread Alexis Ballier
On Thu, 02 Aug 2012 11:21:40 -0700
Diego Elio Pettenò  wrote:

> On 01/08/2012 23:42, Fabian Groffen wrote:
> > Honestly, if some asian person has whatever charset that I often
> > find in spam messages, but is not UTF-8, are you then going to tell
> > that person to switch to UTF-8 to get those python packages
> > emerged?  I hope not.
> 
> Tell that to the Python team I guess. My tinderbox _has_ utf8 locales
> available, but doesn't set it by default -> Python stuff fails to
> build or test -> not going to be fixed with "change your locale"
> reasoning.

not that it is hard to set LC_ALL=sth before running the failing
command, or make the pm do it... we already fix regexp bugs with other
locales (or workaround them by setting LC_ALL=C), it falls under the
same category.
you just need to teach people, and maybe mandate an utf8 locale to be
present; the same way they do not consider estonian alphabet ordering
'broken' they would not consider not having an utf8 locale 'broken',
esp. when said utf8 is far from being optimal in terms of size for asian
languages.
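
A small sketch of the "set LC_ALL before running the failing command" approach: the wrapper below applies a locale to one command only and leaves the caller's environment untouched. The name `run_with_locale` is an assumption for illustration, and it goes through `env` so the override cannot leak into the calling shell.

```sh
# Hypothetical wrapper: run a command with LC_ALL set to the given locale
# for that command only.
run_with_locale() {
    loc=$1
    shift
    env LC_ALL="$loc" "$@"
}

# Example: run_with_locale en_US.UTF-8 make check
```

This only helps if the named locale actually exists on the system, which is the point about mandating that a UTF-8 locale be present.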

A.



Re: [gentoo-dev] UTF-8 locale by default

2012-08-02 Thread Mike Gilbert
On Thu, Aug 2, 2012 at 2:21 PM, Diego Elio Pettenò
 wrote:
> On 01/08/2012 23:42, Fabian Groffen wrote:
>> Honestly, if some asian person has whatever charset that I often find in
>> spam messages, but is not UTF-8, are you then going to tell that person
>> to switch to UTF-8 to get those python packages emerged?  I hope not.
>
> Tell that to the Python team I guess. My tinderbox _has_ utf8 locales
> available, but doesn't set it by default -> Python stuff fails to build
> or test -> not going to be fixed with "change your locale" reasoning.
>
> Is it mental? Yes.
> Would I like that to change? Yes.
> Do I care whether that's through the use of cluebyfour on the Python
> team or by setting an utf-8 locale by default? Not in the least.
>

Please apply the cluebyfour to the upstream developers of python and
python modules. :-)

I do try to fix unicode problems if I run into them. However,
sometimes it just isn't worth the effort.



Re: [gentoo-dev] UTF-8 locale by default

2012-08-02 Thread Diego Elio Pettenò
On 01/08/2012 23:42, Fabian Groffen wrote:
> Honestly, if some asian person has whatever charset that I often find in
> spam messages, but is not UTF-8, are you then going to tell that person
> to switch to UTF-8 to get those python packages emerged?  I hope not.

Tell that to the Python team I guess. My tinderbox _has_ utf8 locales
available, but doesn't set it by default -> Python stuff fails to build
or test -> not going to be fixed with "change your locale" reasoning.

Is it mental? Yes.
Would I like that to change? Yes.
Do I care whether that's through the use of cluebyfour on the Python
team or by setting an utf-8 locale by default? Not in the least.

-- 
Diego Elio Pettenò — Flameeyes
flamee...@flameeyes.eu — http://blog.flameeyes.eu/





Re: [gentoo-dev] UTF-8 locale by default

2012-08-02 Thread Stelian Ionescu
On Thu, 2012-08-02 at 08:42 +0200, Fabian Groffen wrote:
> On 01-08-2012 21:00:23 -0400, Mike Gilbert wrote:
> > Diego mentioned the python issue.
> 
> Honestly, if some asian person has whatever charset that I often find in
> spam messages, but is not UTF-8, are you then going to tell that person
> to switch to UTF-8 to get those python packages emerged?  I hope not.

Yes.

-- 
Stelian Ionescu a.k.a. fe[nl]ix
Quidquid latine dictum sit, altum videtur.
http://common-lisp.net/project/iolib





Re: [gentoo-dev] UTF-8 locale by default

2012-08-01 Thread Fabian Groffen
On 01-08-2012 21:00:23 -0400, Mike Gilbert wrote:
> Diego mentioned the python issue.

Honestly, if some asian person has whatever charset that I often find in
spam messages, but is not UTF-8, are you then going to tell that person
to switch to UTF-8 to get those python packages emerged?  I hope not.

There is a difference between "there is a UTF-8 locale available on the
system" and "en_US.UTF-8 locale is in effect".
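
To make the "in effect" half of that distinction concrete, here is a sketch of the POSIX precedence for character handling: LC_ALL overrides LC_CTYPE, which overrides LANG, and an empty value counts as unset. The function names are made up for illustration. The check looks only at the locale name, not at whether that locale is installed.

```sh
# Print the locale name that governs character handling, using the POSIX
# precedence LC_ALL > LC_CTYPE > LANG. With nothing set the result is POSIX.
effective_ctype() {
    printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-POSIX}}}"
}

# Succeeds if the effective locale name carries a UTF-8 codeset.
ctype_is_utf8() {
    case "$(effective_ctype)" in
        *.[Uu][Tt][Ff]-8*|*.[Uu][Tt][Ff]8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: ctype_is_utf8 || echo "UTF-8 is not in effect here"
```

On a live system, `locale charmap` reports the codeset that is really in effect, which also covers the case where the named locale is missing.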

Fabian

-- 
Fabian Groffen
Gentoo on a different level




Re: [gentoo-dev] UTF-8 locale by default

2012-08-01 Thread Sergey Popov
02.08.2012 04:20, Walter Dnes wrote:
>   That's right... the poster was running a POSIX locale for several
> years ***AND DID NOT HAVE ANY PROBLEMS RELATED TO IT***.  
This discussion is very similar to one that I saw in the Russian
Linux community some years ago about migrating from ru_RU.KOI8-R to
ru_RU.UTF-8. The arguments from the "KOI8-R guys" were the same: "Why
should we change something if it works?" They also did not notice
fundamental problems with some vitally important packages, which could
not be replaced or needed to be heavily patched to work properly. The
arguments from the "UTF-8 guys" were not ideal, but the locale change
broke only old or unsupported packages, so they won.

P.S. I do not think that comparison with 'initramfs and separate /usr
problem' is correct in this case. Default locale change is evolution,
not revolution...





Re: [gentoo-dev] UTF-8 locale by default

2012-08-01 Thread Peter Stuge
Walter Dnes wrote:
> The fact that "other distros do it" does not constitute
> justification for us to do it.

Unfortunately that exact reason, along with "Fedora is doing it", was
cited by a very active developer as reason to reject technical points
which I tried to make a few times.

But that is off-topic. Let's leave it for later. All I'm saying is
don't underestimate pack mentality.


//Peter



Re: [gentoo-dev] UTF-8 locale by default

2012-08-01 Thread Mike Gilbert
On Wed, Aug 1, 2012 at 8:20 PM, Walter Dnes  wrote:
> We're ignoring a very basic question here... what problems does
> shipping with a POSIX locale cause that would be fixed by setting a UTF8
> default locale???  I want a real answer.  Not something along the lines
> of "But daddy, all the other kids are doing it".
>

Try reading the rest of the thread before posting a rant.

Diego mentioned the python issue. As well, there are many test suites
that malfunction without a UTF-8 or en_US.UTF-8 locale. If you hunt
through Bugzilla, you can probably dig up other issues.



Re: [gentoo-dev] UTF-8 locale by default

2012-08-01 Thread Walter Dnes
On Wed, Aug 01, 2012 at 04:29:42PM -0400, Michael Orlitzky wrote

> Every locale is wrong for somebody; the idea was that by taking
> a survey, you could make it wrong for the least amount of people
> (by default).

  Question... has anybody ever considered that maybe a POSIX locale
is wrong for the least amount of people???  There's also a very damning
statement in the post that started this thread...

On Thu, Jul 19, 2012 at 11:39:59PM +0200, Sascha Cunz wrote
> I recently discovered that I for some reason haven't noticed the
> warning about setting the locale to utf-8 in the gentoo handbook for
> obviously several years; thus i was still running all my systems in
> a POSIX locale since i never cared much about it.
> 
> However, since I noticed, I talked to several people about it; all
> of them stating as first response: "Not shipping with a utf-8 locale
> turned on by default nowadays probably is a bug in your distro"

  That's right... the poster was running a POSIX locale for several
years ***AND DID NOT HAVE ANY PROBLEMS RELATED TO IT***.  Then "several
people said" "Not shipping with a utf-8 locale turned on by default
nowadays probably is a bug in your distro".  And suddenly it's a
problem.  What's next?  Despite running with no problems for many years
with a separate /usr and no initramfs, will we have "several people"
come along and tell us that it's a bug in our distro?  Oh... wait...

  The fact that "other distros do it" does not constitute justification
for us to do it.  If I wanted to run Redhat or Ubuntu, I'd run Redhat or
Ubuntu.  We're ignoring a very basic question here... what problems does
shipping with a POSIX locale cause that would be fixed by setting a UTF8
default locale???  I want a real answer.  Not something along the lines
of "But daddy, all the other kids are doing it".

-- 
Walter Dnes 



Re: [gentoo-dev] UTF-8 locale by default

2012-08-01 Thread Michael Orlitzky
On 08/01/12 16:18, Andreas K. Huettel wrote:
> 
>>
>> If it turns out that C or POSIX is the most common response, we should
>> then default the locale to en_US.UTF-8 if we really want to default to
>> a UTF-8 setting. The reason being it makes sense to have the default
>> locale set to the country of origin, which in our case is the United
>> States.
>>
> 
> Given the number of Gentoo devs (especially on the desktop side where this 
> matters most) from other parts of the world, that's not really a valid 
> argument. In particular in cases as e.g. "Paper size setting", where
> basically US stubbornness stands against the rest of the planet.
> 

Every locale is wrong for somebody; the idea was that by taking a
survey, you could make it wrong for the least amount of people (by default).

If the majority of users use a stupid paper size, the best default is
still whatever they use regardless of any personal preferences.



Re: [gentoo-dev] UTF-8 locale by default

2012-08-01 Thread Andreas K. Huettel

> 
> If it turns out that C or POSIX is the most common response, we should
> then default the locale to en_US.UTF-8 if we really want to default to
> a UTF-8 setting. The reason being it makes sense to have the default
> locale set to the country of origin, which in our case is the United
> States.
> 

Given the number of Gentoo devs (especially on the desktop side where this 
matters most) from other parts of the world, that's not really a valid 
argument. In particular in cases as e.g. "Paper size setting", where basically 
US stubbornness stands against the rest of the planet.

-- 

Andreas K. Huettel
Gentoo Linux developer 
dilfri...@gentoo.org
http://www.akhuettel.de/





Re: [gentoo-dev] UTF-8 locale by default

2012-07-31 Thread Michael Orlitzky
On 07/30/12 15:02, Walter Dnes wrote:
> Would forcing UTF-8 cause problems for packages that expect
> specific ISO encodings in X fonts?

Not that I know of (and setting a default wouldn't force anything).

xfreecell's readme states "Make sure there is a font named 7x14" and
another thread mentions that this is provided by
media-fonts/font-misc-misc so that sounds like a bug in the ebuild to me.



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Walter Dnes
On Mon, Jul 30, 2012 at 01:33:48PM -0400, Michael Orlitzky wrote

> The technical objection to C.UTF-8 is that it's non-standard, Ok.
> What are the technical objections to LC_CTYPE=en_US.UTF-8? If the
> alternatives are all improvements, the statistics are irrelevant.

  I ran into a problem several months ago with xfreecell not running.
Turned out the ISO8859-1 fonts were not being generated, just UTF-8.
xfreecell needs ISO8859-1 fonts.  And it's not the only package.  I
modified xorg-2.eclass so that font packages would build ISO8859-1.  See
http://article.gmane.org/gmane.linux.gentoo.user/252316/ for the gory
details.  Would forcing UTF-8 cause problems for packages that expect
specific ISO encodings in X fonts?

  The important part of the eclass mod was to manually enable iso8859-1
and disable all other encodings...

# When configure mentions disable-all-encodings, iso8859-1 is enabled
# explicitly; every other encoding is disabled in both branches.
if grep -q -s "disable-all-encodings" "${ECONF_SOURCE:-.}/configure"; then
    FONT_OPTIONS+="
        --enable-iso8859-1
        --disable-iso10646
        --disable-iso10646-1
        --disable-iso8859-2
        --disable-iso8859-3
        --disable-iso8859-4
        --disable-iso8859-5
        --disable-iso8859-6
        --disable-iso8859-7
        --disable-iso8859-8
        --disable-iso8859-9
        --disable-iso8859-10
        --disable-iso8859-11
        --disable-iso8859-12
        --disable-iso8859-13
        --disable-iso8859-14
        --disable-iso8859-15
        --disable-iso8859-16
        --disable-jisx0201
        --disable-koi8-r"
else
    FONT_OPTIONS+="
        --disable-iso10646
        --disable-iso10646-1
        --disable-iso8859-2
        --disable-iso8859-3
        --disable-iso8859-4
        --disable-iso8859-5
        --disable-iso8859-6
        --disable-iso8859-7
        --disable-iso8859-8
        --disable-iso8859-9
        --disable-iso8859-10
        --disable-iso8859-11
        --disable-iso8859-12
        --disable-iso8859-13
        --disable-iso8859-14
        --disable-iso8859-15
        --disable-iso8859-16
        --disable-jisx0201
        --disable-koi8-r"
fi
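
To check which encodings a font directory actually ended up with, a rough sketch is to count the entries in its fonts.dir index. This assumes the usual XLFD convention that each font name ends in its registry-encoding pair, such as `-iso8859-1` or `-iso10646-1`. The function name is made up, and the path in the example is only a guess at where the misc fonts live.

```sh
# Hypothetical helper: count fonts.dir entries whose XLFD name ends in
# the given registry-encoding pair, e.g. iso8859-1 or iso10646-1.
# $1 = path to fonts.dir, $2 = encoding. Prints the count.
count_fonts_with_encoding() {
    grep -c -i -e "-$2\$" "$1"
}

# Example: count_fonts_with_encoding /usr/share/fonts/misc/fonts.dir iso8859-1
```

A count of 0 for iso8859-1 would match the symptom described above, where only the UTF-8 (iso10646-1) fonts were generated.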

-- 
Walter Dnes 



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michael Orlitzky
On 07/30/12 12:28, Michał Górny wrote:
> 
> My point here is that you want the thing to change. So you first try to
> convince people here to change. We practically did a small survey here
> and in the result we didn't agree on doing the change.
> 
> So you're saying we should do another survey on another group, hoping
> that this time the result will be on your side.

We didn't do a survey, we asked,

  "Is there a reason for not using at least en_US.UTF-8 as a "sane"
   default value?"

Unsurprisingly, the responses contained reasons for not using
en_US.UTF-8 as the default.

Don't take my original reply out of context, I don't actually care what
we have as the default.


> 
> It depends on who the 'unbiased sample' is. Are you interested only in
> opinion of Gentoo users who visit the website? Who sync once a day?
> Once a week? Who follow Gentoo Planet? Who participate in the forums?
> 
> We can create the survey and announce it everywhere. But it still won't
> catch many old-time Gentoo users who can actually have something
> opposite to say. It won't be unbiased.

The technical objection to C.UTF-8 is that it's non-standard, Ok. What
are the technical objections to LC_CTYPE=en_US.UTF-8? If the
alternatives are all improvements, the statistics are irrelevant.



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michael Mol
On Mon, Jul 30, 2012 at 12:28 PM, Michał Górny  wrote:
> On Mon, 30 Jul 2012 10:50:29 -0400
> Michael Orlitzky  wrote:
>
>> On 07/30/12 10:41, Michał Górny wrote:
>> > On Mon, 30 Jul 2012 10:35:36 -0400
>> > Michael Orlitzky  wrote:
>> >
>> >> On 07/27/12 16:16, Aaron W. Swenson wrote:
>> >>>
>> >>> No user will be happy with whatever we decide to use as a default.
>> >>
>> >> The defaults should be what's best for the most people, with a bias
>> >> towards safety. Why don't we just take a survey and choose the most
>> >> common utf8 response?
>> >
>> > How can you take a survey like that? How will you ensure it actually
>> > hits the majority? How will you define the majority?
>> >
>>
>> Considering that the alternative is to force everyone to change it
>> manually, you can do it however you want and it'll be an improvement.
>
> My point here is that you want the thing to change. So you first try to
> convince people here to change. We practically did a small survey here
> and in the result we didn't agree on doing the change.
>
> So you're saying we should do another survey on another group, hoping
> that this time the result will be on your side.
>
>>   1) Create a webpage with a bunch of options, count the results
>>
>>   2) Ask the g.o mailing lists, count responses manually
>>
>>   3) Use google docs like the website survey that went out a few days
>>  ago
>>
>> It won't hit everyone, but no survey ever does. As long as you get a
>> large enough unbiased sample, it doesn't matter. And anything would be
>> an improvement, so it doesn't matter anyway.
>
> It depends on who the 'unbiased sample' is. Are you interested only in
> opinion of Gentoo users who visit the website? Who sync once a day?
> Once a week? Who follow Gentoo Planet? Who participate in the forums?
>
> We can create the survey and announce it everywhere. But it still won't
> catch many old-time Gentoo users who can actually have something
> opposite to say. It won't be unbiased.

I was thinking about this, and I suspect that a survey period of 1-2
months is likely fine. It should also be enough to scoop up people who
run servers and monitor those servers for security updates.

-- 
:wq



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michał Górny
On Mon, 30 Jul 2012 10:50:29 -0400
Michael Orlitzky  wrote:

> On 07/30/12 10:41, Michał Górny wrote:
> > On Mon, 30 Jul 2012 10:35:36 -0400
> > Michael Orlitzky  wrote:
> > 
> >> On 07/27/12 16:16, Aaron W. Swenson wrote:
> >>>
> >>> No user will be happy with whatever we decide to use as a default.
> >>
> >> The defaults should be what's best for the most people, with a bias
> >> towards safety. Why don't we just take a survey and choose the most
> >> common utf8 response?
> > 
> > How can you take a survey like that? How will you ensure it actually
> > hits the majority? How will you define the majority?
> > 
> 
> Considering that the alternative is to force everyone to change it
> manually, you can do it however you want and it'll be an improvement.

My point here is that you want the thing to change. So you first try to
convince people here to change. We practically did a small survey here
and in the result we didn't agree on doing the change.

So you're saying we should do another survey on another group, hoping
that this time the result will be on your side.

>   1) Create a webpage with a bunch of options, count the results
> 
>   2) Ask the g.o mailing lists, count responses manually
> 
>   3) Use google docs like the website survey that went out a few days
>  ago
> 
> It won't hit everyone, but no survey ever does. As long as you get a
> large enough unbiased sample, it doesn't matter. And anything would be
> an improvement, so it doesn't matter anyway.

It depends on who the 'unbiased sample' is. Are you interested only in
opinion of Gentoo users who visit the website? Who sync once a day?
Once a week? Who follow Gentoo Planet? Who participate in the forums?

We can create the survey and announce it everywhere. But it still won't
catch many old-time Gentoo users who can actually have something
opposite to say. It won't be unbiased.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Aaron W. Swenson
On 07/30/2012 11:04 AM, Michael Mol wrote:
> On Mon, Jul 30, 2012 at 10:41 AM, Michał Górny 
> wrote:
>> On Mon, 30 Jul 2012 10:35:36 -0400 Michael Orlitzky
>>  wrote:
>> 
>>> On 07/27/12 16:16, Aaron W. Swenson wrote:
 
>>>> No user will be happy with whatever we decide to use as a
>>>> default.
>>> 
>>> The defaults should be what's best for the most people, with a
>>> bias towards safety. Why don't we just take a survey and choose
>>> the most common utf8 response?
>> 
>> How can you take a survey like that? How will you ensure it
>> actually hits the majority? How will you define the majority?
> 
> Serverside script on gentoo.org. Push out a news item with the URL
> and a last-call date. Tabulate the results, using browser
> fingerprints to weed out the bulk of duplicates.
> 

I still advocate continuing how we have been.

However, the survey should be one question: What is the output of
`locale' on your workstation/desktop/laptop?

The less painful we make the survey, the more respondents we'll get,
and the less biased the results will be. Additionally, it makes the
responses easy to parse with a script.
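
A sketch of how simple that parsing could be, assuming the responses are plain `locale` output concatenated into one stream: pull out each LANG line and count the values. The function name `tally_lang` is made up for illustration.

```sh
# Hypothetical tally: stdin is the concatenated `locale` output of all
# respondents; print each LANG value with its count, most common first.
tally_lang() {
    sed -n 's/^LANG=//p' | tr -d '"' | sort | uniq -c | sort -rn
}

# Example: cat responses/*.txt | tally_lang
```

Respondents with an empty LANG show up as a count with no name next to it.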

Servers are excluded because special things take place there that may
not actually line up with what the user prefers.

If it turns out that C or POSIX is the most common response, we should
then default the locale to en_US.UTF-8 if we really want to default to
a UTF-8 setting. The reason being it makes sense to have the default
locale set to the country of origin, which in our case is the United
States.

Yes, it may irk those whose native locale is not en_US.UTF-8, but like
I said, no one will be happy. Except for those whose native locale
happens to be the default.

Start at a default, doesn't really matter which as long as the default
is the lingua franca of international business, and instruct the user,
as we already do, how to change it during the setup.

-- 
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email: titanof...@gentoo.org
GnuPG FP : 2C00 7719 4F85 FB07 A49C  0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Rich Freeman
On Mon, Jul 30, 2012 at 10:42 AM, Michael Mol  wrote:
>
> You'd really want to a "which do you prefer, which can you use"
> survey, then; You don't really want to choose the result preferred by
> the most people, rather you want the result which is usable by the
> most people.

I tend to agree.  Donnie said something in his manifesto which I think
applies here: any of the proposed solutions is probably better than
doing nothing.

If I forget to tweak my locale and I end up with a comma as a decimal
mark it isn't the end of the world, and neither is some output in
metric units.  I've ended up working on many a global system where
times get reported in GMT and people put up with the inconvenience
because they realize that any standard is better than no standard.

What is the real end-user impact of any of this stuff anyway?  During
the install the thing that matters is being able to partition disks
and compile kernels and such.  I doubt that too many users will be
dependent on installer locale settings for displaying weather reports
or such.  If they don't set locale, then it is like not setting
localtime - you just get to live with some default.  I would imagine
that at least by having a UTF-8 locale users would be able to do
things like set full names of users using unicode, etc.

Rich



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michael Mol
On Mon, Jul 30, 2012 at 10:41 AM, Michał Górny  wrote:
> On Mon, 30 Jul 2012 10:35:36 -0400
> Michael Orlitzky  wrote:
>
>> On 07/27/12 16:16, Aaron W. Swenson wrote:
>> >
>> > No user will be happy with whatever we decide to use as a default.
>>
>> The defaults should be what's best for the most people, with a bias
>> towards safety. Why don't we just take a survey and choose the most
>> common utf8 response?
>
> How can you take a survey like that? How will you ensure it actually
> hits the majority? How will you define the majority?

Serverside script on gentoo.org. Push out a news item with the URL and
a last-call date. Tabulate the results, using browser fingerprints to
weed out the bulk of duplicates.
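
A minimal sketch of that tabulation step in Python, assuming each
submission arrives as a (fingerprint, answer) pair. How the fingerprint
is computed is left open, and tabulate() is a name invented for this
illustration.

```python
from collections import Counter


def tabulate(submissions):
    """Count locale answers, keeping only the first answer per fingerprint."""
    seen = set()
    counts = Counter()
    for fingerprint, answer in submissions:
        if fingerprint in seen:
            continue  # likely a repeat submission from the same browser
        seen.add(fingerprint)
        counts[answer] += 1
    return counts
```

Keeping the first answer per fingerprint is only one possible policy;
keeping the last one would be just as easy.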

-- 
:wq



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michael Orlitzky
On 07/30/12 10:41, Michał Górny wrote:
> On Mon, 30 Jul 2012 10:35:36 -0400
> Michael Orlitzky  wrote:
> 
>> On 07/27/12 16:16, Aaron W. Swenson wrote:
>>>
>>> No user will be happy with whatever we decide to use as a default.
>>
>> The defaults should be what's best for the most people, with a bias
>> towards safety. Why don't we just take a survey and choose the most
>> common utf8 response?
> 
> How can you take a survey like that? How will you ensure it actually
> hits the majority? How will you define the majority?
> 

Considering that the alternative is to force everyone to change it
manually, you can do it however you want and it'll be an improvement.

  1) Create a webpage with a bunch of options, count the results

  2) Ask the g.o mailing lists, count responses manually

  3) Use google docs like the website survey that went out a few days
 ago

It won't hit everyone, but no survey ever does. As long as you get a
large enough unbiased sample, it doesn't matter. And anything would be
an improvement, so it doesn't matter anyway.



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michał Górny
On Mon, 30 Jul 2012 10:35:36 -0400
Michael Orlitzky  wrote:

> On 07/27/12 16:16, Aaron W. Swenson wrote:
> > 
> > No user will be happy with whatever we decide to use as a default.
> 
> The defaults should be what's best for the most people, with a bias
> towards safety. Why don't we just take a survey and choose the most
> common utf8 response?

How can you take a survey like that? How will you ensure it actually
hits the majority? How will you define the majority?

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michael Mol
On Mon, Jul 30, 2012 at 10:35 AM, Michael Orlitzky  wrote:
> On 07/27/12 16:16, Aaron W. Swenson wrote:
>>
>> No user will be happy with whatever we decide to use as a default.
>
> The defaults should be what's best for the most people, with a bias
> towards safety. Why don't we just take a survey and choose the most
> common utf8 response?

You'd really want a "which do you prefer, which can you use"
survey, then; You don't really want to choose the result preferred by
the most people, rather you want the result which is usable by the
most people.
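
To show how the two questions can give different answers, a small
Python sketch with made-up responses. The field names "prefers" and
"can_use" and both function names are hypothetical.

```python
from collections import Counter


def most_preferred(responses):
    """Locale named as first choice by the most respondents."""
    return Counter(r["prefers"] for r in responses).most_common(1)[0][0]


def most_usable(responses):
    """Locale that the most respondents say they can work with."""
    counts = Counter()
    for r in responses:
        counts.update(set(r["can_use"]))  # count each respondent once per locale
    return counts.most_common(1)[0][0]


# Made-up responses, purely for illustration.
responses = [
    {"prefers": "en_US.UTF-8", "can_use": ["en_US.UTF-8", "en_GB.UTF-8"]},
    {"prefers": "en_US.UTF-8", "can_use": ["en_US.UTF-8", "en_GB.UTF-8"]},
    {"prefers": "de_DE.UTF-8", "can_use": ["de_DE.UTF-8", "en_GB.UTF-8"]},
    {"prefers": "pl_PL.UTF-8", "can_use": ["pl_PL.UTF-8", "en_GB.UTF-8"]},
]
```

With this invented data the most preferred locale is en_US.UTF-8 (two
first choices), while the one usable by the most people is en_GB.UTF-8
(all four).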

-- 
:wq



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michael Orlitzky
On 07/27/12 16:16, Aaron W. Swenson wrote:
> 
> No user will be happy with whatever we decide to use as a default.

The defaults should be what's best for the most people, with a bias
towards safety. Why don't we just take a survey and choose the most
common utf8 response?



Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Diego Elio Pettenò
On 27/07/2012 13:16, Aaron W. Swenson wrote:
> Really, how much of an inconvenience is it that we don't use UTF-8 as
> a default?

Given that there are a ton and a half of Python packages that do not
work with a non-utf8 locale, I'd say it's quite a thing.

So either we go with a UTF-8 default or somebody has to fix the
packages not working without it.
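
A minimal sketch of the kind of failure involved, assuming a package
ends up encoding text with the ASCII codec that a C/POSIX locale
implies. Which codec a given Python version actually picks under such
a locale varies, so the codec is passed in explicitly here; the file
name and the helper encodable() are made up for the example.

```python
def encodable(text, codec):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


name = "Kök_Sertifika.crt"  # hypothetical non-ASCII file name
for codec in ("ascii", "utf-8"):
    print(codec, encodable(name, codec))
```

A package that does the encode without the try/except simply dies with
UnicodeEncodeError in the ASCII case.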

-- 
Diego Elio Pettenò — Flameeyes
flamee...@flameeyes.eu — http://blog.flameeyes.eu/





Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Aaron W. Swenson

On 07/27/2012 02:29 PM, Pacho Ramos wrote:
> On Fri, 27-07-2012 at 13:24 -0400, Mike Frysinger wrote:
>> On Friday 27 July 2012 08:13:16 Chí-Thanh Christopher Nguyễn
>> wrote:
>>> Ulrich Mueller wrote:
>>>> As I had pointed out before [1], changing from POSIX to an
>>>> en_US locale will have undesirable side effects, like commas
>>>> as thousands separators in numbers (because of LC_NUMERIC).
>>>> Also the defaults of en_US for LC_MEASUREMENT and LC_PAPER
>>>> are only useful in the U.S.
>>>>
>>>> So if we change the default (but I still don't see the need),
>>>> we should go for a less intrusive setting like:
>>>>
>>>>    LANG="POSIX"
>>>>    LC_CTYPE="en_US.utf8"
>>> 
>>> This would be better than LANG="en_US.utf8" but I would still
>>> prefer not to have any country/region attached to the locale.
>>> The C.UTF-8 locale which Debian uses for this purpose (a UTF-8
>>> locale without side effects) appears more suitable to me.
>> 
>> yes, and i'm waiting on the POSIX group to formalize C.UTF-8.
>> that's the only real option in my mind for making unicode the
>> default.  any other amalgamations of various locales is ugly as
>> sin. -mike
> 
> Do you have any idea how long that formalization could take? If it
> will take a long time, maybe we could go with one of those
> amalgamations :-/
> 

Really, how much of an inconvenience is it that we don't use UTF-8 as
a default?

In my mind, it is sufficient that we instruct users how to set the
locale in the handbook.

No user will be happy with whatever we decide to use as a default. I
will be especially upset if we use the metric system instead of the
*STANDARD* system. It has 'standard' in the name for a reason people.
(^_^)

-- 
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email: titanof...@gentoo.org
GnuPG FP : 2C00 7719 4F85 FB07 A49C  0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0



Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Pacho Ramos
On Fri, 27-07-2012 at 13:24 -0400, Mike Frysinger wrote:
> On Friday 27 July 2012 08:13:16 Chí-Thanh Christopher Nguyễn wrote:
> > Ulrich Mueller wrote:
> > > As I had pointed out before [1], changing from POSIX to an en_US
> > > locale will have undesirable side effects, like commas as thousands
> > > separators in numbers (because of LC_NUMERIC). Also the defaults of
> > > en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> > > 
> > > So if we change the default (but I still don't see the need), we
> > > should go for a less intrusive setting like:
> > > 
> > >    LANG="POSIX"
> > >    LC_CTYPE="en_US.utf8"
> > 
> > This would be better than LANG="en_US.utf8" but I would still prefer not
> > to have any country/region attached to the locale. The C.UTF-8 locale
> > which Debian uses for this purpose (a UTF-8 locale without side effects)
> > appears more suitable to me.
> 
> yes, and i'm waiting on the POSIX group to formalize C.UTF-8.  that's the
> only real option in my mind for making unicode the default.  any other
> amalgamations of various locales is ugly as sin.
> -mike

Do you have any idea how long that formalization could take? If it will
take a long time, maybe we could go with one of those amalgamations :-/




Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Mike Frysinger
On Friday 27 July 2012 08:13:16 Chí-Thanh Christopher Nguyễn wrote:
> Ulrich Mueller wrote:
> > As I had pointed out before [1], changing from POSIX to an en_US
> > locale will have undesirable side effects, like commas as thousands
> > separators in numbers (because of LC_NUMERIC). Also the defaults of
> > en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> > 
> > So if we change the default (but I still don't see the need), we
> > should go for a less intrusive setting like:
> > 
> >    LANG="POSIX"
> >    LC_CTYPE="en_US.utf8"
> 
> This would be better than LANG="en_US.utf8" but I would still prefer not
> to have any country/region attached to the locale. The C.UTF-8 locale
> which Debian uses for this purpose (a UTF-8 locale without side effects)
> appears more suitable to me.

yes, and i'm waiting on the POSIX group to formalize C.UTF-8.  that's the only 
real option in my mind for making unicode the default.  any other 
amalgamations of various locales is ugly as sin.
-mike




Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Chí-Thanh Christopher Nguyễn
Ulrich Mueller wrote:
> As I had pointed out before [1], changing from POSIX to an en_US
> locale will have undesirable side effects, like commas as thousands
> separators in numbers (because of LC_NUMERIC). Also the defaults of
> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
>
> So if we change the default (but I still don't see the need), we
> should go for a less intrusive setting like:
>
>    LANG="POSIX"
>    LC_CTYPE="en_US.utf8"

This would be better than LANG="en_US.utf8" but I would still prefer not
to have any country/region attached to the locale. The C.UTF-8 locale
which Debian uses for this purpose (a UTF-8 locale without side effects)
appears more suitable to me.
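
To make the difference between these settings concrete, here is a
small Python sketch of the lookup order POSIX describes for the locale
variables: LC_ALL first, then the category's own variable, then LANG.
It is pure Python and never calls setlocale(); effective_locale() is a
name invented for this illustration, and the "C" fallback is an
assumption, since POSIX leaves the default implementation-defined.

```python
def effective_locale(category, env):
    """Return the locale name that applies to one LC_* category.

    Lookup order: LC_ALL, then the category's own variable, then LANG.
    Unset and empty variables are skipped.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"  # assumed fallback; POSIX leaves it implementation-defined


mixed = {"LANG": "POSIX", "LC_CTYPE": "en_US.utf8"}
plain = {"LANG": "en_US.utf8"}
for category in ("LC_CTYPE", "LC_NUMERIC", "LC_PAPER"):
    print(category, effective_locale(category, mixed),
          effective_locale(category, plain))
```

With the mixed setting only LC_CTYPE resolves to en_US.utf8 and every
other category stays POSIX, whereas LANG="en_US.utf8" alone pulls in
en_US for number formatting and paper size as well.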


Best regards,
Chí-Thanh Christopher Nguyễn




Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Michał Górny
On Fri, 27 Jul 2012 16:34:01 +0800
Ben de Groot  wrote:

> On 27 July 2012 16:06, Dan Douglas  wrote:
> > On Friday, July 27, 2012 09:08:36 AM Ulrich Mueller wrote:
> >> > On Fri, 27 Jul 2012, Ben de Groot wrote:
> >>
> >> > I understand why the council rejected Debian's C.UTF-8 option,
> >> > but is there really no better default that we can use?
> >>
> >> > Without any default locale set, in practically all cases that
> >> > means that the user is presented with English, and mostly the
> >> > American variant. So, in practice, we are defaulting to en_US,
> >> > just not in a unicode environment. Correct me if I'm wrong.
> >>
> >> See below. We're not defaulting to en_US for things like the number
> >> format.
> >>
> >> > Also, in most other places (such as our website, GLEPs, ebuilds)
> >> > we default to en_US.UTF-8.
> >>
> >> > So let's upgrade to en_US.UTF-8, which is for most users more
> >> > desirable than the current situation. Of course we will still
> >> > advise them to set their desired locales in /etc/locale.gen. But
> >> > at least they will start with a unicode environment, as expected
> >> > anno 2012.
> >>
> >> As I had pointed out before [1], changing from POSIX to an en_US
> >> locale will have undesirable side effects, like commas as thousands
> >> separators in numbers (because of LC_NUMERIC). Also the defaults of
> >> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> >>
> >> So if we change the default (but I still don't see the need), we
> >> should go for a less intrusive setting like:
> >>
> >>    LANG="POSIX"
> >>    LC_CTYPE="en_US.utf8"
> >>
> >> Ulrich
> >>
> >
> > You're concerned about the commas breaking things? Given that you
> > usually need to specifically ask for them (i.e., printf ' flag),
> > and that kind of output is usually going to be for human
> > consumption only that seems unlikely. If anything does rely upon
> > the format, can't tolerate different locales, and fails to specify
> > LC_NUMERIC then it's broken anyway.
> >
> > LC_MONETARY / LC_MEASUREMENT as en_US are probably slightly more
> > annoying defaults for some people. What do users of other distros
> > think? Is this really a serious problem for anyone?
> >
> > LC_CTYPE=en_US.utf8 would be a bare minimum. The important bit is
> > getting utf8 by default. I can live with LANG=POSIX.
> > --
> > Dan Douglas
> 
> How about the below?
> 
> LANG=en_GB.utf8
> LC_COLLATE=C
> LC_CTYPE=en_GB.utf8
> 
> That will give us A4 paper size and the metric system. If LC_NUMERIC
> is really a problem, we can set it to something more desirable.

LC_NUMERIC=pl_PL.utf8

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Michał Górny
On Fri, 27 Jul 2012 10:38:30 +0200
Cyprien Nicolas  wrote:

> Ulrich Mueller wrote:
> >> On Fri, 27 Jul 2012, Ben de Groot wrote:
> >>
> >> So let's upgrade to en_US.UTF-8, which is for most users more
> >> desirable than the current situation. Of course we will still
> >> advise them to set their desired locales in /etc/locale.gen. But
> >> at least they will start with a unicode environment, as expected
> >> anno 2012.
> > 
> > As I had pointed out before [1], changing from POSIX to an en_US
> > locale will have undesirable side effects, like commas as thousands
> > separators in numbers (because of LC_NUMERIC). Also the defaults of
> > en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> 
> For this very reason my system locale is en_IE.UTF-8. Still English
> but using Euro Monetary, Metric units, A4 paper, etc.
> 
> It might suit needs for most European installs, but not for everyone.

Still uses ',' for thousands sep.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Cyprien Nicolas
Ulrich Mueller wrote:
>> On Fri, 27 Jul 2012, Ben de Groot wrote:
>>
>> So let's upgrade to en_US.UTF-8, which is for most users more
>> desirable than the current situation. Of course we will still advise
>> them to set their desired locales in /etc/locale.gen. But at least
>> they will start with a unicode environment, as expected anno 2012.
> 
> As I had pointed out before [1], changing from POSIX to an en_US
> locale will have undesirable side effects, like commas as thousands
> separators in numbers (because of LC_NUMERIC). Also the defaults of
> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.

For this very reason my system locale is en_IE.UTF-8. Still English but
using Euro Monetary, Metric units, A4 paper, etc.

It might suit needs for most European installs, but not for everyone.

-- 
Cyprien / Fulax
Gentoo Lisp Project contributor




Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Ben de Groot
On 27 July 2012 16:06, Dan Douglas  wrote:
> On Friday, July 27, 2012 09:08:36 AM Ulrich Mueller wrote:
>> > On Fri, 27 Jul 2012, Ben de Groot wrote:
>>
>> > I understand why the council rejected Debian's C.UTF-8 option,
>> > but is there really no better default that we can use?
>>
>> > Without any default locale set, in practically all cases that means
>> > that the user is presented with English, and mostly the American
>> > variant. So, in practice, we are defaulting to en_US, just not in a
>> > unicode environment. Correct me if I'm wrong.
>>
>> See below. We're not defaulting to en_US for things like the number
>> format.
>>
>> > Also, in most other places (such as our website, GLEPs, ebuilds)
>> > we default to en_US.UTF-8.
>>
>> > So let's upgrade to en_US.UTF-8, which is for most users more
>> > desirable than the current situation. Of course we will still advise
>> > them to set their desired locales in /etc/locale.gen. But at least
>> > they will start with a unicode environment, as expected anno 2012.
>>
>> As I had pointed out before [1], changing from POSIX to an en_US
>> locale will have undesirable side effects, like commas as thousands
>> separators in numbers (because of LC_NUMERIC). Also the defaults of
>> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
>>
>> So if we change the default (but I still don't see the need), we
>> should go for a less intrusive setting like:
>>
>>    LANG="POSIX"
>>    LC_CTYPE="en_US.utf8"
>>
>> Ulrich
>>
>
> You're concerned about the commas breaking things? Given that you usually need
> to specifically ask for them (i.e., printf ' flag), and that kind of output is
> usually going to be for human consumption only that seems unlikely. If
> anything does rely upon the format, can't tolerate different locales, and
> fails to specify LC_NUMERIC then it's broken anyway.
>
> LC_MONETARY / LC_MEASUREMENT as en_US are probably slightly more annoying
> defaults for some people. What do users of other distros think? Is this really
> a serious problem for anyone?
>
> LC_CTYPE=en_US.utf8 would be a bare minimum. The important bit is getting utf8
> by default. I can live with LANG=POSIX.
> --
> Dan Douglas

How about the below?

LANG=en_GB.utf8
LC_COLLATE=C
LC_CTYPE=en_GB.utf8

That will give us A4 paper size and the metric system. If LC_NUMERIC is
really a problem, we can set it to something more desirable.
-- 
Cheers,

Ben | yngwin
Gentoo developer
Gentoo Qt project lead, Gentoo Wiki admin



Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Dan Douglas
On Friday, July 27, 2012 09:08:36 AM Ulrich Mueller wrote:
> > On Fri, 27 Jul 2012, Ben de Groot wrote:
> 
> > I understand why the council rejected Debian's C.UTF-8 option,
> > but is there really no better default that we can use?
> 
> > Without any default locale set, in practically all cases that means
> > that the user is presented with English, and mostly the American
> > variant. So, in practice, we are defaulting to en_US, just not in a
> > unicode environment. Correct me if I'm wrong.
> 
> See below. We're not defaulting to en_US for things like the number
> format.
> 
> > Also, in most other places (such as our website, GLEPs, ebuilds)
> > we default to en_US.UTF-8.
> 
> > So let's upgrade to en_US.UTF-8, which is for most users more
> > desirable than the current situation. Of course we will still advise
> > them to set their desired locales in /etc/locale.gen. But at least
> > they will start with a unicode environment, as expected anno 2012.
> 
> As I had pointed out before [1], changing from POSIX to an en_US
> locale will have undesirable side effects, like commas as thousands
> separators in numbers (because of LC_NUMERIC). Also the defaults of
> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> 
> So if we change the default (but I still don't see the need), we
> should go for a less intrusive setting like:
> 
>    LANG="POSIX"
>    LC_CTYPE="en_US.utf8"
> 
> Ulrich
> 

You're concerned about the commas breaking things? Given that you usually need 
to specifically ask for them (i.e., printf ' flag), and that kind of output is 
usually going to be for human consumption only that seems unlikely. If 
anything does rely upon the format, can't tolerate different locales, and fails 
to specify LC_NUMERIC then it's broken anyway.
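
A minimal Python sketch of the same point, for illustration only.
Python's locale.format_string() with grouping=True stands in for the
printf ' flag here; grouped() is a made-up helper, and en_US.UTF-8 is
only tried if it has been generated on the system.

```python
import locale


def grouped(number, numeric_locale):
    """Format an integer with the grouping flag under one LC_NUMERIC locale."""
    saved = locale.setlocale(locale.LC_NUMERIC)  # remember the current setting
    try:
        # Raises locale.Error if the locale has not been generated.
        locale.setlocale(locale.LC_NUMERIC, numeric_locale)
        return locale.format_string("%d", number, grouping=True)
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)


for name in ("C", "en_US.UTF-8"):
    try:
        print(name, grouped(1234567, name))
    except locale.Error:
        print(name, "is not available on this system")
```

Without grouping=True no separators are inserted in either locale,
which is the "you have to ask for them" part.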

LC_MONETARY / LC_MEASUREMENT as en_US are probably slightly more annoying 
defaults for some people. What do users of other distros think? Is this really 
a serious problem for anyone?

LC_CTYPE=en_US.utf8 would be a bare minimum. The important bit is getting utf8 
by default. I can live with LANG=POSIX.
-- 
Dan Douglas



Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Rick "Zero_Chaos" Farina

On 07/27/2012 03:08 AM, Ulrich Mueller wrote:
> 
> As I had pointed out before [1], changing from POSIX to an en_US
> locale will have undesirable side effects, like commas as thousands
> separators in numbers (because of LC_NUMERIC). Also the defaults of
> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> 
> So if we change the default (but I still don't see the need), we
> should go for a less intrusive setting like:
> 
>    LANG="POSIX"
>    LC_CTYPE="en_US.utf8"

I would love to see a utf8 default, if the above is agreeable then I say +1

-Zero



Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Ulrich Mueller
> On Fri, 27 Jul 2012, Ben de Groot wrote:

> I understand why the council rejected Debian's C.UTF-8 option,
> but is there really no better default that we can use?

> Without any default locale set, in practically all cases that means
> that the user is presented with English, and mostly the American
> variant. So, in practice, we are defaulting to en_US, just not in a
> unicode environment. Correct me if I'm wrong.

See below. We're not defaulting to en_US for things like the number
format.

> Also, in most other places (such as our website, GLEPs, ebuilds)
> we default to en_US.UTF-8.

> So let's upgrade to en_US.UTF-8, which is for most users more
> desirable than the current situation. Of course we will still advise
> them to set their desired locales in /etc/locale.gen. But at least
> they will start with a unicode environment, as expected anno 2012.

As I had pointed out before [1], changing from POSIX to an en_US
locale will have undesirable side effects, like commas as thousands
separators in numbers (because of LC_NUMERIC). Also the defaults of
en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.

So if we change the default (but I still don't see the need), we
should go for a less intrusive setting like:

   LANG="POSIX"
   LC_CTYPE="en_US.utf8"

Ulrich

[1] 




Re: [gentoo-dev] UTF-8 locale by default

2012-07-26 Thread Ben de Groot
On 20 July 2012 06:28, Ulrich Mueller  wrote:
>> On Thu, 19 Jul 2012, Sascha Cunz wrote:
>
>> Is there a reason for not using at least en_US.UTF-8 as a "sane"
>> default value?
>
> Because there's no one-size-fits-all locale, but it is specific to
> every system so the user must configure it?

While this is understandable, the fact remains that not having a
UTF-8 locale by default in our stage3 environment is sub-optimal.

I understand why the council rejected Debian's C.UTF-8 option,
but is there really no better default that we can use?

Without any default locale set, in practically all cases that means
that the user is presented with English, and mostly the American
variant. So, in practice, we are defaulting to en_US, just not in a
unicode environment. Correct me if I'm wrong.

Also, in most other places (such as our website, GLEPs, ebuilds)
we default to en_US.UTF-8.

So let's upgrade to en_US.UTF-8, which is for most users more
desirable than the current situation. Of course we will still advise
them to set their desired locales in /etc/locale.gen. But at least
they will start with a unicode environment, as expected anno 2012.


> The matter was recently discussed in this mailing list [1] and also in
> the March 2012 council meeting [2], and as a result the docs team has
> amended the respective section [3] of the handbook.
>
> Ulrich
>
> [1] 
> 
> [2] 
> [3] 
>

-- 
Cheers,

Ben | yngwin
Gentoo developer
Gentoo Qt project lead, Gentoo Wiki admin



Re: [gentoo-dev] UTF-8 locale by default

2012-07-19 Thread Ulrich Mueller
> On Thu, 19 Jul 2012, Sascha Cunz wrote:

> Is there a reason for not using at least en_US.UTF-8 as a "sane"
> default value?

Because there's no one-size-fits-all locale, but it is specific to
every system so the user must configure it?

The matter was recently discussed in this mailing list [1] and also in
the March 2012 council meeting [2], and as a result the docs team has
amended the respective section [3] of the handbook.

Ulrich

[1] 

[2] 
[3] 



Re: [gentoo-dev] UTF-8 locale by default

2012-07-19 Thread Chí-Thanh Christopher Nguyễn
Sascha Cunz wrote:
> Is there a reason for not using at least en_US.UTF-8 as a "sane" default 
> value?

It has been discussed some time ago already. Setting LANG="en_US.UTF-8"
would mess with collation rules, measurement&paper units etc. which has
the potential to make users outside USA unhappy.

It might make sense to set LC_CTYPE="en_US.UTF8" but even so,
transliteration may give you unexpected results.

To illustrate this, try running

echo äå | LC_CTYPE=en_US.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8
echo äå | LC_CTYPE=da_DK.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8
echo äå | LC_CTYPE=de_DE.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8

and compare the output.
For the previous discussion, see this thread:
http://archives.gentoo.org/gentoo-dev/msg_2ffb7ea72e6209439600c371f6fc071d.xml


Best regards,
Chí-Thanh Christopher Nguyễn



[gentoo-dev] UTF-8 locale by default

2012-07-19 Thread Sascha Cunz
I recently discovered that, for some reason, I had overlooked the warning in
the Gentoo handbook about setting the locale to UTF-8, apparently for several
years; as a result I was still running all my systems in a POSIX locale, since
I never cared much about it.

Since noticing, I have talked to several people about it, and the first
response from all of them was: "Not shipping with a UTF-8 locale turned on by
default nowadays is probably a bug in your distro".

Having thought about this, and recognizing that recent distributions do indeed
ship with some UTF-8 locale by default, I tend to agree with that statement.

Although Google brings up a lot of good documentation on how to change the
locale, I couldn't find anything that explains why stage3 is still delivered
with the POSIX locale set.

Is there a reason for not using at least en_US.UTF-8 as a "sane" default 
value?

BR,
SaCu